Grammar as a Foreign Language

O. Vinyals, Lukasz Kaiser, Terry Koo, Slav Petrov, I. Sutskever, Geoffrey E. Hinton

Published 2014 in Neural Information Processing Systems

ABSTRACT

Syntactic constituency parsing is a fundamental problem in natural language processing and has been the subject of intensive research and engineering for decades. As a result, the most accurate parsers are domain specific, complex, and inefficient. In this paper we show that the domain agnostic attention-enhanced sequence-to-sequence model achieves state-of-the-art results on the most widely used syntactic constituency parsing dataset, when trained on a large synthetic corpus that was annotated using existing parsers. It also matches the performance of standard parsers when trained only on a small human-annotated dataset, which shows that this model is highly data-efficient, in contrast to sequence-to-sequence models without the attention mechanism. Our parser is also fast, processing over a hundred sentences per second with an unoptimized CPU implementation.

PUBLICATION RECORD

Publication year
2014
Venue
Neural Information Processing Systems
Publication date
2014-12-23
Fields of study
Mathematics, Linguistics, Computer Science
Identifiers
arXiv 1412.7449
Source metadata
Semantic Scholar

REFERENCES

On Using Very Large Target Vocabulary for Neural Machine Translation
2014, cited by this paper
Sparser, Better, Faster GPU Parsing
2014, cited by this paper
Learning to Execute
2014, cited by this paper
Sequence to Sequence Learning with Neural Networks
2014, influential reference
End-to-end Continuous Speech Recognition using Attention-based Recurrent NN: First Results
2014, cited by this paper
Neural Machine Translation by Jointly Learning to Align and Translate
2014, influential reference
Ambiguity-aware Ensemble Training for Semi-supervised Dependency Parsing
2014, cited by this paper
Show and tell: A neural image caption generator
2014, cited by this paper
Addressing the Rare Word Problem in Neural Machine Translation
2014, cited by this paper
Generating Sequences With Recurrent Neural Networks
2013, cited by this paper
Recurrent Continuous Translation Models
2013, cited by this paper
Efficient Estimation of Word Representations in Vector Space
2013, influential reference
Fast and Accurate Shift-Reduce Constituent Parsing
2013, influential reference
Overview of the 2012 Shared Task on Parsing the Web
2012, influential reference
Deep Learning for Efficient Discriminative Parsing
2011, cited by this paper
Parsing Natural Scenes and Natural Language with Recursive Neural Networks
2011, cited by this paper
Incremental Sigmoid Belief Networks for Grammar Learning
2010, cited by this paper
Products of Random Latent Variable Grammars
2010, cited by this paper
Self-Training with Products of Latent Variable Grammars
2010, cited by this paper
Self-Training PCFG Grammars with Latent Annotations Across Languages
2009, influential reference
Algorithms for Deterministic Incremental Dependency Parsing
2008, cited by this paper
Constituent Parsing with Incremental Sigmoid Belief Networks
2007, cited by this paper
OntoNotes: The 90% Solution
2006, cited by this paper
QuestionBank: Creating a Corpus of Parse-Annotated Questions
2006, influential reference
Learning Accurate, Compact, and Interpretable Tree Annotation
2006, influential reference
Effective Self-Training for Parsing
2006, influential reference
Discriminative Training of a Neural Network Statistical Parser
2004, cited by this paper
Incremental Parsing with the Perceptron Algorithm
2004, cited by this paper
Accurate Unlexicalized Parsing
2003, cited by this paper
Inducing History Representations for Broad Coverage Statistical Parsing
2003, cited by this paper
A Linear Observed Time Statistical Parser Based on Maximum Entropy Models
1997, cited by this paper
Three Generative, Lexicalised Models for Statistical Parsing
1997, cited by this paper
Long Short-Term Memory
1997, influential reference
Building a Large Annotated Corpus of English: The Penn Treebank
1993, cited by this paper

CITED BY

OWL-S Grounding Parameters Matching by Means of LLM: Preliminary Investigation
2025, cites this paper
Adept: AI-Generated Text Detection Based on Phrasal Category N-Grams
2025, cites this paper
Trajectory-Ordered Objectives for Self-Supervised Representation Learning of Temporal Healthcare Data Using Transformers: Model Development and Evaluation Study
2025, cites this paper
Proposing TAGbank as a Corpus of Tree-Adjoining Grammar Derivations
2025, cites this paper
Domain Adaptation for Japanese Sentence Embeddings with Contrastive Learning based on Synthetic Sentence Generation
2025, cites this paper
Towards Improving the Reliability of LLMs in Requirements Engineering with Structured Confidence and Tag Governance
2025, cites this paper
Grammar as the backbone of language learning
2025, cites this paper
Step-by-step Instructions and a Simple Tabular Output Format Improve the Dependency Parsing Accuracy of LLMs
2025, cites this paper
Self-Correction Makes LLMs Better Parsers
2025, cites this paper
Abstractive Text Summarization for Bangla Language Using NLP and Machine Learning Approaches
2025, cites this paper
An inclusive review on deep learning techniques and their scope in handwriting recognition
2024, cites this paper
Generative Pretrained Structured Transformers: Unsupervised Syntactic Language Models at Scale
2024, cites this paper
An Effective Model for Aspect Category Sentiment Analysis on a Chinese Tobacco Dataset
2024, cites this paper
Explainability Approach-Based Series Arc Fault Detection Method for Photovoltaic Systems
2024, cites this paper
Information Extraction with Differentiable Beam Search on Graph RNNs
2024, cites this paper
jp-evalb: Robust Alignment-based PARSEVAL Measures
2024, cites this paper
Cross-Modal Attention Network for Detecting Multimodal Misinformation From Multiple Platforms
2024, cites this paper
Utilizing Deep Learning Technologies for Innovative Application
2024, cites this paper
Air Quality Prediction Based on Singular Spectrum Analysis and Artificial Neural Networks
2024, cites this paper
Hierarchical syntactic structure in human-like language models
2024, cites this paper
Automatic generation algorithm of power report summary based on machine learning
2024, cites this paper
A New Multimodal Large Model Framework for Knowledge-enhanced Image Caption Generation
2024, cites this paper
Revisiting Absence with Symptoms that *T* Show up Decades Later to Recover Empty Categories
2024, cites this paper
Entity-Aware Biaffine Attention Model for Improved Constituent Parsing with Reduced Entity Violations
2024, cites this paper
A survey on handwritten mathematical expression recognition: The rise of encoder-decoder and GNN models
2024, cites this paper
Syntactic Modeling and Neural-Based Parsing for Multifunction Radar Signal Interpretation
2024, cites this paper
Cross-domain Constituency Parsing by Leveraging Heterogeneous Data
2024, cites this paper
KALE-LM-Chem: Vision and Practice Toward an AI Brain for Chemistry
2024, cites this paper
Null Subjects in Spanish as a Machine Translation Problem
2024, cites this paper
Deep learning for named entity recognition: a survey
2024, cites this paper
Improving Production Management and Control with Intelligent Chatbot Services
2024, cites this paper
Deep Autoregressive Models as Causal Inference Engines
2024, cites this paper
PrivyNAS: Privacy-Aware Neural Architecture Search for Split Computing in Edge–Cloud Systems
2024, cites this paper
Neural Representation Learning in Linguistic Structured Prediction
2024, cites this paper
Reinforced Sequence Training based Subjective Bias Correction
2023, cites this paper
Hierarchical Multi-label Text Classification Method Based On Multi-level Decoupling
2023, cites this paper
Approximating CKY with Transformers
2023, cites this paper
Unleashing the True Potential of Sequence-to-Sequence Models for Sequence Tagging and Structure Parsing
2023, cites this paper
Context-Free Grammars and Constituency Parsing
2023, cites this paper
Answering Complex Questions over Text by Hybrid Question Parsing and Execution
2023, cites this paper
Towards English-centric Zero-shot Neural Machine Translation: The Analysis and Solution
2023, cites this paper
A Unified Framework for Synaesthesia Analysis
2023, cites this paper
Student as an Inherent Denoiser of Noisy Teacher
2023, cites this paper
User-Aware Prefix-Tuning is a Good Learner for Personalized Image Captioning
2023, cites this paper
Pushdown Layers: Encoding Recursive Structure in Transformer Language Models
2023, cites this paper
Multistage Collaborative Knowledge Distillation from Large Language Models
2023, cites this paper
A Novel Alignment-based Approach for PARSEVAL Measures
2023, cites this paper
VietLegalQA: Unsupervised Legal Question Answering for Vietnamese Using Cloze Translation Approach
2023, cites this paper
Structured Prediction with Stronger Consistency Guarantees
2023, cites this paper
Form follows Function: Text-to-Text Conditional Graph Generation based on Functional Requirements
2023, cites this paper
Retrieval-Augmented Parsing for Complex Graphs by Exploiting Structure and Uncertainty
2023, cites this paper
Underwater Robotics Semantic Parser Assistant
2023, cites this paper
Constituency Parsing Using LLMs
2023, cites this paper
On the Challenges of Fully Incremental Neural Dependency Parsing
2023, cites this paper
DyVal: Dynamic Evaluation of Large Language Models for Reasoning Tasks
2023, influential citation
On Graph-based Reentrancy-free Semantic Parsing
2023, cites this paper
Data-driven Communicative Behaviour Generation: A Survey
2023, cites this paper
Dual-interest Factorization-heads Attention for Sequential Recommendation
2023, cites this paper
Newton-Cotes Graph Neural Networks: On the Time Evolution of Dynamic Systems
2023, cites this paper
Adversarial Learning-Based Sentiment Analysis for Socially Implemented IoMT Systems
2023, cites this paper
Faster sorting algorithms discovered using deep reinforcement learning
2023, cites this paper
Grammar-Constrained Decoding for Structured NLP Tasks without Finetuning
2023, cites this paper
Multistage Collaborative Knowledge Distillation from a Large Language Model for Semi-Supervised Sequence Generation
2023, cites this paper
Hunting for Insider Threats Using LSTM-Based Anomaly Detection
2023, cites this paper
On Compositional Uncertainty Quantification for Seq2seq Graph Parsing
2023, cites this paper
Deep Learning for Natural Language Processing: A Survey
2023, cites this paper
Dependency Parsing via Sequence Generation
2022, cites this paper
Annotated History of Modern AI and Deep Learning
2022, cites this paper
SVD-PINNs: Transfer Learning of Physics-Informed Neural Networks via Singular Value Decomposition
2022, cites this paper
An Overview on Controllable Text Generation via Variational Auto-Encoders
2022, cites this paper
Structural String Decoder for Handwritten Mathematical Expression Recognition
2022, cites this paper
Autoregressive Structured Prediction with Language Models
2022, cites this paper
Shift-Reduce Task-Oriented Semantic Parsing with Stack-Transformers
2022, cites this paper
Private Semi-supervised Knowledge Transfer for Deep Learning from Noisy Labels
2022, cites this paper
Fast Rule-Based Decoding: Revisiting Syntactic Rules in Neural Constituency Parsing
2022, cites this paper
Zero-Shot 3D Drug Design by Sketching and Generating
2022, cites this paper
Model Blending for Text Classification
2022, cites this paper
GROOT: Corrective Reward Optimization for Generative Sequential Labeling
2022, cites this paper
Aspect-based Sentiment Analysis with Opinion Tree Generation
2022, cites this paper
Constrained Sequence-to-Tree Generation for Hierarchical Text Classification
2022, cites this paper
When More Data Hurts: A Troubling Quirk in Developing Broad-Coverage Natural Language Understanding Systems
2022, cites this paper
Framework for Automatic Semantic Annotation of Images Based on Image’s Low-Level Features and Surrounding Text
2022, cites this paper
Dense Paraphrasing for Textual Enrichment
2022, cites this paper
How Does Beam Search improve Span-Level Confidence Estimation in Generative Sequence Labeling?
2022, cites this paper
LSTM with particle Swam optimization for sales forecasting
2022, cites this paper
Global-local neighborhood based network representation for citation recommendation
2022, cites this paper
Towards recurrent neural network with multi-path feature fusion for signal modulation recognition
2022, cites this paper
An Attention-Based Digraph Convolution Network Enabled Framework for Congestion Recognition in Three-Dimensional Road Networks
2022, cites this paper
Attention-Based Bi-LSTM Model for Arabic Depression Classification
2022, cites this paper
Transformer Grammars: Augmenting Transformer Language Models with Syntactic Inductive Biases at Scale
2022, cites this paper
Automatic detection and segmentation of optic disc using a modified convolution network
2022, cites this paper
BenchCLAMP: A Benchmark for Evaluating Language Models on Syntactic and Semantic Parsing
2022, cites this paper
Disentangling Style and Speaker Attributes for TTS Style Transfer
2022, cites this paper
Syntactic Inductive Biases for Natural Language Processing
2022, cites this paper
A Brief Overview of Universal Sentence Representation Methods: A Linguistic View
2022, cites this paper
Towards Collaborative Neural-Symbolic Graph Semantic Parsing via Uncertainty
2022, cites this paper
Transfer learning-based query classification for intelligent building information spoken dialogue
2022, cites this paper
E-Tanh: a novel activation function for image processing neural network models
2022, cites this paper
Structured Prediction as Translation between Augmented Natural Languages
2021, cites this paper
Syntax-guided text generation via graph neural network
2021, cites this paper