System Building Cost vs. Output Quality in Data-to-Text Generation

Published 2009 in European Workshop on Natural Language Generation

ABSTRACT

Data-to-text generation systems tend to be knowledge-based and manually built, which limits their reusability and makes them time- and cost-intensive to create and maintain. Methods for automating (part of) the system building process exist, but do such methods risk a loss in output quality? In this paper, we investigate the cost/quality trade-off in generation system building. We compare four new data-to-text systems which were created by predominantly automatic techniques against six existing systems for the same domain which were created by predominantly manual techniques. We evaluate the ten systems using intrinsic automatic metrics and human quality ratings. We find that increasing the degree to which system building is automated does not necessarily result in a reduction in output quality. We find furthermore that standard automatic evaluation metrics underestimate the quality of handcrafted systems and overestimate the quality of automatically created systems.

PUBLICATION RECORD

Publication year: 2009
Venue: European Workshop on Natural Language Generation
Publication date: 2009-03-30
Fields of study: Computer Science
DOI: 10.3115/1610195.1610198
Source metadata: Semantic Scholar

REFERENCES

Automatic generation of weather forecast texts using comprehensive probabilistic generation-space models
2008, influential reference
SUMTIME-METEO: Parallel Corpus of Naturally Occurring Forecast Texts and Weather Data
2008, cited by this paper
Generation by Inverting a Semantic Parser that Uses Statistical Machine Translation
2007, cited by this paper
Moses: Open Source Toolkit for Statistical Machine Translation
2007, cited by this paper
Comparing Automatic and Human Evaluation of NLG Systems
2006, cited by this paper
Learning for Semantic Parsing with Statistical Machine Translation
2006, cited by this paper
An Introduction to Synchronous Grammars
2006, cited by this paper
Automatic Evaluation of Machine Translation Quality
2006, influential reference
On Some Pitfalls in Automatic Evaluation and Significance Testing for MT
2005, cited by this paper
Choosing words in computer-generated weather forecasts
2005, cited by this paper
Statistical Phrase-Based Translation
2003, cited by this paper
A Systematic Comparison of Various Statistical Alignment Models
2003, cited by this paper
Bleu: a Method for Automatic Evaluation of Machine Translation
2002, cited by this paper
Automatic evaluation of machine translation quality using n-gram co-occurrence statistics
2002, influential reference
Forest-Based Statistical Sentence Generation
2000, influential reference
Generation that Exploits Corpus-Based Statistical Knowledge
1998, cited by this paper
Building applied natural language generation systems
1997, cited by this paper
The Mathematics of Statistical Machine Translation: Parameter Estimation
1993, cited by this paper

CITED BY

Self-training from Self-memory in Data-to-text Generation
2024, cites this paper
Is ChatGPT better than Human Annotators? Potential and Limitations of ChatGPT in Explaining Implicit Hate Speech
2023, influential citation
Chain of Explanation: New Prompting Method to Generate Quality Natural Language Explanation for Implicit Hate Speech
2022, cites this paper
Chain of Explanation: New Prompting Method to Generate Higher Quality Natural Language Explanation for Implicit Hate Speech
2022, cites this paper
Modified Transformer Architecture to Explain Black Box Models in Narrative Form
2022, cites this paper
Automatic Generation of Test Documents Based on Knowledge Extraction
2022, cites this paper
A Study of Automatic Metrics for the Evaluation of Natural Language Explanations
2021, cites this paper
Neural Data-to-Text Generation with Dynamic Content Planning
2020, cites this paper
Disentangling the Properties of Human Evaluation Methods: A Classification System to Support Comparability, Meta-Evaluation and Reproducibility Testing
2020, cites this paper
Intégration de VerbNet dans un réalisateur profond
2018, cites this paper
Generating Descriptions from Structured Data Using a Bifocal Attention Mechanism and Gated Orthogonalization
2018, cites this paper
Probabilistic Verb Selection for Data-to-Text Generation
2018, cites this paper
Automated learning of templates for data-to-text generation: comparing rule-based, statistical and neural methods
2018, cites this paper
The Code2Text Challenge: Text Generation in Source Libraries
2017, cites this paper
Statistical Natural Language Generation from Tabular Non-textual Data
2016, influential citation
Response Generation in Dialogue Using a Tailored PCFG Parser
2015, cites this paper
Stochastic Language Generation Using Situated PCFGs
2015, cites this paper
A Probabilistic Approach to Text Generation of Human Motions extracted from Kinect Videos
2013, cites this paper
Generating Weather Forecast Texts with Case Based Reasoning
2012, cites this paper
Discrete vs. Continuous Rating Scales for Language Evaluation in NLP
2011, cites this paper
Unsupervised Alignment of Comparable Data and Text Resources
2011, cites this paper
Sentence generation for artificial brains: A glocal similarity-matching approach
2010, cites this paper
Harvesting Re-usable High-level Rules for Expository Dialogue Generation
2010, cites this paper
A Simple Domain-Independent Probabilistic Approach to Generation
2010, cites this paper
Extracting Parallel Fragments from Comparable Corpora for Data-to-text Generation
2010, cites this paper
Introducing Shared Tasks to NLG: The TUNA Shared Task Evaluation Challenges
2010, cites this paper
Evaluating a dialog language generation system: comparing the mountain system to other NLG approaches
2010, influential citation
Introducing shared task evaluation to NLG : The TUNA shared task evaluation challenges
2010, cites this paper
Assessing the Trade-Off between System Building Cost and Output Quality in Data-to-Text Generation
2010, cites this paper
Comparing Rating Scales and Preference Judgements in Language Evaluation
2010, cites this paper
Data-driven Natural Language Generation: Making Machines Talk Like Humans Using Natural Corpora
2010, cites this paper
From data to text in the Neonatal Intensive Care Unit: Using NLG technology for decision support and information management
2009, cites this paper
Answer generation for Chinese cuisine QA system
2009, cites this paper