SMT vs NMT: A Comparison over Hindi and Bengali Simple Sentences

S. Mahata, Soumil Mandal, Dipankar Das, Sivaji Bandyopadhyay

Published 2018 in ICON

ABSTRACT

In this article, we identify the qualitative differences between Statistical Machine Translation (SMT) and Neural Machine Translation (NMT) outputs. We address two questions: (1) does NMT perform as well as SMT, and (2) does using simple sentences as training units further improve the quality of MT output? To gain insight, we developed three core models: an SMT model based on the Moses toolkit, a character-level NMT model, and a word-level NMT model. All of the systems use English-Hindi and English-Bengali language pairs containing simple sentences as well as sentences of other complexity. To preserve translation semantics with respect to the target words of a sentence, we employed soft attention in our word-level NMT model. We further evaluated all the systems with respect to the scenarios in which they succeed and fail. Finally, translation quality was validated using the BLEU and TER metrics along with manual parameters such as fluency and adequacy. We observed that NMT outperforms SMT on simple sentences, whereas SMT outperforms NMT when sentences of all types are considered.
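As an illustration of the automatic evaluation mentioned in the abstract, the following is a minimal sketch of sentence-level BLEU (clipped n-gram precision combined with a brevity penalty). It assumes a single reference, whitespace tokenization, and no smoothing; the function names `ngrams` and `sentence_bleu` are hypothetical, and the authors' actual evaluation scripts are not described in this record and may differ.

```python
import math
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def sentence_bleu(candidate, reference, max_n=4):
    """Single-reference, unsmoothed sentence-level BLEU in [0, 1].

    Illustrative only: whitespace tokenization and uniform n-gram weights
    are assumptions, not details taken from the paper.
    """
    cand, ref = candidate.split(), reference.split()
    if not cand:
        return 0.0
    log_precisions = []
    for n in range(1, max_n + 1):
        cand_counts, ref_counts = ngrams(cand, n), ngrams(ref, n)
        # Clip each candidate n-gram count by its count in the reference.
        overlap = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
        total = sum(cand_counts.values())
        if overlap == 0 or total == 0:
            return 0.0  # unsmoothed: any zero precision gives BLEU = 0
        log_precisions.append(math.log(overlap / total))
    # Brevity penalty: penalize candidates shorter than the reference.
    bp = 1.0 if len(cand) > len(ref) else math.exp(1 - len(ref) / len(cand))
    # Geometric mean of the n-gram precisions, scaled by the penalty.
    return bp * math.exp(sum(log_precisions) / max_n)
```

For example, `sentence_bleu("the cat sat on the mat", "the cat sat on the mat")` scores a perfect match, while a candidate that shares no words with the reference scores 0.0. Because the sketch is unsmoothed, short simple sentences with no matching 4-gram also score 0.0, which is one reason corpus-level BLEU or a smoothed variant is usually preferred in practice.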

PUBLICATION RECORD

Publication year: 2018
Venue: ICON
Publication date: 2018-12-12
Fields of study: Linguistics, Computer Science
Identifiers: arXiv 1812.04898
Source metadata: Semantic Scholar

REFERENCES

MTIL2017: Machine Translation Using Recurrent Neural Network on Statistical Machine Translation (2019)
Attention is All you Need (2017)
BUCC2017: A Hybrid Approach for Identifying Parallel Sentences in Comparable Corpora (2017)
Google's Neural Machine Translation System: Bridging the Gap between Human and Machine Translation (2016)
A Character-level Decoder without Explicit Segmentation for Neural Machine Translation (2016)
WMT2016: A Hybrid Approach to Bilingual Document Alignment (2016)
Improved Neural Machine Translation with SMT Features (2016)
Effective Approaches to Attention-based Neural Machine Translation (2015)
Statistical Machine Translation (2014)
A Recursive Recurrent Neural Network for Statistical Machine Translation (2014)
Addressing the Rare Word Problem in Neural Machine Translation (2014)
Neural Machine Translation by Jointly Learning to Align and Translate (2014)
On the Properties of Neural Machine Translation: Encoder–Decoder Approaches (2014)
The Stanford CoreNLP Natural Language Processing Toolkit (2014)
Recurrent Continuous Translation Models (2013)
Decoding with Large-Scale Neural Language Models Improves Translation (2013)
KenLM: Faster and Smaller Language Model Queries (2011)
Eye tracking as an MT evaluation technique (2010)
Large-Scale Machine Learning with Stochastic Gradient Descent (2010)
Moses: Open Source Toolkit for Statistical Machine Translation (2007)
A Study of Translation Edit Rate with Targeted Human Annotation (2006)
A Study of Translation Error Rate with Targeted Human Annotation (2005)
tRuEcasIng (2003)
A Systematic Comparison of Various Statistical Alignment Models (2003)
Bleu: a Method for Automatic Evaluation of Machine Translation (2002)
Long Short-Term Memory (1997, influential reference)
Machine translation of languages (1956)
Translation (1873)

CITED BY

An empirical investigation of the neural base approaches based on the sentence length using low-resource language: English-to-Nyishi (2025)
Comparative analysis of machine translation for Hindi-Dogri text using rule-based, statistical, and neural approaches (2025)
Comparative study of low resource Digaru language using SMT and NMT (2024)
An empirical analysis on statistical and neural machine translation system for English to Mizo language (2023)
Statistical Machine Translation for Indic Languages (2023)
Simplification of English and Bengali Sentences for Improving Quality of Machine Translation (2022)
Bilingual Parallel Corpora: A Major Resource for Developing Computational Tools for Automatic Processing of Hindi-Dogri Language Pair (2022)
Pseudocode Generation from Source Code Using the BART Model (2022)
Source Code Generation-based on NLP and Ontology (2022)
Investigating the roles of sentiment in machine translation (2021)
SMT Versus NMT: An Experiment with Punjabi–English (2021)
An Improved English-to-Mizo Neural Machine Translation (2021)
Preparation of Sentiment tagged Parallel Corpus and Testing its effect on Machine Translation (2020)
Data Augmentation and Terminology Integration for Domain-Specific Sinhala-English-Tamil Statistical Machine Translation (2020)
The Roles of Language Models and Hierarchical Models in Neural Sequence-to-Sequence Prediction (2020)
Code-Mixed to Monolingual Translation Framework (2019)
JUMT at WMT2019 News Translation Task: A Hybrid Approach to Machine Translation for Lithuanian to English (2019)
Pinyin as a Feature of Neural Machine Translation for Chinese Speech Recognition Error Correction (2019)
Sentence Simplification using Syntactic Parse trees (2019)
Neural Machine Translation: A Review (2019)