Generating captions without looking beyond objects

Hendrik Heuer, Christof Monz, A. Smeulders

Published 2016 in arXiv.org

ABSTRACT

This paper explores new evaluation perspectives for image captioning and introduces a noun translation task that achieves comparable caption generation performance by translating from a set of nouns to full captions. This implies that in image captioning, all word categories other than nouns can be evoked by a powerful language model without sacrificing performance on n-gram precision. The paper also investigates lower and upper bounds on how much the individual word categories in a caption contribute to the final BLEU score, finding large room for improvement on nouns, verbs, and prepositions.
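The per-category analysis described above is measured in terms of BLEU, which rests on clipped (modified) n-gram precision between a candidate caption and its references. As a minimal illustrative sketch (not the authors' code, and omitting BLEU's brevity penalty and multi-reference handling):

```python
from collections import Counter

def ngrams(tokens, n):
    """All contiguous n-grams of a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def modified_precision(candidate, reference, n):
    """Clipped n-gram precision, as used inside BLEU:
    each candidate n-gram is credited at most as often
    as it appears in the reference."""
    cand_counts = Counter(ngrams(candidate, n))
    ref_counts = Counter(ngrams(reference, n))
    clipped = sum(min(c, ref_counts[g]) for g, c in cand_counts.items())
    total = sum(cand_counts.values())
    return clipped / total if total else 0.0

candidate = "a dog runs on the grass".split()
reference = "a dog is running on the grass".split()
print(modified_precision(candidate, reference, 1))  # 5 of 6 unigrams matched
```

Swapping out one word category in `candidate` (e.g. replacing the verb) and re-scoring is the kind of probe that bounds how much that category can contribute to the overall score.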

PUBLICATION RECORD

  • Publication year

    2016

  • Venue

    arXiv.org

  • Publication date

    2016-10-01

  • Fields of study

    Computer Science


  • Source metadata

    Semantic Scholar

CLAIMS

  • No claims are published for this paper.

CONCEPTS

  • No concepts are published for this paper.

CITED BY

18 citing papers