Image captioning is a challenging modality transformation task spanning computer vision and natural language processing; it aims to understand the content of an image and describe it in natural language. Recent work has found that relationship information between objects in an image is important for generating more vivid and readable sentences. Much research has addressed mining and learning these relationships so that captioning models can make use of them. This paper mainly summarizes the methods of relational representation and relational encoding in image captioning. We also discuss the advantages and disadvantages of these methods and list commonly used datasets for the relational captioning task. Finally, we highlight the current problems and challenges in this task.
A Survey on Learning Objects' Relationship for Image Captioning
Du Runyan, Zhang Wenkai, Guo Zhi, Sun Xian
Published 2023 in Computational Intelligence and Neuroscience
PUBLICATION RECORD
- Publication year: 2023
- Venue: Computational Intelligence and Neuroscience
- Publication date: 2023-05-29
- Fields of study: Medicine, Computer Science
- Source metadata: Semantic Scholar, PubMed