Generative Adversarial Text to Image Synthesis
Scott E. Reed, Zeynep Akata, Xinchen Yan, Lajanugen Logeswaran, B. Schiele, Honglak Lee
Published 2016 in International Conference on Machine Learning
ABSTRACT
Automatic synthesis of realistic images from text would be interesting and useful, but current AI systems are still far from this goal. However, in recent years generic and powerful recurrent neural network architectures have been developed to learn discriminative text feature representations. Meanwhile, deep convolutional generative adversarial networks (GANs) have begun to generate highly compelling images of specific categories, such as faces, album covers, and room interiors. In this work, we develop a novel deep architecture and GAN formulation to effectively bridge these advances in text and image modeling, translating visual concepts from characters to pixels. We demonstrate the capability of our model to generate plausible images of birds and flowers from detailed text descriptions.
PUBLICATION RECORD
- Publication year
2016
- Venue
International Conference on Machine Learning
- Publication date
2016-05-17
- Fields of study
Computer Science
- Source metadata
Semantic Scholar
CONCEPTS
- birds and flowers
Two natural image categories used as the demonstration targets.
- deep architecture and GAN formulation
The combined model design that links text encodings to adversarial image generation.
Aliases: novel deep architecture and GAN formulation
- deep convolutional generative adversarial networks
Convolutional adversarial image generators used as the visual synthesis branch.
Aliases: GANs
- detailed text descriptions
Text inputs with enough detail to specify the desired visual content.
Aliases: text descriptions
- discriminative text feature representations
Text embeddings that capture descriptive content for conditioning image generation.
Aliases: text feature representations
- recurrent neural network architectures
Sequence models used to learn discriminative text representations from descriptions.
- text-to-image synthesis
The task of generating images from text descriptions.
Aliases: text to image synthesis