Laser & Optoelectronics Progress, Volume 58, Issue 4, 0410012 (2021)

Text Image Generation Method with Scene Description

Youwen Huang, Bin Zhou*, and Xin Tang
Author Affiliations
  • School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China

    In this paper, a method for generating images from scene description text is studied, and a generative adversarial network model combined with scene description is proposed to solve the problems of overlapping and missing objects in generated images. First, a mask generation network preprocesses the dataset to provide each object in the dataset with a segmentation mask vector. These vectors serve as constraints for training a layout prediction network on the text descriptions, yielding the specific location and size of each object in the scene layout. The result is then fed into a cascaded refinement network model to complete image generation. Finally, the scene layout and the generated images are passed to a layout discriminator, which bridges the gap between them to obtain a more realistic scene layout. The experimental results demonstrate that the proposed model generates more natural images that better match the text description, effectively improving the authenticity and diversity of the generated images.
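    The abstract describes a three-stage pipeline: a layout prediction network places each described object, a cascaded refinement network renders the layout into an image, and a layout discriminator scores the layout-image pair. A minimal structural sketch of the first two stages is given below; all names, the uniform-strip placement rule, and the character-grid "rendering" are illustrative stand-ins, not the authors' networks.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ObjectLayout:
    """One object's predicted slot in the scene layout (normalized coords)."""
    label: str
    x: float  # top-left corner, in [0, 1]
    y: float
    w: float  # width and height, in [0, 1]
    h: float

def predict_layout(objects: List[str]) -> List[ObjectLayout]:
    """Stand-in for the layout prediction network: assign each named
    object a non-overlapping slot on a horizontal strip. The real
    network learns these positions from text under mask constraints."""
    n = len(objects)
    return [ObjectLayout(obj, i / n, 0.25, 1.0 / n, 0.5)
            for i, obj in enumerate(objects)]

def render(layouts: List[ObjectLayout], size: int = 8) -> List[List[str]]:
    """Stand-in for the cascaded refinement network: rasterize each
    object's label onto a coarse grid (the real model outputs an
    RGB image refined over several cascade stages)."""
    grid = [["." for _ in range(size)] for _ in range(size)]
    for lo in layouts:
        for r in range(int(lo.y * size), int((lo.y + lo.h) * size)):
            for c in range(int(lo.x * size), int((lo.x + lo.w) * size)):
                grid[r][c] = lo.label[0]
    return grid

layouts = predict_layout(["sheep", "grass"])
image = render(layouts)
```

In the actual model, a layout discriminator would receive `(layouts, image)` pairs and be trained adversarially, pushing the predicted layouts toward configurations consistent with realistic images.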

    Youwen Huang, Bin Zhou, Xin Tang. Text Image Generation Method with Scene Description[J]. Laser & Optoelectronics Progress, 2021, 58(4): 0410012

    Paper Information

    Category: Image Processing

    Received: Jun. 30, 2020

    Accepted: Aug. 7, 2020

    Published Online: Feb. 24, 2021

    The Author Email: Zhou Bin (zhoubin_master@163.com)

    DOI: 10.3788/LOP202158.0410012
