Laser & Optoelectronics Progress, Volume 58, Issue 4, 0410012 (2021)

Text Image Generation Method with Scene Description

Youwen Huang, Bin Zhou*, and Xin Tang
Author Affiliations
  • School of Information Engineering, Jiangxi University of Science and Technology, Ganzhou, Jiangxi 341000, China

    In this paper, a method for generating images from scene description text is studied, and a generative adversarial network model combined with scene description is proposed to solve the problems of overlapping and missing objects in generated images. First, a mask generation network preprocesses the dataset to provide each object in the dataset with a segmentation mask vector. These vectors serve as constraints for training a layout prediction network on the text descriptions, yielding the specific location and size of each object in the scene layout. The result is then fed into a cascaded refinement network model to complete image generation. Finally, the scene layout and the generated images are passed to a layout discriminator, which bridges the gap between them to obtain a more realistic scene layout. The experimental results demonstrate that the proposed model generates more natural images that better match the text description, effectively improving the authenticity and diversity of the generated images.
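    The abstract describes a three-stage pipeline: a layout prediction network places each described object, a cascaded refinement network renders the layout into an image, and a layout discriminator scores the layout-image pair. A minimal structural sketch of the first two stages is given below; all names, the uniform-strip placement rule, and the character-grid "rendering" are illustrative stand-ins, not the authors' networks.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class ObjectLayout:
    """One object's predicted slot in the scene layout (normalized coords)."""
    label: str
    x: float  # top-left corner, in [0, 1]
    y: float
    w: float  # width and height, in [0, 1]
    h: float

def predict_layout(objects: List[str]) -> List[ObjectLayout]:
    """Stand-in for the layout prediction network: assign each named
    object a non-overlapping slot on a horizontal strip. The real
    network learns these positions from text under mask constraints."""
    n = len(objects)
    return [ObjectLayout(obj, i / n, 0.25, 1.0 / n, 0.5)
            for i, obj in enumerate(objects)]

def render(layouts: List[ObjectLayout], size: int = 8) -> List[List[str]]:
    """Stand-in for the cascaded refinement network: rasterize each
    object's label onto a coarse grid (the real model outputs an
    RGB image refined over several cascade stages)."""
    grid = [["." for _ in range(size)] for _ in range(size)]
    for lo in layouts:
        for r in range(int(lo.y * size), int((lo.y + lo.h) * size)):
            for c in range(int(lo.x * size), int((lo.x + lo.w) * size)):
                grid[r][c] = lo.label[0]
    return grid

layouts = predict_layout(["sheep", "grass"])
image = render(layouts)
```

In the actual model, a layout discriminator would receive `(layouts, image)` pairs and be trained adversarially, pushing the predicted layouts toward configurations consistent with realistic images.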

    Youwen Huang, Bin Zhou, Xin Tang. Text Image Generation Method with Scene Description[J]. Laser & Optoelectronics Progress, 2021, 58(4): 0410012

    Paper Information

    Category: Image Processing

    Received: Jun. 30, 2020

    Accepted: Aug. 7, 2020

    Published Online: Feb. 24, 2021

    The Author Email: Zhou Bin (zhoubin_master@163.com)

    DOI: 10.3788/LOP202158.0410012
