Semiconductor Optoelectronics, Vol. 45, Issue 6, 960 (2024)

Deep Fusion-GAN Enhancement Model for Text-to-Image

WEI Yiran, YI Junkai, ZHU Kequan, and TAN Lingling
Author Affiliations
  • College of Automation, Beijing Information Science and Technology University, Beijing 100192, China
    References (18)

    [1] Li D, Wang S, Zou J, et al. Paint4Poem: A dataset for artistic visualization of classical Chinese poems [J]. arXiv preprint arXiv: 2109.11682, 2021.

    [2] Goodfellow I J, Pouget-Abadie J, Mirza M, et al. Generative adversarial networks [J]. Advances in Neural Information Processing Systems, 2014, 27: 2672-2680.

    [3] Elman J L. Finding structure in time [J]. Cognitive Science, 1990, 14(2): 179-211.

    [4] Hochreiter S, Schmidhuber J. Long short-term memory [J]. Neural Computation, 1997, 9(8): 1735-1780.

    [5] Reed S, Akata Z, Yan X, et al. Generative adversarial text to image synthesis [C]// ICML'16: Proc. of the 33rd International Conference on Machine Learning, 2016: 1060-1069.

    [6] Mirza M, Osindero S. Conditional generative adversarial nets [J]. arXiv preprint arXiv: 1411.1784, 2014.

    [7] Dash A, Gamboa J C B, Ahmed S, et al. TAC-GAN: Text conditioned auxiliary classifier generative adversarial network [J]. arXiv preprint arXiv: 1703.06412, 2017.

    [9] Zhang H, Xu T, Li H, et al. StackGAN: Text to photo-realistic image synthesis with stacked generative adversarial networks [C]// Proc. of the IEEE International Conference on Computer Vision, 2017: 5907-5915.

    [10] Zhang H, Xu T, Li H, et al. StackGAN++: Realistic image synthesis with stacked generative adversarial networks [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 41(8): 1947-1962.

    [11] Tao M, Tang H, Wu F, et al. DF-GAN: A simple and effective baseline for text-to-image synthesis [J]. arXiv preprint arXiv: 2008.05865, 2020.

    [12] Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding [J]. arXiv preprint arXiv: 1810.04805, 2018.

    [13] Xu T, Zhang P, Huang Q, et al. AttnGAN: Fine-grained text to image generation with attentional generative adversarial networks [J]. arXiv preprint arXiv: 1711.10485, 2017.

    [14] Zhang H, Goodfellow I, Metaxas D, et al. Self-attention generative adversarial networks [J]. arXiv preprint arXiv: 1805.08318, 2018.

    [15] Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding [J]. arXiv preprint arXiv: 1810.04805, 2018.

    [16] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need [C]// 31st Conference on Neural Information Processing Systems (NIPS), 2017: 1-15.

    [17] Perez E, Strub F, De Vries H, et al. FiLM: Visual reasoning with a general conditioning layer [J]. arXiv preprint arXiv: 1709.07871, 2017.

    [18] Salimans T, Goodfellow I, Zaremba W, et al. Improved techniques for training GANs [J]. arXiv preprint arXiv: 1606.03498, 2016.

    [19] Heusel M, Ramsauer H, Unterthiner T, et al. GANs trained by a two time-scale update rule converge to a Nash equilibrium [J]. arXiv preprint arXiv: 1706.08500, 2017.

    Citation
    WEI Yiran, YI Junkai, ZHU Kequan, TAN Lingling. Deep Fusion-GAN Enhancement Model for Text-to-Image [J]. Semiconductor Optoelectronics, 2024, 45(6): 960.

    Paper Information

    Received: May 26, 2024

    Accepted: Feb. 28, 2025

    Published Online: Feb. 28, 2025

    DOI: 10.16818/j.issn1001-5868.2024052602
