Semiconductor Optoelectronics, Vol. 45, Issue 6, 960 (2024)
Deep Fusion-GAN Enhancement Model for Text-to-Image
[1] Li D, Wang S, Zou J, et al. Paint4Poem: A dataset for artistic visualization of classical Chinese poems [J]. arXiv preprint arXiv: 2109.11682, 2021.
[2] Goodfellow I J, Pouget-Abadie J, Mirza M, et al. Generative adversarial networks [J]. Advances in Neural Information Processing Systems, 2014, 3: 2672-2680.
[3] Elman J L. Finding structure in time [J]. Cognitive Science, 1990, 14(2): 179-211.
[4] Hochreiter S, Schmidhuber J. Long short-term memory [J]. Neural Computation, 1997, 9(8): 1735-1780.
[5] Reed S, Akata Z, Yan X, et al. Generative adversarial text to image synthesis [C]// ICML'16: Proc. of the 33rd International Conference on Machine Learning, 2016: 1060-1069.
[6] Mirza M, Osindero S. Conditional generative adversarial nets [J]. arXiv preprint arXiv: 1411.1784, 2014.
[7] Dash A, Gamboa J C B, Ahmed S, et al. TAC-GAN: Text conditioned auxiliary classifier generative adversarial network [J]. arXiv preprint arXiv: 1703.06412, 2017.
[9] Zhang H, Xu T, Li H, et al. StackGAN: Text to photo-realistic image synthesis with stacked generative adversarial networks [C]// Proc. of the IEEE International Conference on Computer Vision, 2017: 5907-5915.
[10] Zhang H, Xu T, Li H, et al. StackGAN++: Realistic image synthesis with stacked generative adversarial networks [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 41(8): 1947-1962.
[11] Tao M, Tang H, Wu F, et al. DF-GAN: A simple and effective baseline for text-to-image synthesis [J]. arXiv preprint arXiv: 2008.05865, 2020.
[12] Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding [J]. arXiv preprint arXiv: 1810.04805, 2018.
[13] Xu T, Zhang P, Huang Q, et al. AttnGAN: Fine-grained text to image generation with attentional generative adversarial networks [J]. arXiv preprint arXiv: 1711.10485, 2017.
[14] Zhang H, Goodfellow I, Metaxas D, et al. Self-attention generative adversarial networks [J]. arXiv preprint arXiv: 1805.08318, 2018.
[15] Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding [J]. arXiv preprint arXiv: 1810.04805, 2018.
[16] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need [C]// 31st Conference on Neural Information Processing Systems (NIPS), 2017: 1-15.
[17] Perez E, Strub F, De Vries H, et al. FiLM: Visual reasoning with a general conditioning layer [J]. arXiv preprint arXiv: 1709.07871, 2017.
[18] Salimans T, Goodfellow I, Zaremba W, et al. Improved techniques for training GANs [J]. arXiv preprint arXiv: 1606.03498, 2016.
[19] Heusel M, Ramsauer H, Unterthiner T, et al. GANs trained by a two time-scale update rule converge to a Nash equilibrium [J]. arXiv preprint arXiv: 1706.08500, 2017.
WEI Yiran, YI Junkai, ZHU Kequan, TAN Lingling. Deep Fusion-GAN Enhancement Model for Text-to-Image[J]. Semiconductor Optoelectronics, 2024, 45(6): 960
Received: May 26, 2024
Accepted: Feb. 28, 2025
Published Online: Feb. 28, 2025