Semiconductor Optoelectronics, Vol. 45, Issue 6, 960 (2024)
Deep Fusion-GAN Enhancement Model for Text-to-Image
[1] Li D, Wang S, Zou J, et al. Paint4Poem: A dataset for artistic visualization of classical Chinese poems [J]. arXiv preprint arXiv: 2109.11682, 2021.
[2] Goodfellow I J, Pouget-Abadie J, Mirza M, et al. Generative adversarial networks [J]. Advances in Neural Information Processing Systems, 2014, 3: 2672-2680.
[3] Elman J L. Finding structure in time [J]. Cognitive Science, 1990, 14(2): 179-211.
[4] Hochreiter S, Schmidhuber J. Long short-term memory [J]. Neural Computation, 1997, 9(8): 1735-1780.
[5] Reed S, Akata Z, Yan X, et al. Generative adversarial text to image synthesis [C]// ICML'16: Proc. of the 33rd International Conference on Machine Learning, 2016: 1060-1069.
[6] Mirza M, Osindero S. Conditional generative adversarial nets [J]. arXiv preprint arXiv: 1411.1784, 2014.
[7] Dash A, Gamboa J C B, Ahmed S, et al. TAC-GAN: Text conditioned auxiliary classifier generative adversarial network [J]. arXiv preprint arXiv: 1703.06412, 2017.
[9] Zhang H, Xu T, Li H, et al. StackGAN: Text to photo-realistic image synthesis with stacked generative adversarial networks [C]// Proc. of the IEEE International Conference on Computer Vision, 2017: 5907-5915.
[10] Zhang H, Xu T, Li H, et al. StackGAN++: Realistic image synthesis with stacked generative adversarial networks [J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2018, 41(8): 1947-1962.
[11] Tao M, Tang H, Wu F, et al. DF-GAN: A simple and effective baseline for text-to-image synthesis [J]. arXiv preprint arXiv: 2008.05865, 2020.
[12] Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding [J]. arXiv preprint arXiv: 1810.04805, 2018.
[13] Xu T, Zhang P, Huang Q, et al. AttnGAN: Fine-grained text to image generation with attentional generative adversarial networks [J]. arXiv preprint arXiv: 1711.10485, 2017.
[14] Zhang H, Goodfellow I, Metaxas D, et al. Self-attention generative adversarial networks [J]. arXiv preprint arXiv: 1805.08318, 2018.
[15] Devlin J, Chang M W, Lee K, et al. BERT: Pre-training of deep bidirectional transformers for language understanding [J]. arXiv preprint arXiv: 1810.04805, 2018.
[16] Vaswani A, Shazeer N, Parmar N, et al. Attention is all you need [C]// 31st Conference on Neural Information Processing Systems (NIPS), 2017: 1-15.
[17] Perez E, Strub F, De Vries H, et al. FiLM: Visual reasoning with a general conditioning layer [J]. arXiv preprint arXiv: 1709.07871, 2017.
[18] Salimans T, Goodfellow I, Zaremba W, et al. Improved techniques for training GANs [J]. arXiv preprint arXiv: 1606.03498, 2016.
[19] Heusel M, Ramsauer H, Unterthiner T, et al. GANs trained by a two time-scale update rule converge to a Nash equilibrium [J]. arXiv preprint arXiv: 1706.08500, 2017.
WEI Yiran, YI Junkai, ZHU Kequan, TAN Lingling. Deep Fusion-GAN Enhancement Model for Text-to-Image[J]. Semiconductor Optoelectronics, 2024, 45(6): 960
Received: May 26, 2024
Accepted: Feb. 28, 2025
Published Online: Feb. 28, 2025