Semiconductor Optoelectronics, Vol. 45, Issue 6, 960 (2024)
Deep Fusion-GAN Enhancement Model for Text-to-Image
A deep fusion generative adversarial network (DF-GAN) enhancement model combined with a self-attention mechanism is proposed to address the low semantic relevance, blurred details, and inadequate structural integrity that arise in text-to-image generation tasks. First, the bidirectional encoder representations from transformers (BERT) model is used to mine the semantic features of the text context and, combined with the deep text-image fusion block, to match deep text semantics with regional image features. Second, a self-attention module is introduced at the architecture level as a supplement to the convolution modules, strengthening long-range and multilevel dependencies. Experimental results demonstrate that the proposed enhancement model not only strengthens the semantic relationship between text and image but also ensures precise details and overall integrity in the generated images.
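The paper does not include code; the following is a minimal PyTorch sketch, assuming Hugging Face Transformers for BERT, of the two components the abstract describes: a SAGAN-style self-attention block used as a supplement to convolution, and BERT sentence encoding whose output could feed a text-image fusion block. Module names, dimensions, and the use of the [CLS] vector are illustrative assumptions, not the authors' implementation.

```python
# Illustrative sketch only; not the authors' code.
import torch
import torch.nn as nn
from transformers import BertTokenizer, BertModel


class SelfAttention(nn.Module):
    """Self-attention over the spatial positions of a feature map (SAGAN-style),
    intended as a supplement to convolution for long-range dependencies."""

    def __init__(self, channels: int):
        super().__init__()
        self.query = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.key = nn.Conv2d(channels, channels // 8, kernel_size=1)
        self.value = nn.Conv2d(channels, channels, kernel_size=1)
        self.gamma = nn.Parameter(torch.zeros(1))  # learned residual weight

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        q = self.query(x).flatten(2).transpose(1, 2)   # (b, hw, c//8)
        k = self.key(x).flatten(2)                     # (b, c//8, hw)
        attn = torch.softmax(q @ k, dim=-1)            # (b, hw, hw) position-to-position weights
        v = self.value(x).flatten(2)                   # (b, c, hw)
        out = (v @ attn.transpose(1, 2)).view(b, c, h, w)
        return self.gamma * out + x                    # residual connection


def encode_text(sentences, device="cpu"):
    """Encode sentences with BERT; here the [CLS] vector is taken as the
    sentence embedding (an assumption) for downstream text-image fusion."""
    tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
    bert = BertModel.from_pretrained("bert-base-uncased").to(device).eval()
    batch = tokenizer(sentences, padding=True, truncation=True,
                      return_tensors="pt").to(device)
    with torch.no_grad():
        out = bert(**batch)
    return out.last_hidden_state[:, 0]                 # (batch, 768)


if __name__ == "__main__":
    # Toy check: the attention block preserves the feature-map shape.
    feats = torch.randn(2, 64, 16, 16)
    print(SelfAttention(64)(feats).shape)              # torch.Size([2, 64, 16, 16])
```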
WEI Yiran, YI Junkai, ZHU Kequan, TAN Lingling. Deep Fusion-GAN Enhancement Model for Text-to-Image[J]. Semiconductor Optoelectronics, 2024, 45(6): 960
Received: May 26, 2024
Accepted: Feb. 28, 2025
Published Online: Feb. 28, 2025