Infrared and Laser Engineering, Vol. 54, Issue 3, 20240486 (2025)
Improved CycleGAN algorithm to transfer visible images to infrared images (invited)
Author Affiliations
1 Officers College of PAP, Chengdu 610000, China
2 Fujian Key Laboratory of Light Propagation and Transformation, College of Information Science and Engineering, Huaqiao University, Xiamen 361021, China
3 Institute of Fluid Physics, China Academy of Engineering Physics, Mianyang 621000, China
4 College of Physics and Information Engineering, Minnan Normal University, Zhangzhou 363000, China
Objective
The primary objective of this research is to improve the transfer of visible images to infrared images, addressing the limitations of conventional CycleGAN models, which often lose detail and introduce artifacts. This matters for applications in security surveillance, medical imaging, and remote sensing, where the ability to discern thermal variations is essential. We develop an improved CycleGAN-based algorithm that transfers visible images into infrared images while preserving details and reducing artifacts, thereby improving the perception capability and accuracy of systems that rely on infrared imagery.
Methods
Our approach is an improved CycleGAN network that integrates several components to enhance the transfer process. The generator is equipped with an Agent Attention Mechanism, which focuses on the most significant features within an image and thereby improves the quality of the translated infrared images. This mechanism allows the model to capture fine details and global structure more effectively. Furthermore, we adopt the Learned Perceptual Image Patch Similarity (LPIPS) as the cycle consistency loss function. LPIPS measures the perceptual similarity between images based on deep features, so that the reconstructed images remain consistent with the inputs in content and style. This improves on conventional pixel-wise loss functions, which often fail to capture the perceptual quality of images. To strengthen the discriminator's ability to assess the authenticity of the generated images, we build it on a PatchGAN architecture. We also incorporate the ContraNorm module, which increases the discriminator's sensitivity to image details and improves its ability to distinguish real images from synthesized ones. The model is trained on three sets of data: images of volcanic activity, the FLAME2 dataset of wildfire remote sensing images, and the OSU-CT pedestrian scene dataset. Together they provide a diverse range of scenarios for testing the effectiveness of the model.
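To make the role of LPIPS as a cycle consistency loss concrete, the following is a minimal sketch and not the authors' implementation. It assumes the usual CycleGAN form L_cyc = d(F(G(x)), x) + d(G(F(y)), y), with d an LPIPS-style distance: channel vectors of deep features are unit-normalized, squared differences are weighted per channel, averaged over spatial positions, and summed over layers. The feature extractor `toy_features`, the weights `TOY_WEIGHTS`, and the generators `identity` and `brighten` are hypothetical stand-ins; real LPIPS uses features and learned weights from a pretrained network.

```python
import math


def unit_normalize(layer, eps=1e-10):
    """Scale the channel vector at each spatial position to unit L2 norm."""
    normalized = []
    for vec in layer:
        norm = math.sqrt(sum(v * v for v in vec)) + eps
        normalized.append([v / norm for v in vec])
    return normalized


def lpips_style_distance(feats_a, feats_b, channel_weights):
    """Perceptual distance between two images, given their deep features.

    feats_a, feats_b: one entry per layer; each layer is a list of spatial
        positions, and each position is a list of channel activations.
    channel_weights: one list of non-negative weights per layer (learned in
        LPIPS, fixed by hand in this sketch).
    """
    total = 0.0
    for layer_a, layer_b, weights in zip(feats_a, feats_b, channel_weights):
        norm_a, norm_b = unit_normalize(layer_a), unit_normalize(layer_b)
        layer_sum = 0.0
        for vec_a, vec_b in zip(norm_a, norm_b):
            layer_sum += sum(
                w * (a - b) ** 2 for w, a, b in zip(weights, vec_a, vec_b)
            )
        total += layer_sum / len(norm_a)  # average over spatial positions
    return total  # summed over layers


def cycle_consistency_loss(x, y, G, F, extract, channel_weights):
    """L_cyc = d(F(G(x)), x) + d(G(F(y)), y), with d the distance above."""
    def d(a, b):
        return lpips_style_distance(extract(a), extract(b), channel_weights)
    return d(F(G(x)), x) + d(G(F(y)), y)


def toy_features(img):
    """Hypothetical stand-in for a pretrained network: two small layers."""
    layer1 = [[p, 1.0 - p] for p in img]
    layer2 = [
        [(img[i] + img[i + 1]) / 2.0, 1.0] for i in range(0, len(img) - 1, 2)
    ]
    return [layer1, layer2]


TOY_WEIGHTS = [[1.0, 1.0], [0.5, 0.5]]  # hand-picked, not learned

visible = [0.1, 0.4, 0.7, 0.9]   # toy "visible" image, flattened
infrared = [0.8, 0.6, 0.3, 0.2]  # toy "infrared" image, flattened


def identity(img):
    """Stand-in generator that reproduces its input exactly."""
    return list(img)


def brighten(img):
    """Stand-in generator that distorts its input by raising intensities."""
    return [min(1.0, p + 0.2) for p in img]


perfect = cycle_consistency_loss(
    visible, infrared, identity, identity, toy_features, TOY_WEIGHTS
)
lossy = cycle_consistency_loss(
    visible, infrared, brighten, identity, toy_features, TOY_WEIGHTS
)
print(perfect, lossy)
```

A cycle that reproduces its input incurs no loss, while a cycle that alters the image is penalized according to how far its features move, not how far individual pixels move. In practice the distance would come from a pretrained LPIPS model instead of the toy extractor used here.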
Results and Discussions
The experimental results show that the improved CycleGAN model outperforms the original. Integrating the Agent Attention Mechanism and the LPIPS loss function markedly improves the preservation of image details and reduces color distortion. The optimized discriminator, strengthened by the ContraNorm module, is better able to judge the authenticity of generated images. Comparison with the original CycleGAN model shows that our model produces thermal infrared images with more accurate color representation and better detail preservation. Quantitative metrics, namely the Structural Similarity Index (SSIM), Peak Signal-to-Noise Ratio (PSNR), and Visual Information Fidelity (VIF), support this result and indicate higher quality in the translated images.
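For readers unfamiliar with two of the metrics named above, the following is a minimal sketch of PSNR and a simplified SSIM, not the evaluation code used in the study. It assumes 8-bit images flattened into lists of pixel values. The SSIM here is computed once over the whole image with the commonly used constants K1 = 0.01 and K2 = 0.03, whereas standard SSIM averages the same expression over local windows. VIF is not covered.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak Signal-to-Noise Ratio in dB for two equally sized images."""
    mse = sum((r - t) ** 2 for r, t in zip(reference, test)) / len(reference)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def global_ssim(reference, test, max_value=255.0):
    """SSIM evaluated over the whole image as a single window (simplified)."""
    n = len(reference)
    mu_r = sum(reference) / n
    mu_t = sum(test) / n
    var_r = sum((r - mu_r) * (r - mu_r) for r in reference) / n
    var_t = sum((t - mu_t) * (t - mu_t) for t in test) / n
    cov = sum((r - mu_r) * (t - mu_t) for r, t in zip(reference, test)) / n
    c1 = (0.01 * max_value) ** 2  # stabilizes the luminance term
    c2 = (0.03 * max_value) ** 2  # stabilizes the contrast/structure term
    numerator = (2 * mu_r * mu_t + c1) * (2 * cov + c2)
    denominator = (mu_r * mu_r + mu_t * mu_t + c1) * (var_r + var_t + c2)
    return numerator / denominator


real_ir = [50, 100, 150, 200]   # toy reference infrared image
fake_ir = [60, 90, 160, 190]    # toy generated infrared image

print(psnr(real_ir, fake_ir), global_ssim(real_ir, fake_ir))
```

Higher values of both metrics indicate that the generated image is closer to the reference: PSNR grows without bound as the pixel error vanishes, and SSIM approaches 1.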
Conclusions
The improved CycleGAN model advances image translation from the visible to the infrared spectrum. The integration of the Agent Attention Mechanism and the LPIPS loss function, coupled with the enhanced discriminator, yields higher-quality infrared images with fewer artifacts and better detail preservation. The model's performance, as shown by both qualitative comparison and quantitative metrics, indicates its potential for practical applications where accurate infrared imaging is essential. Its performance in both experimental and real-world scenarios further highlights its potential for image processing and analysis.