Infrared Technology
Co-Editors-in-Chief
Junhong Su
2025
Volume: 47 Issue 7
15 Article(s)

Aug. 12, 2025
  • Vol. 47 Issue 7 1 (2025)
  • Zhichao SHENG, Haorun ZHANG, and He WANG

    Image fusion aims to improve the recognizability and detail richness of images by combining two or more images into a new image with a specific algorithm. This paper proposes an image fusion algorithm based on edge-guided filtering enhancement and the graph wavelet transform (GWT) to address the lack of detail and unclear edge textures of traditional image fusion methods. First, edge-guided filtering is used to pre-process and enhance the low-light images. Then, the GWT performs multi-scale decomposition of the infrared and low-light images, yielding their respective low- and high-frequency sub-band images. The low-frequency sub-bands are further decomposed by rolling guidance filtering (RGF) into a base layer and a detail layer; the base layer is fused using a visual saliency map (VSM), while the detail layer is fused under the maximum-absolute (MA) rule. The high-frequency sub-bands are fused by maximum regional energy. Finally, the inverse GWT is applied to the fused low- and high-frequency sub-bands to obtain the final fusion result. Experiments on public datasets demonstrate that the proposed method outperforms the compared algorithms in subjective visual quality, texture-information preservation, and edge detail.
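    The two fusion rules named in the abstract can be stated compactly. Below is a minimal NumPy sketch of the maximum-absolute (MA) rule for detail layers and the maximum-regional-energy rule for high-frequency sub-bands; the 3×3 window and the uniform filter used for regional energy are illustrative assumptions, not the paper's exact configuration.

```python
import numpy as np
from scipy.ndimage import uniform_filter

def fuse_max_absolute(a: np.ndarray, b: np.ndarray) -> np.ndarray:
    """MA rule: per pixel, keep the coefficient with larger magnitude."""
    return np.where(np.abs(a) >= np.abs(b), a, b)

def fuse_max_regional_energy(a: np.ndarray, b: np.ndarray, win: int = 3) -> np.ndarray:
    """Keep coefficients from the sub-band with larger windowed energy
    (local mean of squared coefficients); window size is an assumption."""
    ea = uniform_filter(a * a, size=win)
    eb = uniform_filter(b * b, size=win)
    return np.where(ea >= eb, a, b)

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    ir, vis = rng.normal(size=(8, 8)), rng.normal(size=(8, 8))
    print(fuse_max_absolute(ir, vis).shape, fuse_max_regional_energy(ir, vis).shape)
```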

    Aug. 12, 2025
  • Vol. 47 Issue 7 793 (2025)
  • Hongpeng XU, Gang LIU, Qifeng SI, and Huixiang CHEN

    A candidate-region twin-network (Siamese) tracking algorithm with a parallel hybrid attention mechanism is proposed to solve the problem of infrared target tracking under complex background interference. The parallel hybrid attention mechanism calculates spatial- and channel-attention feature maps in parallel. The two attention maps are then expanded to the same dimensions as the input feature map and multiplied element-wise to aggregate the attention weights, yielding a hybrid attention feature map. The hybrid attention map and the input feature map are multiplied element-wise, dynamically reweighting the corresponding elements of the original features. The parallel hybrid attention mechanism is integrated into SiamRPN and used to track infrared aircraft targets against ground/air backgrounds. Experimental results show that, compared with SiamRPN, SiamBAN, and Mixformer, the proposed algorithm improves the success rate by 15.5%, 1.8%, and 8.5% and the accuracy by 20.1%, 9.3%, and 7.7%, respectively, while its tracking speed reaches 201 frames/s. The proposed method effectively realizes infrared target tracking under complex background interference and exhibits favorable real-time performance.
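    The combination step described above (parallel branches, dimension expansion, element-wise products) is sketched below in PyTorch. The internal designs of the two branches are assumptions here, borrowing a squeeze-and-excitation-style channel branch and a CBAM-style spatial branch; only the aggregation scheme follows the abstract.

```python
import torch
import torch.nn as nn

class ParallelHybridAttention(nn.Module):
    """Channel and spatial attention computed in parallel; both maps are
    expanded to the input's shape, multiplied element-wise into one hybrid
    map, and then applied to the input by element-wise multiplication."""

    def __init__(self, channels: int, reduction: int = 8):
        super().__init__()
        # Channel branch (squeeze-and-excitation style; an assumed design)
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, kernel_size=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, kernel_size=1),
            nn.Sigmoid(),
        )
        # Spatial branch (CBAM-style pooled-channel conv; an assumed design)
        self.spatial = nn.Sequential(
            nn.Conv2d(2, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        ca = self.channel(x)                               # (B, C, 1, 1)
        pooled = torch.cat([x.mean(dim=1, keepdim=True),
                            x.max(dim=1, keepdim=True).values], dim=1)
        sa = self.spatial(pooled)                          # (B, 1, H, W)
        hybrid = ca.expand_as(x) * sa.expand_as(x)         # aggregate weights
        return x * hybrid                                  # reweight the input

feat = torch.randn(1, 64, 32, 32)
print(ParallelHybridAttention(64)(feat).shape)             # torch.Size([1, 64, 32, 32])
```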

    Aug. 12, 2025
  • Vol. 47 Issue 7 802 (2025)
  • Deyin ZHANG, Yuyao ZHANG, Juntong LI, and Zhanghui WU

    To address the issues of uneven infrared feature distribution, indistinct contours, and loss of crucial background information in fused images, caused by insufficient examination of the interaction between CNN- and Transformer-extracted features, this paper proposes a novel infrared and visible image fusion network incorporating CNN–Transformer feature interaction. First, the fusion network designs a novel spatial-channel hybrid attention mechanism to enhance the extraction efficiency of both global and local features, yielding hybrid feature blocks. Second, feature interaction between the CNN and Transformer is leveraged to obtain fused hybrid feature blocks, and a multiscale reconstruction network is constructed to reconstruct the image features for output. Finally, comparative image fusion experiments are conducted on the TNO dataset between the proposed network and nine other fusion networks. The experimental results show that the fused images obtained by the new network exhibit excellent visual perception: infrared features and object contours are effectively highlighted while rich background texture details are preserved. The network achieves average improvements of approximately 64.73%, 8.17%, 69.05%, 66.34%, 15.39%, and 25.66% over existing fusion networks on the EN, SD, AG, SF, SCD, and VIF metrics, respectively. Ablation experiments further validate the effectiveness of the new model.
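    Two of the evaluation metrics cited above have simple, widely used definitions. The sketch below gives common formulations of EN (entropy) and AG (average gradient); the paper's exact implementations may differ in details such as histogram binning.

```python
import numpy as np

def entropy(img: np.ndarray) -> float:
    """EN: Shannon entropy of the 8-bit gray-level histogram."""
    hist, _ = np.histogram(img, bins=256, range=(0, 255))
    p = hist / max(hist.sum(), 1)
    p = p[p > 0]
    return float(-(p * np.log2(p)).sum())

def average_gradient(img: np.ndarray) -> float:
    """AG: mean gradient magnitude, a common sharpness/detail measure."""
    gy, gx = np.gradient(img.astype(np.float64))
    return float(np.mean(np.sqrt((gx ** 2 + gy ** 2) / 2)))

fused = np.random.default_rng(0).integers(0, 256, size=(64, 64))
print(entropy(fused), average_gradient(fused))
```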

    Aug. 12, 2025
  • Vol. 47 Issue 7 813 (2025)
  • Xuchen GUO, Yugang FAN, and Mingkai JIANG

    To enhance the accuracy of hyperspectral object recognition, a hyperspectral image object recognition model based on a self-ensemble network is proposed. By introducing a regularization term to optimize the self-ensemble network, the model improves generalization performance, and a self-ensemble learning mechanism is built to address underfitting under limited labeled samples, reducing the dependence of hyperspectral recognition training on large numbers of labeled samples. The model consists of a student network and a teacher network; a dense connection module with gradient operators is added to enhance the network's perception of edges and fine-grained features and improve feature extraction from hyperspectral images. Under the joint constraints of supervised and unsupervised losses, the student and teacher networks learn from each other, establishing the model's self-ensemble mechanism and ensuring its classification accuracy. To further enhance generalization, an L2 regularization term is introduced during optimization to constrain the training of the objective function, thereby mitigating overfitting. The proposed method is applied to three hyperspectral datasets, Pavia University, Salinas, and WHU-Hi-LongKou, achieving average classification accuracies of 96.91%, 96.73%, and 98.12%, respectively. Comparison with multiple classification algorithms verifies that the proposed method attains better classification accuracy under limited labeled samples.
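    A common way to realize the student–teacher self-ensemble described above is a mean-teacher scheme: the teacher's weights track an exponential moving average of the student's, and the loss combines a supervised term with an unsupervised consistency term. The sketch below assumes this instantiation (the paper does not spell out its update rule); the L2 term is applied as optimizer weight decay.

```python
import torch
import torch.nn.functional as F

def ema_update(teacher, student, alpha: float = 0.99) -> None:
    """Teacher weights track an exponential moving average of the student's."""
    with torch.no_grad():
        for tp, sp in zip(teacher.parameters(), student.parameters()):
            tp.mul_(alpha).add_(sp, alpha=1.0 - alpha)

def semi_supervised_loss(student, teacher, x_lab, y_lab, x_unlab, lam=1.0):
    sup = F.cross_entropy(student(x_lab), y_lab)      # supervised term
    with torch.no_grad():
        target = teacher(x_unlab)                     # consistency target
    cons = F.mse_loss(student(x_unlab), target)       # unsupervised term
    return sup + lam * cons

# The L2 regularization term is applied through the optimizer, e.g.:
# optim = torch.optim.SGD(student.parameters(), lr=1e-2, weight_decay=1e-4)
```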

    Aug. 12, 2025
  • Vol. 47 Issue 7 823 (2025)
  • Zitian DING, Wenfei XI, Tanghui QIAN, Junqi GUO, Tingting JIN, Wenyu HONG, and Fuyu GUI

    UAV (unmanned aerial vehicle) image recognition in foggy conditions is crucial in environmental monitoring, disaster rescue, and other fields. However, owing to light attenuation and fog obscuring ground objects, conventional single-feature recognition methods for foggy UAV images are ineffective. Hence, this study proposes a multi-feature fusion method for foggy UAV image recognition. Dark-channel, texture, and color features are extracted from the UAV images, combined into feature vectors, and subjected to dimensionality reduction; the reduced vectors are then trained and classified with a support vector machine to achieve accurate recognition of foggy UAV images. Experiments demonstrate that the method achieves an accuracy of 97.68% and a false-alarm rate of 5.05% on a foggy UAV image dataset, outperforming four other compared methods. The method provides a new, reliable solution for UAV image recognition and defogging in foggy environments and offers high practicality and popularization value.
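    The pipeline above (features → dimensionality reduction → SVM) is easy to sketch. The dark-channel computation below is the standard definition; the texture and color statistics are crude stand-ins, as the paper's exact descriptors are not given, and the PCA dimension and kernel choice are assumptions.

```python
import numpy as np
from scipy.ndimage import minimum_filter
from sklearn.decomposition import PCA
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def dark_channel(img: np.ndarray, patch: int = 15) -> np.ndarray:
    """Dark channel: local minimum over color channels and a patch window.
    img is an HxWx3 float array scaled to [0, 1]."""
    return minimum_filter(img.min(axis=2), size=patch)

def feature_vector(img: np.ndarray) -> np.ndarray:
    dc = dark_channel(img)
    color = img.reshape(-1, 3).mean(axis=0)        # crude color statistic
    texture = np.std(img.mean(axis=2))             # crude texture proxy
    return np.concatenate([[dc.mean(), dc.std()], color, [texture]])

# PCA for dimensionality reduction, SVM for classification, as in the abstract.
clf = make_pipeline(StandardScaler(), PCA(n_components=4), SVC(kernel="rbf"))
# Usage: X = np.stack([feature_vector(im) for im in images]); clf.fit(X, labels)
```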

    Aug. 12, 2025
  • Vol. 47 Issue 7 833 (2025)
  • Qingdian ZHAO, and Dehong YANG

    This study proposes a fusion algorithm for infrared and visible-light images based on image enhancement and multiscale decomposition to address the low contrast and unclear edge contours produced by conventional multiscale transformation methods when fusing infrared and visible-light images. First, an image-enhancement algorithm based on guided filtering is applied to the visible-light images to improve overall contrast and visibility. Second, an improved rolling-guidance filter decomposes the enhanced visible and infrared images into base layers and detail layers of different scales. Subsequently, saliency analysis is performed on the base and detail layers, a saliency map is constructed, and the weight map is calculated. Finally, weighted-average fusion of the base and detail layers is performed using the weight maps, and the fused base and detail layers are summed to obtain the final fusion result. Subjective image-quality analysis and seven objective evaluation indices are compared against eight multiscale fusion methods. Experimental results show that the proposed method not only preserves the edge contours of the source images and improves the overall contrast and clarity of the fusion result but also reduces artifacts.
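    The base/detail decomposition and saliency-weighted fusion described above can be sketched as follows. A Gaussian filter stands in for the paper's improved rolling-guidance filter, and the local-contrast saliency measure is an illustrative assumption.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def decompose(img: np.ndarray, sigma: float = 2.0):
    """Split into base and detail layers (Gaussian blur stands in for the
    improved rolling-guidance filter used in the paper)."""
    base = gaussian_filter(img, sigma)
    return base, img - base

def saliency_weights(a: np.ndarray, b: np.ndarray, sigma: float = 5.0):
    """Normalized weight maps from local contrast against a blurred mean."""
    sa = np.abs(a - gaussian_filter(a, sigma))
    sb = np.abs(b - gaussian_filter(b, sigma))
    wa = sa / (sa + sb + 1e-12)
    return wa, 1.0 - wa

def fuse(ir: np.ndarray, vis: np.ndarray) -> np.ndarray:
    b1, d1 = decompose(ir)
    b2, d2 = decompose(vis)
    wb, vb = saliency_weights(b1, b2)
    wd, vd = saliency_weights(d1, d2)
    # Weighted-average fusion of each layer, then sum the fused layers.
    return (wb * b1 + vb * b2) + (wd * d1 + vd * d2)
```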

    Aug. 12, 2025
  • Vol. 47 Issue 7 842 (2025)
  • Jian WANG, Baoliang WANG, Huandong LI, and Yi LU

    To address the large style gap and limited fidelity of simulation images produced by conventional image-generation platforms, a cycle-consistent generative adversarial network is proposed that enables the simulated images used in semi-physical simulation tests of image-guided weapons to closely resemble the actual battlefield environment. The proposed network comprises generators and discriminators, and the method embeds this unsupervised deep-learning model into existing image-generation platforms. To ensure real-time performance, shared memory with locking is used to prevent image-transmission timeouts in the semi-physical setup. Test results show that the method ensures real-time performance and improves the confidence of semi-physical simulations.
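    The core of any cycle GAN is the cycle-consistency term: translating to the other domain and back should reconstruct the input, which lets the model train without paired simulated/real images. Below is the standard formulation; the generator names and the weight λ=10 are illustrative, not the paper's values.

```python
import torch
import torch.nn as nn

l1 = nn.L1Loss()

def cycle_consistency_loss(G_sim2real, G_real2sim, sim, real, lam: float = 10.0):
    """Standard CycleGAN cycle term. G_sim2real / G_real2sim are the two
    generator networks (hypothetical module names for this sketch)."""
    rec_sim = G_real2sim(G_sim2real(sim))     # sim -> real -> sim
    rec_real = G_sim2real(G_real2sim(real))   # real -> sim -> real
    return lam * (l1(rec_sim, sim) + l1(rec_real, real))
```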

    Aug. 12, 2025
  • Vol. 47 Issue 7 852 (2025)
  • Miao ZHANG, Xiaojun WANG, Jingfa LEI, Ruhai ZHAO, and Yongling LI

    In multifrequency-heterodyne structured-light three-dimensional measurement, phase errors significantly affect measurement accuracy. Hence, this paper proposes a phase-error correction method based on the multifrequency heterodyne principle. First, a nonlinear correction is performed on the projection device by fitting a polynomial relationship between the output and input light intensities; this polynomial is then used to correct the input phase-shifted fringe patterns. A Gaussian adaptive bilateral filtering algorithm is proposed to denoise the phase-shifted fringe patterns captured by the camera, and the processed images are used to generate the truncated (wrapped) phase via the phase-shifting method. Absolute phase recovery is performed by optimizing the forward construction of a continuous auxiliary phase and the inverse derivation of the truncated phase-series constraint. Experimental results show that, after correction, the average absolute error and root-mean-square error of the nonlinearity were reduced by 74.58% and 77.65%, respectively, effectively reducing the nonlinear error of the absolute phase. The method also suppresses phase-jump errors, yielding a smoother point-cloud surface. When measuring the step height difference of a standard block, the absolute and relative errors were reduced to 0.034 mm and 0.38%, respectively, enhancing the accuracy and robustness of the structured-light three-dimensional measurement system.
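    The polynomial nonlinear correction can be sketched as fitting the projector's input–output intensity curve and then pre-distorting the fringe pattern through the inverted fit. The gamma-2.2 placeholder response, polynomial degree 5, and interpolation-based inversion below are all assumptions for illustration.

```python
import numpy as np

# Fit the projector's output intensity as a polynomial of the commanded
# input, then pre-distort inputs through the inverted curve via a lookup.
inp = np.linspace(0.0, 1.0, 256)
measured = inp ** 2.2                        # placeholder gamma-like response
coeff = np.polyfit(inp, measured, deg=5)     # forward model: out = P(in)
curve = np.polyval(coeff, inp)               # assumed monotonic response

def predistort(x: np.ndarray) -> np.ndarray:
    """Invert P by interpolation so that P(predistort(x)) ~= x."""
    return np.interp(x, curve, inp)

fringe = 0.5 + 0.5 * np.cos(np.linspace(0, 8 * np.pi, 512))
corrected = predistort(fringe)               # corrected phase-shifted fringe
```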

    Aug. 12, 2025
  • Vol. 47 Issue 7 859 (2025)
  • Lingling PANG, Zhixiong YANG, Chunchao YU, Ruilin DING, Boyang WANG, Jianan BAO, Lixian WANG, and Jie XUE

    Infrared radiometric calibration establishes the relationship between the digital quantization values of imaging spectra and the spectral radiance of a target, and calibrating the measured spectra is crucial for quantitative analysis. In practical applications, because of limits on instrument stability and the influence of target radiation characteristics on the instrument, the quality of calibration results is difficult to evaluate effectively. This study proposes a method based on noise-equivalent spectral radiance (NESR) to quantitatively analyze the accuracy of calibration results. A two-point linear radiometric calibration is used to calibrate the instrument, and the measured NESR is obtained using a signal-to-noise-ratio-based NESR calculation. By comparing the measured NESR with the nominal NESR specification, the accuracy of the radiometric calibration results can be determined rapidly and quantitatively. Verification with data from a self-developed imaging telemetry system demonstrates that the proposed method can rapidly and effectively evaluate the accuracy of radiometric calibration results, thereby improving the quality of the calibrated brightness-temperature spectra and laying a foundation for subsequent infrared spectral gas identification.
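    Two-point linear calibration and a measured-NESR estimate are both short computations. The sketch below uses a common formulation, with NESR taken as the temporal standard deviation of repeated measurements converted to radiance units; the paper's exact SNR-based calculation may differ.

```python
import numpy as np

def two_point_calibration(dn_cold, dn_hot, L_cold, L_hot):
    """Per-channel gain/offset from cold and hot blackbody measurements:
    DN = gain * L + offset, so L = (DN - offset) / gain."""
    gain = (dn_hot - dn_cold) / (L_hot - L_cold)
    offset = dn_cold - gain * L_cold
    return gain, offset

def measured_nesr(dn_frames: np.ndarray, gain: np.ndarray) -> np.ndarray:
    """NESR as the temporal standard deviation of repeated measurements
    of a stable source, converted to radiance units by the gain.
    dn_frames has shape (n_frames, n_channels)."""
    return dn_frames.std(axis=0) / np.abs(gain)

# A calibration is judged acceptable when measured_nesr(...) stays close
# to the instrument's nominal NESR specification.
```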

    Aug. 12, 2025
  • Vol. 47 Issue 7 869 (2025)
  • Qinghua LU, Qi ZHANG, Chunpeng ZHANG, Dongming PI, Lei WANG, Hongqing WEN, and Liujing XIANG

    This study investigates a miniaturized multifunctional optical system based on a multiband image fusion system. The miniaturized system solves the problems of large volume, heavy weight, and delayed white-light observation in the original system and comprises four modules. The energy convergence of the laser module is greater than ϕ1 mm. The modulation transfer function of the low-light-level module at 53 lp/mm is favorable. The magnification of the white-light module is 5.7×. Additionally, the central and edge fields-of-view are greater than 5.7′ and 17.1′, respectively. The modulation transfer function of the image-observation module in the central field-of-view at 62 lp/mm is greater than 0.3. The miniaturized system reduces the volume of the multiband image fusion system from the original 261 mm×237 mm×115 mm to 220 mm×173 mm×102 mm, and the weight is reduced by 496.6 g. Therefore, this miniaturized multifunctional optical system offers reference value for similar designs.

    Aug. 12, 2025
  • Vol. 47 Issue 7 877 (2025)
  • Shanfeng LIU, Wandeng MAO, Miaomiao LI, Qiankai ZHOU, Wenjie ZOU, and Hua BAO

    A novel cross-modal multilevel feature fusion algorithm based on adaptive fusion and self-attention enhancement is proposed to address the low robustness of power-equipment detection algorithms and inaccurate small-target detection in complex environments. The algorithm first constructs a dual-stream feature-extraction network to extract multilevel target representations from visible-light and infrared images. An adaptive fusion module is introduced to capture complementary features from the visible-light and infrared branches. Furthermore, a Transformer-based self-attention mechanism is employed to enhance the semantic and spatial information of the complementary features. Finally, precise target localization is achieved by utilizing deep features at different scales. Experimental evaluations on a custom-developed power-equipment dataset show that the proposed algorithm achieves a mean average precision (mAP) of 91.7%, improvements of 3.5% and 3.9% over using only the visible-light or infrared branch, respectively, thus effectively achieving cross-modal information fusion. Compared with current mainstream object-detection algorithms, it also exhibits superior robustness.
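    One plausible realization of the adaptive fusion module described above is a learned per-pixel gate that forms a convex combination of the two branches. The sketch below assumes this sigmoid-gated design; the paper's actual module may be structured differently.

```python
import torch
import torch.nn as nn

class AdaptiveFusion(nn.Module):
    """Fuses visible and infrared feature maps with a learned per-pixel
    gate; a sigmoid-gated convex combination is an assumed realization."""

    def __init__(self, channels: int):
        super().__init__()
        self.gate = nn.Sequential(
            nn.Conv2d(2 * channels, channels, kernel_size=3, padding=1),
            nn.Sigmoid(),
        )

    def forward(self, f_vis: torch.Tensor, f_ir: torch.Tensor) -> torch.Tensor:
        w = self.gate(torch.cat([f_vis, f_ir], dim=1))   # complementarity weights
        return w * f_vis + (1.0 - w) * f_ir

f_vis, f_ir = torch.randn(1, 32, 40, 40), torch.randn(1, 32, 40, 40)
print(AdaptiveFusion(32)(f_vis, f_ir).shape)             # torch.Size([1, 32, 40, 40])
```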

    Aug. 12, 2025
  • Vol. 47 Issue 7 884 (2025)
  • Sihao ZHAO, Feng WANG, Juanjuan YANG, Yang PANG, and Jianwu DANG

    To address typical challenges in visible and infrared image fusion under weak illumination, such as difficulty in recognizing small cracks, loss of texture details, and the introduction of edge artifacts, this study proposes a multiscale feature extraction–multiscale attention generative adversarial network (M2GAN) for image fusion. First, M2GAN introduces a multiscale feature-extraction module that uses aligned visible and infrared images to extract information at different scales from both image types; side connections preserve crack details and semantic information during fusion, making crack features more prominent. Additionally, a multiplexed attention mechanism is proposed that stitches the multiscale fused image with the infrared and visible source images to construct an infrared-intensity path and a visible-gradient path, respectively, preserving more target and background information. On a custom-developed dataset, six evaluation indices show significant improvements over many mainstream image-fusion algorithms: structural similarity and edge retention improve by an average of 10.66% and 24.92%, respectively. M2GAN delivers better visual effects and structural similarity, outperforming the comparative methods in objective evaluations.
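    The multiscale extraction with side connections can be sketched as a small encoder whose intermediate feature maps are all returned, so each scale remains available to the fusion stage. The channel widths and depths below are illustrative assumptions, not M2GAN's architecture.

```python
import torch
import torch.nn as nn

class MultiScaleExtractor(nn.Module):
    """Encodes an image at three scales; each scale is returned as a side
    output so fine crack detail survives downsampling and can be passed
    to the fusion stage through side connections."""

    def __init__(self, in_ch: int = 1, ch: int = 16):
        super().__init__()
        self.enc1 = nn.Sequential(nn.Conv2d(in_ch, ch, 3, padding=1), nn.ReLU())
        self.enc2 = nn.Sequential(nn.Conv2d(ch, ch * 2, 3, stride=2, padding=1), nn.ReLU())
        self.enc3 = nn.Sequential(nn.Conv2d(ch * 2, ch * 4, 3, stride=2, padding=1), nn.ReLU())

    def forward(self, x: torch.Tensor):
        f1 = self.enc1(x)          # full resolution
        f2 = self.enc2(f1)         # 1/2 resolution
        f3 = self.enc3(f2)         # 1/4 resolution
        return f1, f2, f3
```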

    Aug. 12, 2025
  • Vol. 47 Issue 7 895 (2025)
  • Yang YANG, Yingyue ZHOU, Runxia HUANG, Qi LIU, Hongsen HE, and Xiaoxia LI

    In recent years, biometric recognition has developed rapidly from single-mode identification to multimodal feature fusion. Identity recognition based on palm prints and palm veins has been investigated extensively, yet challenges persist in achieving real-time noncontact palm recognition. In this study, we use a binocular camera to simultaneously capture visible and near-infrared palm images, locate the region of interest using palm key-point detection, and design a log-Gabor-convolution palm-print and palm-vein network. The network adopts a dual-branch parallel feature-extraction structure with a parameter-adaptive log-Gabor convolution and a multi-receptive-field feature-fusion module, significantly improving the extraction of texture features from the dual-mode images. The method was tested on two publicly available palm-print and palm-vein datasets, CASIA-PV and TJU-PV, as well as on a custom-developed dataset, SWUST-PV. Experimental results show that the proposed method achieves a recognition accuracy exceeding 99.9%, with an error rate of 0.0012% or less. Compared with the base model, the proposed model is lightweight, decreasing the parameter count and floating-point computational complexity by 76% and 81%, respectively.
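    A classical (non-learned) log-Gabor filter illustrates the texture-sensitive kernel underlying the network's log-Gabor convolution. The radial transfer function below is the standard form; the cutoff f0 and bandwidth ratio are illustrative values, not the network's learned, parameter-adaptive settings.

```python
import numpy as np

def log_gabor(shape, f0: float = 0.1, sigma_ratio: float = 0.55) -> np.ndarray:
    """Radial log-Gabor transfer function:
    G(f) = exp(-ln(f/f0)^2 / (2 ln(sigma_ratio)^2))."""
    rows, cols = shape
    fy = np.fft.fftfreq(rows)[:, None]
    fx = np.fft.fftfreq(cols)[None, :]
    radius = np.sqrt(fx ** 2 + fy ** 2)
    radius[0, 0] = 1.0                       # avoid log(0) at the DC term
    g = np.exp(-(np.log(radius / f0) ** 2) / (2 * np.log(sigma_ratio) ** 2))
    g[0, 0] = 0.0                            # log-Gabor has no DC component
    return g

def filter_image(img: np.ndarray, g: np.ndarray) -> np.ndarray:
    """Apply the frequency-domain filter; keep the real (even) response."""
    return np.real(np.fft.ifft2(np.fft.fft2(img) * g))
```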

    Aug. 12, 2025
  • Vol. 47 Issue 7 906 (2025)
  • Haofan GUO, Ting JIAO, Fangliang SUN, Chuge CHEN, Renshi LI, Ruifeng KAN, Zhenyu XU, and Hao DENG

    To address the limited intuitiveness and real-time performance, as well as the high false-alarm rate, of current automated infrared imaging gas-leakage detection methods, a real-time leakage-detection model named Gas-Seg, based on an improved YOLOv5-Seg, is proposed. Gas-Seg adopts a leakage gas-cloud segmentation approach, achieving low false-positive identification and an intuitive display of the leakage areas. To enhance the model's ability to learn the key features of leaking gas, a convolutional block attention module merges spatial and channel features, and atrous spatial pyramid pooling extracts multi-scale gas-cloud features, improving the accuracy of gas-cloud identification. Additionally, the C3Ghost module reduces the model's parameter count, raising its inference speed. Finally, an auxiliary validation method eliminates false alarms from stationary areas, effectively reducing false alarms in single-frame detections. Ultimately, the Gas-Seg model achieved 93.5% and 66.5% on the mAP@0.5 and mAP@0.5:0.9 metrics, improvements of 3.7% and 2% over YOLOv5-Seg, respectively. In ethylene-gas detection experiments at a distance of 10 m with leakage rates of 0.75 and 1.5 L/min, the warning accuracies reached 84.4% and 99.7%, respectively. Furthermore, the inference speed reached 51 frames per second, demonstrating potential for real-time detection.
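    The abstract does not detail the auxiliary validation step; one plausible reading is temporal frame differencing, since a drifting gas plume moves while hot static equipment does not. The sketch below implements that assumption, with the threshold and overlap ratio as illustrative parameters.

```python
import numpy as np

def motion_mask(frames: np.ndarray, thresh: float = 10.0) -> np.ndarray:
    """Mean absolute frame difference over a short clip; frames has
    shape (T, H, W). Pixels that change over time are marked moving."""
    diff = np.abs(np.diff(frames.astype(np.float32), axis=0)).mean(axis=0)
    return diff > thresh

def validate_alarm(seg_mask: np.ndarray, frames: np.ndarray,
                   min_overlap: float = 0.3) -> bool:
    """Raise an alarm only if enough of the segmented region is moving,
    rejecting single-frame detections on stationary areas."""
    moving = motion_mask(frames)
    ratio = np.logical_and(seg_mask, moving).sum() / max(int(seg_mask.sum()), 1)
    return ratio >= min_overlap
```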

    Aug. 12, 2025
  • Vol. 47 Issue 7 918 (2025)