A cooled long-wavelength infrared imaging optical system is proposed and designed for a 320×256 cooled long-wavelength area-array detector. The optical system consists of five lenses, and a combination of different lens materials together with a back-focus adjustment mechanism is used to achieve clear imaging over the operating temperature range of -40 ℃ to 70 ℃. The working spectral band of the system is 7.5-9.5 μm, the focal length is 50 mm, the relative aperture is 1/2, and the full field of view is 11°×8.8°. The system has the advantages of a simple structure, a large relative aperture, and high transmittance. Design results show that the modulation transfer function (MTF) of the optical system is better than 0.594 at the Nyquist frequency of 16.7 lp/mm, the root-mean-square (RMS) spot size is smaller than a single pixel, the energy concentration within one pixel is better than 88.5%, and the distortion is less than 0.23%. After tolerance allocation, the MTF of the system remains better than 0.504, indicating that the system is easy to manufacture, highly realizable, and delivers good imaging performance after assembly.
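As a quick consistency check on the stated first-order parameters, the sketch below (Python) computes the entrance pupil diameter implied by the 50 mm focal length and 1/2 relative aperture, and the detector Nyquist frequency implied by an assumed 30 μm pixel pitch; the pitch is an inference from the quoted 16.7 lp/mm, not a value given in the abstract.

```python
# Back-of-envelope check of the stated first-order optical parameters.
focal_length_mm = 50.0          # stated focal length
relative_aperture = 1.0 / 2.0   # stated relative aperture (D/f)
pixel_pitch_mm = 0.030          # assumed 30 um pixel pitch (inferred, not stated)

entrance_pupil_mm = focal_length_mm * relative_aperture   # D = f * (D/f) = 25 mm
nyquist_lp_per_mm = 1.0 / (2.0 * pixel_pitch_mm)          # ~16.7 lp/mm

print(f"Entrance pupil diameter: {entrance_pupil_mm:.1f} mm")
print(f"Detector Nyquist frequency: {nyquist_lp_per_mm:.1f} lp/mm")
```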
Atmospheric radiative transfer in the thermal infrared spectrum is influenced by various factors, and atmospheric transmittance is a critical parameter. Modeling atmospheric transmittance in the 8-14 μm thermal infrared band remains challenging because many of the required input background parameters are difficult to obtain. Consequently, a simplified parameterization scheme for thermal infrared radiative transfer in satellite remote sensing is proposed. Using the moderate resolution atmospheric transmission (MODTRAN) model, the parameters influencing atmospheric transmittance were quantitatively simulated and analyzed to identify the key ones, leading to a MODTRAN-based simplified scheme. Atmospheric transmittance values calculated by the MODTRAN model and by the simplified scheme were compared and validated at the central wavelengths of the two thermal infrared channels of the medium resolution spectral imager-LL (MERSI-LL). The R2 value exceeded 0.99, and the root-mean-square error (RMSE) was below 0.007458, indicating high accuracy. The simplified scheme relies solely on the view zenith angle and the water vapor column, eliminating the need for CO2, O3, and aerosol optical depth inputs. Compared with the MODTRAN model, the simplified scheme reduces the number of input parameters and improves computational efficiency. This study offers theoretical support for the atmospheric correction of thermal infrared data.
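The abstract does not give the functional form of the simplified scheme; the sketch below only illustrates how a transmittance parameterization driven by the view zenith angle and water vapor column alone might be fitted to MODTRAN output, assuming a Beer-Lambert-style form and placeholder sample values.

```python
import numpy as np
from scipy.optimize import curve_fit

def transmittance(x, a, b, c):
    """Illustrative form: tau = exp(-(a + b*w + c*w**2) / cos(theta)) -- an assumption."""
    theta_deg, w = x
    m = 1.0 / np.cos(np.radians(theta_deg))   # plane-parallel air-mass factor
    return np.exp(-(a + b * w + c * w**2) * m)

# Hypothetical training samples that would come from MODTRAN runs (placeholder values)
theta = np.array([0., 15., 30., 45., 60., 0., 30., 60.])          # view zenith angle, deg
wvc   = np.array([1., 1., 1., 1., 1., 3., 3., 3.])                # water vapor column, g/cm^2
tau   = np.array([0.92, 0.91, 0.90, 0.87, 0.82, 0.78, 0.75, 0.63])

params, _ = curve_fit(transmittance, (theta, wvc), tau, p0=(0.05, 0.05, 0.0))
print("Fitted coefficients (a, b, c):", params)
print("Predicted tau at theta=20 deg, w=2 g/cm^2:",
      transmittance((np.array([20.]), np.array([2.])), *params))
```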
Linear avalanche photodiode (APD) focal-plane infrared detectors have a wide range of applications. Coupling APD detectors with multiple-mode readout integrated circuits can achieve multimode detection within a limited pixel area, thereby enhancing the integration of the detection system. In this study, we designed an APD readout integrated circuit with four modes: infrared thermal imaging, gated 3D imaging, laser ranging, and asynchronous laser pulse detection. The input-stage circuits of the four modes were multiplexed. A Krummenacher structure was used to suppress the influence of background radiation, thereby extending the measurable range of photon flight time. An improved time-discrimination circuit was proposed to enhance distance measurement accuracy by reducing the time-discrimination error. The readout integrated circuit was designed in a 0.18 μm, 3.3 V complementary metal-oxide-semiconductor (CMOS) process, with an array size of 128×128, a pixel center distance of 30 μm, and a maximum charge capacity of 3.74 Me⁻. Simulation results show that in the laser ranging mode, with an integration capacitance of 13 fF and a background current of 1-150 nA, the background current response amplitude was ≤1.35 mV, which is much smaller than the 280 mV response amplitude for a 500 nA laser response current. The amplitude sensitivity of the asynchronous laser pulse detection mode was approximately 110 nA, and the pulse-width sensitivity was approximately 4 ns. For laser pulses with a response of 150-500 nA, the time-discrimination error of the improved time-discrimination circuit was approximately 4 ns. The multimode multiplexed APD readout integrated circuit designed in this study has practical engineering application value.
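The sketch below is a rough ideal-integrator estimate of the pixel response, assuming a 7 ns integration window (not stated in the abstract). It illustrates why a 500 nA laser return produces a swing on the order of the reported 280 mV on a 13 fF capacitor, and why the Krummenacher feedback is needed to hold the background response near the reported ≤1.35 mV.

```python
# Ideal-integrator sanity check (illustrative; the gate width and circuit details
# are assumptions, not values given in the abstract).
c_int = 13e-15      # stated integration capacitance, F
t_gate = 7e-9       # assumed integration window, s

dv = lambda i_amp: i_amp * t_gate / c_int    # dV = I*t/C for an ideal integrator

print(f"500 nA laser return -> {dv(500e-9)*1e3:.0f} mV")   # ~269 mV
print(f"150 nA background   -> {dv(150e-9)*1e3:.0f} mV")   # ~81 mV if NOT suppressed
# The Krummenacher feedback sinks the slowly varying background current, which is
# why the reported background response stays at or below ~1.35 mV.
```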
Traditional target detection algorithms based on deep learning usually require extensive computing resources and long training times, which do not meet the needs of industry. Lightweight target detection networks sacrifice part of the detection accuracy in exchange for faster inference and lighter models; they are well suited to edge-computing devices and have received widespread attention. This study introduces the lightweight technologies commonly used to compress and accelerate models, classifies and analyzes the structural principles of lightweight backbone networks, and evaluates their practical impact on YOLOv5s. Finally, the prospects and challenges of lightweight target-detection algorithms are discussed.
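As an example of a widely used lightweight building block (not tied to any specific network discussed in the review), the PyTorch sketch below compares the parameter count of a standard 3×3 convolution with that of a depthwise separable replacement.

```python
import torch
import torch.nn as nn

def param_count(m: nn.Module) -> int:
    return sum(p.numel() for p in m.parameters())

c_in, c_out, k = 128, 128, 3

# Standard 3x3 convolution
standard = nn.Conv2d(c_in, c_out, k, padding=1, bias=False)

# Depthwise separable convolution: depthwise 3x3 + pointwise 1x1
separable = nn.Sequential(
    nn.Conv2d(c_in, c_in, k, padding=1, groups=c_in, bias=False),  # depthwise
    nn.Conv2d(c_in, c_out, 1, bias=False),                         # pointwise
)

x = torch.randn(1, c_in, 32, 32)
assert standard(x).shape == separable(x).shape
print("standard conv params :", param_count(standard))   # 147456
print("separable conv params:", param_count(separable))  # 17536
```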
The structure of infrared imaging systems and the complexity of the imaging environment introduce several types of noise during infrared image acquisition and processing, which can seriously affect image quality. This paper first describes the structure of the infrared imaging system and the sources of image noise, and then discusses traditional and improved algorithms for infrared image noise reduction from the perspectives of spatial-domain, frequency-domain, combined spatial-frequency-domain, and deep-learning methods. We focus on deep-learning noise-reduction algorithms in view of their broad application and excellent noise-reduction performance. Classical noise-reduction algorithms were selected to conduct experiments on real noisy infrared images. The experiments show that the deep-learning algorithms outperform the traditional algorithms.
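For illustration only, the PyTorch sketch below shows a minimal residual-learning denoiser in the spirit of DnCNN; the layer count and channel width are arbitrary choices and this is not one of the networks evaluated in the review.

```python
import torch
import torch.nn as nn

class TinyDnCNN(nn.Module):
    """Minimal residual-learning denoiser in the spirit of DnCNN (illustrative only)."""
    def __init__(self, channels: int = 1, features: int = 64, depth: int = 8):
        super().__init__()
        layers = [nn.Conv2d(channels, features, 3, padding=1), nn.ReLU(inplace=True)]
        for _ in range(depth - 2):
            layers += [nn.Conv2d(features, features, 3, padding=1),
                       nn.BatchNorm2d(features),
                       nn.ReLU(inplace=True)]
        layers += [nn.Conv2d(features, channels, 3, padding=1)]
        self.body = nn.Sequential(*layers)

    def forward(self, noisy: torch.Tensor) -> torch.Tensor:
        # The network predicts the noise; subtracting it yields the clean estimate.
        return noisy - self.body(noisy)

noisy_ir = torch.randn(1, 1, 128, 128)      # stand-in for a noisy infrared frame
denoised = TinyDnCNN()(noisy_ir)
print(denoised.shape)                        # torch.Size([1, 1, 128, 128])
```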
To balance the accuracy and efficiency of multisource object detection networks, a lightweight infrared and visible-light object detection model with a multiscale attention structure and an improved object-box filtering strategy was designed by applying group convolution to multimodal object features. First, multiple feature dimensionality-reduction strategies were adopted to sample the input image and reduce the impact of noise and redundant information. Subsequently, feature grouping was performed according to the modality of the feature channels, and depthwise separable convolution was used to extract infrared, visible, and fused features, enhancing the diversity and efficiency of the extracted multisource feature structures. Then, an improved attention mechanism was used to enhance key multimodal features in various dimensions, combined with a neighborhood multiscale fusion structure to ensure the scale invariance of the network. Finally, an optimized non-maximum suppression algorithm was used to synthesize the prediction results of objects at various scales for accurate detection of each object. Experimental results on the KAIST, FLIR, and RGBT public datasets show that the proposed model effectively improves object detection performance compared with multisource object detection methods of the same type.
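The optimized object-box filtering strategy is not detailed in the abstract; for reference, the NumPy sketch below implements plain greedy non-maximum suppression, the baseline that such filtering strategies refine.

```python
import numpy as np

def nms(boxes: np.ndarray, scores: np.ndarray, iou_thr: float = 0.5) -> list:
    """Plain greedy NMS over [x1, y1, x2, y2] boxes; returns kept indices."""
    order = scores.argsort()[::-1]
    keep = []
    while order.size > 0:
        i = order[0]
        keep.append(int(i))
        xx1 = np.maximum(boxes[i, 0], boxes[order[1:], 0])
        yy1 = np.maximum(boxes[i, 1], boxes[order[1:], 1])
        xx2 = np.minimum(boxes[i, 2], boxes[order[1:], 2])
        yy2 = np.minimum(boxes[i, 3], boxes[order[1:], 3])
        inter = np.clip(xx2 - xx1, 0, None) * np.clip(yy2 - yy1, 0, None)
        area_i = (boxes[i, 2] - boxes[i, 0]) * (boxes[i, 3] - boxes[i, 1])
        area_o = (boxes[order[1:], 2] - boxes[order[1:], 0]) * \
                 (boxes[order[1:], 3] - boxes[order[1:], 1])
        iou = inter / (area_i + area_o - inter + 1e-9)
        order = order[1:][iou <= iou_thr]
    return keep

boxes = np.array([[10, 10, 50, 50], [12, 12, 52, 52], [100, 100, 140, 140]], float)
scores = np.array([0.9, 0.8, 0.7])
print(nms(boxes, scores))   # [0, 2]
```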
Gamut mapping is a technology used to achieve high-fidelity transmission of color images between different devices. However, an image obtained through gamut mapping inevitably exhibits artifacts and distortions because of color information loss, which degrades texture and color naturalness. Since color information loss is severe in gamut-mapped images (GMIs), a no-reference quality evaluation method based on double-order color representation is proposed. Many traditional image quality assessment (IQA) methods extract quality-aware features (QAFs) in the gray domain, and a few IQA methods extract QAFs from color components such as hue and saturation. However, hue and saturation are calculated linearly from the R, G, and B color components, which ignores the derivative information of color. Therefore, this study extracts features from both zero-order (R, G, and B components) and first-order (derivative) color information. These features are then used for regression training to obtain a quality prediction model. Experimental results show that the model is superior to existing no-reference quality evaluation methods in predicting the quality of GMIs.
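A minimal sketch of the double-order idea, assuming mean/standard-deviation statistics as the pooled features and support vector regression as the quality predictor (the abstract specifies neither choice); the training images and scores here are placeholders.

```python
import numpy as np
from sklearn.svm import SVR

def color_features(img: np.ndarray) -> np.ndarray:
    """Mean/std of R, G, B (zero-order) plus mean/std of their gradient magnitudes (first-order)."""
    feats = []
    for c in range(3):
        chan = img[..., c].astype(np.float64)
        gy, gx = np.gradient(chan)
        grad = np.hypot(gx, gy)
        feats += [chan.mean(), chan.std(), grad.mean(), grad.std()]
    return np.array(feats)

# Hypothetical training data: images with subjective quality scores (placeholders)
rng = np.random.default_rng(0)
train_imgs = [rng.integers(0, 256, (64, 64, 3)) for _ in range(20)]
train_scores = rng.uniform(1, 5, 20)

X = np.stack([color_features(im) for im in train_imgs])
model = SVR(kernel="rbf").fit(X, train_scores)
print("Predicted quality:", model.predict(color_features(train_imgs[0])[None, :]))
```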
To address issues in substation infrared images, such as a large number of targets, similar appearances, and missed or false detections when the background and targets have similar colors, a multi-target detection method for electric power equipment based on an improved YOLOv7 is proposed. First, to better retain the shallow information in the infrared image, dilated convolution and average pooling were introduced into the spatial pyramid pooling cross-stage partial convolution (SPPCSPC) module to expand the receptive field while preventing small infrared targets from being submerged in the background. Second, to reduce false and missed detections in multi-target detection, a lightweight simple attention module (SimAM) was introduced into the head network to focus on the regions of interest. Finally, a hybrid bounding-box regression loss function suited to small-target detection, combining the normalized Gaussian Wasserstein distance (NWD) and complete intersection over union (CIOU) losses, was adopted to effectively improve the detection accuracy of targets at different scales in infrared images. We conducted comparison experiments with seven other representative detection methods on a self-constructed infrared image dataset of power equipment. The experimental results show that the improved YOLOv7 network model significantly reduces missed and false detections. Its mean average precision (mAP) reached 88.9%, a significant improvement over the other representative target detection algorithms for infrared multi-target detection of power equipment.
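A sketch of the normalized Gaussian Wasserstein distance term and its mixture with a CIoU loss; the constant c and the mixing weight alpha are assumptions, since the abstract does not give them.

```python
import math

def nwd(box1, box2, c: float = 12.8) -> float:
    """Normalized Gaussian Wasserstein distance between two [cx, cy, w, h] boxes.
    c is a dataset-dependent constant (the value here is an assumption)."""
    (cx1, cy1, w1, h1), (cx2, cy2, w2, h2) = box1, box2
    w2_sq = (cx1 - cx2) ** 2 + (cy1 - cy2) ** 2 \
            + ((w1 - w2) / 2) ** 2 + ((h1 - h2) / 2) ** 2
    return math.exp(-math.sqrt(w2_sq) / c)

def hybrid_loss(box_pred, box_gt, ciou_loss: float, alpha: float = 0.5) -> float:
    """Illustrative mixture of an NWD term with a CIoU loss; alpha is an assumed weight."""
    return alpha * (1.0 - nwd(box_pred, box_gt)) + (1.0 - alpha) * ciou_loss

print(nwd((10, 10, 6, 6), (12, 11, 6, 6)))    # nearby small boxes -> NWD close to 1
print(nwd((10, 10, 6, 6), (40, 40, 6, 6)))    # distant boxes      -> NWD close to 0
```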
Traditional hyperspectral image (HSI) clustering algorithms suffer from issues such as poor accuracy. In addition, it is difficult to accurately measure the similarity between pixels using commonly used distance-measurement criteria. To improve the clustering performance of hyperspectral images, this study proposes a hyperspectral image clustering algorithm based on spectral unmixing and dynamically weighted diffusion mapping. The algorithm is built on the decomposition of mixed pixels and on the diffusion distance calculated using diffusion-map theory. The proposed method uses the high-dimensional geometry and abundance structure observed in hyperspectral images to solve the clustering problem. Experimental results on two real hyperspectral datasets show that the proposed algorithm achieves high classification accuracy and can be successfully applied to hyperspectral image clustering.
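A basic diffusion-map embedding is sketched below with NumPy to illustrate the diffusion-distance idea; the Gaussian kernel bandwidth, the unweighted affinity, and the toy spectral vectors are assumptions, and the paper's dynamic weighting and spectral-unmixing steps are not reproduced here.

```python
import numpy as np

def diffusion_map(X: np.ndarray, sigma: float = 1.0, n_components: int = 2, t: int = 1):
    """Basic diffusion-map embedding of row vectors in X (illustrative, unweighted)."""
    d2 = ((X[:, None, :] - X[None, :, :]) ** 2).sum(-1)   # pairwise squared distances
    K = np.exp(-d2 / (2 * sigma ** 2))                    # Gaussian affinity
    P = K / K.sum(axis=1, keepdims=True)                  # row-stochastic Markov matrix
    vals, vecs = np.linalg.eig(P)
    idx = np.argsort(-vals.real)[1:n_components + 1]      # skip the trivial eigenvalue 1
    return (vals.real[idx] ** t) * vecs.real[:, idx]      # diffusion coordinates

# Toy "pixels" with 4 spectral bands; Euclidean distance in the embedded space
# approximates the diffusion distance used for clustering.
pixels = np.random.default_rng(1).normal(size=(50, 4))
embedding = diffusion_map(pixels, sigma=2.0)
print(embedding.shape)     # (50, 2)
```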
In this study, a novel infrared weak small-target detection network based on sparse attention and multiscale feature fusion is proposed. It addresses the low pixel occupancy and limited texture features of weak infrared small targets in complex backgrounds, which lead to difficulties in feature extraction, low detection rates, and high false-alarm rates. The network utilizes the split attention of ResNest to extract features at different scales. A BiFormer attention module is introduced to learn long-range relationships between targets and backgrounds. Furthermore, a fusion module merges high- and low-level features, and the final detection result is produced as a binary image by a head module. The experimental results demonstrate that the proposed method achieves the best performance in terms of both intersection over union (IoU) and F-measure. Compared with the dense nested attention network (DNANet), the proposed method improved the IoU by 3.9% and the F-measure by 5.6%; compared with the attentive bilateral contextual network (ABCNet), it improved the IoU by 5.8% and the F-measure by 10%. Moreover, the proposed approach exhibits robustness and adaptability in detecting weak infrared small targets in diverse complex backgrounds, making it well suited to weak infrared small-target detection in such scenes.
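For reference, the NumPy sketch below computes pixel-level IoU and F-measure for binary detection masks, the two metrics reported above; the small shifted-target example is synthetic.

```python
import numpy as np

def mask_metrics(pred: np.ndarray, gt: np.ndarray, eps: float = 1e-9):
    """Pixel-level IoU and F-measure for binary detection masks (illustrative)."""
    pred, gt = pred.astype(bool), gt.astype(bool)
    inter = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    iou = inter / (union + eps)
    precision = inter / (pred.sum() + eps)
    recall = inter / (gt.sum() + eps)
    f_measure = 2 * precision * recall / (precision + recall + eps)
    return iou, f_measure

gt = np.zeros((64, 64), dtype=np.uint8);  gt[30:34, 30:34] = 1     # small target
pred = np.zeros_like(gt);                 pred[31:35, 31:35] = 1   # slightly shifted prediction
print(mask_metrics(pred, gt))   # IoU ~0.39, F-measure ~0.56
```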
A visible and infrared image matching method (VIMN) based on multiscale feature point extraction is proposed to address the low matching accuracy and poor applicability caused by the significant differences in image features in visible and infrared image matching tasks. First, to enhance the ability of the VIMN to adapt to geometric image transformations, a deformable convolution layer is introduced into the feature extraction module, and a spatial pyramid pooling (SPP) layer is used to perform multiscale feature fusion, considering both low- and high-level semantic information of the image. Second, a joint spatial and channel response score map is constructed on the multiscale fusion feature map to extract robust feature points. Finally, an image patch matching module uses metric learning to match visible-light and infrared images. To verify the superiority of the VIMN, comparative experiments were conducted on matching datasets against scale-invariant feature transform (SIFT), particle swarm optimization (PSO)-SIFT, dual disentanglement network (D2 Net), and contextual multiscale multilevel network (CMM-Net). The qualitative and quantitative results indicate that the proposed VIMN achieves better matching performance.
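A minimal PyTorch sketch of a spatial pyramid pooling (SPP) layer, which maps feature maps of arbitrary size to a fixed-length descriptor; the pyramid levels used here are an assumption, not those of the VIMN.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SPP(nn.Module):
    """Spatial pyramid pooling: fixed-length descriptor from feature maps of any size."""
    def __init__(self, levels=(1, 2, 4)):
        super().__init__()
        self.levels = levels

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        pooled = [F.adaptive_max_pool2d(x, L).flatten(1) for L in self.levels]
        return torch.cat(pooled, dim=1)       # B x (C * sum(L*L))

feat = torch.randn(2, 64, 37, 53)             # arbitrary spatial size
print(SPP()(feat).shape)                      # torch.Size([2, 1344]) = [2, 64*(1+4+16)]
```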
In infrared and visible image fusion, fused images often suffer from insufficient prominence of significant targets, inadequate expression of visible-light information, edge blurring, and local information imbalance under uneven lighting conditions. To address these issues, an image fusion algorithm that combines attention mechanisms and an equilibrium loss, termed the depthwise separable, squeeze-and-excitation, and equilibrium loss-based convolutional neural network (DSEL-CNN), is proposed. First, depthwise separable convolution is used to extract image features. Subsequently, a fusion strategy applies the squeeze-and-excitation attention mechanism to increase the weight of effective information. Finally, an equilibrium composite loss function is used to compute the loss of the fused image and ensure balanced information. A comparison with the fusion generative adversarial network (FusionGAN), DenseFuse, and four other fusion algorithms on the TNO and multi-spectral road scenarios (MSRS) public datasets showed that the proposed method improved mutual information (MI), visual information fidelity (VIF), and the edge retention index (Qabf) by up to 1.033, 0.083, and 0.069, respectively. The experimental results demonstrate that the proposed algorithm outperforms the six commonly used fusion methods in terms of visual perception, information content, and edge and texture preservation in the fused images.
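The PyTorch sketch below shows a generic squeeze-and-excitation channel-attention block of the kind used in the fusion strategy; the reduction ratio and tensor sizes are illustrative assumptions.

```python
import torch
import torch.nn as nn

class SEBlock(nn.Module):
    """Squeeze-and-excitation channel attention (generic form, not the paper's exact layer)."""
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction), nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels), nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        w = self.fc(x.mean(dim=(2, 3)))       # squeeze: global average pool -> B x C
        return x * w.view(b, c, 1, 1)         # excite: reweight each channel

fused_feat = torch.randn(1, 64, 120, 160)     # stand-in for fused IR/visible features
print(SEBlock(64)(fused_feat).shape)          # torch.Size([1, 64, 120, 160])
```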
An infrared and visible image fusion algorithm based on a dual-discriminator generative adversarial network is proposed to address issues such as the insufficient extraction of global and multiscale features and the imprecise extraction of key information in existing infrared and visible image fusion algorithms. First, the generator combines convolution and self-attention mechanisms to capture multiscale local and global features. Second, an attention mechanism is combined with skip connections to fully utilize the multiscale features and reduce information loss during downsampling. Finally, two discriminators guide the generator to focus on the salient targets of the infrared images and the background texture information of the visible-light images, allowing the fused image to retain more critical information. Experimental results on the public multi-scenario multi-modality (M3FD) and multi-spectral road scenarios (MSRS) datasets show that, compared with the baseline algorithms, the results on the six evaluation metrics improved significantly. Specifically, the average gradient (AG) increased by 27.83% and 21.06% on the two datasets, respectively, compared with the second-best results. The fusion results of the proposed algorithm are rich in detail and exhibit superior visual effects.
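For reference, the average gradient (AG) metric cited above can be computed as in the NumPy sketch below, using a common finite-difference definition (the paper's exact formulation may differ slightly).

```python
import numpy as np

def average_gradient(img: np.ndarray) -> float:
    """Average gradient (AG) of a grayscale image, a common sharpness/detail metric."""
    img = img.astype(np.float64)
    dx = img[:, 1:] - img[:, :-1]      # horizontal differences
    dy = img[1:, :] - img[:-1, :]      # vertical differences
    # Crop to a common (M-1, N-1) grid before combining the two directions.
    g = np.sqrt((dx[:-1, :] ** 2 + dy[:, :-1] ** 2) / 2.0)
    return float(g.mean())

fused = np.random.default_rng(2).integers(0, 256, (256, 256)).astype(np.float64)
print(f"AG = {average_gradient(fused):.2f}")
```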
A stable interactive registration algorithm (SIRA) based on the Imregtform algorithm is proposed to address issues such as complex image backgrounds, low mutual information, and few effective feature points, which make registration difficult in the detection of unexploded ordnance (UXO) using infrared and visible-light imaging techniques. First, the Cpselect algorithm is incorporated to accurately align key nodes of the image, which are aggregated by arithmetic averaging to form the initial matrix. The contrast-limited adaptive histogram equalization (CLAHE) algorithm is then incorporated to adaptively segment and equalize the image while avoiding contrast over-enhancement, combined with bilinear interpolation to ensure smooth continuity between regions and a stable iterative alignment process. Matrix Frobenius proximity (MFP) is introduced as a registration evaluation index to alleviate the volatility of traditional evaluation indices. Experimental results show that SIRA improved the registration efficiency by a factor of approximately 4.72 and the MFP by a factor of 15.47 compared with the Imregtform algorithm. The algorithm exhibits high accuracy and stability for UXO image registration.
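The abstract does not give the exact MFP formula; the sketch below assumes an inverse Frobenius-distance form for illustration, and also shows the standard OpenCV CLAHE call of the kind used as the preprocessing step.

```python
import numpy as np
import cv2

def frobenius_proximity(T_est: np.ndarray, T_ref: np.ndarray) -> float:
    """Illustrative Frobenius-norm-based proximity between two 3x3 transform matrices.
    The inverse-distance form is an assumption, not the paper's exact MFP definition."""
    return 1.0 / (np.linalg.norm(T_est - T_ref, ord="fro") + 1e-12)

T_ref = np.eye(3)
T_est = np.array([[1.01, 0.00,  2.0],
                  [0.00, 0.99, -1.5],
                  [0.00, 0.00,  1.0]])
print(f"proximity = {frobenius_proximity(T_est, T_ref):.3f}")

# CLAHE preprocessing step (OpenCV), used to stabilize the iterative registration
gray = np.random.default_rng(3).integers(0, 256, (240, 320), dtype=np.uint8)
clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
equalized = clahe.apply(gray)
```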
A dual-branch feature enhancement and fusion backbone network (DBEF-Net) is proposed for object detection to address the challenges of infrared and visible bimodal object detection in complex dynamic environments. Specifically, DBEF-Net targets insufficient object feature expression and the failure of bimodal fusion to fully exploit the complementarity of infrared and visible features, which lead to missed and false detections. To address the model's insufficient attention to infrared and visible-light features, a feature interaction enhancement module is designed to effectively focus on and enhance the useful information in the bimodal features. A transformer-based bimodal fusion network is further adopted, and a cross-attention mechanism is introduced to achieve deep fusion between the modalities and exploit their complementary features more effectively. Experimental results show that the proposed method achieves higher average detection accuracy than existing bimodal object detection algorithms on the SYUGV dataset while meeting the processing-speed requirements for real-time detection.
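A generic cross-attention fusion block is sketched below in PyTorch to illustrate how one modality can query the other; the token dimensions and head count are assumptions, and this is not the exact DBEF-Net module.

```python
import torch
import torch.nn as nn

class CrossModalAttention(nn.Module):
    """Generic cross-attention block: visible tokens query infrared tokens (a sketch)."""
    def __init__(self, dim: int = 256, heads: int = 8):
        super().__init__()
        self.attn = nn.MultiheadAttention(dim, heads, batch_first=True)
        self.norm = nn.LayerNorm(dim)

    def forward(self, vis_tokens: torch.Tensor, ir_tokens: torch.Tensor) -> torch.Tensor:
        fused, _ = self.attn(query=vis_tokens, key=ir_tokens, value=ir_tokens)
        return self.norm(vis_tokens + fused)     # residual connection

vis = torch.randn(1, 400, 256)   # 20x20 visible feature map flattened to tokens
ir = torch.randn(1, 400, 256)    # matching infrared tokens
print(CrossModalAttention()(vis, ir).shape)   # torch.Size([1, 400, 256])
```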
In response to the growing demand for moisture detection across various domestic industries, near-infrared spectroscopy offers the advantage of non-contact measurement. However, existing equipment often suffers from drawbacks such as large volume, high power consumption, and difficulty in modulating the optical signal, and the core technology is still monopolized by foreign manufacturers. In this study, a detection scheme using light-emitting diodes (LEDs) at a 1450 nm detection band and a 1200 nm reference band as the measurement light source is proposed. A portable near-infrared moisture detector was developed using non-dispersive infrared technology, and gradient moisture detection experiments were conducted on crops. The experimental results showed that the instrument was stable and performed well in the moisture content test of wheat kernels, with high moisture-content discrimination. In addition, the instrument had a fast response speed and convenient light-source frequency modulation. After further optimization, this instrument can be widely used for non-destructive real-time online moisture detection.
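A minimal sketch of the two-wavelength measurement principle, assuming a simple absorbance ratio between the 1450 nm water-sensitive band and the 1200 nm reference band; the detector readings and the mapping from this index to moisture content are placeholders that would be calibrated experimentally.

```python
import numpy as np

def absorbance_ratio(i_1450: float, i_1200: float, i0_1450: float, i0_1200: float) -> float:
    """Ratio of absorbances at the 1450 nm water band and the 1200 nm reference band.
    i0_* are blank/reference intensities; the calibration curve mapping this ratio
    to moisture content would be fitted experimentally (assumption, not the paper's model)."""
    a_det = -np.log10(i_1450 / i0_1450)   # absorbance at the water-sensitive band
    a_ref = -np.log10(i_1200 / i0_1200)   # absorbance at the reference band
    return a_det / (a_ref + 1e-12)

# Placeholder detector readings (arbitrary units)
print(f"moisture index = {absorbance_ratio(0.62, 0.90, 1.00, 1.00):.2f}")
```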