Owing to the scarcity of pixel values and limited color features in infrared street images, issues such as missed detections, false detections, and poor detection performance are common. To address these problems, a spatially adaptive and content-aware infrared small object detection algorithm is proposed. The key components of this algorithm are as follows. 1) Spatially adaptive transformer: This transformer is designed by stacking local attention and deformable attention mechanisms to enhance the modeling capability of long-range dependency features and capture more spatial positional information. 2) Content-aware reassembly of features (CARAFE) operator: This operator is used for feature upsampling, aggregating contextual information within a large receptive field, and adaptively recombining features using shallow-level information. 3) High-resolution prediction head: A high-resolution prediction head of size 160×160 is added to map the pixels of input features to finer detection regions, further improving the detection performance of small objects. Experimental results on the FLIR dataset demonstrate that the proposed algorithm achieves a mean average precision (mAP) of 85.6%, representing a 3.9% improvement over the YOLOX-s algorithm. These results validate the superiority of the proposed algorithm in detecting small objects in infrared images.
In this study, we propose a robust adaptive tracking algorithm for infrared dim objects that addresses the problem of tracking discontinuities and failures caused by missed detections and clutter in complex scenes. In the pre-processing stage, an algorithm that measures the image complexity eliminates unnecessary calculations. This algorithm determines the scene type by calculating multiple features of the infrared image to obtain the scene complexity, and then selects the corresponding detection algorithm to extract the candidate target locations, grayscale features, and local histogram features. Subsequently, a measurement model and likelihood function are established based on the scene type. In the tracking stage, to flexibly match the filtering parameters of the generalized labeled multi-Bernoulli (GLMB) filter, an adaptive algorithm suitable for video image distribution is proposed for track initiation. To handle the unknown detection probability of an infrared image sequence, a cardinalized probability hypothesis density (CPHD) filter is integrated into the GLMB filter to estimate the detection probability of the target in real time, thereby improving the accuracy of the tracker. The simulation results show that the proposed algorithm can effectively track small infrared objects in different complex scenarios.
Addressing the issues of inadequate feature extraction, lack of saliency in fused image regions, and missing detailed information in infrared-visible image fusion, this paper proposes a method for infrared-visible image fusion based on multi-scale contrast enhancement and a cross-modal interactive attention mechanism. The main components of the proposed method are as follows. 1) Multi-scale contrast enhancement module: Designed to strengthen the intensity information of target regions, facilitating the fusion of complementary information from both infrared and visible images. 2) Dense connection block: Employed for feature extraction to minimize information loss and maximize information utilization. 3) Cross-modal interactive attention mechanism: Developed to capture crucial information from both modalities and enhance the performance of the network. 4) Decomposition network: Designed to decompose the fused image back into source images, incorporating more scene details and richer texture information into the fused image. The proposed fusion framework was experimentally evaluated on the TNO dataset. The results show that the fused images obtained by this method feature significant target regions, rich detailed textures, better fusion performance, and stronger generalization ability. Additionally, the proposed method outperforms other compared algorithms in both subjective performance and objective evaluation.
To address the challenges of detail loss and the imbalance between visual detail features and infrared (IR) target features in fused infrared and visible images, this study proposes a fusion method combining multiscale feature fusion and efficient multi-head self-attention (EMSA). The method includes several key steps. 1) Multiscale coding network: It utilizes a multiscale coding network to extract multilevel features, enhancing the descriptive capability of the scene. 2) Fusion strategy: It combines transformer-based EMSA with dense residual blocks to address the imbalance between local details and overall structure in the fusion process. 3) Nested-connection based decoding network: It takes the multilevel fusion map and feeds it into a nested-connection based decoding network to reconstruct the fused result, emphasizing prominent IR targets and rich scene details. Extensive experiments on the TNO and M3FD public datasets demonstrate the efficacy of the proposed method. It achieves superior results in both quantitative metrics and visual comparisons. Specifically, the proposed method excels in targeted detection tasks, demonstrating state-of-the-art performance. This approach not only enhances the fusion quality by effectively preserving detailed information and balancing visual and IR features but also establishes a benchmark in the field of infrared and visible image fusion.
To enhance the recognition efficiency of UAVs in dark conditions and reduce missed detections and delays in complex environments and road conditions, this study proposes an improved YOLOv5s-GN-CB infrared image recognition method. This method enhances the efficiency of UAV infrared aerial images for detecting vehicles, people, and other types of targets. The main improvements to YOLOv5s achieved in this study include the following three aspects: 1) introducing the Ghost module into the YOLOv5s backbone network and incorporating NWD loss into Ghost; 2) adding the coordinate attention (CA) mechanism; 3) incorporating a weighted bidirectional feature pyramid network (BiFPN). The improved YOLOv5s-GN-CB detection model achieves a mean average precision (mAP@0.5) of 95.1% on the InfiRay infrared aerial photography man-vehicle detection dataset, with the frame rate increased to 75.188 frames per second. Compared with the original YOLOv5 model, the mAP and frame rate are improved by 4.2% and 12.02%, respectively. In the same scenario, the detection accuracy of target recognition in UAV infrared aerial images is significantly improved, and the detection delay is reduced.
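As a rough illustration of the NWD term mentioned above, the sketch below assumes that NWD refers to the normalized Gaussian Wasserstein distance commonly used for tiny-object detection, in which each box (cx, cy, w, h) is modeled as a 2D Gaussian. The function names and the constant `c` are hypothetical placeholders; the abstract does not state how the loss is configured or combined with other terms.

```python
import math


def nwd(box_a, box_b, c=12.8):
    """Normalized Wasserstein distance between two (cx, cy, w, h) boxes.

    Each box is treated as a Gaussian with mean (cx, cy) and covariance
    diag(w^2 / 4, h^2 / 4). `c` is a dataset-dependent scale; the default
    here is an arbitrary placeholder, not a value from the abstract.
    """
    cxa, cya, wa, ha = box_a
    cxb, cyb, wb, hb = box_b
    # Squared 2-Wasserstein distance between the two Gaussians.
    w2_squared = (
        (cxa - cxb) ** 2
        + (cya - cyb) ** 2
        + ((wa - wb) / 2.0) ** 2
        + ((ha - hb) / 2.0) ** 2
    )
    return math.exp(-math.sqrt(w2_squared) / c)


def nwd_loss(pred_box, target_box, c=12.8):
    """Loss that is 0 for identical boxes and approaches 1 as they diverge."""
    return 1.0 - nwd(pred_box, target_box, c)
```

Unlike IoU, this similarity stays non-zero and varies smoothly for small boxes that do not overlap at all, which is the usual motivation for using it on small targets.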
To address the challenges of low contrast, low signal-to-noise ratio, and low resolution in infrared images, this study proposes an infrared object detection network that combines traditional image processing methods with deep learning technology for feature enhancement and fusion. The main steps in this approach are as follows. 1) Preprocessing: The network employs image filtering, sharpening, and equalization methods to highlight object features in the infrared image and enrich the input information. 2) Feature extraction: A multi-level information aggregation feature extraction structure has been designed to fully extract and integrate the spatial and semantic information of objects, addressing both single-dimension and multi-dimension features. 3) Attention mechanism: To improve the weighting of key features in the extraction structure, a hybrid attention mechanism is introduced. This captures global context information in multiple ways, enhancing both spatial and channel information. 4) Feature fusion: An adaptive weighting method is applied to fuse features from adjacent dimensions, ensuring accurate and efficient detection of infrared objects. Experimental results on the KAIST, FLIR, and RGBT datasets show that the proposed method significantly improves the performance of infrared object detection compared to existing neural network-based methods. Additionally, this method demonstrates higher adaptability in complex scenes compared to other similar algorithms.
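The equalization step in the preprocessing stage above can be sketched with plain global histogram equalization. This is only a minimal example under the assumption of an integer-valued grayscale image; the abstract does not specify which equalization variant, filter, or sharpening kernel the network actually uses, and the function name is hypothetical.

```python
def equalize_histogram(image, levels=256):
    """Global histogram equalization for a 2D list of integer gray levels."""
    flat = [pixel for row in image for pixel in row]

    # Histogram and cumulative distribution of gray levels.
    hist = [0] * levels
    for pixel in flat:
        hist[pixel] += 1
    cdf, running = [], 0
    for count in hist:
        running += count
        cdf.append(running)

    total = len(flat)
    cdf_min = next(value for value in cdf if value > 0)
    if total == cdf_min:
        # Constant image: nothing to stretch.
        return [row[:] for row in image]

    # Map each gray level so that the output CDF is approximately linear.
    lut = [
        max(0, round((value - cdf_min) / (total - cdf_min) * (levels - 1)))
        for value in cdf
    ]
    return [[lut[pixel] for pixel in row] for row in image]
```

A low-contrast patch such as `[[100, 101], [102, 103]]` is stretched across the full 0 to 255 range, which is the effect that makes faint infrared objects easier to separate from the background.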
To solve the problems of over-smoothed blur, texture distortion, and excessively large parameter counts in real-world infrared-image recovery algorithms, a global-local attention-guided super-resolution reconstruction algorithm for infrared images is proposed. First, a cross-scale global-local feature fusion module utilizes multi-scale convolution and a transformer to fuse information at different scales in parallel and to guide the effective fusion of global and local information through learnable factors. Second, a novel domain randomization degradation model accommodates the degradation domain of real-world infrared images. Finally, a new hybrid loss based on weight learning and regularization penalty enhances the recovery capability of the network while speeding up convergence. Test results on classical degraded images and real-world infrared images show that, compared with existing methods, the images recovered by the proposed algorithm have more realistic textures and fewer boundary artifacts. Moreover, the total number of parameters can be reduced by up to 20%.
In the realm of image processing algorithms for infrared thermal imaging, traditional algorithm simulation often requires the use of graphics processing tools such as MATLAB. These tools simulate the algorithm, which then needs to be converted into code that can run on FPGAs. However, the language and implementation methods used in MATLAB are entirely different from those used in FPGA hardware description languages. This results in complex conversion processes, loss of conversion accuracy, and long development cycles. This study proposes an infrared-image simulation method and system based on the ModelSim simulation tool. As with graphics processing tools such as MATLAB, after writing the code, one can import an image into the simulation, immediately display the output image, and view data changes during the intermediate processing stages. Furthermore, the simulation code run by ModelSim can be directly transferred to the FPGA compilation tool for deployment on the hardware board. In engineering applications, this simplifies the conversion process and significantly improves development efficiency.
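One common way to get image data into an HDL simulation is to dump the pixels as hexadecimal text that a testbench can load with the Verilog `$readmemh` system task. The sketch below illustrates that idea only; the abstract does not describe how its system imports images, and the helper names and the 14-bit pixel width used in the usage note are assumptions.

```python
def image_to_readmemh(image, bits=16):
    """Serialize a 2D list of pixels as one hex word per line, row-major."""
    digits = (bits + 3) // 4          # hex digits needed for `bits` bits
    max_value = (1 << bits) - 1
    lines = []
    for row in image:
        for pixel in row:
            if not 0 <= pixel <= max_value:
                raise ValueError(f"pixel {pixel} does not fit in {bits} bits")
            lines.append(f"{pixel:0{digits}x}")
    return "\n".join(lines) + "\n"


def readmemh_to_image(text, width):
    """Rebuild a 2D image from hex text, e.g. a memory dump from the testbench."""
    values = [int(token, 16) for token in text.split()]
    if len(values) % width != 0:
        raise ValueError("pixel count is not a multiple of the image width")
    return [values[i:i + width] for i in range(0, len(values), width)]
```

For example, `image_to_readmemh([[0, 255], [4096, 16383]], bits=14)` yields four lines of four hex digits each, and `readmemh_to_image` recovers the original rows so that simulated output can be inspected as an image again.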
In naval warfare, infrared missiles are important weapons for destroying targets. To effectively demonstrate the ability of infrared missiles to destroy targets, it is necessary to first obtain the infrared target characteristics of typical destroyers. Therefore, this study investigated the infrared target characteristics of "Arleigh Burke" class destroyers. A three-dimensional physical model of the "Arleigh Burke" class destroyer was established. The flow field characteristics and medium- and long-wave infrared radiation characteristics of the model in the cruise state were calculated via numerical simulations. The results indicate that the exhaust plume at the ship's funnel reaches a temperature as high as 688.5 K, whereas the temperature of the ship body is low and uniform. The medium-wave spectral radiation of ship targets is primarily contributed by high-temperature plumes, whereas the long-wave spectral radiation is primarily contributed by the normal-temperature ship body. In addition, the characteristics of medium- and long-wave spectral radiation are significantly affected by the observation angle. The maximum value of the medium-wave spectral radiation intensity is 65000 W/(sr·μm), and the highest value of the long-wave spectral radiation intensity, 18200 W/(sr·μm), is reached when the ship is observed from directly above.
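The split between plume-dominated medium-wave radiation and body-dominated long-wave radiation is consistent with basic blackbody physics. The sketch below applies Wien's displacement law and Planck's law to the reported plume temperature of 688.5 K and to an assumed ship-body temperature of 300 K; that body temperature, the blackbody idealization, and the sample wavelengths of 4 μm and 10 μm are illustrative assumptions, not values from the study.

```python
import math

H = 6.62607015e-34         # Planck constant, J*s
C = 2.99792458e8           # speed of light, m/s
K_B = 1.380649e-23         # Boltzmann constant, J/K
WIEN_B_UM_K = 2897.771955  # Wien displacement constant, um*K


def wien_peak_um(temperature_k):
    """Wavelength (um) at which blackbody spectral radiance peaks."""
    return WIEN_B_UM_K / temperature_k


def planck_radiance(wavelength_um, temperature_k):
    """Blackbody spectral radiance in W / (m^2 * sr * m)."""
    lam = wavelength_um * 1e-6
    exponent = H * C / (lam * K_B * temperature_k)
    return 2.0 * H * C ** 2 / lam ** 5 / math.expm1(exponent)


PLUME_K = 688.5  # plume temperature reported in the abstract
BODY_K = 300.0   # assumed ambient ship-body temperature (hypothetical)

plume_peak = wien_peak_um(PLUME_K)
body_peak = wien_peak_um(BODY_K)
# Per-unit-area radiance advantage of the plume at one medium-wave and
# one long-wave sample wavelength.
ratio_mid = planck_radiance(4.0, PLUME_K) / planck_radiance(4.0, BODY_K)
ratio_long = planck_radiance(10.0, PLUME_K) / planck_radiance(10.0, BODY_K)
```

Under these assumptions the plume peaks in the medium-wave band and the body peaks in the long-wave band, and the plume's per-area advantage is far larger at 4 μm than at 10 μm, so the much larger surface area of the hull can dominate the long-wave signature.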
In this study, the effect of the type-I band on the performance of HgCdTe-based nBn devices was analyzed theoretically. A theoretical calculation of the relationship between the composition and doping concentration of the barrier layer and the band offset was obtained, and the relationship between the doping concentration of the absorption layer and the dark current of nBn LWIR HgCdTe devices was determined. Both the doping concentration and composition gradient between the barrier and absorption layers of nBn LWIR HgCdTe devices were optimized. A two-dimensional device simulation model was established, and the band structure of nBn LWIR HgCdTe devices was calculated. The results show that optimization of the device structure parameters effectively reduced the turn-on voltage required for device operation, while almost no depletion region was formed in the absorption layer, which effectively inhibited the SRH generation-recombination current and tunneling current. In this study, we also calculated the temperature-dependent dark current of optimized nBn LWIR HgCdTe devices; the operating temperature of the device was above 110 K. This study establishes a theoretical basis for developing high-performance barrier-structured LWIR-HgCdTe devices.
The modulation transfer function (MTF) is an important parameter for evaluating the imaging ability of an infrared focal plane array (FPA) for targets with different spatial frequencies. The MTF of the FPA is affected by the size of the photosensitive area of the pixel, the pixel center-to-center distance, and the carrier diffusion length. As the pixel size decreases, the influence of the carrier diffusion length on the MTF becomes more evident. In this study, a convenient and accurate MTF testing method was designed to meet the requirements of MTF testing for hybrid FPAs. A special microstructure was fabricated on the focal plane through metal deposition, photolithography, and other processes to replace the inclined knife edge. The MTF of the FPA was obtained using the proposed test method. The results demonstrate that the MTF of the FPA can be measured accurately and conveniently using this method, which allows FPA manufacturers and developers to evaluate FPA performance and verify device fabrication quickly.
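For context, edge-based MTF measurement generally differentiates the measured edge spread function (ESF) to obtain the line spread function (LSF) and then normalizes the magnitude of its Fourier transform. The sketch below shows that generic processing chain only; it is not the specific procedure of the study, and the function name and the uniformly sampled ESF input are assumptions.

```python
import cmath
import math


def mtf_from_edge(esf):
    """Estimate the MTF from a uniformly sampled edge spread function.

    Returns MTF values at frequencies k / n cycles per sample for
    k = 0 .. n // 2, where n = len(esf) - 1 is the LSF length.
    """
    # The line spread function is the derivative of the edge response.
    lsf = [b - a for a, b in zip(esf, esf[1:])]
    n = len(lsf)

    # Magnitude of the discrete Fourier transform of the LSF.
    spectrum = []
    for k in range(n // 2 + 1):
        acc = sum(
            value * cmath.exp(-2j * math.pi * k * i / n)
            for i, value in enumerate(lsf)
        )
        spectrum.append(abs(acc))

    dc = spectrum[0]
    if dc == 0:
        raise ValueError("edge has no net transition; MTF is undefined")
    # Normalize so that MTF(0) = 1.
    return [value / dc for value in spectrum]
```

A perfectly sharp edge gives a flat MTF of 1 at all frequencies, whereas an edge smeared over several samples (for example by carrier diffusion between pixels) gives an MTF that falls below 1 at non-zero frequencies.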
To meet the ongoing demand in the color separation industry for linear InGaAs short-wavelength-infrared focal plane detectors with high uniformity, low dark current, and few blind pixels, a 512×2-element linear InGaAs short-wavelength-infrared focal plane detector was fabricated from MOCVD-grown n-i-n type InP/InGaAs/InP epitaxial materials using diffusion, passivation-layer preparation, and electrode growth. The dark current of this detector was effectively suppressed by optimizing the detector structure and the passivation-layer technique. Moreover, high reliability and a low blind-pixel count were achieved by optimizing the parameters of flip-chip interconnection. The detector assembly was tested. The measurement results show a peak detectivity of 1.13×10^12 cm·Hz^(1/2)/W, dark current density of 12.8 nA/cm^2, effective pixel rate higher than 99.5%, and response non-uniformity as low as 0.63% at room temperature (25 °C).
An ORB-based infrared binocular ranging method that combines adaptive geometric constraints with random sample consensus (RANSAC) is proposed to address the issues of low computational efficiency, high mismatching rates, and insufficient accuracy in binocular vision measurements by traditional feature matching algorithms. First, key points are detected and described using the FAST and BRIEF algorithms, and the initial matching of feature points is performed using the fast library for approximate nearest neighbors (FLANN) algorithm. Then, based on the slope and distance of the initial matching pairs, appropriate thresholds are selected, and geometric constraints based on these parameters are constructed to eliminate incorrect matching pairs. Finally, RANSAC is used to remove anomalous points and complete the fine matching. The distance of the target object is calculated by combining the thermal camera calibration parameters. Experimental results show that the improved ORB algorithm yields higher quality feature points and greater measurement accuracy compared to traditional algorithms, with an average absolute error of 1.64% in distance measurements, demonstrating its practical value.
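The slope-and-distance constraint and the final ranging step can be sketched as follows. The example assumes rectified left and right images placed side by side, uses the median slope and median length of the connecting lines as references, and applies fixed tolerances; these choices, the function names, and the camera parameters in the usage note are all hypothetical, since the abstract does not state how its adaptive thresholds are derived.

```python
import math
from statistics import median


def filter_by_slope_and_distance(matches, image_width,
                                 slope_tol=0.05, dist_tol=5.0):
    """Keep matches whose connecting line agrees with the majority.

    `matches` is a list of ((x_left, y_left), (x_right, y_right)) pairs.
    The right image is imagined shifted by `image_width` so that each
    match becomes a line segment with a slope and a length.
    """
    if not matches:
        return []
    slopes, lengths = [], []
    for (xl, yl), (xr, yr) in matches:
        dx = xr + image_width - xl  # always positive for in-image points
        dy = yr - yl
        slopes.append(dy / dx)
        lengths.append(math.hypot(dx, dy))

    ref_slope, ref_length = median(slopes), median(lengths)
    return [
        match
        for match, slope, length in zip(matches, slopes, lengths)
        if abs(slope - ref_slope) <= slope_tol
        and abs(length - ref_length) <= dist_tol
    ]


def stereo_depth(x_left, x_right, focal_px, baseline_m):
    """Depth from disparity for a rectified stereo pair: Z = f * B / d."""
    disparity = x_left - x_right
    if disparity <= 0:
        raise ValueError("disparity must be positive for a valid match")
    return focal_px * baseline_m / disparity
```

With an assumed focal length of 800 pixels and a baseline of 0.1 m, a match at x = 100 in the left image and x = 80 in the right image gives a depth of about 4 m. Matches that survive this coarse filter would then go through RANSAC for the fine matching described above.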
Micro-channel plates (MCPs) feature millions of through-holes. Preparing an ion barrier film (IBF) on the input surface of MCP components requires a continuous and dense organic film as a temporary carrier. Therefore, organic membranes are crucial in the preparation of IBF-MCPs. To meet the demand for mass production of IBF-MCP components, improving the production efficiency and qualification rate of organic films is essential. This study analyzed methods to enhance the yield of organic film production, resulting in a 30 percentage point increase in yield. This improvement is significant for boosting the efficiency and success rate of MCP component preparation. Additionally, the technology for preparing organic membrane solutions for MCP components is vital in advancing the development of third-generation highly reliable image tubes.
A temperature correction model, EACN, based on a channel attention mechanism is proposed to address the issues of insufficient accuracy and slow speed in temperature measurements from thermal imaging cameras. First, the number of model parameters is reduced by lowering the feature dimensionality through 1×1 convolution. Second, a channel attention mechanism, ECA, is introduced to enhance the feature saliency expression between channels in the feature mapping module stage, compensating for feature information lost during dimensionality reduction and compression, thereby further improving the feature characterization capability of the model. Finally, through skip connections, shallow feature information is combined with semantic space information in the feature reconstruction stage, thus improving temperature correction accuracy. In the experiments, two data strategies were used on a self-built dataset. The experimental results show that the EACN model outperforms the SRCNN and VDSR models in both correction accuracy and speed.
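The channel attention step can be illustrated with a small sketch in the style of efficient channel attention: global average pooling per channel, a 1D convolution across neighboring channels, a sigmoid gate, and channel-wise rescaling. This assumes that ECA in the abstract follows that general scheme; the function names are hypothetical, and the kernel values, which would be learned in a real model, are fixed inputs here.

```python
import math


def eca_weights(channel_means, kernel):
    """Sigmoid of a zero-padded 1D convolution across the channel axis."""
    if len(kernel) % 2 == 0:
        raise ValueError("kernel length must be odd")
    pad = len(kernel) // 2
    padded = [0.0] * pad + list(channel_means) + [0.0] * pad
    weights = []
    for c in range(len(channel_means)):
        z = sum(k * padded[c + j] for j, k in enumerate(kernel))
        weights.append(1.0 / (1.0 + math.exp(-z)))
    return weights


def eca(feature_map, kernel):
    """Rescale each channel of a [channels][rows][cols] feature map."""
    # Squeeze: global average pooling gives one descriptor per channel.
    means = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]
    # Excite: local cross-channel interaction without reducing dimensionality.
    weights = eca_weights(means, kernel)
    return [
        [[value * weight for value in row] for row in channel]
        for channel, weight in zip(feature_map, weights)
    ]
```

Because the gate is computed from a few neighboring channels only, it adds just `len(kernel)` parameters, which fits the stated goal of keeping the model small while recovering information lost in the 1×1 reduction.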