Light Dim Small Target Detection Network with Multi-Heterogeneous Filters

Fei Zhao; Yingjie Deng

doi:10.3788/AOS221736

Acta Optica Sinica, Vol. 43, Issue 9, 0915001 (2023)

Fei Zhao^* and Yingjie Deng

National Key Laboratory of Science and Technology on ATR, College of Electronic Science and Technology, National University of Defense Technology, Changsha 410073, Hunan, China

Abstract

Objective

Dim small target detection in infrared images with complex backgrounds is a key technology for precise guidance and infrared surveillance systems, and detection performance directly determines whether a task succeeds or fails. The problem has therefore become a hot research topic, and many detection methods have been proposed. Compared with traditional algorithms, deep network algorithms have achieved remarkable results in recent years, and some frameworks built on existing deep networks have been applied to dim small target detection. These methods can improve small target detection by modifying the network structure. However, infrared images carry only a single dimension of information (gray level), and small targets offer few features, so a deep network applied directly to dim small targets in complex infrared backgrounds rarely gives satisfactory results. In addition, the large network scale makes these methods difficult to deploy on resource-constrained embedded platforms.

Methods

Because infrared images carry information in only one dimension and dim small targets have inconspicuous features, this study enriches the information in the original images by incorporating multiple filters with different structures into the YOLOv5n network. Three heterogeneous filters, namely the Top Hat filter, the difference of Gaussian (DoG) filter, and the mean filter, are selected to highlight targets, suppress backgrounds, and filter out high-frequency noise. The three filters process each image at the input layer of the network, expanding the single-channel gray information of the original image into three dimensions, which are fed to the network through three channels. This improves the adaptability of the network to dim small targets in complex backgrounds.
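The channel expansion described above can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the function names, the white (bright-target) variant of the Top Hat filter, the kernel sizes, the Gaussian sigmas, and the channel order are all assumptions made for illustration, since the abstract does not specify them.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def _windows(img, k):
    """All k x k neighbourhoods of an edge-padded image, shape (H, W, k, k)."""
    pad = k // 2
    padded = np.pad(img, pad, mode="edge")
    return sliding_window_view(padded, (k, k))


def top_hat(img, k=5):
    """White top-hat: image minus its opening with a flat k x k square.
    Assumed variant and size; keeps bright structures smaller than k x k."""
    eroded = _windows(img, k).min(axis=(2, 3))
    opened = _windows(eroded, k).max(axis=(2, 3))
    return img - opened


def gaussian_blur(img, sigma):
    """Separable Gaussian blur with edge padding."""
    radius = max(1, int(3 * sigma))
    x = np.arange(-radius, radius + 1)
    kernel = np.exp(-(x ** 2) / (2.0 * sigma ** 2))
    kernel /= kernel.sum()
    padded = np.pad(img, radius, mode="edge")
    rows = np.apply_along_axis(
        lambda r: np.convolve(r, kernel, mode="valid"), 1, padded)
    return np.apply_along_axis(
        lambda c: np.convolve(c, kernel, mode="valid"), 0, rows)


def dog(img, sigma_small=1.0, sigma_large=2.0):
    """Difference of Gaussians (assumed sigmas): a band-pass response."""
    return gaussian_blur(img, sigma_small) - gaussian_blur(img, sigma_large)


def mean_filter(img, k=3):
    """k x k box average (assumed size) that smooths high-frequency noise."""
    return _windows(img, k).mean(axis=(2, 3))


def expand_channels(img):
    """Stack the three filter responses into a (3, H, W) network input.
    The channel order here is an arbitrary choice."""
    img = np.asarray(img, dtype=np.float64)
    return np.stack([top_hat(img), dog(img), mean_filter(img)], axis=0)


# Usage: a flat background with a single bright pixel as a toy "target".
frame = np.full((32, 32), 10.0)
frame[16, 16] = 50.0
x = expand_channels(frame)
print(x.shape)  # (3, 32, 32)
```

In this toy frame the Top Hat channel isolates the bright pixel from the flat background, while the mean channel spreads it over its neighbourhood, which gives a feel for why the three responses are complementary.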

The YOLOv5n network is selected in this study and improved as follows.

1) To make the deep network increase the feature weight of the region of interest and suppress the response of unrelated regions during training, the lightweight convolutional block attention module (CBAM) is added to the backbone of YOLOv5n so that the extracted feature maps play a greater role in subsequent target extraction. The output of the convolution layer first passes through the channel attention module (CAM), which raises the weight of target-related features, and then through the spatial attention module (SAM), which enables the weighted features to persist in the deeper network.

2) In the standard YOLOv5n network, target detection is carried out on the feature maps of the P17, P20, and P23 layers, and targets are searched for and selected through preset anchor boxes of different sizes. Because the shallow network has large feature maps that retain rich original information, it is conducive to small target detection. This study therefore adjusts the anchor sizes in the P3 layer to [5, 6, 6, 8, 9, 11].

3) The receptive field of the shallow network is small, which is conducive to extracting local features of the target, whereas the deep network has a large receptive field and mainly extracts global features. For small target detection, the features extracted by the deep network are limited and may even interfere with the final detection results. After multi-layer feature extraction in the backbone, the deep network contains almost no small target features, so the standard YOLOv5n network is pruned to remove the P5-P23 layers, and only the P3-P17 and P4-P20 output features are used for detection.

By adding attention modules, adopting a small-anchor strategy, and cutting the deep branch of the network, the dim small target detection performance of the YOLOv5n network is improved, and the consumption of computational and storage resources is reduced.
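The CAM-then-SAM ordering in item 1) can be sketched as a plain NumPy forward pass. This follows the commonly published CBAM formulation (average and max pooling, a shared two-layer MLP for channels, a single convolution for the spatial map) rather than the authors' code. The reduction ratio, the 7 x 7 spatial kernel, and the random weights used in the usage lines are assumptions for illustration only.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def channel_attention(feat, w1, w2):
    """CAM: one weight in (0, 1) per channel.
    feat: (C, H, W); w1: (C // r, C); w2: (C, C // r) form a shared MLP."""
    avg = feat.mean(axis=(1, 2))   # global average pooling, shape (C,)
    mx = feat.max(axis=(1, 2))     # global max pooling, shape (C,)

    def mlp(v):
        return w2 @ np.maximum(w1 @ v, 0.0)   # ReLU hidden layer

    return sigmoid(mlp(avg) + mlp(mx))


def spatial_attention(feat, kernel):
    """SAM: one weight in (0, 1) per pixel.
    kernel: (2, k, k) convolution over the channel-wise mean and max maps."""
    desc = np.stack([feat.mean(axis=0), feat.max(axis=0)])   # (2, H, W)
    k = kernel.shape[-1]
    pad = k // 2
    padded = np.pad(desc, ((0, 0), (pad, pad), (pad, pad)))  # zero padding
    windows = sliding_window_view(padded, (k, k), axis=(1, 2))
    return sigmoid(np.einsum("chwij,cij->hw", windows, kernel))


def cbam(feat, w1, w2, kernel):
    """Apply channel attention first, then spatial attention."""
    feat = feat * channel_attention(feat, w1, w2)[:, None, None]
    return feat * spatial_attention(feat, kernel)[None, :, :]


# Usage with random (untrained) weights; C = 8 and r = 4 are arbitrary.
rng = np.random.default_rng(0)
feature_map = rng.normal(size=(8, 16, 16))
refined = cbam(
    feature_map,
    w1=rng.normal(size=(2, 8)),
    w2=rng.normal(size=(8, 2)),
    kernel=rng.normal(size=(2, 7, 7)),
)
print(refined.shape)  # (8, 16, 16)
```

Both attention maps only rescale the features by factors between 0 and 1, so the module keeps the feature map shape and can be dropped into a backbone without changing the layers around it.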

Results and Discussions

To verify the performance of the algorithm, this study uses an infrared dataset for dim small target detection and tracking against ground/air backgrounds. Several deep network algorithms dedicated to small target detection are selected for comparison, together with classical target detection networks modified for small target detection. In terms of detection performance, the proposed algorithm obtains the second-highest average precision (AP) value of 0.888, which is 1.4% lower than the highest value and 3% higher than the third-highest value. In terms of network size and computational efficiency, the proposed algorithm achieves the fastest processing speed of 416 frames/s with the smallest network size (3 MB), half the size of the algorithm in Ref. [7]. Compared with the algorithm with the best detection performance, the proposed algorithm is approximately 60 times more efficient, and its network size is approximately 1/16. This study also analyzes the performance gains of the individual improvements: introducing multi-heterogeneous filters, adding attention modules, adopting the small anchor box strategy, and pruning the deep network. The experimental results show that the proposed algorithm maintains excellent detection performance with the smallest parameter size and the highest operational efficiency.

Conclusions

To improve the detection performance for dim small targets and make the algorithm easier to deploy, a lightweight dim small target detection network with multi-heterogeneous filters is proposed. Experimental results show that the proposed algorithm detects dim small targets in complex infrared backgrounds effectively. It also consumes fewer computational and storage resources, which lays a foundation for deployment on resource-constrained embedded platforms.

Keywords

infrared dim small target; limited resources; machine vision; multi-heterogeneous filters; YOLOv5n

Fei Zhao, Yingjie Deng. Light Dim Small Target Detection Network with Multi-Heterogeneous Filters[J]. Acta Optica Sinica, 2023, 43(9): 0915001
