Acta Optica Sinica, Volume 45, Issue 9, 0910002 (2025)
Infrared Traffic Object Detection Network for Edge Device Deployment
The rapid advancement of intelligent transportation systems has increased the demand for precise and fast traffic object detection, particularly under challenging low-light conditions and complex backgrounds. Infrared (IR) imaging has emerged as a critical tool for such scenarios because it captures heat signatures, enabling reliable detection even in darkness or through obscurants. However, traditional IR detection methods are often hindered by high computational complexity, large parameter counts, and dependence on high-performance computing resources, making them unsuitable for deployment on resource-limited mobile devices. We introduce the Edge-DETR network, designed for efficient and accurate detection of traffic objects in IR imagery on edge devices.
The Edge-DETR network is an innovative detection framework that builds upon the RT-DETR model with several enhancements. It incorporates a context anchor attention multistage efficient layer aggregation network (CAA-MELAN) to adaptively extract multi-scale features and understand global dependencies, thereby addressing the challenges posed by varying object sizes and environmental dynamics. Additionally, a global information supplement module (GISM) is employed for effective feature downsampling, which ensures the preservation of essential spatial information while reducing computational load. The cross-level feature fusion module (CFFM) facilitates interaction among features at different scales, enhancing the network’s capability to integrate high-level semantic information with low-level spatial details. The HiLo attention mechanism is introduced to optimize intra-scale feature interaction, with a focus on target contours while minimizing parameter and computational requirements. To address the detection of objects with complex shapes and sizes, a Shape-IoU loss function is utilized, which accounts for the shape and size of bounding boxes in the loss calculation. Extensive experiments are conducted across multiple datasets, including FLIR, KAIST, LLVIP, and a self-built dataset, to comprehensively evaluate the network’s performance.
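The HiLo mechanism referenced above splits the attention heads of a layer into a high-frequency path (self-attention within small local windows, capturing contours and detail) and a low-frequency path (attention over window-pooled keys/values, capturing cheap global context). The following is a minimal NumPy sketch of that head split on a single feature map; the identity Q/K/V projections, head count, split ratio, and window size are illustrative assumptions, not values taken from the paper.

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def hilo_attention(x, num_heads=4, alpha=0.5, window=2):
    """Simplified HiLo-style attention on an (H, W, C) feature map.

    Heads are split by `alpha`: low-frequency heads attend to window-averaged
    keys/values (global context at reduced cost), while high-frequency heads
    run self-attention inside each non-overlapping `window` x `window` patch
    (local detail). Identity Q/K/V projections keep the sketch short; a real
    layer would learn these projections.
    """
    H, W, C = x.shape
    hd = C // num_heads                 # per-head channel dim
    lo_heads = int(num_heads * alpha)   # low-frequency heads
    hi_heads = num_heads - lo_heads
    out = np.empty_like(x)

    # --- low-frequency path: every query attends to pooled K/V ---
    lo = x[..., :lo_heads * hd].reshape(H * W, lo_heads, hd)
    pooled = x[..., :lo_heads * hd].reshape(
        H // window, window, W // window, window, lo_heads * hd)
    pooled = pooled.mean(axis=(1, 3)).reshape(-1, lo_heads, hd)
    for h in range(lo_heads):
        q, kv = lo[:, h], pooled[:, h]
        attn = softmax(q @ kv.T / np.sqrt(hd))
        out.reshape(H * W, C)[:, h * hd:(h + 1) * hd] = attn @ kv

    # --- high-frequency path: self-attention inside each window ---
    hi = x[..., lo_heads * hd:].reshape(
        H // window, window, W // window, window, hi_heads, hd)
    hi = hi.transpose(0, 2, 1, 3, 4, 5).reshape(-1, window * window, hi_heads, hd)
    res = np.empty_like(hi)
    for h in range(hi_heads):
        q = hi[:, :, h]                           # (num_windows, w*w, hd)
        attn = softmax(q @ q.transpose(0, 2, 1) / np.sqrt(hd))
        res[:, :, h] = attn @ q
    res = res.reshape(H // window, W // window, window, window, hi_heads * hd)
    res = res.transpose(0, 2, 1, 3, 4).reshape(H, W, hi_heads * hd)
    out[..., lo_heads * hd:] = res
    return out
```

Because the low-frequency keys/values are pooled per window, that path's attention cost drops by roughly the window area, which is where HiLo saves parameters and FLOPs relative to full self-attention.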
The Edge-DETR network demonstrates superior performance across the datasets, significantly outperforming comparable methods in both detection accuracy and computational efficiency. Compared with the RT-DETR baseline, it achieves a 46% reduction in parameters and a 39% decrease in floating-point operations (FLOPs), with the model size compressed by 45% to 21 MB. Accuracy gains are particularly notable for small targets and complex scenes, with markedly fewer false positives and missed detections. Fig. 1 illustrates the network architecture, while Fig. 6 presents heatmaps of detection results, showing that target contours are localized precisely despite the reductions in parameters and FLOPs. The ablation results in Table 1 confirm the contribution of each component, with the full model achieving the best detection accuracy, and the high mAP scores across the datasets further attest to the network's precision. Detailed analysis reveals that the CAA-MELAN module substantially enhances feature extraction, particularly for the small and rectangular targets common in traffic scenes; the CFFM fuses features across scales for a more comprehensive understanding of the scene; the HiLo attention mechanism balances computational efficiency against detection accuracy; and the Shape-IoU loss function fine-tunes the network to handle the complexities of real-world traffic scenarios. Performance also remains robust across different weather conditions and lighting environments, which is crucial for real-world applications.
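The Shape-IoU loss discussed above re-weights the usual IoU-based regression penalties by the ground-truth box's own width/height proportions, so elongated targets such as pedestrians are penalized consistently with their shape. The sketch below follows the published Shape-IoU formulation for axis-aligned boxes in plain Python; the hyperparameter values, and whether Edge-DETR modifies the formulation, are assumptions here.

```python
import math

def shape_iou_loss(box, box_gt, scale=0.0, theta=4.0):
    """Shape-IoU-style loss for axis-aligned boxes (x1, y1, x2, y2).

    The centre-distance and size penalties are weighted by the ground-truth
    aspect ratio (via `ww`, `hh`). `scale` adapts the weighting to the typical
    object size of the dataset; `theta` sharpens the size penalty. Both are
    illustrative defaults, not values from the paper.
    """
    x1, y1, x2, y2 = box
    g1, t1, g2, t2 = box_gt
    w, h = x2 - x1, y2 - y1
    wg, hg = g2 - g1, t2 - t1

    # Plain IoU of the two boxes
    iw = max(0.0, min(x2, g2) - max(x1, g1))
    ih = max(0.0, min(y2, t2) - max(y1, t1))
    inter = iw * ih
    union = w * h + wg * hg - inter
    iou = inter / union if union > 0 else 0.0

    # Shape weights derived from the ground-truth aspect ratio
    ww = 2.0 * wg**scale / (wg**scale + hg**scale)
    hh = 2.0 * hg**scale / (wg**scale + hg**scale)

    # Diagonal of the smallest enclosing box (normalizer)
    cw = max(x2, g2) - min(x1, g1)
    ch = max(y2, t2) - min(y1, t1)
    c2 = cw * cw + ch * ch

    # Shape-weighted, normalized centre distance
    dx = (x1 + x2) / 2 - (g1 + g2) / 2
    dy = (y1 + y2) / 2 - (t1 + t2) / 2
    dist_shape = hh * dx * dx / c2 + ww * dy * dy / c2

    # Shape (size-difference) penalty
    omega_w = hh * abs(w - wg) / max(w, wg)
    omega_h = ww * abs(h - hg) / max(h, hg)
    shape_cost = (1 - math.exp(-omega_w)) ** theta \
               + (1 - math.exp(-omega_h)) ** theta

    return 1.0 - iou + dist_shape + 0.5 * shape_cost
```

A perfectly matching prediction yields a loss of zero, while any offset or size mismatch is penalized in proportion to how it distorts the ground-truth shape.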
The Edge-DETR network has proven effective for IR object detection on edge devices, striking a balance between detection accuracy and computational efficiency. Its adaptability to different scales and contexts, coupled with a lightweight computational footprint, positions it as a strong candidate for edge deployment in traffic object detection. This success stems from its approach to feature extraction and fusion, which yields a detailed understanding of the traffic environment while keeping resource consumption low. Consistent performance under varied lighting and weather conditions suggests dependable detection capabilities, which are essential for safety-critical applications, and its scalability allows adaptation to traffic scenarios of differing complexity, from simple urban settings to busy highway environments. This flexibility, combined with efficiency, makes Edge-DETR well suited to integration into a wide range of transportation systems, and its handling of the nuances of IR imagery opens up applications in other domains where IR detection is critical, such as military surveillance, search and rescue, and industrial automation. As this technology is developed and refined, we anticipate that it will play a significant role in enhancing the safety and efficiency of transportation networks worldwide.
Yulan Han, Deao Chen, Tong Wu, Xianlu Liu, Chaofeng Lan. Infrared Traffic Object Detection Network for Edge Device Deployment[J]. Acta Optica Sinica, 2025, 45(9): 0910002
Category: Image Processing
Received: Dec. 19, 2024
Accepted: Mar. 11, 2025
Published Online: May 20, 2025
The Author Email: Yulan Han (hanyulan@hrbust.edu.cn)
CSTR:32393.14.AOS241913