Infrared and Laser Engineering, Vol. 53, Issue 11, 20240256 (2024)

An improved YOLOv8s method and its application in road traffic target detection

Jiageng SANG1,2, Zhijia ZHANG1, Chuanmin XIAO3, Haibo LUO2, and Junyao ZHANG4
Author Affiliations
  • 1College of Artificial Intelligence, Shenyang University of Technology, Shenyang 110870, China
  • 2Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110169, China
  • 3The Third Military Representative Office of the Air Force Equipment Department, Shenyang 110144, China
  • 4China Academy of Machinery Shenyang Research Institute of Foundry Co., Ltd., Shenyang 110022, China

    Objective: Infrared image target detection has significant application value in transportation, because it helps people detect targets promptly and respond under difficult conditions such as strong light at night or rainy and foggy weather. However, infrared images have low resolution, no color information, poor contrast, and blurred features, so existing models do not achieve high average detection accuracy on infrared vehicles and pedestrians. The main problem is missed detection of overlapping targets and small targets in traffic scenes. This paper therefore aims to design an infrared pedestrian and vehicle detection model based on YOLOv8s (You Only Look Once version 8), which is crucial for improving the safety of intelligent driving.

    Methods: YOLOv8, a recent object detection model, comes in five versions (n, s, m, l, and x) that differ in network depth and width to suit different requirements. YOLOv8s, which offers reasonable detection precision at a moderate model size, is chosen as the base model. The manuscript introduces four improvements to the YOLOv8s architecture (Fig.2). First, the network is restructured to include a small target detection layer, which improves detection of distant pedestrians and vehicles (Fig.3). Second, an SPD (space-to-depth) module replaces the original 3×3 downsampling convolutions in the backbone and neck networks (Fig.4) to preserve fine-grained image details. Third, a hybrid attention mechanism (Fig.5) is designed to strengthen the network's attention to pedestrians and vehicles. Fourth, the Focal EIOU loss function is adopted, which both addresses cases in which the CIOU loss becomes ineffective and mitigates the imbalance between positive and negative samples.

    Results and Discussions: The dataset used in this study is the FLIR ADAS (Advanced Driver Assistance System) v2 dataset, released by Teledyne FLIR in 2022 for environmental perception in autonomous driving applications (Fig.1). The main evaluation metrics are mAP (mean Average Precision) and model size, with P (precision) and R (recall) as secondary metrics. Ablation experiments (Tab.1) verify the contribution of each improvement, and the improved network shows a 5.3% increase in mAP over the initial network. The paper compares the detection effect before and after adding the small object detection layer (Fig.6) and before and after adding the SPD module (Fig.7), compares detection accuracy with different attention mechanisms (Tab.2), and further demonstrates the effectiveness of the hybrid attention mechanism with heat maps (Fig.8-Fig.9). It also compares the detection effect before and after using attention mechanisms, compares performance with different loss functions (Tab.3), and shows the detection effect before and after changing the loss function (Fig.11). On this basis, the detection performance of different algorithms is compared (Tab.4), as is the detection effect before and after the improvement (Fig.12). Across these experiments, the improved network shows excellent detection performance.

    Conclusions: This paper presents an improved YOLOv8s-based algorithm for detecting vehicles and pedestrians in infrared images. The added small target detection layer improves detection of small vehicles and pedestrians. The SPD module preserves fine-grained information during downsampling. The hybrid attention mechanism enables the network to suppress noise interference and focus on the targets themselves. The improved loss function enhances the model's learning capabilities. The refined algorithm demonstrates good detection performance on the test set, with improved detection of overlapping, small, and blurred targets.
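    The abstract says the SPD (space-to-depth) module preserves fine-grained information during downsampling. The sketch below illustrates that idea only and is not the authors' implementation: it rearrange a feature map so that spatial resolution is halved while every input value is kept in extra channels, instead of being discarded as a strided convolution would. The function name `space_to_depth`, the `scale` parameter, and the (C, H, W) layout are assumptions made for illustration. In the SPD-Conv design as it is commonly described, this rearrangement is followed by a stride-1 convolution, which is omitted here.

```python
import numpy as np


def space_to_depth(x: np.ndarray, scale: int = 2) -> np.ndarray:
    """Rearrange a (C, H, W) map into (C * scale**2, H // scale, W // scale).

    Hypothetical helper for illustration; no pixel values are dropped.
    """
    c, h, w = x.shape
    if h % scale or w % scale:
        raise ValueError("H and W must be divisible by scale")
    # For each (row, column) offset inside a scale x scale cell, take the
    # sub-sampled grid starting at that offset, then stack along channels.
    slices = [x[:, i::scale, j::scale] for i in range(scale) for j in range(scale)]
    return np.concatenate(slices, axis=0)


# Toy feature map: 2 channels, 4 x 4 pixels.
feature = np.arange(2 * 4 * 4, dtype=np.float32).reshape(2, 4, 4)
out = space_to_depth(feature)
print(feature.shape, "->", out.shape)  # (2, 4, 4) -> (8, 2, 2)
```

    The output has a quarter of the spatial positions and four times the channels, so the total number of values is unchanged. This is the sense in which the downsampling keeps fine-grained detail available to later layers.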

    Jiageng SANG, Zhijia ZHANG, Chuanmin XIAO, Haibo LUO, Junyao ZHANG. An improved YOLOv8s method and its application in road traffic target detection[J]. Infrared and Laser Engineering, 2024, 53(11): 20240256

    Paper Information

    Category: Image processing

    Received: Jun. 11, 2024

    Published Online: Dec. 13, 2024

    DOI:10.3788/IRLA20240256
