Opto-Electronic Engineering, Volume 49, Issue 3, 210372-1 (2022)

Real-time object detection for UAV images based on improved YOLOv5s

Xu Chen, Dongliang Peng, and Yu Gu*
Author Affiliations
  • School of Automation, Hangzhou Dianzi University, Hangzhou, Zhejiang 310018, China
    Figures & Tables(17)
    YOLOv5 backbone network architecture diagram
    Structure diagram of feature fusion module
    (a) Res-DConv module; (b) Receptive field mapping
    Improved module structure
    YOLOv5sm+ model architecture
    (a) Number of instances per category in the VisDrone dataset; (b) Class confusion matrix of the YOLOv5m algorithm
    Detection examples of different algorithms in VisDrone UAV scenes. (a) YOLOv5m model; (b) YOLOv5sm+ model; (c) YOLOv5s model
    Comparison of the detection results of three algorithms in dense vehicle scenes. (a) YOLOv5m; (b) YOLOv5s; (c) YOLOv5sm+
    Detection comparison of the improved algorithm on the DIOR dataset. (a) YOLOv5s; (b) YOLOv5sm+
    • Table 1. Receptive field analysis table

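      | YOLOv5s | Receptive field | Channels | YOLOv5sm | Receptive field | Channels |
      |---|---|---|---|---|---|
      | Focus | 6 | 32 | Conv 3*3 (stride:2) | 3 | 24 |
      |  |  |  | Conv 3*3 (dilation:2) | 15 | 48 |
      | Down-sampling | 10 | 64 | Conv 3*3 (stride:2) | 19 | 96 |
      |  |  |  | Res-Block | 27 | 96 |
      | C3_x1 | 18 | 64 | Res-Dconv | 51 | 96 |
      | Down-sampling | 26 | 128 | Conv 3*3 (stride:2) | 59 | 192 |
      | C3_x3 | 74 | 128 | C3_x3 | 107 | 192 |
      | Down-sampling | 90 | 256 | Conv 3*3 (stride:2) | 123 | 384 |
      | C3_x3 | 186 | 256 | C3_x3 | 219 | 384 |
      | Down-sampling | 218 | 512 | Conv 3*3 (stride:2) | 251 | 768 |
      | SPP | 218~634 | 512 | SPP | 251~667 | 768 |
      | C3_x1 | 282~698 | 512 | C3_x1 | 315~731 | 768 |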
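
The YOLOv5s receptive-field values in Table 1 can be checked with the standard recurrence r_out = r_in + (k - 1) * d * j, where k is the kernel size, d the dilation, and j the product of the strides of all preceding layers. The sketch below is an illustration, not the authors' code. It assumes that Focus behaves like a 2*2 stride-2 slicing step followed by a 3*3 convolution, that each down-sampling layer is a 3*3 stride-2 convolution, and that every C3 repeat contributes a single 3*3 convolution. The layer list and function name are hypothetical, and the SPP stage is left out.

```python
def receptive_fields(layers):
    """Return {layer name: receptive field} for a sequential chain of layers.

    Each layer is (name, kernel, stride, dilation, repeats).
    """
    rf, jump = 1, 1  # receptive field and cumulative stride at the input
    result = {}
    for name, kernel, stride, dilation, repeats in layers:
        for _ in range(repeats):
            rf += (kernel - 1) * dilation * jump
            jump *= stride
        result[name] = rf
    return result


# Assumed description of the YOLOv5s backbone up to the last down-sampling layer.
YOLOV5S_LAYERS = [
    ("Focus (slice)", 2, 2, 1, 1),    # assumption: slicing acts like a 2*2 stride-2 step
    ("Focus (conv)", 3, 1, 1, 1),
    ("Down-sampling 1", 3, 2, 1, 1),
    ("C3_x1", 3, 1, 1, 1),            # assumption: one 3*3 conv per C3 repeat
    ("Down-sampling 2", 3, 2, 1, 1),
    ("C3_x3 (first)", 3, 1, 1, 3),
    ("Down-sampling 3", 3, 2, 1, 1),
    ("C3_x3 (second)", 3, 1, 1, 3),
    ("Down-sampling 4", 3, 2, 1, 1),
]

if __name__ == "__main__":
    for name, rf in receptive_fields(YOLOV5S_LAYERS).items():
        print(f"{name:18s} {rf}")
```

Under these assumptions the computed values agree with the YOLOv5s column of Table 1 from Focus (6) through the last down-sampling layer (218).
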
    • Table 2. Pre-setting anchors in response to the receptive field and down-sampling

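      | Down-sampling factor | 3 | 4 | 5 |
      |---|---|---|---|
      | Maximum receptive field/pixel | 111 | 255 | 731 |
      | Anchor box range | 8*8~37*37 | 32*32~85*85 | 96*96~365*365 |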
    • Table 3. Statistics of different types of objects

      | Object type | Small (0×0~32×32) | Mid (32×32~96×96) | Large (96×96~) |
      |---|---|---|---|
      | Count | 44.44 | 18.63 | 1.704 |
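
The size bins in Table 3 can be expressed as a small helper. This is a minimal sketch that assumes objects are binned by bounding-box area in pixels and that the intervals are half-open, so a 32×32 box counts as mid and a 96×96 box as large. The table does not state how boundary cases are handled, and the function name is hypothetical.

```python
def size_class(width, height):
    """Classify a bounding box as 'small', 'mid', or 'large' by pixel area."""
    area = width * height
    if area < 32 * 32:
        return "small"
    if area < 96 * 96:
        return "mid"
    return "large"


if __name__ == "__main__":
    for box in [(10, 12), (40, 60), (120, 200)]:
        print(box, size_class(*box))
```
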
    • Table 4. Performance comparison experiment results of depth and width models

      | Depth | Width | mAP50 | mAP | BFLOPs |
      |---|---|---|---|---|
      | 0.33 | 0.5 | 0.502 | 0.288 | 16.5 |
      | 0.33 | 0.75 | 0.540 | 0.319 | 36.3 |
      | 1.33 | 0.5 | 0.525 | 0.311 | 35.4 |
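
The depth and width factors in Table 4 scale the number of block repeats and the number of channels. The sketch below illustrates the rounding rule commonly used in public YOLOv5 configurations: channels are rounded up to a multiple of 8, and repeats are rounded with a minimum of 1. The helper names and the base values (channels 64 to 1024, repeats 3, 9, 9, 3) are assumptions made for illustration and are not taken from the paper.

```python
import math


def scale_channels(base_channels, width_multiple, divisor=8):
    """Scale a channel count and round it up to a multiple of `divisor`."""
    return math.ceil(base_channels * width_multiple / divisor) * divisor


def scale_depth(base_repeats, depth_multiple):
    """Scale a block repeat count, keeping at least one repeat."""
    if base_repeats <= 1:
        return base_repeats
    return max(round(base_repeats * depth_multiple), 1)


BASE_CHANNELS = [64, 128, 256, 512, 1024]  # assumed unscaled channel widths
BASE_REPEATS = [3, 9, 9, 3]                # assumed unscaled C3 repeat counts

if __name__ == "__main__":
    for width in (0.5, 0.75):
        print("width", width, [scale_channels(c, width) for c in BASE_CHANNELS])
    for depth in (0.33, 1.33):
        print("depth", depth, [scale_depth(n, depth) for n in BASE_REPEATS])
```

With these assumed base values, a width of 0.5 gives 32, 64, 128, 256, 512 channels and a width of 0.75 gives 48, 96, 192, 384, 768, which matches the channel columns of Table 1. A depth of 0.33 gives repeats of 1, 3, 3, 1, consistent with the C3_x1 and C3_x3 entries there.
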
    • Table 5. Verification experiment results on Res-Dconv module

      | Baseline | Res-Dconv | mAP50 | mAP | BFLOPs |
      |---|---|---|---|---|
      |  |  | 0.502 | 0.288 | 16.5 |
      |  |  | 0.516 | 0.299 | 19.8 |
    • Table 6. The ablation experiment results of our algorithm modules on the VisDrone dataset

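      | Baseline | SM | SCAM | SDCM | mAP | mAP50 | BFLOPs | Infer/ms | AP-small | AP-medium | AP-large |
      |---|---|---|---|---|---|---|---|---|---|---|
      | YOLOv5s |  |  |  | 0.319 | 0.548 | 16.5 | 4.8 | 0.220 | 0.437 | 0.495 |
      |  |  |  |  | 0.358 | 0.589 | 30.1 | 8.3 | 0.280 | 0.476 | 0.495 |
      |  |  |  |  | 0.324 | 0.555 | 14.7 | 3.8 | 0.225 | 0.446 | 0.511 |
      |  |  |  |  | 0.333 | 0.555 | 19.5 | 4.9 | 0.250 | 0.448 | 0.482 |
      |  |  |  |  | 0.356 | 0.593 | 38.0 | 9.0 | 0.278 | 0.475 | 0.512 |
      |  |  |  |  | 0.360 | 0.596 | 30.8 | 7.7 | 0.281 | 0.479 | 0.505 |
      Note: bold font indicates the best value in each column.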
    • Table 7. Detection performance of different algorithms on VisDrone dataset

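      | Algorithm | mAP50 | mAP | mAP75 | AP-small | AP-mid | AP-large | BFLOPs | Infer/ms |
      |---|---|---|---|---|---|---|---|---|
      | YOLOv3 | 0.609 | 0.389 | 0.417 | 0.297 | 0.496 | 0.545 | 154.9 | 27.8 |
      | Scaled-YOLOv4 | 0.620 | 0.400 | 0.428 | 0.305 | 0.514 | 0.626 | 119.4 | 27.1 |
      | ClusDet[1] | 0.562 | 0.324 | 0.316 | - | - | - | - | - |
      | HRDNet[1] | 0.620 | 0.3551 | 0.351 | - | - | - | - | - |
      | YOLOv5s | 0.548 | 0.319 | 0.317 | 0.220 | 0.437 | 0.495 | 16.5 | 4.8 |
      | YOLOv5m | 0.595 | 0.365 | 0.372 | 0.285 | 0.482 | 0.525 | 50.4 | 9.8 |
      | YOLOX-s | 0.535 | 0.314 | 0.317 | 0.225 | 0.415 | 0.485 | 41.6 | 55.1 |
      | MobileNetv3 | 0.554 | 0.329 | 0.329 | 0.245 | 0.443 | 0.495 | 23.8 | 8.0 |
      | MobileViT | 0.555 | 0.333 | 0.337 | 0.249 | 0.442 | 0.418 | - | 13.7 |
      | YOLOv5sm+ | 0.596 | 0.360 | 0.369 | 0.281 | 0.479 | 0.505 | 30.8 | 7.7 |
      | YOLOv5sm+* | 0.606 | 0.367 | 0.378 | 0.295 | 0.478 | 0.439 | - | - |
      Note: + denotes a model with the improved modules added; * denotes multi-scale testing results. The table includes experimental results from the cited references.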
    • Table 8. Detection performance of different algorithms on DIOR dataset

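      | Model | Backbone | mAP50 |
      |---|---|---|
      | Faster R-CNN[33] | VGG16 | 0.541 |
      | PANet[20] | ResNet50 | 0.638 |
      | Retina-Net[24] | ResNet50 | 0.685 |
      | Ref. [32] | ResNet50 | 0.732 |
      | CAT-Net[34] | ResNet50 | 0.763 |
      | YOLOv5sm+ (ours) | - | 0.667 |
      Note: bold font indicates the best value in each column. The table includes comparison results from other literature.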
    Citation

    Xu Chen, Dongliang Peng, Yu Gu. Real-time object detection for UAV images based on improved YOLOv5s[J]. Opto-Electronic Engineering, 2022, 49(3): 210372-1
    Paper Information

    Received: Nov. 22, 2021

    Published Online: Apr. 24, 2022

    The Author Email: Yu Gu (guyu@hdu.edu.cn)

    DOI: 10.12086/oee.2022.210372
