Infrared and Laser Engineering, Vol. 53, Issue 12, 20240286 (2024)

Embedding spatial position information and Multi-view Feature Extraction for infrared small target detection

Zifen HE, Jinsheng XUE, Yinhui ZHANG*, and Guangchen CHEN
Author Affiliations
  • Mechanical and Electrical Engineering, Kunming University of Science and Technology, Kunming 650500, China
    Figures & Tables(15)
    Spatial Location Information and Multi-View Feature Fusion network model
    Spatial Location Information Fusion Attention
    Multi-view Feature Extraction Module
    Migration process of convolution kernel
    Multi-view Feature Fusion
    Large Selection Kernel
    Distribution of small target proportion
    Test set data image
    Size distribution of small targets in the test set
    Detection results of Multi-View Feature Extraction on different datasets
    Detection graphs of different networks
    • Table 1. Ablation experiments

      Models      SLIF  MVFE  LSK  AIFI  GFLOPs  Size/MB  mAP75  mAP50-95  Inference time/ms
      YOLOv5s                            15.8    13.7     82.8%  70.6%     6.9
      Ours1                              16.0    14.3     86.3%  71.1%     6.6
      Ours2                              16.6    14.4     87.0%  72.2%     6.9
      Ours3                              17.5    15.2     88.8%  72.4%     7.1
      ESLIMFENet                         17.4    16.9     90.5%  74.5%     8.5
    • Table 2. Comparison experiments of different models

      Models         GFLOPs  Size/MB  mAP75  mAP50-95  Inference time/ms
      YOLOv3-Tiny    12.9    16.6     76.0%  63.4%     5.1
      YOLOv5s-ghost  8.0     7.4      70.5%  58.4%     7.5
      YOLOv5s        15.8    13.7     82.8%  70.6%     6.9
      YOLOv5s-bifpn  108.5   89.0     87.0%  73.1%     20.3
      YOLOv6         11.8    8.3      78.0%  65.2%     2.6
      YOLOv7         103.2   71.3     88.5%  72.3%     9.8
      YOLOR-CSP      118.9   100.6    86.2%  71.5%     11.8
      YOLOv7-Tiny    13.0    11.7     86.7%  71.5%     4.8
      YOLOv8n        8.1     6.0      86.9%  72.6%     2.1
      YOLOv8s        28.4    21.5     87.4%  72.9%     4.2
      ESPIMFENet     17.4    16.9     90.5%  74.5%     8.5
    • Table 3. Comparison between different attention

      YOLOv5s    GFLOPs  Size/MB  mAP75  mAP50-95  Inference time/ms
      +SPPF      15.8    13.7     82.8%  70.6%     6.9
      +SA        15.8    13.7     72.2%  63.2%     7.0
      +CBAM      16.8    14.0     84.6%  69.8%     6.9
      +C3 ghost  14.8    13.4     83.5%  70.8%     7.2
      +CA        15.8    13.7     83.6%  68.6%     6.9
      +MHSA      16.4    15.2     83.8%  70.2%     6.8
      +SIIF      16.2    15.0     86.3%  71.1%     6.6
      +MVFE      16.8    14.4     86.1%  71.4%     7.2
      +LSK       16.1    13.9     86.3%  72.2%     7.0
      +AIFI      15.7    15.5     87.5%  72.4%     7.9
    • Table 4. Experimental results of Multi-view Feature Extraction on different datasets

      Models                GFLOPs  Size/MB  mAP75  mAP50-95  Inference time/ms
      YOLOv5s (SIRST)       15.8    13.6     77.6%  64.5%     6.9
      YOLOv5s (NUDT-SIRST)  15.8    13.7     78.8%  68.3%     6.9
      Ours1 (SIRST)         16.0    14.3     86.3%  71.1%     7.0
      Ours1 (NUDT-SIRST)    16.0    14.3     87.9%  72.2%     7.0
    Zifen HE, Jinsheng XUE, Yinhui ZHANG, Guangchen CHEN. Embedding spatial position information and Multi-view Feature Extraction for infrared small target detection[J]. Infrared and Laser Engineering, 2024, 53(12): 20240286

    Paper Information

    Category: Image processing

    Received: Sep. 22, 2024

    Accepted: --

    Published Online: Jan. 16, 2025

    DOI:10.3788/IRLA20240286