Laser & Optoelectronics Progress, Vol. 61, Issue 10, 1037009 (2024)

Global-Sampling Spatial-Attention Module and its Application in Image Classification and Small Object Detection and Recognition

Jingyu Lu1,2,3, Haiyang Zhang1,2,3,*, Wenxin Wang1,2,3, and Changming Zhao1,2,3
Author Affiliations
  • 1School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081
  • 2Key Laboratory of Optoelectronic Imaging Technology and Systems, Ministry of Education, Beijing 100081
  • 3Key Laboratory of Information Photonics Technology, Ministry of Industry and Information Technology, Beijing 100081
    Figures & Tables(21)
    BAM architecture[1] and CBAM architecture[11]
    Architecture of non-local neural network attention module[13]
    Architecture of 3×3 deformable convolution module[12]
    Detailed architecture of the GSSAM
    Dimension reduction of input feature map
    Schematic of benchmark selection process
    Process of obtaining attention map by difference comparison
    Loss and accuracy changes on training set. (a) Loss change; (b) accuracy change
    Loss change on validation set. (a) Original loss change; (b) loss fitting; (c) local comparison
    Accuracy change on validation set. (a) Original accuracy change; (b) accuracy fitting; (c) local comparison
    Comparison of accuracy fitting curves of different networks. (a) VGG19 fitting curves; (b) VGG19 local fitting curves; (c) ResNet50 fitting curves; (d) ResNet50 local fitting curves
    Comparison of loss fitting curves of different networks. (a) VGG19 fitting curves; (b) VGG19 local fitting curves; (c) ResNet50 fitting curves; (d) ResNet50 local fitting curves
    Comparison of Grad-CAM's visualization results
    Schematic of the role of GSSAM in small target images
    Schematic of GSSAM embedding YOLOx
    Sample “low, slow, small” UAV images
    Schematic of residual GSSAM structure
    Schematic of module structure without sampling and comparison
    • Table 1. Classification results of different networks on a subset of ImageNet-1K images

      Model              Params         FLOPs /10^9   Accuracy /%
      ResNet34           25,856,586     11.26         80.00
      ResNet34+CBAM      25,908,788     11.26         76.80
      ResNet34+GSSAM     25,864,728     11.27         81.00
      ResNet50           23,528,522     5.19          78.20
      ResNet50+CBAM      24,225,844     5.19          80.80
      ResNet50+GSSAM     23,551,292     5.21          79.80
      ResNet101          42,520,650     10.05         79.60
      ResNet101+CBAM     43,217,972     10.05         80.00
      ResNet101+GSSAM    42,543,456     10.07         79.40
      ResNet152          58,164,298     14.85         77.80
      ResNet152+CBAM     58,861,620     14.85         77.20
      ResNet152+GSSAM    58,187,068     14.87         79.40
      DarkNet53          40,613,034     8.99          80.00
      DarkNet53+CBAM     40,788,116     9.00          81.20
      DarkNet53+GSSAM    40,624,742     9.01          81.60
      VGG19              139,622,218    23.02         82.60
      VGG19+CBAM         139,698,996    23.03         81.80
      VGG19+GSSAM        139,632,920    23.04         84.00
    • Table 2. Statistics of experimental results

      Model                          Position      Time /ms   mAP50:95 /%
      Original YOLOx                 -             20.96      58.67
      YOLOx-GSSAM Sigmoid            Position 1    22.46      60.64
      YOLOx-GSSAM Sigmoid            Position 2    22.47      60.50
      YOLOx-GSSAM standardization    Position 1    24.96      59.50
      YOLOx-GSSAM standardization    Position 2    24.43      58.96
      YOLOx-GSSAM Sigmoid Res        Position 1    22.31      61.44
      YOLOx-GSSAM Sigmoid Res        Position 2    21.99      61.01
    • Table 3. Verification for sampling and comparison effects

      Module       Position      mAP50:95 /%
      GSSAM        Position 1    60.64
      GSSAM        Position 2    60.50
      GSSAM Res    Position 1    61.44
      GSSAM Res    Position 2    61.01
      Conv         Position 1    59.56
      Conv         Position 2    59.49
      Conv Res     Position 1    60.01
      Conv Res     Position 2    60.06
    Citation

    Jingyu Lu, Haiyang Zhang, Wenxin Wang, Changming Zhao. Global-Sampling Spatial-Attention Module and its Application in Image Classification and Small Object Detection and Recognition[J]. Laser & Optoelectronics Progress, 2024, 61(10): 1037009

    Paper Information

    Category: Digital Image Processing

    Received: Aug. 18, 2023

    Accepted: Oct. 9, 2023

    Published Online: Apr. 29, 2024

    Author Email: Haiyang Zhang (ocean@bit.edu.cn)

    DOI:10.3788/LOP231933

    CSTR:32186.14.LOP231933
