Optics and Precision Engineering, Volume 32, Issue 5, 727 (2024)

Position-sensitive Transformer aerial image object detection model

Daxiang LI, Jiani XIN* and Ying LIU
Author Affiliations
  • College of Communication and Information Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710121, China

    To address the challenge of detecting the many small objects in UAV-captured aerial images, this paper proposes the Position-Sensitive Transformer Target Detection (PS-TOD) model. First, it designs a multi-scale feature fusion (MSFF) module built on a Positional Channel Embedded 3D Attention (PCE3DA) mechanism. PCE3DA exploits the interaction between spatial and channel information to generate 3D attention, which strengthens the feature representation of regions of interest. On this basis, a bottom-up, cross-layer MSFF scheme enriches the semantic content of the fused features. Second, it proposes a Position-Sensitive Self-Attention (PSSA) mechanism and uses it to build a position-sensitive Transformer encoder-decoder. This makes the model more sensitive to target position while capturing long-range dependencies in the global image context. Comparative experiments on the VisDrone dataset show that PS-TOD reaches an Average Precision (AP) of 28.8%, 4.1% higher than the baseline model (DETR). It also detects objects accurately in UAV aerial images with complex backgrounds and markedly improves detection accuracy for small targets.
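
    The abstract does not give the internals of PCE3DA, so the following is only a minimal sketch of the general idea it names: combining a per-channel weight with a per-location weight into a single 3D attention volume. The pooling choices, the sigmoid gating, the product used to combine the two weights, and the function name `attention_3d` are all assumptions made for illustration, not the authors' design.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def attention_3d(feat):
    """Illustrative 3D attention over a feature map given as nested lists [C][H][W].

    Hypothetical helper: NOT the paper's PCE3DA, only a generic
    channel-times-spatial weighting under the assumptions stated above.
    Returns (attn, out), both with the same [C][H][W] shape as the input.
    """
    C, H, W = len(feat), len(feat[0]), len(feat[0][0])

    # Channel weight: global average over H x W, squashed into (0, 1).
    chan = [
        sigmoid(sum(sum(row) for row in feat[c]) / (H * W))
        for c in range(C)
    ]

    # Spatial weight: mean over channels at each location, squashed into (0, 1).
    spat = [
        [sigmoid(sum(feat[c][i][j] for c in range(C)) / C) for j in range(W)]
        for i in range(H)
    ]

    # 3D attention: one weight per (channel, row, column) position.
    attn = [
        [[chan[c] * spat[i][j] for j in range(W)] for i in range(H)]
        for c in range(C)
    ]

    # Re-weight the input features element-wise.
    out = [
        [[feat[c][i][j] * attn[c][i][j] for j in range(W)] for i in range(H)]
        for c in range(C)
    ]
    return attn, out


if __name__ == "__main__":
    # Toy 2-channel, 2x2 feature map: channel 1 is more strongly activated.
    feat = [
        [[0.0, 0.0], [0.0, 0.0]],
        [[2.0, 2.0], [2.0, 2.0]],
    ]
    attn, out = attention_3d(feat)
    print(attn[0][0][0] < attn[1][0][0])
```

    In this toy input the more strongly activated channel receives the larger weight at every location, which is the kind of emphasis on regions of interest that the abstract attributes to 3D attention.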

    Daxiang LI, Jiani XIN, Ying LIU. Position-sensitive Transformer aerial image object detection model[J]. Optics and Precision Engineering, 2024, 32(5): 727

    Paper Information

    Received: May 30, 2023

    Published Online: Apr. 2, 2024

    Author Email: XIN Jiani (xjn_2000@163.com)

    DOI:10.37188/OPE.20243205.0727
