Position-sensitive Transformer aerial image object detection model

Daxiang LI; Jiani XIN; Ying LIU

doi:10.37188/OPE.20243205.0727

Optics and Precision Engineering, Vol. 32, Issue 5, 727 (2024)

College of Communication and Information Engineering, Xi'an University of Posts and Telecommunications, Xi'an 710121, China

Daxiang LI, Jiani XIN, Ying LIU. Position-sensitive Transformer aerial image object detection model[J]. Optics and Precision Engineering, 2024, 32(5): 727

Paper Information

Category:

Received: May 30, 2023

Accepted: --

Published Online: Apr. 2, 2024

Corresponding author: Jiani XIN (xjn_2000@163.com)

DOI:10.37188/OPE.20243205.0727
