Infrared Technology, Volume 47, Issue 7, 884 (2025)
Cross-Modal Multilevel Feature Fusion-Based Algorithm for Power-Equipment Detection
LIU Shanfeng, MAO Wandeng, LI Miaomiao, ZHOU Qiankai, ZOU Wenjie, BAO Hua. Cross-Modal Multilevel Feature Fusion-Based Algorithm for Power-Equipment Detection[J]. Infrared Technology, 2025, 47(7): 884