Optoelectronics Letters, Volume 21, Issue 6, 354 (2025)
Marine organism classification method based on hierarchical multi-scale attention mechanism
[1] LECUN Y, BOTTOU L, BENGIO Y, et al. Gradient-based learning applied to document recognition[J]. Proceedings of the IEEE, 1998, 86(11): 2278-2324.
[2] KRIZHEVSKY A, SUTSKEVER I, HINTON G E. ImageNet classification with deep convolutional neural networks[J]. Advances in neural information processing systems, 2012, 25.
[3] RUSSAKOVSKY O, DENG J, SU H, et al. ImageNet large scale visual recognition challenge[J]. International journal of computer vision, 2014: 1-42.
[4] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[J]. Computer science, 2014.
[5] HE K, ZHANG X, REN S, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 27-July 1, 2016, Las Vegas, Nevada, USA. New York: IEEE, 2016: 770-778.
[6] TAN M, LE Q. EfficientNet: rethinking model scaling for convolutional neural networks[C]//International Conference on Machine Learning (ICML), June 10-15, 2019, Long Beach, California, USA. IMLS, 2019: 6105-6114.
[7] TAN M, CHEN B, PANG R, et al. Mnasnet: platform-aware neural architecture search for mobile[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 16-20, 2019, Long Beach, California, USA. New York: IEEE, 2019: 2820-2828.
[8] TAN M, LE Q. Efficientnetv2: smaller models and faster training[C]//International Conference on Machine Learning (ICML), July 18-24, 2021, Vienna, Austria. IMLS, 2021: 10096-10106.
[9] DING X, ZHANG X, HAN J, et al. Scaling up your kernels to 31×31: revisiting large kernel design in cnns[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 19-24, 2022, New Orleans, Louisiana, USA. New York: IEEE, 2022: 11963-11975.
[10] CHEN J, KAO S, HE H, et al. Run, don't walk: chasing higher FLOPS for faster neural networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 18-22, 2023, Vancouver, British Columbia, Canada. New York: IEEE, 2023: 12021-12031.
[11] XIONG Y, VARADARAJAN B, WU L, et al. EfficientSAM: leveraged masked image pretraining for efficient segment anything[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 16-20, 2024, Seattle, Washington, USA. New York: IEEE, 2024.
[12] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 18-22, 2018, Salt Lake City, Utah, USA. New York: IEEE, 2018: 7132-7141.
[13] HOU Q, ZHOU D, FENG J. Coordinate attention for efficient mobile network design[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 20-25, 2021, Nashville, TN, USA. New York: IEEE, 2021: 13713-13722.
[14] ZHANG H, WU C, ZHANG Z, et al. ResNest: split-attention networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 19-24, 2022, New Orleans, Louisiana, USA. New York: IEEE, 2022: 2736-2746.
[15] SI C, YU W, ZHOU P, et al. Inception transformer[J]. Advances in neural information processing systems, 2022, 35: 23495-23509.
[16] OUYANG D, HE S, ZHANG G, et al. Efficient multi-scale attention module with cross-spatial learning[C]//IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), June 4-6, 2023, Rhodes, Greece. New York: IEEE, 2023: 1-5.
[17] ZHU L, WANG X, KE Z, et al. Biformer: vision transformer with bi-level routing attention[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), June 18-22, 2023, Vancouver, British Columbia, Canada. New York: IEEE, 2023: 10323-10333.
[18] SELVARAJU R R, COGSWELL M, DAS A, et al. Grad-cam: visual explanations from deep networks via gradient-based localisation[C]//Proceedings of the IEEE International Conference on Computer Vision (ICCV), October 22-27, 2017, Venice, Italy. New York: IEEE, 2017: 618-626.
[19] LIU Y, SUN G, QIU Y, et al. Transformer in convolutional neural networks[EB/OL]. (2021-06-06)[2023-12-23]. https://arxiv.org/abs/2106.03180v1.
XU Haotian, CHENG Yuanzhi, ZHAO Dong, XIE Peidong. Marine organism classification method based on hierarchical multi-scale attention mechanism[J]. Optoelectronics Letters, 2025, 21(6): 354
Received: Mar. 22, 2024
Accepted: Jun. 27, 2025
Published Online: Jun. 27, 2025