Laser & Optoelectronics Progress, Volume 62, Issue 6, 0637006 (2025)

Multi-Head Mixed Self-Attention Mechanism for Object Detection

Qinghua Su*, Jianhong Mu, Wenhui Liang, Xiyu Wang, and Juntao Li
Author Affiliations
  • School of Information, Beijing Wuzi University, Beijing 101149, China

    Traditional attention mechanisms either limit a model's representational ability and detection performance or incur high complexity and computational cost. To address these problems, an innovative lightweight multi-head mixed self-attention (MMSA) mechanism is proposed, designed to enhance the performance of object detection networks while keeping the model simple and efficient. By introducing a multi-head attention mechanism, the MMSA module ingeniously integrates channel information with spatial information, and local information with global information, further strengthening the network's representational capability. Compared with other attention mechanisms, MMSA achieves a better balance among model representation, performance, and complexity. To validate its effectiveness, MMSA is integrated into the Backbone or Neck of the YOLOv8n network to enhance its multi-scale feature extraction and feature fusion capabilities. Extensive comparative experiments on the CityPersons, CrowdHuman, TT100K, BDD100K, and TinyPerson public datasets show that, compared with the original algorithm, YOLOv8n with MMSA improves the mean average precision (mAP@0.5) by 0.9, 0.9, 2.3, 1.0, and 1.7 percentage points, respectively, without significantly increasing the model size. In addition, the detection speed reaches 145 frames/s, fully meeting the requirements of real-time applications. The experimental results demonstrate the effectiveness of the MMSA mechanism in improving object detection outcomes and show its practical value and broad applicability in real-world scenarios.
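    The abstract page carries no reference implementation; the following is a minimal PyTorch sketch of what a lightweight multi-head attention block mixing channel with spatial cues and local with global context could look like, reconstructed only from the abstract's description. The class name MMSA, the num_heads parameter, and the two branch designs are illustrative assumptions, not the authors' code.

# Hypothetical sketch of a lightweight multi-head mixed self-attention
# (MMSA) block, based only on the abstract: it mixes channel attention
# (global context) with spatial attention (local context) across several
# heads. All names and design details here are assumptions.
import torch
import torch.nn as nn


class MMSA(nn.Module):
    def __init__(self, channels: int, num_heads: int = 4):
        super().__init__()
        assert channels % num_heads == 0, "channels must divide evenly across heads"
        self.num_heads = num_heads
        # Channel branch: global average pooling -> per-channel weights,
        # grouped by head so each head attends to its own channel slice.
        self.channel_fc = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels, kernel_size=1, groups=num_heads),
            nn.Sigmoid(),
        )
        # Spatial branch: depthwise 3x3 conv (local info) -> one attention
        # map per head over the spatial grid.
        self.spatial_conv = nn.Sequential(
            nn.Conv2d(channels, channels, kernel_size=3, padding=1, groups=channels),
            nn.Conv2d(channels, num_heads, kernel_size=1),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, h, w = x.shape
        ca = self.channel_fc(x)                                # (b, c, 1, 1)
        sa = self.spatial_conv(x)                              # (b, heads, h, w)
        # Broadcast each head's spatial map over that head's channel slice.
        sa = sa.repeat_interleave(c // self.num_heads, dim=1)  # (b, c, h, w)
        # Mix both cues; the residual path keeps the block cheap and stable.
        return x * ca * sa + x


if __name__ == "__main__":
    x = torch.randn(1, 64, 80, 80)   # a typical YOLOv8n feature-map size
    print(MMSA(64)(x).shape)         # torch.Size([1, 64, 80, 80])

    In the paper the block is placed in YOLOv8n's Backbone or Neck; wrapping existing feature maps with a residual multiplicative gate, as above, is one plausible way to keep the added parameter count and latency small, consistent with the reported 145 frames/s.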


    Qinghua Su, Jianhong Mu, Wenhui Liang, Xiyu Wang, Juntao Li. Multi-Head Mixed Self-Attention Mechanism for Object Detection[J]. Laser & Optoelectronics Progress, 2025, 62(6): 0637006

    Paper Information

    Category: Digital Image Processing

    Received: Jun. 19, 2024

    Accepted: Aug. 1, 2024

    Published Online: Mar. 6, 2025

    DOI: 10.3788/LOP241509

    CSTR: 32186.14.LOP241509
