Laser & Optoelectronics Progress, Vol. 62, Issue 6, 0637006 (2025)
Multi-Head Mixed Self-Attention Mechanism for Object Detection
Traditional attention mechanisms either limit a model's representational ability and detection performance or incur high complexity and computational cost. To address these problems, an innovative lightweight multi-head mixed self-attention (MMSA) mechanism is proposed, aiming to enhance the performance of object detection networks while keeping the model simple and efficient. By introducing a multi-head attention mechanism, the MMSA module ingeniously integrates channel information with spatial information, as well as local with global information, further augmenting the network's representational capability. Compared with other attention mechanisms, MMSA achieves a superior balance among model representation, performance, and complexity. To validate the effectiveness of MMSA, it is integrated into the Backbone or Neck of the YOLOv8n network to enhance its multi-scale feature extraction and feature fusion capabilities. Extensive comparative experiments on the CityPersons, CrowdHuman, TT100K, BDD100K, and TinyPerson public datasets show that, compared with the original algorithm, YOLOv8n with MMSA improves the mean average precision (mAP@0.5) by 0.9, 0.9, 2.3, 1.0, and 1.7 percentage points, respectively, without significantly increasing the model size. In addition, the detection speed reaches 145 frame/s, fully meeting the requirements of real-time applications. The experimental results demonstrate the effectiveness of the MMSA mechanism in improving object detection and showcase its practical value and broad applicability in real-world scenarios.
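The abstract describes MMSA only at a high level: a multi-head design that mixes channel with spatial information. The exact architecture is not given here, so the following is a minimal NumPy sketch of that general idea, not the paper's implementation; the function name `mmsa_sketch`, the head count, and the choice of pooling/sigmoid gating in each branch are all illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def mmsa_sketch(x, num_heads=4):
    """Illustrative sketch of a multi-head mixed attention block.

    x: feature map of shape (C, H, W). The channels are split across
    `num_heads` heads; within each head, a channel-attention branch
    (global average pooling, one weight per channel) and a spatial-
    attention branch (per-pixel channel mean, one weight per location)
    are combined multiplicatively, mixing channel and spatial cues.
    This is an assumed simplification, not the paper's MMSA design.
    """
    C, H, W = x.shape
    assert C % num_heads == 0, "channels must split evenly across heads"
    outs = []
    for head in np.split(x, num_heads, axis=0):
        # Channel branch: per-channel gate from global spatial context.
        chan_w = sigmoid(head.mean(axis=(1, 2), keepdims=True))  # (c, 1, 1)
        # Spatial branch: per-pixel gate from local channel statistics.
        spat_w = sigmoid(head.mean(axis=0, keepdims=True))       # (1, H, W)
        outs.append(head * chan_w * spat_w)
    # Re-assemble the heads; output shape matches the input, so such a
    # block could drop into a backbone or neck without reshaping.
    return np.concatenate(outs, axis=0)
```

Because the output shape equals the input shape, a block of this form is a drop-in module, which is consistent with the abstract's claim that MMSA can be inserted into the Backbone or Neck of YOLOv8n.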
Qinghua Su, Jianhong Mu, Wenhui Liang, Xiyu Wang, Juntao Li. Multi-Head Mixed Self-Attention Mechanism for Object Detection[J]. Laser & Optoelectronics Progress, 2025, 62(6): 0637006
Category: Digital Image Processing
Received: Jun. 19, 2024
Accepted: Aug. 1, 2024
Published Online: Mar. 6, 2025
CSTR:32186.14.LOP241509