Journal of Optoelectronics · Laser, Vol. 35, Issue 6, 570 (2024)
Scene text detection based on dual attention and multi-scale feature fusion
To address the challenges of text detection in complex natural scenes, this paper presents a scene text detection method based on dual attention and multi-scale feature fusion. A dual-attention fusion mechanism strengthens the correlation between text feature channels and improves overall detection performance. Because upsampling and downsampling of deep feature maps can lose semantic information, a multi-scale feature fusion pyramid built on dilated (atrous) convolution is introduced; it adopts a dual fusion mechanism to enhance semantic features and reduce the impact of scale variation. Fusing information of different densities can cause semantic conflict and limit the representation of multi-scale features, so a multi-scale feature fusion module (MFFM) is proposed to address both issues. In addition, a feature refinement module (FRM) is introduced to handle small text that is easily masked by conflicting information. Experiments demonstrate the effectiveness of the method for text detection in complex scenes, with F-measures of 85.6%, 87.1%, and 86.3% on the CTW1500, ICDAR2015, and Total-Text datasets, respectively.
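As a rough illustration of why dilated convolution helps with scale variation, the sketch below applies the same 3-tap kernel at several dilation rates to a 1D signal and fuses the branches. It is not the authors' MFFM or pyramid: the function names, the choice of dilation rates, and the averaging fusion are all assumptions made for this example, and a real detector would operate on 2D feature maps with learned weights.

```python
def dilated_conv1d(signal, kernel, dilation):
    """Zero-padded, same-length 1D dilated convolution (odd kernel assumed)."""
    center = (len(kernel) - 1) // 2
    out = []
    for i in range(len(signal)):
        acc = 0.0
        for j, weight in enumerate(kernel):
            # Taps are spaced `dilation` samples apart around position i.
            idx = i + (j - center) * dilation
            if 0 <= idx < len(signal):
                acc += weight * signal[idx]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilation):
    """Span of input samples covered by one dilated kernel."""
    return (kernel_size - 1) * dilation + 1


def fuse_multi_scale(signal, kernel, dilations):
    """Hypothetical fusion: average the branches computed at each dilation."""
    branches = [dilated_conv1d(signal, kernel, d) for d in dilations]
    return [sum(values) / len(branches) for values in zip(*branches)]


if __name__ == "__main__":
    impulse = [0, 0, 0, 1, 0, 0, 0]
    kernel = [1, 1, 1]
    for d in (1, 2, 4):  # dilation rates chosen only for illustration
        print(d, receptive_field(len(kernel), d), dilated_conv1d(impulse, kernel, d))
    print(fuse_multi_scale(impulse, kernel, (1, 2, 4)))
```

The kernel size stays fixed while the receptive field grows with the dilation rate, so each branch responds to context at a different scale without any further downsampling.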
QIANG Guanchen, YANG Qian, ZHANG Lizhen, XIONG Wei, LI Lirong. Scene text detection based on dual attention and multi-scale feature fusion[J]. Journal of Optoelectronics · Laser, 2024, 35(6): 570
Received: Sep. 4, 2023
Accepted: Dec. 13, 2024
Published Online: Dec. 13, 2024
The Author Email: XIONG Wei (xw@mail.hbut.edu.cn)