Journal of Optoelectronics · Laser, Vol. 35, Issue 6, 570 (2024)

Scene text detection based on dual attention and multi-scale feature fusion

QIANG Guanchen1, YANG Qian1, ZHANG Lizhen1, XIONG Wei1,2,3,4,*, and LI Lirong1,2
Author Affiliations
  • 1School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, Hubei 430068, China
  • 2Hubei Key Laboratory of Solar Energy Efficient Utilization and Energy Storage Operation Control, Hubei University of Technology, Wuhan, Hubei 430068, China
  • 3Hubei Engineering Research Center for Safety Monitoring of New Energy and Power Grid Equipment, Hubei University of Technology, Wuhan, Hubei 430068, China
  • 4Department of Computer Science and Engineering, University of South Carolina, Columbia, South Carolina 29201, USA

    To address the challenges of text detection in complex natural scenes, this paper presents a scene text detection method based on dual attention and multi-scale feature fusion. A dual-attention fusion mechanism strengthens the correlation between text feature channels and improves overall detection performance. Because upsampling and downsampling of deep feature maps can lose semantic information, a multi-scale feature fusion pyramid built on dilated (atrous) convolution is introduced; it adopts a dual fusion mechanism to enhance semantic features and reduce the impact of scale variation. To address the semantic conflicts and limited multi-scale representation that arise when information of different densities is fused, a multi-scale feature fusion module (MFFM) is introduced. In addition, a feature refinement module (FRM) is introduced to handle small text that is easily masked by conflicting information. Experiments demonstrate the effectiveness of the method for text detection in complex scenes, with F-measures of 85.6%, 87.1%, and 86.3% on the CTW1500, ICDAR2015, and Total-Text datasets, respectively.
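
    The F-value (F-measure) quoted in the abstract is conventionally the harmonic mean of detection precision and recall. The sketch below assumes that standard definition; the abstract does not state which variant or matching protocol is used, and the function name and the input values are hypothetical, chosen only for illustration rather than taken from the paper.

```python
def f_measure(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (the standard F1 score)."""
    if precision + recall == 0:
        # Avoid division by zero when nothing is detected or matched.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Hypothetical precision and recall, for illustration only (not the paper's results).
example_score = f_measure(0.90, 0.80)
print(f"F-measure: {example_score:.3f}")
```

    Because the harmonic mean is dominated by the smaller of its two inputs, a detector cannot reach a high F-measure by trading away recall for precision or the reverse.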

    QIANG Guanchen, YANG Qian, ZHANG Lizhen, XIONG Wei, LI Lirong. Scene text detection based on dual attention and multi-scale feature fusion[J]. Journal of Optoelectronics · Laser, 2024, 35(6): 570

    Paper Information

    Received: Sep. 4, 2023

    Accepted: Dec. 13, 2024

    Published Online: Dec. 13, 2024

    The Author Email: XIONG Wei (xw@mail.hbut.edu.cn)

    DOI:10.16136/j.joel.2024.06.0468