Journal of Optoelectronics · Laser, Volume 33, Issue 5, 479 (2022)

Natural scene text recognition algorithm based on multilevel feature selection

LI Lirong1,2,*, ZHANG Kai1, ZHANG Yunliang1, YUE Ling1, ZHOU Lei1, and GONG Pengcheng1,2
Author Affiliations
  • 1[in Chinese]
  • 2[in Chinese]

    To address the problem that existing scene text recognition methods focus only on classifying local sequence characters and ignore the global information of the whole word, a multilevel feature selection scene text recognition (MFSSTR) algorithm is proposed. The algorithm uses a stacked block architecture and applies a multilevel feature selection module to capture contextual and semantic features from the visual features. For character prediction, a novel multilevel attention selection decoder (MASD) is proposed, which combines visual, contextual, and semantic features into a new feature space and re-weights that space through a self-attention mechanism. While attending to the internal relations of the feature sequence, the decoder selects the more valuable features to participate in decoding and prediction. In addition, intermediate supervision is introduced during training to gradually refine the text prediction. Experimental results show that the proposed algorithm achieves high recognition accuracy on multiple public scene text datasets. In particular, its accuracy reaches 87.1% on the irregular text dataset SVTP, about 2% higher than current popular algorithms.
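    The fuse-then-re-weight idea described for the decoder can be illustrated with a minimal sketch. This is not the authors' MASD. It assumes the three feature sets are simply concatenated along the sequence axis and uses scaled dot-product self-attention with identity projections (no learned weights). The names `fuse_and_reweight` and `self_attention` are hypothetical and chosen only for illustration.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def self_attention(seq):
    """Scaled dot-product self-attention over a list of equal-length vectors.

    Assumption: queries, keys and values are the inputs themselves
    (identity projections); a trained model would use learned projections.
    """
    dim = len(seq[0])
    scale = math.sqrt(dim)
    output = []
    for query in seq:
        # Similarity of this position to every position in the sequence.
        scores = [sum(q * k for q, k in zip(query, key)) / scale for key in seq]
        weights = softmax(scores)
        # Re-weighted feature: attention-weighted average of all positions.
        output.append(
            [sum(w * vec[i] for w, vec in zip(weights, seq)) for i in range(dim)]
        )
    return output


def fuse_and_reweight(visual, context, semantic):
    """Join three feature sequences into one space, then re-weight it.

    Assumption: fusion is plain concatenation along the sequence axis.
    """
    fused = visual + context + semantic
    return self_attention(fused)


if __name__ == "__main__":
    visual = [[1.0, 0.0], [0.0, 1.0]]   # toy visual features
    context = [[0.5, 0.5]]              # toy contextual feature
    semantic = [[1.0, 1.0]]             # toy semantic feature
    for row in fuse_and_reweight(visual, context, semantic):
        print([round(v, 3) for v in row])
```

    Each output vector is a weighted average of all fused features, so positions that agree with many others contribute more to the representation passed on to decoding.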

    LI Lirong, ZHANG Kai, ZHANG Yunliang, YUE Ling, ZHOU Lei, GONG Pengcheng. Natural scene text recognition algorithm based on multilevel feature selection[J]. Journal of Optoelectronics · Laser, 2022, 33(5): 479

    Paper Information

    Received: Nov. 12, 2021

    Accepted: --

    Published Online: Oct. 9, 2024

    The Author Email: LI Lirong (Rongli@hbut.edu.cn)

    DOI:10.16136/j.joel.2022.05.0761
