Journal of Optoelectronics · Laser, Volume 34, Issue 11, 1158 (2023)

Natural scene text recognition based on character attention

XIONG Wei1,2,3,*, SUN Peng1, ZHAO Di1, and LIU Yue1
Author Affiliations
  • 1[in Chinese]
  • 2[in Chinese]
  • 3[in Chinese]

    XIONG Wei, SUN Peng, ZHAO Di, LIU Yue. Natural scene text recognition based on character attention[J]. Journal of Optoelectronics · Laser, 2023, 34(11): 1158

    Paper Information

    Received: Sep. 6, 2022

    Accepted: --

    Published Online: Sep. 25, 2024

    The Author Email: XIONG Wei (xw@mail.hbut.edu.cn)

    DOI:10.16136/j.joel.2023.11.0625
