Natural scene text recognition based on character attention

[1] SHI B G,BAI X,YAO C.An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2017,39(11):2298-2304.

[2] BORISYUK F,GORDO A,SIVAKUMAR V.Rosetta: large scale system for text detection and recognition in images[C]//the 24th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining,August 19-23,2018,London,United Kingdom.New York:ACM,2018:71-79.

[3] GAO Y,CHEN Y,WANG J,et al.Reading scene text with attention convolutional sequence modeling[J].Neurocomputing,2017,339(C):161-170.

[5] SHI B,WANG X,LYU P,et al.Robust scene text recognition with automatic rectification[C]//IEEE Conference on Computer Vision and Pattern Recognition,June 27-30,2016,Las Vegas,NV,USA.New York:IEEE,2016:4168-4176.

[6] SHI B G,YANG M K,WANG X G,et al.ASTER:an attentional scene text recognizer with flexible rectification[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2019,41(9):2035-2048.

[8] LIAO M H,LYU P Y,HE M H,et al.Mask textspotter: an end-to-end trainable neural network for spotting text with arbitrary shapes[J].IEEE Transactions on Pattern Analysis and Machine Intelligence,2021,43(2):532-548.

[9] LYU P,YANG Z,LENG X,et al.2D attentional irregular scene text recognizer[EB/OL].(2019-06-13)[2022-08-26].https://arxiv.org/abs/1906.05708.

[10] WANG T,ZHU Y,JIN L,et al.Decoupled attention network for text recognition[C]//Proceedings of the AAAI Conference on Artificial Intelligence,February 7-12,2020,New York,USA.Palo Alto,California,USA:AAAI Press,2020,34(7):12216-12224.

[12] FANG S,XIE H,WANG Y,et al.Read like humans: autonomous,bidirectional and iterative language modeling for scene text recognition[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition,June 19-25,2021,Virtual.New York:IEEE,2021:7098-7107.

[13] LIU Z,LIN Y,CAO Y,et al.Swin transformer:hierarchical vision transformer using shifted windows[C]//IEEE/CVF International Conference on Computer Vision,October 10-17,2021,Montreal,QC,Canada.New York:IEEE,2021:10012-10022.

[14] WANG Q L,WU B G,ZHU P F,et al.ECA-Net:efficient channel attention for deep convolutional neural networks[EB/OL].(2019-10-08)[2022-08-26].https://doi.org/10.48550/arXiv.1910.03151.

[15] VASWANI A,SHAZEER N,PARMAR N,et al.Attention is all you need[C]//Advances in Neural Information Processing Systems,December 4-9,2017,Long Beach,California,USA.Red Hook,NY,United States:Curran Associates Inc,2017:6000-6010.

[16] ZHU X,HU H,LIN S,et al.Deformable convnets v2:more deformable, better results[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition,June 15-20,2019,Long Beach,CA,USA.New York:IEEE,2019:9308-9316.

[17] RONNEBERGER O,FISCHER P,BROX T.U-net:convolutional networks for biomedical image segmentation[C]//International Conference on Medical Image Computing and Computer-Assisted Intervention,October 5-9,2015,Munich,Germany.Cham:Springer,2015:234-241.

[18] AREVALO J,SOLORIO T,MONTES-Y-GÓMEZ M,et al.Gated multimodal units for information fusion[EB/OL].(2017-02-07)[2022-08-26].https://arxiv.org/abs/1702.01992.

[19] QIAO Z,ZHOU Y,YANG D,et al.SEED:semantics enhanced encoder-decoder framework for scene text recognition[C]//IEEE/CVF Conference on Computer Vision and Pattern Recognition,June 13-19,2020,Seattle,WA,USA.New York:IEEE,2020:13528-13537.

[20] WAN Z,HE M,CHEN H,et al.Textscanner:reading characters in order for robust scene text recognition[C]//Proceedings of the AAAI Conference on Artificial Intelligence,February 7-12,2020,New York,USA.Palo Alto,California,USA:AAAI Press,2020,34(7):12120-12127.

[21] YUE X,KUANG Z,LIN C,et al.RobustScanner:dynamically enhancing positional clues for robust text recognition[C]//European Conference on Computer Vision,August 23-28,2020,online.Cham:Springer,2020:135-151.

XIONG Wei, SUN Peng, ZHAO Di, LIU Yue. Natural scene text recognition based on character attention[J]. Journal of Optoelectronics · Laser, 2023, 34(11): 1158

Paper Information

Received: Sep. 6, 2022

Accepted: --

Published Online: Sep. 25, 2024

The Author Email: XIONG Wei (xw@mail.hbut.edu.cn)

DOI:10.16136/j.joel.2023.11.0625
