Super-resolution reconstruction of text image with multimodal semantic interaction

Yulan HAN; Yihong LUO; Yujie CUI; Chaofeng LAN

doi:10.37188/OPE.20253301.0135

Optics and Precision Engineering, Vol. 33, Issue 1, 135 (2025)

College of Measurement and Control Technology and Communication Engineering, Harbin University of Science and Technology, Harbin 150080, China

Paper Information

Received: Jul. 31, 2024

Published Online: Apr. 1, 2025

Author email: Yulan HAN (hanyulan@hrbust.edu.cn)
