Computer Engineering, Vol. 51, Issue 8, 364 (2025)
Optical Chemical Structure Recognition Based on Multi-order Gated Aggregation Network
In Optical Chemical Structure Recognition (OCSR), current deep-learning-based models predominantly use Convolutional Neural Networks (CNNs) or Vision Transformers for visual feature extraction and Transformers for sequence decoding. Although these models are effective, they remain limited by their image feature extraction capability and by the accuracy of positional encoding during decoding, both of which affect recognition efficiency. To address these limitations, this study proposes an OCSR model built on an encoder-decoder architecture that pairs a Multi-order Gated Aggregation Network (MogaNet) with a Transformer, and introduces relative positional encoding to the OCSR field. First, during image feature extraction, the MogaNet spatial aggregation module captures multiscale features and reduces feature redundancy, while the MogaNet channel aggregation module improves diversity along the channel dimension. Second, during sequence decoding, a Transformer with relative positional encoding serves as the decoder so that the relative positional relationships between tokens are captured accurately. To train and validate the model, a chemical structure dataset containing 400,000 molecular structures is constructed, covering both Markush and non-Markush structures. Experimental results show that the model achieves an accuracy of 92.36%, outperforming the other models compared.
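The idea of relative positional encoding in a decoder can be illustrated with a minimal sketch. The abstract does not state which variant the authors use, so the code below assumes one common form: a learned bias per clipped relative offset, added to the attention scores before the softmax. The function names, the `max_distance` parameter, and the `bias_table` layout are hypothetical and chosen for illustration only; this is not the paper's implementation.

```python
import math


def relative_position_index(length, max_distance):
    """Build a length x length matrix of relative offsets (j - i),
    clipped to [-max_distance, max_distance] and shifted to start at 0
    so each entry can index a bias table of size 2 * max_distance + 1."""
    return [
        [max(-max_distance, min(max_distance, j - i)) + max_distance
         for j in range(length)]
        for i in range(length)
    ]


def softmax(row):
    """Numerically stable softmax over one row of scores."""
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]


def attention_weights_with_relative_bias(scores, bias_table, max_distance):
    """Add a per-offset bias to raw attention scores, apply a causal mask
    (a decoder position may not attend to later positions), then softmax.

    Assumption: bias_table holds 2 * max_distance + 1 values that would be
    learned during training; here they are supplied by the caller."""
    n = len(scores)
    index = relative_position_index(n, max_distance)
    weights = []
    for i in range(n):
        row = [
            scores[i][j] + bias_table[index[i][j]] if j <= i else float("-inf")
            for j in range(n)
        ]
        weights.append(softmax(row))
    return weights
```

Because the bias depends only on the offset between two positions, the same parameter is shared by every token pair at that distance, which is what distinguishes this scheme from absolute positional encoding.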
LIN Fan, LI Jianhua. Optical Chemical Structure Recognition Based on Multi-order Gated Aggregation Network[J]. Computer Engineering, 2025, 51(8): 364
Received: Jan. 22, 2024
Accepted: Aug. 26, 2025
Published Online: Aug. 26, 2025
The Author Email: LI Jianhua (jhli@ecust.edu.cn)