Laser & Optoelectronics Progress, Volume. 58, Issue 20, 2010020(2021)

Character Segmentation for Historical Uchen Tibetan Document Based on Structure Attributes

Ce Zhang1,2 and Weilan Wang1、*
Author Affiliations
  • 1Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou, Gansu 730030, China
  • 2School of Mathematics and Information Engineering, Chongqing University of Education, Chongqing 400065, China
  • show less

    Character segmentation is an important part in image analysis and recognition of historical Tibetan document. Aiming at the problems of text line slanting, stroke overlapping, crossing, touching between characters, stroke breaking and noise interference of historical Uchen Tibetan document, a character segmentation method for historical Uchen Tibetan document based on structure attributes is proposed in this paper. First, a character block dataset of historical Uchen Tibetan document is established. Then, the local baseline of character block is detected by using syllable point position information or combining horizontal projection and linear detection, and the character block is divided horizontally into two parts above and below the baseline. The improved template matching algorithm is used to detect touching strokes and touching type above the baseline. The multi-direction and multi-path touching character segmentation algorithm is used to realize crossing and touching strokes segmentation. Finally, according to Tibetan structure attribute, to complete the attribution of each stroke. Experimental results show that the proposed method can effectively solve the challenge problem in character segmentation. The recall rate, precision rate and F-Measure of character segmentation reached 96.52%, 98.24% and 97.37%, respectively.

    Tools

    Get Citation

    Copy Citation Text

    Ce Zhang, Weilan Wang. Character Segmentation for Historical Uchen Tibetan Document Based on Structure Attributes[J]. Laser & Optoelectronics Progress, 2021, 58(20): 2010020

    Download Citation

    EndNote(RIS)BibTexPlain Text
    Save article for my favorites
    Paper Information

    Category: Image Processing

    Received: Jan. 8, 2021

    Accepted: Mar. 3, 2021

    Published Online: Nov. 3, 2021

    The Author Email: Wang Weilan (wangweilan@xbmu.edu.cn)

    DOI:10.3788/LOP202158.2010020

    Topics