Laser & Optoelectronics Progress, Vol. 58, Issue 2, 0210008 (2021)

Text Line Segmentation of Tibetan Historical Documents Based on Text Core Regions Combined with Expansion Growth

Jincheng Li1, Xiaojuan Wang2, Weilan Wang1,*, Qiang Lin2, and Pengfei Hu2
Author Affiliations
  • 1Key Laboratory of China's Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou, Gansu 730030, China
  • 2College of Mathematics and Computer Science, Northwest Minzu University, Lanzhou, Gansu 730030, China
    Figures & Tables(14)
    Partial illustration of lines in ancient Tibetan texts. (a) Baseline and syllable points; (b) slanted text line; (c) distorted text line; (d) overlapping and adhesion of text lines
    Flow chart of method
    Tibetan historical document image
    Syllable point detection of Tibetan historical document. (a) Binary map after layout processing (black background area is local binary map); (b) syllable point image
    Projection images. (a) Horizontal projection of syllable point image; (b) vertical projection of binary original image
    Formation of pseudo-text connected regions. (a) Image of text core area; (b) image of pseudo-text connected region
    Schematic of pixel point expansion growth process. (a) Pixel point without expansion; (b) pixel point during expansion; (c) pixel point after expansion
    Line acquisition of pseudo text. (a) Expansion growth result; (b) pseudo-text lines
    Local diagram in attribution process of broken strokes. (a) Text lines to be split; (b) pseudo-text lines; (c) complete split lines
    Final segmentation result of local image
    Text lines and their segmentation results. (a)(c)(e) Original images; (b)(d)(f) segmentation results
    Segmentation effects of overlapped, adhered and broken strokes. (a)(b) Overlapped strokes; (c)(d) adhered strokes; (e)(f) broken strokes
    Wrong segmentation and corresponding correct forms. (a)(c)(e) Wrong segmentation; (b)(d)(f) corresponding correct forms
    Table 1. Text line segmentation results under different parameters

      T1    T2    C     Tspace   R      P /%
      5     80    0.6   18       1514   89.26
      5     80    0.6   20       1521   89.68
      5     80    0.6   23       1511   89.09
      5     80    0.7   18       1524   89.85
      5     80    0.7   20       1535   90.50
      5     80    0.7   23       1519   89.56
      10    100   0.6   18       1509   88.97
      10    100   0.6   20       1516   89.38
      10    100   0.6   23       1503   88.62
      10    100   0.7   18       1526   89.97
      10    100   0.7   20       1538   90.68
      10    100   0.7   23       1523   89.79

    Citation

    Jincheng Li, Xiaojuan Wang, Weilan Wang, Qiang Lin, Pengfei Hu. Text Line Segmentation of Tibetan Historical Documents Based on Text Core Regions Combined with Expansion Growth[J]. Laser & Optoelectronics Progress, 2021, 58(2): 0210008

    Paper Information

    Category: Image Processing

    Received: Jun. 8, 2020

    Accepted: Jul. 7, 2020

    Published Online: Jan. 8, 2021

    Author Email: Weilan Wang (wangweilan@xbmu.edu.cn)

    DOI: 10.3788/LOP202158.0210008
