Laser & Optoelectronics Progress, Vol. 58, Issue 14, 1410006 (2021)

Layout Segmentation and Description of Tibetan Document Images Based on Adaptive Run Length Smoothing Algorithm

Yuanyuan Chen1, Weilan Wang2,*, Huaming Liu3, Zhengqi Cai1, and Penghai Zhao2
Author Affiliations
  • 1College of Mathematics and Computer Science, Northwest Minzu University, Lanzhou, Gansu 730030, China
  • 2Key Laboratory of China’s Ethnic Languages and Information Technology of Ministry of Education, Northwest Minzu University, Lanzhou, Gansu 730030, China
  • 3College of Computer and Information Engineering, Fuyang Normal University, Fuyang, Anhui 236041, China
    Figures & Tables(12)
    Tibetan document image
    Flow chart of layout analysis method
    RLSA samples. (a) Pixels before smoothing; (b) pixels after smoothing
    ARLSA processing of text lines in Tibetan document images. (a) Binary images; (b) ARLSA processing results
    Filtering results of connected components. (a) ARLSA processing result; (b) bounding rectangles of connected components; (c) filtering result
    Vowel attribution separated from baseline. (a) ARLSA processing result of text line; (b) centroids of connected components; (c) vertical distance between centroids; (d) filtering result
    Cluster analysis graphs of random segmentation block sample data. (a) Random sample data distribution; (b) clustering result with K=3
    Structural diagram of layout data
    Word segmentation and recognition. (a) Separation of vowels and base words, and word adhesion; (b) segmentation result; (c) recognition result
    Layout analysis results. (a) Original image; (b) target connected region; (c) classification result of layout elements; (d) layout description
    Misclassified images. (a)(c) Original images; (b)(d) misclassification results
    • Table 1. Data representation of cluster centers in connected components

      Cluster center   Center 1   Center 2   Center 3   Center 4
      Width            w1         w2         w3         w4
      Height           h1         h2         h3         h4

    Yuanyuan Chen, Weilan Wang, Huaming Liu, Zhengqi Cai, Penghai Zhao. Layout Segmentation and Description of Tibetan Document Images Based on Adaptive Run Length Smoothing Algorithm[J]. Laser & Optoelectronics Progress, 2021, 58(14): 1410006

    Paper Information

    Category: Image Processing

    Received: Sep. 21, 2020

    Accepted: Nov. 12, 2020

    Published Online: Jul. 14, 2021

    Author Email: Weilan Wang (wangweilan@xbmu.edu.cn)

    DOI: 10.3788/LOP202158.1410006

    Topics