Laser & Optoelectronics Progress, Volume 57, Issue 8, 081006 (2020)

Optical Music Recognition Method Combining Multi-Scale Residual Convolutional Neural Network and Bi-Directional Simple Recurrent Units

Qiong Wu, Qiang Li, and Xin Guan*
Author Affiliations
  • School of Microelectronics, Tianjin University, Tianjin 300072, China
    Figures & Tables(16)
    Schematic diagram of MF-RC-BiSRU
    Schematic diagram of residual structure
    Schematic diagram of multi-scale feature fusion
    Structure of SRU
    Schematic diagram of BiSRU
    Difficulties of note recognition in music scores
    Three data processing methods to simulate unsatisfactory music images. (a) Original incipit; (b) incipit with white Gaussian noise added; (c) incipit with Perlin noise added; (d) incipit with elastic transformations applied
    Comparison of training loss and accuracy for C-BiLSTM and RC-BiLSTM networks. (a) Comparison of training loss; (b) comparison of symbol error rate
    Comparison of features in different convolution layers. (a) Original incipit; (b) shallow feature map C1; (c) deeper feature map C3; (d) deepest feature map C5; (e) multi-scale feature fusion map F4
    Comparison of the symbol error rates in the different networks
    Comparison of MF-RC-BiSRU and MF-RC-BiLSTM. (a) Comparison of training loss; (b) comparison of symbol error rates
    Test results of the same incipit in four different networks. (a) Original incipit; (b) C-BiLSTM; (c) RC-BiLSTM; (d) MF-RC-BiLSTM; (e) MF-RC-BiSRU
    Comparison of loss in different methods
    • Table 1. Structure parameters of the improved network

      Input (128×weight×1)

      Part                                 Layer            Parameters
      Feature extraction                   Residual_Conv_1  (3,3,32)
                                           Max_Pool         (2,2,32)
                                           Residual_Conv_2  (3,3,64)
                                           Max_Pool         (2,2,64)
                                           Residual_Conv_3  (3,3,128)
                                           Max_Pool         (2,2,128)
                                           Residual_Conv_4  (3,3,256)
                                           Max_Pool         (2,2,256)
                                           Residual_Conv_5  (3,3,256)
                                           Max_Pool         (2,2,256)
      Note recognition and classification  BiSRU            512
                                           BiSRU            512
                                           CTC              1780
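
The layer list in Table 1 can be traced as a sequence of tensor shapes. The following is a minimal sketch, not the authors' implementation. It assumes that the input is 128 × W × 1 with a variable image width W (the table labels this dimension "weight"), that each 3×3 convolution preserves height and width, and that each (2,2) max pooling halves both dimensions. None of these details are confirmed by the table, and the function name `trace_shapes` is hypothetical.

```python
# Sketch only: traces tensor shapes through the Table 1 feature extractor.
# Assumed (not stated in the source): size-preserving 3x3 convolutions and
# 2x2 max pooling with stride 2 on both height and width.

# (layer name, output channels) pairs copied from Table 1.
STAGES = [
    ("Residual_Conv_1", 32),
    ("Residual_Conv_2", 64),
    ("Residual_Conv_3", 128),
    ("Residual_Conv_4", 256),
    ("Residual_Conv_5", 256),
]


def trace_shapes(height, width, channels=1):
    """Return a list of (layer, (height, width, channels)) after each layer."""
    shapes = [("Input", (height, width, channels))]
    for name, out_channels in STAGES:
        channels = out_channels  # convolution changes only the channel count
        shapes.append((name, (height, width, channels)))
        height, width = height // 2, width // 2  # assumed 2x2 pooling, stride 2
        shapes.append(("Max_Pool", (height, width, channels)))
    return shapes


if __name__ == "__main__":
    for layer, shape in trace_shapes(128, 1024):
        print(f"{layer:16s} {shape}")
```

Under these assumptions, five pooling stages reduce the height from 128 to 4 and divide the width by 32. The multi-scale feature fusion shown in the figures is not modeled here, so the sequence actually passed to the BiSRU layers may differ.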
    • Table 2. Comparison of accuracy in different networks

      Network       Symbol error rate /%  Sequence error rate /%
      C-BiLSTM      3.2480                14.3498
      RC-BiLSTM     1.8440                8.1071
      MF-RC-BiLSTM  0.3312                1.4637
      MF-RC-BiSRU   0.3234                1.4571
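
This page does not define the two metrics in Table 2. A common convention in optical music recognition, assumed here, is that the symbol error rate is the total edit distance between predicted and reference symbol sequences divided by the total number of reference symbols, and the sequence error rate is the share of sequences containing at least one error. The sketch below follows that convention; the function names and the symbol labels in the usage example are hypothetical.

```python
def edit_distance(ref, hyp):
    """Levenshtein distance between two sequences (insert, delete, substitute)."""
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        curr = [i]
        for j, h in enumerate(hyp, start=1):
            cost = 0 if r == h else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def error_rates(references, hypotheses):
    """Return (symbol error rate, sequence error rate), both in percent."""
    pairs = list(zip(references, hypotheses))
    total_edits = sum(edit_distance(r, h) for r, h in pairs)
    total_symbols = sum(len(r) for r in references)
    wrong_sequences = sum(1 for r, h in pairs if list(r) != list(h))
    symbol_rate = 100.0 * total_edits / total_symbols
    sequence_rate = 100.0 * wrong_sequences / len(references)
    return symbol_rate, sequence_rate


if __name__ == "__main__":
    # Hypothetical symbol labels, for illustration only.
    refs = [["clef-G2", "note-C4_quarter", "barline"], ["clef-G2", "rest-half"]]
    hyps = [["clef-G2", "note-D4_quarter", "barline"], ["clef-G2", "rest-half"]]
    print(error_rates(refs, hyps))
```

In the usage example, one substituted symbol out of five reference symbols gives a symbol error rate of 20%, and one wrong sequence out of two gives a sequence error rate of 50%.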
    • Table 3. Performance comparison of different methods

      Method       Symbol error rate /%  Sequence error rate /%  Time /s
      CNN-STN[15]  5.0208                16.8056                 0.98
      DWD[17]      8.7811                18.5609                 1.21
      MF-RC-BiSRU  0.3234                1.4571                  0.56
    Get Citation

    Qiong Wu, Qiang Li, Xin Guan. Optical Music Recognition Method Combining Multi-Scale Residual Convolutional Neural Network and Bi-Directional Simple Recurrent Units[J]. Laser & Optoelectronics Progress, 2020, 57(8): 081006

    Paper Information

    Category: Image Processing

    Received: Jun. 26, 2019

    Accepted: Sep. 10, 2019

    Published Online: Apr. 3, 2020

    The Author Email: Guan Xin (guanxin@tju.cn)

    DOI: 10.3788/LOP57.081006
