Laser & Optoelectronics Progress, Vol. 57, Issue 18, 181702 (2020)

Speaker-Dependent Speech Recognition Algorithm for Laparoscopic Supporter Control

Kailong Ren, Yi Wang*, Xiaodong Chen, and Huaiyu Cai
Author Affiliations
  • School of Precision Instruments and Optoelectronics Engineering, Tianjin University, Tianjin 300072, China

    A long short-term memory (LSTM) recurrent neural network based on i-vector features is presented for speech control of a laparoscopic supporter. It recognizes short isolated-word commands spoken by a specific doctor and requires only a small number of training samples. The model uses an LSTM recurrent neural network as its base and takes Mel-frequency cepstral coefficients (MFCCs) as the input feature parameters. The i-vector feature serves as deep input information to the network and is spliced with the deep feature information after the LSTM layer to achieve parameter fusion. This fusion enables accurate recognition of the specific surgeon's voice commands and rejection of voice commands from non-surgeons. The approach offers a secure and intelligent speech recognition scheme for laparoscopic surgery. A self-built speech database is used as the training library to verify the recognition performance of the proposed algorithm and its rejection performance for speech not included in the training library. Experiments comparing the proposed model with dynamic time warping (DTW) and the Gaussian mixture model-hidden Markov model (GMM-HMM) show that it achieves a 99.6% correct recognition rate for voice commands of the specific speakers recorded in the training library while maintaining a false acceptance rate of 0%, and an average false acceptance rate of 2.5% for voices not included in the training library. The proposed model meets the accuracy and safety requirements of laparoscopic supporter control.
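The fusion step described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the layer sizes, the random (untrained) weights, the helper names `lstm_last_hidden` and `classify`, and the softmax-threshold rejection rule are all hypothetical, since the abstract does not specify how rejection is decided.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_last_hidden(frames, W, U, b):
    """Run a single-layer LSTM over MFCC frames of shape (T, D).

    W: (4H, D), U: (4H, H), b: (4H,). Returns the last hidden state (H,).
    """
    H = U.shape[1]
    h = np.zeros(H)
    c = np.zeros(H)
    for x in frames:
        z = W @ x + U @ h + b
        i, f, o, g = np.split(z, 4)      # input, forget, output gates, candidate
        i, f, o = sigmoid(i), sigmoid(f), sigmoid(o)
        g = np.tanh(g)
        c = f * c + i * g                # update cell state
        h = o * np.tanh(c)               # update hidden state
    return h

def classify(frames, ivector, params, threshold=0.5):
    """Splice the LSTM output with the i-vector, then classify or reject.

    Returns (label, probs); label is -1 when the utterance is rejected.
    The threshold rule is an assumption made for illustration only.
    """
    h = lstm_last_hidden(frames, params["W"], params["U"], params["b"])
    fused = np.concatenate([h, ivector])           # parameter fusion by splicing
    logits = params["V"] @ fused + params["d"]
    logits = logits - logits.max()                 # numerically stable softmax
    probs = np.exp(logits) / np.exp(logits).sum()
    best = int(np.argmax(probs))
    return (best if probs[best] >= threshold else -1), probs

# Illustrative sizes only: MFCC dim, hidden units, i-vector dim, command count.
D, H, I, K = 13, 16, 8, 4
rng = np.random.default_rng(0)
params = {
    "W": rng.normal(scale=0.1, size=(4 * H, D)),
    "U": rng.normal(scale=0.1, size=(4 * H, H)),
    "b": np.zeros(4 * H),
    "V": rng.normal(scale=0.1, size=(K, H + I)),
    "d": np.zeros(K),
}
frames = rng.normal(size=(20, D))   # stand-in for 20 MFCC frames of one command
ivector = rng.normal(size=I)        # stand-in for the speaker's i-vector
label, probs = classify(frames, ivector, params)
```

Because the i-vector carries speaker identity rather than word content, splicing it after the LSTM layer lets the final classifier condition on who is speaking without disturbing the frame-level sequence modeling.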

    Kailong Ren, Yi Wang, Xiaodong Chen, Huaiyu Cai. Speaker-Dependent Speech Recognition Algorithm for Laparoscopic Supporter Control[J]. Laser & Optoelectronics Progress, 2020, 57(18): 181702

    Paper Information

    Category: Medical Optics and Biotechnology

    Received: Feb. 5, 2020

    Accepted: Mar. 19, 2020

    Published Online: Sep. 2, 2020

    Author Email: Yi Wang (koala_wy@tju.edu.cn)

    DOI: 10.3788/LOP57.181702
