Laser & Optoelectronics Progress, Vol. 59, Issue 13, 1307001 (2022)

Language Identification Using Joint Voice Activity Detection and Dynamic Range Control

Yankai Wang, Hua Long*, Yubin Shao, Qingzhi Du, and Yao Wang
Author Affiliations
  • Faculty of Information Engineering and Automation, Kunming University of Science and Technology, Kunming 650500, Yunnan, China

    In language identification systems, interference from silent segments and inconsistent decibel ranges across recordings degrade identification accuracy. In addition, spectrogram-based language identification algorithms cannot effectively represent the low-frequency part of the signal, which further degrades performance. To mitigate these problems, we propose a language identification method based on joint voice activity detection and dynamic range control. First, we extract the first-dimension coefficient of the Mel-scale frequency cepstral coefficients. Second, we apply median filtering to smooth this feature and perform voice activity detection to remove the silent segments of the speech. Next, we use dynamic range control to adjust the decibel range of different recordings. Finally, we feed the log-scale spectrogram into a convolutional neural network for classification. Experimental results on the VoxForge public corpus with the ResNeSt network show that the proposed algorithm improves performance by 7.16 percentage points compared with the traditional spectrogram-based language identification algorithm. Moreover, under the same experimental settings, the log-scale spectrogram outperforms other mainstream features, which validates the effectiveness of the proposed algorithm and features.
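The abstract gives no parameter values, so the preprocessing steps it lists (a smoothed frame-level feature, voice activity detection, then dynamic range control) can only be sketched under assumptions. In the minimal Python sketch below, per-frame log energy stands in for the first MFCC coefficient, the detection threshold is a hypothetical midpoint rule, and the dynamic range control is a static, memoryless compressor without attack or release smoothing. All function names, frame sizes, and default values are illustrative and are not taken from the paper. The log-scale spectrogram and the convolutional network are omitted.

```python
import math
from statistics import median


def frame_log_energy(signal, frame_len=400, hop=160):
    """Per-frame log energy in dB (ASSUMPTION: proxy for the first MFCC coefficient)."""
    energies = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        power = sum(x * x for x in frame) / frame_len
        energies.append(10.0 * math.log10(power + 1e-10))  # offset avoids log(0)
    return energies


def median_smooth(values, width=5):
    """Sliding median filter; windows are truncated at the edges."""
    half = width // 2
    return [median(values[max(0, i - half):i + half + 1])
            for i in range(len(values))]


def detect_voice(smoothed):
    """Flag frames above the midpoint of the contour's range (hypothetical rule)."""
    if not smoothed:
        return []
    floor, peak = min(smoothed), max(smoothed)
    threshold = floor + 0.5 * (peak - floor)
    return [value > threshold for value in smoothed]


def remove_silence(signal, voiced, hop=160):
    """Keep one hop of samples for every frame flagged as speech."""
    kept = []
    for i, is_speech in enumerate(voiced):
        if is_speech:
            kept.extend(signal[i * hop:(i + 1) * hop])
    return kept


def compress(signal, threshold_db=-20.0, ratio=4.0, makeup_db=0.0):
    """Static compressor: the excess above the threshold is divided by `ratio`."""
    out = []
    for x in signal:
        level_db = 20.0 * math.log10(abs(x) + 1e-10)
        over_db = max(0.0, level_db - threshold_db)
        gain_db = makeup_db - over_db * (1.0 - 1.0 / ratio)
        out.append(x * 10.0 ** (gain_db / 20.0))
    return out


def preprocess(signal, frame_len=400, hop=160):
    """Silence removal followed by range compression, in the abstract's order."""
    contour = median_smooth(frame_log_energy(signal, frame_len, hop))
    speech = remove_silence(signal, detect_voice(contour), hop)
    return compress(speech)
```

Calling `preprocess(samples)` on a list of floating-point samples in the range [-1, 1] returns the speech-only, range-compressed samples. With the default frame sizes, a 16 kHz sampling rate corresponds to 25 ms frames and a 10 ms hop, which is again an assumption rather than a setting reported by the authors.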

    Yankai Wang, Hua Long, Yubin Shao, Qingzhi Du, Yao Wang. Language Identification Using Joint Voice Activity Detection and Dynamic Range Control[J]. Laser & Optoelectronics Progress, 2022, 59(13): 1307001

    Paper Information

    Category: Fourier Optics and Signal Processing

    Received: Jul. 12, 2021

    Accepted: Aug. 13, 2021

    Published Online: Jun. 9, 2022

    Corresponding Author Email: Hua Long (2748373869@qq.com)

    DOI:10.3788/LOP202259.1307001
