Laser & Optoelectronics Progress, Volume 59, Issue 13, 1307001 (2022)
Language Identification Using Joint Voice Activity Detection and Dynamic Range Control
Fig. 1. MFCC0 feature voice activity detection. (a) Voice waveform; (b) MFCC0 features; (c) MFCC0 feature voice activity detection result after median filtering
Fig. 2. DRC input/output processing unit
Fig. 3. Voice changes before and after DRC processing. (a) Voice waveform changes before and after DRC processing; (b) spectrogram before DRC processing; (c) spectrogram after DRC processing
Fig. 4. Comparison of different frequency scales. (a) Linear scale spectrogram; (b) log scale spectrogram
Fig. 5. Flow chart of language recognition
Fig. 6. Multi-classification task evaluation parameters
Fig. 7. Results of different frequency coordinate scales
Fig. 8. ResNet classification results
Fig. 9. ResNeSt classification results
Fig. 10. Language recognition result confusion matrix
Yankai Wang, Hua Long, Yubin Shao, Qingzhi Du, Yao Wang. Language Identification Using Joint Voice Activity Detection and Dynamic Range Control[J]. Laser & Optoelectronics Progress, 2022, 59(13): 1307001
Category: Fourier Optics and Signal Processing
Received: Jul. 12, 2021
Accepted: Aug. 13, 2021
Published Online: Jun. 9, 2022
Author Email: Hua Long (2748373869@qq.com)