Research on Speech Segmentation and Clustering Based on Mixed Features

LIU Jing-tian; JIANG Nan

Electro-Optic Technology Application, Vol. 34, Issue 5, 37 (2019)


Abstract

This paper studies the problem of extracting a target speaker's speech from multi-speaker speech. To improve the accuracy of multi-speaker speech segmentation and clustering, a segmentation and clustering algorithm is proposed that uses hybrid features combining Mel frequency cepstral coefficients (MFCC) and Gammatone frequency cepstral coefficients (GFCC). The hybrid features can effectively avoid problems such as the poor robustness of segmentation and clustering on noisy speech. For speech with superimposed pink noise and factory noise, the conventional algorithm and the improved segmentation and clustering algorithm are compared. The results show that the proposed mixed-feature algorithm extracts the target speaker's speech more accurately.
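The abstract does not state how the MFCC and GFCC features are combined, so the following is only a minimal sketch of one common fusion approach: per-coefficient z-score normalization followed by frame-wise concatenation. The function name `fuse_features` and the assumption that both feature matrices are already computed over the same frames are illustrative choices, not details taken from the paper.

```python
import numpy as np


def fuse_features(mfcc, gfcc):
    """Concatenate per-frame MFCC and GFCC vectors after z-score normalization.

    Hypothetical helper for illustration only.
    mfcc: array of shape (n_frames, n_mfcc)
    gfcc: array of shape (n_frames, n_gfcc)
    Returns an array of shape (n_frames, n_mfcc + n_gfcc).
    """
    mfcc = np.asarray(mfcc, dtype=float)
    gfcc = np.asarray(gfcc, dtype=float)
    if mfcc.shape[0] != gfcc.shape[0]:
        raise ValueError("MFCC and GFCC must cover the same frames")

    def zscore(x):
        # Normalize each coefficient so neither feature set dominates by scale.
        std = x.std(axis=0)
        std[std == 0] = 1.0  # leave constant coefficients at zero after centering
        return (x - x.mean(axis=0)) / std

    # Frame-wise concatenation: one hybrid vector per frame.
    return np.hstack([zscore(mfcc), zscore(gfcc)])
```

The fused matrix could then be passed to any frame-level segmentation or clustering step. Normalizing before concatenation is a design choice made here so that the two coefficient sets contribute on comparable scales.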

Keywords

Gammatone frequency cepstral coefficient (GFCC); Mel frequency cepstral coefficient (MFCC); robustness; speech segmentation and clustering

Citation

LIU Jing-tian, JIANG Nan. Research on Speech Segmentation and Clustering Based on Mixed Features[J]. Electro-Optic Technology Application, 2019, 34(5): 37
