Computer Applications and Software, Vol. 42, Issue 4, 223 (2025)

THE SEQ2SEQ MODEL FOR ASSISTING THE SPEAKER VERIFICATION ON SHORT UTTERANCES

Yang Shuang, Ma Baichao, Yang Yu, and Chen Dan
Author Affiliations
  • State Grid Shandong Electric Power Company Heze Power Supply Company, Heze 274002, Shandong, China

    Text-independent speaker verification systems become less effective as the test utterance gets shorter. To address this, a method for enhancing acoustic features is proposed to assist the system. The method uses a seq2seq-based generation model to generate longer acoustic features from short-duration acoustic features. The generation model consists of an encoder that extracts deep features and a decoder that outputs acoustic features. It uses an attention mechanism to capture the relationship between sequences, and a cosine distance loss is added during training to improve the generalization performance of the generation model. A trained text-independent speaker verification model is used as a component of the training architecture to assist the training of the generation model. Experimental results show that for speech durations of 1-3 seconds, the equal error rate of the system is reduced by 7.78% on average after applying this model.
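The abstract names two quantities without defining them: the cosine distance used as an extra training loss, and the equal error rate (EER) used for evaluation. The sketch below illustrates the standard textbook definitions of both in plain Python. It is not the authors' implementation: the function names, the threshold sweep, and the toy scores are assumptions made for illustration, and the abstract does not say how the cosine term is weighted or which vectors it compares.

```python
import math


def cosine_distance(a, b):
    """Return 1 - cosine similarity of two equal-length, non-zero vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)


def equal_error_rate(target_scores, nontarget_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    A trial is accepted when its score is >= the threshold. The EER is taken
    at the threshold where the false-acceptance and false-rejection rates
    are closest, reported as the mean of the two rates at that point.
    """
    best_gap, eer = float("inf"), None
    for t in sorted(set(target_scores) | set(nontarget_scores)):
        # False rejection: a genuine (target) trial scored below the threshold.
        frr = sum(s < t for s in target_scores) / len(target_scores)
        # False acceptance: an impostor (non-target) trial scored at or above it.
        far = sum(s >= t for s in nontarget_scores) / len(nontarget_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Hypothetical similarity scores, invented for this example only.
targets = [0.9, 0.8, 0.7, 0.4]
nontargets = [0.6, 0.5, 0.3, 0.2]
eer = equal_error_rate(targets, nontargets)
```

In a setup like the one described, a cosine term of this kind would typically compare an embedding of the generated features with an embedding of the full-length utterance, but that pairing is an assumption here, not something the abstract states.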

    Yang Shuang, Ma Baichao, Yang Yu, Chen Dan. THE SEQ2SEQ MODEL FOR ASSISTING THE SPEAKER VERIFICATION ON SHORT UTTERANCES[J]. Computer Applications and Software, 2025, 42(4): 223

    Paper Information

    Received: Jan. 17, 2022

    Accepted: Aug. 25, 2025

    Published Online: Aug. 25, 2025

    DOI:10.3969/j.issn.1000-386x.2025.04.032
