Experiment Science and Technology, Vol. 23, Issue 4, p. 1 (2025)
Experimental Design of Speech Emotion Recognition with the Multi-Task Teacher-Student Model
[1] LANGARI S, MARVI H, ZAHEDI M. Efficient speech emotion recognition using modified feature extraction[J]. Informatics in Medicine Unlocked, 20: 100424 (2020).
[2] MAI J L, XING X F, CHEN W D, et al. DropFormer: A dynamic noise-dropping transformer for speech emotion recognition[C], 2645-2649 (2024).
[3] WANG J C, ZHAO Y, LU C, et al. Boosting cross-corpus speech emotion recognition using CycleGAN with contrastive learning[C], 1605-1609 (2024).
[4] CHATTERJEE R, MAZUMDAR S, SHERRATT R S, et al. Real-time speech emotion analysis for smart home assistants[J]. IEEE Transactions on Consumer Electronics, 67: 68-76 (2021).
[5] LAGHARI M, TAHIR M J, AZEEM A, et al. Robust speech emotion recognition for Sindhi language based on deep convolutional neural network[C], 543-548 (2021).
[6] KODURU A, VALIVETI H B, BUDATI A K. Feature extraction algorithms to improve the speech emotion recognition rate[J]. International Journal of Speech Technology, 23: 45-55 (2020).
[7] ZHOU H S, DU J, TU Y H, et al. Using speech enhancement preprocessing for speech emotion recognition in realistic noisy conditions[C], 4098-4102 (2020).
[8] CHAKRABORTY R, PANDA A, PANDHARIPANDE M, et al. Front-end feature compensation and denoising for noise robust speech emotion recognition[C], 3257-3261 (2019).
[9] SUN L H, FU S, WANG F. Decision tree SVM model with Fisher feature selection for speech emotion recognition[J]. EURASIP Journal on Audio, Speech, and Music Processing, 2019: 2 (2019).
[10] BANDELA S R, KUMAR T K. Unsupervised feature selection and NMF de-noising for robust speech emotion recognition[J]. Applied Acoustics, 172: 107645 (2021).
[11] TRIANTAFYLLOPOULOS A, KEREN G, WAGNER J, et al. Towards robust speech emotion recognition using deep residual networks for speech enhancement[C], 1691-1695 (2019).
[12] SUN L H, LEI Y L, WANG S, et al. Joint enhancement and classification constraints for noisy speech emotion recognition[J]. Digital Signal Processing, 151: 104581 (2024).
[13] BUSSO C, BULUT M, LEE C C, et al. IEMOCAP: Interactive emotional dyadic motion capture database[J]. Language Resources and Evaluation, 42: 335-359 (2008).
[14] LOHRENZ T, LI Z Y, FINGSCHEIDT T. Multi-encoder learning and stream fusion for transformer-based end-to-end automatic speech recognition[C], 2846-2850 (2021).
[15] LI J Y, ZHAO R, HUANG J T, et al. Learning small-size DNN with output-distribution-based criteria[C], 1910-1914 (2014).
[16] NEUMANN M, VU N T. Improving speech emotion recognition with unsupervised representation learning on unlabeled speech[C], 7390-7394 (2019).
[17] XU M K, ZHANG F, CUI X D, et al. Speech emotion recognition with multiscale area attention and data augmentation[C], 6319-6323 (2021).
[19] XU M K, ZHANG F, KHAN S U. Improve accuracy of speech emotion recognition with attention head fusion[C], 1058-1064 (2020).
[20] ZHU W J, LI X. Speech emotion recognition with global-aware fusion on multi-scale feature representation[C], 6437-6441 (2022).
[21] YE J X, WEN X C, WANG X Z, et al. GM-TCNet: Gated multi-scale temporal convolutional network using emotion causality for speech emotion recognition[J]. Speech Communication, 145: 21-35 (2022).
[22] YE J X, WEN X C, WEI Y J, et al. Temporal modeling matters: A novel temporal emotional modeling approach for speech emotion recognition[C], 1-5 (2023).
Citation:
Linhui SUN, Ping'an LI, Yunlong LEI, Zixiao ZHANG. Experimental Design of Speech Emotion Recognition with the Multi-Task Teacher-Student Model[J]. Experiment Science and Technology, 2025, 23(4): 1.
Received: Jul. 25, 2024
Accepted: Oct. 30, 2024
Published Online: Jul. 30, 2025
Corresponding Author: Linhui SUN (sunlh@njupt.edu.cn)