Speaker Identification Based on Multimodal Long Short-Term Memory with Depth-Gate

Huangkang Chen; Ying Chen

doi:10.3788/LOP56.031007

Laser & Optoelectronics Progress, Vol. 56, Issue 3, 031007 (2019)

Key Laboratory of Advanced Process Control for Light Industry of the Education Ministry of China, Jiangnan University, Wuxi, Jiangsu 214122, China

Abstract

To fuse audio and visual features effectively for speaker recognition, a multimodal long short-term memory (LSTM) network with a depth-gate is proposed. First, a multi-layer LSTM model is built for each individual feature type. A depth-gate then connects the memory cells of the lower and upper layers, which strengthens the link between layers and improves the classification performance of each feature on its own. At the same time, the connections among the models are learned by sharing the hidden-layer outputs and the weights of each gate unit across the different models. Experimental results show that the method fuses audio and visual features effectively and improves the accuracy of speaker recognition. The method is also robust to external disturbance.
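The abstract gives no equations, so the following is only a minimal sketch of what a depth-gate between two stacked LSTM layers could look like, not the authors' formulation. It assumes a simplified gate d = sigmoid(W_d [x, h_prev] + b_d) that adds d * c_lower (the lower layer's memory cell) to the upper layer's cell update, and it assumes both layers have the same hidden size. All names (`init_layer`, `depth_gated_step`, the gate keys) are hypothetical, and the cross-modal sharing of hidden outputs and gate weights is not modelled here.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def affine(W, z, b):
    # Matrix-vector product plus bias, with W stored as a list of rows.
    return [sum(w * x for w, x in zip(row, z)) + bi for row, bi in zip(W, b)]


def init_layer(n_in, n_hid, rng):
    # One (weights, bias) pair per gate: input i, forget f, output o,
    # candidate g, and depth gate d. This layout is an assumption.
    def pair():
        W = [[rng.uniform(-0.1, 0.1) for _ in range(n_in + n_hid)]
             for _ in range(n_hid)]
        return W, [0.0] * n_hid
    return {name: pair() for name in "ifogd"}


def gate(p, name, z, act):
    W, b = p[name]
    return [act(v) for v in affine(W, z, b)]


def depth_gated_step(p, x, h_prev, c_prev, c_lower=None):
    """One time step of an LSTM layer; c_lower is the lower layer's cell."""
    z = list(x) + list(h_prev)
    i = gate(p, "i", z, sigmoid)
    f = gate(p, "f", z, sigmoid)
    o = gate(p, "o", z, sigmoid)
    g = gate(p, "g", z, math.tanh)
    # Standard LSTM cell update.
    c = [fv * cp + iv * gv for fv, cp, iv, gv in zip(f, c_prev, i, g)]
    if c_lower is not None:
        # Assumed depth gate: lets the lower layer's memory cell flow
        # directly into this layer's memory cell.
        d = gate(p, "d", z, sigmoid)
        c = [cv + dv * cl for cv, dv, cl in zip(c, d, c_lower)]
    h = [ov * math.tanh(cv) for ov, cv in zip(o, c)]
    return h, c


rng = random.Random(0)
n_in, n_hid = 3, 4
layer1 = init_layer(n_in, n_hid, rng)
layer2 = init_layer(n_hid, n_hid, rng)
x = [0.5, -0.2, 0.1]          # toy feature vector for one modality
zeros = [0.0] * n_hid

h1, c1 = depth_gated_step(layer1, x, zeros, zeros)
h2, c2 = depth_gated_step(layer2, h1, zeros, zeros, c_lower=c1)
```

In this sketch the first layer runs as an ordinary LSTM step, and the second layer receives both the first layer's hidden output `h1` as input and its cell state `c1` through the depth gate.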

Keywords

depth-gate; fusion; image processing; long short-term memory network; speaker recognition; weight sharing

Citation

Huangkang Chen, Ying Chen. Speaker Identification Based on Multimodal Long Short-Term Memory with Depth-Gate[J]. Laser & Optoelectronics Progress, 2019, 56(3): 031007

Paper Information

Category: Image Processing

Received: Jun. 13, 2018

Accepted: Aug. 31, 2018

Published Online: Jul. 31, 2019

Author Emails: Huangkang Chen (6161918009@vip.jiangnan.edu.cn), Ying Chen (chenying@jiangnan.edu.cn)
