Laser & Optoelectronics Progress, Volume 56, Issue 3, 031007 (2019)

Speaker Identification Based on Multimodal Long Short-Term Memory with Depth-Gate

Huangkang Chen* and Ying Chen**
Author Affiliations
  • Key Laboratory of Advanced Process Control for Light Industry of the Education Ministry of China, Jiangnan University, Wuxi, Jiangsu 214122, China
    Figures & Tables (8)
    Basic structure of LSTM
    Multimodal DGLSTM architecture for speaker recognition
    Final output strategy for network
    Test process of 2.5 s window
    • Table 1. Recognition accuracy of multimodal DGLSTM with different structures (%)

      Method                                 | Recognition accuracy | Ill-paired-rejection accuracy
      2-layer multimodal DGLSTM (shared)     | 90.30                | 95.85
      2-layer multimodal DGLSTM (not shared) | 92.25                | 96.55
      3-layer multimodal DGLSTM (not shared) | 88.25                | 95.00
    • Table 2. Recognition accuracy with different algorithms (%)

      Method          | Recognition accuracy
      Ref. [6]        | 83.26
      Ref. [5]        | 86.12
      Ref. [8]        | 90.15
      Only depth-gate | 67.90
      Proposed        | 92.25
    • Table 3. Recognition accuracy on simulation video sequence for different algorithms (%)

      Method   | 0.5 s time window | 2.5 s time window
      Ref. [8] | 89.36             | 95.86
      Proposed | 91.71             | 96.04
    • Table 4. Training time and test time for different algorithms

      Method   | Training time /s | Training epochs | Test time /s
      Ref. [8] | 35.17            | 50              | 0.0064
      Proposed | 42.19            | 84              | 0.0076
    Citation

    Huangkang Chen, Ying Chen. Speaker Identification Based on Multimodal Long Short-Term Memory with Depth-Gate[J]. Laser & Optoelectronics Progress, 2019, 56(3): 031007

    Paper Information

    Category: Image Processing

    Received: Jun. 13, 2018

    Accepted: Aug. 31, 2018

    Published Online: Jul. 31, 2019

    Author Emails: Huangkang Chen (6161918009@vip.jiangnan.edu.cn), Ying Chen (chenying@jiangnan.edu.cn)

    DOI:10.3788/LOP56.031007
