Speaker Identification Based on Multimodal Long Short-Term Memory with Depth-Gate

Table 1. Recognition accuracy (%) of multimodal DGLSTM with different structures
Method | Recognition accuracy | Ill-paired-rejection accuracy
2-layer multimodal DGLSTM (shared) | 90.30 | 95.85
2-layer multimodal DGLSTM (not shared) | 92.25 | 96.55
3-layer multimodal DGLSTM (not shared) | 88.25 | 95.00

Table 2. Recognition accuracy (%) with different algorithms
Method | Recognition accuracy
Ref. [6] | 83.26
Ref. [5] | 86.12
Ref. [8] | 90.15
Only depth-gate | 67.90
Proposed | 92.25

Table 3. Recognition accuracy (%) on simulation video sequence for different algorithms
Method | Time window 0.5 s | Time window 2.5 s
Ref. [8] | 89.36 | 95.86
Proposed | 91.71 | 96.04

Table 4. Training time and test time for different algorithms
Method | Training time (s) | Training epochs | Test time (s)
Ref. [8] | 35.17 | 50 | 0.0064
Proposed | 42.19 | 84 | 0.0076

Huangkang Chen, Ying Chen. Speaker Identification Based on Multimodal Long Short-Term Memory with Depth-Gate[J]. Laser & Optoelectronics Progress, 2019, 56(3): 031007