Human Action Recognition Combining Sequential Dynamic Images and Two-Stream Convolutional Network

Table 1. Recognition accuracy of UCF101 dataset with different input modes (unit: %)
Method	Split1	Split2	Split3	Accuracy
SI	84.6	84.9	85.0	84.8
SOF	87.3	89.9	91.0	89.4
FSDI	83.9	83.8	83.1	83.6
BSDI	84.1	83.3	84.3	83.9
SDI	85.7	86.2	85.5	85.8
ESDI	87.2	86.8	87.6	87.2
SI+SOF	93.2	94.0	94.2	93.8
ESDI+SOF	94.8	94.6	95.3	94.9

Table 2. Recognition accuracy of HMDB51 dataset with different input modes (unit: %)
Method	Split1	Split2	Split3	Accuracy
SI	54.8	50.4	49.6	51.6
SOF	64.2	63.6	62.7	63.5
FSDI	50.7	51.4	53.6	51.9
BSDI	51.6	51.5	54.1	52.4
SDI	54.5	52.9	53.7	53.7
ESDI	53.6	55.5	55.6	54.9
SI+SOF	68.7	67.5	68.4	68.2
ESDI+SOF	69.6	71.2	71.6	70.8
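
The page does not say how the Accuracy column in Tables 1 and 2 is derived. A minimal check, assuming it is the unweighted mean of the three splits rounded to one decimal, is sketched below. The names `UCF101`, `HMDB51`, `mean_of_splits`, and `mismatches` are illustrative and do not come from the paper; the numbers are copied from the two tables.

```python
# Per-split accuracies (%) followed by the reported Accuracy value,
# copied from Table 1 (UCF101) and Table 2 (HMDB51).
UCF101 = {
    "SI": (84.6, 84.9, 85.0, 84.8),
    "SOF": (87.3, 89.9, 91.0, 89.4),
    "FSDI": (83.9, 83.8, 83.1, 83.6),
    "BSDI": (84.1, 83.3, 84.3, 83.9),
    "SDI": (85.7, 86.2, 85.5, 85.8),
    "ESDI": (87.2, 86.8, 87.6, 87.2),
    "SI+SOF": (93.2, 94.0, 94.2, 93.8),
    "ESDI+SOF": (94.8, 94.6, 95.3, 94.9),
}
HMDB51 = {
    "SI": (54.8, 50.4, 49.6, 51.6),
    "SOF": (64.2, 63.6, 62.7, 63.5),
    "FSDI": (50.7, 51.4, 53.6, 51.9),
    "BSDI": (51.6, 51.5, 54.1, 52.4),
    "SDI": (54.5, 52.9, 53.7, 53.7),
    "ESDI": (53.6, 55.5, 55.6, 54.9),
    "SI+SOF": (68.7, 67.5, 68.4, 68.2),
    "ESDI+SOF": (69.6, 71.2, 71.6, 70.8),
}


def mean_of_splits(row):
    """Unweighted mean of the three split accuracies, rounded to one decimal.

    Assumption: this is how the Accuracy column was computed.
    """
    split1, split2, split3, _reported = row
    return round((split1 + split2 + split3) / 3, 1)


def mismatches(table):
    """Return the methods whose reported Accuracy differs from the split mean."""
    return [
        name
        for name, row in table.items()
        if abs(mean_of_splits(row) - row[3]) > 1e-9
    ]


for label, table in (("UCF101", UCF101), ("HMDB51", HMDB51)):
    print(label, "mismatches:", mismatches(table))
```

With the values above, both lists come out empty, so every reported Accuracy is consistent with a plain three-split average.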

Table 3. Recognition accuracy of different fusion methods on the UCF101 and HMDB51 datasets (unit: %)
Consensus function	UCF101	HMDB51
Max	93.0	69.1
Average	94.9	70.8
Weighted average	93.8	69.7
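
The page lists the three consensus functions of Table 3 without defining them. The sketch below shows one common reading, assuming each function combines per-class score vectors element-wise and the predicted class is the arg max of the fused scores. Whether the paper fuses across streams or across video segments, and which weights it uses, is not stated here; the function names, toy scores, and weights are all hypothetical.

```python
from typing import List, Sequence

ScoreVector = Sequence[float]  # one score per class


def fuse_max(scores: Sequence[ScoreVector]) -> List[float]:
    """Element-wise maximum over the score vectors."""
    return [max(column) for column in zip(*scores)]


def fuse_average(scores: Sequence[ScoreVector]) -> List[float]:
    """Element-wise unweighted mean over the score vectors."""
    return [sum(column) / len(column) for column in zip(*scores)]


def fuse_weighted(scores: Sequence[ScoreVector],
                  weights: Sequence[float]) -> List[float]:
    """Element-wise weighted mean; weights are normalised by their sum."""
    total = sum(weights)
    return [
        sum(w * s for w, s in zip(weights, column)) / total
        for column in zip(*scores)
    ]


def predict(fused: ScoreVector) -> int:
    """Index of the highest fused score."""
    return max(range(len(fused)), key=lambda i: fused[i])


# Toy three-class scores for two inputs (hypothetical values).
appearance = [0.9, 0.1, 0.0]
motion = [0.3, 0.6, 0.1]

print(predict(fuse_max([appearance, motion])))
print(predict(fuse_average([appearance, motion])))
print(predict(fuse_weighted([appearance, motion], weights=[1.0, 3.0])))
```

On these toy scores the max and average rules pick class 0, while the weighted rule, which favours the second input 3:1, picks class 1. This illustrates how the choice of consensus function can change the final label.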

Table 4. Recognition accuracy of different network models on the UCF101 and HMDB51 datasets (unit: %)
Network structure	UCF101	HMDB51
ResNet-101	93.6	68.4
BN-Inception	94.2	68.2
Inception-v3	94.9	70.8

Table 5. Recognition accuracy of different human behavior recognition models (unit: %)
Network	UCF101	HMDB51
Spatial stream	84.8	51.4
Temporal stream	89.4	63.5
Original two-stream	88.0	59.4
Ref. [19]	94.0	69.4
Appearance and long-sequential stream	87.2	54.9
Short sequential stream	89.9	64
TS-CNN	94.9	70.8

Table 6. Recognition accuracy of different algorithms (unit: %)

Feature extraction	Method	UCF101	HMDB51
Traditional	Ref. [7]	84.8	57.2
Traditional	Ref. [8]	87.9	61.1
Deep learning	Ref. [17]	88.0	59.4
Deep learning	Ref. [21]	88.6	--
Deep learning	Ref. [22]	91.5	65.9
Deep learning	Ref. [23]	93.1	63.3
Deep learning	Ref. [24]	93.4	66.4
Deep learning	Ref. [19]	94.0	69.4
Deep learning	Proposed	94.9	70.8

Wenqiang Zhang, Zengqiang Wang, Liang Zhang. Human Action Recognition Combining Sequential Dynamic Images and Two-Stream Convolutional Network[J]. Laser & Optoelectronics Progress, 2021, 58(2): 0210007

Paper Information

Category: Image Processing

Received: Jun. 5, 2020

Accepted: Jul. 7, 2020

Published Online: Jan. 5, 2021

Author Email: Liang Zhang (l-zhang@cauc.edu.cn)

DOI:10.3788/LOP202158.0210007

