Optoelectronics Letters, Volume 21, Issue 8, 499 (2025)

Robust human motion prediction via integration of spatial and temporal cues

Shaobo ZHANG, Sheng LIU, Fei GAO, and Yuan FENG


ZHANG Shaobo, LIU Sheng, GAO Fei, FENG Yuan. Robust human motion prediction via integration of spatial and temporal cues[J]. Optoelectronics Letters, 2025, 21(8): 499

Paper Information

Received: May 14, 2024

Accepted: Jul. 24, 2025

Published Online: Jul. 24, 2025


DOI:10.1007/s11801-025-4119-4
