Journal of Terahertz Science and Electronic Information Technology, Vol. 22, Issue 7, 781 (2024)

Improved infrared human pose estimation algorithm based on MobileViT

ZHANG Wenyang1, XU Zhaofei2, LIU Qing2, WANG Kejun3,*, YUE Guanghui4, WANG Shuigen2 and SHANG Zaifei5
    References(24)

    [3] TOSHEV A, SZEGEDY C. DeepPose: human pose estimation via deep neural networks[C]// 2014 IEEE Conference on Computer Vision and Pattern Recognition. Columbus, OH, USA: IEEE, 2014: 1653-1660. doi: 10.1109/CVPR.2014.214.

    [4] CARREIRA J, AGRAWAL P, FRAGKIADAKI K, et al. Human pose estimation with iterative error feedback[DB/OL]. (2015-07-23)[2022-08-09]. https://arxiv.org/abs/1507.06550. doi: 10.48550/arXiv.1507.06550.

    [5] WEI S E, RAMAKRISHNA V, KANADE T, et al. Convolutional pose machines[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, NV, USA: IEEE, 2016: 4724-4732. doi: 10.1109/CVPR.2016.511.

    [6] NEWELL A, YANG Kaiyu, DENG Jia. Stacked hourglass networks for human pose estimation[J/OL]. Springer International Publishing, 2016: 483-499. doi: 10.1007/978-3-319-46484-8_29.

    [7] CHEN Yilun, WANG Zhicheng, PENG Yuxiang, et al. Cascaded pyramid network for multi-person pose estimation[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE, 2018: 7103-7112. doi: 10.1109/CVPR.2018.00742.

    [8] FANG Haoshu, XIE Shuqin, TAI Y W, et al. RMPE: regional multi-person pose estimation[C]// 2017 IEEE International Conference on Computer Vision (ICCV). Venice, Italy: IEEE, 2017: 2353-2362. doi: 10.1109/ICCV.2017.256.

    [9] SUN Ke, XIAO Bin, LIU Dong, et al. Deep high-resolution representation learning for human pose estimation[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, CA, USA: IEEE, 2019: 5686-5696. doi: 10.1109/CVPR.2019.00584.

    [10] LI W, WANG Z, YIN B, et al. Rethinking on multi-stage networks for human pose estimation[DB/OL]. (2019-01-01)[2022-08-09]. https://arxiv.org/abs/1901.00148. doi: 10.48550/arXiv.1901.00148.

    [11] INSAFUTDINOV E, PISHCHULIN L, ANDRES B, et al. DeeperCut: a deeper, stronger, and faster multi-person pose estimation model[C]// The 14th European Conference. Amsterdam, the Netherlands: Springer International Publishing, 2016: 34-50. doi: 10.1007/978-3-319-46466-4_3.

    [12] PISHCHULIN L, INSAFUTDINOV E, TANG Siyu, et al. DeepCut: joint subset partition and labeling for multi person pose estimation[C]// 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, NV, USA: IEEE, 2016: 4929-4937. doi: 10.1109/CVPR.2016.533.

    [13] NEWELL A, HUANG Zhi'ao, DENG Jia. Associative embedding: end-to-end learning for joint detection and grouping[C]// Proceedings of the 31st International Conference on Neural Information Processing Systems. [S. l.]: Association for Computing Machinery, 2017: 2274-2284.

    [14] CAO Zhe, SIMON T, WEI S, et al. Realtime multi-person 2D pose estimation using part affinity fields[DB/OL]. (2016-11-24)[2022-08-09]. https://arxiv.org/abs/1611.08050. doi: 10.48550/arXiv.1611.08050.

    [15] KREISS S, BERTONI L, ALAHI A. PifPaf: composite fields for human pose estimation[C]// 2019 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, CA, USA: IEEE, 2019: 11969-11978. doi: 10.1109/CVPR.2019.01225.

    [16] CHENG Bowen, XIAO Bin, WANG Jingdong, et al. HigherHRNet: scale-aware representation learning for bottom-up human pose estimation[C]// 2020 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Seattle, WA, USA: IEEE, 2020: 5385-5394. doi: 10.1109/CVPR42600.2020.00543.

    [17] MEHTA S, RASTEGARI M. MobileViT: light-weight, general-purpose, and mobile-friendly vision transformer[C]// IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2021. doi: 10.48550/arXiv.2110.02178.

    [18] SANDLER M, HOWARD A, ZHU Menglong, et al. MobileNetV2: inverted residuals and linear bottlenecks[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Salt Lake City, UT, USA: IEEE, 2018: 4510-4520. doi: 10.1109/CVPR.2018.00474.

    [19] LIN T Y, DOLLÁR P, GIRSHICK R, et al. Feature pyramid networks for object detection[C]// 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Honolulu, HI, USA: IEEE, 2017: 936-944. doi: 10.1109/CVPR.2017.106.

    [20] BISWAS K, KUMAR S, BANERJEE S, et al. Smooth maximum unit: smooth activation function for deep networks using smoothing maximum technique[C]// IEEE Conference on Computer Vision and Pattern Recognition (CVPR). New Orleans, LA, USA: IEEE, 2021: 784-793. doi: 10.1109/CVPR52688.2022.00087.

    [21] FU Jun, LIU Jing, TIAN Haijie, et al. Dual attention network for scene segmentation[C]// IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Long Beach, CA, USA: IEEE, 2019: 3146-3154.

    [22] HE Kaiming, ZHANG Xiangyu, REN Shaoqing, et al. Deep residual learning for image recognition[C]// IEEE Conference on Computer Vision and Pattern Recognition (CVPR). Las Vegas, NV, USA: IEEE, 2016: 770-778.

    [23] SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[DB/OL]. (2014-09-04)[2022-08-09]. https://arxiv.org/abs/1409.1556. doi: 10.48550/arXiv.1409.1556.

    [24] YU Changqian, XIAO Bin, GAO Changxin, et al. Lite-HRNet: a lightweight high-resolution network[C]// 2021 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR). Nashville, TN, USA: IEEE, 2021: 10435-10445. doi: 10.1109/CVPR46437.2021.01030.

    [25] HU Jie, SHEN Li, SUN Gang. Squeeze-and-excitation networks[C]// 2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition. Salt Lake City, UT, USA: IEEE, 2018: 7132-7141. doi: 10.1109/CVPR.2018.00745.

    [26] ZHANG Qinglong, YANG Yubin. SA-Net: shuffle attention for deep convolutional neural networks[C]// ICASSP 2021 - 2021 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP). Toronto, ON, Canada: IEEE, 2021: 2235-2239. doi: 10.1109/ICASSP39728.2021.9414568.

    ZHANG Wenyang, XU Zhaofei, LIU Qing, WANG Kejun, YUE Guanghui, WANG Shuigen, SHANG Zaifei. Improved infrared human pose estimation algorithm based on MobileViT[J]. Journal of Terahertz Science and Electronic Information Technology, 2024, 22(7): 781

    Paper Information

    Received: Aug. 9, 2022

    Accepted: --

    Published Online: Aug. 22, 2024

    Author Email: Kejun WANG (wangkejun@hrbeu.edu.cn)

    DOI: 10.11805/tkyda2022149
