Journal of Terahertz Science and Electronic Information Technology, Vol. 22, Issue 7, 781 (2024)
Improved infrared human pose estimation algorithm based on MobileViT
Human pose estimation relies primarily on locating joint points in visual images to recover the global posture of the limbs and torso. Deep learning methods based on visible-light images currently achieve high detection accuracy, but the risk of privacy leakage limits their practical application. Infrared detectors of comparable cost highlight human targets more effectively, but their lower imaging resolution and poorer image quality reduce detection accuracy. Inspired by vision Transformers, this paper introduces MobileViT-FPN to extract human key points: MobileViT captures the relationship between local and global joint features, and a Feature Pyramid Network (FPN) aggregates these representations at multiple scales. An improved OpenPose then clusters the key points and outputs the estimated pose. In the key point cascading phase, an attention mechanism allows the model to focus adaptively on regions of interest, improving the recovery of occluded parts. Experiments show that the method detects infrared human targets of varying scale and with partial occlusion in real time and depicts human posture accurately.
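The multi-scale aggregation step mentioned in the abstract can be illustrated with the top-down merge used in feature-pyramid-style networks. The sketch below is only an illustration under simplifying assumptions, not the authors' implementation: feature maps are single-channel nested lists, each level is half the size of the previous one, and the learned lateral and smoothing convolutions of a real network are omitted. The function names `upsample2x` and `top_down_merge` are hypothetical.

```python
def upsample2x(fmap):
    """Nearest-neighbour 2x upsampling of a 2D feature map (list of rows)."""
    out = []
    for row in fmap:
        wide = [v for v in row for _ in range(2)]  # repeat each column twice
        out.append(wide)
        out.append(list(wide))                     # repeat each row twice
    return out


def top_down_merge(pyramid):
    """Merge feature maps from coarse to fine.

    `pyramid` is ordered fine -> coarse, and each level is assumed to be
    exactly half the height and width of the previous one. Learned
    convolutions are left out (assumption), so the merge is a plain sum.
    """
    merged = [pyramid[-1]]                         # coarsest level is kept as is
    for fine in reversed(pyramid[:-1]):
        up = upsample2x(merged[0])                 # bring coarse context to finer size
        summed = [[a + b for a, b in zip(row_f, row_u)]
                  for row_f, row_u in zip(fine, up)]
        merged.insert(0, summed)
    return merged


# Toy three-level pyramid: 4x4, 2x2 and 1x1 maps.
c3 = [[1, 1, 1, 1] for _ in range(4)]
c4 = [[10, 20], [30, 40]]
c5 = [[100]]
p3, p4, p5 = top_down_merge([c3, c4, c5])
```

Each merged level adds the context of all coarser levels to its own finer detail, which is the reason such a pyramid helps with targets of varying scale.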
ZHANG Wenyang, XU Zhaofei, LIU Qing, WANG Kejun, YUE Guanghui, WANG Shuigen, SHANG Zaifei. Improved infrared human pose estimation algorithm based on MobileViT[J]. Journal of Terahertz Science and Electronic Information Technology, 2024, 22(7): 781
Received: Aug. 9, 2022
Accepted: --
Published Online: Aug. 22, 2024
The Author Email: Kejun WANG (wangkejun@hrbeu.edu.cn)