Optics and Precision Engineering, Vol. 30, Issue 19, 2379 (2022)
Global hand pose estimation based on pixel voting
Estimating the global pose of a hand while its gesture changes remains a challenging task in computer vision. To reduce the large errors typical of this task, a method based on pixel voting is proposed. First, a convolutional neural network with an encoder-decoder structure is built to generate feature maps that carry semantic and pose information. Second, a semantic segmentation branch and a pose estimation branch operate on these feature maps to obtain the hand pixel positions and the per-pixel pose votes, respectively. Finally, the pose votes of the hand pixels are aggregated to obtain the global pose estimate. In addition, to address the scarcity of global hand pose datasets, a procedure for generating synthetic hand datasets is established using the OpenSceneGraph 3D rendering engine and a 3D human hand model. This procedure generates depth images and global pose labels of human hands under different gestures. Experimental results show that the average error of global hand pose estimation based on pixel voting is 5.036°, which verifies that the proposed method can robustly and accurately estimate global hand poses from depth images.
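The aggregation step can be illustrated with a minimal sketch. The abstract does not state how poses are parameterized or how votes are combined, so everything below is an assumption made for illustration: each pixel is assumed to vote with a unit quaternion (w, x, y, z), hand pixels are selected by thresholding a hypothetical segmentation probability map, and the votes are combined by eigenvector-based quaternion averaging, a standard technique swapped in here and not necessarily the one used in the paper. The function names `aggregate_pose_votes` and `angular_error_deg` are hypothetical.

```python
import numpy as np


def aggregate_pose_votes(votes, hand_prob, threshold=0.5):
    """Combine per-pixel quaternion votes over the hand region.

    votes:     (H, W, 4) array, one quaternion vote per pixel (assumed format).
    hand_prob: (H, W) array, hand probability per pixel (assumed format).
    Returns a unit quaternion with non-negative scalar part.
    """
    mask = hand_prob > threshold              # keep only pixels labelled as hand
    if not mask.any():
        raise ValueError("no hand pixels found")

    q = votes[mask]                           # (N, 4) votes from hand pixels
    q = q / np.linalg.norm(q, axis=1, keepdims=True)

    # Sum of outer products q q^T. It is unchanged when a vote flips sign,
    # which matters because q and -q encode the same rotation.
    accumulator = q.T @ q                     # (4, 4), symmetric

    # eigh returns eigenvalues in ascending order, so the last eigenvector
    # belongs to the largest eigenvalue and serves as the average rotation.
    _, eigvecs = np.linalg.eigh(accumulator)
    mean_q = eigvecs[:, -1]
    return mean_q if mean_q[0] >= 0 else -mean_q


def angular_error_deg(q_est, q_true):
    """Angle in degrees of the rotation that separates two unit quaternions."""
    dot = abs(float(np.dot(q_est, q_true)))
    return float(np.degrees(2.0 * np.arccos(np.clip(dot, -1.0, 1.0))))
```

Under these assumptions, an angular error like the one computed by `angular_error_deg`, averaged over a test set, would be one way to obtain a figure in degrees comparable to the one reported above. The eigenvector average is chosen over a plain arithmetic mean because it does not depend on the sign each pixel happens to predict.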
Jingang LIN, Dongnian LI, Chengjun CHEN, Zhengxu ZHAO. Global hand pose estimation based on pixel voting[J]. Optics and Precision Engineering, 2022, 30(19): 2379
Category: Information Sciences
Received: May 16, 2022
Accepted: --
Published Online: Oct. 27, 2022
Author Email: LI Dongnian (dongnianli@qut.edu.cn)