Laser & Optoelectronics Progress, Vol. 56, Issue 13, 131003 (2019)

Head Pose Estimation Based on Multi-Scale Convolutional Neural Network

Lingyu Liang 1,2,3,**, Tiantian Zhang 1,3, and Wei He 1,*
Author Affiliations
  • 1 Key Laboratory of Wireless Sensor Network and Communication, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 201800, China
  • 2 School of Information Science and Technology, ShanghaiTech University, Shanghai 200120, China
  • 3 University of Chinese Academy of Sciences, Beijing 100049, China
    Figures & Tables(10)
    Head pose estimation flow chart
    Deep neural network structure for head pose estimation. (a) Multi-scale convolution structure; (b) process structure after feature combination
    Sample images from the experimental head pose databases. (a) CAS-PEAL-R1; (b) Pointing'04
    Face images after cropping
    Sample head pose images under different interference factors. (a) Standard; (b) with mask; (c) with glasses; (d) expression; (e) weak illumination; (f) strong illumination; (g) background
    • Table 1. Multi-scale convolution vs. single-scale convolution

      Convolution | Recognition accuracy on test set /%
      Multi-scale convolution | 98.9
      3×3 single-scale convolution | 94.3
      5×5 single-scale convolution | 93.1
      7×7 single-scale convolution | 93.2
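
      Table 1 contrasts a multi-scale convolution with single-scale 3×3, 5×5, and 7×7 kernels. The sketch below illustrates the general idea only: several kernel sizes run in parallel over the same input, and their outputs are stacked as feature channels. It is not the authors' network. The kernel sizes of the parallel branches, the averaging weights, and the names `conv2d_same`, `box_kernel`, and `multi_scale_conv` are all assumptions made for illustration.

```python
from typing import List, Sequence

Matrix = List[List[float]]


def conv2d_same(image: Matrix, kernel: Matrix) -> Matrix:
    """Cross-correlate `image` with a square, odd-sized kernel.

    Zero padding keeps the output the same size as the input, so
    branches with different kernel sizes can be stacked afterwards.
    """
    h, w = len(image), len(image[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for i in range(k):
                for j in range(k):
                    yy, xx = y + i - pad, x + j - pad
                    if 0 <= yy < h and 0 <= xx < w:  # outside = zero padding
                        acc += image[yy][xx] * kernel[i][j]
            out[y][x] = acc
    return out


def box_kernel(size: int) -> Matrix:
    """Placeholder weights (assumption): a normalized averaging kernel.

    A trained network would learn these weights instead.
    """
    return [[1.0 / (size * size)] * size for _ in range(size)]


def multi_scale_conv(image: Matrix, sizes: Sequence[int] = (3, 5, 7)) -> List[Matrix]:
    """Run one branch per kernel size and return the outputs as channels."""
    return [conv2d_same(image, box_kernel(s)) for s in sizes]


# Toy 8x8 input; each branch yields one 8x8 feature map.
image = [[float((x + y) % 4) for x in range(8)] for y in range(8)]
features = multi_scale_conv(image)
print(len(features), len(features[0]), len(features[0][0]))  # 3 8 8
```

      Because every branch preserves the spatial size, the stacked maps can be passed to later layers as one multi-channel tensor. This is the usual reason for combining "same" padding with parallel kernels of different sizes.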
    • Table 2. Accuracy of different algorithms on Pointing'04 and CAS-PEAL-R1

      Algorithm | Pointing'04 accuracy /% | CAS-PEAL-R1 accuracy /% | Number of poses
      Algorithm of this paper | 96.5 | 98.9 | 21
      Cluster-classification Bayesian network [23] | 94.8 | 96.2 | 12
      Based on facial feature points [24] | 92.7 | 93.5 | 6
    • Table 3. Pose conversion relationships

      Pose after conversion | Pitch pose before conversion | Yaw pose before conversion /(°)
      Level | PM | 0, -15, +15
      Level left | PM | +30, +45
      Level right | PM | -30, -45
      Pitch down | PD | 0, -15, +15
      Pitch up | PU | 0, -15, +15
      Left up | PU | +30, +45
      Right up | PU | -30, -45
      Left down | PD | +30, +45
      Right down | PD | -30, -45
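
      The conversion in Table 3 amounts to a lookup from a pitch code and a yaw angle to one of nine merged pose classes. A minimal sketch of that lookup follows. The function and dictionary names (`yaw_group`, `convert_pose`, `PITCH_GROUP`, `CONVERTED`) are hypothetical, and the intermediate labels "level", "up", "down", "center", "left", and "right" are introduced here only to organize the table.

```python
# Pitch codes as they appear in Table 3, mapped to an illustrative group name.
PITCH_GROUP = {"PM": "level", "PD": "down", "PU": "up"}

# (pitch group, yaw group) -> pose name after conversion, copied from Table 3.
CONVERTED = {
    ("level", "center"): "Level",
    ("level", "left"): "Level left",
    ("level", "right"): "Level right",
    ("down", "center"): "Pitch down",
    ("up", "center"): "Pitch up",
    ("up", "left"): "Left up",
    ("up", "right"): "Right up",
    ("down", "left"): "Left down",
    ("down", "right"): "Right down",
}


def yaw_group(yaw_deg: int) -> str:
    """Bucket a yaw angle in degrees using the groupings listed in Table 3."""
    if yaw_deg in (0, -15, 15):
        return "center"
    if yaw_deg in (30, 45):
        return "left"
    if yaw_deg in (-30, -45):
        return "right"
    raise ValueError(f"yaw angle {yaw_deg} is not listed in Table 3")


def convert_pose(pitch_code: str, yaw_deg: int) -> str:
    """Return the merged pose class for one (pitch code, yaw angle) pair."""
    if pitch_code not in PITCH_GROUP:
        raise ValueError(f"pitch code {pitch_code!r} is not listed in Table 3")
    return CONVERTED[(PITCH_GROUP[pitch_code], yaw_group(yaw_deg))]


# Enumerate every combination that Table 3 covers.
all_yaws = (-45, -30, -15, 0, 15, 30, 45)
table = {(p, y): convert_pose(p, y) for p in PITCH_GROUP for y in all_yaws}
print(len(table), len(set(table.values())))  # 21 9
print(convert_pose("PU", 45))                # Left up
```

      Enumerating the table this way shows that the 3 pitch codes and 7 yaw angles give 21 input combinations, which collapse to 9 classes after conversion.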
    • Table 4. Effect of different interference factors on recognition rate

      Interference factor | Accuracy of this paper /% | Accuracy of method in Ref. [23] /%
      Standard | 98.5 | 96.3
      With mask | 92.9 | 86.4
      With glasses | 96.3 | 89.7
      Expression | 98.1 | 94.9
      Weak illumination | 95.5 | 91.8
      Strong illumination | 96.1 | 92.2
      Background | 97.7 | 95.1
    • Table 5. Recognition time at different resolutions

      Resolution /(pixel×pixel) | Time of this paper /ms | Time of method in Ref. [23] /ms | Time of method in Ref. [24] /ms
      1920×1080 | 34.3 | 51.35 | 62.1
      1360×760 | 32.7 | 48.54 | 51.7
      800×600 | 30.8 | 45.23 | 49.8
    Citation

    Lingyu Liang, Tiantian Zhang, Wei He. Head Pose Estimation Based on Multi-Scale Convolutional Neural Network[J]. Laser & Optoelectronics Progress, 2019, 56(13): 131003

    Paper Information

    Category: Image Processing

    Received: Dec. 6, 2018

    Accepted: Jan. 24, 2019

    Published Online: Jul. 11, 2019

    Author Email: Liang Lingyu (liangly@shanghaitech.edu.cn), He Wei (wei.he@mail.sim.ac.cn)

    DOI:10.3788/LOP56.131003
