Opto-Electronic Engineering, Vol. 46, Issue 12, 190006 (2019)

Equal-scale structure from motion method based on deep learning

Chen Peng*, Ren Jinjin, Wang Haixia, Tang Yuesheng, and Liang Ronghua
    References (21)

    [1] Liu G, Peng Q S, Bao H J. An interactive modeling system from multiple images[J]. Journal of Computer-Aided Design & Computer Graphics, 2004, 16(10): 1419–1424, 1429.

    [2] Cao T Y, Cai H Y, Fang D M, et al. Robot vision localization system based on image content matching[J]. Opto-Electronic Engineering, 2017, 44(5): 523–533.

    [3] Tomasi C, Kanade T. Shape and motion from image streams under orthography: a factorization method[J]. International Journal of Computer Vision, 1992, 9(2): 137–154.

    [4] Pollefeys M, Koch R, van Gool L. Self-calibration and metric reconstruction in spite of varying and unknown intrinsic camera parameters[J]. International Journal of Computer Vision, 1999, 32(1): 7–25.

    [5] Dai J J. Research on the theory and algorithms of 3D reconstruction from multiple images[D]. Shanghai: Shanghai Jiao Tong University, 2012.

    [6] Zhang T. 3D reconstruction based on monocular vision[D]. Xi’an: Xidian University, 2014.

    [7] Xu Y X, Chen F. Real-time stereo visual localization based on multi-frame sequence motion estimation[J]. Opto-Electronic Engineering, 2016, 43(2): 89–94.

    [8] Huang W Y, Xu X M, Wu F Q, et al. Research of underwater binocular vision stereo positioning technology in nuclear condition[J]. Opto-Electronic Engineering, 2016, 43(12): 28–33.

    [9] Yi K M, Trulls E, Lepetit V, et al. LIFT: learned invariant feature transform[C]//Proceedings of the 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 467–483.

    [10] He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, NV, USA, 2016: 770–778.

    [11] Newell A, Yang K Y, Deng J. Stacked hourglass networks for human pose estimation[C]//Proceedings of the 14th European Conference on Computer Vision, Amsterdam, The Netherlands, 2016: 483–499.

    [12] Zhou T H, Brown M, Snavely N, et al. Unsupervised learning of depth and ego-motion from video[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 2017: 6612–6619.

    [13] Newcombe R A, Izadi S, Hilliges O, et al. KinectFusion: real-time dense surface mapping and tracking[C]//Proceedings of the 10th IEEE International Symposium on Mixed and Augmented Reality, Basel, Switzerland, 2011: 127–136.

    [14] Usenko V, Engel J, Stückler J, et al. Direct visual-inertial odometry with stereo cameras[C]//Proceedings of 2016 IEEE International Conference on Robotics and Automation, Stockholm, Sweden, 2016: 1885–1892.

    [15] Concha A, Loianno G, Kumar V, et al. Visual-inertial direct SLAM[C]//Proceedings of 2016 IEEE International Conference on Robotics and Automation, Stockholm, Sweden, 2016: 1331–1338.

    [16] Ham C, Lucey S, Singh S. Hand waving away scale[C]//Proceedings of the 13th European Conference on Computer Vision, Zurich, Switzerland, 2014: 279–293.

    [17] Mur-Artal R, Tardós J D. Visual-inertial monocular SLAM with map reuse[J]. IEEE Robotics and Automation Letters, 2017, 2(2): 796–803.

    [18] Mur-Artal R, Tardós J D. ORB-SLAM2: an open-source SLAM system for monocular, stereo, and RGB-D cameras[J]. IEEE Transactions on Robotics, 2017, 33(5): 1255–1262.

    [19] Ham C, Lucey S, Singh S. Absolute scale estimation of 3D monocular vision on smart devices[M]//Hua G, Hua X S. Mobile Cloud Visual Media Computing: From Interaction to Service. New York: Springer International Publishing, 2015: 329–344.

    [20] Mustaniemi J, Kannala J, Särkkä S, et al. Inertial-based scale estimation for structure from motion on mobile devices[C]//Proceedings of 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems, Vancouver, BC, Canada, 2017: 4394–4401.

    [21] Godard C, Mac Aodha O, Brostow G J. Unsupervised monocular depth estimation with left-right consistency[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 2017: 6602–6610.

    CLP Journals

    [1] Kan Wang, Jun Gong, Jinghe Wei, Ce Zhu, Kai Liu. Euclidean 3D reconstruction based on structure from motion of matching adjacent images[J]. Infrared and Laser Engineering, 2020, 49(6): 20200078

    Citation

    Chen Peng, Ren Jinjin, Wang Haixia, Tang Yuesheng, Liang Ronghua. Equal-scale structure from motion method based on deep learning[J]. Opto-Electronic Engineering, 2019, 46(12): 190006

    Paper Information

    Category: Article

    Received: Jan. 8, 2019

    Accepted: --

    Published Online: Jan. 9, 2020

    Author Email: Peng Chen (chenpeng@zjut.edu.cn)

    DOI: 10.12086/oee.2019.190006
