Laser & Optoelectronics Progress, Volume 56, Issue 19, 190001 (2019)

Progress in Deep Learning Based Monocular Image Depth Estimation

Yang Li*, Xiuwan Chen, Yuan Wang, and Maolin Liu
Author Affiliations
  • Institute of Remote Sensing and Geographic Information System, School of Earth and Space Sciences, Peking University, Beijing 100871, China
    Figures & Tables (16)
    Network architecture proposed by Laina et al.[27]
    Schematics of up-projections. (a) Up-projection; (b) fast up-projection
    Replacing traditional 5×5 convolution kernels with four small convolution kernels
    Long-tail distributions on depth and semantic labels. (a) Pixel-depth distribution of NYU Depth V2 dataset; (b) pixel-depth distribution of KITTI dataset; (c)(d) pixel-semantic label distributions of NYU Depth V2 dataset (all categories/40 categories)
    Schematic of network architecture proposed by Jiao et al.[42]
    Schematics of proposed LSU and SUC connections. (a) LSU; (b) SUC
    Schematic of network architecture proposed by Fu et al.[47]
    Structural schematic of full-image encoders
    Overall flow chart of algorithm proposed by Garg et al.[30]
    Schematics of network architecture proposed by Godard et al.[31]. (a) Sampling strategy with left-right consistency; (b) loss function
    Schematic of semi-supervised network architecture and loss function proposed by Kuznietsov et al.[53]
    Schematic of network frame of video depth estimation based on view synthesis[32]
    Schematic of GeoNet network architecture proposed by Yin et al.[59]

Table 1. Quantitative evaluation of selected algorithms on NYU Depth V2 dataset

Method              | Year | Type | eAbs  | eRMS  | elg   | δ<1.25 | δ<1.25² | δ<1.25³
Method in Ref. [23] | 2014 | Sup. | 0.215 | 0.907 | -     | 0.611  | 0.887   | 0.971
Method in Ref. [26] | 2016 | Sup. | 0.213 | 0.759 | 0.087 | 0.650  | 0.906   | 0.976
Method in Ref. [27] | 2016 | Sup. | 0.127 | 0.573 | 0.055 | 0.811  | 0.953   | 0.988
Method in Ref. [61] | 2017 | Sup. | 0.121 | 0.586 | 0.052 | 0.811  | 0.954   | 0.987
Method in Ref. [47] | 2018 | Sup. | 0.115 | 0.509 | 0.051 | 0.828  | 0.965   | 0.992
Method in Ref. [42] | 2018 | Sup. | 0.098 | 0.329 | 0.040 | 0.917  | 0.983   | 0.996
Method in Ref. [45] | 2018 | Sup. | 0.139 | 0.505 | 0.058 | 0.820  | 0.960   | 0.989
Method in Ref. [10] | 2019 | Sup. | 0.128 | 0.523 | 0.059 | 0.813  | 0.964   | 0.992

Table 2. Quantitative evaluation of selected algorithms on KITTI dataset

Method              | Year | Type            | eAbs  | eRMS  | elg   | δ<1.25 | δ<1.25² | δ<1.25³
Method in Ref. [23] | 2014 | Sup.            | 0.190 | 7.156 | 0.270 | 0.692  | 0.899   | 0.967
Method in Ref. [26] | 2016 | Sup.            | 0.217 | 7.046 | -     | 0.656  | 0.881   | 0.958
Method in Ref. [53] | 2017 | Semi. (stereo)  | 0.113 | 4.621 | 0.189 | 0.862  | 0.960   | 0.986
Method in Ref. [31] | 2017 | Unsup. (stereo) | 0.114 | 4.935 | 0.206 | 0.861  | 0.949   | 0.976
Method in Ref. [47] | 2018 | Sup.            | 0.072 | 2.727 | 0.120 | 0.932  | 0.984   | 0.994
Method in Ref. [42] | 2018 | Sup.            | -     | 5.110 | 0.215 | 0.843  | 0.950   | 0.981
Method in Ref. [45] | 2018 | Sup.            | 0.113 | 4.687 | -     | 0.856  | 0.962   | 0.988
Method in Ref. [33] | 2018 | Unsup. (video)  | 0.109 | 4.750 | 0.187 | 0.874  | 0.958   | 0.982
Method in Ref. [62] | 2018 | Unsup. (stereo) | 0.095 | 4.316 | 0.177 | 0.892  | 0.966   | 0.984
Method in Ref. [59] | 2018 | Unsup. (video)  | 0.153 | 5.737 | 0.232 | 0.802  | 0.934   | 0.972
Method in Ref. [57] | 2019 | Unsup. (video)  | 0.139 | 5.160 | 0.215 | 0.833  | 0.939   | 0.975
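
The error and accuracy columns in Tables 1 and 2 follow the evaluation protocol popularized by Eigen et al.[23]: eAbs is the mean absolute relative error, eRMS is the root-mean-square error of predicted depth (in meters), elg is a logarithmic error, and δ<1.25, δ<1.25², and δ<1.25³ are the fractions of pixels whose predicted-to-true depth ratio falls below the respective threshold. The sketch below shows one common way to compute these quantities; the function and variable names (depth_metrics, pred, gt) are illustrative rather than taken from any cited implementation, and whether elg denotes the mean log10 error or the RMSE of log depth differs between the NYU Depth V2 and KITTI conventions.

```python
import numpy as np

def depth_metrics(pred, gt, eps=1e-6):
    """Common monocular depth-estimation metrics (illustrative sketch).

    pred, gt: arrays of predicted / ground-truth depths over valid pixels.
    """
    pred = np.asarray(pred, dtype=np.float64)
    gt = np.asarray(gt, dtype=np.float64)

    # Mean absolute relative error (eAbs)
    abs_rel = np.mean(np.abs(pred - gt) / gt)

    # Root-mean-square error in depth units (eRMS)
    rmse = np.sqrt(np.mean((pred - gt) ** 2))

    # Logarithmic error (elg); mean |log10| difference is the NYU-style choice
    log_err = np.mean(np.abs(np.log10(pred + eps) - np.log10(gt + eps)))

    # Threshold accuracies: fraction of pixels with max(d/d*, d*/d) < 1.25^i
    ratio = np.maximum(pred / gt, gt / pred)
    delta1 = np.mean(ratio < 1.25)
    delta2 = np.mean(ratio < 1.25 ** 2)
    delta3 = np.mean(ratio < 1.25 ** 3)

    return {"abs_rel": abs_rel, "rmse": rmse, "log_err": log_err,
            "delta1": delta1, "delta2": delta2, "delta3": delta3}
```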

Table 3. Summary of selected representative algorithms

Method              | Year | Type   | Data type                    | Loss                                                                               | Main contributions
Method in Ref. [23] | 2014 | Sup.   | RGB + depth                  | Inference error (original)                                                         | First to apply deep learning to monocular depth estimation (MDE)
Method in Ref. [27] | 2016 | Sup.   | RGB + depth                  | Inference error (berHu loss)                                                       | Introduction of residual learning to MDE with optimized up-convolutions
Method in Ref. [61] | 2017 | Sup.   | RGB + depth                  | Inference error (square loss)                                                      | End-to-end MDE with CNN layers fused within a CRF
Method in Ref. [53] | 2017 | Semi.  | Binocular RGB + sparse depth | berHu loss (supervised), image alignment error (unsupervised), regularization loss | Introduction of a semi-supervised deep learning approach to MDE
Method in Ref. [47] | 2018 | Sup.   | RGB + depth                  | Ordinal regression loss                                                            | Ordinal regression method for MDE with dilated convolutions
Method in Ref. [59] | 2018 | Unsup. | Video                        | Warping loss, depth smoothness loss, geometric consistency loss                    | Cascaded architecture that resolves rigid flow and object motion separately for depth estimation from monocular video
Method in Ref. [33] | 2018 | Unsup. | Video                        | Reconstruction loss, warping loss, depth smoothness loss, object size constraints  | MDE in highly dynamic scenes with explicit modeling of the 3D motions of moving objects and the camera itself
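
Among the losses listed in Table 3, the berHu (reverse Huber) term used as the supervised loss in Refs. [27] and [53] acts as an L1 penalty on small residuals and as a scaled L2 penalty on large ones. A minimal NumPy sketch is given below, assuming the common choice of setting the switch-over threshold c to 20% of the largest absolute residual in the batch; the names berhu_loss, pred, and gt are illustrative.

```python
import numpy as np

def berhu_loss(pred, gt, c_frac=0.2):
    """Reverse Huber (berHu) loss: L1 for small residuals, scaled L2 for large ones."""
    err = np.abs(np.asarray(pred, dtype=np.float64) -
                 np.asarray(gt, dtype=np.float64))
    # Threshold set to a fraction of the largest residual (assumed c_frac = 0.2)
    c = max(c_frac * err.max(), 1e-6)
    l1 = err                              # linear branch, |e| <= c
    l2 = (err ** 2 + c ** 2) / (2.0 * c)  # quadratic branch, |e| > c (continuous at c)
    return np.mean(np.where(err <= c, l1, l2))
```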

    Citation

    Yang Li, Xiuwan Chen, Yuan Wang, Maolin Liu. Progress in Deep Learning Based Monocular Image Depth Estimation[J]. Laser & Optoelectronics Progress, 2019, 56(19): 190001

    Paper Information

    Category: Reviews

    Received: Mar. 20, 2019

    Accepted: Apr. 11, 2019

    Published Online: Oct. 12, 2019

    Author Email: Yang Li (yang.li2012@pku.edu.cn)

    DOI: 10.3788/LOP56.190001
