Infrared and Laser Engineering, Volume 54, Issue 7, 20250157 (2025)
Recent progress in research and applications of monocular and binocular depth estimation (invited)
[6] ARAMPATZAKIS V, PAVLIDIS G, MITIANOUDIS N et al. Monocular depth estimation: A thorough review[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 46, 2396-2414(2023).
[8] SAXENA A, CHUNG S, NG A. Learning depth from single monocular images[J]. Advances in Neural Information Processing Systems, 2005: 1161-1168.
[13] EIGEN D, PUHRSCH C, FERGUS R. Depth map prediction from a single image using a multi-scale deep network[J]. Advances in Neural Information Processing Systems, 2014, 27.
[14] LAINA I, RUPPRECHT C, BELAGIANNIS V, et al. Deeper depth prediction with fully convolutional residual networks[C]//Proceedings of the 2016 Fourth International Conference on 3D Vision, 2016: 239-248.
[15] FU H, GONG M, WANG C, et al. Deep ordinal regression network for monocular depth estimation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 2002-2011.
[17] LIU Z, LIN Y, CAO Y, et al. Swin transformer: Hierarchical vision transformer using shifted windows[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021: 10012-10022.
[18] LIU Y, ZHANG Y, WANG Y et al. A survey of visual transformers[J]. arXiv:2111.06091 [cs], 2023.
[19] RANFTL R, BOCHKOVSKIY A, KOLTUN V. Vision transformers for dense prediction[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021: 12179-12188.
[21] GARG R, BG V K, CARNEIRO G, et al. Unsupervised cnn for single view depth estimation: Geometry to the rescue[C]//Computer Vision–ECCV 2016: 14th European Conference, Amsterdam, The Netherlands, October 11-14, 2016, Proceedings, Part VIII 14, 2016: 740-756.
[22] GODARD C, MAC AODHA O, BROSTOW G J. Unsupervised monocular depth estimation with left-right consistency[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 270-279.
[23] GODARD C, MAC AODHA O, FIRMAN M, et al. Digging into self-supervised monocular depth estimation[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019: 3828-3838.
[24] ZHOU T, BROWN M, SNAVELY N, et al. Unsupervised learning of depth and ego-motion from video[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 1851-1858.
[25] WATSON J, MAC AODHA O, PRISACARIU V, et al. The temporal opportunist: Self-supervised multi-frame monocular depth[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 1164-1174.
[27] CHEN L-C, PAPANDREOU G, KOKKINOS I et al. Deeplab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected crfs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 40, 834-848(2017).
[28] BHAT S F, ALHASHIM I, WONKA P. Adabins: Depth estimation using adaptive bins[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 4009-4018.
[30] SILBERMAN N, HOIEM D, KOHLI P, et al. Indoor segmentation and support inference from rgbd images[C]//Proceedings of the Computer Vision–ECCV 2012: 12th European Conference on Computer Vision, Florence, Italy, October 7-13, 2012, Proceedings, Part V 12, 2012: 746-760.
[31] GEIGER A, LENZ P, URTASUN R. Are we ready for autonomous driving? The kitti vision benchmark suite[C]//Proceedings of the Conference on Computer Vision and Pattern Recognition, 2012: 3354-3361.
[33] GAIDON A, WANG Q, CABON Y, et al. Virtual worlds as proxy for multi-object tracking analysis[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 4340-4349.
[34] YANG L, KANG B, HUANG Z, et al. Depth anything: Unleashing the power of large-scale unlabeled data[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024: 10371-10381.
[36] RANFTL R, LASINGER K, HAFNER D et al. Towards robust monocular depth estimation: Mixing datasets for zero-shot cross-dataset transfer[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44, 1623-1637(2020).
[37] CHOI S, GOPAKUMAR M, PENG Y et al. Neural 3D holography: Learning accurate wave propagation models for 3D holographic virtual and augmented reality displays[J]. ACM Transactions on Graphics, 40, 1-12(2021).
[39] KE B, OBUKHOV A, HUANG S, et al. Repurposing diffusion-based image generators for monocular depth estimation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2024: 9492-9502.
[41] LAGA H, JOSPIN L V, BOUSSAID F et al. A survey on deep learning techniques for stereo-based depth estimation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44, 1738-1764(2020).
[43] HIRSCHMULLER H. Stereo processing by semiglobal matching and mutual information[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30, 328-341(2007).
[44] BARRON J T, ADAMS A, SHIH Y, et al. Fast bilateral-space stereo for synthetic defocus[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2015: 4466-4474.
[45] MAYER N, ILG E, HAUSSER P, et al. A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 4040-4048.
[46] CHANG J-R, CHEN Y-S. Pyramid stereo matching network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2018: 5410-5418.
[47] KENDALL A, MARTIROSYAN H, DASGUPTA S, et al. End-to-end learning of geometry and context for deep stereo regression[C]//Proceedings of the IEEE International Conference on Computer Vision, 2017: 66-75.
[48] LI Z, LIU X, DRENKOW N, et al. Revisiting stereo depth estimation from a sequence-to-sequence perspective with transformers[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2021: 6197-6206.
[50] HUANG X, CHENG X, GENG Q, et al. The apolloscape dataset for autonomous driving[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition Workshops, 2018: 954-960.
[51] MAYER N, ILG E, HAUSSER P, et al. A large dataset to train convolutional networks for disparity, optical flow, and scene flow estimation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 4040-4048.
[52] SCHARSTEIN D, SZELISKI R. High-accuracy stereo depth maps using structured light[C]//2003 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 2003.
[53] KHAMIS S, FANELLO S, RHEMANN C, et al. StereoNet: Guided hierarchical refinement for real-time edge-aware depth prediction[C]//Proceedings of the European Conference on Computer Vision, 2018: 573-590.
[54] PaddlePaddle. PaddleDepth: A Toolkit for Depth Information Argumentation[CP/OL]. GitHub, 2023(2023-05-03)[2025-07-18]. https://github.com/PaddlePaddle/PaddleDepth.
[56] GAO T, ZOU D, CHEN C P, et al. Online lane mapping based on multi-sensor SLAM and Catmull-Rom splines[J]. Measurement Science and Technology, 2025, 36: 026318.
Haiyang HU, Chaoping CHEN, Tianmu GAO, Baoen HAN, Yunfan YANG, Yi LIU, Xiaojun WU. Recent progress in research and applications of monocular and binocular depth estimation (invited)[J]. Infrared and Laser Engineering, 2025, 54(7): 20250157
Category: Special issue—Advanced display technology and applications
Received: Mar. 10, 2025
Accepted: --
Published Online: Aug. 29, 2025