Indoor self-supervised monocular depth estimation based on level feature fusion

[1] ZHANG S G, CHEN Y F, ZHANG L J et al. Study on robot grasping system of SSVEP-BCI based on augmented reality stimulus[J]. Tsinghua Science and Technology, 28, 322-329(2022).

[2] LIN D H, FIDLER S, URTASUN R. Holistic Scene Understanding for 3D Object Detection with RGBD Cameras[C], 1417-1424(1).

[3] RASOULI A, TSOTSOS J K. Autonomous vehicles that interact with pedestrians: a survey of theory and practice[J]. IEEE Transactions on Intelligent Transportation Systems, 21, 900-918(2019).

[4] KHAMIS S, FANELLO S, RHEMANN C et al. StereoNet: Guided Hierarchical Refinement for Real-Time Edge-Aware Depth Prediction[webpage]. arXiv, 1807-08865(2018). https://arxiv.org/abs/1807.08865.pdf

[5] RANFTL R, VINEET V, CHEN Q F et al. Dense Monocular Depth Estimation in Complex Dynamic Scenes[C], 4058-4066(27).

[6] WU X R, XUE Q W. 3D vehicle detection for unmanned driving system based on lidar[J]. Optics and Precision Engineering, 30(4), 489-497(2022). (in Chinese). doi: 10.37188/OPE.20223004.0489

[7] EIGEN D, PUHRSCH C, FERGUS R. Depth map prediction from a single image using a multi-scale deep network[C], 2366-2374(13).

[8] GODARD C, MAC AODHA O, BROSTOW G J. Unsupervised monocular depth estimation with left-right consistency[C], 6602-6611(21).

[9] ZHOU T H, BROWN M, SNAVELY N et al. Unsupervised learning of depth and ego-motion from video[C], 6612-6619(21).

[10] GODARD C, MAC AODHA O, FIRMAN M et al. Digging into self-supervised monocular depth estimation[C], 3827-3837.

[11] GUIZILINI V, AMBRUS R, PILLAI S et al. 3D Packing for self-supervised monocular depth estimation[C], 2482-2491(13).

[12] YU Z H, JIN L, GAO S H. P2Net: Patch-Match and Plane-Regularization for Unsupervised Indoor Depth Estimation[M], 206-222(2020).

[13] GEIGER A, LENZ P, URTASUN R. Are we ready for autonomous driving?[C], 3354-3361(16).

[14] SAXENA A, SUN M, NG A Y. Make3D: learning 3D scene structure from a single still image[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 31, 824-840(2009).

[15] EIGEN D, FERGUS R. Predicting depth, surface normals and semantic labels with a common multi-scale convolutional architecture[C], 2650-2658(7).

[16] LIU F Y, SHEN C H, LIN G S et al. Learning depth from single monocular images using deep convolutional neural fields[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 38, 2024-2039(2016).

[17] KIM S, PARK K, SOHN K et al. Unified Depth Prediction and Intrinsic Image Decomposition From A Single Image Via Joint Convolutional Neural Fields[M], 143-159(2016).

[18] FU H, GONG M M, WANG C H et al. Deep ordinal regression network for monocular depth estimation[C], 2002-2011(18).

[19] WOFK D, MA F C, YANG T J et al. FastDepth: fast monocular depth estimation on embedded systems[C], 6101-6108(20).

[20] SHU C, YU K, DUAN Z X et al. Feature-Metric Loss for Self-Supervised Learning of Depth and Egomotion[M], 572-588(2020).

[21] CHEN Y H, SCHMID C, SMINCHISESCU C. Self-supervised learning with geometric constraints in monocular video: connecting flow, depth, and camera[C], 7062-7071.

[22] ZHOU J S, WANG Y W, QIN K H et al. Moving indoor: unsupervised video depth learning in challenging environments[C], 8617-8626.

[23] BAY H, TUYTELAARS T, VAN GOOL L. SURF: Speeded Up Robust Features[M], 404-417(2006).

[24] LI B Y, HUANG Y, LIU Z Y et al. StructDepth: leveraging the structural regularities for self-supervised indoor depth estimation[C], 12643-12653(10).

[25] CHENG D Q, CHEN L L, LV C et al. Light-guided and cross-fusion U-net for anti-illumination image super-resolution[J]. IEEE Transactions on Circuits and Systems for Video Technology, 32, 8436-8449(2022).

[26] HUANG H, DONG L L, LIU X F et al. Improved retinex low light image enhancement method[J]. Optics and Precision Engineering, 28(8), 1835-1849(2020). (in Chinese)

[27] ZHANG Y H, ZHANG J W, GUO X J. Kindling the darkness: a practical low-light image enhancer[C], 1632-1640(2019).

[28] CAI J R, GU S H, ZHANG L. Learning a deep single image contrast enhancer from multi-exposure images[J]. IEEE Transactions on Image Processing, 27, 2049-2062(2018).

[29] CHEN C, CHEN Q F, XU J et al. Learning to see in the dark[C], 3291-3300(18).

[30] JIANG Y F, GONG X Y, LIU D et al. EnlightenGAN: deep light enhancement without paired supervision[J]. IEEE Transactions on Image Processing, 30, 2340-2349(2021).

[31] GUO C L, LI C Y, GUO J C et al. Zero-reference deep curve estimation for low-light image enhancement[C], 1777-1786(13).

[32] WANG K, ZHANG Z Y, YAN Z Q et al. Regularizing nighttime weirdness: efficient self-supervised monocular depth estimation in the dark[C], 16035-16044(10).

[33] HE K M, ZHANG X Y, REN S Q et al. Deep residual learning for image recognition[C], 770-778(27).

[34] HUANG G, LIU Z, VAN DER MAATEN L et al. Densely connected convolutional networks[C], 2261-2269(21).

[35] SIMONYAN K, ZISSERMAN A. Very Deep Convolutional Networks for Large-Scale Image Recognition[webpage]. arXiv, 1409-1556(2014). https://arxiv.org/abs/1409.1556.pdf

[36] CHENG D Q, ZHAO J M, KOU Q Q et al. Multi-scale dense feature fusion network for image super-resolution[J]. Optics and Precision Engineering, 30(20), 2489-2500(2022). (in Chinese). doi: 10.37188/OPE.20223020.2489

[37] CHENG D Q, CHEN J, KOU Q Q et al. Lightweight super-resolution reconstruction method based on hierarchical features fusion and attention mechanism for mine image[J]. Chinese Journal of Scientific Instrument, 43(8), 73-84(2022). (in Chinese)

[38] CAI T J, PENG X Y, SHI Y P et al. Channel attention and residual concatenation network for image super-resolution[J]. Optics and Precision Engineering, 29(1), 142-151(2021). (in Chinese). doi: 10.37188/OPE.20212901.0142

[39] HU J, SHEN L, SUN G. Squeeze-and-excitation networks[C], 7132-7141(18).

[40] GHIASI G, LEE H, KUDLUR M et al. Exploring The Structure of a Real-Time, Arbitrary Neural Artistic Stylization Network[webpage]. arXiv, 1705-06830(2017). https://arxiv.org/abs/1705.06830.pdf

[41] LIU L N, SONG X B, WANG M M et al. Self-supervised monocular depth estimation for all day images using domain separation[C], 12717-12726(10).

[42] JADERBERG M, SIMONYAN K, ZISSERMAN A et al. Spatial Transformer Networks[webpage]. arXiv, 1506-02025(2015). https://arxiv.org/abs/1506.02025.pdf

[43] WANG Z, BOVIK A C, SHEIKH H R et al. Image quality assessment: from error visibility to structural similarity[J]. IEEE Transactions on Image Processing: a Publication of the IEEE Signal Processing Society, 13, 600-612(2004).

[44] SILBERMAN N, HOIEM D, KOHLI P et al. Indoor Segmentation and Support Inference from RGBD Images[M], 746-760(2012).

[45] DAI A, CHANG A X, SAVVA M et al. ScanNet: richly-annotated 3D reconstructions of indoor scenes[C], 2432-2443(21).

[46] HU J J, OZAY M, ZHANG Y et al. Revisiting single image depth estimation: toward higher resolution maps with accurate object boundaries[C], 1043-1051(7).

[47] YIN W, LIU Y F, SHEN C H et al. Enforcing geometric constraints of virtual normal for depth prediction[C], 5683-5692.

[48] FAROOQ BHAT S, ALHASHIM I, WONKA P. AdaBins: depth estimation using adaptive bins[C], 4008-4017(20).

[49] NIKLAUS S, YANG J M et al. 3D Ken Burns effect from a single image[J]. ACM Transactions on Graphics, 38, 1-15.

[50] ZHAO W, LIU S H, SHU Y Z et al. Towards better generalization: joint depth-pose learning without PoseNet[C], 9148-9158(13).

[51] BIAN J W, ZHAN H Y, WANG N Y et al. Unsupervised scale-consistent depth learning from video[J]. International Journal of Computer Vision, 129, 2548-2564(2021).

[52] BIAN J W, ZHAN H, WANG N et al. Unsupervised depth learning in challenging indoor video: weak rectification to rescue[J]. arXiv preprint arXiv, 2020.

[53] JIANG H L, DING L Y, HU J J et al. PLNet: plane and line priors for unsupervised indoor depth estimation[C], 741-750(1).

[54] BIAN J W, ZHAN H Y, WANG N Y et al. Auto-rectify network for unsupervised indoor depth estimation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 44, 9802-9813(2022).

Deqiang CHENG, Huaqiang ZHANG, Qiqi KOU, Chen LÜ, Jiansheng QIAN. Indoor self-supervised monocular depth estimation based on level feature fusion[J]. Optics and Precision Engineering, 2023, 31(20): 2993