Optics and Precision Engineering, Volume 31, Issue 20, 2993 (2023)

Indoor self-supervised monocular depth estimation based on level feature fusion

Deqiang CHENG1, Huaqiang ZHANG1, Qiqi KOU2, Chen LÜ1, and Jiansheng QIAN1,*
Author Affiliations
  • 1School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
  • 2School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China

    Complex indoor scenes contain many regions with low texture and poor lighting, which cause current self-supervised monocular depth estimation networks to produce imprecise depth predictions, noticeable blurring around object edges, and significant loss of detail. This paper introduces an indoor self-supervised monocular depth estimation network based on level feature fusion. First, to improve the visibility of poorly lit areas and mitigate the degradation caused by pseudo planes, a Mapping-Consistent Image Enhancement module is applied to the indoor images while maintaining brightness consistency. Second, a novel self-supervised monocular depth estimation network incorporating an attention-based Cross-Level Feature Adjustment module is proposed. This module effectively fuses multilevel feature information from the encoder into the decoder, strengthening the network's use of feature information and narrowing the semantic gap between the predicted depth and the true depth. Finally, a Gram Matrix Similarity Loss based on image style features is introduced as an additional self-supervised signal that further constrains the model and improves its depth prediction accuracy. Trained and tested on the NYU Depth V2 and ScanNet indoor datasets, the proposed model achieves pixel accuracy rates of 81.9% and 76.0%, respectively. Comparative experiments with existing mainstream indoor self-supervised monocular depth estimation models show that the proposed network better preserves object edges and details, effectively improving the accuracy of the predicted depth.
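
    The abstract names a Gram Matrix Similarity Loss as an extra self-supervised signal but gives no formula. The sketch below shows one common formulation of such a loss, assuming the Gram matrices are computed channel-wise over (B, C, H, W) feature maps and compared with an L1 distance; the function names and the choice of which features are compared are illustrative assumptions, not the authors' implementation.

    ```python
    # Minimal sketch of a Gram-matrix similarity loss (PyTorch).
    # Assumptions (not specified in the abstract): features are network
    # activations of shape (B, C, H, W), and similarity is the L1 distance
    # between size-normalized Gram matrices of the two branches.
    import torch

    def gram_matrix(feat: torch.Tensor) -> torch.Tensor:
        """Channel-wise Gram matrix of a (B, C, H, W) feature map."""
        b, c, h, w = feat.shape
        f = feat.view(b, c, h * w)
        gram = torch.bmm(f, f.transpose(1, 2))  # (B, C, C) style statistics
        return gram / (c * h * w)               # normalize by feature size

    def gram_similarity_loss(feat_pred: torch.Tensor,
                             feat_ref: torch.Tensor) -> torch.Tensor:
        """L1 distance between Gram matrices, used as an extra self-supervised term."""
        return torch.abs(gram_matrix(feat_pred) - gram_matrix(feat_ref)).mean()
    ```

    In this hypothetical form, the loss penalizes differences in second-order feature statistics (image "style") rather than per-pixel values, which is why it can act as an additional constraint alongside the usual photometric reconstruction losses.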

    Deqiang CHENG, Huaqiang ZHANG, Qiqi KOU, Chen LÜ, Jiansheng QIAN. Indoor self-supervised monocular depth estimation based on level feature fusion[J]. Optics and Precision Engineering, 2023, 31(20): 2993

    Paper Information

    Category: Information Sciences

    Received: Mar. 1, 2023

    Accepted: --

    Published Online: Nov. 28, 2023

    Author Email: Jiansheng QIAN (qianjsh@cumt.edu.cn)

    DOI: 10.37188/OPE.20233120.2993
