Optics and Precision Engineering, Volume 31, Issue 20, 2993 (2023)

Indoor self-supervised monocular depth estimation based on level feature fusion

Deqiang CHENG1, Huaqiang ZHANG1, Qiqi KOU2, Chen LÜ1, and Jiansheng QIAN1,*
Author Affiliations
  • 1School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
  • 2School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China

    Complex indoor scenes contain many regions with low texture and poor lighting, which cause current self-supervised monocular depth estimation networks to produce imprecise depth predictions, noticeable blurring around object edges, and significant loss of detail. This paper introduces an indoor self-supervised monocular depth estimation network based on level feature fusion. First, to improve the visibility of poorly lit areas and mitigate the degradation caused by pseudo planes, a Mapping-Consistent Image Enhancement module is applied to the indoor images while maintaining brightness consistency. Second, a novel self-supervised monocular depth estimation network incorporating an attention-based Cross-Level Feature Adjustment module is proposed. This module effectively fuses multilevel feature information from the encoder into the decoder, strengthening the network's use of feature information and narrowing the semantic gap between the predicted depth and the true depth. Finally, a Gram Matrix Similarity Loss based on image style features is introduced as an additional self-supervised signal that further constrains the model and improves its depth prediction accuracy. Trained and tested on the NYU Depth V2 and ScanNet indoor datasets, the proposed model achieves pixel accuracy rates of 81.9% and 76.0%, respectively. Comparative experiments with existing mainstream indoor self-supervised monocular depth estimation models show that the proposed network better preserves object edges and details, effectively improving the accuracy of the predicted depth.
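
    The abstract names a Gram Matrix Similarity Loss as an extra self-supervised signal but gives no formula. The sketch below shows one common formulation of such a loss, assuming the Gram matrices are computed channel-wise over (B, C, H, W) feature maps and compared with an L1 distance; the function names and the choice of which features are compared are illustrative assumptions, not the authors' implementation.

    ```python
    # Minimal sketch of a Gram-matrix similarity loss (PyTorch).
    # Assumptions (not specified in the abstract): features are network
    # activations of shape (B, C, H, W), and similarity is the L1 distance
    # between size-normalized Gram matrices of the two branches.
    import torch

    def gram_matrix(feat: torch.Tensor) -> torch.Tensor:
        """Channel-wise Gram matrix of a (B, C, H, W) feature map."""
        b, c, h, w = feat.shape
        f = feat.view(b, c, h * w)
        gram = torch.bmm(f, f.transpose(1, 2))  # (B, C, C) style statistics
        return gram / (c * h * w)               # normalize by feature size

    def gram_similarity_loss(feat_pred: torch.Tensor,
                             feat_ref: torch.Tensor) -> torch.Tensor:
        """L1 distance between Gram matrices, used as an extra self-supervised term."""
        return torch.abs(gram_matrix(feat_pred) - gram_matrix(feat_ref)).mean()
    ```

    In this hypothetical form, the loss penalizes differences in second-order feature statistics (image "style") rather than per-pixel values, which is why it can act as an additional constraint alongside the usual photometric reconstruction losses.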

    Deqiang CHENG, Huaqiang ZHANG, Qiqi KOU, Chen LÜ, Jiansheng QIAN. Indoor self-supervised monocular depth estimation based on level feature fusion[J]. Optics and Precision Engineering, 2023, 31(20): 2993

    Paper Information

    Category: Information Sciences

    Received: Mar. 1, 2023

    Accepted: --

    Published Online: Nov. 28, 2023

    Author Email: Jiansheng QIAN (qianjsh@cumt.edu.cn)

    DOI: 10.37188/OPE.20233120.2993
