Laser & Optoelectronics Progress, Vol. 62, Issue 14, 1412004 (2025)

Indoor Monocular Depth Estimation Based on Global-Local Feature Fusion

Zengyu Tian, Changku Sun, Yue Li, Luhua Fu, and Peng Wang*
Author Affiliations
  • State Key Laboratory of Precision Measurement Technology and Instruments, Tianjin University, Tianjin 300372, China

    To address low depth-estimation accuracy, blurred object contours, and detail loss caused by occlusion and lighting variations in complex indoor scenes, we propose an indoor monocular depth estimation algorithm based on global-local feature fusion. First, a hierarchical Transformer structure is incorporated into the encoder to enhance global feature extraction, and a simplified pyramid pooling module further enriches the feature representation. Second, a gated adaptive aggregation module is introduced into the decoder to optimize feature fusion during upsampling by effectively integrating global and local information. Finally, a multi-kernel convolution module is applied at the end of the decoder to refine local details. Experimental results on the NYU Depth V2 indoor scene dataset demonstrate that the proposed algorithm significantly improves depth-prediction accuracy, achieving a root mean square error of only 0.361. The generated depth maps exhibit improved continuity and detail representation.
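    The paper does not publish code, but the gated adaptive aggregation idea sketched in the abstract (a learned per-pixel gate that blends a global feature map with a local one during decoding) can be illustrated with a minimal, hypothetical NumPy sketch. All names (`gated_fusion`, the weight `w`, bias `b`) are assumptions for illustration, not the authors' implementation; the 1×1-convolution gate is approximated here by a per-pixel linear map followed by a sigmoid.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gated_fusion(global_feat, local_feat, w, b):
    """Blend global and local feature maps with a learned per-pixel gate.

    global_feat, local_feat: arrays of shape (C, H, W)
    w: gate weights of shape (C, 2C), approximating a 1x1 convolution
    b: gate bias of shape (C,)
    """
    # Concatenate along the channel axis: (2C, H, W)
    concat = np.concatenate([global_feat, local_feat], axis=0)
    # Per-pixel linear map + sigmoid -> gate values in (0, 1)
    gate = sigmoid(np.einsum('oc,chw->ohw', w, concat) + b[:, None, None])
    # Convex combination: gate weights the global branch, (1 - gate) the local one
    return gate * global_feat + (1.0 - gate) * local_feat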


    Zengyu Tian, Changku Sun, Yue Li, Luhua Fu, Peng Wang. Indoor Monocular Depth Estimation Based on Global-Local Feature Fusion[J]. Laser & Optoelectronics Progress, 2025, 62(14): 1412004

    Paper Information

    Category: Instrumentation, Measurement and Metrology

    Received: Jan. 2, 2025

    Accepted: Feb. 25, 2025

    Published Online: Jul. 16, 2025

    The Author Email: Peng Wang (wang_peng@tju.edu.cn)

    DOI: 10.3788/LOP250436

    CSTR: 32186.14.LOP250436
