Lightweight Unsupervised Monocular Depth Estimation Framework Using Attention Mechanisms

Xiyu Li; Yilihamu Yaermaimaiti; Lirong Xie; Shuoqi Cheng

doi:10.3788/LOP241688

Laser & Optoelectronics Progress, Volume. 62, Issue 8, 0811005(2025)

Author Affiliations

College of Electrical Engineering, Xinjiang University, Urumqi 830017, Xinjiang, China

Abstract

To overcome the inherent limitations of existing unsupervised monocular depth estimation frameworks and enhance the network's generalizability across various scenarios, a lightweight unsupervised monocular depth estimation method that combines convolutional neural networks, attention mechanisms, and speeded-up robust features (SURF) is proposed. First, a residual block with a linear self-attention mechanism (CCT-Block) and a residual block with a coordinate attention mechanism (CA-Block) were designed. These residual blocks were alternately used within the residual network framework to construct a multiscale encoder capable of capturing rich contextual information and mapping the relationship between the depth and image features while reducing the requirements for parameter computation and storage. In addition, the reprojection error of SURF was introduced to mitigate ambiguities that may arise in depth and pose estimation networks. Finally, evaluations were conducted on multiple datasets, including KITTI, Make3D, NYUDepth-v2, and Cityscapes. The experimental results show that the proposed method achieves an absolute relative error of 0.107 and a root mean square error of 4.674 on the KITTI dataset using only 4.9×10⁶ model parameters. Furthermore, the proposed method exhibits strong generalizability across different datasets.
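The two KITTI figures quoted above, absolute relative error and root mean square error, are standard depth-evaluation metrics. The following is a minimal sketch of their usual definitions, not the authors' evaluation code: the function names `abs_rel_error` and `rmse` are hypothetical, and the paper's exact protocol (depth cap, cropping, scale alignment) is not stated in the abstract, so the sketch only assumes that pixels without valid ground truth (depth not greater than zero) are skipped.

```python
import math
from typing import List, Sequence, Tuple


def _valid_pairs(pred: Sequence[float], gt: Sequence[float]) -> List[Tuple[float, float]]:
    """Keep only (prediction, ground truth) pairs whose ground-truth depth is positive."""
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth depths")
    return pairs


def abs_rel_error(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Absolute relative error: mean of |pred - gt| / gt over valid pixels."""
    pairs = _valid_pairs(pred, gt)
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def rmse(pred: Sequence[float], gt: Sequence[float]) -> float:
    """Root mean square error: sqrt of the mean squared depth difference over valid pixels."""
    pairs = _valid_pairs(pred, gt)
    return math.sqrt(sum((p - g) ** 2 for p, g in pairs) / len(pairs))


if __name__ == "__main__":
    # Toy depths (illustrative values only, not data from the paper).
    predicted = [2.0, 4.0, 10.0]
    ground_truth = [2.0, 5.0, 8.0]
    print(abs_rel_error(predicted, ground_truth))
    print(rmse(predicted, ground_truth))
```

Absolute relative error is dimensionless, while RMSE carries the units of the ground-truth depth, which is why the two reported values differ so much in magnitude.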

Keywords

attention mechanism; depth estimation; machine vision; reprojection error; unsupervised learning

Xiyu Li, Yilihamu Yaermaimaiti, Lirong Xie, Shuoqi Cheng. Lightweight Unsupervised Monocular Depth Estimation Framework Using Attention Mechanisms[J]. Laser & Optoelectronics Progress, 2025, 62(8): 0811005
