Laser & Optoelectronics Progress, Vol. 62, Issue 8, 0811005 (2025)
Lightweight Unsupervised Monocular Depth Estimation Framework Using Attention Mechanisms
To overcome the inherent limitations of existing unsupervised monocular depth estimation frameworks and enhance the network's generalizability across various scenarios, a lightweight unsupervised monocular depth estimation method that combines convolutional neural networks, attention mechanisms, and speeded-up robust features (SURF) is proposed. First, a residual block with a linear self-attention mechanism (CCT-Block) and a residual block with a coordinate attention mechanism (CA-Block) are designed. These two residual blocks are used alternately within the residual network framework to construct a multiscale encoder that captures rich contextual information and maps the relationship between depth and image features while reducing the parameter count and the associated computation and storage requirements. In addition, the reprojection error of SURF feature points is introduced to mitigate ambiguities that may arise in the depth and pose estimation networks. Finally, evaluations are conducted on multiple datasets, including KITTI, Make3D, NYUDepth-v2, and Cityscapes. The experimental results show that the proposed method achieves an absolute relative error of 0.107 and a root mean square error of 4.674 on the KITTI dataset using only 4.9×10^6 model parameters. Furthermore, the proposed method exhibits strong generalizability across different datasets.
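The two figures reported for KITTI, absolute relative error and root mean square error, are standard depth-evaluation metrics. The abstract does not spell out the evaluation protocol, so the following is only a minimal sketch of the commonly used definitions. It assumes that pixels without ground truth are marked with a depth of 0 and are skipped. The function names `abs_rel` and `rmse` and the sample values are hypothetical, chosen for illustration. Any scale alignment or depth capping the authors may apply is not modeled here.

```python
import math


def _valid_pairs(pred, gt):
    # Assumption: ground-truth depth <= 0 marks a pixel with no measurement.
    pairs = [(p, g) for p, g in zip(pred, gt) if g > 0]
    if not pairs:
        raise ValueError("no valid ground-truth depths")
    return pairs


def abs_rel(pred, gt):
    """Absolute relative error: mean of |pred - gt| / gt over valid pixels."""
    pairs = _valid_pairs(pred, gt)
    return sum(abs(p - g) / g for p, g in pairs) / len(pairs)


def rmse(pred, gt):
    """Root mean square error: sqrt(mean((pred - gt)^2)) over valid pixels."""
    pairs = _valid_pairs(pred, gt)
    return math.sqrt(sum((p - g) ** 2 for p, g in pairs) / len(pairs))


# Hypothetical flattened depth maps; the third pixel has no ground truth.
gt_depth = [10.0, 20.0, 0.0, 40.0]
pred_depth = [11.0, 18.0, 5.0, 40.0]

print("Abs Rel:", abs_rel(pred_depth, gt_depth))
print("RMSE:", rmse(pred_depth, gt_depth))
```

Absolute relative error is dimensionless, whereas RMSE carries the units of the depth values, so the two numbers are not directly comparable with each other.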
Xiyu Li, Yilihamu Yaermaimaiti, Lirong Xie, Shuoqi Cheng. Lightweight Unsupervised Monocular Depth Estimation Framework Using Attention Mechanisms[J]. Laser & Optoelectronics Progress, 2025, 62(8): 0811005
Category: Imaging Systems
Received: Jul. 15, 2024
Accepted: Oct. 12, 2024
Published Online: Apr. 2, 2025
Author Email: Yilihamu Yaermaimaiti (65891080@qq.com)
CSTR:32186.14.LOP241688