Laser & Optoelectronics Progress, Vol. 59, Issue 16, 1611002 (2022)
Efficient Monocular Image Depth Estimation Based on Transfer Learning
Recovering depth information in three-dimensional space from two-dimensional images is a basic step in computer vision tasks such as three-dimensional reconstruction and scene understanding. Current deep learning methods that achieve higher accuracy on this task often require a huge amount of training data, and acquiring these data is usually complicated and expensive. To address this problem, this paper proposes an encoder-decoder network that builds on transfer learning and uses global self-attention. The network takes a single image as input and has a global receptive field at every stage of encoding. After decoding, the depth regression task is transformed into a classification task, which greatly reduces the amount of training data required while preserving the accuracy of the model. The experimental results show that, compared with the current state-of-the-art depth estimation networks AdaBins and DPT-Hybrid, the designed model reduces the root mean square error by about 2.2% and 0.3%, respectively, and reduces the amount of training data by about 80% and 99.6%, respectively.
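The abstract says only that depth regression is recast as a classification task after decoding; it does not give the binning scheme or how a depth value is read back from the class scores. The following is a minimal sketch of one common way to do this, assuming uniform depth bins and a softmax-weighted average of bin centers. The function names, the bin count, and the depth range are hypothetical and are not taken from the paper.

```python
import numpy as np

def make_bin_centers(d_min, d_max, n_bins):
    """Centers of n_bins uniform depth bins over [d_min, d_max] (assumed scheme)."""
    edges = np.linspace(d_min, d_max, n_bins + 1)
    return 0.5 * (edges[:-1] + edges[1:])

def depth_to_class(depth, d_min, d_max, n_bins):
    """Map ground-truth depths to integer bin indices (classification targets)."""
    width = (d_max - d_min) / n_bins
    idx = np.floor((depth - d_min) / width).astype(int)
    # Depths at or beyond d_max fall into the last bin.
    return np.clip(idx, 0, n_bins - 1)

def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)

def class_scores_to_depth(logits, centers):
    """Turn per-pixel class scores (..., n_bins) into depths via the expected bin center."""
    probs = softmax(logits, axis=-1)
    return probs @ centers

# Hypothetical setup: 10 bins over 0 to 10 m.
centers = make_bin_centers(0.0, 10.0, 10)
targets = depth_to_class(np.array([0.0, 2.3, 10.0]), 0.0, 10.0, 10)
```

Reading depth out as a probability-weighted average, rather than taking the arg-max bin, keeps the prediction continuous even though training uses discrete classes; whether the paper does the same is not stated in the abstract.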
Jiatao Liu, Yaping Zhang, Yuwei Yang. Efficient Monocular Image Depth Estimation Based on Transfer Learning[J]. Laser & Optoelectronics Progress, 2022, 59(16): 1611002
Category: Imaging Systems
Received: Jul. 30, 2021
Accepted: Sep. 24, 2021
Published Online: Aug. 8, 2022
Author Email: Yaping Zhang (zhangyp@ynnu.edu.cn)