Efficient Monocular Image Depth Estimation Based on Transfer Learning

Recovering depth in three-dimensional space from two-dimensional images is a fundamental task in computer vision applications such as three-dimensional reconstruction and scene understanding. Deep learning methods that achieve high accuracy on this task often require very large amounts of training data, which are usually complicated and expensive to acquire. To address this problem, this paper proposes an encoder-decoder network based on transfer learning that uses global self-attention. The network takes a single image as input and has a global receptive field at every encoding stage. After decoding, the depth regression task is transformed into a classification task, which greatly reduces the amount of training data required while maintaining model accuracy. Experimental results show that, compared with the state-of-the-art depth estimation networks AdaBins and DPT-Hybrid, the proposed model reduces the root mean square error by about 2.2% and 0.3%, respectively, and reduces the amount of training data by about 80% and 99.6%, respectively.
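The abstract does not say how the regression task is recast as classification, so the following is only a minimal sketch of one common approach, not the authors' implementation. It assumes uniform depth bins over a fixed range, a per-pixel softmax over those bins, and a final depth taken as the probability-weighted mean of the bin centers. The function names, the bin count, and the depth range are all hypothetical choices made for illustration. The sketch also includes the root mean square error used to compare models.

```python
import numpy as np


def bin_centers(d_min, d_max, n_bins):
    """Centers of n_bins uniform depth bins on [d_min, d_max] (assumed layout)."""
    edges = np.linspace(d_min, d_max, n_bins + 1)
    return 0.5 * (edges[:-1] + edges[1:])


def depth_to_class(depth, d_min, d_max, n_bins):
    """Map continuous depths to integer bin indices (classification targets)."""
    depth = np.asarray(depth, dtype=float)
    scaled = (depth - d_min) / (d_max - d_min) * n_bins
    # Floor to a bin index, then clip so d_max and out-of-range values stay valid.
    return np.clip(np.floor(scaled).astype(np.int64), 0, n_bins - 1)


def softmax(logits, axis=-1):
    """Numerically stable softmax along the given axis."""
    shifted = logits - logits.max(axis=axis, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=axis, keepdims=True)


def logits_to_depth(logits, centers):
    """Turn per-pixel bin logits (..., n_bins) into a continuous depth estimate."""
    probs = softmax(logits, axis=-1)
    # Expected depth: probability-weighted mean of the bin centers.
    return probs @ centers


def rmse(pred, target):
    """Root mean square error between predicted and reference depths."""
    pred = np.asarray(pred, dtype=float)
    target = np.asarray(target, dtype=float)
    return float(np.sqrt(np.mean((pred - target) ** 2)))
```

Taking the expected value over bins rather than the single most likely bin keeps the output continuous, which avoids visible quantization steps in the predicted depth map. Whether the paper uses uniform or adaptive bins is not stated in the abstract.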

Keywords

depth estimation; imaging systems; monocular vision; self-attention mechanism; transfer learning

Citation

Jiatao Liu, Yaping Zhang, Yuwei Yang. Efficient Monocular Image Depth Estimation Based on Transfer Learning[J]. Laser & Optoelectronics Progress, 2022, 59(16): 1611002

Paper Information

Category: Imaging Systems

Received: Jul. 30, 2021

Accepted: Sep. 24, 2021

Published Online: Aug. 8, 2022

Author Email: Zhang Yaping (zhangyp@ynnu.edu.cn)

DOI: 10.3788/LOP202259.1611002
