Monocular Depth Estimation Fusing Multi-scale Feature with Semantic Information

Current unsupervised learning methods for monocular image depth estimation produce inaccurate estimates and blurry edges. To solve these problems, an unsupervised monocular depth estimation network that combines multi-scale feature information with semantic information is proposed. The network introduces layer connections from the encoder to the decoder to extract and fuse features at different scales, and adds a semantic layer of multiple parallel dilated convolutions between the encoder and the decoder to enlarge the receptive field and make the results more precise. Training and testing are conducted on the KITTI dataset. The results show that all error metrics are lower than those of current unsupervised learning methods. Prediction accuracy reaches 91%, 96.8% and 98.7% under the three ratio thresholds, respectively, which exceeds that of all the other supervised and unsupervised methods. The improved method produces clearer edges and more distinct depth levels.
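The accuracy figures above are reported "under the three ratio thresholds", which the abstract does not spell out. The sketch below assumes the convention commonly used in KITTI depth evaluations: the fraction of pixels whose ratio max(pred/gt, gt/pred) falls below 1.25, 1.25^2 and 1.25^3. The function names `threshold_accuracy` and `delta_metrics` are hypothetical and are not taken from the paper; depths are passed as flat lists for simplicity.

```python
def threshold_accuracy(pred, gt, threshold):
    """Fraction of valid pixels with max(pred/gt, gt/pred) < threshold.

    Assumption: pixels with non-positive depth are treated as invalid
    and skipped; the abstract does not describe any masking.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    valid = [(p, g) for p, g in zip(pred, gt) if p > 0 and g > 0]
    if not valid:
        raise ValueError("no valid pixels to evaluate")
    hits = sum(1 for p, g in valid if max(p / g, g / p) < threshold)
    return hits / len(valid)


def delta_metrics(pred, gt, base=1.25):
    """Accuracy under the assumed thresholds base, base**2 and base**3."""
    return tuple(threshold_accuracy(pred, gt, base ** k) for k in (1, 2, 3))


# Toy example: ratios are 1.0, 1.2, 1.5 and 1.9 against a constant 10 m depth.
pred = [10.0, 12.0, 15.0, 19.0]
gt = [10.0, 10.0, 10.0, 10.0]
print(delta_metrics(pred, gt))  # (0.5, 0.75, 1.0)
```

Because the ratio is symmetric, over- and under-estimates are penalized alike, and each looser threshold can only raise the score, which matches the increasing sequence of values reported above.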

Keywords

depth estimation; dilated convolution; encoding/decoding structure; multi-scale feature; unsupervised learning

ZHOU Weiqiang, HAN Jun. Monocular Depth Estimation Fusing Multi-scale Feature with Semantic Information[J]. Electronics Optics & Control, 2022, 29(2): 67

Paper Information

Received: Jan. 11, 2021

Published Online: Mar. 4, 2022

DOI:10.3969/j.issn.1671-637x.2022.02.015
