Laser & Optoelectronics Progress, Vol. 58, Issue 16, 1615001 (2021)
Vision-Language Navigation Algorithm Based on Cosine Similarity
This paper proposes a vision-language navigation algorithm based on cosine similarity, built on the Regretful model, to address the low navigation accuracy and weak generalization ability seen in vision-language navigation tasks. A cosine similarity loss function is added to guide neural network learning and the prediction of navigation direction. This reduces the differences among intraclass features in feature space and widens the distribution range of interclass features, which improves the navigation accuracy of the model when no search strategy is used. In addition, a feature-smoothing method for panoramic views is proposed to augment the data and improve the generalization performance of the model. Experimental results show that the algorithm improves navigation accuracy and other model metrics on the R2R (Room-to-Room) dataset. It also outperforms the Regretful model, which supports the superiority and robustness of the proposed method.
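The abstract does not give the exact form of the loss, so the following is only a minimal sketch of a generic cosine similarity loss, assuming the common form 1 - cos(u, v) between a predicted direction feature and a target direction feature. The function names (`cosine_similarity`, `cosine_loss`, `pick_direction`) and the toy vectors are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(u, v, eps=1e-8):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # eps guards against division by zero for all-zero vectors.
    return dot / max(norm_u * norm_v, eps)


def cosine_loss(predicted, target):
    """Assumed loss form: 0 when the vectors align, 2 when they are opposite."""
    return 1.0 - cosine_similarity(predicted, target)


def pick_direction(predicted, candidates):
    """Index of the candidate direction feature most similar to the prediction."""
    scores = [cosine_similarity(predicted, c) for c in candidates]
    return max(range(len(candidates)), key=scores.__getitem__)


# Toy 2-D example (hypothetical features, not from the R2R dataset).
candidates = [[0.0, 1.0], [1.0, 0.0], [-1.0, 0.0]]
predicted = [1.0, 0.1]
chosen = pick_direction(predicted, candidates)
loss = cosine_loss(predicted, candidates[chosen])
```

Minimizing a loss of this shape pulls the predicted feature toward the direction of its target and depends only on the angle between them, not on their magnitudes, which is consistent with the abstract's claim of tighter intraclass features.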
Jie Jin, Kaiyan Liu, Shunkao Zha. Vision-Language Navigation Algorithm Based on Cosine Similarity[J]. Laser & Optoelectronics Progress, 2021, 58(16): 1615001
Category: Machine Vision
Received: Oct. 12, 2020
Accepted: Dec. 6, 2020
Published Online: Aug. 19, 2021
Author Email: Shunkao Zha (zhashunkao@gmail.com)