Laser & Optoelectronics Progress, Vol. 58, Issue 16, 1615001 (2021)

Vision-Language Navigation Algorithm Based on Cosine Similarity

Jie Jin1, Kaiyan Liu1, and Shunkao Zha2,*
Author Affiliations
  • 1School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
  • 2School of Software Engineering, University of Science and Technology of China, Suzhou, Jiangsu 215123, China

    To address the low navigation accuracy and weak generalization ability seen in vision-language navigation tasks, this paper proposes a vision-language navigation algorithm based on cosine similarity, built on the Regretful model. A cosine similarity loss function is added to guide neural network learning and the prediction of navigation direction. This reduces the differences among intraclass features in feature space, widens the distribution range of interclass features, and improves the navigation accuracy of the model without a search strategy. In addition, a feature-smoothing method for panoramic views is proposed to augment the data and improve the generalization performance of the model. Experimental results on the R2R (Room-to-Room) dataset show that the algorithm improves navigation accuracy and other evaluation metrics and outperforms the Regretful model, confirming the superiority and robustness of the proposed method.
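
The abstract does not give the exact form of the cosine similarity loss, so the following is only a minimal sketch under stated assumptions: the agent produces a query vector, each navigable direction has a candidate feature vector, and the loss is taken to be one minus the cosine similarity between the query and the ground-truth candidate. The names `cosine_similarity_loss`, `predict_direction`, `query`, and `candidates` are hypothetical and are not taken from the paper.

```python
import math


def cosine_similarity(u, v, eps=1e-8):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    # eps guards against division by zero for all-zero vectors.
    return dot / max(norm_u * norm_v, eps)


def cosine_similarity_loss(query, candidates, target_index):
    """Assumed loss form: 1 - cos(query, ground-truth candidate).

    The value is 0 when the query points exactly along the target
    feature and 2 when it points in the opposite direction.
    """
    return 1.0 - cosine_similarity(query, candidates[target_index])


def predict_direction(query, candidates):
    """Pick the candidate direction most similar to the query."""
    sims = [cosine_similarity(query, c) for c in candidates]
    return max(range(len(sims)), key=sims.__getitem__)


# Toy example with three 2-D candidate direction features.
query = [1.0, 0.0]
candidates = [[0.0, 1.0], [2.0, 0.0], [-1.0, 0.0]]

best = predict_direction(query, candidates)
loss_for_best = cosine_similarity_loss(query, candidates, best)
```

Because cosine similarity ignores vector length, minimizing a loss of this kind pulls features of the same direction class toward a common orientation, which is one plausible reading of the reduced intraclass difference described in the abstract.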

    Citation

    Jie Jin, Kaiyan Liu, Shunkao Zha. Vision-Language Navigation Algorithm Based on Cosine Similarity[J]. Laser & Optoelectronics Progress, 2021, 58(16): 1615001

    Paper Information

    Category: Machine Vision

    Received: Oct. 12, 2020

    Accepted: Dec. 6, 2020

    Published Online: Aug. 19, 2021

    Corresponding Author Email: Shunkao Zha (zhashunkao@gmail.com)

    DOI:10.3788/LOP202158.1615001
