Optoelectronics Letters, Vol. 17, Issue 6, 361 (2021)

A lightweight convolutional neural network for large-scale Chinese image caption

Dexin ZHAO, Ruixue YANG*, and Shutao GUO
Author Affiliations
  • Tianjin Key Laboratory of Intelligence Computing and Novel Software Technology, Tianjin University of Technology, Tianjin 300384, China

    Image captioning is a high-level task in the area of image understanding, in which most models adopt a convolutional neural network (CNN) to extract image features and a recurrent neural network (RNN) to generate sentences. In recent years, researchers have tended to design complex networks with deeper layers to improve feature extraction. Increasing the size of the network can yield higher-quality features, but it is inefficient in terms of computational cost, and the large number of parameters such CNNs require makes this research difficult to apply in daily life. In order to reduce the information loss of the convolutional process at lower cost, we propose a lightweight convolutional neural network, named Bifurcate-CNN (B-CNN). Furthermore, while most recent works are devoted to generating captions in English, in this paper we develop an image caption model that generates descriptions in Chinese. Compared with Inception-v3, our model is shallower, has fewer parameters, and incurs a lower computational cost. Evaluated on the AI CHALLENGER dataset, we show that our model enhances performance, improving BLEU-4 from 46.1 to 49.9 and CIDEr from 142.5 to 156.6.
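    Below is a minimal PyTorch sketch of the generic encoder-decoder pipeline the abstract describes: a CNN condenses the image into a feature vector, and an RNN, conditioned on that vector, generates the caption token by token. Layer sizes, class names, and the vocabulary size are illustrative assumptions, not the paper's B-CNN architecture.

        import torch
        import torch.nn as nn

        class CNNEncoder(nn.Module):
            """Toy convolutional encoder standing in for the image-feature CNN."""
            def __init__(self, feat_dim=256):
                super().__init__()
                self.conv = nn.Sequential(
                    nn.Conv2d(3, 32, 3, stride=2, padding=1), nn.ReLU(),
                    nn.Conv2d(32, 64, 3, stride=2, padding=1), nn.ReLU(),
                    nn.AdaptiveAvgPool2d(1),       # global average pool -> one vector
                )
                self.fc = nn.Linear(64, feat_dim)

            def forward(self, images):             # images: (B, 3, H, W)
                return self.fc(self.conv(images).flatten(1))   # (B, feat_dim)

        class RNNDecoder(nn.Module):
            """LSTM language model conditioned on the image feature."""
            def __init__(self, vocab_size, feat_dim=256, hidden=512):
                super().__init__()
                self.embed = nn.Embedding(vocab_size, feat_dim)
                self.lstm = nn.LSTM(feat_dim, hidden, batch_first=True)
                self.out = nn.Linear(hidden, vocab_size)

            def forward(self, feats, captions):    # captions: (B, T) token ids
                # Prepend the image feature as the first step of the input sequence.
                x = torch.cat([feats.unsqueeze(1), self.embed(captions)], dim=1)
                h, _ = self.lstm(x)
                return self.out(h)                 # (B, T+1, vocab_size) logits

        encoder, decoder = CNNEncoder(), RNNDecoder(vocab_size=10000)
        logits = decoder(encoder(torch.randn(2, 3, 224, 224)),
                         torch.randint(0, 10000, (2, 15)))
        print(logits.shape)                        # torch.Size([2, 16, 10000])

    In this conditioning scheme the image feature is fed to the LSTM as if it were the first word, so the decoder's hidden state is initialized with visual information before any caption tokens are consumed; for Chinese captions the token ids would index characters or segmented words rather than English words.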


    ZHAO Dexin, YANG Ruixue, GUO Shutao. A lightweight convolutional neural network for large-scale Chinese image caption[J]. Optoelectronics Letters, 2021, 17(6): 361

    Paper Information

    Received: Jun. 14, 2020

    Accepted: Sep. 1, 2020

    Published Online: Sep. 2, 2021

    The Author Email: Ruixue YANG (yangruixue1995@outlook.com)

    DOI: 10.1007/s11801-021-0100-z
