Optics and Precision Engineering, Vol. 29, Issue 12, 2944 (2021)

Image caption of space science experiment based on multi-modal learning

Pei-zhuo LI, Xue WAN*, and Sheng-yang LI
Author Affiliations
  • Key Laboratory of Space Utilization, Chinese Academy of Sciences, Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Beijing 100094, China

    Citation

    Pei-zhuo LI, Xue WAN, Sheng-yang LI. Image caption of space science experiment based on multi-modal learning[J]. Optics and Precision Engineering, 2021, 29(12): 2944

    Paper Information

    Category: Information Sciences

    Received: Apr. 29, 2021

    Accepted: --

    Published Online: Jan. 20, 2022

    Corresponding author email: WAN Xue (wanxue@csu.ac.cn)

    DOI:10.37188/OPE.2021.0244
