Image caption of space science experiment based on multi-modal learning

Pei-zhuo LI; Xue WAN; Sheng-yang LI

doi:10.37188/OPE.2021.0244

Optics and Precision Engineering, Vol. 29, Issue 12, 2944 (2021)

Pei-zhuo LI, Xue WAN^*, and Sheng-yang LI

Key Laboratory of Space Utilization, Chinese Academy of Sciences, Technology and Engineering Center for Space Utilization, Chinese Academy of Sciences, University of Chinese Academy of Sciences, Beijing 100094, China

Pei-zhuo LI, Xue WAN, Sheng-yang LI. Image caption of space science experiment based on multi-modal learning[J]. Optics and Precision Engineering, 2021, 29(12): 2944

Paper Information

Category: Information Sciences

Received: Apr. 29, 2021

Published Online: Jan. 20, 2022

Corresponding Author Email: Xue WAN (wanxue@csu.ac.cn)

DOI:10.37188/OPE.2021.0244
