Optics and Precision Engineering, Vol. 29, Issue 12, 2944 (2021)
Image caption of space science experiment based on multi-modal learning
Pei-zhuo LI, Xue WAN, Sheng-yang LI. Image caption of space science experiment based on multi-modal learning[J]. Optics and Precision Engineering, 2021, 29(12): 2944
Category: Information Sciences
Received: Apr. 29, 2021
Accepted: --
Published Online: Jan. 20, 2022
The Author Email: Xue WAN (wanxue@csu.ac.cn)