Video Summarization Algorithm Based on Improved Fully Convolutional Network

Hao Wang, Li Peng. Video Summarization Algorithm Based on Improved Fully Convolutional Network[J]. Laser & Optoelectronics Progress, 2021, 58(22): 2215008

Paper Information

Category: Machine Vision

Received: Mar. 15, 2021

Accepted: Jul. 15, 2021

Published Online: Nov. 10, 2021

Author Email: Li Peng (penglimail2002@163.com)

DOI:10.3788/LOP202158.2215008
