Chinese Journal of Liquid Crystals and Displays, Vol. 40, Issue 8, 1233 (2025)
Hierarchical video summarization algorithm based on independent recurrent neural networks
Existing video summarization algorithms often fail to preserve the integrity of activities because they neglect shot-level information, and traditional RNNs and LSTMs are ineffective at capturing long-range dependencies in lengthy videos. To address these limitations, this paper proposes HIRVS, a hierarchical video summarization algorithm based on independent recurrent neural networks (IndRNN) that leverages the inherent hierarchical structure of video sequences. HIRVS has three components: (1) an IndRNN generates a visual feature for each shot, where the final hidden state represents a temporally weighted aggregation of all frame features within that shot; (2) a bidirectional IndRNN models the temporal relationships in the shot-level feature sequence, capturing long-range dependencies between shots; (3) a self-attention video encoder extracts global dependencies across the entire video. Key shots are then selected according to their predicted importance scores to generate the video summary. Experiments are conducted on two public datasets, SumMe and TvSum. On SumMe, HIRVS achieves an F-score of 51.0%, a 1.2% improvement over VOGNet. On TvSum, it achieves an F-score of 61.3%, surpassing the current state-of-the-art method VJMHT by 0.3%. These results validate the effectiveness of HIRVS for video summarization tasks and demonstrate improved summary generation efficiency.
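The first two components can be illustrated with a minimal pure-Python sketch. It uses the standard IndRNN recurrence h_t = act(W x_t + u ⊙ h_{t-1} + b), in which each hidden unit has a single scalar recurrent weight. This is not the authors' implementation: the ReLU activation, the function names, the toy weights, and the concatenation of forward and backward states are all assumptions made for illustration, and the self-attention encoder and key-shot selection are omitted.

```python
from typing import List, Tuple

Vector = List[float]
Matrix = List[Vector]


def indrnn_step(x: Vector, h_prev: Vector, W: Matrix, u: Vector, b: Vector) -> Vector:
    """One IndRNN step: h_t = relu(W x_t + u * h_{t-1} + b).

    The recurrent weight u is a vector applied element-wise, so each
    hidden unit depends only on its own previous value.
    ReLU is an assumption; the abstract does not name the activation.
    """
    return [
        max(0.0, sum(w * xj for w, xj in zip(W[i], x)) + u[i] * h_prev[i] + b[i])
        for i in range(len(u))
    ]


def indrnn_states(seq: List[Vector], W: Matrix, u: Vector, b: Vector) -> List[Vector]:
    """Run the IndRNN over a sequence and return every hidden state."""
    h = [0.0] * len(u)
    states = []
    for x in seq:
        h = indrnn_step(x, h, W, u, b)
        states.append(h)
    return states


def encode_shots(frames: List[Vector], shot_bounds: List[Tuple[int, int]],
                 W: Matrix, u: Vector, b: Vector) -> List[Vector]:
    """Component (1): one feature per shot, taken as the final hidden state.

    shot_bounds holds hypothetical (start, end) frame indices, end exclusive.
    """
    return [indrnn_states(frames[s:e], W, u, b)[-1] for s, e in shot_bounds]


def bidirectional_indrnn(shots: List[Vector], W: Matrix, u: Vector, b: Vector) -> List[Vector]:
    """Component (2): forward and backward passes over the shot features.

    For brevity both directions share weights here; concatenating the two
    states per shot is an assumed way of combining them.
    """
    fwd = indrnn_states(shots, W, u, b)
    bwd = indrnn_states(shots[::-1], W, u, b)[::-1]
    return [f + g for f, g in zip(fwd, bwd)]


# Toy data: five 2-D frame features split into two shots.
frames = [[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0], [0.0, 1.0]]
shot_bounds = [(0, 2), (2, 5)]
W = [[1.0, 0.0], [0.0, 1.0]]   # identity input weights (illustrative)
u = [0.5, 0.5]                 # per-unit recurrent weights (illustrative)
b = [0.0, 0.0]

shot_features = encode_shots(frames, shot_bounds, W, u, b)
context = bidirectional_indrnn(shot_features, W, u, b)
```

With these toy weights and non-negative inputs, the final hidden state of a shot equals the sum of its frame features weighted by powers of `u`, so later frames contribute more than earlier ones. This is one concrete reading of "temporally weighted aggregation"; a trained model would learn `W`, `u`, and `b` rather than fix them.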
Xiwei REN, Yan LIU, Man XIAO, Shiduan JIA, Rui WANG, Lifeng HE. Hierarchical video summarization algorithm based on independent recurrent neural networks[J]. Chinese Journal of Liquid Crystals and Displays, 2025, 40(8): 1233
Received: Apr. 12, 2025
Published Online: Sep. 25, 2025
The Author Email: Xiwei REN (renxiwei@sust.edu.cn)