Optics and Precision Engineering, Vol. 31, Issue 4, 552 (2023)
Combining residual shrinkage and spatio-temporal context for behavior detection network
Zhong HUANG, Mengyuan TAO, Min HU, Juan LIU, Shengbao ZHAN. Combining residual shrinkage and spatio-temporal context for behavior detection network[J]. Optics and Precision Engineering, 2023, 31(4): 552
Category: Information Sciences
Received: May 16, 2022
Accepted: --
Published Online: Mar. 7, 2023
Author Email: Zhong HUANG (huangzhong3315@163.com)