Opto-Electronic Engineering, Volume 49, Issue 7, 210429 (2022)

Interactive instance proposal network for HOI detection

Lixia Xue, Kaijian Yin, Ronggui Wang and Juan Yang*
Author Affiliations
  • School of Computer and Information, Hefei University of Technology, Hefei, Anhui 230031, China
    References(38)

    [1] Ren S Q, He K M, Girshick R, et al. Faster R-CNN: towards real-time object detection with region proposal networks[C]//Proceedings of the 28th International Conference on Neural Information Processing Systems, 2015: 91–99.

    [2] Girshick R. Fast R-CNN[C]//Proceedings of the IEEE International Conference on Computer Vision, 2015: 1440–1448.

    [3] Yang C Y, Xu Y H, Shi J P, et al. Temporal pyramid network for action recognition[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: 588–597.

    [4] Li M S, Chen S H, Chen X, et al. Actional-structural graph convolutional networks for skeleton-based action recognition[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 3590–3598.

    [5] Kirillov A, He K M, Girshick R, et al. Panoptic segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 9396–9405.

    [6] Sofiiuk K, Barinova O, Konushin A. AdaptIS: adaptive instance selection network[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019: 7354–7362.

    [7] Gao C, Xu J R, Zou Y L, et al. DRG: dual relation graph for human-object interaction detection[C]//16th European Conference on Computer Vision, 2020: 696–712.

    [8] Gao C, Zou Y L, Huang J B. iCAN: instance-centric attention network for human-object interaction detection[C]//British Machine Vision Conference 2018, 2018.

    [9] Chao Y W, Liu Y F, Liu X Y, et al. Learning to detect human-object interactions[C]//2018 IEEE Winter Conference on Applications of Computer Vision (WACV), 2018: 381–389.

    [10] Hou Z, Peng X J, Qiao Y, et al. Visual compositional learning for human-object interaction detection[C]//16th European Conference on Computer Vision, 2020: 584–600.

    [11] Zhou P H, Chi M M. Relation parsing neural network for human-object interaction detection[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019: 843–851.

    [12] Kim B, Lee J, Kang J, et al. HOTR: end-to-end human-object interaction detection with transformers[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 74–83.

    [13] Zhang A X, Liao Y, Liu S, et al. Mining the benefits of two-stage and one-stage HOI detection[C]//Proceedings of the Thirty-Fifth Conference on Neural Information Processing Systems, 2021.

    [14] Zou C, Wang B H, Hu Y, et al. End-to-end human object interaction detection with HOI transformer[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 11820–11829.

    [15] Chen M F, Liao Y, Liu S, et al. Reformulating HOI detection as adaptive set prediction[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2021: 9000–9009.

    [16] Kamath A, Clark C, Gupta T, et al. Webly supervised concept expansion for general purpose vision models[Z]. arXiv: 2202.02317, 2022. https://arxiv.org/abs/2202.02317v1.

    [17] Li Z M, Zou C, Zhao Y, et al. Improving human-object interaction detection via phrase learning and label composition[Z]. arXiv: 2112.07383, 2021. https://doi.org/10.48550/arXiv.2112.07383.

    [19] Li Y L, Zhou S Y, Huang X J, et al. Transferable interactiveness knowledge for human-object interaction detection[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 3580–3589.

    [20] Yang J W, Lu J S, Lee S, et al. Graph R-CNN for scene graph generation[C]//Proceedings of the 15th European Conference on Computer Vision (ECCV), 2018: 690–706.

    [21] Chen T S, Yu W H, Chen R Q, et al. Knowledge-embedded routing network for scene graph generation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 6156–6164.

    [23] Liang W X, Jiang Y H, Liu Z X. GraphVQA: language-guided graph neural networks for graph-based visual question answering[Z]. arXiv: 2104.10283, 2021. https://arxiv.org/abs/2104.10283v2.

    [24] Qi S Y, Wang W G, Jia B X, et al. Learning human-object interactions by graph parsing neural networks[C]//Proceedings of the 15th European Conference on Computer Vision (ECCV), 2018: 407–423.

    [25] Xu B J, Wong Y K, Li J N, et al. Learning to detect human-object interactions with knowledge[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2019: 2019–2028.

    [26] Zheng S P, Chen S Z, Jin Q. Skeleton-based interactive graph network for human object interaction detection[C]//2020 IEEE International Conference on Multimedia and Expo (ICME), 2020: 1–6.

    [27] Shen L Y, Yeung S, Hoffman J, et al. Scaling human-object interaction recognition through zero-shot learning[C]//2018 IEEE Winter Conference on Applications of Computer Vision (WACV), 2018: 1568–1576.

    [28] Wang S C, Yap K H, Yuan J S, et al. Discovering human interactions with novel objects via zero-shot learning[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2020: 11649–11658.

    [29] Fang H S, Xie Y C, Shao D, et al. DecAug: augmenting HOI detection via decomposition[C]//Proceedings of the 35th AAAI Conference on Artificial Intelligence, 2021: 1300–1308.

    [30] Sarullo A, Mu T T. Zero-shot human-object interaction recognition via affordance graphs[Z]. arXiv: 2009.01039, 2020. https://doi.org/10.48550/arXiv.2009.01039.

    [31] Wan B, Zhou D S, Liu Y F, et al. Pose-aware multi-level feature network for human object interaction detection[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019: 9468–9477.

    [32] Peyre J, Sivic J, Laptev I, et al. Detecting unseen visual relations using analogies[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision, 2019: 1981–1990.

    [33] Liu Y, Chen Q C, Zisserman A. Amplifying key cues for human-object-interaction detection[C]//16th European Conference on Computer Vision, 2020: 248–265.

    [34] Zhang F Z, Campbell D, Gould S. Spatio-attentive graphs for human-object interaction detection[Z]. arXiv: 2012.06060, 2020. https://arxiv.org/abs/2012.06060v1.

    [35] Lin T Y, Dollár P, Girshick R, et al. Feature pyramid networks for object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2017: 936–944.

    [36] He K M, Zhang X Y, Ren S Q, et al. Deep residual learning for image recognition[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 2016: 770–778.

    [37] Chen L, Zhang H W, Xiao J, et al. Zero-shot visual recognition using semantics-preserving adversarial embedding networks[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition, 2018: 1043–1052.

    [38] Pennington J, Socher R, Manning C D. GloVe: global vectors for word representation[C]//Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), 2014: 1532–1543.

    [39] Gupta S, Malik J. Visual semantic role labeling[Z]. arXiv: 1505.04474, 2015. https://arxiv.org/abs/1505.04474v1.

    [40] Lin T Y, Maire M, Belongie S, et al. Microsoft COCO: common objects in context[C]//13th European Conference on Computer Vision, 2014: 740–755.

    Citation

    Lixia Xue, Kaijian Yin, Ronggui Wang, Juan Yang. Interactive instance proposal network for HOI detection[J]. Opto-Electronic Engineering, 2022, 49(7): 210429

    Paper Information

    Category: Article

    Received: Jan. 10, 2022

    Accepted: --

    Published Online: Aug. 1, 2022

    The Author Email: Yang Juan (yangjuan6985@163.com)

    DOI:10.12086/oee.2022.210429
