Computer Applications and Software, Vol. 42, Issue 4, 189 (2025)

PMKBQA: A MULTIMODAL DOMAIN KNOWLEDGE QUESTION ANSWERING METHOD BASED ON PATH SELECTION

Wang Xiang, Li Yanchao, and Zhang Xiaoming
Author Affiliations
  • School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang 050000, Hebei, China
    Wang Xiang, Li Yanchao, Zhang Xiaoming. PMKBQA: A MULTIMODAL DOMAIN KNOWLEDGE QUESTION ANSWERING METHOD BASED ON PATH SELECTION[J]. Computer Applications and Software, 2025, 42(4): 189

    Paper Information

    Received: Dec. 20, 2021

    Accepted: Aug. 25, 2025

    Published Online: Aug. 25, 2025

    DOI:10.3969/j.issn.1000-386x.2025.04.028
