PMKBQA: A MULTIMODAL DOMAIN KNOWLEDGE QUESTION ANSWERING METHOD BASED ON PATH SELECTION

Wang Xiang; Li Yanchao; Zhang Xiaoming

doi:10.3969/j.issn.1000-386x.2025.04.028

Computer Applications and Software, Volume 42, Issue 4, 189 (2025)

School of Information Science and Engineering, Hebei University of Science and Technology, Shijiazhuang 050000, Hebei, China

Wang Xiang, Li Yanchao, Zhang Xiaoming. PMKBQA: A MULTIMODAL DOMAIN KNOWLEDGE QUESTION ANSWERING METHOD BASED ON PATH SELECTION[J]. Computer Applications and Software, 2025, 42(4): 189

Paper Information

Received: Dec. 20, 2021

Accepted: Aug. 25, 2025

Published Online: Aug. 25, 2025
