Journal of Beijing Normal University, Vol. 61, Issue 3, 277 (2025)
A code-switching-based approach for low-resource language visual question answering
To address the challenges that vision-language models face in low-resource scenarios, such as the lack of large-scale annotated data and of effective transfer methods, a code-switching Chinese minority pre-trained language model visual question answering (CCMPLM-VQA) method is proposed in this work. A cross-lingual masked modeling approach based on code-switching reduces the model's dependence on annotated training data, and a language adapter (LA) with a novel structure is introduced to improve the multimodal alignment of CCMPLM-VQA. Experiments verify the effectiveness of the proposed method: compared with the best benchmark model, CCMPLM-VQA improves zero-shot performance on a real-world general visual reasoning dataset by approximately 12%, and its zero-shot performance on cross-lingual real-world general visual reasoning datasets also outperforms existing methods by about 1%.
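As a rough illustration of the cross-lingual masked modeling described above, the following Python sketch pairs dictionary-based code-switching with standard masked-token corruption. The lexicon entries, switching and masking probabilities, and function names are hypothetical assumptions for illustration; the abstract does not specify the paper's actual languages, lexicon, or implementation.

```python
import random

# Illustrative Chinese -> minority-language lexicon (hypothetical entries);
# the paper's actual lexicon and language pair are not specified here.
LEXICON = {"狗": "khyi", "红色": "dmar po", "是": "red"}

def code_switch(tokens, switch_prob=0.3):
    """Randomly substitute words with their cross-lingual translations,
    yielding a mixed-language (code-switched) sentence."""
    return [LEXICON[t] if t in LEXICON and random.random() < switch_prob else t
            for t in tokens]

def mask_for_mlm(tokens, mask_prob=0.15, mask_token="[MASK]"):
    """Masked-language-modeling corruption: hide a fraction of tokens so the
    model must recover them from the surrounding bilingual context."""
    inputs, labels = [], []
    for t in tokens:
        if random.random() < mask_prob:
            inputs.append(mask_token)
            labels.append(t)      # loss is computed only at masked positions
        else:
            inputs.append(t)
            labels.append(None)   # ignored by the loss
    return inputs, labels

sentence = ["这只", "狗", "是", "红色", "的"]
mixed = code_switch(sentence)            # e.g. ["这只", "khyi", "是", "红色", "的"]
inputs, labels = mask_for_mlm(mixed)
print(inputs, labels)
```

Training on such masked, code-switched inputs pushes the model to predict tokens from mixed-language context, which is one way dependence on annotated low-resource training data can be reduced.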
LIU Zheng, DONG Jun, JIALE Dongzhu, CHAOMU Rilige, LIU Xuan, WENG Yu. A code-switching-based approach for low-resource language visual question answering[J]. Journal of Beijing Normal University, 2025, 61(3): 277
Received: Apr. 9, 2025
Accepted: Aug. 21, 2025
Published Online: Aug. 21, 2025
Author email: WENG Yu (wengyu@muc.edu.cn)