Optics and Precision Engineering, Vol. 31, Issue 12, 1859 (2023)
Few-shot object detection on Thangka via multi-scale context information
Classifying and locating objects of interest in Thangka images helps people understand the rich semantic information of Thangka and promotes cultural inheritance. Detection on Thangka images is hampered by scarce training samples, complex backgrounds, occluded targets, and low detection accuracy. To address these problems, this paper proposes a few-shot object detection algorithm for Thangka images that combines multi-scale context information with dual attention guidance. First, a new multi-scale feature pyramid is constructed to learn the multi-level features and contextual information of Thangka images, improving the model's ability to discriminate targets at different scales. Second, a dual attention guidance module is added at the end of the feature pyramid to strengthen the model's representation of key features while reducing the impact of noise. Finally, Rank&Sort Loss replaces the cross-entropy classification loss, which simplifies model training and increases detection accuracy. In 10-shot experiments, the proposed method achieved a mean average precision of 19.7% on a Thangka dataset and 11.2% on the COCO dataset.
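The results above are reported as mean average precision (mAP). As a reading aid, the following is a minimal sketch of how average precision can be computed for one class at a single IoU threshold. It is not the authors' evaluation code. The function names `iou` and `average_precision`, the `(x1, y1, x2, y2)` box format, and the 0.5 threshold are illustrative assumptions, and the abstract does not state which mAP protocol was used. The standard COCO metric, for example, averages AP over IoU thresholds from 0.50 to 0.95 and over all classes.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def average_precision(detections, ground_truths, iou_threshold=0.5):
    """AP for one class (assumed simplification: a single image).

    detections: list of (score, box); ground_truths: list of boxes.
    """
    if not ground_truths:
        return 0.0

    # Greedily match detections, highest score first, to the best
    # still-unmatched ground-truth box.
    matched = set()
    hits = []
    for _, box in sorted(detections, key=lambda d: d[0], reverse=True):
        best_iou, best_idx = 0.0, None
        for idx, gt in enumerate(ground_truths):
            if idx in matched:
                continue
            overlap = iou(box, gt)
            if overlap > best_iou:
                best_iou, best_idx = overlap, idx
        if best_idx is not None and best_iou >= iou_threshold:
            matched.add(best_idx)
            hits.append(True)
        else:
            hits.append(False)

    # Precision and recall after each ranked detection.
    precisions, recalls = [], []
    true_positives = 0
    for rank, hit in enumerate(hits, start=1):
        true_positives += int(hit)
        precisions.append(true_positives / rank)
        recalls.append(true_positives / len(ground_truths))

    # Interpolate: make precision non-increasing as recall grows.
    for i in range(len(precisions) - 2, -1, -1):
        precisions[i] = max(precisions[i], precisions[i + 1])

    # Area under the interpolated precision-recall curve.
    ap, prev_recall = 0.0, 0.0
    for p, r in zip(precisions, recalls):
        ap += (r - prev_recall) * p
        prev_recall = r
    return ap
```

Averaging this quantity over all object classes (and, under the COCO protocol, over IoU thresholds) gives a mean average precision figure of the kind quoted above.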
Wenjin HU, Huiyuan TANG, Chaoyang YUE, Huafei SONG. Few-shot object detection on Thangka via multi-scale context information[J]. Optics and Precision Engineering, 2023, 31(12): 1859
Category: Information Sciences
Received: Aug. 22, 2022
Accepted: --
Published Online: Jul. 25, 2023
The Author Email: HU Wenjin (wenjin_zhm@126.com)