Laser & Optoelectronics Progress, Vol. 61, Issue 8, 0837015 (2024)

Few-Shot Object Detection Based on Association and Discrimination

Jianli Jia1,2,3, Huiyan Han1,2,3,*, Liqun Kuang1,2,3, Fangzheng Han1,2,3, Xinyi Zheng1,2,3, and Xiuquan Zhang1,2,3
Author Affiliations
  • 1School of Computer Science and Technology, North University of China, Taiyuan 030051, Shanxi, China
  • 2Shanxi Key Laboratory of Machine Vision and Virtual Reality, Taiyuan 030051, Shanxi, China
  • 3Shanxi Province’s Vision Information Processing and Intelligent Robot Engineering Research Center, Taiyuan 030051, Shanxi, China
    Figures & Tables (21)
    Basic structure of Faster R-CNN
    Structure of TFA two-stage fine-tuning method
    Decision boundary of TFA fine-tuning stage
    Decision boundary of association step
    Decision boundary of discrimination step
    Steps of the association stage
    Steps of the discrimination stage
    Dynamic R-CNN. (a) DLA; (b) DSL
    ECA module
    Prediction results of FSAD and TFA. (a) FSAD; (b) TFA
    Prediction results of different algorithms. (a) MPSR; (b) Retentive R-CNN; (c) DiGeo; (d) HTRPN; (e) FSAD
    Coexisting instances (left: semantic similarity; right: visual similarity)
  • Table 1. Experimental parameters and their values

      | Parameter | Learning rate | Momentum | Weight decay | Batch size |
      | Value     | 0.001         | 0.9      | 0.0001       | 16         |
  • Table 2. Number of training iterations under different K values

      | K                    | 1    | 2    | 3     | 5     | 10    |
      | Number of iterations | 4000 | 8000 | 12000 | 16000 | 20000 |
  • Table 3. nAP50 of different methods on PASCAL VOC dataset

      | Method              | Backbone    | Novel Split 1            | Novel Split 2            | Novel Split 3            |
      |                     |             | K=1  K=2  K=3  K=5  K=10 | K=1  K=2  K=3  K=5  K=10 | K=1  K=2  K=3  K=5  K=10 |
      | LSTD[28]            | VGG-16      | 8.2  1.0  12.4 29.1 38.5 | 11.4 3.8  5.0  15.7 31.0 | 12.6 8.5  15.0 27.3 36.3 |
      | YOLOv2-ft[29]       | YOLO V2     | 6.6  10.7 12.5 24.8 38.6 | 12.5 4.2  11.6 16.1 33.9 | 13.0 15.9 15.0 32.2 38.4 |
      | FSRW[3]             | YOLO V2     | 14.8 15.5 26.7 33.9 47.2 | 15.7 15.3 22.7 30.1 40.5 | 21.3 25.6 28.4 42.8 45.9 |
      | MetaDet[29]         | YOLO V2     | 17.1 19.1 28.9 35.0 48.8 | 18.2 20.6 25.9 30.6 41.5 | 20.1 22.3 27.9 41.9 42.9 |
      | RepMet[6]           | InceptionV3 | 26.1 32.9 34.4 38.6 41.3 | 17.2 22.1 23.4 28.3 35.8 | 27.5 31.1 31.5 34.4 37.2 |
      | FRCN-ft[29]         | FRCN-R101   | 13.8 19.6 32.8 41.5 45.6 | 7.9  15.3 26.2 31.6 39.1 | 9.8  11.3 19.1 35.0 45.1 |
      | FRCN+FPN-ft[7]      | FRCN-R101   | 8.2  20.3 29.0 40.1 45.5 | 13.4 20.6 28.6 32.4 38.8 | 19.6 20.8 28.7 42.2 42.1 |
      | MetaDet[29]         | FRCN-R101   | 18.9 20.6 30.2 36.8 49.6 | 21.8 23.1 27.8 31.7 43.0 | 20.6 23.9 29.4 43.9 44.1 |
      | Meta R-CNN[4]       | FRCN-R101   | 19.9 25.5 35.0 45.7 51.5 | 10.4 19.4 29.6 34.8 45.4 | 14.3 18.2 27.5 41.2 48.1 |
      | TFA w/fc[7]         | FRCN-R101   | 36.8 29.1 43.6 55.7 57.0 | 18.2 29.0 33.4 35.5 39.0 | 27.7 33.6 42.5 48.7 50.2 |
      | TFA w/cos[7]        | FRCN-R101   | 39.8 36.1 44.7 55.7 56.0 | 23.5 26.9 34.1 35.1 39.1 | 30.8 34.8 42.8 49.5 49.8 |
      | MPSR[8]             | FRCN-R101   | 41.7 —    51.4 55.2 61.8 | 24.4 —    39.2 39.9 47.8 | 35.6 —    42.3 48.0 49.7 |
      | SRR-FSD[30]         | FRCN-R101   | 47.8 50.5 51.3 55.2 56.8 | 32.5 35.3 39.1 40.8 43.8 | 40.1 41.5 44.3 46.9 46.4 |
      | DiGeo[26]           | FRCN-R101   | 37.9 39.4 48.5 58.6 61.5 | 26.6 28.9 41.9 42.1 49.1 | 30.4 40.1 46.9 52.7 54.7 |
      | FSCE[9]             | FRCN-R101   | 44.2 43.8 51.4 61.9 63.4 | 27.3 29.5 43.5 44.2 50.2 | 37.2 41.9 47.5 54.6 58.5 |
      | Retentive R-CNN[25] | FRCN-R101   | 42.4 45.8 45.9 53.7 56.1 | 21.7 27.8 35.2 37.0 40.3 | 30.2 37.8 43.0 49.7 50.1 |
      | HTRPN[27]           | FRCN-R101   | 47.0 44.8 53.4 62.9 65.2 | 29.8 32.6 46.3 47.7 53.0 | 40.1 45.9 49.6 57.0 59.7 |
      | FSAD (ours)         | FRCN-R101   | 50.5 54.7 54.6 57.6 62.2 | 31.4 35.5 39.2 42.5 45.2 | 46.1 46.3 47.3 54.8 59.0 |
  • Table 4. Parameter quantity comparison

      | Method | Total params | Trainable params | Non-trainable params |
      | TFA    | 60.3         | 0.1              | 60.2                 |
      | FSCE   | 60.3         | 60.1             | 0.2                  |
      | DiGeo  | 76.4         | 15.0             | 61.4                 |
      | HTRPN  | 76.5         | 76.3             | 0.2                  |
      | FSAD   | 60.4         | 17.9             | 42.5                 |
  • Table 5. Effectiveness of different components of FSAD

      | Association | Disentangling | Margin | nAP50 /%       |
      |             |               |        | K=1  K=3  K=5  |
      | ×           | ×             | ×      | 41.3 46.3 53.7 |
      | √           | ×             | ×      | 42.4 46.8 55.2 |
      | ×           | √             | ×      | 42.4 47.3 54.1 |
      | √           | √             | ×      | 44.9 50.3 56.8 |
      | ×           | ×             | √      | 46.3 48.8 56.4 |
      | √           | √             | √      | 50.5 54.6 57.6 |
  • Table 6. Effectiveness of modules in the association and discrimination stages

      | Dynamic RoI head | ECA | nAP50 /%       |
      |                  |     | K=1  K=3  K=5  |
      | ×                | ×   | 43.2 49.4 54.4 |
      | √                | ×   | 44.9 51.3 56.0 |
      | ×                | √   | 46.7 53.0 56.8 |
      | √                | √   | 50.5 54.6 57.6 |
  • Table 7. Comparison of different allocation strategies (without using margin loss); each cell gives the base class associated with the novel class in the column header

      | Strategy     | bird      | bus   | cow   | motorbike | sofa        | nAP50 /% |
      | random       | person    | boat  | horse | aeroplane | sheep       | 39.6     |
      | human        | aeroplane | train | sheep | bicycle   | chair       | 44.1     |
      | visual       | dog       | car   | horse | person    | chair       | 43.4     |
      | top2         | dog       | car   | sheep | tv        | diningtable | 41.2     |
      | top1         | horse     | train | horse | bicycle   | chair       | 44.3     |
      | top1 w/o dup | dog       | train | horse | bicycle   | chair       | 44.9     |
  • Table 8. Performance comparison of different margin losses

      | Margin          | nAP50 /% |
      | TFA             | 41.3     |
      | CosFace         | 38.9     |
      | ArcFace         | 37.9     |
      | CosFace (novel) | 44.2     |
      | ArcFace (novel) | 44.3     |
      | Ours            | 46.3     |
  • Table 9. Comparison of visual similarity and semantic similarity

      | Metric   | Novel Split 1  | Novel Split 2  | Novel Split 3  |
      |          | K=1  K=3  K=5  | K=1  K=3  K=5  | K=1  K=3  K=5  |
      | Visual   | 43.3 49.3 56.4 | 22.5 37.2 39.3 | 31.8 43.1 50.7 |
      | Semantic | 44.9 50.3 56.8 | 26.1 38.5 40.1 | 37.1 45.0 51.5 |
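    The best-performing allocation strategy in Table 7, "top1 w/o dup", pairs each novel class with its most similar base class while forbidding a base class from being reused by two novel classes. A minimal greedy sketch of that idea follows; the function name and the similarity scores are illustrative assumptions, not taken from the paper.

    ```python
    def associate_top1_without_duplicates(similarity):
        """Greedy 'top1 without duplicates' association sketch.

        similarity: dict mapping (novel_class, base_class) -> similarity score.
        Returns a dict novel_class -> base_class where no base class is reused:
        the most similar available pair is committed first, then the next, etc.
        """
        # Consider all candidate pairs from most to least similar.
        pairs = sorted(similarity.items(), key=lambda kv: kv[1], reverse=True)
        assignment, used_bases = {}, set()
        for (novel, base), _score in pairs:
            # Skip pairs whose novel class is already matched or whose
            # base class was claimed by a more similar novel class.
            if novel not in assignment and base not in used_bases:
                assignment[novel] = base
                used_bases.add(base)
        return assignment

    # Placeholder similarity scores for illustration only.
    sim = {
        ("bird", "aeroplane"): 0.90,
        ("cow", "horse"): 0.85,
        ("bus", "train"): 0.80,
        ("bus", "aeroplane"): 0.70,
        ("cow", "dog"): 0.60,
    }
    result = associate_top1_without_duplicates(sim)
    ```

    With these placeholder scores, "bus" cannot take "aeroplane" because the more similar pair ("bird", "aeroplane") claims it first, so "bus" falls back to "train"; a plain top-1 strategy (Table 7, row "top1") would allow such collisions.
    
    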
    Citation

    Jianli Jia, Huiyan Han, Liqun Kuang, Fangzheng Han, Xinyi Zheng, Xiuquan Zhang. Few-Shot Object Detection Based on Association and Discrimination[J]. Laser & Optoelectronics Progress, 2024, 61(8): 0837015

    Paper Information

    Category: Digital Image Processing

    Received: Jul. 5, 2023

    Accepted: Aug. 22, 2023

    Published Online: Apr. 16, 2024

    Author email: Huiyan Han (hhy980344@163.com)

    DOI: 10.3788/LOP231658
