Multiview Stereo Reconstruction with Feature Aggregation Transformer

Fig. 2. Comparison of convolutional kernel sampling. (a) Sampling positions of a standard convolution; (b), (c), (d) sampling positions of deformable convolutions

Fig. 3. Regular grid R

Fig. 4. Implementation process of deformable convolution

Fig. 5. Transformer based feature aggregation module

Fig. 6. Transformer encoder and attention structure. (a) Transformer encoder layer; (b) attention layer

Fig. 7. Comparison of reconstruction results of some models on DTU

Fig. 8. Comparison of reconstruction results of some models on Tanks & Temples

Table 1. Quantitative testing results of different methods on DTU

Method	D_acc	D_comp	D_overall
Furu	0.613	0.941	0.777
Gipuma	0.283	0.873	0.578
COLMAP	0.400	0.664	0.532
MVSNet	0.396	0.527	0.462
Fast-MVSNet	0.336	0.403	0.370
CasMVSNet	0.325	0.385	0.355
UCS-Net	0.338	0.349	0.344
Uni-MVSNet	0.352	0.278	0.315
CVP-MVSNet	0.296	0.406	0.351
CDS-MVSNet	0.352	0.280	0.316
MVSTR	0.356	0.295	0.326
Proposed method	0.335	0.284	0.310
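
The D_overall column in the table above is consistent with the arithmetic mean of D_acc and D_comp, rounded half-up to three decimals. The source page does not state this formula, so the sketch below treats it as an assumption; the function name `overall_score` is hypothetical. It uses `Decimal` rather than floats so that values ending in exactly 5 in the fourth decimal place round the same way as the reported figures.

```python
from decimal import Decimal, ROUND_HALF_UP


def overall_score(acc: str, comp: str) -> Decimal:
    """Mean of accuracy and completeness, rounded half-up to 3 decimals.

    Assumption: D_overall = (D_acc + D_comp) / 2, which matches the rows
    checked below but is not stated explicitly in the source.
    """
    mean = (Decimal(acc) + Decimal(comp)) / 2
    return mean.quantize(Decimal("0.001"), rounding=ROUND_HALF_UP)


# (D_acc, D_comp, reported D_overall) copied from three rows of Table 1.
rows = {
    "MVSNet": ("0.396", "0.527", "0.462"),
    "CasMVSNet": ("0.325", "0.385", "0.355"),
    "Proposed method": ("0.335", "0.284", "0.310"),
}

for method, (acc, comp, reported) in rows.items():
    computed = overall_score(acc, comp)
    # The last field is True when the computed mean matches the table.
    print(method, computed, computed == Decimal(reported))
```

Passing the values as strings keeps them exact; building `Decimal` from a float literal would reintroduce binary rounding error.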

Table 2. Quantitative testing results of different methods on Tanks & Temples

Method	Int.Mean	Fam.	Fra.	Hor.	Lig.	M60	Pan.	Pla.	Tra.	Adv.Mean	Aud.	Bal.	Cou.	Mus.	Pal.	Tem.
COLMAP	42.14	50.41	22.25	26.63	56.43	44.83	46.97	48.53	42.04	27.24	16.02	25.23	34.70	41.51	18.05	27.94
MVSNet	43.48	55.99	28.55	25.07	50.79	53.96	50.86	47.90	34.69	-	-	-	-	-	-	-
Fast-MVSNet	47.39	65.18	39.59	34.98	47.81	49.16	46.20	53.27	42.91	-	-	-	-	-	-	-
PatchmatchNet	53.15	66.99	52.64	43.24	54.87	52.87	49.54	54.21	50.81	32.31	23.69	37.73	30.04	41.80	28.31	32.29
CasMVSNet	56.84	76.37	58.45	46.26	55.81	56.11	54.06	58.18	49.51	31.12	19.81	38.46	29.10	43.87	27.36	28.11
UCS-Net	54.83	76.09	53.16	43.03	54.00	55.60	51.49	57.38	47.89	-	-	-	-	-	-	-
CVP-MVSNet	54.03	76.50	47.74	36.34	55.12	57.28	54.28	57.43	47.54	-	-	-	-	-	-	-
AA-RMVSNet	61.51	77.77	59.53	51.53	64.02	64.05	59.47	60.85	54.90	33.53	20.96	40.15	32.05	46.01	29.28	32.71
CDS-MVSNet	61.58	78.85	63.17	53.04	61.34	62.63	59.06	62.28	52.30	-	-	-	-	-	-	-
MVSTER	-	-	-	-	-	-	-	-	-	37.53	26.68	42.14	35.65	49.37	32.16	39.19
Proposed method	62.81	78.32	65.21	53.01	62.07	64.48	63.25	61.78	54.36	38.18	27.36	43.37	39.26	52.19	33.46	33.41
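
The Int.Mean and Adv.Mean entries for the proposed method are consistent with plain averages of the eight intermediate and six advanced per-scene scores in its row. The source page does not spell out this relationship, so the sketch below treats it as an assumption and checks only that one row; `mean_score` is a hypothetical helper name.

```python
from decimal import Decimal, ROUND_HALF_UP


def mean_score(scores: list[str]) -> Decimal:
    """Arithmetic mean of per-scene scores, rounded half-up to 2 decimals."""
    total = sum((Decimal(s) for s in scores), Decimal("0"))
    return (total / len(scores)).quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Per-scene scores of the proposed method, copied from the table above.
intermediate = ["78.32", "65.21", "53.01", "62.07", "64.48", "63.25", "61.78", "54.36"]
advanced = ["27.36", "43.37", "39.26", "52.19", "33.46", "33.41"]

print(mean_score(intermediate))  # 62.81, the reported Int.Mean
print(mean_score(advanced))      # 38.18, the reported Adv.Mean
```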

Table 3. Comparison of quantitative results of ablation experiments

No.	Method	D_acc	D_comp	D_overall
1	Baseline	0.367	0.325	0.346
2	+DCN	0.356	0.304	0.330
3	+SA+CA	0.342	0.288	0.315
4	FAT-MVSNet	0.335	0.284	0.310

Min Wang, Mingfu Zhao, Tao Song, Weiwei Li, Yuan Tian, Cheng Li, Yu Zhang. Multiview Stereo Reconstruction with Feature Aggregation Transformer[J]. Laser & Optoelectronics Progress, 2024, 61(14): 1415004
