Semantic Segmentation Network Based on V-Shaped Pyramid Bilateral Feature Fusion

Fig. 2. Schematic diagrams of the structure of the VASPP module and the coordinate attention module. (a) VASPP module; (b) coordinate attention module

Fig. 3. Schematic diagrams of the structure of the BAFA module. (a) BAFA module; (b) CA module; (c) SA module

Fig. 4. Visualization results on the PASCAL VOC 2012 dataset

Fig. 5. Visualization results on the Cityscapes dataset

Table 1. Comparison of MIoU results of different networks on the PASCAL VOC 2012 dataset

Method	Backbone	Params /10⁶	MIoU /%
PSPNet [31]	ResNet-101	51.86	80.23
DeepLabV3+ [17]	ResNet-101	68.37	78.85
WASPnet [32]	ResNet-101	47.48	80.22
DECANet [33]	ResNet-101		81.08
CFANet [34]	ResNet-50		81.34
N-Deeplabv3+ [35]	Xception	37.38	81.97
Method of reference [36]	EfficientNetV2	55.51	81.19
Method of reference [37]	ResNet-101	60.40	81.13
DeepLabV3+ [17]	Xception	54.71	80.94
VPBF-Net	Xception	42.41	83.25

Table 2. Quantitative comparison with DeepLabV3+

Method	Backbone	MIoU /%	MPA /%	Params /10⁶	Time /ms	Speed /(frame/s)
DeepLabV3+	Xception	80.94	87.29	54.71	45.55	21.96
VPBF-Net	Xception	83.25	89.53	42.41	43.32	23.08
DeepLabV3+	MobileNetV2	72.31	82.52	5.82	26.44	37.82
VPBF-Net	MobileNetV2	73.14	83.95	4.07	25.58	39.09
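
The Time and Speed columns of Table 2 appear to be reciprocal (speed ≈ 1000 / time in ms). The following is a minimal sketch that checks this assumption and computes the relative parameter reduction, using only the values printed in Table 2. The function names are hypothetical and are not taken from the paper.

```python
# Values copied from Table 2:
# (method, backbone) -> (time in ms, reported speed in frame/s, params in 10^6)
TABLE_2 = {
    ("DeepLabV3+", "Xception"): (45.55, 21.96, 54.71),
    ("VPBF-Net", "Xception"): (43.32, 23.08, 42.41),
    ("DeepLabV3+", "MobileNetV2"): (26.44, 37.82, 5.82),
    ("VPBF-Net", "MobileNetV2"): (25.58, 39.09, 4.07),
}


def frames_per_second(time_ms: float) -> float:
    """Convert per-image inference time (ms) to throughput (frame/s).

    Assumption: one image is processed at a time, so speed = 1000 / time.
    """
    return 1000.0 / time_ms


def param_reduction_percent(baseline: float, proposed: float) -> float:
    """Relative reduction in parameter count, in percent of the baseline."""
    return 100.0 * (baseline - proposed) / baseline


if __name__ == "__main__":
    for (method, backbone), (time_ms, reported, _) in TABLE_2.items():
        derived = frames_per_second(time_ms)
        print(f"{method} ({backbone}): derived {derived:.2f}, reported {reported:.2f}")

    for backbone in ("Xception", "MobileNetV2"):
        base = TABLE_2[("DeepLabV3+", backbone)][2]
        ours = TABLE_2[("VPBF-Net", backbone)][2]
        print(f"{backbone}: {param_reduction_percent(base, ours):.2f}% fewer parameters")
```

Under this assumption the derived speeds agree with the reported ones to within rounding, which suggests that the Speed column was computed from the measured time rather than measured independently.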

Table 3. Ablation results for the individual improvements
Backbone ASPP VASPP CA BAFA MIoU /% MPA /%
√ √ 80.94 87.29
√ √ 81.91 88.36
√ √ 81.26 87.78
√ √ 82.03 88.12
√ √ √ 82.59 89.08
√ √ √ √ 83.25 89.53
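
The MIoU and MPA columns can be illustrated with a minimal sketch that computes both from a confusion matrix. It assumes the standard definitions (per-class IoU = TP / (TP + FP + FN) and per-class pixel accuracy = TP / (TP + FN), each averaged over classes); the paper's own evaluation code is not shown here, and the function names and the toy matrix are hypothetical.

```python
from typing import List

Matrix = List[List[int]]  # confusion[i][j]: pixels of true class i predicted as class j


def mean_iou(confusion: Matrix) -> float:
    """Mean intersection over union, averaged over classes.

    Assumption: classes with no ground-truth and no predicted pixels are skipped.
    """
    n = len(confusion)
    ious = []
    for c in range(n):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                        # true class c, predicted otherwise
        fp = sum(confusion[r][c] for r in range(n)) - tp   # predicted c, true class otherwise
        denom = tp + fp + fn
        if denom > 0:
            ious.append(tp / denom)
    return sum(ious) / len(ious)


def mean_pixel_accuracy(confusion: Matrix) -> float:
    """Mean per-class pixel accuracy (per-class recall averaged over classes).

    Assumption: classes with no ground-truth pixels are skipped.
    """
    accs = []
    for c, row in enumerate(confusion):
        total = sum(row)
        if total > 0:
            accs.append(row[c] / total)
    return sum(accs) / len(accs)


if __name__ == "__main__":
    # Toy two-class confusion matrix, not data from the paper.
    toy = [[3, 1],
           [1, 5]]
    print(f"MIoU = {100 * mean_iou(toy):.2f}%")
    print(f"MPA  = {100 * mean_pixel_accuracy(toy):.2f}%")
```

In practice the confusion matrix would be accumulated over every pixel of the validation set before the two averages are taken.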

Table 4. Comparison of MIoU results for different networks on the Cityscapes dataset

Method	Backbone	Params /10⁶	MIoU /%
WASPnet [32]	ResNet-101	47.48	73.58
DECANet [33]	ResNet-101		76.01
CFANet [34]	ResNet-50		76.27
Method of reference [38]	ResNet-50	43.16	76.23
Method of reference [39]	Swin Transformer	123.77	75.18
N-Deeplabv3+ [35]	Xception	37.38	76.31
DeepLabV3+ [17]	Xception	54.71	76.23
VPBF-Net	Xception	42.41	77.21

Zheng Wang, Wenyuan Li. Semantic Segmentation Network Based on V-Shaped Pyramid Bilateral Feature Fusion[J]. Laser & Optoelectronics Progress, 2024, 61(24): 2437003

Paper Information

Category: Digital Image Processing

Received: Mar. 29, 2024

Accepted: Apr. 29, 2024

Published Online: Dec. 17, 2024

Author Email: Zheng Wang (wangxiaozheng@tju.edu.cn)

DOI:10.3788/LOP240990

CSTR:32186.14.LOP240990
