Multi-modal-fusion-based 3D semantic segmentation algorithm

Fig. 6. Visualization of the data augmentation strategy. (a) Original point cloud; (b) complete point cloud of the augmented tree instance; (c) point cloud after pasting, viewed from the sensor's perspective at data collection; (d) original image, where the green dots are the projection of the tree instance point cloud onto the image; (e) foreground image of the tree; (f) result of pasting the tree foreground image (the image region around the pasting position is shown for ease of viewing); (g) projected points of the pasted tree point cloud that match the image mask (green dots); (h) points of the pasted tree point cloud that do not match the image mask (green dots); (i) points of the tree point cloud that match the image after mapping correction
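
Panels (g) and (h) separate the pasted points according to whether their image projection lands on the pasted foreground mask. The following is a minimal sketch of such a check, assuming a simple pinhole camera and points already expressed in the camera frame; the function names, the intrinsics `fx`, `fy`, `cx`, `cy`, and the nested-list mask are hypothetical and are not taken from the paper.

```python
import math
from typing import List, Optional, Sequence, Tuple

Point = Tuple[float, float, float]


def project_point(p: Point, fx: float, fy: float,
                  cx: float, cy: float) -> Optional[Tuple[float, float]]:
    """Pinhole projection of a camera-frame point (z pointing forward).

    Returns None for points at or behind the image plane.
    """
    x, y, z = p
    if z <= 0.0:
        return None
    return (fx * x / z + cx, fy * y / z + cy)


def split_by_mask(points: Sequence[Point],
                  mask: Sequence[Sequence[int]],
                  fx: float, fy: float, cx: float, cy: float
                  ) -> Tuple[List[Point], List[Point]]:
    """Split points into (matched, unmatched) against a binary image mask.

    A point is matched when its projection falls inside the image and on a
    non-zero mask pixel; every other point is reported as unmatched.
    """
    height = len(mask)
    width = len(mask[0]) if height else 0
    matched: List[Point] = []
    unmatched: List[Point] = []
    for p in points:
        uv = project_point(p, fx, fy, cx, cy)
        hit = False
        if uv is not None:
            col, row = math.floor(uv[0]), math.floor(uv[1])
            hit = (0 <= row < height and 0 <= col < width
                   and bool(mask[row][col]))
        (matched if hit else unmatched).append(p)
    return matched, unmatched
```

In this sketch the unmatched list plays the role of the points shown in panel (h); how those points are then corrected to obtain panel (i) is not specified here and is left out.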

Fig. 7. GT-Paste^[11] data augmentation diagram. (a) Original point cloud scene; (b) point cloud scene after pasting, where purple and red mark the points to be filtered out because of occlusion; (c) filtered point cloud scene; (d) original image scene; (e) image scene after pasting; (f) image scene after the occlusion relationships are handled
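
Panels (b) and (c) show points being filtered out after pasting because of occlusion. One common way to resolve such occlusions is to bin points by viewing direction and keep only the nearest point in each bin; the sketch below illustrates that idea and is not necessarily the procedure used by GT-Paste or by this paper. The function name and the angular resolutions are assumptions.

```python
import math
from typing import Dict, List, Sequence, Tuple

Point = Tuple[float, float, float]


def filter_occluded(points: Sequence[Point],
                    az_res_deg: float = 1.0,
                    el_res_deg: float = 1.0) -> List[Point]:
    """Keep the nearest point per (azimuth, elevation) bin seen from the origin.

    Points that share a viewing direction with a closer point are treated
    as occluded and dropped. Points exactly at the origin are ignored.
    """
    nearest: Dict[Tuple[int, int], Tuple[float, Point]] = {}
    for p in points:
        x, y, z = p
        rng = math.sqrt(x * x + y * y + z * z)
        if rng == 0.0:
            continue
        azimuth = math.degrees(math.atan2(y, x))
        elevation = math.degrees(math.asin(z / rng))
        key = (math.floor(azimuth / az_res_deg),
               math.floor(elevation / el_res_deg))
        # Replace the stored point only if this one is strictly closer.
        if key not in nearest or rng < nearest[key][0]:
            nearest[key] = (rng, p)
    return [p for _, p in nearest.values()]
```

Because only one point survives per bin, this sketch also thins dense regions of the original scene; a practical version would more likely apply the test only between pasted and original points.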

Fig. 8. Qualitative results of the model. (a) and (d) visualize the false positives of the baseline (the first row of the ablation experiment); (b) and (e) visualize the false positives of the final model of this paper (the fourth row of the ablation experiment); (c) and (f) show the ground truth

Table 1. Performance comparison with other algorithms

Method	mIoU	Car	Truck	Pedestrian	Bicycle	Road	Motorcycle	Barrier	Vegetation	Speed/ms
SquSegv3^[24]	53.8	92.8	36.8	63.4	25.7	91.1	21.1	14.2	85.1	97
KPconv^[4]	58.2	93.5	37.7	71.9	39.4	89.7	23.5	25.1	84.8	−
(AF)2S3Net^[14]	62.0	93.2	41.6	73.1	45.5	90.6	39.9	26.0	86.7	270
SPVCNN^[8]	63.3	95.8	44.8	74.4	42.1	91.3	46.4	28.6	87.5	63
Fus3DSeg^[13]	64.3	96.1	48.1	67.3	43.7	93.0	48.1	30.2	88.3	−
Ours	66.7	94.1	49.6	79.3	47.8	90.9	52.6	31.2	88.4	88

Table 2. Ablation experiment

Depth estimation	VPS network	DFM	PointAugment	mIoU	Car <25 m	Car >25 m	Pedestrian <25 m	Pedestrian >25 m	Vegetation <25 m	Vegetation >25 m
				62.8	95.2	86.4	79.3	62.7	90.3	79.8
$ \surd $		$ \surd $		64.4	97.1	89.8	81.2	67.4	91.1	82.4
	$ \surd $	$ \surd $		64.6	97.3	89.9	82.8	69.6	91.2	82.1
$ \surd $	$ \surd $	$ \surd $	$ \surd $	66.7	97.6	90.2	85.3	73.3	92.3	83.5
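
The ablation table reports per-class results separately for points closer and farther than 25 m. A minimal sketch of how such a distance-split IoU could be computed is given below, using the standard definition IoU = TP / (TP + FP + FN). The function name, the per-point `ranges` input, and the choice to place a point at exactly 25 m in the far bin are assumptions; the paper does not state how the distance is measured.

```python
from typing import Dict, Optional, Sequence


def iou_by_range(labels: Sequence[int],
                 preds: Sequence[int],
                 ranges: Sequence[float],
                 cls: int,
                 threshold_m: float = 25.0) -> Dict[str, Optional[float]]:
    """IoU of one class, computed separately for near and far points.

    labels, preds: per-point ground-truth and predicted class ids.
    ranges: per-point distance to the sensor in metres (assumed input).
    Returns {"near": iou or None, "far": iou or None}; None means the
    class is absent from both labels and predictions in that bin.
    """
    # Each entry holds [true positives, false positives, false negatives].
    counts = {"near": [0, 0, 0], "far": [0, 0, 0]}
    for gt, pr, r in zip(labels, preds, ranges):
        c = counts["near" if r < threshold_m else "far"]
        if pr == cls and gt == cls:
            c[0] += 1
        elif pr == cls:
            c[1] += 1
        elif gt == cls:
            c[2] += 1
    result: Dict[str, Optional[float]] = {}
    for name, (tp, fp, fn) in counts.items():
        denom = tp + fp + fn
        result[name] = tp / denom if denom else None
    return result
```

Multiplying the returned fractions by 100 gives values on the same percentage scale as the table.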

Table 3. Comparison of voxel feature extraction network effects
Network	Car	Pedestrian	Vegetation
CN	93.2	75.7	87.1
VPS	94.1	79.3	88.4

Table 4. Comparison of object detection results
Method	mAP
Baseline	55.4%
Baseline + GT-Paste^[11]	57.0%
Baseline + PointAugment	57.2%

Qi Chao, Yandong Zhao, Shengbo Liu. Multi-modal-fusion-based 3D semantic segmentation algorithm[J]. Infrared and Laser Engineering, 2024, 53(5): 20240026
