Semantic Segmentation Method of Point Cloud Based on Sparse Convolution and Attention Mechanism

Meng Zuo; Yiyang Liu; Hao Cui; Hongfei Bai

doi:10.3788/LOP222819

Laser & Optoelectronics Progress, Vol. 60, Issue 20, 2015002 (2023)

Meng Zuo¹,²,³,⁴, Yiyang Liu¹,²,³,*, Hao Cui¹,²,³, and Hongfei Bai²

Author Affiliations

¹Key Laboratory of Networked Control Systems, Chinese Academy of Sciences, Shenyang 110016, Liaoning, China

²Shenyang Institute of Automation, Chinese Academy of Sciences, Shenyang 110016, Liaoning, China

³Institutes for Robotics & Intelligent Manufacturing, Chinese Academy of Sciences, Shenyang 110169, Liaoning, China

⁴University of Chinese Academy of Sciences, Beijing 100049, China

Figures & Tables(15)

Fig. 1. Point cloud semantic segmentation network model

Fig. 2. Feature extraction network based on sparse convolution and improved attention mechanism

Fig. 3. Residual block based on sparse convolution

Fig. 4. Comparison between ordinary convolution and sparse convolution. (a) Ordinary convolution; (b) sparse convolution

Fig. 5. Non Local Block structure

Fig. 6. Spatial pyramid sampling

Fig. 7. Non Local Block combined with spatial pyramid sampling

Fig. 8. ScanNet V2 dataset segmentation visualization. (a) Ground-truth labels; (b) PointNet++; (c) FPConv; (d) SSCN; (e) Minkowski; (f) proposed network

Fig. 9. S3DIS Area 5 segmentation visualization. (a) Ground-truth labels; (b) PointNet; (c) KPConv; (d) Minkowski; (e) proposed network

Table 1. Comparison of different voxel resolution parameters

Voxel resolution /cm	MIOU /%	Number of total voxels	Number of active voxels
1	68.214	$1.440 \times 10^{10}$	112541
2	70.821	$1.800 \times 10^{9}$	95145
3	70.154	$5.333 \times 10^{8}$	87165
4	68.657	$2.250 \times 10^{8}$	81104
5	67.241	$1.152 \times 10^{8}$	79514
6	66.142	$6.667 \times 10^{7}$	72015
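
The total-voxel column is consistent with a fixed scene volume divided by the volume of one voxel, so it shrinks with the cube of the resolution, while the active-voxel count changes far more slowly. A minimal Python sketch of that arithmetic follows. The bounding volume of 1.440 × 10^10 cm³ is inferred from the 1 cm row and is an assumption, and `total_voxels` and `occupancy` are names chosen here for illustration.

```python
# Scene volume inferred from the 1 cm row of Table 1 (assumption): 1.440e10 cm^3.
SCENE_VOLUME_CM3 = 1.440e10

# Voxel resolution in cm -> reported number of active voxels (Table 1).
ACTIVE_VOXELS = {1: 112541, 2: 95145, 3: 87165, 4: 81104, 5: 79514, 6: 72015}


def total_voxels(resolution_cm: float) -> float:
    """Dense grid size: scene volume divided by the volume of one voxel."""
    return SCENE_VOLUME_CM3 / resolution_cm ** 3


def occupancy(resolution_cm: int) -> float:
    """Fraction of the dense grid that is active (non-empty)."""
    return ACTIVE_VOXELS[resolution_cm] / total_voxels(resolution_cm)


for r in sorted(ACTIVE_VOXELS):
    print(f"{r} cm: total = {total_voxels(r):.3e}, occupancy = {occupancy(r):.2e}")
```

Under this assumption the active voxels make up roughly 0.1% or less of the dense grid at every resolution in the table, which is the regime where sparse convolution skips most of the work a dense convolution would do.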

Table 2. Comparison of sampling parameters in different spatial pyramids

Sampling method	Sample size	Size of S	MIOU /%
Pyramid random	1, 4, 9, 36	50	70.461
Pyramid max	1, 4, 9, 36	50	71.324
Pyramid average	1, 4, 9, 36	50	71.640
Pyramid average	1, 9, 36, 64	110	71.825
Pyramid average	1, 16, 64, 144	225	71.833
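
In each row of Table 2 the size of S equals the sum of the per-level sample sizes (1 + 4 + 9 + 36 = 50, 1 + 9 + 36 + 64 = 110, 1 + 16 + 64 + 144 = 225). The sketch below illustrates one way a pyramid-average sampler could turn N input features into S anchor features. Pooling over contiguous chunks of a flat feature list is a simplifying assumption made here for illustration, not the sampling scheme of the paper, and `average_pool` and `pyramid_average` are hypothetical names.

```python
from typing import List, Sequence

Vector = List[float]


def average_pool(features: Sequence[Vector], bins: int) -> List[Vector]:
    """Average `features` over `bins` contiguous chunks (hypothetical helper)."""
    n = len(features)
    if not 1 <= bins <= n:
        raise ValueError("bins must be between 1 and the number of features")
    pooled = []
    for b in range(bins):
        # Integer boundaries; with bins <= n every chunk holds at least one feature.
        start, stop = b * n // bins, (b + 1) * n // bins
        chunk = features[start:stop]
        dim = len(chunk[0])
        pooled.append([sum(v[d] for v in chunk) / len(chunk) for d in range(dim)])
    return pooled


def pyramid_average(features: Sequence[Vector], sizes: Sequence[int]) -> List[Vector]:
    """Concatenate the pooled outputs of every pyramid level."""
    anchors: List[Vector] = []
    for bins in sizes:
        anchors.extend(average_pool(features, bins))
    return anchors


# Toy input: N = 1000 features with 2 channels each.
features = [[float(i), float(2 * i)] for i in range(1000)]
anchors = pyramid_average(features, sizes=(1, 9, 36, 64))

# The attention affinity matrix shrinks from N x N to N x S entries.
n, s = len(features), len(anchors)
print(f"S = {s}, affinity entries: {n * n} -> {n * s}")
```

Table 2 reports only a marginal gain when S grows from 110 to 225 (71.825 to 71.833), so a smaller S gives up little accuracy while keeping this matrix small.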

Table 3. Comparison of experimental results on the ScanNet V2 test set

Class	PointNet++	FPConv	SSCN	Minkowski	Proposed algorithm
MIOU	33.9	63.9	70.8	70.6	71.8
Wall	52.3	79.9	83.6	84.5	83.8
Floor	67.7	94.8	95.1	95.9	94.9
Cabinet	25.6	60.3	65.3	63.9	68.4
Bed	47.8	76.0	80.7	80.8	80.4
Chair	36.0	79.8	90.4	90.1	91.2
Sofa	34.6	69.6	82.0	81.5	80.2
Table	23.2	61.4	72.2	70.9	73.5
Door	26.1	52.4	64.3	59.8	67.2
Window	25.2	56.7	60.5	60.6	64.1
Bookshelf	45.8	71.3	78.0	75.4	76.0
Picture	11.7	25.0	31.3	31.5	35.1
Counter	25.0	39.2	62.5	66.0	61.2
Desk	27.8	6.3	58.7	60.5	63.9
Curtain	24.7	53.4	75.8	71.3	76.2
Refrigerator	21.2	53.8	49.4	55.6	56.2
Shower curtain	58.4	72.3	70.8	66.5	72.2
Toilet	14.5	87.2	93.0	90.3	84.2
Sink	54.8	59.8	63.9	65.2	62.5
Bathtub	36.4	78.5	87.4	93.5	88.1
Other	18.3	45.7	51.4	56.6	57.2

Table 4. Comparison of experimental results on S3DIS Area 5

Class	PointNet	KPConv	Minkowski	Proposed network
MIOU	41.1	67.1	65.4	70.5
Ceiling	88.8	92.8	91.8	92.5
Floor	97.3	97.3	98.7	98.4
Wall	69.8	82.4	86.2	89.4
Beam	0.1	0.0	0.0	0.0
Column	3.9	23.9	34.1	54.2
Window	46.3	58.0	48.9	61.2
Door	10.8	69.0	62.4	65.1
Table	59.0	81.5	81.6	82.1
Chair	52.6	91.0	89.8	92.0
Sofa	5.9	75.4	47.2	78.2
Bookcase	40.3	75.3	74.9	74.2
Board	26.4	66.7	74.4	75.2
Clutter	33.2	58.9	58.6	54.4
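
The MIOU row is the unweighted mean of the per-class IoU values. As a quick check, the sketch below averages the 13 per-class values of the Proposed network column of Table 4. The function name `mean_iou` is chosen here for illustration.

```python
from statistics import fmean

# Per-class IoU (%) of the Proposed network column in Table 4, in table order.
PROPOSED_IOU = [92.5, 98.4, 89.4, 0.0, 54.2, 61.2, 65.1,
                82.1, 92.0, 78.2, 74.2, 75.2, 54.4]


def mean_iou(per_class_iou):
    """Unweighted mean over classes, so rare classes count as much as common ones."""
    return fmean(per_class_iou)


print(f"mIoU = {mean_iou(PROPOSED_IOU):.1f}%")
```

Because the mean is unweighted, the 0.0 IoU on Beam lowers the average as much as a failure on any frequent class would.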

Table 5. Comparison of segmentation accuracy of Non Local Block inserted into different layers after spatial pyramid sampling
Layer	SSCN	SSCN+Non Local Block	SSCN+SPSNB
1	70.821	71.034	71.034
2	70.821	71.342	71.214
3	70.821		71.421
4	70.821		71.825
5	70.821		71.641
6	70.821		71.322

Table 6. Comparison of forward inference time with Non Local Block inserted into different layers after spatial pyramid sampling
Layer	SSCN	SSCN+Non Local Block	SSCN+SPSNB
1	73.10	73.32	73.32
2	73.10	77.34	73.85
3	73.10		74.42
4	73.10		75.35
5	73.10		76.52
6	73.10		77.82

Meng Zuo, Yiyang Liu, Hao Cui, Hongfei Bai. Semantic Segmentation Method of Point Cloud Based on Sparse Convolution and Attention Mechanism[J]. Laser & Optoelectronics Progress, 2023, 60(20): 2015002
