Fast Two-Stage 3D Object Detection with Semantic Guidance

Fig. 5. Comparison of foreground point preservation between farthest point sampling and semantic-guided sampling under different scenes. (a) Scene without tree cover; (b) intersection scene; (c) scene with tree cover
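
As a rough illustration of why the two samplers compared in Fig. 5 can retain different numbers of foreground points, the sketch below contrasts standard farthest point sampling with a score-based selection. The top-k-by-score rule, the function names, the toy scene, and the per-point scores are all assumptions made for illustration; the semantic-guided sampling used in the paper may differ.

```python
import math

def farthest_point_sampling(points, k):
    """Greedy FPS: repeatedly pick the point farthest from the chosen set."""
    chosen = [0]  # start from the first point (a common, arbitrary choice)
    # Distance from every point to its nearest already-chosen point.
    min_dist = [math.dist(p, points[0]) for p in points]
    while len(chosen) < k:
        nxt = max(range(len(points)), key=lambda i: min_dist[i])
        chosen.append(nxt)
        for i, p in enumerate(points):
            min_dist[i] = min(min_dist[i], math.dist(p, points[nxt]))
    return chosen

def score_guided_sampling(scores, k):
    """Assumed rule: keep the k points with the highest foreground score."""
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order[:k]

# Toy scene (hypothetical): 4 foreground points clustered near the origin
# and 12 background points spread along a line.
foreground = [(0.0, 0.0, 0.0), (0.1, 0.0, 0.0), (0.0, 0.1, 0.0), (0.1, 0.1, 0.0)]
background = [(float(x), 5.0, 0.0) for x in range(1, 13)]
points = foreground + background
is_fg = [True] * 4 + [False] * 12
# Hypothetical per-point foreground scores, e.g. from a segmentation head.
scores = [0.9, 0.8, 0.85, 0.95] + [0.1] * 12

k = 4
fps_idx = farthest_point_sampling(points, k)
sem_idx = score_guided_sampling(scores, k)
fps_fg = sum(is_fg[i] for i in fps_idx)
sem_fg = sum(is_fg[i] for i in sem_idx)
print("foreground kept by FPS:", fps_fg)
print("foreground kept by score-guided sampling:", sem_fg)
```

Because FPS only maximizes spatial coverage, it tends to spend its budget on widely spread background points in this toy scene, whereas a score-based rule concentrates on the small foreground cluster.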

Fig. 6. Comparison of first-stage and second-stage detection performance

Table 1. Experimental environment
Configuration	Model / version
CPU	Intel Xeon Gold 6230
GPU	GeForce RTX 3090
Operating system	Ubuntu 16.04
CUDA version	CUDA 11.1
Deep learning framework	PyTorch 1.7.0
Programming language	Python 3.8
OpenPCDet	0.5.2

Table 2. Quantitative comparison of different methods on the car category on KITTI validation set

Method	Type	Modality	Car 3D AP Easy (IoU = 0.7) /%	Car 3D AP Moderate (IoU = 0.7) /%	Car 3D AP Hard (IoU = 0.7) /%	mAP /%	Speed /(frame·s^-1)
MV3D^[18]	2-stage	RGB+LiDAR	71.29	62.68	56.56	63.51	2.7
3D-CVF^[19]	1-stage	RGB+LiDAR	89.67	79.88	78.47	82.67	13.3
VoxelNet^[8]	1-stage	LiDAR	81.97	65.46	62.85	70.09	4.5
SECOND^[9]	1-stage	LiDAR	87.43	76.48	69.10	77.67	20
PointPillars^[10]	1-stage	LiDAR	-	77.98	-	77.98	42.4
PV-RCNN^[22]	2-stage	LiDAR	-	83.90	-	83.90	12.5
PointRCNN^[12]	2-stage	LiDAR	88.88	78.63	77.38	81.63	10
3DSSD^[15]	1-stage	LiDAR	89.71	79.45	78.67	82.61	25
Pointformer^[16]	1-stage	LiDAR	90.05	79.65	78.89	82.86	-
FTS3D	2-stage	LiDAR	89.02	83.25	78.10	83.45	55.6
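
The mAP column in Table 2 appears to be the arithmetic mean of the Easy, Moderate, and Hard APs. A minimal sketch, assuming that reading (the source does not state the formula), recomputes it for a few rows copied from the table and converts the reported frame rates into per-frame latency. The helper names are illustrative only.

```python
# Rows copied from Table 2:
# (easy AP, moderate AP, hard AP, reported mAP, frames per second)
rows = {
    "SECOND":    (87.43, 76.48, 69.10, 77.67, 20.0),
    "PointRCNN": (88.88, 78.63, 77.38, 81.63, 10.0),
    "3DSSD":     (89.71, 79.45, 78.67, 82.61, 25.0),
    "FTS3D":     (89.02, 83.25, 78.10, 83.45, 55.6),
}

def mean_ap(easy, moderate, hard):
    """Assumed definition: plain average over the three difficulty levels."""
    return (easy + moderate + hard) / 3.0

def latency_ms(fps):
    """Convert a frame rate in frame/s to milliseconds per frame."""
    return 1000.0 / fps

for name, (e, m, h, reported, fps) in rows.items():
    print(f"{name}: recomputed mAP {mean_ap(e, m, h):.2f} "
          f"(reported {reported:.2f}), {latency_ms(fps):.1f} ms/frame")

# Largest disagreement between the recomputed and the reported mAP.
max_gap = max(abs(mean_ap(e, m, h) - rep) for e, m, h, rep, _ in rows.values())

# Frame-rate ratio of FTS3D to PointRCNN, both two-stage LiDAR methods.
speedup_vs_pointrcnn = rows["FTS3D"][4] / rows["PointRCNN"][4]
print(f"max gap: {max_gap:.4f}, speed ratio vs PointRCNN: {speedup_vs_pointrcnn:.2f}")
```

Under this reading, the recomputed values agree with the reported column to within 0.01 percentage points for these rows, and 55.6 frame/s corresponds to roughly 18 ms per frame.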

Table 3. Quantitative comparison of different methods under 3D view on KITTI test set

Method	Modality	Ped 3D AP Easy (IoU = 0.5) /%	Ped 3D AP Moderate (IoU = 0.5) /%	Ped 3D AP Hard (IoU = 0.5) /%	Cyc 3D AP Easy (IoU = 0.5) /%	Cyc 3D AP Moderate (IoU = 0.5) /%	Cyc 3D AP Hard (IoU = 0.5) /%	mAP (Ped) /%	mAP (Cyc) /%	Speed /(frame·s^-1)
MV3D^[18]	RGB+LiDAR	-	-	-	-	-	-	-	-	2.7
3D-CVF^[19]	RGB+LiDAR	-	-	-	-	-	-	-	-	13.3
VoxelNet^[8]	LiDAR	39.48	33.69	31.50	61.22	48.36	44.37	34.89	51.31	4.5
SECOND^[9]	LiDAR	45.31	35.32	33.14	75.83	60.82	53.67	37.92	63.44	20
PointPillars^[10]	LiDAR	51.45	41.92	38.89	77.10	58.65	51.92	44.08	62.55	42.4
PV-RCNN^[22]	LiDAR	52.17	43.29	40.29	78.60	63.71	57.65	45.25	66.65	12.5
PointRCNN^[12]	LiDAR	47.98	39.37	36.01	74.96	58.82	52.53	40.93	62.10	10
3DSSD^[15]	LiDAR	-	-	-	-	-	-	-	-	25
Pointformer^[16]	LiDAR	50.67	42.43	39.60	75.01	59.80	53.99	44.23	62.93	-
FTS3D	LiDAR	49.42	40.03	37.27	78.36	62.73	56.34	42.24	65.81	55.6

Table 4. Quantitative comparison of different methods on the cyclist category under BEV on KITTI test set

Method	Type	Modality	Cyc BEV AP Easy (IoU = 0.5) /%	Cyc BEV AP Moderate (IoU = 0.5) /%	Cyc BEV AP Hard (IoU = 0.5) /%	mAP /%	Speed /(frame·s^-1)
MV3D^[18]	2-stage	RGB+LiDAR	-	-	-	-	2.7
3D-CVF^[19]	1-stage	RGB+LiDAR	-	-	-	-	13.3
VoxelNet^[8]	1-stage	LiDAR	-	-	-	-	4.5
SECOND^[9]	1-stage	LiDAR	76.50	56.05	49.45	60.67	20
PointPillars^[10]	1-stage	LiDAR	79.90	62.73	55.58	66.07	42.4
PV-RCNN^[22]	2-stage	LiDAR	82.49	68.89	62.41	71.26	12.5
PointRCNN^[12]	2-stage	LiDAR	82.56	67.24	60.28	70.02	10
3DSSD^[15]	1-stage	LiDAR	-	-	-	-	25
Pointformer^[16]	2-stage	LiDAR	-	-	-	-	-
FTS3D	2-stage	LiDAR	86.04	71.26	63.65	73.65	55.6

Table 5. Comparison of experimental results for different sampling methods

Sampling layer 1	Sampling layer 2	Sampling layer 3	Sampling layer 4	AP for Car Moderate (IoU = 0.7) /%	Enhancement /percentage points
Random	Random	Random	Random	66.44	-
Random	Random	Random	SS	67.52	1.08
Random	Random	SS	SS	69.39	2.95
FPS	FPS	FPS	FPS	79.41	-
FPS	FPS	FPS	SS	80.12	0.71
FPS	FPS	SS	SS	83.25	3.84
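
The Enhancement column in Table 5 can be read as the gain in Car Moderate AP over the row that uses the same base sampler (Random or FPS) in all four layers. A small sketch, assuming that choice of baseline (it is inferred from the numbers, not stated in the source), recomputes the column; SS is taken here to be the semantic-guided sampling of Fig. 5.

```python
# Sampler used in layers 1..4 -> Car Moderate AP in %, copied from Table 5.
results = {
    ("Random", "Random", "Random", "Random"): 66.44,
    ("Random", "Random", "Random", "SS"): 67.52,
    ("Random", "Random", "SS", "SS"): 69.39,
    ("FPS", "FPS", "FPS", "FPS"): 79.41,
    ("FPS", "FPS", "FPS", "SS"): 80.12,
    ("FPS", "FPS", "SS", "SS"): 83.25,
}

def enhancement(config):
    """Gain in percentage points over the all-base-sampler configuration."""
    base_sampler = config[0]  # layer 1 always uses the base sampler here
    baseline = results[(base_sampler,) * 4]
    return round(results[config] - baseline, 2)

gains = {config: enhancement(config) for config in results}
for config, gain in gains.items():
    print(" / ".join(config), f"-> +{gain:.2f} percentage points")
```

Read this way, replacing the last two layers with SS yields the largest gain for both base samplers, which matches the trend reported in the table.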

Table 6. Ablation study results
Baseline SS RoI pooling CA AP for Car Moderate (IoU = 0.7) /% Enhancement /percentage points
√ 77.46
√ √ 77.74 0.28
√ √ 79.36 1.90
√ √ √ 82.95 5.49
√ √ √ √ 83.25 5.79

Mang Huang, Bin Hui, Zhaoji Liu, Tianming Jin. Fast Two-Stage 3D Object Detection with Semantic Guidance[J]. Laser & Optoelectronics Progress, 2024, 61(12): 1228007

Paper Information

Category: Remote Sensing and Sensors

Received: Jul. 19, 2023

Accepted: Sep. 6, 2023

Published Online: Jun. 5, 2024

The Author Email: Bin Hui (huibin@sia.cn)

DOI:10.3788/LOP231763

CSTR:32186.14.LOP231763
