Laser & Optoelectronics Progress, Vol. 62, Issue 12, 1215001 (2025)
Mask-Based and Multi-Modal Bird's Eye View Feature-Guided Point Cloud Panoptic Segmentation
To improve the efficiency and accuracy of attention mask prediction in point cloud panoptic segmentation, an end-to-end point cloud panoptic segmentation network guided by multi-modal bird's eye view (BEV) features, BEVGuide-PS, is proposed. First, object queries are generated from BEV features decoded by a Transformer, and these queries are enhanced through confidence ranking and positional-encoding embedding. Second, a cross-attention module is constructed to fuse the object queries with learnable query features, and the resulting query features, which carry object instance information, are used to improve the accuracy of attention mask prediction. Finally, the dimensionality of the features input to the masked attention network is reduced to increase inference speed. Experimental results on the nuScenes dataset indicate that, compared with the baseline method, BEVGuide-PS improves the panoptic segmentation metrics PQ, PQ†, RQ, and SQ by 17.7%, 17.0%, 18.3%, and 20.9%, respectively, reduces inference time by 58.4%, and significantly enhances training efficiency.
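The first two steps of the abstract (confidence-ranked object queries, then cross-attention fusion with learnable queries) can be illustrated with a minimal pure-Python sketch. This is not the authors' implementation: the function names, the toy 2-D features, the single attention head without learned projections, and the residual addition are all assumptions made for illustration, and positional encoding and mask prediction are omitted.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_queries(candidates, k):
    """Keep the k most confident candidates as object queries.

    candidates: list of (confidence, feature_vector) pairs, standing in
    for proposals decoded from BEV features (hypothetical format).
    """
    ranked = sorted(candidates, key=lambda c: c[0], reverse=True)
    return [feature for _, feature in ranked[:k]]


def cross_attention(queries, keys, values):
    """Single-head scaled dot-product attention, no learned projections."""
    dim = len(keys[0])
    outputs = []
    for q in queries:
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(dim) for k in keys]
        weights = softmax(scores)
        outputs.append([
            sum(w * v[j] for w, v in zip(weights, values))
            for j in range(len(values[0]))
        ])
    return outputs


def fuse(learnable_queries, object_queries):
    """Let learnable queries attend to object queries, then add a residual."""
    attended = cross_attention(learnable_queries, object_queries, object_queries)
    return [
        [l + a for l, a in zip(lq, att)]
        for lq, att in zip(learnable_queries, attended)
    ]


# Toy data: three candidates with confidences and 2-D features.
candidates = [(0.9, [1.0, 0.0]), (0.2, [0.0, 1.0]), (0.7, [1.0, 1.0])]
object_queries = top_k_queries(candidates, k=2)
learnable_queries = [[0.5, 0.5], [0.0, 1.0]]
fused = fuse(learnable_queries, object_queries)
print(fused)
```

In this sketch the fused queries keep the shape of the learnable queries while absorbing instance information from the highest-confidence candidates, which is the role the abstract assigns to the fused query features before mask prediction.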
Jiaming Zhang, Yuhui Peng, Gan Zhang, Baozhe Sun, Shenyang Lin. Mask-Based and Multi-Modal Bird's Eye View Feature-Guided Point Cloud Panoptic Segmentation[J]. Laser & Optoelectronics Progress, 2025, 62(12): 1215001
Category: Machine Vision
Received: Nov. 13, 2024
Accepted: Dec. 12, 2024
Published Online: Jun. 12, 2025
The Author Email: Yuhui Peng (pengyuhui@fzu.edu.cn)
CSTR:32186.14.LOP242258