6D Pose Detection Method Based on Cross-Attention Weighting Mechanism

To address 6D object pose detection in unstructured scenes, where similarity between foreground and background degrades accuracy, we propose a 6D pose detection method based on a cross-attention weighting mechanism. First, an RGB-D mask isolates the region of interest (ROI) in the image. A PSPNet module extracts RGB semantic features, while a PointNet module extracts global and local point cloud features from the corresponding region, giving two complementary feature representations of the same object. The RGB semantic features and point cloud features are then fed into a cross-attention mechanism that fuses them, producing foreground-object fusion features with richer contextual information and improving the model's understanding of complex scenes. To improve robustness to background interference and color overlap, a squeeze-and-excitation (SE) mechanism is introduced into the backbone network, helping it distinguish foreground regions from background regions with similar features. Finally, the 6D pose estimate is refined using both object color features and point cloud geometric transformation features, which improves pose detection accuracy. In comparative experiments against DenseFusion, the proposed method improves the mean average distance (ADD) metric on the LineMOD dataset by 2.5 percentage points and the mean area under the curve (AUC) on the YCB-Video dataset by 1.9 percentage points. Real-world scene tests show an overall centroid deviation of less than 2 mm and an angular error below 1.5°, confirming the practical applicability of the proposed method.
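
To make the fusion step more concrete, the following is a minimal sketch of scaled dot-product cross-attention between two feature sets. It is not the authors' implementation: the abstract does not specify which modality supplies the queries, whether learned projections are used, or how the attended features are combined. Here we assume, purely for illustration, that point cloud features act as queries, RGB features act as keys and values, no learned projections are applied, and the attended RGB feature is concatenated to each point feature. The names `cross_attention`, `point_feats`, and `rgb_feats` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def cross_attention(queries, keys, values):
    """Each query attends over all keys; the attended value is appended to the query.

    Assumption (not from the paper): no learned projections, and fusion is
    plain concatenation of the query with its attention-weighted value.
    """
    scale = math.sqrt(len(keys[0]))
    dim = len(values[0])
    fused = []
    for q in queries:
        # Attention weights: softmax of scaled similarity to every key.
        weights = softmax([dot(q, k) / scale for k in keys])
        # Weighted sum of the value vectors.
        attended = [sum(w * v[j] for w, v in zip(weights, values)) for j in range(dim)]
        fused.append(list(q) + attended)
    return fused


# Toy data: 2 point features (queries) and 3 RGB features (keys and values).
point_feats = [[1.0, 0.0], [0.0, 1.0]]
rgb_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]

fused = cross_attention(point_feats, rgb_feats, rgb_feats)
for row in fused:
    print([round(x, 3) for x in row])
```

Each output row holds the original point feature followed by an RGB feature weighted toward the pixels most similar to that point. In a trained network the queries, keys, and values would normally pass through learned linear projections first.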

Keywords

3D point cloud; 6D pose estimation; cross-attention; feature fusion; image processing

Yu Ye, Jing Zhang, Aimin Wang, Heng Liu, Mingju Chen. 6D Pose Detection Method Based on Cross-Attention Weighting Mechanism[J]. Laser & Optoelectronics Progress, 2025, 62(16): 1615004
