Pose estimation network based on attention feature fusion of multimodal data

Yuntao ZHAO; Xinhui DENG

doi:10.37188/CJLCD.2024-0218

Chinese Journal of Liquid Crystals and Displays, Vol. 40, Issue 4, 598 (2025)

Author Affiliations

College of Information Science and Engineering, Wuhan University of Science and Technology, Wuhan 430081, China

Abstract

6D pose estimation that balances accuracy and applicability remains an active and difficult research topic. To this end, a 6D pose estimation network based on attention feature fusion of multimodal data is proposed. First, a deeper squeeze-and-excitation module is introduced; by adjusting the weight of each channel, it strengthens channel dependencies and enlarges the receptive field, which improves the processing of RGB image features. Second, for multimodal data, an iterative attention feature fusion module is deployed in the feature fusion stage. Through multiple iterative fusion operations, it resolves the scale inconsistency problem in global feature fusion and captures and integrates multimodal data more accurately, which significantly improves pose regression. Finally, to quantitatively assess the robustness and applicability of the model in complex environments, an invisibility percentage metric is introduced, which measures model performance on partially occluded objects and against complex backgrounds. Pose prediction experiments on a public dataset verify that the improved model predicts poses accurately on the validation dataset and is more applicable in complex environments than the DenseFusion model.
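The channel reweighting that a squeeze-and-excitation module performs can be illustrated with a minimal sketch in plain Python. This is not the authors' implementation: the abstract does not specify how their deeper variant is built, so the sketch simply accepts an arbitrary list of fully connected layers, and the function names (`squeeze`, `dense`, `excite`, `se_block`) and the nested-list feature layout are assumptions made for illustration.

```python
import math

def squeeze(feature_map):
    """Global average pooling: reduce each channel (a 2D list) to one scalar."""
    return [sum(sum(row) for row in ch) / (len(ch) * len(ch[0]))
            for ch in feature_map]

def dense(vec, weights, bias):
    """Fully connected layer; `weights` holds one row per output unit."""
    return [sum(w * x for w, x in zip(row, vec)) + b
            for row, b in zip(weights, bias)]

def excite(descriptor, layers):
    """Map the channel descriptor to per-channel gates in (0, 1).

    ReLU follows every layer except the last, which uses a sigmoid.
    Passing more than two layers is one hypothetical way to make the
    module "deeper"; the paper's actual design is not given here.
    """
    h = descriptor
    for i, (weights, bias) in enumerate(layers):
        h = dense(h, weights, bias)
        if i < len(layers) - 1:
            h = [max(0.0, v) for v in h]
        else:
            h = [1.0 / (1.0 + math.exp(-v)) for v in h]
    return h

def se_block(feature_map, layers):
    """Rescale every channel of the feature map by its learned gate."""
    gates = excite(squeeze(feature_map), layers)
    return [[[g * v for v in row] for row in ch]
            for ch, g in zip(feature_map, gates)]

# Toy input: 2 channels of size 2x2, with a 2 -> 1 -> 2 bottleneck.
features = [[[1.0, 2.0], [3.0, 4.0]],
            [[2.0, 2.0], [2.0, 2.0]]]
toy_layers = [([[0.5, -0.5]], [0.0]),
              ([[1.0], [-1.0]], [0.0, 0.0])]
rescaled = se_block(features, toy_layers)
```

In a trained network the layer weights are learned, so channels that carry more useful RGB information receive gates closer to 1 and the rest are suppressed.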

Keywords

6D pose estimation; attention feature fusion; multimodal data; occlusion percentage

Citation

Yuntao ZHAO, Xinhui DENG. Pose estimation network based on attention feature fusion of multimodal data[J]. Chinese Journal of Liquid Crystals and Displays, 2025, 40(4): 598

Paper Information

Received: Jul. 28, 2024

Published Online: May 21, 2025

Author email: Xinhui DENG (2211241803@qq.com)
