Opto-Electronic Engineering, Vol. 50, Issue 4, 220246 (2023)
STransMNet: a stereo matching method with swin transformer fusion
CNN-based stereo matching models have difficulty learning global and long-range context information during feature extraction. To address this problem, this paper proposes STransMNet, an improved stereo matching network based on the Swin Transformer. We first analyze why aggregating both local and global context information is necessary, and then discuss how matching features differ during the stereo matching process. The feature extraction module is improved by replacing the CNN-based backbone with the Swin Transformer, which strengthens the model's ability to capture long-range context information. A multi-scale fusion module is added to the Swin Transformer so that the output features contain both shallow and deep semantic information. The loss function is extended with a feature differentiation loss to increase the model's attention to details. Finally, comparative experiments against the STTR-light model on multiple public datasets show that both the End-Point-Error (EPE) and the 3 px matching error rate are significantly reduced.
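For readers unfamiliar with the two reported metrics, the following is a minimal sketch of how they are commonly computed. It assumes the usual definitions: EPE as the mean absolute disparity error over valid pixels, and the 3 px error rate as the fraction of valid pixels whose absolute error exceeds 3 px. The function name `disparity_metrics`, the flat-list input layout, and the `valid` mask are illustrative assumptions, not the paper's evaluation code, and the exact protocol used in the paper (for example, how occluded pixels are handled) is not stated in this abstract.

```python
from typing import Optional, Sequence, Tuple


def disparity_metrics(
    pred: Sequence[float],
    gt: Sequence[float],
    valid: Optional[Sequence[bool]] = None,
    threshold_px: float = 3.0,
) -> Tuple[float, float]:
    """Return (EPE, error rate) for flattened disparity maps.

    Hypothetical helper: `pred` and `gt` are predicted and ground-truth
    disparities in pixels; `valid` marks pixels that have ground truth.
    """
    if len(pred) != len(gt):
        raise ValueError("pred and gt must have the same length")
    if valid is None:
        valid = [True] * len(gt)  # assumption: every pixel is evaluated

    # Absolute disparity error at each valid pixel.
    errors = [abs(p - g) for p, g, v in zip(pred, gt, valid) if v]
    if not errors:
        raise ValueError("no valid pixels to evaluate")

    epe = sum(errors) / len(errors)
    # Fraction of valid pixels whose error is strictly above the threshold.
    error_rate = sum(1 for e in errors if e > threshold_px) / len(errors)
    return epe, error_rate


pred = [10.0, 20.5, 33.0, 41.0]
gt = [10.0, 20.0, 30.0, 45.0]
epe, err3 = disparity_metrics(pred, gt)
print(epe, err3)  # 1.875 0.25
```

In this toy input the per-pixel errors are 0, 0.5, 3 and 4 px, so only the last pixel counts toward the 3 px error rate under the strict "greater than" convention assumed here.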
Gaoping Wang, Xun Li, Xuefang Jia, Zhewen Li, Wenjie Wang. STransMNet: a stereo matching method with swin transformer fusion[J]. Opto-Electronic Engineering, 2023, 50(4): 220246
Category: Article
Received: Oct. 8, 2022
Accepted: Jan. 19, 2023
Published Online: Jun. 15, 2023
Author Email: Li Xun (lixun@xpu.edu.cn)