Acta Optica Sinica, Vol. 45, Issue 6, 0606003 (2025)
Method for Orbital Angular Momentum Mode Recognition Employing an Enhanced CNN-Transformer Model Integrated with Double-Slit Interference
Efficient identification of orbital angular momentum (OAM) modes in vortex beams is critical for enhancing capacity and spectral efficiency in wireless optical communication systems. However, turbulent atmospheric channels pose significant challenges because they distort the phase of vortex beams, and traditional optical recognition approaches are complex. In this paper, we propose a novel methodology that integrates an enhanced convolutional neural network (CNN)-transformer hybrid model with double-slit interference. The proposed approach enables simultaneous and precise identification of both the magnitude and sign of high-order OAM modes under turbulent atmospheric conditions, offering significant improvements in recognition accuracy and system performance.
To address the challenges of identifying OAM mode magnitude and sign in turbulent atmospheric environments, we propose a novel method combining an improved CNN-transformer hybrid model with double-slit interference. When Laguerre-Gaussian (LG) beams propagate through a turbulent atmosphere, the resulting phase distortions skew and twist the interference fringes formed after the beams pass through a double slit. These fringe patterns are captured and processed using the proposed CNN-transformer hybrid model, named CACSIV3-Net. The model employs Inception_V3 as its backbone and incorporates a coordinate attention module (CAM) to dynamically weight channel relationships and spatial features. In addition, a cross-shaped window transformer (CSWT) is introduced to extract multi-scale features and long-range dependencies, achieving high-precision OAM mode recognition.
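The link between the fringe pattern and the OAM mode can be illustrated with a minimal sketch. It assumes an ideal helical phase exp(i·ℓ·φ), two infinitely thin slits at x = ±a/2, far-field two-beam interference, and no turbulence or beam envelope; it is not the simulation used in the paper. The function names, the sign convention, and the numerical values are illustrative assumptions.

```python
import math

def slit_phase_difference(ell, y, slit_sep):
    """Helical phase ell*phi at the right slit minus that at the left slit, at height y."""
    phi_right = math.atan2(y, slit_sep / 2.0)   # azimuth of the point (+a/2, y)
    phi_left = math.atan2(y, -slit_sep / 2.0)   # azimuth of the point (-a/2, y)
    return ell * (phi_right - phi_left)

def fringe_intensity(x, y, ell, slit_sep, wavelength, distance):
    """Normalized two-beam fringe intensity at screen point (x, y); envelope ignored."""
    geometric = 2.0 * math.pi * slit_sep * x / (wavelength * distance)
    return 1.0 + math.cos(geometric + slit_phase_difference(ell, y, slit_sep))

def fringe_shift_profile(ell, slit_sep, heights):
    """Lateral fringe displacement, in units of one fringe period, at each height."""
    return [-slit_phase_difference(ell, y, slit_sep) / (2.0 * math.pi) for y in heights]

if __name__ == "__main__":
    heights = [-2.0, -1.0, -0.5, 0.5, 1.0, 2.0]  # in units of the slit separation
    for ell in (+3, -3):
        shifts = fringe_shift_profile(ell, 1.0, heights)
        print(ell, [round(s, 3) for s in shifts])
```

In this idealized picture the fringe displacement varies with height in proportion to ℓ, so its size reflects the mode magnitude and its direction reflects the sign. Turbulence adds random phase on top of this, which is the distortion the network is trained to tolerate.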
In this paper, we propose an improved CNN-transformer hybrid model, CACSIV3-Net, designed to enhance the recognition accuracy of OAM modes in turbulent atmospheric environments. To evaluate its performance, we compare CACSIV3-Net with mainstream classification networks (AlexNet, VGGNet, ResNet, and Inception_V3) using identical system configurations and hyperparameter settings. Training is conducted on an LG beam double-slit interference dataset across varying atmospheric turbulence conditions. The performance results, illustrated in Fig. 4, show that CACSIV3-Net achieves the highest Top-1 accuracy for OAM modes, reaching 96.45%. This represents improvements of 24.00, 14.25, 11.12, and 5.34 percentage points over AlexNet, ResNet, VGGNet, and Inception_V3, respectively. In addition, CACSIV3-Net shows the fastest reduction in average loss within the first 50 epochs and stays converged after the loss approaches 0.1. Comprehensive analysis indicates that CACSIV3-Net offers superior adaptability and higher recognition accuracy than the other networks for LG beam datasets under turbulence of unknown intensity. To further analyze its components, ablation experiments are conducted by progressively integrating CAM and CSWT to evaluate their influence on OAM mode recognition, with results provided in Fig. 5. As shown in Fig. 5(a), the ROC curve of CACSIV3-Net is closest to the upper-left corner, achieving a micro-averaged area under the curve of 0.79 and outperforming Inception_V3, Inception_V3+CAM, and Inception_V3+CSWT. This indicates superior decision-making ability and stability. CACSIV3-Net processes 24, 15, and 7 more images per second than Inception_V3, Inception_V3+CAM, and Inception_V3+CSWT, respectively, reducing the total recognition time by 2.8, 0.9, and 0.4 s, as shown in Fig. 5(b). This demonstrates the higher recognition efficiency of the CACSIV3-Net model.
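For readers unfamiliar with the two headline metrics, the sketch below shows one common way to compute Top-1 accuracy and a micro-averaged ROC area under the curve from per-class scores. It is a plain-Python illustration, not the authors' evaluation code; the function names and the toy scores are assumptions made for this example.

```python
def binary_auc(labels, scores):
    """AUC as the probability that a positive outscores a negative (ties count 0.5)."""
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

def micro_average_auc(true_classes, score_rows):
    """Pool one-vs-rest labels and scores over all classes, then compute a single AUC."""
    labels, scores = [], []
    for true_class, row in zip(true_classes, score_rows):
        for k, score in enumerate(row):
            labels.append(1 if k == true_class else 0)
            scores.append(score)
    return binary_auc(labels, scores)

def top1_accuracy(true_classes, score_rows):
    """Fraction of samples whose highest-scoring class equals the true class."""
    hits = 0
    for true_class, row in zip(true_classes, score_rows):
        predicted = max(range(len(row)), key=row.__getitem__)
        hits += 1 if predicted == true_class else 0
    return hits / len(true_classes)

if __name__ == "__main__":
    # Toy two-class scores, purely illustrative.
    y_true = [0, 1]
    y_score = [[0.9, 0.1], [0.6, 0.4]]
    print(top1_accuracy(y_true, y_score), micro_average_auc(y_true, y_score))
```

Micro-averaging pools every (sample, class) pair before ranking, so classes with more samples weigh more; a macro average would instead compute one AUC per class and take the mean.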
The classification performance metrics indicate that incorporating both CAM and CSWT into the Inception_V3 model results in optimal performance, with an accuracy of 91.55%, precision of 91.29%, recall of 91.33%, and F1-score of 91.29%, as shown in Fig. 5(c). The confusion matrix in Fig. 6 illustrates the prediction performance of CACSIV3-Net across 20 OAM modes, with sparse and low-proportion off-diagonal elements, signifying excellent classification capabilities. Moreover, robustness tests conducted on three newly added test sets under conditions of noise intensity σ=0.1, transmission distance z=2000 m, and beam wavelength λ=850 nm achieve OAM mode recognition accuracies of 80.1%, 85.4%, and 87.95%, respectively, as shown in Table 1 and Fig. 7.
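Accuracy, precision, recall, and F1-score of this kind are typically derived from a confusion matrix. The sketch below shows that derivation with macro averaging over classes; the averaging scheme is an assumption, since the abstract does not state which one was used, and the function names and toy labels are illustrative.

```python
def confusion_matrix(true_classes, predicted_classes, num_classes):
    """Rows index the true class, columns index the predicted class."""
    matrix = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(true_classes, predicted_classes):
        matrix[t][p] += 1
    return matrix

def macro_metrics(matrix):
    """Overall accuracy plus macro-averaged precision, recall, and F1-score."""
    n = len(matrix)
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[k][k] for k in range(n))
    precisions, recalls, f1s = [], [], []
    for k in range(n):
        tp = matrix[k][k]
        predicted_k = sum(matrix[r][k] for r in range(n))  # column sum
        actual_k = sum(matrix[k])                          # row sum
        precision = tp / predicted_k if predicted_k else 0.0
        recall = tp / actual_k if actual_k else 0.0
        denom = precision + recall
        f1 = 2.0 * precision * recall / denom if denom else 0.0
        precisions.append(precision)
        recalls.append(recall)
        f1s.append(f1)
    return {
        "accuracy": correct / total,
        "precision": sum(precisions) / n,
        "recall": sum(recalls) / n,
        "f1": sum(f1s) / n,
    }

if __name__ == "__main__":
    # Toy two-class example; the paper's matrix covers 20 OAM modes.
    cm = confusion_matrix([0, 0, 1, 1], [0, 1, 1, 1], num_classes=2)
    print(cm, macro_metrics(cm))
```

Off-diagonal entries of the matrix count misclassifications, so a sparse off-diagonal region corresponds to high per-class precision and recall.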
In this paper, we propose an improved CNN-transformer hybrid model integrated with double-slit interference for high-precision OAM mode recognition. By embedding CAM into the Inception_V3 backbone and utilizing CSWT, the model captures long-range dependencies and enhances recognition accuracy in turbulent atmospheric environments. The trained model achieves 96.45% accuracy for OAM modes ranging from -10 to +10 at a transmission distance of z=1000 m, an improvement of 4.9 percentage points over the baseline network. In addition, robustness tests are conducted on three newly added test sets under conditions of noise intensity σ=0.1, transmission distance z=2000 m, and beam wavelength λ=850 nm, yielding OAM mode recognition accuracies of 80.1%, 85.4%, and 87.95%, respectively. This method provides a novel and effective solution for high-order OAM mode recognition in turbulent environments, with significant potential for OAM multiplexing communications.
Pengfei Wu, Zhiyuan Jia, Sichen Lei, Jiao Wang, Zhenkun Tan, Di Wu. Method for Orbital Angular Momentum Mode Recognition Employing an Enhanced CNN-Transformer Model Integrated with Double-Slit Interference[J]. Acta Optica Sinica, 2025, 45(6): 0606003
Category: Fiber Optics and Optical Communications
Received: Jul. 15, 2024
Accepted: Aug. 16, 2024
Published Online: Mar. 24, 2025
The Author Email: Lei Sichen (lsc@xaut.edu.cn)