Chinese Journal of Lasers, Volume. 51, Issue 24, 2402110(2024)
Laser Stripe Segmentation of Weld Seam Based on CNN‑Transformer Hybrid Networks
The challenging conditions at welding construction sites—such as uneven weldment surfaces, complex groove shapes left by the preceding weld pass, loss of centerline information, smoke, spatter, intense arc light, and overlapping reflections—hinder real-time, accurate tracking and control during welding. Projecting a laser onto the weldment surface, capturing the laser stripe image at the groove with a vision sensor, and then using the identified key points of the laser stripe as the basis for weld positioning has become the most widely applied method for tracking complex weld seams. Accurately segmenting multi-layer, multi-pass weld laser stripes against a complex background is therefore a key problem in intelligent welding. This study proposes a lightweight weld laser stripe segmentation method based on a convolutional neural network (CNN)-Transformer hybrid network that improves segmentation accuracy and real-time performance by acquiring fine-grained features and recognizing subtle differences, thereby enabling the tracking of complex multi-layer, multi-pass welds in high-noise environments.
This study develops a hybrid CNN-Transformer model for weld laser stripe segmentation. The encoder uses the MobileViT module, which has fewer parameters and a lower computational cost, for feature extraction. It also embeds a dual non-local block (DNB) module to capture long-range dependencies in both the spatial and channel domains of the weld image, preserving feature extraction capability while improving segmentation efficiency. The decoder uses an efficient sub-pixel convolutional neural network (ESPCN) to produce the semantic segmentation result, which reduces feature loss during information reconstruction and improves the model's ability to extract laser lines from weld seams. To address the imbalance between laser-stripe and background pixels in the weld image, a loss function that dynamically adjusts the weighting coefficients of the laser stripes is proposed.
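The idea of dynamically weighting the rare stripe class against the dominant background can be sketched as a per-image weighted binary cross-entropy. This is a minimal illustration in NumPy, assuming weights inversely proportional to per-image class frequency; the paper's exact weighting scheme may differ.

```python
import numpy as np

def dynamic_weighted_bce(pred, target, eps=1e-7):
    """Weighted binary cross-entropy for stripe-vs-background segmentation.

    Hypothetical formulation: the weight of each class is recomputed per
    image from the stripe/background pixel ratio, so the rare stripe class
    receives a proportionally larger weight.

    pred   -- predicted stripe probabilities, array in (0, 1)
    target -- ground-truth mask, 1 for stripe pixels, 0 for background
    """
    pred = np.clip(pred, eps, 1.0 - eps)
    stripe_frac = target.mean()                     # fraction of stripe pixels
    w_stripe = 1.0 / max(stripe_frac, eps)          # rare class -> large weight
    w_bg = 1.0 / max(1.0 - stripe_frac, eps)
    loss = -(w_stripe * target * np.log(pred)
             + w_bg * (1.0 - target) * np.log(1.0 - pred))
    return loss.mean()
```

Because `w_stripe` grows as the stripe occupies fewer pixels, a misclassified stripe pixel is penalized far more heavily than a misclassified background pixel, which counteracts the pixel imbalance described above.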
Ablation test results show that introducing the DNB module for feature extraction enriches the semantic information in weld laser stripe images, and that adopting the ESPCN reduces the loss of weld laser stripe information (Table 2). Tests and comparisons of the loss function demonstrate that the dynamically generated weighted-coefficient loss function effectively addresses the pixel imbalance in weld laser stripe images (Table 3). Testing and comparing different segmentation models reveal that the proposed CNN-Transformer hybrid network is advantageous in both accuracy and speed, achieving the highest pixel accuracy (PA), mean pixel accuracy (mPA), and mean intersection over union (mIoU) while remaining computationally lightweight (Table 4). Training results at the 20th epoch for the different segmentation models indicate that the laser stripe contour obtained by the proposed model is clearer and closer to the labeled image (Fig. 11).
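The ESPCN decoder credited above with reducing information loss relies on sub-pixel convolution (pixel shuffle), which rearranges channel depth into spatial resolution instead of interpolating. A minimal NumPy sketch of that rearrangement, under the standard (C·r², H, W) → (C, H·r, W·r) convention:

```python
import numpy as np

def pixel_shuffle(x, r):
    """Sub-pixel convolution step at the heart of ESPCN.

    Rearranges a feature map of shape (C * r**2, H, W) into
    (C, H * r, W * r): each group of r**2 channels is scattered
    into an r x r spatial block, so upsampling costs no learned
    interpolation and loses no feature information.
    """
    c_r2, h, w = x.shape
    c = c_r2 // (r * r)
    x = x.reshape(c, r, r, h, w)
    x = x.transpose(0, 3, 1, 4, 2)   # -> (c, h, r, w, r)
    return x.reshape(c, h * r, w * r)
```

In the full decoder, this rearrangement follows an ordinary convolution that produces the C·r² channels; the sketch shows only the shuffle itself.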
To address the incomplete, low-precision, and slow weld laser stripe segmentation caused by conditions at welding construction sites—uneven weldment surfaces, complex groove shapes left by the preceding weld pass, loss of centerline information, and heavy noise—a weld laser stripe segmentation model based on a CNN-Transformer hybrid network is established. Using the same dataset, experimental setup, and loss function, the proposed model outperforms commonly used lightweight semantic segmentation networks such as U-Net, DeepLabv3+, SegNet, PSPNet, RefineNet, and FCN-32s in both accuracy and processing speed. Experiments with different loss functions on the proposed network show that the improved loss function effectively addresses the imbalance between laser-stripe and background pixels, achieving the highest recognition accuracy and fastest convergence. With its small size and low computational complexity, a single-image inference time of 40 ms, and a pixel accuracy of 98%, the proposed model meets the requirements of lightweight, high-precision, low-latency vision tasks on resource-constrained mobile devices.
Ying Wang, Sheng Gao, Zhe Dai. Laser Stripe Segmentation of Weld Seam Based on CNN‑Transformer Hybrid Networks[J]. Chinese Journal of Lasers, 2024, 51(24): 2402110
Category: Laser Forming Manufacturing
Received: Mar. 26, 2024
Accepted: Jun. 21, 2024
Published Online: Dec. 11, 2024
The Author Email: Wang Ying (wangying@nepu.edu.cn)
CSTR:32183.14.CJL240710