Journal of Applied Optics, Vol. 45, Issue 6, 1204 (2024)
Video smoke recognition based on random patch shift and deformable attention
Recognizing smoke emission in industrial environments is vital for real-time regulation and monitoring of companies, as well as for environmental protection. However, the task is highly challenging. On the one hand, industrial smoke emissions are highly transparent and highly dynamic; on the other hand, the shape and size of the smoke change with the environment, lighting, and other factors. Current mainstream smoke recognition methods are deep learning models based on images or videos, but image-based models cannot effectively model the temporal dynamics of smoke in video, while video-based models do not account for the variable shape of smoke. To address both limitations, random patch shift (RPS) and deformable attention (DA) were introduced into the Swin Transformer. RPS transformed the traditional 2D spatial attention into spatio-temporal attention, so that dynamic smoke was modeled using 2D self-attention computations. Through adaptive deformation, DA enabled the network to adapt to different smoke shapes and appearance changes, thereby improving the robustness and generalization ability of the network. Experimental results on the RISE dataset show that the proposed method achieves F1 scores of 0.85, 0.86, and 0.84 on the three subsets, respectively, an improvement of 0.01 to 0.06 over other methods.
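As a rough illustration of the RPS idea described in the abstract, the NumPy sketch below replaces a random subset of patch tokens in each frame with the tokens at the same spatial position in a neighboring frame, so that ordinary 2D self-attention within one frame would also see temporal context. This is a minimal sketch under stated assumptions, not the authors' implementation: the function name `random_patch_shift`, the parameters `shift_prob` and `max_offset`, and the circular wrap-around at clip boundaries are all hypothetical choices made for illustration.

```python
import numpy as np


def random_patch_shift(tokens, shift_prob=0.5, max_offset=1, rng=None):
    """Randomly shift patch tokens along the time axis (illustrative sketch).

    tokens: array of shape (T, N, C) = (frames, patches per frame, channels).
    Returns the shifted tokens and, for every (frame, patch), the index of
    the frame the token was taken from.
    """
    rng = np.random.default_rng() if rng is None else rng
    num_frames, num_patches, _ = tokens.shape

    # Assumption: one temporal offset per spatial position, in [-max_offset, max_offset].
    offsets = rng.integers(-max_offset, max_offset + 1, size=num_patches)

    # Assumption: each position is left unshifted with probability 1 - shift_prob.
    keep = rng.random(num_patches) >= shift_prob
    offsets[keep] = 0

    # Source frame for every (frame, patch); wraps around at the clip boundary.
    src = (np.arange(num_frames)[:, None] + offsets[None, :]) % num_frames

    # Gather: the spatial position is unchanged, only the source frame differs.
    shifted = tokens[src, np.arange(num_patches)[None, :]]
    return shifted, src
```

After such a shift, each frame holds a mixture of patches from several time steps, which is how a purely spatial attention layer could, in principle, relate smoke appearance across frames; a practical model would typically shift the tokens back after the attention step.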
Yehui XIE, Haitao ZHAO. Video smoke recognition based on random patch shift and deformable attention[J]. Journal of Applied Optics, 2024, 45(6): 1204
Received: Sep. 21, 2023
Published Online: Jan. 14, 2025
The Author Email: ZHAO Haitao (赵海涛)