Objective: Ship detection plays a crucial role in security, military operations, and maritime traffic management. Infrared imaging has become an essential tool for ship detection because it remains effective in low light, at night, and in harsh weather. However, infrared images pose their own challenges, including indistinct target features, complex backgrounds, and overlapping targets, all of which make detection more difficult. The single-stage detector YOLOv8s balances detection accuracy and efficiency, but its performance is still limited by background interference and target occlusion. The YOLOv8s algorithm therefore needs to be improved, for example by enhancing the feature extraction network and refining the feature fusion strategy, to raise its detection accuracy and robustness to interference for infrared ship targets in complex marine environments.
Methods: First, a Local-Global Self-Attention (LGSA) module (Fig. 3) is introduced. It combines local and global pooling to capture richer local and global features, focusing more closely on target features. It then applies a self-attention mechanism to enhance fine-grained details, strengthen the dependencies between features, improve feature extraction, and reduce the impact of irrelevant noise. Next, a Spatial-Channel Sparse Attention (SCSA) module (Fig. 6) is proposed. It extracts multi-scale spatial information by segmenting features into blocks, applies a channel-sparse module to capture and reconstruct channel information, and then recalibrates the features to highlight important details, addressing insufficient multi-scale feature fusion. In addition, a small-target detection layer, P2, is added to capture the fine details of small targets and improve small-object detection. Finally, Soft-NMS with a confidence penalty factor is introduced to improve the NMS process and refine the rules for removing bounding boxes.
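The Soft-NMS step described above can be illustrated with a minimal sketch. The abstract does not state which penalty function is used, so the Gaussian decay below, the parameter names `sigma` and `score_thresh`, and their default values are assumptions taken from the commonly used form of Soft-NMS, not the authors' implementation. The idea is that a box overlapping an already selected box has its confidence reduced rather than being deleted outright, which helps keep stacked targets.

```python
import math


def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS (assumed variant).

    Returns a list of (box_index, decayed_score) in selection order.
    """
    remaining = [(i, float(s)) for i, s in enumerate(scores)]
    kept = []
    while remaining:
        # Select the box with the highest current score.
        best_pos = max(range(len(remaining)), key=lambda k: remaining[k][1])
        best_idx, best_score = remaining.pop(best_pos)
        kept.append((best_idx, best_score))

        # Penalise the confidence of the other boxes by their overlap
        # with the selected box instead of removing them immediately.
        decayed = []
        for idx, score in remaining:
            overlap = iou(boxes[best_idx], boxes[idx])
            score *= math.exp(-(overlap ** 2) / sigma)
            if score >= score_thresh:  # drop only boxes that decay to near zero
                decayed.append((idx, score))
        remaining = decayed
    return kept


# Toy example: boxes 0 and 1 overlap heavily, box 2 is isolated.
boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
scores = [0.9, 0.8, 0.7]
result = soft_nms(boxes, scores)
```

In this toy input, the overlapping box is retained with a lower score instead of being suppressed, whereas hard NMS with a typical IoU threshold would discard it.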
Results and Discussions: The study uses the infrared maritime ship dataset released by IRay Technology Company and conducts ablation experiments on the baseline YOLOv8s network (Tab. 3) to evaluate the effectiveness of the three improvements: LGSA, SCSA, and Soft-NMS. The ability of Soft-NMS to reduce missed detections of small targets and false detections of stacked targets is also investigated (Fig. 8). Compared with the baseline network, mAP0.5 improves by 2.1%, mAP0.5:0.95 improves by 4.4%, and the parameter count is reduced by 0.3 M. Overall, the improvements effectively suppress background noise, focus attention on the targets, and address the original network's tendency to miss small targets (Fig. 11). The improved YOLOv8s algorithm is also compared with other popular object detection algorithms (Tab. 4), and the results indicate that it outperforms them in detection accuracy. To further highlight its ability to capture and preserve small-target details, heatmaps of the small-target detection layers are compared between the improved model and other models of similar accuracy (Fig. 12), demonstrating the superiority of the improved algorithm.
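For readers less familiar with the two metrics, the sketch below shows how mAP0.5 and mAP0.5:0.95 relate under the usual COCO-style convention: the first is the mean average precision at an IoU threshold of 0.5, and the second averages it over the ten thresholds 0.50, 0.55, ..., 0.95. The function names and the per-threshold values are hypothetical and purely illustrative; they are not results from this study.

```python
def iou_thresholds(start=0.5, stop=0.95, step=0.05):
    """IoU thresholds used for mAP0.5:0.95 under the COCO-style convention."""
    n = round((stop - start) / step) + 1
    return [round(start + k * step, 2) for k in range(n)]


def mean_over_thresholds(ap_by_threshold):
    """Average the per-threshold AP values (already averaged over classes)."""
    return sum(ap_by_threshold.values()) / len(ap_by_threshold)


thresholds = iou_thresholds()

# Hypothetical values only: AP usually falls as the IoU threshold gets stricter.
toy_ap = {t: 1.0 - t for t in thresholds}

map_50 = toy_ap[0.5]                      # the mAP0.5 analogue
map_50_95 = mean_over_thresholds(toy_ap)  # the mAP0.5:0.95 analogue
```

Because the stricter thresholds demand tighter localisation, mAP0.5:0.95 is normally well below mAP0.5 for the same model.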
Conclusions: This work proposes an infrared ship target detection algorithm based on local-global self-attention and spatial-channel sparse enhancement. It aims to address challenges in infrared ship image detection such as large variations in target scale, densely stacked targets, and the loss of small-target details. Integrating an LGSA module into the YOLOv8s backbone network mitigates feature dilution and loss and enhances feature extraction. An SCSA module added to the neck network improves multi-scale feature processing and remedies the insufficient information fusion that previously limited small-target detection. Finally, Soft-NMS is used to improve the original network's NMS, addressing issues such as stacked targets, incomplete feature representation, and false detections of small targets, thereby increasing detection accuracy. Experimental results show that the improved YOLOv8s model outperforms the baseline model, improving mAP0.5 and mAP0.5:0.95 by 2.1% and 4.4% to reach 95.7% and 72.8%, respectively. These results validate the effectiveness of the proposed algorithm in enhancing infrared ship target detection accuracy. Compared with other classical models and the latest YOLOv11 model, the approach also achieves higher detection accuracy.