Research on urban street view semantic segmentation method based on Transformer architecture

XIONG Wei; ZHAO Di; SUN Peng; LIU Yue

doi:10.16136/j.joel.2024.12.0229

Journal of Optoelectronics · Laser, Volume 35, Issue 12, 1240 (2024)

XIONG Wei¹,², ZHAO Di¹, SUN Peng¹, and LIU Yue¹

Author Affiliations

¹School of Electrical and Electronic Engineering, Hubei University of Technology, Wuhan, Hubei 430068, China

²Department of Computer Science and Engineering, University of South Carolina, Columbia, SC 29201, USA

Abstract

When some Transformer-based networks segment urban street view images, they do not fully exploit multi-scale features and context information, which leads to defects such as holes in large targets and imprecise edge segmentation of small targets. To address this problem, this paper proposes Trans-AsfNet, a Transformer-based method that extracts multi-scale features and aggregates context information. The method introduces Swin Transformer as the feature extraction network to strengthen long-range dependencies. An adaptive subspace feature fusion (ASFF) module is proposed to strengthen the network's ability to extract multi-scale features, and an effective global context aggregation (EGCA) module is designed to improve the network's context aggregation capability; the rich multi-scale information is used for feature decoding and information compensation. Context information at different scales is then aggregated to strengthen the semantic understanding of targets, so as to eliminate holes in large targets and improve the edge segmentation accuracy of small targets. Trans-AsfNet is validated and tested on the CamVid urban street view dataset. The experimental results show that the network largely eliminates segmentation hole defects and improves the segmentation of small-target edges, reaching an MIoU of 69.5% on the CamVid test set.
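The MIoU figure quoted in the abstract is a mean intersection-over-union score. The abstract does not describe the evaluation protocol, so the following is only a minimal sketch of the standard definition, not the authors' evaluation code. The function name `mean_iou`, the flattened label lists, and the optional `ignore_index` argument (for a void label that is excluded from scoring) are illustrative assumptions.

```python
def mean_iou(pred, target, num_classes, ignore_index=None):
    """Mean IoU over the classes that occur in pred or target.

    pred and target are flat sequences of integer class labels.
    This is the textbook definition, assumed here for illustration.
    """
    # confusion[t][p] counts pixels with true class t and predicted class p
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, target):
        if t == ignore_index:
            continue  # skip pixels carrying the (assumed) void label
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp
        fp = sum(confusion[r][c] for r in range(num_classes)) - tp
        union = tp + fp + fn
        if union == 0:
            continue  # class absent from both prediction and ground truth
        ious.append(tp / union)
    return sum(ious) / len(ious) if ious else 0.0


# Toy example: a flattened 2x3 label map with three classes
target = [0, 0, 1, 1, 2, 2]
pred = [0, 1, 1, 1, 2, 0]
print(round(mean_iou(pred, target, 3), 4))  # per-class IoU: 1/3, 2/3, 1/2
```

In this toy case the three per-class IoU values average to 0.5. Whether absent classes are skipped or counted as zero varies between evaluation scripts, so reported scores are only comparable under the same convention.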

Keywords

context information; feature fusion; semantic segmentation; Transformer; urban street view

XIONG Wei, ZHAO Di, SUN Peng, LIU Yue. Research on urban street view semantic segmentation method based on Transformer architecture[J]. Journal of Optoelectronics · Laser, 2024, 35(12): 1240

Paper Information

Received: May 8, 2023

Accepted: Dec. 31, 2024

Published Online: Dec. 31, 2024

DOI:10.16136/j.joel.2024.12.0229
