Computer Engineering, Volume 51, Issue 8, 262 (2025)

A Research on Training Method for Diffusion Model Based on Neighborhood Attention

JI Lixia1,2, ZHOU Hongxin1, XIAO Shijie1, CHEN Yunfeng3, and ZHANG Han1,*
Author Affiliations
  • 1School of Cyber Science and Engineering, Zhengzhou University, Zhengzhou 450000, Henan, China
  • 2College of Software Engineering, Sichuan University, Chengdu 610065, Sichuan, China
  • 3Henan Cocyber Information and Technology Co., Ltd., Zhengzhou 450000, Henan, China

    Generative diffusion models learn a data distribution and generate new samples by progressively denoising input Gaussian noise, which has made them widely used in image generation. Recent work has shown that the inductive bias of the U-Net backbone commonly used in diffusion models is not critical, so a Transformer can serve as the backbone and inherit the latest advances from other domains. However, introducing the Transformer increases model size and slows training. To address the slow training and inadequate image detail of diffusion models with a Transformer backbone, this paper proposes a diffusion model built on a neighborhood attention architecture. The model uses a Transformer backbone with neighborhood attention and exploits the sparse global attention pattern of the neighborhood attention mechanism, which exponentially expands the model's perception range over the image and lets it attend to global information at lower cost. Progressively increasing the expansion in the attention expansion layers lets the model capture more visual information during training, yielding images with better global structure. Experimental results show that this design improves global consistency, produces better global detail in the generated images, and outperforms current state-of-the-art (SOTA) models.
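The claim that a sparse, progressively expanded neighborhood pattern widens the perception range exponentially can be illustrated with a small sketch. The code below is not the paper's implementation. It assumes a 1D token sequence instead of a 2D image, a simplified border rule (border tokens simply see fewer neighbors), and an illustrative dilation schedule of 1, 3, 9. The function names `neighborhood_mask`, `neighborhood_attention`, and `receptive_field` are hypothetical.

```python
import numpy as np

def neighborhood_mask(length, kernel_size, dilation=1):
    """Boolean mask: query i may attend to key j if j is one of the
    kernel_size neighbors of i spaced `dilation` tokens apart.
    Assumption: tokens near the border just get fewer neighbors."""
    pos = np.arange(length)
    diff = pos[:, None] - pos[None, :]
    half = kernel_size // 2
    return (np.abs(diff) <= half * dilation) & (diff % dilation == 0)

def neighborhood_attention(q, k, v, kernel_size, dilation=1):
    """Single-head scaled dot-product attention restricted to the mask."""
    length, dim = q.shape
    scores = q @ k.T / np.sqrt(dim)
    mask = neighborhood_mask(length, kernel_size, dilation)
    scores = np.where(mask, scores, -np.inf)
    # Each row contains its own diagonal entry, so the row max is finite.
    scores = scores - scores.max(axis=1, keepdims=True)
    weights = np.exp(scores)
    weights = weights / weights.sum(axis=1, keepdims=True)
    return weights @ v

def receptive_field(length, kernel_size, dilations):
    """reach[i, j] is True if output token i can depend on input token j
    after stacking one attention layer per entry in `dilations`."""
    reach = np.eye(length, dtype=bool)
    for d in dilations:
        mask = neighborhood_mask(length, kernel_size, d)
        reach = (mask.astype(int) @ reach.astype(int)) > 0
    return reach

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    x = rng.normal(size=(27, 8))
    out = neighborhood_attention(x, x, x, kernel_size=3, dilation=3)
    print(out.shape)
    center = 13
    print(receptive_field(27, 3, [1, 1, 1])[center].sum())  # fixed dilation
    print(receptive_field(27, 3, [1, 3, 9])[center].sum())  # growing dilation
```

In this toy setting each layer still attends to only three keys per query, so the per-layer cost is unchanged, yet three layers with dilations 1, 3, 9 let the center token reach all 27 positions, compared with 7 positions when the dilation stays at 1.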

    JI Lixia, ZHOU Hongxin, XIAO Shijie, CHEN Yunfeng, ZHANG Han. A Research on Training Method for Diffusion Model Based on Neighborhood Attention[J]. Computer Engineering, 2025, 51(8): 262

    Paper Information

    Received: Nov. 8, 2023

    Accepted: Aug. 26, 2025

    Published Online: Aug. 26, 2025

    The Author Email: ZHANG Han (zhang_han@gs.zzu.edu.cn)

    DOI:10.19678/j.issn.1000-3428.0068793
