Optics and Precision Engineering, Vol. 31, Issue 18, 2700 (2023)

Cross-scale and cross-dimensional adaptive transformer network for colorectal polyp segmentation

Liming LIANG, Anjun HE, Renjie LI, and Jian WU*
Author Affiliations
  • School of Electrical Engineering and Automation, Jiangxi University of Science and Technology, Ganzhou 341000, China
    Figures & Tables(14)
    Core Transformer encoding block
    Cross-scale and cross-dimensional adaptive transformer network
    Spatial attention bridge block
    Channel attention bridge block
    Multi-scale dense parallel decoding block
    Multi-scale prediction block
    Segmentation results of different networks on Kvasir and CVC-ClinicDB datasets
    Segmentation results of different networks on CVC-ColonDB and ETIS datasets
    • Table 1. [in Chinese]


      Algorithm 1: Spatial attention bridge block

      Inputs: The feature maps from the four channel attention bridge blocks, C_i (i = 1, 2, 3, 4)

      Outputs: S_i (i = 1, 2, 3, 4)

       1: χ_mean^i = AvgPool(C_i)  /* average pooling */

       2: χ_max^i = MaxPool(C_i)  /* max pooling */

       3: χ_s^i = Concat(χ_mean^i, χ_max^i)  /* concatenate the two pooled feature maps */

       4: α = Conv7×7(χ_s^i)  /* 7×7 convolution */

       5: ε = σ(α)  /* after the sigmoid, the feature map becomes C×H×1 */

       6: S_i = ε * C_i + C_i  /* multiply the sigmoid map by the original feature, then add the original feature */

      End
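The steps of Algorithm 1 can be sketched in NumPy as follows. This is only an illustration under stated assumptions, not the authors' implementation: the pooling is assumed to run along the channel axis (so each pooled map has shape 1×H×W), the 7×7 convolution is assumed to use zero padding and a single output channel, and the names `conv2d_same`, `spatial_attention_bridge`, and `kernel` are hypothetical.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def conv2d_same(x, kernel):
    """Zero-padded cross-correlation: x is (C_in, H, W), kernel is (C_in, k, k).

    Returns a single-channel (H, W) map. Hypothetical helper for this sketch.
    """
    _, k, _ = kernel.shape
    pad = k // 2
    xp = np.pad(x, ((0, 0), (pad, pad), (pad, pad)))
    h, w = x.shape[1:]
    out = np.zeros((h, w))
    for dy in range(k):
        for dx in range(k):
            window = xp[:, dy:dy + h, dx:dx + w]
            out += np.einsum("c,chw->hw", kernel[:, dy, dx], window)
    return out


def spatial_attention_bridge(c_i, kernel):
    """c_i: (C, H, W) feature map; kernel: (2, 7, 7) weights (assumed shape)."""
    chi_mean = c_i.mean(axis=0, keepdims=True)           # step 1: average pooling
    chi_max = c_i.max(axis=0, keepdims=True)             # step 2: max pooling
    chi_s = np.concatenate([chi_mean, chi_max], axis=0)  # step 3: concatenation
    alpha = conv2d_same(chi_s, kernel)                   # step 4: 7x7 convolution
    eps = sigmoid(alpha)                                 # step 5: sigmoid gate
    return eps[None] * c_i + c_i                         # step 6: reweight + residual
```

With an all-zero kernel the gate is 0.5 everywhere, so the output reduces to 1.5 times the input, which is a quick sanity check on the residual form in step 6.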

    • Table 1. Segmentation results of different networks on Kvasir and CVC-ClinicDB datasets


      Dataset       Method      Dice   MIoU   SE     PC     F2     MAE
      Kvasir        U-Net       0.818  0.746  0.856  0.857  0.827  0.055
      Kvasir        EUNet       0.908  0.854  0.934  0.911  0.919  0.028
      Kvasir        PraNet      0.898  0.840  0.911  0.916  0.901  0.032
      Kvasir        CaraNet     0.918  0.867  0.912  0.938  0.914  0.023
      Kvasir        PolypPVT    0.917  0.864  0.913  0.947  0.914  0.023
      Kvasir        SSFormer-L  0.918  0.865  0.897  0.957  0.904  0.022
      Kvasir        MSRAFormer  0.923  0.873  0.915  0.952  0.917  0.024
      Kvasir        Ours        0.932  0.883  0.933  0.944  0.931  0.021
      CVC-ClinicDB  U-Net       0.823  0.755  0.834  0.839  0.827  0.019
      CVC-ClinicDB  EUNet       0.902  0.846  0.959  0.880  0.926  0.011
      CVC-ClinicDB  PraNet      0.899  0.849  0.910  0.907  0.905  0.009
      CVC-ClinicDB  CaraNet     0.936  0.887  0.955  0.928  0.948  0.007
      CVC-ClinicDB  PolypPVT    0.937  0.889  0.949  0.936  0.945  0.006
      CVC-ClinicDB  SSFormer-L  0.906  0.855  0.897  0.931  0.898  0.008
      CVC-ClinicDB  MSRAFormer  0.924  0.874  0.945  0.920  0.932  0.008
      CVC-ClinicDB  Ours        0.942  0.896  0.964  0.927  0.954  0.006
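For readers unfamiliar with the column headings, the sketch below computes the six metrics for a single predicted mask using their standard definitions (Dice, IoU, sensitivity SE, precision PC, F2, and mean absolute error MAE). It is an illustration only: the 0.5 threshold, the `eps` smoothing term, and the function name `segmentation_metrics` are assumptions, and the MIoU column is assumed to be the per-image IoU averaged over a test set, which the page itself does not spell out.

```python
import numpy as np


def segmentation_metrics(pred, gt, threshold=0.5, eps=1e-8):
    """pred: predicted probabilities in [0, 1]; gt: binary ground-truth mask."""
    pred = np.asarray(pred, dtype=float)
    gt = np.asarray(gt, dtype=float)

    # MAE is taken on the raw prediction, before thresholding.
    mae = float(np.abs(pred - gt).mean())

    p = pred >= threshold
    g = gt >= 0.5
    tp = float(np.sum(p & g))    # polyp pixels correctly predicted
    fp = float(np.sum(p & ~g))   # background predicted as polyp
    fn = float(np.sum(~p & g))   # polyp pixels missed

    se = tp / (tp + fn + eps)    # sensitivity (recall)
    pc = tp / (tp + fp + eps)    # precision
    return {
        "Dice": 2 * tp / (2 * tp + fp + fn + eps),
        "IoU": tp / (tp + fp + fn + eps),
        "SE": se,
        "PC": pc,
        "F2": 5 * pc * se / (4 * pc + se + eps),  # F-beta with beta = 2
        "MAE": mae,
    }
```

F2 weights sensitivity more heavily than precision, which suits polyp segmentation, where a missed lesion is usually costlier than a false alarm.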
    • Table 2. Segmentation results of different networks on CVC-ColonDB and ETIS datasets


      Dataset      Method      Dice   MIoU   SE     PC     F2     MAE
      CVC-ColonDB  U-Net       0.512  0.444  0.523  0.621  0.510  0.061
      CVC-ColonDB  EUNet       0.756  0.681  0.849  0.758  0.788  0.044
      CVC-ColonDB  PraNet      0.712  0.640  0.739  0.755  0.717  0.043
      CVC-ColonDB  CaraNet     0.773  0.689  0.857  0.753  0.796  0.042
      CVC-ColonDB  PolypPVT    0.808  0.727  0.821  0.849  0.809  0.031
      CVC-ColonDB  SSFormer-L  0.802  0.721  0.791  0.864  0.787  0.031
      CVC-ColonDB  MSRAFormer  0.782  0.707  0.803  0.874  0.787  0.028
      CVC-ColonDB  Ours        0.811  0.731  0.823  0.844  0.813  0.027
      ETIS         U-Net       0.398  0.335  0.482  0.439  0.429  0.036
      ETIS         EUNet       0.687  0.609  0.871  0.635  0.749  0.066
      ETIS         PraNet      0.628  0.567  0.686  0.628  0.649  0.031
      ETIS         CaraNet     0.747  0.672  0.811  0.731  0.777  0.017
      ETIS         PolypPVT    0.787  0.706  0.867  0.774  0.820  0.013
      ETIS         SSFormer-L  0.796  0.720  0.830  0.794  0.807  0.014
      ETIS         MSRAFormer  0.750  0.679  0.811  0.745  0.777  0.013
      ETIS         Ours        0.805  0.729  0.887  0.770  0.842  0.012
    • Table 2. [in Chinese]


      Algorithm 2: Channel attention bridge block

      Inputs: The feature maps of the four stages, E_i (i = 1, 2, 3, 4)

      Outputs: C_i (i = 1, 2, 3, 4)

       1: h_mean^i = AvgPool(E_i)  /* average pooling */

       2: h_c = Concat(h_mean^1, h_mean^2, h_mean^3, h_mean^4)  /* concatenate the average-pooled feature maps */

       3: β = Conv3×3(h_c)  /* 3×3 convolution */

       4: γ = σ(β)  /* after the sigmoid, the feature map becomes C×H×1 */

       5: C_i = γ * E_i + E_i  /* multiply the sigmoid map by the original feature, then add the original feature */

      End
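A minimal NumPy sketch of Algorithm 2 follows. Several details are assumptions, since the pseudocode leaves them open: the average pooling is taken to be global (one value per channel), the "3×3 convolution" is replaced here by a size-3 one-dimensional convolution over the concatenated channel descriptor, and the resulting gate γ is split back into per-stage segments by channel count. The names `channel_attention_bridge` and `kernel` are hypothetical, and this is not the authors' implementation.

```python
import numpy as np


def channel_attention_bridge(stages, kernel):
    """stages: list of four (C_i, H_i, W_i) arrays; kernel: length-3 weights."""
    kernel = np.asarray(kernel, dtype=float)

    # Step 1: global average pooling, one descriptor value per channel.
    h_means = [e.mean(axis=(1, 2)) for e in stages]

    # Step 2: concatenate the descriptors of all stages.
    h_c = np.concatenate(h_means)

    # Step 3: size-3 convolution across the channel axis (assumed stand-in
    # for the 3x3 convolution). The kernel is flipped so that np.convolve
    # performs cross-correlation, as deep-learning layers do.
    beta = np.convolve(h_c, kernel[::-1], mode="same")

    # Step 4: sigmoid gate, one weight per channel.
    gamma = 1.0 / (1.0 + np.exp(-beta))

    # Step 5: split the gate per stage (assumption), reweight, add residual.
    bounds = np.cumsum([e.shape[0] for e in stages])[:-1]
    gammas = np.split(gamma, bounds)
    return [g[:, None, None] * e + e for g, e in zip(gammas, stages)]
```

Because the gate is computed from the descriptors of all four stages at once, each stage's channel weights can depend on neighbouring stages, which is the cross-scale interaction the block's name suggests.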

    • Table 3. Performance comparison of different networks(CVC-ClinicDB)


      Method      Parameters/M  GFLOPs  Train/(round·s^-1)
      U-Net       34.53         65.52   309
      EU-Net      31.36         12.31   284
      PraNet      30.50         6.96    90
      CaraNet     44.54         11.45   256
      Polyp-PVT   25.12         5.30    233
      SSFormer-L  65.96         17.29   220
      MSRAformer  68.03         21.29   199
      Ours        24.99         10.01   127
    • Table 4. Ablation results of each module on the Kvasir and CVC-ColonDB datasets


      Dataset      Method  Dice   MIoU   SE      PC     F2
      Kvasir       M1      0.906  0.851  0.900   0.931  0.901
      Kvasir       M2      0.921  0.871  0.930   0.931  0.926
      Kvasir       M3      0.928  0.877  0.934   0.936  0.928
      Kvasir       M4      0.932  0.883  0.933   0.944  0.931
      CVC-ColonDB  M1      0.786  0.705  0.7918  0.835  0.785
      CVC-ColonDB  M2      0.789  0.706  0.8337  0.803  0.802
      CVC-ColonDB  M3      0.810  0.730  0.841   0.797  0.806
      CVC-ColonDB  M4      0.811  0.731  0.823   0.844  0.813

    Liming LIANG, Anjun HE, Renjie LI, Jian WU. Cross-scale and cross-dimensional adaptive transformer network for colorectal polyp segmentation[J]. Optics and Precision Engineering, 2023, 31(18): 2700

    Paper Information

    Category: Information Sciences

    Received: Mar. 15, 2023

    Accepted: --

    Published Online: Oct. 12, 2023

    DOI:10.37188/OPE.20233118.2700
