Acta Optica Sinica, Vol. 40, Issue 19, 1910001 (2020)

Indoor RGB-D Image Semantic Segmentation Based on Dual-Stream Weighted Gabor Convolutional Network Fusion

Xuchu Wang1,2,*, Huihuang Liu2, and Yanmin Niu3
Author Affiliations
  • 1Key Laboratory of Optoelectronic Technology and Systems of Ministry of Education, Chongqing University, Chongqing 400040, China
  • 2College of Optoelectronic Engineering, Chongqing University, Chongqing 400040, China
  • 3College of Computer and Information Science, Chongqing Normal University, Chongqing 401331, China
    Figures & Tables(17)
    RGB-D image semantic segmentation by dual-stream weighted Gabor convolutional network fusion
    Modulation process of WGoFs
    Convolution process of WGoFs
    Wide residual blocks. (a) Original residual block; (b) wide residual block 1; (c) wide residual block 2
    Architecture of WRN-WGCN module
    Pyramid pooling module
    Proposed pyramid pooling feature fusion module
    RGB and depth images and their corresponding semantic labels in dataset. (a) RGB images; (b) depth images; (c) semantic labels
    Loss curves in training process
    Test accuracy versus number of scales and number of directions. (a) Test accuracy under different numbers of scales; (b) test accuracy under different numbers of directions
    Semantic segmentation results obtained by various methods on NYUDv2 dataset. (a) RGB; (b) depth; (c) GT; (d) baseline; (e) WRN-CNN; (f) WGCN; (g) PP-Fusion; (h) FCN; (i) SegNet; (j) ours
    Semantic segmentation results obtained by various methods on SUN-RGBD dataset. (a) RGB; (b) depth; (c) GT; (d) baseline; (e) WRN-CNN; (f) WGCN; (g) PP-Fusion; (h) FCN; (i) SegNet; (j) ours
    • Table 1. Structural parameter setting of WRN-WGCN

      Group name | Output feature size | Block type
      GCConv1 | N×N | [3×3, 8]
      GCConv2 | N×N | [3×3, 16×k; 3×3, 16×k] × L
      GCConv3 | N×N | [3×3, 16×k; 3×3, 16×k] × L
      GCConv4 | (N/2)×(N/2) | [3×3, 32×k; 3×3, 32×k] × L
    • Table 2. Model sizes with different filter sizes

      Model name | Filter size | Model size /MB
      Model 1 | 5×5 | 163
      Model 2 | 5×5 | 124
      Model 3 | 3×3 | 148
      Model 4 | 3×3 | 117
    • Table 3. Comparison of results for different segmentation algorithms on NYUDv2 dataset

      Method | Module: WRN-CNN | Module: WGCN | Module: PP-Fusion | Acc /% | mAcc /% | mIoU /% | FWIoU /%
      Ours | | | | 66.3 | 50.8 | 40.0 | 53.1
      Variant 1 | | | | 58.3 | 41.6 | 30.1 | 45.8
      Variant 2 | | | | 58.6 | 42.4 | 31.9 | 45.3
      Variant 3 | | | | 60.8 | 48.2 | 35.8 | 50.4
      Variant 4 | | | | 63.2 | 45.8 | 36.4 | 46.6
      FCN [2] | | | | 65.4 | 45.1 | 34.3 | 48.6
      SegNet [3] | | | | 56.2 | 47.6 | 35.1 | 50.1
    • Table 4. Comparison of results for different segmentation algorithms on SUN-RGBD dataset

      Method | Module: WRN-CNN | Module: WGCN | Module: PP-Fusion | Acc /% | mAcc /% | mIoU /% | FWIoU /%
      Ours | | | | 58.2 | 38.5 | 28.2 | 42.0
      Variant 1 | | | | 45.2 | 33.7 | 21.8 | 37.4
      Variant 2 | | | | 44.8 | 34.5 | 23.1 | 38.6
      Variant 3 | | | | 54.6 | 35.1 | 27.3 | 37.7
      Variant 4 | | | | 56.1 | 34.6 | 26.0 | 36.3
      FCN [2] | | | | 49.5 | 36.5 | 23.7 | 35.8
      SegNet [3] | | | | 47.8 | 34.6 | 26.2 | 38.2
    • Table 5. Comparison of reasoning time and space complexity for different algorithms

      Method | Module: WRN-CNN | Module: WGCN | Module: PP-Fusion | Model size /MB | Reasoning time /ms
      Ours | | | | 117 | 42
      Variant 1 | | | | 381 | 76
      Variant 2 | | | | 115 | 35
      Variant 3 | | | | 187 | 48
      Variant 4 | | | | 245 | 51
      FCN [2] | | | | 549 | 43
      SegNet [3] | | | | 126 | 58
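The four accuracy columns in Tables 3 and 4 (Acc, mAcc, mIoU, FWIoU) are conventionally derived from a per-class confusion matrix. The paper's evaluation code is not reproduced on this page, so the following is only a minimal sketch under the standard definitions of these metrics. The function name `segmentation_metrics` and the toy matrix are illustrative assumptions, not taken from the article.

```python
def segmentation_metrics(conf):
    """Compute Acc, mAcc, mIoU and FWIoU from a confusion matrix.

    conf[i][j] is the number of pixels whose true class is i and whose
    predicted class is j. Classes with no ground-truth pixels are skipped
    when averaging (an assumption; conventions differ between toolkits).
    """
    n = len(conf)
    total = sum(sum(row) for row in conf)
    tp = [conf[i][i] for i in range(n)]                            # correctly labeled pixels
    gt = [sum(conf[i]) for i in range(n)]                          # pixels per true class
    pred = [sum(conf[i][j] for i in range(n)) for j in range(n)]   # pixels per predicted class
    present = [i for i in range(n) if gt[i] > 0]

    acc = sum(tp) / total                                          # pixel accuracy
    macc = sum(tp[i] / gt[i] for i in present) / len(present)      # mean class accuracy
    iou = {i: tp[i] / (gt[i] + pred[i] - tp[i]) for i in present}  # per-class IoU
    miou = sum(iou.values()) / len(present)                        # mean IoU
    fwiou = sum(gt[i] * iou[i] for i in present) / total           # frequency-weighted IoU
    return {"Acc": acc, "mAcc": macc, "mIoU": miou, "FWIoU": fwiou}


# Toy two-class example (hypothetical counts, not data from the paper).
toy_conf = [[3, 1],
            [1, 5]]
metrics = segmentation_metrics(toy_conf)
for name, value in metrics.items():
    print(f"{name}: {100 * value:.1f} %")
```

Because FWIoU weights each class IoU by its pixel frequency while mIoU weights all classes equally, a method can rank differently on the two columns, as several rows of Tables 3 and 4 do.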

    Xuchu Wang, Huihuang Liu, Yanmin Niu. Indoor RGB-D Image Semantic Segmentation Based on Dual-Stream Weighted Gabor Convolutional Network Fusion[J]. Acta Optica Sinica, 2020, 40(19): 1910001

    Paper Information

    Category: Image Processing

    Received: Apr. 26, 2020

    Accepted: Jun. 19, 2020

    Published Online: Sep. 23, 2020

    Corresponding author email: Xuchu Wang (xcwang@cqu.edu.cn)

    DOI:10.3788/AOS202040.1910001
