Infrared and Laser Engineering, Vol. 53, Issue 5, 20240026 (2024)

Multi-modal-fusion-based 3D semantic segmentation algorithm

Qi Chao, Yandong Zhao, and Shengbo Liu
Author Affiliations
  • School of Engineering, Beijing Forestry University, Beijing 100080, China
    Figures & Tables (12)
    Multi-modal network
    Image feature generation network
    Information loss during voxel downsampling
    Point cloud feature generation network
    Dynamic feature fusion module
    Visualization of the data augmentation strategy. (a) Original point cloud data; (b) Complete point cloud of the augmented instance object (a tree); (c) Pasted point cloud seen from the viewpoint of the acquisition device; (d) Original image data, where the green dots are the projection of the tree instance point cloud onto the image; (e) Foreground image of the trees; (f) Result of pasting the tree foreground image (shown at the pasting position for ease of observation); (g) Points (green dots) for which the projection of the pasted tree point cloud matches the image mask; (h) Points (green dots) in the pasted tree point cloud that do not match the image mask; (i) Points for which the tree point cloud and the image match after mapping correction
    GT-Paste[11] data augmentation. (a) Original point cloud scene; (b) Point cloud scene after pasting, where purple and red mark the points to be filtered because of occlusion; (c) Filtered point cloud scene; (d) Original image scene; (e) Image scene after pasting; (f) Image scene after the occlusion relationships are handled
    Qualitative results of the model. (a) and (d) visualize the false positives of the baseline (the first row of the ablation experiment); (b) and (e) visualize the false positives of the final model of this paper (the fourth row of the ablation experiment); (c) and (f) show the ground truth
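
The mask-matching step described in panels (g) and (h) of the data augmentation caption can be sketched as follows. This is a minimal illustration and not the authors' implementation: it assumes the pasted points are already expressed in the camera frame, a pinhole intrinsic matrix `K`, and a boolean foreground mask. The function names `project_points` and `match_points_to_mask` are hypothetical.

```python
import numpy as np

def project_points(points_cam, K):
    """Project Nx3 camera-frame points to pixel coordinates (pinhole model).

    Returns (uv, in_front): uv is Nx2 (u = column, v = row); points behind
    the camera keep the placeholder value -1 and have in_front == False.
    """
    z = points_cam[:, 2]
    in_front = z > 1e-6
    uv = np.full((points_cam.shape[0], 2), -1.0)
    proj = (K @ points_cam[in_front].T).T          # rows are (u*z, v*z, z)
    uv[in_front] = proj[:, :2] / proj[:, 2:3]      # perspective division
    return uv, in_front

def match_points_to_mask(points_cam, K, mask):
    """Return a boolean array: True where a point projects onto the mask."""
    h, w = mask.shape
    uv, in_front = project_points(points_cam, K)
    cols = np.floor(uv[:, 0]).astype(int)
    rows = np.floor(uv[:, 1]).astype(int)
    inside = in_front & (cols >= 0) & (cols < w) & (rows >= 0) & (rows < h)
    matched = np.zeros(points_cam.shape[0], dtype=bool)
    matched[inside] = mask[rows[inside], cols[inside]]
    return matched

# Assumed toy setup: 100x100 image, square foreground mask in the centre.
K = np.array([[100.0, 0.0, 50.0],
              [0.0, 100.0, 50.0],
              [0.0, 0.0, 1.0]])
mask = np.zeros((100, 100), dtype=bool)
mask[40:60, 40:60] = True
pts = np.array([[0.0, 0.0, 10.0],    # projects to the image centre
                [1.0, 0.0, 2.0],     # projects outside the image
                [0.0, 0.0, -5.0],    # behind the camera
                [-0.2, 0.0, 1.0]])   # inside the image, outside the mask
matched = match_points_to_mask(pts, K, mask)
```

Points with `matched == False` correspond to the mismatched points of panel (h). How the paper's mapping correction then treats them is not specified in the caption.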
    • Table 1. Performance comparison with other algorithms

      Method          mIoU  Car   Truck  Pedestrian  Bicycle  Road  Motorcycle  Barrier  Vegetation  Speed/ms
      SquSegv3[24]    53.8  92.8  36.8   63.4        25.7     91.1  21.1        14.2     85.1        97
      KPconv[4]       58.2  93.5  37.7   71.9        39.4     89.7  23.5        25.1     84.8        -
      (AF)2S3Net[14]  62.0  93.2  41.6   73.1        45.5     90.6  39.9        26.0     86.7        270
      SPVCNN[8]       63.3  95.8  44.8   74.4        42.1     91.3  46.4        28.6     87.5        63
      Fus3DSeg[13]    64.3  96.1  48.1   67.3        43.7     93.0  48.1        30.2     88.3        -
      Ours            66.7  94.1  49.6   79.3        47.8     90.9  52.6        31.2     88.4        88
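
For readers unfamiliar with the metric reported in Table 1, the following sketch shows how per-class IoU and mIoU are commonly computed from a confusion matrix. It is a generic illustration on made-up labels, not the evaluation code used in the paper; the helper names are hypothetical.

```python
import numpy as np

def confusion_matrix(gt, pred, num_classes):
    """Rows are ground-truth classes, columns are predicted classes."""
    idx = gt * num_classes + pred
    counts = np.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

def per_class_iou(cm):
    """IoU_c = TP / (TP + FP + FN); NaN for classes that never occur."""
    tp = np.diag(cm).astype(float)
    fp = cm.sum(axis=0) - tp
    fn = cm.sum(axis=1) - tp
    denom = tp + fp + fn
    return np.where(denom > 0, tp / np.maximum(denom, 1), np.nan)

def mean_iou(cm):
    """Average IoU over the classes that occur at least once."""
    return float(np.nanmean(per_class_iou(cm)))

# Made-up per-point labels for three classes (illustration only).
gt = np.array([0, 0, 1, 1, 2, 2])
pred = np.array([0, 1, 1, 1, 2, 0])
cm = confusion_matrix(gt, pred, num_classes=3)
ious = per_class_iou(cm)   # class 0: 1/3, class 1: 2/3, class 2: 1/2
miou = mean_iou(cm)        # (1/3 + 2/3 + 1/2) / 3 = 0.5
```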
    • Table 2. Ablation experiment

      √ marks an enabled module among Depth estimate, VPS network, DFM, and Point augment.

      Enabled modules  mIoU  Car           Pedestrian    Vegetation
                             <25 m  >25 m  <25 m  >25 m  <25 m  >25 m
      (none)           62.8  95.2   86.4   79.3   62.7   90.3   79.8
      √ √              64.4  97.1   89.8   81.2   67.4   91.1   82.4
      √ √              64.6  97.3   89.9   82.8   69.6   91.2   82.1
      √ √ √ √          66.7  97.6   90.2   85.3   73.3   92.3   83.5
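
The ablation table reports results separately for points closer and farther than 25 m. A minimal sketch of such a range-split IoU is given below. It assumes the distance is the horizontal range from the sensor origin, which the table does not state, and the function names are hypothetical.

```python
import numpy as np

def class_iou(gt, pred, cls):
    """IoU of one class over a set of points; NaN if the class is absent."""
    inter = np.sum((gt == cls) & (pred == cls))
    union = np.sum((gt == cls) | (pred == cls))
    return float(inter / union) if union > 0 else float("nan")

def iou_by_range(points, gt, pred, cls, threshold_m=25.0):
    """Split points at threshold_m (assumed: horizontal range) and score each part."""
    rng = np.linalg.norm(points[:, :2], axis=1)
    near = rng < threshold_m
    return {
        "near": class_iou(gt[near], pred[near], cls),
        "far": class_iou(gt[~near], pred[~near], cls),
    }

# Made-up points (x, y, z in metres) and labels, for illustration only.
points = np.array([[10.0, 0.0, 0.0],
                   [0.0, 20.0, 0.0],
                   [30.0, 0.0, 0.0],
                   [0.0, 40.0, 0.0]])
gt = np.array([1, 1, 1, 1])
pred = np.array([1, 0, 1, 1])
result = iou_by_range(points, gt, pred, cls=1)
```

In this toy case one of the two near points is misclassified, so the near IoU is 0.5 and the far IoU is 1.0.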
    • Table 3. Comparison of voxel feature extraction network effects

      Network  Car   Pedestrian  Vegetation
      CN       93.2  75.7        87.1
      VPS      94.1  79.3        88.4
    • Table 4. Comparison of object detection results

      Method                    mAP
      Baseline                  55.4%
      Baseline + GT-Paste[11]   57.0%
      Baseline + PointAugment   57.2%
    Qi Chao, Yandong Zhao, Shengbo Liu. Multi-modal-fusion-based 3D semantic segmentation algorithm[J]. Infrared and Laser Engineering, 2024, 53(5): 20240026
    Paper Information

    Received: Jan. 16, 2024

    Published Online: Jun. 21, 2024

    DOI:10.3788/IRLA20240026