Optics and Precision Engineering, Vol. 32, Issue 24, 3603 (2024)

Multi-frame self-supervised monocular depth estimation with multi-scale feature enhancement

Qiqi KOU1, Weichen WANG2, Chenggong HAN2, Chen LÜ2, Deqiang CHENG2, and Yucheng JI3,*
Author Affiliations
  • 1 School of Computer Science and Technology, China University of Mining and Technology, Xuzhou 221116, China
  • 2 School of Information and Control Engineering, China University of Mining and Technology, Xuzhou 221116, China
  • 3 Big Data Center, Ministry of Emergency Management, Beijing 10001, China
    Figures & Tables (15)
    Process of self-supervised depth estimation algorithm
    Cost volume construction of multi-frame depth estimation network
    Our depth estimation network architecture
    Activation Module Based on Vision Attention (Act-VAN)
    Large kernel attention
    Structure enhancement module
    Structure of dynamic upsampling
    Comparison of visualization results on the KITTI dataset
    Comparison of visualization results on the CityScapes dataset
    Comparison of results for module design on SqRel
    • Table 1. Test results on the KITTI dataset
      Indicator of Error (Lower is better): AbsRel, SqRel, RMSE, RMSElog. Accuracy of Prediction (Higher is better): δ<1.25, δ<1.25², δ<1.25³ (a computation sketch for these metrics follows the tables).

      Method                Resolution   AbsRel   SqRel   RMSE    RMSElog   δ<1.25   δ<1.25²   δ<1.25³
      Monodepth2[8]         640×192      0.115    0.903   4.863   0.193     0.877    0.959     0.981
      Johnston et al.[26]   640×192      0.106    0.861   4.699   0.185     0.889    0.962     0.982
      Packnet-SFM[27]       640×192      0.111    0.785   4.601   0.189     0.878    0.960     0.982
      VA-Depth[28]          640×192      0.112    0.864   4.804   0.190     0.877    0.959     0.982
      Zeeshan et al.[29]    640×192      0.113    0.903   4.863   0.193     0.877    0.959     0.981
      STDepthFormer[30]     640×192      0.110    0.805   4.678   0.187     0.878    0.961     0.983
      Patil et al.[31]      640×192      0.111    0.821   4.650   0.187     0.883    0.961     0.982
      Saunders et al.[32]   640×192      0.100    0.747   4.455   0.177     0.895    0.966     0.984
      Wang et al.[33]       640×192      0.106    0.799   4.662   0.187     0.889    0.961     0.983
      Manydepth[13]         640×192      0.098    0.770   4.459   0.176     0.900    0.965     0.983
      Ours                  640×192      0.095    0.743   4.374   0.175     0.903    0.965     0.983
      Monodepth2[8]         1 024×320    0.115    0.882   4.701   0.190     0.879    0.961     0.982
      Packnet-SFM[27]       1 280×384    0.107    0.802   4.538   0.186     0.889    0.962     0.981
      Shu et al.[34]        1 024×320    0.104    0.729   4.481   0.179     0.893    0.965     0.984
      Wang et al.[33]       1 024×320    0.106    0.773   4.491   0.185     0.890    0.962     0.982
      Manydepth[13]         1 024×320    0.093    0.715   4.245   0.172     0.909    0.966     0.983
      Ours                  1 024×320    0.090    0.703   4.213   0.170     0.913    0.966     0.983
    • Table 2. Test results on the CityScapes dataset

      Method          AbsRel   SqRel   RMSE    RMSElog
      Monodepth2[8]   0.129    1.569   6.876   0.187
      Li et al.[35]   0.119    1.290   6.980   0.190
      Manydepth[13]   0.114    1.193   6.223   0.170
      Ours            0.111    1.165   6.196   0.168
    • Table 3. Model parameter count and computational complexity

      Model           Params/M   FLOPs/G
      Monodepth2[8]   14.3       8.0
      Manydepth[13]   14.4       10.2
      Ours            22.0       16.3
    • Table 4. Ablation experiment of modules
      Indicator of Error (Lower is better): AbsRel, SqRel, RMSE, RMSElog. Accuracy of Prediction (Higher is better): δ<1.25, δ<1.25², δ<1.25³.

      Experiment   Act-VAN   SEM   DySample   AbsRel   SqRel   RMSE    RMSElog   δ<1.25   δ<1.25²   δ<1.25³
      1            ×         ×     ×          0.098    0.770   4.459   0.176     0.900    0.965     0.983
      2            √         ×     ×          0.098    0.765   4.441   0.176     0.900    0.965     0.983
      3            ×         √     ×          0.096    0.753   4.379   0.175     0.902    0.965     0.983
      4            √         √     ×          0.096    0.755   4.436   0.176     0.901    0.965     0.983
      5            √         √     √          0.095    0.743   4.374   0.175     0.903    0.965     0.983
    • Table 5. Ablation experiment of module design
      Indicator of Error (Lower is better): AbsRel, SqRel, RMSE, RMSElog. Accuracy of Prediction (Higher is better): δ<1.25, δ<1.25², δ<1.25³.

      Experiment   Method          AbsRel   SqRel   RMSE    RMSElog   δ<1.25   δ<1.25²   δ<1.25³
      1            Baseline        0.098    0.770   4.459   0.176     0.900    0.965     0.983
      2            VAN             0.097    0.765   4.495   0.176     0.901    0.965     0.983
      3            Act-VAN         0.096    0.756   4.392   0.176     0.902    0.965     0.983
      4            SEM (w/o LKA)   0.096    0.761   4.440   0.176     0.901    0.965     0.983
      5            SEM             0.096    0.759   4.437   0.176     0.901    0.965     0.983
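    The error and accuracy columns in the tables above (AbsRel, SqRel, RMSE, RMSElog and the δ thresholds) are the standard monocular depth-evaluation metrics used on KITTI and Cityscapes. The sketch below shows how such values are typically computed, assuming gt and pred are per-pixel depth arrays already restricted to valid pixels and scaled to a common metric (self-supervised methods usually apply median scaling first); the function name compute_depth_metrics is illustrative and not taken from the paper.

        import numpy as np

        def compute_depth_metrics(gt, pred):
            """Standard depth metrics over valid pixels (gt, pred are positive depths)."""
            # Threshold accuracies: fraction of pixels with max(pred/gt, gt/pred) below 1.25^k
            ratio = np.maximum(gt / pred, pred / gt)
            d1 = (ratio < 1.25).mean()
            d2 = (ratio < 1.25 ** 2).mean()
            d3 = (ratio < 1.25 ** 3).mean()

            # Error metrics (lower is better)
            abs_rel = np.mean(np.abs(gt - pred) / gt)                      # AbsRel
            sq_rel = np.mean((gt - pred) ** 2 / gt)                        # SqRel
            rmse = np.sqrt(np.mean((gt - pred) ** 2))                      # RMSE
            rmse_log = np.sqrt(np.mean((np.log(gt) - np.log(pred)) ** 2))  # RMSElog

            return {"AbsRel": abs_rel, "SqRel": sq_rel, "RMSE": rmse, "RMSElog": rmse_log,
                    "δ<1.25": d1, "δ<1.25²": d2, "δ<1.25³": d3}

    In the usual protocol, these per-image values are averaged over the test split to give table entries of the kind reported above.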
    Citation

    Qiqi KOU, Weichen WANG, Chenggong HAN, Chen LÜ, Deqiang CHENG, Yucheng JI. Multi-frame self-supervised monocular depth estimation with multi-scale feature enhancement[J]. Optics and Precision Engineering, 2024, 32(24): 3603

    Paper Information

    Received: Jun. 8, 2024

    Accepted: --

    Published Online: Mar. 11, 2025

    Corresponding Author Email: Yucheng JI (j.yc@outlook.com)

    DOI: 10.37188/OPE.20243224.3603
