Laser & Optoelectronics Progress, Vol. 61, Issue 16, 1611013 (2024)

Single-View Endoscopic Surgical Light Field Reconstruction Combining Vision Transformer and Diffusion Model (Invited)

Chenming Han and Gaochang Wu*
Author Affiliations
  • State Key Laboratory of Synthetical Automation for Process Industries, Northeastern University, Shenyang 110819, Liaoning, China
    Figures & Tables(8)
    Flow diagram of the proposed method
    Operational details of the decoder module: (a) reassembling tokens into feature maps with two spatial dimensions; (b) progressively upsampling the feature maps and fusing features at different scales; (c) reconstructing the feature map into the final MPI representation
    Visual comparison of light field reconstruction results with Zhou et al. and Tucker et al.
    Ablation study of diffusion model-based background prediction: Da Vinci robot surgical data
    Results of diffusion model-based background prediction: surgical microscope data
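The decoder caption above lists three steps: reassembling tokens into 2D feature maps, upsampling, and fusing features across scales. The first two can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the function names, the row-major token order, the choice to drop the global token, nearest-neighbour upsampling, and additive fusion are all assumptions made for illustration.

```python
import numpy as np


def reassemble_tokens(tokens: np.ndarray, grid_h: int, grid_w: int) -> np.ndarray:
    """Reshape transformer tokens (1 + grid_h*grid_w, C) into a (C, grid_h, grid_w) map.

    Assumptions (not from the paper): token 0 is a global token that is
    simply dropped, and patch tokens are stored in row-major order.
    """
    n_tokens, channels = tokens.shape
    if n_tokens != 1 + grid_h * grid_w:
        raise ValueError("token count does not match the patch grid")
    patch_tokens = tokens[1:]                      # drop the global token
    fmap = patch_tokens.reshape(grid_h, grid_w, channels)
    return fmap.transpose(2, 0, 1)                 # channels first


def upsample_nearest(fmap: np.ndarray, factor: int) -> np.ndarray:
    """Nearest-neighbour upsampling of a (C, H, W) feature map."""
    return fmap.repeat(factor, axis=1).repeat(factor, axis=2)


def fuse(coarse: np.ndarray, fine: np.ndarray) -> np.ndarray:
    """Upsample the coarse map to the fine resolution and add the two (hypothetical fusion rule)."""
    factor = fine.shape[1] // coarse.shape[1]
    return upsample_nearest(coarse, factor) + fine


# Toy example: a 2x3 patch grid with 4 channels plus one global token.
tokens = np.arange(7 * 4, dtype=np.float64).reshape(7, 4)
coarse_map = reassemble_tokens(tokens, grid_h=2, grid_w=3)
fine_map = upsample_nearest(coarse_map, 2)
fused_map = fuse(coarse_map, fine_map)
```

A learned decoder would typically replace the nearest-neighbour step with trainable upsampling and the plain sum with convolutional fusion; the sketch only shows how the tensor shapes change.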
    • Table 1. Comparison of different methods

      Method               PSNR↑          SSIM↑           LPIPS↓          FID↓
      Zhou et al. [14]     26.71±0.9625   0.8646±0.0245   0.1383±0.0484   78.2366
      Tucker et al. [27]   26.92±0.9838   0.8732±0.0596   0.1293±0.0515   76.3018
      ViT-MPI [33]         27.27±0.4715   0.8599±0.0132   0.1181±0.0293   70.1089
      Proposed             27.55±1.9038   0.8902±0.0403   0.1157±0.0589   58.1178
    • Table 2. Ablation experiment

      Configuration                       PSNR↑          SSIM↑           LPIPS↓          FID↓
      L1 Loss                             25.81±1.2192   0.8515±0.0122   0.1947±0.0537   124.9708
      Proposed w/o global token           24.43±2.7232   0.8537±0.0168   0.1238±0.0174   75.8218
      Proposed w/o occlusion prediction   27.33±1.1817   0.8559±0.0406   0.1213±0.0615   70.1089
      Proposed (D=8)                      25.01±0.8435   0.7683±0.0296   0.1692±0.0465   169.6536
      Proposed (D=16)                     25.41±0.2761   0.8074±0.0113   0.1432±0.0346   130.1493
      Proposed (D=32)                     27.55±1.9038   0.8902±0.0403   0.1157±0.0589   58.1178
    • Table 3. Computational complexity of models in proposed method

      Model       GFLOPs    Parameters (M)
      ViT         13.58     28.54
      Diffusion   116.14    387.25
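Tables 1 and 2 report PSNR in a "value±spread" form, presumably the mean and standard deviation over the test images. As a rough illustration of how such an entry could be produced, the sketch below uses the standard PSNR definition, 10·log10(MAX² / MSE). The function names, the [0, 1] intensity range, and the use of the population standard deviation are assumptions, not details taken from the paper.

```python
import numpy as np


def psnr(reference, estimate, max_value: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB between two images of equal shape.

    Assumption: intensities lie in [0, max_value].
    """
    ref = np.asarray(reference, dtype=np.float64)
    est = np.asarray(estimate, dtype=np.float64)
    if ref.shape != est.shape:
        raise ValueError("images must have the same shape")
    mse = np.mean((ref - est) ** 2)
    if mse == 0.0:
        return float("inf")          # identical images
    return float(10.0 * np.log10(max_value ** 2 / mse))


def summarize(values) -> str:
    """Format per-image scores as 'mean±std' (population std, an assumption)."""
    arr = np.asarray(values, dtype=np.float64)
    return f"{arr.mean():.2f}±{arr.std():.4f}"


# Toy example: a constant error of 0.1 on a unit-range image gives about 20 dB.
reference_image = np.zeros((4, 4))
estimated_image = np.full((4, 4), 0.1)
example_psnr = psnr(reference_image, estimated_image)
example_summary = summarize([26.0, 28.0])
```

SSIM, LPIPS, and FID need dedicated implementations (LPIPS and FID rely on pretrained networks) and are not sketched here.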

    Chenming Han, Gaochang Wu. Single-View Endoscopic Surgical Light Field Reconstruction Combining Vision Transformer and Diffusion Model (Invited)[J]. Laser & Optoelectronics Progress, 2024, 61(16): 1611013

    Paper Information

    Category: Imaging Systems

    Received: May 13, 2024

    Accepted: Jul. 18, 2024

    Published Online: Aug. 12, 2024

    Corresponding Author Email: Gaochang Wu (wugc@mail.neu.edu.cn)

    DOI:10.3788/LOP241272

    CSTR:32186.14.LOP241272
