Photonics Research, Vol. 9, Issue 11, 2277 (2021)

Ten-mega-pixel snapshot compressive imaging with a hybrid coded aperture

Zhihong Zhang1,2,†, Chao Deng1,2,†, Yang Liu3, Xin Yuan4,6, Jinli Suo1,2,*, and Qionghai Dai1,2,5
Author Affiliations
  • 1Department of Automation, Tsinghua University, Beijing 100084, China
  • 2Institute for Brain and Cognitive Science, Tsinghua University, Beijing 100084, China
  • 3Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA
  • 4Westlake University, Hangzhou 310024, China
  • 5Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China
  • 6e-mail: xyuan@westlake.edu.cn
    Figures & Tables(11)
    Our 10-mega-pixel video SCI system (a) and its schematic (b). Ten high-speed (200 fps), high-resolution (3200×3200 pixels) video frames (c) reconstructed from a snapshot measurement (d), with motion detail in (e) for the small region in the blue box of (d). Unlike existing solutions that use only an LCoS or only a mask (and thus have limited spatial resolution), our 10-mega-pixel spatio-temporal coding is generated jointly by an LCoS at the aperture plane and a static mask close to the image plane.
    Pipeline of the proposed large-scale HCA-SCI system (left) and the PnP reconstruction algorithms (right). Left: During the encoded photography stage, a dynamic low-resolution mask at the aperture plane and a static high-resolution mask close to the sensor plane work together to generate a sequence of high-resolution codes to encode the large-scale video into a snapshot. Right: In the decoding, the video is reconstructed under a PnP framework incorporating deep denoising prior and TV prior into a convex optimization (GAP), which leverages the good convergence of GAP and the high efficiency of the deep network.
    Illustration of the multiplexed mask generation. For the same scene point, the images formed through different sub-apertures (marked in blue, yellow, and red, respectively) intersect the mask plane at different regions and are thus encoded with correspondingly shifted random masks before summation at the sensor. Multiplexing raises the light flux for high-SNR recording at the cost of only slight performance degradation.
    Multiplexing pattern schemes used in our experiments (taking Cr=6 for an example). Top row: multiplexing patterns for simulation experiments. Each pattern contains 50% open sub-apertures, and each sub-aperture is a 512×512 binning macro pixel on the LCoS. Bottom row: multiplexing patterns for real experiments. Each pattern contains an open circle with a radius of about 400 pixels, and the circles in adjacent patterns have a rotation of 360/Cr degrees.
    Reconstruction results and comparison with state-of-the-art algorithms on simulated data at different resolutions (left: 256×256, middle: 512×512, right: 1024×1024) and with different compression ratios (top: Cr=10, bottom: Cr=20). BIRNAT results are not available for 512×512 and 1024×1024 because model training runs out of memory at those resolutions. See Visualization 1, Visualization 2, Visualization 3, Visualization 4, Visualization 5, and Visualization 6 for the reconstructed videos.
    Noise robustness comparison between multiplexed and non-multiplexed masks.
    Reconstruction results of PnP–TV–FastDVDNet on real data captured by our HCA-SCI system (Cr=6, 10, 20, and 30). Note that the full frames are 3200×3200 pixels; we show small regions of about 400×400 pixels to demonstrate the high-speed motion.
    Reconstruction comparison between GAP–TV, PnP–FFDNet, and PnP–TV–FastDVDNet on real data captured by our HCA-SCI system (Cr=6, 10, 20, and 30). Note that the full frames are 3200×3200 pixels; we show small regions of 512×512 pixels to demonstrate the high-speed motion. See Visualization 7 for the reconstructed videos.
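The hybrid coding described in the captions above (a dynamic low-resolution aperture pattern combined with a static high-resolution mask) can be illustrated with a small NumPy sketch. It is only a toy model under stated assumptions, not the authors' calibration or simulation code: the function names, the per-sub-aperture shift `shift_px`, the 4×4 sub-aperture grid, and the wrap-around shifting via `np.roll` are all hypothetical choices made for illustration.

```python
import numpy as np

def random_half_open_patterns(cr, n_sub, rng):
    """One binary aperture pattern per frame, each with 50% open sub-apertures."""
    n_total = n_sub * n_sub
    patterns = np.zeros((cr, n_total), dtype=bool)
    for t in range(cr):
        open_idx = rng.choice(n_total, n_total // 2, replace=False)
        patterns[t, open_idx] = True
    return patterns.reshape(cr, n_sub, n_sub)

def hybrid_codes(static_mask, aperture_patterns, shift_px=2):
    """Sum shifted copies of the static mask, one copy per open sub-aperture.

    Assumption: a sub-aperture at grid offset (i, j) from the aperture center
    shifts the mask image by shift_px * (i, j) pixels. np.roll wraps at the
    borders, which a real optical system would not do.
    """
    n_sub = aperture_patterns.shape[1]
    center = (n_sub - 1) / 2
    codes = []
    for pattern in aperture_patterns:
        code = np.zeros_like(static_mask, dtype=float)
        for i, j in zip(*np.nonzero(pattern)):
            dy = int(round((i - center) * shift_px))
            dx = int(round((j - center) * shift_px))
            code += np.roll(static_mask, (dy, dx), axis=(0, 1))
        # Normalize by the total number of sub-apertures so the code value
        # is proportional to the transmitted light flux.
        codes.append(code / pattern.size)
    return np.stack(codes)

def snapshot(video, codes):
    """Encode Cr frames into one measurement: y = sum_t C_t * x_t."""
    return np.sum(codes * video, axis=0)

rng = np.random.default_rng(0)
static_mask = (rng.random((64, 64)) > 0.5).astype(float)
patterns = random_half_open_patterns(cr=6, n_sub=4, rng=rng)
codes = hybrid_codes(static_mask, patterns)
video = rng.random((6, 64, 64))
y = snapshot(video, codes)
```

Because every frame sees a different set of open sub-apertures, each frame is modulated by a different sum of shifted mask copies, which is what lets a single static mask produce Cr distinct high-resolution codes in this toy model.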
    • Table 1. Average Results of PSNR in dB (left entry in each cell) and SSIM (right entry in each cell) by Different Algorithms (Cr=10)a

      Scale | Algorithm | Football | Hummingbird | ReadySteadyGo | Jockey | YachtRide | Average
      256×256 | GAP–TV | 27.82, 0.8280 | 29.24, 0.7918 | 23.73, 0.7499 | 31.63, 0.8712 | 26.65, 0.8056 | 27.81, 0.8093
      256×256 | PnP–FFDNet | 27.06, 0.8264 | 25.52, 0.6912 | 21.68, 0.6859 | 31.14, 0.8493 | 23.69, 0.7035 | 25.82, 0.7513
      256×256 | PnP–TV–FastDVDNet | 31.31, 0.9123 | 31.19, 0.8264 | 26.18, 0.8276 | 31.36, 0.8817 | 28.90, 0.8841 | 29.79, 0.8664
      256×256 | BIRNAT | 34.67, 0.9719 | 34.33, 0.9546 | 29.50, 0.9389 | 36.24, 0.9711 | 31.02, 0.9431 | 33.15, 0.9559
      512×512 | GAP–TV | 29.19, 0.8854 | 28.32, 0.7887 | 25.94, 0.7918 | 31.30, 0.8718 | 26.59, 0.7939 | 28.27, 0.8263
      512×512 | PnP–FFDNet | 28.57, 0.8952 | 28.02, 0.8363 | 24.32, 0.7457 | 29.81, 0.8248 | 23.45, 0.6793 | 26.83, 0.7963
      512×512 | PnP–TV–FastDVDNet | 30.92, 0.9333 | 32.24, 0.8834 | 27.04, 0.8246 | 32.11, 0.8839 | 27.87, 0.8487 | 30.04, 0.8748
      1024×1024 | GAP–TV | 30.63, 0.9022 | 29.16, 0.8459 | 28.92, 0.8698 | 31.59, 0.8953 | 29.03, 0.8470 | 29.87, 0.8720
      1024×1024 | PnP–FFDNet | 29.87, 0.9023 | 27.70, 0.7869 | 27.70, 0.8483 | 29.88, 0.8412 | 25.55, 0.7211 | 28.14, 0.8200
      1024×1024 | PnP–TV–FastDVDNet | 30.35, 0.9265 | 31.71, 0.8909 | 29.42, 0.8913 | 31.59, 0.9014 | 30.44, 0.8713 | 30.70, 0.8963
    • Table 2. Average Results of PSNR in dB (left entry in each cell) and SSIM (right entry in each cell) by Different Algorithms (Cr=20)a

      Scale | Algorithm | Football | Hummingbird | ReadySteadyGo | Jockey | YachtRide | Average
      256×256 | GAP–TV | 25.01, 0.7544 | 26.33, 0.6893 | 20.48, 0.6326 | 28.13, 0.8318 | 23.56, 0.7129 | 24.70, 0.7242
      256×256 | PnP–FFDNet | 21.67, 0.6657 | 22.13, 0.5835 | 17.27, 0.5340 | 27.78, 0.7994 | 20.39, 0.6024 | 21.85, 0.6370
      256×256 | PnP–TV–FastDVDNet | 27.83, 0.8459 | 28.65, 0.7520 | 23.28, 0.7381 | 29.51, 0.8597 | 26.34, 0.8235 | 27.12, 0.8038
      256×256 | BIRNAT | 27.91, 0.9021 | 28.58, 0.8800 | 23.79, 0.8279 | 31.35, 0.9467 | 26.14, 0.8585 | 27.55, 0.8830
      512×512 | GAP–TV | 23.97, 0.8179 | 24.50, 0.6719 | 22.12, 0.6975 | 26.99, 0.8297 | 23.13, 0.6930 | 24.14, 0.7420
      512×512 | PnP–FFDNet | 22.00, 0.7661 | 23.62, 0.7245 | 19.35, 0.6133 | 25.32, 0.7924 | 19.48, 0.5418 | 21.95, 0.6876
      512×512 | PnP–TV–FastDVDNet | 25.63, 0.8852 | 28.36, 0.7778 | 23.80, 0.7499 | 28.79, 0.8553 | 25.36, 0.7784 | 26.39, 0.8093
      1024×1024 | GAP–TV | 24.82, 0.8353 | 25.53, 0.7296 | 24.98, 0.8128 | 26.63, 0.8388 | 25.80, 0.7759 | 25.55, 0.7985
      1024×1024 | PnP–FFDNet | 23.55, 0.8098 | 23.02, 0.6039 | 22.48, 0.7702 | 24.48, 0.7968 | 21.67, 0.6414 | 23.04, 0.7244
      1024×1024 | PnP–TV–FastDVDNet | 26.26, 0.8729 | 28.68, 0.8076 | 26.31, 0.8399 | 29.18, 0.8773 | 28.07, 0.8194 | 27.70, 0.8434
    • Table 3. PnP–TV–FastDVDNet for HCA-SCI

      Require: H, y.
      1:  Initialize: v^(0), λ_0, ξ < 1, k = 1, K_1, K_Max.
      2:  while not converged and k ≤ K_Max do
      3:   Update x by Eq. (7).
      4:   Update v:
      5:   if k ≤ K_1 then
      6:    v^(k) = D_TV(x^(k))
      7:   else
      8:    v = D_TV(x^(k))
      9:    v^(k) = D_FastDVDNet(v)
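The two-stage loop in Table 3 can be sketched in NumPy as follows. This is a minimal illustration under several assumptions, not the authors' implementation: Eq. (7) is not reproduced on this page, so the x-update is assumed to be the standard GAP Euclidean projection for SCI (where H Hᵀ is diagonal); `box_filter` and `temporal_filter` are simple hypothetical stand-ins for D_TV and D_FastDVDNet, not TV or FastDVDNet themselves; the zero initialization and the stopping tolerance are arbitrary; and the λ_0 and ξ parameters are omitted because their use is not shown here.

```python
import numpy as np

def gap_projection(v, y, codes, eps=1e-8):
    """Assumed form of the x-update: x = v + H^T (H H^T)^-1 (y - H v)."""
    residual = y - np.sum(codes * v, axis=0)    # y - H v
    gram = np.sum(codes ** 2, axis=0) + eps     # diagonal of H H^T
    return v + codes * (residual / gram)

def box_filter(x):
    """3x3 spatial mean on each frame (hypothetical stand-in for D_TV)."""
    padded = np.pad(x, ((0, 0), (1, 1), (1, 1)), mode="edge")
    out = np.zeros_like(x)
    for dy in range(3):
        for dx in range(3):
            out += padded[:, dy:dy + x.shape[1], dx:dx + x.shape[2]]
    return out / 9.0

def temporal_filter(x):
    """3-frame temporal mean (hypothetical stand-in for D_FastDVDNet)."""
    padded = np.pad(x, ((1, 1), (0, 0), (0, 0)), mode="edge")
    return (padded[:-2] + padded[1:-1] + padded[2:]) / 3.0

def pnp_gap(y, codes, denoise_tv, denoise_deep, k1=10, k_max=30, tol=1e-4):
    """Two-stage PnP-GAP loop: first denoiser only for k <= k1, both afterwards."""
    v = np.zeros_like(codes, dtype=float)       # assumed initialization v^(0)
    for k in range(1, k_max + 1):
        x = gap_projection(v, y, codes)         # step 3: update x
        if k <= k1:                             # steps 5-6
            v_new = denoise_tv(x)
        else:                                   # steps 7-9
            v_new = denoise_deep(denoise_tv(x))
        change = np.linalg.norm(v_new - v) / (np.linalg.norm(v) + 1e-12)
        v = v_new
        if change < tol:                        # assumed convergence test
            break
    return v

rng = np.random.default_rng(0)
codes = (rng.random((4, 16, 16)) > 0.5).astype(float)
video = rng.random((4, 16, 16))
y = np.sum(codes * video, axis=0)
recon = pnp_gap(y, codes, box_filter, temporal_filter, k1=3, k_max=6)
```

Passing the denoisers as callables keeps the loop independent of any particular prior, so a real TV solver or a pretrained video denoiser could be substituted without changing the iteration structure.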

    Zhihong Zhang, Chao Deng, Yang Liu, Xin Yuan, Jinli Suo, Qionghai Dai. Ten-mega-pixel snapshot compressive imaging with a hybrid coded aperture[J]. Photonics Research, 2021, 9(11): 2277

    Paper Information

    Category: Imaging Systems, Microscopy, and Displays

    Received: Jul. 8, 2021

    Accepted: Aug. 12, 2021

    Published Online: Oct. 25, 2021

    Author e-mail: Jinli Suo (jlsuo@tsinghua.edu.cn)

    DOI:10.1364/PRJ.435256
