Advanced Imaging, Vol. 1, Issue 2, 021002 (2024)

Block-modulating video compression: an ultralow complexity image compression encoder for resource-limited platforms

Siming Zheng1,†, Yujia Xue2, Waleed Tahir2, Zhengjue Wang3, Hao Zhang4, Ziyi Meng5, Gang Qu1, Siwei Ma6, and Xin Yuan1,*
Author Affiliations
  • 1Research Center for Industries of the Future (RCIF) and School of Engineering, Westlake University, Hangzhou, China
  • 2Department of Electrical and Computer Engineering, Boston University, Boston, USA
  • 3National Key Laboratory of Radar Signal Processing, Xidian University, Xi’an, China
  • 4State Key Laboratory of Integrated Service Networks, Xidian University, Xi’an, China
  • 5Westlake Intelligent Vision Technology, Hangzhou, China
  • 6School of Computer Science, Peking University, Beijing, China
    Figures & Tables(12)
    Pipeline of the proposed BMVC encoder. (a) Each input frame of size Nh×Nw is first divided into nonoverlapping blocks of size Bh×Bw. The image blocks are then modulated (element-wise multiplication) by binary masks of the same size. Next, the modulated blocks are summed to yield a single block of size Bh×Bw. Finally, the summed block is quantized to a user-defined bit depth (8–16 bits) and transmitted to the receiver. (b) An equivalent encoding pipeline, in which the modulation happens before the frame is divided into blocks.
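The encoder steps in this caption can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `bmvc_encode`, the [0, 1] input range, and the normalization by the number of blocks before quantization are assumptions made for the example.

```python
import numpy as np


def bmvc_encode(image, masks, block_shape, bit_depth=8):
    """Block-modulate, sum, and quantize a single-channel image.

    image: 2D float array in [0, 1], shape (Nh, Nw)
    masks: binary array with the same shape as `image`
    block_shape: (Bh, Bw); must divide (Nh, Nw) exactly
    """
    nh, nw = image.shape
    bh, bw = block_shape
    if nh % bh or nw % bw:
        raise ValueError("block size must divide the image size")

    # Modulate first, then split into blocks (the equivalent pipeline (b)).
    modulated = image * masks
    # Axes after reshape: (block row, row in block, block column, column in block).
    blocks = modulated.reshape(nh // bh, bh, nw // bw, bw)
    summed = blocks.sum(axis=(0, 2))  # a single (Bh, Bw) block

    # Uniform quantization. Dividing by the block count to map the sum
    # into [0, 1] is an assumption of this sketch.
    n_blocks = (nh // bh) * (nw // bw)
    levels = 2 ** bit_depth - 1
    codes = np.round(summed / n_blocks * levels).astype(np.uint16)
    return codes, n_blocks


rng = np.random.default_rng(0)
image = rng.random((8, 12))
masks = rng.integers(0, 2, size=image.shape)
codes, n_blocks = bmvc_encode(image, masks, block_shape=(4, 4), bit_depth=8)
print(codes.shape, n_blocks)
```

With binary masks the modulation only selects or discards pixels, so the encoder needs additions but no true multiplications; the `image * masks` product above is just a convenient way to express that selection in NumPy.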
    PnP optimization-based decoding algorithm for BMVC-PnP. The encoded image block and the binary modulation masks are fed into the BMVC-PnP decoder as inputs. The decoder alternates between a linear projection step, which accounts for the BMVC encoding process, and a DL-based denoising step, which acts as an implicit prior. We use a pretrained FFDNet [43] as the denoising CNN for its flexibility and robustness against various noise levels.
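A rough sketch of the alternation described in this caption, written with NumPy. Several parts are assumptions: the projection is written as the Euclidean projection onto measurement-consistent images (as in GAP-style PnP solvers), the helper names are hypothetical, and a 3×3 mean filter stands in for the pretrained FFDNet denoiser, so the output quality is not representative of BMVC-PnP.

```python
import numpy as np


def forward(x, masks, block_shape):
    """BMVC measurement: modulate, split into blocks, sum the blocks."""
    bh, bw = block_shape
    nh, nw = x.shape
    return (x * masks).reshape(nh // bh, bh, nw // bw, bw).sum(axis=(0, 2))


def adjoint(y, masks):
    """Transpose of `forward`: tile the block over the frame, then modulate."""
    bh, bw = y.shape
    nh, nw = masks.shape
    return np.tile(y, (nh // bh, nw // bw)) * masks


def box_denoise(x):
    """Stand-in denoiser (3x3 mean filter); the paper uses FFDNet here."""
    p = np.pad(x, 1, mode="edge")
    h, w = x.shape
    return sum(p[i:i + h, j:j + w] for i in range(3) for j in range(3)) / 9.0


def pnp_decode(y, masks, block_shape, n_iter=20, denoise=box_denoise):
    # For binary masks, A A^T is diagonal: the per-pixel count of open masks.
    # Clamp to 1 where no mask is open to avoid division by zero.
    gram = np.maximum(forward(masks.astype(float), masks, block_shape), 1.0)
    x = adjoint(y / gram, masks)  # initial estimate
    for _ in range(n_iter):
        residual = y - forward(x, masks, block_shape)
        x = x + adjoint(residual / gram, masks)  # projection onto {x : A x = y}
        x = denoise(x)                           # implicit image prior
    return x


rng = np.random.default_rng(0)
x_true = rng.random((8, 8))
masks = rng.integers(0, 2, size=x_true.shape)
y = forward(x_true, masks, (4, 4))
x_hat = pnp_decode(y, masks, (4, 4), n_iter=5)
print(x_hat.shape)
```

Because the block-sum operator has a diagonal Gram matrix, the projection costs only element-wise operations; in this kind of scheme the denoiser carries all of the image prior, which is why swapping the mean filter for a learned denoiser matters.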
    E2E neural-network-based decoding algorithm for BMVC-E2E. The encoded image block and the binary modulation masks are fed into the BMVC-E2E decoder as inputs. The feed-forward BMVC-E2E decoder consists of several stages, each containing a linear projection step and a convolutional neural network. All BMVC-E2E decoders are trained in an E2E fashion. We use 2D-U-Net and 3D-CNN with reversible blocks (RevSCI) to facilitate memory-efficient training.
    Test data set (set 13) we used to evaluate the BMVC pipeline and other compression methods.
    Decoded image results at various Crs with the proposed BMVC-PnP and BMVC-E2E approaches. The BMVC-E2E results consistently have good decoding quality at both low and high Crs. The BMVC-PnP decoder provides higher image quality for low Crs while producing some denoising artifacts at high Crs.
    Comparison of the BMVC pipeline with other image compression methods: random DS, block CS, and JPEG2000 compression. For the random DS and block CS experiments, we implemented their decoders based on the PnP algorithm with FFDNet as the flexible denoiser. Results are shown with a low Cr=32 and a high Cr=80. The two BMVC decoders provide consistently high-quality images. The random DS method fails because, in principle, the random DS decoder is solving an image inpainting task with fewer than 3% of the pixels available. Block CS shows comparably good decoding results, but we will show later that block CS deteriorates after aggressive data quantization. As expected, JPEG2000 compression gives the best image quality at Cr=32 and 72 with minor blurriness, since JPEG2000 uses an optimal set of wavelet bases, but at the cost of increased encoding computation.
    PSNR performance of different compression methods over a wide range of Crs. PSNR values are computed on the Y channel only. The BMVC-E2E curve shows a PSNR increase at Cr=60 because the BMVC-E2E decoder has an additional 3D-CNN stage for all Cr≥60.
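For readers who want to reproduce a Y-channel PSNR, a minimal NumPy sketch follows. The caption does not say how Y is obtained; the full-range BT.601 luma weights used here are an assumption, as are the function names and the [0, 1] value range.

```python
import numpy as np


def rgb_to_y(rgb):
    """Luma from RGB in [0, 1] using BT.601 weights (an assumed convention)."""
    return rgb @ np.array([0.299, 0.587, 0.114])


def psnr_y(reference, decoded, peak=1.0):
    """PSNR in dB between the Y channels of two RGB images in [0, peak]."""
    mse = np.mean((rgb_to_y(reference) - rgb_to_y(decoded)) ** 2)
    if mse == 0:
        return float("inf")
    return 10.0 * np.log10(peak ** 2 / mse)


rng = np.random.default_rng(0)
reference = rng.random((16, 16, 3))
decoded = np.clip(reference + 0.01 * rng.standard_normal(reference.shape), 0.0, 1.0)
print(round(float(psnr_y(reference, decoded)), 2))
```

Other luma conventions (for example limited-range YCbCr) shift the absolute numbers slightly, so values are only comparable when the same conversion is used for every method.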
    Evaluation of robustness to quantization bits. BMVC and block CS both show high PSNR performance when the dynamic range of the data is intact. In practice, quantization will affect the codec performance in real-world video signal transmission. The bar plots indicate how the three decoders (BMVC-PnP, BMVC-E2E, and block CS) perform under different quantization bits. BMVC decoders have consistent performance regardless of data quantization. However, block CS has noticeable decreases in PSNR at 10-bit and 8-bit quantization.
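The effect of bit depth on the transmitted measurement can be illustrated with a uniform quantizer. The sketch below is an assumption-laden example: the quantizer design, the function names, and the `max_value` of 50 are hypothetical and are not taken from the paper. It only shows that the worst-case quantization error of the measurement halves with every extra bit; it does not reproduce the decoder PSNRs.

```python
import numpy as np


def quantize(measurement, bit_depth, max_value):
    """Map values in [0, max_value] to integer codes with `bit_depth` bits."""
    levels = 2 ** bit_depth - 1
    scaled = np.clip(measurement / max_value, 0.0, 1.0) * levels
    return np.round(scaled).astype(np.uint16)


def dequantize(codes, bit_depth, max_value):
    """Map integer codes back to the measurement range."""
    return codes.astype(np.float64) / (2 ** bit_depth - 1) * max_value


rng = np.random.default_rng(0)
max_value = 50.0  # hypothetical dynamic range of the summed block
y = rng.random((4, 4)) * max_value
for bits in (8, 10, 12, 14, 16):
    y_hat = dequantize(quantize(y, bits, max_value), bits, max_value)
    half_step = max_value / (2 ** bits - 1) / 2
    # The maximum absolute error never exceeds half a quantization step.
    print(bits, float(np.abs(y - y_hat).max()), half_step)
```

A measurement with a larger dynamic range spreads the same number of codes over a wider interval, which is one way to read the sensitivity of block CS to 8-bit and 10-bit quantization reported in the caption.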
    • Table 1. PSNR (top line in each cell in dB) and SSIM (bottom line in each cell) Performance for Different Compression Methods at a Wide Range of Crs on the HD Image of Size 1080×1920. The highest performance of CS-based methods is bold-faced.


      Each cell lists PSNR (dB), SSIM.

      | Cr (Nb) | BMVC block size | BMVC-PnP | BMVC-E2E | Random DS | Block CS | JPEG2000 |
      |---|---|---|---|---|---|---|
      | 150 | 108×128 | 22.89, 0.682 | 26.59, 0.802 | 8.95, 0.354 | 26.59, 0.819 | 30.30, 0.852 |
      | 120 | 108×160 | 24.32, 0.707 | 27.38, 0.815 | 9.56, 0.401 | 26.93, 0.826 | 30.94, 0.862 |
      | 100 | 108×192 | 25.97, 0.745 | 27.80, 0.821 | 10.10, 0.431 | 27.65, 0.837 | 31.42, 0.869 |
      | 80 | 108×240 | 27.30, 0.781 | 28.81, 0.837 | 10.88, 0.460 | 27.88, 0.843 | 31.99, 0.877 |
      | 72 | 120×240 | 27.96, 0.797 | 29.31, 0.849 | 11.33, 0.472 | 28.20, 0.849 | 32.80, 0.888 |
      | 60 | 216×160 | 28.83, 0.822 | 30.74, 0.871 | 12.17, 0.491 | 28.70, 0.857 | 33.41, 0.897 |
      | 50 | 216×192 | 29.87, 0.848 | 29.08, 0.839 | 13.30, 0.516 | 29.38, 0.865 | 33.84, 0.902 |
      | 40 | 216×240 | 31.10, 0.877 | 29.23, 0.843 | 14.87, 0.555 | 29.83, 0.870 | 34.60, 0.911 |
      | 32 | 270×240 | 32.23, 0.895 | 29.98, 0.854 | 16.61, 0.609 | 30.70, 0.876 | 36.45, 0.931 |
      | 24 | 270×320 | 33.58, 0.913 | 31.05, 0.871 | 18.58, 0.690 | 31.79, 0.884 | 37.51, 0.941 |
    • Table 2. Computation Cost of Different Compression Methods.


      | Codec | Encoder cost | Dynamic range of measured data | Mask/basis size | Decoder | Comment |
      |---|---|---|---|---|---|
      | BMVC | # addition: (Nb2)N2Nb; # multiplication: 0 | Nb/2 | N | PnP: iterative network; E2E: feedforward network | Cr = Nb; Nb: # blocks; N: # pixels |
      | Random DS | # addition: 0; # multiplication: 0 | 1 | N | PnP: iterative network | Cr = N/Ns; Ns: # sampled pixels |
      | Block CS | # addition: N2Nb; # multiplication: 0 | N/(2Nb) | MN/Nb | PnP: iterative network | Cr = N/(M·Nb); M: # measurements per block |
      | JPEG2000 | # addition: (N/Nb) log2(N/Nb); # multiplication: N log2(N/Nb) | 1 | flexible | Discrete wavelet transform (DWT) | Nb = N/B^2; B: block size |
    • Table 3. PSNRs of BMVC-PnP, BMVC-E2E, and Block CS under Different Quantization Bits.


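      | Bit depth | BMVC-PnP (dB), Cr=100 | BMVC-PnP (dB), Cr=80 | BMVC-PnP (dB), Cr=50 | BMVC-PnP (dB), Cr=24 | BMVC-E2E (dB), Cr=100 | BMVC-E2E (dB), Cr=80 | BMVC-E2E (dB), Cr=50 | BMVC-E2E (dB), Cr=24 | Block CS (dB), Cr=100 | Block CS (dB), Cr=80 | Block CS (dB), Cr=50 | Block CS (dB), Cr=24 |
      |---|---|---|---|---|---|---|---|---|---|---|---|---|
      | 8-bit | 25.921 | 27.274 | 28.829 | 33.451 | 27.785 | 28.796 | 29.060 | 31.023 | 27.190 | 27.049 | 27.361 | 29.563 |
      | 10-bit | 25.981 | 27.305 | 29.882 | 33.592 | 27.805 | 28.813 | 29.082 | 31.056 | 27.634 | 27.957 | 29.361 | 31.931 |
      | 12-bit | 25.981 | 27.301 | 29.889 | 33.611 | 27.803 | 28.818 | 29.087 | 31.059 | 27.800 | 28.123 | 30.090 | 33.132 |
      | 14-bit | 25.982 | 27.303 | 29.892 | 33.613 | 27.802 | 28.819 | 29.088 | 31.060 | 27.813 | 28.132 | 30.099 | 33.149 |
      | 16-bit | 25.982 | 27.304 | 29.879 | 33.576 | 27.803 | 28.818 | 29.088 | 31.060 | 27.814 | 28.133 | 30.102 | 33.151 |
      | ΔPSNR↓ | 0.0606 | 0.0315 | 0.0629 | 0.1627 | 0.0202 | 0.0230 | 0.0282 | 0.0370 | 0.6238 | 1.0844 | 2.7415 | 3.5872 |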
    • Table 4. Ablation Study of BMVC-E2E (Cr50).


      | Decoder structure | PSNR (dB) | Runtime (ms) |
      |---|---|---|
      | 2 × U-Net + 2 × 3D-CNN (RevSCI) + 1 × deeper 3D CNN (RevSCI) | 31.720 | 255 |
      | 2 × U-Net + 2 × 3D-CNN (RevSCI) | 31.449 | 130 |
      | 2 × U-Net + 1 × 3D-CNN (RevSCI) | 29.228 | 70 |
      | 2 × U-Net | 17.930 | 10 |
      | 1 × U-Net | 10.308 | 5 |
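As a quick arithmetic check on Table 1, the compression ratio of BMVC equals the number of blocks (Table 2 lists Cr = Nb), so each listed block size should tile the 1080×1920 frame into exactly Cr blocks. The helper name `compression_ratio` and the dictionary below are illustrative, with the block sizes copied from Table 1.

```python
def compression_ratio(image_shape, block_shape):
    """Number of blocks Nb, which equals Cr for BMVC (Cr = Nb)."""
    nh, nw = image_shape
    bh, bw = block_shape
    if nh % bh or nw % bw:
        raise ValueError("block size must divide the image size")
    return (nh // bh) * (nw // bw)


# Cr -> BMVC block size (Bh, Bw), as listed in Table 1.
TABLE1_BLOCKS = {
    150: (108, 128), 120: (108, 160), 100: (108, 192), 80: (108, 240),
    72: (120, 240), 60: (216, 160), 50: (216, 192), 40: (216, 240),
    32: (270, 240), 24: (270, 320),
}

computed = {
    cr: compression_ratio((1080, 1920), block)
    for cr, block in TABLE1_BLOCKS.items()
}
print(all(cr == nb for cr, nb in computed.items()))
```

For all ten settings the computed block count matches the stated Cr; for example, 108×128 blocks tile the frame into 10 × 15 = 150 blocks.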
    Siming Zheng, Yujia Xue, Waleed Tahir, Zhengjue Wang, Hao Zhang, Ziyi Meng, Gang Qu, Siwei Ma, Xin Yuan, "Block-modulating video compression: an ultralow complexity image compression encoder for resource-limited platforms," Adv. Imaging 1, 021002 (2024)

    Download Citation

    EndNote(RIS)BibTexPlain Text
    Save article for my favorites
    Paper Information

    Category: Research Article

    Received: Jan. 22, 2024

    Accepted: Jul. 9, 2024

    Published Online: Aug. 8, 2024

    The Author Email: Xin Yuan (xyuan@westlake.edu.cn)

    DOI:10.3788/AI.2024.10006

    Topics