Photonics Research, Volume 10, Issue 8, 1848 (2022)

Snapshot spectral compressive imaging reconstruction using convolution and contextual Transformer

Lishun Wang1,2, Zongliang Wu3, Yong Zhong1,2,4,*, and Xin Yuan3,5,*
Author Affiliations
  • 1Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu 610041, China
  • 2University of Chinese Academy of Sciences, Beijing 100049, China
  • 3Research Center for Industries of the Future and School of Engineering, Westlake University, Hangzhou 310030, China
  • 4e-mail: zhongyong@casit.com.cn
  • 5e-mail: xyuan@westlake.edu.cn
    Lishun Wang, Zongliang Wu, Yong Zhong, Xin Yuan. Snapshot spectral compressive imaging reconstruction using convolution and contextual Transformer[J]. Photonics Research, 2022, 10(8): 1848

    Paper Information

    Category: Image Processing and Image Analysis

    Received: Mar. 14, 2022

    Accepted: Jun. 8, 2022

    Published Online: Jul. 21, 2022

    The Author Email: Yong Zhong (zhongyong@casit.com.cn), Xin Yuan (xyuan@westlake.edu.cn)

    DOI:10.1364/PRJ.458231
