Acta Optica Sinica, Volume. 44, Issue 10, 1026027(2024)

Image Synthesis of Compressive Light Field Displays with U-Net

Chen Gao1,2, Xiaodi Tan3,4,5,*, Haifeng Li6, and Xu Liu6
Author Affiliations
  • 1College of Photonic and Electronic Engineering, Fujian Normal University, Fuzhou 350117, Fujian, China
  • 2Fujian Provincial Key Laboratory of Photonics Technology, Fuzhou 350117, Fujian, China
  • 3Key Laboratory of Optoelectronic Science and Technology for Medicine of Ministry of Education, Fuzhou 350117, Fujian, China
  • 4Fujian Provincial Engineering Technology Research Center of Photoelectric Sensing Application, Fuzhou 350117, Fujian, China
  • 5Information Photonics Research Center, Fujian Normal University, Fuzhou 350117, Fujian, China
  • 6College of Optical Science and Engineering, Zhejiang University, Hangzhou 310027, Zhejiang, China

    Objective

    3D display technology is the entrance to a realistic metaverse on tabletop, portable, and near-eye electronic devices. True 3D displays are mainly divided into light field displays and holographic displays; light field displays can be further subdivided into integral-imaging displays, directional light field displays, and compressive light field displays. Compressive light field displays utilize the scattering characteristic of display panels and the correlation between viewpoint images of the 3D scene. The compressive light field display is a candidate for portable 3D display owing to its compact structure, moderate viewing angle, and high spatial resolution. However, the computational resources of portable electronic devices are restricted to satisfy battery-life demands, and the iterative algorithms that solve for compressive light field display patterns are computationally heavy, preventing compressive light field displays from becoming a practical solution for portable dynamic 3D displays. With the development of artificial intelligence technology, image generation algorithms based on deep learning are gradually being applied to 3D displays. Deep neural networks can be trained to fit the iterative process, and fast display image synthesis can be realized through the rapid forward propagation of artificial neural networks. Previously, researchers proposed a stacked CNN-based method to generate images for compressive light field displays; however, it suffers from convergence and over-fitting problems. U-Net was initially employed for image segmentation in computed tomography, processing slice data and outputting organ cancer probabilities. The skip connections added in the U-Net architecture significantly improve its convergence compared with the stacked CNN model, and light field data are quite similar to slice data in computed tomography.
    Thus, we introduce U-Net as the network model for optimizing compressive light field display patterns, aiming at better convergence and generalization. Given a specific viewing angle, several augmented target light field datasets are generated as the training sets of U-Net. After the U-Net converges, it synthesizes the display patterns that reconstruct the target light field for testing. The training and testing results show that, compared with the stacked CNN-based method and iterative algorithms, the proposed U-Net-based pattern generation method achieves higher reconstruction quality with fewer computing resources.
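The skip connection that distinguishes U-Net from a stacked CNN can be illustrated with a minimal sketch (not the paper's exact architecture; channel counts and layer sizes here are assumed for illustration). The input channels stand for stacked viewpoint images of the target light field, and the output channels stand for display-layer images:

```python
# Illustrative sketch: a minimal U-Net-style encoder-decoder with one skip
# connection. NOT the authors' exact network; channel counts are assumed.
import torch
import torch.nn as nn

class MiniUNet(nn.Module):
    def __init__(self, in_ch=9, out_ch=2, base=16):
        super().__init__()
        # Encoder: extract features at full resolution, then downsample.
        self.enc = nn.Sequential(nn.Conv2d(in_ch, base, 3, padding=1), nn.ReLU())
        self.down = nn.Sequential(nn.Conv2d(base, base * 2, 3, stride=2, padding=1), nn.ReLU())
        # Decoder: upsample back to full resolution.
        self.up = nn.ConvTranspose2d(base * 2, base, 2, stride=2)
        # After concatenating the skip connection, map to display images in [0, 1].
        self.dec = nn.Sequential(nn.Conv2d(base * 2, base, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(base, out_ch, 3, padding=1), nn.Sigmoid())

    def forward(self, x):
        e = self.enc(x)                       # full-resolution features
        u = self.up(self.down(e))             # bottleneck, then upsample
        # Skip connection: concatenating encoder features with decoder features
        # is what eases gradient flow compared with a plain stacked CNN.
        return self.dec(torch.cat([u, e], dim=1))
```

Removing the `torch.cat` and feeding only `u` into the decoder would reduce this to a plain stacked CNN, which is the baseline the paper compares against.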

    Methods

    An artificial neural network's training procedure can be split into forward and backward propagation. Forward propagation includes the following steps: the target light field for training is input into the network, the network outputs display images, and the light field is then reconstructed by simulated perspective projection. Backward propagation updates the network's parameters according to the loss function and regularization terms. This procedure is repeated for every epoch and batch. When training is finished, the target light field for testing is input into the network and display images are synthesized; this is called the inference procedure. The datasets, network architecture, and hyper-parameters are carefully designed to fit the features of compressive light field displays. The datasets contain 1260 pairs of image blocks cropped from seven scenes. The ReLU function is set as the activation function of the U-Net model, which is initialized with Kaiming initialization from a uniform distribution. The loss function is the mean square error between the target and reconstructed light fields, and the regularization term constrains image pixel values to their effective range.
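One training iteration as described above can be sketched as follows. This is a hedged illustration, not the authors' implementation: `project` is a placeholder for the paper's simulated perspective projection (here, an additive two-layer display where the rear layer is shifted per view), and the regularization weight is an assumed value:

```python
# Hedged sketch of one forward/backward iteration. `project` is a stand-in
# for the paper's simulated perspective projection; settings are assumed.
import torch
import torch.nn as nn

def project(layers, shift):
    # Placeholder projection for an additive two-layer display: the rear
    # layer is shifted horizontally by `shift` pixels and the layers sum.
    front, rear = layers[:, 0:1], layers[:, 1:2]
    return front + torch.roll(rear, shifts=shift, dims=-1)

def train_step(model, target_views, shifts, optimizer, reg_weight=0.01):
    optimizer.zero_grad()
    layers = model(target_views)                                     # synthesize display images
    recon = torch.cat([project(layers, s) for s in shifts], dim=1)   # reconstruct each view
    mse = nn.functional.mse_loss(recon, target_views)                # loss vs. target light field
    # Regularization term penalizing pixel values outside the effective
    # display range [0, 1].
    reg = (layers.clamp(max=0).pow(2) + (layers - 1).clamp(min=0).pow(2)).mean()
    loss = mse + reg_weight * reg
    loss.backward()                                                  # backward propagation
    optimizer.step()                                                 # update parameters
    return loss.item()
```

In the actual method the model would be the trained U-Net and the shifts would correspond to the viewpoint geometry of the display; a single convolution is used below only to keep the sketch self-contained.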

    Results and Discussions

    The performances of the proposed U-Net-based method, the stacked CNN-based method, and iterative algorithms are compared fairly for multiplicative (Fig. 8), additive (Fig. 9), polarized (Fig. 10), and hybrid (Fig. 11) types of compressive light field displays. The training and testing results (Figs. 17-20) show that the proposed method's light field reconstruction quality is consistently about 2 dB higher than that of the stacked CNN-based method, because the U-Net-based method exploits the value range of image pixels more effectively. Additionally, for additive-type compressive light field displays, the proposed method reaches the same reconstruction quality in less time than iterative algorithms (Fig. 21).
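The display types being compared differ in how the stacked layers compose a viewed image. A simplified two-layer sketch (assumed models for illustration, not the paper's exact formulas) makes the distinction concrete: additive layers sum transmitted intensity, multiplicative layers attenuate it as a product, and polarized layers accumulate polarization rotation before an analyzer converts the angle back to intensity:

```python
# Simplified two-layer composition models for the display types compared
# in Figs. 8-11. These are assumed illustrations, not the paper's formulas.
import numpy as np

def compose_view(front, rear, mode):
    if mode == "additive":
        # Emissive layers: intensities sum (clipped to the display range).
        return np.clip(front + rear, 0.0, 1.0)
    if mode == "multiplicative":
        # Attenuating layers: each layer multiplies the transmitted light.
        return front * rear
    if mode == "polarized":
        # Each layer rotates polarization by angle * pi/2; the analyzer
        # maps the accumulated rotation back to intensity.
        return np.sin((front + rear) * np.pi / 2) ** 2
    raise ValueError(mode)
```

This non-linearity in the multiplicative and polarized cases is one plausible reason the trained network generalizes less well for those types, as noted in the Conclusions.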

    Conclusions

    To improve the image quality, uniformity, and computational performance of compressive light field displays, we apply an elaborate U-Net model to synthesize display images. The proposed method is compared with the stacked CNN-based method and iterative algorithms by simulating the perspective projection of display images with the same target light field as input. For the additive-type compressive light field display, the trained U-Net's inference is much faster than the iterative algorithm at the same reconstruction quality. However, the trained U-Net's generalization performance still needs improvement for multiplicative- and polarized-type compressive light field displays. Possible improvements include changing the activation functions and increasing the network's depth.


    Get Citation


    Chen Gao, Xiaodi Tan, Haifeng Li, Xu Liu. Image Synthesis of Compressive Light Field Displays with U-Net[J]. Acta Optica Sinica, 2024, 44(10): 1026027

    Paper Information

    Category: Physical Optics

    Received: Oct. 20, 2023

    Accepted: Dec. 1, 2023

    Published Online: May 6, 2024

    The Author Email: Tan Xiaodi (xtan@fjnu.edu.cn)

    DOI:10.3788/AOS231683
