Acta Optica Sinica, Volume. 44, Issue 10, 1026027(2024)

Image Synthesis of Compressive Light Field Displays with U-Net

Chen Gao1,2, Xiaodi Tan3,4,5,*, Haifeng Li6, and Xu Liu6
Author Affiliations
  • 1College of Photonic and Electronic Engineering, Fujian Normal University, Fuzhou 350117, Fujian, China
  • 2Fujian Provincial Key Laboratory of Photonics Technology, Fuzhou 350117, Fujian, China
  • 3Key Laboratory of Optoelectronic Science and Technology for Medicine of Ministry of Education, Fuzhou 350117, Fujian, China
  • 4Fujian Provincial Engineering Technology Research Center of Photoelectric Sensing Application, Fuzhou 350117, Fujian, China
  • 5Information Photonics Research Center, Fujian Normal University, Fuzhou 350117, Fujian, China
  • 6College of Optical Science and Engineering, Zhejiang University, Hangzhou 310027, Zhejiang, China

    Objective

    3D display technology is the entrance to a realistic metaverse on tabletop, portable, and near-eye electronic devices. True 3D displays are mainly divided into light field displays and holographic displays; light field displays can be further subdivided into integral-imaging displays, directional light field displays, and compressive light field displays. Compressive light field displays utilize the scattering characteristic of display panels and the correlation between viewpoint images of the 3D scene. The compressive light field display is a candidate for portable 3D display owing to its compact structure, moderate viewing angle, and high spatial resolution. However, the computational resources of portable electronic devices are restricted to satisfy battery-life demands, and the iterative algorithms that solve for compressive light field display patterns are computationally heavy, preventing compressive light field displays from becoming a practical solution for portable dynamic 3D displays. With the development of artificial intelligence technology, image generation algorithms based on deep learning are gradually being applied to 3D displays. Deep neural networks can be trained to fit the iterative process, and fast display image synthesis can be realized through the rapid forward propagation of artificial neural networks. Previously, researchers proposed a stacked CNN-based method to generate images for compressive light field displays; however, it suffers from convergence and over-fitting problems. U-Net was initially employed for image segmentation in computed tomography, processing slice data and outputting organ cancer probabilities. The skip connections added in the U-Net architecture significantly improve its convergence compared with the stacked CNN model, and light field data are quite similar to slice data in computed tomography.
    Thus, we introduce U-Net as the network model for optimizing compressive light field display patterns, aiming at better convergence and generalization. Given a specific viewing angle, several augmented target light field datasets are generated as the training sets of U-Net. After the U-Net converges, it synthesizes the display patterns that reconstruct the target light field for testing. The training and testing results show that, compared with the stacked CNN-based method and iterative algorithms, the proposed U-Net-based pattern generation method achieves higher reconstruction quality with fewer computing resources.
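The skip connection that distinguishes U-Net from a stacked CNN can be illustrated with a minimal sketch (not the paper's exact architecture; channel counts and layer sizes here are assumed for illustration). The input channels stand for stacked viewpoint images of the target light field, and the output channels stand for display-layer images:

```python
# Illustrative sketch: a minimal U-Net-style encoder-decoder with one skip
# connection. NOT the authors' exact network; channel counts are assumed.
import torch
import torch.nn as nn

class MiniUNet(nn.Module):
    def __init__(self, in_ch=9, out_ch=2, base=16):
        super().__init__()
        # Encoder: extract features at full resolution, then downsample.
        self.enc = nn.Sequential(nn.Conv2d(in_ch, base, 3, padding=1), nn.ReLU())
        self.down = nn.Sequential(nn.Conv2d(base, base * 2, 3, stride=2, padding=1), nn.ReLU())
        # Decoder: upsample back to full resolution.
        self.up = nn.ConvTranspose2d(base * 2, base, 2, stride=2)
        # After concatenating the skip connection, map to display images in [0, 1].
        self.dec = nn.Sequential(nn.Conv2d(base * 2, base, 3, padding=1), nn.ReLU(),
                                 nn.Conv2d(base, out_ch, 3, padding=1), nn.Sigmoid())

    def forward(self, x):
        e = self.enc(x)                       # full-resolution features
        u = self.up(self.down(e))             # bottleneck, then upsample
        # Skip connection: concatenating encoder features with decoder features
        # is what eases gradient flow compared with a plain stacked CNN.
        return self.dec(torch.cat([u, e], dim=1))
```

Removing the `torch.cat` and feeding only `u` into the decoder would reduce this to a plain stacked CNN, which is the baseline the paper compares against.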

    Methods

    An artificial neural network's training procedure can be split into forward and backward propagation. Forward propagation includes the following steps: the target light field for training is input into the network, the network outputs display images, and the light field is then reconstructed by simulated perspective projection. Backward propagation updates the network's parameters according to the loss function and regularization terms. This procedure is repeated for every epoch and batch. When training is finished, the target light field for testing is input into the network and display images are synthesized; this is called the inference procedure. The datasets, network architecture, and hyper-parameters are carefully designed to fit the features of compressive light field displays. The datasets contain 1260 pairs of image blocks cropped from seven scenes. The ReLU function is set as the activation function of the U-Net model, which is initialized with Kaiming initialization from a uniform distribution. The loss function is the mean square error between the target and reconstructed light fields, and the regularization term constrains image pixel values to their effective range.
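One training iteration as described above can be sketched as follows. This is a hedged illustration, not the authors' implementation: `project` is a placeholder for the paper's simulated perspective projection (here, an additive two-layer display where the rear layer is shifted per view), and the regularization weight is an assumed value:

```python
# Hedged sketch of one forward/backward iteration. `project` is a stand-in
# for the paper's simulated perspective projection; settings are assumed.
import torch
import torch.nn as nn

def project(layers, shift):
    # Placeholder projection for an additive two-layer display: the rear
    # layer is shifted horizontally by `shift` pixels and the layers sum.
    front, rear = layers[:, 0:1], layers[:, 1:2]
    return front + torch.roll(rear, shifts=shift, dims=-1)

def train_step(model, target_views, shifts, optimizer, reg_weight=0.01):
    optimizer.zero_grad()
    layers = model(target_views)                                     # synthesize display images
    recon = torch.cat([project(layers, s) for s in shifts], dim=1)   # reconstruct each view
    mse = nn.functional.mse_loss(recon, target_views)                # loss vs. target light field
    # Regularization term penalizing pixel values outside the effective
    # display range [0, 1].
    reg = (layers.clamp(max=0).pow(2) + (layers - 1).clamp(min=0).pow(2)).mean()
    loss = mse + reg_weight * reg
    loss.backward()                                                  # backward propagation
    optimizer.step()                                                 # update parameters
    return loss.item()
```

In the actual method the model would be the trained U-Net and the shifts would correspond to the viewpoint geometry of the display; a single convolution is used below only to keep the sketch self-contained.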

    Results and Discussions

    The performances of the proposed U-Net-based method, the stacked CNN-based method, and iterative algorithms are compared fairly for multiplicative (Fig. 8), additive (Fig. 9), polarized (Fig. 10), and hybrid (Fig. 11) types of compressive light field displays. The training and testing results (Figs. 17-20) show that the proposed method's light field reconstruction quality is consistently about 2 dB higher than that of the stacked CNN-based method, because the U-Net-based method exploits the value range of image pixels more effectively. Additionally, for additive-type compressive light field displays, the proposed method reaches the same reconstruction quality in less time than iterative algorithms (Fig. 21).
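The display types being compared differ in how the stacked layers compose a viewed image. A simplified two-layer sketch (assumed models for illustration, not the paper's exact formulas) makes the distinction concrete: additive layers sum transmitted intensity, multiplicative layers attenuate it as a product, and polarized layers accumulate polarization rotation before an analyzer converts the angle back to intensity:

```python
# Simplified two-layer composition models for the display types compared
# in Figs. 8-11. These are assumed illustrations, not the paper's formulas.
import numpy as np

def compose_view(front, rear, mode):
    if mode == "additive":
        # Emissive layers: intensities sum (clipped to the display range).
        return np.clip(front + rear, 0.0, 1.0)
    if mode == "multiplicative":
        # Attenuating layers: each layer multiplies the transmitted light.
        return front * rear
    if mode == "polarized":
        # Each layer rotates polarization by angle * pi/2; the analyzer
        # maps the accumulated rotation back to intensity.
        return np.sin((front + rear) * np.pi / 2) ** 2
    raise ValueError(mode)
```

This non-linearity in the multiplicative and polarized cases is one plausible reason the trained network generalizes less well for those types, as noted in the Conclusions.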

    Conclusions

    To improve the image quality, uniformity, and computational performance of compressive light field displays, we apply an elaborate U-Net model to synthesize display images. The proposed method is compared with the stacked CNN-based method and iterative algorithms by simulating the perspective projection of display images with the same target light field as input. For the additive-type compressive light field display, the trained U-Net's inference is much faster than the iterative algorithm at the same reconstruction quality. However, the trained U-Net's generalization performance still needs improvement for multiplicative- and polarized-type compressive light field displays. Possible improvements include changing the activation functions and increasing the network's depth.


    Get Citation


    Chen Gao, Xiaodi Tan, Haifeng Li, Xu Liu. Image Synthesis of Compressive Light Field Displays with U-Net[J]. Acta Optica Sinica, 2024, 44(10): 1026027

    Paper Information

    Category: Physical Optics

    Received: Oct. 20, 2023

    Accepted: Dec. 1, 2023

    Published Online: May 6, 2024

    The Author Email: Tan Xiaodi (xtan@fjnu.edu.cn)

    DOI:10.3788/AOS231683
