1School of Optical-Electrical and Computer Engineering, Engineering Research Center of Optical Instrument and System, Ministry of Education, and Shanghai Key Laboratory of Modern Optics System, University of Shanghai for Science and Technology, Shanghai 200093, China
2School of Aerospace, Northwestern Polytechnic University, Xi’an 710071, China
3Shanghai Aerospace Control Technology Institute, Shanghai 201109, China
4Shanghai Key Laboratory of Aerospace Intelligent Control Technology, Shanghai 201109, China
Digital micromirror devices (DMDs) have emerged as essential spatial light modulators for holographic 3D near-eye displays due to their rapid refresh rates and precise wavefront modulation characteristics. However, since the modulation depth of DMDs is limited to binary levels, the quality of images reproduced from a binary computer-generated hologram (CGH) is often deficient. In this paper, we propose a stochastic gradient descent (SGD) based binary CGH optimization framework in which a convolutional neural network (CNN) performs the differentiable hologram binarization operation. The CNN-based binary SGD optimization significantly reduces the binary quantization noise in the generation of binary CGHs, providing a high-fidelity holographic display. The proposed method is experimentally verified by displaying both high-quality 2D and true 3D images from optimized binary CGHs.
Computer-generated holography (CGH)[1–3] is a technology that uses computer simulation to generate holograms. It can record both real and virtual objects and is widely applied in wavefront shaping[1,2], holographic projection and display[4,5], optical fiber communication[6], microscopy[7], and optical tweezers[8]. Holographic 3D display can reproduce depth, parallax, and other 3D information of objects, thereby providing users with a more realistic visual experience. A spatial light modulator (SLM) is the core device in holographic displays, responsible for modulating the phase or amplitude of light to reconstruct a 3D wavefront. Common types of SLMs include liquid-crystal-based SLMs and digital micromirror devices (DMDs). Among these, DMDs offer significant advantages in resolution, speed, and reliability. However, because the modulation depth of DMDs is limited to binary levels, the quality of the reconstructed images is often deficient. Research on generating high-quality binary holograms is therefore particularly important.
The methods for generating binary holograms have evolved continuously with the advancement of computational techniques and optical theories. In the early stages, the direct binarization method[9–12], which converts grayscale holograms into binary holograms through threshold processing, was widely adopted due to its simplicity and low computational cost. However, this method introduces significant quantization noise, resulting in poor reconstructed image quality[13,14]. An error diffusion algorithm was proposed[15,16] to provide more accurate binarization; it reduces noise and distortion by diffusing quantization errors to neighboring pixels. This method significantly enhances the quality of binary holograms while maintaining moderate computational complexity, which made it suitable for early computer processing capabilities. However, its optimization effect on complex holograms is limited[17,18], and it still introduces a certain level of noise. Later, the iterative Fourier transform algorithm (IFTA)[19] was introduced, which iterates between the frequency and spatial domains to gradually approximate the target hologram. IFTA can generate high-quality binary holograms and is particularly suitable for complex scenes and phase modulation, but it suffers from high computational complexity, numerous iterations, and lengthy processing time, imposing higher demands on hardware. Recently, stochastic gradient descent (SGD) based optimization algorithms for binary holograms have been proposed[20,21]. During the gradient descent process, these algorithms employ differentiable binary-similar functions to perform hologram binarization, and then obtain the optimized binary hologram by calculating gradients and applying gradient descent. However, a binary-similar function only approximates the binarization of the hologram and cannot precisely reproduce the true binarization process.
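The difference between direct thresholding and error diffusion described above can be illustrated with a minimal sketch. It is not taken from Refs. [9–16]: the Floyd-Steinberg weights, the raster scan order, and the function names are assumptions chosen for illustration.

```python
def threshold_binarize(image, level=0.5):
    """Direct binarization: compare every pixel with a fixed threshold."""
    return [[1 if v >= level else 0 for v in row] for row in image]


def error_diffusion_binarize(image, level=0.5):
    """Binarize in raster order and push each pixel's quantization error onto
    not-yet-processed neighbours (Floyd-Steinberg weights, an assumed choice)."""
    rows, cols = len(image), len(image[0])
    work = [[float(v) for v in row] for row in image]  # copy that accumulates error
    out = [[0] * cols for _ in range(rows)]
    # (row offset, column offset, weight): right, below-left, below, below-right
    kernel = ((0, 1, 7 / 16), (1, -1, 3 / 16), (1, 0, 5 / 16), (1, 1, 1 / 16))
    for r in range(rows):
        for c in range(cols):
            out[r][c] = 1 if work[r][c] >= level else 0
            err = work[r][c] - out[r][c]
            for dr, dc, w in kernel:
                rr, cc = r + dr, c + dc
                if 0 <= rr < rows and 0 <= cc < cols:
                    work[rr][cc] += err * w
    return out


flat = [[0.4] * 8 for _ in range(8)]  # uniform grey patch just below the threshold
direct = threshold_binarize(flat)
diffused = error_diffusion_binarize(flat)
direct_mean = sum(map(sum, direct)) / 64
diffused_mean = sum(map(sum, diffused)) / 64
```

For this flat patch, thresholding maps every pixel to zero, so the mean level is lost entirely, whereas error diffusion keeps the mean of the binary pattern close to the original grey level.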
In this paper, we propose an SGD-based binary CGH optimization framework in which a convolutional neural network (CNN) performs the differentiable hologram binarization operation. Compared to conventional binary CGH generation methods, our CNN-based approach provides a more accurate calculation of the binary CGH and significantly reduces the binary quantization noise. Experimental results demonstrate that the binary CGHs generated by the proposed optimization framework achieve superior holographic reconstructions in both 2D and 3D cases.
2. Methods
A holographic near-eye display configuration based on a DMD is presented in Fig. 1. When illuminated by a plane wave, a binary CGH displayed on the DMD passes through a band-limited diffraction configuration to reproduce the target 2D or 3D images. Specifically, the beam modulated by the DMD propagates to the back focal plane of the Fourier lens through the optical Fourier transform, as described by the following equation:

$$U_f(u, v) = \mathcal{F}\{H_b(x, y)\}, \quad H_b = \mathcal{B}(H_a), \tag{1}$$

where $H_b$ denotes the binary hologram displayed on the DMD, which is obtained by applying a binarization operation $\mathcal{B}$ to an initial real-valued amplitude hologram $H_a$, and $U_f$ is the wavefront on the back focal plane of the Fourier lens, obtained by Fourier transforming the binary hologram. The wavefront is then filtered using a single-sideband (SSB) filter mask to block the zero-order (also known as DC) term and the conjugate noise[22]. Subsequently, the reconstructed images can be observed by the eye in front of the SSB window, which is equivalent to virtually reconstructing the images through Fresnel back-propagation:

$$U_i(x, y) = \mathrm{Fr}_{\lambda, z}\{M(u, v)\, U_f(u, v)\}, \tag{2}$$

where $U_i$ represents the wavefront on the image plane, $\lambda$ is the wavelength, $z$ is the distance from the filtering plane to the image plane, $M$ denotes the SSB mask applied on the filtering plane, and $\mathrm{Fr}_{\lambda, z}\{\cdot\}$ denotes Fresnel back-propagation over the distance $z$.
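The double-step propagation described above (Fourier transform by the lens, SSB filtering, Fresnel back-propagation) can be sketched numerically as follows. This is a minimal illustration under stated assumptions rather than the authors' implementation: the half-plane mask geometry, the transfer-function form of the Fresnel step, the sampling of the filter plane, the 50 mm distance, and all function names are hypothetical. The wavelength, focal length, and pixel pitch are the values given for the experimental setup.

```python
import numpy as np


def ssb_mask(shape, dc_margin=1):
    """Pass the rows above the zero-frequency row of a centred spectrum.

    The DC row, `dc_margin` rows next to it, and the conjugate half are blocked.
    The half-plane geometry is an assumption made for this sketch.
    """
    mask = np.zeros(shape)
    mask[: shape[0] // 2 - dc_margin, :] = 1.0
    return mask


def fresnel_kernel(shape, pitch, wavelength, z):
    """Fresnel transfer function for distance z (constant phase factor dropped)."""
    fy = np.fft.fftfreq(shape[0], d=pitch)
    fx = np.fft.fftfreq(shape[1], d=pitch)
    fxx, fyy = np.meshgrid(fx, fy)
    return np.exp(-1j * np.pi * wavelength * z * (fxx**2 + fyy**2))


def reconstruct(h_b, mask, pitch_f, wavelength, z):
    """Binary hologram -> Fourier plane -> SSB filter -> image-plane wavefront."""
    # Step 1: the Fourier lens produces the centred spectrum of the hologram.
    u_f = np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(h_b), norm="ortho"))
    # Step 2: the SSB mask removes the DC term and the conjugate sideband.
    filtered = mask * u_f
    # Step 3: Fresnel back-propagation, written here as propagation over -z.
    kernel = fresnel_kernel(h_b.shape, pitch_f, wavelength, -z)
    return np.fft.ifft2(np.fft.fft2(filtered) * kernel)


wavelength, focal_length, dmd_pitch = 520e-9, 150e-3, 7.56e-6  # experimental setup values
rng = np.random.default_rng(0)
h_b = (rng.random((64, 64)) > 0.5).astype(float)  # random binary test hologram
pitch_f = wavelength * focal_length / (h_b.shape[0] * dmd_pitch)  # filter-plane sampling
mask = ssb_mask(h_b.shape)
u_i = reconstruct(h_b, mask, pitch_f, wavelength, z=0.05)  # z = 50 mm is illustrative
intensity = np.abs(u_i) ** 2
```

Because the Fresnel transfer function has unit modulus, the propagation step only redistributes the energy that passes the mask, and a uniform hologram (pure DC) is blocked completely.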
Figure 1. DMD holographic near-eye display scheme based on double-step band-limited diffraction.
Figure 2(a) presents the SGD optimization flowchart based on the above band-limited diffraction optics. Taking the display of a 2D target image as an example, this process aims to find the optimized binary CGH that minimizes the loss between the holographically reconstructed image and the target image. We first initialize a real-valued amplitude hologram $H_a$ on the DMD plane and then convert it to a binary hologram $H_b$ using the forward binary operator. The focal image is then numerically reconstructed from the binary hologram via the double-step band-limited diffraction using Eqs. (1) and (2), and we calculate the intensity loss for the reconstructed image. For the 3D case, we calculate the loss for all depth layer images and sum those losses to obtain the total 3D loss. To achieve the optimized binary hologram, we pose the problem as finding the optimum amplitude hologram as well as the optimum parameters of the CNN model that result in minimum loss after binarization. This problem setting allows the binary hologram to be updated during gradient-based optimization. The optimization process can be summarized by solving the following optimization problem:

$$\{H_a^{*}, \theta^{*}\} = \arg\min_{H_a, \theta} \mathcal{L}\left(s\,|U_i|^2, I_t\right), \tag{3}$$

where $U_i$ is the reconstructed wavefront on the image plane, $\theta$ denotes the parameters of the CNN model, $I_t$ represents the intensity of the target image, and $s$ is an energy scale factor. We set the loss $\mathcal{L}$ as the $\ell_2$ mean square error (MSE), considering that the MSE is sensitive to differences in mean value rather than variance, which makes it appropriate for reducing the background noise of binary holograms. To solve Eq. (3), we use an iterative SGD-based optimization to update the amplitude hologram $H_a$. The update rule in iteration $k$ with step $\alpha$ is as follows:

$$H_a^{(k+1)} = H_a^{(k)} - \alpha\, \nabla_{H_a} \mathcal{L}. \tag{4}$$
Figure 2. (a) Flowchart of the SGD optimization algorithm for generating binary CGH. (b1), (b2) The binarization network architecture.
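The optimization loop in the flowchart can be sketched with a hand-written gradient, which keeps the example free of deep learning dependencies. Several simplifications are assumptions and differ from the method described here: the CNN binarizer is replaced by a fixed sigmoid surrogate (`soft_binarize`), the optics are reduced to a single sideband filter without the Fresnel step, plain gradient descent is used, and the step size, sigmoid slope `beta`, grid size, and all function names are illustrative.

```python
import numpy as np


def soft_binarize(h_a, beta=10.0):
    """Sigmoid stand-in for the binarization step (NOT the CNN of the paper)."""
    return 1.0 / (1.0 + np.exp(-beta * (h_a - 0.5)))


def optics(field, mask):
    """Simplified linear optics: keep one sideband of the spectrum (no Fresnel step)."""
    return np.fft.ifft2(mask * np.fft.fft2(field, norm="ortho"), norm="ortho")


def loss_and_grad(h_a, target, mask, s=1.0, beta=10.0):
    """MSE between s*|u|^2 and the target intensity, and its gradient w.r.t. h_a."""
    h_b = soft_binarize(h_a, beta)
    u = optics(h_b, mask)
    residual = s * np.abs(u) ** 2 - target
    loss = np.mean(residual**2)
    # Backward pass: d(loss)/d|u|^2, then the adjoint of the optics (equal to the
    # optics itself for a real mask and unitary FFTs), then the sigmoid derivative.
    g = (2.0 * s / residual.size) * residual
    grad_h_b = 2.0 * np.real(optics(g * u, mask))
    return loss, grad_h_b * beta * h_b * (1.0 - h_b)


def optimize(target, mask, iterations=40, alpha=0.1, seed=0):
    """Plain gradient descent on h_a, followed by a hard threshold."""
    h_a = np.random.default_rng(seed).random(target.shape)
    losses = []
    for _ in range(iterations):
        loss, grad = loss_and_grad(h_a, target, mask)
        losses.append(loss)
        h_a = h_a - alpha * grad  # gradient step on the amplitude hologram
    return (h_a >= 0.5).astype(float), losses


n = 16
mask = (np.fft.fftfreq(n) > 0)[:, None] * np.ones((1, n))  # positive row frequencies only
reference = (np.random.default_rng(1).random((n, n)) > 0.5).astype(float)
target = np.abs(optics(reference, mask)) ** 2  # a target that a binary hologram can reach
h_b_opt, losses = optimize(target, mask)
```

With plain gradient descent and a mean-normalized loss, the loss may fall only slowly. The final hard threshold also introduces exactly the surrogate-versus-binary mismatch that motivates replacing the fixed surrogate with a trained binarization network.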
In this paper, the energy scale factor $s$ and the step size $\alpha$ are set as constants, where and is assigned a value of 0.1. The optimized binary hologram $H_b^{*}$ is acquired through the binarization $\mathcal{B}$ of the optimized amplitude hologram $H_a^{*}$, that is, $H_b^{*} = \mathcal{B}(H_a^{*})$. Unlike in phase-only SGD methods, the backward gradient of the loss cannot be obtained directly, because binary quantization by simply applying a threshold during forward propagation is non-differentiable. To address this issue, traditional methods employ binary-similar differentiable functions, such as hard hyperbolic tangent functions, to estimate the binary calculation[20,21], thereby making the whole forward calculation differentiable. However, such binary-similar functions fail to give precise results, because a large truncation error remains between these functions and the true binarization. We therefore design a CNN to perform the differentiable hologram binarization operation instead of using binary-similar functions. Our CNN model is not only differentiable but also binarizes the hologram more accurately.
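The mismatch between a binary-similar function and true thresholding can be made concrete with a small sketch. The clipped linear ramp below is a shifted hard-tanh-style surrogate chosen for illustration; its slope and exact form are assumptions and are not the mappings of Refs. [20,21].

```python
def hard_threshold(x, level=0.5):
    """True binarization: piecewise constant, so its gradient is zero almost everywhere."""
    return 1.0 if x >= level else 0.0


def clip_surrogate(x, slope=4.0, level=0.5):
    """Binary-similar function: a steep line through `level`, clipped to [0, 1]."""
    return min(1.0, max(0.0, slope * (x - level) + 0.5))


def clip_surrogate_grad(x, slope=4.0, level=0.5):
    """Derivative of the surrogate: `slope` inside the linear band, zero outside."""
    return slope if abs(x - level) < 0.5 / slope else 0.0


samples = [i / 20 for i in range(21)]  # amplitudes 0.00, 0.05, ..., 1.00
gap = [abs(clip_surrogate(x) - hard_threshold(x)) for x in samples]
worst_gap = max(gap)  # largest disagreement between surrogate and true binarization
```

The surrogate supplies a usable gradient near the threshold, but the forward values it produces there are not binary, so the loss is evaluated on a hologram that differs from the one eventually displayed on the DMD.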
A diagram of the CNN architecture is shown in Fig. 2(b). The role of this CNN is to closely approximate the binarization operation on the input real-valued amplitude hologram and thereby output a near-ideal binary hologram whose pixel values are very close to zero or one. In the network architecture, we employ a fully convolutional block referred to as “Convlayer”, which consists of two convolutional layers, two batch normalization layers, and two ReLU activation layers. Prior to the “output” layer, the activation function is given by . During the SGD optimization process, the CNN is jointly trained to find its optimized parameters. After the SGD optimization finishes, we obtain the final binary hologram from the output of the CNN, which is then displayed on the DMD for precise binary holographic wavefront modulation.
3. Experiment
We first compare the quality of the binary CGHs generated by our CNN-based binary SGD (CNN-B-SGD) optimization method with those obtained using the traditional binary-similar function-based binary SGD (Func-B-SGD) method. For the traditional Func-B-SGD, we employ two types of binary-similar functions. The first is the clip mapping reported in Ref. [20], and the second is the curve mapping, another differentiable mapping function reported in Ref. [21]. To further suppress speckle noise, we employ the temporal multiplexing (TM) method, in which 20 binary holograms are generated from 20 different random initial amplitudes in the SGD iteration and then displayed sequentially on the DMD using fast binary modulation at a frame rate of 9.52 kHz. Figure 3 shows the simulated results from the binary CGHs generated using different methods. The peak signal-to-noise ratio (PSNR) and root mean square error (RMSE) values are given in each image. Figure 3(a) illustrates five test target images. Figure 3(b) shows the simulation results from binary holograms generated by direct binarization of the SGD-optimized amplitude hologram, where the reconstruction quality is low, with poor PSNR and RMSE metrics, due to direct binary quantization noise. Figures 3(c) and 3(d) show the simulated results from binary holograms generated by Func-B-SGD using the clip mapping and curve mapping binary-similar functions, respectively. Figure 3(e) shows the simulated results from binary holograms generated by our CNN-B-SGD method. Our CNN-B-SGD optimization method provides better reconstructions, with higher PSNR and lower RMSE, confirming the ability of the proposed method to produce higher-quality DMD holographic displays.
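The two quality metrics and the temporal multiplexing step can be written down in a few lines. This sketch assumes intensities normalized to a peak value of 1 and images stored as flat lists. The two frames are contrived so that their errors cancel exactly, which real speckle patterns only do approximately; the function names are illustrative.

```python
import math


def rmse(recon, target):
    """Root mean square error between two images given as flat lists."""
    return math.sqrt(sum((r - t) ** 2 for r, t in zip(recon, target)) / len(target))


def psnr(recon, target, peak=1.0):
    """Peak signal-to-noise ratio in dB for images with maximum value `peak`."""
    err = rmse(recon, target)
    return math.inf if err == 0 else 20.0 * math.log10(peak / err)


def time_average(frames):
    """Mean intensity over sequentially displayed frames (temporal multiplexing)."""
    return [sum(pixel) / len(frames) for pixel in zip(*frames)]


target = [0.2, 0.5, 0.8, 0.5]
frame_a = [0.3, 0.4, 0.9, 0.4]  # target plus a noise pattern
frame_b = [0.1, 0.6, 0.7, 0.6]  # target minus the same pattern
averaged = time_average([frame_a, frame_b])

single_frame_psnr = psnr(frame_a, target)   # about 20 dB for an RMSE of 0.1
averaged_rmse = rmse(averaged, target)      # close to zero in this contrived case
effective_rate_hz = 9.52e3 / 20             # 20 frames per image at 9.52 kHz
```

Dividing the 9.52 kHz binary frame rate by the 20 multiplexed holograms gives an effective image rate of 476 Hz.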
Figure 3. (a) Target images. (b)–(e) Simulated comparison of different binary CGH methods, including (b) direct binarization of the SGD-optimized amplitude hologram, (c) Func-B-SGD optimization with clip mapping binary-similar function, (d) Func-B-SGD optimization with curve mapping binary-similar function, and (e) our proposed CNN-B-SGD optimization.
We also evaluated the performance of the binary holograms generated by our CNN-B-SGD method through holographic near-eye display experiments, with the experimental optical setup shown in Fig. 4(a). In this study, a DMD with a resolution of and a pixel pitch of 7.56 µm is utilized to display holograms. A 520 nm laser beam is collimated by means of a collimator lens and subsequently directed onto the DMD through a mirror. The optimized binary CGH is loaded onto the DMD and illuminated by the laser beam at an incident angle of 24°. Following modulation by the DMD, the beam is transmitted through a filter positioned at the back focal plane of a Fourier lens with a focal length of 150 mm. This filter is designed to remove the DC and conjugate noise. Charge-coupled devices (CCDs) coupled with zoom lenses are employed to capture the final image.
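As a worked check on the geometry of this setup, the size of one diffraction order on the back focal plane follows from the grating relation: a pixel pitch p gives a spatial-frequency period of 1/p, which the lens maps to a length of λf/p. The sketch below uses the wavelength, focal length, and pitch stated above and ignores the 24° incidence angle; the function name is illustrative.

```python
def fourier_plane_extent(wavelength, focal_length, pitch):
    """Side length of one diffraction order on the back focal plane of the lens."""
    return wavelength * focal_length / pitch


extent_m = fourier_plane_extent(520e-9, 150e-3, 7.56e-6)
extent_mm = extent_m * 1e3            # roughly 10.3 mm
sideband_height_mm = extent_mm / 2    # upper bound for a single sideband window
```

The filter aperture therefore has to be on the order of a few millimetres, and it scales linearly with wavelength.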
Figure 4. (a) Optical setup of DMD holographic near-eye display. (b) Comparison of experimental results of different methods. (c) Experimental results of holographic display using our CNN-B-SGD-based binary CGHs for the 2D image case. (d) Experimental results of holographic display using our CNN-B-SGD-based binary CGHs for a true 3D case when the camera is focusing from 400 to 700 mm. Intensity image and corresponding depth map of the 3D scene are illustrated.
Figure 4(b) shows the optical experimental results from the binary CGHs generated using different methods. From left to right, the results correspond to binary holograms generated by direct binarization of the SGD-optimized amplitude hologram, Func-B-SGD optimization with the clip mapping binary-similar function, Func-B-SGD optimization with the curve mapping binary-similar function, and our CNN-B-SGD method. The experimental results are highly consistent with the simulation results in Fig. 3.
Figure 4(c) shows the experimental results for four test target images, which are consistent with the simulation results in Fig. 3(e). To further demonstrate true 3D holographic display, we calculate the binary CGH by modifying the optimization of Eq. (4) into a 3D loss function. The target 3D volume is rendered into multiple focal stack images[20], and we calculate the amplitude loss for all depth layers and sum those losses to obtain the total 3D loss. Figure 4(d) shows an example of the 3D target with its intensity image and depth map, as well as the experimentally captured focal images at different distances from the camera. In each case, the in-focus and out-of-focus contents of the reconstructions change correctly between sharp and blurred, confirming that the binary holograms generated by our CNN-B-SGD method can provide high-contrast and speckle-free 3D images with realistic focus cues.
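The 3D loss described above is a sum of per-layer losses over the focal stack. A minimal sketch, assuming the reconstructions and targets are supplied as lists of equally sized arrays (one per depth layer) and that each layer uses an MSE term; the function name and the toy data are hypothetical.

```python
import numpy as np


def focal_stack_loss(recon_stack, target_stack, s=1.0):
    """Total 3D loss: per-layer MSE summed over all depth layers."""
    return float(sum(np.mean((s * r - t) ** 2) for r, t in zip(recon_stack, target_stack)))


# Two hypothetical 4x4 depth layers: the first matches its target, the second does not.
targets = [np.zeros((4, 4)), np.zeros((4, 4))]
recons = [np.zeros((4, 4)), np.full((4, 4), 0.5)]
total = focal_stack_loss(recons, targets)  # 0 from layer one plus 0.25 from layer two
```

In a full pipeline, each entry of `recon_stack` would come from reconstructing the same binary hologram at a different propagation distance.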
Moreover, Table 1 compares the computation time for generating a single binary CGH using different methods. We applied 40 iterations in the SGD optimization, since the loss had already converged to a small value by then. Because of the additional CNN model in the binarization operation, our CNN-B-SGD method takes longer (3.60 s) than the other three approaches (1.97 to 2.14 s). The test was performed on an NVIDIA GeForce RTX 4070 GPU for CNN inference and an Intel Core i7-14700KF CPU for the other, non-CNN computations. It should be noted that generating binary CGHs with such an iterative strategy still faces challenges in meeting the requirements of real-time holographic display. This might be addressed by employing a more advanced computation platform[23].
Table 1. Runtime for Generating Single Binary CGH Using Different Methods

| Method | Time (s) |
| --- | --- |
| Direct binarization | 1.97 |
| Func-B-SGD with clip mapping | 2.05 |
| Func-B-SGD with curve mapping | 2.14 |
| Proposed method | 3.60 |
In addition, we carry out an experimental demonstration of a full-color holographic near-eye display using our CNN-B-SGD binary CGHs. In this case, the DMD is synchronized with an RGB fiber-coupled laser source consisting of three precisely aligned beams operating at 488, 520, and 637 nm. The binary CGHs for the red, green, and blue channels are calculated separately using the proposed CNN-B-SGD optimization at the corresponding wavelengths and displayed on the DMD sequentially for color mixing. The wavelength dispersion of the DMD is compensated for each color by multiplying by a wavelength-dependent tilt phase in each binary CGH optimization[24]. Figure 5 shows the captured reconstructed images, demonstrating that our method can also successfully deliver full-color content for DMD holographic near-eye displays.
Figure 5. Experimental results of full-color holographic display using our CNN-B-SGD based binary CGHs.
In summary, we propose an SGD-based binary CGH optimization framework in which a CNN performs the differentiable binarization operation. Our method significantly reduces quantization noise compared to conventional binary CGH generation approaches. Binary holography holds great promise for future VR/AR 3D display optics, and our work advances the prospects for high-fidelity and low-cost DMD holographic VR/AR displays with superior true 3D viewing experiences.
[9] P. A. Cheremkhin, E. A. Kurbatova. Binarization of digital holograms by thresholding and error diffusion techniques. Digital Holography and Three-Dimensional Imaging 2019, Th3A.22(2019).