Holographic near-eye augmented reality (AR) displays featuring tilted inbound/outbound angles on compact optical combiners hold significant potential yet often struggle to deliver satisfying image quality. This is primarily attributed to two reasons: the lack of a robust off-axis-supported phase hologram generation algorithm; and the suboptimal performance of ill-tuned hardware parts such as imperfect holographic optical elements (HOEs). To address these issues, we incorporate a gradient descent-based phase retrieval algorithm with spectrum remapping, allowing for precise hologram generation with wave propagation between nonparallel planes. Further, we apply a camera-calibrated propagation scheme to iteratively optimize holograms, mitigating imperfections arising from the defects in the HOE fabrication process and other hardware parts, thereby significantly lifting the holographic image quality. We build an off-axis holographic near-eye display prototype using off-the-shelf light engine parts and a customized full-color HOE, demonstrating state-of-the-art virtual reality and AR display results.
1. INTRODUCTION
Augmented reality (AR) creates a novel way for human beings to experience the world by integrating virtual scenes with real environments [1]. Currently, AR display methods can be categorized into two primary types [2]. One is the video see-through AR display, which overlays virtual information directly onto real-world scenes captured by cameras, as exemplified by Apple Vision Pro [3]. The other is an optical see-through AR display, which uses optical components to merge real ambient light with virtual content, such as Google Glass [4]. Despite advancements in imaging quality, many AR products still struggle with the problem of vergence-accommodation conflict (VAC), which may cause visual fatigue and discomfort [5,6].
Unlike traditional 3D displays that rely on stereoscopic methods to simulate depth cues [7], holographic displays provide a more immersive and strain-free viewing experience by accurately reproducing the light wavefront using diffraction, thereby addressing the VAC problem by offering human eyes’ natural depth cues for proper focus adjustments [8,9]. Remarkably, computer-generated holography (CGH) has recently become mainstream, simulating the propagation of light beams and generating hologram patterns digitally. The generated holograms are loaded onto a spatial light modulator (SLM) to reproduce the light wavefront and create a 3D scene consistent with the original object [10,11].
Existing CGH algorithms used in holographic near-eye displays are mostly applicable only to the propagation of light between parallel planes, requiring the source and target planes to be aligned in parallel; examples include the angular spectrum method (ASM) [12,13]. However, when the target plane is not parallel with the source plane, these traditional methods produce a mismatch that degrades imaging quality. Off-axis aberration must therefore be considered in propagation between nonparallel planes, which poses significant challenges in practice. For instance, in the practical use of wearable holographic near-eye displays, off-axis projection of holograms using a holographic optical element (HOE) combiner often involves propagation between nonparallel planes [14]. This complicates the accurate generation of holograms for displays, making it crucial to develop methods that effectively address these propagation challenges.
As a lightweight optical combiner, an HOE [15] offers advantages such as high diffraction efficiency, minimal diffraction orders, spectral selectivity, and optical see-through capability, making it suitable for AR products [16]. Li et al. simplified their system considerably by replacing the beam splitter and lens with HOE [17]. Subsequently, Microsoft Research developed a compact, eyeglasses-style display with a wide field of view (FOV) [18]. To enlarge the eyebox, Jang et al. [19] implemented a compact pupil-shifting HOE (PS-HOE), enabling exit-pupil shifting without bulky mechanisms. Similarly, Xia et al. expanded the eyebox of holographic displays using a lenslet array to fabricate HOE [20]. Notably, waveguides are also used to miniaturize AR displays. For instance, metasurface gratings and a dispersion-compensating waveguide were employed to eliminate bulky collimation optics, enabling full-color, 3D AR content in a compact form factor [21].
To calculate off-axis propagation between nonparallel planes, one method tilts the angular spectrum of parallel plane waves by coordinate rotation in the Fourier domain [22,23]. Subsequently, a nonuniform fast Fourier transform method was proposed to overcome the sampling limitations of the traditional fast Fourier transform on a tilted plane [24]. As a powerful iterative method, the stochastic gradient descent (SGD) algorithm demonstrates superior performance in hologram retrieval when combined with the traditional angular spectrum method (SGD-ASM) [25,26]. However, the SGD-ASM method cannot handle the propagation between nonparallel planes.
This work presents an HOE-empowered, full-color, off-axis holographic AR display design paradigm. Specifically, we propose a novel off-axis-supported hologram generation algorithm incorporating SGD-ASM with spectrum remapping. This algorithm effectively mitigates issues such as distortion and depth mismatches caused by the tilted propagation with the off-axis HOE. To further enhance the display quality, we utilize the camera-in-the-loop (CITL) calibration [25,26] to optimize the holograms with ill-tuned hardware, including negative impacts caused by defects inherent in the HOE fabrication process, nonuniform illumination, and speckles caused by the coherent light source. Our work provides a novel solution to improve the image quality of HOE-based holographic AR displays significantly, and this end-to-end optimization method empowers the application of HOE in wearable near-eye displays.
2. MOTIVATION OF OFF-AXIS CONFIGURATION
Figure 1 illustrates two optical designs for wearable near-eye displays. The traditional coaxial design, shown in Fig. 1(a), primarily consists of an SLM, a beam splitter (BS), and a curved partial mirror. In this configuration, the SLM modulates light, producing an intermediate reconstructed holographic image, which is then magnified by the curved partial mirror and directed to the eye via the BS. The FOV in this coaxial design depends on the focal length of the curved partial mirror. However, the use of a traditional BS limits the FOV and contributes to a bulky form factor. In contrast, the off-axis design features modulated light beams that strike the combiner at an oblique angle before reflecting toward the eye, as depicted in Fig. 1(b). Here, the term “off-axis” indicates the tilted incidence of light on the combiner. As reported in prior work, an HOE can serve as the combiner and integrate multiple optical functions including off-axis projection [18], while offering improved transparency, thus significantly reducing the form factor of holographic AR displays. The basic design consists of an SLM and an HOE. In this off-axis configuration, the light beam modulated by the SLM is reflected by the HOE and then converges to the eye, effectively combining the function of the BS and curved mirror, leading to a more compact design reminiscent of eyeglasses.
Figure 1. Illustration of holographic AR displays with two different types of optical combiners. (a) Traditional coaxial design. Light from the SLM is reflected onto a curved partial mirror by a BS and then reflected toward the human eye. (b) Proposed off-axis design. Light from the SLM hits the HOE at an oblique incidence angle and is then reflected and converged toward the human eye.
Further, in the off-axis design shown in Fig. 1(b), the HOE combiner interacts with off-axis incident beams, providing eye relief similar to that of traditional eyeglasses. As a result, the FOV is significantly larger than that of traditional coaxial designs, such as the birdbath design depicted in Fig. 1(a). For example, when the width of the HOE is 50 mm and the eye relief is 30 mm, the horizontal FOV can reach nearly 80°. The off-axis design is crucial for balancing performance and portability requirements. However, it inherently introduces challenges related to off-axis propagation, such as optical aberrations, including distortion, which degrade the quality of the reconstructed image. In practice, it is necessary to select an appropriate off-axis angle to avoid interference with the wearer’s head to minimize aberrations. Additionally, factors such as the quality of the HOE, specifically, its diffraction uniformity, and laser speckle should be carefully considered and optimized during the hologram generation process.
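The FOV figure quoted above follows from simple geometry: a flat combiner of width w viewed at eye relief r subtends a full horizontal angle of 2·arctan(w/2r). A quick numerical check of the 50 mm / 30 mm example from the text (the helper function name is ours):

```python
import math

def horizontal_fov_deg(hoe_width_mm, eye_relief_mm):
    """Full horizontal FOV subtended by a flat combiner of the given width
    at the given eye relief."""
    half_angle = math.atan((hoe_width_mm / 2) / eye_relief_mm)
    return 2 * math.degrees(half_angle)

fov = horizontal_fov_deg(50, 30)  # ≈ 79.6°, i.e., "nearly 80°"
```

Shrinking the eye relief or widening the HOE both enlarge the FOV, which is why the off-axis combiner outperforms the coaxial birdbath layout in this respect.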
3. OFF-AXIS-SUPPORTED HOLOGRAM GENERATION
To support hologram generation tailored for our off-axis system, we extend the vanilla SGD-ASM by modifying the propagation model and introduce the CITL framework to further improve imaging quality in off-axis configurations.
A. Vanilla SGD-ASM
As shown in Fig. 2(a), suppose that the SLM field is given by $u_s(x, y)$ and its spectrum by $U_s(f_x, f_y)$ on the SLM plane, and that the reference plane field is given by $u_r(x, y)$ and its spectrum by $U_r(f_x, f_y)$ on the reference plane, where $(f_x, f_y)$ are spatial frequencies on the SLM plane.
Figure 2. Overview of tilt-SGD and tilt-CITL. (a) Hologram generation with the tilted plane setting. The initial phase hologram loaded on the SLM is a random phase to avoid getting stuck in a local optimum during iteration. The reference plane is parallel to the SLM and set for propagation between two parallel planes. Two coordinate systems are used in the propagation model: one is the SLM coordinate, while the other is the reference coordinate. Note that the tilted plane can be obtained by rotating the reference plane around either the x axis or the y axis. (b) Camera-calibrated hologram optimization. The phase is represented by a green frame and the amplitude is represented by a black frame. The initial hologram loaded on the SLM is also a random phase. Similar to the pipeline of tilt-SGD, the reference plane field can be obtained through the propagation of the SLM field by ASM, and the tilted plane field is obtained by applying the transformation matrix to the reference plane field. Note that tilt-CITL needs to capture the virtual imagery in a dark environment.
The complex amplitude distribution on the reference plane can be obtained by using the ASM as the wave propagation operator $\mathcal{P}$:
$$u_r(x, y) = \mathcal{P}[u_s(x, y)] = \mathcal{F}^{-1}\{\mathcal{F}[u_s(x, y)]\, H(f_x, f_y)\},\tag{1}$$
where $u_s(x, y) = a_s(x, y)\exp[i\phi_s(x, y)]$, which is the complex amplitude distribution on the SLM plane, $a_s$ is the amplitude on the SLM plane, $\phi_s$ is the phase on the SLM plane, $\mathcal{F}$ denotes the Fourier transform, $\mathcal{F}^{-1}$ denotes the inverse Fourier transform, and $H(f_x, f_y) = \exp\!\left(i 2\pi d \sqrt{1/\lambda^2 - f_x^2 - f_y^2}\right)$ is the transfer function of the propagation distance $d$.
The amplitude on the reference plane and the amplitude of the target image are used to construct a loss function, such as mean squared error (MSE). The wave propagation and the loss function are implemented in PyTorch, with optimization performed using a gradient descent algorithm, such as SGD, to refine the phase that minimizes the loss function.
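The paper's pipeline is implemented in PyTorch; as a library-agnostic sketch, the ASM propagator of Eq. (1) can be written with NumPy FFTs as below (grid size, wavelength, and pixel pitch are illustrative, and evanescent frequencies are simply zeroed). Because the transfer function is pure phase over propagating frequencies, the field energy is preserved:

```python
import numpy as np

def asm_propagate(u, d, wavelength, pitch):
    """Angular spectrum propagation of a sampled field u over distance d."""
    ny, nx = u.shape
    fx = np.fft.fftfreq(nx, pitch)                    # spatial frequencies
    fy = np.fft.fftfreq(ny, pitch)
    FX, FY = np.meshgrid(fx, fy)
    arg = 1 / wavelength**2 - FX**2 - FY**2
    H = np.where(arg > 0,                             # drop evanescent waves
                 np.exp(1j * 2 * np.pi * d * np.sqrt(np.maximum(arg, 0.0))),
                 0.0)
    return np.fft.ifft2(np.fft.fft2(u) * H)

# random phase-only SLM field propagated 10 cm at 532 nm with 4.5 um pitch
rng = np.random.default_rng(0)
u0 = np.exp(1j * 2 * np.pi * rng.random((256, 256)))
u1 = asm_propagate(u0, 0.10, 532e-9, 4.5e-6)
```

In the actual SGD-ASM pipeline this operator is differentiable (PyTorch tensors in place of NumPy arrays), so the phase on the SLM plane can be updated by backpropagation.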
B. Modified Tilt-SGD
We incorporate the Fourier spectrum remapping mentioned above into the SGD-ASM framework, enabling phase retrieval for tilted planes. This phase retrieval algorithm is referred to as tilt-SGD in this work, with its specific process illustrated in Fig. 2(a). The phase hologram loaded onto the SLM is initialized with a random phase, while its amplitude is fixed at 1. The wavefront on the SLM plane then propagates a distance $d$ to the reference plane using ASM. Subsequently, a rotation matrix is applied to the angular spectrum on the reference plane to perform the coordinate transformation from the reference plane to the tilted plane, thus yielding the reconstructed image on the tilted plane. The amplitude of the reconstructed image is extracted and compared with the amplitude of the target image via a loss function $\mathcal{L}$. The gradient descent method is employed as the optimization solver, updating the phase through backpropagation.
The following is a brief derivation of the Fourier spectrum remapping. The complex amplitude on the reference plane can be decomposed into plane waves using a Fourier transform. After a coordinate transformation, these decomposed plane waves are recombined on the tilted plane. The complex amplitude on the tilted plane is then obtained by applying an inverse Fourier transform. This procedure is called the “coordinate rotation in the Fourier domain” [22] (abbreviated as the Fourier spectrum remapping in this paper).
Following a similar routine to the prior work [22], the transformation matrix $T$ used to rotate coordinates around the $y$ axis by the angle $\theta$ is given as
$$T = \begin{pmatrix} \cos\theta & 0 & -\sin\theta \\ 0 & 1 & 0 \\ \sin\theta & 0 & \cos\theta \end{pmatrix}.\tag{2}$$
The complex amplitude on the reference plane can be expressed as
$$u_r(x, y) = \iint U_r(f_x, f_y)\exp[i 2\pi (f_x x + f_y y)]\,df_x\,df_y,\tag{3}$$
where $U_r(f_x, f_y) = \mathcal{F}[u_s(x, y)]\,H(f_x, f_y)$, $\lambda$ is the wavelength, and $d$ is the distance between the SLM plane and reference plane.
By applying matrix $T$ to Eq. (3), we obtain the angular spectrum on the tilted plane. Supposing that the wave field on the tilted plane is given by $u_t(x_t, y_t)$ and its spectrum by $U_t(f_{xt}, f_{yt})$, where $(f_{xt}, f_{yt})$ are spatial frequencies on the tilted plane, the remapped spectrum follows as
$$U_t(f_{xt}, f_{yt}) = U_r\!\left(f_{xt}\cos\theta + f_{zt}\sin\theta,\; f_{yt}\right), \qquad f_{zt} = \sqrt{1/\lambda^2 - f_{xt}^2 - f_{yt}^2}.\tag{4}$$
Therefore, the complex amplitude on the tilted plane is given by the inverse Fourier transform:
$$u_t(x_t, y_t) = \iint U_t(f_{xt}, f_{yt})\,J(f_{xt}, f_{yt})\exp[i 2\pi (f_{xt} x_t + f_{yt} y_t)]\,df_{xt}\,df_{yt},\tag{5}$$
where the Jacobian $J(f_{xt}, f_{yt}) = \left|\partial(f_x, f_y)/\partial(f_{xt}, f_{yt})\right|$ is added to conserve the total energy of the field after rotational transformation.
After rotating the Fourier spectrum from the reference plane to the tilted plane, we obtain the complex amplitude $u_t$ on the tilted plane, whose amplitude is used to calculate the loss function $\mathcal{L}$ against the amplitude of the target image. Then, the phase on the SLM plane can be retrieved by backpropagating the gradients of $\mathcal{L}$ with the gradient descent method. In this work, we use MSE as the loss function.
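Under the notation assumed above (rotation about the y axis by θ), the remapping step can be sketched as follows: frequencies sampled on the tilted plane are mapped back through the rotation to index the reference-plane spectrum, and the Jacobian of this map supplies the energy-conserving weight. The function below is our own illustrative helper; in practice the remapped grid is nonuniform, so the lookup is done by interpolation or NUFFT [24]:

```python
import numpy as np

def remap_frequencies(fxt, fyt, theta, wavelength):
    """Map tilted-plane spatial frequencies back to reference-plane ones
    for a rotation about the y axis by theta, returning the lookup
    coordinates and the Jacobian used for energy conservation."""
    fzt = np.sqrt(np.maximum(1 / wavelength**2 - fxt**2 - fyt**2, 0.0))
    fxr = fxt * np.cos(theta) + fzt * np.sin(theta)
    fyr = fyt
    # |d(fxr, fyr)/d(fxt, fyt)|, using d(fzt)/d(fxt) = -fxt / fzt
    jac = np.abs(np.cos(theta) - (fxt / np.maximum(fzt, 1e-12)) * np.sin(theta))
    return fxr, fyr, jac
```

For θ = 0 the map reduces to the identity with unit Jacobian, recovering ordinary parallel-plane ASM.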
We employ vanilla SGD, a variant method called “perspective-SGD,” and the proposed tilt-SGD to conduct numerical simulations for reconstructing a data set consisting of 20 test images from the DIV2K data set [27]. Perspective-SGD maps target images from a tilted plane to a reference plane based on the geometric perspective relationship between the two planes. The SLM field is propagated to the reference plane using ASM, and the loss function is constructed based on the amplitude of the reference plane field and the mapped target image. The phase on the SLM plane is then optimized via the SGD. However, unlike tilt-SGD, perspective-SGD only accounts for the geometric transformation of image intensities, without preserving depth cues.
To evaluate the performance of these methods, we use the peak signal-to-noise ratio (PSNR) and structural similarity (SSIM) as the metrics. The quantitative results are summarized in Table 1 with qualitative and quantitative comparisons presented in Fig. 3. Overall, our findings indicate that tilt-SGD significantly outperforms vanilla SGD and perspective-SGD in terms of reconstruction quality.
Table 1. Quantitative Results Indicating Average PSNR↑ and SSIM↑ Metrics of 20 Test Images in Simulation

Method       SGD      Perspective-SGD      Tilt-SGD
PSNR (dB)    13.27    16.24                36.18
SSIM         0.41     0.40                 0.96
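For reference, the PSNR metric reported in Table 1 follows the standard definition (peak value taken as 1 for normalized images):

```python
import numpy as np

def psnr(ref, test, peak=1.0):
    """Peak signal-to-noise ratio in dB between two images."""
    ref = np.asarray(ref, dtype=float)
    test = np.asarray(test, dtype=float)
    mse = np.mean((ref - test) ** 2)   # mean squared error
    return 10 * np.log10(peak**2 / mse)
```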
Figure 3. Simulation results (PSNR, in dB). For each set (left to right): simulation results with SGD, perspective-SGD, and tilt-SGD.
C. Camera-Calibrated Hologram Optimization
Tilt-SGD assumes an ideal propagation model, neglecting actual hardware factors such as laser speckle noise, nonuniform illumination, assembly error, and performance degradation caused by the HOE. During the HOE fabrication process, defects such as bubbles, ripples, and uneven diffraction efficiency are often introduced, significantly compromising imaging quality.
To mitigate these hardware-induced issues, we propose a camera-calibrated phase optimization method that incorporates the optimization strategy of CITL into tilt-SGD to enhance imaging quality. This approach is referred to as "tilt-CITL" in this paper. In this method, the propagation model from the SLM plane to the tilted plane remains consistent with the formulations in Eqs. (1) and (5). As illustrated in Fig. 2(b), tilt-CITL captures the output via a camera, establishes a loss function $\mathcal{L}$ between the captured results and the target images, minimizes $\mathcal{L}$ using a gradient descent solver, and updates the phase on the SLM plane through backpropagation. Tilt-CITL effectively accounts for hardware imperfections by directly incorporating them into the captured results, providing a more accurate representation of system deficiencies. This feedback mechanism significantly reduces aberrations and speckle noise while compensating for various hardware factors, including nonuniform illumination and HOE fabrication defects.
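Tilt-CITL itself backpropagates the loss between camera captures and targets through the differentiable tilt-SGD model. As a deliberately simplified toy (not the paper's gradient-based method; the names and the feedback rule below are ours), the loop shows why closing the loop over real hardware compensates unknown nonuniformity: a simulated display with a hidden gain field is driven until the "captured" image matches the target:

```python
import numpy as np

rng = np.random.default_rng(0)
target = np.ones(64)                 # desired uniform image
gain = 0.5 + rng.random(64)          # hidden hardware nonuniformity (e.g., HOE defects)

def capture(drive):
    """Stand-in for the camera: the hardware distorts whatever is displayed."""
    return gain * drive

drive = target.copy()
for _ in range(20):
    cap = capture(drive)
    drive *= (target / np.maximum(cap, 1e-9)) ** 0.5   # damped multiplicative feedback

residual = np.max(np.abs(capture(drive) - target))     # shrinks toward 0
```

The fixed point is drive = target / gain, i.e., the loop learns to pre-compensate the hardware; that pre-compensation is exactly the role the captured feedback plays in tilt-CITL.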
The SLM modulates and diffracts the incident collimated light, projecting the diffracted image onto the HOE plane. Since the HOE plane is not perpendicular to the optical axis, the images captured by the camera inevitably suffer from defocus and distortion. To mitigate this issue, the tilt-CITL process employs a matrix of circular markers for calibration, effectively compensating for the defocus and distortion caused by the off-axis HOE or other hardware factors. Following calibration, the corrected region provides a significantly clearer and more accurate representation.
4. DISPLAY SYSTEM IMPLEMENTATION
To validate the proposed method, we fabricated the HOE and implemented an off-axis holographic AR display prototype.
A. HOE Design and Fabrication
For the off-axis design proposed in Section 2, an off-axis reflective HOE needs to be fabricated. The fabrication process, illustrated in Fig. 4(a), employs a point-to-parallel light exposure method. In this method, a collimated reference beam and a divergent signal beam illuminate opposite sides of the HOE, interfering to form fringes in the photopolymer adhered to a glass substrate. The amplitude transmission coefficient of the HOE is linearly related to the intensity distribution produced by the interference of the signal and reference beams. Assuming the complex amplitude of the reference beam is $R(x, y)$ and that of the signal beam is $S(x, y)$ on the plane $z = 0$, i.e., the HOE plane, the amplitude transmission coefficient $t(x, y)$ can be expressed as
$$t(x, y) \propto |R + S|^2 = |R|^2 + |S|^2 + R^* S + R S^*,\tag{6}$$
where $R(x, y) = A_r \exp[i k (\alpha x + \beta y)]$ represents the complex amplitude of the reference beam on the HOE plane with $k = 2\pi/\lambda$, $A_r$ is the amplitude of the reference beam, $\alpha$ and $\beta$ are direction cosines of the wave vector of the reference beam, $S(x, y)$ represents the complex amplitude of the signal beam diverging from the viewpoint, and $A_s$ is its amplitude at a unit distance from the viewpoint. Note that only one diffraction order is kept for reconstruction.
Figure 4. Schematic diagram of the proposed HOE (a) fabrication and (b) reconstruction procedure. The HOE coordinate system is established to better analyze the fabrication and reconstruction process. Suppose the coordinates of the viewpoint are . (c) Fabricated HOE by a 532 nm laser. The HOE consists of a layer of photopolymer and a glass substrate. The glass substrate size is about , and the thickness is 1 to 2 mm. (d) The viewpoint formed by illuminating the HOE with a 532 nm laser.
The distance between the HOE plane and the viewpoint is set to 30 mm, providing adequate eye relief. The angle between the reference beam and the HOE is set to 45°, ensuring that the HOE functions properly when the probe beam is also at a 45° angle to the HOE during reconstruction. After fabrication, the HOE exhibits reflective and focusing properties, functioning similarly to a combination of a mirror and a lens. The HOE fabrication setup is constructed based on this design shown in Fig. 4(a), with additional details provided in the appendix.
To fabricate a full-color HOE, three lasers with wavelengths of 639, 532, and 457 nm are used for exposure, following the point-to-parallel light exposure method on photopolymers corresponding to each wavelength. Since the arrangement of the three photopolymers impacts the diffraction efficiency of the full-color HOE, we employ a laminated structure with two layers of glass substrates to maintain high diffraction efficiency. One substrate is coated with photopolymer on both sides, accommodating 639 and 532 nm wavelengths, while the other substrate is coated on a single side and operates at 457 nm.
In the reconstruction process, illustrated in Fig. 4(b), a probe beam from the SLM illuminates the HOE. The HOE reflects and focuses the incident probe beam toward the viewpoint, forming the reconstruction beam. Let the complex amplitude of the reconstruction beam be denoted by $u_c(x, y)$. Since the probe beam and the reference beam are a pair of conjugate waves, the complex amplitude of the probe beam can be expressed as $P(x, y) = R^*(x, y)$. Therefore, the complex amplitude of the reconstruction beam is given by $u_c = t \cdot P$; keeping only the retained diffraction order, it follows that
$$u_c(x, y) \propto R S^* \cdot R^* = |A_r|^2 S^*(x, y),\tag{7}$$
i.e., a conjugate of the signal beam that converges toward the viewpoint.
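The key algebraic step, that probing the hologram with the conjugate of the reference beam isolates a term proportional to S* (a wave converging back to the viewpoint), can be checked numerically. The sampling, wavelength, and amplitudes below are illustrative:

```python
import numpy as np

lam = 532e-9
k = 2 * np.pi / lam
x = np.linspace(-1e-3, 1e-3, 512)      # 1D slice of the HOE plane
z0 = 30e-3                             # viewpoint distance (eye relief)

A_r = 1.0
R = A_r * np.exp(1j * k * np.sin(np.radians(45)) * x)   # collimated reference at 45 deg
r = np.sqrt(x**2 + z0**2)
S = np.exp(1j * k * r) / r                              # signal diverging from the viewpoint

t = np.abs(R + S) ** 2                 # |R|^2 + |S|^2 + R* S + R S*
P = np.conj(R)                         # probe beam = conjugate of the reference
retained = (R * np.conj(S)) * P        # the single retained diffraction order
```

The retained order equals |A_r|² S*, a phase-conjugate copy of the signal that converges to the viewpoint; the other three terms of t·P propagate in different directions and are discarded.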
With the utilized SLM specifications, the horizontal FOV of the fabricated HOE is measured at 63°, enabling a wide and immersive viewing experience. Figure 4(c) shows a monochromatic HOE prepared using a 532 nm laser. In this setup, a collimated beam illuminates the monochromatic HOE and converges to form the viewpoint, as illustrated in Fig. 4(d), confirming the HOE's functionality.
To achieve optimal HOE diffraction efficiency, we conducted simulation and experimental analysis. Diffraction efficiency is a key parameter for evaluating HOE performance, as it characterizes the intensity distribution of incident light after interacting with the HOE [28]. Higher diffraction efficiency indicates lower energy loss, which is crucial for enhancing HOE-empowered display quality. It is typically defined as the ratio of the intensity of diffracted light at a specific diffraction order to the intensity of the incident light. For the HOE fabricated using a 532 nm laser, the relationship between diffraction efficiency and exposure time is illustrated in Fig. 5(a). The maximum diffraction efficiency reaches approximately 32% (at the +1st or −1st diffraction order) with an exposure time of 120 s.
Figure 5. Analysis of HOE diffraction efficiency. (a) Diffraction efficiency distribution according to exposure time. The exposure intensity is fixed at . (b) Effects of incident light angle deviation on the measured diffraction efficiency. Note that the diffraction efficiency is normalized.
Additionally, we analyze the impact on diffraction efficiency when the incident angle of the probe beam deviates from 45° during reconstruction. In our system, the reference beam and probe beam are emitted from the same laser source, ensuring they have the same wavelength. Therefore, the primary factor influencing diffraction efficiency is the incident angle deviation of the probe beam. Interestingly, as shown in Fig. 5(b), when the wavelength of the probe beam matches that of the reference beam, the HOE maintains a high diffraction efficiency even under a moderate deviation of the incident angle.
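The efficiency trends in Fig. 5 are qualitatively consistent with Kogelnik's coupled-wave theory for lossless volume gratings: on Bragg, a reflection grating has efficiency tanh²(π n₁ T / (λ √(c_R c_S))), where n₁ is the index modulation (which grows with exposure dose until the photopolymer saturates), T the grating thickness, and c_R, c_S obliquity factors. The helper below is a hedged sketch of that on-Bragg formula with illustrative parameters, not a model fitted to our HOE:

```python
import numpy as np

def kogelnik_reflection_eta(n1, thickness, wavelength,
                            c_r=np.cos(np.radians(45)), c_s=1.0):
    """On-Bragg diffraction efficiency of a lossless reflection volume grating
    (Kogelnik coupled-wave theory): eta = tanh^2(pi*n1*T / (lam*sqrt(c_r*c_s)))."""
    nu = np.pi * n1 * thickness / (wavelength * np.sqrt(c_r * c_s))
    return np.tanh(nu) ** 2
```

In this idealized model efficiency rises monotonically with n₁ and saturates below 1; in practice, overmodulation and scattering cause the roll-off observed beyond the optimal exposure time.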
B. Prototype Configuration and Implementation
Figure 6(a) shows the optical schematic of our off-axis projection layout design. Our design consists of a fiber-coupled laser, a collimating lens (CL), a linear polarizer (LP), a BS, a phase-only SLM, a relay system comprising two lenses (Lens 1 and Lens 2) in a 4f arrangement, and an HOE. The relay system is employed to magnify the image from the SLM. After SLM modulation, the wavefront on the SLM plane propagates over a certain distance to reach the front focal plane of Lens 1. The system then magnifies and relays the wavefront to the rear focal plane of Lens 2, after which it propagates a further distance to reach the reference plane.
Figure 6. System design and setup of the implemented prototype. (a) Optical schematic: CL, collimating lens; LP, linear polarizer; BS, beam splitter; SLM, spatial light modulator; L1, lens 1; L2, lens 2; HOE, holographic optical element. L1 and L2 form a relay system for magnifying the image. (b) Bench-top holographic AR display prototype. The light emitted from the fiber-coupled laser passes through the CL, LP, SLM, and the relay system, and constructs an image at the HOE plane observed through the eyepiece. A physical cube is placed behind the HOE as the reference. (c) Zoom-in details of the eyepiece and the full-color HOE. The distance between the HOE and the eyepiece is 30 mm, which is also the eye relief.
Our design sets the angle $\theta$ between the HOE and the reference plane to 45°, seeking to balance multiple factors. If $\theta$ is less than 45°, the observer or camera lens may block the image light from the SLM, causing occlusion. If $\theta$ exceeds 45°, the projected image on the HOE becomes more stretched, introducing distortion that is difficult to correct. Setting $\theta$ at 45° minimizes occlusion and distortion, ensuring a clearer, more accurate display. Additionally, this angle is chosen with future applications in wearable AR glasses in mind: it meets the ergonomic and optical requirements for devices worn on the face while maintaining display quality and user comfort.
As shown in Fig. 6(b), a benchtop system is built based on the optical schematic, with specifications detailed in Table 2. The laser is a fiber-coupled laser with wavelengths of 639, 532, and 457 nm, and the utilized SLM is a phase-only LCoS (UPOLABS HDSLM45R) with a pixel pitch of 4.5 μm. Compared with the optical schematic, a camera replaces the human eye to capture and analyze the optical output more precisely and consistently. The coherent light from the fiber-coupled laser is collimated by the CL and then illuminates the SLM via the reflective path of the BS. After SLM modulation, the light is magnified by the relay system, illuminates the HOE, and is diffracted by the HOE, with the resulting image captured by a camera. The LP adjusts the polarization state of the incident beam to meet the requirements of the phase-only SLM. The relay system, with a 3× magnification (L1 focal length of 50 mm and L2 focal length of 150 mm), scales the SLM display area to make full use of the larger HOE effective area.
Table 2. Specifications of the Off-Axis HOE Display Prototype

Experimental Devices     Parameters
Fiber-coupled laser      Wavelengths 639, 532, 457 nm
Collimating lens         Focal length 150 mm
Phase-only SLM           UPOLABS HDSLM45R; pixel pitch 4.5 μm
Lens 1                   Focal length 50 mm
Lens 2                   Focal length 150 mm
Eyepiece                 Focal length 9.52 mm; focusing range 0.3 m–infinity
Industrial camera        FLIR GS3-U3-23S6C-C
We employ an optical layout that differs from a traditional eyepiece to better simulate the size, position, and FOV of the human eye. Unlike traditional eyepieces, which have internal apertures and often suffer from light occlusion, our eyepiece places the aperture in front of the lens. This design aligns the eyepiece aperture with the pupil size, allowing for a more accurate evaluation of the near-eye display under conditions that closely replicate human vision. The front-aperture design also enables an unobstructed FOV comparable with that of the human eye, ensuring a wide field of view without interference. The relative position of eyepiece and full-color HOE is shown in Fig. 6(c). The eyepiece in our system has a focusing range from 0.3 m to infinity, accommodating various viewing distances.
To validate the effectiveness of tilt-SGD, experiments are carried out with SGD and tilt-SGD. As shown in Fig. 7, the comparison of reconstruction results reveals clear stretching distortion in the SGD results. This distortion occurs because a hologram computed for propagation between two parallel planes is projected directly onto the tilted plane. The tilt-SGD algorithm, designed for phase retrieval between nonparallel planes, successfully removes this stretching distortion.
Figure 7. Comparison of SGD and tilt-SGD results. (Left to right) Captured results and optimized holograms of SGD and tilt-SGD when the target image is located at 10 cm. The main purpose is to illustrate the effect of the proposed algorithm, so these results are not camera-calibrated, and distortion issues other than stretching remain.
Additionally, the proposed tilt-SGD method helps alleviate the issue of depth mismatch. In the reconstruction results of the grid pattern, the SGD reconstruction appears out of focus on the left and right edges, whereas the tilt-SGD reconstruction remains consistently in focus. However, the tilt-SGD results still exhibit other distortions, which will be further corrected through calibration in tilt-CITL.
5. RESULTS
This section presents the reconstruction results of holograms captured with the off-axis HOE display prototype, which supports a flexible switch between virtual reality (VR) and AR modes.
A. VR-Mode Holographic Display
The comparison between the VR results of tilt-SGD and tilt-CITL is illustrated in Fig. 8. Due to the uneven diffraction efficiency of the customized HOE or nonuniform illumination, the reconstruction results of tilt-SGD exhibit brightness nonuniformity, which is particularly evident in the OPTICA example with tilt-SGD (first row of Fig. 8). In contrast, the tilt-CITL results compensate for the brightness nonuniformity, leading to more consistent brightness across the entire image. Further, tilt-CITL noticeably reduces artifacts and speckle noise while preserving image contrast and details. This improvement in image quality is quantitatively confirmed through the PSNR measurements, indicating the advantage of tilt-CITL in enhancing the visual quality of HOE-based displays.
Figure 8. Comparison of results using tilt-SGD and tilt-CITL in the VR-mode holographic display. The 2D resolution of captured images is . For each set (left to right): target images, phase patterns by tilt-SGD, captured results of tilt-SGD, phase patterns by tilt-CITL, and captured results of tilt-CITL. The target image is set 10 cm away from the SLM plane. The PSNR metrics are reported. The phase patterns of the full-color target image are created by superimposing holograms from three separate color channels. Note that these captured images are normalized for visualization purposes.
In addition to the single-color VR results, full-color VR results are obtained using a full-color HOE and three lasers with wavelengths of 639, 532, and 457 nm. To obtain comprehensive full-color VR results, the three monochromatic results are combined during postprocessing. However, the introduction of full-color HOE with three photopolymer layers leads to problems such as brightness nonuniformity and speckle noise in each single channel, which are further amplified in the full-color results. Similarly, as shown in Fig. 8, these problems are mitigated by the tilt-CITL method, indicating its importance.
B. AR-Mode Holographic Display
The AR results are captured by adjusting the focus of the eyepiece to selected depths: 30 cm (marked by a physical cube) and 150 cm (marked by a head model). Specifically, 30 cm is a common distance at which users interact with objects or manipulate virtual tools at arm's length; the physical cube serves as an indicator for assessing the alignment and clarity of the AR system when virtual and real-world objects are close. Conversely, virtual signposts and pieces of information are usually projected at a distance of around 150 cm; the head model acts as a reference object to evaluate the system's effectiveness in rendering and maintaining the visibility of virtual content as the distance increases. The capability of the AR system to handle different depth cues is demonstrated by varying the focus distance, providing insights into its practical applications.
The target images of the AR results shown in Fig. 9 are consistent with those in VR mode, demonstrating the versatility of our system. Further, these monochromatic and full-color AR results indicate the adaptability of the system in handling different color schemes under various conditions. The AR system delivers clear, sharp imaging with uniform visual clarity and high-quality rendering whether the focus is set on distant or close objects, suggesting that the framework effectively accommodates a range of focal depths.
Figure 9. Acquired AR results at two focusing distances. (a) Near focus, with the camera focused on the real object “Physical Cube.” (b) Far focus, with the camera focused on the real object “Head Model.”
The proposed HOE-empowered off-axis holographic AR display demonstrates significant advancements in VR and AR applications through the integration of tilt-CITL calibration. By addressing the limitations of existing hologram retrieval algorithms such as SGD-ASM, the proposed tilt-SGD approach, which applies Fourier spectrum rotation, effectively generates holograms for nonparallel plane propagation.
The experimental results highlight the superior performance of tilt-CITL in addressing brightness nonuniformity, reducing artifacts, and mitigating speckle noise, particularly when compared with the vanilla tilt-SGD method. The quantitative analysis using PSNR metrics confirms the improvements in visual quality, with more consistent brightness and enhanced image contrast and detail retention. Further, the full-color VR and AR results underscore the adaptability of the system in handling the challenges posed by multilayer HOE structures and nonuniform illumination across different color channels. The ability of the AR system to maintain clarity and sharpness at varying focal depths, as demonstrated by near and far target imaging, suggests that the proposed framework offers a robust solution for immersive holographic displays with practical applications across various environments and depth ranges.
Due to the constraints of the Maxwellian-view display mode, the eyebox of our setup is not large enough for comfortable viewing. When the user’s head or eyes move slightly beyond this range, the image becomes blurry, distorted, or even disappears, negatively impacting the viewing experience. Therefore, expanding the eyebox is crucial for wearable applications. This issue has been discussed in previous work [29], which enlarged the eyebox by using a 2D steering mirror to generate an illumination beam with variable angles for the SLM. In future work, we plan to adjust the angle of the SLM illumination light to create multiple viewpoints, further expanding the eyebox. Alternatively, multiple viewpoints can also be created by using an HOE based on a lenslet array, as reported in Ref. [20].
Another challenge associated with the Maxwellian-view display mode is that the image displayed by the HOE remains in focus at all depths [30]. Although this feature helps mitigate the VAC problem, it also means that important depth cues for holographic AR displays are missing. Specifically, in the holographic near-eye display system, the spatial bandwidth product (SBP) is determined by the number of pixels and the laser wavelength. When both are fixed, the SBP remains constant. Since it is roughly equal to the product of the FOV and the eyebox, a trade-off exists between these two factors. In our work, the off-axis HOE configuration aims to achieve a larger FOV; however, the eyebox is limited to 0.93 mm at a 63° horizontal FOV. A small eyebox constrains the system’s numerical aperture (NA), resulting in an increased depth of field (DOF), which weakens depth perception. To address this issue, we intend to refine the exposure method used in HOE preparation to preserve depth cues. Specifically, we aim to design and manufacture a multifocal HOE to compensate for the lack of depth cues; by incorporating multiple foci in one HOE, we can provide varying depths to support a 3D viewing experience. Further, our system is restricted to static displays at this stage due to current hardware limitations. In the future, we plan to implement a full-color dynamic display, which requires synchronizing hologram updates on the SLM with the switching of the RGB channels. Last but not least, the generation of off-axis holograms still relies on iterative algorithms, which may require considerable preparation time. For instance, tilt-SGD hologram optimization takes about 250 s for 500 iterations per color channel, and tilt-CITL hologram optimization takes about 40 min for 500 iterations per color channel; both optimizations are run on an NVIDIA GeForce RTX 3090 graphics processing unit.
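The FOV–eyebox trade-off above can be made concrete with a back-of-the-envelope étendue calculation. This is only a sketch: the pixel count $N \approx 1920$ and wavelength $\lambda = 532\,\mathrm{nm}$ below are assumed for illustration and are not stated in this section.

```latex
% SBP (etendue) of a holographic display with N pixels at wavelength \lambda:
%   \mathrm{SBP} \approx N\lambda \approx \mathrm{eyebox} \times \mathrm{FOV}\ (\text{in rad})
\mathrm{eyebox} \;\approx\; \frac{N\lambda}{\mathrm{FOV}}
= \frac{1920 \times 532\,\mathrm{nm}}{63^{\circ} \times \pi/180^{\circ}}
\approx \frac{1.02\,\mathrm{mm}}{1.10\,\mathrm{rad}}
\approx 0.93\,\mathrm{mm}
```

Under these assumed values the estimate is consistent with the 0.93 mm eyebox reported above, illustrating why enlarging the FOV at fixed SBP necessarily shrinks the eyebox.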
Looking ahead, integrating neural networks [25,31] to facilitate real-time off-axis hologram generation can significantly speed up the process.
APPENDIX A: PSEUDOCODE OF PROPOSED ALGORITHMS
This section provides pseudocode for the proposed algorithms discussed in the paper. Specifically, Algorithm 1 outlines the tilted propagation model introduced in Section 3. The inputs are the SLM field, the propagation distance, and the tilted angle described in the main paper; the output is the tilted plane field. The tilted propagation model combines ASM with Fourier spectrum rotation: Algorithm 1 rotates the ASM-propagated field in the Fourier domain to obtain the field on the tilted plane.
Tilted Propagation
: SLM field
: propagation distance
: tilted angle described in the main text
: wavelength
: fast Fourier transform
: inverse fast Fourier transform
: circ function in spectral domain
: an interpolation function
1:
2:
3:
4:
5:
6:
7:
8:
9:
10:
11:
12: return
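As a concrete (and simplified) illustration of Algorithm 1, the NumPy sketch below performs ASM propagation followed by a rotation of the angular spectrum. A nearest-neighbour frequency remap stands in for the interpolation function, and all parameter values are illustrative; this is a sketch of the idea, not the paper's implementation.

```python
import numpy as np

def tilted_propagate(u_slm, pitch, z, theta, wl):
    """Sketch of tilted propagation: ASM to a parallel plane, then a rotation
    of the angular spectrum about the y-axis by `theta` (nearest-neighbour
    remap in place of a proper interpolation)."""
    ny, nx = u_slm.shape
    fx = np.fft.fftfreq(nx, d=pitch)
    fy = np.fft.fftfreq(ny, d=pitch)
    FX, FY = np.meshgrid(fx, fy)
    sq = 1.0 / wl**2 - FX**2 - FY**2
    prop = sq > 0                          # circ(): keep propagating waves only
    FZ = np.sqrt(np.maximum(sq, 0.0))
    H = np.where(prop, np.exp(1j * 2 * np.pi * z * FZ), 0.0)  # ASM transfer function
    U = np.fft.fft2(u_slm) * H             # spectrum on the parallel plane
    # Rotating the plane by theta rotates the (fx, fz) frequency pair; sample
    # the parallel-plane spectrum at the rotated frequencies.
    fx_rot = FX * np.cos(theta) + FZ * np.sin(theta)
    ix_f = fx_rot * nx * pitch             # fractional frequency index
    valid = prop & (np.abs(ix_f) <= nx // 2 - 1)   # drop frequencies leaving the grid
    ix = np.rint(ix_f).astype(int) % nx
    iy = np.arange(ny)[:, None]            # rows unchanged by a rotation about y
    U_tilt = np.where(valid, U[iy, ix], 0.0)
    return np.fft.ifft2(U_tilt)            # field on the tilted plane
```

In practice, a proper interpolation (as in Matsushima's rotational transform, Ref. [23]) and explicit handling of the off-axis carrier are needed for large tilt angles; this sketch simply discards frequencies that fall off the sampled grid.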
Algorithm 2 outlines the tilt-SGD optimization procedure discussed in Section 3.B. The inputs comprise the number of phase pattern update iters, the initial learnable scale factor, the initial phase on the SLM, the target amplitude, the propagation distance, and the tilted angle described in the main text; the output is the optimized phase on the SLM. Algorithm 2 feeds the tilted plane amplitude and the target amplitude into the loss function to iteratively optimize the phase on the SLM.
Tilt-SGD Hologram Optimization
: phase pattern update iters
: initial learnable scale factor
: initial phase
: target amplitude
: propagation distance
: tilted angle described in the main text
LossBackpropagation():
backpropagation through loss function
IdealBackpropagation():
backpropagation through idealized model
1:
2:
3: fordo
4:
Algorithm 1
5:
6:
7: return
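The loop in Algorithm 2 can be sketched in NumPy as follows. For brevity, a unitary Fourier transform stands in for the tilted propagation of Algorithm 1, and the Wirtinger gradient of the amplitude loss is written out by hand (the paper's implementation presumably uses an autodiff framework); all names and parameter values are illustrative assumptions.

```python
import numpy as np

def propagate(u):
    # stand-in for Algorithm 1's tilted propagation (any unitary linear model works here)
    return np.fft.fft2(u, norm="ortho")

def adjoint_propagate(U):
    # adjoint of the unitary forward model, used for backpropagation
    return np.fft.ifft2(U, norm="ortho")

def tilt_sgd(target_amp, iters=200, lr=0.1, seed=0):
    """Sketch of tilt-SGD: gradient descent on the SLM phase (plus a learnable
    scale s) so that the propagated amplitude matches the target amplitude."""
    rng = np.random.default_rng(seed)
    phi = rng.uniform(0.0, 2.0 * np.pi, target_amp.shape)  # initial SLM phase
    s = 1.0                                                # learnable scale factor
    for _ in range(iters):
        u = np.exp(1j * phi)
        U = propagate(u)
        a = np.abs(U) + 1e-12
        r = s * a - target_amp            # amplitude residual; loss = sum(r**2)
        gU = s * r * U / a                # dL/dU* for the amplitude loss
        gu = adjoint_propagate(gU)        # chain rule back through propagation
        phi -= lr * 2.0 * np.imag(gu * np.conj(u))   # dL/dphi via u = exp(i*phi)
        s -= (lr / target_amp.size) * np.sum(2.0 * r * a)  # small step on the scale
    return phi, s
```

The phase update uses the identity dL/dφ = 2 Im(gu · u*) for a real loss of a phase-only field, which is what autodiff would compute through the same graph.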
Algorithm 3 outlines the tilt-CITL optimization procedure introduced in Section 3.C. The inputs and output are the same as in Algorithm 2. Algorithm 3 feeds the captured amplitude on the tilted plane and the target amplitude into the loss function to iteratively optimize the phase on the SLM.
Tilt-CITL Hologram Optimization
: phase pattern update iters
: initial learnable scale factor
: initial phase
: target amplitude
: propagation distance
: tilted angle described in the main text
CameraPropagation(): camera capture + homography
Replace(m, n):
replace values of m with n, retain gradients from m
LossBackpropagation():
backpropagation through loss function
IdealBackpropagation():
backpropagation through idealized model
1:
2:
3: fordo
4:
5:
Algorithm 1
6:
7:
8: return
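To make the role of Replace(m, n) concrete, the NumPy sketch below performs one tilt-CITL update: the residual is computed from the captured amplitude, while gradients are backpropagated through the idealized model only, which is exactly the effect of replacing the simulated values with the captured ones while retaining the simulated gradients. A unitary Fourier transform again stands in for Algorithm 1, and `camera` is a hypothetical callable modeling capture plus homography alignment; none of these names come from the paper.

```python
import numpy as np

def citl_step(phi, s, target_amp, camera, lr=0.1):
    """One tilt-CITL update (sketch). Loss uses the captured amplitude;
    gradients flow through the idealized propagation model."""
    u = np.exp(1j * phi)
    U = np.fft.fft2(u, norm="ortho")   # idealized forward model
    a_sim = np.abs(U) + 1e-12          # simulated amplitude (keeps the gradient path)
    a_cap = camera(phi)                # amplitude actually measured on the tilted plane
    r = s * a_cap - target_amp         # residual from the real capture (Replace)
    gU = s * r * U / a_sim             # backpropagate through the ideal model only
    gu = np.fft.ifft2(gU, norm="ortho")
    phi_new = phi - lr * 2.0 * np.imag(gu * np.conj(u))
    s_new = s - (lr / target_amp.size) * np.sum(2.0 * r * a_cap)
    return phi_new, s_new
```

Because the hardware's deviations (e.g., gain nonuniformity) enter only through `a_cap`, repeated updates steer the hologram toward what the camera actually sees, which is how the method compensates for HOE and light-engine imperfections.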
APPENDIX B: IMPLEMENTATION DETAILS OF HOE FABRICATION
This section describes the hardware implementation of our HOE fabrication process. Figure 10 illustrates the bench-top experimental setup used for HOE fabrication. The system utilizes three lasers with wavelengths of 639, 532, and 457 nm, each with a maximum output power of 100 mW. Our fabrication procedure is built upon a dual-path interference method with a conjugate design. The laser beam is first split into two beams by a beam splitter. One beam passes through microscope objective 1, a collimating lens, and mirror 2 to form the reference beam, which illuminates the surface of the HOE. The other beam travels through the beam expander, mirror 3, mirror 4, and microscope objective 2 to form the signal beam. The interference between the reference and signal beams exposes the photopolymer, creating a holographic grating that combines the functions of a mirror and a lens.
Figure 10. Diagram of the experimental setup used for HOE fabrication. M1, M2, M3, M4: mirrors; DM1 and DM2: dichroic mirrors; BS: beam splitter; CL: collimating lens; HOE: holographic optical element.
[20] X. Xia, Y. Guan, A. State. Towards eyeglass-style holographic near-eye displays with statically expanded eyebox. IEEE International Symposium on Mixed and Augmented Reality (ISMAR), 312-319 (2020).
[23] K. Matsushima. Rotational transformation for reconstruction of digital holography and CGH creation. Digital Holography and Three-Dimensional Imaging, DWB4(2007).
[27] E. Agustsson, R. Timofte. NTIRE 2017 challenge on single image super-resolution: dataset and study. IEEE Conference on Computer Vision and Pattern Recognition Workshops, 126-135(2017).