Chinese Optics Letters, Volume. 14, Issue 3, 030901(2016)
Fast processing method to generate gigabyte computer generated holography for three-dimensional dynamic holographic display
Two methods, one for the graphics processing unit (GPU) and one for the central processing unit (CPU), are proposed to optimize look-up table (LUT) algorithms for computer generated holography (CGH). Numerical simulations and experimental results show that objects can be reconstructed with good quality. The proposed methods speed up the computation of CGH for a three-dimensional (3D) dynamic holographic display by optimizing both file loading and the inline calculation process. A phase-only CGH with gigabyte data for reconstructing 10 MB object samplings is generated. In addition, the proposed method effectively reduces the time cost of loading and writing offline tables on a CPU. It is believed that the proposed method can provide high-speed, large-data CGH for 3D dynamic holographic displays in the near future.
Holographic display is the most promising technology among display techniques. With the development of holography, achievements have been made in different aspects. Elimination of a zero-order beam[
To solve the problem, many methods have been proposed over the past few decades to accelerate real-time hologram generation. Integral photography has been used in a capture and reconstruction system that can reconstruct a 3D live scene generated by fast Fourier transform (FFT) at 12 frames/s[
Besides the algorithms mentioned above, there are optimized methods that exploit the characteristics of programming languages or high-performance hardware devices. In recent years, mixed programming has been proposed to accelerate the generation of CGH[
In this Letter, we propose two effective ways to accelerate the computation algorithm of a CGH based on the attributes of high performance computation hardware, GPU and CPU. We use dynamic parallelism[
To speed up the CRT method, the LUT method stores all offline computation results in tables. The computing device has to load these tables into the computer's memory when generating a hologram inline. The S-LUT algorithm, which uses the Fresnel approximation, reduces the inline computation load without sacrificing the quality of the reconstructed image. It splits the horizontal light modulation factor
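The separability that S-LUT exploits can be illustrated with a minimal Python sketch. It assumes the standard paraxial (Fresnel) phase exp[ik((x'-x)^2 + (y'-y)^2)/(2z)], whose horizontal and vertical parts multiply independently; the exact factors used in the Letter may differ. The wavelength, the function names, and the sample points are hypothetical, while the 8 μm pixel pitch and the 600 mm distance are the values quoted for the experiments. The 1D factors are computed on the fly here, whereas an LUT implementation would read them from precomputed offline tables.

```python
import cmath
import math

WAVELENGTH = 532e-9   # metres; assumed value, not taken from the Letter
PIXEL = 8e-6          # metres; hologram pixel pitch quoted in the Letter
K = 2 * math.pi / WAVELENGTH


def line_factor(delta, z):
    """1D Fresnel phase factor exp(i*k*delta^2 / (2*z))."""
    return cmath.exp(1j * K * delta * delta / (2 * z))


def hologram_direct(points, n):
    """Evaluate the full 2D Fresnel phase for every (pixel, point) pair."""
    holo = [[0j] * n for _ in range(n)]
    for px, py, z, amp in points:
        for row in range(n):
            for col in range(n):
                dx = col * PIXEL - px
                dy = row * PIXEL - py
                phase = K * (dx * dx + dy * dy) / (2 * z)
                holo[row][col] += amp * cmath.exp(1j * phase)
    return holo


def hologram_separable(points, n):
    """Build the same field from 1D horizontal and vertical factors."""
    holo = [[0j] * n for _ in range(n)]
    for px, py, z, amp in points:
        # In an LUT scheme these two lists would be table lookups.
        horiz = [line_factor(col * PIXEL - px, z) for col in range(n)]
        vert = [line_factor(row * PIXEL - py, z) for row in range(n)]
        for row in range(n):
            for col in range(n):
                holo[row][col] += amp * vert[row] * horiz[col]
    return holo


# Hypothetical object points: (x, y, z, amplitude), lengths in metres.
points = [(0.0, 0.0, 0.600, 1.0), (2e-5, 4e-5, 0.600, 0.5), (1e-5, 3e-5, 0.610, 0.8)]
direct = hologram_direct(points, 8)
separable = hologram_separable(points, 8)
max_error = max(abs(direct[r][c] - separable[r][c])
                for r in range(8) for c in range(8))
```

For each object point, the separable version evaluates 2n complex exponentials instead of n², which is where the reduction in inline work comes from in this sketch.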
S-LUT is a fast and useful algorithm for generating holograms of moderate size. However, its storage consumption grows quickly with hologram size, and for a very large hologram the offline tables cannot even fit in video memory at one time. C-LUT handles this problem better. Compared with S-LUT, C-LUT uses Fraunhofer diffraction to make a further approximation. Based on this, it defines the
CUDA is a general parallel computing architecture that can implement complex and time-consuming computation efficiently and is well suited to the highly parallel structure of the LUT algorithm. Recent versions of CUDA provide a feature called dynamic parallelism, which lets a kernel launch further kernels on the device, so the number of blocks and threads can be adapted to the computational complexity. This feature makes nested programming much simpler than before and can accelerate the LUT method to some extent. It has been used extensively in many other fields. The technique allows the GPU to act on intermediate results directly, without transferring the data back to the CPU, and therefore reduces the time spent on data communication between the GPU and CPU. Based on this mechanism, the computing process is shown in Fig.
Figure 1. Diagram of the generation of holograms by dynamic parallelism. “Blk” and “Thd” are the abbreviations of block and thread, respectively.
In our method, we allocate the blocks and threads as described above and implement the S-LUT and C-LUT algorithms in this way. The method divides the whole hologram into pieces and adjusts the calculation work to the currently available computing resources. Dynamic parallelism reduces the complexity of the nested program and converts it into a more controllable form, so the size of every sub-hologram can be modified with minimal code changes. To take advantage of the dynamic parallelism mechanism of CUDA, we consider the light intensity information of each object point as a
Figure 2. Flow chart of inline computation.
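The idea of dividing a hologram into sub-holograms whose size is a single adjustable parameter can be sketched on the host side as follows. This is plain Python, not CUDA dynamic parallelism, and it is not the authors' kernel layout; the function name and the hologram and tile sizes are hypothetical.

```python
def split_into_sub_holograms(width, height, tile_w, tile_h):
    """Yield (x0, y0, w, h) tiles that cover a width x height hologram.

    Tiles on the right and bottom edges are clipped so that the whole
    hologram is covered exactly once.
    """
    for y0 in range(0, height, tile_h):
        for x0 in range(0, width, tile_w):
            yield (x0, y0, min(tile_w, width - x0), min(tile_h, height - y0))


# Changing the tile size is the only edit needed to change the granularity.
tiles = list(split_into_sub_holograms(4096, 2048, 1024, 1024))
ragged = list(split_into_sub_holograms(1000, 600, 256, 256))
```

Each tile could then be handed to a separate unit of work, with the tile size chosen according to the memory available on the device.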
For the computation of holograms with sizes larger than
Figure 3. Results of time consumption on (a) S-LUT and (b) C-LUT.
In Ref. [
The LUT algorithm trades storage space for inline run time, so the whole offline tables have to be loaded before inline computation starts. Loading the offline tables of a large hologram with file stream functions takes a huge amount of time, because both file input and output are very time consuming. Memory-mapped files are a memory management technique controlled by the operating system. The technique lets an application access files on disk through a memory pointer. In other words, it establishes a connection between all or part of a file on the hard drive and a fixed region of the virtual address space of the process. In this way, we can access a file directly and avoid both the file stream I/O operations and the file buffer. It is especially efficient for loading very large files. We illustrate the process of writing and reading offline table files in Fig.
Figure 4. Schematic diagram of file mapping.
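The file mapping idea can be sketched with Python's standard `mmap` module. This is a minimal illustration, not the authors' implementation: the table layout (a flat array of little-endian doubles), the file name, and the helper names are assumptions made for the example.

```python
import mmap
import os
import struct
import tempfile


def write_table(path, values):
    """Write a non-empty list of doubles to `path` through a memory map."""
    size = len(values) * 8                      # 8 bytes per double
    with open(path, "w+b") as f:
        f.truncate(size)                        # reserve the file size first
        with mmap.mmap(f.fileno(), size) as mm:
            # Writing into the mapping replaces explicit stream writes.
            struct.pack_into(f"<{len(values)}d", mm, 0, *values)
            mm.flush()


def read_table(path):
    """Read the doubles back through a read-only memory map."""
    size = os.path.getsize(path)
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), size, access=mmap.ACCESS_READ) as mm:
            return list(struct.unpack_from(f"<{size // 8}d", mm, 0))


# Round trip on a small, hypothetical offline table.
table_path = os.path.join(tempfile.mkdtemp(), "offline_table.bin")
table = [0.5, -1.25, 3.0, 42.0]
write_table(table_path, table)
loaded = read_table(table_path)
```

With a mapping, the operating system pages the file in on demand, so a program can index into a large table without first copying the whole file through a stream buffer.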
As shown in Fig.
Our experiments were carried out on a computer with the specifications shown in Table
The diffraction distance is 600 mm, and the pixel size is 8 μm. Based on these parameters, we compare the time costs of different offline table processing cases. The comparison of the file mapping and regular file stream techniques is shown in Figs.
Figure 5. Processing time comparison of (a) reading file process and (b) writing file process on S-LUT.
Figure 6. Processing time comparison of (a) reading file process and (b) writing file process on C-LUT.
To compare the relationship of hologram size and file loading time consumption, Figs.
The numerical and optical reconstructions of
Figure 7. Reconstructed 3D objects with different depth by S-LUT: (a) and (c) focus on the teacup, and (b) and (d) focus on the teapot, where (a) and (b) are simulated results, and (c) and (d) are recorded in optical experiments.
Our program is flexible for different sizes and is capable of generating huge CGH without quality loss. We can generate a 1 Gbyte hologram with more than
In conclusion, we propose two methods to speed up the generation of CGH on GPUs and CPUs based on S-LUT. Dynamic parallelism leads to simpler programming and easier management of thread granularity. In addition, it reduces the inline time consumption of both LUT algorithms to some extent. On average, the file mapping technique is more than 100 times faster than the file I/O stream method. The method can deal with huge data of up to
[3] H. Zhang, N. Collings, J. Chen, B. A. Crossland, D. Chu, J. Xie. Opt. Eng., 50, 074003(2011).
[4] J. Jia, Y. Wang, J. Liu, X. Li, Y. Pan. Proc. SPIE, 8557, 85570B(2012).
[10] Y. Zhang, P. Wang, H. Chen, Y. Xu, W. Chen, W. X. Chin. Chin. Opt. Lett., 12, 030902(2014).
[18] J. Wang, S. Yalamanchili. Proceedings of the IEEE International Symposium on Workload Characterization(2014).
[19] Z. Sun. Comput. Knowl. Technol., 9, 4363(2013).
Yingxi Zhang, Juan Liu, Xin Li, Yongtian Wang, "Fast processing method to generate gigabyte computer generated holography for three-dimensional dynamic holographic display," Chin. Opt. Lett. 14, 030901 (2016)
Category: Holography
Received: Nov. 24, 2015
Accepted: Jan. 8, 2016
Published Online: Aug. 6, 2018
The Author Email: Yongtian Wang (wyt@bit.edu.cn)