Advanced Photonics, Volume 7, Issue 2, 024001 (2025)
Symbiotic evolution of photonics and artificial intelligence: a comprehensive review
Fig. 3. ANN modeling. (a) Artificial neuron structure, (b) ANN model structure, and (c) photonic devices are described by two types of labels: physical variables
Fig. 4. Different neural network architectures. (a) Tandem networks: these consist of several modules connected in series, with the different modules connected through an intermediate layer to form an overall network structure. (b) CNNs: these consist of multiple convolutional, pooling, and fully connected layers. The convolutional layer extracts the local features of the image, the pooling layer reduces the dimensionality and enhances the generalization ability of the model, and the fully connected layer maps the extracted features to the output of the final task. (c) GANs: these consist of a generator and a discriminator. The generator produces fake data, and the discriminator judges whether data are real; the two are continually optimized against each other through adversarial training until the generator can produce samples that closely resemble the real data. (d) Variational autoencoders: these consist of an encoder, which maps the input data to a probability distribution in the latent space, and a decoder, which reconstructs the data from samples in the latent space. (e) Physics-informed neural networks: PINNs fit input–output relationships through neural networks while embedding physical equations (e.g., partial differential equations, initial and boundary conditions) as constraint terms in the loss function. During training, the network uses the physical constraints to guide learning, realizing the integration of data-driven and physical models.
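The PINN idea in panel (e) can be made concrete with a toy problem. The sketch below is illustrative only (numpy, not from the paper): a linear-in-parameters model u(x) is fitted by minimizing a loss built from the ODE residual u′ + u = 0 at collocation points plus the initial condition u(0) = 1, whose exact solution is exp(−x). For this model class the physics-informed loss is a linear least-squares problem.

```python
import numpy as np

# Minimal physics-informed fit (illustrative, not from the paper):
# model u(x) = sum_k theta_k x^k, ODE u'(x) + u(x) = 0, u(0) = 1
# (exact solution exp(-x)). The "PINN loss" is the squared ODE
# residual at collocation points plus the initial-condition penalty.
deg = 6
x = np.linspace(0.0, 2.0, 50)                          # collocation points
Phi = np.vander(x, deg + 1, increasing=True)           # basis for u
dPhi = np.zeros_like(Phi)                              # basis for u'
dPhi[:, 1:] = Phi[:, :-1] * np.arange(1, deg + 1)
# Residual rows r = (dPhi + Phi) @ theta; last row enforces u(0) = 1.
A = np.vstack([dPhi + Phi, Phi[:1]])
b = np.concatenate([np.zeros(len(x)), [1.0]])
theta, *_ = np.linalg.lstsq(A, b, rcond=None)          # minimize the PINN loss
u = Phi @ theta
print(np.max(np.abs(u - np.exp(-x))))                  # small approximation error
```

In a real PINN the polynomial is replaced by a deep network and the residual is minimized by stochastic gradient descent with automatic differentiation; the loss construction is the same.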
Fig. 5. Application of deep-learning methods. (a) Metamaterials: the process of metamaterial image evolution during a certain number of training steps.65 (b) Photonic crystal: mode switching among different bulk modes in a topologically trivial lattice designed by an ANN.66 (c) Nanoparticles: simultaneous inverse design of structural parameters and material information of core-shell nanoparticles from given electric and magnetic dipole extinction spectra using deep learning.67 (d) Microwave cloak: at 8.2-GHz frequency, the reflection spectrum predicted by ANNs agrees well with the real spectrum obtained by simulation.68 (e) Optical storage: sketches of different geometric models encoding 2, 3, 4, or 5 bit sequences using ANNs to store the encoded information.69 (f) Soliton microcomb: second-order and higher-order dispersion is obtained from the target microcomb using the Lugiato–Lefever equation and a genetic algorithm, and the microcavity geometry is obtained using a pretrained forward DNN coupled with the GA.70 (g) Silicon color design: schematic of silicon nanostructures and generated colors.71 (h) Grating coupler: schematic diagram of the grating coupler structure, in which the guided light incident from the left is vertically diffracted by a column with a periodic staggered height of 220 nm and a grating with an L-shaped cross section partially etched to 110 nm.72 (i) Power splitter: forward and inverse modeling of nanophotonic devices using deep-learning networks, which can take the device topology design as input and the spectral response of components as labels and vice versa.73 (j) Plasmonic nanodimers: based on the analysis of Born–Kuhn-type plasmonic nanodimers, neural networks were designed that successfully predict chiral properties and further inverse-design the plasmonic structure to achieve the desired circular dichroism.74 (k) Optical switch: all-optical plasmonic switches use neural networks to predict spectra through hidden layers after inputting geometric details.75
Fig. 6. Typical examples of nanophotonic devices based on deep-learning methods. (a) 3D chiral metamaterial: schematic of designed 3D chiral metamaterials and their predicted reflection and circular dichroism spectra.105 (b) Topology-optimized metasurface: schematic diagram of metasurface inverse design based on training of the GAN and topology optimization. The generated devices can be fed back to the neural network for retraining and optimization.99 (c) Power splitter: inverse design of power splitter based on GAN combined with simulation neural network and self-attention mechanism.125
Fig. 7. Applications of PINN in nanophotonics. (a) Schematic of a PINN for solving inverse problems in photonics based on partial differential equations.94 (b) PINN reconstruction of the dielectric constant profile from a data set of known scattered field profiles.94 (c) Schematic of the auxiliary PINNs solution to the radiative transfer theory problem.115 (d) Contours for finite-element method forward scattering simulations, inversion results for the complex dielectric function, real and imaginary parts of the complex electric field
Fig. 8. (a) Flow chart of the gradient-based inverse design algorithm. (b) Flow chart of the adjoint method.
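The key economy of the adjoint method in panel (b) is that one extra "adjoint" solve yields the gradient with respect to all design parameters at once. The numpy sketch below is an illustrative toy (not the paper's code): for a discretized problem A(p)x = b with objective J = cᵀx, the adjoint λ solves Aᵀλ = c and dJ/dpᵢ = −λᵀ(∂A/∂pᵢ)x, which we check against finite differences.

```python
import numpy as np

# Illustrative adjoint-gradient sketch. For A(p) x = b and J = c^T x,
# one adjoint solve A^T lam = c gives the gradient for ALL parameters:
# dJ/dp_i = -lam^T (dA/dp_i) x  (versus one extra solve per parameter
# for finite differences).
rng = np.random.default_rng(0)
n = 6
A0 = np.eye(n) * 3.0 + 0.1 * rng.standard_normal((n, n))
dA = [rng.standard_normal((n, n)) for _ in range(2)]   # dA/dp_i
b = rng.standard_normal(n)
c = rng.standard_normal(n)

def solve_J(p):
    A = A0 + p[0] * dA[0] + p[1] * dA[1]
    return c @ np.linalg.solve(A, b)

p = np.array([0.05, -0.02])
A = A0 + p[0] * dA[0] + p[1] * dA[1]
x = np.linalg.solve(A, b)                              # forward solve
lam = np.linalg.solve(A.T, c)                          # single adjoint solve
grad_adj = np.array([-(lam @ (dAi @ x)) for dAi in dA])

# Finite-difference check (two extra solves per parameter)
eps = 1e-6
grad_fd = np.array([(solve_J(p + eps * e) - solve_J(p - eps * e)) / (2 * eps)
                    for e in np.eye(2)])
print(grad_adj, grad_fd)                               # the two should agree closely
```

In photonic inverse design, A is the discretized Maxwell operator and the adjoint solve is one additional electromagnetic simulation with the "source" c, regardless of how many pixels parameterize the device.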
Fig. 9. Nanophotonic device by gradient-based inverse design. (a) Spatial mode multiplexer: optimal design patterns and simulated field (
Fig. 10. (a) Flow chart of the variable density method. (b) Flow chart of the level set method. (c) Flow chart of the bidirectional evolutionary structure optimization.
Fig. 11. Nanophotonic devices by gradient-based inverse design. (a) Spatial mode multiplexer.144 (b) Inverse design results (silicon regions are shown in black and silica regions in white). (c) Optical microscope image of the final fabricated device. (d) Experimentally measured
Fig. 13. Nanophotonic device designed based on GA. (a) Polarization route: SEM image of a
Fig. 16. Nanophotonic device designed based on PSO. (a) Power splitter: binary particle swarm optimized
Fig. 17. (a) Flow chart of the simulated annealing algorithm. Nanophotonic devices designed based on the simulated annealing algorithm. (b) Metasurface: simulated near-electric field distribution under
Fig. 18. (a) Flow chart of the hill-climbing algorithm. Optimized design of nanophotonic devices based on the hill-climbing algorithm. (b) Graphene metasurfaces: structure of the first optimized metasurface.238 (c) One-dimensional photonic crystal split-beam nanocavity: schematic diagram of the symmetrical cavity design.239
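The hill-climbing loop in panel (a) reduces to a few lines. The following toy sketch is illustrative only: it flips one binary "pixel" of a design at a time, keeps a flip only if a made-up figure of merit improves, and stops at a local optimum. For the pixelated devices in (b) and (c), the figure of merit would instead come from an electromagnetic simulation.

```python
import numpy as np

# Toy hill-climbing sketch for a binary pixelated design (illustrative;
# fom() stands in for an electromagnetic solver).
rng = np.random.default_rng(1)
n = 24
w = rng.standard_normal(n)               # toy per-pixel contribution

def fom(s):
    # reward positive-weight pixels, penalize adjacent 1-1 pairs
    return s @ w - 0.3 * np.sum(s[:-1] * s[1:])

s = rng.integers(0, 2, n).astype(float)  # random initial design
best = fom(s)
improved = True
while improved:                          # sweep until no flip helps
    improved = False
    for i in range(n):
        s[i] = 1.0 - s[i]                # trial flip of pixel i
        val = fom(s)
        if val > best:
            best, improved = val, True   # keep the improving flip
        else:
            s[i] = 1.0 - s[i]            # revert
print(best)                              # locally optimal figure of merit
```

The loop terminates because every accepted flip strictly increases the merit over a finite state space; simulated annealing (Fig. 17) differs only in sometimes accepting worsening flips to escape such local optima.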
Fig. 20. Optimized design of nanophotonic devices based on direct binary search. (a) Mode converter: optimized layout of
Fig. 21. (a) Tabu search flow chart. Nanophotonic devices designed based on tabu search optimization. (b) Polarization filters based on photonic lattices: optimized holes-in-slab configuration (57 scatterers).243 (c) Beam shaping of 2D photonic lattices: photonic lattice used for the beam-shaping problem. The dashed line indicates the plane used to calculate the desired beam.244
Fig. 22. (a) Network architecture for phase unwrapping.287 (b) One quantitative phase image of multiple lung cancer cells. The images are focused manually and then unwrapped by the quality-guided unwrapping algorithm. The unwrapped focused-phase images are used for labeled training in the model. The cross section and 3D representation of one cell with wrapped and unwrapped signals are shown.288 (c) The DNN blindly outputs artifact-free phase and amplitude images of the object using only one hologram intensity. This DNN is composed of convolutional layers, residual blocks, and upsampling blocks and rapidly processes a complex-valued input image in a parallel, multiscale manner.289 (d) (i) The intensity data are captured by illuminating the sample from different angles with an LED array. (ii) Training a CNN to reconstruct high-resolution phase images. The input to the CNN is low-resolution intensity images; the output of the CNN is the ground-truth phase image reconstructed using the traditional FPM algorithm. The network is then trained by optimizing the network’s parameters to minimize a loss function calculated from the network’s predicted output and the ground truth. (iii) The network is fully trained using the first data set at 0 min and can then be used to predict phase videos of dynamic cell samples frame by frame.290
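For orientation, the classical baseline that these networks learn to replace can be shown in one dimension. The sketch below is illustrative (numpy only): measured phase is wrapped into (−π, π], and unwrapping restores multiples of 2π wherever successive samples jump by more than π, which succeeds whenever the true phase varies slowly enough between samples.

```python
import numpy as np

# Classical 1D phase unwrapping (the path-following baseline; the 2D
# quality-guided and learned methods above generalize this idea).
x = np.linspace(0, 4 * np.pi, 200)
true_phase = 0.8 * x + 0.5 * np.sin(x)        # smooth "object" phase
wrapped = np.angle(np.exp(1j * true_phase))   # wrap into (-pi, pi]
unwrapped = np.unwrap(wrapped)                # restore 2*pi multiples
print(np.max(np.abs(unwrapped - true_phase))) # ~0: phase recovered
```

In 2D the unwrap path matters (noise and residues make the result path-dependent), which is exactly why quality-guided algorithms and the CNNs of panels (a) and (b) are needed.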
Fig. 23. Examples of network structure for AI-assisted polarization imaging. (a) Architectures of polarization denoising residual dense network (PDRDN) and residual dense block (RDB).304 (b) Architecture of FIPNet, which consists of three parts: feature extraction layer, fusion layer, and reconstruction layer.305 (c) A reflection separation network takes a cascaded architecture with three modules: semireflector orientation estimation, polarization-guided separation, and separated layers refinement.306 (d) A network tailored to polarization-based dehazing pipeline, which consists of two stages: transmitted light estimation and original scene radiance reconstruction.307 (e) A network with multibranch architecture to handle different hierarchical inputs. The physics-based prior confidence map for the weighted fusion of different inputs and the self-supervised AoLP loss to force the network to learn the prior knowledge between the normal and AoLP.308
Fig. 24. AI-assisted snapshot compact SI. (a)–(d) Results of the spectral combining of the AI reconstruction and the DOE design with diffractive rotation.329 (a) The fabricated DOE that generates spectrally varying PSFs for SI. Inset: a camera installed with the DOE. (b) The PSFs at different wavelengths. (c) Overview of the network architecture. (d) The RGB image of a reconstructed SI and the comparison between the reconstructed spectrum and the ground truth of point 1 in the scene. (e)–(g) Results of the shift-variant color-coded diffractive SI system.333 (e) Optimization of the optical elements is carried out using an end-to-end AI approach. (f) RGB image of a reconstructed hyperspectral image and the comparison between the reconstructed spectrum and the ground truth of point 1 in the scene. SCCD types 1 to 3 denote three different types of CCA utilized in the system. Spiral denotes a system without CCA. (h)–(j) Different types of pixelated filter array: (h) Fabry–Perot filter;335 (i) freeform-shaped metasurface filter;336 (j) film filter.337 (k)–(m) Results of computational SI with the CMOS-compatible random array of Fabry–Perot filters shown in panel (h).335 (k) Performance of hyperspectral image reconstruction simulated for three hyperspectral image data sets, including the RGB rendering of the reconstruction and the error map between the reconstruction and the ground truth. (l) Experimental results of the SI for a standard color sample. (m) The dependence of the frame rate on the image resolution for AI-based reconstruction and the iterative reconstruction with 50 iteration steps.
Fig. 25. Heat-assisted detection and ranging (HADAR) with AI-assisted decomposition.340 (a) Pipeline of HADAR: HADAR takes thermal photon streams as input, records hyperspectral-imaging heat cubes, addresses the ghosting effect through AI-assisted TeX decomposition, and generates TeX vision for improved detection and ranging. (b) TeX vision demonstrated on the database and in outdoor experiments, showing that HADAR sees textures through the darkness with a comprehensive understanding of the scene. (c)–(h) Ranging based on raw thermal images [(c), (d)], AI-reconstructed images in the HADAR technique at night [(e), (f)], and daylight RGB vision [(g), (h)].
Fig. 26. AI-assisted end-to-end platform for digital pathology using hyperspectral autofluorescence microscopy and deep-learning-based virtual histology.343 (a) Automated workflow with virtual staining and AI scoring that mimics the current pathology workflow. (b)–(e) Classical H&E stained images (b) or the immunofluorescence images [(c) elastin +
Fig. 27. Schematic diagram of RNN. (a) Traditional neural network architecture with input, hidden, and output layers. (b) RNN architecture and an unfolding structure with
Fig. 28. Functions of RNN in nonlinear compensation for optical communication. (a) Schematic diagram of LSTM based on a sliding window.354 The autoencoder is represented by the blocks Tx BRNN, channel, and Rx BRNN. (b) The principle of Bi-RNN models.355 The Bi-RNN model processes distorted symbols with intersymbol dependencies to estimate bitwise BER, optimizing complexity and performance for 16-QAM and 32-QAM. (c) Architecture of LSTM combined with CNN for nonlinear compensation.356 The feature maps
Fig. 29. Various optical-sensing applications implemented using LSTM. (a) LSTM-CNN model for vibration sensing.376 The optical cable is installed directly above the PCCP pipe and fixed with fixtures. Different signals exhibit distinct characteristics across the frequency band and more pronounced local features in the time-frequency domain. Based on LSTM and CNN architectures, a neural network was designed using time-domain waveforms along with their DWT and STFT as inputs. This integrated feature set enables effective pattern recognition. (b) Optical fiber sensing based on the LSTM-CNN model during surgery.377 The LSTM-CNN framework is utilized to process perioperative heart rate (HR) and respiratory rate (RR) frequency signals. Trends are extracted from HR and RR, whereas CNN and LSTM are employed for feature extraction and processing, respectively. (c) Crowded abnormal scene detection using Bi-LSTM and CNN.378 The proposed methodology utilizes optical flow features to capture frame-level spatial information. Temporal information across the data set is modeled using a Bi-LSTM. The key components of the proposed architecture include constructing an optical feature matrix, integrating a CNN with a Bi-LSTM, and implementing a novel inference mechanism.
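The STFT features fed to these LSTM-CNN models are simple to construct. The sketch below is illustrative (numpy only, with made-up tones standing in for sensing signals): frame the signal, window each frame, and take its magnitude spectrum to obtain a time-frequency map.

```python
import numpy as np

# Minimal STFT feature extraction: windowed frames -> magnitude spectra,
# the kind of time-frequency input used by the LSTM-CNN classifiers above.
fs = 1000                                     # sampling rate, Hz
t = np.arange(0, 1.0, 1 / fs)
sig = np.sin(2 * np.pi * 50 * t) + 0.5 * np.sin(2 * np.pi * 120 * t)

frame, hop = 128, 64                          # frame length and hop size
starts = hop * np.arange((len(sig) - frame) // hop + 1)
idx = np.arange(frame)[None, :] + starts[:, None]
frames = sig[idx] * np.hanning(frame)         # windowed frames
stft = np.abs(np.fft.rfft(frames, axis=1))    # time x frequency magnitude
freqs = np.fft.rfftfreq(frame, 1 / fs)
print(stft.shape, freqs[np.argmax(stft.mean(axis=0))])  # peak near 50 Hz
```

A network then consumes `stft` either as an image (CNN) or row by row as a sequence (LSTM); the DWT inputs mentioned in (a) would be stacked alongside in the same way.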
Fig. 30. Matrix computation using an MZI mesh. (a) Legend for interpreting the symbols used in the other panels. Two predominant methods are illustrated: (b) the Reck scheme388 and (c) the Clements scheme.389 The left side of the figure displays the spatial layout of the MZIs, with the number in each yellow block indicating the order of light manipulation by each MZI. The red dashed arrows denote the sequence for decomposing the unitary matrix. The blue and green colors surrounding the red arrows indicate column and row eliminations, respectively. The right side of the figure shows the corresponding elimination order of unitary matrix elements. (d) MZI mesh for a universal complex-valued matrix through SVD decomposition.
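The SVD route of panel (d) is easy to verify numerically. The sketch below is illustrative (numpy only; the MZI convention shown is one common choice, not necessarily the papers'): any complex matrix M factors as U·diag(s)·Vʰ, where U and Vʰ are unitary (each realizable by a Reck or Clements mesh of 2×2 MZI blocks) and diag(s) is a row of per-waveguide attenuators.

```python
import numpy as np

# A single MZI as a 2x2 unitary, and the SVD factorization that lets
# two MZI meshes plus attenuators realize an arbitrary matrix.
rng = np.random.default_rng(2)

def mzi(theta, phi):
    # one common convention: input phase shifter, 50:50 coupler,
    # internal phase shifter, 50:50 coupler
    bs = np.array([[1, 1j], [1j, 1]]) / np.sqrt(2)
    return bs @ np.diag([np.exp(1j * theta), 1]) @ bs @ np.diag([np.exp(1j * phi), 1])

M = rng.standard_normal((4, 4)) + 1j * rng.standard_normal((4, 4))
U, s, Vh = np.linalg.svd(M)
print(np.allclose(mzi(0.3, 0.7).conj().T @ mzi(0.3, 0.7), np.eye(2)))  # MZI is unitary
print(np.allclose(U.conj().T @ U, np.eye(4)))                          # U is mesh-realizable
print(np.allclose(U @ np.diag(s) @ Vh, M))                             # mesh-attenuators-mesh
```

The Reck and Clements schemes in (b) and (c) differ only in the order in which such 2×2 blocks eliminate matrix elements; Clements yields a shallower, more loss-balanced mesh for the same unitary.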
Fig. 31. Various photonic circuits designed for matrix-vector multiplication. (a) Micrograph of a photonic circuit engineered to compute unitary matrices.32 Different methods for realizing real-valued matrix computations through coherent MZI mesh structures are shown: (b) using an incoherent laser source with power detection35 and (c) constructing the real part of a unitary matrix.391
Fig. 35. Incoherent optical computing circuit architectures. (a) A
Fig. 36. Some recent advances in optical computing circuits. The first column [(a), (b)] shows fault-tolerant computing architectures: (a) stacked FFT,415 (b) redundant rectangular mesh and permuting rectangular mesh.417 The second column [(c), (d)] shows miniaturization strategies for computing devices: (c) 3D arrangement of an MZI mesh for matrix computation,418 (d) PBWs are used instead of MZIs as programmable units to minimize the footprint.419 The third column [(e)–(g)] demonstrates that computing parallelism can be enlarged via WDM,420 FDM,407 and MDM421 technologies.
Fig. 38. All-optical differentiator (a)–(c) and integrator (d)–(f) based on compact resonance structures. The phase-shifted Bragg grating can be designed to realize optical (a) differentiation435 and (d) integration.442 (b), (e) Ruan et al. theoretically demonstrated that differentiation and integration can be reconfigured in the same device by controlling the propagation loss of the surface plasmon polariton.436 (c) Experimental realization of optical differentiation in a surface plasmonic structure.437 (f) Integration is presented using a dielectric slab.441
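The common frequency-domain picture behind these devices: an ideal differentiator applies the transfer function H(ω) = iω to the field envelope (an integrator applies 1/(iω)), which the resonant structures above approximate near resonance. The numpy sketch below is illustrative, applying the ideal H to a sampled periodic signal.

```python
import numpy as np

# Ideal optical differentiation as a spectral filter H(omega) = i*omega,
# applied to a band-limited periodic test signal.
n = 256
x = np.linspace(0, 2 * np.pi, n, endpoint=False)
f = np.sin(3 * x)                                       # input envelope
omega = 2 * np.pi * np.fft.fftfreq(n, d=x[1] - x[0])    # angular frequencies
df = np.fft.ifft(1j * omega * np.fft.fft(f)).real       # apply H = i*omega
print(np.max(np.abs(df - 3 * np.cos(3 * x))))           # ~0: exact derivative
```

A physical device only realizes H(ω) ≈ iω over its resonance bandwidth, which sets the shortest pulse feature it can differentiate faithfully.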
Fig. 39. Free-space optical matrix-vector multiplier. (a) Schematic diagram for matrix-vector multiplication proposed by Goodman.426 (b) Convolution realization through two metasurfaces.445 (c) Coherent system for realizing matrix computation.446 (d) Matrix-vector multiplier applied to imaging sensing for optical encoding.382 (e) Experimental verification of dot product operation close to the shot-noise limit of detected photons.56 (f) CMOS-compatible matrix processor supporting large input vector size.447 (g) Spatial-temporal multiplexed matrix computing system, where matrix elements and input vector are encoded via VCSEL arrays, exhibiting efficient electro-optic conversion and compact footprint.448
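Goodman's scheme in panel (a) maps directly onto three array operations. The sketch below is an illustrative numpy model (weights and inputs are made up): a row of sources is fanned out vertically by a cylindrical lens, a mask whose transmittance encodes the matrix attenuates each copy, and a second lens integrates each row onto one detector, yielding y = W·v in a single optical pass.

```python
import numpy as np

# Toy model of a free-space intensity matrix-vector multiplier:
# fan-out -> element-wise mask -> row-wise integration on detectors.
rng = np.random.default_rng(3)
W = rng.uniform(0, 1, (3, 5))        # mask transmittance (non-negative)
v = rng.uniform(0, 1, 5)             # source intensities (non-negative)
fan_out = np.tile(v, (3, 1))         # cylindrical lens copies the input row
after_mask = fan_out * W             # mask attenuates each copy element-wise
y = after_mask.sum(axis=1)           # second lens sums each row on a detector
print(np.allclose(y, W @ v))         # True: one optical pass computes W @ v
```

Because intensities are non-negative, signed or complex matrices need coherent encoding or differential detection, which is what several of the systems in (c)–(g) add.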
Fig. 40. Training methods for
Fig. 41. (a)–(c) Types of light sources used in
Fig. 42. Diffractive layers are miniaturized by reducing the working wavelength or designing on-chip diffractive structures. (a) Fabrication procedure of a germanium-based diffraction grating.462 (b) Optical machine-learning decryptor physically 3D printed by galvo-dithered two-photon nanolithography and integrated with a CMOS chip.463 (c) Exploded schematic diagram of a metasurface-based diffractive neural network integrated with a CMOS chip.464 (d) Scanning electron microscope image of an on-chip metalens.465 (e) Schematic of an on-chip DONN. The diffractive unit, composed of three identical silicon slots, is used to modulate the amplitude and phase of the optical wave.466 (f) The electric field distribution (left) and refractive index distribution (right) of the coherent photonic device that performs unitary matrix computation.467 (g) Schematic of metastructures in a SiPh platform using an inverse-design method based on the effective index approximation with a low-index-contrast constraint.468
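The forward model behind diffractive networks like those in (b), (c), and (e) alternates free-space propagation with per-pixel phase masks. The sketch below is illustrative (numpy only; grid size, wavelength, and distances are made-up values): propagation is the angular-spectrum transfer function applied in the spatial-frequency domain, and each layer is a unit-modulus phase screen.

```python
import numpy as np

# Angular-spectrum propagation between diffractive layers: filter the
# field's spatial spectrum by exp(i*kz*z), zeroing evanescent components.
n, dx, wavelength, z = 128, 1e-6, 0.633e-6, 50e-6
fx = np.fft.fftfreq(n, dx)
FX, FY = np.meshgrid(fx, fx)
arg = 1 - (wavelength * FX) ** 2 - (wavelength * FY) ** 2
kz = 2 * np.pi / wavelength * np.sqrt(np.maximum(arg, 0))
H = np.exp(1j * kz * z) * (arg > 0)            # propagating band only

def propagate(field):
    return np.fft.ifft2(np.fft.fft2(field) * H)

field = np.zeros((n, n), complex)
field[n // 2, n // 2] = 1.0                    # point source, unit energy
phase_mask = np.exp(1j * 2 * np.pi * np.random.default_rng(4).random((n, n)))
out = propagate(propagate(field) * phase_mask) # layer -> mask -> layer
print(np.allclose(np.sum(np.abs(out) ** 2), np.sum(np.abs(field) ** 2)))
```

Since both the propagation filter and the phase mask are unit-modulus here, total energy is conserved, which is the passivity constraint a trained DONN's phase-only layers inherit.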
Fig. 44. AI-related applications for all-optical
Fig. 45. Hybrid opto-electrical computing system empowers the machine-vision field. (a) Handwritten digit recognition through optical-digital implementation.432 (b) Malaria parasite detection using learned sensing network.483 (c) Imaging compression using a multiply scattering medium and reconstruction by sparse optimization techniques.484 (d) End-to-end computational camera design paradigm to realize achromatic extended depth of field.485 (e) Joint optimization of microscope point spread function and differentiable reconstruction algorithm to achieve 3D information reconstruction.486 (f) The flow chart for depth map estimation using a phase-coded aperture camera.487
Fig. 46. Recent high-performance optical computing chips to support advanced AI tasks. (a) The data flow of the all-analog photoelectronic chip, which can support energy-efficient and ultrahigh-speed vision tasks.489 (b), (c) Large-scale photonic chiplets are proposed to deploy large models for AGI tasks490 such as (b) music generation and (c) image generation.
Fu Feng, Dewang Huo, Ziyang Zhang, Yijie Lou, Shengyao Wang, Zhijuan Gu, Dong-Sheng Liu, Xinhui Duan, Daqian Wang, Xiaowei Liu, Ji Qi, Shaoliang Yu, Qingyang Du, Guangyong Chen, Cuicui Lu, Yu Yu, Xifeng Ren, Xiaocong Yuan, "Symbiotic evolution of photonics and artificial intelligence: a comprehensive review," Adv. Photon. 7, 024001 (2025)
Category: Reviews
Received: Sep. 8, 2024
Accepted: Jan. 24, 2025
Published Online: Apr. 3, 2025
Author Emails: Ji Qi (ji.qi@zhejianglab.org), Qingyang Du (qydu@zhejianglab.org), Guangyong Chen (gychen@zhejianglab.org), Cuicui Lu (cuicuilu@bit.edu.cn), Yu Yu (yuyu@mail.hust.edu.cn), Xifeng Ren (renxf@ustc.edu.cn), Xiaocong Yuan (xcyuan@zhejianglab.org)