Advanced Photonics, Volume 7, Issue 2, 024001 (2025)

Symbiotic evolution of photonics and artificial intelligence: a comprehensive review

Fu Feng1,†... Dewang Huo1,2, Ziyang Zhang1, Yijie Lou1, Shengyao Wang3, Zhijuan Gu4, Dong-Sheng Liu5,6, Xinhui Duan1, Daqian Wang1, Xiaowei Liu1, Ji Qi1,*, Shaoliang Yu1, Qingyang Du1,*, Guangyong Chen7,*, Cuicui Lu3,*, Yu Yu4,*, Xifeng Ren5,6,* and Xiaocong Yuan1,*
Author Affiliations
  • 1Zhejiang Lab, Research Center for Frontier Fundamental Studies, Hangzhou, China
  • 2Westlake Institute for Optoelectronics, Zhejiang Key Laboratory of 3D Micro/Nano Fabrication and Characterization, Hangzhou, China
  • 3Beijing Institute of Technology, School of Physics, Center for Interdisciplinary Science of Optical Quantum and NEMS Integration, Key Laboratory of Advanced Optoelectronic Quantum Architecture and Measurements of Ministry of Education, Beijing Key Laboratory of Nanophotonics and Ultrafine Optoelectronic Systems, Beijing, China
  • 4Huazhong University of Science and Technology, Wuhan National Laboratory for Optoelectronics, Wuhan, China
  • 5University of Science and Technology of China, CAS Key Laboratory of Quantum Information, Hefei, China
  • 6University of Science and Technology of China, CAS Center for Excellence in Quantum Information and Quantum Physics, Hefei, China
  • 7Zhejiang Lab, Research Center for Life Sciences Computing, Hangzhou, China
    Figures & Tables(50)
    Schematic of the synergy between photonics and AI.
    Artificial intelligence for photonic devices.
    ANN modeling. (a) Artificial neuron structure, (b) ANN model structure, and (c) photonic devices are described by two types of labels: physical variables x and physical responses y.
    Different neural network architectures. (a) Tandem networks: these consist of several modules connected in series, with the modules linked through an intermediate layer to form the overall network structure. (b) CNNs: these consist of multiple convolutional, pooling, and fully connected layers. The convolutional layer extracts the local features of the image, the pooling layer reduces the dimensionality and enhances the generalization ability of the model, and the fully connected layer maps the extracted features to the output of the final task. (c) GANs: these consist of a generator and a discriminator. The generator produces fake data and the discriminator judges the authenticity of the data; the two are optimized continually through adversarial training until the generator can produce samples that are very similar to the real data. (d) Variational autoencoders: these consist of an encoder, which maps the input data to a probability distribution in the latent space, and a decoder, which reconstructs the data from the samples in the latent space. (e) Physics-informed neural networks: PINNs fit input–output relationships through neural networks while embedding physical equations (e.g., partial differential equations, initial and boundary conditions) as constraint terms in the loss function. During the training process, the network uses physical constraints to guide learning, realizing the integration of data-driven and physical models.
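The loss construction described for PINNs in panel (e) can be sketched in a few lines. This is a minimal illustration, not the method of any cited work: the toy equation du/dx + u = 0, the function names `pinn_style_loss` and `candidate`, the weight, and the finite-difference step are all assumptions, and a one-parameter exponential stands in for the neural network.

```python
import math

def pinn_style_loss(u, collocation, observations, weight=1.0, h=1e-4):
    """Data misfit plus weighted residual of the toy ODE du/dx + u = 0."""
    # Data term: mean squared error against observed (x, y) pairs.
    data_loss = sum((u(x) - y) ** 2 for x, y in observations) / len(observations)
    # Physics term: residual of the ODE at collocation points,
    # with du/dx approximated by a central finite difference.
    residuals = [(u(x + h) - u(x - h)) / (2 * h) + u(x) for x in collocation]
    physics_loss = sum(r ** 2 for r in residuals) / len(residuals)
    return data_loss + weight * physics_loss

def candidate(a):
    """Hypothetical stand-in for a network: u(x) = exp(-a * x)."""
    return lambda x: math.exp(-a * x)

collocation = [i / 10 for i in range(11)]            # points in [0, 1]
observations = [(0.0, 1.0), (1.0, math.exp(-1.0))]   # samples of the true solution

loss_true = pinn_style_loss(candidate(1.0), collocation, observations)
loss_wrong = pinn_style_loss(candidate(2.0), collocation, observations)
print(loss_true, loss_wrong)
```

The candidate that satisfies both the data and the equation gets a loss near zero, while the wrong decay rate is penalized by both terms. In an actual PINN the derivative would typically come from automatic differentiation rather than finite differences.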
    Application of deep-learning methods. (a) Metamaterials: demonstrate the process of metamaterial image evolution during a certain number of training steps [65]. (b) Photonic crystal: mode switching among different bulk modes in a topologically trivial lattice designed by an ANN [66]. (c) Nanoparticles: simultaneous inverse design of structural parameters and material information of core-shell nanoparticles from given electric and magnetic dipoles extinction spectra using deep learning [67]. (d) Microwave cloak: at 8.2-GHz frequency, the reflection spectrum shows that the spectrum predicted based on ANNs matches well the real spectrum obtained by simulation [68]. (e) Optical storage: sketches of different geometric models encoding 2, 3, 4, or 5 bit sequences using ANNs to store the encoded information [69]. (f) Soliton microcomb: second-order and higher-order dispersion is obtained from the target microcomb using the Lugiato–Lefever equation and genetic algorithm, and the microcavity geometry is obtained using a pretrained forward DNN coupled with GA [70]. (g) Silicon color design: schematic of silicon nanostructures and generated colors [71]. (h) Grating coupler: schematic diagram of the grating coupler structure, in which the guided light incident from the left is vertically diffracted by a column with a periodic staggered height of 220 nm and a grating with an L-shaped cross section partially etched to 110 nm [72]. (i) Power splitter: forward and inverse modeling of nanophotonic devices using deep-learning networks, which can take the device topology design as input and the spectral response of components as labels and vice versa [73]. (j) Plasmonic nanodimers: based on the analysis of Born–Kuhn-type plasmonic nanodimers, neural networks capable of successfully predicting chiral properties and further inverse design of the plasmonic structure to achieve the desired circular dichroism were designed [74]. (k) Optical switch: all-optical plasmonic switches use neural networks to predict spectra through hidden layers after inputting geometric details [75].
    Typical examples of nanophotonic devices based on deep-learning methods. (a) 3D chiral metamaterial: schematic of designed 3D chiral metamaterials and their predicted reflection and circular dichroism spectra [105]. (b) Topology-optimized metasurface: schematic diagram of metasurface inverse design based on training of the GAN and topology optimization. The generated devices can be fed back to the neural network for retraining and optimization [99]. (c) Power splitter: inverse design of power splitter based on GAN combined with simulation neural network and self-attention mechanism [125].
    Applications of PINN in nanophotonics. (a) Schematic of a PINN for solving inverse problems in photonics based on partial differential equations [94]. (b) PINN reconstruction of the dielectric constant profile from a data set of known scattered field profiles [94]. (c) Schematic of the auxiliary PINNs solution to the radiative transfer theory problem [115]. (d) Contours for finite-element method forward scattering simulations, inversion results for the complex dielectric function, real and imaginary parts of the complex electric field Ez, and the complex electric field Ez reconstructed from PINNs [110].
    (a) Flow chart of the gradient-based inverse design algorithm. (b) Flow chart of the adjoint method.
    Nanophotonic device by gradient-based inverse design. (a) Spatial mode multiplexer: optimal design patterns and simulated field (Ey) evolution for spatial pattern multiplexer [145]. (b) Power splitter: scanning electron microscopy (SEM) image of the fabricated broadband 1×3 power splitter and the electromagnetic energy density in the device at 1550 nm [143]. (c) Wavelength demultiplexer: simulated electromagnetic energy density of a three-channel wavelength multiplexer at three operating wavelengths [149]. (d) Grating couplers based on diamond design: inverse-designed vertical coupler with analog field superimposed in red [156]. (e) Fano resonators: SEM image of a cascaded Fano–Lorentzian resonator. The enlarged image shows the reflector designed in inverse direction on the silicon waveguide in the resonator-waveguide coupling region [151]. (f) Grating couplers: the electric field in the structure of the grating coupler with a target bandwidth of 120 nm is simulated at 1550 nm [154]. (g) SPIN software optimization process: (1) continuous optimization; (2) discretization; (3) discrete optimization. Fabrication constraints are enforced at this time [157]. (h) Metalens: the metalenses are illuminated by normally incident x-polarized plane waves. The incident field outside the aperture of the metalens is blocked by a layer of perfect electrical conductors [159].
    (a) Flow chart of the variable density method. (b) Flow chart of the level set method. (c) Flow chart of the bidirectional evolutionary structure optimization.
    Nanophotonic devices by the gradient-based inverse design. (a) Spatial mode multiplexer [144]. (b) Inverse design results (silicon regions are shown in black and silica regions in white). (c) Optical microscope image of the final fabricated device. (d) Experimentally measured S parameters of the back-to-back test structure. (Shaded areas indicate the minimum and maximum values from three different measured devices from three dies, and solid lines indicate the average values.) (e) Three-channel wavelength demultiplexer [144]. (f) Inverse design results. (g) Optical microscope image of the final fabricated device. (h) Experimentally measured S parameters. (i) Three-way power splitter [144]. (j) Inverse design results. (k) Optical microscope image of the final fabricated device. (l) Experimentally measured S parameters (dashed line indicates perfect 1/3 beam splitting ratios). (m) SEM image of the inverse designed-fixed coupler [147]. (n) Schematic diagram of the computing platform, consisting of input generator, photonic processor, and complex output [147]. (o) Optical microscope image of photonic platform [147]. (p) Photograph of the photonic platform and wire bonding [the red square marks one platform detail in panel (o)] [147].
    (a) Flow chart of the GA. (b) Coding method.
    Nanophotonic device designed based on GA. (a) Polarization router: SEM image of a 970 nm×1240 nm polarization router [197]. (b) Metasurface absorber: schematic of the optimized binary pattern A0 in the crystal cell and SEM image of the pattern A0 array [193]. (c) Chiral plasmonic metasurface: top view of design pattern A and SEM image of chiral metasurface [194]. (d) Broadband absorption optimization: structural schematics and absorption spectra of the different generations [195]. (e) Metasurface design: different combinations of coefficients on the pattern of light produced [196]. (f) Optical frequency microcombs: SEM image of the photonic-crystal resonators. The inset on the right highlights a section of the chirped corrugation [198].
    (a) Crossover operator and (b) variation operator.
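The two GA operators named in this caption can be illustrated with a short sketch. It assumes a binary encoding (one possible reading of the "coding method" in the GA flow-chart figure), single-point crossover, and bit-flip mutation as the "variation" operator; the figure itself may depict different variants, and the function names here are hypothetical.

```python
import random

def single_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length chromosomes at a random cut point."""
    point = rng.randrange(1, len(parent_a))  # cut strictly inside the chromosome
    child_a = parent_a[:point] + parent_b[point:]
    child_b = parent_b[:point] + parent_a[point:]
    return child_a, child_b

def bit_flip_mutation(chromosome, rate, rng):
    """Flip each binary gene independently with probability `rate`."""
    return [1 - gene if rng.random() < rate else gene for gene in chromosome]

rng = random.Random(0)            # seeded only to make the run repeatable
parent_a = [0] * 8                # e.g. an 8-pixel binary device pattern
parent_b = [1] * 8
child_a, child_b = single_point_crossover(parent_a, parent_b, rng)
mutated = bit_flip_mutation(child_a, rate=0.1, rng=rng)
print(child_a, child_b, mutated)
```

In a full GA these operators would sit inside a loop that also evaluates a fitness function (for a photonic device, typically a simulated figure of merit) and selects parents for the next generation.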
    (a) PSO iteration process. (b) Flow chart of PSO.
    Nanophotonic device designed based on PSO. (a) Power splitter: binary particle swarm optimized 2×2 power splitter [199]. (b) Nanosensor: schematic diagram of a nanosensor consisting of periodic gold nanoridges [201]. (c) Optical coupler: structure of the proposed multisegment directional coupler [200]. (d) Photonic crystal: simulation results of p-polarized incident wave [204]. (e) Varifocal lens: schematic of the varifocal lens. The inset shows a single-cell sample [202].
    (a) Flow chart of the simulated annealing algorithm. Nanophotonic device based on simulated annealing algorithm optimized design. (b) Metasurface: simulated near-electric field distribution under x-polarized normal incidence [233]. (c) Spin Hall device: schematic of an on-chip broadband photonic spin element, where the incident light is coupled into different waveguides according to its spin states [234].
    (a) Flow chart of hill-climbing algorithm. Optimized design of nanophotonic device based on hill-climbing algorithm. (b) Graphene metasurfaces: structure of the first optimized metasurface [238]. (c) One-dimensional photonic crystal split-beam nanocavity: schematic diagram of symmetrical cavity design [239].
    Direct binary search flow chart.
    Optimized design of nanophotonic devices based on direct binary search. (a) Mode converter: optimized layout of TE1−TE0 mode converters and optimized optical field distribution for mode-order converter [252]. (b) Power splitter: SEM image of the entire fabricated device consisting of a dual-mode 3 dB power divider and three mode multiplexers [255]. (c) Polarization splitter-rotator: TM0−TE0 mode simulated light field, TM0−TE0 mode cross-sectional light field at input and output ports [256].
    (a) Tabu search flow chart. Nanophotonic device based on tabu search optimized design. (b) Polarization filters based on photonic lattices: optimized holes-in-slab configuration (57 scatterers) [243]. (c) Beam shaping of 2D photonic lattices: photonic lattice used for the beam-shaping problem. The dashed line indicates the plane used to calculate the desired beam [244].
    (a) Network architecture for phase unwrapping [287]. (b) One quantitative phase image of multiple lung cancer cells. The images are focused manually and then unwrapped by the quality-guided unwrapping algorithm. The unwrapped focused-phase images are used for labeled training in the model. The cross section and 3D representation of one cell with wrapped and unwrapped signals are shown [288]. (c) The DNN blindly outputs artifact-free phase and amplitude images of the object using only one hologram intensity. This DNN is composed of convolutional layers, residual blocks, and upsampling blocks and rapidly processes a complex-valued input image in a parallel, multiscale manner [289]. (d) (i) The intensity data are captured by illuminating the sample from different angles with an LED array. (ii) Training a CNN to reconstruct high-resolution phase images. The input to the CNN is low-resolution intensity images; the output of the CNN is the ground-truth phase image reconstructed using the traditional FPM algorithm. The network is then trained by optimizing the network’s parameters that minimize a loss function calculated based on the network’s predicted output and the ground truth. (iii) The network is fully trained using the first data set at 0 min and then can be used to predict phase videos of dynamic cell samples frame by frame [290].
    Examples of network structure for AI-assisted polarization imaging. (a) Architectures of polarization denoising residual dense network (PDRDN) and residual dense block (RDB) [304]. (b) Architecture of FIPNet, which consists of three parts: feature extraction layer, fusion layer, and reconstruction layer [305]. (c) A reflection separation network takes a cascaded architecture with three modules: semireflector orientation estimation, polarization-guided separation, and separated layers refinement [306]. (d) A network tailored to polarization-based dehazing pipeline, which consists of two stages: transmitted light estimation and original scene radiance reconstruction [307]. (e) A network with multibranch architecture to handle different hierarchical inputs. The physics-based prior confidence map for the weighted fusion of different inputs and the self-supervised AoLP loss to force the network to learn the prior knowledge between the normal and AoLP [308].
    AI-assisted snapshot compact SI. (a)–(d) Results of the spectral combining of the AI reconstruction and the DOE design with diffractive rotation [329]. (a) The fabricated DOE that generates spectrally varying PSFs for SI. Inset: a camera installed with the DOE. (b) The PSFs at different wavelengths. (c) Overview of the network architecture. (d) The RGB image of a reconstructed SI and the comparison between the reconstructed spectrum and the ground truth of point 1 in the scene. (e)–(g) Results of the shift-variant color-coded diffractive SI system [333]. (e) Optimization of the optical elements is carried out using an end-to-end AI approach. (f) RGB image of a reconstructed hyperspectral image and the comparison between the reconstructed spectrum and the ground truth of point 1 in the scene. SCCD types 1 to 3 denote three different types of CCA utilized in the system. Spiral denotes a system without CCA. (h)–(j) Different types of pixelated filter array: (h) Fabry–Perot filter [335]; (i) freeform-shaped metasurface filter [336]; (j) film filter [337]. (k)–(m) Results of computational SI with CMOS-compatible random array of Fabry–Perot filters shown in panel (h) [335]. (k) Performance of hyperspectral image reconstruction simulated for three hyperspectral image data sets, including the RGB show of reconstruction and the error map between the reconstruction and the ground truth. (l) Experimental results of the SI for a standard color sample. (m) The dependence of the frame rate on the image resolution for AI-based reconstruction and the iterative reconstruction with 50 iteration steps.
    Heat-assisted detection and ranging (HADAR) with AI-assisted decomposition [340]. (a) Pipeline of HADAR: HADAR takes thermal photon streams as input, records hyperspectral-imaging heat cubes, addresses the ghosting effect through AI-assisted TeX decomposition, and generates TeX vision for improved detection and ranging. (b) TeX vision demonstrated on the database and the outdoor experiments, showing that HADAR sees textures through the darkness with a comprehensive understanding of the scene. (c)–(h) Ranging based on the raw thermal images (c), (d), AI reconstructed images in the HADAR technique at night (e), (f) and daylight RGB vision (g), (h).
    AI-assisted end-to-end platform for digital pathology using hyperspectral autofluorescence microscopy and deep-learning-based virtual histology [343]. (a) Automated workflow with virtual staining and AI scoring that mimics the current pathology workflow. (b)–(e) Classical H&E stained images (b) or the immunofluorescence images [(c) elastin + α-SMA, (d) nuclei, and (e) CD68] of a tissue slice. (f)–(i) Images of the adjacent slice generated by a linear projection of the autofluorescence spectral image with different channel-related weights to enhance different components [(f) a uniform projection mimicking the autofluorescence intensity imaging result, (g) extracellular matrix, (h) nuclei, (i) macrophages]. (j) Neural network architecture of the generator of virtual stainer. AF, autofluorescence; BF, bright field. (k) BF real and virtual images stained with H&E. (l)–(o) Correlation of the slide level nonalcoholic steatohepatitis feature attributes predicted by segmentation models on real stains versus virtual stains [(l) percent steatosis, (m) percent lobular inflammation, (n) log-normalized hepatocyte balloon count, (o) fibrosis density].
    Schematic diagram of RNN. (a) Traditional neural network architecture with input, hidden, and output layers. (b) RNN architecture and an unfolding structure with t time steps. X(t): input state. h(t): hidden state. o(t): output state. W1, W2, and Wr represent input, output, and recurrent weight matrices, respectively. (c) LSTM cell architecture with forget, input, output, and cell states.
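The unfolding over t time steps in panel (b) can be written out directly with the caption's symbols. This is a minimal sketch: the tanh activation, the absence of bias terms, the linear output layer, and all numerical values are assumptions, since the caption only names the states and the weight matrices.

```python
import math

def matvec(W, v):
    """Multiply a matrix (list of rows) by a vector (list of floats)."""
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def rnn_forward(xs, W1, Wr, W2, h0):
    """Unfold h(t) = tanh(W1 x(t) + Wr h(t-1)), o(t) = W2 h(t) over all steps."""
    h = h0
    outputs = []
    for x in xs:
        pre = [a + b for a, b in zip(matvec(W1, x), matvec(Wr, h))]
        h = [math.tanh(p) for p in pre]   # new hidden state h(t)
        outputs.append(matvec(W2, h))     # output state o(t)
    return outputs, h

# Illustrative weights: 1 input, 2 hidden units, 1 output.
W1 = [[0.5], [-0.5]]              # input weights
Wr = [[0.1, 0.0], [0.0, 0.1]]     # recurrent weights
W2 = [[1.0, -1.0]]                # output weights
inputs = [[1.0], [0.0], [-1.0]]   # x(1), x(2), x(3)

outputs, h_final = rnn_forward(inputs, W1, Wr, W2, h0=[0.0, 0.0])
print(outputs)
```

The same three matrices are reused at every time step, which is what distinguishes the recurrent network from the feedforward architecture in panel (a).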
    Functions of RNN in nonlinear compensation for optical communication. (a) Schematic diagram of LSTM based on sliding window [354]. The autoencoder is represented by the blocks Tx BRNN, channel, and Rx BRNN. (b) The principle of Bi-RNN models [355]. The Bi-RNN model processes distorted symbols with intersymbol dependencies to estimate bitwise BER, optimizing complexity and performance for 16-QAM and 32-QAM. (c) Architecture of LSTM combined with CNN for nonlinear compensation [356]. The feature maps yf from the convolutional layer are fed into either two dense layers (forming the CNN + MLP structure, with the number of layers determined by the Bayesian optimizer) or a single Bi-LSTM layer.
    Various optical-sensing applications implemented using LSTM. (a) LSTM-CNN model for vibration sensing [376]. The optical cable is installed directly above the PCCP pipe and fixed with fixtures. Different signals exhibit distinct characteristics across the frequency band and more pronounced local features in the time-frequency domain. Based on LSTM and CNN architectures, a neural network was designed using time-domain waveforms along with their DWT and STFT as inputs. This integrated feature set enables effective pattern recognition. (b) Optical fiber sensing based on the LSTM-CNN model in the surgery [377]. The LSTM-CNN framework is utilized to process perioperative heart rate (HR) and respiratory rate (RR) frequency signals. Trends are extracted from HR and RR, whereas CNN and LSTM are employed for feature extraction and processing, respectively. (c) Crowded abnormal scene detection using Bi-LSTM and CNN [378]. The proposed methodology utilizes optical flow features to capture frame-level spatial information. Temporal information across the data set is modeled using a Bi-LSTM. The key components of the proposed architecture include constructing an optical feature matrix, integrating a CNN with a Bi-LSTM, and implementing a novel inference mechanism.
    Matrix computation using an MZI mesh. (a) Legend for interpreting the symbols used in other subgraphs. Two predominant methods are illustrated: (b) the Reck scheme [388] and (c) the Clements scheme [389]. The left side of the figure displays the spatial layout of the MZIs, with the number in each yellow block indicating the order of light manipulation by each MZI. The red dashed arrows denote the sequence for decomposing the unitary matrix. The colors blue and green surrounding the red arrows indicate column and row eliminations, respectively. The right side of the figure shows the corresponding elimination order of unitary matrix elements. (d) MZI mesh for universal complex-valued matrix through SVD decomposition.
    Various photonic circuits designed for matrix-vector multiplication. (a) Micrograph of a photonic circuit engineered to compute unitary matrices [32]. Different methods for realizing real-valued matrix computations through coherent MZI mesh structures are shown: (b) using an incoherent laser source with power detection [35] and (c) constructing the real part of a unitary matrix [391].
    Self-configuring strategies in optical systems. (a) A self-aligning universal beam coupler [393,394]. (b) Application of the ratio method for calibrating triangular meshes [395]. (c), (d) Use of the reversed local light interference method to calibrate universal feedforward meshes [396,397].
    Some gradient-free calibration methods. (a) Execution process of GA [398]. (b) Whole pipeline for MZI mesh calibration using GA [399]. (c) Bacterial foraging training algorithm is implemented on MZI mesh [400].
    In situ training method is proposed to realize the BP algorithm in photonic circuits. (a) Procedure of in situ training [404] and (b) experimental verification for in situ training [405].
    Incoherent optical computing circuit architectures. (a) A 4×4 nonnegative matrix is realized using a microring array [408]. (b) In the microring array, output power in both the through port and drop port is detected to realize real-valued matrix computation [409]. (c) A recursive structure named SDDLN is used to realize matrix-vector multiplication [410].
    Some advances for recent optical computing circuits. The first column [(a), (b)] shows fault-tolerance computing architecture: (a) stacked FFT [415], (b) redundant rectangular mesh and permuting rectangular mesh [417]. The second column [(c), (d)] shows some miniaturization strategies for computing devices: (c) 3D arrangement of MZI mesh for matrix computation [418], (d) PBWs are used instead of MZIs as programmable units to minimize the footprint [419]. The third column [(e)–(g)] demonstrates that the computing parallelism can be enlarged via WDM [420], FDM [407], and MDM [421] technologies.
    All-optical convolution using a 4f-system under various configurations: coherent light sources in panels (a) [430] and (b) [431] and incoherent light sources in panels (c) [432] and (d) [433]. Panels (a) and (c) utilize amplitude-only masks, whereas panels (b) and (d) employ phase-only masks.
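The principle behind 4f convolution is the convolution theorem: a lens performs a Fourier transform, a mask multiplies the spectrum, and a second lens transforms back. The sketch below checks this numerically in one dimension under several simplifying assumptions: a coherent, sampled field, an ideal complex-valued mask (not restricted to amplitude-only or phase-only), and the second lens modeled as an inverse transform, which ignores the image inversion of a physical 4f setup. The function names are hypothetical.

```python
import cmath

def dft(x, inverse=False):
    """Naive discrete Fourier transform (O(n^2)), adequate for a small demo."""
    n = len(x)
    sign = 1 if inverse else -1
    out = []
    for k in range(n):
        s = sum(x[m] * cmath.exp(sign * 2j * cmath.pi * k * m / n) for m in range(n))
        out.append(s / n if inverse else s)
    return out

def four_f_filter(field, kernel):
    """Idealized 4f pipeline: transform, multiply by the mask, transform back."""
    spectrum = dft(field)                        # first lens
    mask = dft(kernel)                           # mask placed in the Fourier plane
    filtered = [s * m for s, m in zip(spectrum, mask)]
    return dft(filtered, inverse=True)           # second lens (idealized)

def circular_convolution(a, b):
    """Direct circular convolution, used as the reference result."""
    n = len(a)
    return [sum(a[m] * b[(k - m) % n] for m in range(n)) for k in range(n)]

field = [1.0, 2.0, 0.0, -1.0, 0.5, 0.0, 0.0, 3.0]
kernel = [0.5, 0.25, 0.0, 0.0, 0.0, 0.0, 0.0, 0.25]   # small smoothing kernel

optical = four_f_filter(field, kernel)
direct = circular_convolution(field, kernel)
max_error = max(abs(o - d) for o, d in zip(optical, direct))
print(max_error)
```

With incoherent illumination the detected intensity, rather than the field, is convolved with the squared magnitude of the point spread function, which this field-level sketch does not model.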
    All-optical differentiator (a)–(c) and integrator (d)–(f) based on compact resonance structures. The phase-shifted Bragg grating can be designed to realize optical (a) differentiation [435] and (d) integration [442]. (b), (e) Ruan et al. theoretically demonstrated differentiation and integration can be reconfigured in the same device by controlling the propagating loss of surface plasmon polariton [436]. (c) Experimental realization of optical differentiation on surface plasmonic structure [437]. (f) Integration is presented using a dielectric slab [441].
    Free-space optical matrix-vector multiplier. (a) Schematic diagram for matrix-vector multiplication proposed by Goodman [426]. (b) Convolution realization through two metasurfaces [445]. (c) Coherent system for realizing matrix computation [446]. (d) Matrix-vector multiplier applied to imaging sensing for optical encoding [382]. (e) Experimental verification of dot product operation close to the shot-noise limit of detected photons [56]. (f) CMOS-compatible matrix processor supporting large input vector size [447]. (g) Spatial-temporal multiplexed matrix computing system, where matrix elements and input vector are encoded via VCSEL arrays, exhibiting efficient electro-optic conversion and compact footprint [448].
    Training methods for D2NN. (a) In situ training procedure of D2NN includes four steps: FP, error calculation, BP, and gradient update [452]. (b) The flow chart for dual adaptive training method [403]. (c) The data flow for physics-aware training [453]. (d) The conceptual illustration for hybrid training of the optical neural network [454].
    (a)–(c) Types of light sources used in D2NN, including (a) monochromatic light source [13], (b) spatially incoherent monochromatic light source [458], and (c) broadband pulse source [459]. (d)–(f) Types of D2NN structures, including (d) Fourier-space diffractive DNN [49], (e) ensemble learning of diffractive neural network [460], and (f) diffractive network in network and diffractive RNN [455].
    Diffracted layers are miniaturized by reducing working wavelength or designing on-chip diffracted structures. (a) Fabrication procedure of germanium-based diffraction grating [462]. (b) Optical machine-learning decryptor is physically 3D printed by galvo-dithered two-photon nanolithography, and integrated with a CMOS chip [463]. (c) Exploded schematic diagram of metasurface-based diffractive neural network integrated with a CMOS chip [464]. (d) Scanning electron microscope image of an on-chip metalens [465]. (e) Schematic of on-chip DONN. The diffractive unit composed of three identical silicon slots is used to modulate the amplitude and phase of the optical wave [466]. (f) The electric field distribution (left) and refractive index distribution (right) of the coherent photonic device that performs unitary matrix computation [467]. (g) Schematic of metastructures in a SiPh platform using an inverse-design method based on the effective index approximation with low-index contrast constraint [468].
    High-parallelism D2NN inference using (a) polarization multiplexing [475], (b) wavelength multiplexing [476], and (c) OAM multiplexing [477] technologies.
    AI-related applications for all-optical D2NN. (a) Handwritten digit recognition [53]. (b) Fashion product recognition [53]. (c) Video-based human action recognition [455]. (d) Image reconstruction [478]. (e) Subwavelength phase imaging [479]. (f) All-optical image encryption using incoherent illumination [461]. (g) Superresolution display [480]. (h) All-optical decryptors using coherent illumination [463].
    Hybrid opto-electrical computing system empowers the machine-vision field. (a) Handwritten digit recognition through optical-digital implementation [432]. (b) Malaria parasite detection using learned sensing network [483]. (c) Imaging compression using a multiply scattering medium and reconstruction by sparse optimization techniques [484]. (d) End-to-end computational camera design paradigm to realize achromatic extended depth of field [485]. (e) Joint optimization of microscope point spread function and differentiable reconstruction algorithm to achieve 3D information reconstruction [486]. (f) The flow chart for depth map estimation using a phase-coded aperture camera [487].
    Recent high-performance optical computing chips to support advanced AI tasks. (a) The data flow of the all-analog photoelectronic chip, which can support energy-efficient and ultrahigh-speed vision tasks [489]. (b), (c) Large-scale photonic chiplets are proposed to deploy large models for AGI tasks [490] such as (b) music generation and (c) image generation.
    (a) Structure of an LNOI modulator. (b) Modulation depth with voltage [372]. (c) Architecture of an SOA-based neural network [41].
    Several representative works of PCMs as a nonvolatile memory and weight element. (a) A waveguide-integrated PCM metasurface [513]. (b) A PCM-integrated cross-bar array for parallel convolution [406]. (c) A PCM pad array as the neural synapse [510]. (d) A PCM-integrated all-optical abacus [512].
    Nonlinear activation units. (a) A Ge-on-Si nonlinear activation unit structure and (b) its nonlinear response curve [523]. (c) An image recognition neural network with a quantum dot nonlinear activation layer and (d) a ReLU-like response with a quantum dot activation unit [44].
    • Table 1. Comparison of GA and PSO features.


      | Features | Genetic algorithm (GA) | Particle swarm optimization (PSO) |
      | --- | --- | --- |
      | Search capability | Powerful global search capability for high-dimensional multipeak problems. Maintains population diversity through crossover and mutation. | Weaker global search capability; prone to local optima in complex problems. Fast convergence, with possible premature convergence. |
      | Convergence speed | Slower, especially in complex problems, requiring more iterations and computational resources. | Faster, especially for continuous optimization problems. |
      | Computational complexity | High, especially for large populations, with multiple manipulations and fitness assessments per generation. | Low; only the particle fitness needs to be evaluated for each update. |
      | Applicability | Suited to discrete or combinatorial optimization problems; capable of handling nonlinear constraints and multiobjective problems. | Suited to continuous optimization problems, particularly parameter optimization of optical components. |
      | Usability | Implementation is complex and requires careful tuning of parameters such as population size and crossover and mutation probabilities. | Implementation is simple, with few major parameters, such as the particle velocity and position update factors. |
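
To make the PSO column of Table 1 concrete, the following is a minimal sketch of a textbook particle swarm update, not the method of any work cited in the review. The function name `pso_minimize`, the parameter values (`inertia`, `c_cognitive`, `c_social`), and the `sphere` test objective are illustrative assumptions. It shows the two properties the table highlights: one fitness evaluation per particle per update, and only a few tunable velocity and position update factors.

```python
import random


def pso_minimize(objective, bounds, n_particles=20, n_iters=100,
                 inertia=0.7, c_cognitive=1.5, c_social=1.5, seed=0):
    """Textbook PSO over a box-bounded continuous space (illustrative only)."""
    rng = random.Random(seed)
    dim = len(bounds)
    # Random initial positions inside the bounds; zero initial velocities.
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Personal bests start at the initial positions.
    best_pos = [p[:] for p in pos]
    best_val = [objective(p) for p in pos]
    g = min(range(n_particles), key=best_val.__getitem__)
    gbest_pos, gbest_val = best_pos[g][:], best_val[g]

    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull toward personal and global bests.
                vel[i][d] = (inertia * vel[i][d]
                             + c_cognitive * r1 * (best_pos[i][d] - pos[i][d])
                             + c_social * r2 * (gbest_pos[d] - pos[i][d]))
                # Position update, clamped to the search box.
                lo, hi = bounds[d]
                pos[i][d] = min(max(pos[i][d] + vel[i][d], lo), hi)
            # One fitness evaluation per particle per update.
            val = objective(pos[i])
            if val < best_val[i]:
                best_val[i], best_pos[i] = val, pos[i][:]
                if val < gbest_val:
                    gbest_val, gbest_pos = val, pos[i][:]
    return gbest_pos, gbest_val


def sphere(x):
    """Hypothetical smooth test objective with its minimum at the origin."""
    return sum(v * v for v in x)


best_x, best_f = pso_minimize(sphere, bounds=[(-5.0, 5.0)] * 3)
```

In a photonic design setting, `objective` would be replaced by a figure of merit computed from a device simulation, with `bounds` set to the allowed range of each continuous geometric parameter. A GA for the same task would additionally need selection, crossover, and mutation operators, which is the extra implementation and tuning burden the table refers to.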

    Fu Feng, Dewang Huo, Ziyang Zhang, Yijie Lou, Shengyao Wang, Zhijuan Gu, Dong-Sheng Liu, Xinhui Duan, Daqian Wang, Xiaowei Liu, Ji Qi, Shaoliang Yu, Qingyang Du, Guangyong Chen, Cuicui Lu, Yu Yu, Xifeng Ren, Xiaocong Yuan, "Symbiotic evolution of photonics and artificial intelligence: a comprehensive review," Adv. Photon. 7, 024001 (2025)

    Paper Information

    Category: Reviews

    Received: Sep. 8, 2024

    Accepted: Jan. 24, 2025

    Published Online: Apr. 3, 2025

    Author emails: Qi Ji (ji.qi@zhejianglab.org), Du Qingyang (qydu@zhejianglab.org), Chen Guangyong (gychen@zhejianglab.org), Lu Cuicui (cuicuilu@bit.edu.cn), Yu Yu (yuyu@mail.hust.edu.cn), Ren Xifeng (renxf@ustc.edu.cn), Yuan Xiaocong (xcyuan@zhejianglab.org)

    DOI:10.1117/1.AP.7.2.024001

    CSTR:32187.14.1.AP.7.2.024001

    Topics