Chinese Optics Letters, Volume 19, Issue 1, 011301 (2021)
Intelligent algorithms: new avenues for designing nanophotonic devices [Invited]
The research on nanophotonic devices has made great progress during the past decades. Researchers have pursued various device functions to meet the needs of practical applications. However, most traditional methods rely on human experience and physical intuition for structural design and parameter optimization, which usually require considerable resources, and the performance of the resulting devices is limited. Intelligent algorithms, which comprise a rich family of optimization algorithms, have shown a vigorous development trend in the field of nanophotonic devices in recent years. Designing nanophotonic devices with intelligent algorithms can break the restrictions of traditional methods and predict novel configurations, and the approach is universal and efficient across different materials, structures, modes, wavelengths, etc. In this review, intelligent algorithms for designing nanophotonic devices are introduced from their concepts to their applications, including deep learning methods, the gradient-based inverse design method, swarm intelligence algorithms, individual inspired algorithms, and some other algorithms. The design principles based on intelligent algorithms and the design of typical new nanophotonic devices are reviewed. Intelligent algorithms can play an important role in designing complex functions and improving the performance of nanophotonic devices, providing new avenues for the realization of photonic chips.
1. Introduction
The various technical challenges that traditional electronic devices have faced in recent years suggest that Moore’s law is becoming increasingly difficult to maintain.
Intelligent algorithms are, in many cases, practical alternative techniques for solving a variety of challenging engineering problems.
In this review article, the deep learning method, the gradient-based inverse design method, swarm intelligence algorithms [including genetic algorithm (GA), particle swarm optimization (PSO), and ant colony algorithm (ACA)], individual inspired algorithms [including the simulated annealing algorithm (SAA), the hill climbing algorithm, and tabu search (TS)], and some other algorithms [including the direct binary search (DBS) algorithm, topology optimization, and Monte Carlo method] are introduced from research background or concept to applications for designing nanophotonic devices. A summary of the intelligent algorithms and their applications for designing nanophotonic devices is shown in Fig. 1. Corresponding application examples of nanophotonic devices are listed under each mentioned intelligent algorithm. The advances in the design of nanophotonic devices using various intelligent algorithms may bring new inspiration for further research of nanophotonic structures and devices. Recently, our group has developed an intelligent algorithm by combining GA and the finite element method (FEM), and we have realized on-chip wavelength routers.
Figure 1.Summary of intelligent algorithms and their applications for designing nanophotonic devices in this review.
This review includes seven sections. The first section is the introduction, where we illustrate the purpose of writing this review. The second section is about the deep learning method, especially the artificial neural network, where the history, principle, and applications are demonstrated. In the third section, the gradient-based inverse design is introduced, including the adjoint algorithm for optimizing the parameters of nanophotonic devices, which is a further improvement on the gradient-based inverse design method for optimizing systems that follow known laws of physics. The fourth section focuses on swarm intelligence algorithms, introducing GA and PSO, which have been widely used in recent years, as well as ACA, which is often used to optimize the design of solar devices. The fifth section covers individual inspired algorithms, including the SAA, the hill climbing algorithm, and the TS algorithm, which are introduced from the aspects of concept, development process, and application. The sixth section covers some other intelligent algorithms, including DBS, topology optimization, and the Monte Carlo method, which play an important role in designing multiplexers, band structures, optical imaging, etc. The last section is the summary, which summarizes the advantages of intelligent algorithms in designing complex functions and improving device performance for nanophotonic devices, and explains the development trend of using intelligent algorithms, especially in the design of nanophotonic devices in the future.
2. Nanophotonic Devices Based on Deep Learning Methods
In 2016, after the artificial intelligence (AI) program “AlphaGo” defeated the Go world champion Lee Sedol, the term “deep learning” became firmly imprinted in people’s minds.
To get rid of these troubles, researchers have tried to develop algorithms that integrate the process of feature learning into the process of machine learning, which is so-called representation learning. Deep learning is a typical kind of representation learning (see Fig. 2 for the inclusion relation of these three with AI). Deep learning went through a long period of obscurity before AlphaGo made it a blockbuster, and in the past people had almost given it up. It was not until 2006, when Hinton et al. proposed a model called the “deep belief network,” that deep learning came back into the spotlight.
Figure 2.Inclusion relation of machine learning, representation learning, deep learning, and artificial intelligence.
The core of deep learning is the design of the artificial neural network (ANN). As the term suggests, the structure of the ANN is based on the simulation of the neural network of the human brain. Some neurons apply an activation to the messages received from elsewhere and then pass them on to other neurons. That is to say, deep learning methods are representation learning methods with multiple levels of representation, obtained by composing simple but non-linear modules that each transform the representation at one lower level into a representation at a higher and slightly more abstract level. With the composition of enough transformations, complex functions can be learned.
2.1 Introduction to the deep learning method
In this part, the use of the deep learning method for nanophotonic devices will be introduced and illustrated. First of all, when applying the deep learning method, a certain number of training data need to be generated, and the quantitative characteristics of each data sample, in the form of a one-dimensional vector, are input to the neural network. The input information is processed in the first layer (i.e., the layer after the input layer) and then transferred to the next layer. Taking the neurons in the $l$th layer as an example (see Fig. 3), this layer has $n^{[l]}$ neurons (the number of layers, the number of neurons in each layer, and other parameters preset before training are called hyperparameters). $Z^{[l]}$ and $A^{[l]}$ are used to represent the information before and after processing with the activation function, respectively. The processing of information by the neurons in the $l$th layer can then be expressed as follows: $$Z^{[l]} = W^{[l]} A^{[l-1]} + b^{[l]}, \qquad A^{[l]} = g\big(Z^{[l]}\big).$$
Figure 3.Neurons in each layer process and transfer data in the form of column vectors, and the weights of the neural network are expressed as matrices.
Here, $W^{[l]}$ is an $n^{[l]} \times n^{[l-1]}$ weight matrix. The function $g$ is the activation function, and the commonly used non-linear activation functions are the sigmoid function, the ReLU function, the Tanh function, etc. When the information is transferred to the last layer and activated, the so-called prediction value is obtained. After selecting an appropriate cost function, the chain rule is used for backpropagation, and the stochastic gradient descent (SGD) algorithm is used to update the parameters (weights and biases) of each neuron in the neural network, which ends one round of training. After feeding a large number of training data, it is expected that the parameter set Θ of the ANN will be updated to values more suitable for dealing with similar problems, i.e., the cost function converges to a local minimum. By using the test set after the training process, problems such as underfitting and overfitting can be detected. Specifically, this is done by calculating the variance and bias and drawing the learning curve during the testing process.
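As an illustration of the forward pass and gradient-descent update described above, the following is a minimal sketch (our own toy example, not taken from the reviewed works) of a one-hidden-layer network trained on an arbitrary regression task; the layer sizes, learning rate, and target function are illustrative choices, and full-batch gradient descent is used for brevity in place of SGD.

```python
import numpy as np

rng = np.random.default_rng(0)
n_in, n_hidden, n_out = 4, 8, 1           # hyperparameters (layer sizes)
W1, b1 = rng.normal(0, 0.5, (n_hidden, n_in)), np.zeros((n_hidden, 1))
W2, b2 = rng.normal(0, 0.5, (n_out, n_hidden)), np.zeros((n_out, 1))
sigmoid = lambda z: 1.0 / (1.0 + np.exp(-z))
lr = 0.1                                   # learning rate

X = rng.normal(size=(n_in, 64))            # 64 training samples as column vectors
Y = np.sum(X, axis=0, keepdims=True)       # toy regression target: sum of the inputs

for epoch in range(200):
    # forward pass: Z[l] = W[l] A[l-1] + b[l],  A[l] = g(Z[l])
    Z1 = W1 @ X + b1;  A1 = sigmoid(Z1)
    Z2 = W2 @ A1 + b2; A2 = Z2             # linear output layer (regression)
    cost = np.mean((A2 - Y) ** 2)          # mean-squared-error cost function
    # backpropagation via the chain rule
    dZ2 = 2 * (A2 - Y) / X.shape[1]
    dW2, db2 = dZ2 @ A1.T, dZ2.sum(axis=1, keepdims=True)
    dZ1 = (W2.T @ dZ2) * A1 * (1 - A1)     # sigmoid'(Z1) = A1 * (1 - A1)
    dW1, db1 = dZ1 @ X.T, dZ1.sum(axis=1, keepdims=True)
    # gradient-descent parameter update
    W1 -= lr * dW1; b1 -= lr * db1
    W2 -= lr * dW2; b2 -= lr * db2
    if epoch % 50 == 0:
        print(epoch, float(cost))
```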
Usually, an ANN can deal with two kinds of problems: regression problems and classification problems. The time consumed to train an ANN is one reference for evaluating it. Also, when evaluating the performance of a neural network on classification problems, metrics such as precision and recall are often introduced, even though it is sometimes necessary to trade off these two metrics. Several strategies can be used to improve the performance, such as collecting more training data, applying regularization, and adjusting the network architecture and other hyperparameters.
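For reference, precision and recall can be computed directly from the prediction counts, as in the short sketch below (a generic toy example, not tied to any specific device-design task).

```python
import numpy as np

# Toy binary classification results (hypothetical labels and predictions).
y_true = np.array([1, 0, 1, 1, 0, 1, 0, 0, 1, 0])
y_pred = np.array([1, 0, 1, 0, 0, 1, 1, 0, 1, 0])

tp = np.sum((y_pred == 1) & (y_true == 1))   # true positives
fp = np.sum((y_pred == 1) & (y_true == 0))   # false positives
fn = np.sum((y_pred == 0) & (y_true == 1))   # false negatives

precision = tp / (tp + fp)   # fraction of predicted positives that are correct
recall = tp / (tp + fn)      # fraction of actual positives that are recovered
print(f"precision = {precision:.2f}, recall = {recall:.2f}")
```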
Deep learning methods have been applied in many fields, including the design of nanophotonic devices. In order to design and evaluate a nanophotonic device, it is necessary to predict the optical response, and the prediction is usually implemented by solving Maxwell’s equations using dedicated numerical methods.
2.2 Typical architectures of ANNs
In this part, typical architectures of ANNs will be introduced and illustrated. There are several typical architectures of ANNs that are often adopted to design and optimize nanostructures with different functions.
Malkiel et al. trained and tested a bidirectional deep-learning architecture with the capability of predicting the geometry of nanostructures solely based on the far-field response of the nanostructures, and the prediction is accurate.
Figure 4.(a) Bidirectional network used for inverse design[13]. (b) The TN consists of an inverse design network and a forward modeling network[14]. (c) A CNN consists of two bidirectional neural networks, and it is capable of automatically designing and optimizing three-dimensional (3D) chiral metamaterials with strong chiral-optical responses at specified wavelengths[17]. (d) A DNN for forward and inverse design of a power splitter[16].
In most cases, neural networks with more layers perform better, whereas fully connected deep neural networks (FCDNNs) generally suffer from the problem of vanishing gradients. As a result, increasing the depth of an FCDNN does not necessarily improve the performance. Kojima et al. solved this problem by using a residual deep neural network [ResNet, see Fig. 4(d)] to increase the depth of training up to 8 hidden layers for both the forward and inverse problems.
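The key idea behind such residual networks is the skip connection, sketched below in a generic numpy form (an illustration of the concept only, not the architecture used by Kojima et al.; sizes and weights are arbitrary).

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def residual_block(a_in, W, b):
    """One simple residual block: the input is added back to the transformed
    signal (a skip connection), which helps gradients flow through deep networks."""
    z = W @ a_in + b
    return relu(z) + a_in          # one common variant of the skip connection

# Example: chain a few blocks of width 8 (sizes chosen arbitrarily).
rng = np.random.default_rng(1)
a = rng.normal(size=(8, 1))
for _ in range(4):
    W = rng.normal(0, 0.3, (8, 8))
    b = np.zeros((8, 1))
    a = residual_block(a, W, b)
```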
As a typical neural network structure, the convolutional neural network (CNN) has been successfully applied in the field of image recognition and is now also used in the design of nanophotonic devices. The two main advantages of CNNs over FCDNNs are parameter sharing and sparsity of connections (i.e., in each layer, each output value depends only on a small number of inputs, which somewhat avoids the problem of overfitting and is more suitable for design problems with more parameters). Ma et al. reported a CNN model comprising two bidirectional neural networks assembled by a partial stacking strategy [see Fig. 4(c)] to automatically design and optimize 3D chiral metamaterials with strong chiral-optical responses at predesignated wavelengths.
Figure 5.(a) CNN used to predict the invariance of 1D photonic crystal[18]. (b) A novel CAVE for the design of a power splitter[23].
Another type of neural network whose range of application is expanding rapidly is the generative adversarial network (GAN).
Recently, benefiting from the development of deep learning itself and open-source software libraries such as TensorFlow, there have been more and more reports of applying neural networks to the design of nanophotonic devices, and the overall trend is that the employed neural network structures are becoming more advanced and complex. In addition, taking advantage of the “black box” characteristic of deep learning (i.e., people do not care about its internal structure, but only its input and output), some novel algorithms have been invented by modifying the deep learning method. Zhou et al. designed two programmable optical signal processing chips with a learning ability based on the idea of the deep learning method.
2.3 Discussion and outlook
Deep learning methods have many advantages over traditional algorithms. First, once trained, a neural network costs less time than traditional algorithms (i.e., it has a lower computational cost) and is more likely to find better local optimal solutions. For example, using neural networks to predict the spectrum of a nanoscale optical device tends to be more accurate than traditional algorithms. Hammond et al. trained ANNs to model both strip waveguides and chirped Bragg gratings, and they found that the trained ANNs decreased the computational cost relative to traditional design methodologies by more than 4 orders of magnitude.
However, deep learning methods also have some limitations and drawbacks. First, since the design of nanophotonic devices is a non-convex problem, it is impossible to guarantee that the designed devices are optimal. Jiang et al. presented a global optimizer that performs a global search for the optimal device within the design space, but the final devices may still not be optimal.
At the end of this section, some prospects for neural networks are given. Based on the outstanding performance of the deep learning method in the nanophotonic field and the analysis of a number of papers, we can confidently predict that fewer and fewer nanophotonic device design works will be carried out in the future without a deep learning algorithm. Its flexibility also makes it an excellent candidate for handling other nanophotonic problems.
3. Nanophotonic Devices Based on the Gradient-Based Inverse Design
In recent decades, the importance of inverse problems has grown considerably in many fields. The mathematical expression of a physical law is a rule that defines a mapping T of a set of functions ξ, called the parameters, into a set of functions δ, called the results. According to the above expression, by finding inverse mappings of δ into ξ, inverse problems can be defined in a precise mathematical form that excludes the so-called “fitting procedure,” in which models depending on a few parameters and giving a good fit of the experimental results are obtained by trial and error or any other techniques.
3.1 Introduction to the gradient-based inverse design
The Vuckovic group at Stanford University reported an inverse design algorithm, and a variety of nanophotonic devices have been designed by the algorithm, such as multi-channel devices and power splitters (routers).
Here, $\hat{\mathbf{n}}$ is a unit vector pointing in the propagation direction, and $\mathbf{r}_\perp$ denotes the coordinates perpendicular to the propagation direction. Faraday’s law relates the corresponding magnetic field of the mode to its electric field.
More generally, the output mode amplitude $\alpha$ can be specified as a linear function of the electric field $\mathbf{E}$, which can be written as $\alpha = \mathbf{c}^{\dagger}\mathbf{E}$ for a suitable overlap vector $\mathbf{c}$.
After the problem formulation, the gradient-based inverse design algorithm solves Maxwell’s equations numerically and employs numerical optimization techniques to design devices. It uses two methods to solve this problem: the ‘objective first’ method and a ‘steepest descent’ method. In the objective first method, the algorithm constrains the electric fields to satisfy the performance constraints in Eq. (7). Then the algorithm minimizes the violation of physics using the alternating direction method of multipliers (ADMM) optimization algorithm.
3.2 Application of the gradient-based inverse design
The gradient-based inverse design algorithm is a relatively general computational method for nanophotonic design that is widely used in the design of nanophotonic devices. In this section, we present some typical nanophotonic devices designed by the inverse design algorithm. The multi-channel device, which is called a hub by its designers, is shown in Fig. 6(a).
Figure 6.Nanophotonic devices designed by the gradient-based inverse design. (a) The structure diagram of the multi-channel device (hub).
In addition to the multi-channel device in a single polarization mode, the algorithm is also used to design devices that can exhibit different functionality for different input excitations, such as mode converters. Figure 6(c) is a schematic diagram of the TE mode converter, which is a mode conversion device operating in TE polarization.
The researchers then improved the algorithm further by introducing the adjoint method to compute the gradient efficiently using a single time-reversed electromagnetic simulation. Usually, when we optimize the parameters of a system, we know the laws of physics (usually expressed as a partial differential equation, PDE) that the system follows. This type of problem, called PDE-constrained optimization, has many application scenarios.
The adjoint method can optimize parameters well and solve practical problems; it is relatively mature and can be widely used. The time-dependent adjoint method can also be used to solve optimal control problems. If the state transition of the control problem itself is too complex to admit a closed-form solution, gradient descent is a good choice considering the constraints. On the other hand, in optimal control we can consider the randomness of the system transfer, so the adjoint method can obviously take similar randomness into account. Given a final time $T$, we consider the optimization over the time period $[0, T]$.
System: $\dot{x}(t) = f\big(x(t), \theta, t\big)$. The initial state $x(0)$ is given, and the subsequent evolution of $x(t)$ follows this ODE.
Loss function: $L(\theta) = \int_0^T \ell\big(x(t), \theta, t\big)\,\mathrm{d}t$. This is an integral over time.
Similarly, we define the Lagrange function $$\mathcal{L} = \int_0^T \Big[\ell(x,\theta,t) + \lambda(t)^{\mathrm T}\big(f(x,\theta,t) - \dot{x}(t)\big)\Big]\,\mathrm{d}t.$$
After taking the derivative with respect to $\theta$, we can get $$\frac{\mathrm{d}\mathcal{L}}{\mathrm{d}\theta} = \int_0^T \Big[\ell_\theta + \lambda^{\mathrm T} f_\theta + \big(\ell_x + \lambda^{\mathrm T} f_x\big)x_\theta - \lambda^{\mathrm T}\dot{x}_\theta\Big]\,\mathrm{d}t.$$
And then we simplify the term containing $\dot{x}_\theta$ by integration by parts, using $x_\theta(0) = 0$: $$\int_0^T \lambda^{\mathrm T}\dot{x}_\theta\,\mathrm{d}t = \lambda(T)^{\mathrm T} x_\theta(T) - \int_0^T \dot{\lambda}^{\mathrm T} x_\theta\,\mathrm{d}t.$$
The end result is $$\frac{\mathrm{d}\mathcal{L}}{\mathrm{d}\theta} = \int_0^T \big(\ell_\theta + \lambda^{\mathrm T} f_\theta\big)\,\mathrm{d}t + \int_0^T \big(\ell_x + \lambda^{\mathrm T} f_x + \dot{\lambda}^{\mathrm T}\big)x_\theta\,\mathrm{d}t - \lambda(T)^{\mathrm T} x_\theta(T).$$
We can take the multiplier $\lambda(t)$ such that $\dot{\lambda}^{\mathrm T} = -\big(\ell_x + \lambda^{\mathrm T} f_x\big)$ with the terminal condition $\lambda(T) = 0$, so that both of the terms in the brackets are zero; the gradient then reduces to $\mathrm{d}L/\mathrm{d}\theta = \int_0^T \big(\ell_\theta + \lambda^{\mathrm T} f_\theta\big)\,\mathrm{d}t$.
In a word, during the whole optimization process, only three steps are needed for a single gradient-descent update: simulate the system forward in time to obtain $x(t)$, solve the adjoint ODE backward in time to obtain $\lambda(t)$, and evaluate the integral above to obtain the gradient with which $\theta$ is updated.
Thus, for each step of gradient descent, we only need to do a few simulations and then solve a few ODEs. The computation is greatly reduced.
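To make the three steps concrete, here is a minimal numerical sketch (our own toy example, not from the cited works) for the scalar system $\dot{x} = -\theta x$ with loss $L = \int_0^T x(t)^2\,\mathrm{d}t$; the adjoint gradient is compared with a finite-difference estimate. The parameter values and grid size are arbitrary.

```python
import numpy as np

# Toy adjoint-gradient example: dx/dt = -theta*x, x(0)=1, L = int_0^T x^2 dt.
theta, T, N = 1.5, 1.0, 20000
dt = T / N

def forward(theta):
    """Step 1: simulate the system forward in time (forward Euler)."""
    x = np.empty(N + 1); x[0] = 1.0
    for k in range(N):
        x[k + 1] = x[k] + dt * (-theta * x[k])
    return x

def adjoint_gradient(theta):
    x = forward(theta)
    # Step 2: solve the adjoint ODE backward in time,
    #   dlam/dt = -(dl/dx + lam*df/dx) = -2*x + theta*lam,  lam(T) = 0.
    lam = np.empty(N + 1); lam[N] = 0.0
    for k in range(N, 0, -1):
        lam[k - 1] = lam[k] - dt * (-2.0 * x[k] + theta * lam[k])
    # Step 3: dL/dtheta = int_0^T (dl/dtheta + lam*df/dtheta) dt,
    #   with dl/dtheta = 0 and df/dtheta = -x.
    return np.sum(lam[:-1] * (-x[:-1])) * dt

def loss(theta):
    x = forward(theta)
    return np.sum(x[:-1] ** 2) * dt

eps = 1e-5
fd = (loss(theta + eps) - loss(theta - eps)) / (2 * eps)   # finite-difference check
print("adjoint gradient:", adjoint_gradient(theta), " finite difference:", fd)
```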
The application of the adjoint method to optimization problems with constraints has two main aspects.
In the improved gradient-based inverse design algorithm, the adjoint algorithm is used to calculate the gradient and optimize the parameters and the structure. With boundary parameterization and structure optimization, a broadband optimization to produce a robust device can be performed.
Figure 7.Nanophotonic devices designed by the gradient-based inverse design. (a) The structure diagram of the TE/TM router[62]. (b) The electromagnetic energy density of the TE/TM router at 1550 nm. (c) Measured transmission of the three-channel router[65]. (d) Simulated electromagnetic energy density of the three-channel router at the three operating wavelengths.
With the development of artificial intelligence and information technology, more and more types of nanophotonic devices have been recently designed by the inverse design algorithm. It tends to be used in the design of multifunctional devices and cascaded devices, such as laser-driven particle accelerators, resonators, interfacing grating couplers of conceptual photonic circuits, and switches.
Figure 8.Nanophotonic devices designed by the gradient-based inverse design. (a) SEM image of cascaded Fano–Lorentzian resonators implemented on a silicon-on-insulator platform[67].
3.3 Discussions
The gradient-based inverse design can automatically design photonic devices and only requires the user to input high-level parameters. The algorithm can handle a large parameter space and design devices that exploit the full space of fabricable devices. It tends to require fewer simulations than genetic or particle swarm optimization, as it does not rely on parameter sweeps or random perturbations to find its minima. The gradient-based inverse design algorithm can be used to design photonic devices with any passive and linear photonic element. However, the design achieved by the inverse design algorithm typically exhibits a continuous topography, and some very small components may be formed in the structures during the inverse design process, which brings challenges for sample fabrication. Moreover, the gradient-based inverse design method usually produces a local optimal solution and cannot realize true global optimization.
4. Nanophotonic Devices Based on Swarm Intelligence Algorithms
Swarm intelligence refers to the phenomenon in which non-intelligent agents exhibit intelligent collective behavior through cooperation; it is a class of computing techniques based on the laws of biological group behavior. In recent years, various algorithms have appeared in the research field of swarm intelligence theory, such as the genetic algorithm (GA), particle swarm optimization (PSO), and the ant colony algorithm (ACA). Research on both theory and application has proved that swarm intelligence algorithms are effective methods that can solve most optimization problems effectively.
4.1 Genetic algorithm
GA is an adaptive global search optimization algorithm that simulates the genetic and evolutionary process of organisms in natural environments.
According to individual fitness and certain rules, some individuals with excellent traits are selected from the $t$th generation population and passed on to the next, $(t+1)$th, generation. In this selection process, the greater the fitness of an individual, the greater its chance of being selected for the next generation. For an individual $i$ with fitness $f_i$ in a population of size $N$, the probability of being selected is $$P_i = \frac{f_i}{\sum_{j=1}^{N} f_j}.$$
Individuals selected from the population are randomly paired and, for each pair, parts of their chromosomes (portions of the encoded bit string) are swapped with a certain probability (crossover probability, typically 0.25–1.0). In this way, the search ability of GA is extended. Figure 9 is the flow chart of the GA.
Figure 9.The flow chart of GA[77].
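As a concrete illustration of the selection–crossover–mutation loop in Fig. 9, the following is a minimal binary GA sketch (a generic toy example, not a device-design code; the population size, rates, and fitness function are arbitrary choices).

```python
import numpy as np

rng = np.random.default_rng(0)
n_pop, n_bits, n_gen = 30, 16, 60          # population size, chromosome length, generations
p_cross, p_mut = 0.8, 1.0 / n_bits         # crossover and per-bit mutation probabilities

def fitness(pop):
    # Toy objective: maximize the number of 1-bits in each chromosome.
    return pop.sum(axis=1).astype(float) + 1e-9   # small offset avoids division by zero

pop = rng.integers(0, 2, size=(n_pop, n_bits))
for gen in range(n_gen):
    f = fitness(pop)
    probs = f / f.sum()                    # roulette-wheel selection: P_i = f_i / sum_j f_j
    parents = pop[rng.choice(n_pop, size=n_pop, p=probs)]
    children = parents.copy()
    for i in range(0, n_pop - 1, 2):       # single-point crossover on consecutive pairs
        if rng.random() < p_cross:
            cut = rng.integers(1, n_bits)
            children[i, cut:], children[i + 1, cut:] = (
                parents[i + 1, cut:].copy(), parents[i, cut:].copy())
    mutate = rng.random(children.shape) < p_mut    # bit-flip mutation
    children = np.where(mutate, 1 - children, children)
    pop = children

print("best fitness:", int(fitness(pop).max()), "of", n_bits)
```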
To obtain the desired optical properties, Huntington et al. designed a lattice evolution algorithm that allows lattice optical materials to exhibit simple properties or focus light on discrete points.
Figure 10.Nanophotonic devices designed by GA. (a) Lattice optical materials capable of focusing light into several different focal points in the far field. The left is a schematic diagram of the experimental device. The right shows light focused on several different points through a lattice of lattice optical materials[6]. (b) Simulated reflection characteristics of antireflection coatings[76]. (c) The left is the initial silicon plate and the corresponding electric field distribution before optimization, and the right is the structure and electric field distribution of the reflector after optimization[77]. (d) The structure obtained after GA and simulated transmittance spectrum[78].
Yu et al. used GA to optimize the design of the prevalent thin-film-on-insulator platform for reflectors.
Our group has constructed an intelligent algorithm by combining GA and FEM to design a wavelength router[7] and a polarization router[8], as shown in Fig. 11.
Figure 11.Nanophotonic devices designed by GA. (a) The structure diagram of wavelength router and (b) the simulated transmittance[7]. (c) The optimized structure of the polarization router. (d) and (e) are the simulated transmission spectra of the polarization router’s O1 and O2 ports[8].
Chen et al. proposed a GA-based method for optimizing field emission (FE) devices, as shown in Figs. 12(a) and 12(b)[79].
Figure 12.Nanophotonic devices designed by GA. (a) Measured data and calculated results (red solid line); the inset is a schematic of carbon nanotube films and diode FE measurements. (b) Optimized electron beam trajectories for this type of FE device[79]. (c) The total normalized scattering efficiency (black line) and the contributions of the induced electric dipole (ED) and magnetic dipole (MD) moments of core-shell nanoparticles[80].
GA is well suited to complex problems, such as those in which many system parameters must be optimized at the same time or in which the application problem has no well-defined, unique optimal value. GA can not only solve single-objective optimization problems, but also plays an even more important role in multi-objective optimization problems. A common selection strategy in multi-objective GA is to define individual fitness through different methods. Although the local search ability of GA is poor, it is often used in combination with other algorithms, taking advantage of its easy parallel implementation, to improve the overall performance, as demonstrated in many works.
Just like GA, which is based on biological evolution, the cultural algorithm (CA) uses cultural or social evolution to simulate human society and solves optimization problems by using domain knowledge to reduce the search space.
4.2 Particle swarm optimization
The PSO algorithm is derived from simulation studies of the migration and aggregation behavior of birds during foraging. The basic idea is to find the optimal solution through cooperation and information sharing among the individuals of a group. It combines the characteristics of evolutionary computation and swarm intelligence, and it is essentially a kind of random search algorithm.
In PSO, the velocity and position of each particle in the solution space are initialized over the set of possible solutions.
The whole process can be represented by the equations $$v_i^{t+1} = \omega v_i^{t} + c_1 r_1\big(p_{\mathrm{best},i} - x_i^{t}\big) + c_2 r_2\big(g_{\mathrm{best}} - x_i^{t}\big), \qquad x_i^{t+1} = x_i^{t} + v_i^{t+1},$$ where $\omega$ is the inertia weight, $c_1$ and $c_2$ are acceleration coefficients, $r_1$ and $r_2$ are random numbers in $[0,1]$, $p_{\mathrm{best},i}$ is the best position found by particle $i$, and $g_{\mathrm{best}}$ is the best position found by the whole swarm.
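A minimal PSO sketch implementing these update equations on a toy continuous objective is given below (the objective, swarm size, and coefficients are arbitrary illustrative choices).

```python
import numpy as np

rng = np.random.default_rng(0)
n_particles, n_dim, n_iter = 20, 2, 100
w, c1, c2 = 0.7, 1.5, 1.5                      # inertia weight, acceleration coefficients

def objective(x):
    # Toy objective to minimize: the sphere function, minimum at the origin.
    return np.sum(x**2, axis=-1)

x = rng.uniform(-5, 5, (n_particles, n_dim))   # positions
v = np.zeros_like(x)                           # velocities
p_best = x.copy()
p_best_val = objective(x)
g_best = p_best[np.argmin(p_best_val)].copy()

for _ in range(n_iter):
    r1 = rng.random((n_particles, n_dim))
    r2 = rng.random((n_particles, n_dim))
    v = w * v + c1 * r1 * (p_best - x) + c2 * r2 * (g_best - x)
    x = x + v
    val = objective(x)
    improved = val < p_best_val                       # update personal bests
    p_best[improved], p_best_val[improved] = x[improved], val[improved]
    g_best = p_best[np.argmin(p_best_val)].copy()     # update global best

print("best position:", g_best, "objective:", objective(g_best))
```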
Using the PSO algorithm to optimize the parameters, Djavid et al. proposed an evolutionary design approach for the photonic crystal notch filter.
Figure 13.Nanophotonic devices designed by PSO. (a) A notch filter based on a microcavity and (b) single frame extracted from a video recording of the electric field intensity of the notch filter at the wavelength of 1500 nm[86]. (c) The structure of the taper optimized by PSO and the electric field distribution inside it[5]. (d) The optimized geometry of the silver nanoparticle array and (e) the magnitude of its Fourier transform[87].
In order to design a binary mask, Rogers et al. used a binary PSO algorithm to optimize the mask.
Figure 14.Nanophotonic devices designed by PSO. (a) The SEM image of SOL and (b) the SEM image of the cluster of nanoholes on the metal membrane. The SOL image shows all the main features of the cluster[88]. (c) Optimized power splitter device and (d) normalized strength[90]. (e) The white rectangle represents the spatial distribution of the nanometer aperture of the two-channel multiplexing lens. (f) The simulated intensity profiles of the radiated beam of the two-channel multiplexing metalens in the xz plane[91].
Ha et al. proposed a design method for an ultra-compact, small-footprint lens, combining the PSO algorithm with spatial technology.
PSO approaches the optimal solution fairly quickly and can effectively optimize the parameters of a system. The advantage of PSO is that it can be applied to continuous function optimization problems. The main drawback of this method is that it easily produces premature convergence, especially when dealing with complex problems with multiple optima, and its local optimization ability is poor. PSO falls into local minima mainly because of the loss of population diversity in the search space. To further improve it, we can either combine it with other algorithms or add a mutation operation. PSO has been used to optimize nanostructures and design nanophotonic devices, and it can be used to optimize multidimensional problems. Although PSO has high requirements for parameter settings, its process is easy to understand and its convergence speed is fast.
4.3 Ant colony algorithm
ACA is derived by simulating the process of ants finding their way in nature, and it is an intelligent algorithm for searching the shortest path. ACA has the advantage of strong robustness.
The basic ACA is expressed as follows: at the initial moment, $m$ ants are randomly placed, and the initial amount of pheromone on each path is equal. At moment $t$, the probability of the $k$th ant moving from node $i$ to node $j$ is $$p_{ij}^{k}(t) = \frac{\big[\tau_{ij}(t)\big]^{\alpha}\big[\eta_{ij}\big]^{\beta}}{\sum_{s \in \mathrm{allowed}_k}\big[\tau_{is}(t)\big]^{\alpha}\big[\eta_{is}\big]^{\beta}},$$ where $\tau_{ij}(t)$ is the pheromone on edge $(i, j)$, $\eta_{ij}$ is a heuristic visibility factor, $\alpha$ and $\beta$ weight their relative importance, and $\mathrm{allowed}_k$ is the set of nodes the $k$th ant may visit next.
Figure 15.The flow chart of ACA optimization process[94].
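For illustration, the sketch below applies this transition rule, together with a simple pheromone evaporation and deposit update, to a tiny four-node tour problem (all distances, coefficients, and ant counts are arbitrary toy values, not from the cited works).

```python
import numpy as np

rng = np.random.default_rng(0)

# Tiny symmetric distance matrix between 4 nodes (illustrative values).
d = np.array([[0, 2, 9, 10],
              [2, 0, 6, 4],
              [9, 6, 0, 3],
              [10, 4, 3, 0]], dtype=float)
n = d.shape[0]
tau = np.ones((n, n))                      # initial pheromone, equal on each path
eta = 1.0 / (d + np.eye(n))                # visibility = 1/distance (diagonal padded)
alpha, beta, rho, Q = 1.0, 2.0, 0.5, 1.0   # pheromone/visibility weights, evaporation, deposit

def build_tour(start=0):
    """One ant builds a tour using the transition probability p_ij."""
    tour, allowed = [start], set(range(n)) - {start}
    while allowed:
        i = tour[-1]
        cand = np.array(sorted(allowed))
        weights = (tau[i, cand] ** alpha) * (eta[i, cand] ** beta)
        tour.append(rng.choice(cand, p=weights / weights.sum()))
        allowed.discard(tour[-1])
    return tour

for it in range(50):                       # a few iterations with 5 ants each
    tours = [build_tour() for _ in range(5)]
    tau *= (1.0 - rho)                     # pheromone evaporation
    for t in tours:                        # deposit pheromone along each closed tour
        length = sum(d[t[k], t[(k + 1) % n]] for k in range(n))
        for k in range(n):
            i, j = t[k], t[(k + 1) % n]
            tau[i, j] += Q / length
            tau[j, i] += Q / length

tour = build_tour()
print("tour:", tour, "length:", sum(d[tour[k], tour[(k + 1) % n]] for k in range(n)))
```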
Using ACA, Saouane et al. obtained the optimal inclination angle for a photovoltaic collector through simulation and improved the efficiency of the collector.
Figure 16.Nanophotonic devices designed by ACA. (a) The ACA-based method was used to calculate the reflection coefficient of the antireflection coating system on silicon substrate and (b) the simulation results show that the reflectivity of the antireflection coating system is changed with wavelength and incident angle by ACA[95].
However, if the parameters are not set properly, the solution speed will be very slow and the quality of the solution will be poor. In the early stage, ACA requires a long search time and a large amount of calculation, which leads to a long overall solution time. In the design of nanophotonic devices, ACA is suitable for combinatorial optimization and continuous function optimization. The whole process of the algorithm is intuitive, but it takes a long time to solve.
For swarm intelligence algorithms, the overhead of each individual in the system is very small, and the functions that each individual realizes are very simple, so the execution time of each individual is short. Therefore, the implementation is relatively simple, and it is convenient for researchers to program and parallelize on a computer. However, parameter sensitivity is a problem that needs attention, because improper parameter selection will increase the time cost and complexity of subsequent calculations.
5. Nanophotonic Devices Based on Individual Inspired Algorithms
5.1 Simulated annealing algorithm
The SAA was first introduced by Kirkpatrick et al. in 1983, mainly for discrete optimization problems. Originating from the physical process in which a crystalline solid slowly cools down from a relatively high temperature and gradually forms a regular crystal configuration during annealing, the algorithm provides a strategy to escape local optima in the hope of reaching the global optimum.
Figure 17 illustrates the flow chart of simulated annealing. The algorithm begins with a customized initial temperature, which is lowered at a given rate at the end of each iteration. At each temperature, a newly found solution is compared with the current one based on a given objective function. A better solution is always accepted, while a worse solution with a higher objective function value can also be accepted according to the Metropolis criterion; that is, the algorithm accepts a worse solution with the probability $$P = \exp\!\left(-\frac{\Delta E}{T}\right),$$ where $\Delta E$ is the increase of the objective function value and $T$ is the current temperature.
Figure 17.The flow chart of SAA[98].
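The following is a minimal simulated-annealing sketch following the flow chart above, applied to a toy one-dimensional objective (the objective, initial temperature, cooling rate, and trial counts are illustrative choices only).

```python
import numpy as np

rng = np.random.default_rng(0)

def objective(x):
    # Toy multimodal objective to minimize (illustrative only).
    return 0.1 * x**2 + np.sin(3 * x)

x = rng.uniform(-10, 10)                  # random initial guess
fx = objective(x)
T, T_min, cooling = 10.0, 1e-3, 0.95      # initial temperature, stop temperature, cooling rate

while T > T_min:
    for _ in range(50):                   # trials per temperature
        x_new = x + rng.normal(0, 1.0)    # propose a neighboring solution
        f_new = objective(x_new)
        # Metropolis criterion: always accept improvements, sometimes accept worse ones.
        if f_new < fx or rng.random() < np.exp(-(f_new - fx) / T):
            x, fx = x_new, f_new
    T *= cooling                          # lower the temperature

print("final solution:", x, "objective:", fx)
```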
Different from swarm intelligence algorithms, SAA has a simple structure that allows it to be applied under various circumstances. As another advantage, SAA requires no knowledge of the specific problem and is thus robust to a random initial guess. The convergence of SAA has been proven with strict mathematical demonstration.
The analogous physical annealing process suggests setting a high initial temperature to avoid an insufficient cooling process, that is, a loss of the ability to escape local minima. But this wastes computing budget, because the algorithm loses the ability to judge the quality of newly found solutions and accepts all of them until the excessively high initial temperature cools to a critical temperature. The critical temperature represents a balance point at which better objective function values are preferred, but the temperature is still warm enough to tunnel through such solutions. Since the algorithm needs no knowledge of the problem, we have no a priori idea of an appropriate initial temperature. In that case, experiments are needed to identify the initial temperature, and such a method was proposed by Basu et al.
Due to the mechanism of SAA, a large computing budget is always expected in the search for the optimum. The situation deteriorates even more with an excessive initial temperature. Considering the efficiency and computing time required in the field of nanophotonic devices, it is not appropriate to employ such a time-consuming algorithm alone, which might be the reason why SAA has not been widely used to design nanophotonic devices. However, strategies such as combining SAA with other algorithms to improve its efficiency can still be a good option when it comes to devices with discrete parameters to be optimized.
The use of SAA for optical inverse design was proposed by Hara.
Figure 18.Nanophotonic devices designed by SAA. (a) A schematic of the twisted light emitter. (b) Details of the structure parameters.
Figure 19.Nanophotonic devices designed by SAA. (a) Schematic of the photonic spin element. Incident light is coupled into different waveguides according to the spin states. (b) The core component of an optical element. The design area is divided into 288 pixels. The green blocks stand for optimized structures filled with silicon and the white blocks stand for air. (c) The measured output power at different ports when the polarization of incident light varies[103].
5.2 Hill-climbing algorithm
The hill-climbing algorithm is a local search algorithm. Its advantage is that it does not need to traverse the whole solution space to reach its highest point; instead, it heuristically selects neighboring nodes with higher values, which greatly improves the efficiency.
Figure 20.Nanophotonic devices designed by the hill-climbing algorithm. (a) An example of the target function in which the difficulties of hill climbing are shown. (b) The schematic of the photonic crystal split-beam nanocavity. R1, R2, and R3 are optimized by the algorithm. (c) Experimental transmission spectrum of the split-beam cavity under 0.6 mW input power over the whole measurement range for the 2nd TE mode individually and (d) for the 4th TE mode individually[105].
Figure 21.(a) Flowchart of the hill climbing algorithm. (b) An example of the target function in which the difficulties of hill climbing are shown.
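A minimal hill-climbing sketch on a toy objective is given below (the objective, step size, and neighborhood are arbitrary illustrative choices); it also shows how the algorithm stalls at whichever local maximum is nearest to the starting point.

```python
import numpy as np

rng = np.random.default_rng(0)

def objective(x):
    # Toy objective to maximize, with several local maxima (illustrative only).
    return -0.05 * x**2 + np.cos(2 * x)

x = rng.uniform(-8, 8)            # starting point determines which peak is reached
step = 0.05

while True:
    # Evaluate the two neighbors and move to the better one, if it improves.
    neighbors = [x - step, x + step]
    best = max(neighbors, key=objective)
    if objective(best) <= objective(x):
        break                      # no uphill neighbor: stop at a local maximum
    x = best

print("local maximum found at x =", x, "objective =", objective(x))
```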
The optimization of nanophotonic devices is often a complex problem. It is necessary to find the optimal solution in the full parameter space, and the form of the objective function is often complicated. Therefore, hill climbing is not an excellent method for designing nanophotonic devices. However, when the initial structure has been proved to possess an effective function, using hill climbing can further improve the performance of the device. In the design of a photonic crystal split-beam nanocavity, for example, hill climbing was used to fine-tune a few structural parameters (R1, R2, and R3 in Fig. 20) of an already effective initial structure[105].
The hill-climbing algorithm is a relatively basic algorithm that is easy to start with. However, with the development of intelligent algorithms, more complex algorithms have obvious advantages in the design of nanophotonic devices and are more widely adopted.
5.3 Tabu search
The TS algorithm is a metaheuristic local search method that uses a memory structure, the tabu list, to prevent the search from revisiting recently explored solutions and thus to escape local optima.
In order to avoid repeated searching, a flexible “memory” technique, the establishment of the tabu list, is used in the TS search to record and select the optimization process that has been performed to guide the next search direction. The tabu list has an associated size, which can be a fixed size or change during the iterative process and can be visualized as a window on accepted moves. The moves that tend to undo moves within this window are forbidden.
Figure 22.The flowchart of TS.
The advantage of TS is that it provides a very effective way to jump out of local optimal solutions, and it has a fast convergence speed, finding the optimal solution with fewer iterations. Since TS does not guarantee traversal of the full parameter space, it may still end up at a local optimal solution. The search path is determined by the direction from the current solution to its neighborhood, so the structure of the neighborhood, that is, the mapping relationship between the current solution and its neighbors, is particularly important.
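A minimal tabu-search sketch on a small discrete problem is shown below (the objective, neighborhood, tabu-list size, and iteration count are all arbitrary illustrative choices).

```python
import numpy as np
from collections import deque

rng = np.random.default_rng(0)

def objective(x):
    # Toy objective over integer points, to maximize (illustrative only).
    return -(x - 17) ** 2 + 5 * np.cos(x)

x = int(rng.integers(-50, 50))            # initial solution
best_x, best_f = x, objective(x)
tabu = deque(maxlen=8)                    # tabu list: window of recently visited solutions

for _ in range(200):
    # Neighborhood: integer steps around the current solution, excluding tabu moves.
    neighbors = [x + d for d in (-3, -2, -1, 1, 2, 3) if (x + d) not in tabu]
    if not neighbors:
        break
    x = max(neighbors, key=objective)     # move to the best non-tabu neighbor,
    tabu.append(x)                        # even if it is worse than the current solution
    if objective(x) > best_f:             # keep track of the best solution seen so far
        best_x, best_f = x, objective(x)

print("best solution:", best_x, "objective:", best_f)
```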
Gagnon et al. used the TS algorithm to solve inverse design problems in integrated photonics.
Figure 23.Nanophotonic devices designed by TS. (a) Basic photonic lattice configuration for the beam shaping problem. (b) Best possible trade-off between the amplitude and the phase profile of the beam in the beam shaping problem[109]. (c) The |Ez| field profile (arbitrary units) and comparison of orthogonal polarization components along target plane of optimized TM polarized Gaussian beam. (d) The |Hz| field profile (arbitrary units) and comparison of orthogonal polarization components along target plane of optimized TE polarized Gaussian beam[110].
TS has its opportunities in optimizing nanophotonic devices, especially when the parameter space is finite with discrete numeric values. However, due to relatively few reports, the prospects of this field need to be further explored.
6. Nanophotonic Devices Based on Other Algorithms
6.1 Direct binary search
As mentioned above, compared with conventional approaches, intelligent algorithms are beneficial to the design of compact devices and can search the full parameter space. As one of the crucial algorithms, the DBS algorithm has drawn more and more attention recently. DBS is an iterative search algorithm that was first used for the synthesis of digital holograms.
Figure 24.The flow chart of DBS algorithm[112].
With the development of intelligent optimization algorithms, the DBS algorithm has found more application domains, and there are some improved versions of the DBS algorithm. The modified version of the DBS algorithm operates in an iterative fashion. In the application of this method, the device should first be discretized into “pixels”. The possible pixel states are two different materials, and the two states are denoted by 1 and 0. During each iteration of the DBS algorithm, each pixel is toggled between these two states, and the pixel to be perturbed is chosen at random. Then, a figure of merit (FOM) or objective function is calculated for the resulting device. If the FOM is improved, the perturbation is kept, and the next parameter is perturbed and the FOM is evaluated. If the FOM is not improved, the perturbation is discarded. At this point, an alternate perturbation (of the opposite sign) may be applied and the FOM re-evaluated. This perturbation cycle continues until all the parameters have been addressed, which completes one iteration of the DBS algorithm. Such iterations are continued until the FOM converges to a stable value. An upper bound on the total number of iterations and a minimum change in FOM are defined to enforce numerical convergence.
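The pixel-toggling loop described above can be sketched as follows; this is a generic skeleton in which `compute_fom` is a hypothetical placeholder for the electromagnetic simulation that would be used in a real design flow.

```python
import numpy as np

rng = np.random.default_rng(0)

def compute_fom(pixels):
    """Hypothetical placeholder for the figure of merit. In practice this would run
    an electromagnetic simulation of the pixelated device; here we simply reward
    matching an arbitrary target pattern so the sketch is self-contained."""
    target = np.indices(pixels.shape).sum(axis=0) % 2   # checkerboard target
    return np.mean(pixels == target)

pixels = rng.integers(0, 2, size=(20, 20))   # device discretized into binary pixels
fom = compute_fom(pixels)

max_iters, min_delta = 50, 1e-6
for it in range(max_iters):
    fom_start = fom
    order = rng.permutation(pixels.size)     # visit every pixel once, in random order
    for idx in order:
        i, j = np.unravel_index(idx, pixels.shape)
        pixels[i, j] ^= 1                    # toggle the pixel between the two states
        new_fom = compute_fom(pixels)
        if new_fom > fom:
            fom = new_fom                    # keep the perturbation
        else:
            pixels[i, j] ^= 1                # discard it (toggle back)
    if fom - fom_start < min_delta:          # FOM has converged to a stable value
        break

print("final FOM:", fom)
```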
The algorithm provides an effective approach to designing on-chip nanophotonic devices, such as the design of diffractive optics.
Figure 25.Nanophotonic devices designed by DBS. (a) Panel a, structure diagram of a free-space to multi-mode waveguide coupler and polarization splitter; panels b and c are simulated time-averaged intensity distribution for light polarized along X and that polarized along Y, respectively[120]. (b) The structure diagram of a polarization splitter. (c) and (d) The simulated steady-state intensity distributions for TE and TM polarized light at the design wavelength of 1550 nm, respectively[113]. (e) and (f) Reference coupled system and the cloak for micro-ring resonator[124].
In the device designs, they made use of the concept of free-form metamaterials and found that allowing the geometry of the metamaterials to be freely optimized enables highly functional devices. Moreover, nanopatterning makes it possible to engineer the refractive index in space at a deep sub-wavelength scale. In this way, devices that achieve high-efficiency mode conversion in an extremely small area become feasible. They then designed a polarization beam splitter with an ultra-compact footprint in the same way, which is shown in Fig. 25(b), and the simulated steady-state intensity distributions for TE and TM polarized light at the design wavelength of 1550 nm are shown in Figs. 25(c) and 25(d), respectively.
With the development of the photonic integrated circuit, a higher density integration is required. One of the options to increase integration density is to decrease the spacing between the individual devices. An optical waveguide in the plane of the photonic integrated circuit is one of the most fundamental structures. However, the integration density of the waveguide is limited by the leakage of light from one waveguide to its neighbor if the spacing between them is too small. The DBS algorithm is employed to design an integrated cloak with a footprint of just a few micrometers to decrease this spacing without considerably increasing cross talk.
The DBS algorithm was also used to optimize the structure of the nanohole distribution. The microscope image of the circuit with a four-stage cascaded crossing is shown in Fig. 26(a), and the zoomed-in SEM image of the nanostructured crossing is shown in Fig. 26(b). The transmission spectra of the cascaded crossing were measured and normalized, as shown in Fig. 26(c). Moreover, Han et al. theoretically designed three power splitters based on a photonic-crystal-like metamaterial structure using the DBS algorithm.
Figure 26.Nanophotonic devices designed by DBS. (a) The top-view microscope image of the mode-division multiplexing circuit (top), and the lower left corner is the microscope image of the four-cascaded crossing[126]. (b) The scanning electron microscope image. (c) The measured transmission spectra for the mode-division multiplexing circuit. (d) The top view of the 1 × 4 power splitter (top), and the bottom is optical field distribution[129]. (e) Excess loss of each output port.
The DBS algorithm is a simpler iterative algorithm for the design of nanophotonic devices. The discrete structure generated by DBS algorithms is more favorable to the fabrication using traditional manufacturing techniques like focused ion beam milling or electron beam lithography. However, the DBS algorithm has some limitations. First, the algorithm is guaranteed to converge, but not necessarily to a global minimum. It inherently produces a suboptimal result, as the DBS algorithm converges to the first local minimum during the search process. Second, it is computationally expensive and suitable for discrete solution space and small parameter space. The cost of the calculation and the probability of the DBS algorithm falling into the local optimal value will increase as the search space increases. Third, the algorithm is sensitive to the starting point. In view of the above analysis, there is an urgent need to develop an algorithm to design the optimal and multi-function integrated device.
6.2 Topology optimization
Topology optimization is a mathematical method for optimizing the distribution of materials in a given area according to given loads, constraints, and performance indicators. It is one of the most promising branches of structural optimization, offering greater design freedom and design space. Continuous topology optimization methods include the homogenization method, the variable density method, the level set method, etc. The homogenization method uses the finite element method to discretize the design area and assumes that the entire design space consists of microstructure units (unit cells) similar to a “stomata distribution”. The unit cells are evenly distributed and of the same size before the optimization starts. In the process of topology optimization, the unit cell density distribution changes; that is, the unit cell density in high-stress areas becomes larger while the unit cell density in low-stress areas becomes smaller. A load-bearing structure is formed during the optimization process, which is “dense” in high-stress areas and “sparse” in low-stress areas. When the iterative calculations are completed, a reasonable minimum density is defined, and the areas in the design space where the unit cell density is lower than this minimum are removed to produce a weight-optimized load-bearing structure that uses the material most effectively. The variable density method expresses the relationship between the relative density of an element and the elastic modulus of the material in the form of a density function of continuous variables, seeks the best force transmission route of the structure, and optimizes the distribution of materials in the design area. It has the advantages of easy program implementation, high calculation efficiency, and good calculation accuracy. However, the result of this method has fuzzy boundaries. The level set method is discussed below. The topology optimization of discrete structures is mainly based on the ground structure method, using different algorithms to solve the problem. Topology optimization is more and more widely used due to its advantages.
Figure 27.(a) The structure of the topology optimization algorithm used in the work. (b) The 3D model of gold nanoparticle dimer with predefined key parameters in geometry and material.
The level set method is a numerical technique for interface tracking and shape modeling. One advantage of the level set method is that curves and surfaces can be numerically calculated on a Cartesian grid without parameterizing them (the so-called Eulerian approach). Another advantage is that it is easy to track topology changes of the object; for example, an object may be divided into two parts, merged into one, or develop a new cavity or a new entity. All of these make the level set method a powerful tool for modeling time-varying objects, such as the expansion of an airbag or oil droplets falling into water. However, the level set function needs to be updated by solving a PDE, and during the process it must be periodically re-initialized to keep this update well behaved, which can greatly reduce the convergence speed of the optimization or even prevent convergence.
As an example, the optimal design of photonic bandgaps for 2D square lattices has been considered.
Let $\phi(\mathbf{x})$ be the level set function whose zero contour defines the interface between the two dielectric materials in the unit cell.
The main approach is to evolve the level set function, and hence the dielectric interface, iteratively so as to enlarge the bandgap, recomputing the band structure at each step.
The evolution of the dielectric distribution is shown in Fig. 28(a). The change of the bandgap as the number of iterations increases is shown in Fig. 28(b). The final band structure for maximizing the bandgap between two adjacent bands is shown in Fig. 28(c).
Figure 28.Nanophotonic devices designed by the level set method. (a) The evolution of the dielectric distribution[133]. (b) The bandgap versus the iteration. (c) The final band structure with the largest bandgap between two adjacent bands.
The level set method can calculate the curves and surfaces in the evolution process numerically on the Cartesian grid without parametric curves and surfaces. It has a larger application space, and it is believed that the level set algorithm can solve more problems.
6.3 Monte Carlo method
The Monte Carlo method, also known as a statistical simulation method, is a very important numerical calculation method guided by the theory of probability and statistics; it was proposed in the mid-1940s with the development of science and technology and the invention of electronic computers. The Monte Carlo method uses random numbers to solve many computing problems and is widely used in financial engineering, macroeconomics, computational physics, and other fields.
The Monte Carlo method usually solves mathematical problems by constructing random numbers that conform to certain rules. It is an effective method for finding numerical solutions to problems that are too complex to solve analytically or that have no analytical solution at all. The most common application of the Monte Carlo method in mathematics is the Monte Carlo integral.
Applying the Monte Carlo method to practical problems involves two main parts: constructing a probabilistic model whose statistical quantity (such as an expectation) corresponds to the solution of the problem, and generating random samples from this model to estimate that quantity.
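As a simple illustration of both parts, the sketch below estimates the integral $\int_0^1 \sin(\pi x)\,\mathrm{d}x$ (exact value $2/\pi$) by averaging the integrand over uniform random samples; the sample count is an arbitrary choice.

```python
import numpy as np

rng = np.random.default_rng(0)

# Part 1: model the integral as an expectation, I = E[sin(pi * X)] with X ~ U(0, 1).
# Part 2: estimate the expectation from random samples.
n_samples = 100_000
x = rng.uniform(0.0, 1.0, n_samples)
samples = np.sin(np.pi * x)

estimate = samples.mean()
std_error = samples.std(ddof=1) / np.sqrt(n_samples)   # statistical error of the estimate

print(f"Monte Carlo estimate: {estimate:.5f} +/- {std_error:.5f}")
print(f"exact value 2/pi    : {2 / np.pi:.5f}")
```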
With the help of computer technology, the Monte Carlo method has many advantages; it is simple and fast, eliminating the need for complicated mathematical derivation and calculation. Moreover, the Monte Carlo method has strong adaptability, and the geometric complexity of the problem has little influence on it. It is believed that the Monte Carlo method will have more applications in the field of nanophotonics.
7. Summary and Outlook
In this review article, we extensively discuss a variety of intelligent algorithms including deep learning methods, the gradient-based inverse design method, swarm intelligence algorithms, individual inspired algorithms, and other intelligent algorithms, as well as nanophotonic devices designed using these algorithms. Some representative examples are used to analyze various intelligent algorithms for different situations. In many practical applications, intelligent algorithms are practical methods to deal with various challenging problems. The advantages, disadvantages, characteristics, and suitable devices of the algorithms discussed in this paper are presented in Table 1.
Table 1. Advantages, disadvantages, characteristics, and suitable devices of the intelligent algorithms discussed in this review.
Compared with traditional design methods, intelligent algorithms are universal and efficient. For example, the advantages of deep learning are that, once trained, it takes less time (i.e., less computational cost) than traditional algorithms and is more likely to find better solutions. In addition, compared with traditional algorithms, the deep learning method can realize inverse design more easily. ANNs have many typical architectures and strong flexibility; according to the design requirements of the device and the problems encountered during training, an appropriate neural network can be chosen for the optimal design. However, deep learning also has drawbacks. First, the design of nanophotonic devices is non-convex, and there is no guarantee that the designed devices are optimal. Second, preparing training sets and training neural networks require a lot of computing time and cost, especially when dealing with complex learning tasks. Third, further analysis using trained neural networks is difficult because the learning mechanisms of ANNs (note that they are sometimes useful) operate as black boxes. However, useful information about the features of photonic structures can be extracted by introducing proper techniques such as the latent space.
The gradient-based inverse design method can automatically design nanophotonic devices and only requires the user to input high-level parameters. This method can handle a large parameter space and design devices using the full space of manufacturable devices, and it often requires fewer simulations than GA or PSO because it does not rely on parameter sweeps or random perturbations to find its minima. This method can be used to design any passive, linear photonic device.
However, the resulting design usually presents a continuous topography, and some very small structural components may be formed during the inverse design process, which presents a challenge for sample fabrication. In addition, the gradient-based inverse design method usually produces only local optimal solutions and cannot realize a true global optimum.
Swarm intelligence algorithms have certain robustness and strong evolutionary or search ability. GA can not only solve single-objective optimization problems, but also plays a greater role in multi-objective optimization problems. It has the characteristics of group search and is suitable for solving complex optimization problems, such as those in which multiple system parameters must be optimized at the same time or in which the application problem has no clear, unique optimal value. Moreover, GA is scalable and easy to combine with other algorithms. However, the search efficiency of GA in the later stage of evolution is relatively low, and it is prone to premature convergence. Although the local search ability of the genetic algorithm is poor, it is often used in combination with other algorithms to improve its performance, owing to its easy parallel implementation. PSO approaches the optimal solution quickly and can effectively optimize the parameters of a system, and its process is simple and easy to understand. The advantage of PSO is that it can be applied to continuous function optimization problems. Its main drawbacks are that it is sensitive to parameter settings, that it easily produces premature convergence when dealing with complex problems with multiple optima, and that its local optimization ability is poor. PSO falls into local minima mainly because of the loss of diversity in the search space; it can be improved by combining it with other algorithms or by adding mutation operations. PSO has been used to optimize nanostructures and design nanophotonic devices, and it can be used to optimize multidimensional problems. The ACA is suitable for combinatorial optimization and continuous function optimization. The whole algorithm process is intuitive and easy to understand, but it takes a long time to solve. ACA is robust in solving performance and easy to implement in parallel; therefore, other algorithms are usually combined with ACA to improve the overall performance and to design more ideal nanophotonic devices.
Individual inspired algorithms can give a good solution within an acceptable time, but cannot guarantee that it is optimal. The calculation process of SAA is simple, and it has strong universality and robustness. However, it is very sensitive to customized parameters, especially the initial temperature. When faced with a large number of parameters to be optimized, SAA randomly selects new solutions from the solution space, which weakens its performance; with many unknown parameters, the search efficiency and the possibility of finding the optimal solution decrease. The hill-climbing algorithm is more intuitive and has small memory requirements, but it cannot solve large-scale, multi-constraint problems. The TS algorithm has a fast convergence speed and needs few iterations, but the results depend on the initial solution and the neighborhood structure.
The DBS algorithm is a simple iterative algorithm for designing nanophotonic devices. The discrete structures generated by the DBS algorithm are more compatible with traditional manufacturing techniques such as focused ion beam milling or electron beam lithography. However, the DBS algorithm has some limitations. First, the algorithm is guaranteed to converge, but not necessarily to the global minimum; when it converges to the first local minimum during the search, it inherently produces a suboptimal result. Second, it is suitable for discrete solution spaces and small parameter spaces because of its large computational cost; the calculation cost and the probability of falling into a local optimum increase as the search space increases. Third, the algorithm is sensitive to the starting point. Topology optimization offers more design freedom and design space. Among its variants, the level set method used in the design of nanophotonic devices can numerically calculate the evolving curves and surfaces on a Cartesian grid without parameterizing them, but the process is more complex and requires a certain mathematical foundation. The level set equation needs to be updated with a partial differential equation and periodically re-initialized to keep this update well behaved, which greatly reduces the convergence rate of the optimization or even prevents convergence. The Monte Carlo method has strong adaptability and can solve probability and statistics problems easily and quickly. However, the number of samples must be large enough, and the calculation process is long.
As the need for nanophotonic devices to achieve more functions grows, intelligent algorithms, especially the currently popular deep learning method with its higher efficiency and better performance, are expected to play an increasingly important role in the design of nanophotonic devices.
[16] M. H. Tahersima, K. Kojima, T. Koike-Akino, D. Jha, B. Wang, C. Lin, K. Parsons. Deep neural network inverse design of integrated photonic power splitters. Sci. Rep., 9, 1368(2019).
[18] B. Wu, K. Ding, C. T. Chan, Y. Chen. Machine prediction of topological transitions in photonic crystals(2017).
[20] I. J. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, Y. Bengio. Generative adversarial nets, 1(2014).
[23] Y. Tang, K. Kojima, T. Koike-Akino, Y. Wang, P. Wu, M. TaherSima, D. Jha, K. Parsons, M. Qi. Generative deep learning model for a multi-level nano-optic broadband power splitter, Th1A.1(2020).
[24] H. Zhou, Y. Zhao, G. Xu, X. Wang, Z. Tan, J. Dong, X. Zhang. Chip-scale optical matrix computation for PageRank algorithm. IEEE J. Sel. Top. Quant, 26, 8300910(2020).
[29] W. J. Brouwer, J. D. Kubicki, J. O. Sofo, C. L. Giles. An investigation of machine learning methods applied to structure prediction in condensed matter(2014).
[39] W. Ma, Y. Liu. A data-efficient self-supervised deep learning model for design and characterization of nanophotonic structures. Sci. China Phys. Mech. Astron., 63, 284212(2020).
[40] Z. Liu, D. Zhu, K. Lee, A. S. Kim, L. Raju, W. Cai. Compounding meta-atoms into meta-molecules with hybrid artificial intelligence techniques. Adv. Mater., 32, 1904790(2019).
[47] E. Khoram, A. Chen, D. Liu, L. Ying, Q. Wang, M. Yuan, Z. Yu. Nanophotonic media for artificial neural inference. Opt. Lett., 7, 823(2019).
[48] J. Chang, V. Sitzmann, X. Dun, W. Heidrich, G. Wetzstein. Hybrid optical-electronic convolutional neural networks with optimized diffractive optics for image classification. Sci. Rep., 8, 12324(2018).
[51] Y. Qu, L. Jing, Y. Shen, M. Qiu, M. Soljačić. Basic instincts. ACS Photon., 6, 1168(2019).
[59] K. Chadan, P. C. Sabatier, R. G. Newton. Inverse Problems in Quantum Scattering Theory(1988).
[60] A. Y. Piggott, J. Petykiewicz, L. Su, J. Vučković. Fabrication-constrained nanophotonic inverse design. Sci. Rep., 7, 1786(2017).
[67] K. Y. Yang, J. Skarda, M. Cotrufo, A. Dutt, G. H. Ahn, M. Sawaby, D. Vercruysse, A. Arbabian, S. Fan, A. Alù, J. Vučković. Inverse-designed non-reciprocal pulse router for chip-based LiDAR. Nat. Photon., 14, 369(2020).
[68] P. Camayd-Muñoz, G. Roberts, C. Ballew, M. Debbas, A. Faraon. Inverse designed shape-reconfigurable multifunctional photonics, FW3B.2(2020).
[81] K. Liao, T. Gan, X. Hu, Q. Gong. AI-assisted on-chip nanophotonic convolver based on silicon metasurface. Nanophotonics, 9, 3315(2020).
[84] N. Padhye. Topology optimization of compliant mechanism using multi-objective particle swarm optimization, 1831(2008).
[97] M. Čepin. Assessment of Power System Reliability(2011).
[101] Y. T. Lu, Y. Q. Zhou. Design of multilayer microwave absorbers using hybrid binary lightning search algorithm and simulated annealing. Photon. Network Commun., 78, 75(2017).
[104] S. J. Russell, P. Norvig. Artificial Intelligence: A Modern Approach(2003).
[107] E.-G. Talbi. Metaheuristics: From Design to Implementation(2009).
[123] A. Majumder, B. Shen, R. Polson, T. Andrew, R. Menon. An ultra-compact nanophotonic optical modulator using multi-state topological optimization(2017).
[136] A. Keller, S. Heinrich, H. Niederreiter. Monte Carlo and Quasi-Monte Carlo Methods(2006).
[137] K. Binder. Applications of the Monte Carlo Method in Statistical Physics(1987).
[138] R. Y. Rubinstein, D. P. Kroese. Simulation and the Monte Carlo Method(2008).
Lifeng Ma, Jing Li, Zhouhui Liu, Yuxuan Zhang, Nianen Zhang, Shuqiao Zheng, Cuicui Lu, "Intelligent algorithms: new avenues for designing nanophotonic devices [Invited]," Chin. Opt. Lett. 19, 011301 (2021)
Category: Integrated Optics
Received: Jun. 10, 2020
Accepted: Sep. 4, 2020
Published Online: Dec. 28, 2020
The Author Email: Cuicui Lu (cuicuilu@bit.edu.cn)