## Abstract

Optimization techniques have been indispensable for designing high-performance meta-devices targeted to a wide range of applications. In fact, today optimization is no longer an afterthought and is a fundamental tool for many optical and RF designers. Still, many devices presented in recent literature do not take advantage of optimization techniques. This paper seeks to address this by presenting both an introduction to and a review of several of the most popular techniques currently used for meta-device design. Additionally, emerging techniques like topology optimization and multi-objective optimization and their context to device design are thoroughly discussed. Moreover, attention is given to future directions in meta-device optimization such as surrogate-modeling and deep learning which have the potential to disrupt the fields of optical and radio frequency (RF) inverse-design. Finally, many design examples from the literature are presented and a flow-chart that provides guidance on how best to apply these optimization algorithms to a given problem is provided for the reader.

© 2019 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## 1. Introduction

The desire to maximize the performance achieved by electromagnetic (EM) devices [1,2] has long necessitated optimization within the design process. Furthermore, rapid advancements in fabrication technologies, physical modeling, and available computing power over the past few decades have given EM designers the power to accurately model and manufacture devices faster than ever. Moreover, these advances have also enabled modeling of more diverse and complicated problems than previously imaginable. Notably, while simulation times may range from a few milliseconds for simple geometrical optics ray traces to a few seconds, minutes, hours, or even days for the most complicated and electromagnetically large full-wave simulations, all EM devices can benefit from optimization of some kind.

To this end, parameter sweeps are typically the first step employed by designers towards optimization. This is usually considered a “hand-tuning” procedure, but unfortunately it is sometimes the only step taken in the optimization process. While parametric sweeps are often useful in establishing parameter bounds for optimization and revealing performance trends, they tend to be extremely inefficient if employed to optimize a design, especially if the response surface (*i.e.*, a surface that maps the relationships between input variables and output objectives) is hyper-dimensional. This is only exacerbated when multiple goals are considered, or non-linear constraints are applied on the input parameters. Essentially, humans have limited spatial reasoning capabilities that restrict our ability to think hyper-dimensionally.
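To make the inefficiency of exhaustive sweeps concrete, the short sketch below counts the evaluations a full-factorial sweep requires. The function names, the ten points per parameter, and the 60 s simulation time are illustrative assumptions, not values from any cited design.

```python
def sweep_evaluations(points_per_parameter, n_parameters):
    """Number of designs in a full-factorial sweep (every combination)."""
    return points_per_parameter ** n_parameters


def sweep_hours(points_per_parameter, n_parameters, seconds_per_evaluation):
    """Wall-clock hours for the sweep if evaluations run one after another."""
    n_evals = sweep_evaluations(points_per_parameter, n_parameters)
    return n_evals * seconds_per_evaluation / 3600.0


# Assumed example: 10 samples per parameter and a 60 s simulation.
for n in (2, 4, 6):
    print(n, sweep_evaluations(10, n), round(sweep_hours(10, n, 60.0), 1))
```

Under these assumptions, a six-parameter sweep already needs one million simulations, which is why sweeps are better suited to setting bounds than to finding optima.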

Fortunately, computers are very well suited to deal with hyper-dimensional mathematics and are the natural choice to solve these complex optimization problems. However, a computer’s ability to find optimum solutions efficiently is ultimately limited by the dimensionality and complexity of the governing problem and the power of the chosen optimization algorithm. Many algorithms have been developed over the years for various applications. Historically, the first optimizations were based on local techniques, often exploiting function gradients to inform the next design chosen for evaluation. Algorithms such as Newton’s method [3,4], gradient descent [5], and conjugate gradients [6] all use gradient information in different ways to aid in finding local minima. While these algorithms are very general purpose, some optimization techniques are closely associated with a specific application. Take, for example, the damped least squares (DLS) algorithm, which has been around since the 1960s [7] and has been very popular for use in lens design for many years [2,8]. This algorithm, while still considered a local technique, introduced a damping term that aids in convergence near the local minima and can assist in escaping from the many local minima that exist in typical lens design problems. However, today’s optical engineers are afforded many more degrees of design freedom than in decades past (*e.g.*, high-order aspheric terms, free-form optics, gradient-index (GRIN) materials [9,10], and metasurfaces [11,12]), which has necessitated the investigation of more advanced optimization algorithms [13–15]. Optimization has also long been of interest in the RF and antenna communities [1]. Generally, RF and antenna problems contain fewer variables and have response surfaces with fewer local minima but tend to be much more computationally demanding than lens design problems (*i.e.*, using full-wave techniques as opposed to ray tracing).
Therefore, like many optical design problems, RF and antenna problems often start with a known good solution and use that as a starting point for optimization.
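As a rough illustration of how a damping term stabilizes a gradient-based local search, the sketch below applies a damped least squares update to a toy two-parameter curve fit. It is a minimal sketch and not a lens-design implementation: the exponential model, the function names, and the damping schedule (halve on success, quadruple on failure) are all assumptions chosen for brevity.

```python
import math


def residuals(p, ts, ys):
    """Model-minus-target residuals for the toy model a * exp(b * t)."""
    a, b = p
    return [a * math.exp(b * t) - y for t, y in zip(ts, ys)]


def jacobian(p, ts):
    """Rows of d(residual)/d(a, b) for each sample point."""
    a, b = p
    return [(math.exp(b * t), a * t * math.exp(b * t)) for t in ts]


def damped_least_squares(p0, ts, ys, damping=1e-2, iters=50):
    p = list(p0)
    cost = sum(r * r for r in residuals(p, ts, ys))
    for _ in range(iters):
        r = residuals(p, ts, ys)
        J = jacobian(p, ts)
        # Damped normal equations: (J^T J + damping * I) dp = -J^T r
        a11 = sum(j[0] * j[0] for j in J) + damping
        a12 = sum(j[0] * j[1] for j in J)
        a22 = sum(j[1] * j[1] for j in J) + damping
        g1 = sum(j[0] * ri for j, ri in zip(J, r))
        g2 = sum(j[1] * ri for j, ri in zip(J, r))
        det = a11 * a22 - a12 * a12
        dp = (-(a22 * g1 - a12 * g2) / det, -(a11 * g2 - a12 * g1) / det)
        trial = [p[0] + dp[0], p[1] + dp[1]]
        trial_cost = sum(x * x for x in residuals(trial, ts, ys))
        if trial_cost < cost:
            p, cost = trial, trial_cost
            damping *= 0.5  # step accepted: move toward a Gauss-Newton step
        else:
            damping *= 4.0  # step rejected: shorten and steepen the step
    return p, cost


# Synthetic, noise-free data generated from a = 2.0, b = -1.5 (assumed values).
ts = [0.0, 0.5, 1.0, 1.5, 2.0]
ys = [2.0 * math.exp(-1.5 * t) for t in ts]
fit, final_cost = damped_least_squares((1.0, 0.0), ts, ys)
print(fit, final_cost)
```

With large damping the update behaves like a short gradient-descent step, and with small damping it approaches a Gauss-Newton step, which is the trade-off the damping term is meant to manage.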

However, good starting solutions are not always known, especially in the case of true inverse-design problems (*i.e.*, problems in which an optimizer is tasked with finding a design that achieves a given set of performance criteria while obeying all design constraints [16]). Moreover, due to the computational cost of full-wave evaluations, finite-difference gradient calculations are often infeasible, which can preclude the use of gradient-based optimization techniques. Therefore, for many problems, the ideal optimizer would routinely find the global minimum of a multimodal cost function with neither the aid of a good starting point nor gradient information, and do so in as few full function evaluations as possible. To this end, global optimization techniques such as the Genetic Algorithm (GA) [17], Particle Swarm Optimization (PSO) [18], Differential Evolution (DE) [19,20], and Covariance Matrix Adaptation Evolution Strategy (CMA-ES) [21] have all seen considerable success in the optimization of RF and optical design problems. In fact, global optimization techniques have become so popular due to their ability to discover new and often unintuitive or even unexplainable solutions that competitions are held each year to test and find the most powerful, efficient, and robust algorithms [22]. However, local optimization techniques should not be discounted. Recently, gradient-descent-based local techniques have been exploited by topology optimization in the inverse design of disruptive nanophotonic devices [23]. Furthermore, both local and global optimization techniques have been extended to support true multi-objective optimization [24], which is a powerful emerging technique in meta-device design. However, sometimes local or global techniques alone are not enough to efficiently optimize a given problem.
To this end, surrogate modeling and deep learning show tremendous potential to revolutionize the future of meta-device design by making optimization tractable for time-intensive function evaluations by replacing them with cheaper alternatives.

The organizational structure of this paper is as follows. The second section seeks to assist readers who may be inexperienced with the concepts and algorithms discussed in this manuscript by providing them a framework for understanding how to pair problem types with the appropriate optimization technique. The third section presents an overview of several global optimization algorithms as well as a discussion of designs resulting from their application. The following sections discuss the emerging techniques of topology optimization and then multi-objective optimization in meta-device design. The final section introduces the reader to surrogate-modeling and deep learning techniques which can be exploited to accelerate optimization of computationally-expensive cost functions. Conclusions and closing remarks finish the paper and seek to reinforce the important role that optimization can play in meta-device design.

## 2. Getting started with meta-device optimization

While the no free lunch theorem states that any two optimization algorithms are essentially equivalent when averaged across all possible problems [25], this should not be interpreted as implying that the choice of algorithm is unimportant for a given optimization problem. In fact, we typically only encounter a finite number of problem types in meta-device design and can absolutely select preferential algorithms for the given problem type. For those inexperienced with advanced meta-device optimization, the hardest question to answer may be what optimization algorithm is best suited for a particular problem. The answer to that question depends on factors such as the input space topology, number of input parameters, number of objectives, and computational cost per function evaluation (*i.e.*, full-wave simulation). Moreover, except in a few specific cases, there is no single optimization algorithm that is *perfectly* suited for a particular problem or numerical method (*e.g.*, the finite element method (FEM) or finite-difference time domain (FDTD)). Still, it is possible to greatly narrow down the optimization strategy (*i.e.*, local or global) and specific algorithm (*e.g.*, GA or CMA-ES) recommended for a particular problem by asking a series of additional questions.

Figure 1 presents a meta-device optimization flowchart, which seeks to assist the reader in determining the best optimization strategy for their problem by answering these questions. Starting at the top of the chart, one flows down to the first question: *Is the input space discrete?* A discrete input space implies that there is a finite number of potential solutions to search from to find the optimal solution. This is known as a combinatorial optimization problem and while it is theoretically possible to evaluate all possible combinations to find the optimal design, this can often be intractable due to the number of parameter combinations and function evaluation cost. Fortunately, global optimizers like the GA (see Section 3) can efficiently find high performance designs from large solution spaces. However, in metamaterial design, the GA can often produce designs that have non-contiguous inclusions. Therefore, if a contiguous structure is needed (*e.g.*, a meander-line antenna), the Ant Colony Optimization (ACO) algorithm or Multi-Objective Lazy Ant Colony Optimization (MOLACO) algorithm (see Section 3) is a better choice.

For problems with continuous input space representations (even if the parameters themselves are bound with constraints), the choice of optimizer is more complicated. If there exists a good initial solution, it is usually the best strategy to apply a local optimization technique. From there, if the problem can be cast as a finite element problem (*i.e.*, one in which gradient information is used to directly modify the finite-element representation of the problem) then it is a prime candidate for topology optimization (see Section 4.1). Otherwise, there exist a number of gradient-based algorithms that can be used for optimization such as DLS, Newton’s method [4], and multiobjective Gradient Descent algorithm (MDGA) [26]. If a good starting solution does not exist, then global optimization techniques are usually the best choice (see Section 3). Flowing down from the global optimization bubble, one must determine how much time is acceptable for optimization. If the time required for a single function evaluation is short enough that evaluating hundreds or thousands of designs is within the acceptable limit determined by the designer, then a range of global optimizers may be directly applied to solve the problem. However, if the function evaluation time is large enough to prohibit direct optimization, then other techniques are needed.

Flowing down the “Prohibitive” segment, one must evaluate if optimization with design tolerance (*i.e.*, robustness) is required. If so, the Multi-Objective with TOLerance (MOTOL) optimization algorithm was developed specifically for this application [27]. By training an *in situ* surrogate model (see Section 5.1) that is paired with a robust multi-objective optimization algorithm (see Section 4.2), MOTOL is able to accurately calculate tolerance while accelerating solution convergence. When tolerance optimization is not needed and previous solution data is available, it can be used in a deep learning (see Section 5.2) procedure to train an external model (*e.g.*, an artificial neural network) which can be used in place of the full function evaluation. If no training data is available *a priori*, then a variety of surrogate modeling (see Section 5.1) techniques can still be used to accelerate the optimization procedure. The following sections discuss many of these algorithms in more detail while providing further insight on their application to a number of interesting meta-device optimization problems.
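The decision path described in this section can be summarized as a small function. This is only a sketch that follows the prose, not a reproduction of Fig. 1. The function name, the argument names, and the returned strings are hypothetical, and naming PSO and CMA-ES for the direct global branch is an illustrative choice among the optimizers of Section 3.

```python
def recommend_strategy(discrete_inputs,
                       needs_contiguous=False,
                       good_start=False,
                       finite_element_castable=False,
                       evaluation_affordable=True,
                       needs_tolerance=False,
                       prior_data=False):
    """Walk the questions in the order the text asks them."""
    if discrete_inputs:
        # Combinatorial problems: GA, unless a contiguous structure is needed.
        if needs_contiguous:
            return "ACO or MOLACO (Section 3)"
        return "GA (Section 3)"
    if good_start:
        # A good initial solution favors local techniques.
        if finite_element_castable:
            return "topology optimization (Section 4.1)"
        return "gradient-based local optimizer (e.g., DLS or Newton's method)"
    if evaluation_affordable:
        # Hundreds to thousands of evaluations fit within the time budget.
        return "global optimizer (e.g., PSO or CMA-ES, Section 3)"
    # Evaluation time is prohibitive for direct global optimization.
    if needs_tolerance:
        return "MOTOL (Sections 4.2 and 5.1)"
    if prior_data:
        return "deep learning model in place of full evaluations (Section 5.2)"
    return "surrogate modeling (Section 5.1)"


print(recommend_strategy(discrete_inputs=True, needs_contiguous=True))
print(recommend_strategy(discrete_inputs=False, evaluation_affordable=False))
```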

## 3. Global optimization

Unless simple geometric features are used to form optical, RF, and meta-device designs and the relationship between the input parameters and cost or fitness functions is simple and well understood, it is generally beneficial to employ a global optimization strategy to obtain the best performance available within the design constraints. Global optimization (GO) differs from local optimization techniques in that the algorithm attempts to find a function’s global minimum (or maximum) as opposed to a local minimum. Many global optimizers are based on evolutionary algorithms [28] which are population-based metaheuristics and typically employ nature-inspired techniques to evolve the population over time (or generations) in an attempt to minimize a cost or maximize a fitness function subject to a given set of design constraints. Common design constraint examples include fabrication tolerances, material properties, and size, weight, power, and cost (SWaP-C) considerations. Interestingly, different choices of constraints can lead to devices that are vastly dissimilar geometrically yet perform similarly electromagnetically. While today there exist numerous global optimization algorithms for designers to choose from, there is no single algorithm that is ideally suited for all problems. However, the efficiency of modern global optimizers has enabled electromagnetic (EM) designers in many instances to treat them as black box optimizers. Readers should be aware that new algorithms are regularly developed and evaluated against standardized test problems to evaluate their efficacy as black box optimizers [22]. Finally, all the algorithms presented in this section are discussed in the context of single objective optimization (SOO) (*i.e.*, the problem is cast into a single cost or fitness function). A new optimization paradigm based on multi-objective optimization is discussed in Section 4.2.

While many global optimization techniques have been developed over the past few decades, the genetic algorithm (GA) is perhaps the most widely used [35]. The GA is an iterative optimization algorithm based on a set (population) of binary strings called chromosomes which represent candidate designs. During the optimization process, the GA intelligently selects the best designs from the previous generation to serve as parents who will generate the next generation of designs. Chromosome pairs chosen from the parent designs are recombined in a process called *crossover* while genetic diversity is maintained through a process known as *mutation*. Once the next generation is determined, the population is evaluated according to the chosen cost or fitness function. This process is repeated until some stopping or convergence criterion is met. Due to its internal binary representation, the GA is well suited for combinatorial optimization problems (*i.e.*, finding an optimal solution from a finite set of solutions). The GA has been popularized by and used extensively in the design of pixelated metallic structures [36], which are the basis for many metamaterial designs [37,38]. In [29], a pixelated multi-layer metamaterial absorber optimized by the GA demonstrated broadband absorption in the mid-wave infrared (MWIR) regime for both TE and TM polarizations over a wide field of view (see Fig. 2(a)). The GA has been used to optimize optical nanoantenna array configurations [30] and shapes [31] in order to maximize field enhancement at a chosen location (see Figs. 2(b)-2(c), respectively). Conversely, in [39] the GA found pixelated reflectarray unit cell designs which achieved a series of requisite reflection phase options while mitigating field enhancement, a limiting factor in the inclusion of metamaterials in high-power microwave applications.
Recent applications of the GA include coding metasurfaces which have demonstrated RCS reduction [40–42], metasurfaces optimized for efficient polarization conversion (see Fig. 2(d)), and phase-gradient beam steering devices (see Fig. 2(e)) [33].
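The selection, crossover, and mutation loop described above can be sketched on a toy pixelated problem. This is a minimal illustration under stated assumptions, not the implementation used in the cited works: the 16-pixel target pattern stands in for an electromagnetic simulation, and the tournament selection, single-point crossover, elitism, and all parameter values are common but assumed choices.

```python
import random

# Hypothetical stand-in for a simulated figure of merit: the number of
# pixels that match an arbitrary 4 x 4 target pattern (flattened).
TARGET = [1, 0, 1, 1, 0, 0, 1, 0, 0, 1, 1, 0, 1, 0, 0, 1]


def fitness(chromosome):
    return sum(1 for bit, want in zip(chromosome, TARGET) if bit == want)


def run_ga(fitness_fn, n_bits, pop_size=30, generations=60,
           crossover_rate=0.9, mutation_rate=0.02, seed=0):
    rng = random.Random(seed)
    pop = [[rng.randint(0, 1) for _ in range(n_bits)] for _ in range(pop_size)]
    history = []
    for _ in range(generations):
        # A real design loop would cache these (expensive) evaluations.
        ranked = sorted(pop, key=fitness_fn, reverse=True)
        history.append(fitness_fn(ranked[0]))
        next_pop = [ranked[0][:]]  # elitism: carry the best design forward
        while len(next_pop) < pop_size:
            # Binary tournament selection of two parents.
            p1 = max(rng.sample(pop, 2), key=fitness_fn)
            p2 = max(rng.sample(pop, 2), key=fitness_fn)
            child = p1[:]
            if rng.random() < crossover_rate:
                cut = rng.randint(1, n_bits - 1)  # single-point crossover
                child = p1[:cut] + p2[cut:]
            # Bit-flip mutation maintains genetic diversity.
            child = [b ^ 1 if rng.random() < mutation_rate else b for b in child]
            next_pop.append(child)
        pop = next_pop
    best = max(pop, key=fitness_fn)
    history.append(fitness_fn(best))
    return best, history


best, history = run_ga(fitness, n_bits=len(TARGET))
print(best, history[0], history[-1])
```

Because the elite design is copied unchanged into each new generation, the best fitness in `history` can never decrease.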

In addition to the GA, many nature-inspired [50,51] optimization algorithms have been proposed that attempt to mimic some natural behavior such as the artificial bee colony (ABC) [52] and bat-inspired [53] algorithms, among others. These algorithms exist under the umbrella of swarm-intelligence algorithms [54] which seek to exploit the collective behavior of decentralized, self-organized, often biologically-inspired, systems in order to optimize complex problems. Ant colony optimization (ACO) [55] is a swarm-intelligence algorithm that exploits stigmergy in ant colonies in order to find optimal solutions to graph-based problems such as job scheduling, vehicle routing, and the traveling salesman problem [56]. While the GA has seen tremendous success in the generation of high performance pixelated electromagnetic and optical structures, these designs are often limited to planar configurations due to the presence of disconnected pixels in the solution. On the other hand, ACO maps the optimal “trail” found by the artificial “ants” in the graph topology to a contiguous structure. For this reason and more, ACO has seen tremendous success in electromagnetic device design, especially in the generation of meander-line antennas [44] (see Fig. 3(a) bottom). Recently, Zhu extended the ACO algorithm to include “lazy” ants [43,57] which greatly improved design diversity and has successfully been used to generate high performance frequency-selective-surface (FSS) structures [43,58]. Furthermore, by leveraging the third dimension (*i.e.*, axial), these structures possess wide field of view (FOV) performance [58] which makes them an attractive candidate for metasurface design in the optical regime [59] (see Fig. 3(a) top).

While ACO and the GA are well suited to combinatorial (*i.e.*, discrete) optimization problems, most optical and EM design problems are continuous functions and thus require other optimization algorithms. Particle swarm optimization (PSO) was introduced in the mid-1990s [18] and has seen extensive application in electromagnetic device optimization [60–63]. PSO is another swarm-intelligence optimization algorithm that was originally constructed to model social behavior and inspired by the movements observed in flocks of birds and schools of fish. When PSO was introduced to the electromagnetics community, it possessed a number of advantages over the GA. Firstly, PSO is a real-valued algorithm and operates with vectors of real numbers instead of binary values as with the GA (later versions of the GA introduced real-valued parameters [64]). Secondly, population members tend to operate more independently and cooperatively. This can be thought of as individual members exploring different parts of the solution space simultaneously and communicating to other members of the “swarm” when they have found a good solution. In fact, PSO was found to outperform the GA when designing negative index metamaterials [65]. Additionally, PSO has seen application to optical meta-device optimization. In [45], PSO was employed to optimize the geometrical parameters and spacing of nanoparticle-based Yagi-Uda antennas (see Fig. 3(b)). PSO has also successfully been used to optimize nanohole-array based metasurfaces for beam steering applications [46] (see Fig. 3(c)). Interestingly, PSO has been found to be a special case of the more general wind-driven optimization (WDO) algorithm, which was later introduced in [66].
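The way swarm members explore independently while sharing their best finds can be sketched with a basic global-best PSO. This is a minimal sketch assuming a commonly used inertia-weight update; the coefficient values, the bound clamping, and the shifted quadratic test function (a stand-in for a simulated cost) are illustrative assumptions rather than settings from the cited studies.

```python
import random


def pso(cost, bounds, n_particles=20, iterations=200,
        inertia=0.7298, c_personal=1.4962, c_social=1.4962, seed=1):
    rng = random.Random(seed)
    dim = len(bounds)
    pos = [[rng.uniform(lo, hi) for lo, hi in bounds] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    pbest = [p[:] for p in pos]              # each particle's best position
    pbest_cost = [cost(p) for p in pos]
    g = min(range(n_particles), key=pbest_cost.__getitem__)
    gbest, gbest_cost = pbest[g][:], pbest_cost[g]  # best position in the swarm
    for _ in range(iterations):
        for i in range(n_particles):
            for d, (lo, hi) in enumerate(bounds):
                # Velocity blends momentum, personal memory, and swarm knowledge.
                vel[i][d] = (inertia * vel[i][d]
                             + c_personal * rng.random() * (pbest[i][d] - pos[i][d])
                             + c_social * rng.random() * (gbest[d] - pos[i][d]))
                # Clamp positions so particles stay inside the parameter bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            c = cost(pos[i])
            if c < pbest_cost[i]:
                pbest[i], pbest_cost[i] = pos[i][:], c
                if c < gbest_cost:
                    gbest, gbest_cost = pos[i][:], c
    return gbest, gbest_cost


def toy_cost(x):
    """Shifted quadratic bowl with its minimum at (0.5, -1.0, 2.0)."""
    centre = (0.5, -1.0, 2.0)
    return sum((xi - ci) ** 2 for xi, ci in zip(x, centre))


best_position, best_cost = pso(toy_cost, bounds=[(-5.0, 5.0)] * 3)
print(best_position, best_cost)
```

Unlike the binary GA, every particle here is a vector of real numbers, so no encoding step is needed for continuous design parameters.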

While global optimizers like the GA and PSO have successfully been applied to a wide range of design problems in the RF and optical regimes, they typically are sensitive to internal parameters that often require tuning on a per-problem basis. Moreover, it is not always clear how best to tune these parameters for a given problem, and adaptive tuning during optimization may be required to maximize the performance of the algorithm. In fact, many studies have investigated optimal control parameter tuning in evolutionary algorithms [67–70]. On the other hand, the Covariance Matrix Adaptation Evolution Strategy (CMA-ES) [21] requires very few user-defined control parameters; typically only the population size needs to be chosen prior to beginning an optimization. This self-adaptive nature, together with the power of the underlying algorithm, has made CMA-ES a very attractive choice for meta-device optimization [71]. Furthermore, it has been shown through several comparisons that CMA-ES is a more capable optimization algorithm for the problems typically encountered in electromagnetics [72], allowing for the design of more complex and high performance devices due to its ability to optimize high dimensional problems in less time than other algorithms. In fact, in [47] CMA-ES was found to significantly outperform the GA in terms of convergence speed and quality of solution found (see Fig. 3(d) top) when applied to the optimization of broadband polarization-converting metasurfaces. Moreover, the optimal structure found by CMA-ES is simpler and potentially less sensitive to fabrication tolerances than the pixelated design found by the GA. CMA-ES, in conjunction with an efficient port-reduction method, has been applied to electromagnetic band-gap (EBG) structure synthesis [48] (see Fig. 3(e)). CMA-ES has also been applied to the optimization of more traditional optical devices including triangular fiber Bragg gratings [73] and programmable optical filters for waveform sculpting [74].
CMA-ES has proven to be a very powerful technique for homogeneous [75] and GRIN lens optimization [76,77]. In [49], CMA-ES was used to optimize a GRIN lens which converted a Gaussian laser beam to a top-hat profile while maintaining collimation (see Fig. 3(f)). This design resulted in massive SWaP-reduction over traditional homogeneous lens-based beam shapers which require multiple elements to achieve the same behavior. Additionally, CMA-ES has also been used in non-electromagnetic meta-device optimization with examples including acoustic metamaterials [78] and thermal cloaks [79].

Global optimization techniques are the dominant approach to meta-device optimization today and we expect their use to increase, especially in optical metasurface and nanoantenna applications. Moreover, there is tremendous research activity targeted at developing new and more powerful GO algorithms which will continue to expand their applicability to meta-device design. Nevertheless, some of the most exciting nanophotonic devices today are optimized using local optimization techniques and a design methodology known as topology optimization. While global optimization techniques are very general, they suffer from the curse of dimensionality (*i.e.*, the number of required function evaluations for convergence increases with the dimensionality of the problem, usually super-linearly). Topology optimization overcomes this limitation due to its unique cost function construction and although it is suited to a very specific class of problems, it has seen tremendous success in meta-device optimization.

## 4. Emerging optimization techniques in meta-device design

#### 4.1 Topology optimization

Topology optimization refers to the idea of optimizing a two- or three-dimensional system comprising an array of pixels or voxels (hereafter called elements), each of which contains a discrete or continuous parameter requiring adjustment. Compared to many of the other approaches discussed in this paper, the number of simulations required for topology optimization does not increase as the number of elements in the system grows. As such, designs created using this method can be high-resolution, curvilinear structures containing thousands to millions of elements.

Topology optimization can be mathematically described as the maximizing of a target merit function using gradient descent. In particular, the gradients of the merit function with respect to the design variables provide a guide for iteratively modifying these design variables, in a manner that improves the merit function. Our design variable for a binary dielectric meta-device is the spatially dependent dielectric constant within the system, defined as:

$$\epsilon(\vec{r}) = a(\vec{r})\,\epsilon_{high} + \left[1 - a(\vec{r})\right]\epsilon_{low},$$

where $\vec{r}$ represents any location within the design domain, $a(\vec{r}) \in [0,1]$, and the dielectric constants $\epsilon_{low}$ and $\epsilon_{high}$ represent the two materials making up the final device. The ability for the design variable to take grayscale values between these dielectric constant values is important in most implementations of topology optimization, as the method requires the iterative modifications to the design variable to be perturbative. Perturbative modifications can similarly be achieved by restricting the dielectric constant to binary values, and optimizing along the boundary of the geometry, only changing a small volume of material in each iteration. This is often useful as a refining step after greyscale optimization is complete.

The gradient of the merit function for each element can be computed efficiently using the adjoint method [23]. There exist a variety of adjoint-based topology optimization implementations. Many of these implementations make use of accurate Maxwell equation simulations to perform a set of forward and adjoint simulations per iteration [23,80,81]. Consider, as an example, the problem where we want to use plane wave illumination to maximize the field intensity at a point $x_0$ near the device (Fig. 4(a)). In this case, the forward, direct simulation is the plane wave illumination of the system, and $\mathbf{E}^{old}(x)$ is calculated at each element in the design domain as well as at $x_0$. The adjoint simulation uses a dipole located at $x_0$ with an amplitude of $\epsilon_0 \Delta V \overline{\mathbf{E}^{old}(x_0)}$, where $\Delta V$ is the total volume of the region in the design domain to be adjusted. $\mathbf{E}^{adj}(x)$ is calculated as the fields from the adjoint simulation at each element in the design domain. The gradient of the merit function for each element is then calculated to be $\mathrm{Re}[\mathbf{E}^{adj}(x) \cdot \mathbf{E}^{old}(x)]$ in this specific case.
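The reason two simulations suffice for any number of elements can be checked on a small algebraic analogue. The sketch below is not a Maxwell solver: it assumes a real-valued linear system $A(p)\,x = b$ in which each design variable $p_i$ adds to one diagonal entry, and a merit function equal to the squared response at one index $k$. The matrix, the names, and the sign convention are all assumptions of this toy model. As in the text, the adjoint source sits at the observation point with an amplitude set by the forward solution there, and the gradient for each element is a product of the adjoint and forward solutions at that element.

```python
def solve(A, b):
    """Gaussian elimination with partial pivoting for a small dense system."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[piv] = M[piv], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))
        x[r] = s / M[r][r]
    return x


def system_matrix(p):
    """Tridiagonal toy operator; p[i] perturbs only the i-th diagonal entry."""
    n = len(p)
    A = [[0.0] * n for _ in range(n)]
    for i in range(n):
        A[i][i] = 2.0 + p[i]
        if i > 0:
            A[i][i - 1] = -1.0
        if i < n - 1:
            A[i][i + 1] = -1.0
    return A


def merit(p, source, k):
    """Squared response at index k (analogue of intensity at a point)."""
    return solve(system_matrix(p), source)[k] ** 2


def adjoint_gradient(p, source, k):
    A = system_matrix(p)
    x = solve(A, source)                    # one forward solve
    adj_source = [0.0] * len(p)
    adj_source[k] = 2.0 * x[k]              # source at k, scaled by forward value
    A_t = [list(col) for col in zip(*A)]
    lam = solve(A_t, adj_source)            # one adjoint solve
    # dF/dp_i = -lam_i * x_i for this toy operator.
    return [-lam[i] * x[i] for i in range(len(p))]


def finite_difference_gradient(p, source, k, h=1e-6):
    grad = []
    for i in range(len(p)):                 # two extra solves per variable
        up = p[:i] + [p[i] + h] + p[i + 1:]
        dn = p[:i] + [p[i] - h] + p[i + 1:]
        grad.append((merit(up, source, k) - merit(dn, source, k)) / (2 * h))
    return grad


p = [0.3, 0.1, 0.5, 0.2, 0.4]
source = [1.0, 0.0, 0.0, 0.0, 0.0]
g_adj = adjoint_gradient(p, source, k=2)
g_fd = finite_difference_gradient(p, source, k=2)
print(g_adj)
print(g_fd)
```

The finite-difference check needs two solves per design variable, whereas the adjoint route needs two solves in total, which is the scaling advantage that makes designs with thousands to millions of elements tractable.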

While topology optimization is capable of refining the dielectric distribution of large volume systems, it does have limitations. One is that, as a gradient descent method, it is fundamentally a local optimization process. As the optimization of optical devices is not convex, it is not possible to guarantee that designs approach a global optimum. For this reason, it is often necessary to run many optimizations using different, random starting points in order to ensure a high-efficiency result. Alternatively, global optimization methods may be implemented to more reliably converge to devices with high performance. One such method is “objective-first” in which the design objectives are enforced at the expense of simulation accuracy (physics residual) [82]. The minimization of the physics residual under the constraints of the design objectives can be done using the alternating direction method of multipliers (ADMM) [83], which is a non-local optimization method for solving bi-convex problems. While nanophotonic optimization problems are not strictly bi-convex, they can be modified to adhere to those constraints on an iteration-by-iteration basis [82]. This method has been used to design various small-scale nanophotonic devices, such as fiber optic couplers to silicon photonic waveguides (Fig. 4(b)).

To examine the capabilities and efficacy of topology optimization as it applies to meta-devices, we will review some examples of diffractive optical devices based on this design methodology. We first examine the application of topology optimization to meta-gratings, which are periodic meta-devices that selectively diffract light to the +1 diffraction order. Figure 4(c) shows the optimization of a silicon-based meta-grating that diffracts normally-incident, polarization-independent light to 75°. The dielectric constant in each element begins as a random, continuous distribution with values between silicon and air. Over the course of 350 iterations, the device converges to a binary structure. During the iterative optimization process, robustness to global fabrication errors is enforced by simultaneously optimizing multiple distorted versions of a pattern [85]. These steps are taken to ensure that the devices are robust to real world fabrication errors, at the expense of theoretical device efficiency. The final device has an absolute efficiency (defined as transmitted power to the desired diffraction channel divided by the incident power) near 80% for both polarizations, which is high and unachievable with conventional metasurface design concepts. Experimentally fabricated and characterized devices (Fig. 4(d)) have efficiency metrics that are within 10% of theory.

Topology optimization can be tailored for problems with multiple design goals. In the case of adjoint-based implementations utilizing accurate Maxwell solvers, it is done by performing multiple sets of forward and adjoint simulations for each objective, and then incorporating all of these design objectives into one merit function [84] which is then optimized by an SOO algorithm. As an example, we show in Fig. 4(e) the design of meta-gratings that deflect TM-polarized plane waves of $N$ different wavelengths to different diffraction channels. The average deflection efficiency per wavelength scales as $1/N^{1/2}$, where $N$ is the total number of wavelengths. While average deflection efficiency goes down as $N$ increases, this scaling law is less severe than the $1/N$ trend more typical of sectoring and interleaving multiplexing methods [84].

Topology optimization can also readily generalize to multi-layer and even fully 3D curvilinear shapes. By pushing metasurfaces and metamaterials to the third dimension, more degrees of freedom can be incorporated in the designs, improving the efficiency of devices and even enabling new functionality not possible with single layer patterns. In the case of multi-layer dielectric composites based on device planarization, multi-functional deflectors (Fig. 5(a)) and field-flatness corrected metalenses (Fig. 5(b)) have been theoretically proposed. Broadband light splitters operating in the scalar diffractive optics regime, based on varying and optimizing height topology with a low contrast polymer, have been designed and experimentally fabricated (Fig. 5(c)). Designs based on height topology variation have the potential for low cost, large area implementation based on imprint lithography. Diffractive optical components have also been proposed using 3D printing at length scales ranging from the nanoscale (Fig. 5(d)) to the microscale (Fig. 5(e)). We expect the range and capabilities of topology-optimized devices to continue to expand, as state-of-the-art fabrication methods continue to evolve and get better at producing 3D composite materials with high spatial resolution and greater dielectric contrast.

#### 4.2 Multi-objective optimization

Although SOO and topology optimization are sufficient tools for many optimization problems, there are some classes of problems which cannot be satisfactorily solved by a single-objective optimizer. In the case of a design problem with multiple competing goals, captured as multiple objective functions, it is unclear how these goals ought to be related to one another to produce an optimal solution. Furthermore, optimality in this context is no longer a straightforward concept, as a host of different designs may fall on the tradeoff between the competing goals. One commonly employed approach is to combine the goals into a composite objective via a weighted sum, which is then optimized with a traditional SOO. Unfortunately, this approach suffers from two problems. First, the optimal choice of coefficients that the designer should use when creating the composite function is usually not known *a priori*. Second, and more critically, each choice of coefficients yields only a single solution, despite there being a host of potential solutions that satisfy the best-possible tradeoff between the goals.
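The weighted-sum approach can be sketched with a toy problem. The two objectives below are hypothetical stand-ins for competing design goals (both minimized), and a brute-force grid search stands in for the SOO. Neither comes from the cited literature.

```python
from typing import List


def f1(x: float) -> float:
    """Hypothetical first objective (to be minimized)."""
    return x ** 2


def f2(x: float) -> float:
    """Hypothetical second, competing objective (to be minimized)."""
    return (x - 2.0) ** 2


def weighted_sum_optimum(weight: float, candidates: List[float]) -> float:
    """Return the candidate minimizing weight*f1 + (1 - weight)*f2."""
    return min(candidates, key=lambda x: weight * f1(x) + (1.0 - weight) * f2(x))


# Brute-force search grid on [0, 2], standing in for a single-objective optimizer.
GRID = [i / 100.0 for i in range(201)]

if __name__ == "__main__":
    # Each weight choice produces exactly one compromise design.
    for w in (0.25, 0.5, 0.75):
        x_best = weighted_sum_optimum(w, GRID)
        print(f"w={w}: x={x_best:.2f}, f1={f1(x_best):.3f}, f2={f2(x_best):.3f}")
```

Each run returns one design, and which design is returned depends entirely on a weight that must be chosen before the tradeoff is known.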

In general, the tradeoff between the objectives measuring a set of designs is best understood by the concept of a Pareto front (Fig. 6(a)). This concept expands the notion of optimality from singling out the best of all designs to delivering a set of designs that achieve the best possible tradeoff between objectives [91]. More specifically, the Pareto front of a design problem is the set of all designs for which an improvement in one objective necessitates a deterioration in some other objective. Because the “Pareto optimality” of a point with respect to other points ought not to favor one objective over others, the concept of dominance was introduced [92]. For a pair of points *x*_{1} and *x*_{2}, *x*_{1} is said to *dominate x*_{2} if *x*_{1} is no worse than *x*_{2} in all considered objective measures and strictly better in at least one. Using this concept, the Pareto front can be expressed formally as the set of feasible non-dominated designs in a given design problem. Interestingly, it has been shown that, depending on the structure of the Pareto front, there are some portions of it that cannot be found by using the weighted sum technique [92]. Thus, true multi-objective optimization (MOO) must use a variety of other approaches to build a set of solutions which approximate the true Pareto front for a given problem. This both frees the engineer of the responsibility of prioritizing the objective functions ahead of time and can aid in understanding the physics which underlie the tradeoffs between the problem goals. With the optimization completed, a common approach is to select a knee-point on the Pareto front (*e.g.*, for normalized objectives that are all minimized, the closest point to the origin) which represents a compromise between the various objectives. More generally, now that the tradeoff between the objectives is understood, the engineer has an opportunity to apply any preexisting prioritization of the objectives which may be specific to their problem without fear of inadvertently missing solutions that are better for their situation but would not be found using SOO.
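The dominance relation and the resulting Pareto front can be sketched in a few lines. The example assumes that all objectives are minimized and uses the common convention that a point dominates another when it is no worse in every objective and strictly better in at least one. The sample objective values are hypothetical, and the knee-point rule (closest to the origin) is only one of several possible choices.

```python
from typing import List, Sequence, Tuple

Point = Tuple[float, ...]


def dominates(a: Sequence[float], b: Sequence[float]) -> bool:
    """True if a is no worse than b everywhere and strictly better somewhere."""
    no_worse = all(x <= y for x, y in zip(a, b))
    strictly_better = any(x < y for x, y in zip(a, b))
    return no_worse and strictly_better


def pareto_front(points: List[Point]) -> List[Point]:
    """Keep only the points that no other point dominates."""
    return [p for p in points if not any(dominates(q, p) for q in points)]


def knee_point(front: List[Point]) -> Point:
    """Pick the front member closest to the origin (assumes normalized objectives)."""
    return min(front, key=lambda p: sum(v * v for v in p))


if __name__ == "__main__":
    # Hypothetical (objective 1, objective 2) values for six candidate designs.
    designs = [(1.0, 5.0), (2.0, 3.0), (3.0, 3.0), (4.0, 1.0), (5.0, 4.0), (2.5, 2.5)]
    front = pareto_front(designs)
    print("Pareto front:", front)
    print("Knee point:", knee_point(front))
```

A quadratic-time filter like this is adequate for small design sets. Dedicated MOO algorithms use more efficient sorting schemes.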

Because MOO inherently produces a set of solutions rather than a single solution, algorithm designers have had great success in adapting population-based evolutionary algorithms to operate using the dominance relationship. Thus, there is a wide array of multi-objective evolutionary algorithms (MOEAs) for engineers to choose from which are adaptations of the previously discussed global techniques. These include such contributions as NSGA-II (Non-dominated Sorting Genetic Algorithm II), MO-CMA-ES, MOPSO, and MOLACO among many others [43,93–95]. While MOEAs have been the primary focus of MOO development during its two-decade history, the power of MOO has also been extended to local optimization techniques. The multiple gradient descent algorithm (MGDA) and multi-objective Newton’s method (MONM) are two examples of MOO’s utility in local solution refinement [26,96].
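As an illustration of local multi-objective refinement, the sketch below implements the two-objective special case of a multiple-gradient descent step: the update direction is the minimum-norm convex combination of the two objective gradients, which has a closed form for two gradients. The quadratic objectives, step size, and starting point are hypothetical, and this is a simplified sketch rather than the full algorithm of [26].

```python
from typing import List, Sequence

Vector = List[float]


def dot(a: Sequence[float], b: Sequence[float]) -> float:
    return sum(x * y for x, y in zip(a, b))


def min_norm_direction(g1: Sequence[float], g2: Sequence[float]) -> Vector:
    """Minimum-norm convex combination alpha*g1 + (1 - alpha)*g2 of two gradients."""
    diff = [a - b for a, b in zip(g1, g2)]
    denom = dot(diff, diff)
    if denom == 0.0:  # identical gradients: every convex combination is the same
        return list(g1)
    alpha = min(1.0, max(0.0, -dot(diff, g2) / denom))
    return [alpha * a + (1.0 - alpha) * b for a, b in zip(g1, g2)]


def grad_f1(p: Sequence[float]) -> Vector:
    """Gradient of the toy objective f1 = (x - 1)^2 + y^2."""
    return [2.0 * (p[0] - 1.0), 2.0 * p[1]]


def grad_f2(p: Sequence[float]) -> Vector:
    """Gradient of the toy objective f2 = (x + 1)^2 + y^2."""
    return [2.0 * (p[0] + 1.0), 2.0 * p[1]]


def mgda(start: Sequence[float], step: float = 0.1, iterations: int = 200) -> Vector:
    """Repeatedly step against the common descent direction of both objectives."""
    point = list(start)
    for _ in range(iterations):
        direction = min_norm_direction(grad_f1(point), grad_f2(point))
        point = [x - step * d for x, d in zip(point, direction)]
    return point


if __name__ == "__main__":
    print(mgda([0.5, 2.0]))
```

For this toy problem the Pareto set is the segment y = 0 with x between -1 and 1, so the iteration should settle on a point of that segment rather than on the minimizer of either objective alone.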

As MOO has been adopted by engineers over the past two decades, its applicability has become apparent in the area of meta-device design. In the RF regime, MOO has been applied to the design of metasurfaces for different applications. Goudos *et al.* demonstrated the use of MOPSO for designing a multilayer absorber and showed the optimal tradeoff between reflection coefficient and total system thickness [100]. Zhu *et al.* developed and applied the MOLACO algorithm to the design of an ultra-wide field of view (FOV) 3D FSS that is polarization insensitive, as shown in Fig. 6(c) [58]. Critically, MOLACO allowed the final system to be fabricated with a 3D printer by enforcing contiguity of the structure within each unit cell of the metasurface, while simultaneously allowing for *a posteriori* balancing of the frequency selectivity with the FOV. In addition to the RF regime, MOO has been applied with consummate effectiveness to optical meta-device design. Wiecha *et al.* used a pixelized parameterization of dielectric nanoantenna scatterers in combination with a MOEA to characterize the tradeoff between the reflection at two different frequencies [97] (see Fig. 6(b)). Similarly, Nagar *et al.* used a MOEA called BORG [101] to simultaneously optimize the directivity, front-to-back ratio (FTBR), and scattering efficiency of both multilayer core-shell particles and Yagi-Uda nanoloop antennas [98] (see Fig. 6(d)). MOO has also been applied to photonic scatterer and waveguide design [99,102] (see Fig. 6(e)). Finally, Hassan *et al.* designed a nanoantenna with radiation modes dependent on the excitation port using a specialized MOPSO [103]. Their optimization minimizes losses, maximizes radiation efficiency, and maximizes discrimination between radiation patterns.

The problems of today demand high performance in a multitude of competing areas, especially when balancing electromagnetic performance with SWaP-C and manufacturability considerations. To this end, multi-objective optimization is perfectly suited to capture these and other arbitrary competing design objective tradeoffs. We expect MOO to be critical and its use to increase in meta-device design as more researchers become aware of its advantages.

## 5. Future directions in meta-device optimization

#### 5.1 Surrogate modeling

The practicality of any optimization technique is primarily dependent on the computational cost of the problem’s function evaluations. Many of the optimization techniques covered thus far are based on solving Maxwell’s equations in an iterative manner. While effective, they require considerable computational resources and time. For instance, to accommodate the complex physics required to evaluate Hassan’s nanoantenna referred to above, the authors elected to perform full-wave simulations which accurately measure the objective functions of each design. However, full-wave simulations can become prohibitively expensive for complex structures. To mitigate this issue, the authors of this design chose to integrate a powerful concept called surrogate modeling into the optimization procedure.

Any high-quality meta-device solution must at some point be validated using trusted and robust simulation techniques. Although many designs are intentionally parameterized to avoid costly full-wave simulations, in some cases these expensive evaluations cannot be avoided. Since high-fidelity systems grow increasingly computationally expensive to evaluate in a full-wave solver, any intentions of optimizing the system may become impractical—in some cases to the point of intractability. For these kinds of problems, optimizers which make every effort to lower the expected number of high-fidelity (a.k.a. full) evaluations required to find an optimal solution are necessary. Unfortunately, many GO and MOO optimizers have no such constraints, and so may require too many full evaluations to be tractable on their own. Surrogate modeling (SM) techniques, on the other hand, strive to alleviate this problem by replacing full evaluations with trained models that are significantly faster to evaluate. These surrogate models can take many forms and fulfill different functions to lower the number of necessary full evaluations. Analytical models are one approach to accelerating optimization through surrogate modeling. For example, in the case of lens design, the lens maker’s equation may be used as a surrogate model for constraining and seeding the optimization of optical systems [104]. Equivalent circuit models used to describe antenna and metamaterial devices are another analytical surrogate model example and can be used to accelerate the optimization process of practical structures. For example, in [105], an RF circular split ring resonator metasurface was captured with a full-wave model, but the optimization work was offloaded primarily to a circuit equivalent allowing for significant time savings. Duan *et al.* [106] and Kim *et al.* [107] both similarly applied a circuit model to the optimization of nanoantenna devices (see Fig. 7(a) and Fig. 7(b), respectively). 
Because predefined analytical models are typically based directly on the physics of the ground truth high-fidelity model, they have the potential to offer the greatest speedup while remaining intuitive to understand and maintaining relatively high accuracy.
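As a minimal sketch of an analytical surrogate, the thin-lens form of the lens maker's equation can be used to seed a lens optimization with radii that already meet a focal-length target. The thin-lens approximation, the sign convention (r1 > 0 and r2 < 0 for a biconvex element), the symmetric biconvex seed, and the function names are assumptions made for illustration. This is not the procedure of [104].

```python
from typing import Tuple


def thin_lens_focal_length(n: float, r1: float, r2: float) -> float:
    """Thin-lens lens maker's equation: 1/f = (n - 1) * (1/r1 - 1/r2)."""
    power = (n - 1.0) * (1.0 / r1 - 1.0 / r2)
    return 1.0 / power


def seed_symmetric_biconvex(n: float, target_focal_length: float) -> Tuple[float, float]:
    """Radii (r1, r2) of a symmetric biconvex thin lens with the target focal length."""
    radius = 2.0 * (n - 1.0) * target_focal_length
    return radius, -radius


if __name__ == "__main__":
    # Seed a design with refractive index 1.5 and a 100 mm focal-length goal.
    r1, r2 = seed_symmetric_biconvex(1.5, 100.0)
    print("seed radii (mm):", r1, r2)
    print("surrogate focal length (mm):", thin_lens_focal_length(1.5, r1, r2))
```

A seed obtained this way costs essentially nothing to compute, so the expensive ray-trace or full-wave evaluations can be reserved for refining a design that is already close to the target.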

However, for some problems, an analytical model may be difficult or impractical to formulate. In these cases, surrogate models based on generic trainable function approximators can be used instead. These empirically derived (*i.e.*, learn-by-example (LBE) [108]) models mimic the full-function evaluation while being significantly faster to evaluate. Examples include multivariate polynomial regressions, radial basis functions, support vector machines (SVM) [109], and the Kriging model [110]. For example, Campbell *et al.* trained a Kriging model against the quasiconformal-transformation optics (qTO) algorithm to replicate its behavior, and then showed significant optimization time improvements using CMA-ES and MO-CMA-ES when the Kriging model was used in place of qTO [111] (see Fig. 7(d)). Kim *et al.* also used a Kriging surrogate model to replace the transmission response of a multi-objective GA optimized plasmonic nanoslit array [112]. Easum *et al.* developed an analytical surrogate model using multivariate polynomial regressions for explicitly characterizing the aberrations in GRIN lens systems and showed substantial speedups when optimizing with the surrogate rather than with full ray-tracing [113] (see Fig. 7(e)). Surrogate models are well suited for integration into the System-By-Design (SBD) optimization framework [16]. SBD is a formal process for optimization-driven inverse-design given a set of design constraints and performance goals. In [114], Oliveri *et al.* employed Gaussian Process-based surrogate models in an SBD framework for the synthesis of qTO-based metamaterial lenses.
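The learn-by-example idea can be sketched with the simplest of the approximators listed above, a polynomial regression, here in one variable and fitted by least squares. The `expensive_model` function is a hypothetical stand-in for a costly simulation, and the sample count, polynomial degree, and search grid are arbitrary choices. A Kriging or radial-basis model would follow the same train-then-query pattern.

```python
import math
from typing import Callable, List


def solve_linear(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a*x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_polynomial(xs: List[float], ys: List[float], degree: int) -> List[float]:
    """Least-squares polynomial coefficients (lowest order first) via normal equations."""
    rows = [[x ** k for k in range(degree + 1)] for x in xs]
    n = degree + 1
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(n)] for i in range(n)]
    aty = [sum(r[i] * y for r, y in zip(rows, ys)) for i in range(n)]
    return solve_linear(ata, aty)


def evaluate_polynomial(coeffs: List[float], x: float) -> float:
    return sum(c * x ** k for k, c in enumerate(coeffs))


def expensive_model(x: float) -> float:
    """Hypothetical stand-in for a costly full evaluation."""
    return (x - 0.3) ** 2 + 0.05 * math.sin(8.0 * x)


def surrogate_minimizer(model: Callable[[float], float], n_samples: int = 9,
                        degree: int = 2) -> float:
    """Train on a few full evaluations, then search the cheap surrogate densely."""
    xs = [-1.0 + 2.0 * i / (n_samples - 1) for i in range(n_samples)]
    coeffs = fit_polynomial(xs, [model(x) for x in xs], degree)
    grid = [-1.0 + 2.0 * i / 1000 for i in range(1001)]
    return min(grid, key=lambda x: evaluate_polynomial(coeffs, x))


if __name__ == "__main__":
    x_star = surrogate_minimizer(expensive_model)
    print("surrogate optimum:", x_star, "true value there:", expensive_model(x_star))
```

Only nine full evaluations are spent on training, while the dense search queries the surrogate alone. The candidate it returns would still need to be validated with the full model.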

Surrogate modeling can be further integrated into the optimization procedure, allowing for some truly remarkable speedups. In [116], the training of a Gaussian process model was interleaved with full-wave simulations of nano-particles of different morphologies to directly coordinate model training and prediction (see Fig. 7(c)). Surrogate models were integrated directly into the optimization algorithm to efficiently design photonic circuitry in [117] and [118]. Co-Kriging is another surrogate modeling approach which uses multiple models of varying fidelity to reduce the number of full evaluations used [119]. This technique was applied by Koziel *et al.* to an antenna design problem by correlating full-wave simulations of different mesh fidelities together for lower optimization cost [120]. There also exist a number of emerging surrogate-assisted techniques such as inverse surrogate modeling [121], response feature based optimization [122,123], and adaptive response scaling [124] that have seen successful application at RF frequencies and have the potential to make an impact in optical meta-device optimization.

Surrogate modeling can also be used for applications beyond optimization time speedups. Easum *et al.* introduced MOTOL for multi-objective optimization with tolerance studies uniquely integrated through the use of surrogate models (see Fig. 7(f)) [27]. In addition to enabling per-design measurement of design tolerance during optimization, MOTOL also simultaneously trains several competing surrogate models and dynamically selects the best one for the problem. MOTOL then explores the response surface using a Monte Carlo approach to estimate the design’s tolerance hypervolume [27]. Analytical techniques such as Interval Analysis (IA) have also been used for tolerance estimation [125–127] in RF device optimization. Finally, since tolerance is an explicit objective, one can observe the tradeoffs between design robustness and traditional performance objectives such as gain, bandwidth, and field-of-view.
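The role of a fast surrogate in tolerance studies can be illustrated with a simple Monte Carlo yield estimate. This is not MOTOL and does not compute a tolerance hypervolume. It only shows why a cheap model makes thousands of perturbed evaluations per design affordable. The `surrogate_gain` model, the uniform perturbation, and the pass/fail specification are all hypothetical.

```python
import random
from typing import Callable


def surrogate_gain(x: float) -> float:
    """Hypothetical trained surrogate: performance peaks at the nominal value x = 1."""
    return 10.0 - 50.0 * (x - 1.0) ** 2


def tolerance_yield(model: Callable[[float], float], nominal: float, tolerance: float,
                    spec: float, n_samples: int = 2000, seed: int = 0) -> float:
    """Fraction of uniformly perturbed designs whose predicted performance meets spec."""
    rng = random.Random(seed)
    passed = 0
    for _ in range(n_samples):
        perturbed = nominal + rng.uniform(-tolerance, tolerance)
        if model(perturbed) >= spec:
            passed += 1
    return passed / n_samples


if __name__ == "__main__":
    # Sweep the fabrication tolerance and watch the predicted yield fall.
    for tol in (0.05, 0.1, 0.2):
        print(f"tolerance ±{tol}: yield = {tolerance_yield(surrogate_gain, 1.0, tol, 9.5):.2f}")
```

A yield value computed this way could itself be treated as an additional objective, which is the sense in which robustness can be traded off against nominal performance.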

Finally, the design problems of today will almost certainly become more difficult as engineers seek to incorporate multiphysics and multiscale aspects into the inverse-design process. Undoubtedly, this will challenge the available computational resources, and so surrogate modeling is well positioned to play a critical role in realizing disruptive meta-devices in the future.

#### 5.2 Deep learning

An emerging class of surrogate modeling techniques involves the use of deep neural networks (DNNs). As with other learn-by-example techniques, the general idea with DNNs is to expend computational time and resources upfront to generate training data sets consisting of device geometries and their associated optical responses. These data can be used to train a deep neural network, using classical supervised learning methods, to ‘learn’ the nonlinear relationships between geometry and optical response. The power of deep neural networks comes from their multi-layered composition, which allows them to learn the relationships between data with multiple levels of abstraction. Once trained, a deep neural network can efficiently produce the geometry of a device when presented with a desired optical response. Deep learning techniques based on DNNs have led to tremendous advancements in image processing, object detection, and speech recognition [128] and have the potential for tremendous disruption in the inverse design of RF and optical meta-devices.

Deep neural networks that use device geometries and optical responses as input and output parameters are able to perform inverse design with nanophotonic systems. In a recent demonstration, deep networks were used to relate the geometry of subwavelength-scale concentric dielectric shells and their scattering cross sections (see Fig. 8(a)) [129]. The thicknesses of the differing shells served as discrete parameters describing the scatterer geometries, while sampled points in the scattering spectra served as discrete parameters describing the optical response. These input and output parameters map onto a discrete set of input and output neurons, which are fully connected together with layers of additional hidden neurons. Upon training with tens of thousands of devices, this network is able to produce shell thickness for a given desired scattering cross section, confirming that the network is able to capture the highly nonlinear relationship between scatterer geometry and optical response. A similar type of neural network was used to relate the unit cell geometry and scattering profile of a dielectric meta-grating placed over a reflective backplane (see Fig. 8(b)) [130]. In this case, the geometry of the meta-grating unit cell was parameterized as a set of points in radial coordinates.
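The supervised-learning workflow described above (generate geometry/response pairs, then fit a fully connected network to them) can be sketched at toy scale. The example below trains a network with a single hidden layer on a hypothetical one-parameter "geometry" whose "response" is simply its square. The network size, learning rate, and data are illustrative assumptions and bear no relation to the architectures of [129] or [130].

```python
import math
import random
from typing import List, Tuple

Params = Tuple[List[float], List[float], List[float], float]
Dataset = List[Tuple[float, float]]

# Hypothetical training set: normalized geometry parameter -> scalar response.
DATA: Dataset = [(i / 10.0 - 1.0, (i / 10.0 - 1.0) ** 2) for i in range(21)]


def init_params(hidden: int, seed: int = 0) -> Params:
    """Small random weights for a 1 -> hidden -> 1 fully connected network."""
    rng = random.Random(seed)
    w1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    b1 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    w2 = [rng.uniform(-0.5, 0.5) for _ in range(hidden)]
    return w1, b1, w2, 0.0


def forward(params: Params, x: float) -> Tuple[List[float], float]:
    """Return the hidden activations (tanh) and the linear output."""
    w1, b1, w2, b2 = params
    hidden = [math.tanh(w * x + b) for w, b in zip(w1, b1)]
    return hidden, sum(w * h for w, h in zip(w2, hidden)) + b2


def mean_squared_error(params: Params, data: Dataset) -> float:
    return sum((forward(params, x)[1] - t) ** 2 for x, t in data) / len(data)


def train(params: Params, data: Dataset, learning_rate: float = 0.02,
          epochs: int = 500) -> Params:
    """Full-batch gradient descent with hand-written backpropagation."""
    w1, b1, w2, b2 = list(params[0]), list(params[1]), list(params[2]), params[3]
    n = len(data)
    for _ in range(epochs):
        gw1 = [0.0] * len(w1)
        gb1 = [0.0] * len(b1)
        gw2 = [0.0] * len(w2)
        gb2 = 0.0
        for x, target in data:
            hidden, output = forward((w1, b1, w2, b2), x)
            d_out = 2.0 * (output - target) / n  # d(MSE)/d(output)
            gb2 += d_out
            for j, h in enumerate(hidden):
                gw2[j] += d_out * h
                d_pre = d_out * w2[j] * (1.0 - h * h)  # back through tanh
                gw1[j] += d_pre * x
                gb1[j] += d_pre
        w1 = [w - learning_rate * g for w, g in zip(w1, gw1)]
        b1 = [b - learning_rate * g for b, g in zip(b1, gb1)]
        w2 = [w - learning_rate * g for w, g in zip(w2, gw2)]
        b2 -= learning_rate * gb2
    return w1, b1, w2, b2


if __name__ == "__main__":
    start = init_params(hidden=8)
    trained = train(start, DATA)
    print("MSE before:", mean_squared_error(start, DATA))
    print("MSE after: ", mean_squared_error(trained, DATA))
```

Practical implementations rely on deep learning frameworks with automatic differentiation, many more layers, and far larger training sets. The hand-written gradients here serve only to expose the mechanics.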

More complex network architectures can be used for the design of degenerate optical systems, in which differing device geometries can produce the same optical response. One such network architecture combines an inverse module, in which the input is optical response and the output is device geometry, together with a forward module, which predicts the optical response for a given device geometry (see Fig. 8(c)) [131]. These networks exhibit improved loss convergence upon training and better inverse design capabilities. In initial demonstrations, these networks could predict the device geometries of optical filters for given desired optical responses. Related types of networks with even greater complexity could predict the chiroptical responses of twisted split ring resonators as a function of wavelength (see Fig. 8(d)) [132].
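A one-parameter toy problem shows why coupling the inverse module to a forward module helps with degenerate systems. Here the "device" response is y = x^2, so the geometries +x and -x produce the same response. The exact function stands in for a pretrained forward module, and the single-parameter inverse model x = a*sqrt(y) is a hypothetical simplification. Neither is taken from [131].

```python
import math
from typing import List


def forward_model(x: float) -> float:
    """Stand-in for a pretrained forward module: geometry -> response."""
    return x * x


# Degenerate training set: each response is produced by two different geometries.
GEOMETRIES: List[float] = [-1.0, -0.5, 0.5, 1.0]
RESPONSES: List[float] = [forward_model(x) for x in GEOMETRIES]


def inverse_model(a: float, y: float) -> float:
    """One-parameter inverse module: response -> predicted geometry."""
    return a * math.sqrt(y)


def fit_direct_inverse() -> float:
    """Least-squares fit against geometry labels: a = sum(x*sqrt(y)) / sum(y)."""
    numerator = sum(x * math.sqrt(y) for x, y in zip(GEOMETRIES, RESPONSES))
    return numerator / sum(RESPONSES)


def fit_tandem(a: float = 0.5, learning_rate: float = 0.1, iterations: int = 200) -> float:
    """Gradient descent on the response error of forward_model(inverse_model(y))."""
    n = len(RESPONSES)
    for _ in range(iterations):
        grad = 0.0
        for y in RESPONSES:
            x_hat = inverse_model(a, y)
            residual = forward_model(x_hat) - y
            # Chain rule: d/da (x_hat^2 - y)^2 = 2*residual * 2*x_hat * sqrt(y)
            grad += 2.0 * residual * 2.0 * x_hat * math.sqrt(y) / n
        a -= learning_rate * grad
    return a


if __name__ == "__main__":
    print("direct inverse fit, a =", fit_direct_inverse())
    print("tandem fit,         a =", fit_tandem())
```

Fitting the inverse model directly to the conflicting geometry labels averages the two branches and collapses to a = 0, which predicts a geometry that reproduces none of the responses. Scoring the prediction through the forward model instead lets the fit commit to one valid branch.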

The application of neural networks to nanophotonic design is still very much in its incipient stages, and much work is required to extend its efficacy and utility to more complex shapes. One open research area is understanding how to select and refine the neural network architectures for a specific given design problem. A broad range of network layer types, such as convolutional and deconvolutional layers, can be used, and there are a multitude of ways layers can be arranged and connected. Another challenge is the generation of sufficient quantities of training data, which requires devising faster electromagnetic solvers that utilize efficient hardware platforms and coding strategies. We also anticipate that other types of neural networks, such as those based on unsupervised learning, can serve as effective tools for meta-device inverse design.

## 6. Conclusions

Many algorithms and techniques exist for the inverse design of meta-devices, and, owing to ever-increasing computational power, advances in fabrication techniques, and the interesting materials available to the designer, there will be an ever-increasing need for optimization to realize the highest-performance designs. Whether it be a local, global, single- or multi-objective algorithm, optimization can benefit all optical, RF, and nanophotonic design problems. However, readers should not conclude that optimization can wholly replace the need for experienced designers. Rather, optimization should be thought of as a tool that designers can use to maximize the performance of their device. Moreover, experienced designers can use their prior knowledge and intuition to significantly reduce the computational costs of optimization by applying intelligent constraints and supplying exceptional starting points. Finally, the authors strongly advocate that all readers consider applying some level of optimization to their problems and hope that the discussions provided in this manuscript are helpful to experienced and novice designers alike.

## Funding

Defense Advanced Research Projects Agency (DARPA) (HR00111720032); U.S. Air Force (FA9550-18-1-0070); Office of Naval Research (N00014-16-1-2630); National Science Foundation (NSF).

## Acknowledgments

SDC, EBW, RPJ, and DHW were supported in part by Defense Advanced Research Projects Agency (DARPA) under award number HR00111720032. JAF was supported by the U.S. Air Force under Award Number FA9550-18-1-0070 and the Office of Naval Research under Award Number N00014-16-1-2630. DS was supported by the National Science Foundation (NSF) through the NSF Graduate Research Fellowship.

## References

**1. **D. K. Cheng, “Optimization techniques for antenna arrays,” Proc. IEEE **59**(12), 1664–1674 (1971). [CrossRef]

**2. **T. H. Jamieson, *Optimization Techniques in Lens Design, Monographs on Applied Optics,* No. 5 (American Elsevier Pub. Co, 1971).

**3. **J. Raphson, “Analysis aequationum universalis seu ad aequationes algebraicas resolvendas methodus generalis, & expedita, ex nova infinitarum serierum methodo, deducta ac demonstrata: cui annexum est de spatio reali, seu ente infinito conamen mathematico-metaphysicum,” (1697).

**4. **I. Newton, *The Method of Fluxions and Infinite Series: With its Application to the Geometry of Curve-Lines* (London, 1736).

**5. **M. Augustine Cauchy, “Méthode générale pour la résolution des systemes d’équations simultanées,” Comp. Rend. Sci. Paris **25**, 536–538 (1847).

**6. **M. R. Hestenes and E. Stiefel, “Methods of conjugate gradients for solving linear systems,” J. Res. Natl. Bur. Stand. **49**(6), 409 (1952). [CrossRef]

**7. **D. W. Marquardt, “An algorithm for least-squares estimation of nonlinear parameters,” J. Soc. Ind. Appl. Math. **11**(2), 431–441 (1963). [CrossRef]

**8. **A. Yabe, *Optimization in Lens Design* (SPIE Press, 2018).

**9. **S. D. Campbell, D. E. Brocker, J. Nagar, and D. H. Werner, “SWaP reduction regimes in achromatic GRIN singlets,” Appl. Opt. **55**(13), 3594 (2016). [CrossRef] [PubMed]

**10. **R. A. Flynn, E. F. Fleet, G. Beadie, and J. S. Shirk, “Achromatic GRIN singlet lens design,” Opt. Express **21**(4), 4970–4978 (2013). [CrossRef] [PubMed]

**11. **J. Nagar, S. D. Campbell, and D. H. Werner, “Apochromatic singlets enabled by metasurface-augmented GRIN lenses,” Optica **5**(2), 99–102 (2018). [CrossRef]

**12. **L. Zhang, J. Ding, H. Zheng, S. An, H. Lin, B. Zheng, Q. Du, G. Yin, J. Michon, Y. Zhang, Z. Fang, M. Y. Shalaginov, L. Deng, T. Gu, H. Zhang, and J. Hu, “Ultra-thin high-efficiency mid-infrared transmissive Huygens meta-optics,” Nat. Commun. **9**(1), 1481 (2018). [CrossRef] [PubMed]

**13. **E. D. Huber, “Extrapolated least-squares optimization in optical design,” J. Opt. Soc. Am. A **2**(4), 544–554 (1985). [CrossRef]

**14. **X. Cheng, Y. Wang, Q. Hao, and J. Sasian, “Automatic element addition and deletion in lens optimization,” Appl. Opt. **42**(7), 1309–1317 (2003). [CrossRef] [PubMed]

**15. **L. Li, Q.-H. Wang, X.-Q. Xu, and D.-H. Li, “Two-step method for lens system design,” Opt. Express **18**(12), 13285–13300 (2010). [CrossRef] [PubMed]

**16. **A. Massa, G. Oliveri, P. Rocca, and F. Viani, “System-by-design: A new paradigm for handling design complexity,” in The 8th European Conference on Antennas and Propagation (EuCAP 2014) (IEEE, 2014), pp. 1180–1183. [CrossRef]

**17. **J. H. Holland, “Genetic algorithms,” Sci. Am. **267**(1), 66–72 (1992). [CrossRef]

**18. **J. Kennedy and R. Eberhart, “Particle swarm optimization,” in Proceedings of ICNN’95 - International Conference on Neural Networks (IEEE, 1995), **4**, pp. 1942–1948. [CrossRef]

**19. **R. Storn and K. Price, “Differential Evolution – A Simple and Efficient Heuristic for global Optimization over Continuous Spaces,” J. Glob. Optim. **11**(4), 341–359 (1997). [CrossRef]

**20. **P. Rocca, G. Oliveri, and A. Massa, “Differential Evolution as Applied to Electromagnetics,” IEEE Antennas Propag. Mag. **53**(1), 38–49 (2011). [CrossRef]

**21. **N. Hansen, S. D. Müller, and P. Koumoutsakos, “Reducing the time complexity of the derandomized evolution strategy with covariance matrix adaptation (CMA-ES),” Evol. Comput. **11**(1), 1–18 (2003). [CrossRef] [PubMed]

**22. **Genetic and Evolutionary Computation Conference, International Conference on Genetic Algorithms, and Genetic and Evolutionary Computation Conference, *GECCO 2018, the Genetic and Evolutionary Computation Conference Companium [a Recombination of the 27th International Conference on Genetic Algorithms (ICGA) and the 23rd Annual Genetic Programming Conference (GP)], July 15th - 19th 2018, Kyoto, Japan* (Association for Computing Machinery, 2018).

**23. **J. S. Jensen and O. Sigmund, “Topology optimization for nano-photonics,” Laser Photonics Rev. **5**(2), 308–321 (2011). [CrossRef]

**24. **K. Deb, “Multi-objective optimization,” in *Search Methodologies* (Springer US, 2014), pp. 403–449.

**25. **D. H. Wolpert and W. G. Macready, “No free lunch theorems for optimization,” IEEE Trans. Evol. Comput. **1**(1), 67–82 (1997). [CrossRef]

**26. **J.-A. Désidéri, “Multiple-gradient descent algorithm for multiobjective optimization,” C. R. Math. **350**(5), 313–318 (2012). [CrossRef]

**27. **J. A. Easum, J. Nagar, P. L. Werner, and D. H. Werner, “Efficient multi-objective antenna optimization with tolerance analysis through the use of surrogate models,” IEEE Trans. Antenn. Propag. **66**(12), 6706–6715 (2018). [CrossRef]

**28. **J. H. Holland, *Adaptation in Natural and Artificial Systems: An Introductory Analysis with Applications to Biology, Control, and Artificial Intelligence*, 1st MIT Press ed, Complex Adaptive Systems (MIT Press, 1992).

**29. **J. A. Bossard, L. Lin, S. Yun, L. Liu, D. H. Werner, and T. S. Mayer, “Near-ideal optical metamaterial absorbers with super-octave bandwidth,” ACS Nano **8**(2), 1517–1524 (2014). [CrossRef] [PubMed]

**30. **C. Forestiere, A. J. Pasquale, A. Capretti, G. Miano, A. Tamburrino, S. Y. Lee, B. M. Reinhard, and L. Dal Negro, “Genetically engineered plasmonic nanoarrays,” Nano Lett. **12**(4), 2037–2044 (2012). [CrossRef] [PubMed]

**31. **T. Feichtner, O. Selig, M. Kiunke, and B. Hecht, “Evolutionary optimization of optical antennas,” Phys. Rev. Lett. **109**(12), 127701 (2012). [CrossRef] [PubMed]

**32. **S. Sui, H. Ma, J. Wang, M. Feng, Y. Pang, S. Xia, Z. Xu, and S. Qu, “Symmetry-based coding method and synthesis topology optimization design of ultra-wideband polarization conversion metasurfaces,” Appl. Phys. Lett. **109**(1), 014104 (2016). [CrossRef]

**33. **S. Jafar-Zanjani, S. Inampudi, and H. Mosallaei, “Adaptive genetic algorithm for optical metasurfaces design,” Sci. Rep. **8**(1), 11040 (2018). [CrossRef] [PubMed]

**34. **“Creative Commons — Attribution 4.0 International — CC BY 4.0,” https://creativecommons.org/licenses/by/4.0/.

**35. **R. L. Haupt and D. H. Werner, *Genetic Algorithms in Electromagnetics* (IEEE Press : Wiley-Interscience, 2007).

**36. **S. Chakravarty, R. Mittra, and N. R. Williams, “Application of a microgenetic algorithm (MGA) to the design of broadband microwave absorbers using multiple frequency selective surface screens buried in dielectrics,” IEEE Trans. Antenn. Propag. **50**(3), 284–296 (2002). [CrossRef]

**37. **M. A. Gingrich and D. H. Werner, “Synthesis of low/zero index of refraction metamaterials from frequency selective surfaces using genetic algorithms,” Electron. Lett. **41**(23), 1266–1267 (2005). [CrossRef]

**38. **P. Y. Chen, C. H. Chen, H. Wang, J. H. Tsai, and W. X. Ni, “Synthesis design of artificial magnetic metamaterials using a genetic algorithm,” Opt. Express **16**(17), 12806–12818 (2008). [CrossRef] [PubMed]

**39. **J. A. Bossard, C. P. Scarborough, Q. Wu, S. D. Campbell, D. H. Werner, P. L. Werner, S. Griffiths, and M. Ketner, “Mitigating field enhancement in metasurfaces and metamaterials for high-power microwave applications,” IEEE Trans. Antenn. Propag. **64**(12), 5309–5319 (2016). [CrossRef]

**40. **K. Chen, L. Cui, Y. Feng, J. Zhao, T. Jiang, and B. Zhu, “Coding metasurface for broadband microwave scattering reduction with optical transparency,” Opt. Express **25**(5), 5571–5579 (2017). [CrossRef] [PubMed]

**41. **L. R. Ji-Di, X. Y. Cao, Y. Tang, S. M. Wang, Y. Zhao, and X. W. Zhu, “A new coding metasurface for wideband RCS reduction,” Wuxiandian Gongcheng **27**(2), 394–401 (2018). [CrossRef]

**42. **T. Han, X.-Y. Cao, J. Gao, Y.-L. Zhao, and Y. Zhao, “A coding metasurface with properties of absorption and diffusion for RCS reduction,” Prog. Electromagn. Res. C **75**, 181–191 (2017). [CrossRef]

**43. **D. Z. Zhu, P. L. Werner, and D. H. Werner, “Design and optimization of 3-D frequency-selective surfaces based on a multiobjective lazy ant colony optimization algorithm,” IEEE Trans. Antenn. Propag. **65**(12), 7137–7149 (2017). [CrossRef]

**44. **A. Lewis, G. Weis, M. Randall, A. Galehdar, and D. Thiel, “Optimising efficiency and gain of small meander line RFID antennas using ant colony system,” in 2009 IEEE Congress on Evolutionary Computation (2009), pp. 1486–1492. [CrossRef]

**45. **K. R. Mahmoud, M. Hussein, M. F. O. Hameed, and S. S. A. Obayya, “Super directive Yagi–Uda nanoantennas with an ellipsoid reflector for optimal radiation emission,” J. Opt. Soc. Am. B **34**(10), 2041–2049 (2017). [CrossRef]

**46. **J. R. Ong, H. S. Chu, V. H. Chen, A. Y. Zhu, and P. Genevet, “Freestanding dielectric nanohole array metasurface for mid-infrared wavelength applications,” Opt. Lett. **42**(13), 2639–2642 (2017). [CrossRef] [PubMed]

**47. **P. E. Sieber and D. H. Werner, “Infrared broadband quarter-wave and half-wave plates synthesized from anisotropic Bézier metasurfaces,” Opt. Express **22**(26), 32371–32383 (2014). [CrossRef] [PubMed]

**48. **S. H. Martin, I. Martinez, J. P. Turpin, D. H. Werner, E. Lier, and M. G. Bray, “The synthesis of wide- and multi-bandgap electromagnetic surfaces with finite size and nonuniform capacitive loading,” IEEE Trans. Microw. Theory Tech. **62**(9), 1962–1972 (2014). [CrossRef]

**49. **S. D. Campbell, J. Nagar, and D. H. Werner, “Multi-element, multi-frequency lens transformations enabled by optical wavefront matching,” Opt. Express **25**(15), 17258–17270 (2017). [CrossRef] [PubMed]

**50. **X.-S. Yang, *Nature-Inspired Metaheuristic Algorithms* (Luniver Press, 2008).

**51. **D. H. Werner, J. A. Bossard, Z. Bayraktar, Z. H. Jiang, M. D. Gregory, and P. L. Werner, “Nature inspired optimization techniques for metamaterial design,” in *Numerical Methods for Metamaterial Design*, K. Diest, ed., Topics in Applied Physics (Springer Netherlands, 2013), pp. 97–146.

**52. **D. Karaboga and B. Akay, “A comparative study of artificial bee colony algorithm,” Appl. Math. Comput. **214**(1), 108–132 (2009). [CrossRef]

**53. **X.-S. Yang, “A new metaheuristic bat-inspired algorithm,” in *Nature Inspired Cooperative Strategies for Optimization (NICSO 2010)*, J. R. González, D. A. Pelta, C. Cruz, G. Terrazas, and N. Krasnogor, eds., Studies in Computational Intelligence (Springer Berlin Heidelberg, 2010), pp. 65–74.

**54. **J. Kennedy, “Swarm intelligence,” in *Handbook of Nature-Inspired and Innovative Computing: Integrating Classical Models with Emerging Technologies*, A. Y. Zomaya, ed. (Springer US, 2006), pp. 187–219.

**55. **M. Dorigo, V. Maniezzo, and A. Colorni, “Ant system: optimization by a colony of cooperating agents,” IEEE Trans. Syst. Man Cybern. B Cybern. **26**(1), 29–41 (1996). [CrossRef] [PubMed]

**56. **E. L. Lawler, ed., *The Traveling Salesman Problem: A Guided Tour of Combinatorial Optimization*, Wiley-Interscience Series in Discrete Mathematics (Wiley, 1985).

**57. **D. Charbonneau, C. Poff, H. Nguyen, M. C. Shin, K. Kierstead, and A. Dornhaus, “Who are the “lazy” ants? The function of inactivity in social insects and a possible role of constraint: inactive ants are corpulent and may be young and/or selfish,” Integr. Comp. Biol. **57**(3), 649–667 (2017). [CrossRef] [PubMed]

**58. **D. Z. Zhu, M. D. Gregory, P. L. Werner, and D. H. Werner, “Fabrication and characterization of multi-band polarization independent 3D printed frequency selective structures with ultra-wide fields of view,” IEEE Trans. Antenn. Propag. **66**(11), 6096–6105 (2018). [CrossRef]

**59. **S. D. Campbell, D. Z. Zhu, J. Nagar, R. P. Jenkins, J. A. Easum, D. H. Werner, and P. L. Werner, “Inverse design of engineered materials for extreme optical devices,” in 2018 International Applied Computational Electromagnetics Society Symposium (ACES) (IEEE, 2018), pp. 1–2. [CrossRef]

**60. **J. Robinson and Y. Rahmat-Samii, “Particle swarm optimization in electromagnetics,” IEEE Trans. Antenn. Propag. **52**(2), 397–407 (2004). [CrossRef]

**61. **D. W. Boeringer and D. H. Werner, “Particle swarm optimization versus genetic algorithms for phased array synthesis,” IEEE Trans. Antenn. Propag. **52**(3), 771–779 (2004). [CrossRef]

**62. **S. Cui and D. S. Weile, “Application of a parallel particle swarm optimization scheme to the design of electromagnetic absorbers,” IEEE Trans. Antenn. Propag. **53**(11), 3616–3624 (2005). [CrossRef]

**63. **N. Jin and Y. Rahmat-Samii, “Advances in particle swarm optimization for antenna designs: real-number, binary, single-objective and multiobjective implementations,” IEEE Trans. Antenn. Propag. **55**(3), 556–567 (2007). [CrossRef]

**64. **K. Deb and R. B. Agrawal, “Simulated binary crossover for continuous search space,” Complex Syst. **9**(2), 115–148 (1995).

**65. **A. V. Kildishev, U. K. Chettiar, Z. Liu, V. M. Shalaev, D.-H. Kwon, Z. Bayraktar, and D. H. Werner, “Stochastic optimization of low-loss optical negative-index metamaterial,” J. Opt. Soc. Am. B **24**(10), A34–A39 (2007). [CrossRef]

**66. **Z. Bayraktar, M. Komurcu, J. A. Bossard, and D. H. Werner, “The wind driven optimization technique and its application in electromagnetics,” IEEE Trans. Antenn. Propag. **61**(5), 2745–2757 (2013). [CrossRef]

**67. **J. J. Grefenstette, “Optimization of control parameters for genetic algorithms,” IEEE Trans. Syst. Man Cybern. **16**(1), 122–128 (1986). [CrossRef]

**68. **A. El-Gallad, M. El-Hawary, A. Sallam, and A. Kalas, “Enhancing the particle swarm optimizer via proper parameters selection,” in *IEEE CCECE2002. Canadian Conference on Electrical and Computer Engineering. Conference Proceedings (Cat. No.02CH37373)* **2**, 792–797 (2002). [CrossRef]

**69. **C. Li, S. Yang, and T. T. Nguyen, “A self-learning particle swarm optimizer for global optimization problems,” IEEE Trans. Syst. Man Cybern. B Cybern. **42**(3), 627–646 (2012). [CrossRef] [PubMed]

**70. **G. Xu, “An adaptive parameter tuning of particle swarm optimization algorithm,” Appl. Math. Comput. **219**(9), 4560–4569 (2013). [CrossRef]

**71. **M. D. Gregory, Z. Bayraktar, and D. H. Werner, “Fast optimization of electromagnetic design problems using the covariance matrix adaptation evolutionary strategy,” IEEE Trans. Antenn. Propag. **59**(4), 1275–1285 (2011). [CrossRef]

**72. **M. D. Gregory, S. V. Martin, and D. H. Werner, “Improved electromagnetics optimization: the covariance matrix adaptation evolutionary strategy,” IEEE Antennas Propag. Mag. **57**(3), 48–59 (2015). [CrossRef]

**73. **S. Baskar, P. N. Suganthan, N. Q. Ngo, A. Alphones, and R. T. Zheng, “Design of triangular FBG filter for sensor applications using covariance matrix adapted evolution algorithm,” Opt. Commun. **260**(2), 716–722 (2006). [CrossRef]

**74. **P. Petropoulos and X. Yang, “Nonlinear sculpturing of optical spectra,” in *2012 14th International Conference on Transparent Optical Networks (ICTON)* (2012), pp. 1–4. [CrossRef]

**75. **S. Thibault, C. Gagné, J. Beaulieu, and M. Parizeau, “Evolutionary algorithms applied to lens design: case study and analysis,” in Optical Design and Engineering II (International Society for Optics and Photonics) **5962**(9), 596209 (2005).

**76. **J. Nagar, D. E. Brocker, S. D. Campbell, J. A. Easum, and D. H. Werner, “Modularization of gradient-index optical design using wavefront matching enabled optimization,” Opt. Express **24**(9), 9359–9368 (2016). [CrossRef] [PubMed]

**77. **D. E. Brocker, J. P. Turpin, P. L. Werner, and D. H. Werner, “Optimization of gradient index lenses using quasi-conformal contour transformations,” IEEE Antennas Wirel. Propag. Lett. **13**, 1787–1791 (2014). [CrossRef]

**78. **B. Huang, Q. Cheng, G. Y. Song, and T. J. Cui, “Design of acoustic metamaterials using the covariance matrix adaptation evolutionary strategy,” Appl. Phys. Express **10**(3), 037301 (2017). [CrossRef]

**79. **G. Fujii, Y. Akimoto, and M. Takahashi, “Exploring optimal topology of thermal cloaks by CMA-ES,” Appl. Phys. Lett. **112**(6), 061108 (2018). [CrossRef]

**80. **C. M. Lalau-Keraly, S. Bhargava, O. D. Miller, and E. Yablonovitch, “Adjoint shape optimization applied to electromagnetic design,” Opt. Express **21**(18), 21693–21701 (2013). [CrossRef] [PubMed]

**81. **D. Sell, J. Yang, S. Doshay, R. Yang, and J. A. Fan, “Large-angle, multifunctional metagratings based on freeform multimode geometries,” Nano Lett. **17**(6), 3752–3757 (2017). [CrossRef] [PubMed]

**82. **J. Lu and J. Vučković, “Nanophotonic computational design,” Opt. Express **21**(11), 13351–13367 (2013). [CrossRef] [PubMed]

**83. **S. Boyd, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Found. Trends Mach. Learn. **3**(1), 1–122 (2010). [CrossRef]

**84. **D. Sell, J. Yang, S. Doshay, and J. A. Fan, “Periodic dielectric metasurfaces with high-efficiency, multiwavelength functionalities,” Adv. Opt. Mater. **5**(23), 1700645 (2017). [CrossRef]

**85. **F. Wang, J. S. Jensen, and O. Sigmund, “Robust topology optimization of photonic crystal waveguides with tailored dispersion properties,” J. Opt. Soc. Am. B **28**(3), 387 (2011). [CrossRef]

**86. **J. Cheng, S. Inampudi, and H. Mosallaei, “Optimization-based dielectric metasurfaces for angle-selective multifunctional beam deflection,” Sci. Rep. **7**(1), 12228 (2017). [CrossRef] [PubMed]

**87. **Z. Lin, B. Groever, F. Capasso, A. W. Rodriguez, and M. Lončar, “Topology optimized multi-layered meta-optics,” Phys. Rev. Appl. **9**(4), 044030 (2018). [CrossRef]

**88. **T. P. Xiao, O. S. Cifci, S. Bhargava, H. Chen, T. Gissibl, W. Zhou, H. Giessen, K. C. Toussaint Jr., E. Yablonovitch, and P. V. Braun, “Diffractive spectral-splitting optical element designed by adjoint-based electromagnetic optimization and fabricated by femtosecond 3D direct laser writing,” ACS Photonics **3**(5), 886–894 (2016). [CrossRef]

**89. **P. Camayd-Muñoz and A. Faraon, “Scaling laws for inverse-designed metadevices,” in Conference on Lasers and Electro-Optics (OSA, 2018), p. FF3C.7.

**90. **F. Callewaert, V. Velev, P. Kumar, A. V. Sahakian, and K. Aydin, “Inverse-designed broadband all-dielectric electromagnetic metadevices,” Sci. Rep. **8**(1), 1358 (2018). [CrossRef] [PubMed]

**91. **Y. Censor, “Pareto optimality in multiobjective problems,” Appl. Math. Optim. **4**(1), 41–59 (1977). [CrossRef]

**92. **K. Deb, *Multi-Objective Optimization Using Evolutionary Algorithms*, Paperback edition (Wiley, 2008).

**93. **K. Deb, A. Pratap, S. Agarwal, and T. Meyarivan, “A fast and elitist multiobjective genetic algorithm: NSGA-II,” IEEE Trans. Evol. Comput. **6**(2), 182–197 (2002). [CrossRef]

**94. **C. Igel, N. Hansen, and S. Roth, “Covariance matrix adaptation for multi-objective optimization,” Evol. Comput. **15**(1), 1–28 (2007). [CrossRef] [PubMed]

**95. **J. E. Alvarez-Benitez, R. M. Everson, and J. E. Fieldsend, “A MOPSO algorithm based exclusively on Pareto dominance concepts,” in *Evolutionary Multi-Criterion Optimization*, C. A. Coello Coello, A. Hernández Aguirre, and E. Zitzler, eds. (Springer Berlin Heidelberg, 2005), **3410**, pp. 459–473.

**96. **J. Fliege, L. M. G. Drummond, and B. F. Svaiter, “Newton’s method for multiobjective optimization,” SIAM J. Optim. **20**(2), 602–626 (2009). [CrossRef]

**97. **P. R. Wiecha, A. Arbouet, C. Girard, A. Lecestre, G. Larrieu, and V. Paillard, “Evolutionary multi-objective optimization of colour pixels based on dielectric nanoantennas,” Nat. Nanotechnol. **12**(2), 163–169 (2016). [CrossRef] [PubMed]

**98. **J. Nagar, S. D. Campbell, Q. Ren, J. A. Easum, R. P. Jenkins, and D. H. Werner, “Multiobjective optimization-aided metamaterials-by-design with application to highly directive nanodevices,” IEEE J. Multiscale Multiphys. Comput. Tech. **2**, 147–158 (2017). [CrossRef]

**99. **D. Gagnon, J. Dumont, and L. J. Dubé, “Multiobjective optimization in integrated photonics design,” Opt. Lett. **38**(13), 2181–2184 (2013). [CrossRef] [PubMed]

**100. **S. K. Goudos and J. N. Sahalos, “Microwave absorber optimal design using multi-objective particle swarm optimization,” Microw. Opt. Technol. Lett. **48**(8), 1553–1558 (2006). [CrossRef]

**101. **D. Hadka and P. Reed, “Borg: an auto-adaptive many-objective evolutionary computing framework,” Evol. Comput. **21**(2), 231–259 (2013). [CrossRef] [PubMed]

**102. **S. M. Mirjalili, S. Mirjalili, and A. Lewis, “A novel multi-objective optimization framework for designing photonic crystal waveguides,” IEEE Photonics Technol. Lett. **26**(2), 146–149 (2014). [CrossRef]

**103. **A.-K. S. O. Hassan, A. S. Etman, and E. A. Soliman, “Optimization of a novel nano antenna with two radiation modes using Kriging surrogate models,” IEEE Photonics J. **10**(4), 1–17 (2018). [CrossRef]

**104. **S. D. Campbell, J. Nagar, J. A. Easum, D. H. Werner, and P. L. Werner, “Surrogate-assisted transformation optics inspired GRIN lens design and optimization,” in 2017 International Applied Computational Electromagnetics Society Symposium - Italy (ACES) (IEEE, 2017), pp. 1–2. [CrossRef]

**105. **P. J. Bradley, “Quasi-Newton model-trust region approach to surrogate-based optimisation of planar metamaterial structures,” Prog. Electromagn. Res. B **47**, 1–17 (2013). [CrossRef]

**106. **H. Duan, A. I. Fernández-Domínguez, M. Bosman, S. A. Maier, and J. K. W. Yang, “Nanoplasmonics: classical down to the nanometer scale,” Nano Lett. **12**(3), 1683–1689 (2012). [CrossRef] [PubMed]

**107. **M. Kim, A. M. H. Wong, and G. V. Eleftheriades, “Optical Huygens’ metasurfaces with independent control of the magnitude and phase of the local reflection coefficients,” Phys. Rev. X **4**(4), 041042 (2014). [CrossRef]

**108. **A. Massa, G. Oliveri, M. Salucci, N. Anselmi, and P. Rocca, “Learning-by-examples techniques as applied to electromagnetics,” J. Electromagnet. Wave. **32**(4), 516–541 (2018).

**109. **C. Cortes and V. Vapnik, “Support-vector networks,” Mach. Learn. **20**(3), 273–297 (1995). [CrossRef]

**110. **M. A. Oliver and R. Webster, “Kriging: a method of interpolation for geographical information systems,” Int. J. Geogr. Inf. Sci. **4**(3), 313–332 (1990). [CrossRef]

**111. **S. D. Campbell, J. Nagar, D. E. Brocker, and D. H. Werner, “On the use of surrogate models in the analytical decompositions of refractive index gradients obtained through quasiconformal transformation optics,” J. Opt. **18**(4), 044019 (2016). [CrossRef]

**112. **K.-Y. Kim and J. Jung, “Multiobjective optimization for a plasmonic nanoslit array sensor using Kriging models,” Appl. Opt. **56**(21), 5838–5843 (2017). [CrossRef] [PubMed]

**113. **J. A. Easum, S. D. Campbell, J. Nagar, and D. H. Werner, “Analytical surrogate model for the aberrations of an arbitrary GRIN lens,” Opt. Express **24**(16), 17805–17818 (2016). [CrossRef] [PubMed]

**114. **G. Oliveri, L. Tenuti, E. Bekele, M. Carlin, and A. Massa, “An SbD-QCTO approach to the synthesis of isotropic metamaterial lenses,” IEEE Antennas Wirel. Propag. Lett. **13**, 1783–1786 (2014). [CrossRef]

**115. **“Creative Commons — Attribution 3.0 Unported — CC BY 3.0,” https://creativecommons.org/licenses/by/3.0/.

**116. **C. Forestiere, Y. He, R. Wang, R. M. Kirby, and L. Dal Negro, “Inverse design of metal nanoparticles’ morphology,” ACS Photonics **3**(1), 68–78 (2016). [CrossRef]

**117. **S. Ogurtsov and S. Koziel, “Fast surrogate-assisted simulation-driven optimisation of add-drop resonators for integrated photonic circuits,” IET Microw. Antennas Propag. **9**(7), 672–675 (2015). [CrossRef]

**118. **A. Bekasiewicz and S. Koziel, “Surrogate-assisted design optimization of photonic directional couplers: optimization of photonic couplers,” Int. J. Numer. Model. **30**(3–4), e2088 (2017). [CrossRef]

**119. **A. I. J. Forrester, A. Sóbester, and A. J. Keane, “Multi-fidelity optimization via surrogate modelling,” Proc. Math. Phys. Eng. Sci. **463**(2088), 3251–3269 (2007). [CrossRef]

**120. **S. Koziel, A. Bekasiewicz, I. Couckuyt, and T. Dhaene, “Efficient multi-objective simulation-driven antenna design using co-Kriging,” IEEE Trans. Antenn. Propag. **62**(11), 5900–5905 (2014). [CrossRef]

**121. **S. Koziel and A. Bekasiewicz, “Expedited geometry scaling of compact microwave passives by means of inverse surrogate modeling,” IEEE Trans. Microw. Theory Tech. **63**(12), 4019–4026 (2015). [CrossRef]

**122. **S. Koziel and J. W. Bandler, “Reliable microwave modeling by means of variable-fidelity response features,” IEEE Trans. Microw. Theory Tech. **63**(12), 4247–4254 (2015). [CrossRef]

**123. **S. Koziel and S. Ogurtsov, “Rapid design closure of linear microstrip antenna array apertures using response features,” IEEE Antennas Wirel. Propag. Lett. **17**(4), 645–648 (2018). [CrossRef]

**124. **S. Koziel and S. D. Unnsteinsson, “Expedited design closure of antennas by means of trust-region-based adaptive response scaling,” IEEE Antennas Wirel. Propag. Lett. **17**(6), 1099–1103 (2018). [CrossRef]

**125. **L. Manica, N. Anselmi, P. Rocca, and A. Massa, “Robust mask-constrained linear array synthesis through an interval-based particle SWARM optimisation,” IET Microw. Antennas Propag. **7**(12), 976–984 (2013). [CrossRef]

**126. **P. Rocca, N. Anselmi, and A. Massa, “Optimal synthesis of robust beamformer weights exploiting interval analysis and convex optimization,” IEEE Trans. Antenn. Propag. **62**(7), 3603–3612 (2014).

**127. **M. Salucci and T. Moriyama, “Robust antenna design through a hybrid inversion strategy combining interval analysis and nature-inspired optimization,” J. Phys. Conf. Ser. **904**(1), 012007 (2017). [CrossRef]

**128. **Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature **521**(7553), 436–444 (2015). [CrossRef] [PubMed]

**129. **J. Peurifoy, Y. Shen, L. Jing, Y. Yang, F. Cano-Renteria, B. G. DeLacy, J. D. Joannopoulos, M. Tegmark, and M. Soljačić, “Nanophotonic particle simulation and inverse design using artificial neural networks,” Sci. Adv. **4**(6), eaar4206 (2018). [CrossRef] [PubMed]

**130. **S. Inampudi and H. Mosallaei, “Neural network based design of metagratings,” Appl. Phys. Lett. **112**(24), 241102 (2018). [CrossRef]

**131. **D. Liu, Y. Tan, E. Khoram, and Z. Yu, “Training deep neural networks for the inverse design of nanophotonic structures,” ACS Photonics **5**(4), 1365–1369 (2018). [CrossRef]

**132. **W. Ma, F. Cheng, and Y. Liu, “Deep-learning-enabled on-demand design of chiral metamaterials,” ACS Nano **12**(6), 6326–6334 (2018). [CrossRef] [PubMed]

**133. **“Creative Commons — Attribution-NonCommercial 4.0 International — CC BY-NC 4.0,” https://creativecommons.org/licenses/by-nc/4.0/.