Numerical optimization of photonic devices is often limited by a large design space the finite-differences gradient method requires as many electric field computations as there are design parameters. Adjoint-based optimization can deliver the same gradients with only two electric field computations. Here, we derive the relevant adjoint formalism and illustrate its application for a waveguide slab, and for the design of optical sub-wavelength gratings.
© 2014 Optical Society of America
When designing optical structures via numerical simulations, we are often lead to compute the gradient of some target function with respect to our design variables: we may, for example, want to assess the sensitivity of our structure to manufacturing imprecisions, or optimize the structure’s performance with respect to a given metric. Computing the gradient of a function usually requires two optical simulations per design parameter. This is fine for simple systems, but intractable for many optical systems of interest (e.g. diffractive elements or photonics circuits) which can contain millions of independent parameters.
Fortunately, if the target function is specified in advance, it is possible to exploit the mathematical structure of a large class of optical simulations to compute the multivariate gradient using only two simulations, no matter how many parameters are present. Next to the primal solve, one has to solve the adjoint problem once. The adjoint solution acts as a kind of influence function regarding the respective target function and avoids to solve the primal problem for each parameter variation. This is the principle idea behind adjoint-based optimization and sensitivity analysis. This approach is well-known to several fields of application as control theory, mesh adaptation, error estimation or propagation [1–4] and has been used very successfully for many years in performing aerodynamic shape optimizations [5,6]. Later, it was used by electrical engineers in the microwave regime [7–10]. In the optical arena, the approach has been used by only a few groups [11–15].
The potential of this approach to optimize complex optical structures with many design parameters can hardly be overstated. For example, optimizing optical lens gratings  typically requires supercomputers or very large clusters. Countless runs with slightly different parameter settings are performed to find an appropriate set of design parameters. Being able to reduce the required number of these computations significantly relaxes the computational requirements. Thus, instead of probing a huge number of parameter settings, we can probe only a few and then perform efficient local optimization in the vicinity of the best settings. Adjoint-based optimization is an elegant method to compute the gradients that are needed for local optimization. This paper shows how to reduce the computational complexity of finding the gradient tremendously: from approximately as many field computations as there are design parameters (which is often in the hundreds or thousands) to two, independently of the number of design parameters. This makes it possible to optimize rather complex photonic systems with respect to an arbitrary number of parameters on a single desktop or laptop computer.
It is worth noticing that there are several different approaches to optimizing photonic structures and devices . One particularly elegant approach is inverse design , where the structure is optimized to fit a particular field. In any case, when an optimization approach uses local gradients, adjoints may be used to efficiently compute them.
This paper derives the relevant equations for adjoint-based optimization and sensitivity analysis for optical systems. In particular, we present a formalism that is well suited for sensitivity analysis and optimization using a Maxwell solver. Therefore, this paper provides a detailed tutorial on the adjoint method used to solve electromagnetic problems. We provide a simple example of a slab waveguide where the adjoint method can be compared to the exact analytic model. We also show that the adjoint method can be effectively applied to the optimization of a grating coupler – a task where the adjoint method has not been applied before.
2. Problem description
In all generality, adjoint-based optimization is used to locally optimize a target function T (x, p) which depends on the physical state of the system x, as well as a number of design parameters p. For optical gratings, for example, the physical state of the system can be described conveniently by the electric field (or sometimes just a single component thereof) at each point of the computational grid. The vector of design parameters p contains, for example, geometric parameters describing the layout of our grating. In order to determine the state of our system x(p), we solve a given vector equation R(x, p) = 0, such as Maxwell’s equations.
For our optimization, we need to know how each of the design parameters affects the value of our target function: we need to compute . For each design parameter pk, this derivative is
Computing via the finite-differences approximation is numerically very inefficient:
As we show below, it is often possible to exploit the mathematical structure of the local optimization and sensitivity analysis problem: the adjoint method allows us to compute the relevant gradient performing only two non-trivial simulations. No matter how many design variables we are dealing with.
3. Derivation of adjoint approach
Using the adjoint method, we are able to determine using only one solution of R(x, p) = 0, and one solution of the adjoint equation. As long as the latter is is of similar numerical complexity as the direct equation R, the adjoint-based optimization is likely to reduce the computational cost of , which we are interested in.
Our starting point is the fact that, for any set of design parameters p, and any corresponding state x, we know that the governing vector equation R vanishes,
To illustrate the use of this powerful approach, we will now show how to apply the adjoint method in two concrete situations and derive the relevant equations for optical systems.
4. Optical systems
In this section, we derive the relevant equations for adjoint-based optimization and sensitivity analysis in optical systems, starting the same way as Nikolova et al.  did for the microwave regime. For mathematical simplicity, we consider the dispersion-free, and linear case. Nonetheless, adjoint-based optimization can also be used for non-uniform, anisotropic media. Maxwell’s equations are
In our optical gratings, the specific conductivity tensor σ vanishes. Furthermore, we write our electric field (and source) as E(t) = Re(Ēe−iωt), where Ē ∈ ℂ. Note that the permittivity ε describes the layout of our optical grating and therefore directly depends on our design parameters. We rewrite Eq (11) asEq. (11), we can use any convenient Maxwell solver, such as any FDTD [19, 20] solver. The simulation returns the complex electric field amplitude at each point of the computational grid.
Our target function T is the electric field strength in a specific region Ω
At this point, we are able to derive the adjoint equation for this problem. First, note that for isotropic dielectric media where ε† = ε and μ† = μ we have ℳ† = ℳ. Second, we use Ē computed as the numerical solution of Eq. (12) in Eq. (7) and find . This equation will allow us to compute the adjoint field v̄. Third, we rewrite this equation in a way that shows its correspondence to the direct problem Eq. (12) and conclude that
Both the direct problem Eq. (12) and the adjoint problem Eq. (15) can be simulated using the same Maxwell solver. In fact, we see that the adjoint field v̄ can be interpreted as an electric field that was created by a particular set of sources. The adjoint sources term, , is proportional to the field of the direct solution, wherever we are optimizing the field, and zero everywhere else.
Computing the relevant gradient Eq. (8) is computationally trivial once we have the solutions of the direct problem, Ē, and of the adjoint problem, v̄. Our target function (14) does not depend on the design parameters p. Hence, . Also, the only element of the governing vector equation (12) that depends on the design parameters is the spatially varying dielectric constant ε. Therefore, and the gradient Eq. (8) is simply
5. Example 1: waveguide slab
Consider the text-book example of a waveguide slab of thickness h, described by the cover material index nc, the guiding film index nf, and the substrate index ns. We would like to know what modes are supported by this waveguide slab for a given wavelength λ. The vacuum wave vector is . The characteristic equation for transverse electric (TE) modes isEq. (17) means finding one or all β for which this equation holds.
If we were to ask how a specific value of β changes as we modify, for example, the height of the slab, we would usually have to solve Eq. (17) for different values of h. Worse, even, if we were to modify the indices nc, nf, and ns, too, we would have to solve that same equation over and over again to derive how β depends on them. Nowadays computers can solve this transcendental equation really easily, but for the sake of explaining the adjoint method, pretend that this is an arduous task.
To use the adjoint method, we rewrite Eq. (17) as
We know from the above Eq. (6) that we can avoid these repetitive calculations if we find a v such that
The derivative of our characteristic equation with respect to the longitudinal wave vector isEq. (20), Eq. (28), Fig. 1, the adjoint method leads to the correct gradient. However, is far more efficient for computing multi-variable optimizations than finite-differences gradients.
6. Example 2: grating coupler
Silicon grating couplers are used in photonic circuits to couple light between an on-chip waveguide and an off-chip optical fiber . The exact groove pattern (width, position, depth of each groove) can be optimized to provide the best possible coupling efficiency . For the sake of illustrating the adjoint method, we will restrict the design space to two variables, the uniform pitch p and duty cycle d of the grating grooves. Our simulations are assuming a waveguide thickness of 0.22μm, a groove-depth of 0.08μm, a wavelength of 1.55μm.
The vector of our design parameters isFigure 2 shows the effect of these design parameters on the design of our grating coupler.
To illustrate the effect of adjoint-based optimization, we computed the full topology of our optimization problem for this example (colored background in Fig. 3). We see the areas of best coupling in dark red, and those of worst coupling in dark blue. Of course, computing the full topology is computationally impossible for most real-world applications. Here, it serves intuition and helps to verify that our algorithm works, indeed, as steepest ascent. Furthermore, it helped us select rather poor starting points for our optimization.
To perform the actual adjoint-based optimization, we select a number of starting points that serve as initial parameters. For each point, we
- compute the direct electromagnetic field for specific design parameters (12),
- compute the adjoint field for the same design parameters (15),
- combine the direct and adjoint solutions to (trivially) compute the gradient (16),
- update the design parameters for steepest ascent,
- repeat steps 1–4 until convergence.
Figure 3 shows the result of the full optimization (steps 1–5), and the (usually unknown) topology of our problem. In order to compute these optimization runs, finite-differences gradient optimization would have required three non-trivial field computations per gradient. Adjoint-based optimization, in contrast, only required two non-trivial field computations to find the same gradient. Thus, adjoint based optimization is already slightly faster than finite-differences based optimization for two design parameters.
Consider a grating optimization starting at a pitch of 0.62 and a duty cycle of 0.73 (black star). Adjoint-based optimization evolves these initial parameters to a pitch of 0.60 and a duty cycle of 0.275. Figure 4 shows the result of Meep  simulations of real(Ē) before and after optimization. The coupling of the incident light from the top to inside the waveguide (inside the black square) has improved from 0.39 to 0.88.
For n design parameters, the advantage of adjoint-based optimization would be even more striking: direct computation of the finite-differences gradient would require at least n + 1 field computations per gradient, whereas the adjoint method still only requires two.
We have shown how to use the adjoint method to efficiently design of optical structures. The key of this method is to compute the gradient of the target function with respect to the design parameters using only two non-trivial computations, independently of the number of design parameters. We illustrated this method in two cases: the optical waveguide slab, and a grating coupler simulated using a standard Maxwell solver. For the latter, the converged electric field Ē and the corresponding adjoint field v̄ are the only non-trivial computations required to calculate the gradient, needed to perform gradient ascent (or gradient descent) on the design parameters. We believe this method will prove itself immensely valuable in the design of various grating structures, including a host of non-periodic diffractive structures.
A.N. thanks Hideo Mabuchi, Erling Riis, and Thomas Baer for their mentorship during his time at Stanford University. Furthermore, he thanks Charles Santori, Marco Fiorentino, Sonny Vo, and Wayne Sorin for interesting discussions on how to design optical devices. This work was supported by the SU2P Program of Stanford University and Research Councils UK.
References and links
1. J. L. Lions, Optimal Control of Systems Governed by Partial Differential Equations (Springer, 1971). [CrossRef]
2. R. Becker and R. Rannacher, “An optimal control approach to error control and mesh adaption in finite element methods,” Acta Numerica 10, 1–102 (2001). [CrossRef]
3. K. Eriksson, D. Estep, P. Hansbo, and C. Johnson, “Introduction to adaptive methods for differential equations,” Acta Numerica 4, 105–158 (1995). [CrossRef]
4. D. E. Rumelhart, G. E. Hinton, and R. J. Williams, “Learning internal representations by error propagation,” Parallel Data Processing 1, 318–362 (1986).
5. J. Reuther, A. Jameson, J. J. Alonso, M. J. Remlinger, and D. Saunders, “Constrained multipoint aerodynamic shape optimization using an adjoint formulation and parallel computers, part 1,” Journal of Aircraft 36(1), 51–60 (1999). [CrossRef]
6. J. Reuther, A. Jameson, J. J. Alonso, M. J. Remlinger, and D. Saunders, “Constrained multipoint aerodynamic shape optimization using an adjoint formulation and parallel computers, part 2,” Journal of Aircraft 36(1), 61–74 (1999). [CrossRef]
7. Y. seek Chung, Changyul-Cheon, I.-H. Park, and S.-Y. Hahn, “Optimal shape design of microwave device using fdtd and design sensitivity analysis,” Microwave Theory and Techniques, IEEE Transactions on 48, 2289–2296 (2000). [CrossRef]
8. N. Georgieva, S. Glavic, M. Bakr, and J. Bandler, “Feasible adjoint sensitivity technique for em design optimization,” Microwave Theory and Techniques, IEEE Transactions on 50, 2751–2758 (2002). [CrossRef]
9. N. K. Nikolova, H. W. Tam, and M. H. Bakr, “Sensitivity analysis with the fdtd method on structured grids,” Microwave Theory and Techniques, IEEE Transactions on 52, 1207–1216 (2004). [CrossRef]
10. N. K. Nikolova, Y. Li, Y. Li, and M. H. Bakr, “Sensitivity analysis of scattering parameters with electromagnetic time-domain simulators,” Microwave Theory and Techniques, IEEE Transactions on 54, 1598–1610 (2006). [CrossRef]
12. Y. Jiao, S. Fan, and D. A. B. Miller, “Photonic crystal device sensitivity analysis with wannierbasis gradients,” Optics Letters 30, 302–304 (2005). [CrossRef]
13. P. Seliger, M. Mahvash, C. Wang, and A. F. J. Levi, “Optimization of aperiodic dielectric structures,” Journal of Applied Physics 100, 034310 (2006). [CrossRef]
15. O. D. Miller, C. W. Hsu, M. T. H. Reid, W. Qiu, B. G. DeLacy, J. D. Joannopoulos, M. Soljačić, and S. G. Johnson, “Fundamental limits to extinction by metallic nanoparticles,” Physical Review Letters 112, 123903 (2014). [CrossRef] [PubMed]
16. D. Fattal, J. Li, Z. Peng, M. Fiorentino, and R. G. Beausoleil, “Flat dielectric grating reflectors with focusing abilities,” Nature Photonics 4, 466–470 (2010). [CrossRef]
17. V. Liu, D. Miller, and S. Fan, “Highly tailored computational electromagnetics methods for nanophotonic design and discovery,” Proceedings of the IEEE 101, 484–493 (2013). [CrossRef]
19. R. Courant, K. Friedrichs, and H. Lewy, “Über die partiellen Differenzengleichungen der mathematischen Physik,” Mathematische Annalen 100, 32–74 (1928). [CrossRef]
20. K. Yee, “Numerical solution of initial boundary value problems involving maxwell’s equations in isotropic media,” Antennas and Propagation, IEEE Transactions on 14, 302–307 (1966). [CrossRef]
21. D. Taillaert, W. Bogaerts, P. Bienstman, T. Krauss, P. van Daele, I. Moerman, S. Verstuyft, K. De Mesel, and R. Baets, “An out-of-plane grating coupler for efficient butt-coupling between compact planar waveguides and single-mode fibers,” Quantum Electronics, IEEE Journal of 38, 949–955 (2002). [CrossRef]
23. A. F. Oskooi, D. Roundy, M. Ibanescu, P. Bermel, J. D. Joannopoulos, and S. G. Johnson, “MEEP: A flexible free-software package for electromagnetic simulations by the FDTD method,” Computer Physics Communications 181, 687–702 (2010). [CrossRef]