Leveraging continuous material averaging for inverse electromagnetic design

Andrew Michaels; Eli Yablonovitch

doi:10.1364/OE.26.031717

1. Introduction

As integrated photonics continues to mature, we have witnessed a growing desire for more compact and efficient passive optical components. At the same time, our ability to satisfy these demands using either analytic methods or by tuning only a small set of device parameters is becoming increasingly difficult. To combat this trend, more sophisticated optimization methods are proving very useful. In particular, topology and shape optimization enable us to efficiently design devices with hundreds, thousands, or even millions of independent design parameters. In the case of shape optimization, the boundaries between different materials composing an initial structure are modified in order to minimize a desired figure of merit (i.e., a function which quantifies the performance of the device being optimized) [1]. Similarly, in the case of topology optimization, a starting structure is modified in order to minimize a figure of merit, however fewer constraints are placed on how the structure can evolve, and in particular its topology may be modified (e.g., the creation or elimination of holes) [2].

Shape and topology optimization have found extensive application in structural mechanics, and work on both methods dates back more than 30 years [1–4]. Despite the maturity of this field, in comparison, less application of these methods has found its way into the optics and photonics community. Early work on gradient-based optimization of microwave devices [5,6] demonstrated application of shape optimization to electromagnetics. This work was followed by the application of topology optimization to photonic crystals [7] and later to a greater variety of passive photonic components [8]. Within the last five years, however, the photonics community has witnessed a steady rise in interest in these optimization techniques as demonstrated by the numerous optimizations of efficient splitters, couplers, etc [9–18].

A large portion of this prior work has focused on implementing topology optimization techniques and has emphasized problems in which the design space consists of thousands or even millions of parameters. In particular, the idea of choosing the permittivity at each point in a discretized domain as independent design variables has gained popularity. While this approach has proven useful for generating solutions without requiring significant intuition about the appearance of the final structure, it suffers two primary disadvantages. First, in order to ensure that the device can be fabricated, a final post-processing step is often required to convert a grayscale material distribution to a binary material distribution [19]. As a result, the final solution may not be truly optimal or may require additional shape optimization steps. Second, there are many problems in which either the geometry of the structure is constrained in some way (for example, a one dimensional grating coupler consisting of strictly rectangular segments) or for which topological changes are unnecessary. In these cases, shape optimization techniques may be better suited.

Shape optimization serves as a remedy to both of these problems: it can be used in order to fine-tune the result of a topology optimization and is also well suited to handling problems in which the general shape of the structure is known beforehand. Shape optimization has an additional benefit in that it gives us considerable freedom over the choice of design space–we are free to choose arbitrary parameters, such as the length of a rectangle or the position of a circle, as independent design parameters.

In previous work in electromagnetics, shape optimization has frequently been used in conjunction with simulation methods that use an unstructured mesh (notably either a finite element method [5] or a variant of the finite difference time domain technique [6]). In the nanophotonics community, however, the finite difference time domain method (FDTD) is preferred due to its relative simplicity and computational efficiency. Unlike the finite element method, FDTD represents materials on a rectangular grid which is not conducive to the representation of non-rectangular material boundaries. This poses an even greater difficulty to shape optimization as the rectangular grid makes it difficult to achieve continuous modifications to material boundaries, a process which is essential to computing accurate gradients of a figure of merit (or sensitivity analysis). These gradients are helpful to the efficient minimization of the figure of merit, and it is thus desirable that we devise a way to achieve continuous perturbations to boundaries expressed on a rectangular grid.

To this end, we have implemented a boundary smoothing method which maps continuously-defined material boundaries onto a rectangular grid in a way which is differentiable. This technique, which is similar to so called “fictitious domain methods” in the structural engineering community [20], is effectively a hybrid of grayscale and Manhattan representations of material distributions as depicted in Fig. 1. Unlike the latter two methods, our boundary smoothing allows the material value to change smoothly between two values only in grid cells which intersect the boundary of material domains as shown in Fig. 1. Infinitesimal modifications to the boundary of a material are reflected in infinitesimal changes to the effective permittivity of the intersecting grid cells. This enables us to calculate accurate gradients of a figure of merit with respect to many arbitrary variables with only two simulations using the adjoint method. These gradients can then be fed into a variety of minimization algorithms in order to optimize our structure. It is worth noting that topology optimization methods which ensure that the underlying material distribution is changed in a differentiable manner can equally take advantage of the adjoint method, however our method is particularly well suited to optimizations in which the structure’s boundaries are directly manipulated. This intuitive boundary smoothing method, which in the past has been considered primarily for the purpose of improving simulation accuracy, can enable us to easily optimize electromagnetic structures which would otherwise prove very challenging.

In this paper, we will briefly review the adjoint method which is used to compute the gradient of the figure of merit and highlight the importance of smooth boundary modifications. We will then present and validate our boundary smoothing method and finally use it to optimize an efficient short waveguide taper with a constrained feature size. In order to help others in the community tackle similar design problems, we have released our own electromagnetic optimization tool which is built upon the methods discussed in this manuscript as open source software [21].

Fig. 1 Comparison of different strategies for representing abrupt material boundaries on a rectangular grid. On the far left, the desired structure is shown (a circle). The second image depicts a grayscale permittivity distribution which yields an approximation of the circle when a threshold is applied. The third image depicts a strictly binary grid (“Manhattan”) representation of the circle. The final picture depicts the circle represented using our own boundary smoothing method which best captures the circular boundary.

Download Full Size | PDF

2. The adjoint method

Optimization (or “inverse design”) of an electromagnetic device involves modifying an initial design in order to maximize (or minimize) a figure of merit which describes the device’s performance. In order to modify the design, we specify design parameters which define the shapes of material boundaries and hence the structure of the device. There are many strategies for choosing the values of our design parameters in order to maximize our figure of merit; of particular interest are gradient-based minimization techniques which are very efficient assuming we can inexpensively compute the gradients of our figure of merit (i.e. the derivative of the figure of merit with respect to each design parameter). The simplest way to do this is by a brute force approach in which each design parameter is independently varied and a separate simulation is run in order to determine how the figure of merit changes. If M design variables specify the shape of the device, then this method requires M + 1 simulations per gradient calculation, which quickly becomes impractical for even modest values of M.

It turns out that through clever application of the chain rule, we can find the gradient of our figure of merit using only two simulations. This process is called the adjoint method or the adjoint variable method and has found widespread application in many fields, including electromagnetics (as demonstrated by [4–8, 10–12, 15–19, 22–25] which employ it). For the sake of clarity and notation, we will rederive the general expressions of the adjoint variable method and point out some key results relevant to boundary smoothing.

Our goal is to minimize a figure of merit F(E, H) which depends on the spatially-dependent electric (E) and magnetic (H) fields. The fields are found by solving Maxwell’s equations, given by

\nabla \times E - i ω μ H = M

\nabla \times H + i ω ε E = J

which we have chosen to write in their time harmonic form. As is the case with most linear partial differential equations, we can discretize these equations on a rectangular grid and rewrite them in matrix form as

A \vec{x} = \vec{b}

where A is a matrix containing the curls expressed using finite differences and the permittivity and permeability at each point in the discretized space, x⃗is a vector containing the electric and magnetic fields at all points in space, and b⃗ contains the inhomogenous electric and magnetic current density at all points in space. Depending on how unknowns are ordered when assembling the system, A, x⃗, and b⃗ might take the form

A = (\begin{matrix} i ω ε (r) & \nabla \times \\ \nabla \times & - i ω μ (r) \end{matrix}), \vec{x} = (\begin{matrix} \vec{E} \\ \vec{H} \end{matrix}), \vec{b} = (\begin{matrix} \vec{J} \\ \vec{M} \end{matrix})

where E⃗, H⃗, J⃗, and M⃗ are N × 1 vectors containing the discretized field and current density values. In our discretized world, our figure of merit becomes a function of x⃗ = [E⃗ H⃗]^T which we write as F(x⃗).

If we had direct control over the electric and magnetic fields in x⃗, finding the gradient of F(x⃗) would require only simple differentiation with respect to E⃗ and H⃗. Instead of directly controlling E⃗ and H⃗, we have control over the permittivity and permeability defined everywhere in space which may be specified using a structured set of design parameters (like the dimensions of shapes, the positions of shapes, the coordinates of polygon vertices, etc). If we write this set of design variables as p⃗ = [p₁, p₂, · · ·, p_M]^T, then the gradient we are interested in is the set of derivatives of F with respect to each p_i, i.e.

{\vec{\nabla}}_{p} F = [\frac{\partial F}{\partial p_{1}}, \frac{\partial F}{\partial p_{2}}, \dots, \frac{\partial F}{\partial p_{M}}]

To find these derivatives, we begin by applying chain rule when differentiating F(x⃗). Consider the i’th derivative for the simple case in which F is an explicit function of the electric and magnetic fields only:

\frac{\partial F}{\partial p_{i}} = 2 Re {\frac{\partial F}{\partial \vec{x}} \frac{\partial \vec{x}}{\partial p_{i}}}

Because the fields in x⃗ are complex valued, we must be careful when taking their derivatives, hence the appearance of the 2Re{…}.

Notice that the derivative ∂F/∂x⃗ is already known since the figure of merit is an explicit function of the electric and magnetic fields. The second term in Eq. (6), ∂x⃗/∂p_i, remains to be found. This is accomplished by directly differentiating our system of equations given in Eq. (3) with respect to p_i and multiplying by A⁻¹, which yields

\begin{array}{l} \frac{\partial}{\partial p_{i}} (A \vec{x}) = \frac{\partial \vec{b}}{\partial p_{i}} \\ \Rightarrow A \frac{\partial \vec{x}}{\partial p_{i}} = \frac{\partial \vec{b}}{\partial p_{i}} - \frac{\partial A}{\partial p_{i}} \vec{x} \\ \Rightarrow \frac{\partial \vec{x}}{\partial p_{i}} = A^{- 1} (\frac{\partial \vec{b}}{\partial p_{i}} - \frac{\partial A}{\partial p_{i}} \vec{x}) . \end{array}

Notice that the Maxwell operator A contains the distribution of permittivity and permeability in the system which is directly controlled by the design parameters p⃗. Therefore, ∂A/∂p_i is knownor assumed to at least be easily computable. The derivative of the current sources with respect to the design parameters, ∂b⃗/∂p_i, can be calculated, although in most cases we will assume that the inputs to the system are fixed and this term will be zero. In this case, Eq. (7) becomes

\frac{\partial \vec{x}}{\partial p_{i}} = - A^{- 1} \frac{\partial A}{\partial p_{i}} \vec{x} .

Substituting this expression for ∂x⃗/∂p_i in Eq. (6), we find an expression for the i’th derivative of F in terms of known quantities:

\frac{\partial F}{\partial p_{i}} = - 2 Re {\frac{\partial F}{\partial \vec{x}} A^{- 1} \frac{\partial A}{\partial p_{i}} \vec{x}}

This expression can be written in a more enlightening way by introducing a new vector given by

{\vec{y}}^{T} = \frac{\partial F}{\partial \vec{x}} A^{- 1}

which we can rewrite in the form Ax⃗ = b⃗ by multiplying by A and taking the transpose of both sides:

A^{T} \vec{y} = {(\frac{\partial F}{\partial \vec{x}})}^{T}

Solving this expression for y is similar to solving Maxwell’s equations where ∂F/∂x⃗ acts as the current sources. In general, the discretized form of Maxwell’s equations is not symmetric, and therefore the forward and adjoint equations are not identical. Substituting Eq. (10) into (9), we obtain a final expression for the derivative of our function F with respect to the design variables of the system:

\frac{\partial F}{\partial p_{i}} = - 2 Re {{\vec{y}}^{T} \frac{\partial A}{\partial p_{i}} \vec{x}} .

In order to solve for all M derivatives of F with respect to p_i, we need to compute the physical electric and magnetic fields represented by x⃗ as well as a second set of non-physical “adjoint” fields represented by y⃗. We can intuitively think about these adjoint fields as the fields that are produced by injecting the desired output fields (which arise from currents ∂F/∂x⃗) into the system and running the whole system backwards. Solving for x⃗ and y⃗ each correspond to a single “forward” and “adjoint” simulation, respectively, and thus solving for the gradient of F requires two simulations, independent of the number of design variables. This is the great advantage of the adjoint method.

In addition to the forward and adjoint simulations, an essential component of the adjoint method is the accurate calculation of ∂A/∂p_i which contains the information about how the distribution of materials in the system is controlled by the design parameters p⃗. In particular, the discretized equations which are assembled into A can be ordered such that the permittivity and permeability values are contained in the diagonal of A as indicated by Eq. (4). Typically, when working on a rectangular grid, the discretization of the problem will remain unchanged as the design variables are modified. If this is true and the materials present are either isotropic or diagonally anisotropic, then all of the off-diagonal elements of the system matrix will not change with respect to changes to the design variables and hence the off-diagonal elements of dA/dp_i will be zero. This leaves only the diagonal elements which contain the derivatives of the permittivity and permeability at each point in space. In this case, assuming all permittivities and permeabilities are isotropic, the derivatives of the figure of merit are greatly simplified to

\begin{array}{l} \frac{\partial F}{\partial p_{k}} & = - 2 Re {\sum_{j} i ω \frac{\partial ε_{j}}{\partial p_{k}} E_{j} \cdot E_{j}^{adj} - \sum_{j} i ω \frac{\partial μ_{j}}{\partial p_{k}} H_{j} \cdot H_{j}^{adj}} \\ = 2 ω Im {\sum_{j} \frac{\partial ε_{j}}{\partial p_{k}} E_{j} \cdot E_{j}^{adj} - \sum_{j} \frac{\partial μ_{j}}{\partial p_{k}} H_{j} \cdot H_{j}^{adj}} \end{array}

where the field quantities with the superscript “adj” are contained in y⃗. It is interesting to note that this result is consistent with derivations of the continuous adjoint method [10] with the exception that Maxwell’s equations are not assumed to be symmetric in our “discrete” formulation (which can result from anisotropic material tensors, the use of perfectly matched layers to truncate the simulation domain, and nonuniform grids). Based on Eq. (13), it is apparent that our ability to accurately compute the gradient of our figure of merit is contingent on our ability to form a continuous relationship between the design parameters of the system and the distribution of permittivity and permeability (were this not the case, ∂∊/∂p_i and ∂μ/∂p_i would be ill-defined). In the case of representing material boundaries on a rectangular grid, this is not straightforward.

To overcome this issue, we use boundary smoothing techniques which allow us to project continuously defined boundaries onto a rectangular grid. If a boundary intersects a grid cell, an intermediate material value between the two values on either side of the boundary is assigned to the intersected cell. This allows us to make very small perturbations to the continuously-defined boundaries (i.e. boundary shifts smaller than the width of a grid cell) which can be modeled as a slight change in the permittivity/permeability in the grid cells which intersect the boundary. Because these changes to permittivity/permeability can be made in a continuous way, we can calculate ∂∊/∂p_i and ∂μ/∂p_i accurately even on a rectangular grid. In the next section, we explain this boundary smoothing process in detail.

3. Boundary smoothing

When trying to represent an abrupt interface between two materials on a rectangular grid, we inevitably run into the problem of staircasing: representing curved or diagonal boundaries between two different materials on a rectangular grid results in a jagged interface which conforms to the underlying grid. In addition to compromising simulation accuracy, the rectangular nature of the grid poses significant challenges to calculating sensitivities since perturbations smaller than a grid cell are not possible.

A considerable amount of work has been done to improve the treatment of non-rectangular boundaries with finite difference methods for the purpose of improving simulation accuracy. In particular, modification of the FDTD equations have been successfully employed in order to improve the simulation accuracy [26, 27] and the introduction of effective permittivity at material interfaces [28–31] has been demonstrated to improve simulation accuracy in many situations. In many of these works, the process of computing effective intermediate material values, which we refer to as “boundary smoothing,” relies on the calculation of a volume-averaged permittivity/permeability. Despite the central role this averaging plays, comparatively little work has been done on computing these averages in a way that produces effective material values that change smoothly with respect to underlying material boundaries. If this smooth behavior is achieved, this method of smoothing material interfaces becomes a powerful tool for shape optimization.

In order to demonstrate precisely this, we have implemented a simple form of boundary smoothing using weighted averages which is depicted in Fig. 2. In a given grid cell, the effective permittivity in a 2D domain is given by

〈 ε (i, j) 〉 = \frac{1}{Δ x Δ y} \sum_{k} C_{k} ε_{k} (i, j)

where C_k is the overlap area between the k’th material domain and the grid cell at location i, j and Δx and Δy are the grid cell width and height, respectively. For the purpose of computing derivatives, it is essential that C_k be computable with high precision. We accomplish this by representing all boundaries in the system as piecewise linear functions (i.e., polygons) and then computing intersections between the material domains and the grid-cells that intersect the boundaries of those domains. Because these piecewise linear functions are stored with very high (or even arbitrary) numerical precision, infinitesimal modifications to the boundaries are reflected by infinitesimal modifications to the local effective permittivity and permeabilities on the grid. This process is contingent on our ability to efficiently find polygon intersections. Fortunately, this has long been a topic of great importance in computational geometry [32]. Due to the maturity of this field, efficient algorithms for finding the intersection between polygons are readily available, making this simple form of boundary smoothing relatively straightforward to implement.

Fig. 2 Visual depiction of boundary smoothing process. Internally, all material boundaries of the system are represented using polygons which are defined in a continuous domain. These shapes are then mapped onto a rectangular grid by computing the average value of permittivities and permeabilities which overlap with each cell in the grid. This mapping is achieved by computing the overlap area between grid cells and material domains.

Download Full Size | PDF

A demonstration of this process are depicted in Fig. 3. In Fig. 3(a), the smoothed permittivity for a 0.5 μm diameter circle is shown on an intentionally coarse grid. As a result of the smoothing process, the grid cells at the boundary of the circle are filled with an effective permittivity whose value is between the permittivity inside of the circle and the permittivity surrounding the circle. We then test the continuous nature of this smoothing by displacing the circle in the y direction by 10⁻¹² μm. The change in the permittivity as a result of this displacement is shown in Fig. 3(b). The small size of the displacement is reflected by the correspondingly small change in permittivity in the grid cells at the circle’s outer boundary.

Fig. 3 Demonstration of boundary smoothing for a 0.5 μm diameter dielectric circle. (a) shows the relative permittivity of the smoothed grid computed for the circle on an intentionally coarse grid in order to clearly show the averaging which occurs at the circle’s boundary. (b) shows the difference in permittivity between the grid shown in (a) and the grid corresponding to the same circle which has been shifted in the y direction by 10⁻¹² μm. The difference in permittivity is correspondingly small, highlighting the continuous nature of our boundary smoothing.

Download Full Size | PDF

This result demonstrates that we can make infinitesimal changes to boundaries represented on a rectangular grid. These infinitesimal changes translate into the smooth changes in quantities calculated using the simulated fields in response to changes to the simulated structure. This behavior is demonstrated in greater detail in Fig. 4 which plots the scattering cross section of an infinite dielectric cylinder as a function of the radius of the cylinder. In both plots, the cross section is calculated using FDTD with a grid spacing of 30 nm. The cylinder in these simulations is represented using our boundary smoothing which is compared to a strictly binary Manhattan representation of the cylinder as well as to the theoretical value for the scattering cross section. Notice that in the left plot, the cross section of the smoothed cylinder varies smoothly over the full range of radii which spans two grid cells, closely matching the theoretical behavior. The Manhattan representation, meanwhile, varies in a jagged way with changing radius. This behavior is made more apparent by zooming in on a small range of radii (which spans much less than one grid cell) as shown in the left plot of Fig. 4. Unlike the Manhattan representation which decreases in discontinuous steps, the cross section computed using our smoothed representation decreases smoothly, closely matching the theoretical cross section. This behavior is essential to the calculation of accurate gradients of a figure of merit.

Fig. 4 Demonstration of boundary smoothing applied to the calculation of the scattering cross section of an infinite dielectric cylinder (in 2D). The scattering cross sections computed using boundary smoothing (blue curve) and without using boundary smoothing (red curve) are compared to the theoretical value for a range of cylinder radii. On the left, the cylinder radius is varied by an amount equal to twice the grid spacing (30 nm). On the right, the radius is varied over a range much smaller than a single grid step. In both cases the cross-section of the grid-smoothed-cylinder evolves smoothly with changing radius while cross section of the cylinder represented on a strictly binary grid exhibits a step-like behavior. The smooth change in the cross section of the boundary-smoothed-cylinder is highly desirable when calculating sensitivities.

Download Full Size | PDF

With continuous boundary smoothing at our disposal, the derivatives ∂A/∂p_i are easily computed using finite differences. One at a time, each design variable of the system p_i is perturbed by a small amount (e.g., 10⁻⁸ × Δx) and the diagonals of a new matrix A(p_i + Δp) is computed. The derivative is then given approximately by

\frac{\partial A}{\partial p_{i}} \approx \frac{A (p_{i} + Δ p) - A (p_{i})}{Δ p}

which is accurate so long as Δp is sufficiently small, regardless of how coarse or fine the spatial discretization of the underlying simulation is.

Calculation of this derivative combined with solutions to the forward and adjoint problems provide us with the tools we need to optimize integrated photonic devices. To summarize, the process for optimizing a structure using boundary smoothing and the adjoint method is as follows:

Choose initial structure, figure of merit F(x⃗), and design parameters p⃗
Calculate figure of merit and its gradient for current structure
1. Run forward simulation (i.e. solve Ax⃗ = b⃗)
2. Calculate F(x⃗) and ∂F/∂x⃗
3. Run adjoint simulation according to Eq. 11 to find y⃗^T
4. Calculate dA/dp_i using Eq. 14 and Eq. 15 for each design variable
5. Calculate gradient dF/dp⃗ using Eq. 12
Repeat step 2. as required by minimization algorithm until figure of merit has converged to minimum value

It is interesting to note that our boundary smoothing method is conceptually similar to a variety of different “fictitious domain methods” which are regularly employed in the field of structural engineering [20]. In particular, the volume-based material averaging we use is analogous to the “ersatz approach” to the level set method employed by Allaire et al [4] which has since been applied to other topology optimization methods (e.g., the method of moving morphable components [33]). However, in comparison to this level set approach, our method is in many ways easier to grasp conceptually and its implementation is comparatively straightforward. Furthermore, by representing the material values explicitly as polygons (which is supported by most if not all layout tools), translating our designs from a simulated to fabricated structure requires no addition effort. An additional consequence of this explicit polygon representation is the ease with which fabrication constraints may be incorporated into the optimization as is discussed in the next section. Finally, our boundary smoothing technique provides us with great flexibility in how we manipulate structures: it works equally well for geometries which are heavily constrained (e.g., grating couplers which consist of an arrangement of different sized rectangles) as it does for geometries which evolve in a more “free form” manner (as is typically observed with level set methods). This combination of simplicity and flexibility makes our method an attractive alternative to other fictitious domain methods for electromagnetic optimization.

4. Optimization of a non-adiabatic waveguide taper

In order to demonstrate the shape optimization process using boundary smoothing, we have optimized a non-adiabatic waveguide taper which is useful as a compact spot size converter for butt coupling to a fiber or short transition from a waveguide to a grating coupler. Unlike previous work on the design of short tapers [34–36], we allow the geometry of the taper to evolve with greater freedom. This allows us to achieve much more compact transitions than an analytic taper allows [37] without sacrificing performance over a large bandwidth. The taper we optimize is 18 μm long and connects a 500 nm wide input waveguide to a 9 μm wide output waveguide. The structure is made of 220 nm thick silicon clad in silicon dioxide and is reduced to two dimensions using the effective index method. Although we focus on 2D examples in this work, our boundary smoothing methods can be easily extended to 3D slab structures, and, in principle, can be extended to non-extruded 3D structures as well.

In order to simulate the structure, it is excited with the fundamental TM mode of the 500 nm wide input waveguide using a wavelength of 1550 nm. We choose the linear taper depicted in Fig. 5(a) as the initial design. The taper itself is represented using a polygon with 200 vertices on its top edge and the structure is mirrored about the x axis using symmetry boundary conditions. We select the design variables of the problem to be the relative x and y displacement of these 200 vertices from their starting positions as depicted in Fig. 5(b). We thus use 400 design variables in total.

Fig. 5 A short 18 μm taper from a 500 nm silicon waveguide to a 9 μm wide silicon waveguide. The taper is defined as a single polygon containing 200 vertices along its top and bottom diagonal edges. We choose the displacement of the x and y coordinates of these points (labeled δx and δy) as the design parameters of the system. This taper is used to validate the accuracy of gradients computed using boundary smoothing in conjunction with the adjoint method and serves as the starting point of an optimization to demonstrate application of these methods.

Download Full Size | PDF

For all forward and adjoint simulations, we use our own finite difference frequency domain (FDFD) solver. This provides easy access to the internals of the simulator and makes implementing the adjoint simulation straightforward. Although for these examples we use FDFD, our boundary smoothing methods are equally applicable to FDTD (a topic which we hope to cover in a future publication). Finally, for all of our simulations, we use a grid resolution of 25 nm in order to achieve sufficient accuracy.

Our goal in optimizing this taper is to maximize the fraction of input power that is coupled into the fundamental TM mode of the wider output waveguide. The figure of merit which quantifies this is the mode matching efficiency integral. Assuming there is no reflected wave at the output, the continuous form of this mode matching integral, which we derive in Appendix A, is given by

η = \frac{1}{4 P_{m} P_{src}} {| \iint_{A} d A \cdot E \times H_{m}^{*} |}^{2}

where E is the incident electric field, H_m is the desired magnetic field, P_src is the source power, and P_m is the power in the desired fields. The integral is taken over a plane which encompasses the entire desired field profile. In the case of the waveguide taper, H_m corresponds to the magnetic fields of the fundamental TM mode of the larger output waveguide while E is the actual simulated electric field taken along the red line on the right hand side of Fig. 5(a) (labeled “FOM plane”).

In this optimization, we not only seek a design that maximizes efficiency, but also a design that can be fabricated. We accomplish this by applying radius of curvature constraints which prevent the formation of exceedingly small features. We do this by penalizing the mode match efficiency with a penalty function which reduces the figure of merit when the approximate radius of curvature at each point in the polygon falls below a specified minimum radius of curvature. The full figure of merit we use is given by

F (E, H, \vec{p}) = η (E, H) - f_{ROC} (\vec{p})

where, η is the mode match given in Eq. (16) and f_ROC(p⃗) is a differentiable function of the design variables that is positive when the effective radius of curvature calculated at each vertex drops below a minimum radius of curvature and drops quickly to zero as the calculated radius of curvature increases above this minimum (described in detail in Appendix B). In this example, we choose a minimum radius of curvature of 150 nm since it can be easily fabricated. Because our shape optimization gives us full freedom over the parameterization, the incorporation of such constraints is a straightforward matter of modifying the figure of merit, and, unlike some topology optimization methods [16], requires no additional modification of the underlying minimization process.

Having defined a figure of merit and initial design, we are able to evaluate the accuracy of the gradients computed using the adjoint method with boundary smoothing. We do this by computing the gradient of F using Eq. (12) and then comparing it to the brute-force calculation of the gradient. The error between the gradient calculated using the adjoint method and the gradient calculated using brute-force finite differences is determined by evaluating

Error in \nabla F = \frac{| \nabla F_{FD} - \nabla F_{AM} |}{| \nabla F_{FD} |}

where the subscripts “FD” and “AM” refer to “finite difference” and “adjoint method,” respectively. The result of this comparison is shown in Fig. 6 in which gradient accuracy is plotted against the size of the perturbation to each design variable (Δp) that is used to compute ∂A/∂p_i for a range of different grid spacings (smaller simulation equals higher resolution simulation).

Fig. 6 Accuracy of the gradient of the figure of merit computed using the adjoint method and boundary smoothing as a function of the size of the perturbation Δp used in the calculation (normalized to the wavelength λ) for a variety of different simulation resolutions. The error in this gradient is defined with respect to the “brute force” gradient which is computed by sequentially perturbing each design variable of the system by a very small amount, running a new simulation, and then computing the new figure of merit. Greater error is incurred as grid cells further from the original boundary of the shape contribute to the change in permittivity. The error saturates, meanwhile, when the perturbation is small enough such that only grid cells which straddle the boundary contribute to the derivative.

Download Full Size | PDF

When the perturbation Δp is large, grid cells which do not straddle the material boundary contribute to the gradient calculation. This leads to the linear growth in error with perturbation size that we observe in Fig. 6 and which we expect from forward finite differences. When the perturbation Δp is sufficiently small (in this case ≲ 10⁻⁷λ where λ is the wavelength), the error saturates. This indicates that the finite differences used to compute ∂A/∂p⃗ are no longer a dominant source of error in the gradient calculation. We attribute the non-zero minimum error observed to the presence of numerical error that arises during the forward and adjoint solution processes. It is interesting to note that the saturated (minimum) error increases with decreasing simulation resolution. Fitting the minimum gradient errors, we find that they increases as ds^2.19 where ds is the grid spacing. This observation is consistent with similar analyses performed for finite element implementations of the adjoint method [38].

These results demonstrate that even for simulations which are too low resolution to produce useful results, the gradient of the figure of merit will remain accurate. Furthermore, these results highlight the importance of having a boundary smoothing method which behaves smoothly in response to changes in material boundaries. Boundary smoothing based on supersampling techniques is likely infeasible since the permittivity in each grid cell would have to be sampled tens or hundreds of thousands of times in order to achieve a high level of accuracy, a process which would be exceedingly computationally expensive. Finally, it is important to note that this gradient error analysis does not need to be repeated for every different problem in order to determine an appropriate step size. In practice we have found that as long as we choose a step size that is very small relative to the grid spacing (typically 10⁻⁸ × ds has proven to be a good choice), we can be confident that the error in the gradient will be at its minimum.

Based on these results we choose the perturbation size to be Δp = 10⁻⁸λ in order to ensure that the computed gradients are as accurate as possible. These gradients are used with the Broyden-Fletcher-Goldfarb-Shanno (BFGS) algorithm, which we found worked best for this particular problem, in order to minimize the figure of merit and thus optimize the geometry of the waveguide taper.

The real part of H_z is plotted for the initial design in Fig. 7. As is clearly visible, the wavefronts become curved as they propagate along the linear taper. This curving of the wavefronts results in significant coupling to higher order modes in the output waveguide, which is reflected by an initial efficiency of only 51%. Beginning with this structure, we run the optimization until the figure of merit changes by less than 10⁻⁴ which takes 97 iterations of the minimization algorithm. The whole process takes roughly two hours and thirty three minutes when run on 14 cores of a server which has two Intel Xeon E5-2697 processors and 128 GB of 2133 MHz DDR4 memory. The structure resulting from this optimization process is shown in Fig. 8. As desired, the wavefronts leaving the device are flat and there is no visible wave interference in the output waveguide that results from the excitation of higher order modes. This final structure has an efficiency of just over 99% (−0.041 dB) at the design wavelength of 1550 nm.

Fig. 7 The initial structure overlaid with the real part of z-component of the magnetic field, H_z. Notice that the abrupt transition from the input to output waveguide results in significant curvature of the propagating wavefronts. This results in significant coupling to higher order modes in the output waveguide, and the fraction of power propagating in the fundamental mode of the output waveguide is only 51%.

Download Full Size | PDF

Fig. 8 The final optimized structure overlaid with the real part of H_z. The optimized structure is very successful in coupling the input power into the fundamental mode of the output waveguide, and no curving of the outgoing wavefronts is visible. This is reflected in the final efficiency of the structure of over 99% at the design wavelength.

Download Full Size | PDF

The full extent of the improvement in performance due to optimization is made clearly evident by comparing the slices of the magnetic field (H_z) taken perpendicular to the output waveguide for the first and last iteration as shown in Fig. 9. In the first iteration, the simulated field amplitude and phase deviate significantly from the desired profiles. In particular, the phase varies by approximately π radians across the width of the output waveguide, which reflects the significant curvature visible in the full two dimensional plot of the fields and is indicative of the excitation of higher order modes.

Fig. 9 Slices of the fields shown in Fig. 7 and Fig. 8 taken along a vertical line at the right edge of the simulation domain (where the figure of merit is computed). The top row shows the magnitude of H_z for the field at the beginning of the first iteration (left) and the field after the 97th iteration (right). The bottom row, meanwhile, shows the phase of H_z for the first (left) and 97th (right) iteration. The fields of the initial structure deviate significantly from the desired fields both in terms of amplitude and phase. The fields of the optimize solution, meanwhile, match very closely to the desired fundamental mode, with its phase deviating only at the edges where the amplitude of the field is exceptionally low (and thus incurring minimal loss).

Download Full Size | PDF

After the 97th iteration, on the other hand, the simulated magnetic field amplitude is almost indistinguishable from the desired amplitude profile. Similarly, within the output waveguide, the simulated phase matches the desired phase, deviating by less than one tenth of a radian from the desired flat phase. While the phase deviates significantly outside of the waveguide, this is not a problem since the amplitude of the field is essentially zero in this region. The close match in amplitude and phase to the desired field reflects the final calculated efficiency of just over 99% at 1550 nm.

In addition to the optimized design exhibiting extremely high efficiencies at the design wavelength, it also performs favorably over a wide range of wavelengths as demonstrated by Fig. 10a. The optimized design exhibits a 3 dB bandwidth of approximately 420 nm. This wide bandwidth makes these tapers well suited for use with grating couplers whose bandwidth is generally less than ∼100 nm. Because the efficiency of the taper does not drop by more than roughly one third of a dB over a 100 nm range about the design wavelength, the combined bandwidth of the optimized taper plus a typical silicon grating coupler would not be significantly reduced compared to the bandwidth of the grating coupler alone.

Fig. 10 (a) Plots of the final figure of merit versus wavelength and (b) the figure of merit versus iteration of the optimization. The optimization achieves a figure of merit of over 90% (−0.46 dB) in under 20 iterations, eventually reaching a final efficiency of just over 99% (−0.041 dB) at the design wavelength. Although the optimization is only performed for a single wavelength, the final structure shows excellent broadband performance with a −3 dB bandwidth of approximately 420 nm. The final structure can thus be used in conjunction with other components such as grating couplers without significantly reducing the overall bandwidth of the combined component.

Download Full Size | PDF

It is worth noting that our optimized waveguide taper exhibits a stronger sensitivity compared to the original linear taper. This is likely a result of the introduction of smaller wavelength-scale features by the optimization. This result is not surprising, however, because we considered only a single wavelength when computing our figure of merit. We partially mitigate this issue by enforcing a radius of curvature constraint; by limiting the minimum size of features, we avoid the development of very fine structures which tend to exhibit a stronger wavelength dependence.

Despite the increase in sensitivity to wavelength, our optimized design still performs better than longer linear tapers over a limited bandwidth. For example, the optimized taper has a higher efficiency than a 50 μm linear taper over a 144 nm range as shown in Fig. 10a. Similarly, compared to a 100 μm long linear taper, our optimized design has a higher efficiency over an 80 nm bandwidth. Thus in cases where a more modest bandwidth is needed, the optimized design can achieve higher efficiencies than can be achieved by linear tapers that are more than five times as long. In order to achieve equal performance at the design wavelength of 1550 nm, a linear taper with a length of over 180 μm (an order of magnitude longer than the optimized design) is required.

The strength of the optimization methods we have employed not only lie in the performance of the final result, but also in the method’s ability to rapidly converge to an optimal solution. Consider how the efficiency of the taper evolves with each iteration of the optimization plotted in Fig. 10b. Within the first two iterations of the optimization alone, the figure of merit is increased by approximately 20% (from −3 dB to −1.5 dB). After the first 20 iterations, the figure of merit has further increased to over 90% (−0.46 dB), which is then followed by slower improvement as the gradients of the figure of merit become smaller and smaller. Convergence to the final efficiency is smooth as a result of the minimization algorithm used.

It is important to recognize that the minimization algorithm we used is a local optimization method which makes no guarantees that a global optimum is found. In general, however, finding global optima would be ideal. Heuristic techniques like the genetic algorithm are often employed in electromagnetics in order to increase the likelihood of finding a global optimum. These algorithms often require thousands of simulations before an optimum is found [39,40], with no guarantees that it is a global optimum. The potential advantages are thus not always perfectly evident. In many problems, however, we can largely eliminate these concerns by taking advantage of our own physical intuition of the problem and “guessing” a good starting design. In the case of our waveguide taper, for example, we know that linear transitions (which work well at longer lengths) are well behaved, exhibit low reflections, and suffer primarily from diffraction effects that cause the propagating wavefront to become curved. It is thus not surprising that beginning with the linear taper leads to a local optimum with such high efficiency and wide bandwidth.

5. Conclusion

This example demonstrates the great utility of shape optimization in electromagnetics: given a goal (i.e., figure of merit) and an initial guess for a design, we can quickly design passive components which outperform anything we could have otherwise designed by hand. This is made possible by boundary smoothing which gives us substantial freedom in how we represent material boundaries and enables us to calculate accurate gradients of a figure of merit with respect to large numbers of design parameters when used with the adjoint method. With these accurate gradients at our disposal, we can employ a wide range of very powerful minimization algorithms. Many of these minimization methods can be used in place of BFGS for electromagnetic shape optimization without any modification to this work.

Combining our physical intuition of electromagnetics with gradient-based shape optimization provides us with a path forward to improve many of the components in integrated optics. Based on the success of topology and shape optimization in other fields and the ease with which it can be applied to electromagnetics, we expect inverse electromagnetic design to become an essential component of the future nanophotonic engineer’s toolbox. To facilitate this, we have released our own optimization tool, of which boundary smoothing is a core component, as open source software [21].

A. Mode matching integral

Mode overlap, or mode-matched efficiency, refers to the degree to which an incident field can couple into a desired mode of a system. Computing the mode overlap is essential to determining the efficiency of many optical devices and is therefore highly relevant to shape optimization. For the reader’s benefit, we present a relatively detailed derivation of the mode overlap integral here.

Our first step determining the mode-matched efficiency is to express our input field as a sum of the allowed propagating modes of the system. Specifically, the basis we will use consists of the electric and magnetic fields of both forward and backward traveling waves. We begin by writing the electric field as a sum of these basis functions,

E = E_{fwd} + E_{back}

= \sum_{m} (a_{m} e^{i k_{m} z} + b_{m} e^{- i k_{m} z}) E_{m}

where the two terms in parentheses correspond to the forward and backward components and are given separately by

E_{fwd} = \sum_{m} a_{m} e^{i k_{m} z} E_{m}

E_{back} = \sum_{m} b_{m} e^{- i k_{m} z} E_{m} .

In the expressions above, E_m is the electric field profile of the mth mode of the system. The magnetic field, meanwhile, can be written in a similar form. Applying Faraday’s Law and assuming harmonic time dependence of the electric field, the magnetic field can be written as an expansion of the forward and backward wave

H = H_{fwd} + H_{back}

= \sum_{m} (a_{m} e^{i k_{m} z} - b_{m} e^{- i k_{m} z}) H_{m}

Notice in Eq. (24), that the backward propagating term is preceded by a negative sign. This arises out of the requirement that power flow in the negative direction (and hence the Poynting vector, E_m × H_m point in the negative direction). In both Eq. (24) and (20), we have chosen to write the fields as a sum of the forward and backward traveling waves. In general, given an arbitrary field, we will not know the forward and backward traveling components but only their sum. Our goal now is to develop the machinery needed to separate the different forward and backward traveling modes which compose an arbitrary field.

This is equivalent to finding the coefficients a_m and b_m. To do so, we must take advantage of the orthogonality condition of our electromagnetic basis which arises as a result of Lorentz reciprocity and is given by [41]

\frac{\iint_{A} d A \cdot E_{m} \times H_{n}^{*}}{\iint_{A} d A \cdot E_{m} \times H_{m}^{*}} = δ_{m n}

where δ_mn is the Kronecker delta. We apply this orthogonality condition by computing the surface integral of

E \times H_{m}^{*}

and rearranging terms yields an expression relating a_m and b_m to the electric field and mth mode

a_{m} e^{i k_{m} z} + b_{m} e^{- i k_{m} z} = \frac{\iint_{A} d A \cdot E \times H_{m}^{*}}{S_{m}}

where S_m is related to the power propagating in the mth mode

S_{m} = \iint_{A} d A \cdot E_{m} \times {H_{m}}^{*} .

A second expression for a_m and b_m can be found by computing the surface integral of E_m × H^* which yields

a_{m} e^{i k_{m} z} - b_{m} e^{- i k_{m} z} = \frac{\iint_{A} d A \cdot E_{m}^{*} \times H}{S_{m}^{*}} .

With two equations and two unknowns, we are now able to solve for the coefficients. Adding the two equations produces an expression for a_m

a_{m} = \frac{1}{2} e^{- i k_{m} z} (\frac{\iint_{A} d A \cdot E \times H_{m}^{*}}{S_{m}} + \frac{\iint_{A} d A \cdot E_{m}^{*} \times H}{S_{m}^{*}})

while subtracting them yields and expression for b_m. Using these equations, decomposing a given field into the modes of the system is a straightforward calculation. Mode matching requires that we take this one step further and determine how much power is propagating in a desired mode. To find this, we compute the power propagating through a plane in the forward wave, i.e.

P_{fwd} = \frac{1}{2} Re {\iint d A \cdot E_{fwd} \times H_{fwd}^{*}}

. This calculation results in a sum whose terms correspond to the power propagating in each mode. This power is more conveniently expressed as a fraction of the total power propagating in the field:

η_{m}^{fwd} = \frac{P_{m}}{P_{in}} = {| a_{m} |}^{2} \frac{Re {S_{m}}}{Re {\iint_{A} d A \cdot E \times H^{*}}} .

Eq. (30) describes the amount of power that can couple from an incident field into a desired mode of the system. This expression can be further simplified by noticing that in most problems, a backward traveling wave is not present at the output of the system. In this case, b_m equals zero and as a result

\frac{\iint_{A} d A \cdot E \times H_{m}^{*}}{S_{m}} = \frac{\iint_{A} d A \cdot E_{m}^{*} \times H}{S_{m}^{*}}

Taking this into account allows us to simplify Eq. (30) to

η_{m} = \frac{\frac{1}{2} Re {\iint_{A} d A \cdot E_{m} \times H_{m}^{*}} {| \iint_{A} d A \cdot E \times H_{m}^{*} |}^{2}}{\frac{1}{2} Re {\iint_{A} d A \cdot E \times H^{*}} {| \iint_{A} d A \cdot E_{m} \times H_{m}^{*} |}^{2}}

where E_m and H_m are the field profiles that we desire the grating to generate. This equation is the most general form of the mode-matched efficiency. However, in many applications, E_m and H_m will correspond to a guided mode. In this case, we can further simplify the expression by noting that for a guided or free space mode, the integral

\iint_{A} d A \cdot E_{m} \times H_{m}^{*}

is real valued. In this case we can cancel the first term in the numerator and we find that the mode overlap is given by

η_{m, guided} = \frac{1}{4 P_{m} P_{in}} {| \iint_{A} d A \cdot E \times H_{m}^{*} |}^{2}

where we have chosen to write

\begin{array}{l} P_{m} = \frac{1}{2} Re {\iint_{A} d A \cdot E_{m} \times H_{m}^{*}} \\ P_{in} = \frac{1}{2} Re {\iint_{A} d A \cdot E \times H^{*}} \end{array}

which describe the power in the incident field and the power in the desired mode (neither of which are guaranteed to be normalized to unity power). In many situations, we wish to know the total efficiency with which a device will output into a desired mode. This efficiency is differs slightly from the mode overlap expression given above as it compares the fraction of power coupled into a desired mode at the output of a device to the total power input to the device. This minor difference is easily accounted for by replacing P_in with P_src, the total source power of the system.

B. Radius of curvature constraint

It is relatively straightforward to impose radius of curvature constraints in optimization problems in which boundaries are defined using polygons. We do this by calculating an approximate curvature at each point in the polygon and then applying a smooth analytic thresholding function which reduces the figure of merit when the radius of curvature is below the chosen minimum value. This soft constraint does not guarantee that the approximate radius of curvature at every point in the polygon will be larger than the minimum value, however in practice the penalty can be made significant enough that small curvature will not appear.

In general, the curvature of a polygon is ill-defined since it is made up of straight segments. However, intuitively we know that with a sufficient number of points, a polygon can resemble a smooth curve. We should thus be able to define an effective radius of curvature that reflects the apparent smoothness. We can accomplish this by fitting higher order polynomials to set of points in the polygon and then calculating the curvature of those curves at each point. In this work, we fit quadratic parametric curves to sets of three points and compute the curvature at the middle point. Provided we have found paramtric curves x(t) and y(t) which are fit to the points (x_i−1, y_i−1), (x_i, y_i), and (x_i+1, y_i+1), we approximate the curvature at (x_i, y_i) using

R_{i} = \frac{{(x^{' 2} + y^{' 2})}^{3 / 2}}{| x^{'} y^{″} - y^{'} x^{″} |}

where x′ denotes the first derivative of x with respect to t and x″ denotes the second derivative of x with respect to t. With the radius of curvature now known, we can construct a penalty function which reduces the figure of merit when the radius of curvature falls below the minimum value by applying a smooth analytic approximation of the step function. A good choice for this function is the logistic function which has the desired behavior of tending towards zero when the radius of curvature is larger than the minimum value and tending towards one when it is less than the minimum value. Furthermore, being analytic, the logistic function is differentiable which is essential for calculating gradients of the figure of merit.

The full penalty function is a sum over step functions of the radius of curvature at each point, i.e.,

f_{ROC} (\vec{p}) = α \sum_{i} \frac{1}{1 + exp [- κ (R_{0} - R_{i} (\vec{p}))]}

where R₀ is the minimum radius of curvature, κ is related to the sharpness of the step from zero to one, α is a constant used to tune the strength of the penalty, and p⃗ is the set of design parameters. If we choose both α and κ to be large (i.e. near one), then it is clear from Eq. (35) that f_ROC will quickly grow if the curvature at any of the points in the polygon drop below R₀. Because this function is subtracted from the efficiency (which ranges from zero to one) in the full figure of merit in Eq. (17), it is extremely disadvantageous for the optimization to progress in a direction which would reduce the curvature at any point below the minimum curvature.

Funding

Office of Naval Research under grant numbers N00014-14-1-0505 and N00014-16-1-2237.

References

1. R. T. Haftka and R. V. Grandhi, “Structural shape optimization–A survey,” Comput. Methods Appl. Mech. Eng. 57, 91–106 (1986). [CrossRef]

2. M. P. Bendsøe and N. Kikuchi, “Generating optimal topologies in structural design using a homogenization method,” Comput. Methods Appl. Mech. Eng. 71, 197–224 (1988). [CrossRef]

3. O. Sigmund, “On the Design of Compliant Mechanisms Using Topology Optimization,” Mech. Struct. Mach. 25, 493–524 (1997). [CrossRef]

4. G. Allaire, F. Jouve, and A.-M. Toader, “Structural optimization using sensitivity analysis and a level-set method,” J. Comput. Phys. 194, 363–393 (2004). [CrossRef]

5. H.-B. Lee and T. Itoh, “A systematic optimum design of waveguide-to-microstrip transition,” IEEE Transactions on Microw. Theory Tech. 45, 803–809 (1997). [CrossRef]

6. Y.-S. Chung, C. Cheon, I.-H. Park, and S.-Y. Hahn, “Optimal shape design of microwave device using FDTD and design sensitivity analysis,” IEEE Transactions on Microw. Theory Tech. 48, 2289–2296 (2000). [CrossRef]

7. P. I. Borel, A. Harpøth, L. H. Frandsen, M. Kristensen, P. Shi, J. S. Jensen, and O. Sigmund, “Topology optimization and fabrication of photonic crystal structures,” Opt. Express 12, 1996–2001 (2004). [CrossRef] [PubMed]

8. J. Jensen and O. Sigmund, “Topology optimization for nano-photonics,” Laser & Photon. Rev. 5, 308–321 (2011). [CrossRef]

9. J. Lu and J. Vuckovic, “Nanophotonic computational design,” Opt. Express 21, 13351–13367 (2013). [CrossRef] [PubMed]

10. C. M. Lalau-Keraly, S. Bhargava, O. D. Miller, and E. Yablonovitch, “Adjoint shape optimization applied to electromagnetic design,” Opt. Express 21, 21693–21701 (2013). [CrossRef] [PubMed]

11. L. H. Frandsen, Y. Elesin, L. F. Frellsen, M. Mitrovic, Y. Ding, O. Sigmund, and K. Yvind, “Topology optimized mode conversion in a photonic crystal waveguide fabricated in silicon-on-insulator material,” Opt. Express 22, 8525–8532 (2014). [CrossRef] [PubMed]

12. A. Y. Piggott, J. Lu, K. G. Lagoudakis, J. Petykiewicz, T. M. Babinec, and J. Vuckovic, “Inverse design and demonstration of a compact and broadband on-chip wavelength demultiplexer,” Nat. Photon 9, 374–377 (2015). [CrossRef]

13. B. Shen, P. Wang, R. Polson, and R. Menon, “An integrated-nanophotonics polarization beamsplitter with 2.4 × 2.4 μm² footprint,” Nat. Photonics 9, 378 (2015). [CrossRef]

14. O. Sigmund, J. S. Jensen, and L. H. Frandsen, “On nanostructured silicon success,” Nat. Photon 10, 142–143 (2016). [CrossRef]

15. L. F. Frellsen, Y. Ding, O. Sigmund, and L. H. Frandsen, “Topology optimized mode multiplexing in silicon-on-insulator photonic wire waveguides,” Opt. Express 24, 16866–16873 (2016). [CrossRef] [PubMed]

16. A. Y. Piggott, J. Petykiewicz, L. Su, and J. Vuckovic, “Fabrication-constrained nanophotonic inverse design,” Sci. Reports 7, 1786 (2017). [CrossRef]

17. Z. Lin, M. Lončar, and A. W. Rodriguez, “Topology optimization of multi-track ring resonators and 2d microcavities for nonlinear frequency conversion,” Opt. Lett. 42, 2818–2821 (2017). [CrossRef] [PubMed]

18. J. Wang, Y. Shi, T. Hughes, Z. Zhao, and S. Fan, “Adjoint-based optimization of active nanophotonic devices,” Opt. Express 26, 3236–3248 (2018). [CrossRef] [PubMed]

19. Y. Elesin, B. S. Lazarov, J. S. Jensen, and O. Sigmund, “Design of robust and efficient photonic switches using topology optimization,” Photonics Nanostructures - Fundamentals Appl. 10, 153–165 (2012). [CrossRef]

20. O. Sigmund and K. Maute, “Topology optimization approaches,” Struct. Multidiscip. Optim. 48, 1031–1055 (2013). [CrossRef]

21. A. Michaels, “Emopt: Electromagnetic optimization toolbox,” https://github.com/anstmichaels/emopt (2018).

22. M. H. Bakr and N. K. Nikolova, “An adjoint variable method for frequency domain TLM problems with conducting boundaries,” IEEE Microw. Wirel. Components Lett. 13, 408–410 (2003). [CrossRef]

23. G. Veronis, R. W. Dutton, and S. Fan, “Method for sensitivity analysis of photonic crystal devices,” Opt. Lett. 29, 2288–2290 (2004). [CrossRef] [PubMed]

24. E. A. Soliman, M. H. Bakr, and N. K. Nikolova, “An adjoint variable method for sensitivity calculations of multiport devices,” IEEE Transactions on Microw. Theory Tech. 52589 (2004). [CrossRef]

25. L. Pang, Y. Liu, and D. Abrams, “Inverse lithography technology (ILT): What is the impact to the photomask industry?” in International Society for Optics and Photonics, vol. 6283 (2006).

26. T. G. Jurgens, A. Taflove, K. Umashankar, and T. G. Moore, “Finite-difference time-domain modeling of curved surfaces [EM scattering],” IEEE Transactions on Antennas Propag. 40, 357–366 (1992). [CrossRef]

27. S. Dey and R. Mittra, “A modified locally conformal finite-difference time-domain algorithm for modeling three-dimensional perfectly conducting objects,” Microw. Opt. Technol. Lett. 17, 349–352 (1998). [CrossRef]

28. S. Dey and R. Mittra, “A conformal finite-difference time-domain technique for modeling cylindrical dielectric resonators,” IEEE Transactions on Microw. Theory Tech. 47, 1737–1739 (1999). [CrossRef]

29. N. Kaneda, B. Houshmand, and T. Itoh, “FDTD analysis of dielectric resonators with curved surfaces,” IEEE Transactions on Microw. Theory Tech. 45, 1645–1649 (1997). [CrossRef]

30. A. Farjadpour, D. Roundy, A. Rodriguez, M. Ibanescu, P. Bermel, J. D. Joannopoulos, S. G. Johnson, and G. W. Burr, “Improving accuracy by subpixel smoothing in the finite-difference time domain,” Opt. Lett. 31, 2972–2974 (2006). [CrossRef] [PubMed]

31. A. F. Oskooi, C. Kottke, and S. G. Johnson, “Accurate finite-difference time-domain simulation of anisotropic media by subpixel smoothing,” Opt. Lett. 34, 2778–2780 (2009). [CrossRef] [PubMed]

32. F. P. Preparata and M. Shamos, Computational Geometry: An Introduction, Monographs in Computer Science (Springer-Verlag, 1985). [CrossRef]

33. W. Zhang, J. Yuan, J. Zhang, and X. Guo, “A new topology optimization approach based on moving morphable components (mmc) and the ersatz material model,” Struct. Multidiscip. Optim. 53, 1243–1260 (2016). [CrossRef]

34. B. Luyssaert, P. Bienstman, P. Vandersteegen, P. Dumon, and R. Baets, “Efficient Nonadiabatic Planar Waveguide Tapers,” J. Light. Technol. 23, 2462 (2005). [CrossRef]

35. D. Vermeulen, K. V. Acoleyen, S. Ghosh, S. Selvaraja, W. De Cort, N. Yebo, E. Hallynck, K. De Vos, P. Debackere, P. Dumon, W. Bogaerts, G. Roelkens, D. Van Thourhout, and R. Baets, “Efficient tapering to the fundamental quasi-tm mode in asymmetrical waveguides,” in ECIO, (2010), p. paper WeP16.

36. W. Bogaerts, D. Taillaert, B. Luyssaert, P. Dumon, J. V. Campenhout, P. Bienstman, D. V. Thourhout, R. Baets, V. Wiaux, and S. Beckx, “Basic structures for photonic integrated circuits in silicon-on-insulator,” Opt. Express 12, 1583–1591 (2004). [CrossRef] [PubMed]

37. Y. Fu, T. Ye, W. Tang, and T. Chu, “Efficient adiabatic silicon-on-insulator waveguide taper,” Photonics Res. 2, A41–A44 (2014). [CrossRef]

38. P. Hansen and L. Hesselink, “Accurate adjoint design sensitivities for nano metal optics,” Opt. Express 23, 23899–23923 (2015). [CrossRef] [PubMed]

39. J. M. Johnson and V. Rahmat-Samii, “Genetic algorithms in engineering electromagnetics,” IEEE Antennas Propag. Mag. 39, 7–21 (1997). [CrossRef]

40. G. Roelkens, D. V. Thourhout, and R. Baets, “High efficiency Silicon-on-Insulator grating coupler based on a poly-Silicon overlay,” Opt. Express 14, 11622–11630 (2006). [CrossRef] [PubMed]

41. C.-L. Chen, Foundations for Guided-Wave Optics (John Wiley & Sons, Inc., 2006), pp. 417–420. [CrossRef]

Leveraging continuous material averaging for inverse electromagnetic design

Abstract

1. Introduction

2. The adjoint method

3. Boundary smoothing

4. Optimization of a non-adiabatic waveguide taper

5. Conclusion

A. Mode matching integral

B. Radius of curvature constraint

Funding

References

Cited By

Figures (10)

Equations (36)

Optics Express