Expand this Topic clickable element to expand a topic
Skip to content
Optica Publishing Group

Acceleration of FDTD mode solver by high-performance computing techniques

Open Access Open Access

Abstract

A two-dimensional (2D) compact finite-difference time-domain (FDTD) mode solver is developed based on wave equation formalism in combination with the matrix pencil method (MPM). The method is validated for calculation of both real guided and complex leaky modes of typical optical waveguides against the bench-mark finite-difference (FD) eigen mode solver. By taking advantage of the inherent parallel nature of the FDTD algorithm, the mode solver is implemented on graphics processing units (GPUs) using the compute unified device architecture (CUDA). It is demonstrated that the high-performance computing technique leads to significant acceleration of the FDTD mode solver with more than 30 times improvement in computational efficiency in comparison with the conventional FDTD mode solver running on CPU of a standard desktop computer. The computational efficiency of the accelerated FDTD method is in the same order of magnitude of the standard finite-difference eigen mode solver and yet require much less memory (e.g., less than 10%). Therefore, the new method may serve as an efficient, accurate and robust tool for mode calculation of optical waveguides even when the conventional eigen value mode solvers are no longer applicable due to memory limitation.

©2010 Optical Society of America

1. Introduction

Mode calculation is a key task for simulation and analysis of optical waveguides and photonic integrated circuits. Several powerful numerical methods such as the finite difference (FD) [15] and the finite element (FE) [6,7] methods have been developed and widely adopted for mode calculations of practical optical waveguides. In the conventional mode calculation method, the open optical waveguides are discretized over the transverse section and the boundary-value problem is reduced to a matrix eigen-value problem defined on finite computation domain facilitated by artificial numerical boundary conditions. As such, the accuracy, computational efficiency as well as memory requirements for the conventional mode solvers depend critically on the size of the computation widow and the number of meshes. Thanks to the powerful matrix algorithm, the solutions of the eigen-value problem associated with the mode computation are normally highly efficient with an approximately linear scaling factor with the total number of meshes. One of the shortcomings of the boundary-value eigen solution is that the memory requirement scales drastically with the size of the matrix and hence become a bottleneck especially for dealing with optical waveguide problem that need large number of mesh point to achieve sufficient accuracy.

An alterative approach to mode calculation is to formulate the problem as an initial-value problem. By launching an initial field at the input of the waveguide, one may track its propagation along the waveguide axis by a field propagation algorithm such as the beam propagation method (BPM) [810] in frequency domain or the finite-difference time-domain (FDTD) [11,12] method in time-domain. Once the propagation reaches its steady state, a signal processing method such as the Fourier transform (FT) method [8,13] may be used to extract the modal parameters.

In this paper, the optical mode solver is based on wave equation formulation in time-domain. The compact 2D-FDTD method [12] combined with matrix pencil method (MPM) [14] is adopted for calculation of the scalar, semi-vector and full-vector modes. Uniaxial perfectly matched layer (UPML) [1517] absorbing boundary conditions (ABCs) are employed for the FDTD scheme in order to reduce the computation region. By launching an initial optical field into the waveguide, the light is propagating along the direction (z) through the FDTD method for a certain time (t). When the steady-state is reached, a series of instantaneous response field values at the end of the propagation are adopted by MPM algorithm for mode parameter extractions. Since the MPM algorithm is much more efficient and accurate for the mode parameter extraction compared with the traditional FT method, only limited propagation distance is required in FDTD simulation. Since the FDTD method is an inherently data-parallel algorithm, we implement it in Graphics Processing Units (GPUs) by using Nvidia’s Compute Unified Device Architecture (CUDA) [18]. With the help of the high computing power brought by GPUs [19,20], the simulation time of the FDTD method is further reduced to less than 3% of that used for the implementation on a standard 3.0GHz CPU.

This paper is organized as follows. The wave Eq. (-)based compact 2D-FDTD method and the matrix pencil method for optical waveguide modal calculation are described in section 2 and 3 followed by its validation in section 4.1. The convergence test is presented in section 4.2. The performance of the enhanced method implemented on GPUs using CUDA is discussed in section 4.3. Comparisons are made among the standard FDTD, the accelerated FDTD and the conventional eigen mode solvers.

2. Wave Eq. (-)based compact 2D-FDTD method with UPML ABCs

For waveguide structures with uniform refractive index, derivatives of the mode field with respect to longitudinal direction can be replaced with jβand consequently reduce one dimension complexity in space which can save significant computing time and memory consumption. This method is called compact 2D-FDTD method which was introduced by Xiao et al. [12] and has been widely used in microwaves. To avoid the complex-variables in Maxwell’s Eq. (-)based compact 2D-FDTD method, a wave Eq. (-)based compact 2D-FDTD method is introduced in [21].

In this paper, the wave Eq. (-)based compact 2D-FDTD method with UPML absorbing boundary conditions is employed. In Appendix I, the detailed formulations of the full-vector, semi-vector, and scalar schemes are summarized for the sake of completeness. Compared with Benenger’s PML [22], UPML is more preferable for parallel computing due to the reason that it avoids the nonphysical field splitting by composing a uniaxial anisotropic medium.

3. Extraction of mode parameters by matrix pencil method

When the wave equations are solved in frequency domain, the operation frequency is treated as an input parameter whereas the propagation constant β is an output in the mode calculation. In our algorithm, the wave equations are solved in time domain and hence β is selected as an input parameter so as to the mode frequencies ωm, the mode losses ξm, and the optical fields (Φm) for different modes are extracted in the post processing after the FDTD simulation. Since the optical waveguide has a uniform index along the propagation direction, an arbitrary transverse field (Ψt(x,y,t)) in the waveguide can be represented by the summation of the eigenmodes Φm(x,y)

Ψt(x,y,t)=m=0Amexp(jωmt)exp(ξmt)Φm(x,y)

In order to extract the mode parameterωm for different modes, the Fourier transform can be employed to obtain the frequency spectrum from the steady-state time domain optical field captured in a window with proper width. The mode parameters ωm are the positions of resonant peaks in the frequency spectrum. Since the accuracy of the mode frequency ωmdepends on the resolution of the frequency spectrum, a large time window in FDTD simulation is required to achieve a high accuracy, which is extremely computation intensive. It is also very difficult to obtain the mode loss ξm accurately by fitting Lorentz shapes centered at mode frequencies in the frequency spectrum. The matrix pencil method (MPM) is proposed to overcome the drawback of the Fourier transform method so that the mode frequency ωm and mode losses ξmcan be extracted accurately with reasonable computation resources.

In the MPM, the time-dependent signal is represented by a summation of a series of complex exponentials [14] as follow.

S(t)=m=0MRmexp(smt)

Rmis a real number and smis a complex number. By mapping Eq. (1) to Eq. (2), the mode frequency ωmand mode loss factor ξmcan both be obtained by

ωm=Im(sm)ξm=Re(sm)

The loss factor can be easily converted to attenuation coefficient αmby the relation αm=ξm/(dωm/dβm). It will be illustrated in the following sections that only limited number of field samples in time domain is required by MPM to obtain mode parameters accurately, which is very robust and more efficient compared with Fourier transform method.

4. Simulation results

4.1. Assessment and validation

In order to validate the mode solver presented in this paper, we calculated the real guided mode of a rectangular channel waveguide and the complex leaky mode of a ridge waveguide, respectively.

4.1.1. Guided mode analysis

The square channel waveguide structure is shown in Fig. 1 . The width (w1) of the square core region is chosen as 2μm and the indices of the core region and the cladding region are 1.52 and 1.46, respectively. The number of PML layers is chosen to be 20 in our simulation.

 figure: Fig. 1

Fig. 1 Schematic diagram of the square channel waveguide structure, n1=1.52, n2=1.46, w1=2μm, PML Layers = 20.

Download Full Size | PDF

An initial optical field with the preset propagation constant β and the Gaussian distribution is launched at the position (x0,y0) inside the core region in FDTD simulation. The excitation source is a Gaussian modulated sinusoidal waveform in time domain. The waveform in the frequency domain also follows a Gaussian distribution with a central frequency ofω0, formulated by

Esrc(x,y,t)=Aexp((xx0)2+(yy0)2Rm)exp((tt0)2Tm)sin(ω0t)

When the steady-state is reached after the optical field propagates in the waveguide for a period of time, proper numbers of temporal samples of the optical field are taken at one position inside the core region in MPM analysis from which the resonant frequency for each mode can be extracted. The accuracy of extracted values by the full-vectorial solver is validated through comparisons made with benchmark results obtained by Goell’s model [23] in Fig. 3 . Note that for the given FDTD time steps and the MPM sampling numbers, the relatively larger error is observed near the cut-off. This is due to the reason that the effective width of each mode is increasing with the decrease of the normalized frequency V (horizontal label in Fig. 3). By using a larger computation window (in space), this error can be reduced.

 figure: Fig. 3

Fig. 3 Guided modal analysis by Goell’s model and solver in this paper.

Download Full Size | PDF

Once the mode frequency ωmis extracted, the field distribution of each mode (Φm) can be readily obtained from the instantaneous field (Ψt) by the formula

Φm(x,y)=Δtn=0Nt1Ψt(x,y,t)exp(jωmnΔt)exp(ξmnΔt)

Figure 2 shows the calculated field pattern of the fundamental mode for the square channel waveguide depicted in Fig. 1.

 figure: Fig. 2

Fig. 2 Fundamental mode profile of the square channel waveguide.

Download Full Size | PDF

4.1.2. Leaky mode analysis

In this section, the proposed method is applied to a deep-etched GaAs-AlGaAs optical waveguide structure [24] shown in Fig. 4 . The indices are 3.4519, 3.4434, 3.3955, 3.3675 for 5%, 6.5%, 15% and 20% AlGaAs material, respectively, and 3.4804 for GaAs cap and substrate. Note that the refractive index of the 5% AlGaAs core region is higher than those of the upper and lower cladding layers. The fundamental mode can be well confined in the rib. However high order modes are much lossier and can strongly leak through the lower 15% AlGaAs cladding layer and radiate into the substrate [24].

 figure: Fig. 4

Fig. 4 Schematic diagram of a deep-etched GaAs-alGaAs optical waveguide structure.

Download Full Size | PDF

The TE20 mode frequency ωm and attenuation coefficient αm obtained by the proposed method with full-vectorial scheme are compared with those obtained from FD mode solver [4] in Table 1 . It also shows that results are well agreed with each other. The accuracy of the extracted results can be controlled by several factors such as the FDTD time steps and the number of sampling points for the MPM. For critical applications, the accuracy can be further enhanced with the cost of extra computation efforts. The TE20 mode profile of the ridge waveguide under investigation is calculated and shown in Fig. 5 . It is observed that the field is radiative towards to the substrate.

Tables Icon

Table 1. Leaky modal analysis by FD mode solver and solver in this paper

 figure: Fig. 5

Fig. 5 TE20 mode profile of the ridge waveguide.

Download Full Size | PDF

4.2. Error analysis and convergence test

The accuracy of the extracted parameters for the optical waveguide modal analysis depends on several factors, namely, the accuracy of the propagating field simulated by the FDTD and that of the MPM post processing. The former is related to the transverse mesh size whereas the latter on the FDTD running time (i.e., related to the sampling window in time) and MPM sample numbers. Assume that the FDTD mesh size is chosen in accordance with the numerical dispersion condition which is δλ/12. We may focus exclusively on the impact of the accuracy associated with the post data processor. Under this assumption, the convergence of the mode solver as functions of the FDTD running time and the MPM sample size is simulated and analyzed for the square channel waveguide structure described previously in section 4.1.1.

4.2.1. Convergence with FDTD time steps

The extracted wavelength of the fundamental mode is calculated with 480x480 mesh points and 7.810043rad/um propagation constant. The percentage error is defined with respect to the benchmark results obtained from the FD mode solver and shown in Fig. 6 as a function of FDTD time steps for a fixed number (1000) of MPM samples. It is observed that the error of the extracted parameter converges rapidly toward zero with the increase of FDTD time steps.

 figure: Fig. 6

Fig. 6 Percentage error of extracted wavelength of fundamental mode versus different FDTD time steps.

Download Full Size | PDF

4.2.2. Convergence with MPM sample number

To study the convergence of this method with MPM sample numbers, the FDTD time steps is chosen to be 8000. The mode calculation is then carried out for the square channel waveguide with different numbers of MPM samples. The percentage error of extracted wavelength of the fundamental mode is shown in Fig. 7 . It is noted that the extracted parameter converges well with the increase of MPM sample numbers.

 figure: Fig. 7

Fig. 7 Percentage error of extracted wavelength of fundamental mode versus MPM sample numbers.

Download Full Size | PDF

4.3. Performance of the mode solver on GPUs

In the mode solver based on FDTD, the most computational intensive work is for the FDTD simulation. Since the index of the waveguide is uniform along the propagation direction (z), the compact FDTD method is employed so that significant computation time can be saved. However, in the modal analysis of an optical waveguide with relatively large dimensions, a large number of meshes are required to ensure adequate accuracy. This leads to drastic increase in computation time and render this approach not applicable as a practical tool for mode calculation on CPUs of today’s desktop computers. In order to improve the computation efficiency, this method is implemented on Nvidia GTX 295 GPUs with parallel calculation algorithms. Comparisons of the memory and time consumptions are made in implementing this method on GPUs, CPU as well as the FD mode solver, respectively.

4.3.1. Memory consumption

When the FDTD mode solver is implemented on CPU and GPU, the memory requirements are virtually the same. The memory consumptions for the scalar mode calculation by the standard and the accelerated FDTD and the conventional FD mode solvers are estimated and shown in Fig. 8 . It is clearly shown that the memory consumption for the FDTD methods scale much slowly than that of the FD eigen solver. A saving of 94.6% memory can be realized for a problem of total mesh points of 480x480. In addition, we also examined the memory requirements of the scalar, semi-vector and full-vector mode calculations by the accelerated FDTD solvers showing in Fig. 9 . It is noted that there are moderate increases of memory consumption for the full and semi-vector modes (e.g., 20% and 35% increase, respectively) in comparison with the scalar mode. We conclude that the FDTD method is far more scalable than the FD scheme in terms of memory requirement and hence is capable of solving waveguide problems with large number of mesh points.

 figure: Fig. 8

Fig. 8 Memory consumptions of the scalar mode calculation by the accelerated FDTD and the conventional FD mode solvers.

Download Full Size | PDF

 figure: Fig. 9

Fig. 9 Memory consumptions of the scalar, semi-vector and full-vector mode calculations by the accelerated FDTD solvers.

Download Full Size | PDF

4.3.2. Time consumption

The computational efficiency is measured by the computation time required to achieve a desirable accuracy. In Fig. 10 , the computation time for scalar mode by the standard FDTD, accelerated FDTD and conventional FD solvers are plotted. Significant improvement in computation efficiency (e.g., reduction of computation time as much as 97% is realized for the scalar mode for a problem of total mesh points of 640x640) is achieved by the accelerated FDTD mode solver in comparison with the standard FDTD mode solver running on CPU of a standard desktop computer. It also shows that the computational efficiency of the accelerated FDTD method is in the same order of magnitude of the standard FD eigen mode solver. In Fig. 11 , the computation time for the scalar, semi-vector and full-vector modes by the accelerated FDTD mode solvers are plotted. It is noted that there are moderate increases of time consumption for the full and semi-vector modes in comparison with the scalar mode.

 figure: Fig. 10

Fig. 10 Time consumption of the scalar mode calculation by the standard FDTD, accelerated FDTD and the conventional FD mode solvers.

Download Full Size | PDF

 figure: Fig. 11

Fig. 11 Time consumption of the scalar, semi-vector and full-vector mode calculations by accelerated FDTD mode solvers on GPUs.

Download Full Size | PDF

5. Summary

It is proposed and demonstrated that the numerical mode solver based on the compact FDTD method in combination of the matrix pencil method (MPM) can be accelerated significantly by commercially available, low-cost and easy-to-implement high-performance computing techniques such as the CUDA algorithm on GPUs. In the meanwhile, the relative low memory requirements and simple numerical scheme inherent in the time explicit FDTD method is preserved. This idea is implemented and validated for calculations of scalar, semi-vector and full-vector modes and hence has general applicability to a wide range of optical waveguide problems. The advantages of this new approach become even more pronounced when the total number of meshes is large, in which the standard FDTD method is prohibitively time-consuming and the conventional FD solver is severely limited by its memory consumption.

Appendix

Derivation of wave equation-based compact 2D-FDTD method with UPML ABCs.

Part A. Full-vectorial scheme

In frequency domain, the full vector wave equation for transverse electric field in linear, isotropic medium optical waveguide is given by [25]

t2Et+(n2neff2)k2Et=t(1n2tn2Et)

The electric field with two perpendicular polarizations Ex,Eyobeys the following equations

AxxEx+AxyEy=k2neff2ExAyxEx+AyyEy=k2neff2Ey

where

AxxEx=x[1n2x(n2Ex)]+2Exy2+n2k2ExAyyEy=y[1n2y(n2Ey)]+2Eyx2+n2k2EyAxyEy=x[1n2y(n2Ey)]2EyxyAyxEx=y[1n2x(n2Ex)]2Exyx

Based on above equations, the full vector wave equation for the transverse electric field in time domain can be obtained as

n2c22t2Ex=x[1n2x(n2Ex)]+2Exy2+x[1n2y(n2Ey)]2Eyxyβ2Exn2c22t2Ey=y[1n2y(n2Ey)]+2Eyx2+y[1n2x(n2Ex)]2Exyxβ2Ey

Since the waveguide is assumed to be homogeneous along the propagation direction (z), derivatives with respect to z have been replaced with jβin above equations. This replacement reduced one dimension complexity in space, which can save significant computing time and memory consumption.

By using the stretched coordinates technique [26], UPML boundary conditions can be applied to full vector compact 2D-FDTD equations as follows

n2c22t2Ex=1sxx[1n21sxx(n2Ex)]+1syy[1syyEx]β2Ex+1sxx[1n21syy(n2Ey)]1syy[1sxxEy]n2c22t2Ey=1syy[1n21syy(n2Ey)]+1sxx[1sxxEy]β2Ey+1syy[1n21sxx(n2Ex)]1sxx[1syyEx]

wheresx=1+σx/jωε0n2, sy=1+σy/jωε0n2

Auxiliary variables are further introduced and defined in Eq.(11)(13)

jωDx_x=1sxx(n2Ex);jωDx_y=1syyExjωDy_x=1sxxEy;jωDy_y=1syy(n2Ey)
jωDx_xx=1sxx(1n2jωDx_x);jωDx_yy=1syy(jωDx_y)jωDy_yx=1sxx(1n2jωDy_y);jωDy_xy=1syy(jωDy_x)
jωDy_yy=1syy(1n2jωDy_y);jωDy_xx=1sxx(jωDy_x)jωDx_xy=1syy(1n2jωDx_x);jωDx_yx=1sxx(jωDx_y)

By substituting above auxiliary variables into Eq.(10), yields

n2c22t2Ex=jωDx_xx+jωDx_yyβ2Ex+jωDy_yxjωDy_xyn2c22t2Ey=jωDy_yy+jωDy_xxβ2Ey+jωDx_xyjωDx_yx

By replacing jω with /tin the resulting equations, the time domain full vector wave equations with UPML absorbing boundary can be derived as

n2c22t2Ex=tDx_xx+tDx_yyβ2Ex+tDy_yxtDy_xyn2c22t2Ey=tDy_yy+tDy_xxβ2Ey+tDx_xytDx_yxtDx_x+σxε0n2Dx_x=x(n2Ex);tDx_y+σyε0n2Dx_y=yExtDy_x+σxε0n2Dy_x=xEy;tDy_y+σyε0n2Dy_y=y(n2Ey)tDx_xx+σxε0n2Dx_xx=x(1n2tDx_x);tDx_yy+σyε0n2Dx_yy=y(tDx_y)tDy_yx+σxε0n2Dy_yx=x(1n2tDy_y);tDy_xy+σyε0n2Dy_xy=y(tDy_x)tDy_yy+σyε0n2Dy_yy=x(1n2tDy_y);tDy_xx+σxε0n2Dy_xx=x(tDy_x)tDx_xy+σyε0n2Dx_xy=y(1n2tDx_x);tDx_yx+σxε0n2Dx_yx=x(tDx_y)

Above equations can be discretized on Yee lattice, wherein the loss terms are averaged in time according to the semi-implicit scheme shown below

σxε0n2Dx=σxε0n2Dxn+1+Dxn2σyε0n2Dy=σyε0n2Dyn+1+Dyn2

The FDTD scheme can then be derived as (space discretization isn’t shown explicitly)

Exn+1=Exn1+2Exn+c2Δtn2(Dx_xxnDx_xxn1)+c2Δtn2(Dx_yynDx_yyn1)β2c2Δt2n2Exn           +c2Δtn2(Dy_yxnDy_yxn1)c2Δtn2(Dy_xynDy_xyn1)Eyn+1=Eyn1+2Eyn+c2Δtn2(Dy_yynDy_yyn1)+c2Δtn2(Dy_xxnDy_xxn1)β2c2Δt2n2Eyn           +c2Δtn2(Dx_xynDx_xyn1)c2Δtn2(Dx_yxnDx_yxn1)

where

Dx_xn+1=2ε0n2Δtσx2ε0n2+ΔtσxDx_xn+2ε0n2Δt2ε0n2+Δtσx(xn2Ex)Dx_yn+1=2ε0n2Δtσy2ε0n2+ΔtσyDx_yn+2ε0n2Δt2ε0n2+Δtσy(yEx)Dy_xn+1=2ε0n2Δtσx2ε0n2+ΔtσxDy_xn+2ε0n2Δt2ε0n2+Δtσx(xEy)Dy_yn+1=2ε0n2Δtσy2ε0n2+ΔtσyDy_yn+2ε0n2Δt2ε0n2+Δtσy(yn2Ey)Dx_xxn+1=2ε0n2Δtσx2ε0n2+ΔtσxDx_xxn+2ε0n22ε0n2+Δtσx[x1n2Dx_xnx1n2Dx_xn1]Dx_yyn+1=2ε0n2Δtσy2ε0n2+ΔtσyDx_yyn+2ε0n22ε0n2+Δtσy[yDx_ynyDx_yn1]Dy_yxn+1=2ε0n2Δtσx2ε0n2+ΔtσxDy_yxn+2ε0n22ε0n2+Δtσx[x1n2Dy_ynx1n2Dy_yn1]Dy_xyn+1=2ε0n2Δtσy2ε0n2+ΔtσyDy_xyn+2ε0n22ε0n2+Δtσy[yDy_xnyDy_xn1]Dy_yyn+1=2ε0n2Δtσy2ε0n2+ΔtσyDy_yyn+2ε0n22ε0n2+Δtσy[y1n2Dy_yny1n2Dy_yn1]Dy_xxn+1=2ε0n2Δtσx2ε0n2+ΔtσxDy_xxn+2ε0n22ε0n2+Δtσx[xDy_xnxDy_xn1]Dx_xyn+1=2ε0n2Δtσy2ε0n2+ΔtσyDx_xyn+2ε0n22ε0n2+Δtσy[y1n2Dx_xny1n2Dx_xn1]Dx_yxn+1=2ε0n2Δtσx2ε0n2+ΔtσxDx_yxn+2ε0n22ε0n2+Δtσx[xDx_ynxDx_yn1]

where σ=0 in the non-PML region.

Part B. Semivectorial scheme

The semivectorial scheme is a special case of full vectorial scheme when Axy and Ayxin Eq.(8) are both equal to zero. Thus the semivectorial compact 2D FDTD with UPML absorbing boundary conditions for optical waveguide can be derived from the full vectorial scheme described in Eq.(17), (18) by eliminating cross terms of the two perpendicular polarization electric fields, which yields

Exn+1=Exn1+2Exn+c2Δtn2(Dx_xxnDx_xxn1)+c2Δtn2(Dx_yynDx_yyn1)β2c2Δt2n2ExnEyn+1=Eyn1+2Eyn+c2Δtn2(Dy_yynDy_yyn1)+c2Δtn2(Dy_xxnDy_xxn1)β2c2Δt2n2Eyn

where

Dx_xn+1=2ε0n2Δtσx2ε0n2+ΔtσxDx_xn+2ε0n2Δt2ε0n2+Δtσx(xn2Ex)Dx_yn+1=2ε0n2Δtσy2ε0n2+ΔtσyDx_yn+2ε0n2Δt2ε0n2+Δtσy(yEx)Dy_xn+1=2ε0n2Δtσx2ε0n2+ΔtσxDy_xn+2ε0n2Δt2ε0n2+Δtσx(xEy)Dy_yn+1=2ε0n2Δtσy2ε0n2+ΔtσyDy_yn+2ε0n2Δt2ε0n2+Δtσy(yn2Ey)Dx_xxn+1=2ε0n2Δtσx2ε0n2+ΔtσxDx_xxn+2ε0n22ε0n2+Δtσx[x1n2Dx_xnx1n2Dx_xn1]Dx_yyn+1=2ε0n2Δtσy2ε0n2+ΔtσyDx_yyn+2ε0n22ε0n2+Δtσy[yDx_ynyDx_yn1]Dy_yyn+1=2ε0n2Δtσy2ε0n2+ΔtσyDy_yyn+2ε0n22ε0n2+Δtσy[y1n2Dy_yny1n2Dy_yn1]Dy_xxn+1=2ε0n2Δtσx2ε0n2+ΔtσxDy_xxn+2ε0n22ε0n2+Δtσx[xDy_xnxDy_xn1]

and where σ=0 in the non-PML region.

Part C. Scalar scheme

Similarly, when the electrical field is assumed to be continuous in both x and y direction, the scalar compact 2D FDTD with UPML absorbing boundary conditions for optical waveguide can also be derived from the semivectorial scheme described in Eq.(19) by keeping only one of the two perpendicular polarization electric fields which yields

Ψn+1=Ψn1+2Ψn+c2Δtn2(Dx_xxnDx_xxn1)+c2Δtn2(Dx_yynDx_yyn1)β2c2Δt2n2Ψn

where

Dx_xn+1=2ε0n2Δtσx2ε0n2+ΔtσxDx_xn+2ε0n2Δt2ε0n2+Δtσx(xEx)Dx_yn+1=2ε0n2Δtσy2ε0n2+ΔtσyDx_yn+2ε0n2Δt2ε0n2+Δtσy(yEx)Dx_xxn+1=2ε0n2Δtσx2ε0n2+ΔtσxDx_xxn+2ε0n22ε0n2+Δtσx[xDx_xnxDx_xn1]Dx_yyn+1=2ε0n2Δtσy2ε0n2+ΔtσyDx_yyn+2ε0n22ε0n2+Δtσy[yDx_ynyDx_yn1]

and where σ=0 in the non-PML region.

Acknowledgements

The authors would like to express their gratitude to Dr. Li Kang at Shandong University for useful discussions regarding FDTD method and GPUs acceleration, Dr. Li Xun at McMaster University for useful discussions regarding the method used in this work, and Mr. Mu Jianwei at McMaster University for useful discussions regarding FD mode solvers and leaky mode analysis methods.

References and links

1. E. Schweig and W. Bridges, “Computer analysis of dielectric waveguides: A finite-difference method,” IEEE Trans. Microw. Theory Tech. 32(5), 531–541 (1984). [CrossRef]  

2. M. Stern, “Semivectorial polarised finite difference method for opticalwaveguides with arbitrary index profiles,” IEE Proc., Optoelectron. 135, 56–63 (1988). [CrossRef]  

3. W. P. Huang, S. T. Chu, A. Goss, and S. K. Chaudhuri, “A scalar finite-difference time-domain approach to guided-wave optics,” (1991), pp. 524–526.

4. A. Fallahkhair, K. Li, and T. Murphy, “Vector finite difference modesolver for anisotropic dielectric waveguides,” J. Lightwave Technol. 26(11), 1423–1431 (2008). [CrossRef]  

5. C. Xu, W. Huang, M. Stern, and S. Chaudhuri, “Full-vectorial mode calculations by finite difference method,” IEE Proc., Optoelectron. 141(5), 281–286 (1994). [CrossRef]  

6. B. Rahman and J. Davies, “Finite-element analysis of optical and microwave waveguide problems,” IEEE Trans. Microw. Theory Tech. 32(1), 20–28 (1984). [CrossRef]  

7. J. Lee, D. Sun, and Z. Cendes, “Full-wave analysis of dielectric waveguides using tangential vector finite elements,” IEEE Trans. Microw. Theory Tech. 39(8), 1262–1271 (1991). [CrossRef]  

8. M. D. Feit and J. A. Fleck Jr., “Computation of mode properties in optical fiber waveguides by a propagating beam method,” Appl. Opt. 19(7), 1154–1164 (1980). [CrossRef]   [PubMed]  

9. C. Xu, W. Huang, and S. Chaudhuri, ““Efficient and accurate vector mode calculations by beam propagationmethod,” Lightwave Technology,” Journalism 11, 1209–1215 (1993).

10. Y. Tsuji and M. Koshiba, “Guided-mode and leaky-mode analysis by imaginary distance beam propagation method based on finite element scheme,” J. Lightwave Technol. 18(4), 618–623 (2000). [CrossRef]  

11. A. Zhao, J. Juntunen, and A. Räisänen, “Analysis of hybrid modes in channel multilayer optical waveguides with the compact 2-D FDTD method,” Microw. Opt. Technol. Lett. 15(6), 398–403 (1997). [CrossRef]  

12. S. Xiao, R. Vahldieck, and H. Jin, “Full-wave analysis of guided wave structures using a novel 2-D FDTD,” IEEE Microw. Guid. Wave Lett. 2(5), 165–167 (1992). [CrossRef]  

13. G. Zhou and X. Li, “Wave equation-based semivectorial compact 2-D-FDTD method for optical waveguide modal analysis,” J. Lightwave Technol. 22(2), 677–683 (2004). [CrossRef]  

14. T. Sarkar and O. Pereira, “Using the matrix pencil method to estimate the parameters of a sum of complex exponentials,” IEEE Antennas Propag. Mag. 37(1), 48–55 (1995). [CrossRef]  

15. Z. Sacks, D. Kingsland, R. Lee, and J. Lee, “A perfectly matched anisotropic absorber for use as an absorbingboundary condition,” IEEE Trans. Antenn. Propag. 43(12), 1460–1463 (1995). [CrossRef]  

16. S. Gedney, “An anisotropic perfectly matched layer-absorbing medium for thetruncation of FDTD lattices,” IEEE Trans. Antenn. Propag. 44(12), 1630–1639 (1996). [CrossRef]  

17. A. Taflove, and S. Hagness, Computational electrodynamics: the finite-difference time-domain method (Artech House Norwood, MA, 1995).

18. S. Ryoo, C. Rodrigues, S. Baghsorkhi, S. Stone, D. Kirk, and W. Hwu, “Optimization principles and application performance evaluation of a multithreaded GPU using CUDA,” in (ACM, 2008), 73–82.

19. S. Krakiwsky, L. Turner, and M. Okoniewski, “Acceleration of finite-difference time-domain (FDTD) using graphics processor units (GPU),” in Microwave Symposium Digest, IEEE MTT-S International, 2004), 1033- 1036.

20. A. Balevic, L. Rockstroh, A. Tausendfreund, S. Patzelt, G. Goch, and S. Simon, “Accelerating simulations of light scattering based on finite-difference time-domain method with general purpose gpus,” in 2008), 327–334.

21. M. Okoniewski, “Vector wave equation 2-D-FDTD method for guided wave problems,” IEEE Microw. Guid. Wave Lett. 3(9), 307–309 (1993). [CrossRef]  

22. J. Berenger, “A perfectly matched layer for the absorption of electromagnetic waves,” J. Comput. Phys. 114(2), 185–200 (1994). [CrossRef]  

23. J. Goell, “A circular-harmonic computer analysis of rectangular dielectric waveguides,” Bell Syst. Tech. J. 48, 2133–2160 (1969).

24. J. Heaton, M. Bourke, S. Jones, B. Smith, K. Hilton, G. Smith, J. Birbeck, G. Berry, S. Dewar, and D. Wight, “Optimization of deep-etched, single-mode GaAs/AlGaAs optical waveguides using controlled leakage into the substrate,” J. Lightwave Technol. 17(2), 267–281 (1999). [CrossRef]  

25. W. Huang and C. Xu, “Simulation of three-dimensional optical waveguides by a full-vectorbeam propagation method,” IEEE J. Quantum Electron. 29(10), 2639–2649 (1993). [CrossRef]  

26. W. Chew and W. Weedon, “A 3-D perfectly matched medium from modified Maxwell's equations with stretched coordinates,” Microw. Opt. Technol. Lett. 7(13), 599–604 (1994). [CrossRef]  

Cited By

Optica participates in Crossref's Cited-By Linking service. Citing articles from Optica Publishing Group journals and other participating publishers are listed here.

Alert me when this article is cited.


Figures (11)

Fig. 1
Fig. 1 Schematic diagram of the square channel waveguide structure, n 1 = 1.52 , n 2 = 1.46 , w 1 = 2 μ m , PML Layers = 20.
Fig. 3
Fig. 3 Guided modal analysis by Goell’s model and solver in this paper.
Fig. 2
Fig. 2 Fundamental mode profile of the square channel waveguide.
Fig. 4
Fig. 4 Schematic diagram of a deep-etched GaAs-alGaAs optical waveguide structure.
Fig. 5
Fig. 5 TE20 mode profile of the ridge waveguide.
Fig. 6
Fig. 6 Percentage error of extracted wavelength of fundamental mode versus different FDTD time steps.
Fig. 7
Fig. 7 Percentage error of extracted wavelength of fundamental mode versus MPM sample numbers.
Fig. 8
Fig. 8 Memory consumptions of the scalar mode calculation by the accelerated FDTD and the conventional FD mode solvers.
Fig. 9
Fig. 9 Memory consumptions of the scalar, semi-vector and full-vector mode calculations by the accelerated FDTD solvers.
Fig. 10
Fig. 10 Time consumption of the scalar mode calculation by the standard FDTD, accelerated FDTD and the conventional FD mode solvers.
Fig. 11
Fig. 11 Time consumption of the scalar, semi-vector and full-vector mode calculations by accelerated FDTD mode solvers on GPUs.

Tables (1)

Tables Icon

Table 1 Leaky modal analysis by FD mode solver and solver in this paper

Equations (22)

Equations on this page are rendered with MathJax. Learn more.

Ψ t ( x , y , t ) = m = 0 A m exp ( j ω m t ) exp ( ξ m t ) Φ m ( x , y )
S ( t ) = m = 0 M R m exp ( s m t )
ω m = Im ( s m ) ξ m = Re ( s m )
E s r c ( x , y , t ) = A exp ( ( x x 0 ) 2 + ( y y 0 ) 2 R m ) exp ( ( t t 0 ) 2 T m ) sin ( ω 0 t )
Φ m ( x , y ) = Δ t n = 0 N t 1 Ψ t ( x , y , t ) exp ( j ω m n Δ t ) exp ( ξ m n Δ t )
t 2 E t + ( n 2 n eff 2 ) k 2 E t = t ( 1 n 2 t n 2 E t )
A x x E x + A x y E y = k 2 n eff 2 E x A y x E x + A y y E y = k 2 n eff 2 E y
A x x E x = x [ 1 n 2 x ( n 2 E x ) ] + 2 E x y 2 + n 2 k 2 E x A y y E y = y [ 1 n 2 y ( n 2 E y ) ] + 2 E y x 2 + n 2 k 2 E y A x y E y = x [ 1 n 2 y ( n 2 E y ) ] 2 E y x y A y x E x = y [ 1 n 2 x ( n 2 E x ) ] 2 E x y x
n 2 c 2 2 t 2 E x = x [ 1 n 2 x ( n 2 E x ) ] + 2 E x y 2 + x [ 1 n 2 y ( n 2 E y ) ] 2 E y x y β 2 E x n 2 c 2 2 t 2 E y = y [ 1 n 2 y ( n 2 E y ) ] + 2 E y x 2 + y [ 1 n 2 x ( n 2 E x ) ] 2 E x y x β 2 E y
n 2 c 2 2 t 2 E x = 1 s x x [ 1 n 2 1 s x x ( n 2 E x ) ] + 1 s y y [ 1 s y y E x ] β 2 E x + 1 s x x [ 1 n 2 1 s y y ( n 2 E y ) ] 1 s y y [ 1 s x x E y ] n 2 c 2 2 t 2 E y = 1 s y y [ 1 n 2 1 s y y ( n 2 E y ) ] + 1 s x x [ 1 s x x E y ] β 2 E y + 1 s y y [ 1 n 2 1 s x x ( n 2 E x ) ] 1 s x x [ 1 s y y E x ]
j ω D x _ x = 1 s x x ( n 2 E x ) ; j ω D x _ y = 1 s y y E x j ω D y _ x = 1 s x x E y ; j ω D y _ y = 1 s y y ( n 2 E y )
j ω D x _ x x = 1 s x x ( 1 n 2 j ω D x _ x ) ; j ω D x _ y y = 1 s y y ( j ω D x _ y ) j ω D y _ y x = 1 s x x ( 1 n 2 j ω D y _ y ) ; j ω D y _ x y = 1 s y y ( j ω D y _ x )
j ω D y _ y y = 1 s y y ( 1 n 2 j ω D y _ y ) ; j ω D y _ x x = 1 s x x ( j ω D y _ x ) j ω D x _ x y = 1 s y y ( 1 n 2 j ω D x _ x ) ; j ω D x _ y x = 1 s x x ( j ω D x _ y )
n 2 c 2 2 t 2 E x = j ω D x _ x x + j ω D x _ y y β 2 E x + j ω D y _ y x j ω D y _ x y n 2 c 2 2 t 2 E y = j ω D y _ y y + j ω D y _ x x β 2 E y + j ω D x _ x y j ω D x _ y x
n 2 c 2 2 t 2 E x = t D x _ x x + t D x _ y y β 2 E x + t D y _ y x t D y _ x y n 2 c 2 2 t 2 E y = t D y _ y y + t D y _ x x β 2 E y + t D x _ x y t D x _ y x t D x _ x + σ x ε 0 n 2 D x _ x = x ( n 2 E x ) ; t D x _ y + σ y ε 0 n 2 D x _ y = y E x t D y _ x + σ x ε 0 n 2 D y _ x = x E y ; t D y _ y + σ y ε 0 n 2 D y _ y = y ( n 2 E y ) t D x _ x x + σ x ε 0 n 2 D x _ x x = x ( 1 n 2 t D x _ x ) ; t D x _ y y + σ y ε 0 n 2 D x _ y y = y ( t D x _ y ) t D y _ y x + σ x ε 0 n 2 D y _ y x = x ( 1 n 2 t D y _ y ) ; t D y _ x y + σ y ε 0 n 2 D y _ x y = y ( t D y _ x ) t D y _ y y + σ y ε 0 n 2 D y _ y y = x ( 1 n 2 t D y _ y ) ; t D y _ x x + σ x ε 0 n 2 D y _ x x = x ( t D y _ x ) t D x _ x y + σ y ε 0 n 2 D x _ x y = y ( 1 n 2 t D x _ x ) ; t D x _ y x + σ x ε 0 n 2 D x _ y x = x ( t D x _ y )
σ x ε 0 n 2 D x = σ x ε 0 n 2 D x n + 1 + D x n 2 σ y ε 0 n 2 D y = σ y ε 0 n 2 D y n + 1 + D y n 2
E x n + 1 = E x n 1 + 2 E x n + c 2 Δ t n 2 ( D x _ x x n D x _ x x n 1 ) + c 2 Δ t n 2 ( D x _ y y n D x _ y y n 1 ) β 2 c 2 Δ t 2 n 2 E x n             + c 2 Δ t n 2 ( D y _ y x n D y _ y x n 1 ) c 2 Δ t n 2 ( D y _ x y n D y _ x y n 1 ) E y n + 1 = E y n 1 + 2 E y n + c 2 Δ t n 2 ( D y _ y y n D y _ y y n 1 ) + c 2 Δ t n 2 ( D y _ x x n D y _ x x n 1 ) β 2 c 2 Δ t 2 n 2 E y n             + c 2 Δ t n 2 ( D x _ x y n D x _ x y n 1 ) c 2 Δ t n 2 ( D x _ y x n D x _ y x n 1 )
D x _ x n + 1 = 2 ε 0 n 2 Δ t σ x 2 ε 0 n 2 + Δ t σ x D x _ x n + 2 ε 0 n 2 Δ t 2 ε 0 n 2 + Δ t σ x ( x n 2 E x ) D x _ y n + 1 = 2 ε 0 n 2 Δ t σ y 2 ε 0 n 2 + Δ t σ y D x _ y n + 2 ε 0 n 2 Δ t 2 ε 0 n 2 + Δ t σ y ( y E x ) D y _ x n + 1 = 2 ε 0 n 2 Δ t σ x 2 ε 0 n 2 + Δ t σ x D y _ x n + 2 ε 0 n 2 Δ t 2 ε 0 n 2 + Δ t σ x ( x E y ) D y _ y n + 1 = 2 ε 0 n 2 Δ t σ y 2 ε 0 n 2 + Δ t σ y D y _ y n + 2 ε 0 n 2 Δ t 2 ε 0 n 2 + Δ t σ y ( y n 2 E y ) D x _ x x n + 1 = 2 ε 0 n 2 Δ t σ x 2 ε 0 n 2 + Δ t σ x D x _ x x n + 2 ε 0 n 2 2 ε 0 n 2 + Δ t σ x [ x 1 n 2 D x _ x n x 1 n 2 D x _ x n 1 ] D x _ y y n + 1 = 2 ε 0 n 2 Δ t σ y 2 ε 0 n 2 + Δ t σ y D x _ y y n + 2 ε 0 n 2 2 ε 0 n 2 + Δ t σ y [ y D x _ y n y D x _ y n 1 ] D y _ y x n + 1 = 2 ε 0 n 2 Δ t σ x 2 ε 0 n 2 + Δ t σ x D y _ y x n + 2 ε 0 n 2 2 ε 0 n 2 + Δ t σ x [ x 1 n 2 D y _ y n x 1 n 2 D y _ y n 1 ] D y _ x y n + 1 = 2 ε 0 n 2 Δ t σ y 2 ε 0 n 2 + Δ t σ y D y _ x y n + 2 ε 0 n 2 2 ε 0 n 2 + Δ t σ y [ y D y _ x n y D y _ x n 1 ] D y _ y y n + 1 = 2 ε 0 n 2 Δ t σ y 2 ε 0 n 2 + Δ t σ y D y _ y y n + 2 ε 0 n 2 2 ε 0 n 2 + Δ t σ y [ y 1 n 2 D y _ y n y 1 n 2 D y _ y n 1 ] D y _ x x n + 1 = 2 ε 0 n 2 Δ t σ x 2 ε 0 n 2 + Δ t σ x D y _ x x n + 2 ε 0 n 2 2 ε 0 n 2 + Δ t σ x [ x D y _ x n x D y _ x n 1 ] D x _ x y n + 1 = 2 ε 0 n 2 Δ t σ y 2 ε 0 n 2 + Δ t σ y D x _ x y n + 2 ε 0 n 2 2 ε 0 n 2 + Δ t σ y [ y 1 n 2 D x _ x n y 1 n 2 D x _ x n 1 ] D x _ y x n + 1 = 2 ε 0 n 2 Δ t σ x 2 ε 0 n 2 + Δ t σ x D x _ y x n + 2 ε 0 n 2 2 ε 0 n 2 + Δ t σ x [ x D x _ y n x D x _ y n 1 ]
E x n + 1 = E x n 1 + 2 E x n + c 2 Δ t n 2 ( D x _ x x n D x _ x x n 1 ) + c 2 Δ t n 2 ( D x _ y y n D x _ y y n 1 ) β 2 c 2 Δ t 2 n 2 E x n E y n + 1 = E y n 1 + 2 E y n + c 2 Δ t n 2 ( D y _ y y n D y _ y y n 1 ) + c 2 Δ t n 2 ( D y _ x x n D y _ x x n 1 ) β 2 c 2 Δ t 2 n 2 E y n
D x _ x n + 1 = 2 ε 0 n 2 Δ t σ x 2 ε 0 n 2 + Δ t σ x D x _ x n + 2 ε 0 n 2 Δ t 2 ε 0 n 2 + Δ t σ x ( x n 2 E x ) D x _ y n + 1 = 2 ε 0 n 2 Δ t σ y 2 ε 0 n 2 + Δ t σ y D x _ y n + 2 ε 0 n 2 Δ t 2 ε 0 n 2 + Δ t σ y ( y E x ) D y _ x n + 1 = 2 ε 0 n 2 Δ t σ x 2 ε 0 n 2 + Δ t σ x D y _ x n + 2 ε 0 n 2 Δ t 2 ε 0 n 2 + Δ t σ x ( x E y ) D y _ y n + 1 = 2 ε 0 n 2 Δ t σ y 2 ε 0 n 2 + Δ t σ y D y _ y n + 2 ε 0 n 2 Δ t 2 ε 0 n 2 + Δ t σ y ( y n 2 E y ) D x _ x x n + 1 = 2 ε 0 n 2 Δ t σ x 2 ε 0 n 2 + Δ t σ x D x _ x x n + 2 ε 0 n 2 2 ε 0 n 2 + Δ t σ x [ x 1 n 2 D x _ x n x 1 n 2 D x _ x n 1 ] D x _ y y n + 1 = 2 ε 0 n 2 Δ t σ y 2 ε 0 n 2 + Δ t σ y D x _ y y n + 2 ε 0 n 2 2 ε 0 n 2 + Δ t σ y [ y D x _ y n y D x _ y n 1 ] D y _ y y n + 1 = 2 ε 0 n 2 Δ t σ y 2 ε 0 n 2 + Δ t σ y D y _ y y n + 2 ε 0 n 2 2 ε 0 n 2 + Δ t σ y [ y 1 n 2 D y _ y n y 1 n 2 D y _ y n 1 ] D y _ x x n + 1 = 2 ε 0 n 2 Δ t σ x 2 ε 0 n 2 + Δ t σ x D y _ x x n + 2 ε 0 n 2 2 ε 0 n 2 + Δ t σ x [ x D y _ x n x D y _ x n 1 ]
Ψ n + 1 = Ψ n 1 + 2 Ψ n + c 2 Δ t n 2 ( D x _ x x n D x _ x x n 1 ) + c 2 Δ t n 2 ( D x _ y y n D x _ y y n 1 ) β 2 c 2 Δ t 2 n 2 Ψ n
D x _ x n + 1 = 2 ε 0 n 2 Δ t σ x 2 ε 0 n 2 + Δ t σ x D x _ x n + 2 ε 0 n 2 Δ t 2 ε 0 n 2 + Δ t σ x ( x E x ) D x _ y n + 1 = 2 ε 0 n 2 Δ t σ y 2 ε 0 n 2 + Δ t σ y D x _ y n + 2 ε 0 n 2 Δ t 2 ε 0 n 2 + Δ t σ y ( y E x ) D x _ x x n + 1 = 2 ε 0 n 2 Δ t σ x 2 ε 0 n 2 + Δ t σ x D x _ x x n + 2 ε 0 n 2 2 ε 0 n 2 + Δ t σ x [ x D x _ x n x D x _ x n 1 ] D x _ y y n + 1 = 2 ε 0 n 2 Δ t σ y 2 ε 0 n 2 + Δ t σ y D x _ y y n + 2 ε 0 n 2 2 ε 0 n 2 + Δ t σ y [ y D x _ y n y D x _ y n 1 ]
Select as filters


Select Topics Cancel
© Copyright 2024 | Optica Publishing Group. All rights reserved, including rights for text and data mining and training of artificial technologies or similar technologies.