A recently proposed ray-based method for wave propagation is used to provide a meaningful criterion for the validity of rays in wave theory. This method assigns a Gaussian contribution to each ray in order to estimate the field. Such contributions are inherently flexible. By means of a simple example, it is shown that superior field estimates can result when the contributions are no longer forced to evolve like parabasal beamlets.
©2002 Optical Society of America
It is usually said that rays result from considering the limit of small wavelength . For nonzero wavelengths, however, the meaning of a ray is unclear, and the choice of a set of rays that is to be associated with a wave field can be ambiguous. Rays, nevertheless, are often used successfully to estimate the propagation of a variety of waves, and there are several non-equivalent procedures for doing so. For some of these methods, rays represent infinitely localized conduits of power, while for others they correspond to extended plane waves. Each of these approaches has its own domain of validity. Overviews of different ray-based wave propagation techniques are presented, for example, in two books by Kravtsov and Orlov [2,3].
In a recent series of papers [4–7], we introduced the core elements of a new ray-based method for the propagation of waves. The field estimate given by this method takes the form of a sum of Gaussian contributions where each member is associated with a ray but also holds information about the neighboring rays. It turns out that there is an inherent flexibility in the width of these contributions: the sum is structured so that the result is insensitive to this width. For this reason, we coined the term “stable aggregates of flexible elements” (SAFE) to describe the method.
The purpose of this manuscript is to discuss some of the advantages of this flexibility, and to show how it provides a meaningful criterion for the significance of rays in wave problems. In particular, we seek to answer a question that remained open in our previous work: Given that SAFE allows the widths of its Gaussian contributions to evolve almost arbitrarily, wouldn’t the most accurate estimates result from requiring these contributions to evolve like parabasal beamlets? That is, as in standard beamlet summation methods, the individual contributions could be forced to be approximate solutions to the wave equation (in this case Helmholtz). To answer this question for the case of a scalar monochromatic field in an inhomogeneous medium, we summarize the derivation of the estimates for these two approaches in Section 2. By means of a simple example, it is shown in Section 3 that the freedom of the widths within SAFE not only simplifies the calculations but also gives superior estimates. This is especially so in media where Gaussian beamlets degenerate rapidly. A second goal of this paper is to illustrate the application of the new method, especially the tools for propagation across —and reflection from— an interface as developed in Ref. 7. This is done in Section 4 for the example used in Section 3.
2. Sums of Gaussian contributions
For simplicity, consider the case of propagation in two dimensions. The field U considered here satisfies the Helmholtz equation.
where k = 2π/λ and n is the refractive index. We assume that this field propagates essentially (but not necessarily paraxially) towards larger z. (That is, it is assumed that rays do not turn around in z.) A one-parameter family of rays can be associated with this two-dimensional wave field. The rays can be written as x = X(ξ; z), where ξ smoothly identifies the rays. The initial conditions for these rays, namely X(ξ;0) and ∂X/∂z (ξ;0), must be determined from the specified initial field U(x,0). The rays are then traced through the medium by using standard ray equations:
where P and H correspond, respectively, to n(X, z) times the x and z components of a unit vector locally tangent to the ray. Hence
Also of interest is the optical length of these rays, measured from a common phase front. This function, denoted by L, satisfies
Effectively, L is the eikonal evaluated at [X(ξ; z), z].
The field estimates considered here take the form of a sum over all rays:
Here, w is a weight function and g is the Gaussian contribution given by
Notice that, at any given z, the contribution in Eq. (6) is centered at x = X(ξ; z) and has a transverse phase that is linear and matches that of a plane wave propagating in the ray’s local direction. The width of the Gaussian is evidently determined by γ. We now summarize two ways in which the form in Eq. (5) can be forced to satisfy Eq. (1) asymptotically:
a) Gaussian beamlets
The conventional option exploits the linearity of the wave equation. That is, each independent contribution wg is forced to be an asymptotic solution of Eq. (1), i.e. to evolve like a parabasal Gaussian beamlet around the ray identified by ξ. As discussed in the Appendix of Ref. 4, the weight and width of the contributions must then evolve according to
where w, γ, P, and H are functions of ξ and z, and n and its derivatives are evaluated at [X(ξ; z), z]. The beamlets that result from these expressions are asymptotic solutions of the wave equation, valid under what is known as the parabasal approximation, that is, the assumption that the beamlet remains well localized around its central ray. The field estimate for z > 0 results from the sum of the independent beamlets.
The novel approach proposed in Ref. 4 is to force only the sum of all contributions in Eq. (5) to be an accurate solution Eq. (1). Integration by parts over ξ can then be used to mix different terms that could not be combined in the beamlet approach. The simple form of the weight function that results was shown in Ref. 4 to be
where the primes denote partial derivatives with respect to ξ. These derivatives mean that the weight of each Gaussian contribution depends not only on the corresponding ray but also on its closest neighbors. This is consistent with the intuitive idea that rays carry information in a shared fashion. It turns out that |a 0|2 dξ (which is constant along a ray) is proportional to the power carried by the ray bundle.
This second approach has several advantages. The first is that, at the top asymptotic order, the weight is given by the explicit expression in Eq. (9) instead of the (coupled) differential equations that beamlets must satisfy. The lower order terms in Eq. (9) can be neglected to get a basic estimate, referred to as . A more important advantage is that, when corrections or error estimates are needed, the specific form of the next term (given explicitly in Refs. 4 and 6) can be used to account not only for the global effects of any initial decomposition errors, but also for other errors accumulated through propagation. Third — and most strikingly — γ is now a free parameter and not constrained by the likes of Eq. (8). This means that the Gaussian contributions are not forced to evolve (and disintegrate) like beamlets. Even though they don’t accurately satisfy Eq. (1) individually, they do so collectively. The form of the weight in Eq. (9) not only guarantees that the estimate in Eq. (5) satisfies the Helmholtz equation asymptotically, it also makes the estimate asymptotically insensitive to the width parameter γ. That is, as shown in Ref. 4,
Equation (10) means that the width of the elements is, within some well-defined bounds, effectively irrelevant. This fact gives a profound link between rays and waves that does not require contrived notions such as varying k or sending it to infinity: a ray family has an adequate connection to a wave field only when the width of the contributions can be varied over a significant range without disrupting the associated field estimate. That is, the estimate must remain within the desired accuracy window. In cases when there are no caustics, the width of the contributions can even be made to vanish without changing the estimate significantly. In this case SAFE reduces to the most standard ray-based construction, where a ray is interpreted as a local conduit of power. Similarly, when there are no “momentum caustics” (i.e. neighboring rays are parallel), the contribution’s width can be sent to infinity, and we get another standard interpretation of a ray, namely a plane wave. In cases when these two options are valid, SAFE gives a continuous link between them. More generally, both extremes fail catastrophically, but SAFE remains valid for a wide range of intermediate values of γ.
Notice that the calculation of SAFE’s basic field estimate requires no information other than that given by the ray family (including the conserved weight a 0). For the beamlet summation case, on the other hand, one must also trace the width and weight of each beam. The bounds for the variation of γ mentioned above [which follow from the explicit form of the right-hand side of Eq. (10) as presented in Ref. 4] have a simple geometric interpretation in terms of the ray family alone. This interpretation and the meaning of insensitivity in general are illustrated with a simple example in the following section. We also show that the collective behavior of the contributions in SAFE can give more accurate wave models than a sum of Gaussian beamlets that each obeys Eqs. (7) and (8). To many, this is counterintuitive.
3. Example: Modes of a quadratic gradient-index waveguide
To illustrate the meaning of the insensitivity condition and to demonstrate that SAFE can outdo sums of Gaussian beamlets, we consider the gradient-index waveguide used in the example from Ref. 6, where
with n 0 and ν both positive and real. The modes of this waveguide are Hermite-Gaussians of the form
where k¯ = kn 0/ν and Hm is the Hermite polynomial of order m. When m > M = (k¯-1)/2, the argument of the square root within the second exponential becomes negative. In this case, the positive imaginary root must be chosen, so the corresponding modes decay exponentially in z, i.e. they are evanescent. For 0 ≤ m ≥ M, the modes in Eq. (12) are homogeneous, and they linearly accumulate a phase as they propagate in z.
3.1 Matching the initial condition and achieving insensitivity
In the example in Ref. 6 we considered a broad initial field composed of several modes. Here, however, our goal is to determine what type of evolution for the width of the Gaussian contributions leads to better results. It turns out that an individual homogeneous mode of this medium is ideal for this purpose. A set of initial conditions for a ray family that are associated with a Hermite-Gaussian initial field (without regard of the medium in which this field was to propagate) like U m(x,0) was proposed in Ref. 5. These conditions are given by
where is the spatial half-width of U m(x,0), and is the corresponding half-width of its Fourier transform. Here, ξ, varies over an interval of length 2π. It follows from Eqs. (2) and (11) that these rays propagate according to
where is the constant value that H(ξ; z) takes for this example. The corresponding ray family is overlaid onto the intensity profile of the mode for m = 8 in Fig. 1.
Notice that the high intensity regions coincide roughly with the ray caustics, and that the central fringes can be thought of as interference patterns associated with the ray crossings. Further, to within a shift in z, all rays are identical and have a uniform weight, i.e. a 0(ξ) = c. The optical length of each ray follows from Eqs. (4):
Unlike X and P, this function is not periodic in ξ. It can be seen from Eqs. (5), (6), and (9), however, that the values of X 0 and P 0 given after Eq. (13) guarantee that the field contribution wg is periodic when the square root in Eq. (9) is chosen to vary continuously in ξ. That is, by admitting only these discrete values for X 0 and P 0, this “quantization” condition ensures that the resulting estimate is independent of the chosen interval.
It was found in Ref. 5 that, provided γ is chosen to be γ 0 = n 0 ν, the initial condition U m(x,0) is met exactly by the estimate that results upon substituting Eq. (6) and the leading order of Eq. (9) into Eq. (5) with z = 0. Here, the ray family is the one discussed above. Although the match is exact, we claim that this construction gives a meaningful connection between rays and waves only if the estimate is insensitive to changes in γ over a significant interval around γ 0. (We only consider here real values for γ, although this parameter can be complex in general as discussed in Refs. 4–6, provided its real part is positive for the contributions to be localized.) On account of Eq. (10), insensitivity evidently holds for sufficiently large values of k, but such observations are not useful in practice. However, we have derived an explicit insensitivity condition. In the present case, insensitivity demands that must be much larger than unity, as we now explain.
At a given z, the ray family can be represented by the parametric curve [X(ξ; z), P(ξ; z)], for varying ξ, in the (x,p) plane referred to as phase space. It is seen from Eqs. (14) that, for the present example, this curve is always an ellipse enclosing an area of πX 0 P 0 = π(2m + 1)/k. The geometric criterion for insensitivity found in Ref. 4 is best expressed in a (dimensionless) scaled space with coordinates . Specifically, γ must be chosen to maximize the tightest local radius of curvature of the curve . This radius of curvature must be much greater than unity for to be insensitive to γ. The value γ 0 = n 0 ν turns the ellipse into a circle with a radius of . Varying γ deforms the circle into an ellipse, although the enclosed area is fixed at π(2m + 1). Clearly, a larger circle can withstand stronger deformations before its peak curvature becomes problematic.
It is impressive to see that credible results emerge even when the peak curvature comes to well within an order of magnitude of unity. For example, Fig. 2 shows, for m=8, the deformation of the circle and the corresponding variation of the field estimates (and resulting errors) associated with variation of γ. Notice that the errors gather in x around the caustics (at |x| = X 0 in this case) for larger γ, and around the momentum caustics (at x = 0) for smaller γ. Although γ is varied by a factor of four for the animation — hence the ray localization varies by a factor of two —the heights of the intensity peaks change by no more than about five percent. Of course, this insensitivity is far more pronounced for higher order modes (where it is usual to expect ray methods to apply).
We now consider the estimation of the propagated modes by using the two approaches considered in Section 2.
a) Gaussian beamlets
To estimate the field at z > 0, we must calculate the evolution of the width and weight for each beamlet. The initial conditions to be used are γ(ξ;0) = γ 0 and [from Eqs. (9) and (13) with a 0 = c] which, as discussed in the previous subsection, make the continuous superposition in Eq. (5) match exactly the initial field. For this case, and by using Γ(ξ;τ) = γ(ξ;H 0 τ/n 0 ν)/γ 0, Eq. (8) becomes
where ρ = P 0/H 0 = [( m + 1/2)/(M - m)]1/2 is a measure of the non-paraxiality of the mode. Notice that ρ is small only when m ≪ M. That is, only the low-order modes in a waveguide that supports many propagating modes are in fact paraxial. The solution to this equation is given by
The first of these expressions corresponds to a “breathing” (or phase space rotation) of the Gaussians, while the second one describes a linear stretching (or phase space shearing), as will be shown later. Similarly, the solution to Eq. (7) for this case is given by
By using Eqs. (17) and (18) in Eq. (5), we find the corresponding estimate for the modes at different z. In this simple example, the rays are known in closed form. In general, however, rays must be traced numerically, so we can only keep track of a finite, discrete set of them. To simulate this, we replace the integral in Eq. (5) by a sum at uniform intervals in ξ of 2π/N. Fig. 3a shows, for m = 8 and several values of N, the evolution in z of the sums of beamlets compared to the exact result. Notice that, with the chosen initial conditions, the initial field is well matched by the estimate even for a relatively small number of contributions. As z increases, however, the numerical errors caused by the discretization become apparent. Further, it turns out that these are not the only sources of error. Even for very large N, where the sum approaches the integral and the result retains a Hermite-Gaussian form (due to an intrinsic symmetry of this problem), the amplitude and phase go drastically astray. These variations are shown in Fig. 4. The amplitude (which should be constant) initially increases with z, and then decays roughly as z -0.53. The source of this fundamental error is explained by the diagram in Fig. 3b, which shows schematically the region of phase space occupied by each beamlet. As z increases, these Gaussians rotate and shear linearly, reaching spatial spreads well beyond the parabasal regime. The rate of deterioration of this estimate is then dictated by the rate of shearing of each beamlet, which as seen from the second expression in Eq. (17) is given by ρ 2 τ/z = γ 0 /. This means that it takes roughly (2πρ 2)-1 ray periods for the estimate to fail. That is, modes that are closer to the paraxial limit (ρ → 0) have longer-lived estimates. This makes sense because, in the paraxial limit, the medium in Eq. (11) becomes the optical analogue of the harmonic oscillator, the beamlets are analogous to coherent states, and the parabasal approximation made in the derivation of Eqs. (7) and (8) is exactly valid. Beyond the paraxial limit, the beamlets do not individually retain their Gaussian profile.
It follows from Eqs. (14) that, for any z, the phase space curve is always the same ellipse, and rays just circulate around it. Therefore, the most convenient choice for γ is always γ 0 = n 0 ν. By changing the ray parameter to ξ′ = ξ - n 0 ν z/H 0, the estimate in Eq. (5) becomes independent of z, except for a phase factor due to the last term in Eq. (15). This phase matches exactly the linear phase in Eq. (12), and the estimate matches, at all z, the exact result. That is, for this example, SAFE can always match the exact result. Further, the result is equally insensitive at all z, so the grip between rays and waves holds indefinitely. Even when a discrete number of contributions is used, the estimate is quite accurate, as seen from the initial frame of Fig. 3 (where z = 0). This frame characterizes SAFE’s results at all z. Notice also that the non-paraxiality of the problem has no impact on the validity of SAFE.
While this example might appear to be very special in that SAFE is able to match the exact result for all z, it is nevertheless representative of a more general behavior for (at least) smooth waveguides in which n is independent of z. Any guided ray in such medium has a periodic trajectory, which in phase space describes a closed loop (symmetric around the x axis) parametrized by z. Rays belonging to different loops have different periods, meaning that the waveguide is anharmonic. [The only harmonic optical waveguide is the one in Eq. (11) within the paraxial approximation.] The ray families associated to the modes of the waveguide correspond to all the rays that compose one of these phase space loops (i.e. identical rays shifted in z) with the “quantization” condition that the area enclosed by the loop is π(2m + 1)/k. A suitable value (or range of values) for γ can be chosen from the geometric criterion for insensitivity, and the desired mode can then be constructed approximately. Its profile is independent of z. If, on the other hand, we let each Gaussian propagate as a beam, the anharmonicity will cause it to stretch in phase space in a form similar to that in Fig. 3b, causing the estimate to deteriorate under propagation. Therefore, for smooth waveguides, it is always more effective to make γ independent of z instead of forcing it to evolve as the width of a Gaussian beamlet. This constancy of γ gives an analogue to Heller’s “frozen Gaussian approximation” used in semiclassical quantum mechanics . There are other options, however, considered in Refs. 4–6 where γ is allowed to depend on ξ, and this can help to further enhance SAFE’s estimates for more complex waveguides.
Suppose now that the initial field is not a mode, and the corresponding phase space curve is not one of the unchanging phase space tracks mentioned above. As z increases, the anharmonicity causes the resulting curve to get wrapped up in a spiral-like form, where each turn becomes increasingly similar to one of the natural loops. Again, the corresponding propagated beamlets would be highly stretched in phase space, causing errors in the estimate. SAFE, on the other hand, can handle such cases well, as seen in the example in Ref. 6. We can therefore expect SAFE to give superior results to those of beamlet summation for a wide variety of cases, where the parabasal approximation used in the derivation of Eqs. (7) and (8) becomes invalid, due to the beamlets’ expansion.
4. Propagation through an interface
We now illustrate the extended capabilities of ray-based methods by applying SAFE to the problem of modeling the interaction of a modal field with an interface between the waveguide used as an example in Section 3 and a homogeneous medium. We assume a flat vertical interface at z = z 1 and a homogeneous medium of refractive index n 1. The objective is to find the transmitted and reflected fields that are generated by the incidence of a waveguide mode at the interface. This problem has no closed-form solution, and an accurate analysis may appear to be beyond the reach of rays.
Following the approach in Ref. 7, the transmitted and reflected fields U t and U r resulting from the incidence of U i = U m at the interface are estimated from the corresponding transmitted and reflected ray families by using the prescription in Eqs. (5), (6), and (9). By requiring that the boundary conditions, namely U i γ0 + U r γ0 = U t γ0 and at z = z 1, are asymptotically satisfied, it follows that the weights used in these estimates are given by the following intuitive relations:
Here, the transmission and reflection factors are similar to the TE Fresnel coefficients:
where q = n 1/n 0. Note that the first part of Eq. (21) follows from the fact that, due to Snell’s law, P is conserved across a flat interface perpendicular to z. In the last part of Eq. (21), we used Eq. (14) as well as = + . Notice also that, for (i.e. n 1 < P 0), the rays hitting the interface in the vicinity of the z axis are totally reflected. As a result, the corresponding transmitted rays have a (positive) imaginary H 1, i.e. the transmitted field has evanescent components. Estimates of the resulting wavefield when the incident mode has m = 8 and ρ = 1 are shown in Fig. 5 as an animation where q = n 1/n 0 is changed in time.
As mentioned above, one of the great advantages of SAFE is the possibility of estimating its own level of error. This follows from explicitly considering the top two orders in Eq. (9):
Notice that a 1 can depend on z and γ as well as on ξ. For the incident field we let a 1(ξ, γ 0;z) = 0 since, with γ = γ 0 and a 0 = c, SAFE gives an exact reconstruction of U m (so no correction is needed). As with a 0 earlier, the values of a 1 at z 1 for the transmitted and reflected field estimates are found from the boundary conditions (as in Ref. 7) to be given by
where h(ξ) = H 1(ξ)/H 0. [A simple prescription for finding the functions in Eq. (23) at other values of γ is given in Ref. 6. With this, an optimal value of γ can be found by minimizing the error estimates discussed below. We do not bother with this here, however.]
The results in Eq. (23) can be used as corrections to a 0 in order to improve the accuracy of the estimates, roughly doubling the number of significant digits. Alternatively, they can provide a valuable measure of error of the estimates. It was shown in Ref. 6 that an rms relative error of SAFE’s basic estimate is asymptotically given by
where T and R are given by
For the estimates shown in Fig. 5, the expected errors at the interface are shown in Fig. 6. The rms error for the transmitted field is evidently only a few percent for the majority of cases shown in Fig. 5. Notice that that, because the field for z<0 is generally dominated by the incident field, the relative error in the total field is typically much less than the value indicated by the orange curve in Fig. 6. It is also interesting that the reflected field is no longer propagated without error by SAFE. This follows from the fact that (ξ) is not constant.
5. Concluding remarks
Although SAFE and the methods based on Gaussian beamlet summation may appear to be similar, there are significant differences. In particular, SAFE does not depend on the parabasal approximation for each contribution. What is more, SAFE has greater accuracy and comes with a geometric condition of validity for the ray-wave connection. Although we are assuming that the summed beamlets are associated to a one-parameter (or two-parameter in 3D) family of rays [9,10] in the comparisons presented above, there are other options. These may be based on discrete summations of Gaussian beamlets [11–16] or even on continuous sums over all possible initial positions and directions . Error estimation remains a difficult multi-step process for all of these methods. Aside from the flexible Gaussians, a fundamental advantage of SAFE is the possibility of computing error estimates and/or higher order corrections even when such things as refracting/reflecting interfaces are encountered. As demonstrated in Sec. 4, this is realized in a single consistent framework. What is more, the rms error [see Eq. (24)] is accessible without evaluating diffraction-like integrals such as Eq. (5). That is, the error estimate can be cheaper to compute than any one field value. This error analysis gives the ultimate criterion for the validity of the field estimate and, indeed, of the connection between rays and waves. It complements (and is consistent with) the criterion of insensitivity discussed above and, together, they quantify the strength of the link between rays and waves in any given application.
MAA acknowledges the funding by the Dirección General de Asuntos del Personal Académico IN112300 Optica Matemática project of the Universidad Nacional Autónoma de México. GWF thanks both the Australian Research Council and Macquarie University (Sydney, Australia).
References and links
1. M. Born and E. Wolf, Principles of Optics. Electromagnetic Theory of Propagation, Interference and Diffraction of Light (Cambridge, Cambridge, 7th edition, 1999), pp. 116–129.
2. Yu. A. Kravtsov and Yu. I. Orlov, Geometrical Optics of Inhomogeneous Media (Springer, Berlin, 1990). [CrossRef]
3. Yu. A. Kravtsov and Yu. I. Orlov, Caustics, Catastrophes and Wave Fields (Springer, Berlin, 2nd Edition, 1999). [CrossRef]
4. G.W. Forbes and M.A. Alonso, “Using rays better. I. Theory for smoothly varying media,” J. Opt. Soc. Am. A 18, 1132–1145 (2001). [CrossRef]
5. M.A. Alonso and G.W. Forbes, “Using rays better. II. Ray families to match prescribed wave fields,” J. Opt. Soc. Am. A 18, 1146–1159 (2001). [CrossRef]
6. M.A. Alonso and G.W. Forbes, “Using rays better. III. Error estimates and illustrative applications in smooth media,” J. Opt. Soc. Am. A 18, 1357–1370 (2001). [CrossRef]
7. G.W. Forbes, “Using rays better. IV. Refraction and reflection,” J. Opt. Soc. Am. A 18, 2557–2564 (2001). [CrossRef]
8. E.J. Heller, “Frozen Gaussians: A very simple semiclassical approximation,” J. Chem. Phys. 75, 2923–2931 (1981). [CrossRef]
9. M.M. Popov, “A new method of computation of wave fields using Gaussian beams,” Wave Motion 4, 85–97 (1982). [CrossRef]
10. V.M. Babich and M.M. Popov, “Gaussian summation method (review),” Izvestiya Vysshikh Zavedenii, Radiofizika 32, 1447–1466 (1989).
11. M.J. Bastiaans, “The expansion of an optical signal into a discrete set of Gaussian beams,” Optik 57, 95–102 (1980).
12. A.N. Norris, “Complex point-source representation of real point sources and the Gaussian beam summation method,” J. Opt. Soc. Am. A 12, 2005–2010 (1986). [CrossRef]
13. P.D. Einziger, S. Raz, and M. Shapira, “Gabor representation and aperture theory,” J. Opt. Soc. Am. A 3, 508–522 (1986). [CrossRef]
14. P.D. Einziger and S. Raz, “Beam-series representation and the parabolic approximation: the frequency domain,” J. Opt. Soc. Am. A 5, 1883–1892 (1988). [CrossRef]
15. B.Z. Steinberg, E. Heyman, and L.B. Felsen, “Phase-space beam summation for time-harmonic radiation from large apertures,” J. Opt. Soc. Am. A 8, 41–59 (1991). [CrossRef]
16. J.M. Arnold, “Phase-space localization and discrete representation of wave fields,” J. Opt. Soc. Am. A 12, 111–123 (1995). [CrossRef]