Scattering of electromagnetic radiation from a sphere, so-called Mie scattering, requires calculations that can become lengthy and even impossible for those with limited resources. At the same time, such calculations are required for the widest variety of optical applications, extending from the shortest UV to the longest microwave and radar wavelengths. This paper briefly describes new and thoroughly documented Mie scattering algorithms that result in considerable improvements in speed by employing more efficient formulations and vector structure. The algorithms are particularly fast on the Cray-1 and similar vector-processing computers.
© 1980 Optical Society of America
Mie scattering calculations pervade the entire field of atmospheric optics. Applications range from one end of the electromagnetic spectrum to the other—from UV solar radiation backscattered by stratospheric aerosols to satellites, through visible and IR radiation scattered by clouds and aerosols, to microwaves and radar scattered from large hydrometeors. they involve the function
The only remaining question is how to structure the Mie computation for maximum efficiency, while at the same time maintaining accuracy and avoiding numerical instability and ill-conditioning. The first published Mie algorithms were those of Dave, although Irvine, Plass, and Cheyney, to name just a few, had been doing Mie calculations for several years prior. Most new investigators in the intervening years tended to use Dave's algorithms or variants thereof.
Mie calculations have gained a reputation for being time-consuming. This is, first, because the upper limit N in Eqs. (1) is roughly equal to x, which can become very large, e.g., about 1260 for a 100-μm water drop at a visible wavelength of 0.5 μm. Second, because in typical applications one wishes to sum these series for a large number of radii (as in integrating over a size distribution) or for a large number of wavelengths (as in integrating across the solar spectrum) or for a large number of refractive indices (as in inverting scattering measurements to deduce refractive index). The present author alone has used many hours of Univac 1108, IBM 360/91, CDC 7600, and Cray-1 time doing Mie computations.
In light of all this, it seems likely that a pair of new and very efficient Mie algorithms, which the author has developed over the past several years and which are fully described in a recent report, will be of broad interest. A synopsis of the original features of the algorithms and certain selected timing comparisons are given below. For the remaining features and listings of computer codes implementing the algorithms, the reader is referred to the report cited in [Ref. 6].
II. An Computation
The complex function An was computed by upward recurrence in the early days of Mie calculations. Then Kattawar and Plass showed that upward recurrence could be highly unstable if absorption (mIm) was large enough, but that downward recurrence was always stable. Dave allowed for this by furnishing two algorithms, one with upward recurrence and one with downward recurrence; the user could take his pick.
This seemed a highly unsatisfactory state in which to leave the matter. Therefore we sought a criterion for determining a priori, from x and m, when down recurrence had to be used (when it is safe, up recurrence is always preferable because it is faster). At first we did this by observing the breakdown of many upward recursive An computations, from which we were able to descry an empirical criterion for allowing up recurrence. This had the form
We did this by not considering An computations in isolation, but by considering the full Mie computations. Exact Mie results were generated using down recurrence for An and compared with the corresponding results produced by up recurrence. At first we looked only at the Mie quantities Qext, Qsca, and g [Eqs. (1a)–(1c)], and we counted up recurrence of An as a failure whenever it produced a relative error exceeding 10−6 in any one of these quantities. It was clear from our initial studies that for fixed x and mRe, whenever up recurrence failed for a given imaginary index, it failed for all larger indices as well. Therefore, for each pair (x,mRe) we performed an upward search on mIm to find the first value at which up recurrence failed. The search was successively refined until was determined to 3 significant digits. We considered values of x from 1 to 10,000 and values of mRe from 1.05 to 9.25, which should cover almost any conceivable situation of practical interest.
This search revealed that, for fixed mRe, the product rapidly approached an asymptotic value from above as x increased. This asymptotic value depended on mRe. Mathematically speaking, this meant there was some function f(mRe) such thatFig. 1(a). They can be fitted quite excellently by the polynomial expression Eq. (6) is considerably better.]
We then extended our study to include the angular scattering functions S1 and S2 [Eqs. (1d) and (1e)], calculated at 1° increments from 0° to 180° inclusive. Up recurrence was regarded as a failure if at any angle the real or imaginary part of S1 or S2 had a relative error exceeding 10−5. (Qext, Qsca, and g played no role because their relative errors were invariably down at the 10−10 level when S1 or S2 relative errors first reached 10−5. The reason is that the earlier terms in the S1 or S2 sums may cancel, so that the later terms, which are most sensitive to up-recurrence instability, dominate the sum; this phenomenon is greatly mitigated in the sums for Qext, Qsca, and g.) The data we obtained are plotted in Fig. 1(b). They are considerably more noisy than the data in Fig. 1(a), which is due to the following circumstance: sometimes the real part of S1 or S2 is orders of magnitude smaller than the imaginary part or vice versa. These very small components are quite sensitive to up-recurrence errors because considerable cancellation of significant digits has taken place in summing for them. These very small components, which dominate the f(mRe) determination, are necessarily noisier than Qext, Qsca, or g. Because the data in Fig. 1(b) are more erratic than they are in Fig. 1(a), we have only fitted a quadratic to them:
A more limited study than the present one was done by Wiscombe. The scattered intensity | S1 |2 + | S2 |2 and the degree of polarization (|S2|2 − |S1|2)/(|S2|2 + | S1 |2) were examined at 60 angles rather than S1 and S2 at 181 angles, and mRe was varied only from 1.05 to 2.5. The criterion so derived was
Equation (5)–(7) furnish a priori criteria when up recurrence of An is safe. If down recurrence must be used, it can be made considerably more efficient by initializing it using the Lentz method rather than using Dave's method. Dave initializes using AN* (mx) = 0, where N* = 1.1| mx | + 1. Presuming N* > N ≃ x + 4x1/3 + 1 (see Sec. V) and since only A1 through AN are required, Dave's method requires about (1.1 | m | − 1)x − 4x1/3 iterations to get AN. (We say presuming because we have found many cases where N* < N, so that Dave's method fails utterly. For example, N* = 17 while N = 19 for m = 1.05 − i and x = 10; or N* = 117 while N = 119 for m = 1.05 − 0.1i and x = 100.) Table I compares this number, for a few values of m and x not satisfying Eq. (5), with the number of Lentz method iterations necessary to calculate AN. Since an iteration of either type requires about the same amount of computation time, the Lentz method clearly enjoys a dramatic advantage. The number of Dave iterations furthermore rises sharply as the imaginary index increases, while the number of Lentz iterations falls in the same situation. The Lentz method also has an error that is easily controllable and has explicit procedures to avoid ill-conditioning.
It should be emphasized that the Lentz method is only faster in combination with our up-recurrence criterion. A more extensive version of Table I (see [Ref. 6]) shows that, as the imaginary index falls below , the number of Lentz method iterations rises sharply and becomes comparable with the number of Dave method iterations.
The above empirical criteria are, of course, somewhat dependent on computer precision. On the CDC and Cray-1 machines all computations could be done in single precision (14 significant digits). But the 8 significant digit single precision on the IBM and Univac machines is inadequate, and one must do the An recurrence (but not the AN initialization for down recurrence) in double precision on those machines.
III. S1 and S2 Computation
The computation of scattering amplitudes S1 and S2 [Eqs. (1d) and (1e)] ordinarily requires almost all the time in a Mie scattering calculation. This is because the right-hand sides in Eqs. (1d) and (1e) are typically summed for many angles. Thus efficiency in this part of the calculation is mandatory. We achieve this efficiency partly by vector structure (see Sec. VI) and partly by two devices to be discussed in this section.[Ref. 6]. These recurrences require a total of three multiplications and three additions, compared with six multiplications and four additions for Dave's recurrences. [We have assumed precalculation of purely numerical factors like (n + 1)/n in both cases.] In our scheme, τn(n) is calculated just before the nth series term is formed, while πn+1(μ) is calculated just afterward; this allows the sharing of the common quantity t.
The second device consists in calculating not S1 and S2 but rather
IV. Small Particle Approximation
In the small particle or Rayleigh limit x → 0, the Mie formulas become ill-conditioned. In particular, the recurrence for An, the recurrence for a spherical Bessel function involved in an and bn, and the numerator in the expression for bn, all involve subtraction of nearly equal numbers as x → 0. After finding the usual formulas for this limit (e.g., those cited in [Refs. 1] and ) insufficiently accurate in cases where mIm ≪ 1, as described by Wiscombe, we developed improved versions as follows:
V. Number of Terms in Mie Series
To take advantage of vector structure, one must estimate a priori the number of terms N in the Mie series [Eqs. (1a)-(1e)]. This is in direct contrast to Dave's procedure, which consists in stopping summation when | an |2 + |bn|2 falls below 10−14. Analysis of the convergence behavior of these series reveals that it is only slightly influenced by refractive index and that N ∼ x. In order to get a more precise estimate, we followed the suggestion of Khare that N ∼ x + cx1/3, where the x1/3 term accounts for edge wave contributions. Upon generating a large amount of data on N as a function of x using a convergence criterion like Dave's, we found that these data could be excellently fit by
VI. Vector Structure
The newer computers, like the Cray-1, process vectors in very much the same way that older machines processed scalars (see [Ref. 10]). This can lead to dramatic speed increases when calculations are structured to enable processing of entire vectors at once.
Mie calculations involve two types of vectors, one type having the index n in Eqs. (1a)–(1e) and one type having an angular index. Loops over index n can be completely vectorized except when summing is being done, in which case they are only partly vectorizable. But loops over angle are always fully vectorizable. Therefore, the angular loop for S1 and S2 is placed inside the summing loop, in order to achieve maximum vector structure; this explains the importance of the S1 and S2 calculation improvements in Sec. III.
The only significant parts of the Mie calculation which cannot be vector structured are the Lentz method for AN and the recurrences for An and a certain spherical Bessel function. All such iterative computations, in which a result depends on one or more prior results, are intrinsically unvectorizable. But the remaining parts of the Mie calculation have been vectorized, including summing operations (using the techniques described by Johnson).
VII. Timing Studies
Table II compares published times for Dave's algorithms and the faster of the two algorithms in the Wiscombe report. (The slower algorithm is designed to use the absolute minimum amount of memory, which exacts a penalty of lesser efficiency.) The new algorithm is 30–40 times faster when the difference in basic computer speed is factored out. Of this, a factor of ∼7 is due to the vector structure, a factor of ∼2 to more efficient formulations, and the remaining improvement to the use of single precision rather than the slower double precision of Dave's routines.
Table III shows the times required to do Mie calculations with the new algorithm, for size parameters ranging from 1 to 5000 and anywhere from 0 to 255 scattering angles. Each quoted time is an average over the times for eight values of mRe for fixed mIm = 0.1. [The times are 5–30% faster for mIm = 0 because special branches are taken in the Mie coefficient (an,bn) calculation and because only up recurrence is used for An.]
There is a big rise, a factor of 2–3, in going from zero to three angles in Table III. But after this initial jump, one must go all the way to sixty-three angles to double the three-angle time. This shows the dramatic advantage of vectorized angular loops, as well as the savings to be reaped if only Qext, Qsca, and g are required.
For size parameters x below 100, the Table III times rise considerably more slowly than linearly in x, demonstrating the advantage of vector structure in certain n loops, including the ones for summing. Only beyond x = 100 do the times go up about linearly in x, but this phenomenon is specific to the Cray-1 and comes about because the Cray-1 processes vectors in sixty-four-element segments.
Timing studies like these furnish a concrete basis for testing claims of Mie algorithm improvement and incidentally allow one to make rough a priori estimates of the times required for particular Mie computations.
Dramatically faster Mie algorithms have been made possible by vector structuring and by much more efficient handling of the An and S1 and S2 calculations. Better formulas for the small particle case are also presented. Timing studies indicate a factor of 30–40 improvement over Dave's algorithms when intermachine differences are removed. Accuracy of the new algorithms is 5–6 significant digits or better, and they have been used without difficulty up to size parameters of 20,000 for real refractive indices from 1 to 9 and imaginary indices from 0 to 10. Mie calculations which were impossibly long a few years ago can be done routinely with these algorithms.
We thank Ron Welch (University of Mainz, Germany) and Eric Smith (Colorado State University) for encouraging us to make these algorithms more widely available and Tom Ackerman (NASA Ames) for testing the slower one (MIEV0) directly against Dave's DBMIE (converted to single precision) on his CDC 7600.
The National Center for Atmospheric Research is sponsored by the National Science Foundation.
Figure and Tables
1. H. C. Van de Hulst, Light Scattering by Small Particles (Wiley, New York, 1957).
2. M. Kerker, The Scattering of Light and Other Electromagnetic Radiation (Academic, New York, 1969).
3. L. Infeld, Q. Appl. Math. 5, 113 (1947).
4. J. V. Dave, “Subroutines for Computing the Parameters of the Electromagnetic Radiation Scattered by a Sphere,” Report 320-3237 (IBM Scientific Center, Palo Alto, Calif., 1968).
5. J. V. Dave, IBM J. Res. Dev. 13, 302 (1969) [CrossRef] .
6. W. J. Wiscombe, “Mie Scattering Calculations: Advances in Technique and Fast, Vector-Speed Computer Codes,” NCAR Technical Note NCAR/TN-140+STR (National Center for Atmospheric Research, Boulder, Colo. 80307,1979).
9. V. Khare, “Short-Wavelength Scattering of Electromagnetic Waves by a Homogeneous Dielectric Sphere,” Ph.D. Thesis, U. Rochester, N.Y., 1976 (available from University Microfilms, Ann Arbor, Mich.).
10. P. M. Johnson, Comput. Des. 17, 89 (1978).