This somewhat vague expectation can be given a more definite meaning and verified as follows. The monic polynomial of degree $m$ that has the smallest mean-square value over a given interval can be shown to be proportional to the $m$th-order orthogonal polynomial over that interval, say, $\phi_m$. A Gaussian quadrature scheme with $m$ sample points places those points at the zeros of this very polynomial. Such an integration scheme is clearly unable to determine the mean-square value of $\phi_m$: it exactly integrates polynomials of degree less than or equal to $2m - 1$, whereas the square of $\phi_m$ is of degree $2m$. Nevertheless, if used as a merit function, the Gaussian integration scheme reports a mean-square value of zero for an $m$th-order polynomial if and only if the polynomial is a multiple of $\phi_m$. This means that, in the sense of minimizing the mean-square value, the $m$th-order term is optimally balanced by the terms of lower order. In practice, this entails that when the Gaussian merit functions discussed in this paper seriously underestimate, say, a mean-square wave aberration owing to the dominance of aberrations of higher order than those that the scheme can account for, the balancing of the unseen terms will not be far from optimal. In this sense, the toothpaste tube is being squeezed at just those points that guarantee that the smallest possible volume is left inside in the event that the thickness at each point is reduced to zero.
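
The argument can be checked on a small case. The following is a minimal sketch, assuming the interval $[-1, 1]$ with uniform weight (so that the orthogonal polynomials are the Legendre polynomials) and $m = 3$; the one-parameter family $x^3 - ax$ and the function names are illustrative choices, not taken from the paper. For this family the three-point estimate works out to $(a - 3/5)^2/3$, whereas the true mean square is larger by the constant $4/175$, so both are minimized by the same balance, $a = 3/5$.

```python
import math

# Standard three-point Gauss-Legendre rule on [-1, 1].
NODES = (-math.sqrt(3.0 / 5.0), 0.0, math.sqrt(3.0 / 5.0))
WEIGHTS = (5.0 / 9.0, 8.0 / 9.0, 5.0 / 9.0)


def gauss_mean_square(poly):
    """Mean-square value of poly over [-1, 1] as estimated by the 3-point rule."""
    return sum(w * poly(x) ** 2 for x, w in zip(NODES, WEIGHTS)) / 2.0


def monic_cubic(a):
    """Illustrative family x^3 - a*x; a = 3/5 is the monic Legendre cubic."""
    return lambda x: x ** 3 - a * x


def true_mean_square(a):
    """Exact mean square of x^3 - a*x over [-1, 1]: 1/7 - 2a/5 + a^2/3."""
    return 1.0 / 7.0 - 2.0 * a / 5.0 + a * a / 3.0


if __name__ == "__main__":
    for a in (0.0, 0.3, 0.6, 0.9):
        print(a, gauss_mean_square(monic_cubic(a)), true_mean_square(a))
```

The estimate vanishes only at $a = 3/5$, which is also where the true mean square reaches its minimum of $4/175$: the rule misjudges the size of the unseen term but not its optimal balancing.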

See, for example, M. Born, E. Wolf, Principles of Optics, 6th ed. (Pergamon, New York, 1980), Sec. 9.1.3.

This result follows on differentiation of the autocorrelation expression for the MTF. (It is interesting that the same result follows trivially for the so-called geometrical MTF from the standard relations between the moments of a function and the Taylor series of its Fourier transform.) It follows then that a system designed for minimum spot size will generally have better low-spatial-frequency response than a system designed for minimum OPD, which will have superior response to higher spatial frequencies.

Some particular Cartesian configurations can have somewhat better performance than those quoted; however, the numbers of rays specified here are sufficient to guarantee that the error limits will typically not be violated. Although the fractional error of the Cartesian scheme dies, on average, as the inverse of the number of rays to the power 3/4, the performance is highly erratic (as one might expect with patching a square grid to a round hole), and it is difficult to guess the accuracy of a given configuration by relating it to another. For example, halving the side length of the grid (which approximately quadruples the number of rays) typically reduces the accuracy of the result for some configurations. This is discussed in detail in Section 3.

V. K. Viswanathan, I. O. Bohachevsky, T. P. Cotter, “An attempt to develop an ‘intelligent’ lens design program,” in 1985 International Lens Design Conference, W. H. Taylor, ed., Proc. Soc. Photo-Opt. Instrum. Eng. 554, 10–17 (1985).

I. O. Bohachevsky, V. K. Viswanathan, G. Woodfin, “An ‘intelligent’ optical design program,” in Applications of Artificial Intelligence I, J. F. Gilmore, ed., Proc. Soc. Photo-Opt. Instrum. Eng. 485, 104–112 (1984).

P. N. Robb, “Lens design using optical aberration coefficients,” in 1980 International Lens Design Conference, R. E. Fischer, ed., Proc. Soc. Photo-Opt. Instrum. Eng. 237, 109–118 (1980).

The hypergon has a half-field angle of 65° and operates at f/30. The specifications can be found in U.S. patent 706,650 (August 12, 1902).

The Cooke triplet used here has a half-field angle of 20° and operates at f/5.6. The specifications were taken from H. A. Buchdahl, Optical Aberration Coefficients (Dover, New York, 1968), Sec. 37, p. 60.

The double Gauss used here is among the sample lens specifications provided with ACCOS V (the lens-design program available from Scientific Calculations, Fishers, N.Y.). It has an f number of ≃2 and a half-field angle of 15°.

The specifications of the Schmidt camera used here can be found in Table 3 of Ref. 4. It is designed to operate in the UV, has a half-field angle of 5°, and operates at f/1.09.

The microscope objective used here was designed by J. R. Rogers of the Institute of Optics, University of Rochester. The half-cone angle at the object is 50° (numerical aperture ≈0.766), and the magnification is 50×. The specifications are available from him on request.

Note that the form of the weighting functions is dependent on the choice of variables. For example, the weight for color is different if frequency is used as the coordinate in place of wavelength. If a change of variables is needed, a Jacobian must be included to find the new form of the weighting function. So, for example, in changing variables from wavelength to frequency, $L(\lambda)$ is replaced by $N(\nu) = L[\lambda(\nu)]\,d\lambda/d\nu$. It is also significant that, in Eq. (2.2), a mean-square length is averaged so that if the spot size in a region near the center of the field should be weighted $k$ times more heavily than a region near the edge, $F(f)$ should be an extra factor of $k^2$ higher at the center (over and above the Cartesian components of any Jacobian that is picked up by using variables other than the position vector).

It is worth remarking, at this stage, that expressions of the form $s^2 = \mathrm{Avg}_x\{[f(x) - \bar{f}]^2\}$, which appear in Eqs. (2.6) and (2.7), are evaluated more easily if they are reformulated as follows. It is usual to expand the argument of the average operator and to reexpress $s^2$ as $s^2 = \mathrm{Avg}_x\{f^2(x)\} - \bar{f}^2$ in order to allow $\bar{f}$ and $s^2$ to be calculated simultaneously. However, this typically increases the numerical noise owing to cancellation. This is especially the case for the computations indicated in Eqs. (2.6) and (2.7), in which $s^2$ may be 8 or more orders of magnitude smaller than $\bar{f}^2$. A more convenient expression can be obtained by first writing $\{f(x) - \bar{f}\}$ as $\{[f(x) - f(c)] - [\bar{f} - f(c)]\}$ before expanding the argument, and in this way $s^2 = \mathrm{Avg}_x\{[f(x) - f(c)]^2\} - [\bar{f} - f(c)]^2$ is obtained, where $c$ is taken to be some fixed value near the center of the region of interest.
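
The effect of the shift is easy to demonstrate. The sketch below is illustrative only: it assumes equally weighted samples held in double precision, takes the first sample as the reference value $f(c)$, and uses made-up data (a large offset of $10^9$ plus small variations) chosen so that $s^2$ is far smaller than $\bar{f}^2$. The function names are hypothetical.

```python
def naive_variance(samples):
    """s^2 = Avg{f^2} - fbar^2: one pass, but prone to cancellation."""
    n = len(samples)
    mean = sum(samples) / n
    mean_sq = sum(f * f for f in samples) / n
    return mean_sq - mean * mean


def shifted_variance(samples, reference=None):
    """s^2 = Avg{[f - f(c)]^2} - [fbar - f(c)]^2, still a single pass.

    `reference` plays the role of f(c); by assumption it defaults to the
    first sample, which stands in for a value near the center of the region.
    """
    if reference is None:
        reference = samples[0]
    n = len(samples)
    shifted = [f - reference for f in samples]
    mean_shift = sum(shifted) / n
    mean_sq_shift = sum(d * d for d in shifted) / n
    return mean_sq_shift - mean_shift * mean_shift


if __name__ == "__main__":
    data = [1.0e9 + 4.0, 1.0e9 + 7.0, 1.0e9 + 13.0, 1.0e9 + 16.0]
    print("naive  :", naive_variance(data))
    print("shifted:", shifted_variance(data))
```

For these four samples the exact result is 22.5. The shifted form recovers it because the shifted values (0, 3, 9, 12) are small and exactly representable, whereas the naive form subtracts two numbers near $10^{18}$ whose rounding errors exceed the answer itself.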

Sampling on a square grid is used, for example, by CODE V (the lens-design program available from Optical Research Associates, Pasadena, Calif.), and the polar grid can be found in SIGMA (the lens-design program from Kidger Optics, UK). These programs do not necessarily use the same weighting adopted here or try to calculate the same entities; I simply take these sampling schemes as starting points for comparison purposes.

If the points on a square lattice (of side length $\delta = R/n$, where $R$ is the radius of the disk) in one quadrant of the disk are located at $\mathbf{r}_{ij} = [\delta(i - 1/2),\ \delta(j - 1/2)]$ for $i = 1, 2, \ldots, \langle (n^2 - 1/4)^{1/2} + 1/2 \rangle$ and $j = 1, 2, \ldots, \langle [n^2 - (i - 1/2)^2]^{1/2} + 1/2 \rangle$, it can be seen that the percentage error in approximating the integral of a constant over the disk by using a simple sum over these points is given by $E(n) = 100\,[S(n) - \pi R^2]/\pi R^2$, where $S(n) = 4\delta^2 \sum_i \langle [n^2 - (i - 1/2)^2]^{1/2} + 1/2 \rangle$. Here $\langle x \rangle$ denotes the integral part of $x$, and the range for the sum over $i$ is, of course, just that indicated for the placement of points. It is remarked that scaling the overall result by a constant to ensure that all schemes integrate constant functions exactly will have no effect for the work reported in this paper, since all the integrals are for the purposes of averaging and any constant multiplying factor will cancel when the integrals are normalized to obtain the desired average. This particular sampling scheme is identical to that used in CODE V, and some observations are in order. This program has a parameter referred to as DEL, which is just the inverse of $n$. The default value of this parameter is DEL = 0.385, which corresponds to $n = 2.60$, which, from Fig. 3, can be seen to use 12 rays and to have an error in excess of +10% when integrating a constant (if uniform weighting is used). When the routines of Section 3 are used, this scheme is found to overestimate spot sizes consistently by 20–50%. When a value of $n = 2.764$ (corresponding to DEL = 0.362, the location of the zero on the plot in Fig. 3 with the same number of rays) is adopted, the error in the determination of spot size is reduced to ≃10%, an improvement by a factor of 3 to 5. This improvement is appreciated when it is recalled that the error is, on average, dropping as the number of rays to the power −3/4, so the gain realized by this minor change is equivalent to that typically obtained by increasing the number of rays by a factor of 4 to 8.
For the interest of CODE V users, it is noted that the locations of a number of the zeros of the curve in Fig. 3 that seem to give relatively good integration schemes are found to be $n = 2.2568$ (8 rays, 10–30% error), $n = 2.7639$ (12 rays, 4–12% error), $n = 3.7424$ (22 rays, 3–6% error), and $n = 5.9708$ (56 rays, 1–3% error).
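
The error expression above is simple to evaluate directly. The following sketch implements $S(n)$ and $E(n)$ as written, with hypothetical function names; it makes no attempt to reproduce how any particular program traces its rays. The quoted ray counts appear to equal twice the number of points in one quadrant (12 rays for 6 quadrant points at $n = 2.60$), presumably because symmetry allows half of the pupil to be skipped; that reading is an assumption.

```python
import math


def quadrant_columns(n):
    """Points kept in each column i = 1, 2, ... of one quadrant (needs n > 1/2)."""
    i_max = int(math.sqrt(n * n - 0.25) + 0.5)
    return [int(math.sqrt(n * n - (i - 0.5) ** 2) + 0.5)
            for i in range(1, i_max + 1)]


def percent_error(n, radius=1.0):
    """E(n): percentage error of the lattice sum for the area of the disk."""
    delta = radius / n
    s = 4.0 * delta ** 2 * sum(quadrant_columns(n))
    area = math.pi * radius ** 2
    return 100.0 * (s - area) / area


if __name__ == "__main__":
    for n in (2.60, 2.2568, 2.7639, 3.7424, 5.9708):
        rays = 2 * sum(quadrant_columns(n))  # assumed: half-pupil ray count
        print(f"n = {n:<7} DEL = {1.0 / n:.3f}  rays = {rays:3d}  "
              f"E = {percent_error(n):+.3f}%")
```

Because the point count is piecewise constant in $n$, each zero of $E(n)$ on a given plateau lies at $n = (N/\pi)^{1/2}$, where $N$ is the number of lattice points in the full disk; $N = 24$ gives $n \approx 2.7639$, in agreement with the value quoted above.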

The polar scheme would probably benefit from sampling at the endpoints of the radial subdivisions rather than at the midpoints. However, there are significantly better schemes available, so this minor issue is not pursued further.

With $m$ uniformly distributed samples in $[0, 2\pi)$, say, at $\theta_k = 2\pi k/m - \gamma$ for $k = 1, 2, \ldots, m$, it can easily be shown, by using a geometric series, that $\sum_k \cos q\theta_k$ vanishes for $q$ not equal to a multiple of $m$. Now, since $\cos^p\theta$ can be written as a linear combination of $\{\cos q\theta;\ q = p, p - 2, p - 4, \ldots, 0 \text{ or } 1\}$, it follows that uniform weighting with $m$ uniformly distributed points can be used to integrate exactly $\cos^p\theta$ for $p = 0, 1, 2, \ldots, m - 1$. The use of alternating weights can be regarded as a superposition of a scheme with $2n$ points and one with only half that number (skipping every other one) and can be used to integrate exactly $\cos^p\theta$ for $p = 1, 2, \ldots, n - 1$, where uniform weights yield exact results for $p = 1, 2, \ldots, 2n - 1$.
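
This exactness is readily confirmed numerically. The sketch below is illustrative (the function names and the offset $\gamma = 0.3$ are arbitrary choices): it compares the uniformly weighted average of $\cos^p\theta$ over $m$ points with the exact average over a full period, which is zero for odd $p$ and $\binom{p}{p/2}/2^p$ for even $p$.

```python
import math


def ring_average(p, m, gamma=0.3):
    """Uniformly weighted average of cos^p(theta) over m equally spaced angles."""
    thetas = [2.0 * math.pi * k / m - gamma for k in range(1, m + 1)]
    return sum(math.cos(t) ** p for t in thetas) / m


def exact_average(p):
    """Exact average of cos^p(theta) over a full period."""
    if p % 2:
        return 0.0
    return math.comb(p, p // 2) / 2.0 ** p


if __name__ == "__main__":
    m = 6
    for p in range(m + 1):
        err = ring_average(p, m) - exact_average(p)
        print(f"p = {p}: error = {err:+.3e}")
```

With $m = 6$ the errors for $p = 0, \ldots, 5$ are at rounding level, and the first failure occurs at $p = m$, where the $\cos m\theta$ component no longer averages to zero and leaves an error of $2^{1-m}\cos m\gamma$.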

This result also holds for the sample points $\theta_j = j\pi/N_\theta$, $j = 0, 1, \ldots, N_\theta$, which include points on the line of symmetry (i.e., meridional rays), although the same accuracy is now obtained with one extra point on each ring.

For interest, it is remarked that since the Legendre polynomials are simply related to the rotationally symmetric Zernike polynomials, it can be seen that the sample points presented in Table 1 in fact correspond to the radial locations of the zeros of these Zernike polynomials. The values of the weights follow simply from the relation between Gaussian quadrature and orthogonal polynomials, which is presented in H. S. Wilf, Mathematics for the Physical Sciences (Dover, New York, 1962), Sec. 2.9, pp. 61–64. A convenient form of the recurrence relations for generating the orthogonal polynomials from which the parameters for Gaussian quadrature can be found is presented in W. H. Press, B. P. Flannery, S. A. Teukolsky, W. T. Vetterling, Numerical Recipes (Cambridge U. Press, Cambridge, 1986), Sec. 4.5.

For a simple description of the derivation of the Radau integration methods, see, for example, R. W. Hamming, Numerical Methods for Scientists and Engineers, 2nd ed. (McGraw-Hill, New York, 1973), Sec. 19.7, pp. 328–330.

H. A. Buchdahl, Optical Aberration Coefficients (Dover, New York, 1968), Sec. 87, pp. 150–154.