## Abstract

Advances in fabrication and testing are allowing aspheric optics to have greater impact through their increased prevalence and complexity. The most widely used characterization of surface shape is numerically deficient, however. Furthermore, with regard to tolerancing and to constraints for manufacturability, this representation is poorly suited for design purposes. Effective alternatives are therefore presented for working with rotationally symmetric surfaces that are either (i) strongly aspheric or (ii) constrained in terms of the slope in the departure from a best-fit sphere.

©2007 Optical Society of America

## 1. Introduction

Characterizing the desired shape of individual mirrors or lens surfaces is central to activities such as optical design and fabrication. In essence, this process is pure geometry. After agreeing on certain conventions, however, the problem becomes a numerical one that is made ever more challenging by the demanding tolerances of modern precision optics. A variety of different approaches to this task have been adopted within the context of optics, and especially within optical design [1]. Limitations of one of the most widely used options [2] are demonstrated here, and a number of simple improvements are proposed. General geometry and numerics are the key points of interest; details that are specific to only one context, such as design, are suppressed as much as possible. While some comments are offered about how these ideas are expected to be of use for design purposes, the emphasis is on general processes for specifying ideal surface shapes. The results relate to various activities including optical fabrication, testing, and patenting, as well as design.

Since conic sections have special optical properties that are often exploited in system design, it is conventional to express rotationally symmetric surface shapes in terms of their deviation from the sagittal representation —or just the “sag”— of a close-fitting conic, say [2]

where {*z*,ρ,ϕ} are standard cylindrical polar
coordinates. Here, *c* is the paraxial curvature of the surface and
ε the conic parameter. Only a handful of terms are typically retained in
the added polynomial, but it is increasingly common to see this number grow in order
to characterize a desired shape with sufficient accuracy. For the purposes of
fabrication and testing, this characterization of the surface shape is completed by
specification of the aperture size, i.e. a value for
*ρ*
_{max} where Eq. (1) is then valid over
0<*ρ*<*ρ*
_{max}.

There are enough degrees of freedom in Eq. (1) to admit independent control of all even powers up to
*ρ*
^{2M+4} (and,
in fact, via *ε* to
*ρ*
^{2M+6}).
This representation is therefore sufficiently general to approximate any symmetric
shape with arbitrary accuracy provided *M* is allowed to be large
enough. However, the failings of such a representation are well known. For example,
if a particular shape, say *z* =
*f*(*ρ*), is prescribed, a
least-squares fit can be performed to determine the optimal values of
{*a _{m}*;

*m*= 0,1,..

*M*} once a close-fitting conic section has been identified. It can readily be shown, however, that even when working in terms of the normalized variable

*u*≔/

*ρ*

_{max}and using double-precision arithmetic, the associated Gram matrix [defined below at Eq. (7)] is so ill-conditioned that this simplistic process fails when more than about ten terms are used. When a solution is found for modest values of

*M*, the values of {

*a*;

_{m}*m*= 0,1,..

*M*} typically alternate in sign and only yield the desired shape through heavy cancellation between the individual terms. This property is all too familiar to the optical design and fabrication community and makes for numerical inefficiency due to problematic round-off errors. In particular, large numbers of digits must be specified. A sample of these difficulties is displayed in Sec. 4, and one of the goals of this work is to

*push these failings far beyond the ever-growing design space for optics*. The proposed solution is to use non-standard orthogonal bases in place of the simplistic additive polynomial of Eq. (1). A useful property of orthogonal decompositions is that the mean square value of the associated superposition is just the sum of the squares of the coefficients. Section 3 contains a particular basis that is tailored to exploit this property to achieve the second goal for this work:

*facilitating the enforcement of manufacturability constraints during design*.

## 2. Strong aspheres

By choosing an appropriate basis, the ill-conditioning of the Gram matrix discussed in the Introduction can be entirely removed. To see this, rewrite Eq. (1) as

where the departure from a conic, written as *D*
_{con}
(*u*), is defined by

In place of the standard choice of simple monomials, namely
*Q*
^{mon}
_{m}(*x*) = *x _{m}* [which gives Eq. (1)], we can choose {

*Q*

^{mon}

_{m}(

*x*);

*m*= 0,1,…

*M*} to be an orthogonal set. Consider fitting the surface described by

*z*=

*f*(

*ρ*) where

*f*may be specified via a numerical look-up table, or whatever. After choosing a close-fitting conic (which can be refined by using a process that invokes the procedure discussed here) the coefficients in the additive polynomial can be chosen to minimize the rms sag error. That is, if the difference between

*f*(

*ρ*) and the conic component of Eq. (2) is written as

*g*(

*ρ*), the minimization of

becomes a standard least-squares process. The angle brackets in Eq. (4) denote a weighted average that may be defined by a sum or an
integral of whatever form we please. In this work, for any function
*h*, it is convenient at first to take

Choosing the weight function to be constant, i.e.
*W*(*u*
^{2}) = 1, means that equal areas
in the aperture carry equal weight, and this is as good as any for the moment.

Setting the gradient of *E*
^{2} to zero leads to the
expression for the optimal coefficients:

where *b _{m}*
≔〈

*g*(

*u*

*ρ*

_{max})

*u*

^{4}

*Q*

^{con}

_{m}(

*u*

^{2})〉 and the elements of the Gram matrix are given by

It follows that this matrix is diagonal if the basis is chosen to be a particular case of the Jacobi polynomials [3]. In particular,

For reference, the first six members can be written as

$${Q}_{3}^{\mathrm{con}}\left(x\right)=-\left\{35-12x\left[14\phantom{\rule{.2em}{0ex}}-x\left(21-10x\right)\right]\right\},\phantom{\rule{12.0em}{0ex}}$$

$${Q}_{4}^{\mathrm{con}}\left(x\right)=70\phantom{\rule{.2em}{0ex}}-3x\left\{168\phantom{\rule{.2em}{0ex}}-5x\left[84\phantom{\rule{.2em}{0ex}}-11x\left(8\phantom{\rule{.2em}{0ex}}-3x\right)\right]\right\},\phantom{\rule{8.5em}{0ex}}$$

$${Q}_{5}^{\mathrm{con}}\left(x\right)=-\left[126\phantom{\rule{.2em}{0ex}}-x\left(1260\phantom{\rule{.2em}{0ex}}-11x\left\{420\phantom{\rule{.2em}{0ex}}-x\left[720\phantom{\rule{.2em}{0ex}}-13x\left(45\phantom{\rule{.2em}{0ex}}-14x\right)\right]\right\}\right)\right].\phantom{\rule{1.5em}{0ex}}$$

As can be seen in Fig. 1, these basis elements are scaled to take a maximum value of unity. Even when the mean square error of Eq. (4) is computed as a finite sum, the basis of Eq. (8) is still a good choice. Although the Gram matrix may then not be precisely diagonal, it is nearly so and remains well conditioned. Some benefits of this basis are outlined in Sec. 4.

## 3. Mild aspheres

Due to considerations related to both fabrication and testing, aspheric surfaces are generally more cost effective if their deviation from a sphere is constrained. In terms of metrology, it is appropriate to limit the transverse slope of the deviation between the surface and its best-fit sphere when measured along the surface’s normal. Because this slope translates directly into fringe density, a full-aperture interferometric test can be performed provided this slope is constrained so that the interferogram remains sufficiently below the Nyquist sampling limit. As a bonus, such a constraint also reduces both the magnitude of the aspheric departure and the rate of change of the local principal radii of curvature over the aperture. These reductions directly enhance manufacturability. Because, as pointed out in the Introduction, orthogonal bases can offer immediate access to mean square values, it can be effective to work in terms of a representation that is coupled to the mean square slope of a surface’s aspheric departure. As described in the following section, the capability derived in the next few paragraphs can also be used as part of an effective scheme to limit the maximum absolute slope of the departure.

One simple definition of the best-fitting sphere for a rotationally symmetric surface
is to choose the sphere that is coincident with the surface at its axial point and
around its perimeter. Various minor modifications from this best fit are possible,
e.g. to minimize the maximum absolute deviation, but the distinctions are relatively
minor for mild aspheres. If this simple option is accepted and the sag at the edge
of the clear aperture is written as
*f*(*ρ*
_{max}), it follows that
the curvature of the best-fit sphere, say *c*
_{bfs}, is given
by *c*
_{bfs} =
2*f*(*ρ*
_{max})/[*ρ*
^{2}
_{max}
+
*f*(*ρ*
_{max})^{2}]. In
place of Eq. (1), it is now more effective to express the sag as

where the sag-based departure from the best-fit sphere, written as
*D*
_{bfs}(*u*), is defined by

So that the coefficient values have no impact on the best-fitting sphere,
*D*
_{bfs}(*u*) is explicitly forced to
vanish at the aperture’s centre and edge, i.e. at *u* = 0
and *u* = 1. Also, so that the axial curvature may be different
from*c*
_{bfs}, the pre-factor in Eq. (11) reduces to *u*
_{2} for
*u*≪1 [unlike the form used in Eq. (3) where the pre-factor is *u*
^{4}].

To first order, *D*
_{bfs}(*u*) can be converted
to a deviation measured along the surface normal by multiplying it by the cosine of
the angle between the optical axis and the local normal to the best-fit sphere. This
cosine factor is precisely the square root that appears in the denominator of Eq. (11), and this is the justification for the square
root’s presence. It is now straightforward to choose
*Q*
^{bfs}
_{m} (*x*) to be polynomials of order *m* and to
configure them so that the weighted rms slope of the departure along the normal is
just the sum of the squares of *a _{m}*. To achieve this, the
elements of the normal-departure slope are written as

and then *Q*
^{bfs}
_{m} (*x*) is chosen to make *S _{m}*
(

*u*) orthogonal. In particular,

where the angle brackets were defined at Eq. (5) and *δ _{mn}* is the
Kronecker delta. By starting with

*Q*

^{bfs}

_{0}(

*x*) as a constant that is normalized in accordance with Eq. (13), higher-order members can be determined progressively by requiring that each new polynomial be orthogonal to all the lower members and normalized in keeping with Eq. (13).

To constrain the tendency of orthogonal polynomials to shoot off at the edge of their
domain, it is sometimes convenient to choose a weight function that emphasizes the
behaviour at the ends. If we choose *W*(*x*) =
[*x*(1-*x*)]^{-1/2} in Eq. (5), the first six basis elements are found to be

$${Q}_{3}^{\mathrm{bfs}}\left(x\right)=\sqrt{\frac{2}{2545}}\left\{207\phantom{\rule{.2em}{0ex}}-4x\left[315\phantom{\rule{.2em}{0ex}}-x\left(577\phantom{\rule{.2em}{0ex}}-320x\right)\right]\right\},\phantom{\rule{12.5em}{0ex}}$$

$${Q}_{4}^{\mathrm{bfs}}\left(x\right)=\frac{1}{\sqrt[3]{131831}}\left(7737\phantom{\rule{.2em}{0ex}}-16x\left\{4653\phantom{\rule{.2em}{0ex}}-2x\left[7381\phantom{\rule{.2em}{0ex}}-8x\left(1168\phantom{\rule{.2em}{0ex}}-509x\right)\right]\right\}\right),\phantom{\rule{5.0em}{0ex}}$$

$$\phantom{\rule{1.5em}{0ex}}{Q}_{5}^{\mathrm{bfs}}\left(x\right)=\frac{1}{\sqrt[3]{6632213}}\left[66657\phantom{\rule{.2em}{0ex}}-32x\left(28338\phantom{\rule{.2em}{0ex}}-x\left\{135325-8x\left[35884\phantom{\rule{.2em}{0ex}}-x\left(34661-12432x\right)\right]\right\}\right)\right].$$

These are plotted in Fig. 2. Notice that, as shown in Fig. 3, on account of the particular nonuniform weight distribution, the associated slope functions remain aesthetically bounded even at the part's centre and edge. The key property of this representation is that the mean square slope of the normal departure from a best-fit sphere is now evidently given by

It is this result that makes this basis well suited to certain design tasks.

## 4. Simple applications

A specific example, say a Cartesian oval, can provide elementary numerical checks for
implementations of the basis proposed in Sec. 2. Although such an example may be
entirely academic, it can also help to clarify some of the benefits associated with
this basis. The single refracting surface that perfectly images a source point to
its image point can be found by requiring that the summed optical distance along
line segments that join the two points via any point on the surface be a constant.
An example is shown in Fig. 4 with *n* = 1.5 and
*d*
_{1}, = *d*
_{2} = 10mm, and
2*ρ*
_{max} = 7mm, where *n* is
the refractive index and *d*
_{1}, and
*d*
_{2} are the distances between the axial point on the
surface and the object and image points, respectively. Finding the explicit sag of
the surface involves solving a quartic equation. Thankfully, standard mathematical
software packages give direct access to the solution for the purposes of the fitting
described here. The axial curvature is readily found to be *c* =
-(*n*/*d*
_{1}+
1/*d*
_{2})/(*n* -1) = -0.5mm^{-1}.
For the current purposes, it is not important whether the conic parameter, i.e.
*ε* of Eq. (2), is chosen to match the fourth-order expansion of the sag,
or to give a fit through the surface at the edge of the aperture. For the latter, in
the notation used before Eq. (4), we have *ε*=
[2*f*(*rho;*
_{max})-*cρ*
^{2}
_{max}]/[*cf*(*ρ*
_{max})^{2}],
which gives *ε*≈0.156284. The residual
60μm of deviation from this fitted conic is plotted in Fig. 5(a).

For the case *M* = 7, the fit error associated with either of Eqs. (1) or (2) is plotted in Fig. 5(b). The coefficients for each case are presented in
the first and last columns of Table 1. In principle, these two polynomials are identical.
When the traditional monomial basis is adopted, however, the coefficient values are
highly dependent upon the number of terms used in the fit. Further, because of the
ill-conditioned Gram matrix, the accuracy of the fits based on monomials rapidly
bottoms out as *M* is increased. More problematically, the non-empty
null space for the Gram matrix means that the coefficients are truly ill-defined:
the associated values in the Table are good to only a couple of significant digits
because a coordinated change can leave the overall fit effectively untouched. This
is clarified in the second column of Table 1 where some sample changes in the coefficients for the
monomial basis are given. Although individually exceeding hundreds of thousands of
nanometers, these changes combine to give no more than just one nanometer of change
in the net sag. [These changes are precisely
–*Q*
^{con}
_{7}(*x*)
and, as suggested by Fig. 1, *x*
^{2}
*Q*
^{con}
_{7}(*x*) has a peak value of
unity.] *Setting tolerances directly in such a representation is evidently an
ugly prospect*. The first column exhibits the coefficients’
characteristic alternation in signs and growing magnitude. In fact, the largest
coefficient value grows with increasing *M* so, even if it were
possible to solve for a larger number of coefficients, evaluation of the polynomial
would involve increasing numerical round-off problems due to cancellation. Notice,
for example, in Table 1 that *a*
_{5} is on the scale
of millimeters for the traditional basis even though we are fitting just
60μm of aspheric departure. The traditional representation is evidently
increasingly inefficient in terms of the number of significant digits required in
specifying the coefficient values.

For the orthogonal basis, the Gram matrix is diagonal, and the coefficient values are
thus independent of *M*. The decomposition therefore resembles a
familiar spectrum. *In contrast to the monomial representation, all of the
digits shown in the table for this case are significant, and there are clearly
fewer of them*. Notice that these coefficient values decay exponentially
with *m*: in this case, successive terms fall by a factor of three or
four. (This trend continues out to any number of terms, and the fit is accurate to
full machine precision — 15 digits— by *M* =
25. The key point in this is the striking robustness.) Notice that, because of the
decaying coefficient values, fit error plots such as Fig. 5(b) generally resemble the first of the truncated basis
elements. If we truncate at *M* = 4, for example, it is clear that
the fit error will resemble the dark blue curve of Fig. 1 and have a peak value of roughly 200nm (see
*a*
_{5} of Table 1).

Because *G _{mn}* is diagonal, it follows from Eqs. (3) and (7) that the mean square departure between the reference conic
and the asphere is just a weighted sum of the squares of the coefficients in the
decomposition. While this is a relatively incidental result for this basis, the
corresponding property given in Eq. (15) for the basis of Sec. 3 is its defining feature. This
property can be exploited in a number of ways within design tools. When aspherizing
an all-spherical imaging system, for example, it is straightforward to linearly
approximate the impact on the wavefront error from the aspheric terms on each
surface. The result involves simple ray inclination factors. Even when averaged over
the field, the mean square wavefront error is then a quadratic in the coefficients.
Since the mean square slope of the normal deviation from the spherical surfaces
follows from the trivial quadratic of Eq. (15), the optimal wavefront error for any prescribed mean square
slope can be found by using Lagrange multipliers and solving only linear equations.
Various simple strategies can now be used to identify optimal solutions that use
only aspheres that can be measured interferometrically. An effective step in this is
to constrain the rms slope to be below some fraction, say

*γ*where 0 <

*γ*<1, of Nyquist. When testing in reflection, the Nyquist slope corresponds to a change in the normal aspheric departure between adjacent pixels of

*λ*/4. When the aperture is imaged onto an

*N*×

*N*detector, the constraint on the rms slope follows from Eq. (15):

This type of result underlies the envisioned applications of the basis introduced in Sec. 3. Notice that, if desired, the maximum absolute slope can be monitored when solving for the appropriate value of the Lagrange multiplier.

## 5. Concluding remarks

The traditional parametrization of optical surface shapes in terms of monomials is
famously ill-conditioned. One symptom of this weakness is the wildly varying and
precariously balanced values among the associated degrees of freedom. All the
problematic features disappear when the basis is orthogonalized, however. As a
result, significantly fewer digits are required to characterize a surface to any
given accuracy. What is more, the high-order coefficients then independently control
just the high-order shape. This property can facilitate the tolerancing of an
aspheric surface. In the context of optical design, the coefficients associated with
orthogonalized representations also promise to give a more natural coordinate space
for optimization of the associated merit function. For example, including extra
terms in the asphere delivers the benefits of higher orders with minimal change to
the existing coefficients. It is worth noting, however, that in general design
processes, the value of *ρ*
_{max} for each surface
must be updated whenever the system modifications that are part of optimization
cause the effective (illuminated) aperture to drift significantly from
*ρ*
_{max}.

The polynomials introduced in Sec. 2, namely *Q*
^{con}
_{m} (*x*), are intended for use in a range of contexts including
design, fabrication, testing, and even patenting of a variety of aspheres. Because
of their generality, numerical details were provided in Sec. 4 for code
verification. Efficient algorithms can be implemented to provide a range of related
services for ray-tracing and design. For example, as discussed in other contexts [4,5,6,7], it is helpful to be able to provide access to first and
second derivatives as well as interconvert bases and change the aperture size for
the polynomial decompositions. It is possible to achieve this while working to
arbitrary orders with at most comparable operation counts to the code for the
monomial basis. The explicit expressions for the polynomials given in Eq. (9) are adequate for exploratory implementations, however.

The introduction of the polynomials described in Sec. 3, namely
*Q*
^{bfs}
_{m} (*x*), was motivated by the fact that aspheres can be
significantly more cost effective if their departure from a best-fit sphere is
sufficiently constrained. In particular, it is then possible to avoid the need for
dedicated, expensive null optics as part of their testing. Further, when a
part’s local principal curvatures vary slowly enough, the fabrication
difficulties associated with non-uniform tool-fit are greatly relieved. While
similarly avoiding the weaknesses of working with monomials,
*Q*
^{bfs}
_{m} (*x*) retains most of the advantages of
*Q*
^{con}
_{m} (*x*) while also facilitating the design of milder aspheres.
Notice that the simple design procedures discussed in Sec. 4 do not require access
to derivatives, and they change neither aperture sizes nor best-fit curvatures. In
this sense, a simple application of *Q*
^{bfs}
_{m} (*x*) requires minimal dedicated effort other than
implementing the basis elements themselves, see Eq. (14).

## References and links

**1. ** See, for example, the discussion and references in H. Chase, “Optical design with rotationally
symmetric NURBS”, SPIE Proceedings **4832**, 10–24
(2002) and A. W. Greynolds, “Superconic and subconic surface
descriptions in optical design,” Proc.
SPIE **4832**, 1–9
(2002). Such matters are also treated within the manuals for
commercial optical design software. [CrossRef]

**2. **G. H. Spencer and M. V. R .K. Murty, “General ray-tracing
procedure,” J. Opt. Soc. Am. **52**, 672–678
(1962), see Eq. (16). [CrossRef]

**3. **M. Abramowitz and I. Stegun, *Handbook of Mathematical Functions*
(Dover, 1978), see
22.2.1.

**4. **E. H. Doha, “On the coefficients of
differentiated expansions and derivatives of Jacobi
polynomials,” J. Phys. A: Math. Gen. **35**, 3467–3478
(2002). [CrossRef]

**5. **E. W. Weisstein, “Jacobi
Polynomial” from *MathWorld—A
Wolfram Web Resource*.
http://mathworld.wolfram.com/JacobiPolynomial.html, see esp. Eqs. (10–14).

**6. **B. Y. Ting and Y. L. Luke, “Conversion of polynomials between
different polynomial bases,” IMA J.
Numer. Anal. , **1**,
229–234
(1981). [CrossRef]

**7. **A. J. E. M. Janssen and P. Dirksen, “Concise formula for the Zernike
coefficients of scaled pupils”, J.
Microlithogr., Microfabr., Microsyst. **5**, 1–3
(2006). (Note that the Zernikes are also Jacobi
polynomials.) [CrossRef]