## Abstract

A collimation method of misaligned optical systems is proposed. The method is based on selectively nullifying main alignment-driven aberration components. This selective compensation is achieved by the optimal adjustment of chosen alignment parameters. It is shown that this optimal adjustment can be obtained by solving a linear matrix equation of the low-order alignment-driven terms of primary field aberrations. A significant result from the adjustment is to place the centers of the primary field aberrations, initially scattered over the field due to misalignment, to a desired common field location. This *aberration concentering* naturally results in recovery of image quality across the field of view. Error analyses and robustness tests show the method’s feasibility in efficient removal of alignment-driven aberrations in the face of measurement and model uncertainties. The extension of the method to the collimation of a misaligned system with higher-order alignment-driven aberrations is also shown.

© 2010 OSA

## 1. Introduction

Misalignment changes the aberration field of an optical system in a systematic manner that can be well approximated, in many cases, by simple linear and/or quadratic functions of field and alignment parameters. This makes the distributions of aberrations over the system field extremely useful not only in misalignment diagnostics, but also in determining optimal adjustments for efficiently improving the collimation quality of the system. Since this paper focuses on the latter aspect, it is important to define what we mean by *optimal correction*, as it determines how the corrections are computed as well as the strategy for applying them to the system.

Ideally, the optimal correction is something that most efficiently removes misalignment from the components of a system. In order to do that, one needs to be able to measure (or estimate) the alignment state of the optical components with an accuracy at least comparable to the alignment tolerance. Several alignment methods have taken this approach and *reverse optimization* is quite often at the core of them [1–4]. The principle is to search for alignment states of individual components, with which the system reproduces the observed wavefront. However, this approach often utilizes non-linear optimization procedures and, as a result, produces a stagnated estimate with significant difference from the true alignment state. One remedy to this is to use wavefronts sampled at multiple fields, but these multi-field samples can be degenerate among themselves and thus the stagnation problem can still persist, especially in multi-element systems [5]. Although there is a feasible alternative to avoid this issue [5–7], the special measurement scheme proposed in the references could be hard to implement in systems with only one or two (partially) adjustable elements as one often faces in reality.

In such constrained systems, it is better to adjust alignment-sensitive components to deliberately introduce additional variations to the aberration field and thus to compensate for the alignment-driven aberrations. This is analogous to the compensator concept in optical tolerancing. If such a correction brings a system to the state free from alignment-driven aberrations, the system can be declared to be *in alignment*, no matter what the actual alignment state is, and this correction can be called *optimal*. Methods that aim for this definition of optimal correction often use the singular value decomposition of the alignment influence matrix [8, 9]. The corrections from these tend to be rather complex combinations of adjustments of many of the alignment parameters (often including alignment-insensitive ones). This would be less suitable for alignment correction of constrained systems [10].

In this paper, we propose an alternative collimation method of misaligned optical systems. The method is based on adjusting chosen alignment parameters to selectively nullify main alignment-driven aberration components of a system. With the main focus on surface decenter and tilt misalignment, the method uses three inter-related critical insights into the influence of these misalignments on primary field aberrations: (i) the linear nature of alignment-driven aberrations typically observed in many systems; (ii) the fact that the centers of field aberrations of misaligned systems are displaced by different amounts over the system field of view; (iii) the fact that each alignment-driven aberration can be split into two components that differently behave in two orthogonal field axes.

The main outcome of the presented analyses is threefold. (i) There is a linear matrix relation between alignment parameters and alignment-driven aberration terms. The solution of this equation corresponds to the optimal adjustment of a chosen set of alignment parameters. (ii) The optimal adjustment is physically equivalent to placing the centers of primary field aberrations at a desired common field location simultaneously (called *aberration concentering* hereafter). This restores the field distribution of aberrations to the nominal and improves the image quality at the same time. (iii) In most low-order aberration dominant systems, only three alignment-driven terms need to be removed. Thus at most three alignment parameters per axis are required. This can still be true for systems with higher-order alignment-driven aberrations, although the aberration concentering may not be achieved. However, adding one more alignment parameter per axis to also remove the fourth term from higher-order aberrations is shown to be effective for further improvement in collimation and aberration concentering quality. In Section 2, details of this approach are described together with error analyses. We present the results of case studies and robustness tests in Section 3. The results demonstrate the method’s feasibility in efficient removal of alignment-driven aberrations in the face of measurement and model uncertainties. We conclude with a discussion on how this approach can be useful in collimation of wide-field large aperture multi-surface systems with higher-order field aberrations (Sec. 4).

## 2. Theory

#### 2.1. Alignment-driven aberrations

The aberration fields of many optical systems are dominated by low-order primary aberrations. Some of these, namely coma, astigmatism, and curvature, are sensitive to misalignment of individual optical components and thus easily detectable when they exist. Removing the alignment-driven terms of these aberrations effectively improves the quality of collimation and restores the system performance close to the nominal regardless of its actual alignment state. Let *Coma*_{x}, *Coma*_{y}, *Astg*_{1}, *Astg*_{2}, and *Curv* be the coefficients of *Z*_{8}, *Z*_{7}, *Z*_{6}, *Z*_{5}, and *Z*_{4}, respectively, where *Z*_{i} is the *i*-th standard Zernike polynomial [11]. These coefficients can be expressed as a function of decenter (*x, y*) and tilt (*θ*, *ϕ*) parameters [12–17]. For a single surface case, these are given as,

where *O*^{(n)} includes terms of order higher than *n* − 1 in field and/or alignment parameters and *F*_{0} corresponds to the defocus term at the center of the field. In low-order aberration dominant systems, *O*^{(n)} is negligible, but in some wide-field systems these terms may need to be accounted for, as discussed later (Sec. 4).

The terms in brackets in Eq. (1) are due to misalignment and need to be suppressed during the course of collimation. We call those in round brackets *linear terms* and those in square brackets *quadratic terms* hereafter. The *linear terms* in *Coma* are field constant and thus mainly control the overall magnitude of wavefront error across the system field of view. The *linear terms* in *Curv* and *Astg* effectively determine the overall slope of the aberration field and, when present, produce the so-called focus gradient across the image field. The *quadratic terms* in *Curv* and *Astg* are similar to the *linear terms* in *Coma*, but usually less significant as *A*_{00}, *B*_{00}, and *F*_{00} are much smaller than *C*_{0}. These *linear terms* commonly result in displacement of the centers of the individual aberrations away from the nominal by different amounts. This induces large, non-intrinsic, asymmetric image quality variations across the field of a system. Thus, the *linear terms* are those to be removed in the collimation process. It is outside the scope of this paper to give explicit expressions for these coefficients, but one can obtain them by, for example, a ray-tracing-based numerical sensitivity analysis [16] or rigorous analytic derivations [17].

#### 2.2. Aberration concentering and optimal collimation by alignment correction

Before beginning the collimation process, the amounts of the *linear terms* of a misaligned system need to be quantified first. This can be done in three steps: (i) measuring wavefront data at a set of field positions, (ii) determining the aberration coefficients by decomposing the wavefront data into aberration functions (such as Zernike polynomials), and (iii) fitting a linear or quadratic function to the distributions of the aberration coefficients across the field.

In step (i), one may attempt to measure the wavefront at a two-dimensional grid of discrete field positions across the field. However, scanning the wavefront along only two field axes (i.e. the *H*_{x}- and *H*_{y}-axis) can provide as much information for quantifying the *linear terms* as the grid sampling can. This is due to the fact that the *linear terms* of each aberration only respond to certain parameters associated with one of the two field axes, as easily noticed in Eq. (1), and thus can be split into two groups. For example, all terms of *Coma*_{x} and those in the first line of *Astg*_{1} only respond to *H*_{x}, *x*, or *ϕ*, all of which are associated with the *H*_{x}-axis. Taking this aberration scanning approach, one can obtain a linear curve for *Coma* and *Astg*_{2} and a quadratic curve for *Astg*_{1} and *Curv* along each field axis after performing steps (i) and (ii). By fitting a linear or quadratic function to these scans, as one would do in step (iii), linear fitting coefficients can be obtained. These correspond to the amounts of the *linear terms* of the aberrations. Note that the aberration scanning naturally requires fewer wavefront measurements than the grid sampling does, and this can simplify and speed up the measurement process.
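Steps (i) to (iii) can be illustrated with a short numerical sketch. It is not the procedure used to produce the results in this paper: the nine scan positions, the synthetic coefficient values, and the helper name `fit_scan` are assumptions made for the example, and NumPy's `polyfit` stands in for whichever fitting routine is actually used.

```python
import numpy as np

def fit_scan(h, coeff, order):
    """Fit a polynomial of the given order to one aberration scan.

    Returns the coefficients in ascending powers of field: [H^0, H^1, ...].
    """
    return np.polyfit(h, coeff, order)[::-1]

# Hypothetical scan: 9 normalized field points along the H_x axis.
h_x = np.linspace(-1.0, 1.0, 9)

# Synthetic Zernike coefficients (waves); the values are made up.
coma_x = 0.30 + 1.20 * h_x                   # linear in field
astg_1 = 0.02 - 0.25 * h_x + 0.80 * h_x**2   # quadratic in field

coma_fit = fit_scan(h_x, coma_x, order=1)
astg_fit = fit_scan(h_x, astg_1, order=2)

# Alignment-driven linear terms: the field-constant part of Coma and
# the field-linear part of Astg_1.
X1 = coma_fit[0]
X2 = astg_fit[1]
print(f"X1 (Coma, H^0 term)   = {X1:+.3f} wv")
print(f"X2 (Astg_1, H^1 term) = {X2:+.3f} wv")
```

With real data each scan point would carry measurement noise, so the fitted values would only approximate the underlying terms.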

Let *X⃗* and *Y⃗* be vectors containing the measured values of the *linear terms* of *Coma*, *Astg*_{1}, *Curv*, and *Astg*_{2} in the *H*_{x}- and *H*_{y}-axis, respectively. The field coordinates of the centers of *Coma*, *Astg*_{1}, and *Curv*, for example, are given by the following.

It should be noted that *Astg*_{2} is different from the other aberrations. All terms of *Astg*_{2} are simultaneously related to the *H*_{x}- and *H*_{y}-axis [Eq. (1)]. Thus, the field center of *Astg*_{2} given by the aberration scanning along the *H*_{x}- and *H*_{y}-axis is not unique. For example, if it is scanned along the *H*_{x}-axis while *H*_{y} = 0, one obtains a different *H*_{x} coordinate of the field center of *Astg*_{2} from what would be obtained by the same scan but with *H*_{y} = 0.5. The true field center of *Astg*_{2} can only be found when scanning along two orthogonal axes given by rotation of the *H*_{x}- and *H*_{y}-axis by 45 degrees about the nominal field center. In fact, these rotated axes coincide with the axes of symmetry of *Astg*_{2}. Nevertheless, this does not diminish the usefulness of the *linear terms* of *Astg*_{2} in finding the optimal alignment corrections because these can still be determined from scanning *Astg*_{2} along the *H*_{x}- and *H*_{y}-axis.

Equation (2) indicates that reducing *X⃗* and *Y⃗* is equivalent to *concentering* the three aberrations at the nominal field center. At the same time, it removes the *linear terms* from the system aberration field. As a result, the aberrations recover their intrinsic distribution patterns across the system field. In order to do this, one needs to apply appropriate alignment corrections (Δ*x*,Δ*y*,Δ*θ*,Δ*ϕ*) to the alignment parameters (*x,y,θ,ϕ*) of the system. These corrections can be obtained by solving a set of linear equations given by the measured *X⃗* and *Y⃗* and the expressions of the *linear terms* in Eq. (1). The equations are given in Eq. (3).

Here, it should be noted that one set of corrections given by solving one of the above equations is not necessarily similar (if not identical) to the corrections given by the other equations. This can give rise to, for instance, small coma at the center field, but large astigmatism and/or skewed curvature across the field, leading to focus gradient. This occurs quite often, especially when a misaligned system is tested on-axis without verifying the off-axis wavefront (as usually done in practice), because (*X*_{2},*Y*_{2}), (*X*_{3},*Y*_{3}), and (*X*_{4},*Y*_{4}) cannot be sensed by on-axis wavefront measurement alone. This illustrates the importance of verifying imaging performance across the field.

To remove the *linear terms* together, one needs to find a solution to some of the above equations for a given set of correction parameters. For example, in a single surface misalignment case, one can solve the following.

In this particular case, Δ*x* = −*x* and Δ*ϕ* = −*ϕ* remove the *linear* and *quadratic terms* altogether.
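As a numerical illustration of this statement, the sketch below solves a two-equation system for a single misaligned surface. The sensitivity values, the misalignment, and the sign convention (measured linear terms equal the influence matrix times the alignment state, and the correction nulls them) are assumptions made for the example rather than values taken from the paper.

```python
import numpy as np

# Hypothetical sensitivities of two linear terms to decenter x (mm)
# and tilt phi (deg) of a single surface.
m = np.array([[ 1.8, -6.5],    # Coma linear term:   wv/mm, wv/deg
              [-0.4,  2.1]])   # Astg_1 linear term: wv/mm, wv/deg

state = np.array([0.125, -0.045])   # assumed true misalignment (x, phi)
X = m @ state                       # linear terms a field scan would yield

# Corrections that null the measured terms: m @ (state + delta) = 0.
delta = np.linalg.solve(m, -X)
print("delta x   =", delta[0])
print("delta phi =", delta[1])
```

Because the model is exact and only one surface is misaligned, the solved corrections equal the negative of the assumed misalignment.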

#### 2.3. Description of residual alignment-driven aberrations after alignment correction

Let us assume that we have a system with *N* misaligned surfaces and we desire to *concenter* *Coma*, *Astg*_{1}, and *Curv* at the nominal center field of the system. Adopting a generic notation of *x*_{i} and *y*_{i} for the alignment parameters, we can write

Here **C**_{x}, **A**_{x}, and **F**_{x} are *N* × 1 vectors of ${C}_{{x}_{i}}$, ${A}_{{H}_{x}{x}_{i}}$, and ${F}_{{H}_{x}{x}_{i}}$ along the *H*_{x}-axis, respectively, and likewise along the *H*_{y}-axis. **M**_{x} and **M**_{y} are 3 × *N* matrices. *x⃗* and *y⃗* are 2*N* × 1 vectors. Note that each surface has two alignment parameters per field axis. **A**^{T} means the transpose of **A**.

As only three aberrations are to be concentered, correcting three alignment parameters in each axis (6 in total) is sufficient to remove the *linear terms*. However, this may not be sufficient to eliminate the *quadratic terms*. Let the correction parameters be Δ*x*_{k} and Δ*y*_{k} with *k* = 1, 2, 3. As only three parameters are to be adjusted, the influence matrices must be subsets of **M**_{x} and **M**_{y}. Letting **m**_{x} and **m**_{y} be the 3 × 3 subset matrices, Δ*x⃗* and Δ*y⃗* can be expressed in terms of *x⃗* and *y⃗* as,

where **m**_{x}^{−1} is the inverse of **m**_{x}. Although three alignment parameters are sufficient, one may wish to use more alignment parameters in this process for some reason. In that case, **m**_{x} and **m**_{y} are no longer square matrices and their inverses in Eq. (6) can be replaced by pseudo-inverses via singular-value-decomposition (SVD).
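The two cases, a square subset matrix and a non-square one handled with an SVD-based pseudo-inverse, can be sketched as follows. The 3 × 6 influence matrix, the choice of adjustable columns, the measured terms, and the sign convention (the correction nulls the measured linear terms) are all hypothetical and serve only to illustrate the linear algebra.

```python
import numpy as np

# Hypothetical influence matrix for one field axis.
# Rows: Coma, Astg_1, Curv linear terms; columns: six alignment parameters.
M_x = np.array([[ 1.8, -6.5,  0.9, -3.0,  0.2,  0.5],
                [-0.4,  2.1,  0.3,  1.2, -0.1,  0.7],
                [ 0.6, -1.0, -0.5,  0.8,  0.4, -0.3]])
X = np.array([0.30, -0.25, 0.10])    # made-up measured linear terms (wv)

# Case 1: three adjustable parameters, square subset, ordinary solve.
m3 = M_x[:, [0, 1, 4]]
dx3 = np.linalg.solve(m3, -X)

# Case 2: four adjustable parameters, 3 x 4 subset, pseudo-inverse via SVD.
m4 = M_x[:, [0, 1, 4, 5]]
dx4 = -np.linalg.pinv(m4) @ X        # minimum-norm solution

print("residual, 3 parameters:", np.abs(m3 @ dx3 + X).max())
print("residual, 4 parameters:", np.abs(m4 @ dx4 + X).max())
print("correction norms:", np.linalg.norm(dx3), np.linalg.norm(dx4))
```

Both solutions null the three linear terms. The pseudo-inverse picks, among all four-parameter solutions, the one with the smallest norm, which is one reason for using more parameters than strictly needed.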

Upon applying these corrections, the corrected alignment states are given, with an *N* × *N* unit matrix (**1**) and a (*N* − 3) × *N* zero matrix (**0**) as,

While the *linear terms* vanish, the *quadratic terms* may still have residuals. The residuals can be expressed in terms of *x⃗* and *y⃗*, using Eq. (6) and (7), at the common center (i.e. *H*_{x} = *H*_{y} = 0) as,

$${\mathrm{Astg}}_{1}={\overrightarrow{x}}^{T}\left({\hat{\mathbf{M}}}_{x}^{T}{\mathbf{A}}_{xx}{\hat{\mathbf{M}}}_{x}\right)\overrightarrow{x}+{\overrightarrow{y}}^{T}\left({\hat{\mathbf{M}}}_{y}^{T}{\mathbf{A}}_{yy}{\hat{\mathbf{M}}}_{y}\right)\overrightarrow{y}={\overrightarrow{x}}^{T}{\hat{\mathbf{A}}}_{xx}\overrightarrow{x}+{\overrightarrow{y}}^{T}{\hat{\mathbf{A}}}_{yy}\overrightarrow{y}$$

$${\mathrm{Astg}}_{2}={\overrightarrow{x}}^{T}\left({\hat{\mathbf{M}}}_{x}^{T}{\mathbf{B}}_{xy}{\hat{\mathbf{M}}}_{y}\right)\overrightarrow{y}={\overrightarrow{x}}^{T}{\hat{\mathbf{B}}}_{xy}\overrightarrow{y}$$

where **F**_{00}, **A**_{00}, and **B**_{00} are the coefficient matrices of the *quadratic terms*. Assuming that *x⃗* and *y⃗* are random independent variables following Gaussian distributions with zero mean and standard deviations of *σ*_{x} and *σ*_{y}, the probability distributions of the residual aberrations can be computed. If these distributions happened to be Gaussian, the statistics in Eq. (9) can be used in finding the optimal values of *σ*_{x} and *σ*_{y} [19].

where *i* ≠ 1,2,3, *E*[*A*] is the mean value of *A*, and *Var*[*A*] is the variance of *A*. The condition of, for example, minimum curvature can be

where *Curv*_{req} is the allocation to *Curv* from the total rms wavefront error budget of a system. Similar conditions can be posed to the other aberrations and one then needs to find the optimum values of *σ*_{x} and *σ*_{y} that meet these conditions. It should be noted, however, that the distributions are different from one case to another and may significantly deviate from a Gaussian. In such cases, one needs to find the optimal *σ*_{x} and *σ*_{y} using more sophisticated optimization procedures.

#### 2.4. Error analysis

Two error sources can play a critical role when the above alignment correction method is applied in practice. One is the uncertainty in the measurement and the other is the error in the influence matrix (i.e. model error). The measurement error originates from the aberration coefficient measurement and propagates through the curve fit procedures. Omitting axis notations for convenience, let *w*_{i} and *δw*_{i} be the average and error of a particular aberration coefficient, inferred from *M* measurements at the *i*-th field location *H*_{i}. The curve fit coefficients *p⃗* of the aberration scans can be computed by a least-squares analysis [18] as,

In fact, [**R**^{T}**R**]^{−1} = **C** is the covariance matrix of the fit coefficients so that the variance of *p*_{i} equals the (*i,i*) element of **C** (${\sigma}_{{p}_{i}}^{2}={C}_{ii}$). If *p*_{j} is the *i*-th element of *X⃗*, then ${\sigma}_{{X}_{i}}=\sqrt{{C}_{jj}}$. Equation (6) can be rewritten as,

$$\Delta\overrightarrow{y}={\left[{\mathbf{m}'_{y}}^{T}\mathbf{m}'_{y}\right]}^{-1}{\mathbf{m}'_{y}}^{T}\overrightarrow{Y}'={\mathbf{D}}_{y}{\mathbf{m}'_{y}}^{T}\overrightarrow{Y}'\quad\text{with}\quad m'_{ij,y}=\frac{{m}_{ij,y}}{{\sigma}_{{Y}_{i}}}\quad\text{and}\quad {Y}'_{i}=\frac{{Y}_{i}}{{\sigma}_{{Y}_{i}}}$$

and the measurement-driven uncertainty in the correction estimates is approximated by

The error in the influence matrix can also be treated as part of the measurement error. For given *x⃗*, if the true influence matrix of the system differs from what we think it is (**M**_{x}) by *δ***M**_{x}, the corrections are expressed as,

Letting ${\sigma}_{{M}_{ij,x}}^{2}$ and ${\sigma}_{{x}_{i}}^{2}$ be the variances of *δM*_{ij,x} and *x*_{i}, respectively, the expected variance of the alignment corrections can be approximated by the following.

where *n*_{ij,x} is the (*i,j*) element of **n**_{x}. This outcome obviously depends on ${\sigma}_{{x}_{i}}^{2}$, the variance of the *unknown* true alignment state. However, this can be replaced by its expected value to set the expected *upper* limit on the alignment correction uncertainty.
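The propagation of measurement error into the corrections can be checked numerically. The sketch below uses the weighted form of the correction given above and assumes the standard least-squares result that the diagonal of **D** gives the variances of the corrections. The influence matrix, the measurement errors, and the noise-free terms are hypothetical values chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(1)

m = np.array([[ 1.8, -6.5,  0.2],
              [-0.4,  2.1, -0.1],
              [ 0.6, -1.0,  0.4]])        # hypothetical influence matrix
sigma_Y = np.array([0.01, 0.02, 0.015])   # assumed 1-sigma errors of Y (wv)
Y_true = np.array([0.30, -0.25, 0.10])    # made-up noise-free linear terms

# Scale each row by its measurement error (m' and Y' in the text).
m_w = m / sigma_Y[:, None]
D = np.linalg.inv(m_w.T @ m_w)            # covariance of the corrections
sigma_pred = np.sqrt(np.diag(D))

# Monte Carlo check: repeat the solve with noisy measurements.
n_trials = 20000
Y_noisy = Y_true + rng.normal(size=(n_trials, 3)) * sigma_Y
deltas = (D @ m_w.T @ (Y_noisy / sigma_Y).T).T   # one correction per row
sigma_mc = deltas.std(axis=0)

print("predicted 1-sigma:  ", sigma_pred)
print("Monte Carlo 1-sigma:", sigma_mc)
```

The two estimates should agree to within the sampling error of the Monte Carlo run, which is the kind of consistency check described for the robustness tests in Section 3.2.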

## 3. Case study

#### 3.1. Two-mirror finite conjugate system

In this example system, the object and image planes are at finite distances from the system. An f/3 beam is converted into f/2. The primary mirror (M1) is assumed to be in perfect alignment while the secondary (M2) is misaligned in decenter by *x*_{2}=+0.125mm, *y*_{2}=−0.515mm and in tilt by *ϕ*_{2}=−0.045deg, *θ*_{2}=0.185deg. The decenter and tilt of M2 are used as the correction terms for concentering coma and astigmatism at the nominal field center. The curve fit to the initial scans shown in Fig. 1 locates the centers of the three aberration fields at (+0.092,+0.698) for *Coma*, (−0.826,+3.390) for *Astg*_{1}, and (−0.239,+0.980) for *Curv* in normalized field coordinates.
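One way to locate an aberration center from the fit coefficients of a single scan is sketched below. It assumes that the center of a field-linear scan is taken as its zero crossing and that the center of a field-quadratic scan is taken as its vertex. The function names and coefficient values are hypothetical and are not the fit results behind Fig. 1.

```python
import numpy as np

def center_linear(p0, p1):
    """Zero crossing of p0 + p1*H (e.g. a Coma scan)."""
    return -p0 / p1

def center_quadratic(p1, p2):
    """Vertex of p0 + p1*H + p2*H**2 (e.g. an Astg_1 or Curv scan)."""
    return -p1 / (2.0 * p2)

# Made-up fit coefficients (wv) for scans along one field axis.
coma_center = center_linear(p0=-0.11, p1=1.20)
astg_center = center_quadratic(p1=1.32, p2=0.80)

print(f"Coma center   H = {coma_center:+.3f}")
print(f"Astg_1 center H = {astg_center:+.3f}")
```

Repeating this for the scan along the other field axis gives the second coordinate of each center.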

The required corrections from solving Eq. (5) are Δ*x*_{2}=−0.1248mm, Δ*y*_{2}=0.5148mm, Δ*ϕ*_{2}=0.04493deg, and Δ*θ*_{2}=+0.18493deg. These corrections are close in magnitude to the actual misalignment, as expected. As a result of these corrections, the field aberrations show symmetric distributions around the nominal field center and the system performance is fully restored (Fig. 2).

Suppose that M1 is also misaligned due to its own position accuracy of 0.2mm in decenter and 0.05deg in tilt and that it is desired to have less than 0.1wv rms system wavefront error at the common field center. Using Eq. (8), the expected distributions of residual *Curv*, *Astg*_{1}, and *Astg*_{2} can be computed as in Fig. 3. The distributions clearly deviate from a typical Gaussian, that of *Curv* in particular. The value of *Curv* is less than 0.0135wv 95% of the time. *Astg*_{1} and *Astg*_{2} are likely to be negligible most of the time as well. This indicates that adjusting M2 in decenter and tilt should guarantee the desired wavefront quality at the common field center with the expected alignment error of M1.
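The non-Gaussian shape of such residual distributions follows from their quadratic dependence on the alignment errors, and it can be reproduced with a small Monte Carlo sketch. The coefficient matrix `F_hat` and the 1-sigma error below are hypothetical, so the numbers do not correspond to Fig. 3. The only analytic fact used is that the mean of a quadratic form in zero-mean independent Gaussian variables with common variance is the variance times the trace of the matrix.

```python
import numpy as np

rng = np.random.default_rng(2)

# Hypothetical symmetric coefficient matrix of a residual quadratic term.
F_hat = np.array([[0.20, 0.05],
                  [0.05, 0.10]])
sigma = 0.2                      # assumed 1-sigma alignment error

# Draw alignment errors and evaluate x^T F_hat x for every sample.
x = rng.normal(scale=sigma, size=(100000, 2))
resid = np.einsum("ni,ij,nj->n", x, F_hat, x)

mean_theory = sigma**2 * np.trace(F_hat)
p95 = np.percentile(resid, 95)
skew = np.mean((resid - resid.mean())**3) / resid.std()**3

print(f"mean     = {resid.mean():.5f} (theory {mean_theory:.5f})")
print(f"95th pct = {p95:.5f}")
print(f"skewness = {skew:.2f}")
```

A clearly positive skewness signals the one-sided, non-Gaussian shape, in which case a percentile is a more informative summary than a mean and standard deviation.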

#### 3.2. Three-mirror camera and robustness test

In this case, we use an f/3 three-mirror camera system with a 1 deg field and 2.2m aperture. More details of this system are given in the Appendix. In this particular system, we assume that the secondary (M2) and tertiary (M3) can be placed to an accuracy of ±0.1mm in decenter and ±0.05deg in tip/tilt. The system is randomly perturbed within these ranges and the alignment state becomes x=−0.075mm, y=0.101mm, *ϕ*=−0.017deg, *θ*=0.035deg for M2 and x=0.095mm, y=−0.100mm, *ϕ*=−0.029deg, *θ*=0.031deg for M3. The initial field scans of the aberrations are shown in Fig. 4. Notice the weak sign of higher-order field aberrations in the coma field scans [Fig. 4(A)].

We here assume that M2 decenter and tilt and M3 tilt are adjustable for correcting the *linear terms* in *Coma*, *Astg*_{1}, and *Curv*. Three parameters per axis are sufficient for the correction. The required corrections are Δ*x*_{2}=0.091mm, Δ*y*_{2}=−0.117mm, Δ*θ*_{2}=0.036deg, Δ*ϕ*_{2}=−0.016deg, Δ*θ*_{3}=0.029deg, Δ*ϕ*_{3}=0.031deg. The field scans after the correction are shown in Fig. 5. The correction indeed concentered the field aberrations at the desired field location and restored the system performance close to nominal.

Having shown the details of how the proposed method works, its utility must be established through robustness tests against various error sources. Here, we continue to use the previous case study results, but look into the variations of the results as the magnitudes of measurement and influence matrix error change (Fig. 6).

The top row of Fig. 6 shows the standard deviation of the alignment corrections as a function of measurement error (solid lines with cross markers in A-1, B-1, C-1). These errors were deduced from 100 random realizations at each value of measurement error. The aberration coefficients were sampled at 9 field points along each field scan axis. At each scan point, each aberration coefficient was averaged over 10 measurements. On the same plots, the expected errors in the corrections using Eq. (13) are overplotted (dashed lines with disk markers), which closely follow the curves from the random realizations. These errors were then fed to the optical model of the system and the expected rms wavefront error (in terms of three standard deviations) was computed at the center and edge of the system field (D-1). The second row shows the same plots, but with curves plotted against the influence matrix error (A-2, B-2, C-2). The plots show a trend similar to that seen in the top row. For this particular system, it is found that less than 0.2wv of RMS wavefront error can be obtained across the field with measurement and model error around 0.1wv and 2%, respectively. A similar analysis can be performed for different systems with different requirements. Finally, the first three plots in the bottom row show the correction error curves as a function of the number of samples used to average each aberration coefficient at each scan point. The measurement and model error were assumed to be 0.1wv and 1%, respectively. As expected, averaging larger samples gives smaller error.
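The structure of such a robustness test can be sketched in a few lines. This is a simplified stand-in for the analysis behind Fig. 6, not a reproduction of it: the influence matrix and misalignment are hypothetical, the measurement noise is applied directly to the linear terms rather than to the wavefront data at each scan point, and the model error is assumed to be a relative Gaussian perturbation of each matrix element.

```python
import numpy as np

rng = np.random.default_rng(3)

m_true = np.array([[ 1.8, -6.5,  0.2],
                   [-0.4,  2.1, -0.1],
                   [ 0.6, -1.0,  0.4]])   # hypothetical true influence matrix
state = np.array([0.10, -0.03, 0.05])      # made-up misalignment
X_true = m_true @ state

def correction_std(meas_err, model_err, n_real=100, n_avg=10):
    """Standard deviation of the correction error over random realizations."""
    errs = np.empty((n_real, 3))
    for k in range(n_real):
        # Averaging n_avg measurements reduces the noise by sqrt(n_avg).
        noise = rng.normal(scale=meas_err / np.sqrt(n_avg), size=3)
        # Model error: relative perturbation of every matrix element.
        m_model = m_true * (1.0 + rng.normal(scale=model_err, size=(3, 3)))
        delta = np.linalg.solve(m_model, -(X_true + noise))
        errs[k] = delta + state            # zero for a perfect correction
    return errs.std(axis=0)

for meas_err in (0.01, 0.05, 0.10):
    print(meas_err, correction_std(meas_err, model_err=0.0))
for model_err in (0.01, 0.02, 0.05):
    print(model_err, correction_std(0.0, model_err=model_err))
```

Increasing `n_avg` in this sketch mimics the effect of averaging more samples per scan point.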

## 4. Discussion: Wide-field large aperture multi-surface systems with higher-order alignment-driven aberrations

Although many optical systems are low-order aberration dominant, there is an increasing number of wide-field on/off-axis systems that exhibit notable higher-order field aberrations. We use the Hobby-Eberly Telescope as an example system. It consists of five reflective surfaces with a 10m pupil and 22 arcmin field. The primary (M1) feeds an f/1.3 beam into the four-mirror prime focus corrector (PFC) that produces an f/3.65 beam at the focal plane. We assume that the four mirrors in the PFC can be tilted about an arbitrary rotation point. This rigid-body motion produces additional field constant coma in the system with relatively small amounts of astigmatism and curvature.

In the presence of large higher-order field aberrations, the field aberrations start showing substantial amounts of extra higher-order terms that are absent in Eq. (1). For example, *Coma*_{x} shows field quadratic terms and, when scanned along the *H*_{x} axis, can be approximated by the following functions.

$$+\left({C}_{{H}_{x}}+{C}_{{\mathrm{dh}}_{x}^{2}{H}_{x}}{\mathrm{dh}}_{x}^{2}+{C}_{{\mathrm{dh}}_{y}^{2}{H}_{x}}{\mathrm{dh}}_{y}^{2}+{C}_{{\mathrm{dh}}_{x}{\mathrm{dh}}_{y}{H}_{x}}{\mathrm{dh}}_{x}{\mathrm{dh}}_{y}\right){H}_{x}+{C}_{{\mathrm{dh}}_{x}{H}_{x}^{2}}{\mathrm{dh}}_{x}{H}_{x}^{2}+{O}^{\left(4\right)}$$

where *dh*_{x} and *dh*_{y} are linear functions of (*x, ϕ*) and (*y, θ*), respectively. Here, the field constant term is still substantial and the linear coefficients in this term are much larger than the cubic ones. Therefore, it should be possible to reduce the field constant term in the same way as used in the previous cases. However, in the presence of higher-order aberrations, the new field quadratic term (coupled with alignment parameters) can substantially contribute to *Coma*. The major influence of this term is to deform *Coma* field scans into a quadratic shape, effectively breaking the oddness of the original functional form of *Coma*. Therefore, in this case, the reduction of the field constant term of *Coma* does not necessarily place its center at a desired field location. A similar effect also occurs in *Astg* and *Curv*, where substantial amounts of field cubic terms, coupled with alignment parameters, can appear.
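The loss of odd symmetry in a *Coma* scan can be made explicit by fitting a cubic polynomial and separating its even and odd parts. In the sketch below the scan is synthetic and every coefficient value is made up. Under the description above, the even-order terms (*H*^{0} and *H*^{2}) are the alignment-driven ones.

```python
import numpy as np

h = np.linspace(-1.0, 1.0, 9)    # normalized field points on the H_x axis

# Synthetic Coma_x scan (wv): an odd part plus an even part.
coma = 0.15 + 1.20 * h + 0.40 * h**2 + 0.05 * h**3

# Cubic fit; coefficients are returned in ascending powers of H.
p = np.polynomial.polynomial.polyfit(h, coma, deg=3)

even_part = p[0] + p[2] * h**2       # H^0 and H^2 terms break the oddness
odd_part = p[1] * h + p[3] * h**3    # odd dependence on field

print("H^0 term:", round(p[0], 3))
print("H^2 term:", round(p[2], 3))
```

Nulling only `p[0]` would leave the `p[2]` term in place, so the zero crossing of the scan would still not sit at the desired field location.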

In the current example system, this is certainly true and, as a demonstration, the mirror surfaces (except M1) of the system are perturbed within ±0.1mm in decenter and ±0.05deg in tilt. After the perturbation, the PFC as a rigid-body is intentionally tilted to null the field constant coma. The field scans of the aberrations show substantial higher-order features (Fig. 7).

The curve fit coefficients for the *H*_{x} scans, in Table 1, clearly show the existence of a significant quadratic term for *Coma* and cubic terms for *Astg* and *Curv*, whereas the terms intrinsic to each aberration (e.g. odd functions of *H*_{x} for *Coma*) have been changed by only small amounts, meaning relatively weak alignment influence in these terms. Note, however, that the cubic terms in *Astg* and *Curv* are still less significant than the linear terms by many factors. A clear indication from this is that selective reduction of the *H*^{0} and *H*^{2} terms of *Coma* and the *H*^{1} terms of *Astg* and *Curv* should restore the distributions of the aberrations to their nominal over the field of view.

To demonstrate this, the alignment correction was computed in the same way used in the case studies (Method I) and by including the equation of the *H*^{2} term in *Coma* (Method II). In Method II, a total of four alignment parameters per axis are required to completely correct the four alignment-driven terms. We have chosen M4 decenter/tilt, M5 decenter, and PFC tilt, and these are also used in Method I. We use the initial scan data in Fig. 7.
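The difference between the two methods can be illustrated with a toy linear model. The 4 × 4 sensitivity matrix, the measured terms, and the use of a minimum-norm pseudo-inverse for Method I (three equations, four parameters) are assumptions made for this sketch. They are not the sensitivities of the example system.

```python
import numpy as np

# Hypothetical sensitivities of four alignment-driven terms (rows) to four
# adjustable parameters (columns) along one field axis.
# Rows: Coma H^0, Astg H^1, Curv H^1, Coma H^2.
m4 = np.array([[ 2.0, -5.0,  0.5,  3.0],
               [-0.4,  2.0, -0.1,  0.6],
               [ 0.6, -1.0,  0.4, -0.2],
               [ 0.3,  0.8, -0.6,  1.5]])
terms = np.array([0.40, -0.25, 0.10, 0.30])   # made-up measured terms (wv)

# Method I: null the first three terms with the minimum-norm adjustment.
d1 = -np.linalg.pinv(m4[:3]) @ terms[:3]
# Method II: null all four terms with the four parameters.
d2 = np.linalg.solve(m4, -terms)

res1 = terms + m4 @ d1
res2 = terms + m4 @ d2
print("Method I residuals: ", np.round(res1, 3))
print("Method II residuals:", np.round(res2, 3))
```

In this toy model Method I leaves the *H*^{2} term of *Coma* uncontrolled, while Method II removes all four terms at the cost of one more equation per axis.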

The correction by Method I substantially removed alignment-driven aberrations and restored the field distributions close to nominal, *Astg* and *Curv* in particular (Fig. 8). At the edge of the field, the RMS wavefront error is reduced from 11 wv to 2.5 wv. However, as discussed, the distribution of *Coma* is still offset from the desired nominal center field, showing large asymmetry over the field. The distributions of *Astg* and *Curv* are also off-centered, but by smaller amounts than *Coma*. Though the individual aberrations are not quite centered at the common field, the overall wavefront error across the field becomes close to the nominal. Note that an almost identical result is obtained when only M4 decenter/tilt and PFC tilt are used.

Due to the fact that the *H*^{2} term is also corrected, Method II produced a set of alignment corrections that exactly concentered the field distributions of *Coma*, *Astg*, and *Curv* at the nominal center field (Fig. 9). *Coma* scans follow the original odd function in *H*. The amount of asymmetry in all aberrations is negligible and the overall wavefront error is nearly identical to the nominal. In comparison to Method I’s results, the RMS wavefront error at the edge of the field was reduced from 2.5wv to 2.2wv (roughly 10% improvement) by Method II. Although Method I can be effective in removing the alignment-driven aberrations, the full field ray-spot distribution, shown in Fig. 10, clearly demonstrates that Method II can further improve the quality of aberration concentering and collimation in the presence of higher-order aberrations, towards the edge of the field in particular.

Note that, if a misaligned system develops large amounts of cubic terms in *Astg* and *Curv*, two more alignment parameters per field axis are likely to be necessary for concentering the three aberrations. Even so, not all of the required alignment parameters may be necessary, depending on the amount of alignment-driven optical performance degradation. If a system is in a late stage of its commissioning and the performance is not far away from the nominal, one may use some of the alignment parameters to efficiently reduce alignment-driven aberrations. If the degradation is still large, it would be necessary to use all of the required alignment parameters. In any case, the proposed method can be a useful way to test the alignment state of a system and to determine the optimal next adjustment.

## 5. Conclusion

In this paper, we described a new collimation method for misaligned optical systems. A series of theoretical analyses of the method indicates the following. The optimal adjustment given by the method is physically equivalent to placing the centers of the primary field aberrations simultaneously at a desired common field location. This not only restores the field distribution of aberrations to the nominal, but also improves the image quality across the field. In the case study of the three-mirror system, for example, the optimal correction found by the method completely restored the field aberration distribution and significantly improved the RMS wavefront error from 0.78 wv to 0.12 wv. Note that this improvement was obtained from a single aberration scan and alignment correction without any further iterations. In most systems dominated by low-order aberrations, at most three alignment parameters per axis need to be adjusted to improve the quality of aberration concentering and collimation. In practice, this means adjusting at most two surfaces. The analyses suggest that, for misaligned systems with higher-order alignment-driven aberrations, aberration concentering may not be achieved, although a factor of 5 improvement in RMS wavefront error was observed in the presented example system. However, adding one more alignment parameter per axis, so that the fourth term arising from the higher-order aberrations is also removed, is shown to be effective in further improving the quality of collimation and aberration concentering. The observed additional improvement in RMS wavefront error was approximately 10%. Finally, the case studies and robustness tests demonstrated that the method can be robust against measurement and model uncertainties. This demonstrates the method’s feasibility as an independent alignment test method for both coarse and fine collimation of misaligned optical systems.

## Appendix: The design prescription of the three-mirror system used in Section 3.2

## Acknowledgements

I am grateful to two anonymous reviewers for their constructive comments on this manuscript. I thank Darragh O’Donoghue, Lisa Crause, other members of the SALT image quality fixup team at the South African Large Telescope, and John Booth at McDonald Observatory for productive discussions and suggestions on optical alignment of a four-mirror prime focus corrector. The presented research is supported by the Hobby-Eberly Telescope Dark Energy eXperiment project (HETDEX). HETDEX is a collaboration of The University of Texas at Austin, Pennsylvania State University, Texas A&M University, Universitäts-Sternwarte Munich, Astrophysical Institute Potsdam, and Max-Planck-Institut fuer Extraterrestrische Physik. Financial support for HETDEX is provided by the State of Texas, the United States Air Force, and the generous contributions of many private foundations and individuals.

## References and links

**1.** H. J. Jeong, G. N. Lawrence, and K. B. Nahm, “Auto-alignment of a three mirror off-axis telescope by reverse optimization and end-to-end aberration measurements,” Proc. SPIE **818**, 419–430 (1987).

**2.** M. A. Lundgren and W. L. Wolfe, “Alignment of a three-mirror off-axis telescope by reverse optimization,” Opt. Eng. **30**, 307–311 (1991).

**3.** W. Sutherland, *Alignment and Number of Wavefront Sensors for VISTA*, VIS-TRE-ATC-00112-0012 (Technical report, Astronomy Technology Center, UK, 2001).

**4.** S. Kim, H.-S. Yang, Y.-W. Lee, and S.-W. Kim, “Merit function regression method for efficient alignment control of two-mirror optical systems,” Opt. Express **15**, 5059–5068 (2007).

**5.** H. Lee, G. B. Dalton, I. A. J. Tosh, and S.-W. Kim, “Computer-guided alignment II: Optical system alignment using differential wavefront sampling,” Opt. Express **15**, 15424–15437 (2007).

**6.** H. Lee, G. B. Dalton, I. A. J. Tosh, and S.-W. Kim, “Practical implementation of the complex wavefront modulation model for optical alignment,” Proc. SPIE **6617**, 66170N (2007).

**7.** H. Lee, G. B. Dalton, I. A. J. Tosh, and S.-W. Kim, “Implementation of differential wavefront sampling in optical alignment of pupil-segmented telescope systems,” Proc. SPIE **7017**, 70171T (2008).

**8.** H. N. Chapman and D. W. Sweeney, “Rigorous method for compensation selection and alignment of microlithographic optical systems,” Proc. SPIE **3331**, 102–113 (1998).

**9.** A. M. Hvisc and J. H. Burge, “Alignment analysis of four-mirror spherical aberration correctors,” Proc. SPIE **7018**, 701819 (2008).

**10.** D. O’Donoghue, South African Large Telescope, Observatory, 7935, South Africa (Personal communication, 2009).

**11.** R. J. Noll, “Zernike polynomials and atmospheric turbulence,” J. Opt. Soc. Am. **66**, 207–211 (1976).

**12.** R. V. Shack and K. Thompson, “Influence of alignment errors of a telescope system on its aberration field,” in *Optical alignment*, R. M. Shagam and W. C. Sweatt, eds., Proc. SPIE **251**, 146–153 (1980).

**13.** B. McLeod, “Collimation of Fast Wide-Field Telescopes,” Publ. Astron. Soc. Pac. **108**, 217–219 (1996).

**14.** R. N. Wilson and B. Delabre, “Concerning the Alignment of Modern Telescopes: Theory, Practice, and Tolerance Illustrated by the ESO NTT,” Publ. Astron. Soc. Pac. **109**, 53–60 (1997).

**15.** L. Noethe and S. Guisard, “Analytic expressions for field astigmatism in decentered two mirror telescopes and application to the collimation of the ESO VLT,” Astron. Astrophys. Suppl. Ser. **144**, 157–167 (2000).

**16.** H. Lee, G. B. Dalton, I. A. J. Tosh, and S. Kim, “Computer-guided alignment I: Phase and amplitude modulation of the alignment-influenced wavefront,” Opt. Express **15**, 3127–3139 (2007).

**17.** H. Lee, G. Dalton, I. Tosh, and S. Kim, “Computer-guided alignment III: Description of inter-element alignment effect in circular-pupil optical systems,” Opt. Express **16**, 10992–11006 (2008).

**18.** W. H. Press, S. A. Teukolsky, W. T. Vetterling, and B. P. Flannery, *Numerical Recipes in C*, 2nd ed. (Cambridge, 2002).

**19.** G. D’Agostini, *Bayesian Reasoning in Data Analysis* (World Scientific, 2005).