Wavefront sensorless adaptive optics: a general model-based approach

Huang Linhai; Changhui Rao

doi:10.1364/OE.19.000371

1. Introduction

For some applications that benefit from adaptive optics (AO) correction, such as inertial confinement fusion (ICF), optical tracking, free-space laser propagation, and microscopy, wavefront sensorless AO is a better option than AO with a separate wavefront sensor [1]. The most significant potential advantage of wavefront sensorless AO is that the far-field intensity distribution can be used as the feedback signal, which in turn allows the system to be used in poor illumination. Wavefront sensorless AO systems operate by sequentially modulating the AO corrector and maximizing a feedback signal according to a particular optimization algorithm. These algorithms include model-free methods and model-based methods. Model-free methods comprise stochastic, local, or global search methods. Among them, the stochastic gradient methods proposed by M. Vorontsov have been verified to be the fastest search methods [2], yet they still require many measurements. For example, more than 100 measurements are needed for a 25-element actuator deformable mirror (DM) to correct the aberrations [3], and the more measurements that are required, the more difficult it is to realize a real-time AO system.

The model-based approach is a much better choice for reducing the number of measurements. M. J. Booth has shown that the model-based approach can correct aberrations with a minimum of N + 1 photodetector measurements for N aberration modes [4,5]. However, that method requires different sets of functions as the predetermined bias functions for aberrations of different magnitudes: Zernike functions for small aberrations and Lukosz–Zernike (L–Z) functions for large aberrations [4,5]. Furthermore, its accuracy depends on factors such as the bias values (the coefficients of the predetermined bias functions).

In this paper we propose a general model-based approach. Unlike the model-based method proposed by M. J. Booth, the general approach is insensitive to the selected set of functions as well as to the bias values. Besides the L–Z functions, the general model-based approach can also take other kinds of modes, such as the Zernike functions, as the predetermined bias functions and correct aberrations effectively, for both large and small aberrations.

2. The general model-based method

2.1 The wavefront sensorless AO correction system

A wavefront sensorless AO system can be depicted as shown in Fig. 1, where the input wavefront is incident from the left. After aberration correction by the adaptive element (the DM), the input wavefront is focused onto the photodetector by a positive lens. A mask or a small pinhole is placed on the photodetector to generate a feedback signal, such as the encircled energy signal [4]. The encircled energy feedback signal is produced by a pinhole, which blocks the energy outside the pinhole and passes the energy inside it to the photodetector. The feedback signal is used to drive the AO element. The aberration of the input wavefront is described by the function $Φ (x, y)$, where x and y are the rectangular coordinates in the pupil plane of the lens. The correcting aberration of the DM is described by the function $Ψ (x, y)$. The residual aberration of the input wavefront is represented by the function $R (x, y)$, with $R (x, y) = Φ (x, y) - Ψ (x, y)$. The corresponding far-field intensity distribution $I (x', y')$ is given by [6]

I (x', y') = I_{0} {| \int_{}^{} \int_{}^{} A (x, y) * \exp [j R (x, y)] e^{\frac{j k}{2 z} [{(x - x^{'})}^{2} + {(y - y^{'})}^{2}]} d x d y |}^{2},

where x′ and y′ are the rectangular coordinates in the input plane of the photodetector; I₀ is proportional to the incident light power; A is the amplitude of the input wavefront and is set to be uniform in this paper; $k = 2 π / λ$, where λ is the wavelength of the input wavefront; z is the focal length of the positive lens; and j is the imaginary unit.
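As an illustration of Eq. (1), the following minimal sketch evaluates a far-field intensity numerically. It assumes the detector lies in the focal plane of the lens, so that the diffraction integral can be approximated by a Fourier transform of the pupil field (the Fraunhofer approximation). The grid size, the zero-padding factor, the circular pupil, the test phase, and the helper name `far_field_intensity` are illustrative assumptions and do not come from the paper.

```python
import numpy as np

def far_field_intensity(residual_phase, amplitude, pad=4):
    """Focal-plane intensity |FT{A exp(jR)}|^2 (Fraunhofer approximation)."""
    n = residual_phase.shape[0]
    field = amplitude * np.exp(1j * residual_phase)
    # Zero-pad so the far field is sampled more finely than one diffraction width.
    padded = np.zeros((pad * n, pad * n), dtype=complex)
    padded[:n, :n] = field
    spectrum = np.fft.fftshift(np.fft.fft2(padded))
    return np.abs(spectrum) ** 2

# Uniform amplitude inside a unit-radius circular pupil (assumed geometry).
n = 64
y, x = np.mgrid[-1:1:n * 1j, -1:1:n * 1j]
pupil = (x**2 + y**2 <= 1.0).astype(float)

flat = far_field_intensity(np.zeros((n, n)), pupil)
# Astigmatism-like residual phase in radians (arbitrary test aberration).
aberrated = far_field_intensity(2.0 * (x**2 - y**2), pupil)
```

In this sketch the aberration lowers the peak of the far field while the total energy stays the same, which is the behaviour a sensorless AO feedback signal relies on.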

Fig. 1 Schematic diagram of the adaptive system.

We assume that Φ and Ψ can be represented by a series of M orthonormal functions and N orthonormal functions, respectively. In most cases, the number M is greater than N. The orthonormal functions are called modes in the following text, each mode denoted by $F_{i} (x, y)$:

\begin{array}{l} Φ (x, y) = \sum_{i = 1}^{M} ν_{i} \cdot F_{i} (x, y), \\ Ψ (x, y) = \sum_{i = 1}^{N} μ_{i} \cdot F_{i} (x, y) . \end{array}

In other words, the aberration Φ of the input wavefront can be denoted by a vector V whose elements are the coefficients $ν_{i}$. Similarly, the correcting aberration Ψ and the residual aberration R can be represented by vectors U and Z, respectively. The elements of U and Z are $μ_{i}$ and $z_{i}$, with $z_{i} = ν_{i} - μ_{i}$. Minimizing the vector Z is the task of this wavefront sensorless AO system.

In order to minimize vector Z efficiently, the relationship between the far-field intensity distribution and the aberration needs to be established first.

2.2 Relationship between the second moment (SM) of the aberration gradients and the FWHM of far-field intensity distribution

As we know, the centroid of the far-field intensity distribution in geometric optics is related to the aberration of the input wavefront. When only the Seidel aberration of tilt is present, this relationship can be described by the following expression [6,7]:

{[\frac{\partial}{\partial x} Φ (x, y)]}^{2} + {[\frac{\partial}{\partial y} Φ (x, y)]}^{2} \propto (x'^{2} + y'^{2}) .
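The proportionality in Eq. (3) can be checked numerically in a simplified setting. The sketch below is a one-dimensional illustration under assumed parameters (aperture width, padding, and the helper name `spot_shift` are hypothetical): it applies a pure tilt to a uniform aperture and locates the peak of the far field with an FFT. The spot displacement grows linearly with the phase gradient, so the squared displacement is proportional to the squared gradient.

```python
import numpy as np

n, pad = 32, 8           # aperture samples and zero-padding factor (assumed values)
N = n * pad
cols = np.arange(n)

def spot_shift(cycles_across_padded_grid):
    """Peak position (in FFT bins) of the far field of a tilted 1-D aperture."""
    # Phase gradient of the tilt in radians per pixel.
    slope = 2 * np.pi * cycles_across_padded_grid / N
    field = np.zeros(N, dtype=complex)
    field[:n] = np.exp(1j * slope * cols)
    return int(np.argmax(np.abs(np.fft.fft(field)) ** 2))

# Doubling the tilt doubles the displacement of the focal spot.
shifts = [spot_shift(m) for m in (0, 3, 6, 12)]
```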

When other aberrations are present, they can be considered as compositions of many small pieces of Seidel tilt aberration. The relationship between the tilt aberration in each small piece and the centroid of the corresponding spot obeys Eq. (3). Hence we may deduce that the sum over all small pieces has the following relationship:

\sum_{i = 0}^{TL} \left\{ {[\frac{\partial}{\partial x} Φ_{i} (x, y)]}^{2} + {[\frac{\partial}{\partial y} Φ_{i} (x, y)]}^{2} \right\} \propto \sum_{i = 0}^{TLF} I_{i} (x', y') (x'^{2} + y'^{2}),

where $Φ_{i} (x, y)$ stands for the i-th piece of tilt, TL is the total number of small pieces, TLF is the total number of points in the far-field intensity distribution, and $I_{i} (x', y')$ is the far-field intensity at point $(x', y')$. Since the far-field intensity distributions of two or more small pieces of the input wavefront may fall on the same point, a variable related to the number of pieces is needed. It is well known that, in geometric optics, the magnitude of the far-field intensity at a point is directly proportional to the number of small-piece input wavefronts focused there, provided that those pieces have the same tilt aberration. Therefore, $I_{i} (x', y')$ appears in Eq. (4) to represent the total number of small-piece input wavefronts that focus on point $(x', y')$.

In the limit TL → ∞ and TLF → ∞, Eq. (4) becomes

\int_{x} \int_{y} \left\{ {[\frac{\partial}{\partial x} Φ (x, y)]}^{2} + {[\frac{\partial}{\partial y} Φ (x, y)]}^{2} \right\} d x d y \propto \int_{x'} \int_{y'} I (x', y') (x'^{2} + y'^{2}) d x' d y',

where the left-hand side of Eq. (5) is the SM of the aberration gradients, and the right-hand side of Eq. (5) is the sum of the far-field intensity distribution multiplied by a mask [corresponding to the position factor $(x'^{2} + y'^{2})$]. Since the total sum of the far-field intensity distribution $I (x', y')$ is a constant, the right-hand side of Eq. (5) can be rewritten as

\begin{array}{l} \int_{x'} \int_{y'} I (x', y') (x'^{2} + y'^{2}) d x' d y' = R^{2} \int_{x'} \int_{y'} I (x', y') \frac{(x'^{2} + y'^{2})}{R^{2}} d x' d y' \\ \qquad = R^{2} \left\{ \int_{x'} \int_{y'} I (x', y') d x' d y' - \int_{x'} \int_{y'} I (x', y') [1 - \frac{r^{2}}{R^{2}}] d x' d y' \right\} . \end{array}

Hence the mask becomes $1 - r^{2} / R^{2}$ for r ≤ R and zero otherwise, where $r = \sqrt{x'^{2} + y'^{2}}$ and R is a suitably chosen detector radius expressed in units of the system's diffraction limit (DL). The significant advantage of the new mask is that it is insensitive to the actual detected size. In practice, the new mask can be implemented by weighting the pixels in software when the photodetector is replaced by a CCD camera.

Moreover, the whole expression of the right-hand side of Eq. (5) is also normalized by dividing by the sum of $I (x', y')$ . For convenience, Eq. (7) is called the masked detector signal (MDS) in subsequent text.

MDS = \frac{\int_{x'} \int_{y'} I (x', y') [1 - \frac{r^{2}}{R^{2}}] d x' d y'}{\int_{x'} \int_{y'} I (x', y') d x' d y'} .
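The software weighting of camera pixels described above can be sketched as follows. This is a minimal illustration of Eq. (7), not the authors' implementation: the function name `masked_detector_signal` is hypothetical, the detector radius is given in pixels rather than in DL units, and the optical axis is assumed to map to the centre pixel of the image.

```python
import numpy as np

def masked_detector_signal(intensity, radius):
    """MDS of Eq. (7): intensity weighted by 1 - r^2/R^2 inside r <= R,
    normalized by the total intensity. `radius` is in pixels here."""
    ny, nx = intensity.shape
    yy, xx = np.indices((ny, nx))
    # Assumption: the optical axis maps to the array centre.
    r2 = (xx - nx // 2) ** 2 + (yy - ny // 2) ** 2
    mask = np.where(r2 <= radius**2, 1.0 - r2 / radius**2, 0.0)
    return float((intensity * mask).sum() / intensity.sum())

# A single bright pixel on axis gives MDS = 1; moving it outward lowers the MDS.
img_on = np.zeros((65, 65))
img_on[32, 32] = 1.0
on_axis = masked_detector_signal(img_on, radius=20)

img_off = np.zeros((65, 65))
img_off[32, 42] = 1.0          # 10 pixels off axis: 1 - 100/400 = 0.75
off_axis = masked_detector_signal(img_off, radius=20)
```

Because the signal is normalized by the total intensity, scaling the image brightness leaves the MDS unchanged.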

Therefore, Eq. (5) can be expressed as

SM \approx c_{0} (1 - MDS),

where SM is the second moment of the aberration gradients, and c₀ is the slope of the trend line, which is determined by the detector radius R.

To verify the relationship between the SM of the aberration gradients and the MDS, 500 random atmospheric aberrations with various second moments are produced by the method proposed by Noll [8]. The SM of the gradients of each of the 500 aberrations and the MDS of the corresponding far field are calculated, and the signal responses are plotted in Fig. 2. The detector radius R is set to 5 DL, 12 DL, and 24 DL in turn. An approximately linear relationship between the SM of the aberration gradient and the MDS is obtained when R is 12 DL or 24 DL. When R = 5 DL the linearity shows some errors, which arise in the cases where the far-field distribution extends beyond the calculated area. Thus, when the detector radius R is suitably chosen, the MDS can be related to the SM of the aberration gradients by Eq. (8).

Fig. 2 Signal response between SM of aberration gradient and MDS. 500 random aberrations are taken for study.

For comparison, the relationship used in [4] is calculated numerically from random aberrations and plotted in Fig. 3. The relationship between the MDS and the aberration magnitude |V| is calculated from 5000 random aberrations, and each circle in Fig. 3 stands for the mean MDS of 100 random aberrations. Note that the MDS is essentially the same as the detected signal used in [4], except that the MDS is normalized by the sum of the far-field intensity. In contrast, our method results in an approximately linear relationship between the MDS and the SM.

Fig. 3 Signal response between aberration magnitude |V| and MDS. Each point in the figure stands for the mean MDS values of 100 random aberrations.

2.3 Modeling and analysis

So far, the relationship between the MDS and the SM of the aberration gradient has been established. We now build the general model-based method for a wavefront sensorless AO system on this relationship.

As noted in Section 2.1, the aberration of an input wavefront can be expressed by a series of M orthonormal modes $F_{i} (x, y)$,

Φ (x, y) = \sum_{i = 1}^{M} ν_{i} \cdot F_{i} (x, y) .

The input aberration Φ’s gradients in the x and y axes are [9]

\begin{array}{l} \frac{\partial}{\partial x} Φ (x, y) = \sum_{i = 1}^{M} ν_{i} \cdot \frac{\partial}{\partial x} F_{i} (x, y), \\ \frac{\partial}{\partial y} Φ (x, y) = \sum_{i = 1}^{M} ν_{i} \cdot \frac{\partial}{\partial y} F_{i} (x, y), \end{array}

where $\partial Φ (x, y) / \partial x$ and $\partial Φ (x, y) / \partial y$ are the input aberration gradients in the x and y axes, respectively, and $\partial F_{i} (x, y) / \partial x$ and $\partial F_{i} (x, y) / \partial y$ are the orthonormal mode gradients in the x and y axes, respectively.

To determine the coefficients $ν_{i}$, each of the N orthonormal modes $F_{i} (x, y)$ is taken in turn as the predetermined bias function and added by the DM, with coefficient α, to the input aberration. The detector measurement is then recorded for each bias.

The difference $w_{i, 0}$ between the SM of the gradients of aberration Φ and that after adding a predetermined bias function $F_{i} (x, y)$ with coefficient α is defined as

w_{i, 0} = SM_{i} - SM_{0} = \frac{\int_{s} \left( \left\{ {[\frac{\partial}{\partial x} Φ (x, y) + α \frac{\partial}{\partial x} F_{i} (x, y)]}^{2} + {[\frac{\partial}{\partial y} Φ (x, y) + α \frac{\partial}{\partial y} F_{i} (x, y)]}^{2} \right\} - \left\{ {[\frac{\partial}{\partial x} Φ (x, y)]}^{2} + {[\frac{\partial}{\partial y} Φ (x, y)]}^{2} \right\} \right) d x d y}{s},

where $SM_{0}$ and $SM_{i}$ are the SMs of the gradients of $Φ (x, y)$ and $Φ (x, y) + α F_{i} (x, y)$, respectively. By rearranging, we can get

\begin{array}{l} w_{i, 0} = \frac{\int_{s} [α \cdot \frac{\partial}{\partial x} F_{i} (x, y) \cdot 2 \frac{\partial}{\partial x} Φ (x, y)] d x d y}{s} + \frac{\int_{s} {[α \cdot \frac{\partial}{\partial x} F_{i} (x, y)]}^{2} d x d y}{s} + \frac{\int_{s} [α \cdot \frac{\partial}{\partial y} F_{i} (x, y) \cdot 2 \frac{\partial}{\partial y} Φ (x, y)] d x d y}{s} + \frac{\int_{s} {[α \cdot \frac{\partial}{\partial y} F_{i} (x, y)]}^{2} d x d y}{s} \\ \qquad = \frac{\int_{s} 2 α \cdot [\frac{\partial}{\partial x} F_{i} (x, y) \frac{\partial}{\partial x} Φ (x, y) + \frac{\partial}{\partial y} F_{i} (x, y) \cdot \frac{\partial}{\partial y} Φ (x, y)] d x d y}{s} + \frac{\int_{s} α^{2} \cdot \left\{ {[\frac{\partial}{\partial x} F_{i} (x, y)]}^{2} + {[\frac{\partial}{\partial y} F_{i} (x, y)]}^{2} \right\} d x d y}{s} . \end{array}

After N predetermined bias functions F_i(x, y) are added sequentially by the DM, we will obtain N equations

w_{i, 0} = \frac{\int_{s} 2 α \cdot [\frac{\partial}{\partial x} F_{i} (x, y) \frac{\partial}{\partial x} Φ (x, y) + \frac{\partial}{\partial y} F_{i} (x, y) \cdot \frac{\partial}{\partial y} Φ (x, y)] d x d y}{s} + \frac{\int_{s} α^{2} \cdot \left\{ {[\frac{\partial}{\partial x} F_{i} (x, y)]}^{2} + {[\frac{\partial}{\partial y} F_{i} (x, y)]}^{2} \right\} d x d y}{s}, \quad i = 1, \ldots, N .

After the gradients of aberration Φ are replaced by Eq. (9), it is convenient to use vectors that represent the sampled values of the original functions. Equation (12) then becomes

W = 2 α * S * V + α^{2} * S_{m},

where

S = \begin{pmatrix} s_{1, 1} & s_{1, 2} & \cdots & s_{1, N} & \cdots & s_{1, M} \\ s_{2, 1} & s_{2, 2} & \cdots & s_{2, N} & \cdots & s_{2, M} \\ \vdots & \vdots & & \vdots & & \vdots \\ s_{N, 1} & s_{N, 2} & \cdots & s_{N, N} & \cdots & s_{N, M} \end{pmatrix},

S_{m} = \begin{pmatrix} s_{1, 1} \\ s_{2, 2} \\ \vdots \\ s_{N, N} \end{pmatrix},

W = \begin{pmatrix} w_{1, 0} \\ w_{2, 0} \\ \vdots \\ w_{N, 0} \end{pmatrix},

V = \begin{pmatrix} ν_{1} \\ ν_{2} \\ \vdots \\ ν_{N} \\ \vdots \\ ν_{M} \end{pmatrix},

and

s_{n, m} = \frac{\int_{s} \left\{ [\frac{\partial}{\partial x} F_{n} (x, y) * \frac{\partial}{\partial x} F_{m} (x, y)] + [\frac{\partial}{\partial y} F_{n} (x, y) * \frac{\partial}{\partial y} F_{m} (x, y)] \right\} d x d y}{s} .

Generally, S is invertible (which requires it to be square, that is, M = N), so the coefficient vector V of the modes can be calculated by

V = \frac{S^{- 1} (W - α^{2} * S_{m})}{2 * α} .
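The matrix relation and its inversion can be checked numerically when the SM differences are computed directly from the wavefront gradients. The sketch below is an illustration under stated assumptions, not the paper's simulation: it uses five simple polynomial modes on a square grid instead of Zernike or L–Z modes on a circular pupil, finite-difference gradients from `np.gradient`, M = N, and arbitrary random coefficients. The names `grads` and `second_moment` are hypothetical helpers.

```python
import numpy as np

n = 128
y, x = np.mgrid[-1:1:n * 1j, -1:1:n * 1j]
h = x[0, 1] - x[0, 0]  # grid spacing

# Illustrative bias modes (simple polynomials, not the paper's mode sets).
modes = [x**2 + y**2, x**2 - y**2, 2 * x * y, x**3, y**3]

def grads(f):
    gy, gx = np.gradient(f, h)   # derivatives along rows (y) and columns (x)
    return gx, gy

def second_moment(f):
    gx, gy = grads(f)
    return np.mean(gx**2 + gy**2)

G = [grads(m) for m in modes]
# s_{n,m}: area average of the inner product of the gradients of modes n and m.
S = np.array([[np.mean(gxn * gxm + gyn * gym) for (gxm, gym) in G]
              for (gxn, gyn) in G])
S_m = np.diag(S)

rng = np.random.default_rng(0)
V_true = rng.normal(size=len(modes))          # unknown aberration coefficients
phi = sum(v * m for v, m in zip(V_true, modes))

alpha = 0.05
SM0 = second_moment(phi)
# w_{i,0} = SM_i - SM_0 for each bias alpha * F_i.
W = np.array([second_moment(phi + alpha * m) - SM0 for m in modes])

# V = S^{-1} (W - alpha^2 S_m) / (2 alpha)
V_est = np.linalg.solve(S, W - alpha**2 * S_m) / (2 * alpha)
```

In this setting the recovery is exact up to rounding error for any nonzero α, because the relation between W and V is linear; the modes only need to have linearly independent gradients so that S is invertible.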

According to Eq. (8) and Eq. (10), we have

w_{i, 0} = SM_{i} - SM_{0} \approx c_{0} (1 - MDS_{i}) - c_{0} (1 - MDS_{0}) \approx - c_{0} (MDS_{i} - MDS_{0}),

where $MDS_{0}$ and $MDS_{i}$ are the corresponding MDS of input wavefronts $Φ (x, y)$ and $Φ (x, y) + α F_{i} (x, y)$, respectively. That is, $W \approx c_{0} M$; thus the vector V is estimated by

V \approx \frac{S^{- 1} (c_{0} * M - α^{2} * S_{m})}{2 * α},

where

M = - \begin{pmatrix} MDS_{1} - MDS_{0} \\ MDS_{2} - MDS_{0} \\ \vdots \\ MDS_{N} - MDS_{0} \end{pmatrix} .
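The estimate of V from detector measurements can be written as a short function. The sketch below assumes that S, S_m, and c₀ have been obtained beforehand; the function name `estimate_coefficients` and all numerical values are hypothetical toy numbers chosen only to check the algebra. The synthetic MDS values are generated so that they obey SM = c₀(1 − MDS) exactly, in which case the estimate reproduces the true coefficients.

```python
import numpy as np

def estimate_coefficients(S, S_m, mds0, mds_biased, alpha, c0):
    """Estimate V ~ S^{-1} (c0*M - alpha^2*S_m) / (2*alpha), M = -(MDS_i - MDS_0)."""
    M = -(np.asarray(mds_biased, dtype=float) - mds0)
    rhs = c0 * M - alpha**2 * np.asarray(S_m, dtype=float)
    return np.linalg.solve(S, rhs) / (2 * alpha)

# Toy numbers (hypothetical, for a consistency check only).
S = np.array([[2.0, 0.5], [0.5, 1.0]])
S_m = np.diag(S)
V_true = np.array([0.3, -0.2])
alpha, c0, SM0 = 0.05, 50.0, 4.0

# Synthetic measurements that obey SM = c0 * (1 - MDS) exactly.
SM_biased = SM0 + 2 * alpha * S @ V_true + alpha**2 * S_m
mds0 = 1 - SM0 / c0
mds_biased = 1 - SM_biased / c0

V_est = estimate_coefficients(S, S_m, mds0, mds_biased, alpha, c0)
```

With real measurements the linear relation holds only approximately, so the estimate carries the corresponding error.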

Note that vector V can be resolved exactly by Eq. (15), no matter what sets of modes are adopted as the predetermined bias functions; hence, the method is insensitive to the selected sets of modes as well as to the choice of the bias value α, and the correction error comes only from Eq. (8). Furthermore, Eq. (8) is derived from geometric optics and holds for all kinds of aberrations as long as a suitable detector radius R is selected.

3. Numerical simulations and result analysis

To verify the performance of the general method, 50 random aberrations with a normalized root-mean-square value of 0.5 λ are produced as the input wavefront aberrations. The aberrations consist of 18 Zernike modes with random coefficients $u_{i}$. The 18 Zernike modes and 18 L–Z modes are taken in turn as the biases of the model-based method and added by the DM to the input wavefront.

The 18 Zernike modes and 18 L–Z modes are drawn in Fig. 4(a) and Fig. 4(b), respectively. The corresponding inverses S⁻¹ are shown in Fig. 5(a) and Fig. 5(b) respectively.

Fig. 4 Set of aberration modes: (a) 18 Zernike modes (3–20); (b) 18 L–Z modes (3–20).

Fig. 5 Corresponding inverse S⁻¹ of sets of aberration modes in Fig. 4: The inverse S⁻¹ of (a) 18 Zernike modes (3-20), (b) 18 L–Z modes (3-20).

The biases $F_{i} (x, y)$ with coefficient $α = 0.05 λ$ are added sequentially to the aberrations, giving the total aberrations $U + V$, where $U^{T} = [u_{1}, u_{2}, ..., u_{18}]$ and $V^{T} = [0, 0, ..., α, ..., 0]$ (with α in the i-th position, i = 1, 2, …, 18), and the corresponding $MDS_{i}$ are calculated. The detector radius R of the far-field intensity equals $16 DL$, and the slope of the trend line $c_{0}$ is −194.1. After all N biases are added, the correction vector V is calculated by Eq. (16).

The Strehl Ratio (SR) is adopted to evaluate the correction results, and the SR is defined as

SR = \frac{P [I (x, y)]}{P [I_{0} (x, y)]},

where $P [\cdot]$ is an operation that calculates the peak intensity, I is the actual intensity distribution, and I₀ is the intensity distribution when no aberrations are present.
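The SR definition can be evaluated numerically as the ratio of two far-field peaks. The sketch below is an assumption-laden illustration: it computes the far field with a zero-padded FFT of a uniform circular pupil, and the function name `strehl_ratio`, the grid size, and the defocus-like test phase are hypothetical choices rather than the paper's simulation settings.

```python
import numpy as np

def strehl_ratio(phase, pupil, pad=4):
    """Peak far-field intensity with aberration over the aberration-free peak."""
    def peak(ph):
        n = pupil.shape[0]
        field = np.zeros((pad * n, pad * n), dtype=complex)
        field[:n, :n] = pupil * np.exp(1j * ph)
        return (np.abs(np.fft.fft2(field)) ** 2).max()
    return peak(phase) / peak(np.zeros_like(phase))

n = 64
y, x = np.mgrid[-1:1:n * 1j, -1:1:n * 1j]
pupil = (x**2 + y**2 <= 1.0).astype(float)   # uniform circular pupil (assumed)

sr_flat = strehl_ratio(np.zeros((n, n)), pupil)           # no aberration
sr_defocus = strehl_ratio(1.5 * (x**2 + y**2), pupil)     # defocus-like phase, radians
```

An unaberrated wavefront gives SR = 1, and any non-planar residual phase gives a value below 1, which is why SR rising toward 1 indicates a successful correction.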

Since S⁻¹, S_m, and MDS₀ are known in advance and do not need to be recalculated or remeasured for each correction, the total number of detector measurements for one aberration correction would be 19. One aberration correction is counted as one iteration in the following results.

Figure 6(a) and Fig. 6(b) show the corresponding results after correction by the method proposed by M. J. Booth and the method in this paper, respectively. The icons “method 1” and “method 2” in the figure correspond to the method proposed by M. J. Booth and our method, respectively. The method proposed by M. J. Booth is referred to as previous works in the following text.

Fig. 6 Results of AO correction using (a) L–Z modes and (b) Zernike modes as the predetermined bias functions; the surfaces of the modes are drawn in Fig. 4. The icons “method 1” and “method 2” in the figure refer to the method proposed by M. J. Booth and the method proposed in this paper.

According to the correction results in Fig. 6(a) and Table 1, our method is more efficient than previous works when the L–Z polynomials are taken as the predetermined bias functions. With previous works, the SR rises slowly once the aberrations become small. Our method, however, works well for both small and large aberrations.

Table 1. SR Results of AO Correction Using L–Z Modes by Our Method and Previous Works

From Fig. 6(b), we see that the correction results of previous works decline when Zernike polynomials are taken as the predetermined bias functions. The correction results of our method, however, remain consistent with those obtained using L–Z polynomials as the biases. This indicates that our method is insensitive to the selected set of functions.

The effect of the bias value (the coefficient α) on the correction results is considered next. We take the L–Z modes and the Zernike functions as the predetermined bias functions and change the value of coefficient α. The curves of such iterations are shown in Fig. 7. As expected from the conclusions of Section 2, our method is insensitive to the choice of bias values.

Fig. 7 Results of AO correction using (a) L–Z and (b) Zernike modes when coefficient α varies from 0.02 to 0.4 λ. The icons “method 1” and “method 2” in the figure refer to the method proposed by M. J. Booth and the method proposed in this paper.

4. Conclusions

A general model-based approach for wavefront sensorless adaptive optics is presented in this paper. The approach is based on the approximate linearity between the MDS and the SM of the wavefront gradient. Because the general method is insensitive to the selected modes, it permits correction of the aberrations by using all kinds of orthogonal aberrations as the predetermined bias functions. Numerical simulations of AO correction of random aberrations have shown that the general method is a fast and stable wavefront sensorless AO correction method for all kinds of modes.

Acknowledgments

The authors thank Wenham Jiang for providing good suggestions. The authors acknowledge helpful suggestions from the reviewers and help from the editors.

References and links

1. B. Wang and M. J. Booth, “Optimum deformable mirror modes for sensorless adaptive optics,” Opt. Commun. 282(23), 4467–4474 (2009).

2. M. A. Vorontsov, G. W. Carhart, M. Cohen, and G. Cauwenberghs, “Adaptive optics based on analog parallel stochastic optimization: analysis and experimental demonstration,” J. Opt. Soc. Am. A 17(8), 1440–1453 (2000).

3. P. Piatrou and M. Roggemann, “Beaconless stochastic parallel gradient descent laser beam control: numerical experiments,” Appl. Opt. 46(27), 6831–6842 (2007).

4. M. J. Booth, “Wavefront sensorless adaptive optics for large aberrations,” Opt. Lett. 32(1), 5–7 (2007).

5. M. J. Booth, “Wave front sensor-less adaptive optics: a model-based approach using sphere packings,” Opt. Express 14(4), 1339–1352 (2006).

6. M. Born and E. Wolf, Principles of Optics, 6th ed. (Pergamon, 1983).

7. J. Braat, “Polynomial expansion of severely aberrated wave fronts,” J. Opt. Soc. Am. A 4(4), 643–650 (1987).

8. R. Noll, “Zernike polynomials and atmospheric turbulence,” J. Opt. Soc. Am. 66(3), 207–211 (1976).

9. J. W. Hardy, Adaptive Optics for Astronomical Telescopes (Oxford Univ. Press, 1998).

Table 1 data (SR after each iteration)

Iterations    Our Method    Previous Works
0             0.07          0.07
1             0.87          0.63
2             0.99          0.94
3             0.99          0.98
4             0.99          0.99

Optics Express