Integral-based parallel algorithm for the fast generation of the Zernike polynomials

Y. H. Hsieh; Y. T. Yu; Y. H. Lai; M. X. Hsieh; Y. F. Chen

doi:10.1364/OE.380567

1. Introduction

In 1934, F. Zernike proposed a sequence of polynomials which are orthogonal and continuous over a unit circle with a finite value at the boundary to examine the figure of a circular mirror [1]. Later, these so-called Zernike polynomials are exploited by B. R. A. Nijboer to describe the small aberrations on the diffraction patterns in the rotationally symmetric system with circular pupils [2]. R. J. Noll further extended the use of Zernike polynomials to discuss the aberrations introduced by atmospheric turbulence [3]. Currently, Zernike polynomials are widely used to analyze the propagation of a wavefront through imaging system with pupils [4–8]. Some typical Zernike polynomials and corresponding types of aberration are shown in the Fig. 1. In addition to the orthogonality, the rotational invariance of Zernike polynomials are utilized by A. Khotanzad to automatically recognize images regardless of its orientation [9], which attracts great explorations in the field of pattern recognition [10–12] and the shape representation [13–15].

Fig. 1. Some typical Zernike polynomials and corresponding types of aberration. Note that the radial functions is multiplied by cosine (sine) terms to present the even (odd) symmetry in azimuthal direction.

Download Full Size | PDF

The computation of Zernike functions Z_n_,m (r) involves the direct evaluation of Zernike radial polynomials R_n_,m (r). Applications involve high order Zernike polynomials [16–24], especially in the pattern reconstruction where radial order could be up to 40-70 [21,22], suffer greatly in both speed and accuracy when the calculation of factorial functions leads to long processing time accompanying by the significant floating error originated from heavy cancellation between components. Many recurrence relations for radial polynomials are derived to reduce the elapsed time under the requirement of numerical accuracy and stability [23–26]. Conventionally, the specific order of R_n_,m (r) can be generated through the recurrence procedure started by using R_m_,m (r)= r^m. To avoid the magnified round-off error of r^m caused by the recurrence process, G. W. Forbes proposed a novel recurrence relation started by giving two values of 1 and r(m + 2)-(m + 1) [26]. I. Kaya and J. Rolland further designed a parallel algorithm based upon this recurrence relation to work on single instruction multiple threads (SIMT) graphics processing units (GPUs) architecture that can achieve improvement more than an order of magnitude in computing time over the sequential implementation [27]. They applied the data-parallel SIMT algorithm to pre-compute the coefficient of the recurrence procedure and created thousands of threads together to simultaneously carry out arithmetic operations on each position points. Nevertheless, the sequential nature of recursive relation heavily restricts the benefit of parallelization. A new approach to calculate the Zernike radial function that can reap the full benefit of parallel computation is thus highly desired.

In the past, an alternative mathematical expression to the radial function is associated with the improper integral of Bessel function, which is hard to be numerically calculated [3]. Recently, the theoretical approach to reconstruct transverse patterns in the laser resonators are derived by using the generating functions as the interference between Gaussian wave packets that moves along the geometric trajectories [28]. The Hermite-Gaussian and Laguerre-Gaussian functions are further represented as an integral of the Gaussian wave packet, which could be in an efficient and precise manner. The fast computation of these special functions inspires us to apply the generating function to express the Bessel function as an integral to obtain a new form of radial functions. In this work, the integral representation of Zernike polynomials is derived as a definite integral of trigonometric functions by exploiting the generating function of the Bessel function as well as the orthogonality of plane waves. According to the Euler-Maclaurin formula, the integral is well approximated by Riemann sums with the rectangular rule. The errors of the Riemann sums are found to averagely be not exceed 3 ×10⁻¹⁴, 3.3×10⁻¹⁴, and 1.8×10⁻¹³ for the radial order up to 30, 50, and 100, respectively. Furthermore, a parallel algorithm based upon the Riemann sums is proposed to calculate a set of radial functions. We implement the algorithm on the GPUs to show a striking increase in processing speed 200 times over the traditional CPU computation.

2. Deriving the integral representation of Zernike radial functions for the fast computation of radial polynomials

Zernike polynomials are a product of angular functions and radial polynomials. In the polar coordinates, the Zernike radial polynomials can be defined as

(1)$$\begin{array}{lll} {Z_{n,m}^{(e)}(r,\theta ),\,\,Z_{n,m}^{(o)}(r,\theta )}& = {\sqrt {\frac{{2(n + 1)}}{{1 + {\delta _{m,0}}}}} {R_{n,m}}(r)\,\left( {\cos (m\theta ),\,\,\sin (m\theta )} \right)}&{\textrm{for}\ r \le 1}\\ {}& = 0&{\textrm{for}\ r > 1} \end{array},$$

where

(2)$${R_{n,m}}(r) = \sum\limits_{s = 0}^{(n - m)/2} {\frac{{{{( - 1)}^s}(n - s)!{r^{n - 2s}}}}{{s!\,[(n + m)/2 - s]!\,[(n - m)/2 - s]!}}} .$$

Here n and m are non-negative integers with n≧m as well as $n - m \in even$. The superscripts (e) and (o) denote the two types of functions with even and odd symmetry. The radial functions derived in Noll’s work [3] is given by:

(3)$${R_{n,m}}(r) = {( - 1)^{(n - m)/2}}\int_0^\infty {{J_{n + 1}}(k)\,{J_m}(kr)\,} dk.$$

Equation (3) is useful for the derivation of the derivatives of Zernike polynomials which play a crucial roles in the Shack-Hartmann wavefront sensor [29]. Nonetheless, Eq. (3) is rarely used to directly compute the radial function because it is an improper integral which involves the product of Bessel functions. Fortunately, the generating function of the Bessel function is intimately related to the plane waves whose improper integral is well known. In the following, the generating function is utilized to express the Bessel function as a function of plane waves to deal with the improper integral in Eq. (3).

The generating function of Bessel function is written as

(4)$${e^{x(t - {t^{ - 1}})/2}} = \sum\limits_{m ={-} \infty }^\infty {{J_m}(x)\,{t^m}} .$$

Let $t = {e^{i\phi }}$, the generating function becomes

(5)$${e^{ix\sin \varphi }} = \sum\limits_{m ={-} \infty }^\infty {{J_m}(x)\,{e^{im\varphi }}} .$$

By utilizing the orthogonal property of the plane waves, the Bessel function of arbitrary order m_o can be expressed as an integral form

(6)$${J_{{m_o}}}(x) = \frac{1}{{2\pi }}\int_0^{2\pi } {{e^{ix\sin \varphi }}{e^{ - i{m_o}\varphi }}d\varphi } .$$

Substituting Eq. (6) into Eq. (3), the Zernike radial function is then given by

(7)$${R_{n,m}}(r) = {( - 1)^{(n - m)/2}}\int_0^\infty {dk\left[ {\frac{1}{{2\pi }}\int_0^{2\pi } {{e^{ik\sin \varphi^{\prime}}}{e^{ - i(n + 1)\varphi^{\prime}}}d\varphi^{\prime}} } \right]} \left[ {\frac{1}{{2\pi }}\int_0^{2\pi } {{e^{ikr\sin \varphi }}{e^{ - im\varphi }}d\varphi } } \right].$$

After rearranging the elements in Eq. (7), the integral of the complex exponential function over k is half the delta function of $\delta (\sin \varphi ^{\prime} + r\sin \varphi )$. The composition of the delta function can be obtained by applying the rule of $\delta (f(\varphi ^{\prime})) = \sum {\delta (\varphi ^{\prime} - \tilde{\varphi })} /|{f^{\prime}(\tilde{\varphi })} |$, where $\tilde{\varphi }$ represents the root of $f(\varphi ^{\prime})$. Substituting the derived delta function into Eq. (7) and integrating over $\varphi ^{\prime}$, the radial function is found to be:

(8)$${R_{n,m}}(r) = \frac{{{{( - 1)}^{(n - m)/2}}}}{{2\pi }}\frac{1}{2}\int_0^{2\pi } {\frac{{{e^{ - i\,m\,\varphi }}[{{e^{ - i(n + 1){\kern 1pt} \tilde{\varphi }}} + {{( - 1)}^n}{e^{i(n + 1){\kern 1pt} \tilde{\varphi }}}} ]}}{{\cos \tilde{\varphi }}}} d\varphi ,$$

where $\tilde{\varphi } = {\sin ^{ - 1}}( - r\sin \varphi )$. Note that the integral of the imaginary part in Eq. (8) is zero since the radial function is real. The term ${( - 1)^{(n - m)/2}}$ can be combined into the integrand in Eq. (8) to yield a simpler integral form with a shift of 0.5π in $\varphi$ and $\tilde{\varphi }$. Then, by exploiting the identity ${\sin ^{ - 1}}[ - r\sin (\varphi + 0.5\pi )] + 0.5\pi = {\cos ^{ - 1}}(r\cos \varphi )$, Eq. (8) is simplified to a definite integral of trigonometric functions:

(9)$${R_{n,m}}(r) = \frac{1}{{2\pi }}\int_0^{2\pi } {\frac{{\cos (m\varphi )\sin [(n + 1){\kern 1pt} \tilde{\varphi }]}}{{\sin \tilde{\varphi }}}} \,d\varphi ,$$

where ${\tilde{\varphi }_j} = {\cos ^{ - 1}}(r\cos {\varphi _j})$.Usually, the definite integral can be well approximated through means such as the Romberg integration, the adaptive quadrature, or the Gauss quadrature [30]. However, when calculating the integral of periodic functions, the Euler-Maclaurin formula converges more rapidly via Riemann sums with the trapezoidal rule than that of any other numerical methods [31]. Specifically, the trapezoidal rule converges to the integral with geometric rapidity which is proportional to the range of regularity for the periodic function. The trapezoidal rule is equal to the rectangular rule when the periodic function is integrated over a period. Hence, the Zernike radial function in Eq. (9) can be approximated by using the Riemann sums with rectangular rules:

(10)$${R_{n,m}}(r) = \frac{1}{N}\sum\limits_{j = 0}^{N - 1} {\frac{{\cos (m{\kern 1pt} {\varphi _j})\sin [(n + 1){\kern 1pt} {{\tilde{\varphi }}_j}]}}{{\sin {{\tilde{\varphi }}_j}}}} ,$$

where ${\varphi _j} = 2\pi j/N$ for $j = 0,1, \cdots ,N - 1$, ${\tilde{\varphi }_j} = {\cos ^{ - 1}}(r\cos {\varphi _j})$, and N is the number of the subintervals. Equation (9) and Eq. (10) are equivalent to the novel result derived by Janssen and Dirksen [32]. They considered Eq. (9) as a discrete Fourier transform of Chebyshev polynomials of the second kind ${U_n}(x) = {{\sin [(n + 1){{\cos }^{ - 1}}(x)]} \mathord{\left/ {\vphantom {{\sin [(n + 1){{\cos }^{ - 1}}(x)]} {\sin [{{\cos }^{ - 1}}(x)]}}} \right.} {\sin [{{\cos }^{ - 1}}(x)]}}$. The computation of radial functions is therefore able to be effectively and precisely computed by using the fast Fourier transform algorithm. The recurrence relation of the Chebyshev polynomials was further applied by Shakibaei and Paramesran to derive an order-independent recurrence relation for the Zernike radial polynomials [33]. The accuracy of polar and Cartesian form of new recurrence relation on a rectangular grid of (x,y) was also well discussed [34]. Nevertheless, it is believed that the derivation in this work can provide pedagogical insight for the application of the generating functions.

Different from the Janssen and Dirksen’s approach to the computation of Eq. (10), we use the π-symmetry property of integrand to further expressed Eq. (10) as

(11)$${R_{n,m}}(r) = \frac{1}{{2N^{\prime} + 1}}\left[ {\frac{{\sin [(n + 1){\kern 1pt} {{\cos }^{ - 1}}(r)]}}{{\sin [{{\cos }^{ - 1}}(r)]}} + 2\sum\limits_{j = 1}^{{N^{\prime}}} {\frac{{\cos (m{\kern 1pt} {\varphi_j})\sin [(n + 1){\kern 1pt} {{\tilde{\varphi }}_j}]}}{{\sin {{\tilde{\varphi }}_j}}}} } \right],$$

where ${N^{\prime}} = (N - 1)/2$. As a result, the summations in Eq. (11) can be carried out with just half the number of intervals with respect to Eq. (10). The sufficient numbers of intervals to approximate the radial function can be determined if the errors between the calculated and standard results are small enough. In general, the calculation under double-precision floating-point format (64-bit precision) can offer precision up to 15-17 digits, while the computation under quadruple-precision format (64-bit precision) can reach as high as 34 decimal digits of precision, which can be considered as a reference to estimate the error in the double-precision format. In other words, the error is estimated by the difference between results calculated in the double and quadruple precision. Figures 2(a)–2(b) demonstrate the dependence of radial function with various order (n,0) and (30,m) on the max error acquired in the spatial interval [0,1] using 10000 subdivision. The green and purple dash-dot lines represent the max error of integral method with N’=(n + m)/4 and (n + m)/2, respectively. It is seen in Fig. 2(a) that higher order of n will lead to larger errors. The increasing, oscillating error is the result of the calculation of ${{\sin [(n + 1){\kern 1pt} {{\tilde{\varphi }}_j}]} \mathord{\left/ {\vphantom {{\sin [(n + 1){\kern 1pt} {{\tilde{\varphi }}_j}]} {\sin {{\tilde{\varphi }}_j}}}} \right.} {\sin {{\tilde{\varphi }}_j}}}$ as ${\tilde{\varphi }_j} \to 0$. In Fig. 2(b), there is a slight decrease of error with increasing m, which is attributed to the error compensation in the multiplication between $\cos (m{\kern 1pt} {\varphi _j})$ and ${{\sin [(n + 1){\kern 1pt} {{\tilde{\varphi }}_j}]} \mathord{\left/ {\vphantom {{\sin [(n + 1){\kern 1pt} {{\tilde{\varphi }}_j}]} {\sin {{\tilde{\varphi }}_j}}}} \right.} {\sin {{\tilde{\varphi }}_j}}}$ with ${\tilde{\varphi }_j} \to 0$. More importantly, it can be found that $N^{\prime} = \lceil{({{n + m)} \mathord{\left/ {\vphantom {{n + m)} 4}} \right.} 4}} \rceil$ is sufficient to show an accurate result, where $\lceil. \rceil$ denotes the ceiling function. Even as many intervals as $N^{\prime} = \lceil{{{(n + m)} \mathord{\left/ {\vphantom {{(n + m)} 2}} \right.} 2}} \rceil$ cannot realize a smaller error, and with $N^{\prime}\;<\;\lceil{({{n + m)} \mathord{\left/ {\vphantom {{n + m)} 4}} \right.} 4}} \rceil$ will result in a more significant amount of error.

Fig. 2. The maximum errors of calculated radial functions with (a) (n,0) and (b) (0,m) by using proposed integral, Kintner’s, and Shakibaei’s methods.

Download Full Size | PDF

To validate the integral method, we compare the result to that comes from the recurrence methods. One of the most typical recurrence relation is proposed by Kintner [25] in the following form:

(12)$$\begin{aligned} {R_{n + 2,m}}(r) &= \frac{{n + 2}}{{{{(n + 2)}^2} - {m^2}}}\\ & \times \left\{ {\left[ {4(n + 1){r^2} - \frac{{{{(n + m)}^2}}}{n} - \frac{{{{(n - m + 2)}^2}}}{{n + 2}}} \right]{R_{n,m}}(r) - \frac{{{n^2} - {m^2}}}{n}{R_{n - 2,m}}(r)} \right\}. \end{aligned}$$

The recurrence procedure is started by using R_m_,m(r)=r^m. No second starting values is needed thanks to the non-existent function R_m-2_,m(r). The other is the new recurrence relation derived by Shakibaei and Paramesran [33], which is given by

(13)$${R_{n,m}}(r) = r[{{R_{n - 1,|{m - 1} |}}(r) + {R_{n - 1,m + 1}}(r)} ]- {R_{n - 2,m}}(r).$$

The recurrence is begun by using R_0,0(r) = 1 under the requirement that R_n_,m(r) = 0 when n < m. The max errors of Kintner’s and Shakibaei’s recurrence methods are displayed in the Fig. 2(a)–2(b) as red and blue solid line. Similar to the integral method, for both methods, max error rises as n increases and drops gradually as m increases. The trend of the error is resulted from the number of recursive runs. Larger number of recursive runs give rise to more accumulative errors. Additionally, Shakibaei’s recurrence relation with order-independent coefficients has less errors than Kintner’s relation with complicated order-dependent ones. It also can be observed that the integral method presents errors on par with both Kintner’s and Shakibaei’s methods. The max errors of integral methods do not exceed 3.3×10⁻¹⁴ and 1.8×10⁻¹³ for the radial order up to 50 and 100, respectively.

In the previous discussion, the sufficient number of intervals in summation to well approximate the integral is found to be $N^{\prime} = \lceil{({{n + m)} \mathord{\left/ {\vphantom {{n + m)} 4}} \right.} 4}} \rceil$. Accordingly, the estimated addition and multiplication number for the computation of a specific order of radial polynomials in Eq. (11) are $2\lceil{{{(n + m)} \mathord{\left/ {\vphantom {{(n + m)} 4}} \right.} 4}} \rceil + 2$ and $6\lceil{{{(n + m)} \mathord{\left/ {\vphantom {{(n + m)} 4}} \right.} 4}} \rceil + 5$, respectively. The number of addition and multiplication required by Kintner’s method is ${{[11(n - m) + 9(n - m - 2)H(n - m - 1)]} \mathord{\left/ {\vphantom {{[11(n - m) + 9(n - m - 2)H(n - m - 1)]} 2}} \right.} 2}$ and $8n - 7m + 5(n - m - 2)H(n - m - 1)$, whereas the number required by Shakibaei’s method is ${{(n + 3m + 4)(n - m + 2)} \mathord{\left/ {\vphantom {{(n + 3m + 4)(n - m + 2)} 4}} \right.} 4} - {{(3n + m + 4)} \mathord{\left/ {\vphantom {{(3n + m + 4)} 2}} \right.} 2}$ and ${{(n + 3m + 4)(n - m + 2)} \mathord{\left/ {\vphantom {{(n + 3m + 4)(n - m + 2)} 8}} \right.} 8}$. Here $H({\cdot} )$ is the unit step function. The required number of additions and multiplication for integral method, Kintner’s method, and Shakibaei’s method with (n,0) and (30,m) is shown in Figs. 3(a)–3(d). It can be seen that the number of operations in both recurrence approaches agrees with the corresponding errors depicted in Fig. 2, which confirms the errors estimation. Another observation is that, when m is not close to n, the number of operations in the integral method is less than other methods. This finding implies integral approaches to be a better way to evaluate a specific order of radial polynomials. It is also worth mentioning that the sufficient number of intervals N’ has spatial dependence.

Fig. 3. The required number of additions and multiplication for integral method, Kintner’s method, and Shakibaei’s method with (a) (n,0) and (b) (30,m).

Download Full Size | PDF

Figures 4–5 depict the sufficient number of intervals N’ required by the integral method for the radial function of (n,0) and (40,m) the corresponding radial functions. It is seen that N’ and r are positively correlated; N’ approaches (n + m)/4 as r increases. Thus, the real number of additions and multiplication in the calculations will be less than the estimation in Fig. 3. In the following, we compare the elapsed time for the computation of a specific order of Zernike radial function between the recurrence methods and the proposed integral approach.

Fig. 4. (a) The spatial dependency of sufficient numbers of intervals N’ in Eq. (10) to accurately compute the radial function of (n,0) and (b) the corresponding radial functions.

Download Full Size | PDF

Fig. 5. (a) The spatial dependency of sufficient numbers of intervals N’ in Eq. (10) to accurately compute the radial function of (40,m) and (b) the corresponding radial functions.

Download Full Size | PDF

The numerical calculation is implemented in MATLAB R2019a under Microsoft Windows environment. The hardware used is a Dell laptop with Intel i7-8750H CPU processor @ 2.20 GHz (up to 4.10 GHz) and 16.0 GB RAM. The CPU elapsed time (execution time) of specific order (n,0) and (30,m) of radial functions for Kintner’s, Shakibaei’s, and integral methods is displayed in the Figs. 6(a)–6(b). The results of elapsed time for Shakibaei’s, and integral methods show a good agreement with the estimated number of operation in Fig. 3 while the elapsed time for Kintner’s method suffers from the time-consuming computation of initial values R_m_,m(r)=r^m to present a worse calculation efficiency even if n = m = 0. Overall, the integral method demonstrates less elapsed time than the recurrence method. The fast generation of a specific order of radial function is helpful for improving the efficiency of obtaining a particular type of aberration. More generally, it is important to efficiently compute all the radial functions for n≦n_max and m≦n, where n_max is the maximum order of n. In the next section, the fast algorithm for computing a set of Zernike radial function is discussed.

Fig. 6. The CPU elapsed time (execution time) of radial functions with (a) (n,0) and (b) (30,m) for proposed integral, Kintner’s, and Shakibaei’s methods.

Download Full Size | PDF

3. Developing an integral-based parallel algorithm to efficiently generate a set of Zernike radial polynomials

Figure 7(a) illustrates the scheme of a typical set of Zernike radial functions. r_i which represents the position in a unit circle is given by r_i=i/(M-1) for $i = 0,1, \cdots ,M - 1$. The set of radial functions is divided into two sub-sets: one with radial order n being even and the other with n being odd. Both sub-sets are shown in Figs. 7(b)–7(c). To efficiently generate the sets of radial functions, a parallel algorithm based on the integral in Eq. (11) is designed to work on the SIMT architecture. The main purpose is to simultaneously compute the radial function not only in the r direction but in all orders of n≦n_max and m≦n.

Fig. 7. The scheme of (a) a typical set of Zernike radial functions and two sub-set with radial order n being (b) even or (c) odd, respectively.

Download Full Size | PDF

To begin with, the matrices r, $\boldsymbol{\varphi }$, and $\boldsymbol{\tilde{\varphi }}$ are generated as pre-computed input variables for the integral. Here $\boldsymbol{\varphi }$ and ${\tilde{\varphi }}$ are given by ${\varphi _j} = 2\pi j/(2{N^{\prime}} + 1)$ and ${\tilde{\varphi }_{j,i}} = {\cos ^{ - 1}}[{r_i}\cos ({\varphi _j})]$, respectively. N’ is set to be n_max/2 for convenience. All cells of each matrix are calculated at once through creating as many threads as the points in the grids. Next, these matrices are put into the flow for calculating the sub-set of radial function as shown in Fig. 8. Thanks to the independent feature of the Riemann sums, all values of integrand in Eq. (11) in subintervals for n≦n_max and m≦n can be computed in parallel before summing over index j. For the case of sub-set with even or odd n in Fig. 7, the integrand in Eq. (11) is firstly represented by two matrices ${A_{n,j,i}} = {{\cos [(n + 1){{\tilde{\varphi }}_{j,i}}]} \mathord{\left/ {\vphantom {{\cos [(n + 1){{\tilde{\varphi }}_{j,i}}]} {\cos ({{\tilde{\varphi }}_{j,i}})}}} \right.} {\cos ({{\tilde{\varphi }}_{j,i}})}}$ and ${B_{j,m}} = \cos (m{\varphi _j})$. Each plane of A is simultaneously multiplied by B to yield a lower triangular matrix C, i.e. ${C_{n,m,i}} = \sum\nolimits_{j = 1}^{{N^{\prime}}} {\,{A_{n,j,i}}\,{B_{j,m,i}}}$ which represents the summation term in Eq. (11). Define ${D_{n,m,i}} = {{\sin [(n + 1){\kern 1pt} {{\cos }^{ - 1}}({r_i})]} \mathord{\left/ {\vphantom {{\sin [(n + 1){\kern 1pt} {{\cos }^{ - 1}}({r_i})]} {\sin ({{\cos }^{ - 1}}({r_i}))}}} \right.} {\sin ({{\cos }^{ - 1}}({r_i}))}}$, the radial functions R_n,m,i are obtained by calculating ${R_{n,m,i}} = {{({D_{n,m,i}} + 2C_{n,m,i}^{})} \mathord{\left/ {\vphantom {{({D_{n,m,i}} + 2C_{n,m,i}^{})} {(2{N^{\prime}} + 1)}}} \right.} {(2{N^{\prime}} + 1)}}$.

Fig. 8. The flow of parallel calculating the sub-set of radial function with even or odd n based on the integral method in Eq. (11).

Download Full Size | PDF

The above algorithm involves massive but simple algebra associated with trigonometric functions, which is suitable for being implemented on commodity graphics hardware. The GPU used in this work is NVIDA GeForce GTX 1060 6 GB with max-q design. The GTX 1060 GPU has 10 multiprocessors, each designed to execute thousands of thread concurrently. In addition, the GPU can support up to 10 (multiprocessors) x 64 (wraps/multiprocessor) x 32 (threads/wrap) = 20480 resident concurrent threads. The MATLAB programming language which can perform computing on NVIDA CUDA-enabled GPUs is exploited to run the established parallel algorithm. The MATLAB code for the algorithm is provided in the Appendix. The most important investigation is to explore the influence of the maximum order n_max on the computation time. The elapsed time for calculating a set of radial functions by directly applying Eq. (11) to work on CPU and parallel algorithm implemented on GPU is shown in Fig. 9(a). The spatial subdivisions M is set to be 1000. Naturally, the elapsed time increases as the maximum order n_max increases. It is clearly seen that the computationally expensive operation for higher n_max causes a steep increase in the CPU elapsed time. On the other side, the proposed parallel algorithm implemented on GPU shows a nearly unchanged ultra-short elapsed time owing to GPU’s ability to execute more instructions in a clock cycle. This result also indicates that the level of parallelization is increased when n_max becomes higher. Shakibaei’s recurrence method is selected for the comparison. The algorithm of recurrence relation cannot be parallelized in (n,m) but still can be parallelized in spatial domain. Figure 9(b) displays the elapsed time for calculating a set of radial functions by directly applying the recurrence relation in Eq. (13) to work on CPU and parallel algorithm implemented on GPU. It is not surprised that the parallelization of the recurrence method shows worse efficiency than the original sequential calculations since the benefit from the parallel computing by using GPUs in spatial domain cannot overcome the slower speed of GPU in the evaluation of recursive runs. To obtain a higher efficiency, the spatial subdivision M should be much more than n_max. The influence of number of spatial subdivisions M on the elapsed time to compute a set of radial function with n_max=30 is shown in Fig. 9(c). It can be found that the elapsed time for the GPUs is less than that for CPUs when subdivision M is larger than 25000. Finally, when n_max is larger than 30, the parallel algorithm of the proposed integral method exhibits the shortest processing time.

Fig. 9. The elapsed time for generating a set of radial functions by directly applying (a) Eq. (11) and (b) Eq. (13) to work on CPU and the corresponding parallel algorithm implemented on GPU. (c) The influence of number of spatial subdivisions M on the elapsed time to compute a set of radial function with n_max=30.

Download Full Size | PDF

4. Conclusions

In conclusion, we have exploited the Riemann sums to approximate the integral representation of the Zernike radial functions, which can demonstrate a surprisingly rapid convergence. We then have numerically investigated the accuracy of the Riemann sums that showed the errors averagely do not exceed 3 ×10⁻¹⁴, 3.3×10⁻¹⁴, and 1.8×10⁻¹³ for the radial order up to 30, 50, and 100, respectively. Furthermore, we have proposed a parallel algorithm based on the Riemann sums to simultaneously generate a set of radial functions. With the aid of the GPUs, the algorithm shows a speedup ratio up to 200 over the traditional CPU computation. The generated set of Zernike radial functions is useful for calculating the principal components of aerial images to extract the aberration coefficients [35,36] and the Zernike moments of input image for the pattern recognition [9,23,24].

Appendix

The MATLAB code for implementing the proposed parallel algorithm on GPU is shown in the following:

$ \color{green}{\%\;\textrm{Assuming nmax is even, the code will generate all orders of radial functions for n}<=\textrm{nmax}+1}$

$ \color{green}{\%\;\textrm{and m}<=n}$

nmax = 100;

M = 1000;

$ \color{green}{\%\;\textrm{Calculate the input variables}}$

r = gpuArray(0:1/(M-1):1);

N0 = ceil(nmax/2) + 1;

N = 2*N0 + 1;

delta = 2*pi()/N;

o = nmax/2 + 1;

npre = ones(o,1,$ \color{purple}{\textrm{'gpuArray'}}$);

nmat = 2*cumsum(npre)-2*npre;

phi = gpuArray(delta:delta:delta*N0).’;

phi1 = reshape(arrayfun(@acos,arrayfun(@cos,phi)*r),1,M*N0);

$ \color{green}{\%\textrm{Calculate j}=0\;\textrm{term in Eq.(10), which is denoted by D matrix as shown in Fig.8}}$.

phi0 = arrayfun(@acos,r);

T0 = arrayfun(@sin,phi0);

DE = reshape(arrayfun(@sin,((nmat + 1)*phi0))./T0,o,1,M);

DO = reshape(arrayfun(@sin,((nmat + 2)*phi0))./T0,o,1,M);

$ \color{green}{\%\textrm{Calculate the A,B matrices for even and odd n described in Figs. 8. E}}$

$ \color{green}{\%\textrm{and O in here denote even and odd n, respectively.}}$

T = arrayfun(@sin,npre*phi1);

AE = reshape(arrayfun(@sin,(nmat + 1)*phi1)./T,o,N0,M);

AO = reshape(arrayfun(@sin,(nmat + 2)*phi1)./T,o,N0,M);

nmat = nmat.’;

BE = arrayfun(@cos,phi*nmat);

BO = arrayfun(@cos,phi*(nmat + 1));

$ \color{green}{\%\textrm{Calculate the RE and RO matrices.}}$

RE=(pagefun(@mtimes,AE,2*BE)+DE)/N;

RO=(pagefun(@mtimes,AO,2*BO)+DO)/N;

$ \color{green}{\%\textrm{Avoid the singularities when r = 1.}}$

RE(:,:,M) = 1;

RO(:,:,M) = 1;

Funding

Ministry of Science and Technology, Taiwan (108-2119-M-009-005).

Disclosures

The authors declare no conflicts of interest.

References

1. F. Zernike, “Beugungstheorie des Schneidenverfahrens und seiner verbesserten Form, der Phasenkontrastmethode,” Physica 1(7-12), 689–704 (1934). [CrossRef]

2. B. R. A. Nijboer, “The diffraction theory of optical aberrations. Part II: Diffraction pattern in the presence of small aberrations,” Physica 13(10), 605–620 (1947). [CrossRef]

3. R. J. Noll, “Zernike polynomials and atmospheric turbulence,” J. Opt. Soc. Am. 66(3), 207–211 (1976). [CrossRef]

4. D. M. Winker, “Effect of a finite outer scale on the Zernike decomposition of atmospheric optical turbulence,” J. Opt. Soc. Am. A 8(10), 1568–1574 (1991). [CrossRef]

5. D. R. Iskander, M. J. Collins, and B. Davis, “Optimal modeling of corneal surfaces with Zernike polynomials,” IEEE Trans. Biomed. Eng. 48(1), 87–95 (2001). [CrossRef]

6. V. Lakshminarayanan and A. Fleck, “Zernike polynomials: a guide,” J. Mod. Opt. 58(7), 545–561 (2011). [CrossRef]

7. R. W. Gray, C. Dunn, K. P. Thompson, and J. P. Rolland, “An analytic expression for the field dependence of Zernike polynomials in rotationally symmetric optical systems,” Opt. Express 20(15), 16436–16449 (2012). [CrossRef]

8. C. J. R. Sheppard, “Balanced diffraction aberrations, independent of the observation point: application to a tilted dielectric plate,” J. Opt. Soc. Am. A 30(10), 2150–2161 (2013). [CrossRef]

9. A. Khotanzad and Y. H. Hong, “Invariant image recognition by Zernike moments,” IEEE Trans. Pattern Anal. Machine Intell. 12(5), 489–497 (1990). [CrossRef]

10. P. M. Patil and T. R. Sontakke, “Rotation, scale, and translation invariant handwritten Devanagari numeral character recognition using general fuzzy neural network,” Pattern Recognit. 40(7), 2110–2117 (2007). [CrossRef]

11. L. Wang and G. Healey, “Using Zernike moments for the illumination and geometry invariant classification of multispectral texture,” IEEE Trans. on Image Process. 7(2), 196–203 (1998). [CrossRef]

12. Y. H. Pang, A. T. B. Toeh, D. C. L. Ngo, and H. F. San, “Palmprint verification with moments,” J. Comput. Graph. Vis. Comput. Vis. 12(2), 325–332 (2004).

13. D. Zhang and G. Lu, “Review of shape representation and description techniques,” Pattern Recognit. 37(1), 1–19 (2004). [CrossRef]

14. H. S. Kim and H. K. Lee, “Invariant Image Watermark Using Zernike Moments,” IEEE Trans. Circuits Syst. Video Technol. 13(8), 766–775 (2003). [CrossRef]

15. P. T. Yap, X. Jiang, and A. C. Kot, “Two-dimensional polar harmonic transforms for invariant image representation,” IEEE Trans. Pattern Anal. Mach. Intell. 32(7), 1259–1270 (2010). [CrossRef]

16. R. K. Tyson, “Conversion of Zernike aberration coefficients to Seidel and higher-order power-series aberration coefficients,” Opt. Lett. 7(6), 262–264 (1982). [CrossRef]

17. J. J. M. Braat and A. J. E. M. Janssen, “Double Zernike expansion of the optical aberration function from its power series expansion,” J. Opt. Soc. Am. A 30(6), 1213–1222 (2013). [CrossRef]

18. W. Liu, S. Liu, T. Zhou, and L. Wang, “Aerial image based technique for measurement of lens aberrations up to 37th Zernike coefficient in lithographic tools under partial coherent illumination,” Opt. Express 17(21), 19278–19291 (2009). [CrossRef]

19. S. Liu, X. Zhou, W. Lv, S. Xu, and H. Wei, “Convolution-variation separation method for efficient modeling of optical lithography,” Opt. Lett. 38(13), 2168–2170 (2013). [CrossRef]

20. S. Liu, S. Xu, X. Wu, and W. Liu, “Iterative method for in situ measurement of lens aberrations in lithographic tools using CTC-based quadratic aberration model,” Opt. Express 20(13), 14272–14283 (2012). [CrossRef]

21. X. Y. Wang, W. Y. Li, H. Y. Yang, and P. P. Niu, “Y.W. Li Invariant quaternion radial harmonic Fourier moments for color image retrieval,” Opt. Laser Technol. 66, 78–88 (2015). [CrossRef]

22. G. Amayeh, A. Erol, G. Bebis, and M. Nicolescu, “Accurate and efficient computation of high order Zernike moments,” Proceedings Lecture Notes in Computer Science 3804, 462–469 (Springer-Verlag, 2005).

23. C. Singh and E. Walia, “Fast and numerically stable methods for the computation of Zernike moments,” Pattern Recognit. 43(7), 2497–2506 (2010). [CrossRef]

24. H. F. Qin and L. Qin, “A parallel recurrence method for the fast computation of Zernike moments,” Appl. Math. Comput. 219(4), 1549–1561 (2012). [CrossRef]

25. E. C. Kintner, “A recurrence relation for calculating the Zernike polynomials,” Opt. Acta 23(6), 499–500 (1976). [CrossRef]

26. G. W. Forbes, “Robust and fast computation for the polynomials of optics,” Opt. Express 18(13), 13851–13862 (2010). [CrossRef]

27. I. Kaya and J. P. Rolland, “Acceleration of computation of φ-polynomials,” Opt. Express 21(23), 29065–29072 (2013). [CrossRef]

28. Y. F. Chen, S. C. Li, Y. H. Hsieh, J. C. Tung, H. C. Liang, and K. F. Huang, “Laser wave-packet representation to unify eigenmodes and geometric modes in spherical cavities,” Opt. Lett. 44(11), 2649–2652 (2019). [CrossRef]

29. B. C. Platt and R. Shack, “History and priciples of Shack-Hartmann wavefront sensing,” J. Refract. Surg. 17(5), S573–S577 (2001). [CrossRef]

30. S. C. Chapra and R. P. Canale, Numerical Methods for Engineers, (McGraw-Hill, Singapore, 1998), 631–647.

31. P. J. Davis and P. Rabinowitz, Methods of Numerical Integration, Academic, FL, Orlando, (1984), 134–146.

32. A. J. E. M. Janssen and P. Dirksen, “Computing Zernike polynomials of arbitrary degree using the discrete Fourier transform,” JEOS:RP 2, 07012 (2007). [CrossRef]

33. B. Honarvar Shakibaei and R. Paramesran, “Recursive formula to compute Zernike radial polynomials,” Opt. Lett. 38(14), 2487–2489 (2013). [CrossRef]

34. T. B. Andersen, “Efficient and robust recurrence relations for the Zernike circle polynomials and their derivatives in Cartesian coordinates,” Opt. Express 26(15), 18878–18896 (2018). [CrossRef]

35. L. Duan, X. Wang, A. Y. Bourov, B. Peng, and P. Bu, “In situ aberration measurement technique based on principal component analysis of aerial image,” Opt. Express 19(19), 18080–18090 (2011). [CrossRef]

36. J. Yang, X. Wang, S. Li, L. Duan, D. Xu, A. Bourov, and A. Erdmann, “High order aberration measurement technique based on quadratic Zernike model with optimized source,” Opt. Eng. 52(5), 053603 (2013). [CrossRef]

Integral-based parallel algorithm for the fast generation of the Zernike polynomials

Abstract

1. Introduction

2. Deriving the integral representation of Zernike radial functions for the fast computation of radial polynomials

3. Developing an integral-based parallel algorithm to efficiently generate a set of Zernike radial polynomials

4. Conclusions

Appendix

Funding

Disclosures

References

Cited By

Figures (9)

Equations (13)

Optics Express