Abstract
The integral representation of the Zernike radial functions is well approximated by applying the Riemann sums with a surprisingly rapid convergence. The errors of the Riemann sums are found to averagely be not exceed 3 ×10−14, 3.3×10−14, and 1.8×10−13 for the radial order up to 30, 50, and 100, respectively. Moreover, a parallel algorithm based on the Riemann sums is proposed to directly generate a set of radial functions. With the aid of the graphics processing units (GPUs), the algorithm shows an acceleration ratio up to 200-fold over the traditional CPU computation. The fast generation for a set of Zernike radial polynomials is expected to be valuable in further applications, such as the aberration analysis and the pattern recognition.
© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement
1. Introduction
In 1934, F. Zernike proposed a sequence of polynomials which are orthogonal and continuous over a unit circle with a finite value at the boundary to examine the figure of a circular mirror [1]. Later, these so-called Zernike polynomials are exploited by B. R. A. Nijboer to describe the small aberrations on the diffraction patterns in the rotationally symmetric system with circular pupils [2]. R. J. Noll further extended the use of Zernike polynomials to discuss the aberrations introduced by atmospheric turbulence [3]. Currently, Zernike polynomials are widely used to analyze the propagation of a wavefront through imaging system with pupils [4–8]. Some typical Zernike polynomials and corresponding types of aberration are shown in the Fig. 1. In addition to the orthogonality, the rotational invariance of Zernike polynomials are utilized by A. Khotanzad to automatically recognize images regardless of its orientation [9], which attracts great explorations in the field of pattern recognition [10–12] and the shape representation [13–15].
The computation of Zernike functions Zn,m (r) involves the direct evaluation of Zernike radial polynomials Rn,m (r). Applications involve high order Zernike polynomials [16–24], especially in the pattern reconstruction where radial order could be up to 40-70 [21,22], suffer greatly in both speed and accuracy when the calculation of factorial functions leads to long processing time accompanying by the significant floating error originated from heavy cancellation between components. Many recurrence relations for radial polynomials are derived to reduce the elapsed time under the requirement of numerical accuracy and stability [23–26]. Conventionally, the specific order of Rn,m (r) can be generated through the recurrence procedure started by using Rm,m (r)= rm. To avoid the magnified round-off error of rm caused by the recurrence process, G. W. Forbes proposed a novel recurrence relation started by giving two values of 1 and r(m + 2)-(m + 1) [26]. I. Kaya and J. Rolland further designed a parallel algorithm based upon this recurrence relation to work on single instruction multiple threads (SIMT) graphics processing units (GPUs) architecture that can achieve improvement more than an order of magnitude in computing time over the sequential implementation [27]. They applied the data-parallel SIMT algorithm to pre-compute the coefficient of the recurrence procedure and created thousands of threads together to simultaneously carry out arithmetic operations on each position points. Nevertheless, the sequential nature of recursive relation heavily restricts the benefit of parallelization. A new approach to calculate the Zernike radial function that can reap the full benefit of parallel computation is thus highly desired.
In the past, an alternative mathematical expression to the radial function is associated with the improper integral of Bessel function, which is hard to be numerically calculated [3]. Recently, the theoretical approach to reconstruct transverse patterns in the laser resonators are derived by using the generating functions as the interference between Gaussian wave packets that moves along the geometric trajectories [28]. The Hermite-Gaussian and Laguerre-Gaussian functions are further represented as an integral of the Gaussian wave packet, which could be in an efficient and precise manner. The fast computation of these special functions inspires us to apply the generating function to express the Bessel function as an integral to obtain a new form of radial functions. In this work, the integral representation of Zernike polynomials is derived as a definite integral of trigonometric functions by exploiting the generating function of the Bessel function as well as the orthogonality of plane waves. According to the Euler-Maclaurin formula, the integral is well approximated by Riemann sums with the rectangular rule. The errors of the Riemann sums are found to averagely be not exceed 3 ×10−14, 3.3×10−14, and 1.8×10−13 for the radial order up to 30, 50, and 100, respectively. Furthermore, a parallel algorithm based upon the Riemann sums is proposed to calculate a set of radial functions. We implement the algorithm on the GPUs to show a striking increase in processing speed 200 times over the traditional CPU computation.
2. Deriving the integral representation of Zernike radial functions for the fast computation of radial polynomials
Zernike polynomials are a product of angular functions and radial polynomials. In the polar coordinates, the Zernike radial polynomials can be defined as
The generating function of Bessel function is written as
Let $t = {e^{i\phi }}$, the generating function becomes By utilizing the orthogonal property of the plane waves, the Bessel function of arbitrary order mo can be expressed as an integral formDifferent from the Janssen and Dirksen’s approach to the computation of Eq. (10), we use the π-symmetry property of integrand to further expressed Eq. (10) as
To validate the integral method, we compare the result to that comes from the recurrence methods. One of the most typical recurrence relation is proposed by Kintner [25] in the following form:
In the previous discussion, the sufficient number of intervals in summation to well approximate the integral is found to be $N^{\prime} = \lceil{({{n + m)} \mathord{\left/ {\vphantom {{n + m)} 4}} \right.} 4}} \rceil$. Accordingly, the estimated addition and multiplication number for the computation of a specific order of radial polynomials in Eq. (11) are $2\lceil{{{(n + m)} \mathord{\left/ {\vphantom {{(n + m)} 4}} \right.} 4}} \rceil + 2$ and $6\lceil{{{(n + m)} \mathord{\left/ {\vphantom {{(n + m)} 4}} \right.} 4}} \rceil + 5$, respectively. The number of addition and multiplication required by Kintner’s method is ${{[11(n - m) + 9(n - m - 2)H(n - m - 1)]} \mathord{\left/ {\vphantom {{[11(n - m) + 9(n - m - 2)H(n - m - 1)]} 2}} \right.} 2}$ and $8n - 7m + 5(n - m - 2)H(n - m - 1)$, whereas the number required by Shakibaei’s method is ${{(n + 3m + 4)(n - m + 2)} \mathord{\left/ {\vphantom {{(n + 3m + 4)(n - m + 2)} 4}} \right.} 4} - {{(3n + m + 4)} \mathord{\left/ {\vphantom {{(3n + m + 4)} 2}} \right.} 2}$ and ${{(n + 3m + 4)(n - m + 2)} \mathord{\left/ {\vphantom {{(n + 3m + 4)(n - m + 2)} 8}} \right.} 8}$. Here $H({\cdot} )$ is the unit step function. The required number of additions and multiplication for integral method, Kintner’s method, and Shakibaei’s method with (n,0) and (30,m) is shown in Figs. 3(a)–3(d). It can be seen that the number of operations in both recurrence approaches agrees with the corresponding errors depicted in Fig. 2, which confirms the errors estimation. Another observation is that, when m is not close to n, the number of operations in the integral method is less than other methods. This finding implies integral approaches to be a better way to evaluate a specific order of radial polynomials. It is also worth mentioning that the sufficient number of intervals N’ has spatial dependence.
Figures 4–5 depict the sufficient number of intervals N’ required by the integral method for the radial function of (n,0) and (40,m) the corresponding radial functions. It is seen that N’ and r are positively correlated; N’ approaches (n + m)/4 as r increases. Thus, the real number of additions and multiplication in the calculations will be less than the estimation in Fig. 3. In the following, we compare the elapsed time for the computation of a specific order of Zernike radial function between the recurrence methods and the proposed integral approach.
The numerical calculation is implemented in MATLAB R2019a under Microsoft Windows environment. The hardware used is a Dell laptop with Intel i7-8750H CPU processor @ 2.20 GHz (up to 4.10 GHz) and 16.0 GB RAM. The CPU elapsed time (execution time) of specific order (n,0) and (30,m) of radial functions for Kintner’s, Shakibaei’s, and integral methods is displayed in the Figs. 6(a)–6(b). The results of elapsed time for Shakibaei’s, and integral methods show a good agreement with the estimated number of operation in Fig. 3 while the elapsed time for Kintner’s method suffers from the time-consuming computation of initial values Rm,m(r)=rm to present a worse calculation efficiency even if n = m = 0. Overall, the integral method demonstrates less elapsed time than the recurrence method. The fast generation of a specific order of radial function is helpful for improving the efficiency of obtaining a particular type of aberration. More generally, it is important to efficiently compute all the radial functions for n≦nmax and m≦n, where nmax is the maximum order of n. In the next section, the fast algorithm for computing a set of Zernike radial function is discussed.
3. Developing an integral-based parallel algorithm to efficiently generate a set of Zernike radial polynomials
Figure 7(a) illustrates the scheme of a typical set of Zernike radial functions. ri which represents the position in a unit circle is given by ri=i/(M-1) for $i = 0,1, \cdots ,M - 1$. The set of radial functions is divided into two sub-sets: one with radial order n being even and the other with n being odd. Both sub-sets are shown in Figs. 7(b)–7(c). To efficiently generate the sets of radial functions, a parallel algorithm based on the integral in Eq. (11) is designed to work on the SIMT architecture. The main purpose is to simultaneously compute the radial function not only in the r direction but in all orders of n≦nmax and m≦n.
To begin with, the matrices r, $\boldsymbol{\varphi }$, and $\boldsymbol{\tilde{\varphi }}$ are generated as pre-computed input variables for the integral. Here $\boldsymbol{\varphi }$ and ${\tilde{\varphi }}$ are given by ${\varphi _j} = 2\pi j/(2{N^{\prime}} + 1)$ and ${\tilde{\varphi }_{j,i}} = {\cos ^{ - 1}}[{r_i}\cos ({\varphi _j})]$, respectively. N’ is set to be nmax/2 for convenience. All cells of each matrix are calculated at once through creating as many threads as the points in the grids. Next, these matrices are put into the flow for calculating the sub-set of radial function as shown in Fig. 8. Thanks to the independent feature of the Riemann sums, all values of integrand in Eq. (11) in subintervals for n≦nmax and m≦n can be computed in parallel before summing over index j. For the case of sub-set with even or odd n in Fig. 7, the integrand in Eq. (11) is firstly represented by two matrices ${A_{n,j,i}} = {{\cos [(n + 1){{\tilde{\varphi }}_{j,i}}]} \mathord{\left/ {\vphantom {{\cos [(n + 1){{\tilde{\varphi }}_{j,i}}]} {\cos ({{\tilde{\varphi }}_{j,i}})}}} \right.} {\cos ({{\tilde{\varphi }}_{j,i}})}}$ and ${B_{j,m}} = \cos (m{\varphi _j})$. Each plane of A is simultaneously multiplied by B to yield a lower triangular matrix C, i.e. ${C_{n,m,i}} = \sum\nolimits_{j = 1}^{{N^{\prime}}} {\,{A_{n,j,i}}\,{B_{j,m,i}}}$ which represents the summation term in Eq. (11). Define ${D_{n,m,i}} = {{\sin [(n + 1){\kern 1pt} {{\cos }^{ - 1}}({r_i})]} \mathord{\left/ {\vphantom {{\sin [(n + 1){\kern 1pt} {{\cos }^{ - 1}}({r_i})]} {\sin ({{\cos }^{ - 1}}({r_i}))}}} \right.} {\sin ({{\cos }^{ - 1}}({r_i}))}}$, the radial functions Rn,m,i are obtained by calculating ${R_{n,m,i}} = {{({D_{n,m,i}} + 2C_{n,m,i}^{})} \mathord{\left/ {\vphantom {{({D_{n,m,i}} + 2C_{n,m,i}^{})} {(2{N^{\prime}} + 1)}}} \right.} {(2{N^{\prime}} + 1)}}$.
The above algorithm involves massive but simple algebra associated with trigonometric functions, which is suitable for being implemented on commodity graphics hardware. The GPU used in this work is NVIDA GeForce GTX 1060 6 GB with max-q design. The GTX 1060 GPU has 10 multiprocessors, each designed to execute thousands of thread concurrently. In addition, the GPU can support up to 10 (multiprocessors) x 64 (wraps/multiprocessor) x 32 (threads/wrap) = 20480 resident concurrent threads. The MATLAB programming language which can perform computing on NVIDA CUDA-enabled GPUs is exploited to run the established parallel algorithm. The MATLAB code for the algorithm is provided in the Appendix. The most important investigation is to explore the influence of the maximum order nmax on the computation time. The elapsed time for calculating a set of radial functions by directly applying Eq. (11) to work on CPU and parallel algorithm implemented on GPU is shown in Fig. 9(a). The spatial subdivisions M is set to be 1000. Naturally, the elapsed time increases as the maximum order nmax increases. It is clearly seen that the computationally expensive operation for higher nmax causes a steep increase in the CPU elapsed time. On the other side, the proposed parallel algorithm implemented on GPU shows a nearly unchanged ultra-short elapsed time owing to GPU’s ability to execute more instructions in a clock cycle. This result also indicates that the level of parallelization is increased when nmax becomes higher. Shakibaei’s recurrence method is selected for the comparison. The algorithm of recurrence relation cannot be parallelized in (n,m) but still can be parallelized in spatial domain. Figure 9(b) displays the elapsed time for calculating a set of radial functions by directly applying the recurrence relation in Eq. (13) to work on CPU and parallel algorithm implemented on GPU. It is not surprised that the parallelization of the recurrence method shows worse efficiency than the original sequential calculations since the benefit from the parallel computing by using GPUs in spatial domain cannot overcome the slower speed of GPU in the evaluation of recursive runs. To obtain a higher efficiency, the spatial subdivision M should be much more than nmax. The influence of number of spatial subdivisions M on the elapsed time to compute a set of radial function with nmax=30 is shown in Fig. 9(c). It can be found that the elapsed time for the GPUs is less than that for CPUs when subdivision M is larger than 25000. Finally, when nmax is larger than 30, the parallel algorithm of the proposed integral method exhibits the shortest processing time.
4. Conclusions
In conclusion, we have exploited the Riemann sums to approximate the integral representation of the Zernike radial functions, which can demonstrate a surprisingly rapid convergence. We then have numerically investigated the accuracy of the Riemann sums that showed the errors averagely do not exceed 3 ×10−14, 3.3×10−14, and 1.8×10−13 for the radial order up to 30, 50, and 100, respectively. Furthermore, we have proposed a parallel algorithm based on the Riemann sums to simultaneously generate a set of radial functions. With the aid of the GPUs, the algorithm shows a speedup ratio up to 200 over the traditional CPU computation. The generated set of Zernike radial functions is useful for calculating the principal components of aerial images to extract the aberration coefficients [35,36] and the Zernike moments of input image for the pattern recognition [9,23,24].
Appendix
The MATLAB code for implementing the proposed parallel algorithm on GPU is shown in the following:
$ \color{green}{\%\;\textrm{Assuming nmax is even, the code will generate all orders of radial functions for n}<=\textrm{nmax}+1}$
$ \color{green}{\%\;\textrm{and m}<=n}$
nmax = 100;
M = 1000;
$ \color{green}{\%\;\textrm{Calculate the input variables}}$
r = gpuArray(0:1/(M-1):1);
N0 = ceil(nmax/2) + 1;
N = 2*N0 + 1;
delta = 2*pi()/N;
o = nmax/2 + 1;
npre = ones(o,1,$ \color{purple}{\textrm{'gpuArray'}}$);
nmat = 2*cumsum(npre)-2*npre;
phi = gpuArray(delta:delta:delta*N0).’;
phi1 = reshape(arrayfun(@acos,arrayfun(@cos,phi)*r),1,M*N0);
$ \color{green}{\%\textrm{Calculate j}=0\;\textrm{term in Eq.(10), which is denoted by D matrix as shown in Fig.8}}$.
phi0 = arrayfun(@acos,r);
T0 = arrayfun(@sin,phi0);
DE = reshape(arrayfun(@sin,((nmat + 1)*phi0))./T0,o,1,M);
DO = reshape(arrayfun(@sin,((nmat + 2)*phi0))./T0,o,1,M);
$ \color{green}{\%\textrm{Calculate the A,B matrices for even and odd n described in Figs. 8. E}}$
$ \color{green}{\%\textrm{and O in here denote even and odd n, respectively.}}$
T = arrayfun(@sin,npre*phi1);
AE = reshape(arrayfun(@sin,(nmat + 1)*phi1)./T,o,N0,M);
AO = reshape(arrayfun(@sin,(nmat + 2)*phi1)./T,o,N0,M);
nmat = nmat.’;
BE = arrayfun(@cos,phi*nmat);
BO = arrayfun(@cos,phi*(nmat + 1));
$ \color{green}{\%\textrm{Calculate the RE and RO matrices.}}$
RE=(pagefun(@mtimes,AE,2*BE)+DE)/N;
RO=(pagefun(@mtimes,AO,2*BO)+DO)/N;
$ \color{green}{\%\textrm{Avoid the singularities when r = 1.}}$
RE(:,:,M) = 1;
RO(:,:,M) = 1;
Funding
Ministry of Science and Technology, Taiwan (108-2119-M-009-005).
Disclosures
The authors declare no conflicts of interest.
References
1. F. Zernike, “Beugungstheorie des Schneidenverfahrens und seiner verbesserten Form, der Phasenkontrastmethode,” Physica 1(7-12), 689–704 (1934). [CrossRef]
2. B. R. A. Nijboer, “The diffraction theory of optical aberrations. Part II: Diffraction pattern in the presence of small aberrations,” Physica 13(10), 605–620 (1947). [CrossRef]
3. R. J. Noll, “Zernike polynomials and atmospheric turbulence,” J. Opt. Soc. Am. 66(3), 207–211 (1976). [CrossRef]
4. D. M. Winker, “Effect of a finite outer scale on the Zernike decomposition of atmospheric optical turbulence,” J. Opt. Soc. Am. A 8(10), 1568–1574 (1991). [CrossRef]
5. D. R. Iskander, M. J. Collins, and B. Davis, “Optimal modeling of corneal surfaces with Zernike polynomials,” IEEE Trans. Biomed. Eng. 48(1), 87–95 (2001). [CrossRef]
6. V. Lakshminarayanan and A. Fleck, “Zernike polynomials: a guide,” J. Mod. Opt. 58(7), 545–561 (2011). [CrossRef]
7. R. W. Gray, C. Dunn, K. P. Thompson, and J. P. Rolland, “An analytic expression for the field dependence of Zernike polynomials in rotationally symmetric optical systems,” Opt. Express 20(15), 16436–16449 (2012). [CrossRef]
8. C. J. R. Sheppard, “Balanced diffraction aberrations, independent of the observation point: application to a tilted dielectric plate,” J. Opt. Soc. Am. A 30(10), 2150–2161 (2013). [CrossRef]
9. A. Khotanzad and Y. H. Hong, “Invariant image recognition by Zernike moments,” IEEE Trans. Pattern Anal. Machine Intell. 12(5), 489–497 (1990). [CrossRef]
10. P. M. Patil and T. R. Sontakke, “Rotation, scale, and translation invariant handwritten Devanagari numeral character recognition using general fuzzy neural network,” Pattern Recognit. 40(7), 2110–2117 (2007). [CrossRef]
11. L. Wang and G. Healey, “Using Zernike moments for the illumination and geometry invariant classification of multispectral texture,” IEEE Trans. on Image Process. 7(2), 196–203 (1998). [CrossRef]
12. Y. H. Pang, A. T. B. Toeh, D. C. L. Ngo, and H. F. San, “Palmprint verification with moments,” J. Comput. Graph. Vis. Comput. Vis. 12(2), 325–332 (2004).
13. D. Zhang and G. Lu, “Review of shape representation and description techniques,” Pattern Recognit. 37(1), 1–19 (2004). [CrossRef]
14. H. S. Kim and H. K. Lee, “Invariant Image Watermark Using Zernike Moments,” IEEE Trans. Circuits Syst. Video Technol. 13(8), 766–775 (2003). [CrossRef]
15. P. T. Yap, X. Jiang, and A. C. Kot, “Two-dimensional polar harmonic transforms for invariant image representation,” IEEE Trans. Pattern Anal. Mach. Intell. 32(7), 1259–1270 (2010). [CrossRef]
16. R. K. Tyson, “Conversion of Zernike aberration coefficients to Seidel and higher-order power-series aberration coefficients,” Opt. Lett. 7(6), 262–264 (1982). [CrossRef]
17. J. J. M. Braat and A. J. E. M. Janssen, “Double Zernike expansion of the optical aberration function from its power series expansion,” J. Opt. Soc. Am. A 30(6), 1213–1222 (2013). [CrossRef]
18. W. Liu, S. Liu, T. Zhou, and L. Wang, “Aerial image based technique for measurement of lens aberrations up to 37th Zernike coefficient in lithographic tools under partial coherent illumination,” Opt. Express 17(21), 19278–19291 (2009). [CrossRef]
19. S. Liu, X. Zhou, W. Lv, S. Xu, and H. Wei, “Convolution-variation separation method for efficient modeling of optical lithography,” Opt. Lett. 38(13), 2168–2170 (2013). [CrossRef]
20. S. Liu, S. Xu, X. Wu, and W. Liu, “Iterative method for in situ measurement of lens aberrations in lithographic tools using CTC-based quadratic aberration model,” Opt. Express 20(13), 14272–14283 (2012). [CrossRef]
21. X. Y. Wang, W. Y. Li, H. Y. Yang, and P. P. Niu, “Y.W. Li Invariant quaternion radial harmonic Fourier moments for color image retrieval,” Opt. Laser Technol. 66, 78–88 (2015). [CrossRef]
22. G. Amayeh, A. Erol, G. Bebis, and M. Nicolescu, “Accurate and efficient computation of high order Zernike moments,” Proceedings Lecture Notes in Computer Science 3804, 462–469 (Springer-Verlag, 2005).
23. C. Singh and E. Walia, “Fast and numerically stable methods for the computation of Zernike moments,” Pattern Recognit. 43(7), 2497–2506 (2010). [CrossRef]
24. H. F. Qin and L. Qin, “A parallel recurrence method for the fast computation of Zernike moments,” Appl. Math. Comput. 219(4), 1549–1561 (2012). [CrossRef]
25. E. C. Kintner, “A recurrence relation for calculating the Zernike polynomials,” Opt. Acta 23(6), 499–500 (1976). [CrossRef]
26. G. W. Forbes, “Robust and fast computation for the polynomials of optics,” Opt. Express 18(13), 13851–13862 (2010). [CrossRef]
27. I. Kaya and J. P. Rolland, “Acceleration of computation of φ-polynomials,” Opt. Express 21(23), 29065–29072 (2013). [CrossRef]
28. Y. F. Chen, S. C. Li, Y. H. Hsieh, J. C. Tung, H. C. Liang, and K. F. Huang, “Laser wave-packet representation to unify eigenmodes and geometric modes in spherical cavities,” Opt. Lett. 44(11), 2649–2652 (2019). [CrossRef]
29. B. C. Platt and R. Shack, “History and priciples of Shack-Hartmann wavefront sensing,” J. Refract. Surg. 17(5), S573–S577 (2001). [CrossRef]
30. S. C. Chapra and R. P. Canale, Numerical Methods for Engineers, (McGraw-Hill, Singapore, 1998), 631–647.
31. P. J. Davis and P. Rabinowitz, Methods of Numerical Integration, Academic, FL, Orlando, (1984), 134–146.
32. A. J. E. M. Janssen and P. Dirksen, “Computing Zernike polynomials of arbitrary degree using the discrete Fourier transform,” JEOS:RP 2, 07012 (2007). [CrossRef]
33. B. Honarvar Shakibaei and R. Paramesran, “Recursive formula to compute Zernike radial polynomials,” Opt. Lett. 38(14), 2487–2489 (2013). [CrossRef]
34. T. B. Andersen, “Efficient and robust recurrence relations for the Zernike circle polynomials and their derivatives in Cartesian coordinates,” Opt. Express 26(15), 18878–18896 (2018). [CrossRef]
35. L. Duan, X. Wang, A. Y. Bourov, B. Peng, and P. Bu, “In situ aberration measurement technique based on principal component analysis of aerial image,” Opt. Express 19(19), 18080–18090 (2011). [CrossRef]
36. J. Yang, X. Wang, S. Li, L. Duan, D. Xu, A. Bourov, and A. Erdmann, “High order aberration measurement technique based on quadratic Zernike model with optimized source,” Opt. Eng. 52(5), 053603 (2013). [CrossRef]