## Abstract

In computational Fourier optics, computer generated holography, etc., coherent light propagation calculation between parallel planes is the essential task. A proper calculation discretization in the off-axis case leads to big memory demands in order to avoid aliasing errors. The proposed method typically cuts down the memory demands one hundred times. The principle of the method is based on the observation that there is a close correspondence between the reconstruction process (opposite of the sampling process) and prefiltering of the convolution kernel.

© 2013 OSA

## 1. Introduction

Calculation of coherent light propagation in a free space is a fundamental tool in Fourier optics [1, 2], digital holography [3], computer generated holography (e.g. [4–8]) and other areas of optics. A very important and common task is the calculation of light propagation between two parallel planes; however, the general case of light propagation between tilted planes is also important in applications [9–11].

The problem is often given in this way: there is an area Σ in a plane *z* = 0 containing an image called the *source* that is lit by a coherent light, mostly by a plane wave. The task is to calculate the light field in a plane *z* = *z*_{0} in an area called the *target*.

In this task, we usually assume validity of the scalar approximation of the light [1]. A good approximation of the correct solution is then given by, e.g., a Rayleigh-Sommerfeld integral of the first kind. In this article, let us assume this approximation as the reference one.

The Rayleigh-Sommerfeld solution cannot usually be used in an analytic calculation due to its complexity. Sometimes it is possible to get some results by using its mathematically equivalent form, the angular spectrum decomposition [12]. It is, however, most usual to restrict the calculation to the paraxial approximation in either the near (Fresnel) or far (Fraunhofer) region.

It is possible to evaluate the “reference” Rayleigh-Sommerfeld integral numerically using computers; thanks to its form of convolution, the calculation leads to the use of three fast Fourier transforms (FFT). The reason for the use of the aforementioned forms or approximations lies in the number of FFT’s: the angular spectrum decomposition leads to two FFT’s; the Fresnel or Fraunhofer approximation lead to just one FFT or fast fractional Fourier transform [13].

The implementations of these faster algorithms are unfortunately not straightforward, as the discretization of their equations leads to various problems. Correct implementation of the angular spectrum decomposition is especially tricky [14]; even Fresnel and Fraunhofer approximations have to be discretized carefully [2, 15, 16].

It is therefore wise to verify fast algorithms by comparing them with a reference method based on the carefully discretized Rayleigh-Sommerfeld integral [17, 18]. The discretization process must consider both correct sampling of the *source* and sampling of the illumination light field and the Rayleigh-Sommerfeld convolution kernel. It is also necessary to consider the inverse operation to the sampling, i.e. the reconstruction. All of these considerations often lead to sampling distances smaller than the wavelength of light, and therefore to huge memory demands. This article studies the discretization process and suggests a method to avoid huge memory demands and consequent time demands of large arrays FFT’s.

The structure of the article is as follows. Section 2 shows a naive method of discretization; the example given will show an illuminated amplitude diffraction grating with a sine transmittance profile. We will show that the naive discretization leads to “wrong” results; we will explain the “wrong” result in a physical way and show that a fine sampling leads to the correct result (and to huge memory demands). In Section 3, we will show how to lower memory demands by a simple 1D example. In Section 4 we will remove certain simplifications introduced in Section 3, and in Section 5 we will discuss the general 2D algorithm. Finally, in Section 6 we will present the time and memory demands of the algorithm and in Section 7 we will give conclusions.

In the rest of the article we will assume SI units. Absolute value of a complex amplitude is electric field amplitude, unit volt/m. For conversion of a complex amplitude to an intensity value, see e.g. [1, 2]. Please also note that 1D examples should not be interpreted physically as propagation integrals were derived for 3D space; they just demonstrate main ideas of the final algorithm.

## 2. Effect of naive discretization

Both theoretical analysis and experiments show that an amplitude diffraction grating with a sine transmittance profile illuminated by a plane wave (let us call it the *source*) creates just three diffraction maxima in the far field – the directly transmitted wave and the plus-minus first diffraction order [1]. Sampling of the *source* is easy in this case; it is necessary to use a sampling frequency at least 2× higher than the frequency of the pattern. Let us choose the perpendicular illumination and set its complex amplitude at *z* = 0 to be 1. Let us choose a sampling whose samples coincide with maxima and minima of the transmittance of the grating, i.e. the samples will be progressively ..., 0, 1, 0, 1, 0, 1,... (this is exactly at the Nyquist limit, see also the end of the section), and let us calculate the diffraction pattern using the Rayleigh-Sommerfeld integral. It is given as

*U*(

*x*,

*y*,

*z*

_{0}) is the calculated complex amplitude at a point [

*x*,

*y*,

*z*

_{0}] of the

*target*,

*U*(

*ξ*,

*η*, 0) is the complex amplitude at a point [

*ξ*,

*η*, 0] of the

*source*(i.e. the product of the complex amplitude of incoming light and the transmittance of the grating), Σ is the extent of the

*source*, j

^{2}= −1,

*k*= 2

*π*/

*λ*is the wave number and $r={\left({\left(x-\xi \right)}^{2}+{\left(y-\eta \right)}^{2}+{z}_{0}^{2}\right)}^{-1/2}$ is the distance between points [

*x*,

*y*,

*z*

_{0}] and [

*ξ*,

*η*, 0].

Let us discretize the calculation by replacing the double integral with a double sum. The result shown in Fig. 1(a) differs a lot from the theoretical result. Where is the problem? (Note that Fig. 1 displays diffraction at sine grating of finite rectangular area in a finite distance, therefore the diffraction maxima have rectangular shape.)

When talking about discretization, let us discuss the direction of varying transmittance only (i.e. perpendicular to the grating stripes). The other direction is not important in this case.

The replacement of the integral by the sum means, in fact, that the original continuous function *U*(*ξ*, *η*, 0) is replaced by an array of Dirac pulses; in other words, by ideal point light sources arranged in a periodic lattice with spacing Δ. Light originating from a lattice of sources of equal complex amplitude interferes and creates *m*-th diffraction maximum in angle *θ _{m}* = arcsin

*m*

*λ*/Δ (see e.g. [19]), where

*m*is an integer. In our case, every second sample is zero; that is, we are working in fact with a lattice with spacing 2Δ. This lattice creates diffraction maxima of equal intensities for all

*m*, and the first diffraction maximum of this lattice coincides with the first diffraction maximum of the original sine grating, because its period is, thanks to sampling distance, equal to 2Δ. The result presented in Fig. 1a is therefore physically correct – although not for a sine grating, but for another experiment. The problem is that the discretization process did not take into account how to reconstruct the original continuous function

*U*(

*ξ*,

*η*, 0) from the samples of the

*source*, i.e. if there is zero transmittance between samples, if the samples represent a sine profile, a rectangular profile, etc. If we took this into account, the diffraction maxima created by the lattice would be attenuated somehow and the result would correspond with the original continuous situation.

A simple solution is easy. In the continuous situation, two close enough point light sources of the same complex amplitude do not create any interference pattern, i.e. at least one destructive interference. In the discretized situation, two point light sources of the same complex amplitude in adjacent samples can interfere destructively if the sampling distance is too big. Therefore, the sampling of the *source* has to represent the *source* correctly, and, moreover, the effect of the discretization has to be hidden – it must be such that the first-order destructive interference of adjacent point light sources created by the discretization (assume they have the same complex amplitude for now) has to lie out of the *target* area; note that in numerical calculations we are dealing with finite areas only. Mathematically, for any point *T* of the *target* and any adjacent samples *S*_{1}, *S*_{2} of the *source*, the inequality |*T* − *S*_{1}| − |*T* − *S*_{2}| < *λ*/2 must hold as the density of the samples has to resemble continuous nature of the *source*. The same idea could be expressed in terms of correct sampling of the propagation integral kernel (e.g. [2, 14, 18]), but this explanation based on point light sources is perhaps more intuitive. It is interesting to note that this simple explanation based on point light source model has not been explicitly published yet (as far as I know). Also note that advanced solutions exist that do not need finer sampling, e.g. [13, 15]; the purpose of this paragraph is to provide simple insight and a starting point for the following sections.

The result of the simple solution is presented in Fig. 1b. It was necessary to work with 12× finer sampling, i.e. with an array 12^{2} = 144× bigger. This results in noticeably higher memory and time demands. In the following text we will present a way to discretize correctly without increasing memory demands.

At the end of the section it is worth noting that the sampling of the sine grating in the example above was not done “correctly”, as the Nyquist limit (in its simplest form) requires a sampling frequency higher than double the maximum frequency contained in a signal. We have used exactly double the maximum frequency, and moreover, we did not consider that the *source* is spatially limited. However, had we used the mathematically precise method, the result would be the same and the discussion would not be as clear.

## 3. Simplified solution

In search of a memory-efficient algorithm, let us start with the solution presented in the last section, i.e. the sampling finer than requested by the Nyquist limit. For the sake of clarity, let us introduce two simplifications: let the *source* and the *target* be one-dimensional objects in the *xz* plane (this will be relaxed in Section 5) and let us suppose they are unbounded (this will be relaxed in Section 4).

Let the object *source* represented by samples *source*[*i*], *i* ∈ **Z** (let the sample *source*[0] be located at the point [0, 0]) be perpendicularly illuminated by a plane wave of wavelength *λ* (see Fig. 2). We will consider transmittance of the *source* to be complex, i.e. any *source* illuminated by any light can be converted to this scenario.

Let us calculate complex amplitudes of propagated light in the “plane” *z* = *z*_{0} > 0 in the unbounded area *target* represented by samples *target*[*j*], *j* ∈ **Z** (let the sample *target*[0] be located at the point [*x*_{0}, *z*_{0}]). The samples *source*[*i*] are samples of the complex function *U*(*x*, 0); the samples *target*[*j*] represent the function *U*(*x*, *z*_{0}) (see Eq. (1)). If the sampling distance is equal in both the *source* and the *target*, let us call it Δ, the following holds:

*h*

_{x0},

*z*

_{0},1[] represents the Rayleigh-Sommerfeld convolution kernel (impulse response) defined as

Equation (2) is a discretization of the integral (1), i.e. we have changed the integral to the sum and the differential to the difference Δ. As we will be interested just in the structure of the *target*, we will omit the constant term Δ.

Let us suppose that the *source* is sampled correctly, such that the structure of *U*(*x*, 0) is fully acquired, but not fine enough for the propagation calculation. Please note that this “correct sampling” is not the same as sampling that obeys the sampling theorem; for example, piecewise constant signal cannot be sampled correctly according to the basic formulation of the sampling theorem as it has infinite frequency extent. However, if we know that the signal is piecewise constant with “steps” at known locations, then we can express the signal precisely using just one sample per constant part of the signal.

For the propagation calculation, we have to use a *ups*-times finer sampling, *ups* ∈ **Z**, *ups* ≥ 1. Let us call the preliminary upsampled array *source _{ups}*[]. To obtain this array, let us put (

*ups*− 1) zero samples between every two original samples of the

*source*(see Fig. 3 top):

The final upsampled array
$\mathit{sourc}{\mathit{e}}_{\mathit{ups}}^{\mathit{fin}}\left[\right]$ is calculated using convolution with a kernel *filter*[], i.e.
$\mathit{sourc}{\mathit{e}}_{\mathit{ups}}^{\mathit{fin}}\left[\right]={\mathit{source}}_{\mathit{ups}}\left[\right]\otimes \mathit{filter}\left[\right]$. The convolution kernel *filter*[] has to be chosen according to the nature of the function *U*(*x*, 0): a rectangular kernel provides a piecewise constant interpolation (which is suitable if *U*(*x*, 0) represents a pixelated spatial light modulator), a windowed sinc kernel provides a good interpolation in terms of frequency content (which is suitable if *U*(*x*, 0) is a general continuous function), a triangular kernel provides a piecewise linear interpolation (which is faster than a windowed sinc kernel and can provide acceptable results), etc. (see Fig. 3). Examples of kernel implementations will be shown at the end of Section 4.

The main idea of the method to be derived exploits the fact that the length of the *filter*[] array is much smaller than the length of the array *source*[] and *target*[]. For simplicity, let us assume the length of the *filter*[] array to be odd. Let us write it as 2 × *fwh* + 1, where *fwh* ∈ **Z**, *fwh* ≥ 0.

Thanks to associativity of the convolution, the following holds:

*target*[] represents the

_{ups}*target*sampled with a period Δ

*/ups*and ${h}_{{x}_{0},{z}_{0},\mathit{ups}}^{\mathit{fin}}\left[\right]$ is the propagation kernel convolved by the array

*filter*[]. Specifically,

However, we need the final result *ups*-times downsampled, i.e. *target*[*j*] = *target _{ups}*[

*ups*×

*j*]. Moreover, the sample

*source*[

_{ups}*i*] is zero for

*i/ups*∉

**Z**. We can therefore omit these samples from the sum. It follows that

The propagation calculation leads to two discrete convolutions: in the first one (5), we work with sampling period Δ, in the second one (6), with sampling period Δ/*ups*.

The aforementioned equations worked with an infinite extent of the indices in order to avoid array boundary effects. In the following section, we will adjust the indices extent but the structure of the result will remain the same. The section will also explain the advantage of the presented method, which can be briefly described as follows.

The discrete convolution can be calculated indirectly using FFT or directly using the definition Eq. (2). The first way is advantageous if the convolution kernel is large; we will use it, therefore, for Eq. (5). The second way is better in the opposite case; we will therefore use it for Eq. (6). Here we will also use a nice property of direct calculation: the samples ${h}_{{x}_{0},{z}_{0},1}^{\mathit{fin}}\left[i\right]$ can be calculated with minimum memory demands.

## 4. Practical 1D solution

Let us assume that the *source* is sampled using *M* samples and the *target* is sampled using *N* samples. If we want to use FFT for the *target*[] calculation, we have to work with cyclic convolution. This means that the arrays *source*[] and *target*[] have to be zero-padded to *C* samples, *C* ≥ *M* + *N* − 1 (see [17]). Then:

The array *source*[] contains correct values for sample indices 0, 1,...,*M* − 1, the array *target*[] for indices 0, 1,...,*C* − *M*. The array
${h}_{{x}_{0},{z}_{0},1}^{\mathit{fin}}\left[\right]$ has to be then calculated as:

It is worth noting that in the calculation of the sample
${h}_{{x}_{0},{z}_{0},1}^{\mathit{fin}}\left[i\right]$, it is possible to use the samples *h*_{x0,z0,ups}[] calculated before, specifically for
${h}_{{x}_{0},{z}_{0},1}^{\mathit{fin}}\left[i-1\right]$. It is therefore convenient to save the samples *h*_{x0,z0,ups}[] in a temporary buffer of size 2 × *fwh* + 1 samples, and to replace part of them in the calculation of
${h}_{{x}_{0},{z}_{0},1}^{\mathit{fin}}\left[i\right]$ with new values. The number of these new samples depends on the *filter* type.

To calculate the *filter*[] array, we have to choose the number of samples of the *source*[] array that contribute to the calculation of the interpolated sample in the *source ^{fin}*[] array, or in other words, in the
${h}_{{x}_{0},{z}_{0},1}^{\mathit{fin}}\left[i\right]$ array. This user-defined number specifies all the parameters needed: the size of the

*filter*[] array, the interpolation type, and the number of the samples shared in the calculation of the neighbouring samples of the ${h}_{{x}_{0},{z}_{0},1}^{\mathit{fin}}\left[\right]$ array.

It is practical to use separable kernels when working with 2D arrays. We can discuss them right now, when working with 1D arrays. For clarity, some examples are given in Fig. 3. To make things simpler, we will show pure interpolation kernels in both the Fig. 3 and the following text, i.e. they do not preserve signal energy. For propagation calculation it is, however, appropriate to normalize the filter, i.e. the sum of its coefficients equals 1.

#### Piecewise constant interpolation

It is suitable if the *source* is split to rectangular pixels of non-zero area. In this case it is appropriate for *ups* to be odd, due to symmetry. Then the interpolation process adds an even number of samples between every two samples of the *source*[] array. Therefore, *fwh* = (*ups* − 1)/2 and the kernel is given as: *filter*[*i*] = 1 for −*fwh ≤ i ≤ fwh*.

#### Piecewise linear interpolation

It is suitable if the *source* represents a continuous function and it does not contain fine details. We need two neighbouring original samples for the calculation of the interpolated one, i.e. *fwh* = *ups* − 1, *filter*[*i*] = 1 − |*i*|/(*fwh* + 1) for −*fwh ≤ i ≤ fwh*.

#### Windowed sinc interpolation

It is suitable if the *source* represents a continuous function and we care about good replication of its frequency content. Choice of the number of the original *source*[] samples has to be a compromise. The more samples are included, the better is the frequency content replication; on the other hand, too large kernels perform badly in the spatial domain. A Lanczos filter is considered a reasonable compromise. It considers 2*a* neighbouring samples, where usually *a* = 2 or *a* = 3 [20]. Then *fwh* = *a*×*ups*−1 and a preliminary kernel is defined as *filter _{prel}*[

*i*] = lanczos(

*a*×

*i*/(

*fwh*+1),

*a*), where lanczos(

*x*,

*a*) =

*a*sin(

*π*

*x*)sin(

*π*

*x*/

*a*)/(

*π*

^{2}

*x*

^{2}) for −

*fwh ≤ i ≤ fwh*. The final kernel has to be adjusted before use: the coefficients contributing to the same sample calculation, i.e. the coefficients

*ups*samples apart, have to sum to 1 [21]. Mathematically,

*filter*[

*i*] =

*filter*[

_{prel}*i*]/ ∑

_{k}*filter*[

_{prel}*i*+

*k*×

*ups*] for all allowed values of

*k*.

## 5. Final 2D solution

Generalization of the aforementioned ideas is straightforward; instead of 1D cyclic convolutions (or FFT’s), we have to use 2D versions. For simplicity, let us assume the sampling periods in both *x* and *y* directions to be the same, let us call them Δ.

To propagate the *source* sampled by *M _{x}* ×

*M*samples to the

_{y}*target*sampled by

*N*×

_{x}*N*samples, let us create the arrays

_{y}*source*[,] and

*target*[,] of size

*C*×

_{x}*C*,

_{y}*C*≥

_{x}*M*+

_{x}*N*− 1,

_{x}*C*≥

_{y}*M*+

_{y}*N*− 1. It is convenient to choose the numbers

_{y}*C*and

_{x}*C*so that the FFT of these arrays runs fast, e.g. powers of 2. The samples of the

_{y}*source*must be stored in the array

*source*[,] at indices from [0, 0] to [

*M*− 1,

_{x}*M*− 1]. After the calculation, the correct samples are located in the array

_{y}*target*[,] at indices from [0, 0] to [

*N*− 1,

_{x}*N*− 1].

_{y}As the next step, we have to choose the parameter *ups*. The way to do it is as follows. Let us assume for a while that we work with the original lattice, i.e. *ups* = 1. In the propagation calculation, we have to calculate (4) for every vector *T* − *S*, where *S* is a 3D position of a sample in the *source* and *T* is a 3D position of a sample in the *target*. In (4) we have to use *r* = |*T* − *S*| (see its application in (2)). Let *S*_{1} and *S*_{2} be positions of adjacent samples in the *source* and these samples have the same value; they represent two point light sources of the same complex amplitude. Let us calculate *r*_{1} = |*T* − *S*_{1}| and *r*_{2} = |*T* − *S*_{2}|. If |*r*_{1} − *r*_{2}| = *λ*/2, then the contributions from *S*_{1} and *S*_{2} cancel each other at the point *T*, i.e. in this direction there is the first diffraction minimum. As we have shown in Section 2, we need to exclude the first diffraction minimum from the *target*. To make this happen, we have to refine the sampling, i.e. increase the parameter *ups* until |*r*_{1} − *r*_{2}| < *λ*/2. If a more precise result is needed, we can refine further. Numerical experiments have shown that higher *ups* than those leading to |*r*_{1} − *r*_{2}| < *λ*/5 did not have any significant impact. The adjacent points *S*_{1}, *S*_{2} and the point *T* have to be chosen as “the worst case”, i.e. the angle between *T* − *S*_{1} (or *T* − *S*_{2}) and the *z* axis has to be as big as possible.

As the next step, we have to calculate the array
${h}_{{x}_{0},{y}_{0},{z}_{0},1}^{\mathit{fin}}[,]$, where [*x*_{0}, *y*_{0}, *z*_{0}] is the position of the sample *target*[0, 0]. We assume the position of the sample *source*[0, 0] to be [0, 0, 0]. For the calculation, we need the numbers *h*_{x0,y0,z0,ups}[*i*, *j*] = *h*(*i*Δ*/ups* + *x*_{0}, *j*Δ*/ups* + *y*_{0}, *z*_{0}) (see (3) for comparison). Thanks to separability of the filters, we can calculate them for one upsampled row only, convolve them with *filter*[] and downsample, i.e. to use the procedure described in Section 4. We need to calculate 2×*fwh*+1 of such rows, convolve them by columns and down-sample; this procedure leads to one row of the array
${h}_{{x}_{0},{y}_{0}{z}_{0},1}^{\mathit{fin}}[,]$. The other rows are calculated in the same way.

Finally, we can calculate the propagation itself:

**FFT**and

**IFFT**are fast Fourier transform and inverse fast Fourier transform, respectively, and ⊙ is the Hadamard product (element-wise product). The propagation calculation is complete now. Examples of the propagations are shown in Fig. 4.

It is worth adding three remarks:

- The same result can be obtained using the basic method, i.e. using upsampling, interpolation of the
*source*, and convolution with a common Rayleigh-Sommerfeld propagation kernel. The upsampled array*source*[,] would have_{ups}*ups*− 1 zero samples between original samples in every row (column), and additionally*fwh*zero samples to the sides in order to correctly calculate the convolution with an interpolation kernel*filter*[,] of size (2 ×*fwh*+ 1) × (2 ×*fwh*+ 1) samples. The size of the array*source*would then be 2 ×_{ups}*fwh*+ 1 +*ups*× (*M*− 1) × 2 ×_{x}*fwh*+ 1 +*ups*× (*M*− 1) samples, sampling period Δ_{y}*/ups*. This means that the physical size of the*source*increases a bit for*fwh*≥ 1; this is the reason why the article never mentions the exact physical sizes of the*source*and the*target*. The additional borders are an interpolation artifact. However, their effect is negligible for big arrays. - The proposed method with propagation kernel filtering is nothing else than a calculation rearrangement. The article [17] that describes the propagation calculation with large
*source*and*target*or with different sampling periods of*source*and*target*is therefore fully compatible with the proposed method; is is sufficient to replace the calculation of propagation kernels. - The calculation rearrangement leads to different rounding errors in numerical calculation and thus to differences between the basic and the proposed method. The differences are negligible, though. As it is hard to tell which method gives a more precise result, we will not discuss the numerical aspects of the proposed method.

## 6. Time and memory requirements

The motivation to create the proposed method was to calculate reference propagation using a small amount of memory. Let us compare its memory and time demands with the basic method. In this section, we will assume square arrays for simplicity, i.e. *M _{x}* =

*M*=

_{y}*M*,

*N*=

_{x}*N*=

_{y}*N*.

The basic method, as we have shown in the last section, upsamples the array *source*[,] to the array *source _{ups}* with 2 ×

*fwh*+ 1 +

*ups*× (

*M*− 1)

^{2}samples and calculates propagation to the array

*target*. There is no need to introduce additional zero borders due to interpolation, i.e. the

_{ups}*target*will have 1 +

*ups*× (

*N*− 1)

^{2}samples. The propagation will be calculated in the arrays with 2×

*fwh*+1+

*ups*×(

*M*+

*N*−2)

^{2}samples, which gives the memory requirements of the basic method.

The proposed method uses the original arrays *source*[,] and *target*[,], so the propagation will be calculated in arrays with (*M* + *N* − 1)^{2} samples. The memory demands are, however, a bit higher, as we have to take account of temporary memory used in the calculation of the filtered propagation kernel
${h}_{{x}_{0},{y}_{0},{z}_{0},1}^{\mathit{fin}}[,]$. It is 2 × *fwh* + 1 samples for “row convolution” and 2 × *fwh* + 1 rows with *M* + *N* − 1 samples for “column convolution”; together (2 × *fwh* + 1)(*M* + *N*) samples. Typically, *fwh* = *a* × *ups*, where *a* is small (in our examples at most 3), and *ups* is much smaller than *M* + *N*. Therefore, it is possible to ignore this amount in further discussion.

By comparing the memory demands, we conclude that the propagation calculation using the proposed method takes approximately 2 × *fwh* + 1 + *ups* × (*M _{x}* +

*N*− 2)

_{x}^{2}/(

*M*+

*N*−1)

^{2}≈

*ups*

^{2}-times less memory. Common experiments in computer generated holography with centimetre-sized fields using a sampling period of about 10

*μ*m off-axis propagated to a distance in the order of tens of centimetres require the

*ups*parameter up to 20. We can therefore say that the proposed method has about 100-times less memory demands than the basic method.

On the other hand, the time of the computation does not fall so quickly. This is because the calculation consists not just of a convolution calculation using the FFT, which is very fast in the proposed method, but of a convolution kernel calculation as well that is approximately as slow as in the basic method. More precisely, the FFT works with arrays approximately *ups*^{2}-times smaller than in the basic method, which means it is approximately *ups*^{2}-times faster. The convolution kernel calculation requires calculating the array *h*_{x0,y0,z0,ups} [,] (the same as in the basic method), filtering it by rows and by columns using a filter of width 2 × *fwh* + 1, where again *fwh* = *a* × *ups*, and downsampling the result. The analysis shows that the convolution kernel calculation in the proposed method is approximately 4*a*^{2}-times slower than in the basic method, where *a* is again a small number. It is not worth making a precise analysis, as the speed of the FFT, the Rayleigh-Sommerfeld convolution kernel and its filtering are hard to compare theoretically. It is more useful to measure the calculation times (see Fig. 5). The graph to the right shows that the speedup of the proposed method is not as big as its memory savings; this is mainly because the calculation of the convolution kernel dominates in the total time of the calculation.

## 7. Conclusions

In Section 2 we have shown and explained in terms of physics that in the discretization of the light propagation between parallel planes it is necessary to take into account both the correct sampling of the *source* and the opposite procedure, the reconstruction. We have shown that the correct result can be obtained using upsampling; we have shown in Section 5 how fine this upsampling should be. We have demonstrated in Section 6 that in typical tasks in computer generated holography the upsampling can be approximately 10×, which leads to 100× slower calculation and 100× bigger memory demands.

In Sections 3 to 5 we have derived a procedure based on filtration of the propagation kernel by the “interpolation” kernel that is used for interpolated upsampling of the *source*. Thanks to the properties of the interpolation kernel (small support, separability), the proposed method is faster and cuts the memory demands to approximately those values which would be needed if no upsampling was used. It can thus be said that the proposed method has approximately 100× smaller memory demands than the basic method of the same precision. As the method just rearranges the calculation, the result is mathematically the same.

## Acknowledgments

This work has been supported by the Ministry of Education, Youth and Sports of the Czech Republic projects LC06008 and LH12181. The author would like to thank Václav Skala for discussions concerning the subject.

## References and links

**1. **J. W. Goodman, *Introduction to Fourier Optics* (Roberts & Company Publishers, 2004), 3rd ed.

**2. **D. G. Voelz, *Computational Fourier Optics: A Matlab Tutorial*, Tutorial texts in optical engineering (SPIE Press, 2011).

**3. **U. Schnars and W. Jueptner, *Digital Holography: Digital Hologram Recording, Numerical Reconstruction, and Related Techniques* (Springer, 2005).

**4. **K. Matsushima, “Computer-generated holograms for three-dimensional surface objects with shade and texture,” Applied Optics **44**, 4607–4614 (2005). [CrossRef] [PubMed]

**5. **I. Hanák, M. Janda, and V. Skala, “Detail-driven digital hologram generation,” Visual Computer **26**, 83–96 (2010). [CrossRef]

**6. **Y. Sakamoto, M. Takase, and Y. Aoki, “Hidden surface removal using z-buffer for computer-generated hologram,” Practical Holography XVII and Holographic Materials IX **5005**, 276–283 (2003). [CrossRef]

**7. **M. Yamaguchi, “Ray-based and wavefront-based holographic displays for high-density light-field reproduction,” Three-Dimensional Imaging, Visualization, and Display 2011 **8043**, 804306 (2011). [CrossRef]

**8. **P. W. M. Tsang, J. P. Liu, K. W. K. Cheung, and T. C. Poon, “Modern methods for fast generation of digital holograms,” 3D Research **1**, 11–18–18 (2010). [CrossRef]

**9. **N. Delen and B. Hooker, “Free-space beam propagation between arbitrarily oriented planes based on full diffraction theory: a fast fourier transform approach,” J. Opt. Soc. Am. A **15**, 857–867 (1998). [CrossRef]

**10. **K. Matsushima, H. Schimmel, and F. Wyrowski, “Fast calculation method for optical diffraction on tilted planes by use of the angular spectrum of plane waves,” J. Opt. Soc. Am. A **20**, 1755–1762 (2003). [CrossRef]

**11. **L. Onural, “Exact solution for scalar diffraction between tilted and translated planes using impulse functions over a surface,” J. Opt. Soc. Am. A **28**, 290–295 (2011). [CrossRef]

**12. **E. Lalor, “Conditions for the validity of the angular spectrum of plane waves,” J. Opt. Soc. Am. **58**, 1235–1237 (1968). [CrossRef]

**13. **H. M. Ozaktas, S. O. Arik, and T. Coşkun, “Fundamental structure of fresnel diffraction: natural sampling grid and the fractional fourier transform,” Opt. Lett. **36**, 2524–2526 (2011). [CrossRef] [PubMed]

**14. **K. Matsushima, “Shifted angular spectrum method for off-axis numerical propagation,” Opt. Express **18**, 18453–18463 (2010). [CrossRef] [PubMed]

**15. **L. Onural, “Exact analysis of the effects of sampling of the scalar diffraction field,” J. Opt. Soc. Am. A **24**, 359–367 (2007). [CrossRef]

**16. **L. Onural, A. Gotchev, H. Ozaktas, and E. Stoykova, “A survey of signal processing problems and tools in holographic three-dimensional television,” Circuits and Systems for Video Technology, IEEE Transactions on **17**, 1631 –1646 (2007). [CrossRef]

**17. **P. Lobaz, “Reference calculation of light propagation between parallel planes of different sizes and sampling rates,” Opt. Express **19**, 32–39 (2011). [CrossRef] [PubMed]

**18. **V. Katkovnik, J. Astola, and K. Egiazarian, “Discrete diffraction transform for propagation, reconstruction, and design of wavefield distributions,” Appl. Opt. **47**, 3481–3493 (2008). [CrossRef] [PubMed]

**19. **E. Steward, *Fourier Optics: An Introduction*, Ellis Horwood Series in Physics (Dover Publications, 2004), 2nd ed.

**20. **K. Turkowski, “Filters for common resampling tasks,” in “*Graphics gems*,”, A. S. Glassner, ed. (Academic Press Professional, Inc., San Diego, CA, USA, 1990).

**21. **D. P. Mitchell and A. N. Netravali, “Reconstruction filters in computer-graphics,” SIGGRAPH Comput. Graph. **22**, 221–228 (1988). [CrossRef]