Efficient diffraction calculations using implicit convolution

Tomoyoshi Shimobaba; Takayuki Takahashi; Yota Yamamoto; Takashi Nishitsuji; Atsushi Shiraki; Naoto Hoshikawa; Takashi Kakue; Tomoyoshi Ito

doi:10.1364/OSAC.1.000642

1. Introduction

Diffraction calculations, [1] such as Fresnel diffraction and the angular spectrum method, are essential in optical simulations, including a wide range of applications in computer-generated holograms, digital holography, phase retrieval algorithms, beam generation, and beam propagation methods. Although such diffraction calculations are accelerated by fast Fourier transform (FFT)-based convolution, FFT-based convolution becomes circular convolution due to the periodicity of the FFT, thereby resulting in increasing amounts of wraparound noise in the boundary of the calculation result [2]. To avoid this problem, zero padding is often used to convert circular convolution to linear convolution. However, the zero padding helps to avoid circular convolution, but increases calculation time and memory usage.

Conventional convolution-based diffraction calculations using FFT have the following form:

u_{2} (m, n) = {FFT}^{- 1} [FFT [u_{1} (m, n)] FFT [h (m, n)]],

where FFT[·] and FFT⁻¹[·] are FFT and inverse FFT operators, and u₁(m, n) and u₂(m, n) are the source and destination plane, respectively. In addition, h(m, n) is a point spread function. Equation (1) can be accelerated by FFT; however, when without zero padding, wraparound noise increases in the boundary of the calculation results. This is particularly problematic in case where the diffraction angle is wide and the propagation distance is long.

For example, Fig. 1 shows the diffracted results from a small rectangular aperture with 5 × 5 pixels located on the right side of the calculation windows. Here, the upper figures show the diffracted results obtained with zero padding. As can be seen, the light going through the aperture spreads in the calculation window and the zero padding region. Note that wraparound noise does not appear in the calculation window. In contrast, the bottom figures show the diffracted results obtained without zero padding. As shown, the light going through the aperture spread; however, at the boundary of the calculation window, the light wraps round to the opposite side due to the periodicity property of FFT, thereby resulting in the occurrence of wraparound noise in the calculation window.

Fig. 1 Wraparound noise in diffracted result. The top and bottom figures show the diffracted results obtained with and without zero padding, respectively. The wavelength, sampling pitch and propagation distance are 633 nm, 10µm and 0.1m, respectively. The calculation window has 512×512 pixels. In the zero padding case, the calculation window size is expanded to 1,024×1,024 pixels during the calculation.

Download Full Size | PDF

In addition, scaled diffractions [3–8] that can calculate light propagation at different sampling pitches on the source and destination planes must use zero padding. Figure 2 shows the diffracted result obtained from a natural image (mandrill) using scaled diffraction [7] with and without zero padding. Here, the sampling pitches of the source and destination planes were set to 10µm and 8µm, respectively. The diffracted result obtained with zero padding is correct; however, without zero padding, destructive results are obtained.

Fig. 2 Scaled diffraction results obtained using a natural image with and without zero padding. The sampling pitches of the source and destination planes are 10µm and 8µm, respectively. The wavelength and propagation distance are 633 nm and 0.3m, respectively. The calculation window has 2,048×2,048 pixels. In the zero padding case, the calculation window size is expanded to 4,096×4,096 pixels during the calculation.

Download Full Size | PDF

This paper proposes efficient convolution-based diffraction calculations using implicit convolution. The proposed diffraction scheme accelerates calculation time and reduces memory usage. In addition, the proposed scheme can be applied to any convolution-based diffraction calculations (e.g., [3–12]).

2. Implicit convolution-based diffraction calculation

We define FFT and inverse FFT of f(m, n) with N × M pixels as follows:

F (k, ℓ) = FFT [f (m, n)] = \sum_{n = 0}^{N - 1} \sum_{m = 0}^{M - 1} f (m, n) W_{N}^{- k m} W_{M}^{- ℓ n},

f (m, n) = {FFT}^{- 1} [F (k, ℓ)] = \frac{1}{N M} \sum_{ℓ = 0}^{N - 1} \sum_{m = 0}^{M - 1} f (m, n) W_{N}^{k m} W_{M}^{ℓ n},

where W_N = exp(2πi/N) is the twiddle factor where

i = \sqrt{- 1}

. Although this definition is actually discrete Fourier transform (DFT), in order to accelerate this DFT with the FFT algorithm, we used the notation “FFT” for the operator here.

Note that we apply implicit convolution [13] to Eq. (1). Pruned FFT [14] is similar method to the implicit convolution. However, The reference [13] discussed the detail comparison between the implicit convolution and pruned FFT. This literature concluded that the implicit convolution was superior to the pruned FFT in terms of the calculation time.

Although the implicit diffraction finally does not need zero padding, the derivation of the implicit diffraction is started from the zero-padded FFT. First, we expand u₁, u₂ and h with M × N pixels to 2M × 2N pixels by zero padding. Therefore, U₁(k, ℓ) = FFT[u₁ (m, n)] is expressed as follows:

U_{1} (k, ℓ) = \sum_{n = 0}^{2 N - 1} \sum_{m = 0}^{2 M - 1} u_{1} (m, n) W_{2 N}^{- 2 k m} W_{2 M}^{- 2 ℓ n} .

In implicit convolution, U₁(k, ℓ) is divided into odd and even spectra. U₁(2k, 2ℓ) is expressed as follows:

U_{1} (2 k, 2 ℓ) = \sum_{n = 0}^{2 N - 1} \sum_{m = 0}^{2 M - 1} u_{1} (m, n) W_{2 N}^{- 2 k m} W_{2 M}^{- 2 ℓ n},

= \sum_{n = 0}^{N - 1} \sum_{m = 0}^{M - 1} u_{1} (m, n) W_{N}^{- k m} W_{M}^{- ℓ n},

= FFT [u_{1} (m, n)] .

Here, u₁(m, n) = 0 for m > M and n > N and $W_{2 M}^{- 2 k m} = \exp {(2 π i / 2 M)}^{- 2 k m} = W_{M}^{- k m}$ and $W_{2 N}^{- 2 ℓ n} = W_{N}^{- ℓ n}$ . U₁(2k + 1, 2ℓ), U₁(2k, 2ℓ + 1) and U₁(2k + 1, 2ℓ + 1) are expressed as follows:

U_{1} (2 k + 1, 2 ℓ) = FFT [u_{1} (m, n) W_{2 M}^{- m}],

U_{1} (2 k, 2 ℓ + 1) = FFT [u_{1} (m, n) W_{2 N}^{- n}],

U_{1} (2 k + 1, 2 ℓ + 1) = FFT [u_{1} (m, n) W_{2 M}^{- m} W_{2 N}^{- n}] .

In the same manner, the odd and even spectra of h (m, n) are expressed as

H (2 k, 2 ℓ) = FFT [h (m, n)],

H (2 k + 1, 2 ℓ) = FFT [h (m, n) W_{2 M}^{- m}],

H (2 k, 2 ℓ + 1) = FFT [h (m, n) W_{2 N}^{- n}],

H (2 k + 1, 2 ℓ + 1) = FFT [h (m, n) W_{2 M}^{- m} W_{2 N}^{- n}] .

Convolution in the Fourier space is calculated by element-wise multiplication of U₁ and H, i.e., G (k, ℓ) = U₁(k,ℓ)H (k, ℓ). Therefore, convolution of the odd and even spectra are calculated simply as follows:

\begin{matrix} G (2 k, 2 ℓ) = U_{1} (2 k, 2 ℓ) H (2 k, 2 ℓ), \\ G (2 k + 1, 2 ℓ) = U_{1} (2 k + 1, 2 ℓ) H (2 k + 1, 2 ℓ), \\ G (2 k, 2 ℓ + 1) = U_{1} (2 k, 2 ℓ + 1) H (2 k, 2 ℓ + 1), \\ G (2 k + 1, 2 ℓ + 1) = U_{1} (2 k + 1, 2 ℓ + 1) H (2 k + 1, 2 ℓ + 1) . \end{matrix}

G (2k, 2ℓ), G (2k + 1, 2ℓ), G (2k, 2ℓ + 1) and G (2k + 1, 2ℓ + 1 have N × M pixels, respectively.

The implicit convolution-based diffraction is calculated as:

\begin{matrix} u_{2} (m_{2}, n_{2}) = {FFT}^{- 1} [G (k, ℓ)] = \sum_{ℓ = 0}^{2 N - 1} \sum_{k = 0}^{2 M - 1} G (k, ℓ) W_{2 N}^{2 k m} W_{2 M}^{2 ℓ n} \\ = \sum_{ℓ = 0}^{N - 1} \sum_{k = 0}^{M - 1} G (2 k, 2 ℓ) W_{2 N}^{2 k m} W_{M}^{ℓ n} + \sum_{ℓ = 0}^{N - 1} \sum_{k = 0}^{M - 1} G (2 k + 1, 2 ℓ) W_{N}^{k m} W_{M}^{ℓ n} W_{2 N}^{n} + \\ \sum_{ℓ = 0}^{N - 1} \sum_{k = 0}^{M - 1} G (2 k, 2 ℓ + 1) W_{N}^{k m} W_{M}^{ℓ n} + \sum_{ℓ = 0}^{N - 1} \sum_{k = 0}^{M - 1} G (2 k + 1, 2 ℓ + 1) W_{N}^{k m} W_{M}^{ℓ n} W_{2 M}^{m} W_{2 N}^{n} \\ = {FFT}^{- 1} [G (2 k, 2 ℓ)] + W_{2 N}^{n} {FFT}^{- 1} [G (2 k, 2 ℓ + 1)] + \\ W_{2 M}^{m} {FFT}^{- 1} [G (2 k + 1, 2 ℓ)] + W_{2 M}^{m} W_{2 N}^{n} {FFT}^{- 1} [G (2 k + 1, 2 ℓ + 1)] . \end{matrix}

The conventional convolution-based diffraction of Eq. (1), hereafter referred to as “explicit diffraction,” requires three FFTs with zero padding. Since the computational complexity of FFT is proportional to N²log₂N with the calculation windows of N × N pixels, the computational complexity of Eq. (1) is proportional to 12N²log₂2N. In contrast, the implicit convolution-based diffraction requires 12 FFTs without zero padding; therefore, the computational complexity is proportional to 12N²log₂N.

Figure 3 shows the diffracted results obtained from a natural image using the angular spectrum method with the explicit convolution of Eq. (1) and implicit convolution of Eq. (16). In the implicit angular spectrum method, h (m₁, n₁) is calculated as follows:

h (m_{1}, n_{1}) = {FFT}^{- 1} [\exp (i \frac{2 π}{λ} \sqrt{1 - {(λ f_{x})}^{2} - λ f_{y})^{2}})],

where λ is the wavelength and f_xand f_yare the spatial frequencies, respectively.

Fig. 3 Diffracted results using angular spectrum method: (a) and (b) are intensity and phase using the explicit angular spectrum method, and (c) and (d) are intensity and phase using the implicit angular spectrum method.

Download Full Size | PDF

Figures 3(a) and 3(b) show the intensity and phase calculated by explicit diffraction, and Figs. 3(c) and 3(d) show the intensity and phase calculated by implicit diffraction. Here, the calculation conditions are as follows: wavelength, 633nm; propagation distance between source and destination planes, 0.1m; sampling pitch, 10µm.

We evaluated the intensities and phases using the mean square error (MSE) defined as $MSE = \frac{1}{N M} \sum_{m = 0}^{M - 1} \sum_{n = 0}^{N - 1} | f_{r e f} (m, n) - f_{t a r} (m, n) |^{2}$ , where f_ref(m,n) and f_tar(m, n) are the results of conventional and implicit diffraction, respectively. Here, the MSEs for the intensity and phase are 2.3 × 10⁻⁶ and 7.7 × 10⁻⁶, respectively, which means that the average difference per pixel is approximately 0.1%.

Figure 4 shows the diffracted results obtained using scaled diffraction with explicit and implicit convolution. Here, we used aliasing-reduced shifted and scaled (ARSS)-Fresnel diffraction [7] as the scaled diffraction. The calculation conditions are as follows: wavelength, 633nm; propagation distance between source and destination planes, 0.3m. The sampling pitches on the source and destination planes are 10µm and 15µm, and the MSEs for the intensity and phase are 7.6 × 10⁻⁹ and 8.9 × 10⁻⁹, respectively.

Fig. 4 Diffracted results obtained using scaled diffraction calculation: (a) and (b) are intensity and phase using explicit scaled diffraction, and (c) and (d) are those of implicit scaled diffraction. The sampling pitches on the source and destination planes are 10µm and 15µm, respectively.

Download Full Size | PDF

Figure 5 shows diffracted results obtained using scaled diffraction with explicit and implicit convolution. The calculation conditions are the same as those of Fig. 4, with the exception of the sampling pitches. Here, we used sampling pitches of the source and destination planes of 10µm and 5µm, respectively. The MSEs for the intensity and phase are 1.7 × 10⁻⁸and 1.2 × 10⁻⁸, respectively. The MSEs indicate that the results of implicit scaled diffraction are nearly the same as those of the explicit scaled diffraction.

Fig. 5 Diffracted results using scaled diffraction: (a) and (b) are intensity and phase using explicit scaled diffraction, and (c) and (d) are intensity and phase of the implicit scaled diffraction. The sampling pitches on the source and destination planes are 10µm and 5µm, respectively.

Download Full Size | PDF

We compared calculation times and memory usage of the explicit and implicit diffraction calculations. We measured the average calculation times of 10 diffraction calculations. Here, we used a computer with Intel Core i7 6700 CPU, 64 GB of memory, the Microsoft Windows 8.1 operating system, and the Microsoft Visual Studio C++ 2015 compiler.

Table 1 shows the calculation times of the scaled diffraction using explicit and implicit convolution. We measured the calculation times of different sizes (512 × 512 to 4, 096 × 4, 096 pixels). At 1, 024 × 1, 024 pixels, implicit scaled diffraction resulted in calculation time that was 1.63 times faster than the explicit scaled diffraction. With a small calculation window, acceleration of implicit scaled diffraction was limited because the computational complexity of the FFT operations is relatively small. The variation in calculation times in Table 1 is due to the effect of CPU cache memory.

Table 1. Calculation Times of the Explicit and Implicit Scaled Diffraction

View Table | View all tables in this article

With a calculation size of N × N pixels, the memory usage of the explicit scaled diffraction was proportional to 10N²because we require two buffers of N × N pixels for u₁ and h, and two temporary buffers of 2N × 2N pixels for explicit convolution. Note that the memory usage of the implicit scaled diffraction is 5N²because we require two buffers of N × N pixels for u₁ and h, and three temporary buffers of N × N pixels for implicit convolution. Therefore, implicit diffraction requires only one half of the memory required by the explicit diffraction.

Table 2 shows the calculation times of the angular spectrum method using explicit and implicit convolutions. Here, “Implicit1” represents the calculation times including Eq. (17), whereas “Implicit2” is the calculation times not including Eq. (17). Note that, if the calculation parameters were unchanged during the diffraction calculation, we only calculated Eq. (17) once. In the “Implicit1” case, although the memory usage of implicit diffraction can be reduced compared to that of the explicit diffraction, the calculation speed was slower than the explicit diffraction due to the cost of Eq. (17). In contrast, in “Implicit2” case, the calculation speed of the implicit diffraction was improved.

Table 2. Calculation Times of the Explicit and Implicit Angular Spectrum Method

View Table | View all tables in this article

Figure 6 shows the reconstructed images from a kinoform using the explicit angular spectrum method and implicit angular spectrum method. The kinoform is generated by the explicit angular spectrum method. The calculation condition is that the kinoform size is 1, 920 × 1, 080, the pixel pitch is 10µm, the wavelength is 633nm, and the propagation distance is 0.1m. Figure 6(a), 6(b), and 6(c) show the kinoform, the reconstructed image using the explicit angular spectrum method, and the reconstructed image using the implicit angular spectrum method, respectively. As can be seen, the reconstructed images are almost the same. The peak signal-to-noise ratio (PSNR) between Figs. 6(b) and 6(c) is 62.8dB. All of the calculations were performed using our wave optics library, CWO++ [15].

Fig. 6 Reconstructed images from a kinoform using the explicit and implicit angular spectrum methods: (a) is the kinoform, (b) and (c) are the reconstructed images using the explicit and implicit angular spectrum methods, respectively.

Download Full Size | PDF

3. Conclusion

In this paper, we have proposed efficient diffraction calculations using implicit convolution. The proposed method was effective for diffraction calculations that required three FFTs, such as scaled diffraction. In future, we plan to further improve the calculation efficiency of diffraction calculations that requires two FFTs, such as the angular spectrum method. In addition, we plan to reduce the computational costs of Eq. (15) through the interpolation of G (2k, 2ℓ).

Funding

JSPS KAKENHI (16K00151).

References

1. J. W. Goodman, Introduction to Fourier Optics (Roberts and Company Publishers, 2005).

2. T. Shimobaba and T. Ito, Computer Holography—Acceleration Algorithms and Hardware Implementations (CRC Press, in press).

3. R. P. Muffoletto, J. M. Tyler, and J. E. Tohline, “Shifted Fresnel diffraction for computational holography,” Opt. Express 15, 5631–5640 (2007). [CrossRef] [PubMed]

4. J. F. Restrepo and J. Garcia-Sucerquia, “Magnified reconstruction of digitally recorded holograms by Fresnel–Bluestein transform,” Appl. Opt. 49, 6430–6435 (2010). [CrossRef] [PubMed]

5. S. Odate, C. Koike, H. Toba, T. Koike, A. Sugaya, K. Sugisaki, K. Otaki, and K. Uchikawa, “Angular spectrum calculations for arbitrary focal length with a scaled convolution,” Opt. Express 19, 14268–14276 (2011). [CrossRef] [PubMed]

6. T. Shimobaba, K. Matsushima, T. Kakue, N. Masuda, and T. Ito, “Scaled angular spectrum method,” Opt. Lett. 37, 4128–4130 (2012). [CrossRef] [PubMed]

7. T. Shimobaba, T. Kakue, N. Okada, M. Oikawa, Y. Yamaguchi, and T. Ito, “Aliasing-reduced Fresnel diffraction with scale and shift operations,” J. Opt . 15, 075405 (2013). [CrossRef]

8. T. Shimobaba, T. Kakue, M. Oikawa, N. Okada, Y. Endo, R. Hirayama, and T. Ito, “Nonuniform sampled scalar diffraction calculation using nonuniform fast Fourier transform,” Opt. Lett. 38, 5130–5133 (2013). [CrossRef] [PubMed]

9. K. Matsushima, H. Schimmel, and F. Wyrowski, “Fast calculation method for optical diffraction on tilted planes by use of the angular spectrum of plane waves,” J. Opt. Soc. Am. A 20, 1755–1762 (2003). [CrossRef]

10. M. Hillenbrand, D. P. Kelly, and S. Sinzinger, “Numerical solution of nonparaxial scalar diffraction integrals for focused fields,” J. Opt. Soc. Am. A 31, 1832–1841 (2014). [CrossRef]

11. Y.-H. Kim, C.-W. Byun, H. Oh, J.-E. Pi, J.-H. Choi, G. H. Kim, M.-L. Lee, H. Ryu, and C.-S. Hwang, “Off-axis angular spectrum method with variable sampling interval,” Opt. Commun. 348, 31–37 (2015). [CrossRef]

12. Y. Xiao, X. Tang, Y. Qin, H. Peng, W. Wang, and L. Zhong, “Nonuniform fast Fourier transform method for numerical diffraction simulation on tilted planes,” J. Opt. Soc. Am. A 33, 2027–2033 (2016). [CrossRef]

13. J. C. Bowman and M. Roberts, “Efficient dealiased convolutions without padding,” SIAM J. Sci. Comput. 33, 386–406 (2011). [CrossRef]

14. H. V. Sorensen and C. S. Burrus, “Efficient computation of the DFT with only a subset of input or output points,” IEEE Trans. Signal Process. 41, 1184–1200 (1993). [CrossRef]

15. T. Shimobaba, J. Weng, T. Sakurai, N. Okada, T. Nishitsuji, N. Takada, A. Shiraki, N. Masuda, and T. Ito, “Computational wave optics library for C++: CWO++ library,” Comput. Phys. Commun. 183, 1124–1138 (2012). [CrossRef]

	Explicit (ms)	Implicit (ms)	Explicit/Implicit
512 × 512	34	38	0.89
1, 024 × 1, 024	213	131	1.63
2, 048 × 2, 048	1,244	846	1.47
4, 096 × 4, 096	5,391	4,902	1.10

	Explicit (ms)	Implicit1 (ms)	Implicit2 (ms)
512 × 512	30	33	29
1, 024 × 1, 024	184	136	129
2, 048 × 2, 048	923	967	797
4, 096 × 4, 096	3,943	5,357	4,729

	Explicit (ms)	Implicit (ms)	Explicit/Implicit
512 × 512	34	38	0.89
1, 024 × 1, 024	213	131	1.63
2, 048 × 2, 048	1,244	846	1.47
4, 096 × 4, 096	5,391	4,902	1.10

	Explicit (ms)	Implicit1 (ms)	Implicit2 (ms)
512 × 512	30	33	29
1, 024 × 1, 024	184	136	129
2, 048 × 2, 048	923	967	797
4, 096 × 4, 096	3,943	5,357	4,729

Efficient diffraction calculations using implicit convolution

Abstract

1. Introduction

2. Implicit convolution-based diffraction calculation

3. Conclusion

Funding

References

Cited By

Figures (6)

Tables (2)

Equations (17)

OSA Continuum