## Abstract

Ptychography is a lensless imaging method that allows for wavefront sensing and phase-sensitive microscopy from a set of diffraction patterns. Recently, it has been shown that the optimization task in ptychography can be achieved via automatic differentiation (AD). Here, we propose an open-access AD-based framework implemented with TensorFlow, a popular machine learning library. Using simulations, we show that our AD-based framework performs comparably to a state-of-the-art implementation of the momentum-accelerated ptychographic iterative engine (mPIE) in terms of reconstruction speed and quality. AD-based approaches provide great flexibility, as we demonstrate by setting the reconstruction distance as a trainable parameter. Lastly, we experimentally demonstrate that our framework faithfully reconstructs a biological specimen.

© 2021 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## 1. Introduction

Ptychography is a computational imaging method that enables the recovery of both the complex illumination and the transmission function of an object [1]. The method is based on the illumination of an object with a localized, coherent probe and the measurement of the resulting diffraction patterns. By laterally scanning the object with overlapping illumination areas, the phase information lost in the diffraction intensities can be recovered through phase retrieval. Numerous algorithmic extensions to ptychography and related methods have been introduced. For example, some algorithms relax the coherence requirement on the probe beam [2], correct for experimental errors in the positioning and the setup geometry [3–6], obtain three-dimensional reconstructions [7–10], or acquire images without any moving parts using Fourier ptychography [11–13] or single-shot ptychography [13,14]. Conceptually similar algorithms have been employed for wide-field fluorescence imaging [15]. Potential applications of quantitative phase imaging include label-free imaging in biomedicine [16], characterization of lens aberrations [17], and x-ray crystallography [18]. Many phase retrieval algorithms are based on the alternating projections scheme [19], in which an object is iteratively reconstructed from the modulus of its Fourier transform together with known constraints. One variant of this approach for ptychography is called the ptychographical iterative engine (PIE) [20]. Significant improvements to the convergence and robustness of PIE have been made by enabling the simultaneous reconstruction of the probe beam [21] (known as ePIE), and later by revising the update functions and borrowing the idea of momentum from the machine learning community [22]. The latter addition to the PIE family has been coined momentum-accelerated PIE (mPIE) and can be considered a state-of-the-art approach to ptychographic reconstruction.

Instead of solving the phase problem in ptychography using algorithms of the PIE family, it has recently been shown that the object can also be recovered using optimizers based on automatic differentiation (AD) [23–26]. This approach makes it possible to solve the inverse problem in ptychography without deriving an analytical update function, which can be challenging to obtain. Moreover, it directly benefits from the fast progress in the machine-learning community in terms of software tools, hardware, and algorithms. However, it has remained unclear how this approach compares to state-of-the-art algorithms, notably in terms of computational time and reconstruction quality.

In this paper, we implement an AD-based reconstruction framework using TensorFlow [27] and Keras [28] and benchmark it against an implementation of mPIE running on a graphics processing unit (GPU) [6]. Keras enables us to solve the reconstruction problem in ptychography using a layer-by-layer approach, akin to the architecture of deep neural networks. As a result, our framework is highly modular, and it is straightforward to extend the physics model or implement specific cost functions, which is not always possible using PIE. Lastly, the layered architecture makes our AD-based framework interesting for the emerging idea of interfacing a deep neural network (DNN) directly to the forward model [29,30]. We make the source code for our AD-based framework available in Code 1 [31]. Thus, we provide a flexible tool to perform ptychographic reconstructions and simultaneously estimate unknown experimental parameters, opening interesting perspectives to improve the reconstruction quality for visible and X-ray lensless imaging techniques.

## 2. Problem formulation

The AD-based reconstruction framework implemented in this paper is presented in Fig. 1. Let $O(\textbf {r})$ represent the two-dimensional, complex-valued object and let $P(\textbf {r})$ represent the probe while $\textbf {r}$ represents the coordinate vector in the object plane. The object is illuminated at different positions $\textbf {R}_i$ in such a way that the probed areas are overlapping. In the projection approximation [32], the exit field distribution in the object plane $\psi _{\mathrm {exit}}(\textbf {r},\textbf {R}_i)$ is then given as

$$\psi_{\mathrm{exit}}(\textbf{r},\textbf{R}_i) = P(\textbf{r}) \, O(\textbf{r}-\textbf{R}_i).$$
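As a minimal sketch of this forward step (in NumPy, assuming integer-pixel scan shifts; the function and variable names are ours, not those of the framework), the exit field for one scan position is the probe multiplied by the shifted object:

```python
import numpy as np

def exit_wave(obj, probe, shift):
    """Projection approximation: psi_exit(r, R_i) = P(r) * O(r - R_i).

    obj, probe : complex 2D arrays of equal shape
    shift      : integer pixel shift (dy, dx) corresponding to R_i
    """
    shifted_obj = np.roll(obj, shift=shift, axis=(0, 1))  # O(r - R_i)
    return probe * shifted_obj
```

In practice, subpixel scan positions would require interpolation or the Fourier shift theorem; `np.roll` is the simplest stand-in.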

The field in the detection plane $\psi _{\mathrm {cam}}(\mathbf {r^\prime },\textbf {R}_i)$ with coordinate vector $\mathbf {r^\prime }$ can be obtained by choosing an appropriate diffraction model. We have implemented both the angular spectrum method and the Fraunhofer diffraction model, the latter of which conveniently accommodates Fresnel diffraction by absorption of a quadratic phase function into the probe [33]. While the angular spectrum method can be used to model both near-field and far-field diffraction, it comes at the cost of computing two fast Fourier transforms (FFT). The Fraunhofer diffraction model, which is based on the far-field approximation, can be performed using a single FFT. The last step in the forward model is the calculation of the intensity in the detection plane using

$$I(\mathbf{r^\prime},\textbf{R}_i) = \left|\psi_{\mathrm{cam}}(\mathbf{r^\prime},\textbf{R}_i)\right|^2.$$
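The two diffraction models can be sketched as follows (a NumPy sketch with our own function names; dropping the evanescent components in the angular spectrum kernel is a common convention, not necessarily the framework's exact choice):

```python
import numpy as np

def angular_spectrum(field, wavelength, dx, z):
    """Propagate a complex field over a distance z using the angular
    spectrum method (two FFTs). All lengths share the same unit."""
    n = field.shape[0]
    fx = np.fft.fftfreq(n, d=dx)   # spatial frequencies along one axis
    fxx, fyy = np.meshgrid(fx, fx)
    arg = 1.0 / wavelength**2 - fxx**2 - fyy**2
    # Propagating components acquire a phase; evanescent ones are dropped.
    kernel = np.where(arg > 0,
                      np.exp(2j * np.pi * z * np.sqrt(np.maximum(arg, 0.0))),
                      0.0)
    return np.fft.ifft2(np.fft.fft2(field) * kernel)

def fraunhofer(field):
    """Far-field model: a single centered FFT, up to scaling factors."""
    return np.fft.fftshift(np.fft.fft2(np.fft.ifftshift(field)))
```

The cost difference described above is visible directly: `angular_spectrum` calls the FFT twice, while `fraunhofer` calls it once.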

The key idea of AD-based ptychography is to rely on the optimizer to reconstruct the object and probe. As computers execute mathematical functions as sequences of elementary arithmetic operations with known derivatives, the derivative of the objective function, which quantifies the mismatch between the measured and the modeled diffraction data, with respect to the trainable parameters $\boldsymbol {\theta }$ is computationally known (by applying the chain rule repeatedly). Therefore, AD enables us to converge to a solution of the ptychographic problem without the need to find an analytical expression for the update function, in contrast to algorithms based on PIE. In practice, we employ the popular Adam optimizer [34] to update $\boldsymbol {\theta }$ in each backpropagation epoch.
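This optimization loop can be sketched in TensorFlow. The sketch below is a minimal single-pattern example built on assumptions of ours: a far-field (single-FFT) forward model and a mean-squared amplitude loss; the framework's actual objective function and parametrization may differ:

```python
import tensorflow as tf

tf.random.set_seed(0)
n = 32
measured_amplitude = tf.ones((n, n))  # stand-in for the square root of a recorded pattern

# Trainable parameters theta: real and imaginary parts of the object pixels.
obj_re = tf.Variable(tf.random.normal((n, n), stddev=0.1))
obj_im = tf.Variable(tf.random.normal((n, n), stddev=0.1))
optimizer = tf.keras.optimizers.Adam(learning_rate=0.04)

def train_step():
    with tf.GradientTape() as tape:
        obj = tf.complex(obj_re, obj_im)
        predicted = tf.abs(tf.signal.fft2d(obj))  # far-field amplitude
        loss = tf.reduce_mean((predicted - measured_amplitude) ** 2)
    # AD supplies d(loss)/d(theta) via the chain rule; no analytical update needed.
    grads = tape.gradient(loss, [obj_re, obj_im])
    optimizer.apply_gradients(zip(grads, [obj_re, obj_im]))
    return float(loss)

losses = [train_step() for _ in range(50)]
```

Each call to `train_step` corresponds to one backpropagation step; TensorFlow constructs the derivative of the loss with respect to `obj_re` and `obj_im` automatically.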

## 3. Simulated data

To quantitatively compare the reconstruction speed and quality of our AD-based framework and mPIE, we simulate diffraction patterns according to the setup shown in Fig. 2(a). A ptychographical dataset is synthesized by generating a $512\times 512$ complex-valued object from two images that respectively define the transmittance and the phase contrast of the object. The pixel size is 6.45 µm. Using coherent illumination at a wavelength of $\lambda = 561$ nm, we shift the object to 80 overlapping locations with a relative overlap of 60 %, as defined and recommended by Bunk et al. [35]. The projection approximation is applied to calculate the exit waves, which are then propagated to the detection plane using the angular spectrum method. To calculate the synthetic diffraction patterns, we then take the absolute squares of the propagated fields and add both (Poissonian) shot noise and (Gaussian) camera read-out noise to the synthetic data. The simulated intensities are on the order of a few hundred counts per pixel, and the Gaussian noise is characterized by a standard deviation of $\sigma =10$ counts. Then, we utilize our AD-based optimization framework to reconstruct the object and probe functions. As shown in Fig. 2(b), we successfully reconstruct both the complex-valued object and the probe by running the algorithm for 170 epochs (4 min).
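The noise model of the simulation can be sketched as follows (NumPy; clipping negative read-out values to zero is our own assumption, not necessarily what the simulation does):

```python
import numpy as np

rng = np.random.default_rng(42)

def add_detector_noise(intensity, read_noise_std=10.0):
    """Apply Poissonian shot noise and Gaussian read-out noise
    to a noiseless diffraction pattern given in counts."""
    noisy = rng.poisson(intensity).astype(float)
    noisy += rng.normal(0.0, read_noise_std, size=intensity.shape)
    return np.clip(noisy, 0.0, None)  # negative counts are not physical

# A few hundred counts per pixel, as in the simulation described above.
clean = np.full((256, 256), 300.0)
noisy = add_detector_noise(clean)
```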

For a quantitative comparison between our AD-based framework and mPIE, we run them on the same graphics card (NVIDIA GeForce RTX 2070) using the synthetic dataset. To this end, we use the GPU-enabled mPIE implementation from Ref. [6] and measure both reconstruction quality and speed. It must be stressed that the computational time needed for one iteration in mPIE and for one epoch of our AD-based framework can be different. Therefore, to provide a meaningful comparison, we measure the reconstruction speed in time units. In order to quantify reconstruction quality, a useful figure of merit for comparing complex functions is the complex correlation coefficient $C$ defined by

$$C = \frac{\sum_{\textbf{r}} O_1(\textbf{r}) \, O_2^*(\textbf{r})}{\sqrt{\left(\sum_{\textbf{r}} \left|O_1(\textbf{r})\right|^2\right)\left(\sum_{\textbf{r}} \left|O_2(\textbf{r})\right|^2\right)}},$$

where $O_1$ and $O_2$ denote the reconstructed and ground-truth complex-valued images, respectively.
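This figure of merit can be sketched in NumPy (our own function name). Note that $|C| = 1$ whenever the two images differ only by a global phase and a scaling factor, which is the inherent ambiguity of ptychographic reconstructions:

```python
import numpy as np

def correlation_magnitude(a, b):
    """|C| for two complex-valued images of equal shape."""
    numerator = np.vdot(b, a)  # sum over r of a(r) * conj(b(r))
    denominator = np.linalg.norm(a) * np.linalg.norm(b)
    return abs(numerator) / denominator
```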

The performance of mPIE and of our AD-based framework generally depends on the choice of hyperparameters. We choose hyperparameters for mPIE according to the values suggested in Ref. [22]. For our AD-based framework, we can control the learning rate of the optimizer, which determines the step size at each epoch while moving toward a minimum of the objective function. In Fig. 3, we show the magnitude of the correlation coefficient as a function of the reconstruction time for different learning rates. We also include ePIE for reference. Using a learning rate of $0.01$, our AD-based framework converges robustly but is significantly outperformed by mPIE in terms of reconstruction speed. Using a learning rate of $0.04$, our AD-based framework performs comparably to mPIE. Higher learning rates become numerically unstable and do not converge to a reasonable estimate of the ground truth. Both algorithms converge rapidly within the first 60 s. After this time, we observe a decrease of the reconstruction performance in our AD-based framework due to stronger overfitting to noise in comparison to mPIE. For long reconstruction times of more than 200 s, noise overfitting also occurs in mPIE, but to a significantly smaller extent. The influence of noise raises a relevant question for future research on AD-based ptychography. In Refs. [36] and [37], different noise models and mitigation strategies such as regularization and maximum-likelihood refinement have already been explored for coherent diffraction imaging. It was also shown that an adaptive step size strategy can improve the robustness to noise in ptychography [38]. The conceptually similar approach of gradually decreasing the learning rate in our AD-based framework does not provide an immediate solution to the overfitting phenomenon and comes at the cost of introducing another arbitrary hyperparameter, namely the decay rate of the learning rate. A comparison of the reconstruction qualities obtained separately for the amplitude and phase images is provided in Fig. S1 of Supplement 1.
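The gradual decrease of the learning rate mentioned above can be expressed as a Keras schedule; a sketch with illustrative values (not the values explored in this work):

```python
import tensorflow as tf

# Exponential learning-rate decay: lr(step) = 0.04 * 0.9 ** (step / 50).
schedule = tf.keras.optimizers.schedules.ExponentialDecay(
    initial_learning_rate=0.04, decay_steps=50, decay_rate=0.9)
optimizer = tf.keras.optimizers.Adam(learning_rate=schedule)
```

The decay rate itself becomes an additional hyperparameter, which is exactly the drawback discussed above.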

In conventional ptychography, $\boldsymbol {\theta }$ usually comprises the pixel values for $O(\textbf {r})$ and $P(\textbf {r})$ as originally introduced in the ePIE algorithm [21]. However, one important strength of our framework is that we can choose $\boldsymbol {\theta }$ relatively freely, e.g., by incorporating experimental parameters into the forward model. We demonstrate this flexibility by estimating and correcting the axial distance $z$ between the object and detector, similar to the recently shown autofocusing in ptychography using a sharpness metric and an algorithm called zPIE in Ref. [39]. By using the angular spectrum method in our AD framework, the forward model becomes explicitly dependent on $z$. Therefore, we can easily define $z$ as a trainable parameter of the model in the same way as we defined the pixel values of $O(\textbf {r})$ or $P(\textbf {r})$. To illustrate this approach, we initialize the reconstruction algorithm with an error of 15 mm in the axial distance $z$. As shown in Fig. 4, our AD-based framework is able to find the true value of $z$ with submillimeter precision in approximately 15 s ($35$ epochs), in addition to the pixel values of $O(\textbf {r})$ and $P(\textbf {r})$. Note that we only obtain this result by using a precalibrated probe.
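Making $z$ trainable can be sketched in TensorFlow as follows. Geometry values are borrowed from the simulation section, the function names are ours, and the "true" distance and the 15 mm initial error are set up artificially for illustration:

```python
import math
import tensorflow as tf

tf.random.set_seed(0)
wavelength = 561e-9   # m
dx = 6.45e-6          # m, pixel size
n = 64

def propagate(field, z):
    """Angular spectrum propagation, differentiable with respect to z."""
    k = tf.concat([tf.range(0, n // 2), tf.range(-(n // 2), 0)], axis=0)
    fx = tf.cast(k, tf.float32) / (n * dx)           # spatial frequencies
    fxx, fyy = tf.meshgrid(fx, fx)
    arg = tf.maximum(1.0 / wavelength**2 - fxx**2 - fyy**2, 0.0)
    phase = 2.0 * math.pi * z * tf.sqrt(arg)
    kernel = tf.complex(tf.cos(phase), tf.sin(phase))
    return tf.signal.ifft2d(tf.signal.fft2d(field) * kernel)

field = tf.complex(tf.random.normal((n, n)), tf.random.normal((n, n)))
measured = tf.abs(propagate(field, tf.constant(35e-3)))  # "true" distance

z = tf.Variable(50e-3)  # initial guess, 15 mm off
with tf.GradientTape() as tape:
    loss = tf.reduce_mean((tf.abs(propagate(field, z)) - measured) ** 2)
grad = tape.gradient(loss, z)  # z receives a gradient like any pixel value
```

Because the forward model is an explicit function of `z`, AD backpropagates through the propagation kernel, and the optimizer can update `z` alongside the object and probe pixels.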

## 4. Experimental data

In order to test our AD-based framework on experimental data, we study a biological sample. We acquire a near-infrared ptychographic scan of a histological slice of a mouse cerebellum. The sample is illuminated with a focused supercontinuum laser source that is spectrally filtered to a wavelength of $\lambda = 708.9$ nm (filter bandwidth, $0.6$ nm). The sample is mounted on an XY translation stage (2x SmarAct SLC-1770-D-S) with a sample-detector distance of $z = 34.95$ mm. The XY translation stage is driven by piezoelectric actuators with a positioning accuracy on the order of tens of nanometers, making it unnecessary to employ position correction methods. A charge-coupled device (CCD) camera (AVT Prosilica, $1456\times 1456$ pixels with a pixel size of 4.54 µm) is used to collect 1824 diffraction patterns downstream of the specimen. Reconstructions are carried out in the same way as described above for the synthetic dataset. The resulting reconstructions obtained with our AD-based framework and mPIE, shown in Fig. 5, can be visually judged to be of similar quality and resolution. A more detailed assessment of the resolution achieved with both algorithms is given in Fig. S4 of Supplement 1. However, our AD-based framework is more prone to the appearance of high-frequency components in the reconstructed image. This effect, which is possibly due to overfitting to noise, could likely be mitigated by the use of regularization procedures.

## 5. Conclusions

To conclude, we have evaluated the performance of a ptychographic reconstruction framework based on automatic differentiation. Using numerical simulations, we have shown that our AD-based framework performs comparably to the state-of-the-art algorithm mPIE in terms of reconstruction speed and quality. Furthermore, we have shown that the flexibility of the forward model in our AD-based framework can readily be utilized to estimate experimental parameters in addition to the pixel values of the probe and object. As an example, we have successfully corrected the reconstruction distance. Lastly, we have experimentally demonstrated that our framework faithfully reconstructs a biological specimen with a large space-bandwidth product. We believe that the presented results are important for establishing optimization frameworks based on AD as viable methods within the field of coherent diffraction imaging. Moreover, we can expect AD-based techniques to further improve thanks to the fast-paced progress in machine-learning toolboxes like TensorFlow and in computer hardware such as application-specific integrated circuits (e.g., tensor processing units [40]).

## Funding

Netherlands Organization for Scientific Research NWO (Vici 68047618 and Perspective P16-08).

## Acknowledgments

The authors thank Cees de Kok for IT support and Karl Schilling for providing the biological specimen.

## Disclosures

The authors declare no conflicts of interest.

See Supplement 1 for supporting content.

## References

**1. **J. Rodenburg and A. Maiden, “Ptychography,” in *Springer Handbook of Microscopy*, P. W. Hawkes and J. C. H. Spence, eds. (Springer International Publishing, Cham, 2019).

**2. **P. Thibault and A. Menzel, “Reconstructing state mixtures from diffraction measurements,” Nature **494**(7435), 68–71 (2013). [CrossRef]

**3. **A. C. Hurst, T. B. Edo, T. Walther, F. Sweeney, and J. M. Rodenburg, “Probe position recovery for ptychographical imaging,” J. Phys.: Conf. Ser. **241**, 012004 (2010). [CrossRef]

**4. **A. M. Maiden, M. J. Humphry, M. C. Sarahan, B. Kraus, and J. M. Rodenburg, “An annealing algorithm to correct positioning errors in ptychography,” Ultramicroscopy **120**, 64–72 (2012). [CrossRef]

**5. **P. Dwivedi, A. P. Konijnenberg, S. F. Pereira, and H. P. Urbach, “Lateral position correction in ptychography using the gradient of intensity patterns,” Ultramicroscopy **192**, 29–36 (2018). [CrossRef]

**6. **L. Loetgering, M. Rose, K. Keskinbora, M. Baluktsian, G. Dogan, U. Sanli, I. Bykova, M. Weigand, G. Schütz, and T. Wilhein, “Correction of axial position uncertainty and systematic detector errors in ptychographic diffraction imaging,” Opt. Eng. **57**(08), 1 (2018). [CrossRef]

**7. **A. M. Maiden, M. J. Humphry, and J. M. Rodenburg, “Ptychographic transmission microscopy in three dimensions using a multi-slice approach,” J. Opt. Soc. Am. A **29**(8), 1606–1614 (2012). [CrossRef]

**8. **E. H. R. Tsai, I. Usov, A. Diaz, A. Menzel, and M. Guizar-Sicairos, “X-ray ptychography with extended depth of field,” Opt. Express **24**(25), 29089–29108 (2016). [CrossRef]

**9. **L. Tian and L. Waller, “3D intensity and phase imaging from light field measurements in an LED array microscope,” Optica **2**(2), 104–111 (2015). [CrossRef]

**10. **J. Lim, A. B. Ayoub, E. E. Antoine, and D. Psaltis, “High-fidelity optical diffraction tomography of multiple scattering samples,” Light: Sci. Appl. **8**(1), 82 (2019). [CrossRef]

**11. **R. Horstmeyer, J. Chung, X. Ou, G. Zheng, and C. Yang, “Diffraction tomography with fourier ptychography,” Optica **3**(8), 827–835 (2016). [CrossRef]

**12. **P. C. Konda, L. Loetgering, K. C. Zhou, S. Xu, A. R. Harvey, and R. Horstmeyer, “Fourier ptychography: current applications and future promises,” Opt. Express **28**(7), 9603–9630 (2020). [CrossRef]

**13. **B. K. Chen, P. Sidorenko, O. Lahav, O. Peleg, and O. Cohen, “Multiplexed single-shot ptychography,” Opt. Lett. **43**(21), 5379–5382 (2018). [CrossRef]

**14. **P. Sidorenko and O. Cohen, “Single-shot ptychography,” Optica **3**(1), 9–14 (2016). [CrossRef]

**15. **H. Yilmaz, E. G. van Putten, J. Bertolotti, A. Lagendijk, W. L. Vos, and A. P. Mosk, “Speckle correlation resolution enhancement of wide-field fluorescence imaging,” Optica **2**(5), 424–429 (2015). [CrossRef]

**16. **Y. Park, C. Depeursinge, and G. Popescu, “Quantitative phase imaging in biomedicine,” Nat. Photonics **12**(10), 578–589 (2018). [CrossRef]

**17. **M. Du, L. Loetgering, K. S. E. Eikema, and S. Witte, “Measuring laser beam quality, wavefronts, and lens aberrations using ptychography,” Opt. Express **28**(4), 5022–5034 (2020). [CrossRef]

**18. **J. Miao, T. Ishikawa, I. K. Robinson, and M. M. Murnane, “Beyond crystallography: diffractive imaging using coherent x-ray light sources,” Science **348**(6234), 530–535 (2015). [CrossRef]

**19. **J. R. Fienup, “Reconstruction of an object from the modulus of its fourier transform,” Opt. Lett. **3**(1), 27–29 (1978). [CrossRef]

**20. **J. M. Rodenburg and H. M. L. Faulkner, “A phase retrieval algorithm for shifting illumination,” Appl. Phys. Lett. **85**(20), 4795–4797 (2004). [CrossRef]

**21. **A. M. Maiden and J. M. Rodenburg, “An improved ptychographical phase retrieval algorithm for diffractive imaging,” Ultramicroscopy **109**(10), 1256–1262 (2009). [CrossRef]

**22. **A. Maiden, D. Johnson, and P. Li, “Further improvements to the ptychographical iterative engine,” Optica **4**(7), 736–745 (2017). [CrossRef]

**23. **Y. S. G. Nashed, T. Peterka, J. Deng, and C. Jacobsen, “Distributed automatic differentiation for ptychography,” Procedia Comput. Sci. **108**, 404–414 (2017). [CrossRef]

**24. **S. Ghosh, Y. S. G. Nashed, O. Cossairt, and A. Katsaggelos, “ADP: Automatic differentiation ptychography,” in 2018 IEEE International Conference on Computational Photography (ICCP), (2018), pp. 1–10.

**25. **S. Kandel, S. Maddali, M. Allain, S. O. Hruszkewycz, C. Jacobsen, and Y. S. G. Nashed, “Using automatic differentiation as a general framework for ptychographic reconstruction,” Opt. Express **27**(13), 18653–18672 (2019). [CrossRef]

**26. **M. Du, Y. S. G. Nashed, S. Kandel, D. Gürsoy, and C. Jacobsen, “Three dimensions, two microscopes, one code: Automatic differentiation for x-ray nanotomography beyond the depth of focus limit,” Sci. Adv. **6**(13), eaay3700 (2020). [CrossRef]

**27. **M. Abadi, A. Agarwal, P. Barham, E. Brevdo, Z. Chen, C. Citro, G. S. Corrado, A. Davis, J. Dean, M. Devin, S. Ghemawat, I. Goodfellow, A. Harp, G. Irving, M. Isard, Y. Jia, R. Jozefowicz, L. Kaiser, M. Kudlur, J. Levenberg, D. Mane, R. Monga, S. Moore, D. Murray, C. Olah, M. Schuster, J. Shlens, B. Steiner, K. Talwar, I. Sutskever, P. Tucker, V. Vanhoucke, V. Vasudevan, F. Viegas, O. Vinyals, P. Warden, M. Wattenberg, M. Wicke, Y. Yu, and X. Zheng, “Tensorflow: Large-scale machine learning on heterogeneous distributed systems,” arXiv preprint arXiv:1603.04467 (2016).

**28. **F. Chollet et al., “Keras,” https://keras.io (2015).

**29. **M. R. Kellman, E. Bostan, N. Repina, and L. Waller, “Physics-based learned design: Optimized coded-illumination for quantitative phase imaging,” IEEE Trans. Comput. Imaging **5**, 344–353 (2019). [CrossRef]

**30. **G. Barbastathis, A. Ozcan, and G. Situ, “On the use of deep learning for computational imaging,” Optica **6**(8), 921–943 (2019). [CrossRef]

**31. **J. Seifert, D. Bouchet, and A. P. Mosk, “Supplemental code for ptychography using an optimization framework based on automatic differentiation,” (2020) https://doi.org/10.6084/m9.figshare.13008155.

**32. **D. Paganin, *Coherent X-ray Optics*, vol. 21 (Oxford University Press, 2006).

**33. **J. W. Goodman, *Introduction to Fourier Optics*, 4th ed. (W. H. Freeman, Macmillan Learning, 2017).

**34. **D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv:1412.6980 (2014).

**35. **O. Bunk, M. Dierolf, S. Kynde, I. Johnson, O. Marti, and F. Pfeiffer, “Influence of the overlap parameter on the convergence of the ptychographical iterative engine,” Ultramicroscopy **108**(5), 481–487 (2008). [CrossRef]

**36. **P. Thibault and M. Guizar-Sicairos, “Maximum-likelihood refinement for coherent diffractive imaging,” New J. Phys. **14**(6), 063004 (2012). [CrossRef]

**37. **P. Godard, M. Allain, V. Chamard, and J. Rodenburg, “Noise models for low counting rate coherent diffraction imaging,” Opt. Express **20**(23), 25914–25934 (2012). [CrossRef]

**38. **C. Zuo, J. Sun, and Q. Chen, “Adaptive step-size strategy for noise-robust fourier ptychographic microscopy,” Opt. Express **24**(18), 20724–20744 (2016). [CrossRef]

**39. **L. Loetgering, M. Du, K. S. E. Eikema, and S. Witte, “zPIE: an autofocusing algorithm for ptychography,” Opt. Lett. **45**(7), 2030–2033 (2020). [CrossRef]

**40. **N. P. Jouppi, C. Young, N. Patil, D. Patterson, G. Agrawal, R. Bajwa, S. Bates, S. Bhatia, N. Boden, and A. Borchers, “In-datacenter performance analysis of a tensor processing unit,” in Proceedings of the 44th Annual International Symposium on Computer Architecture (ACM, 2017), 1–12. [CrossRef]