## Abstract

Fourier ptychography is a recently developed imaging approach for large field-of-view and high-resolution microscopy. Here we model the Fourier ptychographic forward imaging process using a convolutional neural network (CNN) and recover the complex object information in a network training process. In this approach, the input of the network is the point spread function in the spatial domain or the coherent transfer function in the Fourier domain. The object is treated as 2D learnable weights of a convolutional or a multiplication layer. The output of the network is modeled as the loss function we aim to minimize. The batch size of the network corresponds to the number of captured low-resolution images in one forward/backward pass. We use a popular open-source machine learning library, TensorFlow, for setting up the network and conducting the optimization process. We analyze the performance of different learning rates, different solvers, and different batch sizes. It is shown that a large batch size with the Adam optimizer achieves the best performance in general. To accelerate the phase retrieval process, we also discuss a strategy to implement Fourier-magnitude projection using a multiplication neural network model. Since convolution and multiplication are the two most common operations in imaging modeling, the reported approach may provide a new perspective for examining many coherent and incoherent systems. As a demonstration, we discuss the extensions of the reported networks for modeling single-pixel imaging and structured illumination microscopy (SIM). Four-frame resolution doubling is demonstrated using a neural network for SIM. The link between imaging systems and neural network modeling may enable the use of machine-learning hardware such as the neural engine and the tensor processing unit for accelerating the image reconstruction process. We have made our implementation code open-source for researchers.

© 2018 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## 1. Introduction

Many biomedical applications require imaging with both large field-of-view and high resolution at the same time. One example is whole slide imaging (WSI) in digital pathology, which converts tissue sections into digital images that can be viewed, managed, and analyzed on computer screens. To this end, Fourier ptychography (FP) is a recently developed coherent imaging approach for achieving both large field-of-view and high resolution at the same time [1–4]. This approach integrates the concepts of aperture synthesis [5–11] and phase retrieval [12–18] for recovering the complex object information. In a typical microscopy setting, FP sequentially illuminates the sample with angle-varied plane waves and uses a low numerical aperture (NA) objective lens for image acquisition. Changing the incident angle of the illumination beam results in a shift of the light field’s Fourier spectrum at the pupil plane. Therefore, part of the light field that would normally lie outside the pupil aperture can now transmit through the system and be detected by the image sensor. To recover the complex object information, FP iteratively synthesizes the captured intensity images in the Fourier space (aperture synthesis) and recovers the phase information (phase retrieval) at the same time. The final achievable resolution of FP is determined by the synthesized passband in the Fourier space. As such, FP is able to use a low-NA objective with a low magnification factor to produce a high-resolution complex object image, combining the advantages of wide field-of-view and high resolution at the same time [19–21].

The FP approach is also closely related to real-space ptychography, which is a lensless phase retrieval technique originally proposed for transmission electron microscopy [14, 22–24]. Real-space ptychography employs a confined beam for sample illumination and records the Fourier diffraction patterns as the sample is mechanically scanned to different positions. FP has a similar operating principle to real-space ptychography but switches the roles of the real space and the Fourier space using a lens [1, 20, 25]. The mechanical scanning of the sample in real-space ptychography is replaced by the angle scanning process in FP. Despite the difference in hardware implementation, many algorithmic developments of real-space ptychography can be directly applied in FP, including the sub-sampling scheme [26], the coherent-state multiplexing scheme [27, 28], the multi-slice modeling approach [29, 30], and the object-probe recovering scheme [15, 31].

In this work, we model the Fourier ptychographic forward imaging process using a feed-forward neural network and recover the complex object information in a network training process. A typical feed-forward neural network consists of an input layer and an output layer, with multiple hidden layers in between. For a typical convolutional neural network (CNN), the hidden layers consist of convolutional layers, pooling layers, fully connected layers, and normalization layers. In the network training process, a forward pass refers to the calculation of the loss function, where the input data travels through all layers and generates output values for the loss function calculation. A backward pass refers to the process of updating the learnable weights of the different layers based on the calculated loss, with the computation proceeding from the last layer backward to the first. Different gradient-descent-based algorithms can be used in the backward pass, including momentum, Nesterov accelerated gradient, Adagrad, Adadelta, RMSprop, and Adam [32]. The use of neural networks for tackling microscopy problems is a rapidly growing research field with various applications [33–37].

In our neural network models for FP, the input layer of the network is the point spread function (PSF) in the spatial domain or the coherent transfer function (CTF) in the Fourier domain. The object is treated as 2D learnable weights of a convolutional or a multiplication layer. The output of the network is modeled as the loss function we aim to minimize. The batch size of the network corresponds to the number of captured low-resolution images in one forward / backward pass. We use a popular open-source machine learning library, TensorFlow [38], for setting up the network and conducting the optimization process. We analyze the performance of different learning rates, different solvers, and different batch sizes. It is shown that a large batch size with the Adam optimizer achieves the best performance in general. To accelerate the phase retrieval process, we also discuss a strategy to implement Fourier-magnitude projection using a multiplication neural network model.

Since convolution and multiplication are the two most common operations in imaging modeling, the reported approach may provide a new perspective for examining many coherent and incoherent systems. As a demonstration, we discuss the extension of the reported networks to modeling single-pixel imaging and structured illumination microscopy. Four-frame resolution doubling is demonstrated using a neural network for structured illumination microscopy. The link between imaging systems and neural network models may enable the use of machine-learning hardware such as the neural engine (also known as an AI chip) and the tensor processing unit [39] for accelerating the image reconstruction process. We have made our implementation code open-source for interested readers.

This paper is structured as follows: in Section 2, we discuss the forward imaging model for the Fourier ptychographic imaging process and propose a CNN for modeling this process. We then analyze the performance of different learning rates, different solvers, and different batch sizes of the proposed CNN. In Section 3, we discuss a strategy to implement the Fourier-magnitude projection using a multiplication neural network model. In Section 4, we discuss the extension of the reported approach for modeling single-pixel imaging and structured illumination microscopy via CNNs. Finally, we summarize the results and discuss the future directions in Section 5.

## 2. Modelling Fourier ptychography using a convolutional neural network

The forward imaging process of FP can be expressed as

$${I}_{n}\left(x,y\right)={\left|\left(O\left(x,y\right)\cdot {e}^{i({k}_{xn}x+{k}_{yn}y)}\right)\ast PSF\left(x,y\right)\right|}^{2}, \quad (1)$$

where ‘$\cdot$’ denotes element-wise multiplication, ‘$\ast$’ denotes convolution, $O\left(x,y\right)$ denotes the complex object, ${e}^{i({k}_{xn}x+{k}_{yn}y)}$ denotes the n^{th} illumination plane wave with wave vector (${k}_{xn},{k}_{yn}$), $PSF\left(x,y\right)$ denotes the PSF of the objective lens, and ${I}_{n}\left(x,y\right)$ denotes the n^{th} intensity measurement by the image sensor. The Fourier transform of $PSF\left(x,y\right)$ is the CTF of the objective lens. For diffraction-limited imaging, we have $FT\left\{PSF\left(x,y\right)\right\}=circ({k}_{x}^{2}+{k}_{y}^{2}<{(NA\cdot {k}_{0})}^{2})$, where $FT$ denotes the Fourier transform, ${k}_{0}=2\pi /\lambda$ ($\lambda$ is the wavelength), and $circ$ is the circle function (it is 1 if the condition is met and 0 otherwise). By absorbing the tilted plane wave into the PSF, i.e., $PS{F}_{n}\left(x,y\right)=PSF\left(x,y\right)\cdot {e}^{-i({k}_{xn}x+{k}_{yn}y)}$, and writing the complex object as $O={O}_{r}+i{O}_{i}$, Eq. (1) can be rewritten as

$${I}_{n}\left(x,y\right)={\left({O}_{r}\ast PS{F}_{nr}-{O}_{i}\ast PS{F}_{ni}\right)}^{2}+{\left({O}_{r}\ast PS{F}_{ni}+{O}_{i}\ast PS{F}_{nr}\right)}^{2}, \quad (2)$$

where $PS{F}_{nr}$ and $PS{F}_{ni}$ denote the real and imaginary parts of $PS{F}_{n}\left(x,y\right)$.

The goal of the Fourier ptychographic imaging process is to recover ${O}_{r}$ and ${O}_{i}$ based on many measurements ${I}_{n}\left(x,y\right)$ (n = 1,2,3…). Since ${O}_{r}$ and ${O}_{i}$ are real numbers, we can model them as a two-channel learnable filter in a CNN.
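As an illustration, the forward imaging model above can be simulated with a few lines of NumPy. This is a sketch under assumed toy parameters (the function name, array sizes, and unit choices are ours); the convolution with the PSF is applied as a circular low-pass filter (the CTF) in the Fourier domain:

```python
import numpy as np

def fp_forward(obj, na, wavelength, pixel_size, kxn, kyn):
    """Simulate one FP measurement:
    I_n(x, y) = |(O(x, y) . exp(i(kxn*x + kyn*y))) * PSF(x, y)|^2."""
    n = obj.shape[0]
    coords = np.arange(n) * pixel_size
    X, Y = np.meshgrid(coords, coords, indexing="ij")
    tilted = obj * np.exp(1j * (kxn * X + kyn * Y))   # angle-varied illumination

    k = 2 * np.pi * np.fft.fftfreq(n, d=pixel_size)   # angular spatial frequencies
    KX, KY = np.meshgrid(k, k, indexing="ij")
    k0 = 2 * np.pi / wavelength
    ctf = (KX**2 + KY**2) < (na * k0) ** 2            # circ() pupil function

    field = np.fft.ifft2(np.fft.fft2(tilted) * ctf)   # convolution with the PSF
    return np.abs(field) ** 2                          # intensity at the sensor

# toy usage with hypothetical parameters (microns and rad/micron)
obj = np.exp(1j * np.random.rand(64, 64))
intensity = fp_forward(obj, na=0.1, wavelength=0.5, pixel_size=0.43, kxn=0.2, kyn=0.0)
```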

The proposed CNN model for the Fourier ptychographic imaging process is shown in Fig. 1. This model contains an input layer for the n^{th} PSF, a convolutional layer with two channels for the real and imaginary parts of the object, an activation layer for performing the square operation, an add layer for adding the two inputs, and an output layer for the predicted FP image. For the convolutional layer, we can choose different stride values to model the down-sampling effect of the image sensor. In our implementation, we choose a stride value of 4, i.e., the pixel size of the recovered object is one quarter of that of the low-resolution FP measurements.
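The forward pass of the network in Fig. 1 can be sketched in NumPy (a hedged illustration, assuming circular FFT-based convolution and a PSF the same size as the object; the function name is ours): convolve the two object channels with the real and imaginary parts of the n-th PSF, square each channel, add them, and down-sample with the stride.

```python
import numpy as np

def cnn_forward(o_r, o_i, psf_n, stride=4):
    """One forward pass: two-channel convolution, square activation,
    add layer, and stride-based sensor down-sampling."""
    psf_r, psf_i = np.real(psf_n), np.imag(psf_n)
    conv = lambda a, b: np.real(np.fft.ifft2(np.fft.fft2(a) * np.fft.fft2(b)))
    real_part = conv(o_r, psf_r) - conv(o_i, psf_i)   # Re{(O_r + i O_i) * PSF_n}
    imag_part = conv(o_r, psf_i) + conv(o_i, psf_r)   # Im{(O_r + i O_i) * PSF_n}
    intensity = real_part**2 + imag_part**2           # square-and-add activation
    return intensity[::stride, ::stride]              # stride-4 down-sampling
```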

The training process of this CNN model recovers the two-channel object (${O}_{r}$, ${O}_{i}$) from all captured FP images ${I}_{n}\left(x,y\right)$ (n = 1, 2, 3…). For an initial guess of the two-channel object (${O}_{r}$, ${O}_{i}$), the CNN in Fig. 1 outputs a prediction ${I}_{n\_predict}$ in a forward pass:

$${I}_{n\_predict}\left(x,y\right)={\left({O}_{r}\ast PS{F}_{nr}-{O}_{i}\ast PS{F}_{ni}\right)}^{2}+{\left({O}_{r}\ast PS{F}_{ni}+{O}_{i}\ast PS{F}_{nr}\right)}^{2},$$

with $PS{F}_{nr}$ and $PS{F}_{ni}$ the real and imaginary parts of the n^{th} PSF $PS{F}_{n}\left(x,y\right)$. The difference $diff({I}_{n},{I}_{n\_predict})$ between the measurement and the prediction is then back-propagated to the convolutional layer and the two-channel object is updated accordingly. Therefore, the training process of the CNN model can be viewed as a minimization process for the following loss function:

$$Los{s}_{intensity}=\sum _{n}\left\Vert {I}_{n}\left(x,y\right)-{I}_{n\_predict}\left(x,y\right)\right\Vert , \quad (6)$$

where $\left\Vert \cdot \right\Vert$ denotes the L1- or L2-norm over all pixels.

We first analyze the performance using simulation. Figure 2(a) shows the high-resolution object amplitude and phase. Figure 2(b) shows the CNN output for the low-resolution intensity images with different wave vectors (${k}_{xn},{k}_{yn}$). In this simulation, we use 15 by 15 plane waves for illumination and a 0.1 NA objective lens to acquire images. The step size for ${k}_{xn}$ and ${k}_{yn}$ is 0.05, and the maximum synthetic NA is ~0.64. The pixel size in this simulation is 0.43 µm for the high-resolution object and 1.725 µm for the low-resolution measurements at the object plane (assuming a magnification factor of 1). These parameters simulate a microscope platform with a 2X, 0.1 NA objective and an image sensor with a 3.45 µm pixel size.

In Fig. 3, we show the recovered results with different learning rates. The Adam optimizer is used in this simulation. This optimizer is a first-order gradient-based optimizer using adaptive estimates of lower-order moments [41]. It combines the advantages of two extensions of stochastic gradient descent, the Adaptive Gradient Algorithm (AdaGrad) and Root Mean Square Propagation (RMSProp) [41], and it is the default optimizer for many deep learning problems.

Different learning rates in Fig. 3 represent different step sizes of the gradient descent approach. We can see that a higher learning rate decays the loss faster but gets stuck at a worse final loss value. This is because there is too much ‘energy’ in the optimization process: the learnable weights bounce around chaotically and are unable to settle into a good minimum of the optimization landscape. On the other hand, a lower learning rate is able to reach a lower minimum, albeit more slowly. A straightforward approach for a better learning-rate schedule is to use a large learning rate at the beginning and reduce it as training progresses. How to schedule the learning rate for FP is an interesting topic that requires further investigation.
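The large-at-first, smaller-later idea can be sketched as a simple step-decay schedule (a hypothetical helper; the drop factor and interval are illustrative, not values used in this work):

```python
def step_decay(initial_lr, epoch, drop=0.5, epochs_per_drop=5):
    """Start with a large learning rate and halve it every few epochs."""
    return initial_lr * drop ** (epoch // epochs_per_drop)
```

For example, with `initial_lr=0.1` the rate drops to 0.05 after five epochs and 0.025 after ten.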

In Fig. 4, we compare the performance of different optimizers in TensorFlow and show their corresponding recovered results. We note that all optimizers give similar results if the step size for ${k}_{xn}$ and ${k}_{yn}$ is small (i.e., the aperture overlap is large in the Fourier domain). In this simulation, we use 5 by 5 plane-wave illumination with a step size of 0.15 for ${k}_{xn}$ and ${k}_{yn}$. Other parameters are the same as before. Figure 4 shows that Adam achieves the best performance and stochastic gradient descent (SGD) is the worst among the four. Stochastic gradient descent with momentum (SGDM) is the second-best option. This justifies the addition of momentum in the recent ptychographical iterative engine [42].

In Fig. 5, we investigate the effect of different batch sizes on the optimization process. We can see that a batch size of 1 gives the best performance in Fig. 5(a). This justifies the stochastic gradient descent scheme used in the extended ptychographic iterative engine (ePIE) [15].

However, one advantage of the TensorFlow library is parallel processing via a graphics processing unit (GPU) or tensor processing unit (TPU). As a reference point, a modern GPU can handle hundreds of images in one batch, so the processing time for a large batch is about the same as that for a batch size of 1. For example, the batch size is 1 in Fig. 5(a) and the epoch number is 20; therefore, we update the object 225 × 20 = 4500 times in this simulation. On the other hand, the batch size is 64 in Fig. 5(d) and the epoch number is 20; therefore, we update the object only about (225/64) × 20 ≈ 70 times for this figure. We define the ‘number of updating times’ as the number of batches per epoch multiplied by the number of epochs. This ‘number of updating times’ is directly related to the processing time of the recovery process.
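The arithmetic above can be written as a one-line helper (the function name is ours; we assume a partial final batch still counts as one update, which rounds 225/64 up to 4 batches per epoch):

```python
import math

def num_updates(n_images, batch_size, epochs):
    """Number of object updates: batches per epoch times number of epochs."""
    return math.ceil(n_images / batch_size) * epochs

serial = num_updates(225, 1, 20)     # ePIE-style serial updating: 4500 updates
parallel = num_updates(225, 64, 20)  # large-batch parallel updating: 80 updates
```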

In Figs. 6(a)-6(d), we show the recovered results with the same number of updating times. In this case, we can see that a large batch size leads to better performance. Based on Figs. 5 and 6, we can draw the following conclusion: a batch size of 1 is preferred for serial operation on a CPU, while a large batch size is preferred for parallel operation on a GPU or TPU.

## 3. Modelling Fourier-magnitude projection in neural network

All widely used iterative phase retrieval algorithms have at their core an operation termed Fourier-magnitude projection (FMP), where an exit complex wave estimate is updated by replacing its Fourier magnitude with the measured data while keeping the phase untouched. In this section, we discuss the implementation of FMP in neural network modeling. The motivation is to implement many existing phase retrieval algorithms via neural network training processes. We demonstrate it with FP, and it can be easily extended to other phase retrieval problems.

In the Fourier ptychographic imaging process, the exit complex wave $\widehat{{\phi }_{n}}\left({k}_{x},{k}_{y}\right)$ in the Fourier domain can be expressed as

$$\widehat{{\phi }_{n}}\left({k}_{x},{k}_{y}\right)=\widehat{O}\left({k}_{x},{k}_{y}\right)\cdot {\text{CTF}}_{n}\left({k}_{x},{k}_{y}\right), \quad (7)$$

where $\widehat{O}\left({k}_{x},{k}_{y}\right)$ is the Fourier spectrum of the object and ${\text{CTF}}_{n}\left({k}_{x},{k}_{y}\right)$ is the Fourier transform of the n^{th} PSF $PS{F}_{n}(x,y)$. The FMP operation for the exit complex wave $\widehat{{\phi }_{n}}\left({k}_{x},{k}_{y}\right)$ can be written as

$$\widehat{{\phi }_{n}}^{FMP}\left({k}_{x},{k}_{y}\right)=FT\left\{\sqrt{{I}_{n}\left(x,y\right)}\cdot \frac{{\phi }_{n}\left(x,y\right)}{\left|{\phi }_{n}\left(x,y\right)\right|}\right\}, \quad (8)$$

where ${\phi }_{n}\left(x,y\right)=F{T}^{-1}\left\{\widehat{{\phi }_{n}}\left({k}_{x},{k}_{y}\right)\right\}$ is the exit wave in the spatial domain. The measured amplitude $\sqrt{{I}_{n}}$ replaces the magnitude of the exit wave while its phase is kept untouched.
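The FMP operation can be sketched in a few lines of NumPy (a hedged illustration; the function name and the small constant guarding against division by zero are ours):

```python
import numpy as np

def fourier_magnitude_projection(exit_spectrum, measured_intensity):
    """Transform the exit-wave spectrum to the spatial domain, replace its
    magnitude with the measured amplitude sqrt(I_n) while keeping the phase
    untouched, and transform back to the Fourier domain."""
    exit_wave = np.fft.ifft2(exit_spectrum)
    phase = exit_wave / (np.abs(exit_wave) + 1e-12)   # unit-modulus phase factor
    return np.fft.fft2(np.sqrt(measured_intensity) * phase)
```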

In many ptychographic phase retrieval schemes [15, 19, 31, 42, 43], it is common to perform the recovery process by dividing the optimization problem into two sub-tasks: 1) perform an FMP to update the exit wave, and 2) back-propagate the difference between the updated exit wave and the original exit wave to update the object and/or the illumination probe. For the Fourier ptychographic imaging process, we can minimize the following loss function after the FMP process in Eq. (8):

$$Los{s}_{exit\text{ }wave}=\sum _{n}\left\Vert \widehat{{\phi }_{n}}^{FMP}\left({k}_{x},{k}_{y}\right)-\widehat{{\phi }_{n}}\left({k}_{x},{k}_{y}\right)\right\Vert . \quad (9)$$

The neural network for minimizing Eq. (9) is shown in Fig. 7, where we model the object’s Fourier spectrum as learnable weights of a multiplication layer. Since the learnable weights need to be real in the TensorFlow implementation, we separate the complex object spectrum $\widehat{O}\left({k}_{x},{k}_{y}\right)$ into two channels, with subscripts ‘r’ and ‘i’ representing the real and imaginary parts in Fig. 7. The input layer for this network consists of the n^{th} ${\text{CTF}}_{n}$ and the captured low-resolution amplitude $\sqrt{{I}_{n}}$. We use a Lambda layer in TensorFlow to define the Fourier-magnitude-projection operation. The target output of the network is 0, and thus the training process of the network minimizes the loss function defined in Eq. (9). Once the complex object spectrum $\widehat{O}\left({k}_{x},{k}_{y}\right)$ is recovered in the network training process, we can perform an inverse Fourier transform to get the complex object $O\left(x,y\right)$ in the spatial domain.

In Fig. 8, we compare the recovered results in four cases: 1) minimizing the loss function in Eq. (6) with the L2-norm (‘L2 intensity’ in Fig. 8(a)), 2) minimizing the loss function in Eq. (6) with the L1-norm (‘L1 intensity’ in Fig. 8(b)), 3) minimizing the loss function in Eq. (9) with the L2-norm (‘L2 exit wave’ in Fig. 8(c)), and 4) minimizing the loss function in Eq. (9) with the L1-norm (‘L1 exit wave’ in Fig. 8(d)). We quantify the performance of the different approaches using the mean square error (MSE) in Fig. 8(e). We can see that the ‘L1 intensity’ and ‘L2 exit wave’ cases give the best results. In particular, ‘L2 exit wave’ converges faster in the first few epochs while ‘L1 intensity’ reaches a lower MSE with more iterations. We also note that the intensity-updating cases tend to recover the low-resolution features first while the exit-wave-updating cases tend to recover features at all levels. This behavior can be explained by the loss functions in Eqs. (6) and (9). The loss function in Eq. (6) reduces the difference between two intensity images in the spatial domain. Therefore, it tends to correct the low-frequency difference first, since most of the energy concentrates in this region. On the other hand, the loss function in Eq. (9) reduces the difference between two Fourier spectra and does not focus on the low-frequency regions. As such, the resolution improvement is more obvious for the exit-wave-updating cases shown in Fig. 8(c).

In Fig. 9, we test the L2-norm exit-wave network using experimental data. Figure 9(a) shows the experimental setup, where we use a 2X, 0.1 NA Nikon objective with a 200 mm tube lens (Thorlabs TTL200) to build a microscope platform. A 5-megapixel camera (BFS-U3-51S5M-C) with a 3.45 µm pixel size is used to acquire the intensity images. We use an LED array (Adafruit 32 by 32 RGB LED matrix) to illuminate the sample from different incident angles; the distance between the LED array and the sample is ~85 mm. In this experiment, we illuminate the sample from 15 by 15 different incident angles and the corresponding maximum synthetic NA is ~0.55. Figures 9(b1)-9(d1) show the low-resolution images captured by the microscope platform in Fig. 9(a). We use the L2-norm exit-wave network with the Adam optimizer to recover the complex object spectrum in the multiplication layer. The batch size in this experiment is 1 and we use 20 epochs in the network training (optimization) process. The recovered object intensity images are shown in Figs. 9(b2)-9(d2) and the recovered phase images are shown in Figs. 9(b3)-9(d3). As a comparison, we also show the standard FPM reconstructions [19] in Figs. 9(b4)-9(d4) and 9(b5)-9(d5). Figure 9 validates the effectiveness of the reported neural network models.

## 4. Extensions for single-pixel imaging and structured illumination microscopy

In this section, we extend the network models discussed above to single-pixel imaging and structured illumination microscopy. Single-pixel imaging captures images using a single-pixel detector. It enables imaging in a variety of situations that are impossible or challenging with conventional 2D image sensors [44–46]. The forward imaging process of single-pixel imaging can be expressed as

$${I}_{n}=\sum _{x,y}O\left(x,y\right)\cdot {P}_{n}\left(x,y\right), \quad (10)$$

where $O\left(x,y\right)$ denotes the 2D object, ${P}_{n}\left(x,y\right)$ denotes the n^{th} 2D illumination pattern on the object, and ${I}_{n}$ denotes the n^{th} single-pixel measurement. The summation sign in Eq. (10) represents the signal summation over the x-y plane. Since the dimensions of the object and the pattern are the same, the forward imaging model in Eq. (10) can be modeled by a ‘valid’ convolutional layer which outputs a single predicted single-pixel measurement. The CNN model for single-pixel imaging is shown in Fig. 10(a). The training of this model minimizes the following loss function:

$$Loss=\sum _{n}\left\Vert {I}_{n}-{I}_{n\_predict}\right\Vert .$$
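The single-pixel forward model is a one-liner in NumPy (a sketch; the function name is ours). Summing the element-wise product of the object and the pattern is exactly what a ‘valid’ convolution with a kernel of the object’s size computes, so the output is a single value:

```python
import numpy as np

def single_pixel_measurement(obj, pattern):
    """One single-pixel measurement: I_n = sum over x, y of O(x, y) . P_n(x, y)."""
    return float(np.sum(obj * pattern))
```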

Structured illumination microscopy (SIM) uses non-uniform patterns for sample illumination and combines multiple acquisitions for super-resolution image recovery [47–50]. Frequency mixing between the sample and the non-uniform illumination pattern modulates the high-frequency components into the passband of the collection optics. Therefore, the recorded images contain sample information that is beyond the limit of the employed optics. Conventional SIM employs sinusoidal patterns for sample illumination. In a typical implementation, three different lateral phase shifts (0, 2π/3, 4π/3) are needed for each orientation of the sinusoidal pattern, and three different orientations are needed to double the bandwidth isotropically in the Fourier domain. Therefore, nine acquisitions are needed.

The forward imaging process of SIM can be expressed as

$${I}_{n}\left(x,y\right)=\left(O\left(x,y\right)\cdot {P}_{n}\left(x,y\right)\right)\ast PSF\left(x,y\right),$$

where $O\left(x,y\right)$ denotes the 2D object, ${P}_{n}\left(x,y\right)$ denotes the n^{th} 2D illumination pattern on the object, $PSF$ denotes the PSF of the objective lens, and ${I}_{n}\left(x,y\right)$ denotes the n^{th} 2D image measurement. The CNN model for SIM is shown in Fig. 10(b), where the object is modeled as learnable weights of a multiplication layer. The training of this model minimizes the following loss function:

$$Loss=\sum _{n}\left\Vert {I}_{n}\left(x,y\right)-{I}_{n\_predict}\left(x,y\right)\right\Vert .$$
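The SIM forward model, multiplication by the illumination pattern followed by convolution with the PSF, can be sketched in NumPy (a hedged illustration; the function name is ours and circular FFT-based convolution is assumed):

```python
import numpy as np

def sim_measurement(obj, pattern, psf):
    """One SIM measurement: I_n = (O . P_n) * PSF, i.e., a multiplication
    layer (frequency mixing) followed by a convolutional layer."""
    modulated = obj * pattern   # frequency mixing with the structured pattern
    return np.real(np.fft.ifft2(np.fft.fft2(modulated) * np.fft.fft2(psf)))
```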

## 5. Summary and discussion

In summary, we model the Fourier ptychographic forward imaging process using a convolutional neural network and recover the complex object information in the network training process. In our approach, the object is treated as 2D learnable weights of a convolutional or a multiplication layer. The output of the network is modeled as the loss function we aim to minimize. The batch size of the network corresponds to the number of captured low-resolution images in one forward / backward pass. We use the popular open-source machine learning library, TensorFlow, for setting up the network and conducting the optimization process. We show that the Adam optimizer achieves the best performance in general and a large batch size is preferred for GPU / TPU-based parallel processing.

Another contribution of our work is to model the Fourier-magnitude projection via a neural network model. The Fourier-magnitude projection is the most important operation in iterative phase retrieval algorithms. Based on our model, we can easily perform exit-wave-based optimization using TensorFlow. We show that L2-norm is preferred for exit-wave-based optimization while L1-norm is preferred for intensity-based optimization.

Since convolution and multiplication are the two most common operations in imaging modeling, the reported approach may provide a new perspective for examining many coherent and incoherent systems. As a demonstration, we discuss the extensions of the reported networks for modeling single-pixel imaging and structured illumination microscopy. We show that single-pixel imaging can be modeled by a convolutional layer implementing ‘valid’ convolution. For structured illumination microscopy, we propose a network model with one multiplication layer and one convolutional layer. In particular, we demonstrate 4-frame resolution doubling via the proposed CNN. Since the proposed network models can be implemented on neural engines and TPUs, we envision many opportunities for accelerating the image reconstruction process via machine-learning hardware.

There are many future directions for this work. First, we can implement the CTF updating scheme in the proposed neural network models. One solution is to use the incident wave vector as an input; we can then convolve the CTF with $\delta ({k}_{xn},{k}_{yn})$ to generate ${\text{CTF}}_{n}$. In this case, we can model the CTF as learnable weights in a convolutional layer, and it can be updated in the network training process. Second, correcting positional errors is an important topic for real-space ptychographic imaging. The positional errors in real-space ptychography are equivalent to the errors of the incident wave vectors in FP. We can, for example, model (${k}_{xn},{k}_{yn}$) as learnable weights in a layer so that they can be updated in the network training process. Similarly, we can also generate the CTF based on coefficients of different Zernike modes and model these coefficients as learnable weights. Third, the proposed network models are developed for one coherent state. It is straightforward to extend our networks to model multi-state cases. Fourth, we use a fixed learning rate in our models. How to schedule the learning rate for faster convergence is an interesting topic that requires further investigation. Fifth, we can add regularization terms, such as a total-variation loss, to the model to better handle measurement noise.

We provide our implementation code in the format of a Jupyter notebook (Code 1, [52]).

## Funding

NSF (1510077, 1555986, 1700941), NIH (R21EB022378, R03EB022144).

## Disclosures

G. Zheng has a conflict of interest with Clearbridge Biophotonics and Instant Imaging Tech, which did not support this work.

## References and links

**1. **G. Zheng, R. Horstmeyer, and C. Yang, “Wide-field, high-resolution Fourier ptychographic microscopy,” Nat. Photonics **7**(9), 739–745 (2013). [CrossRef] [PubMed]

**2. **X. Ou, R. Horstmeyer, C. Yang, and G. Zheng, “Quantitative phase imaging via Fourier ptychographic microscopy,” Opt. Lett. **38**(22), 4845–4848 (2013). [CrossRef] [PubMed]

**3. **L. Tian, X. Li, K. Ramchandran, and L. Waller, “Multiplexed coded illumination for Fourier Ptychography with an LED array microscope,” Biomed. Opt. Express **5**(7), 2376–2389 (2014). [CrossRef] [PubMed]

**4. **K. Guo, S. Dong, and G. Zheng, “Fourier ptychography for brightfield, phase, darkfield, reflective, multi-slice, and fluorescence imaging,” IEEE J. Sel. Top. Quantum Electron. **22**(4), 1–12 (2016). [CrossRef]

**5. **V. Mico, Z. Zalevsky, P. García-Martínez, and J. García, “Synthetic aperture superresolution with multiple off-axis holograms,” J. Opt. Soc. Am. A **23**(12), 3162–3170 (2006). [CrossRef] [PubMed]

**6. **J. Di, J. Zhao, H. Jiang, P. Zhang, Q. Fan, and W. Sun, “High resolution digital holographic microscopy with a wide field of view based on a synthetic aperture technique and use of linear CCD scanning,” Appl. Opt. **47**(30), 5654–5659 (2008). [CrossRef] [PubMed]

**7. **T. R. Hillman, T. Gutzler, S. A. Alexandrov, and D. D. Sampson, “High-resolution, wide-field object reconstruction with synthetic aperture Fourier holographic optical microscopy,” Opt. Express **17**(10), 7873–7892 (2009). [CrossRef] [PubMed]

**8. **L. Granero, V. Micó, Z. Zalevsky, and J. García, “Synthetic aperture superresolved microscopy in digital lensless Fourier holography by time and angular multiplexing of the object information,” Appl. Opt. **49**(5), 845–857 (2010). [CrossRef] [PubMed]

**9. **T. Gutzler, T. R. Hillman, S. A. Alexandrov, and D. D. Sampson, “Coherent aperture-synthesis, wide-field, high-resolution holographic microscopy of biological tissue,” Opt. Lett. **35**(8), 1136–1138 (2010). [CrossRef] [PubMed]

**10. **A. B. Meinel, “Aperture synthesis using independent telescopes,” Appl. Opt. **9**(11), 2501 (1970). [CrossRef] [PubMed]

**11. **T. Turpin, L. Gesell, J. Lapides, and C. Price, “Theory of the synthetic aperture microscope,” Proc. SPIE **2566**, 230–240 (1995). [CrossRef]

**12. **J. R. Fienup, “Phase retrieval algorithms: a comparison,” Appl. Opt. **21**(15), 2758–2769 (1982). [CrossRef] [PubMed]

**13. **V. Elser, “Phase retrieval by iterated projections,” J. Opt. Soc. Am. A **20**(1), 40–55 (2003). [CrossRef] [PubMed]

**14. **H. M. L. Faulkner and J. M. Rodenburg, “Movable aperture lensless transmission microscopy: a novel phase retrieval algorithm,” Phys. Rev. Lett. **93**(2), 023903 (2004). [CrossRef] [PubMed]

**15. **A. M. Maiden and J. M. Rodenburg, “An improved ptychographical phase retrieval algorithm for diffractive imaging,” Ultramicroscopy **109**(10), 1256–1262 (2009). [CrossRef] [PubMed]

**16. **R. Gonsalves, “Phase retrieval from modulus data,” JOSA **66**(9), 961–964 (1976). [CrossRef]

**17. **J. R. Fienup, “Reconstruction of a complex-valued object from the modulus of its Fourier transform using a support constraint,” JOSA A **4**(1), 118–123 (1987). [CrossRef]

**18. **I. Waldspurger, A. d’Aspremont, and S. Mallat, “Phase recovery, maxcut and complex semidefinite programming,” Math. Program. **149**(1-2), 47–81 (2015). [CrossRef]

**19. **X. Ou, G. Zheng, and C. Yang, “Embedded pupil function recovery for Fourier ptychographic microscopy,” Opt. Express **22**(5), 4960–4972 (2014). [CrossRef] [PubMed]

**20. **G. Zheng, “Breakthroughs in photonics 2013: Fourier ptychographic imaging,” IEEE Photonics J. **6**, 1–7 (2014).

**21. **G. Zheng, X. Ou, R. Horstmeyer, J. Chung, and C. Yang, “Fourier ptychographic microscopy: a gigapixel superscope for biomedicine,” Optics and Photonics News **25**(4), 26–33 (2014).

**22. **W. Hoppe and G. Strube, “Diffraction in inhomogeneous primary wave fields. 2. Optical experiments for phase determination of lattice interferences,” Acta Crystallogr. A **25**, 502–507 (1969). [CrossRef]

**23. **J. M. Rodenburg, A. C. Hurst, A. G. Cullis, B. R. Dobson, F. Pfeiffer, O. Bunk, C. David, K. Jefimovs, and I. Johnson, “Hard-x-ray lensless imaging of extended objects,” Phys. Rev. Lett. **98**(3), 034801 (2007). [CrossRef] [PubMed]

**24. **J. Rodenburg, “Ptychography and related diffractive imaging methods,” Adv. Imaging Electron Phys. **150**, 87–184 (2008). [CrossRef]

**25. **R. Horstmeyer and C. Yang, “A phase space model of Fourier ptychographic microscopy,” Opt. Express **22**(1), 338–358 (2014). [CrossRef] [PubMed]

**26. **T. B. Edo, D. J. Batey, A. M. Maiden, C. Rau, U. Wagner, Z. D. Pešić, T. A. Waigh, and J. M. Rodenburg, “Sampling in x-ray ptychography,” Phys. Rev. A **87**(5), 053850 (2013). [CrossRef]

**27. **D. J. Batey, D. Claus, and J. M. Rodenburg, “Information multiplexing in ptychography,” Ultramicroscopy **138**, 13–21 (2014). [CrossRef] [PubMed]

**28. **P. Thibault and A. Menzel, “Reconstructing state mixtures from diffraction measurements,” Nature **494**(7435), 68–71 (2013). [CrossRef] [PubMed]

**29. **A. M. Maiden, M. J. Humphry, and J. M. Rodenburg, “Ptychographic transmission microscopy in three dimensions using a multi-slice approach,” J. Opt. Soc. Am. A **29**(8), 1606–1614 (2012). [CrossRef] [PubMed]

**30. **T. M. Godden, R. Suman, M. J. Humphry, J. M. Rodenburg, and A. M. Maiden, “Ptychographic microscope for three-dimensional imaging,” Opt. Express **22**(10), 12513–12523 (2014). [CrossRef] [PubMed]

**31. **P. Thibault, M. Dierolf, O. Bunk, A. Menzel, and F. Pfeiffer, “Probe retrieval in ptychographic coherent diffractive imaging,” Ultramicroscopy **109**(4), 338–343 (2009). [CrossRef] [PubMed]

**32. **S. Ruder, “An overview of gradient descent optimization algorithms,” arXiv preprint arXiv:1609.04747 (2016).

**33. **Y. Rivenson, Z. Göröcs, H. Günaydin, Y. Zhang, H. Wang, and A. Ozcan, “Deep learning microscopy,” Optica **4**(11), 1437–1443 (2017). [CrossRef]

**34. **Y. Rivenson, Y. Zhang, H. Günaydın, D. Teng, and A. Ozcan, “Phase recovery and holographic image reconstruction using deep learning in neural networks,” Light Sci. Appl. **7**(2), 17141 (2018). [CrossRef]

**35. **Y. Rivenson, H. Ceylan Koydemir, H. Wang, Z. Wei, Z. Ren, H. Günaydın, Y. Zhang, Z. Göröcs, K. Liang, D. Tseng, and A. Ozcan, “Deep learning enhanced mobile-phone microscopy,” ACS Photonics **5**(6), 2354–2364 (2018). [CrossRef]

**36. **U. S. Kamilov, I. N. Papadopoulos, M. H. Shoreh, A. Goy, C. Vonesch, M. Unser, and D. Psaltis, “Learning approach to optical tomography,” Optica **2**(6), 517–522 (2015). [CrossRef]

**37. **S. Jiang, J. Liao, Z. Bian, K. Guo, Y. Zhang, and G. Zheng, “Transform- and multi-domain deep learning for single-frame rapid autofocusing in whole slide imaging,” Biomed. Opt. Express **9**(4), 1601–1612 (2018). [CrossRef] [PubMed]

**38. **M. Abadi, P. Barham, J. Chen, Z. Chen, A. Davis, J. Dean, M. Devin, S. Ghemawat, G. Irving, and M. Isard, “TensorFlow: a system for large-scale machine learning,” in Proceedings of OSDI (2016), 265–283.

**39. **N. P. Jouppi, C. Young, N. Patil, D. Patterson, G. Agrawal, R. Bajwa, S. Bates, S. Bhatia, N. Boden, and A. Borchers, “In-datacenter performance analysis of a tensor processing unit,” in Proceedings of the 44th Annual International Symposium on Computer Architecture (ACM, 2017), 1–12. [CrossRef]

**40. **L. Bian, J. Suo, G. Zheng, K. Guo, F. Chen, and Q. Dai, “Fourier ptychographic reconstruction using Wirtinger flow optimization,” Opt. Express **23**(4), 4856–4866 (2015). [CrossRef] [PubMed]

**41. **D. P. Kingma and J. Ba, “Adam: A method for stochastic optimization,” arXiv preprint arXiv:1412.6980 (2014).

**42. **A. Maiden, D. Johnson, and P. Li, “Further improvements to the ptychographical iterative engine,” Optica **4**(7), 736–745 (2017). [CrossRef]

**43. **M. Odstrčil, A. Menzel, and M. Guizar-Sicairos, “Iterative least-squares solver for generalized maximum-likelihood ptychography,” Opt. Express **26**(3), 3108–3123 (2018). [CrossRef] [PubMed]

**44. **M. F. Duarte, M. A. Davenport, D. Takhar, J. N. Laska, T. Sun, K. F. Kelly, and R. G. Baraniuk, “Single-pixel imaging via compressive sampling,” IEEE Signal Process. Mag. **25**(2), 83–91 (2008). [CrossRef]

**45. **B. Sun, M. P. Edgar, R. Bowman, L. E. Vittert, S. Welsh, A. Bowman, and M. J. Padgett, “3D computational imaging with single-pixel detectors,” Science **340**(6134), 844–847 (2013). [CrossRef] [PubMed]

**46. **Z. Zhang, X. Ma, and J. Zhong, “Single-pixel imaging by means of Fourier spectrum acquisition,” Nat. Commun. **6**(1), 6225 (2015). [CrossRef] [PubMed]

**47. **M. G. Gustafsson, “Surpassing the lateral resolution limit by a factor of two using structured illumination microscopy,” J. Microsc. **198**(Pt 2), 82–87 (2000). [CrossRef] [PubMed]

**48. **M. G. Gustafsson, “Nonlinear structured-illumination microscopy: wide-field fluorescence imaging with theoretically unlimited resolution,” Proc. Natl. Acad. Sci. U.S.A. **102**(37), 13081–13086 (2005). [CrossRef] [PubMed]

**49. **M. G. Gustafsson, L. Shao, P. M. Carlton, C. J. Wang, I. N. Golubovskaya, W. Z. Cande, D. A. Agard, and J. W. Sedat, “Three-dimensional resolution doubling in wide-field fluorescence microscopy by structured illumination,” Biophys. J. **94**(12), 4957–4970 (2008). [CrossRef] [PubMed]

**50. **S. Dong, K. Guo, S. Jiang, and G. Zheng, “Recovering higher dimensional image data using multiplexed structured illumination,” Opt. Express **23**(23), 30393–30398 (2015). [CrossRef] [PubMed]

**51. **S. Dong, J. Liao, K. Guo, L. Bian, J. Suo, and G. Zheng, “Resolution doubling with a reduced number of image acquisitions,” Biomed. Opt. Express **6**(8), 2946–2952 (2015). [CrossRef] [PubMed]

**52. **S. Jiang, K. Guo, J. Liao, and G. Zheng, “Neural network models for Fourier ptychography,” figshare (2018) [retrieved 19 June 2018], https://doi.org/10.6084/m9.figshare.6607700.