## Abstract

Fourier Ptychography is a new computational microscopy technique that achieves gigapixel images with both wide field of view and high resolution in both phase and amplitude. The hardware setup involves a simple replacement of the microscope’s illumination unit with a programmable LED array, allowing one to flexibly pattern illumination angles without any moving parts. In previous work, a series of low-resolution images was taken by sequentially turning on each single LED in the array, and the data were then combined to recover a bandwidth much higher than the one allowed by the original imaging system. Here, we demonstrate a multiplexed illumination strategy in which multiple randomly selected LEDs are turned on for each image. Since each LED corresponds to a different area of Fourier space, the total number of images can be significantly reduced, without sacrificing image quality. We demonstrate this method experimentally in a modified commercial microscope. Compared to sequential scanning, our multiplexed strategy achieves similar results with approximately an order of magnitude reduction in both acquisition time and data capture requirements.

© 2014 Optical Society of America

## 1. Introduction

The LED array microscope is a powerful new platform for computational microscopy in which a wide range of capabilities are enabled by a single hardware modification to a traditional brightfield microscope - the replacement of the source with a programmable LED array [1, 2] (Fig. 1). This simple, inexpensive hardware modification allows patterning of the illumination at the Fourier plane of the sample (assuming Köhler geometry). Thus, each LED in the array corresponds to illumination of the sample by a unique angle. Conveniently, the range of illumination angles that can be patterned is much larger than the range of angles that pass through the objective [set by its numerical aperture (NA)]. This means that illumination by the central LEDs produces brightfield images, whereas illumination by the outer LEDs (outside the NA of the objective) produces dark field images [1]. Alternatively, by sequentially taking a pair of images with either half of the source on, we obtain phase derivative measurements by differential phase contrast (DPC) [3–5]. Finally, a full sequential scan of the 2D array of LEDs (angles), while taking 2D images at each angle, captures a 4D dataset similar to a light field [6] or phase space measurement [7]. This enables all the computational processing of light field imaging. For example, angular information can be traded for depth by using digital refocusing algorithms to get 3D intensity [1] or 3D phase contrast [3]. When the sample is thin, angular information can instead be used to improve resolution by computationally recovering a larger synthetic NA, limited only by the largest illumination angle of the LED array [2]. This method, named Fourier Ptychography, enables one to use a low NA objective, having a very large field of view (FoV), but still obtain high resolution across the entire image, resulting in gigapixel images. The aberrations in the imaging system can also be estimated without separate characterization [8]. All of these imaging modalities are achieved in the same optical setup, with no moving parts, simply by choosing the appropriate LEDs to turn on.

The main limitations of Fourier Ptychography are the large amount of data captured and the long acquisition times required. An image must be collected while scanning through each of the LEDs in the array, leading to hundreds of images in each dataset. This is compounded by the fact that each LED has limited intensity, requiring long exposures. The multiplexed illumination scheme that we propose here is capable of reducing both acquisition time and the number of images required by orders of magnitude.

We demonstrate two different multiplexing schemes in which multiple LEDs from the array are turned on for each captured image. First, we describe the case where we take the same number of images as in the sequential scan, but reduce the exposure time for each, since turning on more LEDs provides more light throughput. As long as the random patterns are linearly independent, the resulting images can be interpreted as a linear combination of images from each of the LEDs, implying that the data contains the same information as in the sequential scan. The more interesting multiplexing situation involves reducing the total number of images. In this second scheme, we show that a random coding strategy is capable of significantly reducing the data requirements, since each image now contains information from multiple areas of the sample’s Fourier space. To solve the inverse problem, we develop a modified Fourier Ptychography algorithm that applies to both multiplexing situations.

## 2. Theory and method

#### 2.1. Fourier Ptychography

In microscopy, one must generally choose between large FoV and high resolution. Where both are needed (e.g. in digital pathology [9]), lateral mechanical scanning of the sample is the most common solution. Fourier Ptychography (FP) [2] is a new computational illumination technique that achieves the same space-bandwidth product, but by scanning of the source in Fourier space with a programmable LED array. The setup involves no moving parts, so can be made fast. FP involves an embedded phase retrieval algorithm and so also produces high resolution, large FoV quantitative phase images [10]. In addition, FP uses a low NA objective, which is less expensive and provides a longer working distance and larger depth of field than high NA objectives.

The process of sequential FP is summarized in Fig. 1. For each image captured, the sample is illuminated from a unique angle by turning on a single LED. In Fourier space, this corresponds to a shift proportional to the angle of illumination. Thus, for each LED, a different area of Fourier space passes through the pupil. An example is given in Fig. 1(c), showing the Fourier space areas covered by three LEDs. The goal of the FP algorithm is to stitch together the images from different areas of Fourier space in order to achieve a large effective NA. The final NA is the sum of the objective NA and illumination NA. The caveat of using Fourier space stitching is that phase information is required. Typically, this is done with coherent methods by measuring phase for each angle via synthetic aperture methods [11,12]. In FP, however, phase is inferred by a phase retrieval optimization [13] based on angular diversity, analogous to Ptychography based on spatial translation [14–17]. For such algorithms to converge, significant overlap (> 60%) is required between the Fourier space areas of neighboring LEDs [2, 18] (see Section 2.4).

Both mechanical scanning and Fourier scanning result in huge datasets, on the order of gigapixels, presenting data storage and manipulation challenges. In FP, the overlap requirement means that sequential Fourier scanning requires much more data than mechanical scanning. However, by using the multiplexing methods described here, data capture requirements become comparable to or even lower than those of mechanical scanning. Analogous multiplexing in the spatial domain for traditional Ptychography would be very difficult to achieve.

#### 2.2. Coding strategies for multiplexed Fourier Ptychography

There are many choices of coding strategies for multiplexed measurements. In computational photography, multiplexed illumination has been evaluated for reflectance images [19, 20], but differs from our system in that FP involves a phase retrieval algorithm which is nonlinear. This makes it difficult to analytically derive the optimal coding strategy.

By turning on *M* LEDs for each image, we can linearly reduce the exposure time by a factor of *M* while maintaining the same photon budget. Since each image covers *M* times more area of Fourier space, we may also be able to reduce the total number of images by a factor of *M*. Thus, the possible reduction in total acquisition time is *M*^{2}, as long as the algorithm does not break. Clearly, we cannot go to the extreme case of turning on all the LEDs at once and capturing only one image, since there will be no diversity for providing phase contrast. Intuitively, we would like each pattern to turn on LEDs which are far away from each other, such that they represent distinct (non-overlapping) areas of Fourier space. However, we still need overlap across all the images captured, so later coded images should cover the overlapping areas.

Here, we choose a general random coding scheme in which the number of LEDs on is fixed, but their location varies randomly for each captured image, subject to some simple rules. We need to use each LED at least once in each dataset, even when we reduce the number of images. Thus, the first image in the set chooses which *M* LEDs to turn on randomly (with uniform probability), but later images will exclude those already used LEDs in choosing which to turn on. This process generates a set of *N*_{LED}/*M* patterns which fully cover all LEDs. To generate the additional images needed for matching the number of images to the sequential scan, we repeat this process *M* times. This scheme achieves good mixing of information and balanced coverage of Fourier space for each image. As an example, illumination patterns designed according to our random codes and their resulting images are shown in Fig. 2.

#### 2.3. Forward problem for multiplexed Fourier Ptychography

Consider a thin sample described by the complex transmission function *o*(**r**), where **r** = (*x*, *y*) denotes the lateral coordinates at the sample plane. Each LED generates illumination at the sample that is treated as a (spatially coherent) local plane wave with a unique spatial frequency **k*** _{m}* = (

*k*,

_{xm}*k*) for LEDs

_{ym}*m*= 1,···,

*N*

_{LED}, where

*N*

_{LED}is the total number of LEDs in the array. The exit wave from the sample is the multiplication of the two:

*u*(

**r**) =

*o*(

**r**)exp(

*i*

**k**

*·*

_{m}**r**). Thus, the sample’s spectrum

*O*(

**k**) is shifted to be centered around

**k**

*= (sin*

_{m}*θ*, sin

_{xm}/λ*θ*), where (

_{ym}/λ*θ*,

_{xm}*θ*) define the illumination angle for the

_{ym}*m*

^{th}LED and

*λ*is the wavelength.

At the pupil plane, the field corresponding to the Fourier transform of the exit wave *O*(**k** − **k*** _{m}*) is low-pass filtered by the pupil function

*P*(

**k**). Therefore, the intensity at the image plane resulting from a single LED illumination (neglecting magnification and noise) is

*ℱ*

_{[(·)]}(

**r**) is the 2D Fourier transform.

For multiplexed images, the sample is illuminated by different sets of LEDs according to a coding scheme. The *p*^{th} image turns on LEDs with indices *ℒ _{p}* chosen from {1,···,

*N*

_{LED}}. When multiple LEDs are on at once, the illumination must be considered partially coherent, with each LED being mutually incoherent with all others, representing a single coherent mode. The total intensity of the

*p*

^{th}multiplexed image

*I*(

_{p}**r**) is the sum of intensities from each:

*m*is an element from the set

*ℒ*.

_{p}Assuming that the entire multiplexed FP captures a total of *N*_{img} intensity images, the multiplexing scheme can be described by a *N*_{img} × *N*_{LED} binary coding matrix **A** = [*A _{p,m}*], where the element of the matrix is defined by

**(1)**any column of

**A**should contain at least one non-zero element, meaning that each LED has to be turned on at least once;

**(2)**any row of

**A**should contain at least one non-zero element, excluding the trivial case that no LEDs are turned on in a certain measurement;

**(3)**all the row vectors should be linearly independent with each other, implying that every new multiplexed image is non-redundant.

#### 2.4. Inverse problem formulation

For the multiplexed illumination case, we develop a new algorithm to handle multi-LED illumination. Our algorithm is similar in concept to that of [8], which jointly recovers the sample’s Fourier space *O*(**k**) and the unknown pupil function *P*(**k**). We split the FoV into patches whose area is on the order of the spatial coherence area of a single LED illumination and incorporate our multi-LED forward model in the optimization procedure. To improve robustness, we also add new procedures for background estimation and regularization for tuning noise performance.

The least squares formulation for reconstruction is a non-convex optimization problem. We minimize the square of the difference between the actual and estimated measurements, based on the forward model [Eq. (2)], with an additional term describing the background offset *b _{p}*,

*b̂*and subtract it to produce the corrected intensity image

_{p}Next, we start an iterative process which estimates both the object and the pupil functions
simultaneously. We initialize *O*(**k**) to be the Fourier transform of the
square root of any of the images which contain a brightfield LED, and initialize
*P*(**k**) to be a binary circle whose radius is determined by the NA. We
introduce the auxiliary function Ψ* _{m}*(

**k**) defined as

*m*

^{th}-LED illumination. Then, using the corrected intensity image

*Î*(

_{p}**r**), we iteratively update Ψ

*(*

_{m}**k**) for all

*m*, along with the estimates of

*O*(

**k**) and

*P*(

**k**). The algorithm is derived in Appendix A and summarized below and in Fig. 3. The basic structure of the reconstruction algorithm is to update the auxiliary function incrementally from image

*p*= 1 to

*N*

_{img}in each iteration, and then to repeat the same process iteratively until the value of the merit function [the quantity from the minimization in Eq. (4)] falls below a certain tolerance. Each incremental update consists of the following two steps.

**Update the auxiliary function**${\mathrm{\Phi}}_{m}^{(i)}(\mathbf{k})$**using the***p*^{th}intensity image:Let the estimate of the

*m*^{th}auxiliary function in the*i*^{th}iteration be ${\mathrm{\Psi}}_{m}^{(i)}(\mathbf{k})={O}^{(i)}(\mathbf{k}-{\mathbf{k}}_{m}){P}^{(i)}(\mathbf{k})$. First, compute the real space representation of the*m*^{th}auxiliary function$${\psi}_{m}^{(i)}(\mathbf{r})={\mathcal{F}}_{\left[{\mathrm{\Psi}}_{m}^{(i)}(\mathbf{k})\right]}(\mathbf{r}),\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}m\in {\mathcal{L}}_{p}.$$Then, perform a projection procedure similar to the Gerchberg-Saxton-Fienup type of update by rescaling the real space auxiliary function by an optimal intensity factor and returning back to the Fourier space representation [13, 21–23]:$${\mathrm{\Phi}}_{m}^{(i)}(\mathbf{k})={\mathcal{F}}_{\left[{\varphi}_{m}^{(i)}(\mathbf{r})\right]}^{-1}(\mathbf{k})\hspace{0.17em}\text{with}\hspace{0.17em}\hspace{0.17em}{\varphi}_{m}^{(i)}(\mathbf{r})=\sqrt{\frac{{\widehat{I}}_{p}(\mathbf{r})}{\sum _{m\in {\mathcal{L}}_{p}}{\left|{\psi}_{m}^{(i)}(\mathbf{r})\right|}^{2}}}{\psi}_{m}^{(i)}(\mathbf{r}),\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}m\in {\mathcal{L}}_{p}.$$**Update the sample spectrum***O*^{(i+1)}(**k**)**and the pupil function***P*^{(i+1)}(**k**):$${O}^{(i+1)}(\mathbf{k})={O}^{(i)}(\mathbf{k})+\frac{\sum _{m\in {\mathcal{L}}_{p}}\left|{P}^{(i)}(\mathbf{k}+{\mathbf{k}}_{m})\right|\hspace{0.17em}{\left[{P}^{(i)}(\mathbf{k}+{\mathbf{k}}_{m})\right]}^{*}\left[{\mathrm{\Phi}}_{m}^{(i)}(\mathbf{k}+{\mathbf{k}}_{m})-{O}^{(i)}(\mathbf{k}){P}^{(i)}(\mathbf{k}+{\mathbf{k}}_{m})\right]}{{\left|{P}^{(i)}(\mathbf{k})\right|}_{\text{max}}\cdot \left(\sum _{m\in {\mathcal{L}}_{p}}{\left|{P}^{(i)}(\mathbf{k}+{\mathbf{k}}_{m})\right|}^{2}+{\delta}_{1}\right)}$$$${P}^{(i+1)}(\mathbf{k})={P}^{(i)}(\mathbf{k})+\frac{\sum _{m\in {\mathcal{L}}_{p}}\left|{O}^{(i)}(\mathbf{k}-{\mathbf{k}}_{m})\right|\hspace{0.17em}{\left[{O}^{(i)}(\mathbf{k}-{\mathbf{k}}_{m})\right]}^{*}\left[{\mathrm{\Phi}}_{m}^{(i)}(\mathbf{k})-{O}^{(i)}(\mathbf{k}-{\mathbf{k}}_{m}){P}^{(i)}(\mathbf{k})\right]}{{\left|{O}^{(i)}(\mathbf{k})\right|}_{\text{max}}\cdot \left(\sum _{m\in {\mathcal{L}}_{p}}{\left|{O}^{(i)}(\mathbf{k}-{\mathbf{k}}_{m})\right|}^{2}+{\delta}_{2}\right)},$$where*δ*_{1}and*δ*_{2}are some regularization constants to ensure numerical stability, which is equivalent to an*ℓ*_{2}-norm/Tikhonov regularization on*O*(**k**) and*P*(**k**). The particular choice of updating step size, determined by the ratio between |*P*^{(i)}(**k**)| and its maximum (|*O*^{(i)}(**k**)| and its maximum), is shown to be robust [14].

## 3. Experimental results

The experimental setup is shown in Fig. 1. All samples are imaged with a 4 × 0.1NA objective and a scientific CMOS camera (PCO.edge). A programmable 32×32 LED array (Adafruit, 4mm spacing, controlled by an Arduino) is placed at 67.5mm above the sample to replace the light source on a Nikon TE300 inverted microscope. The central 293 red (central wavelength 629nm and 20nm bandwidth) LEDs are used for all the experiments reported here, resulting in a final synthetic NA of 0.6. In principle, our LED array could provide larger NA improvements, but it is practically limited by noise in the dark field images from high angle LEDs. To bypass these limitations, a better geometry would be a dome of LEDs, all pointed at the sample and covering the full hemisphere. Such an illumination unit was built in [24], but was not programmable and so not directly amenable to FP.

We first image a resolution target, whose low resolution image (from the central LED) is shown in Fig. 4(a) with a zoom-in in Fig. 4(b) to show that the smallest group of features (corresponding to 0.5 NA) are not resolvable. In Fig. 4(c–e), we show our recovered high-resolution images under three different coding strategies. The first, sequential scanning of a single LED across the full array, was taken with a 2s exposure time, resulting in a total acquisition time of T=586s. The coding matrix for sequential scanning is written as the *N*_{LED} × *N*_{LED} identity matrix. Next, we use random multiplexed illumination with 4 LEDs on for each image (i.e. *M* = 4) and a shorter exposure time (1s). Sample illumination patterns and images are shown in Fig. 2. The reconstruction result with the same number of images as the sequential scan *N*_{img} = 293 is shown in Fig. 4(d). As expected, the same resolution enhancement is achieved with half the acquisition time. Finally, we test our partial measurement scheme in which the total number of images is reduced by a factor of 4. This cuts the number of images used in the reconstruction to 74 and the total time to 74s, without sacrificing the quality of the result [see Fig. 4(e)].

The same multiplexing scheme was also tested on a stained biological sample (Fig. 5). During post-processing, the final image was computed in 200×200 pixel patches. The background is estimated for each dark field image by taking the average intensity from a uniform region of the sample. FP reconstructions are compared with the different illumination schemes in Fig. 5(b) and 5(c). The sequential scan is the same as previously used, but the multiplexed measurement now uses 8 LEDs (i.e. *M* = 8). The reduced measurement is demonstrated in the second case with only 40 images used, corresponding to approximately 1/8 of the data size in a full sequential scan, and reducing the total acquisition time by a factor of 14.7.

## 4. Discussion

To summarize, by exploiting illumination multiplexing, we experimentally demonstrated that both the acquisition time and data size requirements in Fourier Ptychography are significantly reduced. We achieved ∼ 2mm field of view with 0.5*μ*m resolution, with data acquisition times reduced from ∼10 minutes (for sequential scanning) to less than 1 minute. By making the LEDs in the array brighter, we expect to be able to achieve sub-second data acquisition with our multiplexed scheme. Our data capture was reduced from 293 images to only 40 images.

The reason that we are able to reduce the number of images taken with our multiplexing scheme (vs. sequential capture) is that each image contains information from multiple non-overlapping areas of Fourier space. Due to the nonlinear nature of the reconstruction algorithm, convergence of our algorithm is difficult to prove or optimize for a given coding strategy. The illumination patterns used in our experiment follow a random coding scheme, and we offer here only empirical evidence of successfully reducing the number of images. Designing an optimal coding strategy is left as future work, as it will depend on complicated interactions between many parameters (e.g. noise, dynamic range and the sample itself). For example, we found in our experiments that dynamic range problems in the FP data also tend to be alleviated by illumination multiplexing. Since light from different LEDs are mutually incoherent, illuminating with multiple LEDs means that each image has reduced spatial coherence. Thus, the diffraction fringes which often cause high dynamic range variations are smoothed out due to the reduced coherence. Another interesting avenue to explore will be methods for exploiting the data redundancy by considering priors about the sample, such as sparsity [25,26]. A gallery of interactive full FoV high resolution images from our experimental system can be found at http://www.gigapan.com/profiles/WallerLab_Berkeley.

## Appendix A: derivations of the reconstruction algorithm

Since the original optimization in (4) involves *P* images which contain massive amounts of pixels, an efficient way is to use the *incremental method* in optimization theory that updates Ψ* _{m}*(

**k**) “incrementally” for

*m*∈

*ℒ*in a sequential manner for each

_{p}*p*= 1 to

*N*

_{img}.

For each image *Î _{p}*(

**r**), we further exploit the framework of

*alternating projection*(also known as

*projection onto convex sets (POCS)*) methods to estimate the auxiliary functions {Ψ

*(*

_{m}**k**)}

_{m∈ℒp}involved in the

*p*

^{th}image. Specifically, the estimates of {Ψ

*(*

_{m}**k**)}

_{m∈ℒp}associated with the

*p*

^{th}image are projected back and forth between the following two sets:

*m*

^{th}auxiliary function in the

*i*

^{th}iteration be ${\mathrm{\Psi}}_{m}^{(i)}(\mathbf{k})$, with the

*i*

^{th}incremental iteration proceeds sequentially with the image index as The projection procedures are derived as follows:

**Projection onto Set***𝒞*_{p,1}: By introducing an intermediate variable ${\mathrm{\Phi}}_{m}^{(i)}(\mathbf{k})$ in the*i*^{th}iteration, the alternating projection procedure onto the set*𝒞*_{p,1}can be translated mathematically into the following optimization:$${\left\{{\mathrm{\Phi}}_{m}^{(i)}(\mathbf{k})\right\}}_{m\in {\mathcal{L}}_{p}}=\text{arg}\underset{{\mathrm{\Phi}}_{m}(\mathbf{k})}{\text{min}}\sum _{m\in {\mathcal{L}}_{p}}\sum _{\mathbf{k}}{\left|{\mathrm{\Phi}}_{m}(\mathbf{k})-{\mathrm{\Psi}}_{m}^{(i)}(\mathbf{k})\right|}^{2},\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\text{s}.\text{t}.\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}{\left\{{\mathrm{\Phi}}_{m}(\mathbf{k})\right\}}_{m\in {\mathcal{L}}_{p}}\in {\mathcal{C}}_{p,1}.$$This difficulty of the optimization lies in the total intensity constraint on the inverse Fourier transforms, which couples different auxiliary functions*m*∈*ℒ*. Fortunately, because of the unitary nature of Fourier transform, it is obvious that the objective function can be equivalently transformed into the real space as_{p}$${\left\{{\varphi}_{m}^{(i)}(\mathbf{r})\right\}}_{m\in {\mathcal{L}}_{p}}=\text{arg}\underset{{\varphi}_{m}(\mathbf{r})}{\text{min}}\sum _{m\in {\mathcal{L}}_{p}}\sum _{\mathbf{r}}{\left|{\varphi}_{m}(\mathbf{r})-{\psi}_{m}^{(i)}(\mathbf{r})\right|}^{2},\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\text{s}.\text{t}.\hspace{0.17em}\hspace{0.17em}{\widehat{I}}_{p}(\mathbf{r})=\sum _{m\in {\mathcal{L}}_{p}}{\left|{\varphi}_{m}(\mathbf{r})\right|}^{2},$$where ${\varphi}_{m}^{(i)}(\mathbf{r})$ and ${\psi}_{m}^{(i)}(\mathbf{r})$ are the real domain variables corresponding to ${\mathrm{\Phi}}_{m}^{(i)}(\mathbf{k})$ and ${\mathrm{\Psi}}_{m}^{(i)}(\mathbf{k})$. The merit of viewing the optimization in the real space is that the optimization now becomes completely separable in the pixel**r**, which can be simplified as$${\left\{{\varphi}_{m}^{(i)}(\mathbf{r})\right\}}_{m\in {\mathcal{L}}_{p}}=\text{arg}\underset{{\varphi}_{m}(\mathbf{r})}{\text{min}}\sum _{m\in {\mathcal{L}}_{p}}{\left|{\varphi}_{m}(\mathbf{r})-{\psi}_{m}^{(i)}(\mathbf{r})\right|}^{2},\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\text{s}.\text{t}.\hspace{0.17em}\hspace{0.17em}{\widehat{I}}_{p}(\mathbf{r})=\sum _{m\in {\mathcal{L}}_{p}}{\left|{\varphi}_{m}(\mathbf{r})\right|}^{2},$$Therefore, the optimization can be solved via the Lagrangian multiplier method by incorporating the constraint into the Lagrangian with a multiplier*λ*:$${\left\{{\varphi}_{m}^{(i)}(\mathbf{r}),{\lambda}^{\u2605}\right\}}_{m\in {\mathcal{L}}_{p}}=\text{arg}\underset{{\varphi}_{m}(\mathbf{r}),\lambda}{\text{min}}\sum _{m\in {\mathcal{L}}_{p}}{\left|{\varphi}_{m}(\mathbf{r})-{\psi}_{m}^{(i)}(\mathbf{r})\right|}^{2}+\lambda \left({\widehat{I}}_{p}(\mathbf{r})-\sum _{m\in {\mathcal{L}}_{p}}{\left|{\varphi}_{m}(\mathbf{r})\right|}^{2}\right).$$By setting the derivatives with respect to*λ*and*ϕ*(_{m}**r**) for*m*∈*ℒ*to zero, we arrive at the solutions for each auxiliary function in the set_{p}*m*∈*ℒ*in this projection step:_{p}$${\varphi}_{m}^{(i)}(\mathbf{r})=\sqrt{\frac{{\widehat{I}}_{p}(\mathbf{r})}{\sum _{m\in {\mathcal{L}}_{p}}{\left|{\psi}_{m}^{(i)}(\mathbf{r})\right|}^{2}}}{\psi}_{m}^{(i)}(\mathbf{r}),\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}\hspace{0.17em}m\in {\mathcal{L}}_{p}.$$Therefore finally, the resulting solution for the intermediate variable in the frequency domain can be directly obtained by$${\mathrm{\Phi}}_{m}^{(i)}(\mathbf{k})={\mathcal{F}}_{\left[{\varphi}_{m}^{(i)}(\mathbf{r})\right]}^{-1}(\mathbf{k}),$$where ${\mathcal{F}}_{[(\cdot )]}^{-1}(\mathbf{k})$ is the inverse Fourier transform. This resembles the classic Gerchberg-Saxton update in phase retrieval by forcing an identical intensity ${\widehat{I}}_{p}(\mathbf{r})/{\sum}_{m\in {\mathcal{L}}_{p}}{\left|{\psi}_{m}^{(i)}(\mathbf{r})\right|}^{2}$ for each auxiliary function.**Projection onto Set***𝒞*_{p,2}: The projection onto set*𝒞*_{p,2}can be equivalently obtained by solving for*O*^{(i+1)}(**k**) and*P*^{(i+1)}(**k**) separately and updating the auxiliary function as the product of the two:$${\mathrm{\Psi}}_{m}^{(i+1)}(\mathbf{k})={O}^{(i+1)}(\mathbf{k}-{\mathbf{k}}_{m}){P}^{(i+1)}(\mathbf{k}),\forall m\in {\mathcal{L}}_{p}$$$$\left\{{O}^{(i+1)}(\mathbf{k}),{P}^{(i+1)}(\mathbf{k})\right\}=\text{arg}\underset{O(\mathbf{k}),P(\mathbf{k})}{\text{min}}\sum _{m\in {\mathcal{L}}_{p}}\sum _{\mathbf{k}}{\left|O(\mathbf{k}-{\mathbf{k}}_{m})P(\mathbf{k})-{\mathrm{\Phi}}_{m}^{(i)}(\mathbf{k})\right|}^{2}.$$This optimization is clearly separable in**k**(but not in*m*), which can be simplified as$$\left\{{O}^{(i+1)}(\mathbf{k}),{P}^{(i+1)}(\mathbf{k})\right\}=\text{arg}\underset{O(\mathbf{k}),P(\mathbf{k})}{\text{min}}\sum _{m\in {\mathcal{L}}_{p}}{\left|O(\mathbf{k}-{\mathbf{k}}_{m})P(\mathbf{k})-{\mathrm{\Phi}}_{m}^{(i)}(\mathbf{k})\right|}^{2}.$$The gradients of the objective function*f*(*O*(**k**−**k**), $P(\mathbf{k}))\triangleq {{\sum}_{m\in {\mathcal{L}}_{p}}\left|O(\mathbf{k}-{\mathbf{k}}_{m})P(\mathbf{k}-{\mathrm{\Phi}}_{m}^{(i)}(\mathbf{k})\right|}^{2}$ can be obtained as:_{m}$$\frac{\partial f(O(\mathbf{k}-{\mathbf{k}}_{m}),P(\mathbf{k}))}{\partial O(\mathbf{k}-{\mathbf{k}}_{m})}=2\sum _{m\in {\mathcal{L}}_{p}}\left[O(\mathbf{k}-{\mathbf{k}}_{m})P(\mathbf{k})-{\mathrm{\Phi}}_{m}^{(i)}(\mathbf{k})\right]\hspace{0.17em}{[P(\mathbf{k})]}^{*}$$$$\frac{\partial f(O(\mathbf{k}-{\mathbf{k}}_{m}),P(\mathbf{k}))}{\partial P(\mathbf{k})}=2\sum _{m\in {\mathcal{L}}_{p}}\left[O(\mathbf{k}-{\mathbf{k}}_{m})P(\mathbf{k})-{\mathrm{\Phi}}_{m}^{(i)}(\mathbf{k})\right]\hspace{0.17em}{[O(\mathbf{k}-{\mathbf{k}}_{m})]}^{*}$$By setting the derivatives with respect to*O*(**k**) and*P*(**k**) to zero, we can obtain the optimality conditions that need to be satisfied by the updates {*O*^{(i+1)}(**k**),*P*^{(i+1)}(**k**)}:$${O}^{(i+1)}(\mathbf{k})=\frac{{{\sum}_{m\in {\mathcal{L}}_{p}}\left[{P}^{(i+1)}(\mathbf{k}+{\mathbf{k}}_{m})\right]}^{*}{\mathrm{\Phi}}_{m}^{(i)}(\mathbf{k}+{\mathbf{k}}_{m})}{{\sum}_{m\in {\mathcal{L}}_{p}}{\left|{P}^{(i+1)}(\mathbf{k}+{\mathbf{k}}_{m})\right|}^{2}}$$$${P}^{(i+1)}(\mathbf{k})=\frac{{{\sum}_{m\in {\mathcal{L}}_{p}}\left[{O}^{(i+1)}(\mathbf{k}-{\mathbf{k}}_{m})\right]}^{*}{\mathrm{\Phi}}_{m}^{(i)}(\mathbf{k})}{{\sum}_{m\in {\mathcal{L}}_{p}}{\left|{O}^{(i+1)}(\mathbf{k}-{\mathbf{k}}_{m})\right|}^{2}}.$$However, the updates depend on each other and hence cannot be obtained directly since they are both unknown. Therefore, instead of solving for the updates in one shot, we perform another numerical sub-optimization with sub-iteration indexed by*ℓ*to numerically obtain the pair of updates {*O*^{(i+1)}(**k**),*P*^{(i+1)}(**k**)}. For this sub-optimization, we choose to use Newton’s method (i.e., second order), which further requires the computation of the second order derivative of the objective function:$$\frac{{\partial}^{2}f(O(\mathbf{k}-{\mathbf{k}}_{m}),P(\mathbf{k}))}{\partial O{(\mathbf{k}-{\mathbf{k}}_{m})}^{2}}=2\sum _{m\in {\mathcal{L}}_{p}}{\left|P(\mathbf{k})\right|}^{2}$$$$\frac{{\partial}^{2}f(O(\mathbf{k}-{\mathbf{k}}_{m}),P(\mathbf{k}))}{\partial P{(\mathbf{k})}^{2}}=2\sum _{m\in {\mathcal{L}}_{p}}{\left|O(\mathbf{k}-{\mathbf{k}}_{m})\right|}^{2}.$$The updates by Newton methods are written below by evaluating the first and second order derivatives with the previous iterates*O*^{(i,ℓ)}(**k**) and*P*^{(i,ℓ)}(**k**):$${O}^{(i,\ell +1)}(\mathbf{k})={O}^{(i,\ell )}(\mathbf{k})+\text{step-size}\cdot {\left[\frac{{\partial}^{2}f(O(\mathbf{k}-{\mathbf{k}}_{m}),P(\mathbf{k}))}{\partial O{(\mathbf{k}-{\mathbf{k}}_{m})}^{2}}\right]}^{-1}\frac{\partial f(O(\mathbf{k}-{\mathbf{k}}_{m}),P(\mathbf{k}))}{\partial O(\mathbf{k})}$$$${P}^{(i,\ell +1)}(\mathbf{k})={P}^{(i,\ell )}(\mathbf{k})+\text{step-size}\cdot {\left[\frac{{\partial}^{2}f(O(\mathbf{k}-{\mathbf{k}}_{m}),P(\mathbf{k}))}{\partial P{(\mathbf{k})}^{2}}\right]}^{-1}\frac{\partial f(O(\mathbf{k}-{\mathbf{k}}_{m}),P(\mathbf{k}))}{\partial P(\mathbf{k})}.$$Generally speaking, Newton methods are preferred in non-linear least squares problems (precisely our formulation) due to their fast convergence and stability compared with first order methods such as gradient descents. Furthermore, we employ the Levenberg-Marquardt algorithm, which is a variant of the Gauss-Newton algorithm by superimposing a constant*δ*to the second order derivative, to avoid the singularity of possible zeros in the denominator. Here for the stability concerns, we also impose a constant on the second order derivative in the Newton’s method, which is equivalent to introducing an*ℓ*_{2}-norm regularization on*O*(**k**) and*P*(**k**).Let the

*ℓ*^{th}sub-iterate pair be {*O*^{(i,ℓ)}(**k**),*P*^{(i,ℓ)}(**k**)}, then the Levenberg-Marquardt version of the Newton’s algorithm updates can be computed with step-sizes*α*^{(i,ℓ)}(**k**) and*β*^{(i,ℓ)}(**k**) as$$\begin{array}{ll}{O}^{(i,\ell +1)}(\mathbf{k})={O}^{(i,\ell )}(\mathbf{k})+\hfill & \frac{{\alpha}^{(i,\ell )}(\mathbf{k}+{\mathbf{k}}_{m})}{\sum _{m\in {\mathcal{L}}_{p}}{\left|{P}^{(i,\ell )}(\mathbf{k}+{\mathbf{k}}_{m})\right|}^{2}+{\delta}_{1}}\hfill \\ \hfill & \times \sum _{m\in {\mathcal{L}}_{p}}{\left[{P}^{(i,\ell )}(\mathbf{k}+{\mathbf{k}}_{m})\right]}^{*}\left[{\mathrm{\Phi}}_{m}^{(i)}(\mathbf{k}+{\mathbf{k}}_{m})-{O}^{(i,\ell )}(\mathbf{k}){P}^{(i,\ell )}(\mathbf{k}+{\mathbf{k}}_{m})\right]\hfill \end{array}$$$$\begin{array}{ll}{P}^{(i,\ell +1)}(\mathbf{k})={P}^{(i,\ell )}(\mathbf{k})+\hfill & \frac{{\beta}^{(i,\ell )}(\mathbf{k})}{\sum _{m\in {\mathcal{L}}_{p}}{\left|{O}^{(i,\ell )}(\mathbf{k}-{\mathbf{k}}_{m})\right|}^{2}+{\delta}_{2}}\hfill \\ \hfill & \times \sum _{m\in {\mathcal{L}}_{p}}{\left[{O}^{(i,\ell )}(\mathbf{k}-{\mathbf{k}}_{m})\right]}^{*}\left[{\mathrm{\Phi}}_{m}^{(i)}(\mathbf{k})-{O}^{(i,\ell )}(\mathbf{k}-{\mathbf{k}}_{m}){P}^{(i,\ell )}(\mathbf{k})\right].\hfill \end{array}$$For simplicity the step-sizes can be chosen as constants*α*^{(i,ℓ)}(**k**) =*α*and*β*^{(i,ℓ)}(**k**) =*β*for all**k**and all iterations. Then finally, after*L*passes of the Levenberg-Marquardt updates, the final updates of {*O*^{(i+1)}(**k**),*P*^{(i+1)}(**k**)} are obtained as

## Acknowledgments

We thank Gongguo Tang for helpful discussions with algorithm designs, Kevin Liu for help with experiments. This work was supported by the Development Impact Lab (USAID Cooperative Agreements AID-OAA-A-13-00002 and AID-OAA-A-12-00011), part of USAID’s Higher Education Solutions Network, and by the Office of Naval Research (grant N00014-14-1-0083).

## References and links

**1. **G. Zheng, C. Kolner, and C. Yang, “Microscopy refocusing and dark-field imaging by using a simple LED array,” Opt. Lett. **36**, 3987–3989 (2011). [CrossRef] [PubMed]

**2. **G. Zheng, R. Horstmeyer, and C. Yang, “Wide-field, high-resolution Fourier Ptychographic microscopy,” Nat. Photonics **7**, 739–745 (2013). [CrossRef]

**3. **L. Tian, J. Wang, and L. Waller, “3D differential phase-contrast microscopy with computational illumination using an LED array,” Opt. Lett. **39**, 1326–1329 (2014). [CrossRef] [PubMed]

**4. **D. Hamilton and C. Sheppard, “Differential phase contrast in scanning optical microscopy,” J. Microscopy **133**, 27–39 (1984). [CrossRef]

**5. **T. N. Ford, K. K. Chu, and J. Mertz, “Phase-gradient microscopy in thick tissue with oblique back-illumination,” Nature Methods **9**, 1195–1197 (2012). [CrossRef] [PubMed]

**6. **M. Levoy, Z. Zhang, and I. McDowall, “Recording and controlling the 4D light field in a microscope using microlens arrays,” J. Microscopy **235**, 144–162 (2009). [CrossRef]

**7. **L. Waller, G. Situ, and J. Fleischer, “Phase-space measurement and coherence synthesis of optical beams,” Nat. Photonics **6**, 474–479 (2012). [CrossRef]

**8. **X. Ou, G. Zheng, and C. Yang, “Embedded pupil function recovery for Fourier ptychographic microscopy,” Opt. Express **22**, 4960–4972 (2014). [CrossRef] [PubMed]

**9. **L. Pantanowitz, “Digital images and the future of digital pathology,” J. Pathology Informatics **1**, 15 (2010). [CrossRef]

**10. **X. Ou, R. Horstmeyer, C. Yang, and G. Zheng, “Quantitative phase imaging via fourier ptychographic microscopy,” Opt. Lett. **38**, 4845–4848 (2013). [CrossRef] [PubMed]

**11. **S. A. Alexandrov, T. R. Hillman, T. Gutzler, and D. D. Sampson, “Synthetic aperture fourier holographic optical microscopy,” Phys. Rev. Lett. **97**, 168102 (2006). [CrossRef] [PubMed]

**12. **T. R. Hillman, T. Gutzler, S. A. Alexandrov, and D. D. Sampson, “High-resolution, wide-field object reconstruction with synthetic aperturefourier holographic optical microscopy,” Opt. Express **17**, 7873–7892 (2009). [CrossRef] [PubMed]

**13. **J. R. Fienup, “Phase retrieval algorithms: a comparison,” Appl. Opt. **21**, 2758–2769 (1982). [CrossRef] [PubMed]

**14. **J. M. Rodenburg and H. M. Faulkner, “A phase retrieval algorithm for shifting illumination,” Appl. Phys. Lett. **85**, 4795–4797 (2004). [CrossRef]

**15. **M. Guizar-Sicairos and J. R. Fienup, “Phase retrieval with transverse translation diversity: a nonlinear optimization approach,” Opt. Express **16**, 7264–7278 (2008). [CrossRef] [PubMed]

**16. **A. M. Maiden and J. M. Rodenburg, “An improved ptychographical phase retrieval algorithm for diffractive imaging,” Ultramicroscopy **109**, 1256–1262 (2009). [CrossRef] [PubMed]

**17. **P. Thibault, M. Dierolf, O. Bunk, A. Menzel, and F. Pfeiffer, “Probe retrieval in ptychographic coherent diffractive imaging,” Ultramicroscopy **109**, 338–343 (2009). [CrossRef] [PubMed]

**18. **O. Bunk, M. Dierolf, S. Kynde, I. Johnson, O. Marti, and F. Pfeiffer, “Influence of the overlap parameter on the convergence of the ptychographical iterative engine,” Ultramicroscopy **108**, 481–487 (2008). [CrossRef]

**19. **Y. Y. Schechner, S. K. Nayar, and P. N. Belhumeur, “Multiplexing for optimal lighting,” IEEE Trans. Pattern Analysis Machine Intelligence **29**, 1339–1354 (2007). [CrossRef]

**20. **N. Ratner and Y. Y. Schechner, “Illumination multiplexing within fundamental limits,” in “Computer Vision and Pattern Recognition” (IEEE, 2007), pp. 1–8.

**21. **R. Gerchberg and W. Saxton, “Phase determination for image and diffraction plane pictures in the electron microscope,” Optik **34**, 275–284 (1971).

**22. **C. Rydberg and J. Bengtsson *et al.*, “Numerical algorithm for the retrieval of spatial coherence properties of partially coherent beams from transverse intensity measurements,” Opt. Express **15**, 13613–13623 (2007). [CrossRef] [PubMed]

**23. **P. Thibault and A. Menzel, “Reconstructing state mixtures from diffraction measurements,” Nature **494**, 68–71 (2013). [CrossRef] [PubMed]

**24. **D. Dominguez, L. Molina, D. B. Desai, T. O’Loughlin, A. A. Bernussi, and L. G. de Peralta, “Hemispherical digital optical condensers with no lenses, mirrors, or moving parts,” Opt. Express **22**, 6948–6957 (2014). [CrossRef] [PubMed]

**25. **L. Tian, J. Lee, S. B. Oh, and G. Barbastathis, “Experimental compressive phase space tomography,” Opt. Express **20**, 8296–8308 (2012). [CrossRef] [PubMed]

**26. **Y. Shechtman, A. Beck, and Y. Eldar, “GESPAR: Efficient phase retrieval of sparse signals,” IEEE Trans. Sig. Processing **62**, 928–938 (2014). [CrossRef]