## Abstract

Despite recent advances, high performance single-shot 3D microscopy remains an elusive task. By introducing designed diffractive optical elements (DOEs), one is capable of converting a microscope into a 3D “kaleidoscope,” in which case the snapshot image consists of an array of tiles and each tile focuses on different depths. However, the acquired multifocal microscopic (MFM) image suffers from multiple sources of degradation, which prevents MFM from further applications. We propose a unifying computational framework which simplifies the imaging system and achieves 3D reconstruction via computation. Our optical configuration omits optical elements for correcting chromatic aberrations and redesigns the multifocal grating to enlarge the tracking area. Our proposed setup features only one single grating in addition to a regular microscope. The aberration correction, along with Poisson and background denoising, are incorporated in our deconvolution-based fully-automated algorithm, which requires no empirical parameter-tuning. In experiments, we achieve spatial resolutions of 0.35um (lateral) and 0.5um (axial), which are comparable to the resolution that can be achieved with confocal deconvolution microscopy. We demonstrate a 3D video of moving bacteria recorded at 25 frames per second using our proposed computational multifocal microscopy technique.

© 2018 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## 1. Introduction

Typically, an optical microscope focuses on one focal plane at a time. Thus, recovering 3D information with sufficient resolution usually requires sequentially z-scanning the focal planes. This process is time consuming and requires precise mechanical control. The long acquisition time of focal scanning microscopy fundamentally prevents the applications in *in vivo* imaging, especially when the 3D movement of biomedical objects is desired. To realize *single-shot* 3D microscopy, a standard microscope has to be modified in such a way that 3D information can be encoded onto a 2D plane.

One of the few modifications to achieve single-shot 3D microscopy, for example, is by placing a diffractive optical element (DOE) in the Fourier plane of a standard microscope (Fig. 1a). The DOE is designed as a distorted phase gratings [1–3], also referred to as a multifocal grating (MFG). The MFG projects different depth layers in a 3D volume onto different sub-regions of the imaging plane simultaneously. In particular, if the imaging plane is divided into *l* × *l* tiles and each tile is focused on one focal plane by design, then a 3D volume of *l*^{2} depths can be retrieved by z-stacking all the tiles. An *l* = 3 case is shown in Fig. 1(c). This imaging technique is also referred as multifocal microscopy (MFM) [4].

MFM stands out among other single-shot 3D microscopy, e.g. light field microscopy [5,6] and lensless 3D coded microscopy [7–9], for its capability of achieving diffraction-limited 3D resolution comparable to conventional focal scanning microscopy. However, as an imaging system, MFM faces several imaging issues, some of which are even more severe than other 3D microscopic modalities that prevents MFM from broader applications:

**Out-of-focus blur.**MFM captures an array of multifocal planes. Like standard wide-field microscopy, each focal plane image is convolved by out-of-focus light. Estimating and removing out-of-focus blur requires implementation of a 3D deconvolution algorithm.*Our proposed algorithm builds on the regularized Richardson-Lucy (RL) deconvolution*,*and features automatic estimation of the background noise and the optimal regularizer parameter*.**Chromatic aberration (CA).**Conventionally, CA is optically corrected by adding a chromatic correction gratings (CCG) and a multifacet prism after the MFG [4], resulting in excessive hardware cost. In our work, we propose a simplified version by abandoning the CCG and the prism. Instead, we place a 15 nm bandpass filter in front of the detector to mitigate CA, as shown in Fig. 1(a). We demonstrate that the resolution loss due to CA can be*computationally*compensated. This comes from a fact that CA from MFG is directional due to tile geometry. In each tile, the PSF is stretched mainly in one direction, but preserves imaging quality in the orthogonal direction.*Therefore*,*by jointly incorporating all the PSFs in each tile*,*our proposed 3D deconvolution algorithm enables restoring the resolution loss due to CA*.**Poisson noise due to limited photon budget.**There are three factors that contribute to the limited photon budget of the MFM. First, as mentioned before, a narrower bandpass filter of 15nm is used to mitigate CA, and thus, the total number of emission photons arriving at the detector is smaller than when using CA corrective optics. Second, the theoretical maximum efficiency of the MFG is smaller than one. For example, the theoretical limits are 68% for a binary phase-only (0 or*π*) MFG with 3 × 3 diffraction orders, and 78% for a MFG with 5 × 5 diffraction orders, which means 32% or 22% of the total emission light will be lost and cannot be collected by the detector. The efficiency can be improved by using a multi-phase MFG [10] at the expense of the complicated fabrication process. Third, each tile only shares about 1/*l*^{2}of the total photons. The limited photon budget requires a noise modeling of Poisson process.*We incorporate the Poisson noise modeling for MFM*,*for the first time*.**Field-of-view (FOV).**As described above, MFM trades lateral FOV for depth information. Conventional MFM designs use large tile spacings for large objects. We address here that this design is not optimal for tracking dynamic objects, which are usually small in size but require large FOV. We analyze the FOV performance theoretically, and demonstrate that*by modifying the MFG design to create a smaller tile spacing*,*small objects (e*.*g*.*bacteria) can be tracked over a larger lateral area than the tile width*.

#### 1.1. Related works

The 2D sensing limitation imposed by optical sensors makes 3D imaging an interesting and active research topic both in macroscopic and microscopic regime. In microscopy, several approaches have been proposed for a variety of imaging conditions. One scheme for achieving 3D imaging is to use coherent illumination, e.g. digital holographic microscopy [11–13], variably-controlled light-emitting diodes (LED) arrays [14, 15], structured illumination [16], etc. Yet the coherent modeling of light propagation does not apply to incoherent/fluorescent objects. For the incoherent case, one seminal idea is to introduce self-interference by projecting a set of Fresnel patterns [17–19]. However, most of the existing methods, especially for incoherent cases, require multiple exposures and massive processing time. This limitation prevents broader applications such as *in vivo* imaging as the motion of the objects during capture process is hard to circumvent and thus, deteriorates the image quality [20, 21]. Recent works have proposed to reduce the measuring requirement by computationally exploiting spatial-temporal redundancy of the scenes [22, 23]. Here, we review two types of computational microscopy that enable single-shot 3D fluorescence imaging.

### Light field microscopy (LFM)

By introducing a microlens array on the primary image plane of a microscopy, LFM encodes 4D (spatial-angular) information into a 2D image and computationally recover the 3D objects [5, 6]. Because LFM trades spatial resolution for single-shot capture, its spatial resolution is lower than that of the conventional microscope. Recently, a RL deconvolution method based on the wave optics theory [24] is derived and demonstrated to improve the resolution for LFM. So far, the best resolutions experimentally achieved by LFM are ∼ 1.4um and 2.6um [5] in the lateral and axial dimensions respectively. In addition, due to the variation on sampling density, the lateral resolution decreases to ∼ 3.75um at the focal plane [5].

### Lensless 3D microscopy

Traditionally, lens-based cameras/microscopes map a point in the scene to a pixel on the detector. Lensless imaging architectures, instead, replace the lens with other encoding elements such that a point is mapped to many points on the detector and thus, require computation to recover. Several seminal literatures have proposed to place a single encoding element, such as a coded mask [7] or diffuser [9], directly in front of a detector. Therefore, the imaging system is compact and cost-effective, and has large FOV. A computational algorithm is then designed to make use of the physical effects, e.g. the Point Spread Functions (PSF), to recover the 3D information of the scene. However, the spatial resolution of lensless 3D microscopes is restricted to the pixel pitch of the detector. In particular, a lens-free fluorescent imaging platform was reported to achieve the spatial resolution of ∼ 10um based on a compressive sensing algorithm for sparse objects [8]. The recovered 3D volume consisted of two or three depth layers with intervals of 50um or 100um.

#### 1.2. Computational multifocal microscopy

Our computational imaging framework, which we call computational multifocal microscopy (CMFM), balances and optimizes the processing capabilities of optics and computation. We demonstrate a CMFM system [Fig. 1 (a)] that utilizes simplified optics, as well as algorithms that correct for CA and low-photon counts. The pipeline of our computational framework is shown in Fig. 1 (d-f).

We perform two experiments to demonstrate the effectiveness of our CMFM system. First, we capture a static image of several frozen periplasms in 3D to demonstrate the spatial resolutions of 0.35um and 0.5um in the lateral and axial dimensions [see Fig. 7(d)], which are comparable to those achievable with confocal deconvolution microscopy [see Fig. 7(b)]. Note that we compare 0.5s captures with our CMFM instrument to a 20s confocal scan taken with a dual spinning disk confocal microscope (Model: CSU-W1) made by Yokogawa Electric Corporation. Our CMFM results show similar 3D image quality, but achieve a 40× reduction in acquisition time. However, we kindly remind readers that the principle of the axial resolution improvement is different for the two techniques. Further discussion will appear in section 4.1. Second, we record a video of *in vivo* bacterium at 25 frames per second (see
Visualization 1) and perform high resolution 3D reconstruction with tracking results (see
Visualization 2) using our CMFM technique.

## 2. Methods

#### 2.1. Image formation model

The image formation of our CMFM fluorescence system can be modeled as the axial integral of the 2D convolution of the PSF with its corresponding depth layer. In the noiseless case, such model can be written as:

where*g*is the observed 2D image,

*o*is an unknown 3D volume, and

*h*is 3D PSFs obtained by capturing 2D images of a point source located at different depths.

*o*and

_{z}*h*are 2D slices of

_{z}*o*and

*h*at axial depth

*z*.

Here, we discretize *o* into *N _{x}* ×

*N*×

_{y}*N*voxels in three dimensions. Each voxel has size Δ

_{z}*× Δ*

_{x}*× Δ*

_{y}*. The 2D image*

_{z}*g*on the detector is sampled with

*M*×

_{x}*M*pixels. Each pixel has size ${\mathrm{\Delta}}_{x}^{\prime}\times {\mathrm{\Delta}}_{y}^{\prime}$. For a band-limited system, the Nyquist sampling rate has to be satisfied in order to avoid aliasing. In MFM, the lateral cut-off frequency is

_{y}*f*= 2

_{c}*NA*/

*λ*, and therefore ${\mathrm{\Delta}}_{x}^{\prime}\le 1/(2{f}_{c})$ and ${\mathrm{\Delta}}_{y}^{\prime}\le 1/(2{f}_{c})$. For simplicity, we set ${\mathrm{\Delta}}_{x}={\mathrm{\Delta}}_{x}^{\prime}$ and ${\mathrm{\Delta}}_{y}={\mathrm{\Delta}}_{y}^{\prime}$ in practice. In order to match

*o*and

*g*dimensions,

*h*is therefore of

*M*×

_{x}*M*×

_{y}*N*voxels, with each voxel size of ${\mathrm{\Delta}}_{x}^{\prime}\times {\mathrm{\Delta}}_{y}^{\prime}\times {\mathrm{\Delta}}_{z}$. Then the discrete form of Eq. (1) can be written as a matrix-vector multiplication form:

_{z}**g**is an

*M*× 1 column vector, in which

*M*=

*M*×

_{x}*M*,

_{y}**o**is an

*N*× 1 column vector, in which

*N*=

*N*×

_{x}*N*×

_{y}*N*, and

_{z}**H**is the sensing matrix of a size

*M*×

*N*. Note that each

**H**

*is a Toeplitz matrix representing 2D convolution, and can be constructed from*

_{z}*h*

_{z}.Note that, in our case, two types of noise are considered. First, our imaging process has Poisson noise due to the limited photon budget resulted from (1) the narrower bandpass filter, (2) the lower diffraction efficiency of the MFG and (3) splitting of light. Second, the background noise (room stray light, or the sample itself) is also considered. We assume a uniform background photon noise across all the pixels in the acquired MFM image. Therefore, the image formation model under the Poisson noise and additive background noise model can be expressed as

where $\mathrm{P}$ represents Poisson statistics originated from signal photons, and**b**models the uniform background noise. Note that

**b**is an

*M*× 1 column vector and each entry of

**b**is a same constant, denoted by

*b*.

#### 2.2. Joint regularized RL deconvolution

Equation (3) leads to the following likelihood function according to Poisson distribution

*i*stands for the pixel coordinate in

**g**. In RL deconvolution, the optimal solution

**o**

^{∗}is found from the observation

**g**by maximizing Eq. (4), or equivalently minimizing its negative logarithm, subject to all the voxels of the restored image have non-negative values, that is,

*p*(

**g**|

**o**,

**b**) in Eq. (4), in which a constant term log (

**g**

*!) is omitted.*

_{i}In regularized RL algorithm, a regularizer term is added to the objective function in Eq. (5). Here, we consider total variation (TV) regularization because it can avoid the noise amplification problem in RL deconvolution by allowing to recover a smooth and stable solution with sharp edges. The objective function in the TV regularized RL algorithm is therefore written as

where*λ*is the regularization parameter and $\text{TV}(\mathbf{o})={\displaystyle {\sum}_{j=1}^{N}\sqrt{{({\mathrm{\Delta}}_{x}{\mathbf{o}}_{j})}^{2}+{({\mathrm{\Delta}}_{y}{\mathbf{o}}_{j})}^{2}+{({\mathrm{\Delta}}_{z}{\mathbf{o}}_{j})}^{2}}}$, in which

*j*denotes the voxel coordinate in the

**o**. Our joint algorithm estimates the 3D image, the background noise and the regularization parameter simultaneously by minimizing Φ in Eq. (7) subject to all the voxels of the restored 3D image have non-negative values, that is,

The minimization of Φ(**o**, **b**, *λ*) in Eq. (8) with respect to all three unknown variables can be performed by an iterative alternating gradient descent method. More specifically, at each iteration, the three variables are updated sequentially, and when one variable is updated, the other two variables are fixed as constants.

### 2.2.1. Estimation of 3D image

To update **o** for (*k* + 1)-th iteration, we substitute **o**^{k,}**b*** ^{k}* and

*λ*estimated from

^{k}*k*-th iteration in Eq. (7), and take partial derivatives with respect to each voxel ${\mathbf{o}}_{j}^{k}$ in

**o**

^{k}*h*

_{i}_{,}

*is the entry in the*

_{j}*i*-th row and the

*j*-th column of

**H**, and ${\sum}_{i=1}^{M}{h}_{i,j}=1$ if

**H**is column normalized (which is equivalent to normalizing PSFs

*h*), and

_{z}*div*stands for the divergence operator. Given the partial derivative in Eq. (9), We update

**o**using the gradient-based algorithm (or equivalently by using expectation maximization algorithm), defined by [25, 26]

*j*-th reconstructed voxel. Note that for 1 ≤

*j*≤

*N*, all the voxels in

**o**can be updated simultaneously by storing them in a vector.

### 2.2.2. Estimation of the uniform background

When we have **o**^{k}^{+1}, we substitute **o**^{k}^{+1,} **b*** ^{k}* and

*λ*in Eq. (7), and take partial derivatives with respect to

^{k}*b*

Note that each pixel in the captured MFM image is assumed to have the same background value *b*. Then we update *b* using the same gradient-based algorithm as is in 3D image estimation

### 2.2.3. Estimation of optimal regularization parameter

The approach of estimating optimal *λ* is based on the fact that when the optimal solution **o**^{∗} is found, the partial derivatives with respect to each restored voxel ${\mathbf{o}}_{j}^{k}$ in Eq. (9) should be equal or close to zero. So the *λ* is obtained by minimizing the sum of the square of Eq. (9) over all the reconstructed voxels [27]

Because the objective function in Eq. (13) is a quadratic form of *λ*, the close form solution of *λ* can be found by equating the derivatives of Eq. (13) with respect to λ to zero

The joint regularized RL deconvolution algorithm is summarized in Algorithm 1. We performed a series of simulations to evaluate the effectiveness and convergence of the joint regularized RL algorithm for our CMFM system. The simulation results and convergence plots are shown in Appendix A.

## 3. System analysis

A schematic of our CMFM imaging setup is shown in Fig. 1(a). A regular microscope (Nikon Eclipse Ti) is augmented by a 4 *f* relay system (L1 = 200mm, Thorlabs AC508-200-A; L2 = 400mm, Thorlabs AC508-400-A). The total magnification of the whole system is 120× with the use of 60× objective lens (Nikon 60× 1.27 NA CFI Plan Apo water immersion, MRD07650). Fluorescence emission collected through the objective was separated from excitation light by a dichroic mirror (Semrock, FF596-Di01-25x36) and bandpass filter (Semrock, FF02-641/75-25). Excitation light was bandpass filtered (Semrock FF01-578/21-25) from A solid-state LED light source (Lumencor, Spectral × light engine). An electron multiplying charge coupled device (EMCCD, Andor iXon Ultral 888) has 1024 × 1024 pixels with pixel pitch 13um × 13um. Thus, each pixel corresponds to a size of 108nm × 108nm in the object domain, which satisfies the Nyquist sampling criterion. A multifocal grating (MFG) is placed at the Fourier plane of the 4 *f* relay system. As an example, schematics of the MFG used in our first experiment is shown in Fig. 1(b). This MFG was optimized and designed to produce 3 × 3 differently focused tile images on a single 2D detector as shown in Fig. 1(c), with an equal focal step Δ*z* of 0.25um between adjacent tiles and almost even light energy distribution of [7.56, 7.48, 7.21, 7.47, 7.62, 7.47, 7.21, 7.48, 7.56]percent for each tile. A total diffraction efficiency of the MFG is therefore 67.06%, which is close to the theoretical maximum efficiency. The details about the design and manufacture of the MFG are provided in Appendix C. A second, narrower bandpass filter (Semrock, FF01-620/14-25) is placed immediately before the camera to narrow the bandwidth of the emission light to mitigate chromatic aberration. To avoid the overlapping between tile images, a field stop slightly smaller than the tile width is placed at the intermediate image plane to reduce the lateral FOV [see Fig. 1(a)].

Next, we provide analysis on the Point Spread Function (PSF), the chromatic aberration (CA), and the FOV of our CMFM system.

#### 3.1. Point Spread Function (PSF) and resolution

We image a z-stack of a 200nm fluorescent bead (ThermoFisher, F8810), which approximates an ideal light-emitting point, to measure PSF. Figure 2 (first row) shows CMFM lateral PSF imaged under five different axial positions (columns). Each lateral PSF consists of 3 × 3 tiles, with one tile in focus (outlined by a green box; also zoomed in second row) and eight tiles out of focus. From left to right, as the focus depth increases, we can observe a focal shift on the axial direction of the x-z plots of PSFs (third row). The Optical Transfer Functions (OTFs) are plotted as reference (bottom two rows).

We measure the full-width half-magnitude (FWHM) values to quantify the resolution. Note that the PSFs are affected by chromatic aberration and the lateral resolution is anisotropic. Therefore, we perform a principle component analysis (PCA) of xy PSFs. The FWHM values of those five tiles’ PSFs (second row) are {0.32, 0.87, 1.17, 0.87, 1.17}um in the CA direction, and [0.32, 0.37, 0.38, 0.37, 0.36] um in the perpendicular direction. The axial FWHM values are {1.03, 1.39, 1.50, 1.39, 1.55} um. The variation of the in-focus PSF sizes results from directional chromatic aberration. Based on the PCA results, each horizontal and vertical diffraction order tile has a resolution loss of 0.55um due to CA. Each diagonal diffraction order tile has a resolution loss of 0.85um due to geometry.

#### 3.2. Chromatic aberration (CA)

Due to the diffractive nature of the MFG, the CMFM system suffers from obvious CA effect, which causes the loss of the intensity and resolution. Mathematically, the lateral CA per tile can be expressed as [28]:

*m*and

*n*are horizontal and vertical diffraction orders of the tile,

*f*is the focal length of the second lens of the 4

*f*system,

*d*and

_{x}*d*are periods of the MFG in the

_{y}*x*and

*y*directions, and Δ

*λ*is the bandpass spectrum of the emission collected by the detector. The axial CA for each tile, i.e., different wavelengths are focused at different distances from the lens, can also be expressed as [28]: where

*λ*the wavelength used to design the MFG, and

_{c}*f*

_{m}_{,}

*= (*

_{n}*m*+

*ln*)Δ

*z*is the focusing distance of each individual tile.

From Eq. (15) and Eq. (16), we can see that for the central tile where *m* = *n* = 0, *δ _{x}* =

*δ*=

_{y}*δ*= 0, which means the central tile’s PSF and image are free of CA. However, for off-axis tile, both lateral and axial CA exist. Each horizontal and vertical diffraction oder tile has a dispersion of

_{z}*mf*Δ

*λ*/

*d*, which is equal to 1.07um when

_{x}*m*= 1 for our CMFM system of

*f*= 400mm, Δ

*λ*= 15nm,

*d*=

_{x}*d*= 56um and $\widehat{M}=120$. Each diagonal diffraction order tile has a dispersion of $\sqrt{{m}^{2}+{n}^{2}}f\mathrm{\Delta}\lambda /{d}_{x}$. In addition, the dispersion direction varies for different diffraction order tiles due to geometry. If left uncorrected, the CA would cause a loss of the the intensity and resolution in the chromatic dispersion direction. In principle, CA can be optically corrected by using CA corrective optics. However, the corrective optics increases system cost and complexity [4]. Here, we show that CA can be effectively compensated by

_{y}*computation*. This is possible because the CA is directional due to geometry of the tile distribution in our CMFM system. For example, in Fig. 2, the eight non-centric tiles exhibit directional stretching towards the image center. At depth

*z*= 0 (first column), the PSF is CA-free. On the other focal planes, from Δ

*z*to 4Δ

*z*(second column to last column), even though the green boxes indicate in-focus tiles, the PSFs stretch in different directions at different diffraction orders. However, the stretching does not occur at the perpendicular direction of CA. This implies that although a certain tile is affected by CA on one direction, the perpendicular direction is less affected and can be used to compensate another tile which suffers CA at this direction. Therefore, by jointly utilizing all the tiles in the model, 3D deconvolution can preserve 2D spatial frequencies that are lost by CA.

### CA blur vs defocus blur

Note here that the blur resulted from CA is different than the defocus blur. Fig. 3 presents a comparison of the two types of blurs. In Fig. 3(a), the red box highlights the in-focus tile. This in-focus tile is not located at the image center and has CA blur in horizontal dimension. Meanwhile, the blue box highlights the tile that suffers both defocus blur and CA blur but in vertical dimension. However, in horizontal dimension, this out-of-focus tile only has defocus blur. An Optical Transfer Function (OTF) comparison is shown in Fig. 3(b). The defocus blur size is smaller than the CA blur size in horizontal direction, resulting in a wider OTF lobe, which implies that higher resolution reconstruction can be achieved by using all the information provided from the PSFs. On the right panel, we present numerical simulation results. The simulated resolution target [Fig. 3(d)] contains bars pointing four directions. By using only the in-focus PSF, as shown in Fig. 3(f), only horizontal bars can be resolved, as the high frequency has only been preserved in vertical direction [Fig. 3(b)]. On the other hand, by making full use of the PSFs, signals on different directions can be well-recovered [Fig. 3(g)].

#### 3.3. Field-of-View (FOV)

In conventional MFM designs [Fig. 4(a)], the lateral FOV is the same as the lateral size of each tile image, which can be expressed in the sample domain as:

where*L*and

_{x}*L*are detector size in the

_{y}*x*and

*y*directions,

*l*is the number of tiles in each dimension, and $\widehat{M}$ is the magnification of the MFM. The axial FOV is limited by the focused imaging range of the designed MFG as

From Eq. (17) and Eq. (18), we can see that by increasing the number of tiles *l*^{2}, the axial FOV will increase but at the expense of the reduced lateral FOV for a fixed detector size. Thus, MFM trades its lateral FOV for the depth resolution. The maximum recovered 3D volume or tracking space in MFM is *FOV _{x}* ×

*FOV*×

_{y}*FOV*.

_{z}### 3.3.1. Enlarging lateral tracking space

In conventional MFM designs, large tile spacings are used to ensure that large objects can be imaged [see Fig. 4(a)]. However, this comes at the cost of a large reduction in lateral tracking area. This, however, is not optimal for the MFM 3D tracking applications where the objects of interest (bacteria or molecule, etc.) are quite small but move freely in a large 3D space. In single molecule tracking, larger tracking space is more desirable than the lateral FOV.

By modifying our MFG design to produce a smaller tile spacing, we can achieve a small lateral FOV that can be tracked over a large area [see Fig. 4(b)]. The comparison of MFM image diagrams obtained by two different design methods are shown in Fig. 4. When optimizing the MFG for tracking applications, the dimensions of the lateral trackable area are

*x*and

*y*dimensions, where

*s*and

_{x}*s*are tile spacing in x and y dimensions. Note that Eq. (19) is a general form of Eq. (17). From Eq. (19), we can clearly see that the lateral tracking area

_{y}*T*and

_{x}*T*can be enlarged by designing a smaller tile spacing

_{y}*s*and

_{x}*s*. Therefore, our method can achieve a larger lateral tracking area for MFM tracking applications.

_{y}### 3.3.2. Numerical validation

We simulated a 3D tracking space of 1024 × 1024 × 71 voxels with each voxel size of 108nm × 108nm × 200nm. The scene consists of an ellipsoid [Fig. 5(a)] with a diameter of 4.32um in *x* and *y* directions and 8um in *z* direction, which is similar to the size of a bacterium in our experiment [see Fig. 8]. The *xz* slice and 1D axial profile of the ellipsoid are shown in Fig. 5(a) (right) and Fig. 5(d) (red line), respectively. The center of the ellipsoid is placed at 35.8um horizontally from the center of the detector.

For conventional MFM, we used an experimentally captured z-stack PSF for the simulation. The PSF consists of 5 × 5 tiles with a tile spacing of 205 pixels in both *x* and *y* dimensions. The 2D MFM measurement [Fig. 5(e)] was generated based on the forward model of Eq. (3). The maximum signal and background photon counts are set to be 100 and 5, respectively. The Poisson noise was then added to the measurement by using Matlab’s Poisson random number generator. For the proposed MFM, the PSF was generated by cropping the central 101 × 101 pixels region of each tile of the experimentally acquired MFM PSF and then recombining the cropped small tiles. As a result, the tile spacing is 101 pixels instead of 205 pixels. The measurement [Fig. 5(f)] of this new MFM design was also generated based on Eq. (3) and then corrupted by background and Poisson noise.

In conventional MFM, since the horizontal position of the synthetic ellipsoid exceeds the tile spacing, the left two-column tiles produced by MFM which contains axial depth information from −12Δ*z* to −3Δ*z* are shifted out of the detector’s FOV and therefore cannot be recorded by the detector [Fig. 5(e)]. However, in our new MFM design, the 5 × 5 tiles are recorded in their entirety by the detector [Fig. 5(f)]. According to Eq. (19), the lateral tracking area is about 22um × 22um for conventional method, and 67um × 67um for the proposed method. For reconstruction, we recovered a 3D space on a grid of 1024 × 1024 × 71 with each grid size being 108nm × 108nm × 200nm for both methods. The reconstructed ellipsoid (cropped from the larger 3D recovery space) and its *xz* slice by different design methods are shown in Figs. 5(b) and (c), respectively. The 1D axial profiles of the reconstructed ellipsoids are also compared in Fig. 5(d) (black and blue lines). We can clearly see that since the left two-column tiles of the ellipsoid are missing in the conventional MFM measurement, the ellipsoid is reconstructed poorly while our design provides a good reconstruction. We also compare the signal loss as a function of lateral position of the tracked object in Figs. 5(g) and (h). Similar to vignetting effect, the signal falls off when approaching the edges. Our proposed design alleviates peripheral signal loss and preserves full-sized PSF array over an enlarged lateral space.

## 4. Experimental results

We present two experimental results with different biological samples to test the spatial and temporal performance of our system. As shown in Fig. 6, we divide the whole image into different sets of tiles by modifying the MFG pattern. However, the focal step Δ*z* is designed to be 0.25um for both cases. In Fig. 6(a), the image is divided into 3 × 3 tiles so that 9 focal planes are in focus, while in Fig. 6(b), 5 × 5 focal planes are obtained for the purpose of tracking.

#### 4.1. Static scene: CMFM vs confocal microscopy

We prepare several periplasms (static) as our imaging sample for 3D spatial resolution characterization. The captured raw image is shown in Fig. 6(a). The exposure time is 0.5s. The 3D PSFs are measured by recording a z-stack of a small fluorescent bead with z focal step of 50nm. The 3D image are reconstructed on a 300 × 300 × 51 grid with each grid size being 108nm × 108nm × 50nm. Figure 7(d) shows our computational 3D reconstruction in comparison with the captured focal stack [Fig. 7(c)] assembled by z-stacking 9 tiles from the 2D raw MFM image. For ground truth, we captured the 3D image of periplasms using a scanning confocal microscope [Fig. 7(a)] and performed deconvolution of the confocal image [Fig. 7(b)] by using a commercial Huygens software. For each 3D image, a XY slice at *z* = 0 (first row) and a YZ slice at *x* = −6.5um (third row) are displayed. The comparisons of vertical line profiles indicated by the red solid line in XY slice, and comparison of axial profiles indicated by the red dotted line in YZ slice are shown in the second and fourth rows, respectively. From Fig. 7(c), we can see that without computational reconstruction, the 3D image assembled by z-stacking nine tile images from 2D MFM measurement is degraded by noise, out-of-focus blur and low spatial resolution in all three dimensions. The outer membranes are blurred and the empty space between them cannot be recognized. In addition, due to the out-of-focus blur, the FWHM of the axial profile is about 2um. However, after deconvolution, the reconstructed 3D image [Fig. 7(d)] is much cleaner, less out-of-focus blur, and more importantly, has high resolution in three dimensions. The empty space between outer membranes can be clearly recognized. The outer membranes which are vertically blurred to a single peak can be well separated by a dip of 84% between two peaks with the lateral FWHM of 0.35um [second row of Fig. 7(d)], which is consistent with FWHM of the measured PSF. Due to removal of out-of-focus blur, the axial FWHM is about 0.5um [fourth row of Fig. 7(d)], with about two times improvement over that of the raw MFM image. Note that we compare 0.5s captures with our CMFM instrument to a 20s confocal scan taken with a dual spinning disk confocal microscope (Model: CSU-W1) made by Yokogawa Electric Corporation. Our CMFM results show similar 3D image quality, but achieve a 40x reduction in acquisition time. The automatically recovered background noise and optimal regularizer parameter at each iteration of the joint RL-TV deconvolution process is shown in Appendix B Fig. 11.

### Cautionary remarks

Note that although both our CMFM and confocal deconvolution microscopy achieve 0.5um axial resolution in the experiment, the principle of the axial resolution improvement is different for two techniques. Confocal microscopy increases axial resolution by means of using a (or multiple) spatial pinhole(s) to block out-of-focus light in image detection process. Therefore, the axial extent of confocal PSF is narrower than that in the widefield microscope, and thus a high contrast and resolution image can be obtained. However, for CMFM, the axial resolution improvement is based on sparsity-based super-resolution microscopy techniques [29–31] by utilizing the signal sparsity in the arbitrary known domain (i.e., gradient domain in our case). Recently, a new method called SPARCOM [32]: sparsity-based super-resolution correlation microscopy by utilizing sparsity in the correlation domain, is also reported to achieve spatial resolution comparable to PALM and STORM.

#### 4.2. 3D tracking of moving bacterium

In the second experiment, we demonstrate the 3D video reconstruction of a moving bacterium (see Visualization 2). The MFG is modified to focus on 5 × 5 planes, as shown in Fig. 6(b). A video is captured at 25 frame per second (fps) (see Visualization 1). According to our first experimental reconstruction where the axial FWHM can achieve 0.5um, we reconstruct the 3D image of the bacterium on a grid of 150 × 150 × 41 with each grid size being 108nm × 108nm × 200nm, which corresponds to a FOV of 16.2um × 16.2um × 8um in 3D space for each video frame. Figures 8(a-e) show five out of fifty frames 3D reconstructions. Note that for the visualization purpose, we cropped the reconstructed 3D volume and just showed 4um × 6um × 4um region around the bacterium. By computing the center of mass for each frame reconstruction, a 3D trajectory of the moving bacterium can be tracked [Fig. 8(f)]. The colorbar in Fig. 8(f) indicates the frame index over time. The automatically recovered background value and the optimal regularizer parameter for each video frame is shown in Appendix B Fig. 12.

## 5. Conclusion

We have presented computational multi-focus microscopy (CMFM), a framework that balances the imaging system design and computational processing to achieve single-shot 3D microscopy. Our optical design significantly reduces the system complexity and experimental alignment by discarding CA correction optics. We correct for CA computationally rather than optically. This comes from the fact that CA occurs in different directions at different diffraction order tiles. Therefore, by jointly utilizing all the tiles in the model, 3D deconvolution can preserve 2D spatial frequencies that are lost by CA. By incorporating TV regularization, our algorithm can not only compensate for CA, but also perform high quality 3D reconstructions from noisy data. We build on a joint regularized RL deconvolution algorithm and incorporate two types of noise. Notice that our proposed algorithm is free of any parameter tuning by automatically estimating the background noise and the optimal regularization parameter using alternating gradient descent.

We experimentally demonstrate that the out-of-focus blur along *z* can be significantly suppressed and the axial resolution as high as 0.5um is achievable. The lateral resolution of 0.35um, which is consistent with the diffraction-limited resolution, is also experimentally demonstrated. A high resolution 3D video of a movable bacterium at 25fps is also computationally reconstructed to verify the proposed CMFM framework. Even though we demonstrate a single color imaging in the paper, we anticipate that multi-color imaging is also possible using our CMFM system. Similar to [4], we can also insert a dichroic mirror to split the color channels onto separate cameras, and place a narrow bandpass filter with different central wavelengths in front of each camera. This is possible because the blur size of chromatic aberration depends on the bandpass spectrum of the emission, but not the central wavelength [see Eq. (15)].

Finally, we propose a new design method of MFG to enlarge the lateral tracking area of MFM tracking applications without sacrificing its axial FOV and single-shot capture speed. The benefits of the simple system design and high resolution image recovery offered by the proposed CMFM will broaden its applications in 3D single-shot imaging.

## Appendix A algorithm evaluation

To verify the effectiveness of the joint regularized RL algorithm for our CMFM and also quantify the reconstruction quality, we performed a series of simulations by using an experimentally captured CMFM PSFs. The synthetic 3D image [Fig. 9(top left)] was obtained from the confocal microscope image of the 3D bacterial that was imaged under the CMFM. The CMFM measurement image was generated based on image formation model of Eq. (3) and then degraded with background and Poisson noise. The maximum signal and background photon counts are set to be 50 and 5, respectively. The Poisson noise was then added to the measurement by using Matlab’s Poisson random number generator.

To quantify the reconstruction quality, we computed peak signal-to-noise ratio (PSNR) and I-divergence [26] between ground truth image *u* and the reconstructed image *v*:

*is the maximum pixel value of the image*

_{u}*u*, MSE is the mean square error (MSE) between two images

*u*and

*v*, and

*i*stands for the voxel index.

Figure 9(top right) shows the reconstruction of the standard RL deconvolution without TV constraint by setting *λ* = 0. Because TV prior is not used, the reconstruction can not preserve the smooth edge of the original image and contains the artifacts. To investigate the effect of the background noise on the CMFM reconstruction quality, we performed RL-TV deconvolution but with an incorrect background value of 10 (the truth value is 5). The reconstructed image is shown in Fig. 9(bottom left). Since the used background value is two times bigger than the real one, the recovered 3D image suffers from substantial signal loss. The reconstructed 3D image by joint RL-TV algorithm is shown in Fig. 9(bottom right). The PSNRs between the original image and reconstructed images by three methods are 34.2dB, 27.5dB, 40.2dB, and I-divergences are 204.3, 517, and 96.7, respectively. Clearly, the joint RL-TV algorithm, which simultaneously recovers 3D image, the background noise value and optimal regularizer parameter, gave the best reconstruction quality of CMFM among three methods. Figures 10(a-d) show the reconstructed background value *b*, the optimal regularizer parameter *λ*, PSNR and I-divergence at each iteration of the deconvolution process. Note that both the background value and regularizer parameter were initialized to be 100, but they gradually converged to 4.99 and 2.2 × 10^{−3}, respectively. From Fig. 10(c-d), we can also see that the image quality is dramatically improved during the iterative reconstruction process, proving the effectiveness of joint RL-TV algorithm for the 3D reconstruction of our snapshot CMFM system.

## Appendix B experimental results

We also tested the algorithm on two types of real data. The 3D reconstruction of static periplasms is shown in Fig. 7. The automatically recovered background noise and optimal regularizer parameter at each iteration of the joint RL-TV deconvolution process is shown in Fig. 11.

For the second experiment, we captured an MFM video of a moving bacterium at 25 frames per second (fps) using our CMFM techniques. The complete 3D video reconstruction from the first frame to the last frame is shown in Visualization 2. The automatically recovered background value and the optimal regularizer parameter for each video frame is shown in Fig. 12.

Note that the algorithm was written in MATLAB2014b and run on the computer with CPU E5-1650 and 64GB of RAM. The algorithm terminated when the maximum iteration number was reached or it converged, defined when the difference between to consecutive values of the cost function is smaller than a predefined threshold (10^{−6} in our case). For the first experiment, it took 134 iterations to converge and each iteration took about 6 seconds. The total computation time was therefore about 13.4 minutes. For the second experiment of 3D video reconstruction, it took 83 iterations to converge for a single video frame and each iteration took about 3.8 seconds. The total computation time was therefore about 5.2 minutes for each video frame. The code is publicly available on our website: http://compphotolab.northwestern.edu.

## Appendix C design and manufacture of the MFG

The MFG was designed with a focal shift Δ*z* = 0.25um between tiles based on a procedure outlined in [4]. The unit cell of the MFG was designed by using the iterative Gerchberg-Saxton (GS) algorithm so that fluorescence emission light was distributed evenly and efficiently among the tiles. To create a proper focal shift, the geometric distortion [4] was then imposed on the MFG.

In order to construct the MFG, a 5mm thick UV fused silica substrate (Thorlabs WG41050) was cleaned by using acetone, isopropyl alcohol and distilled water and then spin-coated with 1.5um thick layer of AZ1512 photoresist (Shipley). A laser writer (Heidelberg) exposed the MFG pattern into the photoresist (405nm laser at a dose of 150mJ). The photoresist was developed in AZ-300 MIF developer (Integrated Micro Materials) for 20 seconds. Then the pattern was etched in a reactive ion etcher (RIE, Oxford Instruments). The photoresist remaining after etching was stripped with acetone in an ultrasonic bath. The etching depth and surface roughness of the MFG were measured with contact profilometry. All fabrication steps were completed at the Prizker Nanofabrication Facility at the University of Chicago.

## Funding

Biological Systems Science Division, Office of Biological and Environmental Research, Office of Science, U.S. Dept. of Energy, under Contract DE-AC02-06CH11357; NSF CAREER grant IIS-1453192; ONR award N00014-15-1-2735.

## Disclosures

The authors declare that there are no conflicts of interest related to this article.

## References

**1. **P. M. Blanchard and A. H. Greenaway, “Simultaneous multiplane imaging with a distorted diffraction grating,” Appl. Opt. **38**, 6692–6699 (1999). [CrossRef]

**2. **S. Abrahamsson, R. Ilic, J. Wisniewski, B. Mehl, L. Yu, L. Chen, M. Davanco, L. Oudjedi, J.-B. Fiche, B. Hajj, X. Jin, J. Pulupa, C. Cho, M. Mir, M. E. Beheiry, X. Darzacq, M. Nollmann, M. Dahan, C. Wu, T. Lionnet, J. A. Liddle, and C. I. Bargmann, “Multifocus microscopy with precise color multi-phase diffractive optics applied in functional neuronal imaging,” Biomed. Opt. Express **7**, 855–869 (2016). [CrossRef] [PubMed]

**3. **L. Oudjedi, J.-B. Fiche, S. Abrahamsson, L. Mazenq, A. Lecestre, P.-F. Calmon, A. Cerf, and M. Nöllmann, “Astigmatic multifocus microscopy enables deep 3d super-resolved imaging,” Biomed. Opt. Express **7**, 2163–2173 (2016). [CrossRef] [PubMed]

**4. **S. Abrahamsson, B. H. J. Chen, S. Stallinga, A. Y. Katsov, J. Wisniewski, G. Mizuguchi, P. Soule, F. Mueller, C. D. Darzacq, X. Darzacq, C. Wu, C. I. Bargmann, D. A. Agard, M. Dahan, and M. G. L. Gustafsson, “Fast multicolor 3d imaging using aberration-corrected multifocus microscopy,” Nat. methods **10**, 60–63 (2013). [CrossRef]

**5. **R. Prevedel, Y.-G. Yoon, M. Hoffmann, N. Pak, G. Wetzstein, S. Kato, T. Schrödel, R. Raskar, M. Zimmer, E. S. Boyden, and A. Vaziri, “Simultaneous whole-animal 3d imaging of neuronal activity using light-field microscopy,” Nat. methods **11**, 727–730 (2014). [CrossRef] [PubMed]

**6. **N. C. Pégard, H.-Y. Liu, N. Antipa, M. Gerlock, H. Adesnik, and L. Waller, “Compressive light-field microscopy for 3d neural activity recording,” Optica **3**, 517–524 (2016). [CrossRef]

**7. **J.K. Adams, V. Boominathan, B.W. Avants, D.G. Vercosa, F. Ye, R.G. Baraniuk, J.T. Robinson, and A. Veeraraghavan, “Single-frame 3d fluorescence microscopy with ultraminiature lensless flatscope,” Sci. Adv. **3**, 1701548 (2017). [CrossRef] [PubMed]

**8. **A. F. Coskun, I. Sencan, T.-W. Su, and A. Ozcan, “Lensless wide-field fluorescent imaging on a chip using compressive decoding of sparse objects,” Opt. Express **18**, 10510–10523 (2010). [CrossRef] [PubMed]

**9. **N. Antipa, G. Kuo, R. Heckel, B. Mildenhall, E. Bostan, R. Ng, and L. Waller, “Diffusercam: lensless single-exposure 3d imaging,” Optica **5**, 1–9 (2018). [CrossRef]

**10. **J. N. Mait, “Understanding diffractive optic design in the scalar domain,” J. Opt. Soc. Am. A **12**, 2145–2158 (1995). [CrossRef]

**11. **G. Pedrini and S. Schedin, “Short coherence digital holography for 3d microscopy,” Optik-International J. for Light. Electron Opt. **112**, 427–432 (2001). [CrossRef]

**12. **B. W. Schilling, T.-C. Poon, G. Indebetouw, B. Storrie, K. Shinoda, Y. Suzuki, and M. H. Wu, “Three-dimensional holographic fluorescence microscopy,” Opt. Lett. **22**, 1506–1508 (1997). [CrossRef]

**13. **Z. Wang, D. Ryu, K. He, O. Cossairt, and A. K. Katsaggelos, “4d tracking of biological samples using lens-free on-chip in-line holography,” in *Digital Holography and Three-Dimensional Imaging*, (Optical Society of America, 2017), pp. Tu2A–4.

**14. **L. Tian, J. Wang, and L. Waller, “3d differential phase-contrast microscopy with computational illumination using an led array,” Opt. letters **39**, 1326–1329 (2014). [CrossRef]

**15. **L. Tian and L. Waller, “3d intensity and phase imaging from light field measurements in an led array microscope,” optica **2**, 104–111 (2015). [CrossRef]

**16. **S. Chowdhury, W. J. Eldridge, A. Wax, and J. A. Izatt, “Structured illumination microscopy for dual-modality 3d sub-diffraction resolution fluorescence and refractive-index reconstruction,” Biomed. Opt. Express **8**, 5776–5793 (2017). [CrossRef]

**17. **J. Rosen and G. Brooker, “Digital spatially incoherent fresnel holography,” Opt. letters **32**, 912–914 (2007). [CrossRef]

**18. **J. Rosen and G. Brooker, “Non-scanning motionless fluorescence three-dimensional holographic microscopy,” Nat. Photonics **2**, 190 (2008). [CrossRef]

**19. **O. Cossairt, K. He, R. Shang, N. Matsuda, M. Sharma, X. Huang, A. Katsaggelos, L. Spinoulas, and S. Yoo, “Compressive reconstruction for 3d incoherent holographic microscopy,” in Image Processing (ICIP), 2016 IEEE International Conference on, (IEEE, 2016), pp. 958–962.

**20. **D. Ryu, Z. Wang, K. He, G. Zheng, R. Horstmeyer, and O. Cossairt, “Subsampled phase retrieval for temporal resolution enhancement in lensless on-chip holographic video,” Biomed. optics express **8**, 1981–1995 (2017). [CrossRef]

**21. **Z. Wang, Q. Dai, D. Ryu, K. He, R. Horstmeyer, A. Katsaggelos, and O. S. Cossairt, “Dictionary-based phase retrieval for space-time super resolution using lens-free on-chip holographic video,” in *Computational Optical Sensing and Imaging*, (Optical Society of America, 2017), pp. CTu2B–3.

**22. **H. Y. Liu, J. Zhong, and L. Waller, “Multiplexed phase-space imaging for 3d fluorescence microscopy,” Opt. Express **25**, 14986–14995 (2017). [CrossRef] [PubMed]

**23. **Z. Wang, L. Spinoulas, K. He, L. Tian, O. Cossairt, A. K. Katsaggelos, and H. Chen, “Compressive holographic video,” Opt. Express **25**, 250–262 (2017). [CrossRef] [PubMed]

**24. **M. Broxton, L. Grosenick, S. Yang, N. Cohen, A. Andalman, K. Deisseroth, and M. Levoy, “Wave optics theory and 3-d deconvolution for the light field microscope,” Opt. Express **21**, 25418–25439 (2013). [CrossRef] [PubMed]

**25. **P. J. Greeb, “On the use of em algorithm for penalized likelihood estimation,” J. Royal Stat. Soc. B **52**, 443–452 (1990).

**26. **N. Dey, L. Blanc-Feraud, C. Zimmer, P. Roux, Z. Kam, J. C. Olivo-Marin, and J. Zerubia, “Richardson-lucy algorithm with total variation regularization for 3d confocal microscope deconvolution,” Microsc. Res. Tech. **4**, 260–266 (2006). [CrossRef]

**27. **M. Laasmaa, M. Vendelin, and P. Peterson, “Application of regularized richardson-lucy algorithm for deconvolution of confocal microscopy images,” J. Microsc **2**, 124–140 (2011). [CrossRef]

**28. **K. He, X. Huang, X. Wang, S. Yoo, P. Ruiz, I. Gdor, N. J. Ferrier, N. Scherer, M. Hereld, A. K. Katsaggelos, and O. Cossairt, “Design and simulation of a snapshot multi-focal interferometric microscope,” Opt. Express **26**, 27381–27402 (2018). [CrossRef]

**29. **S. Hugelier, J. J. De Rooi, R. Bernex, S. Duwé, O. Devos, M. Sliwa, P. Dedecker, P. H. Eilers, and C. Ruckebusch, “Sparse deconvolution of high-density super-resolution images,” Sci. reports **6**, 21413 (2016).

**30. **S. Hugelier, P. Eilers, O. Devos, and C. Ruckebusch, “Improved superresolution microscopy imaging by sparse deconvolution with an interframe penalty,” J. Chemom. **31**, e2847 (2017). [CrossRef]

**31. **A. Szameit, Y. Shechtman, E. Osherovich, E. Bullkich, P. Sidorenko, H. Dana, S. Steiner, E. B. Kley, S. Gazit, and T. Cohen-Hyams *et al.*, “Sparsity-based single-shot subwavelength coherent diffractive imaging,” Nat. materials **11**, 455 (2012). [CrossRef] [PubMed]

**32. **O. Solomon, M. Mutzafi, M. Segev, and Y. C. Eldar, “Sparsity-based super-resolution microscopy from correlation information,” Opt. Express **26**, 18238–18269 (2018). [CrossRef] [PubMed]