Optica Publishing Group

Simultaneous measurement and reconstruction tailoring for quantitative phase imaging

Open Access

Abstract

We propose simultaneous measurement and reconstruction tailoring (SMaRT) for quantitative phase imaging; it is a joint optimization approach to inverse problems wherein minimizing the expected end-to-end error yields optimal design parameters for both the measurement and reconstruction processes. Using simulated and experimentally-collected data for a specific scenario, we demonstrate that optimizing the design of the two processes together reduces phase reconstruction error over past techniques that consider these two design problems separately. Our results suggest at times surprising design principles, and our approach can potentially inspire improved solution methods for other inverse problems in optics as well as the natural sciences.

© 2018 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

1. Introduction

Imaging the spatially-varying optical path length differences encountered by light as it passes through a specimen of interest, i.e., quantitative phase imaging, has wide applicability in many scientific domains. For example, the spatial distribution of absorption in many biological specimens does not exhibit high contrast, unlike quantitative phase images of the same specimen. While interferometry and holography are classical solutions to phase imaging, there also exists a large family of self-interferometric techniques wherein the axial intensity evolution of a collimated beam after passing through a specimen of interest is used to numerically estimate the specimen’s optical path length profile. Members of this family include nonlinear phase retrieval techniques like Gerchberg–Saxton–Fienup approaches [1,2] or PhaseLift [3], techniques based on the transport of intensity equation (TIE) [4–8], and techniques based on contrast transfer functions (CTFs) [9, 10]. While TIE and CTF techniques usually assume coherent or nearly coherent illumination, they have been extended to the partially coherent regime as well. Partial coherence results in loss of high frequency information away from the focal plane [8,11]; loss of spatial coherence caused by area sources can be inverted via illumination diversity [12].

Like any other inverse problem, reconstruction quality depends on the design of both the measurement and reconstruction steps of the imaging method—what measurements do we take, and how do we then numerically estimate the phase? In standard TIE imaging, transverse intensity profiles at two closely spaced planes are used to estimate the axial derivative, from which a phase image is computed via regularized inversion of the Laplace operator. Recent papers [7, 8, 13–15] suggest measurements at multiple planes for better derivative estimates, whereas other threads [16,17] discuss alternatives to Tikhonov regularization for reconstruction.

The optimal reconstruction design naturally depends on the choice of measurements taken, but conversely the optimal measurement design also depends on the choice of reconstruction method—we want to avoid wasting resources, such as imaging time or light budget, to collect information that the reconstruction method will simply ignore. This principle suggests that it is suboptimal to design measurements to capture every last bit of information about the specimen and then design a reconstruction approach based on these measurements. Rather, it would be better to jointly optimize the design of the two. The seminal wavefront coding papers [18,19] as well as the ensuing computational photography field followed this general approach, wherein the captured images themselves may not be the “best” images of the object being photographed, but they yield higher quality images than standard photography after appropriate numerical postprocessing. Joint optimization was also leveraged explicitly via what is known as multi-domain optimization (MDO) in the design of many computational optical systems such as the origami lens [20,21]. Inspired by these approaches, we propose phase imaging via a joint optimization framework we call simultaneous measurement and reconstruction tailoring (SMaRT), which we now outline.

Our framework builds upon the same motivation as Wiener filtering and related minimum mean square error (MMSE) estimators, but we extend the classical approach by allowing the noise statistics (e.g., per-image noise level) to vary as a consequence of measurement design. This leads to seeking an optimal measurement–reconstruction design pair that minimizes the expected reconstruction error. More specifically, we parameterize specimens by vectors s in some Hilbert space 𝒮, and we are interested in imaging a class of specimens 𝒮0 ⊂ 𝒮 with prior probability density function f𝒮0(s). Given design spaces ℳ0 and ℛ0 for the measurement M and reconstruction R, respectively, we wish to solve a problem of the form

\min_{M \in \mathcal{M}_0,\, R \in \mathcal{R}_0} \int f_{\mathcal{S}_0}(s)\, \mathbb{E}\big[\| s - r_R(p_M(s)) \|_2^2\big]\, ds,
where pM(·) is a stochastic function that describes the noisy forward model for a specific measurement method M, and rR(·) is a deterministic function that reconstructs an estimate of s from data, based on reconstruction design R. While the general case is difficult to solve, the specific case of weak-phase objects illuminated by partially coherent light and imaged through a standard optical microscope under a fixed time budget enables rewriting Eq. (1) as a regularized least squares problem of the form
\min_{x_1,\ldots,x_L} \Big\| b - \sum_{l=1}^{L} A_l x_l \Big\|_2^2 + \gamma \Big( \sum_{l=1}^{L} \| x_l \|_2 \Big)^2.
The least squares term describes the goodness of fit in the noiseless case, and the squared sum of norms regularizer models the impact of noise on the reconstruction under a fixed time budget. In this imaging scenario, measurement design M is a set of possibly-zero exposure times for L candidate imaging planes, and the reconstruction design R is a set of spectral coefficients so that rR(·) spectrally mixes the captured images accordingly. Regularization parameter γ is not hand-tuned, but rather computed from calibration data; the Al matrices and vector b are also a function of calibration data as well as expected feature size and feature phase height, the only two user-adjustable parameters in this approach. Optimal M and R are both computed from the solution x1, . . ., xL to the problem above. Furthermore, it is possible to predict the performance of the entire imaging process by examining the composite function rR (pM(·)).

The key to our SMaRT approach is the formulation of an optimization problem to jointly design the measurement and reconstruction processes for solving an inverse problem, rather than manually designing these two processes separately. This ameliorates the effect of human bias in the design process and allows a numerical algorithm to explore novel recipes that directly reduce the average reconstruction error. Furthermore, automated design facilitates leveraging additional information such as specimen statistics and the illumination spectrum, leading to more effective extraction of phase information compared to previous methods, which usually opt for simple measurement designs based on few assumptions. For example, knowledge of the spectrum of a polychromatic source allows us to mitigate the effects of blurring caused by the source not being quasimonochromatic. Simple, intuitive parameters and predictable performance make our approach easy to use in practice, and the concepts introduced in this paper can potentially be applied to other imaging scenarios as well.

The remainder of the paper is organized as follows. In Section 2, we derive an expression for the forward model pM(·) of an optical microscope and list the assumptions needed for the model to be valid; in Section 3, we use the forward model to derive a convex problem of the form in Eq. (2), discuss how to apply it to phase imaging, and show how to make its solution computationally tractable; in Section 4, we apply our method to both simulated and experimental data in the context of imaging a standard phase target; finally, in Section 5, we discuss the implications of our technique.

2. Forward propagation model

We start with a standard optical microscope layout under Köhler illumination, as shown in Fig. 1. Light from a single-color light-emitting diode (LED) is collimated by a condenser lens and illuminates a thin (phase) object of interest placed at the specimen plane. An objective lens and tube lens form a 4f system that images the specimen onto the intermediate image plane, where a camera in a standard microscope would capture images. To accommodate phase imaging, we allow the camera to be translated axially to capture images at varying amounts of defocus. Our imaging system is typical of TIE imaging and similar to systems for low-coherence inline holography.

 figure: Fig. 1

Fig. 1 A standard optical microscope, wherein an area light source at the front focal plane of a condenser lens illuminates a thin specimen (blue) coincident with the back focal plane. An objective–tube lens 4f system forms a magnified image (blue dotted) of the specimen at the intermediate image plane. Rays passing through the specimen’s center and edge are shown in gray and dark red, respectively. For clarity, relative sizes are not to scale.


To simplify our model, we assume that the field at the intermediate image plane is a magnified and diffraction-limited representation of the field immediately after the specimen, since objective lenses are often highly optimized multi-element lens systems designed to minimize aberration. This allows us to mathematically model the system as a virtual partially coherent source illuminating a magnified specimen, whose image is captured at varying amounts of defocus, both negative and positive. The magnification reduces the effective range of illumination angles emanating from the virtual source, and it, together with the numerical aperture (NA) of the objective lens, also imposes a low spatial bandlimit on the field observed at the intermediate image plane. These factors enable simplifying assumptions in the mathematical model that we derive in the rest of the section. Our derivation starts with a parameterization s of the specimen, continues with expressions for the propagated intensity profiles under both coherent and partially coherent illumination, and concludes with a model for noisy pixel measurements, yielding the stochastic forward operator pM(·).

2.1. Thin specimen parametrization

We consider thin specimens located on some reference plane (which we will denote as z = 0) with coordinates (x, y) and amplitude transmission profiles of the form

Θ(x,y)=Θ¯+s(x,y)=Θ¯+α(x,y)+iβ(x,y),
where Θ̄ is the average transmissivity, s(x, y) is a zero-mean complex-valued function with α(x, y) and β(x, y) as its real and imaginary components, respectively, and i is the imaginary unit. In practice, we assume Θ̄ = 1 since we are not interested in illumination intensity for phase imaging. We assume that the specimen is not dispersive—i.e., an incident coherent field Uin(x, y) at z = 0 for any wavelength of interest λ induces an output field at z = 0 given by Uout(x, y) = Θ(x, y)Uin(x, y). We also assume that s(x, y) is spatially bandlimited, and that we are only interested in the value of s(x, y) within a rectangular area of extent Wx × Wy. Let positive integers M and N be defined so that Wx/M and Wy/N are sufficient Nyquist sampling intervals for the bandlimit of s(x, y). Given these assumptions, we choose to model s(x, y) via a finite weighted sum of complex phasors, resulting in a periodic function with period Wx and Wy in the x and y directions, respectively:
s(x,y) \approx \sum_{m=1}^{M} \sum_{n=1}^{N} \tilde{s}_{mn} \exp(i 2\pi \xi_m x) \exp(i 2\pi \eta_n y) = \sum_{m,n} \tilde{s}_{mn}\, \psi_{mn}(x,y),
where ξm = (m − m̄)/Wx and ηn = (n − n̄)/Wy give the discretized frequencies, and m̄ = ⌊M/2⌋ + 1 and n̄ = ⌊N/2⌋ + 1 give the indexes of the zero-frequency component. We define α̃mn and β̃mn similarly for α(x, y) and β(x, y), respectively. Since α(x, y) and β(x, y) are real-valued functions, their corresponding Fourier coefficients α̃mn and β̃mn possess symmetry, i.e., α̃m′n′ = α̃*mn and β̃m′n′ = β̃*mn, where m′ = 2m̄ − m, n′ = 2n̄ − n, and * denotes complex conjugation. Note that by definition, s̃m̄n̄ = α̃m̄n̄ = β̃m̄n̄ = 0. Thus, we can represent each specimen by a vector s of length 2MN − 2, which consists of MN − 1 entries corresponding to the α̃mns followed by MN − 1 entries corresponding to the β̃mns. For the remainder of the paper, we will use ∑m,n to denote summing over all pairs of (m, n) except for (m̄, n̄).

We note that we must restrict the region of interest to a finite Wx × Wy region to make any numerical approach tractable, and we assume a periodic extrapolation of the specimen to simplify notation and achieve a model compatible with the discrete Fourier transform. In practice, no matter what assumptions are made about areas outside the region of interest, there will always be boundary artifacts that signify a mismatch between these assumptions and reality.
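As a concrete illustration, the sketch below builds the zero-mean perturbation s(x, y) for a hypothetical weak phase step and verifies the conjugate symmetry of its Fourier coefficients via the FFT (the toy specimen, grid size, and normalization are our own choices, not from the paper):

```python
import numpy as np

def spectral_coefficients(s_xy):
    """Centered Fourier coefficients of a sampled perturbation s(x, y);
    fftshift places the zero-frequency coefficient at the 0-based index
    (M // 2, N // 2), i.e. at (m_bar, n_bar) in 1-based terms."""
    M, N = s_xy.shape
    return np.fft.fftshift(np.fft.fft2(s_xy)) / (M * N)

# Hypothetical toy specimen: a weak piecewise-constant phase step.
M, N = 8, 8
phi = np.zeros((M, N))
phi[:, : N // 2] = 0.1
theta = np.exp(1j * phi)
s_xy = theta - theta.mean()          # zero-mean perturbation s(x, y)
alpha, beta = s_xy.real, s_xy.imag   # real and imaginary parts

a_t = spectral_coefficients(alpha)
m_bar, n_bar = M // 2, N // 2
# Conjugate symmetry of the real-valued alpha(x, y):
# alpha~_{m'n'} = conj(alpha~_{mn}) with m' = 2 m_bar - m, n' = 2 n_bar - n.
m, n = m_bar + 1, n_bar + 2
assert np.allclose(a_t[2 * m_bar - m, 2 * n_bar - n], np.conj(a_t[m, n]))
```

Because s(x, y) is zero-mean, the coefficient at the zero-frequency index (m̄, n̄) vanishes, matching the convention above.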

2.2. Noiseless propagated intensity for coherent illumination

We first derive the transverse intensity profile at various propagation distances z when the specimen is illuminated by a near-normal-incidence unit-amplitude coherent plane wave with wavelength λ much smaller than the features in the specimen; the magnification and numerical aperture present in the microscope’s 4f system satisfy this assumption. The incident field is

Uin(x,y)=exp[i2π(ξx+ηy)],
where frequencies ξ, η describe the angle of the incoming plane wave. The small incidence angle and large feature size assumptions imply M/(2Wx) + |ξ| ≪ 1/λ, and N/(2Wy) + |η| ≪ 1/λ. These assumptions allow us to model propagation of the outgoing field Uout(x, y) using the Fresnel diffraction integral, resulting in a propagated field of the form
U_{\text{prop}}(x,y,z) = \phi_z\, U_{\text{in}}(x,y)\, \big[ 1 + s_{\text{prop}}(x - \lambda z \xi,\; y - \lambda z \eta,\; z; \lambda) \big],
where ϕz = exp(i2πz/λ) exp[−iπλz(ξ² + η²)] and
s_{\text{prop}}(x,y,z;\lambda) = \sum_{m,n} \tilde{s}_{mn}\, \psi_{mn}(x,y)\, \exp\big[ -i \pi \lambda z (\xi_m^2 + \eta_n^2) \big].
The intensity can thus be written as
I_{\text{coherent}}(x,y,z;\xi,\eta,\lambda) = 1 + 2\,\mathrm{Re}\{ s_{\text{prop}}(x - \lambda z \xi,\, y - \lambda z \eta,\, z;\lambda) \} + | s_{\text{prop}}(x - \lambda z \xi,\, y - \lambda z \eta,\, z;\lambda) |^2.
For weakly scattering specimens, the quadratic term in Eq. (8) is small, so we can approximate
I_{\text{coherent}}(x,y,z;\xi,\eta,\lambda) \approx 1 + 2\,\mathrm{Re}[ s_{\text{prop}}(x - \lambda z \xi,\, y - \lambda z \eta,\, z;\lambda) ].
This is the weak phase object approximation, a standard approximation used in the CTF literature [22, 23]. For future reference, we note that the Fourier transform along x and y of Re[sprop(x, y, z; λ)] can be written as the following series:
\sum_{m,n} \big\{ \tilde{\alpha}_{mn} \cos[\pi \lambda z (\xi_m^2 + \eta_n^2)] + \tilde{\beta}_{mn} \sin[\pi \lambda z (\xi_m^2 + \eta_n^2)] \big\}\, \delta(\xi - \xi_m,\, \eta - \eta_n).
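The cosine and sine weights in this series are simple to evaluate; the sketch below uses hypothetical wavelength, defocus, and frequency values to show how defocus converts phase (β̃) information into measurable intensity:

```python
import numpy as np

def weak_object_ctf(xi, eta, z, lam):
    """Weights of alpha~_mn and beta~_mn in the intensity spectrum:
    cos and sin of the defocus chirp pi * lam * z * (xi^2 + eta^2)."""
    chirp = np.pi * lam * z * (xi**2 + eta**2)
    return np.cos(chirp), np.sin(chirp)

# In focus (z = 0): only absorption (alpha) contrast, no phase contrast.
c0, s0 = weak_object_ctf(xi=1e5, eta=0.0, z=0.0, lam=590e-9)
# Defocus transfers phase (beta) information into the intensity.
cz, sz = weak_object_ctf(xi=1e5, eta=0.0, z=50e-6, lam=590e-9)
```

At z = 0 the sine weight vanishes at every frequency, which is why a focused image of a pure-phase specimen shows no contrast under this approximation.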

2.3. Noiseless propagated intensity for partially coherent illumination

Using results from the fully coherent case, we now derive an expression for the propagated intensity I(x, y, z) when we illuminate the specimen with a partially coherent field whose cross spectral density [24] at the z = 0 plane has the form

W(x1, y1, x2, y2, ω) = I0 μ(x1 − x2, y1 − y2, ω),
where x1, y1 and x2, y2 are coordinates on this plane, ω = 2πc/λ is the optical frequency, c is the speed of light, k = 2π/λ is the wave number, and μ is proportional to the complex degree of spectral coherence [25] and related to the two-dimensional Fourier transform of a nonnegative function ρ(a, b, ω) by
\mu(\Delta x, \Delta y, \omega) = \iint \rho(a,b,\omega)\, \exp[ i k (a\, \Delta x + b\, \Delta y) ]\, da\, db.
The functions ρ and μ are normalized such that ∫ μ(0, 0, ω) dω = 1. Any field of the form given by Eq. (11) and Eq. (12) can be interpreted as a stochastic incoherent mixture of plane waves parameterized by ξ = −aω/c and η = −bω/c, with probability density ρ(a, b, ω). Thus, the resulting intensity can be written as
I(x,y,z) = I_0 \iiint \rho(a,b,\omega)\, I_{\text{coherent}}(x,y,z;\, -ka,\, -kb,\, 2\pi c/\omega)\, da\, db\, d\omega.
Substituting in Eq. (9), we obtain
I(x,y,z) \approx I_0 + 2 I_0\, \mathrm{Re}\Big\{ \iiint \rho(a,b,\omega)\, s_{\text{prop}}(x + 2\pi z a,\, y + 2\pi z b,\, z;\, 2\pi c/\omega)\, da\, db\, d\omega \Big\}
= I_0 + \int \Big[ \frac{I_0}{4\pi^2 z^2}\, \rho\Big( \frac{x}{2\pi z}, \frac{y}{2\pi z}, \omega \Big) \Big] \otimes 2\, \mathrm{Re}\{ s_{\text{prop}}(x,y,z;\, 2\pi c/\omega) \}\, d\omega,
whose Fourier transform along x and y is
\tilde{I}(\xi,\eta,z) \approx I_0\, \delta(\xi,\eta) + \tilde{I}_{\text{real}}(\xi,\eta,z) + \tilde{I}_{\text{imag}}(\xi,\eta,z),
where the contributions to the intensity by the real and imaginary parts of s(x, y) are given by
\tilde{I}_{\text{real}}(\xi,\eta,z) = I_0 \sum_{m,n} \tilde{\alpha}_{mn}\, \delta(\xi - \xi_m,\, \eta - \eta_n)\, \tilde{P}_{\text{real}}(\xi_m,\eta_n,z),
\tilde{I}_{\text{imag}}(\xi,\eta,z) = I_0 \sum_{m,n} \tilde{\beta}_{mn}\, \delta(\xi - \xi_m,\, \eta - \eta_n)\, \tilde{P}_{\text{imag}}(\xi_m,\eta_n,z),
and the propagation transfer functions are given by
\tilde{P}_{\text{real}}(\xi_m,\eta_n,z) = 2 \int \cos[\lambda \pi z (\xi_m^2 + \eta_n^2)]\, \mu(-\lambda z \xi_m, -\lambda z \eta_n, \omega)\, d\omega,
\tilde{P}_{\text{imag}}(\xi_m,\eta_n,z) = 2 \int \sin[\lambda \pi z (\xi_m^2 + \eta_n^2)]\, \mu(-\lambda z \xi_m, -\lambda z \eta_n, \omega)\, d\omega.
The partially coherent intensity patterns can be interpreted as the convolution of the coherent intensity patterns with the shape of the area source, scaled in proportion to the propagation distance [11]. For a circular area source with uniform spectrum, i.e., ρ(a, b, ω) = ρ̂(ω) when a² + b² < r²source and 0 otherwise, we can write μ(−λzξm, −λzηn, ω) = πr²source ρ̂(ω) J1(2r̂)/r̂, where r̂ = πzrsource(ξm² + ηn²)^{1/2} and J1 is the Bessel function of the first kind and order one.
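Under a quasimonochromatic assumption (so the ω integrals above collapse to a single wavelength), the two transfer functions for a uniform circular source can be sketched as follows; all numerical parameters are hypothetical:

```python
import numpy as np
from scipy.special import j1

def partially_coherent_tf(xi, eta, z, lam, r_source):
    """P~_real and P~_imag for a uniform circular source: the coherent
    cos/sin chirps damped by the source factor J1(2 r^)/r^, which tends
    to 1 as r^ -> 0 (point source / in focus)."""
    chirp = np.pi * lam * z * (xi**2 + eta**2)
    r_hat = np.pi * z * r_source * np.hypot(xi, eta)
    safe = np.where(r_hat > 1e-12, r_hat, 1.0)
    mu = np.where(r_hat > 1e-12, j1(2.0 * safe) / safe, 1.0)
    return 2.0 * np.cos(chirp) * mu, 2.0 * np.sin(chirp) * mu

# Point source recovers the coherent CTF; a finite source damps contrast.
pr0, pi0 = partially_coherent_tf(2e5, 0.0, 100e-6, 590e-9, 0.0)
pr1, pi1 = partially_coherent_tf(2e5, 0.0, 100e-6, 590e-9, 0.05)
```

Comparing `pi0` and `pi1` illustrates the loss of high-frequency information away from focus caused by partial coherence, as discussed in the introduction.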

2.4. Noisy imaging model

We now model the relationship between the propagated intensity and measured pixel values. We assume that images are captured via an M × N array of sensor pixels covering the same Wx × Wy extent used in the definition of the specimen, with one image for each of the propagation distances z1, . . ., zL. For the (m, n) pixel in the lth image, we model the noiseless measured value as

\text{pixel}_{lmn}^{(\text{noiseless})} = \mathcal{C}_1 \tau_l \iint \chi(x - x_m,\, y - y_n)\, I(x,y,z_l)\, dx\, dy,
where 𝒞1 is a constant scaling factor, τl is the exposure time for the lth image, χ(x, y) is the pixel’s spatial sensitivity profile normalized to ∬χ(x, y) dx dy = 1, I(x, y, z) is the propagated intensity derived earlier, and (xm, yn) is the pixel center for the (m, n) pixel. This model relies on the assumption that the illumination spectrum is narrowband enough so that wavelength-dependent pixel sensitivity effects do not apply.

For noisy measurements, we assume that: (1) we record ≫ 10 photons per pixel, (2) Poisson shot noise dominates all other noise sources, and (3) the weak-perturbation assumption of I ≈ I0 applies. While more powerful techniques such as deep neural networks can be applied to low photon count scenarios (see for example [26]), in this paper we focus on typical imaging scenarios, where all three assumptions are usually valid. These assumptions allow us to approximate the noise at all the pixels as independent and identically distributed (i.i.d.) zero-mean Gaussian noise with variance proportional to exposure time τl and average intensity I0:

pixellmn(noisy)~𝒩(pixellmn(noiseless),𝒞2τlI0),
where 𝒞2 is another constant scaling factor and I0 is the average intensity. For the special case of a uniform intensity field, this reduces to 𝒩(𝒞1τlI0, 𝒞2τlI0). A centered unitary discrete Fourier transform (DFT) of the pixel values yields
\widetilde{\text{pixel}}_{lmn}^{(\text{noisy})} = \frac{1}{\sqrt{MN}} \sum_{\hat{m}=1}^{M} \sum_{\hat{n}=1}^{N} \text{pixel}_{l\hat{m}\hat{n}}^{(\text{noisy})} \exp\{ -i 2\pi [ (x_{\hat{m}} - x_{\bar{m}}) \xi_m + (y_{\hat{n}} - y_{\bar{n}}) \eta_n ] \}
\sim \mathcal{N}\big( \mathcal{C}_1 \tau_l \hat{I}(\xi_m,\eta_n,z_l),\ \mathcal{C}_2 \tau_l I_0 \big),
where Î(ξ, η, z) = χ̃(ξ, η) [Ĩ(ξ, η, z) ⊗ WxWy sinc(Wxξ) sinc(Wyη)] describes the Fourier transform of the intensity after cropping by the sensor area and applying the pixel sensitivity function χ(x, y), whose Fourier transform is χ̃(ξ, η); we define sinc(a) = sin(πa)/(πa). Given the derivations in the previous section, Î(ξm, ηn, zl) = I0 when ξm = ηn = 0; for other frequency coordinates,
I^(ξm,ηn,zl)=I0χ˜(ξm,ηn)[P˜real(ξm,ηn,zl)α˜mn+P˜imag(ξm,ηn,zl)β˜mn].
Let us define the normalized measurements yl as a vector of length MN − 1 whose entries correspond to pixel˜lmn(noisy)/(𝒞1τlI0) for all (m, n) ≠ (m̄, n̄). Let us also define nl as a random vector of length MN − 1 whose entries are i.i.d. Gaussian random variables with zero mean and variance equal to σl² = 𝒞2/(𝒞1²τlI0). For notational convenience, we define y and n to be the vertical concatenation of y1, . . ., yL and n1, . . ., nL, respectively, and we define measurement specification M as the collection of exposure times τ1, . . ., τL. We can then express the forward operator pM as the sum of a deterministic linear function acting on specimen vector s and a stochastic noise component n, i.e.,
y=pM(s)=Ps+n,
with the linear relationship defined by the L(MN − 1) × 2(MN − 1) sparse matrix P. Let us index the entries of y and thus the rows of P by the triplet lmn, where 1 ≤ l ≤ L, and (m, n) takes on values between (1, 1) and (M, N) with the exclusion of (m̄, n̄). Similarly, let us index the entries of s and thus the columns of P by the triplet pqr, where p and q are spectral coordinates corresponding to m and n, and r ∈ {1, 2} determines whether we are considering the real or imaginary parts of s̃pq, i.e., spqr = α̃pq if r = 1 and spqr = β̃pq if r = 2. Given these index definitions, the (lmn, pqr) entry of matrix P is given by
P_{(lmn,\,pqr)} = \begin{cases} p_{lmn1} = \tilde{\chi}(\xi_m,\eta_n)\, \tilde{P}_{\text{real}}(\xi_m,\eta_n,z_l) & \text{if } m=p,\ n=q,\ r=1 \\ p_{lmn2} = \tilde{\chi}(\xi_m,\eta_n)\, \tilde{P}_{\text{imag}}(\xi_m,\eta_n,z_l) & \text{if } m=p,\ n=q,\ r=2 \\ 0 & \text{otherwise.} \end{cases}
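To make the structure of pM concrete, the sketch below simulates the normalized measurement y = Ps + n for a single (m, n) frequency across L = 3 planes; the block entries and calibration constants are illustrative stand-ins, not values from the paper:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical (m, n) block of P for L = 3 planes: row l holds
# (chi~ P~_real, chi~ P~_imag) evaluated at defocus z_l.
Pmn = np.array([[2.0, 0.0],    # in focus: no phase contrast
                [1.6, 1.1],    # moderate defocus
                [0.9, 1.7]])   # larger defocus
s_mn = np.array([0.01, 0.05])  # (alpha~_mn, beta~_mn) of a weak object

taus = np.array([0.1, 0.2, 0.2])             # exposure times tau_l
C1, C2, I0 = 1.0, 0.5, 1e4                   # calibration constants
sigmas = np.sqrt(C2 / (C1**2 * taus * I0))   # sigma_l of the noise n_l

y = Pmn @ s_mn + sigmas * rng.normal(size=3)  # y = P s + n
```

Note how halving an exposure time raises that plane's noise level by √2, which is the coupling between measurement design and reconstruction quality that the next section optimizes.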

3. Formulation as a convex optimization problem

The sparsity structure of P suggests a linear reconstruction operator of the form rR(y) = Ry, where R is a 2(MN − 1) × L(MN − 1) matrix with similar sparsity structure parameterized by 2L(MN − 1) coefficients:

R_{(pqr,\,lmn)} = \begin{cases} x_{mnrl} & \text{if } m=p,\ n=q \\ 0 & \text{otherwise.} \end{cases}
Let xl be a complex-valued vector of length 2MN − 2 whose entries consist of the xmn1ls, which parameterize the reconstruction coefficients for recovering the α̃mn components of s from the lth image, followed by the xmn2ls for likewise recovering the β̃mn components; let x be the vertical concatenation of the x1, . . ., xL. Therefore, we can say that x is a parameterization for the reconstruction design R.

As stated previously, the measurement design M is a list of exposure times τ1, . . ., τL ≥ 0 for each candidate imaging plane, with imaging skipped when the exposure time is zero. To make Eq. (1) meaningful, we must impose some cost for the exposure time, or else we can unrealistically aim for infinite exposure time in order to reduce noise. We propose a simple light budget scheme constraining total exposure time to be less than some user-defined constraint τ0.

With these definitions of M and R as well as the stochastic forward model function pM defined in the previous subsection, we can rewrite Eq. (1) as a constrained convex problem that attempts to minimize the expectation of reconstruction error across all possible realizations of specimen s in specimen class 𝒮0 and noise n (which we assume to be uncorrelated):

\min_{x,\, \tau_1,\ldots,\tau_L}\ \mathbb{E}_{s \in \mathcal{S}_0,\, n}\big[ \| s - R(Ps + n) \|_2^2 \big] \quad \text{subject to} \quad \textstyle\sum_l \tau_l \le \tau_0.
The objective function can be expanded into
\mathbb{E}_{s \in \mathcal{S}_0,\, n}\big[ \| s - R(Ps + n) \|_2^2 \big] = \mathbb{E}_{s \in \mathcal{S}_0,\, n}\Big\{ \sum_{m,n} \Big\| R_{mn} \Big[ P_{mn} \begin{pmatrix} \tilde{\alpha}_{mn} \\ \tilde{\beta}_{mn} \end{pmatrix} + n_{mn} \Big] - \begin{pmatrix} \tilde{\alpha}_{mn} \\ \tilde{\beta}_{mn} \end{pmatrix} \Big\|_2^2 \Big\},
where nmn is an L-vector containing the relevant components of n, and Rmn and Pmn are nonoverlapping blocks of R and P:
R_{mn} = \begin{pmatrix} x_{mn11} & x_{mn12} & \cdots & x_{mn1L} \\ x_{mn21} & x_{mn22} & \cdots & x_{mn2L} \end{pmatrix}, \qquad P_{mn} = \begin{pmatrix} p_{1mn1} & p_{1mn2} \\ \vdots & \vdots \\ p_{Lmn1} & p_{Lmn2} \end{pmatrix}.
Defining matrix Bmn via factoring the specimen’s spectral (raw) covariance matrix Cmn,
B_{mn} B_{mn}^H = C_{mn} = \begin{bmatrix} \mathbb{E}_{s \in \mathcal{S}_0}(|\tilde{\alpha}_{mn}|^2) & \mathbb{E}_{s \in \mathcal{S}_0}(\tilde{\alpha}_{mn} \tilde{\beta}_{mn}^*) \\ \mathbb{E}_{s \in \mathcal{S}_0}(\tilde{\beta}_{mn} \tilde{\alpha}_{mn}^*) & \mathbb{E}_{s \in \mathcal{S}_0}(|\tilde{\beta}_{mn}|^2) \end{bmatrix}, \qquad B_{mn} = \begin{pmatrix} b_{mn1} & b_{mn3} \\ b_{mn2} & b_{mn4} \end{pmatrix},
we can remove the explicit expectation from the objective function to obtain
\sum_{m,n} \big\| (R_{mn} P_{mn} - I)\, B_{mn} \big\|_2^2 + \sum_{m,n} \mathrm{tr}\big[ R_{mn}^H R_{mn}\, \mathbb{E}_n( n_{mn} n_{mn}^H ) \big].
We can further reshuffle Eq. (33) into a more familiar form, i.e.,
\min_{x,\, \tau_1,\ldots,\tau_L}\ \Big\| b - \sum_{l=1}^{L} A_l x_l \Big\|_2^2 + \sum_{l=1}^{L} \sigma_l^2 \| x_l \|_2^2 \quad \text{subject to} \quad \textstyle\sum_l \tau_l \le \tau_0,
where b is a vector of length 4(MN − 1) consisting of the bmn1s followed by the bmn2s, bmn3s and bmn4s, and Al is a block diagonal matrix whose mnth block Almn is given by
A_{lmn} = \begin{bmatrix} (p_{lmn1} b_{mn1} + p_{lmn2} b_{mn2})\, I & (p_{lmn1} b_{mn3} + p_{lmn2} b_{mn4})\, I \end{bmatrix}.
For completeness, we define σl²‖xl‖₂² = 0 if both xl = 0 and τl = 0 to avoid ambiguity. If the exposure times in Eq. (34) were fixed, then the σls would become constants, making it the standard regularized least squares problem obtained in the derivation of the Wiener filter; such an optimization problem could easily be solved using standard methods such as conjugate gradients. However, since we are jointly optimizing both reconstruction and measurement design, our solution is no longer standard Wiener filtering, and we need to rewrite Eq. (34) to solve it.

Note that only the second term in the objective function of Eq. (34) depends on the τls; therefore, we can reduce Eq. (34) to an unconstrained optimization over only the xls if we can derive a simplified expression for the minimal value of l=1Lσl2xl22 given the xls, i.e., solving

\min_{\tau_1,\ldots,\tau_L}\ \sum_{l=1}^{L} \sigma_l^2 \| x_l \|_2^2 \quad \text{subject to} \quad \textstyle\sum_l \tau_l \le \tau_0.
The above subproblem has a closed-form solution (see Appendix A) in which τl is proportional to ‖xl‖₂ for 1 ≤ l ≤ L. We note that this exposure allocation recipe applies to any spectral mixing reconstruction method where noise energy is equally distributed in the frequency domain and images are dominated by Poisson noise with more than ∼10 photons per pixel. The closed-form solution for Eq. (36) allows us to rewrite Eq. (34) as an unconstrained convex problem:
\min_{x_1,\ldots,x_L}\ \Big\| b - \sum_{l=1}^{L} A_l x_l \Big\|_2^2 + \gamma \Big( \sum_{l=1}^{L} \| x_l \|_2 \Big)^2,
where γ = 𝒞2/(𝒞1²I0τ0) is a constant that represents the relative noise level in the system; it is obtained through calibration and is not a user-selectable regularization parameter. This objective function has a smooth least-squares term as well as a nonsmooth regularization term that weakly promotes sparsity in the xls; it can be solved using proximal gradient methods [27–30], which were designed for minimizing sums of smooth and nonsmooth convex functions. The appropriate proximal operator for the regularization term is derived in Appendix C.
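As a toy illustration of this problem, the sketch below minimizes a two-plane instance of the objective with a generic derivative-free solver standing in for the proximal gradient methods cited above; the matrices, vector b, and γ are hypothetical:

```python
import numpy as np
from scipy.optimize import minimize

def smart_objective(x_flat, A_list, b, gamma):
    """||b - sum_l A_l x_l||_2^2 + gamma * (sum_l ||x_l||_2)^2."""
    xs = np.split(x_flat, len(A_list))
    resid = b - sum(A @ x for A, x in zip(A_list, xs))
    return resid @ resid + gamma * sum(np.linalg.norm(x) for x in xs) ** 2

# Toy instance: two candidate planes, each sensing one coordinate well.
A_list = [np.array([[2.0, 0.0], [0.0, 0.1]]),
          np.array([[0.1, 0.0], [0.0, 2.0]])]
b = np.array([1.0, 1.0])
res = minimize(smart_objective, np.zeros(4), args=(A_list, b, 0.05),
               method="Powell")
x1, x2 = np.split(res.x, 2)
```

In this contrived instance, each plane ends up carrying weight mainly for the coordinate it senses strongly, which is the kind of division of labor the joint design discovers automatically.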

Once the optimal xls have been found, creating the reconstruction function rR(·) is straightforward. The optimal τls can be obtained by setting τl = τ0‖xl‖₂/(∑k‖xk‖₂). The end-to-end spectral imaging performance of the measurement–reconstruction pair can be analyzed by observing the entries of the 2 × 2 matrix Hmn = RmnPmn. The diagonal terms give essentially the discretized transfer function for α(x, y) and β(x, y), whereas the cross-terms denote whether information about α bleeds into β and vice versa.
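The resulting exposure-allocation recipe is straightforward to implement; the reconstruction-weight blocks below are hypothetical:

```python
import numpy as np

def allocate_exposures(x_blocks, tau0):
    """Split the light budget tau0 as tau_l = tau0 ||x_l||_2 / sum_k ||x_k||_2.
    Planes whose reconstruction weights are all zero get no exposure and
    are skipped entirely (assumes at least one nonzero block)."""
    norms = np.array([np.linalg.norm(x) for x in x_blocks])
    return tau0 * norms / norms.sum()

x_blocks = [np.array([3.0, 4.0]),    # ||x_1||_2 = 5
            np.array([0.0, 0.0]),    # plane never used by r_R
            np.array([5.0, 12.0])]   # ||x_3||_2 = 13
tau = allocate_exposures(x_blocks, tau0=1.0)
```

For these weights the noise term ∑l ‖xl‖²/τl (dropping constant factors) equals (∑l ‖xl‖₂)² = 324 under this split, versus 582 for a uniform split of the same budget, illustrating why the allocation is worth doing.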

Solving Eq. (37) technically yields a measurement design M and reconstruction design R for estimating the functions α(x, y) and β(x, y) in the general case. For the remainder of this section, we will discuss how SMaRT can be applied specifically to quantitative phase imaging as well as how Eq. (37) can be solved efficiently in practice.

3.1. Estimating the covariance for phase imaging

Solutions to Eq. (37) depend on prior knowledge regarding the set 𝒮0 of specimens of interest; the Al matrices and the vector b in Eq. (37) are functions of the Cmns, the (prior) spectral covariance matrices defined in Eq. (32). We propose a generic two-parameter prior for phase imaging that works well in our experiments, although more apt Cmns could be derived for specific applications.

We start with the assumption that our specimen is a piecewise-constant phase specimen ϕ(x, y) where each piece has phase uniformly distributed between −ϕmax/2 and ϕmax/2. The corresponding α(x, y) and β(x, y) functions are given by

α(x,y)=ζcos[ϕ(x,y)]1,andβ(x,y)=ζsin[ϕ(x,y)],
where 1/ζ = 2 sin(ϕmax/2)/ϕmax is a normalization term to ensure that α(x, y) is zero-mean.

Furthermore, we assume that the probability of any two points belonging to the same “piece” is equal to exp(−Δxy/Δ), where Δxy is the distance between the two points and Δ is a parameter, which we will define as the average feature size. A convex tessellation of the plane in which the number of discontinuities along any line segment is Poisson-distributed would induce this type of exponential falloff, but here we make no additional assumptions about the exact form of ϕ(x, y). Using results from Appendix B, we find that the spectral covariance can be approximated by

C_{mn} = 2\pi \Delta^2\, \big[ 1 + 4\pi^2 \Delta^2 (\xi_m^2 + \eta_n^2) \big]^{-3/2} \begin{pmatrix} \zeta^2/2 + \zeta/2 - 1 & 0 \\ 0 & \zeta^2/2 - \zeta/2 \end{pmatrix}.

Although most specimens are not piecewise-constant, naturally occurring specimens often have large, nearly constant regions. The Lorentzian-like falloff of the cross spectral density is also reminiscent of natural image priors with power law falloffs [31–33], which can be stochastically approximated using a piecewise-constant “dead leaves” model [34].

We note that in practice, Cmn mismatch results in a noisier phase image, but the reconstructed images are still predictable, as we can examine rR (pM(·)) for the transfer function of the imaging system to interpret the results in context. This is in contrast to a regularized reconstruction approach using total variation (TV) [17] or nonlinear diffusion [16], which forces reconstructed phase images to approximate piecewise-constant functions in order to minimize the regularization term. With our approach, we simply use prior statistics as a way to estimate the approximate signal-to-noise ratio at each spectral band in order to allocate measurement resources; our approach does not force compatibility with the prior, and thus our results are also less sensitive to the design of the prior.
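The two-parameter prior is easy to evaluate directly; in the sketch below, the feature size Δ and phase range ϕmax are hypothetical:

```python
import numpy as np

def prior_covariance(xi, eta, delta, phi_max):
    """Spectral covariance C_mn of the piecewise-constant phase prior:
    a Lorentzian-like radial falloff with average feature size delta,
    times a diagonal split between the alpha and beta variances."""
    zeta = phi_max / (2.0 * np.sin(phi_max / 2.0))
    falloff = 2.0 * np.pi * delta**2 * (
        1.0 + 4.0 * np.pi**2 * delta**2 * (xi**2 + eta**2)) ** (-1.5)
    return falloff * np.diag([zeta**2 / 2 + zeta / 2 - 1.0,
                              zeta**2 / 2 - zeta / 2])

C_low = prior_covariance(0.0, 0.0, delta=2e-6, phi_max=1.0)
C_high = prior_covariance(5e5, 0.0, delta=2e-6, phi_max=1.0)
```

The decay from `C_low` to `C_high` is what tells the optimizer that high spatial frequencies carry less expected signal energy, and hence deserve fewer measurement resources.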

3.2. Accommodating tilt in the illumination

For two optical imaging systems whose illumination is identical except for a constant tilt, the optimal measurement–reconstruction pair for one can be adapted for the other. Suppose the two imaging systems’ illumination density functions are related by

ρ2(a, b, ω) = ρ1(a − a0, b − b0, ω).
Then, the reconstruction operator for the second system is the same as multiplying the input images’ Fourier transform by exp[i2πzl(ξma0 + ηnb0)] and then applying the reconstruction operator for the first system. This is because the tilt induces a linear phase term in P, which can be neutralized by an equal and opposite linear phase term in R. Therefore, in practice, SMaRT allows for a slight tilt in the system, which can be removed using calibration.
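A minimal sketch of this tilt compensation, assuming precomputed frequency grids (the grids, defocus, and tilt values are hypothetical):

```python
import numpy as np

def compensate_tilt(img_ft, z, xi, eta, a0, b0):
    """Undo the linear phase a constant tilt (a0, b0) imprints on a
    defocused image's spectrum, so the untilted reconstruction operator
    can be applied unchanged."""
    return img_ft * np.exp(1j * 2.0 * np.pi * z * (xi * a0 + eta * b0))

xi, eta = np.meshgrid(np.arange(-4, 4), np.arange(-4, 4), indexing="ij")
spec = np.ones((8, 8), dtype=complex)
# Opposite tilts cancel: compensating twice with (a0, 0) then (-a0, 0)
# returns the original spectrum.
out = compensate_tilt(compensate_tilt(spec, 1e-4, xi, eta, 2.0, 0.0),
                      1e-4, xi, eta, -2.0, 0.0)
```

Because the correction is a pure phase ramp, it costs one elementwise multiply per captured image and can absorb small calibration-measured tilts.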

3.3. Exploiting symmetry

Directly solving Eq. (37) can present numerical challenges, since the number of unknowns is on the order of 𝒪(LMN), leading to slow convergence and high storage costs. One saving grace is that the solution x1, . . ., xL is usually highly redundant. The optical layout of a microscope is often circularly symmetric about the imaging axis, and the spectral covariance of the specimen is circularly symmetric as well, leading to nearly circularly symmetric solutions. Therefore, we propose a modification to Eq. (37) to exploit these symmetries.

Let T ∈ ℝ(MN−1)×K be an interpolation operator that takes K coefficients and linearly interpolates them into an M × N image, while disregarding the center pixel. In other words, we have a real-valued interpolation map vector t of length MN − 1 whose mnth entry tmn satisfies 1 ≤ tmn ≤ K. We then define the (mn, k)th entry of T to be

T_{(mn,\,k)} = \begin{cases} 1 + \lfloor t_{mn} \rfloor - t_{mn} & \text{if } k = \lfloor t_{mn} \rfloor \\ t_{mn} - \lfloor t_{mn} \rfloor & \text{if } k = \lfloor t_{mn} \rfloor + 1 \\ 0 & \text{otherwise.} \end{cases}
Let matrices I2 and I4 be 2 × 2 and 4 × 4 identity matrices, respectively, and let Q be an equalization operator for T so that TQ has unit singular values, i.e., QᵀTᵀTQ = I; we can easily construct Q from the singular value decomposition of the tridiagonal matrix TᵀT. With these definitions, we propose an efficient approximation to Eq. (37) by replacing b and the Als with b̂ and the Âls given by
\hat{b} = \big[ I_4 \otimes (TQ)^{\mathsf{T}} \big]\, b, \qquad \hat{A}_l = (I_4 \otimes Q^{\mathsf{T}})\, \big[ (I_4 \otimes T^{\mathsf{T}})\, A_l\, (I_2 \otimes T) \big]\, (I_2 \otimes Q),
where ⊗ denotes the Kronecker (tensor) product. We can then solve the resulting problem for an optimal x̂ and compute xapprox = [I2 ⊗ (TQ)]x̂; the regularization term in the convex problem remains unchanged because QᵀTᵀTQ = I.

We note that the bracketed expression in Âl’s definition is sparse and thus can be compactly stored in memory, and only one copy of Q needs to be stored, regardless of the size of L. If T is an operator that interpolates K coefficients into a radially symmetric image, then as long as the illumination pattern ρ(a, b, ω) is radially symmetric about some point (a0, b0), the only approximation error incurred is with the pixel function χ(x, y), which is usually not circularly symmetric. However, in practice this error can be ignored, as the pixel function is usually a rough approximation as well.
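The operators T and Q can be constructed directly; the radius-map values below are hypothetical, and Q is built from an eigendecomposition of TᵀT (equivalent to the SVD route mentioned above):

```python
import numpy as np

def build_T(t, K):
    """Row mn of T linearly interpolates between coefficient indexes
    floor(t_mn) and floor(t_mn) + 1 (1-based), per the definition above."""
    T = np.zeros((len(t), K))
    for row, val in enumerate(t):
        k = int(np.floor(val))
        w = val - k                    # fractional part of t_mn
        T[row, k - 1] = 1.0 - w
        if w > 0.0 and k < K:
            T[row, k] = w
    return T

def build_Q(T):
    """Equalizer Q with Q^T (T^T T) Q = I, built from the
    eigendecomposition of the tridiagonal Gram matrix T^T T."""
    w, V = np.linalg.eigh(T.T @ T)
    return V @ np.diag(1.0 / np.sqrt(w)) @ V.T

t = np.array([1.0, 1.5, 2.0, 2.5, 3.0, 1.25])  # hypothetical radius map
T = build_T(t, K=3)
Q = build_Q(T)
```

Each row of T sums to one (the two interpolation weights), and TQ has orthonormal columns, which is exactly the property that leaves the regularization term of the convex problem unchanged.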

3.4. Pruning

While the regularizer in Eq. (37) is a mild sparsity promoter, solutions for large L usually still yield many nonzero τls, some of them very small. This can create two problems in practical applications. First, having too many positions introduces a lot of mechanical motion into the experimental apparatus, which may be undesirable as it can disturb the specimen, gradually cause the apparatus to go out of alignment, and increase the amount of time needed to capture the images, since the image sensor has to wait for any residual motion to subside before exposing. Second, very short exposure times easily violate the assumption of Gaussian noise with variance proportional to average intensity, because quantization noise and readout noise start dominating the desired signal, neither of which scale with average intensity.

We propose a pruning method that, while not yielding a globally optimal solution, still results in a practical measurement–reconstruction design pair. We start by solving Eq. (37) for the full set of possible positions (or terminating after a fixed number of iterations) and then iteratively prune the positions. In each iteration, we remove a small set of positions, namely those with the lowest (or zero) exposure times, and then recompute Eq. (37) with the reduced set of positions. This is repeated until either: (1) the number of positions falls below a certain threshold, (2) the shortest exposure time rises above a certain threshold, or (3) the computed expected error exceeds some relative threshold compared to the full position set solution. Since the initial number of unknowns can be large, pruning improves results in practice by speeding up convergence, allowing better results within a fixed number of iterations. A similar approach has been applied successfully in the deep learning literature [35, 36]; we can also view pruning as a variant of orthogonal matching pursuit [37], wherein we remove basis vectors before recomputing coefficients instead of adding them.
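The pruning loop above can be sketched as follows; here `solve`, the thresholds, and the drop fraction are hypothetical stand-ins for re-solving Eq. (37) and for the three stopping criteria:

```python
import numpy as np

def prune_positions(solve, positions, min_count=7, min_exposure=1e-3,
                    err_tol=1.05, drop_frac=0.1, max_rounds=50):
    """Iterative pruning of measurement positions (a sketch of Sec. 3.4).

    `solve` stands in for re-solving Eq. (37) on a subset of positions and
    must return (exposure_times, expected_error) for that subset.
    """
    exposures, err_full = solve(positions)
    for _ in range(max_rounds):
        # Stopping criteria (1) and (2): few enough positions, or every
        # remaining exposure is long enough.
        if len(positions) <= min_count or exposures.min() >= min_exposure:
            break
        # Remove a small set of positions with the lowest exposure times.
        n_drop = max(1, int(drop_frac * len(positions)))
        keep = sorted(np.argsort(exposures)[n_drop:])
        candidate = [positions[i] for i in keep]
        cand_exposures, cand_err = solve(candidate)
        if cand_err > err_tol * err_full:
            break  # criterion (3): expected error degraded too much
        positions, exposures = candidate, cand_exposures
    return positions, exposures
```

A toy `solve` that returns normalized exposures and a constant error is enough to exercise the loop; with a real solver the candidate error would grow as informative positions disappear.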

4. Imaging a phase target

We now apply our SMaRT approach to image a standard phase resolution target. For illumination, we use a Newport M-20X objective to focus light from a Thorlabs LED4C1 installed with the amber LED module (nominal wavelength 590 nm) onto a 25 μm pinhole placed at the front focal plane of a 50 mm focal-length plano-convex lens. The collimated light illuminates the specimen, which is then imaged using a Mitsutoyo 10X Plan Apo objective (NA = 0.28) and a 100 mm plano-convex tube lens for a total magnification of 5 × at the intermediate image plane. Both plano-convex lenses have diameters of 25 mm. Images of the propagated intensity were captured using a UI-2240-M camera with pixel pitch 4.65 μm placed on a Newport XMS50 linear stage, with tilt correction followed by cropping to produce 1001 × 1001 pixel images.

To calibrate the camera’s noise level, we took uniform field images at various exposure levels; we assumed 𝒞1 = 1 and estimated 𝒞2 ≈ 1.33 × 10−4. Using a one-second exposure of the target, we obtained an estimate of I0 ≈ 0.113. We set τ0 = 3 s and L = 385 for uniformly spaced z positions 250 μm apart between ±24 mm. For our phase imaging prior, we set the average feature size Δ = 59.7 μm and ϕmax = 0.3 rad. We also pruned down to 7 images on either side of z = 0. Solving Eq. (37) using the proximal gradient algorithm given in [30], with the symmetry modifications as well as pruning, yielded the measurement design shown in Fig. 2; computation was via MATLAB, taking ∼23 min on a quad-core Intel Xeon W3520 CPU. The corresponding transfer functions for the real and imaginary parts of the transmission profile, shown in Fig. 3, were computed from the forward model and the optimal reconstruction design.

 figure: Fig. 2

Fig. 2 Positions and exposure times computed using SMaRT are shown in dark green, with those for GP-TIE shown in tan. The total exposure time for the two methods is the same.

Download Full Size | PDF

 figure: Fig. 3

Fig. 3 Estimated transfer functions for the real (α, left) and imaginary (β, right) components of the specimen s(x, y) are shown in green. The dashed tan line indicates the effective transfer function for comparison images generated using the GP-TIE and TIE methods. A normalized frequency of 1.0 corresponds to the maximum frequency allowed by the numerical aperture and magnification of the system.

Download Full Size | PDF

Although the LED emits light in a narrow range of wavelengths, its polychromaticity is still an important factor; to demonstrate this, we also compute a measurement–reconstruction pair (which we will call SMaRT-qm) under the assumption that the illumination is quasimonochromatic and well approximated by a single wavelength. Furthermore, to facilitate comparison with Gaussian process regression transport of intensity equation phase imaging (GP-TIE) [8], a comparable state-of-the-art method, we capture an additional set of measurements: one image at z = 0 plus 14 images at positions exponentially spaced between z = ±250 μm and z = ±24 mm, with exposures of 0.2 s each; this recipe is also shown in Fig. 2. For comparison with standard TIE, we took 1 s exposures at z = 0, ±300 μm (TIE short) as well as z = 0, ±10 mm (TIE long). We used code provided by [8] and adjusted the Tikhonov parameter to match the low frequency cutoff of our method, as shown in Fig. 3.

According to CTF theory, the closest propagation distance offering the highest contrast for a particular spatial frequency is inversely proportional to the square of that frequency, and both the SMaRT and GP-TIE approaches use different propagation distances to capture different sets of spatial frequencies. However, as Fig. 2 shows, SMaRT’s distribution of positions is clearly trimodal, in contrast to the first-principles exponential placement of positions suggested by the GP-TIE approach. It appears that SMaRT chose not to incorporate intermediate positions because they are not as cost efficient as splitting exposure time between a set of positions close to z = 0 and a set of positions far from z = 0 on either side. This seems surprising at first but can be explained as follows. Since contrast transfer functions are nearly periodic in z, falling off only as a function of partial coherence, the intensity pattern at a position far from z = 0 contains not only low frequency components but also partial information about intermediate frequency components. Therefore, we can forgo positions designed specifically for intermediate frequency components and allocate more time to the other positions, reducing noise levels for the remaining images.
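This explanation is easy to probe numerically. A minimal sketch, using the idealized coherent phase transfer ∝ sin(πλz f²) with the article's λ = 590 nm and defocus values, and ignoring the partial-coherence envelope and source spectrum, shows that a far plane carries intermediate frequency bands that a near plane barely transfers:

```python
import numpy as np

lam = 590e-9                          # nominal wavelength (m)
f = np.linspace(0.0, 1.2e5, 4801)     # spatial frequency (cycles/m) at the image plane

def phase_ctf(z):
    # Idealized coherent phase contrast transfer: sin(pi * lambda * z * f^2).
    return np.sin(np.pi * lam * z * f**2)

near = np.abs(phase_ctf(300e-6))      # TIE-short-like defocus
far = np.abs(phase_ctf(24e-3))        # outermost SMaRT / GP-TIE defocus

# At large z the CTF oscillates rapidly in f, so the far plane reaches
# near-unit contrast somewhere inside an intermediate band that the near
# plane transfers only weakly.
mid_band = (f > 1e4) & (f < 2.5e4)
assert far[mid_band].max() > 0.99
assert near[mid_band].max() < 0.5
```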

To verify our approach, we applied the computed SMaRT measurement–reconstruction pair as well as the GP-TIE and TIE methods to both simulated and experimental data of a phase-only Siemens star spoke pattern.

4.1. Simulation

For ground truth, we computed a 40-spoke Siemens star phase object with a diameter of 2092.5 μm and a height of 0.3 rad at the intermediate image plane, using an effective sampling interval of 4.65/9 μm. We filtered the result to remove all frequency components that cannot be captured given the NA and magnification of the system, then downsampled to a sampling interval equal to the pixel pitch (4.65 μm). Simulated noisy images of size 1001 × 1001 were used to reconstruct phase images using the SMaRT, SMaRT-qm, GP-TIE and TIE approaches. The center 501 × 501 pixel region containing the star is shown in Fig. 4, and two smaller regions of the object are shown magnified in Fig. 5 and Fig. 6 for the SMaRT, SMaRT-qm and GP-TIE approaches. Cross sections of the reconstruction are shown in Fig. 7 (we omit SMaRT-qm for clarity), with the resulting error for the SMaRT and GP-TIE approaches shown in Fig. 8.

 figure: Fig. 4

Fig. 4 Ground truth and reconstructed phase images of a simulated Siemens star target; root-mean-square error in radians is given in brackets for each reconstruction. Crops “A” and “B” are shown magnified in Fig. 5 and Fig. 6, respectively. Cross sections of the reconstructed phase and reconstruction error along the dashed line “C” are visualized in Fig. 7 and Fig. 8, respectively. The scale bar indicates 500 μm on the intermediate image plane and 100 μm in the specimen.

Download Full Size | PDF

 figure: Fig. 5

Fig. 5 Magnified crops of the region indicated by “A” in Fig. 4.

Download Full Size | PDF

 figure: Fig. 6

Fig. 6 Magnified crops of the region indicated by “B” in Fig. 4. The center in the ground truth is a flat circle because of the diffraction limit.

Download Full Size | PDF

 figure: Fig. 7

Fig. 7 Reconstructed phase of simulated Siemens star along the dashed line “C” in Fig. 4. The ground truth is shown by the gray gradient.

Download Full Size | PDF

 figure: Fig. 8

Fig. 8 Reconstruction error for simulated Siemens star along the dashed line “C” in Fig. 4.

Download Full Size | PDF

Reconstructed images in Fig. 4 as well as cross sections in Fig. 7 show that standard TIE results in either sharp yet cloudy images (TIE short) or blurry images with good low frequency performance (TIE long). Both SMaRT and GP-TIE leverage diversity to keep both high frequency and low frequency components, yielding significantly better reconstructions. However, the magnified crops in Fig. 5 and Fig. 6 show that SMaRT is more efficient at combating noise, resulting in cleaner spokes in the center of the target as well as flatter phase in uniform regions. Similarly, the rms error values in Fig. 4 and the cross section of the error in Fig. 8 support this observation. Lastly, we see in Fig. 5 and Fig. 6 the importance of considering the polychromaticity of the source, as the SMaRT-qm results—which assume monochromatic illumination—are markedly blurrier than the SMaRT results.

4.2. Experiment

Using the computed measurement design, we captured images of a Benchmark Technologies quantitative phase resolution target, which has a refractive index of 1.52 and various binary phase patterns. We imaged the 50 nm height Siemens star, corresponding to a phase height of ∼0.277 rad. Tilt in the illumination was estimated by applying the computed reconstruction function rR(·) to the ±24 mm images individually, finding the centroids of the same small point feature in both images, and then estimating the slope of the tilt. Images of size 1280 × 1024 from the camera were then adjusted for tilt and cropped to 1001 × 1001 before reconstruction using the SMaRT, GP-TIE or TIE methods. Reconstructed images, close-ups and cross sections of the phase are shown in Fig. 9, Fig. 10 and Fig. 11, respectively.

 figure: Fig. 9

Fig. 9 Reconstructed phase images of actual Siemens star. Crops “A” and “B” are shown magnified in Fig. 10. Cross sections of the reconstructed phase along the dashed line “C” are visualized in Fig. 11. The scale bar indicates 500 μm on the intermediate image plane and 100 μm in the specimen.

Download Full Size | PDF

 figure: Fig. 10

Fig. 10 Crops of reconstructed Siemens star for regions indicated by “A” and “B” in Fig. 9.

Download Full Size | PDF

 figure: Fig. 11

Fig. 11 Reconstructed phase for the actual Siemens star along the dashed line “C” in Fig. 9.

Download Full Size | PDF

The computed phase from experimental data exhibits behavior similar to the results from the simulated data—both SMaRT and GP-TIE easily outperform single distance TIE reconstructions, but SMaRT has a slight edge in noise performance compared to GP-TIE. The GP-TIE reconstruction appears to be more sensitive to low frequency noise, as seen in Fig. 9. Furthermore, Fig. 10 shows SMaRT reconstructing the flat areas of the phase target with less noise than GP-TIE; the high frequency center area of the target is also cleaner in the SMaRT result.

5. Discussion

We have demonstrated that a SMaRT joint optimization approach can be used to reduce reconstruction error in self-interferometric phase imaging by designing the measurement process jointly with the reconstruction process. Our approach leverages additional information such as the estimated spatial spectral distribution of the specimen class being imaged as well as the spectrum of the polychromatic illumination source to compute an informed decision on how to allocate imaging exposure times, resulting in better noise performance under a fixed photon budget compared to previous techniques.

While we employed a piecewise-constant phase model as a prior, this prior serves only as a guide for estimating the spectral signal-to-noise ratio and does not force the reconstructed phase images to appear piecewise-constant, unlike many other nonlinear methods. Since we do retrieve the real and imaginary parts of the specimen’s transmission profile, our method can be readily adapted to image other weakly-scattering specimens with a different prior, for example replacing standard CTF methods in electron microscopy or X-ray phase imaging, where there may be spatially-varying absorption in addition to optical thickness variations. One can also potentially build a collection of spectral covariance matrices suitable for various imaging scenarios, with more accurate priors when more information is known about the specimen to be imaged.

Although we used a translation stage to image different axial positions, movement can be minimized via judicious use of liquid lenses. Furthermore, while we assumed for simplicity that the objective–tube lens system is diffraction limited, it is trivial to incorporate information about shift-invariant aberrations such as spherical aberration into the forward operator. In general, the resolution of phase images generated by our method is primarily limited by the numerical aperture of the objective lens and any unaccounted optical aberrations in the system.

One limitation is that our method cannot be directly applied to arbitrary phase specimens, as large phase differences violate the small-perturbation assumption that we use in estimating noise and modeling the propagated intensity. Automatic differentiation machinery commonly used for deep neural networks, as well as suitable application of transmission cross coefficients from the nonlinear electron microscopy literature, can potentially extend our approach to strong phase objects in the future. Furthermore, we could also directly leverage deep neural networks, converting our model-driven framework into a data-driven framework and solving Eq. (1) using stochastic gradient descent methods.

Appendix A. Optimal exposure allocation subproblem

We give a solution for problems of the form

$$\underset{v_1, \ldots, v_K}{\text{minimize}} \quad \sum_{k=1}^{K} \gamma_k / v_k \qquad \text{subject to} \quad \sum_{k=1}^{K} v_k \le v_0,$$
where the γk's are all positive. First, we note that the objective function and constraints are all convex. Furthermore, any solution to the problem must satisfy the Karush–Kuhn–Tucker conditions [38, 39]. The stationarity condition requires that the optimal vk = √(γk/u) for some constant u ≥ 0. Here, u cannot be zero, because that would make the vk's infinite and violate primal feasibility. Thus, complementary slackness requires ∑k vk = v0. Therefore, the optimal vk must be equal to
$$v_k = v_0 \sqrt{\gamma_k / \gamma_{\text{total}}}, \quad \text{where} \quad \sqrt{\gamma_{\text{total}}} = \sum\nolimits_k \sqrt{\gamma_k}.$$
This implies that Eq. (43) attains an optimal value of γtotal/v0.

For our specific problem in Eq. (36), we can write the subproblem in the form of Eq. (43) by setting γk = 𝒞2‖xk‖2²/(𝒞1²I0) and vk = τk. This yields an optimal value of γ(∑l=1..L ‖xl‖2)², where γ = 𝒞2/(𝒞1²I0τ0). The lth optimal exposure time can then be obtained via τl = τ0‖xl‖2 / ∑l′ ‖xl′‖2.
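The closed form above is easy to sanity-check numerically; the sketch below draws random positive γk (arbitrary test values), builds the square-root allocation, and confirms that no random feasible allocation on the same budget does better:

```python
import numpy as np

rng = np.random.default_rng(1)
gamma = rng.uniform(0.1, 2.0, size=10)      # positive gamma_k
v0 = 3.0                                     # total budget (e.g. tau_0 seconds)

# Closed-form optimum: v_k proportional to sqrt(gamma_k).
v_opt = v0 * np.sqrt(gamma) / np.sqrt(gamma).sum()
obj_opt = np.sum(gamma / v_opt)

# Optimal value equals (sum_k sqrt(gamma_k))^2 / v0 = gamma_total / v0.
assert np.isclose(obj_opt, np.sqrt(gamma).sum() ** 2 / v0)

# By convexity, no random feasible allocation should beat the closed form.
for _ in range(1000):
    v = rng.uniform(0.01, 1.0, size=10)
    v *= v0 / v.sum()                        # put it on the budget boundary
    assert np.sum(gamma / v) >= obj_opt - 1e-9
```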

Appendix B. Spectral covariance of a piecewise-constant vector function

We consider an ensemble of piecewise-constant two-dimensional real vector-valued functions p(x, y) where the values v for individual pieces of this function are independent and identically distributed (i.i.d.) according to some continuous statistical distribution with density function f𝒱(v). We define lines through this function using parameters s and θ to specify the line and parameter r to specify position on this line, i.e.,

$$l(r; s, \theta) = p(r\cos\theta - s\sin\theta,\; r\sin\theta + s\cos\theta).$$
The cross-correlation of the ensemble p(x, y) in polar coordinates is given by
$$C_{ij}(\rho, \theta) = E_{xy}\!\left[ p_i(x, y)\, p_j(x + \rho\cos\theta, y + \rho\sin\theta) \right] = E_{rs}\!\left[ l_i(r; s, \theta)\, l_j(r + \rho; s, \theta) \right].$$
Given some r for l(r; s, θ), we assume the probability that l(r + ρ; s, θ) is still in the same “piece,” and hence has the same value, is equal to exp(−ρ/Δ). Since the pieces of this function are i.i.d.,
$$E_{rs}\!\left[ l_i(r; s, \theta)\, l_j(r + \rho; s, \theta) \right] = E(v_i v_j)\exp(-\rho/\Delta) + E(v_i)E(v_j)\left[ 1 - \exp(-\rho/\Delta) \right]$$
$$= E(v_i)E(v_j) + \left[ E(v_i v_j) - E(v_i)E(v_j) \right]\exp(-\rho/\Delta).$$
The (raw) spectral covariance, i.e., the Fourier transform of the cross-correlation, can be computed via the Hankel transform of the exponential, yielding
$$E(v_i)E(v_j)\,\delta(f_x, f_y) + \left[ E(v_i v_j) - E(v_i)E(v_j) \right] \frac{2\pi\Delta^2}{\left[ 1 + 4\pi^2\Delta^2\!\left( f_x^2 + f_y^2 \right) \right]^{3/2}}.$$
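The Hankel-transform identity used in the last step can be verified numerically. The sketch below evaluates J0 through its integral representation (to stay dependency-free) and compares the radial transform of exp(−ρ/Δ) against the closed form; Δ = 0.7 is an arbitrary test value:

```python
import numpy as np

def trap(y, x, axis=-1):
    # Simple trapezoidal rule (kept local to avoid NumPy version differences).
    ys = np.moveaxis(y, axis, -1)
    return np.sum(0.5 * (ys[..., 1:] + ys[..., :-1]) * np.diff(x), axis=-1)

def j0(x):
    # Bessel J0 via its integral representation: (1/pi) Int_0^pi cos(x sin t) dt.
    theta = np.linspace(0.0, np.pi, 1501)
    return trap(np.cos(np.outer(x, np.sin(theta))), theta, axis=1) / np.pi

Delta = 0.7                                  # arbitrary feature size for the check
rho = np.linspace(0.0, 40.0 * Delta, 4001)   # radial grid; exp(-40) ~ 0 at the tail

def hankel_of_exp(freq):
    # 2D Fourier transform of the radial function exp(-rho/Delta):
    # H(f) = 2 pi Int_0^inf exp(-rho/Delta) J0(2 pi f rho) rho d rho.
    return 2.0 * np.pi * trap(np.exp(-rho / Delta) * j0(2.0 * np.pi * freq * rho) * rho, rho)

for fr in (0.0, 0.3, 1.0):
    closed = 2 * np.pi * Delta**2 / (1 + 4 * np.pi**2 * Delta**2 * fr**2) ** 1.5
    assert abs(hankel_of_exp(fr) - closed) < 1e-2 * closed + 1e-9
```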

Appendix C. On the squared sum of norms penalty term

We consider a problem of the form

$$\underset{x}{\text{minimize}} \quad \frac{1}{2}\left\| Ax - b \right\|_2^2 + \frac{1}{2}\left( \sum_{i=1}^{n} w_i \left\| x_i \right\|_2 \right)^2,$$
where wi > 0 are weights, and x1, . . ., xn are m-length non-overlapping partitions of x ∈ ℂnm. Without loss of generality, we can assume that x consists of the x1, . . ., xn vertically concatenated. Let us denote the least squares term f(x) and the penalty term g(x).

We note that g(x) is a convex function of x, since it is the composition of a nondecreasing convex function s(ν) = max{0, ν}² and a convex function, the weighted sum of ℓ2 norms. Since f(x) is also convex, the above problem can be solved by convex optimization. While f(x) is smooth, g(x) is not, e.g., when one or more of the xi's are zero. This suggests the use of a proximal gradient method [27–29] to solve Eq. (50), and thus we require the proximal operator for g(x), defined as:

$$\mathrm{Prox}_g^{\alpha}(x) = \underset{y}{\arg\min}\left\{ \frac{1}{2}\left\| y - x \right\|_2^2 + \frac{\alpha}{2}\left( \sum_{i=1}^{n} w_i \left\| y_i \right\|_2 \right)^2 \right\},$$
where α > 0, and y1, . . ., yn form the same partition of y as the x1, . . ., xn do of x. We note the following property of the optimal y and hence yis:

Theorem 1. The optimal yi for Eq. (51) are of the form yi = βi xi, where βi ∈ [0, 1].

Proof. Since the expression in Eq. (51) is also the dual of a convex function, strong duality implies that there exist ε1, . . ., εn ≥ 0 such that the optimal y is also a solution to the following constrained optimization problem:

$$\underset{y}{\text{minimize}} \quad \left( \sum_{i=1}^{n} w_i \left\| y_i \right\|_2 \right)^2 \qquad \text{subject to} \quad \left\| y_i - x_i \right\|_2^2 \le \varepsilon_i^2, \quad \forall i \in \{1, \ldots, n\}.$$
Since s(ν) = max{0, ν}² is nondecreasing, the above problem is equivalent to:
$$\underset{y}{\text{minimize}} \quad \sum_{i=1}^{n} w_i \left\| y_i \right\|_2 \qquad \text{subject to} \quad \left\| y_i - x_i \right\|_2^2 \le \varepsilon_i^2, \quad \forall i \in \{1, \ldots, n\}.$$
As the function to be minimized is now separable, and the constraints are separable as well, the optimal value of each yi can be found separately via:
$$\underset{y_i}{\text{minimize}} \quad \left\| y_i \right\|_2 \qquad \text{subject to} \quad \left\| y_i - x_i \right\|_2 \le \varepsilon_i,$$
which has the closed-form solution yi = max{0, (‖xi‖2 − εi)/‖xi‖2} xi.

Therefore, in order to solve Eq. (51), we can instead solve the following:

$$\underset{\beta_1, \ldots, \beta_n}{\text{minimize}} \quad \sum_{i=1}^{n} \frac{1}{2}\left( \beta_i - 1 \right)^2 \left\| x_i \right\|_2^2 + \frac{\alpha}{2}\left( \sum_{i=1}^{n} w_i \beta_i \left\| x_i \right\|_2 \right)^2 \qquad \text{subject to} \quad \beta_i \ge 0, \quad \forall i \in \{1, \ldots, n\}.$$
Define γ1, . . ., γn to be the elements of vector γ, where γi = βi‖xi‖2, and likewise ξ1, . . ., ξn to be the elements of vector ξ, where ξi = ‖xi‖2. Hence, the above problem is equivalent to the following nonnegative least squares formulation:
$$\underset{\gamma}{\text{minimize}} \quad \gamma^\top \gamma - 2\xi^\top \gamma + \alpha\, \gamma^\top w w^\top \gamma \qquad \text{subject to} \quad \gamma \ge 0.$$
Once the optimal γ are obtained, we can then compute the optimal yis. While any nonnegative least squares algorithm can be used to solve Eq. (56), we note the following property about our specific problem:

Lemma 1. Let γ̂ denote the nonzero components of the optimal γ for Eq. (56), and let ξ̂ and ŵ denote the corresponding elements of ξ and w, respectively. Then, the γ̂ must satisfy:

$$\hat{\gamma} = \hat{\xi} - s\hat{w}, \quad \text{where} \quad s = \left( \hat{w}^\top \hat{\xi} \right) \Big/ \left( \alpha^{-1} + \hat{w}^\top \hat{w} \right).$$
Proof. Since the zero-valued entries of γ annihilate the corresponding components of the expression to be minimized in Eq. (56), we can reduce that expression to
$$l(\hat{\gamma}) = \hat{\gamma}^\top \hat{\gamma} - 2\hat{\xi}^\top \hat{\gamma} + \alpha\, \hat{\gamma}^\top \hat{w} \hat{w}^\top \hat{\gamma} = \hat{\gamma}^\top W \hat{\gamma} - 2\hat{\xi}^\top \hat{\gamma} = \left\| W^{1/2} \hat{\gamma} - W^{-1/2} \hat{\xi} \right\|_2^2 + \mathcal{O}(1),$$
where W = I + αŵŵᵀ. Since α > 0, W is invertible, and thus l(γ̂) has minimizer
$$\hat{\gamma} = \left( I - \frac{\alpha \hat{w} \hat{w}^\top}{1 + \alpha \hat{w}^\top \hat{w}} \right) \hat{\xi},$$
via the Sherman–Morrison formula, and minimal value equal to −ξ̂ᵀγ̂.
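A quick numeric check of Lemma 1: the closed form must solve the unconstrained normal equations (I + αŵŵᵀ)γ̂ = ξ̂. The random ŵ, ξ̂, and α below are arbitrary test values:

```python
import numpy as np

rng = np.random.default_rng(2)
n, alpha = 6, 0.8
w = rng.uniform(0.5, 2.0, size=n)     # stand-in for w-hat
xi = rng.uniform(0.1, 3.0, size=n)    # stand-in for xi-hat

# Lemma 1 / Sherman-Morrison closed form.
s = (w @ xi) / (1.0 / alpha + w @ w)
gamma = xi - s * w

# gamma must satisfy (I + alpha * w w^T) gamma = xi ...
assert np.allclose((np.eye(n) + alpha * np.outer(w, w)) @ gamma, xi)
# ... and the minimal value of l(gamma) is -xi^T gamma.
l = gamma @ gamma - 2 * xi @ gamma + alpha * (w @ gamma) ** 2
assert np.isclose(l, -xi @ gamma)
```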

We now consider the following theorem regarding the solution to Eq. (56):

Theorem 2. Consider a vector γ ≥ 0; let 𝒫 ⊆ {1, . . ., n} denote the indexes of its entries that are positive. Then, γ is a solution to Eq. (56) if and only if the following statements are true:

$$\gamma_i = \xi_i - s w_i, \quad \forall i \in \mathcal{P},$$
$$\xi_i \le s w_i, \quad \forall i \notin \mathcal{P},$$
where s is as it is defined in Lemma 1.

Proof. By inspection, we note that αŵᵀγ̂ = αwᵀγ = s, based on the definition of s and the structure of γ. Thus, the first and second statements are equivalent to

$$\hat{\gamma} + \alpha \hat{w}\, w^\top \gamma = \hat{\xi}, \qquad \alpha \bar{w}\, w^\top \gamma \ge \bar{\xi},$$
where w̄ and ξ̄ are the entries of w and ξ, respectively, that correspond to the zero-valued entries of γ. Thus, both statements being true is equivalent to:
$$0 = \left( I + \alpha w w^\top \right) \gamma - \xi - u,$$
where the entries ui of u are zero for i ∈ 𝒫 and nonnegative for i ∉ 𝒫. This equation is equivalent to the stationarity condition of the Karush–Kuhn–Tucker (KKT) conditions, and the structure of u is equivalent to the combination of complementary slackness and dual feasibility. Primal feasibility is satisfied by the nonnegativity of γ assumed by the theorem. Thus, satisfying the two statements of the theorem under the assumed conditions is equivalent to satisfying the KKT conditions. Since the primal problem exhibits strong duality, γ is a solution to Eq. (56) if and only if it satisfies the two statements.

Without loss of generality, let’s assume that

$$\xi_1 / w_1 \ge \xi_2 / w_2 \ge \cdots \ge \xi_n / w_n,$$
since we can sort and unsort the entries as needed if this assumption doesn’t hold. Under this assumption, we have the following corollary:

Corollary 1. If Eq. (65) holds and γ is a solution to Eq. (56), then 𝒫 = {1, . . ., n′} for some 0 ≤ n′ ≤ n, with n′ = 0 if and only if ξ = 0.

Proof. 𝒫 is a contiguous range of indexes by a combination of Eq. (62) and Eq. (65). If n′ is zero, then 𝒫 = ∅ and s = 0, and hence ξ ≤ 0 elementwise. Since the elements of ξ are norms, the only possible conclusion is that ξ = 0. Conversely, suppose ξ = 0; then, since s wi ≥ 0 by construction, no γi > 0 can exist via Eq. (61), and thus 𝒫 = ∅.

With these observations, we propose Algorithm 1 for computing Eq. (51). This algorithm has asymptotic complexity 𝒪(mn + n²), where the 𝒪(mn) part is due to computing the ℓ2 norms in step 3 and rescaling in step 18, whereas the 𝒪(n²) part is due to the while loop of at most n iterations, each of which requires inner products of vectors of length at most n. Sorting the elements takes only 𝒪(n log n) time and is dominated by the other operations.


Algorithm 1. Proximal operator for squared (weighted) sum of norms.
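A minimal Python sketch of Algorithm 1, following Lemma 1 and Theorem 2 (the function name and the pruning loop are our own; the paper's step numbering is not reproduced):

```python
import numpy as np

def prox_squared_sum_of_norms(x_parts, w, alpha):
    """Prox of g(x) = (1/2) (sum_i w_i ||x_i||_2)^2 with step size alpha.

    x_parts: list of n partition vectors, w: positive weights.
    Active entries satisfy gamma_i = xi_i - s * w_i (Theorem 2).
    """
    xi = np.array([np.linalg.norm(p) for p in x_parts])
    w = np.asarray(w, dtype=float)
    active = xi > 0.0                # zero partitions can never be active
    s = 0.0
    while active.any():
        wa, xa = w[active], xi[active]
        s = (wa @ xa) / (1.0 / alpha + wa @ wa)   # Lemma 1 on the active set
        violated = active & (xi - s * w <= 0.0)
        if not violated.any():
            break                    # Theorem 2's conditions hold; done
        active &= ~violated          # prune violators; s only grows, so
                                     # pruned entries stay feasible
    gamma = np.maximum(xi - s * w, 0.0)
    # Rescale each partition: y_i = (gamma_i / xi_i) x_i (Theorem 1).
    scale = np.divide(gamma, xi, out=np.zeros_like(gamma), where=xi > 0.0)
    return [c * p for c, p in zip(scale, x_parts)]

# Example with norms xi = [5, 1, 0] and unit weights: s settles at 2.5,
# so only the first partition survives, scaled by 2.5 / 5 = 0.5.
parts = [np.array([3.0, 4.0]), np.array([0.6, 0.8]), np.zeros(2)]
y = prox_squared_sum_of_norms(parts, w=[1.0, 1.0, 1.0], alpha=1.0)
assert np.allclose(y[0], [1.5, 2.0])
assert np.allclose(y[1], 0.0) and np.allclose(y[2], 0.0)
```

Because the objective of Eq. (51) is convex, the output can also be checked against random perturbations: no nearby point should achieve a lower objective value.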

Funding

National Research Foundation Prime Minister’s Office Singapore (CREATE Programme Singapore-MIT Alliance for Research and Technology BioSystems and Micromechanics IRG).

Acknowledgments

The authors would like to thank Colin Sheppard, Jon Petruccelli and Chenglong Bao for their helpful discussions.

References

1. R. W. Gerchberg and W. O. Saxton, “A practical algorithm for the determination of the phase from image and diffraction plane pictures,” Optik 35, 237–246 (1972).

2. J. R. Fienup, “Phase retrieval algorithms: A comparison,” Appl. Opt. 21, 2758–2769 (1982).

3. E. J. Candès, Y. C. Eldar, T. Strohmer, and V. Voroninski, “Phase retrieval via matrix completion,” SIAM J. Imaging Sci. 6, 199–225 (2013).

4. M. R. Teague, “Deterministic phase retrieval: a Green’s function solution,” J. Opt. Soc. Am. 73, 1434–1441 (1983).

5. N. Streibl, “Phase imaging by the transport equation of intensity,” Opt. Commun. 49, 6–10 (1984).

6. A. Barty, K. A. Nugent, D. Paganin, and A. Roberts, “Quantitative optical phase microscopy,” Opt. Lett. 23, 817–819 (1998).

7. L. Waller, L. Tian, and G. Barbastathis, “Transport of intensity phase-amplitude imaging with higher order intensity derivatives,” Opt. Express 18, 12552–12561 (2010).

8. J. Zhong, R. A. Claus, J. Dauwels, L. Tian, and L. Waller, “Transport of intensity phase imaging by intensity spectrum fitting of exponentially spaced defocus planes,” Opt. Express 22, 10661–10674 (2014).

9. J.-P. Guigay, “Fourier transform analysis of Fresnel diffraction patterns and in-line holograms,” Optik 49, 121–125 (1977).

10. Y. I. Nesterets and T. E. Gureyev, “Partially coherent contrast-transfer-function approximation,” J. Opt. Soc. Am. A 33, 464–474 (2016).

11. J. C. Petruccelli, L. Tian, and G. Barbastathis, “The transport of intensity equation for optical path length recovery using partially coherent illumination,” Opt. Express 21, 14430–14441 (2013).

12. T. Chakraborty and J. C. Petruccelli, “Source diversity for transport of intensity phase imaging,” Opt. Express 25, 9122–9137 (2017).

13. B. Xue, S. Zheng, L. Cui, X. Bai, and F. Zhou, “Transport of intensity phase imaging from multiple intensities measured in unequally-spaced planes,” Opt. Express 19, 20244–20250 (2011).

14. S. Zheng, B. Xue, W. Xue, X. Bai, and F. Zhou, “Transport of intensity phase imaging from multiple noisy intensities measured in unequally-spaced planes,” Opt. Express 20, 972–985 (2012).

15. J. Sun, C. Zuo, and Q. Chen, “Iterative optimum frequency combination method for high efficiency phase imaging of absorptive objects based on phase transfer function,” Opt. Express 23, 28031–28049 (2015).

16. L. Tian, J. C. Petruccelli, and G. Barbastathis, “Nonlinear diffusion regularization for transport of intensity phase imaging,” Opt. Lett. 37, 4131–4133 (2012).

17. L. Tian, J. C. Petruccelli, Q. Miao, H. Kudrolli, V. Nagarkar, and G. Barbastathis, “Compressive x-ray phase tomography based on the transport of intensity equation,” Opt. Lett. 38, 3418–3421 (2013).

18. E. R. Dowski Jr. and W. T. Cathey, “Extended depth of field through wave-front coding,” Appl. Opt. 34, 1859–1866 (1995).

19. W. T. Cathey and E. R. Dowski, “New paradigm for imaging systems,” Appl. Opt. 41, 6080–6092 (2002).

20. M. D. Stenner, A. Ashok, and M. A. Neifeld, “Multi-domain optimization for ultra-thin cameras,” in Frontiers in Optics, (Optical Society of America, 2006), p. FWH5.

21. E. J. Tremblay, J. Rutkowski, I. Tamayo, P. E. X. Silveira, R. A. Stack, R. L. Morrison, M. A. Neifeld, Y. Fainman, and J. E. Ford, “Relaxing the alignment and fabrication tolerances of thin annular folded imaging systems using wavefront coding,” Appl. Opt. 46, 6751–6758 (2007).

22. J. Frank, Three-Dimensional Electron Microscopy of Macromolecular Assemblies: Visualization of Biological Molecules in Their Native State (Oxford University, 2006).

23. M. Vulović, L. M. Voortman, L. J. van Vliet, and B. Rieger, “When to use the projection assumption and the weak-phase object approximation in phase contrast cryo-EM,” Ultramicroscopy 136, 61–66 (2014).

24. E. Wolf, “New theory of partial coherence in the space-frequency domain. Part I: spectra and cross spectra of steady-state sources,” J. Opt. Soc. Am. 72, 343–351 (1982).

25. L. Mandel and E. Wolf, “Spectral coherence and the concept of cross-spectral purity,” J. Opt. Soc. Am. 66, 529–535 (1976).

26. A. Goy, K. Arthur, S. Li, and G. Barbastathis, “Low photon count phase retrieval using deep learning,” arXiv preprint arXiv:1806.10029 (2018).

27. Y. Nesterov, “Gradient methods for minimizing composite objective function,” CORE Discussion Papers 2007076, Université catholique de Louvain, Center for Operations Research and Econometrics (CORE) (2007).

28. A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM J. Imaging Sci. 2, 183–202 (2009).

29. B. O’Donoghue and E. Candès, “Adaptive restart for accelerated gradient schemes,” Found. Comp. Math. 15, 715–732 (2015).

30. C. Bao, G. Barbastathis, H. Ji, Z. Shen, and Z. Zhang, “Coherence retrieval using trace regularization,” SIAM J. Imaging Sci. 11, 679–706 (2018).

31. G. J. Burton and I. R. Moorhead, “Color and spatial structure in natural scenes,” Appl. Opt. 26, 157–170 (1987).

32. D. J. Field, “Relations between the statistics of natural images and the response properties of cortical cells,” J. Opt. Soc. Am. A 4, 2379–2394 (1987).

33. D. J. Tolhurst, Y. Tadmor, and T. Chao, “Amplitude spectra of natural images,” Ophthalmic Physiol. Opt. 12, 229–232 (1992).

34. A. B. Lee, D. Mumford, and J. Huang, “Occlusion models for natural images: A statistical study of a scale-invariant dead leaves model,” Int. J. Comput. Vis. 41, 35–59 (2001).

35. Y. LeCun, J. S. Denker, and S. A. Solla, “Optimal brain damage,” in Advances in Neural Information Processing Systems 2, D. S. Touretzky, ed. (Morgan-Kaufmann, 1990), pp. 598–605.

36. P. Molchanov, S. Tyree, T. Karras, T. Aila, and J. Kautz, “Pruning convolutional neural networks for resource efficient inference,” in Proceedings of the International Conference on Learning Representations (ICLR) (2017).

37. Y. C. Pati, R. Rezaiifar, and P. S. Krishnaprasad, “Orthogonal matching pursuit: recursive function approximation with applications to wavelet decomposition,” in Proceedings of 27th Asilomar Conference on Signals, Systems and Computers (1993), vol. 1, pp. 40–44.

38. W. Karush, “Minima of functions of several variables with inequalities as side constraints,” Ph.D. thesis, University of Chicago (1939).

39. H. W. Kuhn and A. W. Tucker, “Nonlinear programming,” in Proceedings of the Second Berkeley Symposium on Mathematical Statistics and Probability, (University of California, 1951), pp. 481–492.



Figures (11)

Fig. 1
Fig. 1 A standard optical microscope, wherein an area light source at the front focal plane of a condenser lens illuminates a thin specimen (blue) coincident with the back focal plane. An objective–tube lens 4f system forms a magnified image (blue dotted) of the specimen at the intermediate image plane. Rays passing through the specimen’s center and edge are shown in gray and dark red, respectively. For clarity, relative sizes are not to scale.
Fig. 2
Fig. 2 Positions and exposure times computed using SMaRT are shown in dark green, with those for GP-TIE shown in tan. Total exposure time for the two methods are the same.
Fig. 3
Fig. 3 Estimated transfer function for the real (α, left) and imaginary (β, right) components of the specimen s(x, y) are shown in green. The dashed tan line indicates the effective transfer function for comparison images generated using the GP-TIE and TIE methods. A normalized frequency of 1.0 corresponds to the maximum frequency allowable by the numerical aperture and magnification of the system.
Fig. 4
Fig. 4 Ground truth and reconstructed phase images of a simulated Siemens star target; root-mean-square error in radians is given in brackets for each reconstruction. Crops “A” and “B” are shown magnified in Fig. 5 and Fig. 6, respectively. Cross sections of the reconstructed phase and reconstruction error along the dashed line “C” are visualized in Fig. 7 and Fig. 8, respectively. The scale bar indicates 500 μm on the intermediate image plane and 100 μm in the specimen.
Fig. 5
Fig. 5 Magnified crops of the region indicated by “A” in Fig. 4.
Fig. 6
Fig. 6 Magnified crops of the region indicated by “B” in Fig. 4. The center in the ground truth is a flat circle because of the diffraction limit.
Fig. 7
Fig. 7 Reconstructed phase of simulated Siemens star along the dashed line “C” in Fig. 4. The ground truth is shown by the gray gradient.
Fig. 8
Fig. 8 Reconstruction error for simulated Siemens star along the dashed line “C” in Fig. 4.
Fig. 9
Fig. 9 Reconstructed phase images of actual Siemens star. Crops “A” and “B” are shown magnified in Fig. 10. Cross sections of the reconstructed phase along the dashed line “C” are visualized in Fig. 11. The scale bar indicates 500 μm on the intermediate image plane and 100 μm in the specimen.
Fig. 10
Fig. 10 Crops of reconstructed Siemens star for regions indicated by “A” and “B” in Fig. 9.
Fig. 11
Fig. 11 Reconstructed phase for the actual Siemens star along the dashed line “C” in Fig. 9.

Tables (1)

Tables Icon

Algorithm 1 Proximal operator for squared (weighted) sum of norms.

Equations (65)

Equations on this page are rendered with MathJax. Learn more.

minimize M 0 , R 0 f 𝒮 0 ( s ) s r R ( p M ( s ) ) 2 2 d s ,
minimize x 1 , , x L b l = 1 L A l x l 2 2 + γ ( l = 1 L x l ) 2 .
$$\Theta(x, y) = \bar{\Theta} + s(x, y) = \bar{\Theta} + \alpha(x, y) + i \beta(x, y),$$

$$s(x, y) \approx \sum_{m=1}^{M} \sum_{n=1}^{N} \tilde{s}_{mn} \exp(i 2\pi \xi_m x) \exp(i 2\pi \eta_n y) = \sum_{m,n} \tilde{s}_{mn} \psi_{mn}(x, y),$$

$$U_{\text{in}}(x, y) = \exp[\, i 2\pi (\xi x + \eta y) \,],$$

$$U_{\text{prop}}(x, y, z) = \phi_z \, U_{\text{in}}(x, y) \bigl[ 1 + s_{\text{prop}}(x - \lambda z \xi,\, y - \lambda z \eta,\, z; \lambda) \bigr],$$

$$s_{\text{prop}}(x, y, z; \lambda) = \sum_{m,n} \tilde{s}_{mn} \psi_{mn}(x, y) \exp\bigl[ -i \pi \lambda z (\xi_m^2 + \eta_n^2) \bigr].$$

$$I_{\text{coherent}}(x, y, z; \xi, \eta, \lambda) = 1 + 2 \operatorname{Re}\{ s_{\text{prop}}(x - \lambda z \xi, y - \lambda z \eta, z; \lambda) \} + \bigl| s_{\text{prop}}(x - \lambda z \xi, y - \lambda z \eta, z; \lambda) \bigr|^2.$$

$$I_{\text{coherent}}(x, y, z; \xi, \eta, \lambda) \approx 1 + 2 \operatorname{Re}[\, s_{\text{prop}}(x - \lambda z \xi, y - \lambda z \eta, z; \lambda) \,].$$

$$\sum_{m,n} \bigl\{ \tilde{\alpha}_{mn} \cos[ \pi \lambda z (\xi_m^2 + \eta_n^2) ] + \tilde{\beta}_{mn} \sin[ \pi \lambda z (\xi_m^2 + \eta_n^2) ] \bigr\} \, \delta(\xi - \xi_m,\, \eta - \eta_n).$$
$$W(x_1, y_1, x_2, y_2, \omega) = I_0 \, \mu(x_1 - x_2,\, y_1 - y_2,\, \omega),$$

$$\mu(\Delta x, \Delta y, \omega) = \iint \rho(a, b, \omega) \exp[\, i k (a \Delta x + b \Delta y) \,] \, da \, db.$$

$$I(x, y, z) = I_0 \iiint \rho(a, b, \omega) \, I_{\text{coherent}}(x, y, z;\, k a,\, k b,\, 2\pi c / \omega) \, da \, db \, d\omega.$$

$$I(x, y, z) \approx I_0 + 2 I_0 \operatorname{Re}\Bigl\{ \iiint \rho(a, b, \omega) \, s_{\text{prop}}(x + 2\pi z a,\, y + 2\pi z b,\, z;\, 2\pi c / \omega) \, da \, db \, d\omega \Bigr\}$$

$$= I_0 + \int \Bigl[ \frac{I_0}{4\pi^2 z^2} \, \rho\Bigl( \frac{x}{2\pi z}, \frac{y}{2\pi z}, \omega \Bigr) \Bigr] \circledast 2 \operatorname{Re}\{ s_{\text{prop}}(x, y, z;\, 2\pi c / \omega) \} \, d\omega,$$

$$\tilde{I}(\xi, \eta, z) \approx I_0 \, \delta(\xi, \eta) + \tilde{I}_{\text{real}}(\xi, \eta, z) + \tilde{I}_{\text{imag}}(\xi, \eta, z),$$

$$\tilde{I}_{\text{real}}(\xi, \eta, z) = I_0 \sum_{m,n} \tilde{\alpha}_{mn} \, \delta(\xi - \xi_m,\, \eta - \eta_n) \, \tilde{P}_{\text{real}}(\xi_m, \eta_n, z),$$

$$\tilde{I}_{\text{imag}}(\xi, \eta, z) = I_0 \sum_{m,n} \tilde{\beta}_{mn} \, \delta(\xi - \xi_m,\, \eta - \eta_n) \, \tilde{P}_{\text{imag}}(\xi_m, \eta_n, z),$$

$$\tilde{P}_{\text{real}}(\xi_m, \eta_n, z) = 2 \int \cos[\, \lambda \pi z (\xi_m^2 + \eta_n^2) \,] \, \mu(\lambda z \xi_m,\, \lambda z \eta_n,\, \omega) \, d\omega,$$

$$\tilde{P}_{\text{imag}}(\xi_m, \eta_n, z) = 2 \int \sin[\, \lambda \pi z (\xi_m^2 + \eta_n^2) \,] \, \mu(\lambda z \xi_m,\, \lambda z \eta_n,\, \omega) \, d\omega.$$
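As an illustration of the transfer functions P̃_real and P̃_imag above, the sketch below evaluates them on one frequency slice for a single wavelength, with an assumed Gaussian form for the coherence function μ. The Gaussian model, the parameter values, and the collapse of the ω integral to a single wavelength are all illustrative assumptions, not the paper's source model.

```python
import numpy as np

# Illustrative parameters (not from the paper)
lam = 0.5e-6       # wavelength (m)
z = 1e-3           # defocus distance (m)
sigma_mu = 50e-6   # width of the assumed Gaussian coherence function mu (m)

xi = np.linspace(0.0, 2e5, 1024)   # spatial frequency (cycles/m), eta = 0 slice
rho2 = xi**2

# Monochromatic forms of the transfer functions:
#   P_real = 2 cos(pi lam z rho^2) mu(lam z xi, lam z eta)
#   P_imag = 2 sin(pi lam z rho^2) mu(lam z xi, lam z eta)
mu = np.exp(-((lam * z * xi) ** 2) / (2 * sigma_mu**2))
P_real = 2.0 * np.cos(np.pi * lam * z * rho2) * mu
P_imag = 2.0 * np.sin(np.pi * lam * z * rho2) * mu
```

Note that at z = 0 the sine factor makes P̃_imag vanish identically, which is the usual motivation for defocused measurements of phase.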
$$\mathrm{pixel}_{lmn}^{(\mathrm{noiseless})} = \mathcal{C}_1 \tau_l \iint \chi(x - x_m,\, y - y_n) \, I(x, y, z_l) \, dx \, dy,$$

$$\mathrm{pixel}_{lmn}^{(\mathrm{noisy})} \sim \mathcal{N}\bigl( \mathrm{pixel}_{lmn}^{(\mathrm{noiseless})},\; \mathcal{C}_2 \tau_l I_0 \bigr),$$

$$\widetilde{\mathrm{pixel}}_{lmn}^{(\mathrm{noisy})} = \frac{1}{MN} \sum_{\hat{m}=1}^{M} \sum_{\hat{n}=1}^{N} \mathrm{pixel}_{l \hat{m} \hat{n}}^{(\mathrm{noisy})} \exp\bigl\{ -i 2\pi \bigl[ (x_{\hat{m}} - x_{\bar{m}}) \xi_m + (y_{\hat{n}} - y_{\bar{n}}) \eta_n \bigr] \bigr\}$$

$$\sim \mathcal{N}\bigl( \mathcal{C}_1 \tau_l \hat{I}(\xi_m, \eta_n, z_l),\; \mathcal{C}_2 \tau_l I_0 \bigr),$$

$$\hat{I}(\xi_m, \eta_n, z_l) = I_0 \, \tilde{\chi}(\xi_m, \eta_n) \bigl[ \tilde{P}_{\text{real}}(\xi_m, \eta_n, z_l) \, \tilde{\alpha}_{mn} + \tilde{P}_{\text{imag}}(\xi_m, \eta_n, z_l) \, \tilde{\beta}_{mn} \bigr].$$
$$y = p_M(s) = P s + n,$$

$$P_{(lmn,\, pqr)} = \begin{cases} p_{lmn1} = \tilde{\chi}(\xi_m, \eta_n) \tilde{P}_{\text{real}}(\xi_m, \eta_n, z_l) & \text{if } m = p,\; n = q,\; r = 1 \\ p_{lmn2} = \tilde{\chi}(\xi_m, \eta_n) \tilde{P}_{\text{imag}}(\xi_m, \eta_n, z_l) & \text{if } m = p,\; n = q,\; r = 2 \\ 0 & \text{otherwise}. \end{cases}$$

$$R_{(pqr,\, lmn)} = \begin{cases} x_{mnrl} & \text{if } m = p,\; n = q \\ 0 & \text{otherwise}. \end{cases}$$

$$\underset{x,\, \tau_1, \ldots, \tau_L}{\text{minimize}} \;\; \mathbb{E}_{s \sim \mathcal{S}_0,\, n}\bigl[ \| s - R(P s + n) \|_2^2 \bigr] \quad \text{subject to} \quad \tau_0 \geq \textstyle\sum_l \tau_l.$$

$$\mathbb{E}_{s \sim \mathcal{S}_0,\, n}\bigl[ \| s - R(P s + n) \|_2^2 \bigr] = \mathbb{E}_{s \sim \mathcal{S}_0,\, n}\Bigl\{ \sum_{m,n} \Bigl\| R_{mn} \Bigl[ P_{mn} \binom{\tilde{\alpha}_{mn}}{\tilde{\beta}_{mn}} + n_{mn} \Bigr] - \binom{\tilde{\alpha}_{mn}}{\tilde{\beta}_{mn}} \Bigr\|_2^2 \Bigr\},$$

$$R_{mn} = \begin{pmatrix} x_{mn11} & x_{mn12} & \cdots & x_{mn1L} \\ x_{mn21} & x_{mn22} & \cdots & x_{mn2L} \end{pmatrix}, \qquad P_{mn} = \begin{pmatrix} p_{1mn1} & p_{1mn2} \\ \vdots & \vdots \\ p_{Lmn1} & p_{Lmn2} \end{pmatrix}.$$

$$B_{mn} B_{mn}^H = C_{mn} = \begin{bmatrix} \mathbb{E}_{s \sim \mathcal{S}_0}\bigl( |\tilde{\alpha}_{mn}|^2 \bigr) & \mathbb{E}_{s \sim \mathcal{S}_0}\bigl( \tilde{\alpha}_{mn} \tilde{\beta}_{mn}^* \bigr) \\ \mathbb{E}_{s \sim \mathcal{S}_0}\bigl( \tilde{\beta}_{mn} \tilde{\alpha}_{mn}^* \bigr) & \mathbb{E}_{s \sim \mathcal{S}_0}\bigl( |\tilde{\beta}_{mn}|^2 \bigr) \end{bmatrix}, \qquad B_{mn} = \begin{pmatrix} b_{mn1} & b_{mn3} \\ b_{mn2} & b_{mn4} \end{pmatrix},$$

$$\sum_{m,n} \bigl\| ( R_{mn} P_{mn} - I ) B_{mn} \bigr\|_2^2 + \sum_{m,n} \operatorname{tr}\bigl[ R_{mn}^H R_{mn} \, \mathbb{E}_n\bigl( n_{mn} n_{mn}^H \bigr) \bigr].$$
$$\underset{x,\, \tau_1, \ldots, \tau_L}{\text{minimize}} \;\; \Bigl\| b - \sum_{l=1}^{L} A_l x_l \Bigr\|_2^2 + \sum_{l=1}^{L} \sigma_l^2 \| x_l \|_2^2 \quad \text{subject to} \quad \tau_0 \geq \textstyle\sum_l \tau_l,$$

$$A_{lmn} = \bigl[ ( p_{lmn1} b_{mn1} + p_{lmn2} b_{mn2} ) I \;\;\; ( p_{lmn1} b_{mn3} + p_{lmn2} b_{mn4} ) I \bigr].$$

$$\underset{\tau_1, \ldots, \tau_L}{\text{minimize}} \;\; \sum_{l=1}^{L} \sigma_l^2 \| x_l \|_2^2 \quad \text{subject to} \quad \tau_0 \geq \textstyle\sum_l \tau_l.$$

$$\underset{x_1, \ldots, x_L}{\text{minimize}} \;\; \Bigl\| b - \sum_{l=1}^{L} A_l x_l \Bigr\|_2^2 + \gamma \Bigl( \sum_{l=1}^{L} \| x_l \|_2 \Bigr)^2,$$
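For the ridge-regularized variant above (the one with the Σ σ_l² ‖x_l‖₂² penalty), the minimizer over x is a block normal-equations solve. A tiny sketch on illustrative random data (the dimensions and values are made up for demonstration):

```python
import numpy as np

# minimize ||b - sum_l A_l x_l||^2 + sum_l sigma_l^2 ||x_l||^2.
# Stacking A = [A_1 ... A_L] and D = blockdiag(sigma_l^2 I) gives the
# normal equations (A^T A + D) x = A^T b.
rng = np.random.default_rng(1)
L, m, p = 3, 8, 2                     # number of blocks, rows, block width
A_blocks = [rng.standard_normal((m, p)) for _ in range(L)]
sigma2 = np.array([0.1, 0.4, 0.9])    # per-block noise variances
b = rng.standard_normal(m)

A = np.hstack(A_blocks)
D = np.diag(np.repeat(sigma2, p))
x = np.linalg.solve(A.T @ A + D, A.T @ b)
```

The solve is well posed even when A is rank deficient, since D makes the system matrix positive definite.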
$$\alpha(x, y) = \zeta \cos[\phi(x, y)] - 1, \quad \text{and} \quad \beta(x, y) = \zeta \sin[\phi(x, y)],$$

$$C_{mn} = \frac{2\pi \Delta^2}{\bigl[ 1 + 4\pi^2 \Delta^2 (\xi_m^2 + \eta_n^2) \bigr]^{3/2}} \begin{pmatrix} \zeta^2/2 + \zeta/2 - 1 & 0 \\ 0 & \zeta^2/2 - \zeta/2 \end{pmatrix}.$$
$$\rho_2(a, b, \omega) = \rho_1(a - a_0,\, b - b_0,\, \omega).$$

$$T_{(mn,\, k)} = \begin{cases} 1 + \lfloor t_{mn} \rfloor - t_{mn} & \text{if } k = \lfloor t_{mn} \rfloor \\ t_{mn} - \lfloor t_{mn} \rfloor & \text{if } k = \lfloor t_{mn} \rfloor + 1 \\ 0 & \text{otherwise}. \end{cases}$$
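The piecewise definition of T above is a linear-interpolation matrix: each row places weight 1 + ⌊t⌋ − t on the grid point ⌊t⌋ and weight t − ⌊t⌋ on ⌊t⌋ + 1. A small sketch (the clipping at the last grid point is an added convenience, not part of the definition):

```python
import numpy as np

def interp_matrix(t, K):
    """Row j holds the linear-interpolation weights of sample location t[j]
    over a grid of K points; assumes 0 <= t <= K - 1."""
    T = np.zeros((len(t), K))
    k0 = np.floor(t).astype(int)
    frac = t - k0
    rows = np.arange(len(t))
    T[rows, k0] = 1.0 - frac                      # weight 1 + floor(t) - t
    T[rows, np.minimum(k0 + 1, K - 1)] += frac    # weight t - floor(t)
    return T

t = np.array([0.25, 1.5, 2.0])
T = interp_matrix(t, K=4)
grid_vals = np.array([0.0, 1.0, 4.0, 9.0])
print(T @ grid_vals)   # linear interpolation of grid_vals at t: [0.25, 2.5, 4.0]
```

Because each row sums to one, T exactly reproduces constant functions, and applying it to grid samples performs piecewise-linear interpolation.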
$$\hat{b} = \bigl[ I_4 \otimes (Q T) \bigr] \, b, \qquad \hat{A}_l = ( I_4 \otimes Q ) \bigl[ ( I_4 \otimes T ) \, A_l \, ( I_2 \otimes T ) \bigr] ( I_2 \otimes Q ),$$
$$\underset{v_1, \ldots, v_K}{\text{minimize}} \;\; \sum_{k=1}^{K} \gamma_k / v_k \quad \text{subject to} \quad \sum_{k=1}^{K} v_k \leq v_0,$$

$$v_k = v_0 \sqrt{\gamma_k} / \gamma_{\text{total}}, \quad \text{where} \quad \gamma_{\text{total}} = \sum_k \sqrt{\gamma_k}.$$
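The closed form above follows from the KKT conditions of the budget-allocation problem: each v_k is proportional to √γ_k. A minimal numerical sketch with illustrative values:

```python
import numpy as np

def allocate(gammas, v0):
    """Closed-form minimizer of sum_k gamma_k / v_k subject to
    sum_k v_k = v0 (each v_k proportional to sqrt(gamma_k))."""
    r = np.sqrt(gammas)
    return v0 * r / r.sum()

gammas = np.array([1.0, 4.0, 9.0])
v = allocate(gammas, v0=6.0)
print(v)  # [1. 2. 3.]
```

Any other feasible split, for example the uniform one, yields a strictly larger objective whenever the γ_k differ.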
$$l(r; s, \theta) = p( r \cos\theta - s \sin\theta,\; r \sin\theta + s \cos\theta ).$$

$$C_{ij}(\rho, \theta) = \mathbb{E}_{xy}\bigl[ p_i(x, y) \, p_j(x + \rho \cos\theta,\, y + \rho \sin\theta) \bigr] = \mathbb{E}_{rs}\bigl[ l_i(r; s, \theta) \, l_j(r + \rho; s, \theta) \bigr].$$

$$\mathbb{E}_{rs}\bigl[ l_i(r; s, \theta) \, l_j(r + \rho; s, \theta) \bigr] = \mathbb{E}(v_i v_j) \exp(-\rho/\Delta) + \mathbb{E}(v_i) \mathbb{E}(v_j) \bigl[ 1 - \exp(-\rho/\Delta) \bigr]$$

$$= \mathbb{E}(v_i) \mathbb{E}(v_j) + \bigl[ \mathbb{E}(v_i v_j) - \mathbb{E}(v_i) \mathbb{E}(v_j) \bigr] \exp(-\rho/\Delta).$$

$$\mathbb{E}(v_i) \mathbb{E}(v_j) \, \delta(f_x, f_y) + \bigl[ \mathbb{E}(v_i v_j) - \mathbb{E}(v_i) \mathbb{E}(v_j) \bigr] \, \frac{2\pi \Delta^2}{\bigl[ 1 + 4\pi^2 \Delta^2 (f_x^2 + f_y^2) \bigr]^{3/2}}.$$
$$\underset{x}{\text{minimize}} \;\; \frac{1}{2} \| A x - b \|_2^2 + \frac{1}{2} \Bigl( \sum_{i=1}^{n} w_i \| x_i \|_2 \Bigr)^2,$$

$$\operatorname{Prox}_{g}^{\alpha}(x) = \arg\min_y \Bigl\{ \frac{1}{2} \| y - x \|_2^2 + \frac{\alpha}{2} \Bigl( \sum_{i=1}^{n} w_i \| y_i \|_2 \Bigr)^2 \Bigr\},$$

$$\underset{y}{\text{minimize}} \;\; \Bigl( \sum_{i=1}^{n} w_i \| y_i \|_2 \Bigr)^2 \quad \text{subject to} \quad \| y_i - x_i \|_2^2 \leq \epsilon_i^2, \;\; i \in \{1, \ldots, n\}.$$

$$\underset{y}{\text{minimize}} \;\; \sum_{i=1}^{n} w_i \| y_i \|_2 \quad \text{subject to} \quad \| y_i - x_i \|_2^2 \leq \epsilon_i^2, \;\; i \in \{1, \ldots, n\}.$$

$$\underset{y_i}{\text{minimize}} \;\; \| y_i \|_2 \quad \text{subject to} \quad \| y_i - x_i \|_2 \leq \epsilon_i,$$

$$\underset{\beta_1, \ldots, \beta_n}{\text{minimize}} \;\; \sum_{i=1}^{n} \frac{1}{2} (\beta_i - 1)^2 \| x_i \|_2^2 + \frac{\alpha}{2} \Bigl( \sum_{i=1}^{n} w_i \beta_i \| x_i \|_2 \Bigr)^2 \quad \text{subject to} \quad \beta_i \geq 0, \;\; i \in \{1, \ldots, n\}.$$
$$\underset{\gamma}{\text{minimize}} \;\; \gamma^\top \gamma - 2 \xi^\top \gamma + \alpha \, \gamma^\top w w^\top \gamma \quad \text{subject to} \quad \gamma \geq 0.$$

$$\hat{\gamma} = \hat{\xi} - s \hat{w}, \quad \text{where} \quad s = ( \hat{w}^\top \hat{\xi} ) / ( \alpha^{-1} + \hat{w}^\top \hat{w} ).$$

$$l(\hat{\gamma}) = \hat{\gamma}^\top \hat{\gamma} - 2 \hat{\xi}^\top \hat{\gamma} + \alpha \, \hat{\gamma}^\top \hat{w} \hat{w}^\top \hat{\gamma} = \hat{\gamma}^\top W \hat{\gamma} - 2 \hat{\xi}^\top \hat{\gamma}$$

$$= \bigl\| W^{1/2} \hat{\gamma} - W^{-1/2} \hat{\xi} \bigr\|_2^2 + \mathcal{O}(1),$$

$$\hat{\gamma} = \Bigl( I - \frac{\alpha \hat{w} \hat{w}^\top}{1 + \alpha \hat{w}^\top \hat{w}} \Bigr) \hat{\xi},$$
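The unconstrained minimizer above is a rank-one (Sherman–Morrison) update, which is cheap to sanity-check numerically. The variable names below mirror the equations; the data are random and purely illustrative:

```python
import numpy as np

rng = np.random.default_rng(0)
n, alpha = 5, 0.7
w = rng.random(n)               # weight vector (illustrative)
xi = rng.standard_normal(n)     # unconstrained target (illustrative)

# Rank-one closed form for gamma = (I + alpha w w^T)^{-1} xi:
#   gamma = xi - s w, with s = (w @ xi) / (1/alpha + w @ w)
s = (w @ xi) / (1.0 / alpha + w @ w)
gamma = xi - s * w
```

Multiplying gamma back through (I + α w wᵀ) recovers ξ, confirming the two forms of the solution agree.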
$$\gamma_i = \xi_i - s w_i, \quad i \in \mathcal{P}.$$

$$\xi_i - s w_i \geq 0, \quad i \in \mathcal{P}.$$

$$\hat{\gamma} + \alpha \hat{w} ( w^\top \gamma ) = \hat{\xi}, \qquad \alpha \bar{w} ( w^\top \gamma ) \geq \bar{\xi},$$

$$0 = ( I + \alpha w w^\top ) \gamma - \xi - u,$$

$$\xi_1 / w_1 \geq \xi_2 / w_2 \geq \cdots \geq \xi_n / w_n,$$
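With the ratios ξ_i / w_i sorted as above, the support of the nonnegativity-constrained minimizer is a prefix of the sorted index set, so it can be found by searching prefixes and applying the rank-one closed form on each candidate support. The sketch below is one such active-set implementation (assuming strictly positive weights w), not necessarily Algorithm 1 verbatim:

```python
import numpy as np

def nonneg_prox(xi, w, alpha):
    """Minimize gamma^T gamma - 2 xi^T gamma + alpha (w^T gamma)^2
    over gamma >= 0, assuming w > 0. The support is a prefix of the
    indices sorted by xi_i / w_i in decreasing order."""
    order = np.argsort(-(xi / w))
    gamma = np.zeros_like(xi, dtype=float)
    for k in range(len(xi), 0, -1):
        P = order[:k]                                     # candidate support
        s = (w[P] @ xi[P]) / (1.0 / alpha + w[P] @ w[P])  # rank-one closed form
        g = xi[P] - s * w[P]
        if np.all(g >= 0):                                # primal feasible prefix
            gamma[P] = g
            break
    return gamma

g = nonneg_prox(np.array([3.0, 1.0, -0.5]), np.ones(3), alpha=1.0)
print(g)  # [1.5 0.  0. ]
```

Optimality can be checked through the KKT conditions: the residual u = (I + α w wᵀ)γ − ξ must be nonnegative and complementary to γ.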