Abstract
We propose a new approach to robustly retrieve the exit wave of an extended sample from its coherent diffraction pattern by exploiting sparsity of the sample’s edges. This approach enables imaging of an extended sample with a single view, without ptychography. We introduce nonlinear optimization methods that promote sparsity, and we derive update rules to robustly recover the sample’s exit wave. We test these methods on simulated samples by varying the sparsity of the edge-detected representation of the exit wave. Our tests illustrate the strengths and limitations of the proposed method in imaging extended samples.
© 2016 Optical Society of America
1. Introduction
Coherent diffractive imaging (CDI) is a technique in widespread use at x-ray synchrotrons to image nanoscale materials. Traditional x-ray microscopy techniques, such as transmission x-ray microscopy (TXM), rely on condensing and objective optics. In contrast, CDI uses nonlinear optimization methods as a type of “numerical optic” to solve the phase problem, which arises because of an x-ray detector’s inability to measure a full complex-valued wavefield.
Under the projection approximation [1], a (complex-valued) exit wave ρ is formed when a plane wave x-ray beam interacts with a sample. We note that in our figures, the phase and magnitude of complex-valued quantities are represented in an HSV colorspace, with the hue and value/brightness denoting the phase and magnitude respectively, and with the saturation component set to maximum for all image pixels. As shown in Fig. 1, when an area detector is placed in the far field, the (real-valued) Fraunhofer diffraction intensity D is measured, where D is proportional to the squared modulus of the two-dimensional Fourier transform of ρ. In this paper, we study nonlinear optimization methods to recover the phase lost upon measurement.
An advantage of CDI over more traditional microscopy techniques is that the recovered image’s spatial resolution is determined by the highest spatial frequencies measured. These spatial frequencies are in turn determined primarily by incident x-ray fluence and the contrast in both the phase and magnitude of the sample’s complex-valued transmission function. Another advantage of CDI is that it can recover the full complex-valued transmission function (related to the complex-valued index of refraction of the sample). Methods such as TXM can recover only the modulus of the sample’s transmission function. Furthermore, development of phase retrieval algorithms for CDI has progressed to a point where high-quality images are produced in real time and at the same speed with which one can obtain images in a TXM (see, e.g., [2]).
Recovery of the phase typically starts with a guess ρ(0) for the exit wave; a nonlinear optimization algorithm then iteratively refines this guess using knowledge from the experiment performed and the physics of the sample. This knowledge is typically used to define constraints for a nonlinear, nonconvex optimization formulation of the phase retrieval problem. Examples of such knowledge include the measured diffraction, as well as constraints on the sample such as known compact support [3], sparsity [4], and ptychography [5–7].
Common CDI experimental configurations for collecting data include the small angle geometry shown in Fig. 1, as well as reflection and Bragg geometries [8,9]. With these configurations, one can use an x-ray probe with a spatial area (or volume in the reflection and Bragg cases) larger or smaller than the sample under investigation. When the x-ray probe is larger than the sample, a single diffraction pattern from a single view can be used with known support or sparsity [3, 4, 10–12]. This single-view approach is particularly advantageous when the x-ray probe will damage the sample, such as with an x-ray free-electron laser source [13]. Improvements in single-view algorithms can also readily be incorporated into ptychographic methods for more robust phase retrieval.
In this paper, we explore the use of single-view phase retrieval methods for an extended sample, with a primary motivation being the ability to overcome difficulties encountered in ptychography when attempting to image samples that significantly change during the collection of a ptychographic dataset, or when the long-term temporal and spatial experimental stability requirements of ptychography are not met. Single-view CDI phase retrieval methods for an extended sample have had very limited success in the literature [14] when compared with methods for an isolated sample. In Sec. 2, we identify the primary limitations of using a support defined from the approximate spatial extent of the illumination function when imaging an extended sample, and in Sec. 3 we discuss sparsity in an edge-detected representation of the exit wave. We then develop in Sec. 4 a phase retrieval algorithm that overcomes these limitations by performing phase retrieval on the edge-detected representation of an exit wave and by using ideas from compressive sensing [15]. The primary idea is to recover the sparsest possible edge-detected representation of the exit wave consistent with the diffraction measurement. This process is colloquially known as “shrinkwrap” in the experimental phase retrieval community [4] and more broadly known as “compressive phase retrieval” [11, 12]. The numerical tests in Sec. 5 on a range of simulated samples illustrate the benefits and limitations of the proposed methods. We test these methods on various simulated samples by varying the sparsity of the edge-detected representation of the exit wave. Our results demonstrate that our method can reliably recover the phase of extended samples in many cases where standard single-view phase retrieval methods fail. Nontrivial noise in the measured diffraction, however, requires use of an appropriate spatial frequency filtering procedure, and this is also discussed. In Sec. 6, we discuss further prospects and extensions of the proposed methodology.
2. Support defined from the illuminated area on the sample
In single-view phase retrieval of an extended sample, one obvious choice of sample constraint is to use the approximate illuminated area on the sample as the support on the exit wave we wish to recover. As an example, in Fig. 2(a) we simulate an x-ray probe by solving the Fresnel integral with a circular pinhole aperture as input. This type of probe is typical in small angle transmission geometry x-ray synchrotron experiments, where sample-to-circular-pinhole distances of between a millimeter and a centimeter are used. With this probe and a simulated sample transmission function defined by generating a Voronoi pattern (not shown), the exit wave ρtrue under the projection approximation [1] is shown in Fig. 2(b) (the red dashed line will be discussed shortly). The primary difficulty arising when attempting to use the approximate illuminated area on the sample as a support constraint is that the x-ray probe photon area density smoothly decays as one moves away from the center of the probe function. This smooth decay makes defining an effective support boundary ambiguous.
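To make the support-from-illumination discussion concrete, the following is a minimal numpy/scipy sketch of a shrinkwrap-style support update: blur the modulus of the current iterate, then keep only the ⌊MNκ⌋ largest pixels. The function name, the use of a real-space Gaussian blur (rather than the Fourier-space low-pass filter described in the text), and the default σ and κ values are illustrative assumptions, not the paper’s exact settings.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def shrinkwrap_support(rho, sigma=3.0, kappa=0.15):
    """Binary support mask from the current exit-wave iterate rho.

    Gaussian-blur |rho| (a real-space stand-in for the Fourier-space
    low-pass filter described in the text), then keep the
    floor(M*N*kappa) largest pixels so the mask has sparsity ratio kappa.
    """
    blurred = gaussian_filter(np.abs(rho), sigma)
    m, n = blurred.shape
    k = max(int(np.floor(m * n * kappa)), 1)
    # threshold at the k-th largest blurred value
    thresh = np.partition(blurred.ravel(), -k)[-k]
    return (blurred >= thresh).astype(float)
```

With a smoothly decaying probe, the mask produced this way depends strongly on the choice of κ, which is precisely the ambiguity in defining an effective support boundary.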
As an example, we use a support dynamically generated using the shrinkwrap method [4] as a constraint on the exit wave and the Fourier modulus measurement as the measurement constraint. We denote the current iterate of the exit wave as ρ(k), where k denotes the iteration index and the problem size is M × N = 512 × 512. We run 4,900 iterations of Fienup’s hybrid input-output method (HIO), then 100 iterations of error reduction (ER), and then a shrinkwrap update of the binary support mask, denoted by the symbol 𝕀𝒮. This sequence of HIO+ER+shrinkwrap is repeated 20 times. When updating the binary support mask in shrinkwrap, we Gaussian blur the current iterate of |ρ(k)| (using a Gaussian low-pass filter in Fourier space with standard deviation σ) to get a blurred modulus image. We then enforce a sparsity ratio of κ = 0.15 on this blurred image to define the binary support mask. The sparsity ratio is the fraction of nonzeros since
‖a‖0 = Σr 𝕀a(r)≠0(r),    (1)
where the indicator function used is defined as
𝕀a(r)≠0(r) = 1 if a(r) ≠ 0 and 0 otherwise,    (2)
for some a ∈ ℂM × N, and we index the spatial coordinate system of the sample by r ∈ {(rn, rm) : n = 0, . . . , N − 1, m = 0, . . . , M − 1}. To define the binary support mask with a particular enforced sparsity ratio κ, we determine the ⌊MNκ⌋th largest value of the blurred modulus image. The binary support mask is then created when we set to zero the elements of the blurred image below this value and set to one the elements above or equal to this value. A typical result for the support boundary using this recipe is shown by the dashed red line around the ground truth exit wave ρtrue in Fig. 2(b). A linecut of the ground truth exit wave ρtrue through the main diagonal is shown in Fig. 2(d), with the support boundary again shown as the dotted red lines. The support boundary shown in Fig. 2(d) potentially can zero out features of |ρtrue| below approximately 0.05|ρtrue|max. The results for the recovered exit wave ρ(k) for k = 10⁵ are shown in Fig. 2(c). The support generated using shrinkwrap is not constraining enough to accurately recover the ground truth exit wave shown in Fig. 2(b). The value of the standard error metric
E(k) = ‖ √D − |ρ̃(k)| ‖F / ‖ √D ‖F,    (3)
used to measure solution quality, as a function of iteration number k is shown in Fig. 2(e). There, we use the notation ρ̃ = ℱ [ρ] = |ρ̃| ⊙ eiϕ̃, where ℱ [·] is the two-dimensional discrete Fourier transform, ⊙ is the (componentwise) Hadamard product, and we index the Fourier space coordinate system of the measurement by q, so that ρ̃(q) = |ρ̃(q)| ⊙ eiϕ̃(q) for q ∈ {(qn, qm) : n = 0, . . . , N − 1, m = 0, . . . , M − 1}. If we enforce smaller sparsity ratios (i.e., κ < 0.15), the support generated is “too tight” [16], and the recovered ρ(k) is similarly unsatisfactory when compared with the ground truth exit wave in Fig. 2(b). Enforcing larger sparsity ratios (i.e., κ > 0.15) causes prompt algorithmic stagnation when monitoring Eq. (3), indicating convergence to a poor local solution [17] and again showing unsatisfactory results when compared with the ground truth exit wave in Fig. 2(b).

3. Sparse exit wave representations for phase retrieval
Mathematical transformations can be applied to represent a group of variables in a more meaningful and useful way. An example is the wavelet transformation of an image, where many images have only a few dominant wavelet components. Ignoring the nondominant components yields lossy compression of the image into a potentially small fraction of its original size in bytes. The information content of the image is sparse in the wavelet domain. The technique of “compressive sensing” attempts to incorporate and exploit this inherent sparsity. The basic idea is that by formulating optimization problems in terms of a simplified representation instead of the original representation and solving for an optimal sparse, simplified representation, one might obtain more robust results.
Here, we explore the use of edge detection as a sparsifying transformation of an exit wave ρ. This idea was originally explored in [11] for use in phase retrieval. For edge detection, we take the forward-difference arrays (each of size M × N = 512 × 512)
and convolve ρ with Ex and Ey. Since this edge detection uses forward differences, we refer to the resulting quantities as spatial derivatives and use the notation ∂xρ = Ex * ρ, ∂yρ = Ey * ρ. The convolution theorem shows
∂jρ = ℱ⁻¹[ Ẽj ⊙ ρ̃ ]    (5)
for j = x, y and with Ẽj = ℱ[Ej]. Figs. 3(a), 3(c), and 3(e) show the ground truth exit wave ρtrue used in Fig. 2(b) and its respective edge-detected representations. In Fig. 3(b) we again show the binary support mask 𝕀𝒮 generated by using shrinkwrap in Sec. 2 to define a support corresponding to the approximate illuminated area on the sample. In Figs. 3(d) and 3(f), we show the supports 𝕀𝒮x, 𝕀𝒮y that result from thresholding the modulus of the respective quantities in Figs. 3(c) and 3(e) to zero whenever they fall below 2% of the maximum value of the modulus:
𝕀𝒮j(r) = 1 if |∂jρ(r)| ≥ 0.02 maxr′ |∂jρ(r′)| and 0 otherwise.    (6)
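As a sketch of this sparsifying transformation, the forward differences can be applied through the convolution theorem with FFTs (so the differences are circular). The kernel placement and signs below are one possible convention for the forward-difference arrays of Eq. (4) and are assumptions, not the paper’s exact definition.

```python
import numpy as np

def forward_diff_kernels(m, n):
    """M x N forward-difference arrays Ex, Ey (one convention; the exact
    placement/sign in the paper's Eq. (4) may differ)."""
    ex = np.zeros((m, n)); ex[0, 0] = -1.0; ex[0, -1] = 1.0
    ey = np.zeros((m, n)); ey[0, 0] = -1.0; ey[-1, 0] = 1.0
    return ex, ey

def edge_detect(rho):
    """Edge-detected representations via the convolution theorem:
    d_j rho = F^{-1}[ F[E_j] * F[rho] ] (componentwise product)."""
    m, n = rho.shape
    ex, ey = forward_diff_kernels(m, n)
    rho_t = np.fft.fft2(rho)
    dx = np.fft.ifft2(np.fft.fft2(ex) * rho_t)
    dy = np.fft.ifft2(np.fft.fft2(ey) * rho_t)
    return dx, dy
```

For a piecewise-constant exit wave, most entries of the returned arrays are near zero, which is exactly the sparsity this section exploits.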
From Fig. 3, we can clearly see that the ∂xρ and ∂yρ quantities have far sparser supports when compared with ρtrue. We explore this property in the remainder of this paper and show that by performing phase retrieval to solve for the quantities ∂xρ and ∂yρ with relatively sparse supports in place of solving for the exit wave ρ with a nonsparse support, we can robustly recover the exit wave for extended samples in many cases.

4. Solving for the sparse representation of the exit wave
The alternating direction method of multipliers (ADMM) attempts to solve optimization problems by breaking them into smaller, more manageable subproblems [18]. It has the advantages that the objective function used in the optimization problem need not be differentiable and the method is simple to implement and parallelize. The convergence properties of this method are difficult to analyze for nonconvex problems, however, and more specialized algorithms can outperform it. We begin a derivation of ADMM update rules for obtaining the sparse representations ∂jρ of the exit wave ρ by considering the constrained, nonconvex optimization problem
minρ ( ‖∂xρ‖ℓ + ‖∂yρ‖ℓ ) subject to ρ ∈ ℳ,    (7)
where ℓ ∈ {0, 1} and we define the Fourier modulus measurement constraint set ℳ = {ρ ∈ ℂM × N : |ℱ[ρ]| = √D} and its associated projection operator Πℳ with ρ̃ = ℱ [ρ] = |ρ̃| ⊙ eiϕ̃. The ‖ · ‖0 norm is defined in Eq. (1), and for any a ∈ ℂM × N we define the ‖ · ‖1 norm by ‖a‖1 = Σr |a(r)|. The ℓ ∈ {0, 1} norms are well-known sparsity-inducing norms, with the ‖ · ‖1 norm frequently used as a convex and continuous relaxation of the ‖ · ‖0 norm [19]. The optimization problem in (7) seeks the ρ satisfying the Fourier modulus measurement with the sparsest possible ∂jρ. The constrained augmented Lagrangian function for Eq. (7) is
where τj ≥ 0 are parameters that control the competition between the constraints uj = ∂jρ and ρ ∈ ℳ and finding the sparsest possible uj. The λj ∈ ℂM × N are Lagrange multipliers (with an overbar denoting the complex conjugate), ũj = ℱ [uj], λ̃j = ℱ [λj], and ρ̃ = ℱ[ρ]. We use Plancherel’s theorem for the discrete Fourier transform when going from Eq. (11a) to Eq. (11b) on the Frobenius norm ‖a‖F = ( Σr |a(r)|² )^(1/2) for any a ∈ ℂM × N and with ã(q) = ℱ [a(r)]. Provided that ℓ ∈ {0, 1} is fixed and given initializations for ρ(0), uj(0), and λj(0) for j ∈ {x, y}, the kth iteration of the ADMM-based method can be written as
with βj being a damping term for the multiplier updates (typically we set βj = 1 for both j ∈ {x, y}). The update rules for uj(k+1) and ρ(k+1) are derived in Secs. 4.1 and 4.2, respectively. The function ψj (·) is defined in Sec. 4.3 and allows for effective iterative determination of the parameters τj. Note that we cannot minimize Eq. (11) to update the τj because (τx, τy) = (0, 0) is the global minimizer of Eq. (11) when holding the other quantities constant, which is not useful for our purposes. Thus, we introduce the additional auxiliary function ψj (·) for determining the optimal values of the parameters τj.

4.1. uj(k+1) update
The least-squares optimization subproblem with the sparsity-promoting ℓ ∈ {0, 1}-norm regularization is
minuj∈ℂM×N [ τj ‖uj‖ℓ + ‖uj − b‖F² ],    (14)
which is equivalent to minimizing ℒj in Eq. (11a) over uj when we set b = ∂jρ − λj and we drop the ‖λj‖F constant. The solution to Eq. (14) can be expressed in terms of proximal mappings [20–24]. If using ℓ = 1, this mapping is the soft-thresholding operator
𝒮μ(b) = sgn(b) ⊙ max(|b| − μ, 0),    (15)
where we use the notation of the complex signum function sgn(b) = eiγ for b = |b| ⊙ eiγ when b ≠ 0 and sgn(0) = 0 with γ ∈ ℝM × N. If using ℓ = 0, this mapping is the hard-thresholding operator
ℋμ(b) = b ⊙ 𝕀|b|>μ.    (16)
In both cases, we have an indicator function of the form 𝕀|b|>μ for a specified μ ≥ 0. We thus use the update rule in Eq. (13b). In addition to balancing the competition between fitting the measurement and promoting sparsity, we can interpret the τj as thresholding parameters.

4.2. ρ(k+1) update
An update for ρ(k+1) can be found by taking the Wirtinger (complex-valued; see, e.g., [25]) derivative of the ρ̃-dependent term in Eq. (11b) with respect to ρ̃ and setting it to zero:
Σj∈{x,y} conj(Ẽj) ⊙ ( Ẽj ⊙ ρ̃ − ũj − λ̃j ) = 0.
Solving for ρ̃, we obtain
ρ̃ = [ conj(Ẽx) ⊙ (ũx + λ̃x) + conj(Ẽy) ⊙ (ũy + λ̃y) ] / ( |Ẽy|² + |Ẽx|² ),    (20)
where the division by |Ẽy|² + |Ẽx|² is componentwise. To satisfy the constraint in Eq. (13c) that ρ ∈ ℳ, we then project Eq. (20) onto the constraint set ℳ:
ρ(k+1) = Πℳ [ ℱ⁻¹[ρ̃] ],    (21)
which we use for the update in Eq. (13c).

4.3. τj(k+1) update
The choice of thresholding parameters (τx, τy) is critical to the success of our method. We determine the τj parameters by solving Eq. (13a), where we introduce the cost function
The two spatial terms are then combined to define
ψ(k)(τx, τy) = ψx(k)(τx) + ψy(k)(τy).    (23)
The weighting parameter controls the relative scaling of the first term in Eq. (22) with the second and is defined by Eq. (24), where ς ≥ 0 is a parameter balancing the sparsity of the solution with its Fourier modulus measurement agreement. If ς is too large, then the sparsity-inducing term can dominate in Eq. (22), resulting in a solution that has a large mismatch with the measurement term. If ς is too small, we are allowing nonsparse solutions, which can quickly cause convergence to a poor local solution similar to that seen in Sec. 2 when using the probe diameter as a support region. The choice of ς values for our numerical experiments is discussed in Sec. 5. We can numerically evaluate Eq. (23) as a function of (τx, τy) to gain intuition on what is necessary for finding a minimizer of Eq. (23). An example is shown in Fig. 4, where ℓ = 0 is used; similar results are obtained if ℓ = 1 is used. At first glance, the metric ψ(k)(τx, τy) appears to have a well-behaved landscape with a single global minimum, as seen in Fig. 4(a). By zooming in on a smaller field of view in Fig. 4(b) (which is enclosed by the boundaries of the red box in Fig. 4(a)), however, we begin to see multiple local minima. This discontinuous and nonsmooth behavior of the derivatives of ψ(k)(τx, τy) is due to the nonsmooth nature of the thresholding operators in Eq. (15) and Eq. (16).
As a result of this nonsmooth topology with multiple local minima, if we attempt to use a gradient descent method, we can expect to quickly become stuck in one of these local solutions. One way around this problem is to use finite differences to approximate the derivatives with respect to (τx, τy) and ensure that the finite-difference steps are large enough to smooth over these spurious local solutions [26]. The larger the finite-difference steps, however, the less accurate is the approximate global minimizer that can be obtained. Consequently, instead of using a gradient descent method, we use a “polling step”-like method whereby we simply test some thresholding parameter trial values around the vicinity of the current in the objective function ψ(k)(τx, τy):
(h*x, h*y) = argmin over (hx, hy) of ψ(k)(τx(k) + hx, τy(k) + hy),    (25)
where any ties are broken lexicographically (i.e., the first tied index in the order in which the trial values of hj are tested). The trial values h±nj, for j ∈ {x, y} and n ∈ ℤ+, can be chosen by picking a range of allowable changes in sparsity ratio around the current sparsity ratios κj. We then can compute—by sorting the MN values of the array being thresholded and setting τj to be the ⌊MNκj⌋th largest value—the thresholding parameters corresponding to these allowable changes in sparsity ratios to get the h±nj values, and we solve Eq. (25) by direct evaluation of Eq. (23) using these h±nj. We found by experience that we need not update the thresholding parameters τj with the same frequency at which uj, ρ, and λj are updated. If we attempt to update τj as frequently as the main iterates, we usually find that a polling step of hj = 0 is the minimizer of ψ(k)(τx, τy) in both x and y. Therefore we introduce a parameter L that controls the frequency at which τj is updated to arrive at the update rule:
τj(k+1) = τj(k) + h*j if mod(k + 1, L) = 0, and τj(k+1) = τj(k) otherwise,    (26)
with the value h*j computed in Eq. (25).

5. Benchmarking
Given initializations for ρ(0), uj(0), λj(0), and τj(0), our complete algorithm (Eq. (27)) finds the sparsest possible uj = ∂jρ subject to the constraint ρ ∈ ℳ.
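The iteration can be sketched in numpy as follows. This is a minimal sketch, not the paper’s implementation: it assumes ℓ = 0 (hard thresholding), a fixed threshold τ in place of the ψ-based polling of Sec. 4.3, β = 1, the scaled-multiplier convention b = ∂jρ − λj, and one particular forward-difference kernel convention; all names are illustrative.

```python
import numpy as np

def admm_phase_retrieval(sqrt_D, n_iter=200, tau=0.05, beta=1.0, seed=0):
    """Minimal ADMM sketch of the Eq. (27)-style iteration.
    sqrt_D is the measured Fourier modulus (square root of the diffraction)."""
    m, n = sqrt_D.shape
    rng = np.random.default_rng(seed)
    rho = rng.random((m, n)) + 1j * rng.random((m, n))   # random start ρ(0)
    # forward-difference kernels and their transforms (one convention)
    ex = np.zeros((m, n)); ex[0, 0] = -1; ex[0, -1] = 1
    ey = np.zeros((m, n)); ey[0, 0] = -1; ey[-1, 0] = 1
    ex_t, ey_t = np.fft.fft2(ex), np.fft.fft2(ey)
    den = np.maximum(np.abs(ex_t) ** 2 + np.abs(ey_t) ** 2, 1e-12)  # guard DC zero
    lam_x = np.zeros((m, n), complex); lam_y = np.zeros((m, n), complex)
    for _ in range(n_iter):
        rho_t = np.fft.fft2(rho)
        dx = np.fft.ifft2(ex_t * rho_t); dy = np.fft.ifft2(ey_t * rho_t)
        # u-update: hard-threshold d_j rho - lambda_j
        bx, by = dx - lam_x, dy - lam_y
        u_x = np.where(np.abs(bx) >= tau, bx, 0)
        u_y = np.where(np.abs(by) >= tau, by, 0)
        # rho-update: least squares in Fourier space, then project onto M
        num = (np.conj(ex_t) * np.fft.fft2(u_x + lam_x)
               + np.conj(ey_t) * np.fft.fft2(u_y + lam_y))
        rho_t = num / den
        rho = np.fft.ifft2(sqrt_D * np.exp(1j * np.angle(rho_t)))
        # multiplier update with damping beta
        rho_t = np.fft.fft2(rho)
        dx = np.fft.ifft2(ex_t * rho_t); dy = np.fft.ifft2(ey_t * rho_t)
        lam_x = lam_x + beta * (u_x - dx)
        lam_y = lam_y + beta * (u_y - dy)
    return rho
```

In practice the thresholds τj would be updated every L iterations via the polling procedure of Sec. 4.3, and the sparsity ratios ramped as described in Sec. 5.1.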
We test this algorithm on 14 problems with varying sparsity ratios for the supports 𝕀𝒮j (as defined in Eq. (6)) of the edge-detected representations ∂jρ, with j ∈ {x, y}. The test problems have sparsity ratios for the (assumed unknown) 𝕀𝒮j ranging from approximately 0.005 to 0.037, as shown in Fig. 5(a); some representative 𝕀𝒮j are shown in Figs. 5(e), 5(f), and 5(g). The simulated ground truth exit waves corresponding to these sparsity ratios for the various 𝕀𝒮j are defined by using the probe function seen in Fig. 2(a) and from a set of sample transmission functions generated by using Voronoi patterns with varying polygon centroid density in an M × N = 512 × 512 array size. The contrast of the modulus of the sample transmission functions is allowed to range between 0.1 and 0.9, while the phase can range between ±π. Examples of these exit waves corresponding to the supports in Figs. 5(e), 5(f), and 5(g), respectively, are shown in Figs. 5(b), 5(c), and 5(d). By defining synthetic test problems in this way, we can systematically control the sparsity ratios for the 𝕀𝒮j in order to test performance.

The choice of ℓ ∈ {0, 1} is application, problem, and algorithm dependent. Therefore, in Sec. 5.1 we discuss our choice of preferred ℓ. Then in Sec. 5.2 we provide numerical results from the benchmark problems with ℓ = 0. Section 5.3 discusses the effects of noise in the diffraction on our method.
5.1. Use of ℓ = 0 vs. ℓ = 1
We examine the differences in performance of our ADMM method (Eq. (27)) when faced with choosing between ℓ = 0 or ℓ = 1 as sparsity-inducing metrics. The ℓ = 1 regularization choice is widely used as a stand-in for ℓ = 0 regularization because using ℓ = 0 introduces difficulties in analyzing convergence of algorithms derived from its use. Some studies [27–29], however, have noted difficulties when using ℓ = 1 with sparse image restoration and recovery. To look at the performance of the update rules given in Eqs. (27b)–(27d), we attempt to recover the simulated exit wave shown in Fig. 3(a) by sampling over a mesh of fixed sparsity ratios (κx, κy) for some number of total iterations K and determine the final error by using the metric in Eq. (3). We do this procedure for both ℓ = 0 and ℓ = 1, meaning we use either the soft-thresholding operator given in Eq. (15) or the hard-thresholding operator given in Eq. (16) to update the uj in Eq. (27b).
In Fig. 6, we show the resulting final error using the updates in Eqs. (27b)–(27d) for both ℓ = 0 and ℓ = 1 over a fixed mesh for (κx, κy). For both, a 21 × 21 (equally spaced in both parameters) box is used, ranging from (κx, κy) = (0.005, 0.005) to (0.255, 0.255) for ℓ = 0 (hard thresholding) and from (κx, κy) = (0.005, 0.005) to (0.505, 0.505) for ℓ = 1 (soft thresholding). For each of the 441 (κx, κy) pairs, K = 4 × 10⁴ such iterations are performed. After these K iterations, we start from the prescribed (κx, κy) values and linearly ramp up (κx, κy) over the next 10⁴ iterations to a final sparsity ratio of (κx, κy)final = (0.255, 0.255) (hard thresholding) or (κx, κy)final = (0.505, 0.505) (soft thresholding). For the extreme values of this box in both cases, no ramping is performed. The purpose of this ramping up of sparsity ratios is to allow the binary-valued support regions, generated by the thresholding operations on the uj, to grow larger and larger (less sparse), which enables the ADMM method to converge to the nearest stationary point defined by Eq. (11). In our experience, we observed that the solution quality, when compared with the measurement as well as the ground truth exit wave ρtrue, is worse if the ramping is not performed; that is, a larger value is obtained when using the error metric given by Eq. (3) as well as when using a direct error metric against ρtrue, after subpixel image registration and global phase offset correction on ρ are performed.
Figs. 6(a) and 6(b) show that the use of hard thresholding is much more effective (has much smaller residual error) compared with the use of soft thresholding in the ADMM method in Eqs. (27b)–(27d). In Fig. 6(a) the minimum indicated by a red circle and cross corresponds to a sparsity ratio pair of (κx, κy) = (0.0175, 0.03). The recovered exit wave after ramping up to (κx, κy)final = (0.255, 0.255) is shown in Fig. 6(c) and is — apart from 180° rotation symmetry, complex conjugation, and global phase offset on ρ — closely matched to the ground truth exit wave ρtrue shown in Fig. 3(a). A typical exit wave reconstruction using soft thresholding is shown in Fig. 6(d), which corresponds to the blue circle/cross at (κx, κy) = (0.105, 0.105); here the use of soft thresholding appears to cause the exit wave to be dominated by relatively few image pixels, which have much larger moduli values than all other image pixels. Similar effects have been reported in wavelet-based image restoration for other highly ill-posed problems when using ℓ = 1 sparsity-inducing regularization [27]. Other studies have shown that using ℓ = 0 regularization leads to better edge preservation in image restoration (exactly the features we want to take advantage of) when compared with ℓ = 1 regularization [28, 29]. The minimum of Fig. 6(b) is located at the magenta circle/cross at (κx, κy) = (0.505, 0.505), and the corresponding exit wave reconstruction is shown in Fig. 6(e); here the sparsity ratio appears to be too large (too weakly constraining), causing quick stagnation to a local solution in a fashion similar to that seen when using the approximate beam diameter as a support in Sec. 2.
As a result of this clear superiority of hard thresholding when used with Eq. (27b) compared with the use of soft thresholding, for the remainder of this paper we will consider the use only of hard thresholding. We note that other algorithms exist that successfully use soft thresholding to promote sparsity [24, 30, 31]. We further note as a reminder that the use of soft thresholding comes from ℓ = 1 regularization as a sparsity-inducing term and that its use is as a convex and continuous relaxation of ℓ = 0 regularization in order to make the problem simpler to analyze; both regularization terms are effective at promoting sparsity to varying degrees depending on the particulars of the phase retrieval algorithm used and type of sparsifying transformation. We believe that the use of weighted ℓ = 1 regularization methods [19] should be explored; these methods use prior iterations of the exit wave ρ and a modified ℓ = 1 regularization term that can closely approximate the ℓ = 0 regularization term. These weighted ℓ = 1 regularization methods also retain the ease of analytic computations, continuity, and convexity. To address the use of ℓ = 0 versus ℓ = 1 regularization to promote sparsity in the phase retrieval problem, researchers need to carry out further systematic benchmarking and comparison of these methods in order to compare relative effectiveness at successfully recovering the exit wave and their sparse edge-detected representations, but these activities are beyond the scope of this paper.
5.2. Benchmark results for ℓ = 0
We now provide full benchmark results on our test problems with ℓ = 0. We use ς ≃ 0.1 when using Eq. (24). Its value was determined by numerical experimentation and works well with the test problems we consider in this section. For other synthetic test problems or if attempting phase retrieval on experimental data, other values for ς may perform better, and further exploratory tuning of this parameter should be performed.
We run 50 independent trials, each with a different M × N complex-valued uniform random number array for ρ(0), for each of the 14 synthetic test problems with varying polygon density. We initialize λ(0) to be the M × N array of zeros, and we use initial sparsity ratios (κx(0), κy(0)), with initial thresholding parameters (τx(0), τy(0)) computed from the initial (κx(0), κy(0)) and ∂jρ(0). We update the thresholding parameters τj every L = 50 iterations and use 5 polling steps in both x and y with a fixed range of allowable sparsity ratio changes. We allow K0 = 2 × 10⁴ iterations of Eq. (27) to be performed, after which we cease using Eq. (27a) to update the thresholding parameters τj. Beginning from the current (κx, κy), we ramp up the sparsity ratios to their final values in the same manner as that discussed in Sec. 5.1, with Kf = 2.5 × 10⁴. Results are shown in Fig. 7(a), where for each of the 50 trials for each test problem polygon density, we sort the final values of the error metric in descending order. We have also corrected the recovered ρ(Kf) for subpixel shift and global phase offsets when compared with ρtrue when computing this metric.
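The threshold-selection step used throughout (map a trial sparsity ratio κ to the ⌊MNκ⌋th largest value of the array being thresholded, then poll an objective ψ over a small set of trial ratios) might be sketched as follows. The function names, the trial offsets, and the generic `psi` callable are illustrative assumptions.

```python
import numpy as np

def tau_from_kappa(b_mod, kappa):
    """Threshold enforcing sparsity ratio kappa: the floor(M*N*kappa)-th
    largest value of the (modulus) array b_mod."""
    k = max(int(np.floor(b_mod.size * kappa)), 1)
    return np.partition(b_mod.ravel(), -k)[-k]

def poll_tau(psi, kappa_x, kappa_y, b_x, b_y,
             offsets=(-0.02, -0.01, 0.0, 0.01, 0.02)):
    """Polling step: evaluate psi(tau_x, tau_y) on thresholds derived from
    perturbed sparsity ratios; strict '<' keeps the earliest tied trial."""
    best = None
    for dx in offsets:
        for dy in offsets:
            kx = float(np.clip(kappa_x + dx, 1e-4, 1.0))
            ky = float(np.clip(kappa_y + dy, 1e-4, 1.0))
            tx = tau_from_kappa(np.abs(b_x), kx)
            ty = tau_from_kappa(np.abs(b_y), ky)
            val = psi(tx, ty)
            if best is None or val < best[0]:
                best = (val, tx, ty, kx, ky)
    return best  # (psi value, tau_x, tau_y, kappa_x, kappa_y)
```

Because the polling step only compares a finite set of trial thresholds, it steps over the spurious local minima of ψ seen in Fig. 4 rather than descending into them.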
The results of these benchmarks give us some insight into the performance of Eq. (27) over a range of synthetic test problems of varying sparsity upon the forward-difference edge detection in Eq. (4). Fig. 7(a) shows three regions of notable and differing behavior in the final error. The first is an “ambiguous” region, where results are inconclusive and are between success and failure to varying degrees. The reason for the difficulty in this region lies in our attempt to use edge detection to reconstruct very smoothly varying features (discussed further below). The second and third regions correspond to where, for a clear majority of starting points, we unambiguously recover or fail to adequately recover an exit wave quantitatively and qualitatively close to the corresponding ρtrue. (The criteria used to determine what “adequate” means are discussed below.)
The “ambiguous” region, which we denote by RA, corresponds to polygon densities p1 ∈ {16, 128, 256, 512} and to trials whose final errors fall in an intermediate range. For p1 = 16, the ground truth exit wave is shown in Fig. 7(b); a representative reconstruction is shown in Fig. 7(c). The reconstruction clearly has recovered the locations of the edges correctly, albeit with some numerical artifact degradation, but has not successfully recovered the smoothly varying regions of the exit wave, which look like scaled and phase-shifted regions of the probe function (again shown in Fig. 2(a)). For p1 = 128, the ground truth exit wave is shown in Fig. 7(e), and a representative reconstruction is shown in Fig. 7(f). We see that edges are again faithfully reconstructed and that smoothly varying features of the exit wave, which look like scaled-down/phase-shifted regions of the probe function, are again degraded. However, the degradation in these smoothly varying regions in Fig. 7(f) is not nearly as severe when compared with the p1 = 16 case shown in Fig. 7(c), and only finer details of the probe are degraded.
We conclude that forward differencing is not a proper sparsifying transformation for the smooth features encountered in the test problems in the “ambiguous” region RA, and the inability of Eq. (27) to recover these smoothly varying features using edge detection is expected. The reason is that the forward-difference edge-detected representation of the smoothly varying regions gets set to zero in the thresholding step of Eq. (27b), as these regions will likely fall below the current thresholding parameter values τj, thereby destroying any information in these regions. To recover smoothly varying regions of the exit wave far from the edges that the forward-difference edge-detection scheme was designed to recover effectively, one likely needs to perform iterative separation of the sample transmission function and the probe, as is done in ptychography [6, 32], and to introduce additional information (constraints) into the optimization problem. For example, we can further constrain the sample transmission function by using additional thresholding to promote sparsity in some edge-detected representations of the sample transmission function (in addition to the sparsity promotion using forward differences of the exit wave used here), as well as by using index of refraction limits [33]. We can also further constrain the reconstruction of the probe function by using a white-field (sample-out) measurement, which is the modulus squared of the probe function numerically propagated to the area detector. Other regularization schemes and algorithmic modifications can be made as well, such as total variation denoising on the ∂jρ (using second differences, that is, curvature sparsity information), which promotes smoothness in an image while simultaneously maintaining edges [34].
The second region of interest, RS, corresponds to Voronoi polygon densities 256 ≤ p2 ≤ 12288. This “dark bluish to black” region in Fig. 7(a) is one where we consider recovery of the exit wave successful (example reconstructions not shown) and robust, in that virtually all starting points yield an accurate solution. One qualitative criterion for success is the following: after subpixel image registration and global phase-offset correction, the recovered exit waves for particular p2 values are virtually indistinguishable from the known ground truth exit waves upon visual inspection. A more quantitative criterion is the phase retrieval transfer function (PRTF) [35, 36]. The PRTF is the ratio of the azimuthally averaged modulus of the Fourier transform of the reconstructed exit wave to the azimuthally averaged square root of the diffraction measurement; it quantifies where the spatial frequency content of an exit wave reconstructed using phase retrieval is no longer reliably recovered. For a particular p2, we compute a single averaged reconstruction from the multiple trial runs (to average out remnant numerical artifacts from any particular run). All PRTFs computed in this way (plots not shown) lie between ≃ 0.7 and 1 at all spatial frequencies. In the literature, spatial frequency regions where the PRTF falls below ≃ 0.5 are typically considered unreliably recovered [35, 36]. Since the PRTFs here are well above this cutoff at all spatial frequencies, we can be confident these reconstructions are successful.
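The PRTF criterion can be sketched as follows (a minimal illustration, not the authors’ code; the one-pixel-wide annular binning about the array center and the helper names are our own assumptions):

```python
import numpy as np

def azavg(img):
    """Azimuthal average over one-pixel-wide annuli about the array center."""
    M, N = img.shape
    qm = np.arange(M) - M // 2
    qn = np.arange(N) - N // 2
    qr = np.hypot(*np.meshgrid(qm, qn, indexing="ij"))
    b = np.round(qr).astype(int).ravel()   # one-pixel-wide annular bins
    return np.bincount(b, weights=img.ravel()) / np.bincount(b)

def prtf(rho_rec, D):
    """PRTF: azimuthally averaged reconstructed Fourier modulus divided
    by the azimuthally averaged square root of the measured intensity."""
    F = np.fft.fftshift(np.fft.fft2(rho_rec))
    return azavg(np.abs(F)) / azavg(np.sqrt(D))

# A perfect reconstruction gives PRTF = 1 at every radial spatial frequency.
rng = np.random.default_rng(0)
rho = rng.random((32, 32)) + 1j * rng.random((32, 32))
D = np.abs(np.fft.fftshift(np.fft.fft2(rho))) ** 2
p = prtf(rho, D)
```

Values near 1 at all radial frequencies, as reported above, indicate reliably recovered content; regions falling below ≃ 0.5 would be flagged as unreliable.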
The third region of interest, RF, in Fig. 7(a) corresponds to exit waves that we consider unsuccessfully recovered by both criteria: visual comparison of the reconstructions to the ground truth exit waves, and the PRTF. These are Voronoi polygon densities p1 ∈ {16, 128, 256, 512} and 4096 ≤ p3. A result for p1 = 16, shown in Fig. 7(d), indicates that the reconstruction has very roughly recovered the correct locations of the edges but suffers greatly from other numerical artifacts common to phase retrieval, due to translational Fourier transform symmetries [17], and fails to recover any smoothly varying features of the probe function. A result for p1 = 128 is shown in Fig. 7(g); here, edge locations are adequately if imperfectly recovered, but smoothly varying features are unacceptably degraded compared with both the ground truth exit wave in Fig. 7(e) and the “borderline” successfully recovered exit wave in Fig. 7(f).
The almost exclusively “dark reddish” region in Fig. 7(a) for the p3 values listed forms a sharp boundary with the RS region discussed above, in contrast to the more gradual boundary that RS shares with RA. The RF region also differs from the unsuccessfully recovered p1 values in RA in that no recognizable edges are recovered, and the reconstructions look poor (similar to Fig. 2(c)). The reconstructions in the RF region indicate that the support implicitly generated while evaluating Eq. (27b) behaves as if it is no longer sparse enough; insufficient sparsity was likewise the cause of failure when using the approximate probe function diameter as the support (discussed in Sec. 2). Note, however, that for the p3 values in the RF region there remains some probability of a successful reconstruction, ranging from 4% for p3 = 12288 up to 96% for p3 = 4096. The nature of this boundary between “sparse enough” and “not sparse enough,” which determines probabilistic reconstruction success using Eq. (27), remains unclear and requires further study.
5.3. Noise, edge detection, and sparsity
We next discuss the effects of noise on sparsity and on the edge detection methods used in this paper. As before, we denote the noise-free diffraction intensity by D and the diffraction intensity with Poisson noise by Dn. The noisy measurement set ℳn and projection operator Πℳn are defined as in Eq. (8) and Eq. (9), with Dn in place of D. In Fig. 8(a), we compute the azimuthal average of the diffraction intensities using

$$\bar{D}(q_r) = \frac{1}{N_{q_r}} \sum_{q_m, q_n} A(q_r, q_\theta)\, D(q_m, q_n)$$

for each radial spatial frequency $q_r$, where $(q_r, q_\theta)$ are spatial frequency polar coordinates with $q_r = \sqrt{q_m^2 + q_n^2}$ and $q_\theta = \tan^{-1}(q_m/q_n)$, $A(q_r, q_\theta)$ is an M × N binary mask with ones within the annulus of width one pixel at radius $q_r$ and zeros elsewhere, and $N_{q_r}$ is the number of pixels within the annulus defined by the constant radial spatial frequency $q_r$, which in the continuous limit is the circumference $2\pi q_r$ of the circle. The azimuthal average of the noise-free diffraction intensity D is shown in Fig. 8(a) as the black curve, while the azimuthal average of the diffraction intensity with Poisson noise Dn is shown by the green curve; the signal at high spatial frequencies becomes lost in the noise at approximately qr ≃ 10²Δqr, where Δqr is the radial spatial frequency pixel size in units of inverse length.

To best understand the effect noise has on edge detection using the forward-difference convolution matrices in Eq. (4), we view edge detection as a spatial frequency filter in Fourier space. In Fig. 8(c) we show |ℱ[Ex]| = |Ẽx|; viewed in this way, forward-difference edge detection in the x direction is a spatial high-pass filter that preferentially emphasizes high spatial frequencies in the x direction while attenuating spatial frequency content in the y direction. The corresponding azimuthal average is shown as the black curve in Fig. 8(b). An example of a support generated by hard thresholding with an enforced sparsity ratio of 5% for the noise-free case is shown in Fig. 8(e), while the support generated for the noisy case, also with an enforced sparsity ratio of 5%, is shown in Fig. 8(f). The noise-degraded regions at high spatial frequencies (where the true diffraction intensity signal is essentially lost in the noise) are amplified, giving a highly noncompact support region that in turn looks noisy, with image pixels in the support randomly distributed and inconsistent with the known ground truth support shown in Fig. 8(e).
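Support generation by hard thresholding of an edge-detected representation can be sketched as follows (a toy illustration with an invented test image; the helper name and the sparsity ratio used here are ours, chosen so that the kept pixels match the toy’s edge count):

```python
import numpy as np

def sparse_support(arr, ratio):
    """Binary support keeping the `ratio` largest-modulus pixels (ties kept)."""
    k = max(1, int(ratio * arr.size))
    thresh = np.partition(np.abs(arr).ravel(), -k)[-k]
    return np.abs(arr) >= thresh

# Toy exit wave: a filled square on a flat background.
rho = np.zeros((64, 64))
rho[20:44, 20:44] = 1.0

# Forward-difference edge detection along x as a circular shift-and-subtract.
d_x = np.roll(rho, -1, axis=1) - rho

# Threshold to an enforced sparsity ratio; the support concentrates on
# the two vertical edges of the square.
S = sparse_support(d_x, ratio=0.01)
```

With a noise-free image the kept pixels coincide with the true edges; with noisy data, as discussed above, spurious high-frequency content can instead dominate the thresholding and scatter the support.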
To alleviate this amplification of noise-degraded regions at high spatial frequencies, we can use other types of edge detection besides forward differences. One well-known edge detection method with noise-suppressing properties is the Sobel filter
$$E'_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} = \mathbf{u}\,\mathbf{v}^{T},$$

where $\mathbf{u} = [1, 2, 1]^{T}$, $\mathbf{v} = [-1, 0, 1]^{T}$, and $\cdot^{T}$ is the transpose. The behavior of E′x in Fourier space is shown in Fig. 8(d), where |ℱ[E′x]| = |Ẽ′x|. Viewed in Fourier space, the Sobel filter acts as a band-pass filter, attenuating both low and high spatial frequencies (which are degraded by the noise) in the y direction. The corresponding azimuthal average is shown as the green curve in Fig. 8(b); the attenuation of the noise-degraded regions at high spatial frequencies qr ≥ 10²Δqr is apparent. An example of a support generated by hard thresholding with an enforced sparsity ratio of 5% for the noisy case is shown in Fig. 8(g). The support generated using a Sobel filter for edge detection is much more compact and retains features of the known ground truth in Fig. 8(e) better than the support obtained with forward differences.

We conclude that when noise is present in a diffraction measurement and significantly degrades particular spatial frequency content of the measurement, care must be taken to choose an appropriate edge detection method to promote sparsity. The algorithm in Eq. (27) can readily be modified to threshold a sparse representation of the exit wave ρ obtained with any type of edge detection, for example explicitly defined Fourier-space filters like those discussed here, as well as more sophisticated methods that combine image denoising techniques with wavelet [37], shearlet [38], or curvelet [39] transforms of the exit wave ρ. Further benchmarking with these other edge detection methods, promoting sparsity through thresholding both the modulus of the sparse representation of ρ and its phase [30], should be explored but is beyond the scope of this paper.
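The Fourier-space contrast between forward differences and the Sobel filter can be checked numerically (a sketch, not the paper’s implementation; the array-axis convention, with axis 0 as y and axis 1 as x, is our own):

```python
import numpy as np

def transfer_modulus(kernel, N=64):
    """|F[E]| for a small convolution kernel embedded in an N x N array."""
    pad = np.zeros((N, N))
    kh, kw = kernel.shape
    pad[:kh, :kw] = kernel
    return np.abs(np.fft.fftshift(np.fft.fft2(pad)))

Ex = np.array([[-1.0, 1.0]])            # forward difference along x
u = np.array([[1.0], [2.0], [1.0]])     # smoothing along y
v = np.array([[-1.0, 0.0, 1.0]])        # central difference along x
Ex_sobel = u @ v                        # separable 3x3 Sobel kernel

N = 64
T_fd = transfer_modulus(Ex, N)
T_sob = transfer_modulus(Ex_sobel, N)

# Forward differences keep full response (modulus 2) at the Nyquist
# frequency in x (high-pass), whereas the Sobel response vanishes at
# the Nyquist frequency in both x and y (band-pass), suppressing the
# noise-dominated highest spatial frequencies.
```

After `fftshift`, the Nyquist frequency sits at index 0 along each axis, so the suppressed rows and columns can be read off the transfer-function arrays directly.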
6. Conclusions
Developing innovative imaging methods that utilize phase retrieval is especially crucial and timely given the current planning and commissioning of high-brightness coherent x-ray sources across the world. For x-ray free electron lasers in particular, the experimental arrangement explored in this work will be highly convenient and natural to use. New methods such as those presented here will help unlock the full potential of these sources by allowing robust phase retrieval under experimental conditions currently thought to be infeasible or too difficult to work with.
We show a way forward for robust single-view CDI of extended samples probed with plane wave illumination using phase retrieval. This case, which is particularly relevant to flash imaging at x-ray free-electron laser sources and to the study of materials and life sciences specimens when scanning approaches (ptychography) are impractical, has previously been considered intractable. Our method exploits edge detection to arrive at a sparse representation of the sample exit wave (one with far fewer dominant pixels) and uses thresholding of the modulus of this sparse representation to implicitly generate a sparse support. This sparse support is much more highly constraining than a support on the original non-sparse exit wave, which is the essence of the method: we use phase retrieval to solve for the simpler, sparser representation, not the original non-sparse exit wave.
We derive an algorithm based on ADMM to solve for the sparse edge-detected representations of the exit wave, and we benchmark it on test problems with varying support sparsity (dominant nonzero pixels) in the edge-detected representation. We identify characteristics of the test problems for which the method succeeds and fails, as well as a borderline region where success is ambiguous. This ambiguous region corresponds to exit waves whose edges have been successfully recovered but whose smoothly varying regions away from edges have not. We then examine the effects of Poisson noise when performing edge detection with the forward-differences approach and discuss how Poisson noise that dominates at high spatial frequencies adversely affects the use of thresholding to define an effective support on the edge-detected representations. Finally, we discuss a simple example of noise-suppressing edge detection using a Sobel filter, as well as other prospects for performing edge detection in the presence of noise to generate sparse supports on the edge-detected representations.
The key to this work is applying simple mathematical transformations to drastically simplify how the relevant information content in an image is represented. This can enable phase retrieval in cases previously thought to be intractable. We anticipate that this general approach to constraining and regularizing the optimization problem can be extended with other sparsity-inducing transformations, such as wavelets, curvelets, and shearlets. In addition to enabling imaging of extended samples by single-view CDI, we expect that incorporating this concept into multiple-view (e.g., ptychographic) algorithms will be beneficial by supplementing ptychographic overlap constraints with a sparse support constraint on the edge-detected representation of the current view.
Funding
This material is based upon work supported by the U.S. Department of Energy, Office of Science, Offices of Advanced Scientific Computing Research and Basic Energy Sciences under Contract No. DE-AC02-06CH11357.
Acknowledgments
The authors are grateful to Sven Leyffer for insightful discussions.
References and links
1. D. Paganin, Coherent X-ray Optics (Oxford University, 2006). [CrossRef]
2. Y. S. G. Nashed, D. J. Vine, T. Peterka, J. Deng, R. Ross, and C. Jacobsen, “Parallel ptychographic reconstruction,” Opt. Express 22, 32082–32097 (2014). [CrossRef]
3. J. Miao, P. Charalambous, J. Kirz, and D. Sayre, “Extending the methodology of X-ray crystallography to allow imaging of micrometre-sized non-crystalline specimens,” Nature 400, 342–344 (1999). [CrossRef]
4. S. Marchesini, H. He, H. N. Chapman, S. P. Hau-Riege, A. Noy, M. R. Howells, U. Weierstall, and J. C. H. Spence, “X-ray image reconstruction from a diffraction pattern alone,” Phys. Rev. B 68, 140101 (2003). [CrossRef]
5. H. M. L. Faulkner and J. M. Rodenburg, “Movable aperture lensless transmission microscopy: A novel phase retrieval algorithm,” Phys. Rev. Lett. 93, 023903 (2004). [CrossRef]
6. P. Thibault, M. Dierolf, A. Menzel, O. Bunk, C. David, and F. Pfeiffer, “High-resolution scanning X-ray diffraction microscopy,” Science 321, 379–382 (2008). [CrossRef]
7. P. Thibault and A. Menzel, “Reconstructing state mixtures from diffraction measurements,” Nature 494, 68–71 (2013). [CrossRef]
8. S. Marathe, S. S. Kim, S. N. Kim, C. Kim, H. C. Kang, P. V. Nickles, and D. Y. Noh, “Coherent diffraction surface imaging in reflection geometry,” Opt. Express 18, 7253–7262 (2010). [CrossRef]
9. M. A. Pfeifer, G. J. Williams, I. A. Vartanyants, R. Harder, and I. K. Robinson, “Three-dimensional mapping of a deformation field inside a nanocrystal,” Nature 442, 63–66 (2006). [CrossRef]
10. J. R. Fienup, “Phase retrieval algorithms: a comparison,” Appl. Opt. 21, 2758–2769 (1982). [CrossRef]
11. S. Marchesini, “Ab initio compressive phase retrieval,” arXiv:0809.2006 (2008).
12. M. L. Moravec, J. K. Romberg, and R. G. Baraniuk, “Compressive phase retrieval,” Proc. SPIE 6701, 670120 (2007). [CrossRef]
13. R. Neutze, R. Wouts, D. van der Spoel, E. Weckert, and J. Hajdu, “Potential for biomolecular imaging with femtosecond x-ray pulses,” Nature 406, 752–757 (2000). [CrossRef]
14. B. Abbey, K. Nugent, G. Williams, J. Clark, A. Peele, M. Pfeifer, M. de Jonge, and I. McNulty, “Keyhole coherent diffractive imaging,” Nat. Phys. 4, 394–398 (2008). [CrossRef]
15. E. Candès and M. Wakin, “An introduction to compressive sampling,” IEEE Signal Proc. Mag. 25, 21–30 (2008). [CrossRef]
16. X. Huang, J. Nelson, J. Steinbrener, J. Kirz, J. J. Turner, and C. Jacobsen, “Incorrect support and missing center tolerances of phasing algorithms,” Opt. Express 18, 26441–26449 (2010). [CrossRef]
17. A. Tripathi, S. Leyffer, T. Munson, and S. M. Wild, “Visualizing and improving the robustness of phase retrieval algorithms,” Procedia Comput. Sci. 51, 815–824 (2015). [CrossRef]
18. S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Found. Trends Mach. Learn. 3, 1–122 (2011). [CrossRef]
19. E. J. Candès, M. B. Wakin, and S. P. Boyd, “Enhancing sparsity by reweighted ℓ1 minimization,” J. Fourier Anal. Appl. 14, 877–905 (2008). [CrossRef]
20. T. Blumensath and M. E. Davies, “Iterative thresholding for sparse approximations,” J. Fourier Anal. Appl. 14, 629–654 (2008). [CrossRef]
21. F. Bach, R. Jenatton, J. Mairal, and G. Obozinski, “Optimization with sparsity-inducing penalties,” Found. Trends Mach. Learn. 4, 1–106 (2011). [CrossRef]
22. R. H. Chan, T. F. Chan, L. Shen, and Z. Shen, “Wavelet algorithms for high-resolution image reconstruction,” SIAM J. Sci. Comput. 24, 1408–1432 (2003). [CrossRef]
23. K. Herrity, A. Gilbert, and J. Tropp, “Sparse approximation via iterative thresholding,” in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (IEEE, 2006), pp. 624–627.
24. Z. Yang, C. Zhang, and L. Xie, “Robust compressive phase retrieval via L1 minimization with application to image reconstruction,” arXiv:1302.0081 (2013).
25. L. Sorber, M. Barel, and L. Lathauwer, “Unconstrained optimization of real functions in complex variables,” SIAM J. Optim. 22, 879–898 (2012). [CrossRef]
26. J. J. Moré and S. M. Wild, “Estimating derivatives of noisy simulations,” ACM Trans. Math. Software 38, 1–21 (2012). [CrossRef]
27. X. Zhang, Y. Lu, and T. Chan, “A novel sparsity reconstruction method from Poisson data for 3D bioluminescence tomography,” J. Sci. Comput. 50, 519–535 (2011). [CrossRef]
28. B. Dong and Y. Zhang, “An efficient algorithm for ℓ0 minimization in wavelet frame based image restoration,” J. Sci. Comput. 54, 350–368 (2012). [CrossRef]
29. Y. Zhang, B. Dong, and Z. Lu, “ℓ0 minimization for wavelet frame based image restoration,” Math. Comput. 82, 995–1015 (2013). [CrossRef]
30. A. Pein, S. Loock, G. Plonka, and T. Salditt, “Using sparsity information for iterative phase retrieval in x-ray propagation imaging,” Opt. Express 24, 8332–8343 (2016). [CrossRef]
31. R. Fan, Q. Wan, F. Wen, H. Chen, and Y. Liu, “Iterative projection approach for phase retrieval of semi-sparse wave field,” EURASIP J. Adv. Signal Process. 2014, 1–13 (2014). [CrossRef]
32. A. M. Maiden and J. M. Rodenburg, “An improved ptychographical phase retrieval algorithm for diffractive imaging,” Ultramicroscopy 109, 1256–1262 (2009). [CrossRef]
33. J. N. Clark, C. T. Putkunz, M. A. Pfeifer, A. G. Peele, G. J. Williams, B. Chen, K. A. Nugent, C. Hall, W. Fullagar, S. Kim, and I. McNulty, “Use of a complex constraint in coherent diffractive imaging,” Opt. Express 18, 1981–1993 (2010). [CrossRef]
34. L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based noise removal algorithms,” Physica D 60, 259–268 (1992). [CrossRef]
35. H. N. Chapman, A. Barty, S. Marchesini, A. Noy, S. P. Hau-Riege, C. Cui, M. R. Howells, R. Rosen, H. He, J. C. H. Spence, U. Weierstall, T. Beetz, C. Jacobsen, and D. Shapiro, “High-resolution ab initio three-dimensional x-ray diffraction microscopy,” J. Opt. Soc. Am. A 23, 1179–1200 (2006). [CrossRef]
36. J. Steinbrener, J. Nelson, X. Huang, S. Marchesini, D. Shapiro, J. J. Turner, and C. Jacobsen, “Data preparation and evaluation techniques for x-ray diffraction microscopy,” Opt. Express 18, 18598–18614 (2010). [CrossRef]
37. D. Lazzaro and L. Montefusco, “Edge-preserving wavelet thresholding for image denoising,” J. Comput. Appl. Math. 210, 222–231 (2007). [CrossRef]
38. S. Yi, D. Labate, G. R. Easley, and H. Krim, “A shearlet approach to edge analysis and detection,” IEEE Trans. Image Process. 18, 929–941 (2009). [CrossRef]
39. J.-L. Starck, E. J. Candès, and D. L. Donoho, “The curvelet transform for image denoising,” IEEE Trans. Image Process. 11, 670–684 (2002). [CrossRef]