Abstract
We propose a new approach to robustly retrieve the exit wave of an extended sample from its coherent diffraction pattern by exploiting sparsity of the sample’s edges. This approach enables imaging of an extended sample with a single view, without ptychography. We introduce nonlinear optimization methods that promote sparsity, and we derive update rules to robustly recover the sample’s exit wave. We test these methods on simulated samples by varying the sparsity of the edge-detected representation of the exit wave. Our tests illustrate the strengths and limitations of the proposed method in imaging extended samples.
© 2016 Optical Society of America
1. Introduction
Coherent diffractive imaging (CDI) is a technique in widespread use at x-ray synchrotrons to image nanoscale materials. Traditional x-ray microscopy techniques, such as transmission x-ray microscopy (TXM), rely on condensing and objective optics. In contrast, CDI uses nonlinear optimization methods as a type of “numerical optic” to solve the phase problem, which arises because of an x-ray detector’s inability to measure a full complex-valued wavefield.
Under the projection approximation [1], a (complex-valued) exit wave ρ is formed when a plane wave x-ray beam interacts with a sample. We note that in our figures, the phase and magnitude of complex-valued quantities are represented in an HSV colorspace, with the hue and value/brightness denoting the phase and magnitude respectively, and with the saturation component set to maximum for all image pixels. As shown in Fig. 1, when an area detector is placed in the far field, the (real-valued) Fraunhofer diffraction intensity D is measured, where D is proportional to the squared modulus of the two-dimensional Fourier transform of ρ. In this paper, we study nonlinear optimization methods to recover the phase lost upon measurement.
An advantage of CDI over more traditional microscopy techniques is that the recovered image’s spatial resolution is determined by the highest spatial frequencies measured. These spatial frequencies are in turn determined primarily by incident x-ray fluence and the contrast in both the phase and magnitude of the sample’s complex-valued transmission function. Another advantage of CDI is that it can recover the full complex-valued transmission function (related to the complex-valued index of refraction of the sample). Methods such as TXM can recover only the modulus of the sample’s transmission function. Furthermore, development of phase retrieval algorithms for CDI has progressed to a point where high-quality images are produced in real time and at the same speed with which one can obtain images in a TXM (see, e.g., [2]).
Recovery of the phase typically starts with a guess ρ(0) for the exit wave; a nonlinear optimization algorithm then iteratively refines this guess using knowledge from the experiment performed and the physics of the sample. This knowledge is typically used to define constraints for a nonlinear, nonconvex optimization formulation of the phase retrieval problem. Examples of such knowledge include the measured diffraction, as well as constraints on the sample such as known compact support [3], sparsity [4], and ptychography [5–7].
Common CDI experimental configurations for collecting data include the small angle geometry shown in Fig. 1, as well as reflection and Bragg geometries [8,9]. With these configurations, one can use an x-ray probe with a spatial area (or volume in the reflection and Bragg cases) larger or smaller than the sample under investigation. When the x-ray probe is larger than the sample, a single diffraction pattern from a single view can be used with known support or sparsity [3, 4, 10–12]. This single-view approach is particularly advantageous when the x-ray probe will damage the sample, such as with an x-ray free-electron laser source [13]. Improvements in single-view algorithms can also readily be incorporated into ptychographic methods for more robust phase retrieval.
In this paper, we explore the use of single-view phase retrieval methods for an extended sample, with a primary motivation being the ability to overcome difficulties encountered in ptychography when attempting to image samples that significantly change during the collection of a ptychographic dataset, or when the long-term temporal and spatial experimental stability requirements of ptychography are not met. Single-view CDI phase retrieval methods for an extended sample have had very limited success in the literature [14] when compared with methods for an isolated sample. In Sec. 2, we identify the primary limitations of using a support defined from the approximate spatial extent of the illumination function when imaging an extended sample, and in Sec. 3 we discuss sparsity in an edge-detected representation of the exit wave. We then develop in Sec. 4 a phase retrieval algorithm that overcomes these limitations by performing phase retrieval on the edge-detected representation of an exit wave and by using ideas from compressive sensing [15]. The primary idea is to recover the sparsest possible edge-detected representation of the exit wave consistent with the diffraction measurement. This process is colloquially known as “shrinkwrap” in the experimental phase retrieval community [4] and more broadly known as “compressive phase retrieval” [11, 12]. The numerical tests in Sec. 5 on a range of simulated samples illustrate the benefits and limitations of the proposed methods. We test these methods on various simulated samples by varying the sparsity of the edge-detected representation of the exit wave. Our results demonstrate that our method can reliably recover the phase of extended samples in many cases where standard single-view phase retrieval methods fail. Nontrivial noise in the measured diffraction, however, requires use of an appropriate spatial frequency filtering procedure, and this is also discussed. In Sec. 6, we discuss further prospects and extensions of the proposed methodology.
2. Support defined from the illuminated area on the sample
In single-view phase retrieval of an extended sample, one obvious choice of sample constraint is to use the approximate illuminated area on the sample as the support on the exit wave we wish to recover. As an example, in Fig. 2(a) we simulate an x-ray probe by solving the Fresnel integral with a circular pinhole aperture as input. This type of probe is typical in small angle transmission geometry x-ray synchrotron experiments, where sample-to-circular-pinhole distances of between a millimeter and a centimeter are used. With this probe and a simulated sample transmission function defined by generating a Voronoi pattern (not shown), the exit wave ρtrue under the projection approximation [1] is shown in Fig. 2(b) (the red dashed line will be discussed shortly). The primary difficulty arising when attempting to use the approximate illuminated area on the sample as a support constraint is that the x-ray probe photon area density smoothly decays as one moves away from the center of the probe function. This smooth decay makes defining an effective support boundary ambiguous.
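To make the support-from-illumination discussion concrete, the following is a minimal numpy/scipy sketch of a shrinkwrap-style support update: blur the modulus of the current iterate, then keep only the ⌊MNκ⌋ largest pixels. The function name, the use of a real-space Gaussian blur (rather than the Fourier-space low-pass filter described in the text), and the default σ and κ values are illustrative assumptions, not the paper’s exact settings.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def shrinkwrap_support(rho, sigma=3.0, kappa=0.15):
    """Binary support mask from the current exit-wave iterate rho.

    Gaussian-blur |rho| (a real-space stand-in for the Fourier-space
    low-pass filter described in the text), then keep the
    floor(M*N*kappa) largest pixels so the mask has sparsity ratio kappa.
    """
    blurred = gaussian_filter(np.abs(rho), sigma)
    m, n = blurred.shape
    k = max(int(np.floor(m * n * kappa)), 1)
    # threshold at the k-th largest blurred value
    thresh = np.partition(blurred.ravel(), -k)[-k]
    return (blurred >= thresh).astype(float)
```

With a smoothly decaying probe, the mask produced this way depends strongly on the choice of κ, which is precisely the ambiguity in defining an effective support boundary.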
As an example, we use a support dynamically generated using the shrinkwrap method [4] as a constraint on the exit wave and the Fourier modulus measurement as the measurement constraint. We denote the current iterate of the exit wave as ρ(k), where k denotes the iteration index and the problem size is M × N = 512 × 512. We run 4,900 iterations of Fienup’s hybrid input-output method (HIO), then 100 iterations of error reduction (ER), and then a shrinkwrap update of the binary support mask, denoted by the symbol 𝕀𝒮. This sequence of HIO+ER+shrinkwrap is repeated 20 times. When updating the binary support mask in shrinkwrap, we Gaussian blur the current iterate of |ρ(k)| (using a Gaussian low-pass filter in Fourier space with standard deviation σ) to get a blurred modulus image. We then enforce a sparsity ratio of κ = 0.15 on this blurred image to define the binary support mask. The sparsity ratio is the fraction of nonzeros since
‖a‖0 = Σr 𝕀a(r)≠0(r),    (1)
where the indicator function used is defined as
𝕀a(r)≠0(r) = 1 if a(r) ≠ 0 and 0 otherwise,    (2)
for some a ∈ ℂM × N, and we index the spatial coordinate system of the sample by r ∈ {(rn, rm) : n = 0, . . . , N − 1, m = 0, . . . , M − 1}. To define the binary support mask with a particular enforced sparsity ratio κ, we determine the ⌊MNκ⌋th largest value of the blurred modulus image. The binary support mask is then created when we set to zero the elements of the blurred image below this value and set to one the elements above or equal to this value. A typical result for the support boundary using this recipe is shown by the dashed red line around the ground truth exit wave ρtrue in Fig. 2(b). A linecut of the ground truth exit wave ρtrue through the main diagonal is shown in Fig. 2(d), with the support boundary again shown as the dotted red lines. The support boundary shown in Fig. 2(d) potentially can zero out features of |ρtrue| below approximately 0.05|ρtrue|max. The results for the recovered exit wave ρ(k) for k = 10⁵ are shown in Fig. 2(c). The support generated using shrinkwrap is not constraining enough to accurately recover the ground truth exit wave shown in Fig. 2(b). The value of the standard error metric
E(k) = ‖ √D − |ρ̃(k)| ‖F / ‖ √D ‖F,    (3)
used to measure solution quality, as a function of iteration number k is shown in Fig. 2(e). There, we use the notation ρ̃ = ℱ [ρ] = |ρ̃| ⊙ eiϕ̃, where ℱ [·] is the two-dimensional discrete Fourier transform, ⊙ is the (componentwise) Hadamard product, and we index the Fourier space coordinate system of the measurement by q, so that ρ̃(q) = |ρ̃(q)| ⊙ eiϕ̃(q) for q ∈ {(qn, qm) : n = 0, . . . , N − 1, m = 0, . . . , M − 1}. If we enforce smaller sparsity ratios (i.e., κ < 0.15), the support generated is “too tight” [16], and the recovered ρ(k) is similarly unsatisfactory when compared with the ground truth exit wave in Fig. 2(b). Enforcing larger sparsity ratios (i.e., κ > 0.15) causes prompt algorithmic stagnation when monitoring Eq. (3), indicating convergence to a poor local solution [17] and again showing unsatisfactory results when compared with the ground truth exit wave in Fig. 2(b).

3. Sparse exit wave representations for phase retrieval
Mathematical transformations can be applied to represent a group of variables in a more meaningful and useful way. An example is the wavelet transformation of an image, where many images have only a few dominant wavelet components. Ignoring the nondominant components yields lossy compression of the image into a potentially small fraction of its original size in bytes. The information content of the image is sparse in the wavelet domain. The technique of “compressive sensing” attempts to incorporate and exploit this inherent sparsity. The basic idea is that by formulating optimization problems in terms of a simplified representation instead of the original representation and solving for an optimal sparse, simplified representation, one might obtain more robust results.
Here, we explore the use of edge detection as a sparsifying transformation of an exit wave ρ. This idea was originally explored in [11] for use in phase retrieval. For edge detection, we take the forward-difference arrays (each of size M × N = 512 × 512)
and convolve ρ with Ex and Ey. Since this edge detection uses forward differences, we refer to the resulting quantities as spatial derivatives and use the notation ∂xρ = Ex * ρ, ∂yρ = Ey * ρ. The convolution theorem shows
∂jρ = ℱ⁻¹[ Ẽj ⊙ ρ̃ ]    (5)
for j = x, y and with Ẽj = ℱ[Ej]. Figs. 3(a), 3(c), and 3(e) show the ground truth exit wave ρtrue used in Fig. 2(b) and its respective edge-detected representations. In Fig. 3(b) we again show the binary support mask 𝕀𝒮 generated by using shrinkwrap in Sec. 2 to define a support corresponding to the approximate illuminated area on the sample. In Figs. 3(d) and 3(f), we show the supports 𝕀𝒮x, 𝕀𝒮y that result from thresholding the modulus of the respective quantities in Figs. 3(c) and 3(e) to zero whenever they fall below 2% of the maximum value of the modulus:
𝕀𝒮j(r) = 1 if |∂jρ(r)| ≥ 0.02 maxr′ |∂jρ(r′)| and 0 otherwise.    (6)
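As a sketch of this sparsifying transformation, the forward differences can be applied through the convolution theorem with FFTs (so the differences are circular). The kernel placement and signs below are one possible convention for the forward-difference arrays of Eq. (4) and are assumptions, not the paper’s exact definition.

```python
import numpy as np

def forward_diff_kernels(m, n):
    """M x N forward-difference arrays Ex, Ey (one convention; the exact
    placement/sign in the paper's Eq. (4) may differ)."""
    ex = np.zeros((m, n)); ex[0, 0] = -1.0; ex[0, -1] = 1.0
    ey = np.zeros((m, n)); ey[0, 0] = -1.0; ey[-1, 0] = 1.0
    return ex, ey

def edge_detect(rho):
    """Edge-detected representations via the convolution theorem:
    d_j rho = F^{-1}[ F[E_j] * F[rho] ] (componentwise product)."""
    m, n = rho.shape
    ex, ey = forward_diff_kernels(m, n)
    rho_t = np.fft.fft2(rho)
    dx = np.fft.ifft2(np.fft.fft2(ex) * rho_t)
    dy = np.fft.ifft2(np.fft.fft2(ey) * rho_t)
    return dx, dy
```

For a piecewise-constant exit wave, most entries of the returned arrays are near zero, which is exactly the sparsity this section exploits.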
From Fig. 3, we can clearly see that the ∂xρ and ∂yρ quantities have far sparser supports when compared with ρtrue. We explore this property in the remainder of this paper and show that by performing phase retrieval to solve for the quantities ∂xρ and ∂yρ with relatively sparse supports in place of solving for the exit wave ρ with a nonsparse support, we can robustly recover the exit wave for extended samples in many cases.

4. Solving for the sparse representation of the exit wave
The alternating direction method of multipliers (ADMM) attempts to solve optimization problems by breaking them into smaller, more manageable subproblems [18]. It has the advantages that the objective function used in the optimization problem need not be differentiable and the method is simple to implement and parallelize. The convergence properties of this method are difficult to analyze for nonconvex problems, however, and more specialized algorithms can outperform it. We begin a derivation of ADMM update rules for obtaining the sparse representations ∂jρ of the exit wave ρ by considering the constrained, nonconvex optimization problem
minρ ( ‖∂xρ‖ℓ + ‖∂yρ‖ℓ ) subject to ρ ∈ ℳ,    (7)
where ℓ ∈ {0, 1} and we define the Fourier modulus measurement constraint set ℳ = {ρ ∈ ℂM × N : |ℱ[ρ]| = √D} and its associated projection operator Πℳ with ρ̃ = ℱ [ρ] = |ρ̃| ⊙ eiϕ̃. The ‖ · ‖0 norm is defined in Eq. (1), and for any a ∈ ℂM × N we define the ‖ · ‖1 norm by ‖a‖1 = Σr |a(r)|. The ℓ ∈ {0, 1} norms are well-known sparsity-inducing norms, with the ‖ · ‖1 norm frequently used as a convex and continuous relaxation of the ‖ · ‖0 norm [19]. The optimization problem in (7) seeks the ρ satisfying the Fourier modulus measurement with the sparsest possible ∂jρ. The constrained augmented Lagrangian function for Eq. (7) is
where τj ≥ 0 are parameters that control the competition between the constraints uj = ∂jρ and ρ ∈ ℳ and finding the sparsest possible uj. The λj ∈ ℂM × N are Lagrange multipliers (with an overbar denoting the complex conjugate), ũj = ℱ [uj], λ̃j = ℱ [λj], and ρ̃ = ℱ[ρ]. We use Plancherel’s theorem for the discrete Fourier transform when going from Eq. (11a) to Eq. (11b) on the Frobenius norm ‖a‖F = ( Σr |a(r)|² )^(1/2) for any a ∈ ℂM × N and with ã(q) = ℱ [a(r)]. Provided that ℓ ∈ {0, 1} is fixed and given initializations for ρ(0), uj(0), and λj(0) for j ∈ {x, y}, the kth iteration of the ADMM-based method can be written as
with βj being a damping term for the multiplier updates (typically we set βj = 1 for both j ∈ {x, y}). The update rules for uj(k+1) and ρ(k+1) are derived in Secs. 4.1 and 4.2, respectively. The function ψj (·) is defined in Sec. 4.3 and allows for effective iterative determination of the parameters τj. Note that we cannot minimize Eq. (11) to update the τj because (τx, τy) = (0, 0) is the global minimizer of Eq. (11) when holding the other quantities constant, which is not useful for our purposes. Thus, we introduce the additional auxiliary function ψj (·) for determining the optimal values of the parameters τj.

4.1. uj(k+1) update
The least-squares optimization subproblem with the sparsity-promoting ℓ ∈ {0, 1}-norm regularization is
minuj∈ℂM×N [ τj ‖uj‖ℓ + ‖uj − b‖F² ],    (14)
which is equivalent to minimizing ℒj in Eq. (11a) over uj when we set b = ∂jρ − λj and we drop the ‖λj‖F constant. The solution to Eq. (14) can be expressed in terms of proximal mappings [20–24]. If using ℓ = 1, this mapping is the soft-thresholding operator
𝒮μ(b) = sgn(b) ⊙ max(|b| − μ, 0),    (15)
where we use the notation of the complex signum function sgn(b) = eiγ for b = |b| ⊙ eiγ when b ≠ 0 and sgn(0) = 0 with γ ∈ ℝM × N. If using ℓ = 0, this mapping is the hard-thresholding operator
ℋμ(b) = b ⊙ 𝕀|b|>μ.    (16)
In both cases, we have an indicator function of the form 𝕀|b|>μ for a specified μ ≥ 0. We thus use the update rule in Eq. (13b). In addition to balancing the competition between fitting the measurement and promoting sparsity, we can interpret the τj as thresholding parameters.

4.2. ρ(k+1) update
An update for ρ(k+1) can be found by taking the Wirtinger (complex-valued; see, e.g., [25]) derivative of the ρ̃-dependent term in Eq. (11b) with respect to ρ̃ and setting it to zero:
Σj∈{x,y} conj(Ẽj) ⊙ ( Ẽj ⊙ ρ̃ − ũj − λ̃j ) = 0.
Solving for ρ̃, we obtain
ρ̃ = [ conj(Ẽx) ⊙ (ũx + λ̃x) + conj(Ẽy) ⊙ (ũy + λ̃y) ] / ( |Ẽy|² + |Ẽx|² ),    (20)
where the division by |Ẽy|² + |Ẽx|² is componentwise. To satisfy the constraint in Eq. (13c) that ρ ∈ ℳ, we then project Eq. (20) onto the constraint set ℳ:
ρ(k+1) = Πℳ [ ℱ⁻¹[ρ̃] ],    (21)
which we use for the update in Eq. (13c).

4.3. τj(k+1) update
The choice of thresholding parameters (τx, τy) is critical to the success of our method. We determine the τj parameters by solving Eq. (13a), where we introduce the cost function
The two spatial terms are then combined to define
ψ(k)(τx, τy) = ψx(k)(τx) + ψy(k)(τy).    (23)
The weighting parameter controls the relative scaling of the first term in Eq. (22) with the second and is defined by Eq. (24), where ς ≥ 0 is a parameter balancing the sparsity of the solution with its Fourier modulus measurement agreement. If ς is too large, then the sparsity-inducing term can dominate in Eq. (22), resulting in a solution that has a large mismatch with the measurement term. If ς is too small, we are allowing nonsparse solutions, which can quickly cause convergence to a poor local solution similar to that seen in Sec. 2 when using the probe diameter as a support region. The choice of ς values for our numerical experiments is discussed in Sec. 5. We can numerically evaluate Eq. (23) as a function of (τx, τy) to gain intuition on what is necessary for finding a minimizer of Eq. (23). An example is shown in Fig. 4, where ℓ = 0 is used; similar results are obtained if ℓ = 1 is used. At first glance, the metric ψ(k)(τx, τy) appears to have a well-behaved landscape with a single global minimum, as seen in Fig. 4(a). By zooming in on a smaller field of view in Fig. 4(b) (which is enclosed by the boundaries of the red box in Fig. 4(a)), however, we begin to see multiple local minima. This discontinuous and nonsmooth behavior of the derivatives of ψ(k)(τx, τy) is due to the nonsmooth nature of the thresholding operators in Eq. (15) and Eq. (16).
As a result of this nonsmooth topology with multiple local minima, if we attempt to use a gradient descent method, we can expect to quickly become stuck in one of these local solutions. One way around this problem is to use finite differences to approximate the derivatives with respect to (τx, τy) and ensure that the finite-difference steps are large enough to smooth over these spurious local solutions [26]. The larger the finite-difference steps, however, the less accurate is the approximate global minimizer that can be obtained. Consequently, instead of using a gradient descent method, we use a “polling step”-like method whereby we simply test some thresholding parameter trial values around the vicinity of the current in the objective function ψ(k)(τx, τy):
(h*x, h*y) = argmin over (hx, hy) of ψ(k)(τx(k) + hx, τy(k) + hy),    (25)
where any ties are broken lexicographically (i.e., the first tied index in the order in which the trial values of hj are tested). The trial values h±nj, for j ∈ {x, y} and n ∈ ℤ+, can be chosen by picking a range of allowable changes in sparsity ratio around the current sparsity ratios κj. We then can compute—by sorting the MN values of the array being thresholded and setting τj to be the ⌊MNκj⌋th largest value—the thresholding parameters corresponding to these allowable changes in sparsity ratios to get the h±nj values, and we solve Eq. (25) by direct evaluation of Eq. (23) using these h±nj. We found by experience that we need not update the thresholding parameters τj with the same frequency at which uj, ρ, and λj are updated. If we attempt to update τj as frequently as the main iterates, we usually find that a polling step of hj = 0 is the minimizer of ψ(k)(τx, τy) in both x and y. Therefore we introduce a parameter L that controls the frequency at which τj is updated to arrive at the update rule:
τj(k+1) = τj(k) + h*j if mod(k + 1, L) = 0, and τj(k+1) = τj(k) otherwise,    (26)
with the value h*j computed in Eq. (25).

5. Benchmarking
Given initializations for ρ(0), uj(0), λj(0), and τj(0), our complete algorithm (Eq. (27)) finds the sparsest possible uj = ∂jρ subject to the constraint ρ ∈ ℳ.
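The iteration can be sketched in numpy as follows. This is a minimal sketch, not the paper’s implementation: it assumes ℓ = 0 (hard thresholding), a fixed threshold τ in place of the ψ-based polling of Sec. 4.3, β = 1, the scaled-multiplier convention b = ∂jρ − λj, and one particular forward-difference kernel convention; all names are illustrative.

```python
import numpy as np

def admm_phase_retrieval(sqrt_D, n_iter=200, tau=0.05, beta=1.0, seed=0):
    """Minimal ADMM sketch of the Eq. (27)-style iteration.
    sqrt_D is the measured Fourier modulus (square root of the diffraction)."""
    m, n = sqrt_D.shape
    rng = np.random.default_rng(seed)
    rho = rng.random((m, n)) + 1j * rng.random((m, n))   # random start ρ(0)
    # forward-difference kernels and their transforms (one convention)
    ex = np.zeros((m, n)); ex[0, 0] = -1; ex[0, -1] = 1
    ey = np.zeros((m, n)); ey[0, 0] = -1; ey[-1, 0] = 1
    ex_t, ey_t = np.fft.fft2(ex), np.fft.fft2(ey)
    den = np.maximum(np.abs(ex_t) ** 2 + np.abs(ey_t) ** 2, 1e-12)  # guard DC zero
    lam_x = np.zeros((m, n), complex); lam_y = np.zeros((m, n), complex)
    for _ in range(n_iter):
        rho_t = np.fft.fft2(rho)
        dx = np.fft.ifft2(ex_t * rho_t); dy = np.fft.ifft2(ey_t * rho_t)
        # u-update: hard-threshold d_j rho - lambda_j
        bx, by = dx - lam_x, dy - lam_y
        u_x = np.where(np.abs(bx) >= tau, bx, 0)
        u_y = np.where(np.abs(by) >= tau, by, 0)
        # rho-update: least squares in Fourier space, then project onto M
        num = (np.conj(ex_t) * np.fft.fft2(u_x + lam_x)
               + np.conj(ey_t) * np.fft.fft2(u_y + lam_y))
        rho_t = num / den
        rho = np.fft.ifft2(sqrt_D * np.exp(1j * np.angle(rho_t)))
        # multiplier update with damping beta
        rho_t = np.fft.fft2(rho)
        dx = np.fft.ifft2(ex_t * rho_t); dy = np.fft.ifft2(ey_t * rho_t)
        lam_x = lam_x + beta * (u_x - dx)
        lam_y = lam_y + beta * (u_y - dy)
    return rho
```

In practice the thresholds τj would be updated every L iterations via the polling procedure of Sec. 4.3, and the sparsity ratios ramped as described in Sec. 5.1.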
We test this algorithm on 14 problems with varying sparsity ratios for the supports 𝕀𝒮j (as defined in Eq. (6)) of the edge-detected representations ∂jρ, with j ∈ {x, y}. The test problems have sparsity ratios for the (assumed unknown) 𝕀𝒮j ranging from approximately 0.005 to 0.037, as shown in Fig. 5(a); some representative 𝕀𝒮j are shown in Figs. 5(e), 5(f), and 5(g). The simulated ground truth exit waves corresponding to these sparsity ratios for the various 𝕀𝒮j are defined by using the probe function seen in Fig. 2(a) and from a set of sample transmission functions generated by using Voronoi patterns with varying polygon centroid density in an M × N = 512 × 512 array size. The contrast of the modulus of the sample transmission functions is allowed to range between 0.1 and 0.9, while the phase can range between ±π. Examples of these exit waves corresponding to the supports in Figs. 5(e), 5(f), and 5(g), respectively, are shown in Figs. 5(b), 5(c), and 5(d). By defining synthetic test problems in this way, we can systematically control the sparsity ratios for the 𝕀𝒮j in order to test performance.

The choice of ℓ ∈ {0, 1} is application, problem, and algorithm dependent. Therefore, in Sec. 5.1 we discuss our choice of preferred ℓ. Then in Sec. 5.2 we provide numerical results from the benchmark problems with ℓ = 0. Section 5.3 discusses the effects of noise in the diffraction on our method.
5.1. Use of ℓ = 0 vs. ℓ = 1
We examine the differences in performance of our ADMM method (Eq. (27)) when faced with choosing between ℓ = 0 or ℓ = 1 as sparsity-inducing metrics. The ℓ = 1 regularization choice is widely used as a stand-in for ℓ = 0 regularization because using ℓ = 0 introduces difficulties in analyzing convergence of algorithms derived from its use. Some studies [27–29], however, have noted difficulties when using ℓ = 1 with sparse image restoration and recovery. To look at the performance of the update rules given in Eqs. (27b)–(27d), we attempt to recover the simulated exit wave shown in Fig. 3(a) by sampling over a mesh of fixed sparsity ratios (κx, κy) for some number of total iterations K and determine the final error by using the metric in Eq. (3). We do this procedure for both ℓ = 0 and ℓ = 1, meaning we use either the soft-thresholding operator given in Eq. (15) or the hard-thresholding operator given in Eq. (16) to update the uj in Eq. (27b).
In Fig. 6, we show the resulting final error using the updates in Eqs. (27b)–(27d) for both ℓ = 0 and ℓ = 1 over a fixed mesh for (κx, κy). For both, a 21 × 21 (equally spaced in both parameters) box is used, ranging from (κx, κy) = (0.005, 0.005) to (0.255, 0.255) for ℓ = 0 (hard thresholding) and from (κx, κy) = (0.005, 0.005) to (0.505, 0.505) for ℓ = 1 (soft thresholding). For each of the 441 (κx, κy) pairs, K = 4 × 10⁴ such iterations are performed. After these K iterations, we start from the prescribed (κx, κy) values and linearly ramp up (κx, κy) over the next 10⁴ iterations to a final sparsity ratio of (κx, κy)final = (0.255, 0.255) (hard thresholding) or (κx, κy)final = (0.505, 0.505) (soft thresholding). For the extreme values of this box in both cases, no ramping is performed. The purpose of this ramping up of sparsity ratios is to allow the binary-valued support regions, generated by the thresholding operations on the uj, to grow larger and larger (less sparse), which enables the ADMM method to converge to the nearest stationary point defined by Eq. (11). In our experience, we observed that the solution quality, when compared with the measurement as well as the ground truth exit wave ρtrue, is worse if the ramping is not performed; that is, a larger value is obtained when using the error metric given by Eq. (3) as well as when using a direct error metric against ρtrue, after subpixel image registration and global phase offset correction on ρ are performed.
Figs. 6(a) and 6(b) show that the use of hard thresholding is much more effective (has much smaller residual error) compared with the use of soft thresholding in the ADMM method in Eqs. (27b)–(27d). In Fig. 6(a) the minimum indicated by a red circle and cross corresponds to a sparsity ratio pair of (κx, κy) = (0.0175, 0.03). The recovered exit wave after ramping up to (κx, κy)final = (0.255, 0.255) is shown in Fig. 6(c) and is — apart from 180° rotation symmetry, complex conjugation, and global phase offset on ρ — closely matched to the ground truth exit wave ρtrue shown in Fig. 3(a). A typical exit wave reconstruction using soft thresholding is shown in Fig. 6(d), which corresponds to the blue circle/cross at (κx, κy) = (0.105, 0.105); here the use of soft thresholding appears to cause the exit wave to be dominated by relatively few image pixels, which have much larger moduli values than all other image pixels. Similar effects have been reported in wavelet-based image restoration for other highly ill-posed problems when using ℓ = 1 sparsity-inducing regularization [27]. Other studies have shown that using ℓ = 0 regularization leads to better edge preservation in image restoration (exactly the features we want to take advantage of) when compared with ℓ = 1 regularization [28, 29]. The minimum of Fig. 6(b) is located at the magenta circle/cross at (κx, κy) = (0.505, 0.505), and the corresponding exit wave reconstruction is shown in Fig. 6(e); here the sparsity ratio appears to be too large (too weakly constraining), causing quick stagnation to a local solution in a fashion similar to that seen when using the approximate beam diameter as a support in Sec. 2.
As a result of this clear superiority of hard thresholding when used with Eq. (27b) compared with the use of soft thresholding, for the remainder of this paper we will consider the use only of hard thresholding. We note that other algorithms exist that successfully use soft thresholding to promote sparsity [24, 30, 31]. We further note as a reminder that the use of soft thresholding comes from ℓ = 1 regularization as a sparsity-inducing term and that its use is as a convex and continuous relaxation of ℓ = 0 regularization in order to make the problem simpler to analyze; both regularization terms are effective at promoting sparsity to varying degrees depending on the particulars of the phase retrieval algorithm used and type of sparsifying transformation. We believe that the use of weighted ℓ = 1 regularization methods [19] should be explored; these methods use prior iterations of the exit wave ρ and a modified ℓ = 1 regularization term that can closely approximate the ℓ = 0 regularization term. These weighted ℓ = 1 regularization methods also retain the ease of analytic computations, continuity, and convexity. To address the use of ℓ = 0 versus ℓ = 1 regularization to promote sparsity in the phase retrieval problem, researchers need to carry out further systematic benchmarking and comparison of these methods in order to compare relative effectiveness at successfully recovering the exit wave and their sparse edge-detected representations, but these activities are beyond the scope of this paper.
5.2. Benchmark results for ℓ = 0
We now provide full benchmark results on our test problems with ℓ = 0. We use ς ≃ 0.1 when using Eq. (24). Its value was determined by numerical experimentation and works well with the test problems we consider in this section. For other synthetic test problems or if attempting phase retrieval on experimental data, other values for ς may perform better, and further exploratory tuning of this parameter should be performed.
We run 50 independent trials, each with a different M × N complex-valued uniform random number array for ρ(0), for each of the 14 synthetic test problems with varying polygon density. We initialize λ(0) to be the M × N array of zeros, and we use initial sparsity ratios (κx(0), κy(0)), with initial thresholding parameters (τx(0), τy(0)) computed from the initial (κx(0), κy(0)) and ∂jρ(0). We update the thresholding parameters τj every L = 50 iterations and use 5 polling steps in both x and y with a fixed range of allowable sparsity ratio changes. We allow K0 = 2 × 10⁴ iterations of Eq. (27) to be performed, after which we cease using Eq. (27a) to update the thresholding parameters τj. Beginning from the current (κx, κy), we ramp up the sparsity ratios to their final values in the same manner as that discussed in Sec. 5.1, with Kf = 2.5 × 10⁴. Results are shown in Fig. 7(a), where for each of the 50 trials for each test problem polygon density, we sort the final values of the error metric in descending order. We have also corrected the recovered ρ(Kf) for subpixel shift and global phase offsets when compared with ρtrue when computing this metric.
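The threshold-selection step used throughout (map a trial sparsity ratio κ to the ⌊MNκ⌋th largest value of the array being thresholded, then poll an objective ψ over a small set of trial ratios) might be sketched as follows. The function names, the trial offsets, and the generic `psi` callable are illustrative assumptions.

```python
import numpy as np

def tau_from_kappa(b_mod, kappa):
    """Threshold enforcing sparsity ratio kappa: the floor(M*N*kappa)-th
    largest value of the (modulus) array b_mod."""
    k = max(int(np.floor(b_mod.size * kappa)), 1)
    return np.partition(b_mod.ravel(), -k)[-k]

def poll_tau(psi, kappa_x, kappa_y, b_x, b_y,
             offsets=(-0.02, -0.01, 0.0, 0.01, 0.02)):
    """Polling step: evaluate psi(tau_x, tau_y) on thresholds derived from
    perturbed sparsity ratios; strict '<' keeps the earliest tied trial."""
    best = None
    for dx in offsets:
        for dy in offsets:
            kx = float(np.clip(kappa_x + dx, 1e-4, 1.0))
            ky = float(np.clip(kappa_y + dy, 1e-4, 1.0))
            tx = tau_from_kappa(np.abs(b_x), kx)
            ty = tau_from_kappa(np.abs(b_y), ky)
            val = psi(tx, ty)
            if best is None or val < best[0]:
                best = (val, tx, ty, kx, ky)
    return best  # (psi value, tau_x, tau_y, kappa_x, kappa_y)
```

Because the polling step only compares a finite set of trial thresholds, it steps over the spurious local minima of ψ seen in Fig. 4 rather than descending into them.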
The results of these benchmarks give us some insight into the performance of Eq. (27) over a range of synthetic test problems of varying sparsity upon the forward-difference edge detection in Eq. (4). Fig. 7(a) shows three regions of notable and differing behavior in the final error. The first is an “ambiguous” region, where results are inconclusive and are between success and failure to varying degrees. The reason for the difficulty in this region lies in our attempt to use edge detection to reconstruct very smoothly varying features (discussed further below). The second and third regions correspond to where, for a clear majority of starting points, we unambiguously recover or fail to adequately recover an exit wave quantitatively and qualitatively close to the corresponding ρtrue. (The criteria used to determine what “adequate” means are discussed below.)
The “ambiguous” region, which we denote by RA, corresponds to polygon densities p1 ∈ {16, 128, 256, 512} and to trials whose final errors fall in an intermediate range. For p1 = 16, the ground truth exit wave is shown in Fig. 7(b); a representative reconstruction is shown in Fig. 7(c). The reconstruction clearly has recovered the locations of the edges correctly, albeit with some numerical artifact degradation, but has not successfully recovered the smoothly varying regions of the exit wave, which look like scaled and phase-shifted regions of the probe function (again shown in Fig. 2(a)). For p1 = 128, the ground truth exit wave is shown in Fig. 7(e), and a representative reconstruction is shown in Fig. 7(f). We see that edges are again faithfully reconstructed and that smoothly varying features of the exit wave, which look like scaled-down/phase-shifted regions of the probe function, are again degraded. However, the degradation in these smoothly varying regions in Fig. 7(f) is not nearly as severe when compared with the p1 = 16 case shown in Fig. 7(c), and only finer details of the probe are degraded.
We conclude that forward differencing is not a proper sparsifying transformation for the smooth features encountered in the test problems in the “ambiguous” region RA, and the inability of Eq. (27) to recover these smoothly varying features using edge detection is expected. The reason is that the forward-difference edge-detected representation of the smoothly varying regions gets set to zero in the thresholding step of Eq. (27b), as these regions will likely fall below the current thresholding parameter values τj, thereby destroying any information in these regions. To recover smoothly varying regions of the exit wave far from the edges that the forward-difference edge-detection scheme was designed to recover effectively, one likely needs to perform iterative separation of the sample transmission function and the probe, as is done in ptychography [6, 32], and to introduce additional information (constraints) into the optimization problem. For example, we can further constrain the sample transmission function by using additional thresholding to promote sparsity in some edge-detected representations of the sample transmission function (in addition to the sparsity promotion using forward differences of the exit wave used here), as well as by using index of refraction limits [33]. We can also further constrain the reconstruction of the probe function by using a white-field (sample-out) measurement, which is the modulus squared of the probe function numerically propagated to the area detector. Other regularization schemes and algorithmic modifications can be made as well, such as total variation denoising on the ∂jρ (using second differences, that is, curvature sparsity information), which promotes smoothness in an image while simultaneously maintaining edges [34].
The second region of interest, RS, corresponds to Voronoi polygon densities 256 ≤ p2 ≤ 12288. This “dark bluish to black” region in Fig. 7(a) is one where we consider recovery of the exit wave successful (example reconstructions not shown) and robust, in that virtually all starting points yield an accurate solution. One qualitative criterion for success is the following: after subpixel image registration and global phase-offset correction, the recovered exit waves for particular p2 values are virtually indistinguishable from the known ground truth exit waves upon visual inspection. A more quantitative criterion is the phase retrieval transfer function (PRTF) [35, 36]. The PRTF is the ratio of the azimuthally averaged modulus of the Fourier transform of the reconstructed exit wave to the azimuthally averaged square root of the diffraction measurement; it quantifies where the spatial frequency content of an exit wave reconstructed using phase retrieval is no longer reliably recovered. For a particular p2, we compute a single averaged reconstruction from the multiple trial runs (to average out remnant numerical artifacts from any particular run). All PRTFs computed in this way (plots not shown) lie between ≃ 0.7 and 1 at all spatial frequencies. In the literature, spatial frequency regions where the PRTF falls below ≃ 0.5 are typically considered unreliably recovered [35, 36]. Since the PRTFs here are well above this cutoff at all spatial frequencies, we can be confident these reconstructions are successful.
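The PRTF criterion can be sketched as follows (a minimal illustration, not the authors’ code; the one-pixel-wide annular binning about the array center and the helper names are our own assumptions):

```python
import numpy as np

def azavg(img):
    """Azimuthal average over one-pixel-wide annuli about the array center."""
    M, N = img.shape
    qm = np.arange(M) - M // 2
    qn = np.arange(N) - N // 2
    qr = np.hypot(*np.meshgrid(qm, qn, indexing="ij"))
    b = np.round(qr).astype(int).ravel()   # one-pixel-wide annular bins
    return np.bincount(b, weights=img.ravel()) / np.bincount(b)

def prtf(rho_rec, D):
    """PRTF: azimuthally averaged reconstructed Fourier modulus divided
    by the azimuthally averaged square root of the measured intensity."""
    F = np.fft.fftshift(np.fft.fft2(rho_rec))
    return azavg(np.abs(F)) / azavg(np.sqrt(D))

# A perfect reconstruction gives PRTF = 1 at every radial spatial frequency.
rng = np.random.default_rng(0)
rho = rng.random((32, 32)) + 1j * rng.random((32, 32))
D = np.abs(np.fft.fftshift(np.fft.fft2(rho))) ** 2
p = prtf(rho, D)
```

Values near 1 at all radial frequencies, as reported above, indicate reliably recovered content; regions falling below ≃ 0.5 would be flagged as unreliable.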
The third region of interest, RF, in Fig. 7(a) corresponds to exit waves that we consider unsuccessfully recovered by both criteria: visual comparison of the reconstructions to the ground truth exit waves, and the PRTF. These are Voronoi polygon densities p1 ∈ {16, 128, 256, 512} and 4096 ≤ p3. A result for p1 = 16, shown in Fig. 7(d), indicates that the reconstruction has very roughly recovered the correct locations of the edges but suffers greatly from other numerical artifacts common to phase retrieval, due to translational Fourier transform symmetries [17], and fails to recover any smoothly varying features of the probe function. A result for p1 = 128 is shown in Fig. 7(g); here, edge locations are adequately if imperfectly recovered, but smoothly varying features are unacceptably degraded compared with both the ground truth exit wave in Fig. 7(e) and the “borderline” successfully recovered exit wave in Fig. 7(f).
The almost exclusively “dark reddish” region in Fig. 7(a) for the p3 values listed forms a sharp boundary with the RS region discussed above, in contrast to the more gradual boundary that RS shares with RA. The RF region also differs from the unsuccessfully recovered p1 values in RA in that no recognizable edges are recovered, and the reconstructions look poor (similar to Fig. 2(c)). The reconstructions in the RF region indicate that the support implicitly generated while evaluating Eq. (27b) behaves as if it is no longer sparse enough; insufficient sparsity was likewise the cause of failure when using the approximate probe function diameter as the support (discussed in Sec. 2). Note, however, that for the p3 values in the RF region there remains some probability of a successful reconstruction, ranging from 4% for p3 = 12288 up to 96% for p3 = 4096. The nature of this boundary between “sparse enough” and “not sparse enough,” which determines probabilistic reconstruction success using Eq. (27), remains unclear and requires further study.
5.3. Noise, edge detection, and sparsity
We next discuss the effects of noise on sparsity and on the edge detection methods used in this paper. As before, we denote the noise-free diffraction intensity by D and the diffraction intensity with Poisson noise by Dn. The noisy measurement set ℳn and projection operator Πℳn are defined as in Eq. (8) and Eq. (9), with Dn in place of D. In Fig. 8(a), we compute the azimuthal average of the diffraction intensities using

$$\bar{D}(q_r) = \frac{1}{N_{q_r}} \sum_{q_m, q_n} A(q_r, q_\theta)\, D(q_m, q_n)$$

for each radial spatial frequency $q_r$, where $(q_r, q_\theta)$ are spatial frequency polar coordinates with $q_r = \sqrt{q_m^2 + q_n^2}$ and $q_\theta = \tan^{-1}(q_m/q_n)$, $A(q_r, q_\theta)$ is an M × N binary mask with ones within the annulus of width one pixel at radius $q_r$ and zeros elsewhere, and $N_{q_r}$ is the number of pixels within the annulus defined by the constant radial spatial frequency $q_r$, which in the continuous limit is the circumference $2\pi q_r$ of the circle. The azimuthal average of the noise-free diffraction intensity D is shown in Fig. 8(a) as the black curve, while the azimuthal average of the diffraction intensity with Poisson noise Dn is shown by the green curve; the signal at high spatial frequencies becomes lost in the noise at approximately qr ≃ 10²Δqr, where Δqr is the radial spatial frequency pixel size in units of inverse length.

To best understand the effect noise has on edge detection using the forward-difference convolution matrices in Eq. (4), we view edge detection as a spatial frequency filter in Fourier space. In Fig. 8(c) we show |ℱ[Ex]| = |Ẽx|; viewed in this way, forward-difference edge detection in the x direction is a spatial high-pass filter that preferentially emphasizes high spatial frequencies in the x direction while attenuating spatial frequency content in the y direction. The corresponding azimuthal average is shown as the black curve in Fig. 8(b). An example of a support generated by hard thresholding with an enforced sparsity ratio of 5% for the noise-free case is shown in Fig. 8(e), while the support generated for the noisy case, also with an enforced sparsity ratio of 5%, is shown in Fig. 8(f). The noise-degraded regions at high spatial frequencies (where the true diffraction intensity signal is essentially lost in the noise) are amplified, giving a highly noncompact support region that in turn looks noisy, with image pixels in the support randomly distributed and inconsistent with the known ground truth support shown in Fig. 8(e).
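Support generation by hard thresholding of an edge-detected representation can be sketched as follows (a toy illustration with an invented test image; the helper name and the sparsity ratio used here are ours, chosen so that the kept pixels match the toy’s edge count):

```python
import numpy as np

def sparse_support(arr, ratio):
    """Binary support keeping the `ratio` largest-modulus pixels (ties kept)."""
    k = max(1, int(ratio * arr.size))
    thresh = np.partition(np.abs(arr).ravel(), -k)[-k]
    return np.abs(arr) >= thresh

# Toy exit wave: a filled square on a flat background.
rho = np.zeros((64, 64))
rho[20:44, 20:44] = 1.0

# Forward-difference edge detection along x as a circular shift-and-subtract.
d_x = np.roll(rho, -1, axis=1) - rho

# Threshold to an enforced sparsity ratio; the support concentrates on
# the two vertical edges of the square.
S = sparse_support(d_x, ratio=0.01)
```

With a noise-free image the kept pixels coincide with the true edges; with noisy data, as discussed above, spurious high-frequency content can instead dominate the thresholding and scatter the support.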
To alleviate this amplification of noise-degraded regions at high spatial frequencies, we can use other types of edge detection besides forward differences. One well-known edge detection method with noise-suppressing properties is the Sobel filter
$$E'_x = \begin{bmatrix} -1 & 0 & 1 \\ -2 & 0 & 2 \\ -1 & 0 & 1 \end{bmatrix} = \mathbf{u}\,\mathbf{v}^{T},$$

where $\mathbf{u} = [1, 2, 1]^{T}$, $\mathbf{v} = [-1, 0, 1]^{T}$, and $\cdot^{T}$ is the transpose. The behavior of E′x in Fourier space is shown in Fig. 8(d), where |ℱ[E′x]| = |Ẽ′x|. Viewed in Fourier space, the Sobel filter acts as a band-pass filter, attenuating both low and high spatial frequencies (which are degraded by the noise) in the y direction. The corresponding azimuthal average is shown as the green curve in Fig. 8(b); the attenuation of the noise-degraded regions at high spatial frequencies qr ≥ 10²Δqr is apparent. An example of a support generated by hard thresholding with an enforced sparsity ratio of 5% for the noisy case is shown in Fig. 8(g). The support generated using a Sobel filter for edge detection is much more compact and retains features of the known ground truth in Fig. 8(e) better than the support obtained with forward differences.

We conclude that when noise is present in a diffraction measurement and significantly degrades particular spatial frequency content of the measurement, care must be taken to choose an appropriate edge detection method to promote sparsity. The algorithm in Eq. (27) can readily be modified to threshold a sparse representation of the exit wave ρ obtained with any type of edge detection, for example explicitly defined Fourier-space filters like those discussed here, as well as more sophisticated methods that combine image denoising techniques with wavelet [37], shearlet [38], or curvelet [39] transforms of the exit wave ρ. Further benchmarking with these other edge detection methods, promoting sparsity through thresholding both the modulus of the sparse representation of ρ and its phase [30], should be explored but is beyond the scope of this paper.
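The Fourier-space contrast between forward differences and the Sobel filter can be checked numerically (a sketch, not the paper’s implementation; the array-axis convention, with axis 0 as y and axis 1 as x, is our own):

```python
import numpy as np

def transfer_modulus(kernel, N=64):
    """|F[E]| for a small convolution kernel embedded in an N x N array."""
    pad = np.zeros((N, N))
    kh, kw = kernel.shape
    pad[:kh, :kw] = kernel
    return np.abs(np.fft.fftshift(np.fft.fft2(pad)))

Ex = np.array([[-1.0, 1.0]])            # forward difference along x
u = np.array([[1.0], [2.0], [1.0]])     # smoothing along y
v = np.array([[-1.0, 0.0, 1.0]])        # central difference along x
Ex_sobel = u @ v                        # separable 3x3 Sobel kernel

N = 64
T_fd = transfer_modulus(Ex, N)
T_sob = transfer_modulus(Ex_sobel, N)

# Forward differences keep full response (modulus 2) at the Nyquist
# frequency in x (high-pass), whereas the Sobel response vanishes at
# the Nyquist frequency in both x and y (band-pass), suppressing the
# noise-dominated highest spatial frequencies.
```

After `fftshift`, the Nyquist frequency sits at index 0 along each axis, so the suppressed rows and columns can be read off the transfer-function arrays directly.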
6. Conclusions
Developing innovative imaging methods that utilize phase retrieval is especially crucial and timely given the current planning and commissioning of high-brightness coherent x-ray sources across the world. For x-ray free electron lasers in particular, the experimental arrangement explored in this work will be highly convenient and natural to use. New methods such as those presented here will help unlock the full potential of these sources by allowing robust phase retrieval under experimental conditions currently thought to be infeasible or too difficult to work with.
We show a way forward for robust single-view CDI of extended samples probed with plane wave illumination using phase retrieval. This case, which is particularly relevant to flash imaging at x-ray free-electron laser sources and to the study of materials and life sciences specimens when scanning approaches (ptychography) are impractical, has previously been considered intractable. Our method exploits edge detection to arrive at a sparse representation of the sample exit wave (one with far fewer dominant pixels) and uses thresholding of the modulus of this sparse representation to implicitly generate a sparse support. This sparse support is much more highly constraining than a support on the original non-sparse exit wave, which is the essence of the method: we use phase retrieval to solve for the simpler, sparser representation, not the original non-sparse exit wave.
We derive an algorithm based on ADMM to solve for the sparse edge-detected representations of the exit wave, and we benchmark it on test problems with varying support sparsity (dominant nonzero pixels) in the edge-detected representation. We identify characteristics of the test problems for which the method succeeds and fails, as well as a borderline region where success is ambiguous. This ambiguous region corresponds to exit waves whose edges have been successfully recovered but whose smoothly varying regions away from edges have not. We then examine the effects of Poisson noise when performing edge detection with the forward-differences approach and discuss how Poisson noise that dominates at high spatial frequencies adversely affects the use of thresholding to define an effective support on the edge-detected representations. Finally, we discuss a simple example of noise-suppressing edge detection using a Sobel filter, as well as other prospects for performing edge detection in the presence of noise to generate sparse supports on the edge-detected representations.
The key to this work is applying simple mathematical transformations to drastically simplify how the relevant information content in an image is represented. This can enable phase retrieval in cases previously thought to be intractable. We anticipate that this general approach to constraining and regularizing the optimization problem can be extended with other sparsity-inducing transformations, such as wavelets, curvelets, and shearlets. In addition to enabling imaging of extended samples by single-view CDI, we expect that incorporating this concept into multiple-view (e.g., ptychographic) algorithms will be beneficial by supplementing ptychographic overlap constraints with a sparse support constraint on the edge-detected representation of the current view.
Funding
This material is based upon work supported by the U.S. Department of Energy, Office of Science, Offices of Advanced Scientific Computing Research and Basic Energy Sciences under Contract No. DE-AC02-06CH11357.
Acknowledgments
The authors are grateful to Sven Leyffer for insightful discussions.
References and links
1. D. Paganin, Coherent X-ray Optics (Oxford University, 2006). [CrossRef]
2. Y. S. G. Nashed, D. J. Vine, T. Peterka, J. Deng, R. Ross, and C. Jacobsen, “Parallel ptychographic reconstruction,” Opt. Express 22, 32082–32097 (2014). [CrossRef]
3. J. Miao, P. Charalambous, J. Kirz, and D. Sayre, “Extending the methodology of X-ray crystallography to allow imaging of micrometre-sized non-crystalline specimens,” Nature 400, 342–344 (1999). [CrossRef]
4. S. Marchesini, H. He, H. N. Chapman, S. P. Hau-Riege, A. Noy, M. R. Howells, U. Weierstall, and J. C. H. Spence, “X-ray image reconstruction from a diffraction pattern alone,” Phys. Rev. B 68, 140101 (2003). [CrossRef]
5. H. M. L. Faulkner and J. M. Rodenburg, “Movable aperture lensless transmission microscopy: A novel phase retrieval algorithm,” Phys. Rev. Lett. 93, 023903 (2004). [CrossRef]
6. P. Thibault, M. Dierolf, A. Menzel, O. Bunk, C. David, and F. Pfeiffer, “High-resolution scanning X-ray diffraction microscopy,” Science 321, 379–382 (2008). [CrossRef]
7. P. Thibault and A. Menzel, “Reconstructing state mixtures from diffraction measurements,” Nature 494, 68–71 (2013). [CrossRef]
8. S. Marathe, S. S. Kim, S. N. Kim, C. Kim, H. C. Kang, P. V. Nickles, and D. Y. Noh, “Coherent diffraction surface imaging in reflection geometry,” Opt. Express 18, 7253–7262 (2010). [CrossRef]
9. M. A. Pfeifer, G. J. Williams, I. A. Vartanyants, R. Harder, and I. K. Robinson, “Three-dimensional mapping of a deformation field inside a nanocrystal,” Nature 442, 63–66 (2006). [CrossRef]
10. J. R. Fienup, “Phase retrieval algorithms: a comparison,” Appl. Opt. 21, 2758–2769 (1982). [CrossRef]
11. S. Marchesini, “Ab initio compressive phase retrieval,” arXiv:0809.2006 (2008).
12. M. L. Moravec, J. K. Romberg, and R. G. Baraniuk, “Compressive phase retrieval,” Proc. SPIE 6701, 670120 (2007). [CrossRef]
13. R. Neutze, R. Wouts, D. van der Spoel, E. Weckert, and J. Hajdu, “Potential for biomolecular imaging with femtosecond x-ray pulses,” Nature 406, 752–757 (2000). [CrossRef]
14. B. Abbey, K. Nugent, G. Williams, J. Clark, A. Peele, M. Pfeifer, M. de Jonge, and I. McNulty, “Keyhole coherent diffractive imaging,” Nat. Phys. 4, 394–398 (2008). [CrossRef]
15. E. Candès and M. Wakin, “An introduction to compressive sampling,” IEEE Signal Proc. Mag. 25, 21–30 (2008). [CrossRef]
16. X. Huang, J. Nelson, J. Steinbrener, J. Kirz, J. J. Turner, and C. Jacobsen, “Incorrect support and missing center tolerances of phasing algorithms,” Opt. Express 18, 26441–26449 (2010). [CrossRef]
17. A. Tripathi, S. Leyffer, T. Munson, and S. M. Wild, “Visualizing and improving the robustness of phase retrieval algorithms,” Procedia Comput. Sci. 51, 815–824 (2015). [CrossRef]
18. S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Found. Trends Mach. Learn. 3, 1–122 (2011). [CrossRef]
19. E. J. Candès, M. B. Wakin, and S. P. Boyd, “Enhancing sparsity by reweighted ℓ1 minimization,” J. Fourier Anal. Appl. 14, 877–905 (2008). [CrossRef]
20. T. Blumensath and M. E. Davies, “Iterative thresholding for sparse approximations,” J. Fourier Anal. Appl. 14, 629–654 (2008). [CrossRef]
21. F. Bach, R. Jenatton, J. Mairal, and G. Obozinski, “Optimization with sparsity-inducing penalties,” Found. Trends Mach. Learn. 4, 1–106 (2011). [CrossRef]
22. R. H. Chan, T. F. Chan, L. Shen, and Z. Shen, “Wavelet algorithms for high-resolution image reconstruction,” SIAM J. Sci. Comput. 24, 1408–1432 (2003). [CrossRef]
23. K. Herrity, A. Gilbert, and J. Tropp, “Sparse approximation via iterative thresholding,” in Proceedings of IEEE International Conference on Acoustics, Speech and Signal Processing (IEEE, 2006), pp. 624–627.
24. Z. Yang, C. Zhang, and L. Xie, “Robust compressive phase retrieval via L1 minimization with application to image reconstruction,” arXiv:1302.0081 (2013).
25. L. Sorber, M. Barel, and L. Lathauwer, “Unconstrained optimization of real functions in complex variables,” SIAM J. Optim. 22, 879–898 (2012). [CrossRef]
26. J. J. Moré and S. M. Wild, “Estimating derivatives of noisy simulations,” ACM Trans. Math. Software 38, 1–21 (2012). [CrossRef]
27. X. Zhang, Y. Lu, and T. Chan, “A novel sparsity reconstruction method from Poisson data for 3D bioluminescence tomography,” J. Sci. Comput. 50, 519–535 (2011). [CrossRef]
28. B. Dong and Y. Zhang, “An efficient algorithm for ℓ0 minimization in wavelet frame based image restoration,” J. Sci. Comput. 54, 350–368 (2012). [CrossRef]
29. Y. Zhang, B. Dong, and Z. Lu, “ℓ0 minimization for wavelet frame based image restoration,” Math. Comput. 82, 995–1015 (2013). [CrossRef]
30. A. Pein, S. Loock, G. Plonka, and T. Salditt, “Using sparsity information for iterative phase retrieval in x-ray propagation imaging,” Opt. Express 24, 8332–8343 (2016). [CrossRef]
31. R. Fan, Q. Wan, F. Wen, H. Chen, and Y. Liu, “Iterative projection approach for phase retrieval of semi-sparse wave field,” EURASIP J. Adv. Signal Process. 2014, 1–13 (2014). [CrossRef]
32. A. M. Maiden and J. M. Rodenburg, “An improved ptychographical phase retrieval algorithm for diffractive imaging,” Ultramicroscopy 109, 1256–1262 (2009). [CrossRef]
33. J. N. Clark, C. T. Putkunz, M. A. Pfeifer, A. G. Peele, G. J. Williams, B. Chen, K. A. Nugent, C. Hall, W. Fullagar, S. Kim, and I. McNulty, “Use of a complex constraint in coherent diffractive imaging,” Opt. Express 18, 1981–1993 (2010). [CrossRef]
34. L. I. Rudin, S. Osher, and E. Fatemi, “Nonlinear total variation based noise removal algorithms,” Physica D 60, 259–268 (1992). [CrossRef]
35. H. N. Chapman, A. Barty, S. Marchesini, A. Noy, S. P. Hau-Riege, C. Cui, M. R. Howells, R. Rosen, H. He, J. C. H. Spence, U. Weierstall, T. Beetz, C. Jacobsen, and D. Shapiro, “High-resolution ab initio three-dimensional x-ray diffraction microscopy,” J. Opt. Soc. Am. A 23, 1179–1200 (2006). [CrossRef]
36. J. Steinbrener, J. Nelson, X. Huang, S. Marchesini, D. Shapiro, J. J. Turner, and C. Jacobsen, “Data preparation and evaluation techniques for x-ray diffraction microscopy,” Opt. Express 18, 18598–18614 (2010). [CrossRef]
37. D. Lazzaro and L. Montefusco, “Edge-preserving wavelet thresholding for image denoising,” J. Comput. Appl. Math. 210, 222–231 (2007). [CrossRef]
38. S. Yi, D. Labate, G. R. Easley, and H. Krim, “A shearlet approach to edge analysis and detection,” IEEE Trans. Image Process. 18, 929–941 (2009). [CrossRef]
39. J.-L. Starck, E. J. Candès, and D. L. Donoho, “The curvelet transform for image denoising,” IEEE Trans. Image Process. 11, 670–684 (2002). [CrossRef]