Advanced denoising for X-ray ptychography

Huibin Chang; Pablo Enfedaque; Jie Zhang; Juliane Reinhardt; Bjoern Enders; Young-Sang Yu; David Shapiro; Christian G. Schroer; Tieyong Zeng; Stefano Marchesini

doi:10.1364/OE.27.010395

1. Introduction

Ptychography [1–4] has become an increasingly popular imaging technique, and it is used nowadays in scientific fields as diverse as condensed matter physics [5], cell biology [6], materials science [7, 8], and electronics [4, 9], among other areas. Compared to standard lens-based microscopy, the resolution in ptychography is not limited by the size of the illumination probe, but by the wavelength and number of photons used in an experiment. Unfortunately, multiple experimental challenges have to be tackled to achieve a high-quality reconstruction from a ptychographic experiment in practice. Complex experimental systems require nanometer stability while collecting large amounts of diffraction data, which can require several hours depending on the experimental setup. Data can be then contaminated by structured parasitic scattering [10–12], detector read-out noise, Poisson noise derived from the photon counting, and different types of outliers. Additionally, the characterization of the illumination source is also commonly considered as a joint problem to the object reconstruction, known as Blind Ptychography (BP) [3,13–15].

During the last decades, researchers have developed several schemes to solve the BP problem. Arguably, the most popular ones are extended Ptychographic Iterative Engine (ePIE) [3], Difference Map [13, 16], Maximum Likelihood (ML) method [17], Proximal Splitting algorithm [14], Relaxed Averaged Alternating Reflections (RAAR [18]) based algorithms [19], and generalized Alternating Direction Method of Multipliers (ADMM) [20–22]) based BP [15]. Although some implementations of those methods present ad hoc solutions for structural noise removal, there is no formal analysis or in-depth characterization of such experimental problems, even though they can be critical to achieve robust high-resolution images, specially from weakly scattering or low contrast specimens.

Experimental noise overview

Direct reconstructions based on raw experimental data often contain visible artifacts. They commonly derive from outliers and structured and randomly distributed uncorrelated noise sources. Below there is a characterization of the main causes of such anomalies.

Structured noise

(1) Parasitic scattering: Derives from the scattering from any element along the beam path other than the sample and the optical elements desired harmonic order e.g.: slits, pinholes, lens imperfections, harmonic contamination, air, gas, etc. We refer to this as the background.
(2) Saturation: Occurs when the flux exceeds the detector capacity, the range of the Analog to Digital Converter (ADC) or the maximum photon counting rate.
(3) Dark noise: Produced by the detector dark current, light from position encoders and interferometers, or light from the environment. It differs from the parasitic scattering as it occurs also when the x-ray beam is off and can be measured in advance by turning off the beam and (commonly referred to as dark frame). The constant component can be subtracted from the measured signal before applying a reconstruction solver or alternatively it can be incorporated in the generalized background.

Random noise

(4) Photon-counting noise: Caused by the quantization of the wavefront at the detector.
(5) Read-out and detector noise: Derives from electronic interference when the ADC converts the charge distribution to a digital signal and the random component of the dark current. The interference and parallel ADC read-out electronics can produce ringing and correlation in the digital signal.

Outliers

(6) Bad frames: Glitches on the functioning of mechanical components inside an instrument such as shutter timing can cause some measured frames to be corrupted.
(7) Cosmic rays: High energy cosmic particles that penetrate the atmosphere, the building and the instrument enclosure and hit the detector. Although they are rare, those types of signals normally present very high charge.
(8) Bad pixels: Result from fabrication errors may cause dead pixels on a detector, these can be typically measured in advanced and masked out during the reconstruction process.

In this paper we focus on a solution to address (1), (6) and (7), regarding structured noise and outliers, and also propose a technique to deal with a combination of different random noise distributions coming from (4) and (5). The noise from (3) and (8) can be measured in advance, and it is commonly subtracted. Saturation noise (2) can be equally identified and addressed by masking out the saturated signal areas and it is outside the focus of this paper. Random noise sources (4) and (5) have been considered in the literature [15,17,23,24], but there is no solution that can address the combination of two or more random noise distributions. In this paper, we propose a mixed-noise term for ptychography to effectively deal with any combination of Poisson and Gaussian noise sources.

Some of the previous pathologies can be identified inspecting the measured data. An illustrative example of parasitic noise (1) is reported in Fig. 1. A single measured diffraction pattern (Fig. 1(a)) presents a distinct reverse “L-Shape” artifact, which consistently appears on all other diffraction measurements from the same dataset. If parasitic scattering is not considered, a standard ptychography reconstruction produces very noisy and low contrast images (Figs. 1(d) and 1(e)). The phase part is lost in some areas, concealed by the biased background of the recovered phase, and the resulted intensities (Fig. 1(b)) contain a slight alteration of the parasitic noise from the measured data. (A more detailed comparison between different methods, with and without structured noise correction, is provided in section 3)

Fig. 1 Reconstruction results using RAAR (implemented in SHARP [19]) and the proposed algorithm. The dataset corresponds to one of the 3D ptychography experiments from the results presented in [8]. (a) Measured intensities from a single diffraction pattern. Recovered intensities (b), object amplitude (amp.) (d) and phase (e), respectively, using RAAR without structured noise correction. Recovered intensities (c), object amplitude (f) and phase (g), respectively, using the proposed algorithm (alg.). The figures in the first row are shown in scale (·)^0.05 following the colorbar shown on the top right corner; the figures in the second row are shown in linear scale following the colorbars on the left corner (amplitude) and right corner (phase).

Download Full Size | PDF

Related work

A beam stop scheme was employed in [11] to reduce the parasitic noise using intensities from two raster scans (i.e. scanning on a Cartesian 2D grid) with and/or without beamstop. However, the artifacts caused by the structured noise are still noticeable in Fig. 2 of [11], and some fine features still appear undefined. A multiple-mode approach in [25] was proposed to reduce the background. In [12], the data was pre-processed by subtracting an adjustable scalar value multiplied by the dark frame. A method to remove the parasitic background automatically employed a gradient descent method with adaptive stepsizes was proposed in [26] and implemented using a distributed multi-GPU implementation [19] in a real-time streaming production code at a user facility [27].

Fig. 2 Simulation experiment. First row: (a)–(b) Amplitude and phase for true illumination (illu., 64 × 64 pixels), with Full Width at Half Max (FWHM) = 7 pixels. Second row: Amplitude of true image (c) with 496 × 496 pixels, and amplitude of recovered images with data contaminated by only parasitic noise, using SHARP (d), SHARP-B (e) and ADP (f). Third row: Amplitude of recovered images with data contaminated as in Eq. (32) and also by additional outliers, using SHARP (g), SHARP-B (h), ADP (i), and ADPr (j). The fourth, and fifth rows show the corresponding phase parts of the images in the second and third rows.

Download Full Size | PDF

Several researchers have also studied techniques to remove the random noise from photon-counting or read-out noises by employing a sparse regularization technique to further improve the image quality for conventional ptychography,. e.g. Tikhonov regularization with nonlinear conjugate gradient method [17], total variation regularization with ADMM [28] and dictionary learning method with proximal algorithm [29]. Specially, for the more practical noises, like a mix of Poisson and Gaussian noises, several variational methods have been proposed for linear inverse problems, e.g. generalized Anscombe transform [30], total variation based Shift-Poisson method [31] and weighted least squares method [32]. However, such methods have not been applied to the conventional ptychography yet, which is essentially a nonlinear inverse problem. Beyond algorithmic methods, there are also several contributions from beamline scientists attenuating the background problem via hardware solutions [10,33].

Motivations and contributions

In a ptychography experimental setup it is impossible to isolate parasitic noise from the measured data. Furthermore, the data is often contaminated by outliers and an additional mix of random noise types. In this context, an automatic denoising algorithm with convergence guarantee to jointly remove both structured and random noise can be very valuable. In this paper, we propose ADMM Denoising for Phase retrieval (ADP), a new algorithm that addresses all the main sources of experimental noise in a blind X-ray ptychography experimental setting. The proposed algorithm achieves similar convergence speed as state-of-the-art blind ptychography methods and it is constructed on the framework presented in [15]. Even if additional hardware or pre-processing techniques are in place, ADP can be used to further enhance the reconstruction quality of the retrieved object. The main contributions of this work are listed below:

A new algorithm for automatic structured and random mixed noise removal with outliers correction. The algorithm is based on the forward physical model of the parasitic scattering noise (additive frame-invariant background) and on the maximum a posteriori (MAP) estimation combined with the shift-Poisson method [31] for mixed types of noise. By assuming the piecewise smoothness of sample patches, We propose an additional framewise regularization term to further enhance the denoising process and improve the image quality, instead of using a simple sparsity term of the sample [28]. To the best of our knowledge, it is the first time an iterative algorithm with rigorous analysis for background removal in ptychographic imaging is proposed.
By formulating the parasitic noise variable with positive constraint as a quadratic term, fast ADMM algorithms are designed for the proposed model, with and without regularization. The convergence guarantee of the method is proved under very mild conditions. Computationally, each subproblem can be solved very efficiently, using pointwise operations or simple linear algebra solvers, requiring few arithmetic operations per iteration and exposing high parallelism.
Multiple numerical experiments on both simulated and experimental data from different light sources demonstrate the enhanced quality of the proposed algorithm. Reconstruction results using the proposed algorithm can be found in Fig. 1 and in the experimental results section.

2. Proposed iterative algorithm

Mathematical formula for standard ptychography In a standard ptychography experiment, a localized coherent X-ray probe (or illumination) ω scans through an image (or sample) u, while the detector collects a sequence of phaseless intensities in the far field. Throughout this paper, we consider the following discrete setting:

The variable u ∈ ℂⁿ (ℂⁿ is the complex Euclidean space) corresponds to a 2D sample with $\sqrt{n} \times \sqrt{n}$ pixels, and ω ∈ ℂ^m̄ is a localized 2D probe with $\sqrt{\bar{m}} \times \sqrt{\bar{m}}$ pixels, both u and ω written as a vector by a lexicographical order. A stack of phaseless measurements ${\tilde{I}}_{j} \in ℝ_{+}^{\bar{m}} \forall 0 \leq j \leq J - 1$ ( $ℝ_{+}^{\bar{m}}$ is the Euclidean space with non-negative elements) is collected with Ĩ_j = |ℱ(ω ○ 𝒮_ju)|², where | · | represents the element-wise absolute value of a vector, ○ denotes the element-wise multiplication, and ℱ denotes the normalized discrete Fourier transformation. 𝒮_j ∈ ℝ^m̄×n is a binary matrix that specifies a small window with the index j and size m̄ over the entire sample u. The BP problem can then be expressed as follows:

To find ω \in ℂ^{\bar{m}} and u \in ℂ^{n}, such that {| 𝒜 (ω, u) |}^{2} = \tilde{I},

where bilinear operators 𝒜 : ℂ^m̄ × ℂⁿ → ℂ^m and 𝒜_j : ℂ^m̄ × ℂⁿ → ℂ^m̄ ∀0 ≤ j ≤ J − 1, are denoted as:

𝒜 (ω, u) : = {(𝒜_{0}^{T} (ω, u), 𝒜_{1}^{T} (ω, u), \dots, 𝒜_{J - 1}^{T} (ω, u))}^{T}

, 𝒜_j(ω, u) := ℱ(ω ○ 𝒮_ju), and

\tilde{I} : = {({\tilde{I}}_{0}^{T}, {\tilde{I}}_{1}^{T}, \dots, {\tilde{I}}_{J - 1}^{T})}^{T} \in ℝ_{+}^{m}

(m = J × m̄).

2.1. Proposed model

Experimentally, not all of the parasitic scattering (background) is the same on each frame (here, “frame” means one image from the CCD detector). However, the frame-invariant component is one of the key sources of the artifacts in standard reconstruction results due to the redundancy of ptychography [11,12]. Essentially, it is an additive components of signals, apart from those originating from the sample and the beam hitting it. Hence, we start by assuming that the background of each frame is approximately unchanged [26], i.e., it is frame-invariant. It is important to note that parasitic noise can also be interpreted as an independent structured noise source per frame. The frame-invariant approximation is critical for proper modeling and the problem is actually not well defined if this approximation is not considered. This is due to the fact that the independent noise per frame alternative has a trivial solution where the background is equal to the residual between the measured and reconstructed intensities from the iterative solution of the sample and probe. The frame-invariant approximation is, besides simple, very efficient at removing structural noise to produce high-contrast recovery images.

Formally, the true intensity data Ĩ is contaminated by non-negative parasitic structured noise $\hat{ϕ} \in ℝ_{+}^{\bar{m}}$ , such that the measured data $\hat{I} = {({\hat{I}}_{0}^{T}, {\hat{I}}_{1}^{T}, \dots, {\hat{I}}_{J - 1}^{T})}^{T} \in ℝ_{+}^{m}$ with ${\hat{I}}_{j} \in ℝ_{+}^{\bar{m}}$ , ∀0 ≤ j ≤ J − 1 is recorded for each frame as follows:

0 \leq {\hat{I}}_{j} = {\tilde{I}}_{j} + \hat{ϕ} \forall 0 \leq j \leq J - 1 .

In practice, the data is contaminated by noise as

{\hat{I}}_{j} = Noi ({\tilde{I}}_{j} + \hat{ϕ}) \forall 0 \leq j \leq J - 1,

where Noi denotes the degrading operator caused by different noise sources. Without loss of generality, we consider the Poisson and Gaussian mixed noise model with parasitic noise as:

{\hat{I}}_{j} (t) \overset{i . i . d}{~} Poisson ({| 𝒜_{j} (ω, u) (t) |}^{2} + \hat{ϕ} (t)) + n_{j} (t),

where i.i.d. is short for “Independent and Identically Distributed”, n_j denotes the white Gaussian noise with zero mean and variance σ > 0. It is not difficult to verify that var(Î_j(t) + σ²) = var(Î_j(t)) = |𝒜_j(ω, u)(t)|² + ϕ̂ + σ², and mean(Î_j(t) + σ²) = |𝒜_j(ω, u)(t)|² + ϕ̂ + σ². Hence, I_j(t) := Î_j(t) + σ² has the same variance and mean. By further estimating I_j by a Poisson distribution with the shift-Poisson technique [31] and using KL divergence derived by maximum likelihood estimation of Poisson noise, the following optimization problem is derived:

min_{ω, u, ϕ} \frac{1}{2} \sum_{j} 〈 1_{\bar{m}}, {| 𝒜_{j} (ω, u) |}^{2} + ϕ - I_{j} \circ log ({| 𝒜_{j} (ω, u) |}^{2} + ϕ) 〉 + 𝕀_{ℰ} (ϕ),

where 1_m̄ is a vector of all ones, ϕ := ϕ̂ + σ², 〈·, ·〉 denotes the inner product as

〈 v_{1}, v_{2} 〉 : = \sum_{j = 0}^{\bar{m}} v_{1} (t) v_{2} (t) \forall v_{1}, v_{2} \in ℝ^{\bar{m}}

, and positivity constraint set is defined as

ℰ : = {ϕ \in ℝ_{+}^{\bar{m}} : ϕ \geq 0}

. Here, the indicator function 𝕀_ℰ is defined as: 𝕀_ℰ(ϕ) = 0, if ϕ ∈ ℰ; 𝕀_ℰ(ϕ) = +∞, otherwise.

Remark 2.1. In order to jointly estimate the background and illumination (double-blind), based on Eq. (2), a variant optimization model with least squares fitting can be established regardless of the specific noise mechanism as

min_{ω, u, ϕ} \frac{1}{2} \sum_{j = 0}^{J - 1} {‖ {| 𝒜_{j} (ω, u) |}^{2} + ϕ - I_{j} ‖}^{2} + 𝕀_{ℰ} (ϕ) .

However, when this metric is used to measure the distance between the collected and recovered intensities, the corresponding first-order iterative algorithm is significantly slow [15]. This is because the residual slowly decreases as the iterations go, and the subproblem with respect to the auxiliary variable is related with quartic equations, which are solved with high computational cost [34].

In order to deal with the misfit caused by experimental outliers, a weight vector ${(C_{0}^{T}, C_{1}^{T}, \dots, C_{J - 1}^{T})}^{T} \in ℝ^{m}$ is introduced to improve the fitting residual. To further handle the noise or other artifacts, we consider the total variation regularization on the split patches of the sample, instead of the regularization of the entire sample in [28]. As a result, we propose the following Kullback-Leibler (KL) divergence-based problem with framewise sparsity and parasitic noise retrieval:

\begin{array}{l} min_{ω, u, ϕ} \frac{1}{2} \sum_{j} 〈 C_{j}, {| 𝒜_{j} (ω, u) |}^{2} + ϕ + ε^{2} 1_{\bar{m}} - (I_{j} + ε^{2} 1_{\bar{m}}) \circ log ({| 𝒜_{j} (ω, u) |}^{2} + ϕ + ε^{2} 1_{\bar{m}}) 〉 \\ + λ \sum_{j} TV (𝒮_{j} u) + 𝕀_{ℰ} (ϕ), \end{array}

with a penalization parameter ε > 0, where TV denotes the total variation and λ is a positive parameter to balance the regularization and data fitting terms.

2.2. ADMM denoising for phase retrieval (ADP)

Generally speaking, the ADMM is a variant of the augmented Lagrangian method [20,21] that uses partial updates for the dual variable, and each subproblem can be easily solved compared with the original optimization problem. Due to its scalability and flexibility, it has been a popular algorithm for large-scale optimization problems arising in computer vision, statistics, machine learning, and other related areas, and see more details in the review paper [22] and references therein. Since the proposed model is non-convex and non-smooth, ADMM will be adopted to solve it, which allows for bigger stepsizes by avoiding directly calculating the gradient of the objective functional and therefore has fast convergence with low computation cost per iteration.

We formulate below the proposed ADP algorithm to solve Eq. (7). First, we consider the problem without regularization by setting λ = 0. If directly following [35,36], the subproblem with respect to the variable ϕ does not have a closed form solution:

min_{ω, u, ϕ} \frac{1}{2} \sum_{j} 〈 C_{j}, {| 𝒜_{j} (ω, u) |}^{2} + ϕ + ε^{2} 1_{\bar{m}} - (I_{j} + ε^{2} 1_{\bar{m}}) \circ log ({| 𝒜_{j} (ω, u) |}^{2} + ϕ + ε^{2} 1_{\bar{m}}) 〉 + 𝕀_{ℰ} (ϕ) .

A gradient descent or proximal type algorithm reported in [15] to directly solve this problem presents very slow convergence. In order to develop a more efficient algorithm for each subproblem, we first rewrite Eq. (7) with λ = 0 as

min_{ω, u, \tilde{ϕ}} \frac{1}{2} \sum_{j} 〈 C_{j}, {| 𝒜_{j} (ω, u) |}^{2} + {\tilde{ϕ}}^{2} + ε^{2} 1_{\bar{m}} - (I_{j} + ε^{2} 1_{\bar{m}}) \circ log ({| 𝒜_{j} (ω, u) |}^{2} + {\tilde{ϕ}}^{2} + ε^{2} 1_{\bar{m}}) 〉,

where

\tilde{ϕ} : = \sqrt{ϕ}

. Note that the constraint for ϕ is automatically removed in Eq. (9). Also, the variables 𝒜_j(ω, u) and ϕ̃ play now the same role in the objective function and adding auxiliary variables can derive a more efficient algorithm than directly solving the previous problem.

Introducing the auxiliary variables z_j = 𝒜_j(ω, u) and μ_j = ϕ̃ ∀0 ≤ j ≤ J − 1, produces the following equivalent constraint optimization problem:

min_{ω, u, z, μ, \tilde{ϕ}} 𝒢_{ε} (z, μ), such that z_{j} - 𝒜_{j} (ω, u) = 0, μ_{j} - \tilde{ϕ} = 0 \forall 0 \leq j \leq J - 1,

with

𝒢_{ε} (z, μ) : = \frac{1}{2} \sum_{j} 〈 C_{j}, {| z_{j} |}^{2} + μ_{j}^{2} + ε^{2} 1_{\bar{m}} - (I_{j} + ε^{2} 1_{\bar{m}}) \circ log ({| z_{j} |}^{2} + μ_{j}^{2} + ε^{2} 1_{\bar{m}}) 〉 .

The corresponding augmented Lagrangian for Eq. (10) reads:

\begin{array}{l} ℒ (ω, u, z, μ, \tilde{ϕ}, Λ_{1}, Λ_{2}) : = 𝒢_{ε} (z, μ) + r \sum_{j} ℜ 〈 Λ_{1, j}, z_{j} - 𝒜_{j} (ω, u) 〉 \\ + \frac{r}{2} \sum_{j} {‖ z_{j} - 𝒜_{j} (ω, u) ‖}^{2} + r \sum_{j} 〈 Λ_{2, j}, μ_{j} - \tilde{ϕ} 〉 + \frac{r}{2} \sum_{j} {‖ μ_{j} - \tilde{ϕ} ‖}^{2}, \end{array}

where ℜ denotes the real part of the complex-valued number. The proposed ADP algorithm is formulated as follows:

{\begin{array}{l} ω^{k + 1} & = \underset{ω}{arg min} ℒ (ω, u^{k}, z^{k}, μ^{k}, {\tilde{ϕ}}^{k}, Λ_{1}^{k}, Λ_{2}^{k}) + \frac{α_{1}}{2} {‖ ω - ω^{k} ‖}^{2}; \\ u^{k + 1} & = \underset{u}{arg min} ℒ (ω^{k + 1}, u, z^{k}, μ^{k}, {\tilde{ϕ}}^{k}, Λ_{1}^{k}, Λ_{2}^{k}) + \frac{α_{2}}{2} {‖ u - u^{k} ‖}^{2}; \\ (z^{k + 1}, μ^{k + 1}) & = \underset{z, μ}{arg min} ℒ (ω^{k + 1}, u^{k + 1}, z, μ, {\tilde{ϕ}}^{k}, Λ_{1}^{k}, Λ_{2}^{k}); \\ {\tilde{ϕ}}^{k + 1} & = \underset{\tilde{ϕ}}{arg min} ℒ (ω^{k + 1}, u^{k + 1}, z^{k + 1}, μ^{k + 1}, \tilde{ϕ}, Λ_{1}^{k}, Λ_{2}^{k}); \\ Λ_{1}^{k + 1} & = Λ_{1}^{k} + z^{k + 1} - 𝒜 (ω^{k + 1}, u^{k + 1}); \\ Λ_{2, j}^{k + 1} & = Λ_{2, j}^{k} + μ_{j}^{k + 1} - {\tilde{ϕ}}^{k + 1} \forall 0 \leq j \leq J - 1, \end{array}

with

Λ_{1} : = {(Λ_{1, 0}^{T}, Λ_{1, 1}^{T}, \dots, Λ_{1, J - 1}^{T})}^{T} \in ℂ^{m}

, and

Λ_{2} : = {(Λ_{2, 0}^{T}, Λ_{2, 1}^{T}, \dots, Λ_{2, J - 1}^{T})}^{T} \in ℝ^{m}

.

The solution to each subproblems above are reported in Algorithm 1, and further details can be found in the Appendix B. When setting $μ_{j}^{0} = {\tilde{ϕ}}^{0} = 0$ , and $Λ_{2}^{0} = 0$ , following the above algorithm we have that $μ_{j}^{k} = 0 \forall k = 1, 2, \dots$ , since μ^★ = 0 is a stationary point of the proposed model Eq. (7). Therefore, we use the following warm-start scheme to initialize μ: first solve a problem without structured noise correction by setting $μ_{j}^{0} = 0$ , and after J₀ iterations reset $μ_{j}^{k}$ using $μ_{j}^{k + 1} : = \sqrt{max {0, \frac{1}{J} \sum_{j} (I_{j} - {| 𝒜 (ω^{k + 1}, u^{k + 1}) |}^{2})}}$ . The weight function {C_j} can be simply fixed to be one. It can also be dynamically changed if the measured data contains outliers, such that the ADP algorithm is slightly modified with additional update of the weight function between Step 3 and Step 4, and please see more details for update scheme of this function in Remark 2.2.

Algorithm 1:. ADP

View Table | View all tables in this article

Convergence analysis

We assess the convergence of ADP for the proposed model with λ = 0 in the following theorem:

Theorem 1. By denoting Y^k := (ω^k, u^k, z^k, μ^k, ϕ̃^k, $Λ_{1}^{k}$ , $Λ_{2}^{k}$ ), any limit point of {Y^k} produced by ADP is the stationary point of Eq. (7) with λ = 0 if the stepsize r is sufficiently large and ε > 0.

The proof of the above theorem can be found in Appendix C. The referred proof demonstrates that any limit point of the iterative sequence is the stationary point of the proposed model. In order to prove the convergence of the whole sequence we need more constraints for the object and illumination, and also a careful selection scheme of the proximal parameters α₁, α₂ [15]. The introduction of ε is also needed for the convergence analysis. Numerically, such penalization does not produce any obvious improvement on either convergence speed or reconstruction quality. Hence, for simplicity, we only show the performance of proposed algorithm with ε = 0.

2.3. ADP with regularization (ADPr)

In this subsection we consider the proposed model Eq. (7) with λ > 0. Introducing the auxiliary variables p_j = ∇𝒮_ju, z_j = 𝒜_j(ω, u) and μ_j = ϕ̃ ∀0 ≤ j ≤ J − 1 produces the following equivalent constraint optimization problem:

min_{ω, u, z, μ, \tilde{ϕ}} λ \sum_{j} ‖ | p_{j} | ‖ + 𝒢_{ε} (z, μ), such that z_{j} - 𝒜_{j} (ω, u) = 0, μ_{j} - \tilde{ϕ} = 0, p_{j} - \nabla 𝒮_{j} u = 0 .

The corresponding augmented Lagrangian reads

\begin{array}{l} ℒ_{reg} (ω, u, p, z, μ, \tilde{ϕ}, Λ_{1}, Λ_{2}, Λ_{3}) \\ : = 𝒢_{ε} (z, μ) + \sum_{j} (r ℜ 〈 Λ_{1, j}, z_{j} - 𝒜_{j} (ω, u) 〉 + \frac{r}{2} {‖ z_{j} - 𝒜_{j} (ω, u) ‖}^{2} + r 〈 Λ_{2, j}, μ_{j} - \tilde{ϕ} 〉 \\ + \frac{r}{2} {‖ μ_{j} - \tilde{ϕ} ‖}^{2} + λ ‖ | p_{j} | ‖ + β ℜ 〈 p_{j} - \nabla 𝒮_{j} u, Λ_{3, j} 〉 + \frac{β}{2} {‖ p_{j} - \nabla 𝒮_{j} u ‖}^{2}) . \end{array}

We propose the following generalization of ADP to solve the problem above, referred to as ADP with regularization (ADPr):

{\begin{array}{l} ω^{k + 1} & = \underset{ω}{arg min} ℒ_{reg} (ω, u^{k}, p^{k}, z^{k}, μ^{k}, {\tilde{ϕ}}^{k}, Λ_{1}^{k}, Λ_{2}^{k}, Λ_{3}^{k}) + \frac{α_{1}}{2} {‖ ω - ω^{k} ‖}^{2}; \\ u^{k + 1} & = \underset{u}{arg min} ℒ_{reg} (ω^{k + 1}, u, p^{k}, z^{k}, μ^{k}, {\tilde{ϕ}}^{k}, Λ_{1}^{k}, Λ_{2}^{k}, Λ_{3}^{k}) + \frac{α_{2}}{2} {‖ u - u^{k} ‖}^{2}; \\ p^{k + 1} & = \underset{p}{arg min} ℒ_{reg} (ω^{k + 1}, u^{k + 1}, p, z^{k}, μ^{k}, {\tilde{ϕ}}^{k}, Λ_{1}^{k}, Λ_{2}^{k}, Λ_{3}^{k}); \\ (z^{k + 1}, μ^{k + 1}) & = \underset{z, μ}{arg min} ℒ_{reg} (ω^{k + 1}, u^{k + 1}, p^{k + 1}, z, μ, {\tilde{ϕ}}^{k}, Λ_{1}^{k}, Λ_{2}^{k}, Λ_{3}^{k}); \\ {\tilde{ϕ}}^{k + 1} & = \underset{\tilde{ϕ}}{arg min} ℒ_{reg} (ω^{k + 1}, u^{k + 1}, p^{k + 1}, z^{k + 1}, μ^{k + 1}, \tilde{ϕ}, Λ_{1}^{k}, Λ_{2}^{k}, Λ_{3}^{k}); \\ Λ_{1}^{k + 1} & = Λ_{1}^{k} + z^{k + 1} - 𝒜 (ω^{k + 1}, u^{k + 1}); \\ Λ_{2, j}^{k + 1} & = Λ_{2, j}^{k} + μ_{j}^{k + 1} - {\tilde{ϕ}}^{k + 1} \forall 0 \leq j \leq J - 1; \\ Λ_{3, j}^{k + 1} & = Λ_{3, j}^{k + 1} + p_{j}^{k + 1} - \nabla 𝒮_{j} u^{k + 1} \forall 0 \leq j \leq J - 1 . \end{array}

Details of solvers for these subproblems are reported in Appendix B. We only summarize the overall algorithm in Algorithm 2 below.

Algorithm 2:. ADP with regularization (ADPr)

View Table | View all tables in this article

In a similar manner as with ADP, one can derive the convergence of ADPr. We provide the following theorem omitting the proof details in this case:

Theorem 2. Any limit point of the iterative sequence generated by ADPr is the stationary point of Eq. (7) if the stepsizes r, β are sufficiently large and ε > 0.

Remark 2.2. In order to handle outliers, we propose the following adaptive weights {C_j}. They are designed to produce bigger values when the residual between the measured data and recovered intensities are smaller. The weight function C_j is given based on the inverse of the residual as follows:

C_{j} \leftarrow \frac{2}{γ} {| \frac{1}{\sqrt{{| z_{j}^{k + 1} |}^{2} + {| μ_{j}^{k + 1} |}^{2} + ε^{2} 1_{\bar{m}}} - \sqrt{I_{j} + ε^{2} 1_{\bar{m}}}} |}^{2 - γ},

with γ ≤ 2. The weight function can be updated between Step 3 and Step 4 in ADP algorithm, or Step 4 and Step 5 in ADPr algorithm.

Remark 2.3. Raster grid scanning can cause visible periodical artifacts, and therefore a constraint of the lens can be enforced as [19]. Furthermore, a detector mask can be used as well.

3. Experimental results

The experimental results of this section are generated using both simulated and experimental ptychography data. In most experiments, the performance of the proposed algorithm is compared with that of SHARP (using RAAR) [19], a ptychography software solution that implements state-of-the-art blind ptychography algorithms and background retrieval techniques [26] (further details can be found in Appendix A). In some test we will evaluate SHARP without and with background retrieval, the later will be referred to as SHARP-B. All experimental results in this section employ raster scans. Regarding the parameter initialization of the proposed method, we set J₀ = 5 (after J₀ iterations reinitialize the background) for both ADP and ADPr. We use 5 iterations for Step 2 of ADPr.

We introduce two different criteria to evaluate performance. For simulation data, the ground truth can be used to compute the signal-to-noise ratio (SNR) of u^k as:

SNR (u^{k}, u_{g}) = - 10 {log}_{10} \sum_{t = 0}^{n - 1} {| ζ^{*} u^{k} (t + T^{*}) - u_{g} (t) |}^{2} / {‖ ζ^{*} u^{k} ‖}^{2},

where u_g corresponds to the ground truth image. The error is computed up to the translation T^*, and the phase shift and scaling factor ζ^* are determined by:

(ζ^{*}, T^{*}) : = \underset{ζ \in ℂ, T \in ℤ}{arg min} \sum_{t} {| ζ u^{k} (t + T) - u_{g} (t) |}^{2} .

For experiment data, the R-factor for ADP and ADPr is used, defined as:

{R - factor}^{k} : = \frac{\sum_{j} {‖ \sqrt{{| 𝒜_{j} (ω^{k}, u^{k}) |}^{2} + {({\tilde{ϕ}}^{k})}^{2}} - \sqrt{I_{j}} ‖}_{1}}{{‖ \sqrt{I} ‖}_{1}} .

The R-factor of SHARP is defined by setting ϕ̃ = 0, whereas the R-factor of SHARP-B is defined in the same way as with ADP and ADPr.

3.1. Synthetic data

Given a true illumination and sample ω^★ and u^★, respectively, and ϕ^★ as the parasitic noise component, the simulated intensities are generated as follows:

{\hat{I}}_{j} (t) \overset{i . i . d}{~} Poisson ({| 𝒜_{j} (ω^{★}, u^{★}) |}^{2} (t) + ϕ^{★} (t)) + n_{j} (t), \forall 0 \leq t \leq \bar{m} - 1,

where n_j denotes the white Gaussian noise (the variance is set to

1.0 \times 10^{- 6} \times \frac{1}{m} \sum_{t} I_{poi} (t)

, with I_poi(t) := Poisson (|𝒜_j(ω^★, u^★)|²(t) + ϕ^★(t))). We also consider a further corruption of the intensities by outliers. They are simulated as patches of bad measurements appearing in 10% of the frames, with size ∼ 50% of the frame size. The corrupted intensity values are set to 0. We use raster grid scan with stepsizes of 16 pixels, so an additional constraint of the illumination (a.k.a. support of the lens or illumination Fourier mask) is used to prevent potential periodical artifacts. In the experiments below, the frame size employed is 64 × 64 pixels and all algorithms stop after 1000 iterations.

Figure 2 presents reconstruction results employing two different simulation datasets. The first experiment (second and fourth rows) uses data contaminated only by parasitic noise (ϕ^★). The second experiment (third and fifth rows) reconstructs data with a mix of white Gaussian and Poisson noises as in Eq. (32) and the outliers described above as well. The ground truth images for the sample, illumination and background are depicted in Figs. 2(c), 2(k), Figs. 2(a)–2(b), and Fig. 3(a), respectively. For each of the previous reconstruction experiments, we report the retrieved background in Fig. 3 for SHARP-B, ADP and ADPr. When no mix of noises and outliers are considered, ADP produces much higher quality reconstructions (Figs. 2(f) and 2(n)) than SHARP and SHARP-B (Figs. 2(d)–2(e) and Figs. 2(l)–2(m)). The retrieved background, for both SHARP-B and ADP (Figs. 3(b)–3(c)), is consistent with those results. When additional outliers are introduced, ADP still generates better result with a dramatic reduction in outliers-related artifacts (Figs. 2(i) and 2(q)). In this case, ADPr (Figs. 2(j) and 2(r)) helps further removing the noise from the baseline ADP reconstruction. For the second experiment, the proposed algorithms also retrieve the background (Figs. 3(e)–3(f)) more precisely than SHARP-B (Fig. 3(d)). The SNRs of recovery results are reported: SNR = 13.2, 15.7, 76.7 by SHARP, SHARP-B, and ADP for the case of single parasitic noise; SNR = 9.8, 10.2, 18.5, 27.9 by SHARP, SHARP-B, ADP and ADPr for the case with additional mixture noises and outliers. Again, one can readily see the advantages of proposed ADP and ADPr.

Fig. 3 Retrieved backgrounds (backg.) for the simulation experiment of Fig. 2. First row: True background (parasitic noise) (a) and results retrieved in a simulation only with parasitic noise for SHARP-B (b) and ADP (c). Second row: Background retrieved in a simulation as in Eq. (32) with additional outliers for SHARP-B (d), ADP (e) and ADPr (f).The figures are shown in scale (·)^0.05 with the colorbar shown on the top left corner

Download Full Size | PDF

3.2. Experimental data

Soft X-ray dataset from the ALS, Lawrence Berkeley Lab

The data used in the following experiment was published in [8]. It corresponds to a 3D ptychography imaging experiment of battery cells at 708 eV, obtained from different projection angles.

The recovered amplitude and phase are reported in Fig. 4 and zoom-in view in Fig. 5 when using SHARP, SHARP-B, ADP and ADPr. The recovered intensities of a single frame and the recovered background for the same experiment are shown in Fig. 6. When no background retrieval is used, the recovery results from SHARP (Figs. 4(a) and 4(e), and zoom-in view in Figs. 5(a) and 5(e)) are significantly noisy, with some areas presenting vague or inappreciable features. SHARP-B was specially designed for structured noise from ALS soft X-ray sources and the results in Figs. 4(b) and 4(f), and zoom-in view in Figs. 5(b) and 5(f), show a considerable improvement with respect to SHARP baseline. Visually, ADP produces even sharper results and cleaner background than SHARP-B, specially when employing regularization (Figs. 4–5 last two columns). This can also be appreciated by inspecting the recovered intensities and backgrounds from Fig 6. Artifacts due to some bad frames and other source of noise can be clearly appreciated in Figs. 4(a)–4(b), 4(e)–4(f), and Figs. 5(a)–5(b), 5(e)–5(f) in the experiments with SHARP and SHARP-B. When using the proposed algorithms, such artifacts are greatly attenuated (in Figs. 4(c)–4(d), and Figs. 5(c)–5(d)).

Fig. 4 Soft X-ray experimental results from the data presented in [8]. First row: reconstructed amplitude using SHARP (a), SHARP-B (b), ADP (c) and ADPr (d). Second row: reconstructed phase using SHARP (e), SHARP-B (f), ADP (g) and ADPr (h).

Download Full Size | PDF

Fig. 5 Zoom-in view of parts of Fig. 4. First row: reconstructed amplitude using SHARP (a), SHARP-B (b), ADP (c) and ADPr (d). Second row: reconstructed phase using SHARP (e), SHARP-B (f), ADP (g) and ADPr (h).

Download Full Size | PDF

Fig. 6 Recovered intensities and background from the experiment presented in Fig. 4. First row: Recovered intensities by SHARP (a), SHARP-B (b), ADP (c), and ADPr (d). Second row: recovered backgrounds by SHARP-B (e), ADP (f), and ADPr (g). The figures are shown in scale (·)^0.05 with the colorbar shown on the down left corner.

Download Full Size | PDF

The R-factors of the previous experiment are 0.2399, 0.1112, 0.0957, and 0.0969, when using SHARP, SHARP-B, ADP and ADPr, respectively. This demonstrates how the proposed algorithm achieves smaller residuals thus producing higher accuracy results. We also provide cutline values results from the previous experiment in Fig. 7, selecting a line from the center of the reconstructed image. It can be seen from the figure how the proposed algorithm generates a higher contrast reconstruction than SHARP-B, specially in the retrieved phase. Convergence results are presented in Fig. 8, showing the R-factor as the iterations go. Those results illustrate how the R-factors of the proposed algorithm steadily decrease, providing additional evidence on the stability of the method.

Fig. 7 Cutlines of the retrieved amplitude (amp.) (a) and phase part (b) from the center of the reconstruction reported in Fig. 4, using the dataset from [8].

Download Full Size | PDF

Fig. 8 Convergence histories of the R-factor for the proposed algorithm (both ADP and ADPr) when performing the experiment reported in Fig. 4, using the dataset from [8].

Download Full Size | PDF

Hard X-ray dataset from PETRA III at DESY

We also report experimental results using a dataset measured at beamline P06 at PETRA III, DESY [11] at a photon energy of 11919 eV. This dataset consists of 2 ptychographic measurements, one without and one with beamstop. The following results are generated using (1) the data without beamstop (noBS), (2) the data with beamstop (BS), and (3) the merged data from BS and noBS (the low frequencies from noBS plus the high frequency from BS). The datasets are reconstructed using ePIE (extended Ptychographic Iterative Engine [3]), which corresponds to the results presented in [11], and with the proposed ADP algorithm. For a fair comparison, all results are produced without position retrieval. The amount of noise of this dataset is significantly higher than the one reported in the previous experiments: in this case, the phase contrast produced by the hard X-ray beam is much weaker than in the soft X-ray dataset. Because of this, more iterations are required to achieve reasonable reconstructions. The following results use at most 2000 iterations (we use early-stop for ePIE since it will blow up finally) for both algorithms.

The reconstruction results are shown in Fig. 9. The reconstructed phase parts using ePIE is significantly noisy, with lots of blurred out features (Fig. 9(a) and (b)). Visually, ADP generates a much cleaner reconstructed phase parts with much better defined features (especially noticeable in the features around the color circles). ADPr in this case produces similar quality results as ADP with no regularization.

Fig. 9 Hard X-ray experimental results from the data presented in [11]. First row: recovered phase using ePIE, with the noBS dataset (a), and with the merged dataset (b). Second row: recovered phase using ADP, with noBS data (c), merged data (d), and BS data (e).Results from (a) to (d) are displayed in the range [−0.1, 0.02] (colorbar shown in the top right corner) while (e) is depicted in the range [−0.8, 0.3] (colorbar shown in the down right corner).

Download Full Size | PDF

We also report an additional reconstruction result using only the BS dataset with ADP (Fig. 9(e)). It is important to note that for this dataset low-frequency information is almost completely lost. We can see how very sharp features are well recovered in the reconstructed phase, while producing a very clean background. This illustrates the robustness of the proposed algorithm even when the measured data is incomplete or contaminated by heavy noise. Some features from the BS experiment with ADP seem to be enhanced, compared with the merged data (specially noticeable again in the color circle areas). We remark that to make our algorithm work well, a good initialization by the illumination from the non-beamstop dataset is adopted. As a future work, we will explore more efficient strategy for the initialization.

Since our proposed algorithms are implemented by python, which is not specially optimized compared with SHARP (implemented by highly optimized CUDA code). Therefore, we only give the estimate of computational complexity of proposed ADP as about 63m + 2J × FFT per iteration, while about 78m + 2J × FFT for SHARP-B, that demonstrates the proposed ADP needs a bit less computational cost than SHARP-B, where FFT means the computational complexity of 2D fast Fourier Transform for the matrix with size of $\sqrt{\bar{m}} \times \sqrt{\bar{m}}$ (same size of the illumination).

4. Conclusion

This paper proposes a novel algorithm for (blind) ptychography reconstruction with integral experimental denoising. All main experimental sources of noise are addressed and characterized in the proposed solution as a mixture of (1) structured parasitic noise, (2) a mix of random noise sources (both Gaussian or Poison), and (3) data outliers. The algorithm is formulated to be robust and efficient, requiring few arithmetic operations and being inherently parallel in most of the steps. The convergence of the method is also proved under mild conditions. Experimental results analyze a variety of datasets with different experimental noise sources and demonstrate how the proposed algorithm achieves superior reconstruction results and denoising than state-of-the-art solutions. As a future work, we will consider partial coherence [37] and position retrieval [26] in the model, and also explore additional sparsity techniques [38], deep-learning [39], and dictionary-learning methods [29] to further enhance the reconstruction results.

Appendix

A. Revisit of SHARP background retrieval algorithm

The following description was proposed in [26]. Given an intensity I_j and for the k^th iteration, the background ϕ is retrieved as follows:

min_{ϕ, η \geq η_{-}} ℋ (η, ϕ) : = \frac{1}{2} \sum_{j} {‖ η \circ (I_{j} - ϕ) - {| z_{j}^{k} |}^{2} ‖}^{2},

given z^k, where a weight function

1 \geq η \in ℝ_{+}^{\bar{m}}

crossing all frames is introduced to be optimized. By alternating minimization:

η^{k} = \underset{η \geq η_{-}}{arg min} \sum_{j} {‖ η \circ (I_{j} - ϕ^{k - 1}) - {| z_{j}^{k} |}^{2} ‖}^{2} = max {η_{-}, \frac{\sum_{j} 〈 {| z_{j}^{k} |}^{2}, I_{j} - ϕ^{k - 1} 〉}{\sum_{j} {| I_{j} - ϕ^{k - 1} |}^{2}}} .

Then, by the preconditioned gradient descent for the background ϕ:

\begin{array}{l} ϕ^{k} & = ϕ^{k - 1} - \frac{1}{J {| η^{k} |}^{2}} \nabla_{ϕ} ℋ (η^{k}, ϕ) \\ = \frac{1}{J} \sum_{j} (I_{j} - {| z_{j}^{k} |}^{2}) + (1 - \frac{1}{η^{k}}) \circ (\frac{1}{J} \sum_{j} {| z_{j}^{k} |}^{2}), \end{array}

where the first term is essentially the closed form solution for the least squares problem Eq. (33) with respect to ϕ. When the algorithm converges, η^k → 1 if k → +∞, and Eq. (35) is a gradient descent scheme with additional stabilization term

(1 - \frac{1}{η^{k}}) \circ (\frac{1}{J} \sum_{j} {| z_{j}^{k} |}^{2})

, and tends to zeros as k → 0, which helps to speed up the evolution of background retrieval, heuristically.

B. Derivations for ADP (Algorithm 1) and ADPr (Algorithm 2)

ADP

The computation of each subproblem is presented here. For simplicity, we omit the superscripts. For the ω and u–subproblems, with additional proximal terms, we have:

\begin{array}{l} ω^{opt} & : = \underset{ω}{arg min} ℒ (ω, u, z, μ, \tilde{ϕ}, Λ_{1}, Λ_{2}) + \frac{α_{1}}{2} {‖ ω - \hat{ω} ‖}^{2} \\ = \underset{ω}{arg min} \frac{1}{2} \sum_{j} {‖ ω \circ 𝒮_{j} u - ℱ^{*} (z_{j} + Λ_{1, j}) ‖}^{2} + \frac{α_{1}}{2} {‖ ω - \hat{ω} ‖}^{2}, \end{array}

and

\begin{array}{l} u^{opt} & : = \underset{u}{arg min} ℒ (ω, u, z, μ, \tilde{ϕ}; Λ_{1}, Λ_{2}) + \frac{α_{2}}{2} {‖ u - \hat{u} ‖}^{2} \\ = \underset{u}{arg min} \frac{1}{2} \sum_{j} {‖ ω \circ 𝒮_{j} u - ℱ^{*} (z_{j} + Λ_{1, j}) ‖}^{2} + \frac{α_{2}}{2} {‖ u - \hat{u} ‖}^{2}, \end{array}

where ω̂ and û are the approximate solutions in the previous iterations. By solving a least squares problem, we have:

{\begin{cases} ω^{opt} = \frac{\sum_{j} 𝒮_{j} u^{*} \circ ℱ^{*} (z_{j} + Λ_{1, j}) + α_{1} \hat{ω}}{\sum_{j} {| 𝒮_{j} u |}^{2} + α_{1} 1_{\bar{m}}} \\ u^{opt} = \frac{\sum_{j} 𝒮_{j}^{T} (ω^{*} \circ ℱ^{*} (z_{j} + Λ_{1, j})) + α_{2} \hat{u}}{\sum_{j} 𝒮_{j}^{T} {| ω |}^{2} + α_{2} 1_{n}} . \end{cases}

The variables z_j and μ_j can be determined jointly by solving:

min_{z_{j}, μ_{j}} 𝒢_{ε, j} (z_{j}, μ_{j}) + \frac{r}{2} {‖ z_{j} - (𝒜_{j} (ω, u) - Λ_{1, j}) ‖}^{2} + \frac{r}{2} {‖ μ_{j} - (\tilde{ϕ} - Λ_{2, j}) ‖}^{2},

with

𝒢_{ε, j} (z_{j}, μ_{j}) : = \frac{1}{2} {‖ \sqrt{C_{j}} \circ (\sqrt{{| z_{j} |}^{2} + μ_{j}^{2} + ε^{2} 1_{\bar{m}}} - \sqrt{I_{j} + ε^{2} 1_{\bar{m}}}) ‖}^{2} .

By denoting

X_{j} = {(z_{j}^{T}, μ_{j}^{T})}^{T} \in ℂ^{2 \bar{m}}

, we have:

X_{j}^{opt} : = arg min 𝒢_{ε, j} (X_{j}) + \frac{r}{2} {‖ X_{j} - X_{j}^{0} ‖}^{2},

with

X_{j}^{0} : = {(𝒜_{j}^{T} (ω, u) - Λ_{1, j}^{T}, {\tilde{ϕ}}^{T} - Λ_{2, j}^{T})}^{T}

. The solution to the above problem has the following forms:

X_{j}^{opt} = {({(ρ_{j}^{opt})}^{T}, {(ρ_{j}^{opt})}^{T})}^{T} \circ sign (X_{j}^{0}),

with

sign (X_{j}^{0}) : = {(\frac{𝒜_{j}^{T} (ω, u) - Λ_{1, j}^{T}}{{| X_{j}^{0} |}_{*}^{T}}, \frac{{\tilde{ϕ}}^{T} - Λ_{2, j}^{T}}{{| X_{j}^{0} |}_{*}^{T}})}^{T},

{| X_{j}^{0} |}_{*} : = \sqrt{{| 𝒜_{j} (ω, u) - Λ_{1, j} |}^{2} + {| \tilde{ϕ} - Λ_{2, j} |}^{2}},

where

ρ_{j}^{opt}

is determined by:

ρ_{j}^{opt} = \underset{ρ_{j} \in ℝ_{+}^{\bar{m}}}{arg min} ℋ_{ε} (ρ_{j}) : = \frac{1}{2} 〈 C_{j}, ρ_{j}^{2} - (I_{j} + ε^{2} 1_{\bar{m}}) \circ log (ρ_{j}^{2} + ε^{2} 1_{\bar{m}}) 〉 + \frac{r}{2} {‖ ρ_{j} - {| X_{j}^{0} |}_{*} ‖}^{2} .

If ε = 0, the closed form solution [36] to Eq. (45) is given by:

ρ_{j}^{opt} = \frac{r {| X_{j}^{0} |}_{*} + \sqrt{r^{2} {| X_{j}^{0} |}_{*}^{2} + 4 (C_{j} + r 1_{\bar{m}}) \circ C_{j} \circ I_{j}}}{2 (C_{j} + r 1_{\bar{m}})},

such that

X_{j}^{opt} = \frac{r {| X_{j}^{0} |}_{*} + \sqrt{r^{2} {| X_{j}^{0} |}_{*}^{2} + 4 (C_{j} + r 1_{\bar{m}}) \circ C_{j} \circ I_{j}}}{2 (C_{j} + r 1_{\bar{m}})} \circ sign (X_{j}^{0}) .

Otherwise, we use the projection gradient algorithm to solve it, which gives:

\begin{array}{l} ρ_{j, l + 1} & = max {0, ρ_{j, l} - τ \nabla ℋ_{∊} (ρ_{j, l})} \\ = max {0, ((1 - τ r) 1_{\bar{m}} - τ C_{j} + τ C_{j} \circ \frac{I_{j} + ε^{2} 1_{\bar{m}}}{ρ_{j, l}^{2} + ε^{2} 1_{\bar{m}}}) \circ ρ_{j, l} + τ r {| X_{j}^{0} |}_{*}}, \end{array}

l = 0, 1, · · · , with stepsize τ, since

\nabla ℋ_{∊} (ρ_{j}) = (C_{j} + r 1_{\bar{m}} - C_{j} \circ \frac{I_{j} + ε^{2} 1_{\bar{m}}}{ρ_{j}^{2} + ε^{2} 1_{\bar{m}}}) \circ ρ_{j} - r {| X_{j}^{0} |}_{*} .

For the ϕ̃–subproblem, the solution is given directly as

{\tilde{ϕ}}^{opt} = \underset{\tilde{ϕ}}{arg min} ℒ (ω, u, z, μ, \tilde{ϕ}, Λ_{1}, Λ_{2}) = \frac{1}{J} \sum (μ_{j} + Λ_{2, j}) .

We can now further simplify the proposed algorithm. Since $0 = {\tilde{ϕ}}^{k + 1} - \frac{1}{J} \sum_{j} (μ_{j}^{k + 1} + Λ_{2, j}^{k})$ , and $Λ_{2, j}^{k + 1} = Λ_{2, j}^{k} + μ_{j}^{k + 1} - {\tilde{ϕ}}^{k + 1}$ , we have $\sum_{j} Λ_{2, j}^{k + 1} = 0$ , such that ${\tilde{ϕ}}^{k + 1} = \frac{1}{J} \sum_{j} μ_{j}^{k + 1}$ . That is to say, one can replace the variable ϕ̃^k by the pointwise average $\frac{1}{J} \sum_{j} μ_{j}^{k}$ and remove the subproblem with respect to ϕ̃.

ADPr

Similarly, we omit the superscripts for simplicity. We consider the u–subproblem, which is expressed below:

min \frac{r}{2} \sum_{j} {‖ ω \circ 𝒮_{j} u - ℱ^{*} (z_{j} + Λ_{1, j}) ‖}^{2} + \frac{β}{2} \sum_{j} {‖ Λ_{3, j} + p_{j} - \nabla 𝒮_{j} u ‖}^{2} + \frac{α_{2}}{2} {‖ u - \hat{u} ‖}^{2},

where û is the previous iterative solution. The minimizer to the above optimization problem satisfies the following equation:

\begin{matrix} \sum_{j} (diag (r 𝒮_{j}^{T} {| ω |}^{2} + α_{2} 1_{n}) - β 𝒮_{j}^{T} Δ 𝒮_{j}) u \\ = \sum_{j} (r 𝒮_{j}^{T} (ω^{*} \circ ℱ^{*} (z_{j} + Λ_{1, j})) - β 𝒮_{j}^{T} div (p_{j} + Λ_{3, j})) + α_{2} \hat{u}, \end{matrix}

where div denotes the divergence operator satisfying div = −∇^T, and Δ denotes the Laplacian operators satifying Δ = div(∇). One can readily verify that the coefficient matrix is Hermitian, and positive definite, i.e.:

\begin{array}{l} 〈 (diag (r 𝒮_{j}^{T} {| ω |}^{2} + α_{2} 1_{n}) - β 𝒮_{j}^{T} Δ 𝒮_{j}) v, v 〉 \\ = r 〈 𝒮_{j}^{T} ω \circ v, 𝒮_{j}^{T} ω \circ v 〉 + α_{2} {‖ v ‖}^{2} + β 〈 \nabla 𝒮_{j} v, \nabla 𝒮_{j} v 〉 \\ = r {‖ 𝒮_{j}^{T} ω \circ v ‖}^{2} + α_{2} {‖ v ‖}^{2} + β {‖ \nabla 𝒮_{j} v ‖}^{2} > 0, \forall 0 \neq v \in ℂ^{n}, \end{array}

such that we adopt conjugate gradient (CG) algorithm to solve Eq. (52).

The p_j–subproblem has the following closed form solution:

p_{j}^{opt} = max {0, | \nabla 𝒮_{j} u - Λ_{3, j} | - \frac{λ}{β} 1_{\bar{m}}} \circ \frac{\nabla 𝒮_{j} u - Λ_{3, j}}{| \nabla 𝒮_{j} u - Λ_{3, j} |} .

The subproblems with respect to other variables are mainly the same as in the previous subsection, and we omit the details.

C. Proof of theorem 1

The relation of the stationary points is given in the following lemma:

Lemma C.1. Iterative solutions {(ω^k, u^k, z^k, μ^k, $Λ_{1}^{k}$ , $Λ_{2}^{k}$ )}_k satisfy the following relations:

{\begin{cases} 0 = ω^{k + 1} \circ \sum_{j} {| 𝒮_{j} u^{k} |}^{2} - \sum_{j} {(𝒮_{j} u^{k})}^{*} \circ ℱ^{*} (z_{j}^{k} + Λ_{1, j}^{k}) + α_{1} (ω^{k + 1} - ω^{k}); \\ 0 = \sum_{j} 𝒮_{j}^{T} {| ω^{k + 1} |}^{2} \circ u^{k + 1} - \sum_{j} 𝒮_{j}^{T} ({(ω^{k + 1})}^{*} \circ ℱ^{*} (z_{j}^{k} + Λ_{1, j}^{k})) + α_{2} (u^{k + 1} - u^{k}); \\ 0 = \nabla 𝒢_{ε} (z^{k + 1}, μ^{k + 1}) + r {({(Λ_{1}^{k + 1})}^{T}, {(Λ_{2}^{k + 1})}^{T})}^{T}; \\ 0 = - Λ_{1, j}^{k + 1} + Λ_{1, j}^{k} + z_{j}^{k + 1} - 𝒜_{j} (ω^{k + 1}, u^{k + 1}); \\ 0 = - Λ_{2, j}^{k + 1} + Λ_{2, j}^{k} + μ_{j}^{k + 1} - \frac{1}{J} \sum_{j} μ_{j}^{k + 1} . \end{cases}

Proof. By Eq. (47), one has:

\sqrt{{| z_{j}^{k + 1} |}^{2} + {| μ_{j}^{k + 1} |}^{2}} = \frac{\sqrt{I_{j}} + r | Y_{j}^{k} |}{1 + r} \geq \frac{\sqrt{I_{j}}}{1 + r} > 0,

such that the objective function in Eq. (41) is smooth at

X_{j}^{k + 1} : = (z_{j}^{k + 1}, μ_{j}^{k + 1})

. Then one can readily derive Eq. (55) by calculating the first order derivative for each subproblem.

Lemma C.2.

‖ \nabla 𝒢_{ε} (z^{1}, μ^{1}) - \nabla 𝒢_{ε} (z^{2}, μ^{2}) ‖ \leq L_{ε} (‖ z^{1} - z^{2} ‖ + ‖ μ^{1} - μ^{2} ‖),

where L_∊ is a positive constant independent with z¹, μ¹, z², μ².

Proof. It can be readily proved following [15], and we omit the details here.

Lemma C.3. Denoting Y^k := (ω^k, u^k, z^k, μ^k, ϕ̃^k, $Λ_{1}^{k}$ , $Λ_{2}^{k}$ ), if r > 4L_ε,

ℒ (Y^{k}) - ℒ (Y^{k + 1}) \geq c_{ε, r} {‖ Y^{k + 1} - Y^{k} ‖}^{2},

where c_ε,r is a positive parameter independent from {Y^k}.

Proof. Regarding the ω and u–subproblems:

\begin{array}{r} ℒ (Y^{k}) - ℒ (ω^{k + 1}, u^{k + 1}, z^{k}, μ^{k}, {\tilde{ϕ}}^{k}, Λ_{1}^{k}, Λ_{2}^{k}) \\ \geq \frac{1}{2} {‖ 𝒜 (ω^{k + 1} - ω^{k}, u^{k}) ‖}^{2} + \frac{α_{1}}{2} {‖ ω^{k + 1} - ω^{k} ‖}^{2} \\ + \frac{1}{2} {‖ 𝒜 (ω^{k + 1}, u^{k + 1} - u^{k}) ‖}^{2} + \frac{α_{2}}{2} {‖ u^{k + 1} - u^{k} ‖}^{2} . \end{array}

For the z and μ subproblems, by Eq. (39) and the Lemma C.2, we have:

\begin{array}{l} ℒ (ω^{k + 1}, u^{k + 1}, z^{k}, μ^{k}, {\tilde{ϕ}}^{k}, Λ_{1}^{k}, Λ_{2}^{k}) - ℒ (ω^{k + 1}, u^{k + 1}, z^{k + 1}, μ^{k + 1}, {\tilde{ϕ}}^{k}, Λ_{1}^{k}, Λ_{2}^{k}) \\ \geq \frac{r - 3 L_{ε}}{2} ({‖ z_{j}^{k + 1} - z_{j}^{k} ‖}^{2} + {‖ μ_{j}^{k + 1} - μ_{j}^{k} ‖}^{2}) . \end{array}

For the subproblems of ϕ̃, we have:

\begin{array}{l} ℒ (ω^{k + 1}, u^{k + 1}, z^{k + 1}, μ^{k + 1}, {\tilde{ϕ}}^{k}, Λ_{1}^{k}, Λ_{2}^{k}) - ℒ (ω^{k + 1}, u^{k + 1}, z^{k + 1}, μ^{k + 1}, {\tilde{ϕ}}^{k + 1}, Λ_{1}^{k}, Λ_{2}^{k}) \\ \geq \frac{r J}{2} {‖ {\tilde{ϕ}}^{k + 1} - {\tilde{ϕ}}^{k} ‖}^{2} . \end{array}

Further we have:

\begin{array}{l} ℒ (ω^{k + 1}, u^{k + 1}, z^{k + 1}, μ^{k + 1}, {\tilde{ϕ}}^{k + 1}, Λ_{1}^{k}, Λ_{2}^{k}) - ℒ (Y^{k + 1}) \\ \overset{Eq . (20)}{=} - r \sum_{j} ({‖ Λ_{1, j}^{k + 1} - Λ_{1, j}^{k} ‖}^{2} + {‖ Λ_{2, j}^{k + 1} - Λ_{2, j}^{k} ‖}^{2}) \\ \overset{Eq . (55)}{\geq} - \frac{2 L_{ε}^{2}}{r} ({‖ z^{k + 1} - z^{k} ‖}^{2} - {‖ ω^{k + 1} - ω^{k} ‖}^{2}), \end{array}

where the last relation is derived by Lemma C.2.

Finally, by summing up Eqs. (59)–(62) together, we get:

ℒ (Y^{k}) - ℒ (Y^{k + 1}) \geq min {\frac{α_{1}}{2}, \frac{α_{2}}{2}, \frac{r - 3 L_{ε}}{2} - \frac{2 L_{ε}^{2}}{r}, \frac{r J}{2}} {‖ Y^{k + 1} - Y^{k} ‖}^{2} .

Setting r > 4L_ɛ concludes this lemma.

Proof of Theorem 1. We first show that {ℒ(Y^k)} is lower bounded:

\begin{array}{l} ℒ (Y^{k + 1}) = 𝒢_{ε} (z^{k + 1}, μ^{k + 1}) - ℜ 〈 \nabla_{z} 𝒢_{ε, j} (z_{j}^{k + 1}, μ_{j}^{k + 1}), z_{j}^{k + 1} - 𝒜_{j} (ω^{k + 1}, u^{k + 1}) 〉 \\ - 〈 \nabla_{μ} 𝒢_{ε, j} (z_{j}^{k + 1}, μ_{j}^{k + 1}), μ_{j}^{k + 1} - {\tilde{ϕ}}^{k + 1} 〉 + \frac{r}{2} {‖ z_{j}^{k + 1} - 𝒜_{j} (ω^{k + 1}, u^{k + 1}) ‖}^{2} \\ + \frac{r}{2} {‖ μ_{j}^{k + 1} - {\tilde{ϕ}}^{k + 1} ‖}^{2} \\ = 𝒢_{ε} (z^{k + 1}, μ^{k + 1}) - 𝒢_{ε} (𝒜_{j} (ω^{k + 1}, u^{k + 1}), {\tilde{ϕ}}^{k + 1}) \\ - ℜ 〈 \nabla 𝒢_{ε, j} (z_{j}^{k + 1}, μ_{j}^{k + 1}), {({(z_{j}^{k + 1} - 𝒜_{j} (ω^{k + 1}, u^{k + 1}))}^{T}, {(μ_{j}^{k + 1} - {\tilde{ϕ}}^{k + 1})}^{T})}^{T} 〉 \\ + \frac{r}{2} {‖ z_{j}^{k + 1} - 𝒜_{j} (ω^{k + 1}, u^{k + 1}) ‖}^{2} + \frac{r}{2} {‖ μ_{j}^{k + 1} - {\tilde{ϕ}}^{k + 1} ‖}^{2} + 𝒢_{ε} (𝒜_{j} (ω^{k + 1}, u^{k + 1}), {\tilde{ϕ}}^{k + 1}) \\ \geq \frac{r - L_{ε}}{2} ({‖ z_{j}^{k + 1} - 𝒜_{j} (ω^{k + 1}, u^{k + 1}) ‖}^{2} + {‖ μ_{j}^{k + 1} - {\tilde{ϕ}}^{k + 1} ‖}^{2}) + 𝒢_{ε} (𝒜_{j} (ω^{k + 1}, u^{k + 1}), {\tilde{ϕ}}^{k + 1}), \end{array}

where the last relation is derived from the Lemma C.2. Therefore, ℒ(Y^k) > −∞ with r ≥ 4L_ɛ. Furthermore, by the Lemma C.3:

+ \infty > ℒ (Y^{0}) - ℒ (Y^{k}) = \sum_{l = 0}^{k} (ℒ (Y^{l}) - ℒ (Y^{l + 1})) \geq c_{ε, r} \sum_{l = 0}^{k} {‖ Y^{l + 1} - Y^{l} ‖}^{2} .

Accordingly we get:

lim_{l \to + \infty} ‖ Y^{l + 1} - Y^{l} ‖ = 0 .

For any limit point Y^★ := (ω^★, u^★, z^★, μ^★, ϕ̃^★, $Λ_{1}^{★}$ , $Λ_{2}^{★}$ ) of the iterative sequence {Y^k}, there exists a subsequence Y^n_k := (ω^n_k, u^n_k, z^n_k, μ^n_k, ϕ̃^n_k, $Λ_{1}^{n_{k}}$ , $Λ_{2}^{n_{k}}$ ) ⊂ Y^k, such that

lim_{k \to + \infty} Y^{n_{k}} = Y^{★} .

We can see that {Y^n_k} is bounded such that one can derive: lim_k→+∞ 𝒜(ω^n_k, u^n_k) = 𝒜(ω^★, u^★), and

{lim}_{k \to + \infty} \sum_{j} 𝒮_{j}^{T} {| ω^{n_{k}} |}^{2} \circ u^{n_{k}} = \sum_{j} 𝒮_{j}^{T} {| ω^{★} |}^{2} \circ u^{★}

. By Eq. (66), {Y^n_k−1} is bounded as well. Hence

lim_{k \to + \infty} ω^{n_{k}} \circ \sum_{j} {| 𝒮_{j} u^{n_{k} - 1} |}^{2} = ω^{★} \circ \sum_{j} {| 𝒮_{j} u^{★} |}^{2},

lim_{k \to + \infty} \sum_{j} {(𝒮_{j} u^{n_{k} - 1})}^{*} \circ ℱ^{*} (z_{j}^{n_{k} - 1} + Λ_{1, j}^{n_{k} - 1}) = \sum_{j} {(𝒮_{j} u^{★})}^{*} \circ ℱ^{*} (z_{j}^{★} + Λ_{1, j}^{★}),

and

lim_{k \to + \infty} \sum_{j} 𝒮_{j}^{T} ({(ω^{n_{k}})}^{*} \circ ℱ^{*} (z_{j}^{n_{k} - 1} + Λ_{1, j}^{n_{k} - 1})) = \sum_{j} 𝒮_{j}^{T} ({(ω^{★})}^{*} \circ ℱ^{*} (z_{j}^{★} + Λ_{1, j}^{★})) .

By Lemma C.2:

lim_{k \to + \infty} \nabla 𝒢_{ε} (z^{n_{k}}, μ^{n_{k}}) = \nabla 𝒢_{ε} (z^{★}, μ^{★}) .

Finally, since

\begin{array}{l} 0 = ω^{n_{k}} \circ \sum_{j} {| 𝒮_{j} u^{n_{k} - 1} |}^{2} - \sum_{j} {(𝒮_{j} u^{n_{k} - 1})}^{*} \circ ℱ^{*} (z_{j}^{n_{k} - 1} + Λ_{1, j}^{n_{k} - 1}) + α_{1} (ω^{n_{k}} - ω^{n_{k} - 1}); \\ 0 = \sum_{j} 𝒮_{j}^{T} {| ω^{n_{k}} |}^{2} \circ u^{n_{k}} - \sum_{j} 𝒮_{j}^{T} {((ω^{n_{k}})}^{*} \circ ℱ^{*} (z_{j}^{n_{k} - 1} + Λ_{1, j}^{n_{k} - 1})) + α_{2} (u^{n_{k}} - u^{n_{k} - 1}); \\ 0 = \nabla 𝒢_{ε} (z^{n_{k}}, μ^{n_{k}}) + r {({(Λ_{1}^{n_{k}})}^{T}, {(Λ_{2}^{n_{k}})}^{T})}^{T}; \\ 0 = - Λ_{1, j}^{n_{k}} + Λ_{1, j}^{n_{k} - 1} + z_{j}^{n_{k}} - 𝒜_{j} (ω^{n_{k}}, u^{n_{k}}); \\ 0 = - Λ_{2, j}^{n_{k}} + Λ_{2, j}^{n_{k} - 1} + μ_{j}^{n_{k}} - \frac{1}{J} \sum_{j} μ_{j}^{n_{k}}, \end{array}

following the above calculations of these limits, we can get:

{\begin{cases} 0 = ω^{★} \circ \sum_{j} {| 𝒮_{j} u^{★} |}^{2} - \sum_{j} {(𝒮_{j} u^{★})}^{*} \circ ℱ^{*} (z_{j}^{★} + Λ_{1, j}^{*}); \\ 0 = \sum_{j} 𝒮_{j}^{T} {| ω^{★} |}^{2} \circ u^{★} - \sum_{j} 𝒮_{j}^{T} ({(ω^{★})}^{*} \circ ℱ^{*} (z_{j}^{★} + Λ_{1, j}^{★})); \\ 0 = \nabla 𝒢_{ε} (z^{★}, μ^{★}) + r {({(Λ_{1}^{★})}^{T}, {(Λ_{2}^{★})}^{T})}^{T}; \\ 0 = z_{j}^{★} - 𝒜_{j} (ω^{★}, u^{★}); \\ 0 = μ_{j}^{★} - \frac{1}{J} \sum_{j} μ_{j}^{★}, \end{cases}

which immediately implies that Y^★ is the stationary point of Eq. (7) and concludes this theorem.

Funding

National Natural Science Foundation of China (11871372, 11501413, 11271049, 11671002); Natural Science Foundation of Tianjin City (18JCYBJC16600); Tianjin Normal University 2017 Outstanding Young Innovation Team Cultivation Program (043-135202TD1703); Innovation Project of Tianjin Normal University (043-135202XC1605); Tianjin Young Backbone of Innovative Personnel Training Program and Program for Innovative Research Team in Universities of Tianjin (TD13-5078); The Center for Advanced Mathematics for Energy Research Applications, US DOE Office of Science (DE-AC02-05CH11231); STROBE: A U.S. National Science Foundation Science & Technology Center (DMR 1548924); The Helmholtz Association (HGF); German Ministry of Education and Research (BMBF) (05K13VK2/05K13OD4, 05K10VK1/05K10OD1); the Impuls-und Vernetzungsfonds (IVF) of the Helmholtz Association of German Research Centres (VH-VI-403); The Direct Grant of The Chinese University of Hong Kong.

Acknowledgments

Experimental data was acquired at the Advanced Light Source, which is a DOE Office of Science User Facility under contract no. DE-AC02-05CH11231, and Beamline P06 of the light source PETRA III at DESY, Helmholtz Association (HGF).

References

1. P. Nellist, B. McCallum, and J. Rodenburg, “Resolution beyond the ‘information limit’ in transmission electron microscopy,” Nature 374, 630 (1995). [CrossRef]

2. J. M. Rodenburg and H. M. Faulkner, “A phase retrieval algorithm for shifting illumination,” Appl. Phys. Lett. 85, 4795–4797 (2004). [CrossRef]

3. A. M. Maiden and J. M. Rodenburg, “An improved ptychographical phase retrieval algorithm for diffractive imaging,” Ultramicroscopy 109, 1256–1262 (2009). [CrossRef] [PubMed]

4. Y. Jiang, Z. Chen, Y. Han, P. Deb, H. Gao, S. Xie, P. Purohit, M. W. Tate, J. Park, S. M. Gruner, V. Elser, and D. A. Muller, “Electron ptychography of 2d materials to deep sub-Ångström resolution,” Nature 559, 343 (2018). [CrossRef] [PubMed]

5. X. Shi, P. Fischer, V. Neu, D. Elefant, J. Lee, D. Shapiro, M. Farmand, T. Tyliszczak, H.-W. Shiu, S. Marchesini, and S. Roy, “Soft x-ray ptychography studies of nanoscale magnetic and structural correlations in thin SmCo₅ films,” Appl. Phys. Lett. 108, 094103 (2016). [CrossRef]

6. K. Giewekemeyer, P. Thibault, S. Kalbfleisch, A. Beerlink, C. M. Kewish, M. Dierolf, F. Pfeiffer, and T. Salditt, “Quantitative biological imaging by ptychographic x-ray diffraction microscopy,” Proc. Natl. Acad. Sci. U.S.A. 107, 529–534 (2010). [CrossRef]

7. D. A. Shapiro, Y.-S. Yu, T. Tyliszczak, J. Cabana, R. Celestre, W. Chao, K. Kaznatcheev, A. D. Kilcoyne, F. Maia, S. Marchesini, Y. S. Meng, T. Warwick, L. L. Yang, and H. Padmore, “Chemical composition mapping with nanometre resolution by soft x-ray microscopy,” Nat. Photonics 8, 765–769 (2014). [CrossRef]

8. Y.-S. Yu, M. Farmand, C. Kim, Y. Liu, C. P. Grey, F. C. Strobridge, T. Tyliszczak, R. Celestre, P. Denes, J. Joseph, H. Krishnan, F. R. N. C. Maia, A. L. D. Kilcoyne, S. Marchesini, T. P. C. Leite, T. Warwick, H. Padmore, J. Cabana, and D. A. Shapiro, “Three-dimensional localization of nanoscale battery reactions using soft x-ray tomography,” Nat. Commun. 9, 921 (2018). [CrossRef] [PubMed]

9. M. Holler, M. Guizar-Sicairos, E. H. Tsai, R. Dinapoli, E. Müller, O. Bunk, J. Raabe, and G. Aeppli, “High-resolution non-destructive three-dimensional imaging of integrated circuits,” Nature 543, 402–406 (2017). [CrossRef] [PubMed]

10. M. O. Wiedorn, S. Awel, A. J. Morgan, M. Barthelmess, R. Bean, K. R. Beyerlein, L. M. G. Chavas, N. Eckerskorn, H. Fleckenstein, M. Heymann, D. A. Horke, J. Knoška, V. Mariani, D. Oberthür, N. Roth, O. Yefanov, A. Barty, S. Bajt, J. Küpper, A. V. Rode, R. A. Kirian, and H. N. Chapman, “Post-sample aperture for low background diffraction experiments at x-ray free-electron lasers,” J. Synchrotron Radiat. 24, 1296–1298 (2017). [CrossRef] [PubMed]

11. J. Reinhardt, R. Hoppe, G. Hofmann, C. D. Damsgaard, J. Patommel, C. Baumbach, S. Baier, A. Rochet, J.-D. Grunwaldt, G. Falkenberg, and C. G. Schroer, “Beamstop-based low-background ptychography to image weakly scattering objects,” Ultramicroscopy 173, 52–57 (2017). [CrossRef]

12. C. Wang, Z. Xu, H. Liu, Y. Wang, J. Wang, and R. Tai, “Background noise removal in x-ray ptychography,” Appl. Opt. 56, 2099–2111 (2017). [CrossRef] [PubMed]

13. P. Thibault, M. Dierolf, O. Bunk, A. Menzel, and F. Pfeiffer, “Probe retrieval in ptychographic coherent diffractive imaging,” Ultramicroscopy 109, 338–343 (2009). [CrossRef] [PubMed]

14. R. Hesse, D. R. Luke, S. Sabach, and M. K. Tam, “Proximal heterogeneous block implicit-explicit method and application to blind ptychographic diffraction imaging,” SIAM J. Imaging Sci. 8, 426–457 (2015). [CrossRef]

15. H. Chang, P. Enfedaque, and S. Marchesini, “Blind ptychographic phase retrieval via convergent alternating direction method of multipliers,” SIAM J. Imaging Sci. 12, 153–185 (2019). [CrossRef]

16. V. Elser, “Phase retrieval by iterated projections,” J. Opt. Soc. Am. A 20, 40–55 (2003). [CrossRef]

17. P. Thibault and M. Guizar-Sicairos, “Maximum-likelihood refinement for coherent diffractive imaging,” New J. Phys. 14, 063004 (2012). [CrossRef]

18. D. R. Luke, “Relaxed averaged alternating reflections for diffraction imaging,” Inverse Probl. 21, 37–50 (2005). [CrossRef]

19. S. Marchesini, H. Krishnan, D. A. Shapiro, T. Perciano, J. A. Sethian, B. J. Daurer, and F. R. Maia, “SHARP: a distributed, GPU-based ptychographic solver,” J. Appl. Crystallogr. 49, 1245–1252 (2016). [CrossRef]

20. R. Glowinski and P. Le Tallec, Augmented Lagrangian and operator-splitting methods in nonlinear mechanics (SIAM, 1989). [CrossRef]

21. C. Wu and X.-C. Tai, “Augmented Lagrangian method, dual methods and split-Bregman iterations for ROF, vectorial TV and higher order models,” SIAM J. Imaging Sci. 3, 300–339 (2010). [CrossRef]

22. S. Boyd, N. Parikh, E. Chu, B. Peleato, and J. Eckstein, “Distributed optimization and statistical learning via the alternating direction method of multipliers,” Found. Trends Mach. Learn. 3, 1–122 (2011). [CrossRef]

23. P. Godard, M. Allain, V. Chamard, and J. Rodenburg, “Noise models for low counting rate coherent diffraction imaging,” Opt. Express 20, 25914–25934 (2012). [CrossRef] [PubMed]

24. M. Odstrčil, A. Menzel, and M. Guizar-Sicairos, “Iterative least-squares solver for generalized maximum-likelihood ptychography,” Opt. Express 26, 3108–3123 (2018). [CrossRef]

25. B. Enders, M. Dierolf, P. Cloetens, M. Stockmar, F. Pfeiffer, and P. Thibault, “Ptychography with broad-bandwidth radiation,” Appl. Phys. Lett. 104, 171104 (2014). [CrossRef]

26. S. Marchesini, A. Schirotzek, C. Yang, H.-T. Wu, and F. Maia, “Augmented projections for ptychographic imaging,” Inverse Probl. 29, 115009 (2013). [CrossRef]

27. B. J. Daurer, H. Krishnan, T. Perciano, F. R. Maia, D. A. Shapiro, J. A. Sethian, and S. Marchesini, “Nanosurveyor: a framework for real-time data processing,” Adv. Struct. Chem. Imaging 3, 7 (2017). [CrossRef] [PubMed]

28. H. Chang, Y. Lou, Y. Duan, and S. Marchesini, “Total variation–based phase retrieval for Poisson noise removal,” SIAM J. Imaging Sci. 11, 24–55 (2018). [CrossRef]

29. H. Chang and S. Marchesini, “Denoising Poisson phaseless measurements via orthogonal dictionary learning,” Opt. Express 26, 19773–19796 (2018). [CrossRef] [PubMed]

30. F. Murtagh, J.-L. Starck, and A. Bijaoui, “Image restoration with noise suppression using a multiresolution support,” Astron. Astrophys., Suppl. Ser. 112, 179 (1995).

31. A. Chakrabarti and T. Zickler, “Image restoration with signal-dependent camera noise,” arXiv 1204.2994 (2012).

32. J. Li, Z. Shen, R. Yin, and X. Zhang, “A reweighted l² method for image restoration with poisson and mixed poisson-gaussian noise,” Inverse Probl. Imaging (Springfield) 9, 875–894 (2015). [CrossRef]

33. N. M. Kirby, S. T. Mudie, A. M. Hawley, D. J. Cookson, H. D. Mertens, N. Cowieson, and V. Samardzic-Boban, “A low-background-intensity focusing small-angle x-ray scattering undulator beamline,” J. Appl. Crystallogr. 46, 1670–1680 (2013). [CrossRef]

34. H. Chang, S. Marchesini, Y. Lou, and T. Zeng, “Variational phase retrieval with globally convergent preconditioned proximal algorithm,” SIAM J. Imaging Sci. 11, 56–93 (2018). [CrossRef]

35. Z. Wen, C. Yang, X. Liu, and S. Marchesini, “Alternating direction methods for classical and ptychographic phase retrieval,” Inverse Probl. 28, 115010 (2012). [CrossRef]

36. H. Chang, Y. Lou, M. K. Ng, and T. Zeng, “Phase retrieval from incomplete magnitude information via total variation regularization,” SIAM J. Sci. Comput. 38, A3672–A3695 (2016). [CrossRef]

37. H. Chang, P. Enfedaque, Y. Lou, and S. Marchesini, “Partially coherent ptychography by gradient decomposition of the probe,” Acta Crystallogr., Sect. A: Found. Adv. 74, 157–169 (2018). [CrossRef]

38. H. Chang and S. Marchesini, “A general framework for denoising phaseless diffraction measurements,” arXiv 1611.01417 (2016).

39. M. J. Cherukara, Y. S. Nashed, and R. J. Harder, “Real-time coherent diffraction inversion using deep generative networks,” arXiv 1806.03992 (2018).

Advanced denoising for X-ray ptychography

Abstract

1. Introduction

Experimental noise overview

Structured noise

Random noise

Outliers

Related work

Motivations and contributions

2. Proposed iterative algorithm

2.1. Proposed model

2.2. ADMM denoising for phase retrieval (ADP)

Convergence analysis

2.3. ADP with regularization (ADPr)

3. Experimental results

3.1. Synthetic data

3.2. Experimental data

Soft X-ray dataset from the ALS, Lawrence Berkeley Lab

Hard X-ray dataset from PETRA III at DESY

4. Conclusion

Appendix

A. Revisit of SHARP background retrieval algorithm

B. Derivations for ADP (Algorithm 1) and ADPr (Algorithm 2)

ADP

ADPr

C. Proof of theorem 1

Funding

Acknowledgments

References

Cited By

Figures (9)

Tables (2)

Equations (73)

Optics Express