## Abstract

Random illumination microscopy (RIM) using uncontrolled speckle patterns has been shown to surpass Abbe's diffraction barrier, making it possible to design inexpensive and versatile structured illumination microscopy (SIM) devices. In this paper, I first review the state-of-the-art joint reconstruction methods in RIM and then propose a unified joint reconstruction approach in which the performance of various regularization terms can be evaluated under the same model. The model hyperparameter is easier to tune and more robust than in previous methods, and the $\ell _{2,1}$ regularizer is shown to be a reasonable prior in most practical situations. Moreover, the degradation caused by out-of-focus light in conventional SIM can be easily addressed in the RIM setup.

© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## 1. Introduction

The resolution of conventional optical microscopy is limited by the diffraction effect of waves. In the past twenty years, numerous super-resolution techniques have been proposed in fluorescence microscopy to surpass this limitation, such as stochastic optical reconstruction microscopy (STORM) [1] and stimulated emission depletion fluorescence scanning microscopy (STED) [2], enabling a resolution of approximately 10 to 100 nm.

In this paper, I focus on the technique of random illumination microscopy (abbreviated as RIM, following [3]). RIM can be seen as a blind version of structured illumination microscopy (SIM), which achieves super-resolution by illuminating an object $\rho$ with a few structured patterns $I_m$. In fact, when RIM was first proposed, it was called blind-SIM [4]. In the linear regime, the measured dataset $\{ y_m\}_{m=1}^{M}$ is related to the sample $\rho$ by [5]:

$$y_m = h \ast (\rho \times I_m), \qquad m = 1,\ldots ,M \tag{1}$$

where $h$ is the point spread function (PSF) of the system and $\ast$ denotes the convolution operator. The product of the sample $\rho$ with the structured patterns $I_m$ shifts otherwise unobservable high-frequency information of the sample into a lower-frequency region, so that this high-frequency component can pass through the optical system [6]. As wide-field imaging techniques, standard SIM and RIM acquire data much faster than STORM and STED over a relatively large field of view, and super-resolution imaging of living samples has been demonstrated for both SIM and RIM [3,7].

The discretized form of (1), where each 2D quantity is represented by a column vector, is:

$$\mathbf {y}_m = \mathbf {H}(\boldsymbol {\rho } \circ \mathbf {I}_m) + \boldsymbol {\epsilon }_m, \qquad m = 1,\ldots ,M \tag{2}$$

in which $\mathbf {y}_m \in \mathbb {R}^{L}$ is the recorded raw image, $\mathbf {H} \in \mathbb {R}^{L\times N}$ is the discrete convolution matrix built from the discretized PSF, and $\circ$ denotes the element-wise product. $\boldsymbol {\rho } \in \mathbb {R}^{N}$ denotes the discretized fluorescence density, $\mathbf {I}_m \in \mathbb {R}^{N}$ is the $m$-th illumination with homogeneous intensity mean $I_0$, and $\boldsymbol {\epsilon }_m \in \mathbb {R}^{L}$ is the noise in the imaging process.

In standard SIM, the object is illuminated by a set of harmonic patterns with designed spatial frequencies and phases. However, because SIM relies on perfect knowledge of the illumination, distortions of the illumination induced by the sample within the investigated volume reduce its super-resolution (SR) capacity and introduce strong artefacts [8,9]. One way to address this problem is to use uncontrolled speckle patterns as a substitute for the harmonic illumination in SIM. Compared with harmonic illumination, speckle patterns are easier to generate, while super-resolution is still attainable [4]. The spatial resolution or image quality of RIM may not be as good as that obtained by standard SIM with the same photon budget; nevertheless, this strategy is well worth studying because controlling the illumination may be difficult or even impossible in certain situations, such as photoacoustic imaging or fluorescence microscopy with some specific fluorochromes. Although RIM requires more raw images than standard SIM, its temporal resolution is as good as that of SIM when an interleaved reconstruction strategy is used (i.e., the stacks of raw images used to form neighboring super-resolved frames overlap). For two-color imaging and 3D imaging, the temporal resolution of RIM is better than that of SIM owing to the simplicity of the experimental protocol [3].
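The frequency-shifting argument behind SIM and RIM can be illustrated in one dimension. The following minimal sketch is not taken from the original work: it multiplies a sample harmonic by an illumination harmonic and lists the frequencies present in the product. The values `f_sample`, `f_illum` and the cutoff `f_cut` are arbitrary illustrative choices.

```python
import numpy as np

def mixing_spectrum(n=256, f_sample=60, f_illum=50):
    """Return |FFT| of the product of a sample harmonic and an illumination harmonic.

    Frequencies are given in cycles per field of view (integer FFT bins).
    """
    x = np.arange(n) / n
    sample = np.cos(2 * np.pi * f_sample * x)   # fine detail of the object
    illum = np.cos(2 * np.pi * f_illum * x)     # structured illumination
    return np.abs(np.fft.rfft(sample * illum))

f_cut = 32                                      # hypothetical cutoff of the optics (bins)
spectrum = mixing_spectrum()
peaks = np.flatnonzero(spectrum > 1e-6 * spectrum.max())
# The product contains the difference and sum frequencies of the two harmonics.
# The sample frequency (60) lies beyond f_cut, but the difference frequency does not.
print("frequencies in the product:", peaks)
print("frequencies passed by the optics:", peaks[peaks <= f_cut])
```

With speckle illumination the same mixing happens for every spatial frequency contained in the speckle, which is why the raw images carry information from beyond the cutoff even though the patterns are unknown.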

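In the joint reconstruction approach of RIM, the fluorescence image and the speckle patterns are estimated simultaneously; the approach is so named because the quantity of interest (the fluorescence distribution) is obtained jointly with nuisance parameters (the unknown speckle patterns). By introducing an auxiliary variable $\mathbf {q}_m = \boldsymbol {\rho } \circ \mathbf {I}_m$ and stacking the noise vectors as $\boldsymbol {\mathcal {E}}=[\boldsymbol {\epsilon }_1,\ldots ,\boldsymbol {\epsilon }_M] \in \mathbb {R}^{L\times M}$, the image formation model (2) can be written in matrix form as:

$$\mathbf {Y} = \mathbf {H}\mathbf {Q} + \boldsymbol {\mathcal {E}} \tag{3}$$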

with $\mathbf {Y}=[\mathbf {y}_1,\ldots ,\mathbf {y}_M] \in \mathbb {R}^{L\times M}$ and $\mathbf {Q}=[\mathbf {q}_1,\ldots ,\mathbf {q}_M] \in \mathbb {R}^{N\times M}$. The task is now to estimate the matrix $\mathbf {Q}$ from the measurement matrix $\mathbf {Y}$. Once $\mathbf {Q}$ is obtained, $\boldsymbol {\rho }$ can be retrieved either from the mean of the columns $\mathbf {q}_m$ (estimator (4)) or from their standard deviation (estimator (5)).

Several reconstruction methods have been proposed in RIM [4,10–13]. In [13], a marginal approach is proposed in which the estimator depends on the statistics of the speckle patterns but not on their specific values. By exploiting the second-order statistics of the data, the super-resolution capacity of RIM has been shown to be as good as that of classic SIM in the asymptotic regime for a standard epi-illumination geometry. However, the computational complexity of the methods presented in [13] is $\mathcal {O}(N^{3})$, which is prohibitive for images of realistic size. On the other hand, the methods in [4,10,11] share a similar framework, i.e., they reconstruct the object by minimizing a data fidelity term plus a regularizer:

In the joint reconstruction approach of RIM, super-resolution is induced by the regularization term, while the data fidelity term yields no super-resolution information if only the first-order statistics of the speckle are used [10]. Although various regularization terms have been proposed in previous articles, some points remain unclear in the existing methods:

- The hyperparameter that corresponds to the regularizer in model (6) is not easy to tune.

This paper shows that the joint sparsity of the matrix $\mathbf {Q}$ is related to the sparsity of the object $\boldsymbol {\rho }$ itself, given that the speckle patterns are second-order stationary, and is thus a reasonable prior. To solve the hyperparameter tuning problem of form (6), we first transform it into a constrained form and then solve the constrained minimization problem with the help of an indicator function and a primal-dual splitting algorithm. Simulation results show that the hyperparameter in the new model is more robust than $\mu$ in form (6). Finally, we demonstrate that the estimator (5) can help remove the background signal, which is inevitable in real experiments.

## 2. Problem formulation

To tackle the hyperparameter tuning problem in [4,10,11], the unconstrained minimization model (6) is transformed to a constrained one:

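To introduce the prior information, the $q$-th power of the $\ell _{p,q}$ norm of $\mathbf {Q}$ is chosen, which is defined as:

$$\lVert \mathbf {Q} \rVert _{p,q}^{q} = \sum _{n=1}^{N} \Big ( \sum _{m=1}^{M} \lvert q_{n,m} \rvert ^{p} \Big )^{q/p}$$

where $q_{n,m}$ denotes the $(n,m)$-th entry of $\mathbf {Q}$.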

#### 2.1 Relationship between joint sparsity and prior information of object $\rho$

In this section, the relation between the joint sparsity of $\mathbf {Q}$ and the prior of object $\rho$ is analyzed. For the $n$-th row of matrix $\mathbf {Q}$, its $\ell _p$ norm is given by:

#### 2.2 Equivalently unconstrained form

To simplify the notation, let us define:

Now, the problem (8) can be expressed in vector form:

Inspired by the so-called C-SALSA algorithm [21], I first transform problem (15) into an unconstrained optimization problem by introducing an indicator function. The feasible set $\mathbf {E}(\xi ,\mathcal {H},\boldsymbol{\mathfrak{y}})$ is defined as:

#### 2.3 Primal-dual splitting method

The optimization problem of (17) can be tackled using proximal-splitting algorithms, such as the alternating direction method of multipliers (ADMM) [21] or the primal-dual method proposed in [19]. In this paper I choose the latter. To solve (17) with the primal-dual algorithm, two auxiliary variables $\boldsymbol{\mathfrak{d}}$ and $\boldsymbol{\mathfrak{r}}$ are introduced, with $\boldsymbol{\mathfrak{d}} = \boldsymbol{\mathfrak{q}}$ and $\boldsymbol{\mathfrak{r}}=\mathcal {H} \boldsymbol{\mathfrak{q}}$. Then, (17) can be rewritten as:

Equation (19) is a particular case considered in [19], and the associated primal-dual algorithm is presented as follows:

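$\textrm {Prox}_f(\mathbf {x})$ in the iterations denotes the proximal operator of the function $f$, whose definition is given by [14]:

$$\textrm {Prox}_f(\mathbf {x}) = \mathop {\textrm {arg\,min}}_{\mathbf {u}} \; f(\mathbf {u}) + \frac {1}{2} \lVert \mathbf {u} - \mathbf {x} \rVert _2^{2}$$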

Following [19, Theorem 5.2], the convergence of the primal-dual iteration is guaranteed if $q=1$ and the parameters $(\tau ,\sigma ,\theta )$ in Algorithm 1 satisfy:

The computational burden of the primal-dual algorithm lies mainly in the products $\mathcal {H}\boldsymbol{\mathfrak{r}}$ and $\mathcal {H}^{*}\boldsymbol{\mathfrak{r}}$ for $\boldsymbol{\mathfrak{r}} \in \mathbb {R}^{MN}$. Using the fast Fourier transform (FFT), the computational complexity of each primal-dual iteration is $\mathcal {O}(MN\log N)$.
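One inexpensive building block of such iterations is the proximal operator of the indicator function of the feasible set. In C-SALSA-type formulations this set is an $\ell _2$ ball around the data, $\{\boldsymbol{\mathfrak{r}} : \lVert \boldsymbol{\mathfrak{r}} - \boldsymbol{\mathfrak{y}} \rVert _2 \leq \xi \}$. Assuming that form here, the proximal operator reduces to a Euclidean projection, as in the minimal sketch below (the function name `project_onto_ball` is hypothetical).

```python
import numpy as np

def project_onto_ball(r, y, xi):
    """Euclidean projection of r onto the set {v : ||v - y||_2 <= xi}."""
    diff = r - y
    norm = np.linalg.norm(diff)
    if norm <= xi:
        return r.copy()                 # already feasible: nothing to do
    return y + (xi / norm) * diff       # rescale onto the boundary of the ball

y = np.zeros(4)                         # toy data vector
r = np.array([3.0, 4.0, 0.0, 0.0])      # point at distance 5 from y
p = project_onto_ball(r, y, xi=1.0)     # projected point at distance 1 from y
```

The cost of this step is linear in the number of unknowns, so it does not change the overall complexity, which remains dominated by the FFT-based products.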

## 3. Simulation results and experiments

To study the numerical performance of the proposed $\ell _{p,q}$ norm model, a 2D ‘star-like’ simulated target whose fluorescence density in polar coordinates is given by $\rho (r,\theta ) \propto [1+\cos (40\theta )]$ is used as the true object. The top-left quarter of the object is shown in Fig. 1. One advantage of this object is that its spatial frequencies increase when approaching the star center, making it easy to visualize the resolution improvement. The point spread function is chosen as:

where $J_1$ is the first-order Bessel function of the first kind, NA is the numerical aperture of the objective, set to 1.49, and $k_0=\frac {2\pi }{\lambda }$ is the free-space wavenumber, with $\lambda$ the wavelength (taken to be identical for emission and excitation). The radius $r$ from the center of the object down to which conventional wide-field microscopy can resolve the pattern can be easily deduced from the relation:

The sampling step in the object should be finer than $\lambda /(8\,\textrm {NA})$ to observe an SR factor of two. In the simulations, a sampling step of $\lambda / 20$ is adopted so that aliasing does not destroy the attainable SR. For the raw images, no information is lost as long as the sampling rate is higher than the Nyquist rate $4\,\textrm {NA}/\lambda$. In the simulations performed in this section, the sampling rate of the raw images is set to be the same as that of the object.
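The star target and the wide-field resolution radius can be sketched numerically. The code below is an illustration under stated assumptions, not the author's implementation: it assumes that the radius is obtained by equating the local frequency of the star along a circle of radius $r$, $40/(2\pi r)$, to the incoherent wide-field cutoff $2\,\textrm {NA}/\lambda$. The function names and the grid size are hypothetical.

```python
import numpy as np

def star_target(n_pix=256, n_branches=40):
    """Star-like density (1 + cos(n_branches * theta)) / 2 sampled on a square grid."""
    c = (n_pix - 1) / 2.0                       # grid center (falls between pixels)
    yy, xx = np.mgrid[0:n_pix, 0:n_pix]
    theta = np.arctan2(yy - c, xx - c)
    return 0.5 * (1.0 + np.cos(n_branches * theta))

def widefield_radius(wavelength, na, n_branches=40):
    """Radius at which the local star frequency equals the cutoff 2 * NA / wavelength.

    Assumption: the local period along a circle of radius r is 2 * pi * r / n_branches.
    """
    return n_branches * wavelength / (4.0 * np.pi * na)

rho = star_target()
r_wf = widefield_radius(wavelength=1.0, na=1.49)   # in units of the wavelength
r_sr = r_wf / 2.0                                   # radius reached with an SR factor of two
print(f"wide-field radius: {r_wf:.3f} wavelengths, SR radius: {r_sr:.3f} wavelengths")
```

Because the local frequency scales as $1/r$, doubling the accessible bandwidth halves the radius of the unresolved central disk, which is what makes this target convenient for reading the resolution gain directly from the image.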

The speckle patterns are generated through the same optical device as used for the collection of raw images, unless otherwise stated. Under this condition, the frequency support of the speckle has the same shape as the OTF of the system for the unapodized pupil [20, Section 7.7]. The boundary conditions of the object $\boldsymbol {\rho }$ are assumed to be periodic; thus, the convolution matrix $\mathbf {H}$ has a block-circulant with circulant-block (BCCB) structure [22], and the matrix-vector product $\mathbf {H}\mathbf {v}$ can be computed with the FFT algorithm.
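The FFT-based product with a BCCB matrix can be sketched as follows. This is a minimal illustration with a random stand-in PSF and a spatially uncorrelated speckle realization (both assumptions made only to keep the example short); the helper names `apply_H` and `apply_H_adjoint` are hypothetical. The result is checked against an explicit circular convolution.

```python
import numpy as np

def apply_H(otf, img):
    """Circular convolution H v computed in the Fourier domain (periodic boundaries)."""
    return np.real(np.fft.ifft2(otf * np.fft.fft2(img)))

def apply_H_adjoint(otf, img):
    """Adjoint product H^* v, obtained with the conjugate transfer function."""
    return np.real(np.fft.ifft2(np.conj(otf) * np.fft.fft2(img)))

rng = np.random.default_rng(0)
n = 8
psf = rng.random((n, n))
psf /= psf.sum()                          # stand-in PSF, normalized to unit sum
otf = np.fft.fft2(psf)                    # PSF origin assumed at pixel (0, 0)
rho = rng.random((n, n))                  # toy object
speckle = rng.exponential(1.0, (n, n))    # one illumination I_m (uncorrelated here)

q = rho * speckle                         # auxiliary variable q_m = rho o I_m
y_fft = apply_H(otf, q)                   # noiseless raw image H q_m

# Reference: explicit circular convolution, quadratic cost in the number of pixels
y_ref = np.zeros((n, n))
for i in range(n):
    for j in range(n):
        y_ref += psf[i, j] * np.roll(q, shift=(i, j), axis=(0, 1))

print(np.allclose(y_fft, y_ref))
```

Applying the same two functions to each of the $M$ columns of $\mathbf {Q}$ gives the per-iteration cost quoted for the primal-dual algorithm.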

First, numerical simulations are performed with 300 speckle patterns. The low-resolution raw images are corrupted with Gaussian white noise, with a corresponding SNR of 40 dB. In the primal-dual algorithm, I set $\theta = \sigma =1$ and $\tau = 0.35$, with $\boldsymbol{\mathfrak{q}}_0, \boldsymbol{\mathfrak{d}}_0,\boldsymbol{\mathfrak{r}}_0$ initialized with zeros. $\xi$ is set to its true value $\xi _{\textrm {real}} = \sqrt {MN}\nu$, where $\nu$ is the standard deviation of the noise, unless otherwise stated.
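The link between the noise level and the bound $\xi _{\textrm {real}}$ can be checked numerically. The sketch below assumes that the SNR is defined as $10\log _{10}$ of the mean signal power over the noise variance (the definition is not stated in this section) and uses random stand-in data with hypothetical sizes; since the raw images are sampled like the object, $L=N$ here.

```python
import numpy as np

def add_gaussian_noise(Y, snr_db, rng):
    """Add white Gaussian noise such that 10*log10(mean(Y**2) / nu**2) = snr_db."""
    nu = np.sqrt(np.mean(Y ** 2) / 10.0 ** (snr_db / 10.0))   # noise standard deviation
    return Y + nu * rng.standard_normal(Y.shape), nu

rng = np.random.default_rng(0)
L, M = 1024, 50                          # hypothetical sizes: pixels per image, images
Y_clean = rng.random((L, M))             # stand-in for the noiseless data H Q
Y_noisy, nu = add_gaussian_noise(Y_clean, snr_db=40.0, rng=rng)

xi_real = np.sqrt(L * M) * nu            # bound used in the data constraint
residual = np.linalg.norm(Y_noisy - Y_clean)   # Frobenius norm of the noise realization
print(f"xi_real = {xi_real:.4f}, ||noise||_F = {residual:.4f}")
```

The two printed values agree to within a fraction of a percent, because the squared norm of $MN$ independent noise samples concentrates around $MN\nu ^{2}$.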

The Wiener deconvolution of the mean of the raw images $\bar {\mathbf {y}} = \frac {1}{M} \sum _m \mathbf {y}_m$ is shown in Fig. 1(b). As expected, the Wiener deconvolution of the wide-field image shows no super-resolved patterns (those inside the green solid line). The image reconstructed by the method of [4], which uses only the positivity constraint, is shown in Fig. 1(c). It retrieves partial super-resolution information; however, the modulation contrast in the super-resolved part is relatively low, in agreement with the results reported in [4]. Figures 1(d) and 1(e) are obtained using the $\ell _{2,0}$ norm regularizer with the M-SBL algorithm as in [11] and the $\ell _1 + \ell _2$ norm plus positivity regularizer with the PPDS algorithm presented in [10], respectively. The image reconstructed by the M-SBL algorithm does not scale well, and a strong bias is visible in the low-resolution part. The M-SBL iterations are stopped after a fixed number of iterations (i.e., 20), as indicated in [11]. The computational complexity of the M-SBL algorithm is in fact $\mathcal {O}(N^{3})$, as high as that of the marginal approach [13], whose result is plotted in Fig. 1(f). The reconstruction by the marginal approach agrees with its theoretical predictions; however, the computational burden makes it impractical for images of realistic size. Possible ways to reduce the computational burden of the marginal approach are beyond the scope of this paper.

#### 3.1 Reconstruction with different $\{p,q\}$ pairs

Figure 2 shows the reconstruction results obtained by minimizing the constrained $\ell _{p,q}$ model with different $(p,q)$ pairs. To measure the quality of the reconstructed images, the normalized radially averaged power spectrum (RAPS) of the error images is plotted, which is defined as:

As for the two estimators (4) and (5), on the one hand, there is no difference in terms of super-resolution when the $\ell _{2,1}$ prior is selected (see Fig. 2(b) and 2(e)). On the other hand, the standard deviation of the Wiener deconvolutions yields partial super-resolution, as shown in Fig. 2(f), whereas their mean yields no super-resolution [10]. The differences between the two estimators are discussed further in Section 3.2.

#### 3.2 Two estimators and background removal

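The real images recorded by a microscope are always contaminated by an out-of-focus background. A more accurate model than (2) to describe the imaging process is:

$$\mathbf {y}_m = \mathbf {H}(\boldsymbol {\rho } \circ \mathbf {I}_m) + \mathbf {b} + \boldsymbol {\epsilon }_m$$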

with $\mathbf {b} \in \mathbb {R}^{L}$ denoting the background signal. Consequently, the reconstructed $m$-th column of $\mathbf {Q}$ is in fact $\hat {\mathbf {q}}_m =\mathbf {q}_m^{\perp } + \mathbf {H}^{+}(\boldsymbol {\epsilon }_m + \mathbf {b})$, with $\mathbf {q}_m^{\perp }$ indicating the estimate of $\mathbf {q}_m$ in the absence of noise and $\mathbf {H}^{+}$ the pseudo-inverse of $\mathbf {H}$. If we continue estimating $\boldsymbol {\rho }$ by averaging $\hat {\mathbf {q}}_m$, then the estimated object will be blurred by $\mathbf {H}^{+}\mathbf {b}$:

The nonmodulated background signal dramatically degrades the image quality in conventional SIM [23]. Instead of modelling the background with a smooth function [24] or carrying out background subtraction heuristically [25], a natural strategy that is much easier in the RIM setup is to estimate the object by Eq. (5), leveraging the fact that the speckle patterns form a second-order stationary random process. Unlike the ensemble mean, the empirical variance of $\hat {\mathbf {q}}_m$ is not affected by the background, so (5) holds even when a strong background signal is present.

To verify this point, simulation results using 300 speckle patterns and 40 dB Gaussian noise with a fixed background (Cameraman) are shown in Fig. 4. The images in Fig. 4(b) and 4(c) are obtained by minimizing the constrained $\ell _{2,1}$ regularizer. As expected, both the Wiener deconvolution of the wide-field image and the mean of $\hat {\mathbf {q}}_m$ are blurred by the background, while the standard deviation of $\hat {\mathbf {q}}_m$ (Fig. 4(c)) is rather clear.
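The background-cancelling property of the standard-deviation estimator can be reproduced with a toy example. The sketch below is a simplification rather than the simulation of Fig. 4: it ignores the PSF (i.e., takes $\mathbf {H}$ as the identity), draws spatially uncorrelated exponential speckle, and assumes that both estimators are normalized by $I_0$, which is appropriate for fully developed speckle whose mean and standard deviation are both equal to $I_0$.

```python
import numpy as np

rng = np.random.default_rng(1)
N, M, I0 = 500, 300, 1.0
rho = rng.random(N)                            # toy object
b = 2.0 + rng.random(N)                        # fixed background, identical in every frame
speckle = rng.exponential(I0, size=(N, M))     # fully developed speckle intensities
Q_hat = rho[:, None] * speckle + b[:, None]    # columns q_m plus unmodulated background

rho_mean = Q_hat.mean(axis=1) / I0             # mean-based estimate: biased by b / I0
rho_std = Q_hat.std(axis=1, ddof=1) / I0       # std-based estimate: b cancels out

err_mean = np.mean((rho_mean - rho) ** 2)
err_std = np.mean((rho_std - rho) ** 2)
print(f"MSE mean-based: {err_mean:.3f}, MSE std-based: {err_std:.5f}")
```

Adding a constant to every frame shifts the per-pixel mean but leaves the per-pixel standard deviation unchanged, which is the whole mechanism: only the part of the signal modulated by the speckle contributes to the second-order statistics.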

#### 3.3 Influence of the hyperparameter

The hyperparameter $\xi$, which reflects the level of the additive noise, was assumed to be known in the previous simulations. In this section, its influence on the estimator is explored by considering cases in which it is not correctly set. The reconstruction results with 300 speckle patterns and 40 dB white noise using different $\xi$ values are shown in the first row of Fig. 5. When $\xi$ is equal to 5 times its true value, the super-resolution is partially lost. However, if $\xi$ is chosen to be much lower than its true value, as shown in Fig. 5(b) and 5(c), no evident visual differences are observed in comparison with the case in which it is correctly set (Fig. 2(b)). In contrast, the quality of the image reconstructed by solving the optimization problem of the form (6) using the FISTA algorithm [26,27] is more sensitive to the hyperparameter $\mu$. When $\mu$ is set relatively high, strong artifacts are observed in the low-frequency image component.

#### 3.4 Necessary number of speckle patterns

The $\ell _{p,q}$ norm prior is based on the statistical properties of speckle patterns. It is reasonable only when $M$ is relatively large (Eqs. (11) and (12)). In real experiments, more illumination patterns mean a longer data acquisition time and thus a lower temporal resolution. How many speckle patterns are required to achieve a reasonable performance? To answer this question, we perform simulations under 40 dB Gaussian noise and various illumination numbers with $\ell _{2,1}$ prior as shown in Fig. 6. It is clear that as the illumination number increases, the quality of the reconstruction improves. When the number of illuminations is low, strong artifacts are observed.

Figure 6(d) shows, as a cloud of points, the mean square error (MSE) of $5$ simulations for each number of speckle patterns, with the MSE of the reconstructed image given by $\textrm {MSE} =\frac {1}{N} \lVert \boldsymbol {\rho }-\hat {\boldsymbol {\rho }}\rVert _2^{2}$. The solid line indicates the average MSE for each illumination number $M$. Both the MSE values and their variability decrease as the number of frames increases, which provides practical guidance for choosing the number of frames needed to reach a target MSE value. Based on the simulation results, I suggest that the number of speckle patterns should be no less than $80$ in real experiments with the $\ell _{2,1}$ prior. Furthermore, Fig. 6(d) indicates that the MSE of estimator (4) is generally smaller than that of estimator (5). This is not surprising considering that the sample mean converges faster than the sample standard deviation (see Appendix B).
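The difference in convergence speed between the two statistics can be illustrated with a small Monte Carlo experiment on a single pixel of unit density. This sketch assumes exponentially distributed, independent speckle intensities and leaves out the reconstruction step entirely, so it only isolates the statistical effect; the function name and trial count are hypothetical.

```python
import numpy as np

def estimator_variances(M, n_trials=4000, I0=1.0, seed=0):
    """Empirical variance of the mean- and std-based estimates for a unit-density pixel."""
    rng = np.random.default_rng(seed)
    intensities = rng.exponential(I0, size=(n_trials, M))
    est_mean = intensities.mean(axis=1) / I0          # sample mean over M patterns
    est_std = intensities.std(axis=1, ddof=1) / I0    # sample standard deviation
    return est_mean.var(), est_std.var()

for M in (20, 80, 300):
    v_mean, v_std = estimator_variances(M)
    print(f"M={M:4d}  var(mean)={v_mean:.4f}  var(std)={v_std:.4f}")
```

For exponential statistics the variance of the sample mean is $1/M$, while a first-order (delta-method) approximation gives roughly $2/M$ for the sample standard deviation, so both decrease with $M$ but the standard deviation needs about twice as many patterns to reach the same variance.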

#### 3.5 Applying Poisson noise

In the previous simulations, the shot noise caused by the random arrival of photons was not considered. The number of photons arriving within a given time period follows a Poisson distribution. Our analysis suggests that the $\ell _{p,q}$ regularizer and the estimators (4) and (5) remain valid after introducing Poisson noise as long as the photon counting rates are not too low, which is true for most practical situations. Even under low photon counting rates, the influence of Poisson noise on the $\ell _{1,1}$ prior together with estimator (4) can be neglected (see details in Appendix C).

In Fig. 7, we show the reconstructions using 300 speckle patterns under a mixture of Poisson and Gaussian noise. The SNR of the added Gaussian noise is 20 dB. Fig. 7(a) shows that the $\ell _{2,1}$ regularizer performs well, as expected, when each raw image contains 100 photons per pixel on average. In the low photon count case (20 photons per pixel on average), the image quality and resolution obtained with the $\ell _{2,1}$ prior are dramatically degraded (Figs. 7(b) and 7(c)), while the reconstruction with the $\ell _{1,1}$ prior together with estimator (4) remains robust (Fig. 7(e)) in comparison with the reconstruction without considering Poisson statistics (Fig. 7(f)).
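A mixed Poisson-Gaussian corruption of this kind can be simulated as sketched below. The order of operations (scaling to a target mean photon count, Poisson sampling, then additive Gaussian noise) and the SNR definition (mean signal power over noise variance) are assumptions made for illustration, and the stand-in image is random rather than an actual raw image.

```python
import numpy as np

def poisson_gaussian(y_clean, photons_per_pixel, snr_db, rng):
    """Scale to a mean photon count, draw Poisson counts, then add white Gaussian noise."""
    scaled = y_clean * (photons_per_pixel / y_clean.mean())   # expected photon counts
    counts = rng.poisson(scaled).astype(float)                # shot noise
    nu = np.sqrt(np.mean(scaled ** 2) / 10.0 ** (snr_db / 10.0))
    return counts + nu * rng.standard_normal(counts.shape)    # additive Gaussian noise

rng = np.random.default_rng(2)
y_clean = 0.5 + rng.random((128, 128))        # stand-in for a noiseless raw image
y_100 = poisson_gaussian(y_clean, 100, 20.0, rng)   # 100 photons per pixel on average
y_20 = poisson_gaussian(y_clean, 20, 20.0, rng)     # 20 photons per pixel on average
print(y_100.mean(), y_20.mean())
```

Since the Poisson variance equals the expected count, the relative shot-noise fluctuation grows as the photon budget drops, from about 10% at 100 photons to over 20% at 20 photons per pixel.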

#### 3.6 Reconstructions from experimental data

To provide a more convincing illustration of the super-resolution capacity of RIM, the processing of experimental datasets is presented in this section. The raw images are obtained with an objective of $NA=1.49$ and $100\times$ magnification. The excitation wavelength is $405$ nm for the Argolight sample and $488$ nm for the beads and podosome samples, while the collection wavelength is $520$ nm. The PSF is simulated using an ICY plug-in called PSF Generator with the Gibson & Lanni 3D Optical Model. The spatial sampling step is set slightly below the Nyquist step $\frac {\lambda }{4NA}$, i.e., the sampling rate is slightly above the Nyquist rate.

Images reconstructed by minimizing the constrained $\ell _{2,1}$ regularizer are displayed in Figs. 8 and 9. The Argolight sample shown in Fig. 8 is a designed slide in which, from left to right, the spacing between the two middle lines becomes narrower. The data shown in Fig. 9 are obtained from a sample of fluorescent beads with a diameter of $100$ nm and from a podosome sample.

The line section plot extracted from the Argolight reconstructions (indicated by the white dashed line in Fig. 8(a)), normalized to the range $[0,1]$ and shown in Fig. 8(d), reveals that RIM is superior in resolution. Zoomed-in views of a small part of the images of the beads and podosome samples (marked by a green square in Fig. 9) are shown in Fig. 10. The reconstructions of the corresponding subimages obtained by the marginal approach are also plotted in Fig. 10. To remove the out-of-focus information in the raw images, I slightly adapt the objective function introduced in Ref. [13] and reconstruct $\boldsymbol {\rho }$ with only the second-order statistics (that is, the mean values are discarded) by minimizing:

*i.e.,* element-wise) product, $\boldsymbol {\Omega } = \boldsymbol {\Gamma }_y^{-1}\mathbf {H}$ and $\boldsymbol {\Gamma }_{\textrm {s}}$ the covariance of the speckle patterns. The L-BFGS algorithm [28,29] is chosen to optimize the marginal criterion (32). Clearly, better details are visible after introducing the $\ell _{2,1}$ regularizer in comparison to the Wiener deconvolution of the wide-field images (Figs. 10(b) and 10(h)). The image obtained by estimator (5) is less blurred by the background signal than that obtained by estimator (4) (Figs. 10(d) and 10(j)), which is consistent with the previous simulations. The high-frequency structures are confirmed by the marginal approach, which rests on a solid theoretical foundation (Figs. 10(f) and 10(l)).

## 4. Conclusion

RIM can achieve super-resolution imaging with low toxicity and a high temporal resolution comparable to that of standard SIM. In this paper, a unified joint reconstruction approach for RIM based on constrained $\ell _{p,q}$ norm minimization is proposed. Mathematical analysis shows that the joint sparsity of the matrix $\mathbf {Q}$ implies sparsity of the object itself. The analysis takes the statistical priors of the speckle patterns into account, i.e., their ensemble average is homogeneous or they form a second-order stationary random process, as shown in Eqs. (11) and (12). Therefore, it is the sparsity of the object together with the statistical prior of the speckle patterns that induces super-resolution in the $\ell _{p,q}$ norm reconstruction strategy.

Among the $\ell _{p,q}$ family of priors, numerical simulations show that the $\ell _{2,1}$ regularizer is superior to the $\ell _{1,1}$ term in terms of both error and super-resolution. Note that this conclusion is drawn for fully developed speckle illumination. In cases with variational structured illumination, or in the very low photon count regime, the $\ell _{2,1}$ regularizer is no longer a good choice since the illumination statistics change. When $q<1$, the super-resolution information still appears, even though the associated primal-dual splitting method is not guaranteed to reach the global minimum because the corresponding regularization term is no longer convex. The binary effect in the reconstructions is quite evident for $q<1$, as shown in Fig. 2.

The hyperparameter involved in this model is proportional to the standard deviation of the noise; thus, it is easy to tune. We believe that the robust performance of the constrained form (7) will be inspiring for other biomedical imaging inverse problems that share the same form as (6), such as optical diffraction tomography and magnetic resonance imaging [30,31]. In applications like diffraction tomography microscopy, the positivity constraint and total variation (TV norm) regularization are preferable. The TV norm of the object in our imaging setup is equivalent to the composition of the $\ell _{2,1}$ regularizer with a linear operator of $\mathbf {Q}$ (see Appendix D). The associated primal-dual splitting method can find the minimizer of this kind of objective function without major changes and without applying the inverse of the linear operator.

Normally, the inevitable background signal in experiments causes artifacts in the reconstruction and reduces the super-resolution in standard SIM. We show, in both numerical studies and experimental data, that the estimator based on the standard deviation of $\mathbf {q}_m$ can cancel out the background, with an acceptable sacrifice in MSE in comparison to the estimator (4) under the same number of speckle patterns. Only 2D super-resolution problems are considered in this paper; the application to 3D imaging of thick samples in the RIM setup deserves further exploration [3,8,32,33].

## A. Proximal operator for the joint sparsity prior

This section focuses on the proximal operator of $g_1 = \lVert \boldsymbol{\mathfrak{d}} \rVert _{\mathcal {G} pq}^{q}$, whose definition is given by:

Since the partitions of $\boldsymbol{\mathfrak{d}}_{\mathcal {G}_n}$ do not overlap, problem (34) can be decoupled into $N$ independent subproblems. Each subproblem is given below:

For specific $(p,q)$ pairs, the minimization problem (35) has the following analytical formula [34]:

- for $p=2$ and $q = 1/2$: $$\boldsymbol{\mathfrak{d}}_{\mathcal{G}_n} = \begin{cases} \frac{16 \lVert \boldsymbol{\mathfrak{x}}_{\mathcal{G}_n} \rVert_2^{3/2}\omega_n}{3\sqrt{3}\lambda+ 16 \lVert\boldsymbol{\mathfrak{x}}_{\mathcal{G}_n} \rVert_2^{3/2}\omega_n} \boldsymbol{\mathfrak{x}}_{\mathcal{G}_n}, & \lVert \boldsymbol{\mathfrak{x}}_{\mathcal{G}_n} \rVert_2 > \frac{3}{2}(\lambda)^{2/3} \\ \boldsymbol{0} \;\textrm{or} \; \frac{16 \lVert\boldsymbol{\mathfrak{x}}_{\mathcal{G}_n}\rVert_2^{3/2}\omega_n}{3\sqrt{3}\lambda+ 16 \lVert\boldsymbol{\mathfrak{x}}_{\mathcal{G}_n} \rVert_2^{3/2}\omega_n} \boldsymbol{\mathfrak{x}}_{\mathcal{G}_n}, & \lVert \boldsymbol{\mathfrak{x}}_{\mathcal{G}_n} \rVert_2 = \frac{3}{2}(\lambda)^{2/3} \\ \boldsymbol{0}, & \lVert \boldsymbol{\mathfrak{x}}_{\mathcal{G}_n} \rVert_2 < \frac{3}{2}(\lambda)^{2/3} \end{cases}$$ with:
- for $p=2$ and $q = 2/3$: $$\boldsymbol{\mathfrak{d}}_{\mathcal{G}_n} = \begin{cases} \frac{3 \eta^{4}}{2\lambda+3 \eta^{4}}\boldsymbol{\mathfrak{x}}_{\mathcal{G}_n}, & \lVert \boldsymbol{\mathfrak{x}}_{\mathcal{G}_n}\rVert_2 > 2(\frac{2}{3}\lambda)^{3/4}\\ \boldsymbol{0} \; \textrm{or}\; \frac{3 \eta^{4}}{2\lambda+3 \eta^{4}}\boldsymbol{\mathfrak{x}}_{\mathcal{G}_n}, & \lVert \boldsymbol{\mathfrak{x}}_{\mathcal{G}_n}\rVert_2 = 2(\frac{2}{3}\lambda)^{3/4} \\ \boldsymbol{0}, & \lVert \boldsymbol{\mathfrak{x}}_{\mathcal{G}_n}\rVert_2 < 2(\frac{2}{3}\lambda)^{3/4} \end{cases}$$ with $$ \begin{aligned} & \eta = \frac{1}{2} \Bigg(\lvert a \rvert + \sqrt{\frac{2\lVert \boldsymbol{\mathfrak{x}}_{\mathcal{G}_n}\rVert_2}{\lvert a \rvert}-a^{2}} \Bigg),\qquad a = \frac{2}{\sqrt{3}}\big(2\lambda \big)^{1/4} \Big(\cosh\big( \frac{\phi(\boldsymbol{\mathfrak{x}}_{\mathcal{G}_n})}{3}\big) \Big)^{1/2} \\ & \phi(\boldsymbol{\mathfrak{x}}_{\mathcal{G}_n}) = \textrm{arccosh}\Big( \frac{27\lVert\boldsymbol{\mathfrak{x}}_{\mathcal{G}_n} \rVert_2^{2}}{16 \big(2\lambda \big)^{3/2}}\Big) \end{aligned} $$
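For the convex case $p=2$, $q=1$, the proximal operator is the well-known group (row-wise) soft-thresholding. The sketch below states that standard result rather than a formula taken from this appendix. It assumes that each group $\mathcal {G}_n$ is one row of the matrix (the $M$ values attached to pixel $n$) and that $\lambda$ plays the role of the threshold; the function names are hypothetical.

```python
import numpy as np

def lpq_power(Q, p, q):
    """q-th power of the l_{p,q} norm: sum over rows of (l_p norm of the row) ** q."""
    row_norms = np.sum(np.abs(Q) ** p, axis=1) ** (1.0 / p)
    return np.sum(row_norms ** q)

def prox_l21(X, lam):
    """Row-wise soft-thresholding: proximal operator of lam * ||.||_{2,1}."""
    norms = np.linalg.norm(X, axis=1, keepdims=True)
    # Shrink each row norm by lam; rows with norm below lam are set to zero.
    scale = np.maximum(1.0 - lam / np.maximum(norms, 1e-300), 0.0)
    return scale * X

X = np.array([[3.0, 4.0],      # row norm 5: shrunk to norm 4
              [0.3, 0.4]])     # row norm 0.5: below the threshold, set to zero
D = prox_l21(X, lam=1.0)
print(D, lpq_power(X, 2, 1), lpq_power(D, 2, 1))
```

Unlike the nonconvex cases listed above, this operator is single-valued for every input, which is consistent with the convergence guarantee holding only for $q=1$.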

## B. Statistical analysis about estimators (4) and (5)

The intensity of speckle patterns follows an exponential probability distribution [35, Chapter 3]:

$$p(I) = \frac {1}{I_0} \exp \Big (-\frac {I}{I_0}\Big )$$

for $I\geq 0$. The moments of this distribution are:

$$E[I^{j}] = j!\, I_0^{j}$$

Then we have:

$$E[I] = I_0, \qquad \textrm {Var}[I] = I_0^{2}$$

Consider the sample mean divided by $I_0$ as a new random variable: $S_m = \frac {1}{MI_0} \sum _{m=1}^{M} I_m$. Since $I_1, I_2, \ldots , I_M$ are independent and identically distributed, the central limit theorem gives, for large $M$:

$$S_m \sim \mathcal {N}(1, 1/M)$$

where $\mathcal {N}(1, 1/M)$ denotes the normal distribution with mean $1$ and variance $1/M$. Similarly, we define the sample variance as a new random variable:

## C. Validity of $\ell _{p,q}$ regularizer under Poisson noise

The data recorded by detectors in imaging obey Poisson statistics. If we ignore the influence of the point spread function and define:

$$\mathbf {k}_m = \mathcal {P}(\mathbf {q}_m)$$

where $\mathcal {P}(\mathbf {q}_m)$ denotes a realization of the Poisson process with mean $\mathbf {q}_m$, then the validity of the $\ell _{p,q}$ regularizer depends on the statistics of $\mathbf {k}_m$. For brevity, we drop the subscript $m$ and write $k_{r} = k(\mathbf {r}), \; q_{r} = q(\mathbf {r})$, in which $\mathbf {r} \in \{1,\ldots ,N\}$ denotes the spatial index, and $k_r =\mathcal {P}(q_r)$. Since the $q_r$ are themselves random variables, the Poisson variables $k_{r}$ are called *doubly stochastic Poisson random variables* [36].

Since $E[k_r|q_r] = q_r$, it follows directly that

$$E[k_r] = E\big [E[k_r|q_r]\big ] = E[q_r]$$

*i.e.,* $E[\mathbf {k}] = E[\mathbf {q}]$, so (11) and estimator (4) still hold for the random vector $\mathbf {k}$.

Similarly, one can obtain the second-order statistics of $\mathbf {k}$:

Therefore, substituting (50) into (12), we finally get:

As the second-order moments have a quadratic dependence on the random vector $\mathbf {q}$, the changes after introducing Poisson statistics can be neglected except when $\mathbf {q}$ is small, as in low photon counting cases. Even in the low photon count situation, the above analysis shows that the $\ell _{1,1}$ regularizer together with the estimator (4) remains valid.
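The first- and second-order behaviour of doubly stochastic Poisson variables can be checked by simulation. The sketch below considers a single pixel with exponentially distributed intensity (an assumption matching fully developed speckle) and compares the sample statistics with the law of total variance, $\textrm {Var}[k] = \textrm {Var}[q] + E[q]$; the mean photon level and sample size are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(3)
I0, n = 5.0, 200_000                  # hypothetical mean photon level and sample size
q = rng.exponential(I0, size=n)       # random intensity seen by the pixel
k = rng.poisson(q)                    # Poisson counts whose mean is itself random

mean_q, mean_k = q.mean(), k.mean()
var_q, var_k = q.var(), k.var()
# First order: E[k] = E[q].  Second order: Var[k] = Var[q] + E[q].
print(f"E[k] - E[q]               = {mean_k - mean_q:+.4f}")
print(f"Var[k] - (Var[q] + E[q])  = {var_k - (var_q + mean_q):+.4f}")
```

The extra variance term is linear in the intensity while $\textrm {Var}[q]$ is quadratic, so the correction is negligible at high photon counts and becomes significant only when the counts are low, in line with the discussion above.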

## D. TV norm as the composition of the $\ell _{2,1}$ function with a linear operator of $\mathbf {Q}$

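In some biomedical imaging applications, such as diffraction tomography microscopy, the TV norm of the object is preferable. Given an $N_1\times N_2$ object $\rho$, the *isotropic* TV is defined as:

$$\textrm {TV}(\rho ) = \sum _{i=1}^{N_1}\sum _{j=1}^{N_2} \sqrt {(\rho _{i+1,j}-\rho _{i,j})^{2} + (\rho _{i,j+1}-\rho _{i,j})^{2}}$$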
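Numerically, the isotropic TV can be evaluated as an $\ell _{2,1}$ norm of finite differences. The sketch below assumes forward differences with periodic boundaries, an illustrative choice consistent with the periodic boundary conditions used in Section 3 but not necessarily the exact discretization intended here; the function names are hypothetical.

```python
import numpy as np

def gradient_matrix(rho):
    """Stack forward differences (periodic boundaries) as an (N1 * N2) x 2 matrix."""
    dv = np.roll(rho, -1, axis=0) - rho    # vertical differences
    dh = np.roll(rho, -1, axis=1) - rho    # horizontal differences
    return np.stack([dv.ravel(), dh.ravel()], axis=1)

def isotropic_tv(rho):
    """Isotropic TV: l_2 norm of the gradient at each pixel, summed over pixels."""
    return np.sum(np.linalg.norm(gradient_matrix(rho), axis=1))

rho = np.zeros((6, 6))
rho[2:4, 2:4] = 1.0                        # 2 x 2 square of ones on a zero background
tv = isotropic_tv(rho)
print(tv)
```

Each row of the gradient matrix is one group, so the TV value is exactly the $\ell _{2,1}$ norm of a linear transform of the image, and the row-wise structure is the same as that of the joint sparsity prior.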

## Funding

China Scholarship Council (201404490041); Le GdR 720 ISIS (Information, Signal, Image et ViSion).

## Acknowledgments

This work was mainly done during my PhD studies at École Centrale de Nantes, France. I would like to thank Jérôme Idier, Sébastien Bourguignon at École Centrale de Nantes, Laurent Mugnier from ONERA (Châtillon) and the anonymous reviewers for their valuable comments. I also thank Simon Labouesse, Thomas Mangeat for sharing the raw data corresponding to Figs. 8 and 9.

## Disclosures

The authors declare no conflicts of interest.

## References

**1.** M. J. Rust, M. Bates, and X. Zhuang, “Sub-diffraction-limit imaging by stochastic optical reconstruction microscopy (STORM),” Nat. Methods **3**(10), 793–796 (2006).

**2.** S. W. Hell and J. Wichmann, “Breaking the diffraction resolution limit by stimulated emission: stimulated-emission-depletion fluorescence microscopy,” Opt. Lett. **19**(11), 780–782 (1994).

**3.** T. Mangeat, S. Labouesse, M. Allain, R. Poincloux, A. Bouissou, S. Cantaloube, E. Courtais, E. Vega, T. Li, and A. Guenole *et al.*, “Super-resolved live-cell imaging using random illumination microscopy,” bioRxiv (2020).

**4.** E. Mudry, K. Belkebir, J. Girard, J. Savatier, E. Le Moal, C. Nicoletti, M. Allain, and A. Sentenac, “Structured illumination microscopy using unknown speckle patterns,” Nat. Photonics **6**(5), 312–315 (2012).

**5.** J. Goodman, *Introduction to Fourier Optics* (Roberts & Company Publishers, 2005).

**6.** M. G. L. Gustafsson, “Surpassing the lateral resolution limit by a factor of two using structured illumination microscopy,” J. Microsc. **198**(2), 82–87 (2000).

**7.** P. Kner, B. B. Chhun, E. R. Griffis, L. Winoto, and M. G. Gustafsson, “Super-resolution video microscopy of live cells by structured illumination,” Nat. Methods **6**(5), 339–342 (2009).

**8.** A. Jost, E. Tolstik, P. Feldmann, K. Wicker, A. Sentenac, and R. Heintzmann, “Optical sectioning and high resolution in single-slice structured illumination microscopy by thick slice blind-SIM reconstruction,” PLoS One **10**(7), e0132174 (2015).

**9.** R. Ayuk, H. Giovannini, A. Jost, E. Mudry, J. Girard, T. Mangeat, N. Sandeau, R. Heintzmann, K. Wicker, K. Belkebir, and A. Sentenac, “Structured illumination fluorescence microscopy with distorted excitations using a filtered blind-SIM algorithm,” Opt. Lett. **38**(22), 4723–4726 (2013).

**10.** S. Labouesse, A. Negash, J. Idier, S. Bourguignon, T. Mangeat, P. Liu, A. Sentenac, and M. Allain, “Joint reconstruction strategy for structured illumination microscopy with unknown illuminations,” IEEE Trans. on Image Process. **26**(5), 2480–2493 (2017).

**11.** J. Min, J. Jang, D. Keum, S.-W. Ryu, C. Choi, K.-H. Jeong, and J. C. Ye, “Fluorescent microscopy beyond diffraction limits using speckle illumination and joint support recovery,” Sci. Rep. **3**(1), 2075 (2013).

**12.** L.-H. Yeh, L. Tian, and L. Waller, “Structured illumination microscopy with unknown patterns and a statistical prior,” Biomed. Opt. Express **8**(2), 695–711 (2017).

**13.** J. Idier, S. Labouesse, M. Allain, P. Liu, S. Bourguignon, and A. Sentenac, “On the super-resolution capacity of imagers using unknown speckle illuminations,” IEEE Trans. Comput. Imaging **4**(1), 87–98 (2018).

**14.** P. L. Combettes and J.-C. Pesquet, “Proximal splitting methods in signal processing,” in *Fixed-point Algorithms for Inverse Problems in Science and Engineering* (Springer, 2011), pp. 185–212.

**15. **A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM J. Imaging Sci. **2**(1), 183–202 (2009). [CrossRef]

**16. **J. M. Bioucasdias and M. A. T. Figueiredo, “A new twist: Two-step iterative shrinkage/thresholding algorithms for image restoration,” IEEE Trans. on Image Process. **16**(12), 2992–3004 (2007). [CrossRef]

**17. **T. W. Murray, M. Haltmeier, T. Berer, E. Leiss-Holzinger, and P. Burgholzer, “Super-resolution photoacoustic microscopy using blind structured illumination,” Optica **4**(1), 17–22 (2017). [CrossRef]

**18. **R. T. Rockafellar, * Convex Analysis* (Princeton University Press, 2015).

**19. **L. Condat, “A primal–dual splitting method for convex optimization involving lipschitzian, proximable and linear composite terms,” J. Optim. Theory Appl. **158**(2), 460–479 (2013). [CrossRef]

**20. **J. W. Goodman, * Statistical Optics* (John Wiley & Sons, 2015).

**21. **M. V. Afonso, J. M. Bioucas-Dias, and M. A. Figueiredo, “An augmented lagrangian approach to the constrained optimization formulation of imaging inverse problems,” IEEE Trans. on Image Process. **20**(3), 681–695 (2011). [CrossRef]

**22. **P. C. Hansen, J. G. Nagy, and D. P. O’leary, * Deblurring Images: Matrices, Spectra, and Filtering* (SIAM, 2006).

**23. **J. Demmerle, C. Innocent, A. J. North, G. Ball, M. Müller, E. Miron, A. Matsuda, I. M. Dobbie, Y. Markaki, and L. Schermelleh, “Strategic and practical guidelines for successful structured illumination microscopy,” Nat. Protoc. **12**(5), 988–1010 (2017). [CrossRef]

**24. **F. Orieux, E. Sepulveda, V. Loriette, B. Dubertret, and J. C. Olivomarin, “Bayesian estimation for optimized structured illumination microscopy,” IEEE Trans. on Image Process. **21**(2), 601–614 (2012). [CrossRef]

**25. **A. Lal, C. Shan, and P. Xi, “Structured illumination microscopy image reconstruction algorithm,” IEEE J. Select. Topics Quantum Electron. **22**(4), 50–63 (2016). [CrossRef]

**26. **A. Beck and M. Teboulle, “A fast iterative shrinkage-thresholding algorithm for linear inverse problems,” SIAM J. Imaging Sci. **2**(1), 183–202 (2009). [CrossRef]

**27. **T. W. Murray, M. Haltmeier, T. Berer, E. Leiss-Holzinger, and P. Burgholzer, “Super-resolution photoacoustic microscopy using blind structured illumination,” Optica **4**(1), 17–22 (2017). [CrossRef]

**28. **D. C. Liu and J. Nocedal, “On the limited memory BFGS method for large scale optimization,” Math. programming **45**(1-3), 503–528 (1989). [CrossRef]

**29. **M. Schmidt, “minfunc: unconstrained differentiable multivariate optimization in matlab,” (2005).

**30. **E. Soubies, F. Soulez, M. T. Mccann, T. Pham, L. Donati, T. Debarre, D. Sage, and M. Unser, “Pocket guide to solve inverse problems with globalbioim,” Inverse Prob. **35**(10), 104006 (2019). [CrossRef]

**31. **T. Pham, E. Soubies, A. B. Ayoub, J. Lim, D. Psaltis, and M. Unser, “Three-dimensional optical diffraction tomography with lippmann-schwinger model,” IEEE Trans. Comput. Imaging **6**, 727–738 (2020). [CrossRef]

**32. **L. Yeh, S. Chowdhury, N. A. Repina, and L. Waller, “Speckle-structured illumination for 3d phase and fluorescence computational microscopy,” Biomed. Opt. Express **10**(7), 3635–3653 (2019). [CrossRef]

**33. **A. Negash, T. Mangeat, P. C. Chaumet, K. Belkebir, H. Giovannini, and A. Sentenac, “Numerical approach for reducing out-of-focus light in bright-field fluorescence microscopy and superresolution speckle microscopy,” J. Opt. Soc. Am. A **36**(12), 2025–2029 (2019). [CrossRef]

**34. **Y. Hu, C. Li, K. Meng, J. Qin, and X. Yang, “Group sparse optimization via lp,q regularization,” Journal of Machine Learning Research **18**, 1–52 (2017).

**35. **J. W. Goodman, * Speckle Phenomena in Optics: Theory and Applications* (Roberts and Company Publishers, 2007).

**36. **H. H. Barrett and K. J. Myers, * Foundations of Image Science* (John Wiley & Sons, 2013).