## Abstract

Super-resolution fluorescence microscopy has proven to be a useful tool in biological studies. To achieve more than two-fold resolution improvement over the diffraction limit, existing methods require exploitation of the physical properties of the fluorophores. Recently, it has been demonstrated that achieving more than two-fold resolution improvement without such exploitation is possible using only a focused illumination spot and numerical post-processing. However, how the achievable resolution is affected by the processing step has not been thoroughly investigated. In this paper, we focus on the processing aspect of this emerging super-resolution microscopy technique. Based on a careful examination of the dominant noise source and the available prior information in the image, we find that if a processing scheme is appropriate for the dominant noise model in the image and can utilize the prior information in the form of sparsity, improved accuracy can be expected. Based on simulation results, we identify an improved processing scheme and apply it in a real-world experiment to super-resolve a known calibration sample. We show an improved super-resolution of 60nm, approximately four times finer than the conventional diffraction-limited resolution.

© 2020 Optical Society of America under the terms of the OSA Open Access Publishing Agreement

## 1. Introduction

A widely used resolution metric for optical instruments like telescopes and microscopes is the Rayleigh resolution distance, $d_{\textrm {R}}$ [1,2]. It defines the minimum separation needed between two incoherent point sources for them to be “barely resolved,” and is usually estimated as $d_{\textrm {R}} = 0.61 \lambda /{\textrm {NA}}$ [2], where $\lambda$ is the light wavelength and NA is the numerical aperture of the imaging instrument. Because of this resolution limit, fluorescence microscopy images obtained using a widefield microscope will only reveal object features that are around 200nm in size or larger. Bypassing this limit using recent novel imaging techniques has allowed users of fluorescence microscopes to observe finer object details, which in turn has generated significant research interest.

To achieve more than two-fold resolution improvement over the Rayleigh resolution limit, a super-resolution microscopy technique generally needs to exploit certain properties of the fluorophores in addition to the absorption-emission mechanism. Among the many novel techniques, the exploited properties are usually stimulated emission [3,4] and on-off state transitions [5–9]. For techniques utilizing the latter, transitions can be achieved either by the photo-switchable properties of specifically designed fluorescent markers [5,6], or by the inherent stochastic blinking behavior of conventional fluorescent probes [7,8]. Another commonly used super-resolution microscopy technique is structured illumination microscopy (SIM) [10]. In SIM, a sinusoidal illumination pattern is produced on the object, allowing the overall system to recover spatial frequencies higher than the cutoff of the unmodified optical system, which is about $1/d_{\textrm {R}}$. The overall achievable cutoff is then the sum of the unmodified system's cutoff and the highest sinusoidal illumination pattern frequency; because the illumination pattern is itself produced through diffraction-limited optics, its frequency cannot exceed that same cutoff, which limits the achievable resolution improvement for SIM to a maximum of two-fold. More than two-fold resolution improvement has been reported, for example in [11], but we note that it still requires on-off state transitions of the fluorescent markers.

Achieving super-resolution is also possible using numerical post-processing, and it has shown great promise without needing to exploit the physical properties of the fluorophores. In [12], it is shown that object non-negativity and sparsity (termed "near-blackness" in [12]) are critical in achieving super-resolution computationally from noisy images. Similar to [12], in [13], it is shown that the achievable super-resolution is inversely related to the spatial extent of the object and the noise level. In another series of works, this super-resolution phenomenon is analyzed in coherent imaging [14], incoherent imaging [15], and incoherent imaging with non-uniform illumination [16], using singular value decomposition (SVD). The authors of [14–16] again show that the achievable super-resolution is inversely related to the spatial extent of the object and the noise level. A common theme among these results is that, for a computational super-resolution approach, the achievable super-resolution is dependent on the signal-to-noise ratio (SNR), and the discussion of a “resolution limit” requires specifying an SNR to be meaningful.

Regrettably, these early efforts in computational super-resolution did not include extensive examples in the biological applications of this approach. Recently, the present authors performed a real-world, biological proof-of-concept experiment to demonstrate the potential of this super-resolution approach [17]. Later, a comparison between this method and image scanning microscopy (ISM) was also performed by the authors [18]. In these works, a biological sample labeled with conventional fluorophores is imaged with a focused illumination spot that is scanned across the sample to enhance object sparsity. At each of the scanning steps, a small raw image is collected (Fig. 1(e)). These raw images are then numerically post-processed and reassembled according to the scanning pattern to produce a super-resolved image. We showed that if the object being imaged possesses some degree of sparsity, which is artificially enhanced by the focused illumination spot, computationally achieving super-resolution is possible without needing to exploit the on-off state transitions of the fluorophores.

Throughout these proof-of-concept works in computational super-resolution, the effect of the processing scheme is usually given less importance than the simple fact that super-resolution is achieved. In [12], maximum entropy inversion is used to demonstrate successful computational super-resolution, and no alternative options are explored. In [14], a truncated SVD based inversion scheme is used to recover the super-resolved image. This processing scheme requires accurate knowledge of the spatial extent of the object, which means that it cannot plausibly be used in a real-world experiment. In [17,18], non-negative least squares (NNLS) solvers were used.

While these previous works demonstrate successful computational super-resolution, we find it unlikely that the processing schemes they employ are optimal in terms of maximizing the achievable resolution. In this paper, we attempt to improve upon the super-resolution accuracy of the NNLS approach used in [17,18]. The reason for our skepticism is that NNLS does not take advantage of two factors in biological fluorescence microscopy. The first factor is that, for high-sensitivity fluorescence microscope systems, photon shot noise is usually the dominant noise source. However, NNLS is derived from maximum likelihood estimation (MLE) for additive, uniform, zero-mean Gaussian noise, and is therefore mismatched to the shot noise model. The other factor is that most fluorescently labeled biological samples can be regarded as highly sparse objects, because they typically consist of sub-micron structures labeled with fluorescent markers against a non-labeled, dark background. It is widely known that a suitable regularization term can take advantage of this sparsity [19,20].

To account for these two factors, we develop three processing schemes based on the MLE for shot noise, and include a sparsity-inducing $\ell _1$ regularization term in these processing schemes. In addition to the standard MLE for shot noise, we formulate the two other processing schemes utilizing different approximations that allow us to recast the optimization problem in the form of least squares. We apply these processing schemes in simulation to super-resolve an incoherent two-dot test object and compare the accuracy of the results.

As expected, we find that the three processing schemes achieve more accurate estimations of the object than the previous NNLS approach at the same photon levels, and especially so at low photon levels. Furthermore, we show that the three processing schemes, although formulated with different approximations, perform similarly in terms of super-resolution accuracy. Based on the simulation results and computation requirements, we identify an improved processing scheme and apply it in a real-world experiment to super-resolve a commercially available calibration sample. A resolution of 60nm is achieved, which is approximately four times finer than conventional diffraction-limited resolution.

## 2. Simulation methodology

The development and evaluation of the processing schemes are done using numerical simulation, since this allows us access to a ground truth object so that fair and accurate comparisons among the processing schemes can be made. To do this, we design a numerical experiment where we: 1) define an incoherent two-dot test object with a distance between the two dots smaller than the Rayleigh resolution distance, 2) simulate the noisy images it produces when corrupted by photon shot noise and Gaussian readout noise that results from a scientific complementary metal-oxide-semiconductor (sCMOS) camera, 3) apply the processing schemes to recover the super-resolved image of the test object, and 4) evaluate their performance by calculating the probability for a processing scheme to successfully resolve the two dots in a single trial. These steps are graphically illustrated in Fig. 1. In this section, we 1) describe the image formation and noise model used in our simulation, 2) provide a general example of how the super-resolved object is recovered, and 3) describe the metric used for evaluating the performance of the various processing schemes considered.

#### 2.1 Image formation and noise model

Our imaging system is a standard widefield fluorescence microscope, except for a modified illumination setup that enables point illumination scanning for the purpose of enhancing object sparsity [17] (Fig. 1(e)). For each scanning step, a small raw image is recorded on an sCMOS camera. Then each raw image is independently processed before being recombined with all of the others to form a super-resolved image across the entire field of view. Because our two-dot test object is smaller than the size of the illumination spot, this scanning process is not modeled in the simulations shown here.

Assuming the microscope system has 1:1 magnification (without loss of generality), the noiseless image can be formulated as

$$I_{\textrm {noiseless}} = H_{\textrm {PSF}}\, s,$$

where $s$ is the discretized object vector (with elements $s_j$) and each column of $H_{\textrm {PSF}}$ contains the sampled point spread function (PSF) of a point source at the corresponding object position. We refer to the matrix $H_{\textrm {PSF}}$ as the *dictionary* throughout this paper.

We note that the dictionary $H_{\textrm {PSF}}$ need not be a square matrix, and is, in general, not a square matrix. This is because, for the same microscope system, different discretization schemes are possible. If $s_j$ has a finer discretization (smaller than the pixel size of the camera), it allows us to model and later process for a finer resolution [17]. For example, if $d_{\textrm {camera}}$ represents the camera pixel size, and $d_{\textrm {dict}}$ represents the distance between two adjacent PSFs contained in the columns of $H_{\textrm {PSF}}$, then by allowing $d_{\textrm {dict}}$ to be smaller than $d_{\textrm {camera}}$, we can model the images of objects containing details that are smaller than the size of the camera pixel, and later attempt to recover these details. To achieve this, we first calculate a series of sub-camera-pixel PSFs, one for each particular sub-camera-pixel position of the point source. Then we shift these PSFs to form the entire $H_{\textrm {PSF}}$ matrix. For example, if $d_{\textrm {dict}}=d_{\textrm {camera}}/4$, we first calculate a series of PSFs, each representing one of a total of $4^2=16$ sub-camera-pixel positions. Using these sub-camera-pixel PSFs, we can shift them by some integer number of pixels to form the complete dictionary $H_{\textrm {PSF}}$. This method is used throughout to generate $H_{\textrm {PSF}}$.
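To make the construction concrete, the sketch below builds such a dictionary in one dimension for brevity; the Gaussian PSF and all parameter values are illustrative stand-ins for the real system PSF, not values from the paper.

```python
import numpy as np

def build_dictionary(n_pix, upsample, sigma_psf):
    """Columns are camera-sampled PSFs, one per sub-camera-pixel source
    position on a grid `upsample` times finer than the camera grid."""
    cam_centers = np.arange(n_pix) + 0.5          # camera pixel centers
    H = np.zeros((n_pix, n_pix * upsample))
    for j in range(n_pix * upsample):
        src = (j + 0.5) / upsample                # source position, camera-pixel units
        psf = np.exp(-((cam_centers - src) ** 2) / (2.0 * sigma_psf ** 2))
        H[:, j] = psf / psf.sum()                 # normalize: each column sums to 1
    return H

H = build_dictionary(n_pix=16, upsample=4, sigma_psf=1.2)
print(H.shape)   # rectangular: more object positions than camera pixels
```

The rectangular shape (more columns than rows) reflects the finer object grid; the recovery step must therefore solve an underdetermined inverse problem.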

We also model an sCMOS camera in our simulation. The camera has a quantum efficiency (QE) between 0 and 1, and the collected images are corrupted by photon shot noise and additive Gaussian readout noise. Accounting for these sources of noise, the detected noisy images are modeled as

$$I_{\textrm {noisy}} = \mathcal {P}\left ({\textrm {QE}}\cdot I_{\textrm {noiseless}}\right ) + \mathcal {N}\left (0,\sigma ^2\right ),$$

where $\mathcal {P}(\mu )$ denotes a pixel-wise Poisson random variable with mean $\mu$, and $\mathcal {N}(0,\sigma ^2)$ denotes zero-mean Gaussian readout noise with variance $\sigma ^2$.
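A minimal simulation of this detection model might look like the following; the QE and readout-noise values are illustrative assumptions, not values from the paper.

```python
import numpy as np

rng = np.random.default_rng(0)

def detect(i_noiseless, qe=0.7, read_sigma=2.0):
    """Simulated sCMOS detection: Poisson shot noise on the detected
    photon count, plus additive zero-mean Gaussian readout noise."""
    photons = rng.poisson(qe * i_noiseless)                        # shot noise
    readout = rng.normal(0.0, read_sigma, size=i_noiseless.shape)  # readout noise
    return photons + readout

noisy = detect(np.full(8, 100.0))
```

Note that the shot-noise variance scales with the detected signal while the readout variance is fixed, which is exactly the property examined in Section 3.1.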

#### 2.2 Recovery of a super-resolved object

The recovery of the super-resolved object is achieved by solving a series of inverse problems, one for each raw image. As an example, an NNLS optimization problem can be used to recover the super-resolved object. The optimization problem is

$$\hat {x} = \mathop {\textrm {arg\,min}}_{x \geq 0} \left \| I_{\textrm {noisy}} - H_{\textrm {PSF}}\, x \right \|_2^2 \tag{4}$$

As we examine different properties (e.g., the dominant noise source) of the overall imaging system, we develop alternative processing schemes that take on different forms than Eq. (4). We present these alternative processing schemes in the main text, and leave implementation details for Supplement 1.
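As an illustration, the NNLS recovery step can be sketched with SciPy's non-negative least-squares solver; the 1-D Gaussian dictionary, dot positions, and intensities below are hypothetical stand-ins, and the data are kept noiseless for simplicity.

```python
import numpy as np
from scipy.optimize import nnls

# Toy 1-D setup: a Gaussian-PSF dictionary on a 4x-finer object grid,
# and a two-dot object whose dots sit two camera pixels apart.
n_pix, up, sigma = 16, 4, 1.2
grid = np.arange(n_pix) + 0.5
H = np.column_stack([np.exp(-(grid - (j + 0.5) / up) ** 2 / (2 * sigma ** 2))
                     for j in range(n_pix * up)])
H /= H.sum(axis=0)

x_true = np.zeros(n_pix * up)
x_true[[28, 36]] = 1000.0          # two dots (object-grid indices)
img = H @ x_true                   # noiseless image, for simplicity

x_hat, _ = nnls(H, img)            # non-negative least-squares recovery
```

`scipy.optimize.nnls` returns the non-negative minimizer together with the residual norm; only the minimizer is kept here.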

#### 2.3 Test object and evaluation of processing scheme performance

Our simulated test object comprises two fluorescent dots separated by a distance of roughly 1/3 of the Rayleigh resolution distance $d_{\textrm {R}}$. $d_{\textrm {camera}}$ is selected such that $d_{\textrm {R}}$ spans approximately 3 pixels (as shown in Fig. 1). The images are generated with an $H_{\textrm {PSF}}$ having $d_{\textrm {dict}}=d_{\textrm {camera}}/16$ and processed with an $H_{\textrm {PSF}}$ having $d_{\textrm {dict}}=d_{\textrm {camera}}/4$. These $d_{\textrm {dict}}$ values are chosen such that: 1) the small separation distance of $d_{\textrm {R}}/3$ is accurately represented, and 2) the dots are not positioned exactly on the center of a pixel in the processed image, which is the most common case in a real-world experiment.

We first specify a signal level in the form of number of photons per dot, and simulate the acquired noisy image. Next, the processing scheme under consideration is applied to recover the super-resolved object. Finally, a binary resolved/not resolved metric (Fig. 1(c)) is applied to each processing run. This metric is based on the relative intensities of the pixels in the processed image, and is similar to the metric adopted in our prior work [17]. We then repeat this process 500 times and generate a “resolving ratio” performance measure, defined as

$$\textrm {resolving ratio} = \frac {\textrm {number of trials in which the two dots are resolved}}{\textrm {total number of trials}}.$$
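The trial loop can be sketched as follows. The resolved/not-resolved test here is a simplified stand-in for the paper's metric (it only checks that both true dot positions carry more recovered intensity than the midpoint between them), and the dictionary, dot positions, photon count, and trial count are all illustrative.

```python
import numpy as np
from scipy.optimize import nnls

rng = np.random.default_rng(1)

def resolving_ratio(H, x_true, photons, n_trials=100, read_sigma=2.0):
    """Fraction of noisy trials in which plain NNLS resolves a two-dot object."""
    i, j = np.flatnonzero(x_true)          # the two true dot positions
    mid = (i + j) // 2
    n_resolved = 0
    for _ in range(n_trials):
        # shot noise on the scaled noiseless image, plus readout noise
        img = rng.poisson(photons * (H @ x_true)) \
            + rng.normal(0.0, read_sigma, H.shape[0])
        x_hat, _ = nnls(H, np.maximum(img, 0.0))
        if x_hat[i] > x_hat[mid] and x_hat[j] > x_hat[mid]:
            n_resolved += 1
    return n_resolved / n_trials

n_pix, up, sigma = 16, 4, 1.2
grid = np.arange(n_pix) + 0.5
H = np.column_stack([np.exp(-(grid - (j + 0.5) / up) ** 2 / (2 * sigma ** 2))
                     for j in range(n_pix * up)])
H /= H.sum(axis=0)
x_true = np.zeros(n_pix * up)
x_true[[28, 36]] = 1.0                     # unit-flux dots; `photons` sets the scale
ratio = resolving_ratio(H, x_true, photons=5000.0, n_trials=50)
```

Sweeping `photons` over a range of values and plotting `ratio` against it reproduces the kind of performance curve described here.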

After we generate the resolving ratio for a given signal level, we then repeat this process for a range of signal levels and plot the resolving ratio versus signal level. In our super-resolution technique, the sample is scanned with the focused illumination spot once. Therefore, evaluating the probability for a processing scheme to recover the correct object in one trial is the suitable metric in assessing its performance. An optimal processing scheme should give the highest resolving ratio using the lowest signal level.

## 3. Simulation results

Next, we examine the imaging system focusing on two aspects: the noise model, and prior information.

#### 3.1 Noise model and dominant noise source

As a baseline for our discussion, in Fig. 2(a), we plot the performance of NNLS, the processing scheme used in our prior works [17,18], to recover a two-dot object under various signal levels. It is clear from Fig. 2(a) that the resolving ratio is improved with increasing signal strength, and eventually approaches 1. This is expected because, as the signal level increases, the detected image will have an SNR that eventually approaches noiseless, which has been shown to enable exact recovery [13].

Intuitively, NNLS (Eq. (4)) recovers the super-resolved objects by minimizing the total discrepancy energy between the acquired image ($I_{\textrm {noisy}}$) and the reconstructed image ($H_{\textrm {PSF}}\times x$). We show here that this straightforward method may not be appropriate for the dominant noise source present in the acquired images. From the perspective of maximum likelihood estimations, NNLS can be derived from the log-likelihood function for noise modeled by additive, zero-mean, and independent and identically distributed (i.i.d.) Gaussian random variables. While this is a noise process present in modern sCMOS cameras in the form of readout noise [21], the photon shot noise inherent to fluorescence detection is decidedly Poissonian. A Poisson random variable has a variance equal to its mean, such that, as the optical signal level (mean) increases, the noise power (variance) increases as well. Thus, for a fixed level of Gaussian readout noise, as the signal level increases, at some point the shot noise starts to dominate. For realistic values of readout noise and photon counts, the shot noise almost always dominates. In this case, NNLS is no longer appropriate for maximizing the likelihood function because of the mismatch between the Gaussian noise assumed by the processing scheme and the Poisson noise that dominates the images.

To demonstrate this, we present an alternative that is appropriate for the shot noise model: weighted non-negative least squares (WNNLS). For a Poisson random variable with sufficiently large mean/variance, a Gaussian random variable with the same mean and variance can be used to approximate the original Poisson random variable. In other words, we approximate an acquired noisy image as a series of Gaussian random variables, one for each pixel, with individual means and variances. For such a random process, weighted least squares (with weights set to the inverse of the true variances) can be used to obtain a maximum likelihood estimator [22]. In this case, the true variances are given by the noiseless image plus the variance of the Gaussian readout noise, both available in simulation. We present a more typical implementation of maximum likelihood estimation in section 4.1. The optimization problem we solve for WNNLS is

$$\hat {x} = \mathop {\textrm {arg\,min}}_{x \geq 0} \left \| W_{\textrm {diag}} \left ( I_{\textrm {noisy}} - H_{\textrm {PSF}}\, x \right ) \right \|_2^2, \tag{6}$$

where $W_{\textrm {diag}}$ is a diagonal matrix whose diagonal entries are the inverse standard deviations of the per-pixel noise.

We plot the performance of NNLS and WNNLS, as measured by the resolving ratio metric, in Fig. 2(a). From the figure, we observe the following: 1) both processing schemes show improved performance with increasing signal level; 2) more interestingly, WNNLS and NNLS perform similarly at extremely low signal levels, but WNNLS shows a clear improvement in resolving ratios over NNLS as the signal level increases. This is expected since, when the signal level is sufficiently low, shot noise has a low variance, and therefore does not contribute much to the noise power. In this case, Gaussian readout noise is the main source of noise in the image, for which NNLS is appropriate. We also observe that in this regime, WNNLS does not yield worse performance than NNLS. This is also expected since, in this signal range, $W_{\textrm {diag}}\approx \sigma ^{-1} I$ (where $I$ is an identity matrix), and so Eq. (6) is nearly the same as Eq. (4). As the signal level increases, shot noise variance increases and becomes the dominant noise source in the image. In this case, WNNLS is the more appropriate processing scheme, which leads to its superior performance in this signal regime.

To further illustrate this effect, in Fig. 2(b), we repeat the same simulation as in Fig. 2(a), but with an increased (100-fold) Gaussian readout noise variance. From the plot, we observe that: 1) for the same optical signal level, the increased noise level causes both NNLS and WNNLS to yield worse performance; 2) the performance of WNNLS stays roughly the same as NNLS for a longer interval as the optical signal level increases, before yielding better performance over NNLS at a higher optical signal level than that in Fig. 2(a). In this case, the increase in Gaussian readout noise means that shot noise needs to have an even higher variance (i.e., a higher signal level) to become the dominant noise source in the image, making WNNLS outperform NNLS only at a higher signal level.

From these results, we conclude that performance improvement can be achieved if a processing scheme more appropriate for the dominant noise source in the acquired image is chosen.
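Under the Gaussian approximation described above, WNNLS reduces to ordinary non-negative least squares on row-scaled data, which can be sketched as follows; the dictionary, intensities, and readout variance are illustrative, and the data are kept noiseless for a quick check.

```python
import numpy as np
from scipy.optimize import lsq_linear

def wnnls(H, img, i_noiseless, read_var):
    """WNNLS as row-scaled NNLS: weight each pixel by the inverse standard
    deviation of its noise (shot-noise variance ~ noiseless intensity,
    plus the Gaussian readout variance)."""
    w = 1.0 / np.sqrt(i_noiseless + read_var)      # per-pixel inverse std
    res = lsq_linear(w[:, None] * H, w * img, bounds=(0.0, np.inf))
    return res.x

# Toy check (in simulation the noiseless image is known; in a real
# experiment it is not, which is what motivates Section 4).
n_pix, up, sigma = 16, 4, 1.2
grid = np.arange(n_pix) + 0.5
H = np.column_stack([np.exp(-(grid - (j + 0.5) / up) ** 2 / (2 * sigma ** 2))
                     for j in range(n_pix * up)])
H /= H.sum(axis=0)
x_true = np.zeros(n_pix * up)
x_true[[28, 36]] = 1000.0
noiseless = H @ x_true
x_w = wnnls(H, noiseless, noiseless, read_var=4.0)
```

`scipy.optimize.lsq_linear` handles the non-negativity through its `bounds` argument, so no custom solver is needed for this weighted variant.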

#### 3.2 Prior information and sparsity

It is well-known that inverse problems such as Eqs. (4) and (6), which are widely used in image deconvolution and super-resolution, are very ill-conditioned [12,14,23]. An effect of this is that, even when the underlying object stays the same (as is the case in our two-dot object example), the processed image will be different for successive trials. This gives rise to the non-unity resolving ratio for the majority of the signal levels considered (Fig. 2).

Utilizing prior information has proven to be a powerful solution to this problem [9,12,13,19,20]. It is typically achieved by first acquiring prior knowledge about the solution vector of the inverse problem, and then imposing appropriate constraints (or adding regularization) when solving it. Prior information can often be derived from physical properties of the imaging system, and takes on many forms, such as the non-negativity (derived from the contrast mechanism of incoherent imaging) we have been including in our processing schemes.

Another example of prior information that is critical for the success of our computational super-resolution microscopy approach is the presence of some degree of photo-emitter sparsity within the sample. This sparsity condition frequently occurs in biological fluorescence imaging because the sample consists of structures that are much smaller than the size of the illumination spot, and they are labeled with fluorescent markers against a non-labeled, dark background. When these small labeled structures are excited with a focused illumination spot, the illuminated region will be made up of a mixture of photo-emitters and non-emitting background. Because of this, one expects the solution vector will contain many zero elements. It has been shown that $\ell _1$ regularization can induce sparsity in the computed solution vector and improve the accuracy [24]. Therefore applying a sparsity-inducing $\ell _1$ regularization term should improve the super-resolution performance further.

To demonstrate this, we modify NNLS and WNNLS by simply adding an $\ell _1$ regularization term. We formulate the modified processing schemes, r-NNLS and r-WNNLS, as:

$$\hat {x} = \mathop {\textrm {arg\,min}}_{x \geq 0} \left \| I_{\textrm {noisy}} - H_{\textrm {PSF}}\, x \right \|_2^2 + \lambda \left \| x \right \|_1 \tag{8}$$

$$\hat {x} = \mathop {\textrm {arg\,min}}_{x \geq 0} \left \| W_{\textrm {diag}} \left ( I_{\textrm {noisy}} - H_{\textrm {PSF}}\, x \right ) \right \|_2^2 + \lambda \left \| x \right \|_1 \tag{9}$$

where $\lambda$ is a regularization parameter that controls the strength of the sparsity penalty.

In Fig. 3, we plot the performance of all four processing schemes (Eqs. (4), (6), (8), and (9)) when utilized to super-resolve the two-dot test object. We observe that because sparsity is the prior information present in the inverse problem, the addition of a sparsity-inducing $\ell _1$ term improves the performance of both NNLS and WNNLS. We also observe that r-WNNLS shows the best performance across all signal levels considered in the simulation. From these results, we can conclude that a processing scheme appropriate for the noise model and utilizing the prior sparsity information achieves the highest amount of performance improvement.
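One simple way to solve such $\ell_1$-regularized non-negative problems is projected ISTA: for $x \geq 0$, the proximal step for the $\ell_1$ penalty combined with the non-negativity constraint is a shifted clipping. The sketch below is illustrative and not necessarily the solver used in the paper; the dictionary, intensities, and $\lambda$ are assumptions.

```python
import numpy as np

def r_nnls(H, img, lam, n_iter=2000):
    """l1-regularized NNLS via projected ISTA.  For x >= 0 the proximal
    step of lam*||x||_1 plus non-negativity is max(u - t*lam, 0)."""
    t = 1.0 / np.linalg.norm(H, 2) ** 2        # step = 1 / Lipschitz constant
    x = np.zeros(H.shape[1])
    for _ in range(n_iter):
        grad = H.T @ (H @ x - img)             # gradient of the data term
        x = np.maximum(x - t * (grad + lam), 0.0)
    return x

n_pix, up, sigma = 16, 4, 1.2
grid = np.arange(n_pix) + 0.5
H = np.column_stack([np.exp(-(grid - (j + 0.5) / up) ** 2 / (2 * sigma ** 2))
                     for j in range(n_pix * up)])
H /= H.sum(axis=0)
x_true = np.zeros(n_pix * up)
x_true[[28, 36]] = 1000.0
img = H @ x_true
x_r = r_nnls(H, img, lam=1e-3)
```

A very large `lam` drives the recovered vector to all zeros, which illustrates how the penalty trades data fidelity against sparsity.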

## 4. Applying processing schemes to real-world data

Although r-WNNLS produced the best performance in simulation, its implementation requires the noiseless image (i.e., $I_{\textrm {noiseless}}$), which is not accessible in a real-world experiment. Therefore, it is desirable to develop a processing scheme that can account for shot noise and $\ell _1$ regularization without requiring the noiseless image.

#### 4.1 Poisson maximum likelihood estimation

A processing scheme that is appropriate for shot noise without needing the noiseless image is readily available based on maximum likelihood estimation applied to Poisson random variables. We refer to this processing scheme as Poisson-MLE; it minimizes the negative Poisson log-likelihood:

$$\hat {x} = \mathop {\textrm {arg\,min}}_{x \geq 0} \sum _i \left [ \left ( H_{\textrm {PSF}}\, x \right )_i - I_{\textrm {noisy},i} \ln \left ( H_{\textrm {PSF}}\, x \right )_i \right ]$$

In Fig. 4, we plot the performance of Poisson-MLE and WNNLS when used to super-resolve the same two-dot object from the previous section. We see that WNNLS and Poisson-MLE achieve nearly identical performance, suggesting that Poisson-MLE is a suitable processing scheme (in terms of the noise model) that can be used without requiring the noiseless image.
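A classical way to compute the Poisson maximum-likelihood estimate under a non-negativity constraint is the Richardson-Lucy multiplicative update; the sketch below is one such implementation and is not necessarily the solver used in the paper (the dictionary and intensities are illustrative).

```python
import numpy as np

def poisson_mle(H, img, n_iter=500, eps=1e-12):
    """Poisson maximum-likelihood estimate via Richardson-Lucy-style
    multiplicative updates; iterates stay non-negative automatically
    because each update only multiplies by non-negative factors."""
    x = np.full(H.shape[1], img.sum() / H.shape[1])   # flat positive start
    h_colsum = H.sum(axis=0)
    for _ in range(n_iter):
        model = H @ x + eps                           # current image estimate
        x = x * (H.T @ (img / model)) / (h_colsum + eps)
    return x

n_pix, up, sigma = 16, 4, 1.2
grid = np.arange(n_pix) + 0.5
H = np.column_stack([np.exp(-(grid - (j + 0.5) / up) ** 2 / (2 * sigma ** 2))
                     for j in range(n_pix * up)])
H /= H.sum(axis=0)
x_true = np.zeros(n_pix * up)
x_true[[28, 36]] = 1000.0
img = H @ x_true
x_p = poisson_mle(H, img)
```

A useful property of this update (with unit column sums) is that it conserves total flux, so the reconstructed image keeps the same photon count as the data.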

#### 4.2 Variance stabilizing transforms

A different method of accommodating the shot noise model is via variance stabilizing transforms. These transforms operate on random variables whose variance is dependent on their mean (e.g., Poisson random variables), and generate “stabilized” random variables, with their new variance now independent or nearly independent of their mean. As a result, an NNLS-like processing scheme that assumes a uniform variance may be appropriate to operate on this newly transformed random variable.

A common variance stabilizing transform for Poisson random variables is the Anscombe transform [27], which is defined by:

$$z = 2\sqrt {p + \tfrac {3}{8}}.$$

Here, $p$ is the original Poisson random variable, which has a variance dependent on its mean, and $z$ is the transformed random variable, which has an approximately uniform variance of 1 for $p$ with a sufficiently large mean.

Utilizing the Anscombe transform, we can formulate a new processing scheme that is appropriate for Poisson noise. We name this processing scheme VST, for variance stabilizing transform, defined by:

$$\hat {x} = \mathop {\textrm {arg\,min}}_{x \geq 0} \left \| 2\sqrt {I_{\textrm {noisy}} + \tfrac {3}{8}} - 2\sqrt {H_{\textrm {PSF}}\, x + \tfrac {3}{8}} \right \|_2^2 \tag{12}$$

Because of the square root applied to $H_{\textrm {PSF}}\times x$, Eq. (12) is a *non*-linear least-squares problem. In NNLS, the original images are used; in VST, the Anscombe-transformed images are used. In Fig. 4, in addition to WNNLS and Poisson-MLE, we plot the performance of VST as well. From the plot, we see that all three of these processing schemes achieve effectively the same performance, showing that both Poisson-MLE and VST are appropriate for the noise model without needing the noiseless image that WNNLS requires. This means that *both* Poisson-MLE and VST are suitable for the dominant noise in the image (shot noise) and can be used in a real-world experiment.
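The stabilizing property is easy to verify empirically: drawing Poisson samples across a range of means, the raw standard deviation grows like the square root of the mean, while the Anscombe-transformed standard deviation stays near 1 (the means and sample count below are arbitrary).

```python
import numpy as np

rng = np.random.default_rng(2)

for mean in [10, 100, 1000]:
    p = rng.poisson(mean, size=200_000)          # raw Poisson samples
    z = 2.0 * np.sqrt(p + 3.0 / 8.0)             # Anscombe-transformed samples
    print(f"mean={mean:5d}  raw std={p.std():7.2f}  stabilized std={z.std():.3f}")
```

This near-constant unit variance is what justifies applying an NNLS-like, uniform-variance solver to the transformed data.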

#### 4.3 Incorporating sparsity in Poisson-MLE and VST

Having identified two potential processing schemes that are appropriate for the noise model, we can further improve the performance of Poisson-MLE and VST by adding an $\ell _1$ term to form their regularized counterparts, r-Poisson-MLE and r-VST:

$$\hat {x} = \mathop {\textrm {arg\,min}}_{x \geq 0} \sum _i \left [ \left ( H_{\textrm {PSF}}\, x \right )_i - I_{\textrm {noisy},i} \ln \left ( H_{\textrm {PSF}}\, x \right )_i \right ] + \lambda \left \| x \right \|_1$$

$$\hat {x} = \mathop {\textrm {arg\,min}}_{x \geq 0} \left \| 2\sqrt {I_{\textrm {noisy}} + \tfrac {3}{8}} - 2\sqrt {H_{\textrm {PSF}}\, x + \tfrac {3}{8}} \right \|_2^2 + \lambda \left \| x \right \|_1$$

A summary of these processing schemes is presented in Table 1. r-Poisson-MLE and r-VST are the only two processing schemes that fulfill all three criteria (matching the noise model, exploiting sparsity, and not requiring the noiseless image), and they achieve similar performance. We also see that, while r-Poisson-MLE and r-VST both achieve improved performance over their non-regularized counterparts, r-VST performs slightly better than r-Poisson-MLE at some signal levels. Although the performance difference is small and explaining the disparity is beyond the scope of this paper, we note that Poisson-MLE, and by extension r-Poisson-MLE, has been reported to be biased towards a certain class of sparse objects [28], and therefore may not be suitable for all types of objects.
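As a sketch of r-VST in its plain form (the experiments in Section 5 use an alternative formulation described in Supplement 1), one can run projected gradient descent with backtracking on the Anscombe-domain objective; everything below, including the dictionary and $\lambda$, is illustrative.

```python
import numpy as np

def r_vst(H, img, lam, n_iter=300, step=1.0):
    """Projected gradient descent with backtracking on the r-VST objective
    ||2*sqrt(img+3/8) - 2*sqrt(H@x+3/8)||^2 + lam*sum(x),  x >= 0."""
    z = 2.0 * np.sqrt(img + 0.375)

    def obj(x):
        return np.sum((z - 2.0 * np.sqrt(H @ x + 0.375)) ** 2) + lam * x.sum()

    x = np.zeros(H.shape[1])
    for _ in range(n_iter):
        m = H @ x + 0.375
        grad = 2.0 * H.T @ ((2.0 * np.sqrt(m) - z) / np.sqrt(m)) + lam
        t, f0 = step, obj(x)
        while True:                              # backtracking line search
            x_new = np.maximum(x - t * grad, 0.0)
            if obj(x_new) <= f0 or t < 1e-12:
                break
            t *= 0.5
        x = x_new
    return x

n_pix, up, sigma = 16, 4, 1.2
grid = np.arange(n_pix) + 0.5
H = np.column_stack([np.exp(-(grid - (j + 0.5) / up) ** 2 / (2 * sigma ** 2))
                     for j in range(n_pix * up)])
H /= H.sum(axis=0)
x_true = np.zeros(n_pix * up)
x_true[[28, 36]] = 1000.0
img = H @ x_true
x_v = r_vst(H, img, lam=1e-3)
```

The backtracking step keeps the objective non-increasing despite the non-linearity introduced by the square root, at the cost of extra objective evaluations per iteration.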

## 5. Experimental results

To verify these findings in a real-world imaging system, we performed experiments imaging a fluorescent, resolution-test sample made by Argolight [29] (Fig. 6). This sample contains successive fluorescently-labeled line pairs separated by different distances ranging from 0nm (no separation) to 270nm. The microscope system was an inverted fluorescence microscope with its illuminator removed and replaced with a custom-implemented laser scanning setup (Fig. 1(e)). We scanned the illumination spot in a predetermined pattern, acquiring a raw image at each scanning step. The excitation wavelength was 488nm and the images were collected through a bandpass optical filter centered at 520nm, using a 1.4NA objective.

The results of this experiment are shown in Fig. 6. Here, we generated an equivalent widefield diffraction-limited image by summing the acquired raw images. As expected, the 1.4NA objective’s diffraction-limited resolution is shown to be approximately 240nm. We next used an alternative formulation of r-VST (refer to Supplement 1 for details) to process each of the acquired raw images and combined them to produce the final, super-resolved full field image. Here, it can be seen that the line pair separated by 60nm is successfully resolved by r-VST. When compared to the conventional diffraction-limited resolution, this corresponds to a resolution improvement of approximately 4 times. The processed images obtained using NNLS and VST are also shown. It can be seen that while NNLS and VST are able to resolve beyond the conventional diffraction-limited resolution, they are unable to resolve the 60nm line pair. To further illustrate this improvement, in the bottom row, we plot the average of three cross-section profiles for the 60nm and 90nm line pairs in the processed images obtained using the three processing schemes. We see that besides the improved super-resolution performance, r-VST also produces an image with improved visual quality showing enhanced contrast. These findings are consistent with our simulation results, where the highest achievable resolution can be expected when utilizing a processing scheme that is 1) appropriate for the noise model, and is 2) able to take advantage of the object sparsity.

## 6. Discussion

Here, we address some interesting observations in this paper that warrant further discussion.

#### 6.1 Equivalency between WNNLS, Poisson-MLE, and VST

From Fig. 4, we see that despite differences in their formulations, WNNLS, Poisson-MLE, and VST all achieve approximately the same performance. A possible explanation for this similarity is that they are all derived from the same Poisson-MLE model, and the simplifying approximations for VST and WNNLS and differences in optimization solvers have little effect.

Poisson-MLE is derived directly from the Poisson random variable log-likelihood function, and therefore accommodates the shot noise model natively. This makes it the gold-standard model, but the resulting optimization problem is more difficult to solve. While the Poisson-MLE objective is convex and differentiable, its derivative lacks a key smoothness property, which precludes the use of standard methods like gradient descent [30]. While there exist specialized algorithms to tackle this problem [30,31], it is slower to solve, both because of this lack of smoothness and because it is less well studied than least squares. For these reasons, we consider instead WNNLS and VST in this paper. For WNNLS, the consideration of shot noise is done by approximating Poisson random variables with non-zero-mean, non-uniform-variance Gaussian random variables. After this approximation, Eq. (6), which maximizes the log-likelihood function for Gaussian random variables with *non-uniform* variances, is appropriate for maximum likelihood estimation [22]. For VST, the consideration of shot noise is done by “stabilizing” Poisson random variables to have uniform variances. With this approximation, an NNLS-like approach, which maximizes the log-likelihood function for Gaussian random variables with *uniform* variances, can be used for maximum likelihood estimation [30], and gives rise to Eq. (12).

In other words, while utilizing different approximations, the goals of WNNLS and VST are the same: to ensure the variance of the noise model assumed by the processing scheme is in agreement with the physical reality, where the dominant noise is photon shot noise. We note that the random variable manipulations made in WNNLS and VST are only valid if Poisson random variables act like Gaussian random variables with modified mean and variance, which occurs in the limit as the number of photons goes to infinity. The fact that all three formulations give similar results in our experiment indicates the validity of these approximations at realistic photon levels.

Since Gaussian noise is still present in these images, we expect the performance to improve further if we adopt a processing scheme that correctly models the resultant compound Poisson-Gaussian random variable. Specifically, if the processing scheme is based on maximum likelihood estimation of the compound Poisson-Gaussian random variable’s probability density function, such as derived in [32], improved performance should be expected. However, we expect the performance improvement obtained in this way to be relatively minor, and would only be apparent at very low photon counts and for low-sensitivity sCMOS cameras (i.e., high Gaussian noise power compared with that of the shot noise). This is again because the Gaussian noise power in our technique is significantly lower than that of the shot noise. In [32], where single molecule localization microscopy is studied, because the optical signal is produced by a single emitter, shot noise power is significantly lower.

#### 6.2 Connections to single-molecule localization microscopy

Applying numerical processing to enhance the images collected from a fluorescence microscope is a widely adopted practice. One such enhancement, and the focus of this paper, is achieving lateral resolution beyond the conventional diffraction limit. In section 1, we introduced early examples of applying numerical processing to achieve super-resolution (without exploiting stimulated emission or on-off state transitions), as well as our previous work in the same area. In addition to these attempts, another prominent group of examples of super-resolution achieved via numerical processing can be found in single-molecule localization microscopy (SMLM) [5–9]. Although SMLM requires the fluorophore to undergo on-off state transitions, some SMLM realizations and our technique share many computational similarities, especially in terms of the optimization problem solved in the processing step.

One such similarity, and an interesting development in the processing step of SMLM, is the use of compressed sensing principles to increase the acquisition speed [9]. In that work, an $\ell _1$ regularized least squares problem is solved to recover a super-resolved image composed of sparse emitters with highly overlapping PSFs. This processing scheme is very similar to some of the processing schemes considered in this paper, for example, Eqs. (8) and (9). However, we note one key difference in the optimization problem solved. In [9], the imaged fluorescent molecules are still sparsely activated, at a density of around 10 $\mu m^{-2}$. This low fluorophore density is only achievable by manipulating on-off state transitions, as the typical labeling density of common fluorescent samples is estimated to be much higher than 1000 $\mu m^{-2}$ [17].
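
The flavor of this $\ell _1$ regularized approach can be conveyed with a 1-D analogue (hypothetical grid sizes, PSF width, and regularization weight, solved here with the standard ISTA iteration rather than the particular solver of [9]):

```python
import numpy as np

rng = np.random.default_rng(2)

# 1-D analogue: sparse emitters on a fine grid, imaged through a broad PSF
# onto a coarse camera grid so that their images overlap heavily.
n_fine, n_cam = 128, 32
x_true = np.zeros(n_fine)
x_true[[50, 58, 70]] = [1.0, 0.8, 0.6]
u = np.linspace(0.0, 1.0, n_fine)                 # fine (object) coordinates
v = np.linspace(0.0, 1.0, n_cam)                  # camera pixel coordinates
A = np.exp(-0.5 * ((v[:, None] - u[None, :]) / 0.05) ** 2)
y = A @ x_true + 0.01 * rng.standard_normal(n_cam)

# ISTA for min_x 0.5*||Ax - y||^2 + lam*||x||_1 subject to x >= 0:
# a gradient step followed by a nonnegative soft-threshold.
lam = 0.01
L = np.linalg.norm(A, 2) ** 2                     # Lipschitz constant of grad
x = np.zeros(n_fine)
for _ in range(2000):
    grad = A.T @ (A @ x - y)
    x = np.maximum(x - grad / L - lam / L, 0.0)
```

The $\ell _1$ penalty drives most fine-grid coefficients to exactly zero, so the recovered `x` concentrates near the emitter positions even though their camera images overlap.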

#### 6.3 Connection to other contemporary work in super-resolution microscopy

Here we discuss how our imaging method fits within the larger super-resolution microscopy community. As stated previously, our technique requires neither stimulated emission nor on-off state transitions of the fluorophores. This allows our super-resolution approach to be implemented with a relatively simple optical setup: by adding a focused-spot scanning mechanism to a conventional widefield fluorescence microscope. Combined with a four-fold resolution improvement, our technique can be readily applied to investigate small (approximately 60nm) biological structures labeled with commonly used fluorescent proteins or organic dyes.

The resolution enhancement provided by our technique depends on object sparsity, and this is our motivation for using a focused illumination spot. In [17], we studied the effect of illumination on the recovered image. We found that, while our method is compatible with widefield illumination, meaningful super-resolution only occurs with a tightly focused illumination spot. This is in agreement with prior theoretical results, such as [14], where meaningful super-resolution is shown to be achievable only if the spatial extent of the object (e.g., the fluorescently labeled structures) is much smaller than $d_{\textrm {R}}$. This sparsity condition may not always hold in a real-world biological sample. For example, if there is significant background fluorescence in the collected image, the sparsity condition is not valid, because the object is no longer comprised of fluorescently labeled small structures against a non-labeled dark background. However, this dependence on object sparsity could also reasonably allow us to employ existing sparsity-enhancing mechanisms (such as STED illumination [3,4], or fluorophore blinking) in conjunction with our computational approach to further improve the achievable resolution.

## 7. Conclusion

In this paper, we investigated the processing step for computationally achieving super-resolution imaging in fluorescence microscopy without the need for fluorophore switching. Through numerical simulation, we determined that: 1) our previously used processing scheme (NNLS) does not always deliver optimal performance; 2) improved performance can be expected if the selected processing scheme is appropriate for the dominant source of noise and takes advantage of the prior information of sparsity; and 3) the performance depends less on the exact formulation of the processing scheme and more on whether the variance of the noise model assumed by the processing scheme agrees with the physical reality.

Based on simulation results, we identified a powerful processing scheme, r-VST, and used it to process data from a real-world experiment. r-VST achieves 60nm resolution on experimental data, which is approximately four times beyond the conventional diffraction-limited resolution.

Since sparsity plays an important role in the formulation of these processing schemes, how different levels of sparsity (caused by either the object structure or background fluorescence) affect the performance of these schemes remains an open question. Another open question is the *exact* mechanism for the resolution improvement observed in this paper and in our previous work. While some early theoretical results appear applicable for qualitatively explaining the observed super-resolution [14–16], a more quantitative treatment of this problem in the future would benefit the wider adoption of this technique.

## Funding

Colorado Advanced Industry Accelerator (CTGG1 2017-1672); National Science Foundation (1353444, 1810314).

## Acknowledgments

This material is based upon work supported by the National Science Foundation under Grants No. 1353444 and No. 1810314. Additional funding was provided by a Colorado Advanced Industries Accelerator Grant Award Number: CTGG1 2017-1672, and a University of Colorado (Boulder) Imaging Science IRT Seed Grant.

This work also utilized resources from the University of Colorado Boulder Research Computing Group, which is supported by the National Science Foundation (awards ACI-1532235 and ACI-1532236), the University of Colorado Boulder, and Colorado State University.

Portions of this work were presented at the SPIE Photonics West, Three-Dimensional and Multidimensional Microscopy: Image Acquisition and Processing XXVII Conference in 2020, paper number: 112450O.

## Disclosures

The authors declare no conflicts of interest.

See Supplement 1 for supporting content.

## References

**1. **L. Rayleigh, “XV. On the theory of optical images, with special reference to the microscope,” The London, Edinburgh, Dublin Philos. Mag. J. Sci. **42**(255), 167–195 (1896). [CrossRef]

**2. **J. W. Goodman, *Introduction to Fourier Optics* (Roberts and Company Publishers, 2005), chap. 6.5.2.

**3. **S. W. Hell and J. Wichmann, “Breaking the diffraction resolution limit by stimulated emission: stimulated-emission-depletion fluorescence microscopy,” Opt. Lett. **19**(11), 780–782 (1994). [CrossRef]

**4. **T. A. Klar and S. W. Hell, “Subdiffraction resolution in far-field fluorescence microscopy,” Opt. Lett. **24**(14), 954–956 (1999). [CrossRef]

**5. **E. Betzig, G. H. Patterson, R. Sougrat, O. W. Lindwasser, S. Olenych, J. S. Bonifacino, M. W. Davidson, J. Lippincott-Schwartz, and H. F. Hess, “Imaging intracellular fluorescent proteins at nanometer resolution,” Science **313**(5793), 1642–1645 (2006). [CrossRef]

**6. **M. J. Rust, M. Bates, and X. Zhuang, “Stochastic optical reconstruction microscopy (STORM) provides sub-diffraction-limit image resolution,” Nat. Methods **3**(10), 793–796 (2006). [CrossRef]

**7. **D. T. Burnette, P. Sengupta, Y. Dai, J. Lippincott-Schwartz, and B. Kachar, “Bleaching/blinking assisted localization microscopy for superresolution imaging using standard fluorescent molecules,” Proc. Natl. Acad. Sci. **108**(52), 21081–21086 (2011). [CrossRef]

**8. **T. Dertinger, R. Colyer, G. Iyer, S. Weiss, and J. Enderlein, “Fast, background-free, 3D super-resolution optical fluctuation imaging (SOFI),” Proc. Natl. Acad. Sci. **106**(52), 22287–22292 (2009). [CrossRef]

**9. **L. Zhu, W. Zhang, D. Elnatan, and B. Huang, “Faster STORM using compressed sensing,” Nat. Methods **9**(7), 721–723 (2012). [CrossRef]

**10. **M. G. Gustafsson, “Surpassing the lateral resolution limit by a factor of two using structured illumination microscopy,” J. Microsc. **198**(2), 82–87 (2000). [CrossRef]

**11. **E. H. Rego, L. Shao, J. J. Macklin, L. Winoto, G. A. Johansson, N. Kamps-Hughes, M. W. Davidson, and M. G. Gustafsson, “Nonlinear structured-illumination microscopy with a photoswitchable protein reveals cellular structures at 50-nm resolution,” Proc. Natl. Acad. Sci. **109**(3), E135–E143 (2012). [CrossRef]

**12. **D. L. Donoho, I. M. Johnstone, J. C. Hoch, and A. S. Stern, “Maximum entropy and the nearly black object,” J. Royal Stat. Soc. Ser. B (Methodological) **54**(1), 41–67 (1992). [CrossRef]

**13. **P. J. Sementilli, B. R. Hunt, and M. S. Nadar, “Analysis of the limit to superresolution in incoherent imaging,” J. Opt. Soc. Am. A **10**(11), 2265–2276 (1993). [CrossRef]

**14. **M. Bertero and E. Pike, “Resolution in diffraction-limited imaging, a singular value analysis,” Opt. Acta: Int. J. Opt. **29**(6), 727–746 (1982). [CrossRef]

**15. **M. Bertero, P. Boccacci, M. Defrise, C. De Mol, and E. R. Pike, “Super-resolution in confocal scanning microscopy: II. the incoherent case,” Inverse Probl. **5**(4), 441–461 (1989). [CrossRef]

**16. **M. Bertero, C. De Mol, E. Pike, and J. Walker, “Resolution in diffraction-limited imaging, a singular value analysis,” Opt. Acta: Int. J. Opt. **31**(8), 923–946 (1984). [CrossRef]

**17. **J.-Y. Yu, S. R. Becker, J. Folberth, B. F. Wallin, S. Chen, and C. J. Cogswell, “Achieving superresolution with illumination-enhanced sparsity,” Opt. Express **26**(8), 9850–9865 (2018). [CrossRef]

**18. **J.-Y. Yu, V. Narumanchi, S. Chen, J. Xing, S. R. Becker, and C. J. Cogswell, “Analyzing the super-resolution characteristics of focused-spot illumination approaches,” J. Biomed. Opt. **25**(5), 056501 (2020). [CrossRef]

**19. **R. Tibshirani, “Regression shrinkage and selection via the lasso,” J. Royal Stat. Soc. Ser. B **58**(1), 267–288 (1996).

**20. **E. J. Candès, J. Romberg, and T. Tao, “Robust uncertainty principles: Exact signal reconstruction from highly incomplete frequency information,” IEEE Trans. Inf. Theory **52**(2), 489–509 (2006). [CrossRef]

**21. **A. Foi, M. Trimeche, V. Katkovnik, and K. Egiazarian, “Practical Poissonian-Gaussian noise modeling and fitting for single-image raw-data,” IEEE Trans. on Image Process. **17**(10), 1737–1754 (2008). [CrossRef]

**22. **A. Charnes, E. L. Frome, and P. L. Yu, “The equivalence of generalized least squares and maximum likelihood estimates in the exponential family,” J. Am. Stat. Assoc. **71**(353), 169–171 (1976). [CrossRef]

**23. **S. M. Riad, “The deconvolution problem: An overview,” Proc. IEEE **74**(1), 82–85 (1986). [CrossRef]

**24. **A. Chambolle and T. Pock, “An introduction to continuous optimization for imaging,” Acta Numer. **25**, 161–319 (2016). [CrossRef]

**25. **P. C. Hansen, *The L-curve and its use in the numerical treatment of inverse problems* (IMM, Department of Mathematical Modelling, Technical University of Denmark, 1999).

**26. **G. H. Golub, M. Heath, and G. Wahba, “Generalized cross-validation as a method for choosing a good ridge parameter,” Technometrics **21**(2), 215–223 (1979). [CrossRef]

**27. **F. J. Anscombe, “The transformation of Poisson, binomial and negative-binomial data,” Biometrika **35**(3-4), 246–254 (1948). [CrossRef]

**28. **E. Shaked and O. Michailovich, “Regularized Richardson-Lucy algorithm for sparse reconstruction of Poissonian images,” (2010).

**29. **“Argo-check resolution monitoring slide,” http://argolight.com/products/argo-check-resolution/. Accessed: 2020-07-07.

**30. **F. Dupe, J. M. Fadili, and J. Starck, “A proximal iteration for deconvolving Poisson noisy images using sparse representations,” IEEE Trans. on Image Process. **18**(2), 310–321 (2009). [CrossRef]

**31. **J. M. Bardsley and C. R. Vogel, “A nonnegatively constrained convex programming method for image reconstruction,” SIAM J. Sci. Comput. **25**(4), 1326–1343 (2004). [CrossRef]

**32. **F. Huang, T. M. Hartwich, F. E. Rivera-Molina, Y. Lin, W. C. Duim, J. J. Long, P. D. Uchil, J. R. Myers, M. A. Baird, W. Mothes, M. W. Davidson, D. Toomre, and J. Bewersdorf, “Video-rate nanoscopy using sCMOS camera-specific single-molecule localization algorithms,” Nat. Methods **10**(7), 653–658 (2013). [CrossRef]