## Abstract

Structured illumination microscopy can achieve super-resolution in fluorescence imaging. The sample is illuminated with periodic light patterns, and a series of images are acquired for different pattern positions, also called phases. From these a super-resolution image can be computed. However, for an artefact-free reconstruction it is important that the pattern phases be known with very high precision. If the necessary precision cannot be guaranteed experimentally, the phase information has to be retrieved *a posteriori* from the acquired data. We present a fast and robust algorithm that iteratively determines these phases with a precision of typically below *λ*/100. Our method, which is based on cross-correlations, allows optimisation of pattern phase even when the pattern itself is too fine for detection, in which case most other methods inevitably fail. We analyse the performance of this method using simulated data from a synthetic 2D sample as well as experimental single-slice data from a 3D sample and compare it with another previously published approach.

© 2013 Optical Society of America

## 1. Introduction

For more than a hundred years the Abbe diffraction limit [1] was believed to fundamentally limit the spatial resolution of optical microscopy. However, the last decade has witnessed the emergence of numerous novel microscopy techniques that manage to circumvent the Abbe limit and achieve super-resolution [2–6]. One of these techniques is structured illumination microscopy (SIM) [7–9]. Through illumination with periodic light patterns, high spatial frequency sample information is down-modulated to lower frequencies, which can then be captured by the microscope. The acquisition of several images under different pattern positions yields enough information for the computational extraction of this high spatial frequency information. Shifting this information back to its true position in Fourier space extends the support region of the optical transfer function (OTF) of the microscope, thus enhancing resolution. In this way SIM combines the Abbe limit of excitation and illumination. As the illumination is diffraction limited as well, linear SIM can only enhance the resolution by a factor of about two. Exploiting nonlinearities in the response of the fluorophores to the illumination light, e.g. through saturation [10, 11] or by using photo-switchable dyes [12, 13], one can generate higher harmonics in the *effective* illumination, which can lie outside the illumination support. This enables, in principle, unlimited resolution.

For the correct extraction of the high frequency information, it is crucial that the pattern positions of the individual images be known with very high precision. While pattern positioning using spatial light modulators or mechanical means with active feedback control can ensure high precision and reproducibility of the true pattern position, less sophisticated systems may not be able to do so. Furthermore, sample movement and photo bleaching of the illumination pattern into the sample may alter the effective pattern position in the images.

It is therefore desirable to be able to determine the pattern position *a posteriori* from the acquired data. Shroff et al. have presented a method for obtaining the pattern position in each individual raw image by analysing the phase of the peaks corresponding to the pattern frequency in the Fourier representation of the image [14]. While this method is very effective for medium pattern frequencies, it is less reliable for very fine patterns or noisy data. Furthermore, for pattern frequencies so fine that the corresponding Fourier peaks lie outside the support of the detection OTF, this method inevitably fails. This is the case when combining SIM with total internal reflection fluorescence (TIRF) microscopy [15, 16], or even when using patterns near the limit of the illumination support, whose peaks may lie outside the detection support as a result of the Stokes shift. Another approach is the determination of all illumination parameters through blind deconvolution [17].

In this paper we present a fast and efficient way of iteratively determining the pattern positions with very high precision. Simulations show that our algorithm gives accurate results even for noisy data and for pattern frequencies outside the detection support. Finally, we demonstrate the successful reconstruction of experimental SIM data with unknown illumination patterns and validate the performance of our algorithm using a statistical analysis of the retrieved phase corrections.

## 2. Image formation and reconstruction in SIM

In fluorescence microscopy the recorded data *D*(*r⃗*) can be described as a convolution (⊗) of the emitted fluorescence *E*(*r⃗*) with the microscope point spread function (PSF), *h*(*r⃗*): *D*(*r⃗*) = [*E* ⊗ *h*](*r⃗*). Omitting constant scaling factors such as quantum efficiency, the emitted light distribution can be written as a product of the sample fluorophore density *S*(*r⃗*) and an *effective* illumination *I*(*r⃗*): *E*(*r⃗*)=*S*(*r⃗*)*I*(*r⃗*). In the case of a linear sample response to the illumination light, this effective illumination is identical to the real illumination. If the sample response is nonlinear, the process of illumination and emission can be written as the sample responding linearly to a modified effective illumination, which accounts for the nonlinearity [10]. As Gustafsson et al. [18], we, too, consider illumination distributions comprising a finite number *M* of components, which can be separated into axial (*z*) and lateral (*x*, *y*) parts: *I*(*r⃗ _{xy}*,

*z*) = ∑

*(*

_{m}I_{m}*z*)

*J*(

_{m}*r⃗*). Each lateral component should be a harmonic wave, containing only a single spatial frequency

_{xy}*p⃗*, i.e.

_{m}*J*(

_{m}*r⃗*)= exp{

_{xy}*ι*(2

*πp⃗*·

_{m}*r⃗*+

_{xy}*ϕ*)}. Here

_{m}*ϕ*denotes the phase of the

_{m}*m*

^{th}component. As the illumination intensity has to be real-valued, each illumination component

*J*(

_{m}*r⃗*) must have a complex conjugate partner ${J}_{m}^{*}\left(\overrightarrow{r}\right)$. This is accounted for by numbering the components from

*m*=−(

*M*−1)/2 to (

*M*−1)/2 and defining

*p⃗*

_{−}

*=−*

_{m}*p⃗*and

_{m}*ϕ*

_{−m}=−

*ϕ*. If the spatial frequencies

_{m}*p⃗*describe a harmonic pattern, they will be multiples of a single fundamental frequency

_{m}*p⃗*:

*p⃗*=

_{m}*mp⃗*. Also, if this pattern is symmetrical and remains rigid under translation between the images, the phases of the individual frequencies will be multiples of the phase of the fundamental frequency,

*ϕ*:

*ϕ*=

_{m}*mϕ*[18].

If the axial part of the illumination light distribution remains fixed with respect to the microscope objective during refocussing, its effect can be incorporated into modified PSFs *h _{m}*(

*r⃗*)=

*h*(

*r⃗*)

*I*(

_{m}*z*). The image formation can then be described by a sample that is illuminated by lateral structured illumination components

*J*(

_{m}*r⃗*) only, while each of these is imaged with their own respective PSF

*h*(

_{m}*r⃗*). We can therefore write the acquired SIM data as

*C*̃

*(*

_{m}*k⃗*). For structured illumination a series of typically

*N*≥

*M*images is acquired for different pattern phases

*ϕ*. We can write the

_{n}*n*

^{th}Fourier image as

**M**contains the various pattern phases

*mϕ*for different illumination orders (

_{n}*m*) and images (

*n*).

For the reconstruction of high resolution images, the individual Fourier components *C*̃* _{m}*(

*r⃗*) need to be extracted from the acquired SIM data. This can be done by inverting Eq. (3):

**M**is not square, i.e. if there are more images than components to be separated, the Moore-Penrose pseudo inverse [19]

**M**

^{−1}:=(

**M**

^{†}

**M**)

^{−1}

**M**

^{†}can be used. Here the dagger symbol

^{†}denotes the conjugate transpose or Hermitian conjugate of a matrix. For square mixing matrices (

*M*=

*N*) equidistant phase steps (i.e.

*ϕ*= 2

_{n}*πn*/

*N*) will yield the best conditioning number and will therefore yield components with the highest possible signal-to-noise levels. Nevertheless, separation will work for other phase steps, as long as

**M**is non-singular.

In order to achieve isotropic lateral resolution enhancement, this process of data acquisition and component separation is carried out under a number *Q* of pattern orientations, yielding separated components *C̃ _{q,m}*(

*k⃗*), where the additional index

*q*denotes the orientation. These components can then be recombined using weighted averaging in Fourier space combined with a generalised Wiener filter [18, 20] and an apodisation function

*Ã*(

*k⃗*),

*S*̃′(

*k⃗*) that contains higher frequency information than the wide-field image

*S*̃(

*k⃗*)

*h*̃(

*k⃗*). In the above equation

*w*is the constant Wiener parameter, which is adjusted empirically. The purpose of the apodisation function is to avoid hard edges in the effective reconstructed OTF, which may otherwise result from Wiener-filtering. This way ringing artefacts in the reconstructed PSF are avoided. The apodisation function used to reconstruct the images in this article is a distance transform (

*s*) applied to the footprint (

*f*) of the OTF (i.e. the distance from the edge of the support), normalised to one in the centre, taken to the power of 0.4:

*Ã*(

*k⃗*)=[

*s*(

*f*(

*h*̃(

*k⃗*)))]

^{0.4}. This function was empirically found to yield good results.

As component separation is done one pattern orientation at a time, we will limit ourselves to the notation used for single orientations for the remainder of this article. Note that when reconstructing two-dimensional slices of SIM data acquired of three-dimensional specimen, Eq. (5) cannot be stringently applied. Instead, component recombination is modified according to Appendix A.

## 3. Current phase determination techniques

As we can see from Eqs. (3) and (4), separation of the individual components requires knowledge of the pattern phases *ϕ _{n}*. If these phases are not known with sufficient precision, the unmixing process will not be able to perfectly separate the components; instead, they will contain residual information from other components. After shifting, these residual components will be at incorrect locations in Fourier space, leading to artefacts in the reconstructed image. Precise knowledge of the pattern phases is therefore paramount. As it is not always possible to exert enough control over these phases in the experimental setup, it is desirable to be a able to determine the pattern phases

*a posteriori*from the acquired data.

#### 3.1. Determining the pattern phase from the phase of the Fourier peaks

Shroff et al. have presented a method for determining pattern phases for SIM data acquired under sinusoidal (i.e. two beam) illumination, which works well for a variety of cases [14]. As it is to our knowledge the only technique published to date, we use it for comparison.

Unlike Shroff et al. we assume coherent illumination, a prerequisite for good pattern contrast of very fine gratings, which are necessary when trying to significantly enhance the resolution. This leads to slightly modified equations from the ones published in [14]. However, as Shroff et al. point out, their method will also work for coherent illumination. In fact, it will yield better results in this case.

Under sinusoidal illumination the acquired images will comprise three components, *C*̃_{−1}(*k⃗*), *C̃*_{0}(*k⃗*) and *C*̃_{+1}(*k⃗*), superimposed with different phase:

*c*is the contrast of the illumination pattern. Shroff et al. retrieve the pattern phase of the

*n*

^{th}image from the phase of the Fourier image at the frequency of the pattern peak

*p⃗*:

Firstly, the contrast of the illumination pattern, *c*, has to be sufficiently large. Secondly, the sample power spectrum must decrease sufficiently fast with growing frequency, so that |*S*̃(0)|^{2} ≫ *S*̃(*p*⃗)||^{2} +|*S*̃(2*p⃗*)|^{2}. This is a good assumption for most natural samples as long as the pattern frequency *p*⃗ is sufficiently large. When these two conditions are fulfilled, Eq. (7) will be dominated by the last term, yielding

*S*̃(0) will be real valued, as will be the OTF,

*h*̃(

*k⃗*), if the PSF

*h*(

*r⃗*) is real and symmetrical. This allows the phase retrieval in the above manner. For asymmetrical PSFs the phases of the OTFs have to be accounted for.

Lastly, the magnitude of the OTF at the pattern frequency, *h*̃(*p⃗*), has to be sufficiently large. If this is not the case, noise in the acquired image may alter the phase measured at *p⃗* significantly. Naturally, for pattern frequencies *p⃗* outside the support of the detection OTF, i.e. *h*̃ (*p⃗*)=0, the method cannot work.

## 4. Phase determination using cross-correlations of separated components

The above condition that *h*̃(*p*⃗) be sufficiently large will not always be fulfilled. For very high pattern frequencies, which are desirable for significant resolution enhancement, the OTF *h*̃(*p⃗*) quickly drops to low values, so that shot noise in the Fourier images may dominate the phase value at *p⃗*. But worse, it is possible that the pattern peaks lie outside the OTF support altogether. This is the case when using SIM in TIRF configuration, where the numerical aperture (NA) of the illumination is larger than the NA of the detection. But even for SIM systems in a conventional (epi-)fluorescence configuration, the Stokes shift may lead to patterns that are no longer visible in the detection.

For these cases we have developed a method for determining the pattern phases through an iterative optimisation using cross-correlations (CC) between separated components.

#### 4.1. Correlating separated components

The idea behind our method is the following: if the Fourier components are perfectly separated, but have not been shifted to their true origin in Fourier space, different components should not contain overlapping information. The overlap in information of two components *C*̃* _{i}*,

*C*̃

*can be measured using a weighted cross-correlation (WCC, denoted by ⍟*

_{j}*, defined in Appendix B) at frequency zero, ${\mathcal{C}}_{i,j}^{\left(0\right)}={\left[{\tilde{C}}_{i}\left(\overrightarrow{k}\right){\u235f}_{w}{\tilde{C}}_{j}\left(\overrightarrow{k}\right)\right]|}_{\overrightarrow{k}=0}$. This correlation*

_{w}*should*therefore yield a low value for

*i*≠

*j*. Auto-correlations of components (i.e.

*i*=

*j*) on the other hand will yield high values, as all separated components will obviously have strong information overlap with themselves.

When comparing an unshifted component to a component shifted by an integer number of pattern vectors *lp⃗*, however, there will be information overlap between some of the components. The correlation values

*should*be high for

*i*=

*j*+

*l*(as we can see from Eq. (2)), and low for

*i*≠

*j*+

*l*.

This is true when components have been separated perfectly. If, however, the components are not well-separated, correlations between residual (wrong) components may yield high values where low values are expected.

Note that this cross-correlation of components is similar to the comparison of components described by M. G. L. Gustafsson [8] and J. T. Frohn [9], who use it for the determination of the precise orientation, frequency, modulation depth and initial phase of the pattern. However, no other optimization of the separation matrix using the cross-correlation of the components has been reported so far.

#### 4.2. Iterative optimisation of pattern phases

It is a sign of a good separation of components, therefore, if the correlation values
${\mathcal{C}}_{i,j}^{\left(l\right)}$ are low for *i* ≠ *j* +*l*. This fact can be used for the optimisation of the unmixing matrix. We achieve this by iteratively varying the pattern phases *ϕ _{n}* in the mixing matrix (

**M**

*=*

_{n,m}*e*

^{ιmϕn}), inverting this matrix and applying it to separate the components ( $\overrightarrow{\tilde{C}}\left(\overrightarrow{k}\right)={\mathbf{M}}^{-1}\overrightarrow{\tilde{D}}\left(\overrightarrow{k}\right)$) and using these separated components to calculate the component correlation tensor ${\mathcal{C}}_{i,j}^{\left(l\right)}$. The correlation tensor should be calculated for

*l*=1 to

*L*,

*Lp⃗*being the highest shift for which there is still any overlap between the OTFs of the shifted components. The mixing matrix is then optimised iteratively by using a gradient search algorithm (“minfunc” by Mark Schmidt [21]), minimising the cost function

*i*=

*j*+

*l*), or a combination of this approach with the above function for the optimum usage of all the information available, we found that this did not improve the perceived reliability of our optimisation.

#### 4.3. Speeding up the algorithm significantly

An optimisation following this scheme requires the variation of the mixing matrix **M**, the separation of the components and the re-calculation of the correlation tensor
${\mathcal{C}}_{i,j}^{\left(l\right)}$ as well as the cost function *g*(**M**) for each iteration. This means that (*M*^{2} − *M* +*L*/2) (*L*+1)correlations have to be calculated in each iteration. This corresponds to the total number of elements in the correlation tensor less the number of element expected to have a high correlation (i.e. *j* + *l* − *i* =0, which are not used in the cost function). This number can be decreased if symmetries in the correlation tensor are exploited. Nevertheless, the iterative optimisation will be computationally very expensive and time-consuming, especially for large images and even more so in the case of nonlinear SIM, which requires a larger number of components.

However, the optimisation can be significantly speeded up. Instead of re-calculating the correlation tensor *𝒞* after each iteration, we calculate a different tensor, *𝒟*, containing the correlation values of the unshifted and *lp⃗*-shifted Fourier *images* rather than of the Fourier *components*:

*𝒟*, the component correlation tensor

*𝒞*can be calculated via

*𝒟*is independent of the mixing matrix

**M**it has to be calculated only once before the start of the iterative optimisation. This means that

*N*

^{2}

*L*correlations (or fewer, when exploiting symmetries in the tensor) have to be calculated

*once*. Furthermore, for each iteration the unmixing matrix

**M**

^{−1}will now operate on

*L*correlation matrices of size

*N*×

*N*, rather than on

*M*Fourier images containing thousands or millions of pixels. Both aspects lead to a dramatic improvement in optimisation speed.

#### 4.4. Accounting for auto-correlations of the noise

When calculating the component correlation tensor in the above manner, apart from correlating information content present in the different components, the correlation values
${\mathcal{C}}_{i,j}^{\left(l\right)}$ will also contain noise correlations. These consist of a random part with an expectation value of zero, which stems from correlating noise from *different* (or shifted) Fourier images, and a non-random part stemming from auto-correlations of noise from *one* image with itself. This non-random part has an expectation value that is real and positive, and should be removed from the correlation values
${\mathcal{C}}_{i,j}^{\left(0\right)}$ before optimisation.

This can easily be accounted for when calculating the Fourier image correlation tensor *𝒟*, where both the correlations of image information as well as noise correlations will directly contribute to the correlation value. Random noise between different Fourier images will not be correlated, nor will the noise of one Fourier image be correlated to the noise of the same Fourier image when shifted by *lp⃗*, *l* ≠ 0. However, for an (unshifted) auto-correlation [*D̃ _{i}*(

*k⃗*)⍟

*(*

_{w}D̃_{i}*k⃗*)]|

_{k⃗}_{=0}, noise correlations will be non-negligible and have to be corrected for. If we assume white noise of standard deviation

*σ*, its contribution to the auto-correlation will be [

*σ⍟*]|

_{w}σ

_{k⃗}_{=0}= ∑

*{*

_{k}*w*(

*k⃗*)

*σ*

^{2}}/∑

*{*

_{k⃗}*w*(

*k⃗*)} =

*σ*

^{2}, where

*w*(

*k⃗*) are weights used in the weighted CC, as defined in Appendix B. The Fourier image correlation tensor for unshifted images thus has to be modified to

*δ*is the Kronecker delta.

_{ij}#### 4.5. Reconstructing two-dimensional data

The derivation of our correlation-based optimisation assumes the availability of three-dimensional datasets (i.e. image stacks) and OTFs. Only then can the correct weights be applied in the weighted cross-correlations. Nevertheless, when reconstructing two-dimensional data stemming from either two- or three-dimensional samples, this method can still be applied. Instead of using the true 3D imaging OTFs, we use its projection along *k _{z}* onto the

*k*,

_{x}*k*-plane (which corresponds to the pure 2D in-focus OTF). When applied to 2D samples (e.g. very thin samples or TIRF) as the synthetic sample used in our simulated images, this again yields the correct weights. When reconstructing 2D data from 3D samples (as for our experimental data), these incorrect 2D weights will treat all information as overlapping, as long as it has overlap in the

_{y}*k*,

_{x}*k*-coordinates, ignoring

_{y}*k*. This is however still better than not using any weights at all, which would then put an undue emphasis on noise-only correlations.

_{z}## 5. Simulations

In order to evaluate the performance of our algorithm we used it to determine the phases of simulated data with known, non-uniform phase steps. In addition, we compared our algorithm to the phase-of-peak method by Shroff et al.

#### Methods

We performed simulations using the following experimental parameters: excitation wavelength *λ _{ex}* = 488 nm; emission wavelength

*λ*= 515 nm; numerical aperture

_{em}*NA*=1.4; refractive index of the embedding medium

*n*=1.52. These parameters correspond to a maximum detectable spatial frequency of (184 nm)

_{r}^{−1}, defined by the detection OTF. The illumination patterns were simulated for two-beam (i.e. sinusoidal) illumination. The pixel size in the simulated raw image corresponds to 65 nm in sample space. The synthetic sample used for the simulations is shown in Fig. 1(a), its Fourier transform in Fig. 1(b). No camera offset or other background was assumed in the simulation.

We analysed two different scenarios: illumination with a fine pattern with a spatial frequency of (210 nm)^{−1}, which corresponds to 87.6% of the maximum frequency supported by the OTF; and illumination with a very fine pattern with a spatial frequency of (185 nm)^{−1}, corresponding to 99.4% of the OTF support.

In each case we simulated illumination patterns for three different pattern orientations (0°, 60°, 120°). The ideal pattern phases (0°, 120°, 240°) were varied randomly using a Gaussian distribution with a standard deviation of 10°. We simulated raw images using twenty such sets of randomised pattern phases, {0° + *ϕ*_{1,p}, 120° + *ϕ*_{2,p}, 240° + *ϕ*_{3}* _{,p}*},

*p*=1..20. For each of these, we simulated noisy images for 51 different signal-to-noise levels, using Poisson noise and an expectation values of 10

^{l/10},

*l*=0..50, i.e. between 1 and 10

^{5}photons, in the brightest pixel.

For each simulated dataset **D̃*** _{q,p,l}* ={

*D*̃

_{1,q,p,l},

*D*̃

_{2,q,p,l},

*D*̃

_{3,q,p,l}} (i.e. three raw SIM images for one pattern orientation (index

*q*), with one set of randomised phases (index

*p*), for one particular photon level (index

*l*)) we optimised the pattern phases using our iterative, correlation-based algorithm and – for comparison – also using the method proposed by Shroff et al. [14]. For each of these datasets we subtracted the dataset’s known real phases from the phases determined by the algorithm, yielding the individual remaining phase errors Δ

*ϕ*

_{1,q,p,l}, Δ

*ϕ*

_{2,q,p,l}and Δ

*ϕ*

_{3,q,p,l}. We then calculated the standard deviation of these three individual remaining phase errors, ${\epsilon}_{q,p,l}={\u3008{\left(\mathrm{\Delta}{\varphi}_{n,q,p,l}-{\u3008\mathrm{\Delta}{\varphi}_{m,q,p,l}\u3009}_{m}\right)}^{2}\u3009}_{n}^{1/2}$, where 〈..〉

*denotes the mean of an expression over the index*

_{n}*n*. We call this

*ε*the phase error of the dataset

_{q,p,l}**D̃**

*. This approach disregards any global pattern phase offsets, as component separation only requires the knowledge of phase steps between images (i.e. a constant offset present in all the remaining phase errors is not an error at all, as it does not affect component separation; this global phase can be fitted in the overlap region of separated components [8, 9]). Although calculated by means of a standard deviation, this phase error*

_{q,p,l}*ε*is really our measure of how well the algorithm managed to find the correct phases for the dataset

_{q,p,l}**D̃**

*. For each of the 51 photon levels (*

_{q,p,l}*l*) we then calculated the mean of the datasets’ phase errors over all 20 different phase variations (

*p*) and three orientations (

*q*),

*E*= 〈

_{l}*ε*〉

_{q,p,l}*. We call this the*

_{q,p}*average phase error*for a particular photon level (

*l*). We furthermore calculated the standard deviation from this average phase error, $\mathrm{\Delta}{E}_{l}={\u3008{\left({\epsilon}_{q,p,l}-{E}_{l}\right)}^{2}\u3009}_{q,p}^{1/2}$.

#### Results

The results of these simulations are shown in Fig. 2. The dark blue lines show the average phase error *E _{l}* of our iterative,

*correlation-based*algorithm, the shaded blue areas indicate its standard deviation Δ

*E*. Likewise the red lines and shaded areas show the corresponding values obtained for Shroff et al.’s

_{l}*phase-of-peak*method.

Figure 2(a) shows the result for the coarser of the two gratings. For raw images with high SNR, both methods yield similar average phase errors of below 1°, which corresponds to an average error in pattern position of less than 0.6 nm in sample space. Both methods also have similar confidence, although for one dataset (10^{4.6} ≈ 40000 photons) our algorithm failed and returned a phase error of 31°, leading to a degradation of average phase error and standard deviation. For images of lesser SNR (less than 100 photons expected in the brightest pixel), our iterative method yields more accurate results with higher confidence.

Figure 2(b) shows the result for the finer of the two gratings. Although the pattern frequency still lies within the support of the OTF, the transfer strength of the OTF for this frequency is extremely weak, rendering the peak nearly invisible in the Fourier transformed raw images. Even for high SNR the noise dominates over the pattern information at this frequency leading to a failure of the phase-of-peak method. Only for the strongest SNRs (>10^{4} photons) a slight decrease in average phase error occurs.

Our correlation-based approach leads to an improvement in phase error already for very low SNR. For expected maximum photon levels of over 100 the average phase error drops to under 2°, corresponding to an average error in position of less than 1 nm in sample space.

#### Influence of phase optimisation

To show the importance of using the correct phases in the reconstruction of high resolution SIM images, Fig. 3 illustrates the effects of phase error. For these simulations we used the finer of the two patterns, and high intensities with an expectation value of 10^{5} photons in the brightest pixel. For a more pronounced effect we randomised the pattern phase with a standard deviation of 20° (10 nm), yielding phase errors {−7.4°, 7.1°, 0.3°} for orientation 1, {23.6°, −4.3°, −19.3°} for orientation 2 and {38.0°, −31.6°, −6.4°} for orientation 3. The phase errors found by our algorithm were {−10.0°, 9.6°, 0.4°} for orientation 1, {23.4°, −4.3°, −19.1°} for orientation 2 and {37.4°, −30.9°, −6.6°} for orientation 3.

Because of the very high photon numbers the Wiener-filter parameter *w* in Eq. (5) was chosen to be 2.4·10^{−4} of the maximum value of ∑_{q′,m′}{|*h̃*_{m′} (*k⃗* +*m*′*p⃗ _{q}*

_{′})|

^{2}}, i.e. of the denominator without

*w*in Eq. (5). Figure 1(a) shows the synthetic sample used for the simulation (with a close-up of the blue box to the left), Fig. 1(b) its Fourier transform (for the Fourier transformed images we linearly display the square root of their magnitude). Figures 3(a) and 3(b) show the Wiener-filter deconvolved wide-field image and its Fourier transform. Figures 3(c) and 3(d) show the result of SIM reconstruction without correcting the phases, using instead the nominal phases. Even when using the wrong phases the reconstructed image clearly exhibits strong resolution enhancement, and the extent of spatial frequencies in the Fourier image is much improved over the wide-field case. However, there is unwanted residual zero-order information in the shifted components, which is clearly visible as rays emanating from the edge of the support rather then from the centre. They have been highlighted in the bottom half of the Fourier image, but can also be seen (as symmetrical copies) in the upper half. They stem from a bad separation of components due to the use of incorrect phases and cause artefacts in the reconstructed image. This is most obvious in the close-up, where the reconstructed line patterns exhibit a jagged appearance. This is remedied in the reconstructed SIM image using iterative, correlation-based optimisation, which is shown in Fig. 4(c). Also, the residual rays in the Fourier transformed image (Fig. 4(d)) are no longer visible, indicating a successful separation of the components. Due to the high spatial frequency of the pattern Shroff et al.’s phase-of-peak method failed to find the pattern phases, leading to a failure of component separation (Fig. 4(b)) and image reconstruction (Fig. 4(a)).

## 6. Performance on experimental data

To demonstrate the applicability of our algorithm to real experimental data, we successfully reconstructed SIM data with unknown pattern phase errors.

#### Method

Here we used SIM images of autofluorescent lipofuscin accumulating in the retinal pigment epithelial (RPE). Experimental parameters were: excitation wavelength *λ _{ex}* = 488 nm; emission wavelength

*λ*> 500 nm; numerical aperture

_{em}*NA*=1.4, oil immersion. The SIM pattern was generated using two-beam interference from a purposely misaligned Twyman-Green-interferometer. The pattern frequency and orientation could be varied by tilting and rotating the beam splitter, resulting in frequencies |

*p⃗*|={(310.6 nm)

^{−1}, (311.4 nm)

^{−1}, (301.5 nm)

^{−1}} for the three orientations

*α*= {0.9°, 120.8°, 59.7°}. Phase stepping was achieved by moving the mirrors in the interferometer arms using piezo actuators. A more detailed description of the setup can be found in [22].

For each orientation of the pattern a series of 20 raw images was acquired. Between images the illumination pattern was moved by approximately equidistant steps, which were visually estimated to correspond to roughly 75°. However, due to the geometry of the interferometer setup, this step size varied slightly for each pattern orientation. Out of these images we selected 17 subsets per orientation, each comprising three images *D _{q}*,

*D*

_{q}_{+1}and

*D*

_{q}_{+3}, 1 ≤

*q*≤ 17, for individual reconstruction.

Although the relative phase of the second image should thus be approximately *ϕ*_{2} ≈ 75° (when defining the relative phase of the first image as *ϕ*_{1}:= 0°) and that of the third image about *ϕ*_{3} ≈ 3*ϕ*_{2} ≈ 225°, we started our iterative optimisation with initial phases of *ϕ*_{2} = 40° and *ϕ*_{3} = 240° in order to show the capability of the algorithm to retrieve completely unknown phases.

Before the phase optimisation could be carried out, several other optimisation steps had to be applied. These include: background subtraction; multiplicative normalisation of image intensity to compensate for laser fluctuations and bleaching; determination of the exact pattern vector for each direction. For the reconstruction of the final high resolution image, further steps are required after phase optimisation, including: determination of the (global) phase relation between separated components; drift correction between images acquired under different pattern orientations; determination of the pattern modulation contrast; calculation of weight maps in Fourier space for a weighted recombination of the components; Wiener-filtering of the recombined Fourier image. While these steps are important for image reconstruction, a detailed description would go beyond the scope of this paper, which focusses solely on pattern phase. More details about the reconstruction process can be found in [23].

#### Results

For each orientation of the illumination pattern the algorithm thus yielded 17 values each for phases *ϕ*_{2} and *ϕ*_{3}. Their respective means *μ*_{2} and *μ*_{3} and ratio of the means *μ*_{3}/*μ*_{2} were: *μ*_{2} = 76.2°, *μ*_{3} = 229.2°, *μ*_{3}/*μ*_{2} = 3.01 for orientation 1; *μ*_{2} = 74.9°, *μ*_{3} = 225.2°, *μ*_{3}/*μ*_{2} = 3.01 for orientation 2; and *μ*_{2} = 73.2°, *μ*_{3} = 218.8°, *μ*_{3}/*μ*_{2} = 2.99 for orientation 3.

As for the simulations above, we calculated the remaining phase errors for each image in the dataset by subtracting the presumably ideal mean phases *μ*_{2} and *μ*_{3} from the phases determined by the algorithm. As we wanted to disregard any constant phase offset in these remaining errors, we calculated each dataset’s phase error as the standard deviation of its remaining phase errors. Their mean, i.e. the average phase error, and its standard deviation were: 1.2°±0.9° for orientation 1; 2.9°±1.6° for orientation 2; and 1.7°±1.1° for orientation 3. Taking into account the varying pattern frequencies of the different orientations, this corresponds to an average phase error of 1.0 nm ±0.8 nm for orientation 1; 2.5 nm ±1.4 nm for orientation 2; and 1.4 nm±0.9 nm for orientation 3.

For the phase-of-peak method the corresponding results were: *μ*_{2} = 76.9°, *μ*_{3} = 228.3°, *μ*_{3}/*μ*_{2} = 2.97 for orientation 1; *μ*_{2} = 75.2°, *μ*_{3} = 225.8°, *μ*_{3}/*μ*_{2} = 3.00 for orientation 2; and *μ*_{2} =72.8°, *μ*_{3} =218.5°, *μ*_{3}/*μ*_{2} =3.00 for orientation 3. The average phase error, and its standard deviation were: 7.4°±1.5° for orientation 1; 1.5°±0.9° for orientation 2; and 1.2°±0.6° for orientation 3. Taking into account the varying pattern frequencies of the different orientations, this corresponds to an average phase error of 6.4 nm±1.3 nm for orientation 1; 1.3 nm±0.8 nm for orientation 2; and 1.0 nm±0.5 nm for orientation 3.

#### Influence of phase optimisation

The results of the SIM reconstruction for one set of three raw images are shown in Fig. 5 and 6. As the data acquired was two-dimensional, a recombination of components according to Eq. (5) was not possible. Instead, we used a modified recombination, as described in Appendix A.

Figure 5(a) shows a deconvolved wide-field image generated from the separated 0^{th} component along with a close-up of the sample region inside the blue box. Figure 5(b) shows the corresponding Fourier image. SIM reconstruction without phase optimisation (i.e. assuming the phases {0°, 40°, 240°}, which were used as starting phases for the iterative reconstruction) yields an image with improved resolution, sectioning and contrast (Fig. 5(c)). However, the image appears to have been superimposed with a grainy, honeycomb-like structure. This is due to bad separation of components, leading to residual information being shifted to wrong locations in Fourier space (Fig. 5(d)). These are visible as bright peaks at the position of one and two times the pattern frequency, marked by blue circles in Fig. 5(d). When compared to the values *μ*_{2} and *μ*_{3}, correlation-based phase optimisation led to phase errors of 0.7° for orientation 1, 3.2° for orientation 2 and 1.5° for orientation 3. This improves the component separation, leading to a Fourier image devoid of these unwanted peaks (Fig. 6(d)). The resulting high resolution SIM reconstruction (Fig. 6(c)) does not exhibit the grainy structure present in the uncorrected image.

For the phase-of-peak method the phase errors were 8.2° for orientation 1, 2.4° for orientation 2 and 4.3° for orientation 3. The relatively high phase error for orientation 1 led to an imperfect separation of components, visible in a residual peak in the reconstructed Fourier image (Fig. 6(b)) and in a slightly more grainy structure in the reconstructed image Fig. 6(a) as compared to the one reconstructed with the correlation-based approach (Fig. 6(c)).

The analysis of the phase errors shows that the phase-of-peak method slightly outperforms the correlation-based approach for orientations 2 and 3. However, it performed significantly worse for orientation 1. This is precisely the orientation (0°) for which a residual peak can be seen in the example reconstructed Fourier image (Fig. 6(b)).

## 7. Discussion

Fluctuations in pattern phase can prevent the correct separation of components and lead to artefacts in the reconstructed SIM images. Usually, it is insufficient suppression of the strong zero-order component that will cause the most image degradation. However, because the pattern phases do not influence the contribution of the zero-order component to the individual raw images (see Eq. (3)), it can always be removed completely from any higher order components by assuming equidistant phase steps in the reconstruction. This is the case even if the real pattern phases deviate significantly from the assumed steps. SIM using equidistant pattern stepping is therefore somewhat less prone to artefacts, as component separation can tolerate a certain degree of phase variations (e.g. the systems designed by Gustafsson et al. [8, 11, 18] produce beautiful and artefact-free images without any correction of pattern phase). However, for large deviations of the real pattern positions from the assumed ones – or in the case of non-equidistant pattern stepping even for smaller phase errors – component separation will not be good enough, leading to artefacts in the reconstructed images. In these cases it is necessary to be able to determine the pattern phases from the acquired data.

Correlation-based optimisation can be used to robustly determine relative pattern phase in SIM data with very high precision. As we have shown for simulated data, this is the case even for noisy data and when the pattern frequency is too high for detection by other means. When applied to real SIM data of RPE cells, our algorithm yielded correct pattern phases with high accuracy, allowing an artefact-free reconstruction of high resolution images. We hope our technique will thus facilitate the use of home-built SIM systems which may not exhibit the necessary stability for uncorrected image reconstruction.

## Appendix

## A. Reconstructing two-dimensional data of three-dimensional samples

When reconstructing two-dimensional (2D) data acquired of three-dimensional (3D) samples, Eq. (5) cannot be applied, as it requires 3D images and OTFs. It is nevertheless possible to reconstruct 2D SIM data. As the Fourier transform of a 2D image (i.e. of a single slice out of a 3D image stack) corresponds to a sum projection of the Fourier transformed 3D stack along *k _{z}* onto the

*k*,

_{x}*k*-plane, the straightforward approach to SIM reconstruction would be to apply Eq. (5) using a 2D OTF, which is the sum projection of the true 3D OTF:

_{y}*h*̃′(

*k⃗*)= ∫

_{xy}*h*̃(

*k⃗*)

*dk*. However, this approach is not ideal: as can be seen in Fig. 7(b), the resulting 2D OTF is strongest around the zero frequency, i.e. where the corresponding 3D OTF suffers from the missing cone, as indicated by arrows in Fig. 7(a). In these regions Eq. (5) would therefore lead to an undue emphasis of components that do not carry any missing cone information, while suppressing components that could fill this gap. As a result, the reconstructed images will exhibit almost no optical sectioning, although the acquired data does contain the necessary sectioning information.

_{z}We therefore attenuate the 2D projected OTFs around their central frequency, in order to emphasise components that fill the missing cone. This attenuation is achieved through a multiplication with the Gaussian function *g*̃(*k⃗ _{xy}*) = 1 −

*a*exp{−|

*k⃗*|

_{xy}^{2}/(2

*d*

^{2})}, which is shown as a black dotted line in Fig. 7(c). Recombining the components using these modified weights and then applying the generalised Wiener filter, Eq. (5) becomes

*g*̃ does not act quadratically in the denominator of the final Wiener-filtered image. The strength

*a*of the attenuation and its width

*d*are chosen empirically for each experiment. For the reconstruction of SIM images of RPE cells in this article, they were

*a*= 0.999 and $d=\sqrt{2}{k}_{\text{max}}=\sqrt{2}\frac{4\pi NA}{{\lambda}_{\text{em}}}$. The Wiener-filter parameter

*w*was empirically optimised and set to 0.3 of the maximum value of ∑

_{q′,m′}{|

*h*̃′

_{m′}(

*k⃗*+

_{xy}*m*′

*p⃗*

_{q′})|

^{2}

*g*̃(

*k⃗*+

_{xy}*mp⃗*)}, i.e. of the denominator without

_{q}*w*in Eq. (14).

## B. Weighted cross-correlations

For pixellated Fourier images the conventional discrete CC is defined as
${\left[{\tilde{C}}_{i}\left(\overrightarrow{k}\right)\u235f{\tilde{C}}_{j}\left(\overrightarrow{k}\right)\right]|}_{\overrightarrow{k}=l\overrightarrow{p}}={\sum}_{\overrightarrow{k}}\left\{{\tilde{C}}_{i}\left(\overrightarrow{k}\right){\tilde{C}}_{j}^{*}\left(\overrightarrow{k}-l\overrightarrow{p}\right)\right\}$. The individual terms inside the sum,
${\tilde{C}}_{i}\left(\overrightarrow{k}\right){\tilde{C}}_{j}^{*}\left(\overrightarrow{k}-l\overrightarrow{p}\right)$, have two different contributions to the final correlation value: the first contribution comes from a correlation of the sample information present in the separated components, the second from a correlation of random noise in the components. While the white noise contributes with the same strength independent of *k⃗*, the correlation of information strongly depends on the magnitude of the product of the components’ respective OTFs, |*h̃ _{i}*(

*k⃗*)

*h̃*(

_{j}*k⃗*−

*lp⃗*)|. E.g. the contribution will be zero, where one of the OTFs is zero and no information is transmitted. Any contribution from this frequency

*k⃗*would therefore only be from noise. As this would degrade the overall correlation value, it is beneficial to emphasise contributions from frequencies with high information content and disregard that from frequencies contributing only with noise.

This can be achieved by defining the weighted cross correlation
${{\mathcal{C}}_{i,j}^{\left(l\right)}=\left[{\tilde{C}}_{i}\left(\overrightarrow{k}\right){\u235f}_{w}{\tilde{C}}_{j}\left(\overrightarrow{k}\right)\right]|}_{\overrightarrow{k}=l\overrightarrow{p}}:=\left[{\sum}_{\overrightarrow{k}}\left\{w\left(\overrightarrow{k}\right){\tilde{C}}_{i}\left(\overrightarrow{k}\right){\tilde{C}}_{j}^{*}\left(\overrightarrow{k}-l\overrightarrow{p}\right)\right\}\right]/\left[{\sum}_{\overrightarrow{k}}\left\{w\left(\overrightarrow{k}\right)\right\}\right]$, using weights *w*(*r⃗*). The weights are chosen such that the ratio of information correlation over noise correlation is maximised. If we assume white noise of standard deviation *σ* in the different components, Gaussian error propagation lets us calculate this quality factor as:
${Q}_{\text{WCC}}=\left[{\sum}_{\overrightarrow{k}}\left\{w\left(\overrightarrow{k}\right){\tilde{h}}_{i}\left(\overrightarrow{k}\right){\tilde{h}}_{j}^{*}\left(\overrightarrow{k}-l\overrightarrow{p}\right)\right\}\right]{\left[{\sum}_{\overrightarrow{k}}\left\{{w}^{2}\left(\overrightarrow{k}\right){\sigma}^{2}\left[{\left|{\tilde{h}}_{i}\right|}^{2}\left(\overrightarrow{k}\right)+{\left|{\tilde{h}}_{j}\right|}^{2}\left(\overrightarrow{k}-l\overrightarrow{p}\right)\right]\right\}\right]}^{-1/2}$. It is maximised for weights
$w\left(\overrightarrow{k}\right)=\left[{\tilde{h}}_{i}\left(\overrightarrow{k}\right){\tilde{h}}_{j}^{*}\left(\overrightarrow{k}-l\overrightarrow{p}\right)\right]/\left[{\sigma}^{2}\left[{\left|{\tilde{h}}_{i}\right|}^{2}\left(\overrightarrow{k}\right)+{\left|{\tilde{h}}_{j}\right|}^{2}\left(\overrightarrow{k}-l\overrightarrow{p}\right)\right]\right]$.

Whereas the conventional CC measures overall correlation, the WCC measures average correlation per (weighted) area, the area being defined by the overlap of the OTFs of the components. This leads to distortions, as small overlaps are favoured (e.g., the average correlation value over an area of just one pixel is bound to be high). However, as we evaluate the WCC at fixed positions *lp⃗* only (i.e. for fixed overlaps), this distortion does not affect us, and for optimisation purposes the WCC will yield more reliable results than the conventional unweighted one.

## Acknowledgments

We thank S. Dithmar for providing the RPE sample and C. Cremer for interesting discussions. We also thank the late Mats Gustafsson for many inspiring and fruitful discussions. We acknowledge financial support of our research by
*Carl Zeiss Microscopy GmbH*.

## References and links

**1. **E. Abbe, “Beiträge zur Theorie des Mikroskops und der mikroskopischen Wahrnehmung,” Arch. Mikrosk. Anat. **9**, 413–468 (1873). [CrossRef]

**2. **T. A. Klar and S. W. Hell, “Subdiffraction resolution in far-field fluorescence microscopy,” Opt. Lett. **24**, 954–956 (1999). [CrossRef]

**3. **M. Hofmann, C. Eggeling, S. Jakobs, and S. W. Hell, “Breaking the diffraction barrier in fluorescence microscopy at low light intensities by using reversibly photoswitchable proteins,” PNAS **102**, 17565–17569 (2005). [CrossRef] [PubMed]

**4. **S. T. Hess, T. P. K. Girirajan, and M. D. Mason, “Ultra-high resolution imaging by fluorescence photoactivation localization microscopy,” Biophys. J. **91**, 4258–4272 (2006). [CrossRef] [PubMed]

**5. **E. Betzig, G. H. Patterson, R. Sougrat, W. Lindwasser, S. Olenych, J. S. Bonifacino, M. W. Davidson, J. Lippincott-Schwartz, and H. F. Hess, “Imaging intracellular fluorescent proteins at nanometer resolution,” Science **313**, 1642–1645 (2006). [CrossRef] [PubMed]

**6. **M. J. Rust, M. Bates, and X. Zhuang, “Sub-diffraction-limit imaging by stochastic optical reconstruction microscopy (STORM),” Nat. Methods **3**, 793–796 (2006). [CrossRef] [PubMed]

**7. **R. Heintzmann and C. Cremer, “Laterally modulated excitation microscopy: improvement of resolution by using a diffraction grating,” Proc. SPIE **3568**, 185–196 (1999). [CrossRef]

**8. **M. G. L. Gustafsson, “Surpassing the lateral resolution limit by a factor of two using structured illumination microscopy,” J. Microsc. **198**, 82–87 (2000). [CrossRef] [PubMed]

**9. **J. T. Frohn, “Super-resolution fluorescence microscopy by structured light illumination,” Ph.D. thesis, Eidgenössische Technische Hochschule Zürich, Switzerland (2000).

**10. **R. Heintzmann, T. M. Jovin, and C. Cremer, “Saturated patterned excitation microscopy - a concept for optical resolution improvement,” JOSA A **19**, 1599–1609” (2002). [CrossRef] [PubMed]

**11. **M. G. L. Gustafsson, “Nonlinear structured-illumination microscopy: wide-field fluorescence imaging with theoretically unlimited resolution,” PNAS **102**, 13081–13086 (2005). [CrossRef] [PubMed]

**12. **L. Hirvonen, O. Mandula, K. Wicker, and R. Heintzmann, “Structured illumination microscopy using photo-switchable fluorescent proteins,” Proc. SPIE **6861**, 68610L (2008). [CrossRef]

**13. **E. H. Rego, L. Shao, J. Macklin, L. Winoto, G. A. Johansson, N. Kamps-Hughes, M. W. Davidson, and M. G. L. Gustafsson, “Nonlinear structured-illumination microscopy with a photoswitchable protein reveals cellular structures at 50-nm resolution” PNAS **109**, E135–E143 (2012). [CrossRef]

**14. **S. A. Shroff, J. R. Fienup, and D. R. Williams, “Phase-shift estimation in sinusoidally illuminated images for lateral superresolution,” JOSA A **26**, 413–424 (2009). [CrossRef] [PubMed]

**15. **R. Fiolka, M. Beck, and A. Stemmer, “Structured illumination in total internal reflection fluorescence microscopy using a spatial light modulator,” Opt. Lett. **33**, 1629–1631 (2008). [CrossRef] [PubMed]

**16. **P. Kner, B. B. Chhun, E. R. Griffis, L. Winoto, and M. G. L. Gustafsson, “Super-resolution video microscopy of live cells by structured illumination,” Nat. Methods **6**, 339–342 (2009). [CrossRef] [PubMed]

**17. **E. Mudry, K. Belkebir, J. Girard, J. Savatier, E. Le Moal, C. Nicoletti, M. Allain, and A. Sentenac, “Structured illumination microscopy using unknown speckle patterns,” Nature Photon. **6**, 312–315 (2012). [CrossRef]

**18. **M. G. L. Gustafsson, L. Shao, P. M. Carlton, C. J. R. Wang, I. N. Golubovskaya, W. Z. Cande, D. A. Agard, and J. W. Sedat, “Three-dimensional resolution doubling in wide-field fluorescence microscopy by structured illumination,” Biophys. J. **94**, 4957–4970 (2008). [CrossRef] [PubMed]

**19. **R. Penrose, “A generalized inverse for matrices,” Proc. Cambridge Philos. Soc. **51**, 406–413 (1955). [CrossRef]

**20. **L. P. Yaroslavsky and H. J. Caulfield, “Deconvolution of multiple images of the same object,” Appl. Opt. **33**, 2157–2162 (1994). [CrossRef] [PubMed]

**21. **M. Schmidt, “minfunc.m,” http://www.di.ens.fr/~mschmidt/Software/minFunc.html (2012).

**22. **G. Best, R. Amberger, D. Baddeley, T. Ach, S. Dithmar, R. Heintzmann, and C. Cremer, “Structured illumination microscopy of autofluorescent aggregations in human tissue,” Micron **42**, 330–335 (2011). [CrossRef]

**23. **K. Wicker, “Increasing resolution and light efficiency in fluorescence microscopy,” Ph.D. thesis, King’s College London, U.K. (2010).