By exploiting the saturation of a reversible single-photon transition, RESOLFT microscopy is capable of resolving three-dimensional structures inside a specimen with a resolution that is no longer limited by the wavelength of the light in use. The transition is driven by a spatially varying intensity distribution that features at least one isolated point, line or plane with zero intensity, and the resolution achieved depends critically on the field distribution around these zeros. Based on a vectorial analysis of the image formation in a RESOLFT microscope, we develop a method to effectively search for optimal zero-intensity point patterns under typical experimental conditions. Using this approach, we derive a spatial intensity distribution that optimizes the focal plane resolution. Moreover, we outline a general strategy that allows optimization of the resolution for a given experimental situation and present solutions for the most common cases in biological imaging.
©2007 Optical Society of America
Far-field light microscopy is the only method that non-invasively delivers three-dimensional images of (live) samples. However, the diffraction-limited resolution of light microscopes has often precluded their application on a scale smaller than approximately half the wavelength of the light in use. Recently, a family of approaches has been described and in part realized that fundamentally overcomes the diffraction resolution barrier in far-field fluorescence microscopy while still retaining the advantages of a far-field technique.
It is important to realize that nonlinear multiphoton optical transitions [2, 3] do not effectively increase the resolution in far-field microscopy. This primarily stems from the fact that the optical transitions in question require the energy to be subdivided into multiple long-wavelength photons. The power of the new concept based on reversible saturable (or switchable) optical linear (fluorescence) transitions (RESOLFT) stems from the fact that it generates a nonlinear optical response but is based on a single-photon process [4, 5, 6, 7, 8, 9] and therefore does not require longer wavelengths. Moreover, since molecular cross-sections for single-photon transitions are usually much larger than for multiphoton processes, RESOLFT microscopy operates at comparatively low intensities. Several implementations of the RESOLFT concept have been published [10, 11, 12], including those based on photoswitching of proteins and optically bistable organic molecules. More prominent examples are ground state depletion (GSD) and stimulated emission depletion (STED) microscopy. STED microscopy has already increased the resolution to λ/45 in the lateral plane [13, 14]. Every RESOLFT microscope uses a spatially varying intensity distribution with one or several isolated regions of zero intensity. This pattern can be used in two ways: to inhibit fluorescence so that it is allowed only at the zeros and in their immediate proximity, or to switch it on so that it occurs everywhere except at the local intensity zero. In the latter case, extensive mathematical modelling is required to derive the actual subdiffraction images. In this paper we shall therefore concentrate on the first approach, which is conceptually more appealing since it produces ‘positive images’ that can be readily interpreted. In a point-scanning microscope the intensity distribution is chosen such that it features a classical intensity zero at a given point in the focal plane.
The signal from there is never suppressed but as the intensity is increased, the signal will be efficiently inhibited in regions close to the central zero. Ultimately, the area from which photons can be emitted will be squeezed down to the molecular scale. The quality of the ‘inhibition patterns’ plays a vital role. It determines how effectively this process operates.
Although different inhibition patterns have already been used in practical applications or have been proposed [16, 17, 18], a systematic survey has yet to be made. The goal of this work is to find the optimal inhibition point patterns for RESOLFT microscopy. For this we shall outline a framework that enables the efficient search for pupil functions that create intensity patterns with focal zeros by confining a global optimization algorithm to the corresponding subspace. The application of this method confirmed the suitability of the phase masks used so far. Additionally, it has resulted in a novel phase mask that has recently been applied to STED microscopy of biological cells with unprecedented resolution.
2. Image formation in RESOLFT microscopy
In order to identify those intensity distributions that are especially favorable for RESOLFT-type microscopy, a thorough understanding of the image formation is helpful. While treatments neglecting the vectorial nature of light are sufficient for low saturation intensities, our goal demands a vectorial theory. The single point-scanning RESOLFT microscopy analyzed here relies on the inhibition of fluorescence from areas outside a small focal spot and involves light at two wavelengths, λ_ex and λ_inh. The first forms an excitation or activation pattern, which drives the molecules to the fluorescing state inside a spot, while the second de-excites or de-activates them outside the very center of the spot, thereby inhibiting fluorescence from there. Often both the excitation and the inhibition are effected by illumination with a short laser pulse, and for our purpose it is a good approximation to assume that the two pulses do not overlap in time. While the following calculations are based on this assumption, it is important to realize that RESOLFT does not rely on pulsed excitation or inhibition. In fact, the concept has already been successfully implemented with continuous wave illumination. A possible implementation of a point-scanning RESOLFT microscope is schematically outlined in Fig. 1.
Let the electric fields during the pulses be given by E_ex(r′) and E_inh(r′), respectively. Let us further assume that both the excitation and the inhibition are single-photon transitions well approximated by a dipole interaction with parallel transition dipoles p. Usually detection optics is also applied, and the probability of detecting a fluorophore at position r′ in the sample is given by the collection efficiency function CEF(p,r′). Here the unit vector p denotes the dye’s orientation upon excitation and de-activation, and the CEF includes any effects due to dye rotation between excitation and detection. If we assume that rotational diffusion is much faster than fluorescence emission, its dependence on p disappears. Finally, the effective PSF of the system, written here as h_eff(p,r′), is proportional to the joint probability of (1) exciting the dye, (2) not inhibiting fluorescence and (3) detecting the emitted photon. If we assume that the molecule cannot change its orientation between excitation and inhibition, we have

$$h_{\mathrm{eff}}(\mathbf{p},\mathbf{r}') \propto \left|\mathbf{p}\cdot\mathbf{E}_{\mathrm{ex}}(\mathbf{r}')\right|^{2}\, f\!\left(\left|\mathbf{p}\cdot\mathbf{E}_{\mathrm{inh}}(\mathbf{r}')\right|^{2}\right) \mathrm{CEF}(\mathbf{p},\mathbf{r}') \tag{1}$$
where f describes the probability of not inhibiting fluorescence at a given de-excitation or de-activation rate. Assuming pulsed light and a simple two-level system with spontaneous rates much slower than the de-excitation or de-activation rate during the pulse, f is well approximated by f(s) = exp(−ln 2 · ε_0 c s / I_sat), where I_sat is called the saturation intensity, c is the speed of light and ε_0 is the vacuum permittivity. Due to the dependence of the PSF on the dye’s orientation, the imaging process can no longer be described as a simple convolution integral with the effective PSF but takes the more general form
where ρ(p,R) describes the angular density of dye orientations in sample space. In fact, equation (2) also applies to ordinary confocal microscopy, where e.g. z-oriented molecules have a different PSF than those parallel to the focal plane. However, unless the dye orientation is very non-isotropic and inhomogeneous, the effect is almost negligible. This is because both excitation and detection of axially oriented molecules are suppressed by a factor sin²θ, where θ is the angle of the transition dipole with the optic axis. Practice shows that deconvolution and quantitative analysis assuming a space-invariant effective PSF are possible in common situations. For RESOLFT-type microscopes, the situation is more complex. The argument of the saturation function f also depends on p, and therefore the shape of the PSF can change between the desired nanoscopic size and the confocal form depending on the dye’s orientation. For space-variant anisotropies, this results in data that are difficult to interpret. But even for a space-invariant, random orientation of dyes, it can result in significant broadening of the effective PSF and loss of resolution. In both cases it is therefore mandatory that the projection of the inhibition field along each transition dipole that is significantly excited is strong everywhere around the focus. In the present manuscript we shall assume that the dyes are excited using linearly polarized or circularly polarized (or unpolarized) light. In our search for optimal inhibition fields, their quality is therefore determined by the strength of either one designated lateral component (linear polarization) or the weaker of both lateral components (circular polarization). While a field that simultaneously quenches both lateral components, as in the latter case, is arguably preferable, its quality has to be compared to an incoherent combination of perpendicular fields found for the linear polarization case.
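To make the role of the saturation function f more tangible, the following Python sketch evaluates a one-dimensional, scalar toy version of the effective PSF for a single fixed dye orientation. It is only an illustration under stated assumptions, not the vectorial model of this section: the Gaussian excitation profile with 250 nm FWHM, the sin² inhibition profile with a 500 nm period, and all function names are hypothetical choices made for the example.

```python
import numpy as np

def not_inhibited(i_over_isat):
    """f = exp(-ln2 * I / I_sat): probability that fluorescence is not inhibited."""
    return np.exp(-np.log(2.0) * i_over_isat)

def fwhm(x, y):
    """Full width at half maximum of a single-peaked profile sampled on x."""
    above = x[y >= 0.5 * y.max()]
    return above.max() - above.min()

# Position axis in nm; the range excludes the neighbouring zeros of sin^2.
x = np.linspace(-400.0, 400.0, 16001)

# Assumed excitation profile: Gaussian with 250 nm FWHM.
sigma = 250.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))
h_exc = np.exp(-x**2 / (2.0 * sigma**2))

def effective_width(zeta, period=500.0):
    """FWHM of h_exc * f for an assumed sin^2 inhibition pattern.

    zeta is the peak inhibition intensity in units of I_sat.
    """
    i_inh = zeta * np.sin(np.pi * x / period) ** 2
    return fwhm(x, h_exc * not_inhibited(i_inh))

widths = {zeta: effective_width(zeta) for zeta in (0, 10, 100, 1000)}
for zeta, width in widths.items():
    print(f"I_max/I_sat = {zeta:5d}: FWHM = {width:6.1f} nm")
```

In this toy model the spot width shrinks monotonically as the peak inhibition intensity grows relative to I_sat, while the signal at the zero itself is untouched. A dye orientation for which the projected inhibition field is weak would correspond to a small effective zeta and hence a much broader spot, which is the orientation problem discussed above.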
3. Efficient calculation of focal intensity distributions
To rigorously optimize feasible intensity patterns near the focal spot, an efficient way is needed to calculate the inhibition light distribution resulting from a given vectorial pupil function A(θ,ϕ). We extend the integrals given by Richards and Wolf for aplanatic lenses to incorporate an arbitrary vectorial pupil function with complex components A_x and A_y along the two transverse directions. The focused electric field of the inhibition light is then given by an integral over the exit pupil of the objective lens:
The matrix M rotates the coordinate system by π/2 about the z-axis, and K is the vectorial part of the diffraction integrals:
and cos ε = cos θ cos θ′ + sin θ sin θ′ cos(ϕ − ϕ′). In this manuscript we will restrict our analysis to pupil functions with uniform circular or linear polarization because other configurations are not easily realized experimentally. The pupil functions are then
for linearly polarized light and
for circular polarization. The normalized radius 0 ≤ r ≤ 1 is given by r = sin θ / sin α, where α is the semi-aperture angle of the objective lens, and P(r, ϕ) is the scalar part of the pupil function, which we can experimentally access with phase and amplitude filters. For efficient optimization we will decompose P(r, ϕ) into Zernike polynomials, which form a complete set of orthogonal functions on the unit disc. These polynomials are usually divided into even and odd parts, with 0 ≤ |m| ≤ n and n − m even. However, it is more convenient for our purposes to renormalize and re-number the Zernike polynomials, obtaining orthonormal polynomials Z̃_i(r, ϕ) with a single index only:
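The index i is a one-to-one mapping of the allowed indices (n,m), given by i = n(n+1)/2 + (m+n)/2. The approximate decomposition of an arbitrary function P(r,ϕ) into a finite number N of Zernike polynomials with expansion coefficients c_i is then

$$P(r,\phi) \approx \sum_{i=0}^{N-1} c_i\, \tilde{Z}_i(r,\phi) \tag{8}$$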
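where the order N has to be chosen according to the degree of complexity in the pupil function. For a given polarization, aperture angle and position r′ in sample space, equation (3) is a linear functional E_inh(r′)[P] on the space of pupil functions. Using equation (8) we can therefore write, in terms of the expansion coefficients c_i of P,

$$\mathbf{E}_{\mathrm{inh}}(\mathbf{r}')[P] \approx \sum_{i=0}^{N-1} c_i\, \mathbf{E}_{\mathrm{inh}}(\mathbf{r}')[\tilde{Z}_i] \tag{9}$$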
In our optimization we will have to calculate the inhibition field at points of interest for a large number of pupil functions. This is efficiently done by precalculating the integrals on the right-hand-side:
and writing the solutions as
with l = 3j + k and k = 0,1,2 denoting the x,y,z component of the electric field. For a known decomposition of an arbitrary pupil function, the field at the points of interest is then given by a simple matrix multiplication
and no integrals have to be solved during optimization.
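The precomputation idea can be sketched in a few lines of Python. The example below is a simplified illustration rather than the calculation used in this work: it replaces the vectorial diffraction integral by a scalar, paraxial focal-plane integral (one component per point instead of three), and it assumes a common normalization for the single-index Zernike polynomials (unit L2 norm on the unit disc). The wavelength of 760 nm, the point positions and all function names are hypothetical.

```python
import numpy as np
from math import factorial, isqrt

def single_index(n, m):
    """Single index i = n(n+1)/2 + (m+n)/2 for 0 <= |m| <= n, n - m even."""
    return n * (n + 1) // 2 + (m + n) // 2

def nm_from_index(i):
    """Inverse of single_index."""
    n = (isqrt(8 * i + 1) - 1) // 2
    return n, 2 * (i - n * (n + 1) // 2) - n

def zernike(i, r, phi):
    """Real Zernike polynomial with unit L2 norm on the unit disc (assumed convention)."""
    n, m = nm_from_index(i)
    a = abs(m)
    radial = sum(
        (-1) ** s * factorial(n - s)
        / (factorial(s) * factorial((n + a) // 2 - s) * factorial((n - a) // 2 - s))
        * r ** (n - 2 * s)
        for s in range((n - a) // 2 + 1)
    )
    angular = np.cos(a * phi) if m >= 0 else np.sin(a * phi)
    norm = np.sqrt((n + 1) / np.pi) if m == 0 else np.sqrt(2 * (n + 1) / np.pi)
    return norm * radial * angular

def pupil_grid(nr=40, nphi=96):
    """Gauss-Legendre nodes in r, uniform nodes in phi, with area weights r dr dphi."""
    nodes, weights = np.polynomial.legendre.leggauss(nr)
    r, wr = 0.5 * (nodes + 1.0), 0.5 * weights
    phi = 2.0 * np.pi * np.arange(nphi) / nphi
    rr, pp = np.meshgrid(r, phi, indexing="ij")
    ww = (wr * r)[:, None] * (2.0 * np.pi / nphi) * np.ones_like(pp)
    return rr, pp, ww

def focal_functional(pupil, points, rr, pp, ww, k_na):
    """Scalar stand-in for the linear functional E_inh(r')[P] at focal-plane points."""
    values = []
    for px, py in points:
        phase = np.exp(-1j * k_na * rr * (px * np.cos(pp) + py * np.sin(pp)))
        values.append(np.sum(pupil * phase * ww))
    return np.array(values)

rr, pp, ww = pupil_grid()
n_basis = 15
basis = [zernike(i, rr, pp) for i in range(n_basis)]
gram = np.array([[np.sum(bi * bj * ww) for bj in basis] for bi in basis])

k_na = 2.0 * np.pi / 760.0 * 1.2   # assumed 760 nm light, NA 1.2; lengths in nm
points = [(100.0, 0.0), (0.0, 100.0), (-100.0, 0.0), (0.0, -100.0)]

# Precompute the functional once per basis function: one column per polynomial.
field_matrix = np.stack(
    [focal_functional(b, points, rr, pp, ww, k_na) for b in basis], axis=1
)

rng = np.random.default_rng(0)
c = rng.normal(size=n_basis) + 1j * rng.normal(size=n_basis)
pupil = sum(ci * bi for ci, bi in zip(c, basis))
direct = focal_functional(pupil, points, rr, pp, ww, k_na)   # integrates again
fast = field_matrix @ c                                       # no integration
```

Because the functional is linear, `fast` and `direct` agree to rounding error, while `fast` costs only a matrix-vector product per candidate pupil function. This is what makes a search over many coefficient vectors affordable.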
4.1. Figure of merit
The goal of the optimization is to identify the pupil function P that produces a strict intensity zero at the focal point while also featuring the steepest doughnut intensity distribution around it. For this purpose a figure of merit (FoM) is defined, which measures this steepness on a suitable length scale and thus reflects the potential of the pupil function for resolution increase. The minimum intensity around the focal point turned out to be a practical choice for the FoM. It is calculated by placing points of interest at a distance d_ER from the central intensity zero and determining the minimal intensity at these points. Their exact position depends on the case investigated. We therefore chose several common situations for our investigation:
In the last two cases it is sufficient to use only a few points if their spacing is much smaller than the wavelength. The distance from the focal spot d_ER should be chosen in the range of the expected resolution. It turned out that its influence on the optimization result is negligible in the range of λ/50 to λ/5, and a value of 100 nm was used because it resulted in reliable and fast convergence.
For the reasons outlined above, only the intensity of the field’s x-component was regarded in the FoM and optimized. For circular polarization this also implies optimization of the y-component, while for linear polarization coverage of other polarization directions necessitates the incoherent combination of at least two beams.
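As a rough illustration of such a figure of merit, the Python sketch below takes a precomputed field matrix whose rows are ordered as l = 3j + k (point j, component k = 0, 1, 2 for x, y, z) and returns the smallest x-component intensity over the points of interest. The ring of points in the focal plane, the random stand-in matrix and the names `ring_points`, `figure_of_merit` and `field_matrix` are assumptions for the example only.

```python
import numpy as np

def ring_points(d_er, count):
    """Points of interest on a ring of radius d_er in the focal plane (z = 0)."""
    angles = 2.0 * np.pi * np.arange(count) / count
    return np.column_stack(
        [d_er * np.cos(angles), d_er * np.sin(angles), np.zeros(count)]
    )

def figure_of_merit(field_matrix, c):
    """Minimum x-component intensity over all points of interest.

    Rows of field_matrix are ordered as l = 3*j + k, so the x components
    (k = 0) are rows 0, 3, 6, ...
    """
    s = field_matrix @ c
    intensity_x = np.abs(s[0::3]) ** 2
    return intensity_x.min()

rng = np.random.default_rng(1)
points = ring_points(100.0, 8)                 # d_ER = 100 nm
n_basis = 12
shape = (3 * len(points), n_basis)
# Random complex stand-in for the precomputed field matrix (illustration only).
field_matrix = rng.normal(size=shape) + 1j * rng.normal(size=shape)
c = rng.normal(size=n_basis) + 1j * rng.normal(size=n_basis)

fom = figure_of_merit(field_matrix, c)
fom_doubled = figure_of_merit(field_matrix, 2.0 * c)
```

Doubling the coefficient vector quadruples the figure of merit, so without some limit on amplitude or power the figure of merit could be increased indefinitely by scaling alone.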
4.2. Algorithm and constraints
Due to the non-quadratic dependence of the FoM on the result vector S, linear optimization algorithms cannot be applied to the problem. An iterative approach was therefore implemented, restricting the space of pupil functions to linear combinations of the first N = 120 polynomials Z̃_i. The algorithm searches a 3N-dimensional space for the vector c in equation (12) with the highest FoM. However, in order to restrict the search to physically reasonable results, several constraints have to be considered. Most obviously, a strict intensity zero at the origin is required. In addition, some form of limitation on the available power is always present. Otherwise, spot sizes could be made arbitrarily small with any inhibition pattern that features an isolated intensity zero. In practice, the following restrictions are common:
- (A) The maximum amplitude in the aperture is limited. This is equivalent to a limitation of the available laser power under the common condition that the aperture’s amplitude distribution is created by phase and transmission filters only.
- (B) The total power the sample can sustain without damage is limited due to photobleaching, trapping effects, thermal instability or other detrimental effects.
There is a very elegant way to restrict the search to the subspace of pupil functions that create a strict intensity zero at the origin. Evaluating E_inh(r′ = 0)[Z̃_i] shows that many polynomials feature an intensity zero at the focus due to inherent symmetries. All others have only one or two non-zero components. It is straightforward to manually remove three polynomials from the basis and replace all others by linear combinations with one of these so that the intensity at the origin is always zero for combinations of the new basis functions. While the new basis is not necessarily orthonormal, the focal field is still the result of a matrix multiplication of the form given in equation (12) but with the number of dimensions reduced to 3(N − 3).
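The basis reduction described above can be sketched generically. The Python example below is a slight generalization and not necessarily the manual procedure used here: it picks three pivot basis functions and subtracts from every other basis function whatever combination of the pivots cancels its field at the origin (in the text, a single pivot per function suffices because of the symmetries). The random matrix `e0`, standing in for the three field components of each basis function at the origin, and the name `zero_constrained_basis` are hypothetical.

```python
import numpy as np

def zero_constrained_basis(e0, pivots):
    """Columns of the returned matrix span pupil functions with zero field at the origin.

    e0     : (3, N) complex array, field components at the origin per basis function
    pivots : indices of three basis functions whose origin fields are linearly independent
    """
    n = e0.shape[1]
    free = [i for i in range(n) if i not in pivots]
    # Pivot coefficients needed to cancel the origin field of each free function.
    coeff = np.linalg.solve(e0[:, pivots], e0[:, free])
    b = np.zeros((n, len(free)), dtype=complex)
    b[free, np.arange(len(free))] = 1.0   # keep each free function itself
    b[pivots, :] = -coeff                 # subtract the cancelling pivot combination
    return b

rng = np.random.default_rng(2)
n_basis = 10
e0 = rng.normal(size=(3, n_basis)) + 1j * rng.normal(size=(3, n_basis))
e0[:, [4, 6, 7, 9]] = 0.0        # functions that vanish at the focus by symmetry
pivots = [0, 1, 2]

b = zero_constrained_basis(e0, pivots)
c_reduced = rng.normal(size=n_basis - 3) + 1j * rng.normal(size=n_basis - 3)
c_full = b @ c_reduced           # coefficients in the original basis
field_at_origin = e0 @ c_full    # vanishes for every choice of c_reduced
```

Any precomputed field matrix can then be multiplied by `b` once, so the optimizer works directly with the reduced coefficient vector and never leaves the zero-intensity subspace.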
Due to the nonlinear nature of the power constraints, they are not as readily implemented. Here, the conditions (A) and (B) were both fulfilled by scaling the pupil function accordingly after each iteration. We used the Metropolis algorithm to randomly browse the whole optimization space and promoted convergence toward the global optimum by simulated annealing. For short-range optimization, a robust simplex search was additionally performed at the end of the annealing process. The maximum step size for each iteration in the Metropolis algorithm was adjusted to ∥Δc∥∞ ≃ 0.02 to 0.05 for optimum convergence rates, and we restarted the annealing process several times to improve coverage of the solution space. The total number of Metropolis iterations was 5 × 10⁵, with a local search and a subsequent restart every 10⁴ iterations.
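A stripped-down version of such a search loop is sketched below in Python. It is a toy under stated assumptions, not the implementation used for the results in this paper: the coefficient vector is real, the field matrix is a random stand-in, the power limit is modelled by rescaling to unit Euclidean norm after every step, the temperature follows a geometric schedule, and the simplex polish and the restarts are omitted. All names and parameter values are hypothetical.

```python
import numpy as np

def fom(a, c):
    """Toy figure of merit: smallest intensity over all rows of a @ c."""
    return np.min(np.abs(a @ c) ** 2)

def rescale(c):
    """Enforce the (assumed) power constraint by scaling to unit Euclidean norm."""
    return c / np.linalg.norm(c)

def anneal(a, c0, steps=20000, step=0.03, t_start=1e-2, t_end=1e-5, seed=0):
    """Metropolis search with a geometric cooling schedule; returns the best point seen."""
    rng = np.random.default_rng(seed)
    c = rescale(c0)
    f = fom(a, c)
    best_c, best_f = c.copy(), f
    for it in range(steps):
        temp = t_start * (t_end / t_start) ** (it / (steps - 1))
        # Random move with max-norm at most `step`, then re-apply the constraint.
        trial = rescale(c + rng.uniform(-step, step, size=c.shape))
        f_trial = fom(a, trial)
        # Always accept improvements; accept deteriorations with Boltzmann probability.
        if f_trial >= f or rng.random() < np.exp((f_trial - f) / temp):
            c, f = trial, f_trial
            if f > best_f:
                best_c, best_f = c.copy(), f
    return best_c, best_f

rng = np.random.default_rng(3)
a = rng.normal(size=(8, 6))          # random stand-in for the field matrix
c0 = rng.normal(size=6)
f_initial = fom(a, rescale(c0))
best_c, best_f = anneal(a, c0)
print(f"initial FoM = {f_initial:.4f}, best FoM found = {best_f:.4f}")
```

With an orthonormal basis the Euclidean norm of the coefficient vector corresponds to the total power in the pupil, which is why rescaling is used here as a stand-in for restriction (B). Restriction (A) would instead require evaluating the maximum amplitude of the pupil function before scaling.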
5.1. Limited wavefront amplitude (A)
The results for this most common situation are shown in Fig. 2. Clear shapes can be identified in the phase and amplitude distributions of the pupil function which were found by the global optimization. The phase distributions mainly consist of two domains which have an average phase difference of π. The boundaries of these domains are of simple shape, mainly circular or straight lines. The result for the XY inhibition pattern and circularly polarized light shows exceptional behavior. Here, the corresponding phase distribution resembles an angular phase ramp that runs linearly from 0 to 2π. The pupil functions for inhibition patterns in the X or Y direction and circularly polarized light are not shown because in these cases the algorithm converges to a result suitable for the whole XY plane. For 3D patterns, the results for linear and circular polarization are very similar.
The lack of perfect symmetry in these results indicates that the algorithm has not yet converged to the global optimum. We therefore attempted to improve the results by finding idealized versions of them, which also allow for easier experimental realization. Figure 2 shows amplitude and phase distributions that are almost flat except directly at the phase domain boundaries. The width of the valley in the amplitude is approximately equivalent to the smallest features representable by the polynomials included in the basis of our optimization. It can therefore be assumed that the algorithm was converging towards the best approximation of a constant amplitude distribution with a discontinuous stepwise phase distribution. In fact this is not surprising, because this choice maximizes the transmitted power participating in the resolution increase.
For our idealization, we therefore chose a constant amplitude and domains of constant phase 0 or π. The domain boundaries were simplified to circles, semi-circles or straight lines. For the XY inhibition pattern and circularly polarized light, we let the phase increase linearly with ϕ. The resulting phase distributions are shown in Fig. 2(c). Where applicable, the relative extent of each phase domain was parametrized and the parameter values were determined to ensure the focal intensity zero and to maximize the FoM. The resulting pupil functions are
For the lateral FoMs we have in the case of circular polarization:
and for linear polarization along the x-axis:
The pupil functions for optimal resolution increase along the optic axis, P_Z, are identical to those for the 3D case, and the values of d and h found for a numerical aperture (NA) of 1.2 (water) are displayed in Fig. 2 along with the intensity of the resulting inhibition fields’ x-component. The idealized patterns for the 3D case and for linear polarization and optimized resolution along a single dimension (Y) had been in use in STED microscopy before our systematic survey. Our findings confirm that for these requirements, the corresponding pupil functions are the optimal choice. The phase pattern of the pupil function found for optimal lateral resolution and circularly polarized light is similar to a Gauss-Laguerre beam. However, the resulting donut-shaped intensity distribution features a tighter zero due to the efficient use of the whole aperture of the lens and all available light. This inhibition pattern has been adopted by most STED microscopes that are designed for optimal isotropic resolution in the focal plane and has led to new resolution benchmarks [19, 14].
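How a domain boundary can be fixed by the zero condition is illustrated by the Python sketch below. It rests on a strong simplification that does not hold quantitatively at NA 1.2: in a scalar, unapodized model the field at the geometrical focus is proportional to the integral of the pupil function over the aperture. Under that assumption an angular phase ramp gives a zero automatically, and a central disc of phase π inside a region of phase 0 gives a zero when both regions have equal area. The function names and the bisection helper are hypothetical.

```python
import numpy as np

def on_axis_field(pupil, nr=400, nphi=256):
    """Scalar on-axis focal field: integral of the pupil function over the unit disc."""
    r = (np.arange(nr) + 0.5) / nr
    phi = 2.0 * np.pi * np.arange(nphi) / nphi
    rr, pp = np.meshgrid(r, phi, indexing="ij")
    return np.sum(pupil(rr, pp) * rr) * (1.0 / nr) * (2.0 * np.pi / nphi)

def vortex(r, phi):
    """Constant amplitude, phase increasing linearly with the azimuth."""
    return np.exp(1j * phi)

def annular_step(d):
    """Constant amplitude, phase pi inside radius d and phase 0 outside."""
    return lambda r, phi: np.where(r < d, -1.0, 1.0) + 0j

def bisect(func, lo, hi, iterations=40):
    """Plain bisection for a sign change of func between lo and hi."""
    f_lo = func(lo)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        f_mid = func(mid)
        if (f_mid > 0) == (f_lo > 0):
            lo, f_lo = mid, f_mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

vortex_field = on_axis_field(vortex)
d_zero = bisect(lambda d: on_axis_field(annular_step(d)).real, 0.1, 0.9)
print(f"|on-axis field| for the phase ramp: {abs(vortex_field):.2e}")
print(f"step radius giving an on-axis zero: {d_zero:.3f}")
```

In this simplified model the step radius comes out close to 1/√2 of the aperture radius. For a high-NA aplanatic lens the pupil is weighted differently in the diffraction integral, so the zero condition has to be evaluated with the vectorial integrals and the resulting parameter values depend on the NA.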
5.2. Limited power (B)
The resulting phase distributions for variable amplitudes showed a behavior similar to that for (A), but the amplitude showed pronounced peaks at the centers of the phase domains. For the same reasons as above, idealized versions of the optimization results were created. We used the same phase distributions as in equations (13)–(17) but allowed for a symmetrical, smooth amplitude variation resembling the outcome of the optimization runs. Different parametrizations were tried, and in each case the parameters were chosen to optimize the FoM. For circular polarization the following pupil functions delivered the best results:
The optimal parameter choices for NA = 1.2 are c_α = 2.14, c_β = 14.3, d = 0.71, α = 0.57 and β = 1.58 for P_3D, and α = 1.02 for P_XY. To some extent, the optimal parameter values for both functions depend on the NA of the lens. The optimization results, idealized pupil functions and resulting intensity patterns are shown in Fig. 3 and are compared to the findings for regime (A) at equal power. The enhancement achieved when allowing variable amplitudes is due to the possibility to strengthen high spatial frequencies that are responsible for fast oscillations in the image plane. Other parametrizations were tried that also resembled the optimization results but allowed for more degrees of freedom in the description of the amplitude distribution. In the case of the lateral FoM, a superset of the above functions was investigated that allowed zero amplitude at the edge of the aperture and a radially adjustable maximum: r^α (1 − r^β)^(α/β). In the axial case a dark ring between the two regimes was allowed. But when optimizing the FoM, the results given in equations (18) and (19) were reproduced.
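The statement that this amplitude profile has a radially adjustable maximum can be checked with a short calculation. The derivation below assumes that the profile reads A(r) = r^α (1 − r^β)^(α/β) with α, β > 0, which is our reading of the expression above; the symbol A(r) is introduced here for convenience only.

```latex
\begin{align*}
A(r) &= r^{\alpha}\left(1 - r^{\beta}\right)^{\alpha/\beta},
  \qquad 0 \le r \le 1,\quad \alpha,\beta > 0,\\[4pt]
\frac{d}{dr}\ln A(r)
  &= \frac{\alpha}{r} - \frac{\alpha}{\beta}\,\frac{\beta\, r^{\beta-1}}{1 - r^{\beta}}
   = \alpha\left(\frac{1}{r} - \frac{r^{\beta-1}}{1 - r^{\beta}}\right),\\[4pt]
\frac{d}{dr}\ln A(r) = 0
  &\;\Longleftrightarrow\; 1 - r^{\beta} = r^{\beta}
  \;\Longleftrightarrow\; r_{\max} = 2^{-1/\beta},\\[4pt]
A(r_{\max}) &= 2^{-\alpha/\beta}\cdot 2^{-\alpha/\beta} = 2^{-2\alpha/\beta}.
\end{align*}
```

Under this reading, A vanishes at r = 0 and at the aperture edge r = 1, the position of the maximum is set by β alone, and α controls how sharply the amplitude is peaked around it.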
We performed a comprehensive search, optimization and characterization of fluorescence inhibition patterns for RESOLFT microscopy relying on single point scanning. By constructing a subspace that enforces an intensity zero at a given point, a global optimization was applied and optimal pupil functions were found. Our results show conclusively that if the maximal amplitude of the pupil function is limited, phase-only pupil functions deliver the best results. The ideal phase masks found for some conditions correspond to pupil functions used in earlier experiments [24, 16], encouraging their ongoing application. However, the optimization identified a novel, superior lateral donut-shaped distribution for circularly polarized light. Its application has led to a resolution of down to 20 nm in biological applications. Our analysis revealed that if the total power focused into the sample is the limiting factor, the highest resolution can be achieved when allowing for phase and amplitude modulation of the inhibition beam’s wavefront. Our findings encourage the use of circularly polarized light in all but a few specialized situations where either experimental conditions or an anisotropic orientation of dye molecules encourage special combinations of linearly polarized beams. A single inhibition pattern cannot efficiently cover all polarization components and all directions around an intensity zero. Here, we have assumed that the contribution from z-polarized molecules is too weak to compromise resolution. As saturation factors increase and resolution approaches a few nanometers, the background of z-polarized molecules will become larger and inhibition fields have to be designed that effectively quench them. Using techniques similar to those outlined in this manuscript, suitable patterns can be found by using radially polarized light. Finally, parallelization of RESOLFT microscopy demands patterns with multiple intensity zeros.
While incoherent, time-multiplexed combinations of the patterns presented here could be used, coherent creation of such field distributions should lead to more economic use of laser power due to synergy effects. The methods introduced here will be an important tool for the efficient design and optimization of pupil functions for the creation of inhibition fields with multiple intensity zeros.
References and links
1. E. Abbe, “Beiträge zur Theorie des Mikroskops und der mikroskopischen Wahrnehmung,” Arch. f. Mikr. Anat. 9, 413–420 (1873).
3. L. Moreaux, O. Sandre, and J. Mertz, “Membrane imaging by second-harmonic generation microscopy,” J. Opt. Soc. Am. B 17, 1685–1694 (2000).
4. A. Schönle, J. Keller, B. Harke, and S. Hell, “Diffraction Unlimited Far-Field Fluorescence Microscopy,” in Handbook of Biological Nonlinear Optical Microscopy, M. Masters and P. So, eds. (Oxford University Press, Oxford, 2007).
5. S. Hell, “Toward fluorescence nanoscopy,” Nature Biotechnol. 21, 1347–1355 (2003).
6. S. Hell, “Strategy for far-field optical imaging and writing without diffraction limit,” Phys. Lett. A 326, 140–145 (2004).
7. S. W. Hell, M. Dyba, and S. Jakobs, “Concepts for nanoscale resolution in fluorescence microscopy,” Curr. Opin. Neurobio. 14(5), 599–609 (2004).
8. S. Hell and A. S., “Nanoscale resolution in Far-Field Fluorescence Microscopy,” in Science of Microscopy, P. Hawkes and J. Spence, eds. (Springer, 2006).
9. S. Hell and A. S., “Nanoscopy: The Future of Optical Microscopy,” in Biomedical Optical Imaging, J. Fujimoto and D. Farkas, eds. (Oxford University Press, Oxford, 2006).
10. M. Hofmann, C. Eggeling, S. Jakobs, and S. Hell, “Breaking the diffraction barrier in fluorescence microscopy at low light intensities by using reversibly photoswitchable proteins,” Proc. Natl. Acad. Sci. USA 102, 17565–17569 (2005).
12. S. W. Hell and M. Kroug, “Ground-state depletion fluorescence microscopy, a concept for breaking the diffraction resolution limit,” Appl. Phys. B 60, 495–497 (1995).
13. V. Westphal and S. Hell, “Nanoscale Resolution in the Focal Plane of an Optical Microscope,” Phys. Rev. Lett. 94, 143903 (2005).
14. G. Donnert, J. Keller, R. Medda, M. A. Andrei, S. O. Rizzoli, R. Lührmann, R. Jahn, C. Eggeling, and S. W. Hell, “Macromolecular-scale resolution in biological fluorescence microscopy,” Proc. Natl. Acad. Sci. USA 103, 11440–11445 (2006).
15. R. Heintzmann, T. M. Jovin, and C. Cremer, “Saturated patterned excitation microscopy - A concept for optical resolution improvement,” J. Opt. Soc. Am. A: Optics and Image Science, and Vision 19, 1599–1609 (2002).
16. T. Klar, E. Engel, and S. Hell, “Breaking Abbe’s diffraction resolution limit in fluorescence microscopy with stimulated emission depletion beams of various shapes,” Phys. Rev. E 64, 066613, 1–9 (2001).
19. K. I. Willig, S. O. Rizzoli, V. Westphal, R. Jahn, and S. W. Hell, “STED-microscopy reveals that synaptotagmin remains clustered after synaptic vesicle exocytosis,” Nature 440, 935–939 (2006).
20. B. Richards and E. Wolf, “Electromagnetic diffraction in optical systems II. Structure of the image field in an aplanatic system,” Proc. R. Soc. Lond. A 253, 358–379 (1959).
21. W. J. Tango, “Circle Polynomials of Zernike and Their Application in Optics,” Appl. Phys. 13, 327–332 (1977).
22. N. Metropolis, A. W. Rosenbluth, M. N. Rosenbluth, A. H. Teller, and E. Teller, “Equation of State Calculations by Fast Computing Machines,” J. Chem. Phys. 21, 1087–1092 (1953).
23. J. A. Nelder and R. Mead, “A Simplex-Method for Function Minimization,” Comput. J. 7, 308–313 (1965).
24. T. Klar and S. Hell, “Subdiffraction resolution in far-field fluorescence microscopy,” Opt. Lett. 24, 954–956 (1999).