A simple image-based autofocusing scheme for digital microscopy is demonstrated that uses as few as two intermediate images to bring the sample into focus. The algorithm is adapted to a commercial inverted microscope and used to automate brightfield and fluorescence imaging of histopathology tissue sections.
© 2008 Optical Society of America
High-density solid-state detector technology, coupled with affordable, terabyte-scale and larger data storage, has greatly facilitated digital recording of industrial and medical images. In the biological and medical realm, digital microscopy has been used for applications such as high-throughput screening, archiving, telemedicine, and rapid information retrieval [1–4]. Digital microscopy is often advantageous over conventional light microscopy since hands-free operation of the microscope can increase throughput, reduce the time and cost of operation, and seamlessly enable data storage and retrieval. However, fully automated microscopy eliminates the optimization brought by a skilled human operator, such as the task of keeping the sample in focus during observation. Although microscopy samples are often relatively thin, on the order of a few micrometers, the high-power objective lenses typically used for imaging have a depth of field that is even smaller. Therefore, in order to maintain the sample at the optimal focal position, the distance between the sample and objective must be dynamically adjusted according to variations in the sample. Automated microscopy necessitates an autofocusing method such that 1) at least one feature of the sample is measured and used to determine the sample position relative to the focal plane; and 2) the measured position is used in a servo fashion to control the position of the objective relative to the sample. Furthermore, the autofocusing method must operate sufficiently fast to minimize acquisition time.
Rapid autofocusing methods can generally be divided into two categories, often referred to as image-based or reflection-based. In the former category, a camera or image sensor is used to obtain an image of the sample at one or more focal positions. From the resulting image data, quantitative parameters, or figures of merit (FOM), such as image contrast, resolution, entropy, or spatial frequency content [5, 6], are extracted that measure the quality of focus. This approach typically requires acquisition of several images along the optical axis. It is quite common to obtain these images through focus while calculating the FOM and choosing the image corresponding to the peak (or valley) of the FOM, or to perform a search that optimizes the collection of images. Since the acquisition of multiple images increases the total scan time, image-based autofocusing methods may be prohibitively slow for high-throughput applications.
Alternatively, reflection-based autofocusing methods introduce an external light source (typically a laser or laser diode) to measure the position of a reference point on the sample, such as the reflection from the air–cover glass interface or the glass slide itself. In this manner, the autofocus feedback loop is independent of the imaging camera and can operate at much higher frequencies. However, a single reference point may not provide sufficient information about the sample to properly bring it into focus, and the presence of multiple reflections can confound the true fiducial reflection.
This manuscript describes an autofocusing method that uses a nominally small number of images to determine the focal position, and does not require acquisition of an image precisely at the focus for the purpose of autofocusing. Although imaging through focus is preferred, it is not required and images can be acquired with coarse sampling along the optical axis. The algorithm is implemented in a digital microscope and used to acquire brightfield and fluorescence images of histopathological samples. It is demonstrated that the algorithm works on a variety of tissue stains and with varying degrees of dye uptake.
2.1 Autofocusing procedure
The autofocusing algorithm is as follows: at a given lateral position on the sample, a quantitative, image-based figure of merit (FOM) is calculated for images acquired at several depths along the optical axis. Rather than choosing the image with the optimal FOM, an empirical function (e.g., Gaussian, polynomial) approximating FOM versus depth is fit to the data. The peak of the fitted function is then taken as an estimate of the focal position and used in a servo-controlled fashion to position the objective lens. The in-focus image is then recorded before the stage is scanned laterally to the next position. Although this approach is relatively well known, the simplicity of our implementation stems from the combination of FOM (the Brenner gradient, described below) and curve fit (the reciprocal of a second-order polynomial), which together require no more than three images to calculate the focal position.
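The fit-and-peak step of this procedure can be sketched in a few lines. In this illustrative version (function name ours), a `measure_fom(z)` callable stands in for stage motion, image acquisition, and FOM computation combined, and the fit is a parabola to the reciprocal of the FOM, as used in our implementation:

```python
import numpy as np

def estimate_focus(measure_fom, z_positions):
    """Estimate the focal position from FOM samples at a few depths.

    Fits a parabola to the reciprocal of the FOM -- appropriate when the
    FOM is approximately Lorentzian in z, as observed for the Brenner
    gradient -- and returns the vertex of the fit as the focus estimate.
    """
    z = np.asarray(z_positions, dtype=float)
    fom = np.array([measure_fom(zi) for zi in z], dtype=float)
    a, b, _ = np.polyfit(z, 1.0 / fom, 2)  # parabola: a*z^2 + b*z + c
    if a <= 0:
        raise ValueError("reciprocal fit is not convex; no focus found")
    return -b / (2.0 * a)  # parabola vertex = estimated focal position
```

With a Lorentzian-shaped FOM, three samples anywhere on the curve recover the peak position exactly, which is the basis of the three-image focusing described below.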
2.2 Figures of merit
A variety of algorithms are available to quantify a FOM for focus [5, 6]. Qualitatively, an in-focus image exhibits high contrast, a wide range of intensity values, and sharp edges. Quantitatively, a FOM should have a distinct peak at the focus and decrease monotonically and symmetrically on either side of the focal plane. Sun et al. provided a list of FOMs as well as a set of evaluation criteria for ranking them. Accuracy is clearly of utmost importance; in automated, high-throughput microscopy, where reducing total data acquisition time is important, minimizing the computation time is also critical. We chose to implement and compare algorithms that are both intuitive and computationally simple. The first of these is image contrast, defined as:
\[ C = \frac{s_{\max} - s_{\min}}{s_{\max} + s_{\min}} \tag{1} \]

where s_max and s_min are the maximum and minimum grayscale pixel values, respectively. Another statistical FOM is the image variance, normalized by the mean μ to account for intensity fluctuations:
\[ V = \frac{1}{N M \mu} \sum_{i=1}^{N} \sum_{j=1}^{M} \left[ s(i,j) - \mu \right]^{2} \tag{2} \]

where s(i, j) is the grayscale pixel value at coordinates (i, j), and N and M represent the number of pixels in the i and j directions, respectively. The histogram of the pixel values can be used to measure the image entropy:
\[ E = -\sum_{k} p_k \log_2 p_k \tag{3} \]

using the probability p_k of a pixel having intensity in histogram bin k. The last FOM is known as the Brenner gradient:

\[ B = \sum_{i=1}^{N-m} \sum_{j=1}^{M} \left[ s(i+m, j) - s(i, j) \right]^{2} \tag{4} \]
The Brenner gradient is a fast, rudimentary edge detector, measuring the difference between a pixel and a neighbor that is typically two (m = 2) pixels away. A comparison of the FOMs [Eqs. (1)–(4)] is shown in Fig. 1 for a data set obtained through the focal plane. Of the four FOMs described, we found the Brenner gradient to be the most sensitive, demonstrating a sharp, distinct peak at the focus and dropping rapidly away from it. This agrees with the results of Sun et al. We therefore selected the Brenner gradient as the FOM to optimize for image-based autofocusing.
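All four FOMs are inexpensive to compute on a grayscale image array. The following NumPy sketch (function names ours) implements Eqs. (1)–(4) as described above:

```python
import numpy as np

def contrast(img):
    """Eq. (1): contrast from the extreme grayscale pixel values."""
    s_max, s_min = float(img.max()), float(img.min())
    return (s_max - s_min) / (s_max + s_min)

def normalized_variance(img):
    """Eq. (2): image variance normalized by the mean intensity."""
    mu = img.mean()
    return ((img - mu) ** 2).mean() / mu

def entropy(img, bins=50):
    """Eq. (3): Shannon entropy of the intensity histogram."""
    counts, _ = np.histogram(img, bins=bins)
    p = counts[counts > 0] / counts.sum()  # probability per occupied bin
    return float(-(p * np.log2(p)).sum())

def brenner(img, m=2):
    """Eq. (4): sum of squared differences between pixels m rows apart."""
    d = img[m:, :] - img[:-m, :]  # difference along the i (row) index
    return float((d ** 2).sum())
```

For example, a flat image yields a Brenner gradient of zero, while an image containing a sharp edge yields a large value, reflecting the FOM's behavior as an edge detector.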
2.3 Model-based curve fitting
In order to minimize the number of images required to estimate the focal position, autofocusing is performed by interpolation using an analytical model that accurately follows the FOM versus depth, f(z). The focal position is taken as the peak (or valley) of the fitted function, obviating the need to acquire an image at or near the focus. The model need not provide a good fit with negligible residuals; rather, the peak position of the fit should accurately predict the position where the FOM would peak. In contrast to Ref. , where a Gaussian model was used as an approximation of the axial point spread function, we choose a model based on empirical observation of f(z). Over a limited range close to the focal plane, a polynomial fit may also closely approximate f(z). For instance, we have found that a fourth-order polynomial provides a reasonable fit to the variance FOM. The polynomial fit has the benefit of established, rapid implementations in commercial software. The disadvantage is that an nth-order function requires n+1 images to be acquired, drastically increasing image acquisition time for higher-order polynomials. Further, while a polynomial may provide a good fit, autofocusing requires careful handling of confounders, such as complex roots or roots that have no physical interpretation. Examples include the absence of a maximum, or a maximum substantially outside of the depth of focus.
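For the polynomial alternative, the confounders mentioned above can be handled explicitly. A sketch of one possible guard logic (function name ours, assuming a fourth-order fit as for the variance FOM): complex roots are discarded, roots outside the scanned range are rejected as non-physical, and only true maxima are kept.

```python
import numpy as np

def focus_from_quartic(z, fom, z_lo, z_hi):
    """Fit a 4th-order polynomial to FOM(z) and return the location of its
    maximum, rejecting complex roots and roots outside the scanned range."""
    coeffs = np.polyfit(z, fom, 4)
    crit = np.roots(np.polyder(coeffs))           # critical points of the fit
    crit = crit[np.isreal(crit)].real             # discard complex roots
    crit = crit[(crit >= z_lo) & (crit <= z_hi)]  # discard non-physical roots
    # Keep only true maxima (negative second derivative)
    crit = crit[np.polyval(np.polyder(coeffs, 2), crit) < 0]
    if crit.size == 0:
        raise ValueError("no physical maximum within the scanned range")
    return crit[np.argmax(np.polyval(coeffs, crit))]
```

Note that n+1 = 5 images are the minimum for this fit, compared to three for the quadratic fit described next.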
We have empirically observed that in the case of the Brenner gradient, f(z) can be approximated by a Lorentzian function:

\[ f(z) = \frac{a}{(z - z_0)^2 + b} \tag{5} \]

where z_0 is the focal position and a and b are fit parameters.
Noting that the reciprocal of the Lorentzian is a quadratic function, one can fit a parabola to the reciprocal of the Brenner gradient, requiring only three images to determine the focal plane. A fourth image can then be taken once the sample is brought to the focal plane of the objective lens. We will describe below how autofocusing can be achieved using only two images to bring the sample into focus.
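Concretely, given exactly three samples (z_i, B_i) of the Brenner gradient, the parabola through the reciprocal points (z_i, 1/B_i) has a closed-form vertex, so no iterative fitting is needed. A sketch (function name ours):

```python
def focus_from_three(z, brenner_vals):
    """Vertex of the parabola through the three points (z_i, 1/B_i),
    which estimates the focal plane when B(z) is Lorentzian [Eq. (5)]."""
    z1, z2, z3 = z
    y1, y2, y3 = (1.0 / b for b in brenner_vals)  # reciprocal of the FOM
    denom = (z1 - z2) * (z1 - z3) * (z2 - z3)
    # Coefficients of the interpolating parabola a*z^2 + b*z + c
    a = (z3 * (y2 - y1) + z2 * (y1 - y3) + z1 * (y3 - y2)) / denom
    b = (z3 ** 2 * (y1 - y2) + z2 ** 2 * (y3 - y1) + z1 ** 2 * (y2 - y3)) / denom
    if a <= 0:
        raise ValueError("samples do not bracket a focus minimum in 1/B")
    return -b / (2.0 * a)  # vertex = estimated focal position
```

Because the three sample positions need not bracket or even approach the focus, the z spacing can be coarse, which is what allows autofocusing without acquiring an image near the focal plane.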
2.4 Software and Instrumentation
A commercial inverted brightfield microscope (Axiovert 100, Carl Zeiss) was adapted with an XYZ translation unit (MS2000, Applied Scientific Instrumentation). Köhler illumination was provided by a 100 W halogen lamp. A high numerical aperture (20X, 0.80 NA) air-coupled objective lens (Plan-Apochromat, Zeiss) was used. Images were acquired at approximately 1 frame per second (10 ms exposure) using a 14-bit, 4-megapixel digital camera (pco.2000, Cooke Corp.). Custom image acquisition and instrument control software was written in LabVIEW 8.1 (National Instruments) to synchronize scanning, autofocusing, and data collection. Lateral scanning of the microscope stage was synchronized with image acquisition, allowing for unsupervised collection of digital histopathology image mosaics. Servo-controlled autofocusing using curve fitting of the Brenner gradient was performed at each lateral location prior to saving the in-focus image.
Fluorescence microscopy was performed on a second system (Nikon TE2000-U) with a 20X, 0.45 NA Plan Fluor objective lens. The same algorithm was used, but the camera exposure time varied with the sample and label: DAPI-labeled nuclei (205 ms exposure), Cy3-labeled smooth muscle actin (205 ms), and Cy3-labeled β-catenin (3.28 s). Autofocusing was performed in the emission band of the fluorophore. For dual-labeled samples, the entire mosaic was collected in one channel before switching filters and collecting the second channel.
A comparison of the aforementioned FOMs [Eqs. (1)–(4)] is shown in Fig. 1 for a through-focus image stack (51 images, 1 µm steps) of a typical histopathology slide. Each FOM was normalized by its maximum to allow for quantitative comparison. The Brenner gradient required no additional post-processing of the raw image data, whereas the others required additional processing before the FOM could be calculated. For the variance, all images were thresholded to exclude pixels below 1.2 times the minimum value and above 0.8 times the maximum. Calculation of the image contrast [Eq. (1)] on the entire image (not shown) yielded minimal differences between in-focus and out-of-focus images due to white space (roughly equal values of s_max) and dead pixels (equal values of s_min) in every image. Instead, a 400 × 400 pixel subset at the center of the image was used. The histogram used to calculate entropy comprised 50 bins. The Brenner gradient is clearly substantially narrower (i.e., more sensitive to focus) and more symmetric than the remaining FOMs, and drops monotonically from its peak at the focus. The absence of prominent local maxima improves the robustness of the autofocusing algorithm. We did not observe differences when the Brenner gradient was applied along the orthogonal direction [along the j index of Eq. (4)], or when both directions were taken into account. These results, albeit less comprehensive, are similar to those of Sun et al., who also confirmed the favorable performance of the Brenner gradient among the selected FOMs.
Having demonstrated the superior fidelity of the Brenner gradient among these simple FOMs, we aimed to minimize, via curve fitting, the number of images required to bring the sample into focus. Figure 2(A) shows the Brenner gradient as a function of the relative position between the sample and objective lens, empirically observed to be of Lorentzian form [Eq. (5)]. In this particular example, the peak of a Lorentzian fit placed the optimal image at a relative position of 3.53 µm. Since the Lorentzian function is the reciprocal of a quadratic, the reciprocal of the Brenner gradient versus depth may be fit by a parabola whose minimum occurs at the focal plane. Therefore, only three auxiliary images along the optical axis are required to estimate the position of the focus. In comparison to Fig. 2(A), only three images from the same data set were used to fit the parabola in Fig. 2(B), resulting in a minimum at 3.99 µm. The 0.46 µm difference between the focus positions determined by the two methods is within the depth of focus of this objective lens, indicating that both methods are comparably accurate.
Implemented for automated focusing, the position of the valley of the parabola is provided as the target for a servo-controlled objective piezo to bring the sample into focus. Figure 3 demonstrates single-shot convergence to the focal plane, starting from greater than 10 µm outside of focus. Following the initial application of autofocus, the sample position changed minimally, by less than 0.2 µm, in subsequent attempts. There was no noticeable change in image quality, and the value of the Brenner gradient changed negligibly after the first attempt.
Following this validation of the algorithm, an automated mosaic of a hematoxylin and eosin (H&E) stained colon section was acquired, as shown in Fig. 4. This composite image comprises 441 individual tiles with a 650 µm lateral step size between images. The net acquisition time was 36 minutes, which scales approximately linearly with the number of tiles. The mosaic appears in focus at all levels of magnification, from a panoramic view to high digital zoom. Hematoxylin-stained nuclei are in sharp focus even at the highest magnification, and subcellular features critical to histopathological diagnosis are discernible. Figure 5 illustrates a sample of images across different tissue types and stain levels. Although as an edge detector (see Fig. 2 insets) the algorithm prefers sharply contrasting features such as nuclei, it remains robust for weakly stained tissue. We have successfully obtained images of tissue sections stained with only eosin [Fig. 5(C)], which lack the high contrast of nuclear staining by hematoxylin. Autofocusing may also be performed for digital recording of tissue microarrays [Fig. 5(E)].
Autofocused images of fluorescently labeled colon samples are shown in Fig. 6. The principal differences between autofocusing in the two modalities are that fluorescence is limited to a narrower spectral band than brightfield imaging, and care must be taken to minimize exposure time to avoid photobleaching or photodamage of the sample. The limited bandwidth did not appear to impact the algorithm, which was successful with a variety of fluorophores, including Cy3 and DAPI, across various targets. To avoid photobleaching, the intermediate images used to bring the sample into focus were obtained with a shorter exposure time than the in-focus image. Minimizing exposure did not noticeably impact the performance or image quality. It was also possible to acquire images of unlabeled samples based on the intrinsic fluorescence of the sample (not shown). Due to the substantially weaker autofluorescence relative to labeled samples, exposure times were increased to 6.55 seconds. In principle, autofocusing based on autofluorescence can also be used to minimize fluorophore photobleaching, assuming that the excitation wavelengths do not overlap.
We have demonstrated a simple, robust autofocusing algorithm for digital microscopy and applied it to brightfield and fluorescence imaging of pathology slides. The algorithm may be readily applied to any existing digital microscope with a motorized z-axis stage. The total imaging time at each lateral position was approximately 5 seconds, the bulk of which was spent recording the three auxiliary images and the primary, in-focus image, as well as on stage settling. Image acquisition using lower-level software could dramatically expedite the algorithm, as many existing high-resolution cameras are capable of recording at several frames per second. Axial positioning of the sample can be accelerated with an objective piezo with a high resonant frequency.
Image acquisition can be further accelerated by using the in-focus image at one lateral position as the first auxiliary image at the subsequent lateral position. By eliminating one of the three auxiliary images, the total acquisition time at each position is reduced by approximately 25%. Although this approach requires a slight lateral shift between the auxiliary images used to calculate the focal position, we have observed empirically, in over a hundred samples thus far, that the fidelity of autofocusing is not compromised. In this case, care must be taken to minimize image blur, by ensuring that lateral motion is less than a pixel dimension during the camera integration time.
Our approach of curve fitting with a parabola contrasts with slower or more computationally intensive methods of identifying the peak of the focus function, such as recursively searching for the maximum or directly measuring the peak by increasing the number of images acquired through focus. Further, it is easy to implement on any existing microscope, eliminating the need for a dedicated, and potentially costly, autofocusing module. The algorithm was demonstrated for brightfield and fluorescence microscopy, but may in principle be applicable to other microscopy modalities, although its use in phase contrast or darkfield microscopy has not yet been explored. For fluorescence microscopy, a hybrid imaging approach is feasible, whereby brightfield imaging brings the sample into focus and a fluorescence image is subsequently acquired, minimizing the likelihood of sample photobleaching.
The authors gratefully acknowledge Max Seel, Denise Hollman-Hewgley and Michael Gerdes for preparing and providing histological samples.
References and links
1. R. S. Weinstein, M. R. Descour, C. Liang, A. K. Bhattacharyya, A. R. Graham, J. R. Davis, K. M. Scott, L. Richter, E. A. Krupinski, J. Szymus, K. Kayser, and B. E. Dunn, “Telepathology overview: From concept to implementation,” Hum. Pathol. 32, 1283–1299 (2001) [CrossRef]
2. M. G. Rojo, G. B. Garcia, C. P. Mateos, J. G. Garcia, and M. C. Vicente, “Critical comparison of 31 commercially available digital slide systems in pathology,” Intl. J. Surg. Pathol. 14, 285–305 (2006) [CrossRef]
3. D. L. Taylor, E. S. Woo, and K. A. Giuliano, “Real-time molecular and cellular analysis: the new frontier of drug discovery,” Curr. Opin. Biotechnol. 8, 1085–1093 (2001)
7. V. Della Mea, F. Viel, and C. A. Beltrami, “A pixel-based autofocusing technique for digital histologic and cytologic slides,” Comput. Med. Imag. Grap. 29, 333–341 (2005) [CrossRef]
9. S. K. Nayar and Y. Nakagawa, “Shape from focus,” IEEE Trans. Pattern Anal. Machine Intell. 16, 824–831 (1994) [CrossRef]
10. J. F. Brenner, B. S. Dew, J. B. Horton, T. King, P. W. Neurath, and W. D. Selles, “An automated microscope for cytologic research a preliminary evaluation,” J. Histochem. Cytochem. 24, 100–111 (1976) [CrossRef] [PubMed]