This study investigates the use of two spectroscopic techniques, auto-fluorescence lifetime measurement (AFLM) and light reflectance spectroscopy (LRS), for detecting invasive ductal carcinoma (IDC) in human ex vivo breast specimens. AFLM used excitation at 447 nm with multiple emission wavelengths (532, 562, 632, and 684 nm), at which auto-fluorescence lifetimes and their weight factors were analyzed using a double-exponential model. LRS measured reflectance spectra in the range of 500-840 nm and analyzed the spectral slopes empirically at several distinct spectral regions. Our preliminary results based on 93 measured locations (i.e., 34 IDC, 31 benign fibrous, 28 adipose) from 6 specimens show significant differences in 5 AFLM-derived parameters and 9 LRS-based spectral slopes between benign and malignant breast samples. Multinomial logistic regression with a 10-fold cross validation approach was implemented with selected features to classify IDC from benign fibrous and adipose tissues for the two techniques independently as well as for the combined dual-modality approach. The accuracy for classifying IDC was found to be 96.4 ± 0.8%, 92.3 ± 0.8% and 96 ± 1.3% for LRS, AFLM, and dual-modality, respectively.
©2012 Optical Society of America
Breast cancer is one of the most common forms of cancer among American women, with an estimated 230,480 new cases and 39,520 deaths in 2011 alone. With the advancement in diagnostic techniques, it is now possible to diagnose breast cancer in early stages while it is still localized. A standard treatment for women with early breast cancer is breast conserving therapy, a surgical procedure known as lumpectomy (or partial mastectomy) followed by irradiation therapy. Surgery is central to treating breast cancer, with many early stage patients being cured without recurrence. The goals of the surgery include complete resection of the primary tumor, with negative margins to reduce the risk of local recurrences. However, owing to the lack of definitive tools for intraoperative assessment of the cancer margin during lumpectomy, positive margins occur in 20-50% of patients who undergo the procedure [3,4]. Patients with positive margins must undergo a second surgery, leading to a higher risk of wound infection, associated psychological distress, compromised cosmesis, and added medical expenses. Hence, an accurate diagnostic tool that helps in assessing these margins intraoperatively is essential.
Owing to the nature of light and the way it interacts with different tissue types, many optical techniques have been evaluated for diagnosing breast cancer for more than a decade. In particular, auto-fluorescence spectroscopy (AFS) and diffuse reflectance spectroscopy (DRS) [also termed light reflectance spectroscopy (LRS)] have been developed and extensively evaluated by multiple groups of researchers to achieve clinically relevant results for cancer demarcation [6–12]. Many of these studies focused on breast cancer: some on the detection of breast cancer for clinical diagnosis [6–9], and others on surgical margin detection [10–12]. Brown et al. recently developed a fiber-based DRS imaging system for breast cancer margin detection, with an overall sensitivity and specificity of 79.4% and 66.7%, respectively. Around the same time, Keller et al. evaluated DRS and AFS using a point-based approach with sensitivity and specificity of 85% and 96%, respectively, while also demonstrating the feasibility of converting the point-based approach into an imaging system for larger area assessment. Although auto-fluorescence spectroscopy provides good contrast, as illustrated in the above-mentioned studies, measurement of auto-fluorescence lifetime is another robust method. Since fluorescence takes place in a few nanoseconds, measurement of fluorescence lifetime provides much information on the mechanisms that lead to chemical or biochemical processes. Accordingly, auto-fluorescence lifetime measurements (AFLM) [13–16] have been widely used to provide auto-fluorescence lifetime imaging for cancer diagnosis, specifically for microscopic assessment of fixed breast cancer tissues. Overall, optically based measurements, such as LRS, AFS, and AFLM, can provide direct optical assessment of the tissue and its surrounding environment.
Measurements of selected auto-fluorophores offer abundant information on the cellular dynamics of the tissue under investigation. Non-fluorescent light absorbers (such as hemoglobin molecules) and scatterers (such as cells and sub-cellular organelles making up the tissue) modulate the incident light. Consequently, the resulting fluorescent signals or their lifetimes and the diffuse reflectance spectra collected from tissue samples offer valuable information regarding the health of the tissue.
Note that there has been a significant amount of research in fluorescence imaging, especially in small animals, where the goal is to obtain tissue or cellular information through a certain tissue depth (e.g., several mm to 1-2 cm). In that case, simple mapping of fluorescence is not accurate or quantitative enough without image reconstruction. In contrast, our study focuses on measurements from the tissue surface, without trying to achieve much depth resolution; therefore, no image reconstruction is involved in this study.
In this feasibility study, we wish to explore the optical characteristics of AFLM and LRS of breast cancer in order to determine if LRS can serve either as a stand-alone method or as a combined approach with AFLM for breast cancer detection and classification. A protocol was designed to obtain measurements from freshly excised breast samples. A fiber optic probe of 1 mm diameter was utilized in collecting AFLM and LRS from human ex vivo breast specimens of 7 cases with invasive ductal carcinoma (IDC). Three tissue types were measured, namely, IDC, benign fibrous tissue (FT) and adipose tissue (AT). Specifically, the study used a custom-made dual-modality needle-like fiber optic probe and quantified the auto-fluorescence lifetime in a relatively less explored wavelength range, with the excitation wavelength at 447 nm and emissions at 532 nm, 562 nm, 632 nm, and 684 nm, targeting flavins, lipo-pigments, and porphyrins as fluorophores. We also evaluated LRS using an empirical analysis of tissue spectra, by calculating segmented spectral slopes at every 20-nm interval from 500 nm to 840 nm. All the analyzed parameters obtained with each modality were first used to identify the best feature set for classification, using a feature selection algorithm. These distinct feature sets were then fed into a classification algorithm to classify IDC, FT, and AT. For each of these tissue types, the classification performance was assessed by sensitivity (Sn), specificity (Sp), accuracy (Acc), and area under the curve (AUC) of a receiver operating characteristic (ROC) curve. Such evaluations were performed independently for each modality and for the combined dual-modality in order to examine the necessity of a dual-modality approach.
The motivation and goal of this pilot study were to examine the feasibility of finding optical pre-biomarkers through either or both of the two optical modalities that could serve as intrinsic classifiers for IDC, which is the most common form of breast cancer in lumpectomies. If such pre-biomarkers can be found in a prompt time frame, such methodology may have the potential to become a quick assessment tool for accurate detection of positive breast cancer margins during breast conserving surgery. Although our study currently implemented a needle-like probe geometry for point measurement, these techniques can be implemented for surface imaging, for example, by employing a time-gated intensified charge-coupled device (ICCD) camera for AFLM, or by using a multispectral camera for LRS.
2.1. Instrumentation and experimental setup
Figure 1(a) shows the overall instrument flow-chart and experimental setup. To implement AFLM, a single-channel, time correlated single photon counting (TCSPC) system was custom-made (ISS Inc., Champaign, IL) to accommodate measurements using a fiber-optic probe. This system consisted of a supercontinuum pulsed laser source (SC-450, Fianium Inc., Eugene, Oregon), with a 5 picosecond (ps) pulse width and a broadband spectrum (~440 nm to 2 μm) at a repetition rate of 20 MHz. The laser light was coupled to an excitation chamber which contained a stepper motor driven 5-slot filter wheel for excitation wavelength selection and a continuously variable neutral density (ND) filter for excitation light intensity control. In this study, the excitation wavelength was selected using a 447(60) nm bandpass filter, where the value within the parentheses represents the bandwidth of the bandpass filter. The system also included another 5-slot emission filter wheel inside an emission chamber, and 4 emission wavelengths were selected using the following bandpass filters: 532(10) nm, 562(40) nm, 632(22) nm, and 684(24) nm. The filtered emission light was directed to a cooled photomultiplier tube (PMT) (PMC 100, Becker & Hickl GmbH) with sensitivity to a spectral range of 185–820 nm.
Instrument automation was achieved through a personal computer memory card (PCMC) (ISS Inc., USA) and a motion control box. A PC-based single photon counting card (SPC-130, Becker & Hickl GmbH) was used to achieve precise synchronization between the incoming laser pulse and the photon event. The laser was coupled through the excitation chamber to the source fiber (core diameter of 100 μm), which was part of a custom-made, quad-furcated optical probe (FiberTech Optica Inc., Montreal, QC, Canada). The resulting fluorescence emission was collected through the detector fiber (core diameter of 400 μm), also within the quad-furcated probe, as illustrated in Fig. 1(b). The core-to-core distance between the source and detector fibers was 292 μm.
Also shown in Fig. 1(a), the instrument for LRS consisted of a tungsten-halogen light source (HL2000HP, Ocean Optics, Dunedin, FL, USA), and a CCD array spectrometer (USB 2000+, Ocean Optics, Dunedin, FL USA), connected to the desktop computer. The spectrometer had a spectral range from 475 to 1100 nm and a spectral resolution of ~4 nm due to a 100-µm slit width; no other filter was utilized for LRS measurement. The same quad-furcated fiber optic probe, as seen in Fig. 1(b), also contained two other fibers of 200 μm in diameter with a core-to-core distance of 370 μm. These two fibers served as source and detector fibers for LRS.
The probe was fixed on a rigid probe holder during the measurement in order to control the probe placement and to minimize the pressure on the tissue surface, as shown in Fig. 1(c). For each measured point, the probe was carefully placed just in contact with the tissue, without applying any pressure. However, there was no quantitative measure of the actual pressure applied on the tissue. Both modalities (LRS and AFLM) were combined through the quad-furcated probe with an outer diameter of 1-mm at the tip.
2.2. Measurement protocol
AFLM and LRS readings of human breast cancer specimens were acquired at The University of Texas Southwestern (UTSW) Medical Center, Dallas, TX. The optical measurement protocols were compliant with the UTSW IRB requirements. The data were collected ex vivo from the breast specimens immediately after their resection through mastectomies or lumpectomies. Selection criteria for this pilot study included tissue samples with biopsy-confirmed IDC, with no prior exposure to chemotherapy, and having a tumor size of 5 mm in at least one dimension confirmed via imaging history. Seven such breast cases were selected for collecting optical measurements.
Prior to measurement, the excised breast samples were inked at the margins by the pathologist as per standard pathology protocol. Based on the surgical markings on the sample, multiple cuts were made across the breast tissue, and the tumor was located visually; Fig. 1(c) shows an example. Further, after the pathologist visually identified regions of cancer tissue, benign fibrous tissue, and adipose tissue, multiple (approximately 3-6) AFLM and LRS readings were obtained on each of the three pathological regions. AFLM and LRS readings were taken sequentially (in no particular order) at each measurement point. The integration time for AFLM was 5 sec per emission wavelength for IDC and FT, and 10 sec per emission wavelength for AT. For LRS measurements, up to 100 ms integration times were used. With manual switching between the two modalities, the acquisition time was about 60 sec per measured point. After data acquisition, the measured regions (i.e., three pathological types of tissues) were marked with ink, sliced out from the rest of the breast tissue, and sent for histological analysis. The corresponding histology results for all of the specimens sent were later obtained, confirming 100% correctness in initial identification of breast cancer versus benign tissues. According to the histology results, the measured optical points were categorized as cancer or controls.
One of the 7 cases was excluded from the analysis due to contamination from Isosulphan Blue, a surgical dye commonly used to trace the lymphatic drainage during surgery. The presence of the dye in this case was confirmed by the occurrence of an odd spectral feature around 570 nm to 650 nm in the LRS data . Therefore, a total of 93 locations across 6 breast cancer cases were measured: 34 from IDC regions, 31 from FT regions and 28 from AT regions.
Figure 2(a) is a histopathology illustration of invasive ductal carcinoma showing malignant infiltrative ducts with blue cellular nuclei in a background of abundant purplish stroma. The tumor can be viewed differently from normal breast tissue on physical examination as well as histomorphological evaluation. The overall histological alteration is evident, from a homogeneous stroma, as seen in Fig. 2(b), and two-layer epithelium ductal structure, not shown here, to a highly heterogeneous stroma admixed with many disorganized ducts, as demonstrated in Fig. 2(a).
2.3. Data analysis
2.3.1. AFLM analysis
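The measured fluorescence decay at each emission wavelength was fitted with a double-exponential model, Eq. (1):

$$I(t) = a_1 e^{-t/\tau_1} + a_2 e^{-t/\tau_2} + c \qquad (1)$$

where I(t) represents normalized lifetime intensity, τ1 and τ2 are the lifetimes of the individual exponential components, and a1 and a2 are their respective weights. A constant term c was added to account for the baseline noise.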
The data fitting was performed in Matlab (The Mathworks, Inc., Natick, MA) by implementing a nonlinear least squares curve fitting method. Also, the intensity-weighted mean lifetime (τm) was calculated as per Eq. (2):

$$\tau_m = \frac{a_1 \tau_1^2 + a_2 \tau_2^2}{a_1 \tau_1 + a_2 \tau_2} \qquad (2)$$
Thus, five parameters of τ1, τ2, a1, a2 and τm were calculated for each of the four emission wavelengths, giving us 20 parameters at each measured tissue location. Each of these parameters was further evaluated for significant differences between IDC and other two benign breast tissue types (FT and AT) using a linear mixed model regression analysis for repeated measures, implemented in SAS software (SAS Institute Inc., NC, USA).
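As an illustration of how the five per-wavelength parameters relate to each other, the following minimal Python sketch evaluates a double-exponential decay with a constant baseline and computes the mean lifetime. It assumes the common intensity-weighted definition τm = (a1τ1² + a2τ2²)/(a1τ1 + a2τ2); the function names and the numerical values are hypothetical and are not taken from the study, whose fitting was done in Matlab.

```python
import math

def biexp(t, a1, tau1, a2, tau2, c=0.0):
    """Double-exponential decay with a constant baseline term c."""
    return a1 * math.exp(-t / tau1) + a2 * math.exp(-t / tau2) + c

def mean_lifetime(a1, tau1, a2, tau2):
    """Intensity-weighted mean lifetime of a two-component decay (assumed definition)."""
    return (a1 * tau1 ** 2 + a2 * tau2 ** 2) / (a1 * tau1 + a2 * tau2)

# Hypothetical fitted values (lifetimes in ns), for illustration only
params = dict(a1=0.7, tau1=1.0, a2=0.3, tau2=4.0)
tau_m = mean_lifetime(**params)
print(f"tau_m = {tau_m:.3f} ns")
```

With a single component (a2 = 0), this mean lifetime reduces to τ1, which is a quick sanity check on any fitted parameter set.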
2.3.2. LRS analysis
Each acquired LRS spectrum was divided by the calibration reflectance spectrum obtained from a diffuse reflectance standard (WS-1, Ocean Optics, FL, USA). While the acquired LRS included data from 475 to 1100 nm, we selected the spectral segments between 500 and 840 nm for further analysis. This selection was based on our previous quantification algorithms for LRS  and the fact that the signal to noise ratio of the system falls off outside this range. However, we acknowledge that it is possible to include the wavelengths beyond 500-840 nm in the analysis. Given the chosen spectral region, each spectrum was divided into multiple 20-nm segments [as marked by dashed lines in Fig. 4(a) below]. A spectral slope of each region was calculated using linear regression, resulting in 17 slopes (S1 to S17) for each measured spectrum: S1 representing the slope in 500-520 nm region, S2 representing the slope in 520-540 nm region, and so on. Each slope was then compared among all three breast tissue types, followed by statistical significance tests using the same linear mixed model regression analysis approach, as done for AFLM data.
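The segmented-slope calculation described above can be sketched in a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, the treatment of window boundaries (both endpoints included) is an assumption, and the spectrum is synthetic rather than measured.

```python
def linear_slope(x, y):
    """Ordinary least-squares slope of y against x."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    return sxy / sxx

def segment_slopes(wavelengths, reflectance, start=500, stop=840, width=20):
    """Slope of the calibrated spectrum in each [lo, lo + width] nm window."""
    slopes = []
    for lo in range(start, stop, width):
        hi = lo + width
        pts = [(w, r) for w, r in zip(wavelengths, reflectance) if lo <= w <= hi]
        xs, ys = zip(*pts)
        slopes.append(linear_slope(xs, ys))
    return slopes

# Synthetic calibrated spectrum: a straight line with slope 0.002 per nm
wl = list(range(475, 1101))
refl = [0.1 + 0.002 * (w - 475) for w in wl]
slopes = segment_slopes(wl, refl)  # S1 ... S17
print(len(slopes))
```

Because each slope is computed within a narrow window, a real spectrum would give 17 different values that trace the local shape of the absorption and scattering features.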
2.4. Data classification and accuracy assessment
As explained above in Subsection 2.3, our analysis resulted in 20 lifetime-driven, 4-wavelength-dependent parameters from AFLM and 17 spectral slopes from LRS at each measured location on the breast specimens. These 20 plus 17 characteristic parameters can be utilized as classification features for breast cancer discrimination. A two-step process was implemented in order to assess the classification ability of each technique independently, as well as of the combined dual-modality approach: step one was to implement a feature selection algorithm in order to select the best feature set for tissue type classification; step two was to develop a multinomial logistic regression model along with 10-fold cross validation to classify three tissue types and obtain respective classification parameters for each tissue type. Each of the two steps is explained in detail below.
2.4.1. Multinomial logistic regression classification
A multinomial logistic regression (MLR) based generalized linear model was used as a classification tool. Details of the MLR method can be found in reference 18. Briefly, MLR is an extension of binary logistic regression, where one of the outcomes is considered as a baseline, and the odds ratios of the other outcomes against the baseline are computed. Our problem consists of three possible outcomes, namely, IDC (Y = 0), FT (Y = 1) and AT (Y = 2), where Y is an index that represents the nominal outcome. The MLR model can then be constructed to obtain a classifier for predicting the presence of cancer, as described by Eqs. (3)-(5):
where P(Y = j) represents the probability for outcome j; gj(x) is logit function; xn is the nth feature characteristically identified in either AFLM or LRS, and βjn is the corresponding coefficient for the nth parameter of the jth model.
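For orientation, one common way to write a three-outcome baseline-category logit model with the symbols defined above is given below. This is a sketch under stated assumptions, not a reproduction of the original equations: it assumes that outcome Y = 0 is taken as the baseline, and it introduces N as the number of selected features; the numbering and the choice of baseline in the original Eqs. (3)-(5) may differ.

```latex
\begin{aligned}
P(Y = j \mid x) &= \frac{e^{g_j(x)}}{\sum_{k=0}^{2} e^{g_k(x)}}, \qquad j = 0, 1, 2, \\
g_0(x) &= 0, \\
g_j(x) &= \beta_{j0} + \sum_{n=1}^{N} \beta_{jn}\, x_n, \qquad j = 1, 2.
\end{aligned}
```

In this form the three probabilities sum to one for any feature vector x, and each coefficient βjn describes how the nth feature shifts the log-odds of outcome j relative to the baseline.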
This model was implemented in Matlab  with the use of “mnrfit” function. A 10-fold cross validation method was used to calculate the classification parameters, as described below. First, the data was partitioned into 10 segments using ‘crossvalind’ function, while making sure that each tissue type had equal distribution. One of the segments was chosen as test set and the rest as training set. Second, for a given training set, ‘mnrfit’ and ‘mnrval’ were used to compute probabilities P(Y = j), for each tissue type, j (Eq. (3)). Third, the three probability distributions were used to create an ROC for each tissue type (IDC, FT, AT), followed by determination of Youden index  to obtain an optimum cutoff value for classification. Fourth, Sn, Sp and Acc of all three tissue classes were calculated on the test set. Fifth, steps one to four were repeated while selecting a different test set each time out of 10 partitions of data generated during cross validation process, and Sn, Sp, and Acc were averaged over the ten iterations. Similarly, an averaged value of AUC was also obtained for each set from all 10 ROC curves that were generated in cross validation. Finally, steps one to five were repeated 10 times, to obtain 10 independent performance evaluations of cross-validated classification parameters. Means and standard deviations of Sn, Sp, Acc and AUC were then calculated, across the 10 values of each obtained in the fifth step.
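The third and fourth steps above (choosing a cutoff by the Youden index and then computing Sn, Sp and Acc) can be illustrated with a short Python sketch for one tissue class treated as "positive" against the other two. It assumes that the cutoff is chosen from training-set probabilities and then applied to test-set probabilities; the function names and the probability values are hypothetical.

```python
def sn_sp_acc(scores, labels, cutoff):
    """Sensitivity, specificity and accuracy when score >= cutoff is called positive."""
    tp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 1)
    fn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 1)
    tn = sum(1 for s, y in zip(scores, labels) if s < cutoff and y == 0)
    fp = sum(1 for s, y in zip(scores, labels) if s >= cutoff and y == 0)
    return tp / (tp + fn), tn / (tn + fp), (tp + tn) / len(labels)

def youden_cutoff(scores, labels):
    """Cutoff among the observed scores that maximizes J = Sn + Sp - 1."""
    best_cut, best_j = None, -1.0
    for cut in sorted(set(scores)):
        sn, sp, _ = sn_sp_acc(scores, labels, cut)
        j = sn + sp - 1.0
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j

# Hypothetical predicted probabilities for one class (1 = that class, 0 = other)
train_scores = [0.9, 0.8, 0.7, 0.4, 0.35, 0.2, 0.1]
train_labels = [1, 1, 1, 0, 1, 0, 0]
cutoff, j = youden_cutoff(train_scores, train_labels)

# Apply the training-derived cutoff to a hypothetical test fold
sn, sp, acc = sn_sp_acc([0.85, 0.3, 0.6, 0.15], [1, 0, 1, 0], cutoff)
print(cutoff, sn, sp, acc)
```

Averaging these per-fold values over the 10 folds, and then over the 10 repetitions, would give the means and standard deviations described in the text.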
By selecting different sets of parameters or features, [x], in Eq. (3), the above classification algorithm was implemented to evaluate the ability of discriminating IDC by three methods: (a) AFLM-only, (b) LRS-only, and (c) dual-modality with combined features of AFLM and LRS. For each of the three methods, a feature selection algorithm was first used to select a “best feature set” for classification, as further explained in the next section.
2.4.2. Feature selection
As described in Subsection 2.3, the measured data resulted in a large feature set, consisting of 17 features from LRS and 20 features from AFLM. However, all these features may not contribute equally and constructively to the classification model, which was described in Subsection 2.4.1. Also, using a relatively large feature set could lead to the over-fitting problem for the classification model, especially for sparse data sets. A feature selection algorithm was therefore developed to select an optimum feature set from a given set of features: i.e., 17 in case of LRS, 20 in case of AFLM, and 37 (17 + 20) in case of dual-modality method. In this paper, a sequential feature selection algorithm was implemented in Matlab . Sequential feature selection is one of the commonly used methods for feature selection [25,26], and will only be briefly discussed here. In general, sequential feature selection involves adding [sequential forward selection (SFS)] or removing [sequential backward selection (SBS)] features, one at a time, to or from an empty/full feature set, and evaluates a given model using a chosen criterion. This process continues until adding or removing more features does not improve the model prediction, as defined by a specified criterion. Both SFS and SBS can be used independently as feature selection methods, and can produce varying results for different data sets.
We implemented both of these methods to select a set of features, along with two criteria, which gave us four independent “feature sets.” The two criteria used to select significant features were: (a) In criterion A, we computed the deviance of the multinomial logistic regression model fit (mnrfit, see Subsection 2.4.1), and tested if the new deviance after adding/removing the new feature was significantly (p < 0.05; chi-square test) reduced. (b) In criterion B, we used the same multinomial model as in criterion A, but instead of using deviance, we utilized 10-fold cross validation to compute classification accuracy of cancer. The misclassification rate, computed using average of 10 test sets, was then used as the criterion value. If there was no decrease in the criterion value, the new feature was not added or removed for SFS or SBS, respectively. For more details, the interested reader can refer to the Matlab documentation for “sequentialfs” function .
Thus, for each initial feature set obtained from LRS, AFLM or LRS + AFLM, a total of four algorithms were implemented, based on the direction (SFS or SBS) and inclusion/exclusion criteria (A or B), namely, SFS + A, SBS + A, SFS + B, and SBS + B. The four algorithms generated four independent sets of selected features, one out of which could be the “best feature set” determined by its best classification performance, as reflected through Sn, Sp, Acc, and AUC. Such best feature sets were selected for each of the three approaches: LRS-only, AFLM-only, and dual-modality.
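The forward variant of this procedure can be sketched generically in Python. This is not the Matlab "sequentialfs" function itself; it is a minimal greedy loop in which the criterion is any function returning a value to be minimized (for example, a cross-validated misclassification rate). The toy criterion and the feature names below are hypothetical.

```python
def sequential_forward_selection(features, criterion):
    """Greedy SFS: add the feature that lowers the criterion most; stop when none does."""
    selected = []
    best = float("inf")
    remaining = list(features)
    while remaining:
        trials = [(criterion(selected + [f]), f) for f in remaining]
        score, feat = min(trials)
        if score >= best:
            break  # no candidate lowers the criterion value
        selected.append(feat)
        remaining.remove(feat)
        best = score
    return selected, best

# Toy criterion (hypothetical): a stand-in for a misclassification rate
def toy_criterion(subset):
    useful = {"S1": 0.20, "S9": 0.10, "tau2_562": 0.05}
    penalty = 0.01 * sum(1 for f in subset if f not in useful)
    return 0.40 - sum(useful.get(f, 0.0) for f in subset) + penalty

chosen, value = sequential_forward_selection(
    ["S1", "S2", "S9", "tau2_562"], toy_criterion)
print(chosen, round(value, 3))
```

Sequential backward selection follows the same pattern but starts from the full feature set and removes one feature at a time, which is why the two directions can end at different subsets for the same data.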
3.1. AFLM results
Each lifetime curve was used to obtain 4 fitted parameters (τ1, τ2, a1 and a2) and a derived parameter (τm), as described earlier in Subsection 2.3.1. With four emission wavelengths, overall 20 AFLM parameters were obtained from each measured location.
A mixed model repeated measures linear regression analysis was performed, revealing significant differences between means of cancer and the other two types of breast tissue, in 5 out of 20 AFLM parameters, as seen in Fig. 3(a) . The p-values associated with all 5 significant AFLM features are tabulated in Table 1 . All these significant parameters were derived from emission wavelengths 532 nm and 562 nm. Among the rest of the parameters, some showed significant differences either between IDC and FT or between IDC and AT, but not differentiating IDC from both FT and AT, whereas some did not show any significant difference among any tissue types.
As shown in Fig. 3(a), IDC lifetimes were found overall to be shorter than those of FT, and longer than those of AT. Figure 3(b) shows the variation of mean lifetimes as a function of wavelength, for the three tissue types. As can be seen, the τm values of IDC and FT are similar in their spectral trends across wavelength, but exhibit stark contrast at first two emission wavelengths. AT, on the other hand, follows a different mean lifetime pattern over the emission spectral range, with an increased then followed by a decreased τm; nevertheless, τm of AT still shows even more significant contrast at 532 nm and 562 nm, with respect to the τm values of both IDC and FT.
3.2. LRS results
Figure 4(a) shows a comparison of mean LRS spectra of the three breast tissue types. As described previously in Section 2, 17 spectral windows of 20 nm width were generated (as marked by dotted vertical lines); the corresponding spectral slope of each region was computed for all the data points using a linear regression curve fit, giving us 17 parameters overall to compare IDC with FT and AT. Statistical significance was tested using a mixed model repeated measures analysis, which revealed that 10 out of 17 mean spectral slopes were significantly different between IDC and the other two types of breast tissue (note: S4 is marginally significant for IDC vs. AT with p = 0.052). The p-values of these significant features are tabulated in Table 1. As an example, Fig. 4(b) shows a comparison of scaled slopes (slope × 10³) among the three breast tissue types at several spectral windows whose slopes are statistically different between IDC and the other two tissue types. Note that S12 to S17 shared similar slope patterns for all three breast tissue types, so only slope values of S13 among those are plotted in Fig. 4(b). We also noted that within S12 to S17, IDC has the maximum absolute slope values as compared to FT and AT.
3.3. Classification results
While statistical analyses given in Subsections 3.1 and 3.2 showed 5 and 10 significant features in AFLM and LRS, respectively, we still fed all the features (i.e., 20 in case of AFLM, 17 in case of LRS, and 37 for both) as input parameters to the feature selection algorithm. The underlying reason is that the feature selection algorithm has an ability to identify unique features which otherwise may not be significantly different among classes, but add value to the classification of data. More discussion on this approach is given in Section 4.3.
Specifically, as described earlier in Subsection 2.4, classification ability was assessed for three cases: (a) AFLM only, (b) LRS only, and (c) dual-modality (i.e., LRS + AFLM) together. In each of these three cases, a best set of features was selected from one of the four feature selection methods. For AFLM only, “SFS + A” gave the best feature set, which included [τ1 (532 nm), τm (532 nm), τ1 (562 nm), and τ2 (562 nm)]. For LRS only, the best results were obtained using the “SBS + B” feature selection routine (see Subsection 2.4.2), and the selected feature set consisted of [S1, S5, S6, S7, S8, S9, S17]. Similarly, for LRS + AFLM, the routine used was “SBS + B”, and the selected feature set was [τ2 (562 nm), a1 (562 nm), a2 (562 nm), S1, S8, S9, S14, S17].
Interestingly, the classification model utilizes certain features which otherwise may not be significantly different among classes, but add value to the classification of data, for example, S8, S9, a1 (562 nm), and a2 (562 nm). The four parameters (Sn, Sp, Acc, AUC) that summarize the classification ability are tabulated in Table 2 for each case. As can be clearly observed, the accuracy of classifying IDC is greater than 90% in all three cases, i.e., with either technique alone or in combination. In fact, the best accuracy for cancer detection in this study is for LRS alone (96.4 ± 0.8%), which is quite close to that of both methods used together (96.0 ± 1.3%). AFLM provides slightly better results for the other two tissue classes, when compared with LRS and the combined technique.
4. Discussion and conclusions
4.1. Feasibility of dual-modality measurements for detecting breast cancer margins
Changes in cellular metabolism caused by cancer development in vivo can result from a number of factors including genetic changes, changes in tissue vascularization, and changes in metabolic demand [14,27]. Flavins (co-enzymes) that are involved in cellular oxidative phosphorylation , porphyrins, and lipo-pigments were the targeted auto-fluorophores in this study for AFLM measurements. We found that, with 447 nm excitation, the mean auto-fluorescence lifetimes quantified by the two exponential components were significantly different between IDC and two other types of benign breast tissue, predominantly at 532 nm and 562 nm. This implies that our detected fluorescence signals stem mainly from flavins and lipo-pigments [28,29]. Also, we found significant differences within the LRS between the cancerous and benign breast tissue, in the predominantly hemoglobin absorption range (500-640 nm), as well as in predominantly scattering domain (700-840 nm) (see Table 1). These findings are consistent with previously published work [6,8–11], as well as with the expected morphology of cancer.
We have previously demonstrated the applicability and accuracy of LRS technique for quantification of optical properties in tissue phantoms and animal models [17,21]. In this study, we chose a relatively straightforward and empirical approach (i.e., quantification of segmented spectral slopes) for analyzing LRS data, as opposed to model-based or feature-extraction algorithms that quantify physiological and other feature-based parameters [7–12]. The spectral slopes are light-intensity independent, do not require frequent calibration of the instrument, and thus make the measurement and tissue classification faster, simpler, with a lower computational cost. In particular, empirical approaches may be practically useful to extract distinct characteristics due to cancer when non-contact imaging-based approaches (e.g., multi-spectral imaging) are utilized. In these cases, there often exist spectral broadening and other factors that lead to low-resolution spectra  and thus make model-based fitting difficult or inaccurate. The classification results that we obtained (see Table 2) are comparable to the ones published using absolute-quantification methods [8–10,12] and other empirical methods [6,11]. Therefore, while our sample size of the breast specimens is relatively limited (n = 6) in this pilot study, the results from LRS are consistent and convincing to show that a simple, empirical approach using selected spectral slopes can provide high sensitivity and specificity to identify IDC.
Both AFLM and LRS methods independently provided contrast parameters to differentiate IDC from benign breast tissue types, with excellent accuracy (see Table 2). AFLM measurements carry information on the mechanisms that are associated with chemical or biochemical processes of the measured tissue, while LRS is sensitive to vasculature and tissue morphology. Given the complementary information provided by both methods, we also evaluated the dual-modality method by combining the features from both AFLM and LRS. LRS was found to be the most robust and accurate (Acc = 96.4 ± 0.8%) in IDC identification in this implementation. However, AFLM was not far behind in accuracy (Acc = 92.3 ± 0.8%), and provided better accuracies when used to identify the other two classes of breast tissue. Nevertheless, it can be argued that “high accuracy in identifying IDC” would be a better qualifier when selecting a cancer-detection method. Thus, LRS would be the preferred approach over the others (i.e., AFLM and dual-modality) since it provides the highest accuracy for IDC discrimination in the given sample set. It is also observed from Table 2 that combining the two modalities does not significantly improve the results. It should be noted, however, that many methods are available for data classification and feature selection, and application of other methods may provide slightly different results from ours.
4.2 Measurement parameters, accuracy, and performance of the dual-modality system
As given in Subsection 3.1, integral-intensity-weighted mean lifetimes at 532 nm for cancer, fibrous, and adipose tissue were 3.27 ± 0.43 ns, 3.77 ± 0.25 ns, and 1.62 ± 0.57 ns, respectively. Few reports exist in the literature on lifetimes of fresh breast tissue, especially with excitation near 450 nm. Our results are comparable to a previous report on lifetimes of IDC and fibroadenoma. Various endogenous fluorophores can be excited within the 400-500 nm range; their lifetimes range from <0.01 ns for protein-bound FAD at the shorter end up to 15 ns for protoporphyrin IX at the longer end. It is thus difficult to determine the individual fluorophores that are responsible for the two lifetime components.
However, it is still necessary to examine and assure the accuracy of our AFLM technique. To do so, we measured two standard fluorescence dyes, one with a longer lifetime (Fluorescein, 4.0 ns) and one with a shorter lifetime (Rhodamine B, 1.74 ns). These dyes were dissolved in water, and a solution of each was prepared. First, a part of each solution was measured by our AFLM system in reflectance geometry by placing the solution in a black well-plate. The measured integral-intensity-weighted mean lifetimes, calculated using a two-component exponential model, were 4.04 ± 0.01 ns and 1.85 ± 0.02 ns for Fluorescein and Rhodamine B, respectively. These results were then confirmed by measuring a part of the same solution with a commercial TCSPC system (FluoTime 200, PicoQuant GmbH, Germany), using a 1 mm thick cuvette and front-face geometry as described in . The system is equipped with a micro-channel plate (MCP) detector, and the excitation wavelength is selected from a tunable laser system (Fianium SC400-4) with an opto-acoustic filter system. This data set was analyzed using the commercial FluoFit software package (PicoQuant GmbH). The calculated integral-intensity-weighted lifetimes were 4.02 ± 0.03 ns and 1.67 ± 0.03 ns for Fluorescein and Rhodamine B, respectively. Overall, the lifetimes measured by our AFLM system and the PicoQuant unit agreed to within about 10% (differences of 0.02 ns for Fluorescein and 0.18 ns for Rhodamine B). Moreover, we have performed error analyses and goodness-of-fit tests for the measured mean lifetimes; the corresponding results can be found in Appendix A2.
One reason for using 447 nm excitation was that we initially wished to use a longer wavelength for a deeper penetration depth in tissue. We performed a preliminary excitation-emission study using ex vivo rat prostate tumors and found that ~450 nm excitation provided good fluorescence emission spectra as well as lifetime measurements. Also, before this study, we used this dual-modality system to perform measurements on animal prostate cancers and achieved meaningful results. These previous studies supported our choice of 447 nm as the excitation wavelength. However, whether other wavelengths can serve as useful or sufficient excitation wavelengths for breast cancer detection remains to be examined.
The measurement parameters, accuracy, and performance of our LRS system have been studied extensively on phantoms and animal models. Relative errors in repeated measures were within 10%, and more details on specific results can be found in reference .
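The validation arithmetic above can be sketched as follows. The mean-lifetime expression is the commonly used intensity-weighted form for a two-exponential decay, assumed here because the fitting equations are given elsewhere in the paper; the amplitudes and component lifetimes in the example call are hypothetical, while the dye lifetimes are the values quoted in the text.

```python
def intensity_weighted_mean_lifetime(a1, tau1, a2, tau2):
    """Mean lifetime of a1*exp(-t/tau1) + a2*exp(-t/tau2).

    Each component is weighted by its integrated intensity a_i * tau_i
    (assumed definition, standard in time-resolved fluorescence).
    """
    return (a1 * tau1 ** 2 + a2 * tau2 ** 2) / (a1 * tau1 + a2 * tau2)


def percent_deviation(measured, reference):
    """Absolute deviation of a measured value from a reference, in percent."""
    return 100.0 * abs(measured - reference) / reference


# Hypothetical fit result: weight factors 0.6 / 0.4, lifetimes 1.0 ns / 4.5 ns.
example_mean = intensity_weighted_mean_lifetime(0.6, 1.0, 0.4, 4.5)
print(f"example mean lifetime: {example_mean:.2f} ns")

# Lifetimes quoted in the text (ns): reference value, AFLM system, PicoQuant.
dyes = {
    "Fluorescein": (4.0, 4.04, 4.02),
    "Rhodamine B": (1.74, 1.85, 1.67),
}
for name, (reference, aflm, picoquant) in dyes.items():
    print(
        f"{name}: AFLM deviates {percent_deviation(aflm, reference):.1f}% and "
        f"PicoQuant {percent_deviation(picoquant, reference):.1f}% from the reference"
    )
```

Weighting by integrated intensity gives more influence to the long-lived component than a simple amplitude-weighted average would, which is worth keeping in mind when comparing mean lifetimes across systems.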
4.3 Selection of features for classification models to avoid overfitting conditions
In Subsections 3.1 and 3.2, we identified 5 AFLM and 10 LRS features that showed statistical significance (p < 0.05) between breast cancer and other breast tissue types (i.e., fibrous and adipose tissues). It is important, however, to point out that some of the significant (p < 0.05) features may not contribute uniquely to the classification model in the LRS, AFLM, or LRS + AFLM case, possibly due to multi-collinearity. Namely, a set of individually significant contrast features may not form the best feature set for tissue classification. Therefore, in Subsection 3.3, we still used all the parameters (20 for AFLM, 17 for LRS, and 37 for LRS + AFLM) for feature selection. Feature selection algorithms then revealed a reduced set of features (4 for AFLM, 7 for LRS, and 8 for LRS + AFLM), including some features that were statistically different between cancer and the other two tissue types, and some additional features that were not. The reduction in the number of features (by feature selection) helped prevent our classification from overfitting, which often occurs when the feature set is large compared with the number of observations. In addition, to guard against overfitting, we used a 10-fold cross validation routine, and the results were evaluated on the test data, which were separate from the training data.
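The separation of training and test data in 10-fold cross validation can be sketched as below. This is a minimal illustration, assuming stratified folds and a fixed random seed; the study itself used MATLAB, and the exact partitioning scheme is not stated here. The class counts are the 34 IDC, 31 fibrous, and 28 adipose locations reported in the abstract.

```python
import random


def stratified_kfold(labels, k=10, seed=0):
    """Yield (train_indices, test_indices) for k folds with balanced classes."""
    rng = random.Random(seed)
    by_class = {}
    for index, label in enumerate(labels):
        by_class.setdefault(label, []).append(index)

    folds = [[] for _ in range(k)]
    position = 0  # running counter so fold sizes stay balanced across classes
    for label in sorted(by_class):
        indices = by_class[label]
        rng.shuffle(indices)
        for index in indices:
            folds[position % k].append(index)
            position += 1

    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        yield train, test


labels = ["IDC"] * 34 + ["fibrous"] * 31 + ["adipose"] * 28
splits = list(stratified_kfold(labels, k=10))

for fold_number, (train, test) in enumerate(splits, start=1):
    # A classifier would be fitted on `train` and scored on `test` here.
    print(f"fold {fold_number}: {len(train)} training, {len(test)} test locations")
```

Each measured location appears in exactly one test fold, so every accuracy estimate comes from data the model did not see during fitting.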
4.4 Limitations and future work
While the results shown in this paper are promising, with reasonable inter-subject variability (see Appendix Subsection A1), we acknowledge that this is a pilot study with a limited number of specimens (n = 6), and thus with limited statistical power to support the same conclusion in a larger trial. The values of sensitivity, specificity, and accuracy given in Table 2 are accurate with respect to this study, but may vary when more breast samples are measured and tested. In particular, inter-subject and intra-subject variations often exist in clinical practice and are a major roadblock in the validation of optical diagnostic signals. Therefore, further modification and improvement of the feature selection and classification algorithms are needed in future studies, with a larger sample population and more lesion readings from each specimen.
Another limitation of this study is that we were not able to determine the individual fluorophores that are responsible for the two lifetime components. Identifying the origins of the two lifetime components may provide a better understanding of the chemical or biochemical processes of cancer cells. Since the fluorophores expected at our excitation wavelength are flavins and lipo-pigments (see Subsection 4.1), further studies on isolated components may be a good approach to take. Also, we may use the conventional excitation wavelength at ~360 nm and then observe fluorescence lifetimes from NADH, collagen, and lipid in breast tissue. In this way, we may be able to understand how intrinsic NADH and collagen fluorescence are associated with breast cancer tissue.
The short-term goal of this study was to examine and demonstrate the feasibility of identifying and predicting IDC based on ex vivo breast tissue samples using AFLM, LRS, or the combined approach. The long-term goal of this investigation is to assess LRS and AFLM for clinical translation towards breast cancer margin detection. Among the three approaches, this paper suggests that LRS is a highly sensitive and accurate technique to differentiate a solid IDC mass from surrounding fibrous or adipose tissue with a localized point measurement. However, there are other tissue types in the breast that need to be evaluated or identified besides IDC, such as preneoplastic proliferative changes (pre-cancer) and ductal carcinoma in situ (DCIS). Continued study is therefore warranted in order to develop a more comprehensive and robust classification algorithm for identification of normal, pre-cancer, DCIS, and IDC tissue. Such a study may also allow us to take patient demographics into account in the analysis, including the population with prior chemotherapy. Surgical margin detection would, however, require an imaging platform for fast surface assessment of excised samples. Both LRS and AFLM can be implemented in a non-contact imaging geometry, and our data suggest that either method could be a useful tool. A larger pool of breast tissue specimens involving the rest of the tissue types may show whether either technique is clearly dominant, but for now, LRS has an edge over AFLM when cost is taken into consideration.
A1. Inter-subject variability for the features selected from AFLM and LRS
In Subsection 3.1, we identified 5 features that were significantly different between cancer and non-cancer breast tissues. The left panel of Fig. 5 shows box plots of these 5 parameters across 6 specimens for the three tissue types, presenting the statistical distribution of inter-subject variability. The data distribution is relatively uniform, with a few outliers in certain parameters. Similarly, the right panel of Fig. 5 shows box plots of 10 selected features obtained from LRS data (see Subsection 3.2). The same conclusion holds: inter-subject variability was not a major concern in this specimen population. It should be noted that the entire data set (6 specimens) was considered for classification and cross validation, and no data points were excluded as outliers.
A2. Error analyses and goodness of fit for measured mean lifetimes
The metrics of AFLM data fitting were calculated for each breast tissue type. Two goodness-of-fit (gof) parameters, namely root mean square error (RMSE, ns) and adjusted R², were calculated for each curve. The mean values of each parameter with standard deviations for 532 nm and 562 nm are listed in Table 3 for each tissue type. As illustrated by the gof parameters, the fits were of high quality and the two-exponent model was adequate to describe the data.
Also, peak counts for the acquired AFLM curves were calculated to present an estimate of the auto-fluorescence signal strength. The values (rounded to the nearest 100) are also listed in Table 3. It can be seen that adipose tissue had the lowest counts, whereas fibrous tissue had the highest counts. It should be noted that the integration time for adipose tissue was 10 sec, as compared to 5 sec for the other two tissue types. Note that for the given analyses, we did not deconvolve the instrument response function (IRF) from the measured lifetime curves. This is because the IRF of our TCSPC system was ~0.3 ns (full width at half maximum), which is sufficiently short compared with our measured lifetimes. In theory, more accurate lifetime values could be obtained after implementing the deconvolution process, especially for shorter lifetimes.
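For reference, the two goodness-of-fit parameters can be computed as in the following sketch. The conventions are assumptions: RMSE is taken as the square root of the mean squared residual, and adjusted R² uses residual degrees of freedom n - m, where m is the number of fitted coefficients (4 for a two-exponent model). The decay curve and its perturbation are synthetic, not measured data.

```python
import math


def rmse(observed, fitted):
    """Root mean square error between observed and fitted values."""
    n = len(observed)
    return math.sqrt(sum((o - f) ** 2 for o, f in zip(observed, fitted)) / n)


def adjusted_r2(observed, fitted, n_coefficients):
    """Adjusted R^2 with residual degrees of freedom n - n_coefficients."""
    n = len(observed)
    mean_obs = sum(observed) / n
    ss_res = sum((o - f) ** 2 for o, f in zip(observed, fitted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - (ss_res / (n - n_coefficients)) / (ss_tot / (n - 1))


def two_exponent_model(t, a1, tau1, a2, tau2):
    """Double-exponential decay with weight factors a1, a2 and lifetimes in ns."""
    return a1 * math.exp(-t / tau1) + a2 * math.exp(-t / tau2)


# Synthetic example: a fitted curve and "observed" data with a small,
# deterministic perturbation standing in for measurement noise.
times = [0.1 * i for i in range(200)]  # 0 to 19.9 ns
fitted = [two_exponent_model(t, 0.6, 1.0, 0.4, 4.5) for t in times]
observed = [f + 0.005 * math.sin(7.0 * t) for f, t in zip(fitted, times)]

fit_rmse = rmse(observed, fitted)
fit_adj_r2 = adjusted_r2(observed, fitted, n_coefficients=4)
print(f"RMSE = {fit_rmse:.4f}, adjusted R^2 = {fit_adj_r2:.4f}")
```

An adjusted R² close to 1 together with a small RMSE is what supports the statement that two exponential components suffice; adding a third component would be justified only if it improved adjusted R² despite the extra coefficients.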
The authors thank Dr. Georgios Alexandrakis and Dr. Digant Dave from UT Arlington, for helpful discussions on AFLM measurements. The authors are grateful to Dr. Ignacy Gryczynski from the University of North Texas Health Science Center for his technical suggestions. Also, we thank Dr. Nancy Rowe from UT Arlington for statistics support in mixed model analysis.
References and links
1. American Cancer Society, Cancer Facts & Figures 2011 (ACS, 2011).
3. R. G. Pleijhuis, M. Graafland, J. de Vries, J. Bart, J. S. de Jong, and G. M. van Dam, “Obtaining adequate surgical margins in breast-conserving therapy for patients with early-stage breast cancer: current modalities and future directions,” Ann. Surg. Oncol. 16(10), 2717–2730 (2009).
6. I. J. Bigio, S. G. Bown, G. Briggs, C. Kelley, S. Lakhani, D. Pickard, P. M. Ripley, I. G. Rose, and C. Saunders, “Diagnosis of breast cancer using elastic-scattering spectroscopy: preliminary clinical results,” J. Biomed. Opt. 5(2), 221–228 (2000).
7. G. M. Palmer, C. F. Zhu, T. M. Breslin, F. S. Xu, K. W. Gilchrist, and N. Ramanujam, “Comparison of multiexcitation fluorescence and diffuse reflectance spectroscopy for the diagnosis of breast cancer (March 2003),” IEEE Trans. Biomed. Eng. 50(11), 1233–1242 (2003).
8. Z. Volynskaya, A. S. Haka, K. L. Bechtel, M. Fitzmaurice, R. Shenk, N. Wang, J. Nazemi, R. R. Dasari, and M. S. Feld, “Diagnosing breast cancer using diffuse reflectance spectroscopy and intrinsic fluorescence spectroscopy,” J. Biomed. Opt. 13(2), 024012 (2008).
9. R. Nachabé, D. J. Evers, B. H. Hendriks, G. W. Lucassen, M. van der Voort, E. J. Rutgers, M. J. Peeters, J. A. Van der Hage, H. S. Oldenburg, J. Wesseling, and T. J. Ruers, “Diagnosis of breast cancer using diffuse optical spectroscopy from 500 to 1600 nm: comparison of classification methods,” J. Biomed. Opt. 16(8), 087010 (2011).
10. J. Q. Brown, T. M. Bydlon, L. M. Richards, B. Yu, S. A. Kennedy, J. Geradts, L. G. Wilke, M. Junker, J. Gallagher, W. T. Barry, and N. Ramanujam, “Optical assessment of tumor resection margins in the breast,” IEEE J. Sel. Top. Quantum Electron. 16(3), 530–544 (2010).
11. M. D. Keller, S. K. Majumder, M. C. Kelley, I. M. Meszoely, F. I. Boulos, G. M. Olivares, and A. Mahadevan-Jansen, “Autofluorescence and diffuse reflectance spectroscopy and spectral imaging for breast surgical margin analysis,” Lasers Surg. Med. 42(1), 15–23 (2010).
12. S. Kennedy, J. Geradts, T. Bydlon, J. Q. Brown, J. Gallagher, M. Junker, W. Barry, N. Ramanujam, and L. Wilke, “Optical breast cancer margin assessment: an observational study of the effects of tissue heterogeneity on optical contrast,” Breast Cancer Res. 12(6), R91 (2010).
13. M. C. Skala, K. M. Riching, D. K. Bird, A. Gendron-Fitzpatrick, J. Eickhoff, K. W. Eliceiri, P. J. Keely, and N. Ramanujam, “In vivo multiphoton fluorescence lifetime imaging of protein-bound and free nicotinamide adenine dinucleotide in normal and precancerous epithelia,” J. Biomed. Opt. 12(2), 024014 (2007).
14. H. M. Chen, C. P. Chiang, C. You, T. C. Hsiao, and C. Y. Wang, “Time-resolved autofluorescence spectroscopy for classifying normal and premalignant oral tissues,” Lasers Surg. Med. 37(1), 37–45 (2005).
15. J. McGinty, N. P. Galletly, C. Dunsby, I. Munro, D. S. Elson, J. Requejo-Isidro, P. Cohen, R. Ahmad, A. Forsyth, A. V. Thillainayagam, M. A. Neil, P. M. French, and G. W. Stamp, “Wide-field fluorescence lifetime imaging of cancer,” Biomed. Opt. Express 1(2), 627–640 (2010).
16. P. J. Tadrous, J. Siegel, P. M. French, S. Shousha, N. Lalani, and G. W. Stamp, “Fluorescence lifetime imaging of unstained tissues: early results in human breast cancer,” J. Pathol. 199(3), 309–317 (2003).
17. V. Sharma, N. Patel, J. Chen, L. Tang, G. Alexandrakis, and H. Liu, “A dual-modality optical biopsy approach for in vivo detection of prostate cancer in rat model,” J. Innovat. Opt. Health Sci. 4(3), 269–277 (2011).
19. J. Siegel, D. S. Elson, S. E. D. Webb, K. C. B. Lee, A. Vlandas, G. L. Gambaruto, S. Lévêque-Fort, M. J. Lever, P. J. Tadrous, G. W. H. Stamp, A. L. Wallace, A. Sandison, T. F. Watson, F. Alvarez, and P. M. W. French, “Studying biological tissue with fluorescence lifetime imaging: microscopy, endoscopy, and complex decay profiles,” Appl. Opt. 42(16), 2995–3004 (2003).
21. V. Sharma, J. W. He, S. Narvenkar, Y. B. Peng, and H. Liu, “Quantification of light reflectance spectroscopy and its application: determination of hemodynamics on the rat spinal cord and brain induced by electrical stimulation,” Neuroimage 56(3), 1316–1328 (2011).
22. D. W. Hosmer and S. Lemeshow, Applied Logistic Regression (Wiley, 2000).
23. MATLAB, version 7.13.0 (R2011b) (The Mathworks, Inc., Natick, Massachusetts, 2011).
24. N. J. Perkins and E. F. Schisterman, “The inconsistency of “optimal” cutpoints obtained using two criteria based on the receiver operating characteristic curve,” Am. J. Epidemiol. 163(7), 670–675 (2006).
25. T. Hastie, R. Tibshirani, and J. Friedman, The Elements of Statistical Learning: Data Mining, Inference, and Prediction (Springer, 2008).
26. A. Ng, “Machine learning course at Stanford (CS229) lecture notes,” (2011).
29. G. A. Wagnières, W. M. Star, and B. C. Wilson, “In vivo fluorescence spectroscopy and imaging for oncological applications,” Photochem. Photobiol. 68(5), 603–632 (1998).
30. D. Elson, J. Requejo-Isidro, I. Munro, F. Reavell, J. Siegel, K. Suhling, P. Tadrous, R. Benninger, P. Lanigan, J. McGinty, C. Talbot, B. Treanor, S. Webb, A. Sandison, A. Wallace, D. Davis, J. Lever, M. Neil, D. Phillips, G. Stamp, and P. French, “Time-domain fluorescence lifetime imaging applied to biological tissue,” Photochem. Photobiol. Sci. 3(8), 795–801 (2004).
32. A. Mukerjee, T. J. Sørensen, A. P. Ranjan, S. Raut, I. Gryczynski, J. K. Vishwanatha, and Z. Gryczynski, “Spectroscopic properties of curcumin: orientation of transition moments,” J. Phys. Chem. B 114(39), 12679–12684 (2010).
33. R. Luchowski, Z. Gryczynski, P. Sarkar, J. Borejdo, M. Szabelski, P. Kapusta, and I. Gryczynski, “Instrument response standard in time-resolved fluorescence,” Rev. Sci. Instrum. 80(3), 033109 (2009).