DESCRIPTION
OPTICAL METHOD AND APPARATUS FOR THE DIAGNOSIS OF CERVICAL PRECANCERS USING RAMAN AND FLUORESCENCE SPECTROSCOPIES
BACKGROUND OF THE INVENTION
The invention relates to optical methods and apparatus used for the diagnosis of cervical precancers.
Cervical cancer is the second most common malignancy in women worldwide, exceeded only by breast cancer; in the United States, it is the third most common neoplasm of the female genital tract. In 1994, 15,000 new cases of invasive cervical cancer and 55,000 cases of carcinoma in situ (CIS) were reported in the U.S., and an estimated 4,600 deaths from cervical cancer occurred in the United States alone. However, in recent years, the incidence of pre-invasive squamous carcinoma of the cervix has risen dramatically, especially among young women. Women under the age of 35 years account for up to 24.5% of patients with invasive cervical cancer, and the incidence is continuing to increase for women in this age group. It has been estimated that the mortality of cervical cancer may rise by 20% in the next decade unless further improvements are made in detection techniques.
The mortality associated with cervical cancer can be reduced if this disease is detected at the early stages of development or at the pre-cancerous state (cervical intraepithelial neoplasia (CIN)). A Pap smear is used to screen for CIN and cervical cancer in the general female population. This technique has a false-negative error rate of 15-40%. An abnormal Pap smear is followed by colposcopic examination, biopsy and histologic confirmation of the clinical diagnosis. Colposcopy requires extensive training and its accuracy for diagnosis is variable and limited even in expert hands. A diagnostic method that could improve the performance of colposcopy in the hands of less experienced practitioners, eliminate the need for multiple biopsies and allow more effective wide scale diagnosis could potentially reduce the mortality associated with cervical cancer.
Recently, fluorescence, infrared absorption and Raman spectroscopies have been proposed for cancer and precancer diagnosis. Many groups have successfully demonstrated their use in various organ systems. Autofluorescence and dye-induced fluorescence have shown promise in recognizing atherosclerosis and various types of cancers and precancers. Many groups have demonstrated that autofluorescence may be used for differentiation of normal and abnormal tissues in the human breast and lung, bronchus and gastrointestinal tract. Fluorescence spectroscopic techniques have also been investigated for improved detection of cervical dysplasia.
An automated diagnostic method with improved diagnostic capability could allow faster, more effective patient management and potentially further reduce mortality.
SUMMARY OF THE INVENTION
The present invention demonstrates that fluorescence and Raman spectroscopy are promising techniques for the clinical diagnosis of cervical precancer.
Studies were conducted in vitro to establish a strategy for clinical in vivo diagnosis, and indicated that 337, 380 and 460 nm (± 10 nm) are optimal excitation wavelengths for the identification of cervical precancer.
In vivo fluorescence spectra collected at 337 nm from 92 patients were used to develop spectroscopic methods to differentiate normal from abnormal tissues. Using empirical parameters such as peak intensity and slope of the spectra, abnormal and normal tissues were differentiated with a sensitivity and specificity of 85% and 75%, respectively. Using multivariate statistical methods at 337 nm excitation, normal tissues and squamous intraepithelial lesions (SILs: lesions with dysplasia and human papilloma virus (HPV)) were differentiated with a sensitivity of 91% and specificity of 82%. At 380 nm excitation, SILs can be differentiated from columnar normal tissues and from tissues with inflammation with a sensitivity of 77% and a specificity of 72%. At 460 nm excitation, high grade SILs (moderate to severe dysplasia, carcinoma) and low grade SILs (mild dysplasia, HPV) were differentiated with a sensitivity and specificity of 73% and 85%, respectively. The calculations of sensitivity and specificity as used herein are presented in detail in Appendix I.
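By way of illustration only, the two figures of merit quoted above can be computed from classification counts using the standard definitions: sensitivity is the fraction of diseased sites correctly identified, and specificity is the fraction of non-diseased sites correctly identified. The function name and the counts below are hypothetical and are not taken from the study data.

```python
def sensitivity_specificity(true_pos, false_neg, true_neg, false_pos):
    """Return (sensitivity, specificity) from counts of classified sites."""
    sensitivity = true_pos / (true_pos + false_neg)   # diseased sites found
    specificity = true_neg / (true_neg + false_pos)   # normal sites cleared
    return sensitivity, specificity


# Hypothetical counts chosen only to show the arithmetic.
sens, spec = sensitivity_specificity(true_pos=91, false_neg=9,
                                     true_neg=82, false_pos=18)
print(f"sensitivity={sens:.2f}, specificity={spec:.2f}")
```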
The present invention also contemplates the use of Raman spectroscopy for the diagnosis of disease in tissue. Raman scattering signals are weak compared to fluorescence. However, Raman spectroscopy provides molecular specific information and can be applied towards tissue diagnosis. The present invention exploits the capabilities of near infrared (NIR) Raman spectroscopy and fluorescence spectroscopy to differentiate normal, metaplastic and inflammatory tissues from SILs. Further, the ability of these techniques to separate high grade dysplastic lesions from low grade lesions is also exploited.
The invention also contemplates the use of fluorescence spectroscopy in combination with Raman spectroscopy for the diagnosis of disease in tissue.
More particularly, the present invention contemplates methods and apparatus for the optical diagnosis of cervical precancers. Specifically, one embodiment of the method of the present invention detects tissue abnormality in a tissue sample by illuminating a tissue sample with a first electromagnetic radiation wavelength selected to cause the tissue sample to produce a fluorescence intensity spectrum indicative of tissue abnormality. Then, a first fluorescence intensity spectrum emitted from the tissue sample as a result of illumination with the first wavelength is detected. The tissue sample is then illuminated with a second electromagnetic radiation wavelength selected to cause the tissue sample to produce a fluorescence intensity spectrum indicative of tissue abnormality, and a second fluorescence intensity spectrum emitted from the tissue sample as a result of illumination with the second wavelength is detected. Finally, a probability that the tissue sample is normal or abnormal is calculated from the first fluorescence intensity spectrum, and a degree of abnormality of the cervical tissue sample is calculated from the second fluorescence intensity spectrum.
The invention further contemplates that each of the calculations includes principal component analysis of the first and second spectra, relative to a plurality of preprocessed spectra obtained from tissue samples of known diagnosis. The invention also contemplates normalizing the first and second spectra, relative to a maximum intensity within the spectra, and mean-scaling the first and second spectra as a function of a mean intensity of the first and second spectra.
Another embodiment of the invention includes detecting tissue abnormality in a diagnostic tissue sample by illuminating the tissue sample with an illumination wavelength of electromagnetic radiation selected to cause the tissue sample to emit a Raman spectrum comprising a plurality of wavelengths shifted from the illumination wavelength. A plurality of peak intensities of the Raman spectrum at wavelength shifts selected for their ability to distinguish normal tissue from abnormal tissue are detected, and each of the plurality of detected peak intensities at the wavelength shifts are compared with intensities of a Raman spectrum from known normal tissue at corresponding wavelength shifts. Abnormality of the tissue sample is assessed as a function of the comparison. This embodiment also contemplates calculating a ratio between selected intensities of the Raman spectrum, and detecting abnormality of the tissue sample as a function of the ratio.
The invention also contemplates calculating a second ratio between another two of the plurality of peak intensities, and detecting a degree of tissue abnormality as a function of the second ratio.
In yet another embodiment of the method of the present invention, calculation of one or more ratios between Raman spectrum intensities is used alone to detect tissue abnormality without comparing individual intensities with those of known normal tissue.
The present invention also contemplates the combination of Raman and fluorescence spectroscopy to detect tissue abnormality.
The apparatus of the present invention includes a controllable illumination device for emitting a plurality of electromagnetic radiation wavelengths selected to cause a tissue sample to produce a fluorescence intensity spectrum indicative of tissue abnormality, an optical system for applying the plurality of radiation wavelengths to a tissue sample, a fluorescence intensity spectrum detecting device for detecting an intensity of fluorescence spectra emitted by the sample as a result of illumination by the plurality of electromagnetic radiation wavelengths, and a data processor, connected to the detecting device, for analyzing detected fluorescence spectra to calculate a probability that the sample is abnormal.
A Raman spectroscopy apparatus in accordance with the present invention includes an illumination device for generating at least one illumination wavelength of electromagnetic radiation selected to cause a tissue sample to emit a Raman spectrum comprising a plurality of wavelengths shifted from the illumination wavelength, a Raman spectrum detector for detecting a plurality of peak intensities of the Raman spectrum at selected wavelength shifts, and a programmed computer connected to the Raman spectrum detector, programmed to compare each of the plurality of detected peak intensities with corresponding peak intensities of a Raman spectrum from known normal tissue, to detect tissue abnormality.
These and other features and advantages of the present invention will become apparent to those of ordinary skill in this art with reference to the appended drawings and following detailed description.
BRIEF DESCRIPTION OF THE DRAWINGS
FIG. 1 is a block diagram of an exemplary fluorescence spectroscopy diagnostic apparatus, in accordance with the present invention.
FIG. 2 is a block diagram of an exemplary Raman spectroscopy diagnostic apparatus, in accordance with the present invention.
FIG. 3A, FIG. 3B and FIG. 3C are flowcharts of exemplary fluorescence spectroscopy diagnostic methods, in accordance with the present invention.
FIG. 4A, FIG. 4B, FIG. 4C and FIG. 4D are flowcharts of an exemplary Raman spectroscopy diagnostic method, in accordance with the present invention.
FIG. 5 and FIG. 6 are graphs depicting the performance of the fluorescence diagnostic method of the present invention, with 337 nm excitation.
FIG. 7A, FIG. 7B and FIG. 8 are graphs illustrating the performance of the fluorescence spectrum diagnostic method of the present invention at 380 nm excitation.
FIG. 9A, FIG. 9B and FIG. 10 are graphs illustrating the performance of the fluorescence spectrum diagnostic method of the present invention to distinguish squamous normal tissue from SIL at 460 nm excitation.
FIG. 11A, FIG. 11B and FIG. 12 are graphs illustrating the performance of the fluorescence spectrum diagnostic method of the present invention, to distinguish low grade SIL from high grade SIL at 460 nm excitation.
FIG. 13 is a graph depicting measured rhodamine Raman spectra with high and low signal to noise ratios.
FIG. 14 is a graph depicting the elimination of fluorescence from a Raman spectrum, using a polynomial fit.
FIG. 15 is a graph comparing low and high signal to noise ratio rhodamine Raman spectra.
FIG. 16 is a graph depicting a typical pair of Raman spectra obtained from a patient with dysplasia.
FIG. 17 is a histogram of the Raman band at 1325 cm⁻¹ illustrating patient to patient variation.
FIG. 18A and FIG. 18B are graphs illustrating the diagnostic capability of the Raman spectroscopy diagnostic method of the present invention.
FIG. 19 is another graph depicting the diagnostic capability of the Raman spectroscopy diagnostic method of the present invention.
FIG. 20 and FIG. 21 are graphs illustrating the diagnostic capability of the Raman spectroscopy diagnostic method of the present invention.
FIG. 22 is a graph of a hypothetical distribution of test values.
DETAILED DESCRIPTION
I. Introduction
To illustrate the advantages of the present invention, fluorescence spectra were collected in vivo at colposcopy from 20 patients. Fluorescence spectra were measured from three to four colposcopically normal and three to four colposcopically abnormal sites as identified by the practitioner using techniques known in the art. Specifically, in cervical tissue, nonacetowhite epithelium is considered normal, whereas acetowhite epithelium and the presence of vascular atypias (such as punctation, mosaicism, and atypical vessels) are considered abnormal. One normal and one abnormal site from those investigated were biopsied from each patient. These biopsies were snap frozen in liquid nitrogen and stored in an ultra low temperature freezer at -85°C until Raman measurements were made.
II. Diagnostic Apparatus
1. Fluorescence Spectroscopy Diagnostic Apparatus
Fluorescence spectra were recorded with a spectroscopic system incorporating a pulsed nitrogen pumped dye laser, an optical fiber probe and an optical multi-channel analyzer at colposcopy. The laser characteristics for the study were: 337, 380 and 460 nm wavelengths, transmitted pulse energy of 50 μJ, a pulse duration of 5 ns and a repetition rate of 30 Hz. The probe includes 2 excitation fibers, one for each wavelength and 5 collection fibers. Rhodamine 6G (8 mg/ml) was used as a standard to calibrate for day to day variations in the detector throughput. The spectra were background subtracted and normalized to the peak intensity of rhodamine. The spectra were also calibrated for the wavelength dependence of the system.
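By way of illustration only, the background subtraction and normalization to the rhodamine peak may be sketched as follows. The function name, the argument names and the numeric values are hypothetical; the correction for the wavelength dependence of the system is not shown.

```python
def calibrate_spectrum(raw, background, rhodamine_peak):
    """Subtract the background and express intensities relative to the
    peak intensity of the rhodamine standard measured the same day."""
    if len(raw) != len(background):
        raise ValueError("raw and background spectra must have equal length")
    if rhodamine_peak <= 0:
        raise ValueError("rhodamine peak intensity must be positive")
    return [(r - b) / rhodamine_peak for r, b in zip(raw, background)]


# Hypothetical detector counts at four emission wavelengths.
raw = [120.0, 340.0, 560.0, 220.0]
background = [20.0, 40.0, 60.0, 20.0]
calibrated = calibrate_spectrum(raw, background, rhodamine_peak=1000.0)
print(calibrated)
```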
FIG. 1 is an exemplary spectroscopic system for collecting and analyzing fluorescence spectra from cervical tissue, in accordance with the present invention, incorporating a pulsed nitrogen pumped dye laser 100, an optical fiber probe 101 and an optical multi-channel analyzer 103 utilized to record fluorescence spectra from the intact cervix at colposcopy. The probe 101 comprises a central fiber 104 surrounded by a circular array of six fibers. All seven fibers have the same characteristics (0.22 NA, 200 micron core diameter). Two of the peripheral fibers, 106 and 107, deliver excitation light to the tissue surface; fiber 106 delivers excitation light from the nitrogen laser and fiber 107 delivers light from the dye module (overlap of the illumination area viewed by both optical fibers 106, 107 is greater than 85%). The purpose of the remaining five fibers (104 and 108-111) is to collect the emitted fluorescence from the tissue surface directly illuminated by each of the excitation fibers 106, 107. A quartz shield 112 is placed at the tip of the probe 101 to provide a substantially fixed distance between the fibers and the tissue surface, so fluorescence intensity can be reported in calibrated units.
Excitation light at 337 nm was focused into the proximal end of excitation fiber 106 to produce a 1 mm diameter spot at the outer face of the shield 112. Excitation light from the dye module 113, coupled into excitation fiber 107, was produced by using appropriate fluorescence dyes; in this example, BBQ (1E-03 M in 7 parts toluene and 3 parts ethanol) was used to generate light at 380 nm excitation, and Coumarin 460 (1E-02 M in ethanol) was used to generate light at 460 nm excitation. The average transmitted pulse energies at 337, 380 and 460 nm excitation were 20, 12 and 25 mJ, respectively. The laser characteristics for this embodiment are a 5 ns pulse duration and a repetition rate of 30 Hz; however, other characteristics would also be acceptable. Excitation fluences should remain low enough so that cervical tissue is not vaporized and so that significant photo-bleaching does not occur. In arterial tissue, for example, significant photo-bleaching occurs above excitation fluences of 80 mJ/mm².
The proximal ends of the collection fibers 104, 108-111 are arranged in a circular array and imaged at the entrance slit of a polychromator 114 (Jarrell Ash, Monospec 18) coupled to an intensified 1024-diode array 116 controlled by a multi-channel analyzer 117 (Princeton Instruments, OMA). Long pass filters at 370, 400 and 470 nm were used to block scattered excitation light at 337, 380 and 460 nm excitation, respectively. A 205 ns collection gate, synchronized to the leading edge of the laser pulse using a pulser 118 (Princeton Instruments, PG200), effectively eliminated the effects of the colposcope's white light illumination during fluorescence measurements. Data acquisition and analysis were controlled by computer 119 in accordance with the fluorescence diagnostic method described below in more detail with reference to the flowcharts of FIG. 3A, FIG. 3B and FIG. 3C.
2. Raman Spectroscopy Diagnostic Apparatus
FIG. 2 is an exemplary apparatus for the collection of near-IR Raman spectra in accordance with the present invention. Near-IR Raman measurements were made in vitro using the system shown in FIG. 2; however, the system of FIG. 2 could be readily adapted for use in vivo, for example by increasing laser power and by using a probe structure similar to that of probe 101 shown in FIG. 1A and FIG. 1B. A 40 mW GaAlAs diode laser 200 (Diolite 800, LiCONix, CA) excites the samples near 789 nm through a 200 micron glass fiber 201. The biopsies measuring about 2 x 1 x 1 mm are placed moist in a quartz cuvette 202 with the epithelium towards the face of the cuvette 202 and the beam. The excitation beam is incident at an angle of approximately 75 degrees to avoid specular reflection and is focused to a spot size of 200 μm at the tissue surface. A bandpass (BP) filter 203 with a transmission of 85% at 789 nm is used to clean up the output of the laser 200. The laser power at the sample is maintained at approximately 25 mW. The scattered Raman signal is collected at an angle of 90° from the excitation beam and imaged on the entrance slit of the detection system 204; however, other angles would also be acceptable. A holographic notch (HN) filter 206 (HSNF 789, Kaiser Optical Systems, MI) with an optical density >6 at 789 nm is used to attenuate the elastic scattering. The detection system 204 includes an imaging spectrograph 207 (500IS, Chromex Inc. NM) and a liquid nitrogen cooled CCD camera 208 with associated camera controller 209 (LN-1152E, Princeton Instruments, NJ). The spectrograph 207 was used with a 300 gr/mm grating blazed at 500 nm which yielded a spectral resolution of 10 cm⁻¹ with an entrance slit of 100 μm.
Detection system 204 is controlled by computer 211 which is programmed in accordance with the Raman spectroscopy diagnostic method described below in detail with reference to the flowcharts of FIG. 4A, FIG. 4B, FIG. 4C and FIG. 4D.
Raman spectra were measured over a range of 500 - 2000 cm⁻¹ with respect to the excitation frequency and each sample spectrum was integrated for 15 minutes; however, shorter integration times combined with higher laser intensity would also be acceptable. Each background subtracted spectrum was corrected for wavelength dependent response of the spectrograph 207, camera 208, grating and filters 203 and 206. The system was calibrated for day to day throughput variations using naphthalene, rhodamine 6G and carbon tetrachloride. The Raman shift was found to be accurate to ± 7 cm⁻¹ and the intensity was found to be constant within 12% of the mean.
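By way of illustration only, the relationship between a Raman shift in wavenumbers and the corresponding scattered (Stokes) wavelength for excitation near 789 nm may be sketched as follows. The function names are hypothetical; the conversion itself is the usual difference of reciprocal wavelengths, with 1e7 converting nm to cm⁻¹.

```python
def raman_shift_cm1(excitation_nm, scattered_nm):
    """Raman shift (cm^-1) of light scattered at `scattered_nm`."""
    return 1e7 / excitation_nm - 1e7 / scattered_nm


def scattered_wavelength_nm(excitation_nm, shift_cm1):
    """Stokes wavelength (nm) corresponding to a given Raman shift."""
    return 1e7 / (1e7 / excitation_nm - shift_cm1)


# Wavelength window covering shifts of 500 to 2000 cm^-1 at 789 nm excitation.
low_nm = scattered_wavelength_nm(789.0, 500.0)
high_nm = scattered_wavelength_nm(789.0, 2000.0)
print(f"{low_nm:.1f} nm to {high_nm:.1f} nm")
```

Under these definitions the measured range corresponds to detected wavelengths of roughly 821 nm to 937 nm.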
The systems of FIG. 1 and FIG. 2 are exemplary embodiments and should not be considered to limit the invention as claimed. It will be understood that apparatus other than that depicted in FIG. 1 and FIG. 2 may be used without departing from the scope of the invention.
III. Diagnostic Methods
1. Development of Diagnostic Methods
A. Two-Stage Method Development at 337 nm Excitation
The parameters for stage 1 of the two-stage method: relative peak intensity (peak intensity of each sample divided by the average peak intensity of corresponding normal (squamous) samples from the same patient) and a linear approximation of slope of the spectrum from 420-440 nm were calculated from the fluorescence spectrum of each sample in the calibration set. The relative peak intensity accounts for the inter-patient variation of normal tissue fluorescence intensity. A two-dimensional scattergram of the two diagnostic parameters was plotted for all the samples in the calibration set. A linear decision line was developed to minimize misclassification (non diseased vs. diseased). Similarly, the parameters for stage 2 of the method: slope of the spectrum from 440-460 nm of each diseased sample and average slope from 420-440 nm of spectra of corresponding normal (squamous) samples were calculated from the calibration set. A scattergram of these two diagnostic parameters was plotted for all diseased samples. Again, a linear decision line was developed to minimize misclassification (low grade SIL vs. high grade SIL). The optimized method was implemented on spectra of each sample in the prediction set. The optimal decision lines developed from the data in the calibration set were compared to that developed in the initial clinical study for both stages of the method. The two-stage fluorescence diagnostic method is disclosed in more detail in application Serial No. 08/060,432, filed May 12, 1993, assigned to the same assignee as the present invention. The disclosure of this prior application is expressly incorporated herein by reference.
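By way of illustration only, the stage 1 parameters may be sketched as follows. The function names are hypothetical, the slope is assumed here to be an ordinary least-squares fit over the stated wavelength window, and the decision-line coefficients and the side of the line treated as diseased are placeholders, since the optimized values are not given in this section.

```python
def relative_peak_intensity(sample, normal_spectra):
    """Peak of `sample` divided by the average peak of the same patient's
    normal (squamous) spectra."""
    average_normal_peak = sum(max(s) for s in normal_spectra) / len(normal_spectra)
    return max(sample) / average_normal_peak


def fitted_slope(wavelengths, intensities, start_nm, stop_nm):
    """Least-squares slope of intensity versus wavelength in a window."""
    points = [(w, v) for w, v in zip(wavelengths, intensities)
              if start_nm <= w <= stop_nm]
    if len(points) < 2:
        raise ValueError("need at least two points in the wavelength window")
    mean_w = sum(w for w, _ in points) / len(points)
    mean_v = sum(v for _, v in points) / len(points)
    numerator = sum((w - mean_w) * (v - mean_v) for w, v in points)
    denominator = sum((w - mean_w) ** 2 for w, _ in points)
    return numerator / denominator


def below_decision_line(x, y, gradient, intercept):
    """True if the point (x, y) lies below the line y = gradient*x + intercept."""
    return y < gradient * x + intercept


# Hypothetical spectra sampled every 10 nm.
wavelengths = [420, 430, 440, 450, 460]
sample = [0.8, 0.6, 0.4, 0.3, 0.2]
normals = [[1.0, 0.9, 0.8, 0.7, 0.6], [3.0, 2.5, 2.0, 1.5, 1.0]]

rel_peak = relative_peak_intensity(sample, normals)
slope_420_440 = fitted_slope(wavelengths, sample, 420, 440)
flagged = below_decision_line(rel_peak, slope_420_440, gradient=0.0, intercept=0.0)
```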
B. Multivariate Statistical Method Development
The five primary steps involved in the multivariate statistical method are 1) preprocessing of spectral data from each patient to account for inter-patient variation, 2) partitioning of the preprocessed spectral data from all patients into calibration and prediction sets, 3) dimension reduction of the preprocessed spectra in the calibration set using principal component analysis, 4) selection of the diagnostically most useful principal components using a two-sided unpaired t-test and 5) development of an optimal classification scheme based on Bayes theorem using the diagnostically useful principal component scores of the calibration set as inputs. These five individual steps of the multivariate statistical method are presented below in more detail.
1) Preprocessing: The objective of preprocessing is to calibrate tissue spectra for inter-patient variation which might obscure differences in the spectra of different tissue types. Four methods of preprocessing were invoked on the spectral data: 1) normalization, 2) mean scaling, 3) a combination of normalization and mean scaling, and 4) median scaling.
Spectra were normalized by dividing the fluorescence intensity at each emission wavelength by the maximum fluorescence intensity of that sample. Normalizing a fluorescence spectrum removes absolute intensity information; methods developed from normalized fluorescence spectra rely on differences in spectral line shape information for diagnosis. If the contribution of the absolute intensity information is not significant, two advantages are realized by utilizing normalized spectra: 1) it is no longer necessary to calibrate for inter-patient variation of normal tissue fluorescence intensity as in the two-stage method, and 2) identification of a colposcopically normal reference site in each patient prior to spectroscopic analysis is no longer needed.
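By way of illustration only, normalization to the maximum intensity of a sample may be sketched as follows. The function name and the intensities are hypothetical.

```python
def normalize(spectrum):
    """Divide every intensity by the maximum intensity of the spectrum,
    so that the peak equals one and only line shape remains."""
    peak = max(spectrum)
    if peak <= 0:
        raise ValueError("spectrum must have a positive peak intensity")
    return [value / peak for value in spectrum]


normalized = normalize([2.0, 8.0, 4.0])
print(normalized)
```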
Mean scaling was performed by calculating the mean spectrum for a patient (using all spectra obtained from cervical sites in that patient) and subtracting it from each spectrum in that patient. Mean-scaling can be performed on both unnormalized (original) and normalized spectra. Mean-scaling does not require colposcopy to identify a reference normal site in each patient prior to spectroscopic analysis. However, unlike normalization, mean-scaling displays the differences in the fluorescence spectrum from a particular site with respect to the average spectrum from that patient. Therefore this method can enhance differences in fluorescence spectra between tissue categories most effectively when spectra are acquired from approximately equal numbers of non diseased and diseased sites from each patient.
Median scaling is performed by calculating the median spectrum for a patient (using all spectra obtained from cervical sites in that patient) and subtracting it from each spectrum in that patient. Like mean scaling, median scaling can be performed on both unnormalized (original) and normalized spectra, and median scaling does not require colposcopy to identify a reference normal site in each patient prior to spectroscopic analysis. However, unlike mean scaling, median scaling does not require the acquisition of spectra from equal numbers of non diseased and diseased sites from each patient.
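By way of illustration only, mean scaling and median scaling of the spectra from one patient may be sketched as follows. The function names and the intensities are hypothetical, and the mean and median spectra are assumed to be computed wavelength by wavelength across the sites of that patient.

```python
from statistics import mean, median


def _scale(spectra, center):
    """Subtract a per-wavelength reference spectrum from each spectrum."""
    reference = [center(column) for column in zip(*spectra)]
    return [[value - ref for value, ref in zip(spectrum, reference)]
            for spectrum in spectra]


def mean_scale(spectra):
    return _scale(spectra, mean)


def median_scale(spectra):
    return _scale(spectra, median)


# Three sites from one hypothetical patient, two emission wavelengths each.
patient = [[1.0, 2.0], [2.0, 4.0], [9.0, 6.0]]
mean_scaled = mean_scale(patient)
median_scaled = median_scale(patient)
```

In this example the third site is much brighter at the first wavelength; it shifts the mean spectrum but not the median spectrum, which is consistent with median scaling being less dependent on the mix of sites measured.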
2) Calibration and Prediction Data Sets: The preprocessed spectral data were randomly assigned into either a calibration or prediction set. The multivariate statistical method was developed and optimized using the calibration set. It was then tested prospectively on the prediction data set.
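By way of illustration only, a random assignment of samples to calibration and prediction sets may be sketched as follows. The function name, the equal split and the fixed seed are assumptions made for reproducibility of the example.

```python
import random


def partition(samples, calibration_fraction=0.5, seed=0):
    """Randomly split samples into (calibration, prediction) lists."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)   # seeded for repeatability
    cut = int(len(shuffled) * calibration_fraction)
    return shuffled[:cut], shuffled[cut:]


# Ten hypothetical sample identifiers.
calibration, prediction = partition(range(10))
```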
3) Principal Component Analysis: Principal component analysis (PCA) is a linear model which transforms the original variables of a fluorescence emission spectrum into a smaller set of linear combinations of the original variables called principal components that account for most of the variance of the original data set. Principal component analysis is described in Dillon W.R., Goldstein M., Multivariate Analysis: Methods and Applications, John Wiley and Sons, 1984, pp. 23-52, the disclosure of which is expressly incorporated herein by reference. While PCA may not provide direct insight into the morphologic and biochemical basis of tissue spectra, it provides a novel approach of condensing all the spectral information into a few manageable components, with minimal information loss. Furthermore, each principal component can be easily related to the original emission spectrum, thus providing insight into diagnostically useful emission variables.
Prior to PCA, a data matrix is created where each row of the matrix contains the preprocessed fluorescence spectrum of a sample and each column contains the preprocessed fluorescence intensity at each emission wavelength. The data matrix D (r x c), consisting of r rows (corresponding to r total samples from all patients in the training set) and c columns (corresponding to intensity at c emission wavelengths) can be written as:
$$ D = \begin{pmatrix} D_{11} & D_{12} & \cdots & D_{1c} \\ D_{21} & D_{22} & \cdots & D_{2c} \\ \vdots & \vdots & & \vdots \\ D_{r1} & D_{r2} & \cdots & D_{rc} \end{pmatrix} \tag{1} $$
The first step in PCA is to calculate the covariance matrix, Z. First, each column of the preprocessed data matrix D is mean-scaled. The mean-scaled preprocessed data matrix, Dm is then multiplied by its transpose and each element of the resulting square matrix is divided by (r-1) , where r is the total number of samples. The equation for calculating Z is defined as:
$$ Z = \frac{1}{r-1}\, D_m^{T} D_m \tag{2} $$
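By way of illustration only, the covariance calculation described above (mean-scale each column, multiply the transpose by the matrix, divide by r-1) may be sketched in plain Python as follows. The function name and the small data matrix are hypothetical.

```python
def covariance_matrix(D):
    """Return the c x c covariance matrix of an r x c data matrix D."""
    r, c = len(D), len(D[0])
    column_means = [sum(row[j] for row in D) / r for j in range(c)]
    # Mean-scale each column of D.
    Dm = [[row[j] - column_means[j] for j in range(c)] for row in D]
    # Element (i, j) of Dm-transpose times Dm, divided by (r - 1).
    return [[sum(Dm[k][i] * Dm[k][j] for k in range(r)) / (r - 1)
             for j in range(c)]
            for i in range(c)]


# Three hypothetical samples measured at two emission wavelengths.
D = [[1.0, 2.0], [2.0, 4.0], [3.0, 9.0]]
Z = covariance_matrix(D)
print(Z)
```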
The square covariance matrix, Z (c x c), is decomposed into its respective eigenvalues and eigenvectors. Because of experimental error, the total number of eigenvalues will always equal the total number of columns (c) in the data matrix D, assuming that c < r. The goal is to select n < c eigenvalues that can describe most of the variance of the original data matrix to within experimental error. The variance, V, accounted for by the first n eigenvalues can be calculated as follows:
$$ V = 100 \times \frac{\sum_{j=1}^{n} \lambda_j}{\sum_{j=1}^{c} \lambda_j} \tag{3} $$
The criterion used in this analysis was to retain the first n eigenvalues and corresponding eigenvectors that account for 99% of the variance in the original data set.
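By way of illustration only, the selection of the number of eigenvalues to retain under the 99% criterion may be sketched as follows. The function name and the eigenvalues are hypothetical; the eigenvalues are assumed to have been obtained already from the decomposition of Z.

```python
def components_to_retain(eigenvalues, threshold_percent=99.0):
    """Smallest n such that the n largest eigenvalues account for at
    least `threshold_percent` of the total variance."""
    ordered = sorted(eigenvalues, reverse=True)
    total = sum(ordered)
    running = 0.0
    for n, value in enumerate(ordered, start=1):
        running += value
        if 100.0 * running / total >= threshold_percent:
            return n
    return len(ordered)


# Hypothetical eigenvalues of a 5 x 5 covariance matrix.
n_retained = components_to_retain([6.0, 3.0, 0.95, 0.04, 0.01])
print(n_retained)
```

With these values the first two eigenvalues account for 90% of the variance and the first three for 99.5%, so three components are retained.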
Next, the principal component score matrix can be calculated according to the following equation:
$$ R = D\,C \tag{4} $$
where D (r x c) is the preprocessed data matrix and C (c x n) is a matrix whose columns contain the n eigenvectors which correspond to the first n eigenvalues. Each row of the score matrix R (r x n) corresponds to the principal component scores of a sample and each column corresponds to a principal component. The principal components are mutually orthogonal.
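By way of illustration only, the score matrix product of equation (4) may be sketched in plain Python as follows. The function name, the data matrix and the eigenvector matrix are hypothetical; the eigenvectors are assumed to be supplied as the columns of C.

```python
def score_matrix(D, C):
    """Multiply the r x c data matrix D by the c x n eigenvector matrix C,
    giving one row per sample and one column per retained component."""
    c, n = len(C), len(C[0])
    if any(len(row) != c for row in D):
        raise ValueError("columns of D must match rows of C")
    return [[sum(row[k] * C[k][j] for k in range(c)) for j in range(n)]
            for row in D]


# Two hypothetical samples at three wavelengths, two retained eigenvectors.
D = [[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]]
C = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]   # placeholder orthonormal columns
R = score_matrix(D, C)
print(R)
```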
Finally, the component loading is calculated for each principal component. The component loading represents the correlation between the principal component and the variables of the original fluorescence emission spectrum. The component loading can be calculated as shown below:
$$ CL_{ij} = \frac{C_{ij}}{\sqrt{S_{ii}}}\, \sqrt{\lambda_j} \tag{5} $$
where CL_ij represents the correlation between the ith variable (preprocessed intensity at ith emission wavelength) and the jth principal component. C_ij is the ith component of the jth eigenvector, λ_j is the jth eigenvalue and S_ii is the variance of the ith variable.
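By way of illustration only, the component loading may be sketched as follows, assuming the usual PCA relation in which the loading equals the eigenvector component multiplied by the square root of the eigenvalue and divided by the standard deviation of the variable. The function name and the numeric values are hypothetical.

```python
import math


def component_loading(eigvec_component, eigenvalue, variable_variance):
    """Correlation between the ith variable and the jth principal component.

    eigvec_component  -- ith component of the jth eigenvector
    eigenvalue        -- jth eigenvalue
    variable_variance -- variance of the ith variable
    """
    return eigvec_component * math.sqrt(eigenvalue) / math.sqrt(variable_variance)


loading = component_loading(0.6, eigenvalue=4.0, variable_variance=9.0)
print(round(loading, 3))
```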
Principal component analysis was performed on each type of preprocessed data matrix described above. Eigenvalues accounting for 99% of the variance in the original preprocessed data set were retained. The corresponding eigenvectors were then multiplied by the original data matrix to obtain the principal component score matrix R.
4) Student's T-Test: Average values of principal component scores were calculated for each histopathologic tissue category for each principal component obtained from the preprocessed data matrix. A two-sided unpaired Student's t-test was employed to determine the diagnostic contribution of each principal component. Such a test is disclosed in Devore J.L., Probability and Statistics for Engineering and the Sciences, Brooks/Cole, 1992, and in Walpole R.E., Myers R.H., Probability and Statistics for Engineers and Scientists, Macmillan Publishing Co., 1978, Chapter 7, the disclosures of which are expressly incorporated herein by reference. The hypothesis that the means of the principal component scores of two tissue categories are different was tested for 1) normal squamous epithelia and SILs, 2) columnar normal epithelia and SILs and 3) inflammation and SILs. The t-test was extended a step further to determine if there are any statistically significant differences between the means of the principal component scores of high grade SILs and low grade SILs. Principal components for which the hypothesis stated above was true below the 0.05 level of significance were retained for further analysis.
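By way of illustration only, the test statistic of an unpaired t-test on the scores of one principal component for two tissue categories may be sketched as follows. The function name and the scores are hypothetical, and the equal-variance (pooled) form of the test is an assumption, since the text does not state which form was used.

```python
import math
from statistics import mean, variance


def unpaired_t(scores_a, scores_b):
    """Return (t statistic, degrees of freedom) for two independent groups,
    using the pooled-variance form of Student's t-test."""
    na, nb = len(scores_a), len(scores_b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(scores_a) + (nb - 1) * variance(scores_b)) / df
    t = (mean(scores_a) - mean(scores_b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df


# Hypothetical principal component scores for two tissue categories.
t_stat, dof = unpaired_t([1.0, 2.0, 3.0], [4.0, 5.0, 6.0])
print(round(t_stat, 4), dof)
```

The absolute value of the statistic would then be compared with the two-sided critical value of the t distribution for the computed degrees of freedom at the 0.05 level, taken from tables or a statistics library.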
5) Logistic Discrimination: Logistic discriminant analysis is a statistical technique that can be used to develop diagnostic methods based on posterior probabilities, overcoming the drawback of the binary decision scheme employed in the two-stage method. This statistical classification method is based on Bayes theorem and can be used to calculate the posterior probability that an unknown sample belongs to each of the possible tissue categories identified. Logistic discrimination is discussed in Albert A., Harris E.K., Multivariate Interpretation of Clinical Laboratory Data, Marcel Dekker, 1987, the disclosure of which is expressly incorporated herein by reference. Classifying the unknown sample into the tissue category for which its posterior probability is highest results in a classification scheme that minimizes the rate of misclassification.
For two diagnostic categories, G_1 and G_2, the posterior probability of being a member of G_1, given measurement x, according to Bayes theorem is:
$$ P(G_1 \mid x) = \frac{P(x \mid G_1)\,P(G_1)\,C(2 \mid 1)}{P(x \mid G_1)\,P(G_1)\,C(2 \mid 1) + P(x \mid G_2)\,P(G_2)\,C(1 \mid 2)} \tag{6} $$
where P(x|G_i) is the conditional joint probability that a tissue sample of type i will have principal component score x, and P(G_i) is the prior probability of finding tissue type i in the sample population. C(j|i) is the cost of misclassifying a sample into group j when the actual membership is group i.
The prior probability P(G_i) is an estimate of the likelihood that a sample of type i belongs to a particular group when no information about it is available. If the sample is considered representative of the population, the observed proportions of cases in each group can serve as estimates of the prior probabilities. In a clinical setting, either historical incidence figures appropriate for the patient population can be used to generate prior probabilities, or the practitioner's colposcopic assessment of the likelihood of precancer can be used to estimate prior probabilities.
The conditional probabilities can be developed from the probability distributions of the n principal component scores for each tissue type, i. The probability distributions can be modeled using the gamma function, which is characterized by two parameters, alpha and beta, which are related to the mean and standard deviation of the data set. The Gamma function is typically used to model skewed distributions and is defined below:
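For reference, the two-parameter gamma probability density in its standard shape-scale form is given here. This is the textbook form and is offered as an assumption about the intended definition; in particular, treating beta as a scale parameter (rather than a rate) is not confirmed by the surrounding text.

```latex
f(x;\alpha,\beta) = \frac{x^{\alpha-1}\, e^{-x/\beta}}{\beta^{\alpha}\,\Gamma(\alpha)}, \quad x > 0,
\qquad \mu = \alpha\beta, \qquad \sigma^{2} = \alpha\beta^{2}
```

Under this parameterization, alpha and beta follow from the sample mean and standard deviation as alpha = (mu/sigma)^2 and beta = sigma^2/mu.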
The gamma function can be used to calculate the conditional probability that a sample from tissue type i, will exhibit the principal component score, x. If more than one principal component is needed to describe a sample population, then the conditional joint probability is simply the product of the conditional probabilities of each principal component (assuming that each principal component is an independent variable) for that sample population.
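The calculation described above can be sketched as follows, assuming a shape-scale gamma density for each principal component score and independence between components. All function and parameter names are hypothetical; `cost_21` stands for C(2|1) and `cost_12` for C(1|2) in equation (6). The gamma density is zero for non-positive scores, so this sketch presumes scores have been shifted to be positive if necessary.

```python
import math

def gamma_pdf(x, alpha, beta):
    """Two-parameter gamma density (shape alpha, scale beta); zero for x <= 0."""
    if x <= 0:
        return 0.0
    return x ** (alpha - 1) * math.exp(-x / beta) / (math.gamma(alpha) * beta ** alpha)

def joint_conditional(scores, params):
    """Product of per-component densities, assuming independent components."""
    p = 1.0
    for x, (alpha, beta) in zip(scores, params):
        p *= gamma_pdf(x, alpha, beta)
    return p

def posterior_g1(scores, params_g1, params_g2, prior_g1, cost_21, cost_12):
    """Posterior probability of membership in G1, following equation (6)."""
    prior_g2 = 1.0 - prior_g1
    num = joint_conditional(scores, params_g1) * prior_g1 * cost_21
    den = num + joint_conditional(scores, params_g2) * prior_g2 * cost_12
    return num / den

# Hypothetical example: identical densities for both groups and equal costs,
# so the posterior reduces to the prior.
params = [(2.0, 1.0), (3.0, 1.0)]
p = posterior_g1([1.0, 2.0], params, params, prior_g1=0.35, cost_21=0.5, cost_12=0.5)
```

When the two groups have identical conditional densities and equal costs, the measurement carries no information and the posterior equals the prior, which is a convenient sanity check.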
C. Multivariate Analysis of Tissue Fluorescence Spectra
1) SILs vs. Normal Squamous Tissue at 337 nm excitation
A summary of the fluorescence diagnostic method developed and tested in a previous group of 92 patients (476 sites) is presented here. The spectral data were preprocessed by normalizing each spectrum to a peak intensity of one, followed by mean-scaling. Mean scaling
is performed by calculating the mean spectrum for a patient (using all spectra obtained from cervical sites in that patient) and subtracting it from each spectrum in that patient. Next, principal component analysis (PCA) is used to transform the original variables of each preprocessed fluorescence emission spectrum into a smaller set of linear combinations called principal components that account for 99% of the variance of the original data set. Only the diagnostically useful principal components are retained for further analysis. Posterior probabilities for each tissue type are determined for all samples in the data set using calculated prior and conditional joint probabilities. The prior probability is calculated as the percentage of each tissue type in the data. The conditional probability was calculated from the gamma function, which modeled the probability distributions of the retained principal component scores for each tissue category. The entire data set was split into two groups, a calibration set and a prediction set, such that their prior probabilities were approximately equal. The method is optimized using the calibration set and then implemented on the prediction set to estimate its performance in an unbiased manner. The methods using PCA and Bayes theorem were developed using the calibration set consisting of previously collected spectra from 46 patients (239 sites). These methods were then applied to the prediction set (previously collected spectra from another 46 patients; 237 sites) and the current data set of 36 samples.
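The preprocessing described in this paragraph (normalization to a peak intensity of one, followed by per-patient mean-scaling) can be sketched as below. The function names and the example intensities are hypothetical, and the order of operations follows the description in this paragraph.

```python
def normalize_to_peak(spectrum):
    """Scale a spectrum so that its maximum intensity equals one."""
    peak = max(spectrum)
    return [v / peak for v in spectrum]

def mean_scale(patient_spectra):
    """Subtract the patient's mean spectrum from each of that patient's spectra."""
    n = len(patient_spectra)
    mean_spectrum = [sum(col) / n for col in zip(*patient_spectra)]
    return [[v - m for v, m in zip(s, mean_spectrum)] for s in patient_spectra]

def preprocess_patient(raw_spectra):
    """Normalize each spectrum to peak one, then mean-scale within the patient."""
    return mean_scale([normalize_to_peak(s) for s in raw_spectra])

# Hypothetical patient with two sites and three emission wavelengths.
processed = preprocess_patient([[2.0, 4.0, 2.0], [1.0, 1.0, 2.0]])
```

After mean-scaling, the spectra of each patient sum to zero at every emission wavelength, so only within-patient differences remain.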
More specifically, at 337 nm excitation, fluorescence spectra were acquired from a total of 476 sites in 92 patients. The data were randomly assigned to either a calibration set or a prediction set with the condition that both sets contain a roughly equal number of samples from each histo-pathologic category, as shown in Table 1.
Table 1. (a) Histo-pathologic classification of samples in the training and the validation set examined at 337 nm excitation and (b) histological classification of cervical samples spectroscopically interrogated in vivo from 40 patients at 380 nm excitation and 24 patients at 460 nm excitation.
(a)
Histology          Training Set    Validation Set
Squamous Normal    127             126
Columnar Normal    25              25
Inflammation       16              16
Low Grade SIL      40              40
High Grade SIL     31              30
(b)
Histology          380 nm excitation    460 nm excitation
                   (40 patients)        (24 patients)
Squamous Normal    82                   76
Columnar Normal    20                   24
Inflammation       10                   11
Low Grade SIL      28                   14
High Grade SIL     15                   22
The random assignment ensured that not all spectra from a single patient were contained in the same data set. The purpose of the calibration set is to develop and optimize the method and the purpose of the prediction set is to prospectively test its accuracy in an unbiased manner. The two-stage method and the multivariate statistical method were optimized using the calibration set. The performance of these methods was then tested prospectively on the prediction set.
Principal component analysis of mean-scaled normalized spectra at 337 nm excitation from the calibration data set resulted in 3 principal components accounting for 99% of the total variance. Only the first two principal components obtained from the preprocessed data matrix containing mean-scaled normalized spectra demonstrate the statistically most significant differences (P < 0.05) between normal squamous tissues and SILs (PC1: P < 1E-25, PC2: P < 0.006). The two-tail P values of the scores of the third principal component were not statistically significant (P < 0.2). Therefore, the rest of the analysis was performed using these two principal components. All of the principal components are included in Appendix II.
For excitation at 337 nm, the prior probability was determined by calculating the percentage of each tissue type in the calibration set: 65% normal squamous tissues and 35% SILs. More generally, prior probabilities should be selected to describe the patient population under study; the values used here are appropriate as they describe the prediction set as well.
Posterior probabilities of belonging to each tissue type (normal squamous or SIL) were calculated for all samples in the calibration set, using the known prior probabilities and the conditional probabilities
calculated from the gamma function. A cost of misclassification of SILs equal to 0.5 was assumed. FIG. 5 illustrates the posterior probability of belonging to the SIL category. The posterior probability is plotted for all samples in the calibration set. This plot indicates that 75% of the high grade SILs have a posterior probability greater than 0.75 and almost 90% of high grade SILs have a posterior probability greater than 0.6. While 85% of low grade SILs have a posterior probability greater than 0.5, only 60% of low grade SILs have a posterior probability greater than 0.75. More than 80% of normal squamous epithelia have a posterior probability less than 0.25. Note that evaluation of normal columnar epithelia and samples with inflammation using this method results in classifying them as SILs.
FIG. 6 shows the percentage of normal squamous tissues and SILs correctly classified versus cost of misclassification of SILs for the data from the calibration set. An increase in the SIL misclassification cost results in an increase in the proportion of correctly classified SILs and a decrease in the proportion of correctly classified normal squamous tissues. Note that varying the cost from 0.4 to 0.6 alters the classification accuracy of both SILs and normal tissues by less than 15%, indicating that a small change in the cost does not significantly alter the performance of the method. An optimal cost of misclassification would be 0.6-0.7, as this correctly classifies almost 95% of SILs and 80% of normal squamous epithelia for the prior probabilities used and is not sensitive to small changes in prior probability.
The method was implemented on mean-scaled spectra of the prediction set, to obtain an unbiased estimate of its accuracy. The two eigenvectors obtained from the calibration set were multiplied by the prediction matrix
to obtain the new principal component score matrix. Using the same prior probabilities, a cost of misclassification of SILs equal to 0.5, and conditional joint probabilities calculated from the gamma function, all developed from the calibration set, Bayes rule was used to calculate the posterior probabilities for all samples in the prediction set.
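Computing the prediction-set scores amounts to a matrix product between the preprocessed prediction spectra and the retained calibration eigenvectors. A minimal sketch follows; the function name and all numeric values are hypothetical.

```python
def project_scores(prediction_matrix, eigenvectors):
    """Multiply each preprocessed spectrum (row) by each retained eigenvector."""
    return [
        [sum(v * e for v, e in zip(spectrum, vec)) for vec in eigenvectors]
        for spectrum in prediction_matrix
    ]

# Hypothetical values: two preprocessed spectra over three emission wavelengths
# and two orthonormal eigenvectors taken from the calibration set.
prediction_matrix = [[0.0, 0.25, -0.25], [0.0, -0.25, 0.25]]
eigenvectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
scores = project_scores(prediction_matrix, eigenvectors)
```

Each row of `scores` holds the principal component scores of one prediction-set sample; no eigenvectors are recomputed from the prediction data, which keeps the estimate unbiased.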
Confusion matrices in Tables 2 (a) and 2 (b) show the spectroscopic classification using this method for the calibration set and the prediction set, respectively. A comparison of the sample classification between the prediction and calibration sets indicates that the method performs within 7% on an unknown data set of approximately equal prior probability.
Table 2. Results of multivariate statistical method applied to the entire fluorescence emission spectra of squamous normal tissues and SILs at 337 nm excitation in (a) calibration set and (b) prediction set.
(a)
Classification     Squamous Normal    Low Grade SIL    High Grade SIL
Squamous Normal    83%                15%              10%
SIL                17%                85%              90%
(b)
Classification     Squamous Normal    Low Grade SIL    High Grade SIL
Squamous Normal    81%                22%              6%
SIL                19%                78%              94%
The utility of another parameter called the component loadings was explored for reducing the number of emission variables required to achieve classification with minimal decrease in predictive ability. Portions of the emission spectrum most highly correlated (correlation > 0.9 or < -0.9) with the component loadings were selected and the reduced data matrix was used to regenerate and evaluate the method. Using intensity at 2 emission wavelengths, the method was developed in an identical manner as was done with the entire emission spectrum. It was optimized using the calibration set and implemented on the prediction set. A comparison of the sample classification based on the method using the entire emission spectrum to that using intensity at 2 emission wavelengths indicates that the latter method performs equally well in classifying normal squamous epithelia and low grade SILs. The performance of the latter method is 6% lower for classifying high grade SILs.
2) SILs vs. Normal Columnar Epithelia and Inflammation at 380 nm Excitation
Principal components obtained from the preprocessed data matrix containing mean-scaled normalized spectra at 380 nm excitation could be used to differentiate SILs from non diseased tissues (normal columnar epithelia and inflammation) . The principal components are included in Appendix II. Furthermore, a two-sided unpaired t-test
indicated that only principal component 2 (PC2) and principal component 5 (PC5) demonstrated the statistically most significant differences (p ≤ 0.05) between SILs and non diseased tissues (normal columnar epithelia and inflammation). The p values of the remaining principal component scores were not statistically significant (p > 0.13). Therefore, the rest of the analysis was performed using these two principal components, which account collectively for 32% of the variation in the original data set.
FIG. 7A and FIG. 7B illustrate the measured probability distribution and the best fit of the normal probability density function to PC2 and PC5 of non diseased tissues and SILs, respectively. There is reasonable agreement between the measured and calculated probability distribution for each case. The prior probability was determined by calculating the percentage of each tissue type in the data set: 41% non diseased tissues and 59% SILs. Posterior probabilities of belonging to each tissue type were calculated for all samples in the data set, using the known prior probabilities and the conditional joint probabilities calculated from the normal probability density function. FIG. 8 illustrates the retrospective performance of the diagnostic method on the same data set used to optimize it. The posterior probability of being classified into the SIL category is plotted for all samples evaluated. The results shown are for a cost of misclassification of SILs equal to 50%. FIG. 8 indicates that 78% of SILs have a posterior probability greater than 0.5, 78% of normal columnar tissues have a posterior probability less than 0.5 and 60% of samples with inflammation have a posterior probability less than 0.5. Note that there are only 10 samples with inflammation in this study.
Tables 3 (a) and (b) compare (a) the retrospective performance of the diagnostic method on the data set used to optimize it to (b) a prospective estimate of the method's performance using cross-validation. Table 3(a) indicates that for a cost of misclassification of 50%, 74% of high grade SILs, 78% of low grade SILs, 78% of normal columnar samples and 60% of samples with inflammation are correctly classified. The unbiased estimate of the method's performance (Table 3(b)) indicates that there is no change in the percentage of correctly classified SILs and approximately only a 10% decrease in the proportion of correctly classified normal columnar samples.
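The text does not state which form of cross-validation was used. As one possible illustration, the sketch below shows leave-one-out cross-validation, in which each sample is classified by a model developed from all remaining samples. The nearest-mean classifier is a hypothetical stand-in chosen for brevity and is not the logistic discrimination method of this disclosure; all names and values are assumptions.

```python
def leave_one_out_accuracy(samples, labels, fit, predict):
    """Fraction of samples classified correctly when each is held out in turn."""
    correct = 0
    for i in range(len(samples)):
        train_x = samples[:i] + samples[i + 1:]
        train_y = labels[:i] + labels[i + 1:]
        model = fit(train_x, train_y)
        if predict(model, samples[i]) == labels[i]:
            correct += 1
    return correct / len(samples)

# Stand-in classifier (hypothetical): assign the class whose mean score is nearest.
def fit_means(xs, ys):
    return {c: sum(x for x, y in zip(xs, ys) if y == c) / ys.count(c) for c in set(ys)}

def predict_nearest(means, x):
    return min(means, key=lambda c: abs(x - means[c]))

# Hypothetical one-dimensional principal component scores.
scores = [0.1, 0.2, 0.3, 0.9, 1.0, 1.1]
labels = ["normal", "normal", "normal", "SIL", "SIL", "SIL"]
accuracy = leave_one_out_accuracy(scores, labels, fit_means, predict_nearest)
```

Because the held-out sample never contributes to the model that classifies it, the resulting accuracy is less optimistic than a retrospective evaluation on the optimization data.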
Table 3. (a) A retrospective and (b) prospective estimate of the multivariate statistical method's performance using mean-scaled normalized spectra at 380 nm excitation to differentiate SILs from non diseased tissues (normal columnar epithelia and inflammation).
(a)
Classification    Normal Columnar    Inflammation    Low Grade SIL    High Grade SIL
Non diseased      78%                60%             21%              26%
SIL               22%                40%             79%              74%
(b)
Classification    Normal Columnar    Inflammation    Low Grade SIL    High Grade SIL
Non diseased      65%                30%             22%              26%
SIL               35%                70%             78%              74%
3) Squamous Normal Tissue vs. SILs at 460 nm Excitation
Principal components obtained from the preprocessed data matrix containing mean-scaled normalized spectra at 460 nm excitation could be used to differentiate SIL from normal squamous tissue. These principal components are included in Appendix II. Only principal components 1 and 2 demonstrated the statistically most significant differences (p < 0.05) between SILs and normal squamous tissues. The p values of the remaining principal
component scores were not statistically significant (p > 0.06). Therefore, the rest of the analysis was performed using these two principal components, which account collectively for 75% of the variation in the original data set.
FIG. 9A and FIG. 9B illustrate the measured probability distribution and the best fit of the normal probability density function to PC1 and PC2 of normal squamous tissues and SILs, respectively. There is reasonable agreement between the measured and calculated probability distribution for each case. The prior probabilities were determined to be: 67% normal squamous tissues and 33% SILs. Next, posterior probabilities of belonging to each tissue type were calculated for all samples in the data set. FIG. 10 illustrates the retrospective performance of the diagnostic method on the same data set used to optimize it. The posterior probability of being classified into the SIL category is plotted for all samples evaluated. The results shown are for a cost of misclassification of SILs equal to 55%. FIG. 10 indicates that 92% of SILs have a posterior probability greater than 0.5, and 76% of normal squamous tissues have a posterior probability less than 0.5.
A prospective estimate of the method's performance was obtained using cross-validation. Tables 4(a) and (b) compare (a) the retrospective performance of the method on the data set used to optimize it to (b) the prospective estimate of the method's performance using cross-validation. Table 4(a) indicates that for a cost of misclassification of SILs equal to 55%, 92% of high grade SILs, 90% of low grade SILs, and 76% of normal squamous samples are correctly classified. The unbiased estimate of the method's performance (Table 4(b)) indicates that there is no change in the percentage of correctly classified high grade SILs or normal squamous
tissue; there is a 5% decrease in the proportion of correctly classified low grade SILs.
Table 4. (a) A retrospective and (b) prospective estimate of the multivariate statistical method's performance using mean-scaled normalized spectra at 460 nm excitation to differentiate SILs from normal squamous tissues.
(a)
Classification     Normal Squamous    Low Grade SIL    High Grade SIL
Normal Squamous    76%                7%               9%
SIL                24%                93%              91%
(b)
Classification     Normal Squamous    Low Grade SIL    High Grade SIL
Normal Squamous    75%                14%              9%
SIL                25%                86%              91%
4) Low Grade SILs vs. High Grade SILs at 460 nm Excitation
Principal components obtained from the preprocessed data matrix containing normalized spectra at 460 nm excitation could be used to differentiate high grade SILs from low grade SILs. These principal components are included in Appendix II. Principal component 4 (PC4) and principal component 7 (PC7) demonstrated the
statistically most significant differences (p < 0.05) between high grade SILs and low grade SILs. The p values of the remaining principal component scores were not statistically significant (p > 0.09) . Therefore, the rest of the analysis was performed using these two principal components which account collectively for 8% of the variation in the original data set.
FIG. 11A and FIG. 11B illustrate the measured probability distribution and the best fit of the normal probability density function of PC4 and PC7 for normal squamous tissues and SILs, respectively. There is reasonable agreement between the measured and calculated probability distribution, for each case. The prior probability was determined to be: 39% low grade SILs and 61% high grade SILs. Posterior probabilities of belonging to each tissue type were calculated. FIG. 12 illustrates the retrospective performance of the diagnostic method on the same data set used to optimize it. The posterior probability of being classified into the SIL category is plotted for all samples evaluated. The results shown are for a cost of misclassification of SILs equal to 65%. FIG. 12 indicates that 82% of high grade SILs have a posterior probability greater than 0.5, and 78% of low grade SILs have a posterior probability less than 0.5.
A prospective estimate of the method's performance was obtained using cross-validation. Tables 5(a) and (b) compare (a) the retrospective performance of the method on the data set used to optimize it to (b) the unbiased estimate of the method's performance using cross-validation. Table 5(a) indicates that for a cost of misclassification of 65%, 82% of high grade SILs and 78% of low grade SILs are correctly classified. The unbiased estimate of the method's performance (Table 5(b)) indicates that there is a 5% decrease in the percentage
of correctly classified high grade SILs and low grade SILs.
Table 5. (a) A retrospective and (b) prospective estimate of the multivariate statistical method's performance using mean-scaled normalized spectra at 460 nm excitation to differentiate high grade from low grade SILs.
(a)
Classification    Low Grade SIL    High Grade SIL
Low Grade SIL     79%              18%
High Grade SIL    21%              82%
(b)
Classification    Low Grade SIL    High Grade SIL
Low Grade SIL     72%              27%
High Grade SIL    21%              77%
FIG. 3A, FIG. 3B and FIG. 3C are flowcharts of the above-described fluorescence spectroscopy diagnostic methods. In practice, the flowcharts of FIG. 3A, FIG. 3B and FIG. 3C are coded into appropriate form and are loaded into the program memory of computer 119 (FIG. 1) which then controls the apparatus of FIG. 1 to cause the performance of the diagnostic method of the present invention.
Referring first to FIG. 3A, control begins in block 300 where fluorescence spectra are obtained from the patient at 337, 380 and 460 nm excitation. Control then passes to block 301 where the probability of the tissue sample under consideration being SIL is calculated from the spectra obtained from the patient at 337 or 460 nm. This method is shown in more detail with reference to FIG. 3B.
Control then passes to decision block 302 where the probability of SIL calculated in block 301 is compared against a threshold of 0.5. If the probability is not greater than 0.5, control passes to block 303 where the tissue sample is diagnosed normal, and the routine is ended. On the other hand, if the probability calculated in block 301 is greater than 0.5, control passes to block 304 where the probability of the tissue containing SIL is calculated based upon the emission spectra obtained from excitation at 380 nm. This method is identical to the method used to calculate probability of SIL from fluorescence spectra due to 337 or 460 nm, and is also presented below in more detail with reference to FIG. 3B.
Control then passes to decision block 306 where the probability of SIL calculated in block 304 is compared against a threshold of 0.5. If the probability calculated in block 304 is not greater than 0.5, control passes to block 307 where normal tissue is diagnosed and the routine is ended. Otherwise, if decision block 306 determines that the probability calculated in block 304 is greater than 0.5, control passes to block 308 where the probability of high grade SIL is calculated from the fluorescence emission spectra obtained from a 460 nm excitation. This method is discussed below in greater detail with reference to FIG. 3C.
Control then passes to decision block 309 where the probability of high grade SIL calculated in block 308 is compared with a threshold of 0.5. If the probability calculated in block 308 is not greater than 0.5, low grade SIL is diagnosed (block 311); otherwise, high grade SIL is diagnosed (block 312).
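The decision sequence of FIG. 3A can be summarized by the following sketch. The function name and argument names are hypothetical. For simplicity the three posterior probabilities are passed in together, whereas in the flowchart each is calculated only when the preceding decision block requires it.

```python
def diagnose(p_sil_337_or_460, p_sil_380, p_high_grade_460, threshold=0.5):
    """Walk the three decision blocks of FIG. 3A and return the diagnosis."""
    if not p_sil_337_or_460 > threshold:   # decision block 302
        return "normal"                    # block 303
    if not p_sil_380 > threshold:          # decision block 306
        return "normal"                    # block 307
    if not p_high_grade_460 > threshold:   # decision block 309
        return "low grade SIL"             # block 311
    return "high grade SIL"                # block 312
```

Note that a probability exactly equal to the threshold is treated as "not greater than 0.5", matching the wording of the decision blocks.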
Referring now to FIG. 3B, the conditioning of the fluorescence spectra by blocks 301 and 304 is presented in more detail. It should be noted that while the processing of blocks 301 and 304 is identical, block 301 operates on spectra obtained from a 337 or 460 nm excitation, whereas block 304 operates on spectra obtained from a 380 nm excitation. In either case, control begins in block 315 where the fluorescence spectra data matrix, D, is constructed, each row of which corresponds to a sample fluorescence spectrum taken from the patient. Control then passes to block 316 where the mean intensity at each emission wavelength of the detected fluorescence spectra is calculated. Then, in block 317, each spectrum of the data matrix is normalized relative to a maximum of each spectrum. Then, in block 318, each spectrum of the data matrix is mean scaled relative to the mean calculated in block 316. The output of block 318 is a preprocessed data matrix, comprising preprocessed spectra for the patient under examination.
Control then passes to block 319 where principal component analysis is conducted, as discussed above, with reference to equations 2, 3, 4 and 5. During principal component analysis, the covariance matrix Z (equation (2)) is calculated using a preprocessed data matrix, the rows of which comprise normalized, mean scaled spectra obtained from all patients, including the patient presently under consideration. The result of block 319 is applied to block 321 where a two-sided Student's t-test is conducted, which results in selection of only
diagnostic principal components. Control then passes to block 322 where logistic discrimination is conducted, which was discussed above with reference to equations 6 and 7.
The quantity calculated by block 322 is the posterior probability of the sample belonging to the SIL category (block 323) .
Referring now to FIG. 3C, presented are the details of the determination of the probability of high grade SIL from excitation at 460 nm (block 308, FIG. 3A) . Control begins in block 324 where the fluorescence spectra data matrix, D, is constructed, each row of which corresponds to a sample fluorescence spectrum taken from the patient. Control then passes to block 326 where each spectrum of the data matrix is normalized relative to a maximum of each spectrum. The output of block 326 is a preprocessed data matrix, comprising preprocessed spectra for the patient under examination. It should be noted that, in contrast to the preprocessing performed in the SIL probability calculating routine of FIG. 3B, there is no mean scaling performed when calculating the probability of high grade SIL.
Control then passes to block 327 where principal component analysis is conducted, as discussed above, with reference to equations 2, 3, 4 and 5. During principal component analysis, the covariance matrix Z (equation (2)) is calculated using a preprocessed data matrix, the rows of which comprise normalized spectra obtained from all patients, including the patient presently under consideration. The result of block 327 is applied to block 328 where a two-sided Student's t-test is conducted, which results in selection of only diagnostic principal components. Control then passes to block 329 where logistic discrimination is conducted,
which was discussed above with reference to equations 6 and 7.
The quantity calculated by block 329 is the posterior probability of the sample belonging to the high grade SIL category (block 331) .
2. Raman Spectroscopy Diagnostic Method
To illustrate the efficacy of the present invention, twenty colposcopically normal and twenty colposcopically abnormal samples were studied. Two sample pairs were discarded due to experimental errors. Histologically, there were 19 normal, 2 metaplasia, 4 inflammation, 2 HPV and 9 dysplasia samples (2 mild and 7 moderate to severe dysplasias) . For the purposes of this study the samples are classified as follows: normal, metaplasia, inflammation, low grade SIL and high grade SIL. Two types of differentiation are of interest clinically: (1) SILs from all other tissues and (2) high grade SILs from low grade SILs. The diagnostic methods developed using fluorescence and Raman spectroscopy are targeted towards achieving optimal sensitivity in this differentiation.
Near infrared spectra of cervical tissues were obtained using the system shown in FIG. 2. These spectra are distorted by noise and autofluorescence and are preferably processed to yield the tissue vibrational spectrum. Rhodamine 6G powder packed in a quartz cuvette was used for calibration purposes since it has well documented Raman and fluorescent properties. A rhodamine spectrum with high signal to noise ratio was obtained using a 20 second integration time and a rhodamine spectrum with low signal to noise ratio was obtained using a 1 second integration time. FIG. 13 is a graph showing measured rhodamine spectra with high and low S/N ratios, respectively.
The observed noise in the spectra was established to be approximately gaussian. This implies that the use of simple filtering techniques would be effective in smoothing the curves. Using a moving average window or median filter yields acceptable results. However, optimal results are obtained when the spectrum is convolved with a gaussian whose full width at half maximum equals the resolution of the system. This technique discards any signal with bandwidth less than the resolution of the system. The filtered scattering signal is still distorted by the residual fluorescence. A simple but accurate method to eliminate this fluorescence is to fit the spectrum containing both Raman and fluorescence information to a polynomial of high enough order to capture the fluorescence line shape but not the higher frequency Raman signal. A 5th degree polynomial was used. The polynomial was then subtracted from the spectrum to yield the Raman signal alone. In FIG. 14, the efficiency of eliminating fluorescence from a Raman signal using a polynomial fit is illustrated.
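The two processing steps described above (gaussian smoothing, then subtraction of a fitted 5th degree polynomial) can be sketched as follows. This is an illustrative sketch under stated assumptions, not the implementation used in the study: all function names are hypothetical, the resolution is expressed in samples rather than wavenumbers, edges are handled by renormalizing the kernel, and the polynomial is fitted by ordinary least squares over an axis rescaled to [-1, 1].

```python
import math

def gaussian_smooth(signal, fwhm_samples):
    """Convolve with a normalized Gaussian whose FWHM is given in samples."""
    sigma = fwhm_samples / (2.0 * math.sqrt(2.0 * math.log(2.0)))
    half = max(1, int(math.ceil(3 * sigma)))
    kernel = [math.exp(-0.5 * (k / sigma) ** 2) for k in range(-half, half + 1)]
    n = len(signal)
    out = []
    for i in range(n):
        acc = wsum = 0.0
        for k, w in zip(range(-half, half + 1), kernel):
            j = i + k
            if 0 <= j < n:  # renormalize at the edges instead of padding
                acc += w * signal[j]
                wsum += w
        out.append(acc / wsum)
    return out

def polyfit(xs, ys, degree):
    """Least-squares polynomial coefficients (lowest order first) via normal equations."""
    m = degree + 1
    a = [[sum(x ** (r + c) for x in xs) for c in range(m)] for r in range(m)]
    b = [sum(y * x ** r for x, y in zip(xs, ys)) for r in range(m)]
    for col in range(m):  # Gaussian elimination with partial pivoting
        piv = max(range(col, m), key=lambda r: abs(a[r][col]))
        a[col], a[piv] = a[piv], a[col]
        b[col], b[piv] = b[piv], b[col]
        for r in range(col + 1, m):
            f = a[r][col] / a[col][col]
            for c in range(col, m):
                a[r][c] -= f * a[col][c]
            b[r] -= f * b[col]
    coeffs = [0.0] * m
    for r in range(m - 1, -1, -1):  # back substitution
        coeffs[r] = (b[r] - sum(a[r][c] * coeffs[c] for c in range(r + 1, m))) / a[r][r]
    return coeffs

def subtract_fluorescence(signal, degree=5):
    """Fit a polynomial baseline over a [-1, 1] axis and subtract it."""
    n = len(signal)
    xs = [2.0 * i / (n - 1) - 1.0 for i in range(n)]
    coeffs = polyfit(xs, signal, degree)
    baseline = [sum(c * x ** p for p, c in enumerate(coeffs)) for x in xs]
    return [s - b for s, b in zip(signal, baseline)]
```

A spectrum would be processed as `subtract_fluorescence(gaussian_smooth(spectrum, fwhm_samples))`. A slowly varying background that is itself a low-order polynomial is removed almost exactly, while narrow Raman peaks, which a 5th degree polynomial cannot follow, remain in the residual.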
FIG. 15 is a graph showing that the processed low S/N rhodamine spectrum is similar to the high S/N rhodamine spectrum and is not distorted by the filtering process. Referring to FIG. 15, it can be seen that the initially noisy spectrum of the low S/N rhodamine, once processed, shows the same principal and secondary peaks as the spectrum of high S/N rhodamine. This validates the signal processing techniques used and indicates that the technique does not distort the resultant spectrum. Each tissue spectrum was thus processed. Peak intensities of relevant bands from these spectra were measured and used for diagnosis.
Typical processed spectra for a pair of normal and abnormal samples from the same patient are shown in FIG. 16 which is a graph of a typical pair of processed
spectra from a patient with dysplasia showing the different peaks observed. Several peaks are observed at 626, 818, 978, 1070, 1175, 1246, 1325, 1454 and 1656 cm-1 (± 11 cm-1). Several of the peaks observed have been cited in studies on gynecologic tissues by other groups such as Lui et al., "Fluorescence and Time-Resolved Light Scattering as Optical Diagnostic Techniques to Separate Diseased and Normal Biomedical Media", J Photochem Photobiol B: Biol, 16, 187-209, 1992, on colon tissues, and in IR absorption studies on cervical cells by Wong et al., "Infrared Spectroscopy of Human Cervical Cells: Evidence of Extensive Structural Changes during Carcinogenesis", Proc Natl Acad Sci USA, 88, 10988-10992, 1991. FIG. 17 is a graph of the intensity of the band at 1325 cm-1 for all biopsies to illustrate the patient to patient variation in the intensities of the Raman bands. The intensity of the various Raman bands shows significant patient to patient variability. In FIG. 17, the samples are plotted as pairs from each patient. To account for this patient to patient variability, each peak in a spectrum was normalized to the corresponding peak of the colposcopic and histologic normal sample from the same patient. Thus all colposcopic normal samples that are histologically normal have a peak intensity of one. Normalized and unnormalized spectra were analyzed for diagnostic information.
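The paired normalization described above can be sketched as a band-by-band ratio. The function name and the intensity values are hypothetical; the band positions are those listed in the text.

```python
def normalize_to_paired_normal(peak_intensities, normal_peak_intensities):
    """Divide each band's peak intensity by the same band's peak in the paired normal."""
    return {
        band: peak_intensities[band] / normal_peak_intensities[band]
        for band in peak_intensities
    }

# Hypothetical peak intensities (arbitrary units), keyed by Raman shift in cm-1.
normal_sample = {626: 4.0, 1070: 2.0, 1656: 8.0}
abnormal_sample = {626: 2.0, 1070: 1.0, 1656: 6.0}
ratios = normalize_to_paired_normal(abnormal_sample, normal_sample)
```

Applying the same function to the normal sample itself gives a value of one at every band, consistent with the statement that histologically normal paired samples have a normalized peak intensity of one.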
Each of the bands observed contains some diagnostic information and can differentiate between tissue types with varying accuracy. Clinically, the separation of
SILs from all other tissues and high grade SILs from low grade SILs is of interest. Because of the patient to patient variability, more significant differentiation was obtained using paired analysis. The bands at 626, 1070 and 1656 cm-1 can each differentiate SILs from all other tissues. At all three bands, the intensity of the normal
is greater than the intensity of the SIL. This is illustrated in FIG. 18 and FIG. 19.
FIG. 18A and FIG. 18B are graphs showing the diagnostic capability of the normalized peak intensity of the Raman bands at 626 cm-1 and 1070 cm-1, respectively. The band at 626 cm-1, which is due to ring deformations, differentiates SILs from all other tissues with a sensitivity and specificity of 91% and 92% (FIG. 18A). One SIL sample (focal HPV) is misclassified. The metaplasia samples are incorrectly classified as SILs at this band. However, using the intensity at the C-O stretching and bending vibrational band of about 1070 cm-1 for a similar classification, all metaplasia and inflammation samples are correctly classified as non-SILs (FIG. 18B). Of the two samples incorrectly classified, it was determined that one has focal dysplasia and the other is the same sample with focal HPV that was misclassified at 626 cm-1. Only one normal sample is misclassified as SIL. A sensitivity and specificity of 82% and 96% is achieved. This band has been attributed to glycogen and cellular lipids/phosphate by Wong et al., "Infrared Spectroscopy of Human Cervical Cells: Evidence of Extensive Structural Changes during Carcinogenesis," Proc Natl Acad Sci USA, 88, 10988-10992, 1991.
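The reported sensitivities and specificities can be checked against the sample counts given earlier in this section. The sketch below assumes that the 36 retained samples comprise 11 SILs (2 HPV and 9 dysplasia) and 25 non-SILs (19 normal, 2 metaplasia and 4 inflammation), and that the misclassification counts are those stated in the paragraph above; the grouping is inferred from the text and the function name is hypothetical.

```python
def sensitivity_specificity(n_sil, n_sil_missed, n_non_sil, n_non_sil_missed):
    """Return (sensitivity, specificity) as fractions from misclassification counts."""
    sensitivity = (n_sil - n_sil_missed) / n_sil
    specificity = (n_non_sil - n_non_sil_missed) / n_non_sil
    return sensitivity, specificity

# 626 cm-1: one SIL missed; the two metaplasia samples are called SIL.
sens_626, spec_626 = sensitivity_specificity(11, 1, 25, 2)
# 1070 cm-1: two SILs missed; one normal sample is called SIL.
sens_1070, spec_1070 = sensitivity_specificity(11, 2, 25, 1)

print(round(100 * sens_626), round(100 * spec_626))    # 91 92
print(round(100 * sens_1070), round(100 * spec_1070))  # 82 96
```

Under these assumed counts, the computed values agree with the 91%/92% and 82%/96% figures reported for the two bands.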
FIG. 19 is a graph showing the diagnostic capability of the band at 1656 cm-1. Decision line (1) separates SILs from all other tissues. Decision line (2) separates high grade from low grade SILs. The normalized peak intensity at 1656 cm-1 can differentiate SILs from other tissues using line (1) as the decision line with a sensitivity and specificity of 91% and 88%. The focal dysplasia sample incorrectly classified at 1070 cm-1 is again misclassified. The metaplastic samples are again classified as SILs. The advantage of using this peak is that it can also differentiate between high grade and low grade SILs. Using line (2) as a decision line, this peak can separate high and low grade SILs with a sensitivity of 86%. The metaplasia samples misclassified as SILs are separated from the high grade samples. Only one normal sample is misclassified. In the cervix, this peak has been associated with cellular proteins from the nuclei of the epithelial cells. These proteins have been suggested to be primarily collagen and elastin by Lui et al., "Fluorescence and Time-Resolved Light Scattering as Optical Diagnostic Techniques to Separate Diseased and
Normal Biomedical Media," J Photochem Photobiol B : Biol , 16, 187-209, 1992.
Of the other features observed in the Raman spectra of cervical tissues, the band at 818 cm-1 is associated with ring 'breathing' and is attributed to blood. The intensity of this band is greater in dysplasia samples relative to normal samples. The peak at 978 cm-1 is associated with phosphorylated proteins and nucleic acids. This band differentiates SILs from other tissues with a sensitivity and specificity of 82% and 80%, respectively. The band at 1175 cm-1 can separate normal from dysplasia samples with a sensitivity of 88%. The decrease in intensity of this band with dysplasia has also been reported by Wong et al., "Infrared Spectroscopy of Human Cervical Cells: Evidence of Extensive Structural Changes during Carcinogenesis", Proc Natl Acad Sci USA, 88, 10988-10992, 1991. This band around 1175 cm-1 has been associated with C-O stretching; in cervical cells, it consists of three overlapping lines at 1153, 1161 and 1172 cm-1. A similar trend is also observed in cervical tissue samples. These bands have been attributed to C-O stretching of cell proteins such as tyrosine and carbohydrates. The Raman line at 1246 cm-1 is assigned to the stretching vibrations of C-N (amide III). The line at 1325 cm-1 is due to ring vibrations and is associated with tryptophan, by Lui et al., "Fluorescence and Time-Resolved Light Scattering as Optical Diagnostic Techniques to Separate Diseased and Normal Biomedical Media", J Photochem Photobiol B: Biol, 16, 187-209, 1992, and with nucleic acids. An increase in the intensity of this peak in the SILs with respect to the other tissues is observed. This has been associated with increased cellular nuclear content in the colon. The lines at 1401 and 1454 cm-1 are due to symmetric and asymmetric CH3 bending modes of proteins (methyl group). The line at 1454 cm-1 differentiates high grade from low grade SILs with a 91% accuracy. These lines have been associated with elastin and collagen by Lui et al., "Fluorescence and Time-Resolved Light Scattering as Optical Diagnostic Techniques to Separate Diseased and Normal Biomedical Media", J Photochem Photobiol B: Biol, 16, 187-209, 1992.
When the diagnostic information from the tissue Raman spectra is analyzed in a paired manner, SILs may be differentiated from all other tissues at several peaks with an average sensitivity of 88% (±6%) and a specificity of 92% (±7%). The best sensitivity, 91%, is achieved with the bands at 626 and 1656 cm-1. The best specificity, 100%, is achieved using a combination of the bands at 1070 and 626 cm-1. In differentiating SILs from normals, the sensitivity and specificity of the Raman methods are greater than those of the fluorescence based methods for the 36 samples but are similar when compared to the fluorescence results from the larger sample study. Inflammation and metaplasia samples can be separated from the SILs using the Raman bands at 1070 cm-1 and at 1656 cm-1. Raman spectra are successful in differentiating high grade SILs from low grade SILs with an average sensitivity of 86% (±4%). The sensitivity is improved when compared to fluorescence based diagnosis of the same 36 samples as well as the larger sample population. The invention also accommodates patient to patient variability in the intensities of the Raman lines by use of paired analysis as presented above. In addition, unpaired differentiation may be done by using the peaks at 1325, 1454 and 1656 cm-1 with a comparable sensitivity.
For unpaired differentiation, the ratio of the intensities at 1656 and 1325 cm-1 differentiates SILs from all other tissues with a sensitivity and a specificity of 82% and 80%, respectively (FIG. 20). In addition, the ratio of the intensities at 1656 and 1454 cm-1 may be used in an unpaired manner to differentiate high grade SILs from low grade SILs with a sensitivity and specificity of 100% and 100% (FIG. 21).
Further, as mentioned above, each of these specified peaks of the Raman spectrum contains some diagnostic information for tissue differentiation. Multivariate techniques using principal component analysis and Bayes' theorem, similar to the conditioning of the fluorescence spectra described above, would use information from all of the peaks of the Raman spectrum, and would thus improve the diagnostic performance of the Raman signals. The methods using Raman signals presented here have been optimized for the 36 sample data set and are thus a biased estimate of their performance. A true estimate of the diagnostic capability of Raman spectroscopy would require an unbiased assessment of the performance of the method, which for the small number of samples could be obtained using cross validation techniques or other types of validation techniques.
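One way such a cross validation could be carried out is leave-one-out validation of a single-band cutoff. The following is only an illustrative sketch, not a procedure stated in this disclosure: the function names, the rule of choosing the cutoff that maximizes accuracy on the retained samples, and the convention that a value below the cutoff indicates SIL are all assumptions made for the example.

```python
def fit_threshold(values, labels):
    """Choose a cutoff from the training samples (hypothetical rule).

    Candidate cutoffs are midpoints between neighbouring sorted values.
    A sample is called SIL (True) when its value is below the cutoff.
    """
    xs = sorted(set(values))
    candidates = [(a + b) / 2.0 for a, b in zip(xs, xs[1:])] or xs

    def n_correct(cutoff):
        return sum((v < cutoff) == lab for v, lab in zip(values, labels))

    return max(candidates, key=n_correct)


def leave_one_out_accuracy(values, labels):
    """Hold out each sample in turn, refit the cutoff, test the held-out sample."""
    hits = 0
    for i in range(len(values)):
        train_values = values[:i] + values[i + 1:]
        train_labels = labels[:i] + labels[i + 1:]
        cutoff = fit_threshold(train_values, train_labels)
        hits += (values[i] < cutoff) == labels[i]
    return hits / len(values)
```

Because the cutoff is refit without the held-out sample each time, the resulting accuracy is not optimized on the sample being tested, which is the point of the unbiased assessment discussed above.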
The present invention exploits several potential advantages of Raman spectroscopy over fluorescence. The Raman diagnostic methods used in the invention reiterate the simplicity of Raman spectroscopy for diagnosis and
indicate the potential of improved diagnostic capability using this technique.
FIG. 4A, FIG. 4B, FIG. 4C and FIG. 4D are flowcharts of the above-described Raman spectroscopy diagnostic method. In practice, the flowcharts of FIG. 4A, FIG. 4B, FIG. 4C and FIG. 4D are coded into appropriate form and are loaded into the program memory of computer 211 (FIG. 2) which then controls the apparatus of FIG. 2 to cause the performance of the Raman spectroscopy diagnostic method of the present invention.
Referring first to FIG. 4A, after the method is started, the NIR Raman spectrum is acquired from the cervical tissue sample of unknown diagnosis in step 400. Then, in step 401, the acquired spectrum is corrected as a function of the rhodamine calibration process. Then, in block 402, the spectrum is convolved with a Gaussian G having a full width at half maximum of 11 wavenumbers, thus providing a noise-corrected spectrum R. In step 403, the broad band baseline of the noise-corrected spectrum is fit to a polynomial L, and the polynomial is subtracted from the spectrum to give the Raman signal for the sample under consideration.
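Blocks 402 and 403 can be sketched in Python with NumPy as follows. This is a minimal illustration under stated assumptions rather than the coded flowchart itself: the spectrum is assumed to be already calibrated (step 401) and uniformly sampled, the polynomial order (5) and the reflective edge padding are choices made for the example, and the function names are hypothetical.

```python
import numpy as np


def gaussian_kernel(fwhm_cm1, step_cm1):
    """Unit-sum Gaussian kernel G with the given FWHM, sampled every step_cm1."""
    sigma = fwhm_cm1 / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    half = int(np.ceil(4.0 * sigma / step_cm1))
    x = np.arange(-half, half + 1) * step_cm1
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()


def preprocess(shift_cm1, spectrum, fwhm_cm1=11.0, poly_order=5):
    """Smooth with G (block 402), then subtract a polynomial baseline L (step 403).

    poly_order is an assumed value; the text above does not fix it.
    """
    step = float(np.mean(np.diff(shift_cm1)))  # assumes uniform sampling
    kernel = gaussian_kernel(fwhm_cm1, step)
    pad = len(kernel) // 2
    # Reflect the edges so the smoothed spectrum R keeps the input length.
    padded = np.pad(spectrum, pad, mode="reflect")
    smoothed = np.convolve(padded, kernel, mode="valid")
    # Fit on a rescaled axis to keep the polynomial fit well conditioned.
    x = (shift_cm1 - shift_cm1.mean()) / shift_cm1.std()
    baseline = np.polyval(np.polyfit(x, smoothed, poly_order), x)
    return smoothed - baseline
```

A low-order polynomial follows the broad fluorescence background but cannot follow narrow Raman lines, so the residual after subtraction retains the Raman peaks.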
Control then passes to step 404 where the maximum intensities at 626, 818, 978, 1070, 1175, 1246, 1325, 1454 and 1656 wavenumbers (in units of cm-1) are noted. Also in block 404, maximum intensities at five selected wavenumbers are stored. These include:
P1 = intensity at 626 cm-1
P2 = intensity at 1070 cm-1
P3 = intensity at 1325 cm-1
P4 = intensity at 1454 cm-1
P5 = intensity at 1656 cm-1
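The storing of P1 through P5 in block 404 might look like the following sketch. The search window of ±10 cm-1 around each nominal band position is an assumption introduced for the example (the text only says that maximum intensities are noted), and the function names are hypothetical.

```python
import numpy as np

# Nominal band positions (cm-1) of the five stored intensities.
STORED_BANDS = {"P1": 626, "P2": 1070, "P3": 1325, "P4": 1454, "P5": 1656}


def band_maximum(shift_cm1, raman, center_cm1, half_window_cm1=10.0):
    """Maximum Raman intensity within an assumed window around a band centre."""
    mask = np.abs(shift_cm1 - center_cm1) <= half_window_cm1
    if not mask.any():
        raise ValueError(f"no data points near {center_cm1} cm-1")
    return float(raman[mask].max())


def stored_intensities(shift_cm1, raman, half_window_cm1=10.0):
    """Return a dict mapping P1..P5 to the band maxima of the Raman signal."""
    return {
        name: band_maximum(shift_cm1, raman, center, half_window_cm1)
        for name, center in STORED_BANDS.items()
    }
```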
Control then passes to block 405 where the stored intensities are analyzed in order to diagnose the tissue sample. This analysis is presented below in more detail with reference to FIG. 4B, FIG. 4C and FIG. 4D.
Referring to FIG. 4B, decision block 406 determines whether paired analysis is desired, and if so control passes to block 407 where the paired diagnostic method is conducted. This is presented below in more detail with reference to FIG. 4C.
Control then passes to decision block 408 where it is determined whether unpaired analysis is desired. If so, control passes to block 409 where the unpaired diagnostic method is conducted.
Referring to the paired diagnostic method, presented with reference to FIG. 4C, three parallel analyses may be conducted, one with respect to intensity P1, one with respect to intensity P2, and one with respect to intensity P5. For intensity P1, control begins in block 411 where quantity N1 is set equal to the intensity at the selected wavenumber for a normal tissue sample of the patient under consideration. Control then passes to block 412 where the ratio between measured intensity P1 and normal intensity N1 is calculated. In block 413, the ratio is compared with a threshold of 1. If the ratio is greater than or equal to 1, the diagnosis is non-SIL (step 414), whereas if the ratio is less than 1, the diagnosis is SIL (step 416).
A similar analysis is conducted with respect to intensities P2 and P5. Specifically, for intensity P2, control begins in block 417 where quantity N2 is set equal to the intensity at the selected wavenumber for a normal tissue sample of the patient under consideration. Control then passes to block 418 where the ratio between
measured intensity P2 and normal intensity N2 is calculated. In block 419, the ratio is compared with a threshold of 1. If the ratio is greater than or equal to 1, the diagnosis is non-SIL (step 421) , whereas if the ratio is less than 1, the diagnosis is SIL (step 422) .
For intensity P5, control begins in block 423 where quantity N5 is set equal to the intensity at the selected wavenumber for a normal tissue sample of the patient under consideration. Control then passes to block 424 where the ratio between measured intensity P5 and normal intensity N5 is calculated. In block 426, the ratio is compared with a threshold of 1. If the ratio is greater than or equal to 1, the diagnosis is non-SIL (step 427) , whereas if the ratio is less than 1, the diagnosis is SIL (step 428) .
If SIL is concluded in step 428, control passes to decision block 429 where the ratio calculated in block 424 is compared against a threshold of 0.75. If the ratio is greater than or equal to 0.75, then low grade SIL is diagnosed (step 431) , whereas if the ratio is less than 0.75, high grade SIL is diagnosed (step 432) .
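The paired decision rules of FIG. 4C reduce to two small functions, sketched below. The function names and the returned strings are illustrative assumptions; the thresholds of 1 and 0.75 are those given above. How the three parallel results are combined is not stated in the text, so the sketch leaves them separate.

```python
def paired_sil(measured, normal, threshold=1.0):
    """Blocks 412-416 (and the analogous blocks for P2 and P5).

    measured: intensity P at the selected band for the unknown sample.
    normal:   intensity N at the same band for the patient's normal tissue.
    """
    ratio = measured / normal
    return "non-SIL" if ratio >= threshold else "SIL"


def paired_grade(p5, n5, sil_threshold=1.0, grade_threshold=0.75):
    """Blocks 424-432: SIL decision at 1656 cm-1 followed by grading."""
    ratio = p5 / n5
    if ratio >= sil_threshold:
        return "non-SIL"
    return "low grade SIL" if ratio >= grade_threshold else "high grade SIL"
```

Dividing by the patient's own normal intensity is what makes the analysis paired: each patient serves as her own reference, which accommodates patient to patient variability in absolute line intensity.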
Unpaired analysis of the NIR Raman spectrum is presented in FIG. 4D. Beginning in step 432, ratio r1 is calculated between intensity P5 and intensity P3, and ratio r2 is calculated between intensity P5 and intensity P4. Control then passes to decision block 434 where ratio r1 is compared against a threshold of 1.8. If ratio r1 is greater than or equal to 1.8, the tissue sample is diagnosed as non-SIL (step 436), whereas if ratio r1 is less than 1.8, the tissue is diagnosed as SIL (step 437). Control then passes to decision block 438 where ratio r2 is compared against the threshold of 2.6. If ratio r2 is greater than or equal to 2.6, low grade SIL is diagnosed (step 439), whereas if ratio r2 is less than 2.6, high grade SIL is diagnosed (step 441).
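The unpaired rules of FIG. 4D can be sketched in the same way. The function name and returned strings are hypothetical, and the sketch assumes that the grading comparison in block 438 is applied only to samples already diagnosed as SIL, which the text implies but does not state outright.

```python
def unpaired_diagnosis(p3, p4, p5, sil_threshold=1.8, grade_threshold=2.6):
    """Unpaired analysis using intensities at 1325 (P3), 1454 (P4), 1656 (P5) cm-1."""
    r1 = p5 / p3  # ratio compared in decision block 434
    r2 = p5 / p4  # ratio compared in decision block 438
    if r1 >= sil_threshold:
        return "non-SIL"
    # Assumption: grading is only meaningful once SIL has been diagnosed.
    return "low grade SIL" if r2 >= grade_threshold else "high grade SIL"
```

Because both ratios are formed from intensities within the same spectrum, no normal reference sample from the patient is required.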
It should be noted that the various thresholds used for the decision blocks in FIG. 4C and FIG. 4D may be adjusted without departing from the scope of the invention. The thresholds presented were chosen as a function of the training data, and other or more complete training data may result in different thresholds.
Combined Fluorescence and Raman Spectroscopy Method
The present invention also contemplates a system that sequentially acquires fluorescence and NIR Raman spectra in vivo through an optical probe, such as a fiber optic probe or other optical coupling system. The optical probe is selectively coupled to ultraviolet or visible sources of electromagnetic radiation to excite fluorescence, and then selectively coupled to NIR sources to excite fluorescence free Raman spectra. The fluorescence spectra may be used to improve the analytical rejection of fluorescence from the Raman spectrum.
The apparatus used for this purpose is a combination of the apparatus disclosed in FIG. 1 and FIG. 2. A dichroic mirror or swing-away mirror is used so that each electromagnetic radiation source is selectively coupled sequentially into the optical probe. Similarly, light collected by the probe is selectively coupled to the appropriate detectors to sense the fluorescence spectra and Raman spectra.
In analyzing the spectra for diagnostic purposes, it is presently contemplated that the above-described ability of fluorescence to identify normal tissue, and low and high grade lesions, be followed by the above-
described use of NIR Raman spectra to identify inflammation and metaplasia. Alternatively, information gathered about the tissue type, in accordance with the above-described fluorescence diagnosis, is used to improve the Raman diagnostic capability. This is accomplished by using fluorescence spectra to calculate the posterior probability that a tissue is normal, low or high grade SIL. Then, this classification is used as the prior probability in a Bayesian method, based on the detected Raman spectra. In yet another embodiment, information gathered with NIR Raman spectroscopy is used to calculate the posterior probability that the tissue is inflamed or metaplastic. Then, this information is used as the prior probability in a Bayesian method, based on the detected fluorescence spectrum.
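The chaining of the two modalities described above amounts to applying Bayes' rule twice, with the posterior from one measurement serving as the prior for the next. The sketch below illustrates only that bookkeeping; the function name is hypothetical, and how the class-conditional likelihoods are obtained from the fluorescence or Raman spectra is outside its scope.

```python
def bayes_update(prior, likelihood):
    """Posterior over tissue classes, proportional to prior times likelihood.

    prior:      dict mapping class name to prior probability.
    likelihood: dict mapping class name to the probability (density) of the
                observed spectrum features given that class.
    """
    unnormalized = {c: prior[c] * likelihood[c] for c in prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("observation has zero likelihood under every class")
    return {c: v / total for c, v in unnormalized.items()}
```

For the first embodiment, the fluorescence posterior would be passed as the prior in a second call whose likelihoods come from the Raman spectrum, for example posterior = bayes_update(bayes_update(prior, fluorescence_likelihood), raman_likelihood), where both likelihood dictionaries are placeholders.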
While the present invention has been described with reference to several exemplary embodiments, it will be understood that modifications, additions and deletions may be made to these embodiments without departing from the spirit and scope of the present invention.
APPENDIX I: SPECIFICITY AND SENSITIVITY
Summarized from: Albert A., Harris E.K.: Multivariate Interpretation of Clinical Laboratory Data, Marcel Dekker Inc., New York, pp. 75-82 (1987), the disclosure of which is expressly incorporated herein by reference.
Assume a group of T samples, each of which can be categorized as normal (N samples) or diseased (D samples). A diagnostic test, designed to determine whether the sample is normal or diseased, is applied to each sample. The result of the test is the continuous variable x, which is then used to determine the sample type. FIG. 22 illustrates a hypothetical distribution of test values for each sample type. A diagnostic method based on this test can easily be defined by choosing a cutoff point, d, such that a sample with an observed value x<d is diagnosed as normal and a sample with an observed value x≥d is diagnosed as abnormal.
Several quantitative measures have been defined to 'evaluate' the performance of this type of method. The first type evaluates the test itself (i.e. measures the ability of the test to separate the two populations, N and D) . Sensitivity and specificity are two such measures. The second type is designed to aid in the interpretation of a particular test result (i.e. deciding whether the individual test measurement has come from a normal or diseased sample) . Positive and negative predictive value are two measures of this type.
To define these measures, some terminology and notation must be introduced. Referring to Table 6, a sample to be tested can be either normal or diseased; the result of the test for each type of sample can be either negative or positive. True negatives represent those normal samples which have a negative test result, and true positives represent those diseased samples which have a positive test result. In these cases, the diagnosis based on the test result is correct. False positives are those normal samples which have a positive test result and false negatives are those diseased samples which have a negative test result. In these cases, the diagnosis based on the test result is incorrect.
TABLE 6
                         Normal                  Diseased                Total Samples
Test Negative (x < d)    True Negatives (TN)     False Negatives (FN)    Negatives (Neg)
Test Positive (x ≥ d)    False Positives (FP)    True Positives (TP)     Positives (Pos)
Total Samples            N                       D                       T
With this terminology, Table 7 contains a definition of sensitivity and specificity, the two measures which assess the performance of the diagnostic method.
Specificity is the proportion of normal samples with a negative test result (proportion of normal samples diagnosed correctly). Sensitivity is the proportion of diseased samples with a positive test result (proportion of diseased samples diagnosed correctly). FIG. 22 also contains a graphical representation of specificity and sensitivity. Specificity represents the area under the normal sample distribution curve to the left of the cutoff point, while sensitivity represents the area under the diseased sample distribution curve to the right of the cutoff point.
TABLE 7
Test Measure    Meaning                                                       Calculation
Specificity     Proportion of normal samples with negative test result        Sp = TN/N
Sensitivity     Proportion of diseased samples with positive test result      Se = TP/D
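The definitions in Table 6 and Table 7 can be checked numerically with a short sketch. The function names are hypothetical and the sample values in the usage note are made up for illustration; they are not data from this disclosure.

```python
def confusion_counts(values, diseased, d):
    """Count TN, FP, FN, TP for cutoff d (test positive when x >= d).

    values:   observed test values x, one per sample.
    diseased: booleans, True if the sample is truly diseased.
    """
    tn = fp = fn = tp = 0
    for x, is_diseased in zip(values, diseased):
        positive = x >= d
        if is_diseased:
            if positive:
                tp += 1
            else:
                fn += 1
        else:
            if positive:
                fp += 1
            else:
                tn += 1
    return tn, fp, fn, tp


def specificity(tn, fp):
    """Sp = TN/N, where N = TN + FP is the number of normal samples."""
    return tn / (tn + fp)


def sensitivity(tp, fn):
    """Se = TP/D, where D = TP + FN is the number of diseased samples."""
    return tp / (tp + fn)
```

For example, with values [1, 2, 4, 3, 5, 6], of which the last three are diseased, and d = 3.5, the counts are TN = 2, FP = 1, FN = 1 and TP = 2, so that Sp = Se = 2/3.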
While sensitivity and specificity characterize the performance of a particular method, another set of statistics is required to interpret the laboratory test result for a given specimen. The positive and negative predictive values quantify the meaning of an individual test result (Table 8). The positive predictive value is the probability that, if the test result is positive, the sample is diseased. The negative predictive value is the probability that, if the test result is negative, the sample is normal. Positive and negative predictive values are calculated from Bayes' rule as outlined in Albert and Harris. Table 8 contains two equivalent formulas for calculating the positive and negative predictive values.
TABLE 8
Measure                      Meaning                                                                    Calculation 1    Calculation 2
Positive Predictive Value    The probability that, if the test is positive, the sample is diseased      PV+ = TP/Pos     PV+ = DSe/(DSe + N(1-Sp))
Negative Predictive Value    The probability that, if the test is negative, the sample is normal        PV- = TN/Neg     PV- = NSp/(NSp + D(1-Se))
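The equivalence of the two calculations in Table 8 follows from TP = DSe, FP = N(1-Sp), TN = NSp and FN = D(1-Se), together with Pos = TP + FP and Neg = TN + FN. A short sketch, with hypothetical function names and made-up counts, makes the check concrete.

```python
def predictive_values_from_counts(tn, fp, fn, tp):
    """Calculation 1: PV+ = TP/Pos and PV- = TN/Neg."""
    return tp / (tp + fp), tn / (tn + fn)


def predictive_values_from_rates(se, sp, n, d):
    """Calculation 2: the same quantities from Se, Sp and the group sizes N, D."""
    pv_pos = d * se / (d * se + n * (1.0 - sp))
    pv_neg = n * sp / (n * sp + d * (1.0 - se))
    return pv_pos, pv_neg
```

With illustrative counts TN = 80, FP = 20, FN = 5 and TP = 45 (so N = 100, D = 50, Se = 0.9, Sp = 0.8), both calculations give PV+ = 45/65 and PV- = 80/85. Unlike Se and Sp, the predictive values depend on the proportions of normal and diseased samples in the group.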
APPENDIX II: PRINCIPAL COMPONENTS
337 nm excitation 460 nm excitation 380 nm excitation 460 nm excitation
E1 E2 E1 E2 E2 E5 E4 E7
0.12 0.11 -0.147 -0.275 -0.615 0.532 0.69 0.10
0.17 0.12 -0.093 -0.319 -0.464 0.151 0.09 -0.07
0.22 0.12 -0.074 0.360 -0.378 -0.1 -0.14 -0.17
0.25 0.11 -0.056 -0.345 -0.317 -0.308 -0.23 -0.07
0.27 0.1 -0.027 -0.314 -0.236 -0.373 -0.24 0.06
0.28 0.11 -0.004 -0.253 -0.157 -0.348 -0.23 0.04
0.28 0.12 0.010 -0.193 -0.086 -0.236 -0.19 0.01
0.28 0.12 0.024 -0.121 -0.04 -0.161 -0.15 0.00
0.28 0.11 0.029 0.048 -0.004 -0.071 -0.09 -0.05
0.26 0.11 0.016 0.030 0.025 -0.055 -0.01 -0.07
0.24 0.11 -0.001 0.097 0.044 0.013 0.06 -0.07
0.22 0.11 -0.026 0.153 0.06 0.068 0.12 0.24
0.2 0.09 -0.052 0.201 0.06 0.108 0.14 0.40
0.17 0.08 -0.025 0.203 0.055 0.123 0.16 0.30
0.13 0.05 0.019 0.192 0.046 0.159 0.16 0.04
0.09 0.04 0.062 0.160 0.023 0.133 0.16 -0.12
0.06 0.04 0.090 0.153 0.006 0.15 0.14 -0.18
0.02 0.05 0.091 0.153 -0.014 0.089 0.14 -0.14
-0.01 0.05 0.088 0.164 -0.026 0.075 0.16 -0.24
-0.04 0.05 0.087 0.158 -0.044 0.047 0.17 -0.23
-0.06 0.05 0.106 0.146 -0.055 0.025 0.17 -0.16
-0.08 0.07 0.145 0.092 -0.063 -0.018 0.11 -0.12
-0.09 0.09 0.189 0.020 -0.071 -0.089 0.05 -0.18
-0.1 0.11 0.218 -0.023 -0.072 -0.102 0.01 -0.09
-0.11 0.13 0.240 -0.054 -0.078 -0.104 -0.02 -0.11
-0.11 0.15 0.249 -0.060 -0.071 -0.078 -0.04 0.04
-0.12 0.17 0.242 -0.073 -0.071 -0.091 -0.03 -0.06
-0.12 0.18 0.238 -0.075 -0.066 -0.087 -0.02 0.08
-0.12 0.2 0.240 -0.064 -0.062 -0.095 -0.03 0.15
-0.11 0.2 0.230 -0.063 -0.06 -0.08 -0.03 0.18
-0.1 0.21 0.221 -0.061 -0.057 -0.067 -0.03 0.19
-0.09 0.22 0.211 -0.060 -0.048 -0.086 -0.02 0.25
-0.08 0.22 0.204 -0.052 -0.039 -0.068 -0.01 0.26
-0.07 0.21 0.199 -0.045 -0.031 -0.039 0.00 0.17
-0.07 0.21 0.185 -0.044 -0.027 -0.034 0.01 0.10
-0.07 0.2 0.181 -0.045 -0.019 -0.028 0.01 0.03
-0.06 0.2 0.176 -0.042 -0.019 -0.032 0.00 -0.02
-0.06 0.19 0.170 -0.037 -0.015 -0.01 0.00 -0.01
-0.06 0.18 0.167 -0.035 -0.008 -0.039 0.01 -0.12
-0.05 0.17 0.159 -0.030 -0.008 -0.037 0.03 -0.13
-0.05 0.16 0.158 -0.032 -0.01 -0.068 0.01 -0.21
-0.05 0.15 0.151 -0.027 -0.009 -0.085 0.01 0.00
-0.05 0.14 0.146 -0.027 -0.005 -0.095 0.00 -0.03
-0.05 0.13 0.137 -0.019 -0.01 -0.069 0.01 0.03
-0.05 0.12 0.128 -0.015 -0.007 -0.084 0.01 0.03
-0.05 0.11 -0.012 -0.034
-0.05 0.1 -0.012 -0.036
-0.04 0.11
-0.04 0.09
-0.04 0.09
-0.03 0.09
-0.03 0.09
-0.03 0.08
-0.03 0.08
-0.03 0.08
-0.02 0.09
-0.02 0.12