AU2015224495A1 - Method for analyzing biological specimens by spectral imaging - Google Patents

Method for analyzing biological specimens by spectral imaging

Publication number
AU2015224495A1
Authority
AU
Australia
Prior art keywords
spectral
data
disease
image
diagnosis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Abandoned
Application number
AU2015224495A
Inventor
Benjamin Bird
Max Diem
Milos Miljkovic
Stanley H. Remiszewski
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeastern University Boston
Cireca Theranostics LLC
Original Assignee
NORTHEASTERN, University of
Cireca Theranostics LLC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AU2011270731A (AU2011270731A1)
Application filed by NORTHEASTERN, University of, and Cireca Theranostics LLC
Priority to AU2015224495A (AU2015224495A1)
Publication of AU2015224495A1
Priority to AU2017204736A (AU2017204736A1)
Current legal status: Abandoned

Landscapes

  • Investigating Or Analysing Materials By Optical Means (AREA)

Abstract

A method of providing a medical diagnosis comprises: obtaining spectroscopic data for a biological specimen, wherein the biological specimen is extracted from an individual; comparing the spectroscopic data for the biological specimen to spectral data in a repository that is associated with a disease or condition; determining whether any correlation between the spectral data and the spectroscopic data for the biological specimen exists; adding a label that corresponds to the disease or condition to the spectroscopic data when a correlation exists between the spectral data and the spectroscopic data; and outputting a diagnosis for the disease or condition associated with the spectral data when a correlation exists between the spectral data and the spectroscopic data.

Description

METHOD FOR ANALYZING BIOLOGICAL SPECIMENS BY SPECTRAL IMAGING

Related Applications

[0001] The present application is a divisional application from Australian Patent Application No. 2011270731, the entire disclosure of which is incorporated herein by reference. This application also claims the benefit of U.S. Provisional Patent Application No. 61/358,606 titled "DIGITAL STAINING OF HISTOPATHOLOGICAL SPECIMENS VIA SPECTRAL HISTOPATHOLOGY" filed on June 25, 2010, which is incorporated herein by reference in its entirety.

Field of the Invention

[0002] Aspects of the invention relate to a method for analyzing biological specimens by spectral imaging to provide a medical diagnosis. The biological specimens may include medical specimens obtained by surgical methods, biopsies, and cultured samples.

Background

[0003] Various pathological methods are used to analyze biological specimens for the detection of abnormal or cancerous cells. For example, standard histopathology involves visual analysis of stained tissue sections by a pathologist using a microscope. Typically, tissue sections are removed from a patient by biopsy, and the samples are either snap frozen and sectioned using a cryo-microtome, or they are formalin-fixed, paraffin embedded, and sectioned via a microtome. The tissue sections are then mounted onto a suitable substrate. Paraffin-embedded tissue sections are subsequently deparaffinized. The tissue sections are stained using, for example, a hematoxylin-eosin (H&E) stain and are coverslipped.
[0004] The tissue samples are then visually inspected at 10x to 40x magnification. The magnified cells are compared with visual databases in the pathologist's memory. Visual analysis of a stained tissue section by a pathologist involves scrutinizing features such as nuclear and cellular morphology, tissue architecture, staining patterns, and the infiltration of immune response cells to detect the presence of abnormal or cancerous cells.
[0005] If early metastases or small clusters of cancerous cells measuring from less than 0.2 to 2 mm in size, known as micrometastases, are suspected, adjacent tissue sections may be stained with an immuno-histochemical (IHC) agent/counter stain such as cytokeratin-specific stains. Such methods increase the sensitivity of histopathology since normal tissue, such as lymph node tissue, does not respond to these stains. Thus, the contrast between unaffected and diseased tissue can be enhanced.
[0006] The primary method for detecting micrometastases has been standard histopathology. Detecting micrometastases by standard histopathology is a formidable task owing to the small size and lack of distinguishing features of the abnormality within the tissue of a lymph node. Yet, the detection of these micrometastases is of prime importance to stage the spread of disease because if a lymph node is found to be free of metastatic cells, the spread of cancer may be contained. On the other hand, a false negative diagnosis resulting from a missed micrometastasis in a lymph node presents too optimistic a diagnosis, and a more aggressive treatment should have been recommended.
[0007] Although standard histopathology is well-established for diagnosing advanced diseases, it has numerous disadvantages. In particular, variations in the independent diagnoses of the same tissue section by different pathologists are common because the diagnosis and grading of disease by this method is based on a comparison of the specimen of interest with a database in the pathologist's memory, which is inherently subjective. Differences in diagnoses particularly arise when diagnosing rare cancers or in the very early stages of disease. In addition, standard histopathology is time consuming, costly and relies on the human eye for detection, which makes the results hard to reproduce. Further, operator fatigue and varied levels of expertise of the pathologist may impact a diagnosis.
[0008] In addition, if a tumor is poorly differentiated, many immunohistochemical stains may be required to help differentiate the cancer type. Such staining may be performed on multiple parallel cell blocks. This staining process may be prohibitively expensive and cellular samples may only provide a few diagnostic cells in a single cell block.
[0009] To overcome the variability in diagnoses by standard histopathology, spectroscopic methods have been used to capture a snapshot of the biochemical composition of cells and tissue. This makes it possible to detect variations in the biochemical composition of a biological specimen caused by a variety of conditions and diseases. By subjecting a tissue or cellular sample to spectroscopy, variations in the chemical composition in portions of the sample may be detected, which may indicate the presence of abnormal or cancerous cells. The application of infrared spectroscopy to cytopathology (the study of diseases of cells) is referred to as "spectral cytopathology" (SCP), and the application of infrared spectroscopy to histopathology (the study of diseases of tissue) as "spectral histopathology" (SHP).
[0010] SCP on individual urinary tract and cultured cells is discussed in B. Bird et al., (2006). SCP based on imaging data sets and applied to oral mucosa and cervical cells is discussed in WO 2009/146425. Demonstration of disease progression via SCP in oral mucosal cells is discussed in K. Papamarkakis et al., Laboratory Investigations, 90, 589 (2010). Demonstration of sensitivity to viral infection in cervical cells is discussed in K. Papamarkakis et al., Laboratory Investigations, 90, 589 (2010).
[0011] Demonstration of first unsupervised imaging of tissue using SHP of liver tissue via hierarchical cluster analysis (HCA) is discussed in M. Diem et al., Biopolymers, 57, 282 (2000). Detection of metastatic cancer in lymph nodes is discussed in M. J. Romeo et al., Vibrational Spectrosc., 38, 115 (2005) and M.
Romeo et al., Vibrational Microspectroscopy of Cells and Tissues, Wiley-Interscience, Hoboken, NJ (2008). Use discussed in P. Lasch et al., J. Chemometrics, 20, 209 (2007). Detection of micro-metastases and individual metastatic cancer cells in lymph nodes is discussed in B. Bird et al., The Analyst, 134, 1067 (2009), B. Bird et al., BMC J. Clin. Pathology, 8, 1 (2008), and B. Bird et al., Tech. Cancer Res. Treatment, 10, 135 (2011).
[0012] Spectroscopic methods are advantageous in that they alert a pathologist to slight changes in chemical composition in a biological sample, which may indicate an early stage of disease. In contrast, morphological changes in tissue evident from standard histopathology take longer to manifest, making early detection of disease more difficult. Additionally, spectroscopy allows a pathologist to review a larger sample of tissue or cellular material in a shorter amount of time than it would take the pathologist to visually inspect the same sample. Further, spectroscopy relies on instrument-based measurements that are objective, digitally recorded and stored, reproducible, and amenable to mathematical/statistical analysis. Thus, results derived from spectroscopic methods are more accurate and precise than those derived from standard histopathological methods.
[0013] Various techniques may be used to obtain spectral data. For example, Raman spectroscopy, which assesses the molecular vibrations of a system using a scattering effect, may be used to analyze a cellular or tissue sample. This method is described in N. Stone et al., Vibrational Spectroscopy for Medical Diagnosis, J. Wiley & Sons (2008), and C. Krafft, et al., Vibrational Spectrosc. (2011).
[0014] Raman's scattering effect is considered to be weak in that only about 1 in 10^10 incident photons undergoes Raman scattering. Accordingly, Raman spectroscopy works best using a tightly focused visible or near IR laser beam for excitation.
This, in turn, dictates the spot from which spectral information is being collected. This spot size may range from about 0.3 µm to 2 µm in size, depending on the numerical aperture of the microscope objective, and the wavelength of the laser utilized. This small spot size precludes data collection of large tissue sections, since a data set could contain millions of spectra and would require long data acquisition times. Thus, SHP using Raman spectroscopy requires the operator to select small areas of interest. This approach negates the advantages of spectral imaging, such as the unbiased analysis of large areas of tissue.
[0015] SHP using infrared spectroscopy has also been used to detect abnormalities in tissue, including, but not limited to brain, lung, oral mucosa, cervical mucosa, thyroid, colon, skin, breast, esophageal, prostate, and lymph nodes. Infrared spectroscopy, like Raman spectroscopy, is based on molecular vibrations, but is an absorption effect, and between 1% and 50% of incident infrared photons are likely to be absorbed if certain criteria are fulfilled. As a result, data can be acquired by infrared spectroscopy more rapidly with excellent spectral quality compared to Raman spectroscopy. In addition, infrared spectroscopy is extremely sensitive in detecting small compositional changes in tissue. Thus, SHP using infrared spectroscopy is particularly advantageous in the diagnosis, treatment and prognosis of cancers such as breast cancer, which frequently remains undetected until metastases have formed, because it can easily detect micro-metastases. It can also detect small clusters of metastatic cancer cells as small as a few individual cells. Further, the spatial resolution achievable using infrared spectroscopy is comparable to the size of a human cell, and commercial instruments spectra in a few minutes.
[0016] A method of SHP using infrared spectroscopy is described in Bird et al., "Spectral detection of micro-metastases in lymph node histo-pathology", J. Biophoton. 2, No. 1-2, 37-46 (2009) (hereinafter "Bird"). This method utilizes infrared micro-spectroscopy (IRMSP) and multivariate analysis to pinpoint micro-metastases and individual metastatic cells in lymph nodes.
[0017] Bird studied raw hyperspectral imaging data sets including 25,600 spectra, each containing 1650 spectral intensity points between 700 and 4000 cm⁻¹. These data sets, occupying about 400 MByte each, were imported and pre-processed. Data preprocessing included restriction of the wavenumber range to 900-1800 cm⁻¹ and other processes. The "fingerprint" infrared spectral region was further divided into a "protein region" between 1700 and 1450 cm⁻¹, which is dominated by the amide I and amide II vibrational bands of the peptide linkages of proteins. This region is highly sensitive to different protein secondary and tertiary structure and can be used to stage certain events in cell biology that depend on the abundance of different proteins. The lower wavenumber range, from 900 to 1350 cm⁻¹, the "phosphate region", contains several vibrations of the phosphodiester linkage found in phospholipids, as well as DNA and RNA.
[0018] In Bird, a minimum intensity criterion for the integrated amide I band was imposed to eliminate pixels with no tissue coverage. Then, vector normalization and conversion of the spectral vectors to second derivatives was performed. Subsequently, data sets were subjected individually to hierarchical cluster analysis (HCA). Pixel cluster membership was converted to pseudo-color spectral images.
[0019] According to Bird's method, marks are placed on slides with a stained tissue section that are to be subjected to spectral analysis.
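The pre-processing steps that paragraphs [0017] and [0018] attribute to Bird can be sketched in Python with NumPy and SciPy. This is a minimal illustration under stated assumptions, not Bird's implementation: the amide I integration limits (1600-1700 cm⁻¹), the intensity threshold, the Savitzky-Golay filter settings, and the function name `preprocess` are all hypothetical, and the four synthetic spectra merely stand in for measured data.

```python
import numpy as np
from scipy.signal import savgol_filter


def preprocess(spectra, wavenumbers, amide_i=(1600.0, 1700.0), min_amide_i=1.0):
    """Pre-process a (n_pixels, n_points) array of absorbance spectra.

    The integration limits, threshold and filter settings are illustrative
    assumptions; the source text does not state the values Bird used.
    """
    # 1. Restrict the wavenumber range to the 900-1800 cm^-1 region.
    in_range = (wavenumbers >= 900.0) & (wavenumbers <= 1800.0)
    wn = wavenumbers[in_range]
    x = spectra[:, in_range]

    # 2. Minimum-intensity criterion on the integrated amide I band:
    #    pixels below the threshold are treated as having no tissue coverage.
    step = abs(wn[1] - wn[0])
    band = (wn >= amide_i[0]) & (wn <= amide_i[1])
    integrated = x[:, band].sum(axis=1) * step
    keep = integrated >= min_amide_i
    x = x[keep]

    # 3. Vector normalization: scale each spectrum to unit Euclidean length.
    x = x / np.linalg.norm(x, axis=1, keepdims=True)

    # 4. Second derivative along the wavenumber axis
    #    (Savitzky-Golay differentiation is an assumed choice).
    d2 = savgol_filter(x, window_length=9, polyorder=3, deriv=2, delta=step, axis=1)
    return wn, d2, keep


# Synthetic demonstration: 1650 points between 700 and 4000 cm^-1,
# two "tissue" pixels with an amide I-like band and two near-empty pixels.
wavenumbers = np.linspace(700.0, 4000.0, 1650)
band_shape = np.exp(-0.5 * ((wavenumbers - 1650.0) / 20.0) ** 2)
spectra = np.vstack([
    0.8 * band_shape,
    0.5 * band_shape,
    np.zeros_like(band_shape),
    0.001 * band_shape,
])

wn, d2, keep = preprocess(spectra, wavenumbers)
print("pixels kept:", keep.tolist())
print("processed array shape:", d2.shape)
```

The rows of `d2` are the kind of feature vectors that could then be passed to a clustering routine; the clustering settings are not given in the text above, so none are assumed here.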
The resulting spectral and visual images are matched by a user who aligns specific features on the spectral image and the visual image to physically overlay the spectral and visual images.
[0020] By Bird's method, corresponding sections of the spectral image and the visual image are examined to determine any correlation between the visual observations and the spectral data. In particular, abnormal or cancerous cells observed by a pathologist in the stained visual image may also be observed when examining a corresponding portion of the spectral image that overlays the stained visual image. Thus, the outlines of the patterns in the pseudo-color spectral image may correspond to known abnormal or cancerous cells in the stained visual image. Potentially abnormal or cancerous cells that were observed by a pathologist in a stained visual image may be used to verify the accuracy of the pseudo-color spectral image.
[0021] Bird's method, however, is inexact because it relies on the skill of the user to visually match specific marks on the spectral and visual images. This method is often imprecise. In addition, Bird's method allows the visual and spectral images to be matched by physically overlaying them, but does not join the data from the two images to each other. Since the images are merely physically overlaid, the superimposed images are not stored together for future analysis.
[0022] Further, different adjacent sections of tissue are subjected to spectral and visual imaging, which makes it difficult to match the spectral and visual images, since there may be differences in the morphology of the visual image and the color patterns in the spectral image.
[0023] Another problem with Bird's overlaying method is that the visual image is not in the same spatial domain as the infrared spectral image. Thus, the spatial resolution of Bird's visual image and spectral image are different. Typically, spatial resolution in the infrared image is less than the resolution of the visual image.
To account for this difference in resolution, the data used in the infrared domain may be expanded by selecting a region around the visual point of interest and diagnosing the region, and not a single point. For every point in the visual image, there is a region in the infrared image that is greater than the point that must be input to achieve diagnostic output. This process of accounting for the resolution differences is not performed by Bird. Instead, Bird assumes that when selecting a point in the visual image, it is the same point of information in the spectral image through the overlay, and accordingly a diagnostic match is reported. While the images may visually be the same, they are not the same diagnostically.
[0024] To claim a diagnostic match, the spectral image used must be output from a supervised diagnostic algorithm that is trained to recognize the diagnostic signature of interest. Thus, the spectral image cluster will be limited by the algorithm classification scheme to be driven by a biochemical classification to create a diagnostic match, and not a user-selectable match. By contrast, Bird merely used an "unsupervised" HCA image to compare to a "supervised" stained visual image to make a diagnosis. The HCA image diagnostic, based on rules and limits assigned for clustering, including manually cutting the dendrogram until a boundary (geometric) match is visually accepted by the pathologist to outline a cancer region. This method merely provides a visual comparison.
[0025] Other methods based on the analysis of fluorescence data exist that are generally based on the distribution of an external tag, such as a stain or label, or utilize changes in the inherent fluorescence, also known as auto-fluorescence. These methods are generally less diagnostic, in terms of recognizing biochemical composition and changes in composition.
In addition, these methods lack the fingerprint sensitivity of techniques of vibrational spectroscopy, such as Raman and infrared.
[0026] A general problem with spectral acquisition techniques is that an enormous amount of spectral data is collected when testing a biological sample. As a result, the process of analyzing the data becomes computationally complicated and time consuming. Spectral data often contains confounding spectral features that are frequently observed in microscopically acquired infrared spectra of cells and tissue, such as scattering and baseline artifacts. Thus, it is helpful to subject the spectral data to pre-processing to isolate the cellular material of interest, and to remove confounding spectral features.
[0027] One type of confounding spectral feature is Mie scattering, which is a sample morphology-dependent effect. This effect interferes with infrared absorption or reflection measurements if the sample is non-uniform and includes particles the size of approximately the wavelength of the light interrogating the sample. Mie scattering is features are superimposed.
[0028] Mie scattering may also mediate the mixing of absorptive and reflective line shapes. In principle, pure absorptive line shapes are those corresponding to the frequency-dependence of the absorptivity, and are usually Gaussian, Lorentzian or mixtures of both. The absorption curves correspond to the imaginary part of the complex refractive index. Reflective contributions correspond to the real part of the complex refractive index, and are dispersive in line shapes. The dispersive contributions may be obtained from absorptive line shapes by numeric Kramers-Kronig (KK) transform, or as the real part of the complex Fourier transform (FT).
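The relationship between absorptive and dispersive line shapes described in paragraph [0028] can be illustrated numerically. The sketch below is a hedged illustration rather than the procedure used in any of the cited works: it assumes a single Lorentzian band, an arbitrary band center of 1650 cm⁻¹ and half-width of 10 cm⁻¹, and it uses the discrete Hilbert transform from `scipy.signal.hilbert` as a stand-in for the numeric KK transform. Sign and axis conventions differ between implementations, so the sign of the dispersive curve should be treated as convention-dependent.

```python
import numpy as np
from scipy.signal import hilbert

# Wavenumber axis (cm^-1) centered on an assumed band position.
center, half_width = 1650.0, 10.0
wavenumbers = np.linspace(center - 400.0, center + 400.0, 1601)
offset = wavenumbers - center

# Pure absorptive Lorentzian line shape with unit peak height.
absorptive = half_width**2 / (offset**2 + half_width**2)

# scipy.signal.hilbert returns the analytic signal; its imaginary part is
# the Hilbert transform of the input, used here as the numeric KK transform.
dispersive = np.imag(hilbert(absorptive))

# Closed-form Hilbert transform of the same Lorentzian, for comparison.
expected = half_width * offset / (offset**2 + half_width**2)
near_band = np.abs(offset) < 100.0
max_error = np.max(np.abs(dispersive - expected)[near_band])
print(f"max deviation within 100 cm^-1 of the band center: {max_error:.4f}")
```

The dispersive curve crosses zero at the band center and is antisymmetric about it, which is why mixing it into an absorption band shifts the apparent peak position and distorts the band profile.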
[0029] Resonance Mie (RMie) features result from the mixing of absorptive and reflective band shapes, which occurs because the refractive index undergoes anomalous dispersion when the absorptivity goes through a maximum (i.e., over the profile of an absorption band). Mie scattering, or any other optical effect that depends on the refractive index, will mix the reflective and absorptive line shapes, causing a distortion of the band profile, and an apparent frequency shift.
[0030] Figure 1 illustrates the contamination of absorption patterns by dispersive band shapes observed in both SCP and SHP. The bottom trace in Figure 1 depicts a regular absorption spectrum of biological tissue, whereas the top trace shows a spectrum strongly contaminated by a dispersive component via the RMie effect. The spectral distortions appear independent of the chemical composition, but rather depend on the morphology of the sample. The resulting band intensity and frequency shifts aggravate spectral analysis to the point that uncontaminated and contaminated spectra are background features are shown in Figure 2. When superimposed on the infrared micro-spectroscopy (IR-MSP) patterns of cells, these features are attributed to Mie scattering by spherical particles, such as cellular nuclei or spherical cells.
[0031] The appearance of dispersive line shapes in Figure 1 superimposed on IR-MSP spectra was reported along with a theoretical analysis in M. Romeo, et al., Vibrational Spectroscopy, 38, 129 (2005) (hereinafter "Romeo 2005"). Romeo 2005 identifies the distorted band shapes as arising from the superposition of dispersive (reflective) components onto the absorption features of an infrared spectrum. These effects were attributed to incorrect phase correction of the instrument control software. In particular, the acquired raw interferogram in FTIR spectroscopy frequently is "chirped" or asymmetric, and needs to be symmetrized before FT.
This is accomplished by collecting a double sided interferogram over a shorter interferometer stroke, and calculating a phase correction to yield a symmetric interferogram.
[0032] In Romeo 2005, it was assumed that this procedure was not functioning properly, which causes it to yield distorted spectral features. An attempt was made to correct the distorted spectral features by calculating the phase between the real and imaginary parts of the distorted spectra, and reconstructing a power spectrum from the phase corrected real and imaginary parts. Romeo 2005 also reported the fact that in each absorption band of an observed infrared spectrum, the refractive index undergoes anomalous dispersion. Under certain circumstances, various amounts of the dispersive line shapes can be superimposed, or mixed in, with the absorptive spectra.
[0033] The mathematical relationship between absorptive and reflective band shapes is given by the Kramers-Kronig (KK) transformation, which relates the two physical phenomena. The mixing of dispersive (reflective) and absorptive effects in the called "Phase Correction" (PC) is discussed in Romeo 2005. Although the cause of the mixing of dispersive and absorptive contributions was erroneously attributed to instrument software malfunction, the principle of the confounding effect was properly identified. Due to the incomplete understanding of the underlying physics, however, the proposed correction method did not work properly.
[0034] P. Bassan et al., Analyst, 134, 1586 (2009) and P. Bassan et al., Analyst, 134, 1171 (2009) demonstrated that dispersive and absorptive effects may mix via the Resonance Mie Scattering (RMieS) effect. An algorithm and method to correct spectral distortion is described in P. Bassan et al., "Resonant Mie Scattering (RMieS) correction of infrared spectra from highly scattering biological samples", Analyst, 135, 268-277 (2010).
This method is an extension of the "Extended Multiplicative Signal Correction" (EMSC) method reported in A. Kohler et al., Appl. Spectrosc., 59, 707 (2005) and A. Kohler et al., Appl. Spectrosc., 62, 259 (2008).
[0035] This method removes the non-resonant Mie scattering from infrared spectral datasets by including reflective components obtained via KK-transform of pure absorption spectra into a multiple linear regression model. The method utilizes the raw dataset and a "reference" spectrum as inputs, where the reference spectrum is used both to calculate the reflective contribution, and as a normalization feature in the EMSC scaling. Since the reference spectrum is not known a priori, Bassan et al. use the mean spectrum of the entire dataset, or an "artificial" spectrum, such as the spectrum of a pure protein matrix, as a "seed" reference spectrum. After the first pass through the algorithm, each corrected spectrum may be used in an iterative approach to correct all spectra in the subsequent pass. Thus, a dataset of 1000 spectra will produce 1000 RMieS-EMSC corrected spectra, each of which will be used as an independent new reference spectrum for the next pass, requiring 1,000,000 correction runs. To carry out this algorithm, referred to as the "RMieS-EMSC" algorithm, to a stable level of corrected output spectra required a number of passes (~10), and computation times that are measured in days.
[0036] Since the RMieS-EMSC algorithm requires hours or days of computation time, a fast, two-step method to perform the elimination of scattering and dispersive line shapes from spectra was developed, as discussed in B. Bird, M. Miljkovic and M. Diem, "Two step resonant Mie scattering correction of infrared micro-spectral data: human lymph node tissue", J. Biophotonics, 3 (8-9) 597-608 (2010).
This approach includes fitting multiple dispersive components, obtained from KK-transform of pure absorption spectra, as well as Mie scattering curves computed via the van Hulst equations (see H. C. van de Hulst, Light Scattering by Small Particles, Dover, Mineola, NY, (1981)), to all the spectra in a dataset via a procedure known as Extended Multiplicative Signal Correction (EMSC) (see A. Kohler et al., Appl. Spectrosc., 62, 259 (2008)) and reconstructing all spectra without these confounding components.
[0037] This algorithm avoids the iterative approach used in the RMieS-EMSC algorithm by using uncontaminated reference spectra from the dataset. These uncontaminated reference spectra were found by carrying out a preliminary cluster analysis of the dataset and selecting the spectra with the highest amide I frequencies in each cluster as the "uncontaminated" spectra. The spectra were converted to pure reflective spectra via numeric KK transform and used as interference spectra, along with compressed Mie curves for RMieS correction as described above. This approach is fast, but only works well for datasets containing a few spectral classes.
[0038] In the case of spectral datasets containing many tissue types, however, the extraction of uncontaminated spectra can become tedious. Furthermore, under these conditions, it is unclear whether fitting all spectra in the dataset to the most appropriate interference spectrum is guaranteed. In addition, this algorithm requires reference spectra for correction, and works best with large datasets.
[0039] In light of the above, there remains a need for improved methods of analyzing biological specimens by spectral imaging to provide a medical diagnosis.
Further, there is a need for an improved pre-processing method that is based on a revised phase correction approach, does not require input data, is computationally fast, and takes into account many types of confounding spectral contributions that are frequently observed in microscopically acquired infrared spectra of cells and tissue.
[0040] The discussion of the background to the invention included herein including reference to documents, acts, materials, devices, articles and the like is included to explain the context of the present invention. This is not to be taken as an admission or a suggestion that any of the material referred to was published, known or part of the common general knowledge in Australia or in any other country as at the priority date of any of the claims.

Summary

[0041] According to a first aspect of the invention there is provided a method of providing a medical diagnosis, comprising: obtaining spectroscopic data for a biological specimen, wherein the biological specimen is extracted from an individual; comparing the spectroscopic data for the biological specimen to spectral data in a repository that is associated with a disease or condition; determining whether any correlation between the spectral data and the spectroscopic data for the biological specimen exists; adding a label that corresponds to the disease or condition to the spectroscopic data when a correlation exists between the spectral data and the spectroscopic data; and outputting a diagnosis for the disease or condition associated with the spectral data when a correlation exists between the spectral data and the spectroscopic data.
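The steps recited in paragraph [0041] (compare against a repository, test for a correlation, label, output a diagnosis) can be sketched as follows. This is only one possible reading: the text above does not say how a "correlation" is computed or when one "exists", so the Pearson coefficient, the 0.95 threshold, the function name `diagnose`, the repository layout, and the labels "condition A" and "condition B" are all hypothetical choices made for illustration.

```python
import numpy as np


def diagnose(spectrum, repository, threshold=0.95):
    """Return (label, r) for the best-matching repository entry.

    `repository` maps a disease/condition label to a reference spectrum.
    The label is None when no entry reaches the (assumed) threshold.
    """
    best_label, best_r = None, -1.0
    for label, reference in repository.items():
        # Pearson correlation between the specimen and the reference spectrum.
        r = float(np.corrcoef(spectrum, reference)[0, 1])
        if r > best_r:
            best_label, best_r = label, r
    if best_r >= threshold:
        return best_label, best_r
    return None, best_r


def gaussian(x, position, width):
    return np.exp(-0.5 * ((x - position) / width) ** 2)


# Synthetic repository with two hypothetical conditions.
wn = np.linspace(900.0, 1800.0, 451)
repository = {
    "condition A": gaussian(wn, 1650.0, 20.0) + 0.5 * gaussian(wn, 1550.0, 20.0),
    "condition B": gaussian(wn, 1080.0, 25.0) + 0.6 * gaussian(wn, 1240.0, 25.0),
}

# A specimen resembling condition A plus a small deterministic perturbation,
# and an unrelated linear ramp that should match nothing.
specimen = repository["condition A"] + 0.02 * np.sin(wn / 15.0)
unrelated = np.linspace(0.0, 1.0, wn.size)

label, r = diagnose(specimen, repository)
no_label, r_unrelated = diagnose(unrelated, repository)
print("specimen:", label, round(r, 3))
print("unrelated:", no_label, round(r_unrelated, 3))
```

A supervised classifier trained on annotated repository data could replace the simple threshold on a correlation coefficient; the sketch only shows the flow of data from specimen to label.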
[0042] According to a second aspect of the invention there is provided a system for providing a medical diagnosis, the system comprising: a processor; a user interface functioning via the processor; and a repository accessible by the processor; wherein spectroscopic data of a biological specimen is obtained, wherein the biological specimen is extracted from an individual; wherein the spectroscopic data for the biological specimen is compared to spectral data in a repository that is associated with a disease or condition; wherein whether any correlation between the spectral data and the spectroscopic data for the biological specimen exists is determined; wherein a label that corresponds to the disease or condition is added to the spectroscopic data when a correlation exists between the spectral data and the spectroscopic data; and wherein a diagnosis for the disease or condition associated with the spectral data is output when a correlation exists between the spectral data and the spectroscopic data.
[0043] According to a third aspect of the invention there is provided a computer program product comprising a computer usable medium having control logic stored therein for causing a computer to provide a medical diagnosis, the control logic comprising: first computer readable program code means for obtaining spectroscopic data for a biological specimen, wherein the biological specimen is extracted from an individual; second computer readable program code means for comparing the spectroscopic data for the biological specimen to spectral data in a repository that is associated with a disease or condition; third computer readable program code means for determining whether any correlation between the spectral data and the spectroscopic data for the biological specimen exists; fourth computer readable program code means for adding a label that corresponds to the disease or condition to the spectroscopic data when a correlation exists between the spectral data and the spectroscopic data; and fifth computer readable program code means for outputting a diagnosis with the disease or condition associated with the spectral data when a correlation exists between the spectral data and the spectroscopic data.
[0044] Also described herein is a method for analyzing biological specimens by spectral imaging to provide a medical diagnosis. The method includes obtaining spectral and visual images of biological specimens and registering the images to detect cell abnormalities, pre-cancerous cells, and cancerous cells. This method overcomes the obstacles discussed above, among others, in that it eliminates the bias and unreliability of diagnoses that are inherent in standard histopathological and other spectral methods.
[0045] Also described is a method for correcting confounding spectral contributions that are frequently observed in microscopically acquired infrared spectra of cells and tissue by performing a phase correction on the spectral data.
This phase correction method may be used to correct various kinds of absorption spectra that are contaminated by reflective components.
[0046] Also described is a method for analyzing biological specimens by spectral imaging that includes acquiring a spectral image of the biological specimen, acquiring a visual image of the biological specimen, and registering the visual image and spectral image.
[0047] Also described is a method of developing a data repository which includes identifying a region of a visual image displaying a disease or condition, associating the region of the visual image to spectral data corresponding to the region, and storing the association between the spectral data and the corresponding disease or condition.
[0047a] Where the terms "comprise", "comprises", "comprised" or "comprising" are used in this specification (including the claims) they are to be interpreted as specifying the presence of the stated features, integers, steps or components, but not precluding the presence of one or more other features, integers, steps or components, or group thereof.
Description of the Drawings
[0048] Figure 1 illustrates the contamination of absorption patterns by dispersive band shapes typically observed in both SCP and SHP.
[0049] Figure 2 shows broad, undulating background features typically observed on IR-MSP spectra of cells attributed to Mie scattering by spherical particles.
[0050] Figure 3 is a flowchart illustrating a method of analyzing a biological sample by spectral imaging according to aspects of the invention.
[0051] Figure 4 is a flowchart illustrating steps in a method of acquiring a spectral image according to aspects of the invention.
[0052] Figure 5 is a flowchart illustrating steps in a method of pre-processing spectral data according to aspects of the invention.
[0053] Figure 6A shows a typical spectrum, superimposed on a linear background according to aspects of the invention.
[0054] Figure 6B shows an example of a second derivative spectrum according to aspects of the invention.
[0055] Figure 7 shows a portion of the real part of an interferogram according to aspects of the invention.
[0056] Figure 8 shows that the phase angle that produces the largest intensity after phase correction is assumed to be the uncorrupted spectrum according to aspects of the invention.
[0057] Figure 9A shows absorption spectra that are contaminated by scattering effects that mimic a baseline slope according to aspects of the invention.
[0058] Figure 9B shows that the imaginary part of the forward FT exhibits strongly curved effects at the spectral boundaries, which will contaminate the resulting corrected spectra according to aspects of the invention.
[0059] Figure 10A is H&E-based histopathology showing a lymph node that has confirmed breast cancer micro-metastases under the capsule according to aspects of the invention.
[0060] Figure 10B shows data segmentation by Hierarchical Cluster Analysis (HCA) carried out on the lymph node section of Figure 10A according to aspects of the invention.
[0061] Figure 10C is a plot showing the peak frequencies of the amide I vibrational band in each spectrum according to aspects of the invention.
[0062] Figure 10D shows an image of the same tissue section after phase-correction using RMieS correction according to aspects of the invention.
[0063] Figure 11A shows the results of HCA after phase-correction using RMieS correction of Figure 10D according to aspects of the invention.
[0064] Figure 11B is H&E-based histopathology of the lymph node section of Figure 11A according to aspects of the invention.
[0065] Figure 12A is a visual microscopic image of a section of stained cervical image.
[0066] Figure 12B is an infrared spectral image created from hierarchical cluster analysis of an infrared dataset collected prior to staining the tissue according to aspects of the invention.
[0067] Figure 13A is a visual microscopic image of a section of an H&E-stained axillary lymph node section according to aspects of the invention.
[0068] Figure 13B is an infrared spectral image created from artificial neural network (ANN) analysis of an infrared dataset collected prior to staining the tissue according to aspects of the invention.
[0069] Figure 14A is a visual image of a small cell lung cancer tissue according to aspects of the invention.
[0070] Figure 14B is an HCA-based spectral image of the tissue shown in Figure 14A according to aspects of the invention.
[0071] Figure 14C is a registered image of the visual image of Figure 14A and the spectral image of Figure 14B, according to aspects of the invention.
[0072] Figure 14D is an example of a graphical user interface (GUI) for the registered image of Figure 14C according to aspects of the invention.
[0073] Figure 15A [...] section according to aspects of the invention.
[0074] Figure 15B is a global digital staining image of the section shown in Figure 15A, distinguishing capsule and interior of lymph node according to aspects of the invention.
[0075] Figure 15C is a diagnostic digital staining image of the section shown in Figure 15A, distinguishing capsule, metastatic breast cancer, histiocytes, activated B-lymphocytes, and T-lymphocytes according to aspects of the invention.
[0076] Figure 16 is a schematic of the relationship between global and diagnostic digital staining according to aspects of the invention.
[0077] Figure 17A is a visual image of H&E-stained tissue section from an axillary lymph node according to aspects of the invention.
[0078] Figure 17B is a SHP-based digitally stained region of breast cancer micrometastasis according to aspects of the invention.
[0079] Figure 17C is a SHP-based digitally stained region occupied by B-lymphocytes according to aspects of the invention.
[0080] Figure 17D is a SHP-based digitally stained region occupied by histiocytes according to aspects of the invention.
[0081] Figure 18 illustrates the detection of individual cancer cells, and small clusters of cancer cells via SHP according to aspects of the invention.
[0082] Figure 19A [...] from lung adenocarcinoma, small cell carcinoma, and squamous cell carcinoma cells according to aspects of the invention.
[0083] Figure 19B [...] recorded from lung adenocarcinoma, small cell carcinoma, and squamous cell carcinoma cells according to aspects of the invention.
[0084] Figure 19C shows standard spectra for lung adenocarcinoma, small cell carcinoma, and squamous cell carcinoma according to aspects of the invention.
[0085] Figure 19D shows KK transformed spectra calculated from spectra in Figure 19C.
[0086] Figure 19E shows PCA scores plots of the multi class data set before EMSC correction according to aspects of the invention.
[0087] Figure 19F shows PCA scores plots of the multi class data set after EMSC correction according to aspects of the invention.
[0088] Figure 20A shows mean absorbance spectra of lung adenocarcinoma, small cell carcinoma, and squamous carcinoma, according to aspects of the invention.
[0089] Figure 20B shows second derivative spectra of absorbance spectra displayed in Figure 20A according to aspects of the invention.
[0090] Figure 21A shows 4 stitched microscopic H&E-stained images of 1 mm x 1 mm tissue areas comprising adenocarcinoma, small cell carcinoma, and squamous cell carcinoma cells, respectively, according to aspects of the invention.
[0091] Figure 21B is a binary mask image constructed by performance of a rapid reduced HCA analysis upon the 1350 cm⁻¹ - 900 cm⁻¹ spectral region of the 4 stitched raw infrared images recorded from the tissue areas shown in Figure 21A according to aspects of the invention.
[0092] Figure 21C [...] recorded from regions of diagnostic cellular material according to aspects of the invention.
[0093] Figure 22 shows various features of a computer system for use in conjunction with aspects of the invention.
[0094] Figure 23 shows a computer system for use in conjunction with aspects of the invention.
Detailed Description
[0095] Unless otherwise defined, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which aspects of this invention belong. Although methods and materials similar or equivalent to those described herein can be used in the practice or testing, suitable methods and materials are described below. All publications, patent applications, patents, and other references mentioned herein are incorporated by reference in their entirety. In case of conflict, this specification, including definitions, will control. In addition, the materials, methods, and examples are illustrative only and not intended to be limiting.
[0096] One aspect of the invention relates to a method for analyzing biological specimens by spectral imaging to provide a medical diagnosis. The biological specimens may be medical specimens obtained by surgical methods, biopsies, and cultured samples. The method includes obtaining spectral and visual images of biological specimens and registering the images to detect cell abnormalities, pre-cancerous cells, and cancerous cells. The biological specimens may include tissue or cellular samples, but tissue samples are preferred for some applications. This method [...] breast, uterine, renal, testicular, ovarian, or prostate cancer, small cell lung carcinoma, non-small cell lung carcinoma, and melanoma, as well as non-cancerous effects including, but not limited to, inflammation, necrosis, and apoptosis.
[0097] One method in accordance with aspects of the invention overcomes the obstacles discussed above in that it eliminates or generally reduces the bias and unreliability of diagnoses that are inherent in standard histopathological and other spectral methods.
In addition, it allows access to a spectral database of tissue types that is produced by quantitative and reproducible measurements and is analyzed by an algorithm that is calibrated against classical histopathology. Via this method, for example, abnormal and cancerous cells may be detected earlier than they can be identified by the related art, including standard histopathological or other spectral techniques.
[0098] A method in accordance with aspects of the invention is illustrated in the flowchart of Figure 3. As shown in Figure 3, the method generally includes the steps of acquiring a biological section 301, acquiring a spectral image of the biological section 302, acquiring a visual image of the same biological section 303, and performing image registration 304. The registered image may optionally be subjected to training 305, and a medical diagnosis may be obtained 306.
[0099] Biological Section
[00100] According to the example method of the invention shown in Figure 3, the step of acquiring a biological section 301 refers to the extraction of tissue or cellular material from an individual, such as a human or animal. A tissue section may be obtained by [...] material may be obtained by methods including, but not limited to, swabbing (exfoliation), washing (lavages), and by fine needle aspiration (FNA).
[00101] A tissue section that is to be subjected to spectral and visual image acquisition may be prepared from frozen or from paraffin embedded tissue blocks according to methods used in standard histopathology. The section may be mounted on a slide that may be used both for spectral data acquisition and visual pathology. For example, the tissue may be mounted either on infrared transparent microscope slides comprising a material including, but not limited to, calcium fluoride (CaF₂) or on infrared reflective slides, such as commercially available "low-e" slides. After mounting, paraffin embedded samples may be subjected to deparaffinization.
[00102] Spectral Image
[00103] According to aspects of the invention, the step of acquiring a spectral image of the biological section 302 shown in Figure 3 may include the steps of acquiring spectral data from the biological section 401, performing data pre-processing 402, performing multivariate analysis 403, and creating a grayscale or pseudo-color image of the biological section 404, as outlined in the flowchart of Figure 4.
[00104] Spectral Data
[00105] As set forth in Figure 4, spectral data from the biological section may be acquired in step 401. Spectral data from an unstained biological sample, such as a tissue sample, may be obtained to capture a snapshot of the chemical composition of the sample. The spectral data may be collected from a tissue section in pixel detail, wherein each pixel is about the size of a cellular nucleus. Each pixel has its own spectral pattern, and when the spectral patterns from a sample are compared, they may show small but recurring differences in the tissue's biochemical composition.
[00106] The spectral data may be collected by methods including, but not limited to, infrared, Raman, visible, terahertz, and fluorescence spectroscopy. Infrared spectroscopy may include, but is not limited to, attenuated total reflectance (ATR) and attenuated total reflectance Fourier transform infrared spectroscopy (ATR-FTIR). In general, infrared spectroscopy may be used because of its fingerprint sensitivity, which is also exhibited by Raman spectroscopy. Infrared spectroscopy may be used with larger tissue sections and to provide a dataset with a more manageable size than Raman spectroscopy. Furthermore, infrared spectroscopy data may be more amenable to fully automatic data acquisition and interpretation. Additionally, infrared spectroscopy may have the necessary sensitivity and specificity for the detection of various tissue structures and diagnosis of disease.
[00107] The intensity axis of the spectral data, in general, expresses absorbance, reflectance, emittance, scattering intensity or any other suitable measure of light power. The wavelength axis may relate to the actual wavelength, wavenumber, frequency or energy of electromagnetic radiation.
[00108] Infrared data acquisition may be carried out using presently available Fourier transform (FT) infrared imaging microspectrometers, tunable laser-based imaging instruments, such as quantum cascade or non-linear optical devices, or other functionally equivalent instruments based on different technologies. The acquisition of spectral data using a tunable laser is described further in U.S. Patent Application Serial No. 1vnM 8477 titled "Tunable Laser-Based Infrared Imaging System and Method of Use Thereof", which is incorporated herein in its entirety by reference.
[00109] According to one method in accordance with aspects of the invention, a pathologist or technician may select any region of a stained tissue section and receive a spectroscopy-based assessment of the tissue region in real-time, based on the hyperspectral dataset collected for the tissue before staining. Spectral data may be collected for each of the pixels in a selected unstained tissue sample. Each of the collected spectra contains a fingerprint of the chemical composition of each of the tissue pixels. Acquisition of spectral data is described in WO 2009/146425, which is incorporated herein in its entirety by reference.
[00110] In general, the spectral data includes hyperspectral datasets, which are constructs including N = n · m individual spectra or spectral vectors (absorption, emission, reflectance, etc.), where n and m are the number of pixels in the x and y dimensions of the image, respectively. Each spectrum is associated with a distinct pixel of the sample, and can be located by its coordinates x and y, where 1 ≤ x ≤ n and 1 ≤ y ≤ m. Each vector has k intensity data points, which are usually equally spaced in the frequency or wavenumber domain.
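The hypercube layout described in paragraph [00110] can be sketched in a few lines of Python. This is an illustration only: the class name `Hypercube`, its method names, and the row-major storage order are assumptions made for this example, and the specification does not prescribe any particular storage format.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Hypercube:
    """N = n * m spectra, each holding k intensity values (hypothetical layout)."""
    n: int                     # number of pixels along x
    m: int                     # number of pixels along y
    wavenumbers: List[float]   # shared axis with k equally spaced points
    spectra: List[List[float]] = field(default_factory=list)

    def __post_init__(self) -> None:
        # Start with one all-zero spectrum per pixel if none were supplied.
        if not self.spectra:
            self.spectra = [[0.0] * self.k for _ in range(self.n * self.m)]
        if len(self.spectra) != self.n * self.m:
            raise ValueError("expected n * m spectra")

    @property
    def k(self) -> int:
        return len(self.wavenumbers)

    def index(self, x: int, y: int) -> int:
        # Coordinates are 1-based, matching 1 <= x <= n and 1 <= y <= m.
        if not (1 <= x <= self.n and 1 <= y <= self.m):
            raise IndexError("pixel coordinates out of range")
        return (y - 1) * self.n + (x - 1)   # row-major order (assumption)

    def spectrum(self, x: int, y: int) -> List[float]:
        return self.spectra[self.index(x, y)]

    def set_spectrum(self, x: int, y: int, intensities: List[float]) -> None:
        if len(intensities) != self.k:
            raise ValueError("spectrum must have k data points")
        self.spectra[self.index(x, y)] = list(intensities)
```

Storing the wavenumber axis once and sharing it across all pixels reflects the statement that each vector has the same k equally spaced points; per-pixel axes would be needed only if that assumption did not hold.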
[00111] The pixel size of the spectral image may generally be selected to be smaller than the size of a typical cell so that subcellular resolution may be obtained. The size may also be determined by the diffraction limit of the light, which is typically about 5 µm to about 7 µm for infrared light. Thus, for a 1 mm² section of tissue, about 140² to about 200² individual pixel infrared spectra may be collected. For each of the N pixels of a spectral "hypercube", its x and y coordinates and its intensity vector (intensity vs. wavelength), are stored.
[00112] Pre-Processing
[00113] Subjecting the spectral data to a form of pre-processing may be helpful to isolate the data pertaining to the cellular material of interest and to remove confounding spectral features. Referring to Figure 4, once the spectral data is collected, it may be subjected to such pre-processing, as set forth in step 402.
[00114] Pre-processing may involve creating a binary mask to separate diagnostic from non-diagnostic regions of the sampled area to isolate the cellular data of interest. Methods for creating a binary mask are disclosed in WO 2009/146425, which is incorporated by reference herein in its entirety.
[00115] A method of pre-processing, according to another aspect of the invention, permits the correction of dispersive line shapes in observed absorption spectra by a "phase correction" algorithm that optimizes the separation of real and imaginary parts of the spectrum by adjusting the phase angle between them. This method, which is computationally fast, is based on a revised phase correction approach, in which no input data are required.
Although phase correction is used in the pre-processing of raw interferograms in FTIR and NMR spectroscopy (in the latter case, the interferogram is usually referred to as the "free induction decay, FID") where the proper phase angle can be determined experimentally, the method of this aspect of the invention differs from earlier phase correction approaches in that it takes into account mitigating factors, such as Mie, RMieS and other effects based on the anomalous dispersion of the refractive index, and it may be applied to spectral datasets retroactively.
[00116] The pre-processing method of this aspect of the invention transforms corrupted spectra into Fourier space by reverse FT. The reverse FT results in a real and an imaginary interferogram. The second half of each interferogram is zero-filled and forward FT transformed individually. This process yields a real spectral part that exhibits the same dispersive band shapes obtained via numeric KK transform, and an imaginary part that includes the absorptive line shapes. By recombining the real and imaginary parts with a correct phase angle between them, phase-corrected, artifact-free spectra are obtained.
[00117] Since the phase required to correct the contaminated spectra cannot be determined experimentally and varies from spectrum to spectrum, phase angles are determined using a stepwise approach between -90° and 90° in user selectable steps. The "best" spectrum is determined by analysis of peak position and intensity criteria, both of which vary during phase correction. The broad undulating Mie scattering contributions are not explicitly corrected for in this approach, but they disappear by performing the phase correction computation on second derivative spectra, which exhibit a scatter-free background.
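The stepwise phase search of paragraph [00117] can be illustrated with a small Python sketch. It makes several assumptions that are not taken from the specification: the real and imaginary parts are recombined by a plain two-dimensional rotation, the selection criterion is simply the highest peak of the rotated real part (for second derivative spectra the most negative value would be sought instead), and the test band is a synthetic Lorentzian. The function names `rotate_phase`, `scan_phase`, and `synthetic_band` are hypothetical.

```python
import math


def rotate_phase(re, im, phi):
    """Rotate paired real/imaginary spectral parts by the phase angle phi (radians)."""
    c, s = math.cos(phi), math.sin(phi)
    re_rot = [r * c + i * s for r, i in zip(re, im)]
    im_rot = [-r * s + i * c for r, i in zip(re, im)]
    return re_rot, im_rot


def scan_phase(re, im, step_deg=1):
    """Try phase angles from -90 to +90 degrees and keep the one with the highest peak."""
    best_deg, best_peak = None, -math.inf
    for deg in range(-90, 91, step_deg):
        re_rot, _ = rotate_phase(re, im, math.radians(deg))
        peak = max(re_rot)
        if peak > best_peak:
            best_deg, best_peak = deg, peak
    return best_deg, best_peak


def synthetic_band(theta_deg, center=1655.0, half_width=10.0):
    """Lorentzian band whose absorptive and dispersive parts are mixed by theta_deg."""
    theta = math.radians(theta_deg)
    wavenumbers = [1500.0 + j for j in range(301)]   # 1500 to 1800 in unit steps
    re, im = [], []
    for nu in wavenumbers:
        x = (nu - center) / half_width
        absorptive = 1.0 / (1.0 + x * x)
        dispersive = x / (1.0 + x * x)
        re.append(absorptive * math.cos(theta) - dispersive * math.sin(theta))
        im.append(absorptive * math.sin(theta) + dispersive * math.cos(theta))
    return wavenumbers, re, im
```

In this toy setting, calling `scan_phase` on the output of `synthetic_band(30)` searches the same angular range as the text and returns the angle at which the rotated real part is purely absorptive. Real data would first pass through the reverse FT, zero-fill, and forward FT steps, which this sketch omits.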
[00118] According to aspects of the invention, the pre-processing step 402 of Figure 4 may include the steps of selecting the spectral range 501, computing the second derivative of the spectra 502, reverse Fourier transforming the data 503, zero-filling and forward Fourier transforming the interferograms 504, and phase correcting the resulting real and imaginary parts of the spectrum 505, as outlined in the flowchart of Figure 5.
[00119] Spectral Range
[00120] In step 501, each spectrum in the hyperspectral dataset is pre-processed to select the most appropriate spectral range (fingerprint region). This range may be [...] well as X-H (X: heavy atom with atomic number 2 12) deformation modes. A typical example spectrum, superimposed on a linear background, is shown in Figure 6A.
[00121] Second Derivative of Spectra
[00122] The second derivative of each spectrum is then computed in step 502 of the flowchart of Figure 5. Second derivative spectra are derived from original spectral vectors by second differentiation of intensity vs. wavenumber. Second derivative spectra may be computed using a Savitzky-Golay sliding window algorithm, and can also be computed in Fourier space by multiplying the interferogram by an appropriately truncated quadratic function.
[00123] Second derivative spectra may have the advantage of being free of baseline slopes, including the slowly changing Mie scattering background. The second derivative spectra may be nearly completely devoid of baseline effects due to scattering and non-resonant Mie scattering, but still contain the effects of RMieS. The second derivative spectra may be vector normalized, if desired, to compensate for varying sample thickness. An example of a second derivative spectrum is shown in Figure 6B.
[00124] Reverse Fourier Transform
[00125] In step 503 of the flowchart of Figure 5, each spectrum of the data set is reverse Fourier transformed (FT).
Reverse FT refers to the conversion of a spectrum from intensity vs. wavenumber domain to intensity vs. phase difference domain. Since FT routines only work with spectral vectors the length of which is an integer power of 2, spectra are interpolated or truncated to 512, 1024 or 2048 (NFT) data point length before FT. Reverse FT yields a real (RE) and imaginary (IM) interferogram of NFT/2 points. A portion of the real part of such an interferogram is shown in Figure 7.
[00126] Zero-Fill and Forward Fourier Transform
[00127] The second half of both the real and imaginary interferogram for each spectrum is subsequently zero-filled in step 504. These zero-filled interferograms are subsequently forward Fourier transformed to yield a real and an imaginary spectral component with dispersive and absorptive band shapes, respectively.
[00128] Phase Correction
[00129] The real (RE) and imaginary (IM) parts resulting from the Fourier analysis are subsequently phase corrected, as shown in step 505 of the flowchart of Figure 5. This yields phase shifted real (RE') and imaginary (IM') parts as set forth in the formula below:
$$\begin{pmatrix} RE' \\ IM' \end{pmatrix} = \begin{pmatrix} \cos(\varphi) & \sin(\varphi) \\ -\sin(\varphi) & \cos(\varphi) \end{pmatrix} \begin{pmatrix} RE \\ IM \end{pmatrix}$$
where φ is the phase angle.
[00130] Since the phase angle φ for the phase correction is not known, the phase angle may be varied over -π/2 ≤ φ ≤ π/2 in user defined increments, and a spectrum with the least residual dispersive line shape may be selected. The phase angle that produces the largest intensity after phase correction may be assumed to be the uncorrupted spectrum, as shown in Figure 8. The heavy trace marked with the arrows and referred to as the "original spectrum" is a spectrum that is contaminated by RMieS contributions. The thin traces show how the spectrum changes upon phase correction with various phase angles. The second heavy trace is the recovered spectrum, which matches the uncontaminated spectrum well.
As indicated in Figure 8, the best corrected spectrum exhibits the highest amide I intensity at about 1655 cm⁻¹. This peak position matches the position before the spectrum was contaminated.
[00131] The phase correction method, in accordance with aspects of the invention [...] This approach even solves a complication that may occur if absorption spectra are used, in that if absorption spectra are contaminated by scattering effects that mimic a baseline slope, as shown schematically in Figure 9A, the imaginary part of the forward FT exhibits strongly curved effects at the spectral boundaries, as shown in Figure 9B, which will contaminate the resulting corrected spectra. Use of second derivative spectra may eliminate this effect, since the derivation eliminates the sloping background; thus, artifact-free spectra may be obtained. Since the ensuing analysis of the spectral dataset by hierarchical cluster analysis, or other appropriate segmenting or diagnostic algorithms, is carried out on second derivative spectra anyway, it is advantageous to carry out the dispersive correction on second derivative spectra, as well. Second derivative spectra exhibit reversal of the sign of spectral peaks. Thus, the phase angle is sought that causes the largest negative intensity. The value of this approach may be demonstrated from artificially contaminated spectra: since a contamination with a reflective component will always decrease its intensity, the uncontaminated or "corrected" spectrum will be the one with the largest (negative) band intensity in the amide I band between 1650 and 1660 cm⁻¹.
[00132] Example 1 - Operation of Phase Correction Algorithm
[00133] An example of the operation of the phase correction algorithm is provided in [...] node tissue section. The lymph node has confirmed breast cancer micro-metastases under the capsule, shown by the black arrows in Figure 10A.
This photo-micrograph shows distinct cellular nuclei in the cancerous region, as well as high cellularity in areas of activated lymphocytes, shown by the gray arrow. Both these sample heterogeneities contribute to large RMieS effects.
[00134] When data segmentation by hierarchical cluster analysis (HCA) was first carried out on this example lymph node section, the image shown in Figure 10B was obtained. To distinguish the cancerous tissue (dark green and yellow) from the capsule (red), and the lymphocytes (remainder of colors), 10 clusters were necessary, and the distinction of these tissue types was poor. In Figure 10B, the capsule shown in red includes more than one spectral class, which were combined into 1 cluster.
[00135] The difficulties in segmenting this dataset can be gauged from Figure 10C. This plot depicts the peak frequencies of the amide I vibrational band in each spectrum. The color scale at right of the figure indicates that the peak occurs between about 1630 and 1665 cm⁻¹ for the lymph node body, and between 1635 and 1665 cm⁻¹ for the capsule. The spread of amide I frequency is typical for a dataset heavily contaminated by RMieS effects, since it is well-known that the amide I frequency for peptides and proteins should occur in the range from 1650 to 1660 cm⁻¹, depending on the secondary protein structure. Figure 10D shows an image of the same tissue [...] node, the frequency variation of the amide I peak was reduced to the range of 1650 to 1654 cm⁻¹, and for the capsule to a range of 1657 to 1665 cm⁻¹ (fibro-connective proteins of the capsule are known to consist mostly of collagen, a protein known to exhibit a high amide I band position).
[00136] The results from a subsequent HCA are shown in Figure 11. In Figure 11A, cancerous tissue is shown in red; the outline of the cancerous regions coincides well with the H&E-based histopathology shown in Figure 11B (this figure is the same as 10A).
The capsule is represented by two different tissue classes (light blue and purple), with activated B-lymphocytes shown in light green. Histiocytes and T-lymphocytes are shown in dark green, gray and blue regions. The regions depicted in Figure 11A match the visual histopathology well, and indicate that the phase correction method discussed herein improved the quality of the spectral histopathology methods enormously.
[00137] The advantages of the pre-processing method in accordance with aspects of the invention over previous methods of spectral correction include that the method provides a fast execution time of about 5000 spectra/second and no a priori information on the dataset is required. In addition, the phase correction algorithm can be incorporated into spectral imaging and "digital staining" diagnostic routines for automatic cancer detection and diagnosis in SCP and SHP. Further, phase correction greatly improves the quality of the image, which is helpful for image registration accuracy and in diagnostic alignment and boundary representations.
[00138] Further, the pre-processing method in accordance with aspects of the invention may be used to correct a wide range of absorption spectra contaminated by reflective [...] such as those in which band shapes are distorted by dispersive line shapes, such as Diffuse Reflectance Fourier Transform Spectroscopy (DRIFTS), Attenuated Total Reflection (ATR), and other forms of spectroscopy in which mixing of the real and imaginary part of the complex refractive index, or dielectric susceptibility, occurs to a significant extent, such as may be present with Coherent Anti-Stokes Raman Spectroscopy (CARS).
[00139] Multivariate Analysis
[00140] Multivariate analysis may be performed on the pre-processed spectral data to detect spectral differences, as outlined in step 403 of the flowchart of Figure 4. In certain multivariate analyses, spectra are grouped together based on similarity.
The number of groups may be selected based on the level of differentiation required for the given biological sample. In general, the larger the number of groups, the more detail that will be evident in the spectral image. A smaller number of groups may be used if less detail is desired. According to aspects of the invention, a user may adjust the number of groups to attain the desired level of spectral differentiation.
[00141] For example, unsupervised methods, such as HCA and principal component analysis (PCA), or supervised methods, such as machine learning algorithms including, but not limited to, artificial neural networks (ANNs), hierarchical artificial neural networks (hANN), support vector machines (SVM), and/or "random forest" algorithms may be used. Unsupervised methods are based on the similarity or variance in the dataset, respectively, and segment or cluster a dataset by these criteria, requiring no information except the dataset for the segmentation or clustering. Thus, these unsupervised [...] in the dataset. Supervised algorithms, on the other hand, require reference spectra, such as representative spectra of cancer, muscle, or bone, for example, and classify a dataset based on certain similarity criteria in these reference spectra.
[00142] HCA techniques are disclosed in Bird (Bird et al., "Spectral detection of micro-metastases in lymph node histo-pathology", J. Biophoton. 2, No. 1-2, 37-46 (2009)), which is incorporated herein in its entirety. PCA is disclosed in WO 2009/146425, which is incorporated by reference herein in its entirety.
[00143] Examples of supervised methods for use in accordance with aspects of the invention may be found in P. Lasch et al. "Artificial neural networks as supervised techniques for FT-IR microspectroscopic imaging" J. Chemometrics 2006, 20, 209-220 (hereinafter "Lasch"), M.
Miljković et al., "Label-free imaging of human cells: algorithms for image reconstruction of Raman hyperspectral datasets" (hereinafter "Miljkovic"), Analyst, 2010, xx, 1-13, and A. Dupuy et al., "Critical Review of Published Microarray Studies for Cancer Outcome and Guidelines on Statistical Analysis and Reporting", JNCI, Vol. 99, Issue 2, January 17, 2007 (hereinafter "Dupuy"), each of which is incorporated by reference herein in its entirety.
[00144] Grayscale or Pseudo-Color Spectral Image
[00145] Similarly grouped data from the multivariate analysis may be assigned the same color code. The grouped data may be used to construct "digitally stained" grayscale or pseudo-color maps, as set forth in step 404 of the flowchart of Figure 4. Accordingly, this method may provide an image of a biological sample that is based solely or primarily on the chemical information contained in the spectral data.
[00146] An example of a spectral image prepared after multivariate analysis by HCA is provided in Figures 12A and 12B. Figure 12A is a visual microscopic image of a section of stained cervical image, measuring about 0.5 mm x 1 mm. Typical layers of squamous epithelium are indicated. Figure 12B is a pseudo-color infrared spectral image constructed after multivariate analysis by HCA prior to staining the tissue. This image was created by mathematically correlating spectra in the dataset with each other, and is based solely on spectral similarities; no reference spectra were provided to the computer algorithm. As shown in Figure 12B, an HCA spectral image may reproduce the tissue architecture visible after suitable staining (for example, with a H&E stain) using standard microscopy, as shown in Figure 12A. In addition, Figure 12B shows features that are not readily detected in Figure 12A, including deposits of keratin at (a) and infiltration by immune cells at (b).
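The idea of grouping spectra by similarity and giving every group one color code can be sketched as follows. This is a minimal illustration, not the procedure of Bird: the naive average-linkage agglomeration, the Euclidean distance, the row-major pixel order, and the names `hca_labels`, `pseudo_color_image`, and `PALETTE` are all assumptions chosen for brevity. The cubic-time merge loop is suitable only for a handful of spectra.

```python
import math

# Hypothetical RGB palette; one entry per cluster label.
PALETTE = [(228, 26, 28), (55, 126, 184), (77, 175, 74), (152, 78, 163)]


def euclidean(a, b):
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def hca_labels(spectra, n_clusters):
    """Naive agglomerative clustering with average linkage (unsupervised)."""
    clusters = [[i] for i in range(len(spectra))]
    while len(clusters) > n_clusters:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                total = sum(euclidean(spectra[i], spectra[j])
                            for i in clusters[a] for j in clusters[b])
                dist = total / (len(clusters[a]) * len(clusters[b]))
                if best is None or dist < best[0]:
                    best = (dist, a, b)
        _, a, b = best
        clusters[a] = clusters[a] + clusters[b]   # merge the closest pair
        del clusters[b]
    labels = [0] * len(spectra)
    for label, members in enumerate(clusters):
        for i in members:
            labels[i] = label
    return labels


def pseudo_color_image(labels, n, m, palette=PALETTE):
    """Map per-pixel cluster labels (row-major, n columns, m rows) to RGB colors."""
    return [[palette[labels[y * n + x] % len(palette)] for x in range(n)]
            for y in range(m)]
```

Raising `n_clusters` corresponds to the user asking for more groups and therefore more detail in the image; no reference spectra enter the computation at any point.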
[00147] The construction of pseudo-color spectral images by HCA analysis is discussed in Bird.

[00148] An example of a spectral image prepared after analysis by ANN is provided in Figures 13A and 13B. Figure 13A is a visual microscopic image of an H&E-stained axillary lymph node section. Figure 13B is an infrared spectral image created from ANN analysis of an infrared dataset collected prior to staining the tissue of Figure 13A.

[00149] Visual Image

[00150] A visual image of the same biological section obtained in step 302 may be acquired, as indicated by step 303 in Figure 3. The biological sample applied to a slide in step 301 described above may be unstained or may be stained by any suitable well-known method used in standard histopathology, such as by one or more H&E and/or IHC stains, and may be coverslipped. Examples of visual images are shown in Figures 12A and 13A.

[00151] A visual image of a histopathological sample may be obtained using a standard visual microscope, such as one commonly used in pathology laboratories. The microscope may be coupled to a high resolution digital camera that captures the field of view of the microscope digitally. This digital real-time image is based on the standard microscopic view of a stained piece of tissue, and is indicative of tissue architecture, cell morphology and staining patterns. The digital image may include many pixel tiles that are combined via image stitching, for example, to create a photograph. According to aspects of the invention, the digital image that is used for analysis may include an individual tile or many tiles that are stitched together into a photograph. This digital image may be saved and displayed on a computer screen.
[00152] Registration of Spectral and Visual Images

[00153] According to one method in accordance with aspects of the invention, once the spectral and visual images have been acquired, the visual image of the stained tissue may be registered with a digitally stained grayscale or pseudo-color spectral image, as indicated in step 304 in the flowchart of Figure 3. In general, image registration is the process of transforming or matching different sets of data into one coordinate system. Image registration involves spatially matching or transforming a first image to align with a second image. The images may contain different types of data, and image registration allows the matching or transformation of the different types of data.

[00154] In accordance with aspects of the invention, image registration may be performed in a number of ways. For example, a common coordinate system may be established for the visual and spectral images. If establishing a common coordinate system is not possible or is not desired, the images may be registered by point mapping to bring an image into alignment with another image. In point mapping, control points on both of the images that identify the same feature or landmark in the images are selected. Based on the positions of the control points, spatial mapping of both images may be performed. For example, at least two control points may be used. To register the images, the control points in the visible image may be correlated to the corresponding control points in the spectral image and aligned together.

[00155] In one variation according to aspects of the invention, control points may be selected by placing reference marks on the slide containing the biological specimen. Reference marks may include, but are not limited to, ink, paint, and a piece of a material, including, but not limited to, polyethylene.
The reference marks may have any suitable shape or size and may be placed in the central portion, edges, or corners of the slide, as long as they are within the field of view. The reference mark may be added to the slide while the biological specimen is being prepared. If a material having known spectral patterns, including, but not limited to, a chemical substance, such as polyethylene, and a biological substance, is used in a reference mark, it may also be used as a calibration mark to verify the accuracy of the spectral data of the biological specimen.

[00156] In another variation according to aspects of the invention, a user, such as a pathologist, may select the control points in the spectral and visual images. The user may select the control points based on their knowledge of distinguishing features of the visual or spectral images including, but not limited to, edges and boundaries. Control points may also be selected from the biological features in the image. For example, such biological features may include, but are not limited to, clumps of cells, mitotic features, cords or nests of cells, sample voids, such as alveoli and bronchi, and irregular sample edges. The user's selection of control points in the spectral and visual images may be saved to a repository that is used to provide a training correlation for personal and/or customized use. This approach may allow subjective best practices to be incorporated into the control point selection process.

[00157] In another variation according to aspects of the invention, software-based recognition of distinguishing features in the spectral and visual images may be used to select control points. The software may detect at least one control point that corresponds to a distinguishing feature in the visual or spectral images. For example, a cluster pattern in the spectral image may be used to identify similar features in the visual image. The features in both images may be aligned by translation, rotation, and scaling.
Translation, rotation and scaling may also be automated or semi-automated, for example, by developing mapping relationships or models after feature selection. Such an automated process may provide an approximation of mapping relationships that may then be resampled and transformed to optimize registration, for example. Resampling techniques include, but are not limited to, nearest neighbor, linear, and cubic interpolation.

[00158] Once the control points are aligned, the pixels in the spectral image having coordinates P1(x1, y1) may be aligned with the corresponding pixels in the visual image. The alignment may be applied to all or a selected portion of the pixels in the spectral and visual images. Once aligned, the pixels in each of the spectral and visual images may be registered together. By this registration process, the pixels in each of the spectral and visual images may be digitally joined with the pixels in the corresponding image. Since the method in accordance with aspects of the invention allows the same biological sample to be tested spectroscopically and visually, the visual and spectral images may be registered accurately.

[00159] An identification mark, such as a numerical code or bar code, may be added to the slide to verify that the correct specimen is being accessed. The reference and identification marks may be recognized by a computer that displays or otherwise stores the visual image of the biological specimen. This computer may also contain software for use in image registration.

[00160] An example of image registration according to an aspect of the invention is illustrated in Figures 14A-14C. Figure 14A is a visual image of a small cell lung cancer tissue sample, and Figure 14B is a spectral image of the same tissue sample subjected to HCA. Figure 14B contains spectral data from most of the upper right-hand section of the visual image of Figure 14A.
When the visual image of Figure 14A is registered with the spectral image of Figure 14B, the result is shown in Figure 14C. As shown in Figure 14C, the circled sections containing spots and contours 1-4 that are easily viewable in the spectral image of Figure 14B correspond closely to the spots and contours visible in the microscopic image of Figure 14A.

[00161] Once the coordinates of the pixels in the spectral and visual images are registered, the images or portions of the images may be stored. For example, the diagnostic regions may be digitally stored instead of the images of the entire sample. This may significantly reduce data storage requirements.

[00162] A user who views a certain pixel region in either the spectral or visual image may immediately access the corresponding pixel region in the other image. For example, a pathologist may select any area of the spectral image, such as by clicking a mouse or with joystick control, and view the corresponding area of the visual image that is registered with the spectral image. Figure 14D is an example of a graphical user interface (GUI) for the registered image of Figure 14C according to aspects of the invention. The GUI shown in Figure 14D allows a pathologist to toggle between the visual, spectral, and registered images and examine specific portions of interest.

[00163] In addition, as a pathologist moves or manipulates an image, he/she can also access the corresponding portion of the other image to which it is registered. For example, if a pathologist magnifies a specific portion of the spectral image, he/she may access the same portion in the visual image at the same level of magnification.

[00164] Operational parameters of the visual microscope system, as well as microscope magnification, changes in magnification, etc., may also be stored in an instrument-specific log file. The log file may be accessed at a later time to select annotation records and corresponding spectral pixels for training the algorithm.
Thus, a pathologist may manipulate the spectral image, and at a later time, the spectral image and the digital image that is registered to it are both displayed at the appropriate magnification. This feature may be useful, for example, since it allows a user to save a manipulated registered image digitally for later viewing or for electronic transmittal for remote viewing.

[00165] Image registration may be used with a tissue section having a known diagnosis to extract training spectra during a training step of a method in accordance with aspects of the invention. During the training step, a visual image of stained tissue may be registered with an unsupervised spectral image, such as from HCA. Image registration may also be used when making a diagnosis on a tissue section. For example, a supervised spectral image of the tissue section may be registered with its corresponding visual image. Thus, a user may obtain a diagnosis based on any point in the registered images that has been selected.

[00166] Image registration according to aspects of the invention provides numerous advantages over prior methods of analyzing biological samples. For example, it allows a pathologist to rely on a spectral image, which reflects the highly sensitive biochemical content of a biological sample, when analyzing biological material. As such, it provides significantly greater accuracy in detecting small abnormalities, pre-cancerous, or cancerous cells, including micrometastases, than the related art. The pathologist therefore does not have to base his/her analysis of a sample on his/her subjective observation of a visual image of the biological sample. For example, the pathologist may simply study the spectral image and may easily refer to the relevant portion in the registered visual image to verify his/her findings, as necessary.
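
The point mapping described in [00154] through [00158] can be sketched as a least-squares fit of translation, rotation, and uniform scaling to matched control points. The sketch below is one possible approach under stated assumptions: the function names are hypothetical, each pixel coordinate (x, y) is treated as the complex number x + iy, and only a similarity transform is fitted, with no resampling or further optimization.

```python
import cmath


def fit_similarity(spectral_pts, visual_pts):
    """Least-squares translation, rotation and uniform scale from control points.

    The mapping visual = a * spectral + b encodes scale (|a|), rotation
    (arg a) and translation (b). At least two distinct matched control
    points are required.
    """
    if len(spectral_pts) != len(visual_pts) or len(spectral_pts) < 2:
        raise ValueError("need at least two matched control points")
    z = [complex(x, y) for x, y in spectral_pts]
    w = [complex(x, y) for x, y in visual_pts]
    z_mean = sum(z) / len(z)
    w_mean = sum(w) / len(w)
    num = sum((wi - w_mean) * (zi - z_mean).conjugate() for zi, wi in zip(z, w))
    den = sum(abs(zi - z_mean) ** 2 for zi in z)
    if den == 0:
        raise ValueError("control points must not coincide")
    a = num / den
    b = w_mean - a * z_mean
    return a, b


def spectral_to_visual(point, a, b):
    """Map one spectral-image pixel coordinate into the visual image."""
    mapped = a * complex(*point) + b
    return mapped.real, mapped.imag
```

Once `a` and `b` are known, any pixel selected in the spectral image can be looked up in the visual image with `spectral_to_visual`; the inverse mapping is `(w - b) / a`.
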
[00167] In addition, the image registration method in accordance with aspects of the invention provides greater accuracy than the prior method of Bird (Bird et al., "Spectral detection of micro-metastases in lymph node histo-pathology", J. Biophoton. 2, No. 1-2, 37-46 (2009)), because it correlates digital data from the spectral and visual images. Bird does not correlate any digital data from the images, and instead relies merely on the skill of the user to visually match spectral and visual images of adjacent tissue sections by physically overlaying the images. Thus, the image registration method in accordance with aspects of the invention provides more accurate and reproducible diagnoses with regard to abnormal or cancerous cells. This may be helpful, for example, in providing accurate diagnosis in the early stages of disease, when indicia of abnormalities and cancer are hard to detect.

[00168] Training

[00169] A training set may optionally be developed, as set forth in step 305 in the method provided in the flowchart of Figure 3. According to aspects of the invention, a training set includes spectral data that is associated with specific diseases or conditions, among other things. The association of diseases or conditions to spectral data in the training set may be based on a correlation of classical pathology to spectral patterns based on morphological features normally found in pathological specimens. The diseases and conditions may include, but are not limited to, cellular abnormalities, inflammation, infections, pre-cancer, and cancer.
[00170] According to one aspect in accordance with the invention, in the training step, a training set may be developed by identifying a region of a visual image containing a disease or condition, correlating the region of the visual image to spectral data corresponding to the region, and storing the association between the spectral data and the corresponding disease or condition. The training set may then be archived in a repository, such as a database, and made available for use in machine learning algorithms to provide a diagnostic algorithm with output derived from the training set. The diagnostic algorithm may also be archived in a repository, such as a database, for future use.

[00171] For example, a visual image of a tissue section may be registered with a corresponding unsupervised spectral image, such as one prepared by HCA. Then, a user may select a characteristic region of the visual image. This region may be classified and/or annotated by a user to specify a disease or condition. The spectral data underlying the characteristic region in the corresponding registered unsupervised spectral image may be classified and/or annotated with the disease or condition.

[00172] The spectral data that has been classified and/or annotated with a disease or condition provides a training set that may be used to train a supervised analysis method, such as an ANN. Such methods are also described, for example, in Lasch, Miljkovic, and Dupuy. The trained supervised analysis method may provide a diagnostic algorithm.

[00173] Disease or condition information may be based on algorithms that are supplied with the instrument, algorithms trained by a user, or a combination of both. For example, an algorithm that is supplied with the instrument may be enhanced by the user.
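
The training step of [00170] through [00172] can be sketched as follows. The sketch assumes the images are already registered, so that a pixel coordinate chosen in the visual image indexes the spectral data directly. All function names are hypothetical, and a nearest-mean classifier is used purely as a simple stand-in for the ANN named in the text; it is not the method of Lasch, Miljkovic, or Dupuy.

```python
def extract_training_spectra(spectra, region, label):
    """Collect (spectrum, label) pairs for the annotated pixels.

    spectra: dict mapping registered pixel coordinates (row, col) to a
             spectrum (list of floats).
    region:  iterable of (row, col) pixels selected by the user.
    Pixels without a stored spectrum are skipped.
    """
    return [(spectra[p], label) for p in region if p in spectra]


def train_centroids(training_set):
    """Stand-in for ANN training: average the spectra of each label."""
    sums, counts = {}, {}
    for spectrum, label in training_set:
        acc = sums.setdefault(label, [0.0] * len(spectrum))
        for i, value in enumerate(spectrum):
            acc[i] += value
        counts[label] = counts.get(label, 0) + 1
    return {lab: [v / counts[lab] for v in acc] for lab, acc in sums.items()}


def classify(spectrum, centroids):
    """Return the label whose mean spectrum is closest (Euclidean distance)."""
    def dist2(centroid):
        return sum((s - c) ** 2 for s, c in zip(spectrum, centroid))
    return min(centroids, key=lambda lab: dist2(centroids[lab]))
```

The list returned by `extract_training_spectra` plays the role of the archived training set, and the dictionary returned by `train_centroids` plays the role of the archived diagnostic algorithm.
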
[00174] An advantage of the training step according to aspects of the invention is that the registered images may be trained against the best available, consensus-based "gold standards", which evaluate spectral data by reproducible and repeatable criteria. Thus, after appropriate instrument validation and algorithm training, methods in accordance with aspects of the invention may produce similar results worldwide, rather than relying on visually-assigned criteria such as normal, atypical, low grade neoplasia, high grade neoplasia, and cancer. The results for each cell may be represented by an appropriately scaled numeric index, or the results overall as a probability of a classification match. Thus, methods in accordance with aspects of the invention may have the necessary sensitivity and specificity for the detection of various biological structures, and diagnosis of disease.

[00175] The diagnostic scope of a training set may be limited by the extent to which the spectral data are classified and/or annotated with diseases or conditions. As indicated above, this training set may be augmented by the user's own interest and expertise. For example, a user may prefer one stain over another, such as one or many IHC stains over an H&E stain. In addition, an algorithm may be trained to recognize a specific condition, such as breast cancer metastases in axillary lymph nodes, for example. The algorithm may be trained to indicate normal vs. abnormal tissue types or binary outputs, such as adenocarcinoma vs. not-adenocarcinoma only, and not to classify the different normal tissue types encountered, such as capsule, B- and T-lymphocytes. The regions of a particular tissue type, or states of disease, obtained by SHP may be rendered as "digital stains" superimposed on real-time microscopic displays of the tissue sections.
[00176] Diagnosis

[00177] Once the spectral and visual images have been registered, they may be used to make a medical diagnosis, as outlined in step 306 in the flowchart of Figure 3. The diagnosis may include a disease or condition including, but not limited to, cellular abnormalities, inflammation, infections, pre-cancer, cancer, and gross anatomical features. In a method according to aspects of the invention, spectral data from a spectral image of a biological specimen of unknown disease or condition that has been registered with its visual image may be input to a trained diagnostic algorithm, as described above. Based on similarities to the training set that was used to prepare the diagnostic algorithm, the spectral data of the biological specimen may be correlated to a disease or condition. The disease or condition may be output as a diagnosis.

[00178] For example, spectral data and a visual image may be acquired from a biological specimen of unknown disease or condition. The spectral data may be analyzed by an unsupervised method, such as HCA, which may then be used along with spatial reference data to prepare an unsupervised spectral image. This unsupervised spectral image may be registered with the visual image, as discussed above. The spectral data that has been analyzed by an unsupervised method may then be input to a trained supervised algorithm. For example, the trained supervised algorithm may be an ANN, as described in the training step above. The output from the trained supervised algorithm may be spectral data that contains one or more labels that correspond to classifications and/or annotations of a disease or condition based on the training set.
[00179] To extract a diagnosis based on the labels, the labeled spectral data may be used to prepare a supervised spectral image that may be registered with the visual image and/or the unsupervised spectral image of the biological specimen. For example, when the supervised spectral image is registered with the visual image and/or the unsupervised spectral image, through a GUI, a user may select a point of interest in the visual image or the unsupervised spectral image and be provided with a disease or condition corresponding to the label at that point in the supervised spectral image. As an alternative, a user may request a software program to search the registered image for a particular disease or condition, and the software may highlight the sections in any of the visual, unsupervised spectral, and supervised spectral images that are labeled with the particular disease or condition. This advantageously allows a user to obtain a diagnosis in real-time, and also allows the user to view a visual image, which he/she is familiar with, while accessing highly sensitive spectroscopically obtained data.

[00180] The diagnosis may include a binary output, such as an "is/is not" type output, that indicates the presence or lack of a disease or condition. In addition, the diagnosis may include, but is not limited to, an adjunctive report, such as a probability of a match to a disease or condition, an index, or a relative composition ratio.

[00181] In accordance with aspects of the method of the invention, gross architectural features of a tissue section may be analyzed via spectral patterns to distinguish gross anatomical features that are not necessarily related to disease. Such procedures, known as global digital staining (GDS), may use a combination of supervised and unsupervised methods to distinguish anatomical features including, but not limited to, glandular and squamous epithelium, endothelium, connective tissue, bone, and fatty tissue.
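
The two ways of extracting a diagnosis in [00179], and the binary and adjunctive outputs of [00180], can be sketched as simple lookups on a labeled image. The sketch assumes the supervised spectral image is stored as a grid of labels on the registered pixel grid, with `None` for unlabeled pixels. The function names are hypothetical, and the reported fraction is only one simple way to express a relative composition ratio.

```python
def diagnosis_at(label_image, point):
    """Return the label stored at a pixel of the supervised spectral image.

    Because the images are registered, the same (row, col) may come from a
    selection in the visual image or in the unsupervised spectral image.
    """
    row, col = point
    return label_image[row][col]


def find_regions(label_image, condition):
    """List every pixel labeled with the requested disease or condition."""
    return [
        (r, c)
        for r, row in enumerate(label_image)
        for c, lab in enumerate(row)
        if lab == condition
    ]


def report(label_image, condition):
    """Binary "is/is not" output plus the fraction of labeled pixels that match."""
    labeled = [lab for row in label_image for lab in row if lab is not None]
    hits = sum(1 for lab in labeled if lab == condition)
    fraction = hits / len(labeled) if labeled else 0.0
    return {"present": hits > 0, "fraction": fraction}
```

The pixel list from `find_regions` is what a viewer would highlight in the visual, unsupervised, or supervised image.
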
[00182] In GDS, a supervised diagnostic algorithm may be constructed from a training dataset that includes multiple samples of a given disease from different patients. Each individual tissue section from a patient may be analyzed as described above, using spectral image data acquisition, pre-processing of the resulting dataset, and analysis by an unsupervised algorithm, such as HCA. The HCA images may be registered with corresponding stained tissue, and may be annotated by a pathologist. This annotation step, indicated in Figures 15A-C, allows the extraction of spectra corresponding to typical manifestations of tissue types or disease stages and states, or other desired features. The resulting typical spectra, along with their annotated medical diagnosis, may subsequently be used to train a supervised algorithm, such as an ANN, that is specifically suited to detect the features it was trained to recognize.

[00183] According to the GDS method, the sample may be stained using classical stains or immuno-histochemical agents. When the pathologist receives the stained sample and inspects it using a computerized imaging microscope, the spectral results may be available to the computer controlling the visual microscope. The pathologist may select any tissue spot on the sample and receive a spectroscopy-based diagnosis. This diagnosis may overlay a grayscale or pseudo-color image onto the visual image that outlines all regions that have the same spectral diagnostic classification.

[00184] Figure 15A is a visual microscopic image of H&E-stained lymph node tissue, showing gross anatomical features such as the capsule and interior of the lymph node. Figure 15B is a global digital staining image of the section shown in Figure 15A, distinguishing the capsule and interior of the lymph node.

[00185] Areas of these gross anatomical features, which are registered with the corresponding visual image, may be selected for analysis based on more sophisticated criteria in the spectral pattern dataset.
This next level of diagnosis may be based on a diagnostic marker digital staining (DMDS) database, which may be solely based on SHP results, for example, or may contain spectral information collected using immunohistochemical (IHC) results. For example, a section of epithelial tissue may be selected to analyze for the presence of spectral patterns indicative of abnormality and/or cancer, using a more diagnostic database to scan the selected area. An example of this approach is shown schematically in Figure 15C, which utilizes the full discriminatory power of SHP and yields details of tissue features in the lymph node interior (such as cancer, lymphocytes, etc.), as may be available only after immunohistochemical staining in classical histopathology. Figure 15C is a DMDS image of the section shown in Figure 15A, distinguishing capsule, metastatic breast cancer, histiocytes, activated B-lymphocytes and T-lymphocytes.

[00186] The relationship between GDS and DMDS is shown by the horizontal progression marked in dark blue and purple, respectively, in the schematic of Figure 16. Both GDS and DMDS are based on spectral data, but may include other information, such as IHC data. The actual diagnosis may also be carried out by the same or a similarly trained diagnostic algorithm, such as a hANN. Such a hANN may first analyze patterns collected for the tissue (the dark blue track). Subsequent "diagnostic element" analysis may be carried out by the hANN using a subset of spectral information, shown in the purple track. A multi-layer algorithm in binary form may be implemented, for example. Both GDS and DMDS may use different database subsections, shown as Gross Tissue Database and Diagnostic Tissue Database in Figure 16, to arrive at the respective diagnoses, and their results may be superimposed on the stained image after suitable image registration.
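
The two-level progression from GDS to DMDS in [00185] and [00186] can be sketched as a gross-tissue classifier followed by a tissue-specific diagnostic classifier. The sketch is an illustration under assumptions only: the function name is hypothetical, each classifier is an arbitrary callable standing in for a trained stage of a hANN, and the two dictionaries stand in for the Gross Tissue Database and Diagnostic Tissue Database of Figure 16.

```python
def two_stage_label(spectrum, gross_classifier, diagnostic_classifiers):
    """Apply a gross-tissue classifier, then a tissue-specific diagnostic one.

    gross_classifier: callable spectrum -> gross tissue label (GDS stage).
    diagnostic_classifiers: dict mapping a gross tissue label to a callable
        spectrum -> diagnostic label (DMDS stage). Tissue types without an
        entry keep only their gross label.
    """
    gross = gross_classifier(spectrum)
    second_stage = diagnostic_classifiers.get(gross)
    detail = second_stage(spectrum) if second_stage is not None else None
    return gross, detail
```

Restricting the second stage to selected gross tissue types mirrors the idea that only areas of interest, such as the lymph node interior, are scanned against the more diagnostic database.
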
[00187] According to an example method in accordance with aspects of the invention, a pathologist may provide certain inputs to ensure that an accurate diagnosis is achieved. For example, the pathologist may visually check the quality of the stained image. In addition, the pathologist may change the magnification or field of view of the sample.

[00188] The method according to aspects of the invention may be performed by a pathologist viewing the biological specimen and performing the image registration. Alternatively, since the registered image contains digital data that may be transmitted electronically, the method may be performed remotely.

[00189] Methods may be demonstrated by the following non-limiting examples.

[00190] Example 2 - Lymph Node Section

[00191] Figure 17A shows a visual image of an H&E-stained axillary lymph node section measuring 1 mm x 1 mm, containing a breast cancer micrometastasis in the upper left quadrant. Figure 17B is a SHP-based digitally stained region of breast cancer micrometastasis. By selecting, for example by clicking with a cursor, the general area of the micrometastasis, the region identified as cancerous is highlighted in red, as shown in Figure 17B. Figure 17C is a SHP-based digitally stained region occupied by B-lymphocytes. By pointing toward the lower right corner, regions occupied by B-lymphocytes are highlighted, as shown in Figure 17C. Figure 17D is a SHP-based digitally stained region that shows regions occupied by histiocytes, which are identified by the arrow.

[00192] Since the SHP-based digital stain is based on a trained and validated repository or database containing spectra and diagnoses, the digital stain rendered is directly relatable to a diagnostic category, such as "metastatic breast cancer," in the case of Figure 17B. The system may be first used as a complementary or auxiliary tool by a pathologist, although the diagnostic analysis may be carried out by SHP. As an adjunctive tool, the output may be a match probability and not a binary report, for example.
Figure 18 shows the detection of individual and small clusters of cancer cells with SHP.

[00193] Example 3 - Fine Needle Aspirate Sample of Lung Section

[00194] Sample sections were cut from formalin-fixed, paraffin-embedded cell blocks that were prepared from fine needle aspirates of suspicious lesions located in the lung. Cell blocks were selected based on the criteria that previous histological analysis had identified an adenocarcinoma, small cell carcinoma (SCC) or squamous cell carcinoma of the lung. Specimens were cut by use of a microtome to provide a thickness of about 5 µm and subsequently mounted onto low-e microscope slides (Kevley Technologies, Ohio, USA). Sections were then deparaffinized using standard protocols. Subsequent to spectroscopic data collection, the tissue sections were hematoxylin and eosin (H&E) stained to enable morphological interpretations by a histopathologist.

[00195] A Perkin Elmer Spectrum 1 / Spotlight 400 Imaging Spectrometer (Perkin Elmer Corp, Shelton, CT, USA) was employed in this study. Infrared micro-spectral images were recorded from 1 mm x 1 mm tissue areas in transflection (transmission/reflection) mode, with a pixel resolution of 6.25 µm x 6.25 µm, a spectral resolution of 4 cm⁻¹, and the co-addition of 8 interferograms, before Norton-Beer apodization (see, e.g., Naylor, et al. J Opt. Soc. Am., A24:3644-3648 (2007)) and Fourier transformation. An appropriate background spectrum was collected outside the sample area to ratio against the single beam spectra. The resulting ratioed spectra were then converted to absorbance. Each 1 mm x 1 mm infrared image contains 160 x 160, or 25,600 spectra.

[00196] Initially, raw infrared micro-spectral data sets were imported into and processed using software written in Matlab (version R2009a, Mathworks, Natick, MA, USA). A spectral quality test was performed to remove all spectra that were recorded from areas where no tissue existed, or displayed poor signal to noise.
All spectra that passed the test were then baseline offset normalized (subtraction of the minimal absorbance intensity across the entire spectral vector), converted to second derivative (Savitzky-Golay algorithm (see, e.g., Savitzky, et al. Anal. Chem., 36:1627 (1964)), 13 smoothing points), cut to only include intensity values recorded in the 1350 cm⁻¹ to 900 cm⁻¹ spectral region, and finally vector normalized.

[00197] Processed data sets were imported into a software system and HCA was performed using the Euclidean distance to define spectral similarity, and Ward's algorithm (see, e.g., Ward, J Am. Stat. Assoc., 58:236 (1963)) for clustering. Pseudo-color cluster images were compared directly with H&E images captured from the same sample. HCA images of between 2 and 15 clusters, which describe different clustering structures, were assembled by cutting the calculated HCA dendrogram at different levels. These cluster images were then provided to collaborating pathologists, who confirmed the clustering structure that best replicated the morphological interpretations they made upon the H&E-stained tissue.

[00198] Infrared spectra contaminated by underlying baseline shifts, unaccounted signal intensity variations, peak position shifts, or general features not arising from or obeying the Lambert-Beer law were corrected by a sub-space model version of EMSC for Mie scattering and reflection contributions to the recorded spectra (see B. Bird, M. Miljković and M. Diem, "Two step resonant Mie scattering correction of infrared micro-spectral data: human lymph node tissue", J. Biophotonics, 3 (8-9) 597-608 (2010)). Initially, 1000 recorded spectra for each cancer type were pooled into separate data sets from the infrared images presented in Figures 19A-19F.
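
The pre-processing chain of [00196] (offset correction, 13-point Savitzky-Golay second derivative, restriction to 1350 cm⁻¹ to 900 cm⁻¹, and vector normalization) can be sketched in plain Python. Several details below are assumptions, since the text does not state them: the polynomial order of the Savitzky-Golay fit (quadratic/cubic here), uniform spacing of the wavenumber axis, and the handling of the window edges (dropped here). The function names are hypothetical, and the sketch is not the Matlab software used in the study.

```python
import math


def savgol_second_derivative(y, window=13):
    """Second derivative by Savitzky-Golay convolution (quadratic/cubic fit).

    Uses the closed-form weights for unit point spacing; the first and last
    window // 2 points are dropped rather than extrapolated.
    """
    m = window // 2
    norm = m * (m + 1) * (2 * m - 1) * (2 * m + 1) * (2 * m + 3)
    weights = [30.0 * (3 * i * i - m * (m + 1)) / norm for i in range(-m, m + 1)]
    return [
        sum(wt * y[k + j] for j, wt in enumerate(weights))
        for k in range(len(y) - 2 * m)
    ]


def preprocess(wavenumbers, absorbance, low=900.0, high=1350.0, window=13):
    """Offset-correct, differentiate, crop to [low, high] cm^-1, vector-normalize."""
    floor = min(absorbance)
    shifted = [a - floor for a in absorbance]  # baseline offset normalization
    deriv = savgol_second_derivative(shifted, window)
    m = window // 2
    kept_wn = wavenumbers[m:len(wavenumbers) - m]  # axis matching the derivative
    cropped = [d for wn, d in zip(kept_wn, deriv) if low <= wn <= high]
    norm = math.sqrt(sum(d * d for d in cropped))
    return [d / norm for d in cropped] if norm else cropped
```

The derivative is taken with respect to the point index; for a uniformly spaced axis this differs from the derivative in wavenumber only by a constant factor, which the final vector normalization removes.
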
[00199] These data sets were then searched for spectra with minimal scattering contributions, a mean for each cancer type was calculated to increase signal to noise, and KK transforms were calculated for each cell type, as shown in Figure 19A and Figure 19B. Figure 19A shows raw spectral data sets comprising cellular spectra recorded from lung adenocarcinoma, small cell carcinoma, and squamous cell carcinoma cells. Figure 19B shows corrected spectral data sets comprising cellular spectra recorded from lung adenocarcinoma, small cell carcinoma, and squamous cell carcinoma cells, respectively. Figure 19C shows standard spectra for lung adenocarcinoma, small cell carcinoma, and squamous cell carcinoma.

[00200] A sub-space model for Mie scattering contributions was constructed by calculating 340 Mie scattering curves that describe a nuclei sphere radius range of 6 µm to 40 µm, and a refractive index range of 1.1 to 1.5, using the van de Hulst approximation formulae (see, e.g., Brussard, et al., Rev. Mod. Phys., 34:507 (1962)). The first 10 principal components, which describe over 95% of the variance contained in these scattering curves, were then used, in addition to the KK transforms for each cancer type, as interferences in a one-step EMSC correction of the data sets. The EMSC calculation took approximately 1 sec per 1000 spectra. Figure 19D shows KK transformed spectra calculated from the spectra in Figure 19C. Figure 19E shows PCA scores plots of the multi-class data set before EMSC correction. Figure 19F shows PCA scores plots of the multi-class data set after EMSC correction. The analysis was performed on the vector normalized 1800 cm⁻¹ to 900 cm⁻¹ spectral region.

[00201] Figure 20A shows mean absorbance spectra of lung adenocarcinoma, small cell carcinoma, and squamous carcinoma, respectively. These were calculated from 1000 scatter corrected cellular spectra of each cell type. Figure 20B shows second derivative spectra of the absorbance spectra displayed in Figure 20A.
In general, adenocarcinoma and squamous cell carcinoma have similar spectral profiles in the low wavenumber region of the spectrum. However, the squamous cell carcinoma displays a substantial low wavenumber shoulder for the amide I band, which has been observed for spectral data recorded from squamous cell carcinoma in the oral cavity (Papamarkakis, et al. (2010), Lab. Invest., 90:589-598). The small cell carcinoma displays very strong symmetric and anti-symmetric phosphate bands that are shifted in the observed spectra.

[00202] Since the majority of the sample area is composed of blood and non-diagnostic material, the data was pre-processed to only include diagnostic material and correct for scattering contributions. In addition, HCA was used to create a binary mask and finally classify the data. This result is shown in Figures 21A-21C. Figure 21A shows 4 stitched microscopic H&E-stained images of 1 mm x 1 mm tissue areas comprising adenocarcinoma, small cell carcinoma, and squamous cell carcinoma cells, respectively. Figure 21B is a binary mask image constructed by performance of a rapid reduced HCA analysis upon the 1350 cm⁻¹ to 900 cm⁻¹ spectral region of the 4 stitched raw infrared images recorded from the tissue areas shown in Figure 21A. The regions of diagnostic cellular material and blood cells are shown. Figure 21C is a 6-cluster HCA image of the scatter corrected spectral data recorded from regions of diagnostic cellular material. The analysis was performed on the 1800 cm⁻¹ to 900 cm⁻¹ spectral region. The regions of squamous cell carcinoma, adenocarcinoma, small cell carcinoma, and diverse desmoplastic tissue response are shown. Alternatively, these processes can be replaced with a supervised algorithm, such as an ANN.

[00203] The results presented in the Examples above show that the analysis of raw measured spectral data enables the differentiation of SCC and non-small cell carcinoma (NSCC).
After the raw measured spectra are corrected for scattering contributions according to methods in accordance with aspects of the invention, however, adenocarcinoma and squamous cell carcinoma, the two subtypes of NSCC, are clearly differentiated. Thus, these Examples provide strong evidence that this spectral imaging method may
[00204] Figure 22 shows various features of an example computer system 100 for use in conjunction with methods in accordance with aspects of the invention, including, but not limited to, image registration and training. As shown in Figure 22, the computer system 100 may be used by a requestor 101 via a terminal 102, such as a personal computer (PC), minicomputer, mainframe computer, microcomputer, telephone device, personal digital assistant (PDA), or other device having a processor and input capability. The server module may comprise, for example, a PC, minicomputer, mainframe computer, microcomputer, or other device having a processor and a repository for data or that is capable of accessing a repository of data. The server module 106 may be associated, for example, with an accessible repository of disease based data for use in diagnosis.
[00205] Information relating to a diagnosis may be transmitted between the analyst 101 and the server module 106, for example, via a network 110, such as the Internet. Communications may be made, for example, via couplings 111, 113, such as wired, wireless, or fiberoptic links.
[00206] Aspects of the invention may be implemented using hardware, software or a combination thereof and may be implemented in one or more computer systems or other processing systems. In one variation, aspects of the invention are directed toward one or more computer systems capable of carrying out the functionality described herein. An example of such a computer system 200 is shown in Figure 23.
[00207] Computer system 200 includes one or more processors, such as processor 204.
The processor 204 is connected to a communication infrastructure 206 (e.g., a communications bus, cross-over bar, or network). Various software aspects are will become apparent to a person skilled in the relevant art(s) how to implement the aspects of the invention using other computer systems and/or architectures.
[00208] Computer system 200 can include a display interface 202 that forwards graphics, text, and other data from the communication infrastructure 206 (or from a frame buffer not shown) for display on the display unit 230. Computer system 200 also includes a main memory 208, preferably random access memory (RAM), and may also include a secondary memory 210. The secondary memory 210 may include, for example, a hard disk drive 212 and/or a removable storage drive 214, representing a floppy disk drive, a magnetic tape drive, an optical disk drive, etc. The removable storage drive 214 reads from and/or writes to a removable storage unit 218 in a well known manner. Removable storage unit 218 represents a floppy disk, magnetic tape, optical disk, etc., which is read by and written to removable storage drive 214. As will be appreciated, the removable storage unit 218 includes a computer usable storage medium having stored therein computer software and/or data.
[00209] In alternative variations, secondary memory 210 may include other similar devices for allowing computer programs or other instructions to be loaded into computer system 200. Such devices may include, for example, a removable storage unit 222 and an interface 220. Examples of such may include a program cartridge and cartridge interface (such as that found in video game devices), a removable memory chip (such as an erasable programmable read only memory (EPROM), or programmable read only memory (PROM)) and associated socket, and other removable storage units 222 and interfaces 220, which allow software and data to be transferred from the removable storage unit 222 to computer system 200.
[00210] Computer system 200 may also include a communications interface 224. Communications interface 224 allows software and data to be transferred between computer system 200 and external devices. Examples of communications interface 224 may include a modem, a network interface (such as an Ethernet card), a communications port, a Personal Computer Memory Card International Association (PCMCIA) slot and card, etc. Software and data transferred via communications interface 224 are in the form of signals 228, which may be electronic, electromagnetic, optical or other signals capable of being received by communications interface 224. These signals 228 are provided to communications interface 224 via a communications path (e.g., channel) 226. This path 226 carries signals 228 and may be implemented using wire or cable, fiber optics, a telephone line, a cellular link, a radio frequency (RF) link and/or other communications channels. In this document, the terms "computer program medium" and "computer usable medium" are used to refer generally to media such as a removable storage drive 214, a hard disk installed in hard disk drive 212, and signals 228. These computer program products provide software to the computer system 200. Aspects of the invention are directed to such computer program products.
[00211] Computer programs (also referred to as computer control logic) are stored in main memory 208 and/or secondary memory 210. Computer programs may also be received via communications interface 224. Such computer programs, when executed, enable the computer system 200 to perform the features in accordance with aspects of the invention, as discussed herein. In particular, the computer programs, when executed, enable the processor 204 to perform such features. Accordingly, such computer programs represent controllers of the computer system 200.
[00212] In a variation where aspects of the invention are implemented using software, the software may be stored in a computer program product and loaded into computer system 200 using removable storage drive 214, hard drive 212, or communications interface 224. The control logic (software), when executed by the processor 204, causes the processor 204 to perform the functions as described herein. In another variation, aspects of the invention are implemented primarily in hardware using, for example, hardware components, such as application specific integrated circuits (ASICs). Implementation of the hardware state machine so as to perform the functions described herein will be apparent to persons skilled in the relevant art(s).
[00213] In yet another variation, aspects of the invention are implemented using a combination of both hardware and software.
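By way of illustration only, the scatter correction outlined in paragraph [00200] might be sketched as follows. The sketch builds a family of Mie scattering curves from the standard van de Hulst extinction approximation, reduces them to leading principal components, and uses those components as interference terms in a least-squares EMSC fit. The function names, the split of the 340 curves into 34 radii by 10 refractive indices, and the form of the corrected spectrum (fitted offset and interference removed, then divided by the fitted scaling factor) are assumptions made for this example and are not taken from the specification. The KK transforms mentioned above are not computed here; they could be appended as additional rows of the interference matrix.

```python
import numpy as np


def mie_curves(wavenumbers, radii_um, ref_indices):
    """Mie extinction curves from the van de Hulst approximation.

    Q(rho) = 2 - (4/rho) sin(rho) + (4/rho^2) (1 - cos(rho)),
    with rho = 4*pi*r*(n - 1)*wavenumber, r in cm and wavenumber in cm^-1.
    """
    curves = []
    for r in radii_um:
        for n in ref_indices:
            rho = 4.0 * np.pi * (r * 1e-4) * (n - 1.0) * wavenumbers  # um -> cm
            q = 2.0 - (4.0 / rho) * np.sin(rho) + (4.0 / rho**2) * (1.0 - np.cos(rho))
            curves.append(q)
    return np.asarray(curves)


def leading_components(curves, k):
    """First k principal component loadings of the mean-centered curves."""
    centered = curves - curves.mean(axis=0)
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return vt[:k]


def emsc_correct(spectra, reference, interferents):
    """One-step EMSC: fit offset + b*reference + interferents, then undo them.

    spectra:      (n_spectra, n_points) measured spectra
    reference:    (n_points,) reference spectrum (e.g. a class mean)
    interferents: (k, n_points) interference spectra (e.g. scatter components)
    """
    n_points = reference.size
    design = np.column_stack([np.ones(n_points), reference, interferents.T])
    coefs, *_ = np.linalg.lstsq(design, spectra.T, rcond=None)
    offset = design[:, [0]] @ coefs[[0]]          # fitted constant baseline
    interference = design[:, 2:] @ coefs[2:]      # fitted scatter contribution
    corrected = (spectra.T - offset - interference) / coefs[1]
    return corrected.T
```

A caller would typically compute `mie_curves` once for the spectral axis in use, keep the components returned by `leading_components(curves, 10)`, and pass them to `emsc_correct` together with a reference spectrum for each batch of measured spectra.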

Claims (21)

1. A method of providing a medical diagnosis, comprising:
obtaining spectroscopic data for a biological specimen, wherein the biological specimen is extracted from an individual;
comparing the spectroscopic data for the biological specimen to spectral data in a repository that is associated with a disease or condition;
determining whether any correlation between the spectral data and the spectroscopic data for the biological specimen exists;
adding a label that corresponds to the disease or condition to the spectroscopic data when a correlation exists between the spectral data and the spectroscopic data; and
outputting a diagnosis for the disease or condition associated with the spectral data when a correlation exists between the spectral data and the spectroscopic data.
2. The method of claim 1, wherein the repository data is obtained from a plurality of images, and wherein each of the plurality of images in the repository is associated with a disease or condition.
3. The method of claim 1, wherein outputting the diagnosis comprises displaying the diagnosis on a computer screen.
4. The method of claim 1, wherein outputting the diagnosis comprises storing the diagnosis electronically.
5. A system for providing a medical diagnosis, the system comprising:
a processor;
a user interface functioning via the processor; and
a repository accessible by the processor;
wherein spectroscopic data of a biological specimen is obtained, wherein the biological specimen is extracted from an individual;
wherein the spectroscopic data for the biological specimen is compared to spectral data in a repository that is associated with a disease or condition;
wherein whether any correlation between the spectral data and the spectroscopic data for the biological specimen exists is determined;
wherein a label that corresponds to the disease or condition is added to the spectroscopic data when a correlation exists between the spectral data and the spectroscopic data; and
wherein a diagnosis for the disease or condition associated with the spectral data is output when a correlation exists between the spectral data and the spectroscopic data.
6. The system of claim 5, wherein the processor is housed on a terminal.
7. The system of claim 6, wherein the terminal is selected from a group consisting of a personal computer, a minicomputer, a main frame computer, a microcomputer, a hand held device, and a telephonic device.
8. The system of claim 5, wherein the processor is housed on a server.
9. The system of claim 8, wherein the server is selected from a group consisting of a personal computer, a minicomputer, a microcomputer, and a main frame computer.
10. The system of claim 8, wherein the server is coupled to a network.
11. The system of claim 10, wherein the network is the Internet.
12. The system of claim 10, wherein the server is coupled to the network via a coupling.
13. The system of claim 12, wherein the coupling is selected from a group consisting of a wired connection, a wireless connection, and a fiberoptic connection.
14. The system of claim 5, wherein the repository is housed on a server.
15. The system of claim 14, wherein the server is coupled to a network.
16. A computer program product comprising a computer usable medium having control logic stored therein for causing a computer to provide a medical diagnosis, the control logic comprising:
first computer readable program code means for obtaining spectroscopic data for a biological specimen, wherein the biological specimen is extracted from an individual;
second computer readable program code means for comparing the spectroscopic data for the biological specimen to spectral data in a repository that is associated with a disease or condition;
third computer readable program code means for determining whether any correlation between the spectral data and the spectroscopic data for the biological specimen exists;
fourth computer readable program code means for adding a label that corresponds to the disease or condition to the spectroscopic data when a correlation exists between the spectral data and the spectroscopic data; and
fifth computer readable program code means for outputting a diagnosis with the disease or condition associated with the spectral data when a correlation exists between the spectral data and the spectroscopic data.
17. The method of claim 1, wherein the biological specimen comprises cells or tissue.
18. The system of claim 5, wherein the repository data is obtained from a plurality of images, and wherein each of the plurality of images in the repository is associated with a disease or condition.
19. The system of claim 5, wherein the diagnosis is displayed on a computer screen.
20. The system of claim 5, wherein the diagnosis is stored electronically.
21. The system of claim 5, wherein the biological specimen comprises cells or tissue.
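For illustration only, the compare, determine, label, and output steps recited in claim 1 might be sketched as below. The claims do not specify how a correlation is measured or when it is deemed to exist, so the Pearson correlation coefficient, the 0.95 threshold, the function name `diagnose`, and the dictionary-based repository are all assumptions introduced for this example and are not part of the claimed subject matter.

```python
import numpy as np


def diagnose(spectrum, repository, threshold=0.95):
    """Compare a spectrum against labeled repository spectra.

    repository maps a disease/condition label to a reference spectrum sampled
    on the same spectral axis as `spectrum`. Returns (label, score) for the
    best-correlated entry, or (None, score) when no entry reaches the
    threshold (hypothetical criterion for "a correlation exists").
    """
    best_label, best_score = None, -1.0
    for label, reference in repository.items():
        # Pearson correlation between measured and reference spectra
        score = float(np.corrcoef(spectrum, reference)[0, 1])
        if score > best_score:
            best_label, best_score = label, score
    if best_score >= threshold:
        return best_label, best_score   # label added; diagnosis can be output
    return None, best_score             # no correlation found; nothing labeled
```

The returned label could then be displayed on a screen or stored electronically, as in claims 3 and 4.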
AU2015224495A 2010-06-25 2015-09-10 Method for analyzing biological specimens by spectral imaging Abandoned AU2015224495A1 (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
AU2015224495A AU2015224495A1 (en) 2010-06-25 2015-09-10 Method for analyzing biological specimens by spectral imaging
AU2017204736A AU2017204736A1 (en) 2010-06-25 2017-07-10 Method for analyzing biological specimens by spectral imaging

Applications Claiming Priority (3)

Application Number Priority Date Filing Date Title
US61/358,606 2010-06-25
AU2011270731A AU2011270731A1 (en) 2010-06-25 2011-06-24 Method for analyzing biological specimens by spectral imaging
AU2015224495A AU2015224495A1 (en) 2010-06-25 2015-09-10 Method for analyzing biological specimens by spectral imaging

Related Parent Applications (1)

Application Number Title Priority Date Filing Date
AU2011270731A Division AU2011270731A1 (en) 2010-06-25 2011-06-24 Method for analyzing biological specimens by spectral imaging

Related Child Applications (1)

Application Number Title Priority Date Filing Date
AU2017204736A Division AU2017204736A1 (en) 2010-06-25 2017-07-10 Method for analyzing biological specimens by spectral imaging

Publications (1)

Publication Number Publication Date
AU2015224495A1 (en) 2015-10-01

Family

ID=54251831

Family Applications (2)

Application Number Title Priority Date Filing Date
AU2015224495A Abandoned AU2015224495A1 (en) 2010-06-25 2015-09-10 Method for analyzing biological specimens by spectral imaging
AU2017204736A Abandoned AU2017204736A1 (en) 2010-06-25 2017-07-10 Method for analyzing biological specimens by spectral imaging

Family Applications After (1)

Application Number Title Priority Date Filing Date
AU2017204736A Abandoned AU2017204736A1 (en) 2010-06-25 2017-07-10 Method for analyzing biological specimens by spectral imaging

Country Status (1)

Country Link
AU (2) AU2015224495A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109671031A (en) * 2018-12-14 2019-04-23 中北大学 A kind of multispectral image inversion method based on residual error study convolutional neural networks

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109671031A (en) * 2018-12-14 2019-04-23 中北大学 A kind of multispectral image inversion method based on residual error study convolutional neural networks
CN109671031B (en) * 2018-12-14 2022-03-18 中北大学 Multispectral image inversion method based on residual learning convolutional neural network

Also Published As

Publication number Publication date
AU2017204736A1 (en) 2017-07-27

Similar Documents

Publication Publication Date Title
US10067051B2 (en) Method for analyzing biological specimens by spectral imaging
US9495745B2 (en) Method for analyzing biological specimens by spectral imaging
CA2803933C (en) Method for analyzing biological specimens by spectral imaging
WO2013052824A1 (en) Method and system for analyzing biological specimens by spectral imaging
US10043054B2 (en) Methods and systems for classifying biological samples, including optimization of analyses and use of correlation
AU2014235921A1 (en) Method and system for analyzing biological specimens by spectral imaging
EP2887050A1 (en) Method for marker-free demarcation of tissues
AU2017204736A1 (en) Method for analyzing biological specimens by spectral imaging

Legal Events

Date Code Title Description
MK5 Application lapsed section 142(2)(e) - patent request and compl. specification not accepted