EP1639365A1 - Differential diagnosis of colorectal cancer and other diseases of the colon - Google Patents

Differential diagnosis of colorectal cancer and other diseases of the colon

Info

Publication number
EP1639365A1
EP1639365A1 EP04733324A EP04733324A EP1639365A1 EP 1639365 A1 EP1639365 A1 EP 1639365A1 EP 04733324 A EP04733324 A EP 04733324A EP 04733324 A EP04733324 A EP 04733324A EP 1639365 A1 EP1639365 A1 EP 1639365A1
Authority
EP
European Patent Office
Prior art keywords
biomolecules
colorectal cancer
subjects
large intestine
sample
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
EP04733324A
Other languages
German (de)
French (fr)
Inventor
Jörn Meuer
Jan Wiemer
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Miraculins Inc
Original Assignee
Europroteome AG
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from EP03090153A (EP1477803A1)
Application filed by Europroteome AG
Priority to EP04733324A (EP1639365A1)
Publication of EP1639365A1
Legal status: Withdrawn (current)

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/574 Immunoassay; Biospecific binding assay; Materials therefor for cancer
    • G01N33/57407 Specifically defined cancers
    • G01N33/57419 Specifically defined cancers of colon

Definitions

  • the present invention provides biomolecules and the use of these biomolecules for the differential diagnosis of colorectal cancer or a non-malignant disease of the large intestine.
  • the biomolecules are characterised by mass profiles generated by contacting a test and/or biological sample with an anion exchange surface under specific binding conditions and detecting said biomolecules using gas phase ion spectrometry.
  • the biomolecules used according to the invention are preferably proteins or polypeptides.
  • preferred test and/or biological samples are blood serum samples and are of human origin.
  • Colorectal cancer is the fourth most common cancer in the world to date, and accounts for approximately 200,000 deaths per year in Europe and the US alone. Although colorectal cancer generally affects both men and women equally (currently at 9.4% and 10.1% of incident cancer, respectively), its distribution as a leading cause of death in men and women is disproportionate. Whereas colorectal cancer is the fourth leading cancer-related cause of death in men (following lung, stomach and prostate cancer), in women it takes second place to breast cancer. Furthermore, colorectal cancer is more prevalent in developed countries exhibiting more westernised lifestyle practices.
  • Familial and hereditary factors have been observed to play primary roles in the cause of colorectal cancers.
  • a number of other factors have been shown to be associated with an increased risk of developing colorectal cancer, namely the presence of adenomatous polyps, a history or presence of inflammatory bowel disease, diets rich in animal fats, and significantly decreased consumption of raw or fresh vegetables (especially leafy green vegetables, cruciferous vegetables, and allium vegetables such as garlic, onions, and chives).
  • the fecal occult blood test (FOBT), flexible sigmoidoscopy, double contrast barium enema, and colonoscopy are the primary tools utilised to detect colorectal cancer at its early stages.
  • a positive FOBT result leads to further examination, mainly colonoscopy - an extremely discomforting, invasive diagnostic method which is expensive and carries a serious complication rate of one per 5,000 examinations.
  • Colonoscopy, as a follow-up diagnostic method, might prove effective in confirming colorectal cancer in a patient, provided that the FOBT results indeed reflect the presence of the disease.
  • Unfortunately, this is often not the case: only 12% of patients with a heme-positive fecal sample are diagnosed with cancer or large polyps at the time of colonoscopy.
  • physicians frequently fail to properly instruct their patients on how fecal samples should be collected. Normally, patients are told to adhere to specific dietary guidelines and to avoid taking medication known to induce gastrointestinal bleeding.
  • MALDI-TOF: matrix-assisted laser desorption ionization/time of flight
  • biomarkers for the detection of breast and prostate cancers have been identified using the above mentioned SELDI technology.
  • the biomarkers identified can only be used to diagnose a patient as having a specific cancer (either breast or prostate) versus not having the disease at all.
  • the test samples analysed in WO03058198 (Ciphergen) and WO0223200 (Ciphergen) were taken from patients with late-stage breast cancer (stages III and IV)
  • the control samples were taken from patients with undetectable breast cancer.
  • biomarkers identified are neither grade-specific nor can they detect the disease at its earliest stages (stage I and II), and thereby would not allow for effective patient-specific treatment of the disease. Moreover, biomarkers that can differentiate between the presence of a colorectal cancer, a non-malignant disease of the large intestine, or an acute and chronic inflammation of the epithelium have not yet been identified.
  • the present invention addresses this difficulty with the development of a non-invasive diagnostic tool for the differential diagnosis of colorectal cancer and non-malignant diseases of the large intestine.
  • the present invention relates to methods for the differential diagnosis of colorectal cancer or non-malignant disease of the large intestine by detecting one or more differentially expressed biomolecules within a test sample of a given subject, comparing results with samples from healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having a colorectal cancer, subjects having a metastasised colorectal cancer, or subjects having a non-malignant disease of the large intestine, wherein the comparison allows for the differential diagnosis of a subject as healthy, having a precancerous lesion of the large intestine, having a colorectal cancer, having a metastasised colorectal cancer or a non-malignant disease of the large intestine.
  • the present invention provides a method for the differential diagnosis of a colorectal cancer and/or a non-malignant disease of the large intestine, in vitro, comprising obtaining a test sample from a subject, contacting the test sample with a biologically active surface under specific binding conditions, allowing for biomolecules present within the test sample to bind to the biologically active surface, detecting one or more bound biomolecules using mass spectrometry thereby generating a mass profile of said test sample, transforming data into a computer-readable form, and comparing said mass profile against a database containing mass profiles specific for healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having colorectal cancer, subjects having metastasised colorectal cancers, or subjects having a non-malignant disease of the large intestine, wherein the comparison allows for the differential diagnosis of a subject as healthy, having a precancerous lesion of the large intestine, having a colorectal cancer, having
  • the invention provides a database comprising mass profiles of biological samples from healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having a colorectal cancer, subjects having a metastasised colorectal cancer, or subjects having a non-malignant disease of the large intestine.
  • the database is generated by obtaining biological samples from healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having a colorectal cancer, subjects having a metastasised colorectal cancer, and subjects having a non-malignant disease of the large intestine, contacting said biological samples with a biologically active surface under specific binding conditions, allowing the biomolecules within the biological sample to bind to said biologically active surface, detecting one or more bound biomolecules using mass spectrometry thereby generating a mass profile of said biological samples, transforming data into a computer-readable form, and applying a mathematical algorithm to classify the mass profiles as specific for healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having colorectal cancer, subjects having metastasised colorectal cancer, and subjects having a non-malignant disease of the large intestine.
  • the present invention provides biomolecules having a molecular mass selected from the group consisting of 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da
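Each mass in the list above is quoted with a tolerance window. A minimal sketch of how an observed peak might be checked against those windows follows; the helper name `match_markers` and the reading of "± tolerance" as a closed interval are assumptions for illustration, and only the first six listed masses are included.

```python
# Nominal masses and tolerances in Da, copied from the first six entries of the list above.
MARKERS = [(2020, 10), (2049, 10), (2270, 11), (2508, 13), (2732, 14), (3026, 15)]


def match_markers(observed_da, markers=MARKERS):
    """Return every (mass, tolerance) pair whose window contains observed_da.

    Hypothetical helper: the window is read as [mass - tolerance, mass + tolerance].
    """
    return [(mass, tol) for mass, tol in markers if abs(observed_da - mass) <= tol]


print(match_markers(2025))  # falls inside the 2020 Da window
print(match_markers(2035))  # falls between the 2020 Da and 2049 Da windows
```

Returning a list rather than a single match keeps the sketch usable if two windows were ever to overlap.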
  • biomolecules having said molecular masses are detected by contacting a test and/or biological sample with a biologically active surface comprising an adsorbent under specific binding conditions and further analysed by gas phase ion spectrometry.
  • the adsorbent used is comprised of positively charged quaternary ammonium groups (anion exchange surface).
  • the invention provides specific binding conditions for the detection of biomolecules within a sample.
  • a sample is diluted 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% Ampholine, and then diluted again 1:10 in binding buffer consisting of 0.1 M Tris-HCl, 0.02% Triton X-100 at pH 8.5 at 0 to 4°C.
  • the treated sample is then contacted with a biologically active surface comprising positively charged (cationic) quaternary ammonium groups (anion exchanging), incubated for 120 minutes at 20 to 24°C, and the bound biomolecules are detected using gas phase ion spectrometry.
  • the invention provides a method for the differential diagnosis of a colorectal cancer and/or a non-malignant disease of the large intestine comprising detecting one or more differentially expressed biomolecules within a sample.
  • This method comprises obtaining a test sample from a subject, contacting said sample with a binding molecule specific for a differentially expressed polypeptide, detecting an interaction between the binding molecule and its specific polypeptide, wherein the detection of an interaction indicates the presence or absence of said polypeptide, thereby allowing for the differential diagnosis of a subject as healthy, having a precancerous lesion of the large intestine, having a colorectal cancer, having a metastasised colorectal cancer and/or a non-malignant disease of the large intestine.
  • binding molecules are antibodies specific for said polypeptides.
  • biomolecules related to the invention having a molecular mass selected from the group consisting of 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026
  • the invention provides a method for the identification of biomolecules within a sample, provided that the biomolecules are proteins, polypeptides or fragments thereof, comprising: chromatography and fractionation, analysis of fractions for the presence of said differentially expressed proteins and/or fragments thereof, using a biologically active surface, further analysis using mass spectrometry to obtain amino acid sequences encoding said proteins and/or fragments thereof, and searching amino acid sequence databases of known proteins to identify said differentially expressed proteins by amino acid sequence comparison.
  • the method of chromatography is high performance liquid chromatography (HPLC) or fast protein liquid chromatography (FPLC).
  • the mass spectrometry used is selected from the group of matrix-assisted laser desorption ionization time of flight (MALDI-TOF), surface enhanced laser desorption ionisation time of flight (SELDI-TOF), liquid chromatography, MS-MS, or ESI-MS.
  • kits for the differential diagnosis of a colorectal cancer and/or a non-malignant disease of the colon are provided.
  • test or biological samples used according to the invention may be of blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract origin.
  • test and/or biological samples are blood serum samples, and are isolated from subjects of mammalian origin, preferably of human origin.
  • a colorectal cancer of the invention is a cancer of the large intestine, and may include cancers of the colon, rectum etc. Furthermore, a colorectal cancer, as intended by the invention, may be of various stages and/or grades.
  • m/z: mass/charge ratio
  • Figure 2A-F: Scatter plots of clusters (peaks, variables) belonging to differentially expressed proteins included in the four classifiers.
  • the X-axis shows the mass/charge (m/z) ratio, which is equivalent to the apparent molecular mass of the corresponding biomolecule.
  • the Y-axis shows the logarithmic normalized relative signal intensity of the peaks in the examined serum samples. First, intensities were shifted to yield entirely positive values. Then, for each mass, intensities were normalized by dividing the intensity values by the average intensity of that mass. Finally, the natural logarithm was taken.
  • o N (Normal): endoscopy control patients' serum samples.
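The Y-axis transformation described for Figure 2 (shift, divide by the per-mass average, take the natural logarithm) can be sketched as below. The text does not state how large the shift is or whether it is applied per mass or globally; this sketch assumes one global shift that moves the smallest raw intensity to a hypothetical `offset` of 1.0. The function name and data layout are illustrative only.

```python
import math


def log_normalize(intensities_by_mass, offset=1.0):
    """intensities_by_mass maps a mass to a list of raw intensities, one per sample."""
    # Step 1 (assumed form): one global shift so the smallest value becomes `offset` (> 0).
    global_min = min(min(values) for values in intensities_by_mass.values())
    shift = offset - global_min
    result = {}
    for mass, values in intensities_by_mass.items():
        shifted = [v + shift for v in values]
        # Step 2: divide by the average intensity of this mass.
        mean = sum(shifted) / len(shifted)
        # Step 3: natural logarithm of the normalized intensity.
        result[mass] = [math.log(s / mean) for s in shifted]
    return result


raw = {2020: [-1.0, 1.0, 3.0], 3026: [0.0, 2.0]}  # made-up intensities
print(log_normalize(raw))
```

After this transformation a value of zero means a sample sits exactly at the average intensity for that mass.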
  • Figure 3A-F: Additionally scaled scatter plots of clusters (peaks, variables) belonging to differentially expressed proteins included in the four classifiers.
  • the X-axis shows the mass/charge (m/z) ratio, which is equivalent to the apparent molecular mass of the corresponding biomolecule.
  • the Y-axis shows the logarithmic normalized relative signal intensity of the peaks in the examined serum samples. However, intensities were additionally (shifted and) scaled so that the intensities of each mass cover the entire range of the Y-axis. Thereby, the minimum and maximum intensities of all masses are aligned on the lower and upper edge of the plot, respectively. This makes the extent of class overlap easier to see.
  • T (Tumour): colon cancer patients' serum samples.
  • o N (Normal): endoscopy control patients' serum samples.
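The additional scaling described for Figure 3 amounts to a per-mass min-max rescaling. A minimal sketch follows, assuming the Y-axis range is [0, 1] (the text only says the values cover "the entire range" of the axis); the function name is hypothetical.

```python
def scale_to_full_range(values):
    """Rescale one mass's intensities so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    if hi == lo:
        # All samples identical: no spread to show, so place everything at the lower edge.
        return [0.0 for _ in values]
    return [(v - lo) / (hi - lo) for v in values]


print(scale_to_full_range([2.0, 4.0, 6.0]))
```

Because every mass is stretched to the same range, the panels become comparable by eye, at the cost of hiding absolute intensity differences between masses.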
  • Figure 4: Complexity of proof-of-principle classifier. The histogram visualizes the distribution of the number of decision tree variables (peaks, clusters) for the obtained proof-of-principle classifier for gastric cancer. 6 variables per decision tree are typical.
  • Figure 5: Variable importance of the proof-of-principle classifier.
  • the histograms visualize how often a variable (mass) is employed in the proof-of-principle classifier.
  • the frequency of variable selection is presented in histogram form for each hierarchical level (a-j) and for all hierarchical levels taken together (k).
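The quantities plotted in Figures 4 and 5 (number of variables per tree, and how often each mass is used at each hierarchical level) can be tallied as in the sketch below. The representation of a tree as a list of levels, each holding the masses used for splits at that depth, is an assumption made for illustration, as are the function names and example masses.

```python
from collections import Counter


def tree_complexity(trees):
    """Histogram data: number of distinct variables per tree -> number of such trees."""
    return Counter(len({mass for level in tree for mass in level}) for tree in trees)


def variable_usage(trees):
    """Count how often each mass is used, per hierarchical level and overall."""
    per_level = {}
    overall = Counter()
    for tree in trees:
        for depth, masses in enumerate(tree):
            per_level.setdefault(depth, Counter()).update(masses)
            overall.update(masses)
    return per_level, overall


# Two made-up trees: level 0 is the root split, level 1 the splits below it.
trees = [[[2020], [3026, 4103]], [[2020], [4103]]]
per_level, overall = variable_usage(trees)
print(tree_complexity(trees))
print(per_level[0], per_level[1], overall)
```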
  • Figure 6: Complexity of 1st final classifier.
  • the histogram visualizes the distribution of the number of decision tree variables (peaks, clusters) for the obtained 1st final classifier in the range of 1 to 10 decision tree variables. 9 variables per decision tree are typical.
  • Figure 7: Variable importance of 1st final classifier.
  • the histogram visualizes how often a variable (mass) is employed in the final classifier.
  • the frequency of variable selection is presented in histogram form for each of the first 10 hierarchical levels (a-j) and for the first ten hierarchical levels taken together (k).
  • Figure 8: Complexity of 2nd final classifier.
  • the histogram visualizes the distribution of the number of decision tree variables (peaks, clusters) for the obtained 2nd final classifier in the range of 1 to 10 decision tree variables. As many as 10 variables per decision tree are typical.
  • Figure 9: Variable importance of 2nd final classifier.
  • the histogram visualizes how often a variable (mass) is employed in the 2nd final classifier.
  • the frequency of variable selection is presented in histogram form for each of the first 10 hierarchical levels (a-j) and for the first ten hierarchical levels taken together (k).
  • Figure 10: Complexity of 3rd final classifier.
  • the histogram visualizes the distribution of the number of decision tree variables (peaks, clusters) for the obtained 3rd final classifier in the range of 1 to 10 decision tree variables. As many as 10 variables per decision tree are typical.
  • Figure 11: Variable importance of 3rd final classifier.
  • the histogram visualizes how often a variable (mass) is employed in the 3rd final classifier.
  • the frequency of variable selection is presented in histogram form for each of the first 10 hierarchical levels (a-j) and for the first ten hierarchical levels taken together (k).
  • biomolecule refers to a molecule produced by a cell or living organism. Such molecules include, but are not limited to, molecules comprising nucleotides, amino acids, sugars, fatty acids, steroids, nucleic acids, polynucleotides, polypeptides, proteins, carbohydrates, lipids, and combinations thereof (e.g., glycoproteins, ribonucleoproteins, lipoproteins). Furthermore, the terms “nucleotide” or “polynucleotide” refer to a nucleotide, oligonucleotide, polynucleotide, or any fragment thereof.
  • DNA or RNA of genomic or synthetic origin which may be single-stranded or double-stranded and may represent the sense or the antisense strand, to peptide polynucleotide sequences (i.e. peptide nucleic acids; PNAs), or to any DNA-like or RNA-like material.
  • fragment refers to a portion of a polypeptide (parent) sequence that comprises at least 10 consecutive amino acid residues and retains a biological activity and/or some functional characteristics of the parent polypeptide e.g. antigenicity or structural domain characteristics.
  • “biological sample” and “test sample” refer to all biological fluids and excretions isolated from any given subject.
  • samples include, but are not limited to, blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract samples.
  • binding refers to the binding reaction between a biomolecule and a specific "binding molecule".
  • binding molecules that include, but are not limited to, proteins, peptides, nucleotides, nucleic acids, hormones, amino acids, sugars, fatty acids, steroids, polynucleotides, carbohydrates, lipids, or a combination thereof (e.g. glycoproteins, ribonucleoproteins, lipoproteins).
  • a binding reaction is considered to be specific when the interaction between said molecules is substantial. In the context of the invention, a binding reaction is considered substantial when the reaction that takes place between said molecules is at least two times the background.
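The "at least two times the background" criterion can be written as a one-line check. This is a sketch only: the function name, the argument names, and the choice of a plain signal-to-background ratio are assumptions, since the text does not say how signal and background are measured.

```python
def is_specific_binding(signal, background, min_ratio=2.0):
    """True when the measured signal is at least min_ratio times the background."""
    if background <= 0:
        raise ValueError("background must be positive to form a ratio")
    return signal / background >= min_ratio


print(is_specific_binding(10.0, 5.0))
print(is_specific_binding(9.0, 5.0))
```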
  • specific binding conditions refers to reaction conditions that permit the binding of said molecules such as pH, salt, detergent and other conditions known to those skilled in the art.
  • reaction relates to the direct or indirect binding or alteration of biological activity of a biomolecule.
  • the term “differential diagnosis” refers to a diagnostic decision between a healthy and different disease states, including various stages of a specific disease.
  • a subject is diagnosed as healthy or to be suffering from a specific disease, or a specific stage of a disease based on a set of hypotheses that allow for the distinction between healthy and one or more stages of the disease.
  • the choice between healthy and one or more stages of disease depends on a significant difference between each hypothesis.
  • a “differential diagnosis” may also refer to a diagnostic decision between one disease type as compared to another (e.g. colon cancer vs. diverticulosis).
  • colorectal cancer refers to a cancer state associated with the large intestine of any given subject, wherein the cancer state is defined according to its stage and/or grade.
  • the various stages of a cancer may be identified using staging systems known to those skilled in the art [e.g. Union Internationale Contre Cancer (UICC) system or American Joint Committee on Cancer (AJC)].
  • colorectal cancers include but are not limited to colon and rectal cancers.
  • non-malignant disease of the large intestine refers to alterations in the physiological, functional and/or anatomical state of the large intestine, wherein the alterations deviate from normal.
  • this term encompasses alterations in the physiological, functional and/or anatomical state of the large intestine that cannot be staged or graded according to cancer staging systems known to those skilled in the art [e.g. Union Internationale Contre Cancer (UICC) system or American Joint Committee on Cancer (AJC)].
  • non-malignant diseases include, but are not limited to, acute and chronic inflammation of the large intestinal epithelium, diverticular disease including diverticulosis and diverticulitis, colitis, ulcerative colitis, pancolitis, Crohn's disease (ileitis), proctitis, and intestinal polyps including hyperplastic polyps, hamartomatous polyps (i.e. Juvenile polyps, Peutz-Jeghers polyps), inflammatory polyps, lymphoid polyps, and adenomatous polyps.
  • the term “healthy individual” refers to a subject possessing good health. Such a subject demonstrates an absence of any disease within the large intestine, preferably a colorectal cancer or a non-malignant disease of the large intestine.
  • precancerous lesion of the large intestine refers to a biological change within a cell and/or tissue of the large intestine such that said cell and/or tissue becomes susceptible to the development of a cancer. More specifically, a precancerous lesion of the large intestine is a preliminary stage of a colorectal cancer (i.e. dysplasia).
  • causes of a precancerous lesion of the large intestine may include, but are not limited to, genetic predisposition and exposure to cancer-causing agents (carcinogens); such cancer-causing agents include agents that cause genetic damage and induce neoplastic transformation of a cell.
  • neoplastic transformation of a cell refers to an alteration in normal cell physiology and includes, but is not limited to, self-sufficiency in growth signals, insensitivity to growth-inhibitory (anti-growth) signals, evasion of programmed cell death (apoptosis), limitless replicative potential, sustained angiogenesis, and tissue invasion and metastasis.
  • dysplasia refers to morphological alterations within a tissue, which are characterised by a loss in the uniformity of individual cells, as well as a loss in their architectural orientation. Furthermore, dysplastic cells also exhibit a variation in size and shape.
  • the phrase "differentially present” refers to differences in the quantity of a biomolecule (of a particular apparent molecular mass) present in a sample from a subject as compared to a comparable sample.
  • a biomolecule is present at an elevated level, a decreased level or absent in samples of subjects having colorectal cancer compared to samples of subjects who do not have a cancer of the large intestine. Therefore in the context of the invention, the term “differentially present biomolecule” refers to the quantity of a biomolecule (of a particular apparent molecular mass) present within a sample taken from a subject having a disease or cancer of the large intestine as compared to a comparable sample taken from a healthy subject.
  • a biomolecule is differentially present between two samples if the quantity of said biomolecule in one sample is statistically significantly different from the quantity of said biomolecule in another sample.
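The text does not name the statistical test used to decide whether a difference is significant. Purely as an illustration, the sketch below uses an exact two-sided permutation test on the difference of group means, which is a swapped-in technique and not something the source specifies; the function name and example intensities are hypothetical. Exhaustive enumeration is only practical for small groups.

```python
from itertools import combinations


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the difference of group means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    total = sum(pooled)

    def abs_mean_diff(sum_a):
        return abs(sum_a / n_a - (total - sum_a) / n_b)

    observed = abs_mean_diff(sum(group_a))
    as_extreme = 0
    n_splits = 0
    # Enumerate every way of relabelling n_a of the pooled values as "group A".
    for idx in combinations(range(len(pooled)), n_a):
        n_splits += 1
        # Small tolerance so floating-point noise does not exclude exact ties.
        if abs_mean_diff(sum(pooled[i] for i in idx)) >= observed - 1e-12:
            as_extreme += 1
    return as_extreme / n_splits


tumour = [10.0, 11.0, 12.0]   # made-up peak intensities
control = [1.0, 2.0, 3.0]
print(permutation_p_value(tumour, control))
```

With three samples per group the smallest attainable two-sided p-value is 2/20, so a conventional 0.05 threshold could never be met; larger groups are needed before a test like this can call a difference significant.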
  • diagnostic assay can be used interchangeably with “diagnostic method” and refers to the detection of the presence or nature of a pathologic condition. Diagnostic assays differ in their sensitivity and specificity. Within the context of the invention the sensitivity of a diagnostic assay is defined as the percentage of diseased subjects who test positive for a colorectal cancer or a non-malignant disease of the large intestine and are considered “true positives”. Subjects having a colorectal cancer or a non-malignant disease of the large intestine but not detected by the diagnostic assay are considered “false negatives”. Subjects who are not diseased and who test negative in the diagnostic assay are considered “true negatives”.
  • the term specificity of a diagnostic assay is defined as 1 minus the false positive rate, where the “false positive rate” is defined as the proportion of those subjects devoid of a colorectal cancer or a non-malignant disease of the large intestine but who test positive in said assay.
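The definitions of sensitivity and specificity above translate directly into two small functions. The counts in the usage lines are made up for illustration and are not results from the patent.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of diseased subjects who test positive."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """1 minus the false positive rate among non-diseased subjects."""
    false_positive_rate = false_positives / (false_positives + true_negatives)
    return 1 - false_positive_rate


# Hypothetical counts: 90 of 100 diseased subjects detected, 20 of 100 healthy subjects flagged.
print(sensitivity(true_positives=90, false_negatives=10))
print(specificity(true_negatives=80, false_positives=20))
```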
  • adsorbent refers to any material that is capable of accumulating (binding) a biomolecule.
  • the adsorbent typically coats a biologically active surface and is composed of a single material or a plurality of different materials that are capable of binding a biomolecule.
  • materials include, but are not limited to, anion exchange materials, cation exchange materials, metal chelators, polynucleotides, oligonucleotides, peptides, antibodies, etc.
  • biologically active surface refers to any two- or three-dimensional extension of a material that biomolecules can bind to, or interact with, due to the specific biochemical properties of this material and those of the biomolecules.
  • biochemical properties include, but are not limited to, ionic character (charge), hydrophobicity, or hydrophilicity.
  • binding molecule refers to a molecule that displays an affinity for another molecule.
  • such molecules may include, but are not limited to, nucleotides, amino acids, sugars, fatty acids, steroids, nucleic acids, polypeptides, carbohydrates, lipids, and combinations thereof (e.g. glycoproteins, ribonucleoproteins, lipoproteins).
  • binding molecules are antibodies.
  • solution refers to a homogeneous mixture of two or more substances. Solutions may include, but are not limited to buffers, substrate solutions, elution solutions, wash solutions, detection solutions, standardisation solutions, chemical solutions, solvents, etc. Furthermore, other solutions known to those skilled in the art are also included herein.
  • mass profile refers to a mass spectrum as a characteristic property of a given sample or a group of samples, especially when compared to the mass profile of a second sample or group of samples in any way different from the first sample or group of samples.
  • the mass profile is obtained by treating the biological sample as follows. The sample is diluted 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% Ampholine and subsequently diluted 1:10 in binding buffer consisting of 0.1 M Tris-HCl, 0.02% Triton X-100 at pH 8.5.
  • the pre-treated sample is then applied to a biologically active surface comprising positively charged quaternary ammonium groups (anion exchange surface) and incubated for 120 minutes.
  • the biomolecules bound to the surface are analysed by gas phase ion spectrometry as described in another section. All but the dilution steps are performed at 20 to 24°C. Dilution steps are performed at 0 to 4°C.
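The two dilution steps combine multiplicatively. The sketch below works out the overall dilution and the sample volume needed for a given final volume, assuming "1:n" means one part sample in n parts total volume (the other common reading, one part sample plus n parts diluent, would give different numbers). Function names and the 100 µl example are hypothetical.

```python
from fractions import Fraction


def overall_dilution(*steps):
    """Each step n stands for a '1:n' dilution, read as 1 part in n parts total."""
    factor = Fraction(1)
    for n in steps:
        factor *= Fraction(1, n)
    return factor


def sample_volume_needed(final_volume_ul, *steps):
    """Volume of undiluted sample (µl) contained in final_volume_ul of diluted sample."""
    return final_volume_ul * overall_dilution(*steps)


print(overall_dilution(5, 10))           # 1:5 followed by 1:10
print(sample_volume_needed(100, 5, 10))  # undiluted serum in 100 µl of final mix
```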
  • apparent molecular mass refers to the molecular mass value in Dalton (Da) of a biomolecule as it may appear in a given method of investigation, e.g. size exclusion chromatography, gel electrophoresis, or mass spectrometry.
  • chromatography refers to any method of separating biomolecules within a given sample such that the original native state of a given biomolecule is retained. Separation of a biomolecule from other biomolecules within a given sample for the purpose of enrichment, purification and/or analysis, may be achieved by methods including, but not limited to, size exclusion chromatography, ion exchange chromatography, hydrophobic and hydrophilic interaction chromatography, metal affinity chromatography, wherein "metal” refers to metal ions (e.g. nickel, copper, gallium, or zinc) of all chemically possible valences, or ligand affinity chromatography wherein "ligand” refers to binding molecules, preferably proteins, antibodies, or DNA. Generally, chromatography uses biologically active surfaces as adsorbents to selectively accumulate certain biomolecules.
  • mass spectrometry refers to a method comprising employing an ionization source to generate gas phase ions from a biological entity of a sample presented on a biologically active surface and detecting the gas phase ions with a mass spectrometer.
  • laser desorption mass spectrometry refers to a method comprising the use of a laser as an ionization source to generate gas phase ions from a biomolecule presented on a biologically active surface and detecting the gas phase ions with a mass spectrometer.
  • mass spectrometer refers to a gas phase ion spectrometer that includes an inlet system, an ionisation source, an ion optic assembly, a mass analyser, and a detector.
  • the terms “detect”, “detection” or “detecting” refer to the identification of the presence, absence, or quantity of a biomolecule.
  • EAM: energy absorbing molecule
  • Cinnamic acid derivatives, sinapinic acid and dihydroxybenzoic acid are frequently used as energy-absorbing molecules in laser desorption of biomolecules. See U.S. Pat. No. 5,719,060 (Hutchens & Yip) for a further description of energy absorbing molecules.
  • training set refers to a subset of the respective entire available data set. This subset is typically randomly selected, and is solely used for the purpose of classifier construction.
  • test set refers to a subset of the entire available data set consisting of those entries not included in the training set. Test data is applied to evaluate classifier performance.
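The training/test partition described above can be sketched as follows. This is a minimal illustration rather than the procedure used in the source: the function name `split_train_test`, the 70% training fraction, and the fixed seed are assumptions chosen for the example.

```python
import random


def split_train_test(entries, train_fraction=0.7, seed=0):
    """Randomly partition entries into a training set and a disjoint test set."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(entries)
    rng.shuffle(shuffled)
    n_train = int(round(train_fraction * len(shuffled)))
    # Training entries are used only to build the classifier;
    # the remaining entries are held back to evaluate it.
    return shuffled[:n_train], shuffled[n_train:]
```

Every entry lands in exactly one of the two sets, so classifier performance is measured on data that played no part in its construction.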
  • decision tree refers to a flow-chart-like tree structure employed for classification. Decision trees consist of repeated splits of a data set into subsets. Each split consists of a simple rule applied to one variable, e.g., "if value of 'variable 1' larger than 'threshold 1' then go left else go right". Accordingly, the given feature space is partitioned into a set of rectangles with each rectangle assigned to one class.
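A decision tree of this kind can be represented as nested splits. The sketch below follows the rule quoted above (a value larger than the threshold goes left, otherwise right); the class names, variable names, thresholds, and class labels are hypothetical and serve only to make the example runnable.

```python
from dataclasses import dataclass
from typing import Mapping, Union


@dataclass
class Leaf:
    label: str  # class assigned to every sample that reaches this leaf


@dataclass
class Split:
    variable: str                 # name of the variable tested at this node
    threshold: float              # hypothetical cut-off value
    left: Union["Split", Leaf]    # followed when value > threshold
    right: Union["Split", Leaf]   # followed otherwise


def classify(node: Union[Split, Leaf], sample: Mapping[str, float]) -> str:
    """Walk from the root to a leaf, applying one simple rule per split."""
    while isinstance(node, Split):
        node = node.left if sample[node.variable] > node.threshold else node.right
    return node.label


# Hypothetical two-split tree; thresholds and labels are illustrative only.
tree = Split(
    "variable 1", 5.0,
    left=Leaf("class A"),
    right=Split("variable 2", 2.0, left=Leaf("class B"), right=Leaf("class A")),
)
```

Because every rule tests a single variable against a single threshold, each leaf corresponds to an axis-aligned rectangle in the feature space.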
  • ensemble can be used interchangeably and refer to a classifier that consists of many simpler elementary classifiers, e.g., an ensemble of decision trees is a classifier consisting of decision trees.
  • the result of the ensemble classifier is obtained by combining all the results of its constituent classifiers, e.g., by majority voting that weights all constituent classifiers equally. Majority voting is especially reasonable in the case of bagging, where constituent classifiers are then naturally weighted by the frequency with which they are generated.
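Equal-weight majority voting can be sketched in a few lines. The names `ensemble_predict` and `rules`, and the three threshold rules themselves, are assumptions made for illustration; in this sketch a tie is resolved in favour of the label that was voted first.

```python
from collections import Counter
from typing import Callable, Mapping, Sequence

Classifier = Callable[[Mapping[str, float]], str]


def ensemble_predict(classifiers: Sequence[Classifier],
                     sample: Mapping[str, float]) -> str:
    """Combine constituent classifiers by majority vote with equal weights."""
    votes = Counter(clf(sample) for clf in classifiers)
    return votes.most_common(1)[0][0]


# Three hypothetical single-rule classifiers standing in for decision trees.
rules = [
    lambda s: "group 1" if s["variable 1"] > 5.0 else "group 2",
    lambda s: "group 1" if s["variable 2"] > 2.0 else "group 2",
    lambda s: "group 1" if s["variable 3"] > 1.0 else "group 2",
]
```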
  • Competitors refers to a variable (in our case: mass) that can be used as an alternative splitting rule in a decision tree. In each step of decision tree construction, only the variable yielding best data splitting is selected. Competitors are non-selected variables with similar but lower performance than the selected variable. They point into the direction of alternative decision trees.
  • surrogate refers to a splitting rule that closely mimics the action of the primary split.
  • a surrogate is a variable that can substitute a selected decision tree variable, e.g. in the case of missing values. Not only must a good surrogate split the parent node into descendant nodes similar in size and composition to the primary descendant nodes. In addition, the surrogate must also match the primary split on the specific cases that go to the left child and right child nodes.
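One simple way to judge how closely a candidate rule mimics the primary split is to count the cases that both rules send to the same child node. The sketch below does only that; it is a simplified stand-in and not the association measure used by CART itself. The names `Rule`, `goes_left`, and `split_agreement` are hypothetical.

```python
from typing import Mapping, Sequence, Tuple

Rule = Tuple[str, float]  # (variable name, threshold); value > threshold goes left


def goes_left(rule: Rule, sample: Mapping[str, float]) -> bool:
    variable, threshold = rule
    return sample[variable] > threshold


def split_agreement(primary: Rule, candidate: Rule,
                    samples: Sequence[Mapping[str, float]]) -> float:
    """Fraction of samples that both rules send to the same child node."""
    same = sum(goes_left(primary, s) == goes_left(candidate, s) for s in samples)
    return same / len(samples)
```

A candidate with agreement close to 1.0 could stand in for the primary variable when its value is missing; a candidate near 0.5 carries little information about the primary split.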
  • peak and “signal” may be used interchangeably and refer to any signal which is generated by a biomolecule when under investigation using a specific method, for example chromatography, mass spectrometry, or any type of spectroscopy like Ultraviolet/Visible Light (UV/Vis) spectroscopy, Fourier Transform Infrared (FTIR) spectroscopy, Electron Paramagnetic Resonance (EPR) spectroscopy, or Nuclear Magnetic Resonance (NMR) spectroscopy.
  • UV/Vis Ultraviolet/Visible Light
  • FTIR Fourier Transform Infrared
  • EPR Electron Paramagnetic Resonance
  • NMR Nuclear Magnetic Resonance
  • peak and signal refer to the signal generated by a biomolecule of a certain molecular mass hitting the detector of a mass spectrometer, thus generating a signal intensity which correlates with the amount or concentration of said biomolecule of a given sample.
  • a “peak” and “signal” is defined by two values: an apparent molecular mass value and an intensity value generated as described.
  • the mass value is an elemental characteristic of a biological entity, whereas the intensity value corresponds to a certain amount or concentration of a biological entity with the corresponding apparent molecular mass value, and thus “peak” and “signal” always refer to the properties of this biological entity.
  • the term "cluster" refers to a signal or peak present in a certain set of mass spectra or mass profiles obtained from different samples belonging to two or more different groups (e.g. cancer and non cancer). Within the set, signals belonging to a cluster can differ in their intensities, but not in their apparent molecular masses.
  • variable refers to a cluster which is subjected to a statistical analysis aiming towards a classification of samples into two or more different sample groups (e.g. cancer and non cancer) by using decision trees, wherein the sample feature relevant for classification is the intensity value of the variables in the analysed samples.
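The relationship between peaks, clusters, and variables can be sketched as a conversion from a list of peaks to a fixed-length vector of intensities. The mass window used to assign a peak to a cluster, the choice of the largest intensity inside the window, and the names `Peak`, `cluster_intensity`, and `feature_vector` are assumptions for illustration.

```python
from typing import List, Sequence, Tuple

Peak = Tuple[float, float]      # (apparent molecular mass in Da, intensity)
Cluster = Tuple[float, float]   # (cluster mass in Da, assumed mass window in Da)


def cluster_intensity(spectrum: Sequence[Peak], cluster_mass: float,
                      tolerance: float) -> float:
    """Largest intensity among peaks inside the mass window, or 0.0 if none."""
    hits = [intensity for mass, intensity in spectrum
            if abs(mass - cluster_mass) <= tolerance]
    return max(hits, default=0.0)


def feature_vector(spectrum: Sequence[Peak],
                   clusters: Sequence[Cluster]) -> List[float]:
    """One intensity per cluster; these values are the classifier's variables."""
    return [cluster_intensity(spectrum, mass, tol) for mass, tol in clusters]
```

Each sample then yields one vector of the same length, which is the form a decision tree needs in order to compare intensities across samples.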
  • the present invention relates to methods for the differential diagnosis of colorectal cancers or a non-malignant disease of the large intestine by detecting one or more differentially expressed biomolecules within a test sample of a given subject, comparing results with samples from healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having a colorectal cancer, subjects having a metastasised colorectal cancer, or subjects having a non-malignant disease of the large intestine, wherein the comparison allows for the differential diagnosis of a subject as healthy, having a precancerous lesion of the large intestine, having a colorectal cancer, having a metastasised colorectal cancer or a non-malignant disease of the large intestine.
  • a method for the differential diagnosis of a colorectal cancer or a non-malignant disease of the large intestine comprises obtaining a test sample from a given subject, contacting said sample with an adsorbent present on a biologically active surface under specific binding conditions, allowing the biomolecules within the test sample to bind to said adsorbent, detecting one or more bound biomolecules using a detection method, wherein the detection method generates a mass profile of said sample, transforming mass profile data into a computer-readable form, and comparing the mass profile of said sample with a database containing mass profiles from comparable samples specific for healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having a colorectal cancer, subjects having a metastasised colorectal cancer, or subjects having a non-malignant disease of the large intestine.
  • a comparison of mass profiles allows for the medical practitioner to determine if a subject is healthy, has a precancerous lesion of the large intestine, a colorectal cancer, a metastasised colorectal cancer or a non-malignant disease of the large intestine based on the presence, absence or quantity of specific biomolecules.
  • Detection of a single or a combination of more than one biomolecule of the invention is based on specific sample pre-treatment conditions, the pH of binding conditions, and the type of biologically active surface used for the detection of biomolecules. For example, prior to the detection of the biomolecules described herein, a given sample is pre-treated by diluting 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% ampholine.
  • the denatured sample is then diluted 1:10 in a specific binding buffer (0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5), applied to a biologically active surface comprising positively-charged quaternary ammonium groups (cationic) and incubated using specific buffer conditions (0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5) to allow for binding of said biomolecules to the above-mentioned biologically active surface.
  • a specific binding buffer 0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5
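The two dilution steps multiply. The short calculation below assumes that "1:5" and "1:10" each mean one part of sample in that many parts of total volume; under the other common reading (one part plus five parts) the numbers would differ. The function name `overall_dilution` is hypothetical.

```python
from fractions import Fraction


def overall_dilution(*steps: Fraction) -> Fraction:
    """Multiply successive dilution factors into one overall factor."""
    result = Fraction(1)
    for step in steps:
        result *= step
    return result


# Denaturation buffer step (1:5) followed by binding buffer step (1:10).
final = overall_dilution(Fraction(1, 5), Fraction(1, 10))
print(final)  # 1/50
```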
  • a biomolecule of the invention may include any molecule that is produced by a cell or living organism, and may have any biochemical property (e.g. phosphorylated proteins, positively charged molecules, negatively charged molecules, hydrophobicity, hydrophilicity), but preferably biochemical properties that allow binding of the biomolecule to a biologically active surface comprising positively charged quaternary ammonium groups after denaturation in 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% Ampholine and dilution in 0.1 M Tris-HCl, 0.02% Triton X-100 at pH 8.5 at 0 to 4°C followed by incubation on said biologically active surface for 120 minutes at 20 to 24°C.
  • biochemical property e.g. phosphorylated proteins, positively charged molecules, negatively charged molecules, hydrophobicity, hydrophilicity
  • biochemical properties that allow binding of the biomolecule to a biologically active surface comprising positively charged quaternary ammonium groups after denaturation in 7 M
  • Such molecules include, but are not limited to, molecules comprising nucleotides, amino acids, sugars, fatty acids, steroids, nucleic acids, polynucleotides (DNA or RNA), polypeptides, proteins, antibodies, carbohydrates, lipids, and combinations thereof (e.g., glycoproteins, ribonucleoproteins, lipoproteins).
  • a biomolecule may be a nucleotide, polynucleotide, peptide, protein or fragments thereof. Even more preferred are peptide or protein biomolecules or fragments thereof.
  • a single biomolecule or a combination of more than one biomolecule selected from the group having an apparent molecular mass of 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 47 .
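Each listed mass comes with its own tolerance, so an observed peak can be assigned to a listed biomolecule by checking which window contains it. The sketch below copies five mass and tolerance pairs from the list above; the names `MASS_WINDOWS` and `match_mass`, and the rule of preferring the closest nominal mass, are assumptions.

```python
from typing import Dict, Optional

# nominal apparent molecular mass (Da) -> tolerance (Da); a subset of the list
MASS_WINDOWS: Dict[int, int] = {2020: 10, 2049: 10, 2270: 11, 4242: 21, 4295: 21}


def match_mass(observed_da: float,
               windows: Dict[int, int] = MASS_WINDOWS) -> Optional[int]:
    """Return the nominal mass whose window contains observed_da, else None."""
    candidates = [mass for mass, tol in windows.items()
                  if abs(observed_da - mass) <= tol]
    if not candidates:
        return None
    # If windows ever overlapped, prefer the closest nominal mass.
    return min(candidates, key=lambda mass: abs(observed_da - mass))
```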
  • a biomolecule having the apparent molecular mass of about e.g. 4242 Da is present only in biological samples from patients having a metastasised colorectal cancer.
  • Mass profiling of two test samples from different subjects, X and Y reveals the presence of a biomolecule with the apparent molecular mass of about 4242 Da in a sample from test subject X, and the absence of said biomolecule in test sample from subject Y.
  • the medical practitioner is able to diagnose subject X as having a metastasised colorectal cancer and subject Y as not having a metastasised colorectal cancer.
  • three biomolecules having the apparent molecular mass of about 5772 Da, 2020 Da and 22951 Da are present in varying quantities in samples specific for precancerous lesions and "early" colorectal cancers.
  • the biomolecule having the apparent molecular mass of 5772 Da is more abundant in samples specific for precancerous lesions of the large intestine than in samples specific for "early" colorectal cancers.
  • a biomolecule having an apparent molecular mass of 2020 Da is detected in samples from subjects having "early" colorectal cancers but not in those having a precancerous lesion, whereas the biomolecule having the molecular mass of 22951 Da is present in about the same quantity in both sample types.
  • these biomolecules are not present in samples from healthy subjects; only biomolecules of apparent molecular mass of 8780 Da and 16104 Da are.
  • Analysis of a test sample reveals the presence of biomolecules having the molecular mass of 22951 Da, 5772 Da and 2020 Da.
  • Comparison of the quantity of the biomolecules within said sample reveals that the biomolecule with an apparent molecular mass of 5772 Da is present at lower levels than those found in samples from subjects having a precancerous lesion.
  • the medical practitioner is able to diagnose the test subject as having an "early" colorectal cancer.
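The reasoning of this worked example can be written down as a small rule. The sketch below only mirrors the logic of the example and is not a diagnostic procedure: `precancerous_reference_5772` is a hypothetical reference intensity that would have to come from comparison samples, and the function name and return strings are invented for illustration.

```python
from typing import Mapping


def interpret_example(intensities: Mapping[int, float],
                      precancerous_reference_5772: float) -> str:
    """Apply the three-marker reasoning of the example to one mass profile."""
    present = {mass for mass, value in intensities.items() if value > 0.0}
    # Both sample types in the example contain the 5772 Da and 22951 Da markers.
    if not {5772, 22951} <= present:
        return "not covered by this example"
    below_reference = intensities[5772] < precancerous_reference_5772
    if 2020 in present and below_reference:
        return "early colorectal cancer"
    if 2020 not in present and not below_reference:
        return "precancerous lesion"
    return "not covered by this example"
```

Profiles that fit neither pattern are deliberately left unclassified, since the example in the text says nothing about them.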
  • an immunoassay can be used to determine the presence or absence of a biomolecule within a test sample of a subject.
  • the presence or absence of a biomolecule within a sample can be detected using the various immunoassay methods known to those skilled in the art (e.g. ELISA, western blots). If a biomolecule is present in the test sample, it will form an antibody-biomolecule complex with an antibody that specifically binds the biomolecule under suitable incubation conditions. The amount of an antibody-biomolecule complex can be determined by comparing to a standard.
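Comparison with a standard is often done by reading an amount off a standard curve. The sketch below uses plain linear interpolation between measured standards; the source does not specify a curve-fitting method, so the approach, the function name `amount_from_standard`, and any units are assumptions.

```python
from bisect import bisect_left
from typing import Sequence, Tuple


def amount_from_standard(signal: float,
                         standards: Sequence[Tuple[float, float]]) -> float:
    """Interpolate an amount from (signal, amount) pairs sorted by signal."""
    signals = [s for s, _ in standards]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal outside the range covered by the standards")
    i = bisect_left(signals, signal)
    if signals[i] == signal:
        return standards[i][1]          # exact match with a standard
    (s0, a0), (s1, a1) = standards[i - 1], standards[i]
    return a0 + (a1 - a0) * (signal - s0) / (s1 - s0)
```

Signals outside the measured range raise an error rather than being extrapolated, because a standard curve gives no support there.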
  • the invention provides a method for the differential diagnosis of a colorectal cancer and/or a non-malignant disease of the large intestine comprising detecting one or more differentially expressed biomolecules within a sample.
  • This method comprises obtaining a test sample from a subject, contacting said sample with a binding molecule specific for a differentially expressed polypeptide, detecting an interaction between the binding molecule and its specific polypeptide, wherein the detection of an interaction indicates the presence or absence of said polypeptide, thereby allowing for the differential diagnosis of a subject as healthy, having a precancerous lesion of the large intestine, having a colorectal cancer, having a metastasised colorectal cancer and/or a non-malignant disease of the large intestine.
  • Binding molecules include, but are not limited to, proteins, peptides, nucleotides, nucleic acids, hormones, amino acids, sugars, fatty acids, steroids, polynucleotides, carbohydrates, lipids, or a combination thereof (e.g. glycoproteins, ribonucleoproteins, lipoproteins), compounds or synthetic molecules.
  • binding molecules are antibodies specific for biomolecules selected from the group having an apparent molecular mass of 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644
  • a method for detecting the differential presence of one or more biomolecules selected from the group having an apparent molecular mass of 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446
  • antibodies or fragments thereof may be utilised for the detection of a biomolecule in a biological sample comprising: applying a labelled antibody directed against a given biomolecule of the invention to said sample under conditions that favour an interaction between the labelled antibody and its corresponding protein.
  • a labelled antibody directed against a given biomolecule of the invention to said sample under conditions that favour an interaction between the labelled antibody and its corresponding protein.
  • an antibody coupled to an enzyme is detected using a chromogenic substrate that is recognised and cleaved by the enzyme to produce a chemical moiety, which is readily detected using spectrometric, fluorimetric or visual means.
  • Enzymes used for labelling include, but are not limited to, malate dehydrogenase, staphylococcal nuclease, delta-5-steroid isomerase, yeast alcohol dehydrogenase, alpha-glycerophosphate dehydrogenase, triose phosphate isomerase, horseradish peroxidase, alkaline phosphatase, asparaginase, glucose oxidase, beta-galactosidase, ribonuclease, urease, catalase, glucose-6-phosphate dehydrogenase, glucoamylase and acetylcholinesterase.
  • Detection may also be accomplished by visual comparison of the extent of the enzymatic reaction of a substrate with that of similarly prepared standards.
  • radiolabelled antibodies can be detected using a gamma or a scintillation counter, or they can be detected using autoradiography.
  • fluorescently labelled antibodies are detected based on the level at which the attached compound fluoresces following exposure to a given wavelength. Fluorescent compounds typically used in antibody labelling include, but are not limited to, fluorescein isothiocyanate, rhodamine, phycoerythrin, phycocyanin, allophycocyanin, o-phthalaldehyde and fluorescamine.
  • antibodies coupled to a chemi- or bioluminescent compound can be detected by determining the presence of luminescence.
  • luminescence include, but are not limited to, luminol, isoluminol, theromatic acridinium ester, imidazole, acridinium salt, oxalate ester, luciferin, luciferase and aequorin.
  • test sample used for the differential diagnosis of a colorectal cancer and/or a non-malignant disease of the large intestine of a subject may be of blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract origin.
  • test samples are of blood, blood serum, plasma, urine, excreta, prostatic fluid, biopsy, ascites, lymph or tissue extract origin. More preferred are blood, blood serum, plasma, urine, excreta, biopsy, lymph or tissue extract samples. Even more preferred are blood serum, urine, excreta or biopsy samples. Overall preferred are blood serum samples.
  • test samples used for the methods of the invention are isolated from subjects of mammalian origin, preferably of primate origin. Even more preferred are subjects of human origin.
  • the methods of the invention for the differential diagnosis of healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having a colorectal cancer, subjects having a metastasized colorectal cancer or subjects having a non-malignant disease of the large intestine described herein may be combined with other diagnostic methods to improve the outcome of the differential diagnosis.
  • Other diagnostic methods are known to those skilled in the art.
  • a database comprising mass profiles specific for healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having a colorectal cancer, subjects having a metastasised colorectal cancer, or subjects having a non-malignant disease of the large intestine is generated by contacting biological samples isolated from above-mentioned subjects with an adsorbent on a biologically active surface under specific binding conditions, allowing the biomolecules within said sample to bind said adsorbent, detecting one or more bound biomolecules using a detection method wherein the detection method generates a mass profile of said sample, transforming the mass profile data into a computer-readable form and applying a mathematical algorithm to classify the mass profile as specific for healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having a colorectal cancer, subjects having a metastasised colorectal cancer, or subjects having a non-malignant disease of the large intestine.
  • the classification of said mass profiles is performed using the "CART" decision tree approach (classification and regression trees; Breiman et al., 1984) and is known to those skilled in the art. Furthermore, bagging of classifiers is applied to overcome typical instabilities of forward variable selection procedures, thereby increasing overall classifier performance (Breiman, 1994).
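Bagging can be illustrated with a deliberately reduced example. The sketch below bags one-split trees ("stumps") chosen by training error, not full CART trees grown and pruned as in Breiman et al.; it only shows the bootstrap-then-vote idea. All function names, the number of trees, and the seed are assumptions.

```python
import random
from collections import Counter


def majority(labels):
    return Counter(labels).most_common(1)[0][0]


def train_stump(rows, labels):
    """Pick the single (variable, threshold) rule with the fewest training errors."""
    best = None
    for var in range(len(rows[0])):
        for threshold in sorted({row[var] for row in rows}):
            left = [y for row, y in zip(rows, labels) if row[var] > threshold]
            right = [y for row, y in zip(rows, labels) if row[var] <= threshold]
            if not left or not right:
                continue
            l_label, r_label = majority(left), majority(right)
            errors = (sum(y != l_label for y in left)
                      + sum(y != r_label for y in right))
            if best is None or errors < best[0]:
                best = (errors, var, threshold, l_label, r_label)
    if best is None:  # no usable split: always predict the majority label
        label = majority(labels)
        return (0, float("-inf"), label, label)
    return best[1:]


def predict_stump(stump, row):
    var, threshold, l_label, r_label = stump
    return l_label if row[var] > threshold else r_label


def bagged_stumps(rows, labels, n_trees=25, seed=0):
    """Train each stump on a bootstrap resample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(rows)
    stumps = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        stumps.append(train_stump([rows[i] for i in idx], [labels[i] for i in idx]))
    return stumps


def predict_bagged(stumps, row):
    """Equal-weight majority vote over all stumps."""
    return majority(predict_stump(s, row) for s in stumps)
```

Because each resample differs, the stumps do not all pick the same variable or threshold; averaging their votes dampens the instability of any single forward selection.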
  • one or more biomolecules selected from the group having an apparent molecular mass of 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da ± 33
  • biomolecules within a given sample are bound to an adsorbent on a biologically active surface under specific binding conditions, for example, the biomolecules within a given sample are applied to a biologically active surface comprising positively-charged quaternary ammonium groups (cationic) and incubated with 0.1 M Tris-HCl, 0.02% Triton X-100 at a pH of 8.5 to allow for specific binding. Biomolecules that bind to said biologically active surface under these conditions are negatively charged molecules.
  • although the biomolecules of the invention are bound to a cationic adsorbent comprising positively-charged quaternary ammonium groups, the biomolecules are capable of binding other types of adsorbents, as described in another section, using binding conditions known to those skilled in the art. Accordingly, some embodiments of the invention are not limited to the use of cationic adsorbents.
  • biological samples used to generate a database of mass profiles for healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having a colorectal cancer, subjects having a metastasised colorectal cancer or subjects having a non-malignant disease of the large intestine may be of blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract origin.
  • biological samples are of blood, blood serum, plasma, urine, excreta, prostatic fluid, biopsy, ascites, lymph or tissue extract origin. More preferred are blood, blood serum, plasma, urine, excreta, biopsy, lymph or tissue extract samples. Even more preferred are blood serum, urine, excreta or biopsy samples. Overall preferred are blood serum samples.
  • the biological samples related to the invention are isolated from subjects considered to be healthy, having a precancerous lesion of the large intestine, having a colorectal cancer, having a metastasised colorectal cancer or having a non-malignant disease of the large intestine.
  • Said subjects are of mammalian origin, preferably of primate origin. Even more preferred are subjects of human origin.
  • a subject of the invention that is said to have a precancerous lesion of the large intestine displays preliminary stages of a cancer (i.e. dysplasia), wherein a cell and/or tissue has become susceptible to the development of a cancer as a result of either a genetic predisposition, exposure to a cancer-causing agent (carcinogen) or both.
  • a cancer i.e. dysplasia
  • a genetic predisposition may include a predisposition for an autosomal dominant inherited cancer syndrome which is generally indicated by a strong family history of uncommon cancer and/or an association with a specific marker phenotype (e.g. familial adenomatous polyps of the colon), a familial cancer wherein an evident clustering of cancer is observed but the role of inherited predisposition may not be clear (e.g. breast cancer, ovarian cancer, or colon cancer), or an autosomal recessive syndrome characterised by chromosomal or DNA instability.
  • cancer-causing agents include agents that cause genetic damage and induce neoplastic transformation of a cell.
  • Such agents fall into three categories: 1) chemical carcinogens such as alkylating agents, polycyclic aromatic hydrocarbons, aromatic amines, azo dyes, nitrosamines and amides, asbestos, vinyl chloride, chromium, nickel, arsenic, and naturally occurring carcinogens (e.g. aflatoxin B1); 2) radiation such as ultraviolet (UV) and ionising radiation including electromagnetic (e.g. x-rays, γ-rays) and particulate radiation (e.g.
  • chemical carcinogens such as alkylating agents, polycyclic aromatic hydrocarbons, aromatic amines, azo dyes, nitrosamines and amides, asbestos, vinyl chloride, chromium, nickel, arsenic, and naturally occurring carcinogens (e.g. aflatoxin B1); 2) radiation such as ultraviolet (UV) and ionising radiation including electromagnetic (e.g. x-rays, γ-rays) and particulate radiation (
  • α and β particles, protons, neutrons); 3) viral and microbial carcinogens such as human Papillomavirus (HPV), Epstein-Barr virus (EBV), hepatitis B virus (HBV), human T-cell leukaemia virus type 1 (HTLV-1), or Helicobacter pylori.
  • HPV human Papillomavirus
  • EBV Epstein-Barr virus
  • HBV hepatitis B virus
  • HTLV-1 human T-cell leukaemia virus type 1
  • Helicobacter pylori
  • a subject within the invention that is said to have a colorectal cancer possesses a cancer that arises from the large intestine (interchangeably referred to as colorectal cancers within the invention).
  • Such cancers may include, but are not limited to, colon and rectal cancers.
  • cancers of the large intestine may also be of various stages, wherein the staging is based on the size of the primary lesion, its extent of spread to regional lymph nodes, and the presence or absence of blood-borne metastases (metastatic colorectal cancers).
  • the various stages of a cancer may be identified using staging systems known to those skilled in the art [e.g. Union Internationale Contre le Cancer (UICC) system or American Joint Committee on Cancer (AJCC)].
  • UICC Union Internationale Contre le Cancer
  • AJCC American Joint Committee on Cancer
  • grade of a cancer is based on the degree of differentiation of the epithelial cells within the lining of the large intestine and the number of mitoses as a correlation to a neoplasm's aggression.
  • Healthy individuals are those that possess good health, and demonstrate an absence of a colorectal cancer or a non-malignant disease of the large intestine.
  • the differential expression of biomolecules in samples from healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having a colorectal cancer, subjects having metastasised colorectal cancer, and subjects having a non-malignant disease of the large intestine allows for the differential diagnosis of a non-malignant disease or a cancer of the large intestine within a subject.
  • Biomolecules are said to be specific for a particular clinical state (e.g. healthy, precancerous lesion of the large intestine, colorectal cancer, metastasised colorectal cancer, a non-malignant disease of the large intestine) when they are present at different levels within samples taken from subjects in one clinical state as compared to samples taken from subjects from other clinical states (e.g. in subjects with a precancerous lesion of the large intestine vs. in subjects with a metastasised colorectal cancer). Biomolecules may be present at elevated levels, at decreased levels, or altogether absent within a sample taken from a subject in a particular clinical state (e.g.
  • biomolecules A and B are found at elevated levels in samples isolated from healthy subjects as compared to samples isolated from subjects having a precancerous lesion of the large intestine, a colorectal cancer, a metastatic colorectal cancer or a non-malignant disease of the large intestine.
  • biomolecules X, Y, Z are found at elevated levels and/or more frequently in samples isolated from subjects having a precancerous lesion of the large intestine as opposed to subjects in good health, having a colorectal cancer, a metastasised colorectal cancer or a non-malignant disease of the large intestine.
  • Biomolecules A and B are said to be specific for healthy subjects, whereas biomolecules X, Y, Z are specific for subjects having a precancerous lesion of the large intestine.
  • the differential presence of one or more biomolecules found in a test sample compared to samples from healthy subjects, subjects with a precancerous lesion of the large intestine, a colorectal cancer, a metastasized colorectal cancer, or a non-malignant disease of the large intestine, or the mere detection of one or more biomolecules in the test sample provides useful information regarding probability of whether a subject being tested has a precancerous lesion of the large intestine, a colorectal cancer, a metastasized colorectal cancer or a non-malignant disease of the large intestine.
  • the probability that a subject being tested has a precancerous lesion of the large intestine, a colorectal cancer, a metastasized colorectal cancer or a non-malignant disease of the large intestine depends on whether the quantity of one or more biomolecules in a test sample taken from said subject is statistically significantly different from the quantity of one or more biomolecules in a biological sample taken from healthy subjects, subjects having a precancerous lesion of the large intestine, a colorectal cancer, a metastasised colorectal cancer, or a non-malignant disease of the large intestine.
  • a biomolecule of the invention may be any molecule that is produced by a cell or living organism, and may have any biochemical property (e.g. phosphorylated proteins, positively charged molecules, negatively charged molecules, hydrophobicity, hydrophilicity), but preferably biochemical properties that allow binding of the biomolecule to a biologically active surface comprising positively charged quaternary ammonium groups after denaturation in 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% Ampholine and dilution in 0.1 M Tris-HCl, 0.02% Triton X-100 at pH 8.5 at 0 to 4°C followed by incubation on said biologically active surface for 120 minutes at 20 to 24°C.
  • biochemical property e.g. phosphorylated proteins, positively charged molecules, negatively charged molecules, hydrophobicity, hydrophilicity
  • biochemical properties that allow binding of the biomolecule to a biologically active surface comprising positively charged quaternary ammonium groups after denaturation
  • Such molecules include, but are not limited to, molecules comprising nucleotides, amino acids, sugars, fatty acids, steroids, nucleic acids, polynucleotides (DNA or RNA), polypeptides, proteins, antibodies, carbohydrates, lipids, and combinations thereof (e.g., glycoproteins, ribonucleoproteins, lipoproteins).
  • a biomolecule may be a nucleotide, polynucleotide, peptide, protein or fragments thereof. Even more preferred are peptide or protein biomolecules.
  • the biomolecules of the invention can be detected based on specific sample pre-treatment conditions, the pH of binding conditions, the type of biologically active surface used for the detection of biomolecules within a given sample and their molecular mass. For example, prior to the detection of the biomolecules described herein, a given sample is pre-treated by diluting 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% ampholine.
  • the denatured sample is then diluted 1:10 in 0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5, applied to a biologically active surface comprising positively-charged quaternary ammonium groups (cationic) and incubated using specific buffer conditions (0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5) to allow for binding of said biomolecules to the above-mentioned biologically active surface.
  • specific buffer conditions 0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5
  • although the biomolecules of the invention are detected using a cationic adsorbent comprising positively charged quaternary ammonium groups, as well as specific pre-treatment and binding conditions, the biomolecules are capable of binding other types of adsorbents, as described below, using alternative pre-treatment and binding conditions known to those skilled in the art. Accordingly, some embodiments of the invention are not limited to the use of cationic adsorbents.
  • the biomolecules of the invention include biomolecules having a molecular mass selected from the group consisting of 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da
  • although these biomolecules were first identified in blood serum samples, their detection is not limited to said sample type.
  • the biomolecules may also be detected in other sample types, such as blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract.
  • samples are of blood, blood serum, plasma, urine, excreta, prostatic fluid, biopsy, ascites, lymph or tissue extract origin. More preferred are blood, blood serum, plasma, urine, excreta, biopsy, lymph or tissue extract samples. Even more preferred are blood serum, urine, excreta or biopsy samples. Overall preferred are blood serum samples.
  • biomolecules can be sufficiently characterized by their mass and biochemical characteristics such as the type of biologically active surface they bind to or the pH of binding conditions, it is not necessary to identify the biomolecules in order to be able to identify them in a sample. It should be noted that molecular mass and binding properties are characteristic properties of these biomolecules and not limitations on the means of detection or isolation. Furthermore, using the methods described herein, or other methods known in the art, the absolute identity of the markers can be determined. This is important when one wishes to develop and/or screen for specific binding molecules, or to develop a an assay for the detection of said biomolecules using specific binding molecules.
  • biologically active surfaces include, but are not restricted to, surfaces that contain adsorbents such as quaternary ammonium groups (anion exchange surfaces), carboxylate groups (cation exchange surfaces), alkyl or aryl chains (hydrophobic interaction, reverse phase chemistry), groups such as nitrilotriacetic acid that immobilize metal ions such as nickel, gallium, copper, or zinc (metal affinity interaction), or biomolecules such as proteins, preferably antibodies, or nucleic acids, preferably protein binding sequences, covalently bound to the surface via carbonyl diimidazole moieties or epoxy groups (specific affinity interaction).
  • adsorbents comprising anion exchange surfaces.
  • These surfaces may be located on matrices like polysaccharides such as sepharose, e.g. anion exchange surfaces or hydrophobic interaction surfaces, or solid metals, e.g. antibodies coupled to magnetic beads. Surfaces may also include gold-plated surfaces such as those used for Biacore Sensor Chip technology. Other surfaces known to those skilled in the art are also included within the scope of the invention.
  • Biomolecules like amino acids, sugars, fatty acids, steroids, nucleic acids, polynucleotides, polypeptides, carbohydrates, lipids, and combinations thereof (e.g., glycoproteins, ribonucleoproteins, lipoproteins).
  • devices that use biologically active surfaces to selectively adsorb biomolecules may be chromatography columns for Fast Protein Liquid Chromatography (FPLC) and High Pressure Liquid Chromatography (HPLC), where the matrix, e.g. a polysaccharide, carrying the biologically active surface, is filled into vessels (usually referred to as "columns") made of glass, steel, or synthetic materials like polyetheretherketone (PEEK).
  • devices that use biologically active surfaces to selectively adsorb biomolecules may be metal strips carrying thin layers of the biologically active surface on one or more spots of the strip surface to be used as probes for gas phase ion spectrometry analysis, for example the SAX2 ProteinChip array (Ciphergen Biosystems, Inc.) for SELDI analysis.
  • the mass profile of a sample may be generated using an array-based assay in which the biomolecules of a given sample are bound by biochemical or affinity interactions to an adsorbent present on a biologically active surface located on a solid platform ("array" or "probe"). After the biomolecules have bound to the adsorbent, they are detected using gas phase ion spectrometry. Biomolecules or other substances bound to the adsorbents on the probes can be analyzed using a gas phase ion spectrometer. This includes, e.g., mass spectrometers, ion mobility spectrometers, or total ion current measuring devices. The quantity and characteristics of the biomolecule can be determined using gas phase ion spectrometry. Other substances in addition to the biomolecule of interest can also be detected by gas phase ion spectrometry.
  • a mass spectrometer can be used to detect biomolecules on the probe.
  • a probe with a biomolecule is introduced into an inlet system of the mass spectrometer.
  • the biomolecule is then ionized by an ionization source, such as a laser, fast atom bombardment, or plasma.
  • the generated ions are collected by an ion optic assembly, and then a mass analyzer disperses and analyzes the passing ions.
  • the ionisation source that ionises the biomolecule is a laser.
  • the ions exiting the mass analyzer are detected by an ion detector.
  • the ion detector then translates information of the detected ions into mass-to-charge ratios.
  • Detection of the presence of a biomolecule or other substances will typically involve detection of signal intensity. This, in turn, can reflect the quantity and character of a biomolecule bound to the probe.
  • the mass profile of a sample may be generated using a liquid-chromatography (LC)-based assay in which the biomolecules of a given sample are bound by biochemical or affinity interactions to an adsorbent located in a vessel made of glass, steel, or synthetic material; known to those skilled in the art as a chromatography column.
  • the biomolecules are eluted from the biologically active surface by washing the vessel with appropriate solutions known to those skilled in the art.
  • solutions include, but are not limited to, buffers, e.g. Tris (hydroxymethyl) aminomethane hydrochloride (TRIS-HCl), buffers containing salt, e.g. sodium chloride (NaCl), or organic solvents, e.g. acetonitrile.
  • Biomolecule mass profiles are generated by application of the eluting biomolecules of the sample by direct connection via an electrospray device to a mass spectrometer (LC/ESI-MS).
  • Conditions that promote binding of biomolecules to an adsorbent are known to those skilled in the art (reference) and ordinarily include parameters such as pH, the concentration of salt, organic solvent, or other competitors for binding of the biomolecule to the adsorbent.
  • incubation temperatures range from 0 to 100°C, preferably from 4 to 60°C, and most preferably from 15 to 30°C.
  • additional parameters such as incubation time, the concentration of detergent, e.g., 3-[(3-Cholamidopropyl) dimethylammonio]-2-hydroxy-1-propanesulfonate (CHAPS), or reducing agents, e.g. dithiothreitol (DTT), are also known to those skilled in the art.
  • Various degrees of binding can be accomplished by combining the above stated conditions as needed, and will be readily apparent to those skilled in the art.
  • the invention relates to methods for detecting differentially present biomolecules in a test sample and/or biological sample.
  • any suitable method can be used to detect one or more of the biomolecules described herein.
  • gas phase ion spectrometry can be used. This technique includes, e.g., laser desorption/ionization mass spectrometry.
  • the test and/or biological sample is prepared prior to gas phase ion spectrometry, e.g., by pre-fractionation, two-dimensional gel electrophoresis, high performance liquid chromatography, etc., to assist detection of said biomolecules.
  • Detection of said biomolecules can also be achieved using methods other than gas phase ion spectrometry.
  • immunoassays can be used to detect the biomolecules within a sample.
  • the test and/or biological sample is prepared prior to contacting a biologically active surface and is in aqueous form.
  • samples include, but are not limited to, blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, tears, saliva, sweat, ascites, cerebrospinal fluid, milk, lymph, or tissue extract samples.
  • solid test and/or biological samples, such as excreta or biopsy samples, can be solubilised in or admixed with an eluent using methods known to those skilled in the art such that said samples may be easily applied to a biologically active surface.
  • Test and/or biological samples in the aqueous form can be further prepared using specific solutions for denaturation (pre-treatment) like sodium dodecyl sulfate, mercaptoethanol, urea, etc.
  • a test and/or biological sample of the invention can be denatured prior to contacting a biologically active surface comprising quaternary ammonium groups by diluting said sample 1:5 with a buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT and 2% ampholine.
  • the sample is contacted with a biologically active surface using any techniques including bathing, soaking, dipping, spraying, washing over, or pipetting, etc.
  • Generally, a volume of sample containing from a few attomoles to 100 picomoles of a biomolecule in about 1 to 500 µl is sufficient for detecting binding of the biomolecule to the adsorbent.
  • the pH value of the solvent in which the sample contacts the biologically active surface is a function of the specific sample and the selected biologically active surface.
  • a sample is contacted with a biologically active surface under pH values between 0 and 14, preferably between about 4 and 10, more preferably between 4.5 and 9.0, and most preferably, at pH 8.5.
  • the pH value depends on the type of adsorbent present on a biologically active surface and can be adjusted accordingly.
  • the sample can contact the adsorbent present on a biologically active surface for a period of time sufficient to allow the marker to bind to the adsorbent.
  • the sample and the biologically active surface are contacted for a period of between about 1 second and about 12 hours, preferably, between about 30 seconds and about 3 hours, and most preferably for 120 minutes.
  • the temperature at which the sample contacts the biologically active surface is a function of the specific sample and the selected biologically active surface.
  • the washing solution can be at a temperature of between 0 and 100°C, preferably between 4 and 37°C, and most preferably between 20 and 24°C.
  • a biologically active surface comprising quaternary ammonium groups (anion exchange surface) will bind the biomolecules described herein when the pH value is between 6.5 and 9.0.
  • Optimal binding of the biomolecules of the present invention occurs at a pH of 8.5.
  • a sample is contacted with said biologically active surface for 120 min. at a temperature of 20 - 24 °C.
  • unbound biomolecules are removed by methods known to those skilled in the art such as bathing, soaking, dipping, rinsing, spraying, or washing the biologically active surface with an eluent or a washing solution.
  • a microfluidics process is preferably used when a washing solution such as an eluent is introduced to small spots of adsorbents on the biologically active surface.
  • the washing solution can be at a temperature of between 0 and 100°C, preferably between 4 and 37°C, and most preferably between 20 and 24°C.
  • Washing solutions or eluents used to wash the unbound biomolecules from a biologically active surface include, but are not limited to, organic solutions and aqueous solutions such as buffers, wherein a buffer may contain detergents, salts, or reducing agents in appropriate concentrations known to those skilled in the art.
  • Aqueous solutions are preferred for washing biologically active surfaces.
  • exemplary aqueous solutions include, but are not limited to, HEPES buffer, Tris buffer, phosphate buffered saline (PBS), and modifications thereof.
  • the selection of a particular washing solution or an eluent is dependent on other experimental conditions (e.g., types of adsorbents used or biomolecules to be detected), and can be determined by those of skill in the art. For example, if a biologically active surface comprising a quaternary ammonium group as adsorbent (anion exchange surface) is used, then an aqueous solution, such as a Tris buffer, may be preferred. In another example, if a biologically active surface comprising a carboxylate group as adsorbent (cation exchange surface) is used, then an aqueous solution, such as an acetate buffer, may be preferred.
  • an energy absorbing molecule e.g. in solution
  • EAM energy absorbing molecule
  • exemplary energy absorbing molecules include, but are not limited to, cinnamic acid derivatives, sinapinic acid and dihydroxybenzoic acid.
  • adsorbent-bound biomolecules are detected using gas phase ion spectrometry.
  • the quantity and characteristics of a biomolecule can be determined using said method.
  • said biomolecules can be analyzed using a gas phase ion spectrometer such as mass spectrometers, ion mobility spectrometers, or total ion current measuring devices.
  • Other gas phase ion spectrometers known to those skilled in the art are also included.
  • mass spectrometry can be used to detect biomolecules of a given sample present on a biologically active surface.
  • Such methods include, but are not limited to, matrix-assisted laser desorption ionization/time-of-flight (MALDI-TOF), surface-enhanced laser desorption ionization time-of-flight (SELDI-TOF), liquid chromatography coupled with MS, MS-MS, or ESI-MS.
  • biomolecules are analysed by introducing a biologically active surface containing said biomolecules into the spectrometer and ionizing said biomolecules to generate ions that are collected and analysed.
  • the biomolecules present in a sample are detected using gas phase ion spectrometry, and more preferably, using mass spectrometry.
  • mass spectrometry can be used.
  • in matrix-assisted laser desorption ionization (MALDI), the sample is typically quasi-purified to obtain a fraction that essentially consists of a marker using separation methods such as two-dimensional gel electrophoresis or high performance liquid chromatography (HPLC).
  • SELDI surface-enhanced laser desorption/ionization mass spectrometry
  • SELDI uses a substrate comprising adsorbents to capture biomolecules, which can then be directly desorbed and ionized from the substrate surface during mass spectrometry. Since the substrate surface in SELDI captures biomolecules, a sample need not be quasi-purified as in MALDI. However, depending on the complexity of a sample and the type of adsorbents used, it may be desirable to prepare a sample to reduce its complexity prior to SELDI analysis.
  • biomolecules bound to a biologically active surface can be introduced into an inlet system of the mass spectrometer.
  • the biomolecules are then ionized by an ionization source such as a laser, fast atom bombardment, or plasma.
  • the generated ions are then collected by an ion optic assembly, and then a mass analyzer disperses the passing ions.
  • the ions exiting the mass analyzer are detected by a detector and translated into mass-to-charge ratios. Detection of the presence of a biomolecule typically involves detection of its specific signal intensity, and reflects the quantity and character of said biomolecule.
  • a laser desorption time-of-flight mass spectrometer is used with the probe of the present invention.
  • biomolecules bound to a biologically active surface are introduced into an inlet system. Biomolecules are desorbed and ionized into the gas phase by a laser. The ions generated are then collected by an ion optic assembly. These ions are accelerated through a short high voltage field and allowed to drift into a high vacuum chamber of a time-of-flight mass analyzer. At the far end of the high vacuum chamber, the accelerated ions strike a sensitive detector surface at different times. Since the time-of-flight is a function of the mass of the ions, the elapsed time between ionization and impact can be used to identify the presence or absence of molecules of a specific mass.
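The dependence of flight time on mass follows from energy conservation. As a sketch only, assuming an idealized linear instrument in which an ion of mass m and charge ze is accelerated through a single potential U and then drifts over a field-free length L (these symbols are illustrative and do not appear in the original text):

```latex
zeU = \tfrac{1}{2}\, m v^{2}
\quad\Longrightarrow\quad
t = \frac{L}{v} = L \sqrt{\frac{m}{2\,zeU}}
\quad\Longrightarrow\quad
\frac{m}{z} = \frac{2\,eU}{L^{2}}\; t^{2}
```

Under these assumptions the mass-to-charge ratio grows with the square of the flight time, which is why heavier ions reach the detector later.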
  • detection of the biomolecules described herein can be enhanced using certain selectivity conditions (e.g., types of adsorbents used or washing solutions).
  • the same or substantially the same selectivity conditions that were used to discover the biomolecules can be used in the methods for detecting a biomolecule in a sample.
  • the computer program generally contains a readable medium that stores codes. Certain codes can be devoted to memory that includes the location of each feature on a biologically active surface, the identity of the adsorbent at that feature and the elution conditions used to wash the adsorbent. Using this information, the program can then identify the set of features on the biologically active surface defining certain selectivity characteristics (e.g., types of adsorbent and eluents used). The computer also contains codes that receive as input data on the strength of the signal at various molecular masses received from a particular addressable location on the biologically active surface. This data can indicate the number of biomolecules detected, as well as the strength of the signal and the determined molecular mass for each biomolecule detected.
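The stored information described above can be pictured as a small record per addressable location. The following is a minimal sketch, not the software actually used; the class name `Feature`, its fields and the helper functions are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Tuple


@dataclass
class Feature:
    """One addressable location on a biologically active surface (hypothetical layout)."""
    location: str   # identifier of the spot; the naming scheme is an assumption
    adsorbent: str  # identity of the adsorbent at this feature
    eluent: str     # elution conditions used to wash the adsorbent
    # (molecular mass in Da, signal strength) pairs received from this location
    signals: List[Tuple[float, float]] = field(default_factory=list)


def select_features(features: List[Feature], adsorbent: str, eluent: str) -> List[Feature]:
    """Return the features that share the given selectivity characteristics."""
    return [f for f in features if f.adsorbent == adsorbent and f.eluent == eluent]


def summarize(feature: Feature) -> Dict[str, object]:
    """Report how many biomolecules were detected and their (mass, signal) pairs by mass."""
    return {"n_detected": len(feature.signals), "peaks": sorted(feature.signals)}
```

A caller would build one `Feature` per spot, filter with `select_features`, and pass each hit to `summarize`.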
  • Data analysis can include the steps of determining signal strength (e.g., height of peaks) of a biomolecule detected and removing "outliers" (data deviating from a predetermined statistical distribution).
  • the observed peaks can be normalized, a process whereby the height of each peak relative to some reference is calculated.
  • a reference can be background noise generated by instrument and chemicals (e.g., energy absorbing molecule), which is set as zero in the scale.
  • the signal strength detected for each biomolecule can be displayed in the form of relative intensities in the scale desired (e.g., 100).
  • a standard may be admixed with the sample so that a peak from the standard can be used as a reference to calculate relative intensities of the signals observed for each biomolecule or other biomolecules detected.
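The normalization steps above can be sketched as follows. This is an illustrative reading, assuming the background level is subtracted so that noise sits at zero and that the reference is the peak of an admixed standard; the function name and parameters are hypothetical.

```python
from typing import List


def relative_intensities(peak_heights: List[float],
                         background: float,
                         reference_height: float,
                         scale: float = 100.0) -> List[float]:
    """Express peak heights relative to a reference peak on the desired scale.

    The background is subtracted first, so a peak at noise level maps to zero
    and the reference peak maps to `scale`.
    """
    ref = reference_height - background
    if ref <= 0:
        raise ValueError("reference peak must exceed the background")
    return [max(h - background, 0.0) / ref * scale for h in peak_heights]
```

For example, with a background of 2.0 and a reference peak of height 12.0, a peak of height 7.0 is reported as 50 on a 0 to 100 scale.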
  • the computer can transform the resulting data into various formats for displaying.
  • spectrum view: a standard spectral view can be displayed, wherein the view depicts the quantity of a biomolecule reaching the detector at each particular molecular mass.
  • scatter plot: only the peak height and mass information are retained from the spectrum view, yielding a cleaner image and enabling biomolecules with nearly identical molecular mass to be more visible.
  • biomolecules of the invention are biomolecules with an apparent molecular mass of about 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 56
  • the present invention comprises a method for the identification of these proteins, especially by obtaining their amino acid sequence.
  • This method comprises the purification of said proteins from the complex biological sample (blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, tears, saliva, sweat, ascites, cerebrospinal fluid, milk, lymph, or tissue extract samples) by fractionating said sample using techniques known by the one of ordinary skill in the art, most preferably protein chromatography (FPLC, HPLC).
  • the biomolecules of the invention include those proteins with a molecular mass selected from 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da ± 33 Da,
  • the method comprises the analysis of the fractions for the presence and purity of said proteins by the method which was used to identify them as differentially expressed biomolecules, for example two-dimensional gel electrophoresis or SELDI mass spectrometry, but most preferably SELDI mass spectrometry.
  • the method also comprises an analysis of the purified proteins aimed at revealing their amino acid sequence. This analysis may be performed using techniques in mass spectrometry known to those skilled in the art.
  • this analysis may be performed using peptide mass fingerprinting, revealing information about the specific peptide mass profile after proteolytic digestion of the investigated protein.
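Peptide mass fingerprinting compares measured peptide masses with those predicted from a candidate sequence. The sketch below illustrates the prediction step only. It assumes trypsin as the protease (the text says only "proteolytic digestion") with the common rule of cleaving after K or R unless the next residue is P, no missed cleavages, and standard monoisotopic residue masses; the function names are hypothetical.

```python
from typing import List

# Monoisotopic residue masses in Da for the 20 standard amino acids.
MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276, "V": 99.06841,
    "T": 101.04768, "C": 103.00919, "L": 113.08406, "I": 113.08406, "N": 114.04293,
    "D": 115.02694, "Q": 128.05858, "K": 128.09496, "E": 129.04259, "M": 131.04049,
    "H": 137.05891, "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # added once per peptide for the free N- and C-termini


def tryptic_peptides(sequence: str) -> List[str]:
    """Cleave after K or R unless followed by P (assumed trypsin rule)."""
    peptides, start = [], 0
    for i, aa in enumerate(sequence):
        nxt = sequence[i + 1] if i + 1 < len(sequence) else ""
        if aa in "KR" and nxt != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide: str) -> float:
    """Monoisotopic mass of an unmodified, uncharged peptide."""
    return sum(MONO[aa] for aa in peptide) + WATER
```

The list `[peptide_mass(p) for p in tryptic_peptides(seq)]` is the theoretical mass profile that a search engine would compare against the measured one.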
  • this analysis may be preferably performed using post-source-decay (PSD), or MSMS, but most preferably MSMS, revealing mass information about all possible fragments of the investigated protein or proteolytic peptides thereof, leading to the amino acid sequence of the investigated protein or proteolytic peptide thereof.
  • the information revealed by the aforementioned techniques can be used to feed world-wide-web search engines, such as MS Fit (Protein Prospector, http://prospector.ucsf.edu) for information obtained from peptide mass fingerprinting, or MS Tag (Protein Prospector, http://prospector.ucsf.edu) for information obtained from PSD, or Mascot (www.matrixscience.com) for information obtained from MSMS and peptide mass fingerprinting, for the alignment of the obtained results with data available in public protein sequence databases, such as SwissProt (http://us.expasy.org/sprot/), NCBI (http://www.ncbi.nlm.nih.gov/BLAST/), EMBL (http://srs.embl-heidelberg.de:8000/srs5/), which leads to confident information about the identity of said proteins.
  • This information may comprise, if available, the complete amino acid sequence, the calculated molecular mass, the structure
  • the invention provides kits using the methods of the invention as described in the section Diagnostics for the differential diagnosis of colorectal cancer or a non-malignant disease of the large intestine, wherein the kits are used to detect the biomolecules of the present invention.
  • the methods used to detect the biomolecules of the invention can also be used to determine whether a subject is at risk of developing colorectal cancer or a non-malignant disease of the large intestine, or has developed a colorectal cancer or a non-malignant disease of the large intestine.
  • Such methods may also be employed in the form of a diagnostic kit comprising an antibody specific to a biomolecule of the invention or a biologically active surface described herein, which may be conveniently used, for example, in clinical settings to diagnose patients exhibiting symptoms or a family history of a non-steroid dependent cancer.
  • diagnostic kits also include solutions and materials necessary for the detection of a biomolecule of the invention, and instructions to use the kit based on the above-mentioned methods.
  • the biomolecules of the invention include those proteins with a molecular mass selected from 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da ± 33 Da, 6852 Da
  • kits can be used to detect one or more of the differentially present biomolecules as described above in a test sample of a subject.
  • the kits of the invention have many applications.
  • the kits can be used to differentiate whether a subject is healthy or has a precancerous lesion of the large intestine, a colorectal cancer, a metastasized colorectal cancer or a non-malignant disease of the large intestine, thus aiding the diagnosis of colorectal cancer or a non-malignant disease of the large intestine.
  • the kits can be used to identify compounds that modulate expression of said biomolecules.
  • kits comprise an adsorbent on a biologically active surface, wherein the adsorbent is suitable for binding one or more biomolecules of the invention, a denaturation solution for the pre-treatment of a sample, a binding solution, a washing solution or instructions for making a denaturation solution, binding solution, or washing solution, wherein the combination allows for the detection of a biomolecule using gas phase ion spectrometry.
  • kits can be prepared from the materials described in other previously detailed sections (e.g., denaturation buffer, binding buffer, adsorbents, washing solutions, etc.).
  • the kit may comprise a first substrate comprising an adsorbent thereon (e.g., a particle functionalized with an adsorbent) and a second substrate onto which the first substrate can be positioned to form a probe, which is removably insertable into a gas phase ion spectrometer.
  • the kit may comprise a single substrate, which is in the form of a removably insertable probe with adsorbents on the substrate.
  • a kit comprises a binding molecule that specifically binds to a biomolecule related to the invention, a detection reagent, appropriate solutions and instructions on how to use the kit.
  • kits can be prepared from the materials described above, and other materials known to those skilled in the art.
  • a binding molecule used within such a kit may include, but is not limited to, proteins, peptides, nucleotides, nucleic acids, hormones, amino acids, sugars, fatty acids, steroids, polynucleotides, carbohydrates, lipids, or a combination thereof (e.g. glycoproteins, ribonucleoproteins, lipoproteins), compounds or synthetic molecules.
  • a binding molecule used in said kit is an antibody.
  • the kit may optionally further comprise a standard or control information so that the test sample can be compared with the control information standard to determine if the test amount of a marker detected in a sample is a diagnostic amount consistent with a diagnosis of colorectal cancer.
  • the present invention also relates to use 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da ± 33 Da, 6852 Da ± 34 Da, 6897
  • the invention also relates to a method for aiding non-steroid dependent cancer diagnosis, especially colorectal cancer, the method comprising (a) detecting at least one protein marker in a sample, wherein the protein marker is selected from 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 57
  • each recorded measurement reading is accompanied by a margin of deviation.
  • the margin of deviation is exclusively device-specific. That means it is caused by the type of analytical device used which is preferably a mass spectrometer.
  • the accuracy of the recorded measurement reading is specified by a fixed percentage.
  • each disclosed molecular mass represents the average of a range whose limits deviate from that average by about ± 0.5 %.
  • each molecular mass results from the analysis of samples belonging to another type of cancer.
  • the origin of sample, the cellular status, the environmental conditions of the gathered tissue etc. exert an influence on the measurements.
  • the given molecular mass of the biomarkers represents the averaged value which is calculated from the data of numerous samples of each cancer species.
  • measuring errors are also conceivable, for example due to the sample preparation.
  • each recorded measurement reading is overlapping with any others within its margin of deviation.
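The margin of deviation described above can be applied when assigning an observed mass to the disclosed markers. The following is a minimal sketch, assuming a symmetric relative tolerance of 0.5 % around each nominal mass; the function name is hypothetical and only four of the disclosed masses are listed for brevity.

```python
from typing import List, Sequence

# Four of the disclosed nominal marker masses in Da (subset chosen for illustration).
MARKER_MASSES_DA = [2020, 2049, 4830, 4865]


def match_markers(observed_mass: float,
                  markers: Sequence[float] = MARKER_MASSES_DA,
                  rel_tol: float = 0.005) -> List[float]:
    """Return every nominal marker mass whose +/- 0.5 % window contains the observation."""
    return [m for m in markers if abs(observed_mass - m) <= rel_tol * m]
```

Because neighbouring windows can overlap (for instance those around 4830 Da and 4865 Da), the function returns a list rather than a single value, leaving the resolution of ambiguous readings to the caller.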
  • Example 1 Sample collection for colon cancer evaluation.
  • Serum samples were obtained from a total of 151 individuals, which included two different groups of subjects.
  • group I sera were drawn from 57 colon cancer patients undergoing diagnosis and treatment of colon cancer at the Departments of Gastroenterology and Surgery of the Universities of Magdeburg, Erlangen, and Cottbus (all Germany).
  • Serum samples were collected from the patients directly before surgery.
  • a primary diagnosis was made based on endoscopy, ultrasonic testing, and/or other means for the detection of colorectal cancer. In all cases the diagnosis was confirmed by histological evaluation after surgery.
  • follow-up data for all colon cancer patients are currently being collected and will be available for later studies.
  • the non-cancer control group (group II) consisted of 94 subjects with non-malignant disease symptoms of the large intestine (adenoma, inflammation, diverticulosis), who were recruited from the University Hospitals in Magdeburg, Cottbus, and Erlangen. Serum from each subject was taken following colorectal endoscopy, wherein the absence of colorectal cancer was confirmed. Furthermore, all subjects denied a personal history of cancer and were otherwise healthy. Follow-up data for all non-cancer controls are currently being collected and will be available for later studies. In addition, 77 serum samples from healthy blood donors were also collected for test-set analysis. Blood donors are considered to be healthy individuals not suffering from severe diseases.
  • ProteinChip Arrays of the SAX2-type were arranged into a bioprocessor (Ciphergen Biosystems, Inc.), a device that contains up to 12 ProteinChips and facilitates processing of the ProteinChips.
  • the ProteinChips were pre-incubated in the bioprocessor with 200 µl binding buffer (0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5). 10 µl of serum sample was diluted 1:5 in a buffer (7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, 2% ampholine) and again diluted 1:10 in the binding buffer.
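The two dilution steps combine multiplicatively. As a rough sketch, assuming that "1:n" means one part sample in n parts total and that the whole volume is carried into the next step (neither is stated explicitly in the text), the arithmetic is:

```python
from typing import Sequence, Tuple


def serial_dilution(sample_ul: float, factors: Sequence[int]) -> Tuple[float, int]:
    """Return (final volume in microlitres, overall dilution factor).

    Each factor n is read as "1 part in n parts total" (an assumption).
    """
    volume = sample_ul
    overall = 1
    for n in factors:
        volume *= n
        overall *= n
    return volume, overall
```

Under these assumptions `serial_dilution(10, [5, 10])` gives a 50-fold overall dilution of the serum in a nominal 500 µl.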
  • ProteinChips were placed in the ProteinChip Reader (ProteinChip Biology System II, Ciphergen Biosystems, Inc.) and time-of-flight spectra were generated by laser shots collected in the positive mode at laser intensity 215, with the detector sensitivity of 8. Sixty laser shots were averaged per spectrum.
  • 2 x 1 µl matrix solution (a saturated solution of sinapinic acid in 50% acetonitrile, 0.5% trifluoroacetic acid) was applied to the spot. The drop was allowed to air-dry for 10 min after each application of matrix solution.
  • the ProteinChip was placed in the ProteinChip Reader (Biology System II, Ciphergen Biosystems, Inc.) and time-of-flight spectra were generated by laser shots collected in the positive mode at laser intensity 210, with the detector sensitivity of 8. Sixty laser shots were averaged per spectrum. Subsequently, time-of-flight values were correlated to the molecular masses of the standard proteins, and calibration was performed according to the instrument manual.
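Correlating flight times with the known masses of standard proteins amounts to fitting a calibration curve. The sketch below is not the procedure from the instrument manual; it assumes the simple model sqrt(m) = a·t + b, which follows from the quadratic dependence of mass on flight time, and fits a and b by ordinary least squares. Function names are hypothetical.

```python
import math
from typing import Sequence, Tuple


def fit_tof_calibration(times: Sequence[float], masses: Sequence[float]) -> Tuple[float, float]:
    """Least-squares fit of sqrt(m) = a * t + b from calibrant (time, mass) pairs."""
    roots = [math.sqrt(m) for m in masses]
    n = len(times)
    mean_t = sum(times) / n
    mean_r = sum(roots) / n
    sxx = sum((t - mean_t) ** 2 for t in times)
    sxy = sum((t - mean_t) * (r - mean_r) for t, r in zip(times, roots))
    a = sxy / sxx
    b = mean_r - a * mean_t
    return a, b


def mass_from_time(t: float, a: float, b: float) -> float:
    """Convert a flight time to a mass using the fitted calibration."""
    return (a * t + b) ** 2
```

At least two calibrants with distinct flight times are needed; more calibrants spread across the mass range of interest would be expected to give a more stable fit.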
  • Figure 1 shows a comparison of protein mass spectra detected using the above-mentioned SAX2 ProteinChip arrays for samples isolated from patients suffering from non-malignant diseases of the large intestine (e.g., acute or chronic inflammation, adenoma) (C1 and C2) and from patients with colon cancer (T1 and T2).
  • the complete set of patients was randomly divided into a training set and a test set.
  • the training set comprised 54 randomly selected patients with colon cancer and 75 randomly selected patients without colon cancer.
  • the test set comprised 14 randomly selected patients with colon cancer and 19 randomly selected patients without colon cancer. Additionally, a test set comprising 77 sera obtained from healthy blood donors was compiled. This was done in order to test the classification algorithm generated on the basis of the spectra of the subgroup of healthy individuals (see below).
  • the m/z values of all mass spectra selected for the analysis ranged between 2000 Da and 30000 Da, wherein smaller masses were not used since artefacts with the "Energy Absorbing Molecule, EAM" ("Matrix") could not be excluded, and higher masses were not detected under the chosen experimental conditions.
  • the spectra within the training set were normalised according to the intensity of the total ion current, followed by baseline subtraction and automatic peak detection as previously described by Adam et al. (2002) Cancer Research 62: 3609-3614, using the "Biomarker Wizard" tool of the ProteinChip Software Version 3.1 (Ciphergen Biosystems, Inc.).
  • the normalization coefficient generated by normalizing the spectra of the training sets and the cluster information of the training sets generated by the "Biomarker Wizard" tool of the software were saved and used to externally normalize the spectra of the corresponding test sets and to cluster the signals of the corresponding test sets according to the normalization and peak identification of the training sets.
  • the cluster information for each training and test set (containing sample ID and sample group, cluster mass values and cluster signal intensities for each spectrum within the sets) was transformed into an interchangeable data format (a .csv table) using the "Sample group statistics" function of the "Biomarker Wizard" tool of the ProteinChip Software Version 3.1.
  • the data can be analysed by specific software for the generation of regression and classification trees (see examples 5 to 7).
  • classifiers with a binary target variable were constructed: First, as a proof of principle, a classifier was constructed only on the basis of the training set described above. Second, a final classifier was constructed on the basis of all available mass peaks and all colon cancer samples, fusing the corresponding training and test data sets. Third, a 2nd final colon classifier was constructed analogously to the first final colon cancer classifier but excluding the most informative and dominating mass of the first final colon classifier. Fourth, a 3rd final colon classifier was constructed analogously to the first final colon cancer classifier but excluding the most informative and dominating masses of the first and 2nd final colon classifiers.
  • Example 5 Classifier structure.
  • the proof-of-principle classifier employed 71 masses (variables) out of 90 determined signal clusters.
  • Single decision trees consisted of 4 to 9 variables (5 to 10 end nodes), 6 variables being typical, see histogram of Figure 4. Variable importance was roughly deduced by overall improvement, i.e., for each mass we summed the improvement values achieved in the generation of all 50 decision trees of the decision tree ensemble.
  • the masses used by the proof-of-principle classifier are listed in Table 1 (starting with most important masses having high improvement). An overview of the distribution of masses is given in Figure 5.
  • the 1st final classifier for colon cancer employed 75 masses out of 90 determined signal clusters. Single decision trees consisted of more variables than in the proof-of-principle classifier: 9 variables were typical, see histogram of Figure 6. Variable importance was roughly deduced by overall improvement.
  • the masses used by the 1st final classifier are listed in Table 2 (starting with most important masses, i.e. masses with highest improvement values). An overview of the distribution of masses of the 1st final classifier is given in Figure 7.
  • the 2nd final classifier for colon cancer employed 77 masses out of 90 determined signal clusters. Single decision trees consisted of even more variables than in the 1st final classifier: 10 variables were typical, see histogram of Figure 8. Variable importance was roughly deduced by overall improvement.
  • the masses used by the 2nd final classifier are listed in Table 3 (starting with most important masses, i.e. masses with highest improvement values). An overview of the distribution of masses of the 2nd final classifier is given in Figure 9.
  • the 3rd final classifier for colon cancer employed 80 masses out of 90 determined signal clusters. Single decision trees consisted of even more variables than in the 1st final classifier: 10 variables were typical, see histogram of Figure 10. Variable importance was roughly deduced by overall improvement.
  • the masses used by the 3rd final classifier are listed in Table 4 (starting with most important masses, i.e. masses with highest improvement values). An overview of the distribution of masses of the 3rd final classifier is given in Figure 11.
  • the classifiers include all of the differentially expressed biomolecules found in this study.
  • Classification performance is determined for the proof-of-principle classifier on the colon cancer versus endoscopy control test data set as well as on a separate test set consisting of presumably healthy blood donors.
  • the classifier achieved 93% sensitivity and 84% specificity on the cancer versus endoscopy controls test data set and 94% specificity on 77 samples of blood donors.
  • Table 2 Ranking of masses of 1st final classifier by overall improvement.
  • Table 3 Ranking of masses of 2nd final classifier by overall improvement.
  • Table 4 Ranking of masses of 3rd final classifier by overall improvement.
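The ranking by overall improvement used for the tables above can be illustrated with a minimal sketch. It assumes a hypothetical representation in which each decision tree is a list of (mass, improvement) pairs, one pair per split; the improvement values below are invented for illustration and are not taken from the study. The summation over all trees follows the description given for the decision tree ensemble.

```python
from collections import defaultdict


def rank_by_overall_improvement(ensemble):
    """Sum the split improvements per mass over all trees, highest first."""
    totals = defaultdict(float)
    for tree in ensemble:
        for mass, improvement in tree:
            totals[mass] += improvement
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical toy ensemble: each tree is a list of (mass in Da, improvement).
# The improvement values are invented for illustration only.
toy_ensemble = [
    [(6644, 0.30), (3946, 0.10)],
    [(6644, 0.25), (8215, 0.05)],
    [(3946, 0.20), (6644, 0.15)],
]

ranking = rank_by_overall_improvement(toy_ensemble)
for mass, total in ranking:
    print(f"{mass} Da: overall improvement {total:.2f}")
```

In this toy case the 6644 Da mass ranks first because it contributes to splits in all three trees.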

Abstract

The present invention provides biomolecules and the use of these biomolecules for the differential diagnosis of colorectal cancer or a non-malignant disease of the large intestine. In particular the present invention provides methods for detecting biomolecules within a test sample as well as a database comprising mass profiles of biomolecules specific for healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having a colorectal cancer or a metastasised colorectal cancer or subjects having a non-malignant disease of the large intestine. Furthermore, the present invention provides methods for the characterization of said biomolecules using gas phase ion spectrometry. In addition, the present invention provides methods for the identification of said biomolecules provided that they are proteins or polypeptides. The invention further provides kits for the differential diagnosis of colorectal cancer or a non-malignant disease of the large intestine.

Description

Differential Diagnosis of Colorectal Cancer and other Diseases of the Colon
The present invention provides biomolecules and the use of these biomolecules for the differential diagnosis of colorectal cancer or a non-malignant disease of the large intestine. In specific embodiments, the biomolecules are characterised by mass profiles generated by contacting a test and/or biological sample with an anion exchange surface under specific binding conditions and detecting said biomolecules using gas phase ion spectrometry. The biomolecules used according to the invention are preferably proteins or polypeptides. Furthermore, preferred test and/or biological samples are blood serum samples and are of human origin.
BACKGROUND TO THE INVENTION
Colorectal cancer is the fourth most common cancer in the world to date, and accounts for approximately 200,000 deaths per year in Europe and the US alone. Although colorectal cancer generally affects both men and women equally (currently at 9.4% and 10.1% of incident cancer, respectively), its distribution as a leading cause of death in men and women is disproportionate. Whereas colorectal cancer is the fourth leading cancer-related cause of death in men (following lung, stomach and prostate cancer), in women it takes second place to breast cancer. Furthermore, colorectal cancer is more prevalent in developed countries exhibiting more westernised lifestyle practices.
Familial and hereditary factors have been observed to play primary roles in the cause of colorectal cancers. In addition, a number of other factors have been shown to be associated with an increased risk of developing colorectal cancer, namely the presence of adenomatous polyps, a history or presence of inflammatory bowel disease, diets rich in animal fats, and significantly decreased consumption of raw or fresh vegetables (especially leafy green vegetables, cruciferous vegetables, as well as allium vegetables such as garlic, onions, chives).
Significant differences exist regarding the survival of patients affected by colorectal cancer according to the stages at which the disease is diagnosed. Most patients exhibit symptoms such as rectal bleeding, pain, abdominal distension or weight loss only after the disease is in its advanced stages, leaving few therapeutic options available. Clearly, early detection of primary, metastatic, and recurrent disease can significantly impact the prognosis of individuals suffering from colorectal cancer. Diagnosis at an early stage, prior to lymph-node spread, can significantly improve the rate of survival as compared to a diagnosis established at a later stage of the disease, since the therapies used to treat colorectal cancer are stage-dependent.
To date, the fecal occult blood test (FOBT), flexible sigmoidoscopy, double contrast barium enema, and colonoscopy are the primary tools utilised to detect colorectal cancer at its early stages. Among these, only FOBT, which is based on the high probability that blood found within a patient's (heme-positive) fecal sample arises from tumours found within the large intestine, is non-invasive, simple and relatively inexpensive. Unfortunately, this method of early detection has several drawbacks.
Firstly, a positive FOBT result leads to further examination, mainly colonoscopy, an extremely uncomfortable, invasive diagnostic method which is expensive and carries a serious complication rate of one per 5,000 examinations. Colonoscopy, as a follow-up diagnostic method, might prove to be effective in confirming colorectal cancer within a patient provided that the FOBT results indeed reflect the presence of the disease. Unfortunately, more often than not this is not the case, since only 12% of the patients with a heme-positive fecal sample are diagnosed with cancer or large polyps at the time of colonoscopy. Furthermore, physicians frequently fail to properly instruct their patients on how fecal samples should be collected. Normally, patients are told to adhere to specific dietary guidelines and to avoid taking medication known to induce gastrointestinal bleeding. Should the patient not be instructed properly, or not adhere to the strict protocol, the chance of obtaining a false-positive FOBT result is greatly increased. The false-positive FOBT result will subsequently send the patient for a confirmatory diagnosis, which is neither necessary, inexpensive, nor pleasant. Secondly, a false-negative result holds even greater consequence, since a patient with colorectal cancer would, in this case, not be diagnosed as having the disease and would be sent home without proper therapy.
Currently, many groups are utilising proteomic technologies to comparatively analyse the differences in protein levels in colorectal cancers vs. normal large intestinal tissue in the hopes of developing diagnostic markers that could assist the practicing clinician in the management of colorectal cancer. Currently, the standard method of proteome analysis has been two dimensional (2D) gel electrophoresis, which has been an invaluable tool for the separation and identification of proteins. This method is also effective in identifying aberrantly expressed proteins in a variety of tissue samples. Unfortunately, the analysis of data generated by 2D-gel electrophoresis is labour-intensive and requires large quantities of material for protein analysis, thereby rendering it impractical for routine clinical use.
Through the introduction of SELDI (surface enhanced laser desorption ionization), a modification of MALDI-TOF (matrix-assisted laser desorption ionization/time of flight), which is a mass spectrometry technique that allows for the simultaneous analysis of multiple proteins in one sample, a more practical tool has become available. Small amounts of proteins can be directly bound to a biochip, carrying spots with different types of chromatographic material, including those with hydrophobic, hydrophilic, cation-exchanging and anion-exchanging characteristics. This approach has been proven to be very useful to identify proteins and protein patterns (profiles) in various biological fluids, including serum, urine or pancreatic juice.
To date, specific biomarkers for the detection of breast and prostate cancers (patents WO0223200, WO03058198 and WO0125791 from Ciphergen, respectively) have been identified using the above-mentioned SELDI technology. Unfortunately, due to the nature of sample testing, the biomarkers identified can only be used to diagnose a patient as having a specific cancer (either breast or prostate) versus not having the disease at all. For example, whereas the test samples analysed in WO03058198 (Ciphergen) and WO0223200 (Ciphergen) were taken from patients with late-stage breast cancer (stages III and IV), the control samples were taken from patients with undetectable breast cancer. The biomarkers identified are neither grade-specific nor can they detect the disease at its earliest stages (stage I and II), and thereby would not allow for effective patient-specific treatment of the disease. Moreover, biomarkers that can differentiate between the presence of a colorectal cancer, a non-malignant disease of the large intestine, or an acute and chronic inflammation of the epithelium have not yet been identified.
Accordingly, there is a critical need to develop a simple, non-invasive, reliable and inexpensive method for the effective detection of colorectal cancer at its early stages. Preferably, such a diagnostic method should be able to detect early-stage colorectal cancer, as well as distinguish between the later stages or grades of the disease. With such valuable information, medical practitioners would be able to tailor patient therapies for optimum treatment of the disease.
The present invention addresses this difficulty with the development of a non-invasive diagnostic tool for the differential diagnosis of colorectal cancer and non-malignant diseases of the large intestine.
SUMMARY OF THE INVENTION
The present invention relates to methods for the differential diagnosis of colorectal cancer or non-malignant disease of the large intestine by detecting one or more differentially expressed biomolecules within a test sample of a given subject, comparing results with samples from healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having a colorectal cancer, subjects having a metastasised colorectal cancer, or subjects having a non-malignant disease of the large intestine, wherein the comparison allows for the differential diagnosis of a subject as healthy, having a precancerous lesion of the large intestine, having a colorectal cancer, having a metastasised colorectal cancer or a non-malignant disease of the large intestine.
The present invention provides a method for the differential diagnosis of a colorectal cancer and/or a non-malignant disease of the large intestine, in vitro, comprising obtaining a test sample from a subject, contacting the test sample with a biologically active surface under specific binding conditions, allowing for biomolecules present within the test sample to bind to the biologically active surface, detecting one or more bound biomolecules using mass spectrometry thereby generating a mass profile of said test sample, transforming data into a computer-readable form, and comparing said mass profile against a database containing mass profiles specific for healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having colorectal cancer, subjects having metastasised colorectal cancers, or subjects having a non-malignant disease of the large intestine, wherein the comparison allows for the differential diagnosis of a subject as healthy, having a precancerous lesion of the large intestine, having a colorectal cancer, having a metastasised colorectal cancer or a non-malignant disease of the large intestine.
In one embodiment the invention provides a database comprising mass profiles of biological samples from healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having a colorectal cancer, subjects having a metastasised colorectal cancer, or subjects having a non-malignant disease of the large intestine.
Within the same embodiment the database is generated by obtaining biological samples from healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having a colorectal cancer, subjects having a metastasised colorectal cancer, and subjects having a non-malignant disease of the large intestine, contacting said biological samples with a biologically active surface under specific binding conditions, allowing the biomolecules within the biological sample to bind to said biologically active surface, detecting one or more bound biomolecules using mass spectrometry thereby generating a mass profile of said biological samples, transforming data into a computer-readable form, and applying a mathematical algorithm to classify the mass profiles as specific for healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having colorectal cancer, subjects having metastasised colorectal cancer, and subjects having a non-malignant disease of the large intestine.
In specific embodiments, the present invention provides biomolecules having a molecular mass selected from the group consisting of 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da ± 33 Da, 6852 Da ± 34 Da, 6897 Da ± 34 Da, 6999 Da ± 35 Da, 7575 Da ± 38 Da, 7657 Da ± 38 Da, 8076 Da ± 40 Da, 8215 Da ± 41 Da, 8474 Da ± 42 Da, 8574 Da ± 43 Da, 8702 Da ± 44 Da, 8780 Da ± 44 Da, 8922 Da ± 45 Da, 9078 Da ± 45 Da, 9143 Da ± 46 Da, 9201 Da ± 46 Da, 9359 Da ± 47 Da, 9425 Da ± 47 Da, 9581 Da ± 48 Da, 9641 Da ± 48 Da, 9718 Da ± 49 Da, 9930 Da ± 50 Da, 10215 Da ± 51 Da, 10369 Da ± 52 Da, 10440 Da ± 52 Da, 10594 Da ± 53 Da, 11216 Da ± 56 Da, 11464 Da ± 57 Da, 11547 Da ± 58 Da, 11693 Da ± 58 Da, 11905 Da ± 60 Da, 12470 Da ± 62 Da, 12619 Da ± 63 Da, 12828 Da ± 64 Da, 13290 Da ± 66 Da, 13632 Da ± 68 Da, 13784 Da ± 69 Da, 13983 Da ± 70 Da, 14798 Da ± 74 Da, 15005 Da ± 75 Da, 15140 Da ± 76 Da, 15350 Da ± 77 Da, 15879 Da ± 79 Da, 15957 Da ± 80 Da, 16104 Da ± 81 Da, 16164 Da ± 81 Da, 16953 Da ± 85 Da, 17263 Da ± 86 Da, 17397 Da ± 87 Da, 17617 Da ± 88 Da, 17766 Da ± 89 Da, 17890 Da ± 89 Da, 18115 Da ± 91 Da, 18390 Da ± 92 Da, 22338 Da ± 112 Da, 22466 Da ± 112 Da, 22676 Da ± 113 Da, 22951 Da ± 115 Da, 24079 Da ± 120 Da, 28055 Da ± 140 Da, and 28259 Da ± 141 Da. The biomolecules having said molecular masses are detected by contacting a test and/or biological sample with a biologically active surface comprising an adsorbent under specific binding conditions and further analysed by gas phase ion spectrometry.
Preferably the adsorbent used is comprised of positively charged quaternary ammonium groups (anion exchange surface).
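As an illustration of how the mass windows listed above might be applied to an observed peak, the following minimal sketch checks an observed mass against a small subset of the listed (mass, tolerance) pairs. The function name and the choice of subset are hypothetical. Because some adjacent windows overlap (for example 4830 Da ± 24 Da and 4865 Da ± 24 Da share the range 4841 Da to 4854 Da), the sketch returns every matching mass rather than a single one.

```python
# Subset of (mass, tolerance) pairs in Da, copied from the list above.
MARKER_MASSES = [
    (2020, 10),
    (4830, 24),
    (4865, 24),
    (6644, 33),
    (28259, 141),
]


def match_peak(observed_mass, markers=MARKER_MASSES):
    """Return all listed masses whose window contains the observed mass."""
    return [mass for mass, tol in markers if abs(observed_mass - mass) <= tol]


print(match_peak(6650.0))   # falls inside the 6644 Da window
print(match_peak(4850.0))   # falls inside two overlapping windows
print(match_peak(7000.0))   # matches nothing in this subset
```

Treating the window boundaries as inclusive is an assumption of the sketch; the text does not state how boundary values are handled.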
In specific embodiments, the invention provides specific binding conditions for the detection of biomolecules within a sample. In preferred embodiments, a sample is diluted 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% Ampholine, and then diluted again 1:10 in binding buffer consisting of 0.1 M Tris-HCl, 0.02% Triton X-100 at pH 8.5, at 0 to 4°C. The treated sample is then contacted with a biologically active surface comprising positively charged (cationic) quaternary ammonium groups (anion exchanging), incubated for 120 minutes at 20 to 24°C, and the bound biomolecules are detected using gas phase ion spectrometry.
In an alternative embodiment, the invention provides a method for the differential diagnosis of a colorectal cancer and/or a non-malignant disease of the large intestine comprising detecting one or more differentially expressed biomolecules within a sample. This method comprises obtaining a test sample from a subject, contacting said sample with a binding molecule specific for a differentially expressed polypeptide, detecting an interaction between the binding molecule and its specific polypeptide, wherein the detection of an interaction indicates the presence or absence of said polypeptide, thereby allowing for the differential diagnosis of a subject as healthy, having a precancerous lesion of the large intestine, having a colorectal cancer, having a metastasised colorectal cancer and/or a non-malignant disease of the large intestine. Preferably, binding molecules are antibodies specific for said polypeptides.
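The two dilution steps described above combine multiplicatively, which the following short sketch works through. It assumes that "1:5" and "1:10" denote one part sample in five and ten parts total volume respectively, and that the whole first dilution is carried into the second step; neither convention is stated in the text. The 10 µl starting volume is the serum volume used in the examples, and the function names are hypothetical.

```python
def total_dilution(*steps):
    """Multiply successive dilution factors (e.g. 5 for a 1:5 step)."""
    factor = 1
    for step in steps:
        factor *= step
    return factor


def final_volume_ul(sample_ul, *steps):
    """Final volume if the entire sample is carried through every step."""
    return sample_ul * total_dilution(*steps)


overall = total_dilution(5, 10)        # 1:5 followed by 1:10
volume = final_volume_ul(10, 5, 10)    # starting from 10 µl of serum
print(f"overall dilution 1:{overall}, final volume {volume} µl")
```

Under these assumptions the serum is diluted 1:50 overall before it is contacted with the anion exchange surface.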
The biomolecules related to the invention have a molecular mass selected from the group consisting of 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da ± 33 Da, 6852 Da ± 34 Da, 6897 Da ± 34 Da, 6999 Da ± 35 Da, 7575 Da ± 38 Da, 7657 Da ± 38 Da, 8076 Da ± 40 Da, 8215 Da ± 41 Da, 8474 Da ± 42 Da, 8574 Da ± 43 Da, 8702 Da ± 44 Da, 8780 Da ± 44 Da, 8922 Da ± 45 Da, 9078 Da ± 45 Da, 9143 Da ± 46 Da, 9201 Da ± 46 Da, 9359 Da ± 47 Da, 9425 Da ± 47 Da, 9581 Da ± 48 Da, 9641 Da ± 48 Da, 9718 Da ± 49 Da, 9930 Da ± 50 Da, 10215 Da ± 51 Da, 10369 Da ± 52 Da, 10440 Da ± 52 Da, 10594 Da ± 53 Da, 11216 Da ± 56 Da, 11464 Da ± 57 Da, 11547 Da ± 58 Da, 11693 Da ± 58 Da, 11905 Da ± 60 Da, 12470 Da ± 62 Da, 12619 Da ± 63 Da, 12828 Da ± 64 Da, 13290 Da ± 66 Da, 13632 Da ± 68 Da, 13784 Da ± 69 Da, 13983 Da ± 70 Da, 14798 Da ± 74 Da, 15005 Da ± 75 Da, 15140 Da ± 76 Da, 15350 Da ± 77 Da, 15879 Da ± 79 Da, 15957 Da ± 80 Da, 16104 Da ± 81 Da, 16164 Da ± 81 Da, 16953 Da ± 85 Da, 17263 Da ± 86 Da, 17397 Da ± 87 Da, 17617 Da ± 88 Da, 17766 Da ± 89 Da, 17890 Da ± 89 Da, 18115 Da ± 91 Da, 18390 Da ± 92 Da, 22338 Da ± 112 Da, 22466 Da ± 112 Da, 22676 Da ± 113 Da, 22951 Da ± 115 Da, 24079 Da ± 120 Da, 28055 Da ± 140 Da, or 28259 Da ± 141 Da, and may include, but are not limited to, molecules comprising nucleotides, amino acids, sugars, fatty acids, steroids, nucleic acids, polynucleotides (DNA or RNA), polypeptides, proteins, antibodies, carbohydrates, lipids, and combinations thereof (e.g., glycoproteins, ribonucleoproteins, lipoproteins). Preferably said biomolecules are proteins, polypeptides, or fragments thereof.
In yet another embodiment, the invention provides a method for the identification of biomolecules within a sample, provided that the biomolecules are proteins, polypeptides or fragments thereof, comprising: chromatography and fractionation, analysis of fractions for the presence of said differentially expressed proteins and/or fragments thereof, using a biologically active surface, further analysis using mass spectrometry to obtain amino acid sequences encoding said proteins and/or fragments thereof, and searching amino acid sequence databases of known proteins to identify said differentially expressed proteins by amino acid sequence comparison. Preferably the method of chromatography is high performance liquid chromatography (HPLC) or fast protein liquid chromatography (FPLC). Furthermore, the mass spectrometry used is selected from the group of matrix-assisted laser desorption ionization time of flight (MALDI-TOF), surface enhanced laser desorption ionisation time of flight (SELDI-TOF), liquid chromatography, MS-MS, or ESI-MS.
Furthermore, the invention provides kits for the differential diagnosis of a colorectal cancer and/or a non-malignant disease of the colon.
The test or biological samples used according to the invention may be of blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract origin. Preferably, the test and/or biological samples are blood serum samples, and are isolated from subjects of mammalian origin, preferably of human origin.
A colorectal cancer of the invention is a cancer of the large intestine, and may include cancers of the colon, rectum etc. Furthermore, a colorectal cancer, as intended by the invention, may be of various stages and/or grades.
DESCRIPTION OF FIGURES
Figure 1. Comparison of protein mass spectra processed on the anion exchange surface of a SAX2 ProteinChip array comprised of cationic quaternary ammonium groups. Protein mass spectra obtained from sera of endoscopy control patients (C1 and C2), suffering from non-malignant diseases of the large intestine (e.g., acute or chronic inflammation, adenoma) and of patients with colon cancer (T1 and T2) are shown. Scattered boxes indicate differentially expressed proteins with high diagnostic significance. A representative differentially expressed protein (m/z = 6645 Da) is highlighted possessing high importance within the generated classifiers (ensemble of decision trees) according to overall improvement, see Tables 1-4. The X-axis shows the mass/charge (m/z) ratio, which is equivalent to the apparent molecular mass of the corresponding biomolecule. The Y-axis shows the normalized relative signal intensity of the peak in the examined serum samples.
Figure 2A - F. Scatter plots of clusters (peaks, variables), belonging to differentially expressed proteins included in the four classifiers. The X-axis shows the mass/charge (m/z) ratio, which is equivalent to the apparent molecular mass of the corresponding biomolecule. The Y-axis shows the logarithmic normalized relative signal intensity of the peaks in the examined serum samples. First, intensities were shifted to yield entirely positive values. Then, for each mass, intensities were normalized by dividing the intensity values by the average intensity of that mass. Finally, the natural logarithm was taken. □ T (Tumour): Colon cancer patients' serum samples. o N (Normal): Endoscopy control patients' serum samples.
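The three-step intensity transformation described for Figure 2 (shift, division by the per-mass average, natural logarithm) can be sketched as follows. The size of the shift is not stated in the text; this sketch assumes a single global shift that moves the smallest intensity to a hypothetical offset of 1.0 whenever non-positive values occur. The input layout (a dictionary mapping each mass to its intensities across samples) and the example values are likewise assumptions.

```python
import math


def log_normalize(intensities_by_mass, offset=1.0):
    """Shift to positive values, divide by the per-mass mean, take ln."""
    global_min = min(min(values) for values in intensities_by_mass.values())
    # Assumed rule: shift only if non-positive intensities are present.
    shift = offset - global_min if global_min <= 0 else 0.0
    result = {}
    for mass, values in intensities_by_mass.items():
        shifted = [value + shift for value in values]
        mean = sum(shifted) / len(shifted)
        result[mass] = [math.log(value / mean) for value in shifted]
    return result


# Invented example intensities for two masses across three samples.
example = {6644: [-0.5, 1.5, 2.0], 3946: [4.0, 4.0, 4.0]}
normalized = log_normalize(example)
print(normalized)
```

After this transformation a value of zero corresponds to the average intensity of that mass, so positive and negative values indicate above-average and below-average signals.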
Figure 3A - F. Additionally scaled scatter plots of clusters (peaks, variables), belonging to differentially expressed proteins included in the four classifiers. The X-axis shows the mass/charge (m/z) ratio, which is equivalent to the apparent molecular mass of the corresponding biomolecule. As in Figure 2, the Y-axis shows the logarithmic normalized relative signal intensity of the peaks in the examined serum samples. However, intensities were additionally (shifted and) scaled so that the intensities of each mass cover the entire range of the Y-axis. Thereby, the minimum and maximum intensities of all masses are aligned on the lower and upper edge of the plot, respectively. This allows better visualization of the extent of class overlap. □ T (Tumour): Colon cancer patients' serum samples. o N (Normal): Endoscopy control patients' serum samples.
Figure 4. Complexity of proof-of-principle classifier. The histogram visualizes the distribution of the number of decision tree variables (peaks, clusters) for the obtained proof-of-principle classifier for gastric cancer. 6 variables per decision tree are typical.
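The additional scaling described for Figure 3, in which the intensities of each mass are stretched to cover the entire Y-axis range, corresponds to a per-mass min-max rescaling. A minimal sketch is given below. The target range of 0 to 1 and the handling of a mass with constant intensity are assumptions, since the text specifies neither.

```python
def rescale_to_full_range(values, low=0.0, high=1.0):
    """Map the minimum of values to low and the maximum to high."""
    vmin, vmax = min(values), max(values)
    if vmax == vmin:
        # Assumed handling: a constant mass is placed on the lower edge.
        return [low for _ in values]
    span = vmax - vmin
    return [low + (value - vmin) * (high - low) / span for value in values]


# Invented log-normalized intensities of one mass across three samples.
scaled = rescale_to_full_range([-1.0, 0.0, 1.0])
print(scaled)
```

Applying the function separately to every mass aligns all minima on the lower edge and all maxima on the upper edge of the plot, as stated for Figure 3.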
Figure 5. Variable importance of the proof-of-principle classifier. The histograms visualize how often a variable (mass) is employed in the proof-of-principle classifier. The frequency of variable selection is presented in histogram form for each hierarchical level (a-j) and for all hierarchical levels taken together (k).
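The counting behind such a histogram can be sketched as follows. The sketch assumes a hypothetical representation in which each decision tree is a list of (hierarchical level, mass) pairs, with level 1 denoting the root split; the toy ensemble is invented for illustration. It returns one counter per hierarchical level together with a counter over all levels taken together.

```python
from collections import Counter


def selection_frequencies(ensemble):
    """Count how often each mass is used, per level and over all levels."""
    per_level = {}
    overall = Counter()
    for tree in ensemble:
        for level, mass in tree:
            per_level.setdefault(level, Counter())[mass] += 1
            overall[mass] += 1
    return per_level, overall


# Hypothetical toy ensemble: (hierarchical level, mass in Da) per split.
toy_trees = [
    [(1, 6644), (2, 3946), (2, 8215)],
    [(1, 6644), (2, 3946)],
    [(1, 3946), (2, 6644)],
]

per_level, overall = selection_frequencies(toy_trees)
print(per_level[1].most_common())
print(overall.most_common())
```

A mass that is frequently selected at the first hierarchical level separates the classes early in many trees, which is one informal reading of the per-level histograms.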
Figure 6. Complexity of 1st final classifier. The histogram visualizes the distribution of the number of decision tree variables (peaks, clusters) for the obtained 1st final classifier in the range of 1 to 10 decision tree variables. 9 variables per decision tree are typical.
Figure 7. Variable importance of 1st final classifier. The histogram visualizes how often a variable (mass) is employed in the final classifier. The frequency of variable selection is presented in histogram form for each of the first 10 hierarchical levels (a-j) and for the first ten hierarchical levels taken together (k).
Figure 8. Complexity of 2nd final classifier. The histogram visualizes the distribution of the number of decision tree variables (peaks, clusters) for the obtained 2nd final classifier in the range of 1 to 10 decision tree variables. As many as 10 variables per decision tree are typical.
Figure 9. Variable importance of 2nd final classifier. The histogram visualizes how often a variable (mass) is employed in the 2nd final classifier. The frequency of variable selection is presented in histogram form for each of the first 10 hierarchical levels (a-j) and for the first ten hierarchical levels taken together (k).
Figure 10. Complexity of 3rd final classifier. The histogram visualizes the distribution of the number of decision tree variables (peaks, clusters) for the obtained 3rd final classifier in the range of 1 to 10 decision tree variables. As many as 10 variables per decision tree are typical.
Figure 11. Variable importance of 3rd final classifier. The histogram visualizes how often a variable (mass) is employed in the 3rd final classifier. The frequency of variable selection is presented in histogram form for each of the first 10 hierarchical levels (a-j) and for the first ten hierarchical levels taken together (k).
DESCRIPTION OF THE INVENTION
It is to be understood that the present invention is not limited to the particular materials, methods, or equipment described, as these may vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present invention, which will be limited only by the appended claims.
It should be noted that as used herein and in the appended claims, the singular forms "a," "an," and "the" include plural reference unless the context clearly dictates otherwise. Thus, for example, a reference to "an antibody" is a reference to one or more antibodies and derivatives thereof known to those skilled in the art, and so forth.
Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of ordinary skill in the art. Although any materials and methods, or equipment comparable to those specifically described herein can be used to practice or test the present invention, the preferred equipment, materials and methods are described below. All publications mentioned herein are cited for the purpose of describing and disclosing protocols, reagents, and current state of the art technologies that might be used in connection with the invention. Nothing herein is to be construed as an admission that the invention is not entitled to precede such disclosure by virtue of prior invention.
Definitions
The term "biomolecule" refers to a molecule produced by a cell or living organism. Such molecules include, but are not limited to, molecules comprising nucleotides, amino acids, sugars, fatty acids, steroids, nucleic acids, polynucleotides, polypeptides, proteins, carbohydrates, lipids, and combinations thereof (e.g., glycoproteins, ribonucleoproteins, lipoproteins). Furthermore, the terms "nucleotide" or "polynucleotide" refer to a nucleotide, oligonucleotide, polynucleotide, or any fragment thereof. These phrases also refer to DNA or RNA of genomic or synthetic origin which may be single-stranded or double-stranded and may represent the sense, or the antisense strand, to peptide polynucleotide sequences (i.e. peptide nucleic acids; PNAs), or to any DNA-like or RNA-like material.
The term "fragment" refers to a portion of a polypeptide (parent) sequence that comprises at least 10 consecutive amino acid residues and retains a biological activity and/or some functional characteristics of the parent polypeptide e.g. antigenicity or structural domain characteristics.
The terms "biological sample" and "test sample" refer to all biological fluids and excretions isolated from any given subject. In the context of the invention such samples include, but are not limited to, blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract samples.
The term "specific binding" refers to the binding reaction between a biomolecule and a specific "binding molecule". Related to the invention are binding molecules that include, but are not limited to, proteins, peptides, nucleotides, nucleic acids, hormones, amino acids, sugars, fatty acids, steroids, polynucleotides, carbohydrates, lipids, or a combination thereof (e.g. glycoproteins, ribonucleoproteins, lipoproteins). Furthermore, a binding reaction is considered to be specific when the interaction between said molecules is substantial. In the context of the invention, a binding reaction is considered substantial when the reaction that takes place between said molecules is at least two times the background. Moreover, the term "specific binding conditions" refers to reaction conditions that permit the binding of said molecules such as pH, salt, detergent and other conditions known to those skilled in the art.
The term "interaction" relates to the direct or indirect binding or alteration of biological activity of a biomolecule.
The term "differential diagnosis" refers to a diagnostic decision between a healthy and different disease states, including various stages of a specific disease. A subject is diagnosed as healthy or to be suffering from a specific disease, or a specific stage of a disease based on a set of hypotheses that allow for the distinction between healthy and one or more stages of the disease. The choice between healthy and one or more stages of disease depends on a significant difference between each hypothesis. Under the same principle, a "differential diagnosis" may also refer to a diagnostic decision between one disease type as compared to another (e.g. colon cancer vs. diverticulosis).
The term "colorectal cancer" refers to a cancer state associated with the large intestine of any given subject, wherein the cancer state is defined according to its stage and/or grade. The various stages of a cancer may be identified using staging systems known to those skilled in the art [e.g. Union Internationale Contre Cancer (UICC) system or American Joint Committee on Cancer (AJC)]. In the context of the invention colorectal cancers include but are not limited to colon and rectal cancers.
The term "non-malignant disease of the large intestine" refers to alterations in the physiological, functional and/or anatomical state of the large intestine, wherein the alterations deviate from normal. In addition, this term encompasses alterations in the physiological, functional and/or anatomical state of the large intestine that cannot be staged or graded according to cancer staging systems known to those skilled in the art [e.g. Union Internationale Contre Cancer (UICC) system or American Joint Committee on Cancer (AJC)]. Such non-malignant diseases include, but are not limited to, acute and chronic inflammation of the large intestinal epithelium, diverticular disease including diverticulosis and diverticulitis, colitis, ulcerative colitis, pancolitis, Crohn's disease (ileitis), proctitis, and intestinal polyps including hyperplastic polyps, hamartomatous polyps (i.e. Juvenile polyps, Peutz-Jeghers polyps), inflammatory polyps, lymphoid polyps, and adenomatous polyps.
The term "healthy individual" refers to a subject possessing good health. Such a subject demonstrates an absence of any disease within the large intestine, preferably a colorectal cancer or a non-malignant disease of the large intestine.
The term "precancerous lesion of the large intestine" refers to a biological change within a cell and/or tissue of the large intestine such that said cell and/or tissue becomes susceptible to the development of a cancer. More specifically, a precancerous lesion of the large intestine is a preliminary stage of a colorectal cancer (i.e. dysplasia). Causes of a precancerous lesion of the large intestine may include, but are not limited to, genetic predisposition and exposure to cancer-causing agents (carcinogens); such cancer-causing agents include agents that cause genetic damage and induce neoplastic transformation of a cell. Furthermore, the phrase "neoplastic transformation of a cell" refers to an alteration in normal cell physiology and includes, but is not limited to, self-sufficiency in growth signals, insensitivity to growth-inhibitory (anti-growth) signals, evasion of programmed cell death (apoptosis), limitless replicative potential, sustained angiogenesis, and tissue invasion and metastasis.
The term "dysplasia" refers to morphological alterations within a tissue, which are characterised by a loss in the uniformity of individual cells, as well as a loss in their architectural orientation. Furthermore, dysplastic cells also exhibit a variation in size and shape.
The phrase "differentially present" refers to differences in the quantity of a biomolecule (of a particular apparent molecular mass) present in a sample from a subject as compared to a comparable sample. For example, a biomolecule is present at an elevated level, a decreased level or absent in samples of subjects having colorectal cancer compared to samples of subjects who do not have a cancer of the large intestine. Therefore in the context of the invention, the term "differentially present biomolecule" refers to the quantity of a biomolecule (of a particular apparent molecular mass) present within a sample taken from a subject having a disease or cancer of the large intestine as compared to a comparable sample taken from a healthy subject. Within the context of the invention, a biomolecule is differentially present between two samples if the quantity of said biomolecule in one sample is statistically significantly different from the quantity of said biomolecule in another sample.
The term "diagnostic assay" can be used interchangeably with "diagnostic method" and refers to the detection of the presence or nature of a pathologic condition. Diagnostic assays differ in their sensitivity and specificity. Within the context of the invention the sensitivity of a diagnostic assay is defined as the percentage of diseased subjects who test positive for a colorectal cancer or a non-malignant disease of the large intestine and are considered "true positives". Subjects having a colorectal cancer or a non-malignant disease of the large intestine but not detected by the diagnostic assay are considered "false negatives". Subjects who are not diseased and who test negative in the diagnostic assay are considered "true negatives". Furthermore, the term specificity of a diagnostic assay, as used herein, is defined as 1 minus the false positive rate, where the "false positive rate" is defined as the proportion of those subjects devoid of a colorectal cancer or a non-malignant disease of the large intestine but who test positive in said assay.
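The sensitivity and specificity definitions above can be written out as a short calculation. The following is a minimal sketch in Python; the function names and the subject counts are hypothetical and serve only to illustrate the arithmetic.

```python
def sensitivity(true_pos: int, false_neg: int) -> float:
    """Fraction of diseased subjects who test positive."""
    return true_pos / (true_pos + false_neg)


def specificity(true_neg: int, false_pos: int) -> float:
    """1 minus the false positive rate among non-diseased subjects."""
    false_positive_rate = false_pos / (false_pos + true_neg)
    return 1.0 - false_positive_rate


# Hypothetical counts, chosen only for illustration (not data from this document).
sens = sensitivity(true_pos=45, false_neg=5)
spec = specificity(true_neg=80, false_pos=20)
print(f"sensitivity = {sens:.2f}, specificity = {spec:.2f}")
```

Expressed as a percentage, as in the definition above, the sensitivity is simply the returned fraction multiplied by 100.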
The term "adsorbent" refers to any material that is capable of accumulating (binding) a biomolecule. The adsorbent typically coats a biologically active surface and is composed of a single material or a plurality of different materials that are capable of binding a biomolecule. Such materials include, but are not limited to, anion exchange materials, cation exchange materials, metal chelators, polynucleotides, oligonucleotides, peptides, antibodies, etc.
The term "biologically active surface" refers to any two- or three-dimensional extension of a material that biomolecules can bind to, or interact with, due to the specific biochemical properties of this material and those of the biomolecules. Such biochemical properties include, but are not limited to, ionic character (charge), hydrophobicity, or hydrophilicity.
The term "binding molecule" refers to a molecule that displays an affinity for another molecule. Within the context of the invention such molecules may include, but are not limited to, nucleotides, amino acids, sugars, fatty acids, steroids, nucleic acids, polypeptides, carbohydrates, lipids, and combinations thereof (e.g. glycoproteins, ribonucleoproteins, lipoproteins). Preferably, such binding molecules are antibodies.
The term "solution" refers to a homogeneous mixture of two or more substances. Solutions may include, but are not limited to buffers, substrate solutions, elution solutions, wash solutions, detection solutions, standardisation solutions, chemical solutions, solvents, etc. Furthermore, other solutions known to those skilled in the art are also included herein.
The term "mass profile" refers to a mass spectrum as a characteristic property of a given sample or a group of samples, especially when compared to the mass profile of a second sample or group of samples in any way different from the first sample or group of samples. In the context of the invention, the mass profile is obtained by treating the biological sample as follows. The sample is diluted 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% ampholine and subsequently diluted 1:10 in binding buffer consisting of 0.1 M Tris-HCl, 0.02% Triton X-100 at pH 8.5. The thus pre-treated sample is applied to a biologically active surface comprising positively charged quaternary ammonium groups (anion exchange surface) and incubated for 120 minutes. The biomolecules bound to the surface are analysed by gas phase ion spectrometry as described in another section. All but the dilution steps are performed at 20 to 24°C. Dilution steps are performed at 0 to 4°C.
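The two dilution steps named in this definition combine multiplicatively. The sketch below assumes that "1:N" means one part sample in N parts total volume; the text does not state which convention is used, so this reading and the function name are assumptions.

```python
from fractions import Fraction


def overall_dilution(*steps: int) -> Fraction:
    """Combine successive 1:N dilutions into one overall sample fraction.

    Assumes each "1:N" step means one part sample in N parts total volume.
    """
    result = Fraction(1)
    for n in steps:
        result *= Fraction(1, n)
    return result


# 1:5 in denaturation buffer, then 1:10 in binding buffer.
final = overall_dilution(5, 10)
print(f"overall dilution = 1:{final.denominator}")
```

Under the alternative convention (one part sample plus N parts diluent) the overall factor would differ, so the convention should be confirmed before the result is relied upon.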
The phrase "apparent molecular mass" refers to the molecular mass value in Dalton (Da) of a biomolecule as it may appear in a given method of investigation, e.g. size exclusion chromatography, gel electrophoresis, or mass spectrometry.
The term "chromatography" refers to any method of separating biomolecules within a given sample such that the original native state of a given biomolecule is retained. Separation of a biomolecule from other biomolecules within a given sample for the purpose of enrichment, purification and/or analysis, may be achieved by methods including, but not limited to, size exclusion chromatography, ion exchange chromatography, hydrophobic and hydrophilic interaction chromatography, metal affinity chromatography, wherein "metal" refers to metal ions (e.g. nickel, copper, gallium, or zinc) of all chemically possible valences, or ligand affinity chromatography wherein "ligand" refers to binding molecules, preferably proteins, antibodies, or DNA. Generally, chromatography uses biologically active surfaces as adsorbents to selectively accumulate certain biomolecules.
The term "mass spectrometry" refers to a method comprising employing an ionization source to generate gas phase ions from a biological entity of a sample presented on a biologically active surface and detecting the gas phase ions with a mass spectrometer.
The phrase "laser desorption mass spectrometry" refers to a method comprising the use of a laser as an ionization source to generate gas phase ions from a biomolecule presented on a biologically active surface and detecting the gas phase ions with a mass spectrometer.
The term "mass spectrometer" refers to a gas phase ion spectrometer that includes an inlet system, an ionisation source, an ion optic assembly, a mass analyser, and a detector.
Within the context of the invention, the terms "detect", "detection" or "detecting" refer to the identification of the presence, absence, or quantity of a biomolecule.
The term "energy absorbing molecule" or "EAM" refers to a molecule that absorbs energy from an energy source in a mass spectrometer thereby enabling desorption of a biomolecule from a biologically active surface. Cinnamic acid derivatives, sinapinic acid and dihydroxybenzoic acid are frequently used as energy-absorbing molecules in laser desorption of biomolecules. See U.S. Pat. No. 5,719,060 (Hutchens & Yip) for a further description of energy absorbing molecules.
The term "training set" refers to a subset of the respective entire available data set. This subset is typically randomly selected, and is solely used for the purpose of classifier construction.
The term "test set" refers to a subset of the entire available data set consisting of those entries not included in the training set. Test data is applied to evaluate classifier performance.
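The relationship between training set and test set can be sketched as a random partition of the available data. The split fraction, seed, and sample names below are hypothetical; the text only states that the training subset is typically randomly selected and that the test set holds the remaining entries.

```python
import random


def split_train_test(entries, train_fraction=0.7, seed=0):
    """Randomly partition entries into a training set and a disjoint test set."""
    rng = random.Random(seed)          # fixed seed so the split is reproducible
    shuffled = list(entries)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    # Entries not placed in the training set form the test set.
    return shuffled[:n_train], shuffled[n_train:]


# Hypothetical sample identifiers.
samples = [f"sample_{i}" for i in range(10)]
train, test = split_train_test(samples)
print(len(train), "training entries,", len(test), "test entries")
```

Only the training entries would be used for classifier construction; the test entries are held back to evaluate classifier performance.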
The term "decision tree" refers to a flow-chart-like tree structure employed for classification. Decision trees consist of repeated splits of a data set into subsets. Each split consists of a simple rule applied to one variable, e.g., "if value of 'variable 1' larger than 'threshold 1' then go left else go right". Accordingly, the given feature space is partitioned into a set of rectangles with each rectangle assigned to one class.
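A decision tree of the kind defined above can be sketched as nested split rules, each comparing one variable against one threshold. The variable names, thresholds, and class labels in this example are hypothetical and carry no diagnostic meaning.

```python
from dataclasses import dataclass


@dataclass
class Leaf:
    label: str                 # class assigned to samples reaching this leaf


@dataclass
class Split:
    variable: str              # variable tested at this node
    threshold: float
    left: "Split | Leaf"       # followed when value > threshold
    right: "Split | Leaf"      # followed otherwise


def classify(node, sample):
    """Walk the tree: go left if the value is larger than the threshold, else right."""
    while isinstance(node, Split):
        node = node.left if sample[node.variable] > node.threshold else node.right
    return node.label


# Hypothetical two-level tree over two intensity variables.
tree = Split("var_1", 10.0,
             left=Leaf("class_A"),
             right=Split("var_2", 3.0, left=Leaf("class_B"), right=Leaf("class_A")))

print(classify(tree, {"var_1": 5.0, "var_2": 4.0}))
```

Because every rule tests a single variable against a constant, each leaf corresponds to one axis-aligned rectangle of the feature space, as stated in the definition.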
The terms "ensemble", "tree ensemble" or "ensemble classifier" can be used interchangeably and refer to a classifier that consists of many simpler elementary classifiers, e.g., an ensemble of decision trees is a classifier consisting of decision trees. The result of the ensemble classifier is obtained by combining all the results of its constituent classifiers, e.g., by majority voting that weights all constituent classifiers equally. Majority voting is especially reasonable in the case of bagging, where constituent classifiers are then naturally weighted by the frequency with which they are generated.
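Equally weighted majority voting, as described for ensemble classifiers, can be sketched as follows. The three constituent classifiers are hypothetical single-threshold rules standing in for decision trees, and the variable names are invented for illustration.

```python
from collections import Counter


def ensemble_predict(classifiers, sample):
    """Return the label predicted by most constituent classifiers (equal weights).

    On a tie, the label that was voted first is returned.
    """
    votes = [clf(sample) for clf in classifiers]
    return Counter(votes).most_common(1)[0][0]


# Hypothetical constituent classifiers; each stands in for one decision tree.
classifiers = [
    lambda s: "class_A" if s["var_1"] > 10.0 else "class_B",
    lambda s: "class_A" if s["var_2"] > 3.0 else "class_B",
    lambda s: "class_A" if s["var_3"] > 7.0 else "class_B",
]

sample = {"var_1": 12.0, "var_2": 1.0, "var_3": 9.0}
print(ensemble_predict(classifiers, sample))
```

If the same constituent classifier is generated several times, as can happen in bagging, it simply appears several times in the list and so receives a proportionally larger share of the votes.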
The term "competitor" refers to a variable (in our case: mass) that can be used as an alternative splitting rule in a decision tree. In each step of decision tree construction, only the variable yielding best data splitting is selected. Competitors are non-selected variables with similar but lower performance than the selected variable. They point into the direction of alternative decision trees.
The term "surrogate" refers to a splitting rule that closely mimics the action of the primary split. A surrogate is a variable that can substitute a selected decision tree variable, e.g. in the case of missing values. A good surrogate must not only split the parent node into descendant nodes similar in size and composition to the primary descendant nodes; it must also match the primary split on the specific cases that go to the left child and right child nodes.
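One simple way to judge how closely a candidate surrogate mimics a primary split is to count the cases that both rules send to the same child node. The agreement score below is an illustrative assumption, not necessarily the measure used to build the classifiers described here; all variable names and values are hypothetical.

```python
def goes_left(value, threshold):
    """Split rule: larger than the threshold goes left, otherwise right."""
    return value > threshold


def split_agreement(samples, primary, surrogate):
    """Fraction of samples that the surrogate sends to the same child as the primary.

    primary and surrogate are (variable_name, threshold) pairs.
    """
    same = sum(
        goes_left(s[primary[0]], primary[1]) == goes_left(s[surrogate[0]], surrogate[1])
        for s in samples
    )
    return same / len(samples)


# Hypothetical intensity values for two variables in four samples.
samples = [
    {"var_1": 12.0, "var_2": 8.0},
    {"var_1": 11.0, "var_2": 7.5},
    {"var_1": 4.0, "var_2": 2.0},
    {"var_1": 3.0, "var_2": 9.0},
]

print(split_agreement(samples, primary=("var_1", 10.0), surrogate=("var_2", 5.0)))
```

An agreement of 1.0 would mean the surrogate reproduces the primary split case by case, which is the property the definition asks for beyond similar node sizes.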
The terms "peak" and "signal" may be used interchangeably and refer to any signal which is generated by a biomolecule when under investigation using a specific method, for example chromatography, mass spectrometry, or any type of spectroscopy like Ultraviolet/Visible Light (UV/Vis) spectroscopy, Fourier Transform Infrared (FTIR) spectroscopy, Electron Paramagnetic Resonance (EPR) spectroscopy, or Nuclear Magnetic Resonance (NMR) spectroscopy.
Within the context of the invention, the terms "peak" and "signal" refer to the signal generated by a biomolecule of a certain molecular mass hitting the detector of a mass spectrometer, thus generating a signal intensity which correlates with the amount or concentration of said biomolecule of a given sample. A "peak" and "signal" is defined by two values: an apparent molecular mass value and an intensity value generated as described. The mass value is an elemental characteristic of a biological entity, whereas the intensity value accords to a certain amount or concentration of a biological entity with the corresponding apparent molecular mass value, and thus "peak" and "signal" always refer to the properties of this biological entity.
The term "cluster" refers to a signal or peak present in a certain set of mass spectra or mass profiles obtained from different samples belonging to two or more different groups (e.g. cancer and non cancer). Within the set, signals belonging to a cluster can differ in their intensities, but not in the apparent molecular masses.
The term "variable" refers to a cluster which is subjected to a statistical analysis aiming towards a classification of samples into two or more different sample groups (e.g. cancer and non cancer) by using decision trees, wherein the sample feature relevant for classification is the intensity value of the variables in the analysed samples.
Detailed Description of the invention
a) Diagnostics
The present invention relates to methods for the differential diagnosis of colorectal cancers or a non-malignant disease of the large intestine by detecting one or more differentially expressed biomolecules within a test sample of a given subject, comparing results with samples from healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having a colorectal cancer, subjects having a metastasised colorectal cancer, or subjects having a non-malignant disease of the large intestine, wherein the comparison allows for the differential diagnosis of a subject as healthy, having a precancerous lesion of the large intestine, having a colorectal cancer, having a metastasised colorectal cancer or a non-malignant disease of the large intestine.
In one aspect of the invention, a method for the differential diagnosis of a colorectal cancer or a non-malignant disease of the large intestine comprises obtaining a test sample from a given subject, contacting said sample with an adsorbent present on a biologically active surface under specific binding conditions, allowing the biomolecules within the test sample to bind to said adsorbent, detecting one or more bound biomolecules using a detection method, wherein the detection method generates a mass profile of said sample, transforming the mass profile data into a computer-readable form, and comparing the mass profile of said sample with a database containing mass profiles from comparable samples specific for healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having a colorectal cancer, subjects having a metastasised colorectal cancer, or subjects having a non-malignant disease of the large intestine. A comparison of mass profiles allows the medical practitioner to determine if a subject is healthy, has a precancerous lesion of the large intestine, a colorectal cancer, a metastasised colorectal cancer or a non-malignant disease of the large intestine based on the presence, absence or quantity of specific biomolecules.
In more than one embodiment, a single biomolecule or a combination of more than one biomolecule selected from the group having an apparent molecular mass of 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da ± 33 Da, 6852 Da ± 34 Da, 6897 Da ± 34 Da, 6999 Da ± 35 Da, 7575 Da ± 38 Da, 7657 Da ± 38 Da, 8076 Da ± 40 Da, 8215 Da ± 41 Da, 8474 Da ± 42 Da, 8574 Da ± 43 Da, 8702 Da ± 44 Da, 8780 Da ± 44 Da, 8922 Da ± 45 Da, 9078 Da ± 45 Da, 9143 Da ± 46 Da, 9201 Da ± 46 Da, 9359 Da ± 47 Da, 9425 Da ± 47 Da, 9581 Da ± 48 Da, 9641 Da ± 48 Da, 9718 Da ± 49 Da, 9930 Da ± 50 Da, 10215 Da ± 51 Da, 10369 Da ± 52 Da, 10440 Da ± 52 Da, 10594 Da ± 53 Da, 11216 Da ± 56 Da, 11464 Da ± 57 Da, 11547 Da ± 58 Da, 11693 Da ± 58 Da, 11905 Da ± 60 Da, 12470 Da ± 62 Da, 12619 Da ± 63 Da, 12828 Da ± 64 Da, 13290 Da ± 66 Da, 13632 Da ± 68 Da, 13784 Da ± 69 Da, 13983 Da ± 70 Da, 14798 Da ± 74 Da, 15005 Da ± 75 Da, 15140 Da ± 76 Da, 15350 Da ± 77 Da, 15879 Da ± 79 Da, 15957 Da ± 80 Da, 16104 Da ± 81 Da, 16164 Da ± 81 Da, 16953 Da ± 85 Da, 17263 Da ± 86 Da, 17397 Da ± 87 Da, 17617 Da ± 88 Da, 17766 Da ± 89 Da, 17890 Da ± 89 Da, 18115 Da ± 91 Da, 18390 Da ± 92 Da, 22338 Da ± 112 Da, 22466 Da ± 112 Da, 22676 Da ± 113 Da, 22951 Da ± 115 Da, 24079 Da ± 120 Da, 28055 Da ± 140 Da, or 28259 Da ± 141 Da may be detected within a given sample.
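Each entry in the list above defines a mass window (nominal mass plus or minus a tolerance). A minimal sketch of matching an observed peak mass against such windows follows; only a subset of the listed values is included, and the function name and the observed masses are hypothetical.

```python
# (nominal mass, tolerance) pairs in Da, copied from the list above (subset only).
MASS_WINDOWS = [
    (2020, 10),
    (4242, 21),
    (5772, 29),
    (8780, 44),
    (16104, 81),
    (16164, 81),
    (22951, 115),
]


def match_peak(observed_mass):
    """Return every listed nominal mass whose window contains the observed mass."""
    return [m for m, tol in MASS_WINDOWS if abs(observed_mass - m) <= tol]


# Hypothetical observed apparent molecular masses.
print(match_peak(4250.0))    # falls inside the 4242 Da window
print(match_peak(16130.0))   # falls inside two overlapping windows
print(match_peak(3000.0))    # falls inside no window of this subset
```

The function returns a list rather than a single value because some neighbouring windows overlap (for example 16104 Da ± 81 Da and 16164 Da ± 81 Da), so one observed mass may be compatible with more than one listed biomolecule.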
Detection of a single or a combination of more than one biomolecule of the invention is based on specific sample pre-treatment conditions, the pH of binding conditions, and the type of biologically active surface used for the detection of biomolecules. For example, prior to the detection of the biomolecules described herein, a given sample is pre-treated by diluting 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% ampholine. The denatured sample is then diluted 1:10 in a specific binding buffer (0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5), applied to a biologically active surface comprising positively-charged quaternary ammonium groups (cationic) and incubated using specific buffer conditions (0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5) to allow for binding of said biomolecules to the above-mentioned biologically active surface.
According to the invention, a biomolecule with the molecular mass of 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da ± 33 Da, 6852 Da ± 34 Da, 6897 Da ± 34 Da, 6999 Da ± 35 Da, 7575 Da ± 38 Da, 7657 Da ± 38 Da, 8076 Da ± 40 Da, 8215 Da ± 41 Da, 8474 Da ± 42 Da, 8574 Da ± 43 Da, 8702 Da ± 44 Da, 8780 Da ± 44 Da, 8922 Da ± 45 Da, 9078 Da ± 45 Da, 9143 Da ± 46 Da, 9201 Da ± 46 Da, 9359 Da ± 47 Da, 9425 Da ± 47 Da, 9581 Da ± 48 Da, 9641 Da ± 48 Da, 9718 Da ± 49 Da, 9930 Da ± 50 Da, 10215 Da ± 51 Da, 10369 Da ± 52 Da, 10440 Da ± 52 Da, 10594 Da ± 53 Da, 11216 Da ± 56 Da, 11464 Da ± 57 Da, 11547 Da ± 58 Da, 11693 Da ± 58 Da, 11905 Da ± 60 Da, 12470 Da ± 62 Da, 12619 Da ± 63 Da, 12828 Da ± 64 Da, 13290 Da ± 66 Da, 13632 Da ± 68 Da, 13784 Da ± 69 Da, 13983 Da ± 70 Da, 14798 Da ± 74 Da, 15005 Da ± 75 Da, 15140 Da ± 76 Da, 15350 Da ± 77 Da, 15879 Da ± 79 Da, 15957 Da ± 80 Da, 16104 Da ± 81 Da, 16164 Da ± 81 Da, 16953 Da ± 85 Da, 17263 Da ± 86 Da, 17397 Da ± 87 Da, 17617 Da ± 88 Da, 17766 Da ± 89 Da, 17890 Da ± 89 Da, 18115 Da ± 91 Da, 18390 Da ± 92 Da, 22338 Da ± 112 Da, 22466 Da ± 112 Da, 22676 Da ± 113 Da, 22951 Da ± 115 Da, 24079 Da ± 120 Da, 28055 Da ± 140 Da, or 28259 Da ± 141 Da is detected by diluting the biological sample 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% Ampholine, and then 1:10 in binding buffer consisting of 0.1 M Tris-HCl, 0.02% Triton X-100 at pH 8.5 at 0 to 4°C, applying the thus treated sample to a biologically active surface comprising positively charged (cationic) quaternary ammonium groups (anion exchanging), incubating for 120 minutes at 20 to 24°C, and subjecting the bound biomolecules to gas phase ion spectrometry as described in another section.
A biomolecule of the invention may include any molecule that is produced by a cell or living organism, and may have any biochemical property (e.g. phosphorylated proteins, positively charged molecules, negatively charged molecules, hydrophobicity, hydrophilicity), but preferably biochemical properties that allow binding of the biomolecule to a biologically active surface comprising positively charged quaternary ammonium groups after denaturation in 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% Ampholine and dilution in 0.1 M Tris-HCl, 0.02% Triton X-100 at pH 8.5 at 0 to 4°C followed by incubation on said biologically active surface for 120 minutes at 20 to 24°C. Such molecules include, but are not limited to, molecules comprising nucleotides, amino acids, sugars, fatty acids, steroids, nucleic acids, polynucleotides (DNA or RNA), polypeptides, proteins, antibodies, carbohydrates, lipids, and combinations thereof (e.g., glycoproteins, ribonucleoproteins, lipoproteins). Preferably a biomolecule may be a nucleotide, polynucleotide, peptide, protein or fragments thereof. Even more preferred are peptide or protein biomolecules or fragments thereof.
The methods for detecting these biomolecules have many applications. For example, a single biomolecule or a combination of more than one biomolecule selected from the group having an apparent molecular mass of 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da ± 33 Da, 6852 Da ± 34 Da, 6897 Da ± 34 Da, 6999 Da ± 35 Da, 7575 Da ± 38 Da, 7657 Da ± 38 Da, 8076 Da ± 40 Da, 8215 Da ± 41 Da, 8474 Da ± 42 Da, 8574 Da ± 43 Da, 8702 Da ± 44 Da, 8780 Da ± 44 Da, 8922 Da ± 45 Da, 9078 Da ± 45 Da, 9143 Da ± 46 Da, 9201 Da ± 46 Da, 9359 Da ± 47 Da, 9425 Da ± 47 Da, 9581 Da ± 48 Da, 9641 Da ± 48 Da, 9718 Da ± 49 Da, 9930 Da ± 50 Da, 10215 Da ± 51 Da, 10369 Da ± 52 Da, 10440 Da ± 52 Da, 10594 Da ± 53 Da, 11216 Da ± 56 Da, 11464 Da ± 57 Da, 11547 Da ± 58 Da, 11693 Da ± 58 Da, 11905 Da ± 60 Da, 12470 Da ± 62 Da, 12619 Da ± 63 Da, 12828 Da ± 64 Da, 13290 Da ± 66 Da, 13632 Da ± 68 Da, 13784 Da ± 69 Da, 13983 Da ± 70 Da, 14798 Da ± 74 Da, 15005 Da ± 75 Da, 15140 Da ± 76 Da, 15350 Da ± 77 Da, 15879 Da ± 79 Da, 15957 Da ± 80 Da, 16104 Da ± 81 Da, 16164 Da ± 81 Da, 16953 Da ± 85 Da, 17263 Da ± 86 Da, 17397 Da ± 87 Da, 17617 Da ± 88 Da, 17766 Da ± 89 Da, 17890 Da ± 89 Da, 18115 Da ± 91 Da, 18390 Da ± 92 Da, 22338 Da ± 112 Da, 22466 Da ± 112 Da, 22676 Da ± 113 Da, 22951 Da ± 115 Da, 24079 Da ± 120 Da, 28055 Da ± 140 Da, or 28259 Da ± 141 Da can be measured to differentiate between healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having colorectal cancer, subjects having a metastasized colorectal cancer or subjects with a non-malignant disease of the large intestine, and thus are useful as an aid in the diagnosis of a colorectal cancer and/or a non-malignant disease of the large intestine within a subject. Alternatively, said biomolecules may be used to diagnose a subject as healthy.
For example, a biomolecule having an apparent molecular mass of about 4242 Da is present only in biological samples from patients having a metastasised colorectal cancer. Mass profiling of two test samples from different subjects, X and Y, reveals the presence of a biomolecule with the apparent molecular mass of about 4242 Da in a sample from test subject X, and the absence of said biomolecule in the test sample from subject Y. The medical practitioner is able to diagnose subject X as having a metastasised colorectal cancer and subject Y as not having a metastasised colorectal cancer. In yet another example, three biomolecules having the apparent molecular masses of about 5772 Da, 2020 Da and 22951 Da are present in varying quantities in samples specific for precancerous lesions and "early" colorectal cancers. The biomolecule having the apparent molecular mass of 5772 Da is more abundant in samples specific for precancerous lesions of the large intestine than in those specific for "early" colorectal cancers. A biomolecule having an apparent molecular mass of 2020 Da is detected in samples from subjects having "early" colorectal cancers but not in those having a precancerous lesion, whereas the biomolecule having the molecular mass of 22951 Da is present in about the same quantity in both sample types. Such biomolecules are not present in samples from healthy subjects; only biomolecules with apparent molecular masses of 8780 Da and 16104 Da are. Analysis of a test sample reveals the presence of biomolecules having the molecular masses of 22951 Da, 5772 Da and 2020 Da. Comparison of the quantity of the biomolecules within said sample reveals that the biomolecule with an apparent molecular mass of 5772 Da is present at lower levels than those found in samples from subjects having a precancerous lesion. The medical practitioner is able to diagnose the test subject as having an "early" colorectal cancer.
These examples are solely used for the purpose of clarification and are not intended to limit the scope of this invention.
In another aspect of the invention, an immunoassay can be used to determine the presence or absence of a biomolecule within a test sample of a subject. First, the presence or absence of a biomolecule within a sample can be detected using the various immunoassay methods known to those skilled in the art (e.g. ELISA, western blots). If a biomolecule is present in the test sample, it will form an antibody-marker complex with an antibody that specifically binds a biomolecule under suitable incubation conditions. The amount of an antibody-biomolecule complex can be determined by comparing to a standard.
Thus, the invention provides a method for the differential diagnosis of a colorectal cancer and/or a non-malignant disease of the large intestine comprising detecting one or more differentially expressed biomolecules within a sample. This method comprises obtaining a test sample from a subject, contacting said sample with a binding molecule specific for a differentially expressed polypeptide, detecting an interaction between the binding molecule and its specific polypeptide, wherein the detection of an interaction indicates the presence or absence of said polypeptide, thereby allowing for the differential diagnosis of a subject as healthy, having a precancerous lesion of the large intestine, having a colorectal cancer, having a metastasised colorectal cancer and/or a non-malignant disease of the large intestine. Binding molecules include, but are not limited to, proteins, peptides, nucleotides, nucleic acids, hormones, amino acids, sugars, fatty acids, steroids, polynucleotides, carbohydrates, lipids, or a combination thereof (e.g. glycoproteins, ribonucleoproteins, lipoproteins), compounds or synthetic molecules.
Preferably, binding molecules are antibodies specific for biomolecules selected from the group having an apparent molecular mass of 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da ± 33 Da, 6852 Da ± 34 Da, 6897 Da ± 34 Da, 6999 Da ± 35 Da, 7575 Da ± 38 Da, 7657 Da ± 38 Da, 8076 Da ± 40 Da, 8215 Da ± 41 Da, 8474 Da ± 42 Da, 8574 Da ± 43 Da, 8702 Da ± 44 Da, 8780 Da ± 44 Da, 8922 Da ± 45 Da, 9078 Da ± 45 Da, 9143 Da ± 46 Da, 9201 Da ± 46 Da, 9359 Da ± 47 Da, 9425 Da ± 47 Da, 9581 Da ± 48 Da, 9641 Da ± 48 Da, 9718 Da ± 49 Da, 9930 Da ± 50 Da, 10215 Da ± 51 Da, 10369 Da ± 52 Da, 10440 Da ± 52 Da, 10594 Da ± 53 Da, 11216 Da ± 56 Da, 11464 Da ± 57 Da, 11547 Da ± 58 Da, 11693 Da ± 58 Da, 11905 Da ± 60 Da, 12470 Da ± 62 Da, 12619 Da ± 63 Da, 12828 Da ± 64 Da, 13290 Da ± 66 Da, 13632 Da ± 68 Da, 13784 Da ± 69 Da, 13983 Da ± 70 Da, 14798 Da ± 74 Da, 15005 Da ± 75 Da, 15140 Da ± 76 Da, 15350 Da ± 77 Da, 15879 Da ± 79 Da, 15957 Da ± 80 Da, 16104 Da ± 81 Da, 16164 Da ± 81 Da, 16953 Da ± 85 Da, 17263 Da ± 86 Da, 17397 Da ± 87 Da, 17617 Da ± 88 Da, 17766 Da ± 89 Da, 17890 Da ± 89 Da, 18115 Da ± 91 Da, 18390 Da ± 92 Da, 22338 Da ± 112 Da, 22466 Da ± 112 Da, 22676 Da ± 113 Da, 22951 Da ± 115 Da, 24079 Da ± 120 Da, 28055 Da ± 140 Da, or 28259 Da ± 141 Da.
In another aspect of the invention, a method for detecting the differential presence of one or more biomolecules selected from the group having an apparent molecular mass of 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da ± 33 Da, 6852 Da ± 34 Da, 6897 Da ± 34 Da, 6999 Da ± 35 Da, 7575 Da ± 38 Da, 7657 Da ± 38 Da, 8076 Da ± 40 Da, 8215 Da ± 41 Da, 8474 Da ± 42 Da, 8574 Da ± 43 Da, 8702 Da ± 44 Da, 8780 Da ± 44 Da, 8922 Da ± 45 Da, 9078 Da ± 45 Da, 9143 Da ± 46 Da, 9201 Da ± 46 Da, 9359 Da ± 47 Da, 9425 Da ± 47 Da, 9581 Da ± 48 Da, 9641 Da ± 48 Da, 9718 Da ± 49 Da, 9930 Da ± 50 Da, 10215 Da ± 51 Da, 10369 Da ± 52 Da, 10440 Da ± 52 Da, 10594 Da ± 53 Da, 11216 Da ± 56 Da, 11464 Da ± 57 Da, 11547 Da ± 58 Da, 11693 Da ± 58 Da, 11905 Da ± 60 Da, 12470 Da ± 62 Da, 12619 Da ± 63 Da, 12828 Da ± 64 Da, 13290 Da ± 66 Da, 13632 Da ± 68 Da, 13784 Da ± 69 Da, 13983 Da ± 70 Da, 14798 Da ± 74 Da, 15005 Da ± 75 Da, 15140 Da ± 76 Da, 15350 Da ± 77 Da, 15879 Da ± 79 Da, 15957 Da ± 80 Da, 16104 Da ± 81 Da, 16164 Da ± 81 Da, 16953 Da ± 85 Da, 17263 Da ± 86 Da, 17397 Da ± 87 Da, 17617 Da ± 88 Da, 17766 Da ± 89 Da, 17890 Da ± 89 Da, 18115 Da ± 91 Da, 18390 Da ± 92 Da, 22338 Da ± 112 Da, 22466 Da ± 112 Da, 22676 Da ± 113 Da, 22951 Da ± 115 Da, 24079 Da ± 120 Da, 28055 Da ± 140 Da, or 28259 Da ± 141 Da in a test sample of a subject involves contacting the test sample with a compound or agent capable of detecting said biomolecule such that the presence of said biomolecule is directly and/or indirectly labelled.
For example, a fluorescently labelled secondary antibody can be used to detect a primary antibody bound to its specific biomolecule. Furthermore, such detection methods can be used to detect a variety of biomolecules within a test sample both in vitro as well as in vivo.
For example, in vivo, antibodies or fragments thereof may be utilised for the detection of a biomolecule in a biological sample comprising: applying a labelled antibody directed against a given biomolecule of the invention to said sample under conditions that favour an interaction between the labelled antibody and its corresponding protein. Depending on the nature of the biological sample, it is possible to determine not only the presence of a biomolecule, but also its cellular distribution. For example, in a blood serum sample, only the serum levels of a given biomolecule can be detected, whereas its level of expression and cellular localisation can be detected in histological samples. It will be obvious to those skilled in the art that a wide variety of methods can be modified in order to achieve such detection.
For example, an antibody coupled to an enzyme is detected using a chromogenic substrate that is recognised and cleaved by the enzyme to produce a chemical moiety, which is readily detected using spectrometric, fluorimetric or visual means. Enzymes used for labelling include, but are not limited to, malate dehydrogenase, staphylococcal nuclease, delta-5-steroid isomerase, yeast alcohol dehydrogenase, alpha-glycerophosphate dehydrogenase, triose phosphate isomerase, horseradish peroxidase, alkaline phosphatase, asparaginase, glucose oxidase, beta-galactosidase, ribonuclease, urease, catalase, glucose-6-phosphate dehydrogenase, glucoamylase and acetylcholinesterase. Detection may also be accomplished by visual comparison of the extent of the enzymatic reaction of a substrate with that of similarly prepared standards. Alternatively, radiolabelled antibodies can be detected using a gamma or a scintillation counter, or they can be detected using autoradiography. In another example, fluorescently labelled antibodies are detected based on the level at which the attached compound fluoresces following exposure to a given wavelength. Fluorescent compounds typically used in antibody labelling include, but are not limited to, fluorescein isothiocyanate, rhodamine, phycoerythrin, phycocyanin, allophycocyanin, o-phthalaldehyde and fluorescamine. In yet another example, antibodies coupled to a chemi- or bioluminescent compound can be detected by determining the presence of luminescence. Such compounds include, but are not limited to, luminol, isoluminol, theromatic acridinium ester, imidazole, acridinium salt, oxalate ester, luciferin, luciferase and aequorin.
Furthermore, in vivo techniques for the detection of a biomolecule of the invention include introducing into a subject a labelled antibody directed against a given polypeptide or fragment thereof. In more than one embodiment of the invention, the test sample used for the differential diagnosis of a colorectal cancer and/or a non-malignant disease of the large intestine of a subject may be of blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract origin. Preferably, test samples are of blood, blood serum, plasma, urine, excreta, prostatic fluid, biopsy, ascites, lymph or tissue extract origin. More preferred are blood, blood serum, plasma, urine, excreta, biopsy, lymph or tissue extract samples. Even more preferred are blood serum, urine, excreta or biopsy samples. Overall preferred are blood serum samples.
Furthermore, test samples used for the methods of the invention are isolated from subjects of mammalian origin, preferably of primate origin. Even more preferred are subjects of human origin.
In addition, the methods of the invention for the differential diagnosis of healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having a colorectal cancer, subjects having a metastasised colorectal cancer or subjects having a non-malignant disease of the large intestine described herein may be combined with other diagnostic methods to improve the outcome of the differential diagnosis. Other diagnostic methods are known to those skilled in the art.
b) Database
In another aspect of the invention, a database comprising mass profiles specific for healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having a colorectal cancer, subjects having a metastasised colorectal cancer, or subjects having a non-malignant disease of the large intestine is generated by contacting biological samples isolated from the above-mentioned subjects with an adsorbent on a biologically active surface under specific binding conditions, allowing the biomolecules within said sample to bind said adsorbent, detecting one or more bound biomolecules using a detection method wherein the detection method generates a mass profile of said sample, transforming the mass profile data into a computer-readable form and applying a mathematical algorithm to classify the mass profile as specific for healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having a colorectal cancer, subjects having a metastasised colorectal cancer, or subjects having a non-malignant disease of the large intestine.
According to the invention, the classification of said mass profiles is performed using the "CART" decision tree approach (classification and regression trees; Breiman et al., 1984), which is known to those skilled in the art. Furthermore, bagging of classifiers is applied to overcome typical instabilities of forward variable selection procedures, thereby increasing overall classifier performance (Breiman, 1994). In more than one embodiment, one or more biomolecules selected from the group having an apparent molecular mass of 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da ± 33 Da, 6852 Da ± 34 Da, 6897 Da ± 34 Da, 6999 Da ± 35 Da, 7575 Da ± 38 Da, 7657 Da ± 38 Da, 8076 Da ± 40 Da, 8215 Da ± 41 Da, 8474 Da ± 42 Da, 8574 Da ± 43 Da, 8702 Da ± 44 Da, 8780 Da ± 44 Da, 8922 Da ± 45 Da, 9078 Da ± 45 Da, 9143 Da ± 46 Da, 9201 Da ± 46 Da, 9359 Da ± 47 Da, 9425 Da ± 47 Da, 9581 Da ± 48 Da, 9641 Da ± 48 Da, 9718 Da ± 49 Da, 9930 Da ± 50 Da, 10215 Da ± 51 Da, 10369 Da ± 52 Da, 10440 Da ± 52 Da, 10594 Da ± 53 Da, 11216 Da ± 56 Da, 11464 Da ± 57 Da, 11547 Da ± 58 Da, 11693 Da ± 58 Da, 11905 Da ± 60 Da, 12470 Da ± 62 Da, 12619 Da ± 63 Da, 12828 Da ± 64 Da, 13290 Da ± 66 Da, 13632 Da ± 68 Da, 13784 Da ± 69 Da, 13983 Da ± 70 Da, 14798 Da ± 74 Da, 15005 Da ± 75 Da, 15140 Da ± 76 Da, 15350 Da ± 77 Da, 15879 Da ± 79 Da, 15957 Da ± 80 Da, 16104 Da ± 81 Da, 16164 Da ± 81 Da, 16953 Da ± 85 Da, 17263 Da ± 86 Da, 17397 Da ± 87 Da, 17617 Da ± 88 Da, 17766 Da ± 89 Da, 17890 Da ± 89 Da, 18115 Da ± 91 Da, 18390 Da ± 92 Da, 22338 Da ± 112 Da, 22466 Da ± 112 Da, 22676 Da ± 113 Da, 22951 Da ± 115 Da, 24079 Da ± 120 Da, 28055 Da ± 140 Da, or 28259 Da ± 141 Da may be detected within a given biological sample. Detection of said biomolecules of the invention is based on specific sample pre-treatment conditions, the pH of binding conditions, and the type of biologically active surface used for the detection of biomolecules.
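The classification scheme named above (CART decision trees combined with bagging) may be illustrated by the following minimal pure-Python sketch. It is not the classifier of the invention: the tree grower is a simplified Gini-based CART variant without pruning, the depth limit, number of trees and random seed are arbitrary, and the two-feature training data (hypothetical peak intensities at two masses) are invented solely so that the example runs.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def best_split(rows, labels):
    """Return (feature index, threshold) that lowers the weighted Gini impurity most, or None."""
    best, best_score = None, gini(labels)
    for f in range(len(rows[0])):
        values = sorted(set(r[f] for r in rows))
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # candidate threshold between adjacent values
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            score = (len(left) * gini(left) + len(right) * gini(right)) / len(rows)
            if score < best_score:
                best, best_score = (f, t), score
    return best


def grow(rows, labels, depth=0, max_depth=3):
    """Grow a tree; a leaf is a class label, an inner node is (feature, threshold, left, right)."""
    split = best_split(rows, labels) if depth < max_depth else None
    if split is None:
        return Counter(labels).most_common(1)[0][0]
    f, t = split
    left = [(r, y) for r, y in zip(rows, labels) if r[f] <= t]
    right = [(r, y) for r, y in zip(rows, labels) if r[f] > t]
    return (f, t,
            grow([r for r, _ in left], [y for _, y in left], depth + 1, max_depth),
            grow([r for r, _ in right], [y for _, y in right], depth + 1, max_depth))


def predict_tree(tree, row):
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if row[f] <= t else right
    return tree


def bagged_trees(rows, labels, n_trees=25, seed=0):
    """Bagging: grow each tree on a bootstrap resample of the training profiles."""
    rng = random.Random(seed)
    forest = []
    for _ in range(n_trees):
        idx = [rng.randrange(len(rows)) for _ in rows]
        forest.append(grow([rows[i] for i in idx], [labels[i] for i in idx]))
    return forest


def predict_bagged(forest, row):
    """Majority vote over all trees."""
    return Counter(predict_tree(t, row) for t in forest).most_common(1)[0][0]


# Hypothetical intensities at two masses (e.g. 5772 Da and 2020 Da) per sample.
profiles = [(9.0, 0.0), (10.0, 0.1), (11.0, 0.0), (12.0, 0.2),
            (2.0, 5.0), (3.0, 6.0), (3.5, 4.5), (4.0, 5.5)]
classes = ["precancerous"] * 4 + ["early"] * 4
forest = bagged_trees(profiles, classes)
print(predict_bagged(forest, (3.0, 5.0)), predict_bagged(forest, (10.5, 0.1)))
```

Because each tree sees a different resample, a single unstable split has limited influence on the final vote, which is the stabilising effect the paragraph attributes to bagging.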
Within the context of the invention, biomolecules within a given sample are bound to an adsorbent on a biologically active surface under specific binding conditions. For example, the biomolecules within a given sample are applied to a biologically active surface comprising positively-charged quaternary ammonium groups (cationic) and incubated with 0.1 M Tris-HCl, 0.02% Triton X-100 at a pH of 8.5 to allow for specific binding. Biomolecules that bind to said biologically active surface under these conditions are negatively charged molecules. It should be noted that although the biomolecules of the invention are bound to a cationic adsorbent comprising positively-charged quaternary ammonium groups, the biomolecules are capable of binding other types of adsorbents, as described in another section, using binding conditions known to those skilled in the art. Accordingly, some embodiments of the invention are not limited to the use of cationic adsorbents.
According to the invention, a biomolecule with the molecular mass of 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da ± 33 Da, 6852 Da ± 34 Da, 6897 Da ± 34 Da, 6999 Da ± 35 Da, 7575 Da ± 38 Da, 7657 Da ± 38 Da, 8076 Da ± 40 Da, 8215 Da ± 41 Da, 8474 Da ± 42 Da, 8574 Da ± 43 Da, 8702 Da ± 44 Da, 8780 Da ± 44 Da, 8922 Da ± 45 Da, 9078 Da ± 45 Da, 9143 Da ± 46 Da, 9201 Da ± 46 Da, 9359 Da ± 47 Da, 9425 Da ± 47 Da, 9581 Da ± 48 Da, 9641 Da ± 48 Da, 9718 Da ± 49 Da, 9930 Da ± 50 Da, 10215 Da ± 51 Da, 10369 Da ± 52 Da, 10440 Da ± 52 Da, 10594 Da ± 53 Da, 11216 Da ± 56 Da, 11464 Da ± 57 Da, 11547 Da ± 58 Da, 11693 Da ± 58 Da, 11905 Da ± 60 Da, 12470 Da ± 62 Da, 12619 Da ± 63 Da, 12828 Da ± 64 Da, 13290 Da ± 66 Da, 13632 Da ± 68 Da, 13784 Da ± 69 Da, 13983 Da ± 70 Da, 14798 Da ± 74 Da, 15005 Da ± 75 Da, 15140 Da ± 76 Da, 15350 Da ± 77 Da, 15879 Da ± 79 Da, 15957 Da ± 80 Da, 16104 Da ± 81 Da, 16164 Da ± 81 Da, 16953 Da ± 85 Da, 17263 Da ± 86 Da, 17397 Da ± 87 Da, 17617 Da ± 88 Da, 17766 Da ± 89 Da, 17890 Da ± 89 Da, 18115 Da ± 91 Da, 18390 Da ± 92 Da, 22338 Da ± 112 Da, 22466 Da ± 112 Da, 22676 Da ± 113 Da, 22951 Da ± 115 Da, 24079 Da ± 120 Da, 28055 Da ± 140 Da, or 28259 Da ± 141 Da is detected by diluting the biological sample 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% Ampholine, and then 1:10 in binding buffer consisting of 0.1 M Tris-HCl, 0.02% Triton X-100 at pH 8.5 at 0 to 4°C, applying thus treated sample to a biologically active surface comprising positively charged (cationic) quaternary ammonium groups (anion exchanging), incubating for 120 minutes at 20 to 24°C, and subjecting the bound biomolecules to gas phase ion spectrometry as described in another section.
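The two dilution steps of the pre-treatment above can be checked with a few lines of arithmetic. The sketch assumes that "1:5" and "1:10" each mean one part of material in that many parts of total volume and that the undiluted sample contains no urea; the text does not state either point, and the function name is hypothetical.

```python
from fractions import Fraction


def serial_dilution(steps):
    """Overall fraction of the original sample after successive 1:n dilutions
    (assumed to mean 1 part in n parts total)."""
    overall = Fraction(1)
    for n in steps:
        overall *= Fraction(1, n)
    return overall


# 1:5 in denaturation buffer, then 1:10 in binding buffer.
sample_fraction = serial_dilution([5, 10])

# Under the same assumption, denaturation buffer makes up 4/5 of the first
# mixture, which is then diluted 1:10 in binding buffer.
denaturation_buffer_fraction = Fraction(4, 5) * Fraction(1, 10)
urea_molar = 7 * denaturation_buffer_fraction  # 7 M urea in the denaturation buffer

print(sample_fraction, float(urea_molar))
```

Under these assumptions the sample applied to the surface is a 1:50 dilution of the original material and carries roughly 0.56 M residual urea.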
In one embodiment of the invention, biological samples used to generate a database of mass profiles for healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having a colorectal cancer, subjects having a metastasised colorectal cancer or subjects having a non-malignant disease of the large intestine, may be of blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract origin. Preferably, biological samples are of blood, blood serum, plasma, urine, excreta, prostatic fluid, biopsy, ascites, lymph or tissue extract origin. More preferred are blood, blood serum, plasma, urine, excreta, biopsy, lymph or tissue extract samples. Even more preferred are blood serum, urine, excreta or biopsy samples. Overall preferred are blood serum samples.
Furthermore, the biological samples related to the invention are isolated from subjects considered to be healthy, having a precancerous lesion of the large intestine, having a colorectal cancer, having a metastasised colorectal cancer or having a non-malignant disease of the large intestine. Said subjects are of mammalian origin, preferably of primate origin. Even more preferred are subjects of human origin. A subject of the invention that is said to have a precancerous lesion of the large intestine displays preliminary stages of a cancer (i.e. dysplasia), wherein a cell and/or tissue has become susceptible to the development of a cancer as a result of either a genetic predisposition, exposure to a cancer-causing agent (carcinogen) or both.
A genetic predisposition may include a predisposition for an autosomal dominant inherited cancer syndrome, which is generally indicated by a strong family history of uncommon cancer and/or an association with a specific marker phenotype (e.g. familial adenomatous polyps of the colon); a familial cancer, wherein an evident clustering of cancer is observed but the role of inherited predisposition may not be clear (e.g. breast cancer, ovarian cancer, or colon cancer); or an autosomal recessive syndrome characterised by chromosomal or DNA instability. Cancer-causing agents, in turn, include agents that cause genetic damage and induce neoplastic transformation of a cell. Such agents fall into three categories: 1) chemical carcinogens such as alkylating agents, polycyclic aromatic hydrocarbons, aromatic amines, azo dyes, nitrosamines and amides, asbestos, vinyl chloride, chromium, nickel, arsenic, and naturally occurring carcinogens (e.g. aflatoxin B1); 2) radiation such as ultraviolet (UV) and ionising radiation including electromagnetic (e.g. x-rays, γ-rays) and particulate radiation (e.g. α and β particles, protons, neutrons); 3) viral and microbial carcinogens such as human papillomavirus (HPV), Epstein-Barr virus (EBV), hepatitis B virus (HBV), human T-cell leukaemia virus type 1 (HTLV-1), or Helicobacter pylori.
Alternatively, a subject within the invention that is said to have a colorectal cancer possesses a cancer that arises from the large intestine (interchangeably referred to as colorectal cancers within the invention). Such cancers may include, but are not limited to, colon and rectal cancers.
Within the context of the invention, cancers of the large intestine (interchangeably referred to as colorectal cancers within the invention) may also be of various stages, wherein the staging is based on the size of the primary lesion, its extent of spread to regional lymph nodes, and the presence or absence of blood-borne metastases (metastatic colorectal cancers). The various stages of a cancer may be identified using staging systems known to those skilled in the art [e.g. the Union Internationale Contre le Cancer (UICC) system or the American Joint Committee on Cancer (AJCC) system]. Also included are different grades of said cancers, wherein the grade of a cancer is based on the degree of differentiation of the epithelial cells within the lining of the large intestine and the number of mitoses as a correlation to a neoplasm's aggression.
Healthy individuals, as related to certain embodiments of the invention, are those that possess good health, and demonstrate an absence of a colorectal cancer or a non-malignant disease of the large intestine.
c) Biomolecules
The differential expression of biomolecules in samples from healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having a colorectal cancer, subjects having metastasised colorectal cancer, and subjects having a non-malignant disease of the large intestine, allows for the differential diagnosis of a non-malignant disease or a cancer of the large intestine within a subject.
Biomolecules are said to be specific for a particular clinical state (e.g. healthy, precancerous lesion of the large intestine, colorectal cancer, metastasised colorectal cancer, a non-malignant disease of the large intestine) when they are present at different levels within samples taken from subjects in one clinical state as compared to samples taken from subjects from other clinical states (e.g. in subjects with a precancerous lesion of the large intestine vs. in subjects with a metastasised colorectal cancer). Biomolecules may be present at elevated levels, at decreased levels, or altogether absent within a sample taken from a subject in a particular clinical state (e.g. healthy, precancerous lesion of the large intestine, colorectal cancer, metastasised colorectal cancer, a non-malignant disease of the large intestine). For example, biomolecules A and B are found at elevated levels in samples isolated from healthy subjects as compared to samples isolated from subjects having a precancerous lesion of the large intestine, a colorectal cancer, a metastatic colorectal cancer or a non-malignant disease of the large intestine, whereas biomolecules X, Y and Z are found at elevated levels and/or more frequently in samples isolated from subjects having a precancerous lesion of the large intestine as opposed to subjects in good health, having a colorectal cancer, a metastasised colorectal cancer or a non-malignant disease of the large intestine. Biomolecules A and B are said to be specific for healthy subjects, whereas biomolecules X, Y and Z are specific for subjects having a precancerous lesion of the large intestine.
Accordingly, the differential presence of one or more biomolecules found in a test sample compared to samples from healthy subjects, subjects with a precancerous lesion of the large intestine, a colorectal cancer, a metastasised colorectal cancer, or a non-malignant disease of the large intestine, or the mere detection of one or more biomolecules in the test sample, provides useful information regarding the probability that a subject being tested has a precancerous lesion of the large intestine, a colorectal cancer, a metastasised colorectal cancer or a non-malignant disease of the large intestine. This probability depends on whether the quantity of one or more biomolecules in a test sample taken from said subject is statistically significantly different from the quantity of one or more biomolecules in a biological sample taken from healthy subjects, subjects having a precancerous lesion of the large intestine, a colorectal cancer, a metastasised colorectal cancer, or a non-malignant disease of the large intestine.
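The text does not name a particular statistical test for deciding whether quantities differ significantly between groups. As one possible illustration only, the following sketch computes an exact two-sided permutation p-value for the difference in mean intensity of a single biomolecule between two groups of samples. The choice of test, the function name and the intensity values are assumptions introduced here; the exhaustive enumeration is only practical for small groups.

```python
from itertools import combinations
from statistics import mean


def exact_permutation_p(group_a, group_b):
    """Two-sided exact permutation p-value for a difference in group means."""
    pooled = list(group_a) + list(group_b)
    n_a, n_b = len(group_a), len(group_b)
    observed = abs(mean(group_a) - mean(group_b))
    total = sum(pooled)
    count = extreme = 0
    # Enumerate every way of assigning n_a of the pooled values to group A.
    for idx in combinations(range(len(pooled)), n_a):
        sum_a = sum(pooled[i] for i in idx)
        diff = abs(sum_a / n_a - (total - sum_a) / n_b)
        count += 1
        if diff >= observed - 1e-12:  # small slack for floating-point rounding
            extreme += 1
    return extreme / count


# Hypothetical peak intensities of one biomolecule in two clinical groups.
healthy = [1.0, 2.0, 3.0, 4.0]
precancerous = [10.0, 11.0, 12.0, 13.0]
print(exact_permutation_p(healthy, precancerous))
```

With four samples per group there are 70 possible relabellings, so the smallest attainable two-sided p-value is 2/70; larger cohorts would normally call for a sampled permutation test or a rank-based test instead.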
A biomolecule of the invention may be any molecule that is produced by a cell or living organism, and may have any biochemical property (e.g. phosphorylated proteins, positively charged molecules, negatively charged molecules, hydrophobicity, hydrophilicity), but preferably biochemical properties that allow binding of the biomolecule to a biologically active surface comprising positively charged quaternary ammonium groups after denaturation in 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% Ampholine and dilution in 0.1 M Tris-HCl, 0.02% Triton X-100 at pH 8.5 at 0 to 4°C followed by incubation on said biologically active surface for 120 minutes at 20 to 24°C. Such molecules include, but are not limited to, molecules comprising nucleotides, amino acids, sugars, fatty acids, steroids, nucleic acids, polynucleotides (DNA or RNA), polypeptides, proteins, antibodies, carbohydrates, lipids, and combinations thereof (e.g., glycoproteins, ribonucleoproteins, lipoproteins). Preferably a biomolecule may be a nucleotide, polynucleotide, peptide, protein or fragments thereof. Even more preferred are peptide or protein biomolecules.
The biomolecules of the invention can be detected based on specific sample pre-treatment conditions, the pH of binding conditions, the type of biologically active surface used for the detection of biomolecules within a given sample and their molecular mass. For example, prior to the detection of the biomolecules described herein, a given sample is pre-treated by diluting 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% Ampholine. The denatured sample is then diluted 1:10 in 0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5, applied to a biologically active surface comprising positively-charged quaternary ammonium groups (cationic) and incubated using specific buffer conditions (0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5) to allow for binding of said biomolecules to the above-mentioned biologically active surface. It should be noted that although the biomolecules of the invention are detected using a cationic adsorbent comprising positively charged quaternary ammonium groups, as well as specific pre-treatment and binding conditions, the biomolecules are capable of binding other types of adsorbents, as described below, using alternative pre-treatment and binding conditions known to those skilled in the art. Accordingly, some embodiments of the invention are not limited to the use of cationic adsorbents.
The biomolecules of the invention include biomolecules having a molecular mass selected from the group consisting of 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da ± 33 Da, 6852 Da ± 34 Da, 6897 Da ± 34 Da, 6999 Da ± 35 Da, 7575 Da ± 38 Da, 7657 Da ± 38 Da, 8076 Da ± 40 Da, 8215 Da ± 41 Da, 8474 Da ± 42 Da, 8574 Da ± 43 Da, 8702 Da ± 44 Da, 8780 Da ± 44 Da, 8922 Da ± 45 Da, 9078 Da ± 45 Da, 9143 Da ± 46 Da, 9201 Da ± 46 Da, 9359 Da ± 47 Da, 9425 Da ± 47 Da, 9581 Da ± 48 Da, 9641 Da ± 48 Da, 9718 Da ± 49 Da, 9930 Da ± 50 Da, 10215 Da ± 51 Da, 10369 Da ± 52 Da, 10440 Da ± 52 Da, 10594 Da ± 53 Da, 11216 Da ± 56 Da, 11464 Da ± 57 Da, 11547 Da ± 58 Da, 11693 Da ± 58 Da, 11905 Da ± 60 Da, 12470 Da ± 62 Da, 12619 Da ± 63 Da, 12828 Da ± 64 Da, 13290 Da ± 66 Da, 13632 Da ± 68 Da, 13784 Da ± 69 Da, 13983 Da ± 70 Da, 14798 Da ± 74 Da, 15005 Da ± 75 Da, 15140 Da ± 76 Da, 15350 Da ± 77 Da, 15879 Da ± 79 Da, 15957 Da ± 80 Da, 16104 Da ± 81 Da, 16164 Da ± 81 Da, 16953 Da ± 85 Da, 17263 Da ± 86 Da, 17397 Da ± 87 Da, 17617 Da ± 88 Da, 17766 Da ± 89 Da, 17890 Da ± 89 Da, 18115 Da ± 91 Da, 18390 Da ± 92 Da, 22338 Da ± 112 Da, 22466 Da ± 112 Da, 22676 Da ± 113 Da, 22951 Da ± 115 Da, 24079 Da ± 120 Da, 28055 Da ± 140 Da, or 28259 Da ± 141 Da.
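To illustrate how an observed peak might be assigned to one of the listed mass windows, the following sketch checks an observed mass against a small subset of the masses and tolerances given above. Only the numeric values come from the list; the dictionary layout, the function names and the observed masses are hypothetical.

```python
# Subset of the listed biomolecules: nominal mass in Da -> tolerance in Da.
MARKER_WINDOWS_DA = {
    2020: 10, 2049: 10, 4242: 21, 5772: 29,
    9143: 46, 9201: 46, 22951: 115,
}


def matching_markers(observed_mass, windows=MARKER_WINDOWS_DA):
    """Nominal masses whose window (nominal ± tolerance) contains the observed mass."""
    return [nominal for nominal, tol in sorted(windows.items())
            if abs(observed_mass - nominal) <= tol]


def annotate_profile(observed_masses, windows=MARKER_WINDOWS_DA):
    """Map every observed peak mass to the list of markers it could represent."""
    return {m: matching_markers(m, windows) for m in observed_masses}


for mass, hits in annotate_profile([2025.3, 5800.0, 5802.0, 9160.0]).items():
    print(mass, hits)
```

Some neighbouring windows in the list overlap (for example 9143 Da ± 46 Da and 9201 Da ± 46 Da), so the sketch returns every candidate rather than forcing a single assignment; how such ambiguous peaks are resolved is left open here.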
According to the invention, a biomolecule with the molecular mass of 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da ± 33 Da, 6852 Da ± 34 Da, 6897 Da ± 34 Da, 6999 Da ± 35 Da, 7575 Da ± 38 Da, 7657 Da ± 38 Da, 8076 Da ± 40 Da, 8215 Da ± 41 Da, 8474 Da ± 42 Da, 8574 Da ± 43 Da, 8702 Da ± 44 Da, 8780 Da ± 44 Da, 8922 Da ± 45 Da, 9078 Da ± 45 Da, 9143 Da ± 46 Da, 9201 Da ± 46 Da, 9359 Da ± 47 Da, 9425 Da ± 47 Da, 9581 Da ± 48 Da, 9641 Da ± 48 Da, 9718 Da ± 49 Da, 9930 Da ± 50 Da, 10215 Da ± 51 Da, 10369 Da ± 52 Da, 10440 Da ± 52 Da, 10594 Da ± 53 Da, 11216 Da ± 56 Da, 11464 Da ± 57 Da, 11547 Da ± 58 Da, 11693 Da ± 58 Da, 11905 Da ± 60 Da, 12470 Da ± 62 Da, 12619 Da ± 63 Da, 12828 Da ± 64 Da, 13290 Da ± 66 Da, 13632 Da ± 68 Da, 13784 Da ± 69 Da, 13983 Da ± 70 Da, 14798 Da ± 74 Da, 15005 Da ± 75 Da, 15140 Da ± 76 Da, 15350 Da ± 77 Da, 15879 Da ± 79 Da, 15957 Da ± 80 Da, 16104 Da ± 81 Da, 16164 Da ± 81 Da, 16953 Da ± 85 Da, 17263 Da ± 86 Da, 17397 Da ± 87 Da, 17617 Da ± 88 Da, 17766 Da ± 89 Da, 17890 Da ± 89 Da, 18115 Da ± 91 Da, 18390 Da ± 92 Da, 22338 Da ± 112 Da, 22466 Da ± 112 Da, 22676 Da ± 113 Da, 22951 Da ± 115 Da, 24079 Da ± 120 Da, 28055 Da ± 140 Da, or 28259 Da ± 141 Da is detected by diluting the biological sample 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% Ampholine, and then 1:10 in binding buffer consisting of 0.1 M Tris-HCl, 0.02% Triton X-100 at pH 8.5 at 0 to 4°C, applying thus treated sample to a biologically active surface comprising positively charged (cationic) quaternary ammonium groups (anion exchanging), incubating for 120 minutes at 20 to 24°C, and subjecting the bound biomolecules to gas phase ion spectrometry as described in another section.
Although said biomolecules were first identified in blood serum samples, their detection is not limited to said sample type. The biomolecules may also be detected in other sample types, such as blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract. Preferably, samples are of blood, blood serum, plasma, urine, excreta, prostatic fluid, biopsy, ascites, lymph or tissue extract origin. More preferred are blood, blood serum, plasma, urine, excreta, biopsy, lymph or tissue extract samples. Even more preferred are blood serum, urine, excreta or biopsy samples. Overall preferred are blood serum samples.
Since the biomolecules can be sufficiently characterized by their mass and biochemical characteristics such as the type of biologically active surface they bind to or the pH of binding conditions, it is not necessary to determine their absolute identity in order to be able to detect them in a sample. It should be noted that molecular mass and binding properties are characteristic properties of these biomolecules and not limitations on the means of detection or isolation. Furthermore, using the methods described herein, or other methods known in the art, the absolute identity of the markers can be determined. This is important when one wishes to develop and/or screen for specific binding molecules, or to develop an assay for the detection of said biomolecules using specific binding molecules.
d) Biologically Active Surfaces
In one embodiment of the invention, biologically active surfaces include, but are not restricted to, surfaces that contain adsorbents such as quaternary ammonium groups (anion exchange surfaces), carboxylate groups (cation exchange surfaces), alkyl or aryl chains (hydrophobic interaction, reverse phase chemistry), groups such as nitrilotriacetic acid that immobilize metal ions such as nickel, gallium, copper, or zinc (metal affinity interaction), or biomolecules such as proteins, preferably antibodies, or nucleic acids, preferably protein binding sequences, covalently bound to the surface via carbonyl diimidazole moieties or epoxy groups (specific affinity interaction). Preferred are adsorbents comprising anion exchange surfaces.
These surfaces may be located on matrices like polysaccharides such as sepharose, e.g. anion exchange surfaces or hydrophobic interaction surfaces, or solid metals, e.g. antibodies coupled to magnetic beads. Surfaces may also include gold-plated surfaces such as those used for Biacore Sensor Chip technology. Other surfaces known to those skilled in the art are also included within the scope of the invention.
Biologically active surfaces are able to adsorb biomolecules like amino acids, sugars, fatty acids, steroids, nucleic acids, polynucleotides, polypeptides, carbohydrates, lipids, and combinations thereof (e.g., glycoproteins, ribonucleoproteins, lipoproteins).
In another embodiment, devices that use biologically active surfaces to selectively adsorb biomolecules may be chromatography columns for Fast Protein Liquid Chromatography (FPLC) and High Pressure Liquid Chromatography (HPLC), where the matrix, e.g. a polysaccharide, carrying the biologically active surface, is filled into vessels (usually referred to as "columns") made of glass, steel, or synthetic materials like polyetheretherketone (PEEK).
In yet another embodiment, devices that use biologically active surfaces to selectively adsorb biomolecules may be metal strips carrying thin layers of the biologically active surface on one or more spots of the strip surface to be used as probes for gas phase ion spectrometry analysis, for example the SAX2 ProteinChip array (Ciphergen Biosystems, Inc.) for SELDI analysis.
e) Mass Profiling
In one embodiment, the mass profile of a sample may be generated using an array-based assay in which the biomolecules of a given sample are bound by biochemical or affinity interactions to an adsorbent present on a biologically active surface located on a solid platform ("array" or "probe"). After the biomolecules have bound to the adsorbent, they are detected using gas phase ion spectrometry. Biomolecules or other substances bound to the adsorbents on the probes can be analyzed using a gas phase ion spectrometer. This includes, e.g., mass spectrometers, ion mobility spectrometers, or total ion current measuring devices. The quantity and characteristics of the biomolecule can be determined using gas phase ion spectrometry. Other substances in addition to the biomolecule of interest can also be detected by gas phase ion spectrometry.
In one embodiment, a mass spectrometer can be used to detect biomolecules on the probe. In a typical mass spectrometer, a probe with a biomolecule is introduced into an inlet system of the mass spectrometer. The biomolecule is then ionized by an ionization source, such as a laser, fast atom bombardment, or plasma. The generated ions are collected by an ion optic assembly, and then a mass analyzer disperses and analyzes the passing ions. Within the scope of this invention, the ionisation source that ionises the biomolecule is a laser.
The ions exiting the mass analyzer are detected by an ion detector. The ion detector then translates information of the detected ions into mass-to-charge ratios. Detection of the presence of a biomolecule or other substances will typically involve detection of signal intensity. This, in turn, can reflect the quantity and character of a biomolecule bound to the probe.
In another embodiment, the mass profile of a sample may be generated using a liquid-chromatography (LC)-based assay in which the biomolecules of a given sample are bound by biochemical or affinity interactions to an adsorbent located in a vessel made of glass, steel, or synthetic material, known to those skilled in the art as a chromatography column. The biomolecules are eluted from the biologically active surface by washing the vessel with appropriate solutions known to those skilled in the art. Such solutions include, but are not limited to, buffers, e.g. Tris (hydroxymethyl) aminomethane hydrochloride (TRIS-HCl), buffers containing salt, e.g. sodium chloride (NaCl), or organic solvents, e.g. acetonitrile. Biomolecule mass profiles are generated by application of the eluting biomolecules of the sample by direct connection via an electrospray device to a mass spectrometer (LC/ESI-MS).
Conditions that promote binding of biomolecules to an adsorbent are known to those skilled in the art (reference) and ordinarily include parameters such as pH, the concentration of salt, organic solvent, or other competitors for binding of the biomolecule to the adsorbent. Within the scope of the invention, incubation temperatures are from 0 to 100°C, preferably from 4 to 60°C, and most preferably from 15 to 30°C. Varying additional parameters, such as incubation time, the concentration of detergent, e.g., 3-[(3-Cholamidopropyl) dimethylammonio]-2-hydroxy-1-propanesulfonate (CHAPS), or reducing agents, e.g. dithiothreitol (DTT), are also known to those skilled in the art. Various degrees of binding can be accomplished by combining the above stated conditions as needed, and will be readily apparent to those skilled in the art.
f) Methods for detecting biomolecules within a sample
In yet another aspect, the invention relates to methods for detecting differentially present biomolecules in a test sample and/or biological sample. Within the context of the invention, any suitable method can be used to detect one or more of the biomolecules described herein. For example, gas phase ion spectrometry can be used. This technique includes, e.g., laser desorption/ionization mass spectrometry. Preferably, the test and/or biological sample is prepared prior to gas phase ion spectrometry, e.g., by pre-fractionation, two-dimensional gel electrophoresis, high performance liquid chromatography, etc., to assist detection of said biomolecules. Detection of said biomolecules can also be achieved using methods other than gas phase ion spectrometry. For example, immunoassays can be used to detect the biomolecules within a sample.
In one embodiment, the test and/or biological sample is prepared prior to contacting a biologically active surface and is in aqueous form. Examples of said samples include, but are not limited to, blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, tears, saliva, sweat, ascites, cerebrospinal fluid, milk, lymph, or tissue extract samples. Furthermore, solid test and/or biological samples, such as excreta or biopsy samples, can be solubilised in or admixed with an eluent using methods known to those skilled in the art such that said samples may be easily applied to a biologically active surface. Test and/or biological samples in the aqueous form can be further prepared using specific solutions for denaturation (pre-treatment) like sodium dodecyl sulfate, mercaptoethanol, urea, etc. For example, a test and/or biological sample of the invention can be denatured prior to contacting a biologically active surface comprising quaternary ammonium groups by diluting said sample 1:5 with a buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT and 2% ampholine.
The sample is contacted with a biologically active surface using any technique, including bathing, soaking, dipping, spraying, washing over, or pipetting. Generally, a volume of sample containing from a few attomoles to 100 picomoles of a biomolecule in about 1 to 500 μl is sufficient for detecting binding of the biomolecule to the adsorbent.
The pH value of the solvent in which the sample contacts the biologically active surface is a function of the specific sample and the selected biologically active surface. Typically, a sample is contacted with a biologically active surface under pH values between 0 and 14, preferably between about 4 and 10, more preferably between 4.5 and 9.0, and most preferably at pH 8.5. The pH value depends on the type of adsorbent present on a biologically active surface and can be adjusted accordingly.
The sample can contact the adsorbent present on a biologically active surface for a period of time sufficient to allow the marker to bind to the adsorbent. Typically, the sample and the biologically active surface are contacted for a period of between about 1 second and about 12 hours, preferably between about 30 seconds and about 3 hours, and most preferably for 120 minutes.
The temperature at which the sample contacts the biologically active surface (incubation temperature) is a function of the specific sample and the selected biologically active surface. Typically, the sample is contacted with the biologically active surface at a temperature of between 0 and 100°C, preferably between 4 and 37°C, and most preferably between 20 and 24°C.
For example, a biologically active surface comprising quaternary ammonium groups (anion exchange surface) will bind the biomolecules described herein when the pH value is between 6.5 and 9.0. Optimal binding of the biomolecules of the present invention occurs at a pH of 8.5. Furthermore, a sample is contacted with said biologically active surface for 120 min at a temperature of 20 to 24°C.
Following contacting a sample or sample solution with a biologically active surface, it is preferred to remove any unbound biomolecules so that only the bound biomolecules remain on the biologically active surface. Unbound biomolecules are removed by methods known to those skilled in the art such as bathing, soaking, dipping, rinsing, spraying, or washing the biologically active surface with an eluent or a washing solution. A microfluidics process is preferably used when a washing solution such as an eluent is introduced to small spots of adsorbents on the biologically active surface. Typically, the washing solution can be at a temperature of between 0 and 100°C, preferably between 4 and 37°C, and most preferably between 20 and 24°C.
Washing solutions or eluents used to wash the unbound biomolecules from a biologically active surface include, but are not limited to, organic solutions and aqueous solutions such as buffers, wherein a buffer may contain detergents, salts, or reducing agents in appropriate concentrations known to those skilled in the art.
Aqueous solutions are preferred for washing biologically active surfaces. Exemplary aqueous solutions include, but are not limited to, HEPES buffer, Tris buffer, phosphate buffered saline (PBS), and modifications thereof. The selection of a particular washing solution or an eluent is dependent on other experimental conditions (e.g., types of adsorbents used or biomolecules to be detected), and can be determined by those of skill in the art. For example, if a biologically active surface comprising a quaternary ammonium group as adsorbent (anion exchange surface) is used, then an aqueous solution, such as a Tris buffer, may be preferred. In another example, if a biologically active surface comprising a carboxylate group as adsorbent (cation exchange surface) is used, then an aqueous solution, such as an acetate buffer, may be preferred.
Optionally, an energy absorbing molecule (EAM), e.g. in solution, can be applied to biomolecules or other substances bound on the biologically active surface by spraying, pipetting or dipping. Applying an EAM can be done after unbound materials are washed off of the biologically active surface. Exemplary energy absorbing molecules include, but are not limited to, cinnamic acid derivatives, sinapinic acid and dihydroxybenzoic acid.
Once the biologically active surface is free of any unbound biomolecules, adsorbent-bound biomolecules are detected using gas phase ion spectrometry. The quantity and characteristics of a biomolecule can be determined using said method. Furthermore, said biomolecules can be analyzed using a gas phase ion spectrometer such as mass spectrometers, ion mobility spectrometers, or total ion current measuring devices. Other gas phase ion spectrometers known to those skilled in the art are also included.
In one embodiment, mass spectrometry can be used to detect biomolecules of a given sample present on a biologically active surface. Such methods include, but are not limited to, matrix-assisted laser desorption ionization/time-of-flight (MALDI-TOF), surface-enhanced laser desorption ionization time-of-flight (SELDI-TOF), liquid chromatography coupled with MS, MS-MS, or ESI-MS. Typically, biomolecules are analysed by introducing a biologically active surface containing said biomolecules into a mass spectrometer and ionizing said biomolecules to generate ions that are collected and analysed.
In a preferred embodiment, the biomolecules present in a sample are detected using gas phase ion spectrometry, and more preferably, using mass spectrometry. In one embodiment, matrix-assisted laser desorption ionization ("MALDI") mass spectrometry can be used. In MALDI, the sample is typically quasi-purified to obtain a fraction that essentially consists of a marker using separation methods such as two-dimensional gel electrophoresis or high performance liquid chromatography (HPLC).
In another embodiment, surface-enhanced laser desorption/ionization mass spectrometry ("SELDI") can be used. SELDI uses a substrate comprising adsorbents to capture biomolecules, which can then be directly desorbed and ionized from the substrate surface during mass spectrometry. Since the substrate surface in SELDI captures biomolecules, a sample need not be quasi-purified as in MALDI. However, depending on the complexity of a sample and the type of adsorbents used, it may be desirable to prepare a sample to reduce its complexity prior to SELDI analysis.
For example, biomolecules bound to a biologically active surface can be introduced into an inlet system of the mass spectrometer. The biomolecules are then ionized by an ionization source such as a laser, fast atom bombardment, or plasma. The generated ions are then collected by an ion optic assembly, and then a mass analyzer disperses the passing ions. The ions exiting the mass analyzer are detected by a detector and translated into mass-to-charge ratios. Detection of the presence of a biomolecule typically involves detection of its specific signal intensity, and reflects the quantity and character of said biomolecule.
In a preferred embodiment, a laser desorption time-of-flight mass spectrometer is used with the probe of the present invention. In laser desorption mass spectrometry, biomolecules bound to a biologically active surface are introduced into an inlet system. Biomolecules are desorbed and ionized into the gas phase by a laser. The ions generated are then collected by an ion optic assembly. These ions are accelerated through a short high voltage field and allowed to drift into a high vacuum chamber of a time-of-flight mass analyzer. At the far end of the high vacuum chamber, the accelerated ions strike a sensitive detector surface at different times. Since the time-of-flight is a function of the mass of the ions, the elapsed time between ionization and impact can be used to identify the presence or absence of molecules of a specific mass.
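The dependence of flight time on mass can be illustrated with a minimal sketch. It assumes an idealised linear time-of-flight analyzer in which an ion of charge z is accelerated through a potential U and then drifts a field-free length L, so that z·e·U = ½·m·v² and t = L / v. The function names, the acceleration voltage of 20 kV and the drift length of 1 m are illustrative assumptions and are not taken from the present description.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in coulombs
DALTON = 1.66053906660e-27   # unified atomic mass unit in kilograms

def flight_time(mass_da, charge=1, voltage=20_000.0, drift_length=1.0):
    """Drift time in seconds for an ion in an idealised linear TOF analyzer.

    voltage (V) and drift_length (m) are assumed example values.
    """
    mass_kg = mass_da * DALTON
    velocity = math.sqrt(2 * charge * E_CHARGE * voltage / mass_kg)
    return drift_length / velocity

def mass_from_time(time_s, charge=1, voltage=20_000.0, drift_length=1.0):
    """Invert flight_time: recover the mass in daltons from a drift time."""
    velocity = drift_length / time_s
    return 2 * charge * E_CHARGE * voltage / velocity ** 2 / DALTON

t_light = flight_time(2020.0)   # lighter ion arrives first
t_heavy = flight_time(8080.0)   # four times the mass, twice the flight time
```

Under these assumptions the flight time scales with the square root of the mass-to-charge ratio, which is why ions of different mass reach the detector at different times.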
The detection of biomolecules described herein can be enhanced using certain selectivity conditions (e.g., types of adsorbents used or washing solutions). In a preferred embodiment, the same or substantially the same selectivity conditions that were used to discover the biomolecules can be used in the methods for detecting a biomolecule in a sample.
Combinations of the laser desorption time-of-flight mass spectrometer with other components described herein, in the assembly of a mass spectrometer that employs various means of desorption, acceleration, detection, measurement of time, etc., are known to those skilled in the art.
Data generated by desorption and detection of markers can be analyzed with the use of a programmable digital computer. The computer program generally contains a readable medium that stores codes. Certain codes can be devoted to memory that includes the location of each feature on a biologically active surface, the identity of the adsorbent at that feature and the elution conditions used to wash the adsorbent. Using this information, the program can then identify the set of features on the biologically active surface defining certain selectivity characteristics (e.g., types of adsorbent and eluents used). The computer also contains codes that receive as input data on the strength of the signal at various molecular masses received from a particular addressable location on the biologically active surface. This data can indicate the number of biomolecules detected, as well as the strength of the signal and the determined molecular mass for each biomolecule detected.
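One possible way to organise such stored information is sketched below. The class names, field names and example values are hypothetical and serve only to illustrate how the location of a feature, its adsorbent, its elution conditions and the detected signals might be held together and filtered by selectivity characteristics.

```python
from dataclasses import dataclass, field

@dataclass
class Peak:
    mass_da: float     # determined molecular mass of a detected biomolecule
    intensity: float   # strength of the signal at that mass

@dataclass
class Feature:
    location: str      # addressable location on the surface (hypothetical label)
    adsorbent: str     # identity of the adsorbent at that feature
    eluent: str        # elution conditions used to wash the adsorbent
    peaks: list = field(default_factory=list)

def select_features(features, adsorbent=None, eluent=None):
    """Return the features that match the requested selectivity characteristics."""
    return [
        f for f in features
        if (adsorbent is None or f.adsorbent == adsorbent)
        and (eluent is None or f.eluent == eluent)
    ]

# Invented example data, for illustration only.
features = [
    Feature("A1", "quaternary ammonium", "Tris buffer",
            [Peak(2020.4, 12.5), Peak(4103.9, 3.1)]),
    Feature("A2", "carboxylate", "acetate buffer", [Peak(5493.2, 7.8)]),
]
anion_exchange = select_features(features, adsorbent="quaternary ammonium")
```

The number of entries in `peaks` then corresponds to the number of biomolecules detected at a given location.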
Data analysis can include the steps of determining signal strength (e.g., height of peaks) of a biomolecule detected and removing "outliers" (data deviating from a predetermined statistical distribution). For example, the observed peaks can be normalized, a process whereby the height of each peak relative to some reference is calculated. For example, a reference can be background noise generated by instrument and chemicals (e.g., energy absorbing molecule), which is set as zero in the scale. Then the signal strength detected for each biomolecule can be displayed in the form of relative intensities in the scale desired (e.g., 100). Alternatively, a standard may be admitted with the sample so that a peak from the standard can be used as a reference to calculate relative intensities of the signals observed for each biomolecule or other biomolecules detected.
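The normalization and outlier steps may be sketched as follows. This is a minimal illustration under stated assumptions: the noise level is treated as a single constant, the reference is either the largest peak or the peak of an added standard, and outliers are defined by the median absolute deviation, which is only one of many possible statistical criteria and is not prescribed by the present description. All function and parameter names are hypothetical.

```python
import statistics

def normalize_peaks(heights, noise_level, reference_height=None, scale=100.0):
    """Express peak heights relative to a reference on a 0..scale axis.

    The background noise is set to zero; the reference is the largest peak
    unless the height of a standard peak is supplied.
    """
    corrected = [max(h - noise_level, 0.0) for h in heights]
    if reference_height is None:
        reference = max(corrected)
    else:
        reference = reference_height - noise_level
    if reference <= 0:
        raise ValueError("reference peak must exceed the noise level")
    return [scale * c / reference for c in corrected]

def drop_outliers(values, max_deviations=3.0):
    """Keep values within max_deviations median absolute deviations of the median."""
    med = statistics.median(values)
    mad = statistics.median([abs(v - med) for v in values])
    if mad == 0:
        return list(values)
    return [v for v in values if abs(v - med) <= max_deviations * mad]

relative = normalize_peaks([12.0, 52.0, 27.0], noise_level=2.0)
replicates = drop_outliers([10.0, 11.0, 10.2, 10.5, 50.0])
```

With the example values, the three peaks are expressed as 20, 100 and 50 on a scale of 100, and the replicate reading of 50.0 is discarded.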
The computer can transform the resulting data into various formats for displaying. In one format, referred to as "spectrum view", a standard spectral view can be displayed, wherein the view depicts the quantity of a biomolecule reaching the detector at each particular molecular mass. In another format, referred to as "scatter plot", only the peak height and mass information are retained from the spectrum view, yielding a cleaner image and enabling biomolecules with nearly identical molecular mass to be more visible.
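The reduction from a spectrum view to a scatter plot may be sketched as a simple search for local maxima. The function name, the height threshold and the example spectrum are assumptions for illustration; practical software typically applies smoothing and baseline correction first.

```python
def pick_peaks(masses, intensities, min_height=0.0):
    """Return (mass, height) pairs for local maxima above min_height."""
    peaks = []
    for i in range(1, len(intensities) - 1):
        here = intensities[i]
        if (here > min_height
                and here > intensities[i - 1]
                and here >= intensities[i + 1]):
            peaks.append((masses[i], here))
    return peaks

# Invented spectrum: intensity sampled at seven mass values.
masses = [2000, 2010, 2020, 2030, 2040, 2050, 2060]
intensities = [1.0, 4.0, 9.0, 3.0, 2.0, 6.0, 1.5]
scatter = pick_peaks(masses, intensities, min_height=2.5)
```

Only the mass and height of each retained peak remain, which is the information a scatter plot displays.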
Using any of the above display formats, it can be readily determined from the signal display whether a biomolecule having a particular molecular mass is detected from a sample. Preferred biomolecules of the invention are biomolecules with an apparent molecular mass of about 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da ± 33 Da, 6852 Da ± 34 Da, 6897 Da ± 34 Da, 6999 Da ± 35 Da, 7575 Da ± 38 Da, 7657 Da ± 38 Da, 8076 Da ± 40 Da, 8215 Da ± 41 Da, 8474 Da ± 42 Da, 8574 Da ± 43 Da, 8702 Da ± 44 Da, 8780 Da ± 44 Da, 8922 Da ± 45 Da, 9078 Da ± 45 Da, 9143 Da ± 46 Da, 9201 Da ± 46 Da, 9359 Da ± 47 Da, 9425 Da ± 47 Da, 9581 Da ± 48 Da, 9641 Da ± 48 Da, 9718 Da ± 49 Da, 9930 Da ± 50 Da, 10215 Da ± 51 Da, 10369 Da ± 52 Da, 10440 Da ± 52 Da, 10594 Da ± 53 Da, 11216 Da ± 56 Da, 11464 Da ± 57 Da, 11547 Da ± 58 Da, 11693 Da ± 58 Da, 11905 Da ± 60 Da, 12470 Da ± 62 Da, 12619 Da ± 63 Da, 12828 Da ± 64 Da, 13290 Da ± 66 Da, 13632 Da ± 68 Da, 13784 Da ± 69 Da, 13983 Da ± 70 Da, 14798 Da ± 74 Da, 15005 Da ± 75 Da, 15140 Da ± 76 Da, 15350 Da ± 77 Da, 15879 Da ± 79 Da, 15957 Da ± 80 Da, 16104 Da ± 81 Da, 16164 Da ± 81 Da, 16953 Da ± 85 Da, 17263 Da ± 86 Da, 17397 Da ± 87 Da, 17617 Da ± 88 Da, 17766 Da ± 89 Da, 17890 Da ± 89 Da, 18115 Da ± 91 Da, 18390 Da ± 92 Da, 22338 Da ± 112 Da, 22466 Da ± 112 Da, 22676 Da ± 113 Da, 22951 Da ± 115 Da, 24079 Da ± 120 Da, 28055 Da ± 140 Da, or 28259 Da ± 141 Da. Moreover, from the strength of signal, the amount of a biomolecule bound on the biologically active surface can be determined.
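Deciding whether an observed peak corresponds to one of the listed molecular masses amounts to a window comparison, which may be sketched as follows. Only a small subset of the masses and margins listed above is included for brevity; the function name and the observed values are hypothetical.

```python
# Subset of the disclosed masses (Da) mapped to their stated margins (Da).
MARKERS_DA = {2020: 10, 2049: 10, 4103: 21, 5493: 27, 28259: 141}

def match_markers(observed_mass, markers=MARKERS_DA):
    """Return the disclosed masses whose window contains the observed mass."""
    return sorted(
        mass for mass, margin in markers.items()
        if abs(observed_mass - mass) <= margin
    )

hit = match_markers(4110.5)    # falls within 4103 Da ± 21 Da
miss = match_markers(2035.0)   # lies between the 2020 Da and 2049 Da windows
```

A peak observed at 4110.5 Da is assigned to the 4103 Da marker, whereas a peak at 2035.0 Da matches none of the entries in this subset.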
g) Identification of proteins
In case the biomolecules of the invention are proteins, the present invention comprises a method for the identification of these proteins, especially by obtaining their amino acid sequence. This method comprises the purification of said proteins from the complex biological sample (blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, tears, saliva, sweat, ascites, cerebrospinal fluid, milk, lymph, or tissue extract samples) by fractionating said sample using techniques known to one of ordinary skill in the art, most preferably protein chromatography (FPLC, HPLC).
The biomolecules of the invention include those proteins with a molecular mass selected from 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da ± 33 Da, 6852 Da ± 34 Da, 6897 Da ± 34 Da, 6999 Da ± 35 Da, 7575 Da ± 38 Da, 7657 Da ± 38 Da, 8076 Da ± 40 Da, 8215 Da ± 41 Da, 8474 Da ± 42 Da, 8574 Da ± 43 Da, 8702 Da ± 44 Da, 8780 Da ± 44 Da, 8922 Da ± 45 Da, 9078 Da ± 45 Da, 9143 Da ± 46 Da, 9201 Da ± 46 Da, 9359 Da ± 47 Da, 9425 Da ± 47 Da, 9581 Da ± 48 Da, 9641 Da ± 48 Da, 9718 Da ± 49 Da, 9930 Da ± 50 Da, 10215 Da ± 51 Da, 10369 Da ± 52 Da, 10440 Da ± 52 Da, 10594 Da ± 53 Da, 11216 Da ± 56 Da, 11464 Da ± 57 Da, 11547 Da ± 58 Da, 11693 Da ± 58 Da, 11905 Da ± 60 Da, 12470 Da ± 62 Da, 12619 Da ± 63 Da, 12828 Da ± 64 Da, 13290 Da ± 66 Da, 13632 Da ± 68 Da, 13784 Da ± 69 Da, 13983 Da ± 70 Da, 14798 Da ± 74 Da, 15005 Da ± 75 Da, 15140 Da ± 76 Da, 15350 Da ± 77 Da, 15879 Da ± 79 Da, 15957 Da ± 80 Da, 16104 Da ± 81 Da, 16164 Da ± 81 Da, 16953 Da ± 85 Da, 17263 Da ± 86 Da, 17397 Da ± 87 Da, 17617 Da ± 88 Da, 17766 Da ± 89 Da, 17890 Da ± 89 Da, 18115 Da ± 91 Da, 18390 Da ± 92 Da, 22338 Da ± 112 Da, 22466 Da ± 112 Da, 22676 Da ± 113 Da, 22951 Da ± 115 Da, 24079 Da ± 120 Da, 28055 Da ± 140 Da, and 28259 Da ± 141 Da.
Furthermore, the method comprises the analysis of the fractions for the presence and purity of said proteins by the method which was used to identify them as differentially expressed biomolecules, for example two-dimensional gel electrophoresis or SELDI mass spectrometry, but most preferably SELDI mass spectrometry. The method also comprises an analysis of the purified proteins aimed at revealing their amino acid sequence. This analysis may be performed using techniques in mass spectrometry known to those skilled in the art.
In one embodiment, this analysis may be performed using peptide mass fingerprinting, revealing information about the specific peptide mass profile after proteolytic digestion of the investigated protein.
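The peptide mass profile used in peptide mass fingerprinting can be illustrated by an in silico digestion. The sketch below assumes trypsin as the protease, with the commonly used rule of cleavage after lysine (K) or arginine (R) unless the next residue is proline (P), no missed cleavages, unmodified residues and monoisotopic masses. The protease, the function names and the example sequence are assumptions for illustration and are not specified in the present description.

```python
# Monoisotopic residue masses in Da for the 20 standard amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # each free peptide carries one additional water

def tryptic_peptides(sequence):
    """Cleave after K or R unless followed by P (no missed cleavages)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])
    return peptides

def peptide_mass(peptide):
    """Monoisotopic mass in Da of an unmodified, uncharged peptide."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER

def mass_fingerprint(sequence):
    """Sorted list of peptide masses obtained from the digest."""
    return sorted(peptide_mass(p) for p in tryptic_peptides(sequence))

peptides = tryptic_peptides("AKRPGK")   # hypothetical sequence
```

The resulting list of masses is the kind of profile that is compared against theoretical digests of database sequences.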
In another embodiment, this analysis may preferably be performed using post-source-decay (PSD) or MSMS, but most preferably MSMS, revealing mass information about all possible fragments of the investigated protein or proteolytic peptides thereof, leading to the amino acid sequence of the investigated protein or proteolytic peptide thereof.
The information revealed by the aforementioned techniques can be submitted to world-wide-web search engines, such as MS Fit (Protein Prospector, http://prospector.ucsf.edu) for information obtained from peptide mass fingerprinting, MS Tag (Protein Prospector, http://prospector.ucsf.edu) for information obtained from PSD, or Mascot (www.matrixscience.com) for information obtained from MSMS and peptide mass fingerprinting, for the alignment of the obtained results with data available in public protein sequence databases, such as SwissProt (http://us.expasy.org/sprot/), NCBI (http://www.ncbi.nlm.nih.gov/BLAST/), or EMBL (http://srs.embl-heidelberg.de:8000/srs5/), which yields confident information about the identity of said proteins. This information may comprise, if available, the complete amino acid sequence, the calculated molecular mass, the structure, the enzymatic activity, the physiological function, and gene expression of the investigated proteins.
h) Kits
In yet another aspect, the invention provides kits using the methods of the invention as described in the section Diagnostics for the differential diagnosis of colorectal cancer or a non-malignant disease of the large intestine, wherein the kits are used to detect the biomolecules of the present invention.
The methods used to detect the biomolecules of the invention can also be used to determine whether a subject is at risk of developing colorectal cancer or a non-malignant disease of the large intestine, or has developed a colorectal cancer or a non-malignant disease of the large intestine. Such methods may also be employed in the form of a diagnostic kit comprising an antibody specific to a biomolecule of the invention or a biologically active surface described herein, which may be conveniently used, for example, in clinical settings to diagnose patients exhibiting symptoms or a family history of a non-steroid dependent cancer. Such diagnostic kits also include solutions and materials necessary for the detection of a biomolecule of the invention, and instructions to use the kit based on the above-mentioned methods.
The biomolecules of the invention include those proteins with a molecular mass selected from 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da ± 33 Da, 6852 Da ± 34 Da, 6897 Da ± 34 Da, 6999 Da ± 35 Da, 7575 Da ± 38 Da, 7657 Da ± 38 Da, 8076 Da ± 40 Da, 8215 Da ± 41 Da, 8474 Da ± 42 Da, 8574 Da ± 43 Da, 8702 Da ± 44 Da, 8780 Da ± 44 Da, 8922 Da ± 45 Da, 9078 Da ± 45 Da, 9143 Da ± 46 Da, 9201 Da ± 46 Da, 9359 Da ± 47 Da, 9425 Da ± 47 Da, 9581 Da ± 48 Da, 9641 Da ± 48 Da, 9718 Da ± 49 Da, 9930 Da ± 50 Da, 10215 Da ± 51 Da, 10369 Da ± 52 Da, 10440 Da ± 52 Da, 10594 Da ± 53 Da, 11216 Da ± 56 Da, 11464 Da ± 57 Da, 11547 Da ± 58 Da, 11693 Da ± 58 Da, 11905 Da ± 60 Da, 12470 Da ± 62 Da, 12619 Da ± 63 Da, 12828 Da ± 64 Da, 13290 Da ± 66 Da, 13632 Da ± 68 Da, 13784 Da ± 69 Da, 13983 Da ± 70 Da, 14798 Da ± 74 Da, 15005 Da ± 75 Da, 15140 Da ± 76 Da, 15350 Da ± 77 Da, 15879 Da ± 79 Da, 15957 Da ± 80 Da, 16104 Da ± 81 Da, 16164 Da ± 81 Da, 16953 Da ± 85 Da, 17263 Da ± 86 Da, 17397 Da ± 87 Da, 17617 Da ± 88 Da, 17766 Da ± 89 Da, 17890 Da ± 89 Da, 18115 Da ± 91 Da, 18390 Da ± 92 Da, 22338 Da ± 112 Da, 22466 Da ± 112 Da, 22676 Da ± 113 Da, 22951 Da ± 115 Da, 24079 Da ± 120 Da, 28055 Da ± 140 Da, or 28259 Da ± 141 Da. For example, the kits can be used to detect one or more of the differentially present biomolecules as described above in a test sample of a subject. The kits of the invention have many applications.
For example, the kits can be used to differentiate whether a subject is healthy or has a precancerous lesion of the large intestine, a colorectal cancer, a metastasized colorectal cancer or a non-malignant disease of the large intestine, thus aiding the diagnosis of colorectal cancer or a non-malignant disease of the large intestine. In another example, the kits can be used to identify compounds that modulate expression of said biomolecules.
In one embodiment, a kit comprises an adsorbent on a biologically active surface, wherein the adsorbent is suitable for binding one or more biomolecules of the invention, a denaturation solution for the pre-treatment of a sample, a binding solution, a washing solution or instructions for making a denaturation solution, binding solution, or washing solution, wherein the combination allows for the detection of a biomolecule using gas phase ion spectrometry. Such kits can be prepared from the materials described in other previously detailed sections (e.g., denaturation buffer, binding buffer, adsorbents, washing solutions, etc.).
In some embodiments, the kit may comprise a first substrate comprising an adsorbent thereon (e.g., a particle functionalized with an adsorbent) and a second substrate onto which the first substrate can be positioned to form a probe, which is removably insertable into a gas phase ion spectrometer. In other embodiments, the kit may comprise a single substrate, which is in the form of a removably insertable probe with adsorbents on the substrate.
In another embodiment, a kit comprises a binding molecule that specifically binds to a biomolecule related to the invention, a detection reagent, appropriate solutions and instructions on how to use the kit. Such kits can be prepared from the materials described above, and other materials known to those skilled in the art. A binding molecule used within such a kit may include, but is not limited to, proteins, peptides, nucleotides, nucleic acids, hormones, amino acids, sugars, fatty acids, steroids, polynucleotides, carbohydrates, lipids, or a combination thereof (e.g. glycoproteins, ribonucleoproteins, lipoproteins), compounds or synthetic molecules. Preferably, a binding molecule used in said kit is an antibody.
In either embodiment, the kit may optionally further comprise a standard or control information so that the test sample can be compared with the control information standard to determine if the test amount of a marker detected in a sample is a diagnostic amount consistent with a diagnosis of colorectal cancer. The present invention also relates to the use of 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da ± 33 Da, 6852 Da ± 34 Da, 6897 Da ± 34 Da, 6999 Da ± 35 Da, 7575 Da ± 38 Da, 7657 Da ± 38 Da, 8076 Da ± 40 Da, 8215 Da ± 41 Da, 8474 Da ± 42 Da, 8574 Da ± 43 Da, 8702 Da ± 44 Da, 8780 Da ± 44 Da, 8922 Da ± 45 Da, 9078 Da ± 45 Da, 9143 Da ± 46 Da, 9201 Da ± 46 Da, 9359 Da ± 47 Da, 9425 Da ± 47 Da, 9581 Da ± 48 Da, 9641 Da ± 48 Da, 9718 Da ± 49 Da, 9930 Da ± 50 Da, 10215 Da ± 51 Da, 10369 Da ± 52 Da, 10440 Da ± 52 Da, 10594 Da ± 53 Da, 11216 Da ± 56 Da, 11464 Da ± 57 Da, 11547 Da ± 58 Da, 11693 Da ± 58 Da, 11905 Da ± 60 Da, 12470 Da ± 62 Da, 12619 Da ± 63 Da, 12828 Da ± 64 Da, 13290 Da ± 66 Da, 13632 Da ± 68 Da, 13784 Da ± 69 Da, 13983 Da ± 70 Da, 14798 Da ± 74 Da, 15005 Da ± 75 Da, 15140 Da ± 76 Da, 15350 Da ± 77 Da, 15879 Da ± 79 Da, 15957 Da ± 80 Da, 16104 Da ± 81 Da, 16164 Da ± 81 Da, 16953 Da ± 85 Da, 17263 Da ± 86 Da, 17397 Da ± 87 Da, 17617 Da ± 88 Da, 17766 Da ± 89 Da, 17890 Da ± 89 Da, 18115 Da ± 91 Da, 18390 Da ± 92 Da, 22338 Da ± 112 Da, 22466 Da ± 112 Da, 22676 Da ± 113 Da, 22951 Da ± 115 Da, 24079 Da ± 120 Da, 28055 Da ± 140 Da, or 28259 Da ± 141 Da for the manufacture of an agent for diagnosis, prophylactic and/or therapeutic treatment of non-steroid dependent cancer, preferably colorectal cancer.
The invention also relates to a method for aiding non-steroid dependent cancer diagnosis, especially colorectal cancer, the method comprising (a) detecting at least one protein marker in a sample, wherein the protein marker is selected from 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da ± 33 Da, 6852 Da ± 34 Da, 6897 Da ± 34 Da, 6999 Da ± 35 Da, 7575 Da ± 38 Da, 7657 Da ± 38 Da, 8076 Da ± 40 Da, 8215 Da ± 41 Da, 8474 Da ± 42 Da, 8574 Da ± 43 Da, 8702 Da ± 44 Da, 8780 Da ± 44 Da, 8922 Da ± 45 Da, 9078 Da ± 45 Da, 9143 Da ± 46 Da, 9201 Da ± 46 Da, 9359 Da ± 47 Da, 9425 Da ± 47 Da, 9581 Da ± 48 Da, 9641 Da ± 48 Da, 9718 Da ± 49 Da, 9930 Da ± 50 Da, 10215 Da ± 51 Da, 10369 Da ± 52 Da, 10440 Da ± 52 Da, 10594 Da ± 53 Da, 11216 Da ± 56 Da, 11464 Da ± 57 Da, 11547 Da ± 58 Da, 11693 Da ± 58 Da, 11905 Da ± 60 Da, 12470 Da ± 62 Da, 12619 Da ± 63 Da, 12828 Da ± 64 Da, 13290 Da ± 66 Da, 13632 Da ± 68 Da, 13784 Da ± 69 Da, 13983 Da ± 70 Da, 14798 Da ± 74 Da, 15005 Da ± 75 Da, 15140 Da ± 76 Da, 15350 Da ± 77 Da, 15879 Da ± 79 Da, 15957 Da ± 80 Da, 16104 Da ± 81 Da, 16164 Da ± 81 Da, 16953 Da ± 85 Da, 17263 Da ± 86 Da, 17397 Da ± 87 Da, 17617 Da ± 88 Da, 17766 Da ± 89 Da, 17890 Da ± 89 Da, 18115 Da ± 91 Da, 18390 Da ± 92 Da, 22338 Da ± 112 Da, 22466 Da ± 112 Da, 22676 Da ± 113 Da, 22951 Da ± 115 Da, 24079 Da ± 120 Da, 28055 Da ± 140 Da, or 28259 Da ± 141 Da and (b) correlating the detection of the protein marker with a probable diagnosis of non-steroid dependent cancer, especially colorectal cancer.
Each recorded measurement reading is accompanied by a margin of deviation. This statistical imprecision is well known to those skilled in the art. Within the scope of the present invention, the margin of deviation is exclusively device-specific; that is, it is caused by the type of analytical device used, which is preferably a mass spectrometer. The accuracy of the recorded measurement reading is specified as a fixed percentage. Within the meaning of the present invention, each disclosed molecular mass represents the averaged value of a range that deviates from that averaged value by about ± 0.5 %.
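The ± 0.5 % margin can be illustrated with a short calculation. The following is a minimal sketch, not part of the disclosed method; the function names `mass_window` and `within_margin` are hypothetical and chosen only for illustration.

```python
REL_TOL = 0.005  # the +/- 0.5 % margin of deviation stated in the text


def mass_window(mass_da, rel_tol=REL_TOL):
    """Return the (low, high) bounds in Da around a nominal molecular mass."""
    delta = mass_da * rel_tol
    return mass_da - delta, mass_da + delta


def within_margin(observed_da, nominal_da, rel_tol=REL_TOL):
    """True if an observed mass lies inside the margin of the nominal mass."""
    low, high = mass_window(nominal_da, rel_tol)
    return low <= observed_da <= high


low, high = mass_window(2020)
print(round(low, 1), round(high, 1))  # bounds for the 2020 Da marker
print(within_margin(2025, 2020))
```

For the 2020 Da marker the margin is 2020 × 0.005 = 10.1 Da, which corresponds to the ± 10 Da stated for that marker.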
Furthermore, slight differences appear in the molecular mass values reported for the same protein in parallel patent applications that disclose cancer biomarkers. Three reasons should be considered. First, each molecular mass results from the analysis of samples belonging to a different type of cancer. The origin of the sample, the cellular status, the environmental conditions of the gathered tissue, etc. exert an influence on the measurements. Secondly, the given molecular mass of each biomarker represents an averaged value calculated from the data of numerous samples of each cancer species. Thirdly, measuring errors are also conceivable, for example due to sample preparation.
The above statements are further illustrated by examples, which should not be construed as limiting with regard to the type of disease, the number of given molecular masses, or in any other way. The following molecular masses of biomolecules are regarded as equivalent:
(i) 2020 ± 10 (epithelial cancer) and 2020 ± 10 (colorectal cancer)
(ii) 2050 ± 10 (epithelial cancer) and 2049 ± 10 (colorectal cancer)
(iii) 3946 ± 20 (epithelial cancer) and 3946 ± 20 (colorectal cancer)
(iv) 4104 ± 21 (epithelial cancer) and 4103 ± 21 (colorectal cancer)
(v) 4298 ± 21 (epithelial cancer) and 4295 ± 21 (colorectal cancer)
(vi) 4360 ± 22 (epithelial cancer) and 4359 ± 22 (colorectal cancer)
(vii) 4477 ± 22 (epithelial cancer) and 4476 ± 22 (colorectal cancer)
(viii) 4867 ± 24 (epithelial cancer) and 4865 ± 24 (colorectal cancer)
(ix) 4958 ± 25 (epithelial cancer) and 4963 ± 25 (colorectal cancer)
(x) 5491 ± 27 (epithelial cancer) and 5493 ± 27 (colorectal cancer)
(xi) 5650 ± 28 (epithelial cancer) and 5648 ± 28 (colorectal cancer)
(xii) 6449 ± 32 (epithelial cancer) and 6446 ± 32 (colorectal cancer)
(xiii) 6876 ± 34 (epithelial cancer) and 6852 ± 34 (colorectal cancer)
(xiv) 7001 ± 35 (epithelial cancer) and 6999 ± 35 (colorectal cancer)
(xv) 8232 ± 41 (epithelial cancer) and 8215 ± 41 (colorectal cancer)
(xvi) 8711 ± 44 (epithelial cancer) and 8702 ± 44 (colorectal cancer)
(xvii) 12471 ± 62 (epithelial cancer) and 12470 ± 62 (colorectal cancer)
(xviii) 12669 ± 63 (epithelial cancer) and 12619 ± 63 (colorectal cancer)
(xix) 13989 ± 70 (epithelial cancer) and 13983 ± 70 (colorectal cancer)
(xx) 15959 ± 80 (epithelial cancer) and 15957 ± 80 (colorectal cancer)
(xxi) 16164 ± 81 (epithelial cancer) and 16164 ± 81 (colorectal cancer)
(xxii) 17279 ± 86 (epithelial cancer) and 17263 ± 86 (colorectal cancer)
(xxiii) 17406 ± 87 (epithelial cancer) and 17397 ± 87 (colorectal cancer)
(xxiv) 17630 ± 88 (epithelial cancer) and 17617 ± 88 (colorectal cancer)
(xxv) 18133 ± 91 (epithelial cancer) and 18115 ± 91 (colorectal cancer)
In all examples, the two recorded measurement readings of each pair overlap within their margins of deviation.
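The overlap criterion can be checked mechanically. The sketch below assumes the stricter reading that each value of a pair must lie within the other value's margin; the text does not define the criterion formally, and the function name `readings_overlap` is hypothetical.

```python
def readings_overlap(mass_a, margin_a, mass_b, margin_b):
    """True if each reading lies within the other's margin of deviation.

    Assumption: "overlapping" is read as |a - b| <= the smaller margin.
    """
    return abs(mass_a - mass_b) <= min(margin_a, margin_b)


# Three pairs taken from the list of equivalent masses above.
pairs = [
    (4298, 21, 4295, 21),
    (6876, 34, 6852, 34),
    (12669, 63, 12619, 63),
]
for pair in pairs:
    print(pair, readings_overlap(*pair))

# Two distinct markers from the marker list do not overlap.
print(readings_overlap(2020, 10, 2049, 10))
```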
The matching molecular masses from each type of cancer can be combined into a single averaged value by methods known to those skilled in the art. Applying the formulas of error calculation by means of weights (weighted average) yields the following generalized results for the aforementioned examples:
(i) 2020 ± 10
(ii) 2050 ± 10
(iii) 3946 ± 20
(iv) 4104 ± 21
(v) 4297 ± 21
(vi) 4360 ± 22
(vii) 4477 ± 22
(viii) 4866 ± 24
(ix) 4961 ± 25
(x) 5492 ± 27
(xi) 5649 ± 28
(xii) 6448 ± 32
(xiii) 6864 ± 34
(xiv) 7000 ± 35
(xv) 8224 ± 41
(xvi) 8707 ± 44
(xvii) 12471 ± 62
(xviii) 12644 ± 63
(xix) 13986 ± 70
(xx) 15958 ± 80
(xxi) 16164 ± 81
(xxii) 17271 ± 86
(xxiii) 17402 ± 87
(xxiv) 17624 ± 88
(xxv) 18124 ± 91
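The weighted average named above can be sketched as follows. The text does not give the formula, so inverse-variance weights (1/margin²) and rounding half up to whole daltons are assumptions; the function names are hypothetical. With equal margins the weighted mean reduces to the arithmetic mean.

```python
import math
from fractions import Fraction


def weighted_mean(values, margins):
    """Inverse-variance weighted mean, computed exactly with fractions.

    Assumption: weights are 1 / margin**2; the source only says "weighted average".
    """
    weights = [Fraction(1, m * m) for m in margins]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def round_half_up(x):
    """Round to the nearest whole dalton, with .5 rounded upwards."""
    return math.floor(x + Fraction(1, 2))


# Examples (v) and (xviii) from the list of equivalent masses.
print(round_half_up(weighted_mean([4298, 4295], [21, 21])))    # 4297
print(round_half_up(weighted_mean([12669, 12619], [63, 63])))  # 12644
```

Note that the listed results keep the ± 0.5 % margin of the individual readings rather than a reduced combined uncertainty.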
The present invention is further illustrated by the following examples, which should not be construed as limiting in any way. The contents of all cited references (including literature references, issued patents, published patent applications), as cited throughout this application, are hereby expressly incorporated by reference. The practice of the present invention will employ, unless otherwise indicated, conventional techniques of cell biology, cell culture, molecular biology, transgenic biology, microbiology, recombinant DNA, and immunology, which are known to those skilled in the art. Such techniques are explained fully in the literature.
Example 1. Sample collection for colon cancer evaluation.
Serum samples were obtained from a total of 151 individuals, which included two different groups of subjects. In the first group (group I), sera were drawn from 57 colon cancer patients undergoing diagnosis and treatment of colon cancer at the Departments of Gastroenterology and Surgery of the Universities of Magdeburg, Erlangen, and Cottbus (all Germany). Serum samples were collected from the patients directly before surgery. At this time, a primary diagnosis was made based on endoscopy, ultrasonic testing, and/or other means for the detection of colorectal cancer. In all cases the diagnosis was confirmed by histological evaluation after surgery. Follow-up data for all colon cancer patients are currently being collected and will be available for later studies. The non-cancer control group (group II) consisted of 94 subjects with non-malignant disease symptoms of the large intestine (adenoma, inflammation, diverticulosis), who were recruited from the University Hospitals in Magdeburg, Cottbus, and Erlangen. Serum from each subject was taken following colorectal endoscopy, in which the absence of colorectal cancer was confirmed. Furthermore, all subjects denied a personal history of cancer and were otherwise healthy. Follow-up data for all non-cancer controls are currently being collected and will be available for later studies. In addition, 77 serum samples from healthy blood donors were collected for test-set analysis. Blood donors are considered to be healthy individuals not suffering from severe diseases.
Example 2. ProteinChip Array analysis.
ProteinChip Arrays of the SAX2 type (strong anion exchanger) were arranged into a bioprocessor (Ciphergen Biosystems, Inc.), a device that holds up to 12 ProteinChips and facilitates their processing. The ProteinChips were pre-incubated in the bioprocessor with 200 μl binding buffer (0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5). 10 μl of serum sample was diluted 1:5 in a buffer (7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, 2% ampholine) and again diluted 1:10 in the binding buffer. Then, 300 μl of this mixture (equivalent to 6 μl original serum sample) was applied directly onto the spots of the SAX2 ProteinChips. Between dilution steps and prior to application to the spots, the sample was kept on ice (at 0°C). After incubation for 120 minutes at 20 to 24 °C the chips were incubated with 200 μl binding buffer, before 2 x 0.5 μl EAM solution (20 mg/ml sinapinic acid in 50% acetonitrile and 0.5% trifluoroacetic acid) was applied to the spots. After air-drying for 10 min, the ProteinChips were placed in the ProteinChip Reader (ProteinChip Biology System II, Ciphergen Biosystems, Inc.) and time-of-flight spectra were generated by laser shots collected in the positive mode at laser intensity 215, with a detector sensitivity of 8. Sixty laser shots were averaged per spectrum.
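The stated serum equivalent follows directly from the two dilution steps. The sketch below assumes that "1:5" and "1:10" denote 5-fold and 10-fold dilutions (one part sample in five or ten parts total); the function name is hypothetical.

```python
from fractions import Fraction


def serum_equivalent_ul(applied_ul, dilution_factors):
    """Volume of undiluted serum contained in the applied volume.

    Assumption: each factor f means one part sample in f parts total.
    """
    total_factor = Fraction(1)
    for factor in dilution_factors:
        total_factor *= factor
    return Fraction(applied_ul) / total_factor


# 1:5 in denaturation buffer, then 1:10 in binding buffer, 300 ul applied.
print(serum_equivalent_ul(300, [5, 10]))  # 6
```

The overall dilution is 1:50, so 300 μl of the mixture contains 300 / 50 = 6 μl of the original serum, in agreement with the text.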
Calibration of mass accuracy was performed by using the following mixture of mass standard calibrant proteins: Dynorphin A (porcine, 209 - 225, 2147.50 Da), Beta-endorphin (human, 61 - 91, 3465.00 Da), Insulin (bovine, 5733.58 Da), and Cytochrome c (bovine, 12230.90 Da) at a concentration of 1.21 pmol/μl, and Myoglobin (equine cardiac, 16951.50 Da) at a concentration of 5.16 pmol/μl. 0.5 μl of this mixture was applied to a single spot of an H4 ProteinChip array. After air-drying of the drop, 2 x 1 μl matrix solution (a saturated solution of sinapinic acid in 50% acetonitrile, 0.5% trifluoroacetic acid) was applied to the spot. The drop was allowed to air-dry for 10 min after each application of matrix solution.
The ProteinChip was placed in the ProteinChip Reader (Biology System II, Ciphergen Biosystems, Inc.) and time-of-flight spectra were generated by laser shots collected in the positive mode at laser intensity 210, with a detector sensitivity of 8. Sixty laser shots were averaged per spectrum. Subsequently, time-of-flight values were correlated to the molecular masses of the standard proteins, and calibration was performed according to the instrument manual.
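How flight times can be correlated to calibrant masses may be sketched with the generic time-of-flight relation, in which the square root of m/z is linear in flight time. This is an illustration only and not the instrument's documented procedure: the flight times are synthetic, the constants `A` and `T0` are hypothetical, and singly charged ions are assumed so that m/z equals the mass in Da.

```python
import math

# Calibrant masses in Da, as listed in the text.
CALIBRANT_MASSES = [2147.50, 3465.00, 5733.58, 12230.90, 16951.50]


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def calibrate(flight_times, masses):
    """Return a function mapping flight time to mass via sqrt(m) = a*t + b."""
    slope, intercept = fit_line(flight_times, [math.sqrt(m) for m in masses])
    return lambda t: (slope * t + intercept) ** 2


# Synthetic flight times generated from hypothetical constants (not measured data).
A, T0 = 0.5, 0.2
times = [math.sqrt(m) / A + T0 for m in CALIBRANT_MASSES]

to_mass = calibrate(times, CALIBRANT_MASSES)
print(round(to_mass(times[2]), 2))  # recovers the insulin calibrant mass
```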
Example 3. Peak detection and data analysis.
The analysis of the data was performed by automatic peak detection and alignment using the operating software of the ProteinChip Biology System II, the ProteinChip Software Version 3.1 (Ciphergen Biosystems, Inc.). Figure 1 shows a comparison of protein mass spectra detected using the above-mentioned SAX2 ProteinChip arrays for samples isolated from patients suffering from non-malignant diseases of the large intestine (e.g., acute or chronic inflammation, adenoma) (C1 and C2) and from patients with colon cancer (T1 and T2).
The complete set of patients was randomly divided into a training set and a test set. The training set comprised 54 randomly selected patients with colon cancer and 75 randomly selected patients without colon cancer. The test set comprised 14 randomly selected patients with colon cancer and 19 randomly selected patients without colon cancer. Additionally, a test set comprising 77 sera obtained from healthy blood donors was compiled. This was done in order to test the classification algorithm generated on the basis of the spectra of the subgroup of healthy individuals (see below).
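The random division can be sketched as a per-group shuffle. The group sizes are those stated above; the sample identifiers, the fixed seed, and the function name `split_group` are hypothetical, and splitting each group separately is an assumption about how the selection was made.

```python
import random


def split_group(sample_ids, n_test, rng):
    """Shuffle one group and return (training_ids, test_ids)."""
    shuffled = list(sample_ids)
    rng.shuffle(shuffled)
    return shuffled[n_test:], shuffled[:n_test]


rng = random.Random(0)  # fixed seed only to make the sketch repeatable
cancer_ids = [f"cancer_{i}" for i in range(54 + 14)]
control_ids = [f"control_{i}" for i in range(75 + 19)]

cancer_train, cancer_test = split_group(cancer_ids, 14, rng)
control_train, control_test = split_group(control_ids, 19, rng)

training_set = cancer_train + control_train
test_set = cancer_test + control_test
print(len(training_set), len(test_set))  # 129 33
```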
The m/z values of all mass spectra selected for the analysis ranged between 2000 Da and 30000 Da; smaller masses were not used since artefacts with the "Energy Absorbing Molecule, EAM" ("Matrix") could not be excluded, and higher masses were not detected under the chosen experimental conditions. The spectra within the training set were normalised according to the intensity of the total ion current, followed by baseline subtraction and automatic peak detection as previously described by Adam et al. (2002) Cancer Research 62: 3609-3614, using the "Biomarker Wizard" tool of the ProteinChip Software Version 3.1 (Ciphergen Biosystems, Inc.). The following settings were chosen for peak detection by "Biomarker Wizard": a) auto-detect peaks to cluster, b) first pass: 5 signal/noise, c) minimum peak threshold: 5% of all spectra, d) deletion of user-detected peaks below threshold, e) cluster mass window: +/- 0.3% of mass. Using these settings, 90 signal clusters were identified.
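Three of the settings above can be expressed as simple rules. The sketch below is a simplified stand-in and not the "Biomarker Wizard" algorithm, whose internals are not described in the text; all function names are hypothetical, and the example spectrum count of 129 is the training set size.

```python
CLUSTER_WINDOW = 0.003  # cluster mass window: +/- 0.3% of mass
MIN_FRACTION = 0.05     # minimum peak threshold: 5% of all spectra


def normalise_to_tic(intensities):
    """Scale a spectrum so that its total ion current sums to 1."""
    total = sum(intensities)
    return [value / total for value in intensities]


def same_cluster(peak_mass, cluster_mass, window=CLUSTER_WINDOW):
    """True if a peak falls inside the mass window of a cluster."""
    return abs(peak_mass - cluster_mass) <= cluster_mass * window


def keep_cluster(n_spectra_with_peak, n_spectra, min_fraction=MIN_FRACTION):
    """True if the peak occurs in at least the minimum fraction of spectra."""
    return n_spectra_with_peak >= min_fraction * n_spectra


print(same_cluster(5500.0, 5493.0))  # 7 Da apart, window is about 16.5 Da
print(keep_cluster(7, 129))          # 5% of 129 spectra is 6.45
```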
The normalization coefficient generated by normalizing the spectra of the training sets and the cluster information of the training sets generated by the "Biomarker Wizard" tool of the software were saved and used to externally normalize the spectra of the corresponding test sets and to cluster the signals of the corresponding test sets according to the normalization and peak identification of the training sets.
The cluster information for each train and test set (containing sample ID and sample group, cluster mass values and cluster signal intensities for each spectrum within the sets) was transformed into an interchangeable data format (a .csv table) using the "Sample group statistics" function of the "Biomarker Wizard" tool of the ProteinChip Software Version 3.1. In this format, the data can be analysed by a specific software for the generation of regression and classification trees (see examples 5 to 7).
Example 4. Construction of classifiers.
Four classifiers with a binary target variable (cancer versus non-cancer) were constructed: First, as a proof of principle, a classifier was constructed only on the basis of the training set described above. Second, a final classifier was constructed on the basis of all available mass peaks and all colon cancer samples, fusing the corresponding training and test data sets. Third, a 2nd final colon classifier was constructed analogously to the first final colon cancer classifier but excluding the most informative and dominating mass of the first final colon classifier. Fourth, a 3rd final colon classifier was constructed analogously to the first final colon cancer classifier but excluding the most informative and dominating masses of the first and 2nd final colon classifier.
Forward variable selection was applied in order to determine highly informative sets of variables ("patterns") for classification. The results of the present invention were generated using the "CART" decision tree approach (classification and regression trees; Breiman et al., 1984). Moreover, bagging of classifiers was applied to overcome typical instabilities of forward variable selection procedures, thereby increasing overall classifier performance (Breiman, 1994).
More precisely, for the training set 50 bootstrap samples were generated (sampling with replacement, maximal 3 sample redraws). For each bootstrap sample an exploratory decision tree was generated. Nodes were split using the Gini rule until all final nodes were either pure, i.e., contained only samples of one class, or until one of the following stopping rules was met: no nodes comprising fewer than 4 cases were split, and no splits were considered that resulted in a node comprising only one sample. The 50 single classifiers thus obtained, one for each bootstrap sample, were combined to constitute an ensemble of classifiers predicting class membership by plurality vote.
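The building blocks of this procedure can be sketched as follows. This is not the CART software used in the study: it shows only the Gini impurity, the improvement of a single split, a bootstrap draw, and the plurality vote. Reading "maximal 3 sample redraws" as "each sample may appear at most 3 times per bootstrap sample" is an assumption, and all names are hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a node: 1 minus the sum of squared class fractions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def split_improvement(values, labels, threshold):
    """Decrease in Gini impurity achieved by splitting at value <= threshold."""
    left = [lab for val, lab in zip(values, labels) if val <= threshold]
    right = [lab for val, lab in zip(values, labels) if val > threshold]
    n = len(labels)
    return gini(labels) - len(left) / n * gini(left) - len(right) / n * gini(right)


def bootstrap_indices(n, rng, max_draws=3):
    """Sample n indices with replacement, each index at most max_draws times."""
    counts = Counter()
    picked = []
    while len(picked) < n:
        index = rng.randrange(n)
        if counts[index] < max_draws:
            counts[index] += 1
            picked.append(index)
    return picked


def plurality_vote(predictions):
    """Class predicted by the largest number of single classifiers."""
    return Counter(predictions).most_common(1)[0][0]


rng = random.Random(0)
sample = bootstrap_indices(129, rng)  # 129 = size of the training set
print(max(Counter(sample).values()))
print(split_improvement([1, 2, 3, 4], ["n", "n", "c", "c"], 2))
print(plurality_vote(["cancer", "cancer", "non-cancer"]))
```

In the full procedure, one tree would be grown per bootstrap sample by repeatedly choosing the split with the largest improvement, subject to the stopping rules above.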
The procedure of classifier construction was conducted four times to obtain one proof-of-principle classifier and three final classifiers for colon cancer detection.
Example 5. Classifier structure.
The proof-of-principle classifier employed 71 masses (variables) out of 90 determined signal clusters. Single decision trees consisted of 4 to 9 variables (5 to 10 end nodes), 6 variables being typical, see histogram of Figure 4. Variable importance was roughly deduced by overall improvement, i.e., for each mass we summed the improvement values achieved in the generation of all 50 decision trees of the decision tree ensemble. The masses used by the proof-of-principle classifier are listed in Table 1 (starting with most important masses having high improvement). An overview of the distribution of masses is given in Figure 5.
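Summing improvement values across the ensemble can be sketched as below. The per-tree improvement values are hypothetical placeholders, not data from the study, and the representation of a tree as a list of (mass, improvement) pairs is an assumption made for illustration.

```python
from collections import defaultdict


def overall_improvement(trees):
    """Sum the split improvements per mass over all trees, ranked descending."""
    totals = defaultdict(float)
    for tree in trees:
        for mass, improvement in tree:
            totals[mass] += improvement
    return sorted(totals.items(), key=lambda item: item[1], reverse=True)


# Hypothetical ensemble of two trees; each entry is (mass in Da, improvement).
trees = [
    [(5493, 0.25), (4964, 0.02)],
    [(5493, 0.21), (6645, 0.03)],
]
ranking = overall_improvement(trees)
for mass, total in ranking:
    print(mass, round(total, 3))
```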
The 1st final classifier for colon cancer employed 75 masses out of 90 determined signal clusters. Single decision trees consisted of more variables than in the proof-of-principle classifier: 9 variables were typical, see histogram of Figure 6. Variable importance was roughly deduced by overall improvement. The masses used by the 1st final classifier are listed in Table 2 (starting with most important masses, i.e. masses with highest improvement values). An overview of the distribution of masses of the 1st final classifier is given in Figure 7.
The 2nd final classifier for colon cancer employed 77 masses out of 90 determined signal clusters. Single decision trees consisted of even more variables than in the 1st final classifier: 10 variables were typical, see histogram of Figure 8. Variable importance was roughly deduced by overall improvement. The masses used by the 2nd final classifier are listed in Table 3 (starting with most important masses, i.e. masses with highest improvement values). An overview of the distribution of masses of the 2nd final classifier is given in Figure 9.
The 3rd final classifier for colon cancer employed 80 masses out of 90 determined signal clusters. Single decision trees consisted of even more variables than in the 1st final classifier: 10 variables were typical, see histogram of Figure 10. Variable importance was roughly deduced by overall improvement. The masses used by the 3rd final classifier are listed in Table 4 (starting with most important masses, i.e. masses with highest improvement values). An overview of the distribution of masses of the 3rd final classifier is given in Figure 11.
With the exception of mass 10722 Da, the classifiers include all of the differentially expressed biomolecules found in this study.
Example 6. Classification performance.
Classification performance is determined for the proof-of-principle classifier on the colon cancer versus endoscopy control test data set as well as on a separate test set consisting of presumably healthy blood donors. The classifier achieved 93% sensitivity and 84% specificity on the cancer versus endoscopy controls test data set and 94% specificity on 77 samples of blood donors.
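Sensitivity and specificity follow from the counts of correct and incorrect classifications. The text reports only percentages and test set sizes; the counts used below (13 of 14, 16 of 19, 72 of 77) are inferred as the integer counts consistent with those percentages and are not stated in the source.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of cancer samples classified as cancer."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """Fraction of non-cancer samples classified as non-cancer."""
    return true_negatives / (true_negatives + false_positives)


# Inferred counts (assumption): 14 cancer, 19 endoscopy controls, 77 blood donors.
print(round(100 * sensitivity(13, 1)))   # 93
print(round(100 * specificity(16, 3)))   # 84
print(round(100 * specificity(72, 5)))   # 94
```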
For the three final classifiers, we determined their specificity on 77 samples of blood donors. We obtained 92% specificity for the 1st final classifier, 100% specificity for the 2nd final classifier, and 92% specificity for the 3rd final classifier.
Table 1: Ranking of masses of proof-of-principle classifier by overall improvement.
mass improvement mass improvement mass improvement
5493 11.397 6447 0.193 11465 0.048
4964 0.915 15879 0.193 8703 0.046
6645 0.724 4719 0.188 13290 0.045
12619 0.589 3228 0.176 4607 0.041
8781 0.511 17263 0.17 3457 0.04
3947 0.483 15005 0.159 8215 0.039
7576 0.464 17617 0.157 3027 0.038
10595 0.446 2509 0.155 9360 0.038
22952 0.442 9078 0.153 5113 0.031
6852 0.415 4104 0.132 4295 0.03
3327 0.409 13633 0.127 17890 0.028
22467 0.405 7000 0.122 11694 0.027
24080 0.398 2733 0.105 11905 0.026
2021 0.359 9202 0.095 4546 0.025
12829 0.346 16105 0.086 16164 0.025
8575 0.342 18116 0.082 9642 0.014
2270 0.323 9718 0.08 22339 0.013
9143 0.267 4242 0.069 15957 0.012
4866 0.229 6898 0.067 4830 0.011
4359 0.225 4476 0.066 5854 0.011
2049 0.223 8922 0.066 5773 0.009
8077 0.214 7658 0.062
13784 0.202 8474 0.058
22677 0.202 12470 0.058
17397 0.198 5648 0.052
Table 2: Ranking of masses of 1st final classifier by overall improvement.
mass improvement mass improvement mass improvement
5493 12.849 17890 0.157 3947 0.056
6645 1.216 10595 0.156 2733 0.051
4964 0.907 7658 0.148 9581 0.046
8781 0.559 11216 0.147 28259 0.045
12829 0.494 2509 0.141 4607 0.044
15879 0.392 3228 0.141 4546 0.042
2021 0.363 16105 0.128 9930 0.039
22952 0.353 22467 0.112 17617 0.039
2270 0.323 9360 0.111 3457 0.038
28055 0.305 4476 0.099 22677 0.036
18116 0.3 4830 0.093 13633 0.033
8077 0.298 9143 0.088 11694 0.032
6852 0.268 10369 0.088 11905 0.031
2049 0.252 17767 0.085 8703 0.028
4359 0.239 4242 0.083 11465 0.024
8575 0.233 6447 0.078 13983 0.024
24080 0.232 22339 0.078 9078 0.022
12619 0.197 15005 0.075 14798 0.022
7576 0.179 4719 0.073 16953 0.021
12470 0.168 7000 0.064 13290 0.021
4104 0.166 5113 0.062 11547 0.02
15957 0.165 9202 0.062 5648 0.011
17263 0.165 4866 0.058 5226 0.01
5854 0.161 16164 0.058 6898 0.01
3327 0.161 3027 0.057 5773 0.009
Table 3: Ranking of masses of 2nd final classifier by overall improvement.
mass improvement mass improvement mass improvement
3947 5.672 9360 0.187 8575 0.068
12829 2.203 3027 0.179 10369 0.066
6645 1.472 4866 0.169 17767 0.063
4964 1.441 12470 0.163 15350 0.056
8077 1.158 9078 0.148 11216 0.046
28055 1.072 2509 0.147 17890 0.044
15957 0.912 6898 0.142 8703 0.039
6852 0.811 10595 0.139 4295 0.036
12619 0.539 7576 0.135 15005 0.036
24080 0.393 8781 0.116 22677 0.036
3327 0.385 22339 0.115 9581 0.031
28259 0.34 5854 0.114 9426 0.03
2021 0.337 2270 0.11 13290 0.027
16105 0.316 6447 0.106 15879 0.026
11694 0.315 22952 0.104 17397 0.023
4104 0.299 4242 0.092 5648 0.022
2049 0.293 10215 0.092 17617 0.022
4719 0.27 5113 0.09 8474 0.019
16164 0.25 9202 0.089 10440 0.016
3457 0.241 9143 0.086 4359 0.009
4546 0.238 13983 0.082 5226 0.008
17263 0.232 4830 0.081 7000 0.006
16953 0.228 4476 0.08 7658 0.006
2733 0.225 11465 0.072
22467 0.218 18116 0.071
5773 0.193 15140 0.07
3228 0.19 4607 0.068
Table 4: Ranking of masses of 3rd final classifier by overall improvement.
mass improvement mass improvement mass improvement
4964 3.431 10595 0.187 15140 0.047
12829 2.166 7658 0.183 7000 0.046
6645 1.999 9078 0.183 22467 0.044
28055 1.288 8781 0.171 10369 0.042
28259 1.152 5773 0.144 18390 0.042
6852 1.089 2270 0.134 13290 0.041
3327 0.781 5113 0.133 6898 0.038
16105 0.737 7576 0.132 17767 0.038
16953 0.736 9143 0.131 8703 0.036
15957 0.714 6447 0.128 13633 0.036
12619 0.705 2733 0.111 15005 0.036
8077 0.666 18116 0.109 15350 0.032
4830 0.615 4607 0.104 13784 0.031
4546 0.485 11694 0.104 17617 0.029
2021 0.403 15879 0.1 14798 0.027
4242 0.329 9202 0.099 17397 0.026
4719 0.304 10215 0.092 5226 0.026
12470 0.292 4476 0.089 9426 0.026
9360 0.283 9581 0.089 5648 0.022
3457 0.279 11905 0.086 8474 0.019
22952 0.275 4359 0.079 8575 0.019
2509 0.261 4295 0.075 10440 0.016
4104 0.245 4866 0.068 17263 0.009
2049 0.23 9718 0.068 11216 0.008
24080 0.219 11465 0.062
16164 0.201 13983 0.062
3228 0.198 22339 0.056
5854 0.192 3027 0.047

Claims

We claim:
1. A method for the differential diagnosis of a colorectal cancer and/or a non-malignant disease of the large intestine, in vitro, comprising: a) obtaining a test sample from a subject, b) contacting the test sample with a biologically active surface under specific binding conditions, c) allowing the biomolecules within the test sample to bind to said biologically active surface, d) detecting bound biomolecules using a detection method, wherein the detection method generates a mass profile of said test sample, e) transforming the mass profile into a computer-readable form, and f) comparing the mass profile of e) with a database containing mass profiles specific for healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having colorectal cancer, subjects having metastasised colorectal cancer, or subjects having a non-malignant disease of the large intestine, wherein said comparison allows for the differential diagnosis of a subject as healthy, having a precancerous lesion of the large intestine, having a colorectal cancer, having a metastasised colorectal cancer and/or a non-malignant disease of the large intestine.
2. The method of claim 1, wherein the database is generated by a) obtaining biological samples from healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having colorectal cancer, subjects having metastasised colorectal cancer, and subjects having a non-malignant disease of the large intestine, b) contacting said biological samples with a biologically active surface under specific binding conditions, c) allowing the biomolecules within the biological samples to bind to said biologically active surface, d) detecting bound biomolecules using a detection method, wherein the detection method generates mass profiles of said biological samples, e) transforming the mass profiles into a computer-readable form, f) applying a mathematical algorithm to classify the mass profiles in e) as specific for healthy subjects, subjects having a precancerous lesion of the large intestine, subjects having colorectal cancer, subjects having metastasised colorectal cancer, and subjects having a non-malignant disease of the large intestine.
3. The method of claim 1, wherein the biomolecules are characterized by: a) diluting a sample 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, 2% Ampholine, at 0° to 4°, b) further diluting said sample 1:10 with a binding buffer consisting of 0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5 at 0° to 4°, c) contacting the sample with a biologically active surface comprising positively charged quaternary ammonium groups, d) incubating the treated sample with said biologically active surface for 120 minutes under temperatures between 20 and 24°C at pH 8.5, and e) analysing the bound biomolecules by gas phase ion spectrometry.
4. The method of claim 1, wherein the detection method is mass spectrometry.
5. The method of claim 4, wherein the method of mass spectrometry is selected from the group of matrix-assisted laser desorption ionization/time of flight (MALDI-TOF), surface enhanced laser desorption ionisation/time of flight (SELDI-TOF), liquid chromatography, MS-MS and/or ESI-MS.
6. The method of claim 1, wherein the biologically active surface comprises an adsorbent selected from the group of quaternary ammonium groups, carboxylate groups, groups with alkyl or aryl chains, groups such as nitriloacetic acid that immobilize metal ions, or proteins, antibodies, or nucleic acids.
7. The method of claim 1, wherein the mass profiles comprise a panel of one or more differentially expressed biomolecules.
8. The method of claim 7, wherein the biomolecules are selected from a group having the apparent molecular mass of 2020 Da ± 10 Da, 2049 Da ± 10 Da, 2270 Da ± 11 Da, 2508 Da ± 13 Da, 2732 Da ± 14 Da, 3026 Da ± 15 Da, 3227 Da ± 17 Da, 3326 Da ± 17 Da, 3456 Da ± 17 Da, 3946 Da ± 20 Da, 4103 Da ± 21 Da, 4242 Da ± 21 Da, 4295 Da ± 21 Da, 4359 Da ± 22 Da, 4476 Da ± 22 Da, 4546 Da ± 23 Da, 4607 Da ± 23 Da, 4719 Da ± 24 Da, 4830 Da ± 24 Da, 4865 Da ± 24 Da, 4963 Da ± 25 Da, 5112 Da ± 26 Da, 5226 Da ± 26 Da, 5493 Da ± 27 Da, 5648 Da ± 28 Da, 5772 Da ± 29 Da, 5854 Da ± 29 Da, 6446 Da ± 32 Da, 6644 Da ± 33 Da, 6852 Da ± 34 Da, 6897 Da ± 34 Da, 6999 Da ± 35 Da, 7575 Da ± 38 Da, 7657 Da ± 38 Da, 8076 Da ± 40 Da, 8215 Da ± 41 Da, 8474 Da ± 42 Da, 8574 Da ± 43 Da, 8702 Da ± 44 Da, 8780 Da ± 44 Da, 8922 Da ± 45 Da, 9078 Da ± 45 Da, 9143 Da ± 46 Da, 9201 Da ± 46 Da, 9359 Da ± 47 Da, 9425 Da ± 47 Da, 9581 Da ± 48 Da, 9641 Da ± 48 Da, 9718 Da ± 49 Da, 9930 Da ± 50 Da, 10215 Da ± 51 Da, 10369 Da ± 52 Da, 10440 Da ± 52 Da, 10594 Da ± 53 Da, 11216 Da ± 56 Da, 11464 Da ± 57 Da, 11547 Da ± 58 Da, 11693 Da ± 58 Da, 11905 Da ± 60 Da, 12470 Da ± 62 Da, 12619 Da ± 63 Da, 12828 Da ± 64 Da, 13290 Da ± 66 Da, 13632 Da ± 68 Da, 13784 Da ± 69 Da, 13983 Da ± 70 Da, 14798 Da ± 74 Da, 15005 Da ± 75 Da, 15140 Da ± 76 Da, 15350 Da ± 77 Da, 15879 Da ± 79 Da, 15957 Da ± 80 Da, 16104 Da ± 81 Da, 16164 Da ± 81 Da, 16953 Da ± 85 Da, 17263 Da ± 86 Da, 17397 Da ± 87 Da, 17617 Da ± 88 Da, 17766 Da ± 89 Da, 17890 Da ± 89 Da, 18115 Da ± 91 Da, 18390 Da ± 92 Da, 22338 Da ± 112 Da, 22466 Da ± 112 Da, 22676 Da ± 113 Da, 22951 Da ± 115 Da, 24079 Da ± 120 Da, 28055 Da ± 140 Da and/or 28259 Da ± 141 Da.
9. A method for the identification of differentially expressed biomolecules wherein the biomolecules of any of claims 1-8 are proteins, comprising: a) chromatography and fractionation, b) analysis of fractions for the presence of said differentially expressed proteins and/or fragments thereof, using a biologically active surface, c) further analysis using mass spectrometry to obtain amino acid sequences encoding said proteins and/or fragments thereof, and d) searching amino acid sequence databases of known proteins to identify said differentially expressed proteins by amino acid sequence comparison.
10. The method of claim 9, wherein the method of chromatography is selected from high performance liquid chromatography (HPLC) or fast protein liquid chromatography (FPLC).
11. The method of claim 9, wherein the mass spectrometry used is selected from the group of matrix-assisted laser desorption ionization/time of flight (MALDI-TOF), surface enhanced laser desorption ionisation/time of flight (SELDI-TOF), liquid chromatography, MS-MS and/or ESI-MS.
12. A method for the differential diagnosis of a colorectal cancer and/or a non-malignant disease of the large intestine, in vitro, comprising detection of one or more differentially expressed biomolecules wherein the biomolecules are polypeptides, comprising: a) obtaining a test sample from a subject, b) contacting said sample with a binding molecule specific for a differentially expressed polypeptide identified in claims 9-11, c) detecting the presence or absence of said polypeptide(s), wherein the presence or absence of said polypeptide(s) allows for the differential diagnosis of a subject as healthy, having a precancerous lesion of the large intestine, having a colorectal cancer, having a metastasised colorectal cancer and/or a non-malignant disease of the large intestine.
13. The method of any one of claims 1-12, wherein the colorectal cancer is a cancer of the colon or rectum.
14. The method of any one of claims 1-12, wherein the test sample is a blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract sample.
15. The method of any one of claims 1-12, wherein the biological sample is a blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract sample.
16. The method or kit of any one of claims 1-12, wherein the subject is of mammalian origin.
17. The method of claim 16, wherein the subject is of human origin.
18. A kit for the diagnosis of a colorectal cancer or a non-malignant disease of the large intestine using the method of any one of claims 1-11 and 13-17 comprising a denaturation solution, a binding solution, a washing solution, a biologically active surface comprising an adsorbent, and instructions to use the kit.
19. A kit for the diagnosis of a colorectal cancer or a non-malignant disease of the large intestine using the method of any one of claims 12-17 comprising a solution, binding molecule, detection substrate, and instructions to use the kit.
EP04733324A 2003-05-15 2004-05-17 Differential diagnosis of colorectal cancer and other diseases of the colon Withdrawn EP1639365A1 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
EP04733324A EP1639365A1 (en) 2003-05-15 2004-05-17 Differential diagnosis of colorectal cancer and other diseases of the colon

Applications Claiming Priority (7)

Application Number Priority Date Filing Date Title
EP03090141 2003-05-15
US47277203P 2003-05-23 2003-05-23
EP03090153A EP1477803A1 (en) 2003-05-15 2003-05-23 Serum protein profiling for the diagnosis of epithelial cancers
US52558303P 2003-11-24 2003-11-24
EP03090401 2003-11-24
PCT/EP2004/005294 WO2004102190A1 (en) 2003-05-15 2004-05-17 Differential diagnosis of colorectal cancer and other diseases of the colon
EP04733324A EP1639365A1 (en) 2003-05-15 2004-05-17 Differential diagnosis of colorectal cancer and other diseases of the colon

Publications (1)

Publication Number Publication Date
EP1639365A1 true EP1639365A1 (en) 2006-03-29

Family

ID=56290564

Family Applications (1)

Application Number Title Priority Date Filing Date
EP04733324A Withdrawn EP1639365A1 (en) 2003-05-15 2004-05-17 Differential diagnosis of colorectal cancer and other diseases of the colon

Country Status (4)

Country Link
EP (1) EP1639365A1 (en)
AU (1) AU2004239417A1 (en)
CA (1) CA2525743A1 (en)
WO (1) WO2004102190A1 (en)

Families Citing this family (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA2526878A1 (en) * 2003-04-08 2004-10-21 Colotech A/S A method for detection of colorectal cancer in human samples
US7425700B2 (en) 2003-05-22 2008-09-16 Stults John T Systems and methods for discovery and analysis of markers
CA2881326A1 (en) * 2005-09-12 2007-03-22 Phenomenome Discoveries Inc. Methods for the diagnosis of colorectal cancer and ovarian cancer health states
EP2623984B1 (en) * 2012-01-03 2017-10-04 National Cancer Center Apparatus for screening cancer
KR101461615B1 (en) 2012-01-03 2015-04-22 국립암센터 Apparatus for diagnosis cancer
CN114651058B (en) 2019-08-05 2023-07-28 禧尔公司 Systems and methods for sample preparation, data generation, and protein crown analysis

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2001288921A1 (en) * 2000-09-11 2002-03-26 Ciphergen Biosystems, Inc. Human breast cancer biomarkers

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See references of WO2004102190A1 *

Also Published As

Publication number Publication date
AU2004239417A1 (en) 2004-11-25
CA2525743A1 (en) 2004-11-25
WO2004102190A1 (en) 2004-11-25

Similar Documents

Publication Publication Date Title
Seibert et al. Advances in clinical cancer proteomics: SELDI-ToF-mass spectrometry and biomarker discovery
Veenstra Global and targeted quantitative proteomics for biomarker discovery
US20060088894A1 (en) Prostate cancer biomarkers
US20090204334A1 (en) Lung cancer biomarkers
EP1573044A2 (en) Serum biomarkers in lung cancer
Lu et al. Detection and identification of serum peptides biomarker in papillary thyroid cancer
CN111562338A (en) Application of clear cell renal cell carcinoma metabolic marker in renal cell carcinoma early screening and diagnosis product
US20070087392A1 (en) Method for diagnosing head and neck squamous cell carcinoma
CN108020669B (en) Application of urinary osteopontin and polypeptide fragment thereof in lung adenocarcinoma
Liu et al. Potential biomarkers for esophageal carcinoma detected by matrix-assisted laser desorption/ionization time-of-flight mass spectrometry
WO2005008247A2 (en) Detection of endometrial pathology
EP1639365A1 (en) Differential diagnosis of colorectal cancer and other diseases of the colon
Song et al. MALDI‐TOF‐MS analysis in low molecular weight serum peptidome biomarkers for NSCLC
Wu et al. Proteomic evaluation of urine from renal cell carcinoma using SELDI-TOF-MS and tree analysis pattern
EP1477803A1 (en) Serum protein profiling for the diagnosis of epithelial cancers
EP1629278A1 (en) Biomarkers for the differential diagnosis of pancreatitis and pancreatic cancer
US20040033613A1 (en) Saliva-based protein profiling
EP4004549A1 (en) Progression markers for colorectal adenomas
CN117589991B (en) Biomarker, model, kit and application for identifying HER2 expression state of breast cancer patient
CN117147737B (en) Plasma combined marker for esophageal squamous carcinoma diagnosis, kit and detection method
CN118150830B (en) Application of protein marker combination in preparation of colorectal cancer early diagnosis product
Sadaka et al. Study of plasma proteome pattern using matrix-assisted laser-desorption ionization time of flight mass spectrometry in a cohort of Egyptian prostate cancer patients
Conrads et al. Mass Spectrometry‐Based Proteomic Approaches for Disease Diagnosis and Biomarker Discovery

Legal Events

Date Code Title Description
PUAI Public reference made under Article 153(3) EPC to a published international application that has entered the European phase (Free format text: ORIGINAL CODE: 0009012)
17P Request for examination filed (Effective date: 20051214)
AK Designated contracting states (Kind code of ref document: A1; Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PL PT RO SE SI SK TR)
RAP1 Party data changed (applicant data changed or rights of an application transferred) (Owner name: MIRACULINS INC.)
DAX Request for extension of the European patent (deleted)
STAA Information on the status of an EP patent application or granted EP patent (Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN)
18D Application deemed to be withdrawn (Effective date: 20061015)