WO2017219093A1 - Screening methods - Google Patents

Screening methods

Publication number
WO2017219093A1
WO2017219093A1 PCT/AU2017/050644 AU2017050644W WO2017219093A1 WO 2017219093 A1 WO2017219093 A1 WO 2017219093A1 AU 2017050644 W AU2017050644 W AU 2017050644W WO 2017219093 A1 WO2017219093 A1 WO 2017219093A1
Authority
WO
WIPO (PCT)
Prior art keywords
amount
protein
cancer
test sample
treatment
Prior art date
Application number
PCT/AU2017/050644
Other languages
French (fr)
Inventor
Mark S Baker
Sadia Mahboob
Seong Beom AHN
Samridhi SHARMA
Original Assignee
Macquarie University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Priority claimed from AU2016902484A
Application filed by Macquarie University
Publication of WO2017219093A1

Classifications

    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/574 Immunoassay; Biospecific binding assay; Materials therefor for cancer
    • G01N33/57407 Specifically defined cancers
    • G01N33/57419 Specifically defined cancers of colon
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N33/00 Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
    • G01N33/48 Biological material, e.g. blood, urine; Haemocytometers
    • G01N33/50 Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
    • G01N33/53 Immunoassay; Biospecific binding assay; Materials therefor
    • G01N33/574 Immunoassay; Biospecific binding assay; Materials therefor for cancer
    • G01N33/57484 Immunoassay; Biospecific binding assay; Materials therefor for cancer involving compounds serving as markers for tumor, cancer, neoplasia, e.g. cellular determinants, receptors, heat shock/stress proteins, A-protein, oligosaccharides, metabolites
    • G PHYSICS
    • G01 MEASURING; TESTING
    • G01N INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
    • G01N2800/00 Detection or diagnosis of diseases
    • G01N2800/52 Predicting or monitoring the response to treatment, e.g. for selection of therapy based on assay results in personalised medicine; Prognosis

Definitions

  • the invention relates to methods for screening for cancer in an individual, in particular, methods for screening for colorectal cancer.
  • Colorectal cancer (CRC), also known as colon or bowel cancer, is the third most commonly diagnosed cancer worldwide, with over 694,000 deaths (8.68% of all cancer deaths) in 2012 [1].
  • CRC develops in a progressive manner, typically beginning with the formation of abnormal tissue growths projecting from the mucous membrane (polyps) in the colon/rectum, which then progress into adenomas and finally, metastatic disease. Many polyps do not produce overt clinical symptoms and consequently are often not detected. Left untreated, polyps may develop into adenomatous polyps (benign neoplasia) which have a high risk of subsequently developing into adenocarcinoma.
  • CRC screening includes the use of faecal occult blood tests (FOBT), flexible sigmoidoscopy and colonoscopy.
  • Colonoscopy is the current gold standard for detecting CRC and has a specificity of greater than 90% for detecting CRC.
  • colonoscopy is intrusive and costly with a small but finite risk of complications (2.1 per 1000 procedures) (Levin, 2004).
  • While less invasive, FOBT has relatively low specificity, resulting in a high rate of false positives. All positive FOBT results are therefore typically followed up with colonoscopy.
  • FOBT also lacks sensitivity for early stage cancerous lesions that do not bleed into the bowel and, as stated above, these are the lesions for which treatment is most successful.
  • the present invention provides a method of determining the likelihood of an individual having a cancer including: - providing a test sample of bodily fluid from an individual for whom diagnosis of cancer is required;
  • the present invention also provides a method of determining the likelihood of an individual having a cancer including:
  • the present invention relates to a method of determining the likelihood of an individual having a cancer including:
  • the one or more protein biomarkers is selected from the group consisting of: ADAM DEC1, cystatin-C and complement factor D;
  • the individual has a high likelihood of having cancer when: a) the amount of ADAM DEC1 in the test sample is greater than the amount of ADAM DEC1 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or b) the amount of cystatin-C and/or complement factor D in the test sample is lower than the amount of the same protein biomarker in the reference data set;
  • the individual has a low likelihood of having cancer when: c) the amount of ADAM DEC1 in the test sample is the same or lower than the amount of ADAM DEC1 in the reference data set; d) the amount of cystatin-C and/or complement factor D in the test sample is the same or greater than the amount of the same protein biomarker in the reference data set.
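The comparison rule above can be sketched in code. This is a minimal illustration only, assuming each biomarker amount and its reference value are plain numbers in the same units. The function name `likelihood_of_cancer`, the dictionary layout, and the choice to report "high" as soon as any one measured marker deviates in the cancer-associated direction (one reading of "and/or") are assumptions, not part of the specification.

```python
# Hypothetical sketch of the comparison rule for the three core biomarkers.
# +1: amount is greater in cancer; -1: amount is lower in cancer (per the text).
CORE_DIRECTION = {
    "ADAM DEC1": +1,
    "cystatin-C": -1,
    "complement factor D": -1,
}


def likelihood_of_cancer(test, reference):
    """Return "high" or "low" from test and reference amounts.

    `test` and `reference` map biomarker names to amounts. One, two or
    three of the core biomarkers may be supplied. Assumption: any single
    marker deviating in the cancer-associated direction gives "high".
    """
    measured = [m for m in CORE_DIRECTION if m in test]
    if not measured:
        raise ValueError("at least one core biomarker must be measured")
    for marker in measured:
        difference = test[marker] - reference[marker]
        # A positive product means the deviation matches the cancer direction.
        if difference * CORE_DIRECTION[marker] > 0:
            return "high"
    # Every measured marker is the same as, or on the non-cancer side of,
    # the reference.
    return "low"
```

How disagreeing markers should be combined is a design choice that the text leaves open; a real assay would also need thresholds that account for measurement variability rather than a strict greater-than comparison.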
  • one, two or three of ADAM DEC1, cystatin-C and complement factor D can be measured in the sample from the individual.
  • the method may involve measuring ADAM DEC1 and cystatin-C, or ADAM DEC1 and complement factor D, or cystatin-C and complement factor D, or all three of ADAM DEC1, cystatin-C and complement factor D. It will be understood that the measurement or determination of the levels or amounts of two or more protein biomarkers may be conducted simultaneously or in separate experiments.
  • in addition to measuring the amount of one or more of ADAM DEC1, cystatin-C and complement factor D, the method further includes:
  • the one or more additional biomarkers is selected from the group consisting of:
  • serum amyloid protein A2 and (r) superoxide dismutase 3;
  • determining that the individual has a high likelihood of having cancer when: a) the amount of hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, and/or serum amyloid protein A2 in the test sample is greater than the amount of the same protein in the reference data set; and/or b) the amount of plasma C1 protease inhibitor, paraoxonase 1, inter-alpha trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3 in the test sample is lower than the amount of the same protein biomarker in the reference data set;
  • determining that the individual has a low likelihood of having cancer when: c) the amount of hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, and/or serum amyloid protein A2 in the test sample is the same or lower than the amount of the same protein in the reference data set; d) the amount of plasma C1 protease inhibitor, paraoxonase 1, inter-alpha trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3 in the test sample is the same or greater than the amount of the same protein biomarker in the reference data set.
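The direction of change for each additional biomarker can be held in a simple lookup. The following is a hedged sketch: the two lists are transcribed from the conditions above, while the function name `flag_markers` and the per-marker True/False output are hypothetical conveniences rather than anything the specification prescribes.

```python
# Additional biomarkers grouped by the direction described in the text.
UP_IN_CANCER = [
    "hemopexin", "serum amyloid P component", "apolipoprotein B-100",
    "glutathione peroxidase 3", "protocadherin gamma-A8",
    "S100 calcium binding protein 8", "serum amyloid protein A1",
    "serum amyloid protein A2",
]
DOWN_IN_CANCER = [
    "plasma C1 protease inhibitor", "paraoxonase 1",
    "inter-alpha trypsin inhibitor heavy chain H2", "prothrombin",
    "hepatocyte growth factor activator", "apolipoprotein A-IV",
    "carboxypeptidase Q", "mannan-binding lectin serine protease 2",
    "mannose receptor C type I", "profilin-1", "superoxide dismutase 3",
]


def flag_markers(test, reference):
    """Map each measured marker to True when its amount deviates from the
    reference in the cancer-associated direction, otherwise False."""
    flags = {}
    for marker, amount in test.items():
        if marker in UP_IN_CANCER:
            flags[marker] = amount > reference[marker]
        elif marker in DOWN_IN_CANCER:
            flags[marker] = amount < reference[marker]
        else:
            raise KeyError(f"not an additional biomarker: {marker}")
    return flags
```

Returning one flag per marker, rather than a single verdict, leaves the combination of markers to the caller, since the text allows any marker to be used alone or together with others.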
  • the present invention also provides methods where diagnosis is based on assessment of any one of the markers disclosed herein, and in those circumstances any biomarker disclosed herein can be determinative of a diagnosis for colorectal cancer. Further, any biomarker disclosed herein, in combination with one or more of the other markers may be useful for determining the likelihood of an individual having cancer.
  • the present invention includes: a) - determining the amount of ADAM DEC1 in a test sample of bodily fluid from an individual,
  • ADAM DEC1 in the reference data set b) - determining the amount of serum amyloid P component in a test sample of bodily fluid from an individual
  • the cancer is colorectal cancer.
  • the present invention also relates to a method of determining the likelihood of successful treatment of a cancer in an individual. Any of the biomarkers listed herein may be used to determine whether an individual has received a successful treatment for cancer.
  • the present invention provides a method including:
  • the one or more protein biomarkers is selected from the group consisting of: ADAM DEC1, cystatin-C and complement factor D;
  • the amount of ADAM DEC1 in the post-treatment test sample is lower than the amount of the same protein biomarker in a pre-treatment reference sample obtained from the individual before receiving the treatment for cancer;
  • the amount of cystatin-C and/or complement factor D in the post-treatment test sample is greater than the amount of the same protein biomarker in the pre-treatment reference sample;
  • the amount of ADAM DEC1 in the post-treatment test sample is the same or greater than the amount of the same protein biomarker in the pre-treatment reference sample;
  • the amount of cystatin-C and/or complement factor D in the post-treatment test sample is the same or lower than the amount of the same protein biomarker in the pre-treatment reference sample.
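The pre-treatment versus post-treatment comparison can be illustrated as follows. The text as reproduced here does not carry the headings for the two groups of conditions, so this sketch assumes that a change away from the cancer-associated direction (ADAM DEC1 falling; cystatin-C or complement factor D rising) is the pattern associated with successful treatment. The function name `moved_away_from_cancer` is hypothetical.

```python
# +1: amount is greater in cancer; -1: amount is lower in cancer.
CANCER_DIRECTION = {
    "ADAM DEC1": +1,
    "cystatin-C": -1,
    "complement factor D": -1,
}


def moved_away_from_cancer(marker, pre_amount, post_amount):
    """True when the post-treatment amount has moved opposite to the
    cancer-associated direction relative to the individual's own
    pre-treatment sample; False when it is unchanged or has moved
    further in the cancer-associated direction."""
    change = post_amount - pre_amount
    return change * CANCER_DIRECTION[marker] < 0
```

Here the individual serves as their own control, so no population reference data set is needed for this comparison.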
  • the invention may further include measuring the amount of one or more additional biomarkers to determine the likelihood of successful treatment of a cancer in an individual. For example, further to measuring or determining the amounts of ADAM DEC1, cystatin-C and/or complement factor D, a post-treatment test sample from an individual who has received a treatment for cancer may also be analysed for the purpose of measuring or determining the levels of any one or more additional protein biomarkers selected from the group consisting of: hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, serum amyloid protein A2, plasma C1 inhibitor, paraoxonase 1, inter-alpha-trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3; and
  • a) the amount of hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, and/or serum amyloid protein A2 in the post-treatment test sample is lower than the amount of the same protein biomarker in a pre-treatment reference sample obtained from the individual before receiving the treatment for cancer; b) the amount of plasma C1 protease inhibitor, paraoxonase 1, inter-alpha trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3 in the post-treatment test sample is greater than the amount of the same protein biomarker in the pre-treatment reference sample;
  • c) the amount of hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, and/or serum amyloid protein A2 in the post-treatment test sample is the same or greater than the amount of the same protein biomarker in the pre-treatment reference sample; d) the amount of plasma C1 protease inhibitor, paraoxonase 1, inter-alpha trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3 in the post-treatment test sample is the same or lower than the amount of the same protein biomarker in the pre-treatment reference sample.
  • the present invention provides a method of determining whether to treat an individual for cancer, preferably colorectal cancer, including:
  • the one or more protein biomarkers is selected from the group consisting of: ADAM DEC1, cystatin-C and complement factor D;
  • the amount of ADAM DEC1 in the test sample is greater than the amount of ADAM DEC1 in a reference data set in the form of data representative of one or more individuals who do not have cancer;
  • the amount of cystatin-C and/or complement factor D in the test sample is lower than the amount of the same protein biomarker in the reference data set;
  • the amount of ADAM DEC1 in the test sample is the same or lower than the amount of ADAM DEC1 in the reference data set;
  • the present invention also contemplates determining the amounts of one or more additional biomarkers in a test sample from the individual.
  • the present invention includes:
  • the one or more protein biomarkers is selected from the group consisting of ADAM DEC1, cystatin-C, complement factor D, hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, serum amyloid protein A2, plasma C1 protease inhibitor, paraoxonase 1, inter-alpha trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3
  • determining to treat the individual for cancer when: a) the amount of ADAM DEC1, hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, and/or serum amyloid protein A2 in the test sample is greater than the amount of the same protein in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or b) the amount of cystatin-C, complement factor D, plasma C1 protease inhibitor, paraoxonase 1, inter-alpha trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3 in the test sample is lower than the amount of the same protein biomarker in the reference data set;
  • c) the amount of ADAM DEC1, hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, and/or serum amyloid protein A2 in the test sample is the same or lower than the amount of the same protein in the reference data set; d) the amount of cystatin-C, complement factor D, plasma C1 protease inhibitor, paraoxonase 1, inter-alpha trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3 in the test sample is the same or greater than the amount of the same protein biomarker in the reference data set.
  • the reference to “measuring” includes active and passive methods.
  • the skilled person, in performing the invention may actively analyse a sample of bodily fluid obtained from an individual and subject the sample to proteomic, ELISA or any other assay for determining the presence or amount of any protein biomarker described herein.
  • the skilled person may measure the amount of the protein biomarker passively, by referring to a database of clinical results and determining the amount of the relevant protein biomarker in a sample from the individual.
  • the present invention also includes a biomarker panel comprising or consisting of any one or more of the 22 biomarkers listed herein in Table 1.
  • the panel includes at least the protein biomarker ADAM DEC1, the protein cystatin-C or the protein complement factor D.
  • the panel includes at least the proteins ADAM DEC1 and cystatin-C; at least the proteins ADAM DEC1 and complement factor D; or at least the proteins cystatin-C and complement factor D.
  • the panel may comprise all 3 of ADAM DEC1, cystatin-C and complement factor D. Further, the panel may comprise any one, two, three, four, five, six, seven, eight, nine, 10, 11, 12, 13, 14, 15, 16, 17, 18 or 19 of the remaining protein biomarkers listed in Table 1.
  • the biomarker panel as described herein may be used for determining whether an individual has or is at risk of cancer, or whether the individual has or has not received a successful treatment for cancer.
  • a sample of bodily fluid from the individual can be any fluid that contains extracellular components.
  • the fluid is plasma or serum
  • Figure 1 1D gel electrophoresis of (A) crude plasma from unaffected (E) and CRC patients (stage A-D) and (B) low abundance (FT) protein enriched fractions from unaffected (E) and CRC (A-D) patient plasma after API depletion. The analysis was conducted using Novex Tris Tricine 16% gels, and Precision Plus Protein™ Unstained Standards were used as molecular marker (MM), with 4 µg protein loaded into each well.
  • Figure 2 1D gel electrophoresis of ultradepleted CRC plasma samples.
  • the gel electrophoresis was conducted using nUView™ Tris-HEPES Precast Gels (8-16% NuSep® protein gels), with 2.5 µg protein loaded into each well.
  • Figure 3 Number of proteins identified by MARS, API and API+MARS depletion strategy.
  • Figure 4 Differentially expressed proteins identified from MARS, API, and API+MARS depletions based on (A) ANOVA analysis (BH-corrected p-value for multiple testing, MaxFC > 1.5) and (B) peptide-level analyses.
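The filtering criteria named in the Figure 4 caption (a Benjamini-Hochberg corrected p-value and a maximum fold change above 1.5) can be sketched as below. This is an illustration under stated assumptions: the raw ANOVA p-values are taken as given inputs, the significance cut-off `alpha=0.05` is an assumed value that the caption does not state, and all function names are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values never
    # increase as the raw p-values decrease.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def max_fold_change(group_means):
    """Largest group mean divided by the smallest (means must be > 0)."""
    if min(group_means) <= 0:
        raise ValueError("group means must be positive")
    return max(group_means) / min(group_means)


def is_differential(adjusted_p, group_means, alpha=0.05, min_fc=1.5):
    """Apply both criteria: adjusted p below alpha and MaxFC above min_fc."""
    return adjusted_p < alpha and max_fold_change(group_means) > min_fc
```

The correction is applied across all proteins tested in one analysis, so `p_values` should hold one raw p-value per protein.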
  • Figure 5 The protein fold changes between any two groups (benign/healthy, malignant/healthy and malignant/benign) for MARS Depletion (done in duplicate), API depletion (done in duplicate) and API+MARS depletion (done in triplicate) of plasma samples.
  • Each plot represents the log fold change of that comparison across the x-axis, the Log of P-value of the comparison (e.g. un-corrected 2-sample t-test p-value) across the y-axis.
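The coordinates of one point in such a plot can be computed as in the sketch below. The caption does not state the logarithm bases or the sign convention, so base 2 for the fold change and the negative base-10 logarithm for the p-value are assumptions, as is the function name `volcano_point`. The p-value itself is taken as an input from the 2-sample t-test.

```python
import math


def volcano_point(mean_group_a, mean_group_b, p_value):
    """Return (x, y) for one protein in one pairwise comparison.

    x is log2 of the fold change (group A over group B); y is -log10 of
    the uncorrected p-value. Both means and the p-value must be positive.
    """
    if mean_group_a <= 0 or mean_group_b <= 0 or p_value <= 0:
        raise ValueError("means and p-value must be positive")
    x = math.log2(mean_group_a / mean_group_b)
    y = -math.log10(p_value)
    return x, y
```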
  • Figure 8 Group specific abundance patterns for 19 differentially expressed proteins from Table 4. Y axis values were calculated as for Figure 6.
  • Figure 9 1D gel analysis of the human plasma fractions acquired from the API column. Approximately 4 µg of protein per lane was separated on a Novex Tris Tricine 16% gel, using Precision Plus Protein™ Unstained Standards as molecular marker (MM) and staining with Flamingo Pink stain. Lanes 2, 4 and 6 contain the high abundance (BF) and lanes 1, 3 and 5 contain the low abundance (FT) enriched plasma fractions from API depletion.
  • Figure 10 Depletion chromatograms from API depletion when crude CRC stage A-D and unaffected control group (E) plasma (100 µL containing plasma protein approx. 3-4 mg) were injected into the column.
  • the bound (BF) and flow through (FT) fractions indicate the high and low abundance protein enriched fractions.
  • Figure 11 Depletion chromatograms from MARS depletion when API depleted CRC stage A-D and unaffected control group (E) plasma (40 µL containing plasma protein approx. 0.7-1.6 mg) were injected into the column.
  • the bound (BF) and flow through (FT) fractions indicate the high and low abundance protein enriched fractions.
  • Figure 12 Depletion chromatograms from MARS depletion when crude CRC stage A-D and unaffected control group (E) plasma (40 µL containing plasma protein approx. 1.6-2.1 mg) were injected into the column.
  • the bound (BF) and flow through (FT) fractions indicate the high and low abundance protein enriched fractions.
  • Figure 13 Number of unique plasma proteins identified from 4 different peptide fractionation methods (HpH: High pH C18 reversed phase, SEC Size exclusion chromatography, SAX: Strong anion exchange and SCX: Strong cation exchange).
  • cancers are asymptomatic in the early stages of malignancy and it is often not until a patient finally presents at a clinic reporting symptoms that the cancer is detected. This is particularly the case with colorectal cancer (CRC), where up to 50% of CRC patients have occult (asymptomatic) metastases upon presentation to a clinic or medical professional. Accordingly, there is an ongoing clinical need for cancer screening and surveillance methods which facilitate an early diagnosis of cancer, before metastasis occurs, thereby increasing the prospects of a successful treatment of the cancer. Ideally, these screening methods should also facilitate low cost and high-throughput screening of at-risk individuals from the population.
  • cancer specific biomarkers that are in use clinically (e.g., PSA, CEA and CA125) are found in low concentrations (pg-ng/mL) in human plasma and they are thought to be present as a result of structural changes in the microenvironment of cancer tissues which "leak" these markers so that they eventually reach a steady-state level in circulating human plasma.
  • Plasma is routinely collected for therapeutic and/or diagnostic purposes by minimally invasive methods and is therefore one of the most intensely studied clinical specimens [1]. It is believed that changes in the composition of plasma proteins and metabolites released from various organs reflect an individual's physical condition [3, 4]. Despite such keen interest and potential, the clinically important properties of the plasma proteome remain largely unexplored. Perhaps the major reason is that human plasma represents one of the most complex human-derived proteomes, containing a large variety of proteins with a wide dynamic range of concentration (>10 orders of magnitude) [5, 6].
  • liver-derived high abundance plasma proteins e.g., a1 protease inhibitor and other serpins, C-reactive protein, haptoglobin
  • medium abundance proteins e.g., tissue leakage proteins
  • low abundance proteins e.g., cytokines, chemokines and interleukins
  • biomarkers CA19-9, CA125, prostate specific antigen (PSA), AFP and CEA
  • By developing a unique method for depleting abundant proteins from human blood samples, the present inventors have been able to identify a novel set of CRC-specific biomarkers.
  • the present invention relates to a method of determining the likelihood an individual has cancer including:
  • the individual has a high likelihood of having cancer when: a) the amount of ADAM DEC1 in the test sample is greater than the amount of ADAM DEC1 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or b) the amount of cystatin-C and/or complement factor D in the test sample is lower than the amount of the same protein biomarker in the reference data set;
  • determining that the individual has a low likelihood of having cancer when: c) the amount of ADAM DEC1 in the test sample is the same or lower than the amount of ADAM DEC1 in the reference data set; d) the amount of cystatin-C and/or complement factor D in the test sample is the same or greater than the amount of the same protein biomarker in the reference data set.
  • any one or more additional biomarkers may be used in the diagnostic methods of the present invention. Examples of such additional biomarkers are discussed later in this document.
  • the individual for whom the likelihood of having cancer is required may be an individual who has presented to the clinic with symptoms which may be indicative of a cancer.
  • the individual does not have any overt symptoms of cancer but may require screening for cancer, for example, due to a concern that there is a family history of one or more cancers.
  • the individual for whom the likelihood of having cancer is to be determined is an individual requiring diagnosis of colorectal cancer.
  • the individual may or may not have a family history of colorectal cancer.
  • the individual may or may not have presented with symptoms suggestive of a colorectal neoplasm, including evidence of blood in the stools, a positive result from an FOBT test, abdominal pain, or changes in bowel habits.
  • One particular advantage of the present invention is that it enables early detection of a colorectal cancer, for example, detection when the cancer is not yet metastatic. This provides a significant advantage over prior methods which often fail to detect the cancer until it is significantly more advanced, and potentially metastatic.
  • the current methods of colorectal cancer screening, such as carcinoembryonic antigen (CEA) methods, do not enable detection of colorectal cancer in Dukes Stages A and B (benign), and typically only yield positive results when the cancer is in stages C or D (malignant).
  • Having a screening method which enables detection of colorectal cancer in stages A and B provides a significant advantage over the methods of the prior art, since cancers in Dukes Stages A and B can be more easily treated, and individuals diagnosed with cancer in this stage can expect to receive successful treatment and enter into full remission.
  • the methods of the present invention facilitate detection of colorectal cancer at all of the Dukes Stages.
  • the skilled person will be familiar with the usage of the Dukes staging for determining the malignancy of a colorectal cancer.
  • Dukes Stage A: the tumour has penetrated into, but not through, the bowel wall.
  • Dukes Stage B: the tumour has penetrated through the bowel wall but there is not yet any lymph node involvement.
  • Dukes Stage C: the cancer involves regional lymph nodes.
  • Dukes Stage D: there is distant metastasis, for example, to the liver or lung.
  • the methods of the present invention are able to diagnose or detect colorectal cancer at any Dukes Stage with a sensitivity of at least 80%.
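For reference, sensitivity is the fraction of individuals with the disease who test positive. The sketch below shows the arithmetic behind a target such as "at least 80%"; the function names and the idea of checking each stage separately against a `target` value are illustrative assumptions, and no study counts are implied.

```python
def sensitivity(true_positives, false_negatives):
    """Sensitivity = TP / (TP + FN): the share of true cases detected."""
    total_cases = true_positives + false_negatives
    if total_cases == 0:
        raise ValueError("no confirmed cases to evaluate")
    return true_positives / total_cases


def meets_target(counts_by_stage, target=0.80):
    """counts_by_stage maps a stage label to (true_positives,
    false_negatives). Returns True only if every stage reaches the target."""
    return all(
        sensitivity(tp, fn) >= target for tp, fn in counts_by_stage.values()
    )
```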
  • TNM Tumors of the American Joint Committee on Cancer
  • the TNM system describes 3 key pieces of information: “T”, “N” and “M”, where “T” denotes the degree of invasion of the intestinal wall, “N” denotes the degree of lymphatic node involvement and “M” the degree of metastasis (numbers or letters appear after T, N, and M to provide more details about each of these factors. The numbers 0 through 4 indicate increasing severity. The letter X means “cannot be assessed because the information is not available.”). More specifically:
  • T describes how far the main (primary) tumour has grown through the layers of the intestine and whether it has grown into nearby areas. These layers, from the inner to the outer, include:
  • N describes the extent of spread to nearby (regional) lymph nodes and, if the cancer has spread, how many lymph nodes are involved.
  • Nx: no description of lymph node involvement is possible because of incomplete information
  • N1a: cancer cells are found in 1 nearby lymph node
  • N1b: cancer cells are found in 2 to 3 nearby lymph nodes
  • N1c: small deposits of cancer cells are found in areas of fat near lymph nodes, but not in the lymph nodes themselves;
  • N2a: cancer cells are found in 4 to 6 nearby lymph nodes
  • N2b: cancer cells are found in 7 or more nearby lymph nodes.
  • M1a: the cancer has spread to 1 distant organ or set of distant lymph nodes
  • M1b: the cancer has spread to more than 1 distant organ or set of distant lymph nodes, or it has spread to distant parts of the peritoneum (the lining of the abdominal cavity).
  • the skilled person will appreciate that the Dukes Stages correspond to certain TNM Classifications.
  • Dukes Stage A corresponds to T1, T2, N0 and M0
  • Dukes Stage B corresponds to T3, T4a, T4b, N0 and M0
  • Dukes Stage C corresponds to i) T1-T2, N1/N1c and M0; ii) T1, N2a and M0; iii) T3-T4a, N1/N1c and M0; iv) T2-T3, N2a and M0; v) T1-T2, N2b and M0; vi) T4a, N2a and M0; vii) T3-T4a, N2b and M0; and viii) T4b, N1-N2 and M0.
  • TNM classification as known in the art.
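The correspondence between TNM classifications and Dukes Stages can be expressed as a small lookup function. This is a sketch under assumptions: the function name `dukes_stage` and the string format of its arguments are hypothetical, every listed Dukes Stage C combination is collapsed into "any regional node involvement (N1 or N2) with M0", and Dukes Stage D is taken to be any M1 classification on the basis that it denotes distant metastasis.

```python
def dukes_stage(t, n, m):
    """Map TNM categories (e.g. "T3", "N1a", "M0") to a Dukes Stage letter."""
    t, n, m = t.upper(), n.upper(), m.upper()
    if m.startswith("M1"):
        return "D"  # assumption: any distant metastasis (M1a or M1b)
    if m != "M0":
        raise ValueError(f"cannot stage with M category {m!r}")
    if n.startswith(("N1", "N2")):
        return "C"  # regional lymph node involvement, no distant spread
    if n != "N0":
        raise ValueError(f"cannot stage with N category {n!r}")
    if t in ("T1", "T2"):
        return "A"
    if t in ("T3", "T4A", "T4B"):
        return "B"
    raise ValueError(f"cannot stage with T category {t!r}")
```

Categories that cannot be assessed (for example Nx) raise an error rather than being guessed, since the correspondence is only defined for fully described tumours.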
  • an “early stage malignant neoplasm” in the context of colorectal cancer is a reference to a large intestine neoplasm which has become malignant but which is unlikely to extend beyond the bowel wall.
  • Reference to a “late stage malignant neoplasm” should be understood as a reference to a large intestine neoplasm which is malignant and which has spread to lymph nodes or distant organs.
  • Reference to late stage malignant neoplasms includes, for example, neoplasms which have become metastatic.
  • the term “metastasis” is meant to refer to the process in which cancer cells originating in one organ or part of the body relocate to another part of the body and continue to replicate. Metastasized cells subsequently form tumours which may further metastasize. Metastasis thus refers to the spread of cancer from the part of the body where it originally occurs to other parts of the body.
  • the term “metastasized colorectal cancer cells” is meant to refer to colorectal cancer cells which have metastasized, that is, colorectal cancer cells localized in a part of the body other than the colon or rectum.
  • Protein biomarkers and samples of bodily fluid requires measurement or determining the level or amount of one or more protein biomarkers in a sample of bodily fluid taken from an individual.
  • the present inventors have shown that determining the amount or level of these biomarkers in a sample of bodily fluid from an individual allows for the detection or diagnosis of a cancer, in particular a colorectal cancer.
  • “measuring” means determining the level of one or more biomarkers in a sample of bodily fluid obtained from an individual.
  • the step of determining the levels may be performed by any method known to the person skilled in the art, and as further described herein (for example, by any proteomic, or protein or peptide detection technique available for measuring the presence or absence, or level or amount of a protein, variant or fragment thereof in the sample).
  • the step of measuring or determining need not be an active step.
  • the skilled person may refer to a database of clinical results obtained from the individual requiring diagnosis, and therefrom, measure or determine the amount of the relevant protein biomarker, relative to a suitable control.
  • the control may be a database of one or more samples from individuals who do or do not have cancer and thereby provide a reference point with respect to the amount of a given biomarker in a sample of bodily fluid.
  • the individual may serve as their own control (for example, where the control may be a sample obtained from the individual before receiving treatment for cancer, or at an earlier stage in their life).
  • a sample of bodily fluid is a biological sample and includes a whole blood sample, saliva, interstitial fluid, urine, faeces or derivatives thereof.
  • the blood sample is a plasma or serum sample.
  • biomarkers can also be detected in biological samples including biopsies of solid tissue or tumour, but which also include some blood or interstitial fluid. Such samples can also be exploited for detecting the amount of the protein biomarkers described herein for determining the likelihood of an individual having cancer.
  • Prior to testing for the presence of protein biomarkers, the sample of bodily fluid may be subjected to pre-treatment.
  • Pre-treatment may involve, for example, preparing plasma from blood, diluting viscous fluids, and the like. Such methods may involve filtration, distillation, separation, concentration, inactivation of interfering components, and the addition of reagents.
  • the selection and pre-treatment of biological samples prior to testing is well known in the art and need not be described further.
  • the skilled person will be familiar with various methods for obtaining biological samples, including blood samples, from individuals. Further, the skilled person will be familiar with various methods for ensuring proper storage of samples to ensure that there is no appreciable degradation of the protein contents of the sample (for example, the use of vacutainer blood collection tubes containing EDTA, heparin, citrate or other additives to prevent clotting or preserve the quality of the blood sample). The skilled person will also be familiar with methods for extracting plasma and serum from samples of whole blood.
  • the protein biomarkers that are useful in the methods of the present invention are selected from ADAM DEC1, cystatin-C, complement factor D, plasma C1 inhibitor, paraoxonase 1, hemopexin, inter-alpha-trypsin inhibitor heavy chain H2, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, serum amyloid protein A2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and superoxide dismutase 3.
  • any one or more of the protein biomarkers disclosed herein may be used in the methods of the present invention.
  • ADAM DEC1 includes any fragment, peptide sequence or variant of the protein ADAM DEC1 (also referred to as ADAM-like decysin 1 or disintegrin and metalloproteinase domain-like protein decysin-1 and having Uniprot ID O15204).
  • cystatin-C includes any fragment, peptide sequence or variant of the protein cystatin C.
  • This protein may also be referred to as cystatin 3, gamma trace, post-gamma-globulin or neuroendocrine basic polypeptide and has the Uniprot ID P01034.
  • protein biomarker complement factor D includes any fragment, peptide sequence or variant of the protein complement factor D.
  • This protein may also be referred to as factor D protein, CFD, C3 proactivator convertase, properdin factor D esterase, factor D (complement), complement factor D, and adipsin.
  • This protein is classified as belonging to EC 3.4.21.46 and has Uniprot ID P00746.
  • protein biomarker plasma protease C1 inhibitor includes any fragment, peptide sequence or variant of the protein plasma protease C1 inhibitor.
  • the protein may also be referred to as C1-inh, C1 esterase inhibitor or Serpin G1 and has the Uniprot ID P05155.
  • reference to the protein biomarker paraoxonase 1 includes any fragment, peptide sequence or variant of the protein paraoxonase 1.
  • This protein may also be referred to as serum paraoxonase, PON-1, arylesterase 1, A esterase, homocysteine thiolactonase or serum aryldialkylphosphatase 1 and has Uniprot ID P27169.
  • protein biomarker hemopexin includes any fragment, peptide sequence or variant of the protein hemopexin.
  • This protein may also be referred to as haemopexin, HPX, or beta-1 B-glycoprotein and has Uniprot accession no P02790.
  • reference to the protein biomarker inter-alpha-trypsin inhibitor heavy chain H2 includes any fragment, peptide sequence or variant of the protein inter-alpha-trypsin inhibitor heavy chain H2, also known as ITI-HC2, Serum-derived hyaluronan-associated protein or Inter-alpha-trypsin inhibitor complex component II. This protein is identified in Uniprot by the accession number P19823.
  • reference to the protein biomarker serum amyloid P component includes any fragment, peptide sequence or variant of the protein serum amyloid P component.
  • This protein may also be referred to as APCS, SAP or 9.5S alpha-1-glycoprotein and has Uniprot ID P02743.
  • reference to the protein apolipoprotein A-IV includes any fragment, peptide sequence or variant of the protein apolipoprotein A-IV.
  • This protein may also be referred to as APOA4, and has Uniprot ID P06727.
  • apolipoprotein B-100 includes any fragment, peptide sequence or variant of the protein apolipoprotein B-100.
  • This protein may also be referred to as APOB or apolipoprotein B-100, and has Uniprot ID P04114.
  • reference to the protein biomarker carboxypeptidase Q includes any fragment, peptide sequence or variant of the protein carboxypeptidase Q.
  • this protein may also be referred to as CPQ, lysosomal dipeptidase or plasma glutamate carboxypeptidase and has Uniprot ID Q9Y646.
  • reference to the protein biomarker prothrombin includes any fragment, peptide sequence or variant of the protein prothrombin. This protein may also be referred to as F2, Coagulation factor II and has Uniprot ID P00734.
  • reference to the protein biomarker glutathione peroxidase 3 includes any fragment, peptide sequence or variant of the protein glutathione peroxidase 3, or GPX3. This protein may also be referred to as plasma glutathione peroxidase (GPx-P) or extracellular glutathione peroxidase and has Uniprot ID P22352.
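  • reference to the protein biomarker hepatocyte growth factor activator (HFAC) includes any fragment, peptide sequence or variant of the protein hepatocyte growth factor activator.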
  • This protein may also be referred to as HGFA and has Uniprot ID Q04756.
  • reference to the protein biomarker mannan-binding lectin serine protease 2 (MASP2) includes any fragment, peptide sequence or variant of the protein MASP2.
  • This protein may also be referred to as mannose-binding protein-associated serine protease 2 and has Uniprot ID O00187.
  • reference to the protein biomarker mannose-receptor C1 includes any fragment, peptide sequence or variant of the protein mannose-receptor C1.
  • This protein may also be referred to as MRC1, macrophage mannose receptor 1 or C-type lectin domain family 13 member D.
  • This protein has the Uniprot ID P22987.
  • reference to the protein biomarker protocadherin gamma-A8 includes any fragment, peptide sequence or variant of the protein PCDHGA8.
  • This protein may also be referred to as PCDH-gamma-A8 and has the Uniprot ID Q9Y5G5.
  • reference to the protein biomarker profilin-1 includes any fragment, peptide sequence or variant of the protein profilin-1.
  • This protein may also be referred to as PFN-1 or epididymis tissue protein Li 184a and has Uniprot ID P07737.
  • protein biomarker S100 calcium binding protein 8 includes any fragment, peptide sequence or variant of protein S100A8.
  • This protein may also be referred to as calgranulin-A, calprotectin L1 L subunit, cystic fibrosis antigen (CFAG), leukocyte L1 complex light chain, migration inhibitory factor-related protein 8 (MRP-8), or urinary stone protein band A, and has the Uniprot ID P05109.
  • protein biomarker serum amyloid protein A1 includes any fragment, peptide sequence or variant of the protein SAA1.
  • This protein may also be referred to as SAA and has the Uniprot ID: P0DJI8.
  • protein biomarker serum amyloid protein A2 includes any fragment, peptide sequence or variant of the protein SAA2. This protein may also be referred to as serum amyloid A2 protein and has Uniprot ID: P0DJI9.
  • As used herein, reference to the protein biomarker superoxide dismutase 3 (SOD3) includes any fragment, peptide sequence or variant of the protein SOD3. This protein may also be referred to as extracellular superoxide dismutase [Cu-Zn] or EC-SOD and has Uniprot ID: P08294.
  • Table 1 (columns: Protein, Accession, Amino acid sequence), sequence fragments:
  • HYTSTAQPCPYPMAPPNGHVSPVQAKYILKDSFSIFCETGYELLQGHLPLKSFTAVCQKDGSWDRPMPACSIVDCGPPDDLPSGRVEYITGPGVTTYKAVIQYSCEETFYTMKVNDGKYVCEADGFWTSSKGEKSLPVCEPVCGL
  • WAAEVISNARENIQRLTGRGAEDSLADQAANKWGRSGRDPNHFRPAGLPEKY (SEQ ID NO: 21)
  • Exemplary peptides which can be detected as being representative of the protein are shown in underline and bold in Table 1 above.
  • Apolipoprotein A-IV, sp|P06727|APOA4_HUMAN: PLAEDVRGNLR
  • Prothrombin, sp|P00734|THRB_HUMAN: QEC[CAM]SIPVC[CAM]GQDQVTVAM[Oxi]TPR
  • Prothrombin, sp|P00734|THRB_HUMAN: IVEGSDAEIGM[Oxi]SPWQVMLFR
  • Prothrombin, sp|P00734|THRB_HUMAN: LAVTTHGLPC[CAM]LAW[Ntr]ASAQAK
  • Prothrombin, sp|P00734|THRB_HUMAN: LAAC[CAM]LEGNC[CAM]AEGLGTNYR
  • Prothrombin, sp|P00734|THRB_HUMAN: TFGSGEADC[CAM]GLR[Dea]PLFEK
  • Prothrombin, sp|P00734|THRB_HUMAN: RQEC[CAM]SIP[Oxi]VC[CAM]GQDQVTVAM[Oxi
  • Prothrombin, sp|P00734|THRB_HUMAN: GSGEADC[CAM]GLRPLFEK
  • Prothrombin, sp|P00734|THRB_HUMAN: RQEC[CAM]SIPVC[CAM]GQ[Dea]DQVTVAM[0
  • Prothrombin, sp|P00734|THRB_HUMAN: RQEC[CAM]SIPVC[CAM]GQDQVTVAM[Oxi]TP
  • Prothrombin, sp|P00734|THRB_HUMAN: RQEC[CAM]SIPVC[CAM]GQDQVTVAMTPR
  • Prothrombin, sp|P00734|THRB_HUMAN: PVC[CAM]GQDQVTVAM[DTM]TPR
  • Prothrombin, sp|P00734|THRB_HUMAN: ETWTANVGKGQPSVLQVVNLPIVERPVC[CAM]
  • Prothrombin, sp|P00734|THRB_HUMAN: TAT[Dhy]SEYQTFFNPR
  • Prothrombin, sp|P00734|THRB_HUMAN: GDAC[CAM]EGDSGGPFVM[Oxi]K
  • Prothrombin, sp|P00734|THRB_HUMAN: GDAC[CAM]EGDSGGPFVMK
  • Prothrombin, sp|P00734|THRB_HUMAN: IVEGSDAEIGMSPWQVM[Oxi]LFR
  • Prothrombin, sp|P00734|THRB_HUMAN: DKLAAC[CAM]LEGN[Oxi]C[CAM]AEGLGTNY[0
  • Prothrombin, sp|P00734|THRB_HUMAN: DKLAAC[CAM]LEGN[Oxi]C[CAM]AEGLGTNYR
  • Prothrombin, sp|P00734|THRB_HUMAN: DKLAAC[CAM]LEGN[Dea]C[CAM]AEGLGTNYR
  • Prothrombin, sp|P00734|THRB_HUMAN: LAVTTHGLP[PGP]C[CAM]LAW
  • Prothrombin, sp|P00734|THRB_HUMAN: TSEYQTFFNPR
  • Apolipoprotein B-100, sp|P04114|APOB_HUMAN: SPAFTDLHLR
  • the above listed peptides are proteotypic (unique to the protein), i.e., there are no other known human proteins having the same peptide sequences or single amino acid variations thereof. Accordingly, when detecting proteotypic peptides, the skilled person can have greater confidence that the peptide detected corresponds to the relevant protein biomarker and not to an unrelated protein.
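  • As a rough illustration of the proteotypic criterion, the sketch below strips bracketed modification tags (such as [CAM] or [Oxi]) from an annotated peptide and checks whether the bare sequence occurs in exactly one protein of a sequence collection. The function names and the toy sequences are hypothetical; a real check would run against the full set of known human proteins and would also consider single amino acid variations, which this sketch does not.

```python
import re


def strip_modifications(annotated):
    """Remove bracketed modification tags, e.g. 'C[CAM]' becomes 'C'."""
    return re.sub(r"\[[^\]]*\]", "", annotated)


def proteins_containing(peptide, proteome):
    """Return the names of all proteins whose sequence contains the peptide."""
    return [name for name, sequence in proteome.items() if peptide in sequence]


def is_proteotypic(annotated, proteome):
    """True when the bare peptide maps to exactly one protein in the collection."""
    bare = strip_modifications(annotated)
    return len(proteins_containing(bare, proteome)) == 1


# Toy, made-up sequences for illustration only (not real protein sequences).
toy_proteome = {
    "protein_A": "MKTSEYQTFFNPRGDAC",
    "protein_B": "MKLLSPAFTDLHLRAAA",
    "protein_C": "MKGDACEGDSGGPFVMK",
}

print(is_proteotypic("GDAC[CAM]EGDSGGPFVM[Oxi]K", toy_proteome))
print(is_proteotypic("MK", toy_proteome))
```

  • In this toy collection the modified peptide maps only to protein_C, whereas the short peptide MK occurs in all three sequences and is therefore not proteotypic.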
  • the method of determining the likelihood that an individual has cancer involves measuring the amount of any one of the biomarkers ADAM DEC1, cystatin-C and complement factor D, in the test sample.
  • the method of the present invention involves determining the amount of two or more of the biomarkers. For example, in one embodiment, the method involves measuring the amount of ADAM DEC1 and cystatin-C in the test sample. Alternatively, the method involves measuring the amount of ADAM DEC1 and complement factor D in the test sample. In yet a further embodiment, the method involves measuring the amount of cystatin-C and complement factor D in the test sample.
  • the method involves measuring the amount of all three of ADAM DEC1, cystatin-C and complement factor D in the test sample.
  • the method of the present invention involves determining the amount of any one or more of the following biomarkers (which the inventors have shown to have altered levels in serum in individuals with cancer, as compared with healthy individuals):
  • in this embodiment, the individual for whom the biomarker levels are measured will be determined to have a high likelihood of having cancer when:
  • the amount of any one or more of hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, and/or serum amyloid protein A2 in the test sample is greater than the amount of the same protein in a reference data set (in the form of representative data of one or more individuals who do not have cancer); and/or
  • the amount of any one or more of plasma protease C1 inhibitor, paraoxonase 1, inter-alpha-trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3 in the test sample is lower than the amount of the same protein biomarker in the reference data set.
  • Still further, the individual for whom the biomarker levels are measured will be determined to have a low likelihood of having cancer when:
  • the amount of any one or more of hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, and/or serum amyloid protein A2 in the test sample is the same or lower than the amount of the same protein in the reference data set; and/or
  • the amount of any one or more of plasma protease C1 inhibitor, paraoxonase 1, inter-alpha-trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3 in the test sample is the same or greater than the amount of the same protein biomarker in the reference data set.
  • the methods of the present invention contemplate methods for determining the likelihood of an individual having cancer (or having had a successful treatment for cancer as the case may be), wherein the methods include determining the levels of a single biomarker as described herein, in a sample of biological fluid from the individual.
  • the present invention also includes methods which include determining the levels of two, three, four, five, six, seven, eight, nine, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21 or 22 biomarkers as described herein.
  • the confidence with which the skilled person can determine the likelihood of an individual having a cancer increases with the number of protein biomarkers measured.
  • for example, where the method involves measuring all three of ADAM DEC1, cystatin-C and complement factor D, and the amount of ADAM DEC1 in the test sample is greater than in a reference data set (in the form of data representative of one or more individuals who do not have cancer), and the amounts of cystatin-C and complement factor D are lower than the amounts of the same protein biomarkers in the reference data set, the determination that the individual likely has cancer has greater power.
  • the more biomarkers used in the methods of the present invention, the greater the sensitivity of the method. In comparison, if only one biomarker is measured, then the power of the determination may be reduced.
  • the present invention provides methods for determining the amounts of one or more of ADAM DEC1, cystatin-C and complement factor D in a test sample of bodily fluid from an individual, and one or more additional biomarkers. For example, in certain embodiments, the present invention involves measuring the amount of any one of ADAM DEC1, cystatin-C or complement factor D, or any two of ADAM DEC1, cystatin-C or complement factor D, or all three of ADAM DEC1, cystatin-C and complement factor D, in conjunction with any one or more of the following: hemopexin,
  • the individual for whom the biomarkers are measured will be determined to have a high likelihood of having cancer when:
  • the amount of any one or more of ADAM DEC1, hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, and/or serum amyloid protein A2 in the test sample is greater than the amount of the same protein in a reference data set (in the form of representative data of one or more individuals who do not have cancer); and/or
  • the individual for whom the biomarker levels are measured will be determined to have a low likelihood of having cancer when:
  • the amount of any one or more of ADAM DEC1, hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, and/or serum amyloid protein A2 in the test sample is the same or lower than the amount of the same protein in the reference data set; and/or
  • the amount of any one or more of cystatin-C, complement factor D, plasma protease C1 inhibitor, paraoxonase 1, inter-alpha-trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3 in the test sample is the same or greater than the amount of the same protein biomarker in the reference data set.
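  • The comparison rules above can be sketched as follows. This is a minimal illustration, assuming each biomarker amount has already been normalised and that the reference data set is summarised by a single value per biomarker. The two biomarker groupings follow the text; the function name, the dictionary layout and the example amounts are assumptions made for illustration and are not part of the described method.

```python
# Direction of change in cancer, as grouped in the text above.
HIGHER_IN_CANCER = {
    "ADAM DEC1", "hemopexin", "serum amyloid P component",
    "apolipoprotein B-100", "glutathione peroxidase 3",
    "protocadherin gamma-A8", "S100 calcium binding protein 8",
    "serum amyloid protein A1", "serum amyloid protein A2",
}
LOWER_IN_CANCER = {
    "cystatin-C", "complement factor D", "plasma protease C1 inhibitor",
    "paraoxonase 1", "inter-alpha-trypsin inhibitor heavy chain H2",
    "prothrombin", "hepatocyte growth factor activator",
    "apolipoprotein A-IV", "carboxypeptidase Q",
    "mannan-binding lectin serine protease 2",
    "mannose receptor C type I", "profilin-1", "superoxide dismutase 3",
}


def compare_to_reference(test_amounts, reference_amounts):
    """Sort measured biomarkers by whether they point towards or away from cancer.

    Both arguments map biomarker name -> normalised amount (hypothetical layout).
    Biomarkers outside the two groups above are ignored.
    """
    towards, away = [], []
    for marker, amount in test_amounts.items():
        if marker in HIGHER_IN_CANCER:
            indicates_cancer = amount > reference_amounts[marker]
        elif marker in LOWER_IN_CANCER:
            indicates_cancer = amount < reference_amounts[marker]
        else:
            continue
        (towards if indicates_cancer else away).append(marker)
    return {"high_likelihood": sorted(towards), "low_likelihood": sorted(away)}


# Made-up amounts for illustration only.
reference = {"ADAM DEC1": 1.0, "cystatin-C": 1.0, "complement factor D": 1.0}
test_sample = {"ADAM DEC1": 1.8, "cystatin-C": 0.6, "complement factor D": 1.0}
result = compare_to_reference(test_sample, reference)
print(result)
```

  • Reporting which biomarkers fall on each side, rather than a single yes or no, reflects the "and/or" wording of the rules: how many concordant biomarkers are needed for a determination is left to the practitioner.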
  • the present invention involves measuring in a sample of bodily fluid obtained from an individual, the amount of one or more proteins in the sample that are indicative of colorectal cancer.
  • the method may comprise contacting a biopsy, including a sample of bodily fluid derived from the subject, with a compound capable of binding to a protein biomarker, and detecting the formation of a complex between the compound and the biomarker polypeptide.
  • protein biomarker as used herein includes fragments of biomarker protein, including for example, immunogenic fragments and epitopes of the biomarker polypeptide.
  • the compound that binds the biomarker is an antibody.
  • antibody as used herein includes intact molecules as well as molecules comprising or consisting of fragments thereof, such as, for example Fab, F(ab')2, Fv and scFv, as well as engineered variants including diabodies, triabodies, mini-bodies and single-domain antibodies which are capable of binding an epitopic determinant.
  • antibodies may exist as intact immunoglobulins, or as modifications in a variety of forms.
  • an antibody to a protein biomarker is detected in a patient sample, wherein the amount of the antibody in the sample is informative in relation to whether the individual has cancer.
  • Preferred detection systems contemplated herein include any known assay for detecting proteins or antibodies in a biological test sample, such as, for example, SDS/PAGE, isoelectric focussing, 2-dimensional gel electrophoresis comprising SDS/PAGE and isoelectric focussing, an immunoassay, flow cytometry e.g. fluorescence-activated cell sorting (FACS), a detection based system using an antibody or non-antibody compound, such as, for example, a small molecule (e.g.
  • the antibody or small molecule may be used in any standard solid phase or solution phase assay format amenable to the detection of proteins.
  • Optical or fluorescent detection such as, for example, using mass spectrometry, MALDI-TOF, biosensor technology, evanescent fiber optics, or fluorescence resonance energy transfer, is clearly encompassed by the present invention.
  • Assay systems suitable for use in high throughput screening of mass samples e.g. a high throughput spectroscopy resonance method (e.g. MALDI-TOF, electrospray MS or nano-electrospray MS), are also contemplated.
  • Another suitable protein detection technique involves the use of Multiple Reaction Monitoring (MRM) or Parallel Reaction Monitoring (PRM) in LC-MS (LC/MRM-MS and LC/PRM-MS) or SWATH-MS as described in [11].
  • Immunoassay formats are particularly suitable for detecting protein biomarkers in accordance with the method of the instant invention and include for example immunoblot, Western blot, dot blot, enzyme linked immunosorbent assay (ELISA), radioimmunoassay (RIA), enzyme immunoassay.
  • Modified immunoassays utilizing fluorescence resonance energy transfer (FRET), isotope-coded affinity tags (ICAT), matrix-assisted laser desorption/ionization time of flight (MALDI-TOF), electrospray ionization (ESI), biosensor technology, evanescent fiber-optics technology or protein chip technology are also useful.
  • immunoprecipitation is the technique of precipitating an antigen out of solution using an antibody specific to that antigen.
  • the process can be used to identify protein complexes present in cell extracts by targeting a protein believed to be in the complex.
  • the complexes are brought out of solution by insoluble antibody-binding proteins isolated initially from bacteria, such as Protein A and Protein G.
  • the antibodies can also be coupled to sepharose beads that can easily be isolated out of solution. After washing, the precipitate can be analyzed using mass spectrometry, Western blotting, or any number of other methods for identifying constituents in the complex.
  • An ELISA, short for Enzyme-Linked Immunosorbent Assay, is a biochemical technique to detect the presence of an antibody or an antigen in a sample. It utilizes a minimum of two antibodies, one of which is specific to the antigen and the other of which is coupled to an enzyme. The second antibody will cause a chromogenic or fluorogenic substrate to produce a signal. Variations of ELISA include sandwich ELISA, competitive ELISA, and ELISPOT. Because the ELISA can be performed to evaluate either the presence of antigen or the presence of antibody in a sample, it is a useful tool both for determining serum antibody concentrations and also for detecting the presence of antigen.
  • Quantitative immuno-polymerase chain reaction utilizes nucleic acid amplification techniques to increase signal generation in antibody-based immunoassays.
  • the target proteins are bound to antibodies which are directly or indirectly conjugated to oligonucleotides. Unbound antibodies are washed away and the remaining bound antibodies have their oligonucleotides amplified.
  • Protein detection occurs via detection of amplified oligonucleotides using standard nucleic acid detection methods, including real-time methods. Exemplary methods for performing iPCR are described in Niemeyer et al., (2007) Nature Protocols, 2:1918-30.
  • Multiplexing systems such as Proseek Proximity Extension Assay and Bioplex Multiplex Assay are examples of suitable platforms for conducting immunoassays for the purposes of determining the amounts of the protein biomarkers herein described.
  • the present invention relates to an array for use in detecting the amount of proteins, or fragments thereof in a sample, the array including a solid support and antibodies attached to the solid support, wherein the antibodies are capable of binding to one or more of ADAM DEC1, cystatin-C and complement factor D, or fragments or variations thereof.
  • the support comprises antibodies to ADAM DEC1 and cystatin-C. In alternative embodiments, the support comprises antibodies to ADAM DEC1 and complement factor D. In another embodiment, the support comprises antibodies to cystatin-C and complement factor D. In a particularly preferred embodiment, the support comprises antibodies which are capable of binding to each of ADAM DEC1, cystatin-C and complement factor D.
  • the sample of bodily fluid is subjected to preliminary processing designed to isolate or enrich the sample for low abundance proteins.
  • a blood plasma or serum sample is subjected to immunodepletion using the Abundant Protein Immunodepletion (API) method as described in US 2011/008900 and AU 2002951240. Briefly, API columns are prepared using HPLC purified anti-human plasma chicken polyclonal IgYs that were derived from seven protein repetitive orthogonal offline fractionation (PROOF) fractions. Purified IgYs are immobilised on resin and used to remove abundant proteins from the test plasma or serum samples.
  • a combination of immunodepletion strategies can be used to deplete high and medium abundance proteins prior to performing the protein detection.
  • the API depleted plasma sample is subjected to further immunodepletion, using a commercially available immunodepletion method which removes a number of other high abundance proteins.
  • examples of such methods include the multiple affinity removal system (MARS) kits from Agilent Technologies and the ProteoPrep immunodepletion kit from Sigma.
  • the reference data set may be in the form of representative data from one or more healthy individuals, more particularly, individuals who do not have cancer.
  • the reference data set will contain sufficient representative data to enable the skilled person to determine, with an appropriate degree of certainty, whether there is a high or low likelihood that an individual has cancer.
  • the reference data set contains reference data from at least 10 individuals who have undergone various screening tests to determine the absence of cancer.
  • the skilled person will also appreciate the greater prospects of correctly diagnosing an individual as having cancer, if the dataset contains data from a greater number of individuals.
  • the reference dataset contains protein reference data from 10 or more individuals, 25 or more, 50 or more, 100 or more, 200 or more, 400 or more, 600 or more, 800 or more, or 1000 or more individuals.
  • the reference data set is not limited to data pertaining to the protein biomarkers identified in accordance with the invention.
  • the data set may also contain information pertaining to other biomarkers which may be used to supplement the protein expression level information and assist the skilled practitioner in making a determination on the diagnosis of cancer.
  • the methods of the present invention include a comparison of the amount of protein biomarker measured in a sample from the individual who is a selection candidate for a given treatment. In some embodiments, that comparison may arise from an examination of the normalised amounts of protein in a sample from the individual, and direct visual comparison against the same proteins listed in a reference database, such as, for example, an Excel spreadsheet.
  • the skilled person will also appreciate that the invention is not so limited, such that the amount of protein in a sample may be measured in the same experiment as the amounts of protein in the one or more individuals making up the reference data set.
  • Data analysis
  • a computer-based analysis program is used to translate the raw data generated by the detection assay (e.g., the expression level of a given marker or markers) into data of predictive value for a clinician.
  • the clinician can access the predictive data using any suitable means.
  • the present invention provides the further benefit that the clinician, who is not likely to be trained in genetics or molecular biology, need not understand the raw data.
  • the data is presented directly to the clinician in its most useful form. The clinician is then able to immediately utilize the information in order to optimize the care of the subject.
  • the present invention contemplates any method capable of receiving, processing, and transmitting the information to and from laboratories conducting the assays, information providers, medical personnel, and subjects.
  • for example, a sample (e.g., a biopsy or a serum or stool sample) is obtained from a subject and submitted to a profiling service (e.g., clinical lab at a medical facility, genomic profiling business, etc.).
  • No greater than, in relation to the amount of a protein biomarker, means that the amount of protein in the test sample is less than, approximately the same as, or no more than 10% greater than the amount of the same protein biomarker in the representative data set.
  • the amount of protein biomarker is no more than 5% greater than the amount of the same protein biomarker in the representative data set.
  • substantially less than means that the amount of biomarker in the test sample is more than 10% less than the amount of biomarker in the representative data set.
  • the same as, in relation to the amount of protein biomarker in the reference data set, means an amount that is no more than 5% more or less than the amount of the measured protein biomarker.
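  • The percentage thresholds defined above can be expressed as simple relative-difference checks. The sketch below is illustrative only: it assumes all percentages are taken relative to the reference amount (the text leaves the denominator open for "the same as"), and the function names and default tolerances are hypothetical labels for the 10% and 5% figures given in the definitions.

```python
def relative_difference(test_amount, reference_amount):
    """Difference between test and reference, as a fraction of the reference."""
    if reference_amount <= 0:
        raise ValueError("reference amount must be positive")
    return (test_amount - reference_amount) / reference_amount


def is_no_greater_than(test_amount, reference_amount, tolerance=0.10):
    """Test amount is lower than, similar to, or at most 10% above the reference."""
    return relative_difference(test_amount, reference_amount) <= tolerance


def is_substantially_less_than(test_amount, reference_amount, margin=0.10):
    """Test amount is more than 10% below the reference."""
    return relative_difference(test_amount, reference_amount) < -margin


def is_same_as(test_amount, reference_amount, tolerance=0.05):
    """Test amount is within 5% of the reference, in either direction."""
    return abs(relative_difference(test_amount, reference_amount)) <= tolerance


# Made-up amounts in arbitrary units, for illustration only.
print(is_no_greater_than(108, 100))
print(is_substantially_less_than(85, 100))
print(is_same_as(103, 100))
```

  • Values that sit exactly on a threshold are sensitive to floating-point rounding, so a practical implementation would need to state how such boundary cases are handled.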
  • the reference data set contains biomarker information from a large number of individuals
  • the skilled person will be familiar with the different statistical methods that can be used to facilitate such an analysis, for example, statistical tests based on mean (Student's t-test and extensions), Bayesian and empirical Bayesian methods, nonparametric tests, analysis of variance (ANOVA and extensions), empirical Bayes/moderated t-tests and Partial Least Squares (PLS), logistic regression analysis, full or partial least square methods, cluster analysis, machine learning techniques or techniques to analyse “big data”.
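  • By way of illustration of a mean-based test of the kind listed above, the sketch below computes Welch's t statistic (an unequal-variance extension of Student's t-test) and its approximate degrees of freedom for one biomarker measured in two groups. It uses only the Python standard library and does not compute a p-value. The function name and the example amounts are hypothetical, and the choice of Welch's test is an assumption made for this example rather than a method prescribed by the text.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(sample_a, sample_b):
    """Return (t statistic, Welch-Satterthwaite degrees of freedom)."""
    n_a, n_b = len(sample_a), len(sample_b)
    if n_a < 2 or n_b < 2:
        raise ValueError("each group needs at least two measurements")
    # Squared standard error of each group mean (sample variance / n).
    se2_a = variance(sample_a) / n_a
    se2_b = variance(sample_b) / n_b
    if se2_a + se2_b == 0:
        raise ValueError("both groups have zero variance")
    t = (mean(sample_a) - mean(sample_b)) / sqrt(se2_a + se2_b)
    df = (se2_a + se2_b) ** 2 / (
        se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1)
    )
    return t, df


# Made-up normalised amounts of one biomarker, for illustration only.
group_with_cancer = [2.0, 4.0, 6.0]
group_without_cancer = [1.0, 2.0, 3.0]
t_stat, dof = welch_t(group_with_cancer, group_without_cancer)
print(round(t_stat, 4), round(dof, 3))
```

  • Converting the statistic to a p-value requires the t distribution, which a statistics package would normally supply.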
  • the methods of the present invention can also be utilised to monitor the success of a treatment for cancer. Because the biomarkers identified by the instant inventors are a reflection of the body's response to a tumour, successful treatment of tumour can also be monitored by measuring the amounts of the same biomarkers in the blood of the individual during and after treatment for the cancer. Accordingly, in yet a further embodiment, the present invention relates to a method of determining the likelihood of a treatment for a cancer in an individual being successful, including:
  • the one or more protein biomarkers is selected from the group consisting of: ADAM DEC1, cystatin-C, complement factor D, plasma C1 inhibitor, paraoxonase 1, hemopexin and inter-alpha-trypsin inhibitor heavy chain H2;
  • the amount of ADAM DEC1 and/or hemopexin in the post-treatment test sample is lower than the amount of the same protein biomarker in a pre-treatment reference sample of bodily fluid obtained from the individual before receiving the treatment for cancer;
  • the amount of cystatin-C, complement factor D, plasma C1 inhibitor, paraoxonase 1, and/or inter-alpha-trypsin inhibitor heavy chain H2 in the post-treatment test sample is greater than the amount of the same protein biomarker in the pre-treatment reference sample;
  • the amount of ADAM DEC1 and/or hemopexin in the post-treatment test sample is the same or greater than the amount of the same protein biomarker in the pre-treatment reference sample;
  • the amount of cystatin-C, complement factor D, plasma C1 inhibitor, paraoxonase 1, and/or inter-alpha-trypsin inhibitor heavy chain H2 in the post-treatment test sample is the same or lower than the amount of the same protein biomarker in the pre-treatment reference sample.
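  • The pre-treatment and post-treatment comparison can be sketched in the same way. The example below reads the first pair of conditions above as consistent with successful treatment and the second pair as not consistent with it; that reading is inferred from the direction of change reported for these biomarkers in cancer and is an assumption of this sketch. The function name, the dictionary layout and the example amounts are likewise hypothetical.

```python
# Expected direction of change after successful treatment (see assumption above).
EXPECTED_TO_FALL = {"ADAM DEC1", "hemopexin"}
EXPECTED_TO_RISE = {
    "cystatin-C", "complement factor D", "plasma C1 inhibitor",
    "paraoxonase 1", "inter-alpha-trypsin inhibitor heavy chain H2",
}


def treatment_response(pre_treatment, post_treatment):
    """Group biomarkers by whether their change is consistent with success.

    Both arguments map biomarker name -> amount in the same individual.
    Only biomarkers present in both samples and in one of the groups are used.
    """
    consistent, not_consistent = [], []
    for marker in sorted(pre_treatment.keys() & post_treatment.keys()):
        before, after = pre_treatment[marker], post_treatment[marker]
        if marker in EXPECTED_TO_FALL:
            as_expected = after < before
        elif marker in EXPECTED_TO_RISE:
            as_expected = after > before
        else:
            continue
        (consistent if as_expected else not_consistent).append(marker)
    return {"consistent": consistent, "not_consistent": not_consistent}


# Made-up amounts for illustration only.
pre = {"ADAM DEC1": 2.0, "cystatin-C": 0.5, "hemopexin": 1.5}
post = {"ADAM DEC1": 1.2, "cystatin-C": 0.9, "hemopexin": 1.5}
response = treatment_response(pre, post)
print(response)
```

  • An unchanged amount, such as hemopexin in this example, is grouped with the changes that are not consistent with success, matching the "the same or greater" and "the same or lower" wording of the conditions.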
  • the methods involve measuring the amount of any one of ADAM DEC1, complement factor D, cystatin-C or paraoxonase 1 in order to determine the likelihood of a successful cancer treatment.
  • the method involves measuring the amount of at least two of ADAM DEC1, complement factor D, cystatin-C or paraoxonase 1 in order to determine the likelihood of a successful cancer treatment.
  • the method involves measuring the amount of at least three of ADAM DEC1, complement factor D, cystatin-C or paraoxonase 1 in order to determine the likelihood of a successful cancer treatment.
  • the method involves measuring the amount of ADAM DEC1, cystatin-C and complement factor D for the purposes of determining the likelihood of the treatment for cancer being successful.
  • the method involves measuring all four of ADAM DEC1, complement factor D, cystatin-C and paraoxonase 1 in order to determine the likelihood of a successful cancer treatment.
  • Treatments which are typically received by an individual for treating cancer include chemotherapy, surgical resection, radiotherapy, immunotherapy, or a combination thereof.
  • the above method has particular utility to a clinician who is seeking to develop the most appropriate course of treatment for the individual suffering from cancer.
  • This method also provides a non-invasive means for assessing the success of a treatment plan, which provides a further advantage in that the individual who has received the treatment for cancer can be assessed without the need for further interventions or procedures which may place the individual at risk of infection or physical stress.
  • the present methods therefore enable the physician to quickly and accurately make an assessment of the success of previous or ongoing treatment. Having this information rapidly at hand (for example, without having to wait for the patient to be in condition for more invasive procedures) allows the physician to make more timely decisions in relation to the future course of therapy.
  • the treating physician may decide to cease the ongoing treatment in favour of an alternative treatment. For example, if the patient was receiving chemotherapy for the treatment of cancer, or other systemic treatment, the physician may decide, in view of the evidence suggesting success of this treatment plan, that going forward, radiotherapy (or other localised treatment) may be more appropriate. Alternatively, if the above methods indicate to the physician that treatment had not been successful, they may decide to pursue a more aggressive form of treatment. For example, in circumstances where a patient was receiving radiotherapy for the treatment of a cancer, and the physician determined this was not successful in treating the cancer, the physician may decide to perform surgery and/or chemotherapy or immunotherapy to more aggressively treat the cancer.
Selection of treatment plan
  • the methods of the present invention also facilitate the determination of an appropriate treatment plan for an individual suspected of having or at risk of cancer, including a colorectal cancer.
  • the present methods enable determination of whether or not to treat an individual for cancer, for example when the individual is suspected of having cancer as a result of another test which is less sensitive than the methods of the present invention and which suggests that the individual may have cancer (e.g., a positive FOBT test or a scan which indicates the presence of a mass in the individual).
  • the invention relates to a method for determining whether to treat an individual for cancer, the method including:
  • the one or more protein biomarkers is selected from the group consisting of: ADAM DEC1, cystatin-C and complement factor D;
  • the amount of ADAM DEC1 is greater than the amount of ADAM DEC1 in a reference data set in the form of data representative of one or more individuals who do not have cancer;
  • the amount of cystatin-C and/or complement factor D is lower than the amount of the same protein biomarker in the reference data set;
  • the amount of ADAM DEC1 is the same or lower than the amount of ADAM DEC1 in the reference data set;
  • the amount of cystatin-C and/or complement factor D is the same or greater than the amount of the same protein biomarker in the reference data set.
  • staging of large intestine neoplasms is an invasive procedure since it requires the harvesting of a tissue specimen which is histologically analysed.
  • the histological analysis of tissue specimens is both relatively slow and highly invasive.
  • the development of a means to reliably and routinely assess a patient to determine whether an identified neoplasm is premalignant (e.g., a polyp or adenoma), early stage (adenocarcinoma, pre-metastatic) or late stage (e.g., metastatic) is highly desirable if it can be performed quickly and repeatedly, since this would enable decisions in relation to treatment regimens to be made and implemented more accurately.
  • CEA carcinoembryonic antigen
  • the present inventors have found that by coupling known, less sensitive diagnostic tests with the methods of the present invention, it is possible to determine an appropriate therapeutic plan for the individual. For example, while the CEA blood test is unable to accurately detect the presence of early stage cancer, the method of the present invention is sufficiently sensitive to detect early stage cancers.
  • a negative CEA test result coupled with a positive result using the preferred method of the instant invention is indicative of early stages of colorectal cancer (for example, Dukes stages A/B colorectal cancer).
  • Positive test results using both CEA and the method of the present invention are indicative of late stages of colorectal cancer (Dukes stages C/D).
  • the present invention relates to a method of selecting a cancer treatment for an individual including: - providing an individual for whom a CEA blood test result has been obtained;
  • the one or more protein biomarkers is selected from the group consisting of: ADAM DEC1, cystatin-C and complement factor D;
  • the amount of cystatin-C and/or complement factor D is lower than the amount of the same protein biomarker in the reference data set;
  • the amount of ADAM DEC1 is greater than the amount of ADAM DEC1 in a reference data set in the form of data representative of one or more individuals who do not have cancer;
  • the amount of cystatin-C and/or complement factor D is lower than the amount of the same protein biomarker in the reference data set.
  • the term “individual” refers to any animal (e.g., a mammal), including, but not limited to, humans, non-human primates, rodents, and the like, which is to be the recipient of a particular treatment.
  • the terms “individual”, “subject” and “patient” are used interchangeably herein in reference to a human subject.
  • non-human animals refers to all non-human animals including, but not limited to, vertebrates such as rodents, non-human primates, ovines, bovines, ruminants, lagomorphs, porcines, caprines, equines, canines, felines, aves, etc.
  • kits for the diagnosis or detection of cancer, in particular colorectal cancer.
  • kits may be suitable for detection of nucleic acid species, or alternatively may be for detection of a protein, as discussed above.
  • antibodies will most typically be used as components of kits.
  • any agent capable of binding specifically to a biomarker gene product will be useful in this aspect of the invention.
  • Other components of the kits will typically include labels, secondary antibodies, substrates (if the protein is an enzyme), inhibitors, co-factors and control gene product preparations to allow the user to quantitate expression levels and/or to assess whether the diagnosis experiment has worked correctly.
  • Enzyme-linked immunosorbent assay-based (ELISA) tests and competitive ELISA tests are particularly suitable assays that can be carried out easily by the skilled person using kit components.
  • the kit further comprises means for the detection of the binding of an antibody to a biomarker polypeptide.
  • a reporter molecule such as, for example, an enzyme (such as horseradish peroxidase or alkaline phosphatase), a dye, a radionucleotide, a luminescent group, a fluorescent group, biotin or a colloidal particle, such as colloidal gold or selenium.
  • a reporter molecule is directly linked to the antibody.
  • a kit may additionally comprise a reference sample.
  • kits provided for in the present invention include kits which contain means for determining or measuring the amount of any one or more of the protein biomarkers listed herein (and as shown in Table 1).
  • a reference sample comprises a polypeptide that is detected by an antibody.
  • the polypeptide is of known concentration.
  • Such a polypeptide is of particular use as a standard. Accordingly, various known concentrations of such a polypeptide may be detected using a diagnostic assay described herein.
  • Example 1 determination of biomarkers
  • Clinically staged CRC Dukes' A, B, C and D
  • Malignant tumours were defined as tumours which had penetrated the muscularis propria including involvement of the lymph nodes as well as metastatic cancer (i.e., spread to other parts of the body). Malignant tumours correspond to Dukes Stages C and D.
  • the control or unaffected plasma samples were collected from 20 individuals who were age-matched to the clinical CRC plasma and had no apparent evidence of disease (i.e., with no evidence of inflammation or metastatic conditions, no previous history of tumor, cancer or major therapy).
  • Multiple Affinity Removal System™ (MARS-14) column (4.6 × 100 mm) (Agilent Technologies, Palo Alto, CA) was used to deplete 14 high abundance proteins (albumin, IgG, antitrypsin, IgA, transferrin, haptoglobin, fibrinogen, alpha2-macroglobulin, alpha1-acid glycoprotein, IgM, apolipoprotein AI, apolipoprotein AII, complement C3, and transthyretin). The depletion was performed at room temperature with an Agilent 1260 Infinity HPLC system according to the manufacturer's instructions.
  • plasma samples were diluted four-fold using the load/wash buffer supplied by the manufacturer and remaining particulates in the diluted plasma were removed by centrifugation through a 0.22-µm spin filter for 1 min at 16,000 × g.
  • the MARS-14 column was equilibrated with the load/wash buffer and the diluted plasma was loaded at a low flow rate (0.125 mL/min) for 18 min and then for an additional 2 min at a flow rate of 1 mL/min.
  • the other binding and elution steps were identical to those used for the MARS-14 column. Both depleted (flow-through fraction, FT) and abundant plasma proteins (bound fraction, BF) were collected and stored at -20 °C.
  • API column was pre-equilibrated at 5 mL/min using PBS and 0.1 M Glycine (pH 2.5).
  • the plasma sample was injected into the column at 0.1 mL/min with subsequent washing with 2.5 column volumes (CV) of PBS, first at 0.05 mL/min for 3 min and then at 5 mL/min.
  • the bound proteins were eluted with 4 CV of glycine buffer (0.1 M, pH 2.5) at 5 mL/min and the column re-equilibrated with 5 CV of binding buffer at a flow rate of 5 mL/min.
  • Proteins collected from the column (flow-through and API-bound) were then buffer exchanged using Amicon 3 kDa molecular weight filters (Millipore, MA) with 3 CV of PBS and stored at -80 °C.
  • the API depleted plasma samples were injected into the MARS-14 column following the same protocol described above.
  • the depleted LAPs (flow-through fraction) and abundant HAPs (bound fraction) were collected, buffer exchanged and stored at -80 °C.
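The coupling of a CEA blood test with the biomarker method, as described in the list above, can be sketched as a small lookup. This is only an illustrative sketch: the function name `suggested_stage` and the boolean inputs are hypothetical, and the text does not say how a negative biomarker result should be interpreted alongside CEA, so that case returns `None` rather than a guess.

```python
from typing import Optional


def suggested_stage(cea_positive: bool, panel_positive: bool) -> Optional[str]:
    """Map a CEA result and a biomarker-method result to the stage the text associates with them.

    Only the two combinations described in the text are covered.
    """
    if panel_positive and not cea_positive:
        # Negative CEA + positive biomarker method: early stages (Dukes A/B).
        return "early stage (Dukes A/B)"
    if panel_positive and cea_positive:
        # Both tests positive: late stages (Dukes C/D).
        return "late stage (Dukes C/D)"
    # Combinations with a negative biomarker result are not described in the text.
    return None


if __name__ == "__main__":
    print(suggested_stage(cea_positive=False, panel_positive=True))
    print(suggested_stage(cea_positive=True, panel_positive=True))
```

Returning `None` for the undescribed combinations keeps the sketch from asserting anything beyond what the text states.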

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Immunology (AREA)
  • Engineering & Computer Science (AREA)
  • Urology & Nephrology (AREA)
  • Chemical & Material Sciences (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Hematology (AREA)
  • Cell Biology (AREA)
  • Medicinal Chemistry (AREA)
  • Analytical Chemistry (AREA)
  • Microbiology (AREA)
  • Oncology (AREA)
  • Food Science & Technology (AREA)
  • Biotechnology (AREA)
  • Physics & Mathematics (AREA)
  • Hospice & Palliative Care (AREA)
  • Biochemistry (AREA)
  • General Health & Medical Sciences (AREA)
  • General Physics & Mathematics (AREA)
  • Pathology (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention relates to methods and compositions for screening for cancer in an individual using a novel panel of one or more biomarkers.

Description

Screening methods
Related priority application
This application claims priority from Australian provisional application AU 2016902484, the entire contents of which are hereby incorporated in their entirety.
Field of the invention
The invention relates to methods for screening for cancer in an individual, in particular, methods for screening for colorectal cancer.
Background of the invention
Reference to any prior art in the specification is not an acknowledgment or suggestion that this prior art forms part of the common general knowledge in any jurisdiction or that this prior art could reasonably be expected to be understood, regarded as relevant, and/or combined with other pieces of prior art by a skilled person in the art.
Colorectal cancer (CRC), also referred to as colon or bowel cancer, is the third most commonly diagnosed cancer worldwide with over 694,000 deaths (8.68% of all cancer deaths) in 2012 [1]. CRC develops in a progressive manner, typically beginning with the formation of abnormal tissue growths projecting from the mucous membrane (polyps) in the colon/rectum, which then progress into adenomas and finally, metastatic disease. Many polyps do not produce overt clinical symptoms and consequently are often not detected. Left untreated, polyps may develop into adenomatous polyps (benign neoplasia) which have a high risk of subsequently developing into adenocarcinoma. The absence of substantial symptoms associated with the formation of these neoplastic growths is such that up to 50% of CRC patients have occult (asymptomatic) metastases upon presentation to a clinic or medical professional. When symptoms of colorectal cancer are evident, these symptoms are nonspecific. For example, symptoms may include a change in bowel habit, a feeling of incomplete defecation, reduction in diameter of stools or visible evidence of blood in the stools. Given that these symptoms often occur in other diseases (or may not be the result of disease at all), observation of symptoms is unlikely to be conclusively diagnostic of colorectal cancer.
Current methods for CRC screening include the use of faecal occult blood tests (FOBT), flexible sigmoidoscopy and colonoscopy. Colonoscopy is the current gold standard for detecting CRC and has a specificity of greater than 90% for detecting CRC. However, colonoscopy is intrusive and costly with a small but finite risk of complications (2.1 per 1000 procedures) (Levin, 2004). While less invasive, FOBT has relatively low specificity resulting in a high rate of false positives. All positive FOBT are therefore typically followed up with colonoscopy. FOBT also lacks sensitivity for early stage cancerous lesions that do not bleed into the bowel and as stated above, these are the lesions for which treatment is most successful.
Sampling for FOBT is done by individuals at home and requires at least two consecutive faecal samples to be analysed to achieve optimal sensitivity. Thus, while FOBT screening does result in reduction of mortality from CRC, there has been a very low level of community uptake for this test (30-40%), most likely due to the unpleasant nature of the test, which limits its usefulness as a screening tool. Consequently, CRC screening rates remain low.
A number of studies have also confirmed that CRC survival rates depend heavily on the stage of disease at the time of diagnosis. For instance, early-stage CRC has a high 5-year survival rate (>90%) following simple surgical resection, while late-stage CRC has a dramatically lower 5-year survival rate (<10%) [2]. Thus, early detection of colorectal lesions would significantly reduce the burden of this disease and allow for early intervention. Although extensive research has produced numerous screening strategies, early detection of CRC remains difficult. There is a need for the development of rapid, non-invasive and specific screening tools for detecting CRC. There is also a need for the development of screening tools which facilitate an early diagnosis of CRC.
Summary of the invention
The present invention provides a method of determining the likelihood of an individual having a cancer including:
- providing a test sample of bodily fluid from an individual for whom diagnosis of cancer is required;
- measuring or determining the amount of one or more protein biomarkers in the test sample, wherein at least one of the biomarkers is ADAM DEC1,
- determining that the individual has a high likelihood of having cancer when the amount of ADAM DEC1 in the test sample is greater than the amount of ADAM DEC1 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of ADAM DEC1 in the test sample is the same or lower than the amount of ADAM DEC1 in the reference data set.
The present invention also provides a method of determining the likelihood of an individual having a cancer including:
- providing a test sample of bodily fluid from an individual for whom diagnosis of cancer is required;
- measuring or determining the amount of one or more protein biomarkers in the test sample, wherein at least one of the biomarkers is cystatin-C,
- determining that the individual has a high likelihood of having cancer when the amount of cystatin-C in the test sample is lower than the amount of cystatin-C in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of cystatin-C in the test sample is the same or higher than the amount of cystatin-C in the reference data set.
The present invention also provides a method of determining the likelihood of an individual having a cancer including:
- providing a test sample of bodily fluid from an individual for whom diagnosis of cancer is required;
- measuring or determining the amount of one or more protein biomarkers in the test sample, wherein at least one of the biomarkers is complement factor D,
- determining that the individual has a high likelihood of having cancer when the amount of complement factor D in the test sample is lower than the amount of complement factor D in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of complement factor D in the test sample is the same or higher than the amount of complement factor D in the reference data set.
In a further embodiment, the present invention relates to a method of determining the likelihood of an individual having a cancer including:
- providing a test sample of bodily fluid from an individual for whom diagnosis of cancer is required;
- measuring or determining the amount of one or more protein biomarkers in the test sample, wherein the one or more protein biomarkers is selected from the group consisting of: ADAM DEC1, cystatin-C and complement factor D;
- determining that the individual has a high likelihood of having cancer when:
a) the amount of ADAM DEC1 in the test sample is greater than the amount of ADAM DEC1 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
b) the amount of cystatin-C and/or complement factor D in the test sample is lower than the amount of the same protein biomarker in the reference data set;
- determining that the individual has a low likelihood of having cancer when:
c) the amount of ADAM DEC1 in the test sample is the same or lower than the amount of ADAM DEC1 in the reference data set;
d) the amount of cystatin-C and/or complement factor D in the test sample is the same or greater than the amount of the same protein biomarker in the reference data set.
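Conditions a) to d) above amount to a direction-of-change comparison against the reference data set. The following is a minimal sketch of that comparison, not the method as practised: it assumes the reference data set has been reduced to a single representative amount per biomarker, and it assumes that a "high likelihood" call is made when any measured biomarker meets condition a) or b) (one reading of the "and/or" wording). The names `DIRECTIONS`, `marker_indicates_cancer` and `likelihood` are hypothetical.

```python
# Direction of change stated in the text for each of the three biomarkers.
DIRECTIONS = {
    "ADAM DEC1": "higher",            # greater than reference -> high likelihood
    "cystatin-C": "lower",            # lower than reference -> high likelihood
    "complement factor D": "lower",   # lower than reference -> high likelihood
}


def marker_indicates_cancer(name: str, test_amount: float, reference_amount: float) -> bool:
    """Return True when a single biomarker meets condition a) or b)."""
    if DIRECTIONS[name] == "higher":
        return test_amount > reference_amount
    return test_amount < reference_amount


def likelihood(test: dict, reference: dict) -> str:
    """Return 'high' if any measured biomarker meets a) or b), otherwise 'low'.

    Assumption: `reference` holds one representative amount per biomarker
    (for example, a summary of individuals who do not have cancer).
    """
    if not test:
        raise ValueError("at least one biomarker must be measured")
    flags = [marker_indicates_cancer(name, amount, reference[name])
             for name, amount in test.items()]
    return "high" if any(flags) else "low"


if __name__ == "__main__":
    reference = {"ADAM DEC1": 1.0, "cystatin-C": 1.0, "complement factor D": 1.0}
    print(likelihood({"ADAM DEC1": 1.4, "cystatin-C": 1.0}, reference))
```

Under this reading, a "low" call requires every measured biomarker to satisfy c) or d); a real assay would also need a tolerance for deciding when two amounts count as "the same", which the sketch leaves out.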
In any embodiment of the present invention, one, two or three of ADAM DEC1, cystatin-C and complement factor D can be measured in the sample from the individual. For example, the method may involve measuring ADAM DEC1 and cystatin-C, or ADAM DEC1 and complement factor D, or cystatin-C and complement factor D, or all three of ADAM DEC1, cystatin-C and complement factor D. It will be understood that the measurement or determination of the levels or amounts of two or more protein biomarkers may be conducted simultaneously or in separate experiments.
In any embodiment of the present invention, in addition to measuring the amount of one or more of ADAM DEC1, cystatin-C and complement factor D, the method further includes:
- measuring or determining the amount of one or more additional protein biomarkers in the test sample, wherein the one or more additional biomarkers is selected from the group consisting of:
(a) plasma protease C1 inhibitor,
(b) paraoxonase 1,
(c) hemopexin,
(d) inter-alpha-trypsin inhibitor heavy chain H2,
(e) prothrombin,
(d) hepatocyte growth factor activator,
(f) serum amyloid P component,
(g) apolipoprotein A-IV,
(h) apolipoprotein B-100,
(i) carboxypeptidase Q,
(j) glutathione peroxidase 3,
(k) mannan-binding lectin serine protease 2,
(l) mannose receptor C type I,
(m) protocadherin gamma-A8,
(n) profilin-1,
(o) S100 calcium binding protein 8,
(p) serum amyloid protein A1,
(q) serum amyloid protein A2, and
(r) superoxide dismutase 3
- determining that the individual has a high likelihood of having cancer when:
a) the amount of hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, and/or serum amyloid protein A2 in the test sample is greater than the amount of the same protein in the reference data set; and/or
b) the amount of plasma protease C1 inhibitor, paraoxonase 1, inter-alpha trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3 in the test sample is lower than the amount of the same protein biomarker in the reference data set;
- determining that the individual has a low likelihood of having cancer when:
c) the amount of hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, and/or serum amyloid protein A2 in the test sample is the same or lower than the amount of the same protein in the reference data set;
d) the amount of plasma protease C1 inhibitor, paraoxonase 1, inter-alpha trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3 in the test sample is the same or greater than the amount of the same protein biomarker in the reference data set.
The present invention also provides methods where diagnosis is based on assessment of any one of the markers disclosed herein, and in those circumstances any biomarker disclosed herein can be determinative of a diagnosis for colorectal cancer. Further, any biomarker disclosed herein, in combination with one or more of the other markers, may be useful for determining the likelihood of an individual having cancer.
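The direction of change stated above for each biomarker in the full panel can be collected into two sets, which makes a single-marker assessment a simple lookup. The sketch below is illustrative only: the set and function names are hypothetical, and the reference data set is assumed to be summarised as one amount per biomarker.

```python
# Biomarkers for which a GREATER amount than the reference indicates a high likelihood.
HIGHER_IN_CANCER = {
    "ADAM DEC1", "hemopexin", "serum amyloid P component", "apolipoprotein B-100",
    "glutathione peroxidase 3", "protocadherin gamma-A8",
    "S100 calcium binding protein 8", "serum amyloid protein A1",
    "serum amyloid protein A2",
}

# Biomarkers for which a LOWER amount than the reference indicates a high likelihood.
LOWER_IN_CANCER = {
    "cystatin-C", "complement factor D", "plasma protease C1 inhibitor",
    "paraoxonase 1", "inter-alpha trypsin inhibitor heavy chain H2", "prothrombin",
    "hepatocyte growth factor activator", "apolipoprotein A-IV",
    "carboxypeptidase Q", "mannan-binding lectin serine protease 2",
    "mannose receptor C type I", "profilin-1", "superoxide dismutase 3",
}


def single_marker_call(name: str, test_amount: float, reference_amount: float) -> str:
    """Return 'high' or 'low' likelihood based on one biomarker alone."""
    if name in HIGHER_IN_CANCER:
        return "high" if test_amount > reference_amount else "low"
    if name in LOWER_IN_CANCER:
        return "high" if test_amount < reference_amount else "low"
    raise KeyError(f"unknown biomarker: {name}")


if __name__ == "__main__":
    print(single_marker_call("hemopexin", 1.2, 1.0))
    print(single_marker_call("prothrombin", 1.0, 1.0))
```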
For example, the present invention includes:
a) - determining the amount of ADAM DEC1 in a test sample of bodily fluid from an individual,
- determining that the individual has a high likelihood of having cancer when the amount of ADAM DEC1 in the test sample is greater than the amount of ADAM DEC1 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of ADAM DEC1 in the test sample is the same or lower than the amount of ADAM DEC1 in the reference data set;
b) - determining the amount of serum amyloid P component in a test sample of bodily fluid from an individual,
- determining that the individual has a high likelihood of having cancer when the amount of serum amyloid P component in the test sample is greater than the amount of serum amyloid P component in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of serum amyloid P component in the test sample is the same or lower than the amount of serum amyloid P component in the reference data set;
c) - determining the amount of apolipoprotein B-100 in a test sample of bodily fluid from an individual,
- determining that the individual has a high likelihood of having cancer when the amount of apolipoprotein B-100 in the test sample is greater than the amount of apolipoprotein B-100 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of apolipoprotein B-100 in the test sample is the same or lower than the amount of apolipoprotein B-100 in the reference data set;
d) - determining the amount of glutathione peroxidase 3 in a test sample of bodily fluid from an individual,
- determining that the individual has a high likelihood of having cancer when the amount of glutathione peroxidase 3 in the test sample is greater than the amount of glutathione peroxidase 3 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of glutathione peroxidase 3 in the test sample is the same or lower than the amount of glutathione peroxidase 3 in the reference data set;
e) - determining the amount of protocadherin gamma-A8 in a test sample of bodily fluid from an individual,
- determining that the individual has a high likelihood of having cancer when the amount of protocadherin gamma-A8 in the test sample is greater than the amount of protocadherin gamma-A8 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of protocadherin gamma-A8 in the test sample is the same or lower than the amount of protocadherin gamma-A8 in the reference data set;
f) - determining the amount of S100 calcium binding protein 8 in a test sample of bodily fluid from an individual,
- determining that the individual has a high likelihood of having cancer when the amount of S100 calcium binding protein 8 in the test sample is greater than the amount of S100 calcium binding protein 8 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of S100 calcium binding protein 8 in the test sample is the same or lower than the amount of S100 calcium binding protein 8 in the reference data set;
g) - determining the amount of serum amyloid protein A1 in a test sample of bodily fluid from an individual,
- determining that the individual has a high likelihood of having cancer when the amount of serum amyloid protein A1 in the test sample is greater than the amount of serum amyloid protein A1 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of serum amyloid protein A1 in the test sample is the same or lower than the amount of serum amyloid protein A1 in the reference data set;
h) - determining the amount of serum amyloid protein A2 in a test sample of bodily fluid from an individual,
- determining that the individual has a high likelihood of having cancer when the amount of serum amyloid protein A2 in the test sample is greater than the amount of serum amyloid protein A2 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of serum amyloid protein A2 in the test sample is the same or lower than the amount of serum amyloid protein A2 in the reference data set;
i) - determining the amount of hemopexin in a test sample of bodily fluid from an individual,
- determining that the individual has a high likelihood of having cancer when the amount of hemopexin in the test sample is greater than the amount of hemopexin in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of hemopexin in the test sample is the same or lower than the amount of hemopexin in the reference data set;
j) - determining the amount of cystatin-C in a test sample of bodily fluid from an individual,
- determining that the individual has a high likelihood of having cancer when the amount of cystatin-C in the test sample is lower than the amount of cystatin-C in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of cystatin-C in the test sample is the same or higher than the amount of cystatin-C in the reference data set;
k) - determining the amount of complement factor D in a test sample of bodily fluid from an individual,
- determining that the individual has a high likelihood of having cancer when the amount of complement factor D in the test sample is lower than the amount of complement factor D in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of complement factor D in the test sample is the same or higher than the amount of complement factor D in the reference data set;
l) - determining the amount of plasma protease C1 inhibitor in a test sample of bodily fluid from an individual,
- determining that the individual has a high likelihood of having cancer when the amount of plasma protease C1 inhibitor in the test sample is lower than the amount of plasma protease C1 inhibitor in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of plasma protease C1 inhibitor in the test sample is the same or higher than the amount of plasma protease C1 inhibitor in the reference data set;
m) - determining the amount of paraoxonase 1 in a test sample of bodily fluid from an individual,
- determining that the individual has a high likelihood of having cancer when the amount of paraoxonase 1 in the test sample is lower than the amount of paraoxonase 1 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of paraoxonase 1 in the test sample is the same or higher than the amount of paraoxonase 1 in the reference data set;
n) - determining the amount of inter-alpha trypsin inhibitor heavy chain H2 in a test sample of bodily fluid from an individual,
- determining that the individual has a high likelihood of having cancer when the amount of inter-alpha trypsin inhibitor heavy chain H2 in the test sample is lower than the amount of inter-alpha trypsin inhibitor heavy chain H2 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of inter-alpha trypsin inhibitor heavy chain H2 in the test sample is the same or higher than the amount of inter-alpha trypsin inhibitor heavy chain H2 in the reference data set;
o) - determining the amount of prothrombin in a test sample of bodily fluid from an individual,
- determining that the individual has a high likelihood of having cancer when the amount of prothrombin in the test sample is lower than the amount of prothrombin in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of prothrombin in the test sample is the same or higher than the amount of prothrombin in the reference data set;
p) - determining the amount of hepatocyte growth factor activator in a test sample of bodily fluid from an individual,
- determining that the individual has a high likelihood of having cancer when the amount of hepatocyte growth factor activator in the test sample is lower than the amount of hepatocyte growth factor activator in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of hepatocyte growth factor activator in the test sample is the same or higher than the amount of hepatocyte growth factor activator in the reference data set;
q) - determining the amount of apolipoprotein A-IV in a test sample of bodily fluid from an individual,
- determining that the individual has a high likelihood of having cancer when the amount of apolipoprotein A-IV in the test sample is lower than the amount of apolipoprotein A-IV in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of apolipoprotein A-IV in the test sample is the same or higher than the amount of apolipoprotein A-IV, in the reference data set; r) - determining the amount of carboxypeptidase Q in a test sample of bodily fluid from an individual,
- determining that the individual has a high likelihood of having cancer when the amount of carboxypeptidase Q in the test sample is lower than the amount of carboxypeptidase Q in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of carboxypeptidase Q in the test sample is the same or higher than the amount of carboxypeptidase Q in the reference data set; s) - determining the amount of mannan-binding lectin serine protease 2 (MASP2) in a test sample of bodily fluid from an individual,
- determining that the individual has a high likelihood of having cancer when the amount of MASP2 in the test sample is lower than the amount of MASP2 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of MASP2 in the test sample is the same or higher than the amount of MASP2 in the reference data set; t) - determining the amount of mannose receptor C type I (MRC1 ) in a test sample of bodily fluid from an individual,
- determining that the individual has a high likelihood of having cancer when the amount of MRC1 in the test sample is lower than the amount of MRC1 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of MRC1 in the test sample is the same or higher than the amount of MRCl in the reference data set; u) - determining the amount of profilin-1 in a test sample of bodily fluid from an individual,
- determining that the individual has a high likelihood of having cancer when the amount of profilin-1 the test sample is lower than the amount of profilin-1 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of profilin-1 the test sample is the same or higher than the amount of profilin-1 in the reference data set; and/or v) - determining the amount of superoxide dismutase 3 (SOD3) in a test sample of bodily fluid from an individual,
- determining that the individual has a high likelihood of having cancer when the amount of SOD3 in the test sample is lower than the amount of SOD3 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
- determining that the individual has a low likelihood of having cancer when the amount of SOD3 in the test sample is the same or higher than the amount of SOD3 the reference data set.
In any embodiment of the invention, the cancer is colorectal cancer. The present invention also relates to a method of determining the likelihood of successful treatment of a cancer in an individual. Any of the biomarkers listed herein may be used to determine whether an individual has received a successful treatment for cancer.
For example, the present invention provides a method including:
- providing a post-treatment test sample of bodily fluid from an individual who has received a treatment for a cancer;
- measuring the amount of one or more protein biomarkers in the post-treatment test sample,
wherein the one or more protein biomarkers is selected from the group consisting of: ADAM DEC1, cystatin-C and complement factor D;
- determining that there is a high likelihood that the treatment was successful when:
a) the amount of ADAM DEC1 in the post-treatment test sample is lower than the amount of the same protein biomarker in a pre-treatment reference sample obtained from the individual before receiving the treatment for cancer;
b) the amount of cystatin-C and/or complement factor D in the post-treatment test sample is greater than the amount of the same protein biomarker in the pre-treatment reference sample;
- determining that there is a low likelihood that the treatment was successful when:
c) the amount of ADAM DEC1 in the post-treatment test sample is the same or greater than the amount of the same protein biomarker in the pre-treatment reference sample;
d) the amount of cystatin-C and/or complement factor D in the post-treatment test sample is the same or lower than the amount of the same protein biomarker in the pre-treatment reference sample.
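By way of illustration only, the pre-treatment and post-treatment comparison set out in items a) to d) above can be sketched in Python as follows. The function name, the dictionary layout and the choice to report a separate call for each biomarker are assumptions made for this sketch; the specification does not prescribe how the individual comparisons are to be combined.

```python
# Illustrative sketch only: names and data layout are hypothetical.
# Direction in which each biomarker is expected to move after a successful
# treatment, taken from items a) and b) above.
EXPECTED_CHANGE = {
    "ADAM DEC1": "decrease",
    "cystatin-C": "increase",
    "complement factor D": "increase",
}


def treatment_response(pre, post):
    """Return a per-biomarker call of 'high' or 'low' likelihood of success.

    pre and post map a biomarker name to the amount measured in the
    pre-treatment reference sample and the post-treatment test sample.
    """
    calls = {}
    for marker, direction in EXPECTED_CHANGE.items():
        if marker not in pre or marker not in post:
            continue  # biomarker not measured in both samples
        if direction == "decrease":
            favourable = post[marker] < pre[marker]
        else:
            favourable = post[marker] > pre[marker]
        # An unchanged amount falls under items c) and d), i.e. 'low'.
        calls[marker] = "high" if favourable else "low"
    return calls
```

An unchanged amount is treated as indicating a low likelihood of success, consistent with the "same or greater" and "same or lower" wording of items c) and d).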
The invention may further include measuring the amount of one or more additional biomarkers to determine the likelihood of successful treatment of a cancer in an individual. For example, further to measuring or determining the amounts of ADAM DEC1, cystatin-C and/or complement factor D, a post-treatment test sample from an individual who has received a treatment for cancer may also be analysed for the purpose of measuring or determining the levels of any one or more additional protein biomarkers selected from the group consisting of:
hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, serum amyloid protein A2, plasma C1 inhibitor, paraoxonase 1, inter-alpha-trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3; and
- determining that there is a high likelihood that the treatment for cancer was successful when:
a) the amount of hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, and/or serum amyloid protein A2 in the post-treatment test sample is lower than the amount of the same protein biomarker in a pre-treatment reference sample obtained from the individual before receiving the treatment for cancer;
b) the amount of plasma C1 protease inhibitor, paraoxonase 1, inter-alpha trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3 in the post-treatment test sample is greater than the amount of the same protein biomarker in the pre-treatment reference sample;
- determining that there is a low likelihood that the treatment for cancer was successful when:
c) the amount of hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, and/or serum amyloid protein A2 in the post-treatment test sample is the same or greater than the amount of the same protein biomarker in the pre-treatment reference sample;
d) the amount of plasma C1 protease inhibitor, paraoxonase 1, inter-alpha trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3 in the post-treatment test sample is the same or lower than the amount of the same protein biomarker in the pre-treatment reference sample.
Still further, the present invention provides a method of determining whether to treat an individual for cancer, preferably colorectal cancer, including:
- providing a test sample of bodily fluid from an individual;
- measuring the amount of one or more protein biomarkers in the test sample, wherein the one or more protein biomarkers is selected from the group consisting of:
ADAM DEC1 , cystatin-C and complement factor D;
- determining to treat the individual for cancer when:
a) the amount of ADAM DEC1 in the test sample is greater than the amount of ADAM DEC1 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
b) the amount of cystatin-C and/or complement factor D in the test sample is lower than the amount of the same protein biomarker in the reference data set;
- determining not to treat the individual for cancer when:
c) the amount of ADAM DEC1 in the test sample is the same or lower than the amount of ADAM DEC1 in the reference data set;
d) the amount of cystatin-C and/or complement factor D in the test sample is the same or greater than the amount of the same protein biomarker in the reference data set.
In determining whether or not to treat the individual for cancer, including colon cancer, the present invention also contemplates determining the amounts of one or more additional biomarkers in a test sample from the individual. For example, the present invention includes:
- providing a test sample of bodily fluid from an individual;
- measuring the amount of one or more protein biomarkers in the test sample, wherein the one or more protein biomarkers is selected from the group consisting of ADAM DEC1, cystatin-C, complement factor D, hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, serum amyloid protein A2, plasma C1 protease inhibitor, paraoxonase 1, inter-alpha trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3;
- determining to treat the individual for cancer when:
a) the amount of ADAM DEC1, hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, and/or serum amyloid protein A2 in the test sample is greater than the amount of the same protein in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
b) the amount of cystatin-C, complement factor D, plasma C1 protease inhibitor, paraoxonase 1, inter-alpha trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3 in the test sample is lower than the amount of the same protein biomarker in the reference data set;
- determining not to treat the individual for cancer when:
c) the amount of ADAM DEC1, hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, and/or serum amyloid protein A2 in the test sample is the same or lower than the amount of the same protein in the reference data set;
d) the amount of cystatin-C, complement factor D, plasma protease C1 inhibitor, paraoxonase 1, inter-alpha trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3 in the test sample is the same or greater than the amount of the same protein biomarker in the reference data set.
In any method described herein, the protein biomarkers are useful for determining the likelihood that an individual has colorectal cancer or the likelihood that an individual has received a successful treatment for cancer.
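By way of illustration only, the comparison of a test sample against the reference data set for the 22 biomarkers listed above can be sketched in Python as follows. The specification does not state how the reference data set is to be summarised; this sketch assumes, purely for illustration, that the arithmetic mean of the amounts measured in individuals who do not have cancer is used. The function name, the dictionary layout and the per-biomarker boolean output are likewise hypothetical.

```python
# Illustrative sketch only: names, data layout and the use of the mean as the
# reference value are assumptions, not features taken from the specification.
from statistics import mean

# Biomarkers whose amount is greater in cancer than in the reference data set.
ELEVATED_IN_CANCER = {
    "ADAM DEC1", "hemopexin", "serum amyloid P component",
    "apolipoprotein B-100", "glutathione peroxidase 3",
    "protocadherin gamma-A8", "S100 calcium binding protein 8",
    "serum amyloid protein A1", "serum amyloid protein A2",
}

# Biomarkers whose amount is lower in cancer than in the reference data set.
REDUCED_IN_CANCER = {
    "cystatin-C", "complement factor D", "plasma protease C1 inhibitor",
    "paraoxonase 1", "inter-alpha trypsin inhibitor heavy chain H2",
    "prothrombin", "hepatocyte growth factor activator",
    "apolipoprotein A-IV", "carboxypeptidase Q",
    "mannan-binding lectin serine protease 2",
    "mannose receptor C type I", "profilin-1", "superoxide dismutase 3",
}


def compare_to_reference(sample, reference):
    """Return, per biomarker, True when the cancer-associated criterion is met.

    sample maps a biomarker name to the amount in the test sample.
    reference maps a biomarker name to a list of amounts measured in
    individuals who do not have cancer.
    """
    calls = {}
    for marker, amount in sample.items():
        ref_value = mean(reference[marker])  # assumed summary statistic
        if marker in ELEVATED_IN_CANCER:
            calls[marker] = amount > ref_value
        elif marker in REDUCED_IN_CANCER:
            calls[marker] = amount < ref_value
        else:
            raise KeyError(f"unknown biomarker: {marker}")
    return calls
```

An amount equal to the reference value returns False for either group of biomarkers, consistent with the "same or lower" and "same or greater" wording of items c) and d).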
As used herein, the reference to "measuring" includes active and passive methods. For example, the skilled person, in performing the invention, may actively analyse a sample of bodily fluid obtained from an individual and subject the sample to proteomic, ELISA or any other assay for determining the presence or amount of any protein biomarker described herein. Alternatively, the skilled person may measure the amount of the protein biomarker passively, by referring to a database of clinical results and determining the amount of the relevant protein biomarker in a sample from the individual.
The present invention also includes a biomarker panel comprising or consisting of any one or more of the 22 biomarkers listed herein in Table 1. In any embodiment of the present invention, the panel includes at least the protein biomarker ADAM DEC1, the protein cystatin-C or the protein complement factor D. In a further embodiment, the panel includes at least the proteins ADAM DEC1 and cystatin-C; at least the proteins ADAM DEC1 and complement factor D; or at least the proteins cystatin-C and complement factor D. The panel may comprise all 3 of ADAM DEC1, cystatin-C and complement factor D. Further, the panel may comprise any one, two, three, four, five, six, seven, eight, nine, 10, 11, 12, 13, 14, 15, 16, 17, 18 or 19 of the remaining protein biomarkers listed in Table 1.
The biomarker panel as described herein may be used for determining whether an individual has or is at risk of cancer, or whether the individual has or has not received a successful treatment for cancer.
As used herein, a sample of bodily fluid from the individual can be any fluid that contains extracellular components. Preferably the fluid is plasma or serum.
As used herein, except where the context requires otherwise, the term "comprise" and variations of the term, such as "comprising", "comprises" and "comprised", are not intended to exclude further additives, components, integers or steps.
Further aspects of the present invention and further embodiments of the aspects described in the preceding paragraphs will become apparent from the following description, given by way of example and with reference to the accompanying drawings.
Brief description of the drawings
Figure 1: 1D gel electrophoresis of (A) crude plasma from unaffected (E) and CRC patients (stage A-D) and (B) low abundance (FT) protein enriched fractions from unaffected (E) and CRC (A-D) patient plasma after API depletion. The analysis was conducted using Novex Tris Tricine 16% gels, and Precision Plus Protein™ Unstained Standards were used as molecular marker (MM), with 4 μg protein loaded into each well.
Figure 2: 1D gel electrophoresis of ultradepleted CRC plasma samples. The gel electrophoresis was conducted using nUView™ Tris-HEPES Precast Gels (8-16% NuSep® protein gels), with 2.5 μg protein loaded into each well.
Figure 3: Number of proteins identified by MARS, API and API+MARS depletion strategy.
Figure 4: Differentially expressed proteins identified from MARS, API, and API+MARS depletions based on (A) Anova analysis (BH-corrected p-value for multiple testing, MaxFC > 1.5) and (B) peptide-level analyses.
Figure 5: The protein fold changes between any two groups (benign/healthy, malignant/healthy and malignant/benign) for MARS Depletion (done in duplicate), API depletion (done in duplicate) and API+MARS depletion (done in triplicate) of plasma samples. Each plot represents the log fold change of that comparison across the x-axis, the Log of P-value of the comparison (e.g. un-corrected 2-sample t-test p-value) across the y-axis.
Figure 6: Panels (P) (A, B, C, D, E and F) representing protein expression data in Dukes' CRC stage A, B, C & D with E controls; pooled group (defined healthy, benign, malignant groups). Values were determined by SWATH™-MS analysis using ultradepleted (API+MARS) pooled CRC patient plasmas and based on Anova Analysis. Values for the Y axis data were calculated as follows: Y = log_e(Normalized peak area), where the normalized peak area is calculated by the method explained in Wu et al., 2016.
Figure 7: Stage specific abundance patterns for 19 differentially expressed proteins from Table 4. Y axis values were calculated as for Figure 6.
Figure 8: Group specific abundance patterns for 19 differentially expressed proteins from Table 4. Y axis values were calculated as for Figure 6.
Figure 9: 1D gel analysis of the human plasma fractions acquired from the API column. Approximately 4 μg of protein per lane was separated on a Novex Tris Tricine 16% gel using Precision Plus Protein™ Unstained Standards as molecular marker (MM), with staining using Flamingo Pink stain. Lanes 2, 4 and 6 contain the high abundance (BF) and lanes 1, 3 and 5 contain the low abundance (FT) enriched plasma fractions from API depletion.
Figure 10: Depletion chromatograms from API depletion when crude CRC stage A-D and unaffected control group (E) plasma (100 μl containing plasma protein approx. 3-4 mg) were injected into the column. The bound (BF) and flow through (FT) fractions indicate the high and low abundance protein enriched fractions.
Figure 11: Depletion chromatograms from MARS depletion when API depleted CRC stage A-D and unaffected control group (E) plasma (40 μl containing plasma protein approx. 0.7-1.6 mg) were injected into the column. The bound (BF) and flow through (FT) fractions indicate the high and low abundance protein enriched fractions.
Figure 12: Depletion chromatograms from MARS depletion when crude CRC stage A-D and unaffected control group (E) plasma (40 μl containing plasma protein approx. 1.6-2.1 mg) were injected into the column. The bound (BF) and flow through (FT) fractions indicate the high and low abundance protein enriched fractions.
Figure 13: Number of unique plasma proteins identified from 4 different peptide fractionation methods (HpH: High pH C18 reversed phase, SEC: Size exclusion chromatography, SAX: Strong anion exchange and SCX: Strong cation exchange).
Figure 14: Differential expression of proteins in samples from individuals with Dukes' CRC stages A, B, C and D compared with Healthy controls. Values were determined by SWATH™-MS analysis after ultradepletion (depletion on MARS14 column followed by two rounds of API depletion) of plasma samples. Values for the Y axis data were calculated as follows: Y = log_e(Normalized peak area), where the normalized peak area is calculated by the method explained in Wu et al., 2016.
Figure 15: Differential expression of proteins in samples from individuals with Dukes' CRC stages A, B, C and D compared with Healthy controls. Values were determined by SWATH™-MS analysis of plasma samples following MARS-14 depletion of pooled CRC patient samples and based on Anova Analysis. Values for the Y axis data were calculated as follows: Y = log_e(Normalized peak area), where the normalized peak area is calculated by the method explained in Wu et al., 2016.
Figure 16: Differential expression of proteins in samples from individuals with Dukes' CRC stages A, B, C and D compared with Healthy controls. Values were determined by SWATH™-MS analysis of neat pooled CRC patient plasmas and based on Anova Analysis. Values for the Y axis data were calculated as follows: Y = log_e(Normalized peak area), where the normalized peak area is calculated by the method explained in Wu et al., 2016.
Detailed description of the embodiments
Many cancers are asymptomatic in the early stages of malignancy and it is often not until a patient finally presents at a clinic reporting symptoms that the cancer is detected. This is particularly the case with colorectal cancer (CRC), where up to 50% of CRC patients have occult (asymptomatic) metastases upon presentation to a clinic or medical professional. Accordingly, there is an ongoing clinical need for cancer screening and surveillance methods which facilitate an early diagnosis of cancer, before metastasis occurs, thereby increasing the prospects of a successful treatment of the cancer. Ideally, these screening methods should also facilitate low cost and high-throughput screening of at-risk individuals from the population.
While others have recognised the need to develop more efficient screening methods for diagnosing CRC in the form of blood tests, current methods deployed for CRC screening are neither sensitive nor specific in diagnosing early stages. Those cancer specific biomarkers that are in use clinically (e.g., PSA, CEA and CA125) are found in low concentrations (pg-ng/mL) in human plasma and they are thought to be present as a result of structural changes in the microenvironment of cancer tissues which "leak" these markers so that they eventually reach a steady-state level in circulating human plasma.
Blood is routinely collected for therapeutic and/or diagnostic purposes by minimally invasive methods and is therefore one of the most intensely studied clinical specimens [1]. It is believed that changes in the composition of plasma proteins and metabolites that are released from various organs are a reflection of an individual's physical condition [3, 4]. Despite such keen interest and potential, the clinically important properties of the plasma proteome remain largely unexplored. Perhaps the major reason is that human plasma represents one of the most complex human-derived proteomes, containing a large variety of proteins of immense complexity with a wide dynamic range of concentration (>10 orders of magnitude) [5, 6]. Although many studies report mostly liver-derived high abundance plasma proteins (e.g., α1 protease inhibitor and other serpins, C-reactive protein, haptoglobin) as potential disease markers, it has been established that many of these proteins change their expression in response to acute phase inflammation. Conversely, medium abundance proteins (e.g., tissue leakage proteins) and low abundance proteins (e.g., cytokines, chemokines and interleukins) have been established as promising biomarkers (CA19-9, CA125, prostate specific antigen (PSA), AFP and CEA) that can be used to assess disease risk, predict prognosis, and monitor response to treatment and disease progression [5].
By developing a unique method for depleting abundant proteins from human blood samples, the present inventors have been able to identify a novel set of CRC-specific biomarkers.
Accordingly, in a first aspect, the present invention relates to a method of determining the likelihood an individual has cancer including:
- providing a test sample of bodily fluid from an individual for whom diagnosis of cancer is required;
- measuring the amount of one or more protein biomarkers in the test sample, wherein the one or more protein biomarkers is selected from the group consisting of: ADAM DEC1, cystatin-C and complement factor D;
- determining that the individual has a high likelihood of having cancer when:
a) the amount of ADAM DEC1 in the test sample is greater than the amount of ADAM DEC1 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
b) the amount of cystatin-C and/or complement factor D in the test sample is lower than the amount of the same protein biomarker in the reference data set;
- determining that the individual has a low likelihood of having cancer when:
c) the amount of ADAM DEC1 in the test sample is the same or lower than the amount of ADAM DEC1 in the reference data set;
d) the amount of cystatin-C and/or complement factor D in the test sample is the same or greater than the amount of the same protein biomarker in the reference data set.
Any one or more additional biomarkers may be used in the diagnostic methods of the present invention. Examples of such additional biomarkers are discussed later in this document.
The individual for whom the likelihood of having cancer is required may be an individual who has presented to the clinic with symptoms which may be indicative of a cancer. In alternative embodiments, the individual does not have any overt symptoms of cancer but may require screening for cancer, for example, due to a concern that there is a family history of one or more cancers.
In a further embodiment, the individual for whom the likelihood of having cancer is to be determined, is an individual requiring diagnosis of colorectal cancer. The individual may or may not have a family history of colorectal cancer. Moreover, the individual may or may not have presented with symptoms suggestive of a colorectal neoplasm, including evidence of blood in the stools, a positive result from an FOBT test, abdominal pain, or changes in bowel habits. One particular advantage of the present invention, is that it enables early detection of a colorectal cancer, for example, detection when the cancer is not yet metastatic. This provides a significant advantage over prior methods which often fail to detect the cancer until it is significantly more advanced, and potentially metastatic. For example, the current methods of colorectal cancer screening, such as use of carcinoembryonic antigen (CEA) methods, do not enable detection of colorectal cancer in Dukes Stages A and B (benign), and typically only yield positive results when the cancer is in stages C or D (malignant). Having a screening method which enables detection of colorectal cancer in stages A and B provides a significant advantage over the methods of the prior art, since cancers in Dukes Stages A and B can be more easily treated, and individuals diagnosed with cancer in this stage can expect to receive successful treatment and enter into full remission.
The methods of the present invention facilitate detection of colorectal cancer at all of the Dukes Stages. The skilled person will be familiar with the usage of the Dukes staging for determining the malignancy of a colorectal cancer. Specifically, in Dukes Stage A, the tumour has penetrated into, but not through, the bowel wall. In Dukes Stage B, the tumour has penetrated through the bowel wall but there is not yet any lymph node involvement. In Dukes Stage C, the cancer involves regional lymph nodes. In Dukes Stage D, there is distant metastasis, for example, to the liver or lung. In one embodiment, the methods of the present invention are able to diagnose or detect colorectal cancer at any Dukes Stage with a sensitivity of at least 80%.
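By way of illustration only, the sensitivity referred to above is conventionally calculated as the proportion of individuals with cancer who are correctly identified by the test. The following Python sketch shows that calculation for each Dukes Stage. The function names and the layout of the input are hypothetical, and the 80% threshold is simply the figure recited in the preceding paragraph.

```python
# Illustrative sketch only: function names and input layout are hypothetical.
def sensitivity(true_positives, false_negatives):
    """Fraction of individuals with cancer who are correctly detected."""
    total = true_positives + false_negatives
    if total == 0:
        raise ValueError("no individuals with cancer in this group")
    return true_positives / total


def meets_threshold(counts_by_stage, threshold=0.80):
    """Check, per Dukes Stage, whether sensitivity reaches the threshold.

    counts_by_stage maps a stage label to a (true positives, false negatives)
    pair obtained from individuals known to have cancer at that stage.
    """
    return {
        stage: sensitivity(tp, fn) >= threshold
        for stage, (tp, fn) in counts_by_stage.items()
    }
```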
The skilled person will also be familiar with other systems for staging cancer that are known in the art. Other commonly used staging systems for colorectal cancer are the Modified Astler-Coller Classification (MAC) and the TNM Classification of Malignant Tumors of the American Joint Committee on Cancer (AJCC), sometimes also known as the TNM system. The TNM system describes 3 key pieces of information: "T", "N" and "M", where "T" denotes the degree of invasion of the intestinal wall, "N" denotes the degree of lymphatic node involvement and "M" the degree of metastasis. Numbers or letters appear after T, N and M to provide more details about each of these factors. The numbers 0 through 4 indicate increasing severity. The letter X means "cannot be assessed because the information is not available". More specifically:
- T describes how far the main (primary) tumour has grown through the layers of the intestine and whether it has grown into nearby areas. These layers, from the inner to the outer, include:
• the inner lining (mucosa);
• thin muscle layer (muscularis mucosa);
• the fibrous tissue beneath this muscle layer (submucosa);
• thick muscle layer (muscularis propria) that contracts to force the contents of the intestines along;
• the thin, outermost layers of connective tissue (subserosa and serosa) that cover most of the colon but not the rectum
- N describes whether the cancer has spread to nearby (regional) lymph nodes and, if so, how many lymph nodes are involved.
Nx: no description of lymph node involvement is possible because of incomplete information;
N0: no cancer in nearby lymph nodes;
N1a: cancer cells are found in 1 nearby lymph node;
N1b: cancer cells are found in 2 to 3 nearby lymph nodes;
N1c: small deposits of cancer cells are found in areas of fat near lymph nodes, but not in the lymph nodes themselves;
N2a: cancer cells are found in 4 to 6 nearby lymph nodes;
N2b: cancer cells are found in 7 or more nearby lymph nodes.
- M indicates whether the cancer has metastasized.
M0: no distant spread is seen;
M1a: the cancer has spread to 1 distant organ or set of distant lymph nodes;
M1b: the cancer has spread to more than 1 distant organ or set of distant lymph nodes, or it has spread to distant parts of the peritoneum (the lining of the abdominal cavity).
The skilled person will appreciate that the Dukes Stages correspond to certain TNM classifications. For example, Dukes Stage A corresponds to T1, T2, N0 and M0; Dukes Stage B corresponds to T3, T4a, T4b, N0 and M0; and Dukes Stage C corresponds to i) T1-T2, N1/N1c and M0; ii) T1, N2a and M0; iii) T3-T4a, N1/N1c and M0; iv) T2-T3, N2a and M0; v) T1-T2, N2b and M0; vi) T4a, N2a and M0; vii) T3-T4a, N2b and M0; and viii) T4b, N1-N2 and M0. Thus, the skilled person will understand that reference to a Dukes Stage as used herein includes reference to the corresponding TNM classification as known in the art.
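By way of illustration only, the correspondence between TNM classification and Dukes Stage described above can be sketched in Python as a simple lookup. The function name and the string labels are hypothetical. Two simplifications are assumed: every node-positive, M0 combination is assigned to Stage C, which covers each of the combinations i) to viii) listed above, and any classification other than M0 is assigned to Stage D in line with the description of Stage D as involving distant metastasis. Classifications that cannot be assessed, such as Nx, are rejected rather than guessed.

```python
# Illustrative sketch only: names and simplifications are assumptions.
T_STAGE_A = {"T1", "T2"}
T_STAGE_B = {"T3", "T4a", "T4b"}
N_POSITIVE = {"N1", "N1a", "N1b", "N1c", "N2", "N2a", "N2b"}
M_DISTANT = {"M1a", "M1b"}


def dukes_stage(t, n, m):
    """Map a TNM classification, e.g. ('T3', 'N0', 'M0'), to a Dukes Stage."""
    if t not in T_STAGE_A | T_STAGE_B:
        raise ValueError(f"unsupported T classification: {t}")
    if m in M_DISTANT:
        return "D"  # distant metastasis
    if m != "M0":
        raise ValueError(f"unsupported M classification: {m}")
    if n in N_POSITIVE:
        return "C"  # regional lymph node involvement, no distant spread
    if n != "N0":
        raise ValueError(f"unsupported N classification: {n}")
    return "A" if t in T_STAGE_A else "B"
```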
As used herein, an "early stage malignant neoplasm" in the context of colorectal cancer is a reference to a large intestine neoplasm which has become malignant but which is unlikely to extend beyond the bowel wall. Reference to a "late stage malignant neoplasm" should be understood as a reference to a large intestine neoplasm which is malignant and which has spread to lymph nodes or distant organs. Reference to late stage malignant neoplasms includes, for example, neoplasms which have become metastatic.
As used herein, the term "metastasis" is meant to refer to the process in which cancer cells originating in one organ or part of the body relocate to another part of the body and continue to replicate. Metastasized cells subsequently form tumours which may further metastasize. Metastasis thus refers to the spread of cancer from the part of the body where it originally occurs to other parts of the body. As used herein, the term "metastasized colorectal cancer cells" is meant to refer to colorectal cancer cells which have metastasized, that is, colorectal cancer cells localized in a part of the body other than the colon or rectum.
Protein biomarkers and samples of bodily fluid
The present invention requires measuring or determining the level or amount of one or more protein biomarkers in a sample of bodily fluid taken from an individual. The present inventors have shown that determining the amount or level of these biomarkers in a sample of bodily fluid from an individual allows for the detection or diagnosis of a cancer, in particular a colorectal cancer. As used herein, "measuring" means determining the level of one or more biomarkers in a sample of bodily fluid obtained from an individual. The step of determining the levels may be performed by any method known to the person skilled in the art, and as further described herein (for example, by any proteomic, or protein or peptide detection technique available for measuring the presence or absence, or level or amount of a protein, variant or fragment thereof in the sample). The step of measuring or determining need not be an active step. For example, the skilled person may refer to a database of clinical results obtained from the individual requiring diagnosis, and therefrom measure or determine the amount of the relevant protein biomarker, relative to a suitable control. As further described herein, the control may be a database of one or more samples from individuals who do or do not have cancer and thereby provide a reference point with respect to the amount of a given biomarker in a sample of bodily fluid. Alternatively, the individual may serve as their own control (for example, where the control may be a sample obtained from the individual before receiving treatment for cancer, or at an earlier stage in their life).
As used herein, a sample of bodily fluid is a biological sample and includes a whole blood sample, saliva, interstitial fluid, urine, faeces or derivatives thereof. In a preferred embodiment, the blood sample is a plasma or serum sample. The skilled person will appreciate that the biomarkers can also be detected in biological samples including biopsies of solid tissue or tumour which also include some blood or interstitial fluid. Such samples can also be exploited for detecting the amount of the protein biomarkers described herein for determining the likelihood of an individual having cancer.
Prior to testing for the presence of protein biomarkers, the sample of bodily fluid may be subjected to pre-treatment. Pre-treatment may involve, for example, preparing plasma from blood, diluting viscous fluids, and the like. Such methods may involve filtration, distillation, separation, concentration, inactivation of interfering components, and the addition of reagents. The selection and pre-treatment of biological samples prior to testing is well known in the art and need not be described further.
The skilled person will be familiar with various methods for obtaining biological samples, including blood samples, from individuals. Further, the skilled person will be familiar with various methods for ensuring proper storage of samples to ensure that there is no appreciable degradation of the protein contents of the sample (for example, the use of vacutainer blood collection tubes containing EDTA, heparin, citrate or other additives to prevent clotting or preserve the quality of the blood sample). The skilled person will also be familiar with methods for extracting plasma and serum from samples of whole blood.
The protein biomarkers that are useful in the methods of the present invention are selected from ADAM DEC1, cystatin-C, complement factor D, plasma protease C1 inhibitor, paraoxonase 1, hemopexin, inter-alpha-trypsin inhibitor heavy chain H2, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, serum amyloid protein A2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and superoxide dismutase 3.
Any one or more of the protein biomarkers disclosed herein may be used in the methods of the present invention.
Reference to any of these biomarkers described herein, includes reference to variants such as isoforms and transcript variants, and fragments thereof, as would be known by the person skilled in the art.
As used herein, reference to the protein biomarker ADAM DEC1 includes any fragment, peptide sequence or variant of the protein ADAM DEC1 (also referred to as ADAM-like decysin 1 or disintegrin and metalloproteinase domain-like protein decysin-1 and having Uniprot ID O15204).
As used herein, reference to the protein biomarker cystatin-C includes any fragment, peptide sequence or variant of the protein cystatin C. This protein may also be referred to as cystatin 3, gamma trace, post-gamma-globulin or neuroendocrine basic polypeptide and has the Uniprot ID P01034.
As used herein, reference to the protein biomarker complement factor D includes any fragment, peptide sequence or variant of the protein complement factor D. This protein may also be referred to as factor D protein, CFD, C3 proactivator convertase, properdin factor D esterase, factor D (complement) and adipsin. This protein is classified as belonging to EC 3.4.21.46 and has Uniprot ID P00746.
As used herein, reference to the protein biomarker plasma protease C1 inhibitor includes any fragment, peptide sequence or variant of the protein plasma protease C1 inhibitor. The protein may also be referred to as C1-inh, C1 esterase inhibitor or Serpin G1 and has the Uniprot ID P05155.
As used herein, reference to the protein biomarker paraoxonase 1 includes any fragment, peptide sequence or variant of the protein paraoxonase 1. This protein may also be referred to as serum paraoxonase, PON-1, arylesterase 1, A-esterase, homocysteine thiolactonase or serum aryldialkylphosphatase 1 and has Uniprot ID P27169.
As used herein, reference to the protein biomarker hemopexin includes any fragment, peptide sequence or variant of the protein hemopexin. This protein may also be referred to as haemopexin, HPX, or beta-1B-glycoprotein and has Uniprot accession no. P02790.
As used herein, reference to the protein biomarker inter-alpha-trypsin inhibitor heavy chain H2 includes any fragment, peptide sequence or variant of the protein inter-alpha-trypsin inhibitor heavy chain H2, also known as ITI-HC2, serum-derived hyaluronan-associated protein or inter-alpha-trypsin inhibitor complex component II. This protein is identified in Uniprot by the accession number P19823.
As used herein, reference to the protein biomarker serum amyloid P component includes any fragment, peptide sequence or variant of the protein serum amyloid P component. This protein may also be referred to as APCS, SAP or 9.5S alpha-1-glycoprotein and has Uniprot ID P02743.
As used herein, reference to the protein biomarker apolipoprotein A-IV includes any fragment, peptide sequence or variant of the protein apolipoprotein A-IV. This protein may also be referred to as APOA4, and has Uniprot ID P06727.
As used herein, reference to the protein biomarker apolipoprotein B-100 includes any fragment, peptide sequence or variant of the protein apolipoprotein B-100. This protein may also be referred to as APOB and has Uniprot ID P04114.
As used herein, reference to the protein biomarker carboxypeptidase Q includes any fragment, peptide sequence or variant of the protein carboxypeptidase Q. This protein may also be referred to as CPQ, lysosomal dipeptidase or plasma glutamate carboxypeptidase and has Uniprot ID Q9Y646.
As used herein, reference to the protein biomarker prothrombin includes any fragment, peptide sequence or variant of the protein prothrombin. This protein may also be referred to as F2 or coagulation factor II and has Uniprot ID P00734.
As used herein, reference to the protein biomarker glutathione peroxidase 3 includes any fragment, peptide sequence or variant of the protein glutathione peroxidase 3, or GPX3. This protein may also be referred to as plasma glutathione peroxidase (GPx-P) or extracellular glutathione peroxidase and has Uniprot ID P22352.
As used herein, reference to the protein biomarker hepatocyte growth factor activator (HGFAC), includes any fragment, peptide sequence or variant of the protein hepatocyte growth factor activator. This protein may also be referred to as HGFA and has Uniprot ID Q04756.
As used herein, reference to the protein biomarker mannan-binding lectin serine protease 2 (MASP2) includes any fragment, peptide sequence or variant of the protein MASP2. This protein may also be referred to as mannose-binding protein-associated serine protease 2 and has Uniprot ID O00187.
As used herein, reference to the protein biomarker mannose receptor C type 1 includes any fragment, peptide sequence or variant of the protein mannose receptor C type 1. This protein may also be referred to as MRC1, macrophage mannose receptor 1 or C-type lectin domain family 13 member D. This protein has the Uniprot ID P22897.
As used herein, reference to the protein biomarker protocadherin gamma-A8 (PCDHGA8) includes any fragment, peptide sequence or variant of the protein PCDHGA8. This protein may also be referred to as PCDH-gamma-A8 and has the Uniprot ID Q9Y5G5.
As used herein, reference to the protein biomarker profilin-1 includes any fragment, peptide sequence or variant of the protein profilin-1. This protein may also be referred to as PFN-1, or epididymis tissue protein Li 184a and has Uniprot ID P07737.
As used herein, reference to the protein biomarker S100 calcium binding protein 8 includes any fragment, peptide sequence or variant of protein S100A8. This protein may also be referred to as calgranulin-A, calprotectin L1L subunit, cystic fibrosis antigen (CFAG), leukocyte L1 complex light chain, migration inhibitory factor-related protein 8 (MRP-8), or urinary stone protein band A. The protein has the Uniprot ID: P05109.
As used herein, reference to the protein biomarker serum amyloid protein A1 includes any fragment, peptide sequence or variant of the protein SAA1. This protein may also be referred to as SAA and has the Uniprot ID: P0DJI8.
As used herein, reference to the protein biomarker serum amyloid protein A2 includes any fragment, peptide sequence or variant of the protein SAA2. This protein may also be referred to as serum amyloid A2 protein and has Uniprot ID: P0DJI9.
As used herein, reference to the protein biomarker superoxide dismutase 3 (SOD3) includes any fragment, peptide sequence or variant of the protein SOD3. This protein may also be referred to as extracellular superoxide dismutase [Cu-Zn] or EC-SOD and has Uniprot ID: P08294.
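By way of illustration only, the format of a Uniprot accession number may be checked before it is used to look up a biomarker, since a digit zero in place of the letter O, or a stray space, is easily introduced in transcription. The following sketch uses the accession pattern published in the UniProt help pages; the function name is hypothetical, and the check confirms only the format of an identifier, not that it designates any particular protein.

```python
import re

# Accession pattern as published in the UniProt help pages.
UNIPROT_ACCESSION = re.compile(
    r"[OPQ][0-9][A-Z0-9]{3}[0-9]|[A-NR-Z][0-9](?:[A-Z][A-Z0-9]{2}[0-9]){1,2}"
)


def is_uniprot_accession(text: str) -> bool:
    """Return True if the whole string has the form of a Uniprot accession."""
    return UNIPROT_ACCESSION.fullmatch(text) is not None
```

For example, `is_uniprot_accession("P01034")` holds for the cystatin-C identifier given above, whereas a string beginning with a digit is rejected.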
Representative full protein sequences of the biomarkers and exemplary accession numbers are provided in Table 1 below.
Table 1: Protein biomarker sequence information
Protein Accession number Amino acid sequence
GAQATWTELPWPHEKVDGALCMEKSLGPNSCSAN
GPGLYLIHGPNLYCYSDVEKLNAAKALPQPQNVTSLL
GCTH
plasma protease C1 inhibitor (SEQ ID NO: 6) P05155 MASRLTLLTLLLLLLAGDRASSNPNATSSSSQDPESL
QDRGEGKVATTVISKMLFVEPILEVSSLPTTNSTTNS
ATKITANTTDEPTTQPTTEPTTQPTIQPTQPTTQLPTD
SPTQPTTGSFCPGPVTLCSDLESHSTEAVLGDALVD
FSLKLYHAFSAMKKVETNMAFSPFSIASLLTQVLLGA
GENTKTNLESILSYPKDFTCVHQALKGh I I KGVTSVS
QIFHSPDLAIRDTFVNASRTLYSSSPRVLSNNSDANL
ELINTWVAKNTNNKISRLLDSLPSDTRLVLLNAIYLSA
KWKTTFDPKKTRMEPFHFKNSVIKVPMMNSKKYPVA
HFIDQTLKAKVGQLQLSHNLSLVILVPQNLKHRLEDM
EQALSPSVFKAIMEKLEMSKFQPTLLTLPRIKVTTSQ
DMLSIMEKLEFFDFSYDLNLCGLTEDPDLQVSAMQH
QTVLELTETGVEAAAASAISVARTLLVFEVQQPFLFVL
WDQQHKFPVFMGRVYDPRA
inter-alpha-trypsin inhibitor heavy chain H2 (SEQ ID NO: 7) P19823 MKRLTCFFICFFLSEVSGFEIPINGLSEFVDYEDLVEL
APGKFQLVAENRRYQRSLPGESEEMMEEVDQVTLY
SYKVQSTITSRMATTMIQSKVVNNSPQPQNVVFDVQI
PKGAFISNFSMTVDGKTFRSSIKEKTVGRALYAQARA
KGKTAGLVRSSALDMENFRTEVNVLPGAKVQFELH
YQEVKWRKLGSYEHRIYLQPGRLAKHLEVDVWVIEP
QGLRFLHVPDTFEGHFDGVPVISKGQQKAHVSFKPT
VAQQRICPNCRETAVDGELVVLYDVKREEKAGELEV
FNGYFVHFFAPDNLDPIPKNILFVIDVSGSMWGVKMK
QTVEAMKTILDDLRAEDHFSVIDFNQNIRTWRNDLIS
ATKTQVADAKRYIEKIQPSGGTNINEALLRAIFILNEA
NNLGLLDPNSVSLIILVSDGDPTVGELKLSKIQKNVKE
NIQDNISLFSLGMGFDVDYDFLKRLSNENHGIAQRIY
GNQDTSSQLKKFYNQVSTPLLRNVQFNYPHTSVTD
VTQNNFHNYFGGSEIVVAGKFDPAKLDQIESVITATS
ANTQLVLETLAQMDDLQDFLSKDKHADPDFTRKLWA
YLTINQLLAERSLAPTAAAKRRITRSILQMSLDHHIVTP
LTSLVIENEAGDERMLADAPPQDPSCCSGALYYGSK
VVPDSTPSWANPSPTPVISMLAQGSQVLESTPPPHV
MRVENDPHFIIYLPKSQKNICFNIDSEPGKILNLVSDP
ESGIVVNGQLVGAKKPNNGKLSTYFGKLGFYFQSED
IKIEISTETITLSHGSSTFSLSWSDTAQVTNQRVQISVK
KEKVVTITLDKEMSFSVLLHRVWKKHPVNVDFLGIYIP
PTNKFSPKAHGLIGQFMQEPKIHIFNERPGKDPEKPE
ASMEVKGQKLIITRGLQKDYRTDLVFGTDVTCWFVH
NSGKGFIDGHYKDYFVPQLYSFLKRP
Apolipoprotein B-100 (SEQ ID NO: 8) P04114 MDPPRPALLALLALPALLLLLLAGARAEEEMLENVSL
VCPKDATRFKHLRKYTYNYEAESSSGVPGTADSRS
ATRINCKVELEVPQLCSFILKTSQCTLKEVYGFNPEG KALLKKTKNSEEFAAAMSRYELKLAIPEGKQVFLYPE
KDEPTYILNIKRGIISALLVPPETEEAKQVLFLDTVYG
NCSTHFTVKTRKGNVATEISTERDLGQCDRFKPIRT
GISPLALIKGMTRPLSTLISSSQSCQYTLDAKRKHVAE
AICKEQHLFLPFSYKNKYGMVA QVTQTLKLED
TPKINSRFFGEGTKKMGLAFESTKSTSPPKQAEAVLK
TLQELKKLTISEQNIQRANLFNKLVTELRGLSDEAVTS
LLPQLIEVSSPITLQALVQCGQPQCSTHILQWLKRVH
ANPLLIDVVTYLVALIPEPSAQQLREIFNMARDQRSR
ATLYALSHAVNNYHKTNPTGTQELLDIANYLMEQIQD
DCTGDEDYTYLILRVIGNMGQTMEQLTPELKSSI
LKCVQSTKPS LMIQKAAIQA LRKMEPKDKD
QEVLLQTFLDDASPGDKRLAYLMLMRSPSQADINKIV
QILPWEQNEQVKNFVASHIANILNSEELDIQDLKKLVK
EALKESQLPTVMDFRKFSRNYQLYKSVSLPSLDPAS
AKIEGNLIFDPNNYLPKESMLKTTLTAFGFASADLIEI
GLEGKGFEPTLEALFGKQGFFPDSVNKALYWVNGQ
VPDGVSKVLVDHFGY TKDDKHEQDMVNGIMLSVEK
LIKDLKSKEVPEARAYLRILGEELGFASLHDLQLLGKL
LLMGARTLQGIPQMIGEVIRKGSKNDFFLHYIFMENA
FELPTGAGLQLQISSSGVIAPGAKAGVKLEVANMQA
ELVAKPSVSVEFVTNMGIIIPDFARSGVQMNTNFFHE
SGLEAHVALKAGKLKFIIPSPKRPVKLLSGGNTLHLVS
TTKTEVIPPLIENRQSWSVCKQVF PGLNYCTSGA
YSNASSTDSASYYPLTGDTRLELELRPTGEIEQYSVS
ATYELQREDRALVDTLKFVTQAEGAKQTEATMTFKY
NRQSMTLSSEVQIPDFDVDLGTILRVNDESTEGKTSY
RLTLDIQNKKITEVALMGHLSCDTKEERKIKGVISIPRL
QAEARSEILAHWSPAKLLLQMDSSATAYGSTVSKRV
AWHYDEEKIEFEWNTGTNVDTKKMTSNFPVDLSDY
PKSLHMYANRLLDHRVPQTDMTFRHVGSKLIVAMSS
WLQKASGSLPYTQTLQDHLNSLKEFNLQNMGLPDF
HIPENLFLKSDGRVKYTLNKNSLKIEIPLPFGGKSSRD
LKMLETVRTPALHFKSVGFHLPSREFQVPTFTIPKLY
QLQVPLLGVLDLSTNVYSNLYNWSASYSGGNTST
DHFSLRARYHMKADSVVDLLSYNVQGSGETTYDHK
NTFTLSYDGSLRHKFLDSNIKFSHVEKLGNNPVSKGL
LIFDASSSWGPQMSASVHLDSKKKQHLFVKEVKIDG
QFRVSSFYAKGTYGLSCQRDPNTGRLNGESNLRFN
SSYLQGT NQITGRYEDG TLSLTSTSDLQSGIIKNTAS
LKYENYELTLKSDTNGKYKNFATSNKMDMTFSKQNA
LLRSEYQADYESLRFFSLLSGSLNSHGLELNADILGT
DKINSGAHKATLRIGQDGISTSATTNLKCSLLVLENEL
NAELGLSGASMKLTTNGRFREHNAKFSLDGKAALTE
LSLGSAYQAMILGVDSKNIFNFKVSQEGLKLSNDMM
GSYAEMKFDHTNSLNIAGLSLDFSSKLDNIYSSDKFY
KQTVNLQLQPYSLVTTLNSDLKYNALDLTNNGKLRLE
PLKLHVAGNLKGAYQNNEIKHIYAISSAALSASYKADT
VAKVQGVEFSHRLNTDIAGLASAIDMSTNYNSDSLHF
SNVFRSVMAPFTMTIDAHTNGNGKLALWGEHTGQL
YSKFLLKAEPLAFTFSHDYKGSTSHHLVSRKSISAAL
EHKVSALLTPAEQTGTWKLKTQFNNNEYSQDLDAYN
TKDKIGVELTGRTLADLTLLDSPIKVPLLLSEPINIIDAL
EMRDAVEKPQEFTIVAFVKYDKNQDVHSINLPFFETL
QEYFERNRQTI IVVLENVQRNLKHINIDQFVRKYRAAL
GKLPQQANDYLNSFNWERQVSHAKEKLTALTKKYRI
TENDIQIALDDAKINFNEKLSQLQTYMIQFDQYIKDSY
DLHDLKIAIANI IDEI IEKLKLDEHYHIRVNLVKTIHDLHL
FIENIDFNKSGSSTASWIQNVDTKYQIRIQIQEKLQQL
KRHIQNIDIQHLAGKLKQHIEAIDVRVLLDLGTTISFERI
NDILEHVKHFVINLIGDFEVAEKINAFRAKVHELIERYE
VDQQIQVLMDKLVELAHQYKLKETIQKLSNVLQQVKI
KDYFEKLVGFIDDAVKKLNELSFKTFIEDVNKFLDMLI
KKLKSFDYHQFVDETNDKIREVTQRLNGEIQALELPQ
KAEALKLFLEETKATVAVYLESLQDTKITLI INWLQEAL
SSASLAHMKAKFRETLEDTRDRMYQMDIQQELQRYL
SLVGQVYSTLVTYISDWWTLAAKNLTDFAEQYSIQD
WAKRMKALVEQGFTVPEIKTILGTMPAFEVSLQALQK
ATFQTPDFIVPLTDLRIPSVQINFKDLKNIKIPSRFSTP
EFTILNTFHIPSFTIDFVEMKVKI IRTIDQMLNSELQWP
VPDIYLRDLKVEDIPLARITLPDFRLPEIAIPEFI IPTL.NL
NDFQVPDLHIPEQLPHISHTIEVPTFGKLYSLKIQSPLF
TLDANADIGNGTTSANEAGIAASITAKGESKLEVLNFD
FQANAQLSNPKINPLALKESVKFSSKYLRTEHGSEML
FFGNAIEGKSNTVASLHTEKNTLELSNGVIVKINNQLT
LDSNTKYFHKLNIPKLDFSSQADLRNEIKTLLKAGHIA
WTSSGKGSWKWACPRFSDEGTHESQISFTIEGPLTS
FGLSNKINSKHLRVNQNLVYESGSLNFSKLEIQS
QVDSQHVGHSVLTAKGMALFGEGKAEFTGRHDAHL
NGKVIGTLKNSLFFSAQPFEITASTNNEGNLKVRFPL
RLTGKIDFLNNYALFLSPSAQQASWQVSRFNQYKYN
QNFSAGNNENIMEAHVGINGEANLDFLNIPLTIPEMR
LPYTI ITTPPLKDFSLWEKTGLKEFLKTTKQSFDSVKA
QYKKNKHRHSITNPLAVLCEFISQSIKSFDRHFEKNR
NNALDFVTKSYNETKIKFKYKAEKSHDELPRTFQIPG
YTVPVVNVEVSPFTIEMSAFYVFPKAVSMPSFSILGS
DVRVPSYTLILPSLELPVLHVPRNLKLSLPDFKELCTI
SHIFIPAMGNITYDFSFKSSVITLNTNAELFNQSDIVAH
LLSSSSSVIDALQYKLEGTTRLTRKRGLKLATALSLSN
KFVEGSHNSTVSLTTKNMEVSVATTTKAQIPILRMNF
QELNGNTKSKPTVSSSMEFKYDFNSSMLYSTAKGAV
DHKSLESLTSYFIESSTKGDVKGSVLSREYSGTIASE
ANTYNSKSTRSSVKLQGTSKIDDIWNLEVKENFAGE
ATLQRIYLWEHSTKNHQLEGLFFTNGEHTSKATLELS
PWQMSALVQVHASQPSSFHDFPDLGQEVLNANTKN
QKRWKNEVRIHSGSFQSQVELSNDQEKAHLDIAGSL
EGHLRFLKNI ILPVYDKSLWDFLKLVTTSIGRRQHLRV
STAFVYTKNPNGYSFSIPVKVLADKFI IPGLKLNDLNS
VLVMPTFHVPFTDLQVPSCKLDFREIQIKKLRTSSFAL
NLPTLPEVKFPEVDVLTKYSQPEDSLIPFFEITVPESQ
LTVSQFTLPKSSDGIAALDLNAVANKIADFLPTI IVPEQ
TIEIPSIKFSVPAGIVIPSFQALTARFEVDSPVYNATWS
ASLKNKADYVETVLDSTCSSTVQFLEYELNVLGTHKI
EDGTLASKTKGTFAHRDFSAEYEEDGKYEGLQEWE
GKAHLNIKSPAFTDLHLRYQKDKKGISTSAASPAVGT VGMDMDEDDDFSKWNFYYSPQSSPDKKLTIFKTELR VRESDEETQIKVNWEEEAASGLLTSLKDNVPKATGV
LYDYVNKYHWEHTGLTLREVSSKLRRNLQNNAEWV
YQGAIRQIDDIDVRFQKAASGTTGTYQEWKDKAQNL
YQELLTQEGQASFQGLKDNVFDGLVRVTQEFHMKV
KHLIDSLIDFLNFPRFQFPGKPGIYTREELCTMFREVG
TVLSQYSKVHNGSEILFSYFQDLVITLPFELRKHKLID
VISMYRELLKDLSKEAQEVFKAIQSLKTTEVLRNLQDL
LQFIFQLIEDNIKQLKEMKFTYLINYIQDEINTIFSDYIPY
VFKLLKENLCLNLHKFNEFIQNELQEASQELQQIHQYI
MALREEYFDPSIVGWTVKYYELEEKIVSLIKNLLVALK
DFHSEYIVSANFTSQLSSQVEQFLHRNIQEYLSILTDP
DGKGKEKIAELSATAQEIIKSQAIATKKIISDYHQQFRY
KLQDFSDQLSDYYEKFIAESKRLIDLSIQNYHTFLIYIT
ELLK KLQSTTVMNPYMKLAPGELT ML
Serum amyloid P-component (SAP) (SEQ ID NO: 9) P02743 MNKPLLWISVLTSLLEAFAHTDLSGKVFVFPRESVTD
HVNLITPLEKPLQNFTLCFRAYSDLSRAYSLFSYNTQ
GRDNELLVYKERVGEYSLYIGRHKVTSKVIEKFPAPV
HICVSWESSSGIAEFWINGTPLVKKGLRQGYFVEAQ
PKIVLGQEQDSYGGKFDRSQSFVGEIGDLYMWDSV
LPPENILSAYQGTPLPANILDWQALNYEIRGYVIIKPL
VWV
Prothrombin (SEQ ID NO: 10) P00734 MAHVRGLQLPGCLALAALCSLVHSQHVFLAPQQARS
LLQRVRRANTFLEEVRKGNLERECVEETCSYEEAFE
ALESSTATDVFWAKYTACETARTPRDKLAACLEGN
CAEGLGTNYRGHVNITRSGIECQLWRSRYPHKPEIN
STTHPGADLQENFCRNPDSSTTGPWCYTTDPTVRR
QECSIPVCGQDQVTVAMTPRSEGSSVNLSPPLEQC
VPDRGQQYQGRLAVTTHGLPCLAWASAQAKALSKH
QDFNSAVQLVENFCRNPDGDEEGVWCYVAGKPGD
FGYCDLNYCEEAVEEETGDGLDEDSDRAIEGRTATS
EYQTFFNPRTFGSGEADCGLRPLFEKKSLEDKTERE
LLESYIDGRIVEGSDAEIGMSPWQVMLFRKSPQELL
GASLISDRWVLTAAHCLLYPPWDKNFTENDLLVRIGK
HSRTRYERNIEKISMLEKIYIHPRYNWRENLDRDIALM
KLKKVAFSDYIHPVCLPDRETAASLLQAGYKGRVTG
WGNLKETWTANVGKGQPSVLQVVNLPIVERPVCKD
STRIRITDNMFCAGYKPDEGKRGDACEGDSGGPFV
MKSPFNNRWYQMGIVSWGEGCDRDGKYGFYTHVF
RLKKWIQKVIDQFGE
Hepatocyte growth factor activator (SEQ ID NO: 11) Q04756 MGRWAWVPSPWPPPGLGPFLLLLLLLLLLPRGFQP
QPGGNRTESPEPNATATPAIPTILVTSVTSETPATSA
PEAEGPQSGGLPPPPRAVPSSSSPQAQALTEDGRP
CRFPFRYGGRMLHACTSEGSAHRKWCATTHNYDR
DRAWGYCVEATPPPGGPAALDPCASGPCLNGGSC
SNTQDPQSYHCSCPRAFTGKDCGTEKCFDETRYEY
LEGGDRWARVRQGHVEQCECFGGRTWCEGTRHTA
CLSSPCLNGGTCHLIVATGTTVCACPPGFAGRLCNIE
PDERCFLGNGTGYRGVASTSASGLSCLAWNSDLLY
QELHVDSVGAAALLGLGPHAYCRNPDNDERPWCYV
VKDSALSWEYCRLEACESLTRVQLSPDLLATLPEPA
SPGRQACGRRHKKRTFLRPRIIGGSSSLPGSHPWLA
AIYIGDSFCAGSLVHTCWVVSAAHCFSHSPPRDSVS
VVLGQHFFNRTTDVTQTFGIEKYIPYTLYSVFNPSDH
DLVLIRLKKKGDRCATRSQFVQPICLPEPGSTFPAGH
KCQIAGWGHLDENVSGYSSSLREALVPLVADHKCSS
PEVYGADISPNMLCAGYFDCKSDACQGDSGGPLAC
EKNGVAYLYGIISWGDGCGRLHKPGVYTRVANYVD
WINDRIRPPRRLVAPS
Apolipoprotein A-IV (SEQ ID NO: 12) P06727 MFLKAVVLTLALVAVAGARAEVSADQVATVMWDYF
SQLSNNAKEAVEHLQKSELTQQLNALFQDKLGEVN
TYAGDLQKKLVPFATELHERLAKDSEKLKEEIGKELE
ELRARLLPHANEVSQKIGDNLRELQQRLEPYADQLR
TQVNTQAEQLRRQLTPYAQRMERVLRENADSLQAS
LRPHADELKAKIDQNVEELKGRLTPYADEFKVKIDQT
VEELRRSLAPYAQDTQEKLNHQLEGLTFQMKKNAEE
LKARISASAEELRQRLAPLAEDVRGNLRGNTEGLQK
SLAELGGHLDQQVEEFRRRVEPYGENFNKALVQQM
EQLRQKLGPHAGDVEGHLSFLEKDLRDKVNSFFSTF
KEKESQDKTLSLPELEQQQEQQQEQQQEQVQMLAP
LES
Carboxypeptidase Q (SEQ ID NO: 13) Q9Y646 MKFLIFAFFGGVHLLSLCSGKAICKNGISKRTFEEIKE
EIASCGDVAKAIINLAVYGKAQNRSYERLALLVDTVG
PRLSGSKNLEKAIQIMYQNLQQDGLEKVHLEPVRIPH
WERGEESAVMLEPRIHKIAILGLGSSIGTPPEGITAEV
LVVTSFDELQRRASEARGKIVVYNQPYINYSRTVQYR
TQGAVEAAKVGALASLIRSVASFSIYSPHTGIQEYQD
GVPKIPTACITVEDAEMMSRMASHGIKIVIQLKMGAK
TYPDTDSFNTVAEITGSKYPEQVVLVSGHLDSWDVG
QGAMDDGGGAFISWEALSLIKDLGLRPKRTLRLVLW
TAEEQGGVGAFQYYQLHKVNISNYSLVMESDAGTFL
PTGLQFTGSEKARAIMEEVMSLLQPLNITQVLSHGEG
TDINFWIQAGVPGASLLDDLYKYFFFHHSHGDTMTV
MDPKQMNVAAAVWAVVSYVVADMEEMLPRS
Glutathione peroxidase 3 (SEQ ID NO: 14) P22352 MARLLQASCLLSLLLAGFVSQSRGQEKSKMDCHGGI
SGTIYEYGALTIDGEEYIPFKQYAGKYVLFVNVASYU
GLTGQYIELNALQEELAPFGLVILGFPCNQFGKQEPG ENSEILPTLKYVRPGGGFVPNFQLFEKGDVNGEKEQ
KFYTFLKNSCPPTSELLGTSDRLFWEPMKVHDIRWN
FEKFLVGPDGIPIMRWHHRTTVSNVKMDILSYMRRQ AALGVKRK
Mannan-binding lectin serine protease 2 (SEQ ID NO: 15) O00187 MRLLTLLGLLCGSVATPLGPKWPEPVFGRLASPGFP
GEYANDQERRWTLTAPPGYRLRLYFTHFDLELSHLC
EYDFVKLSSGAKVLATLCGQESTDTERAPGKDTFYS
LGSSLDITFRSDYSNEKPFTGFEAFYAAEDIDECQVA
PGEAPTCDHHCHNHLGGFYCSCRAGYVLHRNKRTC
SALCSGQVFTQRSGELSSPEYPRPYPKLSSCTYSISL
EEGFSVILDFVESFDVETHPETLCPYDFLKIQTDREE
HGPFCGKTLPHRIETKSNTVTITFVTDESGDHTGWKI
HYTSTAQPCPYPMAPPNGHVSPVQAKYILKDSFSIFC ETGYELLQGHLPLKSFTAVCQKDGSWDRPMPACSIV DCGPPDDLPSGRVEYITGPGVTTYKAVIQYSCEETFY TMKVNDGKYVCEADGFWTSSKGEKSLPVCEPVCGL
SARTTGGRIYGGQKAKPGDFPWQVLILGGTTAAGAL
LYDNWVLTAAHAVYEQKHDASALDIRMGTLKRLSPH
YTQAWSEAVFIHEGYTHDAGFDNDIALIKLNNKVVIN
SNITPICLPRKEAESFMRTDDIGTASGWGLTQRGFLA
RNLMYVDIPIVDHQKCTAAYEKPPYPRGSVTANMLC
AGLESGGKDSCRGDSGGALVFLDSETERWFVGGIV
SWGSMNCGEAGQYGVYTKVINYIPWIENIISDF
Mannose receptor C type 1 (SEQ ID NO: 16) P22897 MRLPLLLVFASVIPGAVLLLDTRQFLIYNEDHKRCVDA
VSPSAVQTAACNQDAESQKFRWVSESQIMSVAFKL
CLGVPSKTDWVAITLYACDSKSEFQKWECKNDTLLG
IKGEDLFFNYGNRQEKNIMLYKGSGLWSRWKIYGTT
DNLCSRGYEAMYTLLGNANGATCAFPFKFENKWYA
DCTSAGRSDGWLWCG I I I DYDTDKLFGYCPLKFEG
SESLWNKDPLTSVSYQINSKSALTWHQARKSCQQQ
NAELLSITEIHEQTYLTGLTSSLTSGLWIGLNSLSFNS
GWQWSDRSPFRYLNWLPGSPSAEPGKSCVSLNPG
KNAKWENLECVQKLGYICKKGNTTLNSFVIPSESDVP
THCPSQWWPYAGHCYKIHRDEKKIQRDALTTCRKE
GGDLTSIHTIEELDFIISQLGYEPNDELWIGLNDIKIQM
YFEWSDGTPVTFTKWLRGEPSHENNRQEDCVVMK
GKDGYWADRGCEWPLGYICKMKSRSQGPEIVEVEK
GCRKGWKKHHFYCYMIGHTLSTFAEANQTCNNENA
YLTTIEDRYEQAFLTSFVGLRPEKYFWTGLSDIQTKG
TFQWTIEEEVRFTHWNSDMPGRKPGCVAMRTGIAG
GLWDVLKCDEKAKFVCKHWAEGVTHPPKPTTTPEP
KCPEDWGASSRTSLCFKLYAKGKHEKKTWFESRDF
CRALGGDLASINNKEEQQTIWRLITASGSYHKLFWLG
LTYGSPSEGFTWSDGSPVSYENWAYGEPNNYQNV
EYCGELKGDPTMSWNDINCEHLNNWICQIQKGQTP
KPEPTPAPQDNPPVTEDGWVIYKDYQYYFSKEKETM
DNARAFCKRNFGDLVSIQSESEKKFLWKYVNRNDA
QSAYFIGLLISLDKKFAWMDGSKVDYVSWATGEPNF
ANEDENCVTMYSNSGFWNDINCGYPNAFICQRHNS
SINATTVMPTMPSVPSGCKEGWNFYSNKCFKIFGFM
EEERKNWQEARKACIGFGGNLVSIQNEKEQAFLTYH
MKDSTFSAWTGLNDVNSEHTFLWTDGRGVHYTNW
GKGYPGGRRSSLSYEDADCVVIIGGASNEAGKWMD
DTCDSKRGYICQTRSDPSLTNPPATIQTDGFVKYGK
SSYSLMRQKFQWHEAETYCKLHNSLIASILDPYSNAF
AWLQMETSNERVWIALNSNLTDNQYTWTDKWRVRY
TNWAADEPKLKSACVYLDLDGYWKTAHCNESFYFL
CKRSDEIPATEPPQLPGRCPESDHTAWIPFHGHCYYI
ESSYTRNWGQASLECLRMGSSLVSIESAAESSFLSY
RVEPLKSKTNFWIGLFRNVEGTWLWINNSPVSFVNW
NTGDPSGERNDCVALHASSGFWSNIHCSSYKGYICK
RPKIIDAKPTHELLTTKADTRKMDPSKPSSNVAGVVII
VILLILTGAGLAAYFFYKKRRVHLPQEGAFENTLYFNS QSSPGTSDMKDLVGNIEQNEHSVI
Protocadherin gamma-A8 (SEQ ID NO: 17) Q9Y5G5 MAAPQSRPRRGELILLCALLGTLWEIGRGQIRYSVPE
ETDKGSFVGNISKDLGLDPRKLAKHGVRIVSRGRTQ
LFALNPRSGSLITAGRIDREELCAQSPRCLININTLVE
DKGKLFGVEIEIIDINDNNPKFQVEDLEVKINEIAVPGA
RYPLPEAVDPDVGVNSLQSYQLSPNHHFSLDVQTG
DNGAINPELVLERALDREEEAAHHLVLTASDGGKPP
RSSTVRIHVTVLDTNDNAPVFPHPIYRVKVLENMPPG
TRLLTVTASDPDEGINGKVAYKFRKINEKQTPLFQLN
ENTGEISIAKSLDYEECSFYEMEIQAEDVGALLGRTK
LLISVEDVNDNRPEVIITSLFSPVLENSLPGTVIAFLSV
HDQDSGKNGQVVCYTRDNLPFKLEKSIGNYYRLVTR
KYLDRENVSIYNITVMASDLGTPPLSTETQIALHVADI
NDNPPTFPHASYSAYILENNLRGASIFSLTAHDPDSQ
ENAQVTYSVTEDTLQGAPLSSYISINSDTGVLYALQS
FDYEQIRDLQLLVTASDSGDPPLSSNMSLSLFVLDQN
DNAPEILYPALPTDGSTGVELAPRSAERGYLVTKVVA
VDRDSGQNAWLSYRLLKASEPGLFSVGLHTGEVRT
ARALLDRDALKQSLVVAVQDHGQPPLSATVTLTVAV
ADSIPEVLTELGSLKPSVDPNDSSLTLYLVVAVAAISC
VFLAFVAVLLGLRLRRWHKSRLLQDSGGRLVGVPAS
HFVGVEEVQAFLQTYSQEVSLTADSRKSHLIFPQPN
YADMLISQEGCEKNDSLLTSVDFHEYKNEADHGQQA
PPNTDWRFSQAQRPGTSGSQNGDDTGTWPNNQFD
TEMLQAMILASASEAADGSSTLGGGAGTMGLSARY
GPQFTLQHVPDYRQNVYIPGSNATLTNAAGKRDGKA
PAGGNGNKKKSGKKEKK
Profilin-1 (SEQ ID NO: 18) P07737 MAGWNAYIDNLMADGTCQDAAIVGYKDSPSVWAAV
PGKTFVNITPAEVGVLVGKDRSSFYVNGLTLGGQKC
SVIRDSLLQDGEFSMDLRTKSTGGAPTFNVTVTKTD
KTLVLLMGKEGVHGGLINKKCYEMASHLRRSQY
S100 calcium binding protein 8 (SEQ ID NO: 19) P05109 MLTELEKALNSIIDVYHKYSLIKGNFHAVYRDDLKKLL
ETECPQYIRKKGADVWFKELDINTDGAVNFQEFLILVI
KMGVAAHKKSHEESHKE
Serum amyloid protein A1 (SEQ ID NO: 20) P0DJI8 MKLLTGLVFCSLVLGVSSRSFFSFLGEAFDGARDMW
RAYSDMREANYIGSDKYFHARGNYDAAKRGPGGV
WAAEAISDARENIQRFFGHGAEDSLADQAANEWG
RSGKDPNHFRPAGLPEKY
Serum amyloid protein A2 (SEQ ID NO: 21) P0DJI9 MKLLTGLVFCSLVLSVSSRSFFSFLGEAFDGARDMW
RAYSDMREANYIGSDKYFHARGNYDAAKRGPGGA
WAAEVISNARENIQRLTGRGAEDSLADQAANKWGR
SGRDPNHFRPAGLPEKY
Superoxide dismutase 3 (SEQ ID NO: 22) P08294 MLALLCSCLLLAAGASDAWTGEDSAEPNSDSAEWIR
DMYAKVTEIWQEVMQRRDDDGALHAACQVQPSATL
DAAQPRVTGVVLFRQLAPRAKLDAFFALEGFPTEPN
SSSRAIHVHQFGDLSQGCESTGPHYNPLAVPHPQH
PGDFGNFAVRDGSLWRYRAGLAASLAGPHSIVGRA
VVVHAGEDDLGRGGNQASVENGNAGRRLACCVVG
VCGPGLWERQAREHSERKKRRRESECKAA
Exemplary peptides which can be detected as being representative of the protein are shown in underline and bold in Table 1 above.
In accordance with the present invention, it will be appreciated that it is not necessary for the full protein sequence to be detected in the test sample. For example, in certain embodiments, only specific fragments or peptides derived from the protein or corresponding to the protein may be detected in the sample, and will be sufficient to enable the skilled person to perform the methods of the invention.
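By way of illustration only, the correspondence between a detected peptide and the full protein sequence may be sketched as a simple substring search. The sequence below is the profilin-1 sequence (SEQ ID NO: 18) as set out in Table 1; the function name is hypothetical, and the sketch assumes an exact, unmodified match.

```python
# Profilin-1 (Uniprot P07737), as set out in Table 1.
PROFILIN_1 = (
    "MAGWNAYIDNLMADGTCQDAAIVGYKDSPSVWAAV"
    "PGKTFVNITPAEVGVLVGKDRSSFYVNGLTLGGQKC"
    "SVIRDSLLQDGEFSMDLRTKSTGGAPTFNVTVTKTD"
    "KTLVLLMGKEGVHGGLINKKCYEMASHLRRSQY"
)


def locate_peptide(protein: str, peptide: str):
    """Return the 1-based (start, end) residues of the first exact match, or None."""
    index = protein.find(peptide)
    if index < 0:
        return None
    return index + 1, index + len(peptide)
```

A peptide which is located in this way corresponds to a fragment of the protein, so that its detection in the test sample may stand in for detection of the full sequence.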
In certain preferred embodiments, the specific peptide fragments of the above listed protein biomarkers which can be detected are shown below in Table 2:
Table 2: Preferred peptide fragments for detecting biomarkers
Protein Accession number Peptide
Complement factor D sp|P00746|CFAD_HUMAN ATLGPAVRPLPWQR
Paraoxonase 1 sp|P27169|PON1_HUMAN IQN[Dea]ILTEEPK
Paraoxonase 1 sp|P27169|PON1_HUMAN STVELFKFQEEEK
Paraoxonase 1 sp|P27169|PON1_HUMAN EVQPVELPNC[CAM]NLVK
Paraoxonase 1 sp|P27169|PON1_HUMAN IQNILTEEPK
Paraoxonase 1 sp|P27169|PON1_HUMAN LIGTVFHK
Hemopexin sp|P02790|HEMO_HUMAN PPTSAHGNVAEGETKPDPDVTER
Hemopexin sp|P02790|HEMO_HUMAN PPTSAHGNVAEGETKPDPDVTE
plasma protease C1 inhibitor sp|P05155|IC1_HUMAN LLDSLPSDTRLVLLNA
plasma protease C1 inhibitor sp|P05155|IC1_HUMAN NSVIKVPMMNSK
plasma protease C1 inhibitor sp|P05155|IC1_HUMAN NLESILSYPK
plasma protease C1 inhibitor sp|P05155|IC1_HUMAN TNLESILSYPK
plasma protease C1 inhibitor sp|P05155|IC1_HUMAN FQPTLLTLPR
plasma protease C1 inhibitor sp|P05155|IC1_HUMAN LLDSLPSDTR
plasma protease C1 inhibitor sp|P05155|IC1_HUMAN LVLLNAIYLSAK
plasma protease C1 inhibitor sp|P05155|IC1_HUMAN QPTLLTLPR
Inter-alpha-trypsin inhibitor heavy chain H2 sp|P19823|ITIH2_HUMAN MLADAPPQ[Dea]DPSC[CAM]C[CAM]SGALYYGSK
Inter-alpha-trypsin inhibitor heavy chain H2 sp|P19823|ITIH2_HUMAN AEDHFSVIDFN[Dea]QNIR
Inter-alpha-trypsin inhibitor heavy chain H2 sp|P19823|ITIH2_HUMAN SPQPQNVVFDVQIPK
Inter-alpha-trypsin inhibitor heavy chain H2 sp|P19823|ITIH2_HUMAN NDLISATKTQVADAK
Inter-alpha-trypsin inhibitor heavy chain H2 sp|P19823|ITIH2_HUMAN QPSGGTNINEALLR
Inter-alpha-trypsin inhibitor heavy chain H2 sp|P19823|ITIH2_HUMAN KFYNQVSTPLLR
Inter-alpha-trypsin inhibitor heavy chain H2 sp|P19823|ITIH2_HUMAN VQFELHYQEVK
Inter-alpha-trypsin inhibitor heavy chain H2 sp|P19823|ITIH2_HUMAN TWRNDLISATK
Inter-alpha-trypsin inhibitor heavy chain H2 sp|P19823|ITIH2_HUMAN AEDHFSVIDFNQNIR
Inter-alpha-trypsin inhibitor heavy chain H2 sp|P19823|ITIH2_HUMAN SSALDMENFR
Inter-alpha-trypsin inhibitor heavy chain H2 sp|P19823|ITIH2_HUMAN NDLISATKTQVADAKR
Inter-alpha-trypsin inhibitor heavy chain H2 sp|P19823|ITIH2_HUMAN FLHVPDTFEGHFDGVPVISKGQQK
Inter-alpha-trypsin inhibitor heavy chain H2 sp|P19823|ITIH2_HUMAN IYGNQDTSSQLKK
Serum amyloid P-component (SAP) sp|P02743|SAMP_HUMAN IVLGQEQDSYGGK
Serum amyloid P-component (SAP) sp|P02743|SAMP_HUMAN AYSLFSYNTQGR
Serum amyloid P-component (SAP) sp|P02743|SAMP_HUMAN IVLGQEQDSYGGKFDR
Serum amyloid P-component (SAP) sp|P02743|SAMP_HUMAN DNELLVYK
Serum amyloid P-component (SAP) sp|P02743|SAMP_HUMAN VGEYSLYIGR
Serum amyloid P-component (SAP) sp|P02743|SAMP_HUMAN GYVIIKPLVWV
Serum amyloid P-component (SAP) sp|P02743|SAMP_HUMAN QGYFVEAQPK
Serum amyloid P-component (SAP) sp|P02743|SAMP_HUMAN AYSDLSR
Serum amyloid P-component (SAP) sp|P02743|SAMP_HUMAN IVLGQ[Dea]EQDSYGGKFDR
Serum amyloid P-component (SAP) sp|P02743|SAMP_HUMAN SLFSYNTQGR
Serum amyloid P-component (SAP) sp|P02743|SAMP_HUMAN HTDLSGKVFVFPR
Serum amyloid P-component (SAP) sp|P02743|SAMP_HUMAN DLSGKVFVFPR
Serum amyloid P-component (SAP) sp|P02743|SAMP_HUMAN Q[Dea]GYFVEAQPK
Serum amyloid P-component (SAP) sp|P02743|SAMP_HUMAN IVLGQEQDSYGGKFD
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN ISASAEELR
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN LGEVNTYAGDLQK
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN LAPLAEDVR
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN LEPYADQLR
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN SLAPYAQDTQEK
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN LTPYADEFK
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN ALVQQMEQLR
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN QLTPYAQR
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN IDQNVEELK
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN TQVNTQAEQLR
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN IDQTVEELR
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN TQVNTQAEQLRR
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN DKVNSFFSTFK
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN SLAELGGHLDQQVEEFR
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN TLSLPELEQQQEQQQEQQQEQVQMLAPLES
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN ENADSLQASLRPHADELK
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN EVSADQVAT[CAM]VMWDYFSQLSNNAK
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN EVSADQVATVMWDYFSQLSNNAK
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN SELTQQ[Dea]LNALFQ[Dea]DK
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN SELTQQ[Dea]LNALFQDK
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN SELTQ[Dea]QLNALFQDK
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN VEPYGENFNKALVQQMEQLR
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN LGPHAGDVEGHLSFLEKDLR
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN [CAM]-SLAELGGHLDQQVEEFRR
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN SLAELGGHLD[CAM]QQVEEFRR
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN RVEPYGENFNKALVQQMEQLR
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN PLAEDVRGNLR
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN ALVQQM[Oxi]EQLR
Apolipoprotein A-IV sp|P06727|APOA4_HUMAN ALVQQ[Dea]MEQLR
Hepatocyte growth factor activator sp|Q04756|HGFA_HUMAN [PGQ]-QGHVEQC[CAM]EC[CAM]FGGR
Hepatocyte growth factor activator sp|Q04756|HGFA_HUMAN SQFVQPIC[CAM]LPEPGSTFPAGHK
Hepatocyte growth factor activator sp|Q04756|HGFA_HUMAN VANYVDWINDR
Hepatocyte growth factor activator sp|Q04756|HGFA_HUMAN VQLSPDLLATLPEPASPGR
Hepatocyte growth factor activator sp|Q04756|HGFA_HUMAN LC[CAM]NIEPDER
Hepatocyte growth factor activator sp|Q04756|HGFA_HUMAN SQFVQ[Dea]PIC[CAM]LPEPGSTFPAGHK
Hepatocyte growth factor activator sp|Q04756|HGFA_HUMAN VQLSP[Oxi]D[KXX]LLATLPEPASPGR
Prothrombin sp|P00734|THRB_HUMAN QEC[CAM]SIPVC[CAM]GQDQVTVAM[Oxi]TPR
Prothrombin sp|P00734|THRB_HUMAN IVEGSDAEIGM[Oxi]SPWQVMLFR
Prothrombin sp|P00734|THRB_HUMAN LAVTTHGLPC[CAM]LAW[Ntr]ASAQAK
Prothrombin sp|P00734|THRB_HUMAN LAAC[CAM]LEGNC[CAM]AEGLGTNYR
Prothrombin sp|P00734|THRB_HUMAN TFGSGEADC[CAM]GLR[Dea]PLFEK
Prothrombin sp|P00734|THRB_HUMAN RQEC[CAM]SIP[Oxi]VC[CAM]GQDQVTVAM[Oxi]TPR
Prothrombin sp|P00734|THRB_HUMAN GSGEADC[CAM]GLRPLFEK
Prothrombin sp|P00734|THRB_HUMAN RQEC[CAM]SIPVC[CAM]GQ[Dea]DQVTVAM[Oxi]TPR
Prothrombin sp|P00734|THRB_HUMAN RQEC[CAM]SIPVC[CAM]GQDQVTVAM[Oxi]TPR
Prothrombin sp|P00734|THRB_HUMAN RQEC[CAM]SIPVC[CAM]GQDQVTVAMTPR
Prothrombin sp|P00734|THRB_HUMAN PVC[CAM]GQDQVTVAM[DTM]TPR
Prothrombin sp|P00734|THRB_HUMAN ETWTANVGKGQPSVLQVVNLPIVERPVC[CAM]K
Prothrombin sp|P00734|THRB_HUMAN TAT[Dhy]SEYQTFFNPR
Prothrombin sp|P00734|THRB_HUMAN GDAC[CAM]EGDSGGPFVM[Oxi]K
Prothrombin sp|P00734|THRB_HUMAN GDAC[CAM]EGDSGGPFVMK
Prothrombin sp|P00734|THRB_HUMAN IVEGSDAEIGMSPWQVM[Oxi]LFR
Prothrombin sp|P00734|THRB_HUMAN DKLAAC[CAM]LEGN[Oxi]C[CAM]AEGLGTNY[Oxi]R
Prothrombin sp|P00734|THRB_HUMAN DKLAAC[CAM]LEGN[Oxi]C[CAM]AEGLGTNYR
Prothrombin sp|P00734|THRB_HUMAN DKLAAC[CAM]LEGN[Dea]C[CAM]AEGLGTNYR
Prothrombin sp|P00734|THRB_HUMAN LAVTTHGLP[PGP]C[CAM]LAW
Prothrombin sp|P00734|THRB_HUMAN TSEYQTFFNPR
Prothrombin sp|P00734|THRB_HUMAN WIQKVIDQFGE
Apolipoprotein B-100 sp|P04114|APOB_HUMAN EEEMLENVSLVC[CAM]PK
Apolipoprotein B-100 sp|P04114|APOB_HUMAN M[Oxi]TSNFPVDLSDYPK
Apolipoprotein B-100 sp|P04114|APOB_HUMAN TLQGIPQMIGEVIR
Apolipoprotein B-100 sp|P04114|APOB_HUMAN KYTYNYEAE[KXX]SSSGVPGTADSR
Apolipoprotein B-100 sp|P04114|APOB_HUMAN VNWEEEAASGLLTSLKDNVPK
Apolipoprotein B-100 sp|P04114|APOB_HUMAN VPSYTLILPSLELPVLHVPR
Apolipoprotein B-100 sp|P04114|APOB_HUMAN [PGQ]-QVFLYPEKDEPTYILNIK
Apolipoprotein B-100 sp|P04114|APOB_HUMAN KGNVATEISTERDLGQC[CAM]DR
Apolipoprotein B-100 sp|P04114|APOB_HUMAN ATFQTPDFIVPLTDLR
Apolipoprotein B-100 sp|P04114|APOB_HUMAN SPAFTDLHLR
Apolipoprotein B-100 sp|P04114|APOB_HUMAN LAPGELTIIL
S100 calcium binding protein 8 sp|P05109|S10A8_HUMAN LLETEC[CAM]PQYIR
Superoxide dismutase 3 sp|P08294|SODE_HUMAN VTEIWQEVMQR
Serum amyloid protein A1 sp|P0DJI8|SAA1_HUMAN RGPGGVWAAEAISDAR
Serum amyloid protein A1 sp|P0DJI8|SAA1_HUMAN FFGHGAEDSLADQAANEWGR
Mannose receptor C type I sp|P22897|MRC1_HUMAN TGIAGGLWDVLK
Serum amyloid protein A2 sp|P0DJI9|SAA2_HUMAN RSFFSFLGEAFDGAR
Serum amyloid protein A2 sp|P0DJI9|SAA2_HUMAN SFFSFLGEAFDGAR
Serum amyloid protein A2 sp|P0DJI9|SAA2_HUMAN RGPGGAWAAEVISN[Dea]AR
Glutathione peroxidase 3 sp|P22352|GPX3_HUMAN QEPGENSEILPTLK
Glutathione peroxidase 3 sp|P22352|GPX3_HUMAN FLVGPDGIPIMR
Glutathione peroxidase 3 sp|P22352|GPX3_HUMAN YVRPGGGFVPNFQLFEK
Protocadherin gamma-A8 sp|Q9Y5G5|PCDG8_HUMAN FQVEDLEVK
Profilin-1 sp|P07737|PROF1_HUMAN TFVNITPAEVGVLVGK
Profilin-1 sp|P07737|PROF1_HUMAN STGGAPTFNVTVTK
Profilin-1 sp|P07737|PROF1_HUMAN SSFYVNGLTLGGQK
Carboxypeptidase Q sp|Q9Y646|CBPQ_HUMAN AIQIMYQNLQQDGLEK
Carboxypeptidase Q sp|Q9Y646|CBPQ_HUMAN AIINLAVYGK
Carboxypeptidase Q sp|Q9Y646|CBPQ_HUMAN VGALASLIR
Mannan-binding lectin serine protease 2 sp|O00187|MASP2_HUMAN LASPGFPGEYANDQERR
Mannan-binding lectin serine protease 2 sp|O00187|MASP2_HUMAN LASPGFPGEYANDQER
Mannan-binding lectin serine protease 2 sp|O00187|MASP2_HUMAN TPLGPKWPEPVFGR
Mannan-binding lectin serine protease 2 sp|O00187|MASP2_HUMAN SLPVCEPVCGLSAR
Mannan-binding lectin serine protease 2 sp|O00187|MASP2_HUMAN LGPKWPEPVFGR
Mannan-binding lectin serine protease 2 sp|O00187|MASP2_HUMAN WVLTAAH
Where [CAM] = cysteine carbamidomethylation and [Dea] = deamidation
In some embodiments, the above listed peptides are proteotypic (unique to the protein), i.e., there are no other known human proteins having the same peptide sequences or single amino acid variations thereof. Accordingly, when detecting proteotypic peptides, the skilled person can have greater confidence that the peptide detected corresponds to the relevant protein biomarker and not to an unrelated protein.
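The uniqueness criterion described above can be illustrated with a minimal Python sketch. It assumes the bracketed tags in the peptide table (for example [CAM] or [Dea]) are modification annotations that can be stripped to recover the bare sequence, and that a set of protein sequences is available as a simple dictionary. The function names and the dictionary layout are hypothetical and are not part of the described method.

```python
import re

# Assumption: tags such as [CAM], [Oxi] or an N-terminal "[PGQ]-" annotate
# modifications and are not part of the amino acid sequence itself.
MOD_TAG = re.compile(r"\[[^\]]*\]-?")


def bare_sequence(annotated: str) -> str:
    """Strip bracketed modification tags from an annotated peptide."""
    return MOD_TAG.sub("", annotated)


def occurs_within_one_substitution(peptide: str, protein: str) -> bool:
    """True if the peptide, or a single amino acid variant of it, occurs in the protein."""
    n = len(peptide)
    for start in range(len(protein) - n + 1):
        window = protein[start:start + n]
        mismatches = sum(a != b for a, b in zip(peptide, window))
        if mismatches <= 1:
            return True
    return False


def is_proteotypic(peptide: str, target: str, proteome: dict) -> bool:
    """True if no protein other than the target contains the peptide or a
    single-substitution variant of it. `proteome` maps a protein name to its
    sequence (hypothetical layout)."""
    return all(
        not occurs_within_one_substitution(peptide, sequence)
        for name, sequence in proteome.items()
        if name != target
    )
```

In practice the check would be run against a complete human protein sequence database; the sketch only shows the comparison logic and considers substitutions, not insertions or deletions.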
In certain embodiments, the method of determining the likelihood that an individual has cancer involves measuring the amount of any one of the biomarkers ADAM DEC1, cystatin-C and complement factor D in the test sample. In a further embodiment, the method of the present invention involves determining the amount of two or more of the biomarkers. For example, in one embodiment, the method involves measuring the amount of ADAM DEC1 and cystatin-C in the test sample. Alternatively, the method involves measuring the amount of ADAM DEC1 and complement factor D in the test sample. In yet a further embodiment, the method involves measuring the amount of cystatin-C and complement factor D in the test sample.
In yet a further embodiment, the method involves measuring the amount of all three of ADAM DEC1, cystatin-C and complement factor D in the test sample.
Still further, the method of the present invention involves determining the amount of any one or more of the following biomarkers (which the inventors have shown to have altered levels in serum in individuals with cancer, as compared with healthy individuals):
• hemopexin,
• plasma C1 inhibitor,
• inter-alpha trypsin inhibitor heavy chain H2,
• paraoxonase 1,
• prothrombin,
• hepatocyte growth factor activator,
• serum amyloid P component,
• apolipoprotein A-IV,
• apolipoprotein B-100,
• carboxypeptidase Q,
• glutathione peroxidase 3,
• mannan-binding lectin serine protease 2,
• mannose receptor C type I,
• protocadherin gamma-A8,
• profilin-1,
• S100 calcium binding protein 8,
• serum amyloid protein A1,
• serum amyloid protein A2, and
• superoxide dismutase 3
wherein the individual for whom the biomarkers are measured will be determined to have a high likelihood of having cancer when:
- the amount of any one or more of hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, and/or serum amyloid protein A2 in the test sample is greater than the amount of the same protein in a reference data set (in the form of representative data of one or more individuals who do not have cancer); and/or
- the amount of any one or more of plasma protease C1 inhibitor, paraoxonase 1, inter-alpha trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3 in the test sample is lower than the amount of the same protein biomarker in the reference data set.
Still further, the individual for whom the biomarker levels are measured will be determined to have a low likelihood of having cancer when:
- the amount of any one or more of hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, and/or serum amyloid protein A2 in the test sample is the same or lower than the amount of the same protein in the reference data set; and/or
- the amount of any one or more of plasma protease C1 inhibitor, paraoxonase 1, inter-alpha trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3 in the test sample is the same or greater than the amount of the same protein biomarker in the reference data set.
It will therefore be appreciated that the methods of the present invention contemplate methods for determining the likelihood of an individual having cancer (or having had a successful treatment for cancer as the case may be), wherein the methods include determining the levels of a single biomarker as described herein, in a sample of biological fluid from the individual. The present invention also includes methods which include determining the levels of two, three, four, five, six, seven, eight, nine, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21 or 22 biomarkers as described herein.
For example, it will be appreciated that the confidence with which the skilled person can determine the likelihood of an individual having a cancer increases with the number of protein biomarkers measured. Where the method involves measuring all three of ADAM DEC1, cystatin-C and complement factor D, and the amount of ADAM DEC1 in the test sample is greater than in a reference data set in the form of data representative of one or more individuals who do not have cancer, and the amounts of cystatin-C and complement factor D are lower than the amounts of the same protein biomarkers in the reference data set, the power of the determination that the individual likely has cancer is greater. In other words, the more biomarkers used in the methods of the present invention, the greater the sensitivity of the method. In comparison, if only one biomarker is measured, then the power of the determination may be reduced.
In further embodiments, the present invention provides methods for determining the amounts of one or more of ADAM DEC1, cystatin-C and complement factor D in a test sample of bodily fluid from an individual, and one or more additional biomarkers. For example, in certain embodiments, the present invention involves measuring the amount of any one of ADAM DEC1, cystatin-C or complement factor D, or any two of ADAM DEC1, cystatin-C or complement factor D, or all three of ADAM DEC1, cystatin-C and complement factor D, in conjunction with any one or more of the following:
• hemopexin,
• plasma C1 inhibitor,
• inter-alpha trypsin inhibitor heavy chain H2,
• paraoxonase 1,
• prothrombin,
• hepatocyte growth factor activator,
• serum amyloid P component,
• apolipoprotein A-IV,
• apolipoprotein B-100,
• carboxypeptidase Q,
• glutathione peroxidase 3,
• mannan-binding lectin serine protease 2,
• mannose receptor C type I,
• protocadherin gamma-A8,
• profilin-1,
• S100 calcium binding protein 8,
• serum amyloid protein A1,
• serum amyloid protein A2, and
• superoxide dismutase 3
wherein the individual for whom the biomarkers are measured will be determined to have a high likelihood of having cancer when:
- the amount of any one or more of ADAM DEC1, hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, and/or serum amyloid protein A2 in the test sample is greater than the amount of the same protein in a reference data set (in the form of representative data of one or more individuals who do not have cancer); and/or
- the amount of any one or more of cystatin-C, complement factor D, plasma protease C1 inhibitor, paraoxonase 1, inter-alpha trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3 in the test sample is lower than the amount of the same protein biomarker in the reference data set.
Conversely, the individual for whom the biomarker levels are measured will be determined to have a low likelihood of having cancer when:
- the amount of any one or more of ADAM DEC1, hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, and/or serum amyloid protein A2 in the test sample is the same or lower than the amount of the same protein in the reference data set; and/or
- the amount of any one or more of cystatin-C, complement factor D, plasma protease C1 inhibitor, paraoxonase 1, inter-alpha trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3 in the test sample is the same or greater than the amount of the same protein biomarker in the reference data set.
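The directional comparison set out above may be sketched in Python as follows. This is an illustrative reading only: it assumes each biomarker is represented by a single numeric amount, that the reference data set has been reduced to one representative amount per biomarker, and that "any one or more" is read literally, so that a single biomarker changed in the stated direction gives a "high" result. The names `classify_likelihood`, `ELEVATED_IN_CANCER` and `REDUCED_IN_CANCER` are hypothetical.

```python
# Biomarkers described as greater in the test sample when cancer is likely.
ELEVATED_IN_CANCER = {
    "ADAM DEC1", "hemopexin", "serum amyloid P component",
    "apolipoprotein B-100", "glutathione peroxidase 3",
    "protocadherin gamma-A8", "S100 calcium binding protein 8",
    "serum amyloid protein A1", "serum amyloid protein A2",
}

# Biomarkers described as lower in the test sample when cancer is likely.
REDUCED_IN_CANCER = {
    "cystatin-C", "complement factor D", "plasma protease C1 inhibitor",
    "paraoxonase 1", "inter-alpha trypsin inhibitor heavy chain H2",
    "prothrombin", "hepatocyte growth factor activator",
    "apolipoprotein A-IV", "carboxypeptidase Q",
    "mannan-binding lectin serine protease 2", "mannose receptor C type I",
    "profilin-1", "superoxide dismutase 3",
}


def marker_supports_cancer(name, test_amount, reference_amount):
    """True if the biomarker differs from the reference in the direction
    associated with cancer in the description."""
    if name in ELEVATED_IN_CANCER:
        return test_amount > reference_amount
    if name in REDUCED_IN_CANCER:
        return test_amount < reference_amount
    raise KeyError(f"unknown biomarker: {name}")


def classify_likelihood(test, reference):
    """Return ("high" or "low", per-biomarker flags) for the measured biomarkers.

    Assumption: one biomarker changed in the stated direction is sufficient
    for "high", following the literal "any one or more" wording.
    """
    flags = {
        name: marker_supports_cancer(name, amount, reference[name])
        for name, amount in test.items()
    }
    return ("high" if any(flags.values()) else "low"), flags
```

Returning the per-biomarker flags alongside the overall result lets a reader see how many of the measured biomarkers agree, which matters because the description ties confidence to the number of biomarkers measured.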
Protein detection techniques
The present invention involves measuring in a sample of bodily fluid obtained from an individual, the amount of one or more proteins in the sample that are indicative of colorectal cancer. For example, the method may comprise contacting a biopsy, including a sample of bodily fluid derived from the subject, with a compound capable of binding to a protein biomarker, and detecting the formation of a complex between the compound and the biomarker polypeptide. The term "protein biomarker" as used herein includes fragments of biomarker protein, including for example, immunogenic fragments and epitopes of the biomarker polypeptide. In one embodiment, the compound that binds the biomarker is an antibody.
The term "antibody" as used herein includes intact molecules as well as molecules comprising or consisting of fragments thereof, such as, for example Fab, F(ab')2, Fv and scFv, as well as engineered variants including diabodies, triabodies, mini-bodies and single-domain antibodies which are capable of binding an epitopic determinant. Thus, antibodies may exist as intact immunoglobulins, or as modifications in a variety of forms.
In another embodiment, an antibody to a protein biomarker is detected in a patient sample, wherein the amount of the antibody in the sample is informative in relation to whether the individual has cancer. Preferred detection systems contemplated herein include any known assay for detecting proteins or antibodies in a biological test sample, such as, for example, SDS/PAGE, isoelectric focussing, 2-dimensional gel electrophoresis comprising SDS/PAGE and isoelectric focussing, an immunoassay, flow cytometry e.g. fluorescence-activated cell sorting (FACS), a detection based system using an antibody or non-antibody compound, such as, for example, a small molecule (e.g. a chemical compound, agonist, antagonist, allosteric modulator, competitive inhibitor, or noncompetitive inhibitor, of the protein). In accordance with these embodiments, the antibody or small molecule may be used in any standard solid phase or solution phase assay format amenable to the detection of proteins. Optical or fluorescent detection, such as, for example, using mass spectrometry, MALDI-TOF, biosensor technology, evanescent fiber optics, or fluorescence resonance energy transfer, is clearly encompassed by the present invention. Assay systems suitable for use in high throughput screening of mass samples, e.g. a high throughput spectroscopy resonance method (e.g. MALDI-TOF, electrospray MS or nano-electrospray MS), are also contemplated. Another suitable protein detection technique involves the use of Multiple Reaction Monitoring (MRM) or Parallel Reaction Monitoring (PRM) in LC-MS (LC/MRM-MS and LC/PRM-MS) or SWATH-MS as described in [11].
Immunoassay formats are particularly suitable for detecting protein biomarkers in accordance with the method of the instant invention and include, for example, immunoblot, Western blot, dot blot, enzyme linked immunosorbent assay (ELISA), radioimmunoassay (RIA) and enzyme immunoassay. Modified immunoassays utilizing fluorescence resonance energy transfer (FRET), isotope-coded affinity tags (ICAT), matrix-assisted laser desorption/ionization time of flight (MALDI-TOF), electrospray ionization (ESI), biosensor technology, evanescent fiber-optics technology or protein chip technology are also useful.
In more detail, immunoprecipitation is the technique of precipitating an antigen out of solution using an antibody specific to that antigen. The process can be used to identify protein complexes present in cell extracts by targeting a protein believed to be in the complex. The complexes are brought out of solution by insoluble antibody-binding proteins isolated initially from bacteria, such as Protein A and Protein G. The antibodies can also be coupled to sepharose beads that can easily be isolated out of solution. After washing, the precipitate can be analyzed using mass spectrometry, Western blotting, or any number of other methods for identifying constituents in the complex.
An ELISA, short for Enzyme-Linked Immunosorbent Assay, is a biochemical technique to detect the presence of an antibody or an antigen in a sample. It utilizes a minimum of two antibodies, one of which is specific to the antigen and the other of which is coupled to an enzyme. The second antibody will cause a chromogenic or fluorogenic substrate to produce a signal. Variations of ELISA include sandwich ELISA, competitive ELISA, and ELISPOT. Because the ELISA can be performed to evaluate either the presence of antigen or the presence of antibody in a sample, it is a useful tool both for determining serum antibody concentrations and also for detecting the presence of antigen.
Quantitative immuno-polymerase chain reaction (qIPCR) utilizes nucleic acid amplification techniques to increase signal generation in antibody-based immunoassays. The target proteins are bound to antibodies which are directly or indirectly conjugated to oligonucleotides. Unbound antibodies are washed away and the remaining bound antibodies have their oligonucleotides amplified. Protein detection occurs via detection of amplified oligonucleotides using standard nucleic acid detection methods, including real-time methods. Exemplary methods for performing iPCR are described in Niemeyer et al., (2007) Nature Protocols, 2:1918-30.
Multiplexing systems such as Proseek Proximity Extension Assay and Bioplex Multiplex Assay are examples of suitable platforms for conducting immunoassays for the purposes of determining the amounts of the protein biomarkers herein described.
In certain embodiments, the present invention relates to an array for use in detecting the amount of proteins, or fragments thereof, in a sample, the array including a solid support and antibodies attached to the solid support, wherein the antibodies are capable of binding to one or more of ADAM DEC1, cystatin-C and complement factor D, or fragments or variations thereof.
In one embodiment, the support comprises antibodies to ADAM DEC1 and cystatin-C. In alternative embodiments, the support comprises antibodies to ADAM DEC1 and complement factor D. In another embodiment, the support comprises antibodies to cystatin-C and complement factor D. In a particularly preferred embodiment, the support comprises antibodies which are capable of binding to each of ADAM DEC1, cystatin-C and complement factor D.
In some embodiments, the sample of bodily fluid is subjected to preliminary processing designed to isolate or enrich the sample for low abundance proteins. In a preferred embodiment, a blood plasma or serum sample is subjected to immunodepletion using the Abundant Protein Immunodepletion (API) method as described in US 2011/008900 and AU 2002951240. Briefly, API columns are prepared using HPLC purified anti-human plasma chicken polyclonal IgYs that were derived from seven protein repetitive orthogonal offline fractionation (PROOF) fractions. Purified IgYs are immobilised on resin and used to remove abundant proteins from the test plasma or serum samples.
In yet a further preferred embodiment, a combination of immunodepletion strategies can be used to deplete high and medium abundance proteins prior to performing the protein detection. For example, in one embodiment, the API depleted plasma sample is subjected to further immunodepletion, using a commercially available immunodepletion method which removes a number of other high abundance proteins.
The skilled person will be familiar with a number of different immune-based affinity depletion methods for removing high abundance proteins, including the MARS kits (multiple affinity removal system) sold by Agilent Technologies and the ProteoPrep immunodepletion kit from Sigma.
Reference data set
In determining the likelihood that an individual has a cancer using the methods of the present invention, a determination is made by reference to the measured amount of one or more protein biomarkers in a test sample of bodily fluid and the amount of the same protein biomarker in a relevant reference data set. The reference data set may be in the form of representative data from one or more healthy individuals, more particularly, individuals who do not have cancer.
The reference data set will contain sufficient representative data to enable the skilled person to determine, with an appropriate degree of certainty, whether there is a high or low likelihood that an individual has cancer. For example, in certain embodiments, the reference data set contains reference data from at least 10 individuals who have undergone various screening tests to determine the absence of cancer. The skilled person will also appreciate the greater prospects of correctly diagnosing an individual as having cancer, if the dataset contains data from a greater number of individuals. Accordingly, in further embodiments, the reference dataset contains protein reference data from 10 or more individuals, 25 or more, 50 or more, 100 or more, 200 or more, 400 or more, 600 or more, 800 or more, or 1000 or more individuals.
The reference data set is not limited to data pertaining to the protein biomarkers identified in accordance with the invention. For example, the data set may also contain information pertaining to other biomarkers which may be used to supplement the protein expression level information and assist the skilled practitioner in making a determination on the diagnosis of cancer.
In certain embodiments, the methods of the present invention include a comparison of the amount of protein biomarker measured in a sample from the individual who is a selection candidate for a given treatment. In some embodiments, that comparison may arise from an examination of the normalised amounts of protein in a sample from the individual, and direct visual comparison against the same proteins listed in a reference database, such as, for example, an Excel spreadsheet. However, the skilled person will also appreciate that the invention is not so limited, such that the amount of protein in a sample may be measured in the same experiment as the amounts of protein in the one or more individuals making up the reference data set.
Data analysis
In some embodiments, a computer-based analysis program is used to translate the raw data generated by the detection assay (e.g., the expression level of a given marker or markers) into data of predictive value for a clinician. The clinician can access the predictive data using any suitable means. Thus, in some preferred embodiments, the present invention provides the further benefit that the clinician, who is not likely to be trained in genetics or molecular biology, need not understand the raw data. The data is presented directly to the clinician in its most useful form. The clinician is then able to immediately utilize the information in order to optimize the care of the subject.
The present invention contemplates any method capable of receiving, processing, and transmitting the information to and from laboratories conducting the assays, information providers, medical personnel, and subjects. For example, in some embodiments of the present invention, a sample (e.g., a biopsy or a serum or stool sample) is obtained from a subject and submitted to a profiling service (e.g., clinical lab at a medical facility, genomic profiling business, etc.), to generate raw data.
No greater than, in relation to the amount of a protein biomarker, means that the amount of protein in the test sample is less than, approximately the same as, or no more than 10% greater than the amount of the same protein biomarker in the representative data set. Preferably, the amount of protein biomarker is no more than 5% greater than the amount of the same protein biomarker in the representative data set.
In relation to the amount of a protein biomarker, substantially less than means that the amount of biomarker in the test sample is more than 10% less than the amount of biomarker in the representative data set.
The same as, in relation to the amount of protein biomarker in the reference data set, means an amount that is no more than 5% more or less than the amount of the measured protein biomarker.
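The percentage-based definitions above can be expressed as simple relative comparisons. The following Python sketch is illustrative only and assumes that all percentages are taken relative to the reference amount, which the text does not state explicitly for the "same as" definition. The function names are hypothetical.

```python
def relative_difference(test_amount: float, reference_amount: float) -> float:
    """Difference between test and reference as a fraction of the reference.

    Assumption: percentages in the definitions are relative to the reference.
    """
    if reference_amount <= 0:
        raise ValueError("reference amount must be positive")
    return (test_amount - reference_amount) / reference_amount


def is_no_greater_than(test_amount, reference_amount, tolerance=0.10):
    """Less than, about the same as, or at most `tolerance` above the reference.

    The text gives 10% as the limit and 5% as the preferred limit.
    """
    return relative_difference(test_amount, reference_amount) <= tolerance


def is_substantially_less_than(test_amount, reference_amount, margin=0.10):
    """More than `margin` (10% in the text) below the reference."""
    return relative_difference(test_amount, reference_amount) < -margin


def is_same_as(test_amount, reference_amount, tolerance=0.05):
    """Within `tolerance` (5% in the text) above or below the reference."""
    return abs(relative_difference(test_amount, reference_amount)) <= tolerance
```

Keeping the tolerances as parameters makes it straightforward to switch between the 10% limit and the preferred 5% limit without changing the comparison logic.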
In circumstances where the reference data set contains biomarker information from a large number of individuals, it will be appreciated that there may be a need for statistical analyses to accurately determine the significance of any similarity or differences, as the case may be, between the amount of biomarker in the test sample and the amount of biomarker in the reference data set. The skilled person will be familiar with the different statistical methods that can be used to facilitate such an analysis, for example, statistical tests based on mean (Student's t-test and extensions), Bayesian and empirical Bayesian methods, nonparametric tests, analysis of variance (ANOVA and extensions), empirical Bayes/moderated t-tests and Partial Least Squares (PLS), logistic regression analysis, full or partial least square methods, cluster analysis, machine learning techniques or techniques to analyse "big data".
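As a minimal illustration of comparing a single test amount against a reference cohort, the following Python sketch computes a standard score (z-score) using only the standard library. This is a deliberately simpler stand-in for the statistical tests named above, not one of them, and it assumes the reference amounts for a biomarker are available as a plain list of numbers. The function name is hypothetical.

```python
from statistics import mean, stdev


def standard_score(test_amount, reference_amounts):
    """Number of sample standard deviations by which the test amount differs
    from the mean of the reference amounts (positive means above the mean)."""
    if len(reference_amounts) < 2:
        raise ValueError("at least two reference amounts are required")
    spread = stdev(reference_amounts)
    if spread == 0:
        raise ValueError("reference amounts show no variation")
    return (test_amount - mean(reference_amounts)) / spread
```

How large a score must be before a difference is treated as significant is a separate choice that would need to be justified for the biomarker and cohort in question.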
Methods of determining the likelihood of successful treatment
The methods of the present invention can also be utilised to monitor the success of a treatment for cancer. Because the biomarkers identified by the instant inventors are a reflection of the body's response to a tumour, successful treatment of a tumour can also be monitored by measuring the amounts of the same biomarkers in the blood of the individual during and after treatment for the cancer. Accordingly, in yet a further embodiment, the present invention relates to a method of determining the likelihood of a treatment for a cancer in an individual being successful, including:
- providing a post-treatment test sample of bodily fluid from an individual who has received a treatment for a cancer;
- measuring the amount of one or more protein biomarkers in the test sample, wherein the one or more protein biomarkers is selected from the group consisting of: ADAM DEC1, cystatin-C, complement factor D, plasma C1 inhibitor, paraoxonase 1, hemopexin and inter-alpha-trypsin inhibitor heavy chain H2;
- determining that there is a high likelihood that the treatment was successful when:
a) the amount of ADAM DEC1 and/or hemopexin in the post-treatment test sample is lower than the amount of the same protein biomarker in a pre-treatment reference sample of bodily fluid obtained from the individual before receiving the treatment for cancer;
b) the amount of cystatin-C, complement factor D, plasma C1 inhibitor, paraoxonase 1, and/or inter-alpha-trypsin inhibitor heavy chain H2 in the post-treatment test sample is greater than the amount of the same protein biomarker in the pre-treatment reference sample;
- determining that there is a low likelihood that the treatment was successful when:
c) the amount of ADAM DEC1 and/or hemopexin in the post-treatment test sample is the same or greater than the amount of the same protein biomarker in the pre-treatment reference sample;
d) the amount of cystatin-C, complement factor D, plasma C1 inhibitor, paraoxonase 1, and/or inter-alpha-trypsin inhibitor heavy chain H2 in the post-treatment test sample is the same or lower than the amount of the same protein biomarker in the pre-treatment reference sample.
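Conditions a) to d) above compare each biomarker in the post-treatment sample with the same biomarker in the individual's own pre-treatment sample. A minimal Python sketch of that comparison follows. It assumes one numeric amount per biomarker per sample and reports a per-biomarker result rather than a single verdict, since the text does not say how disagreeing biomarkers should be combined. The names used are hypothetical.

```python
# Biomarkers whose amount is described as falling after successful treatment.
DECREASE_ON_SUCCESS = {"ADAM DEC1", "hemopexin"}

# Biomarkers whose amount is described as rising after successful treatment.
INCREASE_ON_SUCCESS = {
    "cystatin-C", "complement factor D", "plasma C1 inhibitor",
    "paraoxonase 1", "inter-alpha-trypsin inhibitor heavy chain H2",
}


def marker_indicates_success(name, pre_amount, post_amount):
    """True if the pre-to-post change matches condition a) or b)."""
    if name in DECREASE_ON_SUCCESS:
        return post_amount < pre_amount
    if name in INCREASE_ON_SUCCESS:
        return post_amount > pre_amount
    raise KeyError(f"unknown biomarker: {name}")


def treatment_response(pre, post):
    """Map each biomarker measured post-treatment to True (change consistent
    with successful treatment) or False (same, or changed the other way)."""
    return {
        name: marker_indicates_success(name, pre[name], post_amount)
        for name, post_amount in post.items()
    }
```

For instance, a fall in ADAM DEC1 together with a rise in cystatin-C would yield True for both biomarkers, whereas unchanged amounts would yield False for both, corresponding to conditions c) and d).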
In one embodiment, the methods involve measuring the amount of any one of ADAM DEC1, complement factor D, cystatin-C or paraoxonase 1 in order to determine the likelihood of a successful cancer treatment.
In yet a further embodiment, the method involves measuring the amount of at least two of ADAM DEC1, complement factor D, cystatin-C or paraoxonase 1 in order to determine the likelihood of a successful cancer treatment.
In a further embodiment, the method involves measuring the amount of at least three of ADAM DEC1, complement factor D, cystatin-C or paraoxonase 1 in order to determine the likelihood of a successful cancer treatment. In a particularly preferred embodiment, the method involves measuring the amount of ADAM DEC1, cystatin-C and complement factor D for the purposes of determining the likelihood of the treatment for cancer being successful. In yet a further preferred embodiment, the method involves measuring all four of ADAM DEC1, complement factor D, cystatin-C and paraoxonase 1 in order to determine the likelihood of a successful cancer treatment.
Treatments which are typically received by an individual for treating cancer include chemotherapy, surgical resection, radiotherapy, immunotherapy, or a combination thereof. The above method has particular utility to a clinician who is seeking to develop the most appropriate course of treatment for the individual suffering from cancer. This method also provides a non-invasive means for assessing the success of a treatment plan, which provides a further advantage in that the individual who has received the treatment for cancer can be assessed without the need for further interventions or procedures which may place the individual at risk of infection or physical stress. The present methods therefore enable the physician to quickly and accurately make an assessment of the success of previous or ongoing treatment. Having this information rapidly at hand (for example, without having to wait for the patient to be in condition for more invasive procedures) allows the physician to make more timely decisions in relation to the future course of therapy. For example, depending on the results of the assessment using the method of the present invention, the treating physician may decide to cease the ongoing treatment in favour of an alternative treatment. For example, if the patient was receiving chemotherapy for the treatment of cancer, or other systemic treatment, the physician may decide, in view of the evidence suggesting success of this treatment plan, that going forward, radiotherapy (or other localised treatment) may be more appropriate. Alternatively, if the above methods indicate to the physician that treatment had not been successful, they may decide to pursue a more aggressive form of treatment.
For example, in circumstances where a patient was receiving radiotherapy for the treatment of a cancer, and the physician determined this was not successful in treating the cancer, the physician may decide to perform surgery and/or chemotherapy or immunotherapy to more aggressively treat the cancer.
Selection of treatment plan
The methods of the present invention also facilitate the determination of an appropriate treatment plan for an individual suspected of having or at risk of cancer, including a colorectal cancer. The present methods enable determination of whether or not to treat an individual for cancer, for example when the individual is suspected of having cancer as a result of another test which is less sensitive than the methods of the present invention and which suggests that the individual may have cancer (e.g. a positive FOBT test or a scan which indicates the presence of a mass in the individual).
Accordingly, in a further aspect, the invention relates to a method for determining whether to treat an individual for cancer, the method including:
- providing a test sample of bodily fluid from an individual;
- measuring the amount of one or more protein biomarkers in the test sample, wherein the one or more protein biomarkers is selected from the group consisting of: ADAM DEC1, cystatin-C and complement factor D;
- determining to treat the individual for cancer when:
a) the amount of ADAM DEC1 is greater than the amount of ADAM DEC1 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
b) the amount of cystatin-C and/or complement factor D is lower than the amount of the same protein biomarker in the reference data set;
- determining not to treat the individual for cancer when:
c) the amount of ADAM DEC1 is the same or lower than the amount of ADAM DEC1 in the reference data set;
d) the amount of cystatin-C and/or complement factor D is the same or greater than the amount of the same protein biomarker in the reference data set.
Determining the stage of a neoplasm (i.e., the amount of penetration of a particular cancer) is extremely valuable since this enables the physician to identify the most appropriate treatment plan for the patient, which may ultimately lead to complete remission. As previously discussed, early-stage CRC has a high 5-year survival rate (>90%) following simple surgical resection, while late-stage CRC has a dramatically lower 5-year survival rate (<10%) [2]. Thus, early detection of colorectal malignancies would significantly reduce the burden of this disease and allow for early intervention. Although extensive research has revealed an abundant number of screening strategies, early detection of CRC remains difficult. Currently, staging of large intestine neoplasms is an invasive procedure since it requires the harvesting of a tissue specimen which is histologically analysed. In the context of large intestine neoplasms, the histological analysis of tissue specimens is both relatively slow and highly invasive. The development of a means to reliably and routinely assess a patient to determine whether an identified neoplasm is premalignant (e.g., a polyp or adenoma), early stage (adenocarcinoma, pre-metastatic) or late stage (e.g., metastatic) is highly desirable if it can be performed quickly and repeatedly, since this would enable decisions in relation to treatment regimens to be made and implemented more accurately. It would also enable ongoing monitoring to be performed during a treatment regime, such as in the context of treating an adenoma or early stage cancer, to assess transition to a more advanced stage without the need to perform invasive biopsies. This would also enable more flexibility in terms of adapting treatment regimens to reflect changes to the stage of a neoplasm.
The skilled person will be familiar with currently used methods for screening for colorectal cancer. For example, measurement of carcinoembryonic antigen (CEA) in serum is commonly used to monitor colorectal carcinoma treatment, to identify recurrences after surgical resection, for staging, or to localize cancer spread through measurement of biological fluids. However, the CEA blood test is not reliable for diagnosing cancer or as a screening test for early detection of cancer.
The present inventors have found that by coupling known, less sensitive diagnostic tests with the methods of the present invention, it is possible to determine an appropriate therapeutic plan for the individual. For example, while the CEA blood test is unable to accurately detect the presence of early stage cancer, the method of the present invention is sufficiently sensitive to detect early stage cancers.
Accordingly, the inventors have found that a negative CEA test result coupled with a positive result using the preferred method of the instant invention is indicative of early stages of colorectal cancer (for example, Dukes stages A/B colorectal cancer). Positive test results using both CEA and the method of the present invention are indicative of late stages of colorectal cancer (Dukes stages C/D).
Accordingly, in a further embodiment, the present invention relates to a method of selecting a cancer treatment for an individual including:
- providing an individual for whom a CEA blood test result has been obtained;
- measuring the amount of one or more protein biomarkers in a test sample of bodily fluid obtained from the individual, wherein the one or more protein biomarkers is selected from the group consisting of: ADAM DEC1, cystatin-C and complement factor D;
- determining to select the individual for chemotherapy and/or radiotherapy when the CEA blood test result does not indicate that the individual has cancer and
a) the amount of ADAM DEC1 is greater than the amount of ADAM DEC1 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
b) the amount of cystatin-C and/or complement factor D is lower than the amount of the same protein biomarker in the reference data set;
- determining to select the individual for surgical treatment for cancer when the CEA blood test result indicates that the individual has cancer and
a) the amount of ADAM DEC1 is greater than the amount of ADAM DEC1 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
b) the amount of cystatin-C and/or complement factor D is lower than the amount of the same protein biomarker in the reference data set.
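The selection logic set out in the method above can be sketched as follows. This is a minimal illustration only, not an implementation taken from the specification: the class name, function name and the "none indicated" return value are hypothetical, and the comparison of each biomarker against the reference data set is assumed to have been made already.

```python
from dataclasses import dataclass


@dataclass
class BiomarkerComparison:
    """Direction of each biomarker relative to the cancer-free reference data set."""
    adam_dec1_greater: bool           # ADAM DEC1 amount greater than reference
    cystatin_c_lower: bool            # cystatin-C amount lower than reference
    complement_factor_d_lower: bool   # complement factor D amount lower than reference

    def is_positive(self) -> bool:
        # Limbs a) and b) are joined by "and/or", so any one criterion suffices.
        return (self.adam_dec1_greater
                or self.cystatin_c_lower
                or self.complement_factor_d_lower)


def select_treatment(cea_indicates_cancer: bool, panel: BiomarkerComparison) -> str:
    """Return the treatment category named in the method, or a hypothetical
    'none indicated' label when no biomarker criterion is met."""
    if not panel.is_positive():
        return "none indicated"  # assumption: the method is silent on this case
    if cea_indicates_cancer:
        return "surgical treatment"
    return "chemotherapy and/or radiotherapy"
```

For example, `select_treatment(False, BiomarkerComparison(True, False, False))` returns the chemotherapy and/or radiotherapy category, reflecting a negative CEA result combined with elevated ADAM DEC1.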
As used herein, the term "individual" refers to any animal (e.g., a mammal), including, but not limited to, humans, non-human primates, rodents, and the like, which is to be the recipient of a particular treatment. Typically, the terms "individual", "subject" and "patient" are used interchangeably herein in reference to a human subject.
As used herein, the term "non-human animals" refers to all non-human animals including, but not limited to, vertebrates such as rodents, non-human primates, ovines, bovines, ruminants, lagomorphs, porcines, caprines, equines, canines, felines, aves, etc.
Compositions and kits
The present invention provides kits for the diagnosis or detection of cancer, in particular colorectal cancer. Such kits may be suitable for detection of nucleic acid species, or alternatively may be for detection of a protein, as discussed above. For detection of biomarker proteins, antibodies will most typically be used as components of kits. However, any agent capable of binding specifically to a biomarker gene product will be useful in this aspect of the invention. Other components of the kits will typically include labels, secondary antibodies, substrates (if the protein is an enzyme), inhibitors, co-factors and control gene product preparations to allow the user to quantitate expression levels and/or to assess whether the diagnosis experiment has worked correctly. Enzyme-linked immunosorbent assay-based (ELISA) tests and competitive ELISA tests are particularly suitable assays that can be carried out easily by the skilled person using kit components.
Optionally, the kit further comprises means for the detection of the binding of an antibody to a biomarker polypeptide. Such means include a reporter molecule such as, for example, an enzyme (such as horseradish peroxidase or alkaline phosphatase), a dye, a radionuclide, a luminescent group, a fluorescent group, biotin or a colloidal particle, such as colloidal gold or selenium. Preferably such a reporter molecule is directly linked to the antibody. In yet another embodiment, a kit may additionally comprise a reference sample.
The kits provided for in the present invention include kits which contain means for determining or measuring the amount of any one or more of the protein biomarkers listed herein (and as shown in Table 1).
In one embodiment, a reference sample comprises a polypeptide that is detected by an antibody. Preferably, the polypeptide is of known concentration. Such a polypeptide is of particular use as a standard. Accordingly, various known concentrations of such a polypeptide may be detected using a diagnostic assay described herein.
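As an illustration of how a polypeptide standard of known concentration might be used for quantitation, the sketch below estimates the concentration of an unknown from its assay signal by piecewise-linear interpolation between standards. The function name, data layout and the choice of linear interpolation are assumptions made for illustration (the specification does not prescribe a curve-fitting method), and the signal is assumed to increase monotonically with concentration.

```python
from bisect import bisect_left


def interpolate_concentration(standards, signal):
    """Estimate concentration from an assay signal.

    standards: list of (known_concentration, measured_signal) pairs obtained
    from the reference polypeptide (hypothetical data layout).
    signal: measured signal of the unknown sample.
    """
    pts = sorted(standards, key=lambda p: p[1])  # order standards by signal
    signals = [s for _, s in pts]
    if not signals[0] <= signal <= signals[-1]:
        raise ValueError("signal outside the range covered by the standards")
    i = bisect_left(signals, signal)
    if signals[i] == signal:
        return pts[i][0]  # signal coincides with a standard
    (c0, s0), (c1, s1) = pts[i - 1], pts[i]
    # Linear interpolation between the two bracketing standards.
    return c0 + (c1 - c0) * (signal - s0) / (s1 - s0)
```

Signals outside the range of the standards are rejected rather than extrapolated, since extrapolation beyond a standard curve is generally unreliable.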
It will be understood that the invention disclosed and defined in this specification extends to all alternative combinations of two or more of the individual features mentioned or evident from the text or drawings. All of these different combinations constitute various alternative aspects of the invention.
Examples
Example 1: Determination of biomarkers
1. Methods
Patient plasma samples
Clinically staged CRC (Dukes' A, B, C and D) and control EDTA-plasma samples were obtained from 100 individuals. These were either Dukes' staged CRC patients (20 patients for each of stages A-D) or apparently healthy, disease-unaffected controls at the Victoria Cancer Biobank (n=20, subsequently called group E). Samples were stringently age and sex matched, with strict inclusion/exclusion criteria applied to minimize variation within the study population. In detail, the study population was a 50:50 mixture of females and males, aged between 50 and 80, for each CRC stage and for the healthy unaffected controls.
Samples were collected from CRC patients diagnosed with benign or malignant tumours before they underwent any treatment or surgery for CRC. In this study, the term 'benign' was used to reflect the clinical definition, i.e., a tumour which has not spread beyond the muscularis propria and where lymph nodes are not involved (corresponding to Dukes Stages A and B). Malignant tumours were defined as tumours which had penetrated the muscularis propria, including involvement of the lymph nodes, as well as metastatic cancer (i.e., spread to other parts of the body). Malignant tumours correspond to Dukes Stages C and D. The control or unaffected plasma samples were collected from 20 individuals who were age-matched to the clinical CRC plasma donors and had no apparent evidence of disease (i.e., no evidence of inflammation or metastatic conditions, and no previous history of tumour, cancer or major therapy).
Preparation of API column using purified IgY antibodies
Previous literature published by our research group has reported API as an efficient immunodepletion tool that depletes the HAPs present in plasma [5]. The HPLC purified anti-human plasma chicken polyclonal IgYs derived from the seven protein repetitive orthogonal offline fractionation (PROOF) fractions [5] were used to develop an immunoaffinity depletion column. In detail, the purified IgYs were immobilised on UltraLink Hydrazide Gel (Pierce) following the manufacturer's instructions. The seven PROOF fractions were mixed in a 0.2:0.2:0.2:0.2:0.2:1:1 ratio and buffer exchanged with 0.1 M sodium phosphate (pH 7.0). The combined antibodies were then oxidized using sodium meta-periodate (5 mM) and incubated for 15 min at room temperature, with subsequent desalting using 100 mM sodium phosphate (pH 7.0) in an Amicon Ultra-15 centrifugal filter unit (30 kDa NMWL). Then a 20 mL volume of gel slurry was placed into a disposable column (BioRad) and allowed to settle for 15 min, followed by draining of the remaining liquid. The column was then equilibrated with 5 gel-bed volumes of 100 mM sodium phosphate (pH 7.0), and the oxidized IgY was added to the gel (~1 mL of oxidized protein per mL of gel) and incubated overnight at room temperature. The column was then washed with one gel-bed volume of 100 mM sodium phosphate, pH 7.0. Finally, the prepared UltraLink-IgY immunodepletion media was transferred and packed into a Xk26/20 column (GE Healthcare, Uppsala, Sweden) and washed with 5 gel-bed volumes of 1 M NaCl (wash solution) followed by 5 gel-bed volumes of PBS containing 0.05% sodium azide.
Immunodepletion of CRC plasma using MARS-14
A Multiple Affinity Removal System™ (MARS-14) column (4.6 × 100 mm) (Agilent Technologies, Palo Alto, CA) was used to deplete 14 high abundance proteins (albumin, IgG, antitrypsin, IgA, transferrin, haptoglobin, fibrinogen, alpha2-macroglobulin, alpha1-acid glycoprotein, IgM, apolipoprotein AI, apolipoprotein AII, complement C3, and transthyretin). The depletion was performed at room temperature with an Agilent 1260 Infinity HPLC system according to the manufacturer's instructions. Briefly, plasma samples were diluted four-fold using the load/wash buffer supplied by the manufacturer, and remaining particulates in the diluted plasma were removed by centrifugation through a 0.22-μm spin filter for 1 min at 16,000 × g. The MARS-14 column was equilibrated with the load/wash buffer and the diluted plasma was loaded at a low flow rate (0.125 mL/min) for 18 min and then for an additional 2 min at a flow rate of 1 mL/min. The other binding and elution steps were identical to those used for the MARS-14 column. Both depleted (flow-through fraction, FT) and abundant plasma proteins (bound fraction, BF) were collected and stored at -20 °C. The samples (flow-through and bound) were then buffer exchanged using Amicon 3 kDa molecular weight filters (Millipore, MA) with 3 CV of PBS and stored at -80 °C.
Immunodepletion of plasma using chicken IgY antibody based API column
The API column was pre-equilibrated at 5 mL/min using PBS and 0.1 M glycine (pH 2.5). The plasma sample was injected into the column at 0.1 mL/min, with subsequent washing with 2.5 column volumes (CV) of PBS, first at 0.05 mL/min for 3 min and then at 5 mL/min. The bound proteins were eluted with 4 CV of glycine buffer (0.1 M, pH 2.5) at 5 mL/min and the column was re-equilibrated with 5 CV of binding buffer at a flow rate of 5 mL/min. Proteins collected from the column (flow-through and API-bound) were then buffer exchanged using Amicon 3 kDa molecular weight filters (Millipore, MA) with 3 CV of PBS and stored at -80 °C.
Ultradepletion of plasma
The API depleted plasma samples were injected into the MARS-14 column following the same protocol described above. The depleted LAPs (flow-through fraction) and abundant HAPs (bound fraction) were collected, buffer exchanged and stored at -80 °C.
Sample preparation and protein digestion for LC-MS analysis
Protein concentrations in the crude and depleted plasma were determined by the BCA protein assay (Pierce-Thermo Scientific, Rockford, IL) following the microplate procedure. The plasma samples were reduced with 5 μL of reduction solution (200 mM DTT in 50 mM ammonium bicarbonate) and incubated for 60 min at room temperature. Then 10 μL of iodoacetamide solution (200 mM iodoacetamide in 50 mM ammonium bicarbonate) was added to the mixture and incubated for 60 min at room temperature in the dark. Finally, trypsin solution (0.1 μg/μL trypsin in 50 mM ammonium bicarbonate) was added in the appropriate ratio (1:20) to the amount of protein by weight. The sample mixture was then digested overnight. The digested samples were then subjected to SCX fractionation and SWATH™ MS analysis.
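The volume of trypsin solution implied by the 1:20 (w/w) ratio and the 0.1 μg/μL stock concentration can be worked out as in the sketch below. The function name and parameters are hypothetical, and the 1:20 ratio is read here as trypsin to protein by weight.

```python
def trypsin_volume_ul(protein_ug, ratio=20.0, trypsin_ug_per_ul=0.1):
    """Volume of trypsin stock (in microlitres) giving a 1:ratio
    trypsin-to-protein ratio by weight.

    protein_ug: amount of protein in the sample, in micrograms.
    ratio: protein mass per unit mass of trypsin (20 for a 1:20 ratio).
    trypsin_ug_per_ul: stock concentration in micrograms per microlitre.
    """
    trypsin_ug = protein_ug / ratio          # mass of trypsin required
    return trypsin_ug / trypsin_ug_per_ul    # convert mass to stock volume
```

Under these assumptions, a sample containing 100 μg of protein requires 5 μg of trypsin, i.e. about 50 μL of the 0.1 μg/μL stock.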
SCX Fractionation of MARS-14 depleted sample
To maximize the proteome coverage of the individual specimens, the MARS-14 depleted samples were fractionated by strong cation exchange (SCX) chromatography. Prior to SCX fractionation, samples were cleaned up using a Sep-Pak Light C18 cartridge (part number WAT023501) following the manufacturer's instructions. Samples were then fractionated by SCX chromatography using an Agilent 1100 quaternary HPLC pump with a PolyLC polysulfoethyl aspartamide column (200 mm × 4.6 mm, 5 μm, 200 Å; PolyLC, Columbia, MD). The column was equilibrated with Buffer A (5 mM KH2PO4, pH 2.6, 25% v/v ACN), which was also used for sample resuspension, sample injection and peptide adsorption to the column. Peptide elution was achieved with a 75 min gradient to 100% Buffer B (5 mM KH2PO4, 350 mM KCl, pH 2.6, 25% v/v ACN) at a flow rate of 0.3 mL/min. Peptides were collected in 2 to 4 min increments and dried in a vacuum centrifuge. Fractions were resuspended in 2% v/v ACN and 0.1% v/v formic acid and, based on the intensity of the SCX chromatographic UV trace, combined into a total of six fractions.
DDA mass spectrometry for spectral library generation
For the comprehensive shotgun analysis, crude CRC plasma samples were depleted with the multiple affinity removal system (Agilent Technologies) to remove the 14 most abundant proteins according to the manufacturer's instructions. Depleted samples were buffer exchanged using Amicon spin filters with a 3,000 Da molecular weight cut-off (Millipore) and digested with trypsin as detailed above. 200 μg of the resulting peptides was then separated into six fractions by strong cation exchange (SCX) fractionation as discussed above. The depleted samples were also directly digested, pooled and analyzed without SCX fractionation. The samples were then analysed using an AB Sciex 5600 and Eksigent NanoLC Ultra 2D Plus HPLC system. In detail, 10 μL of each sample was injected onto a peptide trap (Optimize Technologies OPTI-TRAP CAP, 0.5 mm × 1.3 mm) for pre-concentration and desalted with 0.1% formic acid, 2% ACN, at 10 μL/min for 5 min. The peptides were then separated using a 150-min gradient from 2 to 40% (Buffer A: 0.1% (v/v) formic acid; Buffer B: 0.1% (v/v) formic acid, 99.9% (v/v) acetonitrile) at a flow rate of 550 nL/min. MS1 spectra were collected in the range 380-1500 m/z. The 10 most intense precursors with charge state 2-5 which exceeded 150 counts per second were selected for fragmentation, and MS2 spectra were collected in the range 100-1500 m/z for 200 ms. Precursor ions were dynamically excluded from reselection for 20 s.
SWATH™-MS measurement
SWATH™-MS measurements were performed with peptide mixtures generated by digesting depleted (MARS, API, combination of MARS and API) plasma samples. The unfractionated, total peptide samples were analyzed to minimize confounding factors introduced by sample handling. As suggested by previous literature, the same LC-MS/MS system used for DDA measurements was used for SWATH™ analysis [9, 11, 12]. To minimize bias caused by instrument condition, SWATH™ data were acquired in random order for the samples, with one blank run between every sample.
For SWATH™ MS, m/z window sizes were determined from precursor m/z frequencies (m/z 400-1250) observed in Information-Dependent Acquisition (IDA) data (SWATH™ variable window acquisition, 60 windows in total). In SWATH™ mode, first a TOFMS survey scan was acquired (m/z 380-1500, 0.05 sec), then the 60 predefined m/z ranges were sequentially subjected to MS/MS analysis. MS/MS spectra were accumulated for 60 milliseconds (duty cycle of ~3.7 s) in the mass range m/z 380-1500, with rolling collision energy optimised for the lowest m/z in each m/z window +10%.
Following data collection, the SWATH™ protein area data were normalized by Total Area Normalization (as described in Wu et al., Mol. Cell Proteomics 2016, herein incorporated by reference in its entirety). Briefly, the data were entered into a boxplot, wherein the values for the Y axis were calculated as follows: Y = loge(normalized peak area), where the normalized peak area is calculated by the method explained in Wu et al., 2016.
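The normalization and log transformation described above can be sketched as follows. This is a generic total-area scaling written for illustration; the exact procedure of Wu et al. (2016) is not reproduced in this specification, so the choice of scaling every sample to the mean total area, along with the function name and data layout, is an assumption.

```python
import math


def total_area_normalize(areas_by_sample):
    """Scale each sample so its summed protein peak area equals the mean
    total area across samples, then take the natural logarithm.

    areas_by_sample: {sample_id: {protein_id: peak_area}} with positive areas.
    Returns {sample_id: {protein_id: ln(normalized_peak_area)}}.
    """
    totals = {s: sum(prots.values()) for s, prots in areas_by_sample.items()}
    target = sum(totals.values()) / len(totals)  # assumed common target total
    return {
        s: {p: math.log(area * target / totals[s]) for p, area in prots.items()}
        for s, prots in areas_by_sample.items()
    }
```

After this step, two samples that differ only in overall loading (for example, one with exactly twice the signal of the other for every protein) yield identical log-transformed values.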
Once the SWATH™ quantitation data is generated, the following basic statistical analysis of the SWATH™ data is carried out, for each of the libraries used, whether local or extended:
1. A one-way ANOVA analysis is run, identifying all proteins that change in abundance between the experimental groups identified. The analysis is run separately for each protein, using the log-transformed protein peak areas that have been previously normalized to account for unequal total area, with proteins deemed to be differentially expressed based on a p-value and fold change criterion (ANOVA BH FDR-corrected p-value < 0.05 and ratio between the highest and lowest group mean greater than 1.5).
2. Differentially expressed proteins are identified between any sets of two conditions identified as of interest by the user, using both a protein-level criterion and a stricter peptide-level criterion. For the peptide-level criterion, the fold changes between conditions are determined separately for each peptide and then averaged for each protein, additionally using a one-sample t-test to decide differential expression.
The statistical analysis uses in-house scripts, implemented in the R statistical analysis framework. These strategies are described in detail in Wu et al. (2016).
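The protein-level selection criterion in step 1 (BH-adjusted p-value < 0.05 and maximum fold change > 1.5) can be illustrated with the sketch below. It is written in Python for illustration and is not the in-house R code referred to above; it takes raw per-protein ANOVA p-values and group means as inputs rather than computing the ANOVA itself, and all names and the data layout are hypothetical. Group means are assumed to be positive and on the original (non-log) scale.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def differentially_expressed(proteins, alpha=0.05, min_fold=1.5):
    """Return names of proteins passing both criteria.

    proteins: {name: (raw_anova_p_value, {group: mean_peak_area})}
    """
    names = list(proteins)
    adjusted = bh_adjust([proteins[n][0] for n in names])
    hits = []
    for name, q in zip(names, adjusted):
        means = proteins[name][1].values()
        max_fc = max(means) / min(means)  # highest vs lowest group mean
        if q < alpha and max_fc > min_fold:
            hits.append(name)
    return hits
```

Applying the fold-change filter alongside the adjusted p-value excludes proteins whose group differences are statistically detectable but small in magnitude.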
Statistical Analysis
The statistical comparisons were carefully designed to identify differentially expressed proteins between stages A-D and the unaffected control, E, while minimizing the possibility of false positives. Each of the replicates was analysed using protein-level ANOVA with Benjamini and Hochberg multiple testing p-value corrections (BH-adjusted p-value < 0.05) and maximum fold change > 1.5, followed by post hoc testing (Tukey HSD) as well as peptide-level analysis, to detect the proteins showing significant changes. To determine whether the expression differences were reflected in a more general CRC staging, the individually staged data were pooled into three general cancer stages: healthy (Group E), benign (Groups A and B) and metastatic (Groups C and D) CRC. This was performed to ease further comparisons and to be more reflective of gross alterations of potential oncology biomarkers in plasma as a result of CRC progression. Due to the now unequal sample sizes between the 15 healthy samples and the 30 pooled benign or malignant samples, protein-level ANOVA with Benjamini and Hochberg multiple testing p-value corrections (BH-adjusted p-value < 0.05) and maximum fold change > 1.5 was applied. To discover which proteins were differentially expressed between the blocked CRC groups (healthy, benign and metastatic stages), a Tukey honest significant differences post hoc test was then performed with respect to the interaction between the Protein and Group factors.
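The pooling of individually staged data into the three general groups can be sketched as below. The mapping follows the grouping stated above (E healthy; A and B benign; C and D metastatic/malignant); the function name, group labels and the use of a simple mean as the group summary are assumptions for illustration.

```python
from statistics import mean

# Dukes' stage (or control group E) -> pooled group, as described in the text.
POOLED_GROUP = {
    "A": "benign",
    "B": "benign",
    "C": "malignant",
    "D": "malignant",
    "E": "healthy",
}


def pool_by_group(values_by_stage):
    """Pool per-stage measurements and return the mean for each pooled group.

    values_by_stage: {stage: [measurements]} with stages drawn from A-E.
    """
    pooled = {}
    for stage, values in values_by_stage.items():
        pooled.setdefault(POOLED_GROUP[stage], []).extend(values)
    return {group: mean(vals) for group, vals in pooled.items()}
```

Because the measurements are pooled before averaging, a stage contributing more samples carries proportionally more weight in its group mean.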
2. Results
Immunodepletion of CRC plasma samples
Abundant Protein Immunodepletion (API)
The efficiency of the API column was optimised by depleting standard plasma samples. The protein concentrations in bound and flow-through fractions from three separate depletions (API only, MARS only and API+MARS) were calculated. The BF and FT were analysed using 1D gel electrophoresis. Analysis of the depleted, LAP enriched FT confirmed that the API column had successfully removed approximately 70% of total plasma protein. Supplementary Figure 1 represents the 1D gel electrophoresis of API depleted plasma.
After successful confirmation of the API column efficiency, the pooled plasma samples from each CRC stage (A to D) and the disease unaffected healthy control (E) were depleted. For each depletion, 100 μL of plasma containing ~3-4 mg of protein was injected into the API column. The depletion chromatograms are included in Supplementary Figure 2. One dimensional (1D) gel electrophoresis was conducted on API depleted LAP fractions to confirm the enrichment of LAPs (Figure 1). The enriched LAP bands appeared near 260, 160 and 10 kDa (Figure 1b) compared with crude plasma samples (Figure 1a). However, API did not completely remove the serum albumins, indicating the necessity of a subsequent depletion step that will specifically remove most of the HAPs.
MARS-14 immunodepletion
The API depleted LAP fractions were further depleted using the MARS-14 immunoaffinity column in order to selectively remove the top 14 high abundance plasma proteins. Figure 2 represents the 1D gel electrophoresis of the LAP enriched fractions after MARS followed by API depletion. This image also illustrates a sufficient depletion of serum albumin in the depleted fractions. The depletion chromatograms for ultradepleted (API+MARS) samples are represented in Supplementary Figure 3. In addition, a batch of crude CRC plasma samples was immunodepleted using the MARS-14 column and the depleted fractions were subsequently fractionated by strong cation exchange chromatography. The fractions were subjected to mass spectrometry analysis for preparation of the IDA library. The MARS depletion chromatograms for crude CRC plasma samples and the SCX chromatograms are illustrated in Supplementary Figures 4 and 5 respectively.
SWATH™ MS Analysis
CRC plasma samples were depleted using three separate depletion strategies (MARS, API and API+MARS) and hence different subsets of differentially expressed proteins were obtained. The MARS depleted plasma samples were SCX fractionated and analysed using MS to create an IDA library, in which a total of 323 proteins were identified. In contrast, only 65 proteins were identified when the API depleted plasma samples were pooled and run in duplicate to identify the protein IDs; however, 8 of the proteins found in LAP enriched fractions (Ig alpha-2 chain C region, alpha-1-acid glycoprotein 1, beta-2-microglobulin, probable crossover junction endonuclease EME2, keratin type II cytoskeletal 2 epidermal, E3 ubiquitin-protein ligase TRAIP, ribonuclease ZC3H12A, protein dopey-1 and protein dopey-2) were not identified in MARS or API+MARS depleted fractions. In addition, 118 proteins were identified from pooled API+MARS depleted samples, where 9 unique proteins (Ig gamma-2 chain C region, Ig lambda-2 chain C regions, Ig lambda-1 chain C regions, immunoglobulin lambda-like polypeptide 5, ankyrin repeat and IBR domain-containing protein 1, alpha-1-syntrophin, contactin-1, paired mesoderm homeobox protein 2B) were identified. Figure 3 represents the numbers of total proteins identified from the three independent depletion strategies, and all protein IDs from the MARS, API and API+MARS depletions are listed in Supplementary Table 1.
SWATH™ data from each of the depleted samples were individually analysed and the stage specific protein expressions were compared using the strict statistical approach (as detailed in the Methods section). Interestingly, MARS depleted CRC samples did not exhibit any differentially expressed proteins with ANOVA analysis, while only 6 proteins were identified for API depletion; however, API+MARS depletion showed 23 differentially expressed proteins across 5 stages. This is not surprising, because the MARS and API depleted samples were run in duplicate while API+MARS depleted samples were run in triplicate; therefore higher p-values were observed for the API and MARS depleted sample sets. In contrast, peptide level analysis was done for pairwise comparison between two stages. In this analysis, 40, 53 and 38 differentially expressed proteins were identified from MARS, API and API+MARS depleted samples respectively. Figure 4 represents the number of proteins identified with the MARS, API and API+MARS depletion strategies using the protein level ANOVA, with Benjamini and Hochberg multiple testing p-value corrections (BH-adjusted p-value < 0.05), maximum fold change > 1.5, followed by peptide level analysis.
Supplementary Table 2 lists the proteins that are differentially expressed after MARS, API and API+MARS depletion through pairwise peptide analysis. The 6 common proteins found in all three depletions are complement C1r subcomponent, apolipoprotein A-I, apolipoprotein A-II, apolipoprotein C-II, alpha-2-HS-glycoprotein and inter-alpha-trypsin inhibitor heavy chain H1.
For the next set of analyses, the individually staged data were pooled into three general groups: healthy (Group E), benign (Groups A and B) and metastatic (Groups C and D) CRC (as discussed in the Methods section). Volcano plots were generated for each depletion strategy between the above three groups. In Figure 5, the volcano plots represent the protein fold change between any two groups (benign/healthy, malignant/healthy and malignant/benign) for the three depletions.
Further analysis on the proteins identified from the API+MARS depleted samples was conducted. Two independent analyses (Anova and pairwise peptide analysis) were conducted on API+MARS depleted plasma protein IDs and the overlap of these two groups showed 10 common proteins. Figure 5 represents the number of differentially expressed proteins identified by strict ANOVA analysis and pairwise peptide level analysis and Table 3 summarises the 10 proteins that have been identified in both analyses.
Figure imgf000078_0001
P05155 Plasma protease C1 inhibitor 8.22 0.00 18
P00739 Haptoglobin-related protein 2.78 0.00 4
P02749 Beta-2-glycoprotein 1 2.14 0.04 20
P27169 Serum paraoxonase/arylesterase 1 6.95 0.00 5
P35858 Insulin-like growth factor-binding protein complex acid labile subunit 6.57 0.02 17
P63261 Actin, cytoplasmic 2 14.24 0.00 2
Q04756 Hepatocyte growth factor activator 7.07 0.04 7
The results from the ANOVA analysis and the pairwise peptide analysis were then combined, and the list of proteins was manually curated by removing the HAPs (e.g. serum albumin, apolipoproteins and fibrinogens). A list of 19 proteins was thereby obtained that are differentially expressed among CRC stages A-D and the healthy control group (E). Table 4 summarises the differentially expressed proteins across CRC stages A-D and disease unaffected (E) identified by ANOVA and/or peptide level analysis (excluding HAPs).
Figure imgf000079_0001
P63261 Actin, cytoplasmic 2 14.24 0
P00450 Ceruloplasmin 1.73 0.01
Q14624 Inter-alpha-trypsin inhibitor heavy chain H4 2.06 0.01
P02790 Hemopexin 1.56 0.01
O15204 ADAM DEC1 4.53 0.02
P00734 Prothrombin 2.91 0.02
P00748 Coagulation factor XII 2.09 0.02
P35858 Insulin-like growth factor-binding protein complex acid labile subunit 6.57 0.02
P00746 Complement factor D 41.7 0.03
P01034 Cystatin-C 9.96 0.04
Q14520 Hyaluronan-binding protein 2 3.39 0.04
P08697 Alpha-2-antiplasmin 2.73 0.04
P19823 Inter-alpha-trypsin inhibitor heavy chain H2 2.56 0.04
P02749 Beta-2-glycoprotein 1 2.14 0.04
Q04756 Hepatocyte growth factor activator 7.07 0.04
The stage specific (CRC stage A-D and disease unaffected healthy, E) abundance patterns of the 19 proteins are provided in Figure 7. The above 19 proteins were further analysed to observe their differential expression across three major groups: disease unaffected (E), non-malignant (A+B) and malignant (C+D). This analysis generated a list of 9 differentially expressed low abundance proteins that are summarised in Table 5.
Table 5: List of 19 proteins obtained from API+MARS depletion that are differentially expressed among three major groups (Healthy, Non malignant and Malignant). Underlined proteins are also present in the stage specific analysis (Table 2).
Uniprot Protein Name MaxFC P value (ANOVA adjusted)
O15204 ADAM DEC1 3.97 0.03
O75636 Ficolin-3 4.16 0.03
P00734 Prothrombin 2.81 0.03
P00739 Haptoglobin-related protein 2.12 0.03
P01034 Cystatin-C 8.12 0.03
Q6EMK4 Vasorin 6.56 0.03
P00746 Complement factor D 20.49 0.04
P02774 Vitamin D-binding protein 1.8 0.04
The group specific (healthy, benign, malignant) abundance patterns of the 19 proteins are provided in figure 8.
The stage specific and group specific abundance patterns of the above 19 proteins were then manually inspected, and ADAM DEC1, cystatin-C and complement factor D were identified as differentially expressed among CRC stages A-D when compared with the disease unaffected healthy group. In addition, plasma protease C1 inhibitor, hemopexin and inter-alpha-trypsin inhibitor heavy chain H2 appeared to be differentially expressed between the disease unaffected healthy group and two or more CRC stages. Figure 6 represents the stage and group specific (healthy, benign, malignant) abundance patterns of ADAM DEC1, cystatin-C, complement factor D, plasma protease C1 inhibitor, hemopexin and inter-alpha-trypsin inhibitor heavy chain H2.
3. Discussion
This study is the first ever attempt to detect LAP markers in stage specific CRC plasma samples using a combination of two independent immunodepletion strategies (i.e., ultradepletion) followed by comprehensive SWATH proteomic analysis. The concept of minimizing the complexity of blood samples by immunodepletion has been widely practised by proteomic researchers. However, it has been suggested that the LAP fraction may still contain albumin, a predominantly sticky protein that acts as a carrier protein for bilirubin and various sex hormones (e.g., sex hormone binding globulin) [13]. Tan et al. have demonstrated the efficiency of the API strategy, where 165 plasma proteins were identified from LAP enriched human plasma using an ultradepletion technique (ProteoPrep albumin and IgG column (Sigma, St Louis, MO) followed by API) [5]. In that study, 32 human proteins found in the depleted fraction had never previously been observed in human plasma, and 2 of them were not reported in PeptideAtlas. This finding validates the utility of the ultradepletion technique using API technology in combination with a commercial immunodepletion platform. In this study, the LAP fractions obtained from API depletion indicated the presence of albumin, which justified the second step of depletion using the MARS-14 immunoaffinity column.
SWATH™ MS is the most recent version of the data independent acquisition (DIA) method. It records fragment ion spectra of all ionized species of a sample and creates a digital record of all ionized species above the detection limit, where the fragment ion spectra of individual peptides support the accurate relative quantification of large fractions of a proteome in a single injection. As SWATH™ is a recently emerging technique adopted by proteomic researchers, there has not yet been any study conducted on CRC patient plasma samples using this strategy. Liu et al. have reported glycoproteomic analysis of prostate cancer tissues by SWATH™ MS [11], and in another study the secretome of metastatic cell lines was analysed by SWATH™ and CD109 was identified as a cancer associated protein in non-small-cell lung cancer [13]. Therefore our study is the first ever reported attempt to analyse immunodepleted pooled Dukes' CRC plasma samples using the SWATH™ technique. The protein IDs from each depletion set indicated the presence of HAPs (e.g., serum albumin, apolipoprotein A, immunoglobulin G). This observation confirms that none of the depletion approaches completely removed all the HAPs from the samples. The SCX fractionation of MARS depleted samples resulted in a significant 323 protein IDs; however, 8 and 9 unique proteins were identified from the unfractionated pooled API and API+MARS depleted samples.
An extensive analysis of ultradepleted samples identified six proteins (ADAM DEC1, cystatin-C, complement factor D, plasma protease C1 inhibitor, hemopexin and inter-alpha-trypsin inhibitor heavy chain H2) that are significantly differentially expressed across CRC stages. Our data clearly demonstrate the efficiency of using a combination of ultradepletion and a SWATH™ MS based quantitative approach for identification of stage specific low abundance candidate proteins.
Example 2: Reference dataset
A reference data set is generated, containing the amounts of ADAM DEC1, cystatin-C and complement factor D from 20 individuals. Specifically, blood plasma samples are obtained from the individuals and subjected to API and MARS-14 immunodepletion. The depleted plasma samples are subsequently processed by ELISA to determine the amounts of ADAM DEC1, cystatin-C and complement factor D in the samples. All the data obtained are normalised against the amount of residual human serum albumin in the sample.
The amounts of each protein determined by ELISA are entered into a spreadsheet to serve as a reference dataset for determining the likelihood that suspected cancer patients have cancer.
The individuals identified for the reference data set are all healthy individuals with no evidence of cancer or other pathology, and none is taking any medication.
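The normalisation and tabulation steps of Example 2 can be sketched in Python as follows. This is a minimal illustration only: the dictionary layout, the function names and the choice of the mean as the summary value for the reference data set are assumptions that do not appear in the specification, which states only that the data are normalised against residual human serum albumin and entered into a spreadsheet.

```python
from statistics import mean

# Biomarkers measured by ELISA in Example 2.
BIOMARKERS = ("ADAM DEC1", "cystatin-C", "complement factor D")


def normalise(sample):
    """Divide each biomarker amount by the residual albumin amount of the same sample."""
    albumin = sample["albumin"]
    if albumin <= 0:
        raise ValueError("residual albumin amount must be positive")
    return {name: sample[name] / albumin for name in BIOMARKERS}


def build_reference(samples):
    """Summarise the normalised amounts of all reference individuals.

    Assumption: the reference value for each biomarker is the mean across
    the healthy individuals; the specification does not name a statistic.
    """
    normalised = [normalise(sample) for sample in samples]
    return {name: mean(row[name] for row in normalised) for name in BIOMARKERS}
```

Each entry of `samples` is expected to hold one individual's ELISA amounts plus an `albumin` key, all in arbitrary but consistent units.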
Example 3: Test samples
A patient presents at the clinic in need of screening for colorectal cancer. The patient does not present with overt clinical symptoms of any pathology but requires screening because of a family history of colorectal cancer.
A plasma sample is obtained from the patient and the amounts of ADAM DEC1, cystatin-C and complement factor D are measured using an array. The array comprises a solid phase and antibodies attached to the solid phase, wherein the antibodies recognise and are capable of binding to each of ADAM DEC1, cystatin-C and complement factor D. The amounts of each of ADAM DEC1, cystatin-C and complement factor D are entered into a spreadsheet for comparison with the reference data set.
Results
The amount of ADAM DEC1 in the plasma sample obtained from the patient was greater than the amount of ADAM DEC1 in the reference data set. The amounts of cystatin-C and complement factor D in the plasma sample from the patient were lower than the amounts in the reference data set. The patient was determined to have a high likelihood of having colorectal cancer and was scheduled for colonoscopy to identify the stage of the colorectal cancer.
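The comparison made in Example 3 can be expressed as a short Python sketch. The function and variable names are hypothetical, and treating a single biomarker that deviates in the cancer-associated direction as sufficient for a "high" result is one reading of the comparison, not a rule stated in this example.

```python
# Direction in which each biomarker differs from the reference data set in
# the patient of Example 3 (ADAM DEC1 greater; the other two lower).
EXPECTED_IN_CANCER = {
    "ADAM DEC1": "greater",
    "cystatin-C": "lower",
    "complement factor D": "lower",
}


def flagged_biomarkers(test_amounts, reference_amounts):
    """Return measured biomarkers that differ from the reference in the cancer-associated direction."""
    flagged = []
    for name, direction in EXPECTED_IN_CANCER.items():
        if name not in test_amounts:
            continue  # a biomarker that was not measured is simply skipped
        test, ref = test_amounts[name], reference_amounts[name]
        if (direction == "greater" and test > ref) or (direction == "lower" and test < ref):
            flagged.append(name)
    return flagged


def likelihood_of_cancer(test_amounts, reference_amounts):
    # Assumption: one flagged biomarker is enough to report "high".
    return "high" if flagged_biomarkers(test_amounts, reference_amounts) else "low"
```

Amounts equal to the reference value are not flagged, so a sample that matches the reference on every measured biomarker is reported as "low".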
Example 4: Plasma SWATH library generation using 4 independent peptide fractionation methods
High pH C18 reversed phase fractionation of MARS-14 depleted sample: The MARS-14 depleted plasma samples were fractionated by high pH C18 reversed phase chromatography. Prior to fractionation, samples were cleaned up on a Sep-Pak Light C18 cartridge following the manufacturer's instructions. Samples were then fractionated on an Agilent 1260 HPLC system with a ZORBAX 300Extend-C18 column (2.1 x 150 mm, 3.5 micron, 300 Å). The column was equilibrated with Buffer A (5 mM NH4OH, pH 10), which was also used for sample resuspension, sample injection and peptide adsorption to the column. Peptide elution was achieved with a 70 min gradient to 90% Buffer B (5 mM NH4OH, pH 10, 90% v/v ACN) at a flow rate of 0.3 mL/min. Peptides were collected in 1 to 2 min increments and dried in a vacuum centrifuge. Fractions were resuspended in 2% v/v ACN and 0.1% v/v formic acid and, based on the intensity of the high pH C18 reversed phase chromatographic UV trace, combined into a total of 20 fractions. Each fraction was analysed on a SCIEX TripleTOF 5600 system for protein identification.
Size exclusion chromatography (SEC) fractionation of MARS-14 depleted sample: The MARS-14 depleted plasma samples were fractionated by SEC. Prior to fractionation, samples were cleaned up on a Sep-Pak Light C18 cartridge following the manufacturer's instructions. Samples were then fractionated by SEC on an Amersham Biosciences Superdex Peptide HR 10/30 column (17-1453-01). Fractionation was performed with a single-buffer system (50 mM NaPO4, 250 mM NaCl, pH 7) at a flow rate of 0.6 mL/min. Peptides were collected every 1 min for 96 min and dried in a vacuum centrifuge. Fractions were resuspended in 2% v/v ACN and 0.1% v/v formic acid and, based on the intensity of the SEC UV trace, combined into a total of 15 fractions. Each fraction was analysed on a SCIEX TripleTOF 5600 system for protein identification.
Strong anion exchange (SAX) fractionation of MARS-14 depleted sample: The MARS-14 depleted plasma samples were fractionated by SAX chromatography. Prior to fractionation, samples were cleaned up on a Sep-Pak Light C18 cartridge following the manufacturer's instructions. Samples were then fractionated by SAX chromatography on a Bio-Rad UNO Q-1 column (720-0001). The column was equilibrated with Buffer A (20 mM Tris-HCl, pH 7), which was also used for sample resuspension, sample injection and peptide adsorption to the column. Peptide elution was achieved with a 60 min gradient to 100% Buffer B (20 mM Tris-HCl, pH 7, 1 M KCl) followed by a 10 min gradient to 100% Buffer C (20 mM Tris-HCl, pH 7, 2 M KCl), at a flow rate of 0.5 mL/min. Peptides were collected every 1 min and dried in a vacuum centrifuge. Fractions were resuspended in 2% v/v ACN and 0.1% v/v formic acid and, based on the intensity of the SAX chromatography UV trace, combined into a total of 15 fractions. Each fraction was analysed on a SCIEX TripleTOF 5600 system for protein identification.
Strong cation exchange (SCX) fractionation of MARS-14 depleted sample: The methodology for SCX chromatography was adapted from Example 1. A total of 15 fractions were collected and analysed on a SCIEX TripleTOF 5600 system for protein identification.
Results:
A total of 529 plasma proteins were identified from the 4 independent peptide fractionation methods. Table 6 shows the number of proteins identified by each fractionation method. Figure 13 is a Venn diagram showing the number of unique plasma proteins identified by the different methodologies.
Table 6: Summary of CRC plasma protein identification from different peptide fractionation methods.
Figure imgf000086_0001
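The counts summarised in Table 6 and Figure 13 (total identifications across methods and proteins found by only one method) amount to simple set operations. The sketch below shows one way to compute them in Python; the function name and the input layout are hypothetical, and no identifiers from the actual library are reproduced here.

```python
def summarise_identifications(ids_by_method):
    """Return (all proteins identified, proteins unique to each method).

    ids_by_method maps a fractionation method name to an iterable of
    protein identifiers found with that method.
    """
    as_sets = {method: set(ids) for method, ids in ids_by_method.items()}
    # Union of identifications over every method (the combined library).
    total = set().union(*as_sets.values())
    unique = {}
    for method, ids in as_sets.items():
        # Everything seen by any of the other methods.
        others = set().union(*(v for k, v in as_sets.items() if k != method))
        unique[method] = ids - others
    return total, unique
```

Called with one set of identifiers per method (high pH C18, SEC, SAX and SCX), `len(total)` gives the size of the combined library and `len(unique[method])` gives the method-specific regions of the Venn diagram.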
Example 5: Determination of additional biomarkers part A
Method:
The methodology of Example 1 was adapted for the purpose of identifying additional biomarkers. Briefly, plasma samples from patients with clinically staged CRC or control plasma samples were obtained as described in Example 1. Plasma samples were subjected to MARS-14 immunodepletion as described in Example 1, followed by two successive rounds of depletion on an API column.
Depleted plasma samples were then subjected to SWATH-MS as described in Example 1.
Results:
Table 7: List of 3 differentially expressed proteins identified from MARS-14 and API depletion based on peptide-level analyses and ANOVA (BH-corrected p-value < 0.05, fold change > 1.5).
Figure imgf000087_0001
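The selection criteria quoted in the table caption (BH-corrected p-value < 0.05 and fold change > 1.5) can be illustrated with the following Python sketch. It implements the standard Benjamini-Hochberg adjustment and is not the software used in the study. The record layout and function names are hypothetical, and treating the fold-change threshold symmetrically (a ratio above 1.5 or below 1/1.5) is an assumption made because both up- and down-regulated proteins are reported.

```python
def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that monotonicity is enforced.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def select_differential(records, alpha=0.05, min_fold=1.5):
    """Return names of proteins passing both thresholds.

    records: list of (name, raw_p_value, fold_change) tuples, where
    fold_change is the disease/healthy abundance ratio (assumed layout).
    """
    adjusted = bh_adjust([p for _, p, _ in records])
    selected = []
    for (name, _, fold), adj_p in zip(records, adjusted):
        # Assumption: down-regulation counts when the ratio is below 1/min_fold.
        changed = fold > min_fold or fold < 1.0 / min_fold
        if adj_p < alpha and changed:
            selected.append(name)
    return selected
```

The raw p-values would come from the ANOVA across healthy and stage A-D groups; that step is outside the scope of this sketch.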
Example 6: Determination of additional biomarkers part B
Method:
The methodology of Example 1 was adapted for the purpose of identifying additional biomarkers. Briefly, plasma samples from patients with clinically staged CRC or control plasma samples were obtained as described in Example 1. Plasma samples were subjected to MARS-14 immunodepletion, as described in Example 1.
Depleted plasma samples were then subjected to SWATH-MS as described in Example 1.
Results:
Using the SWATH library generated in Example 4, quantitative SWATH-MS analyses were performed on stage-specific pooled MARS-14 depleted CRC plasma samples.
A total of 6 proteins were identified as differentially expressed among CRC stages A-D when compared with disease-unaffected healthy groups, with >1.5-fold change, a false discovery rate of 1% at the protein level and statistical significance of p<0.05 (Table 8). Figure 15 represents the stage- and group-specific (healthy or CRC stage A, B, C and D) abundance patterns of glutathione peroxidase 3 (GPX3), protocadherin gamma-A8 (PCDHGA8), serum amyloid protein A2 (SAA2), carboxypeptidase Q (CPQ), mannan-binding lectin serine protease 2 (MASP2) and profilin-1 (PFN1).
Expression levels of GPX3, PCDHGA8 and SAA2 were up-regulated at the disease stages (A-D) compared with the healthy control group.
Expression levels of CPQ, MASP2 and PFN1 were down-regulated at the disease stages (A-D) compared with the healthy control group.
Table 8: List of 6 differentially expressed proteins identified from MARS-14 depletion based on peptide-level analyses and ANOVA (BH-corrected p-value < 0.05, fold change > 1.5).
Figure imgf000088_0001
Example 7: Determination of additional biomarkers part C
Method:
The methodology of Example 1 was adapted for the purpose of identifying additional biomarkers. Briefly, plasma samples from patients with clinically staged CRC or control plasma samples were obtained as described in Example 1.
Plasma samples were subjected to SWATH-MS without any immunodepletion or other treatment to remove proteins of high abundance.
Results:
Using the SWATH library generated in Example 4, quantitative SWATH-MS analyses were performed on stage-specific pooled CRC plasma samples.
A total of 4 proteins were identified as differentially expressed among CRC stages A-D when compared with disease-unaffected healthy groups, with >1.5-fold change, a false discovery rate of 1% at the protein level and statistical significance of p<0.05 (Table 9).
Figure 16 represents the stage- and group-specific (healthy or CRC stage A, B, C and D) abundance patterns of S100 calcium binding protein 8 (S100A8), serum amyloid protein A1 (SAA1), mannose receptor C type I (MRC1) and superoxide dismutase 3 (SOD3).
Expression levels of S100A8 and SAA1 were up-regulated at the disease stages (A-D) compared with the healthy control group.
Expression levels of MRC1 and SOD3 were down-regulated at the disease stages (A-D) compared with the healthy control group.
Table 9: List of 4 differentially expressed proteins identified from neat plasma based on peptide-level analyses and ANOVA (BH-corrected p-value < 0.05, fold change > 1.5).
Figure imgf000089_0001
References
1. Mahboob, S., et al., A novel multiplexed immunoassay identifies CEA, IL-8 and prolactin as prospective markers for Dukes' stages A-D colorectal cancers. Clin Proteomics, 2015. 12(1): p. 10.
2. Peeters, M., T. Price, and J.L. Van Laethem, Anti-epidermal growth factor receptor monotherapy in the treatment of metastatic colorectal cancer: where are we today? The oncologist, 2009. 14(1): p. 29-39.
3. Cao, Z., et al., Systematic comparison of fractionation methods for in-depth analysis of plasma proteomes. Journal of proteome research, 2012. 11(6): p. 3090-3100.
4. Zhang, Q., V. Faca, and S. Hanash, Mining the plasma proteome for disease applications across seven logs of protein abundance. Journal of proteome research, 2010. 10(1): p. 46-50.
5. Tan, S.-H., et al., Ultradepletion of human plasma using chicken antibodies: a proof of concept study. Journal of proteome research, 2013. 12(6): p. 2399-2413.
6. Seibert, V., M.P. Ebert, and T. Buschmann, Advances in clinical cancer proteomics: SELDI-ToF-mass spectrometry and biomarker discovery. Briefings in functional genomics & proteomics, 2005. 4(1): p. 16-26.
7. Millioni, R., et al., High abundance proteins depletion vs low abundance proteins enrichment: comparison of methods to reduce the plasma proteome complexity. PLoS One, 2011. 6(5): p. e19603.
8. Echan, L.A., et al., Depletion of multiple high-abundance proteins improves protein profiling capacities of human serum and plasma. Proteomics, 2005. 5(13): p. 3292-3303.
9. Gillet, L.C., et al., Targeted data extraction of the MS/MS spectra generated by data-independent acquisition: a new concept for consistent and accurate proteome analysis. Molecular & Cellular Proteomics, 2012. 11(6): p. O111.016717.
10. Rost, H.L., et al., OpenSWATH enables automated, targeted analysis of data-independent acquisition MS data. Nature biotechnology, 2014. 32(3): p. 219-223.
11. Liu, Y., et al., Glycoproteomic analysis of prostate cancer tissues by SWATH mass spectrometry discovers N-acylethanolamine acid amidase and protein tyrosine kinase 7 as signatures for tumor aggressiveness. Molecular & Cellular Proteomics, 2014. 13(7): p. 1753-1768.
12. Liu, Y., et al., Quantitative variability of 342 plasma proteins in a human twin population. Molecular systems biology, 2015. 11(2): p. 786.
13. Wu et al., SWATH mass spectrometry performance using extended peptide MS/MS assay libraries. Mol Cell Proteomics 2016 May 9.
14. Zhang, F., et al., SWATH™- and iTRAQ-based quantitative proteomic analyses reveal an overexpression and biological relevance of CD109 in advanced NSCLC. Journal of proteomics, 2014. 102: p. 125-136.

Claims

1. A method of determining the likelihood of an individual having a cancer including:
- providing a test sample of bodily fluid from an individual for whom diagnosis of cancer is required;
- measuring the amount of one or more protein biomarkers in the test sample, wherein the one or more protein biomarkers is selected from the group consisting of: ADAM DEC1, cystatin-C and complement factor D;
- determining that the individual has a high likelihood of having cancer when:
a) the amount of ADAM DEC1 in the test sample is greater than the amount of ADAM DEC1 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
b) the amount of cystatin-C and/or complement factor D in the test sample is lower than the amount of the same protein biomarker in the reference data set;
- determining that the individual has a low likelihood of having cancer when:
c) the amount of ADAM DEC1 in the test sample is the same or lower than the amount of ADAM DEC1 in the reference data set;
d) the amount of cystatin-C and/or complement factor D in the test sample is the same or greater than the amount of the same protein biomarker in the reference data set.
2. The method of claim 1, further including:
- measuring the amount of one or more additional protein biomarkers in the test sample, wherein the one or more additional biomarkers is selected from the group consisting of: plasma protease C1 inhibitor, paraoxonase 1, hemopexin, inter-alpha-trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, serum amyloid P component, apolipoprotein A-IV, apolipoprotein B-100, carboxypeptidase Q, glutathione peroxidase 3, mannan-binding lectin serine protease 2, mannose receptor C type I, protocadherin gamma-A8, profilin-1, S100 calcium binding protein 8, serum amyloid protein A1, serum amyloid protein A2, and superoxide dismutase 3;
- determining that the individual has a high likelihood of having cancer when:
a) the amount of hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, and/or serum amyloid protein A2 in the test sample is greater than the amount of the same protein in the reference data set; and/or
b) the amount of plasma C1 protease inhibitor, paraoxonase 1, inter-alpha trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3 in the test sample is lower than the amount of the same protein biomarker in the reference data set;
- determining that the individual has a low likelihood of having cancer when:
c) the amount of hemopexin, serum amyloid P component, apolipoprotein B-100, glutathione peroxidase 3, protocadherin gamma-A8, S100 calcium binding protein 8, serum amyloid protein A1, and/or serum amyloid protein A2 in the test sample is the same or lower than the amount of the same protein in the reference data set;
d) the amount of plasma protease C1 inhibitor, paraoxonase 1, inter-alpha trypsin inhibitor heavy chain H2, prothrombin, hepatocyte growth factor activator, apolipoprotein A-IV, carboxypeptidase Q, mannan-binding lectin serine protease 2, mannose receptor C type I, profilin-1 and/or superoxide dismutase 3 in the test sample is the same or greater than the amount of the same protein biomarker in the reference data set.
3. A method of determining the likelihood of a successful treatment of a cancer in an individual including:
- providing a post-treatment test sample of bodily fluid from an individual who has received a treatment for a cancer;
- measuring the amount of one or more protein biomarkers in the post-treatment test sample, wherein the one or more protein biomarkers is selected from the group consisting of: ADAM DEC1, cystatin-C, complement factor D;
- determining that there is a high likelihood that the treatment was successful when:
a) the amount of ADAM DEC1 in the post-treatment test sample is lower than the amount of ADAM DEC1 in a pre-treatment reference sample obtained from the individual before receiving the treatment for cancer;
b) the amount of cystatin-C and/or complement factor D in the post-treatment test sample is greater than the amount of the same protein biomarker in the pre-treatment reference sample;
- determining that there is a low likelihood that the treatment was successful when:
c) the amount of ADAM DEC1 in the post-treatment test sample is the same or greater than the amount of ADAM DEC1 in the pre-treatment reference sample;
d) the amount of cystatin-C and/or complement factor D in the post-treatment test sample is the same or lower than the amount of the same protein biomarker in the pre-treatment reference sample.
4. The method of claim 3, further including:
- measuring the amount of one or more additional protein biomarkers in the post-treatment test sample, wherein the one or more additional biomarkers is selected from the group consisting of: plasma C1 inhibitor, paraoxonase 1, hemopexin and inter-alpha-trypsin inhibitor heavy chain H2, and
- determining that there is a high likelihood that the treatment was successful when:
a) the amount of hemopexin in the post-treatment test sample is lower than the amount of hemopexin in the pre-treatment reference sample; and/or
b) the amount of plasma C1 inhibitor, paraoxonase 1 and/or inter-alpha trypsin inhibitor heavy chain H2 in the post-treatment test sample is greater than the amount of the same protein biomarker in the pre-treatment reference sample;
- determining that there is a low likelihood that the treatment was successful when:
c) the amount of hemopexin in the post-treatment test sample is the same or greater than the amount of hemopexin in the pre-treatment reference sample;
d) the amount of plasma C1 inhibitor, paraoxonase 1 and/or inter-alpha trypsin inhibitor heavy chain H2 is the same or lower than the amount of the same protein biomarker in the pre-treatment reference sample.
5. The method of any one of claims 1 to 4 wherein the cancer is a colorectal cancer.
6. The method of claim 3 or 4 wherein the treatment is chemotherapy, surgical resection, radiotherapy, immunotherapy or a combination thereof.
7. The method according to any one of the preceding claims further including the step of comparing the amount of the measured protein biomarker with the amount of the same biomarker in the reference data set or the pre-treatment reference sample, to determine whether the measured amount of the biomarker is greater than, lower than, or the same as the amount of the same biomarker in the reference data set or the pre-treatment reference sample.
8. The method of any one of the preceding claims wherein the sample of bodily fluid is a sample of plasma or serum obtained from blood.
9. A method of determining whether to treat an individual for cancer including:
- providing a test sample of bodily fluid from an individual;
- measuring the amount of one or more protein biomarkers in the test sample, wherein the one or more protein biomarkers is selected from the group consisting of: ADAM DEC1, cystatin-C and complement factor D;
- determining to treat the individual for cancer when:
a) the amount of ADAM DEC1 in the test sample is greater than the amount of ADAM DEC1 in a reference data set in the form of data representative of one or more individuals who do not have cancer; and/or
b) the amount of cystatin-C and/or complement factor D in the test sample is lower than the amount of the same protein biomarker in the reference data set;
- determining not to treat the individual for cancer when:
c) the amount of ADAM DEC1 in the test sample is the same or lower than the amount of ADAM DEC1 in the reference data set;
d) the amount of cystatin-C and/or complement factor D in the test sample is the same or greater than the amount of the same protein biomarker in the reference data set.
10. The method according to claim 9, wherein the cancer is a colorectal cancer.
PCT/AU2017/050644 2016-06-24 2017-06-23 Screening methods WO2017219093A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
AU2016902484A AU2016902484A0 (en) 2016-06-24 Screening methods
AU2016902484 2016-06-24

Publications (1)

Publication Number Publication Date
WO2017219093A1 true WO2017219093A1 (en) 2017-12-28

Family

ID=60783124

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/AU2017/050644 WO2017219093A1 (en) 2016-06-24 2017-06-23 Screening methods

Country Status (1)

Country Link
WO (1) WO2017219093A1 (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP4032543A1 (en) * 2021-01-20 2022-07-27 Fraunhofer-Gesellschaft zur Förderung der angewandten Forschung e.V. Peptides from the sequence of the complement factor d for medical use, especially for the treatment of ebv-associated diseases
CN114839253A (en) * 2022-04-21 2022-08-02 深圳市第二人民医院(深圳市转化医学研究院) Quantitative analysis method for low molecular weight protein in serum or plasma and application thereof

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2002068677A2 (en) * 2001-02-27 2002-09-06 Eos Biotechnology, Inc. Novel methods of diagnosis of metastatic colorectal cancer, compositions and methods of screening for modulators of metastatic colorectal cancer
WO2010145796A2 (en) * 2009-06-19 2010-12-23 Merck Patent Gmbh Biomarkers and methods for determining efficacy of anti-egfr antibodies in cancer therapy
WO2013023994A1 (en) * 2011-08-12 2013-02-21 Pronota N.V. New biomarker for the classification of ovarian tumours

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
BIRKENKAMP-DEMTRODER, K. ET AL.: "Gene expression in colorectal cancer", CANCER RESEARCH, vol. 62, no. 15, 2002, pages 4352 - 4363, XP002514753 *
YAP, YEELENG ET AL.: "Classification between normal and tumor tissues based on the pair-wise gene expression ratio", BMC CANCER, vol. 4, no. 72, 2004, pages 1 - 17, XP021004675 *
ZHANG, J. ET AL.: "Increasing cystatin C and cathepsin B in serum of colorectal cancer patients", CLINICAL LABORATORY, vol. 63, no. 2, February 2017 (2017-02-01), pages 365 - 371 *

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 17814338

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 17814338

Country of ref document: EP

Kind code of ref document: A1