WO2012151277A1 - Kits and methods for selecting a treatment for ovarian cancer - Google Patents

Info

Publication number
WO2012151277A1
Authority
WO
WIPO (PCT)
Prior art keywords
subject
genes
actb
output score
cutoff value
Prior art date
Application number
PCT/US2012/036120
Other languages
French (fr)
Inventor
Jason Basil NIKAS
Walter Cheney LOW
Amy Patrice SKUBITZ
Kristin Louise Murgic BOYLAN
Original Assignee
Applied Informatic Solutions, Inc.
Application filed by Applied Informatic Solutions, Inc.
Publication of WO2012151277A1

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/158 Expression markers

Definitions

  • Ovarian cancer is the most lethal gynecological malignancy in the U.S., due in part to the subtlety of its symptoms. It accounts for about 3% of all cancers in women in the U.S.
  • Approximately 24,000 new cases of ovarian cancer are diagnosed each year in the U.S., resulting in about 16,000 deaths per year. Current diagnostic tests are neither adequately sensitive nor specific; consequently, the majority of ovarian cancer patients are diagnosed with advanced disease. Standard therapy for ovarian cancer involves debulking surgery to reduce tumor burden, followed by chemotherapy with a combination of platinum and paclitaxel (e.g., TAXOL®). Initially, up to 80% of ovarian cancer patients respond to chemotherapy; however, most patients relapse in less than 2 years.
  • Taxol® is a mitotic inhibitor whose mechanism of action is to prevent (a) the destabilization of microtubules, which is necessary for the formation of the mitotic spindle and subsequent chromosomal separation during mitosis, and (b) the formation of new microtubules, which is also necessary for the aforementioned mitotic stages.
  • Another cytostructural component to the destabilization of the microtubules and subsequent formation of the mitotic spindle is β-actin, a polymeric microfilament that helps hold microtubules together.
  • One embodiment provides a method to determine if an ovarian cancer patient is a long term survivor or a short term survivor comprising measuring the level of expression of at least one gene in a sample from the patient, wherein the level of expression of the at least one gene in the sample is an indication that the subject is a long term survivor or a short term survivor.
  • a method of selecting a treatment for a subject having ovarian cancer comprises determining whether a subject having ovarian cancer is likely to have short term or long term survival by a method comprising measuring the level of gene expression of at least a set of genes comprising LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4 in a sample comprising ovarian cancer cells from the subject; inputting the expression levels of the set of genes into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer to obtain an output score; determining whether the subject is likely to have long term survival by determining if the output score is less than a cutoff value or whether the subject is likely to have short term survival by determining if the output score is greater than or equal to the cutoff value, wherein the cutoff value is a value determined by identifying a value between the 99% confidence interval of a mean output score of a first set of samples
  • a method of selecting a treatment for a subject having ovarian cancer comprises determining whether the subject having ovarian cancer is likely to have short term or long term survival by a method comprising measuring the level of gene expression of at least a set of genes comprising SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED in a sample comprising ovarian cancer cells from the subject; inputting the expression levels of the set of genes into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer to obtain an output score; determining whether the subject is likely to have long term survival by determining if the output score is less than a cutoff value or whether the subject is likely to have short term survival by determining if the output score is greater than or equal to the cutoff value, wherein the cutoff value is a value determined by identifying a value between the 99% confidence interval of a mean output score of a
  • a method of selecting a treatment for a subject having ovarian cancer comprises determining whether the subject is likely to have short term or long term survival by a method comprising measuring the level of gene expression of at least a set of genes comprising CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED in a sample comprising ovarian cancer cells from the subject; inputting the expression levels of the set of genes into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer to obtain an output score; determining whether the subject is likely to have long term survival by determining if the output score is less than a cutoff value or whether the subject is likely to have short term survival by determining if the output score is greater than or equal to the cutoff value, wherein the cutoff value is a value determined by identifying a value between the 99% confidence interval of a mean output score of a first set of samples from subjects known to have short term
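The cutoff determination described in these embodiments can be sketched roughly as follows. This is only an illustration under stated assumptions: the source text is cut off before it names the second interval, so the sketch assumes it is the 99% confidence interval of the mean output score of samples from known long term survivors. The normal approximation, the midpoint rule, the function names, and all scores below are hypothetical and are not taken from the patent.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def ci99_of_mean(scores):
    """Return (lower, upper) bounds of an approximate 99% CI of the mean."""
    # Normal approximation (z is about 2.576). A t-based interval would be
    # wider for small sample sets; this choice is an assumption of the sketch.
    z = NormalDist().inv_cdf(0.995)
    half_width = z * stdev(scores) / sqrt(len(scores))
    centre = mean(scores)
    return centre - half_width, centre + half_width


def cutoff_between(sts_scores, lts_scores):
    """Pick a cutoff in the gap between the two 99% confidence intervals."""
    sts_low, _ = ci99_of_mean(sts_scores)
    _, lts_high = ci99_of_mean(lts_scores)
    if lts_high >= sts_low:
        raise ValueError("confidence intervals overlap; no value lies between them")
    # Midpoint of the gap: one hypothetical choice among many valid values.
    return (lts_high + sts_low) / 2


# Hypothetical output scores, invented for illustration only.
sts = [25.0, 27.0, 26.0, 24.0, 28.0]  # known short term survivors
lts = [10.0, 12.0, 11.0, 13.0, 9.0]   # known long term survivors
cutoff = cutoff_between(sts, lts)
```

Any value strictly between the two intervals would satisfy the wording of the embodiments; the midpoint is used here only because it is simple to state.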
  • the methods further comprise treating a subject likely to have long term survival with standard chemotherapy.
  • standard chemotherapy comprises taxol and/or platinum.
  • the method further comprises treating a subject likely to have short term survival with therapy in addition to or in place of standard chemotherapy.
  • an alternative therapy comprises a therapy selected from the group consisting of antiangiogenesis compounds, taxane analogues, tubulin binding agents, and ubiquitination inhibitors.
  • a subject likely to have short term survival is treated with an inhibitor of a protein selected from the group consisting of TUBA3C, ACTB, CDC42 and combinations thereof.
  • the disclosure provides a method for selecting a treatment for a subject that has ovarian cancer, the method comprising: calculating an output score, using a computing device, by inputting gene expression levels of a first set of genes comprising LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4, a second set of genes comprising SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED, or a third set of genes comprising CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED, into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer; and displaying the output score, using a computing device.
  • the method further comprises determining whether the output score is greater than or equal to or less than a cutoff
  • One embodiment provides a method for diagnosing ovarian cancer in a subject comprising: measuring the level of expression of at least one gene in a test sample from a subject and comparing the level of expression with the level of expression of the at least one gene in a control sample from a healthy subject, wherein a higher or lower level of expression of the gene in the test sample compared with the level of expression in the control sample is an indication that the subject has ovarian cancer.
  • the mRNA levels are measured.
  • the protein levels are measured.
  • the gene expression levels are measured by microarray analysis.
  • One embodiment provides that expression of LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, PKP4, SSR1, USP5, HLCS, NDUFB1, CDC42 or a combination thereof is measured. In another embodiment, the expression of LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4 is measured. In another embodiment, the expression of LYPLA2, TUBA3C, ACTB and PKP4 is increased and the expression of MED13L, OSBPL8, and EED is decreased.
  • the expression of SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED is measured. In another embodiment, the expression of SSR1, NDUFB1, MED13L and EED is decreased and the expression of USP5, ACTB, HLCS, LYPLA2 and TUBA3C is increased. In one embodiment, the expression of CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED is measured. In another embodiment, the expression of CDC42, LYPLA2, TUBA3C, ACTB and HLCS is increased and the expression of MED13L and EED is decreased.
  • the expression of LYPLA2, TUBA3C, ACTB, USP5, HLCS, CDC42 or a combination thereof is increased.
  • the expression of MED13L, OSBPL8, EED, PKP4, SSR1, NDUFB1 or a combination thereof is decreased.
  • the expression of LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, PKP4, SSR1, USP5, HLCS, NDUFB1, CDC42 or a combination thereof is measured and applied to a mathematical function to yield a diagnosis of ovarian cancer.
  • the measurement of gene expression provides a diagnosis which indicates that the subject/patient will survive the cancer longer than about seven years. In another embodiment, the measurement of gene expression provides a diagnosis that the subject/patient will not survive the cancer for longer than about three years.
  • the subject/patient is a mammal, such as a human.
  • a health care provider or worker is informed.
  • the subject/patient is treated for ovarian cancer.
  • a kit for selecting a treatment for an ovarian cancer patient comprises or consists essentially of a primer or a probe or both that specifically hybridizes to each gene of a first set of genes comprising LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4.
  • the kit consists essentially of reagents for detecting expression of the first set of genes and contains other reagents such as primers or probes for housekeeping genes, positive controls and/or negative controls.
  • a kit comprises or consists essentially of a primer or a probe or both that specifically hybridizes to each gene of a first set of genes comprising SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED.
  • a kit comprises or consists essentially of a primer or a probe or both that specifically hybridizes to each gene of a first set of genes comprising CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED.
  • the kit contains no more than 200 primers or probes or both, no more than 175 primers, probes or both, no more than 150 primers, probes or both, no more than 125 primers, probes or both, no more than 100 primers, probes or both, no more than 75 primers, probes or both, no more than 50 primers, probes or both, no more than 25 primers, probes or both, or no more than 15 primers, probes or both.
  • a kit further comprises a computer readable storage medium having computer-executable instructions that, when executed by a computing device, cause the computing device to perform a step comprising: calculating an output score by inputting gene expression levels of a set of genes comprising LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4, a second set of genes comprising SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED, or a third set of genes comprising CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED from a sample, into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer.
  • the disclosure provides a computing device comprising a processing unit; and a system memory connected to the processing unit, the system memory including instructions that, when executed by the processing unit, cause the processing unit to: calculate an output score by inputting gene expression levels of a set of genes comprising LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4, a second set of genes comprising SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED, or a third set of genes comprising CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED from a sample, into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer; and display the output score.
  • a computing device comprising a processing unit; and a system memory connected to the processing unit, the system memory including
  • system memory includes instructions that, when executed by the processing unit, cause the processing unit to determine whether the output score is greater than or equal to, or less than, a cutoff value, and to display whether the subject is likely to be a short term or long term survivor.
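The decision step recited in these embodiments (an output score greater than or equal to the cutoff indicates likely short term survival, a lower score indicates likely long term survival) can be illustrated with a minimal sketch. The cutoff values are the ones quoted in the descriptions of Figures 6 and 7; the function name, the labels, and the example scores are hypothetical. The predictive functions themselves are not reproduced in this text, so the sketch takes an already computed output score as input.

```python
# Cutoff values quoted in the descriptions of Figures 6 and 7.
CUTOFFS = {"F1": 21.4, "F2": 14.3, "F3": 14.7}


def classify(function_name: str, output_score: float) -> str:
    """Return 'STS' (likely short term survivor) or 'LTS' (likely long term survivor)."""
    cutoff = CUTOFFS[function_name]
    # Scores at or above the cutoff are treated as short term survival.
    return "STS" if output_score >= cutoff else "LTS"


# Hypothetical output scores, invented for illustration only.
for name, score in [("F1", 25.0), ("F2", 12.1), ("F3", 14.7)]:
    print(name, score, classify(name, score))
```

Note that a score exactly equal to the cutoff falls on the short term side, matching the "greater than or equal to" wording.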
  • FIG. Box plots of the output scores of two survival/(treatment-response) groups (LTS and STS) of the F1 biomarker.
  • Figure 4 Box plots of the output scores of two survival/(treatment-response) groups (LTS & STS) of the F2 and F3 biomarkers.
  • FIG. 5 3D plot of output scores from long term and short term survivor subjects from functions F1 vs. F2 vs. F3. It can be seen that, with the exception of one subject, the three biomarkers are able to separate long-term from short-term survivors (responders vs. non-responders) in this 3D space.
  • Figure 6. Scatter plot & bar graph of output scores of all individual subjects [both LTS (responders) and STS (non-responders)] of the F1 prognostic biomarker. All 10 unknown STS subjects have F1 scores that are higher than the cutoff value (21.4), whereas all 10 unknown LTS subjects have F1 scores that are lower than the cutoff value.
  • Figure 7. Scatter plot and bar graph of output scores of all individual subjects (both LTS (responders) and STS (non-responders)) of the F2 and F3 prognostic biomarkers. All 10 unknown STS subjects have F2 and F3 scores that are higher than the respective cutoff values (14.3 for F2 and 14.7 for F3), whereas all 10 unknown LTS subjects have F2 and F3 scores that are lower than the respective cutoff values.
  • Figure 8 Three-dimensional plot of prognostic biomarkers of output scores from each function F1 vs. F2 vs. F3 for the validation (qualification) study of long-term
  • Figure 9 provides mathematical equations.
  • prognostic markers available for the diagnosis and/or prognosis of ovarian cancer, in particular, the classification of ovarian cancer patients in relation to short-term survivors (less than about three years from diagnosis, including several weeks, several months, 1 year, 2 years or 3 years) vs. long-term survivors (at least about 4 years, about 5 years, about 6 years or about 7 years or longer than about 7 years) or in relation to response to the standard aforementioned chemotherapy treatment.
  • the ability to distinguish between these two patient populations would allow the modification of treatment therapies and/or the development of new pharmacological treatments for short-term survivors to potentially prolong their survival time.
  • novel prognostic biomarkers that can distinguish between ovarian cancer patients who will survive longer than seven years versus those who will succumb to the disease within three years using a novel mathematical bioinformatic approach for the analysis of gene expression in each patient's tumor tissue.
  • This novel mathematical bioinformatic approach has resulted in the discovery of novel genes and networks underlying the progression from long-term survival to short-term survival in ovarian cancer patients.
  • the gene biomarkers that constitute this novel gene network, when combined together into a single complex mathematical function and, thus, treated as a single complex biomarker, have a very high prognostic power (AUC of 0.978).
  • This AUC value indicates that these biomarkers can both independently and collectively be used to identify short-term survivors with a very high accuracy and therefore provide alternative treatments that may extend their survival.
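For readers unfamiliar with the AUC figure quoted above, the following sketch shows one common way to compute an area under the ROC curve from two groups of output scores: the fraction of (short term, long term) subject pairs in which the short term survivor has the higher score, with ties counted as half. The text does not state how the reported AUC of 0.978 was computed, so this is a generic illustration; the function name and the scores are hypothetical.

```python
def auc(sts_scores, lts_scores):
    """Pairwise (Mann-Whitney) estimate of the area under the ROC curve."""
    wins = 0.0
    for s in sts_scores:
        for l in lts_scores:
            if s > l:
                wins += 1.0   # short term survivor correctly ranked higher
            elif s == l:
                wins += 0.5   # ties count as half
    return wins / (len(sts_scores) * len(lts_scores))


# Hypothetical output scores, invented for illustration only.
sts = [22.0, 25.5, 30.1, 19.0]
lts = [12.3, 15.8, 20.5, 9.9]
print(auc(sts, lts))
```

An AUC of 1.0 means every short term survivor scores above every long term survivor; 0.5 corresponds to no discrimination.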
  • this approach demonstrates the potential of personalized medicine based on the particular gene expression of a patient as it pertains to their specific disease.
  • One of the discovered genes, namely TUBA3C, is directly linked to the mechanism of action of taxol, the standard chemotherapy treatment for ovarian cancer.
  • Two of the remaining discovered genes, namely ACTB and CDC42, are indirectly linked to the mechanism of action of taxol. More specifically, the TUBA3C gene is responsible for the production of microtubules, which are needed for cell proliferation and which taxol acts to oppose.
  • the gene ACTB is responsible for the production of β-actin, which can be polymerized to form β-actin microfilaments, which are used for the
  • Taxol and other taxol analogs oppose either the depolymerization or polymerization of microtubules, respectively.
  • the gene CDC42 promotes the polymerization of β-actin into microfilaments, and, furthermore, it can regulate the polarization of both the actin and the microtubule cytoskeleton. All three of those genes were significantly over-expressed in the short-term survivors as compared with the long-term survivors. This indicates that in the case of the short-term survivors, taxol cannot overcome the combined effect of the TUBA3C, ACTB, and CDC42 genes, and that those individuals will not respond to the standard of care, i.e., chemotherapy with platinum and taxol. In addition, the findings indicate that chemotherapeutic agents that inhibit the overexpression of these genes are useful to extend the survival of ovarian cancer patients.
  • an element means one element or more than one element.
  • a “subject” or “patient” is a vertebrate, including a mammal, such as a human.
  • Mammals include, but are not limited to, humans, farm animals, sport animals and pets.
  • biological sample refers to samples obtained from a subject, including, but not limited to, skin, hair, tissue, blood, plasma, serum, cells, sweat, saliva, feces, and/or urine.
  • biologically active fragments or “bioactive fragment” of the polypeptides encompasses natural or synthetic portions of the full length protein that are capable of specific binding to their natural ligand or of performing the function of the protein.
  • a “functional” or “active” biological molecule is a biological molecule in a form in which it exhibits a property by which it is characterized.
  • a functional enzyme for example, is one which exhibits the characteristic catalytic activity by which the enzyme is characterized.
  • fragment is a portion of an amino acid sequence, comprising at least one amino acid, or a portion of a nucleic acid sequence comprising at least one nucleotide.
  • fragment and “segment” are used interchangeably herein.
  • fragment as applied to a protein or peptide, can ordinarily be at least about 3-15 amino acids in length, at least about 15-25 amino acids, at least about 25-50 amino acids in length, at least about 50-75 amino acids in length, at least about 75-100 amino acids in length, and greater than 100 amino acids in length.
  • fragment as applied to a nucleic acid, may ordinarily be at least about 20 nucleotides in length, typically, at least about 50 nucleotides, more typically, from about 50 to about 100 nucleotides, at least about 100 to about 200 nucleotides, at least about 200 nucleotides to about 300 nucleotides, at least about 300 to about 350, at least about 350 nucleotides to about 500 nucleotides, at least about 500 to about 600, at least about 600 nucleotides to about 620 nucleotides, at least about 620 to about 650, or the nucleic acid fragment may be greater than about 650 nucleotides in length.
  • binding refers to the adherence of molecules to one another, such as, but not limited to, enzymes to substrates, ligands to receptors, antibodies to antigens, DNA binding domains of proteins to DNA, and DNA or RNA strands to complementary strands.
  • Binding partner refers to a molecule capable of binding to another molecule.
  • health care provider or worker includes either an individual or an institution that provides preventive, curative, promotional or rehabilitative health care services to a subject, such as a patient.
  • the data is provided to a health care provider so that they may use it in their diagnosis/treatment of the patient.
  • standard refers to something used for comparison, such as control or a healthy subject.
  • primer refers to a nucleic acid capable of acting as a point of initiation of synthesis along a complementary strand when conditions are suitable for synthesis of a primer extension product.
  • the synthesizing conditions include the presence of four different bases and at least one polymerization-inducing agent such as reverse transcriptase or DNA polymerase. These are present in a suitable buffer, which may include constituents which are co-factors or which affect conditions such as pH and the like at various suitable temperatures.
  • a primer is preferably a single strand sequence, such that amplification efficiency is optimized, but double stranded sequences can be utilized.
  • Primers are typically at least about 15 nucleotides. In embodiments, primers can have a length of anywhere from 15 to 2000 nucleotides. In embodiments, primers have a melting temperature of at least 50°C, 52°C, 55°C, 58°C, 60°C, or 65°C.
  • a probe refers to a nucleic acid that hybridizes to a target sequence.
  • a probe includes about eight nucleotides, about 10 nucleotides, about 15 nucleotides, about 20 nucleotides, about 25 nucleotides, about 30 nucleotides, about 40 nucleotides, about 50 nucleotides, about 60 nucleotides, about 70 nucleotides, about 75 nucleotides, about 80 nucleotides, about 90 nucleotides, about 100 nucleotides, about 110 nucleotides, about 115 nucleotides, about 120 nucleotides, about 130 nucleotides, about 140 nucleotides, about 150 nucleotides, about 175 nucleotides, about 187 nucleotides, about 200 nucleotides, about 225 nucleotides, or about 250 nucleotides.
  • a probe can further include a detectable label.
  • Detectable labels include, but are not limited to, a fluorophore (e.g., Texas Red®, fluorescein isothiocyanate, etc.) and a hapten (e.g., biotin).
  • a detectable label can be covalently attached directly to a probe oligonucleotide, e.g., located at the probe's 5' end or at the probe's 3' end.
  • a probe including a fluorophore may also further include a quencher, e.g., Black Hole Quencher™, Iowa Black™, etc.
  • Ovarian cancer is a cancerous growth arising from different parts of the ovary. Most (>90%) ovarian cancers are classified as “epithelial” and were believed to arise from the surface (epithelium) of the ovary. However, recent evidence suggests that the Fallopian tube could also be the source of some ovarian cancers. Other types arise from the egg cells (germ cell tumor) or supporting cells (sex cord/stromal).
  • Ovarian cancer usually has a poor prognosis. It is disproportionately deadly because it lacks any clear early detection or screening test, meaning that most cases are not diagnosed until they have reached advanced stages. More than 60% of patients presenting with this cancer already have stage III or stage IV cancer, when it has already spread beyond the ovaries. Ovarian cancers shed cells into the naturally occurring fluid within the abdominal cavity. These cells can then implant on other abdominal (peritoneal) structures including the uterus, urinary bladder, bowel and the lining of the bowel wall (omentum) forming new tumor growths before cancer is even suspected.
  • Ovarian cancer causes non-specific symptoms. Most women with ovarian cancer report one or more symptoms such as abdominal pain or discomfort, an abdominal mass, bloating, back pain, urinary urgency, constipation, tiredness and a range of other non-specific symptoms, as well as more specific symptoms such as pelvic pain, abnormal vaginal bleeding or involuntary weight loss. There can be a build-up of fluid (ascites) in the abdominal cavity.
  • Diagnosis of ovarian cancer starts with a physical examination (including a pelvic examination), a blood test (for CA-125 and sometimes other markers), and transvaginal ultrasound. The diagnosis must be confirmed with surgery to inspect the abdominal cavity, take biopsies (tissue samples for microscopic analysis) and look for cancer cells in the abdominal fluid. Treatment usually involves chemotherapy and surgery, and sometimes radiotherapy. In most cases, the cause of ovarian cancer remains unknown. Older women, and those who have a first or second degree relative with the disease, have an increased risk. Hereditary forms of ovarian cancer can be caused by mutations in specific genes (most notably BRCA1 and BRCA2, but also in genes for hereditary nonpolyposis colorectal cancer).
  • Ovarian cancer is classified according to the histology of the tumor, obtained in a pathology report.
  • Surface epithelial-stromal tumor, also known as ovarian epithelial carcinoma, is the most common type of ovarian cancer. It includes serous tumor, endometrioid tumor and mucinous cystadenocarcinoma. Sex cord-stromal tumor, including estrogen-producing granulosa cell tumor and virilizing Sertoli-Leydig cell tumor or arrhenoblastoma, accounts for 8% of ovarian cancers. Germ cell tumor accounts for approximately 30% of ovarian tumors, but only 5% of ovarian cancers. Germ cell tumor tends to occur in young women and girls. The prognosis depends on the specific histology of the germ cell tumor. Mixed tumors, containing elements of more than one of the above classes of tumor histology, are also possible.
  • Ovarian cancer staging is by the FIGO staging system and uses information obtained after surgery, which can include a total abdominal hysterectomy, removal of (usually) both ovaries and fallopian tubes, (usually) the omentum, and pelvic (peritoneal) washings for cytopathology.
  • the AJCC stage is the same as the FIGO stage.
  • the AJCC staging system describes the extent of the primary Tumor (T), the absence or presence of metastasis to nearby lymph Nodes (N), and the absence or presence of distant Metastasis (M).
  • Stage I - limited to one or both ovaries
  • IA - involves one ovary; capsule intact; no tumor on ovarian surface; no malignant cells in ascites or peritoneal washings
  • IB - involves both ovaries; capsule intact; no tumor on ovarian surface; negative washings
  • IC - tumor limited to ovaries with any of the following: capsule ruptured, tumor on ovarian surface, positive washings
  • Stage III - microscopic peritoneal implants outside of the pelvis; or limited to the pelvis with extension to the small bowel or omentum
  • IIIB - macroscopic peritoneal metastases beyond pelvis less than 2 cm in size
  • IIIC - peritoneal metastases beyond pelvis > 2 cm or lymph node metastases
  • Para-aortic lymph node metastases are considered regional lymph nodes (Stage IIIC). As there is only one para-aortic lymph node intervening before the thoracic duct on the right side of the body, the ovarian cancer can rapidly spread to distant sites such as the lung.
  • the AJCC/TNM staging system includes three categories for ovarian cancer, T, N and M.
  • the T category contains three other subcategories, T1, T2 and T3, each of them being classified according to the place where the tumor has developed (in one or both ovaries, inside or outside the ovary).
  • the T1 category of ovarian cancer describes ovarian tumors that are confined to the ovaries, and which may affect one or both of them.
  • the sub-subcategory T1a is used to stage cancer that is found in only one ovary, which has left the capsule intact and which cannot be found in the fluid taken from the pelvis.
  • T1c category describes a type of tumor that can affect one or both ovaries, and which has grown through the capsule of an ovary or is present in the fluid taken from the pelvis.
  • T2 is a more advanced stage of cancer. In this case, the tumor has grown in one or both ovaries and is spread to the uterus, fallopian tubes or other pelvic tissues.
  • Stage T2a is used to describe a cancerous tumor that has spread to the uterus or the fallopian tubes (or both) but which is not present in the fluid taken from the pelvis.
  • Stage T2b indicates cancer that has spread to pelvic tissues other than the uterus and fallopian tubes and that cannot be found in the fluid taken from the pelvis. Stage T2c indicates tumors that have spread to any of the pelvic tissues (including the uterus and fallopian tubes) and that can also be found in the fluid taken from the pelvis.
  • T3 is the stage used to describe cancer that has spread to the peritoneum. This stage provides information on the size of the metastatic tumors (tumors that are located in other areas of the body, but are caused by ovarian cancer). These tumors can be very small, visible only under the microscope (T3a), visible but not larger than 2 centimeters (T3b) and bigger than 2 centimeters (T3c).
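The T3 sub-categories described above reduce to a simple size rule. The following is a minimal Python sketch of that rule only; the function name `t3_subcategory` and the convention of passing `None` for metastases visible only under the microscope are assumptions made for illustration and are not part of the staging system itself.

```python
from typing import Optional


def t3_subcategory(largest_metastasis_cm: Optional[float]) -> str:
    """Return the T3 sub-category for peritoneal metastases.

    largest_metastasis_cm is None when the metastases are visible only
    under the microscope (an encoding assumed for this sketch).
    """
    if largest_metastasis_cm is None:
        return "T3a"  # microscopic metastases only
    if largest_metastasis_cm <= 2.0:
        return "T3b"  # visible, but not larger than 2 centimeters
    return "T3c"      # bigger than 2 centimeters
```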
  • This staging system also uses N categories to describe cancers that have or have not spread to nearby lymph nodes. There are only two N categories, N0 which indicates that the cancerous tumors have not affected the lymph nodes, and N1 which indicates the
  • the M categories in the AJCC/TNM staging system provide information on whether the ovarian cancer has metastasized to distant organs such as liver or lungs. M0 indicates that the cancer did not spread to distant organs and the M1 category is used for cancer that has spread to other organs of the body.
  • the AJCC/TNM staging system also contains a Tx and an Nx sub-category, which indicate, respectively, that the extent of the tumor or the involvement of the lymph nodes cannot be described because of insufficient data.
  • the ovarian cancer stages are made up by combining the TNM categories in the following manner:
  • IB T1b+N0+M0
  • IIIB T3b+N0+M0
  • Ovarian cancer, like any other type of cancer, is graded as well as staged.
  • the histologic grade of a tumor measures how abnormal or malignant its cells look under the microscope. There are four grades indicating how likely the cancer is to spread; the higher the grade, the more likely this is to occur.
  • Grade 0 is used to describe non-invasive tumors.
  • Grade 0 cancers are also referred to as borderline tumors.
  • Grade 1 tumors have cells that are well differentiated (look very similar to the normal tissue) and are the ones with the best prognosis.
  • Grade 2 tumors are also called moderately well differentiated and they are made up by cells that resemble the normal tissue.
  • Grade 3 tumors have the worst prognosis and their cells are abnormal, referred to as poorly differentiated.
  • surgical treatment may be sufficient for malignant tumors that are well-differentiated and confined to the ovary.
  • Addition of chemotherapy may be required for more aggressive tumors that are confined to the ovary.
  • a combination of surgical reduction with a combination chemotherapy regimen is standard. Borderline tumors, even following spread outside of the ovary, are managed well with surgery, and chemotherapy is not seen as useful.
  • Chemotherapy has been a general standard of care for ovarian cancer for decades, although with highly variable protocols. Chemotherapy is used after surgery to treat any residual disease, if appropriate. This depends on the histology of the tumor; some kinds of tumor (particularly teratoma) are not sensitive to chemotherapy. In some cases, there may be reason to perform chemotherapy first, followed by surgery.
  • IP intraperitoneal
  • a method of selecting a treatment for a subject that has ovarian cancer comprises: a) determining whether the subject is likely to have short term or long term survival by a method comprising i) measuring the level of gene expression of at least a set of genes in a sample comprising ovarian cancer cells from the subject; ii) inputting the expression levels of the set of genes into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer to obtain an output score; iii) determining whether the subject is likely to have long term survival by determining if the output score is less than a cutoff value or whether the subject is likely to have short term survival by determining if the output score is greater than or equal to the cutoff value, wherein the cutoff value is a value determined by identifying a value between the 99% confidence interval of the mean output score of a first set of samples from subjects known to
  • the set of genes comprises at least the genes LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4.
  • the set of genes comprises at least the genes SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED.
  • a set of genes comprises CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED.
  • Affymetrix sequences are available either at GenBank or the Affymetrix website. These genes include the following:
  • LYPLA2 such as human lysophospholipase II, is represented, for example, by accession numbers 215566_x_at, NM_007269, NM 007620, or NP_009191 (231 aa;
  • the protein sequence is:
  • TUBA3C such as human tubulin, alpha 3c
  • the protein sequence is:
  • ACTB such as human beta actin
  • accession numbers 200801_x_at, NM_001101 (gI168480144), or NP_001092 (375 aa; gI4501885).
  • the protein sequence is:
  • MED13L such as human mediator complex subunit 13-like, is represented, for example, by accession numbers 212209_at, NM_015335.4(gI300360584), or NP_0561150 (2210 aa;gI44771211).
  • the protein sequence is:
  • OSBPL8 such as human oxysterol binding protein-like 8
  • OSBPL8 is represented, for example, by accession numbers 212585_at, NM_001003712.0 and NM_020841.4 (2 alternative transcripts), or NP_001003712 (847 aa; gI51243032) and NP_065892 (889 aa; gI18079218).
  • the protein sequence for variant 1 is:
  • the protein sequence for variant 2 is:
  • EED such as human embryonic ectoderm development
  • EED is represented, for example, by accession numbers 209572_s_at, NM_003797.2 and NM_152991.1 (2 alternative transcripts), or NP_003788 (441 aa; gI24141020) and NP_694536 (400 aa; gI24041023).
  • the protein sequence for variant 1 is:
  • the protein sequence for variant 2 is:
  • VEDPHKAK SEQ ID NO : 29
  • mRNA sequence for variant 2 is:
  • PKP4 such as human plakophilin 4
  • accession numbers 201929_s_at, NM_001005476.1 and NM_003628.3 (2 alternative transcripts), or NP_001005476 (1149 aa; gI53829378) and NP_003619 (1192 aa; gI53829374).
  • the protein sequence for variant 1 is:
  • RVHFPASTDYSTQYGL STTNYVDFYSTKRPSYRAEQYPGSPDSWV (SEQ ID NO: 13) and the mRNA sequence of variant 1 is:
  • the protein sequence for variant 2 is:
  • SSR1 such as human signal sequence receptor, alpha
  • SSR1 is represented, for example, by accession numbers 200891_s_at, NM_003144.3, or NP_003135(286aa;gll 6904009).
  • the protein sequence is:
  • USP5 such as human ubiquitin specific peptidase 5 (isopeptidase T), is represented, for example, by accession numbers 206031_s_at, NM_001098536.1 and NM_003481.2 (2 alternative transcripts), or NP_001092006 (858 aa; gI148727331) and
  • the protein sequence for variant 1 is:
  • GKYQLFAFISHMGTSTMCGHYVCHIKKEGRWVIYNDQKVCASEKPPKDLGYIYFYQRVAS (SEQ ID NO: 17) and the mRNA sequence for variant 1 is:
  • the protein sequence for variant 2 is:
  • HiK EGR viYNDQ VCASEKPPKDLGYiYFYQRVAS (SEQ ID NO: 33) and the mRNA sequence for variant 2 is:
  • TTFNSIMKCDVDIR DLYANTVLSGGTTMYPGIADRMQKEITALAPSTMKIKIIAPPE
  • RKYSv iGGSiLASLSTFQQ WisKQEYDESGPSivHRKCF SEQ ID NO: 19
  • the mRNA sequence is:
  • HLCS such as human holocarboxylase synthetase (biotin-(proprionyl-CoA- carboxylase (ATP-hydrolysing)) ligase)
  • accession numbers 209399_at, NM_000411.5 and NM_001242785 and NM_001242784 (three alternative transcripts), or NP_000402 (726 aa; gI46255045) and NP_001229713 (726 aa; gI338753397) and
  • the protein sequence is:
  • NDUFB1 such as human NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 1, 7kDa, is represented, for example, by accession numbers 206790_s_at, NM_004545.3, or NP 38569473 (105aa;gI38569473).
  • the protein sequence is:
  • AAIMWLLQIVI ⁇ HWVHVLVPMGFVIGCYLDR SDERLTAFR KSMLF RELQPSEEVT (SEQ ID NO: 23) and the mRNA sequence is:
  • CDC42 such as human cell division cycle 42 (GTP binding protein, 25kDa), is represented, for example, by accession numbers 208728_s_at, NM_001039802.1,
  • NM_001791.3 and NM_044472.2 (three alternative transcripts), or
  • the protein sequence for variant 1 is:
  • EPPEPKKSRRCVLL SEQ ID NO: 25
  • mRNA for variant 1 is:
  • the protein sequence for variant 2 is:
  • TQRGLK VFDEAILAALEPPETQPKRKCCIF (SEQ ID NO: 35) and the mRNA sequence for variant 2 is:
  • the protein sequence for variant 3 is:
  • TQKGLKNVFDEAiLAALEPPEPKKSRRCVLL (SEQ ID NO: 37) and the mRNA sequence for variant 3 is:
  • Primers can be designed in accord with a number of criteria using Primer design programs such as Premier Primer (biosoft), Oligo Primer Analysis software, and Oligo Perfect (Life Technologies) and other free and commercially available software. Probes can be designed using free and commercially available software including Array Designer (biosoft), and Light Cycler Probe design software (Roche). Primers and/or probes can be detectably labeled in accord with standard methods. Probes can be attached to a solid surface such as a slide, a well in a multiwell plate, and/or a chip.
  • primers and/or probes are designed to specifically bind to each of the nucleic acids encoding CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, EED, SSR1, USP5, NDUFB1, OSBPL8, and PKP4.
  • a custom array can be prepared that contains no more than 200 probes, including at least 12 probes, one for each of the identified genes.
  • the primers or probes are not designed to bind to the polyA tail.
  • the primers and/ or probes specifically bind to the nucleic acid sequences under standard PCR or microarray conditions.
  • those conditions include 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 2X standard saline citrate (SSC), 0.1% SDS at 50°C; preferably in 7% SDS, 0.5 M NaPO4, 1 mM EDTA at 50°C.
  • Each of the genes identified herein as useful in determining a short term or long term survivor can have one or more variants that are known and primers and probes can be designed to detect all variants and/or each variant.
  • Variants include those nucleic acids or proteins that have a "substantially homologous nucleic acid sequence" or "substantially identical nucleic acid sequence", or "substantially homologous amino acid sequences" or "substantially identical amino acid sequences".
  • Homologous refers to the subunit sequence similarity between two polymeric molecules, e.g., between two nucleic acid molecules, e.g., two DNA molecules or two RNA molecules, or between two polypeptide molecules. When a subunit position in both of the two molecules is occupied by the same monomeric subunit, e.g., if a position in each of two DNA molecules is occupied by adenine, then they are homologous at that position.
  • the homology between two sequences is a direct function of the number of matching or homologous positions, e.g., if half (e.g., five positions in a polymer ten subunits in length) of the positions in two compound sequences are homologous then the two sequences are 50% homologous, if 90% of the positions, e.g., 9 of 10, are matched or homologous, the two sequences share 90% homology.
  • the DNA sequences 3'ATTGCC5' and 3'TATGGC5' share 50% homology.
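The position-by-position definition of homology given above can be checked with a short calculation. The Python sketch below is illustrative only: the function name `percent_homology` is hypothetical, and it assumes two ungapped sequences of equal length, which is the simplest case described in the text and not a substitute for BLAST-style alignment.

```python
def percent_homology(seq_a: str, seq_b: str) -> float:
    """Percentage of positions occupied by the same monomer in both sequences.

    Assumes ungapped sequences of equal, non-zero length (a simplification
    made for this sketch).
    """
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# The two example sequences match at 3 of their 6 positions.
print(percent_homology("ATTGCC", "TATGGC"))  # 50.0
```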
  • the determination of percent identity between two nucleotide or amino acid sequences can be accomplished using a mathematical algorithm. For example, a
  • BLAST protein searches can be performed with the XBLAST program (designated "blastn" at the NCBI web site) or the NCBI "blastp” program, using the following parameters:
  • Gapped BLAST can be utilized as described in Altschul et al.
  • PSI-Blast or PHI-Blast can be used to perform an iterated search which detects distant relationships between molecules and relationships between molecules which share a common pattern.
  • the default parameters of the respective programs e.g., XBLAST and NBLAST.
  • the percent identity between two sequences can be determined using techniques similar to those described above, with or without allowing gaps. In calculating percent identity, typically exact matches are counted.
  • a "substantially homologous amino acid sequences" or “substantially identical amino acid sequences” includes those amino acid sequences which have at least about 92%, or at least about 95% homology or identity, including at least about 96% homology or identity, including at least about 97% homology or identity, including at least about 98% homology or identity, and at least about 99% or more homology or identity to an amino acid sequence of a reference antibody chain.
  • Amino acid sequence similarity or identity can be computed by using the BLASTP and TBLASTN programs which employ the BLAST (basic local alignment search tool) 2.0.14 algorithm. The default settings used for these programs are suitable for identifying substantially similar amino acid sequences for purposes of the present invention.
  • conservative amino acid substitution is defined herein as an amino acid exchange within one of the following five groups:
  • "substantially homologous nucleic acid sequence" or "substantially identical nucleic acid sequence" means a nucleic acid sequence corresponding to a reference nucleic acid sequence wherein the corresponding sequence encodes a peptide having substantially the same structure and function as the peptide encoded by the reference nucleic acid sequence; e.g., where only changes in amino acids not significantly affecting the peptide function occur.
  • the substantially identical nucleic acid sequence encodes the peptide encoded by the reference nucleic acid sequence.
  • the percentage of identity between the substantially similar nucleic acid sequence and the reference nucleic acid sequence is at least about 50%, 65%, 75%, 85%, 92%, 95%, 99% or more.
  • Substantial identity of nucleic acid sequences can be determined by comparing the sequence identity of two sequences, for example by physical/chemical methods (i.e., hybridization) or by sequence alignment via computer algorithm.
  • Suitable computer algorithms to determine substantial similarity between two nucleic acid sequences include the GCS program package. The default settings provided with these programs are suitable for determining substantial similarity of nucleic acid sequences for purposes of the present invention.
  • the expression of the nucleic acid such as mRNA of the genes of interest is determined.
  • Levels of mRNA can be quantitatively measured by Northern blotting. A sample of RNA is separated on an agarose gel and hybridized to a radio-labeled RNA probe that is complementary to the target sequence. The radio-labeled RNA is then detected by an autoradiograph.
  • amplification reaction such as polymerase chain reaction (PCR).
  • PCR is RT-PCR.
  • a DNA template from the mRNA is generated by reverse transcription, which is called cDNA.
  • This cDNA template is then used for qPCR, where the fluorescence of a probe changes as the DNA amplification process progresses.
  • qPCR can produce an absolute measurement such as number of copies of mRNA, typically in units of copies per nanolitre of homogenized tissue or copies per cell. qPCR is very sensitive (detection of a single mRNA molecule is possible).
  • Another approach is to individually tag single mRNA molecules with fluorescent barcodes (nanostrings), which can be detected one-by-one and counted for direct digital quantification (Krassen Dimitrov, NanoString Technologies).
  • DNA microarrays can be used to determine the transcript levels for many genes at once (expression profiling). Recent advances in microarray technology allow for the quantification, on a single array, of transcript levels for every known gene in the genomes of several organisms, including humans; alternatively, smaller custom arrays can be utilized.
  • tag based technologies like Serial analysis of gene expression (SAGE), which can provide a relative measure of the cellular concentration of different mRNAs, can be used.
  • the level of expression can be determined using RNA sequencing technology.
  • RNA sequencing technology involves high throughput sequencing of cDNA. mRNA is isolated and reverse transcribed to form a library of cDNA. The cDNA is fragmented to a specific size and optionally may be detectably labeled. The fragments are sequenced and the full sequence is assembled in accord with different platforms such as those provided by Illumina, 454 Sequencing or SOLiD sequencing. In addition, mRNA can be sequenced directly (without conversion to cDNA) using protocols available from Helicos.
  • the expression of the protein from the genes of interest is determined.
  • the expression level can be directly assessed by a number of means with some clear analogies to the techniques for mRNA quantification.
  • the most commonly used method is to perform a Western blot against the protein of interest - this gives information on the size of the protein in addition to its identity.
  • a sample (often cellular lysate) is separated on a polyacrylamide gel, transferred to a membrane and then probed with an antibody to the protein of interest.
  • Other methods include, for example, Enzyme-linked immunosorbent assay (ELISA), lateral flow test, latex agglutination, other forms of immunochromatography, western blot, and/or magnetic immunoassay.
  • Reagents to detect the molecules of interest can be produced by methods available to an art worker or purchased commercially.
  • a method for selecting a treatment of a subject with ovarian cancer comprises inputting the expression levels of the set of genes into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer to obtain an output score.
  • the gene expression analysis of the genes of interest is applied to the equations provided in Figure 9.
  • the gene expression analysis is obtained from microarray analysis using an Affymetrix U133 chip and the data from each gene is produced using the original raw intensity data (CEL files) processed using the MAS5 normalization and background-correction algorithm (510K FDA approved).
  • the gene expression values can be converted to values of the Affymetrix gene expression analysis algorithm using known methods. For example, the gene expression analysis can be run in parallel using PCR or RNA sequencing and using the Affymetrix U133 chip and software. The gene expression values for each gene from PCR or RNA sequencing can be compared to the values generated using Affymetrix system and a conversion factor identified. Gene expression levels for each gene generated by PCR or RNA sequencing can be generated and converted to the output of the Affymetrix algorithm using the conversion factor before inputting gene expression levels for each gene into the functions.
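The conversion step described above can be sketched as follows. The text does not specify how the conversion factor is computed, so this Python sketch assumes, purely for illustration, a single multiplicative factor per gene estimated as the mean ratio of the Affymetrix value to the PCR or RNA sequencing value across samples run in parallel. The function names `conversion_factor` and `to_affymetrix_scale` are hypothetical.

```python
from statistics import mean
from typing import Sequence


def conversion_factor(platform_values: Sequence[float],
                      affymetrix_values: Sequence[float]) -> float:
    """Estimate a per-gene factor from paired, parallel measurements.

    Assumption of this sketch: the factor is the mean ratio of the
    Affymetrix value to the other platform's (non-zero) value.
    """
    if not platform_values or len(platform_values) != len(affymetrix_values):
        raise ValueError("need paired, non-empty measurements")
    return mean(a / p for p, a in zip(platform_values, affymetrix_values))


def to_affymetrix_scale(platform_value: float, factor: float) -> float:
    """Convert one PCR or RNA sequencing value before it is input to a function."""
    return platform_value * factor
```

In use, one factor would be estimated for each gene of the set, and each new measurement would be converted with its own gene's factor before being input into the functions.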
  • F1 = f(LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, PKP4)
  • F2 = f(SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, EED)
  • F3 = f(CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, EED).
  • Each function operates to independently provide a risk assessment of whether the subject is likely to have long term or short term survival.
  • One or more functions can be used together to determine the likelihood that a subject has a risk of short term or long term survival.
  • a method for selecting a treatment for a patient having ovarian cancer comprises determining whether the subject is likely to have long term survival by determining if the output score is less than a cutoff value or whether the subject is likely to have short term survival by determining if the output score is greater than or equal to the cutoff value, wherein the cutoff value is a value determined by identifying a value between the 99% confidence interval of the mean output score of a first set of samples from subjects known to have short term survival and the 99% confidence interval of the mean output score of a second set of samples from subjects known to have long term survival.
  • the disclosure also provides methods for determining a cutoff value.
  • the method for determining the cutoff value comprises determining a mean output score for a first group of patients that are known to have short term survival and a mean output score of a second group of patients known to have long term survival of an original set of patients.
  • the mean output score, the standard deviation, the range, and the 99% confidence interval of each group are determined.
  • a cutoff value is determined that falls between the 99% confidence intervals of the two groups.
  • the cut-off score of the F1 model was determined to be 21.388.
  • the upper limit of the 99% confidence interval for the long term survivors was 20.663 and the lower limit of the 99% confidence interval for the short term survivors was 22.924.
  • the difference between the two groups is 2.261 and in one embodiment, this value is divided in half and then added to the upper value for the long term survivors; that constitutes the middle point between the two groups.
  • the cutoff is set within that difference between the 99% confidence intervals of the groups and adjusted up or down from the aforementioned middle point according to the magnitude of the standard deviation of the two groups, i.e. the cutoff is moved from the middle point away from the group that has the larger standard deviation and closer to the other group (the one with the smaller standard deviation).
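The middle-point arithmetic described above can be worked through in a few lines. The Python sketch below is illustrative: the function names are hypothetical, and the confidence interval helper uses a normal approximation, which is an assumption because the text does not state which interval formula was used. The size of the standard-deviation adjustment is not specified in the text and is therefore not modeled.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev
from typing import Sequence, Tuple


def ci99_of_mean(scores: Sequence[float]) -> Tuple[float, float]:
    """99% confidence interval of the mean (normal approximation, assumed)."""
    z = NormalDist().inv_cdf(0.995)  # two-sided 99% interval
    half_width = z * stdev(scores) / sqrt(len(scores))
    centre = mean(scores)
    return centre - half_width, centre + half_width


def midpoint_cutoff(long_term_upper: float, short_term_lower: float) -> float:
    """Middle point between the two 99% confidence intervals."""
    gap = short_term_lower - long_term_upper
    return long_term_upper + gap / 2


# Limits reported in the text for the F1 model.
print(round(22.924 - 20.663, 3))                   # 2.261
print(round(midpoint_cutoff(20.663, 22.924), 4))   # 21.7935
```

The reported F1 cutoff of 21.388 lies below this middle point, which is consistent with an adjustment of the kind described above.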
  • the cutoff value is determined by a method comprising calculating an optimal point on the ROC curve based on the 34 scores of the 34 original subjects used in the discovery study [optimal point is defined as the point with the highest sensitivity and the lowest false positive rate (1 - specificity)] for a first group of short term survivors and a second group of long term survivors. That optimal point (the score of one of the 34 original subjects), which represents, according to ROC curve analysis, the best cutoff point for all of the 34 original subjects' scores, itself may be used as the cutoff point.
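A ROC-based choice of cutoff of the kind described above might look like the following Python sketch. The text defines the optimal point as the one with the highest sensitivity and the lowest false positive rate; this sketch combines the two by maximizing their difference (Youden's J statistic), which is an assumption of the example and not something the text specifies. The function name `best_roc_cutoff` is hypothetical, and short term survival is treated as the positive class.

```python
from typing import Sequence, Tuple


def best_roc_cutoff(short_term_scores: Sequence[float],
                    long_term_scores: Sequence[float]) -> Tuple[float, float, float]:
    """Return (cutoff, sensitivity, false_positive_rate).

    Every observed score is tried as a cutoff; a score greater than or
    equal to the cutoff predicts short term survival.
    """
    candidates = sorted(set(short_term_scores) | set(long_term_scores))
    best = None
    for cutoff in candidates:
        sensitivity = sum(s >= cutoff for s in short_term_scores) / len(short_term_scores)
        false_positive_rate = sum(s >= cutoff for s in long_term_scores) / len(long_term_scores)
        j = sensitivity - false_positive_rate  # Youden's J (assumed criterion)
        if best is None or j > best[0]:
            best = (j, cutoff, sensitivity, false_positive_rate)
    _, cutoff, sensitivity, false_positive_rate = best
    return cutoff, sensitivity, false_positive_rate
```

As in the text, the cutoff returned is always the score of one of the original subjects.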
  • the cutoff value for the F1 function is 21.388, for the F2 function is 14.3, and for the F3 function is 14.7.
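The decision rule that applies these cutoff values can be written compactly. In the Python sketch below, the cutoff values are the ones stated above, while the names `CUTOFFS` and `predict_survival` are hypothetical and chosen for illustration.

```python
# Cutoff values stated in the text for the three functions.
CUTOFFS = {"F1": 21.388, "F2": 14.3, "F3": 14.7}


def predict_survival(function_name: str, output_score: float) -> str:
    """Apply the rule: score >= cutoff -> short term, otherwise long term."""
    cutoff = CUTOFFS[function_name]
    return "short term" if output_score >= cutoff else "long term"
```

A score exactly equal to the cutoff is classified as short term survival, matching the "greater than or equal to" wording of the method.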
  • the method for determining the cutoff values further comprises verifying the validity of the cutoff value by obtaining output scores for a second set of patients (validation set) whose status as a long term survivor or short term survivor is hidden from the tester.
  • the output scores are compared to the cutoff values for each function; if the output score of a patient's sample in the validation set is greater than or equal to the cutoff value, the patient is predicted to be a short term survivor, and if it is less than the cutoff value, a long term survivor.
  • the status of the patient is unblinded and the validity of the cutoff value is determined by determining whether the cutoff value provides a sensitivity of at least 90% and a specificity of at least 90%.
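The validation check described above can be sketched in Python as follows. The sketch treats short term survival as the positive class, which is an assumption because the text does not say which group is "positive"; the function name `validate_cutoff` and its parameters are hypothetical.

```python
from typing import Sequence, Tuple


def validate_cutoff(scores: Sequence[float],
                    is_short_term: Sequence[bool],
                    cutoff: float,
                    minimum: float = 0.90) -> Tuple[float, float, bool]:
    """Return (sensitivity, specificity, passes) for an unblinded validation set."""
    tp = fn = tn = fp = 0
    for score, truly_short in zip(scores, is_short_term):
        predicted_short = score >= cutoff  # same rule as used for prediction
        if truly_short:
            tp += predicted_short
            fn += not predicted_short
        else:
            fp += predicted_short
            tn += not predicted_short
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("validation set must contain both survivor groups")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return sensitivity, specificity, sensitivity >= minimum and specificity >= minimum
```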
  • a method comprises displaying whether the output score is less than a cutoff value indicating that the subject is a long term survivor or greater than or equal to the cutoff indicating that the subject is a short term survivor so that the health care worker can select a treatment for the subject.
  • the health care worker may select one or more standard therapy options.
  • standard therapy options include chemotherapy, surgery, and/or radiation.
  • Standard chemotherapeutic options include treatment with one or more of cyclophosphamide, Taxol, Platinum, Carboplatin, Cisplatin, Gemcitabine, Topotecan, Oxaliplatin, Doxorubicin, Paclitaxel, Docetaxel, and combinations thereof.
  • the health care worker may select a more aggressive treatment in addition to or in place of the standard chemotherapy.
  • treatment includes treatment with a cancer vaccine, angiogenesis inhibitors, tubulin binding inhibitors, taxane analogs, actin polymerization inhibitors, adoptive cell therapy, and protein ubiquitination inhibitors.
  • Examples of compounds that can be utilized include Avastin, Votrient, SIK2 inhibitors, Vinblastine, ixabepilone, epothilone B, imatinib, atorvastatin, sirolimus, bestatin,
  • the chemotherapy treatment includes treatment with an inhibitor of ACTB, TUBA3C, CDC42, and combinations thereof.
  • the methods of the invention may be employed on a set of patients to identify a responder group or a nonresponder group in a clinical trial, for example.
  • for a new therapeutic agent, it is useful to know whether the therapeutic agent has different effects in the responder population versus the nonresponder population.
  • a group of patients having ovarian cancer are identified as responders or nonresponders and are then treated with a potential therapeutic agent. Safety and efficacy of the drug are assessed in responder and nonresponder populations.
  • Another aspect of the disclosure includes methods for screening therapeutic agents. Identification of ovarian cancer tissue samples as nonresponders and responders can be used to screen therapeutic effectiveness of the potential therapeutic agent on both types of patient populations.
  • cell lines may be developed from ovarian cancer tissue from nonresponders and responders using standard methods in order to provide for high throughput analysis.
  • a method for screening agents for treating ovarian cancer comprises contacting an ovarian cancer sample identified as a nonresponder or responder with a potential agent for treating ovarian cancer; and determining whether the agent decreases the growth or spread of the ovarian cancer sample, or changes the gene expression profile of the first set of genes, the second set of genes, the third set of genes or all sets of genes.
  • the method further comprises identifying an ovarian cancer sample as from a responder or nonresponder by determining the expression level of a first set of genes comprising LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4, a second set of genes comprising SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED, or a third set of genes comprising CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED, in a sample from the patient.
  • the potential therapeutic agents are those that interact with any one of the genes of a first set of genes comprising LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4, a second set of genes comprising SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED, or a third set of genes comprising CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED, or all sets of genes in a sample from the patient. Examples of such agents are listed above.
  • Drugs or chemicals similar to those known drugs in mechanism of action may be screened using nonresponder and responder ovarian cancer cells or cell lines as a measure of their efficacy in each of the patient groups.
  • Other drugs or agents may also be those that are selected to act on other genes that are known to interact with any of the genes in the first or second set of genes.
  • the genes in the first, second, and/or third set of genes are targets to develop new therapeutics which can be tested on ovarian cancer cells identified as responders or nonresponders.
  • High throughput assays such as multiwell plate assays or arrays with cells attached to nanobeads can be utilized to test a number of therapeutic compounds for any effects on the responder or nonresponder cell types with regard to inhibition of cell growth, cell death, or change in gene expression of one or more of the genes of the first set of genes, the second set of genes, the third set of genes or all sets of genes. Those agents effective on both the responder and nonresponder population may be selected for further development. In other embodiments, an agent effective on either a responder or nonresponder cell type is selected and the patient group is sorted as responders and nonresponders for further testing of the agent effective in the respective responder or nonresponder cell type.
Kits
  • a kit comprises a primer or a probe or both that specifically hybridizes to each gene of a set of genes comprising LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4.
  • the kit comprises a primer or a probe or both that specifically hybridizes to each gene of a set of genes comprising SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED.
  • a kit comprises a primer or a probe or both that specifically hybridizes to each gene of a set of genes comprising CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED.
  • Primers can be designed in accord with a number of criteria using Primer design programs such as Premier Primer (biosoft), Oligo Primer Analysis software, and Oligo Perfect (Life Technologies) and other free and commercially available software. Probes can be designed using free and commercially available software including Array Designer (biosoft), and Light Cycler Probe design software (Roche). Primers and/or probes can be detectably labeled in accord with standard methods. Probes can be attached to a solid surface such as a slide, a well in a multiwell plate, and/or a chip.
  • a primer and/or probe is designed to specifically bind to each of the nucleic acids encoding CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, EED, SSR1, USP5, NDUFB1, OSBPL8, and PKP4.
  • a custom array can be prepared that contains no more than 200 probes, including at least 12 probes, one for each of the identified genes.
  • the primers and/ or probes specifically bind to the nucleic acid sequences under standard PCR or microarray conditions.
  • those conditions include 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 2X standard saline citrate (SSC), 0.1% SDS at 50°C; preferably in 7% SDS, 0.5 M NaPO4, 1 mM EDTA at 50°C.
  • a hybridization buffer includes 25% formamide, 2.5x SSC, 0.5% SDS and 1x Denhardt's, and the primers and probes are incubated at 42°C for 1 hour followed by two washes of 0.5 SSC and 0.5% SDS.
  • the kit contains no more than 200 primers or probes or both, no more than 175 primers, probes or both, no more than 150 primers, probes or both, no more than 125 primers, probes or both, no more than 100 primers, probes or both, no more than 75 primers, probes or both, no more than 50 primers, probes or both, no more than 25 primers, probes or both, or no more than 15 primers, probes or both.
  • the kit can comprise or consist essentially of other reagents for detecting the gene expression level of the identified genes.
  • the kit may also contain primers or probes for detecting one or more housekeeping genes as a positive control.
  • the kit does not contain probes for any other genes that are predictive of short term or long term survivorship of ovarian cancer other than the genes identified herein.
  • the kit further comprises instructions for inputting the gene expression values into function 1, function 2, function 3, or combinations thereof to obtain an output score.
  • the instructions further provide comparing the output score for each function to a cutoff value and determining if the subject is likely to have long term survival if the output score is less than the cutoff value or if the subject is likely to have short term survival if the subject has an output score greater than or equal to the cutoff value for each function.
  • a kit further comprises a computer readable storage medium having computer-executable instructions that, when executed by a computing device, cause the computing device to perform a step comprising: calculating an output score by inputting gene expression levels of a set of genes into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer.
  • the computer readable storage medium having computer-executable instructions that, when executed by a computing device, cause the computing device to perform a step comprising: comparing the output score to a cutoff value and displaying whether the subject is likely to have long term survival if the output score is less than the cutoff value or if the subject is likely to have short term survival if the subject has an output score greater than or equal to the cutoff value for each function.
  • the set of genes comprises at least the genes LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4.
  • the set of genes comprises at least the genes SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED.
  • a set of genes comprises CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED.
  • the function is selected from the group consisting of function 1, function 2, and function 3.
  • the detection, prognosis and/or diagnosis method can employ the use of a processor/computer system.
  • a general purpose computer system comprising a processor coupled to program memory storing computer program code to implement the method, to working memory, and to interfaces such as a conventional computer screen, keyboard, mouse, and printer, as well as other interfaces, such as a network interface, and software interfaces including a database interface finds use in one embodiment described herein.
  • the computer system accepts user input from a data input device, such as a keyboard, input data file, or network interface, or another system, such as the system interpreting, for example, the microarray or PCR data, and provides an output to an output device such as a printer, display, network interface, or data storage device.
  • a data input device such as a keyboard, input data file, or network interface
  • another system such as the system interpreting, for example, the microarray or PCR data
  • an output device such as a printer, display, network interface, or data storage device.
  • Input device for example a network interface, receives an input comprising detection of the proteins/nucleic acids described herein and/or quantification of those compounds.
  • the output device provides an output such as a display, including one or more numbers and/or a graph depicting the detection and/or quantification of the compounds.
  • Computer system is coupled to a data store which stores data generated by the methods described herein. This data is stored for each measurement and/or each subject; optionally a plurality of sets of each of these data types is stored corresponding to each subject.
  • One or more computers/processors may be used, for example, as a separate machine, for example, coupled to computer system over a network, or may comprise a separate or integrated program running on computer system. Whichever method is employed these systems receive data and provide data regarding detection/diagnosis in return.
  • a method for selecting a treatment for a subject that has ovarian cancer comprises calculating an output score, using a computing device, by inputting gene expression levels of a set of genes into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer; and displaying the output score, using a computing device.
  • the method further comprises determining whether the output score is greater than or equal to or less than a cutoff value, using a computing device; and displaying whether the subject is likely to be a short term survivor if the output score is greater than or equal to the cutoff value or long term survivor if the output score is less than the cutoff value.
  • a computing device comprises a processing unit;
  • system memory connected to the processing unit, the system memory including instructions that, when executed by the processing unit, cause the processing unit to: calculate an output score by inputting gene expression levels of a set of genes into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer; and display the output score.
  • the system memory includes instructions that when executed by the processing unit, cause the processing unit to determine whether the output score is greater than or equal to or less than a cutoff value; and displaying whether the subject is likely to be a short term survivor if the output score is greater than or equal to the cutoff value or long term survivor if the output score is less than the cutoff value.
  • the set of genes comprises at least the genes LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4.
  • the set of genes comprises at least the genes SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED.
  • a set of genes comprises CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED.
  • the function is selected from the group consisting of function 1, function 2, and function 3.
  • the platform technology as developed by Dr. Jason B. Nikas, and as presented in part in Nikas et al. 2010 (2), in Nikas and Low 2011(a) (3), and in Nikas and Low 2011(b) (4), identified three biomarkers (complex mathematical functions of original mRNA variables, see Figure 9 and discussion above) that allowed one to distinguish between long-term and short-term survivors or between responders and non-responders, respectively.
  • the three biomarker panels are as follows:
  • F1 = f(LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, PKP4)
  • F2 = f(SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, EED)
  • F3 = f(CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, EED).
  • the cut-off score of the F1 prognostic biomarker model was determined by taking into account the results of the following two analyses: 1) calculation of the optimal point on the ROC curve based on the 34 scores of the 34 original subjects used in the discovery study [optimal point is defined as the point with the highest sensitivity and the lowest false positive rate (1 - specificity)] and 2) calculation of the 99.99% confidence intervals for the mean F1 scores of the two groups (R/LTS and NR/STS) and their respective standard deviations. Based on that, the cut-off score of the F1 model was determined to be 21.388.
  • if a subject has an F1 score less than 21.388, then that subject is classified as an R/LTS; otherwise, that subject is classified as an NR/STS.
  • the F1 model correctly identified all (14/14) R/LTS subjects and 19/20 NR/STS subjects.
  • our target group is the R/LTS (responder/long term survivor) and our reference group is the NR/STS (non-responder/short term survivor).
  • the mean F1 score of the 14 R/LTS subjects was 17.9358 (top of clear bar) and the standard deviation (whisker above or below the top of the clear bar) was 2.9622; whereas the mean F1 score of the 20 NR/STS subjects was 25.4697 (top of dark bar) and the standard deviation (whisker above or below the top of the dark bar) was 3.3651.
  • the F1 is parametrically distributed with respect to both groups.
  • the cut-off score was determined to be 14.694, signifying that a score less than 14.694 belongs to an R/LTS subject, whereas a score greater than 14.694 belongs to an NR/STS subject.
  • the F3 model correctly identified all (14/14) R/LTS subjects and 19/20 NR/STS subjects.
  • the sensitivity and specificity of the F3 model were 1.000 and 0.950, respectively; with regard to survival, its sensitivity and specificity were 0.950 and 1.000, respectively.
  • Figure 2 and Tables 1A and 1B show all pertinent statistical results of the F3 prognostic biomarker model in connection with the discovery study in great detail.
  • the mean F2 score of the 14 R/LTS subjects was 13.4223 (top of clear bar) and the standard deviation (whisker above or below the top of the clear bar) was 0.8905; whereas the mean F2 score of the 20 NR/STS subjects was 15.1843 (top of dark bar) and the standard deviation (whisker above or below the top of the dark bar) was 0.6407.
  • the mean F3 score of the 14 R/LTS subjects was 13.8864 and the standard deviation was 0.7017; whereas the mean F3 score of the 20 NR/STS subjects was 15.3433 and the standard deviation was 0.6082.
  • Table 1A Statistical results of all three prognostic models and predicted group mean values for future LTS and STS subjects.
  • Table 1B shows the response to treatment results (here, the LTS subjects are the target group and the STS subjects are the reference group). Table 1B. Statistical results of all three prognostic models for future Responders (LTS) and Non-Responders (STS).
  • the aforementioned diagnostic biomarkers were validated with 20 new, unknown subjects (10 long-term survivors and 10 short-term survivors).
  • the validation (qualification) results are shown in Table 2A (here, the LTS subjects are the reference group and the STS subjects are the target group).
  • Table 2A Statistical results of all three prognostic models with respect to the 20 new, unknown subjects, along with the observed group mean values of those unknown subjects.
  • the observed group mean values of the 20 new, unknown subjects fall within the respective confidence intervals as predicted by all three models (see Table 1A).
  • Table 2B shows the response to treatment results (here, the LTS subjects are the target group and the STS subjects are the reference group). Table 2B. Statistical results of all three prognostic models for the 20 new, unknown subjects as Responders (LTS) & Non-Responders (STS).
  • LTS Responders
  • STS Non-Responders
  • FIG 8. A 3-dimensional plot of F1 vs. F2 vs. F3 is shown in Figure 8. As can be seen, all three prognostic biomarkers correctly prognosed all 20 new, unknown subjects (10 LTS (responders) and 10 STS (non-responders)) (complete segregation of the long-term and short-term survival groups).
  • the aforementioned 12 genes can be categorized into three general groups: 1) genes that regulate the expression of cytostructural proteins, 2) genes that regulate cell proliferation, and 3) genes that regulate metabolism.
  • ACTB ACTB
  • TUBA3C cytoskeletal proteins
  • CDC42 promotes the polymerization of actin into microfilaments
  • MED13L The following genes, whose function pertains to cell proliferation in general, compose the second group: MED13L, SSR1, PKP4, EED, and USP5.
  • the MED13L protein also known as, among other names, THRAP2, TRAP240L, and KIAA1025
  • the Mediator complex a group of about 30 transcriptional co-activators that play various regulatory roles in the induction of RNA polymerase II transcription. Compositional differences may account for different functions among the Mediator proteins; for instance some promote transcription, whereas others act as transcriptional repressors.
  • Mediator proteins are novel, and, consequently, their exact function is not known, including that of MED13L.
  • MED13L the MED13L gene
  • over-expression of the TP53 gene (p53) in human colon carcinoma cell lines relative to controls suppresses the expression of MED13L (KIAA1025). That could very well explain our finding that the MED13L gene was significantly under-expressed in the NR/STS group relative to the R/LTS group by affirming the existence of a more aggressive EOC cancer in the case of the former group in comparison with the latter one.
  • SSR1 is an ER (endoplasmic reticulum) receptor part of the translocon-associated protein (TRAP) complex.
  • the SSR1 gene was significantly under-expressed in the NR/STS group relative to the R/LTS group.
  • the PKP4 protein (aka p0071) belongs to the family of arm-repeat proteins, which are involved in cell adhesion. According to the results of our analysis, the PKP4 gene was significantly under-expressed in the NR/STS group relative to the R/LTS group, and that accords with the observation that metastatic cancer cells rely on greater cell mobility and, thus, lower cell adhesion.
  • the EED protein is part of the Polycomb-group (PcG) proteins involved in repressive transcriptional control mediated via histone deacetylation
  • PcG Polycomb-group
  • USP5 belongs to the largest class of deubiquitinating enzymes (USPs) that regulate protein ubiquitination, a post-translational modification of cellular proteins.
  • USPs deubiquitinating enzymes
  • the third group comprises genes whose function is involved in metabolism in general and lipid metabolism in particular. Those genes are: LYPLA2, OSBPL8, HLCS, and NDUFB1.
  • LYPLA2 is the enzyme that catalyzes the hydrolysis of 2-lysophosphatidylcholine (which, along with arachidonic acid, is derived from the hydrolysis of phosphatidylcholine, a phospholipid that is a major component of cell membranes) to glycerophosphocholine.
  • the protein OSBPL8 is an intracellular lipid receptor that belongs to the family of oxysterols (oxygenated cholesterol derivatives).
  • LXR liver X receptors
  • HLCS is an enzyme that catalyzes the covalent biotinylation of the five crucial mammalian carboxylase enzymes: pyruvate carboxylase (PC), acetyl-CoA carboxylase 1 and 2 (ACC1 and ACC2), 3-methylcrotonyl-CoA carboxylase (MCC), and propionyl-CoA carboxylase (PCC).
  • PC pyruvate carboxylase
  • ACC1 and ACC2 acetyl-CoA carboxylase 1 and 2
  • MCC 3-methylcrotonyl-CoA carboxylase
  • PCC propionyl-CoA carboxylase
  • the NDUFB1 dehydrogenase (ubiquinone) 1 beta subcomplex constitutes the mitochondrial Complex I, a very large multiprotein enzyme which is located in the inner mitochondrial membrane, and which catalyzes the first step of the electron transport chain, the redox machinery of the oxidative phosphorylation. It has been observed by multiple studies that, owing to their surrounding hypoxic environment, tumor cells rely to a much larger extent on anaerobic glycolysis to produce energy rather than on oxidative phosphorylation.
  • Taxol is an anti-tubulin chemotherapeutic agent that acts as a mitotic inhibitor. More specifically, it increases polymerization of microtubules from α-β tubulin heterodimers, and it stabilizes microtubules by preventing their depolymerization.
  • the CDC42 gene not only promotes the polymerization of actin into microfilaments, the reorganization of the actin cytoskeleton, and cell formation, growth, and spreading; but also it can regulate the polarization of both the actin and the microtubule cytoskeleton. Theoretically, therefore, over-expression of the CDC42 gene can overcome the action of taxol, as well; and that is the finding of our analysis: the CDC42 gene was significantly over-expressed in the NR/STS group relative to the R/LTS group.
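
The cutoff rule and the discovery-study counts stated in the list above can be illustrated with a short sketch. It is not the patented implementation: the function names `classify` and `sensitivity_specificity` are hypothetical, and only values quoted in the text (the F1 and F3 cutoffs, the F1 group means, and the 14/14 and 19/20 counts) are used.

```python
# Cutoff values quoted in the text above; the F2 cutoff is left out because
# this excerpt gives it only in rounded form in a figure caption.
CUTOFFS = {"F1": 21.388, "F3": 14.694}


def classify(score: float, cutoff: float) -> str:
    """Return 'R/LTS' for a score below the cutoff, otherwise 'NR/STS'."""
    return "R/LTS" if score < cutoff else "NR/STS"


def sensitivity_specificity(target_correct: int, target_total: int,
                            reference_correct: int, reference_total: int):
    """Sensitivity is the fraction of the target group called correctly;
    specificity is the fraction of the reference group called correctly."""
    return target_correct / target_total, reference_correct / reference_total


if __name__ == "__main__":
    # Reported discovery-study group means for F1.
    print(classify(17.9358, CUTOFFS["F1"]))
    print(classify(25.4697, CUTOFFS["F1"]))
    # Response to treatment: R/LTS is the target group (14/14 correct) and
    # NR/STS is the reference group (19/20 correct).
    sens, spec = sensitivity_specificity(14, 14, 19, 20)
    print(f"sensitivity={sens:.3f} specificity={spec:.3f}")
```

Swapping the target and reference groups, as the text does when it reports results with regard to survival, simply exchanges the two returned values.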

Abstract

The present invention relates to methods for selecting a treatment for a subject having ovarian cancer, in particular methods to distinguish between ovarian cancer patients who will respond to the taxol/platinum chemotherapy and survive longer than seven years versus those who will succumb to the disease within three years.

Description

KITS AND METHODS FOR SELECTING A TREATMENT FOR OVARIAN CANCER
This application is being filed on 02 May 2012, as a PCT International Patent application in the name of Applied Informatic Solutions, Inc., a U.S. national corporation, applicant for the designation of all countries except the U.S., and Jason Basil Nikas, a citizen of the U.S., Walter Cheney Low, a citizen of the U.S., Amy Patrice Skubitz, a citizen of the U.S., and Kristin Louise Murgic Boylan, a citizen of the U.S., applicants for the designation of the U.S. only, and claims priority to U.S. Patent Application Serial No. 61/481,556 filed on 02 May 2011, the disclosure of which is incorporated herein by reference in its entirety.
Statement of Government Rights
This invention was made with government support under Grant No. T32 DA007097 awarded by the National Institutes of Health (NIH). The government has certain rights in the invention.
Background of the Invention
Ovarian cancer is the most lethal gynecological malignancy in the U.S., due in part to the subtlety of its symptoms. It accounts for ~ 3% of all cancers in women in the U.S.
Approximately 24,000 new cases of ovarian cancer are diagnosed each year in the U.S., resulting in about 16,000 deaths per year. Current diagnostic tests are neither adequately sensitive nor specific; consequently, the majority of ovarian cancer patients are diagnosed with advanced disease. Standard therapy for ovarian cancer involves debulking surgery to reduce tumor burden followed by chemotherapy with a combination of platinum and paclitaxel (e.g., TAXOL®). Initially, up to 80% of ovarian cancer patients respond to chemotherapy; however, most patients relapse in less than 2 years.
Taxol® is a mitotic inhibitor, whose mechanism of action is to prevent a) the destabilization of microtubules, necessary to the formation of the mitotic spindle and subsequent chromosomal separation during mitosis and b) the formation of new microtubules, also necessary to the aforementioned mitotic stages. Another cytostructural component to the destabilization of the microtubules and subsequent formation of the mitotic spindle is β actin, a polymeric microfilament that helps hold microtubules together.
Summary of the Invention
One embodiment provides a method to determine if an ovarian cancer patient is a long term survivor or a short term survivor comprising measuring the level of expression of at least one gene in a sample from the patient, wherein the level of expression of the at least one gene in the sample is an indication that the subject is a long term survivor or a short term survivor.
In embodiments, a method of selecting a treatment for a subject having ovarian cancer comprises determining whether a subject having ovarian cancer is likely to have short term or long term survival by a method comprising measuring the level of gene expression of at least a set of genes comprising LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4 in a sample comprising ovarian cancer cells from the subject; inputting the expression levels of the set of genes into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer to obtain an output score; determining whether the subject is likely to have long term survival by determining if the output score is less than a cutoff value or whether the subject is likely to have short term survival by determining if the output score is greater than or equal to the cutoff value, wherein the cutoff value is a value determined by identifying a value between the 99% confidence interval of a mean output score of a first set of samples from subjects known to have short term survival and the 99% confidence interval of a mean output score of a second set of samples from subjects known to have long term survival; and optionally, displaying whether the output score is greater than or equal to the cutoff value or less than the cutoff value to a health care worker so that the health care worker can select a treatment for the subject.
In embodiments, a method of selecting a treatment for a subject having ovarian cancer comprises determining whether the subject having ovarian cancer is likely to have short term or long term survival by a method comprising measuring the level of gene expression of at least a set of genes comprising SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED in a sample comprising ovarian cancer cells from the subject; inputting the expression levels of the set of genes into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer to obtain an output score; determining whether the subject is likely to have long term survival by determining if the output score is less than a cutoff value or whether the subject is likely to have short term survival by determining if the output score is greater than or equal to the cutoff value, wherein the cutoff value is a value determined by identifying a value between the 99% confidence interval of a mean output score of a first set of samples from subjects known to have short term survival and the 99% confidence interval of a mean output score of a second set of samples from subjects known to have long term survival; and optionally, displaying whether the output value of the sample is greater than or equal to the cutoff value or less than the cutoff value so that the health care worker can select a treatment for the subject.
In embodiments, a method of selecting a treatment for a subject having ovarian cancer comprises determining whether the subject is likely to have short term or long term survival by a method comprising measuring the level of gene expression of at least a set of genes comprising CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED in a sample comprising ovarian cancer cells from the subject; inputting the expression levels of the set of genes into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer to obtain an output score; determining whether the subject is likely to have long term survival by determining if the output score is less than a cutoff value or whether the subject is likely to have short term survival by determining if the output score is greater than or equal to the cutoff value, wherein the cutoff value is a value determined by identifying a value between the 99% confidence interval of a mean output score of a first set of samples from subjects known to have short term survival and the 99% confidence interval of a mean output score of a second set of samples from subjects known to have long term survival; and optionally, displaying whether the output score is less than a cutoff value or greater than or equal to the cutoff so that the health care worker can select a treatment for the subject.
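The requirement that the cutoff lie between the 99% confidence intervals of the two group means can be checked numerically. The sketch below is illustrative only and rests on assumptions: the disclosure does not state how the intervals were computed, so a normal approximation (mean ± z·SD/√n) is used, the helper names are hypothetical, and the inputs are the F1 discovery-study summary statistics reported in this document (R/LTS: mean 17.9358, SD 2.9622, n = 14; NR/STS: mean 25.4697, SD 3.3651, n = 20; cutoff 21.388).

```python
from math import sqrt
from statistics import NormalDist


def mean_confidence_interval(mean: float, sd: float, n: int, level: float = 0.99):
    """Two-sided confidence interval for a group mean.

    Assumption: normal approximation; a t-based interval would be somewhat
    wider for groups of this size.
    """
    z = NormalDist().inv_cdf(0.5 + level / 2.0)
    half_width = z * sd / sqrt(n)
    return mean - half_width, mean + half_width


def lies_between(cutoff: float, lower_group_ci, upper_group_ci) -> bool:
    """True if the cutoff sits in the gap between the two intervals."""
    return lower_group_ci[1] < cutoff < upper_group_ci[0]


if __name__ == "__main__":
    lts_ci = mean_confidence_interval(17.9358, 2.9622, 14)  # long term survivors
    sts_ci = mean_confidence_interval(25.4697, 3.3651, 20)  # short term survivors
    print("LTS 99% CI:", lts_ci)
    print("STS 99% CI:", sts_ci)
    print("cutoff between intervals:", lies_between(21.388, lts_ci, sts_ci))
```

Under these assumptions the two intervals do not overlap, and the stated F1 cutoff falls in the gap between them.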
In embodiments, the methods further comprise treating a subject likely to have long term survival with standard chemotherapy. In embodiments, standard chemotherapy comprises taxol and/or platinum. In embodiments, the method further comprises treating a subject likely to have short term survival with therapy in addition to or in place of standard chemotherapy. In embodiments, an alternative therapy comprises a therapy selected from the group consisting of antiangiogenesis compounds, taxane analogues, tubulin binding agents, and ubiquitination inhibitors. In embodiments, a subject likely to have short term survival is treated with an inhibitor of a protein selected from the group consisting of TUBA3C, ACTB, CDC42 and combinations thereof.
In yet other embodiments, the disclosure provides a method for selecting a treatment for a subject that has ovarian cancer, the method comprising: calculating an output score, using a computing device, by inputting gene expression levels of a first set of genes comprising LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4, a second set of genes comprising SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED, or a third set of genes comprising CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED, into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer; and displaying the output score, using a computing device. In embodiments, the method further comprises determining whether the output score is greater than or equal to or less than a cutoff value, using a computing device; and displaying whether the subject is likely to be a short term or long term survivor.
One embodiment provides a method for diagnosing ovarian cancer in a subject comprising: measuring the level of expression of at least one gene in a test sample from a subject and comparing the level of expression with the level of expression of the at least one gene in a control sample from a healthy subject, wherein a higher or lower level of expression of the gene in the test sample compared with the level of expression in the control sample is an indication that the subject has ovarian cancer. In one embodiment, the mRNA levels are measured. In another embodiment, the protein levels are measured. In one embodiment, the gene expression levels are measured by microarray analysis.
One embodiment provides that expression of LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, PKP4, SSR1, USP5, HLCS, NDUFB1, CDC42 or a combination thereof is measured. In another embodiment, the expression of LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4 is measured. In another embodiment, the expression of LYPLA2, TUBA3C, ACTB and PKP4 is increased and the expression of MED13L, OSBPL8, and EED is decreased. In one embodiment, the expression of SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED is measured. In another embodiment, the expression of SSR1, NDUFB1, MED13L and EED is decreased and the expression of USP5, ACTB, HLCS, LYPLA2 and TUBA3C is increased. In one embodiment, the expression of CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED is measured. In another embodiment the expression of CDC42, LYPLA2, TUBA3C, ACTB and HLCS is increased and the expression of MED13L and EED is decreased. In one embodiment, the expression of LYPLA2, TUBA3C, ACTB, USP5, HLCS, CDC42 or a combination thereof is increased. In another embodiment, the expression of MED13L, OSBPL8, EED, PKP4, SSR1, NDUFB1 or a combination thereof is decreased. In one embodiment, the expression of LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, PKP4, SSR1, USP5, HLCS, NDUFB1, CDC42 or a combination thereof is measured and applied to a mathematical function to yield a diagnosis of ovarian cancer.
In one embodiment, the measurement of gene expression provides a diagnosis which indicates that the subject/patient will survive the cancer longer than about seven years. In another embodiment, the measurement of gene expression provides a diagnosis that the subject/patient will not survive the cancer for longer than about three years.
In one embodiment, the subject/patient is a mammal, such as a human.
In one embodiment, a health care provider or worker is informed. In another embodiment, the subject/patient is treated for ovarian cancer.
In another aspect, the disclosure provides kits for selecting a treatment for an ovarian cancer patient. In embodiments, a kit comprises or consists essentially of a primer or a probe or both that specifically hybridizes to each gene of a first set of genes comprising LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4. In embodiments, the kit consists essentially of reagents for detecting expression of the first set of genes and contains other reagents such as primers or probes for housekeeping genes, positive controls and/or negative controls. In other embodiments, a kit comprises or consists essentially of: a primer or a probe or both that specifically hybridizes to each gene of a first set of genes comprising SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED. In yet other embodiments, a kit comprises or consists essentially of a primer or a probe or both that specifically hybridizes to each gene of a first set of genes comprising CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED.
In embodiments, the kit contains no more than 200 primers or probes or both, no more than 175 primers, probes or both, no more than 150 primers, probes or both, no more than 125 primers, probes or both, no more than 100 primers, probes or both, no more than 75 primers, probes or both, no more than 50 primers, probes or both, no more than 25 primers, probes or both, or no more than 15 primers, probes or both.
In embodiments, a kit further comprises a computer readable storage medium having computer-executable instructions that, when executed by a computing device, cause the computing device to perform a step comprising: calculating an output score by inputting gene expression levels of a set of genes comprising LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4, a second set of genes comprising SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED, or a third set of genes comprising CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED from a sample, into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer.
In embodiments, the disclosure provides a computing device comprising a processing unit; and a system memory connected to the processing unit, the system memory including instructions that, when executed by the processing unit, cause the processing unit to: calculate an output score by inputting gene expression levels of a set of genes comprising LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4, a second set of genes comprising SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED, or a third set of genes comprising CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED from a sample, into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer; and display the output score. In yet another embodiment, the system memory includes instructions, that when executed by the processing unit, cause the processing unit to determine whether the output score is greater than or equal to or less than a cutoff value; and displaying whether the subject is likely to be a short term or long term survivor.
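The calculate, compare, and display steps recited above might be organized along the following lines. This is a minimal sketch under stated assumptions, not the disclosed implementation: the actual functions 1 to 3 are given in Figure 9 and are not reproduced here, so the scoring function is passed in as a parameter, and `placeholder_function` (a plain average) is a hypothetical stand-in with no prognostic meaning. The names `GENE_SETS`, `output_score`, and `report` are likewise illustrative.

```python
from typing import Callable, Mapping, Sequence

# Gene sets as listed in the text for functions 1, 2, and 3.
GENE_SETS = {
    "F1": ("LYPLA2", "TUBA3C", "ACTB", "MED13L", "OSBPL8", "EED", "PKP4"),
    "F2": ("SSR1", "USP5", "ACTB", "HLCS", "NDUFB1", "LYPLA2", "TUBA3C",
           "MED13L", "EED"),
    "F3": ("CDC42", "LYPLA2", "TUBA3C", "ACTB", "HLCS", "MED13L", "EED"),
}


def output_score(expression: Mapping[str, float], gene_set: str,
                 function: Callable[[Sequence[float]], float]) -> float:
    """Check that every required gene was measured, then apply the function."""
    genes = GENE_SETS[gene_set]
    missing = [g for g in genes if g not in expression]
    if missing:
        raise ValueError(f"missing expression values for: {', '.join(missing)}")
    return function([expression[g] for g in genes])


def report(score: float, cutoff: float) -> str:
    """Text to display: long term below the cutoff, short term otherwise."""
    group = "long term" if score < cutoff else "short term"
    return f"output score {score:.3f}: likely {group} survivor"


def placeholder_function(values: Sequence[float]) -> float:
    """Hypothetical stand-in for the Figure 9 functions; NOT the real model."""
    return sum(values) / len(values)


if __name__ == "__main__":
    # Made-up expression values, used only to exercise the code path.
    sample = {gene: 10.0 for gene in GENE_SETS["F1"]}
    score = output_score(sample, "F1", placeholder_function)
    print(report(score, cutoff=21.388))
```

Passing the function as an argument keeps the gene-set check and the cutoff comparison independent of whichever of the three functions is used.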
Brief Description of the Drawings
Figure 1. Output scores from F1 ovarian cancer gene expression biomarker for long-term vs. short-term survival (responders vs. non-responders).
Figure 2. Output scores from F2 and F3 ovarian cancer gene expression biomarkers for long-term vs. short-term survival (responders vs. non-responders).
Figure 3. Box plots of the output scores of two survival/(treatment-response) groups (LTS and STS) of the F1 biomarker.
Figure 4. Box plots of the output scores of two survival/(treatment-response) groups (LTS & STS) of the F2 and F3 biomarkers.
Figure 5. 3D plot of output scores from long term and short term survivor subjects from functions F1 vs. F2 vs. F3. It can be seen that, with the exception of one subject, the three biomarkers are able to separate long-term from short-term survivors (responders vs. non-responders) in this 3D space.
Figure 6. Scatter plot & bar graph of output scores of all individual subjects [both LTS (responders) and STS (non-responders)] of the F1 prognostic biomarker. All 10 unknown STS subjects have F1 scores that are higher than the cutoff value (21.4), whereas all 10 unknown LTS subjects have F1 scores that are lower than the cutoff value.
Figure 7. Scatter plot and bar graph of output score of all individual subjects (both LTS (responders) and STS (non-responders)) of the F2 and F3 prognostic biomarkers. All 10 unknown STS subjects have F2 and F3 scores that are higher than the cutoff value (14.3 for F2 and 14.7 for F3), whereas all 10 unknown LTS subjects have F2 and F3 scores that are lower than the cutoff value.
Figure 8. Three-dimensional plot of prognostic biomarkers of output scores from each function F1 vs. F2 vs. F3 for the validation (qualification) study of long-term (responders) and short-term (non-responders) ovarian cancer survivors. As can be seen there is a complete segregation of the two survival/(treatment response) groups.
Figure 9 provides mathematical equations.
Detailed Description of the Invention
There are currently few reliable prognostic markers available for the diagnosis and/or prognosis of ovarian cancer, in particular, the classification of ovarian cancer patients in relation to short-term (less than about three years from diagnosis, including several weeks, several months, 1 year, 2 years or 3 years) vs. long-term survivors (at least about 4 years, about 5 years, about 6 years or about 7 years or longer than about 7 years) or in relation to response to the standard aforementioned chemotherapy treatment. The ability to distinguish between these two patient populations would allow the modification of treatment therapies and/or the development of new pharmacological treatments for short-term survivors to potentially prolong their survival time.
Described herein are novel prognostic biomarkers that can distinguish between ovarian cancer patients who will survive longer than seven years versus those who will succumb to the disease within three years using a novel mathematical bioinformatic approach for the analysis of gene expression in each patient's tumor tissue. This novel mathematical bioinformatic approach has resulted in the discovery of novel genes and networks underlying the progression from long-term survival to short-term survival in ovarian cancer patients.
In one embodiment, the gene biomarkers that constitute this novel gene network when combined together into a single complex mathematical function and, thus, treated as a single complex biomarker, have a very high prognostic power (AUC of 0.978). This AUC value indicates that these biomarkers can both independently and collectively be used to identify short-term survivors with a very high accuracy and therefore provide alternative treatments that may extend their survival. In general, this approach demonstrates the potential of personalized medicine based on the particular gene expression of a patient as it pertains to their specific disease.
One of the discovered genes, namely, TUBA3C, is directly linked to the mechanism of action of taxol, the standard chemotherapy treatment for ovarian cancer. Two of the remaining discovered genes, namely, ACTB and CDC42, are indirectly linked to the mechanism of action of taxol. More specifically, the TUBA3C gene is responsible for the production of microtubules, something which is needed for cell proliferation, and something which taxol is trying to oppose. The gene ACTB is responsible for the production of β-actin, which can be polymerized to form β-actin microfilaments, which are used for the polymerization or depolymerization of microtubules. Taxol and other taxol analogs oppose either the depolymerization or polymerization of microtubules, respectively. The gene CDC42 promotes the polymerization of β-actin into microfilaments, and, furthermore, it can regulate the polarization of both the actin and the microtubule cytoskeleton. All three of those genes were significantly over-expressed in the short-term survivors as compared with the long-term survivors. This indicates that in the case of the short-term survivors, taxol cannot overcome the combined effect of the TUBA3C, ACTB, and CDC42 genes, and that those individuals will not respond to the standard of care, i.e. chemotherapy with platinum and taxol. In addition, the findings indicate that chemotherapeutic agents that inhibit the overexpression of these genes are useful to extend the survival of ovarian cancer patients.
Definitions
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. Although any methods and materials similar or equivalent to those described herein can be used in the practice or testing of the present invention, several embodiments with regards to methods and materials are described herein. As used herein, each of the following terms has the meaning associated with it in this section.
The articles "a" and "an" are used herein to refer to one or to more than one (i.e. to at least one) of the grammatical object of the article. By way of example, "an element" means one element or more than one element.
"Plurality" means at least two.
A "subject" or "patient" is a vertebrate, including a mammal, such as a human.
Mammals include, but are not limited to, humans, farm animals, sport animals and pets.
The term "biological sample," as used herein, refers to samples obtained from a subject, including, but not limited to, skin, hair, tissue, blood, plasma, serum, cells, sweat, saliva, feces, and/or urine.
The term "about," as used herein, means approximately, in the region of, roughly, or around. When the term "about" is used in conjunction with a numerical range, it modifies that range by extending the boundaries above and below the numerical values set forth. In general, the term "about" is used herein to modify a numerical value above and below the stated value by a variance of 10%. In one aspect, the term "about" means plus or minus 20% of the numerical value of the number with which it is being used. Therefore, about 50% means in the range of 45%-55%. Numerical ranges recited herein by endpoints include all numbers and fractions subsumed within that range (e.g. 1 to 5 includes 1 , 1.5, 2, 2.75, 3, 3.90, 4, and 5). It is also to be understood that all numbers and fractions thereof are presumed to be modified by the term "about."
As used herein, the term "biologically active fragments" or "bioactive fragment" of the polypeptides encompasses natural or synthetic portions of the full length protein that are capable of specific binding to their natural ligand or of performing the function of the protein. For example, a "functional" or "active" biological molecule is a biological molecule in a form in which it exhibits a property by which it is characterized. A functional enzyme, for example, is one which exhibits the characteristic catalytic activity by which the enzyme is characterized.
A "fragment" or "segment" is a portion of an amino acid sequence, comprising at least one amino acid, or a portion of a nucleic acid sequence comprising at least one nucleotide. The terms "fragment" and "segment" are used interchangeably herein. As used herein, the term "fragment," as applied to a protein or peptide, can ordinarily be at least about 3-15 amino acids in length, at least about 15-25 amino acids, at least about 25-50 amino acids in length, at least about 50-75 amino acids in length, at least about 75-100 amino acids in length, or greater than 100 amino acids in length. As used herein, the term "fragment" as applied to a nucleic acid, may ordinarily be at least about 20 nucleotides in length, typically, at least about 50 nucleotides, more typically, from about 50 to about 100 nucleotides, at least about 100 to about 200 nucleotides, at least about 200 nucleotides to about 300 nucleotides, at least about 300 to about 350, at least about 350 nucleotides to about 500 nucleotides, at least about 500 to about 600, at least about 600 nucleotides to about 620 nucleotides, at least about 620 to about 650, or the nucleic acid fragment will be greater than about 650 nucleotides in length.
The term "binding" refers to the adherence of molecules to one another, such as, but not limited to, enzymes to substrates, ligands to receptors, antibodies to antigens, DNA binding domains of proteins to DNA, and DNA or RNA strands to complementary strands. "Binding partner," as used herein, refers to a molecule capable of binding to another molecule.
As used herein, "health care provider or worker" includes either an individual or an institution that provides preventive, curative, promotional or rehabilitative health care services to a subject, such as a patient. In one embodiment, the data is provided to a health care provider so that they may use it in their diagnosis/treatment of the patient.
The term "standard," as used herein, refers to something used for comparison, such as a control or a healthy subject.
The terms "comprises", "comprising", and the like can have the meaning ascribed to them in U.S. Patent Law and can mean "includes", "including" and the like. As used herein, "including" or "includes" or the like means including, without limitation.
The term "primer" refers to a nucleic acid capable of acting as a point of initiation of synthesis along a complementary strand when conditions are suitable for synthesis of a primer extension product. The synthesizing conditions include the presence of four different bases and at least one polymerization-inducing agent such as reverse transcriptase or DNA polymerase. These are present in a suitable buffer, which may include constituents which are co-factors or which affect conditions such as pH and the like at various suitable temperatures. A primer is preferably a single strand sequence, such that amplification efficiency is optimized, but double stranded sequences can be utilized. Primers are typically at least about 15 nucleotides. In embodiments, primers can have a length of anywhere from 15 to 2000 nucleotides. In embodiments, primers have a melting temperature of at least 50°C, 52°C, 55°C, 58°C, 60°C, or 65°C.
The term "probe" refers to a nucleic acid that hybridizes to a target sequence. In some embodiments, a probe includes about eight nucleotides, about 10 nucleotides, about 15 nucleotides, about 20 nucleotides, about 25 nucleotides, about 30 nucleotides, about 40 nucleotides, about 50 nucleotides, about 60 nucleotides, about 70 nucleotides, about 75 nucleotides, about 80 nucleotides, about 90 nucleotides, about 100 nucleotides, about 110 nucleotides, about 115 nucleotides, about 120 nucleotides, about 130 nucleotides, about 140 nucleotides, about 150 nucleotides, about 175 nucleotides, about 187 nucleotides, about 200 nucleotides, about 225 nucleotides, or about 250 nucleotides. A probe can further include a detectable label. Detectable labels include, but are not limited to, a fluorophore (e.g., Texas Red®, fluorescein isothiocyanate, etc.) and a hapten (e.g., biotin). A detectable label can be covalently attached directly to a probe oligonucleotide, e.g., located at the probe's 5' end or at the probe's 3' end. A probe including a fluorophore may also further include a quencher, e.g., Black Hole Quencher™, Iowa Black™, etc.
Ovarian Cancer
Ovarian cancer is a cancerous growth arising from different parts of the ovary. Most (>90%) ovarian cancers are classified as "epithelial" and were believed to arise from the surface (epithelium) of the ovary. However, recent evidence suggests that the Fallopian tube could also be the source of some ovarian cancers. Other types arise from the egg cells (germ cell tumor) or supporting cells (sex cord/stromal).
In 2004, in the United States, 25,580 new cases were diagnosed and 16,090 women died of ovarian cancer. The risk increases with age and decreases with pregnancy. Lifetime risk is about 1.6%, but women with affected first-degree relatives have a 5% risk. Women with a mutated BRCA1 or BRCA2 gene carry a risk between 25% and 60% depending on the specific mutation. Ovarian cancer is the fifth leading cause of death from cancer in women and the leading cause of death from gynecological cancer.
Ovarian cancer usually has a poor prognosis. It is disproportionately deadly because it lacks any clear early detection or screening test, meaning that most cases are not diagnosed until they have reached advanced stages. More than 60% of patients presenting with this cancer already have stage III or stage IV cancer, when it has already spread beyond the ovaries. Ovarian cancers shed cells into the naturally occurring fluid within the abdominal cavity. These cells can then implant on other abdominal (peritoneal) structures including the uterus, urinary bladder, bowel and the lining of the bowel wall (omentum) forming new tumor growths before cancer is even suspected.
Ovarian cancer causes non-specific symptoms. Most women with ovarian cancer report one or more symptoms such as abdominal pain or discomfort, an abdominal mass, bloating, back pain, urinary urgency, constipation, tiredness and a range of other non-specific symptoms, as well as more specific symptoms such as pelvic pain, abnormal vaginal bleeding or involuntary weight loss. There can be a build-up of fluid (ascites) in the abdominal cavity.
Diagnosis of ovarian cancer starts with a physical examination (including a pelvic examination), a blood test (for CA-125 and sometimes other markers), and transvaginal ultrasound. The diagnosis must be confirmed with surgery to inspect the abdominal cavity, take biopsies (tissue samples for microscopic analysis) and look for cancer cells in the abdominal fluid. Treatment usually involves chemotherapy and surgery, and sometimes radiotherapy. In most cases, the cause of ovarian cancer remains unknown. Older women, and those who have a first- or second-degree relative with the disease, have an increased risk. Hereditary forms of ovarian cancer can be caused by mutations in specific genes (most notably BRCA1 and BRCA2, but also in genes for hereditary nonpolyposis colorectal cancer). Infertile women and those with a condition called endometriosis, those who have never been pregnant and those who use postmenopausal estrogen replacement therapy are at increased risk. Use of combined oral contraceptive pills is a protective factor. The risk is also lower in women who have had their uterine tubes blocked surgically (tubal ligation).
Ovarian cancer is classified according to the histology of the tumor, obtained in a pathology report. Surface epithelial-stromal tumor, also known as ovarian epithelial carcinoma, is the most common type of ovarian cancer. It includes serous tumor, endometrioid tumor and mucinous cystadenocarcinoma. Sex cord-stromal tumor, including estrogen-producing granulosa cell tumor and virilizing Sertoli-Leydig cell tumor or arrhenoblastoma, accounts for 8% of ovarian cancers. Germ cell tumor accounts for approximately 30% of ovarian tumors, but only 5% of ovarian cancers. Germ cell tumor tends to occur in young women and girls. The prognosis depends on the specific histology of germ cell tumor. Mixed tumors, containing elements of more than one of the above classes of tumor histology are also possible.
Ovarian cancer staging is by the FIGO staging system and uses information obtained after surgery, which can include a total abdominal hysterectomy, removal of (usually) both ovaries and fallopian tubes, (usually) the omentum, and pelvic (peritoneal) washings for cytopathology. The AJCC stage is the same as the FIGO stage. The AJCC staging system describes the extent of the primary Tumor (T), the absence or presence of metastasis to nearby lymph Nodes (N), and the absence or presence of distant Metastasis (M).
Stage I - limited to one or both ovaries
IA - involves one ovary; capsule intact; no tumor on ovarian surface; no malignant cells in ascites or peritoneal washings
IB - involves both ovaries; capsule intact; no tumor on ovarian surface; negative washings
IC - tumor limited to ovaries with any of the following: capsule ruptured, tumor on ovarian surface, positive washings
Stage II - pelvic extension or implants
IIA - extension or implants onto uterus or fallopian tube; negative washings
IIB - extension or implants onto other pelvic structures; negative washings
IIC - pelvic extension or implants with positive peritoneal washings
Stage III - microscopic peritoneal implants outside of the pelvis; or limited to the pelvis with extension to the small bowel or omentum
IIIA - microscopic peritoneal metastases beyond pelvis
IIIB - macroscopic peritoneal metastases beyond pelvis less than 2 cm in size
IIIC - peritoneal metastases beyond pelvis > 2 cm or lymph node metastases
Stage IV - distant metastases to the liver or outside the peritoneal cavity
Para-aortic lymph node metastases are considered regional lymph nodes (Stage IIIC). As there is only one para-aortic lymph node intervening before the thoracic duct on the right side of the body, the ovarian cancer can rapidly spread to distant sites such as the lung.
The AJCC/TNM staging system includes three categories for ovarian cancer, T, N and M. The T category contains three subcategories, T1, T2 and T3, each of them being classified according to the place where the tumor has developed (in one or both ovaries, inside or outside the ovary). The T1 category of ovarian cancer describes ovarian tumors that are confined to the ovaries, and which may affect one or both of them. The sub-subcategory T1a is used to stage cancer that is found in only one ovary, which has left the capsule intact and which cannot be found in the fluid taken from the pelvis. Cancer that has not affected the capsule, is confined to the inside of the ovaries and cannot be found in the fluid taken from the pelvis but has affected both ovaries is staged as T1b. The T1c category describes a type of tumor that can affect one or both ovaries, and which has grown through the capsule of an ovary or is present in the fluid taken from the pelvis. T2 is a more advanced stage of cancer. In this case, the tumor has grown in one or both ovaries and has spread to the uterus, fallopian tubes or other pelvic tissues. Stage T2a is used to describe a cancerous tumor that has spread to the uterus or the fallopian tubes (or both) but which is not present in the fluid taken from the pelvis. Stage T2b indicates cancer that has spread to pelvic tissues other than the uterus and fallopian tubes and which cannot be seen in the fluid taken from the pelvis; stage T2c indicates tumors that have spread to any of the pelvic tissues (including uterus and fallopian tubes) and which can also be found in the fluid taken from the pelvis. T3 is the stage used to describe cancer that has spread to the peritoneum. This stage provides information on the size of the metastatic tumors (tumors that are located in other areas of the body, but are caused by ovarian cancer). These tumors can be very small, visible only under the microscope (T3a), visible but not larger than 2 centimeters (T3b) or bigger than 2 centimeters (T3c). This staging system also uses N categories to describe cancers that have or have not spread to nearby lymph nodes. There are only two N categories, N0, which indicates that the cancerous tumors have not affected the lymph nodes, and N1, which indicates the involvement of lymph nodes close to the tumor.
The M categories in the AJCC/TNM staging system provide information on whether the ovarian cancer has metastasized to distant organs such as liver or lungs. M0 indicates that the cancer did not spread to distant organs and the M1 category is used for cancer that has spread to other organs of the body.
The AJCC/TNM staging system also contains a Tx and an Nx subcategory, which indicate, respectively, that the extent of the tumor or the involvement of the lymph nodes cannot be described because of insufficient data.
The ovarian cancer stages are made up by combining the TNM categories in the following manner:
Stage I: T1+N0+M0
IA: T1a+N0+M0
IB: T1b+N0+M0
IC: T1c+N0+M0
Stage II: T2+N0+M0
IIA: T2a+N0+M0
IIB: T2b+N0+M0
IIC: T2c+N0+M0
Stage III: T3+N0+M0
IIIA: T3a+N0+M0
IIIB: T3b+N0+M0
IIIC: T3c+N0+M0 or Any T+N1+M0
Stage IV: Any T+Any N+M1
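The combination rules listed above can be sketched as a small lookup. This is a minimal illustration only, written in Python; the function name stage_from_tnm and its error handling are assumptions made for the example and are not part of the disclosure. Tx and Nx inputs are rejected because, as stated above, they indicate insufficient data.

```python
def stage_from_tnm(t: str, n: str, m: str) -> str:
    """Combine T, N and M categories into a stage label per the list above.

    Hypothetical helper for illustration; not part of the disclosure.
    """
    t, n, m = t.strip().upper(), n.strip().upper(), m.strip().upper()

    # Stage IV: any T, any N, M1.
    if m == "M1":
        return "IV"
    if m != "M0":
        raise ValueError(f"unrecognized M category: {m}")

    # Stage IIIC also covers any T with N1 and M0.
    if n == "N1":
        return "IIIC"
    if n != "N0":
        raise ValueError(f"N category cannot be staged: {n}")

    # Remaining cases are N0 and M0: the stage follows the T category.
    main_stage = {"T1": "I", "T2": "II", "T3": "III"}
    base, sub = t[:2], t[2:]
    if base not in main_stage or sub not in ("", "A", "B", "C"):
        raise ValueError(f"T category cannot be staged: {t}")
    return main_stage[base] + sub
```

For example, stage_from_tnm("T2b", "N0", "M0") yields the label for stage IIB, while any input with M1 maps to stage IV regardless of the T and N categories.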
Ovarian cancer, like any other type of cancer, is graded as well as staged. The histologic grade of a tumor measures how abnormal or malignant its cells look under the microscope. There are four grades indicating the likelihood of the cancer to spread; the higher the grade, the more likely this is to occur. Grade 0 is used to describe non-invasive tumors. Grade 0 cancers are also referred to as borderline tumors. Grade 1 tumors have cells that are well differentiated (look very similar to the normal tissue) and are the ones with the best prognosis. Grade 2 tumors are also called moderately well differentiated and they are made up of cells that resemble the normal tissue. Grade 3 tumors have the worst prognosis and their cells are abnormal, referred to as poorly differentiated.
With regard to treatment, surgical treatment may be sufficient for malignant tumors that are well-differentiated and confined to the ovary. Addition of chemotherapy may be required for more aggressive tumors that are confined to the ovary. For patients with advanced disease a combination of surgical reduction with a combination chemotherapy regimen is standard. Borderline tumors, even following spread outside of the ovary, are managed well with surgery, and chemotherapy is not seen as useful.
Chemotherapy has been a general standard of care for ovarian cancer for decades, although with highly variable protocols. Chemotherapy is used after surgery to treat any residual disease, if appropriate. This depends on the histology of the tumor; some kinds of tumor (particularly teratoma) are not sensitive to chemotherapy. In some cases, there may be reason to perform chemotherapy first, followed by surgery.
For patients with stage IIIC epithelial ovarian adenocarcinomas who have undergone successful optimal debulking, a recent clinical trial demonstrated that median survival time is significantly longer for patients receiving intraperitoneal (IP) chemotherapy. Patients in this clinical trial reported less compliance with IP chemotherapy and fewer than half of the patients received all six cycles of IP chemotherapy. Despite this high "drop-out" rate, the group as a whole (including the patients that didn't complete IP chemotherapy treatment) survived longer on average than patients who received intravenous chemotherapy alone.
Methods of selecting a treatment
The disclosure provides methods for selecting a treatment for a subject having ovarian cancer. In embodiments, a method of selecting a treatment for a subject that has ovarian cancer comprises: a) determining whether the subject is likely to have short term or long term survival by a method comprising i) measuring the level of gene expression of at least a set of genes in a sample comprising ovarian cancer cells from the subject; ii) inputting the expression levels of the set of genes into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer to obtain an output score; iii) determining whether the subject is likely to have long term survival by determining if the output score is less than a cutoff value or whether the subject is likely to have short term survival by determining if the output score is greater than or equal to the cutoff value, wherein the cutoff value is a value determined by identifying a value between the 99% confidence interval of the mean output score of a first set of samples from subjects known to have short term survival and the 99% confidence interval of the mean output score of a second set of samples from subjects known to have long term survival; and b) optionally, displaying whether the output score is greater than or equal to the cutoff value or less than the cutoff value to a health care worker so that the health care worker can select a treatment for the subject.
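By way of illustration only, the cutoff determination and comparison of step iii) might be sketched as follows in Python. Several details here are assumptions rather than part of the disclosure: the confidence interval is computed with a normal approximation, the cutoff is taken as the midpoint of the gap between the two intervals (the method only requires a value between them), and the function names mean_ci, choose_cutoff and classify are hypothetical. The function that produces the output score from gene expression levels is not reproduced here.

```python
from math import sqrt
from statistics import NormalDist, mean, stdev


def mean_ci(scores, confidence=0.99):
    """Confidence interval of the mean output score.

    Assumption: normal approximation; the disclosure does not state
    how the interval is computed.
    """
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    center = mean(scores)
    half_width = z * stdev(scores) / sqrt(len(scores))
    return center - half_width, center + half_width


def choose_cutoff(short_term_scores, long_term_scores):
    """Pick a value between the two 99% confidence intervals.

    Assumption: the midpoint of the gap is used; any value in the gap
    would satisfy the description.
    """
    sts_low, _ = mean_ci(short_term_scores)
    _, lts_high = mean_ci(long_term_scores)
    if lts_high >= sts_low:
        raise ValueError("intervals overlap; no value lies between them")
    return (lts_high + sts_low) / 2


def classify(output_score, cutoff):
    """Short term if score >= cutoff, long term if score < cutoff."""
    return "short term" if output_score >= cutoff else "long term"
```

The comparison in classify mirrors the wording of step iii): a score equal to the cutoff value is treated as indicating short term survival.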
In embodiments, the set of genes comprises at least the genes LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4. In other embodiments the set of genes comprises at least the genes SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED. In yet other embodiments a set of genes comprises CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED.
Markers
The expression of certain genes has been demonstrated herein to be prognostic of ovarian cancer. Sequences of the protein, nucleic acid encoding the protein and the
Affymetrix sequences are available either at GenBank or the Affymetrix website. These genes include the following:
LYPLA2, such as human lysophospholipase II, is represented, for example, by accession numbers 215566_x_at, NM_007269, NM_007620, or NP_009191 (231 aa; gI2032149).
The protein sequence is:
MCGNT SVPLLTDAATVSGAERETAAVIFLHGLGDTGHSWADAL
S IRLPHVKYICPHAPRIPVTL MKMVMPSWFDLMGLSPDAPEDEAGIKKAAE IKAL IEHEMK GIPANRIVLGGFSQGGALSLYTALTCPHPLAGIVALSCWLPLHRAFPQAA GSA DLAILQCHGELDPMVPVRFGALTAE LRSWTPARVQFKTYPGVMHSSCPQEMA AVKEFLEKLLPPV (SEQ ID NO:1) and the mRNA sequence is:
1 ggaagttccg gcgggggcgg ccgaggggga agagtgtgtc tgcgggagaa agaggagaat 61 cgcccaagcg gcctcggaag tcccagggag tggaggcccc cgccgtggag ccgtgtggtg 121 tatgtgtggt aacaccatgt ctgtgcccct gctcaccgat gctgccaccg tgtctggagc 181 tgagcgggaa acggccgcgg ttattttttt acatggactt ggagacacag ggcacagctg 241 ggctgacgcc ctctccacca tccggctccc tcacgtcaag tacatctgtc cccatgcgcc 301 taggatccct gtgaccctca acatgaagat ggtgatgccc tcctggtttg acctgatggg 361 gctgagtcca gatgccccag aggacgaggc tggcatcaag aaggcagcag agaacatcaa 421 ggccttgatt gagcatgaaa tgaagaacgg gatccctgcc aatcgaatcg tcctgggagg 481 cttttcacag ggcggggccc tgtccctcta cacggccctc acctgccccc accctctggc 541 tggcatcgtg gcgttgagct gctggctgcc tctgcaccgg gccttccccc aggcagctaa 601 tggcagtgcc aaggacctgg ccatactcca gtgccatggg gagctggacc ccatggtgcc 661 cgtacggttt ggggccctga cggctgagaa gctccggtct gttgtcacac ctgccagggt
721 ccagttcaag acatacccgg gtgtcatgca cagctcctgt cctcaggaga tggcagctgt 781 gaaggaattt cttgagaagc tgctgcctcc tgtctaacta gtcgctggcc ccagtgcagt 841 accccagctc atgggggact cagcaagcaa gcgtggcacc atcttggatc tgagccggtc 901 gagcccctgt ccccaccctt cctgacctgt ccttttccca caggcctctg ggggcaggtg 961 gcaaggcctg gccgggcctt ccttcctggc cttagccacc tggctctgtc tgcagcaggg 1021 gcaggctgct ttcttatcca tttccctgga ggcgggcccc cctggcagca gtattggagg 1081 ggctacaggc agctggagaa aggggcccag ccgctgaccc actcactcag gacctcactc 1141 actagccccg ctttgggccc cctcctgtga cctcagggtt tggcccatgg ggccccccca 1201 ggcccctgcc ccaactgatt ctgcccagat aatcgtgtct cctgcctcca ctcagctgct 1261 tctcagtcat gaatgtggcc atggccccgg ggtccccttg ctgctgtggg ctccctgtcc 1321 ctgggcagga gtgctggtga ggaggtggag ccttttgagg ggggccttcc ctcagctgtt 1381 tccccacact ggggggctgg gccctgcctc cccgttaccc tccttccctg caggcctgga 1441 gcctgtaggg ctggactgag gttcaggtct ccccccagct gtctcacccc cactttgtcc 1501 ccactctaga gcagggaggc agtgggggag gagttgtgtc tcgtcttctg tctccatgtg 1561 gtttttgggt gtttttcttg ttgtgtcctg gattccgata aaattaaaga aattgcttcc 1621 tcaaaaaaaa aaaaaaaaaa aaaaaaaa (SEQ ID NO: 2)
TUBA3C, such as human tubulin, alpha 3c, is represented, for example, by accession numbers 210527_x_at, NM_006001.01 (gI 325053695), or NP_005992 (450 aa; gI 17921933). The protein sequence is:
MRECISIHVGQAGVQIGNACWELYCLEHGIQPDGQMPSDKTIGG
GDDSFNTFFSETGAGKHVPRAVFVDLEPTWDEVRTGTYRQLFHPEQLITGKEDAANN YARGHYTIGKEIVDLVLDRIRKLADLCTGLQGFLIFHSFGGGTGSGFASLLMERLSVD YGKKSKLEFAIYPAPQVSTAWEPYNSILTTHTTLEHSDCAF VDNEAIYDICRRNLD IERPTYTNLNRLIGQIVSSITASLRFDGALNVDLTEFQTNLVPYPRIHFPLATYAPVI SAEKAYHEQLSVAEITNACFEPANQMVKCDPRHG YMACCMLYRGDWPKDVNAAIATI TK RTIQFVDWCPTGFKVGINYQPPTWPGGDLAKVQRAVC LSNTTAIAEAWARLD HKFDLMYAKRAFVHWYVGEG EEGEFSEAREDLAALEKDYEEVGVDSVEAEAEEGEEY (SEQ ID NO:3) and the mRNA sequence is:
1 ggttgaggtc aagtagtagc gttgggctgc ggcagcggag gagctcaaca tgcgtgagtg 61 tatctctatc cacgtggggc aggcaggagt ccagatcggc aatgcctgct gggaactgta 121 ctgcctggaa catggaattc agcccgatgg tcagatgcca agtgataaaa ccattggtgg 181 tggggacgac tccttcaaca cgttcttcag tgagactgga gctggcaagc acgtgcccag 241 agcagtgttt gtggacctgg agcccactgt ggtcgatgaa gtgcgcacag gaacctatag 301 gcagctcttc cacccagagc agctgatcac cgggaaggaa gatgcggcca ataattacgc 361 cagaggccat tacaccatcg gcaaggagat cgtcgacctg gtcctggacc ggatccgcaa 421 actggcggat ctgtgcacgg gactgcaggg cttcctcatc ttccacagtt ttgggggtgg 481 cactggctct gggttcgcat ctctgctcat ggagcggctc tcagtggatt acggcaagaa 541 gtccaagcta gaatttgcca tttacccagc cccccaggtc tccacggccg tggtggagcc 601 ctacaactcc atcctgacca cccacacgac cctggaacat tctgactgtg ccttcatggt 661 cgacaatgaa gccatctatg acatatgtcg gcgcaacctg gacatcgagc gtcccacgta 721 caccaacctc aatcgcctga ttgggcagat cgtgtcctcc atcacggcct ccctgcgatt 781 tgacggggcc ctgaatgtgg acttgacgga attccagacc aacctagtgc cgtacccccg 841 catccacttc cccctggcca cctacgcccc ggtcatctca gccgagaagg cctaccacga 901 gcagctgtcc gtggctgaga tcaccaatgc ctgcttcgag ccagccaatc agatggtcaa 961 gtgtgaccct cgccacggca agtacatggc ctgctgcatg ttgtacaggg gggatgtggt 1021 cccgaaagat gtcaacgcgg ccatcgccac catcaagacc aagcgcacca tccagtttgt 1081 agattggtgc ccaactggat ttaaggtggg cattaactac cagcccccca cggtggtccc 1141 tgggggagac ctggccaagg tgcagcgggc tgtgtgcatg ctgagcaaca ccacggccat 1201 cgcggaggcc tgggctcgcc tggaccataa gttcgatctc atgtatgcca agcgggcctt 1261 tgtgcactgg tacgtgggag aaggcatgga ggagggggag ttctctgagg cccgcgagga 1321 cctggcagct ctggagaagg attatgaaga ggtgggcgtg gattccgtgg aagccgaggc 1381 tgaagaaggt gaagaatact gaggggaggg tgtggtgggt tctccactcc actgccaccc 1441 ccagcgtggc tgctttcaag ttctttgcaa ttaaaggttc tgtataaaaa aaaaaaaaaa 1501 aaaa (SEQ ID NO:4) .
ACTB, such as human beta actin, is represented, for example, by accession numbers 200801_x_at, NM_001101 (gI 168480144), or NP_001092 (375 aa; gI 4501885). The protein sequence is:
DDDIAALWDNGSGMCKAGFAGDDAPRAVFPSIVGRPRHQGVM
VGMGQKDSYVGDEAQSKRGILTLKYPIEHGIVTNWDDMEKIWHHTFYNELRVAPEEHP VLLTEAPLNPKANREKMTQIMFETFNTPAMYVAIQAVLSLYASGRTTGIVMDSGDGVT HTVPIYEGYALPHAILRLDLAGRDLTDYLMKILTERGYSFTTTAEREIVRDIKEKLCY VALDFEQEMATAASSSSLEKSYELPDGQVITIGNERFRCPEALFQPSFLGMESCGIHE TTFNSIMKCDVDIRKDLYA TVLSGGTTMYPGIADR QKEITALAPSTMKIKIIAPPE R YSVWIGGSILASLSTFQQMWISKQEYDESGPSIVHRKCF (SEQ ID NO:5) and the mRNA sequence is:
1 accgccgaga ccgcgtccgc cccgcgagca cagagcctcg cctttgccga tccgccgccc 61 gtccacaccc gccgccagct caccatggat gatgatatcg ccgcgctcgt cgtcgacaac 121 ggctccggca tgtgcaaggc cggcttcgcg ggcgacgatg ccccccgggc cgtcttcccc 181 tccatcgtgg ggcgccccag gcaccagggc gtgatggtgg gcatgggtca gaaggattcc 241 tatgtgggcg acgaggccca gagcaagaga ggcatcctca ccctgaagta ccccatcgag 301 cacggcatcg tcaccaactg ggacgacatg gagaaaatct ggcaccacac cttctacaat 361 gagctgcgtg tggctcccga ggagcacccc gtgctgctga ccgaggcccc cctgaacccc 421 aaggccaacc gcgagaagat gacccagatc atgtttgaga ccttcaacac cccagccatg 481 tacgttgcta tccaggctgt gctatccctg tacgcctctg gccgtaccac tggcatcgtg 541 atggactccg gtgacggggt cacccacact gtgcccatct acgaggggta tgccctcccc 601 catgccatcc tgcgtctgga cctggctggc cgggacctga ctgactacct catgaagatc 661 ctcaccgagc gcggctacag cttcaccacc acggccgagc gggaaatcgt gcgtgacatt 721 aaggagaagc tgtgctacgt cgccctggac ttcgagcaag agatggccac ggctgcttcc 781 agctcctccc tggagaagag ctacgagctg cctgacggcc aggtcatcac cattggcaat 841 gagcggttcc gctgccctga ggcactcttc cagccttcct tcctgggcat ggagtcctgt 901 ggcatccacg aaactacctt caactccatc atgaagtgtg acgtggacat ccgcaaagac 961 ctgtacgcca acacagtgct gtctggcggc accaccatgt accctggcat tgccgacagg 1021 atgcagaagg agatcactgc cctggcaccc agcacaatga agatcaagat cattgctcct 1081 cctgagcgca agtactccgt gtggatcggc ggctccatcc tggcctcgct gtccaccttc 1141 cagcagatgt ggatcagcaa gcaggagtat gacgagtccg gcccctccat cgtccaccgc 1201 aaatgcttct aggcggacta tgacttagtt gcgttacacc ctttcttgac aaaacctaac 1261 ttgcgcagaa aacaagatga gattggcatg gctttatttg ttttttttgt tttgttttgg 1321 tttttttttt ttttttggct tgactcagga tttaaaaact ggaacggtga aggtgacagc 1381 agtcggttgg agcgagcatc ccccaaagtt cacaatgtgg ccgaggactt tgattgcaca 1441 ttgttgtttt tttaatagtc attccaaata tgagatgcgt tgttacagga agtcccttgc 1501 catcctaaaa gccaccccac ttctctctaa ggagaatggc ccagtcctct cccaagtcca 1561 cacaggggag gtgatagcat tgctttcgtg taaattatgt aatgcaaaat ttttttaatc 1621 ttcgccttaa tactttttta ttttgtttta ttttgaatga tgagccttcg tgccccccct 1681 tccccctttt ttgtccccca 
acttgagatg tatgaaggct tttggtctcc ctgggagtgg 1741 gtggaggcag ccagggctta cctgtacact gacttgagac cagttgaata aaagtgcaca 1801 ccttaaaaat gaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aa (SEQ ID NO:6)
MED13L, such as human mediator complex subunit 13-like, is represented, for example, by accession numbers 212209_at, NM_015335.4(gI300360584), or NP_0561150 (2210 aa;gI44771211). The protein sequence is:
MTAAA VANGASLEDCHSNLFSLAELTGIKWRRYNFGGHGDCG
PIISAPAQDDPILLSFIRCLQANLLCVWRRDVKPDCKELWIF WGDEPNLVGVIHHEL QWEEGLWENGLSYECRTLLFKAIHNLLERCLMDKNFVRIGKWFVRPYEKDEKPVNKS EHLSCAFTFFLHGESNVCTSVEIAQHQPIYLINEEHIHMAQSSPAPFQVLVSPYGLNG TLTGQAY MSDPATRKLIEEWQYFYPMVLKKKEESKEEDELGYDDDFPVAVEVIVGGV RMVYPSAFVLISQNDIPVPQSVASAGGHIAVGQQGLGSVKDPSNCGMPLTPPTSPEQA ILGESGGMQSAASHLVSQDGGMITMHSP RSG IPPKLHNHMVHRV KECILNRTQSK RSQMSTPTLEEEPASNPATWDFVDPTQRVSCSCSRHKLLKRCAVGPNRPPTVSQPGFS AGPSSSSSLPPPASSKHKTAERQEKGD LQKRPLIPFHHRPSVAEELCMEQDTPGQKL GLAGIDSSLEVSSSRKYDKQ AVPSRNTSKQ NLNPMDSPHSPISPLPPTLSPQPRGQ ETESLDPPSVPVNPALYGNGLELQQLSTLDDRTVLVGQRLPLMAEVSETALYCGIRPS NPESSEKWWHSYRLPPSDDAEFRPPELQGERCDAKMEVNSESTALQRLIJAQPNKRF I WQDKQPQLQPLHFLDPLPLSQQPGDSLGEVNDPYTFEDGDIKYIFTA K CKQGTEKD SLKKNKSEDGFGTKD"VTTPGHSTPVPDGKNAMSIFSSATKTDVRQDNAAGRAGSSSLT QVTDLAPSLHDLDNIFDNSDDDELGAVSPALRSSKMPAVGTEDRPLG DGRAAVPYPP TVADLQRMFPTPPSLEQHPAFSPVM Y DGISSETVTALGMMESPMVSMVSTQLTEFK EVEDGLGSPKPEEIKDFSYVHKVPSFQPFVGSSMFAPLK LPSHCLLPLKIPDACLF RPS AIPPKIEQLPMPPAATFIRDGY NVPSVGSLADPDYLNTPQMNTPVTLNSAAPA SNSGAGVLPSPATPRFSVPTPRTPRTPRTPRGGGTASGQGSVKYDSTDQGSPASTPST TRPLNSVEPAT QPIPEAHSLYVTLILSDSV NIFKDR FDSCCICACNMNIKGADVG LYIPDSSNEDQYRCTCGFSAIMNRKLGYNSGLFLEDELDIFGKNSDIGQAAERRL C QSTFLPQVEGTKKPQEPPISLLLLLQNQHTQPFASLNFLDYISS NRQTLPCVSWSYD RVQADNNDY TECFNALEQGRQYVDNPTGGKVDEALVRSATVHSWPHSNVLDISMLSS QDWRMLLSLQPFLQDAIQKKRTGRTWENIQHVQGPLT QQFHKMAGRGTYGSEESPE PLPIPTLLVGYDKDFLTISPFSLPFWERLLLDPYGGHRDVAYIWCPENEALLEGAKT FFRDLSAVYEMCRLGQHKPICKVLRDGIMRVGKTVAQKLTDELVSEWFNQPWSGEEND NHSRLKLYAQVCRHHLAPYLATLQLDSSLLIPPKYQTPPAAAQGQATPGNAGPLAPNG SAAPPAGSAFNPTSNSSSTNPAASSSASGSSVPPVSSSASAPGISQISTTSSSGFSGS VGGQNPSTGGISADRTQGNIGCGGDTDPGQSSSQPSQDGQESVTERERIGIPTEPDSA DSHAHPPAWIY VDPFTYAAEEDSTSGNFWLLSLMRCYTEMLDNLPEH R SFILQI VPCQYMLQTMKDEQVFYIQYL SMAFSVYCQCRRPLPTQIHIKSLTGFGPAASIEMTL K PERPSPIQLYSPPFIIiAPIKDKQTELGETFGEASQKYNVLFVGYCLSHDQRWLLAS CTDLHGELLETC\A/ IALPNRSRRSKVSARKIGLQKLWE CIGIVQMTSLPWRWIGR LGRLGHGELKDWSILLGECSLQTISKKLKDVCR CGISAADSPSILSACLVAMEPQGS FWMPDAVTMGSVFGRSTAL MQSSQL TPQDASCTHILVFPTSSTIQVAPA 
YPNEDGFSPNNDDMFVDLPFPDDMDNDIGILMTGNLHSSPNSSPVPSPGSPSGIGVGSHFQHS RSQGERLLSREAPEELKQQPLALGYFVSTAKAENLPQWFWSSCPQAQNQCPLFLKASL HHHISVAQTDELLPARNSQRVPHPLDSKTTSDVLRFVLEQYNALSWLTCNPATQDRTS CLPVHFVVLTQLYNAIMNIL (SEQ ID NO: 7) and the mRNA sequence is:
1 ctcggacgcc tcgctccgac atgccccgct ctggcggccg ggctcgcgga ggatcatgac
61 tgcggcagcg aactgggtgg cgaacggggc gagcctggag gattgtcact ccaacctctt
121 ttcgctggct gaactcacgg gaatcaaatg gcgtaggtac aattttggag ggcatgggga
181 ctgtggaccc ataatttcag ccccagccca agatgatcca attctgttaa gtttcatccg
241 ctgtctgcaa gctaacctgc tttgtgtatg gcgtcgtgat gtcaaaccag attgcaaaga
301 gttatggata ttctggtggg gagatgaacc caacctagtg ggtgtaatac atcatgaact
361 gcaggttgtg gaagaaggac tctgggaaaa tggcctttcc tatgaatgta ggacgctgct
421 cttcaaagcg atccacaatc tgttagaaag gtgcctaatg gataagaact tcgttaggat
481 tgggaaatgg tttgtccgac cctacgaaaa ggatgaaaag ccagtcaaca aaagtgagca
541 tttgtcctgt gctttcacat tctttctgca tggagaaagt aatgtatgca caagtgtgga
601 gattgcccag caccagccaa tttatttgat caatgaggag catatacaca tggctcagtc
661 ttcacctgca ccatttcaag tactggtaag tccttatggc ttaaatggga cgctaacagg
721 ccaagcatac aagatgtcag acccagccac tcgtaagttg attgaggaat ggcagtattt
781 ctacccgatg gtgctaaaaa agaaagaaga atcgaaagag gaagacgagt tgggatatga 841 tgatgatttc cctgtggcag ttgaagtaat tgttggtggt gttcggatgg tttacccttc 901 agcatttgtt ttgatctctc agaatgacat cccggttcct cagagtgttg ccagtgctgg 961 aggccacatt gcagttgggc agcaagggct tggtagtgtg aaggacccaa gtaactgtgg 1021 gatgcctctg acccctccca cctctccaga acaggctatc ctaggtgaga gtggaggtat 1081 gcagagtgct gccagtcacc tggtttccca agatggaggg atgataacga tgcacagtcc 1141 aaagagatcg gggaagattc ctccaaaact ccacaatcat atggtccatc gagtctggaa 1201 ggaatgcatc ctcaacagaa cccagtccaa gaggagccaa atgtcaactc caactcttga 1261 agaagagcct gctagcaatc ctgctacttg ggattttgtg gatccaaccc aaagagtcag 1321 ctgttcttgt tccaggcata agcttttaaa acgttgtgca gtcgggccca atcgacctcc 1381 cacagtatct caaccagggt tcagtgcagg accatcatca tcttcatctt taccacctcc 1441 tgcttcttct aagcacaaaa cagcagaaag acaggaaaaa ggagacaagc tgcaaaagag 1501 acccttaata ccatttcacc ataggccctc tgtggccgaa gaattatgca tggagcaaga 1561 tacaccagga cagaaactag ggttggcagg gatagactcc tccttagagg tgtctagcag 1621 taggaaatat gataagcaaa tggccgtgcc ttccagaaat acaagcaagc aaatgaatct 1681 gaatcctatg gattcacctc attcccctat atcccctctg ccaccaacac tcagccctca 1741 gccacgaggt caggaaacag agagtttgga cccaccatcg gtccctgtga atccagccct 1801 ttatggaaat ggactagaac tccagcagtt gtctactctg gatgacagaa ctgtcctcgt 1861 aggccaaaga ctgcctctca tggcagaggt cagcgagaca gccttatatt gtgggattag 1921 gccctcgaac ccggagtcat cagaaaagtg gtggcatagt tatcgtctcc cacccagtga 1981 tgatgctgag ttcaggcctc cagagctcca gggtgagaga tgtgatgcca aaatggaggt 2041 aaactcagag agcactgcat tgcaaagact cttagcacaa cctaacaaac ggtttaaaat 2101 ctggcaagac aaacagcccc agttgcagcc actccacttc cttgacccat tgcctctatc 2161 acaacaacct ggagacagtt tgggagaagt gaatgaccca tatacctttg aagatggtga 2221 cataaaatac atctttacag ccaacaagaa atgcaaacaa gggacggaga aagattccct 2281 gaaaaagaat aagtcagagg atggatttgg taccaaggat gtcactacac caggtcattc 2341 cacgccggtg cctgatggga aaaatgccat gtctattttc agttctgcta ctaaaacaga 2401 tgtccggcag gataatgctg ctggcagagc tggctccagt agccttacac aggtaacaga 2461 tttggcacct 
tccctgcatg acttagacaa catctttgat aattctgatg acgacgaact 2521 tggggctgta tcacctgctc tgcgctcatc aaaaatgcct gcagttggga cagaagaccg 2581 acctcttggg aaggatggaa gagctgctgt tccttatcca ccaacagttg cagacttgca 2641 aaggatgttt cccactccac catctttgga acagcatcct gcattttctc ctgtgatgaa 2701 ttataaagat gggatcagct cagagacagt gacagcatta ggcatgatgg agagccctat 2761 ggtcagtatg gtttcaacac aactcacaga attcaaaatg gaagtggaag atggattagg 2821 aagtcccaag cccgaggaaa ttaaggactt ttcatatgtg cacaaagttc catcctttca 2881 accttttgtg ggatcctcca tgtttgctcc actgaagatg ttgccgagcc attgtttgct 2941 acctctgaag atacctgatg cctgtctgtt tcggccttca tgggcaattc ctcctaaaat 3001 tgaacaactg cccatgcccc ctgcagccac tttcattaga gatggctaca ataacgtgcc 3061 tagtgttggg agcctagcag atccagacta tctgaacaca ccacagatga acacacccgt 3121 gacgttgaac agcgctgccc cagccagcaa tagtggggca ggagtcctac catctccagc 3181 aacccctcgc ttctctgtcc ccacaccacg aacccccagg accccaagaa ctcccagagg 3241 tgggggcact gccagtggtc aagggtctgt taagtatgat agcaccgatc aaggatcacc 3301 agcctccacc ccctctacta cacggcccct caactctgtg gagcccgcca ccatgcagcc 3361 aattcccgaa gcccacagcc tctatgttac cctgattctc tccgattccg tgatgaatat 3421 ctttaaagac agaaactttg acagctgttg catctgtgcc tgcaacatga acatcaaagg 3481 ggcggatgtc gggctttaca tccccgattc ttccaatgag gaccagtacc gctgtacctg 3541 tgggtttagt gcgattatga accgcaaact tggctacaat tcaggactct tccttgaaga 3601 tgagttggat atttttggga agaattctga tattggtcag gctgcagaga ggcgcttaat 3661 gatgtgtcag tccaccttcc ttcctcaggt ggaaggaacc aaaaaacccc aggagccacc 3721 cataagcctt ctcctcctcc tccagaatca acacacacaa ccttttgctt cactgaattt 3781 cctggactac atttcctcta acaatcgcca aactcttccc tgtgtaagct ggagttatga 3841 ccgggtgcaa gcagataata atgattactg gacggaatgc tttaatgcgt tggagcaggg 3901 gcggcagtat gtggataacc ccactggtgg aaaagtggac gaagctctgg tgagaagtgc 3961 cactgtgcac tcttggcctc acagcaatgt gctggacatc agcatgctct cctcccagga 4021 tgtggttcgt atgctgttgt ccctgcagcc ctttctccaa gatgccatcc aaaagaagcg 4081 cacgggcagg acctgggaga acatccagca tgtgcaggga ccactcactt ggcagcagtt 4141 ccataaaatg gcaggacggg 
gaacctacgg ttcggaagaa tctcctgagc cgttgcccat 4201 ccccactctg ctggtaggct atgacaagga tttcctcacc atctcgccat tctccttgcc 4261 gttttgggag aggctcttgt tggacccata tgggggccac cgtgatgttg cctatattgt 4321 ggtgtgtcca gaaaatgagg ccttgctcga aggagccaaa actttcttca gggacttgag 4381 tgctgtatac gagatgtgta ggcttgggca gcacaagccc atctgcaaag tgctacgtga 4441 cgggatcatg cgcgtgggaa aaactgtggc acagaagctg acagatgagc ttgtgagtga 4501 gtggtttaac cagccttgga gcggcgagga gaatgacaat cattccagac tcaaacttta 4561 tgcgcaagtt tgccgccatc acctagcacc ttatttagcc actctgcagc ttgatagcag 4621 cctattgata ccacctaaat accagacccc accagcagca gcacagggac aagctacgcc 4681 agggaatgct gggcccttag ctccaaatgg atcagcagct cccccagctg gcagtgcatt 4741 taatcccacc tcgaatagta gttctacaaa tcctgcagca agtagttctg catctggttc 4801 ctctgtgcca ccggtctcat cgtctgcctc tgctcctggt attagccaga taagcactac 4861 ctcttcttca ggattcagtg gtagtgttgg agggcagaac cccagcactg ggggcatttc 4921 tgcggataga acgcaaggga acataggctg tggtggagac actgaccctg ggcagagctc 4981 ttctcagccc tcacaggatg gacaagagag tgttacagaa agggagagaa taggaattcc 5041 cacggagcct gactctgcag acagccatgc ccaccctcca gctgttgtca tttacatggt 5101 ggacccgttc acgtatgctg cagaggagga ctccacttct gggaactttt ggctgttgag 5161 cttgatgcgc tgctacacag aaatgctgga taatttacct gagcatatga gaaattcttt 5221 cattctccag attgtgcctt gccagtacat gctgcagaca atgaaggatg agcaagtttt 5281 ctacattcaa tacttgaagt ccatggcatt ttcagtgtac tgccagtgca ggcgaccact 5341 gcctacacag atccacatta aatccctcac gggatttggg cctgcagcca gcattgagat 5401 gaccctcaag aaccctgagc ggcccagccc aatccagctt tactcccctc cctttatatt 5461 ggccccaatc aaagacaagc agacagagct gggagagacg tttggtgagg cgagccagaa 5521 atacaatgtg ctcttcgtgg gctattgtct gtctcacgac cagcgctggc ttttggcttc 5581 ctgcactgac ctccatgggg aattattaga gacctgcgtt gtaaatattg ctttaccaaa 5641 caggtcacgg aggagtaaag tatctgcacg taaaattgga ctacagaagt tatgggagtg 5701 gtgcataggg attgtccaaa tgacatctct accctggaga gttgtaatcg ggcgacttgg 5761 gcgtcttggc catggggagc ttaaagattg gagtatcctc cttggagaat gttcactaca 5821 gacaatcagc aaaaagctca aggatgtgtg 
ccggatgtgt ggaatctctg ccgcagactc 5881 tccttctatc cttagtgcct gcctggttgc catggagccc caggggtcct ttgtagtgat 5941 gccagatgct gtcacaatgg gctctgtttt tggccgaagt actgcactga acatgcagtc 6001 atctcagctc aacacccctc aagatgcttc ttgtacacac atcttggtgt tcccaacatc 6061 atcaaccatc caggtggctc cagccaacta ccccaatgaa gatgggttta gccccaacaa T U 2012/036120
6121 tgatgatatg tttgttgacc ttccattccc agatgatatg gacaatgata ttggcatatt 6181 aatgactggg aacctccatt cctctcccaa ctcttcccca gtaccctccc caggctctcc 6241 ttctggaatt ggtgtgggct ctcacttcca gcatagtcgg agccagggtg agcgtcttct 6301 ttctagagaa gcaccagagg agctaaagca gcagcccctg gcccttgggt attttgtatc 6361 aactgccaaa gctgagaatc ttccccagtg gttttggtca tcgtgtcccc aggctcaaaa 6421 ccagtgccct ctcttcttaa aggcttcgct gcatcaccac atttcagtag cacagacaga 6481 cgaacttctg cctgccagga attctcagcg ggttccacac cctcttgact ccaaaaccac 6541 gtcggatgtt ttaaggtttg ttttggagca gtacaacgct ctgtcctggc tcacgtgcaa 6601 tccggccacc caggaccgta cttcctgcct tcccgtccac tttgtggtgc tcactcagtt 6661 gtacaatgcc atcatgaata tactttaatt ggaaaagcac ttgttctctc tggctcagtt 6721 ccttctccct gcaacctcag tccaaggaac ctgctacact ctgcaaataa cccacatcct 6781 tttcttcaga ccactctcca cagtcctgca ctgtgattcc ttctcagcag gcacatgtca 6841 attctgcagt gttcattacc agagtgactc cttgacactt ctctcatgga cctggaaact 6901 tccataagtg gtgactttca gccagtgcgg tggtgtgtgt agccccaacc actggtcccc 6961 aggaagtggt ggtggttgat ggcttttcag cgggaaacag aagagacagt gtccttttgc 7021 acaagagtct gtgttttcag cctctgtata caattgaggg cagtctagcc ctttggatga 7081 aatcctctta gttactggtg tatggcctgt gggttacctg aactccataa tctgggactt 7141 tttaaaaata agaaccagct caagtacatg gtttcatact ggggtttctg tctccctagt 7201 gttcccatcc agattagcat gagtgctttg gttgacttca aacctgtgtg tcaatgcaga 7261 aggtctggag acagcttcat tttgtttatt tattttaatt tgttttgtca tatggttttt 7321 gtgactttat ttttttaatt cacaaggacc aggtacagta gctgaaaccc aattcagatc 7381 caccatagga ttctttgact acatacctct gtcctagaag ccggaaaagg agtaaaaaca 7441 cattggggag atcatgccta aaagtaatat attcaaaacc acccagcagt aggttttgtt 7501 aacaacaaac tggattttaa aagttctgcc atgttaagtg gccagcattt catgaaggat 7561 aacattttta tacagaaggc agtcaagctc aactcagagc catggaggca agtaccttaa 7621 ttagttttat atagtcacaa cggaaatata ttttctagtg aattcttatt ggaagccagg 7681 tctctcctct cattagatca aaagggactt atgtacatac aacaattgaa agtgtttgct 7741 catgaaatca gttataaata tggtgaattt tttctggacc ataggaatat tatttcaaag 7801 
aaatattaca acttaaccat taaattagta cttgaagttg agcctttgtg gtgggacttt 7861 ttaaaaaaat gcctttttaa agcattaatg gctaattgaa gtattttatg actcctcatt 7921 cctggcccag agggttgtct ttgaaaccct gtttctaacc cttgtgttgt gtgtttctgt 7981 ctgaggacag tgggtgtgta ctggcctccc gggagccact gtgaccaggc ctttgagctc 8041 ttgtcatctg tggagagaat catgcaaatt ttaaaagttc ttccaagaga cttccatgtc 8101 ctggttatta acaaaaaagg aaaaatgtaa taattgatat gattttgtaa aagtattttt 8161 cttgaaataa tctaaagttt aaaacattat attaaaaaaa aagttgtgtg gtgggaatgt 8221 gaaagcagag aaataacttg taaatggata attttgttct ctgtaccacc agttgaaggg 8281 ggggttgact ttcgcaatgt ataggataaa aaatctgata tatcaaacca tttgtatcta 8341 atgtgtacag tgtaaaattg actttaaaaa tattgcagtg ctattttttc ttaatcagaa 8401 aggaaaattc tcaaggcctt ttgaagagca taagaagatg aagattgtaa acttgtataa 8461 aattatcttg gtgagaagac aaattgtaaa gtagatattt gtaatctttt accactttgg 8521 ggttgctttt ttcccggaat tcatcagaac tttgaatttt ttttttaaat gggctgtttt 8581 taatgcaggg gcttttcttc cctagaaacc caattctaag cagaaaaaga aaaaaaacac 8641 aaaaaataaa aaacccctac aaaaaaactt taaaaaaaat ggcagcaaag ggtagttttc 8701 atctggtgtc ttttatttaa gttttttaag ttaagaaaag ctggtgacat atttatacgt 8761 ttttgtgcaa aaataaatga atggcaatag attttaaaaa atcttattat gtacttctgt 8821 gtgaaaaagt ctgtataata tttcccttaa atatgcatta ttttacttgt gagtttttta 8881 ctgaattaat ctgaaatgta caagccctgg atttgctaca gagtgagaag ttattttatt 8941 tttttttatt tttaattttg gaaattctgc agaaatcaga actcttacca tggtttgaac 9001 aaaaaaaggg gaaatgggga ggggaaaagg gtgggattgt ccagcatgct tgtatgtata 9061 tttcagaacc ttttttaaat gtaaaagctg tacatttctg ggaagttctg aatttctttt 9121 gtttcctttt ttccttcaag cattttgcag tgagcttctt ttatatatag caaacaattt 9181 gaaagaatac aaaaatatgt gaagttcatt taaaaaaata actacagtat agcgctggta 9241 cagtacacta aaagactttg ataaaaagaa acaataataa aaggcctcca ttttaaatgt 9301 cattcatata taccttgtgg atgagagcta tatactttta cacacttttt tagaggaata 9361 aattattgaa ttactg (SEQ ID NO: 8)
OSBPL8, such as human oxysterol binding protein-like 8, is represented, for example, by accession numbers 212585_at, NM_001003712.0 and NM_020841.4 (2 alternative transcripts), or NP_001003712 (847 aa; gi|51243032) and NP_065892 (889 aa; gi|18079218). The protein sequence for variant 1 is:
EGGLADGEPDRTSLLGDSKDVLGPSTWANSDESQLLTPGKMS
QRQGKEAYPTPTKDLHQPSLSPASPHSQGFERGKEDISQNKDESSLSMSKSKSESKLY NGSEKDSSTSS LT ESL VQ NYREEK RAT ELLSTITDPSVIVMADWL IRGT LKSWTKL CVL PGVLLIY TQKNGQWVGTVLLNACEIIERPS KDGFCFKLFHPLEQ SIWAVKGPKGEAVGSITQPLPSSYLIIRATSESDGRCW DALELAL CSSLLKRTMIR EGKEHDLSVSSDSTHVTFYGLLRANNLHSGDNFQLNDSEIERQHF DQDMYSDKSD E NDQEHDESDNEVMGKSEESDTDTSERQDDSYIEPEPVEPLKETTYTEQSHEELGEAGE ASQTETVSEENKSLIWTLLKQVRPGMDLSKWLPTFILEPRSFLD LSDYYYHADFLS EAALEENPYFRL KWKWYLSGFY KPKGL PYNPILGETFRCLWIHPRTNSKTFYI AEQVSHHPPISAFYVSNRKDGFCLSGSILAKSKFYGNSLSAILEGEARLTFLNRGEDY VMTMPYAHCKGILYGTMTLELGGTVNITCQKTGYSAILEFKLKPFLGSSDCVNQISG L LGKEVLATLEGHWDSEVFITD KTDNSEVF NPTPDI Q RLIRHTVKFEEQGDFE SEKLWQRVTRAINA DQTEATQEKYVLEEAQRQAARDRKT NEEWSCKLFELDPLTGE WHYKFADTRPWDPLNDMIQFE DGVIQTKVKHRTPMVSVPKMKHKPTRQQK VAKGYS SPEPDIQDSSGSEAQSVKPSTRRKKGIELGDIQSSIESIKQTQEEIKRNIMALRNHLV
SSTPATDYFLQQKDYFIIFLLILLQVIINFMFK (SEQ ID NO: 9) and the mRNA sequence for variant 1 is:
1 cactagaatg tgaaggatct tcgcggttct gggtgcccag aaaggcggcg acgcggcgga 61 tgacaacatt aggccgcgac gcgctcctgg ccaggcggcg gctgtagtgt tagctttgga 121 cgccgcagta gccgctgccg gtagcaagcc gactgaggga aggtgggggt ccgcccgggc 181 tggtggacct cggggccgaa agttcccgcc ccgctcgggg gctgagccgg cagtgcctcc 241 gcggccgctg ggcagcgccc ttcgtccagg ctcgcgcccc agctgccgcc gacgacagcg 301 gccgagagaa gttggggtct gactagacgc ttacggggcc tcggaccccg gcgccgcggc 361 gacctcggag gaaccggctc cttgcgtccc gcctccctgg gagctccgca cgggatttgc 421 agatttacag aatggctgca cattaatgga aagagaagca taaacctatc ttctttcatt 481 atggagggag gtttggcaga tggagaacct gatcgaactt cgcttcttgg tgatagcaaa 541 gatgtccttg ggccatcaac tgttgtagca aacagtgacg aatctcagct tctgacacca 601 ggaaagatga gtcagcgcca aggaaaagaa gcttatccaa cgccaaccaa agatttgcat
661 cagccatctc ttagtccagc aagtcctcat agccagggtt ttgaaagagg gaaggaagat
721 atttctcaaa ataaagatga atcttcactt tctatgtcaa agagcaagtc tgaatctaaa
781 ctttataatg gctcagagaa ggacagttca acttcaagca aactcacaaa aaaagaatct
841 cttaaggtac aaaagaaaaa ttaccgagaa gaaaagaaaa gagccacaaa ggagctgctc
901 agtacaatca cagatccttc tgttattgtt atggctgatt ggttaaagat tcgtggtact
961 ctaaagagct ggaccaagtt atggtgtgtg ttgaaacctg gggtgctact gatctataaa
1021 acccaaaaaa atggtcagtg ggtaggaaca gttcttctga atgcctgtga aatcattgaa
1081 cgtccatcaa aaaaggatgg cttttgtttc aaacttttcc atcctttgga gcaatctatt
1141 tgggcagtga agggtccaaa aggtgaagcg gttggatcca ttactcaacc cttacctagc
1201 agttatttga tcatccgagc tacttcagag tcagatggaa ggtgctggat ggatgctttg
1261 gagttggctt tgaaatgttc tagtcttctt aaacgtacaa tgatcagaga aggaaaggaa
1321 catgacctga gcgtttcatc agatagcaca catgtgactt tctatggctt actacgtgct
1381 aacaatctcc acagtggtga taacttccag ttaaatgata gtgaaattga acgacaacat
1441 tttaaggacc aagatatgta ttctgataaa tctgataaag aaaatgatca agaacatgat
1501 gagtctgata atgaggtgat ggggaaaagt gaagaaagtg acacagatac atcagaaaga
1561 caagatgact catatatcga acctgagcct gttgagcctt taaaggagac tacctacact
1621 gaacagagcc atgaagaact tggagaggca ggtgaggctt ctcaaacaga aactgtatct
1681 gaagaaaaca aaagccttat ctggacacta ttgaaacaag tccgtcctgg catggaccta
1741 tccaaggtgg ttctgcctac atttattttg gaaccccgtt ctttcctgga taaactttca
1801 gattactact atcatgcaga tttcctatct gaggcagctc ttgaagaaaa tccttatttc
1861 cgtttgaaga aagtagtgaa atggtatttg tcaggattct ataaaaagcc aaagggactg
1921 aagaaacctt ataatcctat acttggcgag actttccgtt gtttatggat tcatcccaga
1981 acaaacagca aaacttttta tattgctgaa caggtgtccc atcatccacc aatatctgcc
2041 ttttatgtta gtaatcgaaa agatggattt tgccttagcg gtagtatcct ggctaagtct
2101 aagttttatg gaaactcatt atctgcaata ttagagggag aagcacggtt aactttcttg
2161 aatagaggtg aagattatgt aatgacaatg ccatacgctc attgtaaagg aattctttat
2221 ggtacaatga cactggagct tggtggaaca gtcaatatta catgtcaaaa aactggatac
2281 agtgcaatac ttgaatttaa actaaagcca ttcctaggga gtagtgactg tgttaatcaa
2341 atatcaggga aacttaaact gggaaaagaa gtcctagcta ctttggaagg tcattgggat
2401 agtgaagttt ttattactga taaaaagact gataattcag aggttttctg gaatccaaca
2461 cctgacatta agcaatggag attaataagg cacactgtaa aatttgaaga acagggagat
2521 tttgaatcag agaaactctg gcaacgggta actcgagcca taaatgccaa agaccaaact
2581 gaagctaccc aagagaagta tgttttggaa gaagctcaaa gacaagctgc cagggatcgg
2641 aaaacaaaaa atgaagagtg gtcttgcaaa ttatttgaac ttgatccact cacaggagaa
2701 tggcattaca agtttgcaga tacccgacca tgggacccac ttaatgatat gatacagttt
2761 gaaaaagatg gtgttattca gaccaaagtg aaacatcgta ctccaatggt tagcgtcccc
2821 aaaatgaaac ataagccaac caggcaacag aagaaagtag caaaaggcta ttcctcccca
2881 gaacctgaca ttcaagactc ctctggaagt gaagctcaat cagtaaaacc aagtacaaga
2941 agaaagaaag gaatagaact gggagacatt cagagttcca tcgaatctat aaaacaaaca
3001 caggaagaaa ttaaaagaaa tattatggct cttcgaaatc atttagtttc aagcacaccg
3061 gccacggatt attttctgca acaaaaagac tacttcatca ttttcctcct gattttgctt
3121 caagtcataa taaacttcat gttcaagtag aagttctcta ccattgaatc agtgaactag
3181 aaagatctga tttggcctgg gaccagtgtt caagttggtt tggtctttat taaaaatcac
3241 aatattccga aaacaaaaaa acctaggaga taaatgtaga ggtattgact tttcgtatct 3301 tttatcttca cactgaaaca agagctatcc tatttgatta ttaaagtgag ctatgtgtta 3361 agtgccagga catttctagc ttttgtgaga atgtgtctac atatgagtat aataaaccca 3421 catgtataca caattgtctc ttatgtactc ctacctgaca gtagtctttg tattctatag 3481 tatgttctga gatataatgt taacattgtt cataacaaaa aatgctatca atcttataaa 3541 tatatgtaat ctattttctt cataaaacag gcacaaaagt tttatcagta aggaattaca 3601 gattgagaaa tgatggaata atagacataa ttaattcaat acactactgt taaaatcatt 3661 tgcaaagcac tcagctcaat tatcttctta gaaagaaaga aaaagtatga atggtcaaaa 3721 tgaatacatc gagagagata aatggcaaat tgctttttta aaagtttaca taagtttttt 3781 ttaaccccta gaatttaata tttgtagatg caggtaaata tatatactta cgtgtatatc 3841 agtataaaaa cactggtgtg caattaattg gattgattat aataccacct taagcacttg 3901 ctgaaaaaag tgtggtcaaa attgattgct gtccttttgt cttatttttg tttttcttaa 3961 gtcagctggt tcataacata ggccaaattc tagagatgtt tatagagcat ttgaagtgct 4021 gataatttat gttttttcat tatgaaaact tattttagct ttagactcca gtgtgttcag 4081 tgaataagta gaatataaaa aaatataacc agtattttac ttcaaaagcc aaaaagaggc 4141 aataagaaaa gacactttgt ggtggccttt atgtgtgcat taaaattggt ttctgtaaaa 4201 cgtgtaataa gttgagtatc tacgaagagt atcaagttct gaagtttaat ttttttatta 4261 tcctcctctc ttcttagtaa cttctttctg tggcaaaacc acaattcttt aagattccta 4321 ttgttcaggc taaggcaaat ttttttgttt gtttcttcag tttaatattt tgattttgtg 4381 tttttacgta aatatttata ttccttgaaa gcaatttttg ccaaggtagt tcagtttagg 4441 aatatgttgt tctaaaatat gtcttagaat cctgaaagca tagattttga aatgtttttt 4501 taatgaaaat gaaggtcaga gagaataatt gccctgacca catttgcctt tcagtaggag 4561 gaggctgtga aatagtaaaa ttataatcgt ttatgccatg ataaatacaa gattggtaaa 4621 aaatacatt gattggtaaa ttatgagaat caaaatgata aaaagagcct gcttttttcc 4681 ctaaccaata tagctatctt aagtatcctt aggtttctgt gaagaaccat ttcccatgtt 4741 ttcttggcaa aataatgctg tattccatat gtacatgtga aatgatgttt taaattgata 4801 aaagcttaaa taagatctac ctatacccag tattttcatg atattagaac aaatgggttt 4861 ttggttatat tttatatttg tcaatataat ttttgtattc acattctgtt acactctgcc 4921 
tattcattga tatatgatat tctgtaaata ttgtacaatt tgatcttttt tatggtttaa 4981 attagttaat tacatacaaa ttgattggct tatcacaaaa atcatttcat cagtaaacct 5041 tgttaacatt ttgtactggt gacccacctc ttaggacttt ggtcttatcc acgtgtatgt 5101 tgttttcatt tggtccaaat aatattttat ttgtatgggt atcttctaag actaaatagg 5161 tagttgtgtt ctttattttt aaaatttctt tttagagcaa atgttatggg ttcttaccca 5221 aagagtcaaa aactatttct taagaaagag cagagttatt catgactgtt ctttatacac 5281 taaaagcatg catctaatct aatagtcctc ttattatgct tttagttgta tgagtctctt 5341 tctatgaact gaacacaaaa ctcaggaatt ggtggcttaa ttttagatca gtgcttgtac 5401 taggcttagt tatatgaatc tttataacac ataattacta actttgtagc catatatgta 5461 attgactttg aatgttattt acctgaaatt aatcttcctt cacacatgga ccgtaaacgg 5521 ttcccagttg tctgagagcc tcatgagggt ttctaggatt tatgacctta tgaccagttt 5581 ttttcattta ccaagatttt attttcctac atgaaaattt aattgagtaa taattattca 5641 catgtgcatt ttctttttag ctgttaaatg tactatgcca tcatccacca tttagtaaaa 5701 tgtagctggc ccaggacatg taaaaaaaaa aaaaaaacaa caacaataaa tagggcatgt 5761 gaaatgttaa gttacagcaa tagatatttt atttgtattt catgttagta cttttttgtt 5821 ttatatcact tataaaggta cagtgtactc tttgtcacag ctcagttggt aaccgcattc 5881 cattgaaaag ttggccttgt aaaatacaac tctcatttaa tattcatgct tttgtgcctt 5941 taagaaaata ttttttgtca ttttttgtgt tacagaacta taatgtgatt caaggtgttt 6001 ataggcttgt cataaaaggg tcatttctgt gtgttacttt ctttttatat agctatagta 6061 tatttaaaca ataatactat cttttatagg ggtttgtcta tttacctatt ctttactcag 6121 acattgatgt agacttgtca gattattctg agtattgtta acagtgcctt ttcgatggaa 6181 tcacactttt tggctgtcac cttgtgccat atacacacaa aattttgtgg aaggcagttt 6241 taactttctg aagaatatct gtcaaaattt aagaaaacaa atgtataaaa ttccattttt 6301 tccagtgttt agcatttcta gtaagcagtg aggttgtttg acatacagtg atgatggcat 6361 tattgataag ccatacatga gactgcagat tatattgaat catattaaat gtacagaaat 6421 aaaatattag atttatatca aattttccaa tttgaaccag tggggaaaat cccacagaaa 6481 tcagtaagtt tacatttcaa tttctatctt atttgactaa gtggaaagag attctttaaa 6541 atgtataacc tgccattatg taatttggtt tcattttatt ctacctgttg tgtgagttta 6601 gtatatttaa 
tttacttttt gttactcttt acatactgtt tatttttgtt agtttttaat 6661 tgaagatgga ctgttgaaat tgtataggac cagtgtctta ttaatatgat taatatattt 6721 agaagagcca cgtgaaaccc atgacaaaat gaatgtgaat attctttcta aaaatttaga 6781 aaatgttatc tttttgcatt tattatgtaa aactgtttta cagtatcaaa atttttcact 6841 taaagaaaaa aaatgccatg aaacatttga actgatgagc cacagaactt cagttgaaat 6901 ttttttcact ttttagcatg ctaaatatac atctgagttt aaatgttctg tttaatggcc 6961 attcataaat tcaagcacta ccactggtca gttttgtgtg atagaataaa aatatgttac 7021 ctgcagtgta agtacagcac actgtcaaat tcttttcctt aaggtgcaca gtaaatgtac 7081 agatagttat aggccactgt tttgtaatgt agtacatttc taatctatta ttcctaacct 7141 attataactg tttgcagaaa gaaaagaatt tttctaataa tctgtaaaat tatgctaact 7201 tctacaagta ggcttctaaa taaaattttt aaaaagagca (SEQ ID NO:10) .
The protein sequence for variant 2 is:
MSQRQGKEAYPTPTKDLHQPSLSPASPHSQGFERGKEDISQNKD
ESSLSMSKSKSESKLYNGSEKDSSTSS LTKKESLKVQ NYREEK RAT ELLSTIT DPSVIV ADWLKIRGTL S TKLWCVLKPGVLLIYKTQ NGQ VGTVLLNACEIIERP SKKDGFCFKLFHPLEQSIWAVKGPKGEAVGSITQPLPSSYLIIRATSESDGRCWMDAL ELALKCSSLLKRTMIREG EHDLSVSSDSTHVTFYGLLRANNLHSGDNFQLNDSEIER QHF DQDMYSDKSDKENDQEHDESDNEVMGKSEESDTDTSERQDDSYIEPEPVEPLKE TTYTEQSHEELGEAGEASQTETVSEENKSLIWTLLKQVRPGMDLSKWLPTFILEPRS FLDKLSDYYYHADFLSEAALEENPYFRLK W YLSGFYKKP GLKKPYNPILGETF RCLWIHPRTNSKTFYIAEQVSHHPPISAFYVSNRKDGFCLSGSILAKSKFYGNSLSAI LEGEARLTFLNRGEDYV T PYAHCKGILYGTMTLELGGTVNITCQ TGYSAILEFKL KPFLGSSDCVNQISGKL LG EVLATLEGH DSEVFITDK TDNSEVFWNPTPDIKQ RLIRHTVKFEEQGDFESEKLWQRVTRAINAKDQTEATQEKYVLEEAQRQAARDRKTKN EE SCKLFELDPLTGE HYKFADTRP DPLNDMIQFE DGVIQT V HRTPMVSVPKM KHKPTRQQKKVAKGYSSPEPDIQDSSGSEAQSVKPSTRRKKGIELGDIQSSIESIKQT QEEI RNIMALR HLVSSTPATDYFLQQKDYFIIFLLILLQVIINFMFK (SEQ ID NO: 27) and the mRNA sequence for variant 2 is:
1 cactagaatg tgaaggatct tcgcggttct gggtgcccag aaaggcggcg acgcggcgga 61 tgacaacatt aggccgcgac gcgctcctgg ccaggcggcg gctgtagtgt tagctttgga 121 cgccgcagta gccgctgccg gtagcaagcc gactgaggga aggtgggggt ccgcccgggc 181 tggtggacct cggggccgaa agttcccgcc ccgctcgggg gctgagccgg cagtgcctcc 241 gcggccgctg ggcagcgccc ttcgtccagg ctcgcgcccc agctgccgcc gacgacagcg 301 gccgagagaa gttggggtct gactagacgc ttacggggcc tcggaccccg gcgccgcggc 361 gacctcggag gaaccggctc cttgcgtccc gcctccctgg gagctccgca cgggatttgc 421 agatttacag aatggctgca cattaatgga aagagaagca taaacctatc ttctttcatt 481 atggagggag gtttggcaga tggagaacct gatcgaactt cgcaaacagt gacgaatctc 541 agcttctgac accaggaaag atgagtcagc gccaaggaaa agaagcttat ccaacgccaa 601 ccaaagattt gcatcagcca tctcttagtc cagcaagtcc tcatagccag ggttttgaaa 661 gagggaagga agatatttct caaaataaag atgaatcttc actttctatg tcaaagagca 721 agtctgaatc taaactttat aatggctcag agaaggacag ttcaacttca agcaaactca 781 caaaaaaaga atctcttaag gtacaaaaga aaaattaccg agaagaaaag aaaagagcca 841 caaaggagct gctcagtaca atcacagatc cttctgttat tgttatggct gattggttaa 901 agattcgtgg tactctaaag agctggacca agttatggtg tgtgttgaaa cctggggtgc 961 tactgatcta taaaacccaa aaaaatggtc agtgggtagg aacagttctt ctgaatgcct 1021 gtgaaatcat tgaacgtcca tcaaaaaagg atggcttttg tttcaaactt ttccatcctt 1081 tggagcaatc tatttgggca gtgaagggtc caaaaggtga agcggttgga tccattactc 1141 aacccttacc tagcagttat ttgatcatcc gagctacttc agagtcagat ggaaggtgct 1201 ggatggatgc tttggagttg gctttgaaat gttctagtct tcttaaacgt acaatgatca 1261 gagaaggaaa ggaacatgac ctgagcgttt catcagatag cacacatgtg actttctatg 1321 gcttactacg tgctaacaat ctccacagtg gtgataactt ccagttaaat gatagtgaaa 1381 ttgaacgaca acattttaag gaccaagata tgtattctga taaatctgat aaagaaaatg 1441 atcaagaaca tgatgagtct gataatgagg tgatggggaa aagtgaagaa agtgacacag 1501 atacatcaga aagacaagat gactcatata tcgaacctga gcctgttgag cctttaaagg 1561 agactaccta cactgaacag agccatgaag aacttggaga ggcaggtgag gcttctcaaa 1621 cagaaactgt atctgaagaa aacaaaagcc ttatctggac actattgaaa caagtccgtc 1681 ctggcatgga cctatccaag 
gtggttctgc ctacatttat tttggaaccc cgttctttcc 1741 tggataaact ttcagattac tactatcatg cagatttcct atctgaggca gctcttgaag 1801 aaaatcctta tttccgtttg aagaaagtag tgaaatggta tttgtcagga ttctataaaa 1861 agccaaaggg actgaagaaa ccttataatc ctatacttgg cgagactttc cgttgtttat 1921 ggattcatcc cagaacaaac agcaaaactt tttatattgc tgaacaggtg tcccatcatc 1981 caccaatatc tgccttttat gttagtaatc gaaaagatgg attttgcctt agcggtagta 2041 tcctggctaa gtctaagttt tatggaaact cattatctgc aatattagag ggagaagcac 2101 ggttaacttt cttgaataga ggtgaagatt atgtaatgac aatgccatac gctcattgta 2161 aaggaattct ttatggtaca atgacactgg agcttggtgg aacagtcaat attacatgtc 2221 aaaaaactgg atacagtgca atacttgaat ttaaactaaa gccattccta gggagtagtg 2281 actgtgttaa tcaaatatca gggaaactta aactgggaaa agaagtccta gctactttgg 2341 aaggtcattg ggatagtgaa gtttttatta ctgataaaaa gactgataat tcagaggttt 2401 tctggaatcc aacacctgac attaagcaat ggagattaat aaggcacact gtaaaatttg 2461 aagaacaggg agattttgaa tcagagaaac tctggcaacg ggtaactcga gccataaatg 2521 ccaaagacca aactgaagct acccaagaga agtatgtttt ggaagaagct caaagacaag 2581 ctgccaggga tcggaaaaca aaaaatgaag agtggtcttg caaattattt gaacttgatc 2641 cactcacagg agaatggcat tacaagtttg cagatacccg accatgggac ccacttaatg 2701 atatgataca gtttgaaaaa gatggtgtta ttcagaccaa agtgaaacat cgtactccaa 2761 tggttagcgt ccccaaaatg aaacataagc caaccaggca acagaagaaa gtagcaaaag 2012/036120
2821 gctattcctc cccagaacct gacattcaag actcctctgg aagtgaagct caatcagtaa 2881 aaccaagtac aagaagaaag aaaggaatag aactgggaga cattcagagt tccatcgaat 2941 ctataaaaca aacacaggaa gaaattaaaa gaaatattat ggctcttcga aatcatttag 3001 tttcaagcac accggccacg gattattttc tgcaacaaaa agactacttc atcattttcc 3061 tcctgatttt gcttcaagtc ataataaact tcatgttcaa gtagaagttc tctaccattg 3121 aatcagtgaa ctagaaagat ctgatttggc ctgggaccag tgttcaagtt ggtttggtct 3181 ttattaaaaa tcacaatatt ccgaaaacaa aaaaacctag gagataaatg tagaggtatt 3241 gacttttcgt atcttttatc ttcacactga aacaagagct atcctatttg attattaaag 3301 tgagctatgt gttaagtgcc aggacatttc tagcttttgt gagaatgtgt ctacatatga 3361 gtataataaa cccacatgta tacacaattg tctcttatgt actcctacct gacagtagtc 3421 tttgtattct atagtatgtt ctgagatata atgttaacat tgttcataac aaaaaatgct 3481 atcaatctta taaatatatg taatctattt tcttcataaa acaggcacaa aagttttatc 3541 agtaaggaat tacagattga gaaatgatgg aataatagac ataattaatt caatacacta 3601 ctgttaaaat catttgcaaa gcactcagct caattatctt cttagaaaga aagaaaaagt
3661 atgaatggtc aaaatgaata catcgagaga gataaatggc aaattgcttt tttaaaagtt 3721 tacataagtt ttttttaacc cctagaattt aatatttgta gatgcaggta aatatatata 3781 cttacgtgta tatcagtata aaaacactgg tgtgcaatta attggattga ttataatacc 3841 accttaagca cttgctgaaa aaagtgtggt caaaattgat tgctgtcctt ttgtcttatt 3901 tttgtttttc ttaagtcagc tggttcataa cataggccaa attctagaga tgtttataga 3961 gcatttgaag tgctgataat ttatgttttt tcattatgaa aacttatttt agctttagac 4021 tccagtgtgt tcagtgaata agtagaatat aaaaaaatat aaccagtatt ttacttcaaa
4081 agccaaaaag aggcaataag aaaagacact ttgtggtggc ctttatgtgt gcattaaaat 4141 tggtttctgt aaaacgtgta ataagttgag tatctacgaa gagtatcaag ttctgaagtt 4201 taattttttt attatcctcc tctcttctta gtaacttctt tctgtggcaa aaccacaatt 4261 ctttaagatt cctattgttc aggctaaggc aaattttttt gtttgtttct tcagtttaat 4321 attttgattt tgtgttttta cgtaaatatt tatattcctt gaaagcaatt tttgccaagg 4381 tagttcagtt taggaatatg ttgttctaaa atatgtctta gaatcctgaa agcatagatt 4441 ttgaaatgtt tttttaatga aaatgaaggt cagagagaat aattgccctg accacatttg 4501 cctttcagta ggaggaggct gtgaaatagt aaaattataa tcgtttatgc catgataaat 4561 acaagattgg taaataaata cattgattgg taaattatga gaatcaaaat gataaaaaga 4621 gcctgctttt ttccctaacc aatatagcta tcttaagtat ccttaggttt ctgtgaagaa 4681 ccatttccca tgttttcttg gcaaaataat gctgtattcc atatgtacat gtgaaatgat 4741 gttttaaatt gataaaagct taaataagat ctacctatac ccagtatttt catgatatta 4801 gaacaaatgg gtttttggtt atattttata tttgtcaata taatttttgt attcacattc 4861 tgttacactc tgcctattca ttgatatatg atattctgta aatattgtac aatttgatct 4921 tttttatggt ttaaattagt taattacata caaattgatt ggcttatcac aaaaatcatt 4981 tcatcagtaa accttgttaa cattttgtac tggtgaccca cctcttagga ctttggtctt 5041 atccacgtgt atgttgtttt catttggtcc aaataatatt ttatttgtat gggtatcttc 5101 taagactaaa taggtagttg tgttctttat ttttaaaatt tctttttaga gcaaatgtta 5161 tgggttctta cccaaagagt caaaaactat ttcttaagaa agagcagagt tattcatgac 5221 tgttctttat acactaaaag catgcatcta atctaatagt cctcttatta tgcttttagt 5281 tgtatgagtc tctttctatg aactgaacac aaaactcagg aattggtggc ttaattttag 5341 atcagtgctt gtactaggct tagttatatg aatctttata acacataatt actaactttg 5401 tagccatata tgtaattgac tttgaatgtt atttacctga aattaatctt ccttcacaca 36120
5461 tggaccgtaa acggttccca gttgtctgag agcctcatga gggtttctag gatttatgac 5521 cttatgacca gtttttttca tttaccaaga ttttattttc ctacatgaaa atttaattga 5581 gtaataatta ttcacatgtg cattttcttt ttagctgtta aatgtactat gccatcatcc 5641 accatttagt aaaatgtagc tggcccagga catgtaaaaa aaaaaaaaaa acaacaacaa 5701 taaatagggc atgtgaaatg ttaagttaca gcaatagata ttttatttgt atttcatgtt 5761 agtacttttt tgttttatat cacttataaa ggtacagtgt actctttgtc acagctcagt 5821 tggtaaccgc attccattga aaagttggcc ttgtaaaata caactctcat ttaatattca 5881 tgcttttgtg cctttaagaa aatatttttt gtcatttttt gtgttacaga actataatgt 5941 gattcaaggt gtttataggc ttgtcataaa agggtcattt ctgtgtgtta ctttcttttt 6001 atatagctat agtatattta aacaataata ctatctttta taggggtttg tctatttacc 6061 tattctttac tcagacattg atgtagactt gtcagattat tctgagtatt gttaacagtg 6121 ccttttcgat ggaatcacac tttttggctg tcaccttgtg ccatatacac acaaaatttt 6181 gtggaaggca gttttaactt tctgaagaat atctgtcaaa atttaagaaa acaaatgtat 6241 aaaattccat tttttccagt gtttagcatt tctagtaagc agtgaggttg tttgacatac 6301 agtgatgatg gcattattga taagccatac atgagactgc agattatatt gaatcatatt 6361 aaatgtacag aaataaaata ttagatttat atcaaatttt ccaatttgaa ccagtgggga 6421 aaatcccaca gaaatcagta agtttacatt tcaatttcta tcttatttga ctaagtggaa 6481 agagattctt taaaatgtat aacctgccat tatgtaattt ggtttcattt tattctacct 6541 gttgtgtgag tttagtatat ttaatttact ttttgttact ctttacatac tgtttatttt 6601 tgttagtttt taattgaaga tggactgttg aaattgtata ggaccagtgt cttattaata 6661 tgattaatat atttagaaga gccacgtgaa acccatgaca aaatgaatgt gaatattctt 6721 tctaaaaatt tagaaaatgt tatctttttg catttattat gtaaaactgt tttacagtat 6781 caaaattttt cacttaaaga aaaaaaatgc catgaaacat ttgaactgat gagccacaga 6841 acttcagttg aaattttttt cactttttag catgctaaat atacatctga gtttaaatgt 6901 tctgtttaat ggccattcat aaattcaagc actaccactg gtcagttttg tgtgatagaa 6961 taaaaatatg ttacctgcag tgtaagtaca gcacactgtc aaattctttt ccttaaggtg 7021 cacagtaaat gtacagatag ttataggcca ctgttttgta atgtagtaca tttctaatct 7081 attattccta acctattata actgtttgca gaaagaaaag aatttttcta ataatctgta 7141 
aaattatgct aacttctaca agtaggcttc taaataaaat ttttaaaaag agcaaaaaaa 7201 aaaaaaaaaa aaa (SEQ ID NO: 28)
EED, such as human embryonic ectoderm development, is represented, for example, by accession numbers 209572_s_at, NM_003797.2 and NM_152991.1 (2 alternative transcripts), or NP_003788 (441 aa; gi|24141020) and NP_694536 (400 aa; gi|24041023). The protein sequence for variant 1 is:
MSEREVSTAPAGTDMPAAKKQKLSSDENSNPDLSGDENDDAVSI
ESGTNTERPDTPTNTPNAPGR S GKGKWKSKKCKYSFKCVNSLKEDHNQPLFGVQFN HSKEGDPLVFATVGSNRVTLYECHSQGEIRLLQSYVDADADENFYTCAWTYDSNTSH PLLAVAGSRGIIRIINPITMQCIKHYVGHGNAINEL FHPRDPNLLLSVS DHALRLW NIQTDTLVAIFGGVEGHRDEVLSADYDLLGEKIMSCGMDHSL LWRINSKRM NAIKE SYDYNPNKTNRPFISQ IHFPDFSTRDIHRNYVDCVRWLGDLILSKSCENAIVCWKPG MEDDIDKIKPSES VTILGRFDYSQCDIWYMRFSMDFWQ MLALGNQVG LYVWDLE VEDPHKAKCTTLTHHKCGAAIRQTSFSRDSSILIAVCDDASIWRWDRLR (SEQ ID NO: 11) and the mRNA sequence of variant 1 is:
1 ctagcagcgg gtcggagatc gaaggaacgg gccaattgcg gctgaaacgt ctttggaagg 61 aggaaggggg tgagggagca tccctttgag tttcgcctct tctcgaggcg gtggtgggaa 121 gggagacata cttaatactg ccctcttaat ccaacggacc ttacatcgtg tagactgccg 181 ggagggcggc gggaaaaggg caagacggga gttggggaag ggaaggagcc aggaagccgc 241 gcgggagggc gcgcgcgcgc gccccttttt cagcagtgtg gcggggtcgc acgcacgccc 301 gcctcggcgg ctgggcgcga tttgcgacag tggggggggc ggtggaggtg gcggcggcag 361 cggcaacttt gcggcaagct cgggccgggc ttgcttgacg gcggtgtggc ggaggccccg 421 ccccaggcgg caggaacctg gagggaggcg gaggaatatg tccgagaggg aagtgtcgac 481 tgcgccggcg ggaacagaca tgcctgcggc caagaagcag aagctgagca gtgacgagaa 541 cagcaatcca gacctctctg gagacgagaa tgatgacgct gtcagtatag aaagtggtac 601 aaacactgaa cgccctgata cacctacaaa cacgccaaat gcacctggaa ggaaaagttg 661 gggaaaggga aaatggaagt caaagaaatg caaatattct ttcaaatgtg taaatagtct 721 caaggaagat cataaccaac cattgtttgg agttcagttt aactggcaca gtaaagaagg 781 agatccatta gtgtttgcaa ctgtaggaag caacagagtt accttgtatg aatgtcattc 841 acaaggagaa atccggttgt tgcaatctta cgtggatgct gatgctgatg aaaactttta 901 cacttgtgca tggacctatg atagcaatac gagccatcct ctgctggctg tagctggatc 961 tagaggcata attaggataa taaatcctat aacaatgcag tgtataaagc actatgttgg 1021 ccatggaaat gctatcaatg agctgaaatt ccatccaaga gatccaaatc ttctcctgtc 1081 agtaagtaaa gatcatgctt tacgattatg gaatatccag acggacactc tggtggcaat 1141 atttggaggc gtagaagggc acagagatga agttctaagt gctgattatg atcttttggg
1201 tgaaaaaata atgtcctgtg gtatggatca ttctcttaaa ctttggagga tcaattcaaa 1261 gagaatgatg aatgcaatta aggaatctta tgattataat ccaaataaaa ctaacaggcc 1321 atttatttct cagaaaatcc attttcctga tttttctacc agagacatac ataggaatta
1381 tgttgattgt gtgcgatggt taggcgattt gatactttct aagtcttgtg aaaatgccat 1441 tgtgtgctgg aaacctggca agatggaaga tgatatagat aaaattaaac ccagtgaatc 1501 taatgtgact attcttgggc gatttgatta cagccagtgt gacatttggt acatgaggtt 1561 ttctatggat ttctggcaaa agatgcttgc attgggcaat caagttggca aactttatgt 1621 ttgggattta gaagtagaag atcctcataa agccaaatgt acaacactga ctcatcataa 1681 atgtggtgct gctattcgac aaaccagttt tagcagggat agcagcattc ttatagctgt 1741 ttgtgatgat gccagtattt ggcgctggga tcgacttcga taaaatactt ttgcctaatc 1801 aaaattagag tgtgtttgtt gtctgtgtaa aatagaatta atgtatcttg ctagtaaggg 1861 cacgtagagc atttagagtt gtctttcagc attcaatcag gctgagctga atgtagtgat 1921 gtttacattg tttacattct ttgtactgtc ttcctgctca gactctactg cttttaataa 1981 aaatttattt ttgtaaaaaa (SEQ ID NO: 12) .
The protein sequence for variant 2 is:
MSEREVSTAPAGTDMPAAKKQKLSSDENSNPDLSGDENDDAVSI
ESGTNTERPDTPTNTPNAPGR SWGKG WKSKKCKYSFKCVNSL EDHNQPLFGVQFN WHSKEGDPLVFATVGSNRVTLYECHSQGEIRLLQSYVDADADENFYTCAWTYDSNTSH PLLAVAGSRGIIRIINPITMQCIKHYVGHGNAINELKFHPRDPNLLLSVSKDHALRL NIQTDTLVAIFGGVEGHRDEVLSADYDLLGEKI SCG DHSL LWRINSKRMMNAIKE SYDYNPNKTNRPFISQKIHFPDFSTRDIHRNYVDCVR LGDLILSKSCENAIVC KPG KMEDDIDKIKPSESWTILGRFDYSQCDIWYMRFSMDFWQKMLALGNQVGKLYVWDLE
VEDPHKAK (SEQ ID NO: 29) and the mRNA sequence for variant 2 is:
1 ctagcagcgg gtcggagatc gaaggaacgg gccaattgcg gctgaaacgt ctttggaagg 61 aggaaggggg tgagggagca tccctttgag tttcgcctct tctcgaggcg gtggtgggaa 121 gggagacata cttaatactg ccctcttaat ccaacggacc ttacatcgtg tagactgccg 181 ggagggcggc gggaaaaggg caagacggga gttggggaag ggaaggagcc aggaagccgc 241 gcgggagggc gcgcgcgcgc gccccttttt cagcagtgtg gcggggtcgc acgcacgccc 301 gcctcggcgg ctgggcgcga tttgcgacag tggggggggc ggtggaggtg gcggcggcag
361 cggcaacttt gcggcaagct cgggccgggc ttgcttgacg gcggtgtggc ggaggccccg 421 ccccaggcgg caggaacctg gagggaggcg gaggaatatg tccgagaggg aagtgtcgac
481 tgcgccggcg ggaacagaca tgcctgcggc caagaagcag aagctgagca gtgacgagaa 541 cagcaatcca gacctctctg gagacgagaa tgatgacgct gtcagtatag aaagtggtac 601 aaacactgaa cgccctgata cacctacaaa cacgccaaat gcacctggaa ggaaaagttg 661 gggaaaggga aaatggaagt caaagaaatg caaatattct ttcaaatgtg taaatagtct 721 caaggaagat cataaccaac cattgtttgg agttcagttt aactggcaca gtaaagaagg 781 agatccatta gtgtttgcaa ctgtaggaag caacagagtt accttgtatg aatgtcattc 841 acaaggagaa atccggttgt tgcaatctta cgtggatgct gatgctgatg aaaactttta 901 cacttgtgca tggacctatg atagcaatac gagccatcct ctgctggctg tagctggatc 961 tagaggcata attaggataa taaatcctat aacaatgcag tgtataaagc actatgttgg
1021 ccatggaaat gctatcaatg agctgaaatt ccatccaaga gatccaaatc ttctcctgtc 1081 agtaagtaaa gatcatgctt tacgattatg gaatatccag acggacactc tggtggcaat 1141 atttggaggc gtagaagggc acagagatga agttctaagt gctgattatg atcttttggg 1201 tgaaaaaata atgtcctgtg gtatggatca ttctcttaaa ctttggagga tcaattcaaa 1261 gagaatgatg aatgcaatta aggaatctta tgattataat ccaaataaaa ctaacaggcc 1321 atttatttct cagaaaatcc attttcctga tttttctacc agagacatac ataggaatta 1381 tgttgattgt gtgcgatggt taggcgattt gatactttct aagtcttgtg aaaatgccat 1441 tgtgtgctgg aaacctggca agatggaaga tgatatagat aaaattaaac ccagtgaatc 1501 taatgtgact attcttgggc gatttgatta cagccagtgt gacatttggt acatgaggtt 1561 ttctatggat ttctggcaaa agatgcttgc attgggcaat caagttggca aactttatgt 1621 ttgggattta gaagtagaag atcctcataa agccaagtaa gtatttagaa atttctgttc 1681 aaaatttcag gctttttctc cacacttgta tgccaatgta gagaagatca tttatatttg 1741 cagtgccatc cttaagtcat ttttaacatt tactgttttc agagttaaag ttattctttt 1801 ttattccaat aatttttgtt tttcctaagt accttggtga caagtcattt cttgttttta 1861 tacaaattat gtagtgcttg ttgaacttaa aatatatcta aattttataa aatttgagac 1921 tgagctctta gtgaagtata ttctggtttt aagtgctttt cgtatgactt ggaacatctg 1981 cttattttct aatccgctgt tttagggtag acactgacaa cgttatgtgt ggtctttaac 2041 ctgttgtcat gttttttccc tagatgtaca acactgactc atcataaatg tggtgctgct 2101 attcgacaaa ccagttttag cagggatagc agcattctta tagctgtttg tgatgatgcc 2161 agtatttggc gctgggatcg acttcgataa aatacttttg cctaatcaaa attagagtgt 2221 gtttgttgtc tgtgtaaaat agaattaatg tatcttgcta gtaagggcac gtagagcatt 2281 tagagttgtc tttcagcatt caatcaggct gagctgaatg tagtgatgtt tacattgttt 2341 acattctttg tactgtcttc ctgctcagac tctactgctt ttaataaaaa tttatttttg 2401 taaaaaaaaa aaa (SEQ ID NO: 30) .
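As an illustrative aside, the nucleotide listings above are printed in a GenBank-style layout in which each row of sixty bases is preceded by the position of its first base. The following Python sketch shows one way such a listing might be converted to a contiguous sequence and conceptually translated. The function names `clean_listing` and `translate_from_first_atg` are hypothetical and are not part of the disclosure, and starting at the first ATG is a simplifying assumption that holds for the short fragment used here (rows 421 and 481 of the EED listing) but not necessarily for a full-length transcript.

```python
import re

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def clean_listing(text):
    """Remove position numbers and whitespace from a numbered listing."""
    return re.sub(r"[\d\s]+", "", text).lower()


def translate_from_first_atg(seq):
    """Translate from the first ATG until a stop codon or the end of seq.

    Assumption: the first ATG is the start codon. Codons that are not in
    the table (for example, ones containing 'n') are reported as 'X'.
    """
    start = seq.find("atg")
    if start < 0:
        return ""
    residues = []
    for pos in range(start, len(seq) - 2, 3):
        residue = CODON_TABLE.get(seq[pos:pos + 3], "X")
        if residue == "*":
            break
        residues.append(residue)
    return "".join(residues)


# Fragment copied from rows 421 and 481 of the EED mRNA listing.
fragment = ("421 ccccaggcgg caggaacctg gagggaggcg gaggaatatg tccgagaggg "
            "aagtgtcgac 481 tgcgccggcg")
peptide = translate_from_first_atg(clean_listing(fragment))
print(peptide)
```

For this fragment the translated peptide agrees with the first residues of the EED protein sequence given above, which suggests that a comparison of this kind could be used to check a printed protein sequence against its corresponding mRNA listing.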
PKP4, such as human plakophilin 4, is represented, for example, by accession numbers 201929_s_at, NM_001005476.1 and NM_003628.3 (2 alternative transcripts), or NP_001005476 (1149 aa; gI53829378) and NP_003619 (1192 aa; gI53829374). The protein sequence for variant 1 is:
PAPEQASLVEEGQPQTRQEAASTGPGMEPETTATTILASVKEQ
ELQFQRLTRELEVERQIVASQLERCRLGAESPSIASTSSTE SFP RSTDVPNTGVSK
PRVSDAVQPNNYLIRTEPEQGTLYSPEQTSLHESEGSLGNSRSSTQM SYSDSGYQEA
GSFHNSQ VSKADNRQQHSFIGSTNNHVVRNSRAEGQTLVQPSVANRAMRRVSSVPSR
AQSPSYVISTGVSPSRGSLRTSLGSGFGSPSVTDPRPLNPSAYSSTTLPAARAASPYS
QRPASPTAIRRIGSVTSRQTSNPNGPTPQYQTTARVGSPLTLTDAQTRVASPSQGQVG
SSSP RSGMTAVPQHLGPSLQRTVHDMEQFGQQQYDIYER VPPRPDSLTGLRSSYAS
QHSQLGQDLRSAVSPDLHITPIYEGRTYYSPVYRSPNHGTVELQGSQTALYRTGSVGI
GNLQRTSSQRSTLTYQR NYALNTTATYAEPYRPIQYRVQECNYNRLQHAVPADDGTT
RSPSIDSIQKDPREFAWRDPELPEVIHMLQHQFPSVQANAAAYLQHLCFGDN VK EV
CRLGGIKHLVDLLDHRVLEVQ NACGALR LVFGKSTDENKIAMKNVGGIPALLRLLR SIDAEVRELVTGVLWNLSSCDAVKMTIIRDALSTLTNTVIVPHSGW SSFDDDHKI FQTSLVLR TTGCLRNLSSAGEEAR QMRSCEGLVDSLLYVIHTCVNTSDYDSKTVE
NCVCTLRNLSYRLELEVPQARLLGLNELDDLLG ESPSKDSEPSCWGK KKKKKRTPQ
EDQWDGVGPIPGLSKSP GVEMLWHPSWKPYLTLLAESSNPATLEGSAGSLQNLSAG
NWKFAAYIRAAVRKEKGLPILVELLRMDNDRWSSVATALRNMALDVRN ELIGKYA
RDLVNRLPGGNGPSVLSDET AAICCALHEVTSKNMENAKALADSGGIEKLVNITKGR
GDRSSLKWKAAAQVLNTLWQYRDLRSIYKKDGWNQ HFITPVSTLERDRFKSHPSLS
TTNQQMSPIIQSVGSTSSSPALLGIRDPRSEYDRTQPPMQYYNSQGDATHKGLYPGSS
KPSPIYISSYSSPAREQNRRLQHQQLYYSQDDSNRKNFDAYRLYLQSPHSYEDPYFDD
RVHFPASTDYSTQYGL STTNYVDFYSTKRPSYRAEQYPGSPDSWV (SEQ ID NO: 13) and the mRNA sequence of variant 1 is:
1 tccggggctg agtccgcgtc gacgccggcc gcggaggcgg caccatgggc aaggggtaga 61 ggggcaagtt ggccaccgcc gccgccgggg gtggtgggag agccgctccg ggggcggggg 121 ccggtggggg agggaggggc gggcagccgc gccgccgcgg cactttttta attttttcgg 181 gtgccgcagc ggcgacccct cggcgccgat gtccctgatc cctggagcga cgacggccgc 241 tgcctaagct ggaaagagga atgccagctc ctgagcaggc ctcattggtg gaggaggggc 301 aaccacagac ccgccaggaa gctgcctcca ctggcccagg catggaaccc gagaccacag 361 ccaccactat tctagcatcc gtgaaggagc aggagcttca gtttcagcga ctcacccgag 421 aactggaagt ggaaaggcag attgttgcca gtcagctaga aagatgtagg cttggagcag 481 aatcaccaag catcgccagc accagctcaa ctgagaagtc atttccttgg agatcaacag 541 acgtgccaaa tactggtgta agcaaaccta gagtttctga cgctgtccag cccaacaact 601 atctcatcag gacagagcca gaacaaggaa ccctctattc accagaacag acatctctcc 661 atgaaagtga gggatcattg ggtaactcaa gaagttcaac acaaatgaat tcttattccg 721 acagtggata ccaggaagca gggagtttcc acaacagcca gaacgtgagc aaggcagaca 781 acagacagca gcattcattc ataggatcaa ctaacaacca tgtggtgagg aattcaagag 841 ctgaaggaca aacactggtt cagccatcag tagccaatcg ggccatgaga agagttagtt
901 cagttccatc tagagcacag tctccttctt atgttatcag cacaggcgtg tctccttcaa 961 gggggtctct gagaacttct ctgggtagtg gatttggctc tccgtcagtg accgaccccc 1021 gacctctgaa ccccagtgca tattcctcca ccacattacc tgctgcacgg gcagcctctc 1081 cgtactcaca gagacccgcc tccccaacag ctatacggcg gattgggtca gtcacctccc 1141 ggcagacctc caatcccaac ggaccaaccc ctcaatacca aaccaccgcc agagtggggt 1201 ccccactgac cctgacggat gcacagactc gagtagcttc cccatcccaa ggccaggtgg 1261 ggtcgtcgtc ccccaaacgc tcagggatga ccgccgtacc acagcatctg ggaccttcac 1321 tgcaaaggac tgttcatgac atggagcaat tcggacagca gcagtatgac atttatgaga 1381 ggatggttcc acccaggcca gacagcctga caggcttacg gagttcctat gctagtcagc 1441 atagtcagct tgggcaagac cttcgttctg ccgtgtctcc cgacttgcac attactccta 1501 tatatgaggg gaggacctat tacagcccag tgtaccgcag cccaaaccat ggaactgtgg 1561 agctccaagg atcgcagacg gcgttgtatc gcacaggttc agtaggtatt ggaaatctac 1621 aaaggacatc cagccaacga agtaccctta cataccaaag aaataattat gctctgaaca 1681 caacagctac ctacgcggag ccctacaggc ctatacaata ccgagtgcaa gagtgcaatt 1741 ataacaggct tcagcatgca gtgccggctg atgatggcac cacaagatcc ccatcaatag 1801 acagcattca gaaggacccc agggagtttg cctggcgtga tcctgagttg cctgaggtca 1861 ttcacatgct tcagcaccag ttcccatctg ttcaggcaaa tgcagcggcc tacctgcagc 1921 acctgtgctt tggtgacaac aaagtgaaga tggaggtgtg taggttaggg ggaatcaagc 1981 atctggttga ccttctggac cacagagttt tggaagttca gaagaatgct tgtggtgccc 2041 ttcgaaacct cgtttttggc aagtctacag atgaaaataa aatagcaatg aagaatgttg 2101 gtgggatacc tgccttgttg cgactgttga gaaaatctat tgatgcagaa gtaagggagc 2161 ttgttacagg agttctttgg aatttatcct catgtgatgc tgtaaaaatg acaatcattc 2221 gagatgctct ctcaacctta acaaacactg tgattgttcc acattctgga tggaataact 2281 cttcttttga tgatgatcat aaaattaaat ttcagacttc actagttctg cgtaacacga 2341 caggttgcct aaggaacctc agctccgcgg gggaagaagc tcggaagcaa atgcggtcct 2401 gcgaggggct ggtagactca ctgttgtatg tgatccacac gtgtgtgaac acatccgatt 2461 acgacagcaa gacggtggag aactgcgtgt gcaccctgag gaacctgtcc tatcggctgg 2521 agctggaggt gccccaggcc cggttactgg gactgaacga attggatgac ttactaggaa 2581 
aagagtctcc cagcaaagac tctgagccaa gttgctgggg gaagaagaag aaaaagaaaa 2641 agaggactcc gcaagaagat caatgggatg gagttggtcc tatcccagga ctgtcgaagt 2701 cccccaaagg ggttgagatg ctgtggcacc catcggtggt aaaaccatat ctgactcttc 2761 tagcagaaag ttccaaccca gccaccttgg aaggctctgc agggtctctc cagaacctct 2821 ctgctggcaa ctggaagttt gcagcatata tccgggcggc cgtccgaaaa gaaaaggggc 2881 tccccatcct tgtggagctt ctgagaatgg ataacgatag agttgtttct tccgtggcaa 2941 cagccttgag gaatatggca ctagatgttc gcaacaagga gctcataggc aaatacgcca 3001 tgcgagacct ggtcaaccgg ctccccggcg gcaatggccc cagtgtcttg tctgatgaga 3061 ccatggcagc catctgctgt gctctgcacg aggtcaccag caaaaacatg gagaacgcaa 3121 aagccctggc cgactcagga ggcatagaga agctggtgaa cataaccaaa ggcaggggcg 3181 acagatcatc tctgaaagtg gtgaaggcag cagcccaggt cttgaataca ttatggcaat 3241 atcgggacct ccggagcatt tataaaaagg atgggtggaa tcagaaccat tttattacac 3301 ctgtgtcgac attggagcga gaccgattca aatcacatcc ttccttgtct accaccaacc 3361 aacagatgtc acccatcatt cagtcagtcg gcagcacctc ttcctcacca gcactgttag 3421 gaatcagaga ccctcgctct gaatacgata ggacccagcc acctatgcag tattacaata 3481 gccaagggga tgccacacat aaaggcctgt accctggctc cagcaaacct tcaccaattt 3541 acatcagttc ctattcctca ccagcaagag aacaaaatag acggctacag catcaacagc
3601 tgtattatag tcaagatgac tccaacagaa agaactttga tgcatacaga ttgtatttgc
3661 agtctcctca tagctatgaa gatccttatt ttgatgaccg agttcacttt ccagcttcta
3721 ctgattactc aacacagtat ggactgaaat cgaccacaaa ttatgtagac ttttattcca
3781 ctaaacgacc ttcttataga gcagaacagt acccagggtc cccagactca tgggtgtagc
3841 atcaagatgc ccaacagagg aactctttct ttctaacctt gttcagattg aggtgaaaag
3901 tccatcttgc tgatttgatg attgaaatgt gaaagtgaag tggaaggaat gaatgaagtg
3961 tgtttttttt ttcttttttg aggaattatc agggaagtga ggaaatgttt gggagaggac
4021 tttctaagct ctatttaggt gttagatcta attacttata gattctgtag tctggtgaag
4081 gtgtgggtga cgtgatgaga ggtttgagaa atgggtgaaa tgaaatgggg gatatgtagg
4141 tcaaatcaaa ttaaagatga tttttttaat gtgaataaag ttatgttctg atagtttgta
4201 cagaaaaaat aaaatggatg cccatgtttt attgctatta ctaaatgtca agattgtatg
4261 ctattatgtc ttgtaaattt cttttgttgg tgtaaatatg gaaatgccac attggttaag
4321 tgccatcatt tgtaatgcaa tgtgtcactt gaaaagagat ttgaagaaac tgacaacttc
4381 aaaaacaaat gagaagccca aggaactgtg agcaattaaa agcaaaccgc gacacccttt
4441 gtctccacca cacatagtgt actttggaag cacaacgtcc aggctggtac cgcagcgcca
4501 tgcccattcc tcgcctcgtt cataggacac ttcactgcca ttttctattc acataaaaga
4561 aaaataaatg tggaaatttc atccttggaa aaaaaaaaaa aaaa (SEQ ID NO: 14)
The protein sequence for variant 2 is:
MPAPEQASLVEEGQPQTRQEAASTGPGMEPETTATTILASVKEQ
ELQFQRLTRELEVERQIVASQLERC LGAESPSIASTSSTEKSFPWRSTDVPNTGVSK
PRVSDAVQPN YLIRTEPEQGTLYSPEQTSLHESEGSLGNSRSSTQMNSYSDSGYQEA
GSFHNSQNVSKADNRQQHSFIGSTNNHWRNSRAEGQTLVQPSVANRAMRRVSSVPSR
AQSPSYVISTGVSPSRGSLRTSLGSGFGSPSVTDPRPLNPSAYSSTTLPAARAASPYS
QRPASPTAIRRIGSVTSRQTSNPNGPTPQYQTTARVGSPLTLTDAQTRVASPSQGQVG
SSSP RSGMTAVPQHLGPSLQRTVHDMEQFGQQQYDIYERMVPPRPDSLTGLRSSYAS
QHSQLGQDLRSAVSPDLHITPIYEGRTYYSPVYRSPNHGTVELQGSQTALYRTGSVGI
GNLQRTSSQRSTLTYQRNNYALNTTATYAEPYRPIQYRVQECNYNRLQHAVPADDGTT
RSPSIDSIQKDPREFAWRDPELPEVIHMLQHQFPSVQANAAAYLQHLCFGDNKV EV
CRLGGIKHLVDLLDHRVLEVQK ACGALRNLVFGKSTDENKIAM NVGGIPALLRLLR
KSIDAEVRELVTGVLWNLSSCDAVKMTIIRDALSTLTNTVIVPHSGWNNSSFDDDHKI
KFQTSLVLRNTTGCLRNLSSAGEEARKQMRSCEGLVDSLLYVIHTCVNTSDYDS TVE
NCVCTLRNLSYRLELEVPQARLLGLNELDDLLGKESPSKDSEPSCWGK KKKKKRTPQ
EDQWDGVGPIPGLSKSPKGVEMLWHPSWKPYLTLLAESSNPATLEGSAGSLQNLSAG
NWKFAAYIRAAVRKEKGLPILVELLRMDNDRWSSVATALRNMALDVR KELIGKYAM
RDLVNRLPGGNGPSVLSDETMAAICCALHEVTSKNMENAKALADSGGIE LVNITKGR
GDRSSLKW AAAQVLNTLWQYRDLRSIYKKDGWNQNHFITPVSTLERDRFKSHPSLS
TTNQQ SPIIQSGSS PSPIYISSYSSPAREQNRRLQHQQLYYSQDDSNRKNFDAYRL
YLQSPHSYEDPYFDDRVHFPASTDYSTQYGL STTNYVDFYSTKRPSYRAEQYPGSPD
SWV (SEQ ID NO: 31) and the mRNA sequence for variant 2 is:
1 tccggggctg agtccgcgtc gacgccggcc gcggaggcgg caccatgggc aaggggtaga 61 ggggcaagtt ggccaccgcc gccgccgggg gtggtgggag agccgctccg ggggcggggg 121 ccggtggggg agggaggggc gggcagccgc gccgccgcgg cactttttta attttttcgg 181 gtgccgcagc ggcgacccct cggcgccgat gtccctgatc cctggagcga cgacggccgc
241 tgcctaagct ggaaagagga atgccagctc ctgagcaggc ctcattggtg gaggaggggc
301 aaccacagac ccgccaggaa gctgcctcca ctggcccagg catggaaccc gagaccacag
361 ccaccactat tctagcatcc gtgaaggagc aggagcttca gtttcagcga ctcacccgag
421 aactggaagt ggaaaggcag attgttgcca gtcagctaga aagatgtagg cttggagcag
481 aatcaccaag catcgccagc accagctcaa ctgagaagtc atttccttgg agatcaacag
541 acgtgccaaa tactggtgta agcaaaccta gagtttctga cgctgtccag cccaacaact
601 atctcatcag gacagagcca gaacaaggaa ccctctattc accagaacag acatctctcc
661 atgaaagtga gggatcattg ggtaactcaa gaagttcaac acaaatgaat tcttattccg
721 acagtggata ccaggaagca gggagtttcc acaacagcca gaacgtgagc aaggcagaca
781 acagacagca gcattcattc ataggatcaa ctaacaacca tgtggtgagg aattcaagag
841 ctgaaggaca aacactggtt cagccatcag tagccaatcg ggccatgaga agagttagtt
901 cagttccatc tagagcacag tctccttctt atgttatcag cacaggcgtg tctccttcaa
961 gggggtctct gagaacttct ctgggtagtg gatttggctc tccgtcagtg accgaccccc
1021 gacctctgaa ccccagtgca tattcctcca ccacattacc tgctgcacgg gcagcctctc
1081 cgtactcaca gagacccgcc tccccaacag ctatacggcg gattgggtca gtcacctccc
1141 ggcagacctc caatcccaac ggaccaaccc ctcaatacca aaccaccgcc agagtggggt
1201 ccccactgac cctgacggat gcacagactc gagtagcttc cccatcccaa ggccaggtgg
1261 ggtcgtcgtc ccccaaacgc tcagggatga ccgccgtacc acagcatctg ggaccttcac
1321 tgcaaaggac tgttcatgac atggagcaat tcggacagca gcagtatgac atttatgaga
1381 ggatggttcc acccaggcca gacagcctga caggcttacg gagttcctat gctagtcagc
1441 atagtcagct tgggcaagac cttcgttctg ccgtgtctcc cgacttgcac attactccta
1501 tatatgaggg gaggacctat tacagcccag tgtaccgcag cccaaaccat ggaactgtgg
1561 agctccaagg atcgcagacg gcgttgtatc gcacaggttc agtaggtatt ggaaatctac
1621 aaaggacatc cagccaacga agtaccctta cataccaaag aaataattat gctctgaaca
1681 caacagctac ctacgcggag ccctacaggc ctatacaata ccgagtgcaa gagtgcaatt
1741 ataacaggct tcagcatgca gtgccggctg atgatggcac cacaagatcc ccatcaatag
1801 acagcattca gaaggacccc agggagtttg cctggcgtga tcctgagttg cctgaggtca
1861 ttcacatgct tcagcaccag ttcccatctg ttcaggcaaa tgcagcggcc tacctgcagc
1921 acctgtgctt tggtgacaac aaagtgaaga tggaggtgtg taggttaggg ggaatcaagc
1981 atctggttga ccttctggac cacagagttt tggaagttca gaagaatgct tgtggtgccc
2041 ttcgaaacct cgtttttggc aagtctacag atgaaaataa aatagcaatg aagaatgttg
2101 gtgggatacc tgccttgttg cgactgttga gaaaatctat tgatgcagaa gtaagggagc
2161 ttgttacagg agttctttgg aatttatcct catgtgatgc tgtaaaaatg acaatcattc
2221 gagatgctct ctcaacctta acaaacactg tgattgttcc acattctgga tggaataact
2281 cttcttttga tgatgatcat aaaattaaat ttcagacttc actagttctg cgtaacacga
2341 caggttgcct aaggaacctc agctccgcgg gggaagaagc tcggaagcaa atgcggtcct
2401 gcgaggggct ggtagactca ctgttgtatg tgatccacac gtgtgtgaac acatccgatt
2461 acgacagcaa gacggtggag aactgcgtgt gcaccctgag gaacctgtcc tatcggctgg
2521 agctggaggt gccccaggcc cggttactgg gactgaacga attggatgac ttactaggaa
2581 aagagtctcc cagcaaagac tctgagccaa gttgctgggg gaagaagaag aaaaagaaaa
2641 agaggactcc gcaagaagat caatgggatg gagttggtcc tatcccagga ctgtcgaagt
2701 cccccaaagg ggttgagatg ctgtggcacc catcggtggt aaaaccatat ctgactcttc
2761 tagcagaaag ttccaaccca gccaccttgg aaggctctgc agggtctctc cagaacctct
2821 ctgctggcaa ctggaagttt gcagcatata tccgggcggc cgtccgaaaa gaaaaggggc 2881 tccccatcct tgtggagctt ctgagaatgg ataacgatag agttgtttct tccgtggcaa 2941 cagccttgag gaatatggca ctagatgttc gcaacaagga gctcataggc aaatacgcca 3001 tgcgagacct ggtcaaccgg ctccccggcg gcaatggccc cagtgtcttg tctgatgaga 3061 ccatggcagc catctgctgt gctctgcacg aggtcaccag caaaaacatg gagaacgcaa 3121 aagccctggc cgactcagga ggcatagaga agctggtgaa cataaccaaa ggcaggggcg 3181 acagatcatc tctgaaagtg gtgaaggcag cagcccaggt cttgaataca ttatggcaat 3241 atcgggacct ccggagcatt tataaaaagg atgggtggaa tcagaaccat tttattacac 3301 ctgtgtcgac attggagcga gaccgattca aatcacatcc ttccttgtct accaccaacc 3361 aacagatgtc acccatcatt cagtcaggct ccagcaaacc ttcaccaatt tacatcagtt 3421 cctattcctc accagcaaga gaacaaaata gacggctaca gcatcaacag ctgtattata 3481 gtcaagatga ctccaacaga aagaactttg atgcatacag attgtatttg cagtctcctc 3541 atagctatga agatccttat tttgatgacc gagttcactt tccagcttct actgattact 3601 caacacagta tggactgaaa tcgaccacaa attatgtaga cttttattcc actaaacgac 3661 cttcttatag agcagaacag tacccagggt ccccagactc atgggtgtag catcaagatg 3721 cccaacagag gaactctttc tttctaacct tgttcagatt gaggtgaaaa gtccatcttg 3781 ctgatttgat gattgaaatg tgaaagtgaa gtggaaggaa tgaatgaagt gtgttttttt 3841 tttctttttt gaggaattat cagggaagtg aggaaatgtt tgggagagga ctttctaagc 3901 tctatttagg tgttagatct aattacttat agattctgta gtctggtgaa ggtgtgggtg 3961 acgtgatgag aggtttgaga aatgggtgaa atgaaatggg ggatatgtag gtcaaatcaa 4021 attaaagatg atttttttaa tgtgaataaa gttatgttct gatagtttgt acagaaaaaa 4081 taaaatggat gcccatgttt tattgctatt actaaatgtc aagattgtat gctattatgt 4141 cttgtaaatt tcttttgttg gtgtaaatat ggaaatgcca cattggttaa gtgccatcat 4201 ttgtaatgca atgtgtcact tgaaaagaga tttgaagaaa ctgacaactt caaaaacaaa 4261 tgagaagccc aaggaactgt gagcaattaa aagcaaaccg cgacaccctt tgtctccacc 4321 acacatagtg tactttggaa gcacaacgtc caggctggta ccgcagcgcc atgcccattc 4381 ctcgcctcgt tcataggaca cttcactgcc attttctatt cacataaaag aaaaataaat 4441 gtggaaattt catccttgga aaaaaaaaaa aaaaa (SEQ ID NO: 32) .
SSR1, such as human signal sequence receptor, alpha, is represented, for example, by accession numbers 200891_s_at, NM_003144.3, or NP_003135 (286 aa; gI16904009). The protein sequence is:
MRLLPRLLLLLLLVFPATVLFRGGPRGLLAVAQDLTEDEETVED
SIIEDEDDEAEVEEDEPTDLVEDKEEEDVSGEPEASPSADTTILFVKGEDFPANNIVK FLVGFTNKGTEDFIVESLDASFRYPQDYQFYIQNFTALPLNTWPPQRQATFEYSFIP AEPMGGRPFGLVINLNY DLNG VFQDAVFNQTVTVIEREDGLDGETIFMYMFLAGLG LLVIVGLHQLLESRKRKRPIQKVEMGTSSQNDVDMSWIPQETLNQINKASPRRLPR R
AQKRSVGSDE (SEQ ID NO: 15) and the mRNA sequence is:
1 cccacctctc cgcgctgagc agccacgtcg gcggcggtcc ctggtccagg gaggggcgtg 61 gcaaaggccc gtgcgcggta cgtgtcccgc ccctcgctgc ccggagcccg gatgaagagt 121 aacgccatta ccgccggagc cgccgagagc cttagccgac ggaaactgga cactggaccg 181 gcagcgccat gagactcctc ccccgcttgc tgctgcttct cttactcgtg ttccctgcca 241 ctgtcttgtt ccgaggcggc cccagaggct tgttagcagt ggcacaagat cttacagagg 301 atgaagaaac agtagaagat tccataattg aggatgaaga tgatgaagcc gaggtagaag 361 aagatgaacc cacagatttg gtagaagata aagaggaaga agatgtgtct ggtgaacctg 421 aagcttcacc gagtgcagat acaactatac tgtttgtaaa aggagaagat tttccagcaa 481 ataacattgt gaagttcctg gtaggcttta ccaacaaggg tacagaagat tttattgttg 541 aatccttaga tgcctcattc cgttatcctc aggactacca gttttatatc cagaatttca 601 cagctcttcc tctgaacact gtagtgccac cccagagaca ggcaactttt gagtactctt 661 tcattcctgc agagcccatg ggcggacgac catttggttt ggtcatcaat ctgaactaca 721 aagatttgaa cggcaatgta ttccaagatg cagtcttcaa tcaaacagtt acagttattg 781 aaagagagga tgggttagat ggagaaacaa tctttatgta tatgttcctt gctggtcttg 841 ggcttctggt tattgttggc cttcatcaac tcctagaatc tagaaagcgt aagagaccca 901 tacagaaagt agaaatgggt acatcaagtc agaatgatgt tgacatgagt tggattcctc 961 aggaaacatt gaatcaaatc aataaagctt caccaagaag gttgcccagg aaacgggcac 1021 agaagagatc agtgggatct gatgagtaaa tgttcctttg tgcaacaatt cggtctttac 1081 ttaacctgcc ctaatatttt tcggcctgat gggaattagt gcagagaagc catgtcacca 1141 tagaaggcaa ctcctacttg tgtgtggact gagcaatcag agtctgtggc gataatattg 1201 ctgaaaatgc actgcattca tttttctaaa gtaacaaatt tggttttttt ttaaaccatt 1261 aaaatctatg tgtgtgcgtg tgtatgtatg tgagcagttg gtcttaccag aatcattgtt 1321 gaactacctg aaacaagtct ttagaatact aaatataatg ctgttgtctc ttcctttttg 1381 acattttctg attttttccc ccaaaactca gttaatattt acccactatg attattgatg 1441 tcctgccttg aacagtttta aagaaaacaa tttttggaat agctcaaatt tcaattgatg 1501 gcacaaatca gcattttgtt gttgttactg tattacaatt agtattctaa aggcagaagc 1561 agaagtagct gctttttagc aatagaattg tttcagtatt ttgctgctgt ttaatgcgca 1621 tcttcagaaa acttcccagt ggcttcaagg aatttgggga tctctctggc aacaaattgt 1681 gaaacatgaa atttctgctg 
actttaatat atgaaaccta atcctacccc cttttttaac 1741 aaaaagaaac tagtacattt gtgaaaattg tgttgtgttg tccattgttg ctctagttct 1801 gacccagagg tagctctgga gtgattttag acctactcac tcagttgtgt gtaggttttt 1861 ttgttttgtt ttgagagaga atttttctct ccttaataga agcatccttt ttaaagagaa 1921 gttgccttgg tccacacact aagcagaaaa ccaagttatc aggacagaga tatttcccag 1981 ttactcctaa tcaatgaaga aagtgagttg gatattttta aagcagttaa ctaatttttt 2041 cttacctaat cttttgggag ttttgcttgt tgatataacc tttttagtta acctgaaaga 2101 ttccaaaaat tgttcttaag tgcttgagac tggaaccaaa attaaattgt acttcataaa 2161 atcctcttat agagttactc ttgccctaga ttgtaaatta agtttggcat tattgtcaga 2221 ctggatggag ggtgaagtaa aatagtatga acaattaaga ggctctcccc ctcttgtctt 2281 taagccatat tctcctacat gtattttata agaaaatgtt aagtcaaatt ttagtggctc 2341 tttaattcct gacctcttca ttctcctttt cagtataacc tcccctatgc tcatgcccac 2401 acagacaaaa aaacaaaacg aaatacacac agaaaaaagt ctttccaaac tgtttaagta 2461 tttaaacatc tgagccaaag cagatagaag ttattgtata attgttaatc actttgcaaa 2521 taggggctat caaattacct atattggcat tgctggatta taaactctat atctgtaata 2581 taaagtgttt gagtttttaa ttgggctgtt atgatcagta gttgattttg agaaagctct 2641 atgagctcta agtaactgca tggttttttg tttaatgtaa tataggagac ccttcacatt 2701 cccaaggaat atattccaaa acatttttgt gaatatctaa gtttgtgaaa ctactagggc 2761 atgatacagt aaggtgtaat tacagaattt acgaaatgta aatggcctct acagagtttt 2821 atggaatacc tggtactaac gtaggcagct gcaaaaccac actgagttac agctgtcagc 2881 cctcctcatt cctaaataac ttgccttaca tatcagccct cccacttctg aagttcaaat 2941 tagtgcctcg gaaatgtaga atttattatt tgtcattttt ttttttttag catagattga 3001 gaacagttga actcttaaat cctcagatgc caggggtctg ctctagcatc agtaagtatt 3061 tagcagaaac taactccgta atgaatggaa ttcaattcca cacatggttt gttcaagcac 3121 acttaataag tagcctattt tttaaatgtc tttttaaaat gtaaatattt ggatgaagtt 3181 tttctttgtt ttgatatatt catttgctac accaactatg ttttcagaat tcatcttttg 3241 aacaacttgg tttcagaata tgtaaaatga ctttaaggat cttgtgtatc aaacctatcc 3301 ccggatgtgt gagaataatg tgttcataaa gcatggatct cgctttggtt gtatagcttc 3361 ctcatttact tcatggtctt acatagctgg 
tgactttcag ggctaatctg ccctctaaag 3421 cattgtccca ggagaggaaa aggaaatggg acctcagaag tagaagcctc agggaaggag 3481 taaagtagaa atcagaagaa aagaagcttc acttgatagt aataaggttt ttaacttcaa 3541 gtaccttcag aaaatgtgat tttgataaga ggaaagggca aatttagacc ttaaaaaata
3601 tggaagaact actgccttaa aagtgcattt gtggcacatc agcctagaac tgtatcatgg 3661 ctgtgctggg gagaagtaaa tggtggtaat gtaacattgc cacctttact taatgatgtg 3721 ttattttcga ggtacagtag atcaatatag taataggcga gcctcatata tagcattcat 3781 cttgtacaat gatatccata cccttgatat gaaggaaaat tgacttggtt tgtgcatttg 3841 aatactgaaa taatttttta aaactcagtg acacatacca tctcttgcca agactagacc 3901 ctgtatttta gttcctaaat tagatgttta aatttaaaaa cattttcaga tgtacttaag 3961 tacttccata gtactttttt tttttttttt ttttttgccc cctgagacgg agtctctgtg
4021 tcacccaggc tggagtgcag tggtgcgatc tcggctcact gcaacttctg cctcttgggt
4081 tcaagcagtt ctccctgtct cagcctcctg agtagcttgg attacaggcg cccgtcaccg 4141 cacctgccta atttttgcat ttttagtaga gacggggttt cgtcatttgg ccaggctggt 4201 cttgaactcc tgaccacagg tgatacgcct gccttggcct cccaaagtgt gctgggatta 4261 caggcgtgag ccactgtgcc cggcccactg ttcacttttt gaatggcatc attttatagc 4321 tgtagaacta aaatcaatgt ttgccccaat tttcttaagt aaaactctac tttgagctct 4381 tacctccaac ttagtaaaaa gcagcttcac acacaaacaa gattcttact ggtggaatgt 4441 taggtttcgt tgttaagtta atctgtctat aagctcatcc ttagaggata tttgaggagg 4501 aagaacacct tgcagctgac ttgcaaacat ctaaataatt tatttcgggt gcttatgaat 4561 gttactaatg gattttgtat gaatttttat cccttttcat ttatacaaaa acctgggctt 4621 ttatgttaat tatatcatct gaggttctaa ggtttttttt ttagattttg aaatttaggg 4681 ataatagctc ttaggtttgg gtaccacttt gctgcagttt aagaaagggg gaagggaact 4741 catttattaa acatcaatca cgtgctgtgt tctgtttgtt ttctagtcat catatcacac 4801 acctttacga cagctcactg aaggaaggtg atactgttcc cattttgtag atggaataga 4861 caaaacctga atttaagtag cttgctcaag gttccatatt gaatatggaa agttcaaatc 4921 atctcagtaa tgaatatacc atatatactt gctgtattgt atctatgata attcagttac 4981 ccacaatacc cttttaaatt tctgttaatg acataccttt aaatgtctcc ttgatgaaca
5041 gaatcatggt ctttaaaaac attttcatgg gttgattgca ttttcaagct ctaaaggatt 5101 gaaagataaa tcttcacgtt aaaggtaaga gtgaagtatc tgctcttggg ttacagaacc 5161 agatagtact agaactaaga ttacagggta aagctgcttt tatctttttt ctttttcttt 5221 ttcttttttt ttttgacatg gggtctcact gtattgccca ggctggaatg cagtggcatg 5281 atctcagctc acggcagcct ctgcctcttg ggctcaagcg attctcctgc ttcagctttc 5341 caagtatttg ggaccacagg cgcacaccac aggcctggct aatgtttttg ttttgttttt 5401 ggtagagacg gggtttcacc atgttgccag gctggtctcg aactcctgag ctcaagtgat 5461 tcacccacct tggcctcaca aagtgtcagg cttacaggcg tgagccactg cgcccggctc 5521 acagggtaag gcttctgtct ggtgtgttgt attacggatt ttgcttaata ggcacagtga
5581 ggcattaaaa agaaaattca gtatgcctgt agaaaggata atccttgttt aaagtctcca 5641 aattgcagtc aaagatgttt tgactgtgcc tttttttgtt cccctgctgt cccttatgta 5701 gacttctgtc agtacccatg gcagcctgtc atcttgttga catctccttc tggactgtga 5761 gctctgtatc tggcttgttt ttcatcccca gcttctagtt cacaattagg tagaacccta 5821 ttactctttg aagaaggaac aagaaaatgt gggccagttt tcatttgcca ttcttccatg 5881 tgagttagta tggttcgtaa gtattcctgg tgatacgcta gtattggcaa ttctgtgagg 5941 ttgaacaaag gggtggtatg gtgtgctagc gtgggaatta ggagacctct gggtcttgac 6001 agtgccctgg ccactaagca aaggcagttc atccttggag tctcaatgtg cttttttgtt 6061 aattgagata tgcttgaagt atcagcccta aatagtctga ttctgtgacc tacaaaccct 6121 tacttaattc agtgttacta taaatgattc ttcccttaaa cctacttttt acttagcaaa 6181 agagaaaaaa aaaaaaagga aacgctcatg tcaggctgcc tgggtttgaa ccccagatcc 6241 cctcttagtt ggaggcaagt tgtatgatgc ttcagctttt ggttcctgcc tcctaaggtt 6301 gttaggagtt actgtgtgta gctgcttaga acattgcctg ggtctcagta agccgcttaa 6361 gtgactgctc tcattttcgc tgtaaagcac catactgtaa taacatccca tgaagcatgg 6421 ggcggggaag agtatatggt accttatgga ctttgatgtg gtggggtagt aggtaaattc 6481 tgaatatgta agctacatag tattcttttt gtaactaaag gataaaagtt ttaagatggg 6541 catgtaatat ggctagcact ggattttaat gatgagccag agtaataagg ctggcaaggg 6601 gagtttttgt tttgtaataa atctcttaca cctgcacatc cagttctttt taaaaacacg 6661 tttgaagagg ctccatattt ctcagtgaag ctgttggggt tcatgttatt tttgaaaatc 6721 atcttgcaat ttatttcagc atcagactga accacccaaa gaaaattagt ctatttctga 6781 acagtttttt cagaaaccat attggtctga tcacccaaca tttatagaaa taggcgctta 6841 aggcctagag gcatttcaga aaggagaatg agaaatactg tggtcaagag tagtgttcag 6901 atggagaaca tcaagcatct gctctgccct tccacacaca tcctcatttc atcttcacaa 6961 ctgccctagg taaccttgtt atctgtaatg aaacagcctc aaggagcaca gaggcactca 7021 ctcctcacgt ttggtctgtt gccatttccg gcttggttgg aatagggtag gagccttttg 7081 gcagggagca cattctcagt aatgcagagt gcactcacct gggttctagc ttcaacactt 7141 aggatttgct tgatattttg tattctctga ggcactgcct gtatttgttt ctgctatcac 7201 agtccagagt caagcttcat ttataaagac tgggcagggc atggtggctc actcctgtaa 7261 
ttccagcact ttgggaggct gaggtgggtg gatcacttta ggtcaggagt tcaagaccag 7321 cctggccaac atggtgaaac cccatctcta ctaaaaaata caaaaaggcc gggcacggtg 7381 gctcacacct gtaatcccaa cactttggga ggccaaggtg ggcagaacac ctgaggtcag 7441 gagttcaaga ccatcctggc gaacatagtg aaacctcgcc tctactaaaa atacaaaaat 7501 tagccaggtg tggtggtgca tgcctgtaat cccagctact tgggaggctg aggcaggaga 7561 attgcttaaa cctgggaggt ggagattgtg gtgagctgag atcgtgccac tgcactccag 7621 cctgggagat agaacgagac tccatctcaa aaaagaacaa aaacaattag ccgagcgagg 7681 tggtgcacgc ctgtaatccc agctactcat gaggctgagg caggagaatc acttgaaccc 7741 aggaggagga ggttgtagtg agtcaaggtt gcaccactgc actccagcct gggtgacaga 7801 gtgagactgt ctcaaaaaat aagtacataa ataataagta aaagctacta acaattaaaa 7861 aataaataaa taaagacaag actgtctgga aaatggctct cctaaaagga ccagttgcca 7921 tcatccacag tggaagattc aaagcagttg gtccttggta cgtatgagaa gcggatttca 7981 ttcccttgaa ttctacagag cagtttatta gagtgaatgc attttaaggc cttgcatttg 8041 atatgtcatc cagttcataa tcaagttgcc tttttctggc taaaacataa tgattatgta 8101 tttttctcat ttggtcctac aagctgctgg ccctttgtcc ctccactgtg ggaatcagat 8161 ctagaggagg ctgagcctgc agacacagca gtggccaaaa ggtcactcta agtgttttgt 8221 cttgactcct tacttgaagt ccacccagct agcacacatc tggtttatac tgaagccccc 8281 tgcctagaaa tactcatttc aggaaccacc agtaagcatc tgtgaccaca caggcttttt 8341 gactgatggc ttcccggatc tggtttcaag ggataacccc gtctgtgtgc atctatggtc 8401 ttctctctac agcgaggact ttgcagtgct gcttgtggtc cacacaaggg gctcagagct 8461 gagtctgaac tgcttcatgg tcaccagctc ctgtcccttc cagtcttgag aggctttttt 8521 ctccagatgg aacctttcct tcccgccgtt ttctcggtct ctggctgttt ttctcttgtg 8581 cccgtctaat tggacacctc ctggcttcca tctctgtggt tctcctgcct cacttcctgt 8641 tctgttgttt ttccgttttg tcaaaatatc tcctatgttc ttggcttcct tttcgtcgcc 8701 aggttttcag ctttccttta gctcttcttc taatatggct tctgcccaca aaagcctgct 8761 ctgtcaggat ctcatggttc tccacttgcc agaaccttct tcagcctcag ttcctcggcc 8821 tcaacttgta cgtttaaccc attgaccacc accccccaaa ttcaccttca tttctttgac 8881 cctgctcctc actccttttc tgttgaggaa tctgttgact aactccaggc tcactcaggc 8941 tcaccgtcct 
gctctctgca ccagcctttc cagagcgtgc cagttctcat ggcttcatct 9001 gttaactgtt gatcacttca gtcctgattt ttagacctaa atggtttcct taacgccatt 9061 ctaactgcct gtgactcatt ttcacttaca gtgtttattg taacgccaaa ccaacaaatc 9121 acaggtgctt gcttctctcc ataaatctcc ccagtctaac tttttgtcat tcaacatgac 9181 tcgtttatcc aacctgaaat cgcatatagc cccaagtatg gtgttttgta cacaggtatt 9241 taataagtga cttccagttt tggctctgct atgaataaaa agagatttca gttctcttca 9301 ctttgaaatc taacaactca gagaacattg aagaaattgg aatttagttg ggatgaaata 9361 cttgtggttt aaaatatttc tgttcatatt ttctaatttg ttgccggagg tcttgggttt 9421 tctatttgag tgcttgcaaa ctcaatgtga tttctgtcag catatcttag gtttgtttgt 9481 tatgaaactt atgcagtgtg aggttctatc tgaaaatgtt atttagctat cttctgggac 9541 tatttaatga aagtggggtc atgaatcctt aaaattcttg tgcagctttg agaaacattt 9601 ctgttatttg ggtatcagtt tgtaagtgtg gtaaagccaa gatggaaacg agcactttgc 9661 tttcttggtt gttgttactg gtctaacctc ctgcttgaac tagtctgctg tcctgtcaaa 9721 tgcatctttt tatttacatg tcccttaaat taaagctgat catgaaagta tttgtgtgca 9781 acatgaaatg ttttctaaag taccagtgca tattcctgct tggtctcaag acgcacgagg 9841 aatccaaaat agaagcaaaa aaaaaaaaaa aaaa (SEQ ID NO: 16) .
USP5, such as human ubiquitin specific peptidase 5 (isopeptidase T), is represented, for example, by accession numbers 206031_s_at, NM_001098536.1 and NM_003481.2 (2 alternative transcripts), or NP_001092006 (858 aa; gI148727331) and
NP_003472 (835 aa; gI148727247). The protein sequence for variant 1 is:
MAELSEEALLSVLPTIRVPKAGDRVHKDECAFSFDTPESEGGLY
ICMNTFLGFGKQYVERHFN TGQRVYLHLRRTRRPKEEDPATGTGDPPRK PTRLAIG
VEGGFDLSEE FELDEDVKIVILPDYLEIAKDGLGGLPDIVRDRVTSAVEALLSADSA
SR QEVQAWDGEVRQVSKHAFSLKQLDNPARIPPCGWKCSKCDMRENLWLNLTDGSIL
CGRRYFDGSGGN HAVEHYRETGYPLAVKLGTITPDGADVYSYDEDD VLDPSLAEHL
SHFGIDMLKMQKTDKTMTELEIDMNQRIGEWELIQESGVPLKPLFGPGYTGIR LGNS
CYLNSWQVLFSIPDFQRKYVDKLE IFQNAPTDPTQDFSTQVAKLGHGLLSGEYSKP
VPESGDGERVPEQ EVQDGIAPRMFKALIGKGHPEFSTNRQQDAQEFFLHLINMVERN
CRSSENPNEVFRFLVEE IKCLATEKV YTQRVDYIMQLPVPMDAALNKEELLEYEEK
KRQAEEEKMALPELVRAQVPFSSCLEAYGAPEQVDDFWSTALQAKSVAVKTTRFASFP
DYLVIQIKKFTFGLDWVPKKLDVSIEMPEELDISQLRGTGLQPGEEELPDIAPPLVTP DEPKGSLGFYGNEDEDSFCSPHFSSPTSP LDESVIIQLVEMGFP DACRKAVYYTGN SGAEAAMNWV SHMDDPDFA PLILPGSSGPGSTSAAADPPPEDCVTTIVSMGFSRDQ ALKALRATNWSLERAVDWIPSHIDDLDAEAAMDISEGRSAADSISESVPVGPKVRDGP
GKYQLFAFISHMGTSTMCGHYVCHIKKEGRWVIYNDQKVCASEKPPKDLGYIYFYQRVAS (SEQ ID NO: 17) and the mRNA sequence for variant 1 is:
1 aggggagggg actgggaacg gtgggagccg ccgtgtgtgg agaagctgct gccggtgtca 61 tggcggagct gagtgaggag gcgctgctgt cagtattacc gacgatccgg gtccctaagg 121 ctggagaccg ggtccacaaa gacgagtgcg ccttctcctt cgacacgccg gagtctgagg 181 ggggcctcta catctgtatg aacacgtttc tgggctttgg gaaacagtat gtggagagac 241 atttcaataa gaccggccag cgagtctact tgcacctccg gcggacccgg cgcccgaaag 301 aggaggaccc tgctacaggc actggagacc caccccggaa gaagcccacg cggctggcta 361 ttggtgttga aggcggattt gaccttagcg aggagaagtt tgaattagac gaggatgtga 421 agattgtcat tttgccagat tacctggaga ttgcccggga tggactgggg ggactgcctg 481 acattgtcag agatcgggtg accagtgcag tggaggccct actgtcggcc gactcagcct 541 cccgcaagca ggaggtgcag gcatgggatg gggaagtacg gcaggtgtct aagcatgcct 601 tcagcctcaa gcagttggac aaccctgctc gaatccctcc ctgtggctgg aagtgctcca 661 agtgtgacat gagagagaac ctgtggctca acctgactga tggctccatc ctctgtgggc 721 gacgctactt cgatggcagt gggggcaaca accacgctgt ggagcactac cgagagacag 781 gctacccgtt agctgtcaag ctgggcacca tcacccctga tggagctgac gtgtactcat 841 atgatgagga tgacatggtc ctggacccca gcctggctga gcacctgtcc cacttcggca 901 tcgacatgct gaagatgcag aagacagaca agacgatgac tgagttggag atagacatga 961 accagcggat tggtgaatgg gagctgatcc aggagtcagg tgtgccactc aagcccctgt 1021 ttgggcctgg ctacacaggc atccggaacc tgggtaacag ctgctacctc aactctgtgg 1081 tccaggtgct cttcagcatc cctgacttcc agaggaagta tgtggataag ctggagaaga 1141 tcttccagaa tgccccgacg gaccctaccc aggatttcag cacccaggtg gccaagctgg 1201 gccatggcct tctctccggg gagtattcca agccagtacc ggagtcgggc gatggggagc 1261 gggtgccaga acagaaggaa gttcaagatg gcattgcccc tcggatgttc aaggccctca 1321 tcggcaaggg ccaccctgaa ttctccacca accggcagca ggatgcccag gagttcttcc 1381 ttcaccttat caacatggtg gagaggaatt gccggagctc tgaaaatcct aatgaagtgt 1441 tccgcttctt ggtggaggaa aagatcaagt gcctggccac agagaaggtg aagtacaccc 1501 agcgagttga ctacatcatg cagctgcctg tgcccatgga tgcagccctt aacaaagagg 1561 agcttctgga gtacgaggag aagaagcggc aagccgaaga ggagaagatg gcactgccag 1621 aactggttcg ggcccaggtg cccttcagct cttgcctgga ggcctacggg gcccctgagc 1681 aggtcgatga cttctggagc 
acggccctgc aggccaagtc agtagctgtc aagaccacac 1741 gatttgcctc attccctgac tacctggtca tccagatcaa gaagttcacc ttcggcttag 1801 actgggtgcc caagaaactg gatgtgtcca tcgagatgcc agaggagctc gacatctccc 1861 agttgagggg cacagggctg cagcccggag aggaggagct gccagacatt gccccacccc 1921 tggtcactcc ggatgagccc aaaggtagcc ttggtttcta tggcaacgaa gacgaagact 1981 ccttctgctc ccctcacttc tcctctccga catcgcccat gctggatgaa tcagtcatca 2041 tccagctggt ggagatggga ttccctatgg acgcctgccg caaagctgtc tactacacgg 2101 gcaacagcgg ggctgaggcc gccatgaact gggtcatgtc acacatggat gatccagatt 2161 ttgcaaaccc cctcatcctg cctggctcta gtgggccggg ctccacaagc gcagcagccg 2221 acccccctcc tgaggactgt gtgaccacca ttgtctccat gggcttctcc cgggaccagg 2281 ccttgaaagc gctgcgggcc acgaacaata gtttagaacg ggctgtggac tggatcttca 2341 gtcacattga cgacctggat gctgaagctg ccatggacat ctcagagggc cgctcagctg 2401 ccgactccat ctctgagtct gtgccagtgg gacctaaagt ccgggatggt cctggaaagt 2461 atcagctctt tgccttcatt agtcacatgg gcacctctac catgtgtggt cactacgtct 2521 gccacatcaa gaaagaaggc agatgggtga tctacaatga ccagaaagtg tgtgcctccg 2581 agaagccgcc caaggacctg ggctacatct acttctacca gagagtggcc agctaagagc 2641 ctgcctcacc ccttaccaat gagggcaggg gaagaccacc tggcatgagg gagaggggct 2701 gagggatgga cttcagcccc tctgctctgt accctttttc cttttgtccc cggcagcagg 2761 gaagaagctg gaggccgtgg gagaatggct gggcagagca gaggggcagc gatagactct 2821 ggggatggag caggacgggg acgggagggg ccggccacct gtctgtaagg agactttgtt 2881 gcttcccctg cccccggaat ccacagtgct ctgcttctct gtgtcgcccc gcccagcccc 2941 ctggtgtgga gggaggggtc tcgtttgtgc gcgtgggtgt agctttgtgc atcctctccc 3001 agtggagcga tcacctgtgc ctcccctccc cctttgtttg cccctgtgtg gttggtcaag 3061 gagggatgtg agggaaatag ggaccccccg acttgccctc ctgcctcagt ctttccccca 3121 ccctgtctct tccttgtcct tctctggaaa atgccaaaat acacgatgtg aataaaagta 3181 caacggctaa aaaaaaaaaa aaaaaaaaaa aaaa (SEQ ID NO: 18)
The protein sequence for variant 2 is:
MAELSEEALLSVLPTIRVPKAGDRVHKDECAFSFDTPESEGGLY
ICMNTFLGFGKQYVERHFNKTGQRVYLHLRRTRRPKEEDPATGTGDPPRKKPTRLAIG
VEGGFDLSEEKFELDEDVKIVILPDYLEIARDGLGGLPDIVRDRVTSAVEALLSADSA
SRKQEVQAWDGEVRQVS HAFSL QLDNPARIPPCGWKCSKCDMRENL LNLTDGSIL
CGRRYFDGSGGNNHAVEHYRETGYPLAVKLGTITPDGADVYSYDEDDMVLDPSLAEHL
SHFGID LKMQ TD TMTELEIDMNQRIGEWELIQESGVPLKPLFGPGYTGIRNLGNS
CYLNSWQVLFSIPDFQRKYVDKLEKIFQNAPTDPTQDFSTQVAKLGHGLLSGEYSKP
VPESGDGERVPEQ EVQDGIAPRMF ALIGKGHPEFSTNRQQDAQEFFLHLINMVERN
CRSSENPNEVFRFLVEEKIKCLATE VKYTQRVDYI QLPVPMDAALNKEELLEYEEK
KRQAEEE MALPELVRAQVPFSSCLEAYGAPEQVDDFWSTALQAKSVAVKTTRFASFP
DYLVIQI KFTFGLD VP LDVSIEMPEELDISQLRGTGLQPGEEELPDIAPPLVTP
DEPKAP LDESVIIQLVEMGFPMDACRKAVYYTGNSGAEAAMNWVMSH DDPDFANPL
ILPGSSGPGSTSAAADPPPEDCVTTIVSMGFSRDQALKALRATNNSLERAVDWIFSHI
DDLDAEAAMDISEGRSAADSISESVPVGP VRDGPG YQLFAFISHMGTSTMCGHYVC
HIKKEGRWVIYNDQKVCASEKPPKDLGYIYFYQRVAS (SEQ ID NO: 33) and the mRNA sequence for variant 2 is:
1 ggggactggg aacggtggga gccgccgtgt gtggagaagc tgctgccggt gtcatggcgg
61 agctgagtga ggaggcgctg ctgtcagtat taccgacgat ccgggtccct aaggctggag
121 accgggtcca caaagacgag tgcgccttct ccttcgacac gccggagtct gaggggggcc
181 tctacatctg tatgaacacg tttctgggct ttgggaaaca gtatgtggag agacatttca
241 ataagaccgg ccagcgagtc tacttgcacc tccggcggac ccggcgcccg aaagaggagg
301 accctgctac aggcactgga gacccacccc ggaagaagcc cacgcggctg gctattggtg
361 ttgaaggcgg atttgacctt agcgaggaga agtttgaatt agacgaggat gtgaagattg
421 tcattttgcc agattacctg gagattgccc gggatggact ggggggactg cctgacattg
481 tcagagatcg ggtgaccagt gcagtggagg ccctactgtc ggccgactca gcctcccgca 541 agcaggaggt gcaggcatgg gatggggaag tacggcaggt gtctaagcat gccttcagcc 601 tcaagcagtt ggacaaccct gctcgaatcc ctccctgtgg ctggaagtgc tccaagtgtg 661 acatgagaga gaacctgtgg ctcaacctga ctgatggctc catcctctgt gggcgacgct 721 acttcgatgg cagtgggggc aacaaccacg ctgtggagca ctaccgagag acaggctacc 781 cgttagctgt caagctgggc accatcaccc ctgatggagc tgacgtgtac tcatatgatg 841 aggatgacat ggtcctggac cccagcctgg ctgagcacct gtcccacttc ggcatcgaca 901 tgctgaagat gcagaagaca gacaagacga tgactgagtt ggagatagac atgaaccagc 961 ggattggtga atgggagctg atccaggagt caggtgtgcc actcaagccc ctgtttgggc 1021 ctggctacac aggcatccgg aacctgggta acagctgcta cctcaactct gtggtccagg 1081 tgctcttcag catccctgac ttccagagga agtatgtgga taagctggag aagatcttcc 1141 agaatgcccc gacggaccct acccaggatt tcagcaccca ggtggccaag ctgggccatg 1201 gccttctctc cggggagtat tccaagccag taccggagtc gggcgatggg gagcgggtgc 1261 cagaacagaa ggaagttcaa gatggcattg cccctcggat gttcaaggcc ctcatcggca 1321 agggccaccc tgaattctcc accaaccggc agcaggatgc ccaggagttc ttccttcacc 1381 ttatcaacat ggtggagagg aattgccgga gctctgaaaa tcctaatgaa gtgttccgct 1441 tcttggtgga ggaaaagatc aagtgcctgg ccacagagaa ggtgaagtac acccagcgag 1501 ttgactacat catgcagctg cctgtgccca tggatgcagc ccttaacaaa gaggagcttc 1561 tggagtacga ggagaagaag cggcaagccg aagaggagaa gatggcactg ccagaactgg 1621 ttcgggccca ggtgcccttc agctcttgcc tggaggccta cggggcccct gagcaggtcg 1681 atgacttctg gagcacggcc ctgcaggcca agtcagtagc tgtcaagacc acacgatttg 1741 cctcattccc tgactacctg gtcatccaga tcaagaagtt caccttcggc ttagactggg 1801 tgcccaagaa actggatgtg tccatcgaga tgccagagga gctcgacatc tcccagttga 1861 ggggcacagg gctgcagccc ggagaggagg agctgccaga cattgcccca cccctggtca 1921 ctccggatga gcccaaagcg cccatgctgg atgaatcagt catcatccag ctggtggaga 1981 tgggattccc tatggacgcc tgccgcaaag ctgtctacta cacgggcaac agcggggctg 2041 aggccgccat gaactgggtc atgtcacaca tggatgatcc agattttgca aaccccctca 2101 tcctgcctgg ctctagtggg ccgggctcca caagcgcagc agccgacccc cctcctgagg 2161 actgtgtgac 
caccattgtc tccatgggct tctcccggga ccaggccttg aaagcgctgc 2221 gggccacgaa caatagttta gaacgggctg tggactggat cttcagtcac attgacgacc 2281 tggatgctga agctgccatg gacatctcag agggccgctc agctgccgac tccatctctg 2341 agtctgtgcc agtgggacct aaagtccggg atggtcctgg aaagtatcag ctctttgcct 2401 tcattagtca catgggcacc tctaccatgt gtggtcacta cgtctgccac atcaagaaag 2461 aaggcagatg ggtgatctac aatgaccaga aagtgtgtgc ctccgagaag ccgcccaagg 2521 acctgggcta catctacttc taccagagag tggccagcta agagcctgcc tcacccctta 2581 ccaatgaggg caggggaaga ccacctggca tgagggagag gggctgaggg atggacttca 2641 gcccctctgc tctgtaccct ttttcctttt gtccccggca gcagggaaga agctggaggc 2701 cgtgggagaa tggctgggca gagcagaggg gcagcgatag actctgggga tggagcagga 2761 cggggacggg aggggccggc cacctgtctg taaggagact ttgttgcttc ccctgccccc 2821 ggaatccaca gtgctctgct tctctgtgtc gccccgccca gccccctggt gtggagggag 2881 gggtctcgtt tgtgcgcgtg ggtgtagctt tgtgcatcct ctcccagtgg agcgatcacc 2941 tgtgcctccc ctcccccttt gtttgcccct gtgtggttgg tcaaggaggg atgtgaggga 3001 aatagggacc ccccgacttg ccctcctgcc tcagtctttc ccccaccctg tctcttcctt 3061 gtccttctct ggaaaatgcc aaaatacacg atgtgaataa aagtacaacg gctaaaaaaa 3121 aaaaaaaaaa (SEQ ID NO: 34) . ACTB, such as human beta actin, is represented, for example, by accession numbers
213867_x_at, NM_001101.3 (gI168480149), or NP_001092 (375aa;gI4501885). The protein sequence is:
MDDDIAALVVDNGSGMCKAGFAGDDAPRAVFPSIVGRPRHQGVM
VGMGQKDSYVGDEAQSKRGILTLKYPIEHGIVTNWDDMEKIWHHTFYNELRVAPEEHP
VLLTEAPLNPKANREKMTQIMFETFNTPAMYVAIQAVLSLYASGRTTGIVMDSGDGVT
HTVPIYEGYALPHAILRLDLAGRDLTDYLMKILTERGYSFTTTAEREIVRDIKEKLCY
VALDFEQEMATAASSSSLEKSYELPDGQVITIGNERFRCPEALFQPSFLGMESCGIHE
TTFNSIMKCDVDIRKDLYANTVLSGGTTMYPGIADRMQKEITALAPSTMKIKIIAPPE
RKYSVWIGGSILASLSTFQQMWISKQEYDESGPSIVHRKCF (SEQ ID NO: 19) and the mRNA sequence is:
1 accgccgaga ccgcgtccgc cccgcgagca cagagcctcg cctttgccga tccgccgccc
61 gtccacaccc gccgccagct caccatggat gatgatatcg ccgcgctcgt cgtcgacaac
121 ggctccggca tgtgcaaggc cggcttcgcg ggcgacgatg ccccccgggc cgtcttcccc
181 tccatcgtgg ggcgccccag gcaccagggc gtgatggtgg gcatgggtca gaaggattcc
241 tatgtgggcg acgaggccca gagcaagaga ggcatcctca ccctgaagta ccccatcgag
301 cacggcatcg tcaccaactg ggacgacatg gagaaaatct ggcaccacac cttctacaat
361 gagctgcgtg tggctcccga ggagcacccc gtgctgctga ccgaggcccc cctgaacccc
421 aaggccaacc gcgagaagat gacccagatc atgtttgaga ccttcaacac cccagccatg
481 tacgttgcta tccaggctgt gctatccctg tacgcctctg gccgtaccac tggcatcgtg
541 atggactccg gtgacggggt cacccacact gtgcccatct acgaggggta tgccctcccc
601 catgccatcc tgcgtctgga cctggctggc cgggacctga ctgactacct catgaagatc
661 ctcaccgagc gcggctacag cttcaccacc acggccgagc gggaaatcgt gcgtgacatt
721 aaggagaagc tgtgctacgt cgccctggac ttcgagcaag agatggccac ggctgcttcc
781 agctcctccc tggagaagag ctacgagctg cctgacggcc aggtcatcac cattggcaat
841 gagcggttcc gctgccctga ggcactcttc cagccttcct tcctgggcat ggagtcctgt
901 ggcatccacg aaactacctt caactccatc atgaagtgtg acgtggacat ccgcaaagac
961 ctgtacgcca acacagtgct gtctggcggc accaccatgt accctggcat tgccgacagg
1021 atgcagaagg agatcactgc cctggcaccc agcacaatga agatcaagat cattgctcct
1081 cctgagcgca agtactccgt gtggatcggc ggctccatcc tggcctcgct gtccaccttc
1141 cagcagatgt ggatcagcaa gcaggagtat gacgagtccg gcccctccat cgtccaccgc
1201 aaatgcttct aggcggacta tgacttagtt gcgttacacc ctttcttgac aaaacctaac
1261 ttgcgcagaa aacaagatga gattggcatg gctttatttg ttttttttgt tttgttttgg
1321 tttttttttt ttttttggct tgactcagga tttaaaaact ggaacggtga aggtgacagc
1381 agtcggttgg agcgagcatc ccccaaagtt cacaatgtgg ccgaggactt tgattgcaca
1441 ttgttgtttt tttaatagtc attccaaata tgagatgcgt tgttacagga agtcccttgc
1501 catcctaaaa gccaccccac ttctctctaa ggagaatggc ccagtcctct cccaagtcca
1561 cacaggggag gtgatagcat tgctttcgtg taaattatgt aatgcaaaat ttttttaatc
1621 ttcgccttaa tactttttta ttttgtttta ttttgaatga tgagccttcg tgccccccct
1681 tccccctttt ttgtccccca acttgagatg tatgaaggct tttggtctcc ctgggagtgg
1741 gtggaggcag ccagggctta cctgtacact gacttgagac cagttgaata aaagtgcaca
1801 ccttaaaaat gaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aaaaaaaaaa aa (SEQ ID NO: 20)
HLCS, such as human holocarboxylase synthetase (biotin-(proprionyl-CoA-carboxylase (ATP-hydrolysing)) ligase), is represented, for example, by accession numbers 209399_at, NM_000411.5 and NM_001242785 and NM_001242784 (three alternative transcripts), or NP_000402 (726aa;gI46255045) and NP_001229713 (726aa;gI338753397) and NP_001229714 (726aa;gI338753400). The protein sequence is:
EDRLHMDNGLVPQKIVSVHLQDSTLKEVKDQVSNKQAQILEP
PEPSLEIKPEQDGMEHVGRDDPKALGEEPKQRRGSASGSEPAGDSDRGGGPVEHYHLH LSSCHECLELENSTIESV PASAENIPDLPYDYSSSLESVADETSPEREGRRVNLTGK APNILLYVGSDSQEALGRFHEVRSVLADCVDIDSYILYHLLEDSALRDPWTDNCLLLV IATRESIPEDLYQKFMAYLSQGGKVLGLSSSPTFGGFQVTSKGALHKTVQNLVFS AD QSEV LSVLSSGCRYQEGPVRLSPGRLQGHLENED DRMIVHVPFGTRGGEAVLCQVH LELPPSSNIVQTPEDFNLLKSSNFRRYEVLREILTTLGLSCDMKQVPALTPLYLLSAA EEIRDPLMQWLG HVDSEGEIKSGQLSLRFVSSYVSEVEITPSCIPWT MEAFSSEH FNLEIYRQNLQTKQLGKVILFAEVTPTTMRLLDGLMFQTPQEMGLIVIAARQTEGKGR GGN LSPVGCALSTLLISIPLRSQLGQRIPFVQHL SVAWEAVRSIPEYQDINLRV K P BIYYSDL KIGGVLVNSTLMGETFYILIGCGFNVTNSNPTICI DLITEYNKQH KAELKPLRADYLIARWTVLEKLIKEFQDKGPNSVLPLYYRYWVHSGQQVHLGSAEGP VSIVGLDDSGFLQVHQEGGEWTVHPDGNSFDMLRNLILP RR (SEQ ID NO: 21) and the mRNA sequence is:
1 acgagagggc gcagcgcggc gcggcaggga ttcgcgggcg accacccggc gcaggagcgg 61 ccgcgtttcg gcctcagaat ccattgaaga cttgaacaag tgggccctat ttcttgtgtc 121 tccttttata cttgaagcag aacacatagc atttgtgacg gagagcattt gggtacaaag 181 tgagaattta cagagatcat cctcttcaga aacagttcgg tttttgccca ctagggatga 241 tgtggtttct catgaggtta cttgctctaa aggactttat attttggaac cataagagca 301 cccttgtggc ccaggcactt tatggatgat cccttttagt gctcccagta accttccaag 361 attgtcaagt ggtcagactg ttgtttgcca ttagcttgca gacctgggga tccttatcgg 421 ctaattgctg aagcaagtgt ggacaacttc agcaagctgg gggtggcgtt catggaagat 481 agactccaca tggataatgg actggtaccc caaaagattg tgtcggtgca cttgcaggac 541 tccactctga aggaagttaa ggatcaggtc tcaaacaagc aagcccagat cctagagccg 601 aagcctgaac cttctcttga gattaagcct gagcaggacg gtatggagca tgttggcaga 661 gatgacccaa aggctcttgg tgaagaaccc aaacaaagga gaggcagtgc ctctgggagt 721 gagcctgctg gggacagtga caggggaggg ggccccgttg agcattatca cctccatctg 781 tctagttgcc acgagtgtct ggaacttgag aacagcacca ttgagtcagt caagtttgcg 841 tctgccgaga acattccaga ccttccctac gattatagca gcagtttgga gagtgttgct 901 gatgagacct cccccgaaag agaagggagg agagtcaacc tcacgggaaa ggcacccaac 961 atcctcctct atgtgggctc cgactcccag gaagccctcg gccggttcca cgaggtccgg 1021 tctgtgctgg ccgactgtgt ggacattgac agttatattc tctaccacct gctggaggac 1081 agtgctctca gagacccgtg gacggacaac tgtctgctgt tggtcattgc taccagggag 1141 tccattcccg aagacctgta ccagaagttc atggcctatc tttctcaggg agggaaggtg 1201 ttgggcctgt cttcatcctt cacctttggt ggctttcagg tgacaagcaa gggtgcactg 1261 cacaagacag tccagaactt ggttttctcc aaggctgacc agagcgaggt gaagctcagc 0
1321 gtcttgagca gtggctgcag gtaccaggaa ggccccgtcc ggctcagccc cggcaggctc 1381 cagggccacc tggagaatga ggacaaggac aggatgattg tgcatgtgcc ttttggaact 1441 cgcgggggag aagctgttct ttgccaggtg cacttagaac tacctcccag ctccaacata 1501 gtgcaaactc cagaagattt taacttgctc aagtcaagca attttagaag atacgaagtc 1561 cttagagaga ttctgacaac ccttggcctc agctgtgaca tgaaacaagt tcctgcctta 1621 actcctcttt acttgctgtc agctgcggag gaaatcaggg atcctcttat gcagtggctt 1681 gggaaacatg tggactccga gggagaaata aaatccggcc agctctctct tagatttgtt 1741 tcatcctacg tgtctgaagt agaaataacc ccatcttgta tacctgtggt gaccaacatg 1801 gaggccttct catcagaaca tttcaactta gagatctatc gccaaaatct gcagaccaag 1861 cagttgggga aagtaatttt gtttgccgaa gtgaccccca caacgatgcg tctcctggat 1921 gggctgatgt ttcagacacc gcaggaaatg ggcttaatag tgatcgcggc ccggcagacc 1981 gagggcaaag gacggggagg gaatgtgtgg ctgagccctg tgggatgtgc tctttctact 2041 ctgctcatct ccattccact gagatcccag ctgggacaga ggatcccgtt tgtccagcat 2101 ctgatgtccg tggctgtcgt ggaagcagtg aggtccattc ccgagtatca ggatatcaac 2161 ttacgagtga agtggcccaa cgatatttat tacagtgacc tcatgaagat cggcggagtt 2221 ctggttaact caacactcat gggagaaaca ttttatatac ttattggctg tggatttaat 2281 gtgactaaca gtaaccctac catctgcatc aacgacctca tcacagaata caataaacaa 2341 cacaaggcag aactgaagcc cttaagagcc gattatctca tcgccagagt cgtgactgtg 2401 ctggagaaac tgatcaaaga gtttcaggac aaagggccca acagcgtcct tcccctttat 2461 taccgatact gggtccacag tggtcagcaa gtccatctgg gcagcgcaga gggaccaaag 2521 gtgtccatcg ttggcctgga cgattctggc ttcctccagg ttcaccagga gggcggcgag 2581 gttgtgactg tgcacccgga cggcaactcc ttcgacatgc tgagaaacct catcctcccc 2641 aaacggcggt aatgccgggc gtccccgaga cgcggctgcc tgtccgtgcc catgcatctg 2701 gaaatctaat ttagagttgt aggtgaattt tcttttcctc caattcattt gttaagtctt 2761 tgttcttttt ctgtgtttct gtttgttttt aggtttgttt tgttgtcgtt ttctttggtg 2821 tttgaagagg ctctgggata gatggttaag aagtagaaaa tttagtttag ggaaagccct 2881 cccacaggtg ggaaattgct ctcccctctg tggcttggac ttacgtttat tgtcaagggg 2941 agtttttaca tggaaatgac aatgggaaaa ttcagatatt ttcttagtag tgcagacctt 3001 
tacccctagt ctatgaaaaa acaaaccaaa atatgctctt gcgcccaggc cagtggtgag 3061 ttagaggtat gctatcactg tttgtaagca tctggggagg tactgaactg taagaacatg 3121 cttggacact tagtcattgt tctgtgtttt tattaatgaa gaaaagggaa gacagacttc 3181 caagagttac tgtccacccg gtggtgtggc cccatagcga agtctaaatg cctgtagaga 3241 tagagctagc tggtgtggtt gcagtgacct tgtagaggaa atcagttcat tactttgaca 3301 tcattcagtg agctctcctt tcctaaggaa gtttaaatgt ccttagttag ggactgactt 3361 tcttaagtaa gtttaaattt actacatatt gtgaagagac aggatcaagt tcagaatcct 3421 taaatgtctg attaggcatc acttggatga ggaggtgggc gatttggctc tgacagctgg 3481 agatgaaggc acactcatac cacatacaag ggaggatttg gagcttttaa gccagtttca 3541 gatttactct gaaatgtgga gcattcctgc aagactgtgc agctcacgga atatagaaga 3601 catggcattt tactcagaag tcataagttt ttgcccccct catttacctc gtattaccaa 3661 gaaagaaaat gttatcgata ctaaacacca tcagttcaga gggaggatgt gtgtgtgtgc 3721 ccgcatatgt gtgtgcgtgc gtgtgtgcgc acatagcttt aaaagaagac attcaaaatt 3781 tgatgtgcta caagcctcat gaaagaacaa aagaaatgaa gccttttgat atgcattcgc 3841 tattcccaga tgtacgccat gccttttcca tgtccctcct atctctgttg aacttatgaa 3901 tcatactcat tacttttcag ctttttaaaa ggccaatttt tgtccagttt. 
tctctcttcc 3961 agtcccagct gaaattagtg gaaagaaagt ttgatggagc tttcagcttt gaacaaaatc 4021 ccttcattgt aaactagcac catctttatc caggtcttac ccagtcaggc taattccaga 4081 aacttgtggt ttttagtata gtctgtctac ctttagccag gcacaggaca gccctatgaa 4141 aaaataccca atatatattt tttggaaatg aaacattaaa agaacttaaa aagtaatttt 4201 tggaaatgag gcttcaatta gaattatttt tctcaaaaaa caaacaaaca aaaaacacaa 4261 aaaaaaccac tcttctccaa atgcccaagc cttctttcaa aattagttag aaacttaagt 4321 aaaatacaag tccacaccat ccccaaatta caaaatggac ttacccttga gagggcatct 4381 gcagaatatc atcagggaca aagatctcga ggctaacgat gtaggtttca tttctcagac 4441 tttgtaatat aaggcaagcc ctctctcaga gctgccatca tcactttttg aatttctttg 4501 ggggttattt aatgaaaaac atgctatgtt ttgttttaag ctgaagtcct attctggaca 4561 ctctgctttg ggaaaaaatg ttatcattta atttcctttc tgcaaattaa aactaatgaa 4621 gtgtggcctt gtcaaaggct atggagatgt tccgggcata ctgctgtgct ctgtgctttc 4681 cagcaggcgc tcctccctca cgcaggagac tcagttgtcc tgagagagat gaagcagcct 4741 tgaagcagat gctgcgtttt ccataaacct gattttgcct cacatgaacc aaagactctc 4801 aaaactccgc .ttctatagaa ttagctgaat aaaggcattt tactgatagc tgttcgtgtt 4861 agcgaaacct gtctacctgc tatagcacac tctccgattt gggccattta tgcaccccgc 4921 aacctgggat ctcaaggagc tttaaagtct taatgggaac ttggcatttt cctgatgatc 4981 tttaaaatgt ggtcactaaa ctcaggattg gcgtgtgctt ttagaacact ggagtagccc 5041 ttgttttaga ggctgtgcat tgagtatcga ccgtattttg taaaaggcaa gatatcctcc 5101 cttccaggct ggtaacgggt ttcaagggga ctcttgagga agtgccccct aaaatagaac 5161 acagcaataa ctgggcttcc tgtccccacc cccaccccag cagtgctctc tggcactggg 5221 aactctgcta gggagtggtg gaagtaggaa ggatttgtgt gcaaaggaaa atcgtggttg 5281 agtttcactg cagcaggctg acgttgcctg atgtgagagc aagtggccga ctggggtgcg 5341 ggtgcacagg tcgggggagc acaggccaca gagcgcagcc tctgggggtc ccccaaggca 5401 cagcatatac agcatggtcg ccccttgccc tggagtctgg gaacaaagag aggagccagc 5461 ctccccgcac tgcttcagat ggaaaaggga ggcagggtgg gcttccgttc tccagatctg 5521 tttgctctta acaggcagaa catgggagaa tccttattcc tggttaatca ctatgcatat 5581 ttgaaataaa agaaagcgta agcctctgca attttaactt ctcaaaggat gtctctgaaa 
5641 agaatcactt taaaccaatg cctataaaaa gcaagtctac caaaataaac taagactttc 5701 tatgtggttt gggctccctc ttatttttac aagtttcatt tttaaaagta ggcaactact 5761 ttgggttaca gtatttttat tcatatttaa acatttttac aaaataaata aagtgtttta 5821 catagtaagg aatatgtacg tatttccaag tattaagaag ccaagtgttt ttttttttga 5881 cgtattattg acaaatgtat tcagcgccat acacaagaga aatattatta ctccaaagaa 5941 cgaaagttaa caaaactcca aagcaaaaac ccttttaatg gaggtgagaa ataaatttaa 6001 tgtaacaaca (SEQ ID NO: 22)
NDUFB1, such as human NADH dehydrogenase (ubiquinone) 1 beta subcomplex, 1, 7kDa, is represented, for example, by accession numbers 206790_s_at, NM_004545.3, or NP 38569473 (105aa;gI38569473). The protein sequence is:
MICWRHPSAPCGRGEWQVPRSQLPLARVEFPVALGLGVAVGAEA
AAIMVNLLQIVRDHWVHVLVPMGFVIGCYLDRKSDERLTAFRNKSMLFKRELQPSEEVTWK (SEQ ID NO: 23) and the mRNA sequence is:
1 agaggagtca accctgagga ggaaaaagta gtatgatttg ctggcgtcac ccctctgctc
61 cgtgcgggcg cggcgaatgg caggtcccga ggtcgcagct tccactggcg cgggttgagt 121 tccctgttgc ccttggtctc ggggtcgctg taggcgctga ggctgcagct atcatggtga 181 acttacttca gattgtgcgg gaccactggg ttcatgttct tgtccctatg ggatttgtca 241 ttggatgtta tttagacaga aagagtgatg aacggctaac tgccttccgg aacaagagta 301 tgttatttaa aagggaattg caacccagtg aagaagttac ctggaagtaa agactggcta 361 gattatcgaa tgttcacatt ttaaagttct gagagaaata aaaacatgaa gaatctgaaa 421 aaaaaaaaaa aaa (SEQ ID NO: 24)
CDC42, such as human cell division cycle 42 (GTP binding protein, 25kDa), is represented, for example, by accession numbers 208728_s_at, NM_001039802.1,
NM_001791.3 and NM_044472.2 (three alternative transcripts), or
NP_00103489(191aa;89903012) and NP_001782(191aa;gI4757952) and
NP_426359(191aa;gI16357472). The protein sequence for variant 1 is:
MQTIKCVVVGDGAVGKTCLLISYTTNKFPSEYVPTVFDNYAVTV
MIGGEPYTLGLFDTAGQEDYDRLRPLSYPQTDVFLVCFSVVSPSSFENVKEKWVPEIT
HHCPKTPFLLVGTQIDLRDDPSTIEKLAKNKQKPITPETAEKLARDLKAVKYVECSAL
TQKGLKNVFDEAILAALEPPEPKKSRRCVLL (SEQ ID NO: 25) and the mRNA sequence for variant 1 is:
1 acttccgcgg gcacccaact gtgcgtctcc tgcgcgctga cgtcaggtgc gtgcccctgt 61 ccggcagccg aggagacccc gcgcagtgct gccaacgccc cggtggagaa gctgaggtca 121 tcatcagatt tgaaatattt aaagtggata caaaactatt tcagcaatgc agacaattaa 181 gtgtgttgtt gtgggcgatg gtgctgttgg taaaacatgt ctcctgatat cctacacaac 241 aaacaaattt ccatcggaat atgtaccgac tgtttttgac aactatgcag tcacagttat 301 gattggtgga gaaccatata ctcttggact ttttgatact gcagggcaag aggattatga 361 cagattacga ccgctgagtt atccacaaac agatgtattt ctagtctgtt tttcagtggt 421 ctctccatct tcatttgaaa acgtgaaaga aaagtgggtg cctgagataa ctcaccactg 481 tccaaagact cctttcttgc ttgttgggac tcaaattgat ctcagagatg acccctctac 541 tattgagaaa cttgccaaga acaaacagaa gcctatcact ccagagactg ctgaaaagct 601 ggcccgtgac ctgaaggctg tcaagtatgt ggagtgttct gcacttacac agaaaggcct 661 aaagaatgta tttgacgaag caatattggc tgccctggag cctccagaac cgaagaagag 721 ccgcaggtgt gtgctgctat gaacatctct ccagagccct ttctgcacag ctggtgtcgg 781 catcatacta aaagcaatgt ttaaatcaaa ctaaagatta aaaattaaaa ttcgtttttg 841 caataatgac aaatgccctg cacctaccca catgcactcg tgtgagacaa ggcccatagg 901 tatggccccc cccttccccc tcccagtact agttaatttt gagtaattgt attgtcagaa 961 aagtgattag tactattttt ttttgttgtt tcaaaaaaaa aatttttgtg tgtgtgtgtt 1021 tttttttttt ttttttttgt tgtttaaaag caaggcatgc ttgtggatga ctctgtaaca 1081 gactaattgg aattgttgaa gctgctccct ggttccactc tggagagtaa tctgggacat 1141 cttagtgttt tgttttgttt ttttccctcc tctttttttt gggggggagt gtgtgtgggg 1201 tttgtttttt agtcttgttt ttttaattca ttaaccagtg gttagccctt aaggggagga 1261 ggacggattg attccacatt ccacttccta gatctagttt agaaaacatg ttccccatct 1321 ggtgctctta ggaaggagta tagtaaatgc ctcatttaat aacatactcc tttttgaaag 1381 ttgccttttc tctccaccct tgagtagatc cagtatttga tgaaactcat gaaagtgggt 1441 ggagcccatc ttgcccctcc tcttttctag gacgcactat atgtgactgt gactttcaag 1501 gacatttgtt tgccatttgc tgattttttt gggaagttaa tttctaactt ctttcactga
1561 taaatgaaga aaagtattgc acctttgaaa tgcaccaaat gaattgagtt tgtaattaaa
1621 aaaatttttt tccctttcag tcattgtctt atatgcttag catagatttg cagctcagta
1681 gtatatggtg ttcctagaat gcagctgaag acctgttatg tagaggaaat acgaggggtg
1741 gtgctagaag acagacatct gtggaatgat tcacatcctc tcaagttagg aggatggagg
1801 cctgcttcat taagaagctg ggggtagggt gggggtgggg agaacactta acaacatggg
1861 gaccagtcag gggaatcccc ttatttctgt tttgcatatg aggaacccta gagcagccag
1921 gtgaggctct ctagtttaat aaaaatcatg gaaagactct taatgcagac tcttcttaag
1981 tgttaatagg gattttttca gcttattttg gttgcagttt ccaattttta aaaatgttga
2041 ggtaatcttt cccaccttcc caaacctaat tcttgtagat gcattagtgt tgaaccaatg
2101 ctttctcatg tctcaattct ttgtatatgc attcttttca gatgtattaa acaaacaaaa
2161 acccttcaaa aaaaaaaaaa aa (SEQ ID NO:26).
The protein sequence for variant 2 is:
MQTIKCVVVGDGAVGKTCLLISYTTNKFPSEYVPTVFDNYAVTV
MIGGEPYTLGLFDTAGQEDYDRLRPLSYPQTDVFLVCFSVVSPSSFENVKEKWVPEIT
HHCPKTPFLLVGTQIDLRDDPSTIEKLAKNKQKPITPETAEKLARDLKAVKYVECSAL
TQRGLKNVFDEAILAALEPPETQPKRKCCIF (SEQ ID NO: 35) and the mRNA sequence for variant 2 is:
1 acttccgcgg gcacccaact gtgcgtctcc tgcgcgctga cgtcaggtgc gtgcccctgt 61 ccggcagccg aggagacccc gcgcagtgct gccaacgccc cggtggagaa gctgaggtca 121 tcatcagatt tgaaatattt aaagtggata caaaactatt tcagcaatgc agacaattaa 181 gtgtgttgtt gtgggcgatg gtgctgttgg taaaacatgt ctcctgatat cctacacaac 241 aaacaaattt ccatcggaat atgtaccgac tgtttttgac aactatgcag tcacagttat 301 gattggtgga gaaccatata ctcttggact ttttgatact gcagggcaag aggattatga 361 cagattacga ccgctgagtt atccacaaac agatgtattt ctagtctgtt tttcagtggt 421 ctctccatct tcatttgaaa acgtgaaaga aaagtgggtg cctgagataa ctcaccactg 481 tccaaagact cctttcttgc ttgttgggac tcaaattgat ctcagagatg acccctctac 541 tattgagaaa cttgccaaga acaaacagaa gcctatcact ccagagactg ctgaaaagct 601 ggcccgtgac ctgaaggctg tcaagtatgt ggagtgttct gcacttacac agagaggtct 661 gaagaatgtg tttgatgagg ctatcctagc tgccctcgag cctccggaaa ctcaacccaa 721 aaggaagtgc tgtatattct aaactgtttt ctccttccct tctttgctgc tgcttcctgt 781 cccactactg tagaaagatc gtttaaaaac aaaggaataa aaccatcctg tttgaaagcc 841 tctgcgtctt tttactcacc accttagagc aacctctgta ttagtttttg atcaagaatg 901 caatatcata taaatttttt gtgatcagta gtcaagttgg acttgtttta acgttctgct 961 gcttgagttg cctgatgctc agagcttttt ggtttggatt actattgcaa aagggaactt 1021 ggtctggctt taagaatgtc ctcttggaga aaataacaag agttttaaca cttctagatc 1081 ttagttctag atggagaaag taacacaaac atcattttac tcttatgatc aattgttaat 1141 tgtaattgca tgacaaacct tatggaaaag gggtgaccta gtagagtgta atggggaagg 1201 gaggattctt ttctggtttt cctttgtgcg gtgaaacttt gtgttgctgt tgctttggct 1261 gtctgtgctg tagtggagta tttgtcagtc tggggtgggg aagatattga tgtatctgct 1321 actgctttat gagttcattt gttacattat cttttaagaa taacatccat ttaaacagtt 1381 gacttacagt ttgttaatgc tgagatgtaa agctgccacc tttatatttt cctgcttctg 1441 attttattgt gagggaaata tacaattgtg gttaccttca aattttgaaa ttaaaaatat 1501 acaaccgttt gtaaaaaaaa aaaaaaaaaa (SEQ ID NO: 36)
The protein sequence for variant 3 is:
MQTIKCVVVGDGAVGKTCLLISYTTNKFPSEYVPTVFDNYAVTV
MIGGEPYTLGLFDTAGQEDYDRLRPLSYPQTDVFLVCFSVVSPSSFENVKEKWVPEIT
HHCPKTPFLLVGTQIDLRDDPSTIEKLAKNKQKPITPETAEKLARDLKAVKYVECSAL
TQKGLKNVFDEAILAALEPPEPKKSRRCVLL (SEQ ID NO: 37) and the mRNA sequence for variant 3 is:
1 acttccgcgg gcacccaact gtgcgtctcc tgcgcgctga cgtcaggtgc gtgcccctgt 61 ccggcagccg aggagacccc gcgcagtgct gccaacgccc cggtggagaa gctgagacgg 121 agtctcactg tgttgcccag gctggagtgc agtggcgcca tcttggctca ctgcagtgcg 181 cctctgcccc ccgagttcaa gcgattctcc tgcctcaggc tcctgagtag ctgggactac 241 aggtcatcat cagatttgaa atatttaaag tggatacaaa actatttcag caatgcagac 301 aattaagtgt gttgttgtgg gcgatggtgc tgttggtaaa acatgtctcc tgatatccta 361 cacaacaaac aaatttccat cggaatatgt accgactgtt tttgacaact atgcagtcac 421 agttatgatt ggtggagaac catatactct tggacttttt gatactgcag ggcaagagga 481 ttatgacaga ttacgaccgc tgagttatcc acaaacagat gtatttctag tctgtttttc 541 agtggtctct ccatcttcat ttgaaaacgt gaaagaaaag tgggtgcctg agataactca 601 ccactgtcca aagactcctt tcttgcttgt tgggactcaa attgatctca gagatgaccc 661 ctctactatt gagaaacttg ccaagaacaa acagaagcct atcactccag agactgctga 721 aaagctggcc cgtgacctga aggctgtcaa gtatgtggag tgttctgcac ttacacagaa 781 aggcctaaag aatgtatttg acgaagcaat attggctgcc ctggagcctc cagaaccgaa 841 gaagagccgc aggtgtgtgc tgctatgaac atctctccag agccctttct gcacagctgg 901 tgtcggcatc atactaaaag caatgtttaa atcaaactaa agattaaaaa ttaaaattcg 961 tttttgcaat aatgacaaat gccctgcacc tacccacatg cactcgtgtg agacaaggcc 1021 cataggtatg gcccccccct tccccctccc agtactagtt aattttgagt aattgtattg 1081 tcagaaaagt gattagtact attttttttt gttgtttcaa aaaaaaaatt tttgtgtgtg 1141 tgtgtttttt tttttttttt ttttgttgtt taaaagcaag gcatgcttgt ggatgactct 1201 gtaacagact aattggaatt gttgaagctg ctccctggtt ccactctgga gagtaatctg 1261 ggacatctta gtgttttgtt ttgttttttt ccctcctctt ttttttgggg gggagtgtgt 1321 gtggggtttg ttttttagtc ttgttttttt aattcattaa ccagtggtta gcccttaagg 1381 ggaggaggac ggattgattc cacattccac ttcctagatc tagtttagaa aacatgttcc 1441 ccatctggtg ctcttaggaa ggagtatagt aaatgcctca tttaataaca tactcctttt 1501 tgaaagttgc cttttctctc cacccttgag tagatccagt atttgatgaa actcatgaaa 1561 gtgggtggag cccatcttgc ccctcctctt ttctaggacg cactatatgt gactgtgact 1621 ttcaaggaca tttgtttgcc atttgctgat ttttttggga agttaatttc taacttcttt 1681 cactgataaa tgaagaaaag 
tattgcacct ttgaaatgca ccaaatgaat tgagtttgta 1741 attaaaaaaa tttttttccc tttcagtcat tgtcttatat gcttagcata gatttgcagc 1801 tcagtagtat atggtgttcc tagaatgcag ctgaagacct gttatgtaga ggaaatacga 1861 ggggtggtgc tagaagacag acatctgtgg aatgattcac atcctctcaa gttaggagga 1921 tggaggcctg cttcattaag aagctggggg tagggtgggg gtggggagaa cacttaacaa 1981 catggggacc agtcagggga atccccttat ttctgttttg catatgagga accctagagc 2041 agccaggtga ggctctctag tttaataaaa atcatggaaa gactcttaat gcagactctt 2101 cttaagtgtt aatagggatt ttttcagctt attttggttg cagtttccaa tttttaaaaa 2161 tgttgaggta atctttccca ccttcccaaa cctaattctt gtagatgcat tagtgttgaa 2221 ccaatgcttt ctcatgtctc aattctttgt atatgcattc ttttcagatg tattaaacaa 2281 acaaaaaccc ttcaaaaaaa aaaaaaaa (SEQ ID NO: 38)
Multiple primers and probes can be prepared based on these known sequences and are encompassed within the disclosure. Primers can be designed in accordance with a number of criteria using primer design programs such as Primer Premier (Premier Biosoft), Oligo Primer Analysis Software, and OligoPerfect (Life Technologies), as well as other free and commercially available software. Probes can be designed using free and commercially available software, including Array Designer (Premier Biosoft) and LightCycler Probe Design Software (Roche). Primers and/or probes can be detectably labeled in accordance with standard methods. Probes can be attached to a solid surface such as a slide, a well in a multiwell plate, and/or a chip. In embodiments, primers and/or probes are designed to specifically bind to each of the nucleic acids encoding CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, EED, SSR1, USP5, NDUFB1, OSBPL8, and PKP4. In embodiments, a custom array can be prepared that contains no more than 200 probes, including at least 12 probes, one for each of the identified genes. In embodiments, the primers or probes are not designed to bind to the polyA tail.
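As an illustration only, the numbered mRNA listings above can be converted into a plain sequence string, and a 3' polyA run can be removed, before the sequence is passed to primer or probe design software. The following Python sketch is not part of the disclosed method; the function names `clean_sequence` and `trim_poly_a` and the 10-base tail threshold are assumptions chosen for the example.

```python
import re


def clean_sequence(listing: str) -> str:
    """Strip position numbers, whitespace, and any other non-base characters
    from a numbered listing, returning a contiguous lowercase sequence."""
    return re.sub(r"[^acgtun]", "", listing.lower())


def trim_poly_a(seq: str, min_tail: int = 10) -> str:
    """Remove a 3' run of 'a' when it is at least min_tail bases long.

    min_tail is a hypothetical threshold; shorter runs are left untouched
    because they may be genuine transcript sequence rather than a polyA tail.
    """
    stripped = seq.rstrip("a")
    if len(seq) - len(stripped) >= min_tail:
        return stripped
    return seq


# Toy listing in the same layout as the sequences above (not a real transcript).
listing = "1 acgtacgtac 11 gaaaaaaaaa aaaa"
target = trim_poly_a(clean_sequence(listing))
print(target)
```

Note that the trim also removes any adenines immediately preceding the tail, which is acceptable here because the goal is only to keep primers and probes away from the 3' A-rich end.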
In embodiments, the primers and/or probes specifically bind to the nucleic acid sequences under standard PCR or microarray conditions. In embodiments, those conditions include 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 2X standard saline citrate (SSC), 0.1% SDS at 50°C; preferably 7% SDS, 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 1X SSC, 0.1% SDS at 50°C; preferably 7% SDS, 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 0.5X SSC, 0.1% SDS at 50°C; and more preferably 7% SDS, 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 0.1X SSC, 0.1% SDS at 65°C.
Each of the genes identified herein as useful in determining a short term or long term survivor can have one or more known variants, and primers and probes can be designed to detect all variants and/or each variant. Variants include those nucleic acids or proteins that have a "substantially homologous nucleic acid sequence" or "substantially identical nucleic acid sequence," or "substantially homologous amino acid sequences" or "substantially identical amino acid sequences."
"Homologous," as used herein, refers to the subunit sequence similarity between two polymeric molecules, e.g., between two nucleic acid molecules, such as two DNA molecules or two RNA molecules, or between two polypeptide molecules. When a subunit position in both of the two molecules is occupied by the same monomeric subunit, e.g., if a position in each of two DNA molecules is occupied by adenine, then they are homologous at that position. The homology between two sequences is a direct function of the number of matching or homologous positions; e.g., if half of the positions in the two sequences (e.g., five positions in a polymer ten subunits in length) are homologous, then the two sequences are 50% homologous; if 90% of the positions (e.g., 9 of 10) are matched or homologous, the two sequences share 90% homology. By way of example, the DNA sequences 3'ATTGCC5' and 3'TATGGC5' share 50% homology.
As used herein, "homology" is used synonymously with "identity."
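The position-by-position definition above can be sketched in a few lines of Python. This is a minimal illustration for two ungapped sequences of equal length, not an alignment algorithm; the function name `percent_identity` is an assumption introduced for the example.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions occupied by the same subunit in both sequences.

    Assumes the sequences are already aligned, ungapped, and of equal length.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length")
    if not seq_a:
        raise ValueError("sequences must not be empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# The example from the text: positions 3, 4, and 6 match, so 3 of 6 positions agree.
print(percent_identity("ATTGCC", "TATGGC"))  # 50.0
```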
The determination of percent identity between two nucleotide or amino acid sequences can be accomplished using a mathematical algorithm. For example, a mathematical algorithm useful for comparing two sequences is the algorithm of Karlin and Altschul (1990), modified as in Karlin and Altschul (1993). This algorithm is incorporated into the NBLAST and XBLAST programs of Altschul, et al., and can be accessed, for example, at the National Center for Biotechnology Information (NCBI) world wide web site. BLAST nucleotide searches can be performed with the NBLAST program (designated "blastn" at the NCBI web site), using the following parameters: gap penalty = 5; gap extension penalty = 2; mismatch penalty = 3; match reward = 1; expectation value = 10.0; and word size = 11 to obtain nucleotide sequences homologous to a nucleic acid described herein. BLAST protein searches can be performed with the XBLAST program (designated "blastn" at the NCBI web site) or the NCBI "blastp" program, using the following parameters: expectation value = 10.0, BLOSUM62 scoring matrix to obtain amino acid sequences homologous to a protein molecule described herein. To obtain gapped alignments for comparison purposes, Gapped BLAST can be utilized as described in Altschul et al. Alternatively, PSI-Blast or PHI-Blast can be used to perform an iterated search which detects distant relationships between molecules and relationships between molecules which share a common pattern. When utilizing BLAST, Gapped BLAST, PSI-Blast, and PHI-Blast programs, the default parameters of the respective programs (e.g., XBLAST and NBLAST) can be used.
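To show how the nucleotide scoring parameters listed above combine, the following Python sketch computes a raw score for an alignment that has already been made. It is not the BLAST search heuristic itself. It assumes that a gap of length k costs the gap penalty once plus the gap extension penalty k times; scoring conventions differ between programs, so this convention and the function name `alignment_score` are assumptions for illustration.

```python
def alignment_score(aligned_a: str, aligned_b: str,
                    match: int = 1, mismatch: int = 3,
                    gap_open: int = 5, gap_extend: int = 2) -> int:
    """Raw score of a pairwise alignment in which '-' marks a gap position."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    score = 0
    gap_side = None  # which sequence, if any, carried a gap at the previous column
    for a, b in zip(aligned_a, aligned_b):
        if a == "-" and b == "-":
            raise ValueError("a column cannot be a gap in both sequences")
        side = "a" if a == "-" else "b" if b == "-" else None
        if side is None:
            score += match if a == b else -mismatch
        else:
            if side != gap_side:
                score -= gap_open      # charged once when a new gap starts
            score -= gap_extend        # charged for every gapped column
        gap_side = side
    return score


print(alignment_score("ACGT", "ACCT"))    # three matches, one mismatch
print(alignment_score("ACGTT", "A--TT"))  # three matches, one gap of length two
```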
The percent identity between two sequences can be determined using techniques similar to those described above, with or without allowing gaps. In calculating percent identity, typically exact matches are counted.
As used herein, "substantially homologous amino acid sequences" or "substantially identical amino acid sequences" include those amino acid sequences which have at least about 92%, or at least about 95% homology or identity, including at least about 96% homology or identity, including at least about 97% homology or identity, including at least about 98% homology or identity, and at least about 99% or more homology or identity to an amino acid sequence of a reference antibody chain. Amino acid sequence similarity or identity can be computed by using the BLASTP and TBLASTN programs, which employ the BLAST (basic local alignment search tool) 2.0.14 algorithm. The default settings used for these programs are suitable for identifying substantially similar amino acid sequences for purposes of the present invention.
As used herein, the term "conservative amino acid substitution" is defined as an amino acid exchange within one of the following five groups:
I. Small aliphatic, nonpolar or slightly polar residues:
Ala, Ser, Thr, Pro, Gly;
II. Polar, negatively charged residues and their amides:
Asp, Asn, Glu, Gln;
III. Polar, positively charged residues:
His, Arg, Lys;
IV. Large, aliphatic, nonpolar residues:
Met, Leu, Ile, Val, Cys;
V. Large, aromatic residues:
Phe, Tyr, Trp.
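The five groups above can be expressed as a simple lookup. The sketch below assumes the standard three-letter residue codes (for example Gln, Ile, and Trp); the names `CONSERVATIVE_GROUPS` and `is_conservative` are hypothetical and serve only for illustration.

```python
# Groups I-V as defined above, using standard three-letter residue codes.
CONSERVATIVE_GROUPS = {
    "I": {"Ala", "Ser", "Thr", "Pro", "Gly"},
    "II": {"Asp", "Asn", "Glu", "Gln"},
    "III": {"His", "Arg", "Lys"},
    "IV": {"Met", "Leu", "Ile", "Val", "Cys"},
    "V": {"Phe", "Tyr", "Trp"},
}


def is_conservative(residue_a: str, residue_b: str) -> bool:
    """True if exchanging residue_a for residue_b stays within one group.

    Identical residues are not treated as a substitution.
    """
    if residue_a == residue_b:
        return False
    return any(
        residue_a in group and residue_b in group
        for group in CONSERVATIVE_GROUPS.values()
    )
```

For instance, Leu to Ile falls within group IV and is conservative under this definition, whereas Leu to Lys crosses groups.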
"Substantially homologous nucleic acid sequence" or "substantially identical nucleic acid sequence" means a nucleic acid sequence corresponding to a reference nucleic acid sequence wherein the corresponding sequence encodes a peptide having substantially the same structure and function as the peptide encoded by the reference nucleic acid sequence; e.g., where only changes in amino acids not significantly affecting the peptide function occur. In one embodiment, the substantially identical nucleic acid sequence encodes the peptide encoded by the reference nucleic acid sequence. The percentage of identity between the substantially similar nucleic acid sequence and the reference nucleic acid sequence is at least about 50%, 65%, 75%, 85%, 92%, 95%, 99% or more. Substantial identity of nucleic acid sequences can be determined by comparing the sequence identity of two sequences, for example by physical/chemical methods (i.e., hybridization) or by sequence alignment via computer algorithm.
Suitable computer algorithms to determine substantial similarity between two nucleic acid sequences include the GCS program package. The default settings provided with these programs are suitable for determining substantial similarity of nucleic acid sequences for purposes of the present invention.
Determination of Expression Levels
In one embodiment, the expression of the nucleic acid, such as mRNA of the genes of interest is determined. Levels of mRNA can be quantitatively measured by Northern blotting. A sample of RNA is separated on an agarose gel and hybridized to a radio-labeled RNA probe that is complementary to the target sequence. The radio-labeled RNA is then detected by an autoradiograph.
Another approach for measuring mRNA abundance is an amplification reaction such as polymerase chain reaction (PCR). In embodiments, the PCR is RT-PCR. In RT-PCR, a DNA template, called cDNA, is generated from the mRNA by reverse transcription. This cDNA template is then used for qPCR, in which the fluorescence of a probe changes as the DNA amplification process progresses. With a standard curve, qPCR can produce an absolute measurement such as the number of copies of mRNA, typically in units of copies per nanolitre of homogenized tissue or copies per cell. qPCR is very sensitive (detection of a single mRNA molecule is possible).
Another approach is to individually tag single mRNA molecules with fluorescent barcodes (nanostrings), which can be detected one-by-one and counted for direct digital quantification (Krassen Dimitrov, NanoString Technologies).
Also, DNA microarrays can be used to determine the transcript levels for many genes at once (expression profiling). Recent advances in microarray technology allow for the quantification, on a single array, of transcript levels for every known gene in the genomes of several organisms, including humans; alternatively, smaller custom arrays can be utilized.
Also, "tag based" technologies like Serial analysis of gene expression (SAGE), which can provide a relative measure of the cellular concentration of different mRNAs, can be used.
In other embodiments, the level of expression can be determined using RNA sequencing technology. RNA sequencing technology involves high throughput sequencing of cDNA. mRNA is isolated and reverse transcribed to form a library of cDNA. The cDNA is fragmented to a specific size and optionally may be detectably labeled. The fragments are sequenced and the full sequence is assembled in accord with different platforms such as those provided by Illumina, 454 Sequencing, or SOLiD sequencing. In addition, mRNA can be sequenced directly (without conversion to cDNA) using protocols available from Helicos.
In one embodiment, the expression of the protein from the genes of interest is determined. For genes encoding proteins the expression level can be directly assessed by a number of means with some clear analogies to the techniques for mRNA quantification. The most commonly used method is to perform a Western blot against the protein of interest - this gives information on the size of the protein in addition to its identity. A sample (often cellular lysate) is separated on a polyacrylamide gel, transferred to a membrane and then probed with an antibody to the protein of interest. Other methods include, for example, Enzyme-linked immunosorbent assay (ELISA), lateral flow test, latex agglutination, other forms of immunochromatography, western blot, and/or magnetic immunoassay.
The use of the word "detect" and its grammatical variants refers to measurement of the species without quantification, whereas use of the word "determine" or "measure" with their grammatical variants refers to measurement of the species with quantification. The terms "detect" and "identify" are used interchangeably herein.
Reagents to detect the molecules of interest (such as an mRNA, a cDNA, a nucleic acid probe, or antibodies) can be produced by methods available to an art worker or purchased commercially.
Analytical Methods
In embodiments, a method for selecting a treatment of a subject with ovarian cancer comprises inputting the expression levels of the set of genes into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer to obtain an output score. In one embodiment, the gene expression analysis of the genes of interest is applied to the equations provided in Figure 9. In embodiments, the gene expression analysis is obtained from microarray analysis using an Affymetrix U133 chip and the data from each gene is produced using the original raw intensity data (CEL files) processed using the MAS5 normalization and background-correction algorithm (510K FDA approved).
If gene expression analysis is conducted using PCR or RNA sequencing the gene expression values can be converted to values of the Affymetrix gene expression analysis algorithm using known methods. For example, the gene expression analysis can be run in parallel using PCR or RNA sequencing and using the Affymetrix U133 chip and software. The gene expression values for each gene from PCR or RNA sequencing can be compared to the values generated using Affymetrix system and a conversion factor identified. Gene expression levels for each gene generated by PCR or RNA sequencing can be generated and converted to the output of the Affymetrix algorithm using the conversion factor before inputting gene expression levels for each gene into the functions.
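The conversion step above can be sketched as follows. The text does not state how the conversion factor is computed, so this example assumes, purely for illustration, that it is the mean per-sample ratio of the Affymetrix value to the value from the other platform for one gene, measured on samples run in parallel. The function names and the numbers in the usage note are hypothetical.

```python
from statistics import mean


def conversion_factor(affy_values, other_values):
    """Per-gene conversion factor from parallel runs of the same samples.

    Assumed definition (not from the source): the mean of the per-sample
    ratios Affymetrix value / other-platform value.
    """
    if len(affy_values) != len(other_values) or not affy_values:
        raise ValueError("need the same non-zero number of paired samples")
    return mean(a / o for a, o in zip(affy_values, other_values))


def to_affymetrix_scale(other_value, factor):
    """Convert a PCR or RNA sequencing value to the Affymetrix scale."""
    return other_value * factor
```

With hypothetical paired values `[200.0, 400.0]` (Affymetrix) and `[10.0, 20.0]` (other platform), the factor is 20.0, and a new value of 15.0 converts to 300.0 before being inputted into a function.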
The variables are defined as the expression of the below genes:
Variables: X1 = 215566_x_at = LYPLA2
X2 = 210527_x_at = TUBA3C
X3 = 200801_x_at = ACTB
X4 = 212209_at = MED13L
X5 = 212585_at = OSBPL8
X6 = 209572_s_at = EED
X7 = 201929_s_at = PKP4
X8 = 200891_s_at = SSR1
X9 = 206031_s_at = USP5
X10 = 213867_x_at = ACTB
X11 = 209399_at = HLCS
X12 = 206790_s_at = NDUFB1
X13 = 208728_s_at = CDC42
The functions are those as presented in Figure 9.
F1 = f (LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, PKP4)
F2 = f (SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, EED)
F3 = f (CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, EED).
Each function operates independently to provide an assessment of whether the subject is likely to have long term or short term survival. One or more functions can be used together to determine the likelihood that a subject will have short term or long term survival. Once the gene expression data is inputted into a function, the function provides an output score for the subject's sample.
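For bookkeeping, the thirteen variables and the gene inputs of the three functions can be held in two lookup tables, as in the sketch below. The equations themselves appear only in Figure 9 and are not reproduced here; the helper names are hypothetical. Note that ACTB is measured by two probe sets (X3 and X10), and the gene lists above do not say which ACTB variable each function uses, so the sketch does not assume one.

```python
# Variable -> (Affymetrix U133 probe set, gene symbol), as listed above.
PROBE_SETS = {
    "X1": ("215566_x_at", "LYPLA2"),
    "X2": ("210527_x_at", "TUBA3C"),
    "X3": ("200801_x_at", "ACTB"),
    "X4": ("212209_at", "MED13L"),
    "X5": ("212585_at", "OSBPL8"),
    "X6": ("209572_s_at", "EED"),
    "X7": ("201929_s_at", "PKP4"),
    "X8": ("200891_s_at", "SSR1"),
    "X9": ("206031_s_at", "USP5"),
    "X10": ("213867_x_at", "ACTB"),
    "X11": ("209399_at", "HLCS"),
    "X12": ("206790_s_at", "NDUFB1"),
    "X13": ("208728_s_at", "CDC42"),
}

# Gene inputs of each function, as listed above.
FUNCTION_GENES = {
    "F1": ["LYPLA2", "TUBA3C", "ACTB", "MED13L", "OSBPL8", "EED", "PKP4"],
    "F2": ["SSR1", "USP5", "ACTB", "HLCS", "NDUFB1", "LYPLA2", "TUBA3C",
           "MED13L", "EED"],
    "F3": ["CDC42", "LYPLA2", "TUBA3C", "ACTB", "HLCS", "MED13L", "EED"],
}


def probes_for_gene(gene):
    """All probe set IDs that measure the given gene."""
    return sorted(probe for probe, g in PROBE_SETS.values() if g == gene)


def missing_genes(function_name, measured_genes):
    """Genes a function needs that are absent from a sample's measurements."""
    return [g for g in FUNCTION_GENES[function_name] if g not in measured_genes]
```

A check such as `missing_genes("F1", measured)` can be run before scoring to confirm that a sample carries every input the chosen function requires.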
Cutoff values
In embodiments a method for selecting a treatment for a patient having ovarian cancer comprises determining whether the subject is likely to have long term survival by determining if the output score is less than a cutoff value or whether the subject is likely to have short term survival by determining if the output score is greater than or equal to the cutoff value, wherein the cutoff value is a value determined by identifying a value between the 99% confidence interval of the mean output score of a first set of samples from subjects known to have short term survival and the 99% confidence interval of the mean output score of a second set of samples from subjects known to have long term survival. The disclosure also provides methods for determining a cutoff value.
In embodiments, the method for determining the cutoff value comprises determining a mean output score for a first group of patients that are known to have short term survival and a mean output score of a second group of patients known to have long term survival of an original set of patients. The mean output score, the standard deviation, the range, and the 99% confidence interval of each group are determined. A cutoff value is determined that falls between the 99% confidence intervals of the two groups. For example, the cut-off score of the F1 model was determined to be 21.388. The upper limit of the 99% confidence interval for the long term survivors was 20.663 and the lower limit of the 99% confidence interval for the short term survivors was 22.924. The difference between the two limits is 2.261 and, in one embodiment, this value is divided in half and then added to the upper limit for the long term survivors; that constitutes the middle point between the two groups. The cutoff is set within that difference between the 99% confidence intervals of the groups and adjusted up or down from the aforementioned middle point according to the magnitude of the standard deviations of the two groups, i.e., the cutoff is moved away from the group that has the larger standard deviation and closer to the other group (the one with the smaller standard deviation).
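The arithmetic of the F1 example can be checked with a short sketch. It computes only the middle point between the two confidence limits; the size of the subsequent adjustment is not given as a formula in the text, so the code merely confirms that the reported cutoff lies between the long term limit and the middle point. The function name is hypothetical.

```python
def ci_midpoint(upper_long_term, lower_short_term):
    """Middle point between the upper confidence limit of the long term
    group and the lower confidence limit of the short term group."""
    gap = lower_short_term - upper_long_term
    return upper_long_term + gap / 2.0


UPPER_LONG_TERM = 20.663    # upper 99% CI limit, long term survivors
LOWER_SHORT_TERM = 22.924   # lower 99% CI limit, short term survivors
REPORTED_CUTOFF = 21.388    # F1 cutoff stated in the text

gap = LOWER_SHORT_TERM - UPPER_LONG_TERM                  # about 2.261
midpoint = ci_midpoint(UPPER_LONG_TERM, LOWER_SHORT_TERM) # about 21.7935

# The reported cutoff sits below the middle point, i.e. nearer the
# long term group.
cutoff_is_nearer_long_term = UPPER_LONG_TERM < REPORTED_CUTOFF < midpoint
```

The middle point is about 21.7935, so the reported cutoff of 21.388 has been moved toward the long term survivors. That direction agrees with the stated rule, given that Example 1 reports the smaller standard deviation for the long term group (2.9622 versus 3.3651).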
In another embodiment, the cutoff value is determined by a method comprising calculating an optimal point on the ROC curve based on the 34 scores of the 34 original subjects used in the discovery study [optimal point is defined as the point with the highest sensitivity and the lowest false positive rate (1 - specificity)] for a first group of short term survivors and a second group of long term survivors. That optimal point (the score of one of the 34 original subjects), which represents, according to ROC curve analysis, the best cutoff point for all of the 34 original subjects' scores, may itself be used as the cutoff point.
In embodiments, the cutoff value for the F1 function is 21.388, for the F2 function is 14.3, and for the F3 function is 14.7.
In embodiments, the method for determining the cutoff values further comprises verifying the validity of the cutoff value by obtaining output scores for a second set of patients (validation set) whose status as a long term survivor or short term survivor is hidden from the tester. The output scores are compared to the cutoff value for each function; if the output score of a patient's sample in the validation set is greater than or equal to the cutoff value, the patient is predicted to be a short term survivor, and if it is less than the cutoff value, a long term survivor. The status of the patient is then unblinded and the validity of the cutoff value is determined by determining whether the cutoff value provides a sensitivity of at least 90% and a specificity of at least 90%.
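The validation step can be sketched as below, treating short term survival as the positive class (score greater than or equal to the cutoff). The function names are hypothetical, and any scores passed in would come from the unblinded validation set.

```python
def sensitivity_specificity(scores, is_short_term, cutoff):
    """Sensitivity and specificity of a cutoff on unblinded validation data.

    A score >= cutoff predicts a short term survivor (positive class).
    """
    tp = fn = tn = fp = 0
    for score, truth in zip(scores, is_short_term):
        predicted_short_term = score >= cutoff
        if truth and predicted_short_term:
            tp += 1
        elif truth:
            fn += 1
        elif predicted_short_term:
            fp += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("validation set must contain both survival groups")
    return tp / (tp + fn), tn / (tn + fp)


def cutoff_is_valid(sensitivity, specificity, minimum=0.90):
    """Apply the 'at least 90% sensitivity and specificity' criterion."""
    return sensitivity >= minimum and specificity >= minimum
```

A cutoff that misclassifies even one subject in each group of a small validation set can fall below the 90% thresholds, so the size of the validation set matters when applying this criterion.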
Selecting a treatment
In embodiments a method comprises displaying whether the output score is less than a cutoff value indicating that the subject is a long term survivor or greater than or equal to the cutoff indicating that the subject is a short term survivor so that the health care worker can select a treatment for the subject.
In embodiments, where the output score indicates that the subject is likely to be a long term survivor, the health care worker may select one or more standard therapy options. These standard therapy options include chemotherapy, surgery, and/or radiation. Standard chemotherapeutic options include treatment with one or more of cyclophosphamide, Taxol, Platinum, Carboplatin, Cisplatin, Gemcitabine, Topotecan, Oxaliplatin, Doxorubicin, Paclitaxel, Docetaxel, and combinations thereof.
In embodiments, where the output score indicates that the subject is likely to be a short term survivor, the health care worker may select a more aggressive treatment in addition to or in place of the standard chemotherapy. Such treatment includes treatment with a cancer vaccine, angiogenesis inhibitors, tubulin binding inhibitors, taxane analogs, actin polymerization inhibitors, adoptive cell therapy, and protein ubiquitination inhibitors.
Examples of compounds that can be utilized include Avastin, Votrient, SIK2 inhibitors, Vinblastine, ixabepilone, epothilone B, imatinib, atorvastatin, sirolimus, bestatin, indomethacin, simvastatin, infliximab, microcystin, Camptothecin, Combretastatin A4 phosphate, ZD 6126, Pomalidomide, Lenalidomide, and Bortezomib. In embodiments, the chemotherapy treatment includes treatment with an inhibitor of ACTB, TUBA3C, CDC42, and combinations thereof.
In some embodiments, the methods of the invention may be employed on a set of patients to identify a responder group or a nonresponder group, for example in a clinical trial. When testing a new therapeutic agent, it is useful to know whether the therapeutic agent has different effects in the responder population versus the nonresponder population. Using the methods of the disclosure, a group of patients having ovarian cancer are identified as responders or nonresponders and are then treated with a potential therapeutic agent. Safety and efficacy of the drug are assessed in the responder and nonresponder populations.
Methods for screening therapeutic agents
Another aspect of the disclosure includes methods for screening therapeutic agents. Identification of ovarian cancer tissue samples as nonresponders and responders can be used to screen the therapeutic effectiveness of a potential therapeutic agent on both types of patient populations. In some embodiments, cell lines may be developed from ovarian cancer tissue from nonresponders and responders using standard methods in order to provide for high throughput analysis. In an embodiment, a method for screening agents for treating ovarian cancer comprises contacting an ovarian cancer sample identified as a nonresponder or responder with a potential agent for treating ovarian cancer; and determining whether the agent decreases the growth or spread of the ovarian cancer sample, or changes the gene expression profile of the first set of genes, the second set of genes, the third set of genes, or all sets of genes.
In embodiments, the method further comprises identifying an ovarian cancer sample as from a responder or nonresponder by determining the expression level of a first set of genes comprising LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4, a second set of genes comprising SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED, or a third set of genes comprising CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED, in a sample from the patient.
In embodiments, the potential therapeutic agents are those that interact with any one of the genes of a first set of genes comprising LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4, a second set of genes comprising SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED, or a third set of genes comprising CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED, or all sets of genes in a sample from the patient. Examples of such agents are listed above. Drugs or chemicals similar to those known drugs in mechanism of action may be screened using nonresponder and responder ovarian cancer cells or cell lines as a measure of their efficacy in each of the patient groups. Other drugs or agents may also be those that are selected to act on other genes that are known to interact with any of the genes in the first or second set of genes. The genes in the first, second, and/or third set of genes are targets to develop new therapeutics, which can be tested on ovarian cancer cells identified as responders or nonresponders.
High throughput assays such as multiwell plate assays or arrays with cells attached to nanobeads can be utilized to test a number of therapeutic compounds for any effects on the responder or nonresponder cell types with regard to inhibition of cell growth, cell death, or change in gene expression of one or more of the genes of the first set of genes, the second set of genes, the third set of genes, or all sets of genes. Those agents effective on both the responder and nonresponder population may be selected for further development. In other embodiments, an agent effective on either a responder or nonresponder cell type is selected, and the patient group is sorted into responders and nonresponders for further testing of the agent in the respective responder or nonresponder cell type.
Kits
Another aspect of the disclosure involves a kit. In embodiments, a kit comprises a primer or a probe or both that specifically hybridizes to each gene of a set of genes comprising LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4. In other embodiments, the kit comprises a primer or a probe or both that specifically hybridizes to each gene of a set of genes comprising SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED. In yet another embodiment, a kit comprises a primer or a probe or both that specifically hybridizes to each gene of a set of genes comprising CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED.
Multiple primers and probes can be prepared based on these known sequences and are encompassed within the disclosure. Primers can be designed in accord with a number of criteria using Primer design programs such as Premier Primer (biosoft), Oligo Primer Analysis software, and Oligo Perfect (Life Technologies) and other free and commercially available software. Probes can be designed using free and commercially available software including Array Designer (biosoft), and Light Cycler Probe design software (Roche). Primers and/or probes can be detectably labeled in accord with standard methods. Probes can be attached to a solid surface such as a slide, a well in a multiwell plate, and/or a chip. In embodiments, a primer and/or probe is designed to specifically bind to each of the nucleic acids encoding CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, EED, SSR1, USP5, NDUFB1, OSBPL8, and PKP4. In embodiments, a custom array can be prepared that contains no more than 200 probes, including at least 12 probes, one for each of the identified genes.
In embodiments, the primers and/or probes specifically bind to the nucleic acid sequences under standard PCR or microarray conditions. In embodiments, those conditions include 7% sodium dodecyl sulfate (SDS), 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 2X standard saline citrate (SSC), 0.1% SDS at 50°C; preferably 7% SDS, 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 1X SSC, 0.1% SDS at 50°C; preferably 7% SDS, 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 0.5X SSC, 0.1% SDS at 50°C; and more preferably 7% SDS, 0.5 M NaPO4, 1 mM EDTA at 50°C with washing in 0.1X SSC, 0.1% SDS at 65°C. In other embodiments, a hybridization buffer includes 25% formamide, 2.5x SSC, 0.5% SDS and 1x Denhardt's, and the primers and probes are incubated at 42°C for 1 hour followed by two washes of 0.5 SSC and 0.5% SDS.
In embodiments, the kit contains no more than 200 primers or probes or both, no more than 175 primers, probes or both, no more than 150 primers, probes or both, no more than 125 primers, probes or both, no more than 100 primers, probes or both, no more than 75 primers, probes or both, no more than 50 primers, probes or both, no more than 25 primers, probes or both, or no more than 15 primers, probes or both.
In embodiments, the kit can comprise or consist essentially of other reagents for detecting the gene expression level of the identified genes. In embodiments, the kit may also contain primers or probes for detecting one or more housekeeping genes as a positive control. In embodiments, the kit does not contain probes for any other genes that are predictive of short term or long term survivorship of ovarian cancer other than the genes identified herein.
In embodiments, the kit further comprises instructions for inputting the gene expression values into function 1, function 2, function 3, or combinations thereof to obtain an output score. The instructions further provide for comparing the output score for each function to a cutoff value and determining that the subject is likely to have long term survival if the output score is less than the cutoff value or that the subject is likely to have short term survival if the output score is greater than or equal to the cutoff value for each function.
In embodiments, a kit further comprises a computer readable storage medium having computer-executable instructions that, when executed by a computing device, cause the computing device to perform a step comprising: calculating an output score by inputting gene expression levels of a set of genes into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer.
In embodiments, the computer readable storage medium has computer-executable instructions that, when executed by a computing device, cause the computing device to perform a step comprising: comparing the output score to a cutoff value and displaying that the subject is likely to have long term survival if the output score is less than the cutoff value or that the subject is likely to have short term survival if the output score is greater than or equal to the cutoff value for each function.
In embodiments the set of genes comprises at least the genes LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4. In other embodiments the set of genes comprises at least the genes SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED. In yet other embodiments a set of genes comprises CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED. In embodiments, the function is selected from the group consisting of function 1, function 2, and function 3.
Computer/Processor
The detection, prognosis and/or diagnosis method can employ the use of a processor/computer system. For example, a general purpose computer system comprising a processor coupled to program memory storing computer program code to implement the method, to working memory, and to interfaces such as a conventional computer screen, keyboard, mouse, and printer, as well as other interfaces, such as a network interface, and software interfaces including a database interface, finds use in one embodiment described herein.
The computer system accepts user input from a data input device, such as a keyboard, input data file, or network interface, or another system, such as the system interpreting, for example, the microarray or PCR data, and provides an output to an output device such as a printer, display, network interface, or data storage device. Input device, for example a network interface, receives an input comprising detection of the proteins/nucleic acids described herein and/or quantification of those compounds. The output device provides an output such as a display, including one or more numbers and/or a graph depicting the detection and/or quantification of the compounds.
Computer system is coupled to a data store which stores data generated by the methods described herein. This data is stored for each measurement and/or each subject; optionally a plurality of sets of each of these data types is stored corresponding to each subject. One or more computers/processors may be used, for example, as a separate machine, for example, coupled to computer system over a network, or may comprise a separate or integrated program running on computer system. Whichever method is employed these systems receive data and provide data regarding detection/diagnosis in return.
In embodiments, a method for selecting a treatment for a subject that has ovarian cancer comprises calculating an output score, using a computing device, by inputting gene expression levels of a set of genes into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer; and displaying the output score, using a computing device. In embodiments, the method further comprises determining whether the output score is greater than or equal to or less than a cutoff value, using a computing device; and displaying whether the subject is likely to be a short term survivor if the output score is greater than or equal to the cutoff value or long term survivor if the output score is less than the cutoff value.
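The compare-and-display step can be sketched as follows. The equations of F1 to F3 are given only in Figure 9, so the sketch takes an already computed output score as input rather than reimplementing them. The cutoff values are those stated above for the embodiments; the names `CUTOFFS` and `classify` are hypothetical.

```python
# Cutoff values stated for the F1, F2, and F3 functions in embodiments.
CUTOFFS = {"F1": 21.388, "F2": 14.3, "F3": 14.7}


def classify(function_name: str, output_score: float) -> str:
    """Return the predicted survival class for one function's output score.

    A score greater than or equal to the cutoff indicates a likely short
    term survivor; a score below it indicates a likely long term survivor.
    """
    cutoff = CUTOFFS[function_name]
    if output_score >= cutoff:
        return "short term survivor"
    return "long term survivor"


def report(function_name: str, output_score: float) -> str:
    """Text a computing device could display for the health care worker."""
    return (f"{function_name} score {output_score:.3f} "
            f"(cutoff {CUTOFFS[function_name]}): "
            f"{classify(function_name, output_score)}")
```

A score exactly equal to the cutoff is classified as a short term survivor, matching the "greater than or equal to" wording used throughout.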
In embodiments, a computing device comprises a processing unit; and a system memory connected to the processing unit, the system memory including instructions that, when executed by the processing unit, cause the processing unit to: calculate an output score by inputting gene expression levels of a set of genes into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer; and display the output score. In embodiments, the system memory includes instructions that, when executed by the processing unit, cause the processing unit to determine whether the output score is greater than or equal to or less than a cutoff value; and display whether the subject is likely to be a short term survivor if the output score is greater than or equal to the cutoff value or a long term survivor if the output score is less than the cutoff value.
In embodiments the set of genes comprises at least the genes LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4. In other embodiments the set of genes comprises at least the genes SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED. In yet other embodiments a set of genes comprises CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED. In embodiments, the function is selected from the group consisting of function 1, function 2, and function 3.
Examples
The following examples are provided in order to demonstrate and further illustrate certain embodiments and aspects of the present invention and are not to be construed as limiting the scope thereof.
Example 1 - Discovery of Ovarian Cancer Biomarkers
The raw intensity microarray data (CEL files) by Berchuck et al. posted at the Duke Institute for Genome Sciences & Policy (Clinical Cancer Research 11, 3686-3696 (2005); data.genome.duke.edu/clinicalcancerresearch.php) were used for this study. Briefly, according to Berchuck et al., tumor tissue was harvested from 54 EOC patients with stages III and IV during surgery (48 with stage III and 6 with stage IV) and prior to the commencement of platinum/taxol chemotherapy. Total RNA was extracted from each tumor tissue sample and was analyzed for global gene expression using the GeneChip array U133A by Affymetrix. Following platinum/taxol chemotherapy, all 54 patients were followed for a period greater than seven years. Thirty patients survived for a period less than 3 years [NR/STS (non-responders/short-term survivors); median survival = 17.5 months], and 24 patients survived for a period greater than 7 years [R/LTS (responders/long-term survivors); median survival = 107.5 months]. None of the 30 NR/STS subjects died of causes other than EOC. The aforementioned original raw intensity data (CEL files) were processed using the MAS5 normalization and background-correction algorithm (510K FDA approved). The Affymetrix U133A chip, which has 22,283 probe sets that can interrogate an equal number of transcripts, was utilized for this study. Microarray data from the aforementioned Affymetrix U133A chip were obtained from 14 long-term ovarian cancer survivors who lived more than seven years and 20 short-term survivors who lived less than three years after initial diagnosis.
We performed ROC curve analysis on the entire data matrix, i.e., on all variables (22,283 transcripts × 54 subjects), in order to assess the discriminating capability of all variables with respect to our two groups, namely, R/LTS and NR/STS. In the final round, we selected only those variables with an AUC > 0.80. Eighty-four variables (transcripts) fulfilled this criterion, and they constituted the final pool of the most significant variables. From the aforementioned 84 most significant variables, 13 became the input variables to the three complex mathematical functions (F1, F2, and F3), which we were able to generate, and which we term, and henceforward refer to as, super variables. Those three super variables (complex mathematical functions) are the final prognostic biomarker models. We should point out here that several other super variables were generated employing the remainder of the aforementioned 84 most significant variables, but, following final assessment, they proved to be not as robust as F1, F2, or F3, and they are consequently not presented here.
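The per-variable AUC screen can be sketched as below. This is an illustration and not the platform technology itself: it computes the AUC by the pairwise (Mann-Whitney) definition in a single pass, and it assumes that a variable qualifies when it separates the groups in either direction, i.e. when max(AUC, 1 - AUC) exceeds the threshold. That direction handling and the function names are assumptions.

```python
def roc_auc(target_values, reference_values):
    """AUC as the fraction of (target, reference) pairs in which the target
    value is higher; ties count as one half."""
    if not target_values or not reference_values:
        raise ValueError("both groups must be non-empty")
    wins = 0.0
    for t in target_values:
        for r in reference_values:
            if t > r:
                wins += 1.0
            elif t == r:
                wins += 0.5
    return wins / (len(target_values) * len(reference_values))


def select_variables(data, threshold=0.80):
    """Keep variables whose AUC, in either direction, exceeds the threshold.

    data maps a variable name (e.g. a probe set ID) to a pair
    (target_values, reference_values).
    """
    selected = []
    for name, (target, reference) in data.items():
        auc = roc_auc(target, reference)
        if max(auc, 1.0 - auc) > threshold:  # assumed direction handling
            selected.append(name)
    return selected
```

Applied to the full data matrix, `data` would hold one entry per transcript, with the expression values split by group.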
The platform technology, as developed by Dr. Jason B. Nikas, and as presented in part in Nikas et al. 2010 (2), in Nikas and Low 2011(a) (3), and in Nikas and Low 2011(b) (4), identified three biomarkers (complex mathematical functions of original mRNA variables, see Figure 9 and discussion above) that allowed one to distinguish between long-term and short-term survivors or between responders and non-responders, respectively. The three biomarkers (panels of markers) are as follows:
F1 = f (LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, PKP4)
F2 = f (SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, EED)
F3 = f (CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, EED).
The cut-off score of the F1 prognostic biomarker model, as well as those of the other two models, was determined by taking into account the results of the following two analyses: 1) calculation of the optimal point on the ROC curve based on the 34 scores of the 34 original subjects used in the discovery study [optimal point is defined as the point with the highest sensitivity and the lowest false positive rate (1 - specificity)] and 2) calculation of the 99.99% confidence intervals for the mean F1 scores of the two groups (R/LTS and NR/STS) and their respective standard deviations. Based on that, the cut-off score of the F1 model was determined to be 21.388. If a subject has an F1 score less than 21.388, then that subject is classified as an R/LTS; otherwise, that subject is classified as an NR/STS. As can be seen from Figure 1, the F1 model correctly identified all (14/14) R/LTS subjects and 19/20 NR/STS subjects. In terms of treatment response, since we would like to identify those subjects that will respond to the platinum/taxol chemotherapy, our target group is the R/LTS (responder/long term survivor) and our reference group is the NR/STS (non-responder/short term survivor).
The ROC AUC of the F1 is 0.98929 with a 95% CI = [0.90449, 0.99884]. The mean F1 score of the 14 R/LTS subjects was 17.9358 (top of clear bar) and the standard deviation (whisker above or below the top of the clear bar) was 2.9622; whereas the mean F1 score of the 20 NR/STS subjects was 25.4697 (top of dark bar) and the standard deviation (whisker above or below the top of the dark bar) was 3.3651. The significance level was set at α = 0.001 (two-tailed), and the probability of significance was P = 1.30 x 10^-7 (independent t-test with T-value = -6.7405). The F1 is parametrically distributed with respect to both groups.
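The reported T-value can be checked from the summary statistics alone. The sketch below assumes an equal-variance (pooled) Student's t statistic, which the text does not state explicitly; under that assumption the result is approximately -6.74, in agreement with the reported value. The function name is hypothetical.

```python
import math


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Independent two-sample t statistic with pooled variance
    (equal-variance assumption)."""
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / (n1 + n2 - 2)
    standard_error = math.sqrt(pooled_var * (1.0 / n1 + 1.0 / n2))
    return (mean1 - mean2) / standard_error


# F1 summary statistics reported above: R/LTS (n = 14) vs NR/STS (n = 20).
t_value = pooled_t(17.9358, 2.9622, 14, 25.4697, 3.3651, 20)
degrees_of_freedom = 14 + 20 - 2
```

The negative sign reflects the order of the groups: the R/LTS mean is subtracted from first, and it is the lower of the two.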
Therefore, for the discovery study, insofar as response to treatment is concerned, the F1 model exhibited a sensitivity = 1.000 and a specificity = 19/20 = 0.950. See Table 1B. In the case of survival, given that we are interested in identifying the subjects that will be short-term survivors, our target group is the NR/STS and our reference group is the R/LTS. With regard to survival, therefore, for the discovery study, the F1 model exhibited a sensitivity = 0.950 and a specificity = 1.000. Both Figure 1 and Tables 1A and 1B show all pertinent statistical results of the F1 prognostic biomarker model in connection with the discovery study in great detail.
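The way sensitivity and specificity swap when the target group changes can be made concrete with the F1 counts stated above (14/14 R/LTS and 19/20 NR/STS subjects correctly identified). The following minimal sketch uses a hypothetical helper, sensitivity_specificity, which is not part of the original disclosure.

```python
def sensitivity_specificity(target_correct, target_total,
                            reference_correct, reference_total):
    """Sensitivity is the fraction of target subjects identified correctly;
    specificity is the fraction of reference subjects identified correctly."""
    return target_correct / target_total, reference_correct / reference_total


# F1 model, discovery study: 14/14 R/LTS and 19/20 NR/STS correct.
# Treatment response: target = R/LTS, reference = NR/STS.
response = sensitivity_specificity(14, 14, 19, 20)

# Survival: target = NR/STS, reference = R/LTS (the roles are swapped).
survival = sensitivity_specificity(19, 20, 14, 14)

print(response)  # (1.0, 0.95)
print(survival)  # (0.95, 1.0)
```

The classification itself is unchanged between the two views; only the labeling of which group counts as the target differs.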
The cut-off score of the F2 prognostic biomarker model was determined to be 14.259. If a subject has an F2 score less than 14.259, then that subject is classified as an R/LTS; otherwise, that subject is classified as an NR/STS. As can be seen from Figure 2, the F2 model correctly identified 13/14 R/LTS subjects and all (20/20) NR/STS subjects. In connection with treatment response, therefore, and with regard to the discovery study, the F2 model exhibited a sensitivity = 13/14 = 0.929 and a specificity = 1.000; whereas in connection with survival, its sensitivity and specificity were 1.000 and 0.929, respectively. Figure 2 and Tables 1A and 1B show all pertinent statistical results of the F2 prognostic biomarker model in connection with the discovery study in great detail.
Regarding the F3 prognostic biomarker model, the cut-off score was determined to be 14.694, signifying that a score less than 14.694 belongs to an R/LTS subject, whereas a score greater than 14.694 belongs to an NR/STS subject. As can be seen from Figure 2, the F3 model correctly identified all (14/14) R/LTS subjects and 19/20 NR/STS subjects. For the discovery study, therefore, with regard to treatment response, the sensitivity and specificity of the F3 model were 1.000 and 0.950, respectively; with regard to survival, its sensitivity and specificity were 0.950 and 1.000, respectively. Figure 2 and Tables 1A and 1B show all pertinent statistical results of the F3 prognostic biomarker model in connection with the discovery study in great detail.
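The decision rule shared by the three models can be sketched as a simple threshold comparison. The cut-off values are those stated above; the names CUTOFFS and classify are hypothetical. The treatment of a score exactly equal to the cut-off is an assumption for F3, where the text speaks only of scores less than or greater than 14.694; the sketch assigns such a score to NR/STS, in line with the "otherwise" wording used for F1 and F2.

```python
# Cut-off scores stated in the text for the three models.
CUTOFFS = {"F1": 21.388, "F2": 14.259, "F3": 14.694}


def classify(model, score):
    """Return 'R/LTS' if the score is below the model's cut-off, else 'NR/STS'.

    Assumption: a score exactly equal to the cut-off is treated as NR/STS.
    """
    if model not in CUTOFFS:
        raise KeyError(f"unknown model: {model}")
    return "R/LTS" if score < CUTOFFS[model] else "NR/STS"


# Example inputs: the two F1 group means reported in the text.
print(classify("F1", 17.9358))  # R/LTS
print(classify("F1", 25.4697))  # NR/STS
```

Computing the score itself requires the model function, which is not reproduced in this sketch.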
The ROC AUC of the F2 is 0.98929 with a 95% CI = [0.90321, 0.99886], whereas the ROC AUC of the F3 is 0.98214 with a 95% CI = [0.86165, 0.99782]. The mean F2 score of the 14 R/LTS subjects was 13.4223 (top of clear bar) and the standard deviation (whisker above or below the top of the clear bar) was 0.8905; whereas the mean F2 score of the 20 NR/STS subjects was 15.1843 (top of dark bar) and the standard deviation (whisker above or below the top of the dark bar) was 0.6407. The mean F3 score of the 14 R/LTS subjects was 13.8864 and the standard deviation was 0.7017; whereas the mean F3 score of the 20 NR/STS subjects was 15.3433 and the standard deviation was 0.6082. The significance level was set at α = 0.001 (two-tailed), and the probability of significance for the F2 was P = 1.37 × 10⁻⁷ (independent t-test with T-value = -6.7217), whereas the probability of significance for the F3 was P = 2.93 × 10⁻⁷ (independent t-test with T-value = -6.4541). Both the F2 and the F3 are parametrically distributed with respect to both groups.
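The reported T-values can be checked against the group means and standard deviations given in the text. The sketch below assumes an equal-variance (pooled) independent two-sample t-test; the text does not state which variant was used, but this assumption reproduces the reported values to within about 0.001. The names pooled_t and SUMMARY are hypothetical, and only the statistic is computed, not the P-value.

```python
from math import sqrt


def pooled_t(mean1, sd1, n1, mean2, sd2, n2):
    """Equal-variance two-sample t statistic and its degrees of freedom."""
    df = n1 + n2 - 2
    pooled_var = ((n1 - 1) * sd1 ** 2 + (n2 - 1) * sd2 ** 2) / df
    std_err = sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (mean1 - mean2) / std_err, df


# (R/LTS mean, R/LTS SD, NR/STS mean, NR/STS SD) as reported in the text.
SUMMARY = {
    "F1": (17.9358, 2.9622, 25.4697, 3.3651),
    "F2": (13.4223, 0.8905, 15.1843, 0.6407),
    "F3": (13.8864, 0.7017, 15.3433, 0.6082),
}

t_values = {}
for model, (m_lts, sd_lts, m_sts, sd_sts) in SUMMARY.items():
    t_stat, df = pooled_t(m_lts, sd_lts, 14, m_sts, sd_sts, 20)
    t_values[model] = t_stat
    print(model, round(t_stat, 3), df)
```

Small differences in the last decimal place are expected, because the inputs are rounded summary statistics rather than the individual scores.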
Thirty-four subjects were used (14 long-term survivors (LTS) and 20 short-term survivors (STS)). The prognostic accuracy of the three biomarkers is shown in Table 1A. Both the F1 and F3 prognostic models classified correctly all 14 LTS subjects (Reference Group) and misclassified one STS subject (Target Group) (Sensitivity = 0.950 and Specificity = 1.000). The F2 prognostic model classified correctly all 20 STS subjects and misclassified one LTS subject (Sensitivity = 1.000 and Specificity = 0.929).
Table 1A. Statistical results of all three prognostic models and predicted group mean values for future LTS and STS subjects.
[Table 1A: reproduced as image imgf000070_0001 in the original publication]
Table 1B shows the response to treatment results (here, the LTS subjects are the target group and the STS subjects are the reference group).
Table 1B. Statistical results of all three prognostic models for future Responders (LTS) and Non-Responders (STS).
[Table 1B: reproduced as image imgf000070_0002 in the original publication]
Differences in the values of the gene expression biomarkers between long-term and short-term survivors are shown in Figure 1 for biomarker F1 and in Figure 2 for the F2 and F3 biomarkers. The LTS and STS box plots of the F1 biomarker are shown in Figure 3, whereas those of the F2 and F3 biomarkers are shown in Figure 4.
The 3-dimensional plot of biomarkers F1 vs. F2 vs. F3 is shown in Figure 5. This plot shows distinct and separated clustering for long-term vs. short-term survivors of ovarian cancer patients with the exception of one subject.
Example 2 - Qualification of Ovarian Cancer Biomarkers
The aforementioned diagnostic biomarkers were validated with 20 new, unknown subjects (10 long-term survivors and 10 short-term survivors). The F1, F2, and F3 prognostic models correctly classified all 20 new, unknown subjects (Sensitivity = 1.000 and Specificity = 1.000). The validation (qualification) results are shown in Table 2A (here, the LTS subjects are the reference group and the STS subjects are the target group).
Table 2A. Statistical results of all three prognostic models with respect to the 20 new, unknown subjects, along with the observed group mean values of those unknown subjects. The observed group mean values of the 20 new, unknown subjects fall within the respective confidence intervals as predicted by all three models (see Table 1A).
[Table 2A: reproduced as image imgf000071_0001 in the original publication]
Table 2B shows the response to treatment results (here, the LTS subjects are the target group and the STS subjects are the reference group).
Table 2B. Statistical results of all three prognostic models for the 20 new, unknown subjects as Responders (LTS) and Non-Responders (STS).
[Table 2B: reproduced as image imgf000071_0002 in the original publication]
Combined scatter plots and bar graphs with all individual subjects (both LTS (responders) and STS (non-responders)) of the three prognostic biomarkers (F1, F2, and F3) are shown in Figure 6 (F1) and Figure 7 (F2 and F3). As can be seen, all three prognostic biomarkers were able to correctly prognose all 20 new, unknown subjects (complete segregation of the two survival/treatment-response groups, i.e., the LTS from the STS group).
A 3-dimensional plot of F1 vs. F2 vs. F3 is shown in Figure 8. As can be seen, all three prognostic biomarkers correctly prognosed all 20 new, unknown subjects (10 LTS (responders) and 10 STS (non-responders)) (complete segregation of the long-term and short-term survival groups).
The aforementioned 12 genes can be categorized into three general groups: 1) genes that regulate the expression of cytostructural proteins, 2) genes that regulate cell proliferation, and 3) genes that regulate metabolism.
The genes ACTB, TUBA3C, and CDC42 have functions that pertain to the cytoskeleton, and as such, they compose the first group. Cancer proliferation and metastasis rely on cytostructural materials, i.e., cytoskeletal proteins, such as microfilaments (actin) and microtubules. The first two genes promote the expression of actin and microtubules, respectively. CDC42 promotes the polymerization of actin into microfilaments; reorganization of the actin cytoskeleton; and cell formation, growth, and spreading. There is also evidence that CDC42 can regulate the polarization of both the actin and the microtubule cytoskeleton. In our study, all three of the aforementioned cytostructural genes were significantly over-expressed in the NR/STS group relative to the R/LTS group.
The following genes, whose function pertains to cell proliferation in general, compose the second group: MED13L, SSR1, PKP4, EED, and USP5. The MED13L protein (also known as, among other names, THRAP2, TRAP240L, and KIAA1025) is a member of the Mediator complex, a group of about 30 transcriptional co-activators that play various regulatory roles in the induction of RNA polymerase II transcription. Compositional differences may account for different functions among the Mediator proteins; for instance, some promote transcription, whereas others act as transcriptional repressors.
A number of those Mediator proteins are novel, and, consequently, their exact function is not known, including that of MED13L. Regarding specifically the MED13L gene, it has been observed that over-expression of the TP53 gene (p53) in human colon carcinoma cell lines relative to controls suppresses the expression of MED13L (KIAA1025). That could very well explain our finding that the MED13L gene was significantly under-expressed in the NR/STS group relative to the R/LTS group by affirming the existence of a more aggressive EOC cancer in the case of the former group in comparison with the latter one.
SSR1 is an ER (endoplasmic reticulum) receptor that is part of the translocon-associated protein (TRAP) complex. In our study, the SSR1 gene was significantly under-expressed in the NR/STS group relative to the R/LTS group.
The PKP4 protein (also known as p0071) belongs to the family of arm-repeat proteins, which are involved in cell adhesion. According to the results of our analysis, the PKP4 gene was significantly under-expressed in the NR/STS group relative to the R/LTS group, and that accords with the observation that metastatic cancer cells rely on greater cell mobility and, thus, lower cell adhesion.
The EED protein is part of the Polycomb-group (PcG) proteins involved in repressive transcriptional control mediated via histone deacetylation. We found that the EED gene was significantly under-expressed in the NR/STS group relative to the R/LTS group, indicating that in the case of more aggressive EOC, inhibitory control of gene activity was more diminished. USP5 belongs to the largest class of deubiquitinating enzymes (USPs) that regulate protein ubiquitination, a post-translational modification of cellular proteins. Compounds that inhibit the regulation of protein ubiquitination, such as bortezomib, have been approved by the FDA and are used for the treatment of certain types of cancer. Moreover, other compounds that specifically suppress the activity of USP5, such as WP1130, lead to apoptosis of tumor cells. Those findings are in agreement with our results: we found that the USP5 gene was significantly over-expressed in the NR/STS group relative to the R/LTS group. There is also evidence that malignant tumors, via the release of certain factors, may promote the expression of USPs and other deubiquitinating enzymes in order to induce major alterations in the metabolism, such as increased proteolysis and lipolysis.
The third group comprises genes whose function is involved in metabolism in general and lipid metabolism in particular. Those genes are: LYPLA2, OSBPL8, HLCS, and NDUFB1. LYPLA2 is the enzyme that catalyzes the hydrolysis of 2-lysophosphatidylcholine (which, along with arachidonic acid, is derived from the hydrolysis of phosphatidylcholine, a phospholipid that is a major component of cell membranes) to glycerophosphocholine. The protein OSBPL8 is an intracellular lipid receptor that belongs to the family of oxysterol-binding proteins (oxysterols are oxygenated cholesterol derivatives). Oxysterols activate the liver X receptors (LXR) that regulate the expression of a number of genes whose function pertains to cholesterol metabolism. We found that the OSBPL8 gene was significantly under-expressed in the NR/STS group relative to the R/LTS group, suggesting that lipogenesis is more suppressed in the more aggressive cancer cells. That is in agreement with our findings about the over-expression of the LYPLA2 gene in the same group (NR/STS), which points to a greater lipolysis in the case of the more aggressive cancer cells. HLCS is an enzyme that catalyzes the covalent biotinylation of the five crucial mammalian carboxylase enzymes: pyruvate carboxylase (PC), acetyl-CoA carboxylase 1 and 2 (ACC1 and ACC2), 3-methylcrotonyl-CoA carboxylase (MCC), and propionyl-CoA carboxylase (PCC). From an energy production perspective, the most likely targets of HLCS in connection with advanced-stage EOC are PCC and MCC. Along with 45 other subunits, the NDUFB1 dehydrogenase (ubiquinone) 1 beta subcomplex constitutes the mitochondrial Complex I, a very large multiprotein enzyme that is located in the inner mitochondrial membrane and that catalyzes the first step of the electron transport chain, the redox machinery of oxidative phosphorylation.
It has been observed by multiple studies that, owing to their surrounding hypoxic environment, tumor cells rely to a much larger extent on anaerobic glycolysis to produce energy rather than on oxidative phosphorylation. This is in complete agreement with our finding that the NDUFB1 gene was significantly under-expressed in the NR/STS group relative to the R/LTS group, for it points to a lower utilization of oxidative phosphorylation on the part of the more aggressive cancer cells.
Another category involves genes related to the mechanism of action of taxol. Taxol is an anti-tubulin chemotherapeutic agent that acts as a mitotic inhibitor. More specifically, it increases polymerization of microtubules from α-β tubulin heterodimers, and it stabilizes microtubules by preventing their depolymerization. This action prevents the formation of the mitotic spindle, a necessary step in the process of mitosis, and that results in the arrest of the mitosis cycle either in the G2 or the M phase. As was mentioned earlier, the gene TUBA3C, which encodes the production of α-tubulin, and which is common to all three super variables and one of the top four most significant predictors, was significantly over-expressed in those patients (NR/STS) who did not respond to taxol (platinum/taxol chemotherapy). Furthermore, the fact that taxol binds to the β-tubulin subunit would render the over-expression of TUBA3C on the part of the more aggressive cancer cells a successful strategy for evading the action of taxol and, thus, for survival. The CDC42 gene not only promotes the polymerization of actin into microfilaments, the reorganization of the actin cytoskeleton, and cell formation, growth, and spreading, but it can also regulate the polarization of both the actin and the microtubule cytoskeleton. Theoretically, therefore, over-expression of the CDC42 gene can overcome the action of taxol as well; and that is the finding of our analysis: the CDC42 gene was significantly over-expressed in the NR/STS group relative to the R/LTS group.
In summary, when over-expressed, the genes TUBA3C, ACTB, and CDC42 can collectively (or even perhaps in a given combination thereof) overcome the exerted actions of taxol and diminish its efficacy. It stands to reason, therefore, to expect that a new pharmacological approach that targets all three of those genes via a combination of chemotherapeutic agents will be more successful than the current standard treatment in extending the life span of those women with EOC who, because of the specific pattern of the aforementioned genetic networks, will not respond to the platinum/taxol chemotherapy and will turn out to be short-term survivors.
All publications, patents and patent applications are incorporated herein by reference. While in the foregoing specification this invention has been described in relation to certain preferred embodiments thereof, and many details have been set forth for purposes of illustration, it will be apparent to those skilled in the art that the invention is susceptible to additional embodiments and that certain of the details described herein may be varied considerably without departing from the basic principles of the invention.
Bibliography
(1) Berchuck A., Iversen E.S., Lancaster J.M., et al. (2005). Patterns of Gene Expression That Characterize Long-term Survival in Advanced Stage Serous Ovarian Cancers. Clinical Cancer Research, 11(10): 3686-3696.
(2) Nikas J.B., Keene C.D., and Low W.C. (2010). Comparison of Analytical Mathematical Approaches for Identifying Key Nuclear Magnetic Resonance Spectroscopy Biomarkers in the Diagnosis and Assessment of Clinical Change of Diseases. Journal of Comparative Neurology, 518: 4091-4112.
(3) Nikas J.B. and Low W.C. (2011). ROC-Supervised Principal Component Analysis in Connection with the Diagnosis of Diseases. American Journal of Translational Research, 3(2): 180-196.
(4) Nikas J.B. and Low W.C. (2011). Application of Clustering Analyses to the Diagnosis of Huntington Disease in Mice and Other Diseases with Well-Defined Group Boundaries. Computer Methods and Programs in Biomedicine, doi: 10.1016/j.cmpb.2011.03.004.

Claims

WHAT IS CLAIMED IS:
1. A method of selecting a treatment for a subject that has ovarian cancer comprising: a) determining whether the subject is likely to have short term or long term survival by a method comprising
i) measuring the level of gene expression of at least a set of genes comprising LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4 in a sample comprising ovarian cancer cells from the subject;
ii) inputting the expression levels of the set of genes into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer to obtain an output score;
iii) determining whether the subject is likely to have long term survival by determining if the output score is less than a cutoff value or whether the subject is likely to have short term survival by determining if the output score is greater than or equal to the cutoff value, wherein the cutoff value is a value determined by identifying a value between the 99% confidence interval of the mean output scores of a first set of samples from subjects known to have short term survival and a second set of samples from subjects known to have long term survival; and
b) displaying whether the output score is greater than or equal to the cutoff value or less than the cutoff value to a health care worker so that the health care worker can select a treatment for the subject.
2. The method of claim 1, wherein the cDNA levels are measured.
3. The method of claim 1 or 2, wherein the gene expression levels are measured by microarray analysis.
4. The method of any one of claims 1-2, wherein gene expression levels are measured by polymerase chain reaction.
5. The method of any one of claims 1-2, wherein the gene expression levels are measured by RNA sequencing.
6. The method of any one of claims 1-5, wherein the function is as follows:
[Equation: reproduced as image imgf000077_0001 in the original publication]
wherein X1 is LYPLA2, X2 is TUBA3C, X3 is ACTB, X4 is MED13L, X5 is OSBPL8, X6 is EED, and X7 is PKP4.
7. The method of claim 6, wherein the cutoff value is about 21.4.
8. The method of any one of claims 1-7, further comprising treating a subject likely to have long term survival with standard chemotherapy.
9. The method of any one of claims 1-8, further comprising treating a subject likely to have short term survival with an inhibitor of a protein selected from the group consisting of TUBA3C, ACTB, CDC42 and combinations thereof.
10. A method of selecting a treatment for a subject that has ovarian cancer comprising: a) determining whether the subject is likely to have short term or long term survival by a method comprising
i) measuring the level of gene expression of at least a set of genes comprising SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED in a sample comprising ovarian cancer cells from the subject;
ii) inputting the expression levels of the set of genes into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer to obtain an output score;
iii) determining whether the subject is likely to have long term survival by determining if the output score is less than a cutoff value or whether the subject is likely to have short term survival by determining if the output score is greater than or equal to the cutoff value, wherein the cutoff value is a value determined by identifying a value between the 99% confidence interval of a mean output score of a first set of samples from subjects known to have short term survival and a mean output score of a second set of samples from subjects known to have long term survival; and
b) displaying whether the output value of the sample is greater than or equal to the cutoff value or less than the cutoff value so that the health care worker can select a treatment for the subject.
11. The method of claim 10, wherein the cDNA levels are measured.
12. The method of claim 10 or 11, wherein the gene expression levels are measured by microarray analysis.
13. The method of any one of claims 10-11, wherein gene expression levels are measured by polymerase chain reaction.
14. The method of any one of claims 10-11, wherein the gene expression levels are measured by RNA sequencing.
15. The method of any one of claims 10-14, wherein the function is as follows:
[Equation: reproduced as image imgf000078_0001 in the original publication]
wherein X1 is LYPLA2, X2 is TUBA3C, X3 is ACTB, X4 is MED13L, X6 is EED, X8 is SSR1, X9 is USP5, X10 is ACTB, X11 is HLCS, and X12 is NDUFB1.
16. The method of claim 15, wherein the cutoff value is about 14.3.
17. The method of any one of claims 10-16, further comprising treating a subject likely to have long term survival with standard chemotherapy.
18. The method of any one of claims 10-16, further comprising treating a subject likely to have short term survival with an inhibitor of a protein selected from the group consisting of TUBA3C, ACTB, CDC42 and combinations thereof.
19. A method of selecting a treatment for a subject that has ovarian cancer comprising: a) determining whether the subject is likely to have short term or long term survival by a method comprising
i) measuring the level of gene expression of at least a set of genes comprising CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED in a sample comprising ovarian cancer cells from the subject;
ii) inputting the expression levels of the set of genes into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer to obtain an output score;
iii) determining whether the subject is likely to have long term survival by determining if the output score is less than a cutoff value or whether the subject is likely to have short term survival by determining if the output score is greater than or equal to the cutoff value, wherein the cutoff value is a value determined by identifying a value between the 99% confidence interval of a mean output score of a first set of samples from subjects known to have short term survival and a mean output score of a second set of samples from subjects known to have long term survival; and
b) displaying whether the output score is less than a cutoff value indicating that the subject is a long term survivor or whether the output score is greater than or equal to the cutoff indicating that the subject is a short term survivor so that the health care worker can select a treatment for the subject.
20. The method of claim 19, wherein the cDNA levels are measured.
21. The method of claim 19 or 20, wherein the gene expression levels are measured by microarray analysis.
22. The method of any one of claims 19-20, wherein gene expression levels are measured by polymerase chain reaction.
23. The method of any one of claims 19-20, wherein the gene expression levels are measured by RNA sequencing.
24. The method of any one of claims 19-23, wherein the function is as follows:
[Equation: not recoverable from the source text; the surviving fragments indicate an arcsinh expression in the variables defined in the following clause]
wherein X1 is LYPLA2, X2 is TUBA3C, X3 is ACTB, X4 is MED13L, X6 is EED, X10 is ACTB, X11 is HLCS, and X13 is CDC42.
25. The method of claim 24, wherein the cutoff value is about 14.7.
26. The method of any one of claims 19-25, further comprising treating a subject likely to have long term survival with standard chemotherapy.
27. The method of any one of claims 19-25, further comprising treating a subject likely to have short term survival with an inhibitor of a protein selected from the group consisting of TUBA3C, ACTB, CDC42 and combinations thereof.
28. A kit comprising: a primer or a probe or both that specifically hybridizes to each gene of a first set of genes comprising LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4, and wherein the kit contains no more than 125 primers or probes.
29. A kit comprising: a primer or a probe or both that specifically hybridizes to each gene of a first set of genes comprising SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED and wherein the kit contains no more than 125 primers or probes.
30. A kit comprising: a primer or a probe or both that specifically hybridizes to each gene of a first set of genes comprising CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED and wherein the kit contains no more than 125 primers or probes.
31. The kit of any one of claims 28-30, further comprising a computer readable storage medium having computer-executable instructions that, when executed by a computing device, cause the computing device to perform a step comprising: calculating an output score by inputting gene expression levels of a set of genes of any one of claims 28-30 into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer; and
determining whether the subject is likely to be a long term survivor by determining whether the output score is less than a cutoff value or is likely to be a short term survivor if the output score is greater than or equal to the cutoff score.
32. A method for selecting a treatment for a subject that has ovarian cancer, the method comprising:
calculating an output score, using a computing device, by inputting gene expression levels of a first set of genes comprising LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4, a second set of genes comprising SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED, or a third set of genes comprising CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED, into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer; and displaying the output score, using a computing device.
33. A method of claim 32, further comprising
determining whether the output score is greater than or equal to or less than a cutoff value, using a computing device; and displaying whether the subject is likely to be a short term survivor if the output score is greater than or equal to the cutoff value or long term survivor if the output score is less than the cutoff value.
34. A computing device, comprising:
a processing unit; and
a system memory connected to the processing unit, the system memory including instructions that, when executed by the processing unit, cause the processing unit to:
calculate an output score by inputting gene expression levels of a first set of genes comprising LYPLA2, TUBA3C, ACTB, MED13L, OSBPL8, EED, and PKP4, a second set of genes comprising SSR1, USP5, ACTB, HLCS, NDUFB1, LYPLA2, TUBA3C, MED13L, and EED, or a third set of genes comprising CDC42, LYPLA2, TUBA3C, ACTB, HLCS, MED13L, and EED, into a function that provides a predictive relationship between gene expression levels of the set of genes and short term or long term survival of subjects having ovarian cancer; and display the output score.
35. A computing device of claim 34, wherein the system memory includes instructions, that when executed by the processing unit, cause the processing unit to determine whether the output score is greater than or equal to or less than a cutoff value; and displaying whether the subject is likely to be a short term survivor if the output score is greater than or equal to the cutoff value or long term survivor if the output score is less than the cutoff value.

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201161481556P 2011-05-02 2011-05-02
US61/481,556 2011-05-02

Publications (1)

Publication Number Publication Date
WO2012151277A1 (en) 2012-11-08

Family

ID=47108029

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/US2012/036120 WO2012151277A1 (en) 2011-05-02 2012-05-02 Kits and methods for selecting a treatment for ovarian cancer

Country Status (1)

Country Link
WO (1) WO2012151277A1 (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20030165831A1 (en) * 2000-03-21 2003-09-04 John Lee Novel genes, compositions, kits, and methods for identification, assessment, prevention, and therapy of ovarian cancer
US7115370B2 (en) * 2002-06-05 2006-10-03 Capital Genomix, Inc. Combinatorial oligonucleotide PCR
WO2008086182A2 (en) * 2007-01-04 2008-07-17 University Of Rochester Use of gene signatures to design novel cancer treatment regimens

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
BERCHUCK ET AL.: "Patterns of Gene Expression That Characterize Long-Term Survival in Advanced Stage Serous Ovarian Cancers.", CLIN. CANCER RES., vol. 11, no. 10, 15 May 2005 (2005-05-15), pages 3686 - 3696 *
NIKAS ET AL.: "Mathematical Prognostic Biomarker Models for Treatment Response and Survival in Epithelial Ovarian Cancer.", CANCER INFORMATICS., vol. 10, 2011, pages 233 - 247 *

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9580437B2 (en) 2014-12-23 2017-02-28 Novartis Ag Triazolopyrimidine compounds and uses thereof
US10220036B2 (en) 2014-12-23 2019-03-05 Novartis Ag Triazolopyrimidine compounds and uses thereof
US11207325B2 (en) 2014-12-23 2021-12-28 Novartis Ag Triazolopyrimidine compounds and uses thereof
US11931363B2 (en) 2014-12-23 2024-03-19 Novartis Ag Triazolopyrimidine compounds and uses thereof
CN105219855A (en) * 2015-09-29 2016-01-06 北京泱深生物信息技术有限公司 A kind of osteoarthritis of diagnosing is caused a disease the DNA methylation assay reagent of risk
US10676479B2 (en) 2016-06-20 2020-06-09 Novartis Ag Imidazolepyridine compounds and uses thereof
US10689378B2 (en) 2016-06-20 2020-06-23 Novartis Ag Triazolopyridine compounds and uses thereof
US11091489B2 (en) 2016-06-20 2021-08-17 Novartis Ag Crystalline forms of a triazolopyrimidine compound
US11548897B2 (en) 2016-06-20 2023-01-10 Novartis Ag Crystalline forms of a triazolopyrimidine compound

Similar Documents

Publication Publication Date Title
AU2019200670B2 (en) Interrogatory cell-based assays and uses thereof
DK2121979T3 (en) Genetic markers for risk management of cardiac arrhythmia
DK2471954T3 (en) Susceptibility genetic variants associated with cardiovascular diseases
DK2644711T3 (en) A method for diagnosing neoplasms
KR20150090246A (en) Molecular diagnostic test for cancer
KR20140044341A (en) Molecular diagnostic test for cancer
CN110382521A (en) Method to distinguish tumor-suppressive FOXO activity from oxidative stress
AU2014299322B2 (en) Sepsis biomarkers and uses thereof
KR101421326B1 (en) Composition for predicting prognosis of breast cancer and kit comprising the same
KR20100017865A (en) Genetic variants on chr 5p12 and 10q26 as markers for use in breast cancer risk assessment, diagnosis, prognosis and treatment
CN111448325A (en) Assessment of JAK-STAT3 cell signaling pathway activity using mathematical modeling of target gene expression
CN112795650A (en) Evaluation of PI3K cell signaling pathway activity using mathematical modeling of target gene expression
CN101687050A (en) Methods and materials for identifying the origin of a carcinoma of unknown primary origin
AU2018210695A1 (en) Molecular subtyping, prognosis, and treatment of bladder cancer
CA2383871A1 (en) A novel bap28 gene and protein
EP2852839A1 (en) Interrogatory cell-based assays for identifying drug-induced toxicity markers
CN101258249A (en) Methods and reagents for the detection of melanoma
KR20110057188A (en) System and methods for measuring biomarker profiles
KR20140140069A (en) Compositions and methods for diagnosis and treatment of pervasive developmental disorder
KR20090127939A (en) Genetic variants on chr2 and chr16 as markers for use in breast cancer risk assessment, diagnosis, prognosis and treatment
CN101111768A (en) Lung cancer prognostics
WO2012151277A1 (en) Kits and methods for selecting a treatment for ovarian cancer
US20230022417A1 (en) Chemical compositions and methods of use
US20230022236A1 (en) Chemical compositions and methods of use
KR102250063B1 (en) Method for identifying causative genes of tourette syndrome

Legal Events

Date Code Title Description
121 EP: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 12779364; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 EP: PCT application non-entry in European phase (Ref document number: 12779364; Country of ref document: EP; Kind code of ref document: A1)