MXPA06008788A - Predicting bone relapse of breast cancer - Google Patents

Predicting bone relapse of breast cancer

Info

Publication number
MXPA06008788A
Authority
MX
Mexico
Prior art keywords
further characterized
gene
expression
genes
relapse
Prior art date
Application number
MXPA/A/2006/008788A
Other languages
Spanish (es)
Inventor
Yixin Wang
Yi Zhang
David Atkins
Marcel Smid
Anieta M Sieuwerts
John W M Martens
Jan G M Klijin
John A Foekens
Original Assignee
Veridex Llc
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Veridex Llc

Abstract

A method of predicting relapse of breast cancer in bone is conducted by analyzing the expression of a group of genes. Gene expression profiles in a variety of media, such as microarrays, are included, as are kits that contain them.

Description

PREDICTION OF BONE RELAPSE OF BREAST CANCER
BACKGROUND OF THE INVENTION
This invention relates to the prognosis of a patient with breast cancer with respect to relapse to bone, and is based on the gene expression profiles of biological samples from the patient.
The most common site of distant relapse in breast cancer is bone. Many factors have been implicated in facilitating bone relapse, including blood flow in the red bone marrow, adhesion molecules on the tumor cells, and growth factors immobilized in the bone matrix, such as transforming growth factor-β, bone morphogenetic proteins, platelet-derived growth factor, insulin-like growth factors and fibroblast growth factors. However, the gene-based relationships that promote interactions between bone and cancer cells derived from breast cancers are largely unknown.
A breast cancer prognostic was recently described for predicting distant recurrence in lymph node-negative patients. Wang et al., PCT/US2005/005711, filed on February 18, 2005. Patterns of gene expression have also been used to classify breast tumors into different clinically relevant subtypes. Perou et al. (2000); Sørlie et al. (2001); Sørlie et al. (2003); Gruvberger et al. (2001); van't Veer et al. (2002); van de Vijver et al. (2002); Ahr et al. (2002); Huang et al. (2003); Sotiriou et al. (2003); Woelfle et al. (2003); Ma et al. (2003); Ramaswamy et al. (2003); Chang et al. (2003); Sotiriou et al. (2003); and Hedenfalk et al. (2001).
At present, however, few diagnostic tools are available to identify patients specifically at risk of relapse to bone. There is a need to identify a patient's risk of relapse of the disease to bone specifically, to ensure that the patient receives the appropriate therapy.
BRIEF DESCRIPTION OF THE INVENTION
The invention encompasses a method for assessing breast cancer status by obtaining a biological sample from a patient with breast cancer and measuring the expression levels of genes via Markers, wherein gene expression levels above or below predetermined cut-off levels are indicative of the breast cancer status with respect to bone metastasis.
The invention encompasses a method for staging breast cancer by obtaining a biological sample from a patient with breast cancer and measuring the expression levels in the sample of genes via Markers, wherein gene expression levels above or below predetermined cut-off levels are indicative of the breast cancer stage.
The invention encompasses a method for determining the treatment of a patient with breast cancer by obtaining a biological sample from a patient with breast cancer and measuring the expression levels in the sample of genes via Markers, wherein gene expression levels above or below predetermined cut-off levels (as set out in an algorithm) are sufficiently indicative of the risk of metastasis to bone to allow a physician to determine the degree and type of therapy recommended to avoid such metastasis.
The invention encompasses a method for treating a patient with breast cancer by obtaining a biological sample from a patient with breast cancer; measuring the expression levels in the sample of genes via Markers, wherein gene expression levels above or below predetermined cut-off levels indicate a high risk of bone metastasis; and treating the patient with adjuvant therapy if the patient is at high risk.
The invention encompasses a method for generating a bone relapse probability score to enable the prognosis of patients with breast cancer, by obtaining gene expression data from a statistically significant number of patient biological samples; applying univariate Cox regression analysis to the data to obtain selected genes; and applying weighted expression levels to the selected genes, with standardized Cox coefficients, to obtain a prediction model that can be applied as a bone relapse probability score.
The invention encompasses a method for generating a breast cancer prognostic patient report by obtaining a biological sample from the patient; measuring gene expression in the sample; applying a bone relapse probability score; and using the results obtained to generate the report. The invention also encompasses the patient reports generated thereby.
The invention encompasses a composition that contains Markers. The invention encompasses a kit for performing an assay to determine the prognosis of breast cancer using a biological sample obtained from the patient. The kit contains materials for detecting the Markers. Preferably, the kit includes instructions for its use. The invention encompasses articles for assessing breast cancer status that contain the Markers. The invention encompasses a diagnostic/prognostic portfolio containing Markers, wherein the combination is sufficient to characterize the breast cancer status or the risk of relapse in bone in a biological sample.
The inventive methods can be used advantageously in conjunction with other breast cancer prognostics. This can be done in a reflex fashion, so that a prognosis of any relapse is determined first, followed by application of the PAM procedure presented in this application.
Alternatively, these methods can be performed simultaneously or nearly simultaneously, to provide the physician and/or the patient with information about the likelihood of relapse anywhere and, more specifically, relapse to bone.
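The scoring method summarized above (expression levels of selected genes weighted by standardized Cox coefficients and compared with a predetermined cut-off) can be sketched as follows. This is only an illustration under stated assumptions: the function name, the cut-off of 0.0 and the three example values are hypothetical and are not taken from the specification.

```python
def bone_relapse_score(expression, cox_coefficients, cutoff=0.0):
    """Weighted sum of expression levels, one standardized Cox coefficient per gene.

    expression       -- expression levels of the selected genes (assumed already
                        normalized, e.g. log-transformed and standardized)
    cox_coefficients -- standardized Cox regression coefficients, same gene order
    cutoff           -- hypothetical predetermined cut-off separating risk groups
    """
    if len(expression) != len(cox_coefficients):
        raise ValueError("one coefficient per gene is required")
    score = sum(c * x for c, x in zip(cox_coefficients, expression))
    label = "high risk" if score > cutoff else "low risk"
    return score, label


# Made-up values for three hypothetical genes (not data from the patent).
score, label = bone_relapse_score([1.2, -0.4, 0.8], [0.5, -1.0, 0.25])
print(round(score, 3), label)
```

In practice the coefficients would come from a Cox regression fitted on a training cohort, and the cut-off would be chosen on that cohort before being applied to new patients.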
DETAILED DESCRIPTION OF THE INVENTION
The method of the invention for assessing breast cancer status determines whether a patient is at high risk of recurrence of the disease in bone. References to prognosis and prediction throughout this application relate to predictions concerning the relapse of breast cancer with its appearance in bone. These methods involve obtaining a biological sample from a patient with breast cancer and measuring the expression levels in the sample of certain genes, wherein gene expression levels above or below predetermined cut-off levels are indicative of the breast cancer status with respect to its relapse in bone.
The inventive methods, compositions, articles and kits described and claimed in this specification include one or more Markers. "Marker" is used throughout this specification to refer to: a) genes and gene expression products, such as RNA, mRNA and the corresponding cDNA, peptides, proteins, and fragments and complements of each of the foregoing, and b) compositions such as probes, antibodies, ligands, haptens and labels which, through physical or chemical interaction with a), indicate the expression of the gene or the presence of the gene expression product, and wherein the gene, the gene expression product or the compositions correspond to: i) SEQ ID NO 112, ii) a combination of SEQ ID NO 112 and a member of the group consisting of SEQ ID NO 113, SEQ ID NO 114, SEQ ID NO 115 and SEQ ID NO 116, iii) a combination of SEQ ID NO 112 and all of SEQ ID NO 113, SEQ ID NO 114, SEQ ID NO 115 and SEQ ID NO 116, iv) one or more of SEQ ID NO 112-SEQ ID NO 116 and one or more of SEQ ID NO 117-198, or v) all of SEQ ID NO 112-SEQ ID NO 198. A gene corresponds to the sequence designated by a SEQ ID NO when it contains that sequence.
A gene segment or fragment corresponds to the sequence of such a gene when it contains a portion of the referenced sequence, or its complement, sufficient to distinguish it as being the sequence of the gene. A gene expression product corresponds to such a sequence when its RNA, mRNA or cDNA hybridizes to the composition having such a sequence (e.g., a probe) or, in the case of a peptide or protein, when it is encoded by such mRNA. A segment or fragment of a gene expression product corresponds to the sequence of such gene or gene expression product when it contains a portion of the referenced gene expression product, or its complement, sufficient to distinguish it as being the sequence of the gene or gene expression product. Markers that correspond to i and iii are preferred. Markers that correspond to iv and v are most preferred.
Although the mere presence or absence of particular nucleic acid sequences (e.g., genes containing SNPs) in a tissue sample has only rarely been found to have diagnostic or prognostic value, information about the expression of various proteins, peptides or mRNA is increasingly viewed as important. The mere presence within the genome of nucleic acid sequences that have the potential to express proteins, peptides or mRNA (such sequences being referred to as "genes") is not, by itself, determinative of whether a protein, peptide or mRNA is expressed in a given cell. Whether or not a given gene is capable of expressing proteins, peptides or mRNA, and to what extent such expression occurs, if at all, is determined by a variety of complex factors. However, indications of the degree to which genes are active or inactive can be found in gene expression profiles. It is reported here that gene expression assays are useful for identifying and reporting whether a patient with breast cancer is likely to experience a relapse to bone.
This is important for several reasons, including ensuring that the patient can receive the most beneficial treatment.
Sample preparation is an important aspect of practicing the methods and using the kits and articles of the invention. Sample preparation requires the collection of patient samples. The patient samples used in the inventive method are those suspected of containing diseased cells, such as epithelial cells taken from the primary tumor in a breast sample. The sample can be any sample suspected of containing cancer cells including, but not limited to, primary tumor tissue, tissue or fluid aspirates, and ductal fluid, prepared by any method known in the art, including bulk tissue preparation and laser capture microdissection. Bulk tissue preparations can be obtained from a biopsy or from a surgical specimen. Fluids can readily be obtained by fine needle aspiration, lavage and other extraction methods known in the medical arts. Most preferably, the sample is obtained from the primary tumor. Samples taken from surgical margins are also preferred. Laser Capture Microdissection (LCM) technology is one way to select the cells to be studied while minimizing the variability caused by cell type heterogeneity. The samples can also comprise circulating epithelial cells extracted from peripheral blood. These can be obtained by various methods, but the most preferred method is the magnetic separation technique described in U.S. Patent 6,136,182, incorporated in its entirety in this specification.
Once the sample containing the cells of interest has been obtained, the genetic material is extracted and used in the methods, or with the kits or articles, of the invention. Preferably, RNA is extracted and amplified, and a gene expression profile is obtained, preferably via a microarray, for the genes in the appropriate portfolios.
Using gene expression microarray data (Affymetrix U133A chips) from 107 primary breast tumors, all of which were lymph node-negative at the time of diagnosis and all of which had relapsed, panels of genes were found that were significantly differentially expressed between patients who relapsed to bone and those who relapsed elsewhere in the body. This panel was obtained using the SAM procedure described in this application. The most differentially expressed gene in this panel, TFF1, was confirmed by quantitative RT-PCR in an independent cohort (n = 122, p = 0.0015). In addition, a classifier was developed that accurately predicts relapse to bone overall. This classifier/panel is referred to as the PAM panel in this application. The classifier can be used as a tool for recommending adjuvant therapy particularly suited to the treatment of bone metastasis, including, but not limited to, bisphosphonate treatment. These treatments can be recommended in addition to endocrine treatments, chemotherapy, radiation or other treatments.
The inventive methods for staging breast cancer involve obtaining a biological sample from a patient with breast cancer and measuring the expression levels in the sample of genes, via Markers, wherein gene expression levels above or below predetermined cut-off levels are used as an input to indicate the breast cancer stage. The information is used with any classification known in the art, including the TNM system of the American Joint Committee on Cancer (www.cancerstaging.org), and with comparison to the stages corresponding to patients with similar gene expression profiles.
The methods for determining the treatment of a patient with breast cancer involve obtaining a biological sample from a patient with breast cancer and measuring the expression levels in the sample of genes, via Markers, wherein gene expression levels above or below predetermined cut-off levels are sufficiently indicative of the risk of relapse to bone to allow the physician to determine the degree and type of therapy recommended to avoid such relapse. The method of treating a patient with breast cancer involves obtaining a biological sample from a patient with breast cancer; measuring the expression levels in the sample of genes, via Markers, wherein gene expression levels above or below predetermined cut-off levels indicate a high risk of relapse to bone; and treating the patient with adjuvant therapy if the patient is at high risk.
The above methods may further include measuring the expression level of at least one gene constitutively expressed in the sample. The above methods preferably have a specificity of at least 40% and a sensitivity of at least 80%. The above methods can be used where the expression pattern of the genes is compared with an expression pattern indicative of a patient with breast cancer who has relapsed to bone. The comparison can be made by any method known in the art, including comparison of expression patterns by pattern recognition methods. The pattern recognition methods may be any known in the art, including PAM analysis and, alternatively, Cox proportional hazards analysis.
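For reference, the sensitivity and specificity figures quoted above can be computed from paired predictions and observed outcomes as in the following minimal sketch. The function name and the eight example patients are hypothetical and serve only to illustrate the arithmetic.

```python
def sensitivity_specificity(predicted_high_risk, relapsed_to_bone):
    """Return (sensitivity, specificity) for paired boolean lists.

    sensitivity = true positives / all patients who relapsed to bone
    specificity = true negatives / all patients who did not relapse to bone
    """
    pairs = list(zip(predicted_high_risk, relapsed_to_bone))
    tp = sum(1 for p, r in pairs if p and r)
    fn = sum(1 for p, r in pairs if not p and r)
    tn = sum(1 for p, r in pairs if not p and not r)
    fp = sum(1 for p, r in pairs if p and not r)
    return tp / (tp + fn), tn / (tn + fp)


# Eight made-up patients: classifier call vs. observed relapse to bone.
predicted = [True, True, True, False, True, False, False, True]
observed = [True, True, True, True, False, False, False, True]
sens, spec = sensitivity_specificity(predicted, observed)
print(sens, round(spec, 3))
```

With these illustrative values the classifier would satisfy the preferred thresholds stated above (sensitivity of at least 80% and specificity of at least 40%).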
Preferably, the levels of up- and down-regulation of the gene Markers used in the invention are distinguished based on fold changes in the intensity measurements of the hybridized microarray probes. In any case, the inventive methods, kits and portfolios, and the measurements and analyses carried out in them, use predetermined cut-off levels indicative of at least 1.7-fold over- or under-expression in the sample relative to samples from patients without relapse to bone. Preferably, the predetermined cut-off levels have at least a statistically significant p-value for overexpression in samples from patients who relapse to bone relative to patients who do not relapse to bone. Most preferably, the p-value is less than 0.05. A 2.0-fold difference is more preferred for making such distinctions. That is, before a gene is said to be differentially expressed in samples from patients with relapse versus those without relapse, the samples from patients with relapse are found to yield at least 2-fold more, or 2-fold less, intensity than those from patients without relapse. The greater the fold difference, the more preferred is the use of the gene as a diagnostic or prognostic tool, provided that the p-value of the gene is clinically acceptable (i.e., closely associated with relapse to bone). The genes selected for the gene expression profiles of the present invention have expression levels that generate a signal distinguishable from that of the unmodulated genes of patients without relapse by an amount that exceeds background, using clinical laboratory instrumentation.
The above methods can be used where gene expression is measured on a microarray or gene chip. Gene chips and microarrays suitable for use herein are also included in the invention.
The microarray may be a cDNA array or an oligonucleotide array, and may also contain one or more internal control reagents. The above methods can likewise be used where gene expression is determined by nucleic acid amplification and detection methods. Preferably, such methods include polymerase chain reaction (PCR) of RNA extracted from the sample. The PCR can be reverse transcription polymerase chain reaction (RT-PCR). The RT-PCR may also contain one or more internal control reagents. The above methods can be used wherein gene expression is detected by measuring or detecting a protein encoded by the gene. Any method known in the art can be used, including detection by an antibody specific to the protein and measurement of a characteristic of the gene. Suitable characteristics include, but are not limited to, DNA amplification, methylation, mutation and allelic variation.
One method of the invention comprises generating a bone relapse probability score to enable the prediction of relapse to bone. This method can be performed by obtaining gene expression data from a statistically significant number of patient biological samples and applying PAM analysis, as described below. In another embodiment of the invention, the bone relapse probability score can be obtained by applying the Cox regression formula using standardized Cox regression coefficients.
The inventive method for generating a breast cancer prognostic patient report (for relapse to bone) is performed by obtaining a biological sample from the patient; measuring gene expression in the sample; applying a bone relapse probability score to the results; and using the results obtained to generate the report. The report may contain an assessment of the patient's outcome and/or the probability of risk relative to the patient population.
The inventive compositions include at least one set of Marker probes. The composition may further contain reagents for performing a microarray, amplification or probe-based analysis, and a medium through which the nucleic acid sequences, their complements or portions thereof are assayed.
The inventive kit for performing an assay to determine the prognosis of breast cancer in a biological sample includes materials for detecting the Markers. The kit may also contain reagents for performing a microarray, amplification or probe-based analysis, and a medium through which the nucleic acid sequences, their complements or portions thereof are assayed.
The inventive articles for assessing breast cancer status include materials for detecting the Markers. The articles may also contain reagents for performing a microarray, amplification or probe-based analysis, and a medium through which the nucleic acid sequences, their complements or portions thereof can be assayed.
Microarrays useful in the inventive methods, articles and kits may contain Markers, wherein the combination is sufficient to characterize the breast cancer status or the risk of relapse in bone. Preferred kits, articles and microarrays include substrates to which the probes are bound or attached, and with which the target Markers bind or associate so that they can be detected. It is more preferred that these substrates be suitable only for performing the assay described in this specification, or that they be suitable for performing a discrete number of related assays (i.e., that they contain small numbers of panels).
The invention encompasses a diagnostic/prognostic portfolio of the Markers, wherein the combination is sufficient to characterize the breast cancer status or the risk of relapse to bone in a biological sample.
Preferred methods for establishing gene expression profiles include determining the amount of RNA that is produced by a gene that can code for a protein or peptide.
This is accomplished by reverse transcriptase PCR (RT-PCR), competitive RT-PCR, real-time RT-PCR, differential display RT-PCR, Northern blot analysis and other related tests. Although it is possible to perform these techniques using individual PCR reactions, it is best to amplify the complementary DNA (cDNA) or complementary RNA (cRNA) produced from mRNA and analyze it via a microarray. A number of different array configurations and methods for their production are known to those skilled in the art, and are described in U.S. Patents such as: 5,445,934; 5,532,128; 5,556,752; 5,242,974; 5,384,261; 5,405,783; 5,412,087; 5,424,186; 5,429,807; 5,436,327; 5,472,672; 5,527,681; 5,529,756; 5,545,531; 5,554,501; 5,561,071; 5,571,639; 5,593,839; 5,599,695; 5,624,711; 5,658,734; and 5,700,637.
Microarray technology allows the steady-state mRNA level of thousands of genes to be measured simultaneously, thereby providing a powerful tool for identifying effects such as the onset, arrest or modulation of uncontrolled cell proliferation. Two microarray technologies are currently in wide use. The first are cDNA arrays and the second are oligonucleotide arrays. Although there are differences in the construction of these chips, essentially all downstream data analysis and output are the same. The product of these analyses is typically a set of measurements of the intensity of the signal received from a labeled probe used to detect a cDNA sequence from the sample that hybridizes to a nucleic acid sequence at a known location on the microarray. Typically, the signal intensity is proportional to the amount of cDNA, and thus of mRNA, expressed in the sample cells. A large number of such techniques are available and useful. Preferred methods for determining gene expression can be found in U.S. Patents 6,271,002; 6,218,122; 6,218,114 and 6,004,755.
Analysis of the expression levels is carried out by comparing signal intensities and subjecting these measurements to statistical algorithms. Generating a ratio matrix of the expression intensities of genes in a test sample versus those in a control sample is one such method. For example, the gene expression intensities of a test tissue can be compared with the expression intensities generated from tissue of the same type from a patient with the condition of interest (e.g., tumor tissue from a patient who has relapsed to bone vs. one who has not). The ratio of these expression intensities indicates the fold change in gene expression between the test and control samples.
Gene expression profiles can also be displayed in several ways. The most common method is to arrange the raw fluorescence intensities or the ratio matrix into a graphical dendrogram in which the columns indicate test samples and the rows indicate genes. The data are arranged so that genes with similar expression profiles are close to each other. The expression ratio of each gene is visualized as a color. For example, a ratio less than one (indicating down-regulation) may appear in the blue portion of the spectrum, while a ratio greater than one (indicating up-regulation) may appear as a color in the red portion of the spectrum. Commercially available computer software programs can display such data, including GeneSpring from Agilent Technologies and the Partek Discover™ and Partek Infer™ programs from Partek®.
The modulated genes used in the methods of the invention are described in the Examples. The differentially expressed genes are either up- or down-regulated in patients with a relapse of breast cancer to bone relative to those without such relapse.
Up-regulation and down-regulation are relative terms meaning that a detectable difference (beyond the contribution of noise in the system used to measure it) is found in the amount of expression of the genes relative to some baseline. In this case, the baseline is the measured gene expression of a patient without relapse. The genes of interest in the diseased cells (from patients with relapse) are either up- or down-regulated relative to the baseline level (from patients without relapse to bone) using the same measurement method. A patient with a gene expression pattern consistent with that of the condition of interest (likelihood of relapse to bone) is assessed as having that condition and treated accordingly. In therapy monitoring, clinical judgments are made regarding the effect of a given course of therapy by comparing gene expression over time to determine whether the gene expression profiles have changed, or are changing, to patterns more consistent with tissue from patients without relapse.
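The ratio-based comparison described above, together with the 2.0-fold cut-off stated earlier as preferred, can be illustrated with a short sketch. The function name and the intensity values are hypothetical, and real data would first require background correction and normalization, which are omitted here.

```python
def fold_change_calls(test, control, fold_cutoff=2.0):
    """Compare test vs. baseline intensities gene by gene.

    Returns the intensity ratios and a call per gene:
    'up' if ratio >= fold_cutoff, 'down' if ratio <= 1 / fold_cutoff,
    otherwise 'unchanged'.
    """
    ratios = [t / c for t, c in zip(test, control)]
    calls = []
    for ratio in ratios:
        if ratio >= fold_cutoff:
            calls.append("up")
        elif ratio <= 1.0 / fold_cutoff:
            calls.append("down")
        else:
            calls.append("unchanged")
    return ratios, calls


# Made-up intensities for three hypothetical genes.
ratios, calls = fold_change_calls([400.0, 50.0, 120.0], [100.0, 200.0, 100.0])
print(ratios, calls)
```

A ratio above one corresponds to the red (up-regulated) end of the color scale described above, and a ratio below one to the blue (down-regulated) end.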
Statistical values can be used to reliably distinguish modulated genes from unmodulated genes and noise, and to establish expression profiles. One such statistical test, which finds the genes most significantly different between diverse groups of samples, is based on Student's t-test. The p-values are obtained in relation to the inclusion of particular genes in a class of genes. The lower the p-value, the more compelling the evidence that the gene shows a difference between the different groups. Since microarrays measure more than one gene at a time, tens of thousands of statistical tests may be performed at once, so small p-values are likely to be seen by chance alone. Adjustments for this can be made using a Šidák correction as well as a randomization/permutation experiment. A p-value less than 0.05 by the t-test is evidence that the gene is significantly different. More compelling evidence is a p-value less than 0.05 after the Šidák correction is factored in. For a large number of samples in each group, a p-value less than 0.05 after the randomization/permutation test is the most compelling evidence of a significant difference.
Another parameter that can be used to select genes that generate a signal greater than that of the unmodulated gene or noise is a measurement of the absolute signal difference. Preferably, the signal generated by the expression of the modulated gene is at least 20% different from that of the unmodulated gene or genes of the patient without relapse (on an absolute basis). It is even more preferred that such genes produce expression patterns that are at least 30% different from those of the unmodulated genes.
Genes that are grouped so that the information obtained about the set of genes in the group provides a sound basis for making a clinically relevant judgment, such as a diagnosis, prognosis or choice of treatment, constitute the portfolios of the invention.
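As an aside on the multiple-testing adjustment mentioned in this passage, the Šidák correction for m independent tests replaces each raw p-value p with 1 - (1 - p)^m. The sketch below applies it to a handful of made-up p-values; in a real analysis m would be the number of genes tested on the array, and the function name is an assumption for illustration.

```python
def sidak_adjust(p_values):
    """Sidak-adjusted p-values: 1 - (1 - p) ** m, with m = number of tests."""
    m = len(p_values)
    return [1.0 - (1.0 - p) ** m for p in p_values]


# Four made-up raw p-values from hypothetical per-gene t-tests.
raw = [0.001, 0.02, 0.04, 0.5]
adjusted = sidak_adjust(raw)
significant = [p < 0.05 for p in adjusted]
print([round(p, 4) for p in adjusted], significant)
```

Here three of the four genes pass the unadjusted 0.05 threshold, but only one remains significant after adjustment, which is why the adjusted value is treated as the more compelling evidence.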
In this case, the judgments supported by the portfolios involve breast cancer and its likelihood of relapse to bone. Preferably, portfolios are established so that the combination of genes in the portfolio exhibits improved sensitivity and specificity relative to individual genes or randomly selected combinations of genes. In the context of the present invention, the sensitivity of the portfolio can be reflected in the fold differences exhibited by gene expression in a patient who relapses to bone relative to those without such relapse. Specificity can be reflected in statistical measurements of the correlation of the gene expression signal with the condition of interest. For example, the standard deviation can be used as such a measurement. When considering a group of genes for inclusion in a portfolio, a small standard deviation in expression measurements correlates with greater specificity. Other measurements of variation, such as correlation coefficients, can also be used.
The gene portfolios of this invention were determined through SAM analysis. The gene expression patterns were analyzed using PAM analysis. SAM (Significance Analysis of Microarrays) is a statistical procedure for identifying genes whose expression patterns are significantly associated with specific characteristics of sets of samples. The method is embodied in a program developed at Stanford University and is publicly available. SAM identifies genes with statistically significant changes in expression by assimilating a set of gene-specific t-tests. The method is described in U.S. Patent Application 20020019704 to Tusher et al., filed on March 19, 2001 and incorporated in its entirety in this specification.
It is also described in Significance Analysis of Microarrays Applied to the Ionizing Radiation Response; Tusher, Tibshirani, and Chu, PNAS, April 24, 2001, vol. 98, no. 9, 5116-5121.
In a SAM analysis, each gene tested is assigned a score based on its change in gene expression relative to the standard deviation of repeated measurements for that gene. Genes with scores greater than a threshold are deemed potentially significant. The percentage of such genes identified by chance is the false discovery rate (FDR). To estimate the FDR, nonsense genes are identified by analyzing permutations of the measurements. The threshold can be adjusted to identify smaller or larger sets of genes, and FDRs are calculated for each set.
A value referred to as the "relative difference", or d(i), in gene expression is based on the ratio of the change in gene expression to the standard deviation in the data for that gene. The "gene-specific scatter" s(i) is the standard deviation of repeated expression measurements. The coefficient of variation of d(i) is calculated as a function of s(i). To find significant changes in gene expression, genes are ranked by the magnitude of their d(i) values, so that d(1) is the largest relative difference, d(2) is the second largest relative difference, and d(i) is the i-th largest relative difference. For each of the permutations, relative differences dp(i) are also calculated, and the genes are ranked again, so that dp(i) is the i-th largest relative difference for permutation p. The expected relative difference, dE(i), is defined as the average over the balanced permutations. To identify potentially significant changes in expression, a scatter plot of the observed relative difference d(i) versus the expected relative difference dE(i) can be used.
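The observed and expected relative differences just defined can be sketched in a few lines. This is a simplified illustration, not the Stanford SAM program: the formula for s(i) and the small positive constant s0 follow the published description by Tusher et al., the permutations are plain random label shuffles rather than the balanced permutations of the original method, and the function names and data are hypothetical.

```python
import math
import random


def relative_difference(group_a, group_b, s0=0.1):
    """d(i) = (mean_a - mean_b) / (s(i) + s0), with s(i) the pooled scatter."""
    n1, n2 = len(group_a), len(group_b)
    mean_a = sum(group_a) / n1
    mean_b = sum(group_b) / n2
    ss = sum((x - mean_a) ** 2 for x in group_a) + sum((x - mean_b) ** 2 for x in group_b)
    a = (1.0 / n1 + 1.0 / n2) / (n1 + n2 - 2)
    return (mean_a - mean_b) / (math.sqrt(a * ss) + s0)


def ranked_differences(data, labels, s0=0.1):
    """Relative difference for every gene, sorted so the first is the largest."""
    values = []
    for gene in data:
        group_a = [x for x, lab in zip(gene, labels) if lab == 1]
        group_b = [x for x, lab in zip(gene, labels) if lab == 0]
        values.append(relative_difference(group_a, group_b, s0))
    return sorted(values, reverse=True)


def observed_vs_expected(data, labels, n_perm=200, seed=0, s0=0.1):
    """Observed d(i) and expected dE(i), averaged over shuffled sample labels."""
    observed = ranked_differences(data, labels, s0)
    rng = random.Random(seed)
    totals = [0.0] * len(data)
    for _ in range(n_perm):
        shuffled = list(labels)
        rng.shuffle(shuffled)  # simplification: not the balanced permutations of SAM
        for i, d in enumerate(ranked_differences(data, shuffled, s0)):
            totals[i] += d
    expected = [t / n_perm for t in totals]
    return observed, expected


# Three hypothetical genes measured in six samples (1 = bone relapse, 0 = other).
data = [
    [5.0, 6.0, 7.0, 1.0, 2.0, 3.0],
    [2.0, 2.5, 1.5, 2.2, 1.8, 2.1],
    [1.0, 3.0, 2.0, 2.5, 1.5, 2.0],
]
labels = [1, 1, 1, 0, 0, 0]
observed, expected = observed_vs_expected(data, labels)
for d_obs, d_exp in zip(observed, expected):
    print(round(d_obs, 3), round(d_exp, 3))
```

Plotting `observed` against `expected` gives the scatter plot described above; genes far from the diagonal are the candidates for significant change.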
For the vast majority of genes, d(i) is approximately equal to dE(i), but some genes are represented by points displaced from the line d(i) = dE(i) by a distance greater than a threshold. To determine the number of falsely significant genes generated by SAM, horizontal cutoffs are defined as the smallest d(i) among the genes called significantly induced and the least negative d(i) among the genes called significantly repressed. The number of falsely significant genes corresponding to each permutation is calculated by counting the number of genes that exceed the horizontal cutoffs for induced and repressed genes. The estimated number of falsely significant genes is the average of the number of genes called significant over all the permutations. This method for setting thresholds provides asymmetric cutoffs for induced and repressed genes. An alternative is the standard t-test, which imposes a symmetric horizontal cutoff, with d(i) greater than c for induced genes and d(i) less than -c for repressed genes. However, the asymmetric cutoff is preferred because it allows for the possibility that d(i) for induced and repressed genes may behave differently in some biological experiments.
PAM (Predictive Analysis of Microarrays) is a modified version of the nearest centroid method. The method was developed at Stanford University and is typically carried out using the R statistical package. It provides a list of significant genes whose expression characterizes each diagnostic class, and it estimates the prediction error via cross-validation. The method is a nearest shrunken centroid methodology. It is described in Diagnosis of Multiple Cancer Types by Shrunken Centroids of Gene Expression; Tibshirani, Hastie, Narasimhan and Chu, PNAS 2002, 99: 6567-6572 (May 14, 2002). In this method, a standardized centroid is calculated for each class.
This is the average gene expression for each gene in each class, divided by the within-class standard deviation for that gene. Nearest centroid classification takes the gene expression profile of a new sample and compares it with each of these class centroids. The class whose centroid is closest, in squared distance, is the predicted class for that new sample. Nearest shrunken centroid classification "shrinks" each of the class centroids toward the overall centroid for all classes by an amount called the threshold. This shrinkage consists of moving the centroid toward zero by the threshold, setting it equal to zero if it hits zero. For example, if the threshold were 2.0, a centroid of 3.2 would shrink to 1.2, a centroid of -3.4 would shrink to -1.4, and a centroid of 1.2 would shrink to zero. After the centroids are shrunk, the new sample is classified by the usual nearest centroid rule, but using the shrunken class centroids. This shrinkage can make the classifier more accurate by reducing the effect of noisy genes, and it provides automatic gene selection. In particular, if a gene is shrunk to zero for all classes, it is removed from the prediction rule. Alternatively, it may be set to zero for all classes except one, in which case one learns that high or low expression of that gene characterizes that class.

The user decides the value to be used for the threshold. Typically, one examines several different choices. To guide this choice, PAM performs K-fold cross-validation over a range of threshold values. The samples are divided at random into K parts of approximately equal size. For each part in turn, the classifier is built on the other K-1 parts and then tested on the remaining part. This is done for a range of threshold values, and the cross-validated misclassification error rate is reported for each threshold value. Typically, the user would choose the threshold value that gives the minimum cross-validated misclassification error rate.
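The shrinkage step and the nearest centroid rule described above can be illustrated with a short sketch. It follows the simplified description in the text (centroid values moved toward zero by the threshold) and is not the PAM software itself; the function names and the two example class labels are hypothetical.

```python
import numpy as np

def soft_threshold(values, threshold):
    """Move each centroid component toward zero by `threshold`, clipping at zero."""
    values = np.asarray(values, dtype=float)
    return np.sign(values) * np.maximum(np.abs(values) - threshold, 0.0)

def nearest_centroid(sample, centroids):
    """Return the class label whose centroid has the smallest squared distance to `sample`."""
    sample = np.asarray(sample, dtype=float)
    return min(centroids, key=lambda label: float(np.sum((sample - centroids[label]) ** 2)))

# The worked example from the text, with a threshold of 2.0.
shrunk = soft_threshold([3.2, -3.4, 1.2], 2.0)

# Hypothetical two-gene centroids after shrinkage; the second gene is zero in
# both classes, so it no longer influences which class is chosen.
centroids = {
    "relapse": np.array([1.2, 0.0]),
    "no_relapse": np.array([-1.4, 0.0]),
}
predicted = nearest_centroid([1.0, 5.0], centroids)
```

In a cross-validation loop, the same two functions would be applied for each candidate threshold, and the threshold with the lowest held-out error would be retained.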
Alternatively, gene expression portfolios can be established through the use of optimization algorithms, such as the mean-variance algorithm widely used in establishing stock portfolios. This method is described in detail in U.S. Patent Publication No. 20030194734. Essentially, the method calls for the establishment of a set of inputs (stocks in financial applications, expression as measured by intensity here) that will optimize the return (e.g., the signal that is generated) one receives for using it while minimizing the variability of the return. Many commercial software programs are available to conduct such operations. The "Wagner Associates Mean-Variance Optimization Application", referred to as the "Wagner Program" throughout this specification, is preferred. This program uses functions from the "Wagner Associates Mean-Variance Optimization Library" to determine an efficient frontier and optimal portfolios in the Markowitz sense. Use of this type of program requires that the microarray data be transformed so that it can be treated as an input in the way that stock return and risk measurements are used when the program is used for its intended financial analysis purposes.

The process of selecting a portfolio can also include the application of heuristic rules. Preferably, such rules are formulated based on biology and an understanding of the technology used to produce clinical results. More preferably, they are applied to the output of the optimization method. For example, the mean-variance method of portfolio selection can be applied to microarray data for a number of genes differentially expressed in subjects with breast cancer. The output of the method would be an optimized set of genes that could include some genes that are expressed in peripheral blood as well as in diseased tissue. If the samples used in the testing method are obtained from peripheral blood, and certain genes differentially expressed in instances of breast cancer are also differentially expressed in peripheral blood, then a heuristic rule can be applied in which a portfolio is selected from the efficient frontier excluding those that are differentially expressed in peripheral blood. Of course, the rule can be applied prior to the formation of the efficient frontier by, for example, applying the rule during data pre-selection.

Other heuristic rules can be applied that are not necessarily related to the biology in question. For example, one can apply a rule that only a prescribed percentage of the portfolio can be represented by a particular gene or group of genes. Commercially available software, such as the Wagner Program, readily accommodates these types of heuristic rules. This can be useful, for example, when factors other than accuracy and precision (e.g., anticipated licensing fees) have an impact on the desirability of including one or more genes.

One method of the invention involves comparing gene expression profiles for various genes (or portfolios) to ascribe prognoses.
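The idea of trading signal against variability, combined with a heuristic exclusion rule, can be sketched with a textbook mean-variance calculation. This is not the Wagner Program or its library; it is a minimal sketch that assumes weights proportional to the inverse covariance matrix applied to the mean signal, normalized to sum to one, and the function name and `exclude` parameter are hypothetical.

```python
import numpy as np

def mean_variance_weights(mean_signal, cov, exclude=()):
    """Weights proportional to inv(cov) @ mean over the retained genes, summing to 1.

    Genes whose indices appear in `exclude` (e.g., genes also differentially
    expressed in peripheral blood) are removed before optimization and
    receive a weight of zero.
    """
    mean_signal = np.asarray(mean_signal, dtype=float)
    cov = np.asarray(cov, dtype=float)
    excluded = set(exclude)
    keep = [i for i in range(len(mean_signal)) if i not in excluded]
    raw = np.linalg.solve(cov[np.ix_(keep, keep)], mean_signal[keep])
    weights = np.zeros(len(mean_signal))
    weights[keep] = raw / raw.sum()
    return weights

# Three hypothetical genes with equal mean signal but different variability;
# the third gene is excluded by a heuristic rule.
weights = mean_variance_weights([2.0, 2.0, 2.0], np.diag([1.0, 4.0, 1.0]), exclude=(2,))
```

With equal signal, the noisier second gene receives the smaller weight, which is the qualitative behavior the text describes. A production optimizer would normally add constraints such as non-negative weights or per-gene caps.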
The gene expression profiles of each of the genes comprising the portfolio are fixed in a medium, such as a computer-readable medium. This can take a number of forms. For example, a table can be established into which the range of signals (e.g., intensity measurements) indicative of the condition of interest (e.g., a high probability of relapse to bone) is entered. Actual patient data can then be compared with the values in the table to determine the probability of relapse to bone for the patient's samples. In a more sophisticated embodiment, patterns of the expression signals (e.g., fluorescent intensity) are recorded digitally or graphically. The gene expression patterns from the gene portfolios used in conjunction with the patient samples are then compared with these expression patterns. Pattern comparison software can then be used to determine whether the patient's samples have a pattern indicative of relapse to bone. Of course, these comparisons can also be used to determine whether the patient is not likely to experience relapse to bone. The expression profiles of the samples are then compared with the portfolio of a control cell. If the expression patterns of the sample are consistent with the expression pattern for relapse to bone of a breast cancer, then (in the absence of countervailing medical considerations) the patient is treated as one would treat such a relapse patient. If the expression patterns of the sample are consistent with the expression pattern of the normal/control cell, then the patient is diagnosed as negative for breast cancer.

The gene expression pattern of a patient can be used to determine the prognosis of breast cancer (with respect to its relapse in bone) through the use of a Cox hazard analysis program. Such analyses are preferably performed using the S-Plus program (commercially available from Insightful Corporation).
Using such methods, a gene expression profile is compared with a profile that reliably represents relapse to bone (i.e., expression levels for the combination of genes in the profile are indicative of relapse to bone). The Cox hazard model with the established threshold is used to compare the similarity of the two profiles (known relapse to bone versus patient), and then to determine whether the patient's profile exceeds the threshold. If it does, the patient is classified as one who will relapse to bone and is treated accordingly, for example with adjuvant therapy, bisphosphonate therapy or other appropriate therapy. If the patient's profile does not exceed the threshold, the patient is classified as one who is not likely to relapse to bone. Other analytical tools can also be used to answer the same questions, such as linear discriminant analysis, logistic regression and neural network approaches. Numerous other well-known methods of pattern recognition are available. The following references provide some examples: Weighted voting: Golub et al. (1999).
Support vector machines: Su et al. (2001); and Ramaswamy et al. (2001). K-nearest neighbors: Ramaswamy (2001). Correlation coefficients: van't Veer et al. (2002).

The gene expression profiles of this invention can also be used in conjunction with other non-genetic diagnostic methods useful in cancer diagnosis, prognosis or treatment monitoring. For example, in some circumstances it is beneficial to combine the diagnostic power of the gene expression-based methods described above with data from conventional markers, such as serum protein markers (e.g., Cancer Antigen 27.29 ("CA 27.29")). A range of such markers exists, including such analytes as CA 27.29. In one such method, blood is periodically taken from a treated patient and then subjected to an enzyme immunoassay for one of the serum markers described above. When the concentration of the marker suggests the return of tumors or the failure of therapy, a sample source suitable for gene expression analysis is taken. Where a suspicious mass exists, a fine needle aspirate (FNA) is taken, and the gene expression profiles of the cells taken from the mass are then analyzed as described above. Alternatively, tissue samples can be taken from areas adjacent to the tissue from which a tumor was previously removed. This approach can be particularly useful when other testing produces ambiguous results.
The articles of this invention include representations of the gene expression profiles useful for treating, diagnosing, prognosticating and otherwise assessing whether a patient with breast cancer is likely to experience relapse in bone. These profile representations are reduced to a medium that can be automatically read by a machine, such as a computer-readable medium (magnetic, optical and the like). The articles can also include instructions for assessing the gene expression profiles in such media. For example, the articles may comprise a CD-ROM having computer instructions for comparing the gene expression profiles of the gene portfolios described above. The articles may also have gene expression profiles digitally recorded therein, so that they can be compared with gene expression data from patient samples. Alternatively, the profiles can be recorded in a different representational format. A graphical recordation is one such format. Clustering algorithms, such as those incorporated in the Partek Discover™ and Partek Infer™ software from Partek®, mentioned above, can best assist in the visualization of such data.

Different types of articles of manufacture according to the invention are media or formatted assays used to reveal gene expression profiles. These can comprise, for example, microarrays in which sequence complements or probes are affixed to a matrix to which the sequences indicative of the genes of interest hybridize, creating a readable determinant of their presence. Alternatively, articles according to the invention can be fashioned into reagent kits for conducting hybridization, amplification and signal generation indicative of the level of expression of the genes of interest for detecting breast cancer.

Kits made according to the invention include formatted assays for determining the gene expression profiles. These can include all or some of the materials needed to conduct the assays, such as reagents and instructions.

The invention is further illustrated by the following non-limiting examples. All references cited herein are hereby incorporated by reference.
EXAMPLES

The genes analyzed according to this invention are typically related to full-length nucleic acid sequences that code for the production of a protein or peptide. One skilled in the art will recognize that identification of full-length sequences is not necessary from an analytical point of view. That is, portions of the sequences or ESTs can be selected according to well-known principles, and probes can be designed for them to assess the expression of the corresponding gene.
EXAMPLE 1
Sample handling and microarray work for a different, previously established relapse profile

This example describes the establishment of a gene portfolio for identifying patients with breast cancer who are at high risk of relapse generally (i.e., not restricted to relapse to bone). Frozen tumor specimens from lymph node-negative patients treated during 1980-1995, but not treated with neoadjuvant systemic therapy, were selected from the tumor bank of the Erasmus Medical Center (Rotterdam, The Netherlands). All tumor samples had been submitted to a reference laboratory from 25 regional hospitals for steroid hormone receptor measurements. The guidelines for primary treatment were similar for all hospitals. The tumors were selected in a manner to avoid bias. Assuming 25-30% in 5 years, and a substantial loss of tumors for quality control reasons, 436 invasive tumor samples were processed. Patients with poor, intermediate and good clinical outcomes were included. Samples were rejected on the basis of insufficient tumor content (53), poor RNA quality (77) or poor chip quality (20), leaving 286 samples eligible for further analysis.

The average age of the patients at the time of surgery (breast-conserving surgery: 219 patients; modified radical mastectomy: 67 patients) was 52 years (range, 26-83 years). Radiotherapy was given to 248 patients (87%) according to institutional protocol. Patients were included regardless of radiotherapy status, since this study was not intended to investigate the potential effects of a specific type of surgery or adjuvant therapy. In addition, studies have shown that radiotherapy has no clear effect on distant disease relapse. Early Breast Cancer Trialists (1995). Lymph node negativity was based on pathological examination by regional pathologists. Foekens et al. (1989a).

Prior to inclusion, all 286 tumor samples were confirmed to have sufficient tumor content (> 70%) and uniform tumor involvement in H&E-stained 5 μm frozen sections. ER (and PgR) levels were measured by ligand binding assay or enzyme immunoassay (EIA) (Foekens et al. (1989b)), or by immunohistochemistry (in 9 tumors). The cutoff values used to classify patients as positive or negative for ER and PgR were 10 fmol/mg protein or 10% positive tumor cells. Postoperative follow-up involved examination every 3 months during the first 2 years, every 6 months during years 3 to 5, and every 12 months from year 5. The date of diagnosis of metastasis was defined as the date of confirmation of metastasis after symptoms reported by the patient, detection of clinical signs, or at regular follow-up. The mean follow-up period of the surviving patients (n = 198) was 101 months (range, 20-171). Of the 286 patients included, 93 (33%) showed evidence of distant metastasis within 5 years and were counted as failures in the analysis of distant metastasis-free survival (DMFS). Five patients (2%) died without evidence of disease and were censored at last follow-up. Eighty-three patients (29%) died after a previous relapse. Therefore, a total of 88 patients (31%) were failures in the analysis of overall survival (OS).
EXAMPLE 2
Gene expression analysis of the data obtained in Example 1

Total RNA was isolated from 20 to 40 cryostat sections of 30 μm thickness (50-100 mg) with RNAzol B (Campro Scientific, Veenendaal, The Netherlands). Biotinylated targets were prepared using published methods (Affymetrix, CA; Lipshutz et al. (1999)) and hybridized to the Affymetrix oligonucleotide microarray U133a GeneChip. Arrays were scanned using standard Affymetrix protocols. Each probe set was treated as a separate gene. Expression values were calculated using the Affymetrix GeneChip analysis program MAS 5.0. Chips were rejected if the average intensity was < 40 or if the background signal was > 100. To normalize the chip signals, probe sets were scaled to a target intensity of 600, and scale mask files were not selected.
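The chip rejection criteria and the scaling to a target intensity of 600 can be sketched as follows. This is an illustrative sketch only: the function names are hypothetical, and the scaling uses a plain mean for brevity, whereas the actual MAS 5.0 procedure may differ in detail (for example, by trimming extreme values before averaging).

```python
import numpy as np

def passes_chip_qc(average_intensity, background):
    """Keep a chip unless its average intensity is < 40 or its background is > 100."""
    return average_intensity >= 40 and background <= 100

def scale_to_target(signals, target=600.0):
    """Rescale probe-set signals so that their mean equals the target intensity."""
    signals = np.asarray(signals, dtype=float)
    return signals * (target / signals.mean())

# Hypothetical raw signals from one chip.
scaled = scale_to_target([100.0, 200.0, 300.0])
```

Scaling every chip to the same target makes the signal levels comparable across chips before gene filtering and clustering.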
EXAMPLE 3
Statistical analysis of the genes identified in Example 2

The gene expression data were filtered to include genes called "present" in two or more samples. A total of 17,819 genes passed this filter and were used for hierarchical clustering. Before clustering, the expression level of each gene was divided by its mean expression level in the patients. This standardization step limits the effect of the magnitude of gene expression and groups together genes with similar expression patterns in the cluster analysis. To identify subgroups of patients, hierarchical linkage clustering of both genes and samples was performed using GeneSpring 6.0.

To identify the genes that discriminate patients who developed distant metastases from those who remained metastasis-free for 5 years, two supervised class prediction approaches were used. In the first approach, the 286 patients were randomly assigned to training and test sets of 80 and 206 patients, respectively. Kaplan-Meier survival curves (Kaplan et al. (1958)) for the two sets were examined to ensure that there was no significant difference and that no bias was introduced by the random selection of the training and test sets. In the second approach, patients were assigned to one of two subgroups stratified by ER status. Each patient subgroup was analyzed separately in order to select markers. Patients in the ER-positive subgroup were randomly assigned to training and test sets of 80 and 129 patients, respectively. Patients in the ER-negative subgroup were randomly divided into training and test sets of 35 and 42 patients, respectively. The markers selected from each subgroup training set were combined to form a single signature to predict tumor metastasis for both ER-positive and ER-negative patients in a subsequent independent validation.

The sample size of the training set was determined by a resampling method to ensure its statistical confidence level. Briefly, the number of patients in the training set started at 15 patients and was increased in steps of 5. For a given sample size, 10 training sets were made with randomly selected patients. A gene signature was constructed from each of the training sets and then tested in a designated test set of patients by receiver operating characteristic (ROC) curve analysis, with distant metastasis within 5 years as the defining point. The mean and the coefficient of variation (CV) of the area under the curve (AUC) were calculated for a given sample size. The minimum number of patients required for the training set was chosen at the point where the average AUC reached a plateau and the CV of the 10 AUCs was less than 5%.
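The sample-size rule above (mean AUC reaching a plateau with a CV below 5%) can be sketched as follows. The AUC is computed here with the rank-based (Mann-Whitney) formulation. The function names are hypothetical, and the plateau tolerance `plateau_tol` is an assumption, since the text does not state how the plateau was judged.

```python
import numpy as np

def auc(scores, events):
    """Area under the ROC curve: probability that an event outranks a non-event."""
    scores = np.asarray(scores, dtype=float)
    events = np.asarray(events, dtype=bool)
    pos, neg = scores[events], scores[~events]
    greater = (pos[:, None] > neg[None, :]).sum()
    ties = (pos[:, None] == neg[None, :]).sum()
    return (greater + 0.5 * ties) / (len(pos) * len(neg))

def smallest_stable_size(auc_by_size, cv_limit=0.05, plateau_tol=0.01):
    """Return the first training-set size whose mean AUC has plateaued with CV < cv_limit.

    `auc_by_size` maps a training-set size to the AUCs obtained from the
    repeated, randomly drawn training sets of that size.
    """
    sizes = sorted(auc_by_size)
    for previous, size in zip(sizes, sizes[1:]):
        values = np.asarray(auc_by_size[size], dtype=float)
        mean = values.mean()
        cv = values.std(ddof=1) / mean
        gain = mean - np.mean(auc_by_size[previous])
        if gain < plateau_tol and cv < cv_limit:
            return size
    return None

# Hypothetical AUCs from repeated training sets of increasing size.
example = {
    15: [0.55, 0.65, 0.60],
    20: [0.66, 0.70, 0.68],
    25: [0.68, 0.69, 0.68],
}
chosen = smallest_stable_size(example)
```

In the study, 10 training sets were drawn per size; three values per size are used here only to keep the example short.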
The genes were selected as follows. First, univariate Cox proportional-hazards regression was used to identify genes for which expression (on a log2 scale) was correlated with the length of DMFS. To reduce the effect of multiple testing and to test the robustness of the selected genes, the Cox model was constructed with bootstrapping of the patients in the training set. Efron et al. (1981). Briefly, 400 bootstrap samples of the training set were constructed, each with 80 patients randomly chosen with replacement. A Cox model was run on each of the bootstrap samples. A bootstrap score was created for each gene by removing the top and bottom 5% of the p-values and then averaging the inverses of the remaining p-values. This score was used to rank the genes. To construct a multiple-gene signature, combinations of gene markers were tested by adding one gene at a time according to the rank order. ROC analysis, using distant metastasis within 5 years as the defining point, was performed to calculate the area under the curve (AUC) for each signature with an increasing number of genes, until a maximum AUC value was reached.

The Relapse Score (RS) was used to calculate each patient's risk of distant metastasis. The score was defined as the linear combination of weighted expression signals, with the standardized Cox regression coefficient as the weight:

$$\text{Relapse Score} = A \cdot I + \sum_{i=1}^{60} I \cdot w_i x_i + B \cdot (1 - I) + \sum_{j=1}^{16} (1 - I) \cdot w_j x_j$$

where
I = 1 if the ER level is > 10 fmol per mg of protein, and I = 0 if the ER level is ≤ 10 fmol per mg of protein;
A and B are constants;
w_i is the standardized Cox regression coefficient for the i-th ER-positive marker;
x_i is the expression value of the i-th ER-positive marker on a log2 scale;
w_j is the standardized Cox regression coefficient for the j-th ER-negative marker;
x_j is the expression value of the j-th ER-negative marker on a log2 scale.

The threshold was determined from the ROC curve of the training set to ensure 100% sensitivity and the highest specificity. The values of the constants A (313.5) and B (280) were chosen to center the RS threshold at zero for both ER-positive and ER-negative patients. Patients with positive RS scores were classified into the poor prognosis group, and patients with negative RS scores were classified into the good prognosis group. The gene signature and the cutoffs were validated in the test set. Kaplan-Meier survival plots and log-rank tests were used to assess the differences in time to distant metastasis between the predicted high-risk and low-risk groups. Odds ratios (OR) were calculated as the ratio of the odds of distant metastasis between patients predicted to relapse and those predicted to remain relapse-free.

Univariate and multivariate analyses with Cox proportional-hazards regression were performed on the individual clinical variables with and without the gene signature. The hazard ratio (HR) and its 95% confidence interval (CI) were derived from these results. All statistical analyses were performed using the S-Plus 6.1 program (Insightful, VA).
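The bootstrap score and the Relapse Score defined above can be sketched as follows, reading the score as an ER-dependent constant plus a weighted sum over the markers of the patient's ER subgroup. The function names, the `trim_percent` parameter and the toy weights are hypothetical; only the constants A = 313.5 and B = 280 and the 10 fmol/mg cutoff come from the text.

```python
import numpy as np

def bootstrap_score(p_values, trim_percent=5):
    """Drop the top and bottom `trim_percent` of p-values, then average the inverses."""
    p = np.sort(np.asarray(p_values, dtype=float))
    k = len(p) * trim_percent // 100
    kept = p[k:len(p) - k]
    return float(np.mean(1.0 / kept))

def relapse_score(er_level, x_pos, w_pos, x_neg, w_neg, a=313.5, b=280.0, er_cutoff=10.0):
    """Relapse Score: only the marker set matching the patient's ER status contributes.

    x_pos/w_pos are log2 expression values and standardized Cox coefficients
    for the ER-positive markers; x_neg/w_neg are those for the ER-negative markers.
    """
    indicator = 1 if er_level > er_cutoff else 0
    positive_part = a * indicator + indicator * float(np.dot(w_pos, x_pos))
    negative_part = b * (1 - indicator) + (1 - indicator) * float(np.dot(w_neg, x_neg))
    return positive_part + negative_part

def prognosis(score):
    """Positive scores fall in the poor prognosis group, negative scores in the good one."""
    return "poor" if score > 0 else "good"

# Toy example with two ER-positive markers and one ER-negative marker.
score_er_pos = relapse_score(50.0, x_pos=[3.0, 4.0], w_pos=[1.0, -2.0], x_neg=[2.0], w_neg=[0.5])
score_er_neg = relapse_score(5.0, x_pos=[3.0, 4.0], w_pos=[1.0, -2.0], x_neg=[2.0], w_neg=[0.5])
```

With real data, `w_pos` and `w_neg` would hold the 60 and 16 standardized Cox coefficients, respectively, and the constants shift the score so that a single threshold of zero serves both ER subgroups.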
EXAMPLE 4
Pathway analysis of the genes identified in Example 3

A functional class was assigned to each of the genes in the prognostic gene signature described in Examples 1-3 (not specific to relapse to bone). Pathway analyses were done with the Ingenuity 1.0 program (Ingenuity Systems, CA). Affymetrix probes were used as input to search for the biological networks built by the program. The biological networks identified by the program were evaluated in the context of general functional classes by means of the Gene Ontology (GO) classification. Pathways with two or more genes in the prognostic signature were selected and evaluated.
EXAMPLE 5
Results for Examples 1-4

Patient and tumor characteristics

The clinical and pathological features of the 286 patients in Examples 1-3 are summarized in Table 1.

TABLE 1
Clinical and pathological characteristics of patients and their tumors

| Characteristics | All patients (%) | ER-positive training set (%) | ER-negative training set (%) | Validation set (%) |
|---|---|---|---|---|
| Number | 286 | 80 | 35 | 171 |
| Age (mean ± SD) | 54 ± 12 | 54 ± 13 | 54 ± 13 | 54 ± 12 |
| ≤ 40 years | 36 (13) | 12 (15) | 3 (9) | 21 (12) |
| 41-55 years | 129 (45) | 30 (38) | 17 (49) | 82 (48) |
| 56-70 years | 89 (31) | 28 (35) | 11 (31) | 50 (29) |
| > 70 years | 32 (11) | 10 (13) | 4 (11) | 18 (11) |
| Menopausal status | | | | |
| Premenopausal | 139 (49) | 39 (49) | 16 (46) | 84 (49) |
| Postmenopausal | 147 (51) | 41 (51) | 19 (54) | 87 (51) |
| T stage | | | | |
| T1 | 146 (51) | 38 (48) | 14 (40) | 94 (55) |
| T2 | 132 (46) | 41 (51) | 19 (54) | 72 (42) |
| T3/4 | 8 (3) | 1 (1) | 2 (6) | 5 (3) |
| Grade | | | | |
| Poor | 148 (52) | 37 (46) | 24 (69) | 87 (51) |
| Moderate | 42 (15) | 12 (15) | 3 (9) | 27 (16) |
| Good | 7 (2) | 2 (3) | 2 (6) | 3 (2) |
| Unknown | 89 (31) | 29 (36) | 6 (17) | 54 (32) |
| ER* | | | | |
| Positive | 209 (73) | 80 (100) | 0 (0) | 129 (75) |
| Negative | 77 (27) | 0 (0) | 35 (100) | 42 (25) |
| PgR* | | | | |
| Positive | 165 (58) | 59 (74) | 5 (14) | 101 (59) |
| Negative | 111 (39) | 19 (24) | 29 (83) | 63 (37) |
| Unknown | 10 (3) | 2 (2) | 1 (3) | 7 (4) |
| Metastasis < 5 years | | | | |
| Yes | 93 (33) | 24 (30) | 13 (37) | 56 (33) |
| No | 183 (64) | 51 (64) | 17 (49) | 115 (67) |
| Censored if < 5 years | 10 (3) | 5 (6) | 5 (14) | 0 (0) |

* ER-positive and PgR-positive: > 10 fmol/mg protein or > 10% positive tumor cells.
There were no differences in age or menopausal status. The ER-negative training group had a slightly higher proportion of larger tumors and, as expected, more poor-grade tumors than the ER-positive training group. The validation group of 171 patients (129 ER-positive, 42 ER-negative) did not differ from the total group of 286 patients with respect to any of the patient or tumor characteristics.

Two approaches were used to identify markers predictive of disease relapse. First, the data were divided randomly, so that all 286 patients (ER-positive and ER-negative combined) were placed into a training set and a test set. Thirty-five genes were selected from the 80 patients in the training set, and a Cox model was built to predict the occurrence of distant metastasis. A moderate prognostic value was observed (Table 2). Unsupervised cluster analysis showed two distinct subgroups highly correlated with the ER status of the tumor (chi-square test, p < 0.0001).
TABLE 2

| SEQ ID NO: | Cox coefficient | p value |
|---|---|---|
| 1 | 4.008 | 0.00006 |
| 2 | -3.649 | 0.00026 |
| 3 | 4.005 | 0.00006 |
| 4 | -3.885 | 0.00010 |
| 5 | -3.508 | 0.00045 |
| 6 | -3.176 | 0.00150 |
| 7 | 3.781 | 0.00016 |
| 8 | 3.727 | 0.00019 |
| 9 | -3.570 | 0.00036 |
| 10 | -3.477 | 0.00051 |
| 11 | 3.555 | 0.00038 |
| 12 | -3.238 | 0.00120 |
| 13 | -3.238 | 0.00120 |
| 14 | 3.405 | 0.00066 |
| 15 | 3.590 | 0.00033 |
| 16 | -3.157 | 0.00160 |
| 17 | -3.622 | 0.00029 |
| 18 | -3.698 | 0.00022 |
| 19 | 3.323 | 0.00089 |
| 20 | -3.556 | 0.00038 |
| 21 | -3.317 | 0.00091 |
| 22 | -2.903 | 0.00370 |
| 23 | -3.338 | 0.00085 |
| 24 | -3.339 | 0.00084 |
| 25 | -3.355 | 0.00079 |
| 26 | 3.713 | 0.00021 |
| 27 | -3.325 | 0.00088 |
| 28 | -2.984 | 0.00284 |
| 29 | 3.527 | 0.00042 |
| 30 | -3.249 | 0.00116 |
| 31 | -2.912 | 0.00360 |
| 32 | 3.118 | 0.00182 |
| 33 | 3.435 | 0.00059 |
| 34 | -2.971 | 0.00297 |
| 35 | 3.282 | 0.00103 |

Each subgroup was analyzed in order to select markers. Seventy-six genes were selected from the patients in the training sets (60 for the ER-positive group, 16 for the ER-negative group). With the selected genes and ER status taken together, a Cox model was built to predict cancer relapse (not specific to bone). Validation of the 76-gene predictor in the 171 patients of the test set produced an ROC with an AUC value of 0.694, a sensitivity of 93% (52/56), and a specificity of 48% (55/115). Patients with a relapse score above the threshold of the prognostic signature had an 11.9-fold OR (95% CI: 4.04-35.1, p < 0.0001) of developing distant metastasis within 5 years. As a control, sets of 76 randomly selected genes were generated. These produced ROCs with an average AUC value of 0.515, a sensitivity of 91%, and a specificity of 12% in the test group. Patients stratified by such a gene set would have an odds ratio of 1.3 (0.50-3.90, p = 0.8) for the development of metastasis, indicating a random classification.
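The reported sensitivity, specificity and odds ratio for the 76-gene predictor can be reproduced from the counts given above (52 of 56 patients with metastasis correctly predicted, 55 of 115 metastasis-free patients correctly predicted). The function name below is hypothetical; the arithmetic is the standard 2x2 table calculation.

```python
def two_by_two(true_pos, false_neg, true_neg, false_pos):
    """Return sensitivity, specificity and odds ratio from a 2x2 classification table."""
    sensitivity = true_pos / (true_pos + false_neg)
    specificity = true_neg / (true_neg + false_pos)
    # Odds of metastasis among predicted relapsers divided by the odds
    # among patients predicted to remain relapse-free.
    odds_ratio = (true_pos / false_pos) / (false_neg / true_neg)
    return sensitivity, specificity, odds_ratio

# Counts from the test set of 171 patients: 56 with metastasis within 5 years,
# 115 without.
sensitivity, specificity, odds_ratio = two_by_two(
    true_pos=52, false_neg=4, true_neg=55, false_pos=60
)
```

Rounded, these values give 93%, 48% and 11.9, in agreement with the figures quoted in the text.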
In addition, Kaplan-Meier analyses for distant metastasis-free survival (DMFS) and overall survival (OS) as a function of the 76-gene signature showed highly significant differences in time to metastasis between the groups predicted to have a good and a poor prognosis. At 60 and 80 months, the respective absolute differences in DMFS between the groups with predicted good and poor prognoses were 40% (93% vs. 53%) and 39% (88% vs. 49%), and those in OS were 27% (97% vs. 70%) and 32% (95% vs. 63%). The 76-gene profile also represented a strong prognostic factor for the development of distant metastasis in the subgroups of 84 premenopausal patients (HR: 9.60), 87 postmenopausal patients (HR: 4.04) and 79 patients with tumor sizes of 10 to 20 mm (HR: 14.1). The univariate and multivariate Cox regression analyses are summarized in Table 3.
TABLE 3
Univariate and multivariate analyses for DMFS in the test set of 171 lymph node-negative patients

| Variable | Univariate HR (95% CI)† | p value | Multivariate HR (95% CI)*† | p value |
|---|---|---|---|---|
| Age‡: Age2 vs. Age1 | 1.16 (0.51-2.65) | 0.7180 | 1.14 (0.45-2.91) | 0.7809 |
| Age‡: Age3 vs. Age1 | 1.32 (0.56-3.10) | 0.5280 | 0.87 (0.26-2.93) | 0.8232 |
| Age‡: Age4 vs. Age1 | 0.95 (0.32-2.82) | 0.9225 | 0.61 (0.15-2.60) | 0.5072 |
| Menopausal status§ | 1.24 (0.76-2.03) | 0.3909 | 1.53 (0.68-3.44) | 0.3056 |
| Stage\|\| | 1.08 (0.66-1.77) | 0.7619 | 2.57 (0.23-29.4) | 0.4468 |
| Differentiation¶ | 0.38 (0.16-0.90) | 0.0281 | 0.60 (0.24-1.46) | 0.2590 |
| Tumor size** | 1.06 (0.65-1.74) | 0.8158 | 0.34 (0.03-3.90) | 0.3849 |
| ER†† | 1.09 (0.61-1.98) | 0.7649 | 1.05 (0.54-2.04) | 0.8935 |
| PR†† | 0.83 (0.51-1.38) | 0.4777 | 0.85 (0.47-1.53) | 0.5882 |
| 76-gene signature | 5.67 (2.59-12.4) | 1.5 x 10^-5 | 5.55 (2.46-12.5) | 3.6 x 10^-5 |

* The multivariate model included 162 patients, due to missing values in 9 patients.
† Hazard ratio and 95% confidence interval.
‡ Age1 is ≤ 40 years, Age2 is 41 to 55 years, Age3 is 56 to 70 years, Age4 is > 70 years.
§ Postmenopausal vs. premenopausal.
|| Stage II & III vs. I.
¶ Grade: moderate/good vs. poor; unknown grade was included as a separate group.
** Tumor size: > 20 mm vs. ≤ 20 mm.
†† Positive vs. negative.

Apart from the 76-gene signature, only grade was significant in the univariate analysis, and moderate/good differentiation was associated with favorable DMFS. The multivariate regression estimate of the HR for the occurrence of tumor metastasis within 5 years was 5.55 (p < 0.0001), indicating that the 76-gene set represents an independent prognostic signature strongly associated with a higher risk of tumor metastasis.
The univariate and multivariate analyses were also done separately for ER-positive and ER-negative patients; the 76-gene signature was also an independent prognostic variable in the subgroups stratified by ER status. The function of the 76 genes (Table 4) in the prognostic signature, which is not specific to bone, was analyzed to relate the genes to biological pathways.
TABLE 4

| ER status | SEQ ID NO. | Standardized Cox coefficient | Cox p value |
|---|---|---|---|
| + | 36 | -3.83 | 0.00005 |
| + | 37 | -3.865 | 0.00001 |
| + | 38 | 3.63 | 0.00002 |
| + | 39 | -3.471 | 0.00016 |
| + | 40 | 3.506 | 0.00008 |
| + | 41 | -3.476 | 0.00001 |
| + | 42 | 3.392 | 0.00006 |
| + | 43 | -3.353 | 0.00080 |
| + | 44 | -3.301 | 0.00038 |
| + | 45 | 3.101 | 0.00033 |
| + | 46 | -3.174 | 0.00128 |
| + | 47 | 3.083 | 0.00020 |
| + | 48 | 3.336 | 0.00005 |
| + | 49 | -3.054 | 0.00063 |
| + | 50 | -3.025 | 0.00332 |
| + | 51 | 3.095 | 0.00044 |
| + | 52 | -3.175 | 0.00031 |
| + | 53 | -3.082 | 0.00086 |
| + | 54 | 3.058 | 0.00016 |
| + | 55 | 3.085 | 0.00009 |
| + | 56 | -2.992 | 0.00040 |
| + | 57 | -2.791 | 0.00020 |
| + | 58 | -2.948 | 0.00039 |
| + | 59 | 2.931 | 0.00020 |
| + | 60 | -2.896 | 0.00052 |
| + | 61 | 2.924 | 0.00050 |
| + | 62 | 2.915 | 0.00055 |
| + | 63 | -2.968 | 0.00099 |
| + | 64 | 2.824 | 0.00086 |
| + | 65 | -2.777 | 0.00398 |
| + | 66 | -2.635 | 0.00160 |
| + | 67 | -2.854 | 0.00053 |
| + | 68 | 2.842 | 0.00051 |
| + | 69 | -2.835 | 0.00033 |
| + | 70 | 2.777 | 0.00164 |
| + | 71 | -2.759 | 0.00222 |
| + | 72 | -2.745 | 0.00086 |
| + | 73 | 2.79 | 0.00049 |
| + | 74 | 2.883 | 0.00031 |
| + | 75 | -2.794 | 0.00139 |
| + | 76 | -2.743 | 0.00088 |
| + | 77 | -2.761 | 0.00164 |
| + | 78 | -2.831 | 0.00535 |
| + | 79 | 2.659 | 0.00073 |
| + | 80 | -2.715 | 0.00376 |
| + | 81 | 2.836 | 0.00029 |
| + | 82 | -2.687 | 0.00438 |
| + | 83 | -2.631 | 0.00226 |
| + | 84 | -2.716 | 0.00089 |
| + | 85 | 2.703 | 0.00232 |
| + | 86 | -2.641 | 0.00537 |
| + | 87 | -2.686 | 0.00479 |
| + | 88 | -2.654 | 0.00363 |
| + | 89 | 2.695 | 0.00095 |
| + | 90 | -2.758 | 0.00222 |
| + | 91 | 2.702 | 0.00084 |
| + | 92 | -2.694 | 0.00518 |
| + | 93 | 2.711 | 0.00049 |
| + | 94 | -2.771 | 0.00156 |
| + | 95 | 2.604 | 0.00285 |
| - | 96 | -3.495 | 0.00011 |
| - | 97 | 3.224 | 0.00036 |
| - | 98 | -3.225 | 0.00041 |
| - | 99 | -3.145 | 0.00057 |
| - | 100 | -3.055 | 0.00075 |
| - | 101 | -3.037 | 0.00091 |
| - | 102 | -3.066 | 0.00072 |
| - | 103 | 3.06 | 0.00077 |
| - | 104 | -2.985 | 0.00081 |
| - | 105 | -2.983 | 0.00104 |
| - | 106 | -3.022 | 0.00095 |
| - | 107 | -3.054 | 0.00082 |
| - | 108 | -3.006 | 0.00098 |
| - | 109 | -2.917 | 0.00134 |
| - | 110 | -2.924 | 0.00149 |
| - | 111 | -2.882 | 0.0017 |

Although 18 of the 76 genes have an unknown function, several pathways or biochemical activities were identified that are well represented, such as cell death, cell cycle and proliferation, DNA replication and repair, and the immune response (Table 5).
TABLE 5
Pathway analysis of the 76 genes of the prognostic signature

| Functional class | Genes in the 76-gene signature |
|---|---|
| Cell death | TNFSF10, TNFSF13, MAP4, CD44, IL18, GAS2, NEFL, EEF1A2, BCLG, C3 |
| Cell cycle | CCNE2, CD44, MAP4, SMC4L1, TNFSF10, AP2A2, FEN1, KPNA2, ORC3L, PLK1 |
| Proliferation | CD44, IL18, TNFSF10, TNFSF13, PPP1CC, CAPN2, PLK1, SAT |
| DNA replication, recombination/repair | TNFSF10, SMC4L1, FEN1, ORC3L, KPNA2, SUPT16H, POLQ, ADPRTL1 |
| Immune response | TNFSF10, CD44, IL18, TNFSF13, ARHGDIB, C3 |
| Growth | PPP1CC, CD44, IL18, TNFSF10, SAT, HDGFRP3 |
| Cellular assembly and organization | MAP4, NEFL, TNFSF10, PLK1, AP2A2, SMC4L1 |
| Transcription | KPNA2, DUSP4, SUPT16H, DKFZP434E2220, PHF11, ETV2 |
| Cell-to-cell signaling and interaction | CD44, IL18, TNFSF10, TNFSF13, C3 |
| Cell survival | TNFSF10, TNFSF13, CD44, NEFL |
| Development | IL18, TNFSF10, COL2A1 |
| Cell morphology | CAPN2, CD44, TACC2 |
| Protein synthesis | IL18, TNFSF10, EEF1A2 |
| ATP binding | PRO2000, URKL1, ACACB |
| DNA binding | HIST1H4H, DKFZP434E2220, PHF11 |
| Colony formation | CD44, TNFSF10 |
| Adhesion | CD44, TMEM8 |
| Neurogenesis | CLN8, NEURL |
| Golgi apparatus | GOLPH2, BICD1 |
| Kinase activity | CNK1, URKL1 |
| Transferase activity | FUT3, ADPRTL1 |

Genes implicated in disease progression were found, including calpain 2, origin recognition protein, dual-specificity phosphatases, Rho-GDP dissociation inhibitor, TNF superfamily protein, complement component 3, microtubule-associated protein, protein phosphatase 1 and apoptosis regulator BCL-G. In addition, previously characterized prognostic genes, such as cyclin E2 (Keyomarsi et al. (2002)) and CD44 (Herrera-Gayol et al. (1999)), were in the gene signature.
The patients who provided the samples did not receive adjuvant systemic therapy, so the assessment of the multigene prognostic signature was not subject to potentially confounding contributions from predictive factors related to systemic treatment. From this analysis, a 76-gene signature was created that accurately predicts distant tumor relapse but is not specifically prognostic of relapse to bone. This signature is applicable to patients with relapsed breast cancer regardless of age, tumor size and grade, and ER status. In the multivariate Cox analysis for DMFS, the 76-gene signature was the only significant variable, superseding the clinical variables, including grade. After 5 years, the absolute differences in DMFS and OS between patients with good and poor 76-gene signatures were 40% and 27%, respectively. Of the patients with a good prognostic signature, 7% developed distant metastasis and 3% died within 5 years. If validated further, this prognostic signature will provide a positive predictive value of 37% and a negative predictive value of 95%, assuming a 25% rate of disease relapse in patients with breast cancer. In particular, this signature can be valuable in defining the risk of relapse for the increasing proportion of T1 tumors (<2 cm). The comparison with the St. Gallen and NIH guidelines was instructive. While ensuring that the same number of high-risk patients would receive the necessary treatment, the 76-gene signature would recommend adjuvant systemic chemotherapy for only 52% of low-risk patients, compared with 90% and 89% under the St. Gallen and NIH guidelines, respectively (Table 6).
TABLE 6 Comparison of the 76-gene signature and the current conventional consensus on the treatment of breast cancer
Patients guided to receive adjuvant chemotherapy in the test set
Method               Metastatic disease at 5 years (%)   Metastasis-free at 5 years (%)
St. Gallen           52/55 (95)                          104/115 (90)
NIH                  52/55 (95)                          101/114 (89)
76-gene signature    52/56 (93)                          60/115 (52)
Conventional consensus criteria. St. Gallen: tumor > 2 cm, ER-negative, grade 2-3, patient < 35 years (any one of these criteria); NIH: tumor > 1 cm.
The 76-gene signature can thus reduce the number of low-risk patients who would be recommended unnecessary adjuvant systemic therapy.
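As a check on the arithmetic, the quoted predictive values (37% and 95% at an assumed 25% relapse rate) can be approximately recovered from the Table 6 counts with Bayes' rule. The following is only an illustrative sketch. It assumes that sensitivity is the fraction of patients with metastatic disease at 5 years who are guided to chemotherapy (52/56) and that specificity is the fraction of metastasis-free patients who are not (55/115). The function name is hypothetical.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) computed from sensitivity, specificity and prevalence."""
    true_pos = sensitivity * prevalence
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    true_neg = specificity * (1.0 - prevalence)
    false_neg = (1.0 - sensitivity) * prevalence
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Counts taken from the 76-gene row of Table 6 (interpretation is an assumption).
sensitivity = 52 / 56            # relapsing patients flagged as high risk
specificity = (115 - 60) / 115   # metastasis-free patients not flagged

ppv, npv = predictive_values(sensitivity, specificity, prevalence=0.25)
print(f"PPV = {ppv:.0%}, NPV = {npv:.0%}")
```

Under these assumptions the values round to the 37% and 95% stated in the text.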
The 76 genes in the prognostic signature belong to many functional classes, suggesting that different pathways could lead to disease progression. The signature included well-characterized genes and 18 unknown genes. This finding could explain the superior performance of the signature compared with other prognostic factors. Although genes involved in cell death, cell proliferation and transcriptional regulation were found in both patient groups stratified by ER status, the 60 genes selected for the ER-positive group and the 16 genes selected for the ER-negative group did not overlap. This result supports the idea that the degree of heterogeneity and the underlying mechanisms of disease progression may differ between the two ER-based subgroups of patients with breast cancer. Comparison of these results with those of the study by van de Vijver et al. (2002) is difficult because of differences in the patients, techniques and materials used. van de Vijver et al. included both node-negative and node-positive patients, who had or had not received adjuvant systemic therapy, and only women younger than 53 years. In addition, the microarray platforms used in the studies are different, Affymetrix vs. Agilent. Of the 70 genes in the van't Veer study (2002), only 48 are present on the Affymetrix U133a array, while of the 76 genes in this profile, only 38 are present on the Agilent array. There is an overlap of 3 genes between the two signatures (cyclin E2, origin recognition complex and TNF superfamily protein). Despite this apparent difference, both signatures included genes that identified several common pathways that may be involved in tumor relapse. These findings support the idea that, although there may be redundancy among member genes, effective signatures may be required to include representation of specific pathways.
The strengths of the study described above compared with the study by van de Vijver et al. (2002) are the larger number of untreated patients with relapse (286 vs. 141), and the independence of the 76-gene signature from age, menopausal status and tumor size. The validation set of patients in this procedure does not overlap at all with the training set, in contrast to 90% of other reports. Ransohoff (2004). In conclusion, since approximately 30-40% of untreated patients develop tumor relapse, the prognostic signature can provide a powerful tool to identify low-risk patients and avoid overtreatment in a substantial number of patients. The recommendation of adjuvant systemic therapy in patients with primary breast cancer could be guided in the future by this prognostic signature. The preferred profiles described in Examples 1-5 (for the risk of relapse generally) are the 35-gene portfolio made up of the genes of SEQ ID NOs: 1-35; the 60-gene portfolio made up of the genes of SEQ ID NOs: 36-95, which is best used for prognosis in ER-positive patients; and the 16-gene portfolio made up of the genes of SEQ ID NOs: 96-111, which is best used for prognosis in ER-negative patients.
EXAMPLE 6 Comparison of breast tumor gene profiles generated from laser capture microdissection and bulk tissue in stage I / II breast cancer Gene expression profiling has been shown to be a powerful diagnostic and prognostic tool for a variety of cancers. In almost all cases, bulk tumor RNA was used for hybridization on the chip. Estrogens play important roles in the development and growth of hormone-dependent tumors. Approximately 75% of breast cancers express the estrogen receptor (ER), which is an indicator for (adjuvant) tamoxifen treatment and is associated with patient outcomes. To gain an understanding of the mechanisms triggered by estrogen in breast epithelial cells and their association with tumorigenesis, laser capture microdissection (LCM) was used to obtain a histologically homogeneous population of tumor cells from 29 primary breast tumors, in combination with GeneChip expression analysis. Of these 29 patients, 11 were negative for ER and 17 were positive for ER, based on quantitative ligand binding or enzyme immunoassays in tumor cytosols. For comparison, gene expression profiles were also obtained using bulk tissue RNA isolated from the same group of 29 patients.
Fresh frozen tissue samples were collected from 29 lymph-node-negative breast cancer patients who had been treated surgically for the breast tumor and had not received adjuvant systemic therapy. For each patient tissue sample, an H&E section was first used to evaluate cell morphology. RNA was isolated both from tumor cells obtained by LCM (PALM) performed on cryostat sections and from whole cryostat sections, i.e., the bulk tissue of the same tumor. The quality of the RNA samples was analyzed on an Agilent BioAnalyzer. The RNA samples were hybridized to an Affymetrix human U133A chip containing approximately 22,000 probe sets. The fluorescence was quantified and the intensities were normalized. Clustering analysis and Principal Component Analysis were used to group the patients with similar gene expression profiles. The genes that are differentially expressed between ER-positive and ER-negative samples were selected. Total RNA isolated from LCM-procured breast cancer cells underwent two rounds of T7-based amplification during target preparation, versus one round of amplification for bulk tissue RNA. The expression levels of 21 control genes (Table 7) were compared between the LCM data set and the bulk tissue data set to demonstrate the fidelity of the linear amplification.
TABLE 7 List of control genes
The results obtained are described in Table 8.
TABLE 8 Clinical characteristics of patients
A hierarchical clustering based on 5121 genes showed that LCM and bulk tissue samples are completely separated on the basis of their global RNA expression profiles. Comparison of the expression levels of the 21 control genes in RNA isolated from LCM samples and from bulk tissue showed that the additional round of linear amplification used for the RNA obtained by LCM did not cause differential expression of the control genes.
The genes differentially expressed between the ER-positive and ER-negative subgroups in LCM and bulk tissue samples were defined by Student's t-test. Pathway analyses by Gene Ontology were performed for genes associated with ER exclusively in LCM samples, exclusively in bulk samples, and for those common to LCM and bulk tissue. The results support several important conclusions. First, genes related to cell proliferation and energy metabolism were observed to be differentially expressed between ER-negative and ER-positive patients in both the bulk tissue data set and the LCM data set. Second, owing to the enrichment of breast cancer cells via LCM, genes involved in cell surface receptor-linked signal transduction, RAS signal transduction, JAK-STAT signal transduction and apoptosis were found to be associated with ER status. These genes were not identified in the bulk data set. Third, microdissection provides a sensitive method for studying epithelial tumor cells and gaining insight into the signaling pathways associated with estrogen receptors. Therefore, it is clear that the application of the gene expression profiles described herein to tumor cells isolated by LCM is commensurate with the results obtained in heterogeneous bulk tissue.
EXAMPLE 7 Validation and pathway analysis of the 76-gene prognostic signature in breast cancer This Example reports the results of a validation study in which the 76 genes were used to predict the outcomes of 132 patients obtained from 4 independent sources. In addition, in order to evaluate the robustness of the gene signature, this Example also identifies the substitutable components of the signature and describes how substitutions lead to the identification of key pathways in an effective signature. Fresh frozen tissue samples were collected from 132 patients who had been treated surgically for a breast tumor and who had not received adjuvant systemic therapy. The patient samples used were collected between 1980 and 1996. For each patient tissue sample, an H&E section was used to evaluate cell morphology. Total RNA samples were prepared, and sample quality was analyzed on an Agilent BioAnalyzer. The RNA samples were analyzed by microarray analysis. The fluorescence was quantified and the intensities were normalized. A relapse risk score was calculated for each patient, based on the expression levels of the 76-gene signature. The patients were classified into good-outcome and poor-outcome groups.
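To illustrate how a per-patient relapse risk score of this kind might be computed, the sketch below uses a plain weighted sum of log2 expression values followed by a cut-off. This is an assumption made for illustration and is not necessarily the scoring formula used in this Example. The gene names, weights, expression values and cut-off are all hypothetical.

```python
import math


def relapse_score(expression, weights):
    """Weighted sum of log2 expression over the genes in the signature.

    `weights` maps gene name -> coefficient (for example a standardized Cox
    coefficient); `expression` maps gene name -> normalized intensity.
    """
    return sum(w * math.log2(expression[gene]) for gene, w in weights.items())


def classify(score, cutoff=0.0):
    """Assign the poor-outcome group when the score exceeds the cut-off."""
    return "poor" if score > cutoff else "good"


# Hypothetical three-gene signature and one hypothetical patient.
weights = {"GENE_A": 3.5, "GENE_B": -3.8, "GENE_C": 3.1}
patient = {"GENE_A": 1024.0, "GENE_B": 256.0, "GENE_C": 64.0}

score = relapse_score(patient, weights)
print(score, classify(score))
```

Positive weights raise the score when a gene is highly expressed and negative weights lower it, which is how the sign of a Cox coefficient would be expected to act in a linear score.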
In order to evaluate the robustness of this gene signature, two statistical analyses were designed and used. First, the procedures for gene selection and signature construction that were used to discover the 76-gene signature were repeated. As shown in Table 9A, ten training sets of 115 patients each were randomly selected from the total of 286 patients. The remaining patients served as the test set. Second, the number of patients in the training set was increased to 80% of the 286 patients and the remaining 20% of the patients were used as the test set. This selection procedure was repeated 10 times. In both procedures, Kaplan-Meier survival curves were used to ensure that there was no significant difference in disease-free survival between each training and test set pair. The genes were selected and a signature was constructed from each of the training sets using Cox proportional hazards regression. Each signature was validated in the corresponding test set. In addition, the genes of the 76-gene prognostic signature were assigned to functional groups using the GO ontology classification. The pathways that cover significant numbers of genes in the signature were selected (p value < 0.05 and > 2 hits). The selected pathways were also evaluated in all the prognostic signatures derived from the different training sets.
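The two resampling schemes described above can be sketched as follows. This is a minimal illustration that assumes simple random sampling without replacement. It omits the Kaplan-Meier comparison used to check each training and test pair, and the function name and seeds are hypothetical.

```python
import random


def random_splits(patient_ids, train_size, n_repeats, seed=0):
    """Draw `n_repeats` random training sets; the remainder is the test set."""
    rng = random.Random(seed)
    splits = []
    for _ in range(n_repeats):
        train = set(rng.sample(patient_ids, train_size))
        test = [p for p in patient_ids if p not in train]
        splits.append((sorted(train), test))
    return splits


patients = list(range(286))  # placeholder identifiers for the 286 patients

# Scheme 1: ten training sets of 115 patients each.
splits_115 = random_splits(patients, train_size=115, n_repeats=10)

# Scheme 2: ten training sets of 80% of the patients.
splits_80pct = random_splits(patients, train_size=round(0.8 * 286),
                             n_repeats=10, seed=1)

print(len(splits_115[0][0]), len(splits_115[0][1]))
print(len(splits_80pct[0][0]), len(splits_80pct[0][1]))
```

In a fuller implementation each split would be accepted only if disease-free survival did not differ significantly between its training and test sets, as the text requires.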
TABLE 9A Results from 10 signatures using training sets of 115 patients
TABLE 9B Results from 10 signatures using training sets of 80% of the patients
The results obtained in this Example show the following. The 76-gene signature was successfully validated in 132 independent breast cancer patients with relapse from 4 independent sources, giving an AUC value of 0.757. The signature shows a sensitivity of 88% and a specificity of 41%. The average AUC for the substitute signatures is 0.64 (95% CI: 0.53-0.72). This result is consistent with that of the 76-gene predictor (AUC of 0.69). Twenty-one pathways overrepresented in the 76-gene signature were also found in all the other prognostic signatures, suggesting that common biological pathways are involved in tumor relapse. These results suggest that gene expression profiles provide a powerful procedure for assessing the risk of a patient's outcome. The data highlight the feasibility of a molecular prognostic assay that provides patients with a quantitative measure of tumor relapse.
EXAMPLE 8 Bone relapse signatures From the set of samples used to establish the 76-gene profile for predicting distant relapse, 107 samples were selected for the additional study of bone relapse. These samples were selected because the relapse site was known, so the samples could be grouped into sets of distant bone and non-bone relapse. Samples classified as bone relapse included those that had relapsed to bone and possibly also to other parts of the body. The remaining patient samples with relapse were marked as non-bone. The information regarding the samples used in these analyses is shown in Table 10. Two different analyses were carried out. First, Significance Analysis of Microarrays (SAM) was used to identify the genes differentially expressed in cases of relapse to bone relative to relapse elsewhere (i.e., non-bone). In the second analysis, a predictor of bone relapse was established to determine the probability of a patient relapsing to bone. This analysis is referred to as Prediction Analysis of Microarrays (PAM). In the SAM analysis, 300 permutations of the data were used to calculate a false discovery rate (FDR). Genes were considered significant when the FDR was below 5% and when a difference in expression level of at least 1.7-fold was observed. To construct a diagnostic profile that would be useful for distinguishing those who (among those likely to relapse) are likely to relapse to bone, the samples were divided into a training set (n = 72, 46 with a relapse to bone and 26 without a relapse to bone) and a test set (n = 35, 23 with a relapse to bone and 12 without a relapse to bone), stratified by relapse site, ER protein level and metastasis-free interval. A gene selection step was performed using an optimal cut-off procedure on the training set samples.
All measured expression levels of a gene were used in turn as the cut-off point for calling the gene "high" or "low" in a particular sample, maintaining a minimum of 20 samples in one of the groups. Knowing the relapse site of these samples, the frequencies of the categories High / Bone, Low / Bone, High / Non-bone and Low / Non-bone were counted for each cut-off. The optimal cut-off was determined using the χ² distribution. Genes were included in the Prediction Analysis of Microarrays (PAM) if the maximum χ² score was 10.827 or higher (p < 0.001). TFF1 was the most significant gene (from a statistical point of view) in the gene profiles established through this procedure. Additional experiments to determine TFF1 mRNA levels by quantitative RT-PCR were performed using the following pair of primers (TGGAGCAGAGAGGAGGCAAT and ACGAACGGTGTCGTCGAAAC). The samples selected for the RT-PCR study were matched for the patient and tumor characteristics listed in Table 10. Expression levels were expressed relative to a panel of housekeeping genes and log2-transformed. The difference in expression levels of the TFF1 gene was correlated with the two relapse groups, and the p values were calculated using Kruskal-Wallis ANOVA, approximated by χ² and corrected for ties. Statistical analyses were performed using the Analyse-it software (Analyse-it Software Ltd, Leeds, United Kingdom).
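The optimal cut-off selection described above can be sketched for a single gene as follows. This is a minimal illustration with two stated assumptions: the minimum of 20 samples is applied to both the "high" and the "low" group, and a Pearson χ² statistic without continuity correction is used. The function names and the toy data are hypothetical.

```python
def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (1 df, no continuity correction) for [[a, b], [c, d]]."""
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    if denom == 0:
        return 0.0
    return n * (a * d - b * c) ** 2 / denom


def best_cutoff(values, is_bone, min_group=20):
    """Try every measured level as cut-off; return (max chi-square, cut-off)."""
    best_score, best_cut = 0.0, None
    for cut in sorted(set(values)):
        high = [v >= cut for v in values]
        n_high = sum(high)
        n_low = len(values) - n_high
        if min(n_high, n_low) < min_group:
            continue  # keep at least `min_group` samples in each group
        high_bone = sum(h and bone for h, bone in zip(high, is_bone))
        high_non_bone = n_high - high_bone
        low_bone = sum((not h) and bone for h, bone in zip(high, is_bone))
        low_non_bone = n_low - low_bone
        score = chi_square_2x2(high_bone, high_non_bone, low_bone, low_non_bone)
        if score > best_score:
            best_score, best_cut = score, cut
    return best_score, best_cut


# Toy training set: 72 samples, 26 non-bone (low values) and 46 bone (high values).
values = list(range(72))
is_bone = [v >= 26 for v in values]

score, cut = best_cutoff(values, is_bone)
keep_gene = score >= 10.827  # inclusion threshold stated in the text (p < 0.001)
print(score, cut, keep_gene)
```

In this deliberately clean toy case the expression values separate the two groups perfectly, so the gene passes the threshold. Real data would give smaller scores.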
Results SAM analysis The samples described above were classified according to relapse site; 69 samples were marked as bone and 38 as non-bone.
Using SAM, 73 probe sets representing 69 unique genes were found to be significantly differentially expressed between bone and non-bone samples. The 5 highest-ranked genes were TFF1, TFF3, AGR2, NAT1 and CRIP1, all of which were expressed at higher levels in the samples with bone relapse. The highest-ranked gene, TFF1, was studied in 122 independent breast tumors by quantitative RT-PCR. The expression of TFF1 was significantly associated with the relapse site (p = 0.0015), with a relative mean expression level and 95% CI for TFF1 of 3.02 (1.41 to 4.66) and -1.63 (-5.44 to 2.49) for the bone and non-bone relapse groups, respectively. Genes corresponding to SEQ ID NOs: 112-147 were expressed at higher levels in the bone relapse samples. The rest were expressed at lower levels.
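The selection thresholds stated for the SAM analysis (an FDR below 5% and at least a 1.7-fold difference in expression) amount to a simple filter once per-gene statistics are available. The sketch below shows only that filtering step. It does not implement SAM itself (the permutation-based statistic and FDR estimate), and the gene names and numbers are hypothetical placeholders.

```python
def select_genes(results, max_fdr=0.05, min_fold=1.7):
    """Keep genes with FDR below `max_fdr` and fold difference >= `min_fold`.

    `results` maps gene -> (mean in bone, mean in non-bone, FDR), with means
    on a linear (not log) scale.
    """
    selected = []
    for gene, (mean_bone, mean_non_bone, fdr) in results.items():
        fold = max(mean_bone, mean_non_bone) / min(mean_bone, mean_non_bone)
        if fdr < max_fdr and fold >= min_fold:
            direction = ("higher in bone" if mean_bone > mean_non_bone
                         else "lower in bone")
            selected.append((gene, round(fold, 2), direction))
    return selected


# Hypothetical per-gene summaries.
results = {
    "GENE_A": (800.0, 400.0, 0.01),  # passes both criteria
    "GENE_B": (300.0, 280.0, 0.02),  # fold difference too small
    "GENE_C": (120.0, 260.0, 0.04),  # passes, lower in bone
    "GENE_D": (900.0, 300.0, 0.20),  # FDR too high
}

selected = select_genes(results)
print(selected)
```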
PAM analysis Samples were divided into a training set (n = 72) and a test set (n = 35), stratified by relapse site, ER protein level and metastasis-free interval. Using the optimal cut-off procedure, 588 informative genes were selected for inclusion in the PAM analysis. After a 10-fold cross-validation of the training set, a 31-gene predictor was selected that could identify the bone relapse samples in the test set with a sensitivity of 100% and a specificity of 50%. The predictor showed a positive predictive value of 79.3% and misclassified 17% of the samples. Seventeen genes in the profile, including TFF1, were also present in the list of SAM genes (all 31 genes are indicated in the column "PAM" in Table 11). To determine the validity of the gene set, 50 sets of 100 randomly chosen genes were also analyzed. These random gene sets were used as input to a PAM analysis using the same training and test sets. The mean percentage of misclassified samples was 28.5% (SD 4.3%). This indicates that the 17% of misclassified samples obtained with the current PAM gene list is significantly lower (z = 2.67, two-tailed p = 0.008) than that obtained with the random gene sets.
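The figures reported for the PAM predictor are mutually consistent, as the short calculation below illustrates. It assumes the test set composition given in the methods (23 bone and 12 non-bone relapses) and derives the confusion-matrix counts implied by the stated sensitivity and specificity. The z value is computed from the rounded percentages quoted in the text, so the p value is approximate.

```python
import math

# Test set composition from the methods: 23 bone and 12 non-bone relapses.
n_bone, n_non_bone = 23, 12

true_pos = n_bone             # sensitivity of 100%: all bone relapses detected
true_neg = n_non_bone // 2    # specificity of 50%: 6 of 12 non-bone correct
false_pos = n_non_bone - true_neg
false_neg = n_bone - true_pos

ppv = true_pos / (true_pos + false_pos)
misclassified = (false_pos + false_neg) / (n_bone + n_non_bone)

# Comparison with random gene sets: mean 28.5% misclassified, SD 4.3%.
z = (28.5 - 17.0) / 4.3
p_two_tailed = math.erfc(abs(z) / math.sqrt(2))  # two-tailed normal p value

print(f"PPV = {ppv:.1%}, misclassified = {misclassified:.0%}")
print(f"z = {z:.2f}, two-tailed p = {p_two_tailed:.4f}")
```

The positive predictive value (23/29) and the misclassification rate (6/35) reproduce the 79.3% and 17% quoted above, and the z value agrees with the reported 2.67.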
TABLE 10 Clinical and tumor characteristics of the patients for SAM and PAM
Characteristics        All patients   Relapse to bone   No relapse to bone
Number                 107            69                38
Age (mean ± SD)        53 ± 12        52 ± 12           54 ± 11
  ≤ 40 years           16 (15%)       12 (17%)          4 (11%)
  41-55 years          49 (46%)       32 (46%)          17 (45%)
  56-70 years          34 (32%)       20 (29%)          14 (37%)
  > 70 years           8 (7%)         5 (7%)            3 (8%)
Menopausal status
  Premenopausal        51 (48%)       33 (48%)          18 (47%)
  Postmenopausal       56 (52%)       36 (52%)          20 (53%)
T stage
  T1                   54 (50%)       38 (55%)          16 (42%)
  T2                   50 (47%)       31 (45%)          19 (50%)
  T3 / 4               3 (3%)         0 (0%)            3 (8%)
Grade
  Poor                 61 (57%)       39 (57%)          22 (58%)
  Good-Moderate        10 (9%)        9 (13%)           1 (3%)
  Unknown              36 (34%)       21 (30%)          15 (39%)
ER *†
  Positive             80 (75%)       57 (83%)          23 (61%)
  Negative             27 (25%)       12 (17%)          15 (39%)
PgR *
  Positive             56 (52%)       38 (55%)          18 (47%)
  Negative             48 (45%)       28 (41%)          20 (52%)
  Unknown              3 (3%)         3 (4%)            0 (0%)
* ER and PgR are defined as positive when tumors contain > 10 fmol / mg protein or > 10% positive tumor cells.
† Patient characteristics are equally distributed between bone and non-bone relapses, except for ER status (p value = 0.02), calculated using the χ² distribution.
TABLE 11 Genes involved in the bone metastasis of breast cancer
† Genes identified by the PAM analysis; the genes of this analysis that were not identified by SAM are appended after the genes identified by SAM.
Pathway Analysis for the Bone Relapse Signature Differentially expressed genes were compared with those found in the Gene Ontology and KEGG databases. Since only 8 genes from the SAM list were listed in the KEGG database, that list was merged with a recently published bone metastasis profile. In that study, Kang et al. generated gene expression profiles of subclones of the ER-negative breast cancer cell line MDA-MB-231 which, when injected into mice, relapsed to bone either poorly or efficiently. The genes differentially expressed between these two subtypes were considered the bone relapse signature. Since Kang et al. used the same microarrays as those used in the previous examples, it was convenient to merge their list of 127 probe sets (122 unique genes) with the list of SAM genes (n = 69). Although the two profiles share only one gene (BENE), they may involve common pathways. For this purpose, both lists were mapped to the KEGG database. In total, 20 genes were annotated in KEGG, and 5 of these 20 genes (FGF5, SOS1 and DUSP1 (Kang list) and FGFR3 and DUSP4 (SAM)) were located in the FGFR-p42 / 44 MAP kinase pathway; this number of genes is statistically different from that of a random data set (p < 0.0001). All 5 genes were upregulated in bone-metastasizing cells / tumors. The 142 genes from the combined list that were found in the Gene Ontology database were studied. It was determined whether Gene Ontology descriptions were overrepresented in the merged SAM / Kang list compared with all the genes printed on the U133a chip. Overrepresented annotations indicate biological processes that are possibly linked to the relapse site. For example, the description "extracellular" was linked to 21 of the 142 (14.8%) genes of the bone marker list, while 1350 of the 16367 genes (8.2%) on the U133a chip were annotated with this description.
This means that "extracellular" is 1.8 times overrepresented (p = 0.006, χ² distribution) in the bone relapse list. Other examples are "cell adhesion" (17 genes, p = 0.0007) and "cell organization and biogenesis" (22 genes, p = 2.3 × 10⁻⁵), found to be 2.2 and 2.4 times overrepresented, respectively. In addition, "immune response" was significant (p = 8.7 × 10⁻⁵) but, in contrast to the descriptions mentioned above, the genes linked to "immune response" originated predominantly from Kang's list.
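The overrepresentation figures for the "extracellular" annotation can be retraced with the counts given above. The sketch below computes the fold enrichment and a Pearson χ² test on a 2 × 2 table that treats the merged list and the chip as two separate groups. That table construction is an assumption, since the text does not spell it out, so the resulting p value is of the same order as the reported one but need not match it exactly. The function name is hypothetical.

```python
import math


def enrichment(hits_list, n_list, hits_chip, n_chip):
    """Return (fold enrichment, chi-square, p value) for one annotation."""
    fold = (hits_list / n_list) / (hits_chip / n_chip)
    # 2 x 2 table: rows = list / chip, columns = annotated / not annotated.
    a, b = hits_list, n_list - hits_list
    c, d = hits_chip, n_chip - hits_chip
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Upper tail of the chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return fold, chi2, p_value


# Counts for "extracellular": 21 of 142 list genes, 1350 of 16367 chip genes.
fold, chi2, p_value = enrichment(21, 142, 1350, 16367)
print(f"fold = {fold:.1f}, chi-square = {chi2:.2f}, p = {p_value:.4f}")
```

The fold enrichment rounds to the 1.8 stated in the text.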
Table 12 identifies the sequences referred to in this specification.
TABLE 12 Identification of the sequences
Ahr et al. (2002) "Identification of high risk breast cancer patients by gene-expression profiling" Lancet 359: 131-132
Chang et al. (2003) "Gene expression profiling for the prediction of therapeutic response to docetaxel in patients with breast cancer" Lancet 362: 362-369
Early Breast Cancer Trialists' Collaborative Group (1995) "Effects of Radiation Therapy and Surgery in Early Breast Cancer: An early overview of the randomized trials" N Engl J Med 333: 1444-1455
Early Breast Cancer Trialists' Collaborative Group (1998a) "Polychemotherapy for early breast cancer: an overview of the randomized trials" Lancet 352: 930-942
Early Breast Cancer Trialists' Collaborative Group (1998b) "Tamoxifen for early breast cancer: an overview of randomized trials" Lancet 351: 1451-1467
Efron (1981) "Censored data and the bootstrap" J Am Stat Assoc 76: 312-319
Eifel et al. (2001) "National Institutes of Health Consensus Development Conference Statement: adjuvant therapy for breast cancer, November 1-3, 2000" J Natl Cancer Inst 93: 979-989
Foekens et al. (1989a) "Prognostic value of receptors for insulin-like growth factor 1, somatostatin, and epidermal growth factor in human breast cancer" Cancer Res 49: 7002-7009
Foekens et al. (1989b) "Prognostic value of estrogen and progesterone receptors measured by enzyme immunoassays in human breast tumor cytosols" Cancer Res 49: 5823-5828
Goldhirsch et al. (2003) "Meeting highlights: Updated International Expert Consensus on the Primary Therapy of Early Breast Cancer" J Clin Oncol 21: 3357-3365
Golub et al. (1999) "Molecular classification of cancer: class discovery and class prediction by gene expression monitoring" Science 286: 531-537
Gruvberger et al. (2001) "Estrogen receptor status in breast cancer is associated with remarkably distinct gene expression patterns" Cancer Res 61: 5979-5984
Hedenfalk et al. (2001) "Gene-expression profiles in hereditary breast cancer" N Engl J Med 344: 539-548
Herrera-Gayol et al. (1999) "Adhesion proteins in the biology of breast cancer: contribution of CD44" Exp Mol Pathol 66: 149-156
Huang et al. (2003) "Gene expression predictors of breast cancer outcomes" Lancet 361: 1590-1596
Kaplan et al. (1958) "Non-parametric estimation of incomplete observations" J Am Stat Assoc 53: 457-481
Keyomarsi et al. (2002) "Cyclin E and survival in patients with breast cancer" N Engl J Med 347: 1566-1575
Lipshutz et al. (1999) "High density synthetic oligonucleotide arrays" Nat Genet 21: 20-24
Ma et al. (2003) "Gene expression profiles of human breast cancer progression" Proc Natl Acad Sci USA 100: 5974-5979
Ntzani et al. (2003) "Predictive ability of DNA microarrays for cancer outcomes and correlates: an empirical assessment" Lancet 362: 1439-1444
Perou et al. (2000) "Molecular portraits of human breast tumors" Nature 406: 747-752
Ramaswamy et al. (2001) "Multiclass cancer diagnosis using tumor gene expression signatures" Proc Natl Acad Sci USA 98: 15149-15154
Ramaswamy et al. (2003) "A molecular signature of metastasis in primary solid tumors" Nat Genet 33: 1-6
Ransohoff (2004) "Rules of evidence for molecular cancer-marker discovery and validation" Nat Rev Cancer 4: 309-314
Sørlie et al. (2001) "Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications" Proc Natl Acad Sci USA 98: 10869-10874
Sørlie et al. (2003) "Repeated observation of breast tumor subtypes in independent gene expression data sets" Proc Natl Acad Sci USA 100: 8418-8423
Sotiriou et al. (2003) "Gene expression profiles derived from fine needle aspiration correlate with response to systemic chemotherapy in breast cancer" Breast Cancer Res 4: R3
Sotiriou et al. (2003) "Breast cancer classification and prognosis based on gene expression profiles from a population-based study" Proc Natl Acad Sci USA 100: 10393-10398
Su et al. (2001) "Molecular classification of human carcinomas by use of gene expression signatures" Cancer Res 61: 7388-7393
van de Vijver et al. (2002) "A gene expression signature as a predictor of survival in breast cancer" N Engl J Med 347: 1999-2009
van't Veer et al. (2002) "Gene expression profiling predicts clinical outcome of breast cancer" Nature 415: 530-536
Wang et al. (2004) "Gene expression profiles and molecular markers to predict relapse of Dukes' B colon cancer" J Clin Oncol 22: 1564-1571
Woelfle et al. (2003) "Molecular signature associated with bone marrow micrometastasis in human breast cancer" Cancer Res 63: 5679-5684

Claims (56)

NOVELTY OF THE INVENTION CLAIMS
1. A method for assessing breast cancer status, comprising measuring, in a biological sample of a breast cancer previously obtained from a patient, the expression levels of genes via a Marker, wherein gene expression levels above or below predetermined cut-off levels are indicative of the probability of relapse in the bone.
2. A method for classifying patients with breast cancer, comprising measuring, in a biological sample of a breast cancer previously obtained from a patient, the expression levels of genes via a Marker, wherein gene expression levels above or below predetermined cut-off levels are indicative of the stage of the breast cancer.
3. The method according to claim 2, further characterized in that the stage corresponds to classification by the TNM system.
4. The method according to claim 2, further characterized in that the stage corresponds to patients with similar gene expression profiles.
5. A method for determining the treatment protocol for a patient with breast cancer, comprising measuring, in a biological sample of a breast cancer previously obtained from a patient, the expression levels of genes via a Marker, wherein gene expression levels above or below predetermined cut-off levels are sufficiently indicative of the risk of relapse in the bone to allow the physician to determine the degree and type of therapy recommended to prevent or treat relapse in the bone.
6. The method according to claim 1, further characterized in that the bulk tissue preparation is previously obtained from a biopsy or a surgical specimen.
7. The method according to claim 1, further characterized in that the Markers include all those corresponding to SEQ ID NOs: 112-198.
8. The method according to claim 1, 2 or 5, further characterized in that it comprises measuring the level of expression of at least one gene expressed constitutively in the sample.
9. The method according to claim 1, 2 or 5, further characterized in that it comprises determining the state of the estrogen receptor (ER) of the sample.
10. The method according to claim 9, further characterized in that the ER state is determined by measuring the level of expression of at least one gene indicative of the ER status.
11. The method according to claim 10, further characterized in that the state of ER is determined by measuring the presence of ER in the sample.
12. The method according to claim 11, further characterized in that the presence of ER is measured immunohistochemically.
13. The method according to claim 1, 2 or 5, further characterized in that the sample is obtained previously from a primary tumor.
14. The method according to claim 1, 2 or 5, further characterized in that the specificity is at least about 40%.
15. The method according to claim 1, 2 or 5, further characterized in that the sensitivity is at least about 90%.
16. The method according to claim 1, 2 or 5, further characterized in that the expression pattern of the genes is compared with an expression pattern indicative of a patient with breast cancer that relapses to bone.
17. The method according to claim 16, further characterized in that the comparison of expression patterns is performed with pattern recognition methods.
18. The method according to claim 17, further characterized in that the pattern recognition methods include the use of a bone relapse predictor score.
19. The method according to claim 1, 2 or 5, further characterized in that the predetermined cut-off levels are at least 1.7-fold over- or under-expression in the sample relative to the cells or tissues of patients without bone relapse.
20. The method according to claim 1, 2 or 5, further characterized in that the predetermined cut-off levels have at least a statistically significant p value for overexpression in the sample having metastatic cells relative to the cells or tissue of patients without bone relapse.
21. The method according to claim 20, further characterized in that the value of p is less than 0.05.
22. The method according to claim 1, 2 or 5, further characterized in that the expression of the gene is measured in a microarray or microcircuit of genes.
23. The method according to claim 22, further characterized in that the microarray is a cDNA array or an oligonucleotide array.
24. The method according to claim 22, further characterized in that the microarray or gene microcircuit comprises one or more internal control reagents.
25. The method according to claim 1, 2 or 5, further characterized in that the expression of the gene is determined by nucleic acid amplification conducted by polymerase chain reaction (PCR) of RNA extracted from the sample.
26. The method according to claim 25, further characterized in that the PCR is a reverse transcription polymerase chain reaction (RT-PCR).
27. The method according to claim 26, further characterized in that the RT-PCR further comprises one or more internal control reagents.
28. The method according to claim 1, 2 or 5, further characterized in that the expression of the gene is detected by measuring or detecting a protein encoded by the gene.
29. The method according to claim 28, further characterized in that the protein is detected by an antibody specific for the protein.
30. The method according to claim 1, 2 or 5, further characterized in that the expression of the gene is detected by measuring a characteristic of the gene.
31. The method according to claim 30, further characterized in that the measured characteristic is selected from the group consisting of DNA amplification, methylation, mutation and allelic variation.
32. A method for assessing breast cancer status, which comprises measuring, in a breast cancer biological sample previously obtained from a patient, the expression levels of genes via a marker, wherein gene expression levels above or below predetermined cut-off levels are indicative of the probability of relapse to the bone.
33. A kit for conducting an assay to determine the prognosis of breast cancer in a biological sample previously obtained from a patient.
34. The kit according to claim 33, further characterized in that the Marker corresponds to any of SEQ ID NOs. 112-116.
35. The kit according to claim 33, further characterized in that the Marker corresponds to all of SEQ ID NOs. 112-116.
36. The kit according to claim 34, further characterized in that it includes Markers corresponding to one or more of SEQ ID NOs. 117-198.
37. The kit according to claim 33, further characterized in that it includes Markers corresponding to all of SEQ ID NOs. 112-198.
38. The kit according to claim 33, further characterized in that it comprises reagents for conducting a microarray analysis.
39. The kit according to claim 33, further characterized in that it comprises a medium through which the nucleic acid sequences, their complements, or portions thereof are assayed.
40. Articles for assessing breast cancer status, comprising Markers.
41. The articles according to claim 40, further characterized in that the Marker corresponds to any of SEQ ID NOs. 112-116.
42. The articles according to claim 40, further characterized in that the Marker corresponds to all of SEQ ID NOs. 112-116.
43. The articles according to claim 40, further characterized in that they include Markers corresponding to one or more of SEQ ID NOs. 117-198.
44. The articles according to claim 40, further characterized in that they comprise reagents for conducting a microarray analysis.
45. The articles according to claim 40, further characterized in that they comprise a medium through which the nucleic acid sequences, their complements, or portions thereof are assayed.
46. A microarray or gene chip for carrying out the method according to claim 1, 2, 5 or 6.
47. The microarray according to claim 46, further characterized in that it comprises a marker sufficient to characterize breast cancer status or the risk of relapse to the bone in a biological sample.
48. The microarray according to claim 46, further characterized in that the measurement or characterization is at least 1.7-fold over- or under-expression.
49. The microarray according to claim 46, further characterized in that the measurement provides a statistically significant p-value of over- or under-expression.
50. The microarray according to claim 46, further characterized in that the p-value is less than 0.05.
51. The microarray according to claim 46, further characterized in that it comprises a cDNA array or an array of oligonucleotides.
52. The microarray according to claim 46, further characterized in that it comprises one or more internal control reagents.
53. A diagnostic/prognostic portfolio comprising a marker sufficient to characterize breast cancer status or the risk of relapse to the bone in a biological sample previously obtained from a patient.
54. The portfolio according to claim 53, further characterized in that the measurement or characterization is at least 1.7-fold over- or under-expression.
55. The portfolio according to claim 53, further characterized in that the measurement provides a statistically significant p-value of over- or under-expression.
56. The portfolio according to claim 53, further characterized in that the p-value is less than 0.05.
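
The numeric thresholds recited in the claims (a cut-off of at least 1.7-fold over- or under-expression, a p-value below 0.05, a sensitivity of at least about 90% and a specificity of at least about 40%) can be illustrated with a minimal sketch. The following is not the claimed method. It assumes linear-scale, positive expression values, treats under-expression as a ratio at or below 1/1.7, and takes the p-value as an input computed elsewhere. All function and constant names are hypothetical.

```python
from statistics import mean

FOLD_CUTOFF = 1.7  # fold-change threshold recited in claims 19, 48 and 54
P_CUTOFF = 0.05    # p-value threshold recited in claims 21, 50 and 56


def fold_change(sample_level, reference_levels):
    """Ratio of the sample's expression to the mean of the reference group
    (patients without bone relapse). Assumes linear-scale, positive values."""
    return sample_level / mean(reference_levels)


def passes_cutoff(sample_level, reference_levels, p_value):
    """True if the gene is at least 1.7-fold over- or under-expressed
    and the supplied p-value is below 0.05."""
    fc = fold_change(sample_level, reference_levels)
    # Assumption: under-expression is read as a ratio <= 1/1.7.
    differs = fc >= FOLD_CUTOFF or fc <= 1.0 / FOLD_CUTOFF
    return differs and p_value < P_CUTOFF


def sensitivity_specificity(predicted, actual):
    """Sensitivity and specificity of boolean bone-relapse predictions
    against boolean observed outcomes."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    fp = sum(1 for p, a in pairs if p and not a)
    return tp / (tp + fn), tn / (tn + fp)
```

In this sketch, a predictor that flags every true relapse but also flags some non-relapsing patients would show high sensitivity and lower specificity, which is the trade-off the sensitivity and specificity claims describe.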
MXPA/A/2006/008788A 2005-08-02 2006-08-02 Predicting bone relapse of breast cancer MXPA06008788A (en)

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
US60/704740 2005-08-02

Publications (1)

Publication Number Publication Date
MXPA06008788A (en) 2008-09-02
