WO2013147330A1

WO2013147330A1 - Prognosis prediction system of locally advanced gastric cancer

Info

Publication number: WO2013147330A1
Application number: PCT/KR2012/002193
Authority: WO
Inventors: 허용민; 서진석; 노성훈; 정재호; 박은성
Original assignee: 연세대학교 산학협력단
Priority date: 2012-03-26
Filing date: 2012-03-26
Publication date: 2013-10-03

Abstract

The present invention relates to a novel prognosis prediction system capable of predicting the prognosis of locally advanced gastric cancer. More specifically, the present invention can predict the clinical outcome after surgical removal of a gastric cancer through comparative analysis of the expression of gene or protein sets.

Description

Prognostic Prediction System for Locally Advanced Gastric Cancer

The present invention relates to a novel prognostic prediction system capable of predicting the prognosis of locally advanced gastric cancer through comparative analysis of the expression of gene or protein sets.

Unlike breast cancer and colorectal cancer, gastric cancer shows clearly different outcomes from stage 1 to stage 4 under the TNM staging system (see FIG. 1). That is, the 5-year survival rate is 90% or more in stage 1 but 20% or less in stage 4. The prognostic predictive power of the TNM staging system is therefore very good [Ref.: 7th edition of the AJCC Cancer Staging Manual: stomach. Ann Surg Oncol 2010; 17: 3077-3079].

Based on the staging system, gastric cancer is often divided into early gastric cancer, locally advanced gastric cancer, locally advanced invasive gastric cancer, and metastatic gastric cancer.

Oncologists have numerous treatment options available to them, including various combinations of chemotherapeutic agents characterized as "standard of care" and a number of drugs that carry no label claim for a particular cancer but for which there is evidence of efficacy in that cancer. The best chance of a good treatment outcome requires that the patient be assigned the optimal available cancer treatment, and that this assignment be made as soon as possible after diagnosis. In particular, it is important to determine the likelihood that a patient will respond to "standard of care" chemotherapy, because chemotherapeutic agents such as anthracyclines and taxanes have limited efficacy and are toxic. Identifying the patients most or least likely to respond could therefore, through better patient selection, increase the net benefit these drugs provide and reduce net mortality and toxicity. Currently, the diagnostic tests used in clinical practice are single-analyte tests and therefore do not capture the potential value of the known relationships among many different markers. In addition, the diagnostic tests are not quantitative, relying mainly on immunohistochemistry. This method yields different results in different laboratories, in part because the reagents are not standardized and in part because the interpretation is subjective and cannot easily be quantified. RNA-based tests have not been used often because of RNA degradation over time and because fresh tissue samples are difficult to obtain from patients for analysis. Fixed, paraffin-embedded tissue is more readily available, and methods for detecting RNA in fixed tissue have been established. However, these methods typically do not allow a large number of genes (DNA or RNA) to be studied from small amounts of material. Fixed tissue has therefore traditionally been used rarely other than for immunohistochemical detection of proteins.

In recent years, several groups have published studies on the classification of various cancer types by microarray gene expression analysis (see, e.g., Golub et al., Science 286: 531-537 (1999); Bhattacharjee et al., Proc. Natl. Acad. Sci. USA 98: 13790-13795 (2001); Chen-Hsiang et al., Bioinformatics 17 (Suppl. 1): S316-S322 (2001); and Ramaswamy et al., Proc. Natl. Acad. Sci. USA 98: 15149-15154 (2001)).

Although modern molecular biology and biochemistry have uncovered hundreds of genes whose activity affects tumor cell behavior, differentiation status, and resistance or sensitivity to certain therapeutic agents, with a few exceptions the status of these genes is not routinely used to make clinical decisions about drug treatment.

Despite recent advances, the challenge of cancer treatment remains to target specific treatment regimens to pathologically distinct tumor types and ultimately to personalize tumor treatment in order to maximize outcomes. Thus, there is a need for tests that simultaneously provide predictive information about patient response to various treatment options.

It is an object of the present invention to provide a novel prognostic prediction system based on microRNAs, RNA transcripts or proteins for locally advanced gastric cancer, in particular for all gastric cancers regardless of stage or for stages without regional lymph node metastasis (N0).

To achieve the above object, the present invention provides a method for predicting the clinical outcome (prognosis) of the N0 gastric cancer patient group under TNM staging, such as the T1N0, T2N0, T3N0 or T4N0 gastric cancer patient group.

In one embodiment, the invention provides a method for predicting prognosis in a subject diagnosed with gastric cancer, the method comprising:

determining, in a biological sample comprising cancer cells from the subject, the expression level of at least one RNA transcript selected from the group consisting of FZD1, GLI3, ANGPTL7, ABL1, SMARCD3, ILK, CAV1, VIP, HSPB7, TOP2A and FANCD2, and of one or more miRNAs selected from the group consisting of hsa-miR-933, hsa-miR-184, hsa-miR-380*, hsa-miR-190b, hsa-miR-27a* and hsa-miR-1201;

calculating a recurrence score (RS) of the biological sample based on the expression level of the RNA transcript or miRNA determined in the preceding step; and

determining a prognosis according to the RS value.

The RS can be calculated according to the following equation (1):

[Equation 1]

$$\text{Risk Score} = HR_1 \times \text{normLogTransValue}_1 + HR_2 \times \text{normLogTransValue}_2 + \cdots + HR_n \times \text{normLogTransValue}_n$$

where

$HR_n$ represents the hazard ratio of the n-th RNA transcript or microRNA, and

$\text{normLogTransValue}_n$ means the value associated with the expression of the n-th RNA transcript or microRNA.

According to one embodiment, RS can be obtained as follows:

Risk Score = FZD1 × 4.302 + GLI3 × 4.073 + ANGPTL7 × 2.949 + ABL1 × 2.784 + SMARCD3 × 2.266 + ILK × 2.251 + CAV1 × 1.788 + VIP × 1.73 + HSPB7 × 1.535 - TOP2A × 1.766 - FANCD2 × 2.793 + miR933 × 5.256 + miR184 × 1.674 + miR380* × 1.903 - miR190b × 3.597 - miR27a* × 1.7 - miR1201 × 1.35
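
The weighted sum above can be illustrated with a minimal Python sketch. The coefficients are copied from the example formula; the dictionary keys use the full miRNA names from the marker list, and the function name `risk_score` and the input values are hypothetical placeholders rather than anything specified in the text.

```python
# Signed coefficients copied from the example formula in the text.
# Keys use the full miRNA identifiers from the marker list (an assumption
# about how "miR933", "miR184", etc. in the formula map to marker names).
COEFFICIENTS = {
    "FZD1": 4.302, "GLI3": 4.073, "ANGPTL7": 2.949, "ABL1": 2.784,
    "SMARCD3": 2.266, "ILK": 2.251, "CAV1": 1.788, "VIP": 1.73,
    "HSPB7": 1.535, "TOP2A": -1.766, "FANCD2": -2.793,
    "hsa-miR-933": 5.256, "hsa-miR-184": 1.674, "hsa-miR-380*": 1.903,
    "hsa-miR-190b": -3.597, "hsa-miR-27a*": -1.7, "hsa-miR-1201": -1.35,
}


def risk_score(values):
    """Return the weighted sum of expression-related values.

    `values` maps each marker name to its normalized, log-transformed
    expression value (normLogTransValue in Equation 1).
    """
    missing = sorted(set(COEFFICIENTS) - set(values))
    if missing:
        raise KeyError(f"missing markers: {missing}")
    return sum(COEFFICIENTS[name] * values[name] for name in COEFFICIENTS)


# Hypothetical input: every marker at 0 except two, purely for illustration.
example = {name: 0.0 for name in COEFFICIENTS}
example["FZD1"] = 1.0
example["TOP2A"] = 1.0
print(risk_score(example))
```

Raising an error on missing markers is a design choice of this sketch; the text does not say how incomplete measurements should be handled.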

The present invention also provides a method useful for predicting the clinical outcome of the entire gastric cancer patient group irrespective of TNM stage.

In one embodiment, the invention provides a method for predicting the clinical outcome of a subject, the method comprising measuring, in a biological sample comprising cancer cells obtained from the subject,

a) the expression level of one or more RNA transcripts X selected from the group consisting of HAT, C17orf65, TRAF6, CISH, ELAC1, ACTR8, SMARCAD1, SRRM1, C15orf44, EFTUD1, BUB3, KIAA0232, SEPSECS, DCAF16, ARHGAP19, TAF5, CNOT6L, NIF3L1, C19orf54, RNPC6, NRCPC28, NRCPC28 USP54, LIN54, FANCF, GAR1, GPBP1L1, TRAF3, KIAA0368, CRNKL1, SCLY, SMCR7L, PAIP1, RBD1, RPAIN, AP1G1, C1orf212, C18orf54, TIFA, EWSR1, FUBP1, AGGF1, CWF2F145, CRP14F152 NUP88, SNORA65, MED28, RFC1, RRM1, KARS, CCR1, CHAF1A, PLCH1, FASTKD1, KIAA0174, SAAL1, TNFSF14, ETV7, NBN, C20orf7, RHBDD1, ANKRD32, ING3, ATPAF1, CCD KP14, QD1 NFX1, SMAP2, SRGAP3, KIR2DL3, KIAA0564, GFI1, KIAA1715, COX15, PATL1, LETMD1, PRRG4, SETD4, GRAMD1C, NDRG3, PTPN22, TRIM21, PI4K2B, DCLRE1A, ALG11, PARPC, LIKASCEK2 MMP25, LARP1B, STAP2, GCH1, C20orf72, HK3, SNX5, NAAA, KLRD1, IL18RAP, PSMB8, THOP1, CASP5, ALPK1, SLC11A2, PSMB10, MND1, FANCG, IMPA1, MYL5, TTF2, RFIA3 PRF, BA BTN3A1, FANCD2, RIPK2, TSPAN6, IFNG, CDC25A, CXCR6, SLC27A2, GAD1, DLEU2, JAK2, CD7, FKBP11, IL32, SORD, TAP1, GNLY, C2, GZMB, VSNL1 and GBP5, and/or

b) the expression level of one or more RNA transcripts Y selected from the group consisting of CRTAC1, DKK3, DIO2, CYBRD1, SPIRE1, SERPINE2, PPAP2A, TCEAL2, DPYSL3, ACTA2, RBPMS2, PALLD, ALDH1A3, HDGFRP3, DACT3, IGFBP7, TMEFF2, PCSK5, ICAM2, MYL2, DC1 SOD2 NNMT, HEYL, APOD, HSPB2, NGFRAP1, HSPB6, RBPMS, SGCE, DCAF6, LPP, PEA15, VIP, GJA4, CYTH3, PTN, LEPR, RAI14, TMEM47, FOXS1, ESAM, MEIS3P1, C15orf52, ITGB1, OGN IGFBP6, ABLIM3, LAYN, FERMT2, FZD4, ADAMTS8, TGFB1I1, DARC, PLN, SCHIP1, PDGFC, RAB6B, CPE, MARCKS, TIE1, AFAP1L1, ERGIC1, HSPB7, EHD2, SLC38A1, FTSCC, AD20, FNDC4H TMEM136, FSTL1, CDH6, HTR2B, LAMA2, GEM, CDH5, PDE8B, RAB32, SELM, C7, PLAC9, MFAP4, FLNC, CTSE, LOC346887, MPRIP, GNB5, ELN, ENG, CRABP2, CST6, MYOM1, PCDH18, LAMB LHFP, FILIP1L, CAV1, CPXM2, NBEA, TEK, CTSF, LTC4S, AEBP1, GNG11, SV2B, KCNMB1, BARX1, DIP2C, LAMC1, PODN, LAPTM4A, HTRA1, FGF2, CLEC14A, PHLDB2, CD93, RGS11, TRIM47, LHX6, EDNRA, PRSS23, FAM129A, SDPR, PAMR1, APLNR, PDE7B, ANKRD10, FRZB, SMOC2, CDC42EP4 and RERG; and

determining that an increase in the expression of transcript X indicates an increased likelihood of a positive clinical outcome, and that an increase in the expression of transcript Y indicates a decreased likelihood of a positive clinical outcome.

In one embodiment, the invention also provides a method for predicting the clinical outcome of a subject, the method comprising measuring, in a biological sample comprising cancer cells obtained from the subject,

a) the expression level of one or more miRNAs (I) selected from the group consisting of HS_59, HS_162, HS_67, hsa-miR-96*, hsa-miR-496, hsa-miR-223, hsa-miR-302a*, hsa-miR-20a, hsa-miR-93, hsa-miR-148a, hsa-miR-155*, hsa-miR-15a, hsa-miR-17 and hsa-miR-18a, and/or

b) the expression level of one or more miRNAs (II) selected from the group consisting of hsa-miR-1, HS_6, HS_111, HS_114, hsa-let-7c, HS_126, HS_90, hsa-miR-548d-5p, hsa-miR-189:9.1, solexa-4793-177, HS_135, hsa-miR-20b* and hsa-miR-658; and

determining that an increase in the expression of miRNA (I) indicates an increased likelihood of a positive clinical outcome, and that an increase in the expression of miRNA (II) indicates a decreased likelihood of a positive clinical outcome.
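
The direction-of-association rule for the two miRNA groups can be sketched as a simple lookup. The two sets below are copied from the lists in this embodiment; the function name `outcome_likelihood_change` and its return strings are hypothetical, and the sketch covers only the case the text addresses, namely increased expression.

```python
# miRNA (I): increased expression -> increased likelihood of a positive outcome.
MIRNA_I = frozenset({
    "HS_59", "HS_162", "HS_67", "hsa-miR-96*", "hsa-miR-496", "hsa-miR-223",
    "hsa-miR-302a*", "hsa-miR-20a", "hsa-miR-93", "hsa-miR-148a",
    "hsa-miR-155*", "hsa-miR-15a", "hsa-miR-17", "hsa-miR-18a",
})

# miRNA (II): increased expression -> decreased likelihood of a positive outcome.
MIRNA_II = frozenset({
    "hsa-miR-1", "HS_6", "HS_111", "HS_114", "hsa-let-7c", "HS_126", "HS_90",
    "hsa-miR-548d-5p", "hsa-miR-189:9.1", "solexa-4793-177", "HS_135",
    "hsa-miR-20b*", "hsa-miR-658",
})


def outcome_likelihood_change(mirna):
    """Return how increased expression of `mirna` shifts the likelihood
    of a positive clinical outcome, according to the two groups above."""
    if mirna in MIRNA_I:
        return "increased"
    if mirna in MIRNA_II:
        return "decreased"
    raise KeyError(f"{mirna} is in neither miRNA (I) nor miRNA (II)")


print(outcome_likelihood_change("hsa-miR-223"))
print(outcome_likelihood_change("hsa-let-7c"))
```

The same lookup pattern would apply to RNA transcripts X and Y.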

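In one embodiment, the invention provides a method for predicting prognosis in a subject diagnosed with gastric cancer, the method comprising:

determining, in a biological sample comprising cancer cells obtained from the subject, the expression level of at least one protein selected from the group consisting of Akt^pS473, PAI, SMAD3, P70S6K and EGFR2;

calculating a recurrence score (RS) of the biological sample based on the expression level of the protein determined in the preceding step; and

determining a prognosis according to the RS value.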

The RS may be calculated according to Equation 2:

[Equation 2]

$$\text{Risk Score} = HR_1 \times \text{RPPAValue}_1 + HR_2 \times \text{RPPAValue}_2 + \cdots + HR_n \times \text{RPPAValue}_n$$

where

$HR_n$ represents the hazard ratio of the n-th functional protein, and

$\text{RPPAValue}_n$ means the value associated with the expression of the n-th functional protein.

The present invention also provides a computer readable recording medium having recorded thereon a program for executing prognostic prediction of gastric cancer.

In one embodiment, a medium useful for predicting the clinical outcome of the N0 gastric cancer patient group under TNM staging may be provided. For example, the invention provides a computer-readable recording medium having recorded thereon a program for causing a computer to execute:

determining, in a nucleic acid sample obtained from a patient, the expression level of at least one RNA transcript selected from the group consisting of FZD1, GLI3, ANGPTL7, ABL1, SMARCD3, ILK, CAV1, VIP, HSPB7, TOP2A and FANCD2, and of one or more miRNAs selected from the group consisting of hsa-miR-933, hsa-miR-184, hsa-miR-380*, hsa-miR-190b, hsa-miR-27a* and hsa-miR-1201; and

classifying a patient whose RS is higher than a set point as having a high likelihood of relapse, and a patient whose RS is lower than the set point as having a low likelihood of relapse.

The RS value based on the expression level of the RNA transcript or miRNA can be obtained through Equation 1 above.

In one embodiment, a medium useful for predicting the clinical outcome of the entire gastric cancer patient population independent of TNM stage may be provided. For example, the invention provides a computer-readable recording medium having recorded thereon a program for causing a computer to execute:

determining, in a protein sample obtained from a patient, the expression level of at least one protein selected from the group consisting of Akt^pS473, PAI, SMAD3, P70S6K and EGFR2; and

classifying a patient whose RS is greater than a set point as having a high likelihood of relapse, and a patient whose RS is smaller than the set point as having a low likelihood of relapse.

The RS value based on the expression level of the protein can be obtained through Equation 2 above.

The present invention builds predictive models of overall survival and relapse-free survival for N0 gastric cancer patients under TNM staging, and then provides a system that calculates a prognostic index from the expression levels of the microRNAs, RNA transcripts or proteins that have a statistically significant effect on survival, so that the clinical outcome after surgical resection of gastric cancer can be predicted.

In addition, by using a gene set system organized according to the biological function of genes, the present invention enables analysis of gene groups according to the biological function of gastric cancer itself.

FIG. 1 shows the mortality rate by gastric cancer stage for 9,324 cases treated from 1987 to 2007 at Severance Hospital, Yonsei Medical Center.

FIG. 2 shows an example of the recurrence scoring method using microRNA expression in stage 3a gastric cancer.

FIG. 3 shows the results of survival analysis for Akt^pS473 as a functional protein.

FIG. 4 shows the number of deaths in the good-prognosis group and in the poor-prognosis group when the cut-off of the prognostic index based on protein expression levels is 0.

FIG. 5 shows survival analysis results according to the prognostic index (risk scoring system), with scores divided into + and - groups, in the T1N0, T2N0, T3N0 or T4N0 gastric cancer patient group.

FIG. 6 shows the number of deaths in the good-prognosis group and in the poor-prognosis group when the cut-off of the prognostic index is 0 in the T1N0, T2N0, T3N0 or T4N0 gastric cancer patient group.

FIG. 7 illustrates the process of extracting the correlation between microRNA expression levels and RNA transcript expression levels in the T1N0, T2N0, T3N0 or T4N0 gastric cancer patient group.

Hereinafter, the configuration of the present invention will be described in detail.

The present invention was devised to develop a system for predicting clinical outcome after gastric resection for the entire gastric cancer patient group or for the N0 patient group under TNM staging, and provides RNA transcript, microRNA or protein sets useful for predicting clinical outcome after surgical resection in gastric cancer patients.

In one aspect, the present invention provides a method for predicting clinical outcome after surgical resection in the N0 patient group under TNM staging, such as T1N0, T2N0, T3N0 or T4N0 advanced gastric cancer. For example, the method comprises determining, in a biological sample comprising cancer cells obtained from a subject, the expression level of one or more RNA transcripts selected from the group consisting of FZD1, GLI3, ANGPTL7, ABL1, SMARCD3, ILK, CAV1, VIP, HSPB7, TOP2A and FANCD2, and of one or more miRNAs selected from the group consisting of hsa-miR-933, hsa-miR-184, hsa-miR-380*, hsa-miR-190b, hsa-miR-27a* and hsa-miR-1201; and calculating a recurrence score (RS) of the biological sample based on the determined expression levels.

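The RS can be calculated according to the following equation (1):

[Equation 1]

$$\text{Risk Score} = HR_1 \times \text{normLogTransValue}_1 + HR_2 \times \text{normLogTransValue}_2 + \cdots + HR_n \times \text{normLogTransValue}_n$$

where

$HR_n$ represents the hazard ratio of the n-th RNA transcript or microRNA, and

$\text{normLogTransValue}_n$ means the value associated with the expression of the n-th RNA transcript or microRNA.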

In the above formula, the term "hazard ratio" (HR) means a coefficient that reflects the contribution to cancer progression, relapse, or response to therapy. The hazard ratio can be derived by various statistical techniques; for example, the HR value can be determined with a univariate Cox proportional hazards model. In one embodiment, when the HR value is used in the RS formula, an HR value greater than or equal to 1 may be used as it is, and for an HR value less than 1 the value 1/HR may be used.
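
The HR handling described above can be sketched in Python as follows. `hr_magnitude` follows the stated rule directly. `signed_weight` additionally attaches a negative sign when HR is below 1; that sign convention is an assumption inferred from the subtracted terms in the example RS formula, not something the text states explicitly. Both function names are hypothetical.

```python
def hr_magnitude(hr):
    """Return HR itself when HR >= 1, and 1/HR when HR < 1."""
    if hr <= 0:
        raise ValueError("a hazard ratio must be positive")
    return hr if hr >= 1 else 1.0 / hr


def signed_weight(hr):
    """Return the magnitude with a negative sign for protective markers.

    Assumption: markers with HR < 1 enter the score with a minus sign,
    as the subtracted terms in the example formula suggest.
    """
    magnitude = hr_magnitude(hr)
    return magnitude if hr >= 1 else -magnitude


print(hr_magnitude(2.0), hr_magnitude(0.5), signed_weight(0.5))
```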

In addition, in the above formula, the value associated with the expression of an RNA transcript or microRNA means a value associated with the expression of an individual gene product, for example an RNA transcript, microRNA or protein. The value can be determined using various known statistical means. For example, the expression-related value may be the value obtained by converting the p value measured by the univariate Cox proportional hazards model to its log2 value and then applying quantile normalization.
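
The two operations named above, a log2 transform followed by quantile normalization, can be sketched in plain Python. The text does not specify the data layout or exactly which measured values are transformed, so this sketch assumes a matrix of positive measured values with one row per marker and one column per sample; the function names are hypothetical, and ties are ranked in input order for simplicity.

```python
import math


def log2_transform(matrix):
    """Apply log2 to every (positive) value in a rows-by-columns matrix."""
    return [[math.log2(value) for value in row] for row in matrix]


def quantile_normalize(matrix):
    """Force every column (sample) onto the same distribution.

    Each value is replaced by the mean, across columns, of the values
    that share its within-column rank.
    """
    n_rows, n_cols = len(matrix), len(matrix[0])
    columns = [[matrix[r][c] for r in range(n_rows)] for c in range(n_cols)]
    sorted_columns = [sorted(column) for column in columns]
    rank_means = [
        sum(column[i] for column in sorted_columns) / n_cols
        for i in range(n_rows)
    ]
    result = [[0.0] * n_cols for _ in range(n_rows)]
    for c, column in enumerate(columns):
        order = sorted(range(n_rows), key=lambda r: column[r])
        for rank, r in enumerate(order):
            result[r][c] = rank_means[rank]
    return result


# Hypothetical raw values: 3 markers (rows) measured in 2 samples (columns).
raw = [[2.0, 4.0], [8.0, 16.0], [4.0, 64.0]]
print(quantile_normalize(log2_transform(raw)))
```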

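According to one embodiment, the RS can be determined as follows:

Risk Score = FZD1 × 4.302 + GLI3 × 4.073 + ANGPTL7 × 2.949 + ABL1 × 2.784 + SMARCD3 × 2.266 + ILK × 2.251 + CAV1 × 1.788 + VIP × 1.73 + HSPB7 × 1.535 - TOP2A × 1.766 - FANCD2 × 2.793 + miR933 × 5.256 + miR184 × 1.674 + miR380* × 1.903 - miR190b × 3.597 - miR27a* × 1.7 - miR1201 × 1.35

The method may be useful for predicting clinical outcome after surgical resection in N0 gastric cancer patients under TNM staging, such as T1N0, T2N0, T3N0 or T4N0 advanced gastric cancer.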

In terms of overall survival (OS) or recurrence-free survival (RFS), the method may determine that the prognosis is poor when the RS value is positive and good when the RS value is negative. In other words, a positive value indicates a low overall survival rate, or a high incidence of death or relapse, over at least 3, 5, 8 or 10 years, whereas a negative value indicates a high overall survival rate, or a low incidence of death or relapse, over the same periods. A good prognosis can be expressed as an increased likelihood of a positive clinical outcome, and a poor prognosis as a decreased likelihood of a positive clinical outcome.

The present invention also provides a method useful for predicting clinical outcome after surgical resection for the entire gastric cancer patient group regardless of TNM stage.

The method may be a PCR based method or an array based method.

The expression level may be one that is normalized to the expression level of one or more RNA transcripts.

The clinical outcome may be expressed in terms of overall survival (OS) or recurrence-free survival (RFS).

The method may comprise measuring the expression level of at least two RNA transcripts selected from RNA transcripts X and Y. More specifically, the prognosis can be predicted by measuring two or more expression levels selected from RNA transcripts X and Y and analyzing each increase in expression to determine the increase or decrease in the likelihood of a positive clinical outcome.

The method may comprise measuring the expression level of at least five RNA transcripts selected from RNA transcripts X and Y. More specifically, five or more expression levels selected from RNA transcripts X and Y can be measured and each expression increase analyzed to determine the increase or decrease in the likelihood of a positive clinical outcome to predict prognosis.

The method may comprise measuring the expression level of at least 10 RNA transcripts selected from RNA transcripts X and Y. More specifically, 10 or more expression levels selected from RNA transcripts X and Y can be measured and each expression increase analyzed to determine the increase or decrease in the likelihood of a positive clinical outcome to predict prognosis.

The method may comprise measuring the expression levels of all of the RNA transcripts X and Y. More specifically, the prognosis can be predicted by measuring the expression levels of all RNA transcripts X and Y and analyzing each increase in expression to determine the increase or decrease in the likelihood of a positive clinical outcome.

The method may comprise measuring the expression level of two or more microRNAs selected from miRNAs (I) and (II). More specifically, the prognosis can be predicted by measuring two or more expression levels selected from miRNAs (I) and (II) and analyzing each increase in expression to determine the increase or decrease in the likelihood of a positive clinical outcome.

The method may comprise measuring the expression level of at least five microRNAs selected from miRNAs (I) and (II). More specifically, the prognosis can be predicted by measuring five or more expression levels selected from miRNAs (I) and (II) and analyzing each increase in expression to determine the increase or decrease in the likelihood of a positive clinical outcome.

The method may comprise measuring the expression level of at least 10 microRNAs selected from miRNAs (I) and (II). More specifically, the prognosis can be predicted by measuring 10 or more expression levels selected from miRNAs (I) and (II) and analyzing each increase in expression to determine the increase or decrease in the likelihood of a positive clinical outcome.

The method may comprise measuring the expression levels of all of the miRNAs (I) and (II). More specifically, the prognosis can be predicted by measuring the expression levels of all miRNAs (I) and (II) and analyzing each increase in expression to determine the increase or decrease in the likelihood of a positive clinical outcome.

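In one embodiment, the invention also provides a method comprising determining, in a biological sample comprising cancer cells obtained from a subject, the expression level of at least one protein selected from the group consisting of Akt^pS473, PAI, SMAD3, P70S6K and EGFR2; and calculating a recurrence score (RS) of the biological sample based on the determined protein expression level.

The RS may be calculated according to Equation 2:

[Equation 2]

$$\text{Risk Score} = HR_1 \times \text{RPPAValue}_1 + HR_2 \times \text{RPPAValue}_2 + \cdots + HR_n \times \text{RPPAValue}_n$$

where

$HR_n$ represents the hazard ratio of the n-th functional protein, and

$\text{RPPAValue}_n$ means the value associated with the expression of the n-th functional protein.

The hazard ratio and the value associated with the expression of the functional protein may be determined as described above.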

In terms of overall survival (OS) or recurrence-free survival (RFS), the method may determine that the prognosis is poor when the RS value is greater than the set point and good when the RS value is less than the set point. For example, when the set point is 0, the prognosis is poor when the RS value is greater than 0 and good when the RS value is less than 0.
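
The protein-based score and the set-point rule can be sketched together as follows. The text gives no hazard-ratio values for the proteins, so the weights and RPPA values below are purely hypothetical placeholders, as are the function names. The text does not say how an RS exactly equal to the set point is treated; this sketch reports it as undetermined.

```python
def protein_risk_score(weights, rppa_values):
    """Weighted sum over proteins: sum of weight_i * RPPAValue_i."""
    return sum(weights[name] * rppa_values[name] for name in weights)


def classify_prognosis(rs, set_point=0.0):
    """Poor prognosis above the set point, good prognosis below it."""
    if rs > set_point:
        return "poor"
    if rs < set_point:
        return "good"
    return "undetermined"  # the boundary case is not specified in the text


# Hypothetical weights and measurements, for illustration only.
weights = {"SMAD3": 2.0, "PAI": -1.5}
sample = {"SMAD3": 1.0, "PAI": 1.0}
rs = protein_risk_score(weights, sample)
print(rs, classify_prognosis(rs))
```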

The invention also provides a computer readable recording medium having recorded thereon a program for executing a prediction of prognosis after resection by surgery of gastric cancer.

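In one embodiment, a medium useful for predicting clinical outcome after surgical resection in N0 gastric cancer patients under TNM staging can be provided. For example, the medium is a computer-readable recording medium having recorded thereon a program for causing a computer to execute: determining, in a nucleic acid sample obtained from a patient, the expression level of at least one RNA transcript selected from the group consisting of FZD1, GLI3, ANGPTL7, ABL1, SMARCD3, ILK, CAV1, VIP, HSPB7, TOP2A and FANCD2, and of one or more miRNAs selected from the group consisting of hsa-miR-933, hsa-miR-184, hsa-miR-380*, hsa-miR-190b, hsa-miR-27a* and hsa-miR-1201; and classifying the patient according to the RS.

The RS may be calculated according to Equation 1.

In terms of overall survival (OS) or recurrence-free survival (RFS), the recording medium may regard the likelihood of recurrence as high when the RS value is higher than the set point and as low when the RS value is lower than the set point. For example, when the set point is expressed as +/-, the likelihood of recurrence may be judged high when the RS is positive and low when the RS is negative.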

In one embodiment, a medium useful for predicting clinical outcome after gastric resection for the entire gastric cancer patient group irrespective of TNM stage can be provided. For example, the medium is a computer-readable recording medium having recorded thereon a program for causing a computer to execute: determining, in a protein sample obtained from a patient, the expression level of at least one protein selected from the group consisting of Akt^pS473, PAI, SMAD3, P70S6K and EGFR2; and classifying the patient according to the RS.

The RS may be calculated according to Equation 2.

In terms of overall survival (OS) or recurrence-free survival (RFS), the recording medium may regard the likelihood of recurrence as high when the RS value is greater than the set point and as low when the RS value is smaller than the set point. For example, when the set point is 0, the likelihood of recurrence is high if the RS value is greater than 0 and low if the RS value is less than 0.

Unless defined otherwise, technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. [Singleton et al., Dictionary of Microbiology and Molecular Biology 2nd ed., J. Wiley & Sons (New York, NY 1994) and March, Advanced Organic Chemistry: Reactions, Mechanisms and Structure 4th ed., John Wiley & Sons (New York, NY 1992)] provide those skilled in the art with general guidance on many of the terms used in this application.

Those skilled in the art will recognize many methods and materials similar or equivalent to those described herein that can be used in the practice of the present invention. Indeed, the invention is in no way limited to the methods and materials described. For the purposes of the present invention, the following terms are defined below.

As used herein, "microarray" refers to the regular placement of hybridizable array elements, preferably polynucleotide probes, on a substrate.

As used herein, "polynucleotide", when used in the singular or plural, generally refers to any polyribonucleotide or polydeoxyribonucleotide, which may be unmodified RNA or DNA or modified RNA or DNA. Thus, for example, polynucleotides as defined herein include, but are not limited to, single- and double-stranded DNA, DNA comprising single- and double-stranded regions, single- and double-stranded RNA, RNA comprising single- and double-stranded regions, and hybrid molecules comprising DNA and RNA that may be single-stranded or, more typically, double-stranded, or that comprise single- and double-stranded regions.

In addition, "polynucleotide" as used herein refers to triple-stranded regions comprising RNA or DNA or both RNA and DNA. The strands in such regions may be from the same molecule or from different molecules. The regions may include all of one or more of the molecules, but more typically involve only a region of some of the molecules. One of the molecules of a triple-helical region is often an oligonucleotide.

As used herein, "polynucleotide" specifically includes cDNA. The term includes DNA (including cDNA) and RNA containing one or more modified bases. Thus, DNA or RNA with a backbone modified for stability or for other reasons is a "polynucleotide" as that term is intended herein. In addition, DNA or RNA comprising unusual bases, such as inosine, or modified bases, such as tritiated bases, is included within the term "polynucleotide" as defined herein. In general, the term "polynucleotide" embraces all chemically, enzymatically and/or metabolically modified forms of unmodified polynucleotides, as well as the chemical forms of DNA and RNA characteristic of viruses and cells, including simple and complex cells.

As used herein, "oligonucleotide" refers to a relatively short polynucleotide, including but not limited to single-stranded deoxyribonucleotides, single- or double-stranded ribonucleotides, RNA:DNA hybrids and double-stranded DNA. Oligonucleotides, such as single-stranded DNA probe oligonucleotides, are often synthesized by chemical methods, for example using commercially available automated oligonucleotide synthesizers. However, oligonucleotides can be prepared by a variety of other methods, including in vitro recombinant DNA-mediated techniques and expression of DNA in cells and organisms.

As used herein, "differentially expressed gene", "differential gene expression" and their synonyms, which are used interchangeably, refer to a gene whose expression is activated to a higher or lower level in a subject suffering from a disease, in particular a cancer such as gastric cancer, relative to its expression in a normal or control subject. The terms also include genes whose expression is activated to a higher or lower level at different stages of the same disease. It will also be appreciated that a differentially expressed gene may be activated or inhibited at the nucleic acid level or the protein level, or may undergo alternative splicing to result in a different polypeptide product. Such differences can be demonstrated, for example, by changes in mRNA levels, surface expression, secretion or other partitioning of a polypeptide. Differential gene expression may include a comparison of expression between two or more genes or their gene products, a comparison of the ratios of expression between two or more genes or their gene products, or even a comparison of two differently processed products of the same gene, which differ between normal subjects and subjects suffering from a disease, in particular cancer, or between various stages of the same disease. Differential expression includes both quantitative and qualitative differences in the temporal or cellular expression pattern of a gene or its expression products, for example between normal and diseased cells, or between cells that have undergone different disease events or disease stages. For the purposes of the present invention, "differential gene expression" is considered to be present when the expression of a given gene differs by at least about 2-fold, preferably at least about 4-fold, more preferably at least about 6-fold and most preferably at least about 10-fold between normal and diseased subjects, or between various stages of disease development in a diseased subject.
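
The fold-difference criterion at the end of this definition can be sketched as a small check. The sketch assumes expression levels on a linear (not log) scale and treats "at least about 2-fold" as a plain threshold of 2.0; the function name `is_differentially_expressed` and the `min_fold` parameter are hypothetical.

```python
def is_differentially_expressed(level_a, level_b, min_fold=2.0):
    """Return True when two positive, linear-scale expression levels
    differ by at least `min_fold`, in either direction."""
    if level_a <= 0 or level_b <= 0:
        raise ValueError("expression levels must be positive")
    fold = max(level_a, level_b) / min(level_a, level_b)
    return fold >= min_fold


print(is_differentially_expressed(10.0, 4.0))                # 2.5-fold
print(is_differentially_expressed(10.0, 6.0))                # below 2-fold
print(is_differentially_expressed(1.0, 10.0, min_fold=10.0))
```

Passing `min_fold` of 4.0, 6.0 or 10.0 corresponds to the progressively stricter preferred thresholds.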

The term "normalized" with respect to a gene transcript or gene expression product refers to the level of the transcript or gene expression product relative to the mean level of the transcripts/products of a set of reference genes, wherein the reference genes are either selected based on their minimal variation across patients, tissues or treatments ("housekeeping genes"), or are the totality of the genes tested. In the latter case, generally referred to as "global normalization", it is important that the total number of genes tested be relatively large, preferably greater than 50. Specifically, the term "normalized" with respect to an RNA transcript refers to the transcript level relative to the mean of the transcript levels of a set of reference genes. More specifically, the normalized level of an RNA transcript as measured by TaqMan® RT-PCR refers to its Ct value minus the mean Ct value of a set of reference gene transcripts.
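
The Ct-based normalization in the last sentence amounts to one subtraction, sketched below. The function name `normalized_ct` and the Ct values in the usage line are hypothetical.

```python
def normalized_ct(target_ct, reference_cts):
    """Return the target Ct minus the mean Ct of the reference transcripts."""
    if not reference_cts:
        raise ValueError("at least one reference Ct value is required")
    return target_ct - sum(reference_cts) / len(reference_cts)


# Hypothetical Ct values: one target transcript and three reference genes.
print(normalized_ct(28.0, [20.0, 22.0, 24.0]))
```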

As used herein, "expression threshold" and "defined expression threshold" are used interchangeably and refer to the level of a given gene or gene product above which the gene or gene product serves as a predictive marker for patient response or resistance to a drug. The threshold is typically defined experimentally from clinical studies. The expression threshold may be selected for maximum sensitivity (e.g., to detect all responders to a drug), for maximum selectivity (e.g., to detect only responders to a drug), or for minimum error.
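
As one illustration of the "minimum error" option, the sketch below scans the observed expression levels as candidate cut-offs and keeps the one that misclassifies the fewest patients. This is only one plausible selection procedure, not a method described in the text; the function name and the data are hypothetical, and only observed levels are tried as cut-offs.

```python
def min_error_threshold(levels, responded):
    """Return (cutoff, errors) for the cut-off with the fewest errors.

    A patient is predicted to respond when the expression level is
    strictly above the cut-off.
    """
    if not levels or len(levels) != len(responded):
        raise ValueError("levels and responded must be non-empty and equal in length")
    best_cutoff, best_errors = None, None
    for cutoff in sorted(set(levels)):
        errors = sum(
            (level > cutoff) != bool(outcome)
            for level, outcome in zip(levels, responded)
        )
        if best_errors is None or errors < best_errors:
            best_cutoff, best_errors = cutoff, errors
    return best_cutoff, best_errors


# Hypothetical clinical data: expression level and observed response per patient.
levels = [1.0, 2.0, 3.0, 4.0]
responded = [False, False, True, True]
print(min_error_threshold(levels, responded))
```

Selecting for maximum sensitivity or maximum selectivity would replace the error count with the corresponding criterion.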

As used herein, "gene amplification" refers to the process by which multiple copies of a gene or gene fragment are formed in a particular cell or cell line. The duplicated region (a stretch of amplified DNA) is often referred to as an "amplicon". Often, the amount of messenger RNA (mRNA) produced, i.e., the level of gene expression, also increases in proportion to the number of copies made of the particular gene.

As used herein, "prognosis" refers to the prediction of the likelihood of cancer-attributable death or of progression, including relapse, metastatic spread and drug resistance, of a neoplastic disease such as gastric cancer. The term "prediction" refers to the likelihood that a patient will survive for a certain period of time without cancer recurrence after surgical removal of the primary tumor. The predictive methods of the present invention can be used clinically to make treatment decisions by selecting the most appropriate treatment modality for any particular patient. They are valuable tools for predicting whether a patient is likely to respond favorably to a treatment regimen, for example a surgical procedure, or whether the patient is likely to survive long term after surgery. The term "prognostic indicator" can be used interchangeably with "recurrence score."

As used herein, "long-term" survival refers to survival for at least 3 years, more preferably at least 5 or 8 years, most preferably at least 10 years after surgery or other treatment.

As used herein, "tumor" refers to all neoplastic cell growth and proliferation, whether malignant or benign, and all pre-cancerous and cancerous cells and tissues.

As used herein, "cancer" and "cancerous" describe or refer to physiological conditions in mammals that are typically characterized by unregulated cell growth. Examples of cancer include gastric cancer, breast cancer, colon cancer, lung cancer, prostate cancer, hepatocellular cancer, gastric cancer, pancreatic cancer, cervical cancer, ovarian cancer, liver cancer, bladder cancer, cancer of the urethra, thyroid cancer, kidney cancer, carcinoma, melanoma, or brain cancer But not limited to these.

The "stringency" of a hybridization reaction is readily determined by one of ordinary skill in the art and is generally an empirical calculation that depends on probe length, washing temperature and salt concentration. In general, longer probes require higher temperatures for proper annealing, while shorter probes require lower temperatures. Hybridization generally depends on the ability of denatured DNA to reanneal when complementary strands are present in an environment below their melting temperature. The higher the degree of desired homology between the probe and the hybridizable sequence, the higher the relative temperature that can be used. As a result, higher relative temperatures tend to make the reaction conditions more stringent, while lower temperatures make them less so. For further details and explanation of the stringency of hybridization reactions, see Ausubel et al., Current Protocols in Molecular Biology, Wiley Interscience Publishers (1995).

"Stringent conditions" or "high stringency conditions", as defined herein, typically: (1) employ low ionic strength and high temperature for washing, for example 0.015 M sodium chloride / 0.0015 M sodium citrate / 0.1% sodium dodecyl sulfate at 50 °C; (2) employ a denaturing agent during hybridization, such as formamide, for example 50% (v/v) formamide with 0.1% bovine serum albumin / 0.1% Ficoll / 0.1% polyvinylpyrrolidone / 50 mM sodium phosphate buffer at pH 6.5 with 750 mM sodium chloride and 75 mM sodium citrate at 42 °C; or (3) employ 50% formamide, 5 × SSC (0.75 M NaCl, 0.075 M sodium citrate), 50 mM sodium phosphate (pH 6.8), 0.1% sodium pyrophosphate, 5 × Denhardt's solution, sonicated salmon sperm DNA (50 μg/ml), 0.1% SDS, and 10% dextran sulfate at 42 °C, with washes at 42 °C in 0.2 × SSC (sodium chloride / sodium citrate) and 50% formamide at 55 °C, followed by a high-stringency wash consisting of 0.1 × SSC containing EDTA at 55 °C.

"Moderately stringent conditions" may be identified as described in Sambrook et al., Molecular Cloning: A Laboratory Manual, New York: Cold Spring Harbor Press, 1989, and include the use of washing solutions and hybridization conditions (e.g., temperature, ionic strength and % SDS) less stringent than those described above. An example of moderately stringent conditions is overnight incubation at 37 °C in a solution comprising 20% formamide, 5 × SSC (150 mM NaCl, 15 mM trisodium citrate), 50 mM sodium phosphate (pH 7.6), 5 × Denhardt's solution, 10% dextran sulfate, and 20 mg/ml denatured sheared salmon sperm DNA, followed by washing of the filters in 1 × SSC at about 37-50 °C. Those skilled in the art will recognize how to adjust the temperature, ionic strength, and the like as necessary to accommodate factors such as probe length.

In the context of the present invention, reference to "one or more", "two or more", "five or more", etc., of the genes listed in any particular gene set means any one or any and all combinations of the listed genes.

The practice of the present invention will employ, unless otherwise indicated, conventional techniques of molecular biology (including recombinant techniques), microbiology, cell biology, and biochemistry, which are within the ordinary skill in the art. Such techniques are described in "Molecular Cloning: A Laboratory Manual", 2nd edition (Sambrook et al., 1989); "Oligonucleotide Synthesis" (MJ Gait, ed., 1984); "Animal Cell Culture" (RI Freshney, ed., 1987); "Methods in Enzymology" (Academic Press, Inc.); "Handbook of Experimental Immunology", 4th edition (DM Weir & CC Blackwell, eds., Blackwell Science Inc., 1987); "Gene Transfer Vectors for Mammalian Cells" (JM Miller & MP Calos, eds., 1987); "Current Protocols in Molecular Biology" (FM Ausubel et al., eds., 1987); and "PCR: The Polymerase Chain Reaction" (Mullis et al., eds., 1994).

1. Gene Expression Profiling

Gene expression profiling methods include methods based on hybridization analysis of polynucleotides, methods based on sequencing of polynucleotides, and methods based on proteomics. The most commonly used methods known in the art for the quantification of mRNA expression in a sample are Northern blotting and in situ hybridization (Parker & Barnes, Methods in Molecular Biology 106: 247-283 (1999)); RNAse protection assays (Hod, Biotechniques 13: 852-854 (1992)); and PCR-based methods such as reverse transcription polymerase chain reaction (RT-PCR) (Weis et al., Trends in Genetics 8: 263-264 (1992)). Alternatively, antibodies may be employed that recognize specific duplexes, including DNA duplexes, RNA duplexes, and DNA-RNA hybrid duplexes or DNA-protein duplexes. Representative methods for sequencing-based gene expression analysis include serial analysis of gene expression (SAGE) and gene expression analysis by massively parallel signature sequencing (MPSS).

2. PCR-Based Gene Expression Profiling Methods

a. Reverse Transcriptase PCR (RT-PCR)

One of the most sensitive and most flexible quantitative PCR-based gene expression profiling methods is RT-PCR, which can be used to compare mRNA levels in different sample populations, in normal and tumor tissues, with or without drug treatment, to characterize patterns of gene expression, to discriminate between closely related mRNAs, and to analyze RNA structure.

The first step is the isolation of mRNA from the target sample. The starting material is typically total RNA isolated from human tumors or tumor cell lines and from corresponding normal tissues or cell lines, respectively. Thus, RNA can be isolated from a variety of primary tumors (breast, lung, colon, prostate, brain, liver, kidney, pancreas, spleen, thyroid, testis, ovary, uterus, etc.) or tumor cell lines, with pooled DNA from healthy donors. If the source of mRNA is a primary tumor, the mRNA can be extracted, for example, from frozen or archived paraffin-embedded and fixed (e.g., formalin-fixed) tissue samples.

General methods for mRNA extraction are known in the art and are described in standard textbooks of molecular biology, including Ausubel et al., Current Protocols of Molecular Biology, John Wiley and Sons (1997). Methods of RNA extraction from paraffin-embedded tissue are disclosed, for example, in Rupp and Locker, Lab Invest. 56: A67 (1987) and De Andres et al., BioTechniques 18: 42-44 (1995). In particular, RNA isolation can be performed according to the manufacturer's instructions using purification kits, buffer sets and proteases from commercial manufacturers such as Qiagen. For example, total RNA from cells in culture can be isolated using Qiagen RNeasy mini-columns. Other commercially available RNA isolation kits include the MasterPure™ Complete DNA and RNA Purification Kit (EPICENTRE, Madison, WI) and the Paraffin Block RNA Isolation Kit (Ambion, Inc.). Total RNA from tissue samples can be isolated using RNA Stat-60 (Tel-Test). RNA prepared from tumors can be isolated, for example, by cesium chloride density gradient centrifugation.

Since RNA cannot serve as a template for PCR, the first step in gene expression profiling by RT-PCR is the reverse transcription of the RNA template into cDNA, followed by its exponential amplification in a PCR reaction. The two most commonly used reverse transcriptases are avian myeloblastosis virus reverse transcriptase (AMV-RT) and Moloney murine leukemia virus reverse transcriptase (MMLV-RT). The reverse transcription step is typically primed using specific primers, random hexamers, or oligo-dT primers, depending on the circumstances and the goal of expression profiling. For example, the extracted RNA can be reverse-transcribed using the GeneAmp RNA PCR Kit (Perkin Elmer, California, USA) following the manufacturer's instructions. The derived cDNA can then be used as a template in the subsequent PCR reaction.

Although the PCR step can use a variety of thermostable DNA-dependent DNA polymerases, it typically employs Taq DNA polymerase, which has a 5'-3' nuclease activity but lacks a 3'-5' proofreading endonuclease activity. Thus, TaqMan PCR typically utilizes the 5'-nuclease activity of Taq or Tth polymerase to hydrolyze a hybridization probe bound to its target amplicon, but any enzyme with equivalent 5' nuclease activity can be used. Two oligonucleotide primers are used to generate an amplicon typical of a PCR reaction. A third oligonucleotide, or probe, is designed to detect a nucleotide sequence located between the two PCR primers. The probe is non-extendible by the Taq DNA polymerase enzyme and is labeled with a reporter fluorescent dye and a quencher fluorescent dye. Any laser-induced emission from the reporter dye is quenched by the quencher dye when the two dyes are located close together, as they are on the probe. During the amplification reaction, the Taq DNA polymerase enzyme cleaves the probe in a template-dependent manner. The resulting probe fragments dissociate in solution, and the signal from the released reporter dye is free from the quenching effect of the second fluorophore. One molecule of reporter dye is released for each new molecule synthesized, and detection of the unquenched reporter dye provides the basis for quantitative interpretation of the data.

TaqMan RT-PCR can be performed using commercially available equipment, for example the ABI Prism 7700™ Sequence Detection System™ (Perkin-Elmer-Applied Biosystems, Foster City, CA, USA) or the LightCycler (Roche Molecular Biochemicals, Mannheim, Germany). In a preferred embodiment, the 5' nuclease procedure is run on a real-time quantitative PCR device such as the ABI Prism 7700™ Sequence Detection System™. The system consists of a thermocycler, a laser, a charge-coupled device (CCD) camera and a computer. The system amplifies samples in a 96-well format on the thermocycler. During amplification, the laser-induced fluorescence signal is collected in real time through fiber optic cables for all 96 wells. The system includes software for running the instrument and for analyzing the data.

5'-nuclease assay data is initially expressed as Ct, or threshold cycle. As discussed above, the fluorescence value is recorded every cycle and represents the amount of product amplified to that point in the amplification reaction. The point when the fluorescence signal is first recorded as statistically significant is the threshold cycle (Ct).
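
The determination of Ct can be sketched as follows. This is a simplified illustration, not the algorithm of any instrument software: the significance rule (baseline mean plus `n_sd` standard deviations), the number of baseline cycles and the fluorescence values are assumptions made for the example.

```python
import statistics


def threshold_cycle(fluorescence, baseline_cycles=5, n_sd=10.0):
    """Return the first 1-based cycle whose signal exceeds the cut-off, or None.

    Assumption: the cut-off is the mean of the early (baseline) cycles plus
    n_sd times their sample standard deviation.
    """
    baseline = fluorescence[:baseline_cycles]
    cutoff = statistics.mean(baseline) + n_sd * statistics.stdev(baseline)
    for cycle, signal in enumerate(fluorescence, start=1):
        if cycle > baseline_cycles and signal > cutoff:
            return cycle
    return None


# Hypothetical per-cycle fluorescence readings for one well.
curve = [1.00, 1.02, 0.98, 1.01, 0.99, 1.03, 1.10, 1.40, 2.20, 4.00]
print(threshold_cycle(curve))
```

A lower Ct indicates that more template was present at the start of the reaction.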

In order to minimize errors and the effects of sample-to-sample variation, RT-PCR is usually performed using a reference RNA, which ideally is expressed at a constant level among different tissues and is unaffected by the experimental treatment. The RNAs most frequently used to normalize patterns of gene expression are the mRNAs for the housekeeping genes glyceraldehyde-3-phosphate dehydrogenase (GAPD) and β-actin (ACTB).

A more recent variation of the RT-PCR technique is real-time quantitative PCR, which measures PCR product accumulation through a dual-labeled fluorogenic probe (i.e., a TaqMan probe). Real-time PCR is compatible both with quantitative competitive PCR, in which an internal competitor for each target sequence is used for normalization, and with quantitative comparative PCR, which uses a normalization gene contained within the sample or a housekeeping gene for RT-PCR. For further details, see, e.g., Held et al., Genome Research 6: 986-994 (1996).
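
Normalization against reference RNA can be sketched as a simple delta-Ct calculation. The function name `delta_ct`, the choice to average the two housekeeping genes and the Ct values are assumptions made for illustration only.

```python
def delta_ct(ct_by_gene, reference_genes=("GAPD", "ACTB")):
    """Subtract the mean Ct of the reference genes from each target gene's Ct.

    A smaller delta-Ct means higher expression relative to the reference RNA.
    """
    ref_mean = sum(ct_by_gene[g] for g in reference_genes) / len(reference_genes)
    return {
        gene: ct - ref_mean
        for gene, ct in ct_by_gene.items()
        if gene not in reference_genes
    }


# Hypothetical Ct values measured in one sample.
sample = {"GAPD": 18.0, "ACTB": 20.0, "TOP2A": 25.0, "CAV1": 22.5}
print(delta_ct(sample))
```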

b. MassARRAY System

In the MassARRAY-based gene expression profiling method developed by Sequenom, Inc. (San Diego, Calif.), the cDNA obtained after isolation of RNA and reverse transcription is spiked with a synthetic DNA molecule (competitor), which matches the targeted cDNA region at all positions except a single base and serves as an internal standard. The cDNA/competitor mixture is PCR amplified and subjected to a post-PCR shrimp alkaline phosphatase (SAP) enzyme treatment, which dephosphorylates the remaining nucleotides.

After inactivation of the alkaline phosphatase, the PCR products from the competitor and the cDNA are subjected to primer extension, which generates distinct mass signals for the competitor-derived and cDNA-derived PCR products. After purification, these products are dispensed onto a chip array that is pre-loaded with the components needed for analysis by matrix-assisted laser desorption ionization time-of-flight mass spectrometry (MALDI-TOF MS). The cDNA present in the reaction is then quantified by analyzing the ratio of the peak areas in the resulting mass spectrum. For further details, see, e.g., Ding and Cantor, Proc. Natl. Acad. Sci. USA 100: 3059-3064 (2003).
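
The peak-area quantification can be sketched as below. This is an illustrative simplification: it assumes that cDNA and competitor amplify and extend with equal efficiency, so that the amount of cDNA is proportional to the peak area ratio. The function name and the numbers are hypothetical.

```python
def cdna_amount(cdna_peak_area, competitor_peak_area, competitor_amount):
    """Estimate the cDNA amount from the cDNA/competitor peak area ratio.

    Assumption: equal amplification efficiency for cDNA and competitor, so
    cDNA amount = spiked competitor amount * (cDNA area / competitor area).
    """
    if competitor_peak_area <= 0:
        raise ValueError("competitor peak area must be positive")
    return competitor_amount * cdna_peak_area / competitor_peak_area


# Hypothetical peak areas and a known amount of spiked competitor.
print(cdna_amount(cdna_peak_area=300.0, competitor_peak_area=100.0, competitor_amount=2.0))
```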

c. Other PCR-Based Methods

Additional PCR-based techniques include, for example, differential display (Liang and Pardee, Science 257: 967-971 (1992)); amplified fragment length polymorphism (iAFLP) (Kawamoto et al., Genome Res. 12: 1305-1312 (1999)); BeadArray™ technology (Illumina, San Diego, CA; Oliphant et al., Discovery of Markers for Disease (Supplement to Biotechniques), June 2002; Ferguson et al., Analytical Chemistry 72: 5618 (2000)); BeadsArray for Detection of Gene Expression (BADGE), which uses the commercially available Luminex 100 LabMAP system and multicolor-coded microspheres (Luminex Corp., Austin, Texas) in a rapid assay for gene expression (Yang et al., Genome Res. 11: 1888-1898 (2001)); and high coverage expression profiling (HiCEP) analysis (Fukumura et al., Nucl. Acids. Res. 31 (16) e94 (2003)).

3. Microarray

Differential gene expression can also be determined or confirmed using microarray techniques. Thus, the expression profile of gastric cancer-related genes can be measured in either fresh or paraffin-embedded tumor tissue using microarray technology. In this method, sequences of interest (including cDNAs and oligonucleotides) are plated or arrayed on a microchip substrate. The arrayed sequences are then hybridized with specific DNA probes from the cells or tissues of interest. As in the RT-PCR method, the source of mRNA is typically total RNA isolated from human tumors or tumor cell lines and from corresponding normal tissues or cell lines. Thus, RNA can be isolated from a variety of primary tumors or tumor cell lines. If the source of mRNA is a primary tumor, the mRNA can be extracted, for example, from frozen or archived paraffin-embedded and fixed (e.g., formalin-fixed) tissue samples, which are routinely prepared and preserved in everyday clinical practice.

In a specific embodiment of the microarray technique, PCR-amplified inserts of cDNA clones are applied to a substrate in a dense array. Preferably, 10,000 or more nucleotide sequences are applied to the substrate. The microarrayed genes, immobilized on the microchip at 10,000 elements each, are suitable for hybridization under stringent conditions. Fluorescently labeled cDNA probes can be generated through incorporation of fluorescent nucleotides by reverse transcription of RNA extracted from the tissues of interest. Labeled cDNA probes applied to the chip hybridize with specificity to each spot of DNA on the array. After stringent washing to remove non-specifically bound probes, the chip is scanned by confocal laser microscopy or by another detection method, such as a CCD camera. Quantification of the hybridization of each arrayed element allows assessment of the corresponding mRNA abundance. With dual-color fluorescence, separately labeled cDNA probes generated from two sources of RNA are hybridized pairwise to the array. The relative abundance of the transcripts from the two sources corresponding to each specified gene is thus determined simultaneously. The miniaturized scale of the hybridization affords a convenient and rapid evaluation of the expression pattern for large numbers of genes. Such methods have been shown to have the sensitivity required to detect rare transcripts, which are expressed at a few copies per cell, and to reproducibly detect at least approximately two-fold differences in expression levels (Schena et al., Proc. Natl. Acad. Sci. USA 93 (2): 106-149 (1996)). Microarray analysis can be performed with commercially available equipment according to the manufacturer's protocols, for example using Affymetrix GeneChip technology or Incyte's microarray technology.

The development of microarray methods for large-scale analysis of gene expression makes it possible to search systematically for molecular markers of cancer classification and outcome prediction in a variety of tumor types.

4. Serial Analysis of Gene Expression (SAGE)
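
The dual-color comparison can be sketched as a per-gene log2 ratio with a two-fold cut-off. This is an illustrative sketch only: the gene labels, signal intensities and the function name `log2_ratios` are hypothetical, and background subtraction and dye normalization are omitted.

```python
import math


def log2_ratios(tumor_signal, normal_signal, fold=2.0):
    """Return {gene: (log2 ratio, flagged)} for two-channel intensities.

    A gene is flagged when the absolute log2 ratio reaches log2(fold),
    i.e. when the two sources differ by at least `fold` in either direction.
    """
    cutoff = math.log2(fold)
    result = {}
    for gene, tumor in tumor_signal.items():
        ratio = math.log2(tumor / normal_signal[gene])
        result[gene] = (ratio, abs(ratio) >= cutoff)
    return result


# Hypothetical background-corrected intensities for two genes.
tumor = {"GENE_A": 800.0, "GENE_B": 300.0}
normal = {"GENE_A": 200.0, "GENE_B": 250.0}
print(log2_ratios(tumor, normal))
```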

Serial analysis of gene expression (SAGE) is a method that allows the simultaneous and quantitative analysis of a large number of gene transcripts without the need to provide an individual hybridization probe for each transcript. First, a short sequence tag (about 10-14 bp) is generated that contains sufficient information to uniquely identify a transcript, provided that the tag is obtained from a unique position within each transcript. Then, many tags are linked together to form long serial molecules that can be sequenced, revealing the identity of multiple tags simultaneously.

The expression pattern of any population of transcripts can be quantitatively evaluated by determining the abundance of individual tags and identifying the gene corresponding to each tag. For more details, see, e.g., Velculescu et al., Science 270: 484-487 (1995) and Velculescu et al., Cell 88: 243-51 (1997).
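
The tag-counting step can be sketched as follows. This is a deliberately simplified illustration: it assumes fixed-length tags joined directly end to end, whereas real concatemers contain ditags separated by restriction sites. The tag sequences and the tag-to-gene lookup are hypothetical.

```python
from collections import Counter


def count_tags(concatemer, tag_length=10):
    """Split a sequenced concatemer into fixed-length tags and count them."""
    tags = [
        concatemer[i:i + tag_length]
        for i in range(0, len(concatemer) - tag_length + 1, tag_length)
    ]
    return Counter(tags)


def abundance_by_gene(tag_counts, tag_to_gene):
    """Map tag counts to genes; tags without a known gene are skipped."""
    genes = Counter()
    for tag, n in tag_counts.items():
        if tag in tag_to_gene:
            genes[tag_to_gene[tag]] += n
    return genes


# Hypothetical concatemer made of three 10-bp tags (two identical).
read = "A" * 10 + "C" * 10 + "A" * 10
lookup = {"A" * 10: "geneX", "C" * 10: "geneY"}
print(abundance_by_gene(count_tags(read), lookup))
```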

5. Gene expression analysis by massively parallel signature sequencing (MPSS)

This method, described by Brenner et al., Nature Biotechnology 18: 630-634 (2000), is a sequencing approach that combines non-gel-based signature sequencing with in vitro cloning of millions of templates on separate 5 μm diameter microbeads. First, a microbead library of DNA templates is constructed by in vitro cloning. This is followed by the assembly of a planar array of the template-containing microbeads in a flow cell at high density (typically greater than 3 × 10⁶ microbeads/cm²). The free ends of the cloned templates on each microbead are analyzed simultaneously using a fluorescence-based signature sequencing method that does not require DNA fragment separation. This method has been shown to provide, simultaneously and accurately in a single operation, hundreds of thousands of gene signature sequences from a yeast cDNA library.

6. Immunohistochemistry

Immunohistochemical methods are also suitable for detecting the expression of the prognostic markers of the present invention. Thus, expression is detected using antibodies or antisera, preferably polyclonal antisera and most preferably monoclonal antibodies, specific for each marker. The antibodies can be detected by direct labeling of the antibodies themselves, for example with radioactive labels, fluorescent labels, hapten labels such as biotin, or enzymes such as horseradish peroxidase or alkaline phosphatase. Alternatively, an unlabeled primary antibody is used in combination with a labeled secondary antibody comprising an antiserum, polyclonal antiserum or monoclonal antibody specific for the primary antibody. Immunohistochemistry protocols and kits are known in the art and are commercially available.

7. Proteomics

The term "proteome" is defined as the totality of the proteins present in a sample (e.g., a tissue, organism or cell culture) at a certain point in time. Proteomics includes, among other things, the study of global changes in protein expression in a sample (also called "expression proteomics"). Proteomics typically includes the following steps: (1) separation of the individual proteins in a sample by 2-D gel electrophoresis (2-D PAGE); (2) identification of the individual proteins recovered from the gel, e.g., by mass spectrometry or N-terminal sequencing; and (3) analysis of the data using bioinformatics.

Proteomics methods are valuable supplements to other methods of gene expression profiling and can be used alone or in combination with other methods to detect the products of the prognostic markers of the present invention.

8. General description of mRNA isolation, purification and amplification

The steps of a representative protocol for profiling gene expression using fixed, paraffin-embedded tissue as the RNA source, including mRNA isolation, purification, primer extension and amplification, are described in various published journal articles (e.g., TE Godfrey et al., J. Molec. Diagnostics 2: 84-91 (2000) and K. Specht et al., Am. J. Pathol. 158: 419-29 (2001)). In summary, a representative method begins with cutting sections about 10 μm thick from paraffin-embedded tumor tissue samples. The RNA is then extracted, and protein and DNA are removed. After analysis of the RNA concentration, RNA repair and/or amplification steps may be included if necessary, and the RNA is reverse transcribed using gene-specific primers, followed by RT-PCR. Finally, the data are analyzed to identify the best treatment option(s) available to the patient on the basis of the characteristic gene expression pattern identified in the tumor sample examined.

An important aspect of the present invention is the use of the measured expression of specific genes in gastric cancer tissue to provide prognostic information. For this purpose, it is necessary to correct for (standardize) differences in the amount of RNA assayed, variability in the quality of the RNA used, and other factors, such as machine and operator differences. Therefore, the assay typically measures and incorporates the use of reference RNAs, including those transcribed from well-known housekeeping genes such as GAPD and ACTB. A precise method for standardizing gene expression data is given in "User Bulletin #2" for the ABI PRISM 7700 Sequence Detection System (Applied Biosystems; 1997). Alternatively, normalization can be based on the mean or median signal (Ct) of all of the assayed genes or a large subset thereof (global normalization approach). In the studies described in the Examples below, a so-called central standardization strategy was used, in which a subset of the screened genes, selected on the basis of lack of correlation with clinical outcome, was used for standardization.
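
The global and subset-based strategies can be sketched with one helper. This is a minimal illustration, not the procedure of User Bulletin #2: the function name `normalize_ct`, the gene labels and the Ct values are hypothetical, and the selection of the reference subset (genes lacking correlation with clinical outcome) is assumed to have been done beforehand.

```python
import statistics


def normalize_ct(ct_by_gene, reference_subset=None, center=statistics.mean):
    """Subtract a reference Ct from every gene's Ct.

    reference_subset=None  -> reference is the centre of all assayed genes.
    reference_subset=[...] -> reference is the centre of that subset only.
    `center` may be statistics.mean or statistics.median.
    """
    genes = list(ct_by_gene) if reference_subset is None else list(reference_subset)
    reference = center([ct_by_gene[g] for g in genes])
    return {gene: ct - reference for gene, ct in ct_by_gene.items()}


# Hypothetical Ct values for three assayed genes in one patient.
ct = {"G1": 20.0, "G2": 24.0, "G3": 28.0}
print(normalize_ct(ct))                                  # all genes, mean
print(normalize_ct(ct, reference_subset=["G1", "G2"]))   # selected subset
```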

9. miRNA Profiling

cDNA synthesized from the RNA of a sample is prepared using multiplex RT for the TaqMan Array Human MicroRNA Panel or a TaqMan low-density array (Applied Biosystems, Foster City, Calif.).

10. Recurrence Score and Its Application

The algorithm that characterizes the method for prognosing the likelihood of gastric cancer recurrence is defined by 1) the unique set of test mRNAs (or the corresponding gene expression products) used to measure recurrence, 2) the specific weights used to combine the expression data, and 3) the thresholds used to divide patients into groups with different levels of risk, such as low, medium, and high risk groups. This algorithm yields a numerical recurrence score (RS).

The test requires a laboratory assay to determine the levels of the specified mRNAs or their expression products, but it can use very small amounts of fresh tissue, frozen tissue, or fixed, paraffin-embedded tumor biopsy specimens that have already been collected from patients and archived. Thus, the test can be non-invasive. It is also compatible with several different methods of tumor tissue harvest, for example core biopsy or fine needle aspiration.

According to this method, the cancer recurrence score (RS) is determined by:

(a) generating a gene or protein expression profile from a biological sample comprising cancer cells obtained from the subject;

(b) quantifying the expression levels (i.e., mRNA or protein levels) of a plurality of individual genes to determine an expression value for each gene;

(c) generating subsets of the gene expression values, each subset comprising expression values for genes linked by a cancer-related biological function and/or by co-expression;

(d) multiplying the expression level of each gene within a subset by a coefficient reflecting its relative contribution to cancer recurrence or therapy response within that subset, and adding the products to yield a value for the subset;

(e) multiplying the value of each subset by a coefficient reflecting its contribution to cancer recurrence or therapy response; and

(f) summing the values for each subset multiplied by the coefficients to obtain a recurrence score (RS), wherein the contribution of each subset that does not show a linear correlation with cancer recurrence is included only above a predetermined threshold value,

wherein subsets in which increased expression of the specified genes reduces the risk of cancer recurrence are assigned a negative value, and subsets in which increased expression of the specified genes increases the risk of cancer recurrence are assigned a positive value.
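
Steps (d) to (f) can be sketched as a weighted sum over gene subsets. This is an illustrative sketch, not the scoring formula of the specific embodiments: the gene labels, coefficients, weights and threshold are hypothetical, and reading "included only above a predetermined threshold" as "skip the subset unless its value exceeds the threshold" is one possible interpretation.

```python
def subset_value(expression, gene_coefficients):
    """Step (d): weighted sum of the expression values within one subset."""
    return sum(coef * expression[gene] for gene, coef in gene_coefficients.items())


def recurrence_score(expression, subsets):
    """Steps (e) and (f): weight each subset value and sum the contributions.

    Each subset is a dict with:
      "genes":     {gene: coefficient within the subset}
      "weight":    subset coefficient (negative if the subset lowers risk)
      "threshold": optional; if given, the subset contributes only when its
                   value exceeds this threshold (assumed interpretation).
    """
    rs = 0.0
    for subset in subsets:
        value = subset_value(expression, subset["genes"])
        threshold = subset.get("threshold")
        if threshold is not None and value <= threshold:
            continue
        rs += subset["weight"] * value
    return rs


# Hypothetical normalized expression values and subset definitions.
expression = {"GENE_A": 2.0, "GENE_B": 1.0, "GENE_C": 3.0}
subsets = [
    {"genes": {"GENE_A": 1.0, "GENE_B": 0.5}, "weight": 1.0},
    {"genes": {"GENE_C": 1.0}, "weight": -0.5, "threshold": 4.0},
]
print(recurrence_score(expression, subsets))
```

Patients would then be assigned to risk groups by comparing the resulting RS with the chosen cut-offs.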

In a specific embodiment, RS is determined by:

(a) measuring the expression level of at least one RNA transcript selected from the group consisting of FZD1, GLI3, ANGPTL7, ABL1, SMARCD3, ILK, CAV1, VIP, HSPB7, TOP2A and FANCD2, and of at least one miRNA selected from the group consisting of hsa-miR-933, hsa-miR-184, hsa-miR-380*, hsa-miR-190b, hsa-miR-27a* and hsa-miR-1201; and

(b) calculating the recurrence score (RS) by the following equation:

[Equation 1]

where

HR_n represents the hazard ratio of the nth RNA transcript or microRNA,

and wherein the prognosis is determined to be bad if the RS value is positive and good if the RS value is negative.

In a specific embodiment, RS is determined by:

(a) measuring the expression level of at least one protein selected from the group consisting of Akt pS473, PAI, SMAD3, P70 S6K and EGFR2; and

(b) calculating the recurrence score (RS) by Equation 2:

[Equation 2]

where

HR_n represents the hazard ratio of the nth functional protein,

and wherein the prognosis is determined to be bad if the RS value is greater than 0 and good if the RS value is less than 0.

Further details of the invention will be described in the following non-limiting examples.

Example 1: Prognosis Prediction for Gastric Cancer of All Stages Based on RNA Transcripts

RT-PCR was performed on frozen tissue sections, stored in the tissue bank, from patients who underwent gastric cancer surgery (n = 332) at Severance Hospital, Yonsei University Medical Center, from 2000 to 2004. The levels of 113 mRNA species, representing the products of candidate cancer-related genes selected from the biomedical research literature, were measured by RT-PCR.

mRNA was extracted using the MasterPure™ RNA Purification Kit (Epicentre Technologies) and quantified by the RiboGreen fluorescence method (Molecular Probes). Molecular assays of quantitative gene expression were performed using the ABI Prism 7900™ Sequence Detection System™ (Perkin-Elmer-Applied Biosystems, Foster City, CA, USA). The ABI Prism 7900™ consists of a thermocycler, a laser, a charge-coupled device (CCD) camera and a computer. The system amplifies samples in a 384-well format on the thermocycler. During amplification, the laser-induced fluorescence signal is collected in real time for all 384 wells and detected at the CCD. The system includes software for running the instrument and for analyzing the data.

Tumor tissue was analyzed for 384 genes. Threshold cycle (CT) values for each patient were normalized based on the mean of one subset of screened genes for a particular patient, selected based on lack of correlation with clinical outcome (central standardization strategy).

Table 1

Parametric p-value	Hazard Ratio	Unique id	Gene symbol	UGCluster	Name
0.000492	0.304	ILMN_1768856	CHAT	Hs.302002	choline O-acetyltransferase
9.25E-05	0.367	ILMN_1676731	C17orf65	Hs.656564	chromosome 17 open reading frame 65
0.000122	0.37	ILMN_1783910	TRAF6	Hs.591983	TNF receptor-associated factor 6
2.00E-06	0.374	ILMN_1738207	CISH	Hs.655334	cytokine inducible SH2-containing protein
0.000116	0.386	ILMN_1693072	ELAC1	Hs.657360	elaC homolog 1 (E. coli)
0.000579	0.399	ILMN_1689162	ACTR8	Hs.412186	ARP8 actin-related protein 8 homolog (yeast)
1.31E-05	0.401	ILMN_1741976	SMARCAD1	Hs.410406	SWI/SNF-related, matrix-associated actin-dependent regulator of chromatin, subfamily a, containing DEAD/H box 1
0.000126	0.411	ILMN_1697670	SRRM1	Hs.18192	serine/arginine repetitive matrix 1
0.000976	0.419	ILMN_2268026	C15orf44	Hs.6686	chromosome 15 open reading frame 44
5.90E-05	0.425	ILMN_2404629	EFTUD1	Hs.459114	elongation factor Tu GTP binding domain containing 1
7.67E-05	0.434	ILMN_1693145	BUB3	Hs.418533	budding uninhibited by benzimidazoles 3 homolog (yeast)
0.00018	0.438	ILMN_1795704	KIAA0232	Hs.79276	KIAA0232
9.61E-05	0.442	ILMN_1750092	SEPSECS	Hs.253305	Sep (O-phosphoserine) tRNA:Sec (selenocysteine) tRNA synthase
9.30E-06	0.451	ILMN_1753440	DCAF16	Hs.614787	DDB1 and CUL4 associated factor 16
2.99E-05	0.454	ILMN_1688953	ARHGAP19	Hs.80305	Rho GTPase activating protein 19
0.000137	0.459	ILMN_1684802	TAF5	Hs.96103	TAF5 RNA polymerase II, TATA box binding protein (TBP)-associated factor, 100 kDa
0.000336	0.463	ILMN_2083833	CNOT6L	Hs.592519	CCR4-NOT transcription complex, subunit 6-like
0.00075	0.464	ILMN_1777066	NIF3L1	Hs.145284	NIF3 NGG1 interacting factor 3-like 1 (S. pombe)
6.30E-06	0.468	ILMN_1729546	C19orf54	Hs.585105	chromosome 19 open reading frame 54
1.02E-05	0.47	ILMN_1741780	DUSP28	Hs.369297	dual specificity phosphatase 28
1.92E-05	0.471	ILMN_2334587	HNRNPC	Hs.508848	heterogeneous nuclear ribonucleoprotein C (C1/C2)
7.28E-05	0.472	ILMN_1665164	CTR9	Hs.725151	Ctr9, Paf1/RNA polymerase II complex component, homolog (S. cerevisiae)
0.000517	0.479	ILMN_1657064	C6orf70	Hs.47546	chromosome 6 open reading frame 70
9.70E-06	0.48	ILMN_2409318	RCCD1	Hs.655895	RCC1 domain containing 1
0.000455	0.48	ILMN_1738971	USP54	Hs.657355	ubiquitin specific peptidase 54
0.000378	0.481	ILMN_1724062	LIN54	Hs.96952	lin-54 homolog (C. elegans)
0.000949	0.486	ILMN_1682724	FANCF	Hs.713574	Fanconi anemia, complementation group F
0.000171	0.487	ILMN_2412549	GAR1	Hs.69851	GAR1 ribonucleoprotein homolog (yeast)
0.000889	0.487	ILMN_1662719	GPBP1L1	Hs.725955	GC-rich promoter binding protein 1-like 1
5.87E-05	0.488	ILMN_2383774	TRAF3	Hs.510528	TNF receptor-associated factor 3
0.000405	0.488	ILMN_1847822	KIAA0368	Hs.368255	KIAA0368
0.000706	0.488	ILMN_1768640	CRNKL1	Hs.171342	crooked neck pre-mRNA splicing factor-like 1 (Drosophila)
0.000257	0.49	ILMN_1722742	SCLY	Hs.709612	selenocysteine lyase
0.000276	0.49	ILMN_1793203	SMCR7L	Hs.714252	Smith-Magenis syndrome chromosome region, candidate 7-like
0.000725	0.492	ILMN_2312386	PAIP1	Hs.482038	poly(A) binding protein interacting protein 1
0.000632	0.495	ILMN_1798827	SRBD1	Hs.14229	S1 RNA binding domain 1
0.000394	0.496	ILMN_2323774	RPAIN	Hs.462086	RPA interacting protein
2.20E-05	0.497	ILMN_2399622	AP1G1	Hs.461253	adaptor-related protein complex 1, gamma 1 subunit
0.000541	0.499	ILMN_1693226	C1orf212	Hs.27160	chromosome 1 open reading frame 212
0.0003	0.503	ILMN_2169089	C18orf54	Hs.208701	chromosome 18 open reading frame 54
0.000407	0.504	ILMN_1686454	TIFA	Hs.310640	TRAF-interacting protein with forkhead-associated domain
0.00017	0.505	ILMN_1727041	EWSR1	Hs.374477	Ewing sarcoma breakpoint region 1
0.000294	0.506	ILMN_1776552	FUBP1	Hs.567380	far upstream element (FUSE) binding protein 1
0.000667	0.506	ILMN_1776153	AGGF1	Hs.634849	angiogenic factor with G patch and FHA domains 1
0.000488	0.507	ILMN_1651886	CWF19L1	Hs.215502	CWF19-like 1, cell cycle control (S. pombe)
0.000387	0.508	ILMN_1806825	C14orf145	Hs.162889	chromosome 14 open reading frame 145
0.000149	0.51	ILMN_1730077	RPUSD2	Hs.173311	RNA pseudouridylate synthase domain containing 2
1.46E-05	0.511	ILMN_2411190	SMC2	Hs.119023	structural maintenance of chromosomes 2
0.000576	0.512	ILMN_2199676	CEP152	Hs.443005	centrosomal protein 152kDa
0.000468	0.513	ILMN_1734826	NUP88	Hs.584784	nucleoporin 88kDa
0.000652	0.518	ILMN_1787326	SNORA65	Hs.656353	small nucleolar RNA, H/ACA box 65
0.000272	0.523	ILMN_1749821	MED28	Hs.434075	mediator complex subunit 28
0.000558	0.523	ILMN_2217935	RFC1	Hs.507475	replication factor C (activator 1) 1, 145kDa
4.96E-05	0.525	ILMN_1771593	RRM1	Hs.445705	ribonucleotide reductase M1
0.000376	0.527	ILMN_1777584	KARS	Hs.3100	lysyl-tRNA synthetase
3.75E-05	0.528	ILMN_1678833	CCR1	Hs.301921	chemokine (CC motif) receptor 1
0.000325	0.532	ILMN_1669842	CHAF1A	Hs.79018	chromatin assembly factor 1, subunit A (p150)
9.00E-07	0.534	ILMN_2043060	PLCH1	Hs.567423	phospholipase C, eta 1
0.000707	0.534	ILMN_1751362	FASTKD1	Hs.529276	FAST kinase domains 1
0.000817	0.534	ILMN_1740351	KIAA0174	Hs.232194	KIAA0174
0.000267	0.536	ILMN_1658678	SAAL1	Hs.591998	serum amyloid A-like 1
0.000433	0.538	ILMN_1655414	TNFSF14	Hs.129708	tumor necrosis factor (ligand) superfamily, member 14
5.30E-06	0.54	ILMN_1700671	ETV7	Hs.272398	ets variant 7
0.000723	0.54	ILMN_2358041	NBN	Hs.492208	nibrin
0.000653	0.541	ILMN_1813344	C20orf7	Hs.472165	chromosome 20 open reading frame 7
0.000838	0.544	ILMN_2209766	RHBDD1	Hs.471514	rhomboid domain containing 1
0.000889	0.544	ILMN_1805985	ANKRD32	Hs.657315	ankyrin repeat domain 32
0.000531	0.553	ILMN_2237746	ING3	Hs.489811	inhibitor of growth family, member 3
0.000731	0.557	ILMN_2395055	ATPAF1	Hs.100874	ATP synthase mitochondrial F1 complex assembly factor 1
1.97E-05	0.559	ILMN_1783676	CCDC15	Hs.287555	coiled-coil domain containing 15
0.000325	0.561	ILMN_2316104	IQCB1	Hs.604110	IQ motif containing B1
0.00062	0.561	ILMN_1726520	TDP1	Hs.209945	tyrosyl-DNA phosphodiesterase 1
9.00E-07	0.563	ILMN_1739756	KIR2DL4	Hs.651287	killer cell immunoglobulin-like receptor, two domains, long cytoplasmic tail, 4
0.000383	0.564	ILMN_1750052	NOP14	Hs.627133	NOP14 nucleolar protein homolog (yeast)
0.000417	0.564	ILMN_1744959	NFX1	Hs.413074	nuclear transcription factor, X-box binding 1
0.000835	0.564	ILMN_1781468	SMAP2	Hs.15200	small ArfGAP2
0.00094	0.564	ILMN_2400644	SRGAP3	Hs.654743	SLIT-ROBO Rho GTPase activating protein 3
2.00E-07	0.565	ILMN_1667232	KIR2DL3	Hs.654605	killer cell immunoglobulin-like receptor, two domains, long cytoplasmic tail, 3
0.000141	0.568	ILMN_2339006	KIAA0564	Hs.368282	KIAA0564
7.44E-05	0.571	ILMN_1690420	GFI1	Hs.73172	growth factor independent 1 transcription repressor
0.00041	0.571	ILMN_2055760	KIAA1715	Hs.209561	KIAA1715
0.000309	0.572	ILMN_1718309	COX15	Hs.28326	COX15 homolog, cytochrome c oxidase assembly protein (yeast)
0.000125	0.576	ILMN_1680782	PATL1	Hs.591960	protein associated with topoisomerase II homolog 1 (yeast)
1.99E-05	0.577	ILMN_1754149	LETMD1	Hs.655272	LETM1 domain containing 1
8.06E-05	0.578	ILMN_1661809	PRRG4	Hs.471695	proline rich Gla (G-carboxyglutamic acid) 4 (transmembrane)
0.000694	0.583	ILMN_1751075	SETD4	Hs.606200	SET domain containing 4
0.000121	0.585	ILMN_2052717	GRAMD1C	Hs.24583	GRAM domain containing 1C
0.000588	0.591	ILMN_2385097	NDRG3	Hs.437338	NDRG family member 3
0.00065	0.591	ILMN_1695640	PTPN22	Hs.535276	protein tyrosine phosphatase, non-receptor type 22 (lymphoid)
8.71E-05	0.592	ILMN_1678054	TRIM21	Hs.532357	tripartite motif-containing 21
4.60E-06	0.593	ILMN_1815134	PI4K2B	Hs.726376	phosphatidylinositol 4-kinase type 2 beta
0.000197	0.593	ILMN_1734096	DCLRE1A	Hs.1560	DNA cross-link repair 1A
0.000401	0.594	ILMN_1660856	ALG11	Hs.512963	asparagine-linked glycosylation 11, alpha-1,2-mannosyltransferase homolog (yeast)
8.53E-05	0.596	ILMN_1796682	PARP3	Hs.271742	poly (ADP-ribose) polymerase family, member 3
0.000285	0.598	ILMN_2059357	KLRC2	Hs.591157	killer cell lectin-like receptor subfamily C, member 2
0.000692	0.598	ILMN_1736077	LIAS	Hs.550502	lipoic acid synthetase
0.000171	0.602	ILMN_2395236	CHEK2	Hs.291363	CHK2 checkpoint homolog (S. pombe)
0.000313	0.604	ILMN_1758629	DONSON	Hs.436341	downstream neighbor of SON
0.000751	0.604	ILMN_2101375	CCDC77	Hs.631656	coiled-coil domain containing 77
0.000284	0.608	ILMN_1717207	MMP25	Hs.654979	matrix metallopeptidase 25
0.000417	0.609	ILMN_1733390	LARP1B	Hs.657067	La ribonucleoprotein domain family, member 1B
2.40E-06	0.61	ILMN_1657631	STAP2	Hs.194385	signal transducing adapter family member 2
0.000723	0.61	ILMN_1812759	GCH1	Hs.86724	GTP cyclohydrolase 1
9.61E-05	0.611	ILMN_2176251	C20orf72	Hs.320823	chromosome 20 open reading frame 72
0.000402	0.617	ILMN_1670302	HK3	Hs.411695	hexokinase 3 (white cell)
0.000222	0.619	ILMN_1709772	SNX5	Hs.316890	sorting nexin 5
0.00063	0.621	ILMN_2391512	NAAA	Hs.437365	N-acylethanolamine acid amidase
0.000114	0.628	ILMN_1797988	KLRD1	Hs.562457	killer cell lectin-like receptor subfamily D, member 1
6.58E-05	0.633	ILMN_1721762	IL18RAP	Hs.158315	interleukin 18 receptor accessory protein
3.74E-05	0.638	ILMN_2390299	PSMB8	Hs.180062	proteasome (prosome, macropain) subunit, beta type, 8 (large multifunctional peptidase 7)
0.000232	0.641	ILMN_1726659	THOP1	Hs.78769	thimet oligopeptidase 1
0.000284	0.647	ILMN_1722158	CASP5	Hs.213327	caspase 5, apoptosis-related cysteine peptidase
0.000718	0.649	ILMN_2078697	ALPK1	Hs.652825	alpha-kinase 1
6.22E-05	0.651	ILMN_1745034	SLC11A2	Hs.505545	solute carrier family 11 (proton-coupled divalent metal ion transporters), member 2
0.00052	0.651	ILMN_1683026	PSMB10	Hs.9661	proteasome (prosome, macropain) subunit, beta type, 10
0.000606	0.655	ILMN_2148796	MND1	Hs.294088	meiotic nuclear divisions 1 homolog (S. cerevisiae)
0.000771	0.656	ILMN_1758728	FANCG	Hs.591084	Fanconi anemia, complementation group G
0.000756	0.658	ILMN_1758811	IMPA1	Hs.656694	inositol(myo)-1(or 4)-monophosphatase 1
0.000817	0.66	ILMN_2203588	MYL5	Hs.410970	myosin, light chain 5, regulatory
0.000445	0.662	ILMN_1810228	TTF2	Hs.486818	transcription termination factor, RNA polymerase II
0.000566	0.662	ILMN_1874530	DIAPH3	Hs.283127	diaphanous homolog 3 (Drosophila)
4.76E-05	0.663	ILMN_1690241	BATF2	Hs.124840	basic leucine zipper transcription factor, ATF-like 2
0.000225	0.671	ILMN_1740633	PRF1	Hs.2200	perforin 1 (pore forming protein)
0.000926	0.671	ILMN_1687107	RFWD3	Hs.567525	ring finger and WD repeat domain 3
0.000696	0.673	ILMN_1802708	BTN3A1	Hs.191510	butyrophilin, subfamily 3, member A1
0.000256	0.683	ILMN_2235137	FANCD2	Hs.208388	Fanconi anemia, complementation group D2
0.000872	0.685	ILMN_1758939	RIPK2	Hs.103755	receptor-interacting serine-threonine kinase 2
0.000407	0.694	ILMN_2183856	TSPAN6	Hs.43233	tetraspanin 6
3.23E-05	0.698	ILMN_2207291	IFNG	Hs.856	interferon, gamma
0.000204	0.699	ILMN_1711005	CDC25A	Hs.437705	cell division cycle 25 homolog A (S. pombe)
0.000567	0.703	ILMN_1674640	CXCR6	Hs.34526	chemokine (CXC motif) receptor 6
1.08E-05	0.718	ILMN_1700831	SLC27A2	Hs.720807	solute carrier family 27 (fatty acid transporter), member 2
7.49E-05	0.719	ILMN_1660973	GAD1	Hs.420036	glutamate decarboxylase 1 (brain, 67kDa)
0.000938	0.72	ILMN_1658607	DLEU2	Hs.547964	deleted in lymphocytic leukemia 2 (non-protein coding)
0.000598	0.721	ILMN_1683178	JAK2	Hs.656213	Janus kinase 2
0.000262	0.722	ILMN_1792538	CD7	Hs.36972	CD7 molecule
0.000633	0.722	ILMN_1787345	FKBP11	Hs.655103	FK506 binding protein 11, 19 kDa
3.05E-05	0.732	ILMN_1778010	IL32	Hs.943	interleukin 32
0.000236	0.733	ILMN_2285375	SORD	Hs.878	sorbitol dehydrogenase
0.000322	0.733	ILMN_1751079	TAP1	Hs.352018	transporter 1, ATP-binding cassette, sub-family B (MDR/TAP)
0.000123	0.748	ILMN_1790692	GNLY	Hs.105806	granulysin
0.000897	0.75	ILMN_1710740	C2	Hs.408903	complement component 2
1.00E-05	0.764	ILMN_2109489	GZMB	Hs.1051	granzyme B (granzyme 2, cytotoxic T-lymphocyte-associated serine esterase 1)
0.00068	0.809	ILMN_1676413	VSNL1	Hs.444212	visinin-like 1
0.000722	0.809	ILMN_2114568	GBP5	Hs.513726	guanylate binding protein 5

TABLE 2

Parametric p-value	Hazard Ratio	Gene symbol	UGCluster	Name
5.00E-07	1.513	NOV	Hs.235935	nephroblastoma overexpressed gene
6.00E-07	1.492	ARHGAP23	Hs.374446	Rho GTPase activating protein 23
1.90E-06	1.992	ITGB5	Hs.536663	integrin, beta 5
3.10E-06	1.826	OLFM1	Hs.522484	olfactomedin 1
5.70E-06	1.221	ACTG2	Hs.516105	actin, gamma 2, smooth muscle, enteric
6.20E-06	1.212	TAGLN	Hs.410977	transgelin
6.90E-06	1.403	GJA1	Hs.74471	gap junction protein, alpha 1, 43kDa
9.10E-06	1.194	MYH11	Hs.460109	myosin, heavy chain 11, smooth muscle
9.20E-06	1.29	PDK4	Hs.8364	pyruvate dehydrogenase kinase, isozyme 4
9.20E-06	1.243	CRYAB	Hs.53454	crystallin, alpha B
9.70E-06	1.545	CHST3	Hs.158304	carbohydrate (chondroitin 6) sulfotransferase 3
1.01E-05	1.558	VCL	Hs.643896	vinculin
1.14E-05	1.414	RHOB	Hs.502876	ras homolog gene family, member B
1.17E-05	1.348	LOXL4	Hs.306814	lysyl oxidase-like 4
1.39E-05	1.296	LEPREL1	Hs.374191	leprecan-like 1
1.68E-05	1.417	CRIP2	Hs.534309	cysteine-rich protein 2
1.86E-05	1.24	MGP	Hs.365706	matrix Gla protein
1.97E-05	1.433	GSN	Hs.522373	gelsolin
2.21E-05	1.163	SCRG1	Hs.7122	stimulator of chondrogenesis 1
2.49E-05	1.145	DES	Hs.594952	desmin
2.82E-05	1.295	TPM1	Hs.133892	tropomyosin 1 (alpha)
3.15E-05	1.164	THBS4	Hs.211426	thrombospondin 4
3.55E-05	1.247	SPON1	Hs.643864	spondin 1, extracellular matrix protein
3.63E-05	1.337	SLCO2A1	Hs.518270	solute carrier organic anion transporter family, member 2A1
3.68E-05	1.147	CNN1	Hs.465929	calponin 1, basic, smooth muscle
3.81E-05	1.336	HSPA2	Hs.432648	heat shock 70kDa protein 2
3.98E-05	1.31	CSRP1	Hs.108080	cysteine and glycine-rich protein 1
4.57E-05	1.673	HSPB1	Hs.520973	heat shock 27kDa protein 1
5.12E-05	1.232	HSPB8	Hs.400095	heat shock 22kDa protein 8
5.24E-05	1.387	EFHD1	Hs.516769	EF-hand domain family, member D1
5.27E-05	1.519	TCEAL4	Hs.194329	transcription elongation factor A (SII)-like 4
5.42E-05	1.29	TPM2	Hs.300772	tropomyosin 2 (beta)
5.70E-05	1.507	RRAS	Hs.515536	related RAS viral (r-ras) oncogene homolog
5.72E-05	1.31	C20orf103	Hs.22920	chromosome 20 open reading frame 103
5.86E-05	1.501	EMCN	Hs.152913	endomucin
5.95E-05	1.262	CHRDL2	Hs.432379	chordin-like 2
6.05E-05	1.425	C5orf13	Hs.36053	chromosome 5 open reading frame 13
6.32E-05	1.164	SYNM	Hs.207106	synemin, intermediate filament protein
6.39E-05	1.329	C10orf10	Hs.93675	chromosome 10 open reading frame 10
6.78E-05	1.178	MYLK	Hs.477375	myosin light chain kinase
7.07E-05	1.45	COL18A1	Hs.517356	collagen, type XVIII, alpha 1
7.31E-05	1.307	SHISA2	Hs.433791	shisa homolog 2 (Xenopus laevis)
7.39E-05	1.262	CALD1	Hs.490203	caldesmon 1
7.61E-05	1.146	C2orf40	Hs.43125	chromosome 2 open reading frame 40
7.97E-05	1.238	MATN2	Hs.189445	matrilin 2
8.36E-05	1.348	AQP1	Hs.76152	aquaporin 1 (Colton blood group)
8.46E-05	1.398	LPHN2	Hs.24212	latrophilin 2
9.13E-05	1.275	TYRP1	Hs.270279	tyrosinase-related protein 1
9.38E-05	1.428	TUBB6	Hs.193491	tubulin, beta 6
9.46E-05	1.867	EDNRB	Hs.82002	endothelin receptor type B
9.62E-05	1.214	PDLIM3	Hs.442702	PDZ and LIM domain 3
0.0001076	1.379	RHOJ	Hs.656339	ras homolog gene family, member J
0.0001104	1.653	ACOT1	Hs.568046	acyl-CoA thioesterase 1
0.0001121	1.316	SVIL	Hs.499209	supervillin
0.0001131	1.375	COL4A2	Hs.508716	collagen, type IV, alpha 2
0.0001173	1.184	FHL1	Hs.435369	four and a half LIM domains 1
0.0001186	1.202	PPP1R3C	Hs.303090	protein phosphatase 1, regulatory (inhibitor) subunit 3C
0.0001195	1.212	GREM1	Hs.40098	gremlin 1
0.0001199	1.467	PTPRM	Hs.49774	protein tyrosine phosphatase, receptor type, M
0.0001234	1.316	SSPN	Hs.183428	sarcospan (Kras oncogene-associated gene)
0.0001234	1.25	ANXA8	Hs.535306	annexin A8
0.0001238	1.229	MSRB3	Hs.339024	methionine sulfoxide reductase B3
0.0001335	1.238	SPARCL1	Hs.62886	SPARC-like 1 (hevin)
0.0001349	1.324	OMD	Hs.94070	osteomodulin
0.0001365	1.267	COL8A1	Hs.654548	collagen, type VIII, alpha 1
0.0001372	1.369	C1QTNF5	Hs.632102	C1q and tumor necrosis factor related protein 5
0.0001375	1.454	CRTAC1	Hs.500736	cartilage acidic protein 1
0.0001406	1.387	DKK3	Hs.292156	dickkopf homolog 3 (Xenopus laevis)
0.0001418	1.303	DIO2	Hs.202354	deiodinase, iodothyronine, type II
0.000147	1.282	CYBRD1	Hs.726027	cytochrome b reductase 1
0.0001499	1.405	SPIRE1	Hs.515283	spire homolog 1 (Drosophila)
0.0001529	1.301	SERPINE2	Hs.38449	serpin peptidase inhibitor, clade E (nexin, plasminogen activator inhibitor type 1), member 2
0.0001584	1.499	PPAP2A	Hs.696231	phosphatidic acid phosphatase type 2A
0.0001602	1.175	TCEAL2	Hs.401835	transcription elongation factor A (SII)-like 2
0.0001607	1.249	DPYSL3	Hs.519659	dihydropyrimidinase-like 3
0.0001727	1.328	ACTA2	Hs.500483	actin, alpha 2, smooth muscle, aorta
0.0001756	1.186	RBPMS2	Hs.436518	RNA binding protein with multiple splicing 2
0.0001769	1.338	PALLD	Hs.151220	palladin, cytoskeletal associated protein
0.0001775	1.296	ALDH1A3	Hs.459538	aldehyde dehydrogenase 1 family, member A3
0.0001794	1.401	HDGFRP3	Hs.513954	hepatoma-derived growth factor, related protein 3
0.0001801	1.225	DACT3	Hs.515490	dapper, antagonist of beta-catenin, homolog 3 (Xenopus laevis)
0.0001836	1.333	IGFBP7	Hs.479808	insulin-like growth factor binding protein 7
0.0001852	1.373	TMEFF2	Hs.144513	transmembrane protein with EGF-like and two follistatin-like domains 2
0.0001878	1.472	PCSK5	Hs.368542	proprotein convertase subtilisin/kexin type 5
0.0001898	1.415	ICAM2	Hs.431460	intercellular adhesion molecule 2
0.0001954	1.183	MYL9	Hs.504687	myosin, light chain 9, regulatory
0.000196	1.32	FOXF2	Hs.484423	forkhead box F2
0.0001984	1.204	LMOD1	Hs.519075	leiomodin 1 (smooth muscle)
0.0002005	1.641	SEPW1	Hs.631549	selenoprotein W, 1
0.000201	1.173	SYNPO2	Hs.655519	synaptopodin 2
0.0002042	1.292	DCBLD2	Hs.203691	discoidin, CUB and LCCL domain containing 2
0.0002081	1.363	NNMT	Hs.503911	nicotinamide N-methyltransferase
0.0002116	1.332	HEYL	Hs.472566	hairy/enhancer-of-split related with YRPW motif-like
0.0002129	1.154	APOD	Hs.522555	apolipoprotein D
0.0002221	1.462	HSPB2	Hs.709660	heat shock 27kDa protein 2
0.0002233	1.35	NGFRAP1	Hs.448588	nerve growth factor receptor (TNFRSF16) associated protein 1
0.0002269	1.151	HSPB6	Hs.534538	heat shock protein, alpha-crystallin-related, B6
0.0002281	1.404	RBPMS	Hs.334587	RNA binding protein with multiple splicing
0.0002309	1.293	SGCE	Hs.371199	sarcoglycan, epsilon
0.0002342	2.016	DCAF6	Hs.435741	DDB1 and CUL4 associated factor 6
0.0002401	1.405	LPP	Hs.5724	LIM domain containing preferred translocation partner in lipoma
0.0002404	1.67	PEA15	Hs.517216	phosphoprotein enriched in astrocytes 15
0.0002474	1.179	VIP	Hs.53973	vasoactive intestinal peptide
0.0002526	1.457	GJA4	Hs.296310	gap junction protein, alpha 4, 37kDa
0.0002531	1.975	CYTH3	Hs.487479	cytohesin 3
0.0002589	1.531	PTN	Hs.371249	pleiotrophin
0.0002607	1.231	LEPR	Hs.23581	leptin receptor
0.0002677	1.423	RAI14	Hs.431400	retinoic acid induced 14
0.0002767	1.274	TMEM47	Hs.8769	transmembrane protein 47
0.0002899	1.531	FOXS1	Hs.516971	forkhead box S1
0.0002919	1.44	ESAM	Hs.173840	endothelial cell adhesion molecule
0.0002935	1.443	MEIS3P1	Hs.356135	Meis homeobox 3 pseudogene 1
0.000294	1.257	C15orf52	Hs.32433	chromosome 15 open reading frame 52
0.0002955	1.962	ITGB1	Hs.643813	integrin, beta 1 (fibronectin receptor, beta polypeptide, antigen CD29 includes MDF2, MSK12)
0.0003037	1.219	OGN	Hs.109439	osteoglycin
0.0003059	1.184	RGMA	Hs.271277	RGM domain family, member A
0.0003069	1.302	IGFBP6	Hs.274313	insulin-like growth factor binding protein 6
0.0003084	1.571	ABLIM3	Hs.49688	actin binding LIM protein family, member 3
0.0003098	1.335	LAYN	Hs.503831	layilin
0.0003109	1.275	FERMT2	Hs.509343	fermitin family member 2
0.0003147	1.392	FZD4	Hs.591968	frizzled homolog 4 (Drosophila)
0.0003253	1.354	ADAMTS8	Hs.271605	ADAM metallopeptidase with thrombospondin type 1 motif, 8
0.0003302	1.293	TGFB1I1	Hs.513530	transforming growth factor beta 1 induced transcript 1
0.0003322	1.182	DARC	Hs.153381	Duffy blood group, chemokine receptor
0.0003501	1.254	PLN	Hs.170839	phospholamban
0.0003519	1.403	SCHIP1	Hs.134665	schwannomin interacting protein 1
0.0003527	1.402	PDGFC	Hs.570855	platelet derived growth factor C
0.000363	1.617	RAB6B	Hs.707804	RAB6B, member RAS oncogene family
0.0003672	1.22	CPE	Hs.75360	carboxypeptidase E
0.0003776	1.906	MARCKS	Hs.519909	myristoylated alanine-rich protein kinase C substrate
0.00038	1.807	TIE1	Hs.78824	tyrosine kinase with immunoglobulin-like and EGF-like domains 1
0.0003805	1.854	AFAP1L1	Hs.483793	actin filament associated protein 1-like 1
0.0003823	1.548	ERGIC1	Hs.509163	endoplasmic reticulum-golgi intermediate compartment (ERGIC) 1
0.0003825	1.195	HSPB7	Hs.502612	heat shock 27kDa protein family, member 7 (cardiovascular)
0.0003838	1.447	EHD2	Hs.726202	EH-domain containing 2
0.0003881	1.429	SLC38A1	Hs.533770	solute carrier family 38, member 1
0.0004033	1.619	FNDC4	Hs.27836	fibronectin type III domain containing 4
0.0004092	1.334	ADAMTS1	Hs.643357	ADAM metallopeptidase with thrombospondin type 1 motif, 1
0.0004268	1.474	C20orf160	Hs.382151	chromosome 20 open reading frame 160
0.0004552	1.714	CALHM2	Hs.241545	calcium homeostasis modulator 2
0.0004579	1.623	FAM124B	Hs.147585	family with sequence similarity 124B
0.0004674	1.517	TMEM136	Hs.643516	transmembrane protein 136
0.0004674	1.343	FSTL1	Hs.269512	follistatin-like 1
0.0004742	1.549	CDH6	Hs.171054	cadherin 6, type 2, K-cadherin (fetal kidney)
0.0004958	1.281	HTR2B	Hs.421649	5-hydroxytryptamine (serotonin) receptor 2B
0.0004971	1.482	LAMA2	Hs.200841	laminin, alpha 2
0.0005016	1.342	GEM	Hs.654463	GTP binding protein overexpressed in skeletal muscle
0.0005157	1.387	CDH5	Hs.76206	cadherin 5, type 2 (vascular endothelium)
0.0005179	1.329	PDE8B	Hs.584830	phosphodiesterase 8B
0.0005236	1.394	RAB32	Hs.287714	RAB32, member RAS oncogene family
0.0005255	1.29	SELM	Hs.55940	selenoprotein M
0.0005265	1.154	C7	Hs.78065	complement component 7
0.0005292	1.245	PLAC9	Hs.204947	placenta-specific 9
0.0005345	1.193	MFAP4	Hs.296049	microfibrillar-associated protein 4
0.0005371	1.178	FLNC	Hs.58414	filamin C, gamma
0.0005421	1.16	CTSE	Hs.644082	cathepsin E
0.0005663	1.479	LOC346887	Hs.127286	similar to solute carrier family 16 (monocarboxylic acid transporters), member 14
0.0005731	1.378	MPRIP	Hs.462341	myosin phosphatase Rho interacting protein
0.000584	1.484	GNB5	Hs.155090	guanine nucleotide binding protein (G protein), beta 5
0.000585	1.305	ELN	Hs.647061	elastin
0.0006189	1.332	ENG	Hs.76753	endoglin
0.0006308	1.227	CRABP2	Hs.405662	cellular retinoic acid binding protein 2
0.0006328	1.196	CST6	Hs.139389	cystatin E/M
0.0006334	1.223	MYOM1	Hs.464469	myomesin 1, 185 kDa
0.0006354	1.408	PCDH18	Hs.591691	protocadherin 18
0.0006526	1.528	LAMB1	Hs.650585	laminin, beta 1
0.0006578	1.28	LHFP	Hs.507798	lipoma HMGIC fusion partner
0.0006642	1.302	FILIP1L	Hs.104672	filamin A interacting protein 1-like
0.0006744	1.228	CAV1	Hs.74034	caveolin 1, caveolae protein, 22kDa
0.0006751	1.187	CPXM2	Hs.656887	carboxypeptidase X (M14 family), member 2
0.0006762	1.298	NBEA	Hs.491172	neurobeachin
0.000684	1.344	TEK	Hs.89640	TEK tyrosine kinase, endothelial
0.0007038	1.345	CTSF	Hs.11590	cathepsin F
0.0007096	1.613	LTC4S	Hs.706741	leukotriene C4 synthase
0.000716	1.247	AEBP1	Hs.439463	AE binding protein 1
0.0007249	1.3	GNG11	Hs.83381	guanine nucleotide binding protein (G protein), gamma 11
0.000732	1.617	SV2B	Hs.21754	synaptic vesicle glycoprotein 2B
0.0007328	1.153	KCNMB1	Hs.484099	potassium large conductance calcium-activated channel, subfamily M, beta member 1
0.0007334	1.208	BARX1	Hs.164960	BARX homeobox 1
0.0007401	1.435	DIP2C	Hs.432397	DIP2 disco-interacting protein 2 homolog C (Drosophila)
0.0007424	1.446	LAMC1	Hs.609663	laminin, gamma 1 (formerly LAMB2)
0.0007667	1.201	PODN	Hs.586141	podocan
0.0007707	1.57	LAPTM4A	Hs.467807	lysosomal protein transmembrane 4 alpha
0.000771	1.319	HTRA1	Hs.501280	HtrA serine peptidase 1
0.0007865	1.377	FGF2	Hs.284244	fibroblast growth factor 2 (basic)
0.0007874	1.344	CLEC14A	Hs.525307	C-type lectin domain family 14, member A
0.0008122	1.372	PHLDB2	Hs.603252	pleckstrin homology-like domain, family B, member 2
0.0008304	1.366	CD93	Hs.97199	CD93 molecule
0.0008458	1.31	RGS11	Hs.65756	regulator of G-protein signaling 11
0.000851	1.469	TRIM47	Hs.293660	tripartite motif-containing 47
0.0008686	1.348	LHX6	Hs.103137	LIM homeobox 6
0.0008794	1.293	EDNRA	Hs.183713	endothelin receptor type A
0.0008876	1.293	PRSS23	Hs.25338	protease, serine, 23
0.0009105	1.226	FAM129A	Hs.518662	family with sequence similarity 129, member A
0.0009239	1.243	SDPR	Hs.26530	serum deprivation response
0.0009423	1.331	PAMR1	Hs.55044	peptidase domain containing associated with muscle regeneration 1
0.0009484	1.286	APLNR	Hs.438311	apelin receptor
0.0009587	1.279	PDE7B	Hs.726482	phosphodiesterase 7B
0.0009591	1.507	ANKRD10	Hs.525163	ankyrin repeat domain 10
0.0009622	1.195	FRZB	Hs.128453	frizzled-related protein
0.0009762	1.175	SMOC2	Hs.487200	SPARC related modular calcium binding 2
0.0009826	1.55	CDC42EP4	Hs.3903	CDC42 effector protein (Rho GTPase binding) 4
0.0009992	1.231	RERG	Hs.199487	RAS-like, estrogen-regulated, growth inhibitor

Table 1 lists the set of genes associated with good prognosis (hazard ratio < 1.0 and p < 0.05), and Table 2 lists the set of genes associated with poor prognosis (hazard ratio > 1.0 and p < 0.05).
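The selection rule stated above can be sketched in a few lines of Python. This is only an illustration: the `CoxRow` type and the `split_by_prognosis` function are hypothetical names, and the four rows are copied from Tables 1 and 2 as sample input.

```python
from typing import NamedTuple


class CoxRow(NamedTuple):
    """One row of a univariate Cox result table (hypothetical container)."""
    p_value: float
    hazard_ratio: float
    symbol: str


def split_by_prognosis(rows, p_cutoff=0.05):
    """Return (good, bad) gene symbol lists using the stated HR and p-value rule."""
    good = [r.symbol for r in rows if r.p_value < p_cutoff and r.hazard_ratio < 1.0]
    bad = [r.symbol for r in rows if r.p_value < p_cutoff and r.hazard_ratio > 1.0]
    return good, bad


# Sample rows taken from Tables 1 and 2.
rows = [
    CoxRow(5.87e-05, 0.488, "TRAF3"),
    CoxRow(1.00e-05, 0.764, "GZMB"),
    CoxRow(1.90e-06, 1.992, "ITGB5"),
    CoxRow(1.01e-05, 1.558, "VCL"),
]
good, bad = split_by_prognosis(rows)
print(good)  # ['TRAF3', 'GZMB']
print(bad)   # ['ITGB5', 'VCL']
```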

Example 2 Prediction of Total Stage Gastric Cancer Based on Micro RNA

An Illumina microRNA microarray analysis was performed on frozen-section tissue, stored in the gene bank, from patients who underwent gastric cancer surgery (n = 332) at Severance Hospital, Yonsei University Medical Center, from 2000 to 2004. The number of cases was 113. MicroRNA was measured in the same patient samples from which the RNA transcripts were obtained.

Based on unsupervised clustering, prognostic analysis was performed not only across all stages but also within the locally advanced gastric cancer group. Accordingly, within the locally advanced gastric cancer group, the microRNA profile identified a good-prognosis group with both a 5-year overall survival rate and a 5-year recurrence-free survival rate of 85% or more.

Based on the results of the unsupervised clustering, stages 1 and 4 were used as the training set, and the predictive model was tested on locally advanced gastric cancer (stages 2 and 3) as the test set. The training set is the set of subject samples from which statistically significant RNA transcripts and microRNAs were extracted. The test set is the set used to check whether the extracted variables can actually determine whether the prognosis is good or bad. This approach was chosen so that the model not only predicts prognosis effectively within a specific sample group but is also shown to be effective in an independent sample. Using these variables, the following accuracy was obtained with leave-one-out cross-validation (see BRB-ArrayTools Version 4.2 User's Manual, pp. 74-81).

TABLE 3

Accuracy of Prognostic Prediction Model in Locally Advanced Gastric Cancer (Stages 2 and 3)

	Compound Covariate Predictor	Linear Discriminant Analysis	Nearest Centroid
Average percent of correct classification	87%	85%	93%

The accuracy of the prediction model is very high, which makes it possible to predict prognosis more clearly, especially in locally advanced gastric cancer.
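Leave-one-out cross-validation with a nearest-centroid classifier, one of the three methods in Table 3, can be sketched as follows. This is not the BRB-ArrayTools implementation: the function names and the two-feature toy data are assumptions, and the sketch omits any feature selection step.

```python
import math


def centroid(samples):
    """Per-feature mean of a list of equal-length feature vectors."""
    n = len(samples)
    return [sum(col) / n for col in zip(*samples)]


def nearest_centroid_predict(train_x, train_y, x):
    """Assign x to the class whose centroid is closest in Euclidean distance."""
    best_label, best_dist = None, math.inf
    for label in sorted(set(train_y)):
        members = [s for s, y in zip(train_x, train_y) if y == label]
        d = math.dist(centroid(members), x)
        if d < best_dist:
            best_label, best_dist = label, d
    return best_label


def loocv_accuracy(xs, ys):
    """Hold out each sample once, train on the rest, and score the prediction."""
    correct = 0
    for i in range(len(xs)):
        train_x = xs[:i] + xs[i + 1:]
        train_y = ys[:i] + ys[i + 1:]
        if nearest_centroid_predict(train_x, train_y, xs[i]) == ys[i]:
            correct += 1
    return correct / len(xs)


# Hypothetical expression values for two markers in six patients.
xs = [[1.0, 0.2], [1.2, 0.1], [0.9, 0.3], [0.1, 1.1], [0.2, 0.9], [0.3, 1.2]]
ys = ["good", "good", "good", "poor", "poor", "poor"]
print(loocv_accuracy(xs, ys))  # 1.0 for this well-separated toy data
```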

Tables 4 and 5 list the microRNAs that affect survival according to the univariate Cox proportional hazards model. The columns show, from left to right, the parametric Cox p-value, the hazard ratio, the unique probe ID, and the name of the microRNA. A total of 27 microRNAs were significantly associated with survival: the 14 listed in Table 4 and the 13 listed in Table 5.

Table 4

Parametric p-value	Hazard Ratio	Unique id	Description
0.0039084	0.368	ILMN_3167707	HS_59
0.0083208	0.421	ILMN_3167742	HS_162
0.009446	0.423	ILMN_3167440	HS_67
0.0092068	0.623	ILMN_3168569	hsa-miR-96*
0.0062687	0.651	ILMN_3167393	hsa-miR-496
0.0062712	0.712	ILMN_3166979	hsa-miR-223
0.0011696	0.729	ILMN_3167474	hsa-miR-302a*
0.0043565	0.741	ILMN_3167510	hsa-miR-20a
0.0077049	0.75	ILMN_3167769	hsa-miR-93
0.0071496	0.757	ILMN_3168105	hsa-miR-148a
0.0016924	0.779	ILMN_3168715	hsa-miR-155*
0.0022459	0.781	ILMN_3167434	hsa-miR-15a
0.0058083	0.832	ILMN_3168642	hsa-miR-17
0.0084347	0.878	ILMN_3168282	hsa-miR-18a

Table 5

Parametric p-value	Hazard Ratio	Unique id	Description
0.0072061	1.086	ILMN_3168320	hsa-miR-1
0.0073919	1.2	ILMN_3167368	HS_6
0.0069289	1.248	ILMN_3168527	HS_111
0.006867	1.282	ILMN_3168017	HS_114
0.0027552	1.283	ILMN_3168513	hsa-let-7c
0.0063893	1.321	ILMN_3168523	HS_126
0.0045926	1.35	ILMN_3168036	HS_90
0.0029516	1.373	ILMN_3168072	hsa-miR-548d-5p
0.0038524	1.388	ILMN_3168103	hsa-miR-189:9.1
0.0022636	1.423	ILMN_3168899	solexa-4793-177
0.0071311	1.564	ILMN_3167512	HS_135
0.0007934	1.709	ILMN_3168593	hsa-miR-20b*
0.0002497	1.866	ILMN_3168097	hsa-miR-658

Table 4 lists the set of microRNAs associated with good prognosis (hazard ratio < 1.0 and p < 0.05), and Table 5 lists the set of microRNAs associated with poor prognosis (hazard ratio > 1.0 and p < 0.05).
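For readers unfamiliar with how a hazard ratio and a parametric p-value are obtained for a single marker, the following is a minimal sketch of a one-covariate Cox proportional hazards fit. It assumes Newton-Raphson maximization of the partial likelihood, Breslow handling of ties, and a Wald test. The function name and the toy data are hypothetical, and the sketch is not the software used to produce the tables.

```python
import math


def cox_univariate(times, events, x, max_iter=50, tol=1e-9):
    """Fit a one-covariate Cox model; return (hazard_ratio, wald_p_value)."""
    n = len(times)

    def derivatives(beta):
        """Partial log-likelihood, its gradient, and the observed information."""
        loglik = grad = info = 0.0
        for i in range(n):
            if not events[i]:
                continue  # censored subjects contribute only through risk sets
            risk = [j for j in range(n) if times[j] >= times[i]]
            w = [math.exp(beta * x[j]) for j in risk]
            s0 = sum(w)
            s1 = sum(wj * x[j] for wj, j in zip(w, risk))
            s2 = sum(wj * x[j] ** 2 for wj, j in zip(w, risk))
            loglik += beta * x[i] - math.log(s0)
            grad += x[i] - s1 / s0
            info += s2 / s0 - (s1 / s0) ** 2
        return loglik, grad, info

    beta = 0.0
    loglik, grad, info = derivatives(beta)
    for _ in range(max_iter):
        if abs(grad) < tol or info <= 0.0:
            break
        step = grad / info
        # Halve the step until the partial log-likelihood does not decrease.
        while True:
            new = derivatives(beta + step)
            if new[0] >= loglik or abs(step) < 1e-12:
                break
            step /= 2.0
        beta += step
        loglik, grad, info = new
    if info <= 0.0:
        raise ValueError("covariate has no variation within the risk sets")
    z = beta * math.sqrt(info)  # Wald statistic, since se = 1 / sqrt(info)
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal tail
    return math.exp(beta), p_value


# Hypothetical data: follow-up time, death indicator (0 = censored), expression.
times = [1, 2, 3, 4, 5, 6, 7, 8]
events = [1, 1, 1, 0, 1, 1, 0, 1]
x = [2.0, 1.0, 1.5, 0.5, 1.2, 0.3, 0.9, 0.4]
hr, p = cox_univariate(times, events, x)
print(hr > 1.0)  # True: higher expression goes with earlier death in this toy data
```

A hazard ratio above 1 from such a fit would place the marker in the poor-prognosis list, and a hazard ratio below 1 in the good-prognosis list.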

Based on the microRNA list, a Risk Scoring System was created, and the formula for calculating prognostic indicators is as follows:

$$\text{Prognostic Index} = HR_1 \cdot \text{normLogTransValue}_1 + HR_2 \cdot \text{normLogTransValue}_2 + \dots + HR_n \cdot \text{normLogTransValue}_n$$

where

$HR_n$ is the hazard ratio of the n-th microRNA; when the hazard ratio is less than 1, it is replaced with $-1/HR_n$.

$\text{normLogTransValue}_n$ is the expression value of the n-th microRNA after log2 transformation followed by quantile normalization.

That is, a prognostic index of about -20 or less indicates a good prognosis, and a prognostic index above about -20 indicates a poor prognosis.
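The scoring rule can be sketched as follows. The function names are hypothetical, the expression values in the example are invented for illustration, and treating an index of exactly -20 as "good" is an assumption, since the text gives the cutoff only approximately.

```python
def signed_weight(hazard_ratio):
    """Use HR as-is when it is at least 1; replace an HR below 1 with -1/HR."""
    return hazard_ratio if hazard_ratio >= 1.0 else -1.0 / hazard_ratio


def prognostic_index(hazard_ratios, values):
    """Sum of signed HR weights times normalized log2 expression values."""
    return sum(signed_weight(hr) * v for hr, v in zip(hazard_ratios, values))


def classify(index, cutoff=-20.0):
    """Indices at or below the cutoff are called good prognosis (assumed tie rule)."""
    return "good" if index <= cutoff else "poor"


# Hazard ratios for hsa-miR-223 (Table 4), hsa-let-7c and hsa-miR-658 (Table 5).
hazard_ratios = [0.712, 1.283, 1.866]
values = [10.0, 7.0, 6.0]  # hypothetical normalized log2 expression values
index = prognostic_index(hazard_ratios, values)
print(classify(index))
```

With only three markers the index stays far above -20; the cutoff in the text applies to the full 27-microRNA list.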

FIG. 2 shows an example of applying the risk scoring method to stage 3a gastric cancer. The y-axis represents the prognostic index. Using a cutoff of about -20, 1 death was recorded among the 13 patients in the good-prognosis group below the cutoff, whereas 19 deaths were recorded among the 45 patients in the group above the cutoff.

From the above results, it was found that the risk score system using micro RNAs can make an excellent prognostic prediction model in locally advanced gastric cancer.
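The normLogTransValue inputs to the prognostic index are described as log2-transformed and then quantile-normalized. A minimal sketch of that preprocessing is given below, assuming one list of positive intensities per sample and no special handling of tied values. The function name is hypothetical.

```python
import math


def log2_quantile_normalize(samples):
    """Log2-transform each sample, then force all samples onto one distribution.

    samples: list of equal-length lists of positive intensities, one per sample.
    """
    logged = [[math.log2(v) for v in s] for s in samples]
    n_features = len(logged[0])
    sorted_samples = [sorted(s) for s in logged]
    # Reference distribution: mean of the k-th smallest value across samples.
    reference = [
        sum(s[k] for s in sorted_samples) / len(logged) for k in range(n_features)
    ]
    normalized = []
    for s in logged:
        order = sorted(range(n_features), key=lambda i: s[i])
        out = [0.0] * n_features
        for rank, i in enumerate(order):
            out[i] = reference[rank]  # each value is replaced by its rank's mean
        normalized.append(out)
    return normalized


raw = [[4.0, 16.0, 8.0], [2.0, 4.0, 32.0]]
print(log2_quantile_normalize(raw))  # [[1.5, 4.5, 2.5], [1.5, 2.5, 4.5]]
```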

Example 3 Prediction of Prognosis of All Stages of Gastric Cancer Based on Functional Proteins

To predict the prognosis of locally advanced gastric cancer based on functional proteins, the total cellular proteome was extracted from frozen tumor tissue using reverse phase protein array (RPPA) lysis buffer. After denaturation with SDS sample buffer (without bromophenol blue) and 6 to 8 serial dilutions, the samples were printed with a robotic arrayer onto nitrocellulose-coated glass slides.

In this method, slides printed at high density with proteins derived from frozen tumor tissue were probed with specific antibodies (including phosphorylated-protein-specific antibodies) that can interrogate biological characteristics of tumor cells such as tumor growth, cell death, survival, cell cycle transition, invasion, metastasis, and neovascularization. The amount of protein expression on each slide was detected quantitatively using an immunological method (specific antigen-antibody reaction) and a signal amplification method (DAKO CSA system).

Table 6 lists the functional proteins that significantly affect survival according to the univariate Cox proportional hazards model, out of a total of 250 specific antibodies used to detect functional proteins. The columns of Table 6 show, from left to right, the name of the functional protein, the parametric Cox p-value, the hazard ratio, and the standard error of the log intensities.

Table 6

As shown in Table 6, five functional proteins, Akt pS473, PAI, SMAD3, p70 S6K, and VEGFR2, were significantly associated with survival. Among them, Akt pS473 showed the strongest association in the survival analysis (p = 0.00032).

FIG. 3 shows the results of the survival analysis for Akt pS473: the lower the expression of the phosphorylated protein, the better the prognosis. In addition, the phosphorylated protein can be regarded as a target biomarker, in that it can simultaneously serve as the target of a targeted therapeutic agent.

Based on the functional protein list, a risk scoring system was created, and the formula for calculating prognostic indicators is as follows:

$$\text{Prognostic Index} = HR_1 \cdot \text{RPPAValue}_1 + HR_2 \cdot \text{RPPAValue}_2 + \dots + HR_n \cdot \text{RPPAValue}_n$$

where

$HR_n$ is the hazard ratio of the n-th functional protein; when the hazard ratio is less than 1, it is replaced with $-1/HR_n$.

$\text{RPPAValue}_n$ is the log2-transformed value of the n-th functional protein.

In other words, if the prognostic index is greater than zero, the prognosis is bad, and if the prognostic index is less than zero, the prognosis is good.

In FIG. 4, with a prognostic index cutoff of 0, 3 of 36 patients died in the good-prognosis group and 14 of 33 died in the poor-prognosis group. The risk scoring system using the five markers reflects the characteristics of functional proteins, which indicate not only the prognosis but also the availability of various targeted therapies. Specifically, among the five target markers (Akt pS473, PAI, SMAD3, p70 S6K, and VEGFR2), therapeutic drugs targeting Akt and VEGFR2 are already used clinically in other cancers, which has the advantage that they can be applied immediately. The system can also be used as a prognostic model for locally advanced gastric cancer.
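The FIG. 4 counts can be turned into crude death proportions for the two groups. The short sketch below does only that arithmetic. It ignores follow-up time and censoring, so it is not a substitute for the survival analysis itself, and the function name is hypothetical.

```python
def death_rate(deaths, total):
    """Crude proportion of deaths in a group (no adjustment for follow-up time)."""
    return deaths / total


good_rate = death_rate(3, 36)   # good-prognosis group, index below 0
poor_rate = death_rate(14, 33)  # poor-prognosis group, index above 0
print(f"good: {good_rate:.1%}, poor: {poor_rate:.1%}, ratio: {poor_rate / good_rate:.1f}")
# good: 8.3%, poor: 42.4%, ratio: 5.1
```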

Example 4 Prediction of N0 Gastric Cancer Based on RNA Transcripts and Micro RNA

Table 7 below lists the set of RNA transcripts associated with good prognosis (hazard ratio < 1.0 and p < 0.05) and the set of RNA transcripts associated with poor prognosis (hazard ratio > 1.0 and p < 0.05).

Table 8 below lists the set of microRNAs associated with good prognosis (hazard ratio < 1.0 and p < 0.05) and the set of microRNAs associated with poor prognosis (hazard ratio > 1.0 and p < 0.05).

TABLE 7

Parametric p-value_Total	Hazard Ratio_Total	Description	UG cluster	Gene symbol
0.0000585	4.302	frizzled homolog 1 (Drosophila)	Hs.94234	FZD1
0.0000134	4.073	GLI family zinc finger 3	Hs.21509	GLI3
0.0000345	2.949	angiopoietin-like 7	Hs.146559	ANGPTL7
0.0000525	2.784	c-abl oncogene 1, non-receptor tyrosine kinase	Hs.431048	ABL1
0.0000177	2.266	SWI/SNF related, matrix associated, actin dependent regulator of chromatin, subfamily d, member 3	Hs.647067	SMARCD3
0.0000331	2.251	integrin-linked kinase	Hs.706355	ILK
0.0000189	1.788	caveolin 1, caveolae protein, 22kDa	Hs.74034	CAV1
0.0000212	1.73	vasoactive intestinal peptide	Hs.53973	VIP
0.0000251	1.535	heat shock 27kDa protein family, member 7 (cardiovascular)	Hs.502612	HSPB7
0.0000514	0.566	topoisomerase (DNA) II alpha 170kDa	Hs.156346	TOP2A
0.0000064	0.358	Fanconi anemia, complementation group D2	Hs.208388	FANCD2

Table 8

Parametric p-value	Hazard Ratio	Unique id	Description
0.0005464	5.256	ILMN_3168863	hsa-miR-933
0.002469	1.674	ILMN_3167698	hsa-miR-184
0.004149	1.903	ILMN_3168015	hsa-miR-380*
0.0075231	0.278	ILMN_3168837	hsa-miR-190b
0.0141791	0.587	ILMN_3168612	hsa-miR-27a*
0.015639	0.74	ILMN_3168604	hsa-miR-1201

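Based on the lists in Tables 7 and 8, a risk scoring system was created. The prognostic index is calculated in the same form as in Example 2, as the sum over all markers of the hazard ratio multiplied by the normalized, log-transformed expression value.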

FIG. 5 shows the survival analysis result when the N0 gastric cancer group is divided by risk score into negative and positive cases using the prognostic index. Patients at similar anatomical stages show clear differences in survival, which indicates the advantage of prognostic indicators based on RNA transcripts and microRNAs.

FIG. 6 shows that the risk scoring system separates N0 gastric cancer into two groups at a cutoff of 0, with 7% recurrence in the good-prognosis group and 41% recurrence in the poor-prognosis group, demonstrating the ability to clearly distinguish the two groups.

FIG. 7 illustrates that a clear prognostic difference appears between clusters when hierarchical clustering is performed using genes that are statistically correlated with the microRNAs. This indicates the value of using microRNAs and RNA transcripts in combination. In particular, because specific microRNAs can collectively inhibit specific groups of RNA transcripts, this statistical approach can also carry biological significance.
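One simple way to find genes that are statistically correlated with a microRNA, before any clustering, is to compute a Pearson correlation per gene and keep the strongly correlated ones. The sketch below assumes this approach. The threshold of 0.8, the function names, and the gene labels and values are all hypothetical, and the specification does not state which correlation measure was used.

```python
import math


def pearson(a, b):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((u - mean_a) * (v - mean_b) for u, v in zip(a, b))
    var_a = sum((u - mean_a) ** 2 for u in a)
    var_b = sum((v - mean_b) ** 2 for v in b)
    return cov / math.sqrt(var_a * var_b)


def correlated_genes(mirna, genes, threshold=0.8):
    """Return {gene: r} for genes whose |r| with the microRNA meets the threshold."""
    result = {}
    for name, expression in genes.items():
        r = pearson(mirna, expression)
        if abs(r) >= threshold:
            result[name] = r
    return result


# Hypothetical expression profiles across five patients.
mirna = [1.0, 2.0, 3.0, 4.0, 5.0]
genes = {
    "GENE_A": [5.0, 4.0, 3.0, 2.0, 1.0],  # falls as the microRNA rises
    "GENE_B": [2.0, 1.0, 2.0, 1.0, 2.0],  # unrelated to the microRNA
}
selected = correlated_genes(mirna, genes)
print(sorted(selected))  # ['GENE_A']
```

A negative coefficient, as for the hypothetical GENE_A, is the pattern expected when a microRNA suppresses a transcript.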

All references cited throughout this specification are expressly incorporated herein by reference. While the invention has been described with particular emphasis on certain embodiments, it will be apparent to those skilled in the art that changes and variations in specific methods and techniques are possible. Accordingly, the invention includes all modifications that fall within the spirit and scope of the invention as defined by the following claims.

The present invention can be used in the field of predicting gastric cancer recurrence prognosis.

Claims

A method of predicting prognosis in a subject diagnosed with gastric cancer, comprising:

determining, in a biological sample containing cancer cells obtained from the subject, the expression level of one or more RNA transcripts selected from the group consisting of FZD1, GLI3, ANGPTL7, ABL1, SMARCD3, ILK, CAV1, VIP, HSPB7, TOP2A and FANCD2, and of one or more miRNAs selected from the group consisting of hsa-miR-933, hsa-miR-184, hsa-miR-380*, hsa-miR-190b, hsa-miR-27a* and hsa-miR-1201;

calculating a recurrence score (RS) of the biological sample based on the expression level of the RNA transcript or miRNA determined in the preceding step; and

determining the prognosis according to the RS value.
The method of claim 1,

wherein the RS is calculated according to Equation 1:

[Equation 1]

$$\text{Risk Score} = HR_1 \times \text{normLogTransValue}_1 + HR_2 \times \text{normLogTransValue}_2 + \cdots + HR_n \times \text{normLogTransValue}_n$$

where

$HR_n$ represents the hazard ratio of the n-th RNA transcript or microRNA, and

$\text{normLogTransValue}_n$ represents the value associated with the expression of the n-th RNA transcript or microRNA.
The method of claim 1,

wherein the method predicts the clinical outcome after surgical resection of locally advanced gastric cancer at the T1N0, T2N0, T3N0 or T4N0 stage of the TNM staging system.
The method of claim 1,

wherein, in terms of overall survival (OS) or recurrence-free survival (RFS), the prognosis is determined to be poor if the RS value is positive and good if the RS value is negative.
A method of predicting clinical outcome, comprising:

measuring, in a biological sample containing cancer cells obtained from a subject,

a) the expression level of one or more RNA transcripts X selected from the group consisting of HAT, C17orf65, TRAF6, CISH, ELAC1, ACTR8, SMARCAD1, SRRM1, C15orf44, EFTUD1, BUB3, KIAA0232, SEPSECS, DCAF16, ARHGAP19, TAF5, CNOT6L, NIF3L1, C19orf54, RNPC6, NRCPC28, NRCPC28 USP54, LIN54, FANCF, GAR1, GPBP1L1, TRAF3, KIAA0368, CRNKL1, SCLY, SMCR7L, PAIP1, RBD1, RPAIN, AP1G1, C1orf212, C18orf54, TIFA, EWSR1, FUBP1, AGGF1, CWF2F145, CRP14F152 NUP88, SNORA65, MED28, RFC1, RRM1, KARS, CCR1, CHAF1A, PLCH1, FASTKD1, KIAA0174, SAAL1, TNFSF14, ETV7, NBN, C20orf7, RHBDD1, ANKRD32, ING3, ATPAF1, CCD KP14, QD1 NFX1, SMAP2, SRGAP3, KIR2DL3, KIAA0564, GFI1, KIAA1715, COX15, PATL1, LETMD1, PRRG4, SETD4, GRAMD1C, NDRG3, PTPN22, TRIM21, PI4K2B, DCLRE1A, ALG11, PARPC, LIKASCEK2 MMP25, LARP1B, STAP2, GCH1, C20orf72, HK3, SNX5, NAAA, KLRD1, IL18RAP, PSMB8, THOP1, CASP5, ALPK1, SLC11A2, PSMB10, MND1, FANCG, IMPA1, MYL5, TTF2, RFIA3 PRF, BA BTN3A1, FANCD2, RIPK2, TSPAN6, IFNG, CDC25A, CXCR6, SLC27A2, GAD1, DLEU2, JAK2, CD7, FKBP11, IL32, SORD, TAP1, GNLY, C2, GZMB, VSNL1 and GBP5, and/or

b) the expression level of one or more RNA transcripts Y selected from the group consisting of CRTAC1, DKK3, DIO2, CYBRD1, SPIRE1, SERPINE2, PPAP2A, TCEAL2, DPYSL3, ACTA2, RBPMS2, PALLD, ALDH1A3, HDGFRP3, DACT3, IGFBP7, TMEFF2, PCSK5, ICAM2, MYL2, DC1 SOD2 NNMT, HEYL, APOD, HSPB2, NGFRAP1, HSPB6, RBPMS, SGCE, DCAF6, LPP, PEA15, VIP, GJA4, CYTH3, PTN, LEPR, RAI14, TMEM47, FOXS1, ESAM, MEIS3P1, C15orf52, ITGB1, OGN IGFBP6, ABLIM3, LAYN, FERMT2, FZD4, ADAMTS8, TGFB1I1, DARC, PLN, SCHIP1, PDGFC, RAB6B, CPE, MARCKS, TIE1, AFAP1L1, ERGIC1, HSPB7, EHD2, SLC38A1, FTSCC, AD20, FNDC4H TMEM136, FSTL1, CDH6, HTR2B, LAMA2, GEM, CDH5, PDE8B, RAB32, SELM, C7, PLAC9, MFAP4, FLNC, CTSE, LOC346887, MPRIP, GNB5, ELN, ENG, CRABP2, CST6, MYOM1, PCDH18, LAMB LHFP, FILIP1L, CAV1, CPXM2, NBEA, TEK, CTSF, LTC4S, AEBP1, GNG11, SV2B, KCNMB1, BARX1, DIP2C, LAMC1, PODN, LAPTM4A, HTRA1, FGF2, CLEC14A, PHLDB2, CD93, RGS11, TRIM47, LHX6, EDNRA, PRSS23, FAM129A, SDPR, PAMR1, APLNR, PDE7B, ANKRD10, FRZB, SMOC2, CDC42EP4 and RERG; and

determining that an increase in the expression of transcript X indicates an increased likelihood of a positive clinical outcome, and that an increase in the expression of transcript Y indicates a decreased likelihood of a positive clinical outcome.
The method of claim 5,

wherein the method is a PCR-based or an array-based method.
The method of claim 5,

wherein said expression level is normalized to the expression level of one or more RNA transcripts.
The method of claim 5,

wherein said clinical outcome is expressed in terms of overall survival (OS) or recurrence-free survival (RFS).
The method of claim 5,

wherein said method predicts clinical outcome after surgical resection of gastric cancer overall, irrespective of TNM stage.
The method of claim 5,

wherein the expression levels of at least two RNA transcripts selected from RNA transcripts X and Y are measured.
A method of predicting clinical outcome, comprising:

measuring, in a biological sample containing cancer cells obtained from a subject,

a) the expression level of one or more miRNAs (I) selected from the group consisting of HS_59, HS_162, HS_67, hsa-miR-96*, hsa-miR-496, hsa-miR-223, hsa-miR-302a*, hsa-miR-20a, hsa-miR-93, hsa-miR-148a, hsa-miR-155*, hsa-miR-15a, hsa-miR-17 and hsa-miR-18a, and/or

b) the expression level of one or more miRNAs (II) selected from the group consisting of hsa-miR-1, HS_6, HS_111, HS_114, hsa-let-7c, HS_126, HS_90, hsa-miR-548d-5p, hsa-miR-189:9.1, solexa-4793-177, HS_135, hsa-miR-20b* and hsa-miR-658; and

determining that increased expression of miRNA (I) indicates an increased likelihood of a positive clinical outcome, and that increased expression of miRNA (II) indicates a decreased likelihood of a positive clinical outcome.
The method of claim 11,

wherein the method predicts clinical outcome after surgical resection of gastric cancer overall, irrespective of TNM stage.
The method of claim 11,

wherein said clinical outcome is expressed in terms of overall survival (OS) or recurrence-free survival (RFS).
A method of predicting prognosis in a subject diagnosed with gastric cancer, comprising:

determining, in a biological sample containing cancer cells obtained from the subject, the expression level of at least one protein selected from the group consisting of Akt pS473, PAI, SMAD3, P70 S6K and EGFR2;

calculating a recurrence score (RS) of the biological sample based on the expression level of the protein determined in the preceding step; and

determining the prognosis according to the RS value.
The method of claim 14,

wherein the RS is calculated according to Equation 2:

[Equation 2]

$$\text{Risk Score} = HR_1 \times \text{RPPAValue}_1 + HR_2 \times \text{RPPAValue}_2 + \cdots + HR_n \times \text{RPPAValue}_n$$

where

$HR_n$ represents the hazard ratio of the n-th functional protein, and

$\text{RPPAValue}_n$ represents the value associated with the expression of the n-th functional protein.
The method of claim 14,

wherein the method predicts clinical outcome after surgical resection of gastric cancer overall, irrespective of TNM stage.
The method of claim 14,

wherein, in terms of overall survival (OS) or recurrence-free survival (RFS), the prognosis is determined to be poor if the RS value is greater than a setpoint and good if the RS value is less than the setpoint.
A computer-readable recording medium having recorded thereon a program for performing prognostic prediction of gastric cancer, the program causing a computer to execute:

determining the expression level of one or more RNA transcripts selected from the group consisting of FZD1, GLI3, ANGPTL7, ABL1, SMARCD3, ILK, CAV1, VIP, HSPB7, TOP2A and FANCD2, and of one or more miRNAs selected from the group consisting of hsa-miR-933, hsa-miR-184, hsa-miR-380*, hsa-miR-190b, hsa-miR-27a* and hsa-miR-1201;

calculating a recurrence score (RS) of the biological sample based on the expression level of the RNA transcript or miRNA determined in the preceding step; and

classifying a patient having an RS higher than a setpoint as having a high probability of recurrence and a patient having an RS lower than the setpoint as having a low probability of recurrence.
The recording medium of claim 18,

wherein the recording medium is for predicting the clinical outcome after surgical resection of advanced gastric cancer at the T1N0, T2N0, T3N0 or T4N0 stage of the TNM staging system.
A computer-readable recording medium having recorded thereon a program for performing prognostic prediction of gastric cancer, the program causing a computer to execute:

determining the expression level of at least one protein selected from the group consisting of Akt pS473, PAI, SMAD3, P70 S6K and EGFR2 in a protein sample obtained from a patient;

calculating a recurrence score (RS) of the biological sample based on the expression level of the protein determined in the preceding step; and

classifying a patient having an RS greater than a setpoint as having a high probability of recurrence and a patient having an RS smaller than the setpoint as having a low probability of recurrence.
The recording medium of claim 20,

wherein the recording medium is for predicting clinical outcome after surgical resection of gastric cancer overall, irrespective of TNM stage.