KR20160073798A

KR20160073798A - Marker composition for predicting prognosis and chemo-sensitivity of cancer patients

Info

Publication number: KR20160073798A
Application number: KR1020140182573A
Authority: KR
Inventors: 김용성; 김선영; 김선규; 김진천
Original assignee: 재단법인 아산사회복지재단
Priority date: 2014-12-17
Filing date: 2014-12-17
Publication date: 2016-06-27
Also published as: KR101751806B1

Abstract

According to the present invention, the gene classifier TCA19 is useful for predicting the recurrence and metastasis of cancer and for predicting sensitivity to anticancer agents. The invention also provides a method that statistically analyzes gene expression patterns among normal tissue, cancer tissue, and metastatic cancer tissue to select the major genes associated with the development and metastasis of cancer, and then selects genes usable as a gene classifier according to the prognostic value of the genes regulated by the activity of those selected genes. Using this method, a classifier for predicting cancer prognosis or anticancer drug sensitivity can be screened conveniently.

Description

Marker composition for predicting the prognosis and anticancer drug sensitivity of cancer patients

The present invention relates to a biomarker composition for predicting cancer prognosis and anticancer drug sensitivity comprising the gene classifier TCA19; a composition for predicting cancer prognosis and anticancer drug sensitivity comprising an agent that measures the expression level of the mRNA of the gene classifier TCA19 or of its protein; a kit for predicting cancer prognosis and anticancer drug sensitivity comprising the composition; a method of providing information for predicting cancer prognosis and anticancer drug sensitivity using the marker; and a method of screening for a gene classifier for predicting cancer prognosis and anticancer drug sensitivity.

Cancer is the leading cause of death in Korea and, despite extensive research aimed at conquering it, remains an intractable disease. Treatments for diagnosed cancer generally include surgery, chemotherapy, and radiation therapy, but each has many limitations. In addition, cancer is highly likely to recur even after treatment, and sensitivity to anticancer drugs varies widely among individuals, so predicting cancer prognosis and anticancer drug sensitivity is essential for deciding how to treat a cancer patient.

Colorectal cancer is one of the most common cancers worldwide. Pathological staging is the standard method for predicting its prognosis and the criterion for deciding whether to give chemotherapy (Hari et al., 2013). Colorectal cancer patients classified by pathological stage relapse very often; in stage 3 patients in particular, about 50% experience recurrence within 5 years of surgery (Carlsson et al., 1987; Midgley and Kerr, 1999). Along with recurrence, colorectal cancer frequently metastasizes to other organs, mainly the liver and lung (Cunningham et al., 2010); the prognosis is then very poor, and clinical outcomes are highly heterogeneous among patients. Standard chemotherapy is given to most stage 3 colorectal cancer patients, but it is not recommended for stage 3 patients aged 75 or older because of age-related frailty. It is therefore necessary to predict cancer prognosis and sensitivity to anticancer drugs accurately when deciding whether to give chemotherapy.

In addition, genomic studies based on the quantification of gene expression have been carried out in a variety of cancer types, which indicates that the clinical diversity of cancer is reflected in gene expression. Many studies to date have reported subtypes of colorectal cancer, but few colorectal cancer classifiers show robust prognostic performance.

Meanwhile, although many anticancer drugs are used as effective treatments, an emerging problem is the resistance of cancer cells to them (Tsuruo et al., 1984). Resistance arises through several mechanisms: cells exposed to a drug during long-term anticancer treatment may reduce intracellular accumulation of the drug (Shen et al., 1986; Shen et al., 2000; Gottesman et al., 2002), activate detoxification or efflux (Schuetz et al., 1996; Goto et al., 2002), or modify the target protein (Urasaki et al., 2001). This process is not only the greatest obstacle to cancer treatment but is also closely tied to treatment failure (Kang, 1996). In practice, when one anticancer drug fails during chemotherapy, the patient frequently shows resistance to other anticancer drugs as well, and it is often observed that combination chemotherapy, in which several drugs with different mechanisms of action are given together at initial treatment, has no therapeutic effect. The resulting narrow range of usable anticancer drugs is regarded as a major problem in cancer chemotherapy. However, because of the combined action of the factors involved in the biological response to a given drug, the diversity of therapeutic agents and administration regimens, and the difficulty of securing large numbers of samples, no notable progress has yet been made.

Microsatellite instability (MSI), CpG island methylation phenotype (CIMP), chromosomal instability (CIN), and BRAF/KRAS mutations are currently used in the clinic as molecular-level criteria for predicting prognosis, but there is no methodology for predicting sensitivity to chemotherapy based on the characteristics of the individual patient. There is therefore an urgent need for a marker that accurately predicts the prognosis of colorectal cancer patients and at the same time predicts sensitivity to chemotherapy.

To address this need, the present inventors statistically analyzed gene expression patterns among normal tissue, cancer tissue, and liver metastasis tissue to select genes associated with the development and metastasis of cancer, analyzed these genes against the gene-gene relationships reported in the literature to date to select a gene group for predicting cancer prognosis and anticancer drug sensitivity, and thereby completed the present invention.

Accordingly, an object of the present invention is to provide a biomarker composition for predicting cancer prognosis and anticancer drug sensitivity comprising the gene classifier TCA19.

Another object of the present invention is to provide a composition for predicting cancer prognosis or anticancer drug sensitivity comprising an agent that measures the expression level of the mRNA of the gene classifier TCA19 or of its protein.

Another object of the present invention is to provide a kit for predicting cancer prognosis or anticancer drug sensitivity comprising the composition.

Another object of the present invention is to provide a method of providing information for predicting cancer prognosis or anticancer drug sensitivity using the gene classifier.

Another object of the present invention is to provide a method of screening for genes for predicting the prognosis and anticancer drug sensitivity of stage 3 colorectal cancer patients.

To solve the above problems, the present invention provides a biomarker composition for predicting cancer prognosis and anticancer drug sensitivity comprising the gene classifier TCA19.

The present invention also provides a composition for predicting cancer prognosis or anticancer drug sensitivity comprising an agent that measures the expression level of the mRNA of the gene classifier TCA19 or of its protein.

The present invention also provides a kit for predicting cancer prognosis or anticancer drug sensitivity comprising the composition.

The present invention also provides a method of providing information for predicting cancer prognosis or anticancer drug sensitivity using the gene classifier.

The present invention also provides a method of screening for genes for predicting the prognosis and anticancer drug sensitivity of stage 3 colorectal cancer patients.

The gene classifier TCA19 is useful for predicting the recurrence and metastasis of cancer and for predicting sensitivity to anticancer drugs. The present invention also provides a method that statistically analyzes gene expression patterns among normal tissue, cancer tissue, and metastatic cancer tissue to select the major genes associated with the development and metastasis of cancer, and then selects genes usable as a gene classifier according to the prognostic value of the genes regulated by the activity of those selected genes. Using this method, a classifier for predicting cancer prognosis or anticancer drug sensitivity can be screened conveniently.

BRIEF DESCRIPTION OF THE DRAWINGS
Fig. 1 is a schematic diagram of the workflow of the present invention.
Fig. 2 shows the gene expression pattern of colorectal cancer in the AMC cohort (n = 54).
Fig. 3 shows a comparative analysis of genes differentially expressed among the tissue groups. Fig. 3A is a Venn diagram of the genes selected in the different tissues by the GLM likelihood ratio using the EdgeR software, and Fig. 3B shows the expression patterns of the genes selected in the Venn diagram.
Fig. 4 shows the results of gene set enrichment analysis of genes associated with the development or metastasis of cancer.
Fig. 5 shows the TREM1 network enriched with genes associated with the development (A) and metastasis (B) of cancer.
Fig. 6 shows the CTGF network enriched with genes associated with the development (A) and metastasis (B) of cancer.
Fig. 7 shows boxplots of TREM1 and CTGF in the original cohort (RNA-sequencing data) and the validation cohort (GSE14297, gene expression microarray).
Fig. 8 shows the classification of colorectal cancer patients by the TCA66 predictor. Fig. 8A shows the TCA66 risk scores in the CIT cohort, Fig. 8B shows the Kaplan-Meier curves of the two subgroups of the CIT cohort classified by TCA66 risk score, and Fig. 8C shows the Kaplan-Meier curves of the two subgroups of the AUS cohort classified by the TCA66 risk score derived from the CIT cohort. P values were obtained by the log-rank test; mDFS denotes median disease-free survival.
Fig. 9 shows the classification of colorectal cancer patients by the TCA19 predictor. Fig. 9A shows the TCA19 risk scores in the CIT cohort, Fig. 9B shows the Kaplan-Meier curves of the two subgroups of the CIT cohort classified by TCA19 risk score, Fig. 9C shows the Kaplan-Meier curves of the two subgroups of the AUS cohort classified by the TCA19 risk score derived from the CIT cohort, and Fig. 9D shows the Kaplan-Meier curves of the two subgroups of the UAPC cohort classified by the TCA19 risk score derived from the CIT cohort. P values were obtained by the log-rank test; mDFS denotes median disease-free survival.
Fig. 10 shows Kaplan-Meier plots of disease-free survival (DFS) for patients in the AUS and UPAC cohorts (n = 449) grouped by AJCC stage. Fig. 10A shows the TCA19 risk score-based subset analysis of stage 2 colorectal cancer patients, and Fig. 10B shows the TCA19 risk score-based subset analysis of stage 3 colorectal cancer patients. P values were obtained by the log-rank test; mDFS denotes median disease-free survival.
Fig. 11 shows the interaction of TCA19 with the stage of colorectal cancer patients in the CIT cohort (A) and in the AUS and UAPC cohorts (B).
Fig. 12 shows Kaplan-Meier plots of disease-free survival (DFS) for patients in the CIT cohort grouped by DNA mismatch repair (MMR) status. Fig. 12A shows the TCA19 risk score-based subset analysis in patients with deficient MMR (dMMR), and Fig. 12B shows the TCA19 risk score-based subset analysis in patients with proficient MMR (pMMR). P values were obtained by the log-rank test; mDFS denotes median disease-free survival.
Fig. 13 shows the prediction of sensitivity to chemotherapy in the two subgroups defined by the TCA classifier. Fig. 13A is a Kaplan-Meier plot of disease-free survival (DFS) for the two subgroups (high TCA19 and low TCA19) of the AUS cohort, Fig. 13B is a Kaplan-Meier plot of DFS for the stage 3 colorectal cancer patients of the AUS cohort, Fig. 13C is a Kaplan-Meier plot for the stage 3 colorectal cancer patients aged 75 or older in the AUS cohort, Fig. 13D is a Kaplan-Meier plot for the stage 3 colorectal cancer patients in the high-TCA19 subgroup, and Fig. 13E is a Kaplan-Meier plot for the stage 3 colorectal cancer patients in the low-TCA19 subgroup. Data are plotted according to whether the patient received chemotherapy. P values were obtained by the log-rank test; mDFS denotes median disease-free survival.
Fig. 14 shows the interaction of the TCA19 classifier in colorectal cancer patients of the AUS cohort who received adjuvant chemotherapy.
Fig. 15 shows the prediction of chemotherapy sensitivity in the TCA19 high-risk (A) and TCA19 low-risk (B) subgroups classified by the TCA19 classifier in the combined AUC and CIT cohort. Data are plotted according to whether the patient received chemotherapy. P values were obtained by the log-rank test; mDFS denotes median disease-free survival.
Fig. 16 shows Kaplan-Meier plots of disease-free survival (DFS) in the AUS cohort grouped by the risk level assigned by the TCA19 classifier (A), the MDA114 classifier (B), and the OncoDX classifier (C). P values were obtained by the log-rank test; mDFS denotes median disease-free survival.
Fig. 17 shows Kaplan-Meier plots of disease-free survival (DFS) by AJCC stage for patients in the AUS cohort classified by the TCA19 classifier (A), the MDA114 classifier (B), and the OncoDX classifier (C). P values were obtained by the log-rank test; mDFS denotes median disease-free survival.
Fig. 18 shows Kaplan-Meier plots for the high-risk and low-risk subgroups of AJCC stage 3 colorectal cancer patients grouped by the risk level assigned by the TCA19 classifier (A), the MDA114 classifier (B), and the OncoDX classifier (C). Data are plotted according to whether the patient received chemotherapy. P values were obtained by the log-rank test; mDFS denotes median disease-free survival.
Fig. 19 compares random classifiers with the TCA19 classifier.
Fig. 20 shows the gene set enrichment analysis of the 66 genes in the classifier. Classification enrichment was performed with the Ingenuity Analysis software, with a significance threshold of P < 0.05.
Fig. 21 compares the subgroups classified by the TCA19 classifier with the 5 subtypes derived by CRCassigner.
Fig. 22 compares the subgroups classified by the TCA19 classifier with the 3 subtypes derived by the 146-gene classifier.
Fig. 23 compares the subgroups classified by the TCA19 classifier with the 6 subtypes derived by the 57-gene classifier.
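Many of the figures report Kaplan-Meier curves and a median disease-free survival (mDFS) for each risk subgroup. As a reading aid, the following is a minimal sketch of how a Kaplan-Meier curve and its median could be computed from follow-up data. The function names, the input layout (one follow-up time and one event flag per patient), and the convention that the median is the first time the estimate falls to 0.5 or below are assumptions made for illustration; the source does not state which software or conventions were used for these plots.

```python
from collections import Counter


def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct time where an event was observed.

    times  : follow-up time per patient (any consistent unit, e.g. months)
    events : 1 if recurrence/death was observed at that time, 0 if censored
    """
    if len(times) != len(events):
        raise ValueError("times and events must have the same length")
    observed = Counter(t for t, e in zip(times, events) if e)
    leaving = Counter(times)      # every patient leaves the risk set at their own time
    at_risk = len(times)
    survival = 1.0
    curve = []
    for t in sorted(leaving):
        d = observed.get(t, 0)
        if d:
            # Product-limit step: multiply by the fraction surviving this event time.
            survival *= 1.0 - d / at_risk
            curve.append((t, survival))
        at_risk -= leaving[t]
    return curve


def median_dfs(times, events):
    """First event time at which the estimate is <= 0.5, or None if never reached."""
    for t, s in kaplan_meier(times, events):
        if s <= 0.5:
            return t
    return None
```

Applied separately to the patients above and below a risk-score cutoff, `median_dfs` would give one mDFS per subgroup; a `None` result corresponds to a curve that never drops to 50%. Comparing the two curves with the log-rank test is a separate step not shown here.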

The present invention is described in more detail below.

The present invention provides a biomarker composition for predicting cancer prognosis or anticancer drug sensitivity comprising the gene classifier TCA19.

The gene classifier TCA19 consists of one or more genes selected from the group consisting of GADD45B (growth arrest and DNA-damage-inducible, beta), S1PR3 (sphingosine-1-phosphate receptor 3), CDKN2B (cyclin-dependent kinase inhibitor 2B), EGR2 (early growth response 2), CTGF (connective tissue growth factor), SERPINE1 (serpin peptidase inhibitor, clade E), RGS16 (regulator of G-protein signaling 16), RHOU (ras homolog family member U), TIMP1 (TIMP metallopeptidase inhibitor 1), PHLDA1 (pleckstrin homology-like domain family A, member 1), IL36RN (interleukin 36 receptor antagonist), SLAMF7 (SLAM family member 7), E2F7 (E2F transcription factor 7), DTL (denticleless E3 ubiquitin protein ligase homolog), CFB (complement factor B), CDK1 (cyclin-dependent kinase 1), CXCL1 (chemokine (C-X-C motif) ligand 1), CXCL3 (chemokine (C-X-C motif) ligand 3), and CKS2 (CDC28 protein kinase regulatory subunit 2). Preferably, it includes all of the GADD45B, S1PR3, CDKN2B, EGR2, CTGF, SERPINE1, RGS16, RHOU, TIMP1, PHLDA1, IL36RN, SLAMF7, E2F7, DTL, CFB, CDK1, CXCL1, CXCL3, and CKS2 genes.

The genes included in the gene classifier TCA19 (GADD45B, S1PR3, CDKN2B, EGR2, CTGF, SERPINE1, RGS16, RHOU, TIMP1, PHLDA1, IL36RN, SLAMF7, E2F7, DTL, CFB, CDK1, CXCL1, CXCL3, and CKS2) are regulated by the activity of TREM1 (triggering receptor expressed on myeloid cells 1) and CTGF (connective tissue growth factor). TREM1 and CTGF are the two major regulators activated during the development and metastasis of colorectal cancer.
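For anyone working with the panel computationally, the 19 gene symbols listed above can be held as a fixed tuple and checked against an expression profile before any scoring is attempted. The sketch below is illustrative only: the names `TCA19_GENES` and `missing_genes`, and the assumption that a profile is a mapping from gene symbol to expression value, are not from the source. The symbol PHLDA1 is used, matching the full gene name given in the text.

```python
# The 19 gene symbols of the TCA19 classifier, as listed in the description.
TCA19_GENES = (
    "GADD45B", "S1PR3", "CDKN2B", "EGR2", "CTGF", "SERPINE1", "RGS16",
    "RHOU", "TIMP1", "PHLDA1", "IL36RN", "SLAMF7", "E2F7", "DTL",
    "CFB", "CDK1", "CXCL1", "CXCL3", "CKS2",
)


def missing_genes(expression):
    """Return the panel genes that are absent from an expression profile.

    expression : mapping of gene symbol -> expression value (hypothetical layout)
    """
    return [gene for gene in TCA19_GENES if gene not in expression]
```

A profile that covers the whole panel yields an empty list; a non-empty result names the genes that would have to be measured before the full 19-gene form of the classifier could be applied.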

In the present invention, "mutation", "variation", and "variant" include base substitution, deletion, insertion, amplification, and rearrangement of the nucleotide and amino acid sequences of the gene concerned. A nucleotide variation refers to a change in a nucleotide sequence relative to a reference sequence (for example, a wild-type sequence), such as the insertion, deletion, inversion, or substitution of one or more nucleotides, for example a single nucleotide polymorphism (SNP). Unless otherwise indicated, the term also covers the corresponding change in the complement of the nucleotide sequence. A nucleotide variation may be a somatic mutation or a germline polymorphism.

In the present invention, a "prognostic marker" is a substance by which the course of the disease, survival, or complete recovery after cancer treatment can be ascertained, and includes organic biomolecules such as polypeptides, nucleic acids (e.g., mRNA), lipids, glycolipids, glycoproteins, and sugars (monosaccharides, disaccharides, oligosaccharides, etc.). For the purposes of the present invention, the marker for predicting cancer prognosis is the gene classifier TCA19, consisting of one or more genes selected from the group consisting of the GADD45B, S1PR3, CDKN2B, EGR2, CTGF, SERPINE1, RGS16, RHOU, TIMP1, PHLDA1, IL36RN, SLAMF7, E2F7, DTL, CFB, CDK1, CXCL1, CXCL3, and CKS2 genes.

In the present invention, "sensitivity" means whether a particular drug is effective against an individual patient's cancer.

For example, the particular drug is mainly an anticancer drug, and such drugs are effective against some types of cancer and not others. It is also known that, even for a type of cancer against which a drug is recognized as effective, the drug works in some patients and not in others. Whether an anticancer drug is effective against an individual patient's cancer is called anticancer drug sensitivity. Accordingly, if the present invention makes it possible to predict, before treatment begins, which patients can be expected to benefit (responders) and which cannot (non-responders), chemotherapy with high efficacy and safety can be realized. The anticancer drug of the present invention is selected from the group consisting of oxaliplatin, fluorouracil, levofolinate, and salts thereof.

In the present invention, "prediction" refers to the likelihood that a patient will respond favorably or unfavorably to a drug or set of drugs. In one embodiment, the prediction concerns the extent of such a response. For example, the prediction concerns whether, and/or with what probability, a patient will survive without cancer recurrence after treatment, for example after treatment with a particular therapeutic agent and/or surgical removal of the primary tumor and/or chemotherapy for a given period. The prediction of the present invention can be used clinically to make treatment decisions by selecting the most appropriate treatment modality for a cancer patient. It is a useful tool for predicting whether a patient will respond favorably to a given therapeutic treatment, such as administration of a given therapeutic agent or combination, surgical intervention, or chemotherapy, or whether long-term survival of the patient is possible after the treatment.

The cancer of the present invention is selected from the group consisting of acute lymphocytic or lymphoblastic leukemia, acute or chronic lymphocytic leukemia, acute non-lymphocytic leukemia, bladder cancer, brain tumor, breast cancer, cervical cancer, chronic myelogenous leukemia, intestinal cancer, T-zone lymphoma, endometriosis, esophageal cancer, gallbladder cancer, Ewing's sarcoma, head and neck cancer, tongue cancer, Hodgkin's lymphoma, Kaposi's sarcoma, kidney cancer, liver cancer, lung cancer, mesothelioma, multiple myeloma, neuroblastoma, non-Hodgkin's lymphoma, osteosarcoma, ovarian cancer, mammary gland cancer, prostate cancer, pancreatic cancer, colorectal cancer, penile cancer, retinoblastoma, skin cancer, stomach cancer, thyroid cancer, uterine cancer, testicular cancer, Wilms' tumor, and trophoblastoma. It is preferably colorectal cancer, more preferably stage 3 colorectal cancer, but is not limited thereto.

The colorectal cancer includes rectal cancer, colon cancer, and anal cancer.

The present invention also provides a composition for predicting cancer prognosis or anticancer drug sensitivity comprising an agent that measures the expression level of the mRNA, or of the protein, of one or more genes selected from the group consisting of the GADD45B, S1PR3, CDKN2B, EGR2, CTGF, SERPINE1, RGS16, RHOU, TIMP1, PHLDA1, IL36RN, SLAMF7, E2F7, DTL, CFB, CDK1, CXCL1, CXCL3, and CKS2 genes.

In the present invention, "measuring the mRNA expression level" is the process of determining, in a biological sample, the presence and expression level of the mRNA of one or more cancer-associated genes selected from the group consisting of the GADD45B, S1PR3, CDKN2B, EGR2, CTGF, SERPINE1, RGS16, RHOU, TIMP1, PHLDA1, IL36RN, SLAMF7, E2F7, DTL, CFB, CDK1, CXCL1, CXCL3, and CKS2 genes, in order to predict cancer prognosis or anticancer drug sensitivity; it is carried out by measuring the amount of mRNA. Analytical methods for this purpose include, but are not limited to, reverse transcription polymerase chain reaction (RT-PCR), competitive RT-PCR, real-time RT-PCR, RNase protection assay (RPA), Northern blotting, and DNA chips. In the present invention, the agent for measuring the mRNA level of a gene is preferably an antisense oligonucleotide, a primer pair, or a probe.
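The text names real-time RT-PCR as one way to measure mRNA levels but does not say how the readout is converted into an expression level. One widely used calculation is the comparative Ct (2^-ΔΔCt) method, sketched below purely as an illustration. It is an assumption that this method applies here; it also assumes roughly 100% amplification efficiency and a stably expressed reference gene, and the function and parameter names are hypothetical.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_control, ct_ref_control):
    """Fold change of a target gene in a sample versus a control (2^-ddCt).

    Each argument is a threshold cycle (Ct) from real-time RT-PCR:
    target gene and reference gene, in the sample and in the control.
    Assumes ~100% amplification efficiency (one doubling per cycle).
    """
    delta_sample = ct_target_sample - ct_ref_sample      # normalize sample to reference gene
    delta_control = ct_target_control - ct_ref_control   # normalize control to reference gene
    return 2.0 ** -(delta_sample - delta_control)
```

Under these assumptions, a target that crosses the threshold two cycles earlier in tumor tissue than in the control, with an unchanged reference gene, corresponds to a fourfold higher relative expression.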

In the present invention, "antisense" refers to an oligomer having a sequence of nucleotide bases and an inter-subunit backbone that allow the antisense oligomer to hybridize with a target sequence in RNA by Watson-Crick base pairing, typically forming an RNA:oligomer heteroduplex with mRNA within the target sequence. The oligomer may have exact or near sequence complementarity to the target sequence. Such an antisense oligomer can block or inhibit translation of the mRNA and can alter the mRNA processing that produces splice variants of the mRNA.

In the present invention, "primer" means a short nucleic acid sequence having a free 3' hydroxyl group that can form base pairs with a complementary template and serves as the starting point for copying the template strand. A primer can initiate DNA synthesis in the presence of a polymerization reagent (i.e., DNA polymerase or reverse transcriptase) and the four different nucleoside triphosphates, in an appropriate buffer and at an appropriate temperature.

In the present invention, "probe" means a nucleic acid fragment, such as RNA or DNA, ranging from as short as a few bases to as long as several hundred bases, that can bind specifically to mRNA; it is labeled so that the presence or absence of a specific mRNA can be determined. The probe may be prepared in the form of an oligonucleotide probe, a single-stranded DNA probe, a double-stranded DNA probe, an RNA probe, or the like. The selection of suitable probes and hybridization conditions can be modified on the basis of what is known in the art.

Because the nucleic acid sequences of the GADD45B, S1PR3, CDKN2B, EGR2, CTGF, SERPINE1, RGS16, RHOU, TIMP1, PHLDA1, IL36RN, SLAMF7, E2F7, DTL, CFB, CDK1, CXCL1, CXCL3, and CKS2 genes of the present invention are registered in GenBank, a person skilled in the art can design, on the basis of those sequences, antisense oligonucleotides, primer pairs, or probes that specifically amplify particular regions of these genes.

The antisense oligonucleotides, primers, or probes of the present invention can be chemically synthesized using the phosphoramidite solid support method or other well-known methods. Such nucleic acid sequences can also be modified by many means known in the art. Non-limiting examples of such modifications include methylation, capping, substitution of one or more natural nucleotides with an analog, and internucleotide modifications, for example modification to uncharged linkages (e.g., methyl phosphonate, phosphotriester, phosphoramidate, carbamate) or charged linkages (e.g., phosphorothioate, phosphorodithioate).

In the present invention, "measuring the protein expression level" is the process of determining, in a biological sample, the presence and degree of expression of the protein expressed from a cancer-associated gene selected from the group consisting of GADD45B, S1PR3, CDKN2B, EGR2, CTGF, SERPINE1, RGS16, RHOU, TIMP1, PHLDA1, IL36RN, SLAMF7, E2F7, DTL, CFB, CDK1, CXCL1, CXCL3, and CKS2, in order to predict cancer prognosis or anticancer drug sensitivity; it is carried out by measuring the amount of protein. Analytical methods for this purpose include Western blotting, enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA), radioimmunodiffusion, Ouchterlony immunodiffusion, rocket immunoelectrophoresis, tissue immunostaining, immunoprecipitation assay, complement fixation assay, flow cytometry (fluorescence-activated cell sorter, FACS), and protein chips, but are not limited thereto. The agent for measuring the protein expression level in the present invention is preferably an antibody.

In the present invention, "antibody" means a specific protein molecule directed against an antigenic site. For the purposes of the present invention, the antibody is one that binds specifically to a marker protein, and the term includes polyclonal antibodies, monoclonal antibodies, and recombinant antibodies.

Also provided is a kit for predicting cancer prognosis or anticancer drug sensitivity, comprising the above composition.

The kit of the present invention can predict cancer prognosis or anticancer drug sensitivity by determining the expression level of the mRNA, or of the protein thereof, of a gene selected from the group consisting of GADD45B, S1PR3, CDKN2B, EGR2, CTGF, SERPINE1, RGS16, RHOU, TIMP1, PHLDA1, IL36RN, SLAMF7, E2F7, DTL, CFB, CDK1, CXCL1, CXCL3, and CKS2. In addition to primers, probes, antibodies, and the like for predicting cancer prognosis or anticancer drug sensitivity, the kit may further include one or more other component compositions, solutions, or devices suited to the analytical method. The kit is preferably in the form of an RT-PCR kit, a microarray chip kit, a DNA chip kit, or a protein chip kit, but is not limited thereto.

As a specific example, the kit for measuring the mRNA expression level of the marker genes in the present invention may be a kit containing the essential elements needed to perform RT-PCR. In addition to a primer pair specific to each marker gene, the RT-PCR kit may include test tubes or other suitable containers, reaction buffer, deoxynucleotides (dNTPs), enzymes such as Taq polymerase and reverse transcriptase, DNase, RNase inhibitor, DEPC-water, sterile water, and the like.

The kit of the present invention may also be a kit containing the essential elements needed to perform a microarray. The microarray kit includes a substrate to which cDNA corresponding to a gene or a fragment thereof is attached as a probe, and the substrate may include cDNA corresponding to a quantitative control gene or a fragment thereof. Using the markers of the present invention, the kit can be readily manufactured by methods commonly used in the art. To fabricate the microarray, the identified markers are used as probe DNA molecules and are preferably immobilized on the substrate of a DNA chip by a micropipetting method using a piezoelectric mechanism or by a method using a pin-type spotter, but the method is not limited thereto. The substrate of the microarray chip is preferably coated with an active group selected from the group consisting of amino-silane, poly-L-lysine, and aldehyde, but is not limited thereto. The substrate is preferably selected from the group consisting of slide glass, plastic, metal, silicon, nylon membrane, and nitrocellulose membrane, but is not limited thereto.

The kit of the present invention may also be a kit containing the essential elements needed to run a DNA chip. The DNA chip kit may include a substrate to which cDNA or an oligonucleotide corresponding to a gene or a fragment thereof is attached, together with reagents, agents, enzymes, and the like for preparing fluorescently labeled probes. The substrate may also include cDNA or an oligonucleotide corresponding to a control gene or a fragment thereof.

In addition, the present invention provides a method for providing information for predicting cancer prognosis or anticancer drug sensitivity, the method comprising:

(a) measuring, in a biological sample, the expression level of the mRNA, or of the protein thereof, of a gene selected from the group consisting of GADD45B, S1PR3, CDKN2B, EGR2, CTGF, SERPINE1, RGS16, RHOU, TIMP1, PHLDA1, IL36RN, SLAMF7, E2F7, DTL, CFB, CDK1, CXCL1, CXCL3, and CKS2; and

(b) comparing the expression level of the mRNA or protein of the genes measured in the biological sample of step (a) with the expression level of the mRNA or protein of the same genes in a normal control sample.

In the present invention, "biological sample" includes whole blood, serum, plasma, saliva, urine, sputum, lymph, tissue, cells, and the like isolated from an individual, and is preferably cancer tissue or cancer cells, but is not limited thereto.

Methods for measuring the mRNA expression level in the present invention include reverse transcription polymerase chain reaction (RT-PCR), competitive RT-PCR, real-time RT-PCR, RNase protection assay (RPA), Northern blotting, and DNA chips, but are not limited thereto.

Methods for measuring the protein expression level in the present invention include Western blotting, enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA), radioimmunodiffusion, Ouchterlony immunodiffusion, rocket immunoelectrophoresis, tissue immunostaining, immunoprecipitation assay, complement fixation assay, flow cytometry (fluorescence-activated cell sorter, FACS), and protein chips, but are not limited thereto.

Because the expression of a gene selected from the group consisting of GADD45B, S1PR3, CDKN2B, EGR2, CTGF, SERPINE1, RGS16, RHOU, TIMP1, PHLDA1, IL36RN, SLAMF7, E2F7, DTL, CFB, CDK1, CXCL1, CXCL3, and CKS2 increases when cancer prognosis is poor or anticancer drug sensitivity is low, cancer prognosis or anticancer drug sensitivity can be predicted accurately and quickly.

In addition, the present invention provides a method for screening a gene classifier for predicting cancer prognosis or anticancer drug sensitivity, the method comprising:

(a) statistically analyzing gene expression patterns among normal tissue, cancer tissue, and metastatic cancer tissue to select key genes associated with the development and metastasis of cancer; and

(b) selecting genes that can be used as a gene classifier on the basis of the prognostic value of genes regulated by the activity of the key genes selected in step (a).

The key genes associated with the development and metastasis of cancer are TREM1 (triggering receptor expressed on myeloid cells 1) and CTGF (connective tissue growth factor), but are not limited thereto.

The prognostic value of the genes can be estimated from a risk score obtained by multiplying the expression level of each gene by its corresponding coefficient (Risk score = Σ Cox coefficient of gene Gi × expression value of gene Gi).
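For reference, the risk score can be written compactly as below. The symbols c_i, x_i, and n are notation introduced here for illustration only and do not appear in the original text: c_i denotes the Cox coefficient of gene G_i, x_i its expression value in a given patient, and n the number of genes in the classifier.

```latex
\[
\text{Risk score} \;=\; \sum_{i=1}^{n} c_i \, x_i
\]
```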

Hereinafter, the present invention is described in more detail with reference to examples. The examples serve only to illustrate the present invention more specifically, and it will be apparent to those of ordinary skill in the art that the scope of the invention, in keeping with its gist, is not limited by these examples.

Experimental Example 1. Preparation of tumor samples

For the tumor samples used in the experiments below, primary colorectal cancer tissue (PC), synchronous liver metastasis tissue histologically confirmed as adenocarcinoma (MC), and normal colonic epithelial tissue (NC) were obtained from 18 patients treated at Asan Medical Center between May 2011 and February 2012 (AMC cohort). All patients in the AMC cohort were microsatellite stable (MSS), except for one patient with high-frequency microsatellite instability (MSI-H). All tumor samples were obtained with the patients' consent to tissue donation and experimental use. The study protocol was approved, in accordance with the Declaration of Helsinki, by the Institutional Review Board for Human Genetic and Genomic Research (registration number 2009-0091).

Experimental Example 2. RNA extraction and RNA sequencing

RNA was isolated from the samples obtained in Experimental Example 1 using the RNeasy Mini Kit (Qiagen, CA) according to the manufacturer's protocol. RNA quality and integrity were checked under ultraviolet light after agarose gel electrophoresis and ethidium bromide staining. Sequencing libraries were prepared with the TruSeq RNA Sample Preparation kit v2 (Illumina, CA) according to the manufacturer's instructions. Briefly, mRNA was purified from total RNA using poly-T oligo-attached magnetic beads, fragmented, and converted to cDNA. Adapters were ligated to the cDNA, and the fragments were amplified by PCR. Sequencing was performed on a HiSeq 2000 (Illumina) with paired-end reads (2×100 bp).

Experimental Example 3. RNA sequence data processing

Reference genome sequence data for Homo sapiens were obtained from the University of California Santa Cruz Genome Browser Gateway (assembly ID: hg19). The reference genome index was built using the Bowtie2-build component of Bowtie2 (ver. 2.0) and SAMtools (ver. 0.1.18). Tophat2 was applied to the tissue samples to map reads to the reference genome (ver. 2.0). Mapping statistics for Tophat2 are shown in Table 1 below. The data set generated by RNA sequencing is available in the Gene Expression Omnibus public database under data series accession number GSE50760.

[Table 1] Rates of mapping to the Homo sapiens reference genome by Tophat2 in the AMC cohort

Experimental Example 4. Public gene expression data sets and study design

The gene expression data set from the Hubrecht Institute was used as a validation data set to confirm the differential expression of the selected genes among the NC, PC, and MC tissue groups (HI cohort, GSE14297, n = 48). In the HI cohort, 18 pairs of primary colorectal cancer tissue (PC) and matched liver metastasis tissue (MC) from the same patients were analyzed by gene expression profiling. Expression data for normal colonic epithelial tissue (NC) and normal liver tissue are also present in the data set (Stange et al., 2010).

To develop a risk score classifier for predicting colorectal cancer prognosis, data collected for the Cartes d'Identité des Tumeurs (CIT) program of the French Ligue Nationale Contre le Cancer were used as the development data set (CIT cohort, GSE39582, n = 566). Patients who received preoperative chemotherapy and/or radiotherapy and patients with primary rectal cancer were excluded from the CIT cohort. Clinical and pathological data were extracted from medical records; patients were staged according to the American Joint Committee on Cancer (AJCC) staging system and monitored for recurrence (distant and/or locoregional) (Marisa et al., 2013).

To validate the classifier of the present invention, a gene expression data set generated from fresh-frozen tumor specimens was used; the specimens came from the tissue banks of the Royal Melbourne Hospital, Western Hospital, and Peter MacCallum Cancer Centre in Australia and of the H. Lee Moffitt Cancer Center in the United States (AUS cohort, GSE14333, n = 229). In the AUS cohort, patients who received preoperative chemotherapy and/or radiotherapy were excluded, as were tumor-derived total RNA samples unsuitable for microarray analysis [RNA integrity number (RIN) < 6]. Of these, 22 of 94 patients with stage II colorectal cancer and 64 of 91 patients with stage III colorectal cancer received, according to hospital protocol, standard adjuvant chemotherapy (single-agent 5-fluorouracil or capecitabine, or 5-fluorouracil combined with oxaliplatin) or postoperative concurrent chemoradiotherapy. Follow-up data and additional clinical data, including patient sex and TNM staging, for stage II and stage III patients in the AUS cohort were collected by Bio-grid Australia for the Australian patients and by the Moffitt Cancer Center Tumor Registry for the US patients (Jorissen et al., 2009).

Two data sets, from the Academic Medical Centre at the University of Amsterdam (GSE33113, n = 90) and the Institut Paoli-Calmettes in France (GSE37892, n = 130), were combined for classifier validation (UAPC cohort, n = 220). In GSE33113, patients were treated between 1997 and 2006, extensive medical records were kept, and long-term clinical follow-up was available for the majority. Paraffin-embedded and fresh-frozen tissue were available for obtaining gene expression profiles of all patients in GSE33113 (de Sousa et al., 2011). In GSE37892, a series of 130 patients with stage II and stage III colorectal cancer was retained, and expression profiles were generated on oligonucleotide microarrays. The primary end point was disease-free survival (DFS), defined as the time from surgery to the first confirmed recurrence. DFS data were available in the CIT, AUS, and UAPC cohorts.

All gene expression data were generated on the Affymetrix Human Genome U133 Plus 2.0 platform. The experimental design and validation strategy for the present invention followed the REMARK guidelines (McShane et al., 2005) and are shown in FIG. 1.

Experimental Example 5. Transcriptomic profiling and significance test

To obtain mRNA expression patterns in the tissue samples, a hierarchical clustering algorithm was applied, using the centred correlation coefficient as the measure of similarity and complete linkage clustering. For the cluster analysis, the FPKM (fragments per kilobase of transcript per million fragments mapped) of each sample was used to estimate the expression level of each gene. The FPKM data were quantile-normalized, log₂-transformed, and median-centred across genes and samples.
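The transformation and similarity measure described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the function names and the pseudocount of 1 added before the log₂ transform are assumptions, and the quantile normalization and complete linkage steps are omitted for brevity.

```python
import math
from statistics import median


def log2_median_center(fpkm, pseudocount=1.0):
    """Log2-transform FPKM values and subtract their median.

    The pseudocount is an assumption made here so that zero FPKM
    values can be transformed; the source does not state one.
    """
    logged = [math.log2(v + pseudocount) for v in fpkm]
    centre = median(logged)
    return [v - centre for v in logged]


def centered_correlation_distance(x, y):
    """Return 1 minus the centred (Pearson) correlation of two profiles.

    Both profiles must have the same length and non-zero variance.
    """
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    dx = [a - mean_x for a in x]
    dy = [b - mean_y for b in y]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = math.sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return 1.0 - numerator / denominator
```

The resulting distances could then be passed to any complete linkage hierarchical clustering routine; applying the centring per gene, per sample, or both is left to the caller.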

To estimate the significance of differences in gene expression between sample subgroups, the EdgeR package, which uses a negative binomial model, was used to detect differentially expressed genes from count data (Robinson et al., 2010). The dispersion of gene counts was estimated by the Cox-Reid profile-adjusted likelihood method. After model fitting and dispersion estimation, differentially expressed genes were selected using the generalized linear model (GLM) likelihood ratio test, which specifies the probability distribution according to the mean-variance relationship. The GLM likelihood ratio test is based on the principle of fitting a negative binomial GLM with Cox-Reid dispersion estimates. A difference in gene expression was considered statistically significant when the P value was less than 0.001 and the fold difference in expression between the two sample groups was 2 or more. When liver metastases were compared with primary colorectal cancer, liver-specific genes (309 genes) were filtered out using the TiGER database (Liu et al., 2008), and differentially expressed genes were selected from the remaining genes.
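The significance criterion above (P < 0.001 and a fold difference of at least 2 in either direction) amounts to a simple filter over per-gene test results. The sketch below assumes the results are already available as log₂ fold changes and P values; the function name and the input layout are hypothetical, and the liver-specific gene set is passed in by the caller rather than read from the TiGER database.

```python
import math


def select_de_genes(results, p_cutoff=0.001, min_fold=2.0, excluded=()):
    """Select differentially expressed genes.

    results maps gene name -> (log2 fold change, P value).
    A gene is kept when P < p_cutoff and the absolute fold change
    is at least min_fold, unless it is listed in excluded.
    """
    log2_threshold = math.log2(min_fold)
    skip = set(excluded)
    return [
        gene
        for gene, (log2_fc, p_value) in results.items()
        if gene not in skip
        and p_value < p_cutoff
        and abs(log2_fc) >= log2_threshold
    ]
```

For the metastasis versus primary comparison, the 309 liver-specific genes would be supplied through the excluded argument.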

SAMtools and VarScan (ver. 2.3.4) were used to identify somatic mutations from the RNA-sequencing data. First, the pileup command in SAMtools was used to generate a pileup file from each aligned BAM file. Then the somatic command of VarScan was applied to identify somatic mutations under two conditions: tumour versus normal, or metastatic versus primary tumour. dbNSFP (ver. 2.04), an integrated database of functional predictions for non-synonymous SNPs, was used to select functionally significant non-synonymous mutations from the VarScan output. Prediction scores from three prediction algorithms (Mutation assessor (MA), sorting intolerant from tolerant (SIFT), and Polyphen2 (PP2)) were considered together to determine significant variants.

Experimental Example 6. Gene set and upstream regulator analysis

Gene set enrichment analysis was performed to identify the most significant gene sets associated with disease processes, molecular and cellular functions, and physiological and developmental processes. The significance of over-represented gene sets was estimated using Fisher's exact test. To identify the key upstream regulators that explain the observed changes in gene expression, upstream regulator analysis was performed; this analysis counts the known targets of each regulator that are present in the data set and compares the direction of their change with what is expected from the literature, in order to predict likely relevant regulators. For each potential regulator, an overlap P-value and an activation Z-score were estimated. The overlap P-value, estimated by Fisher's exact test, measures whether there is a statistically significant overlap between the genes in the data set and the genes regulated by the regulator. The activation Z-score is used to infer the activation state of an upstream regulator, based on comparison with a model that assigns random regulation directions. A positive or negative activation Z-score indicates that the potential upstream regulator is activated or inhibited, respectively. Gene set enrichment and upstream regulator analyses were performed using the Ingenuity Pathway Analysis (IPA) Tool.
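The two statistics described above can be illustrated with a short sketch. It is not the IPA implementation: the overlap P-value is computed here as a right-tailed Fisher's exact test (a hypergeometric tail), which is an assumption about the test's sidedness, and the Z-score uses a simplified unweighted form in which each target contributes +1 when its change is consistent with activation and -1 when consistent with inhibition. All function and parameter names are hypothetical.

```python
from math import comb, sqrt


def overlap_p_value(total_genes, regulator_targets, dataset_genes, overlap):
    """Right-tailed Fisher's exact test for the overlap of two gene sets.

    Returns the probability of observing at least `overlap` shared genes
    when `dataset_genes` genes are drawn from `total_genes`, of which
    `regulator_targets` are known targets of the regulator.
    """
    upper = min(regulator_targets, dataset_genes)
    tail = sum(
        comb(regulator_targets, k)
        * comb(total_genes - regulator_targets, dataset_genes - k)
        for k in range(overlap, upper + 1)
    )
    return tail / comb(total_genes, dataset_genes)


def activation_z_score(directions):
    """Simplified unweighted Z-score from a non-empty list of +1/-1 votes."""
    if not directions:
        raise ValueError("at least one target direction is required")
    return sum(directions) / sqrt(len(directions))
```

Under this sketch, a regulator whose targets mostly change in the direction expected for activation yields a positive score, and one whose targets mostly change in the opposite direction yields a negative score.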

Experimental Example 7. Risk score development

To develop a conveniently usable risk score, we adopted a previously developed strategy that uses Cox regression coefficients for the genes in the signature, derived from the CIT cohort (Kim et al., 2012a; Paik et al., 2004). The risk score of each patient was calculated as the sum of the per-gene scores, each obtained by multiplying the expression level of the gene by its corresponding coefficient (Risk score = Σ Cox coefficient of gene Gi × expression value of gene Gi). Patients were then divided into two groups, high or low risk of recurrence, using the median risk score as the cut-off threshold. The coefficients and the threshold obtained from the CIT cohort were applied directly to the data from the AUS and UAPC cohorts to dichotomize patients into high-risk and low-risk groups. Time-dependent receiver operating characteristic (ROC) curves for DFS (disease-free survival), computed with the nearest neighbour estimation method and a 36-month cut-off, were used to identify a small number of genes with strong prognostic value (Heagerty et al., 2000). Based on the area under the curve (AUC) values derived from the ROC analysis, the top genes with the highest or lowest AUC values were selected to generate the optimal signature.
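The risk score formula and the transfer of the training cut-off to a validation cohort can be sketched in Python as follows. The gene names, coefficients, and expression values are hypothetical and are not taken from Table 7. The rule that a score exactly equal to the cut-off is assigned to the low-risk group is also an assumption, since the text does not say how ties are handled.

```python
from statistics import median

def risk_score(expression, coefficients):
    """Sum over genes of (Cox coefficient x expression value) for one patient."""
    return sum(coefficients[gene] * expression[gene] for gene in coefficients)

def classify(score, cutoff):
    # Assumption: strictly above the cut-off means high risk.
    return "high" if score > cutoff else "low"

# Hypothetical Cox coefficients learned in the training cohort.
coefficients = {"GENE_1": 0.5, "GENE_2": -0.2, "GENE_3": 1.0}

# Hypothetical training cohort: the median score becomes the cut-off.
training = [
    {"GENE_1": 2.0, "GENE_2": 1.0, "GENE_3": 0.5},
    {"GENE_1": 1.0, "GENE_2": 3.0, "GENE_3": 0.2},
    {"GENE_1": 4.0, "GENE_2": 0.5, "GENE_3": 1.5},
    {"GENE_1": 0.5, "GENE_2": 2.0, "GENE_3": 0.1},
]
training_scores = [risk_score(patient, coefficients) for patient in training]
cutoff = median(training_scores)

# Hypothetical validation cohort: coefficients and cut-off are reused unchanged.
validation = [
    {"GENE_1": 3.0, "GENE_2": 1.0, "GENE_3": 1.0},
    {"GENE_1": 0.2, "GENE_2": 2.0, "GENE_3": 0.0},
]
validation_groups = [
    classify(risk_score(patient, coefficients), cutoff) for patient in validation
]
```

The key design point is that nothing is re-estimated in the validation cohort: both the coefficients and the cut-off come from the training cohort.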

Experimental Example 8. Random classifiers generation

Following the procedure of Experimental Example 7, genes were randomly selected to match the size of the signature, and the regression coefficients of these genes and the median cut-off risk score were obtained in the CIT cohort. The regression coefficients were then applied directly to the data from the AUS cohort to classify patients into high-risk and low-risk groups. Prognostic value was estimated in all patients and in the subsets of patients with stage 2 colorectal cancer, patients with stage 3 colorectal cancer, and elderly patients with stage 3 colorectal cancer. Predictive value for chemotherapy treatment was assessed in the high-risk or low-risk groups of stage 3 colorectal cancer defined by the random classifiers. To estimate the performance of the randomly generated classifiers, a bootstrap method (1000 resamplings) was used to calculate the 95% confidence interval for -log₁₀(log-rank P-value).
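A percentile bootstrap for -log₁₀(P-value) can be sketched in Python as below. The text does not specify which quantity was resampled, so this sketch assumes that the log-rank P-values of a set of random classifiers are resampled with replacement and that the interval is taken on their mean -log₁₀(P). The function name, the seed, and the example P-values are hypothetical.

```python
import math
import random

def bootstrap_ci(p_values, n_resamples=1000, level=0.95, seed=0):
    """Percentile bootstrap interval for the mean of -log10(P)."""
    rng = random.Random(seed)
    scores = [-math.log10(p) for p in p_values]
    n = len(scores)
    means = []
    for _ in range(n_resamples):
        # Draw n scores with replacement and record their mean.
        resample = [scores[rng.randrange(n)] for _ in range(n)]
        means.append(sum(resample) / n)
    means.sort()
    alpha = (1 - level) / 2
    lower = means[math.floor(alpha * (n_resamples - 1))]
    upper = means[math.ceil((1 - alpha) * (n_resamples - 1))]
    return lower, upper

# Hypothetical log-rank P-values from five random classifiers.
random_p_values = [0.5, 0.2, 0.8, 0.1, 0.3]
ci_low, ci_high = bootstrap_ci(random_p_values)

# A candidate classifier lies outside the random range if its -log10(P)
# exceeds the upper bound of the interval.
candidate_score = -math.log10(4.42e-8)
outside_random_range = candidate_score > ci_high
```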

Experimental Example 9. Other statistical analyses

To estimate the significance of differences in gene expression between subgroups, two-sample tests were performed for each gene. The Kaplan-Meier method was used to calculate the time to DFS (disease-free survival), and the difference in survival between two groups was assessed using log-rank statistics. The prognostic association between the signature and potential risk factors was assessed using multivariate Cox proportional hazard regression models. A backward-forward step procedure (function step, R package stats) was applied to optimize the multivariate model with the most informative variables (Venables et al., 2002). Cramer's V statistic and two-way contingency-table analysis were applied to assess the strength of the association between the outcomes predicted by different classifiers. Statistical analyses were performed using the R language environment (ver. 3.0.1).
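Cramer's V for a two-way contingency table can be computed from the Pearson chi-square statistic, as in the following Python sketch. The analysis in the text was run in R. This standalone version, its function name, and the toy tables are illustrative assumptions, and no continuity correction is applied.

```python
from math import sqrt

def cramers_v(table):
    """Cramer's V for a two-way contingency table given as a list of rows."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    chi2 = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed - expected) ** 2 / expected
    k = min(len(row_totals), len(col_totals)) - 1
    return sqrt(chi2 / (n * k))

# Hypothetical tables: rows are risk calls of one classifier (high, low),
# columns are risk calls of another classifier (high, low).
perfect_agreement = cramers_v([[10, 0], [0, 10]])
no_association = cramers_v([[5, 5], [5, 5]])
```

Values near 1 indicate that two classifiers assign patients to almost the same risk groups, and values near 0 indicate that their calls are unrelated.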

Example 1. Baseline characteristics

The baseline characteristics of the patients with colorectal cancer in the AMC and HI cohorts are shown in Table 2 below.

[Table 2] Baseline characteristics of patients with colorectal cancer in the AMC and HI cohorts

The baseline characteristics of the patients in the three other cohorts used to develop the prognostic classifier are shown in Table 3 below. Among the three classifier-development cohorts, adjuvant chemotherapy data were available for the CIT and AUS cohorts. Of the 795 patients in these cohorts, 320 received standard adjuvant chemotherapy, while the remaining 475 received no chemotherapy.

[Table 3] Baseline characteristics of patients with colorectal cancer in the three cohorts used to develop and validate the genetic predictor

Example 2. Differential molecular characteristics among CRC patient samples with distinct phenotypes

To identify a set of genes significantly associated with the development and progression of colorectal cancer, we applied various analytical methods to the three sample types (NC, PC, and MC) of the AMC cohort. Hierarchical clustering analysis was first applied to the gene expression data to evaluate the molecular characteristics of the different sample groups. Unsupervised hierarchical clustering of the gene expression data yielded three major clusters: primary colorectal cancer tissue (PC), synchronous liver metastasis tissue histologically confirmed as adenocarcinoma (MC), and normal colonic epithelial tissue (NC) (Fig. 2). Thus, the gene expression patterns of the PC, MC, and NC tissue groups were readily distinguishable.

The genes differentially expressed among the three tissue groups were then identified. When comparing PC and MC tissues, liver-specific genes (309 genes) were filtered out using the TiGER database. Genes associated with the development and metastasis of cancer were identified by a Venn diagram comparison of two gene lists generated using the GLM likelihood ratio test (Fig. 3A, P < 0.001). Gene list "A" represents genes differentially expressed between the NC and PC groups, and gene list "B" represents genes differentially expressed between the PC and MC groups. Comparison of the two gene lists revealed three different patterns: A only (2018 genes), A and B (843 genes), and B only (1003 genes) (Fig. 3B). Genes in category A had an expression pattern associated with cancer development, while genes in category B had an expression pattern associated with liver metastasis. Genes in category A and B were common to the development and metastasis of colorectal cancer.

We also identified, from the RNA-sequencing data, genes with significant somatic sequence changes between the different tissue groups. Three criteria were applied for gene selection: 1) only somatic mutations were considered; 2) somatic mutations in the same gene were observed in two or more patients; and 3) the somatic mutation was predicted to be functionally significant (based on cut-off values for three scores: MA score > 1.5, SIFT score < 0.05, or PP2 score > 0.9). Based on these criteria, we obtained two lists of genes with significant sequence changes: 36 genes between the NC and PC groups and 57 genes between the PC and MC groups. These are shown in Tables 4 and 5 below.
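The three selection criteria can be expressed as a small filter, sketched here in Python. The record layout, field names, and gene and patient identifiers are hypothetical. The sketch also assumes that the criteria are combined on the same variants, that is, a gene is kept only if functionally significant somatic mutations are found in at least two patients, which the text does not state explicitly.

```python
from collections import defaultdict

def is_functional(variant):
    """Criterion 3: any one of the three score cut-offs is met."""
    return variant["ma"] > 1.5 or variant["sift"] < 0.05 or variant["pp2"] > 0.9

def select_genes(variants, min_patients=2):
    """Return genes with qualifying somatic mutations in >= min_patients patients."""
    patients_by_gene = defaultdict(set)
    for variant in variants:
        # Criterion 1 (somatic only) and criterion 3 (functional significance).
        if variant["somatic"] and is_functional(variant):
            patients_by_gene[variant["gene"]].add(variant["patient"])
    # Criterion 2: the same gene is mutated in two or more patients.
    return sorted(
        gene for gene, patients in patients_by_gene.items()
        if len(patients) >= min_patients
    )

# Hypothetical variant calls.
variants = [
    {"gene": "GENE_A", "patient": "P1", "somatic": True, "ma": 2.0, "sift": 0.20, "pp2": 0.5},
    {"gene": "GENE_A", "patient": "P2", "somatic": True, "ma": 1.0, "sift": 0.01, "pp2": 0.5},
    {"gene": "GENE_B", "patient": "P1", "somatic": True, "ma": 3.0, "sift": 0.01, "pp2": 0.95},
    {"gene": "GENE_C", "patient": "P1", "somatic": False, "ma": 3.0, "sift": 0.01, "pp2": 0.95},
    {"gene": "GENE_C", "patient": "P2", "somatic": False, "ma": 3.0, "sift": 0.01, "pp2": 0.95},
    {"gene": "GENE_D", "patient": "P1", "somatic": True, "ma": 0.5, "sift": 0.50, "pp2": 0.1},
    {"gene": "GENE_D", "patient": "P2", "somatic": True, "ma": 0.5, "sift": 0.50, "pp2": 0.1},
]
selected = select_genes(variants)
```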

[Table 4] Significant genes with differing variants between the normal and primary cancer groups

[Table 5] Significant genes with differing variants between the primary cancer and metastatic cancer groups

Example 3. Regulators activated in the development and liver metastasis of colorectal cancer

A gene set enrichment test was performed using the IPA software to identify the biological characteristics of the genes involved in cancer development and metastasis. From the 843 genes in category A and B (Fig. 3A), 224 genes that did not consistently increase or decrease between the PC group and the NC or MC group were excluded. The gene set associated with cancer development was defined as the combination of the 2018 genes in category A (Fig. 3A), the 36 variant genes, and the 619 genes in category A and B (2671 genes in total, with two overlapping genes). Similarly, the gene set associated with metastasis was defined as the combination of the 1003 genes in category B (Fig. 3A), the 57 variant genes, and the 619 genes in category A and B (1673 genes in total, with six overlapping genes). The 2671 and 1673 genes were analyzed using IPA, and genes related to cancer, gastrointestinal disease, cell growth and proliferation, and cell death and survival were significantly enriched in both sets. In addition, genes related to inflammatory response, immune cell trafficking, and inflammatory disease were significantly enriched (Fig. 4). These results indicate that the two selected gene lists reflect common biological characteristics and that the pathological processes involved in the development and liver metastasis of colorectal cancer share many biological functions.
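The gene counts quoted above are internally consistent, as the following short Python check shows. It assumes that the "overlapping" genes are those counted in more than one component list, so that they are subtracted once from the sum. The function and variable names are illustrative.

```python
def combined_size(category_only, variant_genes, shared, overlapping):
    """Size of the combined gene set when `overlapping` genes appear twice."""
    return category_only + variant_genes + shared - overlapping

# 843 genes in category "A and B", minus 224 genes without a consistent trend.
shared_genes = 843 - 224

development_set = combined_size(2018, 36, shared_genes, overlapping=2)
metastasis_set = combined_size(1003, 57, shared_genes, overlapping=6)
```

Under that assumption, the sketch reproduces the 619 shared genes and the totals of 2671 and 1673 genes given in the text.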

The enriched genes included several important regulators. Among them, TREM1 and CTGF were the two major regulators activated during the development and metastasis of colorectal cancer (Table 6, Figs. 5 and 6).

[Table 6] Predicted upstream regulators activated in cancer development or metastasis

The expression level of TREM1 was significantly higher in the PC and MC groups than in the NC group (two-sample t-test, P = 8.5 × 10^-7 and 2.68 × 10^-7, respectively; Fig. 7A), indicating that activation of the TREM1 signaling network may be a major event related to the aggressiveness of colorectal cancer. Although no significant difference in CTGF expression was observed between the tissue groups (P = 0.17 (NC vs. PC) and 0.27 (NC vs. MC); Fig. 7B), the activation score for CTGF is estimated from the differentially expressed or variant molecules directly connected to CTGF (Fig. 6), so CTGF can be seen to be strongly associated with the development and metastasis of colorectal cancer (Table 6).

To verify the differences in TREM1 and CTGF among the tissue groups (NC, PC, and MC), the expression levels of the two genes were analyzed using gene expression data from the HI cohort. The expression level of TREM1 was significantly higher in the PC and MC groups than in the NC group (two-sample t-test, P = 0.01 and 3.5 × 10^-4, respectively), which supports the validity of this molecule as a colorectal cancer marker. Furthermore, TREM1 expression differed significantly between the PC and MC groups (two-sample t-test, P = 0.02; Fig. 7C). In contrast, CTGF expression did not differ among the three groups (Fig. 7D), consistent with the results from the RNA-sequencing data of the AMC cohort.

Example 4. Development of a risk score using TREM1 and CTGF regulatory networks and its validation in independent cohorts

The prognostic value of the set of genes regulated by TREM1 or CTGF was assessed in additional colorectal cancer cohorts. A total of 66 genes regulated by TREM1 or CTGF, identified from the RNA-sequencing data (Figs. 5 and 6), were used to generate a risk score classifier defined by TREM1 or CTGF activation (TCA66), and this score was subsequently used as a risk assessment model for colorectal cancer prognosis in the CIT cohort. The risk score of each patient in the CIT cohort was calculated using the regression coefficients of each of the 66 genes (130 unique probes) (Table 7).

[Table 7] Regression coefficients from multivariate Cox regression analysis and AUC values of the 66 genes (130 unique probes) regulated by TREM1 and CTGF

Using the median cut-off of the risk score (8.410), the colorectal cancer samples in the CIT cohort were classified into two groups with high or low TCA66 scores (Fig. 8A). DFS (disease-free survival) differed significantly between the two groups in the log-rank test analysis (P = 5.62 × 10^-7; Fig. 8B). To validate the TCA66 scoring system, the coefficients and the median cut-off value derived from the CIT cohort were applied directly to the gene expression data of the AUS cohort to dichotomize patients into high-risk and low-risk groups. Kaplan-Meier estimation showed a significant difference in DFS between the two subgroups (log-rank test, P = 2.14 × 10^-4; Fig. 8C).
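For readers unfamiliar with the log-rank test used to compare DFS between risk groups, a minimal two-group version is sketched below in Python. The analyses in the text were run in R. This standalone implementation, its function name, and the follow-up times are illustrative assumptions. The P-value uses the fact that the upper tail of a chi-square distribution with one degree of freedom equals erfc(sqrt(x / 2)).

```python
from math import erfc, sqrt

def log_rank_p(times_a, events_a, times_b, events_b):
    """Two-group log-rank test. Events are 1 (recurrence) or 0 (censored)."""
    data = [(t, e, "a") for t, e in zip(times_a, events_a)]
    data += [(t, e, "b") for t, e in zip(times_b, events_b)]
    event_times = sorted({t for t, e, _ in data if e})
    observed_a = expected_a = variance = 0.0
    for t in event_times:
        n = sum(1 for u, _, _ in data if u >= t)                  # at risk, both groups
        n_a = sum(1 for u, _, g in data if u >= t and g == "a")   # at risk, group a
        d = sum(1 for u, e, _ in data if u == t and e)            # events at time t
        d_a = sum(1 for u, e, g in data if u == t and e and g == "a")
        observed_a += d_a
        expected_a += d * n_a / n
        if n > 1:
            variance += d * (n_a / n) * (1 - n_a / n) * (n - d) / (n - 1)
    if variance == 0:
        return 1.0
    chi2 = (observed_a - expected_a) ** 2 / variance
    return erfc(sqrt(chi2 / 2))  # chi-square upper tail, 1 degree of freedom

# Hypothetical DFS times in months: early recurrences in the high-risk group.
p_separated = log_rank_p([1, 2, 3], [1, 1, 1], [10, 11, 12], [1, 1, 1])
p_identical = log_rank_p([1, 2, 3, 4, 5], [1] * 5, [1, 2, 3, 4, 5], [1] * 5)
```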

To select a small number of genes with a significant impact on prognosis, a time-dependent ROC analysis based on 3-year survival was performed. Among the 66 genes of TCA66, 19 significant genes (32 unique probes) (P < 0.05, Cox regression analysis) with the highest or lowest AUC scores (AUC < 0.45 or AUC > 0.55) were selected empirically (Table 7). A risk score classifier based on these 19 genes was generated in the CIT cohort (TCA19). When the TCA19 classifier was applied to the CIT cohort with the median cut-off (8.053), DFS (disease-free survival) was shorter in the high-risk patient group than in the low-risk patient group (log-rank test, P = 4.42 × 10^-8; Fig. 9B). When the coefficients and the median cut-off value were applied directly to the AUS cohort, the recurrence rate of high-risk patients was significantly higher than that of low-risk patients (log-rank test, P = 7.52 × 10^-4; Fig. 9C). Because DFS data were also available for two other cohorts (GSE33113 and GSE37892), the relationship between DFS and the TCA19 classifier was evaluated further. For this validation, the gene expression data from the two cohorts were combined (UAPC cohort, n = 220) and the same procedure was applied. The recurrence rate of the high-risk subgroup defined by TCA19 was significantly higher than that of the low-risk subgroup (log-rank test, P = 0.005; Fig. 9D).
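The selection rule stated above (AUC < 0.45 or AUC > 0.55, together with P < 0.05 in the Cox regression analysis) amounts to a simple filter over per-gene statistics, sketched here in Python. The gene names and values are hypothetical and are not taken from Table 7, and the sketch ignores the fact that a gene may be measured by several probes.

```python
def select_prognostic_genes(stats, auc_low=0.45, auc_high=0.55, alpha=0.05):
    """Keep genes whose AUC is far from 0.5 and whose Cox P-value is below alpha.

    `stats` maps a gene name to a pair (auc, cox_p_value).
    """
    return sorted(
        gene
        for gene, (auc, p_value) in stats.items()
        if (auc < auc_low or auc > auc_high) and p_value < alpha
    )

# Hypothetical per-gene statistics.
gene_stats = {
    "GENE_1": (0.61, 0.003),  # high AUC, significant: kept
    "GENE_2": (0.41, 0.020),  # low AUC, significant: kept
    "GENE_3": (0.52, 0.001),  # AUC too close to 0.5: dropped
    "GENE_4": (0.60, 0.200),  # not significant: dropped
}
signature = select_prognostic_genes(gene_stats)
```

Genes with a low AUC are retained alongside those with a high AUC because an AUC well below 0.5 still carries prognostic information, with higher expression pointing to a better rather than a worse outcome.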

Example 5. The TCA19 classifier is an independent risk factor for DFS (disease-free survival) in CRC

To assess the independence of the TCA19 predictor, gene expression data from the two validation cohorts, AUS and UAPC (n = 449), were pooled and stratified by the AJCC stage of the patients (Hari et al., 2013). In this analysis, stage I patients were excluded because of the lack of DFS (disease-free survival) events, and DFS data from stage IV patients were not available. When the TCA19-based stratification was applied to the stage II-III patients of the pooled cohort, the high-risk patients with stage III disease showed significantly worse DFS than the low-risk patients (P = 0.026; Fig. 10B), whereas DFS did not differ significantly between the high-risk and low-risk patients with stage II disease (P = 0.326; Fig. 10A). These results indicate that the TCA19 classifier is a potential predictor of DFS in patients with advanced colorectal cancer. The prognostic association between the signature and other potential risk factors for DFS in the pooled cohort was assessed by multivariate Cox regression analysis. TCA19 was found to be an independent risk factor for DFS even after applying the variable selection procedure (HR = 1.894, 95% CI = 1.227-2.809, P = 0.002; Table 8). Another multivariate analysis was performed in the AUS cohort, in which TCA19 still retained statistical significance for DFS (HR = 2.24, 95% CI = 1.22-4.114, P = 0.009; Table 8).
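The relationship between a hazard ratio, its 95% confidence interval, and its P-value can be illustrated with the AUS cohort figures reported above (HR = 2.24, 95% CI = 1.22-4.114). The Python sketch below assumes a Wald-type interval that is symmetric on the log scale and a two-sided Wald test. The text does not state how the interval and P-value were computed, so this is a plausibility check rather than a reproduction of the analysis.

```python
from math import erfc, log, sqrt

def wald_from_ci(hr, ci_low, ci_high, z_crit=1.959964):
    """Recover the standard error of log(HR), the Wald z statistic,
    and the two-sided P-value from a hazard ratio and its 95% CI."""
    se = (log(ci_high) - log(ci_low)) / (2 * z_crit)
    z = log(hr) / se
    p_value = erfc(abs(z) / sqrt(2))  # two-sided normal tail probability
    return se, z, p_value

# Figures reported for TCA19 in the AUS cohort.
se_aus, z_aus, p_aus = wald_from_ci(2.24, 1.22, 4.114)
```

Under these assumptions the recovered P-value is about 0.009, in line with the value reported in the text.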

[Table 8] Univariate and multivariate Cox regression analyses of predictors of DFS (disease-free survival)

A Cox proportional hazard regression model was also used to analyze the interaction between the TCA19 classifier and the AJCC stage. In the CIT cohort, the interaction of TCA19 with stage reached a significant level (P < 2.0 × 10^-16), with an estimated HR for TCA19 of 1.55 in stage II (95% CI 1.103-3.119, P = 0.019) and 2.477 in stage III (95% CI 1.503-4..08, P = 3.693 × 10^-4; Fig. 11A). A strong interaction between TCA19 and stage was also observed in the AUS and UAPC cohorts (P = 1.185 × 10^-11). The HRs for TCA19 in stage II and stage III were 2.038 (95% CI 1.092-3.802, P = 0.025) and 1.778 (95% CI 1.063-2.972, P = 0.028), respectively (Fig. 11B). These results show that the TCA19 classifier interacts strongly with the current staging system and is independent of it.

We also assessed the association between TCA19 and MSI status. Among the patient cohorts, DNA mismatch repair (MMR) status data [dMMR (deficient MMR) or pMMR (proficient MMR)] were available for the CIT cohort. Because colorectal cancers with dMMR show MSI-H and cancers with pMMR show low-frequency MSI (MSI-L) or MSS (Sinicrope et al., 2011), the multivariate Cox regression analysis in the CIT cohort was performed using the MMR categories. TCA19 was an independent risk factor for DFS (disease-free survival) alongside stage and MMR status (HR = 1.952, 95% CI = 1.407-2.708, P = 6.155 × 10^-5) (Table 9). Patients were also stratified by MMR status, and the prognostic value in each subgroup was estimated. High-risk patient groups were successfully identified under both MMR conditions (log-rank test, each P < 0.05; Fig. 12). These findings strongly indicate that the TCA19 classifier is independent of the current MMR (or MSI) categories.

[Table 9] Univariate and multivariate Cox regression analyses for the prediction of DFS (disease-free survival) in the CIT cohort

Example 6. The TCA19 classifier is associated with DFS after adjuvant chemotherapy

Among the validation cohorts for the TCA19 predictor, adjuvant chemotherapy data were available for the AUS cohort. We analyzed whether the TCA19 classifier could predict sensitivity to adjuvant chemotherapy. The analysis was performed on patients with AJCC stage 3 disease (n = 91), in whom adjuvant chemotherapy is known to prolong survival (Laurie et al., 1989; Moertel et al., 1990). Prognostic value was first estimated in the stage 3 patients and, as expected and consistent with the evaluation including all patients, TCA19 identified patients at high risk with respect to DFS (disease-free survival) (Figs. 13A and B). Interestingly, in the evaluation of elderly patients with stage 3 colorectal cancer (75 years or older, n = 23, 8 with recurrence), TCA19 also successfully identified high-risk patients (Fig. 13C), indicating that the TCA19 classifier has considerable prognostic potential even in elderly patients with advanced colorectal cancer. To assess the predictive value of the TCA19 classifier in stage 3 patients, the patients were classified into high-risk and low-risk subgroups based on the TCA19 risk score, and the difference in DFS was evaluated independently in each subgroup. Adjuvant chemotherapy improved DFS in the TCA19 high-risk patient subgroup (P = 0.009, Fig. 13D), whereas no association was found in the low-risk patient subgroup (P = 0.704, Fig. 13E). In the Cox regression model, the interaction of TCA19 with adjuvant chemotherapy reached a level of 0.599 (Fig. 14). However, consistent with the Kaplan-Meier plots and the log-rank test, the HR estimated for adjuvant chemotherapy in the high-risk patient group classified by TCA19 was 0.363 (95% CI = 0.163-0.805; P = 0.013), showing significant predictive value, whereas the HR for recurrence with adjuvant chemotherapy in the low-risk patient group was 0.758 (95% CI = 0.180-3.186; P = 0.705). This shows that the TCA19 classifier not only has predictive value for patients with advanced colorectal cancer but also has significant prognostic potential in patients with stage 3 colorectal cancer regardless of age.

A further assessment of the predictive value of TCA19 was performed in elderly patients with stage 3 colorectal cancer. To avoid under-sampling, the AUS cohort was merged with the CIT cohort, for which adjuvant chemotherapy data were available. The elderly patients with stage 3 colorectal cancer (75 years or older, n = 84) were classified into high-risk and low-risk subgroups using the classifier, and the difference in DFS (disease-free survival) according to chemotherapy was evaluated independently in each subgroup. Although statistical significance was not reached, a trend toward benefit from adjuvant chemotherapy was found in the TCA19-based high-risk patients, whereas no association was observed in the low-risk patient group (Fig. 15). In addition, the colorectal cancer patients in the merged cohort were divided into a group that received chemotherapy and a group that did not, and multivariate analysis was performed on each group. The TCA19 classifier remained an independent risk factor for DFS in both the group that received chemotherapy (HR = 1.851, 95% CI = 1.283-2.671, P = 9.946 × 10^-4; Table 10) and the group that did not (HR = 2.287, 95% CI = 1.478-3.538, P = 2.401 × 10^-4; Table 10). These results demonstrate that the TCA19 predictor can be independent of adjuvant chemotherapy, even though patient selection bias was present in the patient cohorts used in the present study.

[Table 10] Univariate and multivariate Cox regression analyses for the prediction of DFS according to chemotherapy status in the merged AUS and CIT cohort

Example 7. Comparison of other genomic predictors with the TCA19 classifier

MDA114 (a 114-gene prognostic predictor from MD Anderson Cancer Center) (Oh et al., 2012) and OncoDX (the 7-gene Oncotype DX recurrence score) (Clark-Langone et al., 2010) have performed well in identifying patients with poor prognosis (Park et al., 2013), so these two predictors were compared with the TCA19 predictor. The original prediction methods and threshold values of the three classifiers (TCA19, MDA114, and OncoDX) were applied, and patients in the AUS cohort were stratified by the risk level predicted by each classifier. Kaplan-Meier plots showed significant differences in DFS (disease-free survival) rates between the high-risk and low-risk patient groups defined by each gene predictor (FIG. 16). When patients were stratified by AJCC stage, all classifiers successfully identified high-risk patients with stage 3 colorectal cancer, whereas no classifier showed significant prognostic value in stratifying patients with stage 2 colorectal cancer. Notably, in elderly (75 years or older) patients with stage 3 colorectal cancer, TCA19 was the only predictor that identified high-risk patients; neither MDA114 nor OncoDX showed such predictive ability (FIG. 17). A further subset analysis was performed to examine the association between the outcome predicted by each classifier and sensitivity to adjuvant chemotherapy. The stage 3 high-risk subgroups defined by all gene predictors showed a potential benefit from adjuvant chemotherapy, whereas the low-risk subgroups did not (FIG. 18). The characteristics of each classifier are summarized in Table 11: all classifiers could predict colorectal cancer outcome, but only the TCA19 predictor had prognostic potential in elderly patients with stage 3 colorectal cancer. Agreement between the predicted outcomes was assessed by comparing the patient subgroups predicted by each model (Cramér's V statistic; Table 12). All associations between predictors were statistically significant (χ² test, r = 0.219-0.472; P < 0.001), and the strongest association was observed between TCA19 and OncoDX (χ² test, r = 0.472; P = 1.209 × 10^-12).
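The agreement measure named above, Cramér's V, can be illustrated on a 2 × 2 cross-tabulation of the high-risk and low-risk calls of two classifiers. The following is a minimal sketch only: the counts are hypothetical placeholders chosen to sum to 229 (they are not the values of Table 12), and the function names are assumptions introduced for illustration.

```python
import math

def chi_square(table):
    """Pearson chi-square statistic for an r x c table of counts (no continuity correction)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

def cramers_v(table):
    """Cramér's V: sqrt(chi2 / (n * (k - 1))), where k is the smaller table dimension."""
    n = sum(sum(row) for row in table)
    k = min(len(table), len(table[0]))
    return math.sqrt(chi_square(table) / (n * (k - 1)))

def p_value_df1(chi2):
    """Upper-tail P value of a chi-square statistic with 1 degree of freedom (2 x 2 tables)."""
    return math.erfc(math.sqrt(chi2 / 2.0))

# Hypothetical counts: rows = classifier A (high, low), columns = classifier B (high, low)
example = [[60, 40], [30, 99]]
v = cramers_v(example)
p = p_value_df1(chi_square(example))
```

For a 2 × 2 table, V ranges from 0 (no association) to 1 (identical calls), which makes values such as 0.219 and 0.472 directly comparable across classifier pairs.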

[Table 11] Performance of the three classifiers in the AUS cohort (n = 229)

[Table 12] Agreement among the three classifiers in the AUS cohort

In addition, the prognostic and predictive value of TCA19 was validated by comparison with randomly generated gene predictors. Using a procedure similar to that used to develop and validate the TCA19 classifier, classifiers were built from 19 randomly selected genes and their predictive value was assessed. For the random classifiers, the mean P value from the log-rank test did not reach statistical significance in any subset category. By comparison, the significance levels of TCA19 for all patients, stage 3 colorectal cancer patients, elderly stage 3 colorectal cancer patients, and chemotherapy sensitivity in the stage 3 high-risk group fell outside the confidence range of the significance levels obtained for the random classifiers (FIG. 19A), indicating that the TCA19 classifier did not arise by chance. Although some random classifiers outperformed TCA19 in individual subset analyses, no random classifier outperformed TCA19 across all subset categories in the bootstrap resampling analysis (FIG. 19B).
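One common way to summarize a comparison against random gene sets is an empirical P value: the fraction of random classifiers that score at least as well as the classifier under test. The sketch below illustrates that idea and is not necessarily the procedure used for FIG. 19. The gene pool, the scores, and the function names are hypothetical; in practice the score could be, for example, the negative log10 of a log-rank P value.

```python
import random

def random_gene_sets(gene_pool, set_size, n_sets, seed=0):
    """Draw n_sets random gene sets of set_size genes each, without replacement within a set."""
    rng = random.Random(seed)
    return [rng.sample(gene_pool, set_size) for _ in range(n_sets)]

def empirical_p_value(observed_score, random_scores):
    """Fraction of random scores >= the observed score, with the usual +1 correction."""
    at_least = sum(1 for s in random_scores if s >= observed_score)
    return (at_least + 1) / (len(random_scores) + 1)

# Hypothetical gene pool; real use would draw from the genes on the expression platform
gene_pool = [f"GENE{i}" for i in range(1000)]
gene_sets = random_gene_sets(gene_pool, set_size=19, n_sets=200)

# Hypothetical scores for illustration only
p_emp = empirical_p_value(5.0, [1.0, 2.0, 3.0, 4.0])
```

The +1 correction keeps the estimate away from exactly zero, so the smallest reportable value is bounded by the number of random sets drawn.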

Example 8. Biological characteristics of the classifier and comparison with CRC subtypes

A gene set enrichment test was performed on the TCA66 genes to study the biological characteristics of the classifier (Table 7). Genes related to inflammatory disease and response, a well-known activity of TREM1, were enriched. In addition, terms related to cellular development, cell growth and proliferation, cell-to-cell signaling and interaction, and cell death and survival were identified (FIG. 20). These results suggest that the 66 genes, including TREM1 and CTGF, may have substantial activity associated with cancer progression beyond inflammatory or immune responses.

In addition, TCA19 was compared with recently reported colorectal cancer subtypes. In the AUS cohort, the classification of colorectal cancer patients by TCA19 was compared with that by CRCassigner (Sadanandam et al., 2013). The five CRCassigner subtypes (goblet-like, enterocyte, stem-like, inflammatory, and transit-amplifying) were compared with the TCA19 classifier. Most patients of the stem-like subtype (33 of 38, 86.8%), which relapses within a short time and benefits most from adjuvant chemotherapy, were classified as high risk by TCA19. In addition, 41.5% (17 of 41) of the inflammatory subtype, which showed a moderate benefit from chemotherapy among the five subtypes, were classified into the TCA19 high-risk subgroup (FIG. 21A). When patients were ordered by TCA19 risk score, most colorectal cancer patients with high scores belonged to the stem-like subtype (FIG. 21B), indicating that the high-risk subgroup defined by TCA19 closely reflects a characteristic biological subtype of colorectal cancer with poor prognosis. CRCassigner (786 genes) and the classifier of the present invention (66 genes) shared 16 genes. Among them, TREM1 and CTGF had the highest PAM (Prediction Analysis of Microarrays) scores in the inflammatory and stem-like subtypes, respectively (Table 13) (Sadanandam et al., 2013).
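Finding the genes shared by two signatures amounts to a set intersection over gene symbols. The sketch below shows this under stated assumptions: the two lists are short placeholders rather than the real 66-gene and 786-gene signatures, only TREM1 and CTGF are named in the text as shared genes, and `shared_genes` is a hypothetical helper.

```python
def shared_genes(signature_a, signature_b):
    """Return the sorted gene symbols present in both signatures (case-insensitive, whitespace-trimmed)."""
    a = {g.strip().upper() for g in signature_a}
    b = {g.strip().upper() for g in signature_b}
    return sorted(a & b)

# Placeholder lists for illustration; PLACEHOLDER_* symbols are hypothetical
tca66_subset = ["TREM1", "CTGF", "PLACEHOLDER_A"]
crcassigner_subset = ["ctgf", "TREM1 ", "PLACEHOLDER_B"]

common = shared_genes(tca66_subset, crcassigner_subset)
```

Normalizing case and whitespace before intersecting avoids undercounting when the two signatures come from differently formatted source tables.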

[Table 13] Genes shared between the TCA66 classifier and CRCassigner, with their PAM (Prediction Analysis of Microarrays) scores

In addition, in the AUS cohort, the classification of colorectal cancer patients by TCA19 was compared with the three colon cancer subtypes derived from a 146-gene classifier (De Sousa et al., 2013). Among the three CRC subtypes (CCS1, CCS2, and CCS3), a majority of patients of the CCS3 subtype (27 of 36, 75%), which is characterized by a particularly unfavorable prognosis, epithelial-mesenchymal transition (EMT), and extracellular matrix remodeling, were classified into the high-risk subgroup by TCA19 (FIG. 22A). Most colorectal cancer patients with high TCA19 scores were strongly associated with CCS3, suggesting that the high-risk subgroup identified by TCA19 closely resembles the EMT and matrix remodeling phenotype associated with poor prognosis. Interestingly, CCS3 tumors are almost all MSS (De Sousa et al., 2013), which is consistent with the earlier multivariate analysis that included MMR status [pMMR (MSS or MSI-L) vs. dMMR (MSI-H), HR = 0.452, 95% CI = 0.238-0.858, P = 0.015; Table 9]. The 146-gene classifier and the classifier of the present invention (66 genes) share only four genes (FIG. 22C).
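The hazard ratio, confidence interval, and P value quoted for MMR status can be checked for internal consistency. The sketch below assumes the interval is a Wald interval that is symmetric on the log scale, which is typical for Cox regression output but is not stated in the text; the function name is hypothetical.

```python
import math

def wald_p_from_ci(hr, lower, upper, z_crit=1.959964):
    """Two-sided Wald P value implied by a hazard ratio and its 95% confidence interval."""
    # Standard error of log(HR), recovered from the width of the interval on the log scale
    se = (math.log(upper) - math.log(lower)) / (2.0 * z_crit)
    z = math.log(hr) / se
    # Two-sided normal tail probability: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2))
    return math.erfc(abs(z) / math.sqrt(2.0))

# Values quoted in the text for pMMR vs. dMMR
p_implied = wald_p_from_ci(0.452, 0.238, 0.858)

# The point estimate should sit at the geometric mean of the interval bounds
hr_midpoint = math.sqrt(0.238 * 0.858)
```

Under that assumption the implied P value rounds to the reported 0.015, and the geometric mean of the bounds rounds to the reported hazard ratio of 0.452.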

Finally, in the CIT cohort, TCA19 was compared with the six colorectal cancer subtypes (C1 to C6) defined by a 57-gene centroid classifier (Marisa et al., 2013). TCA19 classified into the high-risk subgroup 96.6% (57 of 59) of patients of the C4 (stem cell phenotype-like) subtype, which is associated with shorter relapse-free survival, and 71.7% (43 of 60) of patients of the C6 (normal-like) subtype (FIG. 23A). When patients were ordered by TCA19 risk score, the colorectal cancer patients with the highest scores were strongly associated with the C4 or C6 subtype (FIG. 23B), showing that the TCA19-based high-risk subgroup closely resembles cancer stem cell (CSC) characteristics, consistent with the preceding comparison with CRCassigner. Interestingly, no genes were shared between the 57-gene centroid classifier and the classifier of the present invention (66 genes) (FIG. 23C).
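The subtype percentages quoted in this example follow directly from the reported counts. The short check below recomputes them; the counts are those given in the text, while the dictionary layout and the function name are assumptions made for illustration.

```python
def high_risk_percent(high_risk, total):
    """Percentage of a subtype classified as high risk, rounded to one decimal place."""
    if total <= 0:
        raise ValueError("total must be positive")
    return round(100.0 * high_risk / total, 1)

# (patients classified as high risk by TCA19, patients in the subtype), as quoted in the text
subtype_counts = {
    "stem-like": (33, 38),
    "inflammatory": (17, 41),
    "CCS3": (27, 36),
    "C4": (57, 59),
    "C6": (43, 60),
}

percentages = {name: high_risk_percent(h, n) for name, (h, n) in subtype_counts.items()}
```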

Claims

1. A biomarker composition for predicting cancer prognosis or anticancer drug susceptibility, comprising the gene classifier TCA19.

2. The biomarker composition according to claim 1, wherein the gene classifier TCA19 comprises one or more genes selected from the group consisting of GADD45B (growth arrest and DNA-damage-inducible beta), S1PR3 (sphingosine-1-phosphate receptor 3), CDKN2B (cyclin-dependent kinase inhibitor 2B), EGR2 (early growth response 2), CTGF (connective tissue growth factor), SERPINE1 (serpin peptidase inhibitor, clade E, member 1), RGS16 (regulator of G-protein signaling 16), RHOU (ras homolog family member U), TIMP1 (TIMP metallopeptidase inhibitor 1), PHLDA1 (pleckstrin homology-like domain, family A, member 1), IL36RN (interleukin 36 receptor antagonist), SLAMF7 (SLAM family member 7), E2F7 (E2F transcription factor 7), DTL (denticleless E3 ubiquitin protein ligase homolog), CFB (complement factor B), CDK1 (cyclin-dependent kinase 1), CXCL1 (chemokine (C-X-C motif) ligand 1), CXCL3 (chemokine (C-X-C motif) ligand 3), and CKS2 (CDC28 protein kinase regulatory subunit 2).

3. The biomarker composition according to claim 1, wherein the gene classifier TCA19 consists of genes regulated by the activity of TREM1 (triggering receptor expressed on myeloid cells 1) and CTGF (connective tissue growth factor).

4. The biomarker composition according to claim 1, wherein the cancer is selected from the group consisting of acute lymphoblastic leukemia, lymphocytic leukemia, chronic lymphocytic leukemia, acute non-lymphoid leukemia, bladder cancer, brain tumor, breast cancer, chronic myelogenous leukemia, endometriosis, esophageal cancer, gallbladder cancer, Ewing's sarcoma, dysplasia, Hodgkin lymphoma, Kaposi's sarcoma, kidney cancer, liver cancer, lung cancer, mesothelioma, multiple myeloma, neuroblastoma, non-Hodgkin lymphoma, osteosarcoma, ovarian cancer, colorectal cancer, prostate cancer, pancreatic cancer, colon cancer, penile cancer, retinoblastoma, skin cancer, gastric cancer, thyroid cancer, uterine cancer, testicular cancer, Wilms' tumor, and trophoblastoma.

5. The biomarker composition according to claim 1, wherein the cancer is colorectal cancer.

6. The biomarker composition according to claim 5, wherein the colorectal cancer comprises rectal cancer, colon cancer and anal cancer.

7. The biomarker composition according to claim 5, wherein the colorectal cancer is stage 3 colorectal cancer.

8. The biomarker composition according to claim 1, wherein the anticancer agent is selected from the group consisting of oxaliplatin, fluorouracil, levofolinate, and salts thereof.

9. A composition for predicting cancer prognosis or anticancer drug susceptibility, comprising an agent for measuring the expression level of mRNA of one or more genes selected from the group consisting of GADD45B, S1PR3, CDKN2B, EGR2, CTGF, SERPINE1, RGS16, RHOU, TIMP1, PHLDA1, IL36RN, SLAMF7, E2F7, DTL, CFB, CDK1, CXCL1, CXCL3, and CKS2, or of a protein thereof.

10. The composition according to claim 9, wherein the agent for measuring the mRNA level of the gene is a primer pair or a probe that specifically binds to a gene selected from the group consisting of GADD45B, S1PR3, CDKN2B, EGR2, CTGF, SERPINE1, RGS16, RHOU, TIMP1, PHLDA1, IL36RN, SLAMF7, E2F7, DTL, CFB, CDK1, CXCL1, CXCL3, and CKS2.

11. The composition according to claim 9, wherein the agent for measuring the protein level is an antibody specific to a protein of a gene selected from the group consisting of GADD45B, S1PR3, CDKN2B, EGR2, CTGF, SERPINE1, RGS16, RHOU, TIMP1, PHLDA1, IL36RN, SLAMF7, E2F7, DTL, CFB, CDK1, CXCL1, CXCL3, and CKS2.

12. A kit for predicting cancer prognosis or anticancer drug susceptibility, comprising the composition of any one of claims 9 to 11.

13. The kit according to claim 12, wherein the kit is an RT-PCR kit, a microarray chip kit, a DNA kit or a protein chip kit.

14. A method for providing information for predicting cancer prognosis or anticancer drug susceptibility, comprising:
(a) measuring, in a biological sample, the expression level of mRNA of a gene selected from the group consisting of GADD45B, S1PR3, CDKN2B, EGR2, CTGF, SERPINE1, RGS16, RHOU, TIMP1, PHLDA1, IL36RN, SLAMF7, E2F7, DTL, CFB, CDK1, CXCL1, CXCL3, and CKS2, or of a protein thereof; and
(b) comparing the expression level of the mRNA of the gene or the protein thereof measured in the biological sample in step (a) with the expression level of the mRNA of the gene or the protein thereof in a normal control sample.

15. The method according to claim 14, wherein in step (a) the biological sample is at least one sample selected from the group consisting of whole blood, serum, plasma, saliva, urine, sputum, lymph, ...

16. The method according to claim 14, wherein the mRNA expression level is measured by one or more methods selected from the group consisting of RT-PCR, competitive RT-PCR, real-time RT-PCR, polymerase chain reaction (PCR), RNase protection assay (RPA), Northern blotting, and DNA chip.

17. The method according to claim 14, wherein the expression level of the protein in step (a) is measured by one or more methods selected from the group consisting of Western blot, enzyme-linked immunosorbent assay (ELISA), radioimmunoassay (RIA), radioimmunodiffusion, Ouchterlony immunodiffusion, rocket immunoelectrophoresis, immunoprecipitation assay, complement fixation assay, fluorescence-activated cell sorter (FACS), and protein chip.

18. A method for screening a gene classifier for predicting cancer prognosis or anticancer drug susceptibility, comprising:
(a) statistically analyzing gene expression patterns among normal tissue, cancer tissue, and metastatic cancer tissue to select major genes associated with the development and metastasis of cancer; and
(b) selecting, based on their prognostic value, genes usable as a gene classifier from among the genes regulated by the activity of the major genes selected in step (a).

19. The method of claim 18, wherein the major genes associated with the development and metastasis of the cancer are TREM1 and CTGF.