KR102055305B1

KR102055305B1 - Markers for diagnosis and targeted treatment of adenocarcinoma of gastroesophageal junction

Info

Publication number: KR102055305B1
Application number: KR1020180020009A
Authority: KR
Inventors: 서윤석; 나득채; 이장철; 양한광
Original assignee: 서울대학교병원; 이화여자대학교 산학협력단
Priority date: 2018-02-20
Filing date: 2018-02-20
Publication date: 2019-12-13
Also published as: KR20190099928A

Abstract

Provided are an algorithm, markers, and therapeutic targets for diagnosing gastroesophageal junction cancer as esophageal cancer or as gastric fundus/body cancer. The markers include ERBB2 or EGFR.

Description

Markers for diagnosis and targeted treatment of adenocarcinoma of gastroesophageal junction

The present disclosure relates to the diagnosis and targeted treatment of adenocarcinoma of the gastroesophageal junction.

According to reports from the US SEER database, esophageal adenocarcinoma has increased sharply, by 600% over the last 30 years. Most of this increase is accounted for by lower esophageal adenocarcinoma arising in the lower third of the esophagus, and gastroesophageal junction cancer has also shown a nearly twofold increase (Pohl H and Welch HG, 2005). According to a report from the National Cancer Center of Japan, adenocarcinoma of the gastroesophageal junction increased nearly fivefold, from 2.3% in 1962-1965 to 10.0% in 2001-2005, and the share accounted for by cardia cancer (Siewert type II) in particular more than doubled, from 28.5% to 57.3%.

According to a recent report from the Korean Gastric Cancer Association, upper gastric cancer, which includes adenocarcinoma of the gastroesophageal junction, has also clearly increased in Korea, from 11.2% in 1995 to 16.0% in 2014 (Information Committee of Korean Gastric Cancer Association, 2016).

Unlike the middle and lower gastric adenocarcinomas that have been commonly observed in Korea to date, adenocarcinoma of the gastroesophageal junction still lacks academic consensus not only on hypotheses about its carcinogenic mechanism but also on tumor classification, biological characteristics, and surgical methods, and these issues remain under active discussion. However, fundamental research on the biological differences that distinguish adenocarcinoma of the gastroesophageal junction from esophageal adenocarcinoma or upper gastric cancer is still lacking. There are as yet no clear results on what molecular biological differences these three adjacent carcinomas exhibit or on which genes could be the most effective therapeutic targets.

In particular, most carcinomas at this location require total gastrectomy and a very lengthy operation. Total gastrectomy greatly reduces postoperative systemic nutritional status and compliance with chemotherapy, and thus acts as a poor prognostic factor that raises the recurrence rate. Therefore, for adenocarcinoma of the gastroesophageal junction, the need for research on new diagnostic methods or on treatment selection for applying preoperative chemotherapy or targeted therapy continues to be emphasized.

Recently, the Cancer Genome Atlas (TCGA) group published the results of large-scale molecular biological analyses of gastric and esophageal cancers, but their application in actual clinical practice, for example with respect to survival, is limited. That study also defines gastroesophageal adenocarcinoma and cardia cancer ambiguously and contains no direct comparative analysis of similarities to or differences from esophageal adenocarcinoma in terms of therapeutic targets.

One aspect provides a method comprising measuring the expression level of EGFR or ERBB2 in gastric or esophageal tissue isolated from an individual, in order to provide information necessary for diagnosing gastric or esophageal cancer.

Another aspect provides a composition for diagnosing gastric or esophageal cancer in an individual, and a kit comprising the composition, the composition comprising an antibody, antigen-binding fragment, or polypeptide that specifically binds to an EGFR or ERBB2 protein or to a fragment of either, or a probe, primer set, or nucleotide that specifically binds to a polynucleotide sequence encoding an EGFR or ERBB2 protein.

One aspect provides a method comprising measuring the expression level of EGFR or ERBB2 in gastric or esophageal tissue isolated from an individual, in order to provide information necessary for diagnosing gastric or esophageal cancer.

The term "EGFR (epidermal growth factor receptor)" (also referred to as ErbB1 or HER1) refers to a transmembrane protein that acts as a receptor for members of the epidermal growth factor family of extracellular protein ligands, or to the gene encoding it. Its sequence is known in the art, for example Genbank Accession No. AAI18666.1 (https://www.ncbi.nlm.nih.gov/genbank/).

The term "ErbB2" (also referred to as HER2 (human epidermal growth factor receptor 2) or Neu) refers to a receptor tyrosine kinase protein belonging to the ErbB family, or to the gene encoding it. Its sequence is known in the art, for example Genbank Accession No. NP_001005862 (https://www.ncbi.nlm.nih.gov/genbank/).

The method relates to the diagnosis and/or targeted treatment of gastric or esophageal cancer. The gastric or esophageal cancer may be adenocarcinoma of the gastroesophageal junction, cardia cancer, gastroesophageal junction/cardia cancer, upper-third gastric cancer, gastric cancer located at the fundus, gastric cancer located at the body, gastric adenocarcinoma located at the fundus or body of the stomach (GCFB), esophageal adenocarcinoma (EAC), or a combination thereof. The gastric or esophageal cancer may be one classified as adenocarcinoma. The stomach and the esophagus are anatomically connected and the boundary between them is ambiguous, so there has been continuing controversy over the biological understanding of adenocarcinoma of the gastroesophageal junction and over how its molecular biological characteristics compare with those of gastric or esophageal adenocarcinoma. An embodiment of the present invention, however, can be used to diagnose these cancers accurately with molecular biological markers and is also useful for treatment that targets those markers.

Measuring the expression level of EGFR or ERBB2 may be a step of determining the amount of EGFR or ERBB2 mRNA, a step of determining the amount of protein, or a step of determining gene copy number variation.

Measuring the expression level may include contacting a tissue sample with a substance that specifically binds to the EGFR or ERBB2 protein, or to the mRNA encoding it, to form a complex, and measuring the level of the complex formed. The level of the complex may then be measured, for example, by attaching a detectable label to the specifically binding substance and detecting the signal it emits. The measurement may be performed by RT-PCR, competitive RT-PCR, real-time RT-PCR, RNase protection assay (RPA), Northern blotting, nucleic acid microarray including DNA microarray, Western blotting, ELISA (enzyme-linked immunosorbent assay), radioimmunoassay (RIA), radioimmunodiffusion, Ouchterlony immunodiffusion, rocket immunoelectrophoresis, tissue immunostaining, immunoprecipitation assay, complement fixation assay, FACS, mass spectrometry, magnetic bead-antibody immunoprecipitation, protein chip, or a combination thereof.

In addition, the expression level may be measured by reverse phase protein analysis, tissue microarray, immunohistochemistry (IHC), silver in situ hybridization (SISH), or a combination thereof.

The method may include comparing the measured expression level of EGFR or ERBB2 with the expression level of EGFR or ERBB2 measured in a control. The control may be tissue from an individual who does not have gastric or esophageal cancer, or it may be gastric or esophageal cancer tissue.

The method may also include determining that the individual has gastric or esophageal cancer when the expression level of EGFR or ERBB2 differs from that of the control. The change may be an increase of at least 1%, 2%, 3%, 4%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, 100%, 200%, 300%, 400%, 500%, 600%, 700%, 800%, 900%, or 1000% relative to the control, or a decrease of at least 1%, 2%, 3%, 4%, 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70%, 80%, 90%, or 100% relative to the control. For example, when the copy number of the ERBB2 gene in tissue collected from an individual is increased (amplified) by 100% (twofold) or more relative to normal tissue, the individual may be determined to have esophageal cancer.
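The percentage comparison against a control described above can be sketched in a few lines of Python. This is only an illustration: the function names `percent_change` and `is_amplified` are hypothetical, and the default cutoff of 100% simply mirrors the ERBB2 copy number example given in the text rather than a validated clinical threshold.

```python
def percent_change(sample: float, control: float) -> float:
    """Percent change of a sample measurement relative to a control measurement."""
    if control <= 0:
        raise ValueError("control measurement must be positive")
    return (sample - control) / control * 100.0


def is_amplified(sample_copies: float, control_copies: float,
                 threshold_pct: float = 100.0) -> bool:
    """True when the sample exceeds the control by at least threshold_pct percent.

    The 100% default (a twofold value) follows the ERBB2 example in the text;
    any other cutoff from the listed series could be passed instead.
    """
    return percent_change(sample_copies, control_copies) >= threshold_pct
```

For instance, a tumor copy number of 4 against a normal-tissue value of 2 is a 100% increase and would meet the example cutoff, whereas a value of 3 (a 50% increase) would not.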

The method may further include deciding to administer an inhibitor of EGFR or ERBB2 when the expression level of EGFR or ERBB2 is increased. Any EGFR or ERBB2 inhibitor that is commonly used in the art, commercially available, or in clinical trials may be used, for example lapatinib, trastuzumab, cetuximab, gefitinib, efitinib, or a combination thereof.

Another aspect provides a composition and a kit for diagnosing gastric or esophageal cancer in an individual, comprising an antibody, antigen-binding fragment, or polypeptide that specifically binds to an EGFR or ERBB2 protein or to a fragment of either, or a probe, primer set, or nucleotide that specifically binds to a polynucleotide sequence encoding an EGFR or ERBB2 protein.

The antibody, antigen-binding fragment, or polypeptide that specifically binds to the EGFR or ERBB2 protein or to a fragment of either, and the probe, primer set, or nucleotide that specifically binds to a polynucleotide sequence encoding the EGFR or ERBB2 protein, may be prepared and selected by the practitioner according to methods known in the art, such as high throughput screening (HTS), or may be commercially available.

The kit may be provided in a form comprising a packaging unit that contains one or more reaction reagents. The kit may also include one or more of the following items: a substance for detecting the detection target, a buffer, instructions for use, and a positive or negative control. The kit may include containers of reaction reagents mixed together in appropriate proportions for carrying out the methods described herein. The reagent containers preferably hold unit quantities of reagent so that measuring steps can be omitted when the method is performed.

In one embodiment of the present invention, esophagus-like adenocarcinoma was found to be significantly associated with differentiated-type and intestinal-type cancer, to show significant amplification of ERBB2 gene copy number, and to show significantly increased protein expression of ERBB2 and EGFR. Accordingly, by measuring the expression level of ERBB2 or EGFR, esophageal cancer and gastric fundus/body cancer can be distinguished and diagnosed, and a useful target for treating these cancers can also be provided.

Another aspect provides a method of analyzing a gene expression profile, comprising: measuring the expression profile of genes in tissue isolated from an individual, in order to provide information for diagnosing gastric or esophageal cancer in the individual; and obtaining a BCCP score from the gene expression profile using a computer system that executes a Bayesian Compound Covariate Predictor (BCCP) algorithm.

The genes may include the following 400 genes: KRT5, KRT13, KRT6A, KRT14, RHCG, KRT4, S100A7, SPRR3, KRT6C, SPRR2A, MUC21, KRT6B, PKP1, SERPINB3, CLCA2, SPRR1B, MUC4, TBX5, CRNN, SPRR2D, DUOX2, GJB6, SPRR2E, S100A2, KRT16, KRT17, SCEL, DSC3, CALML3, HEPHL1, A2ML1, NKX6-1, DUOXA2, SBSN, TMPRSS11D, CLCA4, ANXA8, FGFBP1, TGM1, IVL, DUOX1, DSG3, EREG, SERPINB4, KRT78, HOXC10, DUOXA1, ZNF750, SPRR1A, KRT7, TMPRSS11A, CXCL6, KLK13, CRCT1, PAX9, HORMAD1, FAM83A, S100A7A, GPR110, CAPN14, ABCA12, WFDC2, TMPRSS11B, CEACAM5, MUC12, SERPINB13, PADI1, FAT2, IL1RL2, STK31, SPINK5, DDX3Y, PI3, FAM83C, LYPD2, KLK5, IL1F5, KRT15, BBOX1, GPR87, ACTL8, SLC34A2, USP9Y, TGM3, LASS3, TMPRSS13, GUCY1B2, SPRR2F, KDM5D, KRT23, LY6D, VTCN1, SLC5A12, ABCA13, S100A8, GBP6, CEACAM7, IL1F9, MUC15, PPP1R14C, KLK7, DEFB1, MMP3, TM4SF20, SYCP2, TMEM40, CEACAM6, VWA2, CYorf15B, TDRD5, DCDC2, KLK12, C10orf99, CARD14, HOXC13, LOC642587, GABRR1, DNAH2, SLC15A1, KLK8, LGALS7B, ALDH1L1, FOXN1, RAET1L, TP63, ATP13A4, UTY, DSG1, ALOX12B, AMY2B, SAA2, SAA1, GPR115, LCN2, KRT24, PRKY, PADI3, SCNN1A, PGLYRP3, PRSS27, ANXA8L2, WISP3, IL20RB, FOXE1, ZFY, ALG1L, LY6K, NCCRP1, SLC44A5, NMU, CDH17, RNASE7, FER1L4, C4BPB, CALML5, CXorf61, SPACA4, ULBP2, KRTDAP, MYO7B, , C7orf54, SLC6A20, LOC221442, GOLGA8B, GPR81, ALDH3B2, TRIM29, C21orf125, BNC1, IL1F6, TRPA1, C12orf27, LOC440905, POU5F1, LOC100131726, KCNH8, SPRR2C, IL13RA2, C3orf67, SAA4, LYPD3, GJB5, ULBP1, PHACTR3, S100A9, TRIM54, KLK11, ITIH5L, BNIPL, LOC284233, LOC387646, MRPL42P5, FLJ42393, PCDHAC2, IFNE, BCL2L10, FLJ45445, COL7A1, GDA, ARL14, VIP, CLC, PPP2R2B, FABP4, SFRP5, CCL8, CD79A, HIST1H1E, CCL4L2, SYNPO2, SLC47A1, CXCR2P1, C1QTNF7, CDH2, SIGLEC7, GZMA, NKX6-3, PGM5, LILRB4, NMUR1, XCL2, SOX10, FCGR1A, ST6GALNAC5, CCL4, ACSM5, FOLR2, KCNA1, CCL26, FCGR1C, VSIG4, C1QA, NCF1B, NXPH3, C1QC, TAGLN, PTPRN, TYROBP, ISL1, HCST, RGMA, APOBEC3H, PRND, ADIPOQ, CCL11, GLP2R, PNMA6A, CD8B, FCGR1B, PCBP3, CD8A, GPR27, DPP6, F13A1, EPYC, CXCL9, GREM2, SNORA12, DNAJC5B, HLA-DQA2, PLP1, CHGB, APOC2, FAT3, TLR7, CHRDL1, LY86, AGTR1, ISLR, CR1, FCGR3A, DPEP1, PPAPDC1A, ODZ3, MAPK4, C17orf87, KCNJ5, MSR1, RELN, APOE, C1QB, PHYHIPL, CCDC80, NCF1C, SLITRK5, TREM2, FGF14, PGA5, NKX2-3, PRELP, FCRLA, CCL5, GNAO1, PCDH9, WSCD2, MS4A1, RNF150, KLRD1, KCNK2, MATK, HUNK, IGFBPL1, IGF1, CDO1, COLEC12, FIBIN, CARTPT, VPREB3, PPP1R1A, GDF6, NCAM2, CLEC10A, KIAA0408, BHLHE22, CD52, SULT4A1, SIT1, FNDC1, GRIK3, SUCNR1, TMEM90B, CYP1B1, MKX, SV2B, BMP3, TCEAL2, CADM3, SPOCK1, GREM1, SIGLEC8, ADCY2, CDC26, OGN, FCRL6, PSD, VIPR2, GZMM, ADH1B, CCL19, PLD4, TMEM189-UBE2V1, NKG7, EOMES, STMN2, GZMH, NPTX1, CNTFR, TMEM130, CTNND2, ITGBL1, ECEL1, ACTG2, COL10A1, ST6GAL2, NRK, PI16, NALCN, GZMK, NCAM1, RMRP, CHRM2, KCNMA1, HSPB7, SLIT2, PDZRN4, CNN1, CHRDL2, OMD, PTGIS, RSPO3, PLA2G2D, SMYD1, ZNF683, NRXN3, PCSK1N, HSPB6, C16orf89, PGA3, COMP, C2orf40, C7, GKN2, DES, CILP, COL11A1, LIPF, GATA5, MFAP5, SCRG1, SFRP2, GKN1, PGA4, CCL21, SIX2, NKX3-2, SFRP4, NBLA00301, THBS4, HAND2, or BARX1.

Unless otherwise specified, the description given above for EGFR or ERBB2 applies equally to compositions, kits, measurement methods, and the like relating to the 400 genes.

ERBB2 and EGFR are provided as markers and targets for diagnosing gastroesophageal junction cancer.

FIG. 1 shows the anatomical distribution of the study groups in the TCGA cohort.
FIG. 2 shows the anatomical distribution of the study groups in the SNU cohort.
FIG. 3 shows the class prediction model using BCCP.
FIG. 4 shows the detailed study groups according to the analysis outline.
FIG. 5 shows unsupervised clustering of 5,520 genes between EAC and GCFB in the TCGA cohort.
FIG. 6 shows the results of unsupervised hierarchical clustering of EAC and GCFB in the TCGA cohort using the 400-gene signature classifier.
FIG. 7 shows the ROC curve after cross-validation using LOOCV.
FIG. 8 shows hierarchical clustering of GEJ/cardia cancers in the TCGA cohort using BCCP.
FIG. 9 shows hierarchical clustering of AGEJ or upper-third gastric cancers in the SNU cohort using BCCP.
FIG. 10 shows copy number alterations between the EAC-predicted and GCFB-predicted groups of the TCGA cohort.
FIG. 11 shows copy number alterations between the EAC-predicted and GCFB-predicted groups of the SNU cohort.
FIG. 12 shows a heat map from reverse phase protein analysis of the TCGA cohort.
FIG. 13 shows protein expression by IHC staining of tissue microarrays (200x).
FIG. 14 shows the composite H-scores of tissue microarrays for the EAC-predicted and GCFB-predicted groups of the SNU cohort.
FIG. 15 shows external validation of the prediction model using the CCLE database.
FIG. 16 shows hierarchical clustering of the CCLE database between the EAC-predicted and GCFB-predicted groups.
FIG. 17 shows the drug response to lapatinib, based on IC50 data from the CCLE database, for the EAC-predicted and GCFB-predicted groups.

Hereinafter, the present invention is described in more detail by way of examples. These examples are provided for illustration only, and the scope of the present invention is not limited to them.

1. Materials and Methods

(1) TCGA cohort

Data retrieved from The Cancer Genome Atlas (TCGA) (https://tcga-data.nci.nih.gov/tcga/) were examined, namely mRNA expression, somatic mutations, insertions/deletions, copy number alterations, and reverse phase protein array (RPPA) data for pure esophageal adenocarcinoma, adenocarcinoma of the gastroesophageal junction or cardia, and pure gastric adenocarcinoma located at the fundus or body of the stomach (GCFB). The anatomical locations of these sites are shown in FIG. 1.

(2) Seoul National University cohort

A fresh-frozen tissue archive database was examined that contains clinical information on AGEJ and on cancers of the upper third (UT) of the stomach, collected from 1999 to 2015 at the Gastric Cancer Biology Laboratory, the Cancer Research Institute, and Seoul National University. Archiving of the fresh tissue was approved by the Institutional Review Board of Seoul National University Hospital (IRB No: H-0806-072-248). Patients who had another primary malignancy, recurrent adenocarcinoma, or remnant gastric cancer at the time of initial diagnosis were excluded. In the Seoul National University Hospital cohort, adenocarcinomas of the AGEJ and the gastric UT were classified by their distance from the gastroesophageal junction. The anatomical locations of these sites are shown in FIG. 2.

AGEJ II was defined as a tumor located within 1 cm on the oral side and 2 cm on the aboral side of the gastroesophageal junction. This matches the general definition of Siewert type II cancer.

AGEJ III was defined as a tumor located 2-5 cm on the aboral side of the gastroesophageal junction, regardless of whether it involved the junction. All remaining gastric adenocarcinomas of the upper third, other than AGEJ II and AGEJ III, were defined as UT. All tumors classifiable as AGEJ II were examined and subjected to next-generation sequencing. For AGEJ III and UT, the same number of the most recent samples was examined. Pathological stage was determined according to the 7th AJCC TNM classification. For pathological analysis, papillary, well-differentiated, and moderately differentiated types were classified as the differentiated group, and poorly differentiated, mucinous, and poorly cohesive cell types were classified as the undifferentiated group. The study protocol was approved by the Institutional Review Board of Seoul National University Hospital (IRB No: H-1501-027-639).
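The distance-based grouping above can be written out as a small Python sketch. Several things here are assumptions made only for illustration: the function name `classify_location`, the use of a single signed offset for the tumor (negative toward the oral side, positive toward the aboral side of the gastroesophageal junction), and the choice to assign a tumor lying exactly at 2 cm to AGEJ II. The text does not state how such boundary cases were handled.

```python
from typing import Optional


def classify_location(offset_cm: float, in_upper_third: bool = True) -> Optional[str]:
    """Assign a location group from a signed offset to the gastroesophageal junction.

    offset_cm: hypothetical tumor position in cm; negative values lie on the
    oral side of the junction, positive values on the aboral side.
    in_upper_third: whether the tumor lies in the upper third of the stomach.
    Returns None for tumors outside the three groups defined in the text.
    """
    if -1.0 <= offset_cm <= 2.0:
        return "AGEJ II"   # within 1 cm orally and 2 cm aborally
    if 2.0 < offset_cm <= 5.0:
        return "AGEJ III"  # 2-5 cm aborally
    if offset_cm > 5.0 and in_upper_third:
        return "UT"        # remaining upper-third tumors
    return None
```

Under these assumptions, a tumor 0.5 cm aboral to the junction falls in AGEJ II, one at 3 cm in AGEJ III, and an upper-third tumor at 6 cm in UT.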

(3) Nucleic acid processing

Each frozen tumor and its matched normal gastric mucosa from the fresh tissue archive of the SNU cohort was prepared at a size of about 2 x 2 x 1 mm³. DNA was extracted with a Qiagen DNA extraction kit (Qiagen, Venlo, Netherlands) using the spin-column protocol. The extracted DNA was quantified with the QUBIT HS dsDNA assay (Life Technologies, Gaithersburg, MD, USA), with minimum requirements of A260/280 ≥ 1.7 and a dsDNA amount ≥ 3.0 μg. RNA was isolated in Eppendorf Tubes 5.0 mL according to the protocol provided by the manufacturer of TRIzol (user manual, TRIzol® Reagent (www.invitrogen.com)). For lysis, 1 mL of TRIzol was added to each fresh tissue sample. Following the TRIzol protocol, 1 mL of starting material was transferred to each tube and 200 μL of chloroform was added to each tube. To precipitate RNA from the aqueous phase, 0.5 mL of isopropanol was used, followed by 1 mL of 75% ethanol in the subsequent wash step. The RNA precipitation and wash steps were carried out in 5.0 mL tubes at 12,000 x g. The resulting RNA pellet was resuspended in 50 μL of DEPC-treated water. The OD at 260 nm and 280 nm was measured with a NanoDrop™ 1000 (Thermo Scientific) to determine the concentration and purity of each sample.
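The DNA acceptance criteria stated above (A260/280 ≥ 1.7 and at least 3.0 μg of dsDNA) amount to a simple two-condition check. The Python sketch below is illustrative only; `purity_ratio` and `passes_dna_qc` are hypothetical helper names, not part of any instrument software.

```python
def purity_ratio(od260: float, od280: float) -> float:
    """A260/280 purity ratio from optical densities at 260 nm and 280 nm."""
    if od280 <= 0:
        raise ValueError("OD at 280 nm must be positive")
    return od260 / od280


def passes_dna_qc(a260_280: float, dsdna_ug: float,
                  min_ratio: float = 1.7, min_ug: float = 3.0) -> bool:
    """True when a DNA sample meets both minimum criteria given in the text."""
    return a260_280 >= min_ratio and dsdna_ug >= min_ug
```

A sample with a ratio of 1.8 and 3.5 μg of dsDNA would pass, while one with a ratio of 1.6 or with only 2.9 μg would not.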

(4) Whole transcriptome sequencing of the SNU cohort

All tumor samples from the SNU cohort were used to construct whole transcriptome libraries with the Illumina TruSeq RNA library preparation kit (Ribo-Zero rRNA Removal Kit). All libraries were sequenced on the Illumina HiSeq2000 platform with one sample per lane (paired-end, 2 x 101 bp read length). Tumor RNA and the matched normal RNA were usually loaded on the same flow cell. Read alignment and processing were performed according to the GATK best practice recommendations, using the STAR aligner and Picard from the Broad Institute (http://broadinstitute.github.io/picard).

(5) Whole exome sequencing of the SNU cohort

Whole exome sequencing was performed on at least 3 μg of dsDNA from each tumor and matched normal gastric mucosa sample using the Agilent SureSelect Human All Exon V5 + UTR region kit. Paired-end reads of 2 × 101 bp were sequenced on the Illumina HiSeq2000 platform. The target sequencing depth was at least 100× for both tumor and normal tissue (ideally 200× for tumor). Read alignment and processing followed the GATK best practice recommendations, using Burrows-Wheeler Aligner (BWA)-mem and Picard from the Broad Institute.

(6) Predictive classification algorithm using transcriptome sequencing

BRB Array Tool was used to analyze the gene expression data. RNA-sequencing data from the TCGA and SNU cohorts were analyzed together to identify differentially expressed genes (DEGs) and to build a prediction model. First, DEGs between EAC and GCFB in the TCGA cohort were identified by Student's t test (P<0.001) and then further selected according to whether their fold change ranked at the top or at the bottom. To build the prediction model, the previously established BCCP (Bayesian Compound Covariate Predictor) algorithm was used together with leave-one-out cross validation (LOOCV). The methods followed the references below, the contents of which are incorporated herein in their entirety (FIG. 3):
Radmacher MD, McShane LM, Simon R. A paradigm for class prediction using gene expression profiles. J Comput Biol 2002;9:505-511.
Wright G, Tan B, Rosenwald A et al. A gene expression-based method to diagnose clinically distinct subgroups of diffuse large B cell lymphoma. Proc Natl Acad Sci U S A 2003;100:9991-9996.
Youden WJ. Index for rating diagnostic tests. Cancer 1950;3:32-35.
Robin X, Turck N, Hainard A et al. pROC: an open-source package for R and S+ to analyze and compare ROC curves. BMC Bioinformatics 2011;12:77. doi: 10.1186/1471-2105-12-77.
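To make the idea of a compound covariate predictor concrete, the sketch below weights each gene by its two-sample t statistic, sums the weighted expression values into a single score, and converts that score into a class probability by modelling the scores of each class as a normal distribution with equal priors. This is a simplified illustration under those stated assumptions, not the BRB Array Tool implementation; all function names are hypothetical, and the leave-one-out loop is omitted for brevity.

```python
import math
from statistics import mean, variance


def t_statistic(a, b):
    """Two-sample Student's t statistic with pooled variance.

    Raises ZeroDivisionError if both groups have zero variance.
    """
    na, nb = len(a), len(b)
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / (na + nb - 2)
    return (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))


def compound_score(weights, sample):
    """Compound covariate: t-statistic-weighted sum of expression values."""
    return sum(w * x for w, x in zip(weights, sample))


def log_normal_pdf(x, mu, sd):
    """Log density of a normal distribution (log space avoids underflow)."""
    return -0.5 * ((x - mu) / sd) ** 2 - math.log(sd * math.sqrt(2 * math.pi))


def fit_bccp(class_a, class_b):
    """Fit on two lists of samples; each sample is a list of gene values."""
    n_genes = len(class_a[0])
    weights = [
        t_statistic([s[g] for s in class_a], [s[g] for s in class_b])
        for g in range(n_genes)
    ]
    scores_a = [compound_score(weights, s) for s in class_a]
    scores_b = [compound_score(weights, s) for s in class_b]
    return {
        "weights": weights,
        "a": (mean(scores_a), math.sqrt(variance(scores_a))),
        "b": (mean(scores_b), math.sqrt(variance(scores_b))),
    }


def prob_class_a(model, sample):
    """Posterior probability of class A, assuming equal class priors."""
    s = compound_score(model["weights"], sample)
    d = log_normal_pdf(s, *model["a"]) - log_normal_pdf(s, *model["b"])
    if d > 700:   # guard against overflow in exp()
        return 1.0
    if d < -700:
        return 0.0
    return 1.0 / (1.0 + math.exp(-d))
```

In a leave-one-out setting, `fit_bccp` would be refitted with each sample held out in turn and `prob_class_a` evaluated on the held-out sample.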

The sensitivity and specificity of the trained model were evaluated with an ROC curve. The optimal cut-off value between EAC and GCFB was determined using the Youden index. The prediction model and its cut-off value were externally validated using RNA microarray data of gastric and esophageal adenocarcinoma cell lines from the CCLE (Cancer Cell Line Encyclopedia) database (http://www.broadinstitute.org/ccle). According to the probability given by the BCCP model, the GEJ/cardia tumors of the TCGA cohort and all tumors of the SNU cohort were reclassified into genomic subtypes (EAC-predicted or GCFB-predicted). Differences between the genomic subtypes at the clinical and molecular levels were then assessed by analyzing clinicopathological data, mutations, and copy number alterations. Pathway analysis was performed with the KEGG (Kyoto Encyclopedia of Genes and Genomes) analysis tool (http://www.kegg.jp). Potential surrogate markers associated with the genomic subtypes were examined by reverse phase protein array for the TCGA cohort and by tissue microarray for the SNU cohort. The targeted-drug responsiveness associated with these surrogate markers was compared using the IC50 values of gastric and esophageal adenocarcinoma cell lines in the CCLE database.
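The Youden index is J = sensitivity + specificity − 1, and the optimal cut-off is the threshold that maximizes J. A minimal sketch of that search follows; the function name `youden_cutoff` and the convention that a sample is called positive when its score is at or above the cut-off are assumptions made for illustration.

```python
def youden_cutoff(scores, labels):
    """Return (cutoff, J, sensitivity, specificity) maximizing Youden's J.

    scores : predictor scores, one per sample
    labels : True for the positive class (e.g. EAC), False otherwise
    Both classes must be present, otherwise a division by zero occurs.
    """
    n_pos = sum(1 for l in labels if l)
    n_neg = len(labels) - n_pos
    best = None
    for c in sorted(set(scores)):
        tp = sum(1 for s, l in zip(scores, labels) if l and s >= c)
        tn = sum(1 for s, l in zip(scores, labels) if not l and s < c)
        sens = tp / n_pos
        spec = tn / n_neg
        j = sens + spec - 1
        # Keep the first threshold that achieves the highest J.
        if best is None or j > best[1]:
            best = (c, j, sens, spec)
    return best
```

Every observed score is tried as a candidate threshold, which is adequate for cohort-sized data sets.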

(7) Identification of somatic mutations and insertions/deletions

For the TCGA cohort, somatic mutations, including insertions/deletions, were analyzed using methods previously reported in the art.

For the SNU cohort, somatic mutations were called from the BAM files of whole exome sequencing using Mutect and IndelGenotyper. Variants were selected if they were 1) exonic or splicing variants based on the reference sequence, or 2) covered by more than 8 sequences (reads) with more than 4 supporting the alternate allele.
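The two selection criteria above can be written as a small filter. The sketch below follows the text literally and joins the criteria with "or"; the `Variant` record, its field names, and the region labels are hypothetical and do not reflect the output format of any particular caller.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    gene: str
    region: str      # hypothetical annotation label, e.g. "exonic", "splicing"
    depth: int       # number of sequences (reads) covering the site
    alt_reads: int   # number of reads supporting the alternate allele


def keep_variant(v, min_depth=8, min_alt=4):
    """Apply the two criteria from the text, combined with 'or'."""
    is_coding = v.region in {"exonic", "splicing"}
    # Both thresholds are strict ("more than"), as stated in the text.
    well_supported = v.depth > min_depth and v.alt_reads > min_alt
    return is_coding or well_supported
```

If both criteria were instead intended to apply together, the final line would use `and`; the wording of the text does not settle this, so the literal reading is shown.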

Somatic mutations and insertions/deletions were compared between the EAC-predicted group and the GCFB-predicted group within each of the TCGA and SNU cohorts.

(8) Somatic copy number analysis

For the TCGA cohort, copy number alterations (CNAs) were analyzed by single nucleotide polymorphism (SNP) analysis using methods previously reported in the art.

For the SNU cohort, CNAs were analyzed from the whole exome data based on RPKM (Reads Per Kilobase per Million mapped reads) values from CONIFER.
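RPKM normalizes a read count by both the length of the target region and the sequencing depth of the sample. The following minimal sketch shows the standard formula; the function name and argument names are chosen for illustration and are not taken from CONIFER.

```python
def rpkm(read_count, region_length_bp, total_mapped_reads):
    """Reads Per Kilobase per Million mapped reads.

    RPKM = reads * 1e9 / (region length in bp * total mapped reads)
    """
    return read_count * 1e9 / (region_length_bp * total_mapped_reads)
```

As a worked example, 500 reads on a 2,000 bp region in a sample with 25 million mapped reads give 500 × 10⁹ / (2,000 × 25,000,000) = 10 RPKM.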

For both the TCGA and SNU cohorts, somatic CNA analysis was performed with the GISTIC 2.0 algorithm. Among the genes identified by the GISTIC algorithm as having focal copy number amplification, candidate genes were selected for which, in at least one sample pair, the log2 copy number ratio of the tumor exceeded that of the matched normal gastric mucosa by ≥1.
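The candidate selection rule described above (tumor log2 ratio at least 1 higher than the matched normal in at least one pair) can be sketched as follows. The data layout, with one list of log2 ratios per gene indexed by sample pair, and the function name are assumptions for illustration only.

```python
def select_candidates(tumor_log2, normal_log2, min_diff=1.0):
    """Return genes whose tumor log2 ratio exceeds the matched normal
    value by at least `min_diff` in at least one tumor/normal pair.

    tumor_log2, normal_log2 : dict mapping gene -> list of log2 copy
    number ratios, where index i in both lists refers to the same pair.
    """
    selected = []
    for gene, tumor_values in tumor_log2.items():
        normal_values = normal_log2[gene]
        if any(t - n >= min_diff for t, n in zip(tumor_values, normal_values)):
            selected.append(gene)
    return selected
```

A difference of 1 on the log2 scale corresponds to a two-fold higher copy number ratio in the tumor than in the matched normal tissue.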

The copy numbers of the candidate genes were compared between the EAC-predicted and GCFB-predicted subgroups by Student's t test.

(9) Reverse phase protein array of the TCGA cohort

Reverse phase protein array (RPPA) data for 132 of the 180 cases (44 EAC and 88 GCFB) were obtained from the TCGA database. Clustering analysis was performed after recentered normalization.

(10) Tissue microarray (TMA) of the SNU cohort

A core tissue biopsy (2 mm in diameter) was taken from each paraffin-embedded gastric tumor (donor block) and arrayed in a new recipient paraffin block (tissue array block) using a trephine apparatus (Superbiochips Laboratories, Seoul, Korea). To account for the heterogeneity of histological components and molecular abnormalities in advanced cancer, three sets of tissue microarrays were prepared for each sample. The tissue array blocks contained up to 46 cores on each of three arrays, for a total of 138 cases, and were subjected to immunohistochemistry (IHC) staining. Cores in which tumor occupied more than 10% of the core area were considered adequate. Each paraffin block included an internal control. IHC was performed with an automated immunostainer (BenchMark XT, Ventana Medical Systems, Tucson, AZ, USA). Staining intensity in each core was scored as 0 (no membrane staining, negative), 1+ (weak or barely perceptible staining, weak positive), 2+ (weak-to-moderate staining, moderate positive), or 3+ (strong staining, strong positive). All IHC staining and SISH (silver in situ hybridization) of each TMA core were evaluated and scored by a single expert pathologist blinded to all clinical information. Staining for all proteins except ERBB2 was analyzed with a complex H-score, in which the staining intensity is multiplied by the percentage of cells stained at that intensity to give an individual H-score for each intensity level. ERBB2 staining was scored as the strongest staining intensity at which at least 10% of cells were stained in at least one TMA core. The final interpretation of the IHC and SISH results followed previously reported methods. ERBB2 expression was determined to be positive for IHC 3+, or for IHC 2+ with a SISH black/red ratio ≥ 2.0, and negative for IHC < 2+, or for IHC 2+ with a SISH black/red ratio < 2.0. Targeted drug response was assessed using the IC50 data of gastric and esophageal adenocarcinoma cell lines in the CCLE database.
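The scoring rules above can be sketched in code. The H-score function below assumes the conventional definition, in which each intensity level (1+, 2+, 3+) is multiplied by the percentage of cells stained at that level and the products are summed to a value between 0 and 300; the text does not spell out how the individual scores are combined, so this is an assumption. The ERBB2 rule follows the IHC/SISH criteria as stated. Function names are hypothetical.

```python
def h_score(percent_by_intensity):
    """Conventional H-score: sum of intensity x percent of cells (0-300).

    percent_by_intensity : dict mapping intensity (1, 2, 3) to the
    percentage of cells stained at that intensity.
    """
    return sum(i * percent_by_intensity.get(i, 0.0) for i in (1, 2, 3))


def erbb2_positive(ihc_score, sish_ratio=None):
    """ERBB2 status from IHC score (0-3) and SISH black/red ratio.

    Positive: IHC 3+, or IHC 2+ with SISH ratio >= 2.0.
    Negative: IHC < 2+, or IHC 2+ with SISH ratio < 2.0.
    An IHC 2+ case without a SISH result is treated as negative here,
    which is a choice made for this sketch only.
    """
    if ihc_score == 3:
        return True
    if ihc_score == 2:
        return sish_ratio is not None and sish_ratio >= 2.0
    return False
```

For instance, a core with 20% of cells at 1+, 30% at 2+, and 10% at 3+ has an H-score of 20 + 60 + 30 = 110.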

(11) Statistical analysis

Comparisons were made with Student's t test and the chi-squared test. Survival was analyzed by the Kaplan-Meier method and the log-rank test. Multivariate analysis to identify risk factors for protein expression was performed by binary logistic regression or linear regression. All tests were two-sided at a significance level of 5% and were performed with SPSS version 21.0 (SPSS, Inc., Chicago, IL, USA).

2. Results

FIG. 4 shows the detailed study groups according to the analysis plan. A total of 228 tumors from the TCGA cohort (pure esophageal adenocarcinoma, pure gastric adenocarcinoma of the fundus/body, and GEJ/cardia adenocarcinoma) and 46 pairs (92 samples) of tumor and matched normal mucosa of AGEJ II, AGEJ III, and UT from the SNU cohort were analyzed.

For the SNU cohort, raw sequencing data were obtained after repeated nucleic acid extraction from fresh frozen tissue and library preparation.

(1) Clinicopathological characteristics

In the TCGA cohort, 78 EAC, 48 GEJ/cardia, and 102 GCFB samples with available exome and transcriptome data were identified (Table 1).

Table 1. Clinicopathological characteristics of the TCGA cohort

| Characteristic | Category | EAC (n=78) | GEJ/cardia (n=48) | GCFB (n=102) | P value |
|---|---|---|---|---|---|
| Sex (male:female) | | 69:9 | 37:11 | 57:45 | <0.001 |
| Age | | 66.8±12.0 | 66.9±9.2 | 66.6±9.3 | 0.985 |
| Location | Middle esophagus | 2 (2.6%) | 0 | 0 | <0.001 |
| | Middle-to-distal esophagus | 1 (1.3%) | 0 | 0 | |
| | Distal esophagus | 75 (96.2%) | 0 | 0 | |
| | GEJ/cardia | 0 | 48 (100%) | 0 | |
| | Gastric fundus/body | 0 | 0 | 102 (100%) | |
| WHO classification | Papillary | 0 | 4 (8.3%) | 12 (11.8%) | |
| | Tubular | 0 | 23 (47.9%) | 49 (48.0%) | |
| | Poorly cohesive | 0 | 9 (18.8%) | 19 (18.6%) | |
| | Mucinous | 0 | 3 (6.3%) | 4 (3.9%) | |
| | Mixed | 0 | 6 (12.5%) | 6 (5.9%) | |
| | Not available | 78 (100%) | 3 (6.3%) | 12 (11.8%) | |
| Lauren classification | Intestinal | 0 | 32 (66.7%) | 70 (68.6%) | |
| | Diffuse | 0 | 9 (18.8%) | 19 (18.6%) | |
| | Mixed | 0 | 6 (12.5%) | 6 (5.9%) | |
| | Not available | 78 (100%) | 1 (2.1%) | 7 (6.9%) | |

The clinicopathological characteristics of the SNU cohort are shown in Table 2 below.

Table 2. Clinicopathological characteristics of the SNU cohort

| Characteristic | Category | AGEJ II (n=16) | AGEJ III (n=16) | UT (n=14) | P value |
|---|---|---|---|---|---|
| Sex (male:female) | | 13:3 | 12:4 | 11:3 | 0.912 |
| Age | | 58.5±10.4 | 66.5±9.4 | 63.5±8.1 | 0.062 |
| WHO classification | Differentiated | 7 (43.8%) | 7 (43.8%) | 7 (50.0%) | 0.919 |
| | Undifferentiated | 7 (43.8%) | 8 (50.0%) | 5 (35.7%) | |
| | Undetermined | 2 (12.5%) | 1 (6.3%) | 2 (14.3%) | |
| Lauren classification | Intestinal | 5 (31.3%) | 6 (37.5%) | 7 (50.0%) | 0.526 |
| | Diffuse | 7 (43.8%) | 5 (31.3%) | 6 (41.9%) | |
| | Mixed | 4 (25.0%) | 5 (31.3%) | 1 (7.1%) | |

(2) Development of the predictive classification model

Unsupervised hierarchical clustering of EAC and GCFB in the TCGA cohort revealed 5,520 genes (P<0.001) (FIG. 5).

The top 200 and bottom 200 genes, ranked by fold change, were selected as the 400 signature genes. The selected genes are as follows: KRT5, KRT13, KRT6A, KRT14, RHCG, KRT4, S100A7, SPRR3, KRT6C, SPRR2A, MUC21, KRT6B, PKP1, SERPINB3, CLCA2, SPRR1B, MUC4, TBX5, CRNN, SPRR2D, DUOX2, GJB6, SPRR2E, S100A2, KRT16, KRT17, SCEL, DSC3, CALML3, HEPHL1, A2ML1, NKX6-1, DUOXA2, SBSN, TMPRSS11D, CLCA4, ANXA8, FGFBP1, TGM1, IVL, DUOX1, DSG3, EREG, SERPINB4, KRT78, HOXC10, DUOXA1, ZNF750, SPRR1A, KRT7, TMPRSS11A, CXCL6, KLK13, CRCT1, PAX9, HORMAD1, FAM83A, S100A7A, GPR110, CAPN14, ABCA12, WFDC2, TMPRSS11B, CEACAM5, MUC12, SERPINB13, PADI1, FAT2, IL1RL2, STK31, SPINK5, DDX3Y, PI3, FAM83C, LYPD2, KLK5, IL1F5, KRT15, BBOX1, GPR87, ACTL8, SLC34A2, USP9Y, TGM3, LASS3, TMPRSS13, GUCY1B2, SPRR2F, KDM5D, KRT23, LY6D, VTCN1, SLC5A12, ABCA13, S100A8, GBP6, CEACAM7, IL1F9, MUC15, PPP1R14C, KLK7, DEFB1, MMP3, TM4SF20, SYCP2, TMEM40, CEACAM6, VWA2, CYorf15B, TDRD5, DCDC2, KLK12, C10orf99, CARD14, HOXC13, LOC642587, GABRR1, DNAH2, SLC15A1, KLK8, LGALS7B, ALDH1L1, FOXN1, RAET1L, TP63, ATP13A4, UTY, DSG1, ALOX12B, AMY2B, SAA2, SAA1, GPR115, LCN2, KRT24, PRKY, PADI3, SCNN1A, PGLYRP3, PRSS27, ANXA8L2, WISP3, IL20RB, FOXE1, ZFY, ALG1L, LY6K, NCCRP1, SLC44A5, NMU, CDH17, RNASE7, FER1L4, C4BPB, CALML5, CXorf61, SPACA4, ULBP2, KRTDAP, MYO7B, , C7orf54, SLC6A20, LOC221442, GOLGA8B, GPR81, ALDH3B2, TRIM29, C21orf125, BNC1, IL1F6, TRPA1, C12orf27, LOC440905, POU5F1, LOC100131726, KCNH8, SPRR2C, IL13RA2, C3orf67, SAA4, LYPD3, GJB5, ULBP1, PHACTR3, S100A9, TRIM54, KLK11, ITIH5L, BNIPL, LOC284233, LOC387646, MRPL42P5, FLJ42393, PCDHAC2, IFNE, BCL2L10, FLJ45445, COL7A1, GDA, ARL14, VIP, CLC, PPP2R2B, FABP4, SFRP5, CCL8, CD79A, HIST1H1E, CCL4L2, SYNPO2, SLC47A1, CXCR2P1, C1QTNF7, CDH2, SIGLEC7, GZMA, NKX6-3, PGM5, LILRB4, NMUR1, XCL2, SOX10, FCGR1A, ST6GALNAC5, CCL4, ACSM5, FOLR2, KCNA1, CCL26, FCGR1C, VSIG4, C1QA, NCF1B, NXPH3, C1QC, TAGLN, PTPRN, TYROBP, ISL1, HCST, RGMA, APOBEC3H, PRND, ADIPOQ, CCL11, GLP2R, PNMA6A, CD8B, FCGR1B, PCBP3, CD8A, GPR27, DPP6, F13A1, EPYC, CXCL9, GREM2, SNORA12, DNAJC5B, HLA-DQA2, PLP1, CHGB, APOC2, FAT3, TLR7, CHRDL1, LY86, AGTR1, ISLR, CR1, FCGR3A, DPEP1, PPAPDC1A, ODZ3, MAPK4, C17orf87, KCNJ5, MSR1, RELN, APOE, C1QB, PHYHIPL, CCDC80, NCF1C, SLITRK5, TREM2, FGF14, PGA5, NKX2-3, PRELP, FCRLA, CCL5, GNAO1, PCDH9, WSCD2, MS4A1, RNF150, KLRD1, KCNK2, MATK, HUNK, IGFBPL1, IGF1, CDO1, COLEC12, FIBIN, CARTPT, VPREB3, PPP1R1A, GDF6, NCAM2, CLEC10A, KIAA0408, BHLHE22, CD52, SULT4A1, SIT1, FNDC1, GRIK3, SUCNR1, TMEM90B, CYP1B1, MKX, SV2B, BMP3, TCEAL2, CADM3, SPOCK1, GREM1, SIGLEC8, ADCY2, CDC26, OGN, FCRL6, PSD, VIPR2, GZMM, ADH1B, CCL19, PLD4, TMEM189-UBE2V1, NKG7, EOMES, STMN2, GZMH, NPTX1, CNTFR, TMEM130, CTNND2, ITGBL1, ECEL1, ACTG2, COL10A1, ST6GAL2, NRK, PI16, NALCN, GZMK, NCAM1, RMRP, CHRM2, KCNMA1, HSPB7, SLIT2, PDZRN4, CNN1, CHRDL2, OMD, PTGIS, RSPO3, PLA2G2D, SMYD1, ZNF683, NRXN3, PCSK1N, HSPB6, C16orf89, PGA3, COMP, C2orf40, C7, GKN2, DES, CILP, COL11A1, LIPF, GATA5, MFAP5, SCRG1, SFRP2, GKN1, PGA4, CCL21, SIX2, NKX3-2, SFRP4, NBLA00301, THBS4, HAND2, and BARX1.
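Selecting signature genes from the two ends of a fold-change ranking can be sketched as below. The function name, the dictionary layout, and the fold-change values in the usage example are hypothetical; only the top-N/bottom-N selection rule comes from the text.

```python
def select_signature(fold_changes, n_top=200, n_bottom=200):
    """Return (top, bottom) gene lists ranked by fold change.

    fold_changes : dict mapping gene symbol -> fold change (e.g. log2)
    top    : the n_top genes with the highest fold change
    bottom : the n_bottom genes with the lowest fold change,
             listed in descending order of fold change
    """
    ranked = sorted(fold_changes, key=fold_changes.get, reverse=True)
    top = ranked[:n_top]
    # Slice from an explicit index so that n_bottom=0 yields an empty list.
    bottom = ranked[len(ranked) - n_bottom:] if n_bottom > 0 else []
    return top, bottom


# Usage with made-up values for five genes:
example = {"KRT5": 6.1, "KRT13": 5.4, "ACTB": 0.1, "GKN1": -4.8, "BARX1": -5.9}
top2, bottom2 = select_signature(example, n_top=2, n_bottom=2)
```

With 5,520 candidate genes, calling the function with the default arguments would return two lists of 200 genes each, matching the size of the signature described above.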

Unsupervised hierarchical clustering of EAC and GCFB in the TCGA cohort using the 400 signature genes showed a clear separation of EAC and GCFB into distinct clusters (FIG. 6).

A predictive classification model was developed based on BCCP with the 400-gene signature and trained with LOOCV. The ROC curve of the BCCP score showed an area under the curve of 0.957 (95% confidence interval = 0.93-0.98), and the Youden index gave 0.4535 as the cut-off value distinguishing EAC from GCFB (FIG. 7).

At this cut-off value, the sensitivity and specificity for predicting EAC were 90.2% and 89.7%, respectively. KEGG pathway analysis was performed on the 400 signature genes. Among several cancer-associated pathways containing five or more of the genes, the PI3K-AKT signaling pathway was found to be associated with GCFB; it included CHRM2, COMP, FGF14, IGF1, PPP2R2B, RELN, and THBS4 from the 200 genes overexpressed in GCFB. PI3K and AKT were therefore considered for protein-level validation using the RPPA of the TCGA cohort and the tissue microarray of the SNU cohort.

(3) Testing of the predictive classification model using somatic mutation analysis

Clustering of the GEJ/cardia tumors of the TCGA cohort was tested using the BCCP score with the cut-off value of 0.4535 (FIG. 8).

Hierarchical clustering of the GEJ/cardia tumors of the TCGA cohort showed a spectral transition of clusters between the EAC-predicted and GCFB-predicted groups, with no completely separable clusters. Of the GEJ/cardia tumors in the TCGA cohort, 15/48 (31.2%) were predicted as EAC and 33/48 (68.8%) as GCFB. In terms of somatic mutations, there were no significant differences in TP53, PIK3CA, RHOA, KRAS, or ARID1A between the EAC-predicted and GCFB-predicted groups of the TCGA cohort. Clustering of AGEJ II, AGEJ III, and UT in the SNU cohort showed a spectral transition of clusters between the EAC-predicted and GCFB-predicted groups similar to that of the TCGA cohort (FIG. 9).

In the SNU cohort, 5/16 (31.2%) of AGEJ II tumors were classified into the EAC-predicted group and 11/16 (68.8%) into the GCFB-predicted group. Notably, 15/16 (93.7%) of AGEJ III tumors were classified into the GCFB-predicted group. Taking AGEJ II and III of the SNU cohort together, the EAC-predicted and GCFB-predicted groups comprised 6/32 (18.8%) and 26/32 (81.2%), respectively. There were no significant differences in TP53, PIK3CA, RHOA, KRAS, or ARID1A between the EAC-predicted and GCFB-predicted groups of the SNU cohort. Notably, no somatic mutations of RHOA, KRAS, or PIK3CA were found in the EAC-predicted groups of either the TCGA or the SNU cohort.

(4) Pathological analysis of the EAC-predicted and GCFB-predicted groups

Analysis of the pathological characteristics of the SNU cohort showed that all AGEJ III tumors involving the GEJ and 80% of AGEJ III tumors not involving the GEJ were classified into the GCFB-predicted group (Table 3).

Table 3. Pathological characteristics of the EAC-predicted and GCFB-predicted groups of the SNU cohort

| Characteristic | Category | EAC-predicted (n=10) | GCFB-predicted (n=36) | P value |
|---|---|---|---|---|
| Location | AGEJ II | 5 (31.3%) | 11 (68.8%) | 0.231 |
| | AGEJ III involving the GEJ | 0 | 11 (100%) | |
| | AGEJ III not involving the GEJ | 1 (20.0%) | 4 (80.0%) | |
| | UT | 4 (28.6%) | 10 (71.4%) | |
| WHO | Differentiated | 8 (80.0%) | 13 (36.1%) | 0.043 |
| | Undifferentiated | 2 (20.0%) | 18 (50.0%) | |
| | Undetermined | 0 | 5 (13.9%) | |
| Lauren | Intestinal | 8 (80.0%) | 10 (27.8%) | 0.009 |
| | Diffuse | 2 (20.0%) | 16 (44.4%) | |
| | Mixed | 0 | 10 (27.8%) | |

The distribution of the EAC-predicted and GCFB-predicted groups did not differ significantly among AGEJ II, AGEJ III, and UT. However, the EAC-predicted group had a significantly higher proportion of differentiated and intestinal types, whereas the GCFB-predicted group had a significantly higher proportion of undifferentiated and diffuse types.

(5) Copy number analysis between the EAC-predicted and GCFB-predicted groups

Genome-wide copy number analysis was performed in the TCGA cohort. As a result, 435 amplified genes identified by the GISTIC algorithm showed significant copy number differences between the EAC-predicted and GCFB-predicted groups (≥2-fold change and P<0.05).

Filtering these 435 genes against the human Cancer Gene Census (http://www.sanger.ac.uk/science/data/cancer-gene-census) identified six cancer-related genes (FIG. 10): COX6C (8q22.2, translocation), HNRNPA2B1 (7p15.2, translocation), NDRG1 (8q24.22, translocation), RECQL4 (8q24.3, nonsense, frameshift/splice), TCEA1 (8q11.23, translocation), and TFEB (6p21.1, translocation).

In the SNU cohort, comparison between the EAC-predicted and GCFB-predicted groups of the genes estimated by the GISTIC algorithm to have focal amplification identified 37 genes with significant copy number differences (P<0.05) (Table 4).

Table 4. Genes showing a significant copy number difference between EAC and GCFB in the SNU cohort (P<0.05)

| Gene | Mean log2 copy number, EAC-like group | Mean log2 copy number, GCFB-like group |
|---|---|---|
| BOP1 | 0.332 | 0.026 |
| C19orf12 | 0.279 | 0.039 |
| DUSP8 | 0.267 | 0.002 |
| EGFR | 0.460 | 0.119 |
| ERBB2 | 1.185 | 0.227 |
| FOXP4 | 0.303 | 0.052 |
| GRB7 | 0.826 | 0.219 |
| GSTA1 | 0.252 | 0.001 |
| GSTA2 | 0.320 | -0.021 |
| GSTA3 | 0.273 | 0.030 |
| GSTA5 | 0.275 | 0.012 |
| HIST1H1B | 0.228 | -0.051 |
| HIST1H2AI | 0.245 | 0.009 |
| HIST1H2AK | 0.234 | -0.016 |
| HIST1H2AL | 0.282 | -0.027 |
| HIST1H2AM | 0.281 | -0.043 |
| HIST1H2BM | 0.215 | -0.007 |
| HIST1H2BN | 0.230 | -0.007 |
| HIST1H2BO | 0.239 | -0.023 |
| HIST1H3H | 0.266 | 0.019 |
| HIST1H3J | 0.303 | 0.005 |
| HIST1H4J | 0.254 | 0.017 |
| LILRA3 | 0.276 | -0.123 |
| LOC100287704 | 0.399 | -0.011 |
| LY86 | 0.236 | -0.045 |
| MDFI | 0.381 | 0.047 |
| MIEN1 | 1.178 | 0.205 |
| OR2B2 | 0.209 | -0.019 |
| PI4KAP1 | 0.284 | -0.010 |
| PLEKHF1 | 0.343 | 0.048 |
| POP4 | 0.289 | 0.053 |
| SSR1 | 0.216 | -0.002 |
| TFEB | 0.431 | 0.075 |
| TMEM191B | 0.381 | -0.013 |
| TRAM2 | 0.293 | 0.041 |
| UGT2B17 | 0.263 | -0.049 |
| VSTM2B | 0.270 | 0.069 |
| ZNF439 | -0.242 | 0.023 |

Filtering the 37 genes against the human Cancer Gene Census (http://www.sanger.ac.uk/science/data/cancer-gene-census) identified two cancer-associated genes: ERBB2 (17q12, amplification) and TFEB (6p21.1, translocation) (FIG. 11).

In the SNU cohort, ERBB1 (EGFR) (7p11.2) was amplified in both the EAC-predicted and GCFB-predicted groups, but the EGFR copy number did not differ significantly between the two groups. Because the annotated mutation patterns of COX6C, HNRNPA2B1, NDRG1, RECQL4, TCEA1, and TFEB in all cohorts were not consistent with copy number amplification, ERBB2 and ERBB1, its possible heterodimer partner, were evaluated using the RPPA of the TCGA cohort and the tissue microarray of the SNU cohort.

(6) Reverse-phase protein array (RPPA) and tissue microarray (TMA)

Supervised analysis of RPPA data from 44 EAC and 88 GCFB samples in the TCGA cohort identified expression clusters of 81 proteins that clearly separated EAC from GCFB (FIG. 12).

Among the 81 proteins, PIK3CA and AKT1, identified in the pathway analysis of the 400 signature genes, and ERBB2 and EGFR, identified in the copy number analysis, showed significant differences in RPPA protein expression between EAC and GCFB.

For further validation, expression of these four proteins was analyzed in three TMA sets from the SNU cohort using commercially available antibodies (Table 5).

Antibody information used for tissue microarrays
Antibody | Clonality | Dilution | Detection kit | Source | Cat. no.
EGFR | Mouse monoclonal | Ready to use | OptiView Polymer (Ventana) | Roche | 790-2988
ERBB2 | Rabbit monoclonal | Ready to use | OptiView Polymer (Ventana) | Ventana Medical Systems | 790-2991
PI3Kinase p110alpha | Rabbit monoclonal | 1:100 | OptiView Polymer (Ventana) | Cell Signaling | #4249
AKT1 | Rabbit monoclonal | 1:50 | OptiView Polymer (Ventana) | Abcam | ab32505

The staining patterns of EGFR, ERBB2, PI3Kinase p110alpha, and AKT1 in the TMAs are shown in FIG. 13. Complex H scores for EGFR, PI3Kinase p110alpha, and AKT1 were calculated from the expression results of the three TMA sets. The mean H score of EGFR was significantly higher in the EAC-predicted group than in the GCFB-predicted group (160.7 ± 108.8 in EAC-like vs. 105.6 ± 81.6 in GCFB-like, P=0.014; FIG. 14). Immunohistochemistry (IHC) staining for ERBB2 showed a tendency toward higher scores in the EAC-predicted group than in the GCFB-predicted group (Table 6). However, no significant difference in the expression of PI3Kinase or AKT1 was observed.
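As a rough illustration of how an H score of this kind is usually obtained, the sketch below uses the conventional definition: staining intensity (0 to 3+) weighted by the percentage of cells at each intensity, giving a range of 0 to 300. This is an assumption, since the description does not define its "Complex H score"; the function name and the example percentages are hypothetical.

```python
def h_score(pct_by_intensity):
    """Conventional H score: sum of intensity grade x percent of cells at that grade.

    pct_by_intensity maps an intensity grade (0, 1, 2, 3) to the percentage of
    tumor cells stained at that grade; the percentages must sum to 100.
    Assumption: this is the textbook H score, not necessarily the
    "Complex H score" referred to in the description.
    """
    total = sum(pct_by_intensity.values())
    if abs(total - 100.0) > 1e-6:
        raise ValueError("percentages must sum to 100")
    return sum(grade * pct for grade, pct in pct_by_intensity.items())


# Hypothetical core: 20% unstained, 30% weak, 30% moderate, 20% strong.
example = {0: 20.0, 1: 30.0, 2: 30.0, 3: 20.0}
print(h_score(example))  # 0*20 + 1*30 + 2*30 + 3*20 = 150.0
```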

IHC and SISH of ERBB2
Result | EAC-predicted (n=10) | GCFB-predicted (n=36) | P value
IHC
0 | 3 (30.0%) | 17 (47.2%) | 0.081
1+ | 2 (20.0%) | 14 (38.9%) |
2+ | 1 (10.0%) | 2 (5.6%) |
3+ | 4 (40.0%) | 3 (8.3%) |
IHC and SISH
IHC <2+, or IHC 2+ and black/red ratio <2.0 | 5 (50.0%) | 32 (88.9%) | 0.015
IHC 3+, or IHC 2+ and black/red ratio ≥2.0 | 5 (50.0%) | 4 (11.1%) |

When IHC and SISH were considered together, ERBB2 positivity (IHC 3+, or IHC 2+ with a SISH black/red ratio ≥2.0) was significantly more frequent in the EAC-predicted group than in the GCFB-predicted group (50.0% EAC-predicted vs. 11.1% GCFB-predicted, P=0.015). All variables that were significant in the univariate analysis of Table 2 were entered into a multivariate analysis to identify risk factors for EGFR and ERBB2 expression. For EGFR overexpression, prediction type (EAC-predicted or GCFB-predicted) was the only independent risk factor (P=0.034), with an adjusted coefficient of determination R² of 0.78 (Table 7).
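The combined IHC/SISH rule and the group comparison can be sketched as follows. The counts come from Table 6. The function names are hypothetical, and the use of Fisher's exact test is an assumption, since the description does not name the test behind P=0.015.

```python
from math import comb


def erbb2_positive(ihc, sish_ratio=None):
    """ERBB2-positive if IHC 3+, or IHC 2+ with a SISH black/red ratio >= 2.0."""
    if ihc == 3:
        return True
    if ihc == 2:
        return sish_ratio is not None and sish_ratio >= 2.0
    return False


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]]."""
    row1, col1, n = a + b, a + c, a + b + c + d

    def prob(k):
        # Hypergeometric probability of k positives in the first row.
        return comb(row1, k) * comb(n - row1, col1 - k) / comb(n, col1)

    lo, hi = max(0, col1 - (n - row1)), min(row1, col1)
    p_obs = prob(a)
    # Sum every table that is no more likely than the observed one.
    return sum(prob(k) for k in range(lo, hi + 1) if prob(k) <= p_obs * (1 + 1e-9))


# Table 6, positive/negative counts: EAC-predicted 5/5, GCFB-predicted 4/32.
p = fisher_exact_two_sided(5, 5, 4, 32)
print(round(p, 3))  # 0.015
```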

Multivariate analysis of EGFR overexpression
Variable | Unstandardized coefficient B ± standard deviation | Standardized coefficient β | t | P value | 95% confidence interval for B
WHO classification | 1.822 ± 5.722 | 0.053 | 0.318 | 0.752 | -9.733 to 13.378
Lauren classification | 26.389 ± 16.886 | 0.244 | 1.563 | 0.125 | -7.665 to 60.443
Perineural invasion | -38.504 ± 27.032 | -0.220 | -1.424 | 0.162 | -93.057 to 16.046
Prediction type | 62.500 ± 28.509 | 0.314 | 2.192 | 0.034 | 5.044 to 19.956
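As a quick consistency check on Table 7, each printed t value matches the coefficient B divided by the ± term next to it, which is the usual relationship between a regression coefficient, its standard error, and its t statistic. The sketch below only reproduces that arithmetic from the tabulated numbers; the variable names are illustrative.

```python
# (B, +/- term, printed t) as listed in Table 7.
rows = {
    "WHO classification": (1.822, 5.722, 0.318),
    "Lauren classification": (26.389, 16.886, 1.563),
    "Perineural invasion": (-38.504, 27.032, -1.424),
    "Prediction type": (62.500, 28.509, 2.192),
}

for name, (b, se, t_printed) in rows.items():
    t = b / se  # t statistic = coefficient / standard error
    print(f"{name}: t = {t:.3f} (printed {t_printed})")
```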

For ERBB2 positivity, prediction type and WHO classification were independent risk factors (P=0.049 for prediction type and P=0.029 for differentiation type) (Table 8).

Multivariate analysis of ERBB2 positivity
Variable | Category | P value | Odds ratio | 95% confidence interval for odds ratio
WHO classification (vs. undetermined) | Differentiated | 0.029 | 0.223 | 0.058-0.856
WHO classification (vs. undetermined) | Undifferentiated | 0.002 | 0.036 | 0.004-0.309
Lauren classification (vs. mixed) | Intestinal | 0.387 | 4.156 | 0.165-105.009
Lauren classification (vs. mixed) | Diffuse | 0.734 | 0.581 | 0.025-13.322
Perineural invasion (vs. invasion) | No invasion | 0.576 | 0.532 | 0.058-4.870
Prediction type (vs. GCFB-predicted) | EAC-predicted | 0.049 | 6.179 | 1.1011-37.752

(7) External validation using the CCLE database

From the CCLE database, esophageal (n=3) and gastric (n=38) adenocarcinoma cell lines with expression microarray data, SNP array data, and IC50 values for lapatinib, a dual EGFR and HER2 tyrosine kinase inhibitor, were identified. The data available for each cell line are shown in Table 9.

Esophageal and gastric adenocarcinoma cell line information from the CCLE database
Cell line | Organ | BCCP score | Prediction | ERBB2 copy number* | EGFR copy number* | IC50†
OE33 | Esophagus | 0.546 | EAC-predicted | Amplification | 0 | 3.538
OE19 | Esophagus | 0.402 | GCFB-predicted | Amplification | 0 | N/A
JHESOAD1 | Esophagus | 0.484 | EAC-predicted | N/A | N/A | N/A
FU97 | Stomach | 0.109 | GCFB-predicted | Deletion | 0 | 8.000
NUGC3 | Stomach | 0.37 | GCFB-predicted | 0 | 0 | 2.411
IM95 | Stomach | 0.318 | GCFB-predicted | 0 | 0 | 8.000
AGS | Stomach | 0.19 | GCFB-predicted | 0 | 0 | N/A
KATOIII | Stomach | 0.536 | EAC-predicted | 0 | 0 | N/A
SNU16 | Stomach | 0.351 | GCFB-predicted | 0 | 0 | 6.698
NCIN87 | Stomach | 0.753 | EAC-predicted | Amplification | 0 | 0.066
OCUM1 | Stomach | 0.347 | GCFB-predicted | 0 | 0 | 8.000
SNU5 | Stomach | 0.291 | GCFB-predicted | 0 | 0 | N/A
GCIY | Stomach | 0.169 | GCFB-predicted | 0 | 0 | 7.255
SH10TC | Stomach | 0.152 | GCFB-predicted | 0 | 0 | 8
MKN1 | Stomach | 0.341 | GCFB-predicted | 0 | 0 | N/A
MKN74 | Stomach | 0.36 | GCFB-predicted | 0 | Amplification | 4.69
KE39 | Stomach | 0.211 | GCFB-predicted | Amplification | 0 | 4.056
HGC27 | Stomach | 0.062 | GCFB-predicted | 0 | 0 | 8
HUG1N | Stomach | 0.315 | GCFB-predicted | 0 | 0 | N/A
NUGC4 | Stomach | 0.313 | GCFB-predicted | Amplification | Amplification | 0.172
RERFGC1B | Stomach | 0.365 | GCFB-predicted | 0 | 0 | 8
HS746T | Stomach | 0.143 | GCFB-predicted | 0 | 0 | 8
NUGC2 | Stomach | 0.531 | EAC-predicted | 0 | 0 | N/A
SNU1 | Stomach | 0.176 | GCFB-predicted | 0 | 0 | 8
MKN45 | Stomach | 0.341 | GCFB-predicted | 0 | Amplification | 8
X2313287 | Stomach | 0.509 | EAC-predicted | N/A | N/A | N/A
MKN7 | Stomach | 0.272 | GCFB-predicted | Amplification | 0 | 8
SNU216 | Stomach | 0.37 | GCFB-predicted | Amplification | 0 | N/A
AZ521 | Stomach | 0.097 | GCFB-predicted | 0 | 0 | 1.66
LMSU | Stomach | 0.129 | GCFB-predicted | 0 | 0 | N/A
ECC10 | Stomach | 0.153 | GCFB-predicted | 0 | 0 | N/A
TGBC11TKB | Stomach | 0.326 | GCFB-predicted | 0 | 0 | N/A
SNU520 | Stomach | 0.297 | GCFB-predicted | 0 | 0 | N/A
GSS | Stomach | 0.223 | GCFB-predicted | 0 | Amplification | N/A
SNU620 | Stomach | 0.322 | GCFB-predicted | 0 | 0 | N/A
ECC12 | Stomach | 0.074 | GCFB-predicted | 0 | 0 | N/A
GSU | Stomach | 0.388 | GCFB-predicted | 0 | 0 | N/A
SNU601 | Stomach | 0.507 | EAC-predicted | 0 | 0 | N/A
SNU668 | Stomach | 0.144 | GCFB-predicted | 0 | 0 | N/A
NCCSTCK140 | Stomach | 0.816 | EAC-predicted | 0 | 0 | N/A
SNU719 | Stomach | 0.305 | GCFB-predicted | 0 | Amplification | N/A

* 0 indicates no change in copy number; N/A indicates that data are not available. † IC50 is the half-maximal inhibitory concentration of lapatinib.
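The "Prediction" column of Table 9 follows from the BCCP score and a fixed cutoff. The sketch below assumes the 0.4535 threshold recited in the claims (a score at or above the cutoff is called EAC-predicted, otherwise GCFB-predicted) and checks it against a few rows of the table. The function name is hypothetical, and computing the BCCP score itself from the signature genes is outside this sketch.

```python
BCCP_CUTOFF = 0.4535  # threshold recited in the claims


def predict_type(bccp_score, cutoff=BCCP_CUTOFF):
    """Label a sample from its BCCP score using the fixed cutoff."""
    return "EAC-predicted" if bccp_score >= cutoff else "GCFB-predicted"


# A few (cell line, BCCP score, tabulated prediction) rows from Table 9.
rows = [
    ("OE33", 0.546, "EAC-predicted"),
    ("OE19", 0.402, "GCFB-predicted"),
    ("JHESOAD1", 0.484, "EAC-predicted"),
    ("NCIN87", 0.753, "EAC-predicted"),
    ("NUGC4", 0.313, "GCFB-predicted"),
    ("AZ521", 0.097, "GCFB-predicted"),
]
for name, score, tabulated in rows:
    # Third field is True when the rule reproduces the tabulated label.
    print(name, predict_type(score), predict_type(score) == tabulated)
```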

External validation using RNA microarray data for these cell lines from the CCLE database confirmed a significant difference in BCCP scores between esophageal and gastric adenocarcinoma cell lines (Wilcoxon rank-sum test, P=0.031) (FIG. 15).

Hierarchical clustering of the CCLE cell lines using BCCP scores showed no significant difference between the EAC-predicted and GCFB-predicted groups with respect to tissue of origin (esophagus or stomach), ERBB2 amplification, or EGFR amplification (FIG. 16).

The targeted drug response to lapatinib, a dual EGFR and HER2 tyrosine kinase inhibitor, was compared between the EAC-predicted group (n=2) and the GCFB-predicted group (n=17) using IC50 data from the CCLE database (FIG. 17).

As a result, the IC50 was significantly lower in the EAC-predicted group than in the GCFB-predicted group (Wilcoxon rank-sum test, P=0.044).
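This comparison can be reproduced from the IC50 column of Table 9. The sketch below implements a two-sided Wilcoxon rank-sum test using the normal approximation with a tie correction and no continuity correction. That variant is an assumption (the description names only the test), chosen because it matches the reported value; the function name is hypothetical.

```python
import math
from collections import Counter


def rank_sum_test(x, y):
    """Two-sided Wilcoxon rank-sum test (normal approximation, tie-corrected).

    Returns the Mann-Whitney U statistic for x and the two-sided P value.
    """
    pooled = sorted(x + y)
    midrank = {}
    i = 0
    while i < len(pooled):
        j = i
        while j < len(pooled) and pooled[j] == pooled[i]:
            j += 1
        midrank[pooled[i]] = (i + 1 + j) / 2  # average of ranks i+1 .. j
        i = j
    n1, n2 = len(x), len(y)
    n = n1 + n2
    u = sum(midrank[v] for v in x) - n1 * (n1 + 1) / 2
    tie_term = sum(t**3 - t for t in Counter(pooled).values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    z = (u - n1 * n2 / 2) / math.sqrt(variance)
    return u, math.erfc(abs(z) / math.sqrt(2))


# IC50 values from Table 9 (cell lines with N/A omitted).
eac = [3.538, 0.066]                       # OE33, NCIN87
gcfb = [8.0, 2.411, 8.0, 6.698, 8.0, 7.255, 8.0, 4.69, 4.056,
        8.0, 0.172, 8.0, 8.0, 8.0, 8.0, 8.0, 1.66]
u, p = rank_sum_test(eac, gcfb)
print(u, round(p, 3))  # 3.0 0.044
```

An exact test, or a normal approximation with a continuity correction, gives a somewhat different P value for the same data, so the choice of variant matters with groups this small.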

Claims

A method of providing information necessary for the diagnosis of gastric cancer or esophageal cancer, the method comprising:
measuring the expression levels of EGFR and ERBB2 in gastric or esophageal tissue isolated from an individual;
determining esophageal cancer when the expression levels of EGFR and ERBB2 are increased compared with a control; and
determining gastric fundus/body cancer when the expression levels of EGFR and ERBB2 are not increased compared with the control.

The method of claim 1, wherein the gastric cancer or esophageal cancer is adenocarcinoma of the gastroesophageal junction, cardia cancer, gastroesophageal junction/cardia cancer, upper third gastric cancer, gastric cancer located at the fundus, gastric cancer located at the body, gastric adenocarcinoma located at the fundus or body of the stomach (GCFB), esophageal adenocarcinoma (EAC), or a combination thereof.

The method of claim 1, wherein measuring the expression levels of EGFR and ERBB2 comprises identifying a copy number of EGFR and ERBB2.

The method of claim 1, wherein measuring the expression levels of EGFR and ERBB2 comprises measuring the amount of protein of HER1 or HER2.

The method of claim 1, wherein measuring the expression levels of EGFR and ERBB2 comprises reverse-phase protein array analysis, tissue microarray, immunohistochemistry (IHC), silver in situ hybridization (SISH), or a combination thereof.

delete

The method according to claim 1, wherein the expression level of EGFR and ERBB2 is increased compared to the control when the number of copies of the ERBB2 gene is increased.

delete

The method of claim 1, further comprising measuring a Bayesian Compound Covariate Predictor (BCCP) score of the gastric or esophageal tissue isolated from the individual;
to provide information for diagnosing gastric or esophageal cancer in the individual,
measuring the gene expression profile of the tissue isolated from the individual; and
A method of analyzing a gene expression profile further comprising obtaining a BCCP score from a gene expression profile using a computer system executing a Bayesian Compound Covariate Predictor (BCCP) algorithm, wherein the gene is composed of the following genes: KRT5 , KRT13, KRT6A, KRT14, RHCG, KRT4, S100A7, SPRR3, KRT6C, SPRR2A, MUC21, KRT6B, PKP1, SERPINB3, CLCA2, SPRR1B, MUC4, TBX5, CRNN, SPRR2D, DUOX2, GJB6 S2, GJB6 S100 , SCEL, DSC3, CALML3, HEPHL1, A2ML1, NKX6-1, DUOXA2, SBSN, TMPRSS11D, CLCA4, ANXA8, FGFBP1, TGM1, IVL, DUOX1, DSG3, EREG, SERPINB4, KRT78, HOXC10, ZDUFKRT7 , TMPRSS11A, CXCL6, KLK13, CRCT1, PAX9, HORMAD1, FAM83A, S100A7A, GPR110, CAPN14, ABCA12, WFDC2, TMPRSS11B, CEACAM5, MUC12, SERPINB13, PADI1, FAT2, IL1RL2, STK31, SPIN3, SPX3 , KLK5, IL1F5, KRT15, BBOX1, GPR87, ACTL8, SLC34A2, USP9Y, TGM3, LASS3, TMPRSS13, GUCY1B2, SPRR2F, KDM5D, KRT23, LY6D, VTCN1, S LC5A12, ABCA13, S100A8, GBP6, CEACAM7, IL1F9, MUC15, PPP1R14C, KLK7, DEFB1, MMP3, TM4SF20, SYCP2, TMEM40, CEACAM6, VWA2, CYorf15B, TDRD5, DCDC2, KLK12, C10R, C14CARD DNAH2, SLC15A1, KLK8, LGALS7B, ALDH1L1, FOXN1, RAET1L, TP63, ATP13A4, UTY, DSG1, ALOX12B, AMY2B, SAA2, SAA1, GPR115, LCN2, KRT24, PRKY, PADI3, SCNN1AWISP IL20RB, FOXE1, ZFY, ALG1L, LY6K, NCCRP1, SLC44A5, NMU, CDH17, RNASE7, FER1L4, C4BPB, CALML5, CXorf61, SPACA4, ULBP2, KRTDAP, MYO7B,, C7orf54, LOCB20, SGA2A20 C21orf125 BNC1 IL1F6 TRPA1 C12orf27 LOC440905 POU5F1 LOC100131726 , FLJ42393, PCDHAC2, IFNE, BCL2L10, FLJ45445, COL7A1, GDA, ARL14, VIP, CLC, PPP2R2B, FABP4, SFRP5, CCL8, CD79A, HIST1H1E, CCL4L2, SYNPO2, SLC47A1, CXCR2P7, C1, CXCR2PN -3, PGM5, LILRB4, NMUR1 , XCL2, SOX10, FCGR1A, ST6GALNAC5, CCL4, ACSM5, FOLR2, KCNA1, CCL26, FCGR1C, VSIG4, C1QA, NCF1B, NXPH3, C1QC, TAGLN, PTPRN, TYROBP, ISL1, HCND, RGPOQ , GLP2R, PNMA6A, CD8B, FCGR1B, PCBP3, CD8A, GPR27, DPP6, F13A1, EPYC, CXCL9, GREM2, SNORA12, DNAJC5B, HLA-DQA2, PLP1, CHGB, APOC2, FAT3, TLR1, CHRDL1TR , CR1, FCGR3A, DPEP1, PPAPDC1A, ODZ3, MAPK4, C17orf87, KCNJ5, MSR1, 
RELN, APOE, C1QB, PHYHIPL, CCDC80, NCF1C, SLITRK5, TREM2, FGF14, PGA5, NKX2-3, PRELP5, PRELP5 , PCDH9, WSCD2, MS4A1, RNF150, KLRD1, KCNK2, MATK, HUNK, IGFBPL1, IGF1, CDO1, COLEC12, FIBIN, CARTPT, VPREB3, PPP1R1A, GDF6, NCAM2, CLEC10A, KIAA140852 BHL , GRIK3, SUCNR1, TMEM90B, CYP1B1, MKX, SV2B, BMP3, TCEAL2, CADM3, SPOCK1, GREM1, SIGLEC8, ADCY2, CDC26, OGN, FCRL6, PSD, VIPR2, GZMM, ADH1B, CCL19, PLD4, TMEMV1 , EOMES, STMN2, GZMH, NPTX1, CNTFR, TMEM130, CTNND2, ITGBL1, ECEL1, ACTG2, COL10A1, ST6GAL2, NRK, PI16, NALCN, GZMK, NCAM 1, RMRP, CHRM2, KCNMA1, HSPB7, SLIT2, PDZRN4, CNN1, CHRDL2, OMD, PTGIS, RSPO3, PLA2G2D, SMYD1, ZNF683, NRXN3, PCSK1N, HSPB6, C16orf89, Por3, COMP2, C2, C2, C2 CILP, COL11A1, LIPF, GATA5, MFAP5, SCRG1, SFRP2, GKN1, PGA4, CCL21, SIX2, NKX3-2, SFRP4, NBLA00301, THBS4, HAND2, and BARX1.

The method of claim 9, further comprising determining esophageal cancer when the BCCP score is at least 0.4535.

The method of claim 9, further comprising determining gastric fundus/body cancer when the BCCP score is less than 0.4535.

The method of claim 1, further comprising determining to administer an inhibitor of EGFR and ERBB2 when the expression levels of EGFR and ERBB2 are increased.

The method of claim 12, wherein the inhibitor of EGFR and ERBB2 comprises lapatinib, trastuzumab, cetuximab, gefitinib, epitinib, or a combination thereof.

The method of claim 1, wherein the gastric or esophageal cancer is adenocarcinoma.

The method of claim 1, wherein measuring the expression levels of EGFR and ERBB2 is performed by at least one method selected from RT-PCR, RNase protection assay (RPA), northern blotting, and DNA chip.

The method of claim 1, wherein measuring the expression levels of EGFR and ERBB2 is performed by at least one method selected from Western blotting, ELISA, radioimmunoassay, radioimmunodiffusion, Ouchterlony immunodiffusion, rocket immunoelectrophoresis, immunoprecipitation assay, complement fixation assay, FACS, and protein chip.

Probes, primer sets, or nucleotides that specifically bind to EGFR and ERBB2 proteins, or antibodies, antigen binding fragments, or polypeptides that specifically bind to their respective fragments, or polynucleotide sequences encoding EGFR and ERBB2 proteins. Gastric cancer of an individual comprising a substance for measuring the expression profile of a gene of a tissue isolated from the individual, for diagnosing gastric cancer or esophageal cancer of the individual, to provide information for diagnosing gastric cancer or esophageal cancer of the individual. A composition for diagnosing esophageal cancer, wherein the gene comprises the following genes: KRT5, KRT13, KRT6A, KRT14, RHCG, KRT4, S100A7, SPRR3, KRT6C, SPRR2A, MUC21, KRT6B, PKP1, SERPINB3, CLCA2 , SPRR1B, MUC4, TBX5, CRNN, SPRR2D, DUOX2, GJB6, SPRR2E, S100A2, KRT16, KRT17, SCEL, DSC3, CALML3, HEPHL1, A2ML1, NKX6-1, DUOXA2, SBSN, TMPRSS11D, C LCA4, ANXA8, FGFBP1, TGM1, IVL, DUOX1, DSG3, EREG, SERPINB4, KRT78, HOXC10, DUOXA1, ZNF750, SPRR1A, KRT7, TMPRSS11A, CXCL6, KLK13, CRCT1, PAX9, HO83A, SRM14, FAMAN ABCA12, WFDC2, TMPRSS11B, CEACAM5, MUC12, SERPINB13, PADI1, FAT2, IL1RL2, STK31, SPINK5, DDX3Y, PI3, FAM83C, LYPD2, KLK5, IL1F5, KRT15, BBOX1, GPR87, ACTL8, SLC34, ACTL8, SLC34 TMPRSS13, GUCY1B2, SPRR2F, KDM5D, KRT23, LY6D, VTCN1, SLC5A12, ABCA13, S100A8, GBP6, CEACAM7, IL1F9, MUC15, PPP1R14C, KLK7, DEFB1, MMP3, TM4SF20, SDCP2, SYCP2, SYCP2 DCDC2, KLK12, C10orf99, CARD14, HOXC13, LOC642587, GABRR1, DNAH2, SLC15A1, KLK8, LGALS7B, ALDH1L1, FOXN1, RAET1L, TP63, ATP13A4, UTY, DSG1, ALOX12B, AM22 SAA, KN24 PRKY, PADI3, SCNN1A, PGLYRP3, PRSS27, ANXA8L2, WISP3, IL20RB, FOXE1, ZFY, ALG1L, LY6K, NCCRP1, SLC44A5, NMU, CDH17, RNASE7, FER1L4, C4BPB, CALB61, CALB61 , C7orf54, SLC6A20, LOC221442, GOLGA8B, GPR81, ALDH3B2, TRIM 29, C21orf125, BNC1, IL1F6, TRPA1, C12orf27, LOC440905, POU5F1, LOC100131726, KCNH8, SPRR2C, IL13RA2, C3orf67, SAA4, LYPD3, GJB5, ULBP1, PHACTR3, S100A9CCLO, LO3 MRPL42P5, FLJ42393, PCDHAC2, 
IFNE, BCL2L10, FLJ45445, COL7A1, GDA, ARL14, VIP, CLC, PPP2R2B, FABP4, SFRP5, CCL8, CD79A, HIST1H1E, CCL4L2, SYNPO2, SLC472P1, CCR7A1 NKX6-3, PGM5, LILRB4, NMUR1, XCL2, SOX10, FCGR1A, ST6GALNAC5, CCL4, ACSM5, FOLR2, KCNA1, CCL26, FCGR1C, VSIG4, C1QA, NCF1B, NXPH3, C1QC, ISLG, PSL RGMA, APOBEC3H, PRND, ADIPOQ, CCL11, GLP2R, PNMA6A, CD8B, FCGR1B, PCBP3, CD8A, GPR27, DPP6, F13A1, EPYC, CXCL9, GREM2, SNORA12, DNAJC5B, HLA-DQA2, FAT, APOC2 TLR7, CHRDL1, LY86, AGTR1, ISLR, CR1, FCGR3A, DPEP1, PPAPDC1A, ODZ3, MAPK4, C17orf87, KCNJ5, MSR1, RELN, APOE, C1QB, PHYHIPL, CCDC80, NCF1C, SLITRK5, TREM2, TREM2 3, PRELP, FCRLA, CCL5, GNAO1, PCDH9, WSCD2, MS4A1, RNF150, KLRD1, KCNK2, MATK, HUN K, IGFBPL1, IGF1, CDO1, COLEC12, FIBIN, CARTPT, VPREB3, PPP1R1A, GDF6, NCAM2, CLEC10A, KIAA0408, BHLHE22, CD52, SULT4A1, SIT1, FNDC1, GRIK3, SUCNR1, TMEMBB, CMPBB2, MMPBB2 TCEAL2, CADM3, SPOCK1, GREM1, SIGLEC8, ADCY2, CDC26, OGN, FCRL6, PSD, VIPR2, GZMM, ADH1B, CCL19, PLD4, TMEM189-UBE2V1, NKG7, EOMES, STMN2, GZMH, NPTX1, CNTFRD, CNTFRD ITGBL1, ECEL1, ACTG2, COL10A1, ST6GAL2, NRK, PI16, NALCN, GZMK, NCAM1, RMRP, CHRM2, KCNMA1, HSPB7, SLIT2, PDZRN4, CNN1, CHRDL2, OMD, PTGIS, RSPO3, PLAFDG3D PCSK1N, HSPB6, C16orf89, PGA3, COMP, C2orf40, C7, GKN2, DES, CILP, COL11A1, LIPF, GATA5, MFAP5, SCRG1, SFRP2, GKN1, PGA4, CCL21, SIX2, NKX3-2, SF301, THB HAND2, and BARX1.

The composition of claim 17, wherein the gastric cancer or esophageal cancer is adenocarcinoma of the gastroesophageal junction, cardia cancer, gastroesophageal junction/cardia cancer, upper third gastric cancer, gastric cancer located at the fundus, gastric cancer located at the body, gastric adenocarcinoma located at the fundus or body of the stomach (GCFB), esophageal adenocarcinoma (EAC), or a combination thereof.

The composition of claim 18, wherein the composition is for diagnosing esophageal cancer or gastric fundus/body cancer of the individual.

A kit for diagnosing gastric or esophageal cancer, comprising the composition of claim 17, reagents, and a packaging unit.

delete

The method of claim 9, wherein measuring the gene expression profile is performed by one or more methods selected from RT-PCR, RNase protection assay (RPA), northern blotting, and DNA chip.

delete