WO2023040102A1 - Gene model for judging prognosis of hepatocellular carcinoma patients, construction method and use thereof - Google Patents

Gene model for judging prognosis of hepatocellular carcinoma patients, construction method and use thereof

Info

Publication number
WO2023040102A1
Authority
WO
WIPO (PCT)
Prior art keywords
expression level
prognosis
gene
hepatocellular carcinoma
patients
Prior art date
Application number
PCT/CN2021/139502
Inventor
徐俊杰
蔡秀军
茅棋江
潘浩奇
梁霄
Original Assignee
浙江大学
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 浙江大学
Publication of WO2023040102A1
Priority to US18/358,001 (published as US20240021268A1)

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/10 Ploidy or copy number detection
    • G16B25/00 ICT specially adapted for hybridisation; ICT specially adapted for gene or protein expression
    • G16B25/10 Gene or protein expression profiling; Expression-ratio estimation or normalisation
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G16B40/20 Supervised data analysis
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C12Q1/6886 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/118 Prognosis of disease development
    • C12Q2600/158 Expression markers

Definitions

  • The invention belongs to the technical field of biomedicine and in particular relates to a gene model for judging the prognosis of patients with hepatocellular carcinoma and its application.
  • Liver cancer is one of the ten most common malignant tumors worldwide. There are approximately 500,000 new cases worldwide each year, of which hepatocellular carcinoma accounts for 85%.
  • The 5-year survival rate of primary liver cancer (hepatocellular carcinoma, HCC) has improved with the wider use of tumor markers and imaging examinations, advances in surgery, and the development of new treatments such as intra-arterial chemoembolization.
  • HCC: hepatocellular carcinoma
  • Overall, however, the prognosis of HCC remains unsatisfactory.
  • One of the main reasons is the lack of effective markers for predicting the prognosis of HCC patients, which makes it impossible to stratify HCC patients by risk or to guide clinicians toward early intervention and early treatment of high-risk patients.
  • Current studies have shown that the tumor microenvironment, especially the extracellular matrix, can promote tumor growth, invasion and metastasis, and has a considerable impact on the prognosis of tumor patients.
  • To address the lack of effective prognostic markers, the present invention constructs a gene combination model at the gene level to evaluate the prognosis of patients with hepatocellular carcinoma.
  • The patients' extracellular matrix-related genes are integrated and analyzed to construct the gene combination model, and a tissue chip based on extracellular matrix genes can then be used to evaluate the prognosis of patients with hepatocellular carcinoma through a risk score.
  • The evaluation results help clinicians stratify liver cancer patients and make precise treatment of patients with hepatocellular carcinoma possible.
  • A method for constructing a gene model for judging the prognosis of patients with hepatocellular carcinoma, comprising the steps of: (1) comparing transcriptome data of hepatocellular carcinoma and normal liver tissue samples, keeping genes with P-value < 0.05, and integrating them with a set of 559 extracellular matrix-related genes; and (2) narrowing the result to 18 prognosis-related genes by Cox LASSO regression (R package glmnet, 1000 iterations, 10-fold cross-validation).
  • A gene model obtained by the above construction method, specifically:
  • Risk score for patients with hepatocellular carcinoma = (0.069 × MMP1 expression level) + (0.049 × EPO expression level) + (0.042 × MMRN1 expression level) + (0.036 × S100A9 expression level) + (0.027 × ADAM9 expression level) + (0.024 × GPC1 expression level) + (0.021 × SPP1 expression level) + (0.014 × GLDN expression level) + (0.007 × FGF9 expression level) + (0.001 × CXCL5 expression level) - (0.024 × CST7 expression level) - (0.027 × THBS3 expression level) - (0.042 × ANXA10 expression level) - (0.049 × PIK3IP1 expression level) - (0.051 × MMP25 expression level) - (0.054 × CLEC3B expression level) - (0.062 × PZP expression level) - (0.069 × CLEC17A expression level).
  • The TCGA database is used as the training set.
  • The GEO database and the ICGC database are used as validation sets.
  • The risk score of the gene model is analyzed and the model is validated against CLIP staging and TNM staging, indicating that the risk score of patients with hepatocellular carcinoma is correlated with survival time: patients with a high risk score have shorter survival and a poorer prognosis.
  • A tissue chip based on extracellular matrix genes comprises probes for detecting MMP1, EPO, MMRN1, S100A9, ADAM9, GPC1, SPP1, GLDN, FGF9, CXCL5, CST7, THBS3, ANXA10, PIK3IP1, MMP25, CLEC3B, PZP and CLEC17A. It makes precise treatment of patients with hepatocellular carcinoma possible, enables rapid prognostic evaluation of patients after surgery, and facilitates clinical translation.
  • The present invention constructs a gene combination model of 18 genes, from which a tissue chip based on extracellular matrix genes can be built to evaluate the prognosis of patients with hepatocellular carcinoma. The model can distinguish and select patients with a poor prognosis, that is, stratify patients and screen out those at high risk, guiding clinicians to offer more active treatment to high-risk patients while avoiding overtreatment of low-risk HCC patients.
  • Figure 1 is a volcano plot of the differential genes for the gene model of the present invention;
  • Figure 2 illustrates the construction of the LASSO-Cox regression model for the gene model of the present invention;
  • Figure 3 shows the combined gene model of the 18 genes of the present invention;
  • Figure 4 is the risk score distribution of patients with hepatocellular carcinoma in the TCGA training set; the abscissa is the patient index ordered by increasing risk score, and the dotted line is the cut-off value;
  • Figure 5 is the survival time distribution of patients with hepatocellular carcinoma in the TCGA training set; the abscissa is the patient index ordered by increasing risk score, the dotted line near 190 is the cut-off value, and the lines near 120 and 250 mark where the difference between death and survival becomes evident;
  • Figure 6 is the survival time distribution of patients with hepatocellular carcinoma in the GEO validation set; the abscissa is the patient index ordered by increasing risk score, and the dotted line is the cut-off value;
  • Figure 7 is the survival time distribution of patients with hepatocellular carcinoma in the ICGC validation set; the abscissa is the patient index ordered by increasing risk score, and the dotted line is the cut-off value;
  • Figure 8 shows the risk scores of patients with hepatocellular carcinoma at different CLIP stages in the TCGA training set;
  • Figure 9 shows the risk scores of patients with hepatocellular carcinoma at different TNM stages in the TCGA training set;
  • Figure 10 shows the relationship between prognosis and survival time of patients with hepatocellular carcinoma grouped by the gene model of the present invention in the TCGA training set;
  • Figure 11 shows the sensitivity and specificity results for the prognosis of patients with hepatocellular carcinoma grouped by the gene model of the present invention in the TCGA training set;
  • Figure 12 shows the relationship between prognosis and survival time of patients with hepatocellular carcinoma grouped by the gene model of the present invention in the GEO validation set;
  • Figure 13 shows the relationship between prognosis and survival time of patients with hepatocellular carcinoma grouped by the gene model of the present invention in the ICGC validation set.
  • The invention provides a gene model, based on extracellular matrix genes, for predicting the prognosis of hepatocellular carcinoma, and its application. Focusing on the differential extracellular matrix genes of patients with hepatocellular carcinoma, data from hepatocellular carcinoma tissue samples and normal liver tissue samples in public databases are analyzed statistically to establish a prognostic risk model. The model serves as a gene model for predicting patient prognosis and supports the construction of a tissue chip based on extracellular matrix genes, which helps evaluate the prognosis of patients with hepatocellular carcinoma after surgery.
  • The inclusion and exclusion criteria for hepatocellular carcinoma tissue samples are: no other cancer treatment received before surgery; no history of other malignant tumors; and complete clinicopathological data and follow-up information.
  • Example 1: Construction of a gene model for judging the prognosis of patients with hepatocellular carcinoma
  • The gene model for judging the prognosis of patients with hepatocellular carcinoma of the present invention is constructed by differential gene screening followed by Cox LASSO regression.
  • Table 1: The 18 ECM genes obtained from the LASSO regression model
  • The transcriptome data of 371 hepatocellular carcinoma tissue samples in the TCGA database were used as the training set, and data from 247 hepatocellular carcinoma tissues in GSE140520 from the GEO database (https://www.ncbi.nlm.nih.gov/geo/) and 203 hepatocellular carcinoma tissues from the ICGC database (https://daco.icgc.org/) were used as validation sets. The score of each patient in the training set was calculated with the risk score model, and the median score (0.044954) was taken as the cut-off value to divide patients into a high-risk score group and a low-risk score group. The relationships between risk score and survival time, CLIP stage and TNM stage were plotted for the two groups (Figures 4-9) to verify the performance of the risk score model.
  • Figures 4 and 5 are the risk score distribution and the survival time distribution of HCC patients in the TCGA training set relative to the cut-off value.
  • Figures 6 and 7 are the survival time distributions of HCC patients in the GEO and ICGC validation sets relative to the cut-off value.
  • Figures 8 and 9 show the risk scores of HCC patients at different CLIP stages and TNM stages in the TCGA training set. The higher the risk score, the lower the patient survival rate and the higher the CLIP stage and TNM stage, indicating that the model classifies HCC well.
  • Figure 10 shows the relationship between prognosis and survival time of patients with HCC in the TCGA training set: patients in the high-risk score group have shorter survival and a worse prognosis than patients in the low-risk score group.
  • Figure 11 and Table 2 give the sensitivity and specificity of the model for HCC prognosis in the TCGA database.
  • In the TCGA database, the 3-year AUC of the risk model was 0.81, the sensitivity was 73.7%, and the specificity was 75.0%; the 5-year AUC was 0.79, the sensitivity was 77.3%, and the specificity was 71.7%.
  • Table 3 gives the results in the GEO database: the 3-year AUC of the risk model was 0.626, the sensitivity was 68.8%, and the specificity was 55.8%; the 5-year AUC was 0.625, the sensitivity was 60.0%, and the specificity was 34.7%.
  • Table 4 gives the results in the ICGC database: the 3-year AUC of the risk model was 0.723, the sensitivity was 93.3%, and the specificity was 52.7%; the 5-year AUC was 0.717, the sensitivity was 88.9%, and the specificity was 52.3%.
  • Patients with a high risk score had shorter survival and a poorer prognosis. This shows that the risk score model of the present invention can be used to evaluate the prognosis of hepatocellular carcinoma.
  • The present invention also provides a gene chip: probes for detecting the 18 genes MMP1, EPO, MMRN1, S100A9, ADAM9, GPC1, SPP1, GLDN, FGF9, CXCL5, CST7, THBS3, ANXA10, PIK3IP1, MMP25, CLEC3B, PZP and CLEC17A are assembled into a gene chip according to the above model, which is convenient for clinical application.
  • The probe sequence for each gene is preferably as shown in Table 5. Where a gene has multiple probes, the average of the probe readings can be taken as the final expression level of the gene.

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Data Mining & Analysis (AREA)
  • Chemical & Material Sciences (AREA)
  • Biotechnology (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Biophysics (AREA)
  • Pathology (AREA)
  • Biomedical Technology (AREA)
  • Genetics & Genomics (AREA)
  • Theoretical Computer Science (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Molecular Biology (AREA)
  • Analytical Chemistry (AREA)
  • Primary Health Care (AREA)
  • Organic Chemistry (AREA)
  • Software Systems (AREA)
  • Bioethics (AREA)
  • Evolutionary Computation (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Zoology (AREA)
  • Immunology (AREA)
  • Wood Science & Technology (AREA)
  • Biochemistry (AREA)
  • General Engineering & Computer Science (AREA)
  • Microbiology (AREA)
  • Oncology (AREA)
  • Hospice & Palliative Care (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

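The present invention discloses a gene model for judging the prognosis of hepatocellular carcinoma, together with its construction method and application. Transcriptome data from hepatocellular carcinoma patient samples are compared with those from normal samples to obtain differentially expressed genes; these are integrated with an extracellular matrix gene set and then narrowed down by a LASSO-Cox regression model to an 18-gene model. The model can evaluate the prognosis of patients with hepatocellular carcinoma and distinguish and select those with a poor prognosis, thereby guiding clinicians to offer more active treatment while avoiding overtreatment of low-risk patients. The gene model also supports the construction of a tissue chip based on extracellular matrix genes, which enables rapid prognostic evaluation of patients with hepatocellular carcinoma after surgery and facilitates clinical translation.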

Description

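Gene model for judging prognosis of hepatocellular carcinoma patients, construction method and use thereof
Technical Field
The present invention belongs to the technical field of biomedicine and specifically relates to a gene model for judging the prognosis of patients with hepatocellular carcinoma and its application.
Background Art
Liver cancer is one of the ten most common malignant tumors worldwide. About 500,000 new cases occur globally each year, of which hepatocellular carcinoma accounts for 85%. With the wider use of tumor markers and imaging examinations, improvements in surgical technique, and the development of new treatments such as intra-arterial chemoembolization, the 5-year survival rate of primary liver cancer (hepatocellular carcinoma, HCC) has improved. Overall, however, the prognosis of hepatocellular carcinoma remains unsatisfactory. One of the main reasons is the lack of effective markers for predicting patient prognosis, which makes it impossible to stratify patients by risk or to guide clinicians toward early intervention and early treatment of high-risk patients. Current studies show that the tumor microenvironment, especially the extracellular matrix, can promote tumor growth, invasion and metastasis and has a considerable impact on the prognosis of tumor patients.
Summary of the Invention
To address the current clinical lack of effective markers for judging the prognosis of patients with hepatocellular carcinoma, the present invention constructs a gene combination model at the gene level to evaluate prognosis. Extracellular matrix-related genes of patients with hepatocellular carcinoma are integrated and analyzed to construct the gene combination model, and a tissue chip based on extracellular matrix genes is built so that prognosis can be evaluated through a risk score. The evaluation results help clinicians stratify liver cancer patients and make precise treatment of hepatocellular carcinoma possible.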
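The technical solution adopted by the present invention is as follows:
A method for constructing a gene model for judging the prognosis of patients with hepatocellular carcinoma, comprising the following steps:
(1) Obtain transcriptome data from hepatocellular carcinoma and normal liver tissue samples, compare the two to identify differential genes, apply a threshold of P-value < 0.05 to obtain significantly different genes, and integrate these with an extracellular matrix gene set (559 extracellular matrix-related genes);
(2) Then analyze with the LASSO method: using the glmnet package in R, run 1000 iterations of Cox LASSO regression with 10-fold cross-validation to narrow the seed genes down to a set of 18 ECM genes associated with HCC prognosis, namely MMP1, EPO, MMRN1, S100A9, ADAM9, GPC1, SPP1, GLDN, FGF9, CXCL5, CST7, THBS3, ANXA10, PIK3IP1, MMP25, CLEC3B, PZP and CLEC17A (Table 1), and use these 18 genes as markers to construct a risk score model for predicting the prognosis of hepatocellular carcinoma.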
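Step (1) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the inventors' pipeline: the function name `select_seed_genes`, the dictionary of per-gene p-values and the toy gene lists are hypothetical, and the statistical test that produces the p-values is not specified in the source. The LASSO step (2) was run with the R package glmnet and is not reproduced here.

```python
def select_seed_genes(p_values, ecm_genes, alpha=0.05):
    """Keep genes with p < alpha that also belong to the ECM gene set.

    p_values:  dict mapping gene symbol -> p-value from a tumor-vs-normal
               comparison (how the p-value is computed is an assumption
               left to the caller).
    ecm_genes: iterable of ECM-related gene symbols (559 in the source).
    """
    significant = {gene for gene, p in p_values.items() if p < alpha}
    return sorted(significant & set(ecm_genes))


# Toy values invented for illustration only; they are not data from the patent.
toy_p_values = {"MMP1": 0.001, "SPP1": 0.020, "ALB": 0.300, "TP53": 0.004}
toy_ecm_genes = {"MMP1", "SPP1", "PZP"}

seed_genes = select_seed_genes(toy_p_values, toy_ecm_genes)
print(seed_genes)
```

Here "integration" is read as a set intersection, so a gene must be both significantly different and ECM-related to become a seed gene for the regression step.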
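A gene model obtained by the above construction method, specifically:
Risk score for patients with hepatocellular carcinoma = (0.069 × MMP1 expression level) + (0.049 × EPO expression level) + (0.042 × MMRN1 expression level) + (0.036 × S100A9 expression level) + (0.027 × ADAM9 expression level) + (0.024 × GPC1 expression level) + (0.021 × SPP1 expression level) + (0.014 × GLDN expression level) + (0.007 × FGF9 expression level) + (0.001 × CXCL5 expression level) - (0.024 × CST7 expression level) - (0.027 × THBS3 expression level) - (0.042 × ANXA10 expression level) - (0.049 × PIK3IP1 expression level) - (0.051 × MMP25 expression level) - (0.054 × CLEC3B expression level) - (0.062 × PZP expression level) - (0.069 × CLEC17A expression level).
Further, with the TCGA database as the training set and the GEO and ICGC databases as validation sets, the risk score of the gene model is analyzed and the model is validated against CLIP staging and TNM staging, showing that the risk score of patients with hepatocellular carcinoma is correlated with survival time: patients with a high risk score have shorter survival and a poorer prognosis.
Use of the above gene model in evaluating the prognosis of hepatocellular carcinoma.
A tissue chip based on extracellular matrix genes, the tissue chip comprising probes for detecting MMP1, EPO, MMRN1, S100A9, ADAM9, GPC1, SPP1, GLDN, FGF9, CXCL5, CST7, THBS3, ANXA10, PIK3IP1, MMP25, CLEC3B, PZP and CLEC17A. The chip makes precise treatment of patients with hepatocellular carcinoma possible, enables rapid prognostic evaluation of patients after surgery, and facilitates clinical translation.
The beneficial effects of the present invention are as follows: the invention constructs a gene combination model of 18 genes, from which a tissue chip based on extracellular matrix genes can be built to evaluate the prognosis of patients with hepatocellular carcinoma. The model can distinguish and select patients with a poor prognosis, that is, stratify patients and screen out those at high risk with a poor prognosis, guiding clinicians to offer more active treatment to high-risk patients while avoiding overtreatment of low-risk patients.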
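Brief Description of the Drawings
The present invention is further described below with reference to the drawings and examples;
Figure 1 is a volcano plot of the differential genes for the gene model of the present invention;
Figure 2 illustrates the construction of the LASSO-Cox regression model for the gene model of the present invention;
Figure 3 shows the combined gene model of the 18 genes of the present invention;
Figure 4 is the risk score distribution of patients with hepatocellular carcinoma in the TCGA training set; the abscissa is the patient index ordered by increasing risk score, and the dotted line is the cut-off value;
Figure 5 is the survival time distribution of patients with hepatocellular carcinoma in the TCGA training set; the abscissa is the patient index ordered by increasing risk score, the dotted line near 190 is the cut-off value, and the lines near 120 and 250 mark where the difference between death and survival becomes evident;
Figure 6 is the survival time distribution of patients with hepatocellular carcinoma in the GEO validation set; the abscissa is the patient index ordered by increasing risk score, and the dotted line is the cut-off value;
Figure 7 is the survival time distribution of patients with hepatocellular carcinoma in the ICGC validation set; the abscissa is the patient index ordered by increasing risk score, and the dotted line is the cut-off value;
Figure 8 shows the risk scores of patients with hepatocellular carcinoma at different CLIP stages in the TCGA training set;
Figure 9 shows the risk scores of patients with hepatocellular carcinoma at different TNM stages in the TCGA training set;
Figure 10 shows the relationship between prognosis and survival time of patients with hepatocellular carcinoma grouped by the gene model of the present invention in the TCGA training set;
Figure 11 shows the sensitivity and specificity results for the prognosis of patients with hepatocellular carcinoma grouped by the gene model of the present invention in the TCGA training set;
Figure 12 shows the relationship between prognosis and survival time of patients with hepatocellular carcinoma grouped by the gene model of the present invention in the GEO validation set;
Figure 13 shows the relationship between prognosis and survival time of patients with hepatocellular carcinoma grouped by the gene model of the present invention in the ICGC validation set.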
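Detailed Description
The present invention provides a gene model, based on extracellular matrix genes, for predicting the prognosis of hepatocellular carcinoma, and its application. Specifically, focusing on the differential extracellular matrix genes of patients with hepatocellular carcinoma, data from hepatocellular carcinoma tissue samples and normal liver tissue samples in public databases are analyzed statistically to establish a prognostic risk model. The model serves as a gene model for predicting patient prognosis and supports the construction of a tissue chip based on extracellular matrix genes, which helps evaluate the prognosis of patients with hepatocellular carcinoma after surgery. The inclusion and exclusion criteria for hepatocellular carcinoma tissue samples are:
(1) no other cancer treatment received before surgery;
(2) no history of other malignant tumors;
(3) complete clinicopathological data and follow-up information.
The effects of the present invention are further described below with reference to specific examples.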
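Example 1: Construction of a gene model for judging the prognosis of patients with hepatocellular carcinoma
The gene model of the present invention for judging the prognosis of patients with hepatocellular carcinoma is constructed through the following steps:
(1) First, download from the TCGA database (https://portal.gdc.cancer.gov/) the transcriptome data of 371 hepatocellular carcinoma tissue samples and 50 normal liver tissue samples, together with the corresponding patients' clinical information (including sex, overall survival time and survival status). Compare the hepatocellular carcinoma samples with the normal liver samples to identify differential genes, apply a threshold of P-value < 0.05 to obtain significantly different genes, and integrate these with 559 extracellular matrix (ECM)-related genes (see Figure 1).
(2) Then analyze with the LASSO method: using the glmnet package in R, run 1000 iterations of Cox LASSO regression with 10-fold cross-validation to screen out 18 statistically significant ECM-related candidate genes, along with the prognostic AUC and HR values of these genes (see Table 1 and Figure 2). Using the coefficients of the Cox LASSO regression model as weights, construct a risk score model for predicting the prognosis of hepatocellular carcinoma with the 18 genes MMP1, EPO, MMRN1, S100A9, ADAM9, GPC1, SPP1, GLDN, FGF9, CXCL5, CST7, THBS3, ANXA10, PIK3IP1, MMP25, CLEC3B, PZP and CLEC17A as markers (see Figure 3).
The risk score model for predicting the prognosis of hepatocellular carcinoma is specifically: Risk score for patients with hepatocellular carcinoma = (0.069 × MMP1 expression level) + (0.049 × EPO expression level) + (0.042 × MMRN1 expression level) + (0.036 × S100A9 expression level) + (0.027 × ADAM9 expression level) + (0.024 × GPC1 expression level) + (0.021 × SPP1 expression level) + (0.014 × GLDN expression level) + (0.007 × FGF9 expression level) + (0.001 × CXCL5 expression level) - (0.024 × CST7 expression level) - (0.027 × THBS3 expression level) - (0.042 × ANXA10 expression level) - (0.049 × PIK3IP1 expression level) - (0.051 × MMP25 expression level) - (0.054 × CLEC3B expression level) - (0.062 × PZP expression level) - (0.069 × CLEC17A expression level).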
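The risk score is a weighted sum, so it can be computed directly from a table of coefficients. The sketch below copies the 18 coefficients from the formula above; the names `COEFFICIENTS` and `risk_score` are hypothetical, and the source does not state the unit or normalization of "expression level", so the input is assumed to be whatever scale the model was fitted on.

```python
# Coefficients copied from the risk score formula; a negative sign means
# higher expression lowers the score.
COEFFICIENTS = {
    "MMP1": 0.069, "EPO": 0.049, "MMRN1": 0.042, "S100A9": 0.036,
    "ADAM9": 0.027, "GPC1": 0.024, "SPP1": 0.021, "GLDN": 0.014,
    "FGF9": 0.007, "CXCL5": 0.001,
    "CST7": -0.024, "THBS3": -0.027, "ANXA10": -0.042, "PIK3IP1": -0.049,
    "MMP25": -0.051, "CLEC3B": -0.054, "PZP": -0.062, "CLEC17A": -0.069,
}


def risk_score(expression):
    """Return the weighted sum of the 18 expression levels.

    expression: dict mapping gene symbol -> expression level (scale is an
    assumption; the source does not specify it).
    """
    missing = [gene for gene in COEFFICIENTS if gene not in expression]
    if missing:
        raise KeyError("missing expression levels for: " + ", ".join(missing))
    return sum(coef * expression[gene] for gene, coef in COEFFICIENTS.items())


# Toy profile invented for illustration: every gene at expression level 1.0.
toy_profile = {gene: 1.0 for gene in COEFFICIENTS}
print(round(risk_score(toy_profile), 3))
```

Raising an error on a missing gene, rather than silently treating it as zero, is a design choice made here so that an incomplete chip readout cannot produce a misleading score.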
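Table 1. The 18 ECM genes obtained from the LASSO regression model
Gene name   Disease        AUC     HR
MMP1        Liver cancer   0.628   1.220
EPO         Liver cancer   0.607   1.127
MMRN1       Liver cancer   0.542   1.088
S100A9      Liver cancer   0.589   1.213
ADAM9       Liver cancer   0.586   1.344
GPC1        Liver cancer   0.639   1.178
SPP1        Liver cancer   0.614   1.127
GLDN        Liver cancer   0.603   1.122
FGF9        Liver cancer   0.555   1.171
CXCL5       Liver cancer   0.575   1.096
CST7        Liver cancer   0.564   0.813
THBS3       Liver cancer   0.623   0.741
ANXA10      Liver cancer   0.629   0.870
PIK3IP1     Liver cancer   0.559   0.779
MMP25       Liver cancer   0.549   0.829
CLEC3B      Liver cancer   0.610   0.746
PZP         Liver cancer   0.608   0.863
CLEC17A     Liver cancer   0.593   0.826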
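Example 2: Application of the risk score model in evaluating the prognosis of hepatocellular carcinoma
The transcriptome data of 371 hepatocellular carcinoma tissue samples from the TCGA database were used as the training set, and data from 247 hepatocellular carcinoma tissues in GSE140520 from the GEO database (https://www.ncbi.nlm.nih.gov/geo/) and 203 hepatocellular carcinoma tissues from the ICGC database (https://daco.icgc.org/) were used as validation sets. The score of each patient in the training set was calculated with the risk score model, and the median score (0.044954) was taken as the cut-off value to divide patients into a high-risk score group and a low-risk score group. The relationships between risk score and survival time, CLIP stage and TNM stage were then plotted for the two groups (Figures 4-9) to verify the performance of the risk score model. Figures 4 and 5 show the risk score distribution and the survival time distribution of patients in the TCGA training set relative to the cut-off value; Figures 6 and 7 show the survival time distributions of patients in the GEO and ICGC validation sets relative to the cut-off value; and Figures 8 and 9 show the risk scores of patients in the TCGA training set at different CLIP stages and TNM stages. The higher the risk score, the lower the patient survival rate and the higher the CLIP and TNM stage, indicating that the model classifies hepatocellular carcinoma well.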
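The median split described above can be sketched as follows. The function name `assign_risk_groups` and the toy scores are hypothetical. The source does not say which group a score exactly equal to the cut-off belongs to, nor whether the validation sets reuse the training cut-off, so this sketch assumes "strictly greater than the cut-off means high risk" and lets the caller pass a fixed cut-off if desired.

```python
from statistics import median


def assign_risk_groups(scores, cutoff=None):
    """Split patients into 'high' and 'low' risk groups.

    scores: dict mapping patient id -> risk score.
    cutoff: fixed cut-off value; when None, the median of the given scores
            is used (as done for the training set in the source).
    """
    if cutoff is None:
        cutoff = median(scores.values())
    # Assumption: a score equal to the cut-off is assigned to the low group.
    groups = {pid: ("high" if s > cutoff else "low") for pid, s in scores.items()}
    return cutoff, groups


# Toy scores invented for illustration only.
toy_scores = {"P1": -0.30, "P2": 0.01, "P3": 0.10, "P4": 0.25}
cutoff, groups = assign_risk_groups(toy_scores)
print(cutoff, groups)

# Reusing the training-set cut-off reported in the source on other patients.
_, fixed_groups = assign_risk_groups({"A": 0.05, "B": 0.04}, cutoff=0.044954)
print(fixed_groups)
```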
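Further, the predictive performance of the model was evaluated with ROC curves. Figure 10 shows the relationship between prognosis and survival time for patients in the TCGA training set: patients in the high-risk score group had shorter survival and a worse prognosis than patients in the low-risk score group (see Figure 10). Figure 11 and Table 2 give the sensitivity and specificity of the model for HCC prognosis: the 3-year AUC of the risk model was 0.81, with a sensitivity of 73.7% and a specificity of 75%; the 5-year AUC was 0.79, with a sensitivity of 77.3% and a specificity of 71.7%. The model was then validated on data from 247 hepatocellular carcinoma tissues in GSE140520 from the GEO database (https://www.ncbi.nlm.nih.gov/geo/) and 203 hepatocellular carcinoma tissues from the ICGC database (see Figures 12 and 13), and the results were consistent with those from the TCGA database. Table 3 gives the sensitivity and specificity of the model for HCC prognosis in the GEO database: the 3-year AUC of the risk model was 0.626, with a sensitivity of 68.8% and a specificity of 55.8%; the 5-year AUC was 0.625, with a sensitivity of 60.0% and a specificity of 34.7%. Table 4 gives the corresponding results in the ICGC database: the 3-year AUC of the risk model was 0.723, with a sensitivity of 93.3% and a specificity of 52.7%; the 5-year AUC was 0.717, with a sensitivity of 88.9% and a specificity of 52.3%. Patients with a high risk score had shorter survival and a poorer prognosis. This shows that the risk score model of the present invention can be used to evaluate the prognosis of hepatocellular carcinoma.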
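Table 2. Sensitivity and specificity of the risk model in the TCGA database
Risk model   AUC     Sensitivity   Specificity
3-year ROC   0.81    73.7%         75.0%
5-year ROC   0.79    77.3%         71.7%
Table 3. Sensitivity and specificity of the risk model in the GEO database
Risk model   AUC     Sensitivity   Specificity
3-year ROC   0.626   68.8%         55.8%
5-year ROC   0.625   60.0%         34.7%
Table 4. Sensitivity and specificity of the risk model in the ICGC database
Risk model   AUC     Sensitivity   Specificity
3-year ROC   0.723   93.3%         52.7%
5-year ROC   0.717   88.9%         52.3%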
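For readers unfamiliar with the quantities in Tables 2-4, the sketch below shows how sensitivity, specificity and AUC relate to a risk score and a binary outcome. It is a simplified illustration, not the method used in the source: the source does not say how its 3-year and 5-year ROC curves were computed, and this sketch ignores censoring by assuming each patient's status at the time horizon is known. The function names and toy data are hypothetical.

```python
def sensitivity_specificity(scores, events, cutoff):
    """scores: risk scores; events: True if the patient died within the horizon.

    A patient is predicted high risk when the score is above the cut-off.
    """
    pairs = list(zip(scores, events))
    tp = sum(1 for s, e in pairs if e and s > cutoff)
    fn = sum(1 for s, e in pairs if e and s <= cutoff)
    tn = sum(1 for s, e in pairs if not e and s <= cutoff)
    fp = sum(1 for s, e in pairs if not e and s > cutoff)
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one event and one non-event")
    return tp / (tp + fn), tn / (tn + fp)


def auc(scores, events):
    """Probability that a random event case scores higher than a random
    non-event case, counting ties as one half."""
    pos = [s for s, e in zip(scores, events) if e]
    neg = [s for s, e in zip(scores, events) if not e]
    if not pos or not neg:
        raise ValueError("need at least one event and one non-event")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Toy data invented for illustration only.
toy_scores = [0.9, 0.3, 0.6, 0.1]
toy_events = [True, True, False, False]
print(sensitivity_specificity(toy_scores, toy_events, cutoff=0.5))
print(auc(toy_scores, toy_events))
```

With real follow-up data, patients censored before the horizon need a time-dependent ROC estimator; that is outside the scope of this sketch.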
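The present invention also provides a gene chip: probes for detecting the 18 genes MMP1, EPO, MMRN1, S100A9, ADAM9, GPC1, SPP1, GLDN, FGF9, CXCL5, CST7, THBS3, ANXA10, PIK3IP1, MMP25, CLEC3B, PZP and CLEC17A are assembled into a gene chip according to the above model, which is convenient for clinical application. The probe sequence for each gene is preferably as shown in Table 5. Where a gene has multiple probes, the average of the probe readings can be taken as the final expression level of that gene.
Table 5. Probe sequences for each gene on the gene chip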
Figure PCTCN2021139502-appb-000001
Figure PCTCN2021139502-appb-000002
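The averaging rule for genes measured by several probes can be sketched as follows. The function name `gene_expression_from_probes`, the probe identifiers and the readings are hypothetical; the actual probe sequences are those of Table 5 and are not reproduced here.

```python
from collections import defaultdict
from statistics import mean


def gene_expression_from_probes(probe_readings):
    """Average all probe readings that target the same gene.

    probe_readings: iterable of (gene, probe_id, value) tuples.
    Returns a dict mapping gene symbol -> mean reading.
    """
    by_gene = defaultdict(list)
    for gene, _probe_id, value in probe_readings:
        by_gene[gene].append(value)
    return {gene: mean(values) for gene, values in by_gene.items()}


# Toy readings invented for illustration only.
toy_readings = [
    ("MMP1", "probe_a", 2.0),
    ("MMP1", "probe_b", 4.0),
    ("EPO", "probe_c", 1.5),
]
expression = gene_expression_from_probes(toy_readings)
print(expression)
```

The resulting dictionary has the shape expected by a weighted-sum risk score, with one expression level per gene.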
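The above description of specific embodiments is intended to help those of ordinary skill in the art understand and use the present invention. Those skilled in the art can obviously make various modifications to these embodiments and apply the general principles described here to other embodiments without creative effort. The present invention is therefore not limited to the above embodiments. Improvements and modifications made by those skilled in the art according to the principles of the present invention, without departing from its scope, shall fall within the scope of protection of the present invention.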

Claims (2)

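  1. Use of a gene combination in the preparation of a tissue chip for judging the prognosis of hepatocellular carcinoma, characterized in that
    the tissue chip comprises probes for detecting MMP1, EPO, MMRN1, S100A9, ADAM9, GPC1, SPP1, GLDN, FGF9, CXCL5, CST7, THBS3, ANXA10, PIK3IP1, MMP25, CLEC3B, PZP and CLEC17A, and the gene model for judging the prognosis of hepatocellular carcinoma is: Risk score for patients with hepatocellular carcinoma = (0.069 × MMP1 expression level) + (0.049 × EPO expression level) + (0.042 × MMRN1 expression level) + (0.036 × S100A9 expression level) + (0.027 × ADAM9 expression level) + (0.024 × GPC1 expression level) + (0.021 × SPP1 expression level) + (0.014 × GLDN expression level) + (0.007 × FGF9 expression level) + (0.001 × CXCL5 expression level) - (0.024 × CST7 expression level) - (0.027 × THBS3 expression level) - (0.042 × ANXA10 expression level) - (0.049 × PIK3IP1 expression level) - (0.051 × MMP25 expression level) - (0.054 × CLEC3B expression level) - (0.062 × PZP expression level) - (0.069 × CLEC17A expression level).
  2. The use according to claim 1, characterized in that the risk score of patients with hepatocellular carcinoma is correlated with survival time, and patients with a high risk score have shorter survival and a poorer prognosis.
PCT/CN2021/139502 2021-09-16 2021-12-20 Gene model for judging prognosis of hepatocellular carcinoma patients, construction method and use thereof WO2023040102A1 (zh)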

Priority Applications (1)

Application Number Priority Date Filing Date Title
US18/358,001 US20240021268A1 (en) 2021-09-16 2023-07-24 Gene model for judging prognosis of hepatocellular carcinoma patients, construction method and use thereof

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202111089547.X 2021-09-16
CN202111089547.XA CN113539376B (zh) 2021-09-16 Gene model for judging prognosis of hepatocellular carcinoma patients, construction method and use thereof

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US18/358,001 Continuation US20240021268A1 (en) 2021-09-16 2023-07-24 Gene model for judging prognosis of hepatocellular carcinoma patients, construction method and use thereof

Publications (1)

Publication Number Publication Date
WO2023040102A1 (zh) 2023-03-23

Family

ID=78092777

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2021/139502 WO2023040102A1 (zh) 2021-09-16 2021-12-20 Gene model for judging prognosis of hepatocellular carcinoma patients, construction method and use thereof

Country Status (3)

Country Link
US (1) US20240021268A1 (zh)
CN (1) CN113539376B (zh)
WO (1) WO2023040102A1 (zh)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
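CN101560554B (zh) * 2009-03-06 2011-05-04 复旦大学附属中山医院 Gene chip for predicting postoperative recurrence of hepatocellular carcinoma
CN102191323A (zh) * 2011-04-12 2011-09-21 复旦大学附属中山医院 Real-time quantitative PCR microarray chip kit for predicting survival after liver cancer surgery
CN105063049A (zh) * 2015-08-14 2015-11-18 上海缔达生物科技有限公司 Micro nucleic acid sequences, probes and kit for liver cancer prognosis evaluation
CN113539376A (zh) * 2021-09-16 2021-10-22 浙江大学 Gene model for judging prognosis of hepatocellular carcinoma patients, construction method and use thereof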

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10894988B2 (en) * 2015-09-11 2021-01-19 The Board Of Trustees Of The Leland Stanford Junior University Method of determining the prognosis of hepatocellular carcinomas using a multigene signature associated with metastasis
EP3481953A4 (en) * 2016-07-06 2020-04-15 Youhealth Biotech, Limited SPECIFIC METHYLATION MARKERS OF LIVER CANCER AND USES THEREOF
EP3698139A1 (en) * 2017-10-16 2020-08-26 Biopredictive Method of prognosis and follow up of primary liver cancer
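CN108630317B (zh) * 2018-05-09 2022-04-15 中国科学院昆明动物研究所 Personalized prognosis evaluation method for liver cancer based on a multi-gene expression signature
CN110499364A (zh) * 2019-07-30 2019-11-26 北京凯昂医学诊断技术有限公司 Probe set for detecting whole exomes of expanded hereditary diseases, and kit and application thereof
CN111402949B (zh) * 2020-04-17 2023-12-22 北京恩瑞尼生物科技股份有限公司 Method for constructing a unified model for diagnosis, prognosis and recurrence of hepatocellular carcinoma patients
CN113241181A (zh) * 2021-06-29 2021-08-10 北京泱深生物信息技术有限公司 Prognostic risk assessment model and assessment device for liver cancer patients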

Also Published As

Publication number Publication date
CN113539376B (zh) 2022-01-18
US20240021268A1 (en) 2024-01-18
CN113539376A (zh) 2021-10-22

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 21957366

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 21957366

Country of ref document: EP

Kind code of ref document: A1