CN111883203B - Method for constructing a model for predicting PD-1 efficacy
- Publication number
- CN111883203B (application CN202010637199.4A)
- Authority
- CN
- China
- Prior art keywords
- predicting
- model
- rna
- efficacy
- curative effect
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B5/00—ICT specially adapted for modelling or simulations in systems biology, e.g. gene-regulatory networks, protein interaction networks or metabolic networks
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
- C12Q1/6886—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material for cancer
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/106—Pharmacogenomics, i.e. genetic variability in individual responses to drugs and drug metabolism
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/158—Expression markers
Abstract
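The invention discloses a method for constructing a model for predicting PD-1 efficacy, comprising the following steps: 1) RNA-seq of a needle-biopsy or paraffin-embedded tissue sample taken within one month before the start of medication is used as the baseline, the patient having received no other treatment during this period; 2) after the patient starts PD-1/PD-L1 therapy, a CT scan is performed every 3 months; 3) conventional RNA-seq sequencing is performed on the sample; 4) RNA-seq bioinformatic analysis; 5) construction and screening of RNA-seq data features for predicting PD-1 efficacy; 6) construction of the model for predicting PD-1 efficacy. Compared with existing molecular markers for predicting PD-1 efficacy, the RNA-sequencing molecular markers of the invention give more accurate predictions at lower cost.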
Description
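Technical Field
The invention relates to a method for constructing a model for predicting PD-1 efficacy. It uses gene expression information of the tumor and the tumor microenvironment, obtained by RNA sequencing, to predict the effectiveness of clinical treatment with tumor immune checkpoint PD-1 blockers.
Background Art
Under normal conditions, the body's lymphoid immune cells are in a state of immune surveillance; when a tumor arises, the immune system is activated and eliminates tumor cells through recognition and multiple killing mechanisms. Immune checkpoints are a regulatory mechanism of the immune system: under normal conditions they maintain immune tolerance and prevent excessive immune responses by regulating the intensity of the autoimmune response. Common immune checkpoints on T cells include inhibitory co-stimulatory molecules such as programmed death 1 (PD-1), cytotoxic T lymphocyte antigen 4 (CTLA-4), and lymphocyte activation gene 3 (LAG-3). Tumor cells exploit this important immune regulation to suppress the T-cell immune response and thereby escape immune recognition and immune attack. Immune checkpoint drugs approved for marketing by the National Medical Products Administration (NMPA) currently include PD-1 blockers, used mainly in the treatment of advanced malignant melanoma and locally advanced or metastatic non-small cell lung cancer. PD-1 blockers enhance T-cell activity by blocking the PD-1/PD-L1 pathway, thereby achieving the effect of tumor immunotherapy.
However, not all tumor patients benefit from PD-1 blockers. For example, among unselected non-small cell lung cancer patients only 10-30% respond to PD-1 blockers, so a model for predicting PD-1 efficacy, for use as an immunotherapy companion diagnostic, is urgently needed. Existing studies have shown that tumor mutation burden (TMB), microsatellite instability (MSI), and PD-L1 gene expression can be used to screen for patients who benefit from PD-1 blockers. Patients with high TMB benefit because of the increased probability that neoantigen epitopes are generated and processed by antigen-presenting cells. Up-regulation of PD-L1 gene expression usually reflects the response of tumor cells to interferon-γ (IFN-γ) secreted by T cells, and thus indirectly reflects the tumor immune microenvironment, namely T-cell infiltration; after the patient receives a PD-1 blocker, T-cell activity in the body increases and the recognition and killing of tumor cells takes effect. Although these biomarkers can effectively enrich for patients suited to PD-1 inhibitors, their predictive power is limited. For example, in first-line treatment of non-small cell lung cancer, only 44.8% of PD-L1-positive patients respond to PD-1 blockers, while 17% of PD-L1-negative patients still respond. Similarly, a pan-cancer study showed that only 58% of TMB-high patients respond to PD-1 blockers, while 20% of TMB-low patients still respond. In addition, PD-1 therapy is expensive, and both over-treatment and ineffective treatment are pointless. It is therefore necessary to find new biomarkers that predict which patients benefit from PD-1 immunotherapy.
Some studies have shown that the interaction mechanisms between tumor cells and host immune cells are complex, and that PD-L1 expression and TMB level alone cannot give an effective global picture of these mechanisms.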
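Summary of the Invention
The main object of the invention is to provide a method for constructing a model for predicting PD-1 efficacy, comprising the following steps:
1) RNA-seq of a needle-biopsy or paraffin-embedded tissue sample taken within one month before the start of medication is used as the baseline, the patient having received no other treatment during this period;
2) After the patient starts PD-1/PD-L1 therapy, CT scans are performed at regular intervals, the clinical drug response is assessed according to the response evaluation criteria for solid tumors, and the best response is taken as the final response;
3) Conventional RNA-seq sequencing is performed on the sample; the RNA integrity number (RIN) is used to assess the quality of RNA extraction, and the size distribution of the fragmented cDNA molecules in the prepared library is analyzed;
4) RNA-seq bioinformatic analysis, specifically including: post-sequencing quality assessment of the sample RNA sequencing data; alignment of sequencing reads to the human reference genome; post-alignment quality assessment, flagging poor-quality sequenced samples (including samples with a low alignment rate, samples with a high proportion of reads aligned to intergenic regions, etc.); and, based on the read alignments to the genome, estimating transcript expression and gene expression and computing transcripts per million (TPM) and fragments per kilobase of exon per million reads mapped (FPKM);
5) Construction and screening of RNA-seq data features for predicting PD-1 efficacy. The data features mainly comprise three classes: ① immune-related gene expression; ② immune-related gene-pair relations, i.e. the relative magnitude of expression between pairs of genes; ③ immune microenvironment enrichment analysis, i.e. quantifying the cellular composition of the patient's tumor sample by gene set enrichment analysis (GSEA). Feature screening: using labeled PD-1/PD-L1 efficacy data, analysis of variance is used to screen out RNA-seq data features with potential for predicting PD-1 efficacy;
6) Construction of the model for predicting PD-1 efficacy: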
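a) Data cleaning and normalization. Genes common to the expression profile datasets are retained, and the RNA-seq expression profiles of the different datasets are normalized with DESeq;
b) Using the training set, analysis of variance is used to screen the three classes of data features for those that differ significantly between the PD-1 responder group and the non-responder group;
c) A multi-kernel weighted nearest-neighbor PD-1 efficacy prediction model is constructed. One Gaussian kernel $k_{\gamma}(x, x')$ ($\gamma > 0$, $x$ real) is chosen for each of the three classes of data features, and each kernel function is given its own weight $w_i$ (with $\sum_i w_i = 1$, $w_i > 0$). A nearest-neighbor distance weight decay function $d$ is introduced [formula not reproduced in the source text], where $K$ is the similarity matrix obtained from the Gaussian kernels above and $\lambda$ is the decay scale factor ($\lambda > 1$, real). The distances to the $n$ nearest neighbors are computed [formula not reproduced in the source text], where $y$ is the sample's treatment response label ($y \in \{-1, 1\}$; $-1$: non-response, $1$: response) and $n$ is a positive integer. The sigmoid function is chosen as the activation function and cross-entropy as the loss function [formulas not reproduced in the source text]. The Gaussian kernel parameter $\gamma$, the weights $w_1$ and $w_2$, the number of nearest neighbors $n$, and $\lambda$ are the model hyperparameters. Finally, particle swarm optimization (PSO) is used to optimize the model parameters, completing the construction of the model for predicting PD-1 efficacy.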
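As a rough illustration of step 6), the sketch below combines one Gaussian kernel per feature class into a weighted similarity and scores a query sample from its n most similar training samples. The decay rule (the r-th neighbor weighted by λ^(-r)), the function names, and the way the score feeds the sigmoid are assumptions made for illustration only; the patent's own decay and distance formulas are not reproduced in this text, and the PSO search over γ, w, n, and λ is omitted.

```python
import math

def gaussian_kernel(a, b, gamma):
    """k_gamma(a, b) = exp(-gamma * ||a - b||^2), with gamma > 0."""
    sq_dist = sum((u - v) ** 2 for u, v in zip(a, b))
    return math.exp(-gamma * sq_dist)

def combined_similarity(sample_a, sample_b, gammas, weights):
    """Weighted sum of one Gaussian kernel per feature class (weights sum to 1)."""
    return sum(
        w * gaussian_kernel(fa, fb, g)
        for fa, fb, g, w in zip(sample_a, sample_b, gammas, weights)
    )

def predict_proba(query, train_samples, train_labels, gammas, weights, n, lam):
    """Probability of response from the n most similar training samples.

    ASSUMPTION: the r-th nearest neighbour (r = 0, 1, ...) is down-weighted
    by lam ** -r; the patent's actual decay function is not reproduced here.
    """
    sims = [combined_similarity(query, s, gammas, weights) for s in train_samples]
    ranked = sorted(zip(sims, train_labels), key=lambda t: t[0], reverse=True)[:n]
    score = sum(sim * y * lam ** -r for r, (sim, y) in enumerate(ranked))
    return 1.0 / (1.0 + math.exp(-score))  # sigmoid activation

def cross_entropy(probs, labels):
    """Mean binary cross-entropy; labels use the -1 / 1 coding of the text."""
    eps = 1e-12
    total = 0.0
    for p, y in zip(probs, labels):
        t = 1.0 if y == 1 else 0.0
        total -= t * math.log(p + eps) + (1.0 - t) * math.log(1.0 - p + eps)
    return total / len(labels)
```

Each sample here is a tuple of three feature vectors, one per feature class, so that every class gets its own kernel width and weight. An optimizer such as PSO would then search the hyperparameters to minimize `cross_entropy` on the training set.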
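Preferably, since PSO may become trapped in a local optimum, sub-steps a) to c) of step 6) are repeated 8-12 times and the median of the results is taken as the model's final prediction. More preferably, they are repeated 10 times.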
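The repeat-and-take-the-median idea above can be sketched as follows. `fit_and_predict` is a hypothetical callable standing in for one full run of sub-steps a) to c); it receives its own random generator and returns one prediction per sample. The names and the seeding scheme are assumptions, not part of the patent.

```python
import random
import statistics

def median_of_runs(fit_and_predict, n_runs=10, base_seed=0):
    """Run a stochastic fit n_runs times and take the per-sample median."""
    runs = [fit_and_predict(random.Random(base_seed + i)) for i in range(n_runs)]
    # zip(*runs) groups the predictions for each sample across all runs.
    return [statistics.median(per_sample) for per_sample in zip(*runs)]
```

Taking the median rather than the mean keeps a single run that landed in a poor local optimum from dominating the final prediction.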
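Preferably, in step 2), the CT scans are performed at intervals of 2-4 months. More preferably, the interval is 3 months.
Preferably, in step 2), the response evaluation criteria for solid tumors are the Response Evaluation Criteria in Solid Tumors (RECIST) version 1.1.
Preferably, in step 4), the post-sequencing quality assessment of the sample RNA sequencing data covers the total number of sequencing reads, the read length, and the per-base sequencing quality of the reads.
Preferably, in step 4), poor-quality sequenced samples include samples with a low alignment rate and samples with a high proportion of reads aligned to intergenic regions.
Preferably, in step 5), when analysis of variance is used to screen out RNA-seq data features with potential for predicting PD-1 efficacy, the screening criterion is a p-value below 0.05.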
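The p < 0.05 screening criterion can be illustrated with a small one-way analysis of variance written in plain Python. This is a sketch under assumptions: the function and feature names are hypothetical, the p-value comes from the F distribution via a standard continued-fraction evaluation of the regularized incomplete beta function, and no multiple-testing correction is applied because the text does not mention one.

```python
import math

def _betacf(a, b, x, max_iter=300, eps=3e-14):
    """Continued fraction for the incomplete beta function (modified Lentz)."""
    tiny = 1e-300
    qab, qap, qam = a + b, a + 1.0, a - 1.0
    c = 1.0
    d = 1.0 - qab * x / qap
    d = 1.0 / (d if abs(d) > tiny else tiny)
    h = d
    for m in range(1, max_iter + 1):
        m2 = 2 * m
        for aa in (
            m * (b - m) * x / ((qam + m2) * (a + m2)),
            -(a + m) * (qab + m) * x / ((a + m2) * (qap + m2)),
        ):
            d = 1.0 + aa * d
            d = 1.0 / (d if abs(d) > tiny else tiny)
            c = 1.0 + aa / c
            c = c if abs(c) > tiny else tiny
            delta = d * c
            h *= delta
        if abs(delta - 1.0) < eps:
            break
    return h

def betainc(a, b, x):
    """Regularized incomplete beta function I_x(a, b)."""
    if x <= 0.0:
        return 0.0
    if x >= 1.0:
        return 1.0
    front = math.exp(
        math.lgamma(a + b) - math.lgamma(a) - math.lgamma(b)
        + a * math.log(x) + b * math.log(1.0 - x)
    )
    if x < (a + 1.0) / (a + b + 2.0):
        return front * _betacf(a, b, x) / a
    return 1.0 - front * _betacf(b, a, 1.0 - x) / b

def anova_p_value(groups):
    """One-way ANOVA p-value for a list of groups of numeric values."""
    k = len(groups)
    n = sum(len(g) for g in groups)
    means = [sum(g) / len(g) for g in groups]
    grand = sum(sum(g) for g in groups) / n
    ss_between = sum(len(g) * (m - grand) ** 2 for g, m in zip(groups, means))
    ss_within = sum((v - m) ** 2 for g, m in zip(groups, means) for v in g)
    df_b, df_w = k - 1, n - k
    if ss_within == 0.0:
        return 0.0 if ss_between > 0.0 else 1.0
    f = (ss_between / df_b) / (ss_within / df_w)
    # Survival function of F(df_b, df_w) expressed through I_x(a, b).
    return betainc(df_w / 2.0, df_b / 2.0, df_w / (df_w + df_b * f))

def screen_features(features, labels, alpha=0.05):
    """Keep features whose responder / non-responder ANOVA p-value < alpha."""
    selected = []
    for name, values in features.items():
        resp = [v for v, y in zip(values, labels) if y == 1]
        non = [v for v, y in zip(values, labels) if y == -1]
        if anova_p_value([resp, non]) < alpha:
            selected.append(name)
    return selected
```

With two groups, this F-test is equivalent to a two-sided pooled-variance t-test, so any standard statistics package should give the same p-values.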
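The invention also provides a model for predicting PD-1 efficacy, established by the foregoing method.
The invention also provides the use of said model in the preparation of a product for predicting, or assisting in predicting, the efficacy of tumor immune checkpoint PD-1 blockers.
The invention provides a model for predicting PD-1 efficacy that can be used for sensitivity testing or auxiliary prediction of PD-1/PD-L1 inhibitors in multiple cancer types. Feature engineering based on prior knowledge, combined with artificial intelligence algorithms, can achieve recognition results with both high sensitivity and high specificity.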
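Beneficial Effects
Compared with existing test products, the invention has the following advantages:
Compared with genome-level molecular markers for predicting PD-1 efficacy (TMB or MSI), the invention focuses on the more complex tumor microenvironment. RNA sequencing can resolve the cellular components of the tumor microenvironment by enrichment analysis for use in PD-1 efficacy prediction, which is a more direct process.
By using relations between immune-related gene pairs, the bias of RNA sequencing data produced on different sequencing platforms can be disregarded and batch effects between different datasets can be overcome, so the results are more robust.
Through multi-kernel learning and distance weight decay, the similarity between samples can be computed at different levels for different types of data, so the model is more reasonable.
Compared with existing molecular markers for predicting PD-1 efficacy, the RNA-sequencing molecular markers give more accurate predictions.
Compared with existing molecular markers for predicting PD-1 efficacy, the RNA-sequencing molecular markers are more economical to sequence.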
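Brief Description of the Drawings
The invention is further described below with reference to the drawings and examples.
Figure 1 is the technical flow chart of the invention.
Figure 2 compares Kernel_weight_knn with existing molecular markers for predicting PD-1 treatment efficacy.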
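Detailed Description
1) Data preparation. Two melanoma datasets with known PD-1 efficacy (PRJNA312948, PRJNA356761) were downloaded from the public database GEO for screening the three classes of data features. A urothelial tumor (Urothelial) dataset with known PD-1 efficacy was downloaded from http://doi.org/10.5281/zenodo.546110 for validation. Twenty lung cancer (Lung Cancer) patients treated with PD-1 were collected according to steps 1) and 2) of the Summary of the Invention and assessed for clinical drug response. Sample information is summarized in Table 1.
Table 1. Summary of sample information
Dataset | Responders | Non-responders |
---|---|---|
PRJNA312948 | 14 | 12 |
PRJNA356761 | 26 | 25 |
Urothelial | 12 | 9 |
Lung Cancer | 8 | 12 |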
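2) Sequencing. Samples were sequenced by conventional RNA-seq according to step 3) of the Summary of the Invention. The samples were formalin-fixed, paraffin-embedded (FFPE) tissue, and RNA was extracted with an RNA extraction kit. RNA yield should be no less than 10 ng, concentration no less than 2 ng/μl, the 260/280 absorbance ratio between 1.8 and 2.0, RIN no less than 1, and DV200 no less than 20%. After RNA library preparation, sequencing was performed on the NovaSeq platform.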
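The acceptance thresholds in step 2) translate directly into a per-sample check. The sketch below is illustrative: the `RnaSample` record and its field names are hypothetical, while the threshold values are the ones stated in the text.

```python
from dataclasses import dataclass

@dataclass
class RnaSample:
    sample_id: str
    yield_ng: float
    conc_ng_per_ul: float
    a260_280: float
    rin: float
    dv200_pct: float

def qc_failures(s):
    """Return the list of failed criteria; an empty list means the sample passes."""
    checks = [
        ("yield below 10 ng", s.yield_ng < 10),
        ("concentration below 2 ng/ul", s.conc_ng_per_ul < 2),
        ("A260/280 outside 1.8-2.0", not (1.8 <= s.a260_280 <= 2.0)),
        ("RIN below 1", s.rin < 1),
        ("DV200 below 20%", s.dv200_pct < 20),
    ]
    return [message for message, failed in checks if failed]
```

Returning the failed criteria, rather than a single pass/fail flag, makes it easier to see why an FFPE sample was excluded.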
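3) Bioinformatic analysis. The RNA-seq reads of each sample were first aligned to the reference genome using STAR_2.6.1a_08-27, with hg19 as the human reference genome. Gene expression was then quantified with RSEM v1.2.28, using the human gene annotation file gencode.v29lift37.annotation.gtf. Sequencing quality of the samples was assessed with RNA-SeQC_v1.1.8. Note: all software involved was run with default parameters.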
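RSEM reports both TPM and FPKM itself, so the following is only a sketch of what the two units mean, computed from per-gene fragment counts and gene lengths in base pairs. The function names and the toy inputs are assumptions for illustration.

```python
def fpkm(counts, lengths_bp):
    """Fragments per kilobase of exon per million mapped fragments."""
    total = sum(counts.values())
    return {g: c * 1e9 / (lengths_bp[g] * total) for g, c in counts.items()}

def tpm(counts, lengths_bp):
    """Transcripts per million: length-normalized rates rescaled to sum to 1e6."""
    rates = {g: c / lengths_bp[g] for g, c in counts.items()}
    scale = sum(rates.values())
    return {g: r / scale * 1e6 for g, r in rates.items()}
```

TPM values always sum to one million within a sample, which makes them easier to compare across samples than FPKM.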
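4) Data preprocessing. The four datasets from step 3) were normalized using the public R release R-3.3.3 with the normalization package DESeq2 v1.14.1 and default normalization parameters.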
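The embodiment performs this step in R with DESeq2. Purely to illustrate the median-of-ratios idea behind DESeq-style size factors, here is a plain-Python sketch; it is not a substitute for the package, and the data layout (gene name mapped to a list of per-sample counts) is an assumption.

```python
import math
import statistics

def size_factors(counts):
    """Median-of-ratios size factor per sample.

    counts maps gene -> list of raw counts, one entry per sample.
    Genes with a zero count in any sample are skipped, since their
    geometric mean is zero.
    """
    n_samples = len(next(iter(counts.values())))
    log_ratios = [[] for _ in range(n_samples)]
    for row in counts.values():
        if min(row) <= 0:
            continue
        log_geo_mean = sum(math.log(c) for c in row) / n_samples
        for j, c in enumerate(row):
            log_ratios[j].append(math.log(c) - log_geo_mean)
    return [math.exp(statistics.median(r)) for r in log_ratios]

def normalize(counts, factors):
    """Divide every count by its sample's size factor."""
    return {g: [c / f for c, f in zip(row, factors)] for g, row in counts.items()}
```

If one sample is simply sequenced twice as deeply as another, its size factor comes out twice as large and the normalized counts coincide.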
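5) Feature construction and screening. Immune microenvironment features: the cellular composition of each patient's tumor sample was quantified by gene set enrichment analysis (GSEA) and used as this class of features; the enrichment analysis package was GSVA v1.22.4. Immune-related gene relative-relation features: expression of the relevant genes was normalized to the range 0 to 1, and the differences between immune-related gene pairs were then used as this class of features; the immune-related genes are listed in Table 2. Feature screening: using labeled PD-1/PD-L1 efficacy data (whether or not the treatment was effective), analysis of variance was used to screen out RNA-seq data features with potential for predicting PD-1 efficacy, with a p-value threshold of 0.05.
Table 2. List of immune-related genes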
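The gene-pair features described in step 5) can be sketched as follows: each gene is scaled to the range 0 to 1 across samples, and the difference of every gene pair becomes one feature. The gene symbols used in the usage note are illustrative placeholders and are not taken from Table 2; the function names are likewise assumptions.

```python
from itertools import combinations

def min_max(values):
    """Scale a list of values to the range [0, 1]; constant genes map to 0."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0] * len(values)
    return [(v - lo) / (hi - lo) for v in values]

def gene_pair_features(expr):
    """expr maps gene -> per-sample expression; returns pairwise differences."""
    scaled = {g: min_max(v) for g, v in expr.items()}
    features = {}
    for a, b in combinations(sorted(scaled), 2):
        features[f"{a}-{b}"] = [x - y for x, y in zip(scaled[a], scaled[b])]
    return features
```

For example, `gene_pair_features({"CD274": [0, 5, 10], "CD8A": [2, 2, 4]})` yields a single feature `"CD274-CD8A"`. Because each gene is rescaled within its own dataset before differencing, a platform-wide shift or scale change in one gene's values does not alter the feature.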
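6) The multi-kernel weighted nearest-neighbor PD-1 efficacy prediction model was built using the Urothelial data, and the Lung Cancer dataset was used for prediction.
One Gaussian kernel $k_{\gamma}(x, x')$ was chosen for each of the three classes of data features, and each kernel function was given its own weight $w_i$, with $\sum_i w_i = 1$. A nearest-neighbor distance weight decay function $d$ was introduced [formula not reproduced in the source text], where $\lambda$ is the decay scale factor. The distances to the $n$ nearest neighbors were computed [formula not reproduced in the source text], where $y$ is the sample's treatment response label ($y \in \{-1, 1\}$; $-1$: non-response, $1$: response). The sigmoid function was chosen as the activation function and cross-entropy as the loss function [formulas not reproduced in the source text]. The Gaussian kernel parameter $\gamma$, the weights $w_1$ and $w_2$, the number of nearest neighbors $n$, and $\lambda$ are the model hyperparameters. Finally, particle swarm optimization (PSO) was used to optimize the model parameters, completing the construction of the model for predicting PD-1 efficacy. The optimized hyperparameter values for each of the 10 runs are given in Table 3.
Table 3. Optimized hyperparameter values for each of the 10 runs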
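7) Results. On the Urothelial dataset, the PD-1 efficacy predictions of Kernel_weight_knn, TMB, and IFN-γ were compared; the Kernel_weight_knn model was best, with AUC = 0.78 (Figure 2A). On the Lung Cancer data, Kernel_weight_knn and IFN-γ were compared; the Kernel_weight_knn model was best, with AUC = 0.88 (Figure 2B).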
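The comparison above is reported as AUC. As a reminder of what that number measures, the sketch below computes it as the fraction of responder / non-responder pairs that the score ranks correctly, with ties counted as one half. The function name and the toy scores in the usage note are assumptions, not data from the patent.

```python
def auc(scores, labels):
    """Area under the ROC curve via pairwise ranking; labels are 1 or -1."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == -1]
    if not pos or not neg:
        raise ValueError("need at least one responder and one non-responder")
    wins = sum(
        1.0 if p > q else 0.5 if p == q else 0.0
        for p in pos
        for q in neg
    )
    return wins / (len(pos) * len(neg))
```

For instance, `auc([0.9, 0.8, 0.3, 0.2], [1, -1, 1, -1])` is 0.75, because three of the four responder / non-responder pairs are ordered correctly.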
Claims (9)
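1. A method for constructing a model for predicting PD-1 efficacy, comprising the following steps:
1) RNA-seq of a needle-biopsy or paraffin-embedded tissue sample taken within one month before the start of medication is used as the baseline, the patient having received no other treatment during this period;
2) After the patient starts PD-1/PD-L1 therapy, CT scans are performed at regular intervals, the clinical drug response is assessed according to the response evaluation criteria for solid tumors, and the best response is taken as the final response;
3) Conventional RNA-seq sequencing is performed on the sample of step 1); the RNA integrity number is used to assess the quality of RNA extraction, and the size distribution of the fragmented cDNA molecules in the prepared library is analyzed;
4) RNA-seq bioinformatic analysis, specifically including: post-sequencing quality assessment of the sample RNA sequencing data; alignment of sequencing reads to the human reference genome; post-alignment quality assessment, flagging poor-quality sequenced samples; and, based on the read alignments to the genome, estimating transcript expression and gene expression;
5) Construction and screening of RNA-seq data features for predicting PD-1 efficacy; the data features mainly comprise three classes: ① immune-related gene expression; ② immune-related gene-pair relations, i.e. the relative magnitude of expression between pairs of genes; ③ immune microenvironment enrichment analysis, i.e. quantifying the cellular composition of the patient's tumor sample by gene set enrichment analysis; feature screening: using labeled PD-1/PD-L1 efficacy data, analysis of variance is used to screen out RNA-seq data features with potential for predicting PD-1 efficacy;
6) Construction of the model for predicting PD-1 efficacy:
a) Data cleaning and normalization: genes common to the expression profile datasets are retained, and the RNA-seq expression profiles of the different datasets are normalized with DESeq;
b) Using the training set, analysis of variance is used to screen the three classes of data features for those that differ significantly between the PD-1 responder group and the non-responder group;
c) A multi-kernel weighted nearest-neighbor PD-1 efficacy prediction model is constructed, wherein one Gaussian kernel $k_{\gamma}(x, x')$ is chosen for each of the three classes of data features, and each kernel function is given its own weight $w_i$, with $\sum_i w_i = 1$; a nearest-neighbor distance weight decay function $d$ is introduced [formula not reproduced in the source text], where $\lambda$ is the decay scale factor; the distances to the $n$ nearest neighbors are computed [formula not reproduced in the source text], where $y$ is the sample's treatment response label ($y \in \{-1, 1\}$; $-1$: non-response, $1$: response); the sigmoid function is chosen as the activation function and cross-entropy as the loss function [formulas not reproduced in the source text]; the Gaussian kernel parameter $\gamma$, the weights $w_1$ and $w_2$, the number of nearest neighbors $n$, and $\lambda$ are the model hyperparameters; finally, particle swarm optimization (PSO) is used to optimize the model parameters, completing the construction of the model for predicting PD-1 efficacy.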
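2. The method for constructing a model for predicting PD-1 efficacy according to claim 1, characterized in that sub-steps a) to c) of step 6) are repeated 8-12 times and the median of the results is taken as the model's final prediction.
3. The method for constructing a model for predicting PD-1 efficacy according to claim 1, characterized in that, in step 2), the CT scans are performed at intervals of 2-4 months.
4. The method for constructing a model for predicting PD-1 efficacy according to claim 1, characterized in that, in step 2), the response evaluation criteria for solid tumors are the Response Evaluation Criteria in Solid Tumors version 1.1.
5. The method for constructing a model for predicting PD-1 efficacy according to claim 1, characterized in that, in step 4), the post-sequencing quality assessment of the sample RNA sequencing data covers the total number of sequencing reads, the read length, and the per-base sequencing quality of the reads.
6. The method for constructing a model for predicting PD-1 efficacy according to claim 1, characterized in that, in step 4), poor-quality sequenced samples include samples with a low alignment rate and samples with a high proportion of reads aligned to intergenic regions.
7. The method for constructing a model for predicting PD-1 efficacy according to claim 1, characterized in that, in step 5), when analysis of variance is used to screen out RNA-seq data features with potential for predicting PD-1 efficacy, the screening criterion is a p-value below 0.05.
8. A model for predicting PD-1 efficacy, established by the method of any one of claims 1 to 7.
9. Use of the model according to claim 8 in the preparation of a product for predicting, or assisting in predicting, the efficacy of tumor immune checkpoint PD-1 blockers.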
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010637199.4A CN111883203B (zh) | 2020-07-03 | 2020-07-03 | Method for constructing a model for predicting PD-1 efficacy |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010637199.4A CN111883203B (zh) | 2020-07-03 | 2020-07-03 | Method for constructing a model for predicting PD-1 efficacy |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111883203A (zh) | 2020-11-03 |
CN111883203B (zh) | 2023-12-29 |
Family
ID=73150040
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010637199.4A Active CN111883203B (zh) | Method for constructing a model for predicting PD-1 efficacy | 2020-07-03 | 2020-07-03 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111883203B (zh) |
Also Published As
Publication number | Publication date |
---|---|
CN111883203A (zh) | 2020-11-03 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
TA01 | Transfer of patent application right | Effective date of registration: 2020-12-17. Address after: Room 202, building 3, 138 xinjunhuan Road, Minhang District, Shanghai. Applicant after: Shanghai Xiawei medical laboratory Co.,Ltd. Address before: Room 201202, building 3, 138 xinjunhuan Road, Minhang District, Shanghai. Applicant before: Shanghai Xiawei Biotechnology Co.,Ltd. | |
GR01 | Patent grant | ||