CN112397200B - A genetic risk prediction model for non-syndromic cleft lip and palate - Google Patents
A genetic risk prediction model for non-syndromic cleft lip and palate
- Publication number
- CN112397200B
- Authority
- CN
- China
- Prior art keywords
- cleft lip
- palate
- genetic risk
- prediction model
- snp
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/156—Polymorphic or mutational markers
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
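The invention discloses a genetic risk prediction model for non-syndromic cleft lip and palate, with the formula $\mathrm{wGRS}=\sum_{i=1}^{k}\beta_i G_i$, where k is the number of SNP loci; $G_i$ is the number of genetic risk alleles at the i-th SNP locus, i.e. 0, 1 or 2; $\beta_i$ is the weight of the i-th SNP locus; and the SNPs are rs139860270, rs1883873, rs139530062, rs144415105, rs55816698, rs139860270, rs6989548 and rs12952376. The genetic risk score model of the invention adds up the weak effects of the 8 SNPs, which greatly improves prediction of the genetic risk of non-syndromic cleft lip and palate. The invention is the first to propose a genetic risk score model for evaluating the genetic risk of non-syndromic cleft lip and palate. The model is highly accurate and can provide a more comprehensive, accurate and individualized scientific basis for the risk assessment, prevention and control of cleft lip and palate in China.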
Description
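Technical Field
The invention belongs to the technical field of genetic disease diagnosis and relates to a weighted genetic risk prediction model associated with non-syndromic cleft lip and palate.
Background Art
Congenital cleft lip and palate is a common birth defect with an incidence of about 1/700, divided mainly into syndromic and non-syndromic forms. Patients present mainly with isolated cleft lip, isolated cleft palate, or cleft lip with cleft palate; patients with the syndromic form often have additional defects, such as structural malformations of the brain. At birth, patients often have problems such as feeding difficulty, and over time they may also develop speech disorders, hearing loss and other symptoms. Even after surgical repair, patients may still develop psychological problems, which places a heavy burden on the individual and the family.
The occurrence of congenital cleft lip and palate is influenced jointly by environmental and genetic factors. Maternal smoking, alcohol consumption or folic acid deficiency during pregnancy may each cause cleft lip and palate in the fetus, and exposure of the pregnant woman to toxic substances in her working or living environment may likewise lead to the malformation. Studies of cleft lip and palate pedigrees have found that the condition shows familial aggregation: relatives of patients often have cleft lip and palate or other malformations, which suggests that genetic factors influence its occurrence. Pedigree analysis and the construction of related animal models have identified some causative genes, but most causative genes remain undiscovered. GWAS, association analysis, meta-analysis and similar approaches have pinpointed a series of candidate causative genes and susceptibility loci, which urgently need to be verified, for example by experiment.
A risk score is one of the important methods for evaluating risk prediction ability in epidemiological research. Scoring risk on the basis of genetic susceptibility factors, and thereby evaluating the effect of those factors in a risk prediction model, is called a genetic risk score (GRS). A GRS integrates the combined information of multiple SNPs to evaluate the link between gene sequence variation and disease; it adds up the weak effect of each SNP, which greatly improves the prediction of disease risk. Building a GRS model from genetic risk loci is an effective means of assessing the genetic risk of non-syndromic cleft lip and palate.
However, no study has yet reported the application of a GRS model to genetic risk prediction for non-syndromic cleft lip and palate. If risk loci closely related to the occurrence of non-syndromic cleft lip and palate can be screened and a GRS risk prediction model constructed, this can provide a more comprehensive, accurate and individualized scientific basis for the risk assessment, prevention and control of cleft lip and palate in China.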
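Summary of the Invention
The object of the invention is to provide a genetic risk prediction model for non-syndromic cleft lip and palate, so as to provide a more comprehensive, accurate and individualized scientific basis for the risk assessment, prevention and control of cleft lip and palate in China.
The object of the invention is achieved by the following technical solution: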
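A genetic risk prediction model for non-syndromic cleft lip and palate, with the formula:
$$\mathrm{wGRS}=\sum_{i=1}^{k}\beta_i G_i$$
where k is the number of SNP loci; $G_i$ is the number of genetic risk alleles at the i-th SNP locus, i.e. 0, 1 or 2; $\beta_i$ is the weight of the i-th SNP locus; and the SNPs are rs139860270, rs1883873, rs139530062, rs144415105, rs55816698, rs139860270, rs6989548 and rs12952376.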
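As a minimal sketch of how such a score could be computed, assuming it is the weighted sum of risk-allele counts described by the definitions of k, $G_i$ and $\beta_i$: the function name `weighted_grs` and the example weights are hypothetical and are not the values of Table 1.

```python
from typing import Mapping


def weighted_grs(risk_allele_counts: Mapping[str, int],
                 weights: Mapping[str, float]) -> float:
    """Return sum_i beta_i * G_i over the SNPs listed in `weights`."""
    score = 0.0
    for snp, beta in weights.items():
        g = risk_allele_counts[snp]
        if g not in (0, 1, 2):
            raise ValueError(f"{snp}: risk-allele count must be 0, 1 or 2, got {g}")
        score += beta * g
    return score


# Hypothetical weights, for illustration only (not taken from the patent).
example_weights = {"rs1883873": 0.5, "rs6989548": 1.25, "rs12952376": 2.0}
example_genotype = {"rs1883873": 2, "rs6989548": 1, "rs12952376": 0}
print(weighted_grs(example_genotype, example_weights))  # 0.5*2 + 1.25*1 + 2.0*0
```

Keying both mappings by rs number keeps the weight and the genotype of each locus aligned even if the input order differs.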
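A method for constructing the above genetic risk prediction model for non-syndromic cleft lip and palate, comprising the following steps:
Step 1. Collection of cleft lip and palate samples
In accordance with the principle of informed consent, peripheral blood samples are collected from patients with non-syndromic cleft lip and palate and from healthy control individuals, and their basic information and clinical data are compiled;
Step 2. Whole-exome and whole-genome sequencing of the samples
Whole-exome sequencing is performed on the patients with non-syndromic cleft lip and palate, and whole-genome sequencing is performed on the peripheral blood samples of the healthy control individuals;
Step 3. Processing and analysis of the whole-exome data of the patients with non-syndromic cleft lip and palate
(1) Screening of variant sites: variant sites with a high frequency in population databases are removed, and variant sites with a frequency greater than 10% in the case samples are removed;
(2) Obtaining genes related to cleft lip and palate: genes related to cleft lip and palate are obtained from the NCBI PubMed database and the STRING database;
(3) SKAT association analysis: genes whose low-frequency variants are associated with the cleft lip and palate phenotype are screened;
Step 4. Construction of the cleft lip and palate genetic risk prediction model
(1) Association analysis: the SNP loci of the candidate genes in the intersection obtained in Step 3 are analyzed for association with susceptibility to cleft lip and palate;
(2) Construction of the genetic risk prediction model: the SNPs retained by the association analysis are used to construct the weighted genetic risk prediction model.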
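Compared with the prior art, the invention has the following advantages:
1. The genetic risk score model adds up the weak effects of the 8 SNPs, which greatly improves prediction of the genetic risk of non-syndromic cleft lip and palate.
2. A genetic risk score model for evaluating the genetic risk of non-syndromic cleft lip and palate is proposed for the first time. The model is highly accurate and can provide a more comprehensive, accurate and individualized scientific basis for the risk assessment, prevention and control of cleft lip and palate in China.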
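Brief Description of the Drawings
Fig. 1 shows the results of the SKAT genome-wide association analysis; A: Manhattan plot; B: Q-Q plot;
Fig. 2 shows the 18 intersection genes;
Fig. 3 is a box plot of the genetic risk model (wGRS) scores;
Fig. 4 shows the receiver operating characteristic (ROC) curve and the area under the curve (AUC) of the genetic risk model (wGRS).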
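Detailed Description of the Embodiments
The technical solution of the invention is further described below with reference to the drawings, but it is not limited thereto. Any modification or equivalent substitution of the technical solution that does not depart from its spirit and scope shall fall within the scope of protection of the invention.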
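The invention provides a genetic risk prediction model for non-syndromic cleft lip and palate, with the formula:
$$\mathrm{wGRS}=\sum_{i=1}^{k}\beta_i G_i$$
where k is the number of SNP loci; $G_i$ is the number of genetic risk alleles at the i-th SNP locus, i.e. 0, 1 or 2; $\beta_i$ is the weight of the i-th SNP locus; and the SNPs are rs139860270, rs1883873, rs139530062, rs144415105, rs55816698, rs139860270, rs6989548 and rs12952376.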
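In the method for constructing the above model, new possible causative genes for cleft lip and palate are sought from the gene sequencing results of patients with non-syndromic cleft lip and palate. Background variant sites of the northern population are removed; the STRING database is used to determine a protein interaction network from the known "cleft lip and palate related genes", which is combined with SKAT association analysis to determine candidate genes; and the SNPs corresponding to the candidate genes are used to construct the genetic risk prediction model. The method specifically comprises the following steps:
1. Collection of cleft lip and palate samples
In cooperation with clinical hospitals and in accordance with the principle of informed consent, peripheral blood samples are collected from patients with non-syndromic cleft lip and palate and from healthy control individuals, and their basic information and clinical data are compiled.
2. Whole-exome and whole-genome sequencing of the samples
Samples from patients with non-syndromic cleft lip and palate are sent to Beijing Novogene Co., Ltd. for whole-exome sequencing; peripheral blood samples from healthy control individuals are sent to Beijing Novogene Co., Ltd. for whole-genome sequencing. Building the DNA library for each sample requires 1.0 μg of high-quality genomic DNA, which is randomly fragmented into DNA fragments of 180~280 bp. After the size distribution and concentration of these fragments have been determined, the DNA library is sequenced on an Illumina HiSeq 4000.
3. Processing and analysis of the whole-exome data of the patients with non-syndromic cleft lip and palate
(1) Screening of variant sites: variant sites with a high frequency in population databases are removed; variant sites with a frequency greater than 10% in the case samples are removed.
(2) Obtaining genes related to cleft lip and palate: genes related to cleft lip and palate are obtained from the NCBI PubMed database and the STRING database.
(3) SKAT association analysis: genes whose low-frequency variants are associated with the cleft lip and palate phenotype are screened.
4. Construction of the cleft lip and palate genetic risk prediction model
(1) Association analysis: the SNP loci of the candidate genes in the intersection obtained in step 3 are analyzed for association with susceptibility to cleft lip and palate.
(2) Construction of the genetic risk prediction model: the SNPs retained by the association analysis are used to construct the weighted genetic risk prediction model (wGRS), and the predictive ability of the model is judged with a box plot and the area under the curve (AUC) of the receiver operating characteristic (ROC) curve.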
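Through the above analysis, the invention identified 18 candidate causative genes for cleft lip and palate: RYK, FGFR1, OSR2, SNAI1, BMP1, PRKRA, TBX18, EIF2AK2, TTC30A, RHOA, SQLE, SOX6, KRT4, SMAD6, TOP2A, KCTD2, WNK4 and RGM4.
A wGRS built from 8 SNPs (rs139860270, rs1883873, rs139530062, rs144415105, rs55816698, rs139860270, rs6989548, rs12952376) is further proposed. The scores of the model differ between the case and control groups, and across the score groups the risk of cleft lip and palate rises as the score increases.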
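Example:
In this example, the genetic risk prediction model for non-syndromic cleft lip and palate is constructed by the following steps:
1. Collection of cleft lip and palate samples
With the approval of the Ethics Committee of Harbin Medical University, 71 patients with non-syndromic cleft lip and palate, 67 healthy control individuals, and 33 individuals without development-related diseases were enrolled. None of the healthy control individuals had a development-related disease. All enrolled individuals came from northern China. Peripheral blood samples were collected from all subjects, and all subjects signed informed consent forms.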
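2. Whole-exome and whole-genome sequencing of the samples
Peripheral blood samples from the 71 patients with non-syndromic cleft lip and palate and the 33 individuals without development-related diseases were sent to Beijing Novogene Co., Ltd. for whole-exome sequencing. Of the 67 healthy control individuals, the peripheral blood samples of 50 were sent to Beijing Novogene Co., Ltd. for whole-genome sequencing and those of 17 for whole-exome sequencing. Building the DNA library for each sample requires 1.0 μg of high-quality genomic DNA, which is randomly fragmented into DNA fragments of 180~280 bp. After the size distribution and concentration of these fragments had been determined, the DNA library was sequenced on an Illumina HiSeq 4000.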
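3. Processing and analysis of the whole-exome data of the patients with non-syndromic cleft lip and palate
Whole-exome sequencing of peripheral blood samples from 51 patients with non-syndromic cleft lip and palate yielded SNP/SNV data for each sample.
① Variant sites with a frequency greater than 0.05 in population databases were removed, leaving 3426 genes; variant sites with a frequency greater than 10% in the case samples were removed, leaving 3320 genes; variant sites with a frequency greater than 10% in the control group were removed, leaving 3302 genes.
② The NCBI PubMed database was searched with the keyword "craniofacial cleft" for cleft lip and palate literature and animal models, yielding 105 cleft lip and palate related genes. The protein network of the STRING database was then used to find proteins functionally associated with the proteins expressed by these 105 genes; the genes encoding them were named first-level genes. Each cleft lip and palate related gene corresponded to 5~11 first-level genes, giving 875 first-level genes in total.
③ PLINK and R were used to remove sites with MAF>0.05 from the sequencing data, leaving 74944 low-frequency variant sites, on which a gene-based SKAT genome-wide association analysis was performed, as shown in Fig. 1. A total of 20258 genes entered the low-frequency variant analysis, of which 607 had a P value below 0.05.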
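The gene-based test in ③ can be illustrated with a simplified sketch of the SKAT score statistic for one gene. The assumptions are loud ones: there are no covariates, the variant weights are the Beta(1, 25) density of the allele frequency, and the P value is obtained by permuting case/control labels rather than by the asymptotic mixture-of-chi-squares approximation that a SKAT implementation would normally use. The function names are hypothetical.

```python
import random
from typing import Sequence


def skat_q(genotypes: Sequence[Sequence[int]], phenotype: Sequence[int]) -> float:
    """SKAT-style statistic Q = sum_j (w_j * sum_i G_ij * (y_i - mean(y)))**2.

    genotypes[i][j] is the allele count (0, 1, 2) of individual i at variant j;
    phenotype[i] is 1 for a case and 0 for a control.
    """
    n = len(phenotype)
    mean_y = sum(phenotype) / n
    residuals = [y - mean_y for y in phenotype]
    q = 0.0
    for j in range(len(genotypes[0])):
        column = [row[j] for row in genotypes]
        freq = sum(column) / (2 * n)
        weight = 25.0 * (1.0 - freq) ** 24  # Beta(1, 25) density at freq
        score = sum(g * r for g, r in zip(column, residuals))
        q += (weight * score) ** 2
    return q


def permutation_p_value(genotypes: Sequence[Sequence[int]],
                        phenotype: Sequence[int],
                        n_perm: int = 999, seed: int = 0) -> float:
    """Empirical P value: share of label permutations with Q >= observed Q."""
    rng = random.Random(seed)
    observed = skat_q(genotypes, phenotype)
    shuffled = list(phenotype)
    hits = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)
        if skat_q(genotypes, shuffled) >= observed:
            hits += 1
    return (hits + 1) / (n_perm + 1)
```

The Beta(1, 25) weighting gives rarer variants more influence, which is why such a test suits the low-frequency sites retained in ③.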
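The 3302 genes that passed the genetic background screening in step ①, the 105 cleft lip and palate related genes plus the 875 first-level genes from STRING in step ②, and the 607 genes with P below 0.05 in the SKAT genome-wide association analysis of step ③ were intersected for joint screening. This yielded two previously reported cleft lip and palate related genes, RYK and FGFR1, and a further 16 first-level genes, which are the candidate causative genes of this study, as shown in Fig. 2.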
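The joint screening amounts to a set intersection. The sketch below assumes each step delivers a plain collection of gene symbols; the function name and the placeholder symbols in the usage lines are hypothetical.

```python
from typing import Dict, Iterable, List


def intersect_candidates(background_filtered: Iterable[str],
                         known_genes: Iterable[str],
                         first_level_genes: Iterable[str],
                         skat_significant: Iterable[str]) -> Dict[str, List[str]]:
    """Intersect the three screening results and split the hits by origin."""
    known = set(known_genes)
    network_genes = known | set(first_level_genes)  # step 2: known + first-level
    shared = set(background_filtered) & network_genes & set(skat_significant)
    return {
        "known": sorted(shared & known),
        "first_level": sorted(shared - known),
    }


# Placeholder gene symbols, for illustration only.
result = intersect_candidates(
    background_filtered=["G1", "G2", "G3", "G5"],
    known_genes=["G1", "G4"],
    first_level_genes=["G2", "G3"],
    skat_significant=["G1", "G2", "G6"],
)
print(result)
```

Splitting the result by origin mirrors the way the text reports the 2 known genes separately from the 16 first-level genes.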
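4. Construction of the cleft lip and palate genetic risk prediction model
To confirm the association between susceptibility to cleft lip and palate and the SNP loci of the 18 candidate causative genes detected in the 71 patients with non-syndromic cleft lip and palate, association analysis was performed on the 695 SNPs corresponding to these 18 genes in the 71 patients and in the control group of 67 healthy individuals. SNPs with a minor allele frequency (MAF) below 0.01 were removed, SNPs that did not satisfy Hardy-Weinberg equilibrium (P<0.001) were removed, and SNPs in linkage disequilibrium were removed; SNPs with P<0.05 were retained. In total, 8 SNPs were included in the genetic risk prediction model, as shown in Table 1.
Table 1. SNPs included in the genetic risk prediction model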
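The MAF and Hardy-Weinberg filters described above can be sketched as follows. This is an illustration under assumptions: Hardy-Weinberg equilibrium is tested with a 1-degree-of-freedom chi-square goodness-of-fit test (tools such as PLINK typically use an exact test instead), the text does not say in which group the test was run, and all function names are hypothetical.

```python
import math
from typing import Tuple


def minor_allele_frequency(n_hom_ref: int, n_het: int, n_hom_alt: int) -> float:
    """Frequency of the rarer allele, from the three genotype counts."""
    n_alleles = 2 * (n_hom_ref + n_het + n_hom_alt)
    alt_freq = (2 * n_hom_alt + n_het) / n_alleles
    return min(alt_freq, 1.0 - alt_freq)


def hwe_chi_square_p(n_hom_ref: int, n_het: int, n_hom_alt: int) -> float:
    """P value of a 1-df chi-square test of Hardy-Weinberg proportions."""
    n = n_hom_ref + n_het + n_hom_alt
    p = (2 * n_hom_ref + n_het) / (2 * n)
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    if min(expected) == 0:
        return 1.0  # monomorphic site: nothing to test
    observed = (n_hom_ref, n_het, n_hom_alt)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of chi-square with 1 df: erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(chi2 / 2.0))


def passes_qc(counts: Tuple[int, int, int],
              min_maf: float = 0.01, hwe_alpha: float = 0.001) -> bool:
    """Keep a SNP only if MAF >= min_maf and the HWE P value >= hwe_alpha."""
    return (minor_allele_frequency(*counts) >= min_maf
            and hwe_chi_square_p(*counts) >= hwe_alpha)
```

The default thresholds are the ones stated in the text (MAF 0.01, HWE P 0.001); the linkage disequilibrium pruning and the association test itself are not covered by this sketch.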
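The 8 SNPs obtained above were used to build the wGRS. The wGRS is the sum of the weighted genotypes of the 8 SNP loci (weights as shown in Table 1). The wild type has a weight of 0, and the weights of the heterozygous and homozygous variant genotypes are derived from logistic regression analysis. If the genotype at a locus is wild type, the score for that locus is 0×0=0; if it is the heterozygous variant, the score is 1 × the heterozygous weight; if it is the homozygous variant, the score is 2 × the homozygous weight.
The wGRS was evaluated with a box plot, as shown in Fig. 3. Scores in the control group were concentrated between 50 and 60, with a median of 53.20; scores in the case group were concentrated around 60, with a median of 60.04; the case scores were significantly higher than the control scores (P<0.001). To test the performance of the genetic risk prediction model, the wGRS was grouped: with the quartiles of the wGRS score as cut-offs, cases and controls were divided into 4 groups: 0 (<Q25), 1 (Q25~Q50), 2 (Q50~Q75) and 3 (>Q75). The lowest-scoring group was taken as the reference, with its OR set to 1. Compared with the reference group, the other three groups all showed a marked increase in the risk of cleft lip and palate: their risk was 2.78, 8 and 24 times that of group 0 (<Q25), respectively. The P value for trend was 0.000006, indicating that the trend is statistically significant. The risk of cleft lip and palate rises as the wGRS score increases, as shown in Table 2.
The ROC curve and the area under the curve (AUC) were used to evaluate the predictive ability of the wGRS. As shown in Fig. 4, the AUC of the wGRS was 0.795, indicating that the wGRS has a certain degree of accuracy.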
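The AUC used for this evaluation can be computed directly from case and control scores. The sketch below uses the rank-based (Mann-Whitney) definition, i.e. the probability that a randomly chosen case scores higher than a randomly chosen control, with ties counted as one half. The function name is hypothetical, and the text does not state how its AUC of 0.795 was computed.

```python
from typing import Sequence


def roc_auc(case_scores: Sequence[float], control_scores: Sequence[float]) -> float:
    """Area under the ROC curve as the share of case/control pairs ranked correctly."""
    wins = 0.0
    for case in case_scores:
        for control in control_scores:
            if case > control:
                wins += 1.0
            elif case == control:
                wins += 0.5  # ties contribute half a win
    return wins / (len(case_scores) * len(control_scores))
```

An AUC of 0.5 corresponds to a score that does not separate cases from controls, and 1.0 to perfect separation.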
Table 2. Grouping of the genetic risk prediction model
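The quartile grouping and the odds ratios relative to the lowest group can be sketched as below. Two assumptions are made: the quartile cut-offs are taken from the pooled case and control scores (the text does not say which sample defines them), and crude odds ratios are computed from counts rather than from a regression model. The function names are hypothetical.

```python
import statistics
from bisect import bisect_right
from typing import List, Sequence


def quartile_group(score: float, cuts: Sequence[float]) -> int:
    """Map a score to group 0..3 given the cut points Q25, Q50, Q75."""
    return bisect_right(cuts, score)


def odds_ratios_by_quartile(case_scores: Sequence[float],
                            control_scores: Sequence[float]) -> List[float]:
    """Odds ratio of each quartile group relative to group 0 (OR = 1)."""
    cuts = statistics.quantiles(list(case_scores) + list(control_scores), n=4)
    cases = [0] * 4
    controls = [0] * 4
    for s in case_scores:
        cases[quartile_group(s, cuts)] += 1
    for s in control_scores:
        controls[quartile_group(s, cuts)] += 1
    if 0 in controls or cases[0] == 0:
        raise ValueError("empty cell: odds ratio is undefined without a correction")
    reference_odds = cases[0] / controls[0]
    return [(cases[g] / controls[g]) / reference_odds for g in range(4)]
```

By construction the first element is always 1, matching the convention above of setting the OR of the lowest-scoring group to 1.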
Claims (1)
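1. A method for constructing a genetic risk prediction model for non-syndromic cleft lip and palate, characterized by comprising the following steps:
Step 1. Collection of cleft lip and palate samples
In accordance with the principle of informed consent, peripheral blood samples are collected from patients with non-syndromic cleft lip and palate and from healthy control individuals, and their basic information and clinical data are compiled;
Step 2. Whole-exome and whole-genome sequencing of the samples
Whole-exome sequencing is performed on the patients with non-syndromic cleft lip and palate, and whole-genome sequencing is performed on the peripheral blood samples of the healthy control individuals;
Step 3. Processing and analysis of the whole-exome data of the patients with non-syndromic cleft lip and palate
(1) Screening of variant sites: variant sites with a high frequency in population databases are removed, and variant sites with a frequency greater than 10% in the case samples are removed;
(2) Obtaining genes related to cleft lip and palate: genes related to cleft lip and palate are obtained from the NCBI PubMed database and the STRING database;
(3) SKAT association analysis: genes whose low-frequency variants are associated with the cleft lip and palate phenotype are screened;
Step 4. Construction of the cleft lip and palate genetic risk prediction model
(1) Association analysis: the SNP loci of the candidate genes in the intersection obtained in Step 3 are analyzed for association with susceptibility to cleft lip and palate;
(2) Construction of the genetic risk prediction model: the SNPs retained by the association analysis are used to construct the weighted genetic risk prediction model;
the formula of the prediction model being:
$$\mathrm{wGRS}=\sum_{i=1}^{k}\beta_i G_i$$
where k is the number of SNP loci; $G_i$ is the number of genetic risk alleles at the i-th SNP locus, i.e. 0, 1 or 2; and $\beta_i$ is the weight of the i-th SNP locus;
the SNPs being rs139860270, rs1883873, rs139530062, rs144415105, rs55816698, rs139860270, rs6989548 and rs12952376.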
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202011411075.0A CN112397200B (zh) | 2020-12-04 | 2020-12-04 | A genetic risk prediction model for non-syndromic cleft lip and palate |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112397200A | 2021-02-23 |
CN112397200B | 2023-12-19 |
Family
ID=74605799
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202011411075.0A (Active, CN112397200B) | A genetic risk prediction model for non-syndromic cleft lip and palate | 2020-12-04 | 2020-12-04 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112397200B (zh) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
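CN113506631A (zh) * | 2021-08-06 | 2021-10-15 | Institute of Basic Medical Sciences, Chinese Academy of Medical Sciences | A risk prediction method for improving the diagnostic accuracy of acute exacerbation of chronic obstructive pulmonary disease |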
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
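CN104293968A (zh) * | 2014-10-23 | 2015-01-21 | CapitalBio Corporation | A set of polymorphic loci for detecting genetic susceptibility to non-syndromic cleft lip and palate, and a detection kit |
CN105279369A (zh) * | 2015-09-06 | 2016-01-27 | Suzhou Xieyun Hechuang Biotechnology Co., Ltd. | A method for assessing the genetic risk of coronary heart disease based on next-generation sequencing |
CN106987635A (zh) * | 2017-04-18 | 2017-07-28 | Affiliated Stomatological Hospital of Nanjing Medical University | Susceptibility SNP locus of the NTN1 gene and its application |
CN110699446A (zh) * | 2019-11-07 | 2020-01-17 | Affiliated Stomatological Hospital of Nanjing Medical University | SNP marker rs3174298 associated with the diagnosis of non-syndromic cleft lip and palate, and its application |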
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20050089885A1 (en) * | 2003-05-06 | 2005-04-28 | University Of Iowa Research Foundation | IRF6 polymorphisms associated with cleft lip and/or palate |
Non-Patent Citations (3)
Title |
---|
Association between PTCH1 and RAD54B single-nucleotide polymorphisms and non-syndromic orofacial clefts in a northern Chinese population; Xiaotong Liu et al.; The Journal of Gene Medicine; vol. 20, no. 12; full text * |
Genetic risk score for nonsyndromic cleft lip with or without cleft palate for a Chilean population; R. Blanco et al.; Genetic Counseling; vol. 25, no. 2; 143-149 * |
Machine Learning Models for Genetic Risk Assessment of Infants with Non-syndromic Orofacial Cleft; Shi-Jian Zhang et al.; Genomics Proteomics Bioinformatics; vol. 16; 354-364 * |
Also Published As
Publication number | Publication date |
---|---|
CN112397200A (zh) | 2021-02-23 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||