CN105354444B - Method for screening susceptible SNP combinations for complex diseases based on susceptible SNPs - Google Patents

Method for screening susceptible SNP combinations for complex diseases based on susceptible SNPs

Info

Publication number
CN105354444B
Authority
CN
China
Prior art keywords
susceptible
susceptible snp
snp
disease
combinations
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Expired - Fee Related
Application number
CN201510828517.4A
Other languages
Chinese (zh)
Other versions
CN105354444A (en)
Inventor
杜红丽
关宇佳
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN201510828517.4A
Publication of CN105354444A
Application granted
Publication of CN105354444B
Legal status: Expired - Fee Related
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids

Abstract

The present invention discloses a method for screening susceptible SNP combinations for complex diseases based on susceptible SNPs. The method can rapidly find the susceptible SNP combinations of a specific disease in large volumes of data. By using susceptible SNP combinations, the invention aims to provide a new method of disease risk evaluation. A susceptible SNP combination may assess disease risk better than a single susceptible SNP site, with an effect similar to that of a low-frequency mutation causing a genetic disease. Judging from the results of the specific example, the risk predicted by a susceptible SNP combination for a complex disease may reach 1, an effect that a single susceptible SNP cannot achieve. The method can be applied to screening susceptible SNP combinations for any complex disease and to predicting genetic disease risk.

Description

Method for screening susceptible SNP combinations for complex diseases based on susceptible SNPs
Technical Field
The present invention relates to a method for screening disease-susceptible SNP combinations using SNP genotyping data in public databases, and in particular to a method for screening susceptible SNP combinations for complex diseases based on susceptible SNPs.
Background
Complex diseases such as type 2 diabetes, obesity, and cardiovascular and cerebrovascular diseases have a complex pathogenesis and are usually not caused by one or a few genes. Over the past decade, researchers worldwide, including in China, have used genome-wide association studies (GWAS) to statistically analyze large amounts of genome-wide SNP genotyping data from individuals and have identified many susceptible SNP sites. For example, the DIAGRAM consortium integrated the data of all previous type 2 diabetes GWAS and identified 69 susceptible SNPs (Nature Genetics, 2014). Because complex diseases result from the joint effect of multiple genes, the influence of a single SNP should not be assessed in isolation when evaluating SNP risk.
Summary of the Invention
To overcome the shortcomings and deficiencies of the prior art, the object of the present invention is to provide a method for screening susceptible SNP combinations for complex diseases based on susceptible SNPs. The invention uses a new method to assess the combined risk of multiple SNPs. The method can be applied to screening susceptible SNP combinations for any complex disease.
Another object of the present invention is to provide applications of the above method.
The objects of the present invention are achieved through the following technical solution:
A method for screening susceptible SNP combinations for complex diseases based on susceptible SNPs, comprising the following steps:
Susceptible SNPs related to a specific disease are first identified from the known literature, and whole-genome SNP genotyping data for a large number of individuals are downloaded. The genotype data at each susceptible SNP site are extracted for every individual, and the susceptible SNP sites are arranged in a fixed order. If a susceptible SNP site carries at least one susceptible allele (that is, the individual has one of the 1 or 2 susceptible genotypes at that site), the site is marked with a specific English letter. Each individual thus obtains a letter string whose length is equal to or less than the number of susceptible SNP sites. The number of occurrences of each character string is then counted separately in the disease group and the control group, the strings that occur in the disease group but not in the control group are identified, and those with markedly higher counts are selected. Finally, the letters are converted back into genotypes in order, giving the susceptible SNP combinations.
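The final step of this procedure, converting a selected letter string back into a combination of SNP sites and genotypes, can be illustrated with a minimal Python sketch. The name `LETTER_TABLE` and the function `decode_string` are hypothetical; the six entries are copied from Table 3 of Embodiment 1, and a full table would hold one entry per susceptible SNP site.

```python
# Hypothetical excerpt of a letter table: letter -> (rs ID, susceptible genotypes).
# The six entries are copied from Table 3 of Embodiment 1.
LETTER_TABLE = {
    "a": ("rs10203174", ("CC", "CT")),
    "b": ("rs10811661", ("CT", "TT")),
    "c": ("rs10842994", ("CC", "CT")),
    "g": ("rs1359790", ("AG", "GG")),
    "h": ("rs1801282", ("CC", "CG")),
    "p": ("rs7593730", ("CC", "CT")),
}


def decode_string(letters, table=LETTER_TABLE):
    """Convert a letter string back into a list of (rs ID, genotypes) pairs."""
    return [table[letter] for letter in letters]


combination = decode_string("abcghp")
for rs_id, genotypes in combination:
    print(rs_id, "/".join(genotypes))
```

Because every letter maps to exactly one site, the decoded list has the same length and order as the string.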
The control group refers to a healthy population.
The disease group refers to a population with the disease.
The method of the present invention can rapidly find the susceptible SNP combinations of a specific disease in large volumes of data.
Application of the method for screening susceptible SNP combinations for complex diseases based on susceptible SNPs in screening susceptible SNP combinations for complex diseases.
Application of the method for screening susceptible SNP combinations for complex diseases based on susceptible SNPs in complex disease risk prediction.
Compared with the prior art, the present invention has the following advantages and effects:
To overcome the deficiency of evaluating the risk of a complex disease from a single susceptible SNP site, the present invention uses susceptible SNP combinations to provide a new method of disease risk evaluation. A susceptible SNP combination may assess disease risk better than a single susceptible SNP site, with an effect similar to that of a low-frequency mutation causing a genetic disease. Judging from the results of the specific example, the risk predicted by a susceptible SNP combination for a complex disease may reach 1, an effect that a single susceptible SNP cannot achieve. The method can be applied to screening susceptible SNP combinations for any complex disease and to predicting genetic disease risk.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of the raw individual genotype data.
Fig. 2 is a schematic diagram of the character strings of 14 randomly selected individuals.
Detailed Description
The present invention is described in further detail below with reference to an embodiment and the drawings, but the embodiments of the present invention are not limited thereto.
Embodiment 1
1. Selection of SNPs
The 69 type 2 diabetes susceptible SNPs reported in "Genome-wide trans-ancestry meta-analysis provides insight into the genetic architecture of type 2 diabetes susceptibility" (Nature Genetics, 2014) were selected; see Table 1.
Table 1. The 69 susceptible SNPs for type 2 diabetes
2. Data source
All whole-genome SNP genotyping data were downloaded from http://www.ebi.ac.uk/ega/.
Control group: WTCCC1 project samples from the 1958 British Birth Cohort (1991 samples).
Type 2 diabetes group: WTCCC1 project Type 2 Diabetes (T2D) samples (1504 samples). A schematic diagram of the raw individual genotype data is shown in Fig. 1.
3. Screening of SNP sites
According to the rs numbers of the susceptible SNP sites in Table 1, the genotypes at all susceptible SNP sites were extracted for each individual in the downloaded data. Genotypes were obtained for 18 type 2 diabetes susceptible SNP sites in total; these 18 susceptible SNPs are listed in Table 2.
Table 2. The 18 type 2 diabetes susceptible SNP sites with extracted genotypes
Locus     Lead SNP     Locus     Lead SNP
NOTCH2    rs10923931   WFS1      rs4458523
RBMS1     rs7593730    KLF14     rs13233731
THADA     rs10203174   CDKN2A/   rs10811661
IRS1      rs2943640    VPS26A    rs1802295
GCKR      rs780094     KCNJ11    rs5215
IGF2BP2   rs4402960    KLHDC5    rs10842994
PPARG     rs1801282    CCND2     rs11063069
ADAMTS9   rs6795735    SPRY2     rs1359790
PSMD6     rs831571     RASGRP1   rs7403531
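The extraction step can be sketched in Python as follows. This is only an illustration under stated assumptions: the layout of the downloaded files is not described in the text, so each individual is assumed here to be a dictionary mapping rs numbers to genotypes. The function name `extract_susceptible`, the placeholder ID `rs_other_site` and the toy genotypes are all hypothetical.

```python
# Three of the 18 lead SNPs from Table 2, used here as a short example list.
SUSCEPTIBLE_RS_IDS = ["rs10923931", "rs7593730", "rs10203174"]


def extract_susceptible(individual, rs_ids):
    """Keep only the genotypes at the listed susceptible SNP sites.

    `individual` is assumed to be a dict of rs ID -> genotype string.
    Sites that were not genotyped for this individual are skipped.
    """
    return {rs_id: individual[rs_id] for rs_id in rs_ids if rs_id in individual}


# Toy input, not real data: one site is outside the list, one listed site is missing.
toy_individual = {
    "rs10923931": "GT",
    "rs7593730": "CC",
    "rs_other_site": "AA",
}

extracted = extract_susceptible(toy_individual, SUSCEPTIBLE_RS_IDS)
print(extracted)
```

Skipping sites that are absent from an individual's data is one way to read why only 18 of the 69 listed SNPs yielded genotypes in this data set.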
4. Genotype marking of susceptible SNP sites
For the genotypes at these 18 susceptible SNP sites in each individual, a site carrying 1 or 2 susceptible alleles is marked with the letter given in Table 3; a site carrying no susceptible allele is left unmarked.
Table 3. Genotype marking of the susceptible SNPs
SNP          Genotype   Character after conversion
rs10203174   CC, CT     a
rs10811661   CT, TT     b
rs10842994   CC, CT     c
rs10923931   GT, TT     d
rs11063069   AG, GG     e
rs13233731   AG, GG     f
rs1359790    AG, GG     g
rs1801282    CC, CG     h
rs1802295    CT, TT     i
rs2943640    AC, CC     j
rs4402960    GT, TT     k
rs4458523    GG, GT     l
rs5215       CC, CT     m
rs6795735    CC, CT     n
rs7403531    CT, TT     o
rs7593730    CC, CT     p
rs780094     CC, CT     q
rs831571     CC, CT     r
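The marking rule of Table 3 can be written as a short Python sketch. The names `TABLE_3`, `normalize` and `encode_individual` are hypothetical, the dictionary input format is an assumption, and sorting the two alleles so that "TC" is treated like "CT" is an added convenience that the text does not describe. The toy genotypes are made up for illustration.

```python
# (rs ID, susceptible genotypes, letter), in the row order of Table 3.
TABLE_3 = [
    ("rs10203174", {"CC", "CT"}, "a"),
    ("rs10811661", {"CT", "TT"}, "b"),
    ("rs10842994", {"CC", "CT"}, "c"),
    ("rs10923931", {"GT", "TT"}, "d"),
    ("rs11063069", {"AG", "GG"}, "e"),
    ("rs13233731", {"AG", "GG"}, "f"),
    ("rs1359790", {"AG", "GG"}, "g"),
    ("rs1801282", {"CC", "CG"}, "h"),
    ("rs1802295", {"CT", "TT"}, "i"),
    ("rs2943640", {"AC", "CC"}, "j"),
    ("rs4402960", {"GT", "TT"}, "k"),
    ("rs4458523", {"GG", "GT"}, "l"),
    ("rs5215", {"CC", "CT"}, "m"),
    ("rs6795735", {"CC", "CT"}, "n"),
    ("rs7403531", {"CT", "TT"}, "o"),
    ("rs7593730", {"CC", "CT"}, "p"),
    ("rs780094", {"CC", "CT"}, "q"),
    ("rs831571", {"CC", "CT"}, "r"),
]


def normalize(genotype):
    """Sort the two alleles so that 'TC' and 'CT' compare equal (an assumption)."""
    return "".join(sorted(genotype.upper()))


def encode_individual(genotypes):
    """Build one individual's letter string from a dict of rs ID -> genotype.

    A letter is added only when the genotype is one of the susceptible
    genotypes in Table 3; other or missing genotypes add nothing.
    """
    letters = []
    for rs_id, susceptible, letter in TABLE_3:
        genotype = genotypes.get(rs_id)
        if genotype is not None and normalize(genotype) in susceptible:
            letters.append(letter)
    return "".join(letters)


# Toy individual, not real data.
toy_individual = {
    "rs10203174": "TC", "rs10811661": "TT", "rs10842994": "CC",
    "rs1359790": "AG", "rs1801282": "CG", "rs7593730": "CT",
    "rs5215": "TT",  # TT is not a susceptible genotype for rs5215
}
encoded = encode_individual(toy_individual)
print(encoded)  # prints abcghp
```

Iterating over the table in a fixed order keeps the letters of every string in the same order, so two individuals with the same set of marked sites always receive the same string.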
5. Obtaining the character strings
Following the SNP order in Table 3, the genotypes at the 18 sites of each of the 3495 individuals (1991 in the control group, 1504 in the type 2 diabetes group) were converted into a character string, giving 3495 character strings in total. Fourteen individuals were randomly selected from these, as shown in Fig. 2.
6. Statistical screening of susceptible SNP combinations
Statistical analysis of the 3495 individuals (1991 in the control group, 1504 in the type 2 diabetes group) identified character strings, i.e. SNP combinations, that occur in the type 2 diabetes group but not in the control group. For example, abcghp (the six SNP sites rs10203174, rs10811661, rs10842994, rs1359790, rs1801282 and rs7593730 all carry susceptible alleles at the same time) occurs most often in the type 2 diabetes group yet is not found in the control group. If an individual's genetic test result shows this combination, the risk of developing type 2 diabetes may be 1. Character strings with a count of 5 or more are listed in Table 4.
Table 4. Character strings with a count of 5 or more
Type 2 diabetes group   Count   Control group   Count
abcghp                  10      abcghpr         7
abcghjpr                8       abcghjlnpr      6
abchjpr                 7       abchnpr         6
abchlnp                 7       abchpr          6
abcfghpr                6       abghpr          6
abcghjlpq               6       abhnpr          6
abcghjp                 6       abcghjnpr       5
abcghlp                 6       abcghlpqr       5
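The counting and screening step can be sketched in Python with `collections.Counter`. The function name `screen_case_only` and the `min_count` parameter are hypothetical; the threshold of 5 mirrors the cut-off used for Table 4. The input lists are toy data that reuse a few counts from Table 4, and the string "abhp" is made up to show a string being excluded.

```python
from collections import Counter


def screen_case_only(case_strings, control_strings, min_count=5):
    """Return (string, count) pairs seen in the disease group but never in controls.

    Only strings occurring at least `min_count` times in the disease group
    are kept; results are sorted by descending count, then alphabetically.
    """
    case_counts = Counter(case_strings)
    control_seen = set(control_strings)
    hits = [
        (string, count)
        for string, count in case_counts.items()
        if string not in control_seen and count >= min_count
    ]
    return sorted(hits, key=lambda item: (-item[1], item[0]))


# Toy data, not the real cohort.
cases = ["abcghp"] * 10 + ["abcghjpr"] * 8 + ["abhp"] * 6
controls = ["abcghpr"] * 7 + ["abhp"] * 2

candidates = screen_case_only(cases, controls)
print(candidates)  # prints [('abcghp', 10), ('abcghjpr', 8)]
```

Here "abhp" is dropped even though it occurs six times among the cases, because it also appears in the control group.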
The data of the present invention suggest that susceptible SNP combinations (whose frequency is very low) may assess disease risk better than a single susceptible SNP site, with an effect similar to that of a low-frequency mutation causing a genetic disease. The data accumulated so far are limited; if data from more than 100,000 individuals were accumulated, directly predicting a disease risk of 100% from an SNP combination might become possible.
The above embodiment is a preferred embodiment of the present invention, but the embodiments of the present invention are not limited by it. Any other change, modification, substitution, combination or simplification made without departing from the spirit and principle of the present invention is an equivalent replacement and is included within the protection scope of the present invention.

Claims (4)

  1. A method for screening susceptible SNP combinations for complex diseases based on susceptible SNPs, characterized by comprising the following steps:
    identifying susceptible SNPs related to a specific disease from the known literature; downloading whole-genome SNP genotyping data of individuals; extracting the genotype data at each susceptible SNP site for every individual; arranging the susceptible SNP sites in a fixed order; if a susceptible SNP site carries at least one susceptible allele, marking it with a specific English letter, so that each individual obtains a letter string whose length is equal to or less than the number of susceptible SNP sites; then counting the number of occurrences of each character string separately in the disease group and the control group, identifying the character strings that occur in the disease group but not in the control group, and selecting from them the character strings with markedly higher counts; and converting the letters back into genotypes in order, finally obtaining the susceptible SNP combinations.
  2. The method for screening susceptible SNP combinations for complex diseases based on susceptible SNPs according to claim 1, characterized in that: the control group refers to a healthy population.
  3. The method for screening susceptible SNP combinations for complex diseases based on susceptible SNPs according to claim 1, characterized in that: the disease group refers to a population with the disease.
  4. The method for screening susceptible SNP combinations for complex diseases based on susceptible SNPs according to claim 1, characterized in that: carrying at least one susceptible allele means having one of the 1 or 2 susceptible genotypes at the site.
CN201510828517.4A 2015-11-24 2015-11-24 Method for screening susceptible SNP combinations for complex diseases based on susceptible SNPs Expired - Fee Related CN105354444B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201510828517.4A CN105354444B (en) 2015-11-24 2015-11-24 Method for screening susceptible SNP combinations for complex diseases based on susceptible SNPs

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201510828517.4A CN105354444B (en) 2015-11-24 2015-11-24 Method for screening susceptible SNP combinations for complex diseases based on susceptible SNPs

Publications (2)

Publication Number Publication Date
CN105354444A CN105354444A (en) 2016-02-24
CN105354444B true CN105354444B (en) 2018-06-19

Family

ID=55330415

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201510828517.4A Expired - Fee Related CN105354444B (en) 2015-11-24 2015-11-24 Method for screening susceptible SNP combinations for complex diseases based on susceptible SNPs

Country Status (1)

Country Link
CN (1) CN105354444B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107345248A (en) * 2017-06-26 2017-11-14 思畅信息科技(上海)有限公司 Gene and site methods of risk assessment and its system based on big data
CN113403380A (en) * 2021-06-11 2021-09-17 中国科学院北京基因组研究所(国家生物信息中心) Complex disease related SNP site primer composition and application

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101894216A (en) * 2010-07-16 2010-11-24 西安电子科技大学 Method of discovering SNP group related to complex disease from SNP information
CN104573408A (en) * 2013-10-18 2015-04-29 大江基因医学股份有限公司 Single nucleotide polymorphism disease incidence prediction system

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU785425B2 (en) * 2001-03-30 2007-05-17 Genetic Technologies Limited Methods of genomic analysis
WO2011076783A2 (en) * 2009-12-22 2011-06-30 Integragen A method for evaluating a risk for a transmissible neuropsychiatric disorder

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101894216A (en) * 2010-07-16 2010-11-24 西安电子科技大学 Method of discovering SNP group related to complex disease from SNP information
CN104573408A (en) * 2013-10-18 2015-04-29 大江基因医学股份有限公司 Single nucleotide polymorphism disease incidence prediction system

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
基于人2型糖尿病易感SNP位点筛查食蟹猴2型糖尿病易感SNP标记;柳明玉等;《中国比较医学杂志》;20141031;第24卷(第10期);第18-26页 *
基于单核苷酸多态性的基因互作分析方法学进展;栾奕昭等;《遗传》;20131231;1331-1339 *

Also Published As

Publication number Publication date
CN105354444A (en) 2016-02-24

Similar Documents

Publication Publication Date Title
Yuan et al. Predicting disease occurrence with high accuracy based on soil macroecological patterns of Fusarium wilt
Grieneisen et al. Gut microbiome heritability is nearly universal but environmentally contingent
Terzopoulos et al. Genetic diversity analysis of Mediterranean faba bean (Vicia faba L.) with ISSR markers
Miah et al. A review of microsatellite markers and their applications in rice breeding programs to improve blast disease resistance
Gao et al. Genome wide association study of seedling and adult plant leaf rust resistance in elite spring wheat breeding lines
Pour-Aboughadareh et al. Insight into the genetic variability analysis and relationships among some Aegilops and Triticum species, as genome progenitors of bread wheat, using SCoT markers
Baye et al. Genotype–environment interactions and their translational implications
Nayak et al. Promoting utilization of Saccharum spp. genetic resources through genetic diversity analysis and core collection construction
Evans et al. A multipurpose, high-throughput single-nucleotide polymorphism chip for the dengue and yellow fever mosquito, Aedes aegypti
Johnson et al. Genome‐wide population structure analyses of three minor millets: Kodo millet, little millet, and proso millet
Yang et al. High-throughput development of SSR markers from pea (Pisum sativum L.) based on next generation sequencing of a purified Chinese commercial variety
Miller et al. Vitis phylogenomics: hybridization intensities from a SNP array outperform genotype calls
Kettle et al. Determinants of fine-scale spatial genetic structure in three co-occurring rain forest canopy trees in Borneo
Abu Zaitoun et al. Characterizing Palestinian snake melon (Cucumis melo var. flexuosus) germplasm diversity and structure using SNP and DArTseq markers
Belete et al. Genetic diversity and population structure of bread wheat genotypes determined via phenotypic and SSR marker analyses under drought-stress conditions
CN105279369A (en) Next generation sequencing based coronary heart disease genetic risk evaluation method
Almeida et al. Genetic diversity, population structure, and andean introgression in Brazilian common bean cultivars after half a century of genetic breeding
Loera-Sánchez et al. DNA-based assessment of genetic diversity in grassland plant species: Challenges, approaches, and applications
Samarina et al. Transferability of ISSR, SCoT and SSR markers for Chrysanthemum× Morifolium Ramat and genetic relationships among commercial Russian cultivars
He et al. The history and diversity of rice domestication as resolved from 1464 complete plastid genomes
Santos et al. Tackling relationships and species circumscriptions of Octoblepharum, an enigmatic genus of haplolepideous mosses (Dicranidae, Bryophyta)
Escalas et al. A unifying quantitative framework for exploring the multiple facets of microbial biodiversity across diverse scales
CN105354444B (en) Method for screening susceptible SNP combinations for complex diseases based on susceptible SNPs
Sugihara et al. Population genomics of yams: evolution and domestication of Dioscorea species
Robins et al. Contrasting patterns of population divergence on young and old landscapes in Banksia seminuda (Proteaceae), with evidence for recognition of subspecies

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20180619

Termination date: 20211124