CN105354444B - Method for screening susceptibility SNP combinations of complex diseases based on susceptibility SNPs - Google Patents
- Publication number: CN105354444B
- Application number: CN201510828517.4A
- Authority: CN (China)
- Prior art keywords: susceptible, susceptible SNP, SNP, disease, combinations
- Legal status: Expired - Fee Related (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B30/00—ICT specially adapted for sequence analysis involving nucleotides or amino acids
Abstract
The present invention discloses a method for screening susceptibility SNP combinations of complex diseases based on known susceptibility SNPs. The method can rapidly identify susceptibility SNP combinations for a specific disease from large volumes of genotype data, and it aims to provide a new approach to disease risk assessment based on such combinations. A susceptibility SNP combination may assess disease risk better than any single susceptibility SNP site, with an effect resembling that of a low-frequency mutation causing a genetic disease. In the specific example, the risk predicted for a complex disease from a susceptibility SNP combination may reach 1, which no single susceptibility SNP can achieve. The method can be applied to screening susceptibility SNP combinations for any complex disease and to predicting genetic disease risk.
Description
Technical field
The present invention relates to a method for screening disease susceptibility SNP combinations using SNP genotyping data from public databases, and in particular to a method for screening susceptibility SNP combinations of complex diseases based on susceptibility SNPs.
Background art
Complex diseases such as type 2 diabetes, obesity, and cardiovascular and cerebrovascular diseases have complex pathogenesis and are usually not caused by one or a few genes. Over the past decade, researchers worldwide, including in China, have used genome-wide association studies (GWAS) to statistically analyze genome-wide SNP genotyping data from large numbers of individuals and have identified many susceptibility SNP sites. For example, the DIAGRAM consortium integrated the data of all earlier type 2 diabetes GWAS and identified 69 susceptibility SNPs (Nature Genetics, 2014). Because complex diseases result from the joint effect of multiple genes, SNP risk should not be assessed one SNP at a time.
Summary of the invention
To overcome the shortcomings and deficiencies of the prior art, the object of the present invention is to provide a method for screening susceptibility SNP combinations of complex diseases based on susceptibility SNPs. The invention uses a new method to assess the combined risk of multiple SNPs. The method can be applied to screening susceptibility SNP combinations for any complex disease.
Another object of the present invention is to provide applications of the above method.
The objects of the present invention are achieved through the following technical solutions:
A method for screening susceptibility SNP combinations of complex diseases based on susceptibility SNPs comprises the following steps:
First, susceptibility SNPs associated with a specific disease are identified from the published literature, and genome-wide SNP genotyping data for a large number of individuals are downloaded. The genotype of each individual at each susceptibility SNP site is extracted, and the susceptibility SNP sites are arranged in a fixed order. If a site carries at least one susceptibility allele (that is, its genotype contains 1 or 2 copies of the susceptibility allele), it is marked with a specific English letter, so that each individual yields a letter string whose length is less than or equal to the number of susceptibility SNP sites. The number of occurrences of each string is then counted separately in the disease group and the control group, and the strings that occur in the disease group but not in the control group are identified. From these, the strings with markedly higher counts are selected, and their letters are converted back, in order, into genotypes, finally giving the susceptibility SNP combinations.
The control group refers to a healthy population.
The disease group refers to a population with the disease.
The method of the present invention can rapidly find susceptibility SNP combinations for a specific disease from large volumes of data.
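The selection rule in the steps above can be written compactly. This is only a notational sketch: the symbols $n_{\text{case}}(s)$, $n_{\text{ctrl}}(s)$ and the threshold $\tau$ are introduced here for illustration, and the text does not fix a numerical cutoff for "markedly higher" counts (the embodiment tabulates strings that occur 5 or more times).

```latex
% n_case(s): number of individuals in the disease group whose letter string is s
% n_ctrl(s): number of individuals in the control group whose letter string is s
% tau: hypothetical minimum count for a string to be selected
\mathcal{S} \;=\; \bigl\{\, s \;:\; n_{\text{case}}(s) \ge \tau \ \text{ and } \ n_{\text{ctrl}}(s) = 0 \,\bigr\}
```

Each string in $\mathcal{S}$ is then converted back, letter by letter, into the corresponding SNP sites and genotypes.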
Use of the method for screening susceptibility SNP combinations of complex diseases based on susceptibility SNPs in screening susceptibility SNP combinations of complex diseases.
Use of the method for screening susceptibility SNP combinations of complex diseases based on susceptibility SNPs in complex disease risk prediction.
Compared with the prior art, the present invention has the following advantages and effects:
To overcome the limitations of assessing the risk of a complex disease from single-site susceptibility SNPs, the present invention uses susceptibility SNP combinations to provide a new method of disease risk assessment. A susceptibility SNP combination may assess disease risk better than a single susceptibility SNP site, with an effect resembling that of a low-frequency mutation causing a genetic disease. In the specific example, the risk predicted for a complex disease from a susceptibility SNP combination may reach 1, which a single susceptibility SNP cannot achieve. The method can be applied to screening susceptibility SNP combinations for any complex disease and to predicting genetic disease risk.
Brief description of the drawings
Fig. 1 is a schematic diagram of the raw individual genotype data.
Fig. 2 is a schematic diagram of the character strings of 14 randomly selected individuals.
Detailed description of the embodiments
The present invention is described in further detail below with reference to an embodiment and the drawings, but embodiments of the invention are not limited thereto.
Embodiment 1
1. Selection of SNPs
The 69 type 2 diabetes susceptibility SNPs reported in "Genome-wide trans-ancestry meta-analysis provides insight into the genetic architecture of type 2 diabetes susceptibility" (Nature Genetics, 2014) were selected; see Table 1.
Table 1. The 69 type 2 diabetes susceptibility SNPs
2. Data source
All genome-wide SNP genotyping data were downloaded from http://www.ebi.ac.uk/ega/.
Control group: WTCCC1 project samples from the 1958 British Birth Cohort (1991 samples).
Type 2 diabetes group: WTCCC1 project Type 2 Diabetes (T2D) samples (1504 samples). A schematic diagram of the raw individual genotype data is shown in Fig. 1.
3. Screening of SNP sites
Using the rs numbers of the susceptibility SNP sites in Table 1, the genotypes of all susceptibility SNP sites present in the downloaded data were extracted for each individual. Genotypes were obtained for 18 type 2 diabetes susceptibility SNP sites in total; these 18 susceptibility SNPs are listed in Table 2.
Table 2. The 18 type 2 diabetes susceptibility SNP sites with extracted genotypes
Locus | Lead SNP | Locus | Lead SNP |
---|---|---|---|
NOTCH2 | rs10923931 | WFS1 | rs4458523 |
RBMS1 | rs7593730 | KLF14 | rs13233731 |
THADA | rs10203174 | CDKN2A/ | rs10811661 |
IRS1 | rs2943640 | VPS26A | rs1802295 |
GCKR | rs780094 | KCNJ11 | rs5215 |
IGF2BP2 | rs4402960 | KLHDC5 | rs10842994 |
PPARG | rs1801282 | CCND2 | rs11063069 |
ADAMTS9 | rs6795735 | SPRY2 | rs1359790 |
PSMD6 | rs831571 | RASGRP1 | rs7403531 |
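The extraction step can be sketched as a simple filter on rs numbers. This is a minimal illustration, not the patent's implementation: the record layout `(individual_id, rs_id, genotype)`, the function name `extract_genotypes`, and the sample records are assumptions, since the text does not describe the format of the downloaded files.

```python
from collections import defaultdict

# rs numbers of the 18 sites listed in Table 2.
SUSCEPTIBILITY_RS = {
    "rs10923931", "rs4458523", "rs7593730", "rs13233731", "rs10203174",
    "rs10811661", "rs2943640", "rs1802295", "rs780094", "rs5215",
    "rs4402960", "rs10842994", "rs1801282", "rs11063069", "rs6795735",
    "rs1359790", "rs831571", "rs7403531",
}


def extract_genotypes(records, rs_ids=SUSCEPTIBILITY_RS):
    """Return {individual_id: {rs_id: genotype}}, keeping only SNPs in rs_ids.

    `records` is an iterable of (individual_id, rs_id, genotype) tuples;
    this layout is hypothetical and chosen only for the example.
    """
    per_individual = defaultdict(dict)
    for individual_id, rs_id, genotype in records:
        if rs_id in rs_ids:
            per_individual[individual_id][rs_id] = genotype
    return dict(per_individual)


# Made-up records for illustration; "rs0000001" is not a real susceptibility SNP.
records = [
    ("ind1", "rs10923931", "GT"),
    ("ind1", "rs0000001", "AA"),
    ("ind2", "rs7593730", "CC"),
]
subset = extract_genotypes(records)
print(subset)
```

Only the records whose rs number appears in Table 2 survive the filter, which mirrors how 18 sites were retained from the downloaded data.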
4. Genotype marking of susceptibility SNP sites
For each individual's genotypes at these 18 susceptibility SNP sites, a site is marked with a letter whenever its genotype carries 1 or 2 susceptibility alleles, as listed in Table 3; a site with no susceptibility allele is left unmarked.
Table 3. Genotype marking of susceptibility SNPs
SNP | Genotype | Letter after conversion |
---|---|---|
rs10203174 | CC, CT | a |
rs10811661 | CT, TT | b |
rs10842994 | CC, CT | c |
rs10923931 | GT, TT | d |
rs11063069 | AG, GG | e |
rs13233731 | AG, GG | f |
rs1359790 | AG, GG | g |
rs1801282 | CC, CG | h |
rs1802295 | CT, TT | i |
rs2943640 | AC, CC | j |
rs4402960 | GT, TT | k |
rs4458523 | GG, GT | l |
rs5215 | CC, CT | m |
rs6795735 | CC, CT | n |
rs7403531 | CT, TT | o |
rs7593730 | CC, CT | p |
rs780094 | CC, CT | q |
rs831571 | CC, CT | r |
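The marking rule in Table 3 can be sketched as a lookup that walks the sites in table order and appends a letter when the genotype is one of the listed susceptibility genotypes. The names `TABLE3`, `normalize` and `encode` are hypothetical, and two behaviours are assumptions made for the example: allele order within a genotype is ignored ("TC" is treated as "CT"), and a site missing from the input is treated as unmarked.

```python
# Table 3 as (rs number, susceptibility genotypes, letter), in table order.
TABLE3 = [
    ("rs10203174", {"CC", "CT"}, "a"),
    ("rs10811661", {"CT", "TT"}, "b"),
    ("rs10842994", {"CC", "CT"}, "c"),
    ("rs10923931", {"GT", "TT"}, "d"),
    ("rs11063069", {"AG", "GG"}, "e"),
    ("rs13233731", {"AG", "GG"}, "f"),
    ("rs1359790", {"AG", "GG"}, "g"),
    ("rs1801282", {"CC", "CG"}, "h"),
    ("rs1802295", {"CT", "TT"}, "i"),
    ("rs2943640", {"AC", "CC"}, "j"),
    ("rs4402960", {"GT", "TT"}, "k"),
    ("rs4458523", {"GG", "GT"}, "l"),
    ("rs5215", {"CC", "CT"}, "m"),
    ("rs6795735", {"CC", "CT"}, "n"),
    ("rs7403531", {"CT", "TT"}, "o"),
    ("rs7593730", {"CC", "CT"}, "p"),
    ("rs780094", {"CC", "CT"}, "q"),
    ("rs831571", {"CC", "CT"}, "r"),
]


def normalize(genotype):
    """Sort the two alleles so that 'TC' and 'CT' compare equal (assumption)."""
    return "".join(sorted(genotype))


def encode(genotypes):
    """Convert {rs_id: genotype} for one individual into a letter string."""
    letters = []
    for rs_id, risk_genotypes, letter in TABLE3:
        genotype = genotypes.get(rs_id)  # missing site: treated as unmarked
        if genotype is not None and normalize(genotype) in risk_genotypes:
            letters.append(letter)
    return "".join(letters)


# Made-up individual for illustration.
individual = {
    "rs10203174": "CT", "rs10811661": "TT", "rs10842994": "CC",
    "rs1359790": "AG", "rs1801282": "CC", "rs7593730": "TC",
    "rs780094": "TT",  # TT is not a listed genotype, so no 'q' is emitted
}
print(encode(individual))
```

Because unmarked sites contribute nothing, the resulting string is never longer than the number of sites, as the method states.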
5. Obtaining the character strings
Following the SNP order in Table 3, the genotypes at the 18 sites of each of the 3495 individuals (1991 in the control group and 1504 in the type 2 diabetes group) were converted into a character string, giving 3495 strings in total. Fourteen individuals were randomly selected from these and are shown in Fig. 2.
6. Statistical screening of susceptibility SNP combinations
Statistical analysis of the 3495 individuals (1991 in the control group and 1504 in the type 2 diabetes group) identified character strings, that is, SNP combinations, that are present in the type 2 diabetes group but absent from the control group. For example, abcghp (the six SNP sites rs10203174, rs10811661, rs10842994, rs1359790, rs1801282 and rs7593730 all carry susceptibility alleles) occurs most often in the type 2 diabetes group yet is not found in the control group. If an individual's genetic test shows this combination, the estimated risk of type 2 diabetes may be 1. Strings that occur 5 or more times are listed in Table 4.
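Converting a selected string such as abcghp back into SNP sites is a reverse lookup on Table 3. The sketch below assumes the letters a to r are assigned in Table 3 order; the names `TABLE3_ORDER`, `LETTER_TO_SITE` and `decode` are hypothetical.

```python
import string

# Table 3 rows in order: (rs number, susceptibility genotypes).
TABLE3_ORDER = [
    ("rs10203174", ("CC", "CT")), ("rs10811661", ("CT", "TT")),
    ("rs10842994", ("CC", "CT")), ("rs10923931", ("GT", "TT")),
    ("rs11063069", ("AG", "GG")), ("rs13233731", ("AG", "GG")),
    ("rs1359790", ("AG", "GG")), ("rs1801282", ("CC", "CG")),
    ("rs1802295", ("CT", "TT")), ("rs2943640", ("AC", "CC")),
    ("rs4402960", ("GT", "TT")), ("rs4458523", ("GG", "GT")),
    ("rs5215", ("CC", "CT")), ("rs6795735", ("CC", "CT")),
    ("rs7403531", ("CT", "TT")), ("rs7593730", ("CC", "CT")),
    ("rs780094", ("CC", "CT")), ("rs831571", ("CC", "CT")),
]

# zip stops at the shorter input, so only the letters a..r are used.
LETTER_TO_SITE = dict(zip(string.ascii_lowercase, TABLE3_ORDER))


def decode(letter_string):
    """Map each letter back to its (rs number, susceptibility genotypes)."""
    return [LETTER_TO_SITE[letter] for letter in letter_string]


for rs_id, genotypes in decode("abcghp"):
    print(rs_id, "/".join(genotypes))
```

For abcghp this recovers the six sites named in the text, each with the genotypes that carry 1 or 2 susceptibility alleles.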
Table 4. Character strings occurring 5 or more times
Type 2 diabetes group string | Count | Control group string | Count |
---|---|---|---|
abcghp | 10 | abcghpr | 7 |
abcghjpr | 8 | abcghjlnpr | 6 |
abchjpr | 7 | abchnpr | 6 |
abchlnp | 7 | abchpr | 6 |
abcfghpr | 6 | abghpr | 6 |
abcghjlpq | 6 | abhnpr | 6 |
abcghjp | 6 | abcghjnpr | 5 |
abcghlp | 6 | abcghlpqr | 5 |
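The counting and screening step can be sketched with `collections.Counter`. The function name `screen` and the `min_count` parameter are hypothetical; the default of 5 borrows the cutoff used for Table 4, which is an assumption about what "markedly higher" means. The input lists are toy data, not the study data.

```python
from collections import Counter


def screen(case_strings, control_strings, min_count=5):
    """Return (string, count) pairs seen in cases but never in controls.

    Only strings with at least `min_count` occurrences in the disease
    group are kept; results are sorted by descending count.
    """
    case_counts = Counter(case_strings)
    control_counts = Counter(control_strings)
    hits = [
        (s, n)
        for s, n in case_counts.items()
        # A Counter returns 0 for strings it has never seen.
        if n >= min_count and control_counts[s] == 0
    ]
    return sorted(hits, key=lambda item: (-item[1], item[0]))


# Toy data for illustration only.
cases = ["abcghp"] * 10 + ["abcghjpr"] * 8 + ["abcghpr"] * 3 + ["ab"] * 2
controls = ["abcghpr"] * 7 + ["abchpr"] * 6
result = screen(cases, controls)
print(result)
```

In the toy data, abcghpr is dropped because it also occurs in the controls, and ab is dropped because it falls below the count cutoff.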
The data of the present invention suggest that a susceptibility SNP combination (whose frequency is very low) may assess disease risk better than a single susceptibility SNP site, with an effect resembling that of a low-frequency mutation causing a genetic disease. The data accumulated so far are limited; if data from more than 100,000 individuals were accumulated, directly predicting a disease risk of 100% from an SNP combination might become possible.
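One way to read the "risk of 1" stated in the embodiment is as the fraction of carriers of a string that fall in the disease group. This reading and the symbols $\hat{r}$, $n_{\text{case}}$ and $n_{\text{ctrl}}$ are assumptions for illustration; the text does not give a formula for the risk.

```latex
% n_case(s), n_ctrl(s): carriers of string s in the disease and control groups
\hat{r}(s) \;=\; \frac{n_{\text{case}}(s)}{n_{\text{case}}(s) + n_{\text{ctrl}}(s)},
\qquad
\hat{r}(\texttt{abcghp}) \;=\; \frac{10}{10 + 0} \;=\; 1
```

The counts for abcghp are those reported in the embodiment (10 occurrences in the type 2 diabetes group, none in the control group). With only 10 carriers the estimate depends on the sample, which is consistent with the remark that the accumulated data are limited.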
The above embodiment is a preferred embodiment of the present invention, but embodiments of the invention are not limited by it. Any other change, modification, substitution, combination or simplification made without departing from the spirit and principle of the present invention shall be regarded as an equivalent replacement and falls within the protection scope of the present invention.
Claims (4)
- 1. A method for screening susceptibility SNP combinations of complex diseases based on susceptibility SNPs, characterized by comprising the following steps: identifying susceptibility SNPs associated with a specific disease from the published literature; downloading genome-wide SNP genotyping data of individuals; extracting the genotype of each individual at each susceptibility SNP site; arranging the susceptibility SNP sites in a fixed order; marking a susceptibility SNP site with a specific English letter if it contains at least one susceptibility allele, so that each individual yields a letter string whose length is less than or equal to the number of susceptibility SNP sites; counting the number of occurrences of each string separately in the disease group and the control group; identifying the strings that occur in the disease group but not in the control group; selecting from these the strings with markedly higher counts; and converting the letters back, in order, into genotypes to obtain the susceptibility SNP combinations.
- 2. The method for screening susceptibility SNP combinations of complex diseases based on susceptibility SNPs according to claim 1, characterized in that the control group refers to a healthy population.
- 3. The method for screening susceptibility SNP combinations of complex diseases based on susceptibility SNPs according to claim 1, characterized in that the disease group refers to a population with the disease.
- 4. The method for screening susceptibility SNP combinations of complex diseases based on susceptibility SNPs according to claim 1, characterized in that "containing at least one susceptibility allele" means that the genotype contains 1 or 2 copies of the susceptibility allele.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510828517.4A CN105354444B (en) | 2015-11-24 | 2015-11-24 | Method for screening susceptibility SNP combinations of complex diseases based on susceptibility SNPs |
Publications (2)
Publication Number | Publication Date |
---|---|
CN105354444A CN105354444A (en) | 2016-02-24 |
CN105354444B true CN105354444B (en) | 2018-06-19 |
Family ID: 55330415
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201510828517.4A Expired - Fee Related CN105354444B (en) | 2015-11-24 | 2015-11-24 | Method for screening susceptibility SNP combinations of complex diseases based on susceptibility SNPs |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN105354444B (en) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107345248A (en) * | 2017-06-26 | 2017-11-14 | 思畅信息科技(上海)有限公司 | Gene and site methods of risk assessment and its system based on big data |
CN113403380A (en) * | 2021-06-11 | 2021-09-17 | 中国科学院北京基因组研究所(国家生物信息中心) | Complex disease related SNP site primer composition and application |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN101894216A (en) * | 2010-07-16 | 2010-11-24 | 西安电子科技大学 | Method of discovering SNP group related to complex disease from SNP information |
CN104573408A (en) * | 2013-10-18 | 2015-04-29 | 大江基因医学股份有限公司 | Single nucleotide polymorphism disease incidence prediction system |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU785425B2 (en) * | 2001-03-30 | 2007-05-17 | Genetic Technologies Limited | Methods of genomic analysis |
WO2011076783A2 (en) * | 2009-12-22 | 2011-06-30 | Integragen | A method for evaluating a risk for a transmissible neuropsychiatric disorder |
Non-Patent Citations (2)
Title |
---|
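"Screening of type 2 diabetes susceptibility SNP markers in cynomolgus monkeys based on human type 2 diabetes susceptibility SNP sites"; 柳明玉 et al.; 《中国比较医学杂志》; 2014-10-31; Vol. 24, No. 10; pp. 18-26 * |
"Advances in methods for gene-gene interaction analysis based on single nucleotide polymorphisms"; 栾奕昭 et al.; 《遗传》; 2013-12-31; pp. 1331-1339 * |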
Also Published As
Publication number | Publication date |
---|---|
CN105354444A (en) | 2016-02-24 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | C06 | Publication | |
 | PB01 | Publication | |
 | C10 | Entry into substantive examination | |
 | SE01 | Entry into force of request for substantive examination | |
 | GR01 | Patent grant | |
 | CF01 | Termination of patent right due to non-payment of annual fee | Granted publication date: 20180619; Termination date: 20211124 |