CN113012761B - Method and device for constructing stroke polygene genetic risk comprehensive score and application - Google Patents

Publication number
CN113012761B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110371906.4A
Other languages
Chinese (zh)
Other versions
CN113012761A (en)
Inventor
鲁向锋
顾东风
邢小龙
黄建凤
李建新
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuwai Hospital of CAMS and PUMC
Original Assignee
Fuwai Hospital of CAMS and PUMC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuwai Hospital of CAMS and PUMC
Priority to CN202110371906.4A
Publication of CN113012761A
Application granted
Publication of CN113012761B

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/20 Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Public Health (AREA)
  • Biotechnology (AREA)
  • Theoretical Computer Science (AREA)
  • Epidemiology (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Databases & Information Systems (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Biophysics (AREA)
  • Data Mining & Analysis (AREA)
  • Pathology (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioethics (AREA)
  • Biomedical Technology (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Primary Health Care (AREA)
  • Genetics & Genomics (AREA)
  • Analytical Chemistry (AREA)
  • Chemical & Material Sciences (AREA)
  • Molecular Biology (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention provides a method, a device and an application for constructing a comprehensive polygenic genetic risk score for stroke (metaPRS). The construction method comprises the following steps: screening a set of SNPs associated with stroke or with stroke-related phenotypes; genotyping the selected SNPs in each individual; extracting, from genome-wide association study (GWAS) results, the risk alleles, effect sizes and P values of the genotyped SNPs for a plurality of sub-phenotypes, constructing a plurality of candidate sub-phenotype PRSs and selecting the best PRS for each sub-phenotype; determining a weight for each sub-phenotype PRS; converting the sub-phenotype PRS weights into SNP-level weights; constructing the stroke metaPRS; and evaluating the effect of the metaPRS on stroke risk prediction and stratification. The method is of considerable value for assessing the risk of stroke onset.

Description

Method and device for constructing stroke polygene genetic risk comprehensive score and application
Technical Field
The invention relates to a method and a device for constructing a stroke polygenic genetic risk comprehensive score (metaPRS) and application thereof.
Background
Stroke is one of the major health threats worldwide. The global lifetime risk of stroke in adults aged 25 years and older is estimated at about 25%, and the risk is highest in East Asian populations, at 39%. In China, stroke is the leading cause of death among residents, with 2.07 million stroke deaths in 2017. Early identification of high-risk individuals, combined with healthy-lifestyle management and drug intervention targeting the major risk factors (such as hypertension, diabetes and dyslipidemia), is therefore of great significance for the primary prevention of stroke in China and worldwide.
Stroke is a complex disease caused by both genetic and environmental factors. Genome-wide association studies (GWAS) have identified at least 42 genetic susceptibility loci for stroke, as well as hundreds of loci associated with stroke-related phenotypes, including blood pressure, type 2 diabetes (T2D), blood lipid levels, body mass index (BMI) and atrial fibrillation (AF). Integrating these genetic variants into a polygenic risk score (PRS) for stroke would support early risk prediction for cardiovascular disease and guide primary prevention.
However, the accuracy of existing polygenic genetic risk scores for stroke needs to be improved.
Furthermore, almost all existing genetic scores were constructed in European populations (Stroke 2014 45. Existing stroke polygenic genetic risk scores constructed in European populations are not suitable for East Asian populations.
It is therefore important to construct a stroke PRS for East Asian populations and to rigorously evaluate its value for genetic risk prediction in a prospective cohort.
Disclosure of Invention
One object of the invention is to provide a method for constructing a polygenic genetic risk score for stroke.
Another object of the invention is to provide a device for constructing the stroke polygenic genetic risk score.
In one aspect, the invention provides a method for constructing a comprehensive polygenic genetic risk score for stroke, the method comprising the following steps:
(1) Screening a set of single nucleotide polymorphisms (SNPs) associated, at genome-wide significance, with stroke or with a stroke-related phenotype;
(2) Genotyping the SNPs selected in step (1);
(3) Extracting, from genome-wide association study (GWAS) results, the risk alleles, effect sizes and P values of the genotyped SNPs for a plurality of sub-phenotypes, constructing a plurality of candidate sub-phenotype PRSs and selecting the best PRS for each sub-phenotype; wherein the plurality of sub-phenotypes comprises: stroke, coronary heart disease, type 2 diabetes, atrial fibrillation, systolic pressure, diastolic pressure, mean arterial pressure, pulse pressure, body mass index, waist circumference, total cholesterol, low density lipoprotein cholesterol, triglycerides and high density lipoprotein cholesterol;
(4) Determining a weight for each sub-phenotype PRS;
(5) Converting the weights of the sub-phenotype PRSs into SNP-level weights;
(6) Constructing the comprehensive stroke polygenic genetic risk score, metaPRS.
In a specific embodiment of the invention, the stroke-related phenotypes include blood pressure (systolic pressure, diastolic pressure, mean arterial pressure, pulse pressure), type 2 diabetes, blood lipids (total cholesterol, low density lipoprotein cholesterol, triglycerides and high density lipoprotein cholesterol), obesity (body mass index, waist circumference), atrial fibrillation and coronary heart disease. That is, the candidate sub-phenotype PRSs constructed in the method of the invention include sub-phenotype PRSs for stroke, coronary heart disease, type 2 diabetes, atrial fibrillation, systolic pressure, diastolic pressure, mean arterial pressure, pulse pressure, body mass index, waist circumference, total cholesterol, low density lipoprotein cholesterol, triglycerides and high density lipoprotein cholesterol.
In a specific embodiment of the invention, the stroke polygenic genetic risk score is used to assess the risk of stroke onset in East Asian populations, and the SNP set includes: SNPs associated with stroke or coronary heart disease in all populations, and SNPs associated with blood pressure, type 2 diabetes, blood lipids, obesity and atrial fibrillation in East Asian populations.
In a specific embodiment of the invention, the cohort population that is genotyped is an East Asian population.
In a specific embodiment of the invention, genotyping is performed using multiplex polymerase chain reaction (PCR) targeted amplicon sequencing. The median sequencing depth was 979×.
In a specific embodiment of the invention, SNPs with a genotype call rate (detection rate) below 95% may be excluded during genotyping to obtain a set of qualified SNPs.
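The 95% call-rate filter can be illustrated with a minimal sketch. It assumes genotypes are held as per-SNP lists in which a missing call is `None`; the function names and the toy SNP identifiers are hypothetical and are not taken from the patent.

```python
from typing import Dict, List, Optional

Genotypes = Dict[str, List[Optional[int]]]


def call_rate(calls: List[Optional[int]]) -> float:
    """Fraction of samples with a non-missing genotype call for one SNP."""
    return sum(c is not None for c in calls) / len(calls)


def filter_snps(genotypes: Genotypes, min_rate: float = 0.95) -> Genotypes:
    """Keep only SNPs whose call rate is at least min_rate (95% in the text)."""
    return {snp: calls for snp, calls in genotypes.items()
            if call_rate(calls) >= min_rate}


# Toy data: snp_a is called in 20/20 samples, snp_b in only 15/20.
genotypes = {
    "snp_a": [0, 1, 2, 1] * 5,
    "snp_b": [0, 1, None, 1] * 5,
}
kept = filter_snps(genotypes)
```

Here `snp_b` has a call rate of 0.75 and is dropped, while `snp_a` is retained.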
In a specific embodiment of the invention, the risk alleles, effect sizes and P values of the genotyped SNPs for the plurality of sub-phenotypes are extracted from the results of large-scale genome-wide association studies in East Asian populations.
According to a specific embodiment of the present invention, in the method for constructing a stroke polygenic genetic risk score, the process of constructing each sub-phenotype PRS includes:
dividing the SNPs into a plurality of groups according to the extracted P values and, for each group, pruning the SNPs at different linkage disequilibrium r² thresholds using the clumping command of the PLINK software, based on the cohort population data, to obtain a plurality of SNP combinations;
using the genotype data, weighting the risk allele count (0, 1 or 2) of each SNP by its corresponding effect size and summing to construct a plurality of candidate PRSs comprising different SNP combinations, assessing the association of these candidate PRSs with stroke using a logistic regression model, and selecting the score with the greatest odds ratio (OR) per one-standard-deviation increase in PRS as the best sub-phenotype PRS.
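The scoring and selection step can be sketched as follows. This is a simplified illustration, not the patented implementation: the logistic model here is unadjusted (the figures in this document describe adjustment for age and sex), it is fitted with a hand-written Newton-Raphson loop, and all function and variable names are hypothetical.

```python
import math
from statistics import mean, pstdev
from typing import Dict, Sequence


def weighted_prs(counts: Dict[str, int], effects: Dict[str, float]) -> float:
    """Sum of risk-allele counts (0, 1 or 2) weighted by GWAS effect sizes."""
    return sum(effects[snp] * counts[snp] for snp in effects)


def or_per_sd(prs: Sequence[float], case: Sequence[int], n_iter: int = 25) -> float:
    """Odds ratio per one-SD increase in PRS (unadjusted logistic regression)."""
    mu, sd = mean(prs), pstdev(prs)
    x = [(v - mu) / sd for v in prs]          # standardize the score
    b0 = b1 = 0.0
    for _ in range(n_iter):                   # Newton-Raphson updates
        p = [1.0 / (1.0 + math.exp(-(b0 + b1 * xi))) for xi in x]
        g0 = sum(y - pi for y, pi in zip(case, p))
        g1 = sum((y - pi) * xi for y, pi, xi in zip(case, p, x))
        w = [pi * (1.0 - pi) for pi in p]
        h00 = sum(w)
        h01 = sum(wi * xi for wi, xi in zip(w, x))
        h11 = sum(wi * xi * xi for wi, xi in zip(w, x))
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return math.exp(b1)


def best_candidate(candidates: Dict[str, Sequence[float]], case: Sequence[int]) -> str:
    """Name of the candidate PRS with the greatest OR per SD."""
    return max(candidates, key=lambda name: or_per_sd(candidates[name], case))


# Toy example: six individuals, two hypothetical candidate scores.
case = [0, 0, 1, 1, 1, 0]
candidates = {
    "strong": [-1, -1, -1, 1, 1, 1],
    "weak": [-1, 1, -1, 1, -1, 1],
}
chosen = best_candidate(candidates, case)
```

In practice the covariate-adjusted model would be fitted with a statistics package; the loop above only makes the "OR per standard deviation" criterion concrete.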
In a more specific embodiment of the invention, in the above process of constructing each sub-phenotype PRS, the SNPs may be divided into N groups according to the extracted P values, where N is greater than or equal to 2. For example, 3, 4, 5 or 6 groups may be selected from the P value thresholds 0.5, 0.05, 5×10⁻³, 5×10⁻⁴, 5×10⁻⁵ and 5×10⁻⁶.
In a more specific embodiment of the invention, in the above process of constructing each sub-phenotype PRS, the different linkage disequilibrium r² thresholds may, for example, be selected from 0.2, 0.4, 0.6 and 0.8.
In a more specific embodiment of the invention, in the above process of constructing each sub-phenotype PRS, when the SNPs are divided into N groups according to the extracted P values and the linkage disequilibrium r² thresholds are 0.2, 0.4, 0.6 and 0.8, 4N SNP combinations are obtained, i.e. 4N candidate PRSs comprising different SNP combinations can be constructed. For example, when the SNPs are divided into 4 groups according to the extracted P values and the r² thresholds are 0.2, 0.4, 0.6 and 0.8, 16 candidate PRSs comprising different SNP combinations can be constructed.
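The 4N count follows from crossing the P value thresholds with the r² thresholds. A minimal sketch, assuming the four P value thresholds listed for Figure 4 and assuming that a SNP belongs to a group when its P value is below the group threshold (the text does not state the comparison explicitly); the function names are hypothetical and the LD clumping itself is not reproduced here:

```python
from itertools import product
from typing import Dict, List, Tuple

P_THRESHOLDS = [0.5, 0.05, 5e-4, 5e-6]   # N = 4 groups
R2_THRESHOLDS = [0.2, 0.4, 0.6, 0.8]


def candidate_grid(p_thresholds: List[float],
                   r2_thresholds: List[float]) -> List[Tuple[float, float]]:
    """Every (P threshold, r2 threshold) pair; each pair defines one candidate PRS."""
    return list(product(p_thresholds, r2_thresholds))


def snps_below(gwas_p: Dict[str, float], p_max: float) -> List[str]:
    """SNPs whose GWAS P value is below the threshold (assumed strict '<')."""
    return [snp for snp, p in gwas_p.items() if p < p_max]


grid = candidate_grid(P_THRESHOLDS, R2_THRESHOLDS)
n_candidates = len(grid)   # 4 * N candidate scores per sub-phenotype
```

Each pair in `grid` would be passed to the clumping step to obtain one SNP combination.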
In the invention, the pairwise correlation coefficients r and the corresponding P values between all sub-phenotype PRSs may further be calculated by Pearson correlation analysis.
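A minimal sketch of the pairwise Pearson correlation between sub-phenotype scores. Only the coefficient r is computed; the P value would normally come from a statistics package and is omitted. The score names and values are made up for illustration.

```python
import math
from itertools import combinations
from typing import Dict, Sequence, Tuple


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def pairwise_r(prs: Dict[str, Sequence[float]]) -> Dict[Tuple[str, str], float]:
    """Correlation coefficient for every unordered pair of scores."""
    return {(a, b): pearson_r(prs[a], prs[b]) for a, b in combinations(prs, 2)}


# Hypothetical scores for three sub-phenotypes in four individuals.
scores = {
    "SBP": [1.0, 2.0, 3.0, 4.0],
    "DBP": [2.0, 4.0, 6.0, 8.0],
    "HDL-C": [4.0, 3.0, 2.0, 1.0],
}
r_table = pairwise_r(scores)
```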
In a specific embodiment of the invention, a portion of the whole cohort population may be selected, according to a predetermined proportion, as a training set (the remainder may serve as a validation set). The construction of the sub-phenotype PRSs and the determination of the weight of each sub-phenotype PRS are carried out in the training set.
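A proportional split of this kind can be sketched as below. The 70% proportion, the fixed random seed and the function name are illustrative assumptions; the text only says that the proportion is predetermined.

```python
import random
from typing import List, Sequence, Tuple


def split_cohort(ids: Sequence[str], train_fraction: float,
                 seed: int = 0) -> Tuple[List[str], List[str]]:
    """Randomly assign a fixed fraction of individuals to the training set."""
    rng = random.Random(seed)          # fixed seed keeps the split reproducible
    shuffled = list(ids)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]


ids = [f"id_{i}" for i in range(10)]
train, validation = split_cohort(ids, train_fraction=0.7)
```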
According to a specific embodiment of the present invention, in the method for constructing a stroke polygenic genetic risk score, the process of determining the weight of each sub-phenotype PRS comprises:
converting each sub-phenotype PRS into a standardized score with a mean of 0 and a standard deviation of 1;
using the training set, entering the standardized sub-phenotype PRSs together with the covariates to be adjusted for (age and sex) into an elastic-net logistic regression model, selecting the model with the highest AUC as the final model, and taking the coefficients (β1, …, β14) of the PRSs from the final model as the weights.
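The "highest AUC" criterion can be made concrete with a small sketch. The passage does not say what differs between the candidate elastic-net fits (for example the penalty settings), so the two "settings" below and their predicted risks are purely hypothetical; only the AUC calculation and the selection rule are illustrated.

```python
from typing import Dict, Sequence


def auc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """Area under the ROC curve: share of case-control pairs ranked correctly.

    Ties between a case and a control count as half a correct pair.
    """
    cases = [s for s, y in zip(scores, labels) if y == 1]
    controls = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum(1.0 if c > k else 0.5 if c == k else 0.0
               for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


def pick_by_auc(predictions: Dict[str, Sequence[float]],
                labels: Sequence[int]) -> str:
    """Name of the candidate model whose predictions give the highest AUC."""
    return max(predictions, key=lambda name: auc(predictions[name], labels))


labels = [0, 0, 1, 1]
predictions = {
    "setting_a": [0.1, 0.4, 0.35, 0.8],
    "setting_b": [0.2, 0.3, 0.6, 0.9],
}
final_model = pick_by_auc(predictions, labels)
```

The coefficients of the selected fit would then be read off as the weights β1, …, β14.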
In some embodiments of the invention, an elastic-net logistic regression model, which accounts for the correlation between the sub-phenotype PRSs, is used to evaluate the association of the 14 sub-phenotype PRSs with stroke, and the OR estimated by elastic-net logistic regression is compared with the OR estimated by univariate logistic regression. Further, the invention constructs and validates the stroke metaPRS by integrating the 14 sub-phenotype PRSs and converting the sub-phenotype PRS weights into SNP-level weights.
According to the specific embodiment of the invention, in the method for constructing the stroke polygene genetic risk score, the process of converting the weight of the sub-phenotype PRS into the weight of the SNP level is carried out according to the following model:
Figure BDA0003009615150000041
where σ1, …, σ14 are the standard deviations of the sub-phenotype PRSs in the training set and αj1, …, αj14 are the effect sizes of the jth SNP for each sub-phenotype; if a SNP is not included in the kth score, its effect size αjk is set to 0.
According to the specific embodiment of the invention, in the method for constructing the stroke polygene genetic risk score, the constructed stroke polygene genetic risk comprehensive score metaPRS is as follows:
metaPRS = ∑ βsnp_i × Ni
where βsnp_i is the effect size (SNP-level weight) of the ith SNP and Ni is the number of effect alleles of the ith SNP carried by the individual.
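The weight conversion and the final score can be sketched together. The conversion formula itself is an image that is not reproduced in this text, so the form used below is an assumption inferred from the symbol definitions: the SNP-level weight is taken as the sum over sub-phenotypes k of (βk / σk) × α for that SNP and sub-phenotype. All names and numbers are illustrative.

```python
from typing import Dict, List


def snp_weights(alpha: Dict[str, List[float]], beta: List[float],
                sigma: List[float]) -> Dict[str, float]:
    """SNP-level weights, assuming weight = sum_k (beta_k / sigma_k) * alpha_k.

    alpha[snp][k] is the effect size of the SNP for sub-phenotype k,
    set to 0 when the SNP is not part of the kth score.
    """
    return {snp: sum(b / s * a for a, b, s in zip(row, beta, sigma))
            for snp, row in alpha.items()}


def meta_prs(counts: Dict[str, int], weights: Dict[str, float]) -> float:
    """metaPRS = sum over SNPs of (SNP-level weight x effect-allele count)."""
    return sum(weights[snp] * counts[snp] for snp in weights)


# Toy example with two SNPs and two sub-phenotype scores.
alpha = {"snp_1": [0.10, 0.00], "snp_2": [0.05, 0.20]}
beta = [0.30, 0.15]      # model coefficients of the two standardized PRSs
sigma = [0.5, 0.4]       # training-set SDs of the two PRSs
counts = {"snp_1": 2, "snp_2": 1}

weights = snp_weights(alpha, beta, sigma)
score = meta_prs(counts, weights)

# Cross-check: the same value results from combining the raw sub-phenotype
# PRSs with weights beta_k / sigma_k (centering only shifts it by a constant).
sub_prs = [sum(alpha[s][k] * counts[s] for s in alpha) for k in range(2)]
combined = sum(b / s * p for b, s, p in zip(beta, sigma, sub_prs))
```

Under this assumed form, scoring at the SNP level and combining the sub-phenotype scores differ only by a constant offset, which is why a single set of SNP weights suffices once the model is trained.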
In a specific embodiment of the invention, the method may further comprise evaluating the effect of the constructed metaPRS on stroke risk prediction and stratification.
In a specific embodiment of the invention, the 20th and 80th percentiles of the metaPRS of all individuals in the cohort are preferably used as cut points to divide individuals into low, medium and high genetic risk groups for stroke.
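The percentile-based stratification can be sketched as follows. How individuals exactly at a cut point are assigned is not stated in the text, so the strict comparisons below are an assumption, as is the choice of the "inclusive" percentile method.

```python
from statistics import quantiles
from typing import List, Sequence


def risk_groups(scores: Sequence[float]) -> List[str]:
    """Label each individual low / medium / high by cohort-wide percentiles."""
    cuts = quantiles(scores, n=5, method="inclusive")   # 20th, 40th, 60th, 80th
    p20, p80 = cuts[0], cuts[3]
    labels = []
    for s in scores:
        if s < p20:
            labels.append("low")
        elif s > p80:
            labels.append("high")
        else:
            labels.append("medium")
    return labels


# Toy cohort: metaPRS values 0, 1, ..., 100.
scores = list(range(101))
labels = risk_groups(scores)
```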
In another aspect, the invention further provides a device for constructing a comprehensive polygenic genetic risk score for stroke, the device comprising:
a genotyping module for genotyping;
a sub-phenotype PRS construction module for extracting, from genome-wide association study (GWAS) results, the risk alleles, effect sizes and P values of the genotyped SNPs for a plurality of sub-phenotypes, constructing candidate sub-phenotype PRSs and selecting the best sub-phenotype PRSs;
a model training module for determining the weight of each sub-phenotype PRS in a training set;
and a metaPRS construction module for converting the sub-phenotype PRS weights into SNP-level weights and constructing the comprehensive stroke polygenic genetic risk score (metaPRS).
In a specific embodiment of the invention, the device may optionally further comprise a SNP screening module for screening a set of single nucleotide polymorphisms (SNPs) associated with stroke or with a stroke-related phenotype.
In a specific embodiment of the invention, the genotyping module may also be used to exclude, after genotyping, SNPs with a genotype call rate (detection rate) below 95%.
In a specific embodiment of the invention, the metaPRS construction module may optionally also be used to evaluate the effect of the constructed metaPRS on stroke risk prediction and stratification.
In another aspect, the invention further provides a computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, evaluates an individual's risk of stroke using the comprehensive stroke polygenic genetic risk score constructed by the method of the invention.
In a specific embodiment of the invention, a comprehensive polygenic genetic risk score comprising a plurality of genetic variants is developed from large-scale GWAS results for stroke and related phenotypes in East Asian populations, and its effect on stroke risk stratification is evaluated in a large prospective cohort of 41,006 participants. The study found that individuals at high genetic risk (top 20% of genetic risk) had approximately twice the risk of developing stroke of individuals at low genetic risk (bottom 20% of genetic risk) (HR: 1.99, 95-ci. The stroke risk assessment of the invention is applicable to both hemorrhagic and ischemic stroke. The study shows that the comprehensive polygenic genetic risk score enables fine-grained stratification of stroke risk, and the method has important application prospects for constructing such scores and for the primary prevention of stroke.
Drawings
Fig. 1 shows the construction of the stroke polygenic genetic risk score of the invention, together with the study design and workflow.
Fig. 2 shows the sequencing depth of the 578 successfully genotyped SNPs. The boxplot shows the quartile distribution of sequencing depth for the 578 SNPs. The line inside the box represents the median sequencing depth (979×), and the top and bottom of the box represent the 75% (1376×) and 25% (738×) quantiles.
Fig. 3 shows the association of stroke PRSs with stroke in the training set, using effect sizes from East Asian and European GWAS. Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated using logistic regression models adjusted for age and sex. Scores were calculated using, as SNP weights, the effect sizes from the stroke GWAS data of BioBank Japan and of Europeans in the MEGASTROKE consortium, respectively. Four PRSs comprising different SNP combinations were constructed at different P value thresholds (5×10⁻⁶, 5×10⁻⁴, 0.05, 0.5) (linkage disequilibrium r² < 0.6).
Fig. 4 shows the association of the candidate polygenic risk scores (per one-standard-deviation increase) with stroke in the training set. Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated using logistic regression adjusted for age and sex. For each phenotype, 16 candidate PRSs were constructed from the GWAS summary data using different linkage disequilibrium r² thresholds (0.2, 0.4, 0.6, 0.8) and significance thresholds (P value = 0.5, 0.05, 5×10⁻⁴, 5×10⁻⁶).
Fig. 5 shows the correlation between the sub-phenotype PRSs in the training set. Correlation coefficients and P values for each pair of PRSs were calculated using Pearson correlation analysis. *P < 0.05, **P < 10⁻³, ***P < 10⁻¹⁰.
Fig. 6 shows the association of the sub-phenotype polygenic risk scores (per one-standard-deviation increase) with stroke in the training set. Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated using logistic regression and elastic-net logistic regression, respectively, adjusted for age and sex.
Fig. 7 shows the association of the metaPRS and the sub-phenotype PRSs with stroke onset in the prospective cohort. Hazard ratios (HRs) and 95% confidence intervals (CIs) were calculated using a cohort-stratified Cox proportional hazards regression model with age as the time scale, adjusted for sex.
Fig. 8 shows the association of metaPRS quintiles with stroke onset. Hazard ratios (HRs) and 95% confidence intervals (CIs) were calculated using a cohort-stratified Cox proportional hazards regression model with age as the time scale, adjusted for sex.
Fig. 9 shows the relative risk and lifetime risk of stroke under different genetic risk strata. Hazard ratios (HRs) and cumulative stroke incidence curves up to the age of 80 years were calculated using a cohort-stratified Cox proportional hazards regression model with age as the time scale, adjusted for sex.
Fig. 10 shows the relative risk and lifetime risk of ischemic stroke and hemorrhagic stroke under different genetic risk strata. Hazard ratios (HRs) and cumulative incidence curves of ischemic and hemorrhagic stroke up to the age of 80 years were calculated using a cohort-stratified Cox proportional hazards regression model with age as the time scale, adjusted for sex.
Detailed Description
For a clearer understanding of the technical features, objects and advantages of the invention, the technical solutions of the invention are described in detail below with reference to specific embodiments. It should be understood that these examples are intended to illustrate the invention and not to limit its scope. In the examples, all reagents and raw materials are commercially available, and experimental methods for which specific conditions are not given follow conventional methods and conditions well known in the art or the conditions recommended by the instrument manufacturer.
Example 1
Research design process and research population
The study design flow is shown in fig. 1.
The study constructed the metaPRS in a training set with a case-control design, and validated and evaluated the clinical value of the metaPRS for stroke risk prediction in a large prospective cohort, the China-PAR project (Prediction for Atherosclerotic Cardiovascular Disease Risk in China).
The training set included 2872 stroke cases (2548 ischemic and 324 hemorrhagic strokes) and 2494 controls (Table 1). All cases had a valid diagnosis of stroke, confirmed by neurologists from medical records with computed tomography (CT) and/or magnetic resonance imaging (MRI). The controls were individuals enrolled in a community survey of cardiovascular risk factors who were confirmed, by medical history, clinical examination and standard questionnaires, never to have had a stroke.
TABLE 1 Population characteristics of the training set
Characteristic                        Control (N = 2494)    Stroke cases (N = 2872)
Age at study participation, years     66.1 (10.3)           -
Age of onset, years                   -                     66.6 (9.8)
Male, N (%)                           934 (37.4)            1,617 (56.3)
Current smoker, N (%)                 554 (22.2)            622 (21.8)
Systolic blood pressure, mmHg         132.4 (15.9)          149.7 (23.7)
Diastolic blood pressure, mmHg        82.9 (8.5)            87.9 (25.9)
Total cholesterol, mg/dl              188.1 (36.8)          182.3 (64.5)
Hypertension, N (%)                   1,176 (47.2)          2,242 (78.9)
Diabetes, N (%)                       285 (11.4)            578 (20.3)
Dyslipidemia, N (%)                   895 (35.9)            1,330 (48.5)
Continuous variables are expressed as mean (standard deviation) and categorical variables as number (percentage).
The validation population came from three cohorts of the China-PAR project: the China Multi-Center Collaborative Study of Cardiovascular Epidemiology 1998 (China MUCA 1998), the International Collaborative Study of Cardiovascular Disease in Asia (InterASIA) and the Community Intervention of Metabolic Syndrome in China & Chinese Family Health Study (CIMIC). The establishment and follow-up of these cohorts have been described in detail in the prior literature (Circulation 2016. Briefly, China MUCA 1998, InterASIA and CIMIC were established in 1998, 2000-2001 and 2007-2008, respectively. The three cohorts were followed up during 2012-2015 using a unified questionnaire and protocol. Of the 43,881 participants with blood samples and follow-up information, the invention excluded 561 participants with a high genotype missing rate (> 5.0%) or a low mean sequencing depth (< 30×), 1352 participants with a baseline age < 30 or > 75 years, and 962 participants with cardiovascular disease (stroke or myocardial infarction) at baseline, leaving 41,006 participants for analysis.
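The participant flow stated above can be checked with a few lines of arithmetic; the dictionary keys simply paraphrase the exclusion criteria in the text.

```python
# Participants with blood samples and follow-up information.
eligible = 43_881

# Exclusions as listed in the text.
exclusions = {
    "genotype missing rate > 5.0% or mean depth < 30x": 561,
    "baseline age < 30 or > 75 years": 1_352,
    "cardiovascular disease at baseline": 962,
}

analysed = eligible - sum(exclusions.values())
```

The result, 41,006, matches the number of participants reported for the analysis.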
These studies were all approved by the ethics review committee of Fuwai Hospital, Chinese Academy of Medical Sciences. Each participant signed a written informed consent form before data collection.
Collection of major traditional risk factors at baseline
In the baseline survey, a standard questionnaire, physical examination and laboratory tests were administered to each participant. A series of lifestyle risk factors and cardiometabolic indicators were collected by professionally trained and certified investigators following a uniform survey protocol. The traditional stroke risk factors at baseline mainly comprised hypertension, dyslipidemia, diabetes, obesity (BMI ≥ 28 kg/m²) and family history of stroke. Hypertension was defined as systolic blood pressure (SBP) ≥ 140 mmHg and/or diastolic blood pressure (DBP) ≥ 90 mmHg and/or use of antihypertensive medication within the past two weeks. Dyslipidemia was defined as total cholesterol (TC) ≥ 240 mg/dl and/or high-density lipoprotein cholesterol (HDL-C) < 40 mg/dl and/or triglycerides (TG) ≥ 200 mg/dl and/or low-density lipoprotein cholesterol (LDL-C) ≥ 160 mg/dl and/or use of lipid-lowering medication. Diabetes was defined as fasting blood glucose ≥ 126 mg/dl and/or use of insulin or oral hypoglycemic agents. Family history of stroke was defined as a history of stroke in any first-degree relative (father, mother or sibling).
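The threshold definitions above translate directly into code. A minimal sketch; the record layout and field names are hypothetical, while the cut-offs are those given in the text.

```python
from dataclasses import dataclass


@dataclass
class Baseline:
    sbp: float               # systolic blood pressure, mmHg
    dbp: float               # diastolic blood pressure, mmHg
    on_bp_drugs: bool        # antihypertensive medication in the past two weeks
    tc: float                # total cholesterol, mg/dl
    hdl_c: float             # HDL cholesterol, mg/dl
    tg: float                # triglycerides, mg/dl
    ldl_c: float             # LDL cholesterol, mg/dl
    on_lipid_drugs: bool     # lipid-lowering medication
    fasting_glucose: float   # mg/dl
    on_glucose_drugs: bool   # insulin or oral hypoglycemic agents
    bmi: float               # kg/m^2


def has_hypertension(b: Baseline) -> bool:
    return b.sbp >= 140 or b.dbp >= 90 or b.on_bp_drugs


def has_dyslipidemia(b: Baseline) -> bool:
    return (b.tc >= 240 or b.hdl_c < 40 or b.tg >= 200
            or b.ldl_c >= 160 or b.on_lipid_drugs)


def has_diabetes(b: Baseline) -> bool:
    return b.fasting_glucose >= 126 or b.on_glucose_drugs


def is_obese(b: Baseline) -> bool:
    return b.bmi >= 28


# Hypothetical participant: raised SBP only.
person = Baseline(sbp=150, dbp=85, on_bp_drugs=False, tc=200, hdl_c=50,
                  tg=150, ldl_c=120, on_lipid_drugs=False,
                  fasting_glucose=100, on_glucose_drugs=False, bmi=27.9)
```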
Follow-up of stroke events
The three cohorts were followed up under the same study protocol. Information on stroke incidence and mortality was obtained through interviews and household visits, and medical records and death certificates were then obtained for verification. All medical and death records were independently reviewed by two experts of the endpoint assessment committee at Fuwai Hospital, Chinese Academy of Medical Sciences. If the two experts disagreed, the case was discussed with other experts on the committee to reach a final diagnosis. Causes of death were coded according to ICD-10 (International Classification of Diseases, 10th Revision). Stroke was defined as the first fatal or non-fatal stroke event (I60-I69) diagnosed during follow-up. Stroke subtypes were classified as ischemic stroke (I63), hemorrhagic stroke (I60-I62) and stroke of undefined type (I64-I69).
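The subtype coding can be sketched as a small lookup on the three-character ICD-10 category. The function name and the handling of malformed codes are assumptions; the code ranges are those stated in the text.

```python
from typing import Optional


def stroke_subtype(icd10: str) -> Optional[str]:
    """Map an ICD-10 code to the stroke subtype used in the text, else None."""
    code = icd10.strip().upper()
    if len(code) < 3 or code[0] != "I" or not code[1:3].isdigit():
        return None
    category = int(code[1:3])
    if category == 63:
        return "ischemic"
    if 60 <= category <= 62:
        return "hemorrhagic"
    if 64 <= category <= 69:
        return "undefined"
    return None   # not a stroke code (outside I60-I69)


examples = {code: stroke_subtype(code) for code in ["I63.9", "I61", "I64", "I21"]}
```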
Selection and genotyping of single nucleotide polymorphic sites
Based on previous genome-wide association studies, the invention selected 588 single nucleotide polymorphisms (SNPs) associated at genome-wide significance with stroke or stroke-related phenotypes, covering stroke (n = 42) and a series of major stroke risk factors: blood pressure (n = 46), type 2 diabetes (n = 89), blood lipids (n = 126), obesity (n = 79) and atrial fibrillation (n = 16) (Tables 2 and 3). SNPs associated with coronary artery disease (CAD) were also included (n = 199). For stroke and coronary heart disease, all SNPs reported in East Asian and European populations were included; for the other phenotypes, mainly SNPs reported in East Asian populations were included.
TABLE 2 number of SNPs selected in this study
Traits No.of SNPs
Stroke(AS,IS,HS) 42
BP(SBP,DBP,PP,MAP,hypertension) 46
CAD 199
T2D 89
Obesity(BMI,WC,WHR) 79
Lipids(TC,LDL-C,TG,HDL-C) 126
AF 16
Total 588 *
* Because some susceptibility SNPs are shared between phenotypes, the per-trait counts sum to 597 rather than 588.
SNP, single nucleotide polymorphism; AS, all stroke; IS, ischemic stroke; HS, hemorrhagic stroke; BP, blood pressure; SBP, systolic blood pressure; DBP, diastolic pressure; PP, pulse pressure; MAP, mean arterial pressure; CAD, coronary artery disease; T2D, type 2 diabetes; BMI, body mass index; WC, waist circumference; WHR, waist-hip ratio; TC, total cholesterol; LDL-C, low density lipoprotein cholesterol; TG, triglycerides; HDL-C, high density lipoprotein cholesterol; AF, atrial fibrillation.
All training set participants were genotyped using multiplex polymerase chain reaction (PCR) targeted amplicon sequencing. The target regions were amplified and then sequenced on an Illumina HiSeq X Ten high-throughput sequencer. After excluding 10 SNPs with a genotype call rate below 95%, 578 autosomal SNPs were retained for subsequent analysis, with an average genotype call rate of 99.9% and a median sequencing depth of 979× (FIG. 2). To assess genotyping reproducibility, 1648 replicate samples were tested, giving a genotype concordance rate of >99.4%.
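The call-rate filter described above can be sketched as follows. The matrix layout (samples in rows, SNPs in columns, NaN for a missing call) and the function name are assumptions made for illustration.

```python
import numpy as np


def call_rate_filter(genotypes, min_rate=0.95):
    """Return a keep-mask and the per-SNP call rate.

    genotypes: samples x SNPs array of allele counts, NaN where no call was made.
    SNPs with a call rate lower than min_rate are flagged for exclusion.
    """
    g = np.asarray(genotypes, dtype=float)
    called = (~np.isnan(g)).sum(axis=0)   # number of successful calls per SNP
    call_rate = called / g.shape[0]
    keep = call_rate >= min_rate
    return keep, call_rate
```

Applied to the study data, a filter of this kind would drop the 10 SNPs mentioned in the text and keep the remaining 578.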
Construction of metaPRS
(1) Extracting SNP effect values from GWAS result data, and calculating PRS of each sub-phenotype
Genome-wide association study results for 14 sub-phenotypes (stroke, coronary heart disease, type 2 diabetes, atrial fibrillation, systolic blood pressure, diastolic blood pressure, mean arterial pressure, pulse pressure, body mass index, waist circumference, total cholesterol, low-density lipoprotein cholesterol, triglycerides and high-density lipoprotein cholesterol) were obtained from the websites given in the references listed in Table 3, and the risk alleles, effect values and P values of the genotyped SNPs for each of the 14 sub-phenotypes were extracted from these results.
TABLE 3 Sources of summary data for polygenic risk score calculation
Figure BDA0003009615150000101
PRS, polygenic risk score; GWAS, genome-wide association study; EWAS, exome-wide association study; IS, ischemic stroke; BP, blood pressure; SBP, systolic blood pressure; DBP, diastolic blood pressure; PP, pulse pressure; MAP, mean arterial pressure; CAD, coronary artery disease; T2D, type 2 diabetes; BMI, body mass index; WC, waist circumference; WHR, waist-hip ratio; TC, total cholesterol; LDL-C, low-density lipoprotein cholesterol; TG, triglycerides; HDL-C, high-density lipoprotein cholesterol; AF, atrial fibrillation.
Taking the stroke sub-phenotype as an example, the risk allele, effect value and P value of each genotyped SNP were extracted from the GWAS results available at http://jenger.riken.jp/en/result, the website provided by the article (Large-scale genome-wide association study in a Japanese population identifies novel susceptibility loci across different diseases. Nature Genetics 2020;52:669-679). Using the extracted P values, 4 groups of SNPs were screened out at the P-value thresholds 0.5, 0.05, 5 × 10⁻⁴ and 5 × 10⁻⁶. Each group was then pruned with the clumping command of the plink software (version 1.9), based on the cohort population data, at four linkage disequilibrium r² thresholds (0.2, 0.4, 0.6, 0.8), giving 16 SNP combinations in total. Using the training set genotype data, each individual's risk allele count for each SNP (0, 1 or 2) was weighted by the corresponding effect value and summed, yielding 16 candidate PRSs built from different SNP combinations. The association of each candidate PRS with stroke was evaluated with a logistic regression model, and the score with the largest odds ratio (OR) per one-standard-deviation increase in PRS was selected as the optimal stroke PRS. The same process was repeated to obtain the PRSs of the other 13 sub-phenotypes. The SNP sites and effect values used by the optimal stroke sub-phenotype (Stroke) PRS are shown in Table 4.
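The candidate-selection loop can be sketched in a few functions. This is an illustration under stated assumptions: the clumping itself is done by plink and is not reproduced here, the function names are hypothetical, and the univariate logistic regression is fitted with a plain Newton-Raphson loop rather than with the software the inventors used.

```python
import itertools

import numpy as np

P_THRESHOLDS = (0.5, 0.05, 5e-4, 5e-6)   # from the text
R2_THRESHOLDS = (0.2, 0.4, 0.6, 0.8)     # from the text


def candidate_grid():
    """All (P threshold, r^2 threshold) pairs: 4 x 4 = 16 combinations."""
    return list(itertools.product(P_THRESHOLDS, R2_THRESHOLDS))


def prs(allele_counts, effects):
    """Weighted sum of risk allele counts (0, 1 or 2) per individual."""
    return np.asarray(allele_counts, dtype=float) @ np.asarray(effects, dtype=float)


def odds_ratio_per_sd(score, case, n_iter=50):
    """OR per one-SD increase from a univariate logistic regression."""
    score = np.asarray(score, dtype=float)
    y = np.asarray(case, dtype=float)
    z = (score - score.mean()) / score.std(ddof=1)
    X = np.column_stack([np.ones_like(z), z])
    beta = np.zeros(2)
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-X @ beta))
        grad = X.T @ (y - p)
        hess = X.T @ (X * (p * (1.0 - p))[:, None])
        step = np.linalg.solve(hess, grad)
        beta += step
        if np.max(np.abs(step)) < 1e-10:
            break
    return float(np.exp(beta[1]))


def best_candidate(candidates, case):
    """candidates: dict mapping a label to a score vector; returns the label with the largest OR."""
    return max(candidates, key=lambda k: odds_ratio_per_sd(candidates[k], case))
```

In use, each of the 16 grid cells would supply one SNP subset from plink, `prs` would turn it into a score, and `best_candidate` would pick the optimal PRS for the sub-phenotype.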
(2) Calculating weights for individual sub-phenotypic PRSs in a training set
The 14 sub-phenotype PRSs were each standardized to a mean of 0 and a standard deviation of 1. In the training set, the 14 standardized sub-phenotype PRSs and the covariates to be adjusted for (age and sex) were entered together into an elastic-net logistic regression model (cv.glmnet function, R package "glmnet"). A series of models with different penalty terms (alpha = 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1.0) was evaluated by 10-fold cross-validation with the model parameter type.measure set to "auc". The model with the highest AUC (area under the receiver operating characteristic curve) was selected as the final model, and the coefficients of the PRSs in the final model (β1, …, β14) were taken as the weights.
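The ingredients of this step can be illustrated without refitting the model. The sketch below shows the standardization, the alpha grid, the elastic-net penalty in the form documented for glmnet, and a rank-based AUC; it does not fit the penalized regression, which the text leaves to cv.glmnet. All function names are illustrative.

```python
import numpy as np

ALPHAS = [i / 10 for i in range(11)]  # 0, 0.1, ..., 1.0 as listed in the text


def standardize(prs_matrix):
    """Scale each PRS column to mean 0 and standard deviation 1."""
    x = np.asarray(prs_matrix, dtype=float)
    return (x - x.mean(axis=0)) / x.std(axis=0, ddof=1)


def elastic_net_penalty(beta, lam, alpha):
    """Penalty lam * ((1 - alpha) / 2 * ||beta||_2^2 + alpha * ||beta||_1).

    alpha = 0 gives a ridge penalty and alpha = 1 a lasso penalty.
    """
    b = np.asarray(beta, dtype=float)
    return lam * ((1.0 - alpha) / 2.0 * np.sum(b ** 2) + alpha * np.sum(np.abs(b)))


def auc(scores, labels):
    """Probability that a random case scores higher than a random control (ties count half)."""
    s = np.asarray(scores, dtype=float)
    y = np.asarray(labels, dtype=bool)
    diff = s[y][:, None] - s[~y][None, :]
    return float((diff > 0).mean() + 0.5 * (diff == 0).mean())
```

With alpha = 1 some coefficients can shrink exactly to zero, which is one way a sub-phenotype PRS can end up with a weight of 0 in the final model.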
(3) Conversion of weight of sub-phenotypic PRS into weight of SNP level
Figure BDA0003009615150000111
The PRS-level weights are converted to SNP-level weights using the above formula, where σ1, …, σ14 are the standard deviations of the sub-phenotype PRSs in the training set and αj1, …, αj14 are the effect values of the jth SNP for each sub-phenotype; if a SNP is not included in the kth score, its effect value αjk is set to 0.
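The formula itself is an image that is not reproduced in this text. A form consistent with the definitions given here (coefficients β fitted on standardized PRSs, standard deviations σ, per-sub-phenotype effect values α) would be the following; it is a reconstruction under that assumption, not a transcription of the figure, and the index k over sub-phenotypes is introduced for illustration:

```latex
\beta_{\mathrm{SNP}_j}
  = \frac{\beta_1}{\sigma_1}\,\alpha_{j1} + \cdots + \frac{\beta_{14}}{\sigma_{14}}\,\alpha_{j14}
  = \sum_{k=1}^{14} \frac{\beta_k}{\sigma_k}\,\alpha_{jk}
```

Dividing each β by the corresponding σ undoes the standardization, so that the weight applies directly to allele counts.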
(4) Calculating metaPRS
The metaPRS of an individual is calculated using the formula metaPRS = ∑ βSNP_i × Ni, where βSNP_i is the effect value of the ith SNP (that is, the SNP-level weight obtained in step (3)) and Ni is the number of effect alleles of the ith SNP carried by the individual.
After statistical processing steps, a final total of 534 SNPs were included in the metaPRS calculation, and information and weights for all eligible SNPs are provided in table 4.
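Steps (3) and (4) reduce to two matrix products. The sketch below assumes the SNP-level weight is the sum over sub-phenotypes of (β / σ) × α, which is consistent with the definitions in the text but is not transcribed from the formula image; the function names and array layout are illustrative.

```python
import numpy as np


def snp_weights(alpha, beta, sigma):
    """SNP-level weights from PRS-level weights.

    alpha: SNPs x K matrix of effect values (0 where a SNP is not in score k).
    beta:  K coefficients from the elastic-net model.
    sigma: K standard deviations of the sub-phenotype PRSs in the training set.
    Assumed form: weight_j = sum_k (beta_k / sigma_k) * alpha_jk.
    """
    alpha = np.asarray(alpha, dtype=float)
    return alpha @ (np.asarray(beta, dtype=float) / np.asarray(sigma, dtype=float))


def meta_prs(allele_counts, beta_snp):
    """metaPRS = sum_i beta_SNP_i * N_i for each individual (rows of allele_counts)."""
    return np.asarray(allele_counts, dtype=float) @ np.asarray(beta_snp, dtype=float)
```

Under this assumption the SNP-level score equals the β-weighted sum of the sub-phenotype PRSs each divided by its σ, so an individual can be scored from genotypes alone without recomputing the 14 sub-scores.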
(5) MetaPRS cut point partitioning
Using the 20th and 80th percentiles of the metaPRS of all individuals in the cohort population as cut points, the individual genetic risk of stroke is divided into low, medium and high risk groups.
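A percentile split of this kind can be sketched as follows. How scores exactly equal to a cut point are assigned is not stated in the text; placing them in the medium group is an assumption of this sketch, as is the function name.

```python
import numpy as np


def risk_groups(meta_prs_values):
    """Assign low / medium / high genetic risk from the 20th and 80th percentiles."""
    s = np.asarray(meta_prs_values, dtype=float)
    low_cut, high_cut = np.percentile(s, [20, 80])
    groups = np.where(s < low_cut, "low", np.where(s > high_cut, "high", "medium"))
    return groups, (float(low_cut), float(high_cut))
```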
TABLE 4 information and weights of SNPs determined by the invention
Figure BDA0003009615150000121
Figure BDA0003009615150000131
Figure BDA0003009615150000141
Figure BDA0003009615150000151
Figure BDA0003009615150000161
Figure BDA0003009615150000171
Figure BDA0003009615150000181
Figure BDA0003009615150000191
Figure BDA0003009615150000201
Figure BDA0003009615150000211
Statistical analysis
Continuous variables in the baseline characteristics of the study population are expressed as mean (standard deviation) and categorical variables as frequency (percentage). Study subjects were divided by metaPRS level into low (lowest quintile of metaPRS), medium (quintiles 2-4 of metaPRS) and high (highest quintile of metaPRS) genetic risk groups.
Hazard ratios (HRs) and 95% confidence intervals (CIs) for incident stroke according to genetic risk were estimated using a stratified Cox proportional hazards regression model adjusted for sex, with age as the time scale. The coxph function (R package "survival") was used to plot sex-adjusted cumulative incidence curves and to assess the lifetime risk of stroke up to age 80 for subjects in the different genetic risk strata. All analyses were performed using R software version 3.6.0 (R Foundation for Statistical Computing, Vienna, Austria) or SAS statistical software version 9.4 (SAS Institute Inc, Cary, NC).
Genetic risk groups in the study population
Table 5 shows the baseline characteristics of the 41,006 study subjects in the cohort population. The mean age of the total population was 51.9 (10.6) years, and 43.1% were male. Participants at high genetic risk (top 20% of metaPRS) had a higher prevalence of cardiometabolic risk factors (hypertension, diabetes, dyslipidemia). During 367,750 person-years of follow-up (mean follow-up 9.0 years), 1227 participants had a stroke before the age of 80, including 769 ischemic strokes, 355 hemorrhagic strokes, 21 with both ischemic and hemorrhagic stroke, and 124 strokes of undetermined subtype.
TABLE 5 Baseline characteristics of the prospective cohorts
Figure BDA0003009615150000221
Continuous variables are expressed as mean (standard deviation) and categorical variables as number (percentage).
Construction of polygene genetic risk score and prediction of stroke
The invention first applied four P-value thresholds (5 × 10⁻⁶, 5 × 10⁻⁴, 0.05 and 0.5) to the stroke GWAS results from BioBank Japan to select four different SNP combinations. Stroke PRSs were then calculated in the training set using GWAS results from European populations as the SNP effect values, and the strength of association between each PRS and stroke was evaluated. As shown in FIG. 3, the OR (95% CI) per one-SD increase for each of the four PRSs was significantly lower when effect values from European populations were used than when the stroke GWAS effect values from BioBank Japan were used. Therefore, GWAS effect values from East Asian populations were used to construct the PRS of each sub-phenotype. The strength of association of each candidate sub-phenotype PRS with stroke in the training set is shown in FIG. 4, and the score with the largest OR was selected as the final PRS for that sub-phenotype. Pearson correlation coefficients (r) and P values between the sub-phenotype PRSs were calculated; as shown in FIG. 5, the PRSs were correlated to varying degrees, with strong correlations between the PRSs for diastolic and mean arterial pressure (r = 0.90), systolic and mean arterial pressure (r = 0.86), systolic and diastolic pressure (r = 0.77), and total cholesterol and LDL-C (r = 0.85). The association of the 14 sub-phenotype PRSs with stroke was further evaluated with an elastic-net logistic regression model, which accounts for the correlation among the sub-phenotype PRSs; the ORs estimated by elastic-net logistic regression are compared with those estimated by univariate logistic regression in FIG. 6 (the weights of LDL-C and HDL-C are 0 in FIG. 6).
Finally, the stroke metaPRS was constructed by integrating the 14 sub-phenotype PRSs and was validated in the cohort population.
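The pairwise correlation check between sub-phenotype PRSs can be sketched with NumPy. The function names and the 0.75 reporting threshold are illustrative choices, not values from the patent, and the P values mentioned in the text are not computed here.

```python
import numpy as np


def prs_correlation_matrix(prs_matrix):
    """Pearson correlation between PRS columns (individuals in rows)."""
    return np.corrcoef(np.asarray(prs_matrix, dtype=float), rowvar=False)


def strongly_correlated_pairs(prs_matrix, names, threshold=0.75):
    """List (name_a, name_b, r) for every pair with |r| at or above the threshold."""
    r = prs_correlation_matrix(prs_matrix)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            if abs(r[i, j]) >= threshold:
                pairs.append((names[i], names[j], float(r[i, j])))
    return pairs
```

Strong correlations such as those reported for the blood pressure PRSs are the reason a penalized model is used to combine the scores rather than 14 separate univariate fits.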
During construction of the stroke polygenic genetic risk score, the optimal stroke sub-phenotype (Stroke) PRS identifies a set of stroke-risk loci relevant to East Asian populations, comprising the 280 stroke-related single nucleotide polymorphism sites shown in Table 4. By genotyping these sites and computing the genetic risk score as ∑ βi × Ni, the risk of stroke onset in East Asian populations can be assessed well. The effect value of each stroke-related SNP may be taken from the sub-phenotype PRS column of Table 4 or, alternatively, from the metaPRS column of Table 4. The higher the genetic risk score, the higher the individual's risk of stroke onset.
In addition to detecting the 280 stroke-related SNPs shown in Table 4, the scheme for evaluating stroke onset risk may further selectively detect one or more of the following groups shown in Table 4: 159 CAD-related SNPs, 4 SBP-related SNPs, 1 WC-related SNP, 55 T2D-related SNPs, 22 TC-related SNPs, 9 PP-related SNPs and 4 AF-related SNPs. The genetic risk score is again obtained as ∑ βi × Ni, which allows the stroke onset risk of East Asian populations to be assessed better. When the stroke risk assessment scheme of the invention includes detection of one or more of the CAD-, SBP-, WC-, T2D-, TC-, PP- and AF-related SNP groups, the effect values of these SNPs may all be taken from the sub-phenotype PRS column of Table 4, or preferably all from the metaPRS column of Table 4. The higher the genetic risk score, the higher the individual's risk of stroke onset.
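The group sizes quoted above add up to the 534 SNPs used by the metaPRS, and the optional groups can be switched on or off when scoring. The sketch below uses the counts from the text; the SNP identifiers, weights and function name in the usage are hypothetical placeholders, not entries from Table 4.

```python
# Group sizes as quoted in the text; they sum to 534.
SNP_GROUP_SIZES = {
    "Stroke": 280, "CAD": 159, "SBP": 4, "WC": 1,
    "T2D": 55, "TC": 22, "PP": 9, "AF": 4,
}


def genetic_risk_score(allele_counts, weights, snp_group, selected_groups=("Stroke",)):
    """Sum of beta_i * N_i over SNPs whose group is selected.

    allele_counts: dict SNP id -> number of effect alleles (0, 1 or 2).
    weights:       dict SNP id -> effect value.
    snp_group:     dict SNP id -> group label such as "Stroke" or "CAD".
    """
    score = 0.0
    for snp, n in allele_counts.items():
        if snp_group[snp] in selected_groups:
            score += weights[snp] * n
    return score
```

For example, with hypothetical SNPs `snp_a` and `snp_c` in the Stroke group and `snp_b` in the CAD group, passing `selected_groups=("Stroke", "CAD")` adds the CAD contribution to the stroke-only score.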
In the validation population, the metaPRS comprising the 534 SNPs shown in Table 4 was more strongly associated with stroke than any of the sub-phenotype PRSs. For each one-standard-deviation increase in metaPRS, the HRs (95% CI) for total stroke, ischemic stroke and hemorrhagic stroke were 1.28 (1.21-1.36), 1.29 (1.20-1.39) and 1.30 (1.17-1.45), respectively (FIG. 7). After further adjustment for clinical risk factors, including family history of stroke (Table 6), the HRs for the association between metaPRS and stroke onset decreased only slightly, indicating that the metaPRS of the invention can be used to assess stroke onset risk independently of traditional clinical risk factors.
TABLE 6 Association of metaPRS (per one-standard-deviation increase) with stroke onset, with and without adjustment for clinical risk factors
Figure BDA0003009615150000231
Figure BDA0003009615150000241
Hazard ratios (HRs) and 95% confidence intervals (CIs) were calculated using a Cox proportional hazards regression model stratified by cohort, with age as the time scale, adjusted for sex and with or without adjustment for clinical risk factors.
In the invention, genetic risk stratification is based on the distribution of metaPRS in the total population (Table 7). A metaPRS below -0.140 indicates a low genetic risk of stroke for the individual (metaPRS percentile 0-20%), and a metaPRS above 0.305 indicates a high genetic risk of stroke for the individual (metaPRS percentile 80-100%).
TABLE 7 Lookup table for metaPRS genetic risk stratification
Figure BDA0003009615150000242
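The two thresholds quoted in the text give a direct lookup for a single individual. This is a minimal sketch; the function name is illustrative, and assigning scores exactly equal to a threshold to the medium group is an assumption.

```python
LOW_CUT = -0.140   # metaPRS below this value: low genetic risk (from the text)
HIGH_CUT = 0.305   # metaPRS above this value: high genetic risk (from the text)


def genetic_risk_category(meta_prs: float) -> str:
    """Classify one metaPRS value as low, medium or high genetic risk."""
    if meta_prs < LOW_CUT:
        return "low"
    if meta_prs > HIGH_CUT:
        return "high"
    return "medium"
```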
After the population was divided into quintiles of metaPRS, stroke risk showed a clear gradient across the groups (P for trend < 0.001) (FIG. 8). Individuals at high genetic risk (top 20% of metaPRS) had about twice the risk of developing stroke compared with individuals at low genetic risk (bottom 20% of metaPRS) (HR: 1.99, 95% CI -13 ) The lifetime risk of stroke (risk of developing stroke by age 80) in individuals at high genetic risk was also approximately 2-fold higher than in individuals at low genetic risk (25.2%, 95% CI.
The above genetic risk scores were similar in predictive effect on ischemic and hemorrhagic stroke (figure 10).

Claims (11)

1. A method for constructing a stroke polygenic genetic risk composite score, comprising the following steps:
(1) Selecting a set of single nucleotide polymorphism (SNP) sites that reach genome-wide significance for association with stroke or a stroke-related phenotype; wherein the single nucleotide polymorphism sites related to stroke-related phenotypes comprise: single nucleotide polymorphism sites related to blood pressure, type 2 diabetes, blood lipids, obesity, atrial fibrillation and coronary heart disease;
(2) Genotyping based on the single nucleotide polymorphism sites in step (1);
(3) Extracting, from genome-wide association study results, the risk alleles, effect values and P values of the genotyped SNPs for each of a plurality of sub-phenotypes, wherein the plurality of sub-phenotypes comprises: stroke, coronary heart disease, type 2 diabetes, atrial fibrillation, systolic blood pressure, diastolic blood pressure, mean arterial pressure, pulse pressure, body mass index, waist circumference, total cholesterol, low-density lipoprotein cholesterol, triglycerides and high-density lipoprotein cholesterol; constructing a plurality of candidate sub-phenotype PRSs for each sub-phenotype, namely: a plurality of candidate sub-phenotype PRSs for stroke, a plurality of candidate sub-phenotype PRSs for coronary heart disease, a plurality of candidate sub-phenotype PRSs for type 2 diabetes, a plurality of candidate sub-phenotype PRSs for atrial fibrillation, a plurality of candidate sub-phenotype PRSs for systolic blood pressure, a plurality of candidate sub-phenotype PRSs for diastolic blood pressure, a plurality of candidate sub-phenotype PRSs for mean arterial pressure, a plurality of candidate sub-phenotype PRSs for pulse pressure, a plurality of candidate sub-phenotype PRSs for body mass index, a plurality of candidate sub-phenotype PRSs for waist circumference, a plurality of candidate sub-phenotype PRSs for total cholesterol, a plurality of candidate sub-phenotype PRSs for low-density lipoprotein cholesterol, a plurality of candidate sub-phenotype PRSs for triglycerides, and a plurality of candidate sub-phenotype PRSs for high-density lipoprotein cholesterol; and screening out, respectively, the optimal sub-phenotype PRS for stroke, the optimal sub-phenotype PRS for coronary heart disease, the optimal sub-phenotype PRS for type 2 diabetes, the optimal sub-phenotype PRS for atrial fibrillation, the optimal sub-phenotype PRS for systolic blood pressure, the optimal sub-phenotype PRS for diastolic blood pressure, the optimal sub-phenotype PRS for mean arterial pressure, the optimal sub-phenotype PRS for pulse pressure, the optimal sub-phenotype PRS for body mass index, the optimal sub-phenotype PRS for waist circumference, the optimal sub-phenotype PRS for total cholesterol, the optimal sub-phenotype PRS for low-density lipoprotein cholesterol, the optimal sub-phenotype PRS for triglycerides and the optimal sub-phenotype PRS for high-density lipoprotein cholesterol;
wherein the process of constructing each candidate sub-phenotype PRS comprises the following steps:
dividing the SNPs into multiple groups according to the extracted P values: selecting 3, 4, 5 or 6 of the P-value thresholds 0.5, 0.05, 5 × 10⁻³, 5 × 10⁻⁴, 5 × 10⁻⁵ and 5 × 10⁻⁶ to screen out the corresponding groups of SNPs, and, for each group of SNPs, pruning with the clumping command of the plink software at different linkage disequilibrium r² thresholds, based on cohort population data, to obtain a plurality of SNP combinations;
using the genotype data, weighting each individual's risk allele count for each SNP by the corresponding effect value and summing, to construct a plurality of candidate PRSs comprising different SNP combinations; evaluating the association of each candidate PRS with stroke using a logistic regression model; and selecting the score with the largest odds ratio (OR) as the optimal sub-phenotype PRS; wherein the individual's risk allele count for each SNP is 0, 1 or 2;
(4) Determining a weight for each sub-phenotypic PRS;
(5) Converting the weight of the sub-phenotypic PRS into a weight at the SNP level;
(6) Constructing the stroke polygenic genetic risk composite score metaPRS.
2. The method of claim 1, wherein the stroke polygenic genetic risk composite score is used for assessing stroke onset risk in East Asian populations, and the set of single nucleotide polymorphism sites comprises: single nucleotide polymorphism sites related to stroke and coronary heart disease reported in all populations, and single nucleotide polymorphism sites related to blood pressure, type 2 diabetes, blood lipids, obesity and atrial fibrillation reported in East Asian populations.
3. The method according to claim 1 or 2, wherein in step (2), the cohort population that is genotyped is an East Asian population.
4. The method of claim 3, wherein genotyping is performed using multiplex polymerase chain reaction targeted amplicon sequencing technology.
5. The method according to claim 1, wherein in step (4), the process of determining the weight of each sub-phenotypic PRS comprises:
converting each sub-phenotypic PRS into a normalized score with a mean of 0 and a standard deviation of 1;
using a training set, entering the standardized sub-phenotype PRSs together with the covariates to be adjusted for into an elastic-net logistic regression model, selecting the model with the highest AUC as the final model, and taking the coefficients of the PRSs in the final model (β1, …, β14) as the weights.
6. The method according to claim 1, wherein the process of converting the weight of the sub-phenotypic PRS into the weight of the SNP level in step (5) is performed according to the following model:
Figure FDA0003901258330000021
wherein σ1, …, σ14 are the standard deviations of the sub-phenotype PRSs in the training set and αj1, …, αj14 are the effect values of the jth SNP for each sub-phenotype; if a SNP is not included in the kth score, its effect value αjk is set to 0.
7. The method according to claim 1, wherein in step (6), the constructed stroke polygenic genetic risk composite score metaPRS is:
metaPRS=∑βsnp_i×Ni
wherein βsnp_i is the effect value of the ith SNP, and Ni is the number of effect alleles of the ith SNP carried by the individual.
8. The method of claim 7, wherein the individual genetic risk of stroke is divided into low, medium and high risk groups using the 20th and 80th percentiles of the metaPRS of all individuals in the cohort population as cut points.
9. An apparatus for constructing a stroke polygenic genetic risk composite score, which performs the method for constructing the stroke polygenic genetic risk composite score as claimed in claim 1, the apparatus comprising:
a genotyping module for genotyping each SNP in the set of single nucleotide polymorphism sites recited in claim 1;
a sub-phenotype PRS construction module for extracting risk alleles, effect values and P values of the detected SNP corresponding to a plurality of sub-phenotypes from the whole genome association research results respectively, wherein the plurality of sub-phenotypes comprises: stroke, coronary heart disease, type 2 diabetes mellitus, atrial fibrillation, systolic pressure, diastolic pressure, mean arterial pressure, pulse pressure, body mass index, waist circumference, total cholesterol, low density lipoprotein cholesterol, triglyceride and high density lipoprotein cholesterol, and respectively constructing candidate sub-phenotype PRSs and screening optimal sub-phenotype PRSs for each sub-phenotype;
a model training module for determining a weight for each sub-phenotypic PRS in a training set;
a metaPRS construction module for converting the weights of the sub-phenotype PRSs into SNP-level weights and constructing the stroke polygenic genetic risk composite score metaPRS.
10. The apparatus of claim 9, wherein the metaPRS construction module is further configured to evaluate the effect of the constructed metaPRS on stroke risk prediction and stratification.
11. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor executes the computer program to perform the method of any one of claims 1 to 8 to evaluate stroke risk of an individual.
CN202110371906.4A 2021-04-07 2021-04-07 Method and device for constructing stroke polygene genetic risk comprehensive score and application Active CN113012761B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110371906.4A CN113012761B (en) 2021-04-07 2021-04-07 Method and device for constructing stroke polygene genetic risk comprehensive score and application


Publications (2)

Publication Number Publication Date
CN113012761A CN113012761A (en) 2021-06-22
CN113012761B true CN113012761B (en) 2023-02-03

Family

ID=76387997


Country Status (1)

Country Link
CN (1) CN113012761B (en)




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant