CN113506594B - Construction method, device and application of polygene genetic risk comprehensive score of coronary heart disease - Google Patents

Publication number
CN113506594B
Authority
CN
China
Prior art keywords
sub
heart disease
coronary heart
phenotype
single nucleotide
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110579230.8A
Other languages
Chinese (zh)
Other versions
CN113506594A (en
Inventor
顾东风
鲁向锋
黄建凤
王来元
陈恕凤
刘钟应
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Fuwai Hospital of CAMS and PUMC
Original Assignee
Fuwai Hospital of CAMS and PUMC
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Fuwai Hospital of CAMS and PUMC
Priority to CN202110579230.8A (CN113506594B)
Publication of CN113506594A
Priority to PCT/CN2022/095221 (WO2022247903A1)
Application granted
Publication of CN113506594B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B20/00 ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
    • G16B20/40 Population genetics; Linkage disequilibrium
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B30/00 ICT specially adapted for sequence analysis involving nucleotides or amino acids
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16B BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
    • G16B40/00 ICT specially adapted for biostatistics; ICT specially adapted for bioinformatics-related machine learning or data mining, e.g. knowledge discovery or pattern finding

Landscapes

  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Biotechnology (AREA)
  • Theoretical Computer Science (AREA)
  • Biophysics (AREA)
  • Spectroscopy & Molecular Physics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • General Health & Medical Sciences (AREA)
  • Chemical & Material Sciences (AREA)
  • Analytical Chemistry (AREA)
  • Proteomics, Peptides & Aminoacids (AREA)
  • Genetics & Genomics (AREA)
  • Ecology (AREA)
  • Molecular Biology (AREA)
  • Physiology (AREA)
  • Artificial Intelligence (AREA)
  • Bioethics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Epidemiology (AREA)
  • Evolutionary Computation (AREA)
  • Public Health (AREA)
  • Software Systems (AREA)
  • Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)

Abstract

The invention provides a method and a device for constructing a polygenic genetic risk comprehensive score (metaPRS) for coronary heart disease, and applications thereof. The construction method comprises the following steps: screening a set of SNPs associated with coronary heart disease and/or with coronary heart disease-related phenotypes; genotyping the individual at the SNPs to be tested; extracting, from genome-wide association study (GWAS) results, the risk alleles, effect values and P values of the tested SNPs for each of a plurality of sub-phenotypes, constructing a plurality of candidate PRSs for each sub-phenotype and selecting the optimal sub-phenotype PRS; determining a weight for each sub-phenotype PRS; converting the weights of the sub-phenotype PRSs into SNP-level weights; and constructing the coronary heart disease polygenic genetic risk comprehensive score, metaPRS. The invention is of significance for predicting the risk of coronary heart disease onset and for refined risk stratification.

Description

Construction method, device and application of polygenic genetic risk comprehensive score of coronary heart disease
Technical Field
The invention relates to a method and a device for constructing a polygenic genetic risk comprehensive score (metaPRS) of coronary heart disease and application thereof.
Background
The development of cardiovascular disease (CVD) is influenced by a combination of genetic and environmental factors.
In the primary prevention of cardiovascular disease, risk prediction and assessment play a crucial role. Genetic factors, as stable and quantifiable lifelong markers, have long been expected to be useful in disease risk assessment and to promote the precise prevention of cardiovascular disease. Over the past 10 years, genome-wide association studies have identified hundreds of regions significantly associated with coronary heart disease and with coronary heart disease-related phenotypes (blood lipid levels, blood pressure, type 2 diabetes and BMI). Recently, coronary heart disease polygenic genetic risk scores (PRS) that integrate information on multiple genetic variants have been developed and assessed for their clinical utility in coronary heart disease risk prediction (Eur. Heart J. 37, 561-567 (2016); Nat. Genet. 50, 1219-1224 (2018); J. Am. Coll. Cardiol. 72, 1883-1893 (2018); Eur. Heart J. 37, 3267-3278 (2016); JAMA 323, 627-635 (2020); JAMA 323, 636-645 (2020); JAMA Cardiol. 3, 693-702 (2018); N. Engl. J. Med. 375, 2349-2358 (2016)). However, almost all of these genetic scores were constructed in European populations. First, differences in allele frequencies and in linkage disequilibrium patterns between populations mean that scores developed in European populations cannot be applied directly to East Asian and Chinese populations. Second, heterogeneity can also result from differences in lifestyle, in other risk factors, and in potential gene-environment interactions among populations. Studies have reported that the predictive performance of these genetic scores declines significantly in other ethnic groups.
Therefore, there is an urgent need to develop genetic risk scores for East Asian populations, particularly the Chinese population.
Disclosure of Invention
The invention aims to provide a method for constructing a polygenic genetic risk score for coronary heart disease.
The invention also aims to provide a device for constructing the polygenic genetic risk score for coronary heart disease.
Specifically, in one aspect, the invention provides a method for constructing a polygenic genetic risk comprehensive score for coronary heart disease, the method comprising the following steps:
(1) Screening a set of single nucleotide polymorphism (SNP) sites associated, at genome-wide significance, with coronary heart disease or with a coronary heart disease-related phenotype; wherein the coronary heart disease-related phenotypes comprise: blood pressure, type 2 diabetes, blood lipids, obesity and stroke;
(2) Genotyping the single nucleotide polymorphism sites of step (1);
(3) Extracting, from genome-wide association study (GWAS) results, the risk alleles, effect values and P values of the genotyped SNPs for each of a plurality of sub-phenotypes, constructing a plurality of candidate PRSs for each sub-phenotype and selecting the optimal sub-phenotype PRS;
(4) Determining a weight for each sub-phenotype PRS;
(5) Converting the weights of the sub-phenotype PRSs into SNP-level weights;
(6) Constructing the coronary heart disease polygenic genetic risk comprehensive score, metaPRS.
According to a specific embodiment of the invention, in the method for constructing the polygenic genetic risk score for coronary heart disease, the coronary heart disease-related phenotype blood pressure comprises: systolic blood pressure, diastolic blood pressure, pulse pressure, mean arterial pressure and hypertension; the coronary heart disease-related phenotype obesity comprises: body mass index, waist circumference and waist-to-hip ratio; and the coronary heart disease-related phenotype blood lipids comprises: total cholesterol, low density lipoprotein cholesterol, triglycerides and high density lipoprotein cholesterol.
According to a specific embodiment of the present invention, in the method for constructing a polygenic genetic risk score for coronary heart disease, the plurality of sub-phenotypes comprises: coronary heart disease, body mass index, blood pressure, type 2 diabetes, total cholesterol, low density lipoprotein cholesterol, triglycerides, high density lipoprotein cholesterol, and stroke. That is, the plurality of candidate sub-phenotype PRSs constructed in the method are PRSs for: coronary heart disease, stroke, type 2 diabetes, blood pressure, body mass index, total cholesterol, low density lipoprotein cholesterol, triglycerides, and high density lipoprotein cholesterol.
According to a specific embodiment of the invention, in the method for constructing the polygenic genetic risk score for coronary heart disease, the SNP sites in the set are sites found in genome-wide association studies to be associated, at genome-wide significance, with coronary heart disease or with a coronary heart disease-related phenotype (coronary heart disease-related risk factor). Specifically, the set includes SNP sites associated with coronary heart disease or stroke and SNP sites associated with blood pressure, type 2 diabetes, blood lipids and obesity, and optionally further includes SNP sites associated with clinical phenotypes of arteriosclerosis. According to a specific embodiment of the invention, the coronary heart disease polygenic genetic risk score is used to evaluate the risk of coronary heart disease onset in East Asian populations; the SNP sites in the set that are associated with coronary heart disease may come from all populations, such as European and East Asian populations, whereas the SNP sites associated with blood pressure, type 2 diabetes, blood lipids, obesity and clinical phenotypes of arteriosclerosis may come mainly from East Asian populations.
According to a specific embodiment of the invention, in the method for constructing the polygenic genetic risk score for coronary heart disease, the cohort population that is genotyped is an East Asian population.
According to a specific embodiment of the invention, in the method for constructing the polygenic genetic risk score for coronary heart disease, genotyping is performed by multiplex polymerase chain reaction (PCR) targeted amplicon sequencing. The median sequencing depth was 982×.
According to a specific embodiment of the invention, in the method for constructing the polygenic genetic risk score for coronary heart disease, SNPs with a genotype call rate lower than 95% can be excluded during genotyping, so that a set of qualified SNPs is obtained.
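The call-rate filter described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `filter_by_call_rate`, the use of `None` for a missing genotype and the toy SNP identifiers are all assumptions made for the example.

```python
def filter_by_call_rate(genotypes, min_call_rate=0.95):
    """Return the SNP IDs whose call rate is at least min_call_rate.

    genotypes maps a SNP ID to a list of per-sample calls, where None marks
    a missing genotype (hypothetical encoding chosen for this sketch).
    """
    kept = []
    for snp_id, calls in genotypes.items():
        if not calls:
            continue  # no samples, so no call rate can be computed
        called = sum(1 for g in calls if g is not None)
        if called / len(calls) >= min_call_rate:
            kept.append(snp_id)
    return kept


# Toy data with 20 samples per SNP (hypothetical identifiers).
toy = {
    "snp_a": [0, 1, 2, 1] * 5,                      # call rate 1.00
    "snp_b": [0, 1, None, 1, 2, None] + [1] * 14,   # call rate 0.90, excluded
    "snp_c": [None] + [2] * 19,                     # call rate 0.95, kept
}
qualified = filter_by_call_rate(toy)
```

Here a SNP with a call rate of exactly 95% is retained, because the text excludes only SNPs whose rate is lower than 95%.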
According to a specific embodiment of the invention, in the method for constructing the polygenic genetic risk score for coronary heart disease, the risk alleles, effect values and P values of the genotyped SNPs for the plurality of sub-phenotypes are each extracted from the results of large-scale genome-wide association studies in East Asian populations.
According to a specific embodiment of the present invention, in the method for constructing a polygenic genetic risk score for coronary heart disease, the process of constructing each sub-phenotype PRS includes:
dividing the SNPs into multiple groups according to the extracted P values and, for each group of SNPs, pruning at r² < 0.2 with the clumping command of the PLINK software, based on the cohort population data, to obtain multiple groups of SNP combinations;
using genotype data, weighting the number of risk alleles of each SNP carried by an individual (0, 1 or 2) by the corresponding effect value and summing to construct a plurality of candidate PRSs containing different SNP combinations, assessing the association of these candidate PRSs with coronary heart disease with a logistic regression model, and selecting the score with the largest odds ratio (OR, per one standard deviation increase in PRS) as the best sub-phenotype PRS.
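The pruning step can be pictured with a simplified greedy clumping routine. This sketch is not PLINK's implementation: it ignores the physical distance window that PLINK applies, it assumes the pairwise r² values have already been computed, and the function name, the data layout and the choice of `<=` at the P-value threshold are all assumptions made for illustration.

```python
def clump(pvalues, r2, p_threshold, r2_threshold=0.2):
    """Greedy clumping: keep the most significant SNP, drop SNPs in LD with it.

    pvalues: dict mapping SNP ID -> GWAS P value.
    r2: dict mapping frozenset({snp_i, snp_j}) -> pairwise r^2; pairs that
        are absent are treated as r^2 = 0 (assumption of this sketch).
    """
    # Candidates under the P threshold, most significant first.
    candidates = sorted(
        (s for s, p in pvalues.items() if p <= p_threshold),
        key=lambda s: (pvalues[s], s),
    )
    kept = []
    for snp in candidates:
        in_ld = any(
            r2.get(frozenset((snp, k)), 0.0) >= r2_threshold for k in kept
        )
        if not in_ld:
            kept.append(snp)
    return kept


# Toy example with hypothetical SNP identifiers.
pvalues = {"snp_a": 1e-8, "snp_b": 1e-4, "snp_c": 0.03, "snp_d": 0.3}
r2 = {
    frozenset(("snp_a", "snp_b")): 0.6,  # snp_b is in LD with snp_a
    frozenset(("snp_c", "snp_d")): 0.1,  # below the 0.2 cut-off
}
strict = clump(pvalues, r2, p_threshold=0.05)
lenient = clump(pvalues, r2, p_threshold=0.5)
```

Running the routine once per P-value threshold gives one SNP combination per threshold, which is how the text arrives at several candidate scores for a single sub-phenotype.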
According to a more specific embodiment of the present invention, in the above process of constructing each sub-phenotype PRS, N groups of SNPs can be formed according to the extracted P values, where N is greater than or equal to 2. For example, the P-value thresholds may be 0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.01, 10⁻³, 10⁻⁴, 10⁻⁵, 10⁻⁶ and 10⁻⁷, from which 9, 10, 11 or 12 groups are selected.
According to a more specific embodiment of the present invention, in the above process of constructing each sub-phenotype PRS, when N groups of SNPs are formed according to the extracted P values, pruning at linkage disequilibrium r² < 0.2 yields N groups of SNP combinations, from which N candidate PRSs containing different SNP combinations can be constructed.
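The construction and selection of candidate PRSs described above can be sketched as follows. The sketch makes several simplifying assumptions that are not in the source: the SNPs are taken to be already pruned for linkage disequilibrium, the logistic regression is univariate (the age and sex adjustment is omitted), a SNP enters a candidate score when its P value is strictly below the threshold, and all names and toy data are hypothetical.

```python
import math

# P-value thresholds listed in the text.
P_THRESHOLDS = [0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.01,
                1e-3, 1e-4, 1e-5, 1e-6, 1e-7]


def prs(dosages, weights):
    """Weighted sum of risk-allele counts (0, 1 or 2) for one individual."""
    return sum(w * dosages[snp] for snp, w in weights.items())


def odds_ratio_per_sd(scores, cases, n_iter=25):
    """OR per 1-SD increase of the score, from a univariate logistic model.

    Fitted by Newton-Raphson on intercept b0 and slope b1.
    """
    n = len(scores)
    mean = sum(scores) / n
    sd = math.sqrt(sum((s - mean) ** 2 for s in scores) / (n - 1))
    z = [(s - mean) / sd for s in scores]
    b0 = b1 = 0.0
    for _ in range(n_iter):
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(z, cases):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return math.exp(b1)


def best_candidate(genotypes, cases, effects, pvalues, thresholds=P_THRESHOLDS):
    """Return (threshold, OR per SD) of the candidate PRS with the largest OR."""
    best = None
    for t in thresholds:
        weights = {s: effects[s] for s in effects if pvalues[s] < t}
        scores = [prs(d, weights) for d in genotypes]
        if len(set(scores)) < 2:
            continue  # an empty or constant score carries no information
        or_sd = odds_ratio_per_sd(scores, cases)
        if best is None or or_sd > best[1]:
            best = (t, or_sd)
    return best


# Toy data: snp_a is informative, snp_b dilutes the signal when included.
genotypes = [
    {"snp_a": 2, "snp_b": 0}, {"snp_a": 2, "snp_b": 0},
    {"snp_a": 1, "snp_b": 0}, {"snp_a": 0, "snp_b": 1},
    {"snp_a": 1, "snp_b": 2}, {"snp_a": 1, "snp_b": 2},
    {"snp_a": 0, "snp_b": 1}, {"snp_a": 0, "snp_b": 1},
]
cases = [1, 1, 1, 1, 0, 0, 0, 0]
effects = {"snp_a": 0.2, "snp_b": 0.2}
pvalues = {"snp_a": 1e-9, "snp_b": 0.35}
best_threshold, best_or = best_candidate(genotypes, cases, effects, pvalues)
```

In this toy data the thresholds 0.5 and 0.4 admit both SNPs and give an OR below 1, while every stricter threshold keeps only `snp_a`; ties between identical SNP sets are resolved in favour of the first threshold encountered.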
In the invention, the pairwise correlation coefficients r and the corresponding P values between all the sub-phenotype PRSs can further be calculated by Pearson correlation analysis.
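A pairwise Pearson correlation between sub-phenotype PRSs can be computed as in the sketch below. Only the coefficient r is computed; the P value mentioned in the text is omitted because it needs a t-distribution that the Python standard library does not provide. The phenotype labels and score values are hypothetical.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def correlation_matrix(prs_by_phenotype):
    """Pairwise r between all sub-phenotype PRSs, keyed by (name_a, name_b)."""
    names = list(prs_by_phenotype)
    return {
        (a, b): pearson_r(prs_by_phenotype[a], prs_by_phenotype[b])
        for a in names
        for b in names
    }


# Toy scores for four individuals (hypothetical values).
toy_prs = {
    "CAD": [1.0, 2.0, 3.0, 4.0],
    "LDL-C": [2.0, 4.0, 6.0, 8.0],
    "HDL-C": [4.0, 3.0, 2.0, 1.0],
}
matrix = correlation_matrix(toy_prs)
```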
According to a specific embodiment of the invention, in the method for constructing the polygenic genetic risk score for coronary heart disease, a part of the whole cohort population can be selected according to a predetermined proportion as a training set (the remainder can be used as a validation set). The construction of the sub-phenotype PRSs and the determination of the weight of each sub-phenotype PRS are carried out in the training set.
According to a specific embodiment of the present invention, in the method for constructing a polygenic genetic risk score for coronary heart disease, the process of determining the weight of each sub-phenotype PRS includes:
converting each sub-phenotype PRS into a standardized score with a mean of 0 and a standard deviation of 1;
using the training set, entering the standardized sub-phenotype PRSs together with the covariates to be adjusted for (age and sex) into an elastic net logistic regression model, selecting the model with the highest AUC as the final model, and taking the coefficients of the PRSs in the final model (β1, ..., βn; n PRSs in total) as the weights.
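The two steps above can be sketched with a small elastic net logistic regression fitted by proximal gradient descent. This is an illustrative stand-in, not the software used in the patent: the penalty parameterisation (`lam`, `l1_ratio`), the learning rate, the use of the population standard deviation and the toy scores are assumptions, the age and sex covariates are left out, and the search over penalty settings for the highest AUC is not shown.

```python
import math


def standardize(values):
    """Rescale to mean 0 and SD 1; also return the original SD."""
    n = len(values)
    mean = sum(values) / n
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / n)
    return [(v - mean) / sd for v in values], sd


def soft_threshold(value, cut):
    """Proximal operator of the L1 penalty."""
    if value > cut:
        return value - cut
    if value < -cut:
        return value + cut
    return 0.0


def fit_elastic_net_logistic(X, y, lam, l1_ratio, lr=0.1, n_iter=2000):
    """Elastic net penalised logistic regression by proximal gradient descent.

    Penalty: lam * (l1_ratio * |b|_1 + (1 - l1_ratio) / 2 * |b|_2^2);
    the intercept is not penalised.
    """
    n, p = len(X), len(X[0])
    b0, b = 0.0, [0.0] * p
    for _ in range(n_iter):
        # Residuals (predicted probability minus label) at the current fit.
        resid = []
        for row, label in zip(X, y):
            eta = b0 + sum(bj * xj for bj, xj in zip(b, row))
            resid.append(1.0 / (1.0 + math.exp(-eta)) - label)
        b0 -= lr * sum(resid) / n
        for j in range(p):
            grad = sum(r * row[j] for r, row in zip(resid, X)) / n
            grad += lam * (1.0 - l1_ratio) * b[j]  # ridge part of the penalty
            b[j] = soft_threshold(b[j] - lr * grad, lr * lam * l1_ratio)
    return b0, b


# Toy training set with two sub-phenotype PRSs (hypothetical values).
prs_cad = [1.2, 0.8, 0.5, -0.6, 0.4, -0.4, -0.9, -1.0]
prs_bmi = [0.5, -0.5, 0.5, -0.5, 0.5, -0.5, 0.5, -0.5]
labels = [1, 1, 1, 1, 0, 0, 0, 0]
z_cad, sd_cad = standardize(prs_cad)
z_bmi, sd_bmi = standardize(prs_bmi)
X = [list(row) for row in zip(z_cad, z_bmi)]
intercept, betas = fit_elastic_net_logistic(X, labels, lam=0.05, l1_ratio=0.5)
_, betas_sparse = fit_elastic_net_logistic(X, labels, lam=10.0, l1_ratio=1.0)
```

With a heavy pure-L1 penalty every coefficient is shrunk to exactly zero, which is the mechanism by which the elastic net can drop a sub-phenotype PRS that adds little once the other, correlated scores are in the model.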
In some embodiments of the invention, an elastic net logistic regression model, which accounts for the correlation between the individual sub-phenotype PRSs, is used to evaluate the association of the 9 (i.e., n is 9) sub-phenotype PRSs with coronary heart disease, and the OR values estimated by elastic net logistic regression are compared with the OR values estimated by univariate logistic regression. Further, the invention constructs and validates the coronary heart disease metaPRS by integrating the 9 sub-phenotype PRSs and converting the weights of the sub-phenotype PRSs into SNP-level weights.
According to a specific embodiment of the invention, in the method for constructing the polygenic genetic risk score for coronary heart disease, the weights of the sub-phenotype PRSs are converted into SNP-level weights according to the following model:
$$\beta_{\mathrm{snp},i} = \frac{\beta_1}{\sigma_1}\alpha_{i1} + \frac{\beta_2}{\sigma_2}\alpha_{i2} + \cdots + \frac{\beta_n}{\sigma_n}\alpha_{in}$$
wherein β1, ..., βn are the weights of the n sub-phenotype PRSs, σ1, ..., σn are the standard deviations of the sub-phenotype PRSs (n in total) in the training set, and α_i1, ..., α_in are the effect values of the i-th SNP for each sub-phenotype; if a SNP is not included in the k-th score, its effect value α_ik is set to 0.
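The conversion to SNP-level weights can be sketched as below, reading the definitions above as: each SNP's weight is the sum, over the sub-phenotype PRSs, of the PRS weight divided by the PRS standard deviation and multiplied by the SNP's effect value for that sub-phenotype. The function name, the dictionary layout and all numbers are hypothetical.

```python
def snp_level_weights(prs_betas, prs_sds, snp_effects):
    """Combine sub-phenotype PRS weights into one weight per SNP.

    prs_betas: dict phenotype -> PRS weight (beta_k).
    prs_sds: dict phenotype -> training-set SD of that PRS (sigma_k).
    snp_effects: dict SNP ID -> dict phenotype -> effect value (alpha_ik);
        a phenotype missing for a SNP counts as an effect value of 0.
    """
    weights = {}
    for snp, alphas in snp_effects.items():
        weights[snp] = sum(
            prs_betas[k] / prs_sds[k] * alphas.get(k, 0.0) for k in prs_betas
        )
    return weights


# Toy example with two sub-phenotypes and two SNPs (hypothetical values).
betas = {"CAD": 0.4, "T2D": 0.1}
sds = {"CAD": 2.0, "T2D": 0.5}
effects = {
    "snp_a": {"CAD": 0.1, "T2D": 0.05},  # used by both scores
    "snp_b": {"CAD": 0.2},               # absent from the T2D score
}
snp_weights = snp_level_weights(betas, sds, effects)
# snp_a: 0.4/2.0*0.1 + 0.1/0.5*0.05 = 0.03; snp_b: 0.4/2.0*0.2 = 0.04
```

Dividing by the standard deviation undoes the standardization applied before model fitting, so the resulting weights can be applied directly to raw allele counts.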
According to a specific embodiment of the invention, in the method for constructing the polygenic genetic risk score for coronary heart disease, the constructed coronary heart disease polygenic genetic risk comprehensive score metaPRS is as follows:
$$\mathrm{metaPRS} = \sum_{i} \beta_{\mathrm{snp},i} \times N_i$$
wherein β_snp,i is the effect value of the i-th SNP, and N_i is the number of effect alleles of the i-th SNP carried by the individual.
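As a worked illustration of the score itself, the sketch below sums the SNP-level effect values multiplied by the number of effect alleles an individual carries. The SNP identifiers and weights are hypothetical, and the check that counts are 0, 1 or 2 is an added safeguard rather than something stated in the text.

```python
def meta_prs(snp_weights, allele_counts):
    """metaPRS for one individual: sum of SNP weight x effect-allele count."""
    total = 0.0
    for snp, weight in snp_weights.items():
        count = allele_counts[snp]
        if count not in (0, 1, 2):
            raise ValueError(f"allele count for {snp} must be 0, 1 or 2")
        total += weight * count
    return total


# Hypothetical SNP-level weights and one individual's effect-allele counts.
weights = {"snp_a": 0.03, "snp_b": 0.04, "snp_c": -0.01}
individual = {"snp_a": 2, "snp_b": 1, "snp_c": 0}
score = meta_prs(weights, individual)  # 0.03*2 + 0.04*1 + (-0.01)*0
```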
According to a specific embodiment of the invention, the method for constructing the polygenic genetic risk comprehensive score for coronary heart disease can further comprise evaluating the effect of the constructed metaPRS on the prediction and stratification of coronary heart disease risk.
According to a specific embodiment of the invention, in the method for constructing the polygenic genetic risk score for coronary heart disease, the 20th and 80th percentiles of the metaPRS of all individuals in the cohort population are preferably used as cut points to divide individuals into low, intermediate and high genetic risk groups for coronary heart disease onset.
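The percentile-based grouping can be sketched with the standard library. The source does not say which percentile definition is used, so the `inclusive` interpolation method of `statistics.quantiles` is an assumption, as are the group labels and the toy scores.

```python
from statistics import quantiles


def risk_groups(scores):
    """Assign each score to low (<P20), intermediate or high (>P80) risk."""
    cuts = quantiles(scores, n=5, method="inclusive")  # P20, P40, P60, P80
    p20, p80 = cuts[0], cuts[-1]
    groups = []
    for s in scores:
        if s < p20:
            groups.append("low")
        elif s > p80:
            groups.append("high")
        else:
            groups.append("intermediate")
    return groups, (p20, p80)


# Toy cohort: 101 individuals with metaPRS values 0, 1, ..., 100.
toy_scores = list(range(101))
groups, (p20, p80) = risk_groups(toy_scores)
```

Individuals whose score equals a cut point fall in the intermediate group here, matching the grouping "< 20%, 20%-80%, > 80%" used in the figure legends.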
In another aspect, the present invention further provides a device for constructing a polygenic genetic risk comprehensive score for coronary heart disease, the device comprising:
a genotyping module for genotyping;
a sub-phenotype PRS construction module for extracting, from genome-wide association study results, the risk alleles, effect values and P values of the genotyped SNPs for each of a plurality of sub-phenotypes, constructing candidate sub-phenotype PRSs and selecting the optimal sub-phenotype PRSs;
a model training module for determining a weight for each sub-phenotype PRS in a training set;
a metaPRS construction module for converting the weights of the sub-phenotype PRSs into SNP-level weights and constructing the coronary heart disease polygenic genetic risk comprehensive score (metaPRS).
According to a specific embodiment of the invention, the device for constructing the polygenic genetic risk comprehensive score for coronary heart disease can optionally further comprise an SNP screening module for screening a set of single nucleotide polymorphism (SNP) sites associated with coronary heart disease or with a coronary heart disease-related phenotype.
According to a specific embodiment of the invention, in the device for constructing the polygenic genetic risk comprehensive score for coronary heart disease, the genotyping module can also be used to exclude, after genotyping, SNPs with a genotype call rate lower than 95%.
According to a specific embodiment of the invention, in the device for constructing the polygenic genetic risk comprehensive score for coronary heart disease, the metaPRS construction module can optionally further be used to evaluate the effect of the constructed metaPRS on the prediction and stratification of coronary heart disease risk.
In another aspect, the present invention further provides a computer device, which includes a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor, when executing the computer program, evaluates an individual's risk of coronary heart disease onset using the polygenic genetic risk comprehensive score for coronary heart disease constructed by the method of the present invention.
In a specific embodiment of the invention, in order to accurately estimate the effect values of the association between genetic variation and CAD (coronary artery disease) onset in the East Asian population, the invention carried out a genome-wide association study in 51,531 coronary heart disease cases and 215,934 controls. Genetic information on 9 phenotypes, namely coronary heart disease and its related phenotypes, was then integrated to construct the polygenic genetic risk score in 2800 coronary heart disease cases and 2055 healthy controls, and the score was finally validated and evaluated in a prospective cohort of 41,271 Chinese individuals. The constructed polygenic genetic risk score has good predictive value for the occurrence of coronary heart disease. Individuals at high genetic risk (the top 20% of the score) were found to have about 3 times the risk of developing coronary heart disease of individuals at low genetic risk (the bottom 20% of the score) (HR: 2.93), and the predictive effect was similar in men and women. The research shows that the polygenic genetic risk comprehensive score enables refined stratification of coronary heart disease risk, and that the method has important application prospects in constructing the polygenic genetic risk comprehensive score for coronary heart disease and in the primary prevention of coronary heart disease.
Drawings
FIG. 1 is a flow chart of the study of the present invention. PRS, polygenic risk score.
Figure 2 shows the sequencing depth of the 588 successfully genotyped variant sites.
Figure 3 shows the association with coronary heart disease, in the training set, of coronary heart disease PRSs calculated with effect values from East Asian and from European and American GWAS. Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated using logistic regression models adjusted for age and sex. Scores were calculated using, as SNP weights, the effect values from the East Asian population and from the European UK Biobank coronary heart disease GWAS data, respectively. Different P-value thresholds (0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.01, 10⁻³, 10⁻⁴, 10⁻⁵, 10⁻⁶, 10⁻⁷) were set to construct 12 PRSs containing different SNP combinations (linkage disequilibrium r² < 0.2).
FIG. 4 shows the association of the sub-phenotype PRSs (per one standard deviation increase) with CAD in the training set at different P-value thresholds. Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated using logistic regression adjusted for age and sex.
FIG. 5 is a correlation plot of the sub-phenotype PRSs and metaPRS in the prospective cohort. *P < 0.05, **P < 10⁻³, ***P < 10⁻¹⁰.
FIG. 6 shows the association of the sub-phenotype polygenic risk scores (per one standard deviation increase) with coronary heart disease in the training set. Odds ratios (ORs) and 95% confidence intervals (CIs) were calculated using logistic regression and elastic net logistic regression, respectively, adjusted for age and sex.
Figure 7 shows the hazard ratios of metaPRS and the sub-phenotype PRSs (per one standard deviation increase) for CAD onset in the prospective cohort. The analysis used a Cox model adjusted for cohort source and sex, with age as the time scale.
Figure 8 shows the relative and absolute risk of coronary heart disease onset in the different genetic risk groups (< 20%, 20%-80%, > 80%). The HRs and 95% CIs of the different genetic risk groups and the cumulative incidence of coronary heart disease were estimated using a Cox model adjusted for sex and cohort source, with age as the time scale and taking competing risks into account. The dotted lines represent 95% CIs. CAD, coronary heart disease; HR, hazard ratio; CI, confidence interval.
Figure 9 shows the relative and absolute risk of coronary heart disease onset in the different genetic risk groups (< 20%, 20%-80%, > 80%), stratified by sex. The HRs and 95% CIs of the different genetic risk groups and the cumulative incidence of coronary heart disease were estimated using a Cox model adjusted for cohort source, with age as the time scale and taking competing risks into account. The dotted lines represent 95% CIs. CAD, coronary heart disease; HR, hazard ratio; CI, confidence interval.
Detailed Description
For a clearer understanding of the technical features, objects and advantages of the present invention, the technical solutions of the invention are described in detail below with reference to specific embodiments. These examples are provided for illustration only and are not intended to limit the scope of the invention. Changes and/or modifications within the spirit of the invention that are readily contemplated by those skilled in the art are deemed to fall within its scope. In the examples, all reagents and raw materials are commercially available, and experimental methods for which no specific conditions are given follow conventional methods and conditions well known in the art, or the conditions recommended by the instrument manufacturer.
Example 1
Research design process and research population
The study design flow is shown in Figure 1. The present inventors developed a polygenic risk score (PRS) for CAD in 2800 CAD patients and 2055 healthy controls (Table 1) and then validated it in a large-scale prospective cohort. The CAD cases in the training set came from Fuwai Hospital, Chinese Academy of Medical Sciences. The diagnosis of myocardial infarction (MI) strictly followed diagnostic criteria based on signs, symptoms, electrocardiogram and cardiac enzyme activity. Coronary heart disease was diagnosed on the basis of a previously diagnosed history of myocardial infarction, stenosis of more than 50% of the left main coronary artery, or stenosis of more than 70% in at least one major epicardial vessel.
The validation cohort consisted of three sub-cohorts from the China-PAR study: the Chinese Multi-center Collaborative Study of Cardiovascular Health (InterASIA), the China Multi-Center Collaborative Study of Cardiovascular Epidemiology (China MUCA-1998), and the Community Intervention of Metabolic Syndrome in China and Chinese Family Health Study (CIMIC) (Yang, X. et al. Predicting the 10-Year Risks of Atherosclerotic Cardiovascular Disease in Chinese Population: The China-PAR Project (Prediction for ASCVD Risk in China). Circulation 134, 1430-1440 (2016)). Briefly, the China MUCA-1998, InterASIA and CIMIC baselines were established in 1998, 2000-2001 and 2007-2008, respectively. The first follow-up of InterASIA and China MUCA-1998 was performed in 2007-2008, and all three cohorts were followed up according to unified standards in 2012-2015 and 2018-2020. In this study, blood samples and primary covariate data were collected from a total of 43,582 participants independent of the training set. A final total of 41,271 participants were enrolled after excluding 561 individuals with a high genotype missing rate (> 5.0%) or a low mean sequencing depth (< 30×), 1352 individuals aged < 30 or > 75 years at baseline, and 398 individuals with confirmed coronary heart disease at baseline.
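The enrolment arithmetic can be checked with a few lines, and the stated exclusion criteria can be written as a per-participant rule. The function name and argument names are hypothetical, and the sketch assumes that the three exclusion counts refer to disjoint groups of individuals, which the totals in the text imply.

```python
def is_eligible(missing_rate, mean_depth, age, chd_at_baseline):
    """Apply the exclusion criteria stated in the text to one participant."""
    if missing_rate > 0.05 or mean_depth < 30:
        return False  # poor genotyping quality
    if age < 30 or age > 75:
        return False  # outside the baseline age range
    return not chd_at_baseline


# Counts reported in the text.
collected = 43_582
excluded = {
    "high missing rate or low sequencing depth": 561,
    "age < 30 or > 75 years at baseline": 1_352,
    "confirmed coronary heart disease at baseline": 398,
}
enrolled = collected - sum(excluded.values())  # 43,582 - 2,311 = 41,271
```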
All studies were approved by the ethical review committee of Fuwai Hospital, Chinese Academy of Medical Sciences. Each participant signed an informed consent form prior to data collection.
TABLE 1 training set general information
Figure BDA0003085403870000071
Figure BDA0003085403870000081
The values are mean (SD) or N (%).
Data collection and risk factor definition
Important information during baseline and follow-up visits was collected by trained investigators under strict quality control. Standard questionnaires were used to collect personal information (gender, date of birth, etc.), lifestyle information (eating habits, physical activity, etc.), disease history and CAD family history. Participants also received physical examinations (weight, height, blood pressure, etc.) and provided fasting blood samples for measurement of blood lipid and blood glucose levels.
To obtain information about disease outcomes and mortality during follow-up, researchers followed up the participants or their proxies and collected the participants' medical records (or death certificates). Two committee members blinded to the baseline information independently verified each event. In case of disagreement, other committee members joined the discussion until consensus was reached. Coronary heart disease onset was defined as the first occurrence of unstable angina, non-fatal acute myocardial infarction, or coronary death. Fatal events caused by myocardial infarction or other coronary artery disease were defined as coronary heart disease deaths. The follow-up time in years was the interval between the baseline date and the date of coronary heart disease occurrence, death or last visit.
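The follow-up time definition can be sketched as below. The source lists three possible end dates without saying how they are combined, so taking the earliest available date is an assumption, as are the 365.25-day year, the function name and the example dates.

```python
from datetime import date


def follow_up_years(baseline, chd_date=None, death_date=None, last_visit=None):
    """Years from baseline to the earliest of CHD onset, death or last visit."""
    end_dates = [d for d in (chd_date, death_date, last_visit) if d is not None]
    if not end_dates:
        raise ValueError("at least one end date is required")
    end = min(end_dates)
    return (end - baseline).days / 365.25


# Hypothetical participants with a baseline on 1 January 2008.
censored = follow_up_years(date(2008, 1, 1), last_visit=date(2018, 1, 1))
event = follow_up_years(
    date(2008, 1, 1), chd_date=date(2013, 1, 1), last_visit=date(2018, 1, 1)
)
```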
Genetic variation site selection and genotyping
The invention first selected 600 genetic variation sites found in genome-wide association studies to have genome-wide significant associations (P < 5×10⁻⁸) with coronary heart disease (n = 212) or with coronary heart disease-related risk factors, including stroke (n = 42), blood pressure (n = 56), blood lipids (n = 130), T2D (n = 90) and obesity (n = 79) (Table 2). Information on all genetic variation sites is provided in Table 3. In short, for coronary heart disease the invention selected all genetic variation sites reported in East Asian and European populations; for the other risk factors, the invention focused primarily on genetic variation sites reported in East Asian populations.
Training set samples were genotyped using Infinium Multi-Ethnic Genotyping Arrays (MEGA) chips to obtain genetic variation information at the tested sites. In the cohort population, the invention used multiplex PCR targeted amplicon sequencing to genotype the samples. Multiplex primers were designed for each variant using routine procedures in the art, and the amplified target regions were subjected to high-throughput sequencing on an Illumina HiSeq X Ten sequencer. After removal of 12 variant sites that had a call rate < 95% or were missing from the training data set, 588 variants or their proxy sites were successfully genotyped, with a mean call rate of 99.9% and a median sequencing depth of 982× (Fig. 2). To evaluate the reproducibility of genotyping, 1648 samples were genotyped in duplicate, and the concordance rate of the results was greater than 99.4%.
TABLE 2 sources of selected genetic variations in this study
Figure BDA0003085403870000091
CAD, coronary heart disease; SBP, systolic blood pressure; DBP, diastolic pressure; PP, pulse pressure; MAP, mean arterial pressure; HTN, hypertension; T2D, type 2 diabetes; BMI, body mass index; WC, waist circumference; WHR, waist-hip ratio; TC, total cholesterol; LDL-C, low density lipoprotein cholesterol; TG, triglycerides; HDL-C, high density lipoprotein cholesterol.
Construction of MetaPRS
(1) Extracting SNP effect values from GWAS result data, and calculating PRS of each sub-phenotype
The invention constructed 9 genetic scores for CAD-related phenotypes based on effect values from large-scale genome-wide association studies in East Asian populations. To accurately estimate the CAD effect values of the selected variants in the East Asian population, the invention performed a genome-wide association study of coronary heart disease in East Asians with a total sample size of 267,465 participants (51,531 coronary heart disease patients and 215,934 individuals without coronary heart disease). For the other 8 phenotypes (stroke, type 2 diabetes, blood pressure, body mass index, total cholesterol, LDL cholesterol, triglycerides and HDL cholesterol), the invention obtained the risk allele, effect value and P value of each locus for each sub-phenotype from large published genome-wide association studies in East Asian populations. A detailed list of the selected studies is shown in Table 3.
TABLE 3 Sources of summary data for polygenic risk score calculation
Figure BDA0003085403870000101
GWAS, genome-wide association study; EWAS, exome-wide association study; BP, blood pressure; CAD, coronary artery disease; T2D, type 2 diabetes; BMI, body mass index; TC, total cholesterol; LDL-C, low density lipoprotein cholesterol; TG, triglycerides; HDL-C, high density lipoprotein cholesterol.
Taking the CAD sub-phenotype as an example, the invention integrated large-scale coronary heart disease case-control genomic data from East Asian and Chinese populations to carry out a genome-wide association study of coronary heart disease, with a sample of 51,531 coronary heart disease patients and 215,934 individuals without coronary heart disease. The association results of the different sub-cohorts were meta-analyzed with a fixed-effect model to obtain the risk allele, effect value and P value of each detected SNP. Based on the extracted P values, 12 sets of SNPs were screened using the thresholds 0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.01, 10⁻³, 10⁻⁴, 10⁻⁵, 10⁻⁶ and 10⁻⁷. Each set was then pruned at linkage disequilibrium r² < 0.2 using the clumping command of the plink software (version 1.9), based on the cohort population data, finally yielding 12 SNP combinations. Using the training set genotype data, the risk allele count of each SNP in an individual (0, 1 or 2) was weighted by the corresponding effect value and summed to construct 12 candidate PRSs comprising different SNP combinations. The association of each candidate PRS with coronary heart disease was evaluated with a logistic regression model, and the score with the largest odds ratio (OR) per standard deviation increase in PRS was selected as the best PRS for coronary heart disease. For the other 8 phenotypes, SNP effect values were obtained from the literature listed for each phenotype in Table 3, and the other 8 sub-phenotype PRSs were constructed by the same procedure. The SNP sites and effect values used by the best sub-phenotype PRSs are shown in Table 4.
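The thresholding and weighted-sum steps described above can be sketched as follows. This is a simplified illustration under stated assumptions: LD clumping is taken as already done, a SNP is kept when its P value is strictly below the threshold (the source does not say whether the comparison is strict), and the SNP identifiers, effect values and genotypes are hypothetical.

```python
# The 12 P-value thresholds listed in the text.
P_THRESHOLDS = [0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.01,
                1e-3, 1e-4, 1e-5, 1e-6, 1e-7]


def select_snps(p_values, threshold):
    """SNP ids whose GWAS P value is below the threshold (assumed strict)."""
    return [snp for snp, p in p_values.items() if p < threshold]


def prs(allele_counts, effects, snps):
    """Weighted sum of risk-allele counts (0, 1 or 2) over the chosen SNPs."""
    return sum(effects[s] * allele_counts[s] for s in snps)


def candidate_scores(genotypes, effects, p_values, thresholds=P_THRESHOLDS):
    """One candidate PRS per individual for each P-value threshold."""
    result = {}
    for t in thresholds:
        snps = select_snps(p_values, t)
        result[t] = [prs(person, effects, snps) for person in genotypes]
    return result


# Hypothetical toy inputs: three SNPs, two individuals.
effects = {"rs_a": 0.10, "rs_b": 0.05, "rs_c": 0.20}
p_values = {"rs_a": 1e-8, "rs_b": 0.03, "rs_c": 0.45}
genotypes = [
    {"rs_a": 2, "rs_b": 1, "rs_c": 0},
    {"rs_a": 0, "rs_b": 2, "rs_c": 1},
]
scores = candidate_scores(genotypes, effects, p_values)
```

Each entry of `scores` is one candidate PRS; in the text, the candidate retained for a sub-phenotype is the one with the largest OR per standard deviation in the training set.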
(2) Calculating weights for individual sub-phenotypic PRSs in a training set
The 9 sub-phenotype PRSs were each converted to a standardized score with a mean of 0 and a standard deviation of 1. Using the training set, the 9 standardized sub-phenotype PRSs and the covariates to be adjusted for (age and sex) were entered into an elastic net logistic regression model (cv.glmnet function, R package "glmnet"). A series of models with different penalty terms (alpha = 0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9 or 1.0) was evaluated by 10-fold cross-validation with the model parameter type.measure set to "auc", and the model with the highest AUC (area under the receiver operating characteristic curve) was selected as the final model. The coefficients of each PRS in the final model (β1…β9) were taken as weights. Table 5 provides the weight of each sub-phenotype PRS; the weights for TG, HDL-C and LDL-C were 0.
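For orientation, the standardization and the penalized objective that the glmnet package documents for binomial models can be written as below. The symbols N, x_i, y_i and λ are notation introduced here rather than taken from the source, the mixing parameter α is unrelated to the SNP effect values denoted α elsewhere in this document, and the source does not state whether age and sex were penalized.

```latex
% Standardization of each sub-phenotype PRS (k = 1, ..., 9)
z_k = \frac{\mathrm{PRS}_k - \operatorname{mean}(\mathrm{PRS}_k)}{\sigma_k}

% Elastic net logistic regression as documented for glmnet:
% alpha = 0 gives ridge, alpha = 1 gives lasso
\min_{\beta_0,\,\beta}\;
  -\frac{1}{N}\sum_{i=1}^{N}
    \Big[\, y_i\,(\beta_0 + x_i^{\top}\beta)
          - \log\!\big(1 + e^{\beta_0 + x_i^{\top}\beta}\big) \Big]
  + \lambda\Big[\, \frac{1-\alpha}{2}\,\lVert\beta\rVert_2^2
                 + \alpha\,\lVert\beta\rVert_1 \Big]
```

Whenever α > 0, the L1 term can shrink a coefficient exactly to zero, which is consistent with three sub-phenotype PRSs receiving a weight of 0.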
(3) Conversion of weight of sub-phenotypic PRS into weight of SNP level
$$\beta_{\mathrm{SNP}_i} = \frac{\beta_1}{\sigma_1}\,\alpha_{i1} + \cdots + \frac{\beta_9}{\sigma_9}\,\alpha_{i9}$$
The PRS-level weights are converted to SNP-level weights using the above formula, where β1, …, β9 are the PRS weights obtained in step (2), σ1, …, σ9 are the standard deviations of each sub-phenotype PRS in the training set, and α_i1, …, α_i9 are the effect values of the ith SNP for each sub-phenotype; if a SNP is not included in the kth score, its effect value α_ik is set to 0.
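A minimal sketch of this conversion follows, assuming the formula is the sum over sub-phenotypes of (PRS weight / PRS standard deviation) × SNP effect value, as the definitions of σ and α imply. The function name, the standard deviations and the SNP effect values are hypothetical; only the CAD, blood pressure and LDL-C weights are taken from Table 5.

```python
def snp_level_weights(prs_weights, prs_sds, snp_effects):
    """Return {snp: sum_k (beta_k / sigma_k) * alpha_ik}.

    prs_weights: {sub_phenotype: beta_k}, weights from the penalized model
    prs_sds:     {sub_phenotype: sigma_k}, SD of each PRS in the training set
    snp_effects: {snp: {sub_phenotype: alpha_ik}}; a sub-phenotype missing
                 for a SNP means the SNP is not in that score (alpha_ik = 0)
    """
    weights = {}
    for snp, effects in snp_effects.items():
        weights[snp] = sum(
            prs_weights[k] / prs_sds[k] * effects.get(k, 0.0)
            for k in prs_weights
        )
    return weights


prs_weights = {"CAD": 0.452, "BP": 0.074, "LDL-C": 0.0}   # from Table 5
prs_sds = {"CAD": 0.5, "BP": 0.2, "LDL-C": 0.4}           # hypothetical
snp_effects = {                                           # hypothetical
    "rs_a": {"CAD": 0.10, "BP": 0.02},
    "rs_b": {"LDL-C": 0.30},
}
weights = snp_level_weights(prs_weights, prs_sds, snp_effects)
```

In this toy case `rs_b` contributes only to a score whose weight is 0, so its SNP-level weight is 0; this is the mechanism by which some genotyped SNPs drop out of the final score.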
(4) Calculating metaPRS
The metaPRS of an individual is calculated using the formula $\mathrm{metaPRS} = \sum_i \beta_{\mathrm{SNP}_i} \times N_i$, where $\beta_{\mathrm{SNP}_i}$ is the effect value of the ith SNP (i.e., the SNP-level weight obtained in step 3) and $N_i$ is the number of effect alleles of the ith SNP carried by the individual.
After these statistical processing steps, a total of 510 SNPs had a non-zero final weight and were included in the metaPRS calculation; information and weights for all eligible SNPs are provided in Table 4.
(5) MetaPRS cut-point partitioning
Taking the 20% and 80% percentiles of the metaPRS of all individuals in the cohort population as cut points, the genetic risk of coronary heart disease of an individual is classified as low, medium or high.
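The score itself is a weighted allele count. The sketch below assumes SNP-level weights are already available; the SNP identifiers, weights and genotype are hypothetical.

```python
def meta_prs(snp_weights, allele_counts):
    """metaPRS = sum over SNPs of (SNP-level weight) x (effect-allele count).

    SNPs with a weight of 0 are skipped, since they cannot change the sum.
    """
    return sum(w * allele_counts[snp]
               for snp, w in snp_weights.items() if w != 0)


snp_weights = {"rs_a": 0.0978, "rs_b": 0.0, "rs_c": 0.05}   # hypothetical
individual = {"rs_a": 2, "rs_b": 1, "rs_c": 1}             # effect-allele counts
score = meta_prs(snp_weights, individual)
```

Skipping zero-weight SNPs mirrors the statement that only SNPs with a non-zero final weight enter the calculation.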
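The cut-point rule can be sketched as follows. The percentile interpolation method is not specified in the source, so the "inclusive" method of Python's `statistics.quantiles` is an assumption, as are the function names and the toy scores.

```python
import statistics


def risk_cut_points(cohort_scores):
    """20th and 80th percentiles of the cohort metaPRS distribution."""
    quintiles = statistics.quantiles(cohort_scores, n=5, method="inclusive")
    return quintiles[0], quintiles[3]


def risk_group(score, low_cut, high_cut):
    """Below the 20th percentile is low, above the 80th is high, else medium."""
    if score < low_cut:
        return "low"
    if score > high_cut:
        return "high"
    return "medium"


cohort = [float(v) for v in range(11)]   # hypothetical scores 0.0 .. 10.0
low_cut, high_cut = risk_cut_points(cohort)
groups = [risk_group(s, low_cut, high_cut) for s in cohort]
```

The cut points are derived from the cohort distribution, so the same individual could fall into a different group if the reference population changes.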
TABLE 4 information and weights of SNPs determined by the invention
[Table 4 is provided as images in the original publication: Figures BDA0003085403870000121 through BDA0003085403870000221.]
TABLE 5 Weight of each sub-phenotype PRS in the coronary heart disease polygenic genetic risk composite score
| Sub-phenotype PRS | Weight |
| --- | --- |
| Coronary heart disease | 0.452 |
| Blood pressure | 0.074 |
| Body mass index | 0.072 |
| Type 2 diabetes | 0.064 |
| Total cholesterol | 0.038 |
| Stroke | 0.004 |
| Low density lipoprotein cholesterol | 0 |
| High density lipoprotein cholesterol | 0 |
| Triglycerides | 0 |
Statistical analysis
For continuous variables, population characteristics are described as mean (standard deviation); for categorical variables, they are described as number (percentage). The polygenic genetic scores were divided into three groups (low, medium and high genetic risk) at the <20%, 20%-80% and >80% quantiles. The hazard ratios (HRs) and their 95% confidence intervals (CIs) for coronary events in the different genetic risk groups were estimated using a Cox proportional hazards regression model adjusted for age and sex, correcting for cohort source and accounting for the competing risk of non-coronary death. The lifetime risk (to age 80) of coronary heart disease in the different genetic risk groups was assessed using a Cox proportional hazards regression model with age as the time scale. The analysis used the 'survfit.coxph' function in the R package survival. All P values reported in this study were uncorrected, and a two-sided P value <0.05 was considered statistically significant. Statistical analysis was performed in R software (R Foundation for Statistical Computing, Vienna, Austria, version 3.5.0) or SAS statistical software (SAS Institute Inc, Cary, NC, version 9.4).
Baseline information for the prospective cohort
Table 6 shows baseline information for the 41,271 subjects in the cohort population. The mean age at baseline was 52.3 years (standard deviation, 10.6 years), and 42.5% of subjects were male. The current smoking rate was higher in men than in women. Over 534,701 person-years of follow-up (mean follow-up, 13.0 years), a total of 1,303 coronary heart disease events occurred.
TABLE 6 Baseline information for the prospective cohort
Figure BDA0003085403870000231
The values are mean (SD) or N (%). CAD, coronary heart disease.
Prediction of coronary heart disease by polygenic genetic risk scoring
The invention first set 12 thresholds (0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.01, 10⁻³, 10⁻⁴, 10⁻⁵, 10⁻⁶, 10⁻⁷) on the P values of the coronary heart disease GWAS results in the East Asian population and screened 12 different groups of SNPs. The coronary heart disease PRS was then calculated in the training set using GWAS results from the European population as SNP effect values, and the strength of association between each PRS and coronary heart disease was evaluated. As shown in Fig. 3, for all 12 PRSs comprising different SNP combinations, the OR (95% CI) per SD increase for coronary heart disease was significantly lower with European effect values than with effect values from the East Asian coronary heart disease GWAS. Therefore, East Asian GWAS effect values were used in this study to construct the PRS for each sub-phenotype. The strength of association of each candidate sub-phenotype PRS with coronary heart disease in the training set is shown in Fig. 4, and the score with the largest OR was selected as the final PRS for that sub-phenotype.
The 9 sub-phenotype PRSs were correlated with one another to varying degrees (Fig. 5). The association of the 9 sub-phenotype PRSs with coronary heart disease was therefore further evaluated using an elastic net logistic regression model, which accounts for the correlation between the individual sub-phenotype PRSs. The OR values estimated by elastic net logistic regression are compared with those estimated by univariate logistic regression in Fig. 6 (the weights of LDL-C, TG and HDL-C are 0 in Fig. 6). Finally, the coronary heart disease metaPRS was constructed by integrating the 9 sub-phenotype PRSs and was validated in the cohort population.
Compared with the sub-phenotype PRSs, metaPRS showed the strongest association with coronary heart disease risk (Fig. 7), with an HR for coronary heart disease of 1.44 per standard deviation increase (95% CI: 1.36-1.52; P = 2.84×10⁻³⁹). The association of metaPRS with coronary heart disease was independent of dyslipidemia, hypertension, BMI, diabetes, smoking status and family history of coronary heart disease (Table 7).
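The "OR per SD increase" used to rank candidate PRSs can be sketched with a small univariate logistic regression fitted by Newton-Raphson. This is an illustration only: it is unadjusted (the source does not state which covariates, if any, were used at this step), the function names are introduced here, and the toy scores and outcomes are hypothetical.

```python
import math
import statistics


def standardize(values):
    """Rescale to mean 0 and sample standard deviation 1."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)
    return [(v - mean) / sd for v in values]


def logistic_or_per_sd(score, outcome, iterations=25):
    """Odds ratio per one-SD increase, from outcome ~ intercept + z(score)."""
    z = standardize(score)
    b0, b1 = 0.0, 0.0
    for _ in range(iterations):
        # Gradient (g) of the log-likelihood and information matrix (h)
        g0 = g1 = h00 = h01 = h11 = 0.0
        for x, y in zip(z, outcome):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            w = p * (1.0 - p)
            g0 += y - p
            g1 += (y - p) * x
            h00 += w
            h01 += w * x
            h11 += w * x * x
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + inverse(h) @ g for the 2x2 case
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    return math.exp(b1)


def best_candidate(candidates, outcome):
    """Name of the candidate score with the largest OR per SD."""
    return max(candidates, key=lambda k: logistic_or_per_sd(candidates[k], outcome))


# Hypothetical toy data: 8 individuals, 1 = case, 0 = control.
outcome = [0, 0, 1, 0, 1, 0, 1, 1]
candidates = {
    "informative": [0.1, 0.4, 0.5, 0.9, 1.2, 1.5, 1.7, 2.0],
    "noise": [1, 2, 1, 2, 1, 2, 1, 2],
}
or_informative = logistic_or_per_sd(candidates["informative"], outcome)
chosen = best_candidate(candidates, outcome)
```

Standardizing each score before fitting is what makes the ORs of candidates with different numbers of SNPs comparable.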
TABLE 7 Hazard ratio of metaPRS for coronary events after correction for coronary heart disease risk factors (per one standard deviation increase in metaPRS)
| Model | HR (95% CI) | P value |
| --- | --- | --- |
| metaPRS | 1.44 (1.36, 1.52) | 2.84×10⁻³⁹ |
| metaPRS + dyslipidemia | 1.42 (1.34, 1.50) | 2.54×10⁻³⁵ |
| metaPRS + hypertension | 1.41 (1.34, 1.49) | 2.78×10⁻³⁵ |
| metaPRS + diabetes mellitus | 1.43 (1.36, 1.51) | 1.33×10⁻³⁷ |
| metaPRS + body mass index | 1.42 (1.35, 1.50) | 1.74×10⁻³⁶ |
| metaPRS + smoking | 1.44 (1.36, 1.52) | 4.55×10⁻³⁹ |
| metaPRS + CAD family history | 1.44 (1.36, 1.52) | 9.52×10⁻³⁹ |
| metaPRS + 6 common CAD risk factors | 1.39 (1.32, 1.47) | 2.75×10⁻³¹ |
CAD, coronary heart disease; PRS, genetic risk score; HR, hazard ratio; CI, confidence interval.
The metaPRS was grouped at the 20% and 80% quantiles. Individuals at high genetic risk (above the 80% quantile of metaPRS) had approximately 3 times the risk of developing coronary events (HR = 2.93, 95% CI …). The cumulative risk of developing coronary heart disease before age 80 in these two groups was 5.8% and 16.0%. Similar results were obtained when the analysis was stratified by sex (Fig. 9).

Claims (12)

1. A construction method of a coronary heart disease polygene genetic risk comprehensive score comprises the following steps:
(1) Selecting a set of single nucleotide polymorphism (SNP) sites comprising SNP sites related to coronary heart disease and SNP sites related to coronary heart disease-associated phenotypes; wherein the coronary heart disease-associated phenotypes comprise: blood pressure, type 2 diabetes, blood lipids, obesity, and stroke; and wherein the set includes SNP sites found in genome-wide association studies to be genome-wide significantly associated with coronary heart disease, with blood pressure, with type 2 diabetes, with blood lipids, with obesity, and with stroke;
(2) Genotyping based on the single nucleotide polymorphic sites in step (1);
(3) Extracting risk alleles, effect values and P values of the detected SNPs corresponding to a plurality of sub-phenotypes from the whole genome association research results respectively, wherein the plurality of sub-phenotypes comprises: coronary heart disease, body mass index, blood pressure, type 2 diabetes, total cholesterol, low density lipoprotein cholesterol, triglycerides, high density lipoprotein cholesterol, and stroke, respectively constructing a plurality of candidate sub-phenotype PRSs for each sub-phenotype: a plurality of candidate sub-phenotype PRSs for coronary heart disease, a plurality of candidate sub-phenotype PRSs for body mass index, a plurality of candidate sub-phenotype PRSs for blood pressure, a plurality of candidate sub-phenotype PRSs for type 2 diabetes, a plurality of candidate sub-phenotype PRSs for total cholesterol, a plurality of candidate sub-phenotype PRSs for low density lipoprotein cholesterol, a plurality of candidate sub-phenotype PRSs for triglyceride, a plurality of candidate sub-phenotype PRSs for high density lipoprotein cholesterol, and a plurality of candidate sub-phenotype PRSs for stroke, and respectively screening the best sub-phenotype PRSs for coronary heart disease, the best sub-phenotype PRSs for body mass index, the best sub-phenotype PRSs for blood pressure, the best sub-phenotype PRSs for type 2 diabetes, the best sub-phenotype PRSs for total cholesterol, the best sub-phenotype PRSs for low density lipoprotein cholesterol, the best sub-phenotype PRSs for triglyceride, the best sub-phenotypes for high density lipoprotein cholesterol, and the best sub-phenotype PRSs for stroke;
wherein the process of constructing each candidate sub-phenotype PRS comprises the following steps:
dividing the SNPs into multiple groups according to the extracted P values, and, for each group of SNPs, pruning at r² < 0.2 using the clumping command of the plink software based on the cohort population data, to obtain a plurality of SNP combinations; wherein 9, 10, 11 or 12 P-value thresholds are selected from 0.5, 0.4, 0.3, 0.2, 0.1, 0.05, 0.01, 10⁻³, 10⁻⁴, 10⁻⁵, 10⁻⁶ and 10⁻⁷;
using genotype data, weighting the individual risk allele count of each SNP by the corresponding effect value and summing to construct a plurality of candidate PRSs (polygenic risk scores) comprising different SNP combinations, evaluating the association between each candidate PRS and coronary heart disease using a logistic regression model, and selecting the score with the largest odds ratio (OR) as the best sub-phenotype PRS; wherein the individual risk allele count of a SNP is 0, 1, or 2;
(4) Determining a weight for each sub-phenotypic PRS;
(5) Converting the weight of the sub-phenotypic PRS into a weight at the SNP level;
(6) Constructing a coronary heart disease polygene genetic risk comprehensive score metaPRS.
2. The method of claim 1, wherein the SNP sites genome-wide significantly associated with blood pressure comprise SNP sites genome-wide significantly associated with systolic blood pressure, diastolic blood pressure, pulse pressure, mean arterial pressure, and hypertension; the SNP sites genome-wide significantly associated with obesity comprise SNP sites genome-wide significantly associated with body mass index, waist circumference, and waist-hip ratio; and the SNP sites genome-wide significantly associated with blood lipids comprise SNP sites genome-wide significantly associated with total cholesterol, low density lipoprotein cholesterol, triglycerides, and high density lipoprotein cholesterol.
3. The method according to claim 2, wherein the coronary heart disease polygenic genetic risk composite score is used for assessing the coronary heart disease risk of the east Asian population.
4. The method of claim 1 or 3, wherein in step (2), the cohort population for genotyping is the east Asian population.
5. The method of claim 4, wherein genotyping is performed using multiplex polymerase chain reaction targeted amplicon sequencing techniques.
6. The method according to claim 1, wherein in step (4), the process of determining the weight of each sub-phenotypic PRS comprises:
converting each sub-phenotype PRS into a standardized score with a mean value of 0 and a standard deviation of 1;
using a training set, entering the standardized PRS of each sub-phenotype together with the covariates to be adjusted for into an elastic net logistic regression model, selecting the model with the highest AUC as the final model, and obtaining the coefficients (β1…βn) of each PRS from the final model as weights.
7. The method according to claim 1, wherein the process of converting the weight of the sub-phenotypic PRS into the weight of the SNP level in step (5) is performed according to the following model:
$$\beta_{\mathrm{SNP}_i} = \frac{\beta_1}{\sigma_1}\,\alpha_{i1} + \cdots + \frac{\beta_n}{\sigma_n}\,\alpha_{in}$$
wherein σ1, …, σn are the standard deviations of each sub-phenotype PRS in the training set, and α_i1, …, α_in are the effect values of the ith SNP corresponding to each sub-phenotype; if the ith SNP is not included in the kth score, its effect value α_ik is set to 0.
8. The method according to claim 1, wherein in step (6), the constructed coronary heart disease polygenic genetic risk composite score metaPRS is:
$$\mathrm{metaPRS} = \sum_i \beta_{\mathrm{SNP}_i} \times N_i$$
wherein $\beta_{\mathrm{SNP}_i}$ refers to the effect value of the ith SNP, and $N_i$ refers to the number of effect alleles of the ith SNP carried by an individual.
9. The method of claim 8, wherein the genetic risk of coronary heart disease in the individual is divided into low, medium and high risk groups by using 20% and 80% percentiles of metaPRS in all individuals in the cohort group as cut points.
10. An apparatus for constructing a coronary heart disease polygenic genetic risk composite score, which executes the method for constructing a coronary heart disease polygenic genetic risk composite score as claimed in claim 1, the apparatus comprising:
a genotyping module for genotyping each SNP in the set of single nucleotide polymorphism sites recited in claim 1;
a sub-phenotype PRS construction module for extracting risk alleles, effect values and P values of the detected SNPs corresponding to a plurality of sub-phenotypes from the whole genome association study results respectively, wherein the plurality of sub-phenotypes comprises: coronary heart disease, body mass index, blood pressure, type 2 diabetes mellitus, total cholesterol, low density lipoprotein cholesterol, triglyceride, high density lipoprotein cholesterol and stroke, respectively constructing candidate sub-phenotype PRS aiming at each sub-phenotype, and screening optimal sub-phenotype PRS;
a model training module for determining a weight for each sub-phenotypic PRS in a training set;
and the metaPRS construction module is used for converting the weight of the sub-phenotype PRS into the weight of the SNP level and constructing the coronary heart disease polygene genetic risk comprehensive score metaPRS.
11. The apparatus of claim 10, wherein the metaPRS construction module is further configured to evaluate the effect of constructed metaPRS on the prediction and stratification of coronary heart disease risk.
12. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, wherein the processor executes the computer program to realize the evaluation of the individual coronary heart disease risk by using the multi-gene genetic risk composite score for coronary heart disease constructed by the method of any one of claims 1 to 9.
CN202110579230.8A 2021-05-26 2021-05-26 Construction method, device and application of polygene genetic risk comprehensive score of coronary heart disease Active CN113506594B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110579230.8A CN113506594B (en) 2021-05-26 2021-05-26 Construction method, device and application of polygene genetic risk comprehensive score of coronary heart disease
PCT/CN2022/095221 WO2022247903A1 (en) 2021-05-26 2022-05-26 Polygenic risk score for coronary heart disease, construction method therefor, and application thereof in combination with clinical risk assessment


Publications (2)

Publication Number Publication Date
CN113506594A CN113506594A (en) 2021-10-15
CN113506594B true CN113506594B (en) 2023-02-03

Family

ID=78008724

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110579230.8A Active CN113506594B (en) 2021-05-26 2021-05-26 Construction method, device and application of polygene genetic risk comprehensive score of coronary heart disease

Country Status (1)

Country Link
CN (1) CN113506594B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022247903A1 (en) * 2021-05-26 2022-12-01 中国医学科学院阜外医院 Polygenic risk score for coronary heart disease, construction method therefor, and application thereof in combination with clinical risk assessment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101302563A (en) * 2008-07-08 2008-11-12 上海中优医药高科技有限公司 Comprehensive evaluation method of polygenic diseases genetic risk
CN102757954A (en) * 2012-06-07 2012-10-31 中国医学科学院阜外心血管病医院 Combination of multiple genetic single nucleotide polymorphisms related to coronary heart disease and application of combination
CN102758010A (en) * 2012-06-07 2012-10-31 中国医学科学院阜外心血管病医院 Combination of multiple genetic single nucleotide polymorphisms and environmental factors related to coronary heart disease and application of combination
CN111128298A (en) * 2019-12-24 2020-05-08 大连海事大学 Method and system for obtaining multi-gene risk scores based on deep learning model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant