Disclosure of Invention
The invention aims to provide peripheral blood ACTB methylation as a potential marker for early diagnosis of stroke.
In a first aspect, the invention claims the use of a methylated ACTB gene as a marker in the preparation of a product; the product has any one of the following functions:
(1) auxiliary diagnosis of stroke;
(2) early warning of stroke before clinical symptoms appear.
In a second aspect, the invention claims the use of a substance for detecting the methylation level of the ACTB gene in the manufacture of a product; the product has any one of the following functions:
(1) auxiliary diagnosis of stroke;
(2) early warning of stroke before clinical symptoms appear.
In a third aspect, the invention claims the use of a substance for detecting the methylation level of the ACTB gene and a medium storing a mathematical model building method and/or a method of use for preparing a product; the product has any one of the following functions:
(1) auxiliary diagnosis of stroke;
(2) early warning of stroke before clinical symptoms appear.
The mathematical model is obtained according to a method comprising the following steps:
(A1) Detecting the methylation levels of the ACTB gene in n1 stroke samples and n2 control samples respectively;
(A2) Using the ACTB gene methylation level data of all samples obtained in step (A1), establishing a mathematical model by binary logistic regression, with the samples classified as stroke samples or control samples.
Wherein n1 and n2 in (A1) can both be positive integers of 50 or more.
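Steps (A1) and (A2) can be sketched in code as follows. This is a minimal illustration only, not the applicants' implementation: the function names `fit_logistic` and `detection_index`, the use of plain gradient descent, and the six synthetic training rows are all assumptions made for the example. The text itself calls for n1 and n2 of 50 or more and does not name a fitting routine.

```python
import math

def fit_logistic(X, y, lr=0.5, epochs=5000):
    """Fit a binary logistic regression by batch gradient descent.

    X: list of rows of methylation values in [0, 1] (one column per CpG unit).
    y: list of labels, 1 = stroke sample, 0 = control sample.
    Returns the coefficients [b0, b1, ..., bn].
    """
    n_features = len(X[0])
    b = [0.0] * (n_features + 1)
    m = len(X)
    for _ in range(epochs):
        grad = [0.0] * (n_features + 1)
        for row, label in zip(X, y):
            z = b[0] + sum(w * x for w, x in zip(b[1:], row))
            p = 1.0 / (1.0 + math.exp(-z))
            err = p - label
            grad[0] += err
            for j, x in enumerate(row):
                grad[j + 1] += err * x
        b = [w - lr * g / m for w, g in zip(b, grad)]
    return b

def detection_index(b, row):
    """Substitute one sample's methylation values into the fitted model."""
    z = b[0] + sum(w * x for w, x in zip(b[1:], row))
    return 1.0 / (1.0 + math.exp(-z))

# Synthetic toy data (assumption): two CpG units, three samples per group.
X_train = [[0.10, 0.20], [0.15, 0.25], [0.20, 0.15],
           [0.80, 0.85], [0.85, 0.90], [0.90, 0.80]]
y_train = [1, 1, 1, 0, 0, 0]
coefficients = fit_logistic(X_train, y_train)
```

In practice a statistics package would normally be used for the fit; the hand-written loop is kept here only so the sketch has no dependencies.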
The use method of the mathematical model comprises the following steps:
(B1) Detecting the methylation level of the ACTB gene of a sample to be detected;
(B2) Substituting the ACTB gene methylation level data of the sample to be detected, obtained in step (B1), into the mathematical model to obtain a detection index; then comparing the detection index with a threshold, and determining from the comparison result whether the sample to be detected comes from, or is a candidate to come from, a stroke patient.
The threshold may be determined according to the maximum Youden index, or may be set to a fixed value, such as 0.5, according to the actual situation. A detection index greater than the threshold is classified into one category, a detection index less than the threshold is classified into the other category, and a detection index equal to the threshold is regarded as falling in an indeterminate gray zone.
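The threshold rule can be illustrated with a short sketch. The Youden index is J = sensitivity + specificity - 1, and the threshold is the cut-off that maximizes J. The function names, the convention that values above the cut-off count as positive, and the eight toy detection indices below are assumptions for illustration and do not come from the specification.

```python
def youden_threshold(indices, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    A sample counts as positive when its detection index is greater
    than the candidate threshold (an assumed convention).
    """
    pos = [v for v, lab in zip(indices, labels) if lab == 1]
    neg = [v for v, lab in zip(indices, labels) if lab == 0]
    best_t, best_j = None, -1.0
    for t in sorted(set(indices)):
        sensitivity = sum(v > t for v in pos) / len(pos)
        specificity = sum(v <= t for v in neg) / len(neg)
        j = sensitivity + specificity - 1.0
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j

def classify(index, threshold):
    """Three-way rule: above, below, or the indeterminate gray zone."""
    if index > threshold:
        return "above"
    if index < threshold:
        return "below"
    return "indeterminate"

# Toy detection indices (assumption): four negatives, four positives.
toy_indices = [0.1, 0.2, 0.3, 0.6, 0.4, 0.5, 0.7, 0.9]
toy_labels = [0, 0, 0, 0, 1, 1, 1, 1]
threshold, j_max = youden_threshold(toy_indices, toy_labels)
```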
In a fourth aspect, the present invention claims the use of the "medium storing the mathematical model building method and/or the method of use" as described in the third aspect above for the manufacture of a product; the product has any one of the following functions:
(1) auxiliary diagnosis of stroke;
(2) early warning of stroke before clinical symptoms appear.
In a fifth aspect, the invention claims a kit.
The kit claimed in the present invention comprises a substance for detecting methylation level of ACTB gene; the application of the kit is at least one of the following:
(1) auxiliary diagnosis of stroke;
(2) early warning of stroke before clinical symptoms appear.
Further, the kit further comprises a medium storing the mathematical model building method and/or the use method as described above.
In a sixth aspect, the invention claims a system.
The claimed system of the present invention comprises:
(D1) Reagents and/or instruments for detecting methylation levels of the ACTB gene;
(D2) An apparatus comprising unit A and unit B.
The unit A is used for establishing a mathematical model and comprises a data acquisition module, a data analysis processing module and a model output module.
The data acquisition module is used for acquiring the ACTB gene methylation level data of n1 stroke samples and n2 control samples obtained by detection with (D1).
The data analysis processing module establishes a mathematical model by binary logistic regression, with the samples classified as stroke samples or control samples, based on the ACTB gene methylation level data of the n1 stroke samples and the n2 control samples acquired by the data acquisition module.
Wherein n1 and n2 can both be positive integers of 50 or more.
The model output module is used for outputting the mathematical model established by the data analysis processing module.
Unit B is used for determining whether the sample to be detected comes from, or is a candidate to come from, a stroke patient, and comprises a data input module, a data operation module, a data comparison module and a conclusion output module.
The data input module is used for inputting the ACTB gene methylation level data of the subject to be tested, obtained by detection with (D1).
The data operation module is used for substituting the ACTB gene methylation level data of the subject to be tested into the mathematical model and calculating a detection index.
The data comparison module is used for comparing the detection index with a threshold.
The threshold may be determined according to the maximum Youden index, or may be set to a fixed value, such as 0.5, according to the actual situation. A detection index greater than the threshold is classified into one category, a detection index less than the threshold is classified into the other category, and a detection index equal to the threshold is regarded as falling in an indeterminate gray zone.
The conclusion output module is used for outputting, according to the comparison result of the data comparison module, a conclusion as to whether the sample to be detected comes from, or is a candidate to come from, a stroke patient.
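The four modules of unit B can be pictured as a small pipeline. The class below is a hypothetical sketch: the class and method names, the dictionary-based input, and the coefficient values in the usage lines are assumptions chosen for illustration, and the labels it returns are deliberately neutral because which side of the threshold corresponds to stroke depends on the fitted model.

```python
import math
from dataclasses import dataclass

@dataclass
class UnitB:
    """Hypothetical sketch of unit B; coefficients would come from unit A."""
    intercept: float
    weights: dict      # feature name -> coefficient
    threshold: float

    def data_input(self, raw):
        # Data input module: keep only the features the model uses.
        return {name: float(raw[name]) for name in self.weights}

    def data_operation(self, features):
        # Data operation module: substitute the values into the logistic model.
        z = self.intercept + sum(self.weights[k] * v for k, v in features.items())
        return 1.0 / (1.0 + math.exp(-z))

    def data_comparison(self, index):
        # Data comparison module: +1 above, -1 below, 0 equal to the threshold.
        return (index > self.threshold) - (index < self.threshold)

    def conclusion_output(self, comparison):
        # Conclusion output module: map the comparison to a reported category.
        return {1: "above threshold", -1: "below threshold", 0: "indeterminate"}[comparison]

    def run(self, raw):
        index = self.data_operation(self.data_input(raw))
        return index, self.conclusion_output(self.data_comparison(index))

# Usage with made-up coefficients (assumption, not the model of the embodiment).
unit_b = UnitB(intercept=0.0, weights={"ACTB_D_6": -2.0}, threshold=0.5)
index_low, conclusion_low = unit_b.run({"ACTB_D_6": 0.5})
index_mid, conclusion_mid = unit_b.run({"ACTB_D_6": 0.0})
```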
In addition, the invention also claims a method for detecting whether the sample to be detected is from or is candidate to be from a cerebral apoplexy patient (namely a method for assisting in diagnosing cerebral apoplexy or a method for early warning cerebral apoplexy before clinical symptoms). The method may comprise the steps of:
(A) The mathematical model may be established according to a method comprising the steps of:
(A1) Detecting the methylation levels of the ACTB gene in n1 stroke samples and n2 control samples respectively (training set);
(A2) Using the ACTB gene methylation level data of all samples obtained in step (A1), establishing a mathematical model by binary logistic regression, with the samples classified as stroke samples or control samples.
Wherein n1 and n2 in (A1) can both be positive integers of 50 or more.
(B) Whether the test sample is from or is candidate for being from a stroke patient can be determined according to a method comprising the following steps:
(B1) Detecting the methylation level of the ACTB gene of the sample to be detected;
(B2) Substituting the ACTB gene methylation level data of the sample to be detected, obtained in step (B1), into the mathematical model to obtain a detection index; then comparing the detection index with a threshold, and determining from the comparison result whether the sample to be detected comes from, or is a candidate to come from, a stroke patient.
The threshold may be determined according to the maximum Youden index, or may be set to a fixed value, such as 0.5, according to the actual situation. A detection index greater than the threshold is classified into one category, a detection index less than the threshold is classified into the other category, and a detection index equal to the threshold is regarded as falling in an indeterminate gray zone.
In the foregoing aspects, the methylation level of the ACTB gene is the methylation level of all or part of the CpG sites in the fragments of the ACTB gene shown in (e1) to (e5) below;
the methylated ACTB gene is an ACTB gene methylated at all or part of the CpG sites in the fragments shown in (e1) to (e5) below;
(e1) The DNA fragment shown in SEQ ID No.1, or a DNA fragment having 80% or more identity with it;
(e2) The DNA fragment shown in SEQ ID No.2, or a DNA fragment having 80% or more identity with it;
(e3) The DNA fragment shown in SEQ ID No.3, or a DNA fragment having 80% or more identity with it;
(e4) The DNA fragment shown in SEQ ID No.4, or a DNA fragment having 80% or more identity with it;
(e5) The DNA fragment shown in SEQ ID No.5, or a DNA fragment having 80% or more identity with it.
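The specification does not state how identity is calculated. As one possible reading, the sketch below computes percent identity over two sequences that have already been aligned to equal length, counting matching non-gap columns. The function name, the gap symbol and the toy sequences are assumptions for illustration; real comparisons would normally rely on an alignment tool.

```python
def percent_identity(aligned_a, aligned_b):
    """Percent identity of two pre-aligned, equal-length sequences.

    '-' marks an alignment gap and never counts as a match.
    The denominator is the alignment length (one common convention).
    """
    if len(aligned_a) != len(aligned_b):
        raise ValueError("sequences must be aligned to equal length")
    if not aligned_a:
        raise ValueError("sequences must not be empty")
    pairs = zip(aligned_a.upper(), aligned_b.upper())
    matches = sum(x == y and x != "-" for x, y in pairs)
    return 100.0 * matches / len(aligned_a)

# Toy sequences (assumption): 8 of 10 aligned columns match.
toy_identity = percent_identity("ACGTACGTAC", "ACGTACGTTT")
```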
Further, the "all or part of CpG sites" may specifically be any of:
(f1) The CpG site at positions 128-129, 180-181, 231-232, 313-314, 362-363, 367-368 or 428-429, counted from the 5' end, of the DNA fragment shown in SEQ ID No.1;
(f2) The CpG site at positions 53-54, 57-58, 125-126, 232-233, 260-261 or 283-284, counted from the 5' end, of the DNA fragment shown in SEQ ID No.2;
(f3) The CpG site at positions 61-62, 87-88, 103-104, 147-148, 171-172, 186-187 or 238-239, counted from the 5' end, of the DNA fragment shown in SEQ ID No.3;
(f4) The CpG site at positions 39-40, 41-42, 69-70, 107-108, 110-111, 139-140, 185-186 or 275-276, counted from the 5' end, of the DNA fragment shown in SEQ ID No.4;
(f5) The CpG site at positions 44-45, 175-176, 266-267 or 300-301, counted from the 5' end, of the DNA fragment shown in SEQ ID No.5;
(f6) 9 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.1 and 7 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 2;
(f7) 9 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.1 and 12 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 3;
(f8) 9 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.1 and 11 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 4;
(f9) 9 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.1 and 5 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 5;
(f10) 7 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.2 and 12 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 3;
(f11) 7 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.2 and 11 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 4;
(f12) 7 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.2 and 5 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 5;
(f13) 12 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.3 and 11 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 4;
(f14) 12 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.3 and 5 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 5;
(f15) 11 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.4 and 5 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 5;
(f16) 9 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.1, 7 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.2 and 12 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 3;
(f17) 9 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.1, 7 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.2 and 11 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 4;
(f18) 9 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.1, 7 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.2 and 5 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 5;
(f19) 9 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.1, 12 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.3 and 11 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 4;
(f20) 9 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.1, 12 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.3 and 5 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 5;
(f21) 9 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.1, 11 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.4 and 5 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 5;
(f22) 7 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.2, 12 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.3 and 11 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 4;
(f23) 7 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.2, 12 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.3 and 5 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 5;
(f24) 12 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.3, 11 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.4 and 5 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 5;
(f25) 9 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.1, 7 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.2, 12 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.3 and 11 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 4;
(f26) 9 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.1, 7 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.2, 12 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.3 and 5 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 5;
(f27) 9 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.1, 7 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.2, 11 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.4 and 5 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 5;
(f28) 9 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.1, 12 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.3, 11 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.4 and 5 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 5;
(f29) 7 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.2, 12 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.3, 11 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.4 and 5 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 5;
(f30) 9 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.1, 7 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.2, 12 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.3, 11 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.4 and 5 distinguishable CpG sites of the DNA fragment shown in SEQ ID No. 5;
In (f6) to (f30), the 9 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.1 are, counted from the 5' end of SEQ ID No.1: the CpG site at positions 128-129 (ACTB_A_1); the CpG site at positions 180-181 (ACTB_A_3); the CpG site at positions 203-204 (ACTB_A_4); the CpG site at positions 231-232 (ACTB_A_5); the CpG site at positions 313-314 (ACTB_A_6); the CpG site at positions 338-339 (ACTB_A_7); the CpG sites at positions 362-363 and 367-368 (ACTB_A_8.9); the CpG site at positions 406-407 (ACTB_A_10); and the CpG site at positions 428-429 (ACTB_A_12);
the 7 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.2 are, counted from the 5' end of SEQ ID No.2: the CpG sites at positions 53-54 and 57-58 (ACTB_B_2.3); the CpG sites at positions 96-97 and 101-102 (ACTB_B_4.5); the CpG site at positions 125-126 (ACTB_B_6); the CpG site at positions 232-233 (ACTB_B_8); the CpG site at positions 260-261 (ACTB_B_9); the CpG site at positions 283-284 (ACTB_B_10); and the CpG site at positions 335-336 (ACTB_B_12);
the 12 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.3 are, counted from the 5' end of SEQ ID No.3: the CpG sites at positions 25-26, 27-28, 29-30, 32-33 and 45-46 (ACTB_C_1.2.3.4.5); the CpG site at positions 61-62 (ACTB_C_6); the CpG sites at positions 63-64, 66-67 and 81-82 (ACTB_C_8.9.10); the CpG sites at positions 87-88 and 103-104 (ACTB_C_11.12); the CpG sites at positions 105-106, 109-110 and 119-120 (ACTB_C_14.15.16); the CpG site at positions 147-148 (ACTB_C_17); the CpG sites at positions 149-150 and 165-166 (ACTB_C_19.20); the CpG site at positions 171-172 (ACTB_C_24); the CpG site at positions 186-187 (ACTB_C_25); the CpG sites at positions 192-193, 194-195, 198-199 and 201-202 (ACTB_C_27.28.29.30); the CpG sites at positions 211-212 and 216-217 (ACTB_C_31.32); and the CpG site at positions 238-239 (ACTB_C_34);
the 11 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.4 are, counted from the 5' end of SEQ ID No.4: the CpG sites at positions 39-40 and 41-42 (ACTB_D_2.3); the CpG sites at positions 61-62 and 65-66 (ACTB_D_4.5); the CpG site at positions 69-70 (ACTB_D_6); the CpG sites at positions 77-78 and 81-82 (ACTB_D_7.8); the CpG sites at positions 107-108 and 110-111 (ACTB_D_9.10); the CpG site at positions 122-123 (ACTB_D_11); the CpG site at positions 139-140 (ACTB_D_12); the CpG site at positions 185-186 (ACTB_D_14); the CpG sites at positions 213-214 and 219-220 (ACTB_D_15.16); the CpG site at positions 275-276 (ACTB_D_17); and the CpG site at positions 304-305 (ACTB_D_18);
the 5 distinguishable CpG sites of the DNA fragment shown in SEQ ID No.5 are, counted from the 5' end of SEQ ID No.5: the CpG site at positions 44-45 (ACTB_E_1); the CpG site at positions 175-176 (ACTB_E_2); the CpG site at positions 266-267 (ACTB_E_3); the CpG site at positions 292-293 (ACTB_E_4); and the CpG site at positions 300-301 (ACTB_E_5).
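Because every site above is given as a pair of 1-based positions counted from the 5' end, a small helper can confirm that a stated pair really is a CG dinucleotide in a given sequence. The sketch below uses a made-up 10-nucleotide sequence, since the sequence listing itself is not reproduced in this section; the function names are likewise assumptions.

```python
def is_cpg(seq, start, end):
    """True if 1-based inclusive positions start..end (5'->3') read 'CG'."""
    return end == start + 1 and seq[start - 1:end].upper() == "CG"

def cpg_positions(seq):
    """All CpG dinucleotides in seq as 1-based (start, end) pairs."""
    s = seq.upper()
    return [(i + 1, i + 2) for i in range(len(s) - 1) if s[i:i + 2] == "CG"]

# Made-up sequence (assumption), not one of SEQ ID No.1 to No.5.
toy_seq = "ATCGGACGTT"
toy_sites = cpg_positions(toy_seq)
```

With the real fragments, `is_cpg(seq_id_no_1, 128, 129)` would be the check corresponding to ACTB_A_1, where `seq_id_no_1` is a hypothetical variable holding SEQ ID No.1.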
In the foregoing aspects, the substance for detecting the methylation level of the ACTB gene comprises (or is) a primer combination for amplifying the full length or a partial fragment of the ACTB gene. The reagent for detecting the methylation level of the ACTB gene likewise comprises (or is) a primer combination for amplifying the full length or a partial fragment of the ACTB gene.
Further, the partial fragment may be at least one of:
(g1) The DNA fragment shown in SEQ ID No.1 or the DNA fragment contained in the DNA fragment;
(g2) A DNA fragment shown in SEQ ID No.2 or a DNA fragment contained in the DNA fragment;
(g3) A DNA fragment shown as SEQ ID No.3 or a DNA fragment contained therein;
(g4) The DNA fragment shown in SEQ ID No.4 or the DNA fragment contained in the DNA fragment;
(g5) The DNA fragment shown in SEQ ID No.5 or the DNA fragment contained in the DNA fragment;
(g6) A DNA fragment having an identity of 80% or more to the DNA fragment represented by SEQ ID No.1 or a DNA fragment contained therein;
(g7) A DNA fragment having an identity of 80% or more to the DNA fragment represented by SEQ ID No.2 or a DNA fragment contained therein;
(g8) A DNA fragment having an identity of 80% or more to the DNA fragment represented by SEQ ID No.3 or a DNA fragment contained therein;
(g9) A DNA fragment having an identity of 80% or more to the DNA fragment represented by SEQ ID No.4 or a DNA fragment contained therein;
(g10) A DNA fragment having an identity of 80% or more to the DNA fragment represented by SEQ ID No.5 or a DNA fragment contained therein.
Further, the primer combination is a primer pair A and/or a primer pair B and/or a primer pair C and/or a primer pair D and/or a primer pair E.
Primer pair A consists of primer A1 and primer A2; primer A1 is the single-stranded DNA shown in SEQ ID No.6 or in nucleotides 11-35 of SEQ ID No.6; primer A2 is the single-stranded DNA shown in SEQ ID No.7 or in nucleotides 32-58 of SEQ ID No.7.
Primer pair B consists of primer B1 and primer B2; primer B1 is the single-stranded DNA shown in SEQ ID No.8 or in nucleotides 11-34 of SEQ ID No.8; primer B2 is the single-stranded DNA shown in SEQ ID No.9 or in nucleotides 32-56 of SEQ ID No.9.
Primer pair C consists of primer C1 and primer C2; primer C1 is the single-stranded DNA shown in SEQ ID No.10 or in nucleotides 11-35 of SEQ ID No.10; primer C2 is the single-stranded DNA shown in SEQ ID No.11 or in nucleotides 32-51 of SEQ ID No.11.
Primer pair D consists of primer D1 and primer D2; primer D1 is the single-stranded DNA shown in SEQ ID No.12 or in nucleotides 11-37 of SEQ ID No.12; primer D2 is the single-stranded DNA shown in SEQ ID No.13 or in nucleotides 32-56 of SEQ ID No.13.
Primer pair E consists of primer E1 and primer E2; primer E1 is the single-stranded DNA shown in SEQ ID No.14 or in nucleotides 11-37 of SEQ ID No.14; primer E2 is the single-stranded DNA shown in SEQ ID No.15 or in nucleotides 32-56 of SEQ ID No.15.
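Each primer is thus defined either as a full sequence or as a sub-range of it, given in 1-based inclusive nucleotide positions. The sketch below records those ranges as stated above and shows how a sub-range would be cut from a full primer. The helper name and the placeholder primer are assumptions; the actual sequences of SEQ ID No.6 to No.15 are not reproduced in this section.

```python
# Sub-ranges stated in the text, keyed by SEQ ID number (1-based, inclusive).
SPECIFIC_RANGES = {
    6: (11, 35), 7: (32, 58),
    8: (11, 34), 9: (32, 56),
    10: (11, 35), 11: (32, 51),
    12: (11, 37), 13: (32, 56),
    14: (11, 37), 15: (32, 56),
}

def specific_part(full_primer, first, last):
    """Return nucleotides first..last (1-based, inclusive) of a primer."""
    if not (1 <= first <= last <= len(full_primer)):
        raise ValueError("positions fall outside the primer")
    return full_primer[first - 1:last]

def specific_length(seq_id):
    """Length in nucleotides of the stated sub-range for a SEQ ID number."""
    first, last = SPECIFIC_RANGES[seq_id]
    return last - first + 1

# Placeholder primer (assumption): a 10-nt stand-in prefix plus a 25-nt body.
placeholder_body = "GATTACA" * 3 + "GATT"
placeholder_primer = "T" * 10 + placeholder_body
extracted = specific_part(placeholder_primer, *SPECIFIC_RANGES[6])
```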
In the foregoing aspects, the person to be tested for assisting in diagnosing stroke and/or evaluating the risk of stroke is a stroke patient or a normal person (a person who does not have stroke) who meets at least one of the following conditions:
(h1) The age is less than 65 years;
(h2) Drinking;
(h3) Alcohol consumption and low methylation level of the ACTB gene.
Further, the stroke patient is a stroke patient with an onset time of less than 2 years, less than 1.5 years, less than 1.32 years or less than 1 year.
In the present invention, drinking is defined as drinking 2 or more times per week, continuously for half a year, currently or in the past.
Any of the above mathematical models may change in practical application depending on the DNA methylation detection method and the fitting method used, and must be determined according to the specific mathematical model; there is no fixed convention.
In the embodiments of the present invention, the model is specifically

$$\log\left(\frac{y}{1-y}\right) = b_0 + b_1 x_1 + b_2 x_2 + b_3 x_3 + \dots + b_n x_n$$

where y is the dependent variable, i.e., the detection index obtained after the methylation values of one or more methylation sites of the sample to be tested are substituted into the model; b0 is a constant; x1 to xn are the independent variables, i.e., the methylation values of one or more methylation sites of the sample to be tested (each a value between 0 and 1); and b1 to bn are the weights that the model assigns to the methylation values of the respective sites.
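Solving the model equation for y gives the detection index directly from the linear predictor. The short derivation below assumes that log denotes the natural logarithm, which is the usual convention for logistic regression but is not stated in the text; the symbol z is introduced here only as shorthand.

```latex
z = b_0 + \sum_{i=1}^{n} b_i x_i,
\qquad
\log\!\left(\frac{y}{1-y}\right) = z
\;\Longrightarrow\;
\frac{y}{1-y} = e^{z}
\;\Longrightarrow\;
y = \frac{e^{z}}{1+e^{z}} = \frac{1}{1+e^{-z}}
```

For example, z = 0 gives y = 0.5, and y always lies strictly between 0 and 1, which is why a threshold such as 0.5 is a meaningful cut-off for the detection index.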
In the embodiments of the invention, known parameters such as age, sex, white blood cell count, body mass index, smoking, drinking, history of hypertension, history of diabetes, HDL-C, LDL-C, TC and TG may be added to the model as appropriate to improve discrimination efficiency. The specific model established in the embodiment of the invention assists in distinguishing stroke patients with an onset time of less than 2 years from control samples, and is as follows:

$$
\begin{aligned}
\log\left(\frac{y}{1-y}\right) ={}& -2.937 - 0.810\,\mathrm{ACTB\_D\_2.3} + 0.916\,\mathrm{ACTB\_D\_4.5} - 1.573\,\mathrm{ACTB\_D\_6} \\
& - 3.181\,\mathrm{ACTB\_D\_7.8} \;[?]\; 0.931\,\mathrm{ACTB\_D\_9.10} + 0.882\,\mathrm{ACTB\_D\_11} + 3.763\,\mathrm{ACTB\_D\_12} \\
& - 2.570\,\mathrm{ACTB\_D\_14} + 1.142\,\mathrm{ACTB\_D\_15.16} + 0.221\,\mathrm{ACTB\_D\_17} + 1.285\,\mathrm{ACTB\_D\_18} \\
& + 0.013\,\text{age} - 0.072\,\text{gender} + 0.302\,\text{white blood cell count} + 0.034\,\text{body mass index} \\
& + 0.025\,\text{smoking} - 0.055\,\text{drinking} + 0.035\,\text{history of hypertension}\;[\dots]
\end{aligned}
$$

where gender is assigned 1 for male and 0 for female, smoking is assigned 1 for smokers and 0 for non-smokers, and drinking is assigned 1 for drinkers and 0 for non-drinkers; [?] and [...] mark a sign and trailing text that are illegible in the source.

ACTB_D_2.3 is the methylation level of the CpG sites at positions 39-40 and 41-42 from the 5' end of the DNA fragment shown in SEQ ID No.4; ACTB_D_4.5 is the methylation level of the CpG sites at positions 61-62 and 65-66; ACTB_D_6 is the methylation level of the CpG site at positions 69-70; ACTB_D_7.8 is the methylation level of the CpG sites at positions 77-78 and 81-82; ACTB_D_9.10 is the methylation level of the CpG sites at positions 107-108 and 110-111; ACTB_D_11 is the methylation level of the CpG site at positions 122-123; ACTB_D_12 is the methylation level of the CpG site at positions 139-140; ACTB_D_14 is the methylation level of the CpG site at positions 185-186; ACTB_D_15.16 is the methylation level of the CpG sites at positions 213-214 and 219-220; ACTB_D_17 is the methylation level of the CpG site at positions 275-276; and ACTB_D_18 is the methylation level of the CpG site at positions 304-305, all positions being counted from the 5' end of the DNA fragment shown in SEQ ID No.4.

The threshold of the model is 0.36, the diagnostic threshold determined by the maximum Youden index. A subject whose detection index calculated by the model is less than 0.36 is a candidate stroke patient, and a subject whose detection index is greater than 0.36 is a candidate non-stroke subject.
In each of the above aspects, the detecting the methylation level of the ACTB gene is detecting the methylation level of the ACTB gene in blood.
The invention uses a nested case-control design. Subjects from a community cohort who developed a new stroke within 2 years after enrollment form the case group (they were free of stroke at the time of blood collection and had a stroke within 2 years after blood collection), and subjects who remained free of stroke during follow-up, matched by age group, form the control group, in order to examine the relationship between peripheral blood ACTB methylation and stroke in the Chinese population. The study shows that peripheral blood ACTB methylation can serve as a potential marker for early warning and early diagnosis of stroke. The invention has important scientific significance and clinical application value for improving the diagnosis and treatment of stroke.
Detailed Description
The experimental procedures used in the following examples are all conventional procedures unless otherwise specified.
Materials, reagents and the like used in the following examples are commercially available unless otherwise specified.
Example 1. Primer design for detecting methylation sites of the ACTB gene
The ACTB gene has 6 exons and a total length of 3454 bp (chr7:5566779-5570232). After extensive sequence and functional analysis, the CpG sites covering the ACTB gene promoter region, exon 1 (ACTB_A and ACTB_B), exon 2 and intron 2 (ACTB_C), exons 3, 4 and 5 and introns 3, 4 and 5 (ACTB_D), and exon 6 and intron 6 (ACTB_E) were analyzed for the association between methylation level and stroke.
The ACTB_A fragment (SEQ ID No.1) is located in the hg19 reference genome at chr7:5570155-5571232, antisense strand.
The ACTB_B fragment (SEQ ID No.2) is located in the hg19 reference genome at chr7:5570155-5571232, sense strand.
The ACTB_C fragment (SEQ ID No.3) is located in the hg19 reference genome at chr7:5569032-5570100, sense strand.
The ACTB_D fragment (SEQ ID No.4) is located in the hg19 reference genome at chr7:5567000-5569000, sense strand.
The ACTB_E fragment (SEQ ID No.5) is located in the hg19 reference genome at chr7:5565779-5566988, sense strand.
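The stated gene length can be checked against its coordinates, and the same arithmetic gives the span of each listed region. The sketch assumes the coordinates are 1-based and inclusive, which is the reading under which chr7:5566779-5570232 yields exactly 3454 bp; the dictionary keys and the helper name are illustrative only.

```python
def span_length(start, end):
    """Length in bp of a 1-based, inclusive genomic interval."""
    if end < start:
        raise ValueError("end must not be smaller than start")
    return end - start + 1

# Coordinates as given in the text (hg19, chr7).
REGIONS = {
    "ACTB gene": (5566779, 5570232),
    "ACTB_A / ACTB_B": (5570155, 5571232),
    "ACTB_C": (5569032, 5570100),
    "ACTB_D": (5567000, 5569000),
    "ACTB_E": (5565779, 5566988),
}

region_lengths = {name: span_length(*coords) for name, coords in REGIONS.items()}
```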
The CpG site information in the ACTB_A fragment is shown in Table 1.
The CpG site information in the ACTB_B fragment is shown in Table 2.
The CpG site information in the ACTB_C fragment is shown in Table 3.
The CpG site information in the ACTB_D fragment is shown in Table 4.
The CpG site information in the ACTB_E fragment is shown in Table 5.
TABLE 1 CpG site information in the ACTB_A fragment
| CpG site | Position of the CpG site in the sequence |
|---|---|
| ACTB_A_1 | Positions 128-129 from the 5' end of SEQ ID No.1 |
| ACTB_A_3 | Positions 180-181 from the 5' end of SEQ ID No.1 |
| ACTB_A_4 | Positions 203-204 from the 5' end of SEQ ID No.1 |
| ACTB_A_5 | Positions 231-232 from the 5' end of SEQ ID No.1 |
| ACTB_A_6 | Positions 313-314 from the 5' end of SEQ ID No.1 |
| ACTB_A_7 | Positions 338-339 from the 5' end of SEQ ID No.1 |
| ACTB_A_8 | Positions 362-363 from the 5' end of SEQ ID No.1 |
| ACTB_A_9 | Positions 367-368 from the 5' end of SEQ ID No.1 |
| ACTB_A_10 | Positions 406-407 from the 5' end of SEQ ID No.1 |
| ACTB_A_12 | Positions 428-429 from the 5' end of SEQ ID No.1 |
TABLE 2 CpG site information in the ACTB_B fragment
TABLE 3 CpG site information in the ACTB_C fragment
| CpG site | Position of the CpG site in the sequence |
|---|---|
| ACTB_C_1 | Positions 25-26 from the 5' end of SEQ ID No.3 |
| ACTB_C_2 | Positions 27-28 from the 5' end of SEQ ID No.3 |
| ACTB_C_3 | Positions 29-30 from the 5' end of SEQ ID No.3 |
| ACTB_C_4 | Positions 32-33 from the 5' end of SEQ ID No.3 |
| ACTB_C_5 | Positions 45-46 from the 5' end of SEQ ID No.3 |
| ACTB_C_6 | Positions 61-62 from the 5' end of SEQ ID No.3 |
| ACTB_C_8 | Positions 63-64 from the 5' end of SEQ ID No.3 |
| ACTB_C_9 | Positions 66-67 from the 5' end of SEQ ID No.3 |
| ACTB_C_10 | Positions 81-82 from the 5' end of SEQ ID No.3 |
| ACTB_C_11 | Positions 87-88 from the 5' end of SEQ ID No.3 |
| ACTB_C_12 | Positions 103-104 from the 5' end of SEQ ID No.3 |
| ACTB_C_14 | Positions 105-106 from the 5' end of SEQ ID No.3 |
| ACTB_C_15 | Positions 109-110 from the 5' end of SEQ ID No.3 |
| ACTB_C_16 | Positions 119-120 from the 5' end of SEQ ID No.3 |
| ACTB_C_17 | Positions 147-148 from the 5' end of SEQ ID No.3 |
| ACTB_C_19 | Positions 149-150 from the 5' end of SEQ ID No.3 |
| ACTB_C_20 | Positions 165-166 from the 5' end of SEQ ID No.3 |
| ACTB_C_24 | Positions 171-172 from the 5' end of SEQ ID No.3 |
| ACTB_C_25 | Positions 186-187 from the 5' end of SEQ ID No.3 |
| ACTB_C_27 | Positions 192-193 from the 5' end of SEQ ID No.3 |
| ACTB_C_28 | Positions 194-195 from the 5' end of SEQ ID No.3 |
| ACTB_C_29 | Positions 198-199 from the 5' end of SEQ ID No.3 |
| ACTB_C_30 | Positions 201-202 from the 5' end of SEQ ID No.3 |
| ACTB_C_31 | Positions 211-212 from the 5' end of SEQ ID No.3 |
| ACTB_C_32 | Positions 216-217 from the 5' end of SEQ ID No.3 |
| ACTB_C_34 | Positions 238-239 from the 5' end of SEQ ID No.3 |
TABLE 4 CpG site information in the ACTB_D fragment
TABLE 5 CpG site information in the ACTB_E fragment
CpG site | Position of the CpG site in the sequence
ACTB_E_1 | Positions 44-45 from the 5' end of SEQ ID No.5
ACTB_E_2 | Positions 175-176 from the 5' end of SEQ ID No.5
ACTB_E_3 | Positions 266-267 from the 5' end of SEQ ID No.5
ACTB_E_4 | Positions 292-293 from the 5' end of SEQ ID No.5
ACTB_E_5 | Positions 300-301 from the 5' end of SEQ ID No.5
Specific PCR primers were designed for the five fragments (ACTB_A, ACTB_B, ACTB_C, ACTB_D and ACTB_E), as shown in Table 6. SEQ ID No.6, SEQ ID No.8, SEQ ID No.10, SEQ ID No.12 and SEQ ID No.14 are forward primers, and SEQ ID No.7, SEQ ID No.9, SEQ ID No.11, SEQ ID No.13 and SEQ ID No.15 are reverse primers. In the forward primers, positions 1-10 from the 5' end are a non-specific tag, and the specific primer sequence is positions 11-35 of SEQ ID No.6 and SEQ ID No.10, positions 11-34 of SEQ ID No.8, and positions 11-37 of SEQ ID No.12 and SEQ ID No.14. In the reverse primers, positions 1-31 from the 5' end are a non-specific tag, and the specific primer sequence is positions 32-58 of SEQ ID No.7, SEQ ID No.9, SEQ ID No.13 and SEQ ID No.15, and positions 32-56 of SEQ ID No.11. The primer sequences contain no SNPs or CpG sites.
TABLE 6 ACTB methylation primer sequences
Amplified fragment | Primer number | Primer sequence (5'-3')
ACTB_A | SEQ ID No.6 | aggaagagagGTTTTGAAAGTAGGGTTTGAGGATT
ACTB_A | SEQ ID No.7 | cagtaatacgactcactatagggagaaggctCCTCCAAATAATCTAAAAAAACAATTC
ACTB_B | SEQ ID No.8 | aggaagagagTAGATGGTTTGGGAGGGTAGTTTA
ACTB_B | SEQ ID No.9 | cagtaatacgactcactatagggagaaggctCCCTAAAAACAAAACCTAAAAACCT
ACTB_C | SEQ ID No.10 | aggaagagagGGAAGGAAAGGATAAGAAGTTTTGA
ACTB_C | SEQ ID No.11 | cagtaatacgactcactatagggagaaggctACTACCCCACCCAACCAACT
ACTB_D | SEQ ID No.12 | aggaagagagGGGATTTGATTGATTATTTTATGAAGA
ACTB_D | SEQ ID No.13 | cagtaatacgactcactatagggagaaggctACCACAAAACTCCATACCTAAAAAA
ACTB_E | SEQ ID No.14 | aggaagagagGGGTTTTGTAGAATAATGAGAATTTGA
ACTB_E | SEQ ID No.15 | cagtaatacgactcactatagggagaaggctTCCTACCTATTCATCCAAACAAAAA
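As an illustration of the tag/specific-sequence layout described for the primers, the following minimal sketch splits a primer from Table 6 into its two parts. It assumes, as the table is printed, that the non-specific tag is written in lowercase and the specific sequence in uppercase; the function name `split_primer` is hypothetical and is not part of the invention.

```python
def split_primer(primer: str) -> dict:
    """Split a tagged primer into its lowercase tag and uppercase specific part.

    Assumption: the leading lowercase run is the non-specific tag and the
    rest is the fragment-specific primer sequence, as printed in Table 6.
    """
    tag_len = 0
    for ch in primer:
        if ch.islower():
            tag_len += 1
        else:
            break
    return {
        "tag": primer[:tag_len],
        "specific": primer[tag_len:],
        "specific_start": tag_len + 1,   # 1-based position from the 5' end
        "specific_end": len(primer),     # 1-based, inclusive
    }


# Primer pair for the ACTB_A fragment, copied from Table 6
forward = split_primer("aggaagagagGTTTTGAAAGTAGGGTTTGAGGATT")  # SEQ ID No.6
reverse = split_primer(
    "cagtaatacgactcactatagggagaaggctCCTCCAAATAATCTAAAAAAACAATTC"  # SEQ ID No.7
)
print(forward["specific_start"], forward["specific_end"])
print(reverse["specific_start"], reverse["specific_end"])
```

The positions printed for the forward primer can be compared directly with the ranges stated in the text.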
Example 2. Methylation detection of the ACTB gene and analysis of the results
1. Research sample
Using an epidemiological cluster sampling method, community residents aged over 18 years in 16 towns of Jurong City, Jiangsu Province were surveyed from October to December 2015; a total of 11151 people took part in the baseline survey. All participants signed informed consent, and the study was reviewed by the ethics committee of Nanjing Medical University.
The baseline survey collected general demographic information (gender, age, native place, ethnicity, etc.); disease history, medication history and family history, where disease history mainly covered cardiovascular disease, diabetes, kidney disease and dyslipidemia; and smoking and drinking status. Smoking was defined as > 20 cigarettes/week sustained for > 3 months/year. Drinking was defined as drinking ≥ 2 times/week, currently or in the past, sustained for half a year. The anthropometric indexes were weight, blood pressure, height and waist circumference. For weight measurement, subjects wore light clothing, and readings were accurate to 0.1 kg. Blood pressure was measured in the morning in the fasting state, without prior strenuous exercise and after 5 minutes of seated rest. Brachial artery blood pressure of the right arm was measured with a mercury sphygmomanometer three times per subject, at intervals of more than 30 s; if the three systolic or diastolic readings differed by ≥ 8 mmHg, one more measurement was taken. For height, a tape measure was fixed to the wall; subjects removed shoes and hats and stood upright against the tape with heels together, and the reading was taken with the right-angle edge of a large set square, accurate to 0.1 cm. Waist circumference was measured 1 cm above the navel with a soft tape held against the skin. Body mass index (BMI) = weight (kg) / height² (m²).
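As a small worked illustration of the BMI formula above, the following sketch computes BMI from weight and height. The function name and the example subject are hypothetical and are not data from the study.

```python
def body_mass_index(weight_kg: float, height_m: float) -> float:
    """BMI = weight (kg) / height squared (m^2)."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


# Hypothetical subject: 70.0 kg, 1.75 m
bmi = body_mass_index(70.0, 1.75)
print(round(bmi, 1))
```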
In addition, fasting blood was drawn from each subject in the early morning; two tubes of peripheral venous blood (one anticoagulated and one procoagulant) were collected for biochemical testing, mainly leukocyte subtype ratios, fasting plasma glucose (GLU), triglyceride (TG), high-density lipoprotein cholesterol (HDL-C), total cholesterol (TC) and low-density lipoprotein cholesterol (LDL-C).
Each year, information on cardiovascular and cerebrovascular events and on deaths was collected from the routine chronic disease registries of local hospitals, the chronic disease management system of the Center for Disease Control, community health service centers and stations, and the reimbursement data of the social security center. Cohort entry was the baseline survey date and the outcome variable was the onset of stroke; for subjects lost to follow-up, follow-up time was uniformly taken as half of the time to the end of follow-up. As of the follow-up cutoff date of July 13, 2018, there were 234 new stroke cases. The 139 patients who developed stroke within 2 years after enrollment were selected as the case group, and 147 age- and sex-matched subjects who had no stroke during follow-up were selected as controls.
The mean age of the stroke cases was 67.64 ± 9.51 years and that of the controls was 67.59 ± 9.11 years; the difference between the two groups was not statistically significant (P > 0.05). Gender, BMI, SBP, DBP, smoking, drinking, history of hypertension, history of diabetes, TC, TG, HDL-C, LDL-C, glucose, white blood cell count, and neutrophil, monocyte, eosinophil and basophil ratios did not differ significantly between the two groups (P > 0.05). The median follow-up time (from blood draw to the follow-up cutoff date) was 2.65 years for all subjects, and the median onset time (from blood draw to definitive diagnosis of stroke) for stroke cases was 1.32 years. Detailed results are shown in Table 7.
TABLE 7 General demographic and clinical characteristics of the subjects
Note: BMI, body mass index; SBP, systolic blood pressure; DBP, diastolic blood pressure; TC, total cholesterol; glucose, fasting plasma glucose; TG, triglycerides; HDL-C, high-density lipoprotein cholesterol; LDL-C, low-density lipoprotein cholesterol.
2. Methylation detection
1. Total DNA of the blood sample was extracted.
2. The total DNA prepared in step 1 was treated with bisulfite (following the instructions of the Qiagen DNA methylation kit). After bisulfite treatment, unmethylated cytosine (C) is converted to uracil (U), while methylated cytosine remains unchanged; that is, the C base of an original CpG site becomes either C or U after treatment.
3. Using the bisulfite-treated DNA from step 2 as template, PCR amplification was performed with DNA polymerase and the 5 pairs of specific primers in Table 6, in a conventional PCR reaction system. All 5 primer pairs used the same PCR system and the following program.
The PCR program was: 95 ℃, 4 min → (95 ℃, 20 s → 56 ℃, 30 s → 72 ℃, 2 min) × 45 cycles → 72 ℃, 5 min → 4 ℃, 1 h.
4. The amplification product from step 3 was subjected to DNA methylation analysis by time-of-flight mass spectrometry, as follows:
(1) To 5 μl of PCR product, 2 μl of shrimp alkaline phosphatase (SAP) solution (0.3 ml SAP [2.5 U] + 1.7 ml H2O) was added, followed by incubation in a PCR instrument (37 ℃, 20 min → 85 ℃, 5 min → 4 ℃, 5 min);
(2) 2 μl of the SAP-treated product from step (1) was added to 5 μl of T-Cleavage reaction system according to the instructions and incubated at 37 ℃ for 3 h;
(3) 19 μl of deionized water was added to the product of step (2), followed by deionization with 6 μg of resin on a rotary shaker for 1 h;
(4) After centrifugation at 2000 rpm for 5 min at room temperature, a small volume of supernatant was loaded onto a 384 SpectroCHIP by the Nanodispenser robotic arm;
(5) Time-of-flight mass spectrometry was performed; the data were collected with SpectroACQUIRE v3.3.1.3 software and visualized with MassArray EpiTyper v1.2 software.
The reagents used for time-of-flight mass spectrometry were all from a kit (T-Cleavage MassCLEAVE Reagent Auto Kit, cat. no. 10129A); the detection instrument was the Analyzer Chip Prep Module 384 (model 41243); the data analysis software was that supplied with the instrument.
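The bisulfite conversion rule in step 2 above can be sketched in silico as follows. This is only an illustration of the rule (unmethylated C becomes U, methylated C is unchanged); the function name, the toy sequence and the choice of 0-based indices are assumptions made for the example.

```python
def bisulfite_convert(seq: str, methylated: set) -> str:
    """Apply the bisulfite rule to a DNA string.

    `methylated` holds the 0-based indices of methylated cytosines
    (an illustrative convention). Unmethylated C -> U; everything else,
    including methylated C, is left unchanged.
    """
    out = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            out.append("U")
        else:
            out.append(base)
    return "".join(out)


# Toy sequence with two CpG sites; only the C at index 1 is methylated
example = "ACGTCCGA"
converted = bisulfite_convert(example, {1})
print(converted)
```

The methylated CpG keeps its C, while the other cytosines, including the one in the unmethylated CpG, are read as U. This difference is what the downstream mass spectrometry distinguishes.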
3. Quality control
Field investigation: a rigorous questionnaire was drawn up and measurement standards were unified; investigators were trained and examined before the survey; a pilot survey was carried out before the formal survey so that problems could be found and summarized in time; dedicated quality-control staff checked the questionnaires on site, and unqualified questionnaires were returned to the investigator for re-survey; questionnaire data were double-entered and checked for consistency; collected blood samples were promptly submitted for testing on site. Experimental procedures were strictly followed, and the working environment was regularly disinfected with ultraviolet light; a pre-experiment was performed before the formal experiment; 5% of samples were randomly selected for repeat time-of-flight mass spectrometry, with a required concordance rate above 99%. Two people independently judged the experimental results and collated the methylation data to ensure authenticity and accuracy. The mass spectrometry experiments yielded 44 distinguishable G peak patterns. Methylation levels were calculated with SpectroACQUIRE v3.3.1.3 software from the comparison of the G-containing and A-containing peak areas (the software automatically calculates the peak areas for each sample to obtain the methylation level at each CpG site).
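The text states that methylation levels are derived from the comparison of G-containing and A-containing peak areas, without giving the formula. A minimal sketch follows, assuming the common convention that the methylation level is the G peak area divided by the sum of the G and A peak areas; this ratio and the function name are assumptions, not the documented behavior of the instrument software.

```python
def methylation_level(area_g: float, area_a: float) -> float:
    """Methylation level as a fraction in [0, 1].

    Assumed convention: G-containing peak area (methylated signal) over
    the total of the G- and A-containing peak areas.
    """
    if area_g < 0 or area_a < 0:
        raise ValueError("peak areas must be non-negative")
    total = area_g + area_a
    if total == 0:
        raise ValueError("no signal: both peak areas are zero")
    return area_g / total


# Hypothetical peak areas for one CpG unit
level = methylation_level(30.0, 70.0)
print(level)
```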
4. Statistical analysis
Normally distributed measurement data are expressed as mean ± standard deviation and count data as rates; the two-sample t test or the chi-square test was used to compare the general characteristics of the case and control groups. Non-normally distributed measurement data are expressed as median (interquartile range), and the Mann-Whitney U test was used to compare the case and control groups. Unconditional logistic regression models were used to analyze the association between ACTB methylation and stroke, adjusting for covariates such as leukocyte subtype ratio, gender, age, alcohol consumption, smoking, BMI, history of hypertension, history of diabetes, HDL-C, LDL-C, TC and TG, and the odds ratio (OR) and 95% confidence interval (CI) per 10% increment in methylation were calculated. The correlation between two variables was evaluated with the Spearman rank correlation coefficient. The value of combinations of multiple CpG sites for early warning and early diagnosis of stroke was evaluated by logistic regression and receiver operating characteristic (ROC) curves. A two-sided P < 0.05 was considered statistically significant, and all analyses were performed with SPSS 24.0.
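To make "OR per 10% increment in methylation" concrete, the following sketch converts a logistic regression coefficient into an OR and 95% CI for a 0.10 change in methylation. It assumes the methylation level enters the model as a fraction between 0 and 1 and that a Wald interval (z = 1.96) is used; the coefficient and standard error below are made-up numbers, not results from the study.

```python
import math


def or_per_increment(beta: float, se: float, increment: float = 0.10,
                     z: float = 1.96):
    """Odds ratio and Wald 95% CI for a given increment of the predictor.

    beta and se are the coefficient and its standard error per unit of
    methylation (unit = 1.0, i.e. 100%). Scaling both by `increment`
    gives the effect of a 10% step.
    """
    point = math.exp(beta * increment)
    lower = math.exp((beta - z * se) * increment)
    upper = math.exp((beta + z * se) * increment)
    return point, lower, upper


# Hypothetical coefficient: a negative beta means that higher methylation
# is associated with lower odds of stroke (OR < 1)
point, lower, upper = or_per_increment(beta=-3.5, se=1.0)
print(round(point, 3), round(lower, 3), round(upper, 3))
```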
5. Analysis of results
1. Association analysis of ACTB methylation and stroke
In the invention, association analysis was performed on stroke cases grouped by clinical onset time. For stroke cases with a clinical onset time ≤ 1.5 years, methylation levels were lower in cases than in controls (case vs. control) at 6 sites of the ACTB_A fragment [CpG_1 (0.30 vs. 0.35), CpG_3 (0.21 vs. 0.24), CpG_5 (0.27 vs. 0.32), CpG_6 (0.25 vs. 0.30), CpG_8.9 (0.21 vs. 0.25), CpG_12 (0.39 vs. 0.43)] and 5 sites of the ACTB_B fragment [CpG_2.3 (0.59 vs. 0.63), CpG_6 (0.61 vs. 0.64), CpG_8 (0.41 vs. 0.45), CpG_9 (0.31 vs. 0.35), CpG_10 (0.27 vs. 0.33)]; the values for the 6 sites of the ACTB_C fragment (CpG_6, CpG_11.12, CpG_17, CpG_24, CpG_25, CpG_34), the 6 sites of the ACTB_D fragment (CpG_2.3, CpG_6, CpG_9.10, CpG_12, CpG_14, CpG_17) and the 4 sites of the ACTB_E fragment (CpG_1, CpG_2, CpG_3, CpG_5) are given in Table 8.
Logistic regression showed that, after adjusting for covariates such as leukocyte subtype ratio, gender, age, alcohol consumption, smoking, BMI, history of hypertension, history of diabetes, HDL-C, LDL-C, TC and TG, the ORs (95% CIs) per 10% increase in methylation were as follows (intervals in parentheses are 95% CIs; values not listed here are given in Table 8):
ACTB_A [CpG_1 (see Table 8); CpG_3 (0.603-0.904), P = 0.003; CpG_5 (0.600-0.921), P = 0.002; CpG_6, 0.700 (0.602-0.901), P = 0.001; CpG_8.9 (0.607-0.905), P = 0.005; CpG_12 (0.617-0.913), P = 0.006];
ACTB_B [CpG_2.3 (0.618-0.903), P = 0.004; CpG_6 (0.603-0.910), P = 0.008; CpG_8 (0.605-0.896), P = 0.007; CpG_9 (0.607-0.900), P = 0.007; CpG_10 (0.608-0.872), P = 0.009];
ACTB_C [CpG_6 (0.619-0.909), P = 0.006; CpG_11.12 (0.605-0.906), P = 0.003; CpG_17 (0.704-0.913), P = 0.001; CpG_24 (0.695-0.910), P = 0.002; CpG_25 (0.684-0.909), P = 0.005; CpG_34 (0.688-0.907), P = 0.002];
ACTB_D [CpG_2.3 (0.683-0.908), P = 0.008; CpG_6 (0.678-0.917), P = 0.004; CpG_9.10 (0.646-0.902), P = 0.004; CpG_12 (0.656-0.901), P = 0.002; CpG_14 (0.649-0.897), P = 0.001; CpG_17 (0.631-0.905), P = 0.003];
ACTB_E [CpG_1 (see Table 8); CpG_2 (0.649-0.894), P = 0.001; CpG_3 (0.646-0.908), P = 0.004; CpG_5 (0.659-0.901), P = 0.003].
Specific results are shown in Table 8.
Interestingly, the methylation levels at these CpG sites differed even more markedly from the controls in stroke cases with a clinical onset time ≤ 1.32 years and ≤ 1 year. For stroke cases with a clinical onset time ≤ 1.32 years, the methylation levels (case vs. control) were: 6 sites of the ACTB_A fragment [CpG_1 (0.25 vs. 0.35), CpG_3 (0.15 vs. 0.24), CpG_5 (0.22 vs. 0.32), CpG_6 (0.20 vs. 0.30), CpG_8.9 (0.16 vs. 0.25), CpG_12 (0.34 vs. 0.43)] and 5 sites of the ACTB_B fragment [CpG_2.3 (0.53 vs. 0.63), CpG_6 (0.55 vs. 0.64), CpG_8 (0.36 vs. 0.45), CpG_9 (0.25 vs. 0.35), CpG_10 (0.23 vs. 0.33)]; the values for the 6 sites of the ACTB_C fragment, the 6 sites of the ACTB_D fragment and the 4 sites of the ACTB_E fragment are given in Table 9.
Logistic regression showed that, after adjusting for the same covariates, the ORs (95% CIs) per 10% increase in methylation were as follows (values not listed here are given in Table 9):
ACTB_A [CpG_1, 0.610 (0.506-0.747); CpG_3 (0.493-0.789); CpG_5 (0.500-0.779); CpG_6 (see Table 9); CpG_8.9 (see Table 9); CpG_12 (0.476-0.713)];
ACTB_B [CpG_2.3 (0.508-0.739); CpG_6 (see Table 9); CpG_8 (see Table 9); CpG_9 (0.507-0.770); CpG_10 (0.511-0.787)];
ACTB_C [CpG_6 (0.496-0.769); CpG_11.12 (0.505-0.755); CpG_17 (0.504-0.713); CpG_24 (see Table 9); CpG_25 (0.504-0.739); CpG_34 (0.508-0.727)];
ACTB_D [CpG_2.3 (see Table 9); CpG_6 (0.497-0.707); CpG_9.10 (0.496-0.722); CpG_12 (0.505-0.719); CpG_14 (0.501-0.750); CpG_17 (0.500-0.744)];
ACTB_E [CpG_1 (see Table 9); CpG_2 (0.506-0.749); CpG_3 (see Table 9); CpG_5 (0.498-0.709)];
all P values < 0.001 (Table 9).
For stroke cases with a clinical onset time ≤ 1 year, the methylation levels at these CpG sites (case vs. control) were: 6 sites of the ACTB_A fragment [CpG_1 (0.18 vs. 0.35), CpG_3 (0.13 vs. 0.24), CpG_5 (0.20 vs. 0.32), CpG_6 (0.18 vs. 0.30), CpG_8.9 (0.14 vs. 0.25), CpG_12 (0.32 vs. 0.43)], 5 sites of the ACTB_B fragment [CpG_2.3 (0.51 vs. 0.63), CpG_6 (0.53 vs. 0.64), CpG_8 (0.34 vs. 0.45), CpG_9 (0.23 vs. 0.35), CpG_10 (0.20 vs. 0.33)], 6 sites of the ACTB_C fragment [CpG_6 (see Table 10), CpG_11.12 (0.12 vs. 0.25), CpG_17 (0.38 vs. 0.50), CpG_24 (0.23 vs. 0.34), CpG_25 (0.26 vs. 0.35), CpG_34 (0.29 vs. 0.40)], 6 sites of the ACTB_D fragment [CpG_2.3 (0.35 vs. 0.47), CpG_6 (0.32 vs. 0.43), CpG_9.10 (0.24 vs. 0.36), CpG_12 (0.18 vs. 0.30), CpG_14 (0.37 vs. 0.51), CpG_17 (0.38 vs. 0.50)] and 4 sites of the ACTB_E fragment [CpG_1 (0.23 vs. 0.34); CpG_2, CpG_3 and CpG_5 (see Table 10)].
Logistic regression showed that, after adjusting for covariates such as leukocyte subtype ratio, gender, age, alcohol consumption, smoking, BMI, history of hypertension, history of diabetes, HDL-C, LDL-C, TC and TG, the 95% CIs of the ORs per 10% increase in methylation were as follows (values not listed here are given in Table 10):
ACTB_A [CpG_1 (see Table 10); CpG_3 (0.293-0.689); CpG_5 (0.300-0.679); CpG_6 (0.302-0.667); CpG_8.9 (0.298-0.657); CpG_12 (0.277-0.613)];
ACTB_B [CpG_2.3 (0.308-0.639); CpG_6 (see Table 10); CpG_8 (0.306-0.668); CpG_9 (0.307-0.670); CpG_10 (0.311-0.647)];
ACTB_C [CpG_6 (0.296-0.639); CpG_11.12 (0.305-0.655); CpG_17 (0.304-0.613); CpG_24 (see Table 10); CpG_25 (see Table 10); CpG_34 (0.310-0.657)];
ACTB_D [CpG_2.3 (0.315-0.647); CpG_6 (0.297-0.607); CpG_9.10 (0.296-0.602); CpG_12 (0.305-0.619); CpG_14 (see Table 10); CpG_17 (0.300-0.604)];
ACTB_E [CpG_1 (see Table 10); CpG_2 (see Table 10); CpG_3 (0.303-0.606); CpG_5 (0.298-0.609)];
all P values < 0.001 (Table 10).
TABLE 8 Comparison of methylation levels at ACTB gene CpG sites between 91 stroke cases (clinical onset time ≤ 1.5 years) and 147 controls
Note: IQR, interquartile range; OR, odds ratio; CI, confidence interval; ACTB_A, promoter region and exon 1; ACTB_B, promoter region and exon 1; ACTB_C, exon 2 and intron 2; ACTB_D, exons 3, 4, 5 and introns 3, 4, 5; ACTB_E, exon 6 and intron 6; *adjusted for leukocyte subtype ratio, sex, age, drinking, smoking, BMI, history of hypertension, history of diabetes, HDL-C, LDL-C, TC and TG.
TABLE 9 Comparison of methylation levels at ACTB gene CpG sites between 67 stroke cases (clinical onset time ≤ 1.32 years) and 147 controls
Note: IQR, interquartile range; OR, odds ratio; CI, confidence interval; ACTB_A, promoter region and exon 1; ACTB_B, promoter region and exon 1; ACTB_C, exon 2 and intron 2; ACTB_D, exons 3, 4, 5 and introns 3, 4, 5; ACTB_E, exon 6 and intron 6; *adjusted for leukocyte subtype ratio, sex, age, drinking, smoking, BMI, history of hypertension, history of diabetes, HDL-C, LDL-C, TC and TG.
TABLE 10 Comparison of methylation levels at ACTB gene CpG sites between 35 stroke cases (clinical onset time ≤ 1 year) and 147 controls
Note: IQR, interquartile range; OR, odds ratio; CI, confidence interval; ACTB_A, promoter region and exon 1; ACTB_B, promoter region and exon 1; ACTB_C, exon 2 and intron 2; ACTB_D, exons 3, 4, 5 and introns 3, 4, 5; ACTB_E, exon 6 and intron 6; *adjusted for leukocyte subtype ratio, sex, age, drinking, smoking, BMI, history of hypertension, history of diabetes, HDL-C, LDL-C, TC and TG.
2. Correlation analysis of ACTB methylation and clinical onset time of stroke
The correlation between ACTB methylation and stroke onset time was analyzed in stroke cases with clinical onset times of ≤ 1.5 years, ≤ 1.32 years and ≤ 1 year. The results show that the methylation levels at 6 sites of the ACTB_A fragment (CpG_1, CpG_3, CpG_5, CpG_6, CpG_8.9, CpG_12), 5 sites of the ACTB_B fragment (CpG_2.3, CpG_6, CpG_8, CpG_9, CpG_10), 6 sites of the ACTB_C fragment (CpG_6, CpG_11.12, CpG_17, CpG_24, CpG_25, CpG_34), 6 sites of the ACTB_D fragment (CpG_2.3, CpG_6, CpG_9.10, CpG_12, CpG_14, CpG_17) and 4 sites of the ACTB_E fragment (CpG_1, CpG_2, CpG_3, CpG_5) were positively correlated with stroke onset time (Tables 11-13). In particular, in stroke cases with onset within 1.32 years and within 1 year, the correlation between methylation at these CpG sites and clinical onset time was strong (Spearman rank correlation coefficients all > 0.534, Tables 12-13).
TABLE 11 Correlation between ACTB gene methylation and clinical onset time of stroke (91 stroke cases with clinical onset time ≤ 1.5 years)
TABLE 12 Correlation between ACTB gene methylation and clinical onset time of stroke (67 stroke cases with clinical onset time ≤ 1.32 years)
TABLE 13 Correlation between ACTB gene methylation and clinical onset time of stroke (35 stroke cases with clinical onset time ≤ 1 year)
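The Spearman rank correlation used for Tables 11-13 can be sketched in plain Python as the Pearson correlation of the ranks, with tied values given their average rank. This is a generic textbook implementation, not the SPSS routine used in the study, and the example data are invented.

```python
def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Invented data: methylation level vs. onset time in years
methylation = [0.10, 0.20, 0.30, 0.40]
onset_years = [0.5, 0.9, 1.2, 1.4]
rho = spearman(methylation, onset_years)
print(rho)
```

A coefficient near +1 corresponds to the positive correlation described above: the lower the methylation, the shorter the time to onset.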
3. Association of ACTB methylation with age
The subjects were stratified by age and analyzed for the association of ACTB methylation with stroke. The results show that, in the population < 65 years of age, the methylation levels (case vs. control) were: 6 sites of the ACTB_A fragment [CpG_1 (0.29 vs. 0.35), CpG_3 (0.21 vs. 0.27), CpG_5 (0.28 vs. 0.33), CpG_6 (0.25 vs. 0.29), CpG_8.9 (0.19 vs. 0.23), CpG_12 (0.37 vs. 0.43)], 5 sites of the ACTB_B fragment [CpG_2.3 (0.56 vs. 0.64), CpG_6 (0.57 vs. 0.61), CpG_8 (0.35 vs. 0.48), CpG_9 (0.24 vs. 0.35), CpG_10 (0.23 vs. 0.34)] and, in the ACTB_C fragment, CpG_6 (0.25 vs. 0.35); the values for the remaining sites of the ACTB_C fragment and for the sites of the ACTB_D and ACTB_E fragments are given in Table 14.
Logistic regression showed that, after adjusting for covariates such as leukocyte subtype ratio, gender, alcohol consumption, smoking, BMI, history of hypertension, history of diabetes, HDL-C, LDL-C, TC and TG, the ORs (95% CIs) per 10% increase in methylation were as follows (intervals in parentheses are 95% CIs; values not listed here are given in Table 14):
ACTB_A [CpG_1 (see Table 14); CpG_3 (0.413-0.946), P = 0.006; CpG_5 (0.373-0.882), P = 0.001; CpG_6 (0.431-0.939), P = 0.002; CpG_8.9 (0.467-0.906), P = 0.003; CpG_12 (0.364-0.903), P = 0.001];
ACTB_B [CpG_2.3 (0.375-0.902), P = 0.006; CpG_6, 0.518 (0.307-0.891), P < 0.001; CpG_8 (0.302-0.889), P < 0.001; CpG_9 (0.315-0.891), P < 0.001; CpG_10 (0.364-0.902), P < 0.001];
ACTB_C [CpG_6 (0.382-0.901), P = 0.002; CpG_11.12, 0.522 (0.392-0.909), P = 0.004; CpG_17 (0.394-0.870), P = 0.002; CpG_24, 0.504 (0.342-0.876), P < 0.001; CpG_25 (0.385-0.892), P = 0.005; CpG_34 (0.392-0.901), P = 0.001];
ACTB_D [CpG_2.3 (0.311-0.903), P = 0.002; CpG_6, 0.548 (0.346-0.899), P < 0.001; CpG_9.10 (0.366-0.897), P = 0.002; CpG_12, 0.546 (0.385-0.901), P = 0.001; CpG_14 (0.374-0.900), P < 0.001; CpG_17 (0.358-0.903), P = 0.001];
ACTB_E [CpG_1 (0.366-0.913), P = 0.006; CpG_2, 0.546 (0.370-0.906), P = 0.005; CpG_3 (0.409-0.908), P = 0.005; CpG_5 (0.424-0.904), P = 0.003].
Specific results are shown in Table 14.
Correlation analysis of ACTB methylation and age showed that, in the control population, the methylation levels at 6 sites of the ACTB_A fragment (CpG_1, CpG_3, CpG_5, CpG_6, CpG_8.9, CpG_12), 5 sites of the ACTB_B fragment (CpG_2.3, CpG_6, CpG_8, CpG_9, CpG_10), 6 sites of the ACTB_C fragment (CpG_6, CpG_11.12, CpG_17, CpG_24, CpG_25, CpG_34), 6 sites of the ACTB_D fragment (CpG_2.3, CpG_6, CpG_9.10, CpG_12, CpG_14, CpG_17) and 4 sites of the ACTB_E fragment (CpG_1, CpG_2, CpG_3, CpG_5) were all negatively correlated with age (absolute values of the Spearman rank correlation coefficients all > 0.501, Table 15).
TABLE 14 Age-stratified analysis of methylation levels at ACTB gene CpG sites in 139 stroke cases and 147 controls
Note: IQR, interquartile range; OR, odds ratio; CI, confidence interval; ACTB_A, promoter region and exon 1; ACTB_B, promoter region and exon 1; ACTB_C, exon 2 and intron 2; ACTB_D, exons 3, 4, 5 and introns 3, 4, 5; ACTB_E, exon 6 and intron 6; *adjusted for leukocyte subtype ratio, sex, drinking, smoking, BMI, history of hypertension, history of diabetes, HDL-C, LDL-C, TC and TG.
TABLE 15 Correlation of ACTB gene methylation with age (147 controls)
4. Correlation of ACTB methylation with alcohol consumption
Studies have shown that, in addition to genetic mechanisms, environmental factors (such as drinking) may alter DNA methylation patterns. The study subjects were stratified by alcohol consumption and analyzed for the association between ACTB methylation and stroke. The results showed that, after stratification, methylation levels in stroke cases were lower than in controls (case vs. control; see Table 16 for the stratum) at 6 sites of the ACTB_A fragment [CpG_1 (0.29 vs. 0.36), CpG_3 (0.19 vs. 0.26), CpG_5 (0.27 vs. 0.32), CpG_6 (0.24 vs. 0.30), CpG_8.9 (0.19 vs. 0.24), CpG_12 (0.34 vs. 0.44)], 5 sites of the ACTB_B fragment [CpG_2.3 (0.58 vs. 0.68), CpG_6 (0.57 vs. 0.64), CpG_8 (0.34 vs. 0.48), CpG_9 (0.25 vs. 0.33), CpG_10 (0.23 vs. 0.33)], 6 sites of the ACTB_C fragment [CpG_6 (0.25 vs. 0.32), CpG_11.12 (0.15 vs. 0.23), CpG_17 (0.38 vs. 0.46), CpG_24 (0.25 vs. 0.33), CpG_25 (0.28 vs. 0.36), CpG_34 (0.31 vs. 0.40)], 6 sites of the ACTB_D fragment [CpG_2.3 (0.35 vs. 0.48), CpG_6 (0.31 vs. 0.45), CpG_9.10 (0.25 vs. 0.35), CpG_12 (0.17 vs. 0.28), CpG_14 (0.38 vs. 0.51), CpG_17 (0.38 vs. 0.49)] and 4 sites of the ACTB_E fragment [CpG_1 (0.24 vs. 0.34), CpG_2 (0.24 vs. 0.34), CpG_3 (0.22 vs. 0.32), CpG_5 (0.31 vs. 0.41)].
Logistic regression showed that, after adjusting for covariates such as leukocyte subtype ratio, gender, age, smoking, BMI, history of hypertension, history of diabetes, HDL-C, LDL-C, TC and TG, the ORs (95% CIs) per 10% increase in methylation were as follows (intervals in parentheses are 95% CIs; values not listed here are given in Table 16):
ACTB_A [CpG_1 (see Table 16); CpG_3, 0.625 (0.447-0.913), P = 0.005; CpG_5 (0.437-0.918), P = 0.006; CpG_6 (0.413-0.938), P = 0.003; CpG_8.9 (0.411-0.906), P = 0.004; CpG_12 (0.406-0.903), P = 0.001];
ACTB_B [CpG_2.3 (0.405-0.899), P = 0.001; CpG_6 (0.437-0.911), P = 0.006; CpG_8 (0.416-0.901), P < 0.001; CpG_9 (0.405-0.913), P = 0.001; CpG_10 (0.406-0.892), P < 0.001];
ACTB_C [CpG_6 (0.407-0.914), P = 0.007; CpG_11.12 (0.412-0.908), P = 0.006; CpG_17 (0.394-0.879), P < 0.001; CpG_24 (0.424-0.913), P = 0.003; CpG_25 (0.418-0.912), P = 0.002; CpG_34 (0.412-0.907), P = 0.001];
ACTB_D [CpG_2.3 (0.393-0.883), P < 0.001; CpG_6 (0.397-0.891), P < 0.001; CpG_9.10 (0.362-0.882), P < 0.001; CpG_12 (0.392-0.891), P < 0.001; CpG_14 (0.387-0.882), P < 0.001; CpG_17 (0.376-0.883), P < 0.001];
ACTB_E [CpG_1, CpG_2 and CpG_3 (see Table 16); CpG_5 (0.337-0.846), P < 0.001].
Specific results are shown in Table 16.
Correlation analysis of ACTB methylation and alcohol consumption showed that, in the stroke case population, the methylation levels at 6 sites of the ACTB_A fragment (CpG_1, CpG_3, CpG_5, CpG_6, CpG_8.9, CpG_12), 5 sites of the ACTB_B fragment (CpG_2.3, CpG_6, CpG_8, CpG_9, CpG_10), 6 sites of the ACTB_C fragment (CpG_6, CpG_11.12, CpG_17, CpG_24, CpG_25, CpG_34), 6 sites of the ACTB_D fragment (CpG_2.3, CpG_6, CpG_9.10, CpG_12, CpG_14, CpG_17) and 4 sites of the ACTB_E fragment (CpG_1, CpG_2, CpG_3, CpG_5) were negatively correlated with alcohol consumption (absolute values of the Spearman rank correlation coefficients > 0.542, Table 17).
To better illustrate the relationship between methylation and alcohol consumption, the population was divided into hypomethylated and hypermethylated groups at the median methylation level of each CpG site. The results showed that, in the populations grouped by methylation level at the 6 sites of the ACTB_A fragment (CpG_1, CpG_3, CpG_5, CpG_6, CpG_8.9, CpG_12), 5 sites of the ACTB_B fragment (CpG_2.3, CpG_6, CpG_8, CpG_9, CpG_10), 6 sites of the ACTB_C fragment (CpG_6, CpG_11.12, CpG_17, CpG_24, CpG_25, CpG_34), 6 sites of the ACTB_D fragment (CpG_2.3, CpG_6, CpG_9.10, CpG_12, CpG_14, CpG_17) and 4 sites of the ACTB_E fragment (CpG_1, CpG_2, CpG_3, CpG_5), drinkers had a higher risk of stroke than non-drinkers (OR values > 1.287, P values < 0.007, Table 18).
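The median split and the drinker vs. non-drinker comparison can be sketched as follows. This is a simplified illustration: it computes a crude (unadjusted) odds ratio from a 2×2 table, whereas the ORs in Table 18 come from the study's own analysis. The counts are invented, and assigning values equal to the median to the "high" group is an assumption.

```python
from statistics import median


def split_by_median(levels):
    """Label each methylation level 'low' or 'high' relative to the median.

    Assumption: values equal to the median are placed in the 'high' group.
    """
    cut = median(levels)
    return ["low" if v < cut else "high" for v in levels]


def odds_ratio(exposed_cases, exposed_controls,
               unexposed_cases, unexposed_controls):
    """Crude odds ratio from a 2x2 table (exposure = drinking)."""
    return (exposed_cases * unexposed_controls) / (
        exposed_controls * unexposed_cases
    )


groups = split_by_median([0.10, 0.20, 0.30, 0.40])
print(groups)

# Invented counts within one methylation group:
# drinkers: 30 cases / 20 controls; non-drinkers: 15 cases / 25 controls
crude_or = odds_ratio(30, 20, 15, 25)
print(crude_or)
```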
In addition, the study evaluated the combined effect on stroke of the methylation level at the above CpG sites (low vs. high) and alcohol consumption (yes vs. no), using the hypermethylated, non-drinking group as the reference group to assess methylation level, alcohol consumption and their interaction. The results indicated a synergistic effect between hypomethylation and drinking at 6 sites of the ACTB_A fragment (CpG_1, CpG_3, CpG_5, CpG_6, CpG_8.9, CpG_12), 5 sites of the ACTB_B fragment (CpG_2.3, CpG_6, CpG_8, CpG_9, CpG_10), 6 sites of the ACTB_C fragment (CpG_6, CpG_11.12, CpG_17, CpG_24, CpG_25, CpG_34), 6 sites of the ACTB_D fragment (CpG_2.3, CpG_6, CpG_9.10, CpG_12, CpG_14, CpG_17) and 4 sites of the ACTB_E fragment (CpG_1, CpG_2, CpG_3, CpG_5) (interaction ORs > 2.618, P values < 0.001, Table 19).
The results suggest that alcohol consumption may affect the methylation level of the ACTB gene, which in turn may lead to stroke. Drinking is one of the most common lifestyle habits among Chinese people, and these findings suggest that abstaining from alcohol can markedly reduce the risk of stroke.
TABLE 16 Comparison of CpG site methylation levels of the ACTB gene between 139 stroke cases and 147 controls, stratified by alcohol consumption (drinking/non-drinking)
Note: IQR, interquartile range; OR, odds ratio; CI, confidence interval; ACTB_A, promoter region and exon 1; ACTB_B, promoter region and exon 1; ACTB_C, exon 2 and intron 2; ACTB_D, exons 3, 4, 5 and introns 3, 4, 5; ACTB_E, exon 6 and intron 6; * adjusted for leukocyte subtype proportions, sex, age, smoking, BMI, history of hypertension, history of diabetes, HDL-C, LDL-C, TC and TG.
TABLE 17 Correlation of ACTB gene methylation with alcohol consumption (139 cases)
TABLE 18 Association of alcohol consumption with stroke at different methylation levels
TABLE 19 Combined effects of methylation and alcohol consumption on stroke
5. ACTB gene methylation value for stroke early warning and early diagnosis
The mathematical model for assisting stroke diagnosis established by the invention can achieve the following purposes:
(1) Distinguishing stroke patients from non-stroke controls;
(2) Providing early warning of stroke.
The mathematical model is established as follows:
(A) Data source: the methylation levels of the target CpG sites (one or any combination of those in Tables 1-5) in ex vivo blood samples from the 139 patients with new-onset stroke within 2 years of enrolment in the community cohort described in step one and from the 147 controls over the follow-up period (the detection method is the same as in step two).
In addition to these data, known parameters such as age, sex, white blood cell count, body mass index, smoking, drinking, history of hypertension, history of diabetes, HDL-C, LDL-C, TC and TG can be added as needed to improve discrimination.
(B) Model building
As required, data for any two different classes of subjects are selected as the training set (for example: stroke patients with onset within < 2 years and controls; stroke patients with onset within ≤ 1.5 years and controls; stroke patients with onset within ≤ 1.32 years and controls; stroke patients with onset within ≤ 1 year and controls; stroke patients and controls aged < 65 years; stroke patients and controls aged ≥ 65 years; stroke patients and controls who drink; stroke patients and controls who do not drink). A mathematical model is then established from these data by binary logistic regression, using statistical software such as SAS, R or SPSS. The threshold is either the value corresponding to the maximum Youden index calculated from the model, or is set directly to 0.5. After the sample to be tested has been measured and its data substituted into the model, the resulting detection index is assigned to one class (class B) if it is greater than the threshold and to the other class (class A) if it is smaller than the threshold; a detection index equal to the threshold falls into an indeterminate gray zone.
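The threshold choice described above, maximizing the Youden index J = sensitivity + specificity - 1, can be sketched as follows. The function name, the example scores and the use of midpoints between adjacent scores as candidate thresholds are all assumptions made for illustration; statistical packages may pick candidate cut-offs differently.

```python
def youden_threshold(scores, labels):
    """Return (threshold, J) maximizing J = sensitivity + specificity - 1.

    labels: 1 for class B, 0 for class A. A score above the threshold is
    called class B. Candidate thresholds are midpoints between adjacent
    distinct scores (an assumption), so no training score ties a candidate.
    """
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    distinct = sorted(set(scores))
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    best_t, best_j = None, float("-inf")
    for t in candidates:
        sensitivity = sum(s > t for s in pos) / len(pos)
        specificity = sum(s < t for s in neg) / len(neg)
        j = sensitivity + specificity - 1
        if j > best_j:
            best_t, best_j = t, j
    return best_t, best_j

# Hypothetical detection indices and known classes for a tiny training set.
scores = [0.1, 0.3, 0.4, 0.5, 0.7, 0.9]
labels = [0, 0, 0, 1, 0, 1]
threshold, j = youden_threshold(scores, labels)
print(threshold, j)
```

For this toy data the best cut-off lies between 0.4 and 0.5, where every class B sample and three of the four class A samples are classified correctly (J = 0.75).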
To predict which class a new sample belongs to, the methylation level of one or more CpG sites of the ACTB gene in the sample is first measured by a DNA methylation assay. The methylation data are then substituted into the mathematical model to calculate the detection index for the sample, the detection index is compared with the threshold, and the class of the sample is determined from the result of the comparison.
For example: based on the methylation level of a single CpG site or of multiple CpG sites of the ACTB gene in the training set, a mathematical model distinguishing class A from class B is established by binary logistic regression using statistical software such as SAS, R or SPSS. The mathematical model here is a binary logistic regression model, specifically: log(y/(1-y)) = b0 + b1x1 + b2x2 + b3x3 + ... + bnxn, where y is the dependent variable, i.e. the detection index obtained after substituting the methylation values of one or more methylation sites of the sample to be tested into the model; b0 is a constant; x1 to xn are the independent variables, i.e. the methylation values of one or more methylation sites of the sample to be tested (each a value between 0 and 1); and b1 to bn are the weights assigned by the model to the methylation value of each site. In a specific application, the model is fitted to the methylation levels (x1 to xn) of one or more DNA methylation sites measured in the training-set samples together with their known classification (class A or class B, with y assigned 0 and 1 respectively), thereby determining the constant b0 and the weights b1 to bn of the methylation sites. The value corresponding to the maximum Youden index calculated from the model is used as the threshold, or the threshold is set directly to 0.5. After the sample to be tested has been measured and its data substituted into the model, the resulting detection index (y value) is classified as class B if it is greater than the threshold and as class A if it is smaller than the threshold; a value equal to the threshold falls into an indeterminate gray zone.
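To make the formula concrete, the detection index y can be recovered from the linear predictor by inverting the logit. The sketch below assumes that "log" denotes the natural logarithm, as is usual for logistic regression; the constant, the weights and the site names are hypothetical and are not fitted values from this document.

```python
import math

def detection_index(b0, weights, values):
    """Return y such that log(y / (1 - y)) = b0 + sum(b_i * x_i).

    weights and values are dicts keyed by the same variable names.
    Assumption: 'log' is the natural logarithm.
    """
    linear = b0 + sum(weights[name] * values[name] for name in weights)
    return 1.0 / (1.0 + math.exp(-linear))

# Hypothetical constant and weights, chosen only for illustration.
b0 = -1.0
weights = {"CpG_1": 2.0, "CpG_3": -1.5}
# Methylation values of the sample to be tested (each between 0 and 1).
sample = {"CpG_1": 0.80, "CpG_3": 0.40}
y = detection_index(b0, weights, sample)
print(round(y, 3))
```

Here the linear predictor is -1.0 + 2.0 × 0.80 - 1.5 × 0.40 = 0, so the detection index is 0.5; a larger linear predictor moves y toward 1 and a smaller one toward 0.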
Class A and class B are two corresponding classes (which group is class A and which is class B depends on the specific mathematical model and is not prescribed here). To predict which class a subject's sample belongs to, blood is first collected from the subject and DNA is extracted from it. After the extracted DNA has been converted with bisulfite, the methylation level of a single CpG site or of a combination of multiple CpG sites of the subject's ACTB gene is measured by a DNA methylation assay, and the resulting methylation data are substituted into the mathematical model. If the calculated value, i.e. the detection index, is greater than the threshold, the subject is judged to belong to the class whose detection index is greater than the threshold in the training set (class B); if the detection index is smaller than the threshold, the subject is judged to belong to the class whose detection index is smaller than the threshold in the training set (class A); if the detection index is equal to the threshold, it cannot be determined whether the subject belongs to class A or class B.
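The three-way decision rule above can be written as a small function. This is only a sketch: the function name is hypothetical, and the tolerance used to decide that a floating-point detection index "equals" the threshold is an assumption that the text does not specify.

```python
import math

def classify(index, threshold, tol=1e-9):
    """Return 'B', 'A' or 'indeterminate' for a detection index.

    Assumption: values within tol of the threshold count as equal to it.
    """
    if math.isclose(index, threshold, rel_tol=0.0, abs_tol=tol):
        return "indeterminate"
    return "B" if index > threshold else "A"

# Example calls with a threshold of 0.5.
print(classify(0.7, 0.5), classify(0.3, 0.5), classify(0.5, 0.5))
```

Which clinical group corresponds to class A or class B depends on how y was coded when the model was fitted, so the labels returned here still have to be mapped to patient or control for each specific model.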
For example, the methylation of all CpG sites of ACTB_D (ACTB_D_2.3, ACTB_D_4.5, ACTB_D_6, ACTB_D_7.8, ACTB_D_9.10, ACTB_D_11, ACTB_D_12, ACTB_D_14, ACTB_D_15.16, ACTB_D_17 and ACTB_D_18) was used with mathematical modeling to identify stroke patients 2 years in advance (early warning of stroke). Using the measured methylation levels of all CpG sites of ACTB_D in a training set of stroke patients with onset within < 2 years and controls (here, 139 stroke patients with onset within < 2 years and 147 controls), together with age, sex (male = 1, female = 0), white blood cell count, body mass index, smoking (smoker = 1, non-smoker = 0), drinking (drinker = 1, non-drinker = 0), history of hypertension (yes = 1, no = 0), history of diabetes (yes = 1, no = 0), HDL-C, LDL-C, TC and TG, a mathematical model for identifying stroke patients 2 years in advance (early warning of stroke) was established by binary logistic regression in SPSS or R.
The mathematical model here is a binary logistic regression model, from which the constant b0 and the weights b1 to bn of the individual methylation sites are determined. In this case the model is as follows (operators and coefficients that are illegible in the source are marked [?]): log(y/(1-y)) = -2.937 - 0.810×ACTB_D_2.3 [?] 0.916×ACTB_D_4.5 - 1.573×ACTB_D_6 - 3.181×ACTB_D_7.8 [?] 0.931×ACTB_D_9.10 [?] 0.882×ACTB_D_11 [?] 3.763×ACTB_D_12 - 2.570×ACTB_D_14 [?] 1.142×ACTB_D_15.16 [?] 0.221×ACTB_D_17 [?] 1.285×ACTB_D_18 + [?]×age - 0.15[?]×sex (male = 1, female = 0) + 0.302×white blood cell count + 0.034×body mass index + 0.025×smoking (smoker = 1, non-smoker = 0) - 0.055×drinking (drinker = 1, non-drinker = 0) + 0.035×history of hypertension (yes = 1, no = 0) [?] 0.030×history of diabetes (yes = 1, no = 0) + 0.009×HDL-C + 0.351×LDL-C - 0.170×TC - 0.013×TG, where y is the dependent variable, i.e. the detection index obtained after substituting into the model the methylation values of all CpG sites of ACTB_D of the sample to be tested together with the subject's age, sex, white blood cell count, body mass index, smoking, drinking, history of hypertension, history of diabetes, HDL-C, LDL-C, TC and TG.
The diagnostic threshold (0.36) is obtained from the maximum Youden index. After the sample to be tested has been measured, the methylation levels of all CpG sites of ACTB_D are substituted into the model together with the subject's age, sex, white blood cell count, body mass index, smoking, drinking, history of hypertension, history of diabetes, HDL-C, LDL-C, TC and TG. A resulting detection index (y value) smaller than the threshold is classified as a stroke patient, a value greater than the threshold is classified as a control, and a value equal to the threshold is indeterminate. The area under the curve (AUC) of this model was calculated to be 0.76 (Table 20). As a specific example of the determination: blood is collected from two subjects (A and B), DNA is extracted, the extracted DNA is converted with bisulfite, and the methylation levels of all CpG sites of ACTB_D are measured by a DNA methylation assay. The measured methylation data are then substituted into the above mathematical model together with each subject's age, sex, white blood cell count, body mass index, smoking, drinking, history of hypertension, history of diabetes, HDL-C, LDL-C, TC and TG. If the value calculated for subject A is 0.35, which is less than 0.36, subject A is judged to be a stroke patient; if the value calculated for subject B is 0.58, which is greater than 0.36, subject B is judged not to have stroke.
(C) Evaluation of model Effect
Following the above method, mathematical models for identifying stroke patients 1 year, 1.32 years, 1.5 years and 2 years in advance were established, and their performance was evaluated with receiver operating characteristic (ROC) curves. The larger the area under the ROC curve (AUC), the better the model discriminates and the more effective the molecular marker. The evaluation results for models built with different CpG sites are shown in Table 20. In Table 20, "1 CpG site" denotes any single CpG site in the amplified fragments ACTB_A/ACTB_B/ACTB_C/ACTB_D/ACTB_E, "2 CpG sites" denotes any combination of 2 CpG sites in these fragments, "3 CpG sites" denotes any combination of 3 CpG sites in these fragments, and so on. The values in the table are ranges over the different site combinations (i.e., the result for any combination of CpG sites falls within the range).
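For readers who want to see what the AUC measures, it equals the probability that a randomly chosen sample of the positive class receives a higher detection index than a randomly chosen sample of the negative class, with ties counted as one half. The sketch below uses this pairwise definition; the function name and the data are hypothetical, and which class counts as positive depends on how the model was coded.

```python
def roc_auc(scores, labels):
    """Return the area under the ROC curve by pairwise comparison.

    labels: 1 for the positive class, 0 for the negative class.
    Each (positive, negative) pair scores 1 if the positive sample has
    the higher index, 0.5 if the two are tied, and 0 otherwise.
    """
    pos = [s for s, lab in zip(scores, labels) if lab == 1]
    neg = [s for s, lab in zip(scores, labels) if lab == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical detection indices and known classes.
scores = [0.1, 0.3, 0.4, 0.5, 0.7, 0.9]
labels = [0, 0, 0, 1, 0, 1]
auc = roc_auc(scores, labels)
print(auc)
```

An AUC of 0.5 corresponds to no discrimination and 1.0 to perfect separation, which is why larger values in Table 20 indicate better models.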
The results show that, using methylation of the ACTB gene (all CpG sites), the areas under the ROC curve for identifying stroke 1 year, 1.32 years, 1.5 years and 2 years in advance were 0.91, 0.89, 0.85 and 0.81, respectively. At the diagnostic thresholds obtained from the maximum Youden index, the corresponding sensitivities were 86.2%, 85.3%, 83.3% and 79.1%, and the specificities were 90.2%, 86.8%, 85.9% and 80.2%, respectively. ACTB gene methylation therefore has good early-warning and early-diagnosis performance for stroke. In addition, we further analyzed the diagnostic value of ACTB gene methylation for stroke by age and drinking status. For subjects aged < 65 years and ≥ 65 years, the areas under the ROC curve for diagnosing stroke by ACTB gene methylation (all CpG sites) were 0.88 and 0.79, the sensitivities at the diagnostic thresholds obtained from the maximum Youden index were 84.6% and 79.8%, and the specificities were 88.4% and 80.1%, respectively. For drinkers and non-drinkers, the areas under the ROC curve for diagnosing stroke by ACTB gene methylation (all CpG sites) were 0.88 and 0.80, the sensitivities at the diagnostic thresholds obtained from the maximum Youden index were 83.6% and 78.6%, and the specificities were 87.1% and 80.3%, respectively. These results suggest that ACTB gene methylation has a good diagnostic effect for stroke in people under 65 years of age and in drinkers (Table 20).
TABLE 20 value of ACTB Gene methylation for early warning and early diagnosis of Stroke
<110> Nanjing Medical University
<120> Methylation of a blood cytoskeletal protein gene as a potential marker for early diagnosis of stroke
<130> GNCLN200632
<160> 15
<170> PatentIn version 3.5
<210> 1
<211> 467
<212> DNA
<213> Artificial sequence
<400> 1
gccctgaaag cagggcctga ggacctctgg ctgtggggct cagctagcta aatgtgctgg 60
gtgggtcact agggagagac ctgggcttga gaggtagagt gtggtgttgg gggagtcagg 120
tggcttgcgg ccacttagaa gtcgcaggac cacactcccc aggacagggc aggggccagc 180
ggtccagtgg ctggaggtgg cccgtgatga aggctacaaa cctacccagc cgcagccctg 240
ggaaggaagg tgggctctac agggcagggc accttttacc ctggagctgc ctgcttttga 300
gggtaacagt cacgcccagc caagacaagg cctggggcgt tagtgggtga cctaggcact 360
gcggggcggg ggggctgggt ctacacagcc tgggtctggg cccaccgtcc gttgtatgtc 420
tgctatgcgc agccacagct gaactgccct cccagaccat ctggagg 467
<210> 2
<211> 462
<212> DNA
<213> Artificial sequence
<400> 2
cagatggtct gggagggcag ttcagctgtg gctgcgcata gcagacatac aacggacggt 60
gggcccagac ccaggctgtg tagacccagc ccccccgccc cgcagtgcct aggtcaccca 120
ctaacgcccc aggccttgtc ttggctgggc gtgactgtta ccctcaaaag caggcagctc 180
cagggtaaaa ggtgccctgc cctgtagagc ccaccttcct tcccagggct gcggctgggt 240
aggtttgtag ccttcatcac gggccacctc cagccactgg accgctggcc cctgccctgt 300
cctggggagt gtggtcctgc gacttctaag tggccgcaag ccacctgact cccccaacac 360
cacactctac ctctcaagcc caggtctctc cctagtgacc cacccagcac atttagctag 420
ctgagcccca cagccagagg tcctcaggcc ctgctttcag gg 462
<210> 3
<211> 279
<212> DNA
<213> Artificial sequence
<400> 3
ggcttccttt gtccccaatc tgggcgcgcg ccggcgcccc ctggcggcct aaggactcgg 60
cgcgccggaa gtggccaggg cgggggcgac ctcggctcac agcgcgcccg gctattctcg 120
cagctcacca tggatgatga tatcgccgcg ctcgtcgtcg acaacggctc cggcatgtgc 180
aaggccggct tcgcgggcga cgatgccccc cgggccgtct tcccctccat cgtggggcgc 240
cccaggcacc aggtagggga gctggctggg tggggcagc 279
<210> 4
<211> 367
<212> DNA
<213> Artificial sequence
<400> 4
gggacctgac tgactacctc atgaagatcc tcaccgagcg cggctacagc ttcaccacca 60
cggccgagcg ggaaatcgtg cgtgacatta aggagaagct gtgctacgtc gccctggact 120
tcgagcaaga gatggccacg gctgcttcca gctcctccct ggagaagagc tacgagctgc 180
ctgacggcca ggtcatcacc attggcaatg agcggttccg ctgccctgag gcactcttcc 240
agccttcctt cctgggtgag tggagactgt ctcccggctc tgcctgacat gagggttacc 300
cctcggggct gtgctgtgga agctaagtcc tgccctcatt tccctctcag gcatggagtc 360
ctgtggc 367
<210> 5
<211> 445
<212> DNA
<213> Artificial sequence
<400> 5
gggccctgta gaacaatgag aatctgacct gcaactagct gggcgtgctg gggcatgcct 60
gtgtagtttc agctacttgg gaggctgagg caggagaatt gcttgagccc aaagttgagg 120
ctgcagtgag ccatggttgt gccattacac tccagcctgg gcaacacaag accccgtctc 180
agaaataaaa agagaacctg gcctgcagtg ccaggcaggc cctgaggtcc aggagcctgg 240
gtatctccct ctgcagcatg ggtcacgaac aaactgggcc ctcagaggcc acgggatggc 300
gcccagtctc cagtcacaag gcagaatcca gacctcagcc catagctaac cagagctgtc 360
tgcaggccag atatggcccc atggaccccc taccccaact tgactttgat tccaggtccc 420
cctctgtctg gatgaacagg tagga 445
<210> 6
<211> 35
<212> DNA
<213> Artificial sequence
<400> 6
aggaagagag gttttgaaag tagggtttga ggatt 35
<210> 7
<211> 58
<212> DNA
<213> Artificial sequence
<400> 7
cagtaatacg actcactata gggagaaggc tcctccaaat aatctaaaaa aacaattc 58
<210> 8
<211> 34
<212> DNA
<213> Artificial sequence
<400> 8
aggaagagag tagatggttt gggagggtag ttta 34
<210> 9
<211> 56
<212> DNA
<213> Artificial sequence
<400> 9
cagtaatacg actcactata gggagaaggc tccctaaaaa caaaacctaa aaacct 56
<210> 10
<211> 35
<212> DNA
<213> Artificial sequence
<400> 10
aggaagagag ggaaggaaag gataagaagt tttga 35
<210> 11
<211> 51
<212> DNA
<213> Artificial sequence
<400> 11
cagtaatacg actcactata gggagaaggc tactacccca cccaaccaac t 51
<210> 12
<211> 37
<212> DNA
<213> Artificial sequence
<400> 12
aggaagagag gggatttgat tgattatttt atgaaga 37
<210> 13
<211> 56
<212> DNA
<213> Artificial sequence
<400> 13
cagtaatacg actcactata gggagaaggc taccacaaaa ctccatacct aaaaaa 56
<210> 14
<211> 37
<212> DNA
<213> Artificial sequence
<400> 14
aggaagagag gggttttgta gaataatgag aatttga 37
<210> 15
<211> 56
<212> DNA
<213> Artificial sequence
<400> 15
cagtaatacg actcactata gggagaaggc ttcctaccta ttcatccaaa caaaaa 56