CN113257421A - Method and system for constructing hypertension prediction model - Google Patents

Method and system for constructing hypertension prediction model

Info

Publication number
CN113257421A
Authority
CN
China
Prior art keywords
hypertension
characteristic
variable
variables
risk
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110606139.0A
Other languages
Chinese (zh)
Other versions
CN113257421B (en
Inventor
李平
陈伯怀
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Wuzheng Intelligent Technology Beijing Co ltd
Original Assignee
Wuzheng Intelligent Technology Beijing Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Wuzheng Intelligent Technology Beijing Co ltd filed Critical Wuzheng Intelligent Technology Beijing Co ltd
Priority to CN202110606139.0A priority Critical patent/CN113257421B/en
Publication of CN113257421A publication Critical patent/CN113257421A/en
Application granted granted Critical
Publication of CN113257421B publication Critical patent/CN113257421B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02ATECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention relates to a method and a system for constructing a hypertension prediction model. The method comprises the following steps: screening and determining characteristic variables of the hypertension prediction model based on a statistical method; determining the regression coefficient and the corresponding score of each characteristic variable by taking the characteristic variables as factors; and constructing a multi-factor Logistic (log-odds) regression model for hypertension prediction according to the regression coefficient and the corresponding score of each characteristic variable, wherein the multi-factor Logistic regression model is a hypertension risk prediction probability value function of the characteristic variables. The main influence factors of hypertension are analyzed, the Logistic regression model is used to establish a quantitative relation between multiple risk factors and the future occurrence of hypertension, and the probability that an individual will develop hypertension in the future is predicted from the levels of those risk factors.

Description

Method and system for constructing hypertension prediction model
Technical Field
The invention relates to the technical field of health management, in particular to a method and a system for constructing a hypertension prediction model.
Background
Hypertension is one of the chronic diseases with the largest number of patients and is the most important risk factor for cardiovascular and cerebrovascular disease death of urban and rural residents, but the awareness rate, treatment rate and control rate of hypertension are still at a low level overall.
At present, whether a patient has hypertension is judged from the patient's clinical manifestations, so the best opportunity for prevention is easily missed. Taking measures to prevent hypertension is therefore important. However, the prior art cannot predict whether hypertension will occur in the future, so timely treatment and prevention cannot be achieved.
Disclosure of Invention
Aiming at the technical problems in the prior art, the invention provides a method and a system for constructing a hypertension prediction model, which are used for analyzing main influence factors of hypertension, determining a quantitative relation between various risk factors and the future occurrence of hypertension by adopting a Logistic regression model, and predicting the future occurrence probability of hypertension of an individual according to the levels of the various risk factors.
According to a first aspect of the present invention, there is provided a method for constructing a hypertension prediction model, including:
screening and determining characteristic variables of the hypertension prediction model based on a statistical method;
determining regression coefficients and corresponding scores of the characteristic variables by taking the characteristic variables as factors;
and constructing a multi-factor Logistic regression model for hypertension prediction according to the regression coefficient and the corresponding score of each characteristic variable, wherein the multi-factor Logistic regression model is a hypertension risk prediction probability value function corresponding to each characteristic variable.
On the basis of the technical scheme, the invention can be improved as follows.
Optionally, the process of screening and determining the characteristic variables of the hypertension prediction model includes: determining variables that influence hypertension, carrying out correlation analysis by a binary Logistic regression method to obtain the significance P value of each variable, and selecting the variables whose P value is smaller than a set threshold as the characteristic variables.
Optionally, the feature variables include: age, gender, smoking, exercise, family history of hypertension, BMI, diabetes, systolic and diastolic blood pressure.
Optionally, the hypertension risk prediction probability value function is as follows:
[Formula image BDA0003093403490000021: not reproduced in this text]
wherein i and N respectively represent the serial number and the total number of the characteristic variables, and the value of the expression
[Formula image BDA0003093403490000022: not reproduced in this text]
is β + βi × Wij + B × S, where βi represents the regression coefficient of the i-th characteristic variable, Wij represents a reference value determined from the value of the i-th characteristic variable, B is a constant set according to the regression coefficient and the change rate of the reference value, and S represents the sum of the corresponding scores of the respective characteristic variables.
Optionally, the method for determining the reference value Wij includes:
grouping the values of the characteristic variables;
when the characteristic variable is a numerical variable, groups are set for successive segments of the numerical range of the characteristic variable, and the midpoint of each group is selected as its reference value Wij;
and when the characteristic variable is a categorical variable, it is divided into two groups according to its categories, and the reference values Wij of the two groups are 0 and 1 respectively.
Optionally, the method for determining the corresponding score Pointsij of the i-th characteristic variable includes:
selecting one group of each characteristic variable as the basic risk reference value WiREF;
calculating, in combination with the regression coefficient βi, the distance D between each group of the characteristic variable and the basic risk reference value WiREF: D = (Wij - WiREF) × βi;
determining the constant B = x × βi, where x represents the interval of the grouping of the characteristic variable;
and calculating the corresponding score of the i-th characteristic variable: Pointsij = D / B = (Wij - WiREF) × βi / B.
Optionally, the construction method further includes: stratifying the risk of hypertension according to the probability corresponding to each score, wherein the strata comprise high risk, medium risk and low risk.
According to a second aspect of the present invention, there is provided a system for constructing a hypertension prediction model, including:
the system comprises a characteristic variable screening module, a characteristic variable parameter calculating module and a model constructing module;
the characteristic variable screening module is used for screening and determining the characteristic variables of the hypertension prediction model based on a statistical method;
the characteristic variable parameter calculation module is used for determining a regression coefficient and a corresponding score of each characteristic variable by taking the characteristic variable as a factor;
and the model construction module is used for constructing a multi-factor Logistic regression model for hypertension prediction according to the regression coefficient and the corresponding score of each characteristic variable, and the multi-factor Logistic regression model is a hypertension risk prediction probability value function corresponding to each characteristic variable.
According to a third aspect of the present invention, there is provided an electronic device comprising a memory, and a processor, wherein the processor is configured to implement the steps of the method for constructing the hypertension prediction model when executing a computer management class program stored in the memory.
According to a fourth aspect of the present invention, there is provided a computer-readable storage medium having stored thereon a computer management-like program, which when executed by a processor, implements the steps of the method of constructing a hypertension prediction model.
According to the method, the system, the electronic equipment and the storage medium for constructing the hypertension prediction model, provided by the invention, the main influence factors of hypertension are analyzed, the characteristic variables are screened based on statistics, the numerical values of the characteristic variables are classified and assigned, a Logistic regression model is adopted to determine a quantitative relation between various risk factors and the occurrence of future hypertension, the incidence probability of the future hypertension of an individual is predicted according to the levels of the various risk factors, and an instructive suggestion is provided for early screening of clinical hypertension; the risk of hypertension is layered according to the probability, such as high risk, medium risk and low risk, and personalized and specialized health management schemes are provided for different layers.
Drawings
FIG. 1 is a flow chart of a method for constructing a hypertension prediction model according to the present invention;
FIG. 2 is a structural diagram of a system for constructing a hypertension prediction model according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of an embodiment of an electronic device according to an embodiment of the present invention;
FIG. 4 is a schematic diagram of an embodiment of a computer-readable storage medium provided in the present invention.
Detailed Description
The principles and features of this invention are described below in conjunction with the following drawings, which are set forth by way of illustration only and are not intended to limit the scope of the invention.
It can be understood that, based on the defects in the background art, the embodiment of the invention provides a method for constructing a hypertension prediction model. Fig. 1 is a flowchart of a method for constructing a hypertension prediction model according to the present invention, and as shown in fig. 1, the method for constructing a hypertension prediction model includes:
Screening and determining characteristic variables of the hypertension prediction model based on a statistical method.
Determining the regression coefficient and the corresponding score of each characteristic variable by taking the characteristic variables as factors.
Constructing a multi-factor Logistic regression model for hypertension prediction according to the regression coefficient and the corresponding score of each characteristic variable, wherein the multi-factor Logistic regression model is a hypertension risk prediction probability value function of the characteristic variables and is used for predicting the probability that middle-aged and elderly people will develop hypertension within the next 3 years.
The invention provides a method for constructing a hypertension prediction model, which analyzes main influence factors of hypertension, adopts a Logistic regression model to determine a quantitative relation between various risk factors and future hypertension occurrence, and predicts the future hypertension occurrence probability of an individual according to the levels of various risk factors.
Example 1
Embodiment 1 provided by the present invention is an embodiment of constructing a hypertension prediction model provided by the present invention, and as can be seen from fig. 1, the embodiment includes:
Screening and determining characteristic variables of the hypertension prediction model based on a statistical method.
Preferably, the process of screening and determining the characteristic variables of the hypertension prediction model comprises: determining variables that influence hypertension, performing correlation analysis by a binary Logistic regression method to obtain the significance P value of each variable, and selecting the variables whose P value is smaller than a set threshold as the characteristic variables.
In a specific implementation, the factors (variables) that may affect hypertension are determined according to the Chinese Hypertension Health Management Regulations (2019) and recent hypertension demographic tables. There are 15 such variables: age, sex, smoking, exercise, family history of hypertension, obesity, diabetes, long-term mental stress, smoking, hyperlipidemia, high salt intake, systolic blood pressure, diastolic blood pressure, excessive drinking and air pollution.
The screening aims to eliminate the variables with poor predictive power from the 15 variables and to retain the strongly correlated variables as the basis for establishing the subsequent prediction model. Hypertension disease factor data are extracted from the health information data provided by the CDC BRFSS database, and the variables are screened by a statistical method: correlation analysis is carried out by a binary Logistic regression method to obtain the P value of each single variable, and a P value of less than 0.05 is taken as statistically significant, so that factors with little influence on hypertension are eliminated. The final screened characteristic variables are: age, gender, smoking, exercise, family history of hypertension, obesity (BMI), diabetes, systolic blood pressure and diastolic blood pressure. All 9 selected characteristic variables met the screening condition (P < 0.05) in the single-factor analysis. Obesity is expressed as BMI, which is weight (kg) divided by the square of height (m²).
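The single-factor screening described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: it fits a one-variable Logistic regression by Newton-Raphson and uses a two-sided Wald test for the P value, which is an assumption (the text does not say which test is used). The function names and the data are hypothetical and synthetic, not taken from the BRFSS database.

```python
import math

def univariate_logistic_p(x, y, iters=25):
    """Fit logit(p) = b0 + b1*x by Newton-Raphson.

    Returns (b1, two-sided Wald P value for b1).
    Assumes the data are not perfectly separable.
    """
    b0 = b1 = 0.0
    for _ in range(iters):
        # Gradient (g) and information matrix (h) of the log-likelihood
        g0 = g1 = h00 = h01 = h11 = 0.0
        for xi, yi in zip(x, y):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * xi)))
            w = p * (1.0 - p)
            g0 += yi - p
            g1 += (yi - p) * xi
            h00 += w
            h01 += w * xi
            h11 += w * xi * xi
        det = h00 * h11 - h01 * h01
        # Newton step: beta <- beta + H^-1 g (2x2 inverse written out)
        b0 += (h11 * g0 - h01 * g1) / det
        b1 += (h00 * g1 - h01 * g0) / det
    se_b1 = math.sqrt(h00 / det)  # standard error of b1 from H^-1
    z = b1 / se_b1
    return b1, math.erfc(abs(z) / math.sqrt(2.0))

def screen_variables(candidates, y, alpha=0.05):
    """Keep the candidate variables whose single-factor P value is below alpha."""
    selected = []
    for name, x in candidates.items():
        _, p_value = univariate_logistic_p(x, y)
        if p_value < alpha:
            selected.append(name)
    return selected

# Synthetic data: 20 cases (1) and 20 non-cases (0)
y = [1] * 20 + [0] * 20
candidates = {
    "exposure": [1] * 16 + [0] * 4 + [1] * 4 + [0] * 16,  # associated with y
    "unrelated": [1, 0] * 20,                              # same split in both classes
}
selected = screen_variables(candidates, y)
print(selected)  # prints ['exposure']
```

With real survey data a statistics package would normally be used for this step; the hand-written fit only keeps the sketch self-contained.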
And determining the regression coefficient and the corresponding score of each characteristic variable by taking the characteristic variable as a factor.
And constructing a multi-factor Logistic regression model for hypertension prediction according to the regression coefficient and the corresponding score of each characteristic variable, wherein the multi-factor Logistic regression model is a hypertension risk prediction probability value function corresponding to each characteristic variable.
The hypertension risk prediction probability value function is:
[Formula image BDA0003093403490000051: not reproduced in this text]
wherein i and N respectively represent the serial number and the total number of the characteristic variables, and the value of the expression
[Formula image BDA0003093403490000052: not reproduced in this text]
is β + βi × Wij + B × S, where βi represents the regression coefficient of the i-th characteristic variable, Wij represents a reference value determined from the value of the i-th characteristic variable, B is a constant set according to the regression coefficient and the change rate of the reference value, and S represents the sum of the corresponding scores of the respective characteristic variables.
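Because the formula images are not available in this text, the following sketch assumes the standard Logistic form p = 1 / (1 + exp(-z)), where z is the linear predictor of the model. The function names are hypothetical, and the coefficients in the usage example are placeholders, not values from the patent.

```python
import math

def logistic(z):
    """Standard logistic function: maps a linear predictor z to a probability."""
    return 1.0 / (1.0 + math.exp(-z))

def logit(p):
    """Inverse of the logistic function: the log-odds ln(p / (1 - p))."""
    return math.log(p / (1.0 - p))

def linear_predictor(intercept, betas, values):
    """Intercept plus the sum of coefficient * variable value over all variables."""
    return intercept + sum(b * v for b, v in zip(betas, values))

# Placeholder coefficients and values, for illustration only
z = linear_predictor(-1.0, [0.5, 2.0], [2.0, 0.25])
print(z, logistic(z))
```

The pair `logistic` and `logit` also makes it easy to move between a probability such as 5.93% and its position on the log-odds scale.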
The risk factors of main interest are included in the multi-factor Logistic regression model, so that the regression coefficient β, the OR (odds ratio) value and its 95% CI (confidence interval) are estimated for each factor. In the multi-factor Logistic regression model, an OR value of 1 indicates that the factor has no effect on the occurrence of the disease; an OR value greater than 1 indicates that the factor is a risk factor; an OR value less than 1 indicates that the factor is a protective factor. If the regression coefficient β is positive, the log-odds of the dependent variable, namely ln(p/(1-p)), increases as the independent variable increases, and so does the probability p, that is, the independent variable is positively correlated with the dependent variable; conversely, a negative regression coefficient β indicates that the effect of the independent variable on the dependent variable is negative, that is, they are negatively correlated.
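The relation between a regression coefficient and its OR value can be illustrated as follows. This is a sketch under stated assumptions: the OR is taken as exp(β) and the 95% CI as exp(β ± 1.96 × SE), which is the usual Wald interval; the standard error 0.01 is a hypothetical value, while 0.0575 is the age coefficient quoted later in this embodiment.

```python
import math

def odds_ratio_with_ci(beta, se, z_crit=1.96):
    """Return (OR, lower, upper) for a regression coefficient and its standard error."""
    return (math.exp(beta),
            math.exp(beta - z_crit * se),
            math.exp(beta + z_crit * se))

def interpret(or_value):
    """Classify a factor by its odds ratio, as described in the text."""
    if or_value > 1:
        return "risk factor"
    if or_value < 1:
        return "protective factor"
    return "no effect"

# Age coefficient from the embodiment; the standard error is hypothetical
or_age, lower, upper = odds_ratio_with_ci(0.0575, 0.01)
print(round(or_age, 4), interpret(or_age))
```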
[Figure BDA0003093403490000061: image not reproduced in this text]
For example, in the embodiments provided by the present invention:
[Figures BDA0003093403490000062 to BDA0003093403490000066: images not reproduced in this text]
For example, when the total score S is 5 points, the corresponding risk probability value is 5.93%.
In a possible embodiment, the method of determining the reference value Wij comprises:
the values of the individual characteristic variables are grouped.
When the characteristic variable is a numerical variable, groups are set for successive segments of the numerical range of the characteristic variable, and the midpoint of each group is selected as its reference value Wij.
When the characteristic variable is a categorical variable, it is divided into two groups according to its categories, and the reference values Wij of the two groups are 0 and 1 respectively.
When the characteristic variables are numerical variables, the risk factors are grouped according to clinical significance or use habits, an appropriate numerical value is selected as a reference value Wij in each group, and a middle value in the group is usually selected as a reference value.
For example, in this embodiment the study population is aged 45-84 years and is divided into 5 groups in 10-year age bands, and each group takes its midpoint as the reference value Wij; for example, the reference value Wij of the 45-54 age group is (45+54)/2 = 49.5.
Systolic blood pressure ranges from 70 to 139 mmHg. Values below 110 mmHg form one group, and values of 110 mmHg and above are divided into 7 groups in steps of 5 mmHg; each group takes its midpoint as the reference value Wij. For example, the reference value Wij of the group of 120-.
Diastolic blood pressure ranges from 50 to 89 mmHg. Values below 70 mmHg form one group, and values of 70 mmHg and above are divided into 3 groups in steps of 10 mmHg; each group takes its midpoint as the reference value Wij. For example, the reference value Wij of the 70-80 mmHg group is (70+80)/2 = 75.
BMI ranges from 15 to 50 and is divided into four groups: less than 25, 25 to 29, 30 to 39, and 40 or more; each group takes its midpoint as the reference value Wij. For example, the reference value Wij of the 25 to 29 group is (25+29)/2 = 27.
When the characteristic variable is a categorical variable, such as gender, male can be set as the reference, namely the reference value Wij is 0, and female is then assigned the value 1. Similarly, no smoking is set to 0 and smoking to 1; exercise is set to 0 and no exercise to 1; no family history of hypertension is set to 0 and a family history of hypertension to 1; no diabetes is set to 0 and diabetes to 1.
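The grouping rules above can be sketched as a small lookup. The midpoint rule and the 0/1 coding follow the text; the variable and function names are hypothetical, and the age bands listed are simply the 10-year bands that cover 45-84, which is an assumption about how the groups are laid out.

```python
def midpoint(low, high):
    """Reference value Wij of a numeric group: the midpoint of its bounds."""
    return (low + high) / 2

# Assumed 10-year age bands covering the 45-84 study range
AGE_GROUPS = [(45, 54), (55, 64), (65, 74), (75, 84)]
age_refs = {group: midpoint(*group) for group in AGE_GROUPS}

# Categorical variables: the reference category is coded 0, the other 1
BINARY_CODES = {
    "gender": {"male": 0, "female": 1},
    "smoking": {"no": 0, "yes": 1},
    "exercise": {"yes": 0, "no": 1},
    "family_history": {"no": 0, "yes": 1},
    "diabetes": {"no": 0, "yes": 1},
}

print(age_refs[(55, 64)])  # prints 59.5
print(midpoint(25, 29))    # prints 27.0 (BMI group 25 to 29)
```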
The method for determining the corresponding score Pointsij of the ith characteristic variable comprises the following steps:
a grouping of characteristic variables is selected as the base risk reference value WiREF.
For each risk factor, an appropriate group is selected as the basic risk reference value WiREF. When the multi-factor Logistic regression model is constructed, the score of this group is set to 0. Groups whose reference value is higher than WiREF receive positive scores, and the higher the score, the higher the risk; groups whose reference value is lower than WiREF receive negative scores.
For example, reference values Wij corresponding to age 45-54 years, male, no smoking, exercise, BMI < 25, family history of no hypertension, no diabetes, systolic blood pressure < 110mmHg, diastolic blood pressure < 70mmHg may be selected as the basal risk reference value WiREF for each risk factor.
In combination with the regression coefficient βi, the distance D between each group of the characteristic variable and the basic risk reference value WiREF is calculated as D = (Wij - WiREF) × βi.
For example, in the embodiment of the present invention, the basic risk reference value WiREF for age is 49.5 (the midpoint of the 45-54 group), and the regression coefficient βi for age in the multi-factor Logistic regression model is 0.0575. For the 55-64 age group the reference value Wij is 59.5, so the distance between this group and the basic risk reference value is D = (59.5 - 49.5) × 0.0575 = 0.575. The distance D from each group of the other risk factors to its basic risk reference value is calculated with the same formula.
The constant B is determined as B = x × βi, where x represents the interval of the grouping of the characteristic variable.
The constant B is the change in a risk factor's contribution to the model that corresponds to 1 point in the scoring tool. For example, in the embodiment provided by the present invention, if 1 point is assigned for every 5 years of age, then B = 5 × βi = 5 × 0.0575 = 0.2875.
The corresponding score of the i-th characteristic variable is then calculated as Pointsij = D / B = (Wij - WiREF) × βi / B.
Finally, the calculated value can be rounded up to obtain the score corresponding to the group.
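The score calculation can be sketched as follows, using the age coefficient 0.0575 and the rule of 1 point per 5 years from the text. The reference value is taken as the midpoint of the 45-54 group computed from its bounds, (45+54)/2 = 49.5, and rounding to the nearest integer is an assumption about how the final rounding is done. Function names are hypothetical.

```python
def distance(w_ij, w_ref, beta):
    """D = (Wij - WiREF) * beta_i: distance of a group from the reference group."""
    return (w_ij - w_ref) * beta

def points(w_ij, w_ref, beta, b_const):
    """Pointsij = D / B, rounded to the nearest integer (assumed rounding rule)."""
    return round(distance(w_ij, w_ref, beta) / b_const)

BETA_AGE = 0.0575    # regression coefficient for age, from the embodiment
B = 5 * BETA_AGE     # 1 point per 5 years of age -> B = 0.2875
W_REF_AGE = 49.5     # midpoint of the 45-54 reference group

for w in (49.5, 59.5, 69.5, 79.5):
    print(w, points(w, W_REF_AGE, BETA_AGE, B))
```

Note that Python's `round` resolves exact .5 ties to the nearest even integer; a scoring tool that needs a different tie rule would replace that call.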
The scores of all risk factors are added up to obtain the total score. Theoretically, when every risk factor takes its lowest value, the lowest total score is 0+0+0+0+0+0+0+0+0 = 0; in the same way, the highest total score is 4+1+1+1+1+3+2+14+3 = 30, so the range of the total score is 0-30 points.
And calculating a corresponding table of the total score and the risk prediction probability.
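A lookup table of this kind can be sketched as below. It assumes that the linear predictor has the form base + B × S and that the probability follows the standard Logistic function; the value of `BASE` is a hypothetical placeholder (the patent's actual constants are in the figures not reproduced here), so the printed probabilities are illustrative only.

```python
import math

def score_table(base, b_const, max_score):
    """Map each total score S in 0..max_score to a predicted probability."""
    return {
        s: 1.0 / (1.0 + math.exp(-(base + b_const * s)))
        for s in range(max_score + 1)
    }

BASE = -4.0   # hypothetical value of the linear predictor at S = 0
B = 0.2875    # constant B from the age example in the text

table = score_table(BASE, B, 30)
for s in (0, 10, 20, 30):
    print(s, f"{table[s]:.2%}")
```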
Preferably, the construction method further comprises: stratifying the risk of hypertension according to the probability corresponding to each score, wherein the strata comprise high risk, medium risk and low risk.
For example, the strata may be high risk (> 35%), medium risk (10%-35%) and low risk (< 10%), and personalized and specialized health management schemes are provided for the different strata, covering diet, exercise, physical examination, daily precautions, preventive measures, indications for seeking medical care, and health education on hypertension.
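The stratification rule can be sketched as a simple threshold function. The cut-offs 10% and 35% come from the text; treating both boundary values as medium risk is an assumption, and the function name is hypothetical.

```python
def stratify(probability):
    """Assign a risk stratum from a predicted probability (0..1)."""
    if probability > 0.35:
        return "high"
    if probability < 0.10:
        return "low"
    return "medium"  # 10% to 35%, boundaries included (assumed)

print(stratify(0.0593))  # prints low
print(stratify(0.3117))  # prints medium
```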
To further verify the accuracy of the Logistic regression model, the present invention provides an example to compare the difference between the scoring tool and the original Logistic regression model prediction.
Assume a male patient aged 75 with a systolic pressure of 129 mmHg, a diastolic pressure of 85 mmHg, no smoking, exercise, no family history of hypertension, diabetes, and a BMI of 21.5; his risk of developing hypertension within the next 3 years is predicted as follows.
First, according to the scoring tool, the scores of the risk factors are 0, 4, 6, 3, 0, 1 and 0, giving a total score of 14; the risk probability obtained by table lookup is 31.17%.
Then, the probability is calculated directly from the multi-factor Logistic regression model:
Figure BDA0003093403490000091
The result is y = 30.23%. The predictions of the scoring tool and of the Logistic regression model therefore differ by only about 1 percentage point (31.17% versus 30.23%), which is sufficient for disease risk prediction and evaluation, and the scoring tool is intuitive and convenient to apply.
Example 2
Embodiment 2 provided by the present invention is an embodiment of a system for constructing a hypertension prediction model provided by the present invention, fig. 2 is a structural diagram of a system for constructing a hypertension prediction model provided by the embodiment of the present invention, and it can be known from fig. 2 that the embodiment of the system for constructing a hypertension prediction model includes: the device comprises a characteristic variable screening module, a characteristic variable parameter calculating module and a model building module.
And the characteristic variable screening module is used for screening and determining the characteristic variables of the hypertension prediction model based on a statistical method.
And the characteristic variable parameter calculation module is used for determining the regression coefficient and the corresponding score of each characteristic variable by taking the characteristic variable as a factor.
And the model construction module is used for constructing a multi-factor Logistic regression model for hypertension prediction according to the regression coefficient and the corresponding score of each characteristic variable, and the multi-factor Logistic regression model is a hypertension risk prediction probability value function corresponding to each characteristic variable.
It can be understood that the system for constructing a hypertension prediction model provided by the present invention corresponds to the method for constructing a hypertension prediction model provided in the foregoing embodiments, and the relevant technical features of the system for constructing a hypertension prediction model may refer to the relevant technical features of the method for constructing a hypertension prediction model, and are not described herein again.
Referring to fig. 3, fig. 3 is a schematic diagram of an embodiment of an electronic device according to an embodiment of the invention. As shown in fig. 3, an embodiment of the present invention provides an electronic device, which includes a memory 1310, a processor 1320, and a computer program 1311 stored in the memory 1310 and executable on the processor 1320, where the processor 1320 executes the computer program 1311 to implement the following steps:
screening and determining characteristic variables of a hypertension prediction model based on a statistical method; determining regression coefficients and corresponding scores of the characteristic variables by taking the characteristic variables as factors; and constructing a multi-factor Logistic regression model for hypertension prediction according to the regression coefficient and the corresponding score of each characteristic variable, wherein the multi-factor Logistic regression model is a hypertension risk prediction probability value function corresponding to each characteristic variable.
Referring to fig. 4, fig. 4 is a schematic diagram of an embodiment of a computer-readable storage medium according to the present invention. As shown in fig. 4, the present embodiment provides a computer-readable storage medium 1400, on which a computer program 1411 is stored, which computer program 1411, when executed by a processor, implements the steps of:
screening and determining characteristic variables of a hypertension prediction model based on a statistical method; determining regression coefficients and corresponding scores of the characteristic variables by taking the characteristic variables as factors; and constructing a multi-factor Logistic regression model for hypertension prediction according to the regression coefficient and the corresponding score of each characteristic variable, wherein the multi-factor Logistic regression model is a hypertension risk prediction probability value function corresponding to each characteristic variable.
The hypertension risk assessment tool is a practical tool for assessing the risk of developing hypertension in middle-aged and elderly people and for giving health guidance, and is mainly suitable for people over 45 years old. It establishes a prediction rule from factor variables such as age, sex, smoking, exercise, family history of hypertension, BMI, diabetes, SBP and DBP, assesses the risk that the person being tested will develop hypertension within the next 3 years, and gives corresponding prompts and suggestions according to the risk stratum and the level of each single risk factor. It provides guidance for early screening of clinical hypertension. The risk of hypertension is stratified according to the probability, for example into high risk, medium risk and low risk, and personalized and specialized health management schemes are provided for the different strata.
It should be noted that, in the foregoing embodiments, the descriptions of the respective embodiments have respective emphasis, and reference may be made to relevant descriptions of other embodiments for parts that are not described in detail in a certain embodiment.
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded computer, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present invention have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. Therefore, it is intended that the appended claims be interpreted as including preferred embodiments and all such alterations and modifications as fall within the scope of the invention.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (10)

1. A method for constructing a hypertension prediction model is characterized by comprising the following steps:
screening and determining characteristic variables of the hypertension prediction model based on a statistical method;
determining regression coefficients and corresponding scores of the characteristic variables by taking the characteristic variables as factors;
and constructing a multi-factor Logistic regression model for hypertension prediction according to the regression coefficient and the corresponding score of each characteristic variable, wherein the multi-factor Logistic regression model is a hypertension risk prediction probability value function corresponding to each characteristic variable.
2. The construction method according to claim 1, wherein the process of screening to determine the characteristic variables of the hypertension prediction model comprises: determining variables that influence hypertension, performing correlation analysis by a binary Logistic regression method to obtain the significance P value of each variable, and selecting the variables whose P value is smaller than a set threshold as the characteristic variables.
3. The construction method according to claim 1 or 2, wherein the feature variables include: age, gender, smoking, exercise, family history of hypertension, BMI, diabetes, systolic and diastolic blood pressure.
4. The construction method according to claim 1, wherein the hypertension risk prediction probability value function is:
Figure FDA0003093403480000011
wherein i and N respectively represent the serial number and the total number of the characteristic variables,
Figure FDA0003093403480000012
the value of which is β + βi·Wij + B·S, where βi denotes the regression coefficient of the i-th characteristic variable, Wij denotes the reference value determined from the value of the i-th characteristic variable, B is a constant set according to the regression coefficient and the rate of change of the reference value, and S denotes the sum of the corresponding scores of the characteristic variables.
5. The construction method according to claim 4, wherein the method of determining the reference value Wij comprises:
grouping the values of the characteristic variables;
when the characteristic variable is a numerical variable, dividing its numerical range into segments that form the groups, and selecting the middle value of each group as the reference value Wij;
and when the characteristic variable is a categorical variable, dividing it into two groups according to its categories, the reference value Wij of each group being 0 or 1.
6. The construction method according to claim 1, wherein the method of determining the corresponding score Pointsij of the i-th characteristic variable comprises:
selecting one group of the characteristic variable as the basic risk reference, with reference value WiREF;
calculating, using the regression coefficient βi, the distance D between each group of the characteristic variable and the basic risk reference value WiREF: D = (Wij - WiREF) × βi;
determining a constant B = x × βi, where x denotes the interval of the grouping of the characteristic variable;
and calculating the corresponding score of the i-th characteristic variable: Pointsij = D / B = (Wij - WiREF) × βi / B.
7. The construction method of claim 1, further comprising: stratifying the risk of hypertension according to the probability corresponding to each score, the strata comprising high risk, medium risk and low risk.
8. A system for constructing a hypertension prediction model, comprising: the system comprises a characteristic variable screening module, a characteristic variable parameter calculating module and a model constructing module;
the characteristic variable screening module is used for screening and determining the characteristic variables of the hypertension prediction model based on a statistical method;
the characteristic variable parameter calculation module is used for determining a regression coefficient and a corresponding score of each characteristic variable by taking the characteristic variable as a factor;
and the model construction module is used for constructing a multi-factor Logistic regression model for hypertension prediction according to the regression coefficient and the corresponding score of each characteristic variable, and the multi-factor Logistic regression model is a hypertension risk prediction probability value function corresponding to each characteristic variable.
9. An electronic device, comprising a memory and a processor, wherein the processor is configured to implement the steps of the method for constructing a hypertension prediction model according to any one of claims 1-7 when executing a computer management program stored in the memory.
10. A computer-readable storage medium, having stored thereon a computer management program which, when executed by a processor, carries out the steps of the method for constructing a hypertension prediction model according to any one of claims 1 to 7.
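The reference values and scores described in claims 5 and 6 can be sketched in Python as follows. This is a minimal sketch under stated assumptions: the regression coefficients, the age groups and the choice of age as the variable that supplies βi for the constant B are all hypothetical, and the helper names `midpoint_references` and `group_points` are introduced only for illustration.

```python
from typing import List, Sequence, Tuple


def midpoint_references(groups: Sequence[Tuple[float, float]]) -> List[float]:
    """Reference value Wij of each numeric group, taken as the group midpoint."""
    return [(low + high) / 2.0 for low, high in groups]


def group_points(beta_i: float, references: Sequence[float],
                 base_reference: float, b_constant: float) -> List[float]:
    """Pointsij = D / B = (Wij - WiREF) * beta_i / B for every group j."""
    if b_constant == 0:
        raise ValueError("B must be non-zero")
    return [(w_ij - base_reference) * beta_i / b_constant for w_ij in references]


# Hypothetical inputs, for illustration only (not values from the patent).
BETA_AGE = 0.04        # assumed regression coefficient for age
BETA_SMOKING = 0.30    # assumed regression coefficient for smoking
AGE_GROUPS = [(45, 49), (50, 54), (55, 59)]
GROUP_INTERVAL = 5     # x: width of one age group, in years

age_refs = midpoint_references(AGE_GROUPS)   # one midpoint per age group
b_constant = GROUP_INTERVAL * BETA_AGE       # B = x * beta_i, with age assumed
age_points = group_points(BETA_AGE, age_refs, age_refs[0], b_constant)
# Categorical variable: two groups with reference values 0 and 1.
smoking_points = group_points(BETA_SMOKING, [0.0, 1.0], 0.0, b_constant)
print(age_points, smoking_points)
```

With these assumed inputs, one point corresponds to the change in the linear predictor produced by one 5-year age group, and the youngest age group and non-smokers serve as the base-risk groups with zero points.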
CN202110606139.0A 2021-05-31 2021-05-31 Construction method and system of hypertension prediction model Active CN113257421B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110606139.0A CN113257421B (en) 2021-05-31 2021-05-31 Construction method and system of hypertension prediction model

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110606139.0A CN113257421B (en) 2021-05-31 2021-05-31 Construction method and system of hypertension prediction model

Publications (2)

Publication Number Publication Date
CN113257421A true CN113257421A (en) 2021-08-13
CN113257421B CN113257421B (en) 2023-09-15

Family

ID=77185582

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110606139.0A Active CN113257421B (en) 2021-05-31 2021-05-31 Construction method and system of hypertension prediction model

Country Status (1)

Country Link
CN (1) CN113257421B (en)

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1891144A (en) * 2005-07-01 2007-01-10 上海祥鹤制药厂 Stroke pre-warning detector
US20120185274A1 (en) * 2011-01-14 2012-07-19 Guzihou Hu System and Method for Predicting Inner Age
RU2456608C1 (en) * 2011-03-15 2012-07-20 Государственное образовательное учреждение высшего профессионального образования "Курский государственный медицинский университет Министерства здравоохранения и социального развития Российской Федерации" Method for prediction of risk of hypertension in males
CN104504297A (en) * 2015-01-21 2015-04-08 甘肃百合物联科技信息有限公司 Method for using neural network to forecast hypertension
KR101737279B1 (en) * 2016-12-15 2017-05-18 신성대학 산학협력단 Prediction system for onset of stroke disease
CN108257673A (en) * 2018-01-12 2018-07-06 南通大学 Risk value Forecasting Methodology and electronic equipment
CN112017783A (en) * 2020-09-14 2020-12-01 华中科技大学同济医学院附属协和医院 Prediction model for pulmonary infection after heart operation and construction method thereof
CN112331340A (en) * 2020-10-14 2021-02-05 国家卫生健康委科学技术研究所 Intelligent prediction method and system for pregnancy probability of pregnant couple

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
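NUSINOVICI, S, et al.: "Logistic regression was as good as machine learning for predicting major chronic diseases", JOURNAL OF CLINICAL EPIDEMIOLOGY, vol. 122, pages 56 - 69 *
于红, et al.: "Logistic regression analysis of high-risk factors for preeclampsia", 实用妇产科杂志 (Journal of Practical Obstetrics and Gynecology), no. 04, pages 273 - 275 *
王秀青, et al.: "Establishment and evaluation of a risk prediction model for acute cerebral infarction in hypertensive patients", 中国现代医生 (China Modern Doctor), vol. 58, no. 32, pages 129 - 132 *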

Also Published As

Publication number Publication date
CN113257421B (en) 2023-09-15

Similar Documents

Publication Publication Date Title
CN111292853B (en) Multi-parameter-based cardiovascular disease risk prediction network model and construction method thereof
CN108648827B (en) Cardiovascular and cerebrovascular disease risk prediction method and device
CN111653359B (en) Intelligent prediction model construction method and prediction system for hemorrhagic disease
US20180211727A1 (en) Automated Evidence Based Identification of Medical Conditions and Evaluation of Health and Financial Benefits Of Health Management Intervention Programs
Wang et al. Deep learning approaches for predicting glaucoma progression using electronic health records and natural language processing
CN110379521B (en) Medical data set feature selection method based on information theory
CN106355033A (en) Life risk assessment system
US20160358282A1 (en) Computerized system and method for reducing hospital readmissions
CN113539460A (en) Intelligent diagnosis guiding method and device for remote medical platform
CN107358019B (en) Recommendation method for concept-shifted medical solutions
CN112768074A (en) Artificial intelligence-based serious disease risk prediction method and system
CN102419791A (en) Method for estimating genetic risk of human common diseases
CN111815487B (en) Deep learning-based health education assessment method, device and medium
CN113257421B (en) Construction method and system of hypertension prediction model
CN114743619B (en) Questionnaire quality evaluation method and system for disease risk prediction
CN116564521A (en) Chronic disease risk assessment model establishment method, medium and system
CN115394448B (en) Modeling method, model and equipment of coronary heart disease motion reactivity prediction model
CN116705310A (en) Data set construction method, device, equipment and medium for perioperative risk assessment
CN116864142A (en) Epidemic trend prediction method and system
Kalatzis et al. Interactive dimensionality reduction for improving patient adherence in remote health monitoring
CN112382382B (en) Cost-sensitive integrated learning classification method and system
CN113921136A (en) System for fusing multi-source data to intelligently evaluate and predict chronic disease risk
Heitz et al. WRSE-a non-parametric weighted-resolution ensemble for predicting individual survival distributions in the ICU
CN112259231A (en) High-risk gastrointestinal stromal tumor patient postoperative recurrence risk assessment method and system
CN109801711B (en) Juvenile body composition prediction method based on PSO algorithm

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant