WO2023033275A1 - Method and system for generating a personalized biological age prediction model - Google Patents
Method and system for generating a personalized biological age prediction model
- Publication number
- WO2023033275A1 (application PCT/KR2022/002749)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- age
- probability
- oagm
- group
- training data
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/30—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N5/00—Computing arrangements using knowledge-based models
- G06N5/02—Knowledge representation; Symbolic representation
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
Definitions
- The present invention relates to a method for generating a personalized biological age prediction model, and a system therefor. Based on health examination data, the method generates a model that predicts an individual's biological age by obtaining the excess age relative to the birth age at each age.
- Birth age is the difference between the current year and the year of birth; regardless of current health status, everyone born in the same year necessarily has the same birth age.
- Biological age, unlike birth age, quantifies the aspects that vary with the overall condition of the body; that is, it is a numerical expression of the body's degree of health and aging.
- The biomarkers used to measure biological age are physical, physiological, and biochemical.
- Biomarkers commonly used to measure biological age include body mass index (BMI), blood pressure (systolic blood pressure, diastolic blood pressure), waist circumference, lung capacity, muscle mass, albumin, cholesterol level, etc.
- BMI body mass index
- MLR multiple linear regression analysis
- PCA principal component analysis
- Levine and Crimmins conducted a study predicting 10-year mortality using biological age.
- Brown and McDaid investigated the effects on adult mortality of factors such as birth age, education level, gender, income, marital status, occupation, race, religion, smoking, drinking, activity level, and obesity.
- R² denotes the coefficient of determination.
- The points shown in FIG. 3 represent each individual's measured coordinates X (checkup value) and Y (age). As the checkup value increases, the birth age tends to increase. Expressed as a linear regression model, this means that the larger the checkup value, the greater the estimated age.
- The multiple linear regression (MLR) model can be expressed as in Equation 1 below.
- Equation 1 expresses the linear influence of the independent variables on birth age, with birth age as the dependent variable (Y) and three variables (BMI, SBP, and HDL) as the independent variables.
- a1, a2, and a3 are regression coefficients that indicate the influence of BMI, SBP, and HDL, respectively, on birth age.
- The Y calculated through Equation 1 is the value obtained when BMI, SBP, and HDL measurements are input, and the key idea of the MLR model is to treat this value as the biological age.
- Such a multiple linear regression model has the following problems.
- BA biological age
- CA birth age
- FIG. 4 is a graph showing the relationship between birth age (X) and biological age (Y), and shows an example of overestimation and underestimation by a multiple linear regression model.
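The MLR approach and the bias illustrated in FIG. 4 can be sketched as follows. This is a minimal illustration, not the method of any cited study: the data are synthetic, the intercept `a0` is an assumption (the equation image is not preserved in this text), and `solve` and `fit_ols` are hypothetical helper names. The sketch fits birth age on BMI, SBP, and HDL by ordinary least squares, treats the fitted value as biological age, and compares it with birth age in the youngest and oldest groups.

```python
import random

def solve(A, b):
    """Solve A x = b by Gauss-Jordan elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * y for x, y in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

def fit_ols(X, y):
    """Least-squares coefficients [a0, a1, ...] for y ~ a0 + a1*x1 + ..."""
    rows = [[1.0] + list(x) for x in X]
    k = len(rows[0])
    XtX = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    Xty = [sum(r[i] * t for r, t in zip(rows, y)) for i in range(k)]
    return solve(XtX, Xty)

random.seed(0)
ages = [random.uniform(26, 75) for _ in range(2000)]
# Synthetic checkup values that drift with age, plus individual variation.
X = [(22 + 0.05 * a + random.gauss(0, 3),     # BMI
      100 + 0.5 * a + random.gauss(0, 12),    # SBP
      65 - 0.15 * a + random.gauss(0, 10))    # HDL
     for a in ages]

coef = fit_ols(X, ages)  # [a0, a1, a2, a3]
# MLR "biological age": the fitted value of the regression.
ba = [coef[0] + sum(c * v for c, v in zip(coef[1:], x)) for x in X]

young = [b - a for a, b in zip(ages, ba) if a < 35]
old = [b - a for a, b in zip(ages, ba) if a > 65]
mean_young = sum(young) / len(young)   # positive: overestimated
mean_old = sum(old) / len(old)         # negative: underestimated
print(round(mean_young, 1), round(mean_old, 1))
```

Because least-squares fitted values are pulled toward the mean age, the mean difference is positive for the young group and negative for the old group, which is the pattern FIG. 4 describes.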
- birth age is not a health examination item, but is dependent on calendar time.
- When principal component analysis (PCA) is performed on five variables (SBP, DBP, HDL, LDL, and TG), two independent factors, a "blood pressure factor" and a "cholesterol factor", can be extracted.
- PCA is applied to a number of health examination variables (BMI, WST, SBP, DBP, AST, ALT, GGTP, HDL, LDL, TG, vital capacity, etc.) to extract "one factor" common to these variables.
- The core of the PCA biological age prediction model is to regard the "one factor" extracted by PCA as the "biological age" that represents a person's actual aging state.
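Extracting a single common factor from several checkup variables can be sketched as below. This is an illustration under stated assumptions: the data are synthetic (five variables driven by one hidden factor), the variables are standardized before analysis, and the first principal component is found by power iteration on the correlation matrix. `first_principal_component` and `corr` are hypothetical helper names, not functions from the source.

```python
import random

def first_principal_component(rows, iters=200):
    """Return (loadings, scores) of the first PC of the standardized columns."""
    n, k = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(k)]
    sds = [(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1)) ** 0.5
           for j in range(k)]
    Z = [[(r[j] - means[j]) / sds[j] for j in range(k)] for r in rows]
    # Correlation matrix of the standardized variables.
    R = [[sum(z[i] * z[j] for z in Z) / (n - 1) for j in range(k)]
         for i in range(k)]
    # Power iteration converges to the dominant eigenvector of R.
    v = [1.0] * k
    for _ in range(iters):
        w = [sum(R[i][j] * v[j] for j in range(k)) for i in range(k)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    scores = [sum(z[j] * v[j] for j in range(k)) for z in Z]
    return v, scores

def corr(a, b):
    """Pearson correlation of two equal-length sequences."""
    ma, mb = sum(a) / len(a), sum(b) / len(b)
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

random.seed(0)
# One hidden "aging" factor drives five synthetic checkup variables.
latent = [random.gauss(0, 1) for _ in range(500)]
rows = [[0.8 * t + 0.6 * random.gauss(0, 1) for _ in range(5)] for t in latent]

loadings, scores = first_principal_component(rows)
r = corr(scores, latent)
print(round(r, 2))
```

In this synthetic setting the extracted factor tracks the hidden factor closely; with real checkup data there is no such ground truth, which is why the PCA model's interpretation of the factor as biological age is an assumption of that model.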
- The PCA-based biological age prediction model does not use birth age (CA) as a dependent variable. Instead, the extracted factor with the greatest influence is converted into age units (e.g., 1 year, 2 years), and birth age (CA) is entered into the BA prediction model as an independent variable in order to correct the bias in predicting biological age (BA).
- Rearranged, the PCA model can be expressed as Equation 2 below.
- BA is the biological age.
- X1 is the single principal component factor extracted through PCA.
- CA is the birth age.
- F is a conversion function that takes X1 as its input variable.
- G is a conversion function that takes CA as its input variable.
- the biological age means a numerical value calculated by multiplying the PCA principal component factors and the birth age by weights, respectively, and then adding them together.
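The equation image itself is not preserved in this text. Based only on the definitions of F, G, X1, and CA above, and on the statement that each term is weighted and the results are added, Equation 2 plausibly has the following form. The weight symbols w_1 and w_2 are placeholder names introduced here, and any constant term the original may contain is unknown.

```latex
\[
\mathrm{BA} = F(X_1) + G(\mathrm{CA}),
\qquad F(X_1) = w_1 X_1, \quad G(\mathrm{CA}) = w_2\,\mathrm{CA}
\]
```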
- Another reason for including "birth age" as a parameter in the biological age model is that, without it, the PCA model shows the same bias as the MLR model: biological age is overestimated in the younger group and underestimated in the older group.
- Korean Patent Publication No. 0126229 of 2014, "Method and system for generating a biological age calculation model and biological age calculation method and system therefor," discloses a method for calculating biological age using the PCA biological age prediction model.
- The present invention is intended to provide a method and system for generating a personalized biological age prediction model in which a prediction model is constructed for each gender and birth age, so that biological age can be predicted with the model for each age group.
- The present invention is also intended to provide a personalized bio-age prediction model and service system that express the individual's aging state as a biological age probability spectrum/distribution rather than as a single biological age value (e.g., 55 years old), so that biological age information can be interpreted more objectively and clearly.
- The present invention does not predict biological age directly from examination data; rather, its technical feature is to calculate, from the examination data, the "excess aging factor (i.e., ⁇)" that birth age cannot explain.
- The present invention intends to develop a plurality of biological age measurement models that operate differently according to gender and birth age.
- The present invention aims to predict biological age with a statistical model that considers how an individual's measured checkup values differ from values representing people of the same birth age (e.g., average body mass index, average blood pressure, etc.).
- An age interval setting process for setting an age interval (x to y) to be used as training data to generate a binary logistic regression model
- A binary logistic regression model generation process in which each age in the age interval set in the age interval setting process is treated as one unit, the training data are divided for each age unit into two groups, an under-age group (UAGm) and an over-age group (OAGm), and a binary logistic regression model (Mx to My) is generated for each age unit;
- ROC curve: Receiver Operating Characteristic curve
- The method is characterized in that it comprises a biological age calculation process of calculating the biological age by adding each individual's excess age, obtained through the excess age calculation process, to the birth age.
- The training data in the binary logistic regression model generation process are selected according to examination item information, and the method further comprises an examination item information setting process for searching, adding, and deleting the examination item information used as training data.
- A condition information setting process for setting male/female condition information for the training data in the binary logistic regression model generation process may be further included.
- Examination data collection means for collecting health examination data provided from the health examination system and storing and managing them in a data storage means
- Training data setting means for determining valid training data from the examination data provided from the examination data collection means according to the set training data reference age range (x to y) and examination item information;
- Binary logistic regression model generation means for generating a binary logistic regression model (Mx-My) for each age unit within an age interval (x-y) set for the training data set by the training data setting means;
- Age prediction probability calculation means for calculating a probability (Pm) of being predicted as an over-age group for each individual in the training data according to the binary logistic regression model generated by the binary logistic regression model generation means;
- Cutoff extraction means for extracting a cutoff (Cm) through ROC curve analysis, with the under-age group (UAGm) and over-age group (OAGm) set as the binary response variable and the probability (Pm) of being predicted as the over-age group (OAGm) set as the predictor variable;
- Age prediction probability correction means for correcting the probability (Pm) of being predicted as the over-age group (OAGm), calculated by the age prediction probability calculation means, into each individual's excess probability (Dm) of being predicted as the over-age group (OAGm) by subtracting the cutoff (Cm) from it (Pm − Cm);
- Excess age calculation means for calculating each individual's excess age by obtaining a weighted average (⁇i) of the excess probabilities (Dm) of being predicted as the over-age group (OAGm), obtained through the age prediction probability correction means;
- Biological age calculation means for calculating biological age from birth age using the excess age of each individual obtained through the excess age calculation means
- It is characterized in that it is configured to include a data storage means for storing and managing health examination data collected from the examination data collection means and training data set through the training data setting means.
- It is characterized in that it is configured to further include a user setting means for providing a process so that the user can inquire and set the age section and checkup item information of the training data setting means.
- the training data setting means further includes a user setting means for providing a process so that the user can set condition information for determining training data, and the condition information is male/female gender information.
- The training data are characterized in that they consist of health insurance checkup item data, including physical examination indicators such as body mass index, waist circumference, systolic blood pressure, and diastolic blood pressure, and blood test indicators such as three liver values (AST, ALT, γ-GTP), creatinine, three cholesterol values (HDL, LDL, TG), fasting blood glucose, and hemoglobin.
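The roles of the training data setting means and the user setting means (age interval, checkup items, and sex condition) can be sketched as a small settings object and a filter. This is only an illustration: `TrainingDataSettings`, `select_training_data`, the record layout, and the rule that records with a missing item are not valid are all assumptions, not details given in the source.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass
class TrainingDataSettings:
    age_min: int = 26                 # x in the age interval (x to y)
    age_max: int = 75                 # y in the age interval (x to y)
    sex: Optional[str] = None         # "M", "F", or None for no sex condition
    items: List[str] = field(default_factory=lambda: ["BMI", "SBP", "DBP"])

def select_training_data(records, settings):
    """Keep records inside the age interval, matching the sex condition,
    and carrying a value for every configured checkup item."""
    selected = []
    for rec in records:
        if not settings.age_min <= rec["age"] <= settings.age_max:
            continue
        if settings.sex is not None and rec["sex"] != settings.sex:
            continue
        if any(rec.get(item) is None for item in settings.items):
            continue
        selected.append(rec)
    return selected

records = [
    {"age": 45, "sex": "F", "BMI": 23.1, "SBP": 118, "DBP": 76},
    {"age": 80, "sex": "F", "BMI": 24.0, "SBP": 130, "DBP": 82},  # outside 26 to 75
    {"age": 52, "sex": "M", "BMI": 26.4, "SBP": 135, "DBP": 88},  # excluded by sex
    {"age": 33, "sex": "F", "BMI": None, "SBP": 110, "DBP": 70},  # missing item
]
valid = select_training_data(records, TrainingDataSettings(sex="F"))
print(len(valid))
```

Keeping the settings in one object mirrors the description that a user can inquire about and change the age interval, item list, and sex condition without touching the modelling steps.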
- The present invention develops a biological age prediction model by utilizing the high-quality, large-scale health examination data already accumulated by the National Health Insurance Corporation, so development cost and time can be reduced.
- Using examination data, the present invention calculates each individual's excess age from that individual's values relative to others of the same sex and age group, and applies each age as weight information.
- FIG. 1 is a diagram showing an example of data distribution showing the correlation between birth age and systolic blood pressure.
- FIG. 2 is a diagram showing an example of data distribution showing the correlation between birth age and hemoglobin.
- FIG. 3 is a diagram showing a linear regression line in a multiple linear regression analysis model (MLR).
- FIG. 4 is a graph showing the relationship between birth age (X) and biological age (Y).
- PCA Principal Component Analysis
- FIG. 6 is a flowchart showing the process of the method for generating a personalized bio-age prediction model of the present invention.
- FIG. 7 is a diagram showing a process of generating a binary logistic regression model in the present invention.
- Pm probability values
- FIG. 9 is a chart showing cutoff values extracted through a cutoff extraction process in the present invention.
- FIG. 10 is a chart showing the excess probability (Dm) of being predicted as the over-age group (OAGm), obtained through an age prediction probability correction process in the present invention.
- FIG. 11 is a diagram showing an example of an over-age profile for each individual in the present invention.
- FIG. 12 is a flowchart showing an embodiment of a process of generating a model for predicting biological age in the present invention.
- FIG. 13 is a block diagram showing the configuration of the personalized bio-age model generation system of the present invention.
- the method for generating a personalized bio-age prediction model of the present invention has a technical feature in that it calculates an "excess aging factor ( ⁇ )" that cannot be explained by birth age through examination data, and predicts the biological age using this.
- the process of generating a personalized bio-age model according to the present invention is performed as follows.
- The method comprises an age interval setting process for setting an age interval (x to y) to be used as training data to generate binary logistic regression models, and a binary logistic regression model generation process in which each age in the set interval is treated as one unit, the training data are divided for each age unit into two groups, an under-age group (UAGm) and an over-age group (OAGm), and a binary logistic regression model (Mx to My) is generated for each age unit;
- UAGm under-age group
- OAGm over-age group
- A cutoff extraction process for extracting a cutoff (Cm) through ROC curve analysis, with the under-age group (UAGm) and over-age group (OAGm) set as the binary response variable and the probability (Pm) of being predicted as the over-age group (OAGm) set as the predictor variable.
- the biological age prediction model of the present invention can be defined as multivariable binary logistic regression (MBLR), and its characteristics can be simplified as follows.
- MBLR multivariable binary logistic regression
- MBLR Biological Age Prediction Model
- BA = birth age (CA) + ⁇
- f(BMI, SBP, …) represents an excess aging factor calculation function based on a binary logistic regression model that takes health checkup values as input variables.
- The technical feature is that the excess age (⁇i) relative to the birth age (CA) can be obtained.
- As shown in FIG. 6, an age interval (x to y) used to obtain the binary logistic regression models is first set.
- An embodiment of the present invention sets 26 years old (x) to 75 years old (y) as subjects of health insurance checkup data.
- 26 and 75 years old are values used because of the characteristics of health insurance data, and in the case of non-health insurance data, x (26 years old) and y (75 years old) may be changed.
- The binary logistic regression model generation process divides people into two groups by birth age and generates a binary logistic regression model that predicts membership in one of them, the over-age group (OAGm); the model is used to obtain the probability (Pm) of being regarded as over-age (OAGm).
- FIG. 7 is a diagram illustrating a process of generating a binary logistic regression model.
- Each unit age is divided into a group under the corresponding age (UAGm) and a group at or above the corresponding age (OAGm); for each unit, a model that predicts one of the two groups is trained, so that a total of 50 binary logistic regression models are created.
- For example, for age 26, a group under the age of 26 and a group aged 26 or older are set and coded 0 and 1, respectively, and the examination item data set as training data are used to predict the group.
- That is, based on the specific values of each examination item, people under the age of 26 are classified as '0' and people aged 26 or older as '1', and a binary logistic regression model (M26) that predicts being 26 or older is created.
- The binary logistic regression model (M26) is created by dividing the examination data for each health insurance checkup item, including physical examination indicators such as body mass index, waist circumference, systolic blood pressure, and diastolic blood pressure, and blood test indicators such as three liver values (AST, ALT, γ-GTP), creatinine, three cholesterol values (HDL, LDL, TG), fasting blood glucose, and hemoglobin, into people under the age of 26 and people aged 26 or older.
- M26 binary logistic regression model
- The binary logistic regression model is constructed with membership in the under-age group (UAGm) or the over-age group (OAGm) as the response variable (Y-axis) and the training data (examination data) as the predictor variables (X-axis).
- A checkup item information setting process may be further included so that health insurance checkup items to be used as training data may be searched and added or deleted as checkup item information.
- A condition information setting process for setting condition information for the training data may be further included, and the condition information may consist of male/female gender information.
- Biological age prediction models for males and females can thus be configured separately.
- This process is performed for each age from 26 to 75 to generate a total of 50 binary logistic regression models (M26 to M75).
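The model generation loop described above can be sketched as follows. This is a minimal illustration under several assumptions: the data are synthetic with two standardized checkup features, ages are drawn from 20 to 80 so that both groups are non-empty for every cut age, and the logistic regression is fitted with a plain gradient-descent routine written here (`fit_logistic` and `predict_proba` are hypothetical helpers, not the patent's implementation).

```python
import math
import random

def sigmoid(z):
    """Numerically safe logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)

def predict_proba(w, x):
    """Probability of the over-age group given weights [b0, b1, ...]."""
    return sigmoid(w[0] + sum(wj * xj for wj, xj in zip(w[1:], x)))

def fit_logistic(X, y, lr=0.5, epochs=50):
    """Fit weights by batch gradient descent on the log-loss."""
    n, k = len(X), len(X[0])
    w = [0.0] * (k + 1)
    for _ in range(epochs):
        grad = [0.0] * (k + 1)
        for x, t in zip(X, y):
            err = predict_proba(w, x) - t
            grad[0] += err
            for j, xj in enumerate(x):
                grad[j + 1] += err * xj
        w = [wj - lr * g / n for wj, g in zip(w, grad)]
    return w

random.seed(1)
people = []
for _ in range(150):
    age = random.randint(20, 80)
    # Two synthetic, roughly standardized checkup values that rise with age.
    x = [0.05 * (age - 50) + random.gauss(0, 1) for _ in range(2)]
    people.append((age, x))
X = [x for _, x in people]

models = {}
for m in range(26, 76):                       # one model per age unit: M26 to M75
    y = [1 if age >= m else 0 for age, _ in people]   # UAGm = 0, OAGm = 1
    models[m] = fit_logistic(X, y)

print(len(models))
```

Each entry `models[m]` plays the role of Mm; calling `predict_proba(models[m], x)` for an individual's checkup values gives that individual's Pm.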
- the age prediction probability calculation process is a process of calculating the probability (Pm) of being predicted as an over-age group (OAGm) for each individual according to the binary logistic regression models (M26 to M75) generated as described above.
- Equation 3 shows the age prediction probability calculation process according to the binary logistic regression model.
- The probability value "P45" is obtained using the binary logistic regression model (M45) and is the probability of being predicted to be 45 years of age or older.
- For example, the probability (P45) of being predicted to be 45 years old or older is 0.655, and the probability (P75) of being predicted to be 75 years old or older is 0.211.
- the age prediction probability calculation process calculates 50 of these probability values for all ages (P26 to P75) for all people (sample) to generate a chart as shown in FIG. 8 above.
- the probability (Pm) value is obtained for all age units for each individual.
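The image of Equation 3 is not preserved in this text. For a binary logistic regression model, the predicted probability has the standard form below; the coefficient symbols \beta_{m,k} and the number of checkup items K are notation introduced here, not taken from the source.

```latex
\[
P_m = \frac{1}{1 + \exp\!\Bigl(-\bigl(\beta_{m,0} + \sum_{k=1}^{K} \beta_{m,k}\, x_k\bigr)\Bigr)},
\qquad m = 26, \dots, 75
\]
```

Here x_1, ..., x_K are one individual's checkup values and the coefficients belong to model Mm, so each individual receives 50 values P26 to P75.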
- The cutoff extraction process performs ROC (Receiver Operating Characteristic) curve and AUC (Area Under the Curve) analysis on the probability values (Pm) obtained from the 50 models (M26 to M75) for all people aged 26 to 75.
- For each model, the under-age group (UAGm) and the over-age group (OAGm) are set as the binary response variable, the probability (Pm) of being predicted as the over-age group (OAGm) is set as the predictor variable, and ROC curve analysis is performed to extract a cutoff (Cm).
- The cutoff (Cm) is extracted at the point that maximizes Youden's J statistic, that is, the cutoff at which the sum of sensitivity and specificity is maximized.
- FIG. 9 is a chart showing the cutoff values extracted through the cutoff extraction process.
- C45 is the cutoff value obtained from model M45; a person whose probability value is calculated as 0.547 or higher is predicted to belong to the group aged 45 or older.
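Cutoff extraction by Youden's J statistic (J = sensitivity + specificity − 1) can be sketched as below. The probabilities and labels are made-up values for illustration, `youden_cutoff` is a hypothetical helper name, and the tie-breaking rule (keep the lowest cutoff that reaches the maximum J) is a choice made here, not something the source specifies.

```python
def youden_cutoff(probs, labels):
    """Return (cutoff, J) maximizing J = sensitivity + specificity - 1.

    A person is predicted as over-age when prob >= cutoff.
    Ties are resolved in favor of the lowest cutoff."""
    pos = sum(labels)
    neg = len(labels) - pos
    best_c, best_j = None, -1.0
    for c in sorted(set(probs)):
        tp = sum(1 for p, t in zip(probs, labels) if t == 1 and p >= c)
        tn = sum(1 for p, t in zip(probs, labels) if t == 0 and p < c)
        j = tp / pos + tn / neg - 1.0
        if j > best_j:
            best_c, best_j = c, j
    return best_c, best_j

# Illustrative Pm values for one model and the true group of each person
# (0 = under-age group, 1 = over-age group).
probs = [0.10, 0.20, 0.35, 0.40, 0.55, 0.60, 0.80, 0.90]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

cutoff, j = youden_cutoff(probs, labels)
print(cutoff, j)  # 0.4 0.75
```

Running this once per model on that model's Pm values would give the 50 cutoffs C26 to C75.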
- The age prediction probability correction process subtracts the cutoff (Cm) obtained through the cutoff extraction process from the probability (Pm) of being predicted as the over-age group (OAGm), yielding the excess probability (Dm = Pm − Cm) of being predicted as the over-age group (OAGm).
- FIG. 10 is a chart showing the excess probability (Dm) to be predicted with the over-age group (OAGm) obtained through the age prediction probability correction process.
- The excess age calculation process obtains each individual's excess age, which is needed to obtain the biological age, by calculating the weighted average (⁇i) of the excess probabilities (Dm) of being predicted as the over-age group (OAGm) obtained through the above process.
- Equation 4 shows the process of calculating the weighted average ⁇i of the excess probabilities Dm of being predicted as the over-age group OAGm.
- Each individual's excess age is obtained as the weighted average of the excess probabilities (Dm). If there is an additional weight (Wm) to be applied, the weighted average can be obtained by applying it.
- Equation 5 shows the process of calculating the weighted average ⁇i of the excess probabilities Dm when the additional weight (Wm) is applied.
- the biological age calculation process is a process of calculating the biological age by adding the excess age obtained in the excess age calculation process to the birth age.
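The correction, weighted-average, and addition steps can be sketched together as follows. Equations 4 and 5 are not preserved in this text, so the sketch assumes a plain weighted mean with uniform weights by default; how the original scales this average into years is not recoverable here. Only P45 = 0.655 and C45 = 0.547 come from the description; every other number, and the function names, are hypothetical.

```python
def excess_probabilities(P, C):
    """D_m = P_m - C_m for each age unit m."""
    return {m: P[m] - C[m] for m in P}

def weighted_average(D, W=None):
    """Weighted mean of D_m; uniform weights when W is None (an assumption)."""
    if W is None:
        W = {m: 1.0 for m in D}
    total = sum(W[m] for m in D)
    return sum(W[m] * D[m] for m in D) / total

def biological_age(ca, excess_age):
    """Biological age = birth age (CA) + excess age."""
    return ca + excess_age

# Three age units only, to keep the arithmetic visible.
P = {44: 0.700, 45: 0.655, 46: 0.600}   # P44 and P46 are made-up values
C = {44: 0.600, 45: 0.547, 46: 0.550}   # C44 and C46 are made-up values

D = excess_probabilities(P, C)           # D45 = 0.655 - 0.547 = 0.108
delta = weighted_average(D)              # (0.100 + 0.108 + 0.050) / 3 = 0.086
delta_w = weighted_average(D, {44: 1.0, 45: 2.0, 46: 1.0})  # extra weight on age 45
print(round(D[45], 3), round(delta, 3), round(delta_w, 4))
```

A positive average means the individual's probabilities sit above the cutoffs on balance, that is, the checkup values look older than the birth age suggests; `biological_age` then adds the resulting excess age to CA.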
- the technical feature of the present invention is to generate a model (algorithm) for predicting biological age using health insurance checkup data.
- the biological age can be predicted by obtaining the excess age ( ⁇ i) for the birth age (CA).
- First, the subjects of the health insurance checkup data to be used as training data for obtaining biological age are set.
- Ages 26 to 75 are set as the training data age range (x to y), which is the age interval for obtaining the binary logistic regression models.
- A checkup item information setting process for setting the checkup items to be used as training data as checkup item information may be further included, and a user (manager) may set the checkup items to be used as training data for biological age prediction.
- FIG. 12 is a flowchart showing an embodiment of a process of generating a model for predicting biological age in the present invention. An embodiment of the operation process will be described with reference to FIG. 12.
- UAG26 under-age group
- OAG26 over-age group
- The health checkup data are classified into people under the age of 26 and people aged 26 or older. For each sample (person), the specific values of each health checkup item are checked; samples under the age of 26 are assigned to the under-age group (UAGm) and coded '0', samples aged 26 or older are assigned to the over-age group (OAGm) and coded '1', and a binary logistic regression model (M26) corresponding to age 26 is generated.
- The binary logistic regression model is used to obtain the probability (Pm) of being regarded as over-age (OAGm) between the two groups. As described above, it uses physical examination indicators such as body mass index, waist circumference, systolic blood pressure, and diastolic blood pressure, and blood test indicators such as three liver values (AST, ALT, γ-GTP), creatinine, three cholesterol values (HDL, LDL, TG), fasting blood sugar, and hemoglobin; items can be added or deleted as needed and set as examination item information.
- the probability (P26) of being predicted as an over-age group (OAG26) for each individual is calculated through Equation 3 to obtain an age prediction probability.
- this age prediction probability represents an individual's aging status: it is the probability that the individual is predicted as the over-age group (OAGm).
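- Equation 3 itself is not reproduced in this text. As a hedged sketch, the code below assumes the standard binary logistic form, p = 1 / (1 + exp(-(β0 + Σ βk·Xk))), which is what a binary logistic regression model conventionally evaluates. The function name, the coefficients, and the two input items are hypothetical.

```python
import math


def over_age_probability(intercept, coefficients, features):
    """Standard logistic function: P(Y = OAGm | X) for one individual."""
    linear = intercept + sum(b * x for b, x in zip(coefficients, features))
    return 1.0 / (1.0 + math.exp(-linear))


# Hypothetical coefficients for two checkup items (e.g. BMI and systolic BP).
p26 = over_age_probability(-10.0, [0.2, 0.05], [24.0, 120.0])
print(round(p26, 3))
```

- A value of p26 close to 1 means that, judged by these checkup items alone, the individual resembles the group aged 26 or older.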
- the cutoff (Cm), which is the reference value for determining the biological age, is extracted through ROC curve analysis by setting the under-age group (UAGm) and the over-age group (OAGm) as the binary response variable and the probability (Pm) of being predicted as the over-age group (OAGm) as the predictor variable. For the probability (P26) of being predicted to be 26 years old or older, a cutoff (C26) value for determining biological age is obtained through this ROC curve analysis.
- the cutoff (C26) is subtracted from the probability (P26) of being predicted as the over-age group (OAG26), i.e. P26-C26, to calculate the excess probability (D26) of being predicted as the over-age group (OAG26).
- the excess probability D26 of being predicted as the over-age group OAG26 is obtained by applying the cutoff C26 to each individual.
- UAGm under-age group
- OAGm over-age group
- just as a binary logistic regression model is created by dividing people into those younger than 26 and those aged 26 or older, the same process creates binary logistic regression models (M27 to M75) for the ages 27, 28, ..., 75.
- the cutoffs (C26 to C75) obtained as described above are values extracted through ROC curve analysis, and mean that the cutoff (Cm) is extracted at the point of maximizing Youden's J statistic.
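- Cutoff extraction by maximizing Youden's J statistic (J = sensitivity + specificity - 1) can be sketched as below. This is an illustration under assumptions: the function name `youden_cutoff`, the sample probabilities, and the convention that a sample is classified as OAGm when its probability is greater than or equal to the candidate cutoff are all hypothetical choices, not taken from the source.

```python
def youden_cutoff(probabilities, labels):
    """Return (cutoff, J) where the cutoff on Pm maximizes Youden's J statistic."""
    positives = sum(labels)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both groups (UAGm and OAGm) must be present")
    best_j, best_cut = -1.0, None
    for cut in sorted(set(probabilities)):
        # Assumed convention: classify as OAGm when probability >= cut.
        tp = sum(1 for p, y in zip(probabilities, labels) if y == 1 and p >= cut)
        fp = sum(1 for p, y in zip(probabilities, labels) if y == 0 and p >= cut)
        j = tp / positives - fp / negatives  # sensitivity - (1 - specificity)
        if j > best_j:
            best_j, best_cut = j, cut
    return best_cut, best_j


# Hypothetical P26 values and group labels (1 = OAG26, 0 = UAG26).
probs = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.9]
labels = [0, 0, 1, 0, 1, 1, 1]
c26, j26 = youden_cutoff(probs, labels)
print(c26, j26)
```

- Scanning only the observed probability values is sufficient here because the ROC curve changes only at those points.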
- the excess probability (Dm) of being predicted as the over-age group (OAGm) is obtained by applying the cutoff (Cm) to the probability (Pm) obtained in the age prediction probability calculation process; as shown in FIG. 10, D26 to D75 are obtained for each individual for ages 26 to 75.
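- The correction Dm = Pm - Cm can be computed for every age model at once. The sketch below is illustrative only: the function name and the three-age example values are hypothetical, and a real profile would cover m = 26 to 75.

```python
def excess_probabilities(p_by_age, cutoff_by_age):
    """Return {m: Pm - Cm} for every age m that has both a probability and a cutoff."""
    return {
        m: p_by_age[m] - cutoff_by_age[m]
        for m in sorted(p_by_age)
        if m in cutoff_by_age
    }


# Hypothetical values for one individual and three age models.
p_i = {26: 0.95, 27: 0.90, 28: 0.70}
c = {26: 0.50, 27: 0.55, 28: 0.60}
d_i = excess_probabilities(p_i, c)
print(d_i)
```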
- the weighted average (Δi) of such an individual's excess age can be obtained through Equation 4 above.
- the weighted average obtained in this way can be applied to the birth age as the excess age of each individual to obtain the biological age.
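- Equations 4 and 5 are not reproduced in this text, so the sketch below is an assumption-laden illustration rather than the patent's formula. It uses a generic weighted mean, Σ w_m·Dm / Σ w_m, with equal weights by default; the claims describe multiplying each Dm by its age m, which could be expressed by passing weights equal to m, but the exact normalization used in the source is not shown here. The function names and sample values are hypothetical. The final step, adding the excess age to the birth age, follows the text.

```python
def weighted_mean_excess(d_by_age, weights=None):
    """Weighted mean of Dm over ages m; equal weights when none are given (assumption)."""
    ages = sorted(d_by_age)
    if weights is None:
        weights = {m: 1.0 for m in ages}
    total_weight = sum(weights[m] for m in ages)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    return sum(weights[m] * d_by_age[m] for m in ages) / total_weight


def biological_age(chronological_age, excess_age):
    """Biological age = birth (chronological) age + individual excess age."""
    return chronological_age + excess_age


# Hypothetical excess probabilities for one individual.
d_i = {26: 0.4, 27: 0.2, 28: -0.3}
delta_i = weighted_mean_excess(d_i)
print(delta_i)
```

- How the weighted mean is scaled into years is determined by the source equations; the value returned by this sketch should not be read as years without that scaling.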
- FIG. 11 is a diagram showing an example of an over-age profile for each individual, with the X-axis set to the training data age targets 26 to 75 and the Y-axis set to the excess probability (Dm) of being predicted as the over-age group (OAGm); it shows the excess probability (Dm) for each age target.
- the present invention obtains average information of information representing the degree of aging of each individual using health insurance checkup data, and accordingly creates a model (algorithm) capable of predicting biological age.
- FIG. 13 shows the configuration of the personalized bio-age model generation system of the present invention as described above.
- Examination data collection means 110 for collecting health examination data provided from the health examination system and storing and managing them in the data storage means 190;
- Training data setting means 120 for determining valid training data from the examination data collected from the examination data collection means 110 according to the set training data reference age interval (x to y) and examination item information;
- Binary logistic regression model generation means 130 for generating a binary logistic regression model (Mx-My) for each age unit within the age interval (x-y) set for the training data set by the training data setting means 120;
- Age prediction probability calculation means 140 for calculating, for each individual in the training data, the probability (Pm) of being predicted as the over-age group (OAGm) according to the binary logistic regression models generated by the binary logistic regression model generation means 130;
- Cutoff extraction means 150 for setting the under-age group (UAGm) and over-age group (OAGm) as the binary response variable, setting the probability (Pm) of being predicted as the over-age group (OAGm) as the predictor variable, and extracting a cutoff (Cm) through ROC curve analysis;
- Age prediction probability correction means 160 for applying the cutoff (Cm) to the probability (Pm) calculated by the age prediction probability calculation means 140, as Pm-Cm, to calculate each individual's excess probability (Dm) of being predicted as the over-age group (OAGm), thereby correcting the probability (Pm);
- Excess age calculation means 170 for calculating each individual's excess age (individual's excess aging) by obtaining the weighted average (Δi) of the excess probabilities (Dm) obtained through the age prediction probability correction means 160;
- Biological age calculation means 180 for calculating biological age from birth age using the excess age of each individual obtained through the excess age calculation means 170;
- It is configured to include a data storage means 190 for storing and managing health checkup data collected from the checkup data collection means 110 and training data set through the training data setting means 120 .
- the technical feature of the personalized bio-age prediction system of the present invention is that it sets training data from health check-up data provided from the health check-up system and extracts individual over-age information therefrom to predict the biological age.
- It consists of a biological age prediction model generation system for generating a personalized biological age model by receiving health examination data from the health examination system,
- the examination data collection means 110 is a means for collecting health examination data provided from the health examination system, and is a means for storing and managing the collected health examination data in the data storage means 190.
- the training data setting means 120 is a means for setting training data for generating a biological age prediction model: it determines valid training data for the binary logistic regression model generating means from the checkup data stored in the data storage means 190, according to the set training data reference age interval (x-y) and checkup item information.
- the binary logistic regression model generating means 130 is a means for generating a binary logistic regression model (Mx to My) for each age unit within the age interval set for the training data set by the training data setting means 120,
- each age unit is set as one unit, and the training data for each age unit is divided into two groups, an under-age group (UAGm) and an over-age group (OAGm). With these two groups as the response variable and the training data as predictor variables, a binary logistic regression model (Mx~My) is generated for each age unit.
- the age prediction probability calculation means 140 is a means for calculating the probability (Pm) of being predicted as the over-age group (OAGm) for each individual according to the 50 binary logistic regression models generated by the binary logistic regression model generation means 130.
- the cutoff extracting means 150 is a means for extracting a cutoff (Cm) for correcting the probability (Pm) calculated through the age prediction probability calculation means 140. The under-age group (UAGm) and over-age group (OAGm) are set as the binary response variable, the probability (Pm) of being predicted as the over-age group (OAGm) is set as the predictor variable, and the cutoff (Cm) is extracted through ROC curve analysis.
- the age prediction probability correcting means 160 is a means for correcting the probability (Pm) calculated through the age prediction probability calculation means 140. By applying the cutoff (Cm) to the predicted probability (Pm), as Pm-Cm, it calculates each individual's excess probability (Dm) of being predicted as the over-age group (OAGm).
- the excess age calculation means 170 is a means for obtaining the excess age of each individual in order to obtain the biological age. It obtains the weighted average (Δi) of the excess probabilities (Dm) obtained through the age prediction probability correction means 160 and uses it as the excess age of each individual.
- the biological age calculator 180 is a means for calculating the biological age from the birth age using the individual excess age obtained through the excess age calculator 170.
- the examination data collection means 110 collects examination data provided from the health examination system and stores them in the data storage means 190 .
- the training data setting means 120 sets training data for obtaining a binary logistic regression model from the health examination data stored in the data storage means 190.
- the training data setting unit 120 determines training data for the set age range (x-y) and health examination items.
- An embodiment of the present invention uses health insurance checkup data, and the age range is set to 26 years old (x) to 75 years old (y).
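- Selecting valid training data by age interval and checkup items can be sketched as follows. The record layout, the field names (`age`, `bmi`, `sbp`), and the rule that a record missing any selected item is dropped are assumptions made for illustration; only the 26 to 75 interval comes from the text.

```python
def select_training_data(records, age_min=26, age_max=75, items=("bmi", "sbp")):
    """Keep records within [age_min, age_max] that have a value for every selected item."""
    selected = []
    for record in records:
        if not (age_min <= record["age"] <= age_max):
            continue  # outside the configured age interval (x to y)
        if any(record.get(item) is None for item in items):
            continue  # a selected checkup item is missing
        selected.append(record)
    return selected


# Hypothetical checkup records.
records = [
    {"age": 25, "bmi": 22.0, "sbp": 118},
    {"age": 26, "bmi": 23.5, "sbp": 121},
    {"age": 60, "bmi": None, "sbp": 130},
    {"age": 75, "bmi": 25.1, "sbp": 135},
    {"age": 76, "bmi": 24.0, "sbp": 140},
]
training = select_training_data(records)
print([r["age"] for r in training])
```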
- a user setting means may further be included, which provides a process allowing the user (manager) to view and reset the age interval and checkup item information of the training data setting means 120.
- training data setting means 120 may further include a user setting means for providing a process so that the user can set condition information for determining training data.
- the condition information may be composed of male and female gender information, and by setting male and female gender information, biological age prediction models according to male and female gender may be configured separately.
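- Using sex as condition information amounts to partitioning the training data before model generation, so that each partition gets its own set of models. The sketch below shows only the partitioning; the field name `sex` and its values are hypothetical.

```python
from collections import defaultdict


def split_by_sex(records):
    """Group records by their 'sex' field so each group can be modeled separately."""
    groups = defaultdict(list)
    for record in records:
        groups[record["sex"]].append(record)
    return dict(groups)


# Hypothetical records.
records = [
    {"sex": "F", "age": 31},
    {"sex": "M", "age": 45},
    {"sex": "F", "age": 58},
]
by_sex = split_by_sex(records)
print({sex: len(rows) for sex, rows in by_sex.items()})
```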
- the binary logistic regression model generating means 130 sets 50 age units, one for each year of age within the age interval of the training data setting means 120, divides the training data for each unit into an under-age group (UAGm) and an over-age group (OAGm), and generates a binary logistic regression model for each unit.
- UAGm under-age group
- OAGm over-age group
- a group under the age of 26 UAG26
- a group over the age of 26 OAG26
- M26 binary logistic regression model
- a binary logistic regression model is created by dividing the training data for health insurance checkup items, such as blood test indicators such as hemoglobin, into people under the age of 26 and those over the age of 26.
- This process is performed for each age from 26 to 75 to generate a total of 50 binary logistic regression models (M26 to M75).
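- Generating one model per age threshold can be sketched as a loop over m = 26, ..., 75. To stay self-contained, this sketch fits a one-feature logistic model with plain batch gradient descent; the fitting method, learning rate, epoch count, single synthetic feature, and function names are all assumptions for illustration, whereas a real system would use the full set of checkup items and an established statistics library.

```python
import math


def fit_logistic(xs, ys, lr=0.1, epochs=200):
    """Fit intercept b0 and slope b1 of a one-feature logistic model by gradient descent."""
    b0, b1 = 0.0, 0.0
    n = len(xs)
    for _ in range(epochs):
        g0 = g1 = 0.0
        for x, y in zip(xs, ys):
            p = 1.0 / (1.0 + math.exp(-(b0 + b1 * x)))
            g0 += p - y          # gradient of the log-loss w.r.t. b0
            g1 += (p - y) * x    # gradient of the log-loss w.r.t. b1
        b0 -= lr * g0 / n
        b1 -= lr * g1 / n
    return b0, b1


def fit_models_per_age(ages, feature, age_min=26, age_max=75):
    """Return {m: (b0, b1)}, one model per age threshold m in [age_min, age_max]."""
    models = {}
    for m in range(age_min, age_max + 1):
        labels = [1 if a >= m else 0 for a in ages]  # OAGm = 1, UAGm = 0
        models[m] = fit_logistic(feature, labels)
    return models


# Synthetic data: one standardized feature that rises with age (hypothetical).
ages = list(range(20, 81))
feature = [(a - 50) / 10 for a in ages]
models = fit_models_per_age(ages, feature)
print(len(models))
```

- The training sample deliberately includes ages outside 26 to 75 so that both groups are non-empty for every threshold m.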
- the probability (Pm) of being predicted as an overage group (OAGm) for each individual is calculated according to the binary logistic regression model (M26 to M75) generated as described above.
- the probability (Pm) of being predicted as such an over-age group (OAGm) is information for obtaining the over-age of each individual in order to predict the biological age, and can be obtained through Equation 3 above.
- the probability value (Pm) for each individual can be obtained according to the binary logistic regression model.
- the cutoff extracting unit 150 extracts a cutoff (Cm) for the probability (Pm) of each individual to be predicted as an over-age group (OAGm) through ROC curve analysis.
- the cutoff (Cm) is a reference value for determining the biological age. With the under-age group (UAGm) and the over-age group (OAGm) set as the binary response variable and the probability (Pm) of being predicted as the over-age group (OAGm) set as the predictor variable, ROC curve analysis is performed to obtain a cutoff (Cm) value as shown in FIG. 9.
- the age prediction probability correction unit 160 uses the cutoff value (Cm) obtained from the cutoff extraction unit 150 to correct the probability (Pm) of being predicted as the over-age group (OAGm) obtained from the age prediction probability calculation unit 140.
- this correction is performed by applying the cutoff (Cm) to the probability (Pm) of being predicted as the over-age group (OAGm) obtained through the age prediction probability calculation means 140, as Pm-Cm, to calculate the excess probability (Dm) of being predicted as the over-age group (OAGm).
- the excess age calculation means 170 obtains the weighted average (Δi), through Equation 4, of the excess probabilities (Dm) of being predicted as the over-age group (OAGm), thereby obtaining the excess age of each individual.
- the individual excess age is obtained as the weighted average of the excess probabilities (Dm) of being predicted as the over-age group (OAGm); if there is an additional weight (Wm) to be applied, the weighted average can be obtained by applying it as in Equation 5 above.
- the present invention can provide a more reliable biological age by calculating the age in excess of the birth age from health insurance checkup data and predicting the biological age therefrom.
- the present invention develops a biological age prediction model by utilizing the high-quality, large-scale health examination data accumulated by the National Health Insurance Service, and can be widely used in the medical and statistical analysis industries, where its practical and economic value can be realized.
Landscapes
- Engineering & Computer Science (AREA)
- Health & Medical Sciences (AREA)
- Medical Informatics (AREA)
- Public Health (AREA)
- Data Mining & Analysis (AREA)
- Biomedical Technology (AREA)
- General Health & Medical Sciences (AREA)
- Pathology (AREA)
- Theoretical Computer Science (AREA)
- Epidemiology (AREA)
- Primary Health Care (AREA)
- Databases & Information Systems (AREA)
- Life Sciences & Earth Sciences (AREA)
- Physics & Mathematics (AREA)
- Software Systems (AREA)
- Artificial Intelligence (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- General Engineering & Computer Science (AREA)
- Computing Systems (AREA)
- Evolutionary Computation (AREA)
- Computer Vision & Pattern Recognition (AREA)
- Biophysics (AREA)
- Veterinary Medicine (AREA)
- Molecular Biology (AREA)
- Animal Behavior & Ethology (AREA)
- Heart & Thoracic Surgery (AREA)
- Computational Linguistics (AREA)
- Surgery (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
- Medical Treatment And Welfare Office Work (AREA)
- Physiology (AREA)
- Psychiatry (AREA)
- Signal Processing (AREA)
Claims (18)
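- A method for generating a personalized biological age prediction model, performed in a personalized biological age prediction model generation system for generating a biological age prediction model from health checkup data collected from a health checkup system, the method comprising: an age interval setting process of the training data setting means (120) for setting an age interval (x~y) to be used as training data in order to generate binary logistic regression models; a binary logistic regression model generation process of the binary logistic regression model generation means (130), in which each age within the age interval set in the age interval setting process is treated as one unit, the training data for each age unit is divided into two groups, an under-age group (UAGm) and an over-age group (OAGm), and a binary logistic regression model (Mx~My) is generated for each age unit; an age prediction probability calculation process of the age prediction probability calculation means (140) for calculating, according to the binary logistic regression models, the probability (Pm) that each sampled individual is predicted as the over-age group (OAGm); a cutoff extraction process of the cutoff extraction means (150) for setting the under-age group (UAGm) and the over-age group (OAGm) as a binary response variable, setting the probability (Pm) of being predicted as the over-age group (OAGm) as a predictor variable, and extracting a cutoff (Cm) through ROC curve analysis; an age prediction probability correction process of the age prediction probability correction means (160) for applying the cutoff (Cm) to the probability (Pm) of being predicted as the over-age group (OAGm), as Pm-Cm, to calculate the excess probability (Dm) of being predicted as the over-age group (OAGm); an excess age calculation process of the excess age calculation means (170) for obtaining the individual's excess age (individual's excess aging) by obtaining the weighted average (Δi) of the excess probabilities (Dm) obtained through the age prediction probability correction process; and a biological age calculation process of the biological age calculation means (180) for obtaining the biological age by adding the individual excess age obtained through the excess age calculation process to the birth age.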
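- The method of claim 1, wherein the training data in the binary logistic regression model generation process is determined according to checkup item information, and the checkup item information consists of health insurance checkup item data including physical examination indicators such as body mass index, waist circumference, systolic blood pressure, and diastolic blood pressure, and blood test indicators such as three liver values (AST, ALT, γ-GTP), creatinine, three types of cholesterol (HDL, LDL, TG), fasting blood sugar, and hemoglobin.
- The method of claim 1 or 2, wherein the training data in the binary logistic regression model generation process is determined according to checkup item information, the method further comprising a checkup item information setting process for viewing, adding, and deleting the checkup item information used as training data.
- The method of claim 1, further comprising a condition information setting process for setting condition information for the training data in the binary logistic regression model generation process.
- The method of claim 4, wherein the condition information in the condition information setting process is male/female sex information.
- The method of claim 1, wherein, in the binary logistic regression model generation process, the binary logistic regression models (Mx~My) are generated for each age unit by treating each age within the set age interval as one unit, dividing the training data for each age unit into two groups, an under-age group (UAGm) and an over-age group (OAGm), using the two groups as the response variable, and using the training data as predictor variables.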
- The method of claim 1, wherein, in the age prediction probability calculation process, the probability (Pm) that each sampled individual is predicted as the over-age group (OAGm) according to the binary logistic regression model is calculated by the following equation, where Y: individual's aging status; p(Y = OAGm): probability to be predicted as OAGm; Yi: ith individual's aging status; i = 1, 2, …: sample number; m = 26(x), 27, …, 75(y): chronological age observed in the training data; CA: chronological age; Xk: kth independent variable; βk: regression coefficient of the kth independent variable; p: number of independent variables.
- The method of claim 1, wherein, in the excess age calculation process, the individual excess age is calculated by the following equation, which represents the average of the sum of the values obtained by multiplying each individually calculated Dm (m = 26, …, 75) by the corresponding age (= m), where N: sample number; i = 1, 2, …, N; Δi: weighted mean of (Pim-Cm); Cm: the cutoff (Cm) value obtained through the age prediction probability calculation process (cutoff of Pm to predict individual's aging status from ROC curve analysis).
- The method of claim 1, wherein, in the excess age calculation process, the individual excess age is obtained as a weighted average of the excess probabilities (Dm) of being predicted as the over-age group (OAGm), with an additional weight (Wm) applied, the weighted average being calculated by the following equation, where N: sample number; i = 1, 2, …, N; Δi: weighted mean of (Pim-Cm); Cm: the cutoff (Cm) value obtained through the age prediction probability calculation process (cutoff of Pm to predict individual's aging status from ROC curve analysis); Wm: weight applied for the model to predict CA ≥ m.
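- A system for generating a personalized biological age prediction model, comprising: a checkup data collection means (110) for collecting health checkup data provided from a health checkup system and storing and managing it in a data storage means; a training data setting means (120) for determining valid training data from the checkup data provided by the checkup data collection means (110) according to a set training data reference age interval (x~y) and checkup item information; a binary logistic regression model generation means (130) for generating a binary logistic regression model (Mx~My) for each age unit within the age interval (x~y) set for the training data set by the training data setting means (120); an age prediction probability calculation means (140) for calculating, for each individual in the training data, the probability (Pm) of being predicted as the over-age group (OAGm) according to the binary logistic regression models generated by the binary logistic regression model generation means (130); a cutoff extraction means (150) for setting the under-age group (UAGm) and the over-age group (OAGm) as a binary response variable, setting the probability (Pm) of being predicted as the over-age group (OAGm) as a predictor variable, and extracting a cutoff (Cm) through ROC curve analysis; an age prediction probability correction means (160) for applying the cutoff (Cm) to the probability (Pm) calculated by the age prediction probability calculation means (140), as Pm-Cm, to calculate each individual's excess probability (Dm) of being predicted as the over-age group (OAGm), thereby correcting the probability (Pm) calculated by the age prediction probability calculation means (140); an excess age calculation means (170) for obtaining the individual's excess age (individual's excess aging) by obtaining the weighted average (Δi) of the excess probabilities (Dm) obtained through the age prediction probability correction means (160); a biological age calculation means (180) for calculating the biological age from the birth age using the individual excess age obtained through the excess age calculation means (170); and a data storage means (190) in which the health checkup data collected by the checkup data collection means (110) and the training data set through the training data setting means (120) are stored and managed.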
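- The system of claim 10, further comprising a user setting means that provides a process allowing a user to view and set the age interval and checkup item information of the training data setting means (120).
- The system of claim 10 or 11, further comprising a user setting means that provides a process allowing a user to set condition information for determining training data in the training data setting means (120).
- The system of claim 12, wherein the condition information of the user setting means is male/female sex information.
- The system of claim 10, wherein the binary logistic regression models (Mx~My) in the binary logistic regression model generation means (130) are generated for each age unit by treating each age within the set age interval as one unit, dividing the training data for each age unit into two groups, an under-age group (UAGm) and an over-age group (OAGm), using the two groups as the response variable, and using the training data as predictor variables.
- The system of claim 10 or 11, wherein the checkup item information of the training data setting means (120) consists of health insurance checkup item data including physical examination indicators such as body mass index, waist circumference, systolic blood pressure, and diastolic blood pressure, and blood test indicators such as three liver values (AST, ALT, γ-GTP), creatinine, three types of cholesterol (HDL, LDL, TG), fasting blood sugar, and hemoglobin.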
- The system of claim 10, wherein the age prediction probability calculation means (140) calculates the probability (Pm) that each sampled individual is predicted as the over-age group (OAGm) according to the binary logistic regression model by the following equation, where Y: individual's aging status; p(Y = OAGm): probability to be predicted as OAGm; Yi: ith individual's aging status; i = 1, 2, …: sample number; m = 26(x), 27, …, 75(y): chronological age observed in the training data; CA: chronological age; Xk: kth independent variable; βk: regression coefficient of the kth independent variable; p: number of independent variables.
- The system of claim 10, wherein the excess age calculation means (170) obtains the individual excess age by obtaining the weighted average (Δi) of the probability (Dm) of being predicted as the over-age group (OAGm) through the following equation, where N: sample number; i = 1, 2, …, N; Δi: weighted mean of (Pim-Cm); Cm: the cutoff (Cm) value obtained through the cutoff extraction means (150) (cutoff of Pm to predict individual's aging status from ROC curve analysis).
- The system of claim 10, wherein the excess age calculation means (170) obtains the individual excess age by obtaining the weighted average (Δi) of the probability (Dm) of being predicted as the over-age group (OAGm) through the following equation, where N: sample number; i = 1, 2, …, N; Δi: weighted mean of (Pim-Cm); Cm: the cutoff (Cm) value obtained through the cutoff extraction means (150) (cutoff of Pm to predict individual's aging status from ROC curve analysis); Wm: weight applied for the model to predict CA ≥ m.
Priority Applications (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US18/259,054 US20240047077A1 (en) | 2021-08-28 | 2022-02-24 | Method and system for generating personalized biological age prediction model |
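JP2024513366A JP2024530322A (ja) | 2021-08-28 | 2022-02-24 | Method and system for generating personalized biological age prediction model |
CN202280063597.7A CN117999617A (zh) | 2021-08-28 | 2022-02-24 | Method and system for generating personalized biological age prediction model |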
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020210114310A KR102371440B1 (ko) | 2021-08-28 | 2021-08-28 | Method and system for generating personalized biological age prediction model |
KR10-2021-0114310 | 2021-08-28 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023033275A1 (ko) | 2023-03-09 |
Family
ID=80817388
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2022/002749 WO2023033275A1 (ko) | Method and system for generating personalized biological age prediction model | 2022-02-24 |
Country Status (5)
Country | Link |
---|---|
US (1) | US20240047077A1 (ko) |
JP (1) | JP2024530322A (ko) |
KR (1) | KR102371440B1 (ko) |
CN (1) | CN117999617A (ko) |
WO (1) | WO2023033275A1 (ko) |
Families Citing this family (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20230187076A1 (en) * | 2021-12-03 | 2023-06-15 | MEDIAGE Co.,Ltd | Disease risk prediction method and system based on biological age using medical check-up clinical data independent of dyslipidemia data |
KR20240012704A (ko) | 2022-07-21 | 2024-01-30 | 주식회사 로그미 | Apparatus and method for predicting health age |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101603308B1 (ko) * | 2013-11-20 | 2016-03-14 | 주식회사 바이오에이지 | Method and system for generating a biological age calculation model, and biological age calculation method and system thereof |
KR101669526B1 (ko) * | 2015-03-04 | 2016-10-26 | 주식회사 바이오에이지 | Method for predicting remaining life span using biological age |
KR20190067727A (ko) * | 2017-12-07 | 2019-06-17 | 서울대학교산학협력단 | Method and apparatus for generating a biometric age prediction model |
KR102106428B1 (ko) * | 2018-02-19 | 2020-05-06 | 주식회사 셀바스에이아이 | Method for predicting health age |
KR102189233B1 (ko) * | 2018-05-17 | 2020-12-09 | 재단법인차세대융합기술연구원 | Method, system, and non-transitory computer-readable recording medium for providing life age |
-
2021
- 2021-08-28 KR KR1020210114310A patent/KR102371440B1/ko active IP Right Grant
-
2022
- 2022-02-24 WO PCT/KR2022/002749 patent/WO2023033275A1/ko active Application Filing
- 2022-02-24 CN CN202280063597.7A patent/CN117999617A/zh active Pending
- 2022-02-24 JP JP2024513366A patent/JP2024530322A/ja active Pending
- 2022-02-24 US US18/259,054 patent/US20240047077A1/en active Pending
Also Published As
Publication number | Publication date |
---|---|
CN117999617A (zh) | 2024-05-07 |
JP2024530322A (ja) | 2024-08-16 |
US20240047077A1 (en) | 2024-02-08 |
KR102371440B1 (ko) | 2022-03-07 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 22864784, Country of ref document: EP, Kind code of ref document: A1 |
WWE | Wipo information: entry into national phase | Ref document number: 18259054, Country of ref document: US |
WWE | Wipo information: entry into national phase | Ref document number: 2024513366, Country of ref document: JP |
WWE | Wipo information: entry into national phase | Ref document number: 202280063597.7, Country of ref document: CN |
NENP | Non-entry into the national phase | Ref country code: DE |
122 | Ep: pct application non-entry in european phase | Ref document number: 22864784, Country of ref document: EP, Kind code of ref document: A1 |