WO2022211385A1 - Health care consultation system using distribution of disease prediction values - Google Patents

Publication number
WO2022211385A1
Authority
WO
WIPO (PCT)
Prior art keywords
disease
index
distribution
health
customer
Prior art date
Application number
PCT/KR2022/004222
Inventor
김세연
윤준영
김상수
송승재
Original Assignee
주식회사 라이프시맨틱스
Priority date
Filing date
Publication date
Application filed by 주식회사 라이프시맨틱스
Publication of WO2022211385A1

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/20 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for electronic clinical trials or questionnaires
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G16H50/50 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
    • G16H80/00 ICT specially adapted for facilitating communication between medical practitioners or patients, e.g. for collaborative diagnosis, therapy or health monitoring

Definitions

  • the present invention relates to a health care counseling system using the distribution of disease prediction values, which groups the patients in the sample data by disease onset prediction value, calculates the distribution of each biometric index for each group, and uses the calculated distributions to extract, as management indices, the biometric indices that affect the reduction of the customer's likelihood of disease onset.
  • in particular, the present invention relates to a health care counseling system using the distribution of disease prediction values, which estimates a customer's disease onset prediction value, an important indicator of health status and health deterioration, and, based on the estimated value and on the influence of each bio-index on the prediction of each disease, extracts the health management indices to be managed for each customer.
  • the present invention results from research conducted in 2022 with the support of the AI Precision Medical Solution (Doctor Answer 2.0) Development Project of the Information and Communication Industry Promotion Agency, funded by the Government of the Republic of Korea (Ministry of Science and ICT) (No. S0252-21-1001).
  • a technique has been proposed in which the user can directly check periodically measured blood flow and oxygen levels through a mobile terminal so as to be aware of his or her current health condition, can inquire about that condition to a medical institution, and has the inquiry answered in real time in a chat format [Patent Document 1].
  • a technique has also been proposed for providing elderly people, through a mobile device, with health management information that uses integrated or convergence information from the fields of nursing, food and nutrition, and physical education [Patent Document 2].
  • a technology has also been proposed that allows the user to directly select a service provider for each service item, and that generates and provides data on the user's compliance with an exercise program and a diet program based on the information provided by the service providers [Patent Document 3].
  • the prior art generates and provides a lot of health management information for identifying the user's condition by the variously measured biometrics and optimizing each condition.
  • the health status and the factors of health deterioration differ for each individual. For example, it would be undesirable to give priority to a heart-friendly health care regimen for a user with a low risk of heart disease but a high risk of diabetes. In this case, it is better to focus on providing the user with a health management program suited to diabetes.
  • each variable management index
  • Patent Document 1 Korean Patent Application Laid-Open No. 10-2006-0037123
  • Patent Document 2 Korean Patent Application Laid-Open No. 10-2016-0145244
  • Patent Document 3 Korean Patent Publication No. 10-2017-0131067
  • An object of the present invention is to solve the above-mentioned problems by providing a health care counseling system using the distribution of disease prediction values, which groups the patients of the sample data by disease onset prediction value, calculates the distribution of each biometric index for each group, and uses the calculated distributions to extract, as management indices, the biometric indices that affect the reduction of the customer's likelihood of disease onset.
  • the present invention relates to a health care counseling system using the distribution of disease prediction values, comprising: a sample data collection unit for collecting health information data of past patients as sample data; a health information collection unit for collecting the customer's health information data; a disease prediction unit for predicting a disease using health information data; a risk group classification unit for obtaining, through the disease prediction unit, a disease prediction value of each patient from that patient's data in the sample data, and classifying each patient into one of a plurality of risk groups according to the size of the disease prediction value; an exponential distribution generating unit for generating, for each bio-index, a distribution of the bio-index values of the patients belonging to each risk group; a bio-index evaluation unit for calculating, for each disease, the influence of each bio-index by calculating the degree of difference in the distribution of the bio-index between the risk groups of the disease; and a management index extraction unit for extracting a disease with a risk of onset by calculating the disease onset prediction value of the customer through the disease prediction unit, and extracting it as
  • the present invention provides a health care consultation system using the distribution of disease prediction values, wherein the disease prediction unit predicts a customer's disease using a disease prediction model; upon receiving input values of predetermined input variables, the disease prediction model outputs the probability of onset for each predetermined disease variable; and the disease prediction model is composed of a neural network whose internal variables are learned from training data.
  • the present invention is characterized in that the biometric index is created as an item of the health information data.
  • the present invention provides a health care consultation system using the distribution of disease prediction values, wherein the management index extraction unit determines which risk group the acquired disease prediction value belongs to and, if the disease prediction value belongs to a risk group predetermined as a high-risk group, determines the disease to be a disease at risk.
  • the present invention provides a health care consultation system using the distribution of disease prediction values, wherein the management index extraction unit selects biometric indices with a high influence among the biometric indices of the customer's disease at risk, either by selecting a predetermined number in descending order of influence or by selecting the biometric indices whose influence is greater than or equal to a predetermined threshold; calculates the customer's value for each selected biometric index; determines whether the calculated value is outside the normal quantile of the distribution of that biometric index in the normal risk group of the disease; and, if it is outside the quantile, finally extracts the biometric index as a management index.
  • the present invention is characterized in that, to obtain the difference between distributions, the maximum residual between the cumulative distribution functions of the distributions is calculated, and the difference between the distributions is calculated using the calculated maximum residual.
  • the present invention is characterized in that in the health care consultation system using the distribution of disease prediction values, the health information data of the customer or the patient is composed of demographic information and health checkup data.
  • the demographic information includes gender, age, residential area, insurance subscription type, income decile, disability, height, and weight
  • the health screening data includes waist circumference, systolic blood pressure, diastolic blood pressure, fasting blood sugar, total cholesterol, high-density cholesterol, low-density cholesterol, triglycerides, hemoglobin, urine protein, serum creatinine, serum GOT, serum GPT, and gamma-GTP.
  • the present invention provides a health care consultation system using the distribution of disease prediction values, wherein the system further comprises a result output unit for displaying the distribution of the risk groups for a specific biometric index of a specific disease and outputting the customer's biometric index on that distribution.
  • according to the health care counseling system using the distribution of disease prediction values of the present invention, a disease with a high onset prediction value is selected and the management indices that have a large influence on that disease are extracted through big data, so that the most important and necessary health care indices can be extracted more scientifically and accurately.
  • in addition, according to the health care counseling system using the distribution of disease prediction values of the present invention, the management indices for health management are extracted by calculating the disease onset prediction value from the customer's health examination data, medical records at a medical institution, and the like, so that a more accurate health management program can be presented to the customer without additional biometric measurement.
  • FIGS. 1A and 1B are block diagrams of an entire system for implementing the present invention.
  • FIG. 2 is a block diagram of the configuration of a health care consultation system using the distribution of disease prediction values according to an embodiment of the present invention.
  • FIG. 3 is a table showing input variables of a disease prediction model according to an embodiment of the present invention.
  • FIGS. 4A and 4B are distribution diagrams of BP_HIGH and HMG variables for each hypertension prediction group according to an embodiment of the present invention.
  • FIG. 5 is a graph showing the influence/importance of the bio-index of hypertension by group according to an embodiment of the present invention.
  • FIG. 6 is a table showing the severity of BMI when predicting hypertension according to an embodiment of the present invention.
  • FIGS. 7A and 7B are graphs showing the quantiles of BMI and systolic blood pressure variables of patient A in the normal group according to an embodiment of the present invention.
  • FIGS. 8A and 8B are graphs showing quantiles of BMI and systolic blood pressure variables of customer A in a high-risk group according to an embodiment of the present invention.
  • FIG. 9 is a table illustrating the degree of management necessity according to BMI and systolic blood pressure level according to an embodiment of the present invention.
  • FIG. 10 is a table illustrating the degree of management necessity according to BMI and systolic blood pressure quantile according to an embodiment of the present invention.
  • FIG. 11 is a graph showing the management variable (bio-index of the super-risk group) of the prediction of five major cancers according to an embodiment of the present invention.
  • FIGS. 12A and 12B are graphs illustrating distributions of GAMMA_GTP and TRICLYCERIDE indices (distributions corresponding to 5 major cancer predictions) according to an embodiment of the present invention.
  • the health care consultation system using the distribution of disease prediction values according to the present invention (hereinafter referred to as the consultation system) estimates the customer's (user's) disease onset prediction value and uses the distribution to extract the customer's health management index, and it can be implemented as a program system on the computer terminal 10.
  • the consultation system 30 may be implemented as a program system on the computer terminal 10 such as a PC, a smartphone, or a tablet PC.
  • the consultation system may be configured as a program system or a mobile application (or an application, an app), and may be installed and executed in the computer terminal 10 .
  • the counseling system 30 provides a service of receiving health data, estimating a disease outbreak prediction value, and extracting health care factors by using the hardware or software resources of the computer terminal 10 .
  • the counseling system may be configured and executed as a server-client system composed of a counseling client 30a and a counseling server 30b on the computer terminal 10 .
  • the counseling client 30a and the counseling server 30b may be implemented according to a typical method of configuring a client and a server. That is, the functions of the entire system can be divided between them according to the performance of the client or the amount of communication between the client and the server.
  • it will be described as a consultation system, but it may be implemented in various forms according to the configuration method of the server-client.
  • the counseling server 30b may be a server that provides a health care counseling service on the web
  • the counseling client 30a may be a web browser that accesses the counseling server 30b and uses the corresponding service.
  • consultation server 30b may additionally include a database 40 that stores statistical information of a user's or customer's health checkup results, disease prediction information, and the like.
  • the database 40 includes a sample data DB 41 for storing sample data such as demographic information and health examination results of patients, a risk group DB 42 for storing risk group information for each disease, and an exponential distribution DB 43 for storing the distribution of the biometric index for each risk group.
  • the configuration of the database 40 is only a preferred embodiment, and in developing a specific device, it may be configured in a different structure according to the database construction theory in consideration of the ease and efficiency of access and search.
  • the disease prediction service system 30 includes a sample data collection unit 31 that collects demographic information and health checkup information of past patients as sample data, a health information collection unit 32 that collects the sociodemographic information and health examination results of a customer, a disease prediction unit 33 that predicts the disease of a patient or customer, a risk group classification unit 34 that classifies the patients of the sample data into risk groups, an exponential distribution generating unit 35 for obtaining the distribution of the biometric index for each risk group, a biometric index evaluation unit 36 for evaluating each biometric index by comparing the distributions between risk groups, and a management index extraction unit 37 for extracting the customer's management index. Additionally, it may further include a result output unit 38 for outputting the distribution and the management index of the customer.
  • the sample data collection unit 31 collects demographic information and health checkup information of past patients as sample data.
  • Demographic information is data representing the sociodemographic characteristics of a patient, and consists of age, gender, height, weight, presence or absence of a disability, lifestyle, and the like.
  • the health checkup information is the patient's health checkup data, and is data measured (checked) when performing a health checkup, such as blood pressure, cholesterol level, hemoglobin level, and urine protein level.
  • the onset data is data on the disease onset of the patient, and indicates whether the patient has the disease.
  • sample data is collected by being classified or identified by demographic information. That is, personal information that identifies the patient, such as the patient's name, is excluded, and medical data is collected based on health status information such as age, gender, height, weight, disability, and lifestyle.
  • the sample data uses a sample cohort DB.
  • the sample cohort DB is built from the data of 1 million people. These 1 million subjects were stratified by the gender, age, and residential-area distribution of the citizens, so results derived from this data can be regarded as representative of the whole nation.
  • the health information collection unit 32 collects demographic information and health checkup data of the customer.
  • the health information collection unit 32 receives the customer's demographic information.
  • demographic information of the customer may be input through a questionnaire or survey.
  • the customer's demographic information consists of age, gender, height, weight, presence or absence of a disability, lifestyle, income quintile, past medical history, family medical history, and the like.
  • the questionnaire or survey consists of 26 items.
  • the questionnaire data includes gender, age, region of residence, insurance subscription type, income quintile, disability, type of examination institution, height, weight, personal medical history (stroke, heart disease, high blood pressure, diabetes, dyslipidemia, pulmonary tuberculosis, and other diseases including cancer), family medical history (stroke, heart disease, high blood pressure, diabetes, liver disease, cancer), smoking status, smoking period, smoking amount per day, drinking habits, alcohol consumption per day, exercise amount per week, etc.
  • the health information collection unit 32 collects the customer's health checkup data.
  • the customer's health checkup data consists of data measured during the health checkup, such as systolic blood pressure, diastolic blood pressure, fasting blood sugar, total cholesterol, high-density cholesterol, low-density cholesterol, triglycerides, and urine protein. Screening data usually measured during a health checkup include waist circumference, systolic blood pressure, diastolic blood pressure, fasting blood sugar, total cholesterol, HDL cholesterol, LDL cholesterol, triglycerides, hemoglobin, urine protein, serum creatinine, serum GOT, serum GPT, gamma-GTP, and the like.
  • the customer's health checkup data is directly input by the customer, or the most recent health checkup data is obtained from the Health Insurance Corporation, a medical data institution (Health Insight), etc. through the customer's accredited authentication process.
  • the items collected by the questionnaire before the checkup are usually called questionnaire data.
  • the questionnaire data during the health checkup process cannot be imported. Therefore, the corresponding data can be input through a separate questionnaire.
  • the customer's 'height' and 'weight' are included in the health checkup items, but they are entered through a separate direct questionnaire.
  • the disease prediction unit 33 predicts the disease of the patient or customer using demographic information and health examination data of the patient or customer.
  • the disease prediction unit 33 predicts the customer's disease using the disease prediction model.
  • the disease prediction model receives an input value of a predetermined input variable
  • the disease prediction model outputs an onset probability of each predetermined disease variable.
  • the disease prediction model is composed of a neural network, etc., and internal variables are learned by learning data. And when the disease prediction model is trained, when the values of the input variables are input, the disease prediction model outputs the probability of occurrence of each disease.
  • the disease prediction model is an artificial intelligence neural network trained by machine learning on millions of health checkup results, together with demographic factors and lifestyle data, from thousands to tens of thousands of people selected to represent domestic patients for each disease. The calculated result may change as the user steadily improves health behaviors. FIG. 3 shows examples of the input variables of the disease prediction model, for a case with 44 input variables in total.
  • the output variable consists of the incidence probabilities for 12 diseases (or 12 major diseases).
  • the diseases are breast cancer, five major cancers, cancer integration, cerebrovascular disease, osteoporosis, cataract, hypertension, obesity, diabetes, COPD (chronic obstructive pulmonary disease), joint disease, dyslipidemia, and the like.
  • the output variable is an onset probability of each disease, or an onset prediction value.
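  • as a minimal sketch of the input/output shape of such a model, the following Python code maps 44 input values to 12 onset probabilities with a small feed-forward network. The hidden-layer width, the tanh and sigmoid activations, the independent per-disease outputs, and the random placeholder weights are all assumptions made for illustration; the source only states that the model is a neural network whose internal variables are learned from training data.

```python
import math
import random

N_INPUTS = 44    # number of input variables in the FIG. 3 example
N_DISEASES = 12  # number of disease output variables
N_HIDDEN = 8     # hypothetical hidden-layer width (not from the source)


def sigmoid(x: float) -> float:
    """Squash a real value into (0, 1) so it can be read as a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def init_layer(n_in, n_out, rng):
    """Random placeholder weights; a real model would learn these from training data."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def dense(inputs, weights, biases, activation):
    """One fully connected layer: activation(W x + b)."""
    return [activation(sum(w * x for w, x in zip(row, inputs)) + b)
            for row, b in zip(weights, biases)]


def predict_onset_probabilities(features, hidden_layer, output_layer):
    """Return one onset probability per disease for a single person."""
    hidden = dense(features, *hidden_layer, activation=math.tanh)
    return dense(hidden, *output_layer, activation=sigmoid)


rng = random.Random(0)
hidden_layer = init_layer(N_INPUTS, N_HIDDEN, rng)
output_layer = init_layer(N_HIDDEN, N_DISEASES, rng)
features = [rng.random() for _ in range(N_INPUTS)]  # made-up, already-scaled inputs
probabilities = predict_onset_probabilities(features, hidden_layer, output_layer)
```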
  • the risk group classification unit 34 obtains a disease prediction value of each patient from each patient's data in the sample data, and classifies each patient into one of a plurality of risk groups according to the size of the disease prediction value.
  • the risk group classification unit 34 obtains a disease prediction value of each patient from the data of each patient through the disease prediction unit 33 .
  • Disease prediction values are obtained for each disease. That is, as described above, the disease prediction unit 33 may calculate a disease onset prediction value by inputting the patient's demographic information and health checkup information.
  • the risk group classification unit 34 divides the acquired disease prediction values into quantiles to form a risk group for each quantile, and assigns each patient to the risk group whose quantile contains that patient's disease prediction value.
  • the plurality of risk groups correspond to a plurality of quantiles (intervals) of the overall disease prediction values ordered by size.
  • the risk groups are classified into a low-risk group (0-25%), a medium-risk group (25-50%), a high-risk group (50-75%), and a very high-risk group (75-100%). That is, if the disease prediction values of all patients are sorted by size, the low-risk group is the group of patients whose disease prediction values fall in the lowest 0-25% of the sorted values.
  • the risk group is divided by each disease.
  • one patient may belong to a high-risk group for gastric cancer but a low-risk group for liver cancer.
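  • the quartile grouping described above can be sketched as follows. This is an illustrative sketch, not the patented implementation: the group labels, the function names, the use of the "inclusive" quantile method, and the sample prediction values are assumptions. Cut points are computed separately for each disease, so the same patient can land in different groups for different diseases.

```python
from bisect import bisect_right
from statistics import quantiles

# Group labels in ascending order of predicted risk (hypothetical identifiers).
RISK_GROUPS = ["low", "medium", "high", "very_high"]


def quartile_cut_points(prediction_values):
    """Return the three cut points that split the sample's prediction values into quartiles."""
    return quantiles(prediction_values, n=4, method="inclusive")


def classify_risk_group(value, cut_points):
    """Map one disease prediction value to its risk group using the sample's cut points."""
    return RISK_GROUPS[bisect_right(cut_points, value)]


# Made-up hypertension prediction values for eight sample patients.
sample_predictions = [0.05, 0.10, 0.20, 0.30, 0.40, 0.55, 0.70, 0.90]
cut_points = quartile_cut_points(sample_predictions)
groups = [classify_risk_group(v, cut_points) for v in sample_predictions]
```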
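  • the exponential distribution generating unit 35 generates, for each biometric index, a distribution of the biometric index values of the patients belonging to each risk group (the bio-index distribution; the "exponential distribution" in this unit's name means this index distribution, not the probability distribution of the same name).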
  • the biometric index is composed of items (or variables) of demographic data or health examination data, or a combination of these items.
  • the biometric index may be BMI and blood pressure.
  • BMI Body Mass Index
  • blood pressure is one item of health checkup data.
  • the corresponding blood pressure index may be a biometric index by combining the various blood pressure data and setting it as one blood pressure index.
  • because each biometric index is calculated from items of the demographic data or health checkup data, the biometric index can be calculated for all patients in the sample data.
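  • as an example of deriving a biometric index from existing items, BMI can be computed from the height and weight in the demographic data using the standard definition (weight in kilograms divided by the square of height in metres). The function name and the assumption that height is recorded in centimetres are illustrative only.

```python
def body_mass_index(weight_kg: float, height_cm: float) -> float:
    """Standard BMI: weight (kg) divided by height (m) squared."""
    if height_cm <= 0:
        raise ValueError("height must be positive")
    height_m = height_cm / 100.0  # assumes height is stored in centimetres
    return weight_kg / (height_m ** 2)


bmi = body_mass_index(weight_kg=70.0, height_cm=175.0)
```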
  • the exponential distribution generating unit 35 calculates, for the risk groups of a specific disease, the bio-index values of all patients belonging to each risk group, and generates a distribution from the calculated values. This is referred to as the bio-index distribution (or exponential distribution) of the risk group.
  • An exponential distribution is generated for each risk group. For example, exponential distributions are obtained for the low-risk group, the medium-risk group, the high-risk group, and the very high-risk group. If the risk groups are divided into 4 groups, 4 exponential distributions are generated.
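  • a minimal sketch of this step, assuming each sample patient is a record that already carries a risk-group label for one disease: the values of one biometric index are collected per risk group, giving one empirical distribution per group. The record layout, field names, and sample values are hypothetical.

```python
from collections import defaultdict


def group_index_values(patients, index_name):
    """Collect the values of one biometric index separately for each risk group.

    `patients` is an iterable of dicts with a 'risk_group' key and one key per index
    (a hypothetical record layout). Each group's values are returned sorted, which is
    a convenient form for later quantile or cumulative-distribution calculations.
    """
    grouped = defaultdict(list)
    for patient in patients:
        grouped[patient["risk_group"]].append(patient[index_name])
    return {group: sorted(values) for group, values in grouped.items()}


# Made-up sample patients, labelled with their hypertension risk group.
patients = [
    {"risk_group": "low", "BMI": 22.5},
    {"risk_group": "low", "BMI": 21.0},
    {"risk_group": "very_high", "BMI": 29.4},
    {"risk_group": "very_high", "BMI": 27.8},
]
bmi_by_group = group_index_values(patients, "BMI")
```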
  • the bio-index evaluation unit 36 calculates the degree of difference in the index distribution between risk groups of the disease. That is, the influence of the corresponding bio-index on the disease is evaluated using the degree of distribution difference.
  • the fact that there is a significant difference in the distribution of the bioindex between risk groups means that the bioindex affects the risk of the disease.
  • accordingly, if a patient in the high-risk group manages a bio-index so that its value moves toward the mean of the low-risk group's bio-index distribution, the patient's disease prediction value for that disease can be lowered.
  • the bio-index evaluation unit 36 calculates the influence of the bio-index by using the degree of difference in the distribution.
  • the influence of the biometric index (ex. BMI, blood pressure, etc.) is calculated as follows.
  • the importance (or influence) of the index in predicting the disease is checked according to the degree of distribution difference between the indices.
  • the degree of distribution difference is measured using the residual between the cumulative distribution functions of the distributions. That is, the cumulative distribution function of each group's distribution is derived, and the maximum residual between the cumulative distribution function of each group and that of the normal risk group is calculated (in the four-group example above, the normal risk group is the low-risk group, whose hypertension prediction values fall in the 0-25% range).
  • the residual of the cumulative distribution function is the difference between the values of the cumulative distribution functions. That is, the differences between the cumulative distribution function values are computed at each index value, and the largest of these differences is taken as the maximum residual.
  • for an index whose maximum residual (a value from 0 to 1) between the cumulative distributions of the groups is large, the value of the variable can be considered to be strongly correlated with the disease prediction probability, so the index has a large influence on the disease prediction value.
  • the importance of the biometric index is calculated using the maximum residual value.
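  • the maximum residual described above can be sketched with empirical cumulative distribution functions, as below. The function names and sample values are assumptions. Computed this way, the quantity coincides with the two-sample Kolmogorov-Smirnov statistic, although the source does not use that name.

```python
from bisect import bisect_right


def empirical_cdf(sorted_values, x):
    """Fraction of the values that are less than or equal to x."""
    return bisect_right(sorted_values, x) / len(sorted_values)


def max_cdf_residual(group_values, normal_values):
    """Largest absolute difference between the two empirical CDFs (a value from 0 to 1)."""
    group = sorted(group_values)
    normal = sorted(normal_values)
    # Empirical CDFs only change at observed values, so checking those points is enough.
    points = sorted(set(group) | set(normal))
    return max(abs(empirical_cdf(group, x) - empirical_cdf(normal, x)) for x in points)


def influence_by_group(index_values_by_group, normal_group="low"):
    """Influence of one index for each non-normal risk group, measured against the normal group."""
    normal = index_values_by_group[normal_group]
    return {group: max_cdf_residual(values, normal)
            for group, values in index_values_by_group.items()
            if group != normal_group}


# Made-up systolic blood pressure values for two risk groups.
bp_by_group = {"low": [105, 110, 115, 120], "very_high": [115, 120, 140, 150]}
influence = influence_by_group(bp_by_group)
```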
  • the difference in the distribution of the BP_HIGH (systolic blood pressure) index between the groups divided by the predicted onset of hypertension is clearly revealed.
  • when the maximum residual of each group is calculated from the cumulative distributions shown in the lower figure of FIG. 4A, the maximum residual between the ultra-high-risk group and the low-risk group is 0.776.
  • the maximum residual value increases significantly, indicating that the systolic blood pressure value has a great influence on the prediction of hypertension.
  • HMG hemoglobin
  • the importance of variables in predicting hypertension can be confirmed for each of the medium-risk group, the high-risk group, and the ultra-high-risk group. There is a difference in the ranking of the importance (influence) of the indices between groups, and the absolute importance (influence) of each index decreases as the level of risk decreases.
  • the management index extraction unit 37 extracts a disease with a risk of onset by calculating the customer's disease onset prediction value, and extracts a biometric index with a high influence from the biometric index of the disease as a management index. That is, the corresponding biometric index is selected as the management index according to the size of the influence.
  • the management index extraction unit 37 obtains the disease prediction value of the customer from the customer's data through the disease prediction unit 33 .
  • Disease prediction values are obtained for each disease. That is, the disease prediction unit 33 may take the customer's demographic information and health checkup information as input and calculate the customer's disease onset prediction value.
  • the management index extraction unit 37 determines which risk group the acquired disease prediction value belongs to, and if it belongs to the risk group to be managed, it is determined as a disease to be managed (disease with risk).
  • The risk groups to be managed are determined in advance. For example, the higher risk groups among the four risk groups, such as the high-risk group and the ultra-high-risk group, may be set as the risk groups to be managed.
  • the management index extraction unit 37 selects a biometric index having a high influence among the biometric index of the disease to be managed by the customer. In this case, a predetermined number is selected in the order of the highest influence, or a biometric index that is greater than or equal to the reference value of the predetermined influence is selected.
  • For each selected biometric index, the management index extraction unit 37 calculates the customer's value and determines whether it falls outside the normal quantile, or a predetermined range, of the distribution of that index in the normal risk group of the disease.
  • The normal risk group (or normal group) is the predetermined risk group with the lowest predicted incidence.
  • the low-risk group corresponds to the normal-risk group.
  • the corresponding biometric index is finally extracted as the management index.
  • The management level is determined according to which range (quantile) of the normal risk group's distribution the biometric index falls in.
  • the result output unit 38 outputs the biometric index distribution of the risk group and the customer's biometric index or management index on the screen.
  • the distribution of risk groups is displayed, and the customer's biometric index is output on the distribution.
  • the customer can check his/her condition with the naked eye, thereby accurately grasping his/her health condition, so that he/she can be alert to health care.
  • The customer's quantile is identified within the index distribution of the low-risk group.
  • the severity is confirmed by the measured value (bio-index) of customer A, and the need for management is determined.
  • A range of quantiles is set to determine the severity of the measurement and the need for management. For example, as shown in the table of FIG. 6, the higher the BMI, the higher the risk of hypertension. Therefore, when a specific customer's BMI value is within the top/bottom 25% of the normal group, it is determined to be "severe", and when it is within 10%, it is determined to be "very serious".
  • The quantiles, within the normal-group distribution, of the BMI and systolic blood pressure values of a customer whose predicted probability of developing hypertension is 97% are shown in FIGS. 7A and 7B.
  • the customer's BMI value is 24 and the systolic blood pressure value is 159.
  • FIG. 7A shows the quantile of customer A's BMI in the normal group
  • FIG. 7B shows the quantiles of the systolic blood pressure variable.
  • Customer A's BMI value is in the top 22.6% of the distribution of the normal group. This can be seen as requiring proper management and attention.
  • customer A's systolic blood pressure is in the top 0.01% compared to the distribution of the normal group, which requires intensive management and serious attention.
  • For each index, a range of quantiles is set that determines the severity of the measurement and the need for management. For example, management may be judged necessary when the value is within the top 20%.
  • the quantile check determines the severity and management need of customer A's measurements compared to the group to which customer A belongs.
  • FIGS. 8A and 8B show the quantiles of BMI and systolic blood pressure variables of customer A in the high-risk group.
  • the management standard as shown in FIG. 9 is established.
  • Customer A's BMI value is in the top 62.8% of the risk group to which customer A belongs, which is an average level within that risk group.
  • customer A's blood pressure level was in the top 13.4%, confirming that it was very serious even within the risk group. In other words, customer A's blood pressure level requires intensive management.
  • FIG. 10 shows the bioindices of the prediction of five major cancers in the super-risk group.
  • FIGS. 12A and 12B illustrate distributions of GAMMA_GTP and TRICLYCERIDE indices (distributions corresponding to the prediction of 5 major cancers).
  • the GAMMA_GTP (gamma GTP) value of a customer with a very high risk of 5 major cancers is a very important index for predicting 5 major cancers.
  • customer A's gamma GTP value is in the upper 79% (lowest 21%) level of the distribution of the normal group, so it can be seen as a variable that requires relatively little management.
  • The quantile of the gamma GTP value is also in the top 89%, so it is not serious.
  • the relative distribution and location of the index can be identified, and the importance of the index can be grasped without prior background knowledge.
  • Existing methods of determining index importance were tied to the algorithm used (e.g., the index importance method of a random forest) and were not intuitive: the criteria for determining the importance of each index were difficult to grasp.
  • In addition, existing methods calculate a single importance value per index for all customer groups.
  • The index importance/influence calculation method of the present invention can be used with any model and can be understood intuitively.
  • the importance of the index suitable for the customer can be identified.
  • The present invention selects diseases with a high onset prediction value and extracts, through big data, the management indices that strongly influence those diseases, so that the health management indices most important and necessary for a customer are extracted more scientifically and accurately.
  • The present invention can be applied to health care consultation system technology using the distribution of disease prediction values, which calculates disease onset prediction values from medical records at medical institutions and extracts management factor indices for health management, thereby providing a more accurate customer health management program without additional biometric measurements of the customer.
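As an illustration only, the quantile check and severity grading described in the items above might be sketched in Python as follows. The function names, the "routine" label, and the sample normal-group values are assumptions made for this sketch. The 25% and 10% cut-offs follow the BMI example given for FIG. 6, and only the upper tail of the distribution is checked here.

```python
def top_percent(normal_values, customer_value):
    """Percentage of the normal group whose value is at or above the customer's."""
    at_or_above = sum(1 for v in normal_values if v >= customer_value)
    return 100.0 * at_or_above / len(normal_values)


def management_level(pct, severe=25.0, very_serious=10.0):
    """Map a top-percent position to a level; 'routine' is a hypothetical label."""
    if pct <= very_serious:
        return "very serious"
    if pct <= severe:
        return "severe"
    return "routine"


# Made-up BMI values for a normal (low-risk) group, not data from the patent.
normal_bmi = [18, 19, 20, 21, 21, 22, 23, 24, 26, 28]
position = top_percent(normal_bmi, 24)
print(position, management_level(position))
```

With cut-offs of this kind, a value in the top 22.6% of the normal group would be graded "severe" and a value in the top 0.01% would be graded "very serious", which matches the BMI and systolic blood pressure examples given for customer A.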

Abstract

The present invention relates to a health care consultation system using the distribution of disease prediction values, the system comprising: a sample data collection unit for collecting health information data about past patients as sample data; a health checkup collection unit for collecting health information data about customers; a disease prediction unit for predicting diseases by using the health information data; a risk group classification unit for obtaining the disease prediction value of each patient from that patient's data in the sample data through the disease prediction unit, and classifying each patient into one of a plurality of risk groups according to the size of the disease prediction value; an index distribution generation unit for generating, for each biometric index, a distribution from the biometric index values of the patients belonging to each risk group; a biometric index evaluation unit for calculating the influence of each biometric index on each disease by calculating the degree of difference in biometric index distribution between the risk groups of the corresponding disease; and a management index extraction unit which calculates customer disease onset prediction values through the disease prediction unit to extract diseases with a high risk, and which extracts management indexes from among the biometric indexes of the extracted diseases according to the magnitude of influence of each biometric index. Because diseases with a high onset prediction value are selected and the management indexes having a large influence on each such disease are extracted through big data, the health management indexes that are most important and necessary for customers can be extracted more scientifically and accurately.

Description

Health care consultation system using distribution of disease prediction values
The present invention relates to a health care consultation system using the distribution of disease prediction values, which groups the patients in sample data by their disease onset prediction values, calculates the distribution of each group for each biometric index, and uses the calculated distributions to extract, as management indices, the biometric indices that influence a reduction in the customer's likelihood of developing a disease.
The present invention also relates to a health care consultation system using the distribution of disease prediction values, which estimates a customer's disease onset prediction value, which can be regarded as an important indicator of health deterioration and health status, and, based on the estimated value, extracts the health management indices to be managed for each customer by considering the influence of each biometric index on each disease prediction.
The present invention is the result of research carried out in 2022 with funding from the Government of the Republic of Korea (Ministry of Science and ICT) and with the support of the AI Precision Medical Solution (Doctor Answer 2.0) development project of the National IT Industry Promotion Agency (No. S0252-21-1001).
In general, to receive medical services a patient must visit in person the hospital or public health center that provides them. Recently, however, rapid advances in network technology have diversified Internet-based health management services, making it easier for people to manage their own health. In particular, with the spread of U-Health, which links information and communication technology with health care to enable prevention, diagnosis, treatment, follow-up care, and health care services anytime and anywhere, health management programs for various conditions such as stress, high blood pressure, and diabetes have been developed, and technologies for providing them through the Internet or portable terminals have been proposed.
As one example, a technique has been proposed that lets a user check periodically measured blood flow and oxygen levels directly on a mobile terminal so that the user is aware of his or her current health condition, and that lets the user ask a medical institution about that condition and receive answers in real time in a chat format [Patent Document 1]. A technique has also been proposed for providing health management information to older people, such as the elderly, through mobile devices and the like, using integrated or converged information from the fields of nursing, food and nutrition, and physical education [Patent Document 2]. In addition, a technique has been proposed in which the user directly selects a service provider for each service item, and data on the user's compliance with exercise and diet programs is generated and provided based on information supplied by the service providers [Patent Document 3].
However, the above prior art determines the user's condition from a wide variety of biometric measurements and generates and provides a large amount of health management information for optimizing each condition.
Such prior art inconveniences the user because collecting the user's condition is complicated, for example requiring biosignals to be measured in real time. Moreover, because it proposes health programs covering all of the user's possible diseases, the user is left with too many programs to follow. As a result, it is difficult for the user to carry out all of the programs, and the user may fail to follow the health programs properly.
Meanwhile, from a health management standpoint, health status and the factors behind health deterioration differ from person to person. For example, for a user with a low likelihood of heart disease but a high likelihood of diabetes, it is undesirable to focus on providing a heart-health regimen. In that case, it is better to concentrate on providing a health management program suited to diabetes.
In addition, if the influence of each variable (management index) on each predicted disease can be identified, the variables that should be managed intensively for health care can be identified. It is therefore very important to find the indices or variables that matter most for each user.
(Patent Document 1) Korean Patent Laid-Open Publication No. 10-2006-0037123
(Patent Document 2) Korean Patent Laid-Open Publication No. 10-2016-0145244
(Patent Document 3) Korean Patent Laid-Open Publication No. 10-2017-0131067
An object of the present invention is to solve the problems described above by providing a health care consultation system using the distribution of disease prediction values, which groups the patients in sample data by their disease onset prediction values, calculates the distribution of each group for each biometric index, and uses the calculated distributions to extract, as management indices, the biometric indices that influence a reduction in the customer's likelihood of developing a disease.
Another object of the present invention is to provide a health care consultation system using the distribution of disease prediction values, which estimates a customer's disease onset prediction value, which can be regarded as an important indicator of health deterioration and health status, and, based on the estimated value, extracts the health management indices to be managed for each customer by considering the influence of each biometric index on each disease prediction.
To achieve the above objects, the present invention relates to a health care consultation system using the distribution of disease prediction values, comprising: a sample data collection unit that collects health information data of past patients as sample data; a health checkup collection unit that collects health information data of a customer; a disease prediction unit that predicts diseases using health information data; a risk group classification unit that obtains, through the disease prediction unit, a disease prediction value for each patient from that patient's data in the sample data and classifies each patient into one of a plurality of risk groups according to the magnitude of the disease prediction value; an index distribution generation unit that generates, for each biometric index, a distribution from the biometric index values of the patients belonging to each risk group; a biometric index evaluation unit that, for each disease, calculates the degree of difference in the biometric index distributions between the risk groups of that disease and thereby calculates the influence of each biometric index; and a management index extraction unit that calculates the customer's disease onset prediction values through the disease prediction unit to extract diseases with a risk of onset, and extracts management indices from among the biometric indices of the extracted diseases according to the magnitude of each biometric index's influence.
Further, in the health care consultation system using the distribution of disease prediction values according to the present invention, the disease prediction unit predicts the customer's diseases using a disease prediction model; the disease prediction model receives input values for predetermined input variables and outputs the probability of onset for each predetermined disease variable; and the disease prediction model is a neural network whose internal variables are learned from training data.
Further, in the health care consultation system using the distribution of disease prediction values according to the present invention, the biometric indices are derived from items of the health information data.
Further, in the health care consultation system using the distribution of disease prediction values according to the present invention, the management index extraction unit determines which risk group an obtained disease prediction value belongs to, and determines the disease to be one with a risk of onset if the value belongs to a predetermined risk group with a high onset prediction value.
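The risk group determination in the preceding paragraph can be illustrated with a minimal Python sketch. The four group names follow the low-, medium-, high-, and ultra-high-risk groups mentioned elsewhere in this document; the probability cut-offs, the function names, and the sample prediction values other than the 97% hypertension figure are hypothetical, since the source does not state the boundaries.

```python
# Hypothetical lower bounds: the source names the groups but not the cut-offs.
RISK_GROUPS = [
    (0.75, "ultra-high-risk"),
    (0.50, "high-risk"),
    (0.25, "medium-risk"),
    (0.00, "low-risk"),
]
# Risk groups assumed to be set in advance as "to be managed".
MANAGED_GROUPS = {"high-risk", "ultra-high-risk"}


def classify_risk(prediction):
    """Return the risk group for a disease onset prediction value in [0, 1]."""
    if not 0.0 <= prediction <= 1.0:
        raise ValueError("prediction must be between 0 and 1")
    for lower_bound, name in RISK_GROUPS:
        if prediction >= lower_bound:
            return name


def diseases_to_manage(predictions):
    """Keep the diseases whose prediction falls in a managed risk group."""
    return [d for d, p in predictions.items() if classify_risk(p) in MANAGED_GROUPS]


# Illustrative prediction values for one customer.
customer_predictions = {"hypertension": 0.97, "diabetes": 0.10}
print(diseases_to_manage(customer_predictions))
```

Keeping the cut-offs in a single table makes it straightforward to change the number of groups or their boundaries without touching the classification logic.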
Further, in the health care consultation system using the distribution of disease prediction values according to the present invention, the management index extraction unit selects the biometric indices with a high influence from among the biometric indices of the disease for which the customer has a risk of onset, either by selecting a predetermined number in descending order of influence or by selecting the biometric indices whose influence is at or above a predetermined reference value; calculates the customer's value for each selected biometric index; determines whether the calculated value lies outside the normal quantile of the distribution of that biometric index in the normal risk group of the disease; and, if it lies outside, finally extracts that biometric index as a management index.
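A minimal Python sketch of this two-stage extraction is given below, under stated assumptions. The function names, the 10%/90% "normal band", and all sample values are hypothetical, except that the BMI of 24, the systolic blood pressure of 159, and the influence of 0.776 echo figures quoted elsewhere in this document.

```python
from bisect import bisect_right


def select_influential(influence, top_k=None, min_influence=None):
    """Pick indices by influence: the top_k highest, or all at/above min_influence."""
    ranked = sorted(influence, key=influence.get, reverse=True)
    if min_influence is not None:
        ranked = [name for name in ranked if influence[name] >= min_influence]
    return ranked if top_k is None else ranked[:top_k]


def percentile_rank(normal_values, value):
    """Fraction of the normal-risk group with an index value <= value."""
    ordered = sorted(normal_values)
    return bisect_right(ordered, value) / len(ordered)


def extract_management_indices(influence, normal_groups, customer,
                               top_k=2, normal_band=(0.10, 0.90)):
    """Keep influential indices whose customer value is outside the normal band."""
    low, high = normal_band
    result = []
    for name in select_influential(influence, top_k=top_k):
        rank = percentile_rank(normal_groups[name], customer[name])
        if rank < low or rank > high:
            result.append(name)
    return result


# Illustrative inputs; the normal-group samples are made up.
influence = {"BP_HIGH": 0.776, "BMI": 0.41, "HMG": 0.12}
normal_groups = {
    "BP_HIGH": [105, 110, 112, 115, 118, 120, 122, 125, 128, 130],
    "BMI": [19, 20, 21, 22, 22, 23, 24, 25, 26, 27],
}
customer = {"BP_HIGH": 159, "BMI": 24, "HMG": 14.0}
print(extract_management_indices(influence, normal_groups, customer))
```

In this example both BP_HIGH and BMI pass the influence stage, but only BP_HIGH lies outside the assumed normal band, so it alone is returned as a management index.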
Further, in the health care consultation system using the distribution of disease prediction values according to the present invention, the biometric index evaluation unit obtains the difference between distributions by using the cumulative distribution functions of those distributions: it calculates the maximum residual of the cumulative distribution functions and uses the calculated maximum residual as the difference between the distributions.
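The maximum residual can be illustrated with a short Python sketch that builds an empirical cumulative distribution function for each risk group and takes the largest absolute gap between them. This is the same quantity as the two-sample Kolmogorov-Smirnov statistic, although the source does not use that name. The function names and the sample blood pressure values are assumptions made for illustration.

```python
from bisect import bisect_right


def ecdf(sample):
    """Return a function x -> fraction of sample values <= x."""
    ordered = sorted(sample)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def max_residual(group_a, group_b):
    """Largest absolute gap between the two empirical CDFs (between 0 and 1)."""
    cdf_a, cdf_b = ecdf(group_a), ecdf(group_b)
    # The gap only changes at observed values, so checking those is sufficient.
    points = set(group_a) | set(group_b)
    return max(abs(cdf_a(x) - cdf_b(x)) for x in points)


# Made-up systolic blood pressure samples, not data from the patent.
low_risk = [105, 110, 112, 118, 120, 122, 125, 130]
ultra_high_risk = [128, 135, 140, 148, 150, 155, 160, 170]
print(max_residual(low_risk, ultra_high_risk))
```

A value near 1 means the two groups barely overlap on that index, and a value near 0 means their distributions are almost identical, which is why the value can serve as a model-independent influence score.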
Further, in the health care consultation system using the distribution of disease prediction values according to the present invention, the health information data of the customer or the patient consists of demographic information and health checkup data.
Further, in the health care consultation system using the distribution of disease prediction values according to the present invention, the demographic information includes sex, age, region of residence, insurance subscription type, income quantile, presence or absence of disability, height, and weight, and the health checkup data includes waist circumference, systolic blood pressure, diastolic blood pressure, fasting blood glucose, total cholesterol, high-density cholesterol, low-density cholesterol, triglycerides, hemoglobin, urine protein, serum creatinine, serum GOT, serum GPT, and gamma-GTP.
Further, in the health care consultation system using the distribution of disease prediction values according to the present invention, the system further comprises a result output unit that, for a specific biometric index of a specific disease, displays the distributions of the risk groups and outputs the customer's biometric index on the distribution.
As described above, the health care consultation system using the distribution of disease prediction values according to the present invention selects diseases with a high onset prediction value and extracts, through big data, the management indices that strongly influence those diseases, so that the health management indices most important and necessary for a customer can be extracted more scientifically and accurately.
In addition, the health care consultation system using the distribution of disease prediction values according to the present invention calculates disease onset prediction values from the customer's health checkup data, medical records at medical institutions, and the like, and extracts management factor indices for health management, so that a more accurate health management program can be presented to the customer without additional biometric measurements.
FIGS. 1A and 1B are configuration diagrams of the overall system for carrying out the present invention.
FIG. 2 is a block diagram of the configuration of a health care consultation system using the distribution of disease prediction values according to an embodiment of the present invention.
FIG. 3 is a table showing the input variables of a disease prediction model according to an embodiment of the present invention.
FIGS. 4A and 4B are distribution plots of the BP_HIGH and HMG variables for each hypertension prediction group according to an embodiment of the present invention.
FIG. 5 is a graph showing the influence/importance of the biometric indices for hypertension by group according to an embodiment of the present invention.
FIG. 6 is a table showing the severity of BMI when predicting hypertension according to an embodiment of the present invention.
FIGS. 7A and 7B are graphs showing the quantiles of patient A's BMI and systolic blood pressure variables in the normal group according to an embodiment of the present invention.
FIGS. 8A and 8B are graphs showing the quantiles of customer A's BMI and systolic blood pressure variables in the high-risk group according to an embodiment of the present invention.
FIG. 9 is a table illustrating the degree of management need according to the BMI and systolic blood pressure quantiles according to an embodiment of the present invention.
FIG. 10 is a table illustrating the degree of management need according to the BMI and systolic blood pressure quantiles according to an embodiment of the present invention.
FIG. 11 is a graph showing the management variables for the prediction of the five major cancers (biometric indices of the super-risk group) according to an embodiment of the present invention.
FIGS. 12A and 12B are graphs illustrating the distributions of the GAMMA_GTP and TRICLYCERIDE indices (distributions for the prediction of the five major cancers) according to an embodiment of the present invention.
Hereinafter, specific details for carrying out the present invention are described with reference to the drawings.
In describing the present invention, identical parts are given identical reference numerals, and repeated descriptions of them are omitted.
First, the configuration of the overall system for carrying out the present invention is described with reference to FIGS. 1A and 1B.
As shown in FIG. 1A, the health care consultation system using the distribution of disease prediction values based on prediction data according to the present invention (hereinafter, the consultation system) may be implemented as a program system on a computer terminal 10 that estimates a customer's/user's disease onset prediction values and uses the distributions to extract that customer's/user's health management indices.
That is, the consultation system 30 may be implemented as a program system on a computer terminal 10 such as a PC, a smartphone, or a tablet PC. In particular, the consultation system may be configured as a program system or a mobile application (app) that is installed and executed on the computer terminal 10. Using the hardware or software resources of the computer terminal 10, the consultation system 30 provides a service that receives health data, estimates disease onset prediction values, and extracts health management factors.
As another embodiment, as shown in FIG. 1B, the consultation system may be configured and executed as a server-client system consisting of a consultation client 30a on the computer terminal 10 and a consultation server 30b.
The consultation client 30a and the consultation server 30b may be implemented according to ordinary client-server design methods. That is, the functions of the overall system may be divided between the client and the server according to the client's performance, the volume of communication with the server, and so on. The description below refers simply to the consultation system, but the system may be implemented with various divisions of functions depending on the server-client configuration.
Alternatively, the consultation server 30b may be a server that provides the health care consultation service on the web, and the consultation client 30a may be a web browser or the like that connects to the consultation server 30b and uses the service.
The consultation server 30b may additionally include a database 40 that stores statistical information on users' or customers' health checkup results, disease prediction information, and the like.
Specifically, the database 40 includes a sample data DB 41 that stores sample data such as patients' demographic information and health checkup results, a risk group DB 42 that stores risk group information for each disease, and an index distribution DB 43 that stores the biometric index distributions of each risk group. However, this configuration of the database 40 is only one preferred embodiment; in developing a specific device, it may be given a different structure according to database design theory, taking into account ease and efficiency of access and retrieval.
Next, a health care consultation system 30 using the distribution of disease prediction values according to an embodiment of the present invention will be described with reference to FIG. 2.
As shown in FIG. 2, the disease prediction service system 30 according to an embodiment of the present invention includes: a sample data collection unit 31 that collects sociodemographic information, health checkup information, and the like of past patients as sample data; a health information collection unit 32 that collects the customer's sociodemographic information and health checkup results; a disease prediction unit 33 that predicts diseases using the sociodemographic information and health checkup results; a risk group classification unit 34 that classifies the sample data into risk groups; an index distribution generation unit 35 that obtains the biometric index distribution for each risk group; a biometric index evaluation unit 36 that evaluates each biometric index by comparing the distributions between risk groups; and a management index extraction unit 37 that extracts the customer's management indices. The system may further include a result output unit 38 that outputs the distributions and the customer's management indices.
First, the sample data collection unit 31 collects sociodemographic information, health checkup information, and the like of past patients as sample data.
The sociodemographic information is data representing the sociodemographic characteristics of a patient and consists of age, sex, height, weight, presence or absence of a disability, lifestyle habits, and the like.
The health checkup information is the patient's health checkup data, that is, data measured during a health checkup, such as blood pressure, cholesterol level, hemoglobin, and urine protein level.
The onset data is data on the diseases the patient has developed and indicates whether the patient has developed each disease.
Meanwhile, the sample data is collected classified or identified by sociodemographic information. That is, personal information that identifies the patient, such as the patient's name, is excluded, and the medical data is collected on the basis of health status information such as age, sex, height, weight, presence or absence of a disability, and lifestyle habits.
Particularly preferably, the sample data is taken from a sample cohort DB. The full data set of the sample cohort DB covers one million citizens. Because these one million subjects were selected by stratified sampling based on the sex, age, and region-of-residence distribution of the whole population, results derived from this data can be regarded as representative of the whole population.
Next, the health information collection unit 32 collects the customer's sociodemographic information and health checkup data.
First, the health information collection unit 32 receives the customer's sociodemographic information. This information may be entered through a survey or a medical questionnaire.
As described above, the customer's sociodemographic information consists of age, sex, height, weight, presence or absence of a disability, lifestyle habits, income bracket, personal medical history, family medical history, and the like.
Preferably, the survey or questionnaire consists of 26 items. That is, the questionnaire data includes sex, age, region of residence, insurance subscription type, income bracket, presence or absence of a disability, type of examination institution, height, weight, personal medical history (stroke, heart disease, hypertension, diabetes, dyslipidemia, pulmonary tuberculosis, and other diseases including cancer), family medical history (stroke, heart disease, hypertension, diabetes, liver disease, cancer), smoking status, smoking duration, amount smoked per day, drinking habits, amount of alcohol per drinking occasion, amount of exercise per week, and the like.
In addition, the health information collection unit 32 collects the customer's health checkup data.
The customer's health checkup data consists of data measured during a health checkup, such as systolic blood pressure, diastolic blood pressure, pre-meal blood glucose, total cholesterol, high-density cholesterol, low-density cholesterol, triglycerides, and urine protein. The data usually measured during a health checkup includes waist circumference, systolic blood pressure, diastolic blood pressure, fasting blood glucose, total cholesterol, HDL cholesterol, LDL cholesterol, triglycerides, hemoglobin, urine protein, serum creatinine, serum GOT, serum GPT, gamma-GTP, and the like.
At this time, the customer's health checkup data is either entered directly by the customer or, after the customer completes certificate-based authentication, the most recent health checkup data is retrieved from the National Health Insurance Service, a medical data institution (the 건강인 site), or the like.
In general, the items collected through the pre-checkup survey of the National Health Insurance Service general health checkup are commonly called questionnaire data. However, when the customer's health checkup data is retrieved, the questionnaire data from the checkup process cannot be retrieved with it. This data can therefore be entered through a separate questionnaire. Likewise, although the customer's height and weight are among the checkup items, they are entered directly through the separate questionnaire.
Next, the disease prediction unit 33 predicts the diseases of a patient or customer using the patient's or customer's sociodemographic information and health checkup data.
Preferably, the disease prediction unit 33 predicts the customer's diseases using a disease prediction model. Given input values for a predetermined set of input variables, the disease prediction model outputs the onset probability of each predetermined disease variable.
In particular, the disease prediction model is built from a neural network or the like, and its internal variables are learned from training data. Once trained, the model receives the values of the input variables and outputs the onset probability of each disease.
The disease prediction model is the output of an artificial intelligence neural network trained by machine learning on millions of records, including health checkup results, sociodemographic factors, and lifestyle habits, from thousands to tens of thousands of people selected to be representative of domestic patients for each disease. The computed result can change substantially as the user steadily improves his or her health behavior. FIG. 3 shows an example of the input variables of the disease prediction model, illustrating a case with a total of 44 input variables.
In one embodiment, the output variables consist of the onset probabilities for 12 diseases (the 12 major diseases): breast cancer, the five major cancers, all cancers combined, cerebrovascular disease, osteoporosis, cataract, hypertension, obesity, diabetes, COPD (chronic obstructive pulmonary disease), joint disease, and dyslipidemia.
That is, each output variable is the onset probability, or onset prediction value, of a disease.
Next, the risk group classification unit 34 obtains the disease prediction value of each patient from that patient's data in the sample data, and classifies the patients into a plurality of risk groups according to the magnitude of the disease prediction value.
First, the risk group classification unit 34 obtains each patient's disease prediction value from the patient's data through the disease prediction unit 33. A prediction value is obtained for each disease. That is, as described above, the disease prediction unit 33 calculates the disease onset prediction value from the patient's sociodemographic information and health checkup information.
The risk group classification unit 34 then divides the obtained disease prediction values into quantiles, forms a risk group for each quantile, and assigns each patient to the risk group whose quantile contains that patient's prediction value.
That is, the risk groups are formed by dividing all disease prediction values into a number of quantiles (intervals) by magnitude. Preferably, the risk groups are a low-risk group (0-25%), a medium-risk group (25-50%), a high-risk group (50-75%), and an ultra-high-risk group (75-100%). In other words, when the disease prediction values of all patients are sorted by magnitude, the low-risk group consists of the patients whose prediction values fall in the bottom 0% to 25%.
In addition, the risk groups are formed separately for each disease. For example, a given patient may belong to the high-risk group for stomach cancer but to the low-risk group for liver cancer.
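The quartile grouping described above can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name, the group labels, and the sample prediction values are hypothetical, and ties are broken simply by sort order.

```python
from typing import Dict

# Labels follow the four-group example in the text (names are illustrative).
RISK_GROUPS = ["low", "medium", "high", "ultra_high"]

def classify_by_quartile(predictions: Dict[str, float]) -> Dict[str, str]:
    """Assign each patient to a risk group by the rank of the prediction value."""
    ordered = sorted(predictions, key=predictions.get)  # ascending by prediction
    n = len(ordered)
    groups = {}
    for rank, patient_id in enumerate(ordered):
        # rank / n lies in [0, 1); scaling by 4 selects quartile 0..3.
        quartile = min(rank * 4 // n, 3)
        groups[patient_id] = RISK_GROUPS[quartile]
    return groups

# Hypothetical hypertension prediction values for eight sample patients.
hypertension = {"p1": 0.05, "p2": 0.92, "p3": 0.40, "p4": 0.61,
                "p5": 0.15, "p6": 0.77, "p7": 0.33, "p8": 0.58}
groups = classify_by_quartile(hypertension)
```

Because the grouping is per disease, the same function would be called once for each disease's prediction values, so one patient can land in different groups for different diseases.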
Next, the index distribution generation unit 35 generates, for each biometric index, a distribution (the biometric index distribution, or index distribution) from the biometric index values of the patients belonging to each risk group.
Preferably, a biometric index is an item (variable) of the sociodemographic data or health checkup data, or a combination of such items. For example, BMI and blood pressure can be biometric indices. BMI (body mass index) is obtained by combining the weight and height in the health checkup data. Blood pressure is a single item of the health checkup data. Alternatively, when the health checkup data contains several kinds of blood pressure data, these may be combined into a single blood pressure index, which then serves as the biometric index.
That is, because a biometric index is computed from items of the sociodemographic data or health checkup data, it can be computed for every patient in the sample data.
For the risk groups of a given disease, the index distribution generation unit 35 calculates the biometric index of every patient in each risk group and generates a distribution from the calculated values. This is called the biometric index distribution (or index distribution) of the risk group.
An index distribution is generated for each risk group: for example, the index distribution of the low-risk group, of the medium-risk group, of the high-risk group, and of the ultra-high-risk group. With four risk groups, four index distributions are generated.
In addition, the index distributions of all risk groups are generated for each biometric index and for each disease.
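As a rough sketch of this step for a single disease, the snippet below computes two biometric indices per patient and collects them into one sorted list per (risk group, index) pair. The record field names, the `INDEX_FUNCTIONS` table, and the sample patients are assumptions made for illustration; only the BMI definition (weight divided by height squared) and the BP_HIGH name come from the text.

```python
from collections import defaultdict

def bmi(weight_kg: float, height_cm: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    height_m = height_cm / 100.0
    return weight_kg / (height_m * height_m)

# Each biometric index is a single item or a combination of items.
INDEX_FUNCTIONS = {
    "BMI": lambda p: bmi(p["weight_kg"], p["height_cm"]),
    "BP_HIGH": lambda p: p["bp_high"],  # systolic blood pressure
}

def build_index_distributions(patients, risk_group_of):
    """Return {(risk_group, index_name): sorted index values} for one disease."""
    distributions = defaultdict(list)
    for patient in patients:
        group = risk_group_of[patient["id"]]
        for index_name, fn in INDEX_FUNCTIONS.items():
            distributions[(group, index_name)].append(fn(patient))
    return {key: sorted(values) for key, values in distributions.items()}

# Hypothetical sample records and risk group assignments.
patients = [
    {"id": "p1", "weight_kg": 60.0, "height_cm": 170.0, "bp_high": 112},
    {"id": "p2", "weight_kg": 85.0, "height_cm": 172.0, "bp_high": 158},
    {"id": "p3", "weight_kg": 64.0, "height_cm": 165.0, "bp_high": 108},
]
risk_group_of = {"p1": "low", "p2": "ultra_high", "p3": "low"}
distributions = build_index_distributions(patients, risk_group_of)
```

Keeping each distribution as a sorted list makes later quantile lookups and cumulative distribution comparisons a matter of binary search.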
Next, for each disease, the biometric index evaluation unit 36 calculates the degree of difference between the index distributions of that disease's risk groups. That is, it uses the degree of difference between the distributions to evaluate the influence of each biometric index on the disease.
For a given biometric index of a given disease, a significant difference between the risk groups' distributions of that index means that the index affects the risk of the disease. In other words, when the distribution of the low-risk group and that of the high-risk group differ greatly, managing a high-risk patient's biometric index so that it moves toward the mean of the low-risk group's distribution can lower that patient's prediction value for the disease.
Preferably, the biometric index evaluation unit 36 calculates the influence of a biometric index from the degree of difference between the distributions.
That is, the influence of a biometric index (e.g., BMI, blood pressure) is calculated as follows.
After the index distribution of each risk group is determined, the importance (or influence) of each index in predicting the disease is determined from the degree to which its distribution differs between the groups.
Preferably, the degree of difference between distributions is measured by the residual of the cumulative distribution functions. That is, the cumulative distribution function of each group's distribution is derived, and the maximum residual between each group's cumulative distribution function and that of the normal risk group (in the four-group example above, the low-risk group, i.e., the people whose predicted hypertension onset falls in the 0-25% range) is calculated. The residual of the cumulative distribution functions is the difference between their values. That is, among the differences between the cumulative distribution function values at each index value, the largest is taken as the maximum residual.
An index with a large maximum residual (between 0 and 1) between the groups' cumulative distributions can be regarded as strongly correlated with the disease prediction probability, and hence as having a large influence on the disease prediction value.
Therefore, the importance of a biometric index is calculated using the maximum residual.
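The maximum residual between two cumulative distributions can be sketched with empirical distribution functions as below. The function names and sample values are hypothetical, and taking the absolute value of the difference is an assumption (the text only says "difference"). Computed this way, the quantity coincides with the two-sample Kolmogorov-Smirnov statistic, although the source does not use that name.

```python
from bisect import bisect_right

def ecdf(sorted_values, x):
    """Empirical CDF: fraction of values less than or equal to x."""
    return bisect_right(sorted_values, x) / len(sorted_values)

def max_cdf_residual(group_values, normal_values):
    """Largest absolute gap between the two empirical CDFs (range 0 to 1)."""
    a = sorted(group_values)
    b = sorted(normal_values)
    # Both step functions only change at observed values, so it is enough
    # to evaluate the gap at every distinct observed index value.
    points = sorted(set(a) | set(b))
    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in points)

# Hypothetical systolic blood pressure samples.
low_risk = [108, 112, 118, 121]
ultra_high_risk = [150, 155, 160, 165]
influence = max_cdf_residual(ultra_high_risk, low_risk)
```

Completely separated samples, as in this toy data, give a residual of 1, while identical samples give 0; the 0.776 reported for BP_HIGH in the text lies between these extremes.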
Referring to the distribution plot in the upper part of FIG. 4A, the distributions of the BP_HIGH (systolic blood pressure) index clearly differ between the groups formed by the predicted onset of hypertension. When the maximum residual of each group is calculated from the cumulative distribution plot in the lower part of FIG. 4A, the maximum residual between the ultra-high-risk group and the low-risk group is 0.776. Because the maximum residual grows markedly as the risk of onset increases, the systolic blood pressure value can be seen to have a large influence on the prediction of hypertension. Conversely, the maximum residuals of HMG (hemoglobin) change little as the risk increases, which leads to the conclusion that it is not an important variable.
Referring to FIG. 5, the variable importance for hypertension prediction can be checked for each of the medium-risk, high-risk, and ultra-high-risk groups. The importance (influence) ranking of the indices differs between groups, and the absolute importance (influence) of each index decreases as the risk decreases.
Next, the management index extraction unit 37 calculates the customer's disease onset prediction values, extracts the diseases at risk of onset, and extracts, from the biometric indices of each such disease, those with high influence as management indices. That is, biometric indices are selected as management indices according to the magnitude of their influence.
First, the management index extraction unit 37 obtains the customer's disease prediction values from the customer's data through the disease prediction unit 33. A prediction value is obtained for each disease. That is, the disease prediction unit 33 calculates the customer's disease onset prediction value from the customer's sociodemographic information and health checkup information.
Next, the management index extraction unit 37 determines which risk group each obtained prediction value falls into, and if it falls into a risk group that requires management, the disease is judged to be a disease to manage (a disease at risk). The risk groups that require management are determined in advance. For example, among the four risk groups above, those with high predicted onset, such as the high-risk and ultra-high-risk groups, may be set as the risk groups that require management.
Next, the management index extraction unit 37 selects the biometric indices with high influence among the biometric indices of the customer's disease to manage. Either a predetermined number of indices is selected in descending order of influence, or the indices whose influence is at or above a predetermined reference value are selected.
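The two selection rules just described (a fixed number of indices in descending order of influence, or every index at or above a reference value) can be sketched as follows. The function name and parameters are hypothetical. The BP_HIGH value of 0.776 is the one reported in the text; the other influence values are made up for the example.

```python
def select_by_influence(influence, top_k=None, threshold=None):
    """Return index names ordered by descending influence, after filtering."""
    ranked = sorted(influence.items(), key=lambda kv: kv[1], reverse=True)
    if top_k is not None:
        ranked = ranked[:top_k]  # keep a predetermined number of indices
    if threshold is not None:
        ranked = [(name, v) for name, v in ranked if v >= threshold]
    return [name for name, _ in ranked]

# Influence of each index for one disease and one risk group.
# Only the BP_HIGH value appears in the text; the rest are hypothetical.
hypertension_influence = {"BP_HIGH": 0.776, "BMI": 0.41, "HMG": 0.08}

top_two = select_by_influence(hypertension_influence, top_k=2)
above_half = select_by_influence(hypertension_influence, threshold=0.5)
```

Passing both `top_k` and `threshold` applies the count limit first and then the reference value, which is one possible way to combine the two rules.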
Next, for each selected biometric index, the management index extraction unit 37 calculates the customer's value of that index and determines whether it lies outside the normal quantile range, or a predetermined range, of the biometric index distribution of the normal risk group for the disease.
Here, the normal risk group (or normal group) is the risk group with the lowest predicted onset and is determined in advance. In the example above with four risk groups, the low-risk group is the normal risk group.
If the deviation is greater than the reference value, the biometric index is finally extracted as a management index.
Preferably, the management level is set according to the range (quantile) of the normal risk group's distribution in which the customer's biometric index falls.
Next, the result output unit 38 outputs the biometric index distributions of the risk groups and the customer's biometric indices or management indices to the screen.
That is, for a given biometric index of a given disease, it displays the distributions of the risk groups and marks the customer's biometric index on them.
By seeing his or her own status in this way, the customer can accurately understand his or her health condition and become more alert to health care.
Next, an example of extracting customized management indices for a specific customer (A) according to an embodiment of the present invention will be described in detail with reference to FIGS. 6 to 12.
After checking the quantile of each of the management indices selected in the first stage, both in the low-risk group's distribution and in the distribution of the group to which the customer belongs (e.g., the ultra-high-risk group), the severity is assessed from the result and a health plan is established.
First, the customer's quantile in the index distribution of the low-risk group is checked. The severity of customer A's measured value (biometric index) relative to the normal group is assessed, and the need for management is decided.
For each index, the quantile range used to judge the severity of the measured value and the need for management is determined. For example, as shown in the table of FIG. 6, the risk of hypertension increases as BMI increases, so a customer's BMI is rated "serious" when it falls within the top/bottom 25% of the normal group and "very serious" when it falls within 10%.
As an example, FIGS. 7A and 7B show the quantiles, within the normal group's distributions, of the BMI and systolic blood pressure of a customer whose predicted probability of developing hypertension is 97%. The customer's BMI is 24 and systolic blood pressure is 159. FIG. 7A shows the quantile of customer A's BMI in the normal group, and FIG. 7B shows the quantile of the systolic blood pressure variable.
Customer A's BMI is at the top 22.6% of the normal group's distribution, which can be regarded as calling for appropriate management and attention. Customer A's systolic blood pressure is at the top 0.01% compared with the normal group's distribution, which calls for intensive management and serious attention.
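The quantile lookup and the severity rule from the FIG. 6 example can be sketched as follows. The function names, the `two_sided` switch, and the "normal" label are assumptions; the 25% and 10% cut-offs and the "serious" / "very serious" labels are taken from the text. By default only the upper tail is checked, on the assumption that a higher value means higher risk for the index in question.

```python
from bisect import bisect_left

def upper_quantile(sorted_values, x):
    """Fraction of the group whose value is >= x (the 'top p%' position)."""
    n = len(sorted_values)
    return (n - bisect_left(sorted_values, x)) / n

def severity(upper_q, serious=0.25, very_serious=0.10, two_sided=False):
    """Map a top-quantile position to a severity label."""
    # With two_sided=True the bottom tail counts as well ("top/bottom 25%").
    tail = min(upper_q, 1.0 - upper_q) if two_sided else upper_q
    if tail <= very_serious:
        return "very serious"
    if tail <= serious:
        return "serious"
    return "normal"

# Quantile positions quoted in the text for customer A in the normal group.
bmi_label = severity(0.226)    # BMI at the top 22.6%
bp_label = severity(0.0001)    # systolic blood pressure at the top 0.01%
```

The same `upper_quantile` lookup applies unchanged to the distribution of the customer's own risk group, which is the second check described in the text.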
Next, the customer's quantile in the group to which the customer belongs is checked.
For each index, the range of quantile values used to judge the severity of the measured value and the need for management is determined. For example, the range may be set to values in the top 20% or higher.
The upper quantile (position) of customer A's measured value in the index distribution of the group to which customer A belongs is checked. This quantile determines the severity of customer A's measured value, and the need for management, relative to that group.
The example considers the quantiles, within the ultra-high-risk group's distributions, of the BMI and systolic blood pressure of a customer whose hypertension prediction rate is 97%. That is, FIGS. 8A and 8B show the quantiles of customer A's BMI and systolic blood pressure variables in the high-risk group. The management criteria shown in FIG. 9 are applied here.
Referring to FIGS. 8A, 8B, and 9, customer A's BMI is at the top 62.8% of the risk group to which customer A belongs, which is about average within that group. Customer A's blood pressure, on the other hand, is at the top 13.4%, which is very serious even within the risk group. Customer A's blood pressure therefore requires intensive management.
Meanwhile, health consultation can be provided using the identified management indices.
In the following example, as in FIG. 10, the disease is the five major cancers and customer A's probability of developing them is 0.8. FIG. 11 shows the biometric indices for predicting the five major cancers in the ultra-high-risk group. FIGS. 12A and 12B illustrate the distributions of the GAMMA_GTP and TRIGLYCERIDE indices (the distributions corresponding to the prediction of the five major cancers).
As in FIG. 10, when the influence/importance of the biometric indices is calculated in the manner described above, the GAMMA_GTP (gamma-GTP) value is a very important index for predicting the five major cancers for customers at ultra-high risk. Nevertheless, customer A's gamma-GTP value is at the top 79% (bottom 21%) of the normal group's distribution, so it can be regarded as a variable that needs comparatively little management. Its quantile within the ultra-high-risk group to which customer A belongs is also the top 89%, so no severity is indicated.
In contrast, for TRIGLYCERIDE (triglycerides), the importance of the variable for the ultra-high-risk group is relatively low (ranked 8th) but not the lowest, and customer A's value is at the top 8% of the normal group's distribution, which is very serious. It therefore needs to be managed.
Next, the effects of the present invention will be described in more detail.
For each individual, the indices with high influence on each disease onset prediction value are identified, and the indices that need intensive management are determined by checking the individual's quantile in each index's distribution.
The index importance (influence) is calculated for each level of risk: the low-risk, medium-risk, high-risk, and ultra-high-risk groups.
By checking the low-risk group's distribution for a given index together with the customer's quantile in the group determined by the individual's prediction value, the relative distribution and position of the index can be identified, so the importance of the index can be understood without prior background knowledge.
By managing health with a focus on the identified important indices, an individual's appropriate health level can be estimated more accurately, which in turn allows subsequent health care to be carried out more effectively.
The approach differs from existing methods of identifying important indices as follows. Existing methods of determining per-index importance are tied to the algorithm used (e.g., the per-index importance measure of a random forest) and are not intuitive: the criterion by which importance is assigned is hard to identify and hard to understand. In addition, they compute a single importance value per index that applies to all customer groups.
In contrast, the index importance/influence calculation method of the present invention can be applied to any model and is intuitive to understand. Moreover, because a different index importance is calculated for each risk group (low-risk, medium-risk, high-risk, very-high-risk, and so on), the index importance appropriate to a given customer can be identified.
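As a hedged illustration of a model-independent, per-risk-group importance, the sketch below scores each index by the largest gap between the empirical cumulative distribution functions of two risk groups. This follows the "maximum residual" of cumulative distribution functions recited in claim 1, read here as the two-sample Kolmogorov-Smirnov statistic; that reading, the function names, and the toy data are assumptions.

```python
from bisect import bisect_right


def ecdf(values):
    """Return the empirical CDF of `values` as a function of x."""
    ordered = sorted(values)
    n = len(ordered)
    return lambda x: bisect_right(ordered, x) / n


def max_cdf_gap(a, b):
    """Largest absolute gap between the empirical CDFs of a and b.

    Both CDFs are step functions, so the maximum is reached at a data point.
    """
    fa, fb = ecdf(a), ecdf(b)
    return max(abs(fa(x) - fb(x)) for x in set(a) | set(b))


def importance_for_group(index_values, group, reference="low"):
    """Score every index by how far `group` departs from the reference group."""
    return {
        name: max_cdf_gap(groups[reference], groups[group])
        for name, groups in index_values.items()
    }


# Toy data (hypothetical): index name -> risk group -> observed values.
index_values = {
    "glucose": {"low": [88, 90, 92, 95], "high": [110, 118, 125, 131]},
    "height": {"low": [160, 165, 170, 175], "high": [162, 165, 171, 174]},
}

scores = importance_for_group(index_values, "high")
print(scores)
```

Because the score only needs the index values and the group labels, it does not depend on which prediction model produced the labels, and calling `importance_for_group` once per risk group yields a separate ranking for each group.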
The invention made by the present inventors has been described above in detail with reference to the embodiments; however, the present invention is not limited to those embodiments and can of course be modified in various ways without departing from its gist.
The present invention is applicable to health care consultation system technology using the distribution of disease prediction values. Such a system selects diseases with a high onset prediction value and uses big data to extract the management indices that most affect those diseases, so that the health management indices most important and necessary to the customer are extracted more scientifically and accurately. It also calculates disease onset prediction values from the customer's health checkup data, medical records from medical institutions, and the like, and extracts health management factor indices, so that a more accurate health management program can be presented without additional biometric measurement of the customer.

Claims (4)

  1. A health care consultation system using a distribution of disease prediction values, comprising:
    a sample data collection unit that collects health information data of past patients as sample data;
    a health checkup collection unit that collects health information data of a customer;
    a disease prediction unit that predicts a disease using health information data;
    a risk group classification unit that obtains, through the disease prediction unit, a disease prediction value of each patient from that patient's data in the sample data, and classifies each patient into one of a plurality of risk groups according to the magnitude of the disease prediction value;
    an index distribution generation unit that, for the patients of the sample data, generates, for each bio-index, a distribution from the bio-index values of the patients belonging to each risk group;
    a bio-index evaluation unit that, for each disease, calculates the influence of each bio-index by calculating the degree of difference between the bio-index distributions of the risk groups of that disease; and
    a management index extraction unit that calculates the customer's disease onset prediction value through the disease prediction unit to extract a disease at risk of onset, and extracts management indices from among the bio-indices of the extracted disease according to the magnitude of their influence,
    wherein the disease prediction unit predicts the customer's disease using a disease prediction model, the disease prediction model outputs the onset probability of each predetermined disease variable when it receives input values of predetermined input variables, and the disease prediction model is a model whose internal variables are learned from training data,
    wherein the bio-indices are made from items of the health information data,
    wherein the management index extraction unit determines which risk group the obtained disease prediction value belongs to, and determines that the disease is a disease at risk of onset if the value belongs to a predetermined risk group designated as having a high onset prediction value, and
    wherein the bio-index evaluation unit obtains the difference between distributions by calculating the maximum residual between the cumulative distribution functions of the distributions and using the calculated maximum residual.
  2. The system of claim 1,
    wherein the management index extraction unit selects bio-indices of high influence from among the bio-indices of the disease for which the customer is at risk of onset, selecting either a predetermined number in descending order of influence or those whose influence is at or above a predetermined reference value; calculates the customer's bio-index for each selected bio-index; determines whether the calculated customer's bio-index lies outside the normal quantile of the bio-index distribution of the normal risk group of the disease; and, if it lies outside that quantile, finally extracts that bio-index as a management index.
  3. The system of claim 1,
    wherein the health information data of the customer or the patient consists of sociodemographic information and health checkup data.
  4. The system of claim 1,
    further comprising a result output unit that, for a specific bio-index of a specific disease, displays the distributions of the risk groups and outputs the customer's bio-index on the distributions.
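The selection step recited in claim 2 can be sketched as follows, for illustration only. The 5th to 95th percentile band used as the "normal quantile", the parameter names, and the toy values are assumptions; the claims do not fix any of them.

```python
from statistics import quantiles


def select_management_indices(influence, customer, normal_group,
                              top_k=None, min_influence=0.0, band=(5, 95)):
    """Pick high-influence bio-indices whose customer value is outside the normal band.

    influence:     bio-index name -> influence score for the at-risk disease
    customer:      bio-index name -> the customer's value
    normal_group:  bio-index name -> values observed in the normal risk group
    top_k:         if given, keep the top_k most influential indices;
                   otherwise keep those with influence >= min_influence
    band:          (lower, upper) percentiles treated as the normal range (assumed)
    """
    ranked = sorted(influence, key=influence.get, reverse=True)
    if top_k is not None:
        candidates = ranked[:top_k]
    else:
        candidates = [name for name in ranked if influence[name] >= min_influence]

    selected = []
    for name in candidates:
        # 99 percentile cut points of the normal risk group's distribution.
        cuts = quantiles(normal_group[name], n=100, method="inclusive")
        low, high = cuts[band[0] - 1], cuts[band[1] - 1]
        if not (low <= customer[name] <= high):
            selected.append(name)
    return selected


# Toy inputs (hypothetical values).
influence = {"glucose": 1.0, "bmi": 0.6, "height": 0.25}
normal_group = {
    "glucose": [88, 90, 92, 95],
    "bmi": [20, 21, 22, 23, 24],
    "height": [160, 165, 170, 175],
}
customer = {"glucose": 118, "bmi": 22, "height": 190}

by_count = select_management_indices(influence, customer, normal_group, top_k=2)
by_threshold = select_management_indices(influence, customer, normal_group, min_influence=0.2)
print(by_count, by_threshold)
```

The two calls correspond to the two alternatives in the claim: a fixed number of indices in descending order of influence, or every index at or above a reference influence value.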
PCT/KR2022/004222 2021-03-30 2022-03-25 Health care consultation system using distribution of disease prediction values WO2022211385A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
KR10-2021-0040913 2021-03-30
KR1020210040913A KR102342770B1 (en) 2021-03-30 2021-03-30 A health management counseling system using the distribution of predicted disease values

Publications (1)

Publication Number Publication Date
WO2022211385A1 true WO2022211385A1 (en) 2022-10-06

Family

ID=79175827

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/KR2022/004222 WO2022211385A1 (en) 2021-03-30 2022-03-25 Health care consultation system using distribution of disease prediction values

Country Status (2)

Country Link
KR (1) KR102342770B1 (en)
WO (1) WO2022211385A1 (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR102342770B1 (en) * 2021-03-30 2021-12-23 주식회사 라이프시맨틱스 A health management counseling system using the distribution of predicted disease values
KR102434112B1 (en) * 2022-01-12 2022-08-24 주식회사 에이치디메디 Method and apparatus for generating disease prediction ai model, and system and method for predicting user-customized disease using the same

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20170060557A (en) * 2015-11-23 2017-06-01 한국전자통신연구원 Apparatus and method for predicting future health
KR20170061222A (en) * 2015-11-25 2017-06-05 한국전자통신연구원 The method for prediction health data value through generation of health data pattern and the apparatus thereof
KR101792982B1 (en) * 2017-07-25 2017-11-20 국민건강보험공단 Healthcare message management apparatus
KR101876858B1 (en) * 2017-03-13 2018-07-11 삼성화재해상보험 주식회사 Disease prediction and consulting system
KR101927669B1 (en) * 2018-08-14 2019-03-12 국민건강보험공단 Method for providing customized health-care service algorithm
KR20190030876A (en) * 2017-09-15 2019-03-25 주식회사 셀바스에이아이 Method for prediting health risk
KR102342770B1 (en) * 2021-03-30 2021-12-23 주식회사 라이프시맨틱스 A health management counseling system using the distribution of predicted disease values

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR100673237B1 (en) 2004-10-27 2007-01-22 에스케이 텔레콤주식회사 System and method for health management using mobile station
KR101747521B1 (en) 2015-06-09 2017-06-15 이화여자대학교 산학협력단 Smart helth-care information service mehtod and computer program
KR101866909B1 (en) 2016-05-20 2018-07-23 정택진 Method and device for providing intergrated health care service

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
KR20170060557A (en) * 2015-11-23 2017-06-01 한국전자통신연구원 Apparatus and method for predicting future health
KR20170061222A (en) * 2015-11-25 2017-06-05 한국전자통신연구원 The method for prediction health data value through generation of health data pattern and the apparatus thereof
KR101876858B1 (en) * 2017-03-13 2018-07-11 삼성화재해상보험 주식회사 Disease prediction and consulting system
KR101792982B1 (en) * 2017-07-25 2017-11-20 국민건강보험공단 Healthcare message management apparatus
KR20190030876A (en) * 2017-09-15 2019-03-25 주식회사 셀바스에이아이 Method for prediting health risk
KR101927669B1 (en) * 2018-08-14 2019-03-12 국민건강보험공단 Method for providing customized health-care service algorithm
KR102342770B1 (en) * 2021-03-30 2021-12-23 주식회사 라이프시맨틱스 A health management counseling system using the distribution of predicted disease values

Also Published As

Publication number Publication date
KR102342770B1 (en) 2021-12-23

Legal Events

Date Code Title Description
121 Ep: the EPO has been informed by WIPO that EP was designated in this application (Ref document number: 22781504; Country of ref document: EP; Kind code of ref document: A1)
NENP Non-entry into the national phase (Ref country code: DE)
122 Ep: PCT application non-entry in European phase (Ref document number: 22781504; Country of ref document: EP; Kind code of ref document: A1)