WO2023003315A1 - System and method for predicting early discontinuation of outpatient treatment in patients with alcohol use disorder - Google Patents
- Publication number
- WO2023003315A1 (PCT/KR2022/010516)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- alcohol use
- patients
- use disorder
- outpatient treatment
- predictive model
Classifications
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/50—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/10—Machine learning using kernel methods, e.g. support vector machines [SVM]
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H20/00—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance
- G16H20/70—ICT specially adapted for therapies or health-improving plans, e.g. for handling prescriptions, for steering therapy or for monitoring patient compliance relating to mental therapies, e.g. psychological therapy or autogenous training
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/20—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/70—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
Definitions
- The present invention provides a system and method for predicting early discontinuation of outpatient treatment in patients with alcohol use disorder.
- Machine learning is the study of computer algorithms that automatically improve through experience and the use of data. Machine learning is also considered a part of artificial intelligence. Rather than being explicitly programmed to perform specific actions, machine learning algorithms build models that make predictions or decisions based on sample data, called training data. Machine learning can be used for a variety of applications, including medicine, speech recognition, and computer vision.
- Predictive models based on machine learning can classify with high accuracy.
- Predictive models based on machine learning have proven useful in developing decision support systems.
- Alcohol use disorder may cause not only physical diseases such as alcohol-induced physical complications and alcohol-related dementia, but also social problems such as alcohol-related crimes and accidents, and enormous economic losses.
- Alcohol use disorder has a higher relapse rate than other mental disorders. To prevent recurrence, the disorder needs to be managed over a long period of time rather than ended with a single treatment. In addition, continuous treatment can have a positive effect on the treatment outcome. Therefore, continuous follow-up of patients is an important indicator for evaluating the prognosis of alcohol use disorder.
- Embodiments of the present invention can predict whether a patient with alcohol use disorder will discontinue outpatient treatment early by calculating the probability of early withdrawal from outpatient treatment using a predictive model designed through machine learning.
- Embodiments of the present invention help in patient management so that patients with alcohol use disorder who have a high risk of early outpatient treatment discontinuation can continue treatment steadily, and ultimately contribute to preventing relapse and increasing the success rate of treatment.
- Embodiments of the present invention collect data of a plurality of alcohol use disorder patients and apply one or more machine learning algorithms to generate a predictive model for early discontinuation of outpatient treatment of the plurality of alcohol use disorder patients.
- Embodiments of the present invention provide a system for predicting early discontinuation of outpatient treatment for patients with alcohol use disorder, the system including: a pre-processing unit that determines a plurality of independent variables and generates processed data by processing data of a plurality of alcohol use disorder patients; a predictive model generation unit that receives the processed data, sets whether the outpatient treatment of the plurality of alcohol use disorder patients is prematurely discontinued as a dependent variable, and generates a predictive model by applying one or more machine learning algorithms to all or part of the processed data based on the independent variables; a prediction unit that inputs all or part of the processed data into the predictive model to generate a prediction result regarding whether the outpatient treatment of the plurality of alcohol use disorder patients is prematurely discontinued; and an output unit that outputs the prediction result.
- Embodiments of the present invention also provide a method for predicting early discontinuation of outpatient treatment for patients with alcohol use disorder, the method including: a data collection step of collecting data of a plurality of alcohol use disorder patients; an independent variable determination step of determining a plurality of independent variables to which one or more machine learning algorithms for generating a predictive model for early discontinuation of outpatient treatment are to be applied; a pre-processing step of generating processed data by processing the data of the plurality of alcohol use disorder patients; a predictive model generation step of receiving the processed data, setting whether the outpatient treatment of the plurality of alcohol use disorder patients is prematurely discontinued as a dependent variable, and generating a predictive model by applying one or more machine learning algorithms to all or part of the processed data based on the independent variables; a prediction step of generating a prediction result on whether the plurality of alcohol use disorder patients will prematurely discontinue outpatient treatment by inputting all or part of the processed data into the predictive model; and an output step of outputting the prediction result.
- FIG. 1 is a schematic configuration diagram of a system for predicting early discontinuation of outpatient treatment for alcohol use disorder patients according to embodiments of the present invention.
- FIG. 2 is a flowchart illustrating a variable determination operation performed by a preprocessor according to embodiments of the present invention.
- FIG. 3 is a diagram illustrating an operation of classifying data of an alcohol use disorder patient into a learning data group and a test data group by a preprocessing unit according to embodiments of the present invention.
- FIG. 4 is a diagram illustrating an operation of performing sampling on a specific class by a pre-processor according to embodiments of the present invention.
- FIG. 5 is a diagram illustrating an example of an operation of generating a predictive model by applying one or more machine learning algorithms to a training data group by a predictive model generator according to embodiments of the present invention.
- FIG. 6 is a diagram illustrating an operation of determining a predictive model according to a performance evaluation index of a predictive model generator according to embodiments of the present invention.
- FIG. 7 is a diagram illustrating the AUC, which is one of the performance evaluation indicators according to embodiments of the present invention.
- FIG. 8 is a diagram illustrating a method for predicting early discontinuation of outpatient treatment for alcohol use disorder patients according to embodiments of the present invention.
- The term "step of (doing)" or "step of" as used throughout the specification of the present invention does not mean "step for".
- A "unit" includes a unit realized by hardware, a unit realized by software, and a unit realized using both. Further, one unit may be realized using two or more pieces of hardware, and two or more units may be realized by one piece of hardware.
- FIG. 1 is a schematic configuration diagram of a system for predicting early discontinuation of outpatient treatment for alcohol use disorder patients according to embodiments of the present invention.
- The system 100 for predicting early discontinuation of outpatient treatment for patients with alcohol use disorder may include a pre-processing unit 110, a predictive model generation unit 120, a prediction unit 130, and an output unit 140.
- The pre-processing unit 110 may collect data of a plurality of alcohol use disorder patients. In addition, the pre-processing unit 110 may determine a plurality of independent variables to which one or more machine learning algorithms are applied to generate a predictive model of whether the plurality of alcohol use disorder patients prematurely discontinue outpatient treatment. The pre-processing unit 110 may also generate processed data by processing the data of the plurality of alcohol use disorder patients so that the one or more machine learning algorithms can be applied.
- When the pre-processing unit 110 collects data of patients with alcohol use disorder, wired or wireless communication with a server or terminal storing the data may be used.
- The pre-processing unit 110 may receive medical data of alcohol use disorder patients from one or more medical institutions.
- Data of a plurality of alcohol use disorder patients may be standardized according to a Common Data Model (CDM).
- Data of a plurality of alcohol use disorder patients may be collected from a Clinical Data Warehouse (CDW).
- The Clinical Data Warehouse (CDW) may de-identify data extracted according to research characteristics and transmit it to the pre-processing unit 110.
- The plurality of alcohol use disorder patients may be selected from patients with a hospitalization period of 2 weeks or more.
- For a patient who has been hospitalized for two weeks or longer on two or more occasions, the hospitalization date may be defined as the date of the first hospitalization.
- Whether or not patients with alcohol use disorder continue to visit the outpatient clinic can be defined as whether or not the patient visits the outpatient clinic at least once a month for 6 months after being discharged from the hospital.
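The outcome definition above can be sketched as a labeling function. This is only an illustrative sketch: the function name `continued_outpatient_care`, the use of fixed 30-day windows as "months", and the example dates are assumptions, not details given in the specification.

```python
from datetime import date, timedelta

def continued_outpatient_care(discharge, visits, months=6, window_days=30):
    """Return True if each post-discharge window contains at least one visit.

    Assumption: a "month" is approximated by a fixed 30-day window.
    """
    for k in range(months):
        start = discharge + timedelta(days=k * window_days)
        end = start + timedelta(days=window_days)
        # Window is (start, end]; a visit on the discharge day does not count.
        if not any(start < v <= end for v in visits):
            return False
    return True

# Hypothetical patients: one visits in every window, one stops after two visits.
discharge = date(2021, 1, 4)
steady = [discharge + timedelta(days=d) for d in (10, 40, 75, 100, 130, 170)]
dropout = [discharge + timedelta(days=d) for d in (10, 40)]

print(continued_outpatient_care(discharge, steady))
print(continued_outpatient_care(discharge, dropout))
```

A patient for whom the function returns False would fall into the early discontinuation class of the dependent variable.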
- The pre-processing unit 110 may determine a plurality of independent variables to which one or more machine learning algorithms for generating a predictive model for early discontinuation of outpatient treatment of a plurality of alcohol use disorder patients will be applied.
- The pre-processing unit 110 may determine the independent variables from among variables including 1) the patient's age, 2) gender, 3) hospitalization period, 4) address, 5) medical department, 6) whether comorbidities such as diabetes, liver disease, depressive disorder, and anxiety disorder were diagnosed within 1 year before hospitalization, 7) whether the patient received outpatient treatment for alcohol use disorder before hospitalization, and 8) whether naltrexone was prescribed.
- The t-test is a statistical method for testing whether the difference between the means of two groups is significant.
- The chi-square test is a statistical method based on the chi-square distribution and is used to test whether an observed frequency differs significantly from an expected frequency.
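The two test statistics described above can be computed as in the following minimal sketch. The sketch uses Welch's form of the t statistic and Pearson's chi-square statistic for a contingency table; the specification does not state which variant is used, and the sample values are hypothetical. Only the statistics are computed here, not p-values.

```python
from math import sqrt
from statistics import mean, variance

def welch_t(a, b):
    """Welch's t statistic for the difference in means of two samples."""
    return (mean(a) - mean(b)) / sqrt(variance(a) / len(a) + variance(b) / len(b))

def chi_square(table):
    """Pearson chi-square statistic for a contingency table (list of rows)."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / total
            stat += (observed - expected) ** 2 / expected
    return stat

# Hypothetical ages in the maintenance and early discontinuation groups.
maintained = [52, 48, 61, 57, 50]
discontinued = [39, 44, 41, 47, 36]
print(welch_t(maintained, discontinued))

# Hypothetical counts: rows = naltrexone prescribed / not prescribed,
# columns = treatment maintained / discontinued early.
table = [[30, 10], [20, 40]]
print(chi_square(table))
```

A continuous candidate variable such as age would typically be screened with the t-test, and a categorical one such as prescription status with the chi-square test.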
- The pre-processing unit 110 may generate processed data by processing data of a plurality of alcohol use disorder patients in order to apply one or more machine learning algorithms. Meanwhile, the pre-processing unit 110 may perform the above-described processing on prediction target data even after the predictive model is generated. Through this, the learning performance of the predictive model can be improved.
- The pre-processing unit 110 may improve the accuracy of the predictive model by removing or correcting missing, abnormal, and redundant data among the data of a plurality of alcohol use disorder patients, thereby securing high-quality processed data.
- The pre-processing unit 110 may obtain the processed data by applying operations such as data combination, segmentation, filtering, sampling, derived variable generation, dummy variable generation, scaling, data type conversion, and normalization to the data of the plurality of alcohol use disorder patients.
- The pre-processing unit 110 may convert numeric or character data obtained empirically or experimentally into a simplified form by correcting and arranging it.
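A few of the pre-processing operations listed above (removing redundant and missing data, dummy variable generation, and scaling) can be sketched as follows. The field names `age`, `dept`, and `naltrexone`, the choice of min-max scaling, and the example records are assumptions made for illustration only.

```python
def preprocess(records):
    """Drop duplicate and incomplete rows, one-hot encode 'dept', min-max scale 'age'."""
    seen, clean = set(), []
    for r in records:
        key = tuple(sorted(r.items()))
        if key in seen or any(v is None for v in r.values()):
            continue  # skip redundant rows and rows with missing values
        seen.add(key)
        clean.append(r)

    depts = sorted({r["dept"] for r in clean})
    lo = min(r["age"] for r in clean)
    hi = max(r["age"] for r in clean)

    processed = []
    for r in clean:
        row = {
            "age_scaled": (r["age"] - lo) / (hi - lo) if hi > lo else 0.0,
            "naltrexone": int(r["naltrexone"]),
        }
        for d in depts:
            row["dept_" + d] = int(r["dept"] == d)  # dummy variable
        processed.append(row)
    return processed

# Hypothetical raw records.
records = [
    {"age": 40, "dept": "psychiatry", "naltrexone": True},
    {"age": 60, "dept": "internal", "naltrexone": False},
    {"age": 60, "dept": "internal", "naltrexone": False},     # duplicate
    {"age": None, "dept": "psychiatry", "naltrexone": True},  # missing value
    {"age": 50, "dept": "psychiatry", "naltrexone": False},
]
processed = preprocess(records)
for row in processed:
    print(row)
```

In this sketch incomplete rows are dropped; correcting them instead (for example by imputation) is an equally valid reading of "removing or correcting".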
- The predictive model generation unit 120 may receive the processed data, set whether the outpatient treatment of the plurality of alcohol use disorder patients is prematurely discontinued as a dependent variable, and generate a predictive model by applying one or more machine learning algorithms to all or part of the processed data based on the independent variables.
- Machine learning algorithms can be broadly classified into three types: supervised learning algorithms, unsupervised learning algorithms, and reinforcement learning algorithms.
- A supervised learning algorithm is an algorithm that is used when there is an intended result.
- A machine learning model can adjust its variables for the input values and map them to outputs.
- An unsupervised learning algorithm is an algorithm used when there is no intended result, and can classify an input data set into a set of similar types. Unsupervised learning algorithms can be used for data mining.
- A reinforcement learning algorithm is an algorithm used when making a decision about an input value. When a decision is made, the decision on the given input value gradually changes according to success or failure. As the reinforcement learning algorithm learns, it may become possible to predict the result of the input.
- The predictive model generation unit 120 may be implemented as, for example, a workstation server or a cloud server.
- The prediction unit 130 may input all or part of the processed data generated by the pre-processing unit 110 into the predictive model generated by the predictive model generation unit 120 to generate a prediction result regarding whether a plurality of patients with alcohol use disorder prematurely discontinue outpatient treatment.
- The system 100 for predicting early discontinuation of outpatient treatment for patients with alcohol use disorder can use the prediction result generated by the prediction unit 130 to predict whether outpatient treatment for a patient with alcohol use disorder will be prematurely discontinued, and can also identify the variables that influence premature discontinuation.
- Using the prediction result generated by the prediction unit 130, the system 100 can help patients receive treatment steadily by promoting special management tailored to each patient's characteristics.
- The output unit 140 may output the prediction result generated by the prediction unit 130. The output unit 140 may output the prediction result by a method such as displaying it on a screen or printing it with a printer.
- FIG. 2 is a flowchart illustrating a variable determination operation performed by the preprocessor 110 according to embodiments of the present invention.
- The pre-processing unit 110 may calculate variance inflation factors (VIFs) for the independent variables in order to address multicollinearity among them, and may determine the independent variables so that each VIF stays below a predetermined threshold value.
- The multicollinearity problem refers to a situation in which some of the independent variables can be expressed as a combination of other independent variables.
- The multicollinearity problem can occur when independent variables are not independent of each other and have strong interrelationships.
- One way to solve the multicollinearity problem is to eliminate variables that depend on other independent variables, and the variance inflation factor (VIF) may be used for this purpose.
- The variance inflation factor (VIF) of an independent variable reflects how well that variable is explained by a linear regression on the other independent variables.
- The variance inflation factor (VIF) of the i-th variable can be obtained through Equation 1 below.
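Equation 1 itself is not reproduced in this text. As a reference, the standard definition of the variance inflation factor, which is consistent with the description above, is sketched here. The symbol $R_i^2$ is notation assumed for this sketch: it denotes the coefficient of determination obtained by regressing the i-th independent variable on the remaining independent variables.

```latex
\mathrm{VIF}_i = \frac{1}{1 - R_i^2}
```

Under this definition a VIF of 1 means the variable has no linear dependence on the others, and a VIF of 5 corresponds to $R_i^2 = 0.8$.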
- The pre-processing unit 110 may calculate variance inflation factor (VIF) values for all determined independent variables, and determine the independent variables such that all VIF values stay below a predetermined threshold value.
- The predetermined threshold value of the variance inflation factor (VIF) may be set to 5.
- The pre-processing unit 110 may determine independent variables (S210).
- The pre-processing unit 110 may calculate the variance inflation factor (VIF) for all the determined independent variables in the above-described manner (S220).
- When the variance inflation factor (VIF) for any one independent variable exceeds the threshold value (S230-Y), the pre-processing unit 110 may determine another independent variable (S240).
- The pre-processing unit 110 may then return to step S220 and calculate the variance inflation factor (VIF) again for the newly determined independent variables.
- When no variance inflation factor (VIF) exceeds the threshold value, the pre-processing unit 110 may end the variable determination process.
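The loop S210 to S240 can be sketched as follows. Two points are assumptions of this sketch rather than statements of the specification: the VIFs are computed as the diagonal of the inverse correlation matrix (a standard equivalent of the regression-based definition), and "determining another independent variable" is implemented by dropping the variable with the largest VIF. The column names and values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation coefficient of two equal-length samples."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)

def invert(m):
    """Invert a square matrix by Gauss-Jordan elimination with partial pivoting."""
    n = len(m)
    a = [list(row) + [float(i == j) for j in range(n)] for i, row in enumerate(m)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        p = a[col][col]
        a[col] = [x / p for x in a[col]]
        for r in range(n):
            if r != col:
                f = a[r][col]
                a[r] = [x - f * y for x, y in zip(a[r], a[col])]
    return [row[n:] for row in a]

def vifs(columns):
    """VIF of each variable, read off the diagonal of the inverse correlation matrix."""
    names = list(columns)
    corr = [[1.0 if p == q else pearson(columns[p], columns[q]) for q in names]
            for p in names]
    inv = invert(corr)
    return {name: inv[i][i] for i, name in enumerate(names)}

def select_variables(columns, threshold=5.0):
    selected = dict(columns)                 # S210: initial independent variables
    while len(selected) > 1:
        scores = vifs(selected)              # S220: compute VIFs
        worst = max(scores, key=scores.get)
        if scores[worst] <= threshold:       # no VIF exceeds the threshold: stop
            break
        del selected[worst]                  # S240 (assumed rule): drop worst variable
    return list(selected)

# Hypothetical data: age_months is almost a multiple of age, so the two are collinear.
columns = {
    "age": [30, 40, 50, 60, 70, 45],
    "age_months": [361, 479, 601, 719, 841, 540],
    "stay_days": [14, 30, 21, 60, 28, 35],
}
kept = select_variables(columns)
print(kept)
```

Perfectly collinear columns would make the correlation matrix singular, so a production version would need to guard against division by zero in `invert`.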
- FIG. 3 is a diagram illustrating an operation of classifying alcohol use disorder patient data into a training data set and a test data set by the preprocessing unit 110 according to embodiments of the present invention.
- The pre-processing unit 110 may classify data of a plurality of alcohol use disorder patients into a training data group and a test data group.
- The pre-processing unit 110 may classify some of the data of a plurality of patients with alcohol use disorder into the training data group, which is used to train the predictive model.
- The test data group may be used to test a predictive model generated based on data of a plurality of alcohol use disorder patients.
- The test data group may be set large enough to derive statistically significant results, and may include all data of a plurality of alcohol use disorder patients.
- The test data group may be classified to have the same characteristics as the training data group.
- The predictive model generation unit 120 may use the test data group to derive a performance evaluation index for measuring the performance of the predictive model.
- Using the performance evaluation index, the predictive model generation unit 120 may check the objective performance of the predictive model and compare the performance of different predictive models.
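One way to give the test data group the same characteristics as the training data group is a stratified split, which preserves the class ratio in both groups. The following is a minimal sketch under that assumption; the specification does not name a particular splitting method, and the 80:20 ratio, the field name `discontinued`, and the synthetic rows are hypothetical.

```python
import random

def stratified_split(rows, label_key, test_ratio=0.2, seed=0):
    """Split rows into training and test groups while preserving class proportions."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)

    train, test = [], []
    for members in by_class.values():
        members = members[:]          # copy so the caller's list is not shuffled
        rng.shuffle(members)
        n_test = round(len(members) * test_ratio)
        test.extend(members[:n_test])
        train.extend(members[n_test:])
    return train, test

# Hypothetical data set: 20 of 100 patients discontinued early.
rows = [{"id": i, "discontinued": int(i % 5 == 0)} for i in range(100)]
train, test = stratified_split(rows, "discontinued")
print(len(train), len(test))
print(sum(r["discontinued"] for r in test))
```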
- FIG. 4 is a diagram illustrating an operation of performing oversampling on a specific class by the preprocessor 110 according to embodiments of the present invention.
- The pre-processing unit 110 may apply a sampling method to a specific class in order to resolve the class imbalance of the training data group when processing data of a plurality of alcohol use disorder patients.
- When the numbers of data items included in the outpatient treatment maintenance class and the outpatient treatment early discontinuation class are disproportionate to each other (e.g., 85:15), problems can arise.
- Sampling methods may be divided into oversampling methods and undersampling methods.
- The undersampling method reduces the data group of the majority class to the level of the data group of the minority class. Since the undersampling method removes majority-class data, calculation time and class overlap can be reduced. However, the undersampling method drastically reduces the total amount of data used for learning, and may instead degrade learning performance.
- The oversampling method secures enough data for learning by increasing the data group of the minority class to the level of the majority class.
- Oversampling methods include random oversampling, which simply replicates existing minority-class data to match the ratio, and the synthetic minority oversampling technique (SMOTE), which generates new data between an arbitrary minority-class data point and its neighboring minority-class data points.
- The pre-processing unit 110 may correct class imbalance by using an oversampling or undersampling method and thereby derive a more precise prediction.
- In order to resolve the imbalance between the outpatient treatment maintenance class and the outpatient treatment early discontinuation class, which are the values of the dependent variable for the training data group, the pre-processing unit 110 may apply oversampling to the minority class of the two (the outpatient treatment maintenance class).
- For the minority class, the pre-processing unit 110 may generate duplicated data for data a and b included in that class.
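Random oversampling as described above can be sketched as follows. The function name `random_oversample`, the label values, and the 85:15 example are illustrative assumptions; SMOTE, which synthesizes new points instead of duplicating existing ones, is not shown.

```python
import random

def random_oversample(rows, label_key, seed=0):
    """Duplicate randomly chosen minority-class rows until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(row[label_key], []).append(row)

    target = max(len(members) for members in by_class.values())
    balanced = []
    for members in by_class.values():
        balanced.extend(members)
        # Replicate existing rows of this class to reach the majority-class size.
        balanced.extend(rng.choice(members) for _ in range(target - len(members)))
    return balanced

# Hypothetical training data group with an 85:15 imbalance.
training = ([{"id": i, "label": "discontinued"} for i in range(85)]
            + [{"id": 100 + i, "label": "maintained"} for i in range(15)])
balanced = random_oversample(training, "label")
print(len(balanced))
print(sum(r["label"] == "maintained" for r in balanced))
```

Oversampling is applied to the training data group only, so that duplicated rows cannot leak into the test data group.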
- FIG. 5 is a diagram illustrating an operation of generating a predictive model by applying one or more machine learning algorithms to a training data group by the predictive model generator 120 according to embodiments of the present invention.
- The predictive model generation unit 120 may generate a predictive model by applying one or more machine learning algorithms to the portion of the processed data corresponding to the training data group.
- The one or more machine learning algorithms may be one or more of logistic regression, support vector machine (SVM), random forest, gradient boosting, and AdaBoost.
- Logistic regression is a statistical technique that uses a logistic function to estimate the relationship between a dependent variable having only two values and independent variables.
- The dependent variable is dichotomous (0 or 1), and the independent variables can be categorical or continuous.
- The logistic regression model is a special form of the generalized linear model and is a functional model that draws an S-shaped curve. In logistic regression analysis, if the predicted value of the dependent variable is greater than 0.5, the event is predicted to occur, and if it is less than 0.5, the event is predicted not to occur.
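The S-shaped logistic function and the 0.5 decision rule can be sketched as follows. The weights, bias, and feature values are hypothetical and are not fitted to any data; fitting the coefficients is outside the scope of this sketch.

```python
from math import exp

def logistic(z):
    """Logistic (sigmoid) function: maps any real number into the interval (0, 1)."""
    return 1.0 / (1.0 + exp(-z))

def predict_event(features, weights, bias, threshold=0.5):
    """Return the predicted probability and whether the event is predicted to occur."""
    z = bias + sum(w * x for w, x in zip(weights, features))
    p = logistic(z)
    return p, p > threshold

# Hypothetical coefficients for two independent variables.
weights, bias = [2.0, -1.0], -1.0
print(predict_event([1.0, 0.0], weights, bias))
print(predict_event([0.0, 1.0], weights, bias))
```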
- A support vector machine (SVM) is a machine learning technique and a supervised learning model for pattern recognition and data analysis, and is mainly used for classification and regression analysis.
- Given a set of data in which each item belongs to one of two categories, the support vector machine algorithm may create a non-probabilistic binary linear classification model that determines which category new data belongs to.
- The categories may be divided into an outpatient treatment maintenance group and an outpatient treatment early discontinuation group for patients with alcohol use disorder, and a support vector machine may be used to determine which of the two groups new data corresponds to.
- A random forest is a type of ensemble learning method used for classification, regression analysis, and the like, and operates by outputting a classification or average prediction value from a plurality of decision trees constructed in the training process.
- The random forest test process using the ensemble model may derive a final result through averaging, multiplication, or majority voting of the results obtained from the decision trees. These tests can be performed in parallel, resulting in high computational efficiency.
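The aggregation step of a random forest can be sketched as follows. The three "trees" are hypothetical hand-written rules standing in for trained decision trees, and the feature names are assumptions; only majority voting and averaging are shown.

```python
from collections import Counter
from statistics import mean

def majority_vote(tree_labels):
    """Return the class label predicted by the largest number of trees."""
    return Counter(tree_labels).most_common(1)[0][0]

def average_probability(tree_probs):
    """Return the mean of the per-tree predicted probabilities."""
    return mean(tree_probs)

# Hypothetical stand-ins for trained decision trees.
trees = [
    lambda p: "discontinue" if p["age"] < 40 else "maintain",
    lambda p: "maintain" if p["prior_outpatient"] else "discontinue",
    lambda p: "discontinue" if p["stay_days"] < 21 else "maintain",
]
patient = {"age": 35, "prior_outpatient": True, "stay_days": 14}

votes = [tree(patient) for tree in trees]
print(votes)
print(majority_vote(votes))
print(average_probability([0.2, 0.4, 0.9]))
```

Because each tree is evaluated independently of the others, the list comprehension over `trees` is the part that can be run in parallel.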
- Gradient boosting is a machine learning algorithm that can perform regression or classification, and belongs to the boosting family of ensemble methods.
- Boosting is the process of creating a strong classifier by combining weak classifiers. Gradient boosting takes the error of the predictions made by the model in the previous stage, creates a new model whose goal is to drive this error toward zero, and builds the final model by combining these models.
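The stage-by-stage error correction can be sketched with a deliberately simplified setup: one-dimensional inputs, squared-error loss, and decision stumps as weak learners. These choices, the learning rate, and the toy data are assumptions made for illustration and are not taken from the described embodiments.

```python
def fit_stump(xs, residuals):
    """Fit the single-split regressor (stump) that minimizes squared error."""
    best = None
    for threshold in sorted(set(xs)):
        left = [r for x, r in zip(xs, residuals) if x <= threshold]
        right = [r for x, r in zip(xs, residuals) if x > threshold]
        if not right:
            continue  # a split must leave data on both sides
        lmean, rmean = sum(left) / len(left), sum(right) / len(right)
        sse = (sum((r - lmean) ** 2 for r in left)
               + sum((r - rmean) ** 2 for r in right))
        if best is None or sse < best[0]:
            best = (sse, threshold, lmean, rmean)
    if best is None:
        raise ValueError("need at least two distinct x values")
    _, t, lm, rm = best
    return lambda x: lm if x <= t else rm

def gradient_boost(xs, ys, n_rounds=20, learning_rate=0.5):
    """Each round fits a stump to the current residuals and adds it to the model."""
    base = sum(ys) / len(ys)
    stumps = []

    def predict(x):
        return base + learning_rate * sum(s(x) for s in stumps)

    for _ in range(n_rounds):
        residuals = [y - predict(x) for x, y in zip(xs, ys)]
        stumps.append(fit_stump(xs, residuals))
    return predict

# Toy data: the target jumps from 0 to 1 between x = 3 and x = 4.
xs = [1, 2, 3, 4, 5, 6]
ys = [0, 0, 0, 1, 1, 1]
model = gradient_boost(xs, ys)
```

With this data the residual shrinks by the learning-rate factor each round, so the combined model approaches the targets without any single stump fitting them exactly.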
- AdaBoost is a machine learning algorithm that produces the final result as a weighted sum of the results of other learning algorithms.
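A minimal sketch of the weighted combination, using the standard AdaBoost learner weight 0.5·ln((1 − err)/err) and labels in {+1, −1}. The three weak learners and their error rates are hypothetical; the reweighting of training samples between rounds is not shown.

```python
import math

def learner_weight(error_rate):
    """Standard AdaBoost weight: more accurate weak learners get a larger say."""
    return 0.5 * math.log((1.0 - error_rate) / error_rate)

def adaboost_predict(x, learners):
    """Final result: sign of the weighted sum of weak-learner outputs (+1 / -1)."""
    score = sum(alpha * h(x) for alpha, h in learners)
    return 1 if score >= 0 else -1

# Hypothetical weak learners with hypothetical training error rates.
h1 = lambda x: 1 if x[0] > 0.5 else -1
h2 = lambda x: 1 if x[1] > 0.3 else -1
h3 = lambda x: 1 if x[0] + x[1] > 1.0 else -1
learners = [(learner_weight(0.2), h1),
            (learner_weight(0.4), h2),
            (learner_weight(0.3), h3)]

# h1 votes +1 while h2 and h3 vote -1; h1's larger weight decides the result.
result = adaboost_predict([0.9, 0.0], learners)
```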
- the predictive model generator 120 may generate a predictive model corresponding to the machine learning algorithm by applying a machine learning algorithm to a training data set.
- FIG. 6 is a diagram illustrating an operation of determining a predictive model according to performance evaluation indexes by the predictive model generator 120 according to embodiments of the present invention.
- The predictive model generator 120 may apply a plurality of machine learning algorithms to the portion of the processed data received from the preprocessor 110 that corresponds to the training data group to generate a plurality of candidate predictive models, and, for each of the candidate predictive models, may derive a test result by inputting the portion of the processed data that corresponds to the test data group.
- The predictive model generator 120 may calculate a performance evaluation index for each of the plurality of candidate predictive models using the derived test results.
- The predictive model generator 120 may determine the candidate predictive model having the highest performance evaluation index value among the plurality of candidate predictive models as the predictive model.
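The selection rule amounts to taking the maximum over the candidates' index values. In the sketch below, the candidate names and scores are invented placeholders and do not reproduce any results from the described embodiments.

```python
def select_predictive_model(index_by_candidate):
    """Return the candidate whose performance evaluation index is highest."""
    return max(index_by_candidate, key=index_by_candidate.get)

# Placeholder index values (for example AUC) for three hypothetical candidates.
index_by_candidate = {
    "candidate_a": 0.71,
    "candidate_b": 0.78,
    "candidate_c": 0.74,
}
chosen = select_predictive_model(index_by_candidate)
```

Which candidate wins depends on the chosen index, so the same set of candidates can yield different selections under accuracy, sensitivity, specificity, or AUC.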
- the performance evaluation index may be, for example, one of accuracy, sensitivity, specificity, and area under the ROC curve (AUC).
- Accuracy is the number of correctly predicted data items (TP + TN) divided by the total number of predicted data items (TP + FP + FN + TN), and indicates how closely the predictions match the actual data.
- Here, accuracy refers to the proportion of all patients for whom discontinuation or maintenance of outpatient treatment was predicted correctly.
- TP is the number of data items that the predictive model predicted to be positive and that are actually positive.
- FP is the number of data items that the predictive model predicted to be positive but that are actually negative.
- FN is the number of data items that the predictive model predicted to be negative but that are actually positive.
- TN is the number of data items that the predictive model predicted to be negative and that are actually negative.
- Sensitivity, also called recall or hit rate, is the ratio of those predicted by the predictive model to be positive (TP) among the actual positives (TP + FN). It means the proportion of actual outpatient treatment early discontinuation patients that the predictive model identified correctly.
- Specificity is the ratio of those predicted by the predictive model to be negative (TN) among the actual negatives (TN + FP). It means the proportion of actual outpatient treatment maintenance patients that the predictive model identified correctly.
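The three indices follow directly from the four confusion-matrix counts, using the standard definitions accuracy = (TP + TN) / (TP + FP + FN + TN), sensitivity = TP / (TP + FN), and specificity = TN / (TN + FP). The sketch below assumes label 1 means early discontinuation (positive) and 0 means maintenance (negative); the label lists are made-up examples.

```python
def confusion_counts(actual, predicted):
    """Count TP, FP, FN, TN for binary labels (1 = positive, 0 = negative)."""
    pairs = list(zip(actual, predicted))
    tp = sum(1 for a, p in pairs if a == 1 and p == 1)
    fp = sum(1 for a, p in pairs if a == 0 and p == 1)
    fn = sum(1 for a, p in pairs if a == 1 and p == 0)
    tn = sum(1 for a, p in pairs if a == 0 and p == 0)
    return tp, fp, fn, tn

def evaluation_indices(tp, fp, fn, tn):
    """Return accuracy, sensitivity, and specificity from the four counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
        "sensitivity": tp / (tp + fn),   # share of actual positives found
        "specificity": tn / (tn + fp),   # share of actual negatives found
    }

# Made-up actual and predicted labels for eight patients.
actual = [1, 1, 1, 0, 0, 0, 0, 1]
predicted = [1, 0, 1, 0, 1, 0, 0, 1]
counts = confusion_counts(actual, predicted)
indices = evaluation_indices(*counts)
```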
- AUC is obtained from the ROC (Receiver Operating Characteristic) curve, which plots the true positive rate (sensitivity) against the false positive rate (1 - specificity).
- AUC is the area under the ROC curve; its maximum is 1, and a good predictive model has an AUC value close to 1.
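One way to compute the area is to sweep the decision threshold over the predicted scores, record (false positive rate, true positive rate) at each threshold, and integrate with the trapezoidal rule. The labels and scores below are made-up examples, and the sketch assumes both classes are present.

```python
def roc_points(labels, scores):
    """ROC points (FPR, TPR) obtained by lowering the threshold step by step."""
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    points = [(0.0, 0.0)]
    for t in sorted(set(scores), reverse=True):
        tp = sum(1 for l, s in zip(labels, scores) if s >= t and l == 1)
        fp = sum(1 for l, s in zip(labels, scores) if s >= t and l == 0)
        points.append((fp / n_neg, tp / n_pos))
    return points

def area_under_curve(points):
    """Trapezoidal area under a curve given as points ordered by x."""
    return sum((x2 - x1) * (y1 + y2) / 2.0
               for (x1, y1), (x2, y2) in zip(points, points[1:]))

# Made-up labels (1 = positive) and predicted scores for six patients.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
auc_value = area_under_curve(roc_points(labels, scores))
```

In this example eight of the nine positive-negative pairs are ranked correctly, which matches the computed area of 8/9.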
- the predictive model generation unit 120 may select a predictive model capable of predicting whether an alcohol use disorder patient will stop outpatient treatment early with the highest probability using the above-described performance evaluation index for a plurality of candidate predictive models.
- the predictive model generation unit 120 may generate Table 2 by calculating a performance evaluation index according to each of the candidate predictive models.
- a predictive model using the Adaboost algorithm can be determined as the predictive model.
- When a predictive model is determined based on accuracy or specificity, a candidate predictive model using a random forest algorithm may be determined as the predictive model.
- AUC is one of the performance evaluation indicators according to embodiments of the present invention.
- the predictive model generation unit 120 may determine a predictive model using AUC as a performance evaluation index.
- An ROC curve may be determined, for example, as shown in FIG. 7.
- AUC means the area under the ROC curve, and the predictive model generation unit 120 may check the AUC value by calculating the area under the ROC curve for each candidate prediction model.
- the prediction model generator 120 may select a prediction model using Adaboost.
- FIG. 8 is a diagram illustrating a method for predicting early discontinuation of outpatient treatment for alcohol use disorder patients according to embodiments of the present invention.
- The method for predicting early discontinuation of outpatient treatment for alcohol use disorder patients may include a data collection step (S810) of collecting data of a plurality of alcohol use disorder patients.
- The method for predicting early discontinuation of outpatient treatment for patients with alcohol use disorder may include an independent variable determination step (S820) of determining a plurality of independent variables to which one or more machine learning algorithms are applied to generate a predictive model of whether the plurality of patients with alcohol use disorder will discontinue outpatient treatment early.
- The method for predicting early discontinuation of outpatient treatment for patients with alcohol use disorder may include a preprocessing step (S830) of generating processed data by processing the data of the plurality of patients with alcohol use disorder.
- the aforementioned data collection step (S810), independent variable determination step (S820), and preprocessing step (S830) may be executed by the aforementioned preprocessor 110.
- The method for predicting early discontinuation of outpatient treatment for patients with alcohol use disorder may include a predictive model generating step (S840) of receiving the processed data, setting whether each of the plurality of alcohol use disorder patients discontinues outpatient treatment early as the dependent variable, and generating a predictive model by applying one or more machine learning algorithms to all or part of the processed data based on the independent variables.
- the predictive model generating step (S840) may be executed by the aforementioned predictive model generating unit 120.
- The method for predicting early discontinuation of outpatient treatment for patients with alcohol use disorder may include a prediction step (S850) of inputting all or part of the processed data into the predictive model to generate a prediction result on whether the plurality of patients with alcohol use disorder will discontinue outpatient treatment early. Meanwhile, the prediction step (S850) may be executed by the predicting unit 130 described above.
- The method for predicting early discontinuation of outpatient treatment for patients with alcohol use disorder may include an output step (S860) of outputting the prediction result. Meanwhile, the output step (S860) may be executed by the above-described output unit 140.
- In the independent variable determination step (S820), for example, variance inflation factors (VIFs) may be calculated for the independent variables to address multicollinearity between them, and the independent variables may be determined so that each variance inflation factor remains below a predetermined threshold value.
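A sketch of VIF-based variable selection follows. It uses the identity that the VIF of each variable equals the corresponding diagonal entry of the inverse correlation matrix. The drop-the-worst-variable loop, the threshold of 10, and the variable names and values are assumptions for illustration; the source only states that the VIFs are kept below a predetermined threshold.

```python
import math

def correlation_matrix(columns):
    """Pearson correlation matrix of equally long, non-constant columns."""
    centered = []
    for col in columns:
        m = sum(col) / len(col)
        centered.append([v - m for v in col])
    norms = [math.sqrt(sum(v * v for v in c)) for c in centered]
    k = len(columns)
    return [[sum(a * b for a, b in zip(centered[i], centered[j]))
             / (norms[i] * norms[j]) for j in range(k)] for i in range(k)]

def invert(matrix):
    """Gauss-Jordan inversion with partial pivoting."""
    n = len(matrix)
    aug = [row[:] + [1.0 if i == j else 0.0 for j in range(n)]
           for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        if abs(p) < 1e-12:
            raise ValueError("perfectly collinear variables")
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [a - f * b for a, b in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def vifs(columns):
    """VIF of each column = diagonal of the inverse correlation matrix."""
    inv = invert(correlation_matrix(columns))
    return [inv[j][j] for j in range(len(columns))]

def select_variables(data, threshold=10.0):
    """Assumed strategy: drop the highest-VIF variable until all are below threshold."""
    names = list(data)
    while len(names) > 1:
        values = vifs([data[n] for n in names])
        worst = max(range(len(names)), key=values.__getitem__)
        if values[worst] < threshold:
            break
        names.pop(worst)
    return names

# Hypothetical variables: x2 is almost exactly 2 * x1, so one of them is dropped.
data = {
    "x1": [1, 2, 3, 4, 5, 6],
    "x2": [2.1, 3.9, 6.2, 7.8, 10.1, 12.0],
    "x3": [3, 1, 4, 1, 5, 9],
}
kept = select_variables(data)
```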
- The preprocessing step (S830) may include classifying the data of the plurality of alcohol use disorder patients into a training data group and a test data group.
- The preprocessing step (S830) may include applying oversampling, within the training data group, to the minority class among the outpatient treatment maintenance class and the outpatient treatment early discontinuation class, which are the classes of the dependent variable.
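The two preprocessing operations can be sketched as below. The source does not name the oversampling technique, so simple random duplication of minority-class records is assumed here; the 30% test ratio, the `discontinued` field name, and the records themselves are also hypothetical.

```python
import random

def split_train_test(records, test_ratio=0.3, seed=0):
    """Shuffle the records and split them into (training group, test group)."""
    rng = random.Random(seed)
    shuffled = records[:]
    rng.shuffle(shuffled)
    n_test = int(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]

def oversample_minority(train, label_key="discontinued", seed=0):
    """Duplicate randomly chosen minority-class records until classes are balanced.
    Applied to the training group only, so the test group stays untouched."""
    rng = random.Random(seed)
    by_class = {}
    for rec in train:
        by_class.setdefault(rec[label_key], []).append(rec)
    target = max(len(recs) for recs in by_class.values())
    balanced = []
    for recs in by_class.values():
        balanced.extend(recs)
        balanced.extend(rng.choice(recs) for _ in range(target - len(recs)))
    return balanced

# Hypothetical records: 1 = early discontinuation, 0 = treatment maintenance.
records = [{"id": i, "discontinued": 1 if i < 5 else 0} for i in range(20)]
train_group, test_group = split_train_test(records)
balanced_train = oversample_minority(train_group)
```

Oversampling after the split, rather than before it, keeps duplicated records from leaking into the test data group.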
- a predictive model may be generated by applying one or more machine learning algorithms to a part corresponding to the training data group among the processed data.
- The one or more machine learning algorithms may be one or more of 1) logistic regression, 2) support vector machine (SVM), 3) random forest, 4) gradient boosting, and 5) AdaBoost.
- When there are a plurality of machine learning algorithms, the predictive model generating step (S840) may include: 1) for each of a plurality of candidate predictive models generated by applying the plurality of machine learning algorithms to the part of the processed data corresponding to the training data group, deriving a test result by inputting the part of the processed data corresponding to the test data group; 2) calculating a performance evaluation index for each of the plurality of candidate predictive models using the test results; and 3) determining the candidate predictive model having the highest performance evaluation index among the plurality of candidate predictive models as the predictive model.
- the performance evaluation index may be an area under the ROC curve (AUC).
- the aforementioned system 100 for predicting early withdrawal from outpatient treatment for patients with alcohol use disorder may be implemented by a computing device including at least some of a processor, a memory, a user input device, and a presentation device.
- Memory is a medium that stores computer-readable software, applications, program modules, routines, instructions, and/or data that are coded to perform particular tasks when executed by a processor.
- a processor may read and execute computer-readable software, applications, program modules, routines, instructions, and/or data stored in memory.
- the user input device may be a means for allowing a user to input a command to execute a specific task to the processor or input data required for execution of the specific task.
- the user input device may include a physical or virtual keyboard or keypad, key buttons, mouse, joystick, trackball, touch-sensitive input means, or a microphone.
- the presentation device may include a display, a printer, a speaker, or a vibrator.
- Computing devices may include a variety of devices such as smart phones, tablets, laptops, desktops, servers, and clients.
- a computing device may be a single stand-alone device or may include multiple computing devices operating in a distributed environment consisting of multiple computing devices cooperating with each other over a communications network.
- The above-described method for predicting early discontinuation of outpatient treatment for patients with alcohol use disorder may be executed by a computing device having a processor and a memory that stores computer-readable software, applications, program modules, routines, instructions, and/or data structures coded to perform the method when executed by the processor.
- present embodiments described above may be implemented through various means.
- the present embodiments may be implemented by hardware, firmware, software, or a combination thereof.
- In the case of implementation by hardware, the present embodiments may be implemented using one or more ASICs (Application Specific Integrated Circuits), DSPs (Digital Signal Processors), DSPDs (Digital Signal Processing Devices), PLDs (Programmable Logic Devices), FPGAs (Field Programmable Gate Arrays), processors, controllers, microcontrollers, or microprocessors.
- a method for predicting early withdrawal from outpatient treatment of a patient with alcohol use disorder may be implemented using an artificial intelligence semiconductor device in which neurons and synapses of a deep neural network are implemented as semiconductor devices.
- The semiconductor device may be a currently used semiconductor device such as SRAM, DRAM, or NAND, a next-generation semiconductor device such as RRAM, STT-MRAM, or PRAM, or a combination thereof.
- When the method for predicting early discontinuation of outpatient treatment for patients with alcohol use disorder is implemented using an artificial intelligence semiconductor device, the result (weights) of training the deep learning model in software may be transferred to synapse-mimicking devices arranged in an array, or learning may be performed on the artificial intelligence semiconductor device itself.
- the method for predicting early discontinuation of outpatient treatment for patients with alcohol use disorder may be implemented in the form of a device, procedure, or function that performs the functions or operations described above.
- the software codes may be stored in a memory unit and driven by a processor.
- the memory unit may be located inside or outside the processor and exchange data with the processor by various means known in the art.
- Terms such as "system" generally refer to computer-related entities such as hardware or a combination of hardware and software.
- Both an application running on a controller or processor and the controller or processor itself can be a component.
- One or more components may reside within a process and/or thread of execution, and components may reside on one device (eg, system, computing device, etc.) or may be distributed across two or more devices.
- another embodiment provides a computer program stored in a computer recording medium that performs the above-described method for predicting early discontinuation of outpatient treatment for patients with alcohol use disorder.
- another embodiment provides a computer-readable recording medium recording a program for realizing the above-described method for predicting early discontinuation of outpatient treatment for patients with alcohol use disorder.
- a program recorded on a recording medium may be read, installed, and executed in a computer to execute the above-described steps.
- The above-described program may include code written in a computer language such as C, C++, JAVA, or machine language that the computer's processor (CPU) can read through the computer's device interface.
- These codes may include functional codes related to functions defining the above-described functions, and may include control codes related to execution procedures necessary for a processor of a computer to execute the above-described functions according to a predetermined procedure.
- These codes may further include memory-reference codes indicating which location (address) of the computer's internal or external memory should be referenced for additional information or media necessary for the computer's processor to execute the above-mentioned functions.
- When the computer's processor needs to communicate with any other remote computer or server to execute the above-described functions, the code may further include communication-related codes specifying how to communicate with the other computer or server using the computer's communication module and what information or media to transmit and receive during communication.
- Computer-readable recording media on which the above-described program is recorded include, for example, ROM, RAM, CD-ROM, magnetic tape, floppy disks, and optical media storage devices, and may also include media implemented in the form of a carrier wave (e.g., transmission through the Internet).
- the computer-readable recording medium is distributed in computer systems connected through a network, so that computer-readable codes can be stored and executed in a distributed manner.
- A functional program for implementing the present invention, and the codes and code segments related thereto, may be easily inferred or changed by programmers in the art to which the present invention belongs, in consideration of the system environment of the computer that reads the recording medium and executes the program.
- the method for predicting early discontinuation of outpatient treatment for patients with alcohol use disorder may be implemented in the form of a recording medium including instructions executable by a computer, such as an application or program module executed by a computer.
- Computer readable media can be any available media that can be accessed by a computer and includes both volatile and nonvolatile media, removable and non-removable media. Also, computer readable media may include all computer storage media. Computer storage media includes both volatile and nonvolatile, removable and non-removable media implemented in any method or technology for storage of information such as computer readable instructions, data structures, program modules or other data.
- The method for predicting early discontinuation of outpatient treatment for patients with alcohol use disorder may be executed by an application installed in a terminal by default (this may include a program included in a platform or operating system installed in the terminal by default), or by an application (that is, a program) that a user directly installs in the terminal through an application providing server, such as an application server or a web server related to the corresponding service.
- The above-described method for predicting early discontinuation of outpatient treatment for patients with alcohol use disorder may be implemented as an application (that is, a program) that is installed in a terminal by default or directly installed by a user, and may be recorded on a computer-readable recording medium of a device such as the terminal.
Landscapes
- Engineering & Computer Science (AREA)
- Medical Informatics (AREA)
- Health & Medical Sciences (AREA)
- Public Health (AREA)
- Data Mining & Analysis (AREA)
- Software Systems (AREA)
- Biomedical Technology (AREA)
- Epidemiology (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- Pathology (AREA)
- Computing Systems (AREA)
- Artificial Intelligence (AREA)
- Physics & Mathematics (AREA)
- Computer Vision & Pattern Recognition (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Evolutionary Computation (AREA)
- Child & Adolescent Psychology (AREA)
- Developmental Disabilities (AREA)
- Hospice & Palliative Care (AREA)
- Psychiatry (AREA)
- Psychology (AREA)
- Social Psychology (AREA)
- Management, Administration, Business Operations System, And Electronic Commerce (AREA)
Abstract
Embodiments of the present invention relate to a system and method for predicting early discontinuation of outpatient treatment for patients with alcohol use disorder. According to embodiments of the present invention, a plurality of independent variables to which a machine learning algorithm is to be applied are determined, the machine learning algorithm is applied to generate a predictive model, and processed data are input into the predictive model to generate a prediction result indicating whether the outpatient treatment of patients with alcohol use disorder will be discontinued early.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
KR1020210096253A KR102601514B1 (ko) | 2021-07-22 | 2021-07-22 | System and method for predicting early discontinuation of outpatient treatment for patients with alcohol use disorder |
KR10-2021-0096253 | 2021-07-22 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2023003315A1 (fr) | 2023-01-26 |
Family
ID=84979465
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/KR2022/010516 WO2023003315A1 (fr) | System and method for predicting early discontinuation of outpatient treatment for patients with alcohol use disorder | 2021-07-22 | 2022-07-19 |
Country Status (2)
Country | Link |
---|---|
KR (1) | KR102601514B1 (fr) |
WO (1) | WO2023003315A1 (fr) |
Family Cites Families (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
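KR20190008770A (ko) * | 2017-07-17 | 2019-01-25 | 주식회사 헬스맥스 | Method for predicting the success of health consulting |
KR20200022760A (ko) * | 2018-08-23 | 2020-03-04 | 가톨릭대학교 산학협력단 | System for providing a high-risk drinking behavior prevention service |
KR102200039B1 (ko) * | 2018-12-13 | 2021-01-08 | 연세대학교 산학협력단 | Method and apparatus for providing a predicted value for spontaneous passage of ureteral stones |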
-
2021
- 2021-07-22 KR KR1020210096253A patent/KR102601514B1/ko active IP Right Grant
-
2022
- 2022-07-19 WO PCT/KR2022/010516 patent/WO2023003315A1/fr active Application Filing
Non-Patent Citations (6)
Title |
---|
FONSI ELBREDER M., DE SOUZA E SILVA R., CRISTINA PILLON S., LARANJEIRA R.: "Alcohol Dependence: Analysis of Factors Associated with Retention of Patients in Outpatient Treatment", ALCOHOL AND ALCOHOLISM, PERGAMON, OXFORD, GB, vol. 46, no. 1, 1 January 2011 (2011-01-01), GB , pages 74 - 76, XP093026899, ISSN: 0735-0414, DOI: 10.1093/alcalc/agq078 * |
J.D. WESTWOOD, S.W. WESTWOOD, L. FELLäNDER-TSAI, C.M. FIDOPIASTIS, R. S. HALUCK, R.A. ROBB, S. SENGER, AND K. G. VOSBURGH: "STUDIES IN HEALTH TECHNOLOGY AND INFORMATICS", 27 May 2021, I O S PRESS, AMSTERDAM , NL , ISSN: 0926-9630, article EBRAHIMI ALI, WIIL UFFE KOCK, MANSOURVAR MARJAN, NAEMI AMIN, ANDERSEN KJELD, NIELSEN ANETTE SØGAARD: "Deep Neural Network to Identify Patients with Alcohol Use Disorder : Proceedings of MIE 2021", pages: 238 - 242, XP093026903, DOI: 10.3233/SHTI210156 * |
JOHANNESSEN DAGNY ADRIAENSSEN, NORDFJÆRN TROND, GEIRDAL AMY ØSTERTUN: "Substance use disorder patients’ expectations on transition from treatment to post-discharge period", SAGE JOURNALS, vol. 37, no. 3, 1 June 2020 (2020-06-01), pages 208 - 226, XP093026910, ISSN: 1455-0725, DOI: 10.1177/1455072520910551 * |
KIM SUK-YOUNG, PARK TAESUNG, KIM KWONYOUNG, OH JIHOON, PARK YOONJAE, KIM DAI-JIN: "A Deep Learning Algorithm to Predict Hazardous Drinkers and the Severity of Alcohol-Related Problems Using K-NHANES", FRONTIERS IN PSYCHIATRY, vol. 12, XP093026907, DOI: 10.3389/fpsyt.2021.684406 * |
LEE MARY R., SANKAR VIGNESH, HAMMER AARON, KENNEDY WILLIAM G., BARB JENNIFER J., MCQUEEN PHILIP G., LEGGIO LORENZO: "Using Machine Learning to Classify Individuals With Alcohol Use Disorder Based on Treatment Seeking Status", ECLINICAL MEDICINE, vol. 12, 1 July 2019 (2019-07-01), pages 70 - 78, XP093026902, ISSN: 2589-5370, DOI: 10.1016/j.eclinm.2019.05.008 * |
PARK SO JIN, LEE SUN JUNG, KIM HYUNGMIN, KIM JAE KWON, CHUN JI-WON, LEE SOO-JUNG, LEE HAE KOOK, KIM DAI JIN, CHOI IN YOUNG: "Machine learning prediction of dropping out of outpatients with alcohol use disorders", PLOS ONE, vol. 16, no. 8, pages e0255626, XP093026913, DOI: 10.1371/journal.pone.0255626 * |
Also Published As
Publication number | Publication date |
---|---|
KR102601514B1 (ko) | 2023-11-14 |
KR20230015009A (ko) | 2023-01-31 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Biswas et al. | An XAI based autism detection: the context behind the detection | |
WO2018143540A1 (fr) | Method, device, and program for predicting gastric cancer prognosis using an artificial neural network | |
WO2021212670A1 (fr) | Method for predicting the risk of onset of a new infectious disease, apparatus, terminal device, and medium | |
WO2022005090A1 (fr) | Method and apparatus for providing a diagnosis result | |
Keniya et al. | Disease prediction from various symptoms using machine learning | |
Chai et al. | Glaucoma diagnosis in the Chinese context: An uncertainty information-centric Bayesian deep learning model | |
Eapen | Artificial intelligence in dermatology: a practical introduction to a paradigm shift | |
WO2019235828A1 (fr) | Two-face disease diagnosis system and method thereof | |
WO2021139432A1 (fr) | Artificial intelligence-based user rating prediction method and apparatus, terminal, and medium | |
Biswas et al. | Machine Learning‐Based Model to Predict Heart Disease in Early Stage Employing Different Feature Selection Techniques | |
EP3953869A1 (fr) | Method for training an AI model and electronic apparatus | |
Saffari et al. | DCNN‐FuzzyWOA: Artificial Intelligence Solution for Automatic Detection of COVID‐19 Using X‐Ray Images | |
Bhatt et al. | An intelligent system for diagnosing thyroid disease in pregnant ladies through artificial neural network | |
WO2022265292A1 (fr) | Method and device for detecting abnormal data | |
WO2022119162A1 (fr) | Medical image-based disease prediction method | |
WO2022181907A1 (fr) | Method, apparatus, and system for providing nutritional information based on stool image analysis | |
Reeves et al. | Resampling to address inequities in predictive modeling of suicide deaths | |
CN110867225A (zh) | Character-level clinical concept extraction named entity recognition method and system | |
WO2023003315A1 (fr) | System and method for predicting early discontinuation of outpatient treatment for patients with alcohol use disorder | |
WO2024058465A1 (fr) | Method for training a local neural network model for federated learning | |
Sun et al. | TSRNet: Diagnosis of COVID-19 based on self-supervised learning and hybrid ensemble model | |
Murugan et al. | Impact of Internet of Health Things (IoHT) on COVID-19 disease detection and its treatment using single hidden layer feed forward neural networks (SIFN) | |
Meena et al. | Depression Detection on COVID 19 Tweets Using Chimp Optimization Algorithm. | |
WO2023101417A1 (fr) | Method for predicting precipitation based on deep learning | |
WO2023003169A1 (fr) | Method, server, and computer program for providing a response to query data based on pharmaceutical product quality data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 22846185; Country of ref document: EP; Kind code of ref document: A1 |
| | NENP | Non-entry into the national phase | Ref country code: DE |
| | 122 | Ep: PCT application non-entry in European phase | Ref document number: 22846185; Country of ref document: EP; Kind code of ref document: A1 |