CN113658681A - A decision tree-based method for evaluating the withdrawal effect of drug addicts - Google Patents

Publication number
CN113658681A
Authority
CN
China
Prior art keywords
decision tree
sample
drug
standard deviation
effect
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110864809.9A
Other languages
Chinese (zh)
Inventor
陆宇升
李家深
朱晓东
许金礼
陶炜
廖淑珍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangxi Youdi Information Technology Co ltd
Original Assignee
Guangxi Youdi Information Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangxi Youdi Information Technology Co ltd
Priority to CN202110864809.9A
Publication of CN113658681A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Software Systems (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • General Engineering & Computer Science (AREA)
  • Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Medical Treatment And Welfare Office Work (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention provides a decision tree-based method for evaluating the withdrawal effect of drug addicts, which belongs to the technical field of machine learning. The method comprises the following technical steps: S1: objective function selection; S2: feature selection; S3: training process; S4: evaluation process. The evaluation model obtained by the method can be used to evaluate the withdrawal effect of drug addicts at regular intervals. The evaluation only requires extracting data from the information system database; the process is simple and low-cost, needs no additional subjective human judgment, is highly accurate, and produces output indicators that are easy to understand. The method is flexible and adapts to the large differences caused by the different institutional systems and technical equipment of different regions. It is also adaptable: when institutional changes or technological progress cause large changes in the data, it adapts quickly by retraining the model.

Description

Decision tree-based method for evaluating the withdrawal effect of drug addicts
Technical Field
The invention belongs to the technical field of machine learning, and particularly relates to a decision tree-based method for evaluating the abstinence effect of drug addicts.
Background
Although many methods for evaluating the withdrawal effect of persons in compulsory drug rehabilitation have been proposed, in practice they are generally difficult to operate and have low reliability and validity. In addition, existing evaluation methods are designed from experience; their parameters cannot be changed quickly and flexibly, so they adapt poorly to the changes brought about by new technology and by changes to information systems and related institutional rules.
The existing drug rehabilitation information system holds a large amount of data directly related to the withdrawal effect, such as score assessment data, examination results, medical examination results and rehabilitation training data. However, these data lack unified standards and differ greatly between regions, and every institutional change or technological advance can change them substantially. Evaluating the withdrawal effect directly from these data by manual analysis is difficult and unintuitive, and the accuracy of the result depends heavily on the experience of the evaluator.
Disclosure of Invention
To address these problems, the invention provides a decision tree-based method for evaluating the withdrawal effect of drug addicts. Through objective function selection, feature selection, a training process and an evaluation process, the method produces a withdrawal effect evaluation model that can be used to evaluate drug addicts at regular intervals. The evaluation only requires extracting data from the information system database; the process is simple and low-cost, needs no additional subjective human judgment, is highly accurate, and produces output indicators that are easy to understand. The method is flexible and adapts to the large differences caused by the different institutional systems and technical equipment of each region. It is also adaptable: when institutional changes or technological progress cause large changes in the data, it adapts quickly by retraining the model.
The invention is realized by the following technical scheme:
A decision tree-based method for evaluating the withdrawal effect of drug addicts comprises the following steps:
S1: Objective function selection: select one dimension YD from the multi-dimensional data of the drug addicts as the objective function;
S2: Feature selection: select a set of features FD from the multi-dimensional data of the drug addicts;
S3: Training process: build a training data set TrainSet from the objective function YD and the features FD, train a decision tree regression model DTM, calculate the parameters LNSTD and LNMEAN of each leaf node in the model DTM, and save the decision tree regression model DTM, the overall mean GMEAN, the overall standard deviation GSTD, the sample standard deviation LNSTD and the sample mean LNMEAN, which completes the training process;
S4: Evaluation process: load the decision tree regression model DTM, the overall mean GMEAN, the overall standard deviation GSTD, the sample standard deviation LNSTD and the sample mean LNMEAN saved by the training process; use the decision tree regression algorithm with the model DTM to predict the objective function YD value of the evaluated person; obtain the leaf node of the model DTM that is hit and calculate LSS from it; calculate GSS from the evaluated person's objective function YD value, the overall mean GMEAN and the overall standard deviation GSTD; and output LSS and GSS as the evaluation result.
Further, in step S3, the decision tree regression model DTM is trained by extracting from the data set TrainSet the samples whose month equals mi and placing them in the subset ModelTrainSet, which is used to train the model; that is, the data of month mi are used to train the decision tree regression model DTM, where mi is the middle value of month or mi = 12.
Further, in step S3, while the subset ModelTrainSet is used to train the decision tree regression model DTM, the minimum number of samples in a leaf node is kept greater than MNS, where 10 ≤ MNS < the total number of samples in the subset ModelTrainSet or the total number of leaf nodes.
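The month-based selection of ModelTrainSet described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: samples are represented as plain dictionaries, the function name `select_model_train_set` is hypothetical, and "the middle value of month" is assumed to mean the median of the month values present in TrainSet.

```python
from statistics import median_low


def select_model_train_set(train_set, mi=None):
    """Return (mi, ModelTrainSet): the samples of TrainSet whose month equals mi.

    If mi is not given, it is taken as the (low) median of the month values
    in train_set. That reading of "middle value of month" is an assumption.
    """
    if mi is None:
        mi = median_low(sample["month"] for sample in train_set)
    model_train_set = [s for s in train_set if s["month"] == mi]
    return mi, model_train_set


# Toy TrainSet: each sample has month, label (YD) and features (FD).
train_set = [
    {"month": 3, "label": 5.0, "features": [0, 31]},
    {"month": 12, "label": 18.0, "features": [1, 27]},
    {"month": 12, "label": 22.5, "features": [0, 45]},
    {"month": 20, "label": 30.0, "features": [0, 38]},
]

mi, model_train_set = select_model_train_set(train_set, mi=12)
print(mi, len(model_train_set))
```

The minimum leaf size MNS is a separate constraint that applies when the tree itself is fitted on this subset.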
Further, in step S3, the parameters LNSTD and LNMEAN of each leaf node in the decision tree regression model DTM are calculated as follows: all leaf nodes of the trained model DTM are placed in a single leaf node array lnodes; the number of leaf nodes is lnsize, which equals the length of lnodes; and the standard deviation array LNSTD and the mean array LNMEAN are calculated over all samples of the data set TrainSet that hit each leaf node. The sample standard deviation LNSTD and the sample mean LNMEAN are both two-dimensional arrays: the first dimension represents the month and has length 36, and the second dimension represents the node and has length lnsize. The value of LNSTD[m][i] is the standard deviation of label over the samples of the m-th month that hit the i-th leaf node, and the value of LNMEAN[m][i] is the mean of label over the samples of the m-th month that hit the i-th leaf node.
Further, in step S3, the specific calculation method of the sample standard deviation LNSTD and the sample mean LNMEAN is as follows:
S301: Create a set array TSS. TSS is a two-dimensional array: the first dimension represents the month and has length 36, and the second dimension represents the node and has length lnsize, the number of leaf nodes. All elements of TSS are initialized to the empty set;
S302: Enumerate each sample x in the data set TrainSet. Use the decision tree regression algorithm to compute the predicted value py for x.features, ignore py, take the index lni in the leaf node array lnodes of the leaf node hit during prediction, and add the sample x to the subset TSS[x.month][lni];
S303: Enumerate each element TSS[m][j] of the set array TSS. TSS[m][j] is a subset of samples; calculate the mean and standard deviation of label over this subset and store them in LNMEAN[m][j] and LNSTD[m][j], respectively;
S304: Create a one-dimensional set array GTSS of length 36 with all elements initialized to the empty set, enumerate each sample x in the data set TrainSet, and add x to the subset GTSS[x.month];
S305: Enumerate each element GTSS[m] of the one-dimensional set array GTSS. GTSS[m] is a subset of samples; calculate the mean and standard deviation of label over all samples of this subset and store them in the arrays GMEAN[m] and GSTD[m]. GMEAN and GSTD are one-dimensional arrays representing the overall mean and standard deviation, and the index m represents the month.
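Steps S301 to S305 amount to grouping labels by month and by leaf, then taking a mean and standard deviation per group. The sketch below illustrates this under several assumptions that are not stated in the text: months are 0-based indices in 0..35, the population standard deviation (`statistics.pstdev`) is used, empty groups are stored as `None`, and the leaf lookup is passed in as a hypothetical callable `leaf_index(features)` standing in for the trained model DTM.

```python
from dataclasses import dataclass
from statistics import fmean, pstdev
from typing import Callable, Sequence

MONTHS = 36  # first-dimension length stated in the text


@dataclass
class Sample:
    month: int                 # time in rehabilitation; assumed 0-based, 0..35
    label: float               # value of the objective function YD
    features: Sequence[float]  # feature vector built from FD


def leaf_and_global_stats(train_set, leaf_index: Callable, lnsize: int):
    """Return (LNMEAN, LNSTD, GMEAN, GSTD) following S301 to S305."""
    # S301 and S304: TSS[m][j] and GTSS[m] start as empty collections.
    tss = [[[] for _ in range(lnsize)] for _ in range(MONTHS)]
    gtss = [[] for _ in range(MONTHS)]
    # S302 and S304: route every sample to its (month, leaf) and month group.
    for x in train_set:
        lni = leaf_index(x.features)
        tss[x.month][lni].append(x.label)
        gtss[x.month].append(x.label)
    # S303: per-leaf, per-month statistics (None marks an empty group).
    lnmean = [[fmean(c) if c else None for c in row] for row in tss]
    lnstd = [[pstdev(c) if c else None for c in row] for row in tss]
    # S305: per-month statistics over all samples.
    gmean = [fmean(c) if c else None for c in gtss]
    gstd = [pstdev(c) if c else None for c in gtss]
    return lnmean, lnstd, gmean, gstd


def toy_leaf_index(features):
    """Hypothetical stand-in for the tree: two leaves split on feature 0."""
    return 0 if features[0] < 0.5 else 1


train = [
    Sample(0, 10.0, [0.0]),
    Sample(0, 14.0, [0.0]),
    Sample(0, 20.0, [1.0]),
    Sample(1, 8.0, [1.0]),
]
LNMEAN, LNSTD, GMEAN, GSTD = leaf_and_global_stats(train, toy_leaf_index, 2)
print(LNMEAN[0], LNSTD[0], GMEAN[0])
```

How empty groups and single-sample groups (standard deviation 0) should be handled at evaluation time is not specified in the text, so the `None` convention here is only one possible choice.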
Further, in step S3, the data set TrainSet is a set of samples, each sample corresponding to the data of one person in the multi-dimensional drug rehabilitation data. Each sample has three columns: month, label and features. The value of the objective function YD is used as label; the data of the selected features FD are extracted from the multi-dimensional drug rehabilitation data to construct the feature vector features; and the time in drug rehabilitation, in months, is extracted from the multi-dimensional drug rehabilitation data as month.
Further, in step S4, the specific evaluation process is:
S401: Load from the storage medium the decision tree regression model DTM, the overall mean GMEAN, the overall standard deviation GSTD, the sample standard deviation LNSTD and the sample mean LNMEAN obtained by the training process;
S402: Extract the feature vector features of the evaluated person using the same method as for the features column of the TrainSet samples; use the decision tree regression algorithm with the model DTM to predict the objective function YD value for features; ignore the predicted value and obtain the index lni of the leaf node of the model DTM hit by features; calculate the evaluated person's time in drug rehabilitation, month (used as the index m); and calculate the parameter LSS = (YD - LNMEAN[m][lni]) / LNSTD[m][lni];
S403: Calculate GSS = (YD - GMEAN[m]) / GSTD[m];
S404: Output the evaluation results LSS and GSS, together with the trends of the LSS and GSS indicators over time, as an intuitive explanation of the evaluated person's withdrawal effect index YD;
GSS > 0 indicates that the withdrawal effect of the evaluated person is better than the overall average, and GSS < 0 indicates that it is worse than the overall average;
LSS > 0 indicates that the current withdrawal effect of the evaluated person is better than the average of similar drug addicts;
-1 < LSS < 1 indicates that the evaluated person deviates from the average of similar drug addicts by less than 1 standard deviation, and the withdrawal effect is labeled "normal";
LSS < -1 indicates that the withdrawal effect of the evaluated person is more than 1 standard deviation below the average of similar drug addicts, and the withdrawal effect is labeled "poor";
LSS > 1 indicates that the withdrawal effect of the evaluated person is more than 1 standard deviation above the average of similar drug addicts, and the withdrawal effect is labeled "excellent";
when the evaluation results of LSS and GSS differ, the LSS result prevails.
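The two scores in S402 and S403 are standard scores against different reference groups: LSS against the samples in the same leaf and month, GSS against all samples in the same month. A minimal sketch of the calculation and labeling follows. The function names are hypothetical, YD is assumed to be the evaluated person's actual value of the objective function, and two details the text leaves open are filled with assumptions: a missing or zero standard deviation yields `None`, and the boundary values LSS = 1 and LSS = -1 are labeled "normal".

```python
from typing import Optional


def standard_score(yd: float, mean, std) -> Optional[float]:
    """(yd - mean) / std, or None when the reference group is unusable."""
    if mean is None or std is None or std == 0:
        return None
    return (yd - mean) / std


def lss_label(lss: Optional[float]) -> str:
    """Map LSS to the labels used in the text."""
    if lss is None:
        return "undefined"   # assumption: not covered by the text
    if lss > 1:
        return "excellent"
    if lss < -1:
        return "poor"
    return "normal"          # assumption: the boundaries +1 and -1 land here


def evaluate(yd, m, lni, lnmean, lnstd, gmean, gstd):
    """Return LSS, GSS and the LSS-based label for one person in month m."""
    lss = standard_score(yd, lnmean[m][lni], lnstd[m][lni])
    gss = standard_score(yd, gmean[m], gstd[m])
    # When LSS and GSS disagree, the text says the LSS result prevails,
    # so the label is derived from LSS only.
    return {"LSS": lss, "GSS": gss, "label": lss_label(lss)}


# Toy statistics for a single month (index 0) and two leaves.
lnmean, lnstd = [[12.0, 20.0]], [[2.0, 4.0]]
gmean, gstd = [15.0], [5.0]

result = evaluate(yd=15.0, m=0, lni=0,
                  lnmean=lnmean, lnstd=lnstd, gmean=gmean, gstd=gstd)
print(result)
```

In this toy case the person sits exactly at the overall mean (GSS = 0) but 1.5 standard deviations above comparable persons in the same leaf, which is the kind of disagreement the LSS-first rule is meant to resolve.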
Further, in step S1, the objective function YD is any one of cumulative reward and penalty points, monthly reward and penalty points, examination scores, medical examination results, and rehabilitation training scores.
Further, in step S2, the feature FD is any one of gender, age, drug type and education level.
Compared with the prior art, the invention has the advantages and beneficial effects that:
1. The method overcomes the shortcomings of existing methods for evaluating the withdrawal effect of persons in compulsory drug rehabilitation. It uses the data in the drug rehabilitation information system that are directly related to the withdrawal effect, automatically extracts them from the database to construct a training set, and uses a decision tree regression algorithm to train a withdrawal effect evaluation model on historical rehabilitation data. The resulting model can evaluate the withdrawal effect of drug addicts at regular intervals. The evaluation only requires extracting data from the information system database and needs no additional subjective human judgment; it is simple to operate and highly accurate, and its output indicators are easy to understand.
2. The method builds the withdrawal effect evaluation model entirely from data, which eliminates subjective human factors. The model can be updated at any time by constructing a data set and retraining, so it adapts quickly and flexibly to changes in the environment and to the large differences in technical and institutional environments between regions.
3. The evaluation process is simple, low-cost and easy to operate, and the evaluation results are easy to understand. The LSS indicator uses the mean and standard deviation of similar drug addicts as the reference and takes into account differences such as the gender and education level of the evaluated person, so the evaluation result is more reasonable. The method is flexible: it adapts to the large differences caused by the different institutional systems and technical equipment of each region, and although the raw data differ greatly, the evaluation results LSS and GSS share the same value range and a similar numerical meaning, which makes the method easy to roll out. The method is adaptable: when institutional changes or technological progress cause large changes in the data, it adapts quickly by retraining the model.
Drawings
Fig. 1 is a flowchart of the training process in Example 1 of the present invention.
Fig. 2 is a flowchart of the evaluation process in Example 1 of the present invention.
Detailed Description
The present invention is further illustrated by the following examples, which are provided only for illustrating the present invention and are not intended to limit the scope of the present invention.
Example 1
A decision tree-based method for evaluating the withdrawal effect of drug addicts comprises the following steps:
S1: Objective function selection: select one dimension YD from the multi-dimensional data of the drug addicts as the objective function. YD is a continuous real-valued quantity and a quantitative indicator directly related to the withdrawal effect; it is any one of cumulative reward and penalty points, monthly reward and penalty points, examination scores, medical examination results, and rehabilitation training scores;
S2: Feature selection: select a set of features FD from the multi-dimensional data of the drug addicts. Static attributes of the drug addicts, that is, attributes whose values do not change during the whole rehabilitation process, are chosen as features, and any one of gender, age, drug type and education level is selected;
S3: Training process: build a training data set TrainSet from the objective function YD and the features FD, train a decision tree regression model DTM, calculate the parameters LNSTD and LNMEAN of each leaf node in the model DTM, and save the decision tree regression model DTM, the overall mean GMEAN, the overall standard deviation GSTD, the sample standard deviation LNSTD and the sample mean LNMEAN, which completes the training process, as shown in the training process flowchart of Fig. 1;
The data set TrainSet is a set of samples, each sample corresponding to the data of one person in the multi-dimensional drug rehabilitation data. Each sample has three columns: month, label and features. The value of the objective function YD is used as label; the data of the selected features FD are extracted from the multi-dimensional drug rehabilitation data to construct the feature vector features; and the time in drug rehabilitation, in months, is extracted from the multi-dimensional drug rehabilitation data as month;
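As an illustration of the three-column sample layout, the sketch below builds one TrainSet sample from a raw record. Everything about the record is hypothetical: the field names (`gender`, `age`, `drug_type`, `education`, `admitted`, `cumulative_points`), the integer encodings of the categorical attributes, and the choice to count months by calendar month while ignoring the day of the month. The text only fixes the three columns month, label and features.

```python
from dataclasses import dataclass
from datetime import date
from typing import List

# Hypothetical integer encodings for categorical attributes.
GENDER = {"male": 0, "female": 1}
DRUG_TYPE = {"opioid": 0, "stimulant": 1, "other": 2}
EDUCATION = {"primary": 0, "junior_high": 1, "senior_high": 2, "college": 3}


@dataclass
class Sample:
    month: int            # time in drug rehabilitation, in months
    label: float          # value of the objective function YD
    features: List[int]   # feature vector built from the selected FD


def months_between(start: date, end: date) -> int:
    """Whole calendar months from start to end (day of month ignored)."""
    return (end.year - start.year) * 12 + (end.month - start.month)


def build_sample(record: dict, as_of: date) -> Sample:
    """Turn one raw record into a (month, label, features) sample."""
    features = [
        GENDER[record["gender"]],
        record["age"],
        DRUG_TYPE[record["drug_type"]],
        EDUCATION[record["education"]],
    ]
    return Sample(
        month=months_between(record["admitted"], as_of),
        label=float(record["cumulative_points"]),  # YD chosen as cumulative points
        features=features,
    )


record = {
    "gender": "male",
    "age": 34,
    "drug_type": "stimulant",
    "education": "senior_high",
    "admitted": date(2020, 3, 15),
    "cumulative_points": 37.5,
}
sample = build_sample(record, as_of=date(2021, 3, 1))
print(sample)
```

The same feature construction has to be reused at evaluation time, since S402 requires the evaluated person's features to be extracted in the same way as the TrainSet features column.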
The decision tree regression model DTM is trained by extracting from the data set TrainSet the samples whose month equals mi and placing them in the subset ModelTrainSet, which is used to train the model; that is, the data of month mi are used to train the decision tree regression model DTM, where mi is the middle value of month or mi = 12. While the subset ModelTrainSet is used to train the decision tree regression model DTM, the minimum number of samples in a leaf node is kept greater than MNS, where 10 ≤ MNS < the total number of samples in the subset ModelTrainSet or the total number of leaf nodes;
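To make the leaf-size constraint concrete, the following is a toy regression tree that only accepts splits leaving more than MNS samples on each side and that numbers its leaves so that a leaf index lni can be looked up afterwards. It is a simplified sketch written for illustration (greedy splits that minimize the sum of squared errors), not the patent's implementation; the names `build_tree`, `leaf_index` and `sse` are hypothetical, and a real system would more likely use an existing decision tree library.

```python
from statistics import fmean


def sse(values):
    """Sum of squared deviations from the mean."""
    m = fmean(values)
    return sum((v - m) ** 2 for v in values)


def build_tree(rows, mns, leaves):
    """rows: list of (features, label). Appends leaf means to `leaves`.

    A split is accepted only if both children keep more than `mns` samples.
    """
    labels = [y for _, y in rows]
    best = None
    for f in range(len(rows[0][0])):
        for t in sorted({x[f] for x, _ in rows}):
            left = [r for r in rows if r[0][f] <= t]
            right = [r for r in rows if r[0][f] > t]
            if len(left) <= mns or len(right) <= mns:
                continue  # would create a leaf that is too small
            cost = sse([y for _, y in left]) + sse([y for _, y in right])
            if best is None or cost < best[0]:
                best = (cost, f, t, left, right)
    if best is None or best[0] >= sse(labels):
        leaves.append(fmean(labels))          # leaf prediction = mean label
        return {"leaf": len(leaves) - 1}      # index into the leaf array
    _, f, t, left, right = best
    return {
        "feature": f,
        "threshold": t,
        "left": build_tree(left, mns, leaves),
        "right": build_tree(right, mns, leaves),
    }


def leaf_index(tree, features):
    """Return the index lni of the leaf hit by `features`."""
    while "leaf" not in tree:
        go_left = features[tree["feature"]] <= tree["threshold"]
        tree = tree["left"] if go_left else tree["right"]
    return tree["leaf"]


# Toy ModelTrainSet: feature 0 is an encoded gender, feature 1 an age.
rows = [([i % 2, 30 + i], 10.0 if i % 2 == 0 else 20.0) for i in range(24)]
leaves = []                       # plays the role of the leaf array lnodes
tree = build_tree(rows, mns=10, leaves=leaves)
print(len(leaves), leaf_index(tree, [1, 40]))
```

With 24 samples and MNS = 10, the only admissible split is the one that puts 12 samples on each side, so the tree ends up with two leaves; a larger MNS relative to the data yields a coarser grouping of "similar" persons.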
The parameters LNSTD and LNMEAN of each leaf node in the decision tree regression model DTM are calculated as follows: all leaf nodes of the trained model DTM are placed in a single leaf node array lnodes; the number of leaf nodes is lnsize, which equals the length of lnodes; and the standard deviation array LNSTD and the mean array LNMEAN are calculated over all samples of the data set TrainSet that hit each leaf node. The sample standard deviation LNSTD and the sample mean LNMEAN are both two-dimensional arrays: the first dimension represents the month and has length 36, and the second dimension represents the node and has length lnsize. The value of LNSTD[m][i] is the standard deviation of label over the samples of the m-th month that hit the i-th leaf node, and the value of LNMEAN[m][i] is the mean of label over the samples of the m-th month that hit the i-th leaf node;
the specific calculation method of the sample standard deviation LNSTD and the sample mean LNMEAN comprises the following steps:
S301: Create a set array TSS. TSS is a two-dimensional array: the first dimension represents the month and has length 36, and the second dimension represents the node and has length lnsize, the number of leaf nodes. All elements of TSS are initialized to the empty set;
S302: Enumerate each sample x in the data set TrainSet. Use the decision tree regression algorithm to compute the predicted value py for x.features, ignore py, take the index lni in the leaf node array lnodes of the leaf node hit during prediction, and add the sample x to the subset TSS[x.month][lni];
S303: Enumerate each element TSS[m][j] of the set array TSS. TSS[m][j] is a subset of samples; calculate the mean and standard deviation of label over this subset and store them in LNMEAN[m][j] and LNSTD[m][j], respectively;
S304: Create a one-dimensional set array GTSS of length 36 with all elements initialized to the empty set, enumerate each sample x in the data set TrainSet, and add x to the subset GTSS[x.month];
S305: Enumerate each element GTSS[m] of the one-dimensional set array GTSS. GTSS[m] is a subset of samples; calculate the mean and standard deviation of label over all samples of this subset and store them in the arrays GMEAN[m] and GSTD[m]. GMEAN and GSTD are one-dimensional arrays representing the overall mean and standard deviation, and the index m represents the month;
S4: Evaluation process: load the decision tree regression model DTM, the overall mean GMEAN, the overall standard deviation GSTD, the sample standard deviation LNSTD and the sample mean LNMEAN saved by the training process; use the decision tree regression algorithm with the model DTM to predict the objective function YD value of the evaluated person; obtain the leaf node of the model DTM that is hit and calculate LSS from it; calculate GSS from the evaluated person's objective function YD value, the overall mean GMEAN and the overall standard deviation GSTD; and output LSS and GSS as the evaluation result. The evaluation process flowchart is shown in Fig. 2;
The specific evaluation process is as follows:
S401: Load from the storage medium the decision tree regression model DTM, the overall mean GMEAN, the overall standard deviation GSTD, the sample standard deviation LNSTD and the sample mean LNMEAN obtained by the training process;
S402: Extract the feature vector features of the evaluated person using the same method as for the features column of the TrainSet samples; use the decision tree regression algorithm with the model DTM to predict the objective function YD value for features; ignore the predicted value and obtain the index lni of the leaf node of the model DTM hit by features; calculate the evaluated person's time in drug rehabilitation, month (used as the index m); and calculate the parameter LSS = (YD - LNMEAN[m][lni]) / LNSTD[m][lni];
S403: Calculate GSS = (YD - GMEAN[m]) / GSTD[m];
S404: Output the evaluation results LSS and GSS, together with the trends of the LSS and GSS indicators over time, as an intuitive explanation of the evaluated person's withdrawal effect index YD;
GSS > 0 indicates that the withdrawal effect of the evaluated person is better than the overall average, and GSS < 0 indicates that it is worse than the overall average;
LSS > 0 indicates that the current withdrawal effect of the evaluated person is better than the average of similar drug addicts;
-1 < LSS < 1 indicates that the evaluated person deviates from the average of similar drug addicts by less than 1 standard deviation, and the withdrawal effect is labeled "normal";
LSS < -1 indicates that the withdrawal effect of the evaluated person is more than 1 standard deviation below the average of similar drug addicts, and the withdrawal effect is labeled "poor";
LSS > 1 indicates that the withdrawal effect of the evaluated person is more than 1 standard deviation above the average of similar drug addicts, and the withdrawal effect is labeled "excellent";
when the evaluation results of LSS and GSS differ, the LSS result prevails.
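S404 also asks for the trend of the indicators over time. Since the features chosen in S2 are static attributes, an evaluated person hits the same leaf lni in every month, so the LSS trend can be obtained by scoring each month's YD against that leaf's statistics for the same month. The sketch below shows this for LSS only; the function name `lss_trend`, the 0-based month indices and the use of `None` for months without usable reference statistics are assumptions for illustration.

```python
def lss_trend(yd_by_month, lni, lnmean, lnstd):
    """Return [(month, LSS or None), ...] in month order for one person.

    yd_by_month maps a month index m to the person's YD value in that month.
    """
    trend = []
    for m in sorted(yd_by_month):
        mean, std = lnmean[m][lni], lnstd[m][lni]
        if mean is None or not std:
            # No reference group for this month, or zero spread.
            trend.append((m, None))
        else:
            trend.append((m, (yd_by_month[m] - mean) / std))
    return trend


# Toy leaf statistics for three months and a single leaf (index 0).
LNMEAN = [[10.0], [12.0], [15.0]]
LNSTD = [[2.0], [2.0], [5.0]]

trend = lss_trend({0: 9.0, 1: 13.0, 2: 22.5}, lni=0,
                  lnmean=LNMEAN, lnstd=LNSTD)
print(trend)
```

A GSS trend can be produced the same way by replacing the per-leaf arrays with GMEAN[m] and GSTD[m].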
Example 2
The method of Example 1 was tested at a drug rehabilitation bureau. From the database of the drug rehabilitation law enforcement platform, 762 dimensions of data, including basic information, SCL90 scale test results and score assessments, were extracted for 13126 drug addicts who had left the facility since 2016-09-01. After data cleaning, in which erroneous and low-quality records were deleted, the training data set TrainSet was constructed. The cumulative reward and penalty points were selected as YD and mi = 12, a withdrawal effect evaluation model was trained, and the withdrawal effect of 6971 drug addicts currently on the register was then evaluated.
A total of 64836 evaluation results were obtained (one per month for each drug addict), and in 92.7% of them the LSS and GSS evaluations were consistent.
In the other 7.3% of the results (involving 528 people), the score was below the overall mean but LSS > 1, that is, the withdrawal effect was "excellent". To verify the accuracy of these 7.3% of results, 20 of them were selected at random and evaluated manually by experts: 18 were rated excellent and 2 normal, so the accuracy of the LSS evaluation for data in this interval was 90%.
Therefore, by considering and analyzing LSS and GSS together in a comprehensive evaluation, the invention improves evaluation efficiency, confirms the accuracy of most of the data, and achieves high accuracy, so that about 470 drug addicts whose scores were not outstanding but whose actual performance was good received a fairer evaluation.
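The figures in this example can be cross-checked with a few lines of arithmetic. The sketch below only recomputes quantities from the numbers quoted above; reading "about 470" as 528 people multiplied by the 90% expert-confirmed rate is an interpretation, not something the text states explicitly.

```python
total_results = 64836      # evaluation results, one per person per month
inconsistent_share = 0.073 # share where GSS is below average but LSS > 1
people_involved = 528      # people behind those results
sampled, rated_excellent = 20, 18

# Accuracy of LSS in the manually reviewed sample.
accuracy = rated_excellent / sampled

# Approximate number of results in the inconsistent interval.
results_in_interval = round(total_results * inconsistent_share)

# People expected to be genuinely "excellent" if the 90% rate holds.
estimated_people = people_involved * accuracy

print(accuracy, results_in_interval, round(estimated_people))
```

The estimate of roughly 475 people is consistent with the "about 470" quoted in the text.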
The above description is only for the purpose of illustrating the preferred embodiments of the present invention and is not to be construed as limiting the invention, and any modifications, equivalents and the like that are within the spirit and principle of the present invention are included in the present invention.

Claims (9)

1. A decision tree-based method for evaluating the withdrawal effect of drug addicts, characterized by comprising the following steps:
S1: Objective function selection: select one dimension YD from the multi-dimensional data of the drug addicts as the objective function;
S2: Feature selection: select a set of features FD from the multi-dimensional data of the drug addicts;
S3: Training process: build a training data set TrainSet from the objective function YD and the features FD, train a decision tree regression model DTM, calculate the parameters LNSTD and LNMEAN of each leaf node in the model DTM, and save the decision tree regression model DTM, the overall mean GMEAN, the overall standard deviation GSTD, the sample standard deviation LNSTD and the sample mean LNMEAN, which completes the training process;
S4: Evaluation process: load the decision tree regression model DTM, the overall mean GMEAN, the overall standard deviation GSTD, the sample standard deviation LNSTD and the sample mean LNMEAN saved by the training process; use the decision tree regression algorithm with the model DTM to predict the objective function YD value of the evaluated person; obtain the leaf node of the model DTM that is hit and calculate LSS from it; calculate GSS from the evaluated person's objective function YD value, the overall mean GMEAN and the overall standard deviation GSTD; and output LSS and GSS as the evaluation result.
2. The decision tree-based method for evaluating the withdrawal effect of drug addicts according to claim 1, characterized in that, in step S3, the decision tree regression model DTM is trained by extracting from the data set TrainSet the samples whose month equals mi and placing them in the subset ModelTrainSet, which is used to train the decision tree regression model DTM; that is, the data of month mi are used to train the decision tree regression model DTM, where mi is the middle value of month or mi = 12.
3. The decision tree-based method for evaluating the withdrawal effect of drug addicts according to claim 2, characterized in that, in step S3, while the subset ModelTrainSet is used to train the decision tree regression model DTM, the minimum number of samples in a leaf node is kept greater than MNS, where 10 ≤ MNS < the total number of samples in the subset ModelTrainSet or the total number of leaf nodes.
4. The decision tree-based method for evaluating the withdrawal effect of drug addicts according to claim 1, characterized in that, in step S3, the parameters LNSTD and LNMEAN of each leaf node in the decision tree regression model DTM are calculated by placing all leaf nodes of the obtained decision tree regression model DTM in a single leaf node array lnodes, the number of leaf nodes being lnsize, which equals the length of lnodes, and calculating the standard deviation array LNSTD and the mean array LNMEAN over all samples of the data set TrainSet that hit each leaf node; the sample standard deviation LNSTD and the sample mean LNMEAN are both two-dimensional arrays, the first dimension representing the month with length 36 and the second dimension representing the node with length lnsize; the value of LNSTD[m][i] is the standard deviation of label over the samples of the m-th month that hit the i-th leaf node, and the value of LNMEAN[m][i] is the mean of label over the samples of the m-th month that hit the i-th leaf node.
5. The decision tree-based method for evaluating the withdrawal effect of drug addicts according to claim 4, characterized in that, in step S3, the sample standard deviation LNSTD and the sample mean LNMEAN are calculated as follows:
S301: Create a set array TSS, which is a two-dimensional array whose first dimension represents the month and has length 36 and whose second dimension represents the node and has length lnsize, the number of leaf nodes; all elements of TSS are initialized to the empty set;
S302: Enumerate each sample x in the data set TrainSet, use the decision tree regression algorithm to compute the predicted value py for x.features, ignore py, take the index lni in the leaf node array lnodes of the leaf node hit during prediction, and add the sample x to the subset TSS[x.month][lni];
S303: Enumerate each element TSS[m][j] of the set array TSS, TSS[m][j] being a subset of samples, calculate the mean and standard deviation of label over this subset, and store them in LNSTD[m][i] and LNMEAN[m][i];
S304: Create a one-dimensional set array GTSS of length 36 with all elements initialized to the empty set, enumerate each sample x in the data set TrainSet, and add x to the subset GTSS[x.month];
S305: Enumerate each element GTSS[m] of the one-dimensional set array GTSS, GTSS[m] being a subset of samples, calculate the mean and standard deviation of label over all samples of this subset, and store them in the arrays GMEAN[m] and GSTD[m]; GMEAN and GSTD are one-dimensional arrays representing the overall mean and standard deviation, and the index m represents the month.
6. The decision tree-based method for evaluating the withdrawal effect of drug addicts according to claim 1, characterized in that, in step S3, the data set TrainSet is a set of samples, each sample corresponding to the data of one person in the multi-dimensional drug rehabilitation data; each sample has three columns: month, label and features; the value of the objective function YD is used as label, the data of the selected features FD are extracted from the multi-dimensional drug rehabilitation data to construct the feature vector features, and the time in drug rehabilitation, in months, is extracted from the multi-dimensional drug rehabilitation data as month.
7. The decision tree-based method for evaluating the withdrawal effect of drug addicts according to claim 1, characterized in that, in step S4, the specific evaluation process is:
S401: Load from the storage medium the decision tree regression model DTM, the overall mean GMEAN, the overall standard deviation GSTD, the sample standard deviation LNSTD and the sample mean LNMEAN obtained by the training process;
S402: Extract the feature vector features of the evaluated person using the same method as for the features column of the TrainSet samples, use the decision tree regression algorithm with the decision tree regression model DTM to predict the objective function YD value for features, ignore the predicted value, obtain the index lni of the leaf node of the model DTM hit by features, calculate the evaluated person's time in drug rehabilitation, month, and calculate the parameter LSS = (YD - LNMEAN[m][lni]) / LNSTD[m][lni];
S403: Calculate GSS = (YD - GMEAN[m]) / GSTD[m];
S404: Output the evaluation results LSS and GSS, and the trends of the LSS and GSS indicators over time, as an intuitive description of the evaluated person's
abstinence effect index YD; GSS>0表示被评价人员的戒治效果优于整体平均水平,GSS<0表示被评价人员的戒治效果比整体平均水平差;GSS>0 means that the abstinence effect of the evaluated person is better than the overall average level; GSS<0 means that the abstinence effect of the evaluated person is worse than the overall average level; 当LSS>0,表示被评估人员当前的戒治效果优于类似戒毒人员平均值,When LSS>0, it means that the current treatment effect of the evaluated person is better than the average of similar treatment personnel. -1<LSS<1,表示被评估人员戒治效果和类似戒毒人员的平均值偏差在1个标准差之内,标注其戒治效果为“正常”,-1<LSS<1, it means that the average deviation between the evaluated person's treatment effect and similar drug treatment personnel is within 1 standard deviation, and the treatment effect is marked as "normal". LSS<-1表示戒毒人员戒治效果低于类似戒毒人员平均值超过1个标准差,标注其戒治效果为“差”,LSS<-1 means that the treatment effect of the drug addicts is lower than the average value of similar drug addicts by more than 1 standard deviation, and the treatment effect is marked as "poor". LSS>1表示戒毒人员戒治效果高于类似戒毒人员平均值超过1个标准差,标注其戒治效果为“优”;LSS>1 means that the detoxification effect of drug addicts is higher than the average of similar drug addicts by more than 1 standard deviation, and the detoxification effect is marked as "excellent"; 当LSS和GSS的评估结果不同时,以LSS的评估结果为标准。When the evaluation results of LSS and GSS are different, the evaluation results of LSS shall be used as the standard. 8.根据权利要求1所述的基于决策树的戒毒人员戒治效果评估方法,其特征在于,步骤S1中,所述目标函数YD为累计奖罚分、月度奖罚分、考试成绩、医疗检验结果和康复训练成绩中的任一种。8. the method for evaluating the effect of drug rehab based on decision tree according to claim 1, is characterized in that, in step S1, described objective function YD is cumulative reward and penalty points, monthly reward and penalty points, test scores, medical examination Either outcome or rehabilitation training performance. 9.根据权利要求1所述的基于决策树的戒毒人员戒治效果评估方法,其特征在于,步骤S2中,所述特征FD为性别、年龄、吸食毒品种类和文化程度中的任一种。9. 
The method for assessing the effect of drug rehabilitation for drug addicts based on a decision tree according to claim 1, characterized in that, in step S2, the feature FD is any one of gender, age, drug use type and educational level.
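Steps S301 to S305 (per-leaf and overall statistics) and S401 to S404 (scoring one person) of the claims can be illustrated with a minimal Python sketch. It is not the applicant's implementation. Several points are assumptions made only for illustration: the trained model DTM is represented by a hypothetical callable `leaf_of` that maps a feature vector to a leaf index lni; statistics are kept in dictionaries rather than 36-by-lnsize arrays, so the month index base is left open; the population standard deviation is used because the claims do not say which estimator is meant; a score is reported as `None` when the relevant group is empty or has zero spread; and the boundary values LSS = 1 and LSS = -1, which the claims leave unspecified, are treated as "normal".

```python
from collections import defaultdict
from statistics import mean, pstdev


def leaf_statistics(train_set, leaf_of):
    """S301-S305: label statistics per (month, leaf) and per month.

    train_set: iterable of (month, label, features) tuples (claim 6 columns).
    leaf_of:   hypothetical callable, features -> leaf index lni of the DTM.
    """
    tss = defaultdict(list)   # TSS[(m, lni)]: labels of samples hitting leaf lni in month m
    gtss = defaultdict(list)  # GTSS[m]: labels of all samples in month m
    for month, label, features in train_set:
        tss[(month, leaf_of(features))].append(label)  # S302
        gtss[month].append(label)                      # S304
    lnmean = {key: mean(labels) for key, labels in tss.items()}   # S303
    lnstd = {key: pstdev(labels) for key, labels in tss.items()}  # S303 (assumed estimator)
    gmean = {m: mean(labels) for m, labels in gtss.items()}       # S305
    gstd = {m: pstdev(labels) for m, labels in gtss.items()}      # S305
    return lnmean, lnstd, gmean, gstd


def _score(value, mu, sigma):
    """Standard score (value - mu) / sigma, or None if it is undefined."""
    if mu is None or not sigma:
        return None
    return (value - mu) / sigma


def grade(lss):
    """Map LSS to the labels of claim 7; the +/-1 boundaries count as normal (assumption)."""
    if lss is None:
        return None
    if lss > 1:
        return "excellent"
    if lss < -1:
        return "poor"
    return "normal"


def evaluate(yd, month, features, leaf_of, stats):
    """S402-S404: return (LSS, GSS, grade) for one evaluated person."""
    lnmean, lnstd, gmean, gstd = stats
    lni = leaf_of(features)  # the predicted value itself is not needed
    lss = _score(yd, lnmean.get((month, lni)), lnstd.get((month, lni)))
    gss = _score(yd, gmean.get(month), gstd.get(month))
    return lss, gss, grade(lss)  # the LSS-based grade prevails, per claim 7


# Toy data: a stand-in "tree" with two leaves that splits on gender only.
toy_leaf_of = lambda features: features["gender"]
toy_train = [
    (12, 10.0, {"gender": 0}),
    (12, 20.0, {"gender": 0}),
    (12, 30.0, {"gender": 1}),
    (12, 50.0, {"gender": 1}),
]
toy_stats = leaf_statistics(toy_train, toy_leaf_of)
toy_result = evaluate(25.0, 12, {"gender": 0}, toy_leaf_of, toy_stats)
```

In the toy data the evaluated person scores above the mean of the similar group but below the overall monthly mean, so LSS and GSS disagree and the LSS-based grade is the one reported. With a real tree library, `leaf_of` would be replaced by the library's leaf lookup; in scikit-learn, for instance, `DecisionTreeRegressor.apply` returns the leaf index for each sample, and the `min_samples_leaf` parameter plays roughly the role of MNS in claim 3.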
CN202110864809.9A 2021-07-29 2021-07-29 A decision tree-based method for evaluating the effect of drug addicts Pending CN113658681A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110864809.9A CN113658681A (en) 2021-07-29 2021-07-29 A decision tree-based method for evaluating the effect of drug addicts

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110864809.9A CN113658681A (en) 2021-07-29 2021-07-29 A decision tree-based method for evaluating the effect of drug addicts

Publications (1)

Publication Number Publication Date
CN113658681A (en) 2021-11-16

Family

ID=78490851

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110864809.9A Pending CN113658681A (en) 2021-07-29 2021-07-29 A decision tree-based method for evaluating the effect of drug addicts

Country Status (1)

Country Link
CN (1) CN113658681A (en)

Similar Documents

Publication Publication Date Title
CN110378818B (en) Personalized exercise recommendation method, system and medium based on difficulty
RU2012133279A (en) COMPUTER SYSTEM FOR FORECASTING TREATMENT RESULTS
CN109816265B (en) Knowledge characteristic mastery degree evaluation method, question recommendation method and electronic equipment
CN111461442B (en) Knowledge tracking method and system based on federal learning
CN113808698B (en) Computerized social adaptation training method and system
CN112069329A (en) Text corpus processing method, device, equipment and storage medium
CN107610009B (en) A neural network-based method for predicting the admission probability of Trinity enrollment
CN110069776B (en) Customer satisfaction evaluation method and device and computer readable storage medium
CN106529110A (en) Classification method and equipment of user data
CN117290462B (en) An intelligent decision-making system and method for large data models
CN113571158A (en) Intelligent AI intelligent mental health detection and analysis evaluation system
CN113554213A (en) Natural gas demand prediction method, system, storage medium and equipment
Tomkin et al. An improved grade point average, with applications to CS undergraduate education analytics
CN112733340A (en) Well selection method and equipment for modifying candidate well based on data-driven reservoir
CN114548494A (en) Visual cost data prediction intelligent analysis system
Burns Validity of person matching in vocational interest inventories
Wang et al. A method for evaluating elicitation schemes for probabilistic models
CN118397886B (en) Interactive data supervision method and system based on MVC framework
CN113035363B (en) Probability density weighted genetic metabolic disease screening data mixed sampling method
CN118658589A (en) A mental health assessment system and method
CN113326976A (en) Port freight volume online prediction method and system based on time-space correlation
CN113658681A (en) A decision tree-based method for evaluating the effect of drug addicts
CN113673811B (en) On-line learning performance evaluation method and device based on session
CN108074240A (en) Recognition methods, identification device, computer readable storage medium and program product
CN115562492A (en) Cognitive assessment training method and system based on virtual reality and eye movement

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20211116