CN113192642B - Surgical patient postoperative recovery state prediction model construction system - Google Patents
- Publication number
- CN113192642B (application CN202110357379.1A)
- Authority
- CN
- China
- Prior art keywords
- variable
- variables
- postoperative
- proportion
- data
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16H—HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
- G16H50/00—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
- G16H50/50—ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
Landscapes
- Health & Medical Sciences (AREA)
- Engineering & Computer Science (AREA)
- Medical Informatics (AREA)
- Public Health (AREA)
- Biomedical Technology (AREA)
- Data Mining & Analysis (AREA)
- Databases & Information Systems (AREA)
- Pathology (AREA)
- Epidemiology (AREA)
- General Health & Medical Sciences (AREA)
- Primary Health Care (AREA)
- Medical Treatment And Welfare Office Work (AREA)
Abstract
The invention discloses a system for constructing a prediction model of the postoperative recovery state of a surgical patient, implemented by the following steps: step one: converting the time-series variables into trend variables; step two: determining the postoperative recovery state variable of each index from the trend variables and the normal range of the index; step three: using a support vector machine (SVM) algorithm, classifying the postoperative recovery state of the patients according to the postoperative recovery state variables converted from all time-series variables, together with the non-time-series variables other than the number of out-of-bed activities and the out-of-bed activity duration up to 3 days after surgery; step four: taking the postoperative recovery state classification as the dependent variable, and the preoperative variables and postoperative patient behavior variables among the non-time-series variables (the number of out-of-bed activities and the out-of-bed activity duration up to 3 days after surgery) as independent variables, constructing a prediction model using a random forest algorithm.
Description
Technical Field
The invention relates to the technical field of medical treatment, in particular to a system for constructing a prediction model of postoperative recovery state of a surgical patient.
Background
The prior art offers no effective method for assessing the recovery state of postoperative patients, which makes the recovery state of a patient difficult to predict.
Disclosure of Invention
In order to solve the technical problems in the prior art, the invention discloses a postoperative recovery state prediction model construction system for a surgical patient, so as to predict the recovery state of the postoperative patient.
The invention discloses a system for constructing a prediction model of postoperative recovery state of a surgical patient, which comprises the following steps:
Step one: acquire time-series variables and non-time-series variables, and convert the time-series variables into trend variables, as follows:
(1) Calculate the change, between adjacent data acquisition times $t-1$ and $t$, of each type of time-series variable $X_i = (X_{i,t_1}, X_{i,t_2}, \ldots, X_{i,t_j}, \ldots)$, where $i$ denotes the type of time-series variable;
(2) Determine the value of the trend variable $TR_{i,t}$ of each time-series variable at each time: if the change is negative, the corresponding trend variable takes the value -1; if the change is non-negative, the corresponding trend variable takes the value 1;
Step two: determine the postoperative recovery state variable of each index from the trend variables and the normal range of the index, as follows:
(1) Judge whether the value of each trend variable at each data acquisition time is within the normal range of the corresponding index; if it is, the indicator variable takes the value 1; otherwise the indicator variable takes the value 0;
(2) Determine the value of the postoperative recovery state variable $I_i$ of each index of each patient according to clinical diagnosis and treatment standards and whether the trend variable is within the normal range of the index. If the trend variables are all -1, or the trend variables first rise and then fall, and the value of the trend variable at the last data acquisition time is within the normal range of the index, the patient's postoperative recovery on that index is good and the postoperative recovery state variable is recorded as 1. If the values of the trend variables at the last two data acquisition times are both within the normal range of the index, the patient's postoperative recovery on that index is good and the postoperative recovery state variable is recorded as 1. If the 3 trend variables are all -1 but the value of the trend variable at the last data acquisition time is not within the normal range of the index, the patient's postoperative recovery on that index is average and the postoperative recovery state variable is recorded as 0. If the trend variable fluctuates repeatedly and the value increases, the patient's postoperative recovery is poor and the postoperative recovery state variable is recorded as -1.
Step three: using a support vector machine (SVM) algorithm, classify the postoperative recovery state of the patients according to the postoperative recovery state variables converted from all time-series variables, together with the non-time-series variables other than the number of out-of-bed activities and the out-of-bed activity duration up to 3 days after surgery;
Step four: taking the postoperative recovery state classification as the dependent variable, and taking the preoperative variables among the non-time-series variables and the postoperative patient behavior variables (the number of out-of-bed activities and the out-of-bed activity duration up to 3 days after surgery) as independent variables, construct a prediction model using a random forest algorithm. If the proportion of one or more postoperative recovery state classes among all observations is less than 20% of the theoretical proportion, the data are regarded as unbalanced data, where the theoretical proportion of each class is equal to $1/n$ for $n$ classes.
The time-series variables comprise the VAS score, the BT score, the number of active coughs and expectorations, and the number of deep breaths. The non-time-series variables include preoperative variables and postoperative variables. The preoperative variables include postoperative length of stay, sex, incision type, incision length, BMI, CA, tumor diameter, number of tumors, operation time, extent of resection, amount of bleeding, presence or absence of blood transfusion, plasma transfusion volume, suspended red blood cell transfusion volume, ISHAK score, tissue differentiation grade, presence or absence of capsule invasion, extent of resection, presence or absence of cancer emboli, presence or absence of satellite nodules, and preoperative PT. The postoperative variables include the quality-of-life score at 1 month after surgery, presence or absence of complications, Clavien grade of surgical complications, presence or absence of readmission, number of out-of-bed activities, out-of-bed activity duration up to 3 days after surgery, time to postoperative bowel sounds, time to postoperative flatus or defecation, and the quality-of-life score at 1 month after surgery.
Further, step four is specifically as follows. The postoperative recovery state has n classes in total, so the theoretical proportion of each class is 1/n. If the proportion of patients belonging to a class among all patients in the data set is less than 20% of 1/n, the observations of that class form a minority class. For each minority class in the data set, data are artificially synthesized by the following method to supplement the data set, and the supplemented data set is then used to construct the prediction model:
(1) If there is only one minority class in the data set
(1) For each sample of the minority class, find the sample of the same class that is closest in Euclidean distance, such that no sample of another class lies between them, namely
$y_o \neq y_a + \beta_y (y_b - y_a),\ \beta_y \in (0,1)$
or
$x_{o1} \neq x_{a1} + \beta_{x1} (x_{b1} - x_{a1}),\ \beta_{x1} \in (0,1)$
or
$x_{o2} \neq x_{a2} + \beta_{x2} (x_{b2} - x_{a2}),\ \beta_{x2} \in (0,1)$
…
and
(2) Randomly select a point on the line connecting the two samples to generate artificially synthesized data
$y_{ab} = y_a + \beta_y (y_b - y_a),\ \beta_y \in (0,1)$
$x_{ab1} = x_{a1} + \beta_{x1} (x_{b1} - x_{a1}),\ \beta_{x1} \in (0,1)$
$x_{ab2} = x_{a2} + \beta_{x2} (x_{b2} - x_{a2}),\ \beta_{x2} \in (0,1)$
…
(3) Combining the newly synthesized data with the original data to form a new data set, recalculating the proportion of the minority class in the new data set, and stopping artificially synthesizing the data if the proportion is more than or equal to 40% of the theoretical proportion; otherwise, entering (4);
(4) Using the new data set, find three adjacent samples in the minority class such that no other sample lies inside the triangle with these three samples as vertices, namely
$y_o = c_1 y_a + c_2 y_b + c_3 y_c$
$x_{o1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$
$x_{o2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$
…
and either $c_1 + c_2 + c_3 \neq 1$ or at least one of $c_1, c_2, c_3$ is not in $[0,1]$;
(5) Randomly select a point inside the triangle formed by the three samples to generate artificially synthesized data
$y_{abc} = c_1 y_a + c_2 y_b + c_3 y_c$
$x_{abc1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$
$x_{abc2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$
…
$c_1 + c_2 + c_3 = 1$
(6) Combining the newly synthesized data with the original data to form a new data set, recalculating the proportion of the minority class in the new data set, and stopping artificially synthesizing the data if the proportion is more than or equal to 40% of the theoretical proportion; otherwise, entering (7);
(7) Following the above steps and using the new data set, artificially synthesize data based on four, five, six and more samples in turn, until the proportion of the minority class in the new data set is greater than or equal to 40% of the theoretical proportion, or until no combination of minority class samples that excludes samples of other classes can be found; then stop artificially synthesizing data;
(2) If the data set contains two or more minority classes
(1) For each minority class, artificially synthesize data based on two samples using methods (1) and (2) of step four (1);
(2) Combine the newly synthesized data with the original data to form a new data set and recalculate the proportion of each minority class in the new data set; for each minority class whose proportion is less than 40% of the theoretical proportion, go to (3);
(3) For each such minority class, artificially synthesize data based on three samples using methods (4) and (5) of step four (1);
(4) Combine the newly synthesized data with the original data to form a new data set and recalculate the proportion of each minority class in the new data set; for each minority class whose proportion is less than 40% of the theoretical proportion, go to (5);
(5) Following the above steps and using the new data set, artificially synthesize data for each minority class based on four, five, six and more samples in turn, until the proportion of every minority class in the new data set is greater than or equal to 40% of the theoretical proportion, or until no combination of minority class samples that excludes samples of other classes can be found; then stop artificially synthesizing data.
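The imbalance rule of step four can be illustrated with a short Python sketch. It is not part of the patent text: the function names `minority_classes` and `reached_target` and the example class counts are hypothetical, and exact fractions are used only so that the 20% and 40% thresholds are compared without rounding.

```python
from collections import Counter
from fractions import Fraction


def minority_classes(labels, classes, threshold=Fraction(1, 5)):
    """Return the classes whose share of all observations is below
    `threshold` (20% by default) of the theoretical proportion 1/n."""
    theoretical = Fraction(1, len(classes))
    counts = Counter(labels)
    total = len(labels)
    return [c for c in classes
            if Fraction(counts[c], total) < threshold * theoretical]


def reached_target(labels, cls, n_classes, target=Fraction(2, 5)):
    """True once class `cls` makes up at least `target` (40% by default)
    of the theoretical proportion 1/n, the stopping rule for synthesis."""
    share = Fraction(Counter(labels)[cls], len(labels))
    return share >= target * Fraction(1, n_classes)


# Hypothetical class counts for 100 patients and 5 recovery classes.
labels = [0] * 40 + [1] * 30 + [2] * 17 + [3] * 10 + [4] * 3
print(minority_classes(labels, [0, 1, 2, 3, 4]))  # [4]  (3% < 20% of 1/5 = 4%)
print(reached_target(labels, 4, 5))               # False (3% < 40% of 1/5 = 8%)
```

With five classes the minority threshold is 4% and the stopping target is 8%, so in this illustrative data set only class 4 would receive synthesized samples.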
In conclusion, the postoperative recovery state evaluation method provided by the invention is beneficial to effectively evaluating the postoperative recovery state of the surgical patient and predicting the postoperative recovery state of the surgical patient.
The invention is further described with reference to the following drawings and detailed description. All the technologies realized based on the above contents of the present invention belong to the scope of the present invention. It will be apparent that various other modifications, substitutions and alterations can be made in the present invention without departing from the basic technical concept of the invention as described above, according to the common technical knowledge and common practice in the field.
The present invention will be described in further detail with reference to the following examples. This should not be understood as limiting the scope of the above-described subject matter of the present invention to the following examples.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, are included to assist in understanding the invention and to explain the invention and its equivalents without unduly limiting it. In the drawings:
FIG. 1 is a schematic flow chart of a system for constructing a prediction model of postoperative recovery state of a surgical patient according to the present invention.
FIG. 2 is a schematic diagram of artificially synthesizing data based on two samples in unbalanced data according to the present invention.
FIG. 3 is a schematic diagram of artificially synthesizing data based on three samples in unbalanced data according to the present invention.
Detailed Description
The invention will be described more fully hereinafter with reference to the accompanying drawings. One of ordinary skill in the art will be able to implement the invention based on this disclosure. Before the present invention is described in detail with reference to the accompanying drawings, it is to be noted that:
technical solutions and technical features provided in the respective portions including the following description in the present invention may be combined with each other without conflict.
The preferred embodiments and examples of the present invention described in the following description are generally only embodiments and examples of a part of the present invention. Therefore, all other embodiments and examples obtained by a person skilled in the art without any inventive work shall fall within the protection scope of the present invention.
The terms "comprising," "having," and any variations thereof in the description and claims of this invention and the related sections are intended to cover non-exclusive inclusions.
Other related terms and units in the invention can be reasonably construed based on the relevant contents of the invention.
The invention discloses a system for constructing a prediction model of postoperative recovery state of a surgical patient, which comprises the following steps:
Step one: acquire time-series variables and non-time-series variables, and convert the time-series variables into trend variables, as follows:
(1) Calculate the change, between adjacent data acquisition times $t-1$ and $t$, of each type of time-series variable $X_i = (X_{i,t_1}, X_{i,t_2}, \ldots, X_{i,t_j}, \ldots)$, where $i$ denotes the type of time-series variable;
(2) Determine the value of the trend variable $TR_{i,t}$ of each time-series variable at each time: if the change is negative, the corresponding trend variable takes the value -1; if the change is non-negative, the corresponding trend variable takes the value 1;
Step two: determine the postoperative recovery state variable of each index from the trend variables and the normal range of the index, as follows:
(1) Judge whether the value of each trend variable at each data acquisition time is within the normal range of the corresponding index; if it is, the indicator variable takes the value 1; otherwise the indicator variable takes the value 0;
(2) Determine the value of the postoperative recovery state variable $I_i$ of each index of each patient according to clinical diagnosis and treatment standards and whether the trend variable is within the normal range of the index. If the trend variables are all -1, or the trend variables first rise and then fall, and the value of the trend variable at the last data acquisition time is within the normal range of the index, the patient's postoperative recovery on that index is good and the postoperative recovery state variable is recorded as 1. If the values of the trend variables at the last two data acquisition times are both within the normal range of the index, the patient's postoperative recovery on that index is good and the postoperative recovery state variable is recorded as 1. If the 3 trend variables are all -1 but the value of the trend variable at the last data acquisition time is not within the normal range of the index, the patient's postoperative recovery on that index is average and the postoperative recovery state variable is recorded as 0. If the trend variable fluctuates repeatedly and the value increases, the patient's postoperative recovery is poor and the postoperative recovery state variable is recorded as -1.
Step three: using a support vector machine (SVM) algorithm, classify the postoperative recovery state of the patients according to the postoperative recovery state variables converted from all time-series variables, together with the non-time-series variables other than the number of out-of-bed activities and the out-of-bed activity duration up to 3 days after surgery;
Step four: taking the postoperative recovery state classification as the dependent variable, and taking the preoperative variables among the non-time-series variables and the postoperative patient behavior variables (the number of out-of-bed activities and the out-of-bed activity duration up to 3 days after surgery) as independent variables, construct a prediction model using a random forest algorithm. If the proportion of one or more postoperative recovery state classes among all observations is less than 20% of the theoretical proportion, the data are regarded as unbalanced data, where the theoretical proportion of each class is equal to $1/n$ for $n$ classes.
The time-series variables comprise the VAS score, the BT score, the number of active coughs and expectorations, and the number of deep breaths. The non-time-series variables include preoperative variables and postoperative variables. The preoperative variables include postoperative length of stay, sex, incision type, incision length, BMI, CA, tumor diameter, number of tumors, operation time, extent of resection, amount of bleeding, presence or absence of blood transfusion, plasma transfusion volume, suspended red blood cell transfusion volume, ISHAK score, tissue differentiation grade, presence or absence of capsule invasion, extent of resection, presence or absence of cancer emboli, presence or absence of satellite nodules, and preoperative PT. The postoperative variables include the quality-of-life score at 1 month after surgery, presence or absence of complications, Clavien grade of surgical complications, presence or absence of readmission, number of out-of-bed activities, out-of-bed activity duration up to 3 days after surgery, time to postoperative bowel sounds, time to postoperative flatus or defecation, and the quality-of-life score at 1 month after surgery.
Step four is specifically as follows. The postoperative recovery state has n classes in total, so the theoretical proportion of each class is 1/n. If the proportion of patients belonging to a class among all patients in the data set is less than 20% of 1/n, the observations of that class form a minority class. For each minority class in the data set, data are artificially synthesized by the following method to supplement the data set, and the supplemented data set is then used to construct the prediction model:
(1) If there is only one minority class in the data set
(1) For each sample of the minority class, find the sample of the same class that is closest in Euclidean distance, such that no sample of another class lies between them, namely
$y_o \neq y_a + \beta_y (y_b - y_a),\ \beta_y \in (0,1)$
or
$x_{o1} \neq x_{a1} + \beta_{x1} (x_{b1} - x_{a1}),\ \beta_{x1} \in (0,1)$
or
$x_{o2} \neq x_{a2} + \beta_{x2} (x_{b2} - x_{a2}),\ \beta_{x2} \in (0,1)$
…
and
(2) As shown in FIG. 2, randomly select a point on the line connecting the two samples to generate artificially synthesized data
$y_{ab} = y_a + \beta_y (y_b - y_a),\ \beta_y \in (0,1)$
$x_{ab1} = x_{a1} + \beta_{x1} (x_{b1} - x_{a1}),\ \beta_{x1} \in (0,1)$
$x_{ab2} = x_{a2} + \beta_{x2} (x_{b2} - x_{a2}),\ \beta_{x2} \in (0,1)$
…
(3) Combine the newly synthesized data with the original data to form a new data set, and recalculate the proportion of the minority class in the new data set. If the proportion is greater than or equal to 40% of the theoretical proportion, stop artificially synthesizing data; otherwise, go to (4).
(4) Using the new data set, find three adjacent samples in the minority class such that no other sample lies inside the triangle with these three samples as vertices, namely
$y_o = c_1 y_a + c_2 y_b + c_3 y_c$
$x_{o1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$
$x_{o2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$
…
and either $c_1 + c_2 + c_3 \neq 1$ or at least one of $c_1, c_2, c_3$ is not in $[0,1]$.
(5) As shown in FIG. 3, randomly select a point inside the triangle formed by the three samples to generate artificially synthesized data
$y_{abc} = c_1 y_a + c_2 y_b + c_3 y_c$
$x_{abc1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$
$x_{abc2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$
…
$c_1 + c_2 + c_3 = 1$
$c_1, c_2, c_3 \in [0,1]$
(6) Combine the newly synthesized data with the original data to form a new data set, and recalculate the proportion of the minority class in the new data set. If the proportion is greater than or equal to 40% of the theoretical proportion, stop artificially synthesizing data; otherwise, go to (7).
(7) Following the above steps and using the new data set, artificially synthesize data based on four, five, six and more samples in turn, until the proportion of the minority class in the new data set is greater than or equal to 40% of the theoretical proportion, or until no combination of minority class samples that excludes samples of other classes can be found; then stop artificially synthesizing data.
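The conditions that no sample of another class lies between two minority samples, or inside the triangle formed by three of them, can be checked numerically. The following Python sketch is only an illustration under a simplifying assumption that each sample has exactly two coordinates; the function names `on_segment`, `barycentric`, `in_triangle` and `triangle_is_empty` are hypothetical and do not come from the patent.

```python
def on_segment(p, a, b, tol=1e-9):
    """True if 2-D point p lies strictly between a and b on the line a-b,
    i.e. p = a + beta * (b - a) with beta in (0, 1)."""
    cross = (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0])
    if abs(cross) > tol:
        return False  # p is not collinear with a and b
    dot = (p[0] - a[0]) * (b[0] - a[0]) + (p[1] - a[1]) * (b[1] - a[1])
    return 0 < dot < (b[0] - a[0]) ** 2 + (b[1] - a[1]) ** 2


def barycentric(p, a, b, c):
    """Coefficients (c1, c2, c3) with p = c1*a + c2*b + c3*c and c1+c2+c3 = 1."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    if det == 0:
        raise ValueError("the three samples are collinear")
    c1 = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    c2 = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return c1, c2, 1.0 - c1 - c2


def in_triangle(p, a, b, c):
    """True if every barycentric coefficient of p lies in [0, 1]."""
    return all(0.0 <= w <= 1.0 for w in barycentric(p, a, b, c))


def triangle_is_empty(a, b, c, others):
    """True if none of the other-class samples falls inside triangle a-b-c."""
    return not any(in_triangle(o, a, b, c) for o in others)


a, b, c = (0.0, 0.0), (4.0, 0.0), (0.0, 4.0)
print(barycentric((1.0, 1.0), a, b, c))          # (0.5, 0.25, 0.25)
print(triangle_is_empty(a, b, c, [(3.0, 3.0)]))  # True
```

In more than two dimensions a sample can lie outside the plane of the triangle, so a general check would also need to confirm that the sample is an affine combination of the three vertices.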
(2) If the data set contains two or more minority classes
(1) For each minority class, artificially synthesize data based on two samples using methods (1) and (2) of step four (1).
(2) Combine the newly synthesized data with the original data to form a new data set, and recalculate the proportion of each minority class in the new data set. For each minority class whose proportion is less than 40% of the theoretical proportion, go to (3).
(3) For each such minority class, artificially synthesize data based on three samples using methods (4) and (5) of step four (1).
(4) Combine the newly synthesized data with the original data to form a new data set, and recalculate the proportion of each minority class in the new data set. For each minority class whose proportion is less than 40% of the theoretical proportion, go to (5).
(5) Following the above steps and using the new data set, artificially synthesize data for each minority class based on four, five, six and more samples in turn, until the proportion of every minority class in the new data set is greater than or equal to 40% of the theoretical proportion, or until no combination of minority class samples that excludes samples of other classes can be found; then stop artificially synthesizing data.
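The synthesis itself amounts to drawing a random convex combination of two, three or more minority samples and repeating until the 40% target is met. The Python sketch below illustrates that idea under stated assumptions and is not the patented procedure: it picks sample groups at random instead of by nearest-neighbor search, omits the check for samples of other classes, and applies one shared set of weights to all coordinates, whereas the formulas above write a separate coefficient for each coordinate. The names `convex_weights`, `synthesize` and `oversample` are hypothetical.

```python
import random
from fractions import Fraction


def convex_weights(k, rng):
    """k non-negative weights summing to 1 (normalized exponential draws,
    which are uniformly distributed over the simplex)."""
    raw = [rng.expovariate(1.0) for _ in range(k)]
    total = sum(raw)
    return [w / total for w in raw]


def synthesize(samples, rng):
    """One synthetic point: a random convex combination of the given samples
    (on the segment for two samples, inside the triangle for three)."""
    weights = convex_weights(len(samples), rng)
    dim = len(samples[0])
    return [sum(w * s[d] for w, s in zip(weights, samples)) for d in range(dim)]


def oversample(minority, n_total, n_classes, group_size, rng,
               target=Fraction(2, 5)):
    """Add synthetic minority points until the minority share reaches
    `target` (40% by default) of the theoretical proportion 1/n_classes."""
    goal = target * Fraction(1, n_classes)
    synthetic = []
    while Fraction(len(minority) + len(synthetic),
                   n_total + len(synthetic)) < goal:
        group = rng.sample(minority, group_size)  # simplification: random group
        synthetic.append(synthesize(group, rng))
    return synthetic


rng = random.Random(0)  # fixed seed so the illustration is repeatable
minority = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]]  # hypothetical samples
new_points = oversample(minority, n_total=100, n_classes=5, group_size=2, rng=rng)
print(len(new_points))  # 6, since (3 + 6) / (100 + 6) is the first share >= 8%
```

Calling `oversample` again with `group_size=3`, then 4 and so on, mirrors the escalation from two-sample to three-sample and larger combinations described in the steps above.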
The non-time-series variable indexes are shown in Table 1 below.
TABLE 1
The invention is further illustrated below using the CRP index, with reference to Table 1 and FIG. 1:
Step one: acquire time-series variables and non-time-series variables, and convert the time-series variables into trend variables, as follows:
(1) Calculate the change, between adjacent data acquisition times $t-1$ and $t$, of each type of time-series variable $X_i = (X_{i,t_1}, X_{i,t_2}, \ldots, X_{i,t_j}, \ldots)$, where $i$ denotes the type of time-series variable.
For example, the change in the VAS score at 1 day after surgery is the difference between the VAS score at 1 day after surgery and the VAS score at 8 hours after surgery, and the change in HB at 3 days after surgery is the difference between HB at 3 days after surgery and HB at 1 day after surgery:
$\Delta_{HB,3} = HB_3 - HB_1$
(2) Determine the value of the trend variable $TR_{i,t}$ of each time-series variable at each time. If the change is negative, the corresponding trend variable takes the value -1; if the change is non-negative, the corresponding trend variable takes the value 1.
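Step one can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation: the function names `changes` and `trend_variables` are hypothetical, and the HB readings are made-up numbers rather than patient data.

```python
from typing import List, Sequence


def changes(values: Sequence[float]) -> List[float]:
    """Change between adjacent acquisition times: later value minus earlier value."""
    return [curr - prev for prev, curr in zip(values, values[1:])]


def trend_variables(values: Sequence[float]) -> List[int]:
    """Trend variable per change: -1 if the change is negative, otherwise 1."""
    return [-1 if delta < 0 else 1 for delta in changes(values)]


# Hypothetical HB readings at four consecutive acquisition times.
hb = [120.0, 112.0, 112.0, 118.0]
print(changes(hb))          # [-8.0, 0.0, 6.0]
print(trend_variables(hb))  # [-1, 1, 1]
```

A zero change maps to 1 because the rule treats every non-negative change as an upward or flat trend.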
Step two: determine the postoperative recovery state variable of each index from the trend variables and the normal range of the index, as follows:
(1) Judge whether the value of each trend variable at each data acquisition time is within the normal range of the corresponding index; if it is, the indicator variable takes the value 1; otherwise the indicator variable takes the value 0.
The normal reference range of CRP is 800-8000 μg/L.
(2) Determine the value of the postoperative recovery state variable $I_i$ of each index of each patient according to clinical diagnosis and treatment standards and whether the trend variable is within the normal range of the index. If the trend variables are all -1, or the trend variables first rise and then fall, and the value of the trend variable at the last data acquisition time is within the normal range of the index, the patient's postoperative recovery on that index is good and the postoperative recovery state variable is recorded as 1. If the values of the trend variables at the last two data acquisition times are both within the normal range of the index, the patient's postoperative recovery on that index is good and the postoperative recovery state variable is recorded as 1. If the 3 trend variables are all -1 but the value of the trend variable at the last data acquisition time is not within the normal range of the index, the patient's postoperative recovery on that index is average and the postoperative recovery state variable is recorded as 0. In other cases, such as a trend variable that fluctuates repeatedly while the value increases, the patient's postoperative recovery is poor and the postoperative recovery state variable is recorded as -1.
An elevated CRP indicates a hyperactive inflammatory response. CRP should be elevated before surgery and should decrease 5 days after surgery; if CRP does not decrease, or rises again, complications such as infection or thromboembolism may be present. For CRP there are therefore 4 postoperative observations, $CRP_1, CRP_2, CRP_3$, corresponding to 3 trend variables, $TR_{CRP,1}, TR_{CRP,2}, TR_{CRP,3}$, and 4 indicator variables showing whether the index is normal, $N_{CRP,1}, N_{CRP,2}, N_{CRP,3}$.
If the 3 trend variables are all -1, $TR_{CRP,1} = TR_{CRP,2} = TR_{CRP,3} = -1$, or first rise and then fall, $TR_{CRP,1} = 1,\ TR_{CRP,2} = TR_{CRP,3} = -1$ or $TR_{CRP,1} = TR_{CRP,2} = 1,\ TR_{CRP,3} = -1$, and the variable value at the last data acquisition time is within the normal range of the index, $N_{CRP,3} = 1$, then the patient's postoperative recovery on the CRP index is good and the CRP postoperative recovery state variable is recorded as 1, $I_{CRP} = 1$.
If the variable values at the last two data acquisition times are both within the normal range of the index, $N_{CRP,2} = N_{CRP,3} = 1$, then the patient's postoperative recovery on the CRP index is good and the CRP postoperative recovery state variable is recorded as 1, $I_{CRP} = 1$.
If the 3 trend variables are all -1 and the variable value at the last data acquisition time is not within the normal range of the index, $N_{CRP,3} = 0$, then the patient's postoperative recovery on the CRP index is average and the CRP postoperative recovery state variable is recorded as 0, $I_{CRP} = 0$.
In other cases, such as a trend that fluctuates repeatedly while the value increases, the patient's postoperative recovery on the CRP index is poor and the CRP postoperative recovery state variable is recorded as -1, $I_{CRP} = -1$. Through this step, each type of time-series variable $X_i$ is converted into a postoperative recovery state variable $I_i$.
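The decision rule of step two can be written out as a small Python function. This is a hedged sketch of one reading of the text: the names `in_normal_range`, `rises_then_falls` and `recovery_state` are hypothetical, the range bounds are treated as inclusive by assumption, the rules are applied in the order the text lists them, and the CRP readings are invented for illustration.

```python
from typing import Sequence


def in_normal_range(value: float, low: float, high: float) -> int:
    """Indicator variable: 1 if the value is within [low, high], else 0."""
    return 1 if low <= value <= high else 0


def rises_then_falls(trend: Sequence[int]) -> bool:
    """True if the trend is one or more 1s followed only by -1s."""
    t = list(trend)
    if len(t) < 2 or t[0] != 1 or t[-1] != -1:
        return False
    first_fall = t.index(-1)
    return all(v == -1 for v in t[first_fall:])


def recovery_state(trend: Sequence[int], normal: Sequence[int]) -> int:
    """Postoperative recovery state of one index: 1 good, 0 average, -1 poor."""
    all_falling = all(v == -1 for v in trend)
    if (all_falling or rises_then_falls(trend)) and normal[-1] == 1:
        return 1
    if normal[-1] == 1 and normal[-2] == 1:
        return 1
    if all_falling and normal[-1] == 0:
        return 0
    return -1


# Hypothetical CRP readings in ug/L; 800-8000 is the normal range given above.
crp = [30000.0, 52000.0, 21000.0, 6500.0]
trend = [-1 if b - a < 0 else 1 for a, b in zip(crp, crp[1:])]
normal = [in_normal_range(v, 800.0, 8000.0) for v in crp]
print(trend, normal)                  # [1, -1, -1] [0, 0, 0, 1]
print(recovery_state(trend, normal))  # 1
```

Here CRP rises once, then falls twice, and the last reading is back inside the normal range, so the index is scored as good recovery.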
Step three: using a support vector machine (SVM) algorithm, classify the postoperative recovery state of the patients according to the postoperative recovery state variables converted from all time-series variables, together with the non-time-series variables other than the number of out-of-bed activities and the out-of-bed activity duration up to 3 days after surgery;
Step four: taking the postoperative recovery state classification as the dependent variable, and taking the preoperative variables among the non-time-series variables and the postoperative patient behavior variables (the number of out-of-bed activities and the out-of-bed activity duration up to 3 days after surgery) as independent variables, construct a prediction model using a random forest algorithm. If the proportion of one or more postoperative recovery state classes among all observations is less than 20% of the theoretical proportion, the data are regarded as unbalanced data, where the theoretical proportion of each class is equal to $1/n$ for $n$ classes.
The time-series variables comprise the VAS score, the BT score, the number of active coughs and expectorations, and the number of deep breaths. The non-time-series variables include preoperative variables and postoperative variables. The preoperative variables include postoperative length of stay, sex, incision type, incision length, BMI, CA, tumor diameter, number of tumors, operation time, extent of resection, amount of bleeding, presence or absence of blood transfusion, plasma transfusion volume, suspended red blood cell transfusion volume, ISHAK score, tissue differentiation grade, presence or absence of capsule invasion, extent of resection, presence or absence of cancer emboli, presence or absence of satellite nodules, and preoperative PT. The postoperative variables include the quality-of-life score at 1 month after surgery, presence or absence of complications, Clavien grade of surgical complications, presence or absence of readmission, number of out-of-bed activities, out-of-bed activity duration up to 3 days after surgery, time to postoperative bowel sounds (hours), time to postoperative flatus or defecation (hours), and the quality-of-life score at 1 month after surgery.
Step four is specifically as follows. The postoperative recovery state has n classes in total, and the theoretical proportion of each class is 1/n. If the ratio of the number of patients belonging to a class to the total number of patients in the data set is less than 20% of 1/n, the observations of that class form a minority class. For example, assuming that the postoperative recovery states obtained in step three fall into 5 classes, the theoretical proportion of each class is 1/5 = 20%, and a class is a minority class if its patients make up less than 20% × 20% = 4% of all patients in the data set. For each minority class in the data, data are artificially synthesized by the following method to supplement the data set, and the prediction model is constructed using the supplemented new data set:
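(1) If there is only one minority class in the data set:
(1) For each sample a of the minority class, find the sample b of the same class that is closest to it in Euclidean distance, with no samples of other classes between them.
(2) Randomly select a point on the line connecting the two samples to generate artificially synthesized data: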
$$y_{ab} = y_a + \beta_y (y_b - y_a), \quad \beta_y \in (0,1)$$
$$x_{ab1} = x_{a1} + \beta_{x1} (x_{b1} - x_{a1}), \quad \beta_{x1} \in (0,1)$$
$$x_{ab2} = x_{a2} + \beta_{x2} (x_{b2} - x_{a2}), \quad \beta_{x2} \in (0,1)$$
…
(3) And combining the newly synthesized data with the original data to form a new data set, and recalculating the proportion of the minority class in the new data set. If the proportion is more than or equal to 40 percent of the theoretical proportion, stopping artificially synthesizing data; otherwise, go to (4).
(4) Using the new data set, find three adjacent samples a, b and c in the minority class such that no sample of any other class lies inside the triangle whose vertices are these three samples. That is, for every sample o of another class, the coefficients in
$$y_o = c_1 y_a + c_2 y_b + c_3 y_c$$
$$x_{o1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$$
$$x_{o2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$$
…
satisfy $c_1 + c_2 + c_3 \neq 1$, or at least one of $c_1, c_2, c_3$ is not in $[0,1]$.
(5) As shown in FIG. 3, randomly select a point in the triangle formed by the three samples to generate artificially synthesized data:
$$y_{abc} = c_1 y_a + c_2 y_b + c_3 y_c$$
$$x_{abc1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$$
$$x_{abc2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$$
…
$$c_1 + c_2 + c_3 = 1$$
$$c_1, c_2, c_3 \in [0,1]$$
(6) And combining the newly synthesized data with the original data to form a new data set, and recalculating the proportion of the minority class in the new data set. If the proportion is more than or equal to 40 percent of the theoretical proportion, stopping artificially synthesizing data; otherwise, go to (7).
(7) Following the steps above, use the new data set to artificially synthesize data based on four, five, six, and successively more samples, until the proportion of the minority class in the new data set is greater than or equal to 40% of the theoretical proportion, or until no combination of minority class samples that excludes samples of other classes can be found; then stop artificially synthesizing data.
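The two-sample and three-sample synthesis steps for the single-minority-class case can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the function names are hypothetical, one shared set of coefficients is drawn per synthetic point (which keeps a two-sample point exactly on the connecting line, whereas the two-sample formulas in the text write a separate β for each variable), and the triangle check is shown only for samples with two coordinates.

```python
import random


def synthesize(samples, rng=random):
    """Return one random convex combination of the given samples.

    Each sample is a tuple of numbers, e.g. (y, x1, x2, ...). With two
    samples the result lies on the connecting line; with three samples it
    lies inside the triangle they span.
    """
    # 1.0 - random() lies in (0, 1], so the weights can never all be zero.
    weights = [1.0 - rng.random() for _ in samples]
    total = sum(weights)
    coeffs = [w / total for w in weights]  # each in [0, 1], summing to 1
    dims = len(samples[0])
    return tuple(
        sum(c * s[d] for c, s in zip(coeffs, samples)) for d in range(dims)
    )


def barycentric(p, a, b, c):
    """Coefficients (c1, c2, c3) with p = c1*a + c2*b + c3*c for 2-D points."""
    det = (b[1] - c[1]) * (a[0] - c[0]) + (c[0] - b[0]) * (a[1] - c[1])
    if det == 0:
        raise ValueError("degenerate triangle")
    c1 = ((b[1] - c[1]) * (p[0] - c[0]) + (c[0] - b[0]) * (p[1] - c[1])) / det
    c2 = ((c[1] - a[1]) * (p[0] - c[0]) + (a[0] - c[0]) * (p[1] - c[1])) / det
    return c1, c2, 1.0 - c1 - c2


def triangle_is_clear(a, b, c, others):
    """True if no point in `others` lies inside or on the triangle a, b, c."""
    for p in others:
        if all(0.0 <= w <= 1.0 for w in barycentric(p, a, b, c)):
            return False
    return True


rng = random.Random(0)
a, b, c = (0.0, 0.0), (4.0, 0.0), (0.0, 4.0)  # hypothetical minority samples
others = [(5.0, 5.0), (-1.0, 2.0)]            # hypothetical other-class samples
if triangle_is_clear(a, b, c, others):
    new_point = synthesize([a, b, c], rng)    # synthetic minority sample
```

Because the barycentric coefficients always sum to 1 in this two-coordinate setting, the exclusion condition reduces to requiring that at least one coefficient of every other-class sample falls outside [0, 1].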
(2) If the data set contains two or more minority classes:
(1) For each minority class, artificially synthesize data based on two samples using methods (1) and (2) of step four (1).
(2) Combine the newly synthesized data with the original data to form a new data set, and recalculate the proportion of each minority class in the new data set. For each minority class whose proportion is less than 40% of the theoretical proportion, go to (3).
(3) For each such minority class, artificially synthesize data based on two samples using methods (4) and (5) of step four (1).
(4) Combine the newly synthesized data with the original data to form a new data set, and recalculate the proportion of each minority class in the new data set. For each minority class whose proportion is less than 40% of the theoretical proportion, go to (5).
(5) Following the steps above, use the new data set to artificially synthesize data for each such minority class based on four, five, six, and successively more samples, until the proportion of every minority class in the new data set is greater than or equal to 40% of the theoretical proportion, or until no combination of minority class samples that excludes samples of other classes can be found; then stop artificially synthesizing data.
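The stopping threshold of 40% of the theoretical proportion can be made concrete with a short arithmetic sketch. It assumes a single minority class and counts how many synthetic samples k are needed so that (m + k) / (N + k) ≥ 0.4 / n, where m is the number of minority samples, N the total number of samples and n the number of classes. The function name and the numbers are hypothetical, and the sketch ignores the second stopping condition (no valid sample combination left).

```python
def synthetic_needed(total, minority, n_classes, target_pct=40):
    """Smallest k with (minority + k) / (total + k) >= (target_pct / 100) / n.

    Integer arithmetic is used so that the comparison is exact.
    """
    k = 0
    while (minority + k) * 100 * n_classes < target_pct * (total + k):
        k += 1
    return k


# Hypothetical data set: 200 patients, 5 classes, 6 patients in the minority class.
needed = synthetic_needed(200, 6, 5)  # 11 synthetic samples reach the 8% target
```

With several minority classes, every synthetic sample added to one class also enlarges the total, so the proportions of all classes would need to be recalculated after each round, as steps (2) and (4) describe.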
In summary, the invention (1) converts the time series variables describing the patient's postoperative recovery state into trend variables and then forms a comprehensive postoperative recovery state indicator variable for each postoperative recovery index; and (2) if the data set is unbalanced, artificially synthesizes minority class data within the region whose vertices are minority class samples, and trains the prediction model on a new data set containing both the synthesized data and the original data. The invention can thereby effectively evaluate and predict the postoperative recovery state of surgical patients.
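The trend conversion summarized in (1) follows the rule of step one: a negative change between adjacent acquisition times gives a trend value of -1, and a non-negative change gives 1. A minimal sketch, with a hypothetical function name and hypothetical VAS scores:

```python
def trend_variables(series):
    """Map a time series to trend values: -1 for a negative change, else 1."""
    return [
        -1 if later - earlier < 0 else 1
        for earlier, later in zip(series, series[1:])
    ]


vas = [6, 5, 5, 3]              # hypothetical VAS scores at four acquisition times
trends = trend_variables(vas)   # [-1, 1, -1]
```

Note that an unchanged value counts as non-negative and therefore maps to 1 under this rule.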
The contents of the present invention have been explained above. Those skilled in the art will be able to practice the invention based on these descriptions. Based on the above disclosure of the present invention, all other preferred embodiments and examples obtained by a person skilled in the art without any inventive step should fall within the scope of protection of the present invention.
Claims (2)
1. The system for constructing the prediction model of the postoperative recovery state of the surgical patient is characterized by comprising the following steps of:
step one:
acquiring time series variables and non-time series variables, and converting the time series variables into trend variables, wherein the method comprises the following steps:
(1) Calculating, for each type of time series variable $X_i = (X_{i,t_1}, X_{i,t_2}, \ldots, X_{i,t_j}, \ldots)$, the change between adjacent data acquisition times t-1 and t, where i denotes the type of time series variable,
(2) Determining the value of the trend variable $TR_{i,t}$ of each time series variable at each time: if the change is negative, the corresponding trend variable takes the value -1; if the change is non-negative, the corresponding trend variable takes the value 1,
step two:
determining the postoperative recovery state variable of each index by using the trend variable and the normal range of the index, wherein the method specifically comprises the following steps:
(1) Judging whether the value of each trend variable is in the corresponding normal range of the index at each data acquisition moment, and if the value is in the normal range, judging that the value of the index variable is 1; otherwise, the index variable takes the value 0,
(2) Determining the value of the postoperative recovery state variable $I_i$ of each index for each patient according to the clinical diagnosis standard and whether the trend variable is within the normal range of the index: if the trend variables are all -1, or the trend variables first rise and then fall, and the value of the trend variable at the last data acquisition time is within the normal range of the index, the patient's postoperative recovery on that index is good, and the postoperative recovery state variable is recorded as 1; if the values of the trend variables at the last two data acquisition times are both within the normal range of the index, the patient's postoperative recovery on that index is good, and the postoperative recovery state variable is recorded as 1; if the 3 trend variables are all -1 but the value of the trend variable at the last data acquisition time is not within the normal range of the index, the patient's postoperative recovery on that index is average, and the postoperative recovery state variable is recorded as 0; if the trend variable fluctuates repeatedly and the value increases, the patient's postoperative recovery on that index is poor, and the postoperative recovery state variable is recorded as -1;
step three:
classifying the postoperative recovery state of each patient by using a Support Vector Machine (SVM) algorithm, based on the postoperative recovery state variables converted from all time series variables, together with all non-time-series variables other than the number and duration of out-of-bed activities in the 3 days after the operation;
step four:
taking the postoperative recovery state classification as the dependent variable, and taking the preoperative variables among the non-time-series variables, the postoperative patient behavior variables, and the number and duration of out-of-bed activities in the 3 days after the operation as independent variables, constructing a prediction model by using a random forest algorithm; if the proportion of one or more postoperative recovery state classifications among all observations is less than 20% of the theoretical proportion, regarding the data as unbalanced data, wherein the theoretical proportion of each classification is equal to 1/n, n being the number of classifications;
wherein the time series variables comprise VAS scores, BI scores, the number of active coughs and expectorations, and the number of deep breaths; the non-time-series variables include preoperative variables and postoperative variables, the preoperative variables including postoperative length of stay, gender, incision type, incision length, BMI, CA, tumor diameter, number of tumors, operation time, extent of resection, amount of bleeding, presence or absence of blood transfusion, amount of plasma transfused, amount of suspended red blood cells transfused, ISHAK score, tissue differentiation grade, presence or absence of capsule invasion, extent of resection, presence or absence of cancer emboli, presence or absence of satellite nodules, and preoperative PT; the postoperative variables comprise the 1-month postoperative quality-of-life score, presence or absence of complications, Clavien grade of surgical complications, presence or absence of readmission, number of out-of-bed activities, duration of out-of-bed activity in the 3 days after the operation, time to postoperative bowel sounds, time to postoperative flatus or defecation, and the 1-month postoperative quality-of-life score.
2. The system for constructing a model for predicting the postoperative recovery state of a surgical patient according to claim 1, wherein the fourth step is specifically: the postoperative recovery state is n types in total, the theoretical proportion of each type is 1/n, if the proportion of the number of patients belonging to a certain type to the total number of patients in the data set is less than 20% of 1/n, the observation value of the type is a minority type, for the minority type in the data set, data are artificially synthesized by the following method to supplement the data set, and then the new supplemented data set is utilized to construct a prediction model:
(1) If there is only one minority class in the dataset
(1) For each sample a of the minority class, searching for the sample b of the same class that is closest to it in Euclidean distance, such that no sample of any other class lies between them;
(2) Randomly selecting a point on the connecting line between the two samples to generate artificially synthesized data
$$y_{ab} = y_a + \beta_y (y_b - y_a), \quad \beta_y \in (0,1)$$
$$x_{ab1} = x_{a1} + \beta_{x1} (x_{b1} - x_{a1}), \quad \beta_{x1} \in (0,1)$$
$$x_{ab2} = x_{a2} + \beta_{x2} (x_{b2} - x_{a2}), \quad \beta_{x2} \in (0,1)$$
…
(3) Combining the newly synthesized data with the original data to form a new data set, recalculating the proportion of the minority class in the new data set, and stopping artificially synthesizing the data if the proportion is more than or equal to 40% of the theoretical proportion; otherwise, entering (4);
(4) using the new data set, finding three adjacent samples a, b and c in the minority class such that no sample of any other class lies inside the triangle whose vertices are these three samples, that is, for every sample o of another class, the coefficients in
$$y_o = c_1 y_a + c_2 y_b + c_3 y_c$$
$$x_{o1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$$
$$x_{o2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$$
…
satisfy $c_1 + c_2 + c_3 \neq 1$, or at least one of $c_1, c_2, c_3$ is not in $[0,1]$;
(5) Randomly selecting a point in a triangle formed by the three samples to generate artificial synthetic data
$$y_{abc} = c_1 y_a + c_2 y_b + c_3 y_c$$
$$x_{abc1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$$
$$x_{abc2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$$
…
$$c_1 + c_2 + c_3 = 1$$
(6) Combining the newly synthesized data with the original data to form a new data set, recalculating the proportion of the minority class in the new data set, and stopping artificially synthesizing the data if the proportion is more than or equal to 40% of the theoretical proportion; otherwise, entering (7);
(7) according to the steps, the new data set is utilized, data are artificially synthesized based on four, five, six and other samples in sequence until the proportion of minority samples in the new data set is more than or equal to 40% of the theoretical proportion, or minority sample combinations which do not cover other samples cannot be found, and the artificial synthesis of the data is stopped;
(2) If the data set contains two or more minority classes
(1) for each minority class, artificially synthesizing data based on two samples by using the methods (1) and (2) in the step four (1);
(2) combining the newly synthesized data with the original data to form a new data set, recalculating the proportion of each minority class in the new data set, and for each minority class with a proportion less than 40% of the theoretical proportion, entering (3);
(3) for each such minority class, artificially synthesizing data based on two samples by using the methods (4) and (5) in the step four (1);
(4) combining the newly synthesized data with the original data to form a new data set, recalculating the proportion of each minority class in the new data set, and for each minority class with a proportion less than 40% of the theoretical proportion, entering (5);
(5) according to the steps, the new data set is utilized, data are artificially synthesized for each minority class on the basis of four, five, six and other samples in sequence until the proportion of all the minority classes in the new data set is more than or equal to 40% of the theoretical proportion, or a minority class sample combination which does not cover other class samples cannot be found.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110357379.1A CN113192642B (en) | 2021-04-01 | 2021-04-01 | Surgical patient postoperative recovery state prediction model construction system |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110357379.1A CN113192642B (en) | 2021-04-01 | 2021-04-01 | Surgical patient postoperative recovery state prediction model construction system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113192642A (en) | 2021-07-30 |
CN113192642B (en) | 2023-02-28 |
Family
ID=76974450
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110357379.1A Active CN113192642B (en) | 2021-04-01 | 2021-04-01 | Surgical patient postoperative recovery state prediction model construction system |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113192642B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116230212A (en) * | 2023-04-04 | 2023-06-06 | 曜立科技(北京)有限公司 | Diagnosis decision system for postoperative cerebral apoplexy review based on data processing |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107958708A (en) * | 2017-12-22 | 2018-04-24 | 北京鑫丰南格科技股份有限公司 | Risk trend appraisal procedure and system after institute |
CN108742513A (en) * | 2018-02-09 | 2018-11-06 | 上海长江科技发展有限公司 | Patients with cerebral apoplexy rehabilitation prediction technique and system |
CN109659033A (en) * | 2018-12-18 | 2019-04-19 | 浙江大学 | A kind of chronic disease change of illness state event prediction device based on Recognition with Recurrent Neural Network |
CN111292824A (en) * | 2020-01-20 | 2020-06-16 | 深圳市丞辉威世智能科技有限公司 | Rehabilitation method, rehabilitation device, rehabilitation apparatus, and computer-readable storage medium |
CN112133441A (en) * | 2020-08-21 | 2020-12-25 | 广东省人民医院 | Establishment method and terminal of MH post-operation fissure hole state prediction model |
CN112270441A (en) * | 2020-10-30 | 2021-01-26 | 华东师范大学 | Method for establishing autism child rehabilitation effect prediction model and method and system for predicting autism child rehabilitation effect |
Non-Patent Citations (2)
Title |
---|
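Prevalence and serotype distribution of nasopharyngeal carriage of Streptococcus pneumoniae in China: a meta-analysis; Lin Wang et al.; BMC Infectious Diseases 2017; 2017-12-13; pp. 1-14 * |
Research progress on evaluation indicators for the application of the enhanced recovery after surgery model in China and abroad (加速康复外科模式在国内外应用的评价指标研究进展); Li Ka et al.; Chinese Journal of Bases and Clinics in General Surgery (中国普外基础与临床杂志); 2018-05-31; Vol. 25, No. 5; pp. 629-634 * |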
Also Published As
Publication number | Publication date |
---|---|
CN113192642A (en) | 2021-07-30 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | |
SE01 | Entry into force of request for substantive examination | |
GR01 | Patent grant | |