CN113192642B

CN113192642B - Surgical patient postoperative recovery state prediction model construction system

Info

Publication number: CN113192642B
Application number: CN202110357379.1A
Authority: CN
Inventors: 胡艳杰; 房圆晨; 徐湖洋; 曾思瑜
Original assignee: West China Hospital of Sichuan University
Current assignee: West China Hospital of Sichuan University
Priority date: 2021-04-01
Filing date: 2021-04-01
Publication date: 2023-02-28
Anticipated expiration: 2041-04-01
Also published as: CN113192642A

Abstract

The invention discloses a system for constructing a prediction model of the postoperative recovery state of a surgical patient, which is realized by the following steps. Step one: converting the time series variables into trend variables. Step two: determining the postoperative recovery state variable of each index using the trend variables and the normal range of the index. Step three: classifying the postoperative recovery state of the patients with a Support Vector Machine (SVM) algorithm, according to the postoperative recovery state variables converted from all time series variables and the non-time series variables other than the number of out-of-bed activities and the duration of out-of-bed activity up to 3 days after the operation. Step four: taking the postoperative recovery state classification as the dependent variable and, among the non-time series variables, the preoperative variables and the postoperative patient behavior variables (the number of out-of-bed activities and the duration of out-of-bed activity up to 3 days after the operation) as independent variables, and constructing a prediction model with a random forest algorithm.

Description

Surgical patient postoperative recovery state prediction model construction system

Technical Field

The invention relates to the technical field of medical treatment, in particular to a system for constructing a prediction model of postoperative recovery state of a surgical patient.

Background

In the prior art, an effective assessment method for the recovery state of a postoperative patient does not exist, and the recovery state of the patient is not convenient to predict.

Disclosure of Invention

In order to solve the technical problems in the prior art, the invention discloses a postoperative recovery state prediction model construction system for a surgical patient, so as to predict the recovery state of the postoperative patient.

The invention discloses a system for constructing a prediction model of postoperative recovery state of a surgical patient, which comprises the following steps:

Step one: acquiring time series variables and non-time series variables, and converting the time series variables into trend variables, specifically as follows:

(1) For each type of time series variable $X_i = (X_{i,t_1}, X_{i,t_2}, \ldots, X_{i,t_j}, \ldots)$, calculating the variation between adjacent data acquisition times $t-1$ and $t$, where $i$ denotes the type of time series variable;

(2) Determining the value of the trend variable $TR_{i,t}$ of each time series variable at each time: if the variation is negative, the corresponding trend variable takes the value -1; if the variation is non-negative, the corresponding trend variable takes the value 1.

Step two: determining the postoperative recovery state variable of each index using the trend variables and the normal range of the index, specifically as follows:

(1) Judging whether the value of each index at each data acquisition moment is within the corresponding normal range of the index: if so, the indicator variable takes the value 1; otherwise, the indicator variable takes the value 0;

(2) Determining the value of the postoperative recovery state variable $I_i$ of each index for each patient according to clinical diagnosis and treatment standards and whether the index is within its normal range. If the trend variables are all -1, or first rise and then fall, and the value at the last data acquisition moment is within the normal range of the index, the postoperative recovery of this index is good and the postoperative recovery state variable is recorded as 1. If the values at the last two data acquisition moments are both within the normal range of the index, the postoperative recovery of this index is good and the postoperative recovery state variable is recorded as 1. If the 3 trend variables are all -1 but the value at the last data acquisition moment is not within the normal range of the index, the postoperative recovery of this index is average and the postoperative recovery state variable is recorded as 0. If the trend variable fluctuates repeatedly and the value rises, the postoperative recovery of the patient is poor and the postoperative recovery state variable is recorded as -1.

Step three: using a Support Vector Machine (SVM) algorithm, classifying the postoperative recovery state of the patients according to the postoperative recovery state variables converted from all time series variables and the non-time series variables other than the number of out-of-bed activities and the duration of out-of-bed activity up to 3 days after the operation.

Step four: taking the postoperative recovery state classification as the dependent variable and, among the non-time series variables, the preoperative variables and the postoperative patient behavior variables (the number of out-of-bed activities and the duration of out-of-bed activity up to 3 days after the operation) as independent variables, and constructing a prediction model using a random forest algorithm. If, among the observations, the proportion of one or more postoperative recovery state classes is less than 20% of the theoretical proportion, the data are regarded as unbalanced data, where the theoretical proportion of each class is $1/n$ and $n$ is the number of classes.

Wherein the time series variables include the VAS score, the BT score, the number of active coughs and expectorations, and the number of deep breaths. The non-time series variables include preoperative variables and postoperative variables. The preoperative variables include postoperative length of stay, sex, incision type, incision length, BMI, CA, tumor diameter, number of tumors, operation time, extent of resection, amount of bleeding, presence or absence of blood transfusion, amount of plasma transfusion, amount of suspended red blood cell transfusion, ISHAK score, tissue differentiation grade, presence or absence of capsule invasion, extent of resection, presence or absence of cancer emboli, presence or absence of satellite nodules, and preoperative PT. The postoperative variables include the 1-month postoperative quality-of-life score, presence or absence of complications, Clavien grade of surgical complications, presence or absence of readmission, number of out-of-bed activities, duration of out-of-bed activity up to 3 days after the operation, time to postoperative bowel sounds, time to postoperative anal exhaust or defecation, and the 1-month postoperative quality-of-life score.

Further, the fourth step is specifically: the postoperative recovery state is n types in total, the theoretical proportion of each type is 1/n, if the proportion of the number of patients belonging to a certain type to the total number of patients in the data set is less than 20% of 1/n, the observation value of the type is a minority type, for the minority type in the data set, data are artificially synthesized by the following method to supplement the data set, and then the new supplemented data set is utilized to construct a prediction model:

(1) If there is only one minority class in the data set:

(1) For each sample $a$ of the minority class, searching for the sample $b$ of the same class that is closest in Euclidean distance and has no sample of another class between them, namely, for every sample $o$ of another class,

$y_o \neq y_a + \beta_y (y_b - y_a), \quad \beta_y \in (0, 1)$

or

$x_{o1} \neq x_{a1} + \beta_{x1} (x_{b1} - x_{a1}), \quad \beta_{x1} \in (0, 1)$

or

$x_{o2} \neq x_{a2} + \beta_{x2} (x_{b2} - x_{a2}), \quad \beta_{x2} \in (0, 1)$

…

so that there is no sample of another class between samples $a$ and $b$;

(2) Randomly selecting a point on the line connecting the two samples to generate artificially synthesized data:

$y_{ab} = y_a + \beta_y (y_b - y_a), \quad \beta_y \in (0, 1)$

$x_{ab1} = x_{a1} + \beta_{x1} (x_{b1} - x_{a1}), \quad \beta_{x1} \in (0, 1)$

$x_{ab2} = x_{a2} + \beta_{x2} (x_{b2} - x_{a2}), \quad \beta_{x2} \in (0, 1)$

…

(3) Combining the newly synthesized data with the original data to form a new data set, recalculating the proportion of the minority class in the new data set, and stopping artificially synthesizing the data if the proportion is more than or equal to 40% of the theoretical proportion; otherwise, entering (4);

(4) Using the new data set, finding three adjacent samples $a$, $b$, $c$ of the minority class such that no sample of another class lies inside the triangle with these three samples as vertices, namely, for every sample $o$ of another class written as

$y_o = c_1 y_a + c_2 y_b + c_3 y_c$

$x_{o1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$

$x_{o2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$

…

the coefficients satisfy $c_1 + c_2 + c_3 \neq 1$, or at least one of $c_1, c_2, c_3$ is not in $[0, 1]$;

(5) Randomly selecting a point in the triangle formed by the three samples to generate artificially synthesized data:

$y_{abc} = c_1 y_a + c_2 y_b + c_3 y_c$

$x_{abc1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$

$x_{abc2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$

…

$c_1 + c_2 + c_3 = 1$

$c_1, c_2, c_3 \in [0, 1]$

(6) Combining the newly synthesized data with the original data to form a new data set, recalculating the proportion of the minority class in the new data set, and stopping artificially synthesizing the data if the proportion is more than or equal to 40% of the theoretical proportion; otherwise, entering (7);

(7) Following the steps above and using the new data set, artificially synthesizing data based on four, five, six, and more samples in turn, and stopping when the proportion of the minority class in the new data set is greater than or equal to 40% of the theoretical proportion, or when no combination of minority class samples can be found that does not enclose samples of other classes;

(2) If the data set contains two or more minority classes:

(1) For each minority class, artificially synthesizing data based on two samples using methods (1) and (2) of step four (1);

(2) Combining the newly synthesized data with the original data to form a new data set and recalculating the proportion of each minority class in the new data set; for each minority class whose proportion is less than 40% of the theoretical proportion, going to (3);

(3) For each such minority class, artificially synthesizing data based on three samples using methods (4) and (5) of step four (1);

(4) Combining the newly synthesized data with the original data to form a new data set and recalculating the proportion of each minority class in the new data set; for each minority class whose proportion is still less than 40% of the theoretical proportion, going to (5);

(5) Following the steps above and using the new data set, artificially synthesizing data for each minority class based on four, five, six, and more samples in turn, and stopping when the proportion of every minority class in the new data set is greater than or equal to 40% of the theoretical proportion, or when no combination of minority class samples can be found that does not enclose samples of other classes.
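As a rough illustration of the minority class test and the two-sample synthesis stage described above, the following Python sketch may help. It is not the patented implementation. The function names `minority_classes`, `between`, and `synthesize_pairs` are hypothetical, samples are treated as plain feature tuples, "between" is read as lying strictly between the two samples in every coordinate, and a single β is shared by all coordinates so that the synthetic point lies on the connecting line.

```python
import math
import random
from collections import Counter


def minority_classes(labels, n_classes, fraction=0.2):
    """Return classes whose share of observations is below fraction * (1 / n_classes)."""
    threshold = fraction / n_classes
    counts = Counter(labels)
    return sorted(c for c, k in counts.items() if k / len(labels) < threshold)


def between(a, b, o):
    """Assumed reading of 'between': o lies strictly between a and b in every coordinate."""
    return all(min(ak, bk) < ok < max(ak, bk) for ak, bk, ok in zip(a, b, o))


def synthesize_pairs(minority, others, rng):
    """Create at most one synthetic point per minority sample (two-sample stage)."""
    synthetic = []
    for i, a in enumerate(minority):
        # Same-class candidates with no other-class sample between them and a.
        candidates = [
            b for j, b in enumerate(minority)
            if j != i and not any(between(a, b, o) for o in others)
        ]
        if not candidates:
            continue
        b = min(candidates, key=lambda s: math.dist(a, s))  # nearest by Euclidean distance
        beta = rng.random()
        while beta == 0.0:  # keep beta inside the open interval (0, 1)
            beta = rng.random()
        synthetic.append(tuple(ak + beta * (bk - ak) for ak, bk in zip(a, b)))
    return synthetic


# Illustrative labels only: 5 classes, so the minority threshold is 20% * 1/5 = 4%.
labels = [0] * 30 + [1] * 30 + [2] * 20 + [3] * 17 + [4] * 3
print(minority_classes(labels, n_classes=5))  # [4]
```

Under the same reading, synthesis for this illustrative data set would stop once the minority class reaches 40% × 1/5 = 8% of the new data set.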

In conclusion, the postoperative recovery state evaluation method provided by the invention is beneficial to effectively evaluating the postoperative recovery state of the surgical patient and predicting the postoperative recovery state of the surgical patient.

The invention is further described with reference to the following drawings and detailed description. All the technologies realized based on the above contents of the present invention belong to the scope of the present invention. It will be apparent that various other modifications, substitutions and alterations can be made in the present invention without departing from the basic technical concept of the invention as described above, according to the common technical knowledge and common practice in the field.

The present invention will be described in further detail with reference to the following examples. This should not be understood as limiting the scope of the above-described subject matter of the present invention to the following examples.

The invention is further described with reference to the following figures and detailed description. Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.

Drawings

The accompanying drawings, which form a part of this specification, are provided to aid understanding of the invention and to explain it, and do not unduly limit it. In the drawings:

FIG. 1 is a schematic flow chart of a system for constructing a prediction model of postoperative recovery state of a surgical patient according to the present invention.

FIG. 2 is a schematic diagram of artificially synthesizing data based on two samples in unbalanced data in the present invention.

FIG. 3 is a schematic diagram of artificially synthesizing data based on three samples in unbalanced data in the present invention.

Detailed Description

The invention will be described more fully hereinafter with reference to the accompanying drawings. One of ordinary skill in the art will be able to implement the invention based on this disclosure. Before the present invention is described in detail with reference to the accompanying drawings, it is to be noted that:

technical solutions and technical features provided in the respective portions including the following description in the present invention may be combined with each other without conflict.

The preferred embodiments and examples of the present invention described in the following description are generally only embodiments and examples of a part of the present invention. Therefore, all other embodiments and examples obtained by a person skilled in the art without any inventive work shall fall within the protection scope of the present invention.

The terms "comprising," "having," and any variations thereof in the description and claims of this invention and the related sections are intended to cover non-exclusive inclusions.

Other related terms and units in the invention can be reasonably construed based on the relevant contents of the invention.

Step one: acquiring time series variables and non-time series variables, and converting the time series variables into trend variables.

Step two: determining the postoperative recovery state variable of each index using the trend variables and the normal range of the index, specifically as follows:

(1) Judging whether the value of each index at each data acquisition moment is within the corresponding normal range of the index: if so, the indicator variable takes the value 1; otherwise, the indicator variable takes the value 0;

(2) Determining the value of the postoperative recovery state variable $I_i$ of each index for each patient according to clinical diagnosis and treatment standards and whether the index is within its normal range. If the trend variables are all -1, or first rise and then fall, and the value at the last data acquisition moment is within the normal range of the index, the postoperative recovery of this index is good and the postoperative recovery state variable is recorded as 1. If the values at the last two data acquisition moments are both within the normal range of the index, the postoperative recovery of this index is good and the postoperative recovery state variable is recorded as 1. If the 3 trend variables are all -1 but the value at the last data acquisition moment is not within the normal range of the index, the postoperative recovery of this index is average and the postoperative recovery state variable is recorded as 0. If the trend variable fluctuates repeatedly and the value rises, the postoperative recovery of the patient is poor and the postoperative recovery state variable is recorded as -1.

Step three:

Step four: taking the postoperative recovery state classification as the dependent variable and, among the non-time series variables, the preoperative variables and the postoperative patient behavior variables (the number of out-of-bed activities and the duration of out-of-bed activity up to 3 days after the operation) as independent variables, and constructing a prediction model using a random forest algorithm. If, among the observations, the proportion of one or more postoperative recovery state classes is less than 20% of the theoretical proportion, the data are regarded as unbalanced data, where the theoretical proportion of each class is $1/n$ and $n$ is the number of classes.

Wherein the time series variables include the VAS score, the BT score, the number of active coughs and expectorations, and the number of deep breaths. The non-time series variables include preoperative variables and postoperative variables. The preoperative variables include postoperative length of stay, gender, incision type, incision length, BMI, CA, tumor diameter, number of tumors, operation time, extent of resection, amount of bleeding, presence or absence of blood transfusion, amount of plasma transfusion, amount of suspended red blood cell transfusion, ISHAK score, tissue differentiation grade, presence or absence of capsule invasion, extent of resection, presence or absence of cancer emboli, presence or absence of satellite nodules, and preoperative PT. The postoperative variables include the 1-month postoperative quality-of-life score, presence or absence of complications, Clavien grade of surgical complications, presence or absence of readmission, number of out-of-bed activities, duration of out-of-bed activity up to 3 days after the operation, time to postoperative bowel sounds, time to postoperative anal exhaust or defecation, and the 1-month postoperative quality-of-life score.

The fourth step is specifically as follows: the postoperative recovery state is n types in total, the theoretical proportion of each type is 1/n, if the proportion of the number of patients belonging to a certain type to the total number of patients in the data set is less than 20% of 1/n, the observation value of the type is a minority type, for the minority type in the data set, data are artificially synthesized by the following method to supplement the data set, and then the new supplemented data set is utilized to construct a prediction model:

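(1) If there is only one minority class in the data set:

(1) For each sample $a$ of the minority class, searching for the sample $b$ of the same class that is closest in Euclidean distance and has no sample of another class between them, namely, for every sample $o$ of another class,

$y_o \neq y_a + \beta_y (y_b - y_a), \quad \beta_y \in (0, 1)$

or

$x_{o1} \neq x_{a1} + \beta_{x1} (x_{b1} - x_{a1}), \quad \beta_{x1} \in (0, 1)$

or

$x_{o2} \neq x_{a2} + \beta_{x2} (x_{b2} - x_{a2}), \quad \beta_{x2} \in (0, 1)$

…

so that there is no sample of another class between samples $a$ and $b$.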

(2) As shown in FIG. 2, a point is randomly selected on the line connecting the two samples to generate synthetic data:

$y_{ab} = y_a + \beta_y (y_b - y_a), \quad \beta_y \in (0, 1)$

$x_{ab1} = x_{a1} + \beta_{x1} (x_{b1} - x_{a1}), \quad \beta_{x1} \in (0, 1)$

$x_{ab2} = x_{a2} + \beta_{x2} (x_{b2} - x_{a2}), \quad \beta_{x2} \in (0, 1)$

…

(3) The newly synthesized data are combined with the original data to form a new data set, and the proportion of the minority class in the new data set is recalculated. If the proportion is greater than or equal to 40% of the theoretical proportion, artificial synthesis of data stops; otherwise, go to (4).

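(4) Using the new data set, three adjacent samples $a$, $b$, $c$ of the minority class are found such that no sample of another class lies inside the triangle with these three samples as vertices, namely, for every sample $o$ of another class written as

$y_o = c_1 y_a + c_2 y_b + c_3 y_c$

$x_{o1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$

$x_{o2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$

…

the coefficients satisfy $c_1 + c_2 + c_3 \neq 1$, or at least one of $c_1, c_2, c_3$ is not in $[0, 1]$.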

(5) As shown in FIG. 3, a point is randomly selected in the triangle formed by the three samples to generate synthetic data:

$y_{abc} = c_1 y_a + c_2 y_b + c_3 y_c$

$x_{abc1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$

$x_{abc2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$

…

$c_1 + c_2 + c_3 = 1$

$c_1, c_2, c_3 \in [0, 1]$

(6) The newly synthesized data are combined with the original data to form a new data set, and the proportion of the minority class in the new data set is recalculated. If the proportion is greater than or equal to 40% of the theoretical proportion, artificial synthesis of data stops; otherwise, go to (7).

(7) Following the steps above and using the new data set, data are artificially synthesized based on four, five, six, and more samples in turn, stopping when the proportion of the minority class in the new data set is greater than or equal to 40% of the theoretical proportion, or when no combination of minority class samples can be found that does not enclose samples of other classes.
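The three-sample stage in (4) and (5) can be sketched in Python as follows. This is only an illustration under stated assumptions, not the patented implementation: the function names `barycentric`, `inside_triangle`, and `synthesize_in_triangle` are hypothetical, samples are plain feature tuples, and the random coefficients are drawn by normalizing three exponential draws so that they are non-negative and sum to 1.

```python
import random


def barycentric(a, b, c, o):
    """Least-squares coefficients (c1, c2, c3), c1 + c2 + c3 = 1, for o = c1*a + c2*b + c3*c.

    Returns None when a, b, c are collinear (no proper triangle).
    """
    u = [bk - ak for ak, bk in zip(a, b)]
    v = [ck - ak for ak, ck in zip(a, c)]
    w = [ok - ak for ak, ok in zip(a, o)]
    uu = sum(x * x for x in u)
    vv = sum(x * x for x in v)
    uv = sum(x * y for x, y in zip(u, v))
    wu = sum(x * y for x, y in zip(w, u))
    wv = sum(x * y for x, y in zip(w, v))
    det = uu * vv - uv * uv
    if det <= 1e-12:
        return None
    c2 = (wu * vv - wv * uv) / det
    c3 = (wv * uu - wu * uv) / det
    return (1.0 - c2 - c3, c2, c3)


def inside_triangle(a, b, c, o, tol=1e-9):
    """True if o equals c1*a + c2*b + c3*c with c1 + c2 + c3 = 1 and all ci in [0, 1]."""
    coeffs = barycentric(a, b, c, o)
    if coeffs is None:
        return False
    c1, c2, c3 = coeffs
    on_plane = all(
        abs(c1 * ak + c2 * bk + c3 * ck - ok) <= tol
        for ak, bk, ck, ok in zip(a, b, c, o)
    )
    return on_plane and all(0.0 <= k <= 1.0 for k in coeffs)


def synthesize_in_triangle(a, b, c, others, rng):
    """Return one synthetic point inside triangle abc, or None if the triangle is unusable."""
    if barycentric(a, b, c, a) is None:
        return None  # degenerate triangle
    if any(inside_triangle(a, b, c, o) for o in others):
        return None  # a sample of another class lies inside the triangle
    draws = [rng.expovariate(1.0) for _ in range(3)]
    total = sum(draws)
    c1, c2, c3 = (d / total for d in draws)
    return tuple(c1 * ak + c2 * bk + c3 * ck for ak, bk, ck in zip(a, b, c))


# Illustrative coordinates only.
rng = random.Random(1)
point = synthesize_in_triangle((0.0, 0.0), (4.0, 0.0), (0.0, 4.0), [(5.0, 5.0)], rng)
print(point)
```

The same containment test extends in principle to four or more samples, but that generalization is not shown here.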

(2) If the data set contains two or more minority classes:

(1) For each minority class, data are artificially synthesized based on two samples using methods (1) and (2) of step four (1).

(2) The newly synthesized data are combined with the original data to form a new data set, and the proportion of each minority class in the new data set is recalculated. For each minority class whose proportion is less than 40% of the theoretical proportion, go to (3).

(3) For each such minority class, data are artificially synthesized based on three samples using methods (4) and (5) of step four (1).

(4) The newly synthesized data are combined with the original data to form a new data set, and the proportion of each minority class in the new data set is recalculated. For each minority class whose proportion is still less than 40% of the theoretical proportion, go to (5).

The non-time series variable indexes are as shown in the following table 1;

TABLE 1

The invention is further illustrated below using the CRP index as an example, with reference to Table 1 and FIG. 1:

(1) For each type of time series variable $X_i = (X_{i,t_1}, X_{i,t_2}, \ldots, X_{i,t_j}, \ldots)$, calculating the variation between adjacent data acquisition times $t-1$ and $t$, where $i$ denotes the type of time series variable. For example, the variation of the VAS score at 1 day after the operation is the difference between the VAS score at 1 day and the VAS score at 8 hours after the operation, and the variation of HB at 3 days after the operation is the difference between HB at 3 days and HB at 1 day after the operation:

$\Delta_{HB,3} = HB_3 - HB_1$

(2) Determining the value of the trend variable $TR_{i,t}$ of each time series variable at each time. If the variation is negative, the corresponding trend variable takes the value -1; if the variation is non-negative, the corresponding trend variable takes the value 1.
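The conversion from a time series to trend variables can be sketched in a few lines of Python. This is a minimal illustration, not the patented implementation; the function name `trend_variables` and the sample readings are hypothetical.

```python
def trend_variables(values):
    """Return the trend variable for each pair of adjacent observations.

    values: observations of one time series variable in acquisition order.
    A negative variation maps to -1; a non-negative variation maps to 1.
    """
    trends = []
    for previous, current in zip(values, values[1:]):
        variation = current - previous
        trends.append(-1 if variation < 0 else 1)
    return trends


# Hypothetical HB readings (illustrative values only, not patient data).
hb = [120, 112, 112, 118]
print(trend_variables(hb))  # [-1, 1, 1]
```

Note that an unchanged value (variation 0) is treated as non-negative and therefore yields 1, as the rule above states.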

The normal reference range for CRP is 800-8000 μg/L.

(2) Determining the value of the postoperative recovery state variable $I_i$ of each index for each patient according to clinical diagnosis and treatment standards and whether the index is within its normal range. If the trend variables are all -1, or first rise and then fall, and the value at the last data acquisition moment is within the normal range of the index, the postoperative recovery of this index is good and the postoperative recovery state variable is recorded as 1. If the values at the last two data acquisition moments are both within the normal range of the index, the postoperative recovery of this index is good and the postoperative recovery state variable is recorded as 1. If the 3 trend variables are all -1 but the value at the last data acquisition moment is not within the normal range of the index, the postoperative recovery of this index is average and the postoperative recovery state variable is recorded as 0. In other cases, such as a trend variable that fluctuates repeatedly with rising values, the postoperative recovery of the patient is poor and the postoperative recovery state variable is recorded as -1.

Elevated CRP indicates a heightened inflammatory response in the body. CRP should first rise and then fall by 5 days after surgery; if CRP does not fall, or rises again, this suggests possible complications such as infection or thromboembolism. Thus, for CRP there are 4 postoperative observations, including $CRP_1$, $CRP_2$, $CRP_3$, corresponding to 3 trend variables $TR_{CRP,1}$, $TR_{CRP,2}$, $TR_{CRP,3}$ and 4 indicator variables showing whether the index is normal, including $N_{CRP,1}$, $N_{CRP,2}$, $N_{CRP,3}$.

If the 3 trend variables are all -1 ($TR_{CRP,1} = TR_{CRP,2} = TR_{CRP,3} = -1$), or first rise and then fall ($TR_{CRP,1} = 1$, $TR_{CRP,2} = TR_{CRP,3} = -1$, or $TR_{CRP,1} = TR_{CRP,2} = 1$, $TR_{CRP,3} = -1$), and the value at the last data acquisition moment is within the normal range of the index ($N_{CRP,3} = 1$), the postoperative recovery of the patient's CRP index is good and the CRP postoperative recovery state variable is recorded as 1, $I_{CRP} = 1$. If the values at the last two data acquisition moments are within the normal range of the index ($N_{CRP,2} = N_{CRP,3} = 1$), the postoperative recovery of the patient's CRP index is good and the CRP postoperative recovery state variable is recorded as 1, $I_{CRP} = 1$. If the 3 trend variables are all -1 but the value at the last data acquisition moment is not within the normal range of the index ($N_{CRP,3} = 0$), the postoperative recovery of the patient's CRP index is average and the CRP postoperative recovery state variable is recorded as 0, $I_{CRP} = 0$.

In other cases, such as a trend that fluctuates repeatedly with rising values, the postoperative recovery of the patient's CRP index is poor and the CRP postoperative recovery state variable is recorded as -1, $I_{CRP} = -1$. By this step, each type of time series variable $X_i$ is converted into a postoperative recovery state variable $I_i$.
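The CRP rule above can be written as a small decision function. The Python sketch below is one possible reading, not the patented implementation: the function names are hypothetical, the normal range is assumed to include its end points, "rise then fall" is read as one or more 1s followed only by -1s, and the rule is applied to any number of trend variables rather than exactly 3.

```python
def trend_variables(values):
    """-1 for a negative change between adjacent observations, otherwise 1."""
    return [-1 if cur - prev < 0 else 1 for prev, cur in zip(values, values[1:])]


def recovery_state(values, low, high):
    """Return the recovery state variable (1, 0, or -1) for one index of one patient."""
    if len(values) < 2:
        raise ValueError("at least two observations are needed")
    trends = trend_variables(values)
    # Indicator per acquisition time: 1 if within [low, high] (inclusive bounds assumed).
    normal = [1 if low <= v <= high else 0 for v in values]

    all_falling = all(t == -1 for t in trends)
    # Rise then fall: one or more 1s followed only by -1s.
    first_fall = trends.index(-1) if -1 in trends else len(trends)
    rise_then_fall = 0 < first_fall < len(trends) and all(
        t == -1 for t in trends[first_fall:]
    )
    last_normal = normal[-1] == 1
    last_two_normal = normal[-2:] == [1, 1]

    if (all_falling or rise_then_fall) and last_normal:
        return 1
    if last_two_normal:
        return 1
    if all_falling and not last_normal:
        return 0
    return -1


# Hypothetical CRP readings in μg/L (illustrative only), normal range 800-8000.
print(recovery_state([60000, 90000, 30000, 7000], 800, 8000))  # 1
print(recovery_state([60000, 40000, 20000, 9000], 800, 8000))  # 0
print(recovery_state([9000, 7000, 12000, 15000], 800, 8000))   # -1
```

The first series rises and then falls into the normal range, the second falls throughout but ends above the range, and the third ends with rising values outside the range.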

Step three:

Step four: taking the postoperative recovery state classification as the dependent variable and, among the non-time series variables, the preoperative variables and the postoperative patient behavior variables (the number of out-of-bed activities and the duration of out-of-bed activity up to 3 days after the operation) as independent variables, and constructing a prediction model using a random forest algorithm. If, among the observations, the proportion of one or more postoperative recovery state classes is less than 20% of the theoretical proportion, the data are regarded as unbalanced data, where the theoretical proportion of each class is $1/n$ and $n$ is the number of classes.

Wherein the time series variables include the VAS score, the BT score, the number of active coughs and expectorations, and the number of deep breaths. The non-time series variables include preoperative variables and postoperative variables. The preoperative variables include postoperative length of stay, gender, incision type, incision length, BMI, CA, tumor diameter, number of tumors, operation time, extent of resection, amount of bleeding, presence or absence of blood transfusion, amount of plasma transfusion, amount of suspended red blood cell transfusion, ISHAK score, tissue differentiation grade, presence or absence of capsule invasion, extent of resection, presence or absence of cancer emboli, presence or absence of satellite nodules, and preoperative PT. The postoperative variables include the 1-month postoperative quality-of-life score, presence or absence of complications, Clavien grade of surgical complications, presence or absence of readmission, number of out-of-bed activities, duration of out-of-bed activity up to 3 days after the operation, time to postoperative bowel sounds (hours), time to postoperative anal exhaust or defecation (hours), and the 1-month postoperative quality-of-life score.

Step four is specifically as follows. The postoperative recovery state has $n$ classes in total, and the theoretical proportion of each class is $1/n$. If the proportion of the number of patients belonging to a class to the total number of patients in the data set is less than 20% of $1/n$, the observations of that class form a minority class. Assuming that the postoperative recovery states of the patients obtained in step three fall into 5 classes, the theoretical proportion of each class is $1/5 = 20\%$, and the observations of a class form a minority class if the proportion of the number of patients belonging to that class to the total number of patients in the data set is less than $20\% \times 20\% = 4\%$. For the minority classes in the data, the following method is used to artificially synthesize data and supplement the data set, and the supplemented new data set is then used to construct the prediction model:

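(1) If there is only one minority class in the data set:

(1) For each sample $a$ of the minority class and its nearest same-class sample $b$ with no sample of another class between them, a point is randomly selected on the line connecting the two samples to generate synthetic data:

$y_{ab} = y_a + \beta_y (y_b - y_a), \quad \beta_y \in (0, 1)$

$x_{ab1} = x_{a1} + \beta_{x1} (x_{b1} - x_{a1}), \quad \beta_{x1} \in (0, 1)$

$x_{ab2} = x_{a2} + \beta_{x2} (x_{b2} - x_{a2}), \quad \beta_{x2} \in (0, 1)$

…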

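(4) Using the new data set, three adjacent samples $a$, $b$, $c$ of the minority class are found such that no sample $o$ of another class lies inside the triangle they form, where a sample inside the triangle would be written as

$y_o = c_1 y_a + c_2 y_b + c_3 y_c$

$x_{o1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$

$x_{o2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$

…

A point is then randomly selected in the triangle to generate synthetic data:

$y_{abc} = c_1 y_a + c_2 y_b + c_3 y_c$

$x_{abc1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$

$x_{abc2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$

…

$c_1 + c_2 + c_3 = 1$

$c_1, c_2, c_3 \in [0, 1]$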

(2) If the data set contains two or more minority classes:

(1) For each minority class, data are artificially synthesized based on two samples using methods (1) and (2) of step four (1).

(2) The newly synthesized data are combined with the original data to form a new data set, and the proportion of each minority class in the new data set is recalculated. For each minority class whose proportion is less than 40% of the theoretical proportion, go to (3).

(3) For each such minority class, data are artificially synthesized based on three samples using methods (4) and (5) of step four (1).

(4) The newly synthesized data are combined with the original data to form a new data set, and the proportion of each minority class in the new data set is recalculated. For each minority class whose proportion is still less than 40% of the theoretical proportion, go to (5).

In summary, the invention (1) converts the time series variables describing the postoperative recovery state of the patient into trend variables and then forms, for each postoperative recovery index, a comprehensive indicator variable of the postoperative recovery state; and (2) if the data set is unbalanced, artificially synthesizes data for the minority classes within the regions whose vertices are minority class samples, and trains the prediction model on a new data set containing both the newly synthesized data and the original data. The postoperative recovery state of the surgical patient can thus be effectively evaluated and predicted.

The contents of the present invention have been explained above. Those skilled in the art will be able to practice the invention based on these descriptions. Based on the above disclosure of the present invention, all other preferred embodiments and examples obtained by a person skilled in the art without any inventive step should fall within the scope of protection of the present invention.

Claims

1. A system for constructing a prediction model of the postoperative recovery state of a surgical patient, characterized in that the system is implemented by the following steps:

step one:

acquiring time series variables and non-time series variables, and converting the time series variables into trend variables, wherein the method comprises the following steps:

step two:

determining the postoperative recovery state variable of each index by using the trend variable and the normal range of the index, wherein the method specifically comprises the following steps:

(2) Determining the value of the postoperative recovery state variable I_i of each index of each patient according to the clinical diagnostic criteria and whether the trend variable is within the normal range of the index: if the trend variables are all -1, or the trend variable first rises and then falls, and the value of the trend variable at the last data acquisition moment is within the normal range of the index, the patient's postoperative recovery on this index is good, and the postoperative recovery state variable is recorded as 1; if the values of the trend variable at the last two data acquisition moments are both within the normal range of the index, the patient's postoperative recovery on this index is good, and the postoperative recovery state variable is recorded as 1; if the 3 trend variables are all -1 but the value of the trend variable at the last data acquisition moment is not within the normal range of the index, the patient's postoperative recovery on this index is average, and the postoperative recovery state variable is recorded as 0; if the trend variable fluctuates repeatedly and its value increases, the patient's postoperative recovery on this index is poor, and the postoperative recovery state variable is recorded as -1;

step three:

step four:

taking the postoperative recovery state classification as the dependent variable, and taking the preoperative variables, the postoperative patient behavior variables, and the number and duration of bed leaving activities in the 3 days after operation among all non-time-series variables as independent variables, constructing a prediction model using a random forest algorithm; if the proportion of one or more postoperative recovery state classifications among all observed values is less than 20% of the theoretical proportion, the data are unbalanced data, wherein the theoretical proportion of each classification is equal to 1/n, n being the total number of postoperative recovery state classifications;

wherein the time series variables comprise VAS score, BI score, number of active coughs and expectorations, and number of deep breaths; the non-time series variables comprise preoperative variables and postoperative variables, the preoperative variables comprising postoperative length of stay, gender, incision type, incision length, BMI, CA, tumor diameter, tumor number, time of operation, extent of resection, amount of bleeding, presence or absence of blood transfusion, amount of plasma transfusion, amount of suspended red blood cell transfusion, ISHAK score, tissue differentiation grade, presence or absence of capsular invasion, extent of resection, presence or absence of cancer emboli, presence or absence of satellite nodules, and preoperative PT; the postoperative variables comprise postoperative 1-month quality of life score, presence or absence of complications, Clavien grading of operative complications, presence or absence of readmission, number of bed leaving activities and duration of bed leaving activity in the 3 days after operation, time to postoperative bowel sounds, time to postoperative anal exhaust or defecation, and postoperative 1-month quality of life score.

2. The system for constructing a prediction model of the postoperative recovery state of a surgical patient according to claim 1, wherein step four is specifically: there are n postoperative recovery state classes in total, and the theoretical proportion of each class is 1/n; if the ratio of the number of patients belonging to a class to the total number of patients in the data set is less than 20% of 1/n, the observations of that class form a minority class; for the minority classes in the data set, data are artificially synthesized by the following method to supplement the data set, and the supplemented new data set is then used to construct the prediction model:

(1) If there is only one minority class in the dataset

(1) For a minority class

Each sample of

Without other kinds of samples between them

Namely, it is

Or

Or

…

and there are no samples of other classes between the two samples

$y_{ab} = y_a + \beta_y (y_b - y_a), \quad \beta_y \in (0, 1)$

$x_{ab1} = x_{a1} + \beta_{x1} (x_{b1} - x_{a1}), \quad \beta_{x1} \in (0, 1)$

$x_{ab2} = x_{a2} + \beta_{x2} (x_{b2} - x_{a2}), \quad \beta_{x2} \in (0, 1)$

…

(4) using the new data set, finding three adjacent samples in the minority class

that is,

$y_o = c_1 y_a + c_2 y_b + c_3 y_c$

$x_{o1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$

$x_{o2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$

…

(5) Randomly selecting a point in a triangle formed by the three samples to generate artificial synthetic data

$y_{abc} = c_1 y_a + c_2 y_b + c_3 y_c$

$x_{abc1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$

$x_{abc2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$

…

$c_1 + c_2 + c_3 = 1$

(7) following the above steps and using the new data set, artificially synthesizing data based on four, five, six, and more samples in turn, and stopping the artificial synthesis of data when the proportion of minority class samples in the new data set is greater than or equal to 40% of the theoretical proportion, or when no combination of minority class samples that does not cover samples of other classes can be found;

(2) If the data set contains two or more minority classes

(1) For each minority class

Entering (3);

(3) for each minority class

(4) combining the newly synthesized data with the original data to form a new data set, recalculating the proportion of each minority class in the new data set, and for the minority classes with the proportion less than 40% of the theoretical proportion

Entering (5);

(5) following the above steps and using the new data set, artificially synthesizing data for each minority class based on four, five, six, and more samples in turn, until the proportion of every minority class in the new data set is greater than or equal to 40% of the theoretical proportion, or no combination of minority class samples that does not cover samples of other classes can be found.