CN113192642B - Surgical patient postoperative recovery state prediction model construction system - Google Patents

Info

Publication number
CN113192642B
Authority
CN
China
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110357379.1A
Other languages
Chinese (zh)
Other versions
CN113192642A (en)
Inventor
胡艳杰
房圆晨
徐湖洋
曾思瑜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
West China Hospital of Sichuan University
Original Assignee
West China Hospital of Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by West China Hospital of Sichuan University
Priority to CN202110357379.1A
Publication of CN113192642A
Application granted
Publication of CN113192642B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/50: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses a system for constructing a prediction model of the postoperative recovery state of a surgical patient, implemented by the following steps. Step one: acquire time series variables and non-time series variables, and convert the time series variables into trend variables. Step two: determine the postoperative recovery state variable of each index from the trend variables and the normal range of the index. Step three: classify the postoperative recovery state of the patient with a Support Vector Machine (SVM) algorithm, using the postoperative recovery state variables converted from all time series variables together with the non-time series variables other than the number of out-of-bed activities and the duration of out-of-bed activity up to 3 days after the operation. Step four: take the postoperative recovery state classification as the dependent variable and, among the non-time series variables, the preoperative variables and the postoperative patient behavior variables (the number of out-of-bed activities and the duration of out-of-bed activity up to 3 days after the operation) as independent variables, and construct a prediction model with a random forest algorithm.

Description

Surgical patient postoperative recovery state prediction model construction system
Technical Field
The invention relates to the technical field of medical treatment, in particular to a system for constructing a prediction model of postoperative recovery state of a surgical patient.
Background
In the prior art, there is no effective method for assessing the recovery state of a postoperative patient, and the recovery state of the patient is difficult to predict.
Disclosure of Invention
In order to solve the technical problems in the prior art, the invention discloses a postoperative recovery state prediction model construction system for a surgical patient, so as to predict the recovery state of the postoperative patient.
The invention discloses a system for constructing a prediction model of the postoperative recovery state of a surgical patient, which is implemented by the following steps:
Step one: acquire time series variables and non-time series variables, and convert the time series variables into trend variables, as follows:
(1) For each type of time series variable $X_i = (X_{i,t_1}, X_{i,t_2}, \ldots, X_{i,t_j}, \ldots)$, where $i$ denotes the type of time series variable, calculate the change between adjacent data acquisition times $t-1$ and $t$:
$$\Delta_{i,t} = X_{i,t} - X_{i,t-1}$$
(2) Determine the value of the trend variable $TR_{i,t}$ of each time series variable at each time. If the change is negative, the corresponding trend variable takes the value -1; if the change is non-negative, it takes the value 1:
$$TR_{i,t} = \begin{cases} -1, & \Delta_{i,t} < 0 \\ 1, & \Delta_{i,t} \ge 0 \end{cases}$$
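Step one can be sketched in a few lines of Python. This is a minimal illustration and not the patented implementation: the function names `changes` and `trend_variables` and the sample VAS values are hypothetical, and the only rule taken from the text is that a negative change maps to -1 and a non-negative change maps to 1.

```python
from typing import List, Sequence


def changes(series: Sequence[float]) -> List[float]:
    """Change between each pair of adjacent acquisition times (t-1, t)."""
    return [curr - prev for prev, curr in zip(series, series[1:])]


def trend_variables(series: Sequence[float]) -> List[int]:
    """Trend variable per change: -1 if the change is negative, else 1."""
    return [-1 if delta < 0 else 1 for delta in changes(series)]


# Hypothetical VAS scores at four successive acquisition times.
vas = [6.0, 4.0, 4.0, 5.0]
print(changes(vas))          # [-2.0, 0.0, 1.0]
print(trend_variables(vas))  # [-1, 1, 1]
```

A series of k observations yields k-1 trend variables, which matches the later example in which 4 observations correspond to 3 trend variables.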
Step two: determine the postoperative recovery state variable of each index from the trend variables and the normal range of the index, as follows:
(1) At each data acquisition time, judge whether the value of each variable is within the normal range of the corresponding index. If it is, the indicator variable $N_{i,t}$ takes the value 1; otherwise it takes the value 0:
$$N_{i,t} = \begin{cases} 1, & X_{i,t} \text{ is within the normal range of index } i \\ 0, & \text{otherwise} \end{cases}$$
(2) Determine the value of the postoperative recovery state variable $I_i$ of each index of each patient according to the clinical diagnosis standard and whether the variable is within the normal range of the index. If the trend variables are all -1, or the trend first rises and then falls, and the value at the last data acquisition time is within the normal range of the index, the postoperative recovery of this index is good, and the postoperative recovery state variable is recorded as 1; if the values at the last two data acquisition times are both within the normal range of the index, the postoperative recovery of this index is good, and the postoperative recovery state variable is recorded as 1; if the 3 trend variables are all -1 but the value at the last data acquisition time is not within the normal range of the index, the postoperative recovery of this index is fair, and the postoperative recovery state variable is recorded as 0; if the trend variable fluctuates repeatedly and the value rises, the postoperative recovery is poor, and the postoperative recovery state variable is recorded as -1.
Step three: using a Support Vector Machine (SVM) algorithm, classify the postoperative recovery state of each patient according to the postoperative recovery state variables converted from all time series variables and the non-time series variables other than the number of out-of-bed activities and the duration of out-of-bed activity up to 3 days after the operation:
Figure GDA0003893974260000021
Step four: take the postoperative recovery state classification as the dependent variable and, among the non-time series variables, the preoperative variables and the postoperative patient behavior variables (the number of postoperative out-of-bed activities and the duration of postoperative out-of-bed activity up to 3 days after the operation) as independent variables, and construct a prediction model with a random forest algorithm. If the proportion of observations in one or more postoperative recovery state classes is less than 20% of the theoretical proportion, the data are regarded as unbalanced data, where the theoretical proportion of each of the $n$ classes is
$$\frac{1}{n}$$
The time series variables comprise VAS scores, BT scores, the number of active coughs and expectorations, and the number of deep breaths. The non-time series variables include preoperative variables and postoperative variables. The preoperative variables include postoperative length of stay, sex, type of incision, length of incision, BMI, CA, tumor diameter, number of tumors, operative time, extent of resection, amount of bleeding, presence or absence of blood transfusion, amount of plasma transfusion, amount of suspended red blood cell transfusion, ISHAK score, tissue differentiation grade, presence or absence of capsule invasion, extent of resection, presence or absence of cancer emboli, presence or absence of satellite nodules, and preoperative PT. The postoperative variables include postoperative 1-month quality-of-life score, presence or absence of complications, Clavien grade of operative complications, presence or absence of readmission, number of out-of-bed activities, duration of out-of-bed activity up to 3 days after the operation, time to postoperative bowel sounds, time to postoperative anal exhaust or defecation, and postoperative 1-month quality-of-life score.
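The imbalance criterion of step four (a class counts as a minority class when its share of all observations is below 20% of the theoretical proportion 1/n) can be checked as in the sketch below. The function name `minority_classes`, the parameter names, and the label counts are hypothetical; a class with no observations at all would not appear in the labels and is therefore not reported by this sketch.

```python
from collections import Counter
from typing import List, Sequence


def minority_classes(labels: Sequence[int], n_classes: int,
                     threshold: float = 0.20) -> List[int]:
    """Classes whose observed share is below threshold * (1 / n_classes)."""
    theoretical = 1.0 / n_classes          # theoretical proportion of each class
    counts = Counter(labels)
    total = len(labels)
    return sorted(c for c, k in counts.items()
                  if k / total < threshold * theoretical)


# Hypothetical data set: 3 recovery-state classes, 100 patients.
labels = [1] * 60 + [0] * 37 + [-1] * 3
print(minority_classes(labels, n_classes=3))  # [-1]
```

With three classes the cut-off is 20% of 1/3, roughly 6.7% of all patients, so the class holding 3 of 100 patients is flagged and the data would be treated as unbalanced.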
Further, step four is specifically as follows. The postoperative recovery state has $n$ classes in total, and the theoretical proportion of each class is $1/n$. If the proportion of the number of patients belonging to a class to the total number of patients in the data set is less than 20% of $1/n$, that class is a minority class. For the minority classes in the data set, data are artificially synthesized by the following method to supplement the data set, and the supplemented data set is then used to construct the prediction model:
(1) If the data set contains only one minority class
(1) For each sample $(y_a, x_{a1}, x_{a2})$ of the minority class, search for the sample $(y_b, x_{b1}, x_{b2})$ of the same class that is closest in Euclidean distance and has no sample $(y_o, x_{o1}, x_{o2})$ of another class between the two, namely
$$y_o \ne y_a + \beta_y (y_b - y_a), \quad \beta_y \in (0,1)$$
or
$$x_{o1} \ne x_{a1} + \beta_{x1} (x_{b1} - x_{a1}), \quad \beta_{x1} \in (0,1)$$
or
$$x_{o2} \ne x_{a2} + \beta_{x2} (x_{b2} - x_{a2}), \quad \beta_{x2} \in (0,1)$$
so that there is no sample of another class between the two minority-class samples;
(2) Randomly select a point on the line connecting the two samples to generate an artificially synthesized sample $(y_{ab}, x_{ab1}, x_{ab2})$:
$$y_{ab} = y_a + \beta_y (y_b - y_a), \quad \beta_y \in (0,1)$$
$$x_{ab1} = x_{a1} + \beta_{x1} (x_{b1} - x_{a1}), \quad \beta_{x1} \in (0,1)$$
$$x_{ab2} = x_{a2} + \beta_{x2} (x_{b2} - x_{a2}), \quad \beta_{x2} \in (0,1)$$
(3) Combine the newly synthesized data with the original data to form a new data set, and recalculate the proportion of the minority class in the new data set. If the proportion is greater than or equal to 40% of the theoretical proportion, stop artificially synthesizing data; otherwise, go to (4);
(4) Using the new data set, find three adjacent samples $(y_a, x_{a1}, x_{a2})$, $(y_b, x_{b1}, x_{b2})$, $(y_c, x_{c1}, x_{c2})$ in the minority class such that no sample $(y_o, x_{o1}, x_{o2})$ of another class lies inside the triangle with these three samples as vertices. That is, whenever
$$y_o = c_1 y_a + c_2 y_b + c_3 y_c$$
$$x_{o1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$$
$$x_{o2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$$
the coefficients satisfy $c_1 + c_2 + c_3 \ne 1$, or at least one of $c_1, c_2, c_3$ is not in $[0,1]$;
(5) Randomly select a point inside the triangle formed by the three samples to generate an artificially synthesized sample $(y_{abc}, x_{abc1}, x_{abc2})$:
$$y_{abc} = c_1 y_a + c_2 y_b + c_3 y_c$$
$$x_{abc1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$$
$$x_{abc2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$$
$$c_1 + c_2 + c_3 = 1, \quad c_1, c_2, c_3 \in [0,1]$$
(6) Combine the newly synthesized data with the original data to form a new data set, and recalculate the proportion of the minority class in the new data set. If the proportion is greater than or equal to 40% of the theoretical proportion, stop artificially synthesizing data; otherwise, go to (7);
(7) Following the above steps and using the new data set, artificially synthesize data based on four, five, six, and more samples in turn, until the proportion of the minority class in the new data set is greater than or equal to 40% of the theoretical proportion, or until no combination of minority-class samples that does not enclose samples of other classes can be found; then stop artificially synthesizing data;
(2) If the data set contains two or more minority classes
(1) For each minority class, artificially synthesize data based on two samples using methods (1) and (2) of step four (1);
(2) Combine the newly synthesized data with the original data to form a new data set, and recalculate the proportion of each minority class in the new data set; for each minority class whose proportion is less than 40% of the theoretical proportion, go to (3);
(3) For each such minority class, artificially synthesize data based on three samples using methods (4) and (5) of step four (1);
(4) Combine the newly synthesized data with the original data to form a new data set, and recalculate the proportion of each minority class in the new data set; for each minority class whose proportion is less than 40% of the theoretical proportion, go to (5);
(5) Following the above steps and using the new data set, artificially synthesize data for each minority class based on four, five, six, and more samples in turn, until the proportion of every minority class in the new data set is greater than or equal to 40% of the theoretical proportion, or until no combination of minority-class samples that does not enclose samples of other classes can be found; then stop artificially synthesizing data.
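The two-sample synthesis described above can be sketched as follows. This is an illustration under stated assumptions rather than the patented procedure: samples are plain coordinate tuples, the helper names (`lies_between`, `nearest_clear_neighbour`, `synthesize_on_segment`) and the tolerance `tol` are hypothetical, the neighbour chosen is the nearest one among those with a clear connecting segment, and a single interpolation factor is used for all coordinates so that the synthetic point lies on the connecting line.

```python
import math
import random
from typing import Optional, Sequence, Tuple

Point = Tuple[float, ...]


def lies_between(o: Point, a: Point, b: Point, tol: float = 1e-9) -> bool:
    """True if o lies on the open segment between a and b (within tol)."""
    ab = [bj - aj for aj, bj in zip(a, b)]
    ao = [oj - aj for aj, oj in zip(a, o)]
    denom = sum(d * d for d in ab)
    if denom == 0.0:
        return False
    t = sum(p * q for p, q in zip(ao, ab)) / denom  # projection parameter
    if not 0.0 < t < 1.0:
        return False
    proj = [aj + t * d for aj, d in zip(a, ab)]
    return math.dist(o, proj) <= tol


def nearest_clear_neighbour(a: Point, minority: Sequence[Point],
                            others: Sequence[Point]) -> Optional[Point]:
    """Nearest same-class sample with no other-class sample between."""
    candidates = sorted((p for p in minority if p != a),
                        key=lambda p: math.dist(a, p))
    for b in candidates:
        if not any(lies_between(o, a, b) for o in others):
            return b
    return None


def synthesize_on_segment(a: Point, b: Point, rng: random.Random) -> Point:
    """Random point a + beta * (b - a) with beta in the open interval (0, 1)."""
    beta = rng.random()
    while beta == 0.0:  # random() can return 0.0; keep beta strictly positive
        beta = rng.random()
    return tuple(aj + beta * (bj - aj) for aj, bj in zip(a, b))


# Hypothetical samples: three minority-class points and one other-class point.
a = (0.0, 0.0, 0.0)
minority = [a, (2.0, 0.0, 0.0), (0.0, 3.0, 0.0)]
others = [(1.0, 0.0, 0.0)]  # sits exactly between a and (2, 0, 0)
b = nearest_clear_neighbour(a, minority, others)
print(b)  # (0.0, 3.0, 0.0)
print(synthesize_on_segment(a, b, random.Random(0)))
```

The closer neighbour (2, 0, 0) is rejected because an other-class sample sits on the connecting segment, so the next-nearest minority sample is used instead.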
In conclusion, the postoperative recovery state evaluation method provided by the invention is beneficial to effectively evaluating the postoperative recovery state of the surgical patient and predicting the postoperative recovery state of the surgical patient.
The invention is further described with reference to the following drawings and detailed description. All technologies realized on the basis of the above content belong to the scope of the present invention. Various other modifications, substitutions, and alterations can evidently be made without departing from the basic technical concept of the invention described above, according to common technical knowledge and customary practice in the field.
The present invention is described in further detail with reference to the following examples. This should not be understood as limiting the scope of the above subject matter of the present invention to the following examples.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, are included to assist in understanding the invention and to explain it, and do not unduly limit the invention. In the drawings:
FIG. 1 is a schematic flow chart of a system for constructing a prediction model of postoperative recovery state of a surgical patient according to the present invention.
FIG. 2 is a schematic diagram of artificially synthesizing data based on two samples in unbalanced data in the present invention.
FIG. 3 is a schematic diagram of artificially synthesizing data based on three samples in unbalanced data in the present invention.
Detailed Description
The invention will be described more fully hereinafter with reference to the accompanying drawings. One of ordinary skill in the art will be able to implement the invention based on this disclosure. Before the present invention is described in detail with reference to the accompanying drawings, it is to be noted that:
technical solutions and technical features provided in the respective portions including the following description in the present invention may be combined with each other without conflict.
The preferred embodiments and examples of the present invention described in the following description are generally only embodiments and examples of a part of the present invention. Therefore, all other embodiments and examples obtained by a person skilled in the art without any inventive work shall fall within the protection scope of the present invention.
The terms "comprising," "having," and any variations thereof in the description and claims of this invention and the related sections are intended to cover non-exclusive inclusions.
Other related terms and units in the invention can be reasonably construed based on the relevant contents of the invention.
The invention discloses a system for constructing a prediction model of the postoperative recovery state of a surgical patient, which is implemented by the following steps:
Step one: acquire time series variables and non-time series variables, and convert the time series variables into trend variables, as follows:
(1) For each type of time series variable $X_i = (X_{i,t_1}, X_{i,t_2}, \ldots, X_{i,t_j}, \ldots)$, where $i$ denotes the type of time series variable, calculate the change between adjacent data acquisition times $t-1$ and $t$:
$$\Delta_{i,t} = X_{i,t} - X_{i,t-1}$$
(2) Determine the value of the trend variable $TR_{i,t}$ of each time series variable at each time. If the change is negative, the corresponding trend variable takes the value -1; if the change is non-negative, it takes the value 1:
$$TR_{i,t} = \begin{cases} -1, & \Delta_{i,t} < 0 \\ 1, & \Delta_{i,t} \ge 0 \end{cases}$$
Step two: determine the postoperative recovery state variable of each index from the trend variables and the normal range of the index, specifically as follows:
(1) At each data acquisition time, judge whether the value of each variable is within the normal range of the corresponding index. If it is, the indicator variable $N_{i,t}$ takes the value 1; otherwise it takes the value 0:
$$N_{i,t} = \begin{cases} 1, & X_{i,t} \text{ is within the normal range of index } i \\ 0, & \text{otherwise} \end{cases}$$
(2) Determine the value of the postoperative recovery state variable $I_i$ of each index of each patient according to the clinical diagnosis and treatment standard and whether the variable is within the normal range of the index. If the trend variables are all -1, or the trend first rises and then falls, and the value at the last data acquisition time is within the normal range of the index, the postoperative recovery of this index is good, and the postoperative recovery state variable is recorded as 1; if the values at the last two data acquisition times are both within the normal range of the index, the postoperative recovery of this index is good, and the postoperative recovery state variable is recorded as 1; if the 3 trend variables are all -1 but the value at the last data acquisition time is not within the normal range of the index, the postoperative recovery of this index is fair, and the postoperative recovery state variable is recorded as 0; if the trend variable fluctuates repeatedly and the value rises, the postoperative recovery is poor, and the postoperative recovery state variable is recorded as -1.
Step three: using a Support Vector Machine (SVM) algorithm, classify the postoperative recovery state of each patient according to the postoperative recovery state variables converted from all time series variables and the non-time series variables other than the number of out-of-bed activities and the duration of out-of-bed activity up to 3 days after the operation:
Figure GDA0003893974260000064
Step four: take the postoperative recovery state classification as the dependent variable and, among the non-time series variables, the preoperative variables and the postoperative patient behavior variables (the number of postoperative out-of-bed activities and the duration of postoperative out-of-bed activity up to 3 days after the operation) as independent variables, and construct a prediction model with a random forest algorithm. If the proportion of observations in one or more postoperative recovery state classes is less than 20% of the theoretical proportion, the data are regarded as unbalanced data, where the theoretical proportion of each of the $n$ classes is
$$\frac{1}{n}$$
The time series variables comprise VAS scores, BT scores, the number of active coughs and expectorations, and the number of deep breaths. The non-time series variables include preoperative variables and postoperative variables. The preoperative variables include postoperative length of stay, gender, incision type, incision length, BMI, CA, tumor diameter, tumor number, operative time, extent of resection, amount of bleeding, presence or absence of blood transfusion, amount of plasma transfusion, amount of suspended red blood cell transfusion, ISHAK score, tissue differentiation grade, presence or absence of capsule invasion, extent of resection, presence or absence of cancer emboli, presence or absence of satellite nodules, and preoperative PT. The postoperative variables include postoperative 1-month quality-of-life score, presence or absence of complications, Clavien grade of operative complications, presence or absence of readmission, number of out-of-bed activities, duration of out-of-bed activity up to 3 days after the operation, time to postoperative bowel sounds, time to postoperative anal exhaust or defecation, and postoperative 1-month quality-of-life score.
Step four is specifically as follows. The postoperative recovery state has $n$ classes in total, and the theoretical proportion of each class is $1/n$. If the proportion of the number of patients belonging to a class to the total number of patients in the data set is less than 20% of $1/n$, that class is a minority class. For the minority classes in the data set, data are artificially synthesized by the following method to supplement the data set, and the supplemented data set is then used to construct the prediction model:
(1) If the data set contains only one minority class
(1) For each sample $(y_a, x_{a1}, x_{a2})$ of the minority class, search for the sample $(y_b, x_{b1}, x_{b2})$ of the same class that is closest in Euclidean distance and has no sample $(y_o, x_{o1}, x_{o2})$ of another class between the two, namely
$$y_o \ne y_a + \beta_y (y_b - y_a), \quad \beta_y \in (0,1)$$
or
$$x_{o1} \ne x_{a1} + \beta_{x1} (x_{b1} - x_{a1}), \quad \beta_{x1} \in (0,1)$$
or
$$x_{o2} \ne x_{a2} + \beta_{x2} (x_{b2} - x_{a2}), \quad \beta_{x2} \in (0,1)$$
so that there is no sample of another class between the two minority-class samples.
(2) As shown in FIG. 2, randomly select a point on the line connecting the two samples to generate an artificially synthesized sample $(y_{ab}, x_{ab1}, x_{ab2})$:
$$y_{ab} = y_a + \beta_y (y_b - y_a), \quad \beta_y \in (0,1)$$
$$x_{ab1} = x_{a1} + \beta_{x1} (x_{b1} - x_{a1}), \quad \beta_{x1} \in (0,1)$$
$$x_{ab2} = x_{a2} + \beta_{x2} (x_{b2} - x_{a2}), \quad \beta_{x2} \in (0,1)$$
(3) Combine the newly synthesized data with the original data to form a new data set, and recalculate the proportion of the minority class in the new data set. If the proportion is greater than or equal to 40% of the theoretical proportion, stop artificially synthesizing data; otherwise, go to (4).
(4) Using the new data set, find three adjacent samples $(y_a, x_{a1}, x_{a2})$, $(y_b, x_{b1}, x_{b2})$, $(y_c, x_{c1}, x_{c2})$ in the minority class such that no sample $(y_o, x_{o1}, x_{o2})$ of another class lies inside the triangle with these three samples as vertices. That is, whenever
$$y_o = c_1 y_a + c_2 y_b + c_3 y_c$$
$$x_{o1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$$
$$x_{o2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$$
the coefficients satisfy $c_1 + c_2 + c_3 \ne 1$, or at least one of $c_1, c_2, c_3$ is not in $[0,1]$.
(5) As shown in FIG. 3, randomly select a point inside the triangle formed by the three samples to generate an artificially synthesized sample $(y_{abc}, x_{abc1}, x_{abc2})$:
$$y_{abc} = c_1 y_a + c_2 y_b + c_3 y_c$$
$$x_{abc1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$$
$$x_{abc2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$$
$$c_1 + c_2 + c_3 = 1, \quad c_1, c_2, c_3 \in [0,1]$$
(6) Combine the newly synthesized data with the original data to form a new data set, and recalculate the proportion of the minority class in the new data set. If the proportion is greater than or equal to 40% of the theoretical proportion, stop artificially synthesizing data; otherwise, go to (7).
(7) Following the above steps and using the new data set, artificially synthesize data based on four, five, six, and more samples in turn, until the proportion of the minority class in the new data set is greater than or equal to 40% of the theoretical proportion, or until no combination of minority-class samples that does not enclose samples of other classes can be found; then stop artificially synthesizing data.
(2) If the data set contains two or more minority classes
(1) For each minority class, artificially synthesize data based on two samples using methods (1) and (2) of step four (1).
(2) Combine the newly synthesized data with the original data to form a new data set, and recalculate the proportion of each minority class in the new data set. For each minority class whose proportion is less than 40% of the theoretical proportion, go to (3).
(3) For each such minority class, artificially synthesize data based on three samples using methods (4) and (5) of step four (1).
(4) Combine the newly synthesized data with the original data to form a new data set, and recalculate the proportion of each minority class in the new data set. For each minority class whose proportion is less than 40% of the theoretical proportion, go to (5).
(5) Following the above steps and using the new data set, artificially synthesize data for each minority class based on four, five, six, and more samples in turn, until the proportion of every minority class in the new data set is greater than or equal to 40% of the theoretical proportion, or until no combination of minority-class samples that does not enclose samples of other classes can be found; then stop artificially synthesizing data.
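The three-sample synthesis and the 40% stopping rule above can be illustrated with the following sketch. The names `synthesize_in_triangle`, `reached_target`, and `synthetic_samples_needed`, and the counts in the usage lines, are hypothetical. The coefficients are drawn with the standard reflection trick so that they are non-negative and sum to 1; the text only requires a random point inside the triangle and does not prescribe how it is drawn.

```python
import random
from typing import Tuple

Point = Tuple[float, ...]


def synthesize_in_triangle(a: Point, b: Point, c: Point,
                           rng: random.Random) -> Point:
    """Random convex combination c1*a + c2*b + c3*c with c1 + c2 + c3 = 1."""
    r1, r2 = rng.random(), rng.random()
    if r1 + r2 > 1.0:               # reflect back into the triangle
        r1, r2 = 1.0 - r1, 1.0 - r2
    c1, c2, c3 = 1.0 - r1 - r2, r1, r2
    return tuple(c1 * aj + c2 * bj + c3 * cj for aj, bj, cj in zip(a, b, c))


def reached_target(n_minority: int, n_total: int, n_classes: int,
                   target: float = 0.40) -> bool:
    """True once the minority share is at least target * (1 / n_classes)."""
    return n_minority / n_total >= target * (1.0 / n_classes)


def synthetic_samples_needed(n_minority: int, n_total: int,
                             n_classes: int) -> int:
    """Count how many synthetic samples must be added to reach the target."""
    added = 0
    while not reached_target(n_minority + added, n_total + added, n_classes):
        added += 1
    return added


# Hypothetical triangle of minority-class samples in (y, x1, x2) space.
point = synthesize_in_triangle((0.0, 0.0, 0.0), (1.0, 0.0, 0.0),
                               (0.0, 1.0, 0.0), random.Random(0))
print(point)
# Hypothetical data set: 3 minority patients out of 100, 3 classes in total.
print(synthetic_samples_needed(3, 100, 3))  # 12
```

In the usage example the target share is 40% of 1/3, about 13.3%; adding 12 synthetic samples gives 15 of 112, which is the first count that meets it.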
The non-time series variable indexes are as shown in the following table 1;
TABLE 1
[Table 1: non-time series variable indexes; provided as images in the original publication]
The invention is further illustrated below using the CRP index as an example, with reference to Table 1 and FIG. 1:
Step one: acquire time series variables and non-time series variables, and convert the time series variables into trend variables, as follows:
(1) For each type of time series variable $X_i = (X_{i,t_1}, X_{i,t_2}, \ldots, X_{i,t_j}, \ldots)$, where $i$ denotes the type of time series variable, calculate the change between adjacent data acquisition times $t-1$ and $t$:
$$\Delta_{i,t} = X_{i,t} - X_{i,t-1}$$
For example, the change in the VAS score at 1 day after the operation is the difference between the VAS score at 1 day after the operation and the VAS score at 8 hours after the operation, and the change in HB at 3 days after the operation is the difference between HB at 3 days after the operation and HB at 1 day after the operation:
$$\Delta_{HB,3} = HB_3 - HB_1$$
(2) Determine the value of the trend variable $TR_{i,t}$ of each time series variable at each time. If the change is negative, the corresponding trend variable takes the value -1; if the change is non-negative, it takes the value 1:
$$TR_{i,t} = \begin{cases} -1, & \Delta_{i,t} < 0 \\ 1, & \Delta_{i,t} \ge 0 \end{cases}$$
Step two: determine the postoperative recovery state variable of each index from the trend variables and the normal range of the index, specifically as follows:
(1) At each data acquisition time, judge whether the value of each variable is within the normal range of the corresponding index. If it is, the indicator variable $N_{i,t}$ takes the value 1; otherwise it takes the value 0:
$$N_{i,t} = \begin{cases} 1, & X_{i,t} \text{ is within the normal range of index } i \\ 0, & \text{otherwise} \end{cases}$$
The normal reference range for CRP is 800 to 8000 μg/L, so
$$N_{CRP,t} = \begin{cases} 1, & 800 \le CRP_t \le 8000 \\ 0, & \text{otherwise} \end{cases}$$
(2) Determine the value of the postoperative recovery state variable $I_i$ of each index of each patient according to the clinical diagnosis standard and whether the variable is within the normal range of the index. If the trend variables are all -1, or the trend first rises and then falls, and the value at the last data acquisition time is within the normal range of the index, the postoperative recovery of this index is good, and the postoperative recovery state variable is recorded as 1; if the values at the last two data acquisition times are both within the normal range of the index, the postoperative recovery of this index is good, and the postoperative recovery state variable is recorded as 1; if the 3 trend variables are all -1 but the value at the last data acquisition time is not within the normal range of the index, the postoperative recovery of this index is fair, and the postoperative recovery state variable is recorded as 0; in other cases, such as a trend variable that fluctuates repeatedly with a rising value, the postoperative recovery is poor, and the postoperative recovery state variable is recorded as -1.
Elevation of CRP indicates hyperactivity of the body inflammatory response. CRP should be elevated before surgery and decreased 5 days after surgery, e.g., CRP should not decrease or increase again, suggesting possible complications of infection or thromboembolism. Thus, for CRP, there were 4 post-operative observations,
Figure GDA0003893974260000141
CRP 1 ,CRP 2 ,CRP 3 corresponding to 3 trend variables, TR CRP,1 ,TR CRP,2 ,TR CRP,3 And 4 indicating variables indicating whether the indexes are normal or not
Figure GDA0003893974260000142
$N_{CRP,1}, N_{CRP,2}, N_{CRP,3}$. If the 3 trend variables are all -1, $TR_{CRP,1} = TR_{CRP,2} = TR_{CRP,3} = -1$, or they rise first and then fall, $TR_{CRP,1} = 1$, $TR_{CRP,2} = TR_{CRP,3} = -1$ or $TR_{CRP,1} = TR_{CRP,2} = 1$, $TR_{CRP,3} = -1$, and the value at the last data acquisition moment is within the normal range of the index, $N_{CRP,3} = 1$, then the patient's postoperative CRP recovery is good, and the CRP postoperative recovery state variable is recorded as 1, $I_{CRP} = 1$. If the values at the last two data acquisition moments are both within the normal range of the index, $N_{CRP,2} = N_{CRP,3} = 1$, then the patient's postoperative CRP recovery is good, and the CRP postoperative recovery state variable is recorded as 1, $I_{CRP} = 1$. If the 3 trend variables are all -1 and the value at the last data acquisition moment is not within the normal range of the index, $N_{CRP,3} = 0$, then the patient's postoperative CRP recovery is fair, and the CRP postoperative recovery state variable is recorded as 0, $I_{CRP} = 0$.
In all other cases, such as a trend that fluctuates repeatedly with rising values, the patient's postoperative CRP recovery is poor, and the CRP postoperative recovery state variable is recorded as -1, $I_{CRP} = -1$. Through this step, each type of time series variable $X_i$ is converted into a postoperative recovery state variable $I_i$.
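The rule in step two can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the invention: the function names, the inclusive treatment of the range bounds, and the example normal range of 0 to 10 are hypothetical choices made here for demonstration, and "rise first and then fall" is read as one or more rising steps followed only by falling steps.

```python
from typing import List, Sequence, Tuple


def trend_variables(values: Sequence[float]) -> List[int]:
    """TR is -1 when the change between adjacent times is negative, else 1."""
    return [-1 if later - earlier < 0 else 1
            for earlier, later in zip(values, values[1:])]


def normal_indicators(values: Sequence[float],
                      normal_range: Tuple[float, float]) -> List[int]:
    """N is 1 when the observation lies in the normal range, else 0."""
    low, high = normal_range  # bounds treated as inclusive (assumption)
    return [1 if low <= v <= high else 0 for v in values]


def recovery_state(values: Sequence[float],
                   normal_range: Tuple[float, float]) -> int:
    """Return the recovery state variable I (1, 0 or -1) for one index."""
    if len(values) < 2:
        raise ValueError("at least two observations are required")
    tr = trend_variables(values)
    n = normal_indicators(values, normal_range)
    all_falling = all(t == -1 for t in tr)
    rises = tr.count(1)
    # Assumed reading: some rising steps, then only falling steps.
    rise_then_fall = (0 < rises < len(tr)
                      and tr == [1] * rises + [-1] * (len(tr) - rises))
    if (all_falling or rise_then_fall) and n[-1] == 1:
        return 1   # good recovery
    if n[-2:] == [1, 1]:
        return 1   # last two observations are both normal
    if all_falling and n[-1] == 0:
        return 0   # steadily falling but not yet normal: fair recovery
    return -1      # all other cases: poor recovery
```

With the illustrative range `(0, 10)`, four observations such as `[80, 60, 30, 8]` give trend variables `[-1, -1, -1]` and a normal last value, so the state is 1, while `[30, 60, 40, 70]` fluctuates and gives -1.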
Step three:
Using a support vector machine (SVM) algorithm, classify each patient's postoperative recovery state according to the postoperative recovery state variables converted from all time series variables and the non-time series variables other than the number and the duration of out-of-bed activities within 3 days after surgery:
Figure GDA0003893974260000143
Step four: Taking the postoperative recovery state classification as the dependent variable, and all preoperative variables among the non-time series variables together with the postoperative patient behavior variables (the number and the duration of out-of-bed activities within 3 days after surgery) as independent variables, construct a prediction model using a random forest algorithm. If the proportion of one or more postoperative recovery state classes among all observations is less than 20% of the theoretical proportion, the data are regarded as unbalanced data, where the theoretical proportion of each class is equal to
Figure GDA0003893974260000144
The time series variables include VAS scores, BT scores, the number of active coughs and expectorations, and the number of deep breaths. The non-time series variables include preoperative variables and postoperative variables. The preoperative variables include post-operative length of stay, gender, type of incision, length of incision, BMI, CA, tumor diameter, number of tumors, time of operation, extent of resection, amount of bleeding, presence or absence of blood transfusion, amount of plasma transfusion, amount of suspended red blood cell transfusion, ISHAK score, tissue differentiation grade, presence or absence of capsular invasion, extent of resection, presence or absence of cancer emboli, presence or absence of satellite nodules, and preoperative PT. The postoperative variables include the 1-month postoperative quality of life score, presence or absence of complications, Clavien grade of surgical complications, presence or absence of readmission, number of out-of-bed activities and duration of out-of-bed activity within 3 days after surgery, time to postoperative bowel sounds (hours), time to postoperative anal exhaust or defecation (hours), and the 1-month postoperative quality of life score.
Step four is specifically as follows: the postoperative recovery state has n classes in total, and the theoretical proportion of each class is 1/n. If the proportion of patients belonging to a class relative to the total number of patients in the data set is less than 20% of 1/n, that class is a minority class. For example, if the postoperative recovery states obtained in step three fall into 5 classes in total, the theoretical proportion of each class is $1/5 = 20\%$, and a class is a minority class if the proportion of patients belonging to it relative to the total number of patients in the data set is less than 20% × 20% = 4%. For the minority classes in the data, data are artificially synthesized by the following method to supplement the data set, and the prediction model is constructed from the supplemented new data set:
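(1) If there is only one minority class in the data set:
(1) For each sample in the minority class
Figure GDA0003893974260000152
search for the sample of the same class that is nearest in Euclidean distance, with no samples of other classes between the two.
(2) Randomly select a point between the two samples to generate one artificially synthesized sample:
$y_{ab} = y_a + \beta_y (y_b - y_a), \quad \beta_y \in (0,1)$
$x_{ab1} = x_{a1} + \beta_{x1} (x_{b1} - x_{a1}), \quad \beta_{x1} \in (0,1)$
$x_{ab2} = x_{a2} + \beta_{x2} (x_{b2} - x_{a2}), \quad \beta_{x2} \in (0,1)$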
(3) And combining the newly synthesized data with the original data to form a new data set, and recalculating the proportion of the minority class in the new data set. If the proportion is more than or equal to 40 percent of the theoretical proportion, stopping artificially synthesizing data; otherwise, go to (4).
(4) Using the new data set, find, in the minority class
Figure GDA0003893974260000161
three adjacent samples
Figure GDA0003893974260000162
such that the triangle with these three samples as vertices contains no sample of another class
Figure GDA0003893974260000163
That is, when a sample o of another class is written as
$y_o = c_1 y_a + c_2 y_b + c_3 y_c$
$x_{o1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$
$x_{o2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$
the coefficients satisfy $c_1 + c_2 + c_3 \neq 1$, or at least one of $c_1, c_2, c_3$ is not in $[0,1]$.
(5) As shown in FIG. 3, randomly select a point inside the triangle formed by the three samples to generate one artificially synthesized sample
Figure GDA0003893974260000165
$y_{abc} = c_1 y_a + c_2 y_b + c_3 y_c$
$x_{abc1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$
$x_{abc2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$
$c_1 + c_2 + c_3 = 1$
$c_1, c_2, c_3 \in [0,1]$
(6) And combining the newly synthesized data with the original data to form a new data set, and recalculating the proportion of the minority class in the new data set. If the proportion is more than or equal to 40 percent of the theoretical proportion, stopping artificially synthesizing data; otherwise, go to (7).
(7) Following the steps above and using the new data set, artificially synthesize data based on four, five, six, and more samples in turn, until the proportion of the minority class in the new data set is at least 40% of the theoretical proportion, or until no combination of minority class samples can be found that does not cover samples of other classes; then stop artificially synthesizing data.
(2) If the data set contains two or more minority classes:
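The two-sample and three-sample synthesis formulas above can be sketched in Python as follows. This is an illustrative sketch only: samples are assumed to be `(y, x1, x2)` tuples as in the equations, all function names are hypothetical, one coefficient is drawn independently per coordinate for the two-sample case because the equations use separate $\beta_y$, $\beta_{x1}$, $\beta_{x2}$, and the coverage test raises an error when the three vertices do not determine the coefficients uniquely (a case the text does not address).

```python
import random

Sample = tuple  # (y, x1, x2), following the notation of the equations


def _open_unit(rng: random.Random) -> float:
    """Draw a coefficient from the open interval (0, 1)."""
    u = rng.random()
    while u == 0.0:
        u = rng.random()
    return u


def interpolate_pair(a: Sample, b: Sample, rng: random.Random) -> Sample:
    """Two-sample synthesis: v_ab = v_a + beta_v * (v_b - v_a) per coordinate."""
    return tuple(va + _open_unit(rng) * (vb - va) for va, vb in zip(a, b))


def sample_in_triangle(a: Sample, b: Sample, c: Sample,
                       rng: random.Random) -> Sample:
    """Three-sample synthesis with c1 + c2 + c3 = 1 and each c in [0, 1]."""
    u, v = rng.random(), rng.random()
    if u + v > 1.0:          # reflect so that the weights stay non-negative
        u, v = 1.0 - u, 1.0 - v
    w = (1.0 - u - v, u, v)
    return tuple(w[0] * a[k] + w[1] * b[k] + w[2] * c[k] for k in range(3))


def _det3(m) -> float:
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)


def solve_coefficients(o: Sample, a: Sample, b: Sample, c: Sample):
    """Solve o = c1*a + c2*b + c3*c for (c1, c2, c3) by Cramer's rule."""
    m = [[a[k], b[k], c[k]] for k in range(3)]
    d = _det3(m)
    if abs(d) < 1e-12:
        raise ValueError("coefficients are not uniquely determined")
    coeffs = []
    for j in range(3):
        mj = [row[:] for row in m]
        for k in range(3):
            mj[k][j] = o[k]  # replace column j with the target sample
        coeffs.append(_det3(mj) / d)
    return tuple(coeffs)


def triangle_covers(o: Sample, a: Sample, b: Sample, c: Sample,
                    tol: float = 1e-9) -> bool:
    """True when o satisfies c1 + c2 + c3 = 1 with every c in [0, 1]."""
    cs = solve_coefficients(o, a, b, c)
    return abs(sum(cs) - 1.0) <= tol and all(-tol <= x <= 1.0 + tol for x in cs)
```

A triple of minority class samples would be accepted for synthesis only if `triangle_covers` is false for every sample of the other classes.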
(1) For each minority class
Figure GDA0003893974260000164
artificially synthesize data based on two samples using methods (1) and (2) of step four (1).
(2) Combine the newly synthesized data with the original data to form a new data set, and recalculate the proportion of each minority class in the new data set. For each minority class whose proportion is less than 40% of the theoretical proportion
Figure GDA0003893974260000171
proceed to (3).
(3) For each minority class
Figure GDA0003893974260000172
artificially synthesize data based on three samples using methods (4) and (5) of step four (1).
(4) Combine the newly synthesized data with the original data to form a new data set, and recalculate the proportion of each minority class in the new data set. For each minority class whose proportion is less than 40% of the theoretical proportion
Figure GDA0003893974260000173
proceed to (5).
(5) Following the steps above and using the new data set, artificially synthesize data for each minority class based on four, five, six, and more samples in turn, until the proportion of every minority class in the new data set is at least 40% of the theoretical proportion, or until no combination of minority class samples can be found that does not cover samples of other classes; then stop artificially synthesizing data.
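The two thresholds used throughout step four (a class is a minority class below 20% of the theoretical proportion 1/n, and synthesis stops once a class reaches 40% of it) can be sketched as follows. The function names and the use of exact fractions are assumptions made for this illustration, and the class counts in the usage note are invented example numbers, not data from the invention.

```python
from collections import Counter
from fractions import Fraction
from typing import Dict, Hashable, List, Sequence


def class_proportions(labels: Sequence[Hashable]) -> Dict[Hashable, Fraction]:
    """Exact share of each class among all observations."""
    total = len(labels)
    return {cls: Fraction(k, total) for cls, k in Counter(labels).items()}


def minority_classes(labels: Sequence[Hashable],
                     classes: Sequence[Hashable],
                     flag: Fraction = Fraction(1, 5)) -> List[Hashable]:
    """Classes whose share is below 20% of the theoretical proportion 1/n."""
    theoretical = Fraction(1, len(classes))
    props = class_proportions(labels)
    return [c for c in classes if props.get(c, Fraction(0)) < flag * theoretical]


def reached_target(labels: Sequence[Hashable],
                   cls: Hashable,
                   classes: Sequence[Hashable],
                   stop: Fraction = Fraction(2, 5)) -> bool:
    """True once the class share is at least 40% of the theoretical proportion."""
    theoretical = Fraction(1, len(classes))
    return class_proportions(labels).get(cls, Fraction(0)) >= stop * theoretical
```

For instance, with 5 classes and 100 patients of whom 3 belong to class 5, the share of 3% is below the 4% threshold, so class 5 is a minority class; after adding 6 synthesized class 5 samples its share is 9/106, which exceeds the 8% stopping threshold.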
In summary, the invention (1) converts the time series variables describing the patient's postoperative recovery state into trend variables, and then forms a comprehensive postoperative recovery state indicator variable for each postoperative recovery index; and (2) if the data set is unbalanced, artificially synthesizes minority class data within the region spanned by minority class samples as vertices, and trains the prediction model on a new data set containing both the newly synthesized data and the original data. The postoperative recovery state of surgical patients can thereby be effectively evaluated and predicted.
The contents of the present invention have been explained above. Those skilled in the art will be able to practice the invention based on these descriptions. Based on the above disclosure of the present invention, all other preferred embodiments and examples obtained by a person skilled in the art without any inventive step should fall within the scope of protection of the present invention.

Claims (2)

1. A system for constructing a prediction model of the postoperative recovery state of a surgical patient, characterized by comprising the following steps:
step one:
acquiring time series variables and non-time series variables, and converting the time series variables into trend variables, wherein the method comprises the following steps:
(1) Calculating the change, between adjacent data acquisition times t-1 and t, of each type of time series variable $X_i = (X_{i,t1}, X_{i,t2}, \ldots, X_{i,tj}, \ldots)$, where i denotes the type of time series variable,
Figure FDA0003893974250000011
(2) Determining the value of the trend variable $TR_{i,t}$ of each time series variable at each time: if the change is negative, the corresponding trend variable takes the value -1; if the change is non-negative, the corresponding trend variable takes the value 1,
Figure FDA0003893974250000012
step two:
determining the postoperative recovery state variable of each index by using the trend variable and the normal range of the index, wherein the method specifically comprises the following steps:
(1) Judging whether the value of each variable at each data acquisition moment is within the normal range of the corresponding index: if it is within the normal range, the indicator variable takes the value 1; otherwise, the indicator variable takes the value 0,
Figure FDA0003893974250000013
(2) Determining the value of the postoperative recovery state variable $I_i$ for each index of each patient according to clinical diagnostic criteria and whether the variable lies within the normal range of the index; if the trend variables are all -1, or the trend variables rise first and then fall, and the value of the variable at the last data acquisition moment is within the normal range of the index, the patient's postoperative recovery for this index is good, and the postoperative recovery state variable is recorded as 1; if the values of the variable at the last two data acquisition moments are both within the normal range of the index, the patient's postoperative recovery for this index is good, and the postoperative recovery state variable is recorded as 1; if the 3 trend variables are all -1 but the value of the variable at the last data acquisition moment is not within the normal range of the index, the patient's postoperative recovery for this index is fair, and the postoperative recovery state variable is recorded as 0; if the trend fluctuates repeatedly and the values rise, the patient's postoperative recovery is poor, and the postoperative recovery state variable is recorded as -1;
step three:
using a support vector machine (SVM) algorithm, classifying each patient's postoperative recovery state according to the postoperative recovery state variables converted from all time series variables and the non-time series variables other than the number and the duration of out-of-bed activities within 3 days after surgery:
Figure FDA0003893974250000021
step four:
taking the postoperative recovery state classification as the dependent variable, and all preoperative variables among the non-time series variables together with the postoperative patient behavior variables, namely the number and the duration of out-of-bed activities within 3 days after surgery, as independent variables, constructing a prediction model using a random forest algorithm; if the proportion of one or more postoperative recovery state classes among all observations is less than 20% of the theoretical proportion, the data are regarded as unbalanced data, wherein the theoretical proportion of each class is equal to
Figure FDA0003893974250000022
wherein the time series variables comprise VAS scores, BI scores, the number of active coughs and expectorations, and the number of deep breaths; the non-time series variables comprise preoperative variables and postoperative variables, the preoperative variables comprising post-operative length of stay, gender, incision type, incision length, BMI, CA, tumor diameter, tumor number, time of operation, extent of resection, amount of bleeding, presence or absence of blood transfusion, amount of plasma transfusion, amount of suspended red blood cell transfusion, ISHAK score, tissue differentiation grade, presence or absence of capsular invasion, extent of resection, presence or absence of cancer emboli, presence or absence of satellite nodules, and preoperative PT; the postoperative variables comprise the 1-month postoperative quality of life score, presence or absence of complications, Clavien grade of surgical complications, presence or absence of readmission, number of out-of-bed activities and duration of out-of-bed activity within 3 days after surgery, time to postoperative bowel sounds, time to postoperative anal exhaust or defecation, and the 1-month postoperative quality of life score.
2. The system for constructing a model for predicting the postoperative recovery state of a surgical patient according to claim 1, wherein the fourth step is specifically: the postoperative recovery state is n types in total, the theoretical proportion of each type is 1/n, if the proportion of the number of patients belonging to a certain type to the total number of patients in the data set is less than 20% of 1/n, the observation value of the type is a minority type, for the minority type in the data set, data are artificially synthesized by the following method to supplement the data set, and then the new supplemented data set is utilized to construct a prediction model:
(1) If there is only one minority class in the dataset
(1) For a minority class
Figure FDA0003893974250000031
and each of its samples
Figure FDA0003893974250000032
search for the sample of the same class that is nearest in Euclidean distance
Figure FDA0003893974250000033
Figure FDA0003893974250000034
with no samples of other classes between them
Figure FDA0003893974250000035
Figure FDA0003893974250000036
that is,
Figure FDA0003893974250000037
or
Figure FDA0003893974250000038
or
Figure FDA0003893974250000039
and
Figure FDA00038939742500000310
Figure FDA00038939742500000311
Figure FDA00038939742500000312
and
Figure FDA00038939742500000313
have no samples of other classes between them;
(2) Randomly selecting a point on the connecting line between the two samples to generate artificially synthesized data
Figure FDA00038939742500000314
y ab =y ay (y b -y a ),β y ∈(0,1)
x ab1 =x a1x1 (x b1 -x a1 ),β x1 ∈(0,1)
x ab2 =x a2x2 (x b2 -x a2 ),β x2 ∈0,1
(3) Combining the newly synthesized data with the original data to form a new data set, recalculating the proportion of the minority class in the new data set, and stopping artificially synthesizing the data if the proportion is more than or equal to 40% of the theoretical proportion; otherwise, entering (4);
(4) using the new data set, finding, in the minority class
Figure FDA0003893974250000041
three adjacent samples
Figure FDA0003893974250000042
Figure FDA0003893974250000043
such that the triangle with these three samples as vertices contains no sample of another class
Figure FDA0003893974250000044
Figure FDA0003893974250000045
that is, when a sample o of another class is written as
$y_o = c_1 y_a + c_2 y_b + c_3 y_c$
$x_{o1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$
$x_{o2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$
the coefficients satisfy $c_1 + c_2 + c_3 \neq 1$, or at least one of $c_1, c_2, c_3$ is not in $[0,1]$;
(5) Randomly selecting a point in a triangle formed by the three samples to generate artificial synthetic data
Figure FDA0003893974250000046
$y_{abc} = c_1 y_a + c_2 y_b + c_3 y_c$
$x_{abc1} = c_1 x_{a1} + c_2 x_{b1} + c_3 x_{c1}$
$x_{abc2} = c_1 x_{a2} + c_2 x_{b2} + c_3 x_{c2}$
$c_1 + c_2 + c_3 = 1$
(6) Combining the newly synthesized data with the original data to form a new data set, recalculating the proportion of the minority class in the new data set, and stopping artificially synthesizing the data if the proportion is more than or equal to 40% of the theoretical proportion; otherwise, entering (7);
(7) following the steps above and using the new data set, artificially synthesizing data based on four, five, six, and more samples in turn, until the proportion of the minority class in the new data set is at least 40% of the theoretical proportion, or until no combination of minority class samples can be found that does not cover samples of other classes, and then stopping artificially synthesizing data;
(2) If the data set contains two or more minority classes
(1) For each minority class
Figure FDA0003893974250000051
Artificially synthesizing data based on two samples by using the methods (1) and (2) in the step four (1);
(2) combining the newly synthesized data with the original data to form a new data set, recalculating the proportion of each minority class in the new data set, and for the minority classes with the proportion less than 40% of the theoretical proportion
Figure FDA0003893974250000052
Figure FDA0003893974250000053
proceeding to (3);
(3) for each minority class
Figure FDA0003893974250000054
artificially synthesizing data based on three samples by using the methods (4) and (5) in step four (1);
(4) combining the newly synthesized data with the original data to form a new data set, recalculating the proportion of each minority class in the new data set, and for the minority classes with the proportion less than 40% of the theoretical proportion
Figure FDA0003893974250000055
Figure FDA0003893974250000056
proceeding to (5);
(5) following the steps above and using the new data set, artificially synthesizing data for each minority class based on four, five, six, and more samples in turn, until the proportion of every minority class in the new data set is at least 40% of the theoretical proportion, or until no combination of minority class samples can be found that does not cover samples of other classes.
CN202110357379.1A 2021-04-01 2021-04-01 Surgical patient postoperative recovery state prediction model construction system Active CN113192642B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110357379.1A CN113192642B (en) 2021-04-01 2021-04-01 Surgical patient postoperative recovery state prediction model construction system

Publications (2)

Publication Number Publication Date
CN113192642A CN113192642A (en) 2021-07-30
CN113192642B true CN113192642B (en) 2023-02-28

Family

ID=76974450

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110357379.1A Active CN113192642B (en) 2021-04-01 2021-04-01 Surgical patient postoperative recovery state prediction model construction system

Country Status (1)

Country Link
CN (1) CN113192642B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116230212A (en) * 2023-04-04 2023-06-06 曜立科技(北京)有限公司 Diagnosis decision system for postoperative cerebral apoplexy review based on data processing

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107958708A (en) * 2017-12-22 2018-04-24 北京鑫丰南格科技股份有限公司 Risk trend appraisal procedure and system after institute
CN108742513A (en) * 2018-02-09 2018-11-06 上海长江科技发展有限公司 Patients with cerebral apoplexy rehabilitation prediction technique and system
CN109659033A (en) * 2018-12-18 2019-04-19 浙江大学 A kind of chronic disease change of illness state event prediction device based on Recognition with Recurrent Neural Network
CN111292824A (en) * 2020-01-20 2020-06-16 深圳市丞辉威世智能科技有限公司 Rehabilitation method, rehabilitation device, rehabilitation apparatus, and computer-readable storage medium
CN112133441A (en) * 2020-08-21 2020-12-25 广东省人民医院 Establishment method and terminal of MH post-operation fissure hole state prediction model
CN112270441A (en) * 2020-10-30 2021-01-26 华东师范大学 Method for establishing autism child rehabilitation effect prediction model and method and system for predicting autism child rehabilitation effect

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Prevalence and serotype distribution of nasopharyngeal carriage of Streptococcus pneumoniae in China:a meta-analysis;Lin Wang等;《BMC Infectious Diseases 2017》;20171213;第1-14页 *
加速康复外科模式在国内外应用的评价指标研究进展;李卡等;《中国普外基础与临床杂志》;20180531;第25卷(第5期);第629-634页 *

Also Published As

Publication number Publication date
CN113192642A (en) 2021-07-30

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant