CN112768060A - Liver cancer postoperative recurrence prediction method based on random survival forest and storage medium - Google Patents

Publication number
CN112768060A
Authority
CN
China
Prior art keywords
recurrence
liver cancer
case
postoperative
risk
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110098484.8A
Other languages
Chinese (zh)
Inventor
刘景丰
曾建兴
郭鹏飞
刘红枝
林孔英
陈振伟
黄起桢
傅俊
丁宗仁
曾建阳
陈传椿
李保晟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Mengchao Hepatobiliary Hospital Of Fujian Medical University (Fuzhou Hospital For Infectious Diseases)
Fuzhou Yixing Dashuju Industry Investment Co Ltd
Original Assignee
Mengchao Hepatobiliary Hospital Of Fujian Medical University (Fuzhou Hospital For Infectious Diseases)
Fuzhou Yixing Dashuju Industry Investment Co Ltd
Application filed by Mengchao Hepatobiliary Hospital Of Fujian Medical University (Fuzhou Hospital For Infectious Diseases) and Fuzhou Yixing Dashuju Industry Investment Co Ltd

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G16H50/70: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G16H50/80: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for detecting, monitoring or modelling epidemics or pandemics, e.g. flu

Abstract

The invention provides a method for predicting postoperative recurrence of liver cancer based on a random survival forest, and a storage medium. The method comprises: acquiring clinical data and recurrence time of each case; presetting grouping dimensions, which comprise patient basic factors, preoperative test factors and postoperative pathological factors; acquiring a data set from the clinical data, the data set consisting of the preset grouping dimensions corresponding to each case; and constructing a prediction model for early postoperative recurrence of liver cancer by applying a random survival forest algorithm to the data set and the recurrence time of each case. The method can accurately predict the postoperative recurrence probability of liver cancer for an individual patient, and thereby informs postoperative care and supports active prevention. For medical institutions in particular, it can help medical staff accurately screen out patients at high risk of recurrence after liver cancer surgery, support intervention in early recurrence, and guide postoperative follow-up and treatment.

Description

Liver cancer postoperative recurrence prediction method based on random survival forest and storage medium
Technical Field
The invention relates to the field of bioinformatics, and in particular to a method for predicting postoperative recurrence of liver cancer based on a random survival forest, and to a storage medium.
Background
Primary liver cancer (hereinafter referred to as liver cancer) is one of the most common malignant tumors in China, ranking fourth in incidence and third in mortality among tumors nationwide, and it seriously threatens the life and health of the Chinese population. Surgical resection is currently the main radical treatment for liver cancer, but postoperative recurrence remains a major cause of death after surgery. Clinical data indicate that the recurrence rate after liver cancer surgery is about 50%. Recurrence is generally divided into early and late recurrence using a 2-year cut-off, with early recurrence accounting for about 70% of all recurrences. Accurately predicting early postoperative recurrence, screening out patients at high risk of early recurrence, and monitoring them appropriately during clinical care, so that recurrent tumors are found early and radical treatment can be given again, therefore has very high clinical value.
In recent years, disease risk prediction with machine learning algorithms has become a research hotspot in the field of medical big data. Complex algorithms can mine the interrelationships among disease variables in depth, but mainstream machine learning algorithms have difficulty handling censored medical data, so their predictions remain biased and their accuracy is limited.
Random survival forest (RSF) is a random forest method for analyzing right-censored survival data. It introduces new survival splitting rules for growing survival trees and a new missing-data algorithm for imputing missing values, which makes it well suited to survival analysis. The present application aims to provide a method and a storage medium for building a prediction model of early postoperative recurrence of liver cancer based on a random survival forest, so as to capture the relationships among disease variables more accurately.
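The source does not give formulas for the random survival forest. As a hedged sketch, the formulation commonly used in the random survival forest literature is reproduced below; the symbols (terminal node h, number of trees B, event counts d and at-risk counts Y) are notation introduced here, and whether the risk score of this application equals the ensemble mortality M(x) is an assumption.

```latex
% Nelson-Aalen estimate of the cumulative hazard in terminal node h,
% where d_{l,h} events occur and Y_{l,h} cases are at risk at the
% distinct event time t_{l,h}:
\[
  \hat{H}_h(t) = \sum_{t_{l,h} \le t} \frac{d_{l,h}}{Y_{l,h}}
\]
% Ensemble cumulative hazard for a case with covariates x, averaged over
% B trees; \hat{H}_b(t \mid x) is the estimate of the terminal node that
% x falls into in tree b:
\[
  \hat{H}_e(t \mid x) = \frac{1}{B} \sum_{b=1}^{B} \hat{H}_b(t \mid x)
\]
% Ensemble mortality, summed over the distinct event times t_1, ..., t_m;
% a larger value indicates a higher risk:
\[
  M(x) = \sum_{j=1}^{m} \hat{H}_e(t_j \mid x)
\]
```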
Disclosure of Invention
The technical problem to be solved by the invention is to provide a method and a storage medium for predicting postoperative recurrence of liver cancer based on a random survival forest, which can accurately predict the postoperative recurrence probability of liver cancer for an individual patient and provide a reference for postoperative care.
In order to solve the above technical problem, the invention adopts the following technical scheme:
A method for predicting postoperative recurrence of liver cancer based on a random survival forest comprises the following steps:
acquiring clinical data and recurrence time of each case;
presetting grouping dimensions, which comprise patient basic factors, preoperative test factors and postoperative pathological factors;
acquiring a data set from the clinical data, wherein the data set consists of the preset grouping dimensions corresponding to each case;
constructing a prediction model for early postoperative recurrence of liver cancer by applying a random survival forest algorithm to the data set and the recurrence time of each case.
The invention provides another technical scheme as follows:
a computer readable storage medium, having stored thereon a computer program, which when executed by a processor, is capable of implementing the steps included in the above method for predicting post-operative recurrence of liver cancer based on a random survival forest.
The invention has the following beneficial effects: a prediction model for early postoperative recurrence of liver cancer is built from a random survival forest and the clinical data of a number of historical recurrence cases, so that individual patients can be assessed with the model and their recurrence risk obtained, which facilitates active prevention. For medical institutions in particular, the method can help medical staff accurately screen out patients at high risk of recurrence after liver cancer surgery, support intervention in early recurrence, guide postoperative follow-up and treatment, and improve the cure rate.
Drawings
FIG. 1 is a schematic flow chart of a method for predicting postoperative recurrence of liver cancer based on a random survival forest according to the first embodiment of the present invention;
FIG. 2 is a schematic flow chart of a method for predicting postoperative recurrence of liver cancer based on a random survival forest according to the second embodiment of the present invention;
FIG. 3 is an example of the interface displaying a prediction result in the fifth embodiment of the present invention.
Detailed Description
In order to explain technical contents, achieved objects, and effects of the present invention in detail, the following description is made with reference to the accompanying drawings in combination with the embodiments.
Referring to FIG. 1, the present invention provides a method for predicting postoperative recurrence of liver cancer based on a random survival forest, comprising:
acquiring clinical data and recurrence time of each case;
presetting grouping dimensions, which comprise patient basic factors, preoperative test factors and postoperative pathological factors;
acquiring a data set from the clinical data, wherein the data set consists of the preset grouping dimensions corresponding to each case;
constructing a prediction model for early postoperative recurrence of liver cancer by applying a random survival forest algorithm to the data set and the recurrence time of each case.
From the above description, the beneficial effects of the present invention are as follows: individual patients can be assessed with the model and their recurrence risk obtained, which facilitates active prevention. For medical institutions in particular, the method can help medical staff accurately screen out patients at high risk of recurrence after liver cancer surgery, support intervention in early recurrence, guide postoperative follow-up and treatment, and improve the cure rate.
Further, before the acquiring of clinical data and recurrence time of each case, the method further comprises:
obtaining enrolled cases, each of which had a normal preoperative liver function assessment, no history of malignant tumor, and no invasion of adjacent organs or distant metastasis, had undergone liver cancer resection with postoperative pathology confirming hepatocellular carcinoma, and had recurred after surgery.
From the above description, selecting qualified cases according to these conditions can significantly improve the accuracy of the model.
Further, the constructing of the prediction model for early postoperative recurrence of liver cancer by applying a random survival forest algorithm to the data set and the recurrence time of each case comprises:
dividing the cases according to a preset proportion to obtain training group cases and test group cases;
dividing the data set according to the training group cases and the test group cases to obtain a training group data set and a test group data set;
constructing, from the training group data set and the recurrence time of each of the training group cases, a prediction model for early postoperative recurrence of liver cancer and its cumulative hazard function by applying a random survival forest algorithm;
predicting each case in the training group cases by using the cumulative hazard function to obtain a risk score set;
dividing the risk score set according to a preset proportion to obtain the risk score ranges corresponding to the low-risk recurrence group, the medium-risk recurrence group and the high-risk recurrence group, respectively.
From the above description, a test group is set and the risk score of each test case is obtained from the model; the score ranges corresponding to the different risk groups are then delimited according to medical experience and rules, which makes it quick and unambiguous to determine the risk level to which a model-calculated risk score belongs.
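The case-level division described above can be sketched as follows. This is a minimal illustration, not the applicants' implementation: the source only states that cases are divided "according to a preset proportion", so the 70/30 ratio, the fixed seed and the function names `split_cases` and `split_dataset` are assumptions made here.

```python
import random


def split_cases(case_ids, train_fraction=0.7, seed=0):
    """Split case identifiers into training and test groups.

    train_fraction and seed are illustrative assumptions; the source only
    says the split follows a preset proportion.
    """
    if not 0.0 < train_fraction < 1.0:
        raise ValueError("train_fraction must be strictly between 0 and 1")
    ids = sorted(set(case_ids))      # one entry per case, in a stable order
    rng = random.Random(seed)        # local RNG keeps the split reproducible
    rng.shuffle(ids)
    n_train = round(len(ids) * train_fraction)
    return ids[:n_train], ids[n_train:]


def split_dataset(dataset, train_ids, test_ids):
    """Partition a {case_id: record} mapping along the case-level split."""
    train = {cid: dataset[cid] for cid in train_ids}
    test = {cid: dataset[cid] for cid in test_ids}
    return train, test
```

Splitting by case identifier first, and only then partitioning the data set, keeps every record of a case in exactly one group.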
Further, the method also comprises:
acquiring the grouping dimensions of a case;
calculating a risk score for the case with the prediction model for early postoperative recurrence of liver cancer, according to the grouping dimensions of the case;
determining the corresponding risk group according to the risk score range into which the calculated risk score falls;
outputting the determined risk group.
From the above description, the risk group to which the case belongs can be output directly, which gives a more intuitive and understandable prediction result.
Further, the method also comprises:
deploying the prediction model for early postoperative recurrence of liver cancer on a server, and generating a corresponding prediction web page.
From the above description, the prediction function can be provided as a web page, which is simpler to operate, uses less data traffic, and occupies less memory and fewer resources.
Further, the method also comprises:
acquiring the grouping dimensions of a case;
calculating the recurrence condition of the case with the prediction model for early postoperative recurrence of liver cancer, according to the grouping dimensions of the case.
From the above description, the recurrence condition of a case can be obtained quickly by directly entering its grouping dimensions, which gives the user a more accurate prediction function.
Further, the recurrence condition includes a risk score, a probability of no recurrence, and the corresponding curve.
As can be seen from the above description, the data calculated by the model are intuitive, comprehensive and fine-grained.
Further, the patient basic factors include age and sex; the preoperative test factors include platelets, albumin, total bilirubin, etiological test results and alpha-fetoprotein; the postoperative pathological factors include maximum tumor diameter, tumor number, macroscopic vascular invasion, microvascular invasion, satellite nodules, tumor capsule, liver cancer differentiation grade and liver cirrhosis type.
As can be seen from the above description, building the prediction model for early postoperative recurrence of liver cancer from clinical data that are sufficiently comprehensive and clinically relevant to the cases helps ensure the accuracy of the prediction result.
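One way to hold the grouping dimensions of a single case is a small validated record type, sketched below. The class name `CaseRecord`, the field names and the non-negativity checks are hypothetical choices made for illustration; the categorical value sets follow the encodings listed later in Example three (etiology as hepatitis B, hepatitis C or other; differentiation grade I-II or III-IV; tumor number 1 to 4, or 5 and above), and the units follow the patient described in Example four.

```python
from dataclasses import dataclass

SEXES = {"male", "female"}
ETIOLOGIES = {"HBV", "HCV", "other"}
DIFFERENTIATION_GRADES = {"I-II", "III-IV"}
NUMERIC_FIELDS = ("age", "platelets", "albumin", "total_bilirubin",
                  "alpha_fetoprotein", "tumor_max_diameter")


@dataclass(frozen=True)
class CaseRecord:
    """Hypothetical container for the 15 grouping dimensions of one case."""
    # Patient basic factors
    age: float
    sex: str
    # Preoperative test factors
    etiology: str
    platelets: float            # x10^9/L
    albumin: float              # g/L
    total_bilirubin: float      # umol/L
    alpha_fetoprotein: float    # ng/mL
    # Postoperative pathological factors
    tumor_max_diameter: float   # cm
    tumor_number: int           # 1-4; 5 stands for "5 or more"
    macrovascular_invasion: bool
    microvascular_invasion: bool
    satellite_foci: bool
    tumor_capsule: bool
    differentiation_grade: str
    cirrhosis: bool

    def __post_init__(self):
        # Reject values outside the category sets before they reach a model.
        if self.sex not in SEXES:
            raise ValueError(f"unknown sex: {self.sex!r}")
        if self.etiology not in ETIOLOGIES:
            raise ValueError(f"unknown etiology: {self.etiology!r}")
        if self.differentiation_grade not in DIFFERENTIATION_GRADES:
            raise ValueError(f"unknown grade: {self.differentiation_grade!r}")
        if self.tumor_number not in range(1, 6):
            raise ValueError("tumor_number must be 1-5 (5 = five or more)")
        for name in NUMERIC_FIELDS:
            if getattr(self, name) < 0:
                raise ValueError(f"{name} must be non-negative")
```

Validating at construction time means an incomplete or mis-coded case fails early, which matches the later step of excluding patients with incomplete data.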
The invention provides another technical scheme as follows:
a computer-readable storage medium, on which a computer program is stored, wherein the program, when executed by a processor, implements the following method for predicting postoperative recurrence of liver cancer based on a random survival forest, comprising the steps of:
acquiring clinical data and recurrence time of each case;
presetting grouping dimensions, which comprise patient basic factors, preoperative test factors and postoperative pathological factors;
acquiring a data set from the clinical data, wherein the data set consists of the preset grouping dimensions corresponding to each case;
constructing a prediction model for early postoperative recurrence of liver cancer by applying a random survival forest algorithm to the data set and the recurrence time of each case.
Further, before the acquiring of clinical data and recurrence time of each case, the method further comprises:
obtaining enrolled cases, each of which had a normal preoperative liver function assessment, no history of malignant tumor, and no invasion of adjacent organs or distant metastasis, had undergone liver cancer resection with postoperative pathology confirming hepatocellular carcinoma, and had recurred after surgery.
Further, the constructing of the prediction model for early postoperative recurrence of liver cancer by applying a random survival forest algorithm to the data set and the recurrence time of each case comprises:
dividing the cases according to a preset proportion to obtain training group cases and test group cases;
dividing the data set according to the training group cases and the test group cases to obtain a training group data set and a test group data set;
constructing, from the training group data set and the recurrence time of each of the training group cases, a prediction model for early postoperative recurrence of liver cancer and its cumulative hazard function by applying a random survival forest algorithm;
predicting each case in the training group cases by using the cumulative hazard function to obtain a risk score set;
dividing the risk score set according to a preset proportion to obtain the risk score ranges corresponding to the low-risk recurrence group, the medium-risk recurrence group and the high-risk recurrence group, respectively.
Further, the method also comprises:
acquiring the grouping dimensions of a case;
calculating a risk score for the case with the prediction model for early postoperative recurrence of liver cancer, according to the grouping dimensions of the case;
determining the corresponding risk group according to the risk score range into which the calculated risk score falls;
outputting the determined risk group.
Further, the method also comprises:
deploying the prediction model for early postoperative recurrence of liver cancer on a server, and generating a corresponding prediction web page.
Further, the method also comprises:
acquiring the grouping dimensions of a case;
calculating the recurrence condition of the case with the prediction model for early postoperative recurrence of liver cancer, according to the grouping dimensions of the case.
Further, the recurrence condition includes a risk score, a probability of no recurrence, and the corresponding curve.
Further, the patient basic factors include age and sex; the preoperative test factors include platelets, albumin, total bilirubin, etiological test results and alpha-fetoprotein; the postoperative pathological factors include maximum tumor diameter, tumor number, macroscopic vascular invasion, microvascular invasion, satellite nodules, tumor capsule, liver cancer differentiation grade and liver cirrhosis type.
As those skilled in the art will understand from the above description, all or part of the processes in the above technical schemes can be implemented by a computer program instructing the relevant hardware. The program can be stored in a computer-readable storage medium and, when executed, can carry out the processes of the above methods. When executed by a processor, the program also achieves the beneficial effects of the corresponding methods.
The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
Example one
The embodiment provides a liver cancer postoperative recurrence prediction method based on random survival forests, which comprises the following steps:
S1: obtaining cases that had a normal preoperative liver function assessment, no history of malignant tumor, and no invasion of adjacent organs or distant metastasis, had undergone liver cancer resection with postoperative pathology confirming hepatocellular carcinoma, and had recurred after surgery;
S2: acquiring clinical data and recurrence time of each case;
S3: presetting grouping dimensions, which comprise patient basic factors, preoperative test factors and postoperative pathological factors;
specifically, the patient basic factors include age and sex; the preoperative test factors include platelets, albumin, total bilirubin, etiological test results and alpha-fetoprotein; the postoperative pathological factors include maximum tumor diameter, tumor number, macroscopic vascular invasion, microvascular invasion, satellite nodules, tumor capsule, liver cancer differentiation grade and liver cirrhosis type.
S4: acquiring a data set from the clinical data, wherein the data set consists of the preset grouping dimensions corresponding to each case;
S5: constructing a prediction model for early postoperative recurrence of liver cancer by applying a random survival forest algorithm to the data set and the recurrence time of each case. Preferably, the prediction model is built with the randomForestSRC package of the R language.
S6: acquiring the grouping dimensions of a case;
S7: calculating the recurrence condition of the case with the prediction model for early postoperative recurrence of liver cancer, according to the grouping dimensions of the case.
Preferably, the recurrence condition includes a risk score, a probability of no recurrence, and the corresponding curve.
In one embodiment, the method further comprises the following step:
S8: deploying the prediction model for early postoperative recurrence of liver cancer on a server, and generating a corresponding prediction web page or prediction application.
Example two
Referring to FIG. 2, this embodiment further refines the first embodiment:
S5 specifically includes:
S51: dividing the cases according to a preset proportion to obtain training group cases and test group cases;
S52: dividing the data set according to the training group cases and the test group cases to obtain a training group data set and a test group data set;
S53: constructing, from the training group data set and the recurrence time of each of the training group cases, a prediction model for early postoperative recurrence of liver cancer and its cumulative hazard function by applying a random survival forest algorithm;
S54: predicting each case in the training group cases by using the cumulative hazard function to obtain a risk score set formed by the risk scores of the cases;
S55: dividing the risk score set according to a preset proportion to obtain the risk score ranges corresponding to the low-risk recurrence group, the medium-risk recurrence group and the high-risk recurrence group, respectively.
Specifically, the risk scores of all cases in the test group are sorted from low to high. According to medical experience and rules, high-risk patients are a minority and low-risk patients make up about half, so the sorted cases are cut at 50% and 85% of the number of patients: the risk score range covering the lowest 0-50% of cases is defined as the low-risk recurrence group, the range covering 50%-85% of cases as the medium-risk recurrence group, and the range above 85% of cases as the high-risk recurrence group. For example, if a patient has a risk score of 25 and that score falls within the range of the low-risk recurrence group, the patient is at low risk of recurrence.
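The 50% and 85% division described above can be sketched as a rank-based cut over the sorted scores. The function name `risk_cut_points` and the exact tie-breaking rule (the cut is the score of the last case inside the lower fraction) are assumptions made here; the source specifies only the two percentages.

```python
def risk_cut_points(scores, low_pct=50, high_pct=85):
    """Return (low_cut, high_cut) for a collection of risk scores.

    About low_pct% of cases score <= low_cut (low-risk group), and about
    high_pct% of cases score <= high_cut (low- plus medium-risk groups).
    """
    if not scores:
        raise ValueError("scores must not be empty")
    if not 0 < low_pct < high_pct < 100:
        raise ValueError("expected 0 < low_pct < high_pct < 100")
    ordered = sorted(scores)
    n = len(ordered)

    def cut(pct):
        # Number of cases at or below the cut is ceil(n * pct / 100),
        # computed with integers to avoid floating-point rounding.
        k = -(-n * pct // 100)
        return ordered[max(k, 1) - 1]

    return cut(low_pct), cut(high_pct)
```

With 20 cases scored 1 to 20, for instance, the cuts fall on the 10th and 17th sorted scores, leaving 10 low-risk, 7 medium-risk and 3 high-risk cases.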
Meanwhile, the method further comprises the following steps:
acquiring the grouping dimensions of a case;
calculating a risk score for the case with the prediction model for early postoperative recurrence of liver cancer, according to the grouping dimensions of the case;
determining the corresponding risk group according to the risk score range into which the calculated risk score falls;
outputting the determined risk group.
Because the prediction result of this embodiment also includes the risk level of the individual case, it is more intuitive and understandable and easier to extend to non-medical users, which makes the method more practical.
Example three
This embodiment corresponds to the second embodiment and further specifies the scheme as a whole. Referring again to FIG. 2, the method includes:
S1: acquiring enrolled cases, each of which meets the following requirements: normal preoperative liver function assessment, no history of malignant tumor, no invasion of adjacent organs or distant metastasis, liver cancer resection performed with postoperative pathology confirming hepatocellular carcinoma, and postoperative recurrence;
S2: acquiring the recurrence time, relevant clinical data and follow-up data of each case, and excluding patients with incomplete data;
S3: determining the grouping dimensions, comprising at least:
1. basic factors of cases: sex, age;
2. preoperative test factors: platelets, albumin, total bilirubin, etiological tests (hepatitis B, hepatitis C, other), alpha-fetoprotein;
3. postoperative pathological factors: maximum tumor diameter, tumor number, macroscopic vascular invasion, microvascular invasion, satellite nodules, tumor capsule, liver cancer differentiation grade, and liver cirrhosis type;
acquiring a data set consisting of the grouping dimensions of each case determined in S2;
S4: dividing the data set into a training group and a test group according to a proportion, taking the data of one case as the unit;
S5: building a model from the training group data set with a random survival forest algorithm, using the randomForestSRC package of the R language with default parameters, to form the prediction model for early postoperative recurrence of liver cancer;
S6: predicting each patient in the test group with the cumulative hazard function of the model to obtain a corresponding risk score, wherein a greater risk score indicates a greater probability of early recurrence;
S7: sorting the risk scores of all patients in the test group from low to high; according to medical experience and rules, high-risk patients are a minority and low-risk patients make up about half, so the patients are cut at 50% and 85% of the number of patients. If, for example, the two cut points are risk scores of 32.524 and 66.511, the lowest 0-50% of patients form the low-risk recurrence group (risk score ≤ 32.524), 50%-85% of patients form the medium-risk recurrence group (32.524 < risk score ≤ 66.511), and the patients above 85% form the high-risk recurrence group (risk score > 66.511);
for example, a patient with a risk score of 25 is at low risk of recurrence; a patient with a risk score of 50 is at medium risk of recurrence; and a patient with a risk score of 71 is at high risk of recurrence.
S8: building a web page and server with the Shiny package of the R language, and deploying the prediction model for early postoperative recurrence of liver cancer on the server to form a prediction web page;
S9: for a patient who meets the enrollment conditions, collecting the patient's grouping dimensions and entering the following 15 indexes into the prediction page through selectors and sliders: age (numerical value), sex (male, female), etiology (hepatitis B, hepatitis C, other), platelets (numerical value), albumin (numerical value), total bilirubin (numerical value), alpha-fetoprotein (numerical value), tumor size (numerical value), tumor number (1, 2, 3, 4, 5 or more), microvascular tumor thrombus (present or absent), macroscopic vascular invasion (present or absent), differentiation grade (I-II, III-IV), tumor capsule (present or absent), satellite nodules (present or absent) and liver cirrhosis (present or absent);
S10: clicking the prediction button, whereupon the server receives the web page data and runs the trained model to obtain the model score, the risk group, the probability of no recurrence within 2 years and the recurrence-free curve; for example, if the risk score is greater than 66.511, the patient is in the high-risk group, and the physician needs to pay special attention to optimizing postoperative treatment and follow-up.
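The mapping from a risk score to a risk group in S7 and S10 can be sketched with the two example cut points, 32.524 and 66.511. The function name `risk_group` and the returned labels are illustrative choices; the cut points are the example values given above and would differ for another test group.

```python
LOW_CUT = 32.524    # example upper bound of the low-risk group
HIGH_CUT = 66.511   # example upper bound of the medium-risk group


def risk_group(score, low_cut=LOW_CUT, high_cut=HIGH_CUT):
    """Map a model risk score to a recurrence risk group.

    Bounds follow S7: score <= low_cut is low risk,
    low_cut < score <= high_cut is medium risk, anything above is high risk.
    """
    if low_cut >= high_cut:
        raise ValueError("low_cut must be smaller than high_cut")
    if score <= low_cut:
        return "low-risk recurrence group"
    if score <= high_cut:
        return "medium-risk recurrence group"
    return "high-risk recurrence group"
```

Applied to the scores mentioned in S7, `risk_group(25)`, `risk_group(50)` and `risk_group(71)` fall into the low-, medium- and high-risk groups respectively.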
Example four
This embodiment corresponds to the first to third embodiments and provides a specific application scenario:
as shown in FIG. 3, the patient data are entered as: age 60 (Age), male (Male), HBV infection, platelets 57 × 10^9/L (PLT), albumin 30 g/L (ALB), total bilirubin 10 μmol/L (TBIL), alpha-fetoprotein 388 ng/mL (AFP), tumor size 12 cm (Tumor size), tumor number 1 (Tumor number), microvascular tumor thrombus (Microvascular invasion), macroscopic vascular invasion (Macrovascular invasion), differentiation grade I-II (Edmondson grade), no tumor capsule (Tumor capsule), no satellite nodules (Satellite nodules), with a background of liver cirrhosis (Liver cirrhosis);
prediction with the random survival forest algorithm of the above embodiments yields a model score of 71.39, which places the patient in the high-risk group, together with a 2-year recurrence-free curve (the curve in FIG. 3); the probabilities of no recurrence at 3, 6, 9, 12, 18 and 24 months are calculated to be 66%, 44%, 33%, 26%, 18% and 14%, respectively (shown above the curve in the figure, each aligned with its time point).
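To show how the reported probabilities of no recurrence relate to a recurrence probability, the six values of this example can be tabulated and queried as below. This is only a sketch: holding each value constant until the next tabulated month is an assumption made here (the curve in FIG. 3 is finer-grained), and the names `RFS_POINTS`, `no_recurrence_probability` and `recurrence_probability` are hypothetical.

```python
from bisect import bisect_right

# (month, probability of no recurrence) pairs reported in this example.
RFS_POINTS = [(3, 0.66), (6, 0.44), (9, 0.33), (12, 0.26), (18, 0.18), (24, 0.14)]


def no_recurrence_probability(month, points=RFS_POINTS):
    """Return the tabulated value at the latest time point <= month.

    Queries outside the tabulated range are rejected rather than guessed.
    """
    months = [m for m, _ in points]
    if month < months[0] or month > months[-1]:
        raise ValueError("month is outside the tabulated range")
    idx = bisect_right(months, month) - 1   # last tabulated month <= query
    return points[idx][1]


def recurrence_probability(month, points=RFS_POINTS):
    """Complement of the probability of no recurrence at the same time."""
    return 1.0 - no_recurrence_probability(month, points)
```

Under these assumptions, a 14% probability of no recurrence at 24 months corresponds to an 86% probability that recurrence has occurred by then.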
Example five
In this embodiment, corresponding to the first to fourth embodiments, a computer-readable storage medium is provided, on which a computer program is stored, and the program, when executed by a processor, can implement the steps included in the method for predicting recurrence after liver cancer operation based on random survival forest of any one of the first to fourth embodiments. The detailed steps are not repeated here, and refer to the descriptions of the first to fourth embodiments for details.
In conclusion, the liver cancer postoperative recurrence prediction method based on random survival forest and the storage medium provided by the invention can accurately predict the postoperative recurrence probability of an individual liver cancer patient, which helps determine what requires attention after surgery and supports active prevention. For medical institutions in particular, the method helps medical staff accurately screen out patients at high risk of recurrence after liver cancer surgery, supports early intervention against recurrence, and guides postoperative follow-up and treatment, thereby improving the cure rate. The prediction result is intuitive and easy to understand, and the method has a wide application range and strong practicability. The method is therefore easy to implement, convenient and quick to operate, low in cost, accurate, practical, and easy to popularize.
The above description is only an embodiment of the present invention and is not intended to limit its scope; all equivalent changes made using the contents of this specification and the drawings, whether applied directly or indirectly in related technical fields, are included in the scope of the present invention.

Claims (9)

1. The method for predicting postoperative recurrence of liver cancer based on random survival forest is characterized by comprising the following steps:
acquiring clinical data and recurrence time of each case;
the preset grouping dimension comprises basic factors of a patient, preoperative inspection factors and postoperative pathological factors;
acquiring a data set according to the clinical data, wherein the data set is composed of preset grouping dimensions corresponding to each case;
and constructing a corresponding liver cancer postoperative early recurrence prediction model by adopting a random survival forest algorithm according to the data set and the recurrence time of each case.
2. The method of predicting post-operative recurrence of liver cancer based on random survival forests as claimed in claim 1, wherein the obtaining of clinical data and time of recurrence for each case further comprises:
acquiring enrolled patients, the enrolled patients being those who had a normal preoperative liver function assessment, no history of malignant tumor, and no invasion of adjacent organs or distant metastasis, who underwent liver cancer resection with postoperative pathology confirming hepatocellular carcinoma, and who relapsed after surgery.
3. The method for predicting postoperative recurrence of liver cancer based on random survival forest as claimed in claim 1, wherein the step of constructing a corresponding postoperative early recurrence prediction model of liver cancer by using a random survival forest algorithm according to the data set and the recurrence time of each case comprises:
dividing each case according to a preset proportion to obtain a training group case and a testing group case;
dividing the data set according to the training group cases and the test group cases to obtain a training group data set and a test group data set;
according to the training group data set and the recurrence time of each case in the training group cases, adopting a random survival forest algorithm to construct a corresponding liver cancer postoperative early recurrence prediction model and its cumulative risk function;
predicting each case in the training set of cases by using the cumulative risk function to obtain a risk score set;
and dividing the risk score set according to a preset proportion to obtain risk score ranges respectively corresponding to the low-risk recurrence group, the medium-risk recurrence group and the high-risk recurrence group.
4. The method of predicting post-operative recurrence of liver cancer based on random survival forests as claimed in claim 3 further comprising:
acquiring the grouping dimension of a case;
calculating a risk score corresponding to the case through the liver cancer postoperative early recurrence prediction model according to the grouping dimension of the case;
determining corresponding risk groups according to the risk score range to which the calculated risk score belongs;
outputting the determined risk group.
5. The method of predicting post-operative recurrence of liver cancer based on random survival forests as claimed in claim 1 further comprising:
deploying the liver cancer postoperative early recurrence prediction model to a server, and generating a corresponding prediction webpage.
6. The method of predicting post-operative recurrence of liver cancer based on random survival forests as claimed in claim 1 further comprising:
acquiring the grouping dimension of a case;
and calculating, according to the grouping dimension of the case, the recurrence condition corresponding to the case through the liver cancer postoperative early recurrence prediction model.
7. The method of predicting post-operative recurrence of liver cancer based on random survival forests as claimed in claim 6, wherein the recurrence condition comprises a risk score, a probability of no recurrence, and the corresponding curve.
8. The method of predicting post-operative recurrence of liver cancer based on random survival forests as claimed in claim 6, wherein the basic factors of the patient include age and gender; the preoperative inspection factors comprise platelets, albumin, total bilirubin, etiological examination results and alpha-fetoprotein; the postoperative pathological factors comprise maximum tumor diameter, tumor number, macroscopic vascular invasion, microvascular invasion, satellite nodules, tumor capsule, liver cancer differentiation grade and liver cirrhosis type.
9. A computer-readable storage medium, having a computer program stored thereon, wherein the program, when executed by a processor, is capable of implementing the steps included in the method for predicting post-operative recurrence of liver cancer based on random survival forest according to any one of claims 1 to 8.
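The grouping steps recited in claims 3 and 4 can be illustrated with a short Python sketch. It is an illustration under stated assumptions rather than the claimed implementation: the function names, the random shuffle used for the split, and the example proportions are hypothetical, since the claims only refer to a "preset proportion". The risk scores are taken as given inputs, and the random survival forest that would produce them is not reproduced here.

```python
import random
from bisect import bisect_left

GROUPS = ("low-risk", "medium-risk", "high-risk")


def split_cases(case_ids, train_fraction, seed=0):
    """Shuffle the cases and split them into training and test groups."""
    shuffled = list(case_ids)
    random.Random(seed).shuffle(shuffled)
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


def risk_cutoffs(training_scores, low_fraction, medium_fraction):
    """Return the upper score bounds of the low-risk and medium-risk groups.

    The bounds are the scores at the rank positions implied by the fractions;
    tied scores can make the resulting groups slightly uneven.
    """
    ordered = sorted(training_scores)
    n = len(ordered)
    low_count = round(n * low_fraction)
    medium_count = round(n * medium_fraction)
    if low_count < 1 or medium_count < 1 or low_count + medium_count >= n:
        raise ValueError("each risk group needs at least one training case")
    return ordered[low_count - 1], ordered[low_count + medium_count - 1]


def assign_group(score, cutoffs):
    """Map a risk score to a group; a score above the last cutoff is high-risk."""
    return GROUPS[bisect_left(cutoffs, score)]


if __name__ == "__main__":
    train, test = split_cases(range(10), train_fraction=0.7)
    # Hypothetical training-set risk scores, for illustration only.
    scores = [10, 20, 30, 40, 50, 60, 70, 80, 90, 100]
    cutoffs = risk_cutoffs(scores, low_fraction=0.3, medium_fraction=0.4)
    print(cutoffs, assign_group(85, cutoffs))
```

Deriving the cutoffs from the training group only, and then applying them unchanged to new cases, keeps the grouping of a new case independent of the case itself.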
CN202110098484.8A 2020-07-14 2021-01-25 Liver cancer postoperative recurrence prediction method based on random survival forest and storage medium Pending CN112768060A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN202010671934 2020-07-14
CN2020106719343 2020-07-14

Publications (1)

Publication Number Publication Date
CN112768060A true CN112768060A (en) 2021-05-07

Family

ID=75707197

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110098484.8A Pending CN112768060A (en) 2020-07-14 2021-01-25 Liver cancer postoperative recurrence prediction method based on random survival forest and storage medium

Country Status (1)

Country Link
CN (1) CN112768060A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113517023A (en) * 2021-05-18 2021-10-19 柳州市人民医院 Sex-related liver cancer prognosis marker factor and screening method thereof
CN113571194A (en) * 2021-07-09 2021-10-29 清华大学 Modeling method and device for hepatocellular carcinoma long-term prognosis prediction

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120034235A1 (en) * 2009-01-22 2012-02-09 Korea Institute Of Radiological & Medical Sciences Marker for Liver-Cancer Diagnosis and Recurrence and Survival Prediction, a Kit Comprising the Same, and Prognosis Prediction in Liver-Cancer Patients Using the Marker
US20180251851A1 (en) * 2015-09-10 2018-09-06 Mathias HEIKENWALDER Ectopic lymphoid structures as targets for liver cancer detection, risk prediction and therapy
CN110660481A (en) * 2019-09-27 2020-01-07 颐保医疗科技(上海)有限公司 Artificial intelligence technology-based primary liver cancer recurrence prediction method
CN110791565A (en) * 2019-09-29 2020-02-14 浙江大学 Prognostic marker gene for colorectal cancer recurrence prediction in stage II and random survival forest model
CN110993106A (en) * 2019-12-11 2020-04-10 深圳市华嘉生物智能科技有限公司 Liver cancer postoperative recurrence risk prediction method combining pathological image and clinical information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
陈凯 等: "肝癌根治性切除术后早期复发危险因素分析及预测模型构建", 《中华肿瘤防治杂志》 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20210507