CN114974598B - Method for constructing lung cancer prognosis prediction model and lung cancer prognosis prediction system - Google Patents

Info

Publication number
CN114974598B
Authority
CN
China
Prior art keywords
lung cancer
factors
model
prediction
prognosis
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210750259.2A
Other languages
Chinese (zh)
Other versions
CN114974598A (en)
Inventor
杨帆
薛付忠
李江冰
钟璐
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University
Priority to CN202210750259.2A
Publication of CN114974598A
Application granted
Publication of CN114974598B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/50: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics, for simulation or modelling of medical disorders

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Epidemiology (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Measuring And Recording Apparatus For Diagnosis (AREA)

Abstract

The invention discloses a method for constructing a lung cancer prognosis prediction model and a lung cancer prognosis prediction system. The method comprises the following steps: performing univariate analysis on the acquired lung cancer disease variables and screening them to obtain lung cancer predictors, and constructing a Bayesian network model from the lung cancer predictors; performing secondary screening of the lung cancer predictors through feature selection to obtain lung cancer prognostic factors, and constructing a multivariable Cox proportional hazards regression model from the lung cancer prognostic factors; introducing a lung cancer survival outcome variable into the Bayesian network model to obtain a new network structure; predicting, with the multivariable Cox proportional hazards regression model, the survival probability for each combination of states of the lung cancer prognostic factors, to obtain a conditional probability table for the lung cancer survival outcome variable; and fitting the new network structure with the conditional probability table to obtain the lung cancer prognosis prediction model. The invention addresses the difficulty that traditional prediction models have in making predictions when clinical data lack key predictor variables, and improves the accuracy of clinical prediction.

Description

Method for constructing lung cancer prognosis prediction model and lung cancer prognosis prediction system
Technical Field
The invention relates to the technical field of prognosis prediction, in particular to a lung cancer prognosis prediction model construction method and a lung cancer prognosis prediction system.
Background
The statements in this section merely provide background information related to the present disclosure and may not necessarily constitute prior art.
In cancer research, work on personalized prognosis prediction models has mostly focused on regression-based methods, such as lung cancer survival prediction models based on Cox proportional hazards regression. However, in retrospective studies, clinical and survival data may contain many missing values for a variety of reasons. If a study includes long-term follow-up data, some clinical covariates may never have been measured. Missing covariate data are therefore very common in clinical datasets, which presents a significant challenge to regression-based models.
Current methods for handling incomplete covariate data include complete case analysis and imputation-based methods. Complete case analysis removes records that contain missing values, which easily introduces selection bias; imputation-based methods have to be applied to the whole data set, so personalized prediction cannot be achieved. When a risk prediction model is built, improper handling of missing data can distort the results and reduce the accuracy of the analysis.
Disclosure of Invention
In order to solve the above problems, the invention provides a method for constructing a lung cancer prognosis prediction model and a lung cancer prognosis prediction system, which address the difficulty that traditional prediction models have in making predictions when clinical data lack key predictor variables, and which improve the accuracy of clinical prediction.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
In a first aspect, the present invention provides a method for constructing a lung cancer prognosis prediction model, comprising:
performing univariate analysis on the acquired lung cancer disease variables and screening them to obtain lung cancer predictors, and constructing a Bayesian network model from the lung cancer predictors;
performing secondary screening of the lung cancer predictors through feature selection to obtain lung cancer prognostic factors, and constructing a multivariable Cox proportional hazards regression model from the lung cancer prognostic factors;
introducing a lung cancer survival outcome variable into the Bayesian network model to obtain a new network structure;
predicting, with the multivariable Cox proportional hazards regression model, the survival probability for each combination of states of the lung cancer prognostic factors, to obtain a conditional probability table for the lung cancer survival outcome variable;
and fitting the new network structure with the conditional probability table to obtain the lung cancer prognosis prediction model.
As an alternative embodiment, the conditional probability table of the lung cancer survival outcome variable is:
where $S_0(t)$ is the baseline risk function at year t; $\beta_i$ is the regression coefficient of the Cox proportional hazards regression model for the corresponding variable $x_i$; and p is the number of lung cancer prognostic factors in the multivariable Cox proportional hazards regression model.
As an alternative embodiment, the Bayesian network model is constructed from the lung cancer predictors using a model averaging method, and the Bayesian network model captures the dependencies among the lung cancer predictors.
As an alternative embodiment, the lung cancer predictors undergo secondary screening with a LASSO-Cox regression feature selection method to obtain the lung cancer prognostic factors.
As an alternative embodiment, the lung cancer predictor includes: smoking, advanced age, pleural effusion, pathological stage, lung abscess, pulmonary heart disease, interstitial lung disease, pulmonary embolism, respiratory failure, erythrocyte count, fibrinogen and eosinophils.
As an alternative embodiment, the lung cancer prognostic factor includes: stage, sex, age, smoking, alcohol consumption, chronic obstructive pulmonary disease, targeted therapy, pneumonia, interstitial lung disease, respiratory failure, fibrinogen and pathological classification.
In a second aspect, the present invention provides a lung cancer prognosis prediction system comprising:
a first model construction module, configured to perform univariate analysis on the acquired lung cancer disease variables and screen them to obtain lung cancer predictors, and to construct a Bayesian network model from the lung cancer predictors;
a second model construction module, configured to perform secondary screening of the lung cancer predictors through feature selection to obtain lung cancer prognostic factors, and to construct a multivariable Cox proportional hazards regression model from the lung cancer prognostic factors;
a network structure learning module, configured to introduce a lung cancer survival outcome variable into the Bayesian network model to obtain a new network structure;
a network parameter learning module, configured to predict, with the multivariable Cox proportional hazards regression model, the survival probability for each combination of states of the lung cancer prognostic factors, to obtain a conditional probability table for the lung cancer survival outcome variable;
a prognosis prediction model construction module, configured to fit the new network structure with the conditional probability table to obtain the lung cancer prognosis prediction model;
and a prognosis prediction module, configured to predict the patient's lung cancer risk probability according to the lung cancer prognosis prediction model.
In an alternative embodiment, the prognosis prediction module uses a likelihood weighting inference algorithm to predict the lung cancer risk probability according to the lung cancer prognosis prediction model.
In a third aspect, the present invention provides an electronic device comprising a memory, a processor, and computer instructions stored in the memory and run on the processor, wherein the computer instructions, when executed by the processor, perform a lung cancer prognosis prediction method; the lung cancer prognosis prediction method comprises the following steps:
performing univariate analysis on the acquired lung cancer disease variables and screening them to obtain lung cancer predictors, and constructing a Bayesian network model from the lung cancer predictors;
performing secondary screening of the lung cancer predictors through feature selection to obtain lung cancer prognostic factors, and constructing a multivariable Cox proportional hazards regression model from the lung cancer prognostic factors;
introducing a lung cancer survival outcome variable into the Bayesian network model to obtain a new network structure;
predicting, with the multivariable Cox proportional hazards regression model, the survival probability for each combination of states of the lung cancer prognostic factors, to obtain a conditional probability table for the lung cancer survival outcome variable;
fitting the new network structure with the conditional probability table to obtain the lung cancer prognosis prediction model;
and predicting the patient's lung cancer risk probability according to the lung cancer prognosis prediction model.
In a fourth aspect, the present invention provides a computer-readable storage medium storing computer instructions that, when executed by a processor, perform a lung cancer prognosis prediction method; the lung cancer prognosis prediction method comprises the following steps:
performing univariate analysis on the acquired lung cancer disease variables and screening them to obtain lung cancer predictors, and constructing a Bayesian network model from the lung cancer predictors;
performing secondary screening of the lung cancer predictors through feature selection to obtain lung cancer prognostic factors, and constructing a multivariable Cox proportional hazards regression model from the lung cancer prognostic factors;
introducing a lung cancer survival outcome variable into the Bayesian network model to obtain a new network structure;
predicting, with the multivariable Cox proportional hazards regression model, the survival probability for each combination of states of the lung cancer prognostic factors, to obtain a conditional probability table for the lung cancer survival outcome variable;
fitting the new network structure with the conditional probability table to obtain the lung cancer prognosis prediction model;
and predicting the patient's lung cancer risk probability according to the lung cancer prognosis prediction model.
Compared with the prior art, the invention has the following beneficial effects:
To address the missing data that are widespread in clinical datasets, the invention provides a method for constructing a lung cancer prognosis prediction model based on Bayesian network uncertainty reasoning, together with a lung cancer prognosis prediction system. They overcome the inability of existing prediction models to predict a patient's risk when a covariate is missing, and can predict the patient's risk effectively and accurately.
The method and system construct a Bayesian network model from the lung cancer predictors; the Bayesian network models the probabilistic dependencies among the lung cancer predictors and infers the probability distribution of a missing variable more accurately, thereby improving the accuracy of clinical prediction.
The method and system combine a Bayesian network with a multivariable Cox proportional hazards regression model. The Bayesian network fills in missing values automatically by inferring the possible values of the missing data, and the multivariable Cox proportional hazards regression model predicts the survival probability for every possible state of the missing data. Because the Bayesian network supports reasoning under uncertainty, the resulting lung cancer prognosis prediction model performs accurately and robustly on clinical diagnosis and risk prediction tasks.
Additional aspects of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the invention.
FIG. 1 is a flowchart of the method for constructing a lung cancer prognosis prediction model provided in Example 1 of the present invention;
FIG. 2 (a) is a plot of the LASSO coefficient profiles of the candidate predictors in the LASSO-Cox regression feature selection provided in Example 1 of the present invention;
FIG. 2 (b) is a plot of the cross-validation used to select the optimal regularization parameter in the LASSO-Cox regression feature selection provided in Example 1 of the present invention;
FIGS. 3 (a)-3 (b) are ROC curves of the lung cancer prognosis prediction model provided in Example 1 of the present invention on the training and validation sets;
FIGS. 4 (a)-4 (b) are calibration plots of the lung cancer prognosis prediction model provided in Example 1 of the present invention on the training and validation sets;
FIG. 5 is a comparison of the decision curves of the lung cancer prognosis prediction model and the Cox proportional hazards regression model provided in Example 1 of the present invention.
Detailed Description
The invention is further described below with reference to the drawings and examples.
It should be noted that the following detailed description is exemplary and is intended to provide further explanation of the invention. Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.
It is noted that the terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of exemplary embodiments according to the present invention. As used herein, unless the context clearly indicates otherwise, the singular forms also are intended to include the plural forms, and furthermore, it is to be understood that the terms "comprises" and "comprising" and any variations thereof are intended to cover non-exclusive inclusions, such as, for example, processes, methods, systems, products or devices that comprise a series of steps or units, are not necessarily limited to those steps or units that are expressly listed, but may include other steps or units that are not expressly listed or inherent to such processes, methods, products or devices.
Embodiments of the invention and features of the embodiments may be combined with each other without conflict.
Example 1
In order to solve the problem of missing data in clinical risk assessment, this embodiment constructs a lung cancer prognosis prediction model based on a survival Bayesian network model, which combines a Bayesian network with a multivariable Cox proportional hazards regression model. The Bayesian network fills in missing values automatically by inferring the possible values of the missing data. The multivariable Cox proportional hazards regression model then predicts the survival probability for every possible state of the missing data, and the survival of an individual lung cancer patient is predicted by weighting each survival probability by the probability of the corresponding state. Because the Bayesian network supports reasoning under uncertainty, the resulting lung cancer prognosis prediction model performs accurately and robustly on clinical diagnosis and risk prediction tasks.
As shown in FIG. 1, this embodiment provides a method for constructing a lung cancer prognosis prediction model based on Bayesian network uncertainty reasoning, which includes:
performing univariate analysis on the acquired lung cancer disease variables and screening them to obtain lung cancer predictors, and constructing a Bayesian network model from the lung cancer predictors;
performing secondary screening of the lung cancer predictors through feature selection to obtain lung cancer prognostic factors, and constructing a multivariable Cox proportional hazards regression model from the lung cancer prognostic factors;
introducing a lung cancer survival outcome variable into the Bayesian network model to obtain a new network structure;
predicting, with the multivariable Cox proportional hazards regression model, the survival probability for each combination of states of the lung cancer prognostic factors, to obtain a conditional probability table for the lung cancer survival outcome variable;
and fitting the new network structure with the conditional probability table to obtain the lung cancer prognosis prediction model.
In the present embodiment, demographic information and pathology information of lung cancer patients are acquired, including: sex, age, occupation, marital status, family history of lung cancer, smoking, drinking, pneumonia, pleural effusion, lung abscess, pulmonary heart disease, interstitial lung disease, pulmonary embolism, respiratory failure, red blood cell count, monocyte count, direct bilirubin, eosinophil count, fibrinogen, chemoradiotherapy, targeted therapy, medication, staging, pathological diagnosis type, whether surgery was performed, surgical mode, and overall survival of lung cancer.
The inclusion criterion was a first diagnosis of lung cancer confirmed by surgery or biopsy (bronchoscopy, lung biopsy or lymph node biopsy), with ICD-10 code C34; the exclusion criteria were secondary lung cancer, an abnormal identification card number format, abnormal records in the cause-of-death registration data, etc.
In this example, the study endpoint was defined as death, the overall survival of lung cancer was defined as the interval from the time of lung cancer diagnosis to the time of death or last follow-up, and the patient data were encoded and discretized.
In this embodiment, the lung cancer patient data are divided into a training set and a validation set; the model is built on the training set and verified on the validation set. Patients in the training set have complete covariate information, whereas patients in the validation set lack some covariate information. Specifically, the training set included 2137 lung cancer patients without any missing covariate data, and the validation set included 3103 lung cancer patients with missing covariate information.
In order to construct the lung cancer prognosis prediction model and identify factors that may be related to the risk of death of lung cancer patients, this embodiment uses univariate Cox proportional hazards regression models to screen the candidate lung cancer disease variables and obtain the lung cancer predictors.
Specifically, after univariate analysis, all candidate lung cancer disease variables are screened at the threshold p < 0.05, and the lung cancer predictors that pass the screening are carried forward into the multivariable analysis.
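The p < 0.05 screening rule can be sketched as follows. This is a minimal illustration, assuming that each univariate Cox model has already been fitted elsewhere and has produced a coefficient (log hazard ratio) and a standard error, and that the p-value is a two-sided Wald test under the normal approximation. The function names, variable names and numbers are hypothetical and do not come from the patent.

```python
import math


def wald_p_value(beta: float, se: float) -> float:
    """Two-sided Wald p-value for H0: beta = 0 (normal approximation)."""
    z = abs(beta / se)
    # 2 * (1 - Phi(z)) equals erfc(z / sqrt(2)) for the standard normal CDF Phi.
    return math.erfc(z / math.sqrt(2.0))


def screen_predictors(estimates: dict, alpha: float = 0.05) -> list:
    """Keep the variables whose univariate p-value is below alpha."""
    return [
        name
        for name, (beta, se) in estimates.items()
        if wald_p_value(beta, se) < alpha
    ]


# Hypothetical univariate estimates: name -> (log hazard ratio, standard error).
example_estimates = {
    "smoking": (0.45, 0.10),
    "marital_status": (0.05, 0.12),
    "pleural_effusion": (0.60, 0.15),
}

selected = screen_predictors(example_estimates)
print(selected)
```

With these illustrative inputs, the variable whose coefficient is small relative to its standard error is dropped, and the other two are kept for the multivariable analysis.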
In this embodiment, the lung cancer predictors include: smoking, advanced age, pleural effusion, poor pathological stage, lung abscess, pulmonary heart disease, interstitial lung disease, pulmonary embolism, respiratory failure, higher red blood cell count, higher fibrinogen and higher eosinophils.
In this embodiment, a Bayesian network model is constructed from the lung cancer predictors using a model averaging method. The Bayesian network models the dependencies among the lung cancer predictors, so that the posterior probability distribution of a missing variable can be inferred.
In this embodiment, the Bayesian network structure is learned from the data with a tabu search algorithm, and prior knowledge from the medical literature is encoded as a blacklist and a whitelist of arcs.
For example, according to existing medical evidence, the age and sex variables are allowed to point to the smoking variable, but no variable is allowed to point to age or sex; in addition, smoking is the primary cause of chronic obstructive pulmonary disease and is therefore a parent node of chronic obstructive pulmonary disease.
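The prior-knowledge constraints in this example can be written down as arc lists before structure learning. The sketch below assumes that a blacklist is a set of forbidden directed arcs and a whitelist is a set of required arcs, each stored as a (parent, child) pair. The helper `build_constraints` and the four-variable list are hypothetical, and the sketch does not run the tabu search itself.

```python
from itertools import permutations


def build_constraints(variables, root_only, required_arcs):
    """Return (blacklist, whitelist) as sets of directed arcs (parent, child)."""
    # Blacklist: no variable may point into a root-only node such as age or sex.
    blacklist = {
        (src, dst)
        for src, dst in permutations(variables, 2)
        if dst in root_only
    }
    # Whitelist: arcs that must appear in every learned structure.
    whitelist = set(required_arcs)
    conflicts = blacklist & whitelist
    if conflicts:
        raise ValueError(f"arcs both required and forbidden: {sorted(conflicts)}")
    return blacklist, whitelist


# Hypothetical subset of the variables named in the text.
variables = ["age", "sex", "smoking", "copd"]
blacklist, whitelist = build_constraints(
    variables,
    root_only={"age", "sex"},
    required_arcs=[("smoking", "copd")],
)
print(sorted(blacklist))
print(sorted(whitelist))
```

Arcs from age or sex into smoking stay available to the search because they are in neither list, which matches the statement that those arcs are allowed rather than required.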
Because a single learned Bayesian network structure may overfit, this embodiment applies a model averaging strategy during structure learning.
To learn a robust network structure, this embodiment resamples the data 200 times with the bootstrap and learns one network structure per resample, giving 200 structures; a model averaging method is then applied to the 200 bootstrap networks to obtain a consensus Bayesian network structure.
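One common way to average bootstrap networks is to keep the arcs that appear in a large enough fraction of the learned structures. The sketch below assumes that rule with a fixed threshold of 0.5. The threshold, the function name and the toy arc sets are assumptions made for illustration, since the text does not state how the averaged structure is selected. The sketch also does not check that the averaged graph is acyclic.

```python
from collections import Counter


def average_structures(edge_sets, threshold=0.5):
    """Return (kept_arcs, arc_strength) from a list of learned arc sets."""
    n_networks = len(edge_sets)
    # Count in how many bootstrap networks each directed arc appears.
    counts = Counter(arc for edges in edge_sets for arc in set(edges))
    strength = {arc: count / n_networks for arc, count in counts.items()}
    kept = {arc for arc, s in strength.items() if s >= threshold}
    return kept, strength


# Toy stand-in for the 200 bootstrap structures: four small arc sets.
bootstrap_networks = [
    {("A", "B"), ("B", "C")},
    {("A", "B")},
    {("A", "B"), ("C", "B")},
    {("A", "B"), ("B", "C")},
]

consensus, strength = average_structures(bootstrap_networks)
print(sorted(consensus))
```

In practice each arc set would come from running the structure learner on one bootstrap resample of the training data.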
To further simplify the Bayesian network model and prevent it from overfitting, this embodiment uses a LASSO feature selection method for secondary screening of the lung cancer predictors to obtain the lung cancer prognostic factors, and constructs a multivariable Cox proportional hazards regression model from the lung cancer prognostic factors.
Specifically, as shown in FIGS. 2 (a)-2 (b), the lung cancer prognostic factors are selected by LASSO-Cox regression with 10-fold cross-validation; the multivariable Cox proportional hazards regression model is then constructed, and the baseline risk, absolute risk and survival probability are obtained.
Specifically, after secondary screening, 12 lung cancer prognostic factors were obtained, including: stage, sex, age, smoking, alcohol consumption, chronic obstructive pulmonary disease, targeted therapy, pneumonia, interstitial lung disease, respiratory failure, fibrinogen and pathological classification.
The multivariable analysis shows that worse pathological stage, smoking, older age, chronic obstructive pulmonary disease, higher fibrinogen level and pneumonia increase the risk of death from lung cancer.
In order to combine the Bayesian network's ability to reason effectively about missing data with the good survival prediction ability of the Cox proportional hazards regression model, this embodiment constructs a survival Bayesian network model as the lung cancer prognosis prediction model. The construction proceeds as follows:
First, a lung cancer survival outcome variable is introduced into the Bayesian network to obtain the structure of the survival Bayesian network; in the survival Bayesian network, the lung cancer prognostic factors point to the lung cancer survival outcome variable.
Then, the multivariable Cox proportional hazards regression model is used to predict the survival probability for each combination of lung cancer prognostic factor states, which gives the conditional probability table of the lung cancer survival outcome variable node. The calculation formula is as follows:
where survival status is the survival outcome of each lung cancer patient, with 1 representing death and 0 representing survival; $S_0(t)$ represents the baseline risk function at year t; $\beta_i$ is the regression coefficient of the Cox model for the corresponding variable $x_i$; p is the number of prognostic factors of lung cancer patients in the multivariable Cox regression model; and $n_E$ is the set of parent nodes of the lung cancer survival node.
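The formula for this conditional probability table is not reproduced in this text. A reconstruction consistent with the symbol definitions is sketched here, assuming that $S_0(t)$ acts as the baseline survival probability at year t, as in the standard Cox model, and that the $x_i$ are the states of the parent nodes of the survival node; the patent's exact notation may differ:

```latex
\begin{aligned}
P(\text{survival status} = 0 \mid x_1, \dots, x_p) &= S_0(t)^{\exp\left(\sum_{i=1}^{p} \beta_i x_i\right)}, \\
P(\text{survival status} = 1 \mid x_1, \dots, x_p) &= 1 - S_0(t)^{\exp\left(\sum_{i=1}^{p} \beta_i x_i\right)}.
\end{aligned}
```

Under this reading, each row of the conditional probability table corresponds to one combination of parent states $(x_1, \dots, x_p)$, and the two entries in a row sum to 1.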
Finally, the new Bayesian network structure is fitted with the conditional probability table to obtain the final survival Bayesian network model; the survival Bayesian network model comprises a set of discrete predictors and the lung cancer survival outcome variable.
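Building the conditional probability table of the survival node from a fitted Cox model can be sketched as below. The sketch assumes the standard Cox relation S(t | x) = S0(t) ** exp(sum of beta_i * x_i), with a baseline survival probability at the chosen horizon and integer-coded discrete parent states. The function `cox_cpt`, the coefficients and the baseline value are hypothetical illustrations, not values from the patent.

```python
import math
from itertools import product


def cox_cpt(baseline_survival, betas, levels):
    """Map each parent-state combination to {0: P(survival), 1: P(death)}."""
    names = list(betas)
    cpt = {}
    # Enumerate every combination of parent states, one CPT row per combination.
    for states in product(*(levels[name] for name in names)):
        linear_predictor = sum(betas[name] * x for name, x in zip(names, states))
        p_survival = baseline_survival ** math.exp(linear_predictor)
        cpt[states] = {0: p_survival, 1: 1.0 - p_survival}
    return cpt


# Hypothetical coefficients and state codings for two binary parents.
betas = {"smoking": 0.4, "copd": 0.3}
levels = {"smoking": (0, 1), "copd": (0, 1)}
cpt = cox_cpt(baseline_survival=0.6, betas=betas, levels=levels)

for states, row in cpt.items():
    print(states, round(row[0], 3), round(row[1], 3))
```

The table grows exponentially with the number of parents, which is one practical reason to keep the set of prognostic factors small.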
In this embodiment, based on the survival Bayesian network model, a likelihood weighting inference algorithm based on a Markov chain Monte Carlo sampling method is used to predict the lung cancer survival probability of a patient with missing variable information.
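Likelihood weighting can be illustrated on a toy three-node network (smoking to COPD, and both to death) in which smoking is missing for the patient. The sketch below samples unobserved nodes from their conditional distributions in topological order and weights each sample by the probability of the observed evidence. All probabilities, the network itself and the function names are hypothetical and serve only to show the mechanics.

```python
import random

# Hypothetical parameters of a toy network: smoking -> copd, (smoking, copd) -> death.
P_SMOKING = 0.4
P_COPD_GIVEN_SMOKING = {0: 0.1, 1: 0.5}
P_DEATH = {(0, 0): 0.2, (0, 1): 0.4, (1, 0): 0.5, (1, 1): 0.7}


def bernoulli(rng, p):
    """Draw 1 with probability p, else 0."""
    return 1 if rng.random() < p else 0


def likelihood_weighting(evidence, n_samples=20000, seed=0):
    """Estimate P(death = 1 | evidence) by likelihood weighting."""
    rng = random.Random(seed)
    weighted_deaths = 0.0
    total_weight = 0.0
    for _ in range(n_samples):
        weight = 1.0
        # Visit nodes in topological order: observed nodes are fixed and
        # contribute to the weight; missing nodes are sampled.
        if "smoking" in evidence:
            smoking = evidence["smoking"]
            weight *= P_SMOKING if smoking == 1 else 1.0 - P_SMOKING
        else:
            smoking = bernoulli(rng, P_SMOKING)
        p_copd = P_COPD_GIVEN_SMOKING[smoking]
        if "copd" in evidence:
            copd = evidence["copd"]
            weight *= p_copd if copd == 1 else 1.0 - p_copd
        else:
            copd = bernoulli(rng, p_copd)
        death = bernoulli(rng, P_DEATH[(smoking, copd)])
        weighted_deaths += weight * death
        total_weight += weight
    return weighted_deaths / total_weight


# Patient with COPD observed and smoking status missing.
estimate = likelihood_weighting({"copd": 1})
print(round(estimate, 3))
```

For this toy network the exact answer is about 0.631, obtained by weighting the death probabilities 0.7 and 0.4 by the posterior probabilities of smoking given COPD, so the sampled estimate can be checked against it.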
In this embodiment, to verify the validity of the lung cancer prognosis prediction model, the survival Bayesian network model is validated on the validation set: discrimination is evaluated with the AUC and calibration with a calibration plot, and the clinical utility of the survival Bayesian network model is compared with that of the Cox proportional hazards regression model by decision curve analysis.
Censored individuals in the validation cohort were excluded, and the remaining 1433 samples were used for internal validation of the model. For the 3-year lung cancer survival outcome, the concordance index of the model was 0.841 (95% CI: 0.828-0.856) in the training cohort and 0.802 (95% CI: 0.787-0.817) in the validation cohort, indicating good discrimination; the corresponding ROC curves are shown in FIGS. 3 (a)-3 (b).
The calibration curve for 3-year survival in the validation cohort shows good agreement between the predicted and observed probabilities: the curve lies close to the 45-degree line, indicating that the model is well calibrated. The corresponding calibration plots are shown in FIGS. 4 (a)-4 (b).
The clinical utility of the survival Bayesian network model was compared with that of the Cox proportional hazards regression model by decision curve analysis (DCA). For the Cox proportional hazards regression model, the missing covariates in the test samples were imputed with MissForest, a nonparametric missing-data imputation method. The net benefit of each model was then calculated at different risk thresholds. As shown in FIG. 5, the decision curves show that the standardized net benefit of the survival Bayesian network model is higher than that of the Cox proportional hazards regression model for threshold probabilities up to 80%, so the survival Bayesian network model predicts better when covariates are missing.
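The net benefit used in decision curve analysis can be sketched for a binary outcome as below, using the usual definition: true positives per patient minus false positives per patient weighted by the odds of the threshold. The sketch assumes that censored patients have already been excluded, so every patient has a known 3-year outcome, and that standardization divides by the event prevalence. The data and function names are illustrative only.

```python
def net_benefit(y_true, y_prob, threshold):
    """Net benefit of treating patients whose predicted risk is >= threshold."""
    n = len(y_true)
    true_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 1)
    false_pos = sum(1 for y, p in zip(y_true, y_prob) if p >= threshold and y == 0)
    # False positives are penalized by the odds of the threshold probability.
    return true_pos / n - (false_pos / n) * threshold / (1.0 - threshold)


def standardized_net_benefit(y_true, y_prob, threshold):
    """Net benefit divided by the event prevalence."""
    prevalence = sum(y_true) / len(y_true)
    return net_benefit(y_true, y_prob, threshold) / prevalence


# Illustrative outcomes (1 = death within 3 years) and predicted risks.
y_true = [1, 1, 0, 0, 1, 0, 0, 0]
y_prob = [0.9, 0.7, 0.6, 0.2, 0.4, 0.1, 0.3, 0.2]

nb = net_benefit(y_true, y_prob, threshold=0.5)
snb = standardized_net_benefit(y_true, y_prob, threshold=0.5)
print(nb, round(snb, 4))
```

Evaluating these functions over a grid of thresholds for each model gives the curves that a decision curve plot compares.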
The lung cancer prognosis prediction model proposed in this embodiment integrates a method for reasoning under uncertainty about missing data, so incomplete covariate data can be supplied as input when the model is applied. A Bayesian network models the dependencies among the clinical variables and is then used to fill in the missing data, so that a complete data set is obtained and passed to the Cox proportional hazards regression model. The Bayesian network uncertainty reasoning algorithm is robust to the proportion of missing data and is suitable for clinical datasets with a high missing rate.
A Bayesian network is a probabilistic graphical model that uses a directed acyclic graph to model structured dependencies between random variables and to represent their joint probability distribution. Once the Bayesian network has been learned, the posterior probability distribution of the unknown variables is computed from the estimated joint probability distribution, and the possible values of the missing data are inferred from it. This embodiment uses Bayesian networks in a way similar to previous studies, but unlike them it uses a global structure learning method and adds known prior knowledge to obtain a more accurate model. Compared with other common imputation methods, the method based on Bayesian network uncertainty reasoning can improve the accuracy of survival prediction for patients.
Example 2
The present embodiment provides a lung cancer prognosis prediction system, including:
a first model construction module, configured to perform univariate analysis on the acquired lung cancer disease variables and screen them to obtain lung cancer predictors, and to construct a Bayesian network model from the lung cancer predictors;
a second model construction module, configured to perform secondary screening of the lung cancer predictors through feature selection to obtain lung cancer prognostic factors, and to construct a multivariable Cox proportional hazards regression model from the lung cancer prognostic factors;
a network structure learning module, configured to introduce a lung cancer survival outcome variable into the Bayesian network model to obtain a new network structure;
a network parameter learning module, configured to predict, with the multivariable Cox proportional hazards regression model, the survival probability for each combination of states of the lung cancer prognostic factors, to obtain a conditional probability table for the lung cancer survival outcome variable;
a prognosis prediction model construction module, configured to fit the new network structure with the conditional probability table to obtain the lung cancer prognosis prediction model;
and a prognosis prediction module, configured to predict the patient's lung cancer risk probability according to the lung cancer prognosis prediction model.
It should be noted that the above modules correspond to the steps described in embodiment 1; the examples and application scenarios implemented by the modules are the same as those of the corresponding steps, but are not limited to what is disclosed in embodiment 1. The modules described above may be implemented as part of a system in a computer system, for example as a set of computer-executable instructions.
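As a non-limiting illustration of how the prognosis prediction module might carry out its probabilistic prediction, the following Python sketch applies likelihood weighting, the inference algorithm named in claim 6, to a three-node network. The network (smoking, respiratory failure, death), its arcs, and all probabilities are hypothetical and chosen only so that the sampled estimate can be compared with the exact posterior.

```python
import random

# Hypothetical network: smoking -> respiratory_failure, and both -> death.
# Every probability below is a made-up illustrative value.
P_SMOKING_YES = 0.4
P_RF_YES = {True: 0.3, False: 0.1}  # P(respiratory failure | smoking)
P_DEATH_YES = {                      # P(death | smoking, respiratory failure)
    (True, True): 0.8, (True, False): 0.5,
    (False, True): 0.6, (False, False): 0.3,
}


def likelihood_weighting(evidence_rf, n_samples=20000, seed=0):
    """Estimate P(death | respiratory_failure = evidence_rf) by sampling."""
    rng = random.Random(seed)
    weighted_deaths = 0.0
    total_weight = 0.0
    for _ in range(n_samples):
        # Non-evidence nodes are sampled in topological order.
        smoking = rng.random() < P_SMOKING_YES
        # The evidence node is clamped; its likelihood becomes the weight.
        p_rf = P_RF_YES[smoking]
        weight = p_rf if evidence_rf else 1.0 - p_rf
        death = rng.random() < P_DEATH_YES[(smoking, evidence_rf)]
        total_weight += weight
        if death:
            weighted_deaths += weight
    return weighted_deaths / total_weight


def exact_posterior(evidence_rf):
    """Exact P(death | respiratory_failure = evidence_rf) by enumeration."""
    numerator = 0.0
    denominator = 0.0
    for smoking in (True, False):
        p_s = P_SMOKING_YES if smoking else 1.0 - P_SMOKING_YES
        p_rf = P_RF_YES[smoking] if evidence_rf else 1.0 - P_RF_YES[smoking]
        numerator += p_s * p_rf * P_DEATH_YES[(smoking, evidence_rf)]
        denominator += p_s * p_rf
    return numerator / denominator


if __name__ == "__main__":
    print(likelihood_weighting(True), exact_posterior(True))
```

The sampled estimate should approach the exact value as the number of samples grows; enumeration is only practical here because the example network is tiny.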
In further embodiments, there is also provided:
an electronic device comprising a memory, a processor, and computer instructions stored in the memory and run on the processor, wherein the computer instructions, when executed by the processor, perform a lung cancer prognosis prediction method; the lung cancer prognosis prediction method comprises the following steps:
subjecting the acquired lung cancer disease variables to univariate analysis and screening them to obtain lung cancer predictive factors, and constructing a Bayesian network model according to the lung cancer predictive factors;
performing secondary screening of the lung cancer predictive factors through feature selection to obtain lung cancer prognostic factors, and constructing a multi-factor Cox proportional hazards regression model according to the lung cancer prognostic factors;
introducing a lung cancer survival outcome variable into the Bayesian network model to obtain a new network structure;
predicting, according to the multi-factor Cox proportional hazards regression model, the survival probability for each combined state of the lung cancer prognostic factors, to obtain a conditional probability table of the lung cancer survival outcome variable;
fitting the new network structure and the conditional probability table to obtain a lung cancer prognosis prediction model;
and making a probabilistic prediction of the patient's lung cancer risk according to the lung cancer prognosis prediction model.
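As a non-limiting illustration of the conditional probability table step above, the following Python sketch enumerates every combined state of a set of binary prognostic factors and computes the survival probability for each one. It assumes the standard Cox relation S(t | x) = S_0(t) ** exp(sum of beta_i * x_i), where S_0(t) is the baseline survival probability; the factor names, coefficients, and baseline values are hypothetical and are not results from this embodiment.

```python
import itertools
import math

# Hypothetical binary prognostic factors with made-up regression coefficients.
BETA = {"late_stage": 1.2, "smoking": 0.4, "respiratory_failure": 0.9}
# Hypothetical baseline survival probabilities S_0(t) at year t.
BASELINE_SURVIVAL = {1: 0.90, 3: 0.75, 5: 0.60}


def survival_probability(state, t):
    """Cox survival probability at year t for one combination of factor states."""
    linear_predictor = sum(BETA[name] * value for name, value in state.items())
    return BASELINE_SURVIVAL[t] ** math.exp(linear_predictor)


def build_cpt(t):
    """Return the outcome CPT: one row per combined state of the factors."""
    names = list(BETA)
    cpt = {}
    for values in itertools.product((0, 1), repeat=len(names)):
        state = dict(zip(names, values))
        p_alive = survival_probability(state, t)
        cpt[values] = {"alive": p_alive, "dead": 1.0 - p_alive}
    return cpt


if __name__ == "__main__":
    for row, probs in build_cpt(5).items():
        print(row, round(probs["alive"], 3), round(probs["dead"], 3))
```

Each row of the table is keyed by the factor states in the order of `BETA`, so the table can be attached to the survival outcome node once that node's parents are fixed in the same order.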
It should be understood that in this embodiment the processor may be a central processing unit (CPU), or another general purpose processor, a digital signal processor (DSP), an application specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, or the like. A general purpose processor may be a microprocessor or any conventional processor.
The memory may include read only memory and random access memory and provide instructions and data to the processor, and a portion of the memory may also include non-volatile random access memory. For example, the memory may also store information of the device type.
A computer readable storage medium storing computer instructions that, when executed by a processor, perform a lung cancer prognosis prediction method; the lung cancer prognosis prediction method comprises the following steps:
subjecting the acquired lung cancer disease variables to univariate analysis and screening them to obtain lung cancer predictive factors, and constructing a Bayesian network model according to the lung cancer predictive factors;
performing secondary screening of the lung cancer predictive factors through feature selection to obtain lung cancer prognostic factors, and constructing a multi-factor Cox proportional hazards regression model according to the lung cancer prognostic factors;
introducing a lung cancer survival outcome variable into the Bayesian network model to obtain a new network structure;
predicting, according to the multi-factor Cox proportional hazards regression model, the survival probability for each combined state of the lung cancer prognostic factors, to obtain a conditional probability table of the lung cancer survival outcome variable;
fitting the new network structure and the conditional probability table to obtain a lung cancer prognosis prediction model;
and making a probabilistic prediction of the patient's lung cancer risk according to the lung cancer prognosis prediction model.
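As a non-limiting illustration of the step that introduces the survival outcome variable into the network, the following Python sketch adds an outcome node to a learned structure and confirms that the result is still a directed acyclic graph. It assumes that the outcome node takes the prognostic factors as its parents, which is one plausible reading of the steps above; the node names and arcs are hypothetical.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical learned structure: each node maps to the set of its parents.
LEARNED_PARENTS = {
    "smoking": set(),
    "stage": {"smoking"},
    "respiratory_failure": {"stage"},
    "fibrinogen": {"stage"},
}


def add_outcome_node(parents, outcome, prognostic_factors):
    """Return a copy of the structure with the outcome node attached.

    The outcome node receives the prognostic factors as parents; the
    topological order is returned so the caller can verify acyclicity.
    """
    unknown = set(prognostic_factors) - set(parents)
    if unknown:
        raise ValueError(f"factors not in network: {sorted(unknown)}")
    if outcome in parents:
        raise ValueError(f"node already present: {outcome}")
    extended = {node: set(p) for node, p in parents.items()}
    extended[outcome] = set(prognostic_factors)
    try:
        # TopologicalSorter expects node -> predecessors, as stored here.
        order = list(TopologicalSorter(extended).static_order())
    except CycleError as exc:
        raise ValueError("extended structure is not acyclic") from exc
    return extended, order


if __name__ == "__main__":
    structure, order = add_outcome_node(
        LEARNED_PARENTS, "survival_outcome", ["stage", "respiratory_failure"]
    )
    print(structure["survival_outcome"], order)
```

The parent order chosen for the outcome node determines how the rows of its conditional probability table are indexed, so the same order would need to be used when the table is fitted to the structure.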
The method may be carried out directly by a hardware processor or by a combination of hardware and software modules in the processor. The software modules may reside in storage media well known in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory; the processor reads the information in the memory and, in combination with its hardware, performs the steps of the above method. To avoid repetition, a detailed description is not provided here.
Those of ordinary skill in the art will appreciate that the elements of the various examples described in connection with the present embodiments, i.e., the algorithm steps, can be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
While the foregoing description of the embodiments of the present invention has been presented in conjunction with the drawings, it should be understood that it is not intended to limit the scope of the invention, but rather, it is intended to cover all modifications or variations within the scope of the invention as defined by the claims of the present invention.

Claims (8)

1. A method for constructing a lung cancer prognosis prediction model, characterized by comprising the following steps:
subjecting the acquired lung cancer disease variables to univariate analysis and screening them to obtain lung cancer predictive factors, and constructing a Bayesian network model according to the lung cancer predictive factors;
wherein the Bayesian network model is constructed from the lung cancer predictive factors by a model averaging method, and the Bayesian network model models the interdependence among the lung cancer predictive factors;
performing secondary screening of the lung cancer predictive factors through feature selection to obtain lung cancer prognostic factors, and constructing a multi-factor Cox proportional hazards regression model according to the lung cancer prognostic factors;
introducing a lung cancer survival outcome variable into the Bayesian network model to obtain a new network structure;
predicting, according to the multi-factor Cox proportional hazards regression model, the survival probability for each combined state of the lung cancer prognostic factors, to obtain a conditional probability table of the lung cancer survival outcome variable;
the conditional probability table of the lung cancer survival outcome variable is:
$$S(t \mid x_1, \ldots, x_p) = S_0(t)^{\exp\left(\sum_{i=1}^{p} \beta_i x_i\right)}$$
wherein $S_0(t)$ is the baseline survival function at year $t$; $\beta_i$ is the regression coefficient of the corresponding variable $x_i$ in the Cox proportional hazards regression model; and $p$ is the number of lung cancer prognostic factors in the multi-factor Cox proportional hazards regression model;
and fitting the new network structure and the conditional probability table to obtain a lung cancer prognosis prediction model.
2. The method for constructing a lung cancer prognosis prediction model according to claim 1, wherein the lung cancer prognostic factors are obtained by secondary screening of the lung cancer predictive factors using a LASSO-Cox regression feature selection method.
3. The method for constructing a lung cancer prognosis prediction model according to claim 1, wherein the lung cancer predictive factors comprise: smoking, advanced age, pleural effusion, pathological stage, lung abscess, pulmonary heart disease, interstitial lung disease, pulmonary embolism, respiratory failure, erythrocyte count, fibrinogen and eosinophils.
4. The method for constructing a lung cancer prognosis prediction model according to claim 1, wherein the lung cancer prognostic factors comprise: stage, sex, age, smoking, alcohol consumption, chronic obstructive pulmonary disease, targeted therapy, pneumonia, interstitial lung disease, respiratory failure, fibrinogen and pathological classification.
5. A lung cancer prognosis prediction system based on the method of any one of claims 1-4, comprising:
a first model construction module, configured to subject the acquired lung cancer disease variables to univariate analysis and screen them to obtain lung cancer predictive factors, and to construct a Bayesian network model according to the lung cancer predictive factors;
a second model construction module, configured to perform secondary screening of the lung cancer predictive factors through feature selection to obtain lung cancer prognostic factors, and to construct a multi-factor Cox proportional hazards regression model according to the lung cancer prognostic factors;
a network structure learning module, configured to introduce a lung cancer survival outcome variable into the Bayesian network model to obtain a new network structure;
a network parameter learning module, configured to predict, according to the multi-factor Cox proportional hazards regression model, the survival probability for each combined state of the lung cancer prognostic factors, to obtain a conditional probability table of the lung cancer survival outcome variable;
a prognosis prediction model construction module, configured to fit the new network structure and the conditional probability table to obtain a lung cancer prognosis prediction model;
and a prognosis prediction module, configured to make a probabilistic prediction of the patient's lung cancer risk according to the lung cancer prognosis prediction model.
6. The lung cancer prognosis prediction system according to claim 5, wherein the prognosis prediction module makes the probabilistic prediction of lung cancer risk using a likelihood weighting inference algorithm according to the lung cancer prognosis prediction model.
7. An electronic device comprising a memory, a processor, and computer instructions stored in the memory and run on the processor, wherein the computer instructions, when executed by the processor, perform a lung cancer prognosis prediction method; the lung cancer prognosis prediction method comprises the following steps:
subjecting the acquired lung cancer disease variables to univariate analysis and screening them to obtain lung cancer predictive factors, and constructing a Bayesian network model according to the lung cancer predictive factors;
performing secondary screening of the lung cancer predictive factors through feature selection to obtain lung cancer prognostic factors, and constructing a multi-factor Cox proportional hazards regression model according to the lung cancer prognostic factors;
introducing a lung cancer survival outcome variable into the Bayesian network model to obtain a new network structure;
predicting, according to the multi-factor Cox proportional hazards regression model, the survival probability for each combined state of the lung cancer prognostic factors, to obtain a conditional probability table of the lung cancer survival outcome variable;
fitting the new network structure and the conditional probability table to obtain a lung cancer prognosis prediction model;
and making a probabilistic prediction of the patient's lung cancer risk according to the lung cancer prognosis prediction model.
8. A computer readable storage medium for storing computer instructions that, when executed by a processor, perform a lung cancer prognosis prediction method; the lung cancer prognosis prediction method comprises the following steps:
subjecting the acquired lung cancer disease variables to univariate analysis and screening them to obtain lung cancer predictive factors, and constructing a Bayesian network model according to the lung cancer predictive factors;
performing secondary screening of the lung cancer predictive factors through feature selection to obtain lung cancer prognostic factors, and constructing a multi-factor Cox proportional hazards regression model according to the lung cancer prognostic factors;
introducing a lung cancer survival outcome variable into the Bayesian network model to obtain a new network structure;
predicting, according to the multi-factor Cox proportional hazards regression model, the survival probability for each combined state of the lung cancer prognostic factors, to obtain a conditional probability table of the lung cancer survival outcome variable;
fitting the new network structure and the conditional probability table to obtain a lung cancer prognosis prediction model;
and making a probabilistic prediction of the patient's lung cancer risk according to the lung cancer prognosis prediction model.
CN202210750259.2A 2022-06-29 2022-06-29 Method for constructing lung cancer prognosis prediction model and lung cancer prognosis prediction system Active CN114974598B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210750259.2A CN114974598B (en) 2022-06-29 2022-06-29 Method for constructing lung cancer prognosis prediction model and lung cancer prognosis prediction system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210750259.2A CN114974598B (en) 2022-06-29 2022-06-29 Method for constructing lung cancer prognosis prediction model and lung cancer prognosis prediction system

Publications (2)

Publication Number Publication Date
CN114974598A CN114974598A (en) 2022-08-30
CN114974598B true CN114974598B (en) 2024-04-16

Family

ID=82967943

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210750259.2A Active CN114974598B (en) 2022-06-29 2022-06-29 Method for constructing lung cancer prognosis prediction model and lung cancer prognosis prediction system

Country Status (1)

Country Link
CN (1) CN114974598B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115512833B (en) * 2022-11-22 2023-03-24 四川省医学科学院·四川省人民医院 Establishment of long-term cost effectiveness prediction system for lung cancer patient based on deep learning Markov framework
CN115938579A (en) * 2022-11-30 2023-04-07 常州国药医学检验实验室有限公司 Characteristic combination and Cox proportion risk model for predicting survival rate of non-small cell lung cancer patients
CN116486922B (en) * 2023-04-18 2024-01-23 中日友好医院(中日友好临床医学研究所) Method for predicting lung transplant rejection based on gene polymorphism and plasma cytokines and application thereof
CN116313062B (en) * 2023-05-18 2023-07-21 四川省肿瘤医院 Lung adenocarcinoma prognosis model

Citations (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2004037996A2 (en) * 2002-10-24 2004-05-06 Duke University Evaluation of breast cancer states and outcomes using gene expression profiles
CN105678104A (en) * 2016-04-06 2016-06-15 电子科技大学成都研究院 Method for analyzing health data of old people on basis of Cox regression model
CN109859801A (en) * 2019-02-14 2019-06-07 辽宁省肿瘤医院 A kind of model and method for building up containing seven genes as biomarker prediction lung squamous cancer prognosis
CN111816310A (en) * 2020-07-16 2020-10-23 山东大学 Bone marrow blood disease risk factor contribution rate calculation and risk prediction system
CN111816319A (en) * 2020-07-16 2020-10-23 山东大学 Urinary system severe disease index determination method and risk prediction system capable of gradually screening
CN112635063A (en) * 2020-12-30 2021-04-09 华南理工大学 Lung cancer prognosis comprehensive prediction model, construction method and device
CN112680523A (en) * 2021-01-25 2021-04-20 复旦大学附属中山医院 Molecular model for judging prognosis of ovarian cancer patient and application
CN112735592A (en) * 2021-01-18 2021-04-30 中国医学科学院肿瘤医院 Construction method and application method of lung cancer prognosis model and electronic equipment
CN113066585A (en) * 2021-03-05 2021-07-02 中山大学附属第六医院 Method for efficiently and quickly evaluating prognosis of stage II colorectal cancer patient based on immune gene expression profile
CN113430269A (en) * 2021-06-29 2021-09-24 北京泱深生物信息技术有限公司 Application of biomarker in prediction of lung cancer prognosis
CN113517073A (en) * 2021-09-13 2021-10-19 生物岛实验室 Method and system for predicting survival rate after lung cancer surgery
CN114283937A (en) * 2021-09-30 2022-04-05 北京大学第一医院 Device for predicting kidney development risk of ANCA (acute coronary intervention) related small vasculitis and model training method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170268066A1 (en) * 2016-03-15 2017-09-21 Chalmers Ventures Ab Cancer biomarkers
WO2021003485A1 (en) * 2019-07-03 2021-01-07 The Board Of Trustees Of The Leland Stanford Junior University Methods to assess clinical outcome based upon updated probabilities and treatments thereof

Non-Patent Citations (5)

* Cited by examiner, † Cited by third party
Title
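Predicting lung cancer survival prognosis based on the conditional survival bayesian network; Zhong, L et al.; BMC Medical Research Methodology; 20240122; Vol. 24 (No. 16); 1-13 *
Theoretical and methodological system of health management driven by healthcare big data; 薛付忠; Journal of Shandong University (Health Sciences) (No. 06); 1-23 *
Predicting cancer susceptibility, recurrence and survival using artificial intelligence; 高美虹 et al.; Progress in Biochemistry and Biophysics; 20220210; Vol. 49 (No. 9); 1687-1702 *
Lung cancer risk prediction model based on Bayesian network uncertainty reasoning; 钟璐, 薛付忠; Journal of Shandong University (Health Sciences); 20230303; Vol. 61 (No. 4); 86-93 *
高晓光. Inference of Discrete Dynamic Bayesian Networks and Its Applications. National Defense Industry Press, 2016, 13-14. *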

Also Published As

Publication number Publication date
CN114974598A (en) 2022-08-30


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant