CN113270188A - Method and device for constructing prognosis prediction model of patient after esophageal squamous carcinoma radical treatment - Google Patents


Info

Publication number
CN113270188A
Authority
CN
China
Prior art keywords
data
patient
variables
esophageal squamous
treatment
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110505452.5A
Other languages
Chinese (zh)
Other versions
CN113270188B (en)
Inventor
柯杨
何忠虎
杨文蕾
刘芳芳
刘震
徐瑞平
杨伟
陈蕾
周福有
何煜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Institute for Cancer Research
Original Assignee
Beijing Institute for Cancer Research
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Institute for Cancer Research
Priority to CN202110505452.5A
Publication of CN113270188A
Application granted
Publication of CN113270188B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/20 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for electronic clinical trials or questionnaires
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G16H50/50 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for simulation or modelling of medical disorders
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • G16H70/00 ICT specially adapted for the handling or processing of medical references
    • G16H70/60 ICT specially adapted for the handling or processing of medical references relating to pathologies
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • General Health & Medical Sciences (AREA)
  • Primary Health Care (AREA)
  • Epidemiology (AREA)
  • Biomedical Technology (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Investigating Or Analysing Biological Materials (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses a method and a device for constructing a prognosis prediction model for patients after radical surgery for esophageal squamous cell carcinoma. The method comprises: obtaining clinical diagnosis and treatment data and follow-up survival data; according to the follow-up survival data, performing multivariable Cox regression analysis separately on the patient characteristic variables, tumor pathological characteristic variables, treatment variables, and laboratory test variables, screening variables with a backward stepwise algorithm and the Akaike information criterion (AIC), and screening the resulting candidate variables again in the same way to obtain modeling variables; and performing multivariable Cox regression analysis on the modeling variables and their pairwise interaction terms to construct the prognosis prediction model, wherein the prediction variables comprise: age, sex, primary tumor location, T stage, number of lymph nodes resected, tumor size, preoperative hemoglobin level, and the interaction term between N stage and treatment modality. The invention can improve prediction accuracy, identify the patient groups that benefit most from different treatment regimens, and enable precise prognostic evaluation of esophageal squamous cell carcinoma.

Description

Method and device for constructing prognosis prediction model of patient after esophageal squamous carcinoma radical treatment
Technical Field
The invention relates to the technical field of medical treatment, in particular to a method and a device for constructing a prognosis prediction model of a patient after esophageal squamous cell carcinoma radical treatment.
Background
Esophageal cancer (EC) is one of the common malignant tumors of the upper gastrointestinal tract. In 2018 there were about 572,000 new cases of esophageal cancer and about 509,000 deaths worldwide, ranking 7th and 6th among malignant tumors in incidence and mortality, respectively. Esophageal cancer is a characteristic tumor type in China: about half of all new cases and deaths occur in China each year, and the predominant histological type is esophageal squamous cell carcinoma (ESCC), which accounts for more than 90% of cases. In 2015 there were about 246,000 new cases of esophageal cancer and about 188,000 deaths in China, ranking 6th and 4th among malignant tumors, respectively. Because the onset of esophageal cancer is insidious and typical symptoms are lacking in the early stage, most cases are already at an advanced stage when diagnosed and treated, and survival and prognosis are poor. Population-based cancer registry data show that the 5-year age-standardized relative survival rate of esophageal cancer in 2012-2015 was 30.3% (95% confidence interval [CI]: 29.6-31.0%); a hospital-based clinical survival study showed a 5-year observed survival rate of 40.1% (95% CI: 33.7-46.4%). The disease places a heavy burden on society and on patients' families. To reduce the incidence of esophageal cancer and improve survival, multiple preventive measures are needed, together with breakthroughs at every stage of prevention, diagnosis, and treatment.
The prognosis of esophageal squamous cell carcinoma refers to the prediction of the possible outcomes, and their probabilities, for a patient after the disease occurs, including survival (cure, remission, exacerbation, relapse) and death. The clinical outcome of esophageal squamous cell carcinoma is shaped by many factors and manifests in many dimensions. Previous research shows that the prognosis of esophageal cancer is influenced by patient characteristics (such as age and sex), tumor pathological characteristics (such as histological type, tumor location, tumor size, surgical margin, and lymph node status), treatment-related factors (such as surgical approach, number of chemotherapy cycles, and chemotherapy regimen), molecular markers (such as immune-inflammatory markers, tumor markers, and blood test indicators), and socioeconomic factors (such as medical insurance and income). The large number of potential prognostic factors makes the clinical prognosis of esophageal squamous cell carcinoma markedly heterogeneous, which poses a great challenge for prognostic evaluation.
Surgery remains the primary treatment for esophageal squamous cell carcinoma. With advances in minimally invasive techniques and the development of various adjuvant therapies, perioperative treatment of esophageal squamous cell carcinoma has become complex and varied, and there is a great demand for prognostic evaluation of the different diagnosis and treatment modes. How to choose the surgical approach and the extent of lymph node dissection, and whether chemotherapy is needed before or after surgery, remain controversial, and more evidence from evidence-based medicine is needed. At present, the selection of a clinical diagnosis and treatment plan for esophageal squamous cell carcinoma is based mainly on the TNM staging system jointly issued by the American Joint Committee on Cancer (AJCC) and the Union for International Cancer Control (UICC). The system stratifies patients according to the extent of the primary tumor (T, tumor), the presence and extent of regional lymph node metastasis (N, node), and the presence of distant metastasis (M, metastasis). It provides a basis for assessing tumor progression and selecting a treatment plan, is also used to judge patient prognosis, and is currently the most widely used tool for predicting and evaluating tumor prognosis.
However, when the TNM staging system is used to predict the prognosis of esophageal squamous cell carcinoma, patients of the same stage and grade who receive the same treatment can differ greatly in treatment response and survival; conversely, different treatments for the same disease can lead to the same outcome. Prognosis prediction that relies on the single standard of the TNM staging system is therefore of low accuracy and cannot accurately estimate the probability of a patient's prognostic outcome.
Therefore, there is a need for a prognostic prediction scheme for patients with esophageal squamous cell carcinoma that overcomes the above-mentioned problems.
Disclosure of Invention
An embodiment of the invention provides a method for constructing a prognosis prediction model for patients after radical surgery for esophageal squamous cell carcinoma, which is used to improve the accuracy of prognosis prediction for such patients, identify the patient groups that benefit most from different treatment regimens, and enable precise prognostic evaluation of esophageal squamous cell carcinoma. The method comprises:
obtaining clinical diagnosis and treatment data and follow-up survival data of patients after radical surgery for esophageal squamous cell carcinoma, wherein the clinical diagnosis and treatment data are obtained from a hospital information system (HIS) database and the follow-up survival data are obtained from a follow-up database;
performing data cleaning on the clinical diagnosis and treatment data and determining potential variable categories, wherein the potential variable categories comprise: patient characteristic variables, tumor pathological characteristic variables, treatment variables, and laboratory test variables;
according to the follow-up survival data, performing multivariable Cox regression analysis separately on the patient characteristic variables, the tumor pathological characteristic variables, the treatment variables, and the laboratory test variables, and screening variables with a backward stepwise algorithm and the Akaike information criterion (AIC); then performing multivariable Cox regression analysis again on the screened candidate variables and screening them, again with the backward stepwise algorithm and the AIC, to obtain modeling variables;
performing multivariable Cox regression analysis on the modeling variables and their pairwise interaction terms, and constructing the prognosis prediction model using the AIC, wherein the prognosis prediction model comprises prognosis prediction variables for patients after radical surgery for esophageal squamous cell carcinoma, and the prognosis prediction variables comprise: age, sex, primary tumor location, T stage, number of lymph nodes resected, tumor size, preoperative hemoglobin level, and the interaction term between N stage and treatment modality.
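The pairwise interaction terms mentioned in the last step can be illustrated with a minimal sketch. It assumes the modeling variables have already been coded numerically (categorical variables such as N stage or treatment modality would first need dummy coding); the function name, variable names, and values below are hypothetical and are not taken from the patent.

```python
from itertools import combinations

def add_pairwise_interactions(row, variables):
    """Return a copy of `row` extended with the product of every pair of variables."""
    out = dict(row)
    for a, b in combinations(variables, 2):
        # An interaction term is the product of the two coded covariates.
        out[f"{a}:{b}"] = row[a] * row[b]
    return out

# Hypothetical, already numerically coded patient record (illustrative values only).
patient = {"n_stage": 2, "adjuvant_chemo": 1, "age": 63}
expanded = add_pairwise_interactions(patient, ["n_stage", "adjuvant_chemo", "age"])
print(expanded)
```

With three modeling variables this yields three extra columns; the expanded records would then be passed to the Cox regression, and the AIC would decide which interaction terms remain in the model.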
An embodiment of the invention provides a device for constructing a prognosis prediction model for patients after radical surgery for esophageal squamous cell carcinoma, which is used to improve the accuracy of prognosis prediction for such patients, identify the patient groups that benefit most from different treatment regimens, and enable precise prognostic evaluation of esophageal squamous cell carcinoma. The device comprises:
a data acquisition module, configured to obtain clinical diagnosis and treatment data and follow-up survival data of patients after radical surgery for esophageal squamous cell carcinoma, wherein the clinical diagnosis and treatment data are obtained from a hospital information system (HIS) database and the follow-up survival data are obtained from a follow-up database;
a first variable determination module, configured to perform data cleaning on the clinical diagnosis and treatment data and determine potential variable categories, wherein the potential variable categories comprise: patient characteristic variables, tumor pathological characteristic variables, treatment variables, and laboratory test variables;
a second variable determination module, configured to perform, according to the follow-up survival data, multivariable Cox regression analysis separately on the patient characteristic variables, the tumor pathological characteristic variables, the treatment variables, and the laboratory test variables, and screen variables with a backward stepwise algorithm and the Akaike information criterion (AIC); and then to perform multivariable Cox regression analysis again on the screened candidate variables and screen them, again with the backward stepwise algorithm and the AIC, to obtain modeling variables;
a model construction module, configured to perform multivariable Cox regression analysis on the modeling variables and their pairwise interaction terms and to construct the prognosis prediction model using the AIC, wherein the prognosis prediction model comprises prognosis prediction variables for patients after radical surgery for esophageal squamous cell carcinoma, and the prognosis prediction variables comprise: age, sex, primary tumor location, T stage, number of lymph nodes resected, tumor size, preoperative hemoglobin level, and the interaction term between N stage and treatment modality.
The embodiment of the invention also provides computer equipment which comprises a memory, a processor and a computer program which is stored on the memory and can be run on the processor, wherein the processor realizes the method for constructing the prognosis prediction model of the patient after the esophageal squamous cell carcinoma radical treatment when executing the computer program.
The embodiment of the invention also provides a computer readable storage medium, which stores a computer program for executing the method for constructing the prognosis prediction model of the patient after the esophageal squamous cell carcinoma radical operation.
Compared with the prior-art scheme of predicting the prognosis of patients after radical surgery for esophageal squamous cell carcinoma with the TNM staging system, the embodiment of the invention obtains clinical diagnosis and treatment data and follow-up survival data of such patients, wherein the clinical diagnosis and treatment data are obtained from a hospital information system (HIS) database and the follow-up survival data are obtained from a follow-up database; performs data cleaning on the clinical diagnosis and treatment data and determines potential variable categories, comprising patient characteristic variables, tumor pathological characteristic variables, treatment variables, and laboratory test variables; performs, according to the follow-up survival data, multivariable Cox regression analysis separately on each of these four categories and screens variables with a backward stepwise algorithm and the Akaike information criterion (AIC); performs multivariable Cox regression analysis again on the screened candidate variables and screens them in the same way to obtain modeling variables; and performs multivariable Cox regression analysis on the modeling variables and their pairwise interaction terms, using the AIC to construct the prognosis prediction model, whose prediction variables comprise: age, sex, primary tumor location, T stage, number of lymph nodes resected, tumor size, preoperative hemoglobin level, and the interaction term between N stage and treatment modality.
Using the screened prognosis prediction variables to predict the prognosis of patients after radical surgery for esophageal squamous cell carcinoma can effectively improve prediction accuracy, identify the patient groups that benefit most from different treatment regimens, enable precise and individualized treatment of esophageal squamous cell carcinoma, improve treatment efficacy, and improve patient survival.
Drawings
In order to illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings used in the description of the embodiments or the prior art are briefly described below. The drawings in the following description show only some embodiments of the present invention; those skilled in the art can obtain other drawings from them without creative effort. In the drawings:
FIG. 1 is a schematic diagram illustrating a method for constructing a prognosis prediction model of a patient after esophageal squamous cell carcinoma radical surgery according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of the nomogram in an embodiment of the present invention;
FIG. 3 shows the calibration curves for the modeling set and the validation set in an embodiment of the present invention;
FIG. 4 is a Kaplan-Meier survival graph of risk classification in an embodiment of the invention;
FIG. 5 is a structural diagram of a device for constructing a prognosis prediction model of a patient after a radical esophageal squamous carcinoma treatment in an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the embodiments of the present invention are further described in detail below with reference to the accompanying drawings. The exemplary embodiments and descriptions of the present invention are provided to explain the present invention, but not to limit the present invention.
First, technical terms related to the embodiments of the present invention are explained:
Esophageal squamous cell carcinoma: a malignant epithelial tumor of the esophageal mucosa with squamous cell differentiation, referred to as esophageal squamous carcinoma for short.
Prognosis: the prediction of the possible outcomes, and their probabilities, after a disease has occurred, including cure, remission, exacerbation, relapse, complications, and death.
HR: hazard ratio, the ratio of the instantaneous risk of a particular event in one group of subjects to the instantaneous risk of the same event in another group of subjects.
95% CI: 95% confidence interval, the interval estimated, at a confidence level of 0.95, to contain the unknown population parameter.
5-year survival rate: the probability that a tumor patient survives for more than 5 years.
Overall survival (OS): the time from the start of the study to death from any cause.
AIC (Akaike information criterion): a standard for evaluating the complexity of a statistical model and measuring its goodness of fit, proposed and developed by the Japanese statistician Hirotugu Akaike.
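For reference, the quantities defined above can be written compactly. These are the standard textbook forms rather than formulas quoted from the patent; the symbols $h_0$, $\beta$, $x$, $k$, and $\hat{L}$ are notation introduced here, and taking $\hat{L}$ to be the maximized partial likelihood in a Cox model is the usual convention, assumed rather than stated in the source.

```latex
\begin{align}
h(t \mid x) &= h_0(t)\,\exp\!\left(\beta^{\top} x\right)
  && \text{(Cox proportional hazards model)} \\
\mathrm{HR}_j &= \frac{h(t \mid x_j + 1)}{h(t \mid x_j)} = \exp(\beta_j)
  && \text{(hazard ratio per unit increase in } x_j\text{)} \\
S(5) &= \Pr(T > 5\ \text{years})
  && \text{(5-year survival rate)} \\
\mathrm{AIC} &= 2k - 2\ln\hat{L}
  && \text{(} k \text{ estimated parameters, maximized likelihood } \hat{L}\text{)}
\end{align}
```

Between two candidate models fitted to the same data, the one with the lower AIC is preferred, which is how the criterion trades goodness of fit against model complexity.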
As described above, the TNM staging system, which is currently widely used for prognosis prediction in clinical practice, has the following limitations. (1) The prognosis of patients after radical surgery for esophageal squamous cell carcinoma is clearly heterogeneous. TNM staging grades patient risk mainly by the anatomical extent of the tumor, yet patients of the same stage and grade who receive the same treatment differ greatly in treatment response and survival; accurate risk identification and risk stratification cannot be achieved by TNM staging alone. (2) The data used to construct the 7th edition of the TNM staging system come mainly from Western populations in Europe and America, and the applicability of the staging system to the Chinese population needs further consideration. (3) In clinical practice, TNM stage is used as a combined variable, so the role or relative weight of variables such as depth of tumor invasion, lymph node metastasis status, and distant metastasis cannot be considered separately, and less information is available for prognosis. (4) TNM staging provides only a relative judgment of survival, namely that the later the stage, the worse the prognosis; it cannot accurately estimate the probability of a patient's prognostic outcome. To evaluate the prognosis of patients after radical surgery for esophageal squamous cell carcinoma more accurately, research based on large samples and a prospective design is needed, in which multiple predictors with independent discriminative ability are integrated into a prognosis prediction model. Such a model would provide a quantitative estimate of the risk of the outcome event together with corresponding risk classification criteria, and would help make the diagnosis and treatment of esophageal squamous cell carcinoma more precise and individualized.
In current research on prognosis prediction for esophageal squamous cell carcinoma, prognosis prediction models are limited in both number and quality. Since 2005, there were 11 prognosis models whose outcome was the survival of patients after radical surgery for esophageal squamous cell carcinoma (including overall survival [OS] and cancer-specific survival [CSS]). Ten of them were from China; 10 were hospital-based cohort studies; 83.3% (10/12) of the studies were single-center studies lacking effective external validation; only 2 studies had sample sizes greater than 1000; most data collection dates centered between 2000-; model discrimination (concordance index, C-index) was between 0.6 and 0.8; the prediction variables included in the different models were markedly heterogeneous, were based on clinical medical record data, and lacked integration of multi-omics data. In addition, reporting was not standardized: only 16.7% (2/10) of the studies reported the number of outcome events. In conclusion, existing prognosis prediction models for esophageal squamous cell carcinoma are difficult to popularize and apply in clinical practice, and patients cannot benefit from them. To move toward clinical application, prognosis prediction research for esophageal squamous cell carcinoma is urgently needed that is rigorously designed, based on large samples, screens prediction variables broadly, and includes independent external validation.
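The discrimination measure cited above, the C-index, can be sketched as follows. This is a minimal illustration of Harrell's concordance index under simplifying assumptions (pairs with tied survival times are ignored); the function name and the toy data are hypothetical and do not come from the patent or from any of the reviewed studies.

```python
def concordance_index(times, events, risk_scores):
    """Harrell's C: share of comparable pairs in which the patient who
    died earlier was assigned the higher predicted risk."""
    concordant = 0.0
    comparable = 0
    n = len(times)
    for i in range(n):
        if not events[i]:
            continue  # a pair is only comparable if the earlier time is an observed death
        for j in range(n):
            if times[j] > times[i]:  # patient j outlived patient i
                comparable += 1
                if risk_scores[i] > risk_scores[j]:
                    concordant += 1.0
                elif risk_scores[i] == risk_scores[j]:
                    concordant += 0.5  # tied predictions count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable

# Toy data: follow-up time in years, event flag (1 = death), predicted risk score.
times = [2, 4, 6, 8]
events = [1, 1, 0, 1]
perfect = concordance_index(times, events, [0.9, 0.7, 0.4, 0.1])
reversed_order = concordance_index(times, events, [0.1, 0.4, 0.7, 0.9])
uninformative = concordance_index(times, events, [0.5, 0.5, 0.5, 0.5])
print(perfect, reversed_order, uninformative)
```

A value of 0.5 corresponds to predictions no better than chance and 1.0 to perfect ranking, so the reported range of 0.6 to 0.8 indicates moderate discrimination.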
With the arrival of the big data era, health-related big data, clinical diagnosis and treatment records, and endpoint follow-up systems are steadily improving. This provides a good opportunity to collect and mine tumor prognosis information and to carry out high-quality prognosis prediction research on esophageal squamous cell carcinoma based on clinical big data. This research therefore aims to construct an esophageal cancer prognosis prediction model that covers routine clinical diagnosis and treatment data and multi-dimensional biomarkers, and to validate it effectively, ultimately forming a prognosis evaluation scheme for esophageal squamous cell carcinoma that fits the actual situation in China and has clear value for wider adoption.
In order to improve the accuracy of prognosis prediction for patients after radical surgery for esophageal squamous cell carcinoma, identify the patient groups that benefit most from different treatment regimens, and enable precise prognostic evaluation of esophageal squamous cell carcinoma, an embodiment of the invention provides a method for constructing a prognosis prediction model for such patients. As shown in FIG. 1, the method may include:
Step 101: obtaining clinical diagnosis and treatment data and follow-up survival data of patients after radical surgery for esophageal squamous cell carcinoma, wherein the clinical diagnosis and treatment data are obtained from a hospital information system (HIS) database and the follow-up survival data are obtained from a follow-up database;
Step 102: performing data cleaning on the clinical diagnosis and treatment data and determining potential variable categories, wherein the potential variable categories comprise: patient characteristic variables, tumor pathological characteristic variables, treatment variables, and laboratory test variables;
Step 103: according to the follow-up survival data, performing multivariable Cox regression analysis separately on the patient characteristic variables, the tumor pathological characteristic variables, the treatment variables, and the laboratory test variables, and screening variables with a backward stepwise algorithm and the Akaike information criterion (AIC); then performing multivariable Cox regression analysis again on the screened candidate variables and screening them, again with the backward stepwise algorithm and the AIC, to obtain modeling variables;
Step 104: performing multivariable Cox regression analysis on the modeling variables and their pairwise interaction terms, and constructing the prognosis prediction model using the AIC, wherein the prognosis prediction model comprises prognosis prediction variables for patients after radical surgery for esophageal squamous cell carcinoma, and the prognosis prediction variables comprise: age, sex, primary tumor location, T stage, number of lymph nodes resected, tumor size, preoperative hemoglobin level, and the interaction term between N stage and treatment modality.
As shown in fig. 1, in the embodiment of the present invention, clinical diagnosis and treatment data and follow-up survival data of patients after radical resection of esophageal squamous cell carcinoma are obtained, the clinical diagnosis and treatment data being obtained from a hospital information system (HIS) database and the follow-up survival data from a follow-up database. Data cleaning is performed on the clinical diagnosis and treatment data, and potential variable categories are determined, comprising patient characteristic variables, tumor pathological characteristic variables, treatment condition variables, and laboratory test index variables. According to the follow-up survival data, multivariable Cox regression analysis is performed separately on each of these four categories, and variables are screened using a backward stepwise algorithm and the Akaike information criterion (AIC); multivariable Cox regression analysis is then performed again on the screened candidate variables, and variables are screened again with the backward stepwise algorithm and the Akaike information criterion to obtain modeling variables. Multivariable Cox regression analysis is performed on the modeling variables and their pairwise interaction terms, and the prognosis prediction model for patients after radical resection of esophageal squamous cell carcinoma is constructed using the Akaike information criterion. The model comprises the following prognosis prediction variables: age, sex, primary tumor location, T stage, number of lymph nodes harvested, tumor size, preoperative hemoglobin level, and the interaction term between N stage and treatment modality.
Using the screened prognosis prediction variables to predict the prognosis of patients after radical resection of esophageal squamous cell carcinoma can effectively improve the accuracy of prognosis prediction, determine the populations that benefit most from different treatment schemes, enable precise and individualized treatment of esophageal squamous cell carcinoma, improve curative effect, and improve patient survival.
In the embodiment, clinical diagnosis and treatment data and follow-up survival data of a patient after esophageal squamous carcinoma radical treatment are obtained, wherein the clinical diagnosis and treatment data are obtained from a hospital information management system (HIS) database, and the follow-up survival data are obtained from a follow-up database.
In this embodiment, electronic medical records are exported from the hospital information system (HIS) database, and the personal privacy data in the electronic medical records of patients after radical resection of esophageal squamous cell carcinoma are then masked. The personal privacy data may include: identity card number and other basic information data.
In specific implementation, each patient is identified by the hospitalization number or medical record number as a unique identification code, and the data are integrated and cleaned. Storage management, statistical analysis, and security supervision of the data are each handled by designated personnel.
In this example, the data were obtained from a large specialized cancer hospital in a northern high-incidence area and a large specialized cancer hospital in a southern non-high-incidence area. The electronic medical records of esophageal cancer patients are exported from the HIS. They cover any combination of the following: admission and discharge data, in-hospital treatment data, pathological data, imaging data, endoscopy data, auxiliary examination data, consultation data, curative effect evaluation data, routine laboratory test data (routine blood tests, blood biochemistry, routine urine tests, and the like), and diagnosis and treatment cost data. After structuring, the clinical diagnosis and treatment data of patients after radical resection of esophageal squamous cell carcinoma are extracted.
In this embodiment, the follow-up survival data includes: follow-up date data, follow-up mode data, survival state data, death date data and death reason data.
In an embodiment, the data cleaning processing is performed on the clinical medical data to determine latent variable categories, where the latent variable categories include: patient characteristic variables, tumor pathology characteristic variables, treatment condition variables, and test index variables.
In this embodiment, the patient characteristic data includes: demographic characteristic data, physical condition data, personal history data, past history data, medical insurance type data and family history data;
the tumor pathological feature data comprise: primary tumor position data, differentiation degree data and pathological characteristic data;
the treatment condition data includes: surgical condition data and initial treatment data;
the test indicator data includes: blood routine data, blood biochemical data and combined variable data.
In specific implementation, the study subjects were patients who underwent radical resection of esophageal squamous cell carcinoma and were consecutively treated in a large cancer hospital in the north from 1/2012 to 12/31/2017 and in a large cancer hospital in the south from 8/1/2009 to 12/31/2018. The corresponding electronic medical records were exported, and the data set required for constructing the esophageal squamous cell carcinoma prognosis prediction model was extracted. The follow-up cutoff dates were 7/19/2018 and 11/7/2019, respectively. The model was constructed with the northern data set as the modeling set, and external independent validation of the model was performed with the southern data set as the validation set. The inclusion criteria were as follows: (1) patients treated for esophageal squamous cell carcinoma; (2) complete follow-up information, that is, the survival status (alive or dead) of the study subject and a definite follow-up date or death date were recorded in at least one follow-up visit, and the follow-up period was at least 6 months; (3) no distant metastasis. The exclusion criteria were: (1) clinical death in hospital or death within 1 month during the peri-treatment period; (2) receiving endoscopic treatment; (3) receiving neoadjuvant therapy; (4) incomplete clinical data.
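The inclusion and exclusion criteria above can be sketched as a simple filter. This is a minimal illustration only: the `Patient` record and all of its field names are hypothetical, since the source does not specify a data schema.

```python
from dataclasses import dataclass


@dataclass
class Patient:
    # Hypothetical field names; the source does not define a schema.
    has_survival_status: bool     # survival status and date recorded at least once
    followup_months: float        # length of follow-up in months
    distant_metastasis: bool
    peri_treatment_death: bool    # in-hospital death or death within 1 month
    endoscopic_treatment: bool
    neoadjuvant_therapy: bool
    clinical_data_complete: bool


def is_eligible(p: Patient) -> bool:
    """Apply the inclusion criteria, then the exclusion criteria."""
    included = (
        p.has_survival_status
        and p.followup_months >= 6
        and not p.distant_metastasis
    )
    excluded = (
        p.peri_treatment_death
        or p.endoscopic_treatment
        or p.neoadjuvant_therapy
        or not p.clinical_data_complete
    )
    return included and not excluded
```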
In this embodiment, the data cleaning of the clinical diagnosis and treatment data further includes handling missing values, abnormal values, and duplicate values in the clinical diagnosis and treatment data. For data with multiple sources, the data quality of the different source databases is evaluated, the priority of data selection is set, and the data are then merged and integrated. For repeatedly measured data, the baseline patient characteristic data and laboratory test index data are used. The diagnosis and treatment information from multiple admissions is integrated, and a reasonable time window is set (for example, within half a year of the admission time) to obtain the initial treatment scheme of the patient.
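As a minimal sketch of the time-window step, the function below keeps the admissions that fall within a window after the first admission, from which an initial treatment scheme could then be derived. The function name, the `(admission_date, treatment)` tuple layout, and the 183-day default (roughly half a year) are assumptions made for illustration.

```python
from datetime import date, timedelta


def initial_treatment_records(admissions, window_days=183):
    """Return admissions within `window_days` of the first admission.

    `admissions` is a list of (admission_date, treatment) tuples for one
    patient (a hypothetical layout, not taken from the source).
    """
    if not admissions:
        return []
    ordered = sorted(admissions, key=lambda rec: rec[0])
    first_date = ordered[0][0]
    window = timedelta(days=window_days)
    return [rec for rec in ordered if rec[0] - first_date <= window]


records = [
    (date(2015, 9, 1), "radiotherapy"),
    (date(2015, 1, 10), "surgery"),
    (date(2015, 3, 1), "chemotherapy"),
]
initial = initial_treatment_records(records)
```

In this usage example only the surgery and chemotherapy admissions fall inside the window; the later radiotherapy admission is excluded.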
In this embodiment, after the data are cleaned, some of the variables may be combined. Combined variables, such as body mass index (BMI) and the prognostic nutritional index (PNI; a combination of serum albumin and peripheral blood lymphocytes), are generated mainly on the basis of literature and professional knowledge. The original properties of the data may also be transformed: continuous variables are converted to categorical variables by finding the best cutoff value according to literature, accepted criteria, or statistical methods such as the median, the interquartile range, or the area under the ROC curve.
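A minimal sketch of two of these transformations follows: computing BMI as a combined variable and converting a continuous variable to a binary one with the median as the cutoff. The function names are hypothetical, and the median is only one of the cutoff methods listed above.

```python
from statistics import median


def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index: weight (kg) divided by height (m) squared."""
    return weight_kg / height_m ** 2


def dichotomize_by_median(values):
    """Return (cutoff, codes), where codes are 1 above the median, else 0."""
    cutoff = median(values)
    return cutoff, [1 if v > cutoff else 0 for v in values]


cutoff, codes = dichotomize_by_median([1.0, 2.0, 3.0, 4.0])
```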
In this embodiment, data quality control is performed on the clinical diagnosis and treatment data, including:
A. data acquisition and entry: establishing data acquisition and entry standards, and performing manual spot checks to ensure that the entered data are consistent with the data source;
B. quality control of key variables: confirming the definition and coding of the key variables and establishing a standardized variable dictionary; determining whether the completeness and accuracy of the key variables meet the research requirements; identifying the steps in which errors may occur;
C. quality control of data analysis: adopting rigorous and reasonable statistical methods for data analysis, controlling the relevant confounding and bias, and keeping detailed records; the analysis results are evaluated by experienced professional supervisors, and disputed results are analyzed independently by multiple researchers until agreement is reached.
In this embodiment, the data quality control is performed on the clinical diagnosis and treatment data, so that the potential variables can be obtained, as shown in table 1.
TABLE 1
[Table 1 is provided as an image in the original publication.]
In the combined variable data: neutrophil-to-lymphocyte ratio = neutrophil count (×10^9/L) / lymphocyte count (×10^9/L); platelet-to-lymphocyte ratio = platelet count (×10^9/L) / lymphocyte count (×10^9/L); lymphocyte-to-monocyte ratio = lymphocyte count (×10^9/L) / monocyte count (×10^9/L); prognostic nutritional index = albumin concentration (g/L) + 5 × lymphocyte count (×10^9/L); systemic immune-inflammation index = platelet count (×10^9/L) × neutrophil count (×10^9/L) / lymphocyte count (×10^9/L).
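The combined variables defined above can be computed directly from routine blood test values. The sketch below follows those definitions; the function name, parameter names, and dictionary keys are hypothetical, and all cell counts are assumed to be in units of ×10^9/L with albumin in g/L.

```python
def combined_indices(neutrophils, lymphocytes, monocytes, platelets, albumin):
    """Compute the combined laboratory indices from blood test values.

    Cell counts are in 10^9/L and albumin is in g/L.
    """
    if lymphocytes <= 0 or monocytes <= 0:
        raise ValueError("lymphocyte and monocyte counts must be positive")
    return {
        "NLR": neutrophils / lymphocytes,              # neutrophil-to-lymphocyte ratio
        "PLR": platelets / lymphocytes,                # platelet-to-lymphocyte ratio
        "LMR": lymphocytes / monocytes,                # lymphocyte-to-monocyte ratio
        "PNI": albumin + 5 * lymphocytes,              # prognostic nutritional index
        "SII": platelets * neutrophils / lymphocytes,  # systemic immune-inflammation index
    }


indices = combined_indices(
    neutrophils=4.0, lymphocytes=2.0, monocytes=0.5, platelets=200.0, albumin=40.0
)
```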
In the embodiment, according to the follow-up survival data, multivariable Cox regression analysis is performed separately on the patient characteristic variables, the tumor pathological characteristic variables, the treatment condition variables, and the laboratory test index variables, and variables are screened using a backward stepwise algorithm and the Akaike information criterion (AIC); multivariable Cox regression analysis is then performed again on the screened candidate variables, and variables are screened again based on the backward stepwise algorithm and the Akaike information criterion to obtain the modeling variables. Multivariable Cox regression analysis is performed on the modeling variables and their pairwise interaction terms, and the prognosis prediction model for patients after radical resection of esophageal squamous cell carcinoma is constructed using the Akaike information criterion. The model comprises the following prognosis prediction variables: age, sex, primary tumor location, T stage, number of lymph nodes harvested, tumor size, preoperative hemoglobin level, and the interaction term between N stage and treatment modality.
In specific implementation, overall survival (OS) was used as the study endpoint, and survival time was defined as the interval from the date of first admission to the date of death or of the last follow-up. Screening was performed using a Cox proportional hazards regression model. The first step is preliminary selection of predictor variables. The potential variables are divided into 4 dimensions according to their intrinsic properties and clinical relevance: patient characteristic data, tumor pathological characteristic data, treatment condition data, and laboratory test index data. Multivariable Cox regression is performed within each dimension, variables are screened by backward stepwise elimination, and the model with the minimum AIC value is selected. The second step is determination of the predictor variables. Multivariable Cox regression is performed on the candidate variables obtained in the first step, predictor variables are screened according to clinical significance and backward stepwise elimination, the pairwise interaction terms of the predictor variables are entered into the Cox model one by one, and the prognosis prediction variables for patients after radical resection of esophageal squamous cell carcinoma are determined according to the AIC criterion.
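The backward stepwise screening by AIC described above can be sketched generically. In the sketch below, `fit_loglik` is a hypothetical callable standing in for a Cox model fit: it takes a set of variable names and returns the maximized (partial) log-likelihood. The parameter count is assumed to equal the number of variables, which would not hold for multi-level categorical variables, and the toy likelihood in the usage example is invented purely for illustration.

```python
def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: 2k - 2 ln(L)."""
    return 2 * n_params - 2 * log_likelihood


def backward_stepwise(variables, fit_loglik):
    """Drop one variable at a time while doing so lowers the AIC.

    Assumes one parameter per variable (a simplification).
    Returns (selected_variables, aic_value).
    """
    current = frozenset(variables)
    current_aic = aic(fit_loglik(current), len(current))
    while current:
        candidates = []
        for var in sorted(current):
            reduced = current - {var}
            candidates.append((aic(fit_loglik(reduced), len(reduced)), var, reduced))
        best_aic, _, best_set = min(candidates, key=lambda c: (c[0], c[1]))
        if best_aic >= current_aic:
            break  # no single removal improves the AIC
        current, current_aic = best_set, best_aic
    return current, current_aic


# Toy stand-in for a Cox fit: each variable adds a fixed gain to the
# log-likelihood (hypothetical numbers, not from the source).
GAINS = {"age": 5.0, "sex": 3.0, "noise": 0.2}


def toy_loglik(selected):
    return -100.0 + sum(GAINS[v] for v in selected)


selected, selected_aic = backward_stepwise(GAINS, toy_loglik)
```

In this toy example the weakly informative variable is removed because dropping it lowers the AIC, while removing either of the other two would raise it.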
In specific implementation, multivariable Cox regression is used to screen variables within the 4 dimensions of patient characteristic data, tumor pathological characteristic data, treatment condition data, and laboratory test index data, and 16 candidate variables that can predict overall survival after radical resection of esophageal squamous cell carcinoma are preliminarily selected: age, sex, comorbidity, family history of esophageal cancer, primary tumor location, T stage, N stage, number of lymph nodes harvested, treatment modality, surgical approach, tumor size, preoperative erythrocyte level, preoperative hemoglobin level, preoperative eosinophilic lymphocyte count, preoperative systemic immune-inflammation index, and preoperative albumin-to-globulin ratio. Multivariable Cox regression is then performed on these 16 candidate variables, modeling variables are obtained by screening according to clinical significance and backward stepwise elimination, the pairwise interaction terms of the modeling variables are entered into the Cox model one by one, and the structure of the prognosis prediction model for patients after radical resection of esophageal squamous cell carcinoma is determined according to the AIC criterion.
In this embodiment, the prognosis prediction variables for patients after radical resection of esophageal squamous cell carcinoma include: age, sex, primary tumor location, T stage, number of lymph nodes harvested, tumor size, preoperative hemoglobin level, and the interaction term between N stage and treatment modality. The interaction term between N stage and treatment modality means that the four treatment modalities affect survival differently at different N stages. For patients in stage N0, any postoperative adjuvant treatment is associated with reduced survival. For patients in stage N1, postoperative chemotherapy alone or radiotherapy alone can improve survival. For patients in stage N2 or N3, any postoperative adjuvant therapy can improve survival, with the greatest benefit from adjuvant chemotherapy alone.
In this embodiment, the method for constructing a prognosis prediction model for patients after radical resection of esophageal squamous cell carcinoma further includes:
extracting the regression coefficients corresponding to the prognosis prediction variables in the prognosis prediction model for patients after radical resection of esophageal squamous cell carcinoma;
constructing a nomogram according to the regression coefficients;
assigning a score to each value level of each prognosis prediction variable according to the nomogram;
determining the total score corresponding to the prognosis prediction variables of a patient according to the assigned scores;
and calculating the survival probability of the patient after radical resection of esophageal squamous cell carcinoma according to the total score and the functional conversion relation.
In specific implementation, a nomogram is drawn, the score of each factor is calculated, and the survival probability at a specific future year is estimated from the total score. The basic principle of the nomogram is that each value level of each factor is assigned a score (Points; the reference group scores 0) according to the contribution (regression coefficient) of each predictor to the outcome variable in the regression model; all the scores are then added to obtain a total score (Total Points); finally, the probability of the outcome for the individual is calculated through the functional conversion relation between the total score and the probability of the outcome event. Fig. 2 shows the nomogram. The score corresponding to each item of a patient's prognosis prediction characteristic data is looked up in table 2, and the total score is then obtained by summing these scores. Further, the survival probability of the patient after radical resection of esophageal squamous cell carcinoma is calculated according to the functional conversion relation in table 3.
TABLE 2
[Table 2 is provided as an image in the original publication.]
TABLE 3
[Table 3 is provided as an image in the original publication.]
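The nomogram scoring principle described above can be sketched as follows. This is an illustration under stated assumptions, not the conversion in tables 2 and 3: the coefficients, the `POINTS_PER_UNIT` scale, and the baseline survival value are all hypothetical, and the conversion from total score to survival probability uses the standard Cox relation S(t | x) = S0(t)^exp(linear predictor).

```python
import math

# Hypothetical regression coefficients per value level (reference level = 0).
COEFFICIENTS = {
    "sex": {"female": 0.0, "male": 0.25},
    "t_stage": {"T1": 0.0, "T3": 0.5},
}
POINTS_PER_UNIT = 10.0  # hypothetical scale: points per unit of coefficient


def total_points(patient_levels):
    """Sum the points of each variable at the patient's value level."""
    return sum(
        COEFFICIENTS[var][level] * POINTS_PER_UNIT
        for var, level in patient_levels.items()
    )


def survival_probability(points, baseline_survival):
    """Convert total points to survival probability via S0(t) ** exp(lp)."""
    linear_predictor = points / POINTS_PER_UNIT
    return baseline_survival ** math.exp(linear_predictor)


reference_points = total_points({"sex": "female", "t_stage": "T1"})
patient_points = total_points({"sex": "male", "t_stage": "T3"})
patient_survival = survival_probability(patient_points, baseline_survival=0.8)
```

A patient at every reference level scores 0 and keeps the baseline survival probability; higher total points lower the predicted survival.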
In this embodiment, the method for constructing a prognosis prediction model of a patient after esophageal squamous cell carcinoma radical surgery further includes: after a prognosis prediction model of a patient after esophageal squamous cell carcinoma radical surgery is constructed, the prognosis prediction model of the patient after esophageal squamous cell carcinoma radical surgery is evaluated. Wherein, the evaluation of the model comprises discrimination evaluation and/or calibration evaluation.
In particular, discrimination indicates the ability of the model to correctly distinguish whether an individual will experience the outcome event, and can be evaluated using Harrell's concordance index (C-index). All study subjects are paired with one another, and the pairs for which it cannot be determined who experienced the outcome event first are excluded: pairs in which both subjects are censored (the outcome event did not occur), and pairs in which one subject is censored and the other experienced the outcome event, with the censoring time earlier than the event time. In the remaining pairs, the survival times or predicted probabilities of the two subjects are compared, and a pair is called concordant if the predicted result matches the actual result. The proportion of concordant pairs among all usable pairs is the C-index. The C-index ranges from 0.5 to 1.0. A C-index lower than 0.60 indicates poor discrimination; 0.60-0.75 indicates moderate discrimination; higher than 0.75 indicates good discrimination.
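The pairing procedure described above can be sketched in a few lines. The function below is a simplified illustration: pairs with tied survival times are skipped and tied risk scores count as half concordant, both of which are assumptions rather than details given in the source. A higher risk score is taken to mean shorter expected survival.

```python
def harrell_c_index(times, events, risk_scores):
    """Harrell's concordance index for right-censored survival data.

    times: observed follow-up times
    events: 1 if the outcome event occurred, 0 if censored
    risk_scores: model output, higher means higher predicted risk
    """
    concordant = 0.0
    usable = 0
    n = len(times)
    for i in range(n):
        for j in range(i + 1, n):
            if times[i] == times[j]:
                continue  # simplification: skip pairs with tied times
            first, second = (i, j) if times[i] < times[j] else (j, i)
            if not events[first]:
                continue  # earlier subject is censored: order is unknown
            usable += 1
            if risk_scores[first] > risk_scores[second]:
                concordant += 1.0
            elif risk_scores[first] == risk_scores[second]:
                concordant += 0.5
    if usable == 0:
        raise ValueError("no usable pairs")
    return concordant / usable
```

Skipping a pair whenever the subject with the shorter time is censored covers both excluded cases: pairs in which both subjects are censored, and pairs in which the censoring time precedes the other subject's event time.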
Calibration indicates the degree of agreement between the predicted result and the actual result, and is usually evaluated using a calibration plot (calibration curve). The basic idea is to calculate, from the prediction model, the probability that each individual experiences the outcome event, rank the individuals from the lowest to the highest predicted probability, divide them into equal groups by quantile (for example, quintiles), calculate the mean predicted probability and the mean actual probability for each group, and draw a scatter plot and curve with the predicted probability on the x axis and the actual probability on the y axis. The closer the curve is to the diagonal with a slope of 1, the more accurate the model.
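The grouping step behind a calibration plot can be sketched as below. This is a simplified illustration that assumes a binary outcome observed for every individual at a fixed time point; with censored survival data the actual probability in each group would instead be estimated, for example by the Kaplan-Meier method. The function name is hypothetical.

```python
def calibration_points(predicted, observed, n_groups=5):
    """Return (mean predicted, mean observed) for each quantile group.

    predicted: predicted event probabilities
    observed: 1 if the event occurred, 0 otherwise (no censoring assumed)
    """
    pairs = sorted(zip(predicted, observed))
    n = len(pairs)
    if n < n_groups:
        raise ValueError("need at least one individual per group")
    points = []
    for g in range(n_groups):
        chunk = pairs[g * n // n_groups:(g + 1) * n // n_groups]
        mean_pred = sum(p for p, _ in chunk) / len(chunk)
        mean_obs = sum(o for _, o in chunk) / len(chunk)
        points.append((mean_pred, mean_obs))
    return points


points = calibration_points([0.9, 0.1, 0.9, 0.1], [1, 0, 1, 0], n_groups=2)
```

Plotting the returned pairs with the predicted value on the x axis and the observed value on the y axis gives the calibration curve; points near the diagonal indicate good calibration.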
As an example, the C-index of the prediction model in the embodiment of the invention is 0.729 (95% CI: 0.714-0.744), and the C-index of the model in the external independent population is 0.695 (95% CI: 0.674-0.715), which shows that the model has good discrimination, reproducibility, and generalizability. Fig. 3 shows the calibration plots of the modeling set and the validation set, where A is the modeling set and B is the validation set. Calibration curves were drawn in both data sets: the 1-year, 3-year, and 5-year survival probability curves of the modeling set overlap well with the reference line (diagonal), and those of the validation set deviate slightly, indicating that the accuracy of the model is high. Therefore, the overall predictive performance of the prediction model is good and stable.
In an embodiment, the method for constructing the prognosis prediction model for patients after radical resection of esophageal squamous cell carcinoma further comprises: after the prognosis prediction model is constructed, constructing a risk stratification system. The nomogram total score of each study subject in the modeling set is calculated from the subject's variable information. All total scores are ranked from low to high, and, with the tertiles as cutoffs, the study subjects are divided equally into 3 risk groups: low, medium, and high. Kaplan-Meier survival curves are plotted for the three groups, and the difference in survival time between the groups is compared using the log-rank test. In the modeling set, the total score of the study subjects ranged from a minimum of 0.47 points to a maximum of 30.65 points, and the tertiles of the total score, that is, the cutoff values of the risk stratification, were 11.99 and 16.94. Based on these cutoff values, the validation set study subjects were also divided into low, medium, and high risk groups, accounting for 25.7%, 27.9%, and 46.4%, respectively. Kaplan-Meier survival curves by risk level were drawn in the modeling and validation data sets, as shown in fig. 4, where A is all study subjects in the modeling set; B is all study subjects in the validation set; C is stage 0-I study subjects in the modeling set; D is stage 0-I study subjects in the validation set; E is stage II study subjects in the modeling set; F is stage II study subjects in the validation set; G is stage III study subjects in the modeling set; H is stage III study subjects in the validation set. The survival curves of the three risk groups differ significantly: the higher the risk level, the poorer the survival.
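The risk stratification and survival curve steps can be sketched as below. The two cutoffs are the tertile values reported above; how a score exactly equal to a cutoff is assigned is an assumption, as are the function names. The Kaplan-Meier function is a plain product-limit estimator included only to illustrate how a survival curve for one risk group could be computed.

```python
LOW_CUTOFF = 11.99   # first tertile of the total score in the modeling set
HIGH_CUTOFF = 16.94  # second tertile of the total score in the modeling set


def risk_group(total_score: float) -> str:
    """Assign a risk group; boundary handling (<=) is an assumption."""
    if total_score <= LOW_CUTOFF:
        return "low"
    if total_score <= HIGH_CUTOFF:
        return "medium"
    return "high"


def kaplan_meier(times, events):
    """Product-limit estimate: list of (event time, survival probability)."""
    survival = 1.0
    curve = []
    event_times = sorted({t for t, e in zip(times, events) if e})
    for t in event_times:
        at_risk = sum(1 for x in times if x >= t)
        deaths = sum(1 for x, e in zip(times, events) if x == t and e)
        survival *= 1 - deaths / at_risk
        curve.append((t, survival))
    return curve


groups = [risk_group(s) for s in (0.47, 14.0, 30.65)]
curve = kaplan_meier([1, 2, 3, 4], [1, 0, 1, 0])
```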
The model can further subdivide patients within each stage. In the model, all variables other than the treatment modality are "unchangeable prognostic factors"; only the treatment modality can be intervened on. The model can therefore be used to estimate the survival probability and risk level of the same patient under different treatment modalities.
In an embodiment, Stata 15.0 and R 3.6.3 may be used for data processing and analysis.
The following provides a specific example to illustrate a specific application of the method for constructing the prognosis prediction model for patients after radical resection of esophageal squamous cell carcinoma in the embodiment of the invention. In this example, the patient is Zhang San, male, 64 years old, with squamous cell carcinoma of the middle esophagus; 6 lymph nodes were harvested and examined during the operation, the postoperative pathological stage was pT3N2M0, the tumor size was 4 cm, and the preoperative hemoglobin was 140 g/L. The survival probabilities achievable with different treatment regimens, obtained from the scores corresponding to the patient's characteristics, are shown in table 4.
TABLE 4
[Table 4 is provided as an image in the original publication.]
If chemotherapy is given after the operation, the patient's predicted 5-year survival probability increases from 19% to 27%, so it may be recommended that the patient go on to receive chemotherapy. The model is therefore suitable for patients with esophageal squamous cell carcinoma whose initial treatment is radical resection of esophageal cancer. Its application scenario is deciding, at postoperative evaluation, whether to give adjuvant treatment and, if so, which scheme (chemotherapy alone, radiotherapy alone, or concurrent chemoradiotherapy) is the better choice.
Based on the same inventive concept, the embodiment of the invention also provides a device for constructing a prognosis prediction model for patients after radical resection of esophageal squamous cell carcinoma, as described in the following embodiment. Because the principle by which the device solves the problem is similar to that of the construction method, the implementation of the device can refer to the implementation of the method, and repeated details are omitted.
Fig. 5 is a structural diagram of a device for constructing a prognosis prediction model of a patient after esophageal squamous carcinoma radical surgery in an embodiment of the invention, as shown in fig. 5, the device includes:
the data obtaining module 501 is configured to obtain clinical diagnosis and treatment data and follow-up survival data of a patient after esophageal squamous carcinoma radical surgery, where the clinical diagnosis and treatment data is obtained from a hospital information management system (HIS) database, and the follow-up survival data is obtained from a follow-up database;
a first variable determining module 502, configured to perform data cleaning processing on the clinical medical data, and determine latent variable categories, where the latent variable categories include: patient characteristic variables, tumor pathology characteristic variables, treatment condition variables, and test index variables;
a second variable determination module 503, configured to perform, according to the follow-up survival data, multivariable Cox regression analysis separately on the patient characteristic variables, the tumor pathological characteristic variables, the treatment condition variables, and the laboratory test index variables, screen variables using a backward stepwise algorithm and the Akaike information criterion (AIC), perform multivariable Cox regression analysis again on the screened candidate variables, and screen variables again based on the backward stepwise algorithm and the Akaike information criterion to obtain modeling variables;
a model construction module 504, configured to perform multivariable Cox regression analysis on the modeling variables and their pairwise interaction terms, and construct the prognosis prediction model for patients after radical resection of esophageal squamous cell carcinoma using the Akaike information criterion, where the model includes the following prognosis prediction variables: age, sex, primary tumor location, T stage, number of lymph nodes harvested, tumor size, preoperative hemoglobin level, and the interaction term between N stage and treatment modality.
In summary, in the embodiment of the present invention, clinical diagnosis and treatment data and follow-up survival data of patients after radical treatment of esophageal squamous cell carcinoma are obtained; the clinical diagnosis and treatment data are obtained from a hospital information system (HIS) database, and the follow-up survival data are obtained from a follow-up database. Data cleaning is performed on the clinical diagnosis and treatment data and the potential variable categories are determined, namely: patient characteristic variables, tumor pathological characteristic variables, treatment condition variables, and test index variables. Based on the follow-up survival data, multivariable Cox regression analysis is performed separately on the patient characteristic variables, the tumor pathological characteristic variables, the treatment condition variables, and the test index variables, and variables are screened using a backward stepwise algorithm and the Akaike information criterion (AIC). Multivariable Cox regression analysis is then performed again on the screened candidate variables, and the same backward stepwise algorithm and AIC are used to obtain the modeling variables. Finally, multivariable Cox regression analysis is performed on the modeling variables and their pairwise interaction terms, and the prognosis prediction model for patients after radical treatment of esophageal squamous cell carcinoma is constructed using the AIC. The model comprises the following prognosis prediction variables: age, sex, primary tumor location, T stage, number of lymph nodes resected, tumor size, preoperative hemoglobin level, and the interaction term between N stage and treatment modality.
Using the prognosis prediction variables screened in this way to predict the prognosis of patients after radical treatment of esophageal squamous cell carcinoma can effectively improve the accuracy of prognosis prediction, allows the patient groups that benefit most from different treatment regimens to be identified, supports precise and individualized treatment of esophageal squamous cell carcinoma, and thereby improves curative effect and patient survival.
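The screening step summarized above, backward stepwise elimination under the Akaike information criterion (AIC = 2k - 2 log L), can be sketched as follows. This is a minimal illustration and not the patent's implementation: the names `backward_stepwise_aic`, `toy_loglik`, and `TOY_GAIN`, as well as all variable names and numbers, are hypothetical. In practice `fit_loglik` would fit a multivariable Cox model on the given variables and return its partial log-likelihood; here a toy stand-in is used so the example runs with the standard library alone, and each variable is assumed to contribute exactly one parameter.

```python
from typing import Callable, FrozenSet, Iterable, List, Tuple


def aic(log_likelihood: float, n_params: int) -> float:
    """Akaike information criterion: AIC = 2k - 2*logL (lower is better)."""
    return 2 * n_params - 2 * log_likelihood


def backward_stepwise_aic(
    candidates: Iterable[str],
    fit_loglik: Callable[[FrozenSet[str]], float],
) -> Tuple[List[str], float]:
    """Start from all candidates and drop one variable at a time
    for as long as the best single removal lowers the AIC."""
    selected = frozenset(candidates)
    best = aic(fit_loglik(selected), len(selected))
    while selected:
        # Score every model obtained by removing a single variable.
        trials = [
            (aic(fit_loglik(selected - {v}), len(selected) - 1), v)
            for v in sorted(selected)
        ]
        trial_aic, drop = min(trials)
        if trial_aic >= best:  # no removal improves the AIC: stop
            break
        selected, best = selected - {drop}, trial_aic
    return sorted(selected), best


# Toy stand-in for a fitted Cox model (hypothetical numbers): each variable
# adds a fixed amount to the partial log-likelihood.
TOY_GAIN = {
    "age": 6.0,
    "sex": 3.5,
    "t_stage": 9.0,
    "insurance_type": 0.2,
    "blood_type": 0.4,
}


def toy_loglik(variables: FrozenSet[str]) -> float:
    return -500.0 + sum(TOY_GAIN[v] for v in variables)


kept, final_aic = backward_stepwise_aic(TOY_GAIN, toy_loglik)
print(kept, final_aic)
```

Under these assumptions, a variable is dropped only when it raises the log-likelihood by less than one unit, since removing it saves 2 in the penalty term. The two-stage procedure in the text would call such a routine once per variable category and then once more on the union of the retained variables.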
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
The specific embodiments described above further illustrate the objects, technical solutions, and advantages of the present invention in detail. It should be understood that they are merely exemplary embodiments of the present invention and are not intended to limit its scope of protection; any modification, equivalent substitution, or improvement made within the spirit and principle of the present invention shall fall within the scope of protection of the present invention.

Claims (10)

1. A method for constructing a prognosis prediction model for patients after radical treatment of esophageal squamous cell carcinoma, characterized by comprising the following steps:
obtaining clinical diagnosis and treatment data and follow-up survival data of patients after radical treatment of esophageal squamous cell carcinoma, wherein the clinical diagnosis and treatment data are obtained from a hospital information system (HIS) database, and the follow-up survival data are obtained from a follow-up database;
performing data cleaning on the clinical diagnosis and treatment data, and determining potential variable categories, wherein the potential variable categories comprise: patient characteristic variables, tumor pathological characteristic variables, treatment condition variables, and test index variables;
according to the follow-up survival data, performing multivariable Cox regression analysis separately on the patient characteristic variables, the tumor pathological characteristic variables, the treatment condition variables, and the test index variables, and screening variables using a backward stepwise algorithm and the Akaike information criterion; then performing multivariable Cox regression analysis again on the screened candidate variables, and screening variables based on the backward stepwise algorithm and the Akaike information criterion to obtain modeling variables;
performing multivariable Cox regression analysis on the modeling variables and their pairwise interaction terms, and constructing the prognosis prediction model for patients after radical treatment of esophageal squamous cell carcinoma using the Akaike information criterion, wherein the prognosis prediction model comprises prognosis prediction variables of patients after radical treatment of esophageal squamous cell carcinoma, and the prognosis prediction variables comprise: age, sex, primary tumor location, T stage, number of lymph nodes resected, tumor size, preoperative hemoglobin level, and the interaction term between N stage and treatment modality.
2. The method for constructing a prognosis prediction model for patients after radical treatment of esophageal squamous cell carcinoma according to claim 1, wherein the follow-up survival data include: follow-up date data, follow-up mode data, survival status data, death date data, and cause-of-death data.
3. The method for constructing a prognosis prediction model for patients after radical treatment of esophageal squamous cell carcinoma according to claim 1, wherein the patient characteristic data include: demographic characteristic data, physical condition data, personal history data, past medical history data, medical insurance type data, and family history data;
the tumor pathological characteristic data include: primary tumor location data, degree-of-differentiation data, and pathological characteristic data;
the treatment condition data include: surgical condition data and initial treatment data;
the test index data include: routine blood test data, blood biochemistry data, and combined variable data.
4. The method for constructing a prognosis prediction model for patients after radical treatment of esophageal squamous cell carcinoma according to claim 1, further comprising:
extracting the regression coefficients corresponding to the prognosis prediction variables in the prognosis prediction model for patients after radical treatment of esophageal squamous cell carcinoma;
establishing a nomogram according to the regression coefficients;
assigning a score to each value level of each prognosis prediction variable according to the nomogram;
determining, according to the scoring result, the total score corresponding to the prognosis prediction variables of a patient after radical treatment of esophageal squamous cell carcinoma;
and calculating the survival probability of the patient after radical treatment of esophageal squamous cell carcinoma according to the total score and a function conversion relation.
5. A device for constructing a prognosis prediction model for patients after radical treatment of esophageal squamous cell carcinoma, characterized by comprising:
a data acquisition module, used for obtaining clinical diagnosis and treatment data and follow-up survival data of patients after radical treatment of esophageal squamous cell carcinoma, wherein the clinical diagnosis and treatment data are obtained from a hospital information system (HIS) database, and the follow-up survival data are obtained from a follow-up database;
a first variable determination module, used for performing data cleaning on the clinical diagnosis and treatment data and determining potential variable categories, wherein the potential variable categories comprise: patient characteristic variables, tumor pathological characteristic variables, treatment condition variables, and test index variables;
a second variable determination module, used for performing, according to the follow-up survival data, multivariable Cox regression analysis separately on the patient characteristic variables, the tumor pathological characteristic variables, the treatment condition variables, and the test index variables, screening variables using a backward stepwise algorithm and the Akaike information criterion, performing multivariable Cox regression analysis again on the screened candidate variables, and screening variables based on the backward stepwise algorithm and the Akaike information criterion to obtain modeling variables;
a model construction module, used for performing multivariable Cox regression analysis on the modeling variables and their pairwise interaction terms and constructing the prognosis prediction model for patients after radical treatment of esophageal squamous cell carcinoma using the Akaike information criterion, wherein the prognosis prediction model comprises prognosis prediction variables of patients after radical treatment of esophageal squamous cell carcinoma, and the prognosis prediction variables comprise: age, sex, primary tumor location, T stage, number of lymph nodes resected, tumor size, preoperative hemoglobin level, and the interaction term between N stage and treatment modality.
6. The device for constructing a prognosis prediction model for patients after radical treatment of esophageal squamous cell carcinoma according to claim 5, wherein the follow-up survival data include: follow-up date data, follow-up mode data, survival status data, death date data, and cause-of-death data.
7. The device for constructing a prognosis prediction model for patients after radical treatment of esophageal squamous cell carcinoma according to claim 5, wherein the patient characteristic data include: demographic characteristic data, physical condition data, personal history data, past medical history data, medical insurance type data, and family history data;
the tumor pathological characteristic data include: primary tumor location data, degree-of-differentiation data, and pathological characteristic data;
the treatment condition data include: surgical condition data and initial treatment data;
the test index data include: routine blood test data, blood biochemistry data, and combined variable data.
8. The device for constructing a prognosis prediction model for patients after radical treatment of esophageal squamous cell carcinoma according to claim 5, further comprising:
a regression coefficient extraction module, used for extracting the regression coefficients corresponding to the prognosis prediction variables in the prognosis prediction model for patients after radical treatment of esophageal squamous cell carcinoma;
a nomogram establishing module, used for establishing a nomogram according to the regression coefficients;
a value level scoring module, used for assigning a score to each value level of each prognosis prediction variable according to the nomogram;
a total score determining module, used for determining, according to the scoring result, the total score corresponding to the prognosis prediction variables of a patient after radical treatment of esophageal squamous cell carcinoma;
and a survival probability calculation module, used for calculating the survival probability of the patient after radical treatment of esophageal squamous cell carcinoma according to the total score and a function conversion relation.
9. A computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the method of any one of claims 1 to 4 when executing the computer program.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program for executing the method of any one of claims 1 to 4.
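The nomogram steps recited in the claims (regression coefficients, per-level scores, total score, survival probability) can be sketched as below. This is only an illustration under stated assumptions: the coefficients, the two variables shown, and the baseline survival value are hypothetical and are not taken from the patent, and the "function conversion relation" is assumed to be the usual Cox relation S(t | x) = S0(t)^exp(lp), which the text does not spell out. The points scale follows the common nomogram convention that the variable with the widest effect range spans 0 to 100 points.

```python
import math
from typing import Dict

# Hypothetical regression coefficients (log hazard ratios) per value level,
# relative to a reference level coded 0.0. Not values from the patent.
COEFS: Dict[str, Dict[str, float]] = {
    "sex": {"female": 0.0, "male": 0.30},
    "t_stage": {"T1": 0.0, "T2": 0.40, "T3": 0.80, "T4": 1.20},
}
BASELINE_SURVIVAL_5Y = 0.70  # hypothetical S0(5 years) at all reference levels

# Scale so that the largest single-variable effect range maps to 100 points.
MAX_RANGE = max(
    max(levels.values()) - min(levels.values()) for levels in COEFS.values()
)
POINTS_PER_UNIT = 100.0 / MAX_RANGE
# Linear predictor of a patient who scores 0 points on every variable.
LP_OFFSET = sum(min(levels.values()) for levels in COEFS.values())


def level_points(variable: str, level: str) -> float:
    """Points assigned to one value level of one prediction variable."""
    levels = COEFS[variable]
    return (levels[level] - min(levels.values())) * POINTS_PER_UNIT


def total_points(patient: Dict[str, str]) -> float:
    """Sum of the points over all prediction variables of a patient."""
    return sum(level_points(var, lvl) for var, lvl in patient.items())


def survival_probability(points: float) -> float:
    """Convert total points back to the linear predictor, then apply
    the assumed Cox relation S(t | x) = S0(t) ** exp(lp)."""
    lp = points / POINTS_PER_UNIT + LP_OFFSET
    return BASELINE_SURVIVAL_5Y ** math.exp(lp)


patient = {"sex": "male", "t_stage": "T3"}
points = total_points(patient)
print(round(points, 1), round(survival_probability(points), 3))
```

A patient at the reference level of every variable scores 0 points and is assigned the baseline survival probability; higher totals map to lower predicted survival.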
CN202110505452.5A 2021-05-10 2021-05-10 Method and device for constructing prognosis prediction model of patient after radical esophageal squamous carcinoma treatment Active CN113270188B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110505452.5A CN113270188B (en) 2021-05-10 2021-05-10 Method and device for constructing prognosis prediction model of patient after radical esophageal squamous carcinoma treatment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110505452.5A CN113270188B (en) 2021-05-10 2021-05-10 Method and device for constructing prognosis prediction model of patient after radical esophageal squamous carcinoma treatment

Publications (2)

Publication Number Publication Date
CN113270188A (en) 2021-08-17
CN113270188B (en) 2024-07-02

Family

ID=77230306

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110505452.5A Active CN113270188B (en) 2021-05-10 2021-05-10 Method and device for constructing prognosis prediction model of patient after radical esophageal squamous carcinoma treatment

Country Status (1)

Country Link
CN (1) CN113270188B (en)

Also Published As

Publication number Publication date
CN113270188B (en) 2024-07-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant