CN115206527A - Cerebral infarction surgery patient survival risk classification method based on machine learning - Google Patents


Info

Publication number
CN115206527A
Authority
CN
China
Prior art keywords
cerebral infarction
module
data
model
patients
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210760511.8A
Other languages
Chinese (zh)
Inventor
卢莉
黄文弘
王琳娜
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University
Priority to CN202210760511.8A
Publication of CN115206527A
Legal status: Pending

Classifications

    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning
    • G06N20/20: Ensemble learning
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00: ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60: ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/20: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for computer-aided diagnosis, e.g. based on medical expert systems
    • G: PHYSICS
    • G16: INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H: HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Medical Informatics (AREA)
  • Public Health (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Software Systems (AREA)
  • Databases & Information Systems (AREA)
  • Pathology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • Molecular Biology (AREA)
  • Management, Administration, Business Operations System, And Electronic Commerce (AREA)

Abstract

The invention provides a machine learning-based method for classifying the survival risk of cerebral infarction surgery patients. The method is implemented on a survival prediction system for perioperative cerebral infarction patients, which comprises an acquisition module, a prediction module and an output module. The acquisition module is used for inputting cerebral infarction patient data; the prediction module is used for feeding the patient data into a prediction model to predict the survival period; and the output module is used for outputting the prediction result. The prediction model comprises first-layer base models and a second-layer logistic regression model. The base models are divided into a first base model, a second base model and a third base model, wherein the first base model is a comprehensive random forest model, the second base model is an XGBoost model, and the third base model is an MLP model. The invention addresses the problems in the prior art that the patient's golden survival time is delayed because of uneven ability among medical staff, or that the survival time of cerebral infarction patients is shortened by serious side effects of excessive care.

Description

Cerebral infarction surgery patient survival risk classification method based on machine learning
Technical Field
The invention belongs to the technical field of medical equipment intelligence, and particularly relates to a method for classifying survival risks of patients with cerebral infarction surgery based on machine learning.
Background
Cerebral infarction, also called ischemic stroke, refers to the softening and necrosis of local brain tissue caused by impaired blood circulation, ischemia and hypoxia. The survival of patients with cerebral infarction has long been one of physicians' most serious concerns. During the perioperative period, in addition to decisions about the medical treatment the patient needs, the attending physician's clinical ability, ability to adapt to changing circumstances, and choice of medication and treatment are closely related to the survival rate of patients with cerebral infarction.
In the prior art, the attending physician usually makes specific treatment decisions by examining the physical state of the cerebral infarction patient and drawing on personal medical experience. However, in many remote areas or places where medical care is underdeveloped, the patient's golden survival time is delayed because of the uneven ability of attending physicians, or the survival time of the cerebral infarction patient is shortened by serious side effects of excessive care. Targeted treatment decisions for such patients that rely solely on medical staff and conventional medical techniques therefore still carry considerable risk and limitations.
Disclosure of Invention
The invention aims to solve at least the technical problems in the prior art. It provides a machine learning-based method for classifying the survival risk of cerebral infarction surgery patients, and addresses the problems in the prior art that the patient's golden survival time is delayed because of uneven ability among medical staff, or that the patient's survival time is shortened by serious side effects of excessive care.
To achieve the above object, according to a first aspect of the present invention, the invention provides a machine learning-based method for classifying the survival risk of cerebral infarction surgery patients. The method is implemented on a survival prediction system for perioperative cerebral infarction patients, which comprises an acquisition module, a prediction module and an output module; the acquisition module is used for inputting cerebral infarction patient data; the prediction module is used for feeding the cerebral infarction patient data into the prediction model for survival prediction; and the output module is used for outputting prediction results. The prediction model comprises first-layer base models and a second-layer logistic regression model; the base models are divided into a first base model, a second base model and a third base model, wherein the first base model is a comprehensive random forest model, the second base model is an XGBoost model, and the third base model is an MLP model.
In another preferred embodiment of the present invention, the system further comprises a data processing module for performing data change processing on the data of the cerebral infarction patient.
In another preferred embodiment of the present invention, the data processing module comprises a culling module, a cleaning module and a transformation module;
the culling module is used for removing features whose actual missing rate in the cerebral infarction patient data is greater than the standard missing rate;
the cleaning module is used for processing missing values of the data of the cerebral infarction patient;
the transformation module is used for carrying out feature coding processing and data normalization processing on the cerebral infarction patient data.
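As a rough illustration of the culling step, the following minimal sketch drops every feature whose share of missing values exceeds a threshold. The function name `cull_sparse_features`, the record layout and the toy values are hypothetical; the 30% default mirrors the preferred standard missing rate given in the detailed description.

```python
def cull_sparse_features(records, max_missing_rate=0.30):
    """Drop features whose share of missing values (None) exceeds the threshold."""
    if not records:
        return [], []
    features = list(records[0].keys())
    n_rows = len(records)
    kept = []
    for name in features:
        missing = sum(1 for row in records if row.get(name) is None)
        if missing / n_rows <= max_missing_rate:
            kept.append(name)
    # Rebuild every record with only the retained features.
    culled = [{name: row.get(name) for name in kept} for row in records]
    return culled, kept


# Hypothetical toy records; None marks a missing value.
patients = [
    {"age": 71, "sex": "F", "lab_x": None},
    {"age": 64, "sex": "M", "lab_x": None},
    {"age": None, "sex": "M", "lab_x": 1.2},
    {"age": 58, "sex": "F", "lab_x": None},
]
cleaned, kept_features = cull_sparse_features(patients)
print(kept_features)  # ['age', 'sex']
```

Here `age` is missing in 25% of the rows and is kept, while `lab_x` is missing in 75% and is removed; the remaining gaps would then be handed to the cleaning module.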
In another preferred embodiment of the present invention, the missing value processing fills in missing values using the MissForest filling method.
In another preferred embodiment of the present invention, the feature encoding process is specifically to encode the cerebral infarction patient data by using One-hot encoding rule.
In another preferred embodiment of the present invention, the data normalization process specifically uses a standard deviation normalization method combined with a maximum value normalization method to process the data of the cerebral infarction patient.
In another preferred embodiment of the present invention, the system further comprises an optimization module for optimizing the hyper-parameters of the prediction model, wherein the optimization module comprises a screening module, a feature selection module and an assignment module;
the screening module is used for screening the data of the cerebral infarction patient from the patient database;
the feature selection module is used for ranking the feature importance scores of the processed cerebral infarction patient data and selecting the features whose importance scores are greater than the standard score to form a test set;
and the assignment module is used for putting the test set into the base models for hyper-parameter optimization to obtain optimized base model hyper-parameters, and for assigning the optimized hyper-parameters back to the base models.
In another preferred embodiment of the present invention, the feature selection module is specifically configured to rank the feature importance scores of the processed data of the cerebral infarction patients, screen out features with importance scores greater than a standard score, form a data set, and divide the data set into a test set and a training set;
the optimization module further comprises a training module, and the training module is used for putting the training set into the base model endowed with the hyperparameters again for training.
In another preferred embodiment of the present invention, the hyper-parameter optimization process employs a method combining genetic algorithm and cross-validation.
The beneficial technical effects of the technical scheme of the invention include the following. The invention introduces an ensemble learning method. Random forest is a Bagging algorithm, XGBoost is a Boosting algorithm, and MLP is a deep network method; the three differ greatly in how they learn, and each has its own advantages and disadvantages. The method integrates random forest, XGBoost and MLP and constructs the RF-XBM model using the Stacking integration strategy, which gives full play to the advantages of the different learners and prevents overfitting. The predictions of the first-layer learners are used as input to the second-layer logistic regression model, which outputs the final prediction, thereby improving the accuracy of the prediction result.
In the invention, a large amount of cerebral infarction patient data was repeatedly combined with various data processing methods and hyper-parameter optimization methods. The processed data were fed into the prediction model, data processing methods with poor prediction performance were discarded, and the method with the highest prediction accuracy was selected, yielding the data processing module of the invention.
The prediction model of the invention can predict the patient's survival period more quickly, and the attending physician can use the prediction result to make more targeted treatment decisions. Patients with a higher likelihood of death can be given more medical resources to save their lives; for patients with a lower likelihood of death, excessive medication is avoided, which prevents side effects and avoids wasting medical resources. Because the prediction model is trained on a large amount of data, it can judge the survival time of cerebral infarction patients more accurately than ordinary diagnosis and treatment methods, reducing the risk and limitations of targeted treatment decisions made by medical staff for cerebral infarction patients.
Drawings
FIG. 1 is a schematic diagram of the architecture of the survival prediction system for perioperative patients with cerebral infarction of the present invention;
FIG. 2 is a schematic diagram of the implementation steps of the survival prediction system for perioperative patients with cerebral infarction according to the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
In the description of the present invention, it is to be understood that the terms "longitudinal", "lateral", "upper", "lower", "front", "rear", "left", "right", "vertical", "horizontal", "top", "bottom", "inner", "outer", and the like, indicate orientations or positional relationships based on those shown in the drawings, and are used merely for convenience of description and for simplicity of description, and do not indicate or imply that the referenced devices or elements must have a particular orientation, be constructed in a particular orientation, and be operated, and thus, are not to be construed as limiting the present invention.
In the description of the present invention, unless otherwise specified and limited, it should be noted that the terms "mounted," "connected," and "connected" are to be interpreted broadly, and may be, for example, a mechanical connection or an electrical connection, a communication between two elements, a direct connection, or an indirect connection through an intermediate medium, and those skilled in the art will understand the specific meaning of the terms as they are used in the specific case.
The invention provides a survival prediction system for perioperative patients with cerebral infarction, which comprises an acquisition module, a data processing module, a prediction module and an output module, wherein the data processing module is connected with the acquisition module;
the acquisition module is used for inputting cerebral infarction patient data; cerebral infarction patient data comprises patient age, sex, duration of disease onset, basic physical data and disease state; the data processing module is used for carrying out data change processing on the data of the cerebral infarction patient so that the data of the cerebral infarction patient is suitable for machine learning; the prediction module is used for inputting the data of the cerebral infarction patients into the prediction model to predict the survival time, the output module is used for outputting the prediction result, and the output result is the information of the survival time of the patients.
As shown in FIG. 2, the prediction model includes first-layer base models and a second-layer logistic regression model; the base models are divided into a first base model, a second base model and a third base model, wherein the first base model is a comprehensive random forest model, the second base model is an XGBoost model, and the third base model is an MLP model; XGBoost is a Boosting algorithm based on the Gradient Boosting Decision Tree (GBDT); MLP is a basic algorithm for deep learning networks.
In the invention, eight machine learning models were studied experimentally: K-nearest neighbors, naive Bayes, decision tree, AdaBoost, random forest, GBDT, XGBoost and MLP. Random forest is a Bagging algorithm, XGBoost is a Boosting algorithm, and MLP is a deep network method; the three differ greatly in how they learn, and each has its own advantages and disadvantages. Bagging and Boosting both train weak learners and then fuse them by averaging, voting or other methods to obtain a strong learner.
On this basis, the invention introduces an ensemble learning method with a Stacking integration strategy. Stacking differs from Bagging and Boosting in that it adds a further layer of learners: the cerebral infarction patient data are sent to the first-layer learners for training, the outputs of the first-layer learners are sent as input to the second-layer learner for further training, and the final result is taken as the output of the model. Random forest, XGBoost and MLP are integrated, and the Stacking integration strategy is used to construct the RF-XBM model, which gives full play to the advantages of the different learners and prevents overfitting.
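The two-layer structure described above can be sketched with scikit-learn's `StackingClassifier`. This is only an illustrative sketch under stated assumptions, not the patent's implementation: the data are synthetic, the hyper-parameter values are arbitrary, and `GradientBoostingClassifier` is used as a stand-in for XGBoost so that the example depends only on scikit-learn (the `XGBClassifier` class from the `xgboost` package could be substituted in its place).

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import (
    GradientBoostingClassifier,
    RandomForestClassifier,
    StackingClassifier,
)
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier


def build_stacked_model(random_state=0):
    """First layer: three base models; second layer: logistic regression."""
    base_models = [
        ("rf", RandomForestClassifier(n_estimators=50, random_state=random_state)),
        # Stand-in for XGBoost (assumption made to keep the sketch self-contained).
        ("gb", GradientBoostingClassifier(random_state=random_state)),
        ("mlp", MLPClassifier(hidden_layer_sizes=(16,), max_iter=500,
                              random_state=random_state)),
    ]
    # Out-of-fold predictions of the base models become the inputs of the
    # second-layer logistic regression, which is left at default parameters.
    return StackingClassifier(
        estimators=base_models,
        final_estimator=LogisticRegression(),
        cv=4,
    )


# Synthetic data standing in for processed patient features and risk labels.
X, y = make_classification(n_samples=300, n_features=10, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)
model = build_stacked_model().fit(X_train, y_train)
predictions = model.predict(X_test)
```

Training the meta-learner on out-of-fold predictions, rather than on predictions for data the base models have already seen, is what keeps the second layer from simply memorizing first-layer overfitting.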
In the experiments, the eight machine learning models and the RF-XBM model were trained and evaluated by cross-validation on the cerebral infarction patient data set. The accuracy, recall and F1 value of each validation fold were recorded, and their averages were taken as the result of one experiment. Each model was run in five experiments, and the average over the five experiments was taken as that model's result.
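The averaging protocol can be sketched as follows. The sketch makes two simplifying assumptions: labels are binary, and the first metric is read as precision, consistent with the results reported below. All function names and the toy predictions are hypothetical.

```python
from statistics import mean


def precision_recall_f1(y_true, y_pred, positive=1):
    """Binary precision, recall and F1 for one validation fold."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def average_metrics(metric_rows):
    """Average a list of (precision, recall, f1) tuples column by column."""
    return tuple(mean(column) for column in zip(*metric_rows))


def summarize(experiments):
    """experiments: list of runs; each run is a list of per-fold (y_true, y_pred)."""
    per_run = [
        average_metrics([precision_recall_f1(t, p) for t, p in folds])
        for folds in experiments
    ]
    return average_metrics(per_run)


# Two hypothetical folds, reused for two runs to keep the example short.
fold_a = ([1, 0, 1, 1], [1, 0, 0, 1])
fold_b = ([0, 1, 0, 1], [1, 1, 0, 1])
precision, recall, f1 = summarize([[fold_a, fold_b], [fold_a, fold_b]])
```

In the protocol described above there would be five runs per model, each contributing its own cross-validation folds.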
The results of the study are shown in Table 1. The proposed RF-XBM model performs best, with a precision (and likewise recall and F1 value) of 0.8320, a significant performance improvement. This demonstrates that the proposed integrated model works well for this experimental problem based on cerebral infarction patient data.
Table 1. Experimental results
[Table provided as image BDA0003723930080000071 in the original publication]
In a preferred embodiment, the data processing module comprises a culling module, a cleaning module and a transformation module;
the culling module is used for removing features whose actual missing rate in the cerebral infarction patient data is greater than the standard missing rate; preferably, the standard missing rate is 30%;
the cleaning module is used for processing missing values in the cerebral infarction patient data; to improve the accuracy of survival prediction for cerebral infarction patients, experiments were carried out on the processing of the patient data, and the cleaning of the patient data in these experiments was divided into two stages: missing value processing and abnormal value processing;
missing value processing: models were tested on cerebral infarction patient data sets filled by different filling methods, with the prediction result taken as the model effect; the four filling methods and their effects under different models are summarized in Table 2.
Table 2. Four filling methods and their effects under different models
[Table provided as image BDA0003723930080000081 in the original publication]
Table 2 shows that the MissForest filling method achieves the best precision, so it is selected as the missing value processing method;
abnormal value processing: abnormal values are detected with the box plot method, which summarizes a data set using only five points: the median, the upper and lower quartiles (Q3 and Q1), and the highest and lowest points of the distribution. The interquartile range is defined as IQR = Q3 - Q1, and data greater than Q3 + 1.5 × IQR or less than Q1 - 1.5 × IQR are considered outliers.
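The box plot rule can be written in a few lines. This is a minimal sketch that uses the standard library's `statistics.quantiles` with the "inclusive" method; quartile conventions vary between tools, so the exact fences may differ slightly from those of other software. The function name and the readings are hypothetical.

```python
from statistics import quantiles


def iqr_outliers(values, k=1.5):
    """Return the values outside [Q1 - k*IQR, Q3 + k*IQR]."""
    q1, _median, q3 = quantiles(values, n=4, method="inclusive")
    iqr = q3 - q1
    lower, upper = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < lower or v > upper]


# Hypothetical measurements with one implausible reading.
readings = [10, 12, 12, 13, 12, 11, 14, 13, 15, 102]
print(iqr_outliers(readings))  # [102]
```

For these readings Q1 = 12 and Q3 = 13.75, so the fences are 9.375 and 16.375 and only the value 102 is flagged.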
The experimental group treats the abnormal values detected by the box plot as missing values and fills them using the MissForest filling method; the control group leaves abnormal values unprocessed. Models are trained on the resulting data, and the training results of the experimental group and the control group are shown in Table 3.
Table 3. Model training results of the experimental group and the control group
[Table provided as image BDA0003723930080000091 in the original publication]
As the comparative experiment in Table 3 shows, the accuracy of the control group is higher than that of the experimental group, so the method of the invention follows the control group for abnormal values, i.e. abnormal values are not processed.
The invention also examines data imbalance processing, for which there are three main options: (1) oversampling; (2) undersampling; (3) no processing. The invention mainly compares oversampling with performing no data imbalance processing;
commonly used oversampling methods currently include: (1) SMOTE; (2) ROS; (3) ADASYN; (4) SMOTE-Borderline; (5) SMOTE-SVM. Grouped experiments were carried out on these oversampling methods, with the groups set as follows: samples not processed, samples processed by SMOTE, samples processed by ROS, samples processed by ADASYN, samples processed by SMOTE-Borderline, samples processed by SMOTE-SVM, and samples processed by the new research method. In the experiments, the cerebral infarction patient data set was first divided into a training set and a test set; the training set was oversampled while the test set was left unprocessed; XGBoost was used for training; and the model effects of all groups were compared. The experimental results are shown in Table 4;
Table 4. Micro-precision of each group under the XGBoost model
[Table provided as image BDA0003723930080000101 in the original publication]
As Table 4 shows, performing no data imbalance processing works better than any of the oversampling methods. A likely reason is that oversampling relies on the structural information of the minority-class samples, so when the minority class is poorly represented it may fail to help or may even degrade the model. Data imbalance processing is therefore not applied to the cerebral infarction patient data.
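The experimental protocol for the oversampling groups (split first, oversample only the training set, leave the test set untouched) can be sketched with random oversampling (ROS), the simplest of the listed methods. The function name, the 30% test ratio and the toy labels are assumptions made for illustration.

```python
import random
from collections import Counter


def split_then_oversample(samples, labels, test_ratio=0.3, seed=0):
    """Split into train/test, then balance only the training set by ROS."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    rng.shuffle(indices)
    n_test = int(len(indices) * test_ratio)
    test_idx, train_idx = indices[:n_test], indices[n_test:]

    train = [(samples[i], labels[i]) for i in train_idx]
    test = [(samples[i], labels[i]) for i in test_idx]

    # Group training rows by class.
    by_class = {}
    for row in train:
        by_class.setdefault(row[1], []).append(row)

    # Duplicate randomly chosen rows of the smaller classes until every
    # class matches the size of the largest one.
    target = max(len(rows) for rows in by_class.values())
    balanced = []
    for rows in by_class.values():
        balanced.extend(rows)
        balanced.extend(rng.choices(rows, k=target - len(rows)))
    rng.shuffle(balanced)
    return balanced, test


# Hypothetical imbalanced data: 4 positive and 16 negative samples.
samples = list(range(20))
labels = [1] * 4 + [0] * 16
balanced_train, test_set = split_then_oversample(samples, labels)
class_counts = Counter(label for _, label in balanced_train)
```

Oversampling after the split matters: duplicating rows before splitting would leak copies of training samples into the test set and inflate the measured performance.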
The data processing experiments show that: (1) comparing the four missing value processing methods, the MissForest filling method is superior to KNN filling, mean filling and iterative filling; (2) comparing abnormal value processing with no processing, leaving abnormal values unprocessed gives better results; (3) the idea of data integration, data change and feature selection is proposed; (4) comparing the six data imbalance processing methods, performing no data imbalance processing gives the best result.
The transformation module is used for performing feature encoding and data normalization on the cerebral infarction patient data; the feature encoding specifically encodes the cerebral infarction patient data using one-hot encoding; the data normalization specifically processes the cerebral infarction patient data using a method that combines standard deviation normalization with maximum value normalization.
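A minimal sketch of the two transformation steps follows. The text does not state how the two normalizations are combined, so the sketch assumes one possible reading: standardize each numeric feature (z-score) and then divide by the maximum absolute value so that the result lies in [-1, 1]. All function names and values are hypothetical.

```python
from statistics import mean, pstdev


def one_hot(values):
    """Encode a categorical column as one 0/1 column per category."""
    categories = sorted(set(values))
    rows = [[1 if v == c else 0 for c in categories] for v in values]
    return rows, categories


def zscore(values):
    """Standard deviation normalization: zero mean, unit variance."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        return [0.0 for _ in values]
    return [(v - mu) / sigma for v in values]


def max_abs_scale(values):
    """Maximum value normalization: divide by the largest absolute value."""
    peak = max(abs(v) for v in values)
    return [v / peak for v in values] if peak else list(values)


def normalize(values):
    # Assumed combination (not specified in the source): z-score, then max-abs.
    return max_abs_scale(zscore(values))


sex = ["F", "M", "M", "F"]
encoded, categories = one_hot(sex)   # categories == ['F', 'M']
ages = [50, 60, 70, 80]
scaled = normalize(ages)             # values lie in [-1, 1]
```

One-hot encoding avoids imposing an artificial order on categorical features, which matters for the MLP base model in particular.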
In a preferred embodiment, the system further comprises an optimization module for optimizing the hyper-parameters of the prediction model, wherein the optimization module comprises a screening module, a feature selection module and an assignment module;
the screening module is used for screening the cerebral infarction patient data from the patient database;
the feature selection module is used for ranking the feature importance scores of the processed cerebral infarction patient data, selecting the features whose importance scores are greater than the standard score to form a data set, and dividing the data set into a test set and a training set, wherein the test set accounts for 30%; preferably, the importance score ranking is based on the output of the XGBoost model, and the standard score is 100;
the assignment module is used for putting the test set into the base models for hyper-parameter optimization to obtain optimized base model hyper-parameters, and for assigning the optimized hyper-parameters back to the base models; the hyper-parameters are divided into network structure parameters and model training parameters; the network structure parameters include: the number and type of intermediate network layers, the number of neurons in each layer, and the activation function; the model training parameters include: the loss function, the optimization method, the batch size, the number of iterations, the learning rate, and the regularization method and its coefficient;
the hyper-parameter optimization adopts a method that combines a genetic algorithm with cross-validation. The process is as follows: the population consists of 50 chromosomes, each a hyper-parameter combination; the population is evaluated using 4-fold cross-validation; in each offspring generation, 10% of the hyper-parameters are reassigned random values; parent chromosomes exchange 50% of their genes in crossover; and the 3 hyper-parameter combinations with the best fitness in the previous generation are copied directly into the next generation.
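The search loop can be sketched as below, using the stated settings (population of 50, 4-fold cross-validation, 10% mutation, 50% gene crossover, 3 elites). Everything else is an assumption made for illustration: the toy one-dimensional k-NN classifier, the `SEARCH_SPACE` grid, the number of generations, choosing parents from the fitter half of the population, and reading the 10% and 50% figures as per-gene probabilities.

```python
import random

# Hypothetical hyper-parameter grid for a toy 1-D k-NN classifier.
SEARCH_SPACE = {"k": [1, 3, 5, 7, 9], "weighted": [False, True]}


def knn_predict(train, x, k, weighted):
    """Classify x by (optionally distance-weighted) vote of its k neighbours."""
    neighbours = sorted(train, key=lambda row: abs(row[0] - x))[:k]
    votes = {}
    for xi, yi in neighbours:
        weight = 1.0 / (abs(xi - x) + 1e-9) if weighted else 1.0
        votes[yi] = votes.get(yi, 0.0) + weight
    return max(votes, key=votes.get)


def cv_fitness(params, data, folds=4):
    """Fitness = mean validation accuracy over interleaved folds."""
    scores = []
    for f in range(folds):
        val = data[f::folds]
        train = [row for i, row in enumerate(data) if i % folds != f]
        hits = sum(knn_predict(train, x, **params) == y for x, y in val)
        scores.append(hits / len(val))
    return sum(scores) / folds


def crossover(a, b, rng, rate=0.5):
    """Each gene is taken from the second parent with probability `rate`."""
    return {name: (b[name] if rng.random() < rate else a[name]) for name in a}


def mutate(individual, rng, rate=0.1):
    """Each gene is replaced by a random value with probability `rate`."""
    return {
        name: (rng.choice(SEARCH_SPACE[name]) if rng.random() < rate else value)
        for name, value in individual.items()
    }


def genetic_search(data, pop_size=50, generations=5, elite=3, seed=0):
    rng = random.Random(seed)
    population = [
        {name: rng.choice(options) for name, options in SEARCH_SPACE.items()}
        for _ in range(pop_size)
    ]
    for _ in range(generations):
        ranked = sorted(population, key=lambda ind: cv_fitness(ind, data),
                        reverse=True)
        next_gen = [dict(ind) for ind in ranked[:elite]]  # copy the best 3
        parents = ranked[: pop_size // 2]                 # assumed selection rule
        while len(next_gen) < pop_size:
            a, b = rng.sample(parents, 2)
            next_gen.append(mutate(crossover(a, b, rng), rng))
        population = next_gen
    return max(population, key=lambda ind: cv_fitness(ind, data))


# Synthetic two-class data standing in for patient records.
data_rng = random.Random(1)
toy = [(data_rng.gauss(0, 1), 0) for _ in range(20)]
toy += [(data_rng.gauss(3, 1), 1) for _ in range(20)]
data_rng.shuffle(toy)
best = genetic_search(toy)
```

Copying the elites unchanged guarantees that the best cross-validated fitness never decreases from one generation to the next.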
The optimization module further comprises a training module, which is used for putting the training set into the base models that have been assigned the optimized hyper-parameters and training them again; the logistic regression model keeps its default parameters.
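The feature selection and data set division performed by the feature selection module can be sketched as follows. The importance scores, feature names and rows are hypothetical; in the described system the scores would come from the XGBoost model, with a standard score of 100 and a 30% test share.

```python
import random


def select_features(importance, threshold=100):
    """Rank features by importance and keep those scoring above the threshold."""
    ranked = sorted(importance.items(), key=lambda item: item[1], reverse=True)
    return [name for name, score in ranked if score > threshold]


def train_test_split_rows(rows, test_ratio=0.3, seed=0):
    """Shuffle the rows and hold out `test_ratio` of them as the test set."""
    rng = random.Random(seed)
    shuffled = rows[:]
    rng.shuffle(shuffled)
    n_test = round(len(shuffled) * test_ratio)
    return shuffled[n_test:], shuffled[:n_test]


# Hypothetical importance scores for four features.
importance = {"age": 310, "onset_hours": 240, "sex": 45, "lab_x": 120}
selected = select_features(importance)  # ['age', 'onset_hours', 'lab_x']

# Hypothetical processed records, reduced to the selected features.
rows = [
    {"age": a, "onset_hours": a % 7, "sex": a % 2, "lab_x": a / 10}
    for a in range(50, 70)
]
reduced = [{name: row[name] for name in selected} for row in rows]
train_rows, test_rows = train_test_split_rows(reduced)
```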
In the description herein, references to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., mean that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims (10)

1. A machine learning-based method for classifying survival risk of patients with cerebral infarction surgery, characterized in that: the method is implemented on a survival prediction system for perioperative cerebral infarction patients, and the survival prediction system comprises an acquisition module, a prediction module and an output module; the acquisition module is used for inputting cerebral infarction patient data; the prediction module is used for feeding the cerebral infarction patient data into the prediction model for survival prediction, and the output module is used for outputting prediction results; the prediction model comprises first-layer base models and a second-layer logistic regression model; the base models are divided into a first base model, a second base model and a third base model, wherein the first base model is a comprehensive random forest model, the second base model is an XGBoost model, and the third base model is an MLP model.
2. The machine learning-based method for classifying survival risk of patients with cerebral infarction surgery according to claim 1, wherein the method comprises the following steps: the system also comprises a data processing module which is used for carrying out data change processing on the data of the cerebral infarction patient.
3. The machine learning-based method for classifying survival risk of patients with cerebral infarction surgery according to claim 2, wherein the method comprises the following steps: the data processing module comprises a culling module, a cleaning module and a transformation module;
the culling module is used for removing features whose actual missing rate in the cerebral infarction patient data is greater than the standard missing rate;
the cleaning module is used for processing missing values of the data of the cerebral infarction patient;
the transformation module is used for carrying out feature coding processing and data normalization processing on the cerebral infarction patient data.
4. The machine learning-based method for classifying survival risk of patients with cerebral infarction surgery according to claim 3, wherein: the missing value processing specifically comprises filling the missing values by using the MissForest filling method.
5. The machine learning-based method for classifying survival risk of patients with cerebral infarction surgery according to claim 3, wherein: the feature encoding specifically comprises encoding the cerebral infarction patient data by one-hot encoding.
6. The machine learning-based method for classifying survival risk of patients with cerebral infarction surgery according to claim 3, wherein: the data normalization specifically processes the cerebral infarction patient data by a method combining standard deviation normalization and maximum value normalization.
7. The machine learning-based method for classifying survival risk of patients with cerebral infarction surgery according to any one of claims 1 to 6, wherein: the system further comprises an optimization module used for optimizing the hyper-parameters of the prediction model, the optimization module comprising a screening module, a feature selection module and an assignment module;
the screening module is used for screening the cerebral infarction patient data from a patient database;
the feature selection module is used for ranking the features of the processed cerebral infarction patient data by feature importance score and selecting the features whose importance scores are greater than a standard score to form a test set;
the assignment module is used for putting the test set into the base models for hyper-parameter optimization to obtain optimized base model hyper-parameters, and for assigning the optimized hyper-parameters back to the base models.
8. The machine learning-based method for classifying survival risk of patients with cerebral infarction surgery according to claim 7, wherein: the hyper-parameter optimization adopts a method combining a genetic algorithm and cross validation.
9. The machine learning-based method for classifying survival risk of patients with cerebral infarction surgery according to claim 7, wherein: the feature selection module is specifically used for ranking the features of the processed cerebral infarction patient data by feature importance score, selecting the features whose importance scores are greater than the standard score to form a data set, and dividing the data set into a test set and a training set;
the optimization module further comprises a training module, and the training module is used for putting the training set into the base models that have been assigned the optimized hyper-parameters for training.
10. The machine learning-based method for classifying survival risk of patients with cerebral infarction surgery according to claim 9, wherein: the hyper-parameter optimization processing adopts a method combining genetic algorithm and cross validation.
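
The data processing recited in claims 3, 5 and 6 can be sketched in a few lines of Python. This is a minimal illustration and not the implementation of the invention: the function names, the `None` missing-value marker, the sample records and the 0.5 threshold are hypothetical, and the way the two normalizations are combined (standard deviation normalization first, then division by the largest absolute value) is an assumption, since the claims do not specify it. The MissForest filling of claim 4 is not reproduced here.

```python
from statistics import mean, pstdev

MISSING = None  # hypothetical marker for a missing value


def drop_sparse_features(rows, max_missing_rate):
    """Rejection step: drop features whose missing rate exceeds the threshold."""
    kept = []
    for name in rows[0].keys():
        missing = sum(1 for row in rows if row[name] is MISSING)
        if missing / len(rows) <= max_missing_rate:
            kept.append(name)
    return [{name: row[name] for name in kept} for row in rows]


def one_hot(rows, feature):
    """Feature encoding step: replace one categorical feature with 0/1 columns.

    Assumes the feature has no missing values left at this point.
    """
    categories = sorted({row[feature] for row in rows})
    encoded = []
    for row in rows:
        new_row = {k: v for k, v in row.items() if k != feature}
        for category in categories:
            new_row[f"{feature}={category}"] = 1 if row[feature] == category else 0
        encoded.append(new_row)
    return encoded


def normalize(values):
    """Normalization step (assumed combination): z-score, then max-abs scaling."""
    mu, sigma = mean(values), pstdev(values)
    if sigma == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    z = [(v - mu) / sigma for v in values]
    peak = max(abs(v) for v in z)
    return [v / peak for v in z]


# Hypothetical patient records, for illustration only.
records = [
    {"age": 60, "sex": "F", "lab_x": None},
    {"age": 70, "sex": "M", "lab_x": None},
    {"age": 80, "sex": "F", "lab_x": 1.2},
]
processed = one_hot(drop_sparse_features(records, 0.5), "sex")
ages = normalize([row["age"] for row in processed])
for row, age in zip(processed, ages):
    row["age"] = age
print(processed)
```

In this sketch the feature `lab_x` is removed because two of its three values are missing, `sex` becomes two indicator columns, and `age` is rescaled into the range from -1 to 1.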

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210760511.8A CN115206527A (en) 2022-06-30 2022-06-30 Cerebral infarction surgery patient survival risk classification method based on machine learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210760511.8A CN115206527A (en) 2022-06-30 2022-06-30 Cerebral infarction surgery patient survival risk classification method based on machine learning

Publications (1)

Publication Number Publication Date
CN115206527A (en) 2022-10-18

Family

ID=83577271

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210760511.8A Pending CN115206527A (en) 2022-06-30 2022-06-30 Cerebral infarction surgery patient survival risk classification method based on machine learning

Country Status (1)

Country Link
CN (1) CN115206527A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117894481A (en) * 2024-03-15 2024-04-16 长春大学 Bayesian super-parameter optimization gradient lifting tree heart disease prediction method and device

Similar Documents

Publication Publication Date Title
CN112992346B (en) Method for establishing prediction model of severe spinal cord injury prognosis
CN111613289B (en) Individuation medicine dosage prediction method, device, electronic equipment and storage medium
Abedini et al. Classification of Pima Indian diabetes dataset using ensemble of decision tree, logistic regression and neural network
CN111105860A (en) Intelligent prediction, analysis and optimization system for accurate motion big data for chronic disease rehabilitation
CN110459328A (en) A kind of Clinical Decision Support Systems for assessing sudden cardiac arrest
Mishra et al. An improved and adaptive attribute selection technique to optimize dengue fever prediction
CN107368707A (en) Gene chip expression data analysis system and method based on US ELM
CN102682210A (en) Self-adaptive frog cluster evolutionary tree designing method used for electronic medical record attribute reduction
Hussein Improve the performance of K-means by using genetic algorithm for classification heart attack
CN109243620A (en) Drug effect optimization method and device based on therapeutic drug monitoring
CN115206527A (en) Cerebral infarction surgery patient survival risk classification method based on machine learning
CN112489769A (en) Intelligent traditional Chinese medicine diagnosis and medicine recommendation system for chronic diseases based on deep neural network
CN115985503B (en) Cancer prediction system based on ensemble learning
KR20210068713A (en) System for predicting disease progression using multiple medical data based on deep learning
Sarra et al. Enhanced accuracy for heart disease prediction using artificial neural network
Singh et al. Nature-inspired computing and machine learning based classification approach for glaucoma in retinal fundus images
Qiao et al. Log-sum enhanced sparse deep neural network
Liu et al. A hybrid attention-enhanced DenseNet neural network model based on improved U-Net for rice leaf disease identification
Patra et al. Multiobjective evolutionary algorithm based on decomposition for feature selection in medical diagnosis
Yenidogan et al. Multimodal machine learning for 30-days post-operative mortality prediction of elderly hip fracture patients
Patel Predicting a risk of diabetes at early stage using machine learning approach
CN111145901B (en) Deep venous thrombosis thrombolytic curative effect prediction method and system, storage medium and terminal
Zhi et al. BNCPL: brain-network-based convolutional prototype learning for discriminating depressive disorders
Devanath et al. Thalassemia Prediction using Machine Learning Approaches
Zhong et al. Gestational Diabetes Mellitus Prediction Based on Two Classification Algorithms

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination