CN116864130A - Method for predicting nursing treatment result of EC patient by using DBA-LSTM - Google Patents

Method for predicting nursing treatment result of EC patient by using DBA-LSTM

Info

Publication number
CN116864130A
Authority
CN
China
Prior art keywords
data
lstm
patient
model
dba
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310863872.XA
Other languages
Chinese (zh)
Inventor
孙悦
李智
杨帆
李欣阳
朱玉龙
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Sichuan University
Original Assignee
Sichuan University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Sichuan University
Priority to CN202310863872.XA
Publication of CN116864130A
Legal status: Pending

Classifications

    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/044 Recurrent networks, e.g. Hopfield networks
    • G06N3/0442 Recurrent networks, e.g. Hopfield networks characterised by memory or gating, e.g. long short-term memory [LSTM] or gated recurrent units [GRU]
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/0985 Hyperparameter optimisation; Meta-learning; Learning-to-learn
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H10/00 ICT specially adapted for the handling or processing of patient-related medical or healthcare data
    • G16H10/60 ICT specially adapted for the handling or processing of patient-related medical or healthcare data for patient-specific data, e.g. for electronic patient records
    • G PHYSICS
    • G16 INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16H HEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/70 ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for mining of medical data, e.g. analysing previous cases of other patients
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Public Health (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biophysics (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • Molecular Biology (AREA)
  • Medical Informatics (AREA)
  • General Engineering & Computer Science (AREA)
  • Epidemiology (AREA)
  • Primary Health Care (AREA)
  • Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Medical Treatment And Welfare Office Work (AREA)

Abstract

The invention discloses a method for predicting the nursing treatment result of an EC (endometrial cancer) patient by using DBA-LSTM. A machine learning model is developed on a collected clinical data set to predict the indexes of an EC nursing patient after three months of nursing treatment and to produce a trend chart of the treatment, so that the patient's indexes can be inspected visually and doctors are assisted in treatment. The DBA algorithm is used to address the shortage of original data, and the LSTM algorithm is used to build a parallel time series prediction model. The results show that the method outperforms other conventional time series prediction models and provides a practical strategy for monitoring and early warning of nursing patients. The proposed model gives doctors a platform for evaluating and predicting, more objectively and scientifically, the indexes, treatment schemes, curative-effect monitoring and risk early warning of fertility-preserving EC patients at different treatment stages, assisting clinicians and guiding EC patients toward decisions with the best benefit-risk ratio.

Description

Method for predicting nursing treatment result of EC patient by using DBA-LSTM
Technical Field
The invention belongs to the field of disease risk prediction and relates to a method for predicting the nursing treatment result of an EC patient by using DBA-LSTM. It is used in particular to predict the indexes of an endometrial cancer (EC) nursing patient after three months of nursing treatment and to evaluate whether the patient's disease is relieved after the treatment.
Background
Applications of artificial intelligence fall into two main categories. The first comprises machine learning techniques, which analyze structured data to cluster patient features and thereby predict the probability of disease outcomes. The second comprises natural language processing methods, which extract information from unstructured data (e.g., clinical notes and patient medical records) to supplement and enrich structured medical data. Natural language processing converts text into machine-readable structured data, which can then be analyzed by machine learning techniques. In deep learning, LSTM stands for Long Short-Term Memory, a variant of the recurrent neural network (RNN). LSTM networks were designed to address the vanishing and exploding gradient problems of traditional RNNs and thus to handle long-term dependencies better. When processing sequence data, an LSTM captures and retains long-term dependencies by means of structures called "gates", which selectively control the flow of information. By combining these gates, the LSTM can select or ignore parts of the input, remember or forget previous states, and output the relevant results. This lets LSTM networks handle long sequences well, and they perform well in natural language processing, speech recognition, machine translation and similar fields. An LSTM algorithm is therefore chosen to build a parallel prediction model and improve prediction accuracy on the time series data.
Dynamic time warping (DTW) was proposed by Itakura as a method for measuring the similarity of two time series of different lengths, where similarity is expressed as the distance between the series. When the series are complex, for example when they differ in length, the Euclidean distance is not accurate enough, whereas the DTW algorithm measures the distance between two such series more accurately. This is because DTW warps the series along the time axis so that they are better aligned, which yields a more accurate distance. The DBA (DTW barycenter averaging) algorithm builds on DTW. It is a global averaging algorithm that iteratively updates an average time series so that the sum of squared DTW distances between the average series and the other series in the set is finally minimized. It follows from this principle that DBA can compute an average time series for a time series data set; that is, averaging an original set of time series yields a new, synthetic time series. DBA has in fact already been applied to time series data augmentation: Fawaz et al., for example, used it to augment time series samples and obtain additional new samples. Following this idea, the DBA algorithm is used here to synthesize data from a small sample, addressing the shortage of original data on endometrial cancer nursing patients.
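To make the warping idea concrete, the following is a minimal sketch of a DTW distance in plain Python. It is an illustration rather than the patent's implementation: the function name `dtw_distance`, the squared-difference local cost and the final square root are assumptions chosen for the example.

```python
def dtw_distance(a, b):
    """Return a DTW distance between two numeric sequences.

    Assumption for this sketch: the local cost is the squared difference,
    and the result is the square root of the minimal accumulated cost.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] holds the minimal accumulated cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            # A step may advance in a, in b, or in both (the warping moves).
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m] ** 0.5


# Two series of different length that trace the same shape align at zero cost.
print(dtw_distance([1, 2, 3], [1, 2, 2, 3]))
```

Unlike the Euclidean distance, this function accepts sequences of different lengths, which is the property the paragraph above relies on.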
MICE (multiple imputation by chained equations) fills in (imputes) missing data in a data set through a series of iterative predictive models. In each iteration, each specified variable is estimated from the other variables in the data set, and the iterations continue until convergence. MICE is used here to handle missing values and ensure the completeness of the data.
In implementation, the constructed DBA-LSTM model has a training stage and an application stage. In the training stage, the LSTM algorithm learns from the training data to fit the model; in the application stage, the trained model is used to predict new data.
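The chained-equations idea can be sketched as follows, assuming NumPy is available. This is a simplified, deterministic single imputation with linear regression, not full MICE (which draws several imputed data sets with random variation); the function name `chained_imputation` and the fixed iteration count are assumptions made for the example.

```python
import numpy as np


def chained_imputation(X, n_iter=10):
    """Impute NaNs column by column, regressing each column on the others.

    Simplified sketch: one deterministic pass per iteration with ordinary
    least squares; every column is assumed to have at least one observed value.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    filled = X.copy()
    # Initial guess: replace each missing entry with its column mean.
    col_means = np.nanmean(X, axis=0)
    filled[missing] = np.take(col_means, np.where(missing)[1])
    for _ in range(n_iter):
        for j in range(X.shape[1]):
            if not missing[:, j].any():
                continue
            observed = ~missing[:, j]
            others = np.delete(filled, j, axis=1)
            # Design matrix: intercept plus all other (currently filled) columns.
            A = np.column_stack([np.ones(len(filled)), others])
            coef, *_ = np.linalg.lstsq(A[observed], filled[observed, j], rcond=None)
            # Overwrite only the entries that were originally missing.
            filled[~observed, j] = A[~observed] @ coef
    return filled


data = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0], [4.0, float("nan")]]
print(chained_imputation(data))
```

In the toy data the second column is exactly twice the first, so the missing entry is recovered from that linear relation; real clinical data would not be this clean, and the number of iterations would be chosen by checking convergence.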
Disclosure of Invention
The invention predicts the indexes of an endometrial cancer nursing patient after three months of nursing treatment in order to evaluate whether the patient's disease is relieved after the treatment, and to this end provides a method for predicting the nursing treatment result of an EC patient by using DBA-LSTM.
The invention is realized by the following technical scheme: 1) processing the basic conditions, high-risk factors, hospital laboratory and auxiliary examination data, diagnosis results and other data of endometrial cancer nursing patients; 2) imputing missing values and performing data enhancement; 3) constructing a model with LSTM; 4) training the model with the data to obtain the optimal hyperparameters; 5) checking the validity and accuracy of the model with test data.
Drawings
FIG. 1 is a flow chart of data synthesis based on DBA algorithm;
FIG. 2 is a basic block diagram of an LSTM memory cell;
FIG. 3 is a diagram of the DBA-LSTM model for predicting EC patient nursing treatment results;
FIG. 4 is a graph of true versus predicted patient data three months into nursing treatment for EC patients, based on DBA-LSTM;
FIG. 5 is a graph comparing the results of four prediction models.
Detailed Description
The invention is described in further detail below in connection with the following embodiments:
1. Data processing: The original data comprise the basic conditions, high-risk factors, hospital laboratory and auxiliary examination data, and diagnosis results of patients with endometrial cancer and atypical hyperplasia. Text medical records in the time-series patient data are structured by machine learning, so that unstructured data become structured data the model can read; other numerical information is extracted, and the resulting data are integrated into a data collection table. pandas is used to clean, prepare and tidy the data, and Matplotlib is used to draw line charts for exploratory analysis of the patient data, which clearly show how the variables change over time.
2. Missing value processing: Missing values are handled with MICE, which fills in (imputes) missing data in the data set through a series of iterative predictive models.
3. Data enhancement: The data set is augmented with the DBA algorithm, which synthesizes additional samples from the small original sample set. This addresses the shortage of original data on endometrial cancer nursing patients and improves the prediction accuracy of the model.
4. Model construction and parameter selection: A parallel prediction model is built with LSTM to predict the patient's indexes at the next stage of nursing treatment (three months later) and to plot the trend of the patient's indexes from the start of treatment. Long short-term memory (LSTM) is a special recurrent neural network that can learn long-term dependencies. Its main feature is that gating functions are integrated into its state dynamics, which mitigates the vanishing and exploding gradient problems when training on long sequences. The repeating module of an LSTM has a different structure from that of a plain RNN: it contains four interacting layers rather than a single layer. The model is trained with the data, and hyperparameters are selected by conventional manual search: randomly chosen hyperparameter sets are checked by hand through training runs, and the set that best meets the objective is selected.
5. Model test: The validity and accuracy of the model are checked with test data. K-fold cross-validation is used to evaluate model performance: the data are divided into K partitions of equal size, where K is generally in the range [2, 10] and is chosen according to the size of the data set.
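The K-fold partitioning used in the model test step can be sketched in plain Python as below. The function name `k_fold_indices` and the fixed random seed are assumptions for illustration; when the number of samples is not divisible by K, this sketch lets fold sizes differ by at most one.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train_indices, test_indices) pairs for K-fold cross-validation."""
    if not 2 <= k <= n_samples:
        raise ValueError("k must satisfy 2 <= k <= n_samples")
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Deal the shuffled indices into k folds, round-robin.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        # Training set: every fold except the one held out for testing.
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


for train, test in k_fold_indices(10, 5):
    print(len(train), len(test))
```

Each sample appears in exactly one test fold, so averaging the per-fold error gives a performance estimate that uses all of a small data set.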
Detailed description of the drawings:
FIG. 1 is the data synthesis flow chart of the DBA algorithm. $S = \{S_1, S_2, \ldots, S_m\}$ denotes the set of patient sample time series in the original training set, where $m$ is the number of samples and $S_i$ is the set of time series of the $i$-th sample, containing feature time series such as CA125, CA199 and CEA. $k$ different sets $S_i$ are selected in turn from $S$ without replacement to form a time series subset, so that $m/k$ subsets are formed in total. Each subset is then averaged with the DBA algorithm, feature by feature. Taking the subset $\{S_1, S_2, \ldots, S_k\}$ as an example, $\{S_{11}, S_{21}, \ldots, S_{k1}\}$ is the set of time series for the first feature and $\{S_{12}, S_{22}, \ldots, S_{k2}\}$ is the set for the second feature; applying DBA barycenter averaging to each gives the average time series $c_{11}$ and $c_{12}$, and the other feature series are averaged in the same way. For the subset $\{S_1, S_2, \ldots, S_k\}$, DBA barycenter averaging thus yields $r$ average time series, one per feature, written $C_1 = \{c_{11}, c_{12}, \ldots, c_{1r}\}$, which constitute the first new synthetic sample. For the whole data set $S$, $m/k$ new synthetic samples are obtained, written $C = \{C_1, C_2, \ldots, C_{m/k}\}$.
To obtain as many synthetic samples as possible, the above procedure is repeated $n$ times, giving $n \cdot m/k$ synthetic samples. Before each cycle, the samples in the original training set $S$ are randomly reordered so that the $m/k$ subsets formed in each cycle differ.
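The synthesis procedure of FIG. 1 can be sketched in plain Python as follows. This is a hedged illustration, not the patent's code: the names `dtw_path`, `dba` and `synthesize` are hypothetical, the DBA average is initialized with the first series of each subset, and a fixed number of refinement iterations replaces a convergence test. If m is not divisible by k, the leftover samples are skipped in that cycle.

```python
import random


def dtw_path(a, b):
    """Return the optimal DTW alignment of a and b as (i, j) index pairs."""
    n, m = len(a), len(b)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = (a[i - 1] - b[j - 1]) ** 2
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Backtrack from the end of both series to recover the alignment.
    path, i, j = [], n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    return path


def dba(series, n_iter=10):
    """Approximate the DBA average of a list of numeric sequences."""
    avg = list(series[0])  # assumption: start from the first sequence
    for _ in range(n_iter):
        buckets = [[] for _ in avg]
        for s in series:
            # Collect, for each point of the average, the points aligned to it.
            for i, j in dtw_path(avg, s):
                buckets[i].append(s[j])
        avg = [sum(b) / len(b) for b in buckets]
    return avg


def synthesize(samples, k, n_cycles, seed=0):
    """samples: list of m patients, each a list of r feature time series."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_cycles):
        order = samples[:]
        rng.shuffle(order)  # reorder before each cycle so subsets differ
        for start in range(0, len(order) - k + 1, k):
            subset = order[start:start + k]
            r = len(subset[0])
            # One DBA average per feature gives one synthetic sample C_i.
            synthetic.append([dba([s[f] for s in subset]) for f in range(r)])
    return synthetic
```

With m = 4 patients, k = 2 and n = 3 cycles, `synthesize` returns 3 · 4/2 = 6 synthetic samples, each with the same number of feature series as the originals.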
FIG. 2 is the basic block diagram of an LSTM memory cell. LSTM is an optimized variant of the RNN, a model that applies the results of earlier learning to the current learning step; it replaces each RNN neuron with a memory cell (memory block), a more complex structure than an RNN neuron. The LSTM contains special gate structures that, based on the current input and the hidden state, control how much historical information is forgotten and how much of the current input is admitted. This makes effective use of both historical and current information and largely solves the vanishing gradient problem of RNNs. The gates are the forget gate, the input gate and the output gate. In addition to the hidden state h, which represents short-term memory, a cell state C representing long-term memory is added, and both are passed along over time. Because of the cell state C, LSTM networks are better suited than plain RNNs to long time series.
FIG. 3 is a diagram of the DBA-LSTM model for predicting EC patient nursing treatment results. A parallel prediction model is built with LSTM, and both the original data (without data synthesis) and the synthesized data are fed into the model for training and testing, yielding the model's evaluation metrics before and after data enhancement.
FIG. 4 is a graph of true versus predicted patient data three months into nursing treatment for EC patients, based on DBA-LSTM: the data are enhanced with DBA, and the LSTM model predicts the result after three months of nursing treatment.
FIG. 5 compares the results of four prediction models. The constructed model is compared with three basic prediction models (RNN, CNN and GRU): the MAPE of each model on the same test set is calculated to produce the comparison chart.
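For reference, the gate structure described for FIG. 2 is commonly written as the standard LSTM update equations below. The symbols other than h and C are assumptions of this sketch, not notation from the patent: x_t is the current input, W and b are learned weights and biases, σ is the logistic sigmoid, and ⊙ is element-wise multiplication.

```latex
\begin{aligned}
f_t &= \sigma\left(W_f\,[h_{t-1}, x_t] + b_f\right) && \text{(forget gate)} \\
i_t &= \sigma\left(W_i\,[h_{t-1}, x_t] + b_i\right) && \text{(input gate)} \\
\tilde{C}_t &= \tanh\left(W_C\,[h_{t-1}, x_t] + b_C\right) && \text{(candidate cell state)} \\
C_t &= f_t \odot C_{t-1} + i_t \odot \tilde{C}_t && \text{(long-term memory)} \\
o_t &= \sigma\left(W_o\,[h_{t-1}, x_t] + b_o\right) && \text{(output gate)} \\
h_t &= o_t \odot \tanh\left(C_t\right) && \text{(short-term memory)}
\end{aligned}
```

The additive update of C_t is what lets information, and gradients, persist across many time steps.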
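The MAPE (mean absolute percentage error) used for the comparison in FIG. 5 can be computed as in the following sketch. The function name `mape` and the sample values are hypothetical and serve only to illustrate the formula; they are not results from the patent.

```python
def mape(y_true, y_pred):
    """Return the mean absolute percentage error, in percent."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    if any(t == 0 for t in y_true):
        raise ValueError("MAPE is undefined when a true value is zero")
    total = sum(abs((t - p) / t) for t, p in zip(y_true, y_pred))
    return 100.0 * total / len(y_true)


# Hypothetical true and predicted index values, for illustration only.
print(mape([100.0, 200.0], [110.0, 180.0]))
```

A lower MAPE on the same test set indicates a better fit, which is how the four models are ranked.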

Claims (5)

1. A method for predicting the nursing treatment result of an EC patient by using DBA-LSTM, characterized in that: time series data from four stages of patients undergoing conservative treatment, comprising basic conditions, high-risk factors, hospital laboratory and auxiliary examination data, and diagnosis results, are processed to obtain patient data; a model is built and trained; and the model's performance is finally evaluated, the method comprising the following steps:
step 1: preprocessing the time series data according to the research content;
step 2: imputing the missing values of the data and performing data enhancement on the data set;
step 3: constructing a network model based on recurrent neural network research;
step 4: optimizing the model parameters through training on the data, and checking the validity and accuracy of the model with test data.
2. The method for predicting the nursing treatment result of an EC patient by using DBA-LSTM as claimed in claim 1, wherein: in step 1, text medical records in the time-series patient data are structured and other numerical information is extracted; pandas is used to clean, prepare and tidy the data; and Matplotlib is used to draw line charts for exploratory analysis of the patient data, which clearly show how the variables change over time.
3. The method for predicting the nursing treatment result of an EC patient by using DBA-LSTM as claimed in claim 1, wherein: in step 2, missing values are imputed by multiple imputation by chained equations (MICE), and data enhancement is performed on the tabular data with the DBA algorithm.
4. The method for predicting the nursing treatment result of an EC patient by using DBA-LSTM as claimed in claim 1, wherein: in step 3, an LSTM parallel prediction model is constructed based on the LSTM algorithm in deep learning.
5. The method for predicting the nursing treatment result of an EC patient by using DBA-LSTM as claimed in claim 1, wherein: in step 4, the model is trained and optimized on the data to obtain its optimal parameters, and its validity and accuracy are checked by K-fold cross-validation.
CN202310863872.XA 2023-07-14 2023-07-14 Method for predicting nursing treatment result of EC patient by using DBA-LSTM Pending CN116864130A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310863872.XA CN116864130A (en) 2023-07-14 2023-07-14 Method for predicting nursing treatment result of EC patient by using DBA-LSTM

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310863872.XA CN116864130A (en) 2023-07-14 2023-07-14 Method for predicting nursing treatment result of EC patient by using DBA-LSTM

Publications (1)

Publication Number Publication Date
CN116864130A true CN116864130A (en) 2023-10-10

Family

ID=88221315

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310863872.XA Pending CN116864130A (en) 2023-07-14 2023-07-14 Method for predicting nursing treatment result of EC patient by using DBA-LSTM

Country Status (1)

Country Link
CN (1) CN116864130A (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination