CN114883003A - ICU (intensive care unit) hospitalization duration and death risk prediction method based on convolutional neural network


Info

Publication number: CN114883003A
Application number: CN202210645934.5A
Authority: CN (China)
Legal status: Pending
Inventors: 王建新, A.阿戴拉米, 邹梦洁, 匡湖林
Applicant/Assignee: Central South University
Original language: Chinese (zh)

Classifications

    • G16H 50/30: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; calculating health indices; individual health risk assessment
    • G16H 50/70: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; mining of medical data, e.g. analysing previous cases of other patients
    • G06F 18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/253: Pattern recognition; fusion techniques of extracted features
    • G06N 3/045: Neural networks; architectures; combinations of networks
    • G06N 3/049: Neural networks; temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N 3/08: Neural networks; learning methods
    • Y02A 90/10: Information and communication technologies supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention discloses a convolutional-neural-network-based method for predicting ICU (intensive care unit) length of stay and death risk. The method comprises: acquiring basic data, and processing and classifying it to obtain a training data set, a validation data set, and a test data set; constructing a basic prediction model from temporal dilated convolution modules with different receptive fields and context-aware feature fusion modules; setting a loss function, and training, validating, and testing the basic prediction model on the data sets to obtain an optimal prediction model; and using the optimal prediction model to predict the ICU length of stay and death risk of actual patients. The invention encodes each feature independently with a temporal dilated separable convolutional network and provides a context-aware feature fusion method; a multi-view, multi-scale feature fusion module generates the final inpatient representation used for prediction. The method therefore has high reliability, high accuracy, and good performance.

Description

ICU (intensive care unit) hospitalization duration and death risk prediction method based on convolutional neural network
Technical Field
The invention belongs to the field of data processing, and particularly relates to a convolutional-neural-network-based ICU (intensive care unit) length-of-stay and death risk prediction method.
Background
With economic and technological development and rising living standards, people pay ever more attention to medical resources. How to plan and allocate medical resources has become a research focus.
The number and management of intensive care unit (ICU) beds is one of the important factors reflecting medical resources. ICU length-of-stay and ICU mortality data also influence the planning and allocation of medical resources to some extent. Predicting ICU length of stay and ICU mortality has therefore become a new research hotspot.
Currently, research on predicting ICU length of stay and ICU mortality generally treats and models the problem as a regression problem optimized with mean squared error (MSE) or mean squared logarithmic error (MSLE). However, most existing methods do not perform well, because the length-of-stay distribution is positively skewed and the data contain missing values. Meanwhile, some researchers have proposed methods based on temporal convolutional networks (TCNs), but these methods neglect the interrelations between different clinical features when modeling patient electronic medical record data, which limits the prediction accuracy and effectiveness of the prior art.
Disclosure of Invention
The invention aims to provide a convolutional-neural-network-based ICU length-of-stay and death risk prediction method that is highly reliable, accurate, and effective.
The convolutional-neural-network-based ICU length-of-stay and death risk prediction method comprises the following steps:
S1, acquiring basic data from an existing electronic medical record database, and processing and classifying it to obtain a training data set, a validation data set, and a test data set;
S2, constructing a basic prediction model from temporal dilated convolution modules with different receptive fields and context-aware feature fusion modules;
S3, setting a loss function, and training, validating, and testing the basic prediction model constructed in step S2 with the data sets obtained in step S1 to obtain an optimal prediction model;
S4, using the optimal prediction model obtained in step S3 to predict ICU length of stay and death risk for actual patients.
Step S1, acquiring basic data from an existing electronic medical record database and processing and classifying it to obtain a training data set, a validation data set, and a test data set, specifically comprises the following steps:
for the ICU length-of-stay prediction task, acquiring the remaining length of stay of each inpatient for every hour of the ICU stay, and using only data from the first X days of the stay, where X is a set integer value;
for the death risk prediction task, acquiring only the data from the first 24 hours of the ICU stay;
acquiring the clinical time series, static demographic data, and diagnosis data of the inpatients;
the clinical time series comprises the patient's clinical variables over time and the decay indicators of the corresponding clinical variables; the clinical time series is denoted $x_1, x_2, \ldots, x_T \in \mathbb{R}^{F \times 2}$, where $F$ is the number of clinical features; the time series at each time step $t$ includes the actual clinical measurement $x'_t$ and the corresponding decay indicator $x''_t$, which records when the measurement $x'_t$ was taken;
the static demographic data comprise the time-invariant indicators of each inpatient and are denoted $s \in \mathbb{R}^{S \times 1}$, where $S$ is the number of static demographic features;
the diagnosis data comprise the diagnoses of the inpatients and their corresponding codes and are denoted $d \in \mathbb{R}^{D \times 1}$, where $D$ is the number of diagnosis codes.
Step S2, constructing a basic prediction model based on temporal dilated convolutions with different receptive fields and context-aware feature fusion modules, specifically comprises the following steps:
learning temporal features with a basic prediction model composed of N consecutive layers, each containing a temporal dilated separable convolution block with its own receptive field and a context-aware feature fusion module;
for the first layer of the basic prediction model:
the input $h_0$ of the temporal dilated separable convolution network is the concatenation of the original clinical time series $x'$ and the decay indicators $x''$; the inputs of the context-aware feature fusion network are the original clinical time series $x'$, the decay indicators $x''$, and the static demographic features $s'$ repeated $T$ times; the input $x_1$ of the skip connection is the original clinical time series $x'$;
for the $n$-th layer of the basic prediction model, $1 < n \le N$:
the temporal dilated separable convolution network learns an individual temporal trend $g_n$ from the variable $h_{n-1}$ output by layer $n-1$.
The temporal dilated separable convolution network uses stacked temporal convolution layers to extract temporal trends from the data; the temporal convolution layers use depthwise separable convolution, so weights are shared only across time steps. The operation of the temporal convolution is defined as:

$$g_{n,i,t} = f_{n,i} * h_{n,i,t} = \sum_{j=1}^{K} W_{n,i}^{(j)} \, h_{n,i,\, t-d(j-1)}$$

where $h_{n,i,t}$ is the temporal input formed by the $i$-th feature up to time $t$ in the $n$-th layer of the basic prediction model, each input feature having $C_n$ channels, i.e. $h_{n,i,t} \in \mathbb{R}^{C_n \times t}$; $f_{n,i}: \mathbb{R}^{C_{in} \times K} \to \mathbb{R}^{C_{out}}$ is the convolution filter of each feature, a tensor of size $C_{out} \times C_{in} \times K$; for every $K$ time steps the convolution filter maps the input channels $C_{in}$ to the output channels $C_{out}$; the output is $g_{n,i,t} \in \mathbb{R}^{C_{out}}$. The receptive field $d(K-1)+1$ of the convolution filter is determined by the kernel size $K$ and the dilation factor $d$.

For each temporal convolution layer, left padding of $d(K-1)$ is added to keep the output length equal to the input length; the term $t-d(j-1)$ ensures that only past time steps are seen during the convolution; stacking temporal convolution layers enlarges the temporal receptive field. Finally, the temporal convolution outputs of all input features are concatenated to obtain the temporal trend $g_n$:

$$g_n = \big\Vert_{i=1}^{F}\, g_{n,i}$$

where $\Vert$ is the concatenation operation; the output dimension of the temporal convolution layer is $R_n \times C$, where $R_n$ is the number of temporal features in the $n$-th layer. A batch normalization layer and a Dropout layer follow each stacked temporal convolution layer to speed up model convergence and prevent overfitting.
time sequence feature g learned by context-aware feature fusion network through n-1 layer time sequence holes by using convolution network n -1 Time sequence characteristic z learned by n-1 level context perception characteristic fusion network n-1 Fusing the original clinical time sequence x 'and the static demographic data characteristics s' repeated T times to obtain more comprehensive context expression z between the characteristics of the health condition of the inpatients n (ii) a And overlapping the output of each layer of the context-aware feature fusion network through splicing operation to obtain the nth layer of context awarenessMulti-scale features for feature fusion networks
Figure BDA0003684104610000043
Is composed of
Figure BDA0003684104610000044
Wherein
Figure BDA0003684104610000045
Is empty;
the n layer of the basic prediction model combines the original clinical time sequence x' and the multi-scale characteristics of the n-1 layer
Figure BDA0003684104610000046
The last channel of (a) gets x by a jump connection n (ii) a Then adopting splicing operation to make x n Time sequence trend g of time sequence convolution network layer output n Inter-feature context representation z fused with context-aware feature output from a network n Fusing to obtain time sequence characteristic v n Is v is n =[x n ,g n ,z n ];
And the nth layer of the basic prediction model also adopts a feature attention block based on a point-by-point convolution neural network to carry out the time sequence feature v n Processing to obtain more effective time sequence characteristic h n ;h n The output of the nth layer for the final base prediction model;
taking the time sequence characteristic h output by the Nth layer of the basic prediction model N As a final timing feature;
processing the diagnosis data through a full connection layer, and repeating the processed diagnosis data for T times to obtain a coded diagnosis characteristic d'; then repeating the static demographic data characteristic s 'T times, the coded diagnosis characteristic d' and the final time sequence characteristic h N And multi-scale features
Figure BDA0003684104610000051
Splicing, processing the spliced representation through a full connection layer to obtain a multi-view multi-scale fusion feature h final
Finally, fusing the multi-view and multi-scale features h final And sending the data into a full connection layer, and obtaining a prediction result through an activation function.
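To make the layer structure above concrete, the following PyTorch-style sketch outlines one possible forward pass. It is a minimal sketch under stated assumptions: TDSCBlock, CAFFBlock, PWAtt, and all tensor shapes (batch, channels, T) are hypothetical stand-ins for the patent's components, and the special casing of the layer-1 CAFF input is simplified.

```python
import torch
import torch.nn as nn

class TDSCCAFFModel(nn.Module):
    """High-level sketch of the N-layer TDSC + CAFF stack. TDSCBlock,
    CAFFBlock, PWAtt and the head are hypothetical modules, not real APIs."""

    def __init__(self, tdsc_blocks, caff_blocks, pwatt_blocks, head):
        super().__init__()
        self.tdsc = nn.ModuleList(tdsc_blocks)    # dilation grows layer by layer
        self.caff = nn.ModuleList(caff_blocks)    # context-aware feature fusion
        self.pwatt = nn.ModuleList(pwatt_blocks)  # pointwise feature attention
        self.head = head                          # final FC layer + activation

    def forward(self, x_meas, x_decay, s_rep, d_enc):
        h = torch.cat([x_meas, x_decay], dim=1)   # h_0 = [x'; x'']
        z, z_scales, x_skip = None, [], x_meas    # x_1 = x'
        for tdsc, caff, pwatt in zip(self.tdsc, self.caff, self.pwatt):
            g = tdsc(h)                           # per-feature temporal trend g_n
            z = caff(g, z, x_meas, s_rep)         # inter-feature context z_n
            z_scales.append(z)                    # multi-scale features z~_n
            v = torch.cat([x_skip, g, z], dim=1)  # v_n = [x_n, g_n, z_n]
            h = pwatt(v)                          # h_n = PWAtt(v_n)
            x_skip = torch.cat([x_meas, z[:, -1:]], dim=1)  # skip for layer n+1
        fused = torch.cat([s_rep, d_enc, h] + z_scales, dim=1)
        return self.head(fused)                   # prediction from h_final
```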
Through the stacked temporal convolution layers, the dilation factor of each successive layer increases by 1.
The context-aware feature fusion network fusing the temporal feature $g_{n-1}$ learned by the layer-$(n-1)$ temporal dilated separable convolution network, the temporal feature $z_{n-1}$ learned by the layer-$(n-1)$ context-aware feature fusion network, the original clinical time series $x'$, and the static demographic features $s'$ repeated $T$ times to obtain a more comprehensive inter-feature context representation $z_n$ of the inpatient's health condition specifically comprises:
combining the temporal feature $g_{n-1}$ learned by the layer-$(n-1)$ temporal dilated separable convolution network, the temporal feature $z_{n-1}$ learned by the layer-$(n-1)$ context-aware feature fusion network, the original clinical time series $x'$, and the static demographic features $s'$ repeated $T$ times by concatenation, and using a feature attention block based on pointwise convolution to capture the interrelations among the dynamic features and generate attention features; the attention features are then adjusted by a fully connected layer to obtain the more comprehensive inter-feature context representation $z_n$ of the inpatient's health condition. The context-aware feature fusion network of the $n$-th layer is computed as:

$$z_n = \big(\mathrm{PWAtt}(E(g_{n-1})) \,\Vert\, E(z_{n-1}) \,\Vert\, x' \,\Vert\, s'\big) W_n + b_n$$

where $E(\cdot)$ is a flattening function; $W_n$ and $b_n$ are the weights and bias of the fully connected layer; $\Vert$ is the concatenation operation; $\mathrm{PWAtt}(\cdot)$ is the operation of the feature attention block based on pointwise convolution; $P_n$ is the size of the first dimension of the weight matrix $W_n$, with $P_n = (R_n \times C) + Z + F + S$, where $Z$ is the dimension of the fused feature $z_{n-1}$ of the layer-$(n-1)$ context-aware feature fusion network, $F$ is the number of features of the original clinical time series $x'$, and $S$ is the number of static demographic features $s'$.
The feature attention block based on pointwise convolution is used to capture the interrelations among the dynamic features and generate attention features. It comprises 5 neural network layers: in order, a pointwise convolution layer with dimensionality-reduction ratio $r$, a batch normalization layer, a ReLU activation layer, a pointwise convolution layer with $C$ filters, and a Sigmoid activation layer.
Given an input feature $X \in \mathbb{R}^{C \times F \times T}$, where $C$ is the channel size, $F$ is the temporal feature dimension, and $T$ is the temporal feature length, the operation of the feature attention block based on pointwise convolution is:

$$X' = A(X) \otimes X$$

where $A(X)$ is the output attention weight map, $A(X) = \sigma(\mathrm{PWConv}_2(\delta(\beta(\mathrm{PWConv}_1(X)))))$; $\mathrm{PWConv}_1(\cdot)$ and $\mathrm{PWConv}_2(\cdot)$ are pointwise convolution layer operations; $\beta$ is the batch normalization operation; $\delta$ is the ReLU function; $\sigma$ is the Sigmoid activation function; $\otimes$ is element-wise multiplication.
Feeding the multi-view, multi-scale fused feature $h_{final}$ into a fully connected layer and obtaining the prediction result through an activation function specifically comprises:
for the ICU length-of-stay prediction task, an exponential activation function is applied, which avoids failing to predict the length of stay over its full dynamic range; a HardTanh activation function then clips predictions shorter than a first set duration or longer than a second set duration. The computation is:

$$\hat{y}_t = \tau\big(\exp(h_{final} W_{y_t} + b_{y_t})\big)$$

where $\hat{y}_t$ is the predicted ICU length of stay; $\exp(\cdot)$ is the exponential activation function; $\tau(\cdot)$ is the HardTanh activation function; $W_{y_t}$ and $b_{y_t}$ are parameters to be learned; $h_{final}$ is the multi-view, multi-scale fused feature;
for the death risk prediction task, a sigmoid activation function is used for prediction:

$$\hat{y} = \mathrm{sigmoid}(h_{final} W_y + b_y)$$

where $\hat{y}$ is the predicted death risk; $\mathrm{sigmoid}(\cdot)$ is the sigmoid activation function; $W_y$ and $b_y$ are parameters to be learned.
Setting the loss function in step S3 specifically comprises the following steps:
for the ICU length-of-stay prediction task, the loss function $L_t$ is set to the mean squared logarithmic error:

$$L_t = \frac{1}{T} \sum_{t=1}^{T} \big(\log(\hat{y}_t + 1) - \log(y_t + 1)\big)^2$$

where $T$ is the length of the clinical time series; $\hat{y}_t$ is the predicted ICU length of stay; $y_t$ is the actual ICU length of stay;
for the death risk prediction task, the loss function $L$ is set to the cross-entropy loss:

$$L = -\frac{1}{N} \sum_{i=1}^{N} \big(y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i)\big)$$

where $N$ is the number of samples; $\hat{y}_i$ is the predicted death risk; $y_i$ is the true death label of the sample.
The convolutional-neural-network-based ICU length-of-stay and death risk prediction method of the invention comprises consecutive temporal dilated separable convolution networks, context-aware feature fusion modules, and a multi-view, multi-scale feature fusion module. In each temporal dilated separable convolution network and context-aware feature fusion module, each feature is encoded independently by the temporal dilated separable convolution network, and a context-aware feature fusion method is provided that obtains a context-aware comprehensive feature representation from the original clinical time series, the static demographic data, and the outputs of the previous temporal dilated separable convolution network and context-aware feature fusion module. The multi-view, multi-scale feature fusion module fuses the captured multi-scale features and the features from different views, and generates the final inpatient representation used for prediction. The method therefore has high reliability, high accuracy, and good performance.
Drawings
FIG. 1 is a flow chart of the method of the invention.
FIG. 2 is a schematic diagram of the structure of the prediction model in the method of the invention.
FIG. 3 is a schematic diagram of the structure of the pointwise-convolution-based feature attention block in the method of the invention.
Detailed Description
FIG. 1 is a flow chart of the method of the invention. The convolutional-neural-network-based ICU length-of-stay and death risk prediction method comprises the following steps:
S1, acquiring basic data from an existing electronic medical record database, and processing and classifying it to obtain a training data set, a validation data set, and a test data set; specifically:
for the ICU length-of-stay prediction task, acquiring the remaining length of stay of each inpatient for every hour of the ICU stay, and using only data from the first X days of the stay, where X is a set integer value;
for the death risk prediction task, acquiring only the data from the first 24 hours of the ICU stay;
acquiring the clinical time series, static demographic data, and diagnosis data of the inpatients;
the clinical time series comprises the patient's clinical variables over time and the decay indicators of the corresponding clinical variables; the clinical time series is denoted $x_1, x_2, \ldots, x_T \in \mathbb{R}^{F \times 2}$, where $F$ is the number of clinical features; the time series at each time step $t$ includes the actual clinical measurement $x'_t$ and the corresponding decay indicator $x''_t$, which records when the measurement $x'_t$ was taken;
the static demographic data comprise the time-invariant indicators of each inpatient and are denoted $s \in \mathbb{R}^{S \times 1}$, where $S$ is the number of static demographic features;
the diagnosis data comprise the diagnoses of the inpatients and their corresponding codes and are denoted $d \in \mathbb{R}^{D \times 1}$, where $D$ is the number of diagnosis codes;
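As a concrete illustration of this data layout, the sketch below pairs each measurement channel with its decay indicator. The exact decay definition used here (hours since the value was last recorded, with forward filling) is an assumption chosen to make the example runnable, not the patent's reference implementation.

```python
import torch

def build_decay_indicators(values: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """values, mask: (F, T), mask[i, t] = 1 where feature i was recorded at hour t.
    Returns (F, 2, T): forward-filled measurements x' stacked with an assumed
    decay indicator x'' counting hours since the value was last recorded."""
    F, T = values.shape
    filled = values.clone()
    decay = torch.zeros(F, T)
    for t in range(1, T):
        missing = mask[:, t] == 0
        filled[missing, t] = filled[missing, t - 1]    # forward fill
        decay[missing, t] = decay[missing, t - 1] + 1  # hours since last record
    return torch.stack([filled, decay], dim=1)
```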
S2, constructing a basic prediction model (shown in FIG. 2) from temporal dilated separable convolution (TDSC network) modules with different receptive fields and context-aware feature fusion (CAFF network) modules; specifically:
learning temporal features with a basic prediction model composed of N consecutive layers, each containing a temporal dilated separable convolution block with its own receptive field and a context-aware feature fusion module;
for the first layer of the basic prediction model:
the input $h_0$ of the temporal dilated separable convolution network is the concatenation of the original clinical time series $x'$ and the decay indicators $x''$; the inputs of the context-aware feature fusion network are the original clinical time series $x'$, the decay indicators $x''$, and the static demographic features $s'$ repeated $T$ times; the input $x_1$ of the skip connection is the original clinical time series $x'$;
for the $n$-th layer of the basic prediction model, $1 < n \le N$:
the temporal dilated separable convolution (TDSC) network learns an individual temporal trend $g_n$ from the variable $h_{n-1}$ output by layer $n-1$.
The temporal dilated separable convolution network uses stacked temporal convolution layers to extract temporal trends from the data; specifically, the TDSC network uses stacked temporal convolutional network (TCN) layers to extract temporal trends from the electronic medical record data. TCNs are variants of CNNs that convolve over time, using causal convolution and dilated convolution to suit the modeling of time series data. They rest on two important principles: (1) the input and output must have the same length, and (2) the current data point depends only on data points from earlier times (data from the future cannot be used, preventing information leakage). Unlike the conventional TCN design, the temporal convolution layers in this method use depthwise separable convolution: no weights are shared between features, and weights are shared only across time steps. The operation of the temporal convolution is defined as:

$$g_{n,i,t} = f_{n,i} * h_{n,i,t} = \sum_{j=1}^{K} W_{n,i}^{(j)} \, h_{n,i,\, t-d(j-1)}$$

where $h_{n,i,t}$ is the temporal input formed by the $i$-th feature up to time $t$ in the $n$-th layer of the basic prediction model, each input feature having $C_n$ channels, i.e. $h_{n,i,t} \in \mathbb{R}^{C_n \times t}$; $f_{n,i}: \mathbb{R}^{C_{in} \times K} \to \mathbb{R}^{C_{out}}$ is the convolution filter of each feature, a tensor of size $C_{out} \times C_{in} \times K$; for every $K$ time steps the convolution filter maps the input channels $C_{in}$ to the output channels $C_{out}$; the output is $g_{n,i,t} \in \mathbb{R}^{C_{out}}$. The receptive field $d(K-1)+1$ of the convolution filter is determined by the kernel size $K$ and the dilation factor $d$.

For each temporal convolution layer, left padding of $d(K-1)$ is added to keep the output length equal to the input length; the term $t-d(j-1)$ ensures that only past time steps are seen during the convolution; stacking temporal convolution layers enlarges the temporal receptive field; specifically, the dilation factor of each successive layer increases by 1. Finally, the temporal convolution outputs of all input features are concatenated to obtain the temporal trend $g_n$:

$$g_n = \big\Vert_{i=1}^{F}\, g_{n,i}$$

where $\Vert$ is the concatenation operation; the output dimension of the temporal convolution layer is $R_n \times C$, where $R_n$ is the number of temporal features in the $n$-th layer. A batch normalization layer and a Dropout layer follow each stacked temporal convolution layer to speed up model convergence and prevent overfitting.
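A minimal PyTorch sketch of one such depthwise, dilated, causal temporal convolution layer follows. Mapping "weights shared only across time steps" to Conv1d with groups equal to the number of features, and the left padding of d(K-1), are my reading of the description above; this is a sketch, not the patent's reference code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthwiseCausalTCNLayer(nn.Module):
    """One stacked TCN layer of the TDSC network (sketch). groups=n_features
    gives each clinical feature its own filters, so weights are shared across
    time steps but not across features; left padding keeps the convolution
    causal and the output length equal to the input length."""

    def __init__(self, n_features: int, channels: int, kernel_size: int, dilation: int):
        super().__init__()
        self.left_pad = dilation * (kernel_size - 1)  # receptive field d(K-1)+1
        self.conv = nn.Conv1d(
            in_channels=n_features * channels,
            out_channels=n_features * channels,
            kernel_size=kernel_size,
            dilation=dilation,
            groups=n_features,                        # depthwise over features
        )
        self.bn = nn.BatchNorm1d(n_features * channels)
        self.drop = nn.Dropout(0.05)                  # Dropout value from Table 1 (eICU)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, n_features * channels, T) -> output of the same shape
        out = self.conv(F.pad(h, (self.left_pad, 0)))  # pad only the left (past)
        return self.drop(self.bn(out))
```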
time sequence feature g learned by context-aware feature fusion (CAFF) network from n-1 layer time sequence hole separable convolution network n-1 Time sequence characteristic z learned by n-1 level context perception characteristic fusion network n-1 Fusing the original clinical time sequence x 'and the static demographic data characteristics s' repeated T times to obtain more comprehensive context expression z between the characteristics of the health condition of the inpatients n (ii) a The method specifically comprises the following steps:
feature fusion, also known as combining features from multiple layers or branches, is commonly used in modern deep learning methods, typically using simple operations such as stitching or summing, to provide a linear aggregation of fixed fused features regardless of the correlation between features; in order to effectively combine time sequence characteristics with different receptive fields and consider the relationship among the characteristics, the method provides a CAFF network;
timing characteristic g learned by CAFF network through layer n-1 timing hole separable convolution network n-1 The n-1 th layer up and downTime sequence characteristic z learned by text perception characteristic fusion network n-1 Combining the original clinical time sequence x 'and the static demographic data characteristics s' repeated for T times through splicing operation, and capturing the mutual relation among the dynamic characteristics by adopting a characteristic attention block based on a point-by-point convolution neural network to generate attention characteristics; the attention profile is then adjusted through the full connectivity layer to obtain a more comprehensive inter-profile contextual representation of the health of the resident z n (ii) a The calculation formula of the n-th layer context-aware feature fusion network is as follows:
z n =(PWAtt(E(g n-1 ))||E(z n-1 )||x||s')*W n +b n
where E () is a flattening function, W n And b n Is the weight of the full connection layer, and
Figure BDA0003684104610000111
i is splicing operation; PWAtt () is an operation function corresponding to a feature attention block based on a point-by-point convolution neural network; p n Is a weight matrix W n Size of the first dimension, and P n =(R n xXC) + Z + F + S, Z is the fusion characteristic Z of the n-1 layer context sensing characteristic fusion network n-1 F is the characteristic quantity of the original clinical time series x ', and S is the characteristic quantity of the static demographic data characteristics S';
In a specific implementation, pointwise convolution, also called 1x1 convolution, is widely adopted in modern deep learning architectures to reduce the number of parameters, particularly in image processing. Since in a TCN the feature weights are shared only across time steps and no information is exchanged between different features, the method proposes the PWAtt block to capture the correlations between dynamic features and generate more effective features. The feature attention block based on pointwise convolution captures the interrelations among the dynamic features to generate attention features; it comprises 5 neural network layers (shown in FIG. 3): in order, a pointwise convolution layer with dimensionality-reduction ratio $r$, a batch normalization layer, a ReLU activation layer, a pointwise convolution layer with $C$ filters, and a Sigmoid activation layer.
Given an input feature $X \in \mathbb{R}^{C \times F \times T}$, where $C$ is the channel size, $F$ is the temporal feature dimension, and $T$ is the temporal feature length, the operation of the feature attention block based on pointwise convolution is:

$$X' = A(X) \otimes X$$

where $A(X)$ is the output attention weight map, $A(X) = \sigma(\mathrm{PWConv}_2(\delta(\beta(\mathrm{PWConv}_1(X)))))$; $\mathrm{PWConv}_1(\cdot)$ and $\mathrm{PWConv}_2(\cdot)$ are pointwise convolution layer operations; $\beta$ is the batch normalization operation; $\delta$ is the ReLU function; $\sigma$ is the Sigmoid activation function; $\otimes$ is element-wise multiplication;
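A minimal sketch of the PWAtt block follows, assuming Conv1d as the pointwise convolution and attention computed over the channel dimension; the reduction ratio r is an illustrative default.

```python
import torch
import torch.nn as nn

class PWAtt(nn.Module):
    """Pointwise-convolution feature attention block (sketch of the 5 layers):
    PWConv_1 (C -> C/r) -> BatchNorm -> ReLU -> PWConv_2 (C/r -> C) -> Sigmoid,
    followed by element-wise multiplication with the input, X' = A(X) (x) X."""

    def __init__(self, channels: int, r: int = 4):
        super().__init__()
        reduced = max(channels // r, 1)  # dimensionality-reduction ratio r
        self.attn = nn.Sequential(
            nn.Conv1d(channels, reduced, kernel_size=1),  # PWConv_1
            nn.BatchNorm1d(reduced),
            nn.ReLU(),
            nn.Conv1d(reduced, channels, kernel_size=1),  # PWConv_2, C filters
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, C, T); A(x) has the same shape and gates x element-wise
        return self.attn(x) * x
```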
The outputs of the context-aware feature fusion networks of all layers are then stacked by concatenation to obtain the multi-scale features $\tilde{z}_n$ of the $n$-th layer's context-aware feature fusion network:

$$\tilde{z}_n = [\tilde{z}_{n-1}, z_n]$$

where $\tilde{z}_0$ is empty;
the $n$-th layer of the basic prediction model combines the original clinical time series $x'$ with the last channel of the multi-scale features $\tilde{z}_{n-1}$ of layer $n-1$ through a skip connection to obtain $x_n$; a concatenation operation then fuses $x_n$, the temporal trend $g_n$ output by the temporal convolution layers, and the inter-feature context representation $z_n$ output by the context-aware feature fusion network into the temporal feature $v_n = [x_n, g_n, z_n]$. In a specific implementation, in order to handle infrequently sampled features (for example, a particular blood test performed only once a day), the method forward-fills the data and convolves it with TCN layers of increasing dilation factor until the required width is reached, extracting the temporal trend without losing temporal resolution. The actual training process is more challenging, however: if no useful temporal trend has been captured after a convolution, the earlier TCN layers with smaller dilation factors lose information of the original input data through re-weighting. To guarantee that the features learned by each TDSC layer extract useful trends and capture multi-scale information (with different receptive fields), the method uses a skip connection that concatenates the last channel of the multi-scale features $\tilde{z}_{n-1}$ in the layer-$(n-1)$ CAFF with the input feature $x'$ of the original clinical time series to obtain $x_n$; $x_n$ is then connected with the output $g_n$ of the TDSC network to obtain the adjusted feature-specific temporal representation $r_n = [g_n, x_n]$; finally, the inter-feature context fused feature $z_n$ and $r_n$ are concatenated to obtain the serial temporal feature $v_n = [r_n, z_n]$;
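The sketch below assembles one CAFF layer from the formula for $z_n$, reusing a PWAtt module as above; the flattening order, the argument shapes, and the output size Z are assumptions made only for illustration.

```python
import torch
import torch.nn as nn

class CAFFLayer(nn.Module):
    """Context-aware feature fusion layer (sketch):
    z_n = (PWAtt(E(g_{n-1})) || E(z_{n-1}) || x' || s') W_n + b_n.
    p_n is the total flattened size of the four inputs; z_dim is the assumed
    size Z of the output context representation z_n."""

    def __init__(self, pwatt: nn.Module, p_n: int, z_dim: int):
        super().__init__()
        self.pwatt = pwatt
        self.fc = nn.Linear(p_n, z_dim)  # weights W_n and bias b_n

    def forward(self, g_prev, z_prev, x_meas, s_rep):
        parts = [
            self.pwatt(g_prev).flatten(start_dim=1),  # PWAtt over g_{n-1}, then flattened
            z_prev.flatten(start_dim=1),              # E(z_{n-1})
            x_meas.flatten(start_dim=1),              # original time series x'
            s_rep.flatten(start_dim=1),               # s' repeated T times
        ]
        return self.fc(torch.cat(parts, dim=1))       # z_n
```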
The $n$-th layer of the basic prediction model further processes the temporal feature $v_n$ with the feature attention block based on pointwise convolution to obtain a more effective temporal feature $h_n$; $h_n$ is the final output of the $n$-th layer of the basic prediction model; in a specific implementation, $h_n = \mathrm{PWAtt}(v_n)$;
the temporal feature $h_N$ output by the $N$-th layer of the basic prediction model is taken as the final temporal feature;
the diagnosis data are processed by a fully connected layer and the result is repeated $T$ times to obtain the encoded diagnosis features $d'$; the static demographic features $s'$ repeated $T$ times, the encoded diagnosis features $d'$, the final temporal feature $h_N$, and the multi-scale features $\tilde{z}_N$ are then concatenated, and the concatenated representation is processed by a fully connected layer to obtain the multi-view, multi-scale fused feature $h_{final}$. In a specific implementation, the input raw static demographic data $s$ are encoded by repeating them $T$ times; the input raw diagnosis data $d$ are encoded by a fully connected layer and repeated $T$ times; the re-encoded demographic and diagnosis data are:

$$s' = [s_0, \ldots, s_t, \ldots, s_T]$$
$$d' = [\tilde{d}_0, \ldots, \tilde{d}_t, \ldots, \tilde{d}_T]$$

where $s'$ and $d'$ are the encoded static demographic and diagnosis features, respectively; $T$ is the length of the time series; $s_t$ equals the raw static demographic data $s$; $\tilde{d}_t$ is the output of the raw diagnosis data $d$ after encoding by the fully connected layer; $[\cdot]$ denotes the concatenation operation;
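A small sketch of this multi-view encoding and fusion step, assuming batch-first tensors and placeholder dimensions:

```python
import torch
import torch.nn as nn

def fuse_views(h_N, z_ms, s, d, diag_fc: nn.Linear, final_fc: nn.Linear, T: int):
    """Sketch: encode diagnoses with a fully connected layer, repeat both static
    views T times, concatenate with h_N and the multi-scale features z_ms
    (all of shape (batch, channels, T)), and project to h_final."""
    s_rep = s.unsqueeze(-1).expand(-1, -1, T)           # s' = s repeated T times
    d_rep = diag_fc(d).unsqueeze(-1).expand(-1, -1, T)  # d' = FC(d) repeated T times
    fused = torch.cat([s_rep, d_rep, h_N, z_ms], dim=1)
    # flatten per patient and project through the final fully connected layer
    return final_fc(fused.flatten(start_dim=1))         # h_final
```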
Finally, the multi-view, multi-scale fused feature $h_{final}$ is fed into a fully connected layer, and the prediction result is obtained through an activation function; specifically:
for the ICU length-of-stay prediction task, an exponential activation function is applied, which avoids failing to predict the length of stay over its full dynamic range; a HardTanh activation function then clips predictions shorter than a first set duration (preferably 30 min) or longer than a second set duration (preferably 100 days). The computation is:

$$\hat{y}_t = \tau\big(\exp(h_{final} W_{y_t} + b_{y_t})\big)$$

where $\hat{y}_t$ is the predicted ICU length of stay; $\exp(\cdot)$ is the exponential activation function; $\tau(\cdot)$ is the HardTanh activation function; $W_{y_t}$ and $b_{y_t}$ are parameters to be learned; $h_{final}$ is the multi-view, multi-scale fused feature;
for the death risk prediction task, a sigmoid activation function is used for prediction:

$$\hat{y} = \mathrm{sigmoid}(h_{final} W_y + b_y)$$

where $\hat{y}$ is the predicted death risk; $\mathrm{sigmoid}(\cdot)$ is the sigmoid activation function; $W_y$ and $b_y$ are parameters to be learned;
S3, setting a loss function, and training, validating, and testing the basic prediction model constructed in step S2 with the data sets obtained in step S1 to obtain an optimal prediction model; specifically:
for the ICU length-of-stay prediction task, the loss function $L_t$ is set to the mean squared logarithmic error:

$$L_t = \frac{1}{T} \sum_{t=1}^{T} \big(\log(\hat{y}_t + 1) - \log(y_t + 1)\big)^2$$

where $T$ is the length of the clinical time series; $\hat{y}_t$ is the predicted ICU length of stay; $y_t$ is the actual ICU length of stay;
for the death risk prediction task, the loss function $L$ is set to the cross-entropy loss:

$$L = -\frac{1}{N} \sum_{i=1}^{N} \big(y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i)\big)$$

where $N$ is the number of samples; $\hat{y}_i$ is the predicted death risk; $y_i$ is the true death label of the sample;
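Both losses map directly onto standard PyTorch operations; a minimal sketch, keeping the +1 shift inside the logarithms as in the MSLE formula above:

```python
import torch

def msle_loss(pred_los: torch.Tensor, true_los: torch.Tensor) -> torch.Tensor:
    """Mean squared logarithmic error over the hourly LoS predictions."""
    return torch.mean((torch.log(pred_los + 1) - torch.log(true_los + 1)) ** 2)

def mortality_loss(pred_risk: torch.Tensor, true_label: torch.Tensor) -> torch.Tensor:
    """Binary cross-entropy over N samples, as in the formula above."""
    return torch.nn.functional.binary_cross_entropy(pred_risk, true_label)
```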
S4, using the optimal prediction model obtained in step S3 to predict ICU length of stay and death risk for actual patients.
The method of the invention is further illustrated below with reference to an example:
The method is implemented on the PyTorch deep learning framework, with model training GPU-accelerated on an NVIDIA RTX 2080Ti graphics card. The finally selected hyper-parameter values of the model are shown in Table 1:
Table 1. Hyper-parameter values of the method on the two data sets
Hyper-parameter eICU MIMIC-IV
Number of channels 12 11
Dropout size (TDSC network) 0.05 0.05
Dropout size (Main model) 0.45 0
Convolution kernel size 4 5
Number of layers of TDSC-CAFF 11 8
Diagnostic data embedding dimension 64 -
Output size of last full connection layer 32 32
Training batch size 32 8
Learning rate 0.00226 0.00221
In the LoS prediction task, the method uses the following evaluation metrics: Cohen's Kappa score, coefficient of determination (R²), mean squared logarithmic error (MSLE), root mean squared logarithmic error (RMSLE), mean squared error (MSE), root mean squared error (RMSE), mean absolute percentage error (MAPE), and mean absolute deviation (MAD). Except for the Cohen's Kappa score and R², lower values indicate better performance of the method.
In the death risk prediction task, the method uses accuracy, area under the precision-recall curve (AUPRC), area under the receiver operating characteristic curve (AUROC), and F1-score as model performance metrics. For all four metrics, higher values indicate better prediction performance of the method.
To evaluate the effectiveness of the proposed ICU length-of-stay and death risk prediction model, the method is compared against the following baselines:
Mean and Median: the mean (3.47 days) and median (1.67 days) of LoS in the eICU training set, and the mean (5.70 days) and median (2.70 days) of LoS in the MIMIC-IV training set, are computed as baselines indicating a reasonable performance level and as reference points for the expected performance on each data set.
APACHE-IV: a widely used clinical prediction technique that produces one value (LoS or death risk) every 24 hours for each ICU patient. Note that this method is used only on the eICU data set.
LSTM: a variant of the standard LSTM model that uses only the time series data as input.
Multi-Channel LSTM (MC-LSTM): a set of separate LSTMs that each process one clinical time series feature; the outputs of the individual LSTMs are concatenated to produce the final output.
Transformer: based on a multi-head self-attention mechanism, similar to the method of the invention, but it neither enlarges the receptive field nor processes features independently.
ConCare: uses multi-channel GRUs, extracts the trends of clinical features with a time-aware attention mechanism, and extracts inter-feature context information with a multi-head attention mechanism. Since it can only process the entire 24-hour prediction window at once, the invention uses it as a baseline for the death risk prediction task only.
TPC: processes each feature separately with a temporal convolutional network and uses pointwise convolution to fuse features and extract their interrelations. Unlike the method of the invention, however, it cannot extract context information between different features.
Note that among the above baselines, only TPC uses the multi-view features as input. For a fair performance comparison, when implementing the other baselines, the time series features obtained with each baseline are fused with the static demographic and diagnosis features.
The method and the baselines are evaluated on the same test sets. Performance on the LoS prediction task is verified first; the experimental results are shown in Tables 2 and 3, where, except for the first three baselines, results are given as mean ± standard deviation over 10 runs.
Table 2. Comparison of prediction performance on the LoS prediction task on the eICU data set
Method MAD MSE RMSE MAPE MSLE RMSLE R2 Kappa
Mean 3.58 35.1 5.92 418.3 2.92 1.71 -7.04 0.00
Median 3.10 39.1 6.25 194.8 2.19 1.48 -0.11 0.00
APACHE-IV 2.54 16.3 4.04 182.6 1.10 1.05 -0.01 0.21
LSTM 2.66±0.01 31.7±0.2 5.63±0.02 129.2±2.3 1.48±0.01 1.21±0.00 0.10±0.01 0.31±0.01
MC-LSTM 2.65±0.01 31.3±0.3 5.59±0.03 130.4±1.7 1.46±0.01 1.20±0.00 0.11±0.01 0.33±0.01
Transformer 2.58±0.01 30.6±0.3 5.53±0.03 120.0±1.9 1.41±0.00 1.19±0.00 0.13±0.01 0.34±0.01
TPC 1.50±0.04 18.4±0.8 4.30±0.09 36.7±1.0 0.32±0.01 0.57±0.01 0.48±0.02 0.77±0.02
The invention 0.85±0.03 11.6±0.5 3.40±0.07 14.4±1.2 0.07±0.01 0.26±0.01 0.66±0.01 0.92±0.01
Table 3. Comparison of prediction performance on the LoS prediction task on the MIMIC-IV data set
Method MAD MSE RMSE MAPE MSLE RMSLE R2 Kappa
Mean 4.55 52.2 7.23 445.1 2.77 1.66 0.00 0.00
Median 3.97 59.4 7.71 198.8 2.08 1.44 -0.14 0.00
LSTM 2.99±0.03 37.4±0.9 6.61±0.07 104.7±2.4 1.09±0.01 1.04±0.01 0.28±0.02 0.51±0.01
MC-LSTM 2.97±0.03 38.5±1.0 6.20±0.08 98.0±1.1 1.05±0.01 1.02±0.00 0.26±0.02 0.50±0.01
Transformer 2.94±0.02 38.2±0.9 6.18±0.07 97.7±1.8 1.05±0.01 1.02±0.00 0.27±0.02 0.49±0.01
TPC 1.60±0.10 24.3±3.5 4.90±0.3 24.4±1.9 0.14±0.01 0.38±0.01 0.54±0.05 0.88±0.01
The invention 1.26±0.03 18.6±0.6 4.30±0.1 15.3±0.7 0.08±0.00 0.23±0.01 0.64±0.01 0.92±0.00
As Tables 2 and 3 show, the TDSC-CAFF model proposed by the invention achieves the best performance of all compared methods on both the eICU and MIMIC-IV data sets, with an MSE of 11.6, an MSLE of 0.07, an R² of 0.66, and a Kappa score of 0.92 on the eICU data set, and an LoS-prediction MSLE of 0.08, an R² of 0.64, and a Kappa score of 0.92 on the MIMIC-IV data set. These results show that the proposed method outperforms the baselines on the LoS prediction task.
In the LoS prediction task, the two RNN-based baselines, LSTM and MC-LSTM, do not encode long LoS sequences well, giving worse performance than the TCN-based models (TPC and the invention's TDSC-CAFF) and confirming that TCNs perform well for LoS prediction. LSTM and Transformer process the time series features as a whole; the comparison with TPC and the proposed TDSC-CAFF, which process each time series feature separately, shows the importance of processing time series features individually. Furthermore, TPC performs worse than the proposed method because it cannot effectively capture inter-feature context information, underscoring the importance of inter-feature context and of multi-view, multi-scale feature fusion for the LoS prediction task.
The invention's performance on the death risk prediction task is then verified; the experimental results are shown in Tables 4 and 5, given as mean ± standard deviation over 10 runs.
Table 4. Comparison of prediction performance on the death risk prediction task on the eICU data set
Method Accuracy AUROC AUPRC F1
LSTM 0.899±0.003 0.838±0.004 0.361±0.015 0.632±0.010
MC-LSTM 0.907±0.001 0.855±0.002 0.440±0.008 0.606±0.027
Transformer 0.907±0.001 0.849±0.003 0.428±0.007 0.591±0.017
ConCare 0.928±0.001 0.905±0.001 0.615±0.002 0.707±0.008
TPC 0.912±0.003 0.864±0.002 0.498±0.009 0.617±0.016
The invention 0.947±0.003 0.909±0.002 0.735±0.008 0.806±0.006
Table 5. Comparison of prediction performance on the death risk prediction task on the MIMIC-IV data set
Method Accuracy AUROC AUPRC F1
LSTM 0.911±0.001 0.896±0.002 0.639±0.005 0.745±0.005
MC-LSTM 0.911±0.001 0.899±0.002 0.634±0.008 0.751±0.008
Transformer 0.912±0.002 0.895±0.001 0.630±0.007 0.738±0.004
ConCare 0.927±0.001 0.922±0.001 0.726±0.002 0.778±0.004
TPC 0.912±0.003 0.898±0.003 0.671±0.008 0.779±0.005
The invention 0.934±0.003 0.926±0.002 0.744±0.006 0.821±0.004
Tables 4 and 5 compare the method against 5 other baselines on the eICU and MIMIC-IV data sets, respectively. The proposed TDSC-CAFF achieves an AUROC of 0.909, an AUPRC of 0.735, and an F1 of 0.806 on the eICU data set, and an AUROC of 0.926, an AUPRC of 0.744, and an F1 of 0.821 on the MIMIC-IV data set, outperforming all baselines, including those designed specifically for death risk prediction.
In the death risk prediction task, the baseline ConCare achieves an AUROC of 0.905, an AUPRC of 0.615, and an F1 of 0.707 on the eICU data set, and an AUROC of 0.922, an AUPRC of 0.726, and an F1 of 0.778 on the MIMIC-IV data set, which is better than the other baselines. Like the proposed TDSC-CAFF, ConCare also encodes each feature separately, using multi-channel GRUs, and captures inter-feature context with a multi-head attention mechanism. These comparisons indicate that encoding time series features separately and capturing inter-feature context information are effective for death risk prediction.
In conclusion, the experimental results on the eICU and MIMIC-IV data sets show that the proposed method can learn independent encodings of the clinical time series features as well as multi-view, multi-scale features, and can accurately predict the length of stay and death risk of ICU patients.

Claims (8)

1. A convolutional-neural-network-based ICU length-of-stay and death risk prediction method, comprising the following steps:
S1, acquiring basic data from an existing electronic medical record database, and processing and classifying it to obtain a training data set, a validation data set, and a test data set;
S2, constructing a basic prediction model from temporal dilated convolution modules with different receptive fields and context-aware feature fusion modules;
S3, setting a loss function, and training, validating, and testing the basic prediction model constructed in step S2 with the data sets obtained in step S1 to obtain an optimal prediction model;
S4, using the optimal prediction model obtained in step S3 to predict ICU length of stay and death risk for actual patients.
2. The convolutional-neural-network-based ICU length-of-stay and death risk prediction method according to claim 1, wherein the step S1 of acquiring basic data from an existing electronic medical record database and processing and classifying it to obtain a training data set, a validation data set, and a test data set specifically comprises the following steps:
for the ICU length-of-stay prediction task, acquiring the remaining length of stay of each inpatient for every hour of the ICU stay, and using only data from the first X days of the stay, where X is a set integer value;
for the death risk prediction task, acquiring only the data from the first 24 hours of the ICU stay;
acquiring the clinical time series, static demographic data, and diagnosis data of the inpatients;
the clinical time series comprising the patient's clinical variables over time and the decay indicators of the corresponding clinical variables; the clinical time series being denoted $x_1, x_2, \ldots, x_T \in \mathbb{R}^{F \times 2}$, where $F$ is the number of clinical features; the time series at each time step $t$ including the actual clinical measurement $x'_t$ and the corresponding decay indicator $x''_t$, which records when the measurement $x'_t$ was taken;
the static demographic data comprising the time-invariant indicators of each inpatient and being denoted $s \in \mathbb{R}^{S \times 1}$, where $S$ is the number of static demographic features;
the diagnosis data comprising the diagnoses of the inpatients and their corresponding codes and being denoted $d \in \mathbb{R}^{D \times 1}$, where $D$ is the number of diagnosis codes.
3. The ICU stay in hospital and mortality risk prediction method according to claim 2, wherein the step S2 of constructing a basic prediction model based on the time-series hole convolution-able and context-aware feature fusion modules with different receptive fields specifically comprises the following steps:
the method comprises the following steps of learning time sequence characteristics by adopting a basic prediction model formed by N layers of continuous time sequence cavity reelable blocks with different receptive fields and a context perception characteristic fusion module;
for the first layer of the base prediction model:
input h of time sequence cavity separable convolution network 0 Splicing an original clinical time sequence x 'and an attenuation index x'; the input of the context-aware feature fusion network is an original clinical time sequence x ', an attenuation index x ' and a static demographic data feature s ' repeated T times;input x of a jump connection 1 Is the original clinical time sequence x';
for the nth layer of the basic prediction model, N is more than 1 and less than or equal to N:
variable h output by time sequence hole separable convolution network to n-1 layer n-1 Learning individual time series trends g, respectively n
The time sequence hole separable convolutional network adopts a stacked time sequence convolutional network to extract a time sequence trend from data; the time sequence convolution network layer adopts depth separable convolution, and the weight is only shared between time steps; the operation of the time-series convolutional network is defined as:
Figure FDA0003684104600000021
wherein h is n,i Time series input characteristics formed by ith characteristics in nth layer of basic prediction model and up to t th time, wherein each input characteristic contains C n A channel, and
Figure FDA0003684104600000031
f n,i :
Figure FDA0003684104600000032
convolution filter for each feature, representing size C out ×C in A tensor of x k; for every k time steps, the convolution filter will input channel C in Mapping to output channel C out (ii) a Output of
Figure FDA0003684104600000033
The receptive field (d (K-1) +1) of the convolution filter is determined by the convolution kernel size K and the void factor d;
for each time-sequential convolutional network layer, padding is added on the left side of (d (K-1) +1) to ensure that the output size is consistent with the input size; the t-d (j-1) term is used to ensure that only past time steps are reviewed in the course of the convolution; increasing the size of a time sequence receptive field through the stacked time sequence convolution network layers; finally, each input character is inputThe outputs of the characterized time sequence convolutions are spliced together to obtain a time sequence trend g n Comprises the following steps:
Figure FDA0003684104600000034
wherein | | | is the splicing operation; the output dimension of the time sequence convolution network layer is R n ×C,R n The number of timing features for the nth layer; a batch normalization layer and a Dropout layer are used behind each stacked time sequence convolution network layer for accelerating model convergence and preventing overfitting;
time sequence feature g learned by n-1 layer time sequence hole by context-aware feature fusion network through separable convolution network n-1 Time sequence characteristic z learned by n-1 level context perception characteristic fusion network n-1 Fusing the original clinical time sequence x 'and the static demographic data characteristics s' repeated T times to obtain more comprehensive context expression z between the characteristics of the health condition of the inpatients n (ii) a And overlapping the output of each layer of the context-aware feature fusion network through splicing operation to obtain the multi-scale features of the n-th layer of the context-aware feature fusion network
Figure FDA0003684104600000035
Is composed of
Figure FDA0003684104600000036
Wherein
Figure FDA0003684104600000037
Is empty;
The n-th layer of the basic prediction model combines the original clinical time series x' with the last channel of the (n-1)-th layer multi-scale feature $z_{n-1}^{ms}$ through a skip connection to obtain $x_n$; concatenation then fuses $x_n$, the time-series trend $g_n$ output by the TCN layer, and the inter-feature context representation $z_n$ output by the context-aware feature fusion network into the time-series feature $v_n = [x_n, g_n, z_n]$;

The n-th layer of the basic prediction model further applies a feature attention block based on a pointwise convolutional neural network to the time-series feature $v_n$, obtaining a more effective time-series feature $h_n$; $h_n$ is the final output of the n-th layer;

The time-series feature $h_N$ output by the N-th layer of the basic prediction model is taken as the final time-series feature;
The diagnosis data are processed through a fully connected layer and the result is repeated T times to obtain the encoded diagnosis feature d'; the static demographic features s' repeated T times, the encoded diagnosis feature d', the final time-series feature $h_N$, and the multi-scale feature $z_N^{ms}$ are then concatenated, and the concatenated representation is processed through a fully connected layer to obtain the multi-view multi-scale fusion feature $h_{final}$;

Finally, the multi-view multi-scale fusion feature $h_{final}$ is fed into a fully connected layer, and the prediction result is obtained through an activation function.
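Read end to end, the n-th layer above maps $h_{n-1}$ to $h_n$ through the three sub-networks. A hedged sketch of that forward pass follows; the tensor layout and the skip-connection combination are assumptions, and tcn, fusion, and attention stand for the sub-networks defined in the claims:

```python
import torch

def nth_layer_forward(x_prime, s_prime, h_prev, g_prev, z_prev, ms_prev,
                      tcn, fusion, attention):
    """One layer of the basic prediction model (sketch).
    Tensors are (batch, channels, T); tcn/fusion/attention are assumed modules."""
    g_n = tcn(h_prev)                               # trend learned from h_{n-1}
    z_n = fusion(g_prev, z_prev, x_prime, s_prime)  # inter-feature context
    # skip connection: x' combined with the last channel of z_{n-1}^ms
    # (combination by concatenation is an assumption)
    x_n = torch.cat([x_prime, ms_prev[:, -1:, :]], dim=1)
    v_n = torch.cat([x_n, g_n, z_n], dim=1)         # v_n = [x_n, g_n, z_n]
    h_n = attention(v_n)                            # pointwise-conv attention
    ms_n = torch.cat([ms_prev, z_n], dim=1)         # z_n^ms = z_{n-1}^ms || z_n
    return h_n, g_n, z_n, ms_n
```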
4. The convolutional neural network-based ICU length-of-stay and death risk prediction method of claim 3, characterized in that the dilation factor is increased through the stacked temporal convolutional network layers, specifically by 1 at each layer.
5. The convolutional neural network-based ICU length-of-stay and death risk prediction method of claim 4, wherein the context-aware feature fusion network fuses the time-series feature $g_{n-1}$ learned by the (n-1)-th layer temporal dilated separable convolutional network, the time-series feature $z_{n-1}$ learned by the (n-1)-th layer context-aware feature fusion network, the original clinical time series x', and the static demographic features s' repeated T times into a more comprehensive inter-feature context representation $z_n$ of the inpatient's health status, specifically comprising the following steps:

The time-series feature $g_{n-1}$ learned by the (n-1)-th layer temporal dilated separable convolutional network, the time-series feature $z_{n-1}$ learned by the (n-1)-th layer context-aware feature fusion network, the original clinical time series x', and the static demographic features s' repeated T times are combined by concatenation, and a feature attention block based on a pointwise convolutional neural network captures the interrelations among the dynamic features to generate attention features; the attention features are then adjusted through a fully connected layer to obtain the more comprehensive inter-feature context representation $z_n$ of the inpatient's health status; the n-th layer context-aware feature fusion network is computed as:

$$z_n = \big(\mathrm{PWAtt}(E(g_{n-1})) \,\|\, E(z_{n-1}) \,\|\, x' \,\|\, s'\big)\, W_n + b_n$$

where E() is a flattening function; $W_n$ and $b_n$ are the weight and bias of the fully connected layer; $\|$ is the concatenation operation; PWAtt() is the operation of the feature attention block based on the pointwise convolutional neural network; $P_n$ is the size of the first dimension of the weight matrix $W_n$, with $P_n = (R_n \times C) + Z + F + S$, where Z is the feature count of the fused feature $z_{n-1}$ of the (n-1)-th layer context-aware feature fusion network, F is the feature count of the original clinical time series x', and S is the feature count of the static demographic features s'.
6. The convolutional neural network-based ICU length-of-stay and death risk prediction method of claim 5, wherein the feature attention block based on the pointwise convolutional neural network captures the correlations among dynamic features to generate attention features; the block comprises 5 neural network layers, in order: a pointwise convolutional layer with dimensionality-reduction ratio r, a batch normalization layer, a ReLU activation layer, a pointwise convolutional layer with C filters, and a Sigmoid activation layer;

The input feature X satisfies $X \in \mathbb{R}^{F \times T}$, where C is the channel size, F is the dimension of the time-series feature, and T is the length of the time-series feature; the operation of the feature attention block based on the pointwise convolutional neural network is then:

$$X' = A(X) \otimes X$$

where A(X) is the output attention weight map, $A(X) = \sigma(\mathrm{PWConv}_2(\delta(\beta(\mathrm{PWConv}_1(X)))))$; $\mathrm{PWConv}_1()$ and $\mathrm{PWConv}_2()$ are pointwise convolution operations; β is the batch normalization operation; δ is the ReLU function; σ is the Sigmoid activation function; and ⊗ is element-wise multiplication.
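The five-layer block enumerated in claim 6 translates almost line for line into PyTorch; a sketch with assumed channel count and reduction ratio:

```python
import torch
import torch.nn as nn

class PointwiseAttention(nn.Module):
    """Sketch of the claim-6 feature attention block: pointwise conv with
    reduction ratio r -> BatchNorm -> ReLU -> pointwise conv back to C
    filters -> Sigmoid, then element-wise multiplication with the input."""
    def __init__(self, channels, r=4):
        super().__init__()
        self.attend = nn.Sequential(
            nn.Conv1d(channels, channels // r, kernel_size=1),  # PWConv_1
            nn.BatchNorm1d(channels // r),                      # beta
            nn.ReLU(),                                          # delta
            nn.Conv1d(channels // r, channels, kernel_size=1),  # PWConv_2
            nn.Sigmoid(),                                       # sigma
        )

    def forward(self, x):
        # x: (batch, C, T); returns X' = A(X) ⊗ X
        return self.attend(x) * x

# toy usage
att = PointwiseAttention(channels=32, r=4)
y = att(torch.randn(2, 32, 48))  # same shape as the input
```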
7. The convolutional neural network-based ICU length-of-stay and death risk prediction method of claim 6, wherein feeding the multi-view multi-scale fusion feature $h_{final}$ into a fully connected layer and obtaining the prediction result through an activation function specifically comprises:

For the ICU length-of-stay prediction task, the output is processed with an exponential activation function so that lengths of stay can be predicted over their full dynamic range; a HardTanh activation function then clips predictions shorter than a first set duration or longer than a second set duration; the computation is:

$$\hat{y}_t = \tau\big(\exp(h_{final}\, W_{yt} + b_{yt})\big)$$

where $\hat{y}_t$ is the predicted ICU length of stay; exp() is the exponential activation function; τ() is the HardTanh activation function; $W_{yt}$ and $b_{yt}$ are parameters to be learned; and $h_{final}$ is the multi-view multi-scale fusion feature;
For the death risk prediction task, a sigmoid activation function is used for prediction:

$$\hat{y} = \mathrm{sigmoid}(h_{final}\, W_y + b_y)$$

where $\hat{y}$ is the predicted death risk; sigmoid() is the sigmoid activation function; and $W_y$ and $b_y$ are parameters to be learned.
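A hedged PyTorch sketch of both output heads as claim 7 describes them; the clipping bounds standing in for the first and second set durations are assumed values:

```python
import torch
import torch.nn as nn

class PredictionHeads(nn.Module):
    """Sketch of the claim-7 heads. los_min/los_max (in days) are assumed
    stand-ins for the first and second set durations."""
    def __init__(self, hidden_dim, los_min=1 / 48, los_max=100.0):
        super().__init__()
        self.fc_los = nn.Linear(hidden_dim, 1)    # W_yt, b_yt
        self.fc_death = nn.Linear(hidden_dim, 1)  # W_y, b_y
        self.clip = nn.Hardtanh(min_val=los_min, max_val=los_max)  # tau()

    def forward(self, h_final):
        # exponential activation covers the full dynamic range of stays,
        # then HardTanh clips implausibly short or long predictions
        los = self.clip(torch.exp(self.fc_los(h_final)))
        death = torch.sigmoid(self.fc_death(h_final))  # death risk in (0, 1)
        return los, death
```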
8. The convolutional neural network-based ICU length-of-stay and death risk prediction method of claim 7, wherein setting the loss functions in step S3 specifically comprises:

For the ICU length-of-stay prediction task, the loss function $L_t$ is set to the mean squared logarithmic error:

$$L_t = \frac{1}{T} \sum_{t=1}^{T} \big(\log(\hat{y}_t + 1) - \log(y_t + 1)\big)^2$$

where T is the length of the clinical time series; $\hat{y}_t$ is the predicted ICU length of stay; and $y_t$ is the actual ICU length of stay;

For the death risk prediction task, the loss function L is set to the cross-entropy loss:

$$L = -\frac{1}{N} \sum_{i=1}^{N} \big[\, y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i) \,\big]$$

where N is the number of samples; $\hat{y}_i$ is the predicted death risk; and $y_i$ is the true death label of the sample.
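A short sketch of the two loss functions under the reconstruction above (the +1 offset inside the logarithms is an assumption to keep them finite at zero):

```python
import torch

def msle_loss(y_pred, y_true):
    """Mean squared logarithmic error over the T per-time-step LOS predictions."""
    return torch.mean((torch.log(y_pred + 1) - torch.log(y_true + 1)) ** 2)

def death_loss(y_pred, y_true):
    """Binary cross-entropy over N samples for the death-risk task."""
    return torch.nn.functional.binary_cross_entropy(y_pred, y_true)

# toy usage
y_hat = torch.rand(8, 48)       # predicted LOS at each of T=48 steps
y = torch.rand(8, 48) * 10      # actual LOS
print(msle_loss(y_hat, y))
print(death_loss(torch.rand(8, 1), torch.ones(8, 1)))
```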
CN202210645934.5A 2022-06-08 2022-06-08 ICU (intensive care unit) hospitalization duration and death risk prediction method based on convolutional neural network Pending CN114883003A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210645934.5A CN114883003A (en) 2022-06-08 2022-06-08 ICU (intensive care unit) hospitalization duration and death risk prediction method based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210645934.5A CN114883003A (en) 2022-06-08 2022-06-08 ICU (intensive care unit) hospitalization duration and death risk prediction method based on convolutional neural network

Publications (1)

Publication Number Publication Date
CN114883003A true CN114883003A (en) 2022-08-09

Family

ID=82682306

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210645934.5A Pending CN114883003A (en) 2022-06-08 2022-06-08 ICU (intensive care unit) hospitalization duration and death risk prediction method based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN114883003A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115547502A (en) * 2022-11-23 2022-12-30 浙江大学 Hemodialysis patient risk prediction device based on time sequence data
CN115547502B (en) * 2022-11-23 2023-04-07 浙江大学 Hemodialysis patient risk prediction device based on time sequence data
CN116227365A (en) * 2023-05-06 2023-06-06 成都理工大学 Landslide displacement prediction method based on improved VMD-TCN
CN116227365B (en) * 2023-05-06 2023-07-07 成都理工大学 Landslide displacement prediction method based on improved VMD-TCN
CN116364290A (en) * 2023-06-02 2023-06-30 之江实验室 Hemodialysis characterization identification and complications risk prediction system based on multi-view alignment
CN116364290B (en) * 2023-06-02 2023-09-08 之江实验室 Hemodialysis characterization identification and complications risk prediction system based on multi-view alignment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination