CN114883003A - ICU (intensive care unit) hospitalization duration and death risk prediction method based on convolutional neural network


Info

Publication number: CN114883003A
Application number: CN202210645934.5A
Authority: CN (China)
Legal status: Pending
Inventors: 王建新, A.阿戴拉米, 邹梦洁, 匡湖林
Applicant/Assignee: Central South University
Original language: Chinese (zh)

Classifications

    • G16H 50/30: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; calculating health indices; individual health risk assessment
    • G16H 50/70: ICT specially adapted for medical diagnosis, medical simulation or medical data mining; mining of medical data, e.g. analysing previous cases of other patients
    • G06F 18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F 18/253: Pattern recognition; fusion techniques of extracted features
    • G06N 3/045: Neural networks; architectures; combinations of networks
    • G06N 3/049: Neural networks; temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N 3/08: Neural networks; learning methods
    • Y02A 90/10: Information and communication technologies supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention discloses a convolutional-neural-network-based method for predicting ICU (intensive care unit) length of stay and death risk. The method comprises: acquiring basic data, and processing and classifying it to obtain a training data set, a validation data set, and a test data set; constructing a basic prediction model from temporal dilated convolution modules with different receptive fields and context-aware feature fusion modules; setting a loss function, and training, validating, and testing the basic prediction model on the data sets to obtain an optimal prediction model; and using the optimal prediction model to predict the ICU length of stay and death risk of actual patients. The invention encodes each feature independently with a temporal dilated separable convolutional network and provides a context-aware feature fusion method; a multi-view, multi-scale feature fusion module generates the final inpatient representation used for prediction. The method therefore has high reliability, high accuracy, and good performance.

Description

ICU (intensive care unit) hospitalization duration and death risk prediction method based on convolutional neural network
Technical Field
The invention belongs to the field of data processing, and particularly relates to a convolutional-neural-network-based ICU (intensive care unit) length-of-stay and death risk prediction method.
Background
With economic and technological development and rising living standards, people pay ever more attention to medical resources. How to plan and allocate medical resources has become a research focus.
The number and management of intensive care unit (ICU) beds is one of the important factors reflecting medical resources. ICU length-of-stay and ICU mortality data also influence the planning and allocation of medical resources to some extent. Predicting ICU length of stay and ICU mortality has therefore become a new research hotspot.
Currently, research on predicting ICU length of stay and ICU mortality generally treats and models the problem as a regression problem optimized with mean squared error (MSE) or mean squared logarithmic error (MSLE). However, most existing methods do not perform well, because the length-of-stay distribution is positively skewed and the data contain missing values. Meanwhile, some researchers have proposed methods based on temporal convolutional networks (TCNs), but these methods neglect the interrelations between different clinical features when modeling patient electronic medical record data, which limits the prediction accuracy and effectiveness of the prior art.
Disclosure of Invention
The invention aims to provide a convolutional-neural-network-based ICU length-of-stay and death risk prediction method that is highly reliable, accurate, and effective.
The convolutional-neural-network-based ICU length-of-stay and death risk prediction method comprises the following steps:
S1, acquiring basic data from an existing electronic medical record database, and processing and classifying it to obtain a training data set, a validation data set, and a test data set;
S2, constructing a basic prediction model from temporal dilated convolution modules with different receptive fields and context-aware feature fusion modules;
S3, setting a loss function, and training, validating, and testing the basic prediction model constructed in step S2 with the data sets obtained in step S1 to obtain an optimal prediction model;
S4, using the optimal prediction model obtained in step S3 to predict ICU length of stay and death risk for actual patients.
Step S1, acquiring basic data from an existing electronic medical record database and processing and classifying it to obtain a training data set, a validation data set, and a test data set, specifically comprises the following steps:
for the ICU length-of-stay prediction task, acquiring the remaining length of stay of each inpatient for every hour of the ICU stay, and using only data from the first X days of the stay, where X is a set integer value;
for the death risk prediction task, acquiring only the data from the first 24 hours of the ICU stay;
acquiring the clinical time series, static demographic data, and diagnosis data of the inpatients;
the clinical time series comprises the patient's clinical variables over time and the decay indicators of the corresponding clinical variables; the clinical time series is denoted $x_1, x_2, \ldots, x_T \in \mathbb{R}^{F \times 2}$, where $F$ is the number of clinical features; the time series at each time step $t$ includes the actual clinical measurement $x'_t$ and the corresponding decay indicator $x''_t$, which records when the measurement $x'_t$ was taken;
the static demographic data comprise the time-invariant indicators of each inpatient and are denoted $s \in \mathbb{R}^{S \times 1}$, where $S$ is the number of static demographic features;
the diagnosis data comprise the diagnoses of the inpatients and their corresponding codes and are denoted $d \in \mathbb{R}^{D \times 1}$, where $D$ is the number of diagnosis codes.
Step S2, constructing a basic prediction model based on temporal dilated convolutions with different receptive fields and context-aware feature fusion modules, specifically comprises the following steps:
learning temporal features with a basic prediction model composed of N consecutive layers, each containing a temporal dilated separable convolution block with its own receptive field and a context-aware feature fusion module;
for the first layer of the basic prediction model:
the input $h_0$ of the temporal dilated separable convolution network is the concatenation of the original clinical time series $x'$ and the decay indicators $x''$; the inputs of the context-aware feature fusion network are the original clinical time series $x'$, the decay indicators $x''$, and the static demographic features $s'$ repeated $T$ times; the input $x_1$ of the skip connection is the original clinical time series $x'$;
for the $n$-th layer of the basic prediction model, $1 < n \le N$:
the temporal dilated separable convolution network learns an individual temporal trend $g_n$ from the variable $h_{n-1}$ output by layer $n-1$.
The temporal dilated separable convolution network uses stacked temporal convolution layers to extract temporal trends from the data; the temporal convolution layers use depthwise separable convolution, so weights are shared only across time steps. The operation of the temporal convolution is defined as:

$$g_{n,i,t} = f_{n,i} * h_{n,i,t} = \sum_{j=1}^{K} W_{n,i}^{(j)} \, h_{n,i,\, t-d(j-1)}$$

where $h_{n,i,t}$ is the temporal input formed by the $i$-th feature up to time $t$ in the $n$-th layer of the basic prediction model, each input feature having $C_n$ channels, i.e. $h_{n,i,t} \in \mathbb{R}^{C_n \times t}$; $f_{n,i}: \mathbb{R}^{C_{in} \times K} \to \mathbb{R}^{C_{out}}$ is the convolution filter of each feature, a tensor of size $C_{out} \times C_{in} \times K$; for every $K$ time steps the convolution filter maps the input channels $C_{in}$ to the output channels $C_{out}$; the output is $g_{n,i,t} \in \mathbb{R}^{C_{out}}$. The receptive field $d(K-1)+1$ of the convolution filter is determined by the kernel size $K$ and the dilation factor $d$.

For each temporal convolution layer, left padding of $d(K-1)$ is added to keep the output length equal to the input length; the term $t-d(j-1)$ ensures that only past time steps are seen during the convolution; stacking temporal convolution layers enlarges the temporal receptive field. Finally, the temporal convolution outputs of all input features are concatenated to obtain the temporal trend $g_n$:

$$g_n = \big\Vert_{i=1}^{F}\, g_{n,i}$$

where $\Vert$ is the concatenation operation; the output dimension of the temporal convolution layer is $R_n \times C$, where $R_n$ is the number of temporal features in the $n$-th layer. A batch normalization layer and a Dropout layer follow each stacked temporal convolution layer to speed up model convergence and prevent overfitting.
time sequence feature g learned by context-aware feature fusion network through n-1 layer time sequence holes by using convolution network n -1 Time sequence characteristic z learned by n-1 level context perception characteristic fusion network n-1 Fusing the original clinical time sequence x 'and the static demographic data characteristics s' repeated T times to obtain more comprehensive context expression z between the characteristics of the health condition of the inpatients n (ii) a And overlapping the output of each layer of the context-aware feature fusion network through splicing operation to obtain the nth layer of context awarenessMulti-scale features for feature fusion networks
Figure BDA0003684104610000043
Is composed of
Figure BDA0003684104610000044
Wherein
Figure BDA0003684104610000045
Is empty;
the n layer of the basic prediction model combines the original clinical time sequence x' and the multi-scale characteristics of the n-1 layer
Figure BDA0003684104610000046
The last channel of (a) gets x by a jump connection n (ii) a Then adopting splicing operation to make x n Time sequence trend g of time sequence convolution network layer output n Inter-feature context representation z fused with context-aware feature output from a network n Fusing to obtain time sequence characteristic v n Is v is n =[x n ,g n ,z n ];
And the nth layer of the basic prediction model also adopts a feature attention block based on a point-by-point convolution neural network to carry out the time sequence feature v n Processing to obtain more effective time sequence characteristic h n ;h n The output of the nth layer for the final base prediction model;
taking the time sequence characteristic h output by the Nth layer of the basic prediction model N As a final timing feature;
processing the diagnosis data through a full connection layer, and repeating the processed diagnosis data for T times to obtain a coded diagnosis characteristic d'; then repeating the static demographic data characteristic s 'T times, the coded diagnosis characteristic d' and the final time sequence characteristic h N And multi-scale features
Figure BDA0003684104610000051
Splicing, processing the spliced representation through a full connection layer to obtain a multi-view multi-scale fusion feature h final
Finally, fusing the multi-view and multi-scale features h final And sending the data into a full connection layer, and obtaining a prediction result through an activation function.
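To make the layer structure above concrete, the following PyTorch-style sketch outlines one possible forward pass. It is a minimal sketch under stated assumptions: TDSCBlock, CAFFBlock, PWAtt, and all tensor shapes (batch, channels, T) are hypothetical stand-ins for the patent's components, and the special casing of the layer-1 CAFF input is simplified.

```python
import torch
import torch.nn as nn

class TDSCCAFFModel(nn.Module):
    """High-level sketch of the N-layer TDSC + CAFF stack. TDSCBlock,
    CAFFBlock, PWAtt and the head are hypothetical modules, not real APIs."""

    def __init__(self, tdsc_blocks, caff_blocks, pwatt_blocks, head):
        super().__init__()
        self.tdsc = nn.ModuleList(tdsc_blocks)    # dilation grows layer by layer
        self.caff = nn.ModuleList(caff_blocks)    # context-aware feature fusion
        self.pwatt = nn.ModuleList(pwatt_blocks)  # pointwise feature attention
        self.head = head                          # final FC layer + activation

    def forward(self, x_meas, x_decay, s_rep, d_enc):
        h = torch.cat([x_meas, x_decay], dim=1)   # h_0 = [x'; x'']
        z, z_scales, x_skip = None, [], x_meas    # x_1 = x'
        for tdsc, caff, pwatt in zip(self.tdsc, self.caff, self.pwatt):
            g = tdsc(h)                           # per-feature temporal trend g_n
            z = caff(g, z, x_meas, s_rep)         # inter-feature context z_n
            z_scales.append(z)                    # multi-scale features z~_n
            v = torch.cat([x_skip, g, z], dim=1)  # v_n = [x_n, g_n, z_n]
            h = pwatt(v)                          # h_n = PWAtt(v_n)
            x_skip = torch.cat([x_meas, z[:, -1:]], dim=1)  # skip for layer n+1
        fused = torch.cat([s_rep, d_enc, h] + z_scales, dim=1)
        return self.head(fused)                   # prediction from h_final
```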
Through the stacked temporal convolution layers, the dilation factor of each successive layer increases by 1.
The context-aware feature fusion network fusing the temporal feature $g_{n-1}$ learned by the layer-$(n-1)$ temporal dilated separable convolution network, the temporal feature $z_{n-1}$ learned by the layer-$(n-1)$ context-aware feature fusion network, the original clinical time series $x'$, and the static demographic features $s'$ repeated $T$ times to obtain a more comprehensive inter-feature context representation $z_n$ of the inpatient's health condition specifically comprises:
combining the temporal feature $g_{n-1}$ learned by the layer-$(n-1)$ temporal dilated separable convolution network, the temporal feature $z_{n-1}$ learned by the layer-$(n-1)$ context-aware feature fusion network, the original clinical time series $x'$, and the static demographic features $s'$ repeated $T$ times by concatenation, and using a feature attention block based on pointwise convolution to capture the interrelations among the dynamic features and generate attention features; the attention features are then adjusted by a fully connected layer to obtain the more comprehensive inter-feature context representation $z_n$ of the inpatient's health condition. The context-aware feature fusion network of the $n$-th layer is computed as:

$$z_n = \big(\mathrm{PWAtt}(E(g_{n-1})) \,\Vert\, E(z_{n-1}) \,\Vert\, x' \,\Vert\, s'\big) W_n + b_n$$

where $E(\cdot)$ is a flattening function; $W_n$ and $b_n$ are the weights and bias of the fully connected layer; $\Vert$ is the concatenation operation; $\mathrm{PWAtt}(\cdot)$ is the operation of the feature attention block based on pointwise convolution; $P_n$ is the size of the first dimension of the weight matrix $W_n$, with $P_n = (R_n \times C) + Z + F + S$, where $Z$ is the dimension of the fused feature $z_{n-1}$ of the layer-$(n-1)$ context-aware feature fusion network, $F$ is the number of features of the original clinical time series $x'$, and $S$ is the number of static demographic features $s'$.
The feature attention block based on pointwise convolution is used to capture the interrelations among the dynamic features and generate attention features. It comprises 5 neural network layers: in order, a pointwise convolution layer with dimensionality-reduction ratio $r$, a batch normalization layer, a ReLU activation layer, a pointwise convolution layer with $C$ filters, and a Sigmoid activation layer.
Given an input feature $X \in \mathbb{R}^{C \times F \times T}$, where $C$ is the channel size, $F$ is the temporal feature dimension, and $T$ is the temporal feature length, the operation of the feature attention block based on pointwise convolution is:

$$X' = A(X) \otimes X$$

where $A(X)$ is the output attention weight map, $A(X) = \sigma(\mathrm{PWConv}_2(\delta(\beta(\mathrm{PWConv}_1(X)))))$; $\mathrm{PWConv}_1(\cdot)$ and $\mathrm{PWConv}_2(\cdot)$ are pointwise convolution layer operations; $\beta$ is the batch normalization operation; $\delta$ is the ReLU function; $\sigma$ is the Sigmoid activation function; $\otimes$ is element-wise multiplication.
Feeding the multi-view, multi-scale fused feature $h_{final}$ into a fully connected layer and obtaining the prediction result through an activation function specifically comprises:
for the ICU length-of-stay prediction task, an exponential activation function is applied, which avoids failing to predict the length of stay over its full dynamic range; a HardTanh activation function then clips predictions shorter than a first set duration or longer than a second set duration. The computation is:

$$\hat{y}_t = \tau\big(\exp(h_{final} W_{y_t} + b_{y_t})\big)$$

where $\hat{y}_t$ is the predicted ICU length of stay; $\exp(\cdot)$ is the exponential activation function; $\tau(\cdot)$ is the HardTanh activation function; $W_{y_t}$ and $b_{y_t}$ are parameters to be learned; $h_{final}$ is the multi-view, multi-scale fused feature;
for the death risk prediction task, a sigmoid activation function is used for prediction:

$$\hat{y} = \mathrm{sigmoid}(h_{final} W_y + b_y)$$

where $\hat{y}$ is the predicted death risk; $\mathrm{sigmoid}(\cdot)$ is the sigmoid activation function; $W_y$ and $b_y$ are parameters to be learned.
Setting the loss function in step S3 specifically comprises the following steps:
for the ICU length-of-stay prediction task, the loss function $L_t$ is set to the mean squared logarithmic error:

$$L_t = \frac{1}{T} \sum_{t=1}^{T} \big(\log(\hat{y}_t + 1) - \log(y_t + 1)\big)^2$$

where $T$ is the length of the clinical time series; $\hat{y}_t$ is the predicted ICU length of stay; $y_t$ is the actual ICU length of stay;
for the death risk prediction task, the loss function $L$ is set to the cross-entropy loss:

$$L = -\frac{1}{N} \sum_{i=1}^{N} \big(y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i)\big)$$

where $N$ is the number of samples; $\hat{y}_i$ is the predicted death risk; $y_i$ is the true death label of the sample.
The convolutional-neural-network-based ICU length-of-stay and death risk prediction method of the invention comprises consecutive temporal dilated separable convolution networks, context-aware feature fusion modules, and a multi-view, multi-scale feature fusion module. In each temporal dilated separable convolution network and context-aware feature fusion module, each feature is encoded independently by the temporal dilated separable convolution network, and a context-aware feature fusion method is provided that obtains a context-aware comprehensive feature representation from the original clinical time series, the static demographic data, and the outputs of the previous temporal dilated separable convolution network and context-aware feature fusion module. The multi-view, multi-scale feature fusion module fuses the captured multi-scale features and the features from different views, and generates the final inpatient representation used for prediction. The method therefore has high reliability, high accuracy, and good performance.
Drawings
FIG. 1 is a flow chart of the method of the invention.
FIG. 2 is a schematic diagram of the structure of the prediction model in the method of the invention.
FIG. 3 is a schematic diagram of the structure of the pointwise-convolution-based feature attention block in the method of the invention.
Detailed Description
FIG. 1 is a flow chart of the method of the invention. The convolutional-neural-network-based ICU length-of-stay and death risk prediction method comprises the following steps:
S1, acquiring basic data from an existing electronic medical record database, and processing and classifying it to obtain a training data set, a validation data set, and a test data set; specifically:
for the ICU length-of-stay prediction task, acquiring the remaining length of stay of each inpatient for every hour of the ICU stay, and using only data from the first X days of the stay, where X is a set integer value;
for the death risk prediction task, acquiring only the data from the first 24 hours of the ICU stay;
acquiring the clinical time series, static demographic data, and diagnosis data of the inpatients;
the clinical time series comprises the patient's clinical variables over time and the decay indicators of the corresponding clinical variables; the clinical time series is denoted $x_1, x_2, \ldots, x_T \in \mathbb{R}^{F \times 2}$, where $F$ is the number of clinical features; the time series at each time step $t$ includes the actual clinical measurement $x'_t$ and the corresponding decay indicator $x''_t$, which records when the measurement $x'_t$ was taken;
the static demographic data comprise the time-invariant indicators of each inpatient and are denoted $s \in \mathbb{R}^{S \times 1}$, where $S$ is the number of static demographic features;
the diagnosis data comprise the diagnoses of the inpatients and their corresponding codes and are denoted $d \in \mathbb{R}^{D \times 1}$, where $D$ is the number of diagnosis codes;
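As a concrete illustration of this data layout, the sketch below pairs each measurement channel with its decay indicator. The exact decay definition used here (hours since the value was last recorded, with forward filling) is an assumption chosen to make the example runnable, not the patent's reference implementation.

```python
import torch

def build_decay_indicators(values: torch.Tensor, mask: torch.Tensor) -> torch.Tensor:
    """values, mask: (F, T), mask[i, t] = 1 where feature i was recorded at hour t.
    Returns (F, 2, T): forward-filled measurements x' stacked with an assumed
    decay indicator x'' counting hours since the value was last recorded."""
    F, T = values.shape
    filled = values.clone()
    decay = torch.zeros(F, T)
    for t in range(1, T):
        missing = mask[:, t] == 0
        filled[missing, t] = filled[missing, t - 1]    # forward fill
        decay[missing, t] = decay[missing, t - 1] + 1  # hours since last record
    return torch.stack([filled, decay], dim=1)
```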
S2, constructing a basic prediction model (shown in FIG. 2) from temporal dilated separable convolution (TDSC network) modules with different receptive fields and context-aware feature fusion (CAFF network) modules; specifically:
learning temporal features with a basic prediction model composed of N consecutive layers, each containing a temporal dilated separable convolution block with its own receptive field and a context-aware feature fusion module;
for the first layer of the basic prediction model:
the input $h_0$ of the temporal dilated separable convolution network is the concatenation of the original clinical time series $x'$ and the decay indicators $x''$; the inputs of the context-aware feature fusion network are the original clinical time series $x'$, the decay indicators $x''$, and the static demographic features $s'$ repeated $T$ times; the input $x_1$ of the skip connection is the original clinical time series $x'$;
for the $n$-th layer of the basic prediction model, $1 < n \le N$:
the temporal dilated separable convolution (TDSC) network learns an individual temporal trend $g_n$ from the variable $h_{n-1}$ output by layer $n-1$.
The temporal dilated separable convolution network uses stacked temporal convolution layers to extract temporal trends from the data; specifically, the TDSC network uses stacked temporal convolutional network (TCN) layers to extract temporal trends from the electronic medical record data. TCNs are variants of CNNs that convolve over time, using causal convolution and dilated convolution to suit the modeling of time series data. They rest on two important principles: (1) the input and output must have the same length, and (2) the current data point depends only on data points from earlier times (data from the future cannot be used, preventing information leakage). Unlike the conventional TCN design, the temporal convolution layers in this method use depthwise separable convolution: no weights are shared between features, and weights are shared only across time steps. The operation of the temporal convolution is defined as:

$$g_{n,i,t} = f_{n,i} * h_{n,i,t} = \sum_{j=1}^{K} W_{n,i}^{(j)} \, h_{n,i,\, t-d(j-1)}$$

where $h_{n,i,t}$ is the temporal input formed by the $i$-th feature up to time $t$ in the $n$-th layer of the basic prediction model, each input feature having $C_n$ channels, i.e. $h_{n,i,t} \in \mathbb{R}^{C_n \times t}$; $f_{n,i}: \mathbb{R}^{C_{in} \times K} \to \mathbb{R}^{C_{out}}$ is the convolution filter of each feature, a tensor of size $C_{out} \times C_{in} \times K$; for every $K$ time steps the convolution filter maps the input channels $C_{in}$ to the output channels $C_{out}$; the output is $g_{n,i,t} \in \mathbb{R}^{C_{out}}$. The receptive field $d(K-1)+1$ of the convolution filter is determined by the kernel size $K$ and the dilation factor $d$.

For each temporal convolution layer, left padding of $d(K-1)$ is added to keep the output length equal to the input length; the term $t-d(j-1)$ ensures that only past time steps are seen during the convolution; stacking temporal convolution layers enlarges the temporal receptive field; specifically, the dilation factor of each successive layer increases by 1. Finally, the temporal convolution outputs of all input features are concatenated to obtain the temporal trend $g_n$:

$$g_n = \big\Vert_{i=1}^{F}\, g_{n,i}$$

where $\Vert$ is the concatenation operation; the output dimension of the temporal convolution layer is $R_n \times C$, where $R_n$ is the number of temporal features in the $n$-th layer. A batch normalization layer and a Dropout layer follow each stacked temporal convolution layer to speed up model convergence and prevent overfitting.
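A minimal PyTorch sketch of one such depthwise, dilated, causal temporal convolution layer follows. Mapping "weights shared only across time steps" to Conv1d with groups equal to the number of features, and the left padding of d(K-1), are my reading of the description above; this is a sketch, not the patent's reference code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class DepthwiseCausalTCNLayer(nn.Module):
    """One stacked TCN layer of the TDSC network (sketch). groups=n_features
    gives each clinical feature its own filters, so weights are shared across
    time steps but not across features; left padding keeps the convolution
    causal and the output length equal to the input length."""

    def __init__(self, n_features: int, channels: int, kernel_size: int, dilation: int):
        super().__init__()
        self.left_pad = dilation * (kernel_size - 1)  # receptive field d(K-1)+1
        self.conv = nn.Conv1d(
            in_channels=n_features * channels,
            out_channels=n_features * channels,
            kernel_size=kernel_size,
            dilation=dilation,
            groups=n_features,                        # depthwise over features
        )
        self.bn = nn.BatchNorm1d(n_features * channels)
        self.drop = nn.Dropout(0.05)                  # Dropout value from Table 1 (eICU)

    def forward(self, h: torch.Tensor) -> torch.Tensor:
        # h: (batch, n_features * channels, T) -> output of the same shape
        out = self.conv(F.pad(h, (self.left_pad, 0)))  # pad only the left (past)
        return self.drop(self.bn(out))
```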
time sequence feature g learned by context-aware feature fusion (CAFF) network from n-1 layer time sequence hole separable convolution network n-1 Time sequence characteristic z learned by n-1 level context perception characteristic fusion network n-1 Fusing the original clinical time sequence x 'and the static demographic data characteristics s' repeated T times to obtain more comprehensive context expression z between the characteristics of the health condition of the inpatients n (ii) a The method specifically comprises the following steps:
feature fusion, also known as combining features from multiple layers or branches, is commonly used in modern deep learning methods, typically using simple operations such as stitching or summing, to provide a linear aggregation of fixed fused features regardless of the correlation between features; in order to effectively combine time sequence characteristics with different receptive fields and consider the relationship among the characteristics, the method provides a CAFF network;
timing characteristic g learned by CAFF network through layer n-1 timing hole separable convolution network n-1 The n-1 th layer up and downTime sequence characteristic z learned by text perception characteristic fusion network n-1 Combining the original clinical time sequence x 'and the static demographic data characteristics s' repeated for T times through splicing operation, and capturing the mutual relation among the dynamic characteristics by adopting a characteristic attention block based on a point-by-point convolution neural network to generate attention characteristics; the attention profile is then adjusted through the full connectivity layer to obtain a more comprehensive inter-profile contextual representation of the health of the resident z n (ii) a The calculation formula of the n-th layer context-aware feature fusion network is as follows:
z n =(PWAtt(E(g n-1 ))||E(z n-1 )||x||s')*W n +b n
where E () is a flattening function, W n And b n Is the weight of the full connection layer, and
Figure BDA0003684104610000111
i is splicing operation; PWAtt () is an operation function corresponding to a feature attention block based on a point-by-point convolution neural network; p n Is a weight matrix W n Size of the first dimension, and P n =(R n xXC) + Z + F + S, Z is the fusion characteristic Z of the n-1 layer context sensing characteristic fusion network n-1 F is the characteristic quantity of the original clinical time series x ', and S is the characteristic quantity of the static demographic data characteristics S';
In a specific implementation, pointwise convolution, also called 1x1 convolution, is widely adopted in modern deep learning architectures to reduce the number of parameters, particularly in image processing. Since in a TCN the feature weights are shared only across time steps and no information is exchanged between different features, the method proposes the PWAtt block to capture the correlations between dynamic features and generate more effective features. The feature attention block based on pointwise convolution captures the interrelations among the dynamic features to generate attention features; it comprises 5 neural network layers (shown in FIG. 3): in order, a pointwise convolution layer with dimensionality-reduction ratio $r$, a batch normalization layer, a ReLU activation layer, a pointwise convolution layer with $C$ filters, and a Sigmoid activation layer.
Given an input feature $X \in \mathbb{R}^{C \times F \times T}$, where $C$ is the channel size, $F$ is the temporal feature dimension, and $T$ is the temporal feature length, the operation of the feature attention block based on pointwise convolution is:

$$X' = A(X) \otimes X$$

where $A(X)$ is the output attention weight map, $A(X) = \sigma(\mathrm{PWConv}_2(\delta(\beta(\mathrm{PWConv}_1(X)))))$; $\mathrm{PWConv}_1(\cdot)$ and $\mathrm{PWConv}_2(\cdot)$ are pointwise convolution layer operations; $\beta$ is the batch normalization operation; $\delta$ is the ReLU function; $\sigma$ is the Sigmoid activation function; $\otimes$ is element-wise multiplication;
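A minimal sketch of the PWAtt block follows, assuming Conv1d as the pointwise convolution and attention computed over the channel dimension; the reduction ratio r is an illustrative default.

```python
import torch
import torch.nn as nn

class PWAtt(nn.Module):
    """Pointwise-convolution feature attention block (sketch of the 5 layers):
    PWConv_1 (C -> C/r) -> BatchNorm -> ReLU -> PWConv_2 (C/r -> C) -> Sigmoid,
    followed by element-wise multiplication with the input, X' = A(X) (x) X."""

    def __init__(self, channels: int, r: int = 4):
        super().__init__()
        reduced = max(channels // r, 1)  # dimensionality-reduction ratio r
        self.attn = nn.Sequential(
            nn.Conv1d(channels, reduced, kernel_size=1),  # PWConv_1
            nn.BatchNorm1d(reduced),
            nn.ReLU(),
            nn.Conv1d(reduced, channels, kernel_size=1),  # PWConv_2, C filters
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, C, T); A(x) has the same shape and gates x element-wise
        return self.attn(x) * x
```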
The outputs of the context-aware feature fusion networks of all layers are then stacked by concatenation to obtain the multi-scale features $\tilde{z}_n$ of the $n$-th layer's context-aware feature fusion network:

$$\tilde{z}_n = [\tilde{z}_{n-1}, z_n]$$

where $\tilde{z}_0$ is empty;
the $n$-th layer of the basic prediction model combines the original clinical time series $x'$ with the last channel of the multi-scale features $\tilde{z}_{n-1}$ of layer $n-1$ through a skip connection to obtain $x_n$; a concatenation operation then fuses $x_n$, the temporal trend $g_n$ output by the temporal convolution layers, and the inter-feature context representation $z_n$ output by the context-aware feature fusion network into the temporal feature $v_n = [x_n, g_n, z_n]$. In a specific implementation, in order to handle infrequently sampled features (for example, a particular blood test performed only once a day), the method forward-fills the data and convolves it with TCN layers of increasing dilation factor until the required width is reached, extracting the temporal trend without losing temporal resolution. The actual training process is more challenging, however: if no useful temporal trend has been captured after a convolution, the earlier TCN layers with smaller dilation factors lose information of the original input data through re-weighting. To guarantee that the features learned by each TDSC layer extract useful trends and capture multi-scale information (with different receptive fields), the method uses a skip connection that concatenates the last channel of the multi-scale features $\tilde{z}_{n-1}$ in the layer-$(n-1)$ CAFF with the input feature $x'$ of the original clinical time series to obtain $x_n$; $x_n$ is then connected with the output $g_n$ of the TDSC network to obtain the adjusted feature-specific temporal representation $r_n = [g_n, x_n]$; finally, the inter-feature context fused feature $z_n$ and $r_n$ are concatenated to obtain the serial temporal feature $v_n = [r_n, z_n]$;
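The sketch below assembles one CAFF layer from the formula for $z_n$, reusing a PWAtt module as above; the flattening order, the argument shapes, and the output size Z are assumptions made only for illustration.

```python
import torch
import torch.nn as nn

class CAFFLayer(nn.Module):
    """Context-aware feature fusion layer (sketch):
    z_n = (PWAtt(E(g_{n-1})) || E(z_{n-1}) || x' || s') W_n + b_n.
    p_n is the total flattened size of the four inputs; z_dim is the assumed
    size Z of the output context representation z_n."""

    def __init__(self, pwatt: nn.Module, p_n: int, z_dim: int):
        super().__init__()
        self.pwatt = pwatt
        self.fc = nn.Linear(p_n, z_dim)  # weights W_n and bias b_n

    def forward(self, g_prev, z_prev, x_meas, s_rep):
        parts = [
            self.pwatt(g_prev).flatten(start_dim=1),  # PWAtt over g_{n-1}, then flattened
            z_prev.flatten(start_dim=1),              # E(z_{n-1})
            x_meas.flatten(start_dim=1),              # original time series x'
            s_rep.flatten(start_dim=1),               # s' repeated T times
        ]
        return self.fc(torch.cat(parts, dim=1))       # z_n
```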
The $n$-th layer of the basic prediction model further processes the temporal feature $v_n$ with the feature attention block based on pointwise convolution to obtain a more effective temporal feature $h_n$; $h_n$ is the final output of the $n$-th layer of the basic prediction model; in a specific implementation, $h_n = \mathrm{PWAtt}(v_n)$;
the temporal feature $h_N$ output by the $N$-th layer of the basic prediction model is taken as the final temporal feature;
the diagnosis data are processed by a fully connected layer and the result is repeated $T$ times to obtain the encoded diagnosis features $d'$; the static demographic features $s'$ repeated $T$ times, the encoded diagnosis features $d'$, the final temporal feature $h_N$, and the multi-scale features $\tilde{z}_N$ are then concatenated, and the concatenated representation is processed by a fully connected layer to obtain the multi-view, multi-scale fused feature $h_{final}$. In a specific implementation, the input raw static demographic data $s$ are encoded by repeating them $T$ times; the input raw diagnosis data $d$ are encoded by a fully connected layer and repeated $T$ times; the re-encoded demographic and diagnosis data are:

$$s' = [s_0, \ldots, s_t, \ldots, s_T]$$
$$d' = [\tilde{d}_0, \ldots, \tilde{d}_t, \ldots, \tilde{d}_T]$$

where $s'$ and $d'$ are the encoded static demographic and diagnosis features, respectively; $T$ is the length of the time series; $s_t$ equals the raw static demographic data $s$; $\tilde{d}_t$ is the output of the raw diagnosis data $d$ after encoding by the fully connected layer; $[\cdot]$ denotes the concatenation operation;
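A small sketch of this multi-view encoding and fusion step, assuming batch-first tensors and placeholder dimensions:

```python
import torch
import torch.nn as nn

def fuse_views(h_N, z_ms, s, d, diag_fc: nn.Linear, final_fc: nn.Linear, T: int):
    """Sketch: encode diagnoses with a fully connected layer, repeat both static
    views T times, concatenate with h_N and the multi-scale features z_ms
    (all of shape (batch, channels, T)), and project to h_final."""
    s_rep = s.unsqueeze(-1).expand(-1, -1, T)           # s' = s repeated T times
    d_rep = diag_fc(d).unsqueeze(-1).expand(-1, -1, T)  # d' = FC(d) repeated T times
    fused = torch.cat([s_rep, d_rep, h_N, z_ms], dim=1)
    # flatten per patient and project through the final fully connected layer
    return final_fc(fused.flatten(start_dim=1))         # h_final
```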
Finally, the multi-view, multi-scale fused feature $h_{final}$ is fed into a fully connected layer, and the prediction result is obtained through an activation function; specifically:
for the ICU length-of-stay prediction task, an exponential activation function is applied, which avoids failing to predict the length of stay over its full dynamic range; a HardTanh activation function then clips predictions shorter than a first set duration (preferably 30 min) or longer than a second set duration (preferably 100 days). The computation is:

$$\hat{y}_t = \tau\big(\exp(h_{final} W_{y_t} + b_{y_t})\big)$$

where $\hat{y}_t$ is the predicted ICU length of stay; $\exp(\cdot)$ is the exponential activation function; $\tau(\cdot)$ is the HardTanh activation function; $W_{y_t}$ and $b_{y_t}$ are parameters to be learned; $h_{final}$ is the multi-view, multi-scale fused feature;
for the death risk prediction task, a sigmoid activation function is used for prediction:

$$\hat{y} = \mathrm{sigmoid}(h_{final} W_y + b_y)$$

where $\hat{y}$ is the predicted death risk; $\mathrm{sigmoid}(\cdot)$ is the sigmoid activation function; $W_y$ and $b_y$ are parameters to be learned;
S3, setting a loss function, and training, validating, and testing the basic prediction model constructed in step S2 with the data sets obtained in step S1 to obtain an optimal prediction model; specifically:
for the ICU length-of-stay prediction task, the loss function $L_t$ is set to the mean squared logarithmic error:

$$L_t = \frac{1}{T} \sum_{t=1}^{T} \big(\log(\hat{y}_t + 1) - \log(y_t + 1)\big)^2$$

where $T$ is the length of the clinical time series; $\hat{y}_t$ is the predicted ICU length of stay; $y_t$ is the actual ICU length of stay;
for the death risk prediction task, the loss function $L$ is set to the cross-entropy loss:

$$L = -\frac{1}{N} \sum_{i=1}^{N} \big(y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i)\big)$$

where $N$ is the number of samples; $\hat{y}_i$ is the predicted death risk; $y_i$ is the true death label of the sample;
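Both losses map directly onto standard PyTorch operations; a minimal sketch, keeping the +1 shift inside the logarithms as in the MSLE formula above:

```python
import torch

def msle_loss(pred_los: torch.Tensor, true_los: torch.Tensor) -> torch.Tensor:
    """Mean squared logarithmic error over the hourly LoS predictions."""
    return torch.mean((torch.log(pred_los + 1) - torch.log(true_los + 1)) ** 2)

def mortality_loss(pred_risk: torch.Tensor, true_label: torch.Tensor) -> torch.Tensor:
    """Binary cross-entropy over N samples, as in the formula above."""
    return torch.nn.functional.binary_cross_entropy(pred_risk, true_label)
```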
S4, using the optimal prediction model obtained in step S3 to predict ICU length of stay and death risk for actual patients.
The method of the invention is further illustrated below with reference to an example:
The method is implemented on the PyTorch deep learning framework, with model training GPU-accelerated on an NVIDIA RTX 2080Ti graphics card. The finally selected hyper-parameter values of the model are shown in Table 1:
Table 1. Hyper-parameter values of the method on the two data sets
Hyper-parameter eICU MIMIC-IV
Number of channels 12 11
Dropout size (TDSC network) 0.05 0.05
Dropout size (Main model) 0.45 0
Convolution kernel size 4 5
Number of layers of TDSC-CAFF 11 8
Diagnostic data embedding dimension 64 -
Output size of last full connection layer 32 32
Training batch size 32 8
Learning rate 0.00226 0.00221
In the LoS prediction task, the method uses the following evaluation metrics: Cohen's Kappa score, coefficient of determination (R²), mean squared logarithmic error (MSLE), root mean squared logarithmic error (RMSLE), mean squared error (MSE), root mean squared error (RMSE), mean absolute percentage error (MAPE), and mean absolute deviation (MAD). Except for the Cohen's Kappa score and R², lower values indicate better performance of the method.
In the death risk prediction task, the method uses accuracy, area under the precision-recall curve (AUPRC), area under the receiver operating characteristic curve (AUROC), and F1-score as model performance metrics. For all four metrics, higher values indicate better prediction performance of the method.
To evaluate the effectiveness of the proposed ICU length-of-stay and death risk prediction model, the method is compared against the following baselines:
Mean and Median: the mean (3.47 days) and median (1.67 days) of LoS in the eICU training set, and the mean (5.70 days) and median (2.70 days) of LoS in the MIMIC-IV training set, are computed as baselines indicating a reasonable performance level and as reference points for the expected performance on each data set.
APACHE-IV: a widely used clinical prediction technique that produces one value (LoS or death risk) every 24 hours for each ICU patient. Note that this method is used only on the eICU data set.
LSTM: a variant of the standard LSTM model that uses only the time series data as input.
Multi-Channel LSTM (MC-LSTM): a set of separate LSTMs that each process one clinical time series feature; the outputs of the individual LSTMs are concatenated to produce the final output.
Transformer: based on a multi-head self-attention mechanism, similar to the method of the invention, but it neither enlarges the receptive field nor processes features independently.
ConCare: uses multi-channel GRUs, extracts the trends of clinical features with a time-aware attention mechanism, and extracts inter-feature context information with a multi-head attention mechanism. Since it can only process the entire 24-hour prediction window at once, the invention uses it as a baseline for the death risk prediction task only.
TPC: processes each feature separately with a temporal convolutional network and uses pointwise convolution to fuse features and extract their interrelations. Unlike the method of the invention, however, it cannot extract context information between different features.
Note that among the above baselines, only TPC uses the multi-view features as input. For a fair performance comparison, when implementing the other baselines, the time series features obtained with each baseline are fused with the static demographic and diagnosis features.
The method and the baselines are evaluated on the same test sets. Performance on the LoS prediction task is verified first; the experimental results are shown in Tables 2 and 3, where, except for the first three baselines, results are given as mean ± standard deviation over 10 runs.
Table 2. Comparison of prediction performance on the LoS prediction task on the eICU data set
Method MAD MSE RMSE MAPE MSLE RMSLE R2 Kappa
Mean 3.58 35.1 5.92 418.3 2.92 1.71 -7.04 0.00
Median 3.10 39.1 6.25 194.8 2.19 1.48 -0.11 0.00
APACHE-IV 2.54 16.3 4.04 182.6 1.10 1.05 -0.01 0.21
LSTM 2.66±0.01 31.7±0.2 5.63±0.02 129.2±2.3 1.48±0.01 1.21±0.00 0.10±0.01 0.31±0.01
MC-LSTM 2.65±0.01 31.3±0.3 5.59±0.03 130.4±1.7 1.46±0.01 1.20±0.00 0.11±0.01 0.33±0.01
Transformer 2.58±0.01 30.6±0.3 5.53±0.03 120.0±1.9 1.41±0.00 1.19±0.00 0.13±0.01 0.34±0.01
TPC 1.50±0.04 18.4±0.8 4.30±0.09 36.7±1.0 0.32±0.01 0.57±0.01 0.48±0.02 0.77±0.02
The invention 0.85±0.03 11.6±0.5 3.40±0.07 14.4±1.2 0.07±0.01 0.26±0.01 0.66±0.01 0.92±0.01
Table 3. Comparison of prediction performance on the LoS prediction task on the MIMIC-IV data set
Method MAD MSE RMSE MAPE MSLE RMSLE R2 Kappa
Mean 4.55 52.2 7.23 445.1 2.77 1.66 0.00 0.00
Median 3.97 59.4 7.71 198.8 2.08 1.44 -0.14 0.00
LSTM 2.99±0.03 37.4±0.9 6.61±0.07 104.7±2.4 1.09±0.01 1.04±0.01 0.28±0.02 0.51±0.01
MC-LSTM 2.97±0.03 38.5±1.0 6.20±0.08 98.0±1.1 1.05±0.01 1.02±0.00 0.26±0.02 0.50±0.01
Transformer 2.94±0.02 38.2±0.9 6.18±0.07 97.7±1.8 1.05±0.01 1.02±0.00 0.27±0.02 0.49±0.01
TPC 1.60±0.10 24.3±3.5 4.90±0.3 24.4±1.9 0.14±0.01 0.38±0.01 0.54±0.05 0.88±0.01
The invention 1.26±0.03 18.6±0.6 4.30±0.1 15.3±0.7 0.08±0.00 0.23±0.01 0.64±0.01 0.92±0.00
As Tables 2 and 3 show, the TDSC-CAFF model proposed by the invention achieves the best performance of all compared methods on both the eICU and MIMIC-IV data sets, with an MSE of 11.6, an MSLE of 0.07, an R² of 0.66, and a Kappa score of 0.92 on the eICU data set, and an LoS-prediction MSLE of 0.08, an R² of 0.64, and a Kappa score of 0.92 on the MIMIC-IV data set. These results show that the proposed method outperforms the baselines on the LoS prediction task.
In the LoS prediction task, the two RNN-based baselines, LSTM and MC-LSTM, do not encode long LoS sequences well, giving worse performance than the TCN-based models (TPC and the invention's TDSC-CAFF) and confirming that TCNs perform well for LoS prediction. LSTM and Transformer process the time series features as a whole; the comparison with TPC and the proposed TDSC-CAFF, which process each time series feature separately, shows the importance of processing time series features individually. Furthermore, TPC performs worse than the proposed method because it cannot effectively capture inter-feature context information, underscoring the importance of inter-feature context and of multi-view, multi-scale feature fusion for the LoS prediction task.
The invention's performance on the death risk prediction task is then verified; the experimental results are shown in Tables 4 and 5, given as mean ± standard deviation over 10 runs.
Table 4. Comparison of prediction performance on the death risk prediction task on the eICU data set
Method Accuracy AUROC AUPRC F1
LSTM 0.899±0.003 0.838±0.004 0.361±0.015 0.632±0.010
MC-LSTM 0.907±0.001 0.855±0.002 0.440±0.008 0.606±0.027
Transformer 0.907±0.001 0.849±0.003 0.428±0.007 0.591±0.017
ConCare 0.928±0.001 0.905±0.001 0.615±0.002 0.707±0.008
TPC 0.912±0.003 0.864±0.002 0.498±0.009 0.617±0.016
The invention 0.947±0.003 0.909±0.002 0.735±0.008 0.806±0.006
Table 5. Comparison of prediction performance on the death risk prediction task on the MIMIC-IV data set
Method Accuracy AUROC AUPRC F1
LSTM 0.911±0.001 0.896±0.002 0.639±0.005 0.745±0.005
MC-LSTM 0.911±0.001 0.899±0.002 0.634±0.008 0.751±0.008
Transformer 0.912±0.002 0.895±0.001 0.630±0.007 0.738±0.004
ConCare 0.927±0.001 0.922±0.001 0.726±0.002 0.778±0.004
TPC 0.912±0.003 0.898±0.003 0.671±0.008 0.779±0.005
The invention 0.934±0.003 0.926±0.002 0.744±0.006 0.821±0.004
Tables 4 and 5 compare the method against 5 other baselines on the eICU and MIMIC-IV data sets, respectively. The proposed TDSC-CAFF achieves an AUROC of 0.909, an AUPRC of 0.735, and an F1 of 0.806 on the eICU data set, and an AUROC of 0.926, an AUPRC of 0.744, and an F1 of 0.821 on the MIMIC-IV data set, outperforming all baselines, including those designed specifically for death risk prediction.
In the death risk prediction task, the baseline ConCare achieves an AUROC of 0.905, an AUPRC of 0.615, and an F1 of 0.707 on the eICU data set, and an AUROC of 0.922, an AUPRC of 0.726, and an F1 of 0.778 on the MIMIC-IV data set, which is better than the other baselines. Like the proposed TDSC-CAFF, ConCare also encodes each feature separately, using multi-channel GRUs, and captures inter-feature context with a multi-head attention mechanism. These comparisons indicate that encoding time series features separately and capturing inter-feature context information are effective for death risk prediction.
In conclusion, the experimental results on the eICU and MIMIC-IV data sets show that the proposed method can learn independent encodings of the clinical time series features as well as multi-view, multi-scale features, and can accurately predict the length of stay and death risk of ICU patients.

Claims (8)

1. A convolutional-neural-network-based ICU length-of-stay and death risk prediction method, comprising the following steps:
S1, acquiring basic data from an existing electronic medical record database, and processing and classifying it to obtain a training data set, a validation data set, and a test data set;
S2, constructing a basic prediction model from temporal dilated convolution modules with different receptive fields and context-aware feature fusion modules;
S3, setting a loss function, and training, validating, and testing the basic prediction model constructed in step S2 with the data sets obtained in step S1 to obtain an optimal prediction model;
S4, using the optimal prediction model obtained in step S3 to predict ICU length of stay and death risk for actual patients.
2. The convolutional-neural-network-based ICU length-of-stay and death risk prediction method according to claim 1, wherein the step S1 of acquiring basic data from an existing electronic medical record database and processing and classifying it to obtain a training data set, a validation data set, and a test data set specifically comprises the following steps:
for the ICU length-of-stay prediction task, acquiring the remaining length of stay of each inpatient for every hour of the ICU stay, and using only data from the first X days of the stay, where X is a set integer value;
for the death risk prediction task, acquiring only the data from the first 24 hours of the ICU stay;
acquiring the clinical time series, static demographic data, and diagnosis data of the inpatients;
the clinical time series comprising the patient's clinical variables over time and the decay indicators of the corresponding clinical variables; the clinical time series being denoted $x_1, x_2, \ldots, x_T \in \mathbb{R}^{F \times 2}$, where $F$ is the number of clinical features; the time series at each time step $t$ including the actual clinical measurement $x'_t$ and the corresponding decay indicator $x''_t$, which records when the measurement $x'_t$ was taken;
the static demographic data comprising the time-invariant indicators of each inpatient and being denoted $s \in \mathbb{R}^{S \times 1}$, where $S$ is the number of static demographic features;
the diagnosis data comprising the diagnoses of the inpatients and their corresponding codes and being denoted $d \in \mathbb{R}^{D \times 1}$, where $D$ is the number of diagnosis codes.
3. The ICU stay in hospital and mortality risk prediction method according to claim 2, wherein the step S2 of constructing a basic prediction model based on the time-series hole convolution-able and context-aware feature fusion modules with different receptive fields specifically comprises the following steps:
the method comprises the following steps of learning time sequence characteristics by adopting a basic prediction model formed by N layers of continuous time sequence cavity reelable blocks with different receptive fields and a context perception characteristic fusion module;
for the first layer of the base prediction model:
input h of time sequence cavity separable convolution network 0 Splicing an original clinical time sequence x 'and an attenuation index x'; the input of the context-aware feature fusion network is an original clinical time sequence x ', an attenuation index x ' and a static demographic data feature s ' repeated T times;input x of a jump connection 1 Is the original clinical time sequence x';
for the nth layer of the basic prediction model, N is more than 1 and less than or equal to N:
variable h output by time sequence hole separable convolution network to n-1 layer n-1 Learning individual time series trends g, respectively n
The time sequence hole separable convolutional network adopts a stacked time sequence convolutional network to extract a time sequence trend from data; the time sequence convolution network layer adopts depth separable convolution, and the weight is only shared between time steps; the operation of the time-series convolutional network is defined as:
Figure FDA0003684104600000021
wherein h is n,i Time series input characteristics formed by ith characteristics in nth layer of basic prediction model and up to t th time, wherein each input characteristic contains C n A channel, and
Figure FDA0003684104600000031
f n,i :
Figure FDA0003684104600000032
convolution filter for each feature, representing size C out ×C in A tensor of x k; for every k time steps, the convolution filter will input channel C in Mapping to output channel C out (ii) a Output of
Figure FDA0003684104600000033
The receptive field (d (K-1) +1) of the convolution filter is determined by the convolution kernel size K and the void factor d;
for each time-sequential convolutional network layer, padding is added on the left side of (d (K-1) +1) to ensure that the output size is consistent with the input size; the t-d (j-1) term is used to ensure that only past time steps are reviewed in the course of the convolution; increasing the size of a time sequence receptive field through the stacked time sequence convolution network layers; finally, each input character is inputThe outputs of the characterized time sequence convolutions are spliced together to obtain a time sequence trend g n Comprises the following steps:
Figure FDA0003684104600000034
wherein | | | is the splicing operation; the output dimension of the time sequence convolution network layer is R n ×C,R n The number of timing features for the nth layer; a batch normalization layer and a Dropout layer are used behind each stacked time sequence convolution network layer for accelerating model convergence and preventing overfitting;
time sequence feature g learned by n-1 layer time sequence hole by context-aware feature fusion network through separable convolution network n-1 Time sequence characteristic z learned by n-1 level context perception characteristic fusion network n-1 Fusing the original clinical time sequence x 'and the static demographic data characteristics s' repeated T times to obtain more comprehensive context expression z between the characteristics of the health condition of the inpatients n (ii) a And overlapping the output of each layer of the context-aware feature fusion network through splicing operation to obtain the multi-scale features of the n-th layer of the context-aware feature fusion network
Figure FDA0003684104600000035
Is composed of
Figure FDA0003684104600000036
Wherein
Figure FDA0003684104600000037
Is empty;
The n-th layer of the basic prediction model combines the original clinical time series x' with the last channel of the (n-1)-th layer multi-scale feature $z_{n-1}^{ms}$ through a skip connection to obtain $x_n$; concatenation then fuses $x_n$, the time-series trend $g_n$ output by the TCN layer, and the inter-feature context representation $z_n$ output by the context-aware feature fusion network into the time-series feature $v_n = [x_n, g_n, z_n]$;

The n-th layer of the basic prediction model further applies a feature attention block based on a pointwise convolutional neural network to the time-series feature $v_n$, obtaining a more effective time-series feature $h_n$; $h_n$ is the final output of the n-th layer;

The time-series feature $h_N$ output by the N-th layer of the basic prediction model is taken as the final time-series feature;
The diagnosis data are processed through a fully connected layer and the result is repeated T times to obtain the encoded diagnosis feature d'; the static demographic features s' repeated T times, the encoded diagnosis feature d', the final time-series feature $h_N$, and the multi-scale feature $z_N^{ms}$ are then concatenated, and the concatenated representation is processed through a fully connected layer to obtain the multi-view multi-scale fusion feature $h_{final}$;

Finally, the multi-view multi-scale fusion feature $h_{final}$ is fed into a fully connected layer, and the prediction result is obtained through an activation function.
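Read end to end, the n-th layer above maps $h_{n-1}$ to $h_n$ through the three sub-networks. A hedged sketch of that forward pass follows; the tensor layout and the skip-connection combination are assumptions, and tcn, fusion, and attention stand for the sub-networks defined in the claims:

```python
import torch

def nth_layer_forward(x_prime, s_prime, h_prev, g_prev, z_prev, ms_prev,
                      tcn, fusion, attention):
    """One layer of the basic prediction model (sketch).
    Tensors are (batch, channels, T); tcn/fusion/attention are assumed modules."""
    g_n = tcn(h_prev)                               # trend learned from h_{n-1}
    z_n = fusion(g_prev, z_prev, x_prime, s_prime)  # inter-feature context
    # skip connection: x' combined with the last channel of z_{n-1}^ms
    # (combination by concatenation is an assumption)
    x_n = torch.cat([x_prime, ms_prev[:, -1:, :]], dim=1)
    v_n = torch.cat([x_n, g_n, z_n], dim=1)         # v_n = [x_n, g_n, z_n]
    h_n = attention(v_n)                            # pointwise-conv attention
    ms_n = torch.cat([ms_prev, z_n], dim=1)         # z_n^ms = z_{n-1}^ms || z_n
    return h_n, g_n, z_n, ms_n
```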
4. The convolutional neural network-based ICU length-of-stay and death risk prediction method of claim 3, characterized in that the dilation factor is increased through the stacked temporal convolutional network layers, specifically by 1 at each layer.
5. The convolutional neural network-based ICU length-of-stay and death risk prediction method of claim 4, wherein the context-aware feature fusion network fuses the time-series feature $g_{n-1}$ learned by the (n-1)-th layer temporal dilated separable convolutional network, the time-series feature $z_{n-1}$ learned by the (n-1)-th layer context-aware feature fusion network, the original clinical time series x', and the static demographic features s' repeated T times into a more comprehensive inter-feature context representation $z_n$ of the inpatient's health status, specifically comprising the following steps:

The time-series feature $g_{n-1}$ learned by the (n-1)-th layer temporal dilated separable convolutional network, the time-series feature $z_{n-1}$ learned by the (n-1)-th layer context-aware feature fusion network, the original clinical time series x', and the static demographic features s' repeated T times are combined by concatenation, and a feature attention block based on a pointwise convolutional neural network captures the interrelations among the dynamic features to generate attention features; the attention features are then adjusted through a fully connected layer to obtain the more comprehensive inter-feature context representation $z_n$ of the inpatient's health status; the n-th layer context-aware feature fusion network is computed as:

$$z_n = \big(\mathrm{PWAtt}(E(g_{n-1})) \,\|\, E(z_{n-1}) \,\|\, x' \,\|\, s'\big)\, W_n + b_n$$

where E() is a flattening function; $W_n$ and $b_n$ are the weight and bias of the fully connected layer; $\|$ is the concatenation operation; PWAtt() is the operation of the feature attention block based on the pointwise convolutional neural network; $P_n$ is the size of the first dimension of the weight matrix $W_n$, with $P_n = (R_n \times C) + Z + F + S$, where Z is the feature count of the fused feature $z_{n-1}$ of the (n-1)-th layer context-aware feature fusion network, F is the feature count of the original clinical time series x', and S is the feature count of the static demographic features s'.
6. The convolutional neural network-based ICU length-of-stay and death risk prediction method of claim 5, wherein the feature attention block based on the pointwise convolutional neural network captures the correlations among dynamic features to generate attention features; the block comprises 5 neural network layers, in order: a pointwise convolutional layer with dimensionality-reduction ratio r, a batch normalization layer, a ReLU activation layer, a pointwise convolutional layer with C filters, and a Sigmoid activation layer;

The input feature X satisfies $X \in \mathbb{R}^{F \times T}$, where C is the channel size, F is the dimension of the time-series feature, and T is the length of the time-series feature; the operation of the feature attention block based on the pointwise convolutional neural network is then:

$$X' = A(X) \otimes X$$

where A(X) is the output attention weight map, $A(X) = \sigma(\mathrm{PWConv}_2(\delta(\beta(\mathrm{PWConv}_1(X)))))$; $\mathrm{PWConv}_1()$ and $\mathrm{PWConv}_2()$ are pointwise convolution operations; β is the batch normalization operation; δ is the ReLU function; σ is the Sigmoid activation function; and ⊗ is element-wise multiplication.
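The five-layer block enumerated in claim 6 translates almost line for line into PyTorch; a sketch with assumed channel count and reduction ratio:

```python
import torch
import torch.nn as nn

class PointwiseAttention(nn.Module):
    """Sketch of the claim-6 feature attention block: pointwise conv with
    reduction ratio r -> BatchNorm -> ReLU -> pointwise conv back to C
    filters -> Sigmoid, then element-wise multiplication with the input."""
    def __init__(self, channels, r=4):
        super().__init__()
        self.attend = nn.Sequential(
            nn.Conv1d(channels, channels // r, kernel_size=1),  # PWConv_1
            nn.BatchNorm1d(channels // r),                      # beta
            nn.ReLU(),                                          # delta
            nn.Conv1d(channels // r, channels, kernel_size=1),  # PWConv_2
            nn.Sigmoid(),                                       # sigma
        )

    def forward(self, x):
        # x: (batch, C, T); returns X' = A(X) ⊗ X
        return self.attend(x) * x

# toy usage
att = PointwiseAttention(channels=32, r=4)
y = att(torch.randn(2, 32, 48))  # same shape as the input
```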
7. The convolutional neural network-based ICU length-of-stay and death risk prediction method of claim 6, wherein feeding the multi-view multi-scale fusion feature $h_{final}$ into a fully connected layer and obtaining the prediction result through an activation function specifically comprises:

For the ICU length-of-stay prediction task, the output is processed with an exponential activation function so that lengths of stay can be predicted over their full dynamic range; a HardTanh activation function then clips predictions shorter than a first set duration or longer than a second set duration; the computation is:

$$\hat{y}_t = \tau\big(\exp(h_{final}\, W_{yt} + b_{yt})\big)$$

where $\hat{y}_t$ is the predicted ICU length of stay; exp() is the exponential activation function; τ() is the HardTanh activation function; $W_{yt}$ and $b_{yt}$ are parameters to be learned; and $h_{final}$ is the multi-view multi-scale fusion feature;
For the death risk prediction task, a sigmoid activation function is used for prediction:

$$\hat{y} = \mathrm{sigmoid}(h_{final}\, W_y + b_y)$$

where $\hat{y}$ is the predicted death risk; sigmoid() is the sigmoid activation function; and $W_y$ and $b_y$ are parameters to be learned.
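A hedged PyTorch sketch of both output heads as claim 7 describes them; the clipping bounds standing in for the first and second set durations are assumed values:

```python
import torch
import torch.nn as nn

class PredictionHeads(nn.Module):
    """Sketch of the claim-7 heads. los_min/los_max (in days) are assumed
    stand-ins for the first and second set durations."""
    def __init__(self, hidden_dim, los_min=1 / 48, los_max=100.0):
        super().__init__()
        self.fc_los = nn.Linear(hidden_dim, 1)    # W_yt, b_yt
        self.fc_death = nn.Linear(hidden_dim, 1)  # W_y, b_y
        self.clip = nn.Hardtanh(min_val=los_min, max_val=los_max)  # tau()

    def forward(self, h_final):
        # exponential activation covers the full dynamic range of stays,
        # then HardTanh clips implausibly short or long predictions
        los = self.clip(torch.exp(self.fc_los(h_final)))
        death = torch.sigmoid(self.fc_death(h_final))  # death risk in (0, 1)
        return los, death
```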
8. The convolutional neural network-based ICU length-of-stay and death risk prediction method of claim 7, wherein setting the loss functions in step S3 specifically comprises:

For the ICU length-of-stay prediction task, the loss function $L_t$ is set to the mean squared logarithmic error:

$$L_t = \frac{1}{T} \sum_{t=1}^{T} \big(\log(\hat{y}_t + 1) - \log(y_t + 1)\big)^2$$

where T is the length of the clinical time series; $\hat{y}_t$ is the predicted ICU length of stay; and $y_t$ is the actual ICU length of stay;

For the death risk prediction task, the loss function L is set to the cross-entropy loss:

$$L = -\frac{1}{N} \sum_{i=1}^{N} \big[\, y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i) \,\big]$$

where N is the number of samples; $\hat{y}_i$ is the predicted death risk; and $y_i$ is the true death label of the sample.
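A short sketch of the two loss functions under the reconstruction above (the +1 offset inside the logarithms is an assumption to keep them finite at zero):

```python
import torch

def msle_loss(y_pred, y_true):
    """Mean squared logarithmic error over the T per-time-step LOS predictions."""
    return torch.mean((torch.log(y_pred + 1) - torch.log(y_true + 1)) ** 2)

def death_loss(y_pred, y_true):
    """Binary cross-entropy over N samples for the death-risk task."""
    return torch.nn.functional.binary_cross_entropy(y_pred, y_true)

# toy usage
y_hat = torch.rand(8, 48)       # predicted LOS at each of T=48 steps
y = torch.rand(8, 48) * 10      # actual LOS
print(msle_loss(y_hat, y))
print(death_loss(torch.rand(8, 1), torch.ones(8, 1)))
```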
CN202210645934.5A 2022-06-08 2022-06-08 ICU (intensive care unit) hospitalization duration and death risk prediction method based on convolutional neural network Pending CN114883003A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210645934.5A CN114883003A (en) 2022-06-08 2022-06-08 ICU (intensive care unit) hospitalization duration and death risk prediction method based on convolutional neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210645934.5A CN114883003A (en) 2022-06-08 2022-06-08 ICU (intensive care unit) hospitalization duration and death risk prediction method based on convolutional neural network

Publications (1)

Publication Number Publication Date
CN114883003A true CN114883003A (en) 2022-08-09

Family

ID=82682306

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210645934.5A Pending CN114883003A (en) 2022-06-08 2022-06-08 ICU (intensive care unit) hospitalization duration and death risk prediction method based on convolutional neural network

Country Status (1)

Country Link
CN (1) CN114883003A (en)

Cited By (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115547502A (en) * 2022-11-23 2022-12-30 浙江大学 Hemodialysis patient risk prediction device based on time sequence data
CN115547502B (en) * 2022-11-23 2023-04-07 浙江大学 Hemodialysis patient risk prediction device based on time sequence data
CN116227365A (en) * 2023-05-06 2023-06-06 成都理工大学 Landslide displacement prediction method based on improved VMD-TCN
CN116227365B (en) * 2023-05-06 2023-07-07 成都理工大学 Landslide displacement prediction method based on improved VMD-TCN
CN116364290A (en) * 2023-06-02 2023-06-30 之江实验室 Hemodialysis characterization identification and complications risk prediction system based on multi-view alignment
CN116364290B (en) * 2023-06-02 2023-09-08 之江实验室 Hemodialysis characterization identification and complications risk prediction system based on multi-view alignment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination