CN113679393B - ECG data feature generation model based on contrast predictive coding

Info

Publication number: CN113679393B
Application number: CN202110978438.7A
Authority: CN
Inventors: 孙乐; 任超旭
Original assignee: Nanjing University of Information Science and Technology
Current assignee: Nanjing University of Information Science and Technology
Priority date: 2021-08-25
Filing date: 2021-08-25
Publication date: 2023-05-26
Anticipated expiration: 2041-08-25
Also published as: CN113679393A

Abstract

The invention discloses an ECG data feature generation model based on contrastive predictive coding (CPC), which comprises the following steps. First, the ECG training data are divided: viewed horizontally, the data form positive sample pairs and negative sample pairs, where a positive sample pair consists of data of the same category and a negative sample pair consists of data of different categories; viewed vertically, the data are split into training data and data to be trained. Next, both the training data and the data to be trained are encoded by an encoder, and the encoding of the training data is fed into an autoregressive model to obtain the context information (Context). The Context is passed to a prediction model to obtain predicted values for multiple future steps. Finally, the dot product of the predicted values and the encoded data to be trained is computed to obtain a loss value. The invention can augment data sets that have too few samples and improve the generalization ability of downstream tasks.

Description

ECG data feature generation model based on contrast predictive coding

Technical Field

The invention belongs to the technical field of computer software, and particularly relates to an ECG data feature generation model based on contrastive predictive coding.

Background

Contrastive predictive coding is a form of self-supervised learning. Current self-supervised methods fall into three main types: context-based, temporal-based and contrast-based. Contrast-based self-supervised learning builds representations by learning to encode the similarity or dissimilarity of two items, and it performs strongly. A self-supervised algorithm does not rely on manual labels; instead it derives labels from the data itself by exploiting the relationships between parts of the data. In addition, data problems are ubiquitous in current deep learning applications, and ECG data, as a kind of medical data, suffer from unbalanced class distributions, missing labels and similar issues.

The electrocardiogram (ECG) is effective for diagnosing and analyzing various arrhythmias and conduction blocks, and is of great significance for the diagnosis of coronary heart disease. The ECG mainly reflects the electrical activity of cardiac excitation. Myocardial damage, insufficient blood supply, drugs and electrolyte disturbances can all cause characteristic ECG changes, and such characteristic changes provide a reliable means of diagnosing myocardial infarction. At present, models for ECG classification face a common problem: the class distribution of ECG data is extremely unbalanced, with normal samples far outnumbering abnormal heartbeat samples, so a supervised network cannot obtain enough data for training and its performance cannot be guaranteed. Contrastive predictive coding can generate high-dimensional features consistent with the original categories of the ECG data, which expands the number of samples. At the same time, a scoring function assigns higher scores to pairs of samples from the same class and lower scores to pairs from different classes, which further separates the sample categories. The resulting features can be used in downstream tasks such as classification, where they help prevent overfitting, speed up the convergence of the downstream model and improve its classification accuracy.

Disclosure of Invention

Aiming at the defects in the prior art, the invention provides an ECG data feature generation model based on contrastive predictive coding. It applies the self-supervised contrastive predictive coding model to predict high-dimensional features of the same category as the original ECG data, which enlarges the sample set and reduces the manual labeling cost. Used together with a downstream classification task, it also helps other classification models reduce overfitting and improve classification accuracy.

In order to achieve the above purpose, the present invention adopts the following technical scheme:

An ECG data feature generation model based on contrastive predictive coding, comprising the following steps:

s1, adopting a data set and preprocessing;

s2, dividing ECG training data into positive sample pairs and negative sample pairs, wherein the positive sample pairs are data of the same category, and the negative sample pairs are data of different categories; each positive sample pair and each negative sample pair is further divided into training data and data to be trained;

s3, constructing a contrast predictive coding CPC model, and inputting training data and data to be trained;

encoding the training data and the data to be trained through an encoder, then feeding the encoding of the training data into an autoregressive model to obtain the context information (Context), and passing the Context to a prediction model to obtain predicted values for multiple future steps;

s4, computing the dot product of the predicted values and the encoded data to be trained to obtain a loss value;

s5, training a contrast prediction coding CPC model;

s6, applying the trained CPC model to a downstream classification task.

In order to optimize the technical scheme, the specific measures adopted further comprise:

Further, the preprocessing of the data set in s1 includes:

s11, extracting heartbeats using the R-peak positions annotated in the data set;

s12, resampling the heartbeats;

s13, filtering by using a wavelet transform;

s14, re-labeling the data set, shuffling the data set, dividing the data set into a training set and a validation set, and dividing the training set into two parts, namely training data and data to be trained; and simultaneously constructing positive sample pairs and negative sample pairs.

Further, the autoregressive model construction process in s3 includes:

a GRU is used as the autoregressive model to fuse the history information, with an output dimension of 256, returning only the output of the last time step.

Further, the process of building the prediction model includes:

each fully connected layer has an output dimension of 10 and uses a linear activation function; because the four fully connected layers are held in a list, a Lambda layer is used to join the four fully connected layers side by side to form a network.

Further, in s4, the value obtained from the dot product is mapped into the range [0,1] by a sigmoid function and is used as the output of the contrastive predictive coding (CPC) model.

Further, training the CPC model is performed as follows:

s51, initializing model parameters;

s52, inputting the data into a model for training;

s53, saving the model, and plotting the accuracy on the training set and the validation set.

Further, s6 includes the steps of:

s61, dividing the training data, wherein the data set is divided into 5 parts in order to stay consistent with the trained CPC model;

s62, constructing a classification model, wherein the classification model uses three identical copies of the training data; each copy passes in turn through the encoder part of the CPC model, a one-dimensional convolution layer, a ReLU activation layer, a one-dimensional max-pooling layer, a one-dimensional convolution layer, a ReLU activation layer and a one-dimensional max-pooling layer; the results obtained from the three copies are concatenated and then passed through a Flatten layer and two fully connected layers, and the classification result is finally obtained through a fully connected layer with a softmax activation function;

s63, training the classification model, using categorical_crossentropy as the loss function and rmsprop as the optimizer, with the batch size set to 64, for 10 epochs.

The beneficial effects of the invention are as follows:

(1) The invention is suitable for ECG data under the condition of unbalanced data;

the invention is applicable to rarely occurring arrhythmias for which little ECG data has been collected. When little data is available, CPC mitigates the problems caused by an insufficient amount of data by maximizing mutual information, so that data with too few samples can be augmented and the generalization ability of downstream tasks is improved.

(2) The invention improves the accuracy of generating the same class of ECG data features;

effective features are extracted by the encoder and unnecessary noise is removed, which makes the features more distinct and facilitates subsequent processing. Contrastive predictive coding exploits the mutual information within the data to improve its own prediction ability and at the same time strengthens the feature extraction ability of the encoder, so the extracted features are of good quality. The model achieves good results on the MIT-BIH arrhythmia database.

(3) The training speed of a downstream ECG classification model is accelerated;

through contrastive predictive coding, the encoder encodes the ECG data so that different categories of data are already separated; when the classification model is trained, this improves the convergence speed of the model and accelerates training.

Drawings

FIG. 1 is a flowchart of the application of the contrastive predictive coding of the present invention to ECG.

Fig. 2 is a schematic diagram of a model structure of an encoder of the present invention.

Fig. 3 is a schematic structural diagram of a prediction model of the present invention.

Fig. 4 is a schematic diagram of the relationship between the training data and the data to be trained, and the positive sample pair and the negative sample pair.

FIG. 5 is a graph of accuracy records of training and validation sets of the present invention.

Fig. 6 is a schematic structural diagram of a classification model according to an embodiment of the invention.

FIG. 7 is a schematic diagram of model accuracy in the classification model training process according to an embodiment of the present invention.

Detailed Description

The invention will now be described in further detail with reference to the accompanying drawings.

It should be noted that the terms like "upper", "lower", "left", "right", "front", "rear", and the like are also used for descriptive purposes only and are not intended to limit the scope of the invention in which the invention may be practiced, but rather the relative relationship of the terms may be altered or modified without materially altering the teachings of the invention.

The invention provides an ECG data feature generation model based on contrastive predictive coding. It applies the self-supervised contrastive predictive coding model to predict features of the same category as the original ECG data, which enlarges the sample set and reduces the manual labeling cost, and it also helps other classification models reduce overfitting and improve classification accuracy. The present invention aims to solve the following problems:

1) The ECG data are unbalanced in number. Data are the raw material for model training, and the amount of data often determines the performance of the model. Many models for ECG classification exist today; however, owing to the nature of ECG data, some heartbeat types have very few samples, so a model cannot be trained on enough data, which causes various problems for classification models.

2) Labeling costs are high. In the full-supervision learning training process, a large amount of manually labeled data sets are needed, however, labeling data types consumes a large amount of resources such as manpower, material resources and the like. For ECG specific data, labeling requires a certain expertise, and increases the threshold for manual labeling, making labeling large-scale, more complex data sets increasingly difficult.

3) The prediction accuracy is not high. The final goal of predicting ECG data features using deep learning models is to achieve fast and accurate predictions. Most of the existing models are used for prediction based on the same probability distribution, and feature differences among different types of data are not mined, so that the prediction accuracy is not high.

4) The training speed of downstream tasks needs to be improved. With contrastive predictive coding, the difference between encoded samples of the same class becomes smaller and the difference between encoded samples of different classes becomes larger, so different categories of data are already separated before the downstream task is formally trained, which speeds up the convergence of the downstream task.

The invention mainly comprises the following steps:

Fig. 1 illustrates the working steps of applying contrastive predictive coding to ECG data. First, the ECG training data are divided: viewed horizontally they form positive sample pairs (data of the same class) and negative sample pairs (data of different classes), and viewed vertically they are split into training data and data to be trained. Both the training data and the data to be trained are then encoded by an encoder. The encoding of the training data is fed into an autoregressive model to obtain the context information (Context), and the Context is passed to the prediction model to obtain predicted values for multiple future steps. Finally, the dot product of the predicted values and the encoded data to be trained is computed to obtain the loss value. The loss value is gradually reduced by updating the parameters with gradients obtained through back-propagation. The specific implementation process is as follows:

step 1: data preprocessing

1.1 extracting heartbeats using the R-peak positions annotated in the data set

1.2 resampling the heartbeats

1.3 filtering using a wavelet transform

1.4 re-labeling the data set, shuffling the data set, dividing the data set into a training set and a validation set, and dividing the training set into two parts, namely training data and data to be trained; and simultaneously constructing positive sample pairs and negative sample pairs.

Step 2: constructing CPC model

2.1 building encoder model to extract data features

2.2 constructing an autoregressive network model, inputting the features extracted in 2.1 into the autoregressive network model, and fusing the history information with a GRU as the autoregressive model to obtain a feature fusion vector c of the history information.

2.3 constructing a prediction model, and inputting the feature fusion vector obtained in 2.2 into the prediction model to generate a prediction result.

2.4 constructing the model output: the dot product is computed between the prediction result obtained in 2.3 and the result obtained by passing the data to be trained through the encoder. By the nature of the vector dot product, the larger the result, the smaller the difference between the two vectors. The result is mapped into the range [0,1] by a sigmoid function and used as the output of the model.

2.5 constructing the CPC model, wherein the input data of the CPC model comprise the training data and the data to be trained, and the output is computed as in 2.4.

Step 3: training CPC model

3.1 initializing model parameters

3.2 inputting data into the model for training

3.3 saving the model, and plotting the accuracy on the training set and the validation set. At this point the model can generate features of the same class as the original ECG data, which can be used for training downstream tasks such as classification tasks.

Step 4: classification task

4.1 randomly shuffling the training data, and dividing the training data into 5 folds following the idea of K-fold cross validation.

4.2 constructing a classification model, whose distinguishing feature is that three identical copies of the training data are used as input instead of only one.

4.3 training the model.

The following is one embodiment of the present invention.

Step 1: data preprocessing

1.1 Heartbeats are extracted using the R-peak positions. This experiment uses the MIT-BIH arrhythmia database as the data set. The database contains 48 two-lead ECG recordings; in all but a few recordings the first lead is lead II. Each recording is 30 minutes long with a sampling rate of 360 Hz, giving 650000 samples per recording. Because the four records 102, 104, 107 and 217 contain paced beats, these four records are removed. Taking the R peak as the reference point, the 0.4 s before it and the 0.5 s after it are taken as one heartbeat. At a sampling rate of 360 Hz the heartbeat length is therefore 0.4×360+0.5×360=324 samples, and each heartbeat is resampled to 251 samples. According to the standard of the Association for the Advancement of Medical Instrumentation (AAMI), all heartbeats can be grouped into five categories: normal beat (N), supraventricular ectopic beat (SVEB), ventricular ectopic beat (VEB), fusion beat (F) and unknown beat (Q). The heartbeats are classified as they are extracted, and the numbers of N, SVEB, VEB, F and Q heartbeats obtained are 90081, 2781, 7008, 802 and 15, respectively. Since Q denotes unclassified heartbeats, the final data set contains only the four classes N, SVEB, VEB and F, with labels encoded as 0, 1, 2 and 3.
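The segmentation arithmetic in step 1.1 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names are hypothetical, the handling of R peaks near the record edges is an assumption, and linear interpolation is an assumed resampling method, since the text does not name one.

```python
def extract_beat(signal, r_index, fs=360, pre_s=0.4, post_s=0.5):
    """Cut one beat around an annotated R peak.

    Returns None when the window would run past the record edges
    (an assumed policy; the text does not say how such beats are handled).
    """
    pre = round(pre_s * fs)    # 144 samples before the R peak
    post = round(post_s * fs)  # 180 samples after the R peak
    start, stop = r_index - pre, r_index + post
    if start < 0 or stop > len(signal):
        return None
    return signal[start:stop]  # 144 + 180 = 324 samples


def resample_linear(beat, target_len=251):
    """Resample a beat to target_len points by linear interpolation (assumed method)."""
    n = len(beat)
    out = []
    for k in range(target_len):
        pos = k * (n - 1) / (target_len - 1)  # fractional index into the beat
        lo = int(pos)
        hi = min(lo + 1, n - 1)
        frac = pos - lo
        out.append(beat[lo] * (1.0 - frac) + beat[hi] * frac)
    return out
```

The first and last samples of the beat are preserved by this interpolation, so the 0.9 s window is mapped onto 251 evenly spaced points.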

1.2 All data and their corresponding labels are shuffled together so that each heartbeat stays paired with its label. The first 90% of the data is taken as the training set and the last 10% as the validation set.
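A minimal sketch of the paired shuffle and 90/10 split in step 1.2 is given below. The function name and the fixed seed are illustrative assumptions; the point is only that beats and labels are permuted with the same index order.

```python
import random


def shuffle_and_split(beats, labels, train_fraction=0.9, seed=0):
    """Shuffle beats and labels with one shared permutation, then split."""
    idx = list(range(len(beats)))
    random.Random(seed).shuffle(idx)       # one permutation for both lists
    cut = int(len(idx) * train_fraction)   # first 90% go to the training set
    train = [(beats[i], labels[i]) for i in idx[:cut]]
    val = [(beats[i], labels[i]) for i in idx[cut:]]
    return train, val
```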

1.3 The signal is filtered with a wavelet transform. The wavelet basis is db6. Filtering sets the wavelet coefficients for components below 5 Hz and above 90 Hz to 0, so that only the coefficients of the 3rd to 6th detail subbands are kept for reconstruction.
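To relate detail subbands to frequencies, the sketch below computes the nominal band of each detail level of a dyadic discrete wavelet transform, where level j covers roughly fs/2^(j+1) to fs/2^j. It assumes ideal half-band splitting and a sampling rate of 360 Hz (the rate of the raw recordings); real db6 filters overlap between bands, and the effective rate would differ if filtering is applied after resampling. The function names are hypothetical.

```python
def detail_band_hz(level, fs=360.0):
    """Nominal (low, high) frequency range in Hz of detail level `level`."""
    return fs / 2 ** (level + 1), fs / 2 ** level


def kept_range_hz(first_level=3, last_level=6, fs=360.0):
    """Nominal frequency range spanned by the retained detail levels."""
    low, _ = detail_band_hz(last_level, fs)
    _, high = detail_band_hz(first_level, fs)
    return low, high


if __name__ == "__main__":
    for level in range(1, 7):
        low, high = detail_band_hz(level)
        print(f"D{level}: {low:.2f} to {high:.2f} Hz")
```

Under this idealisation, detail levels 3 to 6 together span about 2.8 Hz to 45 Hz at 360 Hz.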

1.4 Positive and negative samples are generated. The two members of a positive sample pair belong to the same category, and the two members of a negative sample pair belong to different categories. The sample data are divided into training data and data to be trained, both of which have shape (32,4,151).

Step 2: constructing the CPC model

2.1 Building an encoder model. The overall architecture of the encoder is shown in fig. 2. The first half consists of four blocks, each made up of a fully connected layer, a batch normalization layer and a LeakyReLU activation layer, where the fully connected layer has an output dimension of 64. After a Flatten layer, one more block follows whose fully connected layer has an output dimension of 256, and the final fully connected layer has an output dimension of 10. All fully connected layers use linear activation functions. The function of the encoder is to extract the features of the training data.

2.2 constructing an autoregressive network model. The autoregressive network model uses GRU (Gated Recurrent Unit) with an output dimension of 256, returning only the output of the last cell. This part results in a feature fusion vector c that fuses the history information.
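The idea of "returning only the output of the last cell" can be illustrated with a small pure-Python GRU. This is a sketch under stated assumptions: the weights are random placeholders, the update convention h = (1 - z)·h + z·candidate is one of two common conventions (some libraries swap the roles of z and 1 - z), and the demo uses a small hidden size instead of 256 to stay readable. All names are hypothetical.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def matvec(matrix, vector):
    return [sum(a * b for a, b in zip(row, vector)) for row in matrix]


def init_gru(input_dim, hidden_dim, seed=0):
    """Random illustrative weights for the update (z), reset (r) and candidate (h) parts."""
    rng = random.Random(seed)

    def mat(rows, cols):
        return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

    return {
        name: (mat(hidden_dim, input_dim), mat(hidden_dim, hidden_dim), [0.0] * hidden_dim)
        for name in ("z", "r", "h")
    }


def gate(params, x, h, activation):
    """activation(W x + U h + b), computed element-wise."""
    w, u, b = params
    return [activation(a + c + d) for a, c, d in zip(matvec(w, x), matvec(u, h), b)]


def gru_last_output(sequence, params, hidden_dim):
    """Run the GRU over the encoded steps and return only the final hidden state c."""
    h = [0.0] * hidden_dim
    for x in sequence:
        z = gate(params["z"], x, h, sigmoid)                 # update gate
        r = gate(params["r"], x, h, sigmoid)                 # reset gate
        reset_h = [ri * hi for ri, hi in zip(r, h)]
        candidate = gate(params["h"], x, reset_h, math.tanh)
        h = [(1.0 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, candidate)]
    return h
```

For example, `gru_last_output(encoded_steps, init_gru(10, 256), 256)` would turn a list of 10-dimensional encoder outputs into one 256-dimensional context vector.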

2.3 Building a prediction model. The prediction model is shown in fig. 3. Each fully connected layer has an output dimension of 10 and uses a linear activation function. Because the four fully connected layers are held in a list, a Lambda layer is used to join the four fully connected layers side by side to form a network. The feature fusion vector output by the autoregressive model is input into the prediction model, which outputs the predicted result.
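The structure of the prediction model (four linear heads applied to the same context vector, with their outputs stacked) can be sketched as follows. The weights are random placeholders and the helper names are hypothetical; the sketch only shows how one 256-dimensional context vector yields four 10-dimensional predictions, one per future step.

```python
import random


def make_linear(in_dim, out_dim, rng):
    """One fully connected layer with a linear activation (random placeholder weights)."""
    weights = [[rng.gauss(0.0, 0.05) for _ in range(in_dim)] for _ in range(out_dim)]
    bias = [0.0] * out_dim
    return weights, bias


def apply_linear(layer, x):
    weights, bias = layer
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, bias)]


def predict_steps(context, heads):
    """Apply every head to the same context and stack the results: shape (steps, 10)."""
    return [apply_linear(head, context) for head in heads]


rng = random.Random(0)
heads = [make_linear(256, 10, rng) for _ in range(4)]   # one head per future step
predictions = predict_steps([0.1] * 256, heads)
```

Each head has its own weights, so the four predicted vectors differ even though they share one input.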

2.4 Building an output model. The output model computes the dot product between the encoding of the data to be trained produced by the encoder and the prediction generated by the prediction model, averages the dot product values, and maps the average into the range [0,1] through a sigmoid function.

2.5 Constructing the overall CPC model. The inputs are the training data and the data to be trained. The training data pass through the encoder, the autoregressive model and the prediction model to give result 1; the data to be trained pass through the encoder to give result 2. The dot product of result 1 and result 2 is computed and passed through a sigmoid function to give the output.
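The output computation of steps 2.4 and 2.5 can be sketched in a few lines. The axes over which the average is taken are an assumption (here: mean over the feature dimension of each step, then mean over steps), because the text only says that the dot product is averaged before the sigmoid. The function name is hypothetical.

```python
import math


def cpc_score(predictions, encoded_targets):
    """Sigmoid of the averaged dot product between result 1 and result 2.

    predictions, encoded_targets: lists of K vectors of equal length
    (K predicted future steps, each compared with one encoded target).
    """
    per_step = []
    for pred, target in zip(predictions, encoded_targets):
        per_step.append(sum(p * t for p, t in zip(pred, target)) / len(pred))
    mean_dot = sum(per_step) / len(per_step)
    return 1.0 / (1.0 + math.exp(-mean_dot))
```

A prediction that points in the same direction as the encoded target gives a score above 0.5, and one that points in the opposite direction gives a score below 0.5, which is what allows the output to be trained against a 0/1 pair label.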

Step 3: initializing the model parameters and inputting the ECG data into the model for training

3.1 Initializing the model parameters. The learning rate is set to 0.001, the batch size to 32 and the number of iterations to 10. The Adam optimizer is used, and the loss function is binary_crossentropy. The learning rate is adjusted during training so that the model converges more quickly: when the model performance has not improved for 2 epochs, the learning rate is reduced to 1/3 of its previous value.
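The learning-rate rule in step 3.1 resembles a reduce-on-plateau schedule. The sketch below is an illustration under assumptions: the monitored quantity is taken to be a validation loss where lower is better, and the reduction is applied multiplicatively to the current rate. The function name is hypothetical.

```python
def reduce_on_plateau(val_losses, initial_lr=0.001, factor=1.0 / 3.0, patience=2):
    """Return the learning rate used in each epoch.

    The rate is multiplied by `factor` once the monitored loss has failed
    to improve for `patience` consecutive epochs.
    """
    lr = initial_lr
    best = float("inf")
    wait = 0
    schedule = []
    for loss in val_losses:
        schedule.append(lr)        # rate in effect during this epoch
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                lr *= factor       # takes effect from the next epoch
                wait = 0
    return schedule
```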

3.2 Generating the training data and the data to be trained. The relationship between the training data and the data to be trained, and between the positive and negative sample pairs, is shown in fig. 4. The number of positive sample pairs equals the number of negative sample pairs. Because the classes in the training set differ in size, the N, SVEB, VEB and F categories are generated with probabilities 0.1, 0.3, 0.2 and 0.4, respectively, so that the categories with fewer samples are trained sufficiently. The data produced when the training data pass through the encoder, the autoregressive model and the prediction model are the prediction data; the data produced when the data to be trained pass through the encoder are the data to be predicted. The model is then trained on the training data and the data to be trained.
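A simplified sketch of the pair sampling in step 3.2 is shown below. It draws the class of the first member with the stated probabilities and produces equal numbers of positive and negative pairs per batch. It is a simplification: each member here is a single beat rather than a sequence of beats, and choosing the negative class uniformly among the other classes is an assumption. All names are hypothetical.

```python
import random

# Class sampling probabilities stated in the text.
CLASS_PROBS = {"N": 0.1, "SVEB": 0.3, "VEB": 0.2, "F": 0.4}


def sample_pair(beats_by_class, positive, rng):
    """Return (training beat, beat to be trained, label); label 1 means same class."""
    classes = list(CLASS_PROBS)
    weights = [CLASS_PROBS[c] for c in classes]
    anchor_class = rng.choices(classes, weights=weights, k=1)[0]
    if positive:
        other_class = anchor_class
    else:
        # Assumed policy: pick uniformly among the remaining classes.
        other_class = rng.choice([c for c in classes if c != anchor_class])
    anchor = rng.choice(beats_by_class[anchor_class])
    other = rng.choice(beats_by_class[other_class])
    return anchor, other, 1 if positive else 0


def sample_batch(beats_by_class, batch_size=32, seed=0):
    """Half positive and half negative pairs, in shuffled order."""
    rng = random.Random(seed)
    batch = [
        sample_pair(beats_by_class, positive=(i % 2 == 0), rng=rng)
        for i in range(batch_size)
    ]
    rng.shuffle(batch)
    return batch
```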

3.3 Saving the model. The trained model is saved, and the accuracy on the training set and the validation set during training is plotted, as shown in fig. 5. The curves in the figure indicate that the model trains well.

Step 4: applying the trained CPC model to downstream classification tasks

4.1 Dividing the training data. To stay consistent with the trained CPC model, the MIT-BIH arrhythmia database is used again, and the training set data are divided into 5 parts.
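The division into 5 parts can be sketched as a shuffled K-fold index split. The function name and seed are illustrative assumptions; how the five parts are then used for training and evaluation is not specified in the text and is not modelled here.

```python
import random


def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k folds of near-equal size."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]   # every k-th index goes to the same fold
```

Every index appears in exactly one fold, and fold sizes differ by at most one sample.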

4.2 Constructing a classification model. The classification model uses three identical copies of the training data. Each copy passes in turn through the encoder part of the CPC model, a one-dimensional convolution layer, a ReLU activation layer, a one-dimensional max-pooling layer, a one-dimensional convolution layer, a ReLU activation layer and a one-dimensional max-pooling layer. The results obtained from the three copies are concatenated and then passed through a Flatten layer and two fully connected layers, and the classification result is finally obtained through a fully connected layer with a softmax activation function. The concrete model structure is shown in fig. 6.

4.3 Training the classification model. The loss function is categorical_crossentropy and the optimizer is rmsprop; the batch size is set to 64 and the model is trained for 10 epochs. The model accuracy during training is shown in fig. 7.
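For reference, the softmax output and the categorical cross-entropy loss used in step 4.3 can be written out for the four classes (N, SVEB, VEB, F) as below. This is a generic textbook sketch, not code from the patent; the clipping constant eps is an assumption added for numerical safety.

```python
import math


def softmax(logits):
    """Convert raw scores into class probabilities that sum to 1."""
    m = max(logits)                          # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def categorical_crossentropy(one_hot, probs, eps=1e-7):
    """Negative log-probability assigned to the true class."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))
```

With four equally likely classes the loss equals log(4), about 1.386, which is a useful baseline when reading a training curve.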

The invention is applicable to ECG data under the condition of unbalanced data.

The invention improves the accuracy of generating the same class of ECG data features.

The invention accelerates the training speed of the downstream ECG classification model.

The above is only a preferred embodiment of the present invention, and the protection scope of the present invention is not limited to the above examples, and all technical solutions belonging to the concept of the present invention belong to the protection scope of the present invention. It should be noted that modifications and adaptations to the invention without departing from the principles thereof are intended to be within the scope of the invention as set forth in the following claims.

Claims

1. An ECG data feature generation model based on contrastive predictive coding, comprising the steps of:

s1, adopting a data set and preprocessing;

s5, training a contrast prediction coding CPC model;

and s6, applying the trained CPC model to arrhythmia classification tasks.

2. The contrast prediction encoding based ECG data feature generation model of claim 1, wherein the preprocessing of the data set in s1 comprises:

s11, acquiring heart beats by adopting R peak positions marked by the data set;

s12, resampling the heart beat;

s13, filtering by using wavelet transformation;

3. The contrast prediction encoding based ECG data feature generation model of claim 1, wherein the autoregressive model construction process in s3 comprises:

4. The contrast prediction encoding-based ECG data feature generation model of claim 1, wherein the process of constructing the prediction model comprises:

the output dimension of each fully connected layer is 10, and a linear activation function is used; because the four fully connected layers are held in a list, a Lambda layer is used to join the four fully connected layers side by side to form a network.

5. The ECG data feature generation model based on contrast prediction coding according to claim 1, wherein in s4, the value obtained from the dot product is mapped into the range [0,1] using a sigmoid function and is used as the output of the contrast prediction coding CPC model.

6. The contrast prediction encoding based ECG data feature generation model of claim 1, wherein the CPC model is trained as follows:

s51, initializing model parameters;

s52, inputting the data into a model for training;

7. The contrast prediction encoding based ECG data feature generation model of claim 1, wherein s6 comprises the steps of:

s61, dividing training data, and dividing a data set into 5 parts in order to keep consistency with a trained CPC model;

s62, constructing a classification model, wherein the classification model uses three identical training data; each training data sequentially passes through an encoder part of the CPC, a one-dimensional convolution layer, a relu activation layer, a one-dimensional maximum pooling layer, a one-dimensional convolution layer, a relu activation layer and a one-dimensional maximum pooling layer; splicing the results obtained by the three data, then connecting a flat layer and two full-connection layers, and finally obtaining a classification result through the full-connection layer with an activation function of softmax;