Disclosure of Invention
In order to overcome the defects of the existing technology, the invention provides a method that is suitable for classification detection of data in different modalities, can analyze signals from multiple angles, and thereby further improves classification accuracy.
The technical scheme adopted by the invention for overcoming the technical problems is as follows:
a multi-modal data classification method based on deep learning and time sequence attention mechanism comprises the following steps:
a) Selecting training-a in PhysioNet/CinC Challenge 2016 as a data set, expanding the data set, and dividing the expanded data set into a training set and a test set;
b) Establishing an electrocardiosignal model, wherein the electrocardiosignal model consists of a PC module, a TBG module and a classification module in sequence;
c) Resampling the electrocardiosignals in the training set and the test set to 2048 sampling points, and then carrying out z-score normalization to obtain the normalized electrocardiosignal x'_ecg;
d) Inputting the normalized electrocardiosignal x'_ecg in the training set into the PC module of the electrocardiosignal model, and outputting the characteristic signal X_1, wherein the PC module is composed, in sequence, of four convolution branches and a 1×1 convolution block;
e) Inputting the characteristic signal X_1 into the TBG module of the electrocardiosignal model, and outputting the characteristic signal X_2, wherein the TBG module consists of 3 convolutional coding modules and a bidirectional GRU layer with a TPA mechanism;
f) Inputting the characteristic signal X_2 into the classification module of the electrocardiosignal model, and outputting the prediction category f_ecg, wherein the classification module is composed, in sequence, of a fully connected layer and a Softmax activation layer;
g) Repeating steps d) to f) N times, and obtaining the trained optimal electrocardiosignal model by minimizing the cross entropy loss function with an SGD optimizer;
h) Establishing a heart sound signal model which sequentially consists of a PC module, a TBG module and a classification module;
i) Resampling the heart sound signals in the training set and the test set to 8000 sampling points, and then carrying out z-score normalization to obtain the normalized heart sound signal x'_pcg;
j) Inputting the normalized heart sound signal x'_pcg in the training set into the PC module of the heart sound signal model, and outputting the characteristic signal Y_1, wherein the PC module is composed, in sequence, of four convolution branches and a 1×1 convolution block;
k) Inputting the characteristic signal Y_1 into the TBG module of the heart sound signal model, and outputting the characteristic signal Y_2, wherein the TBG module consists of 4 convolutional coding modules and a bidirectional GRU layer with a TPA mechanism;
l) Inputting the characteristic signal Y_2 into the classification module of the heart sound signal model, and outputting the prediction category f_pcg, wherein the classification module is composed, in sequence, of a fully connected layer and a Softmax activation layer;
m) Repeating steps j) to l) M times, and obtaining the trained optimal heart sound signal model by minimizing the cross entropy loss function with an SGD optimizer;
n) Manually re-dividing the data set into a new training set and a new test set at a ratio of 4:1, inputting the new training set into the optimal electrocardiosignal model and outputting the 64-dimensional characteristic signal X_3 through its TBG module, inputting the new training set into the optimal heart sound signal model and outputting the 64-dimensional characteristic signal Y_3 through its TBG module, and calculating the spliced 128-dimensional feature fusion signal PP_x by the formula PP_x = [X_3, Y_3];
o) Inputting the feature fusion signal PP_x into an XGBoost classifier to obtain the importance score ranking of the feature fusion signal PP_x, selecting the signals ranked in the top 64 of the importance scores as the characteristic signal PP_x^1, selecting the optimal hyper-parameters by 5-fold cross validation, and training the XGBoost classifier with the optimal hyper-parameters to obtain the optimized XGBoost classifier;
p) Inputting the new test set into the optimal electrocardiosignal model and outputting the 64-dimensional characteristic signal X_4 through its TBG module, inputting the new test set into the optimal heart sound signal model and outputting the 64-dimensional characteristic signal Y_4 through its TBG module, and calculating the spliced 128-dimensional feature fusion signal PP_c by the formula PP_c = [X_4, Y_4];
q) Inputting the feature fusion signal PP_c into the XGBoost classifier to obtain the importance score ranking of the feature fusion signal PP_c, and selecting the signals ranked in the top 64 of the importance scores as the characteristic signal PP_c^1.
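The fusion and selection of steps n) to q) amount to concatenating the two 64-dimensional TBG outputs and keeping the 64 fused dimensions with the highest importance scores. A minimal numpy sketch of that ranking-and-selection logic follows; the batch of samples and the importance scores are random stand-ins (in the method the scores would come from the trained XGBoost classifier, e.g. its feature_importances_ attribute):

```python
import numpy as np

rng = np.random.default_rng(2)

# Fused features PP_x = [X_3, Y_3] for a batch of (hypothetical) samples
X3 = rng.standard_normal((200, 64))    # 64-dim TBG features, ECG model
Y3 = rng.standard_normal((200, 64))    # 64-dim TBG features, PCG model
PP = np.concatenate([X3, Y3], axis=1)  # spliced 128-dim fusion signal

# Importance scores would come from the trained XGBoost classifier;
# random values stand in here.
importance = rng.random(128)

top64 = np.argsort(importance)[::-1][:64]  # indices of the 64 highest scores
PP1 = PP[:, top64]                         # selected characteristic signal PP^1
print(PP.shape, PP1.shape)  # (200, 128) (200, 64)
```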
Preferably, the data set is expanded in step a) by using a sliding window segmentation method, and the data set is divided into 5 different training sets and test sets by using a five-fold cross validation method.
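The sliding-window expansion of step a) can be sketched as follows. The 2048-sample window matches the ECG resampling length of step c), while the stride (here 1024, i.e. 50% overlap) is an assumption, since the text does not specify it:

```python
import numpy as np

def sliding_window_segments(signal, window, stride):
    """Cut a 1-D signal into overlapping fixed-length segments."""
    n = (len(signal) - window) // stride + 1
    return np.stack([signal[i * stride : i * stride + window] for i in range(n)])

# A 10 000-sample recording cut into 2048-sample windows with 50% overlap
sig = np.arange(10_000, dtype=float)
segs = sliding_window_segments(sig, window=2048, stride=1024)
print(segs.shape)  # (8, 2048)
```

Each segment then counts as an independent training example, multiplying the size of the data set.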
Further, in step c), the normalized electrocardiosignal x'_ecg is calculated by the formula
x'_ecg = (x_ecg − u_ecg) / σ_ecg,
where x_ecg is the electrocardiosignal in the training set and the test set, u_ecg is the mean of the electrocardiosignal, and σ_ecg is the standard deviation of the electrocardiosignal.
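The z-score normalization above can be sketched directly (a toy array stands in for a resampled ECG record):

```python
import numpy as np

def z_score(x):
    """x' = (x - mean(x)) / std(x): zero mean, unit standard deviation."""
    return (x - x.mean()) / x.std()

x_ecg = np.array([1.0, 2.0, 3.0, 4.0, 5.0])  # stand-in for a resampled ECG
x_norm = z_score(x_ecg)
print(x_norm.mean(), x_norm.std())  # ~0.0, ~1.0
```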
Further, step d) comprises the following steps:
d-1) The first convolution branch is composed, in sequence, of a convolution layer with 32 channels, convolution kernel size 1×15 and step size 1, a batch normalization layer and a ReLU activation layer; the normalized electrocardiosignal x'_ecg in the training set is input into the first convolution branch, and the 32-dimensional characteristic signal E_1 is output;
d-2) The second convolution branch is composed, in sequence, of a convolution layer with 32 channels, convolution kernel size 1×13 and step size 1, a batch normalization layer and a ReLU activation layer; the normalized electrocardiosignal x'_ecg in the training set is input into the second convolution branch, and the 32-dimensional characteristic signal E_2 is output;
d-3) The third convolution branch is composed, in sequence, of a convolution layer with 32 channels, convolution kernel size 1×9 and step size 1, a batch normalization layer and a ReLU activation layer; the normalized electrocardiosignal x'_ecg in the training set is input into the third convolution branch, and the 32-dimensional characteristic signal E_3 is output;
d-4) The fourth convolution branch is composed, in sequence, of a convolution layer with 32 channels, convolution kernel size 1×5 and step size 1, a batch normalization layer and a ReLU activation layer; the normalized electrocardiosignal x'_ecg in the training set is input into the fourth convolution branch, and the 32-dimensional characteristic signal E_4 is output;
d-5) The characteristic signals E_1, E_2, E_3 and E_4 are feature-cascaded to obtain the cascaded 128-dimensional characteristic signal E = [E_1, E_2, E_3, E_4];
d-6) The 1×1 convolution block is composed of a convolution layer with 16 channels, convolution kernel size 1×1 and step size 1, and a ReLU activation layer; the 128-dimensional characteristic signal E = [E_1, E_2, E_3, E_4] is input into the 1×1 convolution block, and the 16-dimensional characteristic signal X_1 is output.
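Steps d-1) to d-6) form an inception-style block: four parallel 1-D convolutions with different kernel sizes, a channel concatenation, and a 1×1 convolution that mixes the 128 channels down to 16. A simplified numpy sketch, with batch normalization omitted and random kernels standing in for learned weights:

```python
import numpy as np

rng = np.random.default_rng(0)

def conv_branch(x, n_filters, kernel_size):
    """One PC branch: n_filters 1-D convolutions ('same' padding) + ReLU.
    Batch normalization is omitted for brevity."""
    kernels = rng.standard_normal((n_filters, kernel_size)) * 0.1
    out = np.stack([np.convolve(x, k, mode="same") for k in kernels])
    return np.maximum(out, 0.0)          # ReLU

x = rng.standard_normal(2048)            # normalized ECG, 2048 samples

# Four parallel branches with kernel sizes 15, 13, 9, 5 (step d)
branches = [conv_branch(x, 32, k) for k in (15, 13, 9, 5)]
E = np.concatenate(branches, axis=0)     # feature cascade -> (128, 2048)

# 1x1 convolution = per-timestep linear mix of 128 channels down to 16
W = rng.standard_normal((16, 128)) * 0.1
X1 = np.maximum(W @ E, 0.0)              # (16, 2048)
print(E.shape, X1.shape)
```

The different kernel sizes let the block see the signal at several temporal scales at once, which is what gives the PC module its multi-angle view of the waveform.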
Further, step e) comprises the steps of:
e-1) The first convolutional coding module is composed, in sequence, of a convolution layer with 32 channels and convolution kernel size 1×11, a batch normalization layer, a ReLU activation layer and a pooling layer of size 4; the characteristic signal X_1 is input into the first convolutional coding module, and the 32-dimensional characteristic signal E_5 is output;
e-2) The second convolutional coding module is composed, in sequence, of a convolution layer with 64 channels and convolution kernel size 1×7, a batch normalization layer, a ReLU activation layer and a pooling layer of size 2; the characteristic signal E_5 is input into the second convolutional coding module, and the 64-dimensional characteristic signal E_6 is output;
e-3) The third convolutional coding module is composed, in sequence, of a convolution layer with 128 channels and convolution kernel size 1×3, a batch normalization layer, a ReLU activation layer and a pooling layer of size 2; the characteristic signal E_6 is input into the third convolutional coding module, and the 128-dimensional characteristic signal E_7 is output;
e-4) The characteristic signal E_7 is input into a bidirectional GRU layer with 32 units and a TPA mechanism, and the 64-dimensional characteristic signal X_2 is output. In the bidirectional GRU layer with the TPA mechanism, the characteristic signal X_2 is calculated by the formulas
τ_i = σ((G_i^C)^T w_k g_t),
X_2 = Σ_{i=1}^{n} τ_i G_i^C,
where i = {1,2,...,n}, n = 128, T is the transpose, τ_i is the attention weight of the i-th row vector, σ(·) is the sigmoid function, G_i^C is the i-th row of the time pattern matrix G^C, G^C = Conv1d(G), Conv1d(·) is a one-dimensional convolution operation, G is the hidden state matrix, g_i, i = {1,2,...,t−1}, is the hidden state vector of the bidirectional GRU at step i, t is the time, w_k is a weight coefficient, and g_t is the hidden state vector of the bidirectional GRU at time t.
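A numpy sketch of the temporal pattern attention step described above. The exact way the attention-weighted context is combined with the GRU output is not fully spelled out in the text, so this is illustrative only; random matrices stand in for the learned parameters:

```python
import numpy as np

rng = np.random.default_rng(1)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tpa(G, g_t, filters, W_k):
    """Sketch of temporal pattern attention over past hidden states.

    G       : (n, T) hidden-state matrix (row i = hidden dimension i over time)
    g_t     : (n,)   bidirectional-GRU hidden state at the current time t
    filters : (k, w) 1-D CNN kernels convolved along the time axis
    W_k     : (k, n) weight coefficients of the scoring function
    """
    # Time pattern matrix G^C = Conv1d(G): row i holds the k filter
    # responses of hidden dimension i (last valid position kept).
    GC = np.array([[np.convolve(G[i], f, mode="valid")[-1] for f in filters]
                   for i in range(G.shape[0])])          # (n, k)
    tau = sigmoid(GC @ (W_k @ g_t))  # attention weight per row, tau_i in (0, 1)
    v = tau @ GC                     # attention-weighted sum of rows of G^C
    return tau, v

n, T, k, w = 128, 32, 16, 3
G = rng.standard_normal((n, T))
g_t = rng.standard_normal(n)
tau, v = tpa(G, g_t, rng.standard_normal((k, w)) * 0.1,
             rng.standard_normal((k, n)) * 0.1)
print(tau.shape, v.shape)  # (128,) (16,)
```

Note the sigmoid (rather than softmax) scoring: each row can attend independently, which is the distinguishing design choice of TPA.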
Further, in step g), N is 150, the learning rate of the SGD optimizer is 0.001, and the learning rate is decayed by a factor of 0.1 every 80 epochs; the cross entropy loss function cc(x) is calculated by the formula
cc(x) = −Σ_{i=1}^{L} f̂_i(x) log f_i(x),
where L is the number of categories, L = 2, f_i(x) is the predicted label of the i-th category of the prediction category f_ecg, and f̂_i(x) is the true label of the corresponding i-th category. In step m), M is 180, the learning rate of the SGD optimizer is 0.001, and the learning rate is decayed by a factor of 0.1 every 90 epochs; the cross entropy loss function cc(y) is calculated by the formula
cc(y) = −Σ_{i=1}^{L} f̂_i(y) log f_i(y),
where L is the number of categories, L = 2, f_i(y) is the predicted label of the i-th category of the prediction category f_pcg, and f̂_i(y) is the true label of the i-th category.
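The step-decay learning rate schedule and the two-class cross entropy loss described above can be sketched directly:

```python
import math

def learning_rate(epoch, base_lr=0.001, decay=0.1, step=80):
    """Step decay: the rate is multiplied by `decay` every `step` epochs."""
    return base_lr * decay ** (epoch // step)

def cross_entropy(true_labels, predicted, eps=1e-12):
    """cc(x) = -sum over the L categories of (true label) * log(predicted)."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(true_labels, predicted))

print(learning_rate(0), learning_rate(80), learning_rate(160))
print(cross_entropy([1, 0], [0.8, 0.2]))  # -log(0.8), about 0.223
```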
Further, in step i), the normalized heart sound signal x'_pcg is calculated by the formula
x'_pcg = (x_pcg − u_pcg) / σ_pcg,
where x_pcg is the heart sound signal in the training set and the test set, u_pcg is the mean of the heart sound signal, and σ_pcg is the standard deviation of the heart sound signal.
Further, step j) comprises the following steps:
j-1) The first convolution branch is composed, in sequence, of a convolution layer with 32 channels, convolution kernel size 1×15 and step size 2, a batch normalization layer and a ReLU activation layer; the normalized heart sound signal x'_pcg in the training set is input into the first convolution branch, and the 32-dimensional characteristic signal P_1 is output;
j-2) The second convolution branch is composed, in sequence, of a convolution layer with 32 channels, convolution kernel size 1×11 and step size 2, a batch normalization layer and a ReLU activation layer; the normalized heart sound signal x'_pcg in the training set is input into the second convolution branch, and the 32-dimensional characteristic signal P_2 is output;
j-3) The third convolution branch is composed, in sequence, of a convolution layer with 32 channels, convolution kernel size 1×9 and step size 2, a batch normalization layer and a ReLU activation layer; the normalized heart sound signal x'_pcg in the training set is input into the third convolution branch, and the 32-dimensional characteristic signal P_3 is output;
j-4) The fourth convolution branch is composed, in sequence, of a convolution layer with 32 channels, convolution kernel size 1×5 and step size 2, a batch normalization layer and a ReLU activation layer; the normalized heart sound signal x'_pcg in the training set is input into the fourth convolution branch, and the 32-dimensional characteristic signal P_4 is output;
j-5) The characteristic signals P_1, P_2, P_3 and P_4 are feature-cascaded to obtain the cascaded 128-dimensional characteristic signal P = [P_1, P_2, P_3, P_4];
j-6) The 1×1 convolution block is composed of a convolution layer with 32 channels, convolution kernel size 1×1 and step size 1, and a ReLU activation layer; the 128-dimensional characteristic signal P = [P_1, P_2, P_3, P_4] is input into the 1×1 convolution block, and the 32-dimensional characteristic signal Y_1 is output.
Further, step k) comprises the steps of:
k-1) The first convolutional coding module is composed, in sequence, of a convolution layer with 16 channels and convolution kernel size 1×1, a batch normalization layer, a ReLU activation layer and a pooling layer of size 4; the characteristic signal Y_1 is input into the first convolutional coding module, and the 16-dimensional characteristic signal P_5 is output;
k-2) The second convolutional coding module is composed, in sequence, of a convolution layer with 32 channels and convolution kernel size 1×11, a batch normalization layer, a ReLU activation layer and a pooling layer of size 2; the characteristic signal P_5 is input into the second convolutional coding module, and the 32-dimensional characteristic signal P_6 is output;
k-3) The third convolutional coding module is composed, in sequence, of a convolution layer with 64 channels and convolution kernel size 1×7, a batch normalization layer, a ReLU activation layer and a pooling layer of size 2; the characteristic signal P_6 is input into the third convolutional coding module, and the 64-dimensional characteristic signal P_7 is output;
k-4) The fourth convolutional coding module is composed, in sequence, of a convolution layer with 128 channels and convolution kernel size 1×3, a batch normalization layer, a ReLU activation layer and a pooling layer of size 2; the characteristic signal P_7 is input into the fourth convolutional coding module, and the 128-dimensional characteristic signal P_8 is output;
k-5) The characteristic signal P_8 is input into a bidirectional GRU layer with 32 units and a TPA mechanism, and the 64-dimensional characteristic signal Y_2 is output; in the bidirectional GRU layer with the TPA mechanism, the characteristic signal Y_2 is calculated by the same formulas as in step e-4).
The invention has the following beneficial effects: first, the features of the electrocardiosignals and the heart sound signals are extracted by the PC-TBG-ECG model and the PC-TBG-PCG model respectively, and the extracted features are then selected and classified by the XGBoost ensemble classification algorithm. Operation efficiency is improved, and the added regularization effectively prevents overfitting. The method is suitable for classification detection of data in different modalities and can analyze signals from multiple angles, thereby improving classification accuracy.
Detailed Description
The invention will be further explained with reference to fig. 1 and 2.
A multi-modal data classification method based on deep learning and time sequence attention mechanism comprises the following steps:
a) Selecting training-a in PhysioNet/CinC Challenge 2016 as the data set, expanding the data set, and dividing the expanded data set into a training set and a test set.
b) And establishing an electrocardiosignal model (PC-TBG-ECG), wherein the electrocardiosignal model is sequentially composed of a PC module, a TBG module and a classification module.
c) Resampling the electrocardiosignals in the training set and the test set to 2048 sampling points, and then carrying out z-score normalization to obtain the normalized electrocardiosignal x'_ecg.
d) Inputting the normalized electrocardiosignal x'_ecg in the training set into the PC module of the electrocardiosignal model, and outputting the characteristic signal X_1; the PC module is composed, in sequence, of four convolution branches and a 1×1 convolution block.
e) Inputting the characteristic signal X_1 into the TBG module of the electrocardiosignal model, and outputting the characteristic signal X_2; the TBG module consists of 3 convolutional coding modules and a bidirectional GRU layer with a TPA mechanism (TPA-Bi-GRU). f) Inputting the characteristic signal X_2 into the classification module of the electrocardiosignal model, and outputting the prediction category f_ecg; the classification module is composed, in sequence, of a fully connected layer and a Softmax activation layer.
g) Repeating steps d) to f) N times, and obtaining the trained optimal electrocardiosignal model by minimizing the cross entropy loss function with an SGD optimizer.
h) And establishing a heart sound signal model (PC-TBG-PCG), wherein the heart sound signal model is composed of a PC module, a TBG module and a classification module in sequence.
i) Resampling the heart sound signals in the training set and the test set to 8000 sampling points, and then carrying out z-score normalization to obtain the normalized heart sound signal x'_pcg.
j) Inputting the normalized heart sound signal x'_pcg in the training set into the PC module of the heart sound signal model, and outputting the characteristic signal Y_1; the PC module is composed, in sequence, of four convolution branches and a 1×1 convolution block.
k) Inputting the characteristic signal Y_1 into the TBG module of the heart sound signal model, and outputting the characteristic signal Y_2; the TBG module consists of 4 convolutional coding modules and a bidirectional GRU layer with a TPA mechanism (TPA-Bi-GRU). l) Inputting the characteristic signal Y_2 into the classification module of the heart sound signal model, and outputting the prediction category f_pcg; the classification module is composed, in sequence, of a fully connected layer and a Softmax activation layer.
m) Repeating steps j) to l) M times, and obtaining the trained optimal heart sound signal model by minimizing the cross entropy loss function with an SGD optimizer.
n) Manually re-dividing the data set into a new training set and a new test set at a ratio of 4:1, inputting the new training set into the optimal electrocardiosignal model and outputting the 64-dimensional characteristic signal X_3 through its TBG module, inputting the new training set into the optimal heart sound signal model and outputting the 64-dimensional characteristic signal Y_3 through its TBG module, and calculating the spliced 128-dimensional feature fusion signal PP_x by the formula PP_x = [X_3, Y_3].
o) Inputting the feature fusion signal PP_x into an XGBoost classifier to obtain the importance score ranking of the feature fusion signal PP_x, selecting the signals ranked in the top 64 of the importance scores as the characteristic signal PP_x^1, selecting the optimal hyper-parameters by 5-fold cross validation, and training the XGBoost classifier with the optimal hyper-parameters to obtain the optimized XGBoost classifier.
p) Inputting the new test set into the optimal electrocardiosignal model and outputting the 64-dimensional characteristic signal X_4 through its TBG module, inputting the new test set into the optimal heart sound signal model and outputting the 64-dimensional characteristic signal Y_4 through its TBG module, and calculating the spliced 128-dimensional feature fusion signal PP_c by the formula PP_c = [X_4, Y_4].
q) Inputting the feature fusion signal PP_c into the XGBoost classifier to obtain the importance score ranking of the feature fusion signal PP_c, and selecting the signals ranked in the top 64 of the importance scores as the characteristic signal PP_c^1.
The signals do not need noise reduction, filtering or other preprocessing, which avoids the low classification accuracy and poor practicability previously caused by unreasonable signal preprocessing and ensures the robustness of the model. First, the features of the electrocardiosignals and the heart sound signals are extracted by the PC-TBG-ECG and PC-TBG-PCG models respectively, and the extracted features are then selected and classified by the XGBoost ensemble classification algorithm. Operation efficiency is improved, and the added regularization effectively prevents overfitting. The method is suitable for classification detection of data in different modalities and can analyze signals from multiple angles, thereby improving classification accuracy.
Example 1:
in the step a), a sliding window segmentation method is used for expanding the data set, and a five-fold cross validation method is used for dividing the data set into 5 different training sets and test sets.
Example 2:
In step c), the normalized electrocardiosignal x'_ecg is calculated by the formula
x'_ecg = (x_ecg − u_ecg) / σ_ecg,
where x_ecg is the electrocardiosignal in the training set and the test set, u_ecg is the mean of the electrocardiosignal, and σ_ecg is the standard deviation of the electrocardiosignal.
Example 3:
the step d) comprises the following steps:
d-1) The first convolution branch is composed, in sequence, of a convolution layer with 32 channels, convolution kernel size 1×15 and step size 1, a batch normalization layer and a ReLU activation layer; the normalized electrocardiosignal x'_ecg in the training set is input into the first convolution branch, and the 32-dimensional characteristic signal E_1 is output;
d-2) The second convolution branch is composed, in sequence, of a convolution layer with 32 channels, convolution kernel size 1×13 and step size 1, a batch normalization layer and a ReLU activation layer; the normalized electrocardiosignal x'_ecg in the training set is input into the second convolution branch, and the 32-dimensional characteristic signal E_2 is output;
d-3) The third convolution branch is composed, in sequence, of a convolution layer with 32 channels, convolution kernel size 1×9 and step size 1, a batch normalization layer and a ReLU activation layer; the normalized electrocardiosignal x'_ecg in the training set is input into the third convolution branch, and the 32-dimensional characteristic signal E_3 is output;
d-4) The fourth convolution branch is composed, in sequence, of a convolution layer with 32 channels, convolution kernel size 1×5 and step size 1, a batch normalization layer and a ReLU activation layer; the normalized electrocardiosignal x'_ecg in the training set is input into the fourth convolution branch, and the 32-dimensional characteristic signal E_4 is output;
d-5) The characteristic signals E_1, E_2, E_3 and E_4 are feature-cascaded to obtain the cascaded 128-dimensional characteristic signal E = [E_1, E_2, E_3, E_4];
d-6) The 1×1 convolution block is composed of a convolution layer with 16 channels, convolution kernel size 1×1 and step size 1, and a ReLU activation layer; the 128-dimensional characteristic signal E = [E_1, E_2, E_3, E_4] is input into the 1×1 convolution block, and the 16-dimensional characteristic signal X_1 is output.
Example 4:
step e) comprises the following steps:
e-1) The first convolutional coding module is composed, in sequence, of a convolution layer with 32 channels and convolution kernel size 1×11, a batch normalization layer, a ReLU activation layer and a pooling layer of size 4; the characteristic signal X_1 is input into the first convolutional coding module, and the 32-dimensional characteristic signal E_5 is output;
e-2) The second convolutional coding module is composed, in sequence, of a convolution layer with 64 channels and convolution kernel size 1×7, a batch normalization layer, a ReLU activation layer and a pooling layer of size 2; the characteristic signal E_5 is input into the second convolutional coding module, and the 64-dimensional characteristic signal E_6 is output;
e-3) The third convolutional coding module is composed, in sequence, of a convolution layer with 128 channels and convolution kernel size 1×3, a batch normalization layer, a ReLU activation layer and a pooling layer of size 2; the characteristic signal E_6 is input into the third convolutional coding module, and the 128-dimensional characteristic signal E_7 is output;
e-4) The characteristic signal E_7 is input into a bidirectional GRU layer with 32 units and a TPA mechanism, and the 64-dimensional characteristic signal X_2 is output. In the bidirectional GRU layer with the TPA mechanism, the characteristic signal X_2 is calculated by the formulas
τ_i = σ((G_i^C)^T w_k g_t),
X_2 = Σ_{i=1}^{n} τ_i G_i^C,
where i = {1,2,...,n}, n = 128, T is the transpose, τ_i is the attention weight of the i-th row vector, σ(·) is the sigmoid function, G_i^C is the i-th row of the time pattern matrix G^C, G^C = Conv1d(G), Conv1d(·) is a one-dimensional convolution operation, G is the hidden state matrix, g_i, i = {1,2,...,t−1}, is the hidden state vector of the bidirectional GRU at step i, t is the time, w_k is a weight coefficient, and g_t is the hidden state vector of the bidirectional GRU at time t.
Example 5:
In step g), N is 150, the learning rate of the SGD optimizer is 0.001, and the learning rate is decayed by a factor of 0.1 every 80 epochs; the cross entropy loss function cc(x) is calculated by the formula
cc(x) = −Σ_{i=1}^{L} f̂_i(x) log f_i(x),
where L is the number of categories, L = 2, f_i(x) is the predicted label of the i-th category of the prediction category f_ecg, and f̂_i(x) is the true label of the corresponding i-th category. In step m), M is 180, the learning rate of the SGD optimizer is 0.001, and the learning rate is decayed by a factor of 0.1 every 90 epochs; the cross entropy loss function cc(y) is calculated by the formula
cc(y) = −Σ_{i=1}^{L} f̂_i(y) log f_i(y),
where L is the number of categories, L = 2, f_i(y) is the predicted label of the i-th category of the prediction category f_pcg, and f̂_i(y) is the true label of the i-th category.
Example 6:
In step i), the normalized heart sound signal x'_pcg is calculated by the formula
x'_pcg = (x_pcg − u_pcg) / σ_pcg,
where x_pcg is the heart sound signal in the training set and the test set, u_pcg is the mean of the heart sound signal, and σ_pcg is the standard deviation of the heart sound signal.
Example 7:
step j) comprises the following steps:
j-1) The first convolution branch is composed, in sequence, of a convolution layer with 32 channels, convolution kernel size 1×15 and step size 2, a batch normalization layer and a ReLU activation layer; the normalized heart sound signal x'_pcg in the training set is input into the first convolution branch, and the 32-dimensional characteristic signal P_1 is output;
j-2) The second convolution branch is composed, in sequence, of a convolution layer with 32 channels, convolution kernel size 1×11 and step size 2, a batch normalization layer and a ReLU activation layer; the normalized heart sound signal x'_pcg in the training set is input into the second convolution branch, and the 32-dimensional characteristic signal P_2 is output;
j-3) The third convolution branch is composed, in sequence, of a convolution layer with 32 channels, convolution kernel size 1×9 and step size 2, a batch normalization layer and a ReLU activation layer; the normalized heart sound signal x'_pcg in the training set is input into the third convolution branch, and the 32-dimensional characteristic signal P_3 is output;
j-4) The fourth convolution branch is composed, in sequence, of a convolution layer with 32 channels, convolution kernel size 1×5 and step size 2, a batch normalization layer and a ReLU activation layer; the normalized heart sound signal x'_pcg in the training set is input into the fourth convolution branch, and the 32-dimensional characteristic signal P_4 is output;
j-5) The characteristic signals P_1, P_2, P_3 and P_4 are feature-cascaded to obtain the cascaded 128-dimensional characteristic signal P = [P_1, P_2, P_3, P_4];
j-6) The 1×1 convolution block is composed of a convolution layer with 32 channels, convolution kernel size 1×1 and step size 1, and a ReLU activation layer; the 128-dimensional characteristic signal P = [P_1, P_2, P_3, P_4] is input into the 1×1 convolution block, and the 32-dimensional characteristic signal Y_1 is output.
Example 8:
step k) comprises the following steps:
k-1) The first convolutional coding module is composed, in sequence, of a convolution layer with 16 channels and convolution kernel size 1×1, a batch normalization layer, a ReLU activation layer and a pooling layer of size 4; the characteristic signal Y_1 is input into the first convolutional coding module, and the 16-dimensional characteristic signal P_5 is output;
k-2) The second convolutional coding module is composed, in sequence, of a convolution layer with 32 channels and convolution kernel size 1×11, a batch normalization layer, a ReLU activation layer and a pooling layer of size 2; the characteristic signal P_5 is input into the second convolutional coding module, and the 32-dimensional characteristic signal P_6 is output;
k-3) The third convolutional coding module is composed, in sequence, of a convolution layer with 64 channels and convolution kernel size 1×7, a batch normalization layer, a ReLU activation layer and a pooling layer of size 2; the characteristic signal P_6 is input into the third convolutional coding module, and the 64-dimensional characteristic signal P_7 is output;
k-4) The fourth convolutional coding module is composed, in sequence, of a convolution layer with 128 channels and convolution kernel size 1×3, a batch normalization layer, a ReLU activation layer and a pooling layer of size 2; the characteristic signal P_7 is input into the fourth convolutional coding module, and the 128-dimensional characteristic signal P_8 is output;
k-5) The characteristic signal P_8 is input into a bidirectional GRU layer with 32 units and a TPA mechanism, and the 64-dimensional characteristic signal Y_2 is output; in the bidirectional GRU layer with the TPA mechanism, the characteristic signal Y_2 is calculated by the same formulas as in step e-4).
Finally, it should be noted that: although the present invention has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that changes may be made in the embodiments and/or equivalents thereof without departing from the spirit and scope of the invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.