CN114764575B - Multimodal Data Classification Method Based on Deep Learning and Temporal Attention Mechanism - Google Patents
- Publication number: CN114764575B
- Application number: CN202210376944.3A
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion)
Classifications
- G06F2218/08 — Aspects of pattern recognition specially adapted for signal processing: feature extraction
- G06N3/045 — Computing arrangements based on biological models: neural networks; architecture; combinations of networks
- G06N3/08 — Computing arrangements based on biological models: neural networks; learning methods
- G06F2218/12 — Aspects of pattern recognition specially adapted for signal processing: classification; matching
Abstract
A multimodal data classification method based on deep learning and a temporal attention mechanism. First, the PC-TBG-ECG and PC-TBG-PCG models extract features from electrocardiogram (ECG) signals and heart sound signals, respectively; the XGBoost ensemble classification algorithm then performs feature selection and classification on the extracted features. Regularization is incorporated, increasing computational efficiency while effectively preventing overfitting. The invention is suited to the classification and detection of data of different modalities and can analyze signals from multiple perspectives, thereby improving classification accuracy.
Description
Technical Field
The invention relates to the field of multimodal data classification, and in particular to a multimodal data classification method based on deep learning and a temporal attention mechanism.
Background
As non-invasive and cost-effective signal acquisition tools, the electrocardiogram (ECG) and the phonocardiogram (PCG) are complementary: their latent characteristics can be mined and analyzed from multiple perspectives, improving classification performance. Previous studies have mainly used single-modality data or a single classifier for signal classification, an approach that cannot classify signals comprehensively; a classification method that fuses multimodal data therefore closely matches practical needs.
Summary of the Invention
To overcome the shortcomings of the above technologies, the invention provides a method suited to the classification and detection of data of different modalities, which can analyze signals from multiple perspectives and thereby improve classification accuracy.
The technical solution adopted by the invention to overcome this technical problem is as follows:
A multimodal data classification method based on deep learning and a temporal attention mechanism, comprising the following steps:
a) Select training-a of the PhysioNet/CinC Challenge 2016 as the data set, augment it, and divide the augmented data set into a training set and a test set.
b) Build an ECG signal model consisting, in order, of a PC module, a TBG module, and a classification module.
c) Resample the ECG signals of the training and test sets to 2048 sampling points and apply z-score normalization to obtain the normalized ECG signal x′_ecg.
d) Feed the normalized training-set ECG signal x′_ecg into the PC module of the ECG signal model to obtain the feature signal X1; the PC module consists, in order, of four convolution branches and a 1×1 convolution block.
e) Feed the feature signal X1 into the TBG module of the ECG signal model to obtain the feature signal X2; the TBG module consists of three convolutional encoding modules and a bidirectional GRU layer with a TPA mechanism.
f) Feed the feature signal X2 into the classification module of the ECG signal model to obtain the predicted category f_ecg; the classification module consists, in order, of a fully connected layer and a Softmax activation layer.
g) Repeat steps d) through f) N times, using the SGD optimizer and minimizing the cross-entropy loss function to obtain the trained optimal ECG signal model.
h) Build a heart sound signal model consisting, in order, of a PC module, a TBG module, and a classification module.
i) Resample the heart sound signals of the training and test sets to 8000 sampling points and apply z-score normalization to obtain the normalized heart sound signal x′_pcg.
j) Feed the normalized training-set heart sound signal x′_pcg into the PC module of the heart sound signal model to obtain the feature signal Y1; the PC module consists, in order, of four convolution branches and a 1×1 convolution block.
k) Feed the feature signal Y1 into the TBG module of the heart sound signal model to obtain the feature signal Y2; the TBG module consists of four convolutional encoding modules and a bidirectional GRU layer with a TPA mechanism.
l) Feed the feature signal Y2 into the classification module of the heart sound signal model to obtain the predicted category f_pcg; the classification module consists, in order, of a fully connected layer and a Softmax activation layer.
m) Repeat steps j) through l) M times, using the SGD optimizer and minimizing the cross-entropy loss function to obtain the trained optimal heart sound signal model.
n) Manually re-divide the data set into a new training set and a new test set at a ratio of 4:1. Feed the new training set into the optimal ECG signal model to obtain the 64-dimensional feature signal X3 from the output of its TBG module, and into the optimal heart sound signal model to obtain the 64-dimensional feature signal Y3 from the output of its TBG module; compute the concatenated 128-dimensional feature fusion signal PP_x = [X3, Y3].
o) Feed the feature fusion signal PP_x into the XGBoost classifier to obtain an importance-score ranking of its features, and select the 64 highest-ranked features as the feature signal PP1_x. Select the optimal hyperparameters by 5-fold cross-validation and train the XGBoost classifier with them to obtain the optimized XGBoost classifier.
p) Feed the new test set into the optimal ECG signal model to obtain the 64-dimensional feature signal X4 from the output of its TBG module, and into the optimal heart sound signal model to obtain the 64-dimensional feature signal Y4 from the output of its TBG module; compute the concatenated 128-dimensional feature fusion signal PP_c = [X4, Y4].
q) Feed the feature fusion signal PP_c into the XGBoost classifier to obtain an importance-score ranking of its features, and select the 64 highest-ranked features as the feature signal PP1_c.
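The fusion-and-selection logic of steps n) through q) can be sketched as follows. This is a minimal NumPy illustration only: the random arrays stand in for the TBG outputs X3 and Y3, and the random importance scores stand in for what a trained XGBoost classifier would report.

```python
import numpy as np

rng = np.random.default_rng(0)
X3 = rng.normal(size=(100, 64))   # placeholder for ECG TBG features
Y3 = rng.normal(size=(100, 64))   # placeholder for heart sound TBG features

PPx = np.concatenate([X3, Y3], axis=1)   # PP_x = [X3, Y3], 128-dimensional fusion
scores = rng.random(128)                  # placeholder importance scores
top64 = np.argsort(scores)[::-1][:64]     # indices of the 64 highest-ranked features
PP1x = PPx[:, top64]                      # selected feature signal PP1_x
print(PPx.shape, PP1x.shape)  # (100, 128) (100, 64)
```

In practice the scores would come from the classifier's importance ranking rather than a random vector, but the slicing step is the same.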
Preferably, in step a) the data set is augmented by sliding-window segmentation, and five-fold cross-validation is used to produce five different divisions of the data set into training and test sets.
Further, in step c) the normalized ECG signal is computed as x′_ecg = (x_ecg − u_ecg) / σ_ecg, where x_ecg denotes the ECG signals of the training and test sets, u_ecg their mean, and σ_ecg their standard deviation.
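The normalization of step c) can be sketched as follows (a minimal NumPy version; computing the statistics per recording is an assumption, since the text does not state whether the mean and standard deviation are taken per recording or over the whole set):

```python
import numpy as np

def z_score(x):
    """z-score normalization: x' = (x - u) / sigma."""
    return (x - x.mean()) / x.std()

# Placeholder for one resampled 2048-point ECG recording.
ecg = np.random.default_rng(0).normal(loc=5.0, scale=2.0, size=2048)
ecg_n = z_score(ecg)
print(ecg_n.mean().round(6), ecg_n.std().round(6))  # mean ~ 0, std = 1
```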
Further, step d) comprises the following steps:
d-1) The first convolution branch consists, in order, of a convolutional layer with 32 channels, kernel size 1×15, and stride 1, a batch normalization layer, and a ReLU activation layer. The normalized training-set ECG signal x′_ecg is fed into the first convolution branch, which outputs the 32-dimensional feature signal E1.
d-2) The second convolution branch consists, in order, of a convolutional layer with 32 channels, kernel size 1×13, and stride 1, a batch normalization layer, and a ReLU activation layer. The normalized training-set ECG signal x′_ecg is fed into the second convolution branch, which outputs the 32-dimensional feature signal E2.
d-3) The third convolution branch consists, in order, of a convolutional layer with 32 channels, kernel size 1×9, and stride 1, a batch normalization layer, and a ReLU activation layer. The normalized training-set ECG signal x′_ecg is fed into the third convolution branch, which outputs the 32-dimensional feature signal E3.
d-4) The fourth convolution branch consists, in order, of a convolutional layer with 32 channels, kernel size 1×5, and stride 1, a batch normalization layer, and a ReLU activation layer. The normalized training-set ECG signal x′_ecg is fed into the fourth convolution branch, which outputs the 32-dimensional feature signal E4.
d-5) The feature signals E1, E2, E3, and E4 are concatenated to obtain the 128-dimensional feature signal E = [E1, E2, E3, E4].
d-6) The 1×1 convolution block consists of a convolutional layer with 16 channels, kernel size 1×1, and stride 1, followed by a ReLU activation layer. The 128-dimensional feature signal E = [E1, E2, E3, E4] is fed into the 1×1 convolution block, which outputs the 16-dimensional feature signal X1.
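Steps d-1) through d-6) can be sketched in PyTorch as below. This is a minimal illustration, not the patented implementation: the padding choices (used to keep the sequence length unchanged) and the single-channel input layout are assumptions the text does not specify.

```python
import torch
import torch.nn as nn

class PCModule(nn.Module):
    """Sketch of the parallel-convolution (PC) module for the ECG branch:
    four parallel Conv1d branches (kernel sizes 15/13/9/5, stride 1), each
    followed by batch normalization and ReLU, concatenated channel-wise and
    fused by a 1x1 convolution down to 16 channels."""

    def __init__(self):
        super().__init__()
        def branch(k):
            return nn.Sequential(
                nn.Conv1d(1, 32, kernel_size=k, stride=1, padding=k // 2),
                nn.BatchNorm1d(32),
                nn.ReLU(),
            )
        self.branches = nn.ModuleList([branch(k) for k in (15, 13, 9, 5)])
        # 1x1 convolution block: 128 -> 16 channels, followed by ReLU.
        self.fuse = nn.Sequential(nn.Conv1d(128, 16, kernel_size=1), nn.ReLU())

    def forward(self, x):
        e = torch.cat([b(x) for b in self.branches], dim=1)  # E = [E1, E2, E3, E4]
        return self.fuse(e)                                  # X1: 16 channels

x = torch.randn(8, 1, 2048)   # batch of normalized 2048-point ECG signals
print(PCModule()(x).shape)    # torch.Size([8, 16, 2048])
```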
Further, step e) comprises the following steps:
e-1) The first convolutional encoding module consists, in order, of a convolutional layer with 32 channels and kernel size 1×11, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 4. The feature signal X1 is fed into the first convolutional encoding module, which outputs the 32-dimensional feature signal E5.
e-2) The second convolutional encoding module consists, in order, of a convolutional layer with 64 channels and kernel size 1×7, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 2. The feature signal E5 is fed into the second convolutional encoding module, which outputs the 64-dimensional feature signal E6.
e-3) The third convolutional encoding module consists, in order, of a convolutional layer with 128 channels and kernel size 3, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 2. The feature signal E6 is fed into the third convolutional encoding module, which outputs the 128-dimensional feature signal E7.
e-4) The feature signal E7 is fed into a bidirectional GRU layer with 32 units and a TPA mechanism, which outputs the 64-dimensional feature signal X2. In the TPA-equipped bidirectional GRU layer the attention weights are computed as τ_i = σ((G_i^C)^T · w_k · g_t), i = 1, 2, ..., n, n = 128, and X2 is obtained from the τ_i-weighted rows of G^C together with g_t. Here T denotes transposition, τ_i is the attention weight of the i-th row vector, σ(·) is the sigmoid function, G_i^C is the i-th row of the temporal pattern matrix G^C, G^C = Conv1d(G), where Conv1d(·) is a one-dimensional convolution, G is the hidden-state matrix, g_i (i = 1, 2, ..., t−1) is the hidden-state vector of the bidirectional GRU at step i, t is the current time step, w_k is a weight coefficient, and g_t is the hidden-state vector of the bidirectional GRU at time t.
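A minimal NumPy sketch of the temporal pattern attention (TPA) scoring described in step e-4), assuming the standard TPA formulation (sigmoid scoring of the rows of the convolved hidden-state matrix G^C against the current hidden state g_t); the filter and weight-matrix shapes here are illustrative assumptions, not values from the patent:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tpa(G, g_t, conv_filters, W_k):
    """G: (t-1, d) matrix of past hidden states g_1..g_{t-1};
    g_t: (d,) current hidden state; conv_filters: (n, t-1) 1-D kernels
    applied along the time axis; W_k: (d, d) scoring weights."""
    G_C = conv_filters @ G           # temporal pattern matrix G^C, shape (n, d)
    tau = sigmoid(G_C @ W_k @ g_t)   # attention weight tau_i per row of G^C
    v = tau @ G_C                    # attention-weighted pattern summary, shape (d,)
    return tau, v

rng = np.random.default_rng(0)
t, d, n = 9, 64, 128
tau, v = tpa(rng.normal(size=(t - 1, d)), rng.normal(size=d),
             rng.normal(size=(n, t - 1)), rng.normal(size=(d, d)) / d)
print(tau.shape, v.shape)  # (128,) (64,)
```

The summary vector v would then be combined with g_t (e.g. via learned projections) to produce the 64-dimensional output X2.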
Further, in step g) N is set to 150, the learning rate of the SGD optimizer is 0.001, and every 80 epochs the learning rate decays to 0.1 times its current value. The cross-entropy loss function is computed as cc(x) = −∑_{i=1}^{L} f̂_i(x)·log f_i(x), where L is the number of categories, L = 2, f_i(x) is the predicted label of the i-th category of the predicted category f_ecg, and f̂_i(x) is the true label of the i-th category corresponding to f_ecg. In step m), M is set to 180, the learning rate of the SGD optimizer is 0.001, and every 90 epochs the learning rate decays to 0.1 times its current value; the cross-entropy loss function cc(y) = −∑_{i=1}^{L} f̂_i(y)·log f_i(y) is minimized, where f_i(y) is the predicted label of the i-th category of the predicted category f_pcg and f̂_i(y) is the corresponding true label.
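The training schedule of steps g) and m) can be sketched as follows; the two-class cross-entropy matches the cc(x) formula of step g), and the decay step is 80 epochs for the ECG model and 90 for the heart sound model:

```python
import numpy as np

def cross_entropy(pred, true):
    """cc(x) = -sum_{i=1}^{L} true_i * log(pred_i), with L = 2 classes."""
    return -np.sum(true * np.log(pred))

def lr_at_epoch(epoch, base_lr=0.001, step=80, gamma=0.1):
    """SGD learning rate: decays to 0.1x its current value every `step` epochs."""
    return base_lr * gamma ** (epoch // step)

# Base rate 0.001; one decay step after epoch 80 (use step=90 for heart sounds).
print(lr_at_epoch(0), lr_at_epoch(80))
print(round(cross_entropy(np.array([0.9, 0.1]), np.array([1.0, 0.0])), 4))  # 0.1054
```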
Further, in step i) the normalized heart sound signal is computed as x′_pcg = (x_pcg − u_pcg) / σ_pcg, where x_pcg denotes the heart sound signals of the training and test sets, u_pcg their mean, and σ_pcg their standard deviation.
Further, step j) comprises the following steps:
j-1) The first convolution branch consists, in order, of a convolutional layer with 32 channels, kernel size 1×15, and stride 2, a batch normalization layer, and a ReLU activation layer. The normalized training-set heart sound signal x′_pcg is fed into the first convolution branch, which outputs the 32-dimensional feature signal P1.
j-2) The second convolution branch consists, in order, of a convolutional layer with 32 channels, kernel size 1×11, and stride 2, a batch normalization layer, and a ReLU activation layer. The normalized training-set heart sound signal x′_pcg is fed into the second convolution branch, which outputs the 32-dimensional feature signal P2.
j-3) The third convolution branch consists, in order, of a convolutional layer with 32 channels, kernel size 1×9, and stride 2, a batch normalization layer, and a ReLU activation layer. The normalized training-set heart sound signal x′_pcg is fed into the third convolution branch, which outputs the 32-dimensional feature signal P3.
j-4) The fourth convolution branch consists, in order, of a convolutional layer with 32 channels, kernel size 1×5, and stride 2, a batch normalization layer, and a ReLU activation layer. The normalized training-set heart sound signal x′_pcg is fed into the fourth convolution branch, which outputs the 32-dimensional feature signal P4.
j-5) The feature signals P1, P2, P3, and P4 are concatenated to obtain the 128-dimensional feature signal P = [P1, P2, P3, P4].
j-6) The 1×1 convolution block consists of a convolutional layer with 32 channels, kernel size 1×1, and stride 1, followed by a ReLU activation layer. The 128-dimensional feature signal P = [P1, P2, P3, P4] is fed into the 1×1 convolution block, which outputs the 32-dimensional feature signal Y1.
Further, step k) comprises the following steps:
k-1) The first convolutional encoding module consists, in order, of a convolutional layer with 16 channels and kernel size 1×1, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 4. The feature signal Y1 is fed into the first convolutional encoding module, which outputs the 16-dimensional feature signal P5.
k-2) The second convolutional encoding module consists, in order, of a convolutional layer with 32 channels and kernel size 1×11, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 2. The feature signal P5 is fed into the second convolutional encoding module, which outputs the 32-dimensional feature signal P6.
k-3) The third convolutional encoding module consists, in order, of a convolutional layer with 64 channels and kernel size 1×7, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 2. The feature signal P6 is fed into the third convolutional encoding module, which outputs the 64-dimensional feature signal P7.
k-4) The fourth convolutional encoding module consists, in order, of a convolutional layer with 128 channels and kernel size 1×3, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 2. The feature signal P7 is fed into the fourth convolutional encoding module, which outputs the 128-dimensional feature signal P8.
k-5) The feature signal P8 is fed into a bidirectional GRU layer with 32 units and a TPA mechanism, which outputs the 64-dimensional feature signal Y2, computed in the TPA-equipped bidirectional GRU layer in the same way as X2 in step e-4).
The beneficial effects of the invention are as follows. The PC-TBG-ECG and PC-TBG-PCG models first extract features from the ECG signals and the heart sound signals, respectively; the XGBoost ensemble classification algorithm then performs feature selection and classification on the extracted features. Regularization is incorporated, increasing computational efficiency while effectively preventing overfitting. The invention is suited to the classification and detection of data of different modalities and can analyze signals from multiple perspectives, thereby improving classification accuracy.
Brief Description of the Drawings
Fig. 1 is a flowchart of the method of the invention;
Fig. 2 is a network structure diagram of the PC module of the invention.
Detailed Description of the Embodiments
The invention is further described below with reference to Figs. 1 and 2.
一种基于深度学习和时序注意力机制的多模态数据分类方法,包括如下步骤:A multi-modal data classification method based on deep learning and temporal attention mechanism, comprising the following steps:
a)选择PhysioNet/CinC Challenge 2016中的training-a作为数据集,对数据集进行扩充,将扩充后的数据集划分为训练集和测试集。a) Select training-a in PhysioNet/CinC Challenge 2016 as the data set, expand the data set, and divide the expanded data set into training set and test set.
b)建立心电信号模型(PC-TBG-ECG),该心电信号模型依次由PC模块、TBG模块、分类模块构成。b) Establish an electrocardiographic signal model (PC-TBG-ECG), which is composed of a PC module, a TBG module, and a classification module in sequence.
c)将训练集和测试集中的心电信号重采样到2048个采样点后进行z-score归一化处理,得到归一化后的心电信号x′ecg。c) Resampling the ECG signals in the training set and the test set to 2048 sampling points and performing z-score normalization processing to obtain the normalized ECG signal x′ ecg .
d)将训练集中归一化后的心电信号x′ecg输入到心电信号模型的PC模块,输出得到特征信号X1,PC模块依次由四个卷积分支和一个1×1的卷积块构成。d) Input the normalized ECG signal x′ ecg in the training set to the PC module of the ECG signal model, and output the characteristic signal X 1 . The PC module consists of four convolution branches and a 1×1 convolution in turn block composition.
e)将特征信号X1输入到心电信号模型的TBG模块,输出得到特征信号X2,TBG模块由3个卷积编码模块和一个带有TPA机制的双向GRU层(TPA-Bi-GRU)构成。f)将特征信号X2输入到心电信号模型的分类模块中,输出得到预测类别fecg,分类模块依次由全连接层和Softmax激活层构成。e) The characteristic signal X 1 is input to the TBG module of the ECG signal model, and the characteristic signal X 2 is output. The TBG module consists of 3 convolutional coding modules and a bidirectional GRU layer with a TPA mechanism (TPA-Bi-GRU) constitute. f) Input the feature signal X 2 into the classification module of the ECG signal model, and output the predicted category f ecg , and the classification module is sequentially composed of a fully connected layer and a Softmax activation layer.
g)重复步骤d)至步骤f)N次,使用SGD优化器,通过最小化交叉熵损失函数得到训练后的最优的心电信号模型。g) repeating step d) to step f) N times, using the SGD optimizer to obtain the optimal ECG model after training by minimizing the cross-entropy loss function.
h)建立心音信号模型(PC-TBG-PCG),该心音信号模型依次由PC模块、TBG模块、分类模块构成。h) Establishing a heart sound signal model (PC-TBG-PCG), which is sequentially composed of a PC module, a TBG module, and a classification module.
i)将训练集和测试集中的心音信号重采样到8000个采样点后进行z-score归一化处理,得到归一化后的心音信号x′pcg。i) Resample the heart sound signals in the training set and the test set to 8000 sampling points, and then perform z-score normalization processing to obtain the normalized heart sound signal x′ pcg .
j)将训练集中归一化后的心音信号x′pcg输入到心音信号模型的PC模块,输出得到特征信号Y1,PC模块依次由四个卷积分支和一个1×1的卷积块构成。j) Input the normalized heart sound signal x′ pcg in the training set to the PC module of the heart sound signal model, and output the characteristic signal Y 1 . The PC module is sequentially composed of four convolution branches and a 1×1 convolution block .
k)将特征信号Y1输入到心音信号模型的TBG模块,输出得到特征信号Y2,TBG模块由4个卷积编码模块和一个带有TPA机制的双向GRU层(TPA-Bi-GRU)构成。l)将特征信号Y2输入到心音信号模型的分类模块中,输出得到预测类别fpcg,分类模块依次由全连接层和Softmax激活层构成。k) Input the characteristic signal Y 1 into the TBG module of the heart sound signal model, and output the characteristic signal Y 2 , the TBG module consists of 4 convolutional coding modules and a bidirectional GRU layer (TPA-Bi-GRU) with a TPA mechanism . l) The characteristic signal Y 2 is input into the classification module of the heart sound signal model, and the predicted category f pcg is outputted. The classification module is sequentially composed of a fully connected layer and a Softmax activation layer.
m)重复步骤j)至步骤l)M次,使用SGD优化器,通过最小化交叉熵损失函数得到训练后的最优的心音信号模型。m) Repeat step j) to step l) M times, use the SGD optimizer, and obtain the optimal heart sound signal model after training by minimizing the cross-entropy loss function.
n)将数据集重新按4:1的比例手动划分成新的训练集和新的测试集,将新的训练集输入到最优的心电信号模型中,通过最优的心电信号模型的TBG模块输出得到64维的特征信号X3,将新的训练集输入到最优的心音信号模型中,通过最优的心音信号模型的TBG模块输出得到64维的特征信号Y3,通过公式PPx=[X3,Y3]计算得到拼接的128维的特征融合信号PPx。n) Manually divide the data set into a new training set and a new test set according to the ratio of 4:1, input the new training set into the optimal ECG signal model, and pass the optimal ECG signal model The TBG module outputs the 64-dimensional characteristic signal X 3 , and the new training set is input into the optimal heart sound signal model, and the 64-dimensional characteristic signal Y 3 is obtained through the output of the TBG module of the optimal heart sound signal model, and the formula PP x = [X 3 , Y 3 ] is calculated to obtain the concatenated 128-dimensional feature fusion signal PP x .
o)将特征融合信号PPx输入到XGBoost分类器中,得到特征融合信号PPx的重要性分数排名,选择重要性分数排名前64的信号作为特征信号PP1 x,采用5折交叉验证选择最优超参数,利用最优超参数训练XGBoost分类器,得到优化后的XGBoost分类器。o) Input the feature fusion signal PP x into the XGBoost classifier to obtain the importance score ranking of the feature fusion signal PP x , select the top 64 signals with the importance score as the feature signal PP 1 x , and use 5-fold cross-validation to select the most Optimal hyperparameters, use the optimal hyperparameters to train the XGBoost classifier, and get the optimized XGBoost classifier.
p) Input the new test set into the optimal ECG signal model and take the output of its TBG module to obtain the 64-dimensional feature signal X4; input the new test set into the optimal heart sound signal model and take the output of its TBG module to obtain the 64-dimensional feature signal Y4; then compute the concatenated 128-dimensional feature fusion signal PPc = [X4, Y4].
q) Input the feature fusion signal PPc into the XGBoost classifier to obtain an importance-score ranking of its features, and select the 64 features with the highest importance scores as the feature signal PP1c.
No noise reduction, filtering, or other preprocessing of the signals is required, which avoids the low classification accuracy and poor practicality previously caused by unreasonable signal preprocessing and ensures the robustness of the model. First, the PC-TBG-ECG and PC-TBG-PCG models extract features from the ECG signal and the heart sound signal, respectively; the XGBoost ensemble classification algorithm then performs feature selection and classification on the extracted features. Regularization is added while improving computational efficiency, effectively preventing overfitting. The invention is suitable for the classification and detection of data in different modalities and can analyze signals from multiple perspectives, thereby improving classification accuracy.
Embodiment 1:
In step a), the data set is augmented by sliding-window segmentation, and 5-fold cross-validation is used to divide the data set into five different training-set/test-set splits.
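Sliding-window segmentation cuts one long recording into overlapping fixed-length segments, multiplying the number of training examples. A minimal sketch (the window and step sizes below are toy values; the patent does not state the actual ones in this embodiment):

```python
import numpy as np

def sliding_windows(signal: np.ndarray, win: int, step: int) -> np.ndarray:
    """Cut one 1-D recording into overlapping fixed-length segments."""
    starts = range(0, len(signal) - win + 1, step)
    return np.stack([signal[s:s + win] for s in starts])

sig = np.arange(10.0)                  # toy 10-sample recording
segs = sliding_windows(sig, win=4, step=2)
print(segs.shape)                      # (4, 4): segments start at 0, 2, 4, 6
```

With `step < win` the segments overlap, so one recording yields several partially redundant training samples, which is the augmentation effect described in step a).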
Embodiment 2:
In step c), the normalized ECG signal x′ecg is calculated by the Z-score formula x′ecg = (xecg − uecg)/σecg, where xecg is the ECG signal in the training and test sets, uecg is the mean of the ECG signal, and σecg is the standard deviation of the ECG signal.
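The Z-score normalization of step c) is, in a minimal sketch:

```python
import numpy as np

def zscore(x: np.ndarray) -> np.ndarray:
    """Z-score normalization: subtract the mean, divide by the standard deviation."""
    return (x - x.mean()) / x.std()

x_ecg = np.array([1.0, 2.0, 3.0, 4.0])   # toy stand-in for an ECG segment
x_norm = zscore(x_ecg)
print(x_norm.mean(), x_norm.std())       # ≈ 0.0 and 1.0
```

After this step the signal has zero mean and unit standard deviation, so the convolution branches in step d) see inputs on a common scale regardless of sensor gain.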
Embodiment 3:
Step d) comprises the following steps:
d-1) The first convolution branch consists, in order, of a convolutional layer with 32 channels, a 1×15 kernel, and a stride of 1, a batch normalization layer, and a ReLU activation layer. The normalized ECG signal x′ecg from the training set is input into the first convolution branch, yielding the 32-dimensional feature signal E1;
d-2) The second convolution branch consists, in order, of a convolutional layer with 32 channels, a 1×13 kernel, and a stride of 1, a batch normalization layer, and a ReLU activation layer. The normalized ECG signal x′ecg from the training set is input into the second convolution branch, yielding the 32-dimensional feature signal E2;
d-3) The third convolution branch consists, in order, of a convolutional layer with 32 channels, a 1×9 kernel, and a stride of 1, a batch normalization layer, and a ReLU activation layer. The normalized ECG signal x′ecg from the training set is input into the third convolution branch, yielding the 32-dimensional feature signal E3;
d-4) The fourth convolution branch consists, in order, of a convolutional layer with 32 channels, a 1×5 kernel, and a stride of 1, a batch normalization layer, and a ReLU activation layer. The normalized ECG signal x′ecg from the training set is input into the fourth convolution branch, yielding the 32-dimensional feature signal E4;
d-5) Concatenate the feature signals E1, E2, E3, and E4 to obtain the concatenated 128-dimensional feature signal E = [E1, E2, E3, E4];
d-6) The 1×1 convolution block consists of a convolutional layer with 16 channels, a 1×1 kernel, and a stride of 1, followed by a ReLU activation layer. The 128-dimensional feature signal E = [E1, E2, E3, E4] is input into the 1×1 convolution block, yielding the 16-dimensional feature signal X1.
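As an illustrative sketch only, the shape bookkeeping of step d) can be mimicked in NumPy with random weights (batch normalization is omitted and all weights are hypothetical): four parallel 32-channel branches with different kernel widths concatenate to 128 channels, and the 1×1 convolution is a per-position linear mix down to 16 channels.

```python
import numpy as np

def conv1d_same(x, kernels):
    """Naive 'same'-padded 1-D convolution: x (length L) -> (n_kernels, L), with ReLU."""
    out = []
    for k in kernels:
        pad = len(k) // 2
        xp = np.pad(x, (pad, len(k) - 1 - pad))         # keep output length == L
        out.append(np.convolve(xp, k, mode="valid"))
    return np.maximum(np.stack(out), 0.0)               # ReLU activation

rng = np.random.default_rng(0)
x = rng.standard_normal(256)                            # one normalized ECG segment
# Four parallel branches with kernel widths 15, 13, 9, 5; 32 channels each.
branches = [conv1d_same(x, rng.standard_normal((32, w))) for w in (15, 13, 9, 5)]
e = np.concatenate(branches, axis=0)                    # 128-channel signal E
# 1x1 convolution = per-position linear mix of the 128 channels down to 16.
w1 = rng.standard_normal((16, 128))
x1 = np.maximum(w1 @ e, 0.0)
print(e.shape, x1.shape)                                # (128, 256) (16, 256)
```

The multi-width branches act as parallel filters at different temporal scales; the 1×1 block then compresses the concatenated channels without touching the time axis.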
Embodiment 4:
Step e) comprises the following steps:
e-1) The first convolutional encoding module consists, in order, of a convolutional layer with 32 channels and a 1×11 kernel, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 4. The feature signal X1 is input into the first convolutional encoding module, yielding the 32-dimensional feature signal E5;
e-2) The second convolutional encoding module consists, in order, of a convolutional layer with 64 channels and a 1×7 kernel, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 2. The feature signal E5 is input into the second convolutional encoding module, yielding the 64-dimensional feature signal E6;
e-3) The third convolutional encoding module consists, in order, of a convolutional layer with 128 channels and a kernel of size 3, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 2. The feature signal E6 is input into the third convolutional encoding module, yielding the 128-dimensional feature signal E7;
e-4) The feature signal E7 is input into a bidirectional GRU layer with 32 units equipped with the TPA mechanism, yielding the 64-dimensional feature signal X2. In the TPA-equipped bidirectional GRU layer, the attention weights are computed as τi = σ((GC)iᵀ wk gt), and the feature signal X2 is obtained from the τi-weighted temporal patterns, where i = {1, 2, ..., n}, n = 128, ᵀ denotes the transpose, τi is the attention weight of the i-th row vector, σ(·) is the sigmoid function, (GC)i is the i-th row of the temporal pattern matrix GC, GC = Conv1d(G), Conv1d(·) is a one-dimensional convolution operation, G is the hidden-state matrix, gi is the hidden-state vector of the i-th bidirectional GRU, i = {1, 2, ..., t−1}, t is the time step, wk is a weight coefficient, and gt is the hidden-state vector of the bidirectional GRU at time t.
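The TPA computation in e-4) can be sketched as follows. This is a reconstruction from the description above (the exact way X2 combines the weighted patterns with gt is not fully specified in the text), and the Conv1d step is written as a matrix product of hypothetical filters over the time axis:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tpa_attention(G, gt, conv_filters, W):
    """Temporal pattern attention, sketched from the description above.

    G:            (t-1, d) matrix of past Bi-GRU hidden states g_1..g_{t-1}
    gt:           (d,) hidden state at the current step t
    conv_filters: (n, t-1) hypothetical 1-D convolution filters along time
    W:            (d, d) learned scoring matrix standing in for w_k
    """
    GC = conv_filters @ G                 # (n, d) temporal pattern matrix G_C
    tau = sigmoid(GC @ (W @ gt))          # (n,) attention weight per row of G_C
    v = tau @ GC                          # (d,) weighted temporal context
    return tau, v

rng = np.random.default_rng(1)
t, d, n = 9, 64, 128
tau, v = tpa_attention(rng.standard_normal((t - 1, d)),
                       rng.standard_normal(d),
                       rng.standard_normal((n, t - 1)),
                       rng.standard_normal((d, d)) * 0.01)
print(tau.shape, v.shape)                 # (128,) (64,)
```

Unlike softmax attention, the sigmoid scores are independent per row, so several temporal patterns can be weighted highly at once; the context v would then be combined with gt to produce X2.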
Embodiment 5:
In step g), N is set to 150, the learning rate of the SGD optimizer is 0.001, and the learning rate decays to 0.1 of its current value every 80 epochs. The cross-entropy loss function is calculated as cc(x) = −∑_{i=1}^{L} ŷi(x) log fi(x), where L is the number of categories, L = 2, fi(x) is the predicted label of the i-th category of the predicted category fecg, and ŷi(x) is the true label of the i-th category corresponding to the predicted category fecg. In step m), M is set to 180, the learning rate of the SGD optimizer is 0.001, and the learning rate decays to 0.1 of its current value every 90 epochs. The cross-entropy loss function is calculated as cc(y) = −∑_{i=1}^{L} ŷi(y) log fi(y), where L is the number of categories, L = 2, fi(y) is the predicted label of the i-th category of the predicted category fpcg, and ŷi(y) is the true label of the i-th category of the predicted category fpcg.
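The training schedule of this embodiment (cross-entropy loss with step-wise learning-rate decay) can be sketched as follows, with the numeric values taken from the text for the ECG model:

```python
import numpy as np

def cross_entropy(pred: np.ndarray, true_onehot: np.ndarray) -> float:
    """cc = -sum_i y_i * log f_i for one sample with L softmax outputs."""
    return float(-np.sum(true_onehot * np.log(pred)))

def lr_at_epoch(epoch: int, base_lr: float = 0.001, step: int = 80, gamma: float = 0.1) -> float:
    """Step decay: multiply the learning rate by gamma every `step` epochs."""
    return base_lr * gamma ** (epoch // step)

# Two-class example (L = 2): a confident correct prediction gives a small loss.
print(round(cross_entropy(np.array([0.9, 0.1]), np.array([1.0, 0.0])), 4))  # 0.1054
print(lr_at_epoch(0), lr_at_epoch(80))   # 0.001, then ~1e-4
```

The PCG model uses the same schedule with `step=90` and 180 training rounds, per the text.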
Embodiment 6:
In step i), the normalized heart sound signal x′pcg is calculated by the Z-score formula x′pcg = (xpcg − upcg)/σpcg, where xpcg is the heart sound signal in the training and test sets, upcg is the mean of the heart sound signal, and σpcg is the standard deviation of the heart sound signal.
Embodiment 7:
Step j) comprises the following steps:
j-1) The first convolution branch consists, in order, of a convolutional layer with 32 channels, a 1×15 kernel, and a stride of 2, a batch normalization layer, and a ReLU activation layer. The normalized heart sound signal x′pcg from the training set is input into the first convolution branch, yielding the 32-dimensional feature signal P1;
j-2) The second convolution branch consists, in order, of a convolutional layer with 32 channels, a 1×11 kernel, and a stride of 2, a batch normalization layer, and a ReLU activation layer. The normalized heart sound signal x′pcg from the training set is input into the second convolution branch, yielding the 32-dimensional feature signal P2;
j-3) The third convolution branch consists, in order, of a convolutional layer with 32 channels, a 1×9 kernel, and a stride of 2, a batch normalization layer, and a ReLU activation layer. The normalized heart sound signal x′pcg from the training set is input into the third convolution branch, yielding the 32-dimensional feature signal P3;
j-4) The fourth convolution branch consists, in order, of a convolutional layer with 32 channels, a 1×5 kernel, and a stride of 2, a batch normalization layer, and a ReLU activation layer. The normalized heart sound signal x′pcg from the training set is input into the fourth convolution branch, yielding the 32-dimensional feature signal P4;
j-5) Concatenate the feature signals P1, P2, P3, and P4 to obtain the concatenated 128-dimensional feature signal P = [P1, P2, P3, P4];
j-6) The 1×1 convolution block consists of a convolutional layer with 32 channels, a 1×1 kernel, and a stride of 1, followed by a ReLU activation layer. The 128-dimensional feature signal P = [P1, P2, P3, P4] is input into the 1×1 convolution block, yielding the 32-dimensional feature signal Y1.
Embodiment 8:
Step k) comprises the following steps:
k-1) The first convolutional encoding module consists, in order, of a convolutional layer with 16 channels and a 1×1 kernel, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 4. The feature signal Y1 is input into the first convolutional encoding module, yielding the 16-dimensional feature signal P5;
k-2) The second convolutional encoding module consists, in order, of a convolutional layer with 32 channels and a 1×11 kernel, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 2. The feature signal P5 is input into the second convolutional encoding module, yielding the 32-dimensional feature signal P6;
k-3) The third convolutional encoding module consists, in order, of a convolutional layer with 64 channels and a 1×7 kernel, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 2. The feature signal P6 is input into the third convolutional encoding module, yielding the 64-dimensional feature signal P7;
k-4) The fourth convolutional encoding module consists, in order, of a convolutional layer with 128 channels and a 1×3 kernel, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 2. The feature signal P7 is input into the fourth convolutional encoding module, yielding the 128-dimensional feature signal P8;
k-5) The feature signal P8 is input into a bidirectional GRU layer with 32 units equipped with the TPA mechanism, and the 64-dimensional feature signal Y2 is computed in the TPA-equipped bidirectional GRU layer by the same formula as in step e-4).
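Each convolutional encoding module above ends with a pooling layer (sizes 4, 2, 2, 2), which progressively shortens the time axis before the TPA-Bi-GRU. A minimal sketch of non-overlapping max pooling (a common choice; the text does not specify max vs. average pooling):

```python
import numpy as np

def max_pool1d(x: np.ndarray, size: int) -> np.ndarray:
    """Non-overlapping 1-D max pooling over the time axis of a (channels, length) array."""
    c, L = x.shape
    L2 = L - L % size                       # drop any trailing remainder
    return x[:, :L2].reshape(c, L2 // size, size).max(axis=2)

x = np.arange(16.0).reshape(2, 8)           # 2 channels, 8 time steps
print(max_pool1d(x, 4))                     # each row keeps the max of every 4 steps
```

A size-4 pool quarters the sequence length and the size-2 pools halve it, so the four modules together shorten the time axis by a factor of 32 while the channel count grows from 16 to 128.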
Finally, it should be noted that the above are only preferred embodiments of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described therein or make equivalent replacements for some of their technical features. Any modification, equivalent replacement, improvement, etc., made within the spirit and principles of the present invention shall fall within its protection scope.
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210376944.3A CN114764575B (en) | 2022-04-11 | 2022-04-11 | Multimodal Data Classification Method Based on Deep Learning and Temporal Attention Mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114764575A CN114764575A (en) | 2022-07-19 |
CN114764575B true CN114764575B (en) | 2023-02-28 |
Family
ID=82364741
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210376944.3A Active CN114764575B (en) | 2022-04-11 | 2022-04-11 | Multimodal Data Classification Method Based on Deep Learning and Temporal Attention Mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114764575B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116186593B (en) * | 2023-03-10 | 2023-10-03 | 山东省人工智能研究院 | An ECG signal detection method based on separable convolution and attention mechanism |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2018130541A (en) * | 2017-02-16 | 2018-08-23 | タタ コンサルタンシー サービシズ リミテッドTATA Consultancy Services Limited | Method and system for detection of coronary artery disease in person using fusion approach |
CN110236518A (en) * | 2019-04-02 | 2019-09-17 | 武汉大学 | Method and device for joint classification of ECG and cardiac shock signals based on neural network |
CN110537910A (en) * | 2019-09-18 | 2019-12-06 | 山东大学 | Non-invasive screening system for coronary heart disease based on joint analysis of ECG and heart sound signals |
CN113288163A (en) * | 2021-06-04 | 2021-08-24 | 浙江理工大学 | Multi-feature fusion electrocardiosignal classification model modeling method based on attention mechanism |
CN113855063A (en) * | 2021-10-21 | 2021-12-31 | 华中科技大学 | Heart sound automatic diagnosis system based on deep learning |
CN114190952A (en) * | 2021-12-01 | 2022-03-18 | 山东省人工智能研究院 | 12-lead electrocardiosignal multi-label classification method based on lead grouping |
Non-Patent Citations (2)
Title |
---|
Integrating multi-domain deep features of electrocardiogram and phonocardiogram for coronary artery disease detection; Han Li, et al.; Computers in Biology and Medicine; 2021-11-30; 1-7 *
Heart failure analysis system based on heart sound and ECG signals; Li Junjie, et al.; Software Engineering and Applications; 2022-02-09; 1-10 *
Also Published As
Publication number | Publication date |
---|---|
CN114764575A (en) | 2022-07-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108388348B (en) | An EMG gesture recognition method based on deep learning and attention mechanism | |
CN106778014B (en) | Disease risk prediction modeling method based on recurrent neural network | |
CN112784798A (en) | Multi-modal emotion recognition method based on feature-time attention mechanism | |
CN112766355B (en) | A method for EEG emotion recognition under label noise | |
CN111956212B (en) | Inter-group atrial fibrillation recognition method based on frequency domain filtering-multi-mode deep neural network | |
CN112885372A (en) | Intelligent diagnosis method, system, terminal and medium for power equipment fault sound | |
CN105841961A (en) | Bearing fault diagnosis method based on Morlet wavelet transformation and convolutional neural network | |
CN113749657B (en) | Brain electricity emotion recognition method based on multi-task capsule | |
CN113274031B (en) | Arrhythmia classification method based on depth convolution residual error network | |
CN110522444A (en) | A Kernel-CNN-based ECG Signal Recognition and Classification Method | |
CN109840290B (en) | A dermoscopy image retrieval method based on end-to-end deep hashing | |
CN105095863A (en) | Similarity-weight-semi-supervised-dictionary-learning-based human behavior identification method | |
CN116584951A (en) | A ECG signal detection and localization method based on weakly supervised learning | |
CN107609588A (en) | A kind of disturbances in patients with Parkinson disease UPDRS score Forecasting Methodologies based on voice signal | |
CN112101401B (en) | Multi-modal emotion recognition method based on sparse supervision least square multi-class kernel canonical correlation analysis | |
CN115530788A (en) | Arrhythmia classification method based on self-attention mechanism | |
US20230225663A1 (en) | Method for predicting multi-type electrocardiogram heart rhythms based on graph convolution | |
CN114564990A (en) | Electroencephalogram signal classification method based on multi-channel feedback capsule network | |
CN116186593B (en) | An ECG signal detection method based on separable convolution and attention mechanism | |
CN113768515A (en) | An ECG Signal Classification Method Based on Deep Convolutional Neural Networks | |
CN114764575B (en) | Multimodal Data Classification Method Based on Deep Learning and Temporal Attention Mechanism | |
CN116369877A (en) | A non-invasive blood pressure estimation method based on photoplethysmography | |
CN112465054B (en) | A Multivariate Time Series Data Classification Method Based on FCN | |
CN117398084A (en) | Physiological signal real-time quality assessment method based on light-weight mixed model | |
CN118861843B (en) | Mental health state auxiliary evaluation system based on big data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
CP03 | Change of name, title or address | ||
CP03 | Change of name, title or address |
Address after: No.19 Keyuan Road, Lixia District, Jinan City, Shandong Province Patentee after: Shandong Institute of artificial intelligence Country or region after: China Patentee after: Qilu University of Technology (Shandong Academy of Sciences) Address before: No.19 Keyuan Road, Lixia District, Jinan City, Shandong Province Patentee before: Shandong Institute of artificial intelligence Country or region before: China Patentee before: Qilu University of Technology |