CN114764575B - Multimodal Data Classification Method Based on Deep Learning and Temporal Attention Mechanism - Google Patents
- Publication number: CN114764575B
- Application number: CN202210376944.3A
- Authority: CN (China)
- Legal status: Active (the legal status is an assumption and is not a legal conclusion)
Classifications
- G06F2218/08 — Aspects of pattern recognition specially adapted for signal processing: feature extraction
- G06N3/045 — Computing arrangements based on biological models: neural networks; architecture; combinations of networks
- G06N3/08 — Computing arrangements based on biological models: neural networks; learning methods
- G06F2218/12 — Aspects of pattern recognition specially adapted for signal processing: classification; matching
Abstract
A multimodal data classification method based on deep learning and a temporal attention mechanism. First, the PC-TBG-ECG and PC-TBG-PCG models extract features from electrocardiogram (ECG) signals and heart sound signals, respectively; the XGBoost ensemble classification algorithm then performs feature selection and classification on the extracted features. Regularization is incorporated, increasing computational efficiency while effectively preventing overfitting. The invention is suited to the classification and detection of data of different modalities and can analyze signals from multiple perspectives, thereby improving classification accuracy.
Description
Technical Field
The invention relates to the field of multimodal data classification, and in particular to a multimodal data classification method based on deep learning and a temporal attention mechanism.
Background
As non-invasive and cost-effective signal acquisition tools, the electrocardiogram (ECG) and the phonocardiogram (PCG) are complementary: their latent characteristics can be mined and analyzed from multiple perspectives, improving classification performance. Previous studies have mainly used single-modality data or a single classifier for signal classification, an approach that cannot classify signals comprehensively; a classification method that fuses multimodal data therefore closely matches practical needs.
Summary of the Invention
To overcome the shortcomings of the above technologies, the invention provides a method suited to the classification and detection of data of different modalities, which can analyze signals from multiple perspectives and thereby improve classification accuracy.
The technical solution adopted by the invention to overcome this technical problem is as follows:
A multimodal data classification method based on deep learning and a temporal attention mechanism, comprising the following steps:
a) Select training-a of the PhysioNet/CinC Challenge 2016 as the data set, augment it, and divide the augmented data set into a training set and a test set.
b) Build an ECG signal model consisting, in order, of a PC module, a TBG module, and a classification module.
c) Resample the ECG signals of the training and test sets to 2048 sampling points and apply z-score normalization to obtain the normalized ECG signal x′_ecg.
d) Feed the normalized training-set ECG signal x′_ecg into the PC module of the ECG signal model to obtain the feature signal X1; the PC module consists, in order, of four convolution branches and a 1×1 convolution block.
e) Feed the feature signal X1 into the TBG module of the ECG signal model to obtain the feature signal X2; the TBG module consists of three convolutional encoding modules and a bidirectional GRU layer with a TPA mechanism.
f) Feed the feature signal X2 into the classification module of the ECG signal model to obtain the predicted category f_ecg; the classification module consists, in order, of a fully connected layer and a Softmax activation layer.
g) Repeat steps d) through f) N times, using the SGD optimizer and minimizing the cross-entropy loss function to obtain the trained optimal ECG signal model.
h) Build a heart sound signal model consisting, in order, of a PC module, a TBG module, and a classification module.
i) Resample the heart sound signals of the training and test sets to 8000 sampling points and apply z-score normalization to obtain the normalized heart sound signal x′_pcg.
j) Feed the normalized training-set heart sound signal x′_pcg into the PC module of the heart sound signal model to obtain the feature signal Y1; the PC module consists, in order, of four convolution branches and a 1×1 convolution block.
k) Feed the feature signal Y1 into the TBG module of the heart sound signal model to obtain the feature signal Y2; the TBG module consists of four convolutional encoding modules and a bidirectional GRU layer with a TPA mechanism.
l) Feed the feature signal Y2 into the classification module of the heart sound signal model to obtain the predicted category f_pcg; the classification module consists, in order, of a fully connected layer and a Softmax activation layer.
m) Repeat steps j) through l) M times, using the SGD optimizer and minimizing the cross-entropy loss function to obtain the trained optimal heart sound signal model.
n) Manually re-divide the data set into a new training set and a new test set at a ratio of 4:1. Feed the new training set into the optimal ECG signal model to obtain the 64-dimensional feature signal X3 from the output of its TBG module, and into the optimal heart sound signal model to obtain the 64-dimensional feature signal Y3 from the output of its TBG module; compute the concatenated 128-dimensional feature fusion signal PP_x = [X3, Y3].
o) Feed the feature fusion signal PP_x into the XGBoost classifier to obtain an importance-score ranking of its features, and select the 64 highest-ranked features as the feature signal PP1_x. Select the optimal hyperparameters by 5-fold cross-validation and train the XGBoost classifier with them to obtain the optimized XGBoost classifier.
p) Feed the new test set into the optimal ECG signal model to obtain the 64-dimensional feature signal X4 from the output of its TBG module, and into the optimal heart sound signal model to obtain the 64-dimensional feature signal Y4 from the output of its TBG module; compute the concatenated 128-dimensional feature fusion signal PP_c = [X4, Y4].
q) Feed the feature fusion signal PP_c into the XGBoost classifier to obtain an importance-score ranking of its features, and select the 64 highest-ranked features as the feature signal PP1_c.
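The fusion-and-selection logic of steps n) through q) can be sketched as follows. This is a minimal NumPy illustration only: the random arrays stand in for the TBG outputs X3 and Y3, and the random importance scores stand in for what a trained XGBoost classifier would report.

```python
import numpy as np

rng = np.random.default_rng(0)
X3 = rng.normal(size=(100, 64))   # placeholder for ECG TBG features
Y3 = rng.normal(size=(100, 64))   # placeholder for heart sound TBG features

PPx = np.concatenate([X3, Y3], axis=1)   # PP_x = [X3, Y3], 128-dimensional fusion
scores = rng.random(128)                  # placeholder importance scores
top64 = np.argsort(scores)[::-1][:64]     # indices of the 64 highest-ranked features
PP1x = PPx[:, top64]                      # selected feature signal PP1_x
print(PPx.shape, PP1x.shape)  # (100, 128) (100, 64)
```

In practice the scores would come from the classifier's importance ranking rather than a random vector, but the slicing step is the same.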
Preferably, in step a) the data set is augmented by sliding-window segmentation, and five-fold cross-validation is used to produce five different divisions of the data set into training and test sets.
Further, in step c) the normalized ECG signal is computed as x′_ecg = (x_ecg − u_ecg) / σ_ecg, where x_ecg denotes the ECG signals of the training and test sets, u_ecg their mean, and σ_ecg their standard deviation.
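The normalization of step c) can be sketched as follows (a minimal NumPy version; computing the statistics per recording is an assumption, since the text does not state whether the mean and standard deviation are taken per recording or over the whole set):

```python
import numpy as np

def z_score(x):
    """z-score normalization: x' = (x - u) / sigma."""
    return (x - x.mean()) / x.std()

# Placeholder for one resampled 2048-point ECG recording.
ecg = np.random.default_rng(0).normal(loc=5.0, scale=2.0, size=2048)
ecg_n = z_score(ecg)
print(ecg_n.mean().round(6), ecg_n.std().round(6))  # mean ~ 0, std = 1
```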
Further, step d) comprises the following steps:
d-1) The first convolution branch consists, in order, of a convolutional layer with 32 channels, kernel size 1×15, and stride 1, a batch normalization layer, and a ReLU activation layer. The normalized training-set ECG signal x′_ecg is fed into the first convolution branch, which outputs the 32-dimensional feature signal E1.
d-2) The second convolution branch consists, in order, of a convolutional layer with 32 channels, kernel size 1×13, and stride 1, a batch normalization layer, and a ReLU activation layer. The normalized training-set ECG signal x′_ecg is fed into the second convolution branch, which outputs the 32-dimensional feature signal E2.
d-3) The third convolution branch consists, in order, of a convolutional layer with 32 channels, kernel size 1×9, and stride 1, a batch normalization layer, and a ReLU activation layer. The normalized training-set ECG signal x′_ecg is fed into the third convolution branch, which outputs the 32-dimensional feature signal E3.
d-4) The fourth convolution branch consists, in order, of a convolutional layer with 32 channels, kernel size 1×5, and stride 1, a batch normalization layer, and a ReLU activation layer. The normalized training-set ECG signal x′_ecg is fed into the fourth convolution branch, which outputs the 32-dimensional feature signal E4.
d-5) The feature signals E1, E2, E3, and E4 are concatenated to obtain the 128-dimensional feature signal E = [E1, E2, E3, E4].
d-6) The 1×1 convolution block consists of a convolutional layer with 16 channels, kernel size 1×1, and stride 1, followed by a ReLU activation layer. The 128-dimensional feature signal E = [E1, E2, E3, E4] is fed into the 1×1 convolution block, which outputs the 16-dimensional feature signal X1.
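Steps d-1) through d-6) can be sketched in PyTorch as below. This is a minimal illustration, not the patented implementation: the padding choices (used to keep the sequence length unchanged) and the single-channel input layout are assumptions the text does not specify.

```python
import torch
import torch.nn as nn

class PCModule(nn.Module):
    """Sketch of the parallel-convolution (PC) module for the ECG branch:
    four parallel Conv1d branches (kernel sizes 15/13/9/5, stride 1), each
    followed by batch normalization and ReLU, concatenated channel-wise and
    fused by a 1x1 convolution down to 16 channels."""

    def __init__(self):
        super().__init__()
        def branch(k):
            return nn.Sequential(
                nn.Conv1d(1, 32, kernel_size=k, stride=1, padding=k // 2),
                nn.BatchNorm1d(32),
                nn.ReLU(),
            )
        self.branches = nn.ModuleList([branch(k) for k in (15, 13, 9, 5)])
        # 1x1 convolution block: 128 -> 16 channels, followed by ReLU.
        self.fuse = nn.Sequential(nn.Conv1d(128, 16, kernel_size=1), nn.ReLU())

    def forward(self, x):
        e = torch.cat([b(x) for b in self.branches], dim=1)  # E = [E1, E2, E3, E4]
        return self.fuse(e)                                  # X1: 16 channels

x = torch.randn(8, 1, 2048)   # batch of normalized 2048-point ECG signals
print(PCModule()(x).shape)    # torch.Size([8, 16, 2048])
```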
Further, step e) comprises the following steps:
e-1) The first convolutional encoding module consists, in order, of a convolutional layer with 32 channels and kernel size 1×11, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 4. The feature signal X1 is fed into the first convolutional encoding module, which outputs the 32-dimensional feature signal E5.
e-2) The second convolutional encoding module consists, in order, of a convolutional layer with 64 channels and kernel size 1×7, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 2. The feature signal E5 is fed into the second convolutional encoding module, which outputs the 64-dimensional feature signal E6.
e-3) The third convolutional encoding module consists, in order, of a convolutional layer with 128 channels and kernel size 3, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 2. The feature signal E6 is fed into the third convolutional encoding module, which outputs the 128-dimensional feature signal E7.
e-4) The feature signal E7 is fed into a bidirectional GRU layer with 32 units and a TPA mechanism, which outputs the 64-dimensional feature signal X2. In the TPA-equipped bidirectional GRU layer the attention weights are computed as τ_i = σ((G_i^C)^T · w_k · g_t), i = 1, 2, ..., n, n = 128, and X2 is obtained from the τ_i-weighted rows of G^C together with g_t. Here T denotes transposition, τ_i is the attention weight of the i-th row vector, σ(·) is the sigmoid function, G_i^C is the i-th row of the temporal pattern matrix G^C, G^C = Conv1d(G), where Conv1d(·) is a one-dimensional convolution, G is the hidden-state matrix, g_i (i = 1, 2, ..., t−1) is the hidden-state vector of the bidirectional GRU at step i, t is the current time step, w_k is a weight coefficient, and g_t is the hidden-state vector of the bidirectional GRU at time t.
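A minimal NumPy sketch of the temporal pattern attention (TPA) scoring described in step e-4), assuming the standard TPA formulation (sigmoid scoring of the rows of the convolved hidden-state matrix G^C against the current hidden state g_t); the filter and weight-matrix shapes here are illustrative assumptions, not values from the patent:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tpa(G, g_t, conv_filters, W_k):
    """G: (t-1, d) matrix of past hidden states g_1..g_{t-1};
    g_t: (d,) current hidden state; conv_filters: (n, t-1) 1-D kernels
    applied along the time axis; W_k: (d, d) scoring weights."""
    G_C = conv_filters @ G           # temporal pattern matrix G^C, shape (n, d)
    tau = sigmoid(G_C @ W_k @ g_t)   # attention weight tau_i per row of G^C
    v = tau @ G_C                    # attention-weighted pattern summary, shape (d,)
    return tau, v

rng = np.random.default_rng(0)
t, d, n = 9, 64, 128
tau, v = tpa(rng.normal(size=(t - 1, d)), rng.normal(size=d),
             rng.normal(size=(n, t - 1)), rng.normal(size=(d, d)) / d)
print(tau.shape, v.shape)  # (128,) (64,)
```

The summary vector v would then be combined with g_t (e.g. via learned projections) to produce the 64-dimensional output X2.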
Further, in step g) N is set to 150, the learning rate of the SGD optimizer is 0.001, and every 80 epochs the learning rate decays to 0.1 times its current value. The cross-entropy loss function is computed as cc(x) = −∑_{i=1}^{L} f̂_i(x)·log f_i(x), where L is the number of categories, L = 2, f_i(x) is the predicted label of the i-th category of the predicted category f_ecg, and f̂_i(x) is the true label of the i-th category corresponding to f_ecg. In step m), M is set to 180, the learning rate of the SGD optimizer is 0.001, and every 90 epochs the learning rate decays to 0.1 times its current value; the cross-entropy loss function cc(y) = −∑_{i=1}^{L} f̂_i(y)·log f_i(y) is minimized, where f_i(y) is the predicted label of the i-th category of the predicted category f_pcg and f̂_i(y) is the corresponding true label.
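The training schedule of steps g) and m) can be sketched as follows; the two-class cross-entropy matches the cc(x) formula of step g), and the decay step is 80 epochs for the ECG model and 90 for the heart sound model:

```python
import numpy as np

def cross_entropy(pred, true):
    """cc(x) = -sum_{i=1}^{L} true_i * log(pred_i), with L = 2 classes."""
    return -np.sum(true * np.log(pred))

def lr_at_epoch(epoch, base_lr=0.001, step=80, gamma=0.1):
    """SGD learning rate: decays to 0.1x its current value every `step` epochs."""
    return base_lr * gamma ** (epoch // step)

# Base rate 0.001; one decay step after epoch 80 (use step=90 for heart sounds).
print(lr_at_epoch(0), lr_at_epoch(80))
print(round(cross_entropy(np.array([0.9, 0.1]), np.array([1.0, 0.0])), 4))  # 0.1054
```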
Further, in step i) the normalized heart sound signal is computed as x′_pcg = (x_pcg − u_pcg) / σ_pcg, where x_pcg denotes the heart sound signals of the training and test sets, u_pcg their mean, and σ_pcg their standard deviation.
Further, step j) comprises the following steps:
j-1) The first convolution branch consists, in order, of a convolutional layer with 32 channels, kernel size 1×15, and stride 2, a batch normalization layer, and a ReLU activation layer. The normalized training-set heart sound signal x′_pcg is fed into the first convolution branch, which outputs the 32-dimensional feature signal P1.
j-2) The second convolution branch consists, in order, of a convolutional layer with 32 channels, kernel size 1×11, and stride 2, a batch normalization layer, and a ReLU activation layer. The normalized training-set heart sound signal x′_pcg is fed into the second convolution branch, which outputs the 32-dimensional feature signal P2.
j-3) The third convolution branch consists, in order, of a convolutional layer with 32 channels, kernel size 1×9, and stride 2, a batch normalization layer, and a ReLU activation layer. The normalized training-set heart sound signal x′_pcg is fed into the third convolution branch, which outputs the 32-dimensional feature signal P3.
j-4) The fourth convolution branch consists, in order, of a convolutional layer with 32 channels, kernel size 1×5, and stride 2, a batch normalization layer, and a ReLU activation layer. The normalized training-set heart sound signal x′_pcg is fed into the fourth convolution branch, which outputs the 32-dimensional feature signal P4.
j-5) The feature signals P1, P2, P3, and P4 are concatenated to obtain the 128-dimensional feature signal P = [P1, P2, P3, P4].
j-6) The 1×1 convolution block consists of a convolutional layer with 32 channels, kernel size 1×1, and stride 1, followed by a ReLU activation layer. The 128-dimensional feature signal P = [P1, P2, P3, P4] is fed into the 1×1 convolution block, which outputs the 32-dimensional feature signal Y1.
Further, step k) comprises the following steps:
k-1) The first convolutional encoding module consists, in order, of a convolutional layer with 16 channels and kernel size 1×1, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 4. The feature signal Y1 is fed into the first convolutional encoding module, which outputs the 16-dimensional feature signal P5.
k-2) The second convolutional encoding module consists, in order, of a convolutional layer with 32 channels and kernel size 1×11, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 2. The feature signal P5 is fed into the second convolutional encoding module, which outputs the 32-dimensional feature signal P6.
k-3) The third convolutional encoding module consists, in order, of a convolutional layer with 64 channels and kernel size 1×7, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 2. The feature signal P6 is fed into the third convolutional encoding module, which outputs the 64-dimensional feature signal P7.
k-4) The fourth convolutional encoding module consists, in order, of a convolutional layer with 128 channels and kernel size 1×3, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 2. The feature signal P7 is fed into the fourth convolutional encoding module, which outputs the 128-dimensional feature signal P8.
k-5) The feature signal P8 is fed into a bidirectional GRU layer with 32 units and a TPA mechanism, which outputs the 64-dimensional feature signal Y2, computed in the TPA-equipped bidirectional GRU layer in the same way as X2 in step e-4).
The beneficial effects of the invention are as follows. The PC-TBG-ECG and PC-TBG-PCG models first extract features from the ECG signals and the heart sound signals, respectively; the XGBoost ensemble classification algorithm then performs feature selection and classification on the extracted features. Regularization is incorporated, increasing computational efficiency while effectively preventing overfitting. The invention is suited to the classification and detection of data of different modalities and can analyze signals from multiple perspectives, thereby improving classification accuracy.
Brief Description of the Drawings
Fig. 1 is a flowchart of the method of the invention;
Fig. 2 is a network structure diagram of the PC module of the invention.
Detailed Description of the Embodiments
The invention is further described below with reference to Figs. 1 and 2.
一种基于深度学习和时序注意力机制的多模态数据分类方法,包括如下步骤:A multi-modal data classification method based on deep learning and temporal attention mechanism, comprising the following steps:
a)选择PhysioNet/CinC Challenge 2016中的training-a作为数据集,对数据集进行扩充,将扩充后的数据集划分为训练集和测试集。a) Select training-a in PhysioNet/CinC Challenge 2016 as the data set, expand the data set, and divide the expanded data set into training set and test set.
b)建立心电信号模型(PC-TBG-ECG),该心电信号模型依次由PC模块、TBG模块、分类模块构成。b) Establish an electrocardiographic signal model (PC-TBG-ECG), which is composed of a PC module, a TBG module, and a classification module in sequence.
c)将训练集和测试集中的心电信号重采样到2048个采样点后进行z-score归一化处理,得到归一化后的心电信号x′ecg。c) Resampling the ECG signals in the training set and the test set to 2048 sampling points and performing z-score normalization processing to obtain the normalized ECG signal x′ ecg .
d)将训练集中归一化后的心电信号x′ecg输入到心电信号模型的PC模块,输出得到特征信号X1,PC模块依次由四个卷积分支和一个1×1的卷积块构成。d) Input the normalized ECG signal x′ ecg in the training set to the PC module of the ECG signal model, and output the characteristic signal X 1 . The PC module consists of four convolution branches and a 1×1 convolution in turn block composition.
e)将特征信号X1输入到心电信号模型的TBG模块,输出得到特征信号X2,TBG模块由3个卷积编码模块和一个带有TPA机制的双向GRU层(TPA-Bi-GRU)构成。f)将特征信号X2输入到心电信号模型的分类模块中,输出得到预测类别fecg,分类模块依次由全连接层和Softmax激活层构成。e) The characteristic signal X 1 is input to the TBG module of the ECG signal model, and the characteristic signal X 2 is output. The TBG module consists of 3 convolutional coding modules and a bidirectional GRU layer with a TPA mechanism (TPA-Bi-GRU) constitute. f) Input the feature signal X 2 into the classification module of the ECG signal model, and output the predicted category f ecg , and the classification module is sequentially composed of a fully connected layer and a Softmax activation layer.
g)重复步骤d)至步骤f)N次,使用SGD优化器,通过最小化交叉熵损失函数得到训练后的最优的心电信号模型。g) repeating step d) to step f) N times, using the SGD optimizer to obtain the optimal ECG model after training by minimizing the cross-entropy loss function.
h)建立心音信号模型(PC-TBG-PCG),该心音信号模型依次由PC模块、TBG模块、分类模块构成。h) Establishing a heart sound signal model (PC-TBG-PCG), which is sequentially composed of a PC module, a TBG module, and a classification module.
i)将训练集和测试集中的心音信号重采样到8000个采样点后进行z-score归一化处理,得到归一化后的心音信号x′pcg。i) Resample the heart sound signals in the training set and the test set to 8000 sampling points, and then perform z-score normalization processing to obtain the normalized heart sound signal x′ pcg .
j)将训练集中归一化后的心音信号x′pcg输入到心音信号模型的PC模块,输出得到特征信号Y1,PC模块依次由四个卷积分支和一个1×1的卷积块构成。j) Input the normalized heart sound signal x′ pcg in the training set to the PC module of the heart sound signal model, and output the characteristic signal Y 1 . The PC module is sequentially composed of four convolution branches and a 1×1 convolution block .
k)将特征信号Y1输入到心音信号模型的TBG模块,输出得到特征信号Y2,TBG模块由4个卷积编码模块和一个带有TPA机制的双向GRU层(TPA-Bi-GRU)构成。l)将特征信号Y2输入到心音信号模型的分类模块中,输出得到预测类别fpcg,分类模块依次由全连接层和Softmax激活层构成。k) Input the characteristic signal Y 1 into the TBG module of the heart sound signal model, and output the characteristic signal Y 2 , the TBG module consists of 4 convolutional coding modules and a bidirectional GRU layer (TPA-Bi-GRU) with a TPA mechanism . l) The characteristic signal Y 2 is input into the classification module of the heart sound signal model, and the predicted category f pcg is outputted. The classification module is sequentially composed of a fully connected layer and a Softmax activation layer.
m)重复步骤j)至步骤l)M次,使用SGD优化器,通过最小化交叉熵损失函数得到训练后的最优的心音信号模型。m) Repeat step j) to step l) M times, use the SGD optimizer, and obtain the optimal heart sound signal model after training by minimizing the cross-entropy loss function.
n)将数据集重新按4:1的比例手动划分成新的训练集和新的测试集,将新的训练集输入到最优的心电信号模型中,通过最优的心电信号模型的TBG模块输出得到64维的特征信号X3,将新的训练集输入到最优的心音信号模型中,通过最优的心音信号模型的TBG模块输出得到64维的特征信号Y3,通过公式PPx=[X3,Y3]计算得到拼接的128维的特征融合信号PPx。n) Manually divide the data set into a new training set and a new test set according to the ratio of 4:1, input the new training set into the optimal ECG signal model, and pass the optimal ECG signal model The TBG module outputs the 64-dimensional characteristic signal X 3 , and the new training set is input into the optimal heart sound signal model, and the 64-dimensional characteristic signal Y 3 is obtained through the output of the TBG module of the optimal heart sound signal model, and the formula PP x = [X 3 , Y 3 ] is calculated to obtain the concatenated 128-dimensional feature fusion signal PP x .
o)将特征融合信号PPx输入到XGBoost分类器中,得到特征融合信号PPx的重要性分数排名,选择重要性分数排名前64的信号作为特征信号PP1 x,采用5折交叉验证选择最优超参数,利用最优超参数训练XGBoost分类器,得到优化后的XGBoost分类器。o) Input the feature fusion signal PP x into the XGBoost classifier to obtain the importance score ranking of the feature fusion signal PP x , select the top 64 signals with the importance score as the feature signal PP 1 x , and use 5-fold cross-validation to select the most Optimal hyperparameters, use the optimal hyperparameters to train the XGBoost classifier, and get the optimized XGBoost classifier.
p) Input the new test set into the optimal ECG signal model and take the output of its TBG module to obtain the 64-dimensional feature signal X4; input the new test set into the optimal heart sound signal model and take the output of its TBG module to obtain the 64-dimensional feature signal Y4; then compute the concatenated 128-dimensional feature fusion signal PPc = [X4, Y4].
q) Input the feature fusion signal PPc into the XGBoost classifier to obtain an importance-score ranking of its features, and select the 64 features with the highest importance scores as the feature signal PP1c.
No noise reduction, filtering, or other preprocessing of the signals is required, which avoids the low classification accuracy and poor practicality previously caused by unreasonable signal preprocessing and ensures the robustness of the model. First, the PC-TBG-ECG and PC-TBG-PCG models extract features from the ECG signal and the heart sound signal, respectively; the XGBoost ensemble classification algorithm then performs feature selection and classification on the extracted features. Regularization is added while improving computational efficiency, effectively preventing overfitting. The invention is suitable for the classification and detection of data in different modalities and can analyze signals from multiple perspectives, thereby improving classification accuracy.
Embodiment 1:
In step a), the data set is augmented by sliding-window segmentation, and 5-fold cross-validation is used to divide the data set into five different training-set/test-set splits.
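Sliding-window segmentation cuts one long recording into overlapping fixed-length segments, multiplying the number of training examples. A minimal sketch (the window and step sizes below are toy values; the patent does not state the actual ones in this embodiment):

```python
import numpy as np

def sliding_windows(signal: np.ndarray, win: int, step: int) -> np.ndarray:
    """Cut one 1-D recording into overlapping fixed-length segments."""
    starts = range(0, len(signal) - win + 1, step)
    return np.stack([signal[s:s + win] for s in starts])

sig = np.arange(10.0)                  # toy 10-sample recording
segs = sliding_windows(sig, win=4, step=2)
print(segs.shape)                      # (4, 4): segments start at 0, 2, 4, 6
```

With `step < win` the segments overlap, so one recording yields several partially redundant training samples, which is the augmentation effect described in step a).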
Embodiment 2:
In step c), the normalized ECG signal x′ecg is calculated by the Z-score formula x′ecg = (xecg − uecg)/σecg, where xecg is the ECG signal in the training and test sets, uecg is the mean of the ECG signal, and σecg is the standard deviation of the ECG signal.
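The Z-score normalization of step c) is, in a minimal sketch:

```python
import numpy as np

def zscore(x: np.ndarray) -> np.ndarray:
    """Z-score normalization: subtract the mean, divide by the standard deviation."""
    return (x - x.mean()) / x.std()

x_ecg = np.array([1.0, 2.0, 3.0, 4.0])   # toy stand-in for an ECG segment
x_norm = zscore(x_ecg)
print(x_norm.mean(), x_norm.std())       # ≈ 0.0 and 1.0
```

After this step the signal has zero mean and unit standard deviation, so the convolution branches in step d) see inputs on a common scale regardless of sensor gain.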
Embodiment 3:
Step d) comprises the following steps:
d-1) The first convolution branch consists, in order, of a convolutional layer with 32 channels, a 1×15 kernel, and a stride of 1, a batch normalization layer, and a ReLU activation layer. The normalized ECG signal x′ecg from the training set is input into the first convolution branch, yielding the 32-dimensional feature signal E1;
d-2) The second convolution branch consists, in order, of a convolutional layer with 32 channels, a 1×13 kernel, and a stride of 1, a batch normalization layer, and a ReLU activation layer. The normalized ECG signal x′ecg from the training set is input into the second convolution branch, yielding the 32-dimensional feature signal E2;
d-3) The third convolution branch consists, in order, of a convolutional layer with 32 channels, a 1×9 kernel, and a stride of 1, a batch normalization layer, and a ReLU activation layer. The normalized ECG signal x′ecg from the training set is input into the third convolution branch, yielding the 32-dimensional feature signal E3;
d-4) The fourth convolution branch consists, in order, of a convolutional layer with 32 channels, a 1×5 kernel, and a stride of 1, a batch normalization layer, and a ReLU activation layer. The normalized ECG signal x′ecg from the training set is input into the fourth convolution branch, yielding the 32-dimensional feature signal E4;
d-5) Concatenate the feature signals E1, E2, E3, and E4 to obtain the concatenated 128-dimensional feature signal E = [E1, E2, E3, E4];
d-6) The 1×1 convolution block consists of a convolutional layer with 16 channels, a 1×1 kernel, and a stride of 1, followed by a ReLU activation layer. The 128-dimensional feature signal E = [E1, E2, E3, E4] is input into the 1×1 convolution block, yielding the 16-dimensional feature signal X1.
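As an illustrative sketch only, the shape bookkeeping of step d) can be mimicked in NumPy with random weights (batch normalization is omitted and all weights are hypothetical): four parallel 32-channel branches with different kernel widths concatenate to 128 channels, and the 1×1 convolution is a per-position linear mix down to 16 channels.

```python
import numpy as np

def conv1d_same(x, kernels):
    """Naive 'same'-padded 1-D convolution: x (length L) -> (n_kernels, L), with ReLU."""
    out = []
    for k in kernels:
        pad = len(k) // 2
        xp = np.pad(x, (pad, len(k) - 1 - pad))         # keep output length == L
        out.append(np.convolve(xp, k, mode="valid"))
    return np.maximum(np.stack(out), 0.0)               # ReLU activation

rng = np.random.default_rng(0)
x = rng.standard_normal(256)                            # one normalized ECG segment
# Four parallel branches with kernel widths 15, 13, 9, 5; 32 channels each.
branches = [conv1d_same(x, rng.standard_normal((32, w))) for w in (15, 13, 9, 5)]
e = np.concatenate(branches, axis=0)                    # 128-channel signal E
# 1x1 convolution = per-position linear mix of the 128 channels down to 16.
w1 = rng.standard_normal((16, 128))
x1 = np.maximum(w1 @ e, 0.0)
print(e.shape, x1.shape)                                # (128, 256) (16, 256)
```

The multi-width branches act as parallel filters at different temporal scales; the 1×1 block then compresses the concatenated channels without touching the time axis.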
Embodiment 4:
Step e) comprises the following steps:
e-1) The first convolutional encoding module consists, in order, of a convolutional layer with 32 channels and a 1×11 kernel, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 4. The feature signal X1 is input into the first convolutional encoding module, yielding the 32-dimensional feature signal E5;
e-2) The second convolutional encoding module consists, in order, of a convolutional layer with 64 channels and a 1×7 kernel, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 2. The feature signal E5 is input into the second convolutional encoding module, yielding the 64-dimensional feature signal E6;
e-3) The third convolutional encoding module consists, in order, of a convolutional layer with 128 channels and a kernel of size 3, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 2. The feature signal E6 is input into the third convolutional encoding module, yielding the 128-dimensional feature signal E7;
e-4) The feature signal E7 is input into a bidirectional GRU layer with 32 units equipped with the TPA mechanism, yielding the 64-dimensional feature signal X2. In the TPA-equipped bidirectional GRU layer, the attention weights are computed as τi = σ((GC)iᵀ wk gt), and the feature signal X2 is obtained from the τi-weighted temporal patterns, where i = {1, 2, ..., n}, n = 128, ᵀ denotes the transpose, τi is the attention weight of the i-th row vector, σ(·) is the sigmoid function, (GC)i is the i-th row of the temporal pattern matrix GC, GC = Conv1d(G), Conv1d(·) is a one-dimensional convolution operation, G is the hidden-state matrix, gi is the hidden-state vector of the i-th bidirectional GRU, i = {1, 2, ..., t−1}, t is the time step, wk is a weight coefficient, and gt is the hidden-state vector of the bidirectional GRU at time t.
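The TPA computation in e-4) can be sketched as follows. This is a reconstruction from the description above (the exact way X2 combines the weighted patterns with gt is not fully specified in the text), and the Conv1d step is written as a matrix product of hypothetical filters over the time axis:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def tpa_attention(G, gt, conv_filters, W):
    """Temporal pattern attention, sketched from the description above.

    G:            (t-1, d) matrix of past Bi-GRU hidden states g_1..g_{t-1}
    gt:           (d,) hidden state at the current step t
    conv_filters: (n, t-1) hypothetical 1-D convolution filters along time
    W:            (d, d) learned scoring matrix standing in for w_k
    """
    GC = conv_filters @ G                 # (n, d) temporal pattern matrix G_C
    tau = sigmoid(GC @ (W @ gt))          # (n,) attention weight per row of G_C
    v = tau @ GC                          # (d,) weighted temporal context
    return tau, v

rng = np.random.default_rng(1)
t, d, n = 9, 64, 128
tau, v = tpa_attention(rng.standard_normal((t - 1, d)),
                       rng.standard_normal(d),
                       rng.standard_normal((n, t - 1)),
                       rng.standard_normal((d, d)) * 0.01)
print(tau.shape, v.shape)                 # (128,) (64,)
```

Unlike softmax attention, the sigmoid scores are independent per row, so several temporal patterns can be weighted highly at once; the context v would then be combined with gt to produce X2.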
Embodiment 5:
In step g), N is set to 150, the learning rate of the SGD optimizer is 0.001, and the learning rate decays to 0.1 of its current value every 80 epochs. The cross-entropy loss function is calculated as cc(x) = −∑_{i=1}^{L} ŷi(x) log fi(x), where L is the number of categories, L = 2, fi(x) is the predicted label of the i-th category of the predicted category fecg, and ŷi(x) is the true label of the i-th category corresponding to the predicted category fecg. In step m), M is set to 180, the learning rate of the SGD optimizer is 0.001, and the learning rate decays to 0.1 of its current value every 90 epochs. The cross-entropy loss function is calculated as cc(y) = −∑_{i=1}^{L} ŷi(y) log fi(y), where L is the number of categories, L = 2, fi(y) is the predicted label of the i-th category of the predicted category fpcg, and ŷi(y) is the true label of the i-th category of the predicted category fpcg.
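The training schedule of this embodiment (cross-entropy loss with step-wise learning-rate decay) can be sketched as follows, with the numeric values taken from the text for the ECG model:

```python
import numpy as np

def cross_entropy(pred: np.ndarray, true_onehot: np.ndarray) -> float:
    """cc = -sum_i y_i * log f_i for one sample with L softmax outputs."""
    return float(-np.sum(true_onehot * np.log(pred)))

def lr_at_epoch(epoch: int, base_lr: float = 0.001, step: int = 80, gamma: float = 0.1) -> float:
    """Step decay: multiply the learning rate by gamma every `step` epochs."""
    return base_lr * gamma ** (epoch // step)

# Two-class example (L = 2): a confident correct prediction gives a small loss.
print(round(cross_entropy(np.array([0.9, 0.1]), np.array([1.0, 0.0])), 4))  # 0.1054
print(lr_at_epoch(0), lr_at_epoch(80))   # 0.001, then ~1e-4
```

The PCG model uses the same schedule with `step=90` and 180 training rounds, per the text.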
Embodiment 6:
In step i), the normalized heart sound signal x′pcg is calculated by the Z-score formula x′pcg = (xpcg − upcg)/σpcg, where xpcg is the heart sound signal in the training and test sets, upcg is the mean of the heart sound signal, and σpcg is the standard deviation of the heart sound signal.
Embodiment 7:
Step j) comprises the following steps:
j-1) The first convolution branch consists, in order, of a convolutional layer with 32 channels, a 1×15 kernel, and a stride of 2, a batch normalization layer, and a ReLU activation layer. The normalized heart sound signal x′pcg from the training set is input into the first convolution branch, yielding the 32-dimensional feature signal P1;
j-2) The second convolution branch consists, in order, of a convolutional layer with 32 channels, a 1×11 kernel, and a stride of 2, a batch normalization layer, and a ReLU activation layer. The normalized heart sound signal x′pcg from the training set is input into the second convolution branch, yielding the 32-dimensional feature signal P2;
j-3) The third convolution branch consists, in order, of a convolutional layer with 32 channels, a 1×9 kernel, and a stride of 2, a batch normalization layer, and a ReLU activation layer. The normalized heart sound signal x′pcg from the training set is input into the third convolution branch, yielding the 32-dimensional feature signal P3;
j-4) The fourth convolution branch consists, in order, of a convolutional layer with 32 channels, a 1×5 kernel, and a stride of 2, a batch normalization layer, and a ReLU activation layer. The normalized heart sound signal x′pcg from the training set is input into the fourth convolution branch, yielding the 32-dimensional feature signal P4;
j-5) Concatenate the feature signals P1, P2, P3, and P4 to obtain the concatenated 128-dimensional feature signal P = [P1, P2, P3, P4];
j-6) The 1×1 convolution block consists of a convolutional layer with 32 channels, a 1×1 kernel, and a stride of 1, followed by a ReLU activation layer. The 128-dimensional feature signal P = [P1, P2, P3, P4] is input into the 1×1 convolution block, yielding the 32-dimensional feature signal Y1.
Embodiment 8:
Step k) comprises the following steps:
k-1) The first convolutional encoding module consists, in order, of a convolutional layer with 16 channels and a 1×1 kernel, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 4. The feature signal Y1 is input into the first convolutional encoding module, yielding the 16-dimensional feature signal P5;
k-2) The second convolutional encoding module consists, in order, of a convolutional layer with 32 channels and a 1×11 kernel, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 2. The feature signal P5 is input into the second convolutional encoding module, yielding the 32-dimensional feature signal P6;
k-3) The third convolutional encoding module consists, in order, of a convolutional layer with 64 channels and a 1×7 kernel, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 2. The feature signal P6 is input into the third convolutional encoding module, yielding the 64-dimensional feature signal P7;
k-4) The fourth convolutional encoding module consists, in order, of a convolutional layer with 128 channels and a 1×3 kernel, a batch normalization layer, a ReLU activation layer, and a pooling layer of size 2. The feature signal P7 is input into the fourth convolutional encoding module, yielding the 128-dimensional feature signal P8;
k-5) The feature signal P8 is input into a bidirectional GRU layer with 32 units equipped with the TPA mechanism, and the 64-dimensional feature signal Y2 is computed in the TPA-equipped bidirectional GRU layer by the same formula as in step e-4).
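Each convolutional encoding module above ends with a pooling layer (sizes 4, 2, 2, 2), which progressively shortens the time axis before the TPA-Bi-GRU. A minimal sketch of non-overlapping max pooling (a common choice; the text does not specify max vs. average pooling):

```python
import numpy as np

def max_pool1d(x: np.ndarray, size: int) -> np.ndarray:
    """Non-overlapping 1-D max pooling over the time axis of a (channels, length) array."""
    c, L = x.shape
    L2 = L - L % size                       # drop any trailing remainder
    return x[:, :L2].reshape(c, L2 // size, size).max(axis=2)

x = np.arange(16.0).reshape(2, 8)           # 2 channels, 8 time steps
print(max_pool1d(x, 4))                     # each row keeps the max of every 4 steps
```

A size-4 pool quarters the sequence length and the size-2 pools halve it, so the four modules together shorten the time axis by a factor of 32 while the channel count grows from 16 to 128.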
Finally, it should be noted that the above are only preferred embodiments of the present invention and are not intended to limit it. Although the present invention has been described in detail with reference to the foregoing embodiments, those skilled in the art may still modify the technical solutions described therein or make equivalent replacements for some of their technical features. Any modification, equivalent replacement, improvement, etc., made within the spirit and principles of the present invention shall fall within its protection scope.
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210376944.3A CN114764575B (en) | 2022-04-11 | 2022-04-11 | Multimodal Data Classification Method Based on Deep Learning and Temporal Attention Mechanism |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114764575A CN114764575A (en) | 2022-07-19 |
CN114764575B true CN114764575B (en) | 2023-02-28 |
Family
ID=82364741
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210376944.3A Active CN114764575B (en) | 2022-04-11 | 2022-04-11 | Multimodal Data Classification Method Based on Deep Learning and Temporal Attention Mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114764575B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116186593B (en) * | 2023-03-10 | 2023-10-03 | 山东省人工智能研究院 | An ECG signal detection method based on separable convolution and attention mechanism |
Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JP2018130541A (en) * | 2017-02-16 | 2018-08-23 | タタ コンサルタンシー サービシズ リミテッドTATA Consultancy Services Limited | Method and system for detection of coronary artery disease in person using fusion approach |
CN110236518A (en) * | 2019-04-02 | 2019-09-17 | 武汉大学 | Method and device for joint classification of ECG and cardiac shock signals based on neural network |
CN110537910A (en) * | 2019-09-18 | 2019-12-06 | 山东大学 | Non-invasive screening system for coronary heart disease based on joint analysis of ECG and heart sound signals |
CN113288163A (en) * | 2021-06-04 | 2021-08-24 | 浙江理工大学 | Multi-feature fusion electrocardiosignal classification model modeling method based on attention mechanism |
CN113855063A (en) * | 2021-10-21 | 2021-12-31 | 华中科技大学 | Heart sound automatic diagnosis system based on deep learning |
CN114190952A (en) * | 2021-12-01 | 2022-03-18 | 山东省人工智能研究院 | 12-lead electrocardiosignal multi-label classification method based on lead grouping |
Non-Patent Citations (2)
Title |
---|
Integrating multi-domain deep features of electrocardiogram and phonocardiogram for coronary artery disease detection; Han Li, et al.; Computers in Biology and Medicine; 2021-11-30; 1-7 *
Heart failure analysis system based on heart sound and ECG signals; Li Junjie, et al.; Software Engineering and Applications; 2022-02-09; 1-10 *
Also Published As
Publication number | Publication date |
---|---|
CN114764575A (en) | 2022-07-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108388348B (en) | An EMG gesture recognition method based on deep learning and attention mechanism | |
CN106778014B (en) | Disease risk prediction modeling method based on recurrent neural network | |
CN112784798A (en) | Multi-modal emotion recognition method based on feature-time attention mechanism | |
CN112766355B (en) | A method for EEG emotion recognition under label noise | |
CN111956212B (en) | Inter-group atrial fibrillation recognition method based on frequency domain filtering-multi-mode deep neural network | |
CN112885372A (en) | Intelligent diagnosis method, system, terminal and medium for power equipment fault sound | |
CN105841961A (en) | Bearing fault diagnosis method based on Morlet wavelet transformation and convolutional neural network | |
CN113749657B (en) | Brain electricity emotion recognition method based on multi-task capsule | |
CN113274031B (en) | Arrhythmia classification method based on depth convolution residual error network | |
CN110522444A (en) | A Kernel-CNN-based ECG Signal Recognition and Classification Method | |
CN109840290B (en) | A dermoscopy image retrieval method based on end-to-end deep hashing | |
CN105095863A (en) | Similarity-weight-semi-supervised-dictionary-learning-based human behavior identification method | |
CN116584951A (en) | A ECG signal detection and localization method based on weakly supervised learning | |
CN107609588A (en) | A kind of disturbances in patients with Parkinson disease UPDRS score Forecasting Methodologies based on voice signal | |
CN112101401B (en) | Multi-modal emotion recognition method based on sparse supervision least square multi-class kernel canonical correlation analysis | |
CN115530788A (en) | Arrhythmia classification method based on self-attention mechanism | |
US20230225663A1 (en) | Method for predicting multi-type electrocardiogram heart rhythms based on graph convolution | |
CN114564990A (en) | Electroencephalogram signal classification method based on multi-channel feedback capsule network | |
CN116186593B (en) | An ECG signal detection method based on separable convolution and attention mechanism | |
CN113768515A (en) | An ECG Signal Classification Method Based on Deep Convolutional Neural Networks | |
CN114764575B (en) | Multimodal Data Classification Method Based on Deep Learning and Temporal Attention Mechanism | |
CN116369877A (en) | A non-invasive blood pressure estimation method based on photoplethysmography | |
CN112465054B (en) | A Multivariate Time Series Data Classification Method Based on FCN | |
CN117398084A (en) | Physiological signal real-time quality assessment method based on light-weight mixed model | |
CN118861843B (en) | Mental health state auxiliary evaluation system based on big data |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant | ||
CP03 | Change of name, title or address | ||
CP03 | Change of name, title or address |
Address after: No.19 Keyuan Road, Lixia District, Jinan City, Shandong Province Patentee after: Shandong Institute of artificial intelligence Country or region after: China Patentee after: Qilu University of Technology (Shandong Academy of Sciences) Address before: No.19 Keyuan Road, Lixia District, Jinan City, Shandong Province Patentee before: Shandong Institute of artificial intelligence Country or region before: China Patentee before: Qilu University of Technology |