CN115374812A - Signal feature extraction method for multi-feature fusion extraction of electroencephalogram signals - Google Patents
- Publication number
- CN115374812A (publication number); application number CN202210924013.2A
- Authority
- CN
- China
- Prior art keywords
- matrix
- signal
- feature
- electroencephalogram
- extraction
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/16—Devices for psychotechnics; Testing reaction times ; Devices for evaluating the psychological state
- A61B5/165—Evaluating the state of mind, e.g. depression, anxiety
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/24—Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
- A61B5/316—Modalities, i.e. specific diagnostic methods
- A61B5/369—Electroencephalography [EEG]
- A61B5/372—Analysis of electroencephalograms
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/72—Signal processing specially adapted for physiological signals or for diagnostic purposes
- A61B5/7235—Details of waveform analysis
- A61B5/725—Details of waveform analysis using specific filters therefor, e.g. Kalman or adaptive filters
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/72—Signal processing specially adapted for physiological signals or for diagnostic purposes
- A61B5/7235—Details of waveform analysis
- A61B5/7264—Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
- A61B5/7267—Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device
Abstract
The invention discloses a signal feature extraction method for multi-element feature fusion extraction of an electroencephalogram signal, which comprises the following steps: acquiring electroencephalogram signal data and performing signal preprocessing on the electroencephalogram signal data to obtain a preprocessed signal; performing multi-dimensional feature extraction and matrix construction on the preprocessed signals to obtain an original feature matrix; performing fusion dimensionality reduction processing on the original feature matrix to obtain a final feature matrix; and inputting the final characteristic matrix into a pre-trained classification model for classification, and outputting a classification result. The invention overcomes the problem of incomplete electroencephalogram information in a single-domain feature extraction algorithm in the traditional algorithm, and effectively improves the classification performance. The invention can be widely applied to the field of signal processing.
Description
Technical Field
The invention relates to the field of signal processing, in particular to a signal feature extraction method for multi-element feature fusion extraction of electroencephalogram signals.
Background
Electroencephalogram (EEG) signals are nonlinear, non-stationary time series that can be detected by electrodes placed on the scalp; they are the external manifestation of neuronal membrane potentials. Research based on electroencephalogram signals can be used for emotion recognition. At present, emotion recognition is mostly performed by extracting features from a single domain, but single-domain feature extraction captures only part of the information in the electroencephalogram signal, and the resulting classification performance is not ideal.
Disclosure of Invention
The invention aims to provide a signal feature extraction method for multi-element feature fusion extraction of electroencephalogram signals, which solves the problem that electroencephalogram signal information in a single-domain feature extraction algorithm in the traditional algorithm is incomplete.
The first technical scheme adopted by the invention is as follows: a signal feature extraction method for multi-element feature fusion extraction of electroencephalogram signals comprises the following steps:
acquiring electroencephalogram signal data and performing signal preprocessing on the electroencephalogram signal data to obtain a preprocessed signal;
performing multi-dimensional feature extraction and matrix construction on the preprocessed signals to obtain an original feature matrix;
performing fusion dimensionality reduction processing on the original feature matrix to obtain a final feature matrix;
and inputting the final characteristic matrix into a pre-trained classification model for classification, and outputting a classification result.
Further, the step of acquiring the electroencephalogram signal data and performing signal preprocessing on the electroencephalogram signal data to obtain a preprocessed signal specifically includes:
acquiring electroencephalogram signal data;
carrying out noise separation processing on the electroencephalogram signal data based on an independent component analysis method to obtain a separated signal;
performing frequency screening processing on the separated signals based on a Butterworth band-pass filter to obtain screened signals;
and normalizing the screened signals to obtain preprocessed signals.
Further, the step of performing multi-dimensional feature extraction and matrix construction on the preprocessed signals to obtain an original feature matrix specifically includes:
performing feature extraction on the preprocessed signal on a time domain to obtain a time domain feature;
performing feature extraction on the preprocessed signal on a frequency domain based on an AR model power spectrum estimation method to obtain frequency domain features;
performing feature extraction on the preprocessed signal on a time-frequency domain based on a Hilbert-Huang transform method to obtain time-frequency domain features;
performing feature extraction on the preprocessed signals on a nonlinear domain based on nonlinear dynamics analysis to obtain nonlinear domain features;
performing feature extraction on the preprocessed signals on a space domain based on public space mode analysis to obtain space domain features;
and constructing a matrix according to the time domain characteristics, the frequency domain characteristics, the time-frequency domain characteristics, the nonlinear domain characteristics and the space domain characteristics to obtain an original characteristic matrix.
Further, the time domain features include standard deviation, root mean square, and mean of first order difference absolute values, and the non-linear features include approximate entropy, fuzzy entropy, and sample entropy.
Further, the step of performing fusion dimensionality reduction processing on the original feature matrix to obtain a final feature matrix specifically includes:
carrying out mean value removing processing on the original characteristic matrix to obtain a centralized matrix;
calculating a covariance matrix of the centralized matrix and decomposing an eigenvalue to obtain an eigenvalue and a corresponding eigenvector;
and sequencing the characteristic values and sequentially taking the characteristic vectors corresponding to the characteristic values of the preset number to construct a matrix to obtain a final characteristic matrix.
Further, the step of inputting the final feature matrix into a pre-trained classification model for classification and outputting a classification result specifically includes:
performing parameter optimization on the SVM-KNN classifier based on a particle swarm algorithm and a pre-constructed training set to obtain a pre-trained classification model;
calculating the distance between the sample to be tested and the optimal hyperplane based on a pre-trained classification model by taking the final characteristic matrix as input to obtain a sample distance;
comparing the sample distance with a preset threshold;
if the absolute value of the sample distance is smaller than the preset threshold, classifying by a KNN algorithm and outputting the classification result;
and if the absolute value of the sample distance is larger than or equal to the preset threshold, classifying by an SVM algorithm and outputting the classification result.
Further, the calculation formula of the sample distance is as follows:
g(x) = Σ_{i=1}^{n} a_i·y_i·K(x, x_i) + b
in the above formula, a_i is a Lagrangian multiplier, y_i·a_i is the coefficient of the support vector in the decision function, K(x, x_i) is the kernel function, and b is the constant term of the decision function in the SVM.
The second technical scheme adopted by the invention is as follows: a signal feature extraction system for multi-element feature fusion extraction of electroencephalogram signals comprises:
the preprocessing module is used for acquiring electroencephalogram signal data and preprocessing the electroencephalogram signal data to obtain preprocessed signals;
the characteristic extraction module is used for carrying out multi-dimensional characteristic extraction and matrix construction on the preprocessed signals to obtain an original characteristic matrix;
the dimension reduction module is used for carrying out fusion dimension reduction processing on the original feature matrix to obtain a final feature matrix;
and the classification module is used for inputting the final characteristic matrix into a pre-trained classifier for classification and outputting a classification result.
The third technical scheme adopted by the invention is as follows: a signal feature extraction device for multi-element feature fusion extraction of electroencephalogram signals comprises:
at least one processor;
at least one memory for storing at least one program;
when the at least one program is executed by the at least one processor, the at least one processor implements a signal feature extraction method for multi-feature fusion extraction of brain electrical signals as described above.
The method of the invention has the beneficial effects that: the invention can keep the information contained in the electroencephalogram signal as completely as possible through the multi-dimensional characteristics, effectively improve the classification performance of the classifier, make the calculation of the classifier simpler and more convenient through dimension reduction processing, and further improve the performance of the classifier.
Drawings
FIG. 1 is a flow chart of the steps of a signal feature extraction method for multi-element feature fusion extraction of electroencephalogram signals according to the present invention;
FIG. 2 is a block diagram of a signal feature extraction system for multi-feature fusion extraction of electroencephalogram signals.
Detailed Description
The invention is described in further detail below with reference to the figures and the specific embodiments. The step numbers in the following embodiments are provided only for convenience of illustration, the order between the steps is not limited at all, and the execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art.
As shown in fig. 1, the present invention provides a signal feature extraction method for multi-feature fusion extraction of electroencephalogram signals, which comprises the following steps:
s1, acquiring electroencephalogram signal data and preprocessing the electroencephalogram signal data to obtain preprocessed signals;
Specifically, because electroencephalogram signals contain a large amount of noise and the information related to emotion is mostly concentrated in the 0-50 Hz frequency band, the electroencephalogram signal data needs to be preprocessed.
S1.1, acquiring electroencephalogram signal data;
s1.2, carrying out noise separation processing on the electroencephalogram signal data based on an independent component analysis method to obtain separated signals;
Specifically, the electro-oculogram, electromyogram and other bioelectric activity of the human body need to be separated from the electroencephalogram signal; at present, Independent Component Analysis (ICA) can be used to separate the electroencephalogram signal from these noise sources for artifact removal.
S1.3, carrying out frequency screening processing on the separated signals based on a Butterworth band-pass filter to obtain screened signals;
Specifically, the content of the electroencephalogram signal within 0-50 Hz is retained by a Butterworth band-pass filter.
S1.4, normalizing the screened signals to obtain preprocessed signals.
Specifically, the final normalization is performed with the built-in mapminmax function in MATLAB.
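Steps S1.3 and S1.4 can be sketched as follows (a minimal Python illustration, not part of the patent; the 128 Hz sampling rate, 4th filter order and 0.5 Hz lower cut-off are assumed values, since a band-pass filter needs a nonzero lower edge and the method above does not specify them):

```python
import numpy as np
from scipy.signal import butter, filtfilt

def preprocess(eeg, fs=128.0, band=(0.5, 50.0), order=4):
    """Band-pass filter to the emotion-relevant band, then min-max
    normalize to [-1, 1] (mimicking MATLAB's mapminmax defaults)."""
    nyq = fs / 2.0
    b, a = butter(order, [band[0] / nyq, band[1] / nyq], btype="band")
    filtered = filtfilt(b, a, eeg)  # zero-phase filtering, no phase distortion
    lo, hi = filtered.min(), filtered.max()
    return 2.0 * (filtered - lo) / (hi - lo) - 1.0

rng = np.random.default_rng(0)
sig = rng.standard_normal(1024)     # stand-in for one EEG channel
out = preprocess(sig)
```

The zero-phase `filtfilt` is a design choice for offline analysis; a causal filter would be needed online.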
S2, performing multi-dimensional feature extraction and matrix construction on the preprocessed signals to obtain an original feature matrix;
s2.1, performing feature extraction on the preprocessed signal on a time domain to obtain time domain features;
specifically, the time domain features are features of signal statistics of the electroencephalogram signals in a time domain, and the time domain features include standard deviation, root mean square, and mean of first-order difference absolute values.
The standard deviation is calculated as follows:
σ = sqrt( (1/N) Σ_{i=1}^{N} (x_i − μ_x)² )
where x_i is the value of the i-th sample point in the time series, μ_x is the mean of the current time series, and N is the length of the time series.
The root mean square is calculated as follows:
RMS = sqrt( (1/N) Σ_{i=1}^{N} x_i² )
The mean of the first-order difference absolute values is calculated as follows:
D = (1/(N−1)) Σ_{i=1}^{N−1} |x_{i+1} − x_i|
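The three time-domain features can be sketched as follows (a minimal NumPy transcription of the formulas above; the sample values are hypothetical):

```python
import numpy as np

def time_domain_features(x):
    """Standard deviation, root mean square, and mean absolute
    first-order difference of a 1-D signal."""
    x = np.asarray(x, dtype=float)
    std = x.std()                        # population std, 1/N normalization
    rms = np.sqrt(np.mean(x ** 2))
    madiff = np.mean(np.abs(np.diff(x)))  # mean |x_{i+1} - x_i|
    return std, rms, madiff

std, rms, madiff = time_domain_features([1.0, 3.0, 2.0, 4.0])
```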
s2.2, performing feature extraction on the preprocessed signals on a frequency domain based on an AR model power spectrum estimation method to obtain frequency domain features;
Specifically, when extracting electroencephalogram frequency domain features, the signal is mapped onto the corresponding frequency bands, and the feature quantities on each band are then obtained. As the electroencephalogram signal is a random signal, frequency domain analysis and feature extraction are carried out with an AR-model power spectrum estimation method. Estimating the parameters of the AR (autoregressive) model requires solving the Yule-Walker (Y-W) equations. For a general random signal, the AR model can be expressed in the form:
y(n) = −Σ_{i=1}^{p} a_i·y(n−i) + r(n)
where y(n) is the n-th sample value of the signal, p is the model order, a_i are the model coefficients, and r(n) is a zero-mean white-noise residual.
The system function of the AR model is:
H(z) = 1 / (1 + Σ_{i=1}^{p} a_i·z^(−i))
and the model output power spectrum is:
P(ω) = σ² / |1 + Σ_{i=1}^{p} a_i·e^(−jωi)|²
where σ² is the variance of the white-noise residual.
The power spectrum of the electroencephalogram signal is obtained by the AR-model method, and the spectral energy or power corresponding to each rhythm (θ, α, β and γ) is calculated as the frequency domain features.
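The AR power-spectrum estimate can be sketched as follows (a minimal illustration that solves the Yule-Walker equations directly; the model order, grid size, and test signal are assumptions, not values from the method above):

```python
import numpy as np

def yule_walker_psd(x, order=8, nfft=256):
    """AR power-spectrum estimate: solve the Yule-Walker equations for
    the coefficients a_i, then evaluate sigma^2 / |1 + sum a_i e^{-jwi}|^2."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = len(x)
    # biased autocorrelation estimates r(0)..r(order)
    r = np.array([np.dot(x[:n - k], x[k:]) / n for k in range(order + 1)])
    R = np.array([[r[abs(i - j)] for j in range(order)] for i in range(order)])
    a = np.linalg.solve(R, -r[1:order + 1])    # y(n) + sum a_i y(n-i) = e(n)
    sigma2 = r[0] + np.dot(a, r[1:order + 1])  # prediction-error variance
    w = np.linspace(0, np.pi, nfft)
    denom = np.abs(1 + sum(a[i] * np.exp(-1j * w * (i + 1))
                           for i in range(order))) ** 2
    return w, sigma2 / denom

rng = np.random.default_rng(1)
t = np.arange(1024)
x = np.sin(2 * np.pi * 0.1 * t) + 0.1 * rng.standard_normal(1024)
w, psd = yule_walker_psd(x)
peak = float(w[np.argmax(psd)] / (2 * np.pi))  # normalized peak frequency
```

Summing the estimated spectrum over the θ, α, β and γ band indices would then give the band-power features described above.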
S2.3, performing feature extraction on the preprocessed signals on a time-frequency domain based on a Hilbert-Huang transform method to obtain time-frequency domain features;
Specifically, the Hilbert-Huang transform method mainly comprises Empirical Mode Decomposition (EMD) and Hilbert Spectral Analysis (HSA).
Empirical mode decomposition decomposes the signal into intrinsic mode functions (IMFs); the decomposition is adaptive, and the IMF components are orthogonal and complete and carry amplitude and frequency modulation characteristics. An IMF satisfies the following two conditions:
1) The number of extrema of the signal is equal to the number of zero crossings, or differs from it by at most 1;
2) At any point, the mean of the upper envelope defined by the local maxima and the lower envelope defined by the local minima is 0.
The EMD process is as follows:
1) All the maximum points and minimum points are found for the input signal.
2) Fit the maximum points and the minimum points with cubic splines to obtain the upper and lower envelope curves, calculate the mean function of the two envelopes, and then obtain the difference h between the signal to be analyzed and the mean.
3) Check whether h satisfies the IMF conditions; if so, take h as the 1st IMF; otherwise repeat the first two steps on h until, at the k-th iteration, the IMF conditions are satisfied, which gives the 1st IMF, and compute the difference r between the original signal and this IMF.
4) Take the difference r as the new signal to be decomposed and repeat the procedure until the remaining r is monotone or has only one extremum, which gives the following expression:
x(t) = Σ_{i=1}^{N} C_i(t) + R_N(t)
where x(t) is the original signal, C_i(t) is the IMF component obtained in the i-th sifting, N is the number of IMFs, and R_N(t) is the final residual component.
After the EMD process, Hilbert Spectral Analysis (HSA) is performed: a Hilbert transform is applied to each IMF component:
H[C_i(t)] = (1/π) P∫ C_i(τ)/(t−τ) dτ
where P denotes the Cauchy principal value. The analytic signal is:
z_i(t) = C_i(t) + j·H[C_i(t)] = A_i(t)·e^(jφ_i(t))
where A_i(t) is the instantaneous amplitude and φ_i(t) is the instantaneous phase; the instantaneous frequency ω_i(t) can then be obtained as:
ω_i(t) = dφ_i(t)/dt
The distribution of the signal amplitude in the time-frequency domain can then be described, i.e. the Hilbert spectrum:
H(ω, t) = Re Σ_i A_i(t)·e^(j∫ω_i(t)dt)
where Re denotes the real part.
The Instantaneous Energy Spectrum (IES) and the Marginal Energy Spectrum (MES) can further be obtained from the above:
IES(t) = ∫_{ω1}^{ω2} H²(ω, t) dω,  MES(ω) = ∫_{t1}^{t2} H²(ω, t) dt
where [ω_1, ω_2] is the frequency range of the signal and [t_1, t_2] is the time range of the signal.
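The Hilbert spectral analysis step can be sketched for a single IMF-like component (a minimal illustration using SciPy's analytic-signal routine; the sampling rate and the 10 Hz test component are assumed, and a full HHT would first run EMD to obtain the components):

```python
import numpy as np
from scipy.signal import hilbert

def instantaneous_attributes(c, fs):
    """Instantaneous amplitude and frequency of one component via the
    analytic signal z(t) = c(t) + j*H[c(t)] = A(t) e^{j phi(t)}."""
    z = hilbert(c)                              # analytic signal
    amp = np.abs(z)                             # A(t)
    phase = np.unwrap(np.angle(z))              # phi(t)
    freq = np.diff(phase) / (2 * np.pi) * fs    # omega(t) = d phi / dt, in Hz
    return amp, freq

fs = 256.0
t = np.arange(0, 2, 1 / fs)
c = np.sin(2 * np.pi * 10 * t)                  # a 10 Hz "IMF"
amp, freq = instantaneous_attributes(c, fs)
mid_amp = float(np.median(amp))
mid_freq = float(np.median(freq))
```

Medians are used because the Hilbert transform ripples near the signal edges.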
S2.4, performing feature extraction on the preprocessed signals on a nonlinear domain based on nonlinear dynamics analysis to obtain nonlinear domain features;
in particular, the non-linear characteristics include approximate entropy, fuzzy entropy, sample entropy.
Approximate entropy is adopted as one of the nonlinear dynamics features: it quantifies the regularity and unpredictability of the EEG signal, can represent the complexity of the EEG signal, and reflects the possibility of new information appearing in the signal.
Approximate entropy processing of the pre-processed EEG signal x (t) is as follows:
(1) Sample the N-point original time series at equal time intervals and reconstruct m-dimensional vectors X(1), X(2), …, X(N−m+1), where X(i) = [u(i), u(i+1), …, u(i+m−1)].
(2) For each i in 1, 2, …, N−m+1, count the number of vectors X(j) satisfying
d[X(i), X(j)] ≤ r, with d[X(i), X(j)] = max_k |u(i+k) − u(j+k)|
where d is the (Chebyshev) distance between two vectors, j ranges over 1 ≤ j ≤ N−m+1, and j may equal i; dividing this count by N−m+1 gives C_i^m(r).
(3) Define:
Φ^m(r) = (1/(N−m+1)) Σ_{i=1}^{N−m+1} ln C_i^m(r)
(4) The approximate entropy can then be defined as:
ApEn = Φ^m(r) − Φ^(m+1)(r)
In the formula, the parameter m is usually set to m = 2 or m = 3; setting m = 3 allows the dynamic evolution process of the system to be reconstructed in more detail. The value of r depends mainly on the application; r = 0.2 × std is usually chosen, where std is the standard deviation of the time series.
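The approximate-entropy computation can be sketched as follows (a direct NumPy transcription of steps (1)-(4), using the usual m = 2 and r = 0.2·std noted above; the two test series are hypothetical and merely check that a regular signal scores lower than noise):

```python
import numpy as np

def approx_entropy(x, m=2, r_factor=0.2):
    """ApEn = Phi^m(r) - Phi^{m+1}(r), Chebyshev distance, self-matches
    allowed (j may equal i), r = r_factor * std(x)."""
    x = np.asarray(x, dtype=float)
    r = r_factor * x.std()

    def phi(m):
        n = len(x) - m + 1
        vecs = np.array([x[i:i + m] for i in range(n)])
        # pairwise Chebyshev distances between all m-dimensional vectors
        dists = np.max(np.abs(vecs[:, None, :] - vecs[None, :, :]), axis=2)
        c = np.mean(dists <= r, axis=1)        # C_i^m(r), includes i == j
        return np.mean(np.log(c))

    return phi(m) - phi(m + 1)

rng = np.random.default_rng(2)
apen_reg = approx_entropy(np.sin(np.linspace(0, 8 * np.pi, 300)))
apen_noise = approx_entropy(rng.standard_normal(300))
```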
Fuzzy entropy is adopted as another nonlinear dynamics feature: it inherits the advantages of sample entropy while reducing the dependence on the time-series length, has better continuity and robustness, and can be used effectively for the analysis of electroencephalogram time series.
The processing steps of fuzzy entropy on the preprocessed electroencephalogram signal x(t) are as follows:
(1) Given an N-point time series, as for approximate entropy, define the phase-space dimension m (m < N−2) and the similarity tolerance r, and reconstruct the phase space:
X(i) = [u(i), u(i+1), …, u(i+m−1)] − u0(i)
where u0(i) is the mean of the m samples.
(2) Introduce a fuzzy membership function A(x) and use it to compute the average similarity Φ^m(r).
(3) Repeat the computation with dimension m+1 to obtain Φ^(m+1)(r).
(4) The fuzzy entropy can then be defined as:
FuzzyEn = ln Φ^m(r) − ln Φ^(m+1)(r)
Sample entropy is adopted as another nonlinear dynamics feature. Sample entropy reflects the complexity of a time series and measures the probability of a new pattern being generated in the series: the larger the sample entropy value, the larger the probability of a new pattern and the more complex the sequence. Under different emotional states, the degree of activation of the corresponding cerebral cortex differs, and the complexity of the electroencephalogram activity differs as well. The sample entropy features can therefore reflect changes in the emotional electroencephalogram.
The processing steps of sample entropy on the preprocessed electroencephalogram signal x(t) are as follows:
(1), (2) The first two steps are the same as for approximate entropy, except that the similarity count B_i^m(r) excludes self-matches (j ≠ i) and its denominator N−m+1 is replaced by N−m; averaging over i gives B^m(r).
(3) Let k = m+1 and repeat the first two steps to obtain A^m(r), the average count for dimension m+1.
(4) The sample entropy can then be defined as:
SampEn = −ln( A^m(r) / B^m(r) )
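The sample-entropy computation can be sketched similarly (a minimal illustration; self-matches are excluded as required above, and the test series are hypothetical):

```python
import numpy as np

def sample_entropy(x, m=2, r_factor=0.2):
    """SampEn = -ln(A/B): A and B count template matches of length m+1
    and m, excluding self-matches (j != i)."""
    x = np.asarray(x, dtype=float)
    r = r_factor * x.std()

    def count(m):
        n = len(x) - m
        vecs = np.array([x[i:i + m] for i in range(n)])
        dists = np.max(np.abs(vecs[:, None, :] - vecs[None, :, :]), axis=2)
        np.fill_diagonal(dists, np.inf)        # exclude self-matches
        return np.sum(dists <= r)

    return -np.log(count(m + 1) / count(m))

rng = np.random.default_rng(3)
se_reg = sample_entropy(np.sin(np.linspace(0, 8 * np.pi, 300)))
se_noise = sample_entropy(rng.standard_normal(300))
```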
s2.5, performing feature extraction on the preprocessed signals on a space domain based on public space mode analysis to obtain space domain features;
Specifically, a one-versus-one (OVO) method is adopted to extend the CSP algorithm to multiple classes, and this OVO-CSP method is used for spatial-domain feature extraction of the electroencephalogram signal. The method divides the multi-class problem into several two-class problems, so the concrete implementation of the traditional two-class CSP algorithm is explained below.
(1) Compute the normalized spatial covariance matrix of each of the two classes of data:
C_i = (E_i·E_i^T) / trace(E_i·E_i^T), i = 1, 2
where E_i is the data matrix formed from the signals of class i, and trace(·) denotes the sum of the elements on the diagonal of the matrix.
(2) Compute the average covariance matrix C̄_i of each class over its trials and form the composite covariance matrix:
C = C̄_1 + C̄_2
(3) Perform eigenvalue decomposition and whitening on the composite spatial covariance matrix, C = U·λ·U^T with whitening matrix P = λ^(−1/2)·U^T, which yields S_1 = P·C̄_1·P^T and S_2 = P·C̄_2·P^T with the same eigenvectors; then perform eigenvalue decomposition on S_1 and S_2 respectively:
S_1 = B·λ_1·B^T, S_2 = B·λ_2·B^T
where B is the common eigenvector matrix of S_1 and S_2, and the corresponding eigenvalues sum to one: λ_1 + λ_2 = I.
(4) After the spatial filter W is constructed from the eigenvectors with the largest and smallest eigenvalues, the electroencephalogram signal matrix E is filtered to obtain Z = W·E, and the feature values f_p are computed from the rows Z_p of Z:
f_p = log( var(Z_p) / Σ_{i=1}^{2m} var(Z_i) )
where p = 1, 2, …, 2m (with 2m < n, n being the number of channels).
And S2.6, constructing a matrix according to the time domain characteristics, the frequency domain characteristics, the time-frequency domain characteristics, the nonlinear domain characteristics and the space domain characteristics to obtain an original characteristic matrix.
Specifically, all f_p are combined to form the final feature vector F = {f_1, f_2, …, f_2m}, giving one group of electroencephalogram spatial features.
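The two-class CSP procedure of steps (1)-(4) can be sketched as follows (a minimal NumPy illustration on hypothetical two-channel trials in which each class concentrates its variance on a different channel; the whitening and filter selection follow the standard CSP formulation):

```python
import numpy as np

def csp_features(trials_a, trials_b, m=2):
    """Build 2m spatial filters from two classes of (channels x samples)
    trials and return a function mapping a trial to log-variance features."""
    def avg_cov(trials):
        # trace-normalized spatial covariance, averaged over trials
        covs = [E @ E.T / np.trace(E @ E.T) for E in trials]
        return np.mean(covs, axis=0)

    C1, C2 = avg_cov(trials_a), avg_cov(trials_b)
    lam, U = np.linalg.eigh(C1 + C2)
    P = np.diag(lam ** -0.5) @ U.T            # whitening matrix
    lam1, B = np.linalg.eigh(P @ C1 @ P.T)    # S1 = B lam1 B^T (S2 shares B)
    W = B.T @ P                               # spatial filters as rows
    W = np.vstack([W[:m], W[-m:]])            # keep the 2m extreme filters

    def features(E):
        var = (W @ E).var(axis=1)
        return np.log(var / var.sum())        # f_p = log(var_p / sum var_i)

    return features

rng = np.random.default_rng(4)
make = lambda scale: [np.diag(scale) @ rng.standard_normal((2, 200))
                      for _ in range(20)]
trials_a, trials_b = make([3.0, 1.0]), make([1.0, 3.0])
feat = csp_features(trials_a, trials_b, m=1)
fa, fb = feat(trials_a[0]), feat(trials_b[0])
```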
S3, performing fusion dimensionality reduction on the original feature matrix to obtain a final feature matrix;
Specifically, the dimensionality of the combined feature vector is high, which is not conducive to classifier identification and makes the algorithm insufficiently accurate and fast, so the PCA algorithm is selected to realize dimensionality reduction. According to the principle of variance maximization, PCA characterizes the original vectors with a group of new vectors that are linearly independent and mutually orthogonal; this group of new vectors, namely the principal components, are linear combinations of the original variables. The implementation steps are as follows:
S3.1, performing mean-removal processing on the original feature matrix {x_1, x_2, …, x_n} to obtain a centered matrix;
Specifically, the centered matrix is represented as:
X = [x_1 − μ, x_2 − μ, …, x_n − μ]
where μ is the mean vector of the samples.
S3.2, calculating the covariance matrix C of the centered matrix X and performing eigenvalue decomposition to obtain the eigenvalues λ and the corresponding eigenvectors U;
specifically, λ is a diagonal matrix, representing the magnitude of the principal component variance.
And S3.3, sequencing the characteristic values and sequentially taking the characteristic vectors corresponding to the characteristic values of the preset number to construct a matrix to obtain a final characteristic matrix.
Specifically, λ is ordered λ_1 ≥ λ_2 ≥ … ≥ λ_d, and the eigenvectors corresponding to the first d' eigenvalues form the projection matrix W = (u_1, u_2, …, u_d'), where d' is the dimensionality after reduction, determined by the cumulative contribution rate of the components:
C_rate = Σ_{i=1}^{d'} λ_i / Σ_{i=1}^{d} λ_i
C_rate is generally required to be greater than 85%.
The new feature matrix of the low-dimensional space is then computed as F = X·W. The new features processed by PCA represent the useful information to the maximum extent; discarding the eigenvectors corresponding to the d−d' smallest eigenvalues increases the sample density and achieves a certain denoising effect, making the new features better suited as the input of the following classifier.
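The PCA fusion dimensionality reduction of steps S3.1-S3.3 can be sketched as follows (a minimal NumPy illustration; the 85% cumulative-contribution threshold follows the text, while the 5-dimensional feature matrix, whose variance is deliberately concentrated in two dimensions, is hypothetical):

```python
import numpy as np

def pca_reduce(F, rate=0.85):
    """Center, eigendecompose the covariance, sort eigenvalues descending,
    and keep the first d' components whose cumulative contribution >= rate."""
    X = F - F.mean(axis=0)                     # mean removal (S3.1)
    lam, U = np.linalg.eigh(np.cov(X, rowvar=False))
    order = np.argsort(lam)[::-1]              # eigh returns ascending order
    lam, U = lam[order], U[:, order]
    ratio = np.cumsum(lam) / lam.sum()         # cumulative contribution rate
    d = int(np.searchsorted(ratio, rate)) + 1
    return X @ U[:, :d], float(ratio[d - 1])

rng = np.random.default_rng(5)
base = rng.standard_normal((100, 2))
F = np.hstack([base * [10.0, 8.0],             # two high-variance features
               0.1 * rng.standard_normal((100, 3))])  # three near-noise ones
reduced, reached = pca_reduce(F)
```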
And S4, inputting the final characteristic matrix into a pre-trained classification model for classification, and outputting a classification result.
Specifically, an SVM-KNN classifier optimized by a particle swarm algorithm is selected in this embodiment.
S4.1, performing parameter optimization on the SVM-KNN classifier based on the particle swarm algorithm and a pre-constructed training set to obtain a pre-trained classification model;
Specifically, in the PSO algorithm, each particle in the swarm continuously corrects its position and velocity in the n-dimensional search space according to its own flight experience and the experience of the whole swarm, where n is the number of parameters to be optimized. Through constant iterative updating, the swarm updates positions and velocities by tracking both the individual extremum (Pbest) and the global extremum (Gbest). The velocity and position update formulas are as follows:
V_id(t+1) = ω·V_id(t) + c_1·r_1·(Pbest_id(t) − X_id(t)) + c_2·r_2·(Gbest_d(t) − X_id(t))
X_id(t+1) = X_id(t) + V_id(t+1)
where 1 ≤ i ≤ n, 1 ≤ d ≤ m, and m is the number of particles in the swarm; ω is the inertia weight; c_1 and c_2 are acceleration constants, typically 2; r_1 and r_2 are random numbers in (0, 1); and t is the iteration index.
Before running the PSO algorithm to optimize the parameters, the ranges of several quantities need to be set: the inertia weight ω is set to 1, c_1 and c_2 are set to 2, the upper limit of the maximum flight speed V_max is set to 200, the total number of particles is 50, and the maximum number of iterations is set to 10000. The ranges of the parameters to be optimized are also set: the penalty factor C of the SVM algorithm and the parameter γ of the RBF kernel function are searched in the range (−100, 100), the parameter K of KNN is searched in (0, 50), and the threshold ω used to judge the distance is searched in (0.3, 0.8).
After the parameters are set, the parameter optimizing steps are as follows:
(1) The position and velocity of all particles are initialized.
(2) And calculating the classification precision of the SVM-KNN classification model under the current parameters, and calculating the fitness of each particle by taking the classification precision as a fitness function.
(3) And (4) updating the Pbest by comparing the current value with the individual extreme value of the searched fitness, comparing all the Pbest, selecting the best overall extreme value Gbest from the Pbest, and updating the Gbest.
(4) At each update, the SVM penalty factor C is reset, which gives the particles a larger search space and prevents them from being trapped in a local region around the current optimum.
(5) The velocity and position of the particles are updated according to the velocity and position update formula mentioned above.
(6) When the maximum iteration times are reached, outputting an optimal parameter; otherwise, returning to the step (2).
And after the parameters of the SVM-KNN classification model are adjusted and optimized through a PSO algorithm, a group of optimal parameters are obtained, and the classification model is set according to the parameters.
And setting a penalty factor C of the SVM algorithm and a parameter gamma of a kernel function, a K value in the KNN algorithm and a threshold omega in the SVM-KNN algorithm by depending on the optimal parameter searched by the PSO algorithm.
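The PSO parameter search of steps (1)-(6) can be sketched as follows (a minimal illustration of the update equations above; the quadratic fitness is a stand-in for the SVM-KNN classification error, and the inertia weight and acceleration constants use common convergent values rather than the ω = 1, c = 2 settings above, which is an assumption made for this sketch):

```python
import numpy as np

def pso_minimize(f, bounds, n_particles=30, iters=300,
                 omega=0.729, c1=1.49445, c2=1.49445, seed=6):
    """Global-best PSO: track Pbest and Gbest, apply the velocity and
    position updates, and clip positions to the search bounds."""
    rng = np.random.default_rng(seed)
    lo, hi = np.array(bounds, dtype=float).T
    X = rng.uniform(lo, hi, (n_particles, len(lo)))
    V = np.zeros_like(X)
    pbest, pbest_val = X.copy(), np.array([f(x) for x in X])
    g = pbest[np.argmin(pbest_val)].copy()
    for _ in range(iters):
        r1, r2 = rng.random(X.shape), rng.random(X.shape)
        V = omega * V + c1 * r1 * (pbest - X) + c2 * r2 * (g - X)
        X = np.clip(X + V, lo, hi)
        vals = np.array([f(x) for x in X])
        better = vals < pbest_val               # update individual extrema
        pbest[better], pbest_val[better] = X[better], vals[better]
        g = pbest[np.argmin(pbest_val)].copy()  # update global extremum
    return g, float(pbest_val.min())

# hypothetical fitness standing in for classifier error, optimum at (3, -2)
best, val = pso_minimize(lambda p: (p[0] - 3) ** 2 + (p[1] + 2) ** 2,
                         bounds=[(-100, 100), (-100, 100)])
```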
S4.2, calculating the distance between the sample to be measured and the optimal hyperplane based on the pre-trained classification model to obtain a sample distance;
Specifically, the distance is calculated in the classification model by the following formula:
g(x) = Σ_{i=1}^{n} a_i·y_i·K(x, x_i) + b
in the above formula, a_i is a Lagrangian multiplier, y_i·a_i is the coefficient of the support vector in the decision function, K(x, x_i) is the kernel function, and b is the constant term of the decision function in the SVM.
Whether a sample counts as near or far is determined by the set threshold: the smaller the threshold, the greater the proportion of the SVM algorithm in the fusion algorithm and the more the overall algorithm is biased toward SVM; when the threshold is 0, the fusion algorithm is equivalent to the SVM algorithm. Conversely, the larger the threshold, the greater the proportion of the improved KNN algorithm in the fusion algorithm; as the threshold tends to infinity, the algorithm becomes equivalent to the KNN classification algorithm.
S4.3, comparing the sample distance with a preset threshold value;
S4.4, if the absolute value of the sample distance is smaller than the preset threshold, classifying by the KNN algorithm and outputting the classification result;
specifically, g (x) is compared with a threshold value ω, if | g (x) | < ω, the sample point can be regarded as a sample point which is close to the optimal hyperplane, classification is performed by adopting a KNN algorithm, distances between the sample point to be detected and all feature vectors are calculated, the sample points with the minimum K distances are selected, and the corresponding classes of the sample points are counted, wherein the class of the sample to be detected is the same as the class with high occurrence probability.
S4.5, if the absolute value of the sample distance is greater than or equal to the preset threshold, classify with the SVM algorithm and output the classification result.
Specifically, if |g(x)| ≥ ω, the sample point is regarded as far from the optimal hyperplane and is classified with the SVM algorithm, in which F(x) = sgn(g(x)) is calculated to obtain the classification category.
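The S4.2–S4.5 decision rule can be sketched as below, assuming an RBF kernel. The support vectors, multipliers, training points, and all numeric values are toy illustrations, not the patent's trained model; `svm_knn_predict` is a hypothetical name.

```python
import numpy as np

def rbf(x1, x2, gamma):
    """RBF kernel k(x1, x2) = exp(-gamma * ||x1 - x2||^2)."""
    return np.exp(-gamma * np.sum((x1 - x2) ** 2))

def svm_knn_predict(x, sv, sv_y, alpha, b, X_train, y_train,
                    K=3, omega=0.5, gamma=0.1):
    """Hybrid rule: KNN inside the |g(x)| < omega band, SVM outside it."""
    # g(x) = sum_i alpha_i * y_i * k(x_i, x) + b  (distance to hyperplane)
    g = sum(a * y * rbf(s, x, gamma) for a, y, s in zip(alpha, sv_y, sv)) + b
    if abs(g) < omega:
        # near the hyperplane: majority vote of the K nearest neighbours
        d = np.linalg.norm(X_train - x, axis=1)
        nn = y_train[np.argsort(d)[:K]]
        vals, counts = np.unique(nn, return_counts=True)
        return vals[counts.argmax()]
    return int(np.sign(g))            # far from hyperplane: F(x) = sgn(g(x))

# toy model: one support vector on each side of the boundary
sv = np.array([[1.0, 1.0], [-1.0, -1.0]])
sv_y = np.array([1, -1])
alpha = np.array([1.0, 1.0])
X_train = np.array([[1.0, 1.2], [0.9, 1.0], [-1.0, -1.1], [-0.9, -1.0]])
y_train = np.array([1, 1, -1, -1])

far = svm_knn_predict(np.array([2.0, 2.0]), sv, sv_y, alpha, b=0.0,
                      X_train=X_train, y_train=y_train)    # SVM branch
near = svm_knn_predict(np.array([-0.1, -0.1]), sv, sv_y, alpha, b=0.0,
                       X_train=X_train, y_train=y_train)   # KNN branch
```

Setting `omega=0` makes every sample take the SVM branch; a very large `omega` makes every sample take the KNN branch, matching the threshold behaviour described above.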
The invention provides a novel signal classification method, a PSO-based improved SVM-KNN classification algorithm. By fusing the SVM with an improved KNN algorithm, it solves the problem that SVM classification is inaccurate for sample points close to the hyperplane. Meanwhile, the PSO algorithm optimizes the traditional SVM-KNN algorithm by searching for the optimal parameters of the SVM-KNN classifier, which improves the classification performance of the system and greatly improves emotion classification efficiency.

The invention extracts time-domain, frequency-domain, time-frequency-domain, nonlinear, and spatial-domain features of the electroencephalogram signal, retaining the information contained in the signal as completely as possible. PCA calculates the contribution rate of each feature, all features falling below the 85% contribution-rate criterion are eliminated, and a new feature matrix is constructed, which retains the information in the signal to the maximum extent while improving the computational efficiency of the classifier.
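The PCA reduction described above, interpreted as keeping principal components until the cumulative contribution rate (explained-variance ratio) reaches 85% (one common reading of the 85% criterion; the patent's exact rule may differ), can be sketched in NumPy as:

```python
import numpy as np

def pca_reduce(X, contribution=0.85):
    """Project X onto the fewest principal components whose cumulative
    contribution rate reaches `contribution`."""
    Xc = X - X.mean(axis=0)                 # de-mean (centering)
    cov = np.cov(Xc, rowvar=False)          # covariance matrix of features
    vals, vecs = np.linalg.eigh(cov)        # eigenvalue decomposition
    order = np.argsort(vals)[::-1]          # sort eigenvalues descending
    vals, vecs = vals[order], vecs[:, order]
    ratio = np.cumsum(vals) / vals.sum()    # cumulative contribution rate
    k = int(np.searchsorted(ratio, contribution) + 1)
    return Xc @ vecs[:, :k], k              # final feature matrix, rank kept

# toy feature matrix: 200 samples, 5 features, variance concentrated
# in 2 latent directions, so very few components should be kept
rng = np.random.default_rng(0)
base = rng.normal(size=(200, 2))
X = np.hstack([base, base @ rng.normal(size=(2, 2)),
               0.01 * rng.normal(size=(200, 1))])
Z, k = pca_reduce(X, contribution=0.85)
```

The same de-mean / covariance / eigendecomposition / sort-and-select steps appear in claim 5.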
As shown in fig. 2, a signal feature extraction system for multi-feature fusion extraction of electroencephalogram signals includes:
the preprocessing module is used for acquiring electroencephalogram signal data and preprocessing the electroencephalogram signal data to obtain preprocessed signals;
the characteristic extraction module is used for carrying out multi-dimensional characteristic extraction and matrix construction on the preprocessed signals to obtain an original characteristic matrix;
the dimension reduction module is used for carrying out fusion dimension reduction processing on the original feature matrix to obtain a final feature matrix;
and the classification module is used for inputting the final characteristic matrix into a pre-trained classifier for classification and outputting a classification result.
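The four modules of the system can be composed as a simple sequential pipeline. The stage implementations below are placeholder stand-ins for illustration, not the patent's actual preprocessing, feature-extraction, dimension-reduction, or classification algorithms; `EEGPipeline` is a hypothetical name.

```python
from dataclasses import dataclass
from typing import Callable
import numpy as np

@dataclass
class EEGPipeline:
    """Sequential composition of the four modules described above."""
    preprocess: Callable[[np.ndarray], np.ndarray]  # preprocessing module
    extract: Callable[[np.ndarray], np.ndarray]     # feature extraction module
    reduce: Callable[[np.ndarray], np.ndarray]      # dimension reduction module
    classify: Callable[[np.ndarray], int]           # classification module

    def run(self, raw: np.ndarray) -> int:
        return self.classify(self.reduce(self.extract(self.preprocess(raw))))

# placeholder stages: normalize -> two time-domain features -> keep one -> threshold
pipe = EEGPipeline(
    preprocess=lambda x: (x - x.mean()) / x.std(),
    extract=lambda x: np.array([x.std(), np.sqrt(np.mean(x ** 2))]),
    reduce=lambda f: f[:1],
    classify=lambda f: int(f[0] > 0.5),
)
label = pipe.run(np.linspace(-1.0, 1.0, 100))
```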
The contents in the above method embodiments are all applicable to the present system embodiment, the functions specifically implemented by the present system embodiment are the same as those in the above method embodiment, and the beneficial effects achieved by the present system embodiment are also the same as those achieved by the above method embodiment.
A signal feature extraction device for multi-element feature fusion extraction of electroencephalogram signals comprises:
at least one processor;
at least one memory for storing at least one program;
when the at least one program is executed by the at least one processor, the at least one processor is caused to implement the signal feature extraction method for multi-component feature fusion extraction of an electroencephalogram signal as described above.
The contents in the above method embodiments are all applicable to the present apparatus embodiment, the functions specifically implemented by the present apparatus embodiment are the same as those in the above method embodiments, and the advantageous effects achieved by the present apparatus embodiment are also the same as those achieved by the above method embodiments.
A storage medium having stored therein processor-executable instructions which, when executed by a processor, implement the signal feature extraction method for multi-feature fusion extraction of electroencephalogram signals as described above.
The contents in the above method embodiments are all applicable to the present storage medium embodiment, the functions specifically implemented by the present storage medium embodiment are the same as those in the above method embodiments, and the advantageous effects achieved by the present storage medium embodiment are also the same as those achieved by the above method embodiments.
While the invention has been described with reference to a preferred embodiment, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (7)
1. A signal feature extraction method for multi-element feature fusion extraction of an electroencephalogram signal is characterized by comprising the following steps:
acquiring electroencephalogram signal data and performing signal preprocessing on the electroencephalogram signal data to obtain a preprocessed signal;
performing multi-dimensional feature extraction and matrix construction on the preprocessed signals to obtain an original feature matrix;
performing fusion dimensionality reduction processing on the original feature matrix to obtain a final feature matrix;
and inputting the final characteristic matrix into a pre-trained classification model for classification, and outputting a classification result.
2. The method for extracting signal features of multi-feature fusion extraction of electroencephalogram signals according to claim 1, wherein the step of obtaining an electroencephalogram signal data set and performing signal preprocessing on the electroencephalogram signal data set to obtain a preprocessed signal specifically comprises:
acquiring electroencephalogram signal data;
carrying out noise separation processing on the electroencephalogram signal data based on an independent component analysis method to obtain a separated signal;
performing frequency screening processing on the separated signals based on a Butterworth band-pass filter to obtain screened signals;
and normalizing the screened signals to obtain preprocessed signals.
3. The method for extracting signal features of multi-feature fusion extraction of electroencephalogram signals according to claim 2, wherein the step of performing multi-dimensional feature extraction and matrix construction on the preprocessed signals to obtain an original feature matrix specifically comprises:
performing feature extraction on the preprocessed signals on a time domain to obtain time domain features;
performing feature extraction on the preprocessed signal on a frequency domain based on an AR model power spectrum estimation method to obtain frequency domain features;
performing feature extraction on the preprocessed signal on a time-frequency domain based on a Hilbert-Huang transform method to obtain time-frequency domain features;
performing feature extraction on the preprocessed signals on a nonlinear domain based on nonlinear dynamics analysis to obtain nonlinear domain features;
performing feature extraction on the preprocessed signals on a space domain based on public space mode analysis to obtain space domain features;
and constructing a matrix according to the time domain characteristics, the frequency domain characteristics, the time-frequency domain characteristics, the nonlinear domain characteristics and the space domain characteristics to obtain an original characteristic matrix.
4. The method for extracting signal features of multi-feature fusion extraction of electroencephalogram signals according to claim 3, wherein the time-domain features comprise standard deviation, root mean square, mean of first-order difference absolute values, and the non-linear features comprise approximate entropy, fuzzy entropy, and sample entropy.
5. The method for extracting signal features through multi-feature fusion extraction of electroencephalogram signals according to claim 3, wherein the step of performing fusion dimensionality reduction processing on the original feature matrix to obtain a final feature matrix specifically comprises the following steps:
carrying out mean value removing processing on the original characteristic matrix to obtain a centralized matrix;
calculating a covariance matrix of the centralized matrix and decomposing an eigenvalue to obtain an eigenvalue and a corresponding eigenvector;
and sequencing the characteristic values and sequentially taking the characteristic vectors corresponding to the characteristic values of the preset number to construct a matrix to obtain a final characteristic matrix.
6. The method for extracting signal features through multi-feature fusion extraction of electroencephalogram signals according to claim 5, wherein the step of inputting the final feature matrix into a pre-trained classification model for classification and outputting a classification result specifically comprises:
performing parameter optimization on the SVM-KNN classifier based on a particle swarm algorithm and a pre-constructed training set to obtain a pre-trained classification model;
calculating the distance between the sample to be tested and the optimal hyperplane based on a pre-trained classification model by taking the final characteristic matrix as input to obtain a sample distance;
comparing the sample distance with a preset threshold value;
judging that the absolute value of the distance to the sample is smaller than a preset threshold value, classifying by adopting a KNN algorithm, and outputting a classification result;
and judging that the absolute value of the sample distance is larger than or equal to the preset threshold value, classifying by adopting an SVM algorithm, and outputting a classification result.
7. The method for extracting signal features of multi-feature fusion extraction of electroencephalogram signals according to claim 6, wherein the calculation formula of the sample distance is as follows:
in the above formula, a i Is a Lagrangian multiplier, y i a i And k is a system constant and b is a constant term of the decision function in the SVM for the coefficient of the support vector in the decision function.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210924013.2A CN115374812A (en) | 2022-08-02 | 2022-08-02 | Signal feature extraction method for multi-feature fusion extraction of electroencephalogram signals |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210924013.2A CN115374812A (en) | 2022-08-02 | 2022-08-02 | Signal feature extraction method for multi-feature fusion extraction of electroencephalogram signals |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115374812A true CN115374812A (en) | 2022-11-22 |
Family
ID=84062824
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210924013.2A Pending CN115374812A (en) | 2022-08-02 | 2022-08-02 | Signal feature extraction method for multi-feature fusion extraction of electroencephalogram signals |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115374812A (en) |
Cited By (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116808391A (en) * | 2023-06-29 | 2023-09-29 | 北京理工大学 | Sleep awakening method and system based on physiological signal decoding |
CN116808391B (en) * | 2023-06-29 | 2023-12-26 | 北京理工大学 | Sleep awakening method and system based on physiological signal decoding |
CN117338313A (en) * | 2023-09-15 | 2024-01-05 | 武汉纺织大学 | Multi-dimensional characteristic electroencephalogram signal identification method based on stacking integration technology |
CN117338313B (en) * | 2023-09-15 | 2024-05-07 | 武汉纺织大学 | Multi-dimensional characteristic electroencephalogram signal identification method based on stacking integration technology |
CN117860200A (en) * | 2024-01-10 | 2024-04-12 | 山东宝德龙健身器材有限公司 | Feature analysis method and terminal for Alzheimer's disease based on peripheral blood and electroencephalogram information |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN115374812A (en) | Signal feature extraction method for multi-feature fusion extraction of electroencephalogram signals | |
CN109389059B (en) | P300 detection method based on CNN-LSTM network | |
CN110680313B (en) | Epileptic period classification method based on pulse group intelligent algorithm and combined with STFT-PSD and PCA | |
CN108256629B (en) | EEG signal unsupervised feature learning method based on convolutional network and self-coding | |
CN111310570B (en) | Electroencephalogram signal emotion recognition method and system based on VMD and WPD | |
Gnanasivam et al. | Fingerprint gender classification using wavelet transform and singular value decomposition | |
CN106963369B (en) | Electroencephalogram relaxation degree identification method and device based on neural network model | |
CN110472649B (en) | Electroencephalogram emotion classification method and system based on multi-scale analysis and integrated tree model | |
CN104771163A (en) | Electroencephalogram feature extraction method based on CSP and R-CSP algorithms | |
CN111797674B (en) | MI electroencephalogram signal identification method based on feature fusion and particle swarm optimization algorithm | |
CN105913066B (en) | A kind of digitlization lungs sound feature dimension reduction method based on Method Using Relevance Vector Machine | |
CN105266804B (en) | A kind of brain-electrical signal processing method based on low-rank and sparse matrix decomposition | |
CN114533086A (en) | Motor imagery electroencephalogram decoding method based on spatial domain characteristic time-frequency transformation | |
CN112990008B (en) | Emotion recognition method and system based on three-dimensional characteristic diagram and convolutional neural network | |
CN104504407A (en) | Electronic nose feature selection optimization method on basis of multiple Fisher kernel discriminant analysis | |
CN117503057B (en) | Epileptic seizure detection device and medium for constructing brain network based on high-order tensor decomposition | |
CN106503733B (en) | The useful signal recognition methods clustered based on NA-MEMD and GMM | |
CN111310656A (en) | Single motor imagery electroencephalogram signal identification method based on multi-linear principal component analysis | |
Jinliang et al. | EEG emotion recognition based on granger causality and capsnet neural network | |
CN113536882A (en) | Multi-class motor imagery electroencephalogram signal feature extraction and classification method | |
Hwaidi et al. | A noise removal approach from eeg recordings based on variational autoencoders | |
CN106923825B (en) | Electroencephalogram relaxation degree identification method and device based on frequency domain and phase space | |
Cai et al. | EEG-based emotion recognition using multiple kernel learning | |
CN116982993B (en) | Electroencephalogram signal classification method and system based on high-dimensional random matrix theory | |
Gurve et al. | Deep learning of EEG time–frequency representations for identifying eye states |
Legal Events
Date | Code | Title | Description
---|---|---|---
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||