CN112200016A - Electroencephalogram signal emotion recognition based on ensemble learning method AdaBoost - Google Patents

Electroencephalogram signal emotion recognition based on ensemble learning method AdaBoost

Info

Publication number
CN112200016A
Authority
CN
China
Prior art keywords
data
classifier
emotion
features
feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202010977310.4A
Other languages
Chinese (zh)
Inventor
陈宇 (Chen Yu)
常锐 (Chang Rui)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Northeast Forestry University
Original Assignee
Northeast Forestry University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Northeast Forestry University
Priority to CN202010977310.4A
Publication of CN112200016A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/08Feature extraction
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/24Classification techniques
    • G06F18/241Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • G06N20/20Ensemble learning

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Signal Processing (AREA)
  • Medical Informatics (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Measurement And Recording Of Electrical Phenomena And Electrical Characteristics Of The Living Body (AREA)

Abstract

The invention relates to an electroencephalogram (EEG) signal emotion recognition method based on the ensemble learning method AdaBoost, comprising the following steps: first, the DEAP data set (already down-sampled to 128 Hz and artifact-removed) is imported, the last 60 s of data of the first 32 channels are taken from each of the 32 data files, and the data and 0/1 labels are extracted; then feature extraction and feature selection are performed, extracting emotion-related time-domain, frequency-domain and nonlinear features from the EEG data; next, binary emotion classification is performed on 4 dimensions (Valence, Arousal, Dominance, Liking): the extracted feature data are divided into a training set and a test set, the training set is used to train the AdaBoost classifier, and the classification effect is verified on the test set by 5-fold cross-validation. In addition, comparison experiments with a Random Forest classifier and an XGBoost classifier show that the method of the invention performs best and markedly improves emotion recognition accuracy.

Description

Electroencephalogram signal emotion recognition based on ensemble learning method AdaBoost
Technical field:
The invention relates to the fields of machine learning and emotion recognition, in particular to electroencephalogram (EEG) signal emotion recognition based on the ensemble learning method AdaBoost.
Background art:
Emotions are well known to play an important role in people's daily learning and life, simplifying and enlivening interpersonal communication. Emotion can be expressed through speech intonation and facial expression, which are easily perceived by others, and also through physiological changes of the nervous system, which are not. Because speech intonation and facial expression can be faked, however, they are not reliable indices for judging emotion, whereas physiological signals are more accurate. Therefore, with the development of artificial intelligence technology, emotion recognition based on physiological signals, especially electroencephalogram (EEG) signals, is becoming a popular research topic and attracting much attention.
However, the EEG signal is a chaotic time series: the rich emotional information stored in it is not evident but is expressed as values that change over time, and these values also contain information irrelevant to emotion as well as noise that disturbs emotion recognition. If the raw EEG data were fed directly into a classifier, the classifier could hardly recognize anything, which would severely impair the emotion recognition effect and strip the research of practical significance.
Therefore, the information contained in the EEG signal must be mined deeply: on the basis of filtering out noise and irrelevant information, effective information is extracted and emotion recognition is then performed on it. This scheme appears in a large number of EEG-related studies, namely: extract features from the preprocessed EEG signal, then feed the emotion-related feature data into a classifier for emotion recognition, improving the recognition accuracy to a certain extent. Although the feature extraction and classification methods attempted vary, most of these schemes share a common problem: the extracted features are of a single category, generally only time-domain or frequency-domain features are considered, so the emotional feature information can hardly be reflected comprehensively, and the classification methods used are mostly confined to traditional machine learning (e.g. KNN, SVM, ANN), which limits the classification performance.
Summary of the invention:
The invention aims to overcome the shortcomings of existing methods and provides an electroencephalogram signal emotion recognition method based on the ensemble learning method AdaBoost. The method uses the emotion recognition benchmark data set DEAP (Python version) and, after feature extraction, performs binary emotion classification on 4 emotion dimensions (Valence, Arousal, Dominance, Liking) with an AdaBoost classifier. In the feature extraction link in particular, three classes of features (time-domain, time-frequency-domain and nonlinear) are extracted and the feature dimension is reduced by a supervised feature selection method, addressing the problems of existing EEG emotion recognition methods that the extracted features reflect emotion-related information incompletely and that classifier performance is limited. The method comprises the following steps:
step 1: reading in the preprocessed data set, and determining a data range to be used;
step 2: performing feature extraction and feature selection on the electroencephalogram signal data, and extracting features related to emotion;
step 3: send the feature data obtained in step 2 into an AdaBoost classifier for training, and test the classifier's performance through the main experiment and the comparison experiments.
The implementation of step 1 comprises:
step 1.1: read the DEAP data set source files downloaded from the official website (the source files are preprocessed to 128 Hz) and import the last 60 s of data of the first 32 channels of the 32 .dat files (removing the first 3 s of emotion-irrelevant baseline signal data);
step 1.2: store the data part in the features_raw.csv file and the labels of the 4 emotion dimensions in the labels0-3.dat files, where a rating score > 5.0 is labeled 1 and 0 otherwise, preparing for the subsequent feature extraction, feature selection and binary classification.
The implementation of step 2 comprises:
step 2.1: feature extraction: in order to capture as much of the detail information in the electroencephalogram signal as possible, several classes of features are extracted simultaneously. Time-domain, time-frequency-domain and nonlinear features are extracted in turn from the last 60 s of data of all 32 channels. Step 2.1 comprises the following sub-steps:
step 2.1.1: time-domain feature extraction, covering the Mean, standard deviation (Std), Range, Skewness and Kurtosis together with the three Hjorth parameters Activity, Mobility and Complexity, i.e. 8 statistical features;
step 2.1.2: time-frequency-domain feature extraction: based on a wavelet packet decomposition with the db4 mother wavelet, 4 emotion-related frequency bands (Theta, Alpha, Beta, Gamma) are extracted, and Wavelet Energy and Wavelet Entropy are computed on each band, giving 2 × 4 = 8 features;
step 2.1.3: nonlinear feature extraction, comprising Power Spectral Density (PSD) and Differential Entropy (DE), 2 features in total.
step 2.2: feature combination: step 2.1 extracts 18 feature values per channel of each subject's video data, so each trial contains 32 × 18 = 576 feature values; these are spliced into a (32 × 40) × 576 = 1280 × 576 feature value matrix and stored in a train.csv file.
step 2.3: feature selection: the supervised feature selection method Linear Discriminant Analysis (LDA) is adopted to reduce the dimensionality of the extracted features, further improving the emotion-relevance of the retained features and enhancing the robustness of the classifier.
The implementation of step 3 comprises:
step 3.1: read the processed feature data from the train.csv file, and randomly divide it into a training set and a test set at a 4:1 ratio (80% training set, 20% test set);
step 3.2: classifier training: feed the training set into the AdaBoost ensemble learning classifier for binary classification training, continuously tuning the classifier parameters to determine the optimal parameter combination and strengthen the classifier's learning ability;
step 3.3: classifier performance testing: step 3.3 comprises the following sub-steps:
step 3.3.1: feed the test set into the classifier trained in step 3.2 and perform binary classification tests on the 4 emotion dimensions (Valence, Arousal, Dominance, Liking) based on 5-fold cross-validation;
step 3.3.2: run comparison experiments with other classifiers, here Random Forest and XGBoost;
step 3.3.3: for the main and comparison experiments, consider the following 4 performance indexes: accuracy, precision, recall and F1-Score, and plot the confusion matrix and result curves to evaluate classifier performance from multiple angles.
The invention has the following beneficial effects. Considering that in existing EEG emotion recognition methods the extracted features are of a single type and can hardly reflect emotional information comprehensively, and that the classification methods used are mostly confined to traditional machine learning so that classifier performance is limited, the invention starts from the two key links of feature extraction and classification. First, in feature extraction, multiple characteristics of the EEG signal are considered comprehensively: taking each channel of each subject's video file as a unit, 8 time-domain, 8 time-frequency-domain and 2 nonlinear features are extracted and combined into feature vectors, all feature vectors are assembled into a feature matrix, and a supervised feature selection method then reduces the feature dimension and strengthens the correlation between the extracted features and emotion. Second, in classification, the ensemble learning method AdaBoost performs binary EEG classification on the 4 emotion dimensions, comparison experiments use Random Forest and XGBoost classifiers, and 5-fold cross-validation together with 4 separate performance indexes, a confusion matrix and result curves evaluates the classifiers from multiple angles. The invention makes a clear advance on the existing problems, and classifier performance is effectively improved.
Description of the drawings:
FIG. 1 is a flowchart of an electroencephalogram signal emotion recognition method based on an ensemble learning method AdaBoost.
Fig. 2 is a diagram of electrode names and electrode locations corresponding to different brain regions.
FIG. 3 is a diagram showing the distribution of positive and negative examples in 4 emotion dimensions in the data Labels used in step 1.
Fig. 4 is a diagram showing the original signal and the decomposed signals A4 and D4 to D1, taking channel Fp1 of subject 1 as an example.
Fig. 5 is a schematic diagram of the algorithm idea of the feature selection method LDA.
Fig. 6 is a working schematic diagram of the ensemble learning method AdaBoost.
Fig. 7 is a schematic diagram of a confusion matrix.
FIG. 8 is a classification result confusion matrix for the AdaBoost classifier in 4 emotion dimensions.
FIG. 9 shows the ACC values of the 3 classifiers on the 4 emotion dimensions under 5-fold cross-validation.
FIG. 10 shows the F1-Score values of the 3 classifiers on the 4 emotion dimensions under 5-fold cross-validation.
Detailed description of embodiments:
the technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
FIG. 1 is a schematic diagram of a specific process for carrying out the present invention, which mainly comprises the following three steps:
1. Preprocessing:
The invention uses the emotion recognition benchmark data set DEAP (Python version), which collected the physiological signals and corresponding emotion ratings of 32 volunteers; each volunteer watched 40 music videos carrying different emotions while their physiological signals were recorded into data files s01.dat-s32.dat. During recording, 40 signal channels were captured in total (the first 32 are EEG leads, the last 8 are peripheral physiological signals including electrooculogram, electromyogram, respiration and others) at a sampling frequency of 512 Hz, which a series of filtering operations then reduced to 128 Hz. Each data file contains the following two matrices:
(1) data matrix: 40 × 40 × 8064, where the first 40 is the number of videos, the second 40 the number of acquisition channels, and 8064 = 63 × 128 the samples of 63 seconds of experimental data per video and channel; the first 3 seconds are baseline data recorded before each trial, and the remaining 60 seconds are data recorded during the trial;
(2) labels matrix: 40 × 4, the 4 columns being the ratings, on a scale of 1-9, of the 4 emotion dimensions Valence, Arousal, Dominance and Liking.
Since the object of study of the invention is the EEG signal, this step takes from the 32 .dat files the last 60 s of data of the first 32 channels (discarding the last 8 peripheral channels and the first 3 s of emotion-irrelevant baseline data), stores the data part in the features_raw.csv file and the labels of the 4 emotion dimensions in the labels0-3.dat files, where a rating > 5.0 is labeled 1 and 0 otherwise (the distribution of positive and negative examples on the 4 emotion dimensions is shown in FIG. 3), preparing for the subsequent feature extraction, feature selection and binary classification.
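As a concrete illustration of this step, a minimal Python sketch is given below. It assumes the standard pickled format of the DEAP Python version (each sXX.dat file is a dictionary with a 'data' matrix of 40 × 40 × 8064 and a 'labels' matrix of 40 × 4); the file paths are illustrative and the writing of features_raw.csv / labels0-3.dat is omitted. This is a sketch under those assumptions, not the exact code of the invention.

```python
import pickle
import numpy as np

FS = 128                 # sampling rate of the preprocessed DEAP data
BASELINE = 3 * FS        # first 3 s of baseline samples to discard

all_data, all_labels = [], []
for subject in range(1, 33):
    with open(f"s{subject:02d}.dat", "rb") as f:
        record = pickle.load(f, encoding="latin1")
    # 40 videos x first 32 EEG channels x last 60 s of samples
    all_data.append(record["data"][:, :32, BASELINE:])
    # binarize the 1-9 ratings: score > 5.0 -> 1, otherwise 0
    all_labels.append((record["labels"] > 5.0).astype(int))

data = np.concatenate(all_data)      # shape (1280, 32, 7680)
labels = np.concatenate(all_labels)  # shape (1280, 4): Valence, Arousal, Dominance, Liking
```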
2. Feature extraction and feature selection:
As a chaotic time series, the EEG signal carries rich emotional feature information, and the key of this link is how to extract the emotion-related feature information from it. Related studies show that EEG features mainly comprise time-domain, frequency-domain, time-frequency-domain and nonlinear dynamics features, the latter two being the more relevant to emotion. Therefore, to better capture the detail information in the EEG signal, 3 classes of features are extracted: time-domain, time-frequency-domain and nonlinear dynamics features.
(1) Time domain characteristics:
① Mean: $\mathrm{Mean}(x) = \frac{1}{N}\sum_{i=1}^{N} x_i$

② Standard deviation (Std): $\mathrm{Std}(x) = \sqrt{\frac{1}{N}\sum_{i=1}^{N}\left(x_i - \mathrm{Mean}(x)\right)^{2}}$

③ Range: $\mathrm{Range}(x) = \max(x) - \min(x)$

④ Skewness: $\mathrm{Skewness}(x) = \mathrm{Mean}\left(\left(\frac{x - \mathrm{Mean}(x)}{\mathrm{Std}(x)}\right)^{3}\right)$

⑤ Kurtosis: $\mathrm{Kurtosis}(x) = \mathrm{Mean}\left(\left(\frac{x - \mathrm{Mean}(x)}{\mathrm{Std}(x)}\right)^{4}\right)$

⑥ Hjorth parameter Activity: $\mathrm{Activity}(x) = \mathrm{Std}(x)^{2}$

⑦ Hjorth parameter Mobility: $\mathrm{Mobility}(x) = \sqrt{\frac{\mathrm{Activity}(x')}{\mathrm{Activity}(x)}}$

⑧ Hjorth parameter Complexity: $\mathrm{Complexity}(x) = \frac{\mathrm{Mobility}(x')}{\mathrm{Mobility}(x)}$

where $x = (x_1, \ldots, x_N)$ is the signal of one channel and $x'$ is its first-order difference.
wherein: skewness (Skewness) is the standard third-order central moment of a sample, and more emphasis is placed on describing the symmetry of overall value distribution; the Kurtosis (Kurtosis) is a standard fourth-order central moment of the sample, and the steepness of the overall all-value distribution form is described more emphatically, so that the data distribution form can be better described by combining the Kurtosis and the steepness; the Hjorth parameter provides a method that can quickly compute three important features of the signal in the time domain: mobility, and Complexity. The method is widely applied to the field of physiological signal processing.
(2) Time-frequency domain characteristics:
Besides the time domain, time-frequency-domain features are another important feature class of EEG signals. Studies show that wavelet packet decomposition provides effective multi-resolution analysis of non-stationary signals and overcomes the limitation of the plain wavelet transform, which at each level decomposes only the low-frequency sub-band and cannot extract the high-frequency sub-bands at the same high resolution; among the mother wavelets, db4 has good smoothness and detects the variations of EEG signals well, so the invention adopts a 4-level wavelet packet decomposition based on the db4 mother wavelet. Concretely, the EEG signal is decomposed into the 4-level detail signals D4-D1 and the level-4 approximation signal A4, whose values are the wavelet coefficients of each level; at 128 Hz these correspond to the frequency bands Theta (4-8 Hz, D4), Alpha (8-16 Hz, D3), Beta (16-32 Hz, D2) and Gamma (32-64 Hz, D1). Taking channel Fp1 of subject 1 as an example, the original signal and the decomposed signals A4, D4-D1 are shown in FIG. 4. Then two features, wavelet energy and wavelet entropy, are extracted from the wavelet coefficients of each frequency band.
① Wavelet Energy: $E_j = \sum_{k} c_{j,k}^{2}$

② Wavelet Entropy: $\mathrm{WE}_j = -\sum_{k} p_{j,k}\ln p_{j,k}$, with $p_{j,k} = \frac{c_{j,k}^{2}}{E_j}$

where $c_{j,k}$ is the $k$-th wavelet coefficient of frequency band $j$.
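As an illustration, the sketch below realizes the 4-level db4 decomposition with PyWavelets and the per-band energy/entropy definitions above. Using pywt.wavedec is our assumption about the implementation: it yields exactly the A4, D4-D1 sub-band structure described in the text, even though the text names the technique wavelet packet decomposition.

```python
import numpy as np
import pywt

def wavelet_features(x):
    """Wavelet energy and entropy on the Theta/Alpha/Beta/Gamma bands."""
    # coeffs = [A4, D4, D3, D2, D1]; at fs = 128 Hz this maps to
    # D4 = Theta (4-8 Hz), D3 = Alpha (8-16), D2 = Beta (16-32), D1 = Gamma (32-64)
    coeffs = pywt.wavedec(x, "db4", level=4)
    feats = []
    for c in coeffs[1:]:                  # keep D4..D1, drop the approximation A4
        energy = np.sum(c ** 2)           # wavelet energy E_j
        p = c ** 2 / energy               # energy distribution of the coefficients
        entropy = -np.sum(p * np.log(p + 1e-12))  # wavelet entropy WE_j
        feats += [energy, entropy]
    return feats                          # 2 features x 4 bands = 8
```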
(3) nonlinear kinetic characteristics:
Since the human brain is a typical nonlinear dynamical system, the emotion-related information reflected by the nonlinear dynamics features of the EEG signal is also highly representative. The invention computes the classical Power Spectral Density (PSD) and Differential Entropy (DE) features on the 60 s data of each channel using a short-time Fourier transform with non-overlapping Hamming windows. Differential entropy is defined for continuous random variables, and its calculation formula can be expressed as:
$h(X) = -\int_{-\infty}^{+\infty} f(x)\ln f(x)\,\mathrm{d}x = \frac{1}{2}\ln\left(2\pi e\sigma^{2}\right)$

wherein $x$ obeys the Gaussian distribution $N(\mu, \sigma^{2})$ and $f(x)$ is its probability density function. Research shows that, for an EEG sequence of fixed length on a given frequency band, the differential entropy equals the logarithm of its power spectral density.
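A sketch of the two nonlinear features follows; scipy.signal.welch with noverlap=0 stands in for the non-overlapping Hamming-window short-time Fourier transform, and DE uses the Gaussian closed form above rather than a spectral estimate. The exact windowing parameters are not specified by the invention, so ours are assumptions.

```python
import numpy as np
from scipy.signal import welch

def nonlinear_features(x, fs=128):
    """Mean PSD and differential entropy of one channel's 60 s segment."""
    # non-overlapping 1 s Hamming windows, averaged over the segment
    _, pxx = welch(x, fs=fs, window="hamming", nperseg=fs, noverlap=0)
    psd = np.mean(pxx)
    # DE of a Gaussian signal: 0.5 * ln(2*pi*e*sigma^2)
    de = 0.5 * np.log(2 * np.pi * np.e * np.var(x))
    return psd, de
```

Per the equivalence noted above, a per-band DE could equally be obtained as the logarithm of that band's PSD.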
Through the above steps, 18 feature values are extracted per channel of each subject's video data, so each trial contains 32 × 18 = 576 feature values; these are spliced into a (32 × 40) × 576 = 1280 × 576 feature value matrix and stored in a train.csv file.
To reduce the feature dimension and further strengthen the emotion-relevance of the retained features, Linear Discriminant Analysis (LDA) is used for feature selection. LDA is a supervised dimensionality reduction technique, i.e. every sample of the data set has a class output, and its idea is: project the data onto a lower-dimensional space such that, after projection, samples of the same class lie as close together as possible while the centers of different classes lie as far apart as possible, as shown in FIG. 5. Since the invention performs a binary EEG classification task, the algorithm is realized as follows:
Let the data set be $D = \{(x_1, y_1), (x_2, y_2), \ldots, (x_m, y_m)\}$, where each sample $x_i$ is an $n$-dimensional vector and $y_i \in \{0, 1\}$. Define $N_j$ as the number of class-$j$ samples, $X_j$ as the set of class-$j$ samples, $\mu_j$ as the mean vector of the class-$j$ samples, and $\Sigma_j$ as the covariance matrix of class $j$ (strictly speaking, a covariance matrix without the denominator), with $j = 0, 1$. Then:

$\mu_j = \frac{1}{N_j}\sum_{x \in X_j} x$

$\Sigma_j = \sum_{x \in X_j}(x - \mu_j)(x - \mu_j)^{T}$

In the binary task, the data only need to be projected onto a straight line. If the projection direction is the vector $w$, the projection of any sample $x_i$ onto the line is $w^{T}x_i$, and the centers $\mu_0, \mu_1$ of the two classes project to $w^{T}\mu_0$ and $w^{T}\mu_1$. Combining the LDA idea of "within-class data as close as possible, between-class data as far as possible", define:

Within-class divergence matrix: $S_w = \Sigma_0 + \Sigma_1$

Between-class divergence matrix: $S_b = (\mu_0 - \mu_1)(\mu_0 - \mu_1)^{T}$

This yields the optimization objective (a generalized Rayleigh quotient) to be maximized:

$J(w) = \frac{w^{T}S_b w}{w^{T}S_w w}$

Note that $S_b w$ is always parallel to $\mu_0 - \mu_1$, so one may let $S_b w = \lambda(\mu_0 - \mu_1)$. Substituting this into the generalized eigenvalue equation $S_w^{-1}S_b w = \lambda w$ yields the solution $w = S_w^{-1}(\mu_0 - \mu_1)$.

In summary, the original sample set is projected into the 1-dimensional space spanned by the base vector $w$, and the projected feature set is the desired dimension-reduced feature set.
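The closed-form solution above transcribes directly into code; the following sketch (our own variable names) fits the two-class LDA direction and returns the 1-dimensional projected features:

```python
import numpy as np

def lda_project(X, y):
    """Two-class LDA: project X onto w = Sw^{-1}(mu0 - mu1)."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    # within-class divergence Sw = Sigma0 + Sigma1 (no 1/N denominator, as above)
    Sw = (X0 - mu0).T @ (X0 - mu0) + (X1 - mu1).T @ (X1 - mu1)
    w = np.linalg.solve(Sw, mu0 - mu1)   # w = Sw^{-1}(mu0 - mu1)
    return X @ w                         # dimension-reduced (1-D) feature set
```

scikit-learn's LinearDiscriminantAnalysis performs an equivalent supervised reduction and may be preferable in practice.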
3. Classification:
On the basis of the previous two steps, the obtained feature set can be fed into a classifier for training and performance evaluation. First, the read-in feature set is randomly divided into a training set and a test set at a 4:1 ratio (80% training set, 20% test set), ensuring that the two sets do not intersect; the training set is then fed into a classifier for binary classification training, here the ensemble learning method AdaBoost.
AdaBoost is an adaptive boosting algorithm from the field of ensemble learning and is effective on binary problems. Its basic principle is iteration: one new weak classifier is added and trained per round until a predetermined, sufficiently small error rate is reached, and every training sample carries a weight indicating its probability of being selected into the training set of the next classifier. If a sample has been classified accurately, its selection probability is lowered when constructing the next training set; conversely, if a sample is misclassified, its weight is increased. In this way AdaBoost concentrates on the samples that are easy to misclassify, which improves the generalization ability of the classifier and makes overfitting unlikely.
The working principle diagram of AdaBoost is shown in fig. 6, and the mathematical description is as follows:
(1) Initialize the weight distribution of the training data:

$D_1 = (w_{11}, \ldots, w_{1i}, \ldots, w_{1N}), \quad w_{1i} = \frac{1}{N}, \quad i = 1, 2, \ldots, N$

wherein $D_1$ is the weight distribution of the samples at the first iteration, $w_{11}$ is the weight of the first sample at the first iteration, and $N$ is the total number of samples;

(2) Perform $m = 1, 2, \ldots, M$ iterations:

① Learn the training samples with weight distribution $D_m$ to obtain a weak classifier $G_m(x): x \to \{-1, +1\}$, whose performance is measured by the value $\varepsilon_m$ of the error function:

$\varepsilon_m = \sum_{i=1}^{N} w_{m,i}\, I\left(G_m(x_i) \neq y_i\right)$
② Compute the say (i.e. weight) $\alpha_m$ of the weak classifier $G_m$, which represents the importance of $G_m$ in the final classifier:

$\alpha_m = \frac{1}{2}\ln\frac{1 - \varepsilon_m}{\varepsilon_m}$
From the above formula, $\alpha_m$ increases as $\varepsilon_m$ decreases, namely: a classifier with a small error carries a large weight in the final classifier;
③ Update the weight distribution of the training samples for the next iteration, increasing the weights of misclassified samples and decreasing those of correctly classified samples:

$D_{m+1} = (w_{m+1,1}, w_{m+1,2}, \ldots, w_{m+1,i}, \ldots, w_{m+1,N})$

$w_{m+1,i} = \frac{w_{m,i}}{Z_m}\exp\left(-\alpha_m y_i G_m(x_i)\right)$

$Z_m = \sum_{i=1}^{N} w_{m,i}\exp\left(-\alpha_m y_i G_m(x_i)\right)$

wherein $D_{m+1}$ is the sample weight distribution of the next iteration, $w_{m+1,i}$ is the weight of the $i$-th sample at the next iteration, $y_i$ is the class (+1/-1) of the $i$-th sample, and $G_m(x_i)$ is the classification result (+1/-1) of the weak classifier on sample $x_i$; $y_i G_m(x_i)$ equals 1 if the classification is correct and -1 otherwise;
(3) Combine the weak classifiers into a strong classifier:

① Take the weighted sum of the classifiers of all iterations:

$f(x) = \sum_{m=1}^{M} \alpha_m G_m(x)$

② Apply the sign function to the sum to obtain the final strong classifier $G(x)$:

$G(x) = \mathrm{sign}\left(f(x)\right) = \mathrm{sign}\left(\sum_{m=1}^{M} \alpha_m G_m(x)\right)$
In the experiments of the invention, implementation and parameter tuning during the training stage showed that the best performance is obtained when the number of weak learner iterations is set to about 30.
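A training-stage sketch with scikit-learn follows; n_estimators=30 mirrors the roughly 30 weak learners found best above, while the remaining parameters are illustrative defaults rather than the invention's tuned combination. X and y stand for the LDA-reduced features and the 0/1 labels of one emotion dimension.

```python
from sklearn.ensemble import AdaBoostClassifier
from sklearn.model_selection import train_test_split

# the 4:1 random split of step 3.1 (80% training set, 20% test set)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)

clf = AdaBoostClassifier(n_estimators=30)   # ~30 weak learners (decision stumps)
clf.fit(X_train, y_train)
print("held-out accuracy:", clf.score(X_test, y_test))
```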
After the training stage, the test set is fed into the trained classifier, binary classification tests are performed on the 4 emotion dimensions (Valence, Arousal, Dominance, Liking) based on 5-fold cross-validation, and comparison experiments are run with Random Forest and XGBoost classifiers. For the main and comparison experiments, the following 4 performance indexes are considered: accuracy, precision, recall and F1-Score; the confusion matrix and result curves are plotted to evaluate classifier performance from multiple angles.
Confusion matrix: in a classification task, several combinations exist between the predicted result (Predict) and the real result (Real), and the matrix of these combinations is the confusion matrix. In the binary task there are 2 × 2 = 4 combinations, denoted TP, FP, FN and TN, giving the confusion matrix shown in FIG. 7.
True Positive (TP): predicted positive and actually positive;
False Positive (FP): predicted positive but actually negative;
False Negative (FN): predicted negative but actually positive;
True Negative (TN): predicted negative and actually negative.
The performance indexes are:

(1) Accuracy: $\mathrm{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$

(2) Precision: $\mathrm{Precision} = \frac{TP}{TP + FP}$

(3) Recall: $\mathrm{Recall} = \frac{TP}{TP + FN}$

(4) F1-Score: $F1 = \frac{2 \times \mathrm{Precision} \times \mathrm{Recall}}{\mathrm{Precision} + \mathrm{Recall}}$
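For illustration, the evaluation protocol (5-fold cross-validation plus the four indexes and the confusion matrix) can be realized with scikit-learn as sketched below, continuing the variables of the training sketch:

```python
from sklearn.metrics import (accuracy_score, precision_score,
                             recall_score, f1_score, confusion_matrix)
from sklearn.model_selection import cross_val_score

# 5-fold cross-validated accuracy on the whole feature set
print("5-fold ACC:", cross_val_score(clf, X, y, cv=5, scoring="accuracy").mean())

y_pred = clf.predict(X_test)
print("accuracy :", accuracy_score(y_test, y_pred))
print("precision:", precision_score(y_test, y_pred))
print("recall   :", recall_score(y_test, y_pred))
print("F1-Score :", f1_score(y_test, y_pred))
# scikit-learn lays the confusion matrix out as [[TN, FP], [FN, TP]] for labels (0, 1)
print(confusion_matrix(y_test, y_pred))
```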
The larger these performance index values, the better the classifier performance. The final results are shown in FIG. 8 to FIG. 10. The analysis and result plots show that the method of the invention performs best and, compared with traditional feature extraction and classification methods, improves the emotion recognition accuracy markedly.
Finally, it should be understood that the parts of the specification not set forth in detail belong to the prior art.
While the invention has been described with reference to specific embodiments and procedures, those skilled in the art will understand that the invention is not limited thereto, and various changes and substitutions may be made without departing from its spirit. The scope of the invention is limited only by the appended claims.
The embodiments described herein with reference to the accompanying drawings are exemplary only and should not be taken as limiting the invention.

Claims (3)

1. An electroencephalogram signal preprocessing method, comprising the following steps:
step 1: read the DEAP data set source files downloaded from the official website (the source files are preprocessed to 128 Hz) and import the last 60 s of data of the first 32 channels of the 32 .dat files (removing the first 3 s of emotion-irrelevant baseline signal data);
step 2: store the data part and the labels of the 4 emotion dimensions in the features_raw.csv file and the labels0-3.dat files respectively, where a rating score > 5.0 is stored as 1 and 0 otherwise, preparing for the subsequent feature extraction, feature selection and classification work.
2. A feature extraction and feature selection method according to claim 1, comprising the following steps:
step 1: feature extraction: in order to capture as much of the detail information in the electroencephalogram signal as possible, several classes of features are extracted simultaneously; time-domain, time-frequency-domain and nonlinear features are extracted in turn from the last 60 s of data of all 32 channels. Step 1 comprises the following sub-steps:
step 1.1: time-domain feature extraction, covering the Mean, standard deviation (Std), Range, Skewness and Kurtosis together with the three Hjorth parameters Activity, Mobility and Complexity, i.e. 8 statistical features;
step 1.2: time-frequency-domain feature extraction: based on a wavelet packet decomposition with the db4 mother wavelet, 4 emotion-related frequency bands (Theta, Alpha, Beta, Gamma) are extracted, and Wavelet Energy and Wavelet Entropy are computed on each band, giving 2 × 4 = 8 features;
step 1.3: nonlinear feature extraction, comprising Power Spectral Density (PSD) and Differential Entropy (DE), 2 features in total.
step 2: feature combination: step 1 extracts 18 feature values per channel of each subject's video data, so each trial contains 32 × 18 = 576 feature values; these are spliced into a (32 × 40) × 576 = 1280 × 576 feature value matrix and stored in a train.csv file.
step 3: feature selection: the supervised feature selection method Linear Discriminant Analysis (LDA) is adopted to reduce the dimensionality of the extracted features, further improving the emotion-relevance of the retained features and enhancing the robustness of the classifier.
3. An electroencephalogram signal classification method based on the ensemble learning method AdaBoost according to claim 2, comprising the following steps:
step 1: read the processed feature data from the train.csv file, and randomly divide it into a training set and a test set at a 4:1 ratio (80% training set, 20% test set);
step 2: classifier training: feed the training set into the AdaBoost ensemble learning classifier for binary classification training, continuously tuning the classifier parameters to determine the optimal parameter combination and strengthen the classifier's learning ability;
step 3: classifier performance testing: step 3 comprises the following sub-steps:
step 3.1: feed the test set into the classifier trained in step 2 and perform binary classification tests on the 4 emotion dimensions (Valence, Arousal, Dominance, Liking) based on 5-fold cross-validation;
step 3.2: run comparison experiments with other classifiers, here Random Forest and XGBoost;
step 3.3: for the main and comparison experiments, consider the following 4 performance indexes: accuracy, precision, recall and F1-Score, and plot the confusion matrix and result curves to evaluate classifier performance from multiple angles.
CN202010977310.4A 2020-09-17 2020-09-17 Electroencephalogram signal emotion recognition based on ensemble learning method AdaBoost Pending CN112200016A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010977310.4A CN112200016A (en) 2020-09-17 2020-09-17 Electroencephalogram signal emotion recognition based on ensemble learning method AdaBoost

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010977310.4A CN112200016A (en) 2020-09-17 2020-09-17 Electroencephalogram signal emotion recognition based on ensemble learning method AdaBoost

Publications (1)

Publication Number Publication Date
CN112200016A true CN112200016A (en) 2021-01-08

Family

ID=74015281

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010977310.4A Pending CN112200016A (en) 2020-09-17 2020-09-17 Electroencephalogram signal emotion recognition based on ensemble learning method AdaBoost

Country Status (1)

Country Link
CN (1) CN112200016A (en)



Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106886792A (en) * 2017-01-22 2017-06-23 北京工业大学 A kind of brain electricity emotion identification method that Multiple Classifiers Combination Model Based is built based on layering
US20190347476A1 (en) * 2018-05-09 2019-11-14 Korea Advanced Institute Of Science And Technology Method for estimating human emotions using deep psychological affect network and system therefor
CN109498041A (en) * 2019-01-15 2019-03-22 吉林大学 Driver road anger state identification method based on brain electricity and pulse information
CN110135285A (en) * 2019-04-26 2019-08-16 中国人民解放军战略支援部队信息工程大学 It is a kind of to use the brain electrical silence state identity identifying method and device of singly leading equipment
CN110414548A (en) * 2019-06-06 2019-11-05 西安电子科技大学 The level Bagging method of sentiment analysis is carried out based on EEG signals
CN110610168A (en) * 2019-09-20 2019-12-24 合肥工业大学 Electroencephalogram emotion recognition method based on attention mechanism

Non-Patent Citations (4)

* Cited by examiner, † Cited by third party
Title
HARIKUMAR RAJAGURU et al.: "Analysis of adaboost classifier from compressed EEG features for epilepsy detection", 2017 International Conference on Computing Methodologies and Communication (ICCMC) *
QIAO XIE et al.: "Electroencephalogram Emotion Recognition Based on A Stacking Classification Model", 2018 37th Chinese Control Conference (CCC) *
WANG Yongzong: "Research on EEG feature combination and channel optimization selection for emotion recognition", China Excellent Master's and Doctoral Dissertations Full-text Database (Master), Information Science and Technology *
GUO Jinliang: "EEG emotion recognition based on sparse group lasso-Granger causality features", China Excellent Master's and Doctoral Dissertations Full-text Database (Master), Information Science and Technology *

Cited By (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113180659A (en) * 2021-01-11 2021-07-30 华东理工大学 Electroencephalogram emotion recognition system based on three-dimensional features and cavity full convolution network
CN113180659B (en) * 2021-01-11 2024-03-08 华东理工大学 Electroencephalogram emotion recognition method based on three-dimensional feature and cavity full convolution network
CN112836593A (en) * 2021-01-15 2021-05-25 西北大学 Emotion recognition method and system fusing prior and automatic electroencephalogram characteristics
CN112836593B (en) * 2021-01-15 2023-06-20 西北大学 Emotion recognition method and system integrating priori and automatic electroencephalogram features
CN112883855A (en) * 2021-02-04 2021-06-01 东北林业大学 Electroencephalogram signal emotion recognition based on CNN + data enhancement algorithm Borderline-SMOTE
CN113191232A (en) * 2021-04-21 2021-07-30 西安交通大学 Electro-hydrostatic actuator fault identification method based on multi-mode homologous features and XGboost model
CN114027840A (en) * 2021-11-12 2022-02-11 江苏科技大学 Emotional electroencephalogram recognition method based on variational modal decomposition
CN114202524A (en) * 2021-12-10 2022-03-18 中国人民解放军陆军特色医学中心 Performance evaluation method and system of multi-modal medical image
CN116028882A (en) * 2023-03-29 2023-04-28 深圳市傲天科技股份有限公司 User labeling and classifying method, device, equipment and storage medium
CN116028882B (en) * 2023-03-29 2023-06-02 深圳市傲天科技股份有限公司 User labeling and classifying method, device, equipment and storage medium
CN116211322A (en) * 2023-03-31 2023-06-06 上海外国语大学 Depression recognition method and system based on machine learning electroencephalogram signals

Similar Documents

Publication Publication Date Title
CN112200016A (en) Electroencephalogram signal emotion recognition based on ensemble learning method AdaBoost
Golmohammadi et al. Automatic analysis of EEGs using big data and hybrid deep learning architectures
Hussein et al. Epileptic seizure detection: A deep learning approach
CN111134666A (en) Emotion recognition method of multi-channel electroencephalogram data and electronic device
Travieso et al. Detection of different voice diseases based on the nonlinear characterization of speech signals
CN110472649B (en) Electroencephalogram emotion classification method and system based on multi-scale analysis and integrated tree model
Hemmerling et al. Voice data mining for laryngeal pathology assessment
Hariharan et al. Objective evaluation of speech dysfluencies using wavelet packet transform with sample entropy
Diykh et al. Texture analysis based graph approach for automatic detection of neonatal seizure from multi-channel EEG signals
CN115770044B (en) Emotion recognition method and device based on electroencephalogram phase amplitude coupling network
CN112364697A (en) Electroencephalogram emotion recognition method based on R-LSTM model
Ge et al. Applicability of hyperdimensional computing to seizure detection
Khare et al. Multiclass sleep stage classification using artificial intelligence based time-frequency distribution and CNN
Hariharan et al. A hybrid expert system approach for telemonitoring of vocal fold pathology
Yao et al. A cnn-transformer deep learning model for real-time sleep stage classification in an energy-constrained wireless device
Kumar et al. Comparison of Machine learning models for Parkinson’s Disease prediction
Sharan Cough sound detection from raw waveform using SincNet and bidirectional GRU
Xie et al. Multi-view features fusion for birdsong classification
CN114091529A (en) Electroencephalogram emotion recognition method based on generation countermeasure network data enhancement
Ghoraani et al. Discriminant non-stationary signal features’ clustering using hard and fuzzy cluster labeling
Boualoulou et al. CNN and LSTM for the classification of parkinson's disease based on the GTCC and MFCC
CN114742107A (en) Method for identifying perception signal in information service and related equipment
Prawira et al. Emotion classification using fast fourier transform and recurrent neural networks
US20220180129A1 (en) Fcn-based multivariate time series data classification method and device
Mishra et al. Improvement of emotion classification performance using multi-resolution variational mode decomposition method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20210108