CN112200016A - Electroencephalogram signal emotion recognition based on ensemble learning method AdaBoost - Google Patents
- Publication number: CN112200016A (application CN202010977310.4A)
- Authority: CN (China)
- Legal status: Pending (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F2218/00—Aspects of pattern recognition specially adapted for signal processing
- G06F2218/08—Feature extraction
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
- G06F18/241—Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N20/00—Machine learning
- G06N20/20—Ensemble learning
Abstract
The invention relates to an electroencephalogram (EEG) signal emotion recognition method based on the ensemble learning method AdaBoost, comprising the following steps. First, the DEAP data set (already down-sampled to 128 Hz and artifact-removed) is imported: the last 60 s of data of the first 32 channels are taken from each of the 32 data files, and the data and the 0/1 labels are extracted. Then, feature extraction and feature selection are performed, extracting time-domain, frequency-domain, and nonlinear features related to emotion from the EEG data. Next, binary emotion classification is performed in 4 dimensions (Valence, Arousal, Dominance, Liking): the extracted feature data are divided into a training set and a test set, the training set is used to train an AdaBoost classifier, and the classification effect is verified on the test set with 5-fold cross validation. In addition, comparison experiments with a Random Forest classifier and an XGBoost classifier show that the disclosed method performs best, significantly improving emotion recognition accuracy.
Description
The technical field is as follows:
The invention relates to the field of machine learning and emotion recognition, and in particular to EEG emotion recognition based on the ensemble learning method AdaBoost.
Background art:
It is well known that emotions play an important role in people's daily learning and life, simplifying and enlivening communication between people. Emotion can be expressed through speech intonation and facial expression, which are easily perceived by others, and also through physiological changes of the nervous system, which are not. However, because speech intonation and facial expression can be faked, the former are not reliable indices for judging emotion, whereas physiological signals are more accurate. Therefore, with the development of artificial intelligence technology, emotion recognition based on physiological signals, especially electroencephalogram (EEG) signals, is becoming a popular research topic and attracting much attention.
However, the EEG signal is a chaotic time series: the rich emotional information stored in it is not explicit, but is expressed numerically as the signal changes over time. Naturally, these values also contain information irrelevant to emotion and noise that degrades emotion recognition. If raw EEG data were fed directly into a classifier, the classifier would struggle to discriminate, seriously harming the emotion recognition effect and leaving the research of little practical significance.
Therefore, the information contained in the EEG signal needs to be mined deeply: on the basis of filtering out noise and irrelevant information, effective information must be extracted and emotion recognition performed on it. This scheme appears in a large number of EEG studies: features are extracted from the preprocessed EEG signal, and the emotion-related feature data are then fed into a classifier for emotion recognition, improving accuracy to some extent. Although the feature extraction and classification methods tried vary, most of these schemes share a common problem: the extracted features are of a single category (generally only time-domain or frequency-domain features are considered), which makes it difficult to reflect emotional feature information comprehensively, and the classification methods used are mostly concentrated in traditional machine learning (e.g., KNN, SVM, ANN), which limits classifier performance.
The invention content is as follows:
The invention aims to overcome the defects of existing methods and provides an EEG signal emotion recognition method based on the ensemble learning method AdaBoost. The method uses a standard emotion recognition data set, the DEAP data set (Python edition); after feature extraction, binary emotion classification is carried out on 4 emotion dimensions (Valence, Arousal, Dominance, Liking) with an AdaBoost classifier. In the feature extraction link in particular, three classes of features (time-domain, frequency-domain, and nonlinear) are extracted, and the feature dimensionality is reduced by a supervised feature selection method, addressing the problems that the features extracted by existing EEG emotion recognition methods reflect emotion-related information incompletely and that classifier performance is limited. The method comprises the following steps:
step 1: reading in the preprocessed data set, and determining a data range to be used;
step 2: performing feature extraction and feature selection on the electroencephalogram signal data, and extracting features related to emotion;
and step 3: and (3) sending the characteristic data obtained in the step (2) into an AdaBoost classifier for training, and testing the performance of the AdaBoost classifier through the experiment and the comparative experiment.
The implementation of step 1 comprises:
Step 1.1: read the DEAP data set source files downloaded from the official website (the source files are preprocessed and down-sampled to 128 Hz), and import the last 60 s of data of the first 32 channels of the 32 .dat files (removing the first 3 s of baseline signal data, which are irrelevant to emotion);
Step 1.2: store the data part and the labels part (4 emotion dimensions) in a features_raw.csv file and labels0-3.dat files respectively, where a label score ≥ 5.0 is stored as 1 and otherwise as 0, in preparation for subsequent feature extraction, feature selection, and binary classification.
The implementation of step 2 comprises:
Step 2.1: feature extraction. To capture as much of the detail in the EEG signal as possible, several classes of features are extracted simultaneously: time-domain, time-frequency-domain, and nonlinear features are extracted in turn from the last 60 s of data of all 32 channels. Step 2.1 comprises the following steps:
Step 2.1.1: time-domain feature extraction, covering 8 statistical features: mean (Mean), standard deviation (Std), range (Range), skewness (Skewness), kurtosis (Kurtosis), and the three Hjorth parameters Activity, Mobility, and Complexity;
Step 2.1.2: time-frequency-domain feature extraction: based on wavelet packet decomposition with the db4 mother wavelet, 4 emotion-related frequency bands (Theta, Alpha, Beta, Gamma) are extracted, and Wavelet Energy and Wavelet Entropy are computed on each band, for 2 × 4 = 8 features in total;
Step 2.1.3: nonlinear feature extraction, comprising Power Spectral Density (PSD) and Differential Entropy (DE), for 2 features.
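The 8 time-domain features of step 2.1.1 can be sketched with NumPy. The skewness and kurtosis here use the standardized central moments (matching the description's "standard third/fourth-order central moment"); the function names are illustrative:

```python
import numpy as np

def hjorth(x):
    """Hjorth Activity, Mobility, Complexity of a 1-D signal."""
    dx = np.diff(x)                                      # first difference x'
    ddx = np.diff(dx)                                    # second difference x''
    activity = np.var(x)                                 # Activity = Std(x)^2
    mobility = np.sqrt(np.var(dx) / np.var(x))
    complexity = np.sqrt(np.var(ddx) / np.var(dx)) / mobility
    return activity, mobility, complexity

def time_domain_features(x):
    """The 8 time-domain features of step 2.1.1 for one channel."""
    mean, std = np.mean(x), np.std(x)
    rng = np.max(x) - np.min(x)
    skew = np.mean(((x - mean) / std) ** 3)              # standardized 3rd moment
    kurt = np.mean(((x - mean) / std) ** 4)              # standardized 4th moment
    return [mean, std, rng, skew, kurt, *hjorth(x)]
```

Applied per channel and concatenated with the time-frequency and nonlinear features, this yields the 18 values per channel described in step 2.2.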
Step 2.2: feature combination. Step 2.1 extracts 18 feature values for each channel of each subject's video data, so each subject's video file contains 32 × 18 = 576 feature values in total; these are stacked into a (32 × 40) × 576 = 1280 × 576 feature matrix and stored in a train.csv file.
Step 2.3: feature selection. A supervised feature selection method, Linear Discriminant Analysis (LDA), is adopted to reduce the dimensionality of the extracted features, further improving the emotion relevance of the retained features and enhancing the robustness of the classifier.
The implementation of step 3 comprises:
Step 3.1: read the processed feature data from the train.csv file, and randomly divide it into disjoint training and test sets at a ratio of 4 : 1 (80% training set, 20% test set);
Step 3.2: classifier training: feed the training set into the AdaBoost ensemble learning classifier for binary classification training, tuning the classifier's parameters to determine the best parameter combination and strengthen its learning ability;
step 3.3: and (3) testing the performance of the classifier: step 3.3 again comprises the following steps:
Step 3.3.1: feed the test set into the classifier trained in step 3.2, and run binary classification tests on the 4 emotion dimensions (Valence, Arousal, Dominance, Liking) based on 5-Fold Cross Validation;
step 3.3.2: performing comparison experiments by using other classifiers, wherein Random Forest and XGboost are adopted;
Step 3.3.3: for the main and comparison experiments, consider the following 4 performance indices: accuracy, precision, recall, and F1-Score, and plot the Confusion Matrix and result curves to evaluate classifier performance from multiple angles.
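The 5-fold split underlying step 3.3.1 can be sketched without any ML library; `kfold_indices` is a hypothetical helper, not from the patent:

```python
import numpy as np

def kfold_indices(n_samples, k=5, seed=0):
    """Yield (train_idx, test_idx) pairs for k-fold cross validation."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(n_samples)       # shuffle once, then slice into k folds
    folds = np.array_split(idx, k)
    for i in range(k):
        test_idx = folds[i]
        train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
        yield train_idx, test_idx
```

Each fold in turn serves as the held-out test set while the classifier is evaluated on it, and the k scores are averaged.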
The beneficial effects of the invention are as follows. Considering that in existing EEG emotion recognition methods the extracted features are of a single type and hard to reflect emotional feature information comprehensively, and that the classification methods used are mostly confined to traditional machine learning, limiting classifier performance, the invention starts from the two major links of feature extraction and classification. First, in feature extraction, various properties of the EEG signal are considered comprehensively: three classes of features (time-domain, time-frequency-domain, and nonlinear) are extracted per channel of each subject's video file and combined into feature vectors, all feature vectors are further combined into a feature matrix, and a supervised feature selection method then reduces the feature dimensionality and strengthens the correlation between the extracted features and emotion. Second, in classification, the ensemble learning method AdaBoost performs binary EEG classification on 4 emotion dimensions, comparison experiments use Random Forest and XGBoost classifiers, and classifier performance is evaluated from multiple angles via 5-fold cross validation, 4 performance indices, confusion matrices, and result plots. The invention makes a definite advance on the existing problems, and classifier performance is effectively improved.
Description of the drawings:
FIG. 1 is a flowchart of an electroencephalogram signal emotion recognition method based on an ensemble learning method AdaBoost.
Fig. 2 is a diagram of electrode names and electrode locations corresponding to different brain regions.
FIG. 3 is a diagram showing the distribution of positive and negative examples in 4 emotion dimensions in the data Labels used in step 1.
Fig. 4 is a diagram showing the original signal and the decomposed signals A4 and D4-D1, taking channel Fp1 of subject 1 as an example.
Fig. 5 is a schematic diagram of the algorithm idea of the feature selection method LDA.
Fig. 6 is a working schematic diagram of the ensemble learning method AdaBoost.
Fig. 7 is a schematic diagram of a confusion matrix.
FIG. 8 is a classification result confusion matrix for the AdaBoost classifier in 4 emotion dimensions.
FIG. 9 shows the ACC values of the 3 classifiers on the 4 emotion dimensions under 5-fold cross validation.
FIG. 10 shows the F1-Score values of the 3 classifiers on the 4 emotion dimensions under 5-fold cross validation.
The specific implementation mode is as follows:
the technical solution in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention. It is to be understood that the described embodiments are merely exemplary of the invention, and not restrictive of the full scope of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
FIG. 1 is a schematic diagram of a specific process for carrying out the present invention, which mainly comprises the following three steps:
1. Preprocessing:
The invention uses a standard emotion recognition data set, the DEAP data set (Python edition), which collects the physiological signals and corresponding emotion ratings of 32 volunteers. Each volunteer watched 40 music videos evoking different emotions while their physiological signals were recorded to data files s01.dat-s32.dat. 40 channels were recorded in total (the first 32 are EEG channels; the last 8 are peripheral physiological signals including electrooculogram, electromyogram, respiration, etc.) at a sampling frequency of 512 Hz, later reduced to 128 Hz after a series of filtering operations. Each data file comprises the following two matrices:
(1) data matrix: 40 × 40 × 8064, where the first 40 is the total number of videos, the second 40 is the total number of acquisition channels, and 8064 is 63 seconds of experimental data (63 × 128) for any one channel of one video; the first 3 seconds are baseline data obtained before each trial, and the last 60 seconds are data recorded during the trial;
(2) labels matrix: 40 × 4, whose 4 columns are the 4 emotion dimensions: valence (Valence), arousal (Arousal), dominance (Dominance), and liking (Liking), with scores ranging from 1 to 9.
The object of study in the invention is the EEG, so this step selects, from the 32 .dat files, the last 60 s of data of the first 32 channels (discarding the last 8 peripheral channels and the first 3 s of emotion-irrelevant baseline signal data), and stores the data part and the labels part (4 emotion dimensions) of these files in a features_raw.csv file and labels0-3.dat files respectively, where a label score ≥ 5.0 is stored as 1 and otherwise as 0 (the distribution of positive and negative examples over the 4 emotion dimensions in labels is shown in FIG. 3), in preparation for subsequent feature extraction, feature selection, and binary classification.
2. Feature extraction and feature selection:
The EEG signal, as a chaotic time series, contains rich emotional feature information; the key to this link is how to extract emotion-related feature information from it. Related studies show that EEG features mainly comprise time-domain, frequency-domain, time-frequency-domain, and nonlinear-dynamics features, the latter two being more relevant to emotion. Therefore, to better capture the detail in the EEG signal, 3 classes of features are extracted: time-domain, time-frequency-domain, and nonlinear dynamics.
(1) Time-domain features:
① Mean: Mean(x) = (1/N) Σᵢ xᵢ
② Standard deviation: Std(x) = sqrt((1/N) Σᵢ (xᵢ − Mean(x))²)
③ Range: Range(x) = max(x) − min(x)
④ Skewness: Skewness(x) = Mean(((x − Mean(x)) / Std(x))³)
⑤ Kurtosis: Kurtosis(x) = Mean(((x − Mean(x)) / Std(x))⁴)
⑥ Hjorth parameter Activity: Activity(x) = Std(x)²
⑦ Hjorth parameter Mobility: Mobility(x) = sqrt(Activity(x′) / Activity(x)), where x′ is the first difference of x
⑧ Hjorth parameter Complexity: Complexity(x) = Mobility(x′) / Mobility(x)
wherein: skewness (Skewness) is the standard third-order central moment of a sample, and more emphasis is placed on describing the symmetry of overall value distribution; the Kurtosis (Kurtosis) is a standard fourth-order central moment of the sample, and the steepness of the overall all-value distribution form is described more emphatically, so that the data distribution form can be better described by combining the Kurtosis and the steepness; the Hjorth parameter provides a method that can quickly compute three important features of the signal in the time domain: mobility, and Complexity. The method is widely applied to the field of physiological signal processing.
(2) Time-frequency domain characteristics:
Besides the time domain, time-frequency-domain features are another important class of EEG features. Studies show that wavelet packet decomposition offers effective multi-resolution analysis of non-stationary signals, overcoming the limitation of the wavelet transform, which at each level decomposes only the low-frequency sub-band and cannot extract high-frequency sub-band information at the same resolution; among mother wavelets, db4 has good smoothness and detects changes in EEG signals well. The invention therefore adopts a 4-level wavelet packet decomposition based on the db4 mother wavelet. Concretely, the EEG signal is decomposed into 4 orders of detail signals D4-D1 and one approximation signal A4, whose values are the wavelet coefficients of each order, representing (from D1 to D4) the frequency bands Gamma (32-64 Hz), Beta (16-32 Hz), Alpha (8-16 Hz), and Theta (4-8 Hz). Taking channel Fp1 of subject 1 as an example, the original signal and the decomposed signals A4 and D4-D1 are shown in FIG. 4. Then, two features, wavelet energy and wavelet entropy, are extracted from the wavelet coefficients of each band.
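A self-contained sketch of per-band wavelet energy and entropy. For brevity it uses a hand-rolled orthonormal Haar DWT instead of the db4 decomposition the patent specifies (in practice one would use a wavelet library such as PyWavelets), so the band assignment is only illustrative; the entropy here is computed over the normalized coefficient energies within each band, one common convention:

```python
import numpy as np

def haar_dwt(x):
    """One level of the orthonormal Haar DWT: (approximation, detail)."""
    x = x[:len(x) // 2 * 2]                 # truncate to even length
    a = (x[0::2] + x[1::2]) / np.sqrt(2)
    d = (x[0::2] - x[1::2]) / np.sqrt(2)
    return a, d

def wavelet_band_features(x, levels=4):
    """Wavelet energy and entropy per band of a 4-level decomposition.

    Returns {'D1': (energy, entropy), ..., 'A4': (...)}; with fs = 128 Hz the
    detail bands D1..D4 roughly correspond to Gamma, Beta, Alpha, Theta.
    """
    bands, a = {}, np.asarray(x, dtype=float)
    for lvl in range(1, levels + 1):
        a, d = haar_dwt(a)
        bands[f'D{lvl}'] = d
    bands[f'A{levels}'] = a
    out = {}
    for name, coeffs in bands.items():
        energy = float(np.sum(coeffs ** 2))
        p = coeffs ** 2 / energy            # relative energy of each coefficient
        entropy = float(-np.sum(p * np.log(p + 1e-12)))
        out[name] = (energy, entropy)
    return out
```

Because the Haar filter bank is orthonormal, the band energies sum to the total signal energy, which makes the energy features easy to sanity-check.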
(3) Nonlinear dynamics features:
Since the human brain is a typical nonlinear dynamical system, the emotion-related information reflected by the nonlinear-dynamics features of the EEG signal is also very representative. The invention computes the traditional Power Spectral Density (PSD) and Differential Entropy (DE) features on the 60 s of data of each channel, using a short-time Fourier transform with non-overlapping Hamming windows. Differential entropy is the entropy of a continuous random variable, and its formula can be expressed as:

DE(X) = −∫ f(x) log f(x) dx = (1/2) log(2πeσ²)

where x obeys the Gaussian distribution N(μ, σ²) and f(x) is the probability density function of x. Studies show that for a fixed-length EEG sequence on a given frequency band, the differential entropy equals the logarithm of its power spectral density.
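Under the Gaussian assumption above, DE reduces to a one-liner on each band-passed channel; this sketch is illustrative:

```python
import numpy as np

def differential_entropy(x):
    """DE of a (band-passed) signal under the Gaussian assumption:
    h = 0.5 * log(2 * pi * e * sigma^2), with sigma^2 the sample variance."""
    return 0.5 * np.log(2 * np.pi * np.e * np.var(x))
```

For a signal drawn from N(0, σ²) the estimate converges to the closed form as the sample grows.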
In the above steps, 18 feature values are extracted for each channel of each subject's video data, so each subject's video file contains 32 × 18 = 576 feature values in total; these are stacked into a (32 × 40) × 576 = 1280 × 576 feature matrix and stored in a train.csv file.
To reduce the feature dimensionality and further strengthen the emotion relevance of the retained features, Linear Discriminant Analysis (LDA) is used for feature selection. LDA is a supervised dimensionality reduction technique (each sample in the data set carries a class label) whose idea is to project the data onto a low-dimensional space such that, after projection, the points of each class lie as close together as possible while the centers of different classes lie as far apart as possible, as shown in FIG. 5. Since the invention performs a binary EEG classification task, the algorithm is realized as follows:
Let the data set be D = {(x₁, y₁), (x₂, y₂), ..., (x_m, y_m)}, where each sample xᵢ is an n-dimensional vector and yᵢ ∈ {0, 1}. Define Nⱼ as the number of class-j samples, Xⱼ as the set of class-j samples, μⱼ as the mean vector of the class-j samples, and Σⱼ as the covariance matrix of class j (strictly, a scatter matrix lacking the covariance's denominator), where j = 0, 1.

In the binary task we only need to project the data onto a line. If the projection line is the vector w, then for any sample xᵢ its projection onto the line is wᵀxᵢ, and the projections of the two class centers μ₀, μ₁ onto w are wᵀμ₀ and wᵀμ₁. Following the LDA idea of "within-class data as close as possible, between-class data as far apart as possible", define:

Between-class scatter matrix: S_b = (μ₀ − μ₁)(μ₀ − μ₁)ᵀ
Within-class scatter matrix: S_w = Σ₀ + Σ₁

Maximizing the quotient J(w) = (wᵀ S_b w) / (wᵀ S_w w) leads to the generalized eigenvalue problem S_b w = λ S_w w. Note that S_b w is always parallel to μ₀ − μ₁, so we may set S_b w = λ_w (μ₀ − μ₁). Substituting into the eigenvalue equation gives S_b w = (μ₀ − μ₁)(μ₀ − μ₁)ᵀ w = (μ₀ − μ₁) λ_w, and solving yields w = S_w⁻¹ (μ₀ − μ₁).

In summary, the original sample set is projected into the 1-dimensional space spanned by the base vector w, and the projected features form the desired dimension-reduced feature set.
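The closed form w = S_w⁻¹(μ₀ − μ₁) translates directly to NumPy; `lda_direction` is an illustrative helper, not from the patent:

```python
import numpy as np

def lda_direction(X0, X1):
    """Fisher discriminant direction w = Sw^{-1} (mu0 - mu1) for two classes.

    X0, X1: (n_samples, n_features) arrays of class-0 and class-1 samples.
    Returns the unit-norm projection vector.
    """
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    S0 = (X0 - mu0).T @ (X0 - mu0)      # class-0 scatter matrix
    S1 = (X1 - mu1).T @ (X1 - mu1)      # class-1 scatter matrix
    Sw = S0 + S1                        # within-class scatter
    w = np.linalg.solve(Sw, mu0 - mu1)  # solve Sw w = mu0 - mu1
    return w / np.linalg.norm(w)
```

Projecting the samples with `X @ w` gives the 1-dimensional feature described in the text.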
3. Classification:
On the basis of the two previous links, the obtained feature set can be fed to a classifier for training and performance evaluation. First, the read-in feature set is randomly divided into disjoint training and test sets at a ratio of 4 : 1 (80% training set, 20% test set), ensuring no intersection between them; the training set is then fed to a classifier for binary training, here using the ensemble learning method AdaBoost.
AdaBoost is an adaptive boosting algorithm in the field of ensemble learning, applied effectively to binary problems. Its basic principle is iteration: each iteration trains and adds one new weak classifier, until a predetermined, sufficiently small error rate is reached; each training sample carries a weight indicating the probability that it is selected into the training set of the next classifier. If a sample point has been classified accurately, its probability of selection decreases when the next training set is constructed; conversely, if a sample point is misclassified, its weight increases. In this way AdaBoost focuses on the samples that are easy to misclassify, improving the classifier's generalization ability without easily overfitting.
The working principle of AdaBoost is shown in FIG. 6, and the mathematical description is as follows:
(1) Initialize the weight distribution of the training samples:

D₁ = (w₁₁, ..., w₁ᵢ, ..., w₁N),  w₁ᵢ = 1/N,  i = 1, 2, ..., N

where: D₁ is the weight distribution of the samples at the first iteration, w₁₁ is the weight of the first sample at the first iteration, and N is the total number of samples;
(2) Perform M iterations:
① learn the training samples with weight distribution D_m (m = 1, 2, ..., M) to obtain a weak classifier G_m(x): x → {−1, +1}; the performance of the weak classifier is measured by the value ε_m of the error function

ε_m = Σᵢ w_{m,i} I(G_m(xᵢ) ≠ yᵢ);
② compute weak classifier G_m's voting weight α_m, which represents the importance of G_m in the final classifier:

α_m = (1/2) ln((1 − ε_m) / ε_m);
From the above formula, as ε_m decreases, α_m increases; that is, a classifier with smaller error carries greater weight in the final classifier;
③ update the weight distribution of the training samples for the next iteration, increasing the weights of misclassified samples and decreasing those of correctly classified samples:

D_{m+1} = (w_{m+1,1}, w_{m+1,2}, ..., w_{m+1,i}, ..., w_{m+1,N})
w_{m+1,i} = (w_{m,i} / Z_m) · exp(−α_m yᵢ G_m(xᵢ)),  i = 1, 2, ..., N

where: D_{m+1} is the sample weight distribution for the next iteration, w_{m+1,i} is the weight of the i-th sample at the next iteration, Z_m is a normalization factor, yᵢ ∈ {−1, +1} is the class of the i-th sample, and G_m(xᵢ) ∈ {−1, +1} is the weak classifier's prediction on sample xᵢ; if the classification is correct, yᵢ G_m(xᵢ) = 1, otherwise −1;
(3) Combine the weak classifiers into a strong classifier:
① take the weighted sum of all iterated classifiers:

f(x) = Σ_{m=1}^{M} α_m G_m(x)

② apply the sign function to the sum to obtain the final strong classifier G(x):

G(x) = sign(f(x)) = sign(Σ_{m=1}^{M} α_m G_m(x))
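The update rules above can be sketched from scratch, using decision stumps as the weak learners (an assumption for illustration; the patent does not fix the base learner):

```python
import numpy as np

def stump_predict(X, feat, thresh, polarity):
    """Decision stump: +1 on one side of the threshold, -1 on the other."""
    return polarity * np.where(X[:, feat] <= thresh, 1, -1)

def fit_stump(X, y, w):
    """Exhaustively find the stump minimizing the weighted error sum(w[pred != y])."""
    best_params, best_err = None, np.inf
    for feat in range(X.shape[1]):
        for thresh in np.unique(X[:, feat]):
            for polarity in (1, -1):
                pred = stump_predict(X, feat, thresh, polarity)
                err = w[pred != y].sum()
                if err < best_err:
                    best_params, best_err = (feat, thresh, polarity), err
    return best_params, best_err

def adaboost_fit(X, y, n_rounds=30):
    """AdaBoost for labels y in {-1, +1}, mirroring formulas (1)-(3) above."""
    n = len(y)
    w = np.full(n, 1.0 / n)                    # D_1: uniform initial weights
    ensemble = []
    for _ in range(n_rounds):
        params, eps = fit_stump(X, y, w)       # weak classifier G_m and error eps_m
        eps = min(max(eps, 1e-10), 1 - 1e-10)  # numerical guard
        alpha = 0.5 * np.log((1 - eps) / eps)  # voting weight alpha_m
        pred = stump_predict(X, *params)
        w = w * np.exp(-alpha * y * pred)      # up-weight mistakes, down-weight hits
        w = w / w.sum()                        # normalize by Z_m
        ensemble.append((alpha, params))
    return ensemble

def adaboost_predict(ensemble, X):
    """Strong classifier G(x) = sign(sum_m alpha_m G_m(x))."""
    agg = sum(alpha * stump_predict(X, *params) for alpha, params in ensemble)
    return np.sign(agg)
```

Thirty rounds matches the weak-learner count the experiments found best; on real feature data one would of course use a tuned library implementation rather than this sketch.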
In the experiments of the invention, through implementation and parameter tuning in the training stage, the best performance is obtained when the number of weak learner iterations is set to about 30.
After the training stage, the test set is fed into the trained classifier, binary classification tests are performed on the 4 emotion dimensions (Valence, Arousal, Dominance, Liking) based on 5-Fold Cross Validation, and comparison experiments are run with Random Forest and XGBoost classifiers. For the main and comparison experiments, the following 4 performance indices are considered: accuracy, precision, recall, and F1-Score, and the Confusion Matrix and result curves are plotted to evaluate classifier performance from multiple angles.
Confusion Matrix: in a classification task, the predicted result (Predict) and the real result (Real) can combine in several different ways, and the matrix of these combinations is the confusion matrix. In a binary task there are 2 × 2 = 4 combinations, denoted TP, FP, FN, and TN, giving the confusion matrix shown in FIG. 7.
True example (TP): the prediction is positive example, and the result is also positive example;
false Positive (FP): the prediction is positive case, but the result is negative case;
pseudo counter example (FN): the prediction is negative, but the result is positive;
true Negative (TN): the prediction is counterexample, and the result is counterexample.
The performance indices are computed from these counts:

Accuracy = (TP + TN) / (TP + TN + FP + FN)
Precision = TP / (TP + FP)
Recall = TP / (TP + FN)
F1-Score = 2 · Precision · Recall / (Precision + Recall)
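These four indices follow directly from the TP/FP/FN/TN counts; a minimal sketch (with zero-denominator guards added as an assumption):

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall, F1 from confusion-matrix counts (labels in {0, 1})."""
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    tn = int(np.sum((y_pred == 0) & (y_true == 0)))
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return {'accuracy': accuracy, 'precision': precision, 'recall': recall, 'f1': f1}
```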
The larger these index values, the better the classifier's performance. The final results are shown in FIGS. 8 to 10. From the analysis and result plots, the method of the invention performs best and, compared with traditional feature extraction and classification methods, significantly improves emotion recognition accuracy.
Finally, it should be understood that parts of the specification not set forth in detail are well within the prior art.
While the invention has been described with reference to specific embodiments and procedures, it will be understood by those skilled in the art that the invention is not limited thereto, and that various changes and substitutions may be made without departing from the spirit of the invention. The scope of the invention is only limited by the appended claims.
The embodiments of the invention described herein are exemplary only and should not be taken as limiting the invention, which is described by reference to the accompanying drawings.
Claims (3)
1. An electroencephalogram signal preprocessing method, comprising the following steps:
Step 1: read the DEAP data set source files downloaded from the official website (the source files are preprocessed and down-sampled to 128 Hz), and import the last 60 s of data of the first 32 channels of the 32 .dat files (removing the first 3 s of baseline signal data, which are irrelevant to emotion);
Step 2: store the data part and the labels part in a features_raw.csv file and labels0-3.dat files respectively, where a label score ≥ 5.0 is stored as 1 and otherwise as 0, in preparation for subsequent feature extraction, feature selection, and classification.
2. A method of feature extraction and feature selection based on claim 1, comprising the steps of:
step 1: feature extraction: in order to capture the detail information in the electroencephalogram signal as much as possible, a method of simultaneously extracting a plurality of class features is adopted. And sequentially extracting time domain, frequency domain and nonlinear characteristics from the last 60s data of all 32 channels. The step 1 comprises the following steps:
step 1.1: extracting time domain features, wherein the time domain features comprise 8 statistical features of a Mean (Mean), a standard deviation (Std), a Range (Range), a Skewness (Skewness), a Kurtosis (Kurtosis), three Hjorth index activities, Mobility and Complexity;
step 1.2: extracting time-frequency domain features, namely extracting 4 different frequency bands (Theta, Alpha, Beta and Gamma) related to emotion based on a Wavelet packet decomposition method of db4 mother function, and sequentially extracting Wavelet Energy (Wavelet Energy) and Wavelet Entropy (Wavelet Entropy) on the frequency bands, wherein 2 x 4 is 8 features in total;
step 1.3: and nonlinear feature extraction, including Power Spectral Density (PSD) and Differential Entropy (DE), for 2 features.
Step 2: combining the characteristics: step 1, extracting 18 characteristic values from each channel in each tested video data, so that each tested video file contains 32 × 18 — 576 characteristic values in total, splicing the characteristic values into a characteristic value matrix of (32 × 40) × 576 — 1280 ═ 576, and storing the characteristic value matrix in a train.
Step 3: feature selection: a supervised Linear Discriminant Analysis (LDA) feature-selection method is adopted to reduce the dimensionality of the features extracted in the previous step, further improving the relevance of the selected features and enhancing the robustness of the classifier.
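For the two-class case, supervised LDA reduction comes down to projecting onto the Fisher discriminant direction. The numpy sketch below is a minimal stand-in for e.g. `sklearn.discriminant_analysis.LinearDiscriminantAnalysis`; the small ridge term is an assumption added for numerical stability:

```python
import numpy as np

def lda_direction(X, y):
    """Two-class Fisher LDA: w ∝ Sw^{-1} (mu1 - mu0).
    Projecting the 576-dim features onto w gives the supervised
    one-dimensional representation."""
    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    Sw = np.cov(X0, rowvar=False) + np.cov(X1, rowvar=False)  # within-class scatter
    w = np.linalg.solve(Sw + 1e-6 * np.eye(X.shape[1]), mu1 - mu0)
    return w / np.linalg.norm(w)

# usage: X_reduced = X @ lda_direction(X, y)
```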
3. An electroencephalogram signal classification method based on the ensemble learning method AdaBoost according to claim 2, comprising the following steps:
Step 1: read the processed feature data from the train.csv file, and randomly split it into independent training and test sets at a 4:1 ratio (80% training set, 20% test set);
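The 4:1 split can be sketched as a seeded random permutation (a minimal stand-in for `sklearn.model_selection.train_test_split`):

```python
import numpy as np

def train_test_split_80_20(X, y, seed=0):
    """Random 4:1 split: 80% training rows, 20% test rows."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(len(X))
    cut = int(0.8 * len(X))
    tr, te = idx[:cut], idx[cut:]
    return X[tr], X[te], y[tr], y[te]
```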
Step 2: train the classifier: feed the training-set data into the AdaBoost ensemble-learning classifier for binary classification training, continually adjusting the classifier parameters to determine the optimal parameter combination and strengthen the classifier's learning ability;
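A generic AdaBoost.M1 with decision stumps can be sketched as below (a from-scratch illustration, not the patent's implementation; in practice one would use e.g. `sklearn.ensemble.AdaBoostClassifier` and tune `n_estimators` and the learning rate). Labels here are in {-1, +1}:

```python
import numpy as np

class Stump:
    """Depth-1 decision tree, the usual AdaBoost base learner."""
    def fit(self, X, y, w):                       # y in {-1, +1}, w sums to 1
        best = (np.inf, 0, 0.0, 1)
        for j in range(X.shape[1]):
            for thr in np.unique(X[:, j]):
                for sign in (1, -1):
                    pred = sign * np.where(X[:, j] > thr, 1, -1)
                    err = w[pred != y].sum()      # weighted error
                    if err < best[0]:
                        best = (err, j, thr, sign)
        self.err, self.j, self.thr, self.sign = best
        return self
    def predict(self, X):
        return self.sign * np.where(X[:, self.j] > self.thr, 1, -1)

def adaboost_fit(X, y, n_rounds=10):
    """Each round reweights samples toward those the previous stump got
    wrong; the stump's vote alpha comes from its weighted error."""
    w = np.full(len(y), 1.0 / len(y))
    ensemble = []
    for _ in range(n_rounds):
        stump = Stump().fit(X, y, w)
        err = max(stump.err, 1e-10)
        alpha = 0.5 * np.log((1 - err) / err)
        w *= np.exp(-alpha * y * stump.predict(X))
        w /= w.sum()
        ensemble.append((alpha, stump))
    return ensemble

def adaboost_predict(ensemble, X):
    return np.sign(sum(a * s.predict(X) for a, s in ensemble))
```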
and step 3: and (3) testing the performance of the classifier: the step 3 comprises the following steps:
Step 3.1: feed the test-set data into the classifier trained in step 2 and run binary classification tests on 4 emotion dimensions (Valence, Arousal, Dominance, Liking) based on 5-Fold Cross Validation;
Step 3.2: run comparison experiments with other classifiers, namely Random Forest and XGBoost;
Step 3.3: for the main experiment and the comparison experiments, evaluate the following 4 performance metrics: accuracy, precision, recall and F1-score, and plot the Confusion Matrix and results to evaluate classifier performance from multiple angles.
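The evaluation of steps 3.1 and 3.3 can be sketched as below: the four metrics follow directly from the 2 × 2 confusion matrix (labels in {0, 1}), and the 5-fold splits are just a shuffled partition of the row indices:

```python
import numpy as np

def binary_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 from the 2x2 confusion matrix."""
    tp = np.sum((y_true == 1) & (y_pred == 1))
    tn = np.sum((y_true == 0) & (y_pred == 0))
    fp = np.sum((y_true == 0) & (y_pred == 1))
    fn = np.sum((y_true == 1) & (y_pred == 0))
    acc = (tp + tn) / len(y_true)
    prec = tp / (tp + fp)
    rec = tp / (tp + fn)
    f1 = 2 * prec * rec / (prec + rec)
    return {"accuracy": acc, "precision": prec, "recall": rec, "f1": f1,
            "confusion_matrix": np.array([[tn, fp], [fn, tp]])}

def kfold_indices(n, k=5, seed=0):
    """Shuffled index partition for k-fold cross validation (step 3.1)."""
    idx = np.random.default_rng(seed).permutation(n)
    return np.array_split(idx, k)
```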
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010977310.4A CN112200016A (en) | 2020-09-17 | 2020-09-17 | Electroencephalogram signal emotion recognition based on ensemble learning method AdaBoost |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112200016A true CN112200016A (en) | 2021-01-08 |
Family
ID=74015281
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010977310.4A Pending CN112200016A (en) | 2020-09-17 | 2020-09-17 | Electroencephalogram signal emotion recognition based on ensemble learning method AdaBoost |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112200016A (en) |
Patent Citations (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106886792A (en) * | 2017-01-22 | 2017-06-23 | 北京工业大学 | A kind of brain electricity emotion identification method that Multiple Classifiers Combination Model Based is built based on layering |
US20190347476A1 (en) * | 2018-05-09 | 2019-11-14 | Korea Advanced Institute Of Science And Technology | Method for estimating human emotions using deep psychological affect network and system therefor |
CN109498041A (en) * | 2019-01-15 | 2019-03-22 | 吉林大学 | Driver road anger state identification method based on brain electricity and pulse information |
CN110135285A (en) * | 2019-04-26 | 2019-08-16 | 中国人民解放军战略支援部队信息工程大学 | It is a kind of to use the brain electrical silence state identity identifying method and device of singly leading equipment |
CN110414548A (en) * | 2019-06-06 | 2019-11-05 | 西安电子科技大学 | The level Bagging method of sentiment analysis is carried out based on EEG signals |
CN110610168A (en) * | 2019-09-20 | 2019-12-24 | 合肥工业大学 | Electroencephalogram emotion recognition method based on attention mechanism |
Non-Patent Citations (4)
Title |
---|
HARIKUMAR RAJAGURU等: "Analysis of adaboost classifier from compressed EEG features for epilepsy detection", 《2017 INTERNATIONAL CONFERENCE ON COMPUTING METHODOLOGIES AND COMMUNICATION (ICCMC)》 * |
QIAO XIE等: "Electroencephalogram Emotion Recognition Based on A Stacking Classification Model", 《2018 37TH CHINESE CONTROL CONFERENCE (CCC)》 * |
王永宗: "Research on EEG feature combination and channel optimization selection for emotion recognition", China Master's Theses Full-text Database, Information Science and Technology Series * |
郭金良: "EEG emotion recognition based on sparse group lasso-Granger causality features", China Master's Theses Full-text Database, Information Science and Technology Series * |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113180659A (en) * | 2021-01-11 | 2021-07-30 | 华东理工大学 | Electroencephalogram emotion recognition system based on three-dimensional features and cavity full convolution network |
CN113180659B (en) * | 2021-01-11 | 2024-03-08 | 华东理工大学 | Electroencephalogram emotion recognition method based on three-dimensional feature and cavity full convolution network |
CN112836593A (en) * | 2021-01-15 | 2021-05-25 | 西北大学 | Emotion recognition method and system fusing prior and automatic electroencephalogram characteristics |
CN112836593B (en) * | 2021-01-15 | 2023-06-20 | 西北大学 | Emotion recognition method and system integrating priori and automatic electroencephalogram features |
CN112883855A (en) * | 2021-02-04 | 2021-06-01 | 东北林业大学 | Electroencephalogram signal emotion recognition based on CNN + data enhancement algorithm Borderline-SMOTE |
CN113191232A (en) * | 2021-04-21 | 2021-07-30 | 西安交通大学 | Electro-hydrostatic actuator fault identification method based on multi-mode homologous features and XGboost model |
CN114027840A (en) * | 2021-11-12 | 2022-02-11 | 江苏科技大学 | Emotional electroencephalogram recognition method based on variational modal decomposition |
CN114202524A (en) * | 2021-12-10 | 2022-03-18 | 中国人民解放军陆军特色医学中心 | Performance evaluation method and system of multi-modal medical image |
CN116028882A (en) * | 2023-03-29 | 2023-04-28 | 深圳市傲天科技股份有限公司 | User labeling and classifying method, device, equipment and storage medium |
CN116028882B (en) * | 2023-03-29 | 2023-06-02 | 深圳市傲天科技股份有限公司 | User labeling and classifying method, device, equipment and storage medium |
CN116211322A (en) * | 2023-03-31 | 2023-06-06 | 上海外国语大学 | Depression recognition method and system based on machine learning electroencephalogram signals |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN112200016A (en) | Electroencephalogram signal emotion recognition based on ensemble learning method AdaBoost | |
Golmohammadi et al. | Automatic analysis of EEGs using big data and hybrid deep learning architectures | |
Hussein et al. | Epileptic seizure detection: A deep learning approach | |
CN111134666A (en) | Emotion recognition method of multi-channel electroencephalogram data and electronic device | |
Travieso et al. | Detection of different voice diseases based on the nonlinear characterization of speech signals | |
CN110472649B (en) | Electroencephalogram emotion classification method and system based on multi-scale analysis and integrated tree model | |
Hemmerling et al. | Voice data mining for laryngeal pathology assessment | |
Hariharan et al. | Objective evaluation of speech dysfluencies using wavelet packet transform with sample entropy | |
Diykh et al. | Texture analysis based graph approach for automatic detection of neonatal seizure from multi-channel EEG signals | |
CN115770044B (en) | Emotion recognition method and device based on electroencephalogram phase amplitude coupling network | |
CN112364697A (en) | Electroencephalogram emotion recognition method based on R-LSTM model | |
Ge et al. | Applicability of hyperdimensional computing to seizure detection | |
Khare et al. | Multiclass sleep stage classification using artificial intelligence based time-frequency distribution and CNN | |
Hariharan et al. | A hybrid expert system approach for telemonitoring of vocal fold pathology | |
Yao et al. | A cnn-transformer deep learning model for real-time sleep stage classification in an energy-constrained wireless device | |
Kumar et al. | Comparison of Machine learning models for Parkinson’s Disease prediction | |
Sharan | Cough sound detection from raw waveform using SincNet and bidirectional GRU | |
Xie et al. | Multi-view features fusion for birdsong classification | |
CN114091529A (en) | Electroencephalogram emotion recognition method based on generation countermeasure network data enhancement | |
Ghoraani et al. | Discriminant non-stationary signal features’ clustering using hard and fuzzy cluster labeling | |
Boualoulou et al. | CNN and LSTM for the classification of parkinson's disease based on the GTCC and MFCC | |
CN114742107A (en) | Method for identifying perception signal in information service and related equipment | |
Prawira et al. | Emotion classification using fast fourier transform and recurrent neural networks | |
US20220180129A1 (en) | Fcn-based multivariate time series data classification method and device | |
Mishra et al. | Improvement of emotion classification performance using multi-resolution variational mode decomposition method |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
WD01 | Invention patent application deemed withdrawn after publication |
Application publication date: 20210108 |