CN113180659A - Electroencephalogram emotion recognition system based on three-dimensional features and cavity full convolution network - Google Patents
Electroencephalogram emotion recognition system based on three-dimensional features and cavity full convolution network
- Publication number: CN113180659A (application CN202110034341.0A)
- Authority: CN (China)
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- A61B5/16 — Devices for psychotechnics; testing reaction times; devices for evaluating the psychological state
- A61B5/165 — Evaluating the state of mind, e.g. depression, anxiety
- A61B5/7203 — Signal processing specially adapted for physiological signals, for noise prevention, reduction or removal
- A61B5/7225 — Details of analog processing, e.g. isolation amplifier, gain or sensitivity adjustment, filtering, baseline or drift compensation
- A61B5/7235 — Details of waveform analysis
- A61B5/7264 — Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
- A61B5/7267 — Classification of physiological signals or data involving training the classification device
Abstract
An electroencephalogram (EEG) emotion recognition system based on three-dimensional features and a dilated fully convolutional network comprises the following steps: first, the EEG signal is preprocessed, decomposed into four different frequency bands, and divided into frames, and the EEG emotion features of each electrode channel on each frequency band are extracted from every frame. The features are then rearranged according to the electrode positions used during EEG acquisition and concatenated by frequency band to construct a three-dimensional EEG feature array; this array is fed into a dilated fully convolutional network for training with a spectral norm regularization term, and a softmax classifier finally performs the emotion classification. Because the emotional activity of the brain involves information exchange and interaction between different electrode channels and frequency bands, the three-dimensional feature representation of the invention captures inter-channel and inter-band information, and on this basis the dilated fully convolutional network can mine deeper features that benefit emotion classification, further improving the accuracy of EEG emotion recognition.
Description
Technical Field
The invention relates to the technical field of EEG emotion recognition, and in particular to an EEG emotion recognition system that classifies emotion from EEG signals based on three-dimensional features and a dilated ("cavity", i.e. atrous) fully convolutional network.
Background
Emotions play an irreplaceable role in our daily lives and communication. Especially when we interact with a machine, we want the machine to understand what emotion we are expressing; emotion is therefore considered an important factor in building friendlier, more natural human-machine interaction (HMI). The rapid development of artificial intelligence has made automatic emotion recognition possible. For emotion recognition, emotion must first be defined and used quantitatively. Psychologists typically describe emotion with two kinds of models: a discrete model containing six basic emotions (happiness, sadness, fear, disgust, anger, surprise) as well as mixed emotions, and a dimensional model represented by valence and arousal dimensions. Electroencephalogram (EEG) signals, recorded from the scalp over the cerebral cortex, reflect emotional stimuli in real time and thus provide a more comprehensive approach to emotion recognition.
Applying EEG signals to emotion recognition involves emotion induction, EEG acquisition, EEG preprocessing, feature extraction and selection, and emotion classification. Among these, extracting and selecting valid features and the final emotion classification are the two most important steps. Researchers focus on five EEG frequency bands: Delta (about 1-3 Hz), Theta (about 4-7 Hz), Alpha (about 8-12 Hz), Beta (about 13-30 Hz), and Gamma (about 31-100 Hz). On each band, the EEG features can be time-domain features (mean, standard deviation, etc.), frequency-domain features (band power, power spectral density, etc.), or time-frequency features (differential entropy, wavelet transform, etc.). Time-domain features mainly consider the temporal characteristics of the EEG signal, frequency-domain features capture EEG information from the frequency perspective, and time-frequency features extract EEG information along both the time and frequency dimensions. A large number of emotion recognition models have been built on such EEG features. Commonly used emotion classification models fall into two categories: traditional machine learning models (e.g. SVM, decision tree, random forest), which require manual feature extraction and optimization, and end-to-end deep learning models (e.g. CNN, RNN, LSTM), which need no hand-crafted features and automatically extract emotion-related features for emotion recognition.
Since the emotional activity of the brain involves information exchange and interaction between different electrode channels and frequency bands, it is important to exploit this prior knowledge properly. A three-dimensional feature representation can capture information among different electrode channels and frequency bands, and on that basis a dilated fully convolutional network can further mine deep features that benefit emotion classification. If the characteristics of both are combined so that the EEG feature information is fully utilized, the accuracy of EEG emotion recognition can be further improved.
Disclosure of Invention
The invention aims to provide a more effective EEG emotion recognition system that further improves the accuracy of emotion recognition. The three-dimensional feature representation makes good use of the information in the EEG signal, and the dilated fully convolutional network can further mine deep features that benefit emotion classification; how to combine the characteristics of the two to analyze the EEG features well is the difficulty the invention addresses. In view of this difficulty, the invention provides an EEG emotion recognition system based on a three-dimensional feature representation and a dilated fully convolutional network. The three-dimensional feature representation makes better use of the complementary information among electrode channels, frequency bands, and different feature activation patterns of the EEG signal; the dilated fully convolutional network extracts deeper EEG feature information, thereby improving EEG emotion recognition performance.
1. An electroencephalogram emotion recognition system based on three-dimensional features and a dilated fully convolutional network, characterized by comprising the following steps:
S1, preprocessing the collected EEG signal samples and decomposing the preprocessed EEG signals into four different frequency bands;
S2, dividing the band-decomposed EEG signals obtained in step S1 into frames, extracting the EEG emotion features of each electrode channel on each frequency band from every frame, and removing the baseline features;
S3, rearranging the EEG emotion features obtained in step S2 according to the electrode positions used during EEG acquisition, and concatenating them by frequency band to obtain a two-dimensional representation of each feature;
S4, stacking the two-dimensional representations of the different features obtained in step S3 to construct a three-dimensional EEG feature array;
S5, feeding the three-dimensional EEG feature array obtained in step S4 into a dilated fully convolutional network for training with a spectral norm regularization term, and finally performing emotion classification with a softmax classifier.
2. The electroencephalogram emotion recognition system based on three-dimensional features and a dilated fully convolutional network according to claim 1, characterized in that: in S1, the collected EEG signal samples come from the multimodal datasets DEAP and DREAMER; the samples are labeled on four continuous emotion dimensions: high/low valence, high/low arousal, high/low dominance, and high/low liking; the raw EEG covers five frequency ranges: Delta, Theta, Alpha, Beta, and Gamma; preprocessing removes ocular artifacts and applies a 4.0-45 Hz band-pass filter; the sampling frequency of the preprocessed EEG signal is 128 Hz; a Butterworth filter then decomposes the preprocessed EEG signal into the four different frequency bands.
3. The electroencephalogram emotion recognition system based on three-dimensional features and a dilated fully convolutional network according to claim 1, characterized in that: in S2, the EEG emotion features extracted from every frame for each electrode channel on each frequency band comprise the time-domain feature Kurtosis (K), the frequency-domain feature Power (P), and the time-frequency feature Differential Entropy (DE). The time-domain feature K can be defined as

K = μ_4 / σ^4,

where μ is the mean, σ is the standard deviation, E denotes the averaging operation, and μ_4 = E[(X − μ)^4] is the fourth-order central moment.

The frequency-domain feature P can be defined as

P = (1/N) Σ_{n=1}^{N} x_n²,

where N is the length of one frame of the EEG signal.

The time-frequency feature DE can be defined as

DE = −∫ f(x) log f(x) dx,

where f(x) is the probability density function. When the sample follows a Gaussian distribution N(μ, σ²), the corresponding probability density function can be substituted into the formula above, which simplifies to

DE = ½ log(2πeσ²),

where e is Euler's number and σ is the standard deviation of the EEG signal.

The removal of the baseline features can be defined as

finalFeatures_t = trialFeatures_t − baseFeatures_1, t = 2, …, m,

where s denotes the number of stimulus segments, t denotes the t-th frame of the EEG signal, and baseFeatures_1 denotes the baseline signal features of the first frame.
4. The electroencephalogram emotion recognition system based on three-dimensional features and a dilated fully convolutional network according to claim 1, characterized in that: in S3, the EEG emotion features of each electrode channel on the different frequency bands are rearranged according to the electrode positions used during EEG acquisition and are then concatenated by frequency band. Taking the K features as an example, their two-dimensional representation can be defined as

K'_t = [K_t^{Theta}, K_t^{Alpha}; K_t^{Beta}, K_t^{Gamma}] ∈ R^{2h×2w},

where each K_t^{band} is the h × w electrode grid of K features on one frequency band; likewise, P'_t denotes the two-dimensional representation of the P features and D'_t that of the DE features.
5. The electroencephalogram emotion recognition system based on three-dimensional features and a dilated fully convolutional network according to claim 1, characterized in that: in S4, the two-dimensional representations K'_t, P'_t, D'_t of the different features are stacked, and the dimension of the resulting three-dimensional EEG feature array is 2h × 2w × 3.
6. The electroencephalogram emotion recognition system based on three-dimensional features and a dilated fully convolutional network according to claim 1, characterized in that: the spectral norm regularization term of S5 is defined via the maximum singular value of a matrix,

σ(A) = max_{x ≠ 0} ‖Ax‖₂ / ‖x‖₂.

To constrain each weight matrix W_l in the dilated fully convolutional network, the corresponding optimization can be defined as

min_W L(W) + (λ/2) Σ_l σ(W_l)²,

where L(W) is the original loss, λ is the regularization factor, and the second term is the spectral norm regularization term; penalizing the sum of the per-layer spectral norms constrains the spectral norm of the entire dilated fully convolutional network.
The beneficial effects of the invention are as follows: the EEG emotion recognition system based on three-dimensional features and a dilated fully convolutional network uses the three-dimensional feature representation to further mine the complementary information among electrode channels, frequency bands, and different feature activation patterns of the EEG signal. The dilated fully convolutional network uses only convolutional layers, with no pooling layers, to avoid losing EEG information; dilated convolutions are incorporated into the network to enlarge the receptive field, and a spectral norm regularization term is introduced into the loss function during training so that the network generalizes better. Compared with other methods that focus only on the feature level or only on the network model, the proposed method is therefore more robust and improves emotion recognition performance.
Drawings
FIG. 1 is a flow chart of the overall framework of the present invention;
FIG. 2 is a flow chart of three-dimensional feature representation in the present invention;
FIG. 3 is a structural diagram of the dilated fully convolutional network in the present invention.
Detailed Description
The invention is described in detail below with reference to the figures and to specific embodiments. The method of the invention comprises four parts.
The first part: electroencephalogram data preprocessing
EEG signals are susceptible to noise and artifacts during acquisition, so the acquired signals typically require preprocessing to extract the useful signal buried in extraneous noise. The raw EEG data are therefore downsampled to 128 Hz, EOG artifacts are removed, and a band-pass filter retains the 4.0-45 Hz EEG signal. On this basis, a Butterworth filter decomposes the EEG data from the c electrode channels into four frequency bands: Theta, Alpha, Beta, and Gamma. Next, the EEG signal of each trial for each subject is divided into m frames, the first being the baseline frame and the remaining m − 1 being trial frames. Each frame is n seconds long, with no overlap between adjacent frames. The t-th trial frame is denoted X_t; each subject who experienced s stimuli thus has s × (m − 1) trial frames, i.e. the frame index runs from 1 to s × (m − 1). After these operations, frame-level features can be obtained from each channel for subsequent EEG emotion recognition.
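As a rough illustration of the framing and band-decomposition steps (not the patent's code), the sketch below uses the 128 Hz rate and a 63-second trial consistent with the DEAP setup described later; the band edges are the approximate values from the background section, and an ideal FFT mask stands in for the Butterworth filter to keep the example dependency-free.

```python
import numpy as np

FS = 128  # sampling rate after preprocessing (Hz)
BANDS = {"theta": (4, 8), "alpha": (8, 13), "beta": (13, 31), "gamma": (31, 45)}

def band_decompose(x, fs=FS):
    """Split a 1-D EEG channel into the four bands via FFT masking.

    The patent uses a Butterworth filter; an ideal FFT mask is a simpler
    stand-in used here for illustration only."""
    spec = np.fft.rfft(x)
    freqs = np.fft.rfftfreq(len(x), d=1.0 / fs)
    out = {}
    for name, (lo, hi) in BANDS.items():
        mask = (freqs >= lo) & (freqs < hi)
        out[name] = np.fft.irfft(spec * mask, n=len(x))
    return out

def split_frames(x, fs=FS, frame_sec=1):
    """Divide a trial into non-overlapping n-second frames.

    Frame 0 is the baseline frame, frames 1..m-1 are the trial frames."""
    n = fs * frame_sec
    m = len(x) // n
    return x[: m * n].reshape(m, n)

trial = np.random.default_rng(0).standard_normal(63 * FS)  # one 63-second trial
frames = split_frames(trial)        # 63 frames: 1 baseline + 62 trial frames
bands = band_decompose(frames[1])   # four band-limited versions of one frame
```

With a 1-second frame length this yields the s × (m − 1) trial frames described above for each stimulus.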
The second part: multi-feature acquisition and representation
For multi-feature acquisition, features reflecting specific emotions under different activation patterns are extracted. For the representation, these features are rearranged by electrode placement to obtain more prior inter-channel and inter-band information. To capture emotional activity across different activation patterns, a feature-array construction is adopted that makes full use of the complementary information between them.
(1) Feature extraction: any one of the trial frames X mentioned above is in fact a set of EEG signals from the different electrode channels on the four frequency bands, so each trial frame can be regarded as one sample. Specifically, X = {S_ij}, where S_ij is the EEG signal of the j-th electrode channel on the i-th frequency band, i indexes the 4 EEG frequency bands, and j indexes the c electrode channels. The proposed method extracts features of different kinds: the kurtosis feature in the time domain, the band power feature in the frequency domain, and the differential entropy feature in the time-frequency domain.
The kurtosis feature (K) in the time domain is a very simple yet effective feature for EEG emotion recognition. Intuitively, K reflects the sharpness of the signal peak. For one sample X, K_(X) is the ratio of the fourth central moment to the square of the variance:

K_(X) = μ_4 / σ^4,

where μ is the mean, σ is the standard deviation, E denotes the averaging operation, and μ_4 = E[(X − μ)^4] is the fourth-order central moment.
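A minimal numerical check of this definition (illustrative only, not the patent's code):

```python
import numpy as np

def kurtosis_feature(x):
    """K(X) = mu_4 / sigma^4: fourth central moment over squared variance."""
    mu = x.mean()
    sigma = x.std()                 # population standard deviation
    mu4 = np.mean((x - mu) ** 4)
    return mu4 / sigma ** 4

# A +/-1 square wave has mu = 0, sigma = 1, mu_4 = 1, so K = 1 (flat, no peak).
square = np.array([1.0, -1.0] * 64)
k_square = kurtosis_feature(square)   # == 1.0
```

For comparison, a Gaussian signal has K close to 3, so K discriminates between flat and peaked signal distributions.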
The power feature (P) in the frequency domain carries a large amount of rhythmic band information and remains widely used in EEG emotion recognition. For a sample X, the power of the j-th electrode channel on a particular frequency band i can be calculated as

P = (1/N) Σ_{n=1}^{N} x_n²,

where N is the length of one frame of the EEG signal.
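The band power formula amounts to the mean squared amplitude of one band-limited frame; a one-line sketch (illustrative only):

```python
import numpy as np

def power_feature(x):
    """Band power of one frame: P = (1/N) * sum(x_n^2)."""
    return np.mean(x ** 2)

# A constant-amplitude signal of value 2 has power 2^2 = 4.
p = power_feature(np.full(128, 2.0))   # == 4.0
```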
The DE feature (D) in the time-frequency domain is derived from Shannon entropy and can be regarded as its continuous form, which depends only on the probability density function:

DE(X) = −∫ f(x) log f(x) dx,

where f(x) is the probability density function of sample X. If X obeys a Gaussian distribution N(μ, σ²), the corresponding probability density function can be substituted into the formula above for further simplification, giving

DE(X) = ½ log(2πeσ²),

where e is Euler's number and σ is the standard deviation of the EEG signal.
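Under the Gaussian assumption the closed form above needs only the frame variance; a small sketch with a sanity check (illustrative only):

```python
import math
import numpy as np

def differential_entropy(x):
    """DE of a frame under the Gaussian assumption:
    DE = 0.5 * log(2 * pi * e * sigma^2)."""
    sigma2 = np.var(x)
    return 0.5 * math.log(2 * math.pi * math.e * sigma2)

# A +/-1 square wave has variance exactly 1, so DE = 0.5 * log(2*pi*e).
square = np.array([1.0, -1.0] * 64)
de = differential_entropy(square)
```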
(2) Baseline correction: to further suppress artifacts in the preprocessed EEG signal, the difference between the trial features and the baseline features is taken as the final classification feature. For each stimulus, the difference operation can be defined as

finalFeatures_t = trialFeatures_t − baseFeatures_1, t = 2, …, m,

where s denotes the number of stimuli and t denotes the t-th trial frame within one stimulus, with m − 1 trial frames in total; finalFeatures = {finalFeatures_2, …, finalFeatures_m}, trialFeatures = {trialFeatures_2, …, trialFeatures_m}, and baseFeatures_1 denotes the baseline signal features of the first frame of the stimulus.
The obtained features may differ widely in scale, and features with larger values can dominate the classifier so that it cannot learn properly from the others. To avoid this, each feature is normalized:

norm_(X) = (finalFeatures_(X) − μ_X) / σ_X,

where norm_(X) denotes the normalized feature of sample X, finalFeatures_(X) denotes the final feature used for classification, and μ_X and σ_X denote the feature mean and standard deviation of sample X.
The third part: cross-band feature rearrangement and construction
Because the emotional activity of the brain involves information exchange and interaction between different electrode channels and frequency bands, the one-dimensional feature sequences K_t, P_t, D_t obtained for the t-th frame are reordered to mine inter-channel and inter-band information. In addition, the three features express EEG information under different activation patterns, and constructing a feature array makes it possible to explore the complementarity and correlation among them. The method in this section converts each acquired one-dimensional EEG feature sequence into a two-dimensional grid (h × w) according to the electrode positions, where h and w are the numbers of electrodes used vertically and horizontally, respectively. The three-dimensional feature representation, shown in FIG. 2, involves three main steps: (a) obtain frame-level features with the feature extraction method; (b) map the features obtained on the different frequency bands into two-dimensional electrode grids, which takes the additional inter-channel and inter-band information into account; (c) assemble the multiple features into a three-dimensional EEG feature array.
The basic steps of the three-dimensional feature representation algorithm (K_t, P_t, D_t) are as follows:
1: convert the obtained one-dimensional EEG feature sequence into a two-dimensional electrode grid;
2: for each feature f_t among (K_t, P_t, D_t), enter the loop;
10: construct the three-dimensional EEG feature array from the two-dimensional electrode grids;
11: stitch the three obtained grids K'_t, P'_t, D'_t along the third dimension, Concatenate((K'_t, P'_t, D'_t), axis=3), with size 18 × 18 × 3;
12: after this step, return the three-dimensional EEG feature array of the t-th frame.
The fourth part: dilated fully convolutional network
The dilated fully convolutional network contains only convolutional layers; it uses dilated convolution kernels to systematically aggregate inter-channel and inter-band information instead of downsampling the feature maps, since pooling inevitably discards EEG feature information. The proposed network has five convolutional layers and one fully connected layer, as shown in FIG. 3. Specifically, the first layer has 64 convolution kernels of size 1 × 1 to promote interaction and integration of EEG feature information across bands and electrode channels, followed by a rectified linear unit (ReLU) activation for the nonlinear transformation. The next three layers use 64, 128, and 256 convolution kernels of size 4 × 4, respectively, which convolve the feature maps from the previous layer; dilated convolution is used here to enlarge the receptive field so that a wider range of EEG feature information is covered. After these convolutions, a scaled exponential linear unit (SELU) activation is used to prevent vanishing and exploding gradients. Notably, the last convolutional layer uses kernels of size 1 × 1, which combine, at every pixel position, the values across the different feature maps and reduce the feature dimension from 256 to 64; this further improves the representational capability of the network. After the convolutional layers, a fully connected layer with 1024 neurons maps all 64 feature maps into a one-dimensional vector, followed by a dropout layer with rate 0.5. The output is converted into a probability distribution by the softmax activation function, with the output dimension set to 2 or 4 depending on the emotion categories.
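The key property used above is that a k × k kernel with dilation d covers an effective window of (k − 1)·d + 1 pixels per side, enlarging the receptive field without pooling. A minimal dependency-free sketch of a single dilated convolution (not the patent's network):

```python
import numpy as np

def dilated_conv2d(img, kernel, dilation=1):
    """Valid-mode 2-D convolution with a dilated (atrous) kernel.

    A k x k kernel with dilation d spans (k-1)*d + 1 input pixels per side,
    so the receptive field grows without any downsampling."""
    k = kernel.shape[0]
    span = (k - 1) * dilation + 1
    h = img.shape[0] - span + 1
    w = img.shape[1] - span + 1
    out = np.zeros((h, w))
    for i in range(h):
        for j in range(w):
            patch = img[i : i + span : dilation, j : j + span : dilation]
            out[i, j] = np.sum(patch * kernel)
    return out

img = np.arange(18 * 18, dtype=float).reshape(18, 18)  # one 18 x 18 feature map
k = np.ones((3, 3))
dense = dilated_conv2d(img, k, dilation=1)   # 3-pixel window -> 16 x 16 output
atrous = dilated_conv2d(img, k, dilation=2)  # 5-pixel window -> 14 x 14 output
```

The same 3 × 3 kernel thus sees a 5 × 5 region when dilated, which is why the network can cover a larger range of EEG feature information without pooling.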
To further improve the generalization performance of the hole full convolution network, a spectral norm regularization term [ is ] is introduced into the loss function of the network to reduce input perturbation. The spectral norm regularization term introduces rule constraint from the spectral norm angle of the weight matrix, and prevents the weight matrix from having a larger spectral norm, so that the cavity full convolution network shows better generalization capability. For a matrix a, the spectral norm is expressed as its maximum singular value, and the corresponding mathematical expression is as follows:
To constrain each weight matrix W_l in the dilated fully convolutional network, the proposed method considers the following optimization procedure:

min_θ (1/K) Σ_{i=1}^{K} L(f(x_i; θ), y_i) + (λ/2) Σ_{l=1}^{L} σ(W_l)²
where λ is the regularization factor and the second term is the spectral norm regularization term. Penalizing the sum of the per-layer spectral norms constrains the spectral norm of the entire network.
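The spectral norm and the shape of the penalized objective can be sketched numerically. `spectral_norm` below uses power iteration, the usual practical estimator; `regularized_loss` is a hypothetical helper (the name and `lam` parameter are illustrative, standing in for λ):

```python
import numpy as np

def spectral_norm(W, n_iter=50):
    """Largest singular value of W, estimated by power iteration on W^T W."""
    v = np.random.default_rng(0).standard_normal(W.shape[1])
    for _ in range(n_iter):
        v = W.T @ (W @ v)     # one power-iteration step
        v /= np.linalg.norm(v)
    return float(np.linalg.norm(W @ v))

def regularized_loss(task_loss, weights, lam=0.01):
    """Task loss plus (lam/2) * sum of squared per-layer spectral norms."""
    return task_loss + 0.5 * lam * sum(spectral_norm(Wl) ** 2 for Wl in weights)

W = np.array([[3.0, 0.0],
              [0.0, 1.0]])          # singular values 3 and 1
print(round(spectral_norm(W), 6))   # -> 3.0, the max singular value
```

In training frameworks the same estimate is typically maintained with one power-iteration step per gradient update rather than a full SVD.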
Design of experiments
Experimental data set:
The DEAP dataset used in the experiments is a large standard emotion dataset. It contains 32-channel electroencephalogram recordings and 8-channel peripheral physiological recordings from 32 healthy subjects (16 female, 16 male). Each subject watched 40 one-minute music video clips while the physiological signals were recorded with 40 electrodes (placed according to the international 10-20 system). Each trial contains 63 seconds of signal, of which the first 3 seconds are the baseline signal. At the end of each trial, the subject rated valence, arousal, dominance and liking to assess his or her current emotional state. All raw signals were recorded at a 512 Hz sampling frequency.
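The per-trial arithmetic implied by this description (together with the 128 Hz preprocessed sampling rate given in claim 2) can be checked in a few lines; the constant names are illustrative, not part of the dataset's API:

```python
# Shapes implied by the DEAP description above.
SAMPLING_RATE = 128      # Hz, after downsampling in preprocessing
TRIAL_SECONDS = 63       # 3 s baseline + 60 s stimulus
BASELINE_SECONDS = 3
EEG_CHANNELS = 32

samples_per_trial = TRIAL_SECONDS * SAMPLING_RATE        # full trial length
baseline_samples = BASELINE_SECONDS * SAMPLING_RATE      # baseline portion
stimulus_samples = samples_per_trial - baseline_samples  # stimulus portion

# One-second frames over the 60 s stimulus portion:
frames_per_trial = stimulus_samples // SAMPLING_RATE
print(samples_per_trial, baseline_samples, frames_per_trial)
```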
Results of the experiment
Experimental results under different characteristic rearrangements:
To further examine the contribution of inter-channel and inter-band information to electroencephalogram emotion recognition, the proposed three-dimensional feature representation was compared with several representative feature rearrangement approaches. Here, the CNN used for emotion classification consists of three convolutional layers and one max-pooling layer, with a fully connected layer of 1024 hidden nodes added to produce the output. All convolutional layers use 3 × 3 kernels; the first layer has 32 kernels, and the number doubles at each subsequent layer. ReLU is the activation function of the convolutional layers, and the max-pooling layer uses a 2 × 2 window.
First, the experiment examined different rearrangement patterns for each of the three features. The total number of each electroencephalogram feature extracted from one sample is 128 (32 channels × 4 bands). Each feature is first represented in a 1 × 128 format, which serves as the baseline for single-feature rearrangement (denoted B1). Next, each feature is rearranged by frequency band (denoted R11) and by electrode channel information (denoted R12), expressed in 4 × 32 and 32 × 4 forms, respectively. Finally, the rearrangement that integrates frequency band and electrode channel information (denoted R13) is represented with size 18 × 18 × 1. The results are shown in Table 1. R13 clearly gives the best performance for every individual feature, which may be attributed to its good use of the prior information provided by the feature rearrangement.
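The R13-style rearrangement can be sketched as follows. The electrode-to-grid mapping below is a placeholder; the actual mapping follows the international 10-20 electrode positions used during acquisition:

```python
import numpy as np

GRID = 9  # 9 x 9 electrode grid per frequency band

def band_to_grid(features, positions):
    """Scatter 32 per-channel features into a 9 x 9 grid; unused cells stay 0."""
    g = np.zeros((GRID, GRID))
    for val, (r, c) in zip(features, positions):
        g[r, c] = val
    return g

# Placeholder electrode positions (row, col); illustrative only.
positions = [divmod(i, 8) for i in range(32)]

rng = np.random.default_rng(1)
bands = [band_to_grid(rng.standard_normal(32), positions) for _ in range(4)]

# R13-style rearrangement: tile the four 9 x 9 band grids 2 x 2 -> 18 x 18.
plane = np.block([[bands[0], bands[1]],
                  [bands[2], bands[3]]])
print(plane.shape)  # (18, 18)
```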
Table 1. Classification accuracies when using different feature rearrangements on the DEAP dataset.
Second, for multi-feature fusion, extensive experiments analyze the contribution of the prior information in detail, as shown in Table 2. A total of 384 features (32 channels × 4 bands × 3 features) is available per sample over all bands; the horizontal concatenation of the multiple features is represented in a 1 × 384 format, which serves as the baseline for multi-feature rearrangement (denoted B2). To verify the contribution of inter-channel information, experiments were performed on the four frequency bands (Theta, Alpha, Beta and Gamma) separately, in which case 96 features (32 channels × 3 features) are obtained per sample. The EEG features in the conventional input format (1 × 32 × 3, denoted R21) are then compared with those in the reformatted input format (9 × 9 × 3, denoted R22). From R21 to R22 the classification performance improves markedly, indicating that inter-channel information plays an important role in the electroencephalogram emotion classification task. To verify the contribution of inter-band information, experiments were also run on the average over the four bands and on all bands together: the features in the reformatted input format (9 × 9 × 3, R22) are compared with those in the three-dimensional feature representation. The performance improvement from R22 to the three-dimensional representation indicates that inter-band information likewise plays an important role. Overall, the three-dimensional feature representation achieves the best results: its powerful representation capability accounts for complementary information between channels, between frequency bands and between different activation patterns.
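Stacking the three rearranged feature planes into the final three-dimensional input is then a one-liner; random planes stand in here for the actual K, P and DE representations:

```python
import numpy as np

# Each plane is an 18 x 18 rearrangement of one feature type (K, P, DE).
rng = np.random.default_rng(2)
K_plane, P_plane, DE_plane = (rng.standard_normal((18, 18)) for _ in range(3))

# Stack along a new last axis to obtain the 18 x 18 x 3 feature array.
X = np.stack([K_plane, P_plane, DE_plane], axis=-1)
print(X.shape)  # (18, 18, 3)
```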
The three-dimensional feature representation is therefore a helpful rearrangement method that makes full use of prior knowledge of the electroencephalogram signal.
Table 2. Classification accuracies when using different feature rearrangements on the DEAP dataset to verify the contribution of inter-channel and inter-band information.
Experimental results under different classification models:
As mentioned above, multi-feature fusion accounts for complementary information between different activation patterns and can therefore lead to better emotion recognition performance. To show the superiority of the dilated fully convolutional network, this section also implements emotion classifiers with other models: a one-dimensional deep model (CNN) and two shallow models, Decision Tree (DT) and Random Forest (RF). The CNN is configured with the same parameters as above, and the optimal parameters for DT and RF are determined by cross-validated grid search: the maximum depth of DT is set to 10 with the other parameters at their defaults, while RF, which can capture interactions between features during training, uses a maximum depth of 6. Comparing the deep model (CNN) against the shallow models (DT and RF) under the B2 feature rearrangement illustrates the advantage of deep representation learning; the corresponding results are shown in Table 3. Notably, the three-dimensional feature representation brings better performance because it supplies additional prior knowledge of inter-channel, inter-band and inter-activation-pattern information for electroencephalogram emotion recognition. The validity of the dilated fully convolutional network and of the spectral norm regularization is therefore verified with the 18 × 18 × 3 three-dimensional electroencephalogram feature array as input. In addition, dilated convolution allows the output of each convolution to cover a larger range of electroencephalogram feature information without changing the number of parameters. The dilation rate (denoted D) follows Fibonacci sequence terms, whose slow growth helps alleviate the gridding problem to some extent. As Table 3 shows, the best performance is achieved when D equals 2, where the 4 × 4 convolution kernel is equivalent to a 9 × 9 convolution kernel, exactly matching the size of the electroencephalogram electrode grid.
On this basis, adding the spectral norm regularization term to the loss function of the network reduces its sensitivity to perturbed EEG features and further improves electroencephalogram emotion recognition performance.
Table 3. Classification accuracies when using different classification models on the DEAP dataset.
Claims (6)
1. An electroencephalogram emotion recognition system based on three-dimensional features and a cavity (dilated) full convolution network, characterized by comprising the following steps:
s1, preprocessing the collected electroencephalogram signal samples, and decomposing the preprocessed electroencephalogram signals into four different frequency bands;
s2, performing framing processing on the electroencephalogram signals on different frequency bands obtained in the step S1, extracting electroencephalogram emotional characteristics of each electrode channel on different frequency bands from each frame, and removing baseline characteristics;
S3, rearranging the electroencephalogram emotional features of the electrode channels on the different frequency bands obtained in step S2 according to the position information of the electrodes during electroencephalogram signal acquisition, and splicing them by frequency band to obtain two-dimensional representations of the different features;
S4, stacking the two-dimensional representations of the different characteristics obtained in the step S3 to construct a three-dimensional electroencephalogram characteristic array;
S5, inputting the three-dimensional electroencephalogram feature array obtained in step S4 into a DFCN (dilated fully convolutional network) for training, introducing a spectral norm regularization term, and finally performing emotion classification with a softmax classifier.
2. The electroencephalogram emotion recognition system based on the three-dimensional features and the cavity full convolution network as recited in claim 1, wherein: in S1, the collected electroencephalogram signal samples come from the multi-modal datasets DEAP and DREAMER; the collected samples cover four continuous emotion dimensions, namely high/low valence, high/low arousal, high/low dominance and high/low liking; the collected samples comprise five frequency bands, namely Delta, Theta, Alpha, Beta and Gamma; the collected samples are preprocessed to remove ocular artifacts, and a band-pass filter retains the 4.0–45 Hz electroencephalogram signal; the sampling frequency of the preprocessed electroencephalogram signal is 128 Hz; and a Butterworth filter decomposes the preprocessed signal into the five different frequency bands.
3. The electroencephalogram emotion recognition system based on the three-dimensional features and the cavity full convolution network as recited in claim 1, wherein: in S2, the electroencephalogram emotional features extracted from each frame for each electrode channel on the different frequency bands comprise the time-domain feature Kurtosis (K), the frequency-domain feature Power (P) and the time-frequency-domain feature Differential Entropy (DE). The time-domain feature K is defined as

K = E[(x − μ)⁴] / σ⁴ = μ₄ / σ⁴

where μ is the mean, σ is the standard deviation, E denotes the expectation operation, and μ₄ is the fourth-order central moment. The frequency-domain feature P is defined as

P = (1/N) Σ_{n=1}^{N} x(n)²

where N is the length of one frame of the EEG signal.

The time-frequency-domain feature DE is defined as

DE = −∫ f(x) ln f(x) dx

where f(x) is the probability density function. When the samples follow a Gaussian distribution N(μ, σ²), the corresponding probability density function can be substituted into the above formula, and DE simplifies to

DE = (1/2) ln(2πeσ²)

where e is Euler's number and σ is the standard deviation of the EEG signal.
The baseline features are removed by subtracting, from the features of each frame, the corresponding features extracted from the baseline signal.
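The three feature definitions above can be sketched directly; a toy two-valued frame stands in for a real EEG frame, and population moments and the natural logarithm are used as in the formulas:

```python
import math

def kurtosis(x):
    """K = mu4 / sigma^4, using population moments as defined above."""
    n = len(x)
    mu = sum(x) / n
    var = sum((v - mu) ** 2 for v in x) / n      # sigma^2
    mu4 = sum((v - mu) ** 4 for v in x) / n      # fourth central moment
    return mu4 / var ** 2

def power(x):
    """P = (1/N) * sum of x[n]^2 over one frame of length N."""
    return sum(v * v for v in x) / len(x)

def diff_entropy(sigma):
    """DE of a Gaussian N(mu, sigma^2): (1/2) * ln(2*pi*e*sigma^2)."""
    return 0.5 * math.log(2 * math.pi * math.e * sigma ** 2)

frame = [1.0, -1.0, 1.0, -1.0]
print(kurtosis(frame))              # 1.0 for this two-valued frame
print(power(frame))                 # 1.0
print(round(diff_entropy(1.0), 4))  # ~1.4189 for unit variance
```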
4. The electroencephalogram emotion recognition system based on the three-dimensional features and the cavity full convolution network as recited in claim 1, wherein: in S3, the electroencephalogram emotional features of each electrode channel on the different frequency bands are rearranged according to the position information of the electrodes during electroencephalogram signal acquisition and then spliced by frequency band. Taking the K features as an example, the channel features of each frequency band are mapped into an h × w grid according to electrode position, and the band grids are spliced to form the two-dimensional representation of the K features.
5. The electroencephalogram emotion recognition system based on the three-dimensional features and the cavity full convolution network as recited in claim 1, wherein: in S4, the stacked two-dimensional representations of the different features are K′_t, P′_t and D′_t, and the dimension of the resulting three-dimensional electroencephalogram feature array is 2h × 2w × 3.
6. The electroencephalogram emotion recognition system based on the three-dimensional features and the cavity full convolution network as recited in claim 1, wherein: the spectral norm regularization term of S5 is defined through the maximum singular value of a matrix,

σ(A) = max_{ξ ≠ 0} ‖Aξ‖₂ / ‖ξ‖₂

and, to constrain each weight matrix W_l in the DFCN, the corresponding optimization procedure is defined as

min_θ (1/K) Σ_{i=1}^{K} L(f(x_i; θ), y_i) + (λ/2) Σ_{l=1}^{L} σ(W_l)²
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110034341.0A CN113180659B (en) | 2021-01-11 | 2021-01-11 | Electroencephalogram emotion recognition method based on three-dimensional feature and cavity full convolution network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113180659A true CN113180659A (en) | 2021-07-30 |
CN113180659B CN113180659B (en) | 2024-03-08 |
Family
ID=76972578
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110034341.0A Active CN113180659B (en) | 2021-01-11 | 2021-01-11 | Electroencephalogram emotion recognition method based on three-dimensional feature and cavity full convolution network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113180659B (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114462458A (en) * | 2022-04-11 | 2022-05-10 | 自然资源部第一海洋研究所 | Ship underwater signal noise reduction and target enhancement method |
CN114578963A (en) * | 2022-02-23 | 2022-06-03 | 华东理工大学 | Electroencephalogram identity recognition method based on feature visualization and multi-mode fusion |
CN117692346A (en) * | 2024-01-31 | 2024-03-12 | 浙商银行股份有限公司 | Message blocking prediction method and device based on spectrum regularization variation self-encoder |
Citations (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110353702A (en) * | 2019-07-02 | 2019-10-22 | 华南理工大学 | A kind of emotion identification method and system based on shallow-layer convolutional neural networks |
CN110515456A (en) * | 2019-08-14 | 2019-11-29 | 东南大学 | EEG signals emotion method of discrimination and device based on attention mechanism |
CN110781751A (en) * | 2019-09-27 | 2020-02-11 | 杭州电子科技大学 | Emotional electroencephalogram signal classification method based on cross-connection convolutional neural network |
US20200187841A1 (en) * | 2017-02-01 | 2020-06-18 | Cerebian Inc. | System and Method for Measuring Perceptual Experiences |
CN111444747A (en) * | 2019-01-17 | 2020-07-24 | 复旦大学 | Epileptic state identification method based on transfer learning and cavity convolution |
CN111709267A (en) * | 2020-03-27 | 2020-09-25 | 吉林大学 | Electroencephalogram signal emotion recognition method of deep convolutional neural network |
CN112200016A (en) * | 2020-09-17 | 2021-01-08 | 东北林业大学 | Electroencephalogram signal emotion recognition based on ensemble learning method AdaBoost |
Non-Patent Citations (1)
Title |
---|
SHEN Lei et al., "Epileptic state recognition method based on transfer learning and dilated convolution", Chinese Journal of Biomedical Engineering, vol. 39, no. 6 *
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||