CN113558644B - Emotion classification method, medium and equipment for 3D matrix and multidimensional convolution network - Google Patents

Emotion classification method, medium and equipment for 3D matrix and multidimensional convolution network Download PDF

Info

Publication number
CN113558644B
CN113558644B · Application CN202110821176.3A
Authority
CN
China
Prior art keywords
frequency domain
features
time domain
domain features
layer
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110821176.3A
Other languages
Chinese (zh)
Other versions
CN113558644A (en)
Inventor
陈景霞
刘洋
闵重丹
林文涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shaanxi University of Science and Technology
Original Assignee
Shaanxi University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shaanxi University of Science and Technology filed Critical Shaanxi University of Science and Technology
Priority to CN202110821176.3A priority Critical patent/CN113558644B/en
Publication of CN113558644A publication Critical patent/CN113558644A/en
Application granted granted Critical
Publication of CN113558644B publication Critical patent/CN113558644B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/24 Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
    • A61B5/316 Modalities, i.e. specific diagnostic methods
    • A61B5/369 Electroencephalography [EEG]
    • A61B5/372 Analysis of electroencephalograms
    • A61B5/374 Detecting the frequency distribution of signals, e.g. detecting delta, theta, alpha, beta or gamma waves

Abstract

The invention discloses an emotion classification method, medium and equipment for a 3D matrix and a multidimensional convolution network, comprising the following steps: extracting original features of an electroencephalogram (EEG) signal; extracting time domain features from the EEG signals of a plurality of channels; extracting frequency domain features from the EEG signals of the plurality of channels; converting the one-dimensional EEG sequences of the time domain and frequency domain features into a two-dimensional mesh sequence of time domain features and a two-dimensional mesh sequence of frequency domain features, and stacking each along a third dimension to obtain a 3D time domain feature matrix sequence and a 3D frequency domain feature matrix sequence; constructing a multidimensional convolutional neural network model, feeding the 3D time domain feature matrix sequence and the 3D frequency domain feature matrix sequence into the model as inputs, extracting deep time domain and frequency domain features, and passing the extracted deep features to a softmax layer for emotion classification. The emotion recognition method achieves higher classification accuracy.

Description

Emotion classification method, medium and equipment for 3D matrix and multidimensional convolution network
Technical Field
The invention belongs to the technical field of deep learning application, and particularly relates to an emotion classification method, medium and equipment for a 3D matrix and a multidimensional convolution network.
Background
Emotion is an outward expression of people's mental states and psychological tendencies, closely tied to rational behavior and daily life: positive emotions help improve the efficiency of daily work, while negative emotions may impair decisions, attention, and so on. With the development of artificial intelligence technology, emotion recognition has received a great deal of attention.
EEG analysis faces two basic difficulties. First, EEG signals have a very low signal-to-noise ratio and are contaminated by many kinds of noise. Second, one is usually interested only in the EEG activity associated with a particular brain process, yet this activity is hard to separate from the background. Therefore, to identify and extract the portion of an EEG signal that relates to a particular brain activity or emotion, sophisticated EEG analysis and processing techniques are required that account for both the spatial and the temporal correlations of the signal.
EEG emotion recognition typically faces two technical challenges: how to extract high-level semantic features from the EEG signal, and how to build an emotion classification model with better classification performance. Various EEG emotion recognition methods have been proposed in recent years and classification results have gradually improved, but two important problems still require in-depth study before performance can be pushed further: first, how to select and extract more effective EEG features from the raw signal and represent them so that they exhibit stronger correlation and discriminability in both the time and frequency domains; second, how to build an effective deep model that mines deeper emotion-related features from the input EEG features, thereby improving recognition capability.
Disclosure of Invention
In order to solve the problems in the prior art, the invention provides an emotion classification method, medium and equipment for a 3D matrix and multidimensional convolution network, addressing the currently low classification accuracy of EEG emotion recognition.
To achieve the above purpose, the present invention provides the following technical solution: an emotion classification method for a 3D matrix and a multidimensional convolution network, comprising the following steps: collecting electroencephalogram signals of a plurality of channels, and extracting original features of the electroencephalogram signals;
extracting time domain features from the electroencephalogram signals of the plurality of channels based on the extracted original features;
extracting frequency domain features from the electroencephalogram signals of the plurality of channels based on the extracted original features;
converting the one-dimensional electroencephalogram sequences of the time domain features and the frequency domain features into a two-dimensional mesh sequence of time domain features and a two-dimensional mesh sequence of frequency domain features, respectively, and stacking each along a third dimension to obtain a 3D time domain feature matrix sequence and a 3D frequency domain feature matrix sequence;
constructing a multidimensional convolutional neural network model, feeding the 3D time domain feature matrix sequence and the 3D frequency domain feature matrix sequence into the model as inputs, extracting deep time domain and frequency domain features, and passing the extracted deep features to a softmax layer for emotion classification.
Further, based on the extracted original features, the time domain features extracted from the electroencephalogram signals of the plurality of channels include: mean, variance, standard deviation, mean of first order difference absolute values, mean of second order difference absolute values, and approximate entropy;
based on the extracted original features, the frequency domain features extracted from the electroencephalogram signals of the plurality of channels comprise: power spectral density (PSD) features over five frequency bands (1-4 Hz, 4-8 Hz, 8-13 Hz, 13-30 Hz, and above 30 Hz) and over the full frequency band.
Further, based on the extracted original features, the specific steps of extracting frequency domain features from the electroencephalogram signals of the plurality of channels are as follows: using the fast Fourier transform, frequency domain features of the original time-series electroencephalogram signal are extracted on the five frequency bands Delta, Theta, Alpha, Beta and Gamma; the original feature data are scanned with a Hamming window of length 0.5 s moved with a step of 0.25 s, extracting 32 power spectral density (PSD) features per slide; the PSD features on the full band are then concatenated with the PSD features on the five bands, yielding 6 different frequency domain features.
Further, collecting the electroencephalogram signals of the plurality of channels further comprises preprocessing the collected electroencephalogram signals; the specific preprocessing steps are as follows:
A 4-45 Hz band-pass filter is applied to the data to remove DC noise, power-line noise and other artifacts, and blind source separation is then used to remove electro-oculogram (EOG) interference.
Further, the specific steps of extracting the original features of the electroencephalogram signal are as follows:
Based on the extracted electroencephalogram signals of the plurality of channels, the electroencephalogram of each trial is segmented without overlap, yielding several samples per trial and hence the total number of samples per subject; each sample contains several sampling points, and each sampling point contains the extracted multi-channel electroencephalogram values, giving the original features.
Further, the specific steps of converting the one-dimensional electroencephalogram sequences of the time domain and frequency domain features into two-dimensional mesh sequences and then stacking them along a third dimension are as follows:
According to the positions of the different electrodes on the cerebral cortex, the electroencephalogram is mapped into a 9×9 two-dimensional mesh matrix; the other positions of the matrix are filled with 0, and the non-zero values represent the electroencephalogram feature values of the corresponding channels. The 6 different one-dimensional chained features extracted for the time domain and the frequency domain are each converted into a 9×9×1 feature matrix, and the time domain or frequency domain features of each sample are then stacked along the third dimension to obtain a 9×9×6 3D time domain feature matrix and a 3D frequency domain feature matrix, respectively.
Further, the multidimensional convolutional neural network model comprises a feature input layer, a unit convolution layer, a multivariate convolution layer and an output layer; the feature input layer feeds the 3D time domain feature matrix sequence and the 3D frequency domain feature matrix sequence, respectively, as inputs to the unit convolution layer;
the unit convolution layer extracts primary time domain features from the 3D time domain feature matrix sequence and primary frequency domain features from the 3D frequency domain feature matrix sequence, and passes the extracted primary features to the multivariate convolution layer;
the multivariate convolution layer extracts deep time domain features and deep frequency domain features from the primary time domain and frequency domain features, respectively;
the output layer comprises a fully connected layer and a softmax layer; it takes the output of the multivariate convolution layer as input, a Dropout layer is added after the fully connected layer, and the fully connected layer passes the extracted high-level deep time domain and frequency domain features to the softmax layer for emotion classification.
Further, the unit convolution layer scans with a 1×1 convolution kernel, and a ReLU activation function is applied after each 1×1 convolution to obtain a nonlinear result;
the multivariate convolution layer uses convolution kernels of sizes 3×3, 5×5 and 7×7 to extract local electroencephalogram features of different regions. Each branch contains two layers: the first groups local EEG channels together to learn local correlations between channels, and the second captures contextual correlation information between groups. A ReLU activation function is applied after each convolution operation to obtain a nonlinear output, and the nonlinear results of the multivariate convolution layer are then concatenated. After this cascade of multi-layer convolution results, one further convolution is performed whose kernel size matches the size of the input data; finally a filter compresses each tensor into a vector. A Dropout layer is added after the fully connected layer, which is then followed by the softmax layer for emotion classification.
The present invention also provides a computer readable storage medium storing one or more programs, the one or more programs comprising instructions, which when executed by a computing device, cause the computing device to perform any of the methods described above.
The present invention also provides a computing device comprising:
one or more processors, memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods described above.
Compared with the prior art, the invention has at least the following beneficial effects:
the invention provides an electroencephalogram signal emotion classification method based on a 3D matrix and a multidimensional convolution network, which is characterized in that time domain features and frequency domain features are respectively extracted from electroencephalogram signals of a plurality of channels, then one-dimensional chained electroencephalogram sequences of the extracted time domain features and frequency domain features are respectively converted into two-dimensional reticular matrix sequences, and are respectively converted into three-dimensional matrix representations, namely, the time domain feature sequences and the frequency domain feature sequences are spliced in a third dimension to finally obtain a 3D time domain feature matrix sequence and a 3D frequency domain feature matrix sequence, and the obtained 3D time domain features and frequency domain features are respectively input into the multidimensional convolution neural network to carry out an in-test emotion two classification experiment; compared with the prior study, the model has less pretreatment on the original data, is more suitable for real-time application such as BCI and the like, and achieves 0.8483 classification precision in the dimension of the awakening degree for 3D time domain characteristics by the emotion recognition method; the classification accuracy in the potency dimension is 0.8519; for the 3D frequency domain characteristics, the classification precision in the wakeup degree dimension reaches 0.8588; the classification accuracy in the potency dimension is 0.8732, and the classification accuracy of electroencephalogram signal emotion recognition is high.
Furthermore, the six selected time domain features effectively capture emotion-related information in the time domain.
Further, the original feature data are scanned with a Hamming window of length 0.5 s moved with a step of 0.25 s, extracting 32 power spectral density (PSD) features per slide; the PSD features on the full band are then joined with the PSD features on the five bands along a third dimension, i.e. the 6 different frequency domain features are cascaded into the 3D frequency domain features. This feature representation lets the EEG samples capture rich emotion-related information across space, time and frequency.
Further, converting the one-dimensional chained sequence into a two-dimensional mesh matrix makes full use of the positional information between different electrodes to capture the emotion-related spatial information in the EEG signal; fusing it with the temporal information carried by the raw EEG yields a better representation of the extracted spatio-temporal features, facilitating feature optimization and the extraction of more discriminative EEG features. Converting the two-dimensional features into three-dimensional features requires an extra feature construction step, but it effectively improves the representation of multi-channel EEG emotion-related features, showing that a sound feature extraction and representation method is essential for improving emotion recognition performance. Moreover, the 3D feature matrix closely resembles a multi-channel image and can be combined with a multidimensional convolutional neural network to better recognize EEG emotion features.
Furthermore, in the emotion classification process of the multidimensional convolutional neural network model, a unit convolution layer and a multivariate convolution layer are used: the 1×1 convolutions in the unit convolution layer extract the unique features of each EEG channel and deepen the network, while the multivariate convolution layer uses parallel filters of different sizes to extract regional features from different areas of the cerebral cortex. This multidimensional convolution scheme also preserves the local characteristics of different regions and can extract more effective emotion features from the data.
Drawings
FIG. 1 is a two-dimensional mesh matrix diagram corresponding to a 32-channel EEG signal when a one-dimensional original EEG sequence is converted into a two-dimensional mesh sequence;
FIG. 2 is a diagram of a feature structure after converting a two-dimensional mesh sequence into a 3D feature matrix sequence according to the present invention;
FIG. 3 is a diagram of a multi-dimensional neural network structure according to the present invention;
FIG. 4 is a graph of classification accuracy of three-dimensional EEG features in a multidimensional convolutional neural network model;
FIG. 5 compares the classification accuracy of frequency domain three-dimensional EEG features with that of one-dimensional time-series features;
FIG. 6 compares the emotion classification accuracy of three-dimensional features and two-dimensional features in the multidimensional convolutional neural network model;
FIG. 7 compares the performance of the multidimensional convolution model with other existing deep models.
Detailed Description
The invention is further described below with reference to the drawings and the detailed description.
The invention provides an EEG emotion classification method based on 3D matrix features and a multidimensional convolution network. For the raw EEG signals in the large-scale public DEAP dataset it proposes a new EEG feature representation, and on this basis a multidimensional convolutional neural network model that learns and extracts deep, more discriminative time-frequency correlated features for subject-dependent binary emotion classification, achieving better classification accuracy than prior methods. The method comprises the following steps:
step 1: collecting electroencephalogram signals of a plurality of channels and preprocessing them; the EEG emotion classification experiments and the verification of model performance are carried out on the public large-scale EEG emotion dataset DEAP;
step 2: extracting six time domain features, namely the mean, variance, standard deviation, mean of first-order difference absolute values, mean of second-order difference absolute values, and approximate entropy, from the preprocessed multi-channel EEG signals of the DEAP dataset;
step 3: extracting PSD features on the five sub-bands Delta (1-4 Hz), Theta (4-8 Hz), Alpha (8-13 Hz), Beta (13-30 Hz) and Gamma (above 30 Hz) and on the full band from the preprocessed multi-channel EEG signals of the DEAP dataset;
step 4: converting the one-dimensional EEG sequence of time domain features into a two-dimensional mesh sequence, and stacking the resulting two-dimensional mesh sequences along a third dimension to obtain the 3D time domain feature matrix sequence;
step 5: applying the same operation as step 4 to the one-dimensional EEG sequence of frequency domain features to obtain the 3D frequency domain feature matrix sequence;
step 6: constructing the multidimensional convolutional neural network model, feeding the three-dimensional feature matrices obtained above into the model, passing them through the unit convolution layer and the multivariate convolution layer, and finally predicting the emotion class with a softmax layer;
Specifically, the steps are realized as follows:
EEG-based emotion brain-computer interface systems typically collect EEG signals with a portable wearable multi-channel electrode cap, whose sensors capture the voltage fluctuations on the subject's scalp while the subject watches the stimulus videos;
In this embodiment, experiments are performed on the EEG signals collected in the public large-scale DEAP dataset. The dataset records the EEG, ECG, EMG and other physiological signals of 32 subjects watching music videos of about 1 minute with different emotional tendencies; the subjects then rated each video on arousal, valence, liking, dominance and familiarity using continuous values from 1 to 9, where larger values indicate ratings from negative to positive or from weak to strong. The 40 stimulus videos comprise 20 high-valence/high-arousal stimuli and 20 low-valence/low-arousal stimuli;
In this embodiment, the EEG signals of 32 channels are extracted and downsampled to 128 Hz; a 4-45 Hz band-pass filter is applied to remove DC noise, power-line noise and other artifacts, and blind source separation is used to remove EOG interference, yielding EEG signals with a total duration of 63 seconds, comprising 60 seconds of video watching and 3 seconds of pre-viewing resting state;
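As an illustration of this preprocessing step, the following is a minimal Python sketch of the 4-45 Hz band-pass filtering, assuming SciPy; the Butterworth design, the filter order and the zero-phase filtering are our own choices (the text only specifies a 4-45 Hz band-pass), and the blind-source-separation EOG removal is omitted.

    # Hedged sketch: 4-45 Hz band-pass per channel (Butterworth, zero-phase).
    import numpy as np
    from scipy.signal import butter, filtfilt

    def bandpass_eeg(eeg, fs=128.0, low=4.0, high=45.0, order=4):
        """eeg: array of shape (channels, samples) -> filtered copy."""
        nyq = 0.5 * fs
        b, a = butter(order, [low / nyq, high / nyq], btype="band")
        return filtfilt(b, a, eeg, axis=-1)  # zero-phase, no group delay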
The raw EEG signal is represented as 32 (subjects) × 40 (trials) × 40 (channels) × 8064 (samples), where 8064 = 128 (samples/s) × 63 (s), and the labels are represented as 40 (trials) × 4. The raw data are preprocessed to extract the required 32 EEG channels from the 40 channels. Because of the delayed response of human vision, the invention uses the first 3 seconds as a baseline and the following 60 seconds of EEG as experimental data, so the preprocessed data are represented as 32 (subjects) × 40 (trials) × 32 (channels) × 7680 (samples). The first two label dimensions, representing valence and arousal respectively, are selected, i.e. 40 (trials) × 2;
Next, the original features of the plurality of channels are extracted: based on the multi-channel EEG signals extracted in step 1, the EEG of each trial is segmented without overlap, yielding several samples per trial and hence the total number of samples per subject; each sample contains several sampling points, and each sampling point contains the multi-channel data extracted in step 1, giving the original features;
Specifically, in this experiment the EEG sequence is segmented without overlap into windows of length 1 s, i.e. into samples. Each trial yields 60 segments, each segment contains 128 sampling points, and each sampling point contains 32 channels, so each subject's EEG data can be represented as 40 × 128 × 60 × 32; after dimension transformation this gives EEG data of 2400 × 32 × 128, i.e. 2400 EEG segments per subject, each of size 32 × 128. The same dimension conversion of the labels yields 2400 × 1;
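The segmentation just described can be sketched as follows (the helper name is hypothetical, NumPy assumed); it reproduces the 40 × 32 × 7680 to 2400 × 32 × 128 conversion stated above.

    # Sketch of the non-overlapping 1 s segmentation for one subject.
    import numpy as np

    def segment_trials(data, fs=128):
        """data: (trials, channels, samples) -> (trials*segs, channels, fs)."""
        trials, channels, samples = data.shape
        segs = samples // fs                      # 7680 // 128 = 60
        x = data[:, :, :segs * fs].reshape(trials, channels, segs, fs)
        return x.transpose(0, 2, 1, 3).reshape(trials * segs, channels, fs)

    # segment_trials(np.zeros((40, 32, 7680))).shape == (2400, 32, 128)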
As shown in FIG. 1, the 6 time domain features (mean, variance, standard deviation, mean of first-order difference absolute values, mean of second-order difference absolute values, and approximate entropy) are extracted from the original features, and the one-dimensional time domain chained sequence is then converted into a two-dimensional mesh structure. In the invention, the EEG data of the 32 channels are mapped into a 9×9 two-dimensional mesh matrix according to the electrode positions. To keep the spatial information complete without affecting its function, the other positions of the mapping matrix are filled with 0, and the non-zero values represent the EEG feature values of the corresponding channels. Each feature sequence is then converted into a 9×9×1 feature matrix, and the 6 time domain features of each sample are stacked along a third dimension to obtain a 9×9×6 3D time domain feature matrix;
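For concreteness, the six time domain features can be computed per channel on each 1 s segment as sketched below; the approximate-entropy settings (m = 2, r = 0.2 × std, Chebyshev distance) are common defaults assumed here, not values given in the text.

    import numpy as np

    def approximate_entropy(x, m=2, r_factor=0.2):
        """Standard ApEn; the m and r choices are assumptions."""
        r = r_factor * np.std(x)
        def phi(mm):
            n = len(x) - mm + 1
            emb = np.array([x[i:i + mm] for i in range(n)])
            dist = np.max(np.abs(emb[:, None] - emb[None, :]), axis=2)
            return np.mean(np.log(np.mean(dist <= r, axis=1)))
        return phi(m) - phi(m + 1)

    def time_domain_features(seg):
        """seg: one channel's 128 samples -> the six features listed above."""
        d1, d2 = np.diff(seg), np.diff(seg, n=2)
        return np.array([seg.mean(), seg.var(), seg.std(),
                         np.abs(d1).mean(), np.abs(d2).mean(),
                         approximate_entropy(seg)])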
Specifically, the data of the 32 channels are converted into the two-dimensional mesh structure; the correspondence between the 32 channels and the two-dimensional mesh matrix is shown in FIG. 1, giving a data representation of 128 × 2400 × 9 × 9. After the 6 features are stacked, the three-dimensional time domain matrix features (3D_Time-domain_matrix features) are obtained, represented as 307200 × 9 × 9 × 6, i.e. 307200 samples are fed into the deep model with corresponding labels of 307200 × 1; the specific three-dimensional time domain matrix structure is shown in FIG. 2;
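The 1-D to 2-D conversion and the third-dimension stacking can be sketched as below. The exact channel-to-cell assignment follows FIG. 1, which is not reproduced here, so CHANNEL_POS is a hypothetical placeholder; only its shape matters for this sketch.

    import numpy as np

    # Placeholder mapping: DEAP channel index -> (row, col) on the 9x9 grid.
    # The real assignment is the one shown in FIG. 1.
    CHANNEL_POS = {0: (1, 3), 1: (2, 2)}  # ... 32 entries in total

    def to_mesh(feat_1d, grid=9):
        """feat_1d: (32,) per-channel values -> (9, 9) mesh, zeros elsewhere."""
        mesh = np.zeros((grid, grid))
        for ch, (r, c) in CHANNEL_POS.items():
            mesh[r, c] = feat_1d[ch]
        return mesh

    def to_3d_matrix(feats):
        """feats: (6, 32), six features x 32 channels -> (9, 9, 6) matrix."""
        return np.stack([to_mesh(f) for f in feats], axis=-1)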
The frequency domain features are then converted with the same method as the time domain features. Specifically, the invention extracts frequency domain features from the multi-channel EEG signals on the basis of the original features: the fast Fourier transform is used to extract frequency domain features of the original EEG time series on the five bands Delta, Theta, Alpha, Beta and Gamma, and the full-band PSD features are extracted with a sliding Hamming window. Concretely, on the 4-45 Hz band, 64 PSD features are extracted per channel from each 1 s EEG segment using the fast Fourier algorithm with non-overlapping 0.5 s Hamming windows. The PSD features extracted over the 40 trials of each subject are dimension-converted with the same method as the time domain, and the 5 band features are connected with the full-band frequency domain features to obtain the three-dimensional frequency domain features (3D_Freq-domain_matrix features), represented as 153600 × 9 × 9 × 6; the corresponding labels are converted in the same way, and the three-dimensional matrix structure is shown in FIG. 2. The converted EEG samples contain rich information across space, time and frequency;
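A hedged sketch of the band-power extraction follows, using Welch-style averaging with a 0.5 s (64-point) Hamming window. The disclosure mentions a 0.25 s step while the embodiment mentions non-overlapping windows; the 50% overlap below is one plausible reading, reducing each band to its mean PSD is our assumption, and the Gamma band is capped at 45 Hz to match the 4-45 Hz filtering.

    import numpy as np
    from scipy.signal import welch

    BANDS = {"delta": (1, 4), "theta": (4, 8), "alpha": (8, 13),
             "beta": (13, 30), "gamma": (30, 45), "full": (4, 45)}

    def psd_band_features(seg, fs=128):
        """seg: (channels, 128) 1 s segment -> (6, channels) band features."""
        f, pxx = welch(seg, fs=fs, window="hamming", nperseg=64, noverlap=32)
        feats = []
        for lo, hi in BANDS.values():
            idx = (f >= lo) & (f < hi)
            feats.append(pxx[:, idx].mean(axis=1))  # mean PSD in the band
        return np.array(feats)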
Next, the label of each EEG sample is processed: based on each video's rating in the range 1-9, the valence and arousal ratings are divided into two classes with the median 5 as threshold. For binary classification in a given dimension, a value above 5 represents the high class or positive index, denoted 1, and a value of 5 or below represents the low class or negative index, denoted 0;
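This thresholding is a one-liner; the sketch below assumes the ratings are stored as a (trials, 2) array of valence and arousal values.

    import numpy as np

    def binarize_labels(ratings):
        """ratings in [1, 9] -> 1 for high (>5), 0 for low (<=5)."""
        return (ratings > 5).astype(int)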
Further, the multidimensional convolution model of step 6 is implemented as follows. The model is constructed as shown in FIG. 3; its input is the three-dimensional EEG feature matrix processed as described above, each sample of size 9 × 9 × 6. First, the unit convolution layer extracts features from each three-dimensional EEG feature matrix; the extracted feature sequences are then fed into the multivariate convolution layer, which extracts deeper features. Finally, a fully connected layer receives the output of the CNN, and the resulting feature vector is fed into the softmax layer for the final emotion classification; the process is the same for the frequency domain and time domain features. Specifically:
To enhance the local abstraction capability of the model, the unit convolution layer of FIG. 3 scans with a 1×1 convolution kernel, and a ReLU activation function is applied after each 1×1 convolution to obtain nonlinear results;
The three-dimensional feature matrix sequence obtained above is fed into the multivariate convolution layer, which uses convolution kernels of sizes 3×3, 5×5 and 7×7 to extract local EEG features under different fields of view; the kernel sizes depend on the representation of the 3D matrix features in the input layer. Each branch contains two layers: the first groups local EEG channels together to learn local correlations between channels, and the second captures contextual information between groups; same padding is used in every convolution layer. As before, a ReLU activation function is applied after each convolution to obtain nonlinear outputs, and the results of the multiple convolutions are then concatenated. The multivariate convolution layer preserves the distinct function of each convolution branch; after the multi-layer cascade, one further convolution is performed whose kernel size equals the size of the input data. Finally, the filter compresses each tensor into a vector;
Finally, the input of the fully connected layer is the output of the multivariate convolution layer; to prevent overfitting of the model, a Dropout layer is added after the fully connected layer, followed by the softmax layer for final classification.
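Putting the pieces together, the following Keras sketch mirrors the described architecture (1×1 unit convolution, parallel two-layer 3×3/5×5/7×7 branches with same padding, concatenation, a final convolution whose kernel matches the 9×9 input size, then a fully connected layer, Dropout and softmax). The filter counts, Dense width and Dropout rate are not given in the text and are placeholders.

    import tensorflow as tf
    from tensorflow.keras import layers, models

    def build_multidim_cnn(input_shape=(9, 9, 6), n_classes=2):
        inp = layers.Input(shape=input_shape)
        x = layers.Conv2D(64, 1, activation="relu")(inp)   # unit conv layer
        branches = []
        for k in (3, 5, 7):                                # multivariate layer
            b = layers.Conv2D(64, k, padding="same", activation="relu")(x)
            b = layers.Conv2D(64, k, padding="same", activation="relu")(b)
            branches.append(b)
        x = layers.Concatenate()(branches)
        # final convolution, kernel size equal to the input spatial size
        x = layers.Conv2D(128, input_shape[:2], activation="relu")(x)
        x = layers.Flatten()(x)
        x = layers.Dense(128, activation="relu")(x)
        x = layers.Dropout(0.5)(x)
        out = layers.Dense(n_classes, activation="softmax")(x)
        return models.Model(inp, out)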
In summary, for multi-channel EEG signals the proposed multidimensional convolution model takes into account the correlation and interaction between channels and between regions, so that the context semantic features related to emotion in the 3D EEG feature matrix are mined more fully and emotion recognition performance is improved.
From FIG. 4 it can be seen that the multidimensional convolution network model taking the three-dimensional EEG feature matrices as input achieves, in the arousal dimension, classification accuracies of 0.8588 on the frequency domain features and 0.8483 on the time domain features; in the valence dimension, 0.8732 on the frequency domain features and 0.8519 on the time domain features, both high. FIGS. 5 and 6 show that the three-dimensional matrix features are clearly superior in classification accuracy to the one-dimensional and two-dimensional features; FIG. 7 shows that, compared with other existing deep models, the proposed multidimensional convolution model is superior in both arousal and valence, with greatly improved classification performance. Taken together, this shows that the 3D features learned in the invention are critical to improving the performance of EEG emotion classification and recognition.
In yet another embodiment of the present invention, a terminal device is provided, comprising a processor and a memory, the memory storing a computer program comprising program instructions, the processor executing the program instructions stored in the computer storage medium. The processor may be a central processing unit (CPU), another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, etc.; it is the computational and control core of the terminal, adapted to load and execute one or more instructions to implement the corresponding method flow or function. The processor in this embodiment can be used for the emotion classification operation of the 3D matrix and multidimensional convolution network, comprising the following steps:
collecting electroencephalogram signals of a plurality of channels, and extracting original characteristics of the electroencephalogram signals;
extracting time domain features from the electroencephalogram signals of the plurality of channels based on the extracted original features;
extracting frequency domain features from the electroencephalogram signals of the plurality of channels based on the extracted original features;
converting the one-dimensional electroencephalogram sequences of the time domain features and the frequency domain features into a two-dimensional mesh sequence of time domain features and a two-dimensional mesh sequence of frequency domain features, respectively, and stacking each along a third dimension to obtain a 3D time domain feature matrix sequence and a 3D frequency domain feature matrix sequence;
constructing a multidimensional convolutional neural network model, feeding the 3D time domain feature matrix sequence and the 3D frequency domain feature matrix sequence into the model as inputs, extracting deep time domain and frequency domain features, and passing the extracted deep features to a softmax layer for emotion classification.
In a further embodiment, the present invention also provides a storage medium, specifically a computer-readable storage medium, namely the memory device in the terminal device for storing programs and data. The computer-readable storage medium here may include both the built-in storage medium of the terminal device and any extended storage medium it supports. The storage medium provides storage space holding the operating system of the terminal as well as one or more instructions, which may be one or more computer programs (including program code), adapted to be loaded and executed by the processor. The computer-readable storage medium may be high-speed RAM or non-volatile memory, such as at least one magnetic disk memory.
One or more instructions stored in a computer-readable storage medium may be loaded and executed by a processor to implement the corresponding steps in the above embodiments relating to emotion classification of 3D matrices and multidimensional convolutional networks; one or more instructions in a computer-readable storage medium are loaded by a processor and perform the steps of:
collecting electroencephalogram signals of a plurality of channels, and extracting original characteristics of the electroencephalogram signals;
extracting time domain features from the electroencephalogram signals of the plurality of channels based on the extracted original features;
extracting frequency domain features from the electroencephalogram signals of the plurality of channels based on the extracted original features;
converting the one-dimensional electroencephalogram sequences of the time domain features and the frequency domain features into a two-dimensional mesh sequence of time domain features and a two-dimensional mesh sequence of frequency domain features, respectively, and stacking each along a third dimension to obtain a 3D time domain feature matrix sequence and a 3D frequency domain feature matrix sequence;
constructing a multidimensional convolutional neural network model, feeding the 3D time domain feature matrix sequence and the 3D frequency domain feature matrix sequence into the model as inputs, extracting deep time domain and frequency domain features, and passing the extracted deep features to a softmax layer for emotion classification.
Finally, it should be noted that the above examples are only specific embodiments of the invention and are not intended to limit its scope of protection. Although the invention has been described in detail with reference to the foregoing examples, those skilled in the art will understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features replaced by equivalents, without departing from the spirit and scope of the technical solutions of the embodiments of the invention; all such modifications, changes or substitutions are intended to fall within the scope of the invention. Therefore, the protection scope of the present invention shall be subject to the protection scope of the claims.

Claims (6)

1. An emotion classification method for a 3D matrix and a multidimensional convolution network, characterized by comprising the following steps: collecting electroencephalogram signals of a plurality of channels, and extracting original features of the electroencephalogram signals;
extracting time domain features from the electroencephalogram signals of the plurality of channels based on the extracted original features;
extracting frequency domain features from the electroencephalogram signals of the plurality of channels based on the extracted original features;
converting the one-dimensional electroencephalogram sequences of the time domain features and the frequency domain features into a two-dimensional mesh sequence of time domain features and a two-dimensional mesh sequence of frequency domain features, respectively, and stacking each along a third dimension to obtain a 3D time domain feature matrix sequence and a 3D frequency domain feature matrix sequence;
constructing a multidimensional convolutional neural network model, feeding the 3D time domain feature matrix sequence and the 3D frequency domain feature matrix sequence into the model as inputs, extracting deep time domain and frequency domain features, and passing the extracted deep features to a softmax layer for emotion classification;
based on the extracted original features, the time domain features extracted from the electroencephalogram signals of the plurality of channels comprise: mean, variance, standard deviation, mean of first order difference absolute values, mean of second order difference absolute values, and approximate entropy;
based on the extracted original features, the frequency domain features extracted from the electroencephalogram signals of the plurality of channels comprise: power spectral density (PSD) features over five frequency bands (1-4 Hz, 4-8 Hz, 8-13 Hz, 13-30 Hz, and above 30 Hz) and over the full frequency band;
based on the extracted original features, the specific steps of extracting the frequency domain features from the electroencephalogram signals of the plurality of channels are as follows: using the fast Fourier transform, frequency domain features of the original time-series electroencephalogram signal are extracted on the five frequency bands Delta, Theta, Alpha, Beta and Gamma; the original feature data are scanned with a Hamming window of length 0.5 s moved with a step of 0.25 s, extracting 32 power spectral density (PSD) features per slide; the PSD features on the full band are then concatenated with the PSD features on the five bands, yielding 6 different frequency domain features;
the multidimensional convolutional neural network model comprises a feature input layer, a unit convolution layer, a multivariate convolution layer and an output layer; the feature input layer feeds the 3D time domain feature matrix sequence and the 3D frequency domain feature matrix sequence, respectively, as inputs to the unit convolution layer;
the unit convolution layer extracts primary time domain features from the 3D time domain feature matrix sequence and primary frequency domain features from the 3D frequency domain feature matrix sequence, and passes the extracted primary features to the multivariate convolution layer;
the multivariate convolution layer extracts deep time domain features and deep frequency domain features from the primary time domain and frequency domain features, respectively;
the output layer comprises a fully connected layer and a softmax layer; it takes the output of the multivariate convolution layer as input, a Dropout layer is added after the fully connected layer, and the fully connected layer passes the extracted high-level deep time domain and frequency domain features to the softmax layer for emotion classification;
the unit convolution layer scans with a 1×1 convolution kernel, and a ReLU activation function is applied after each 1×1 convolution to obtain a nonlinear result;
the multivariate convolution layer uses convolution kernels of sizes 3×3, 5×5 and 7×7 to extract local electroencephalogram features of different regions; each branch contains two layers, the first grouping local EEG channels together to learn local correlations between channels and the second capturing contextual correlation information between groups; a ReLU activation function is applied after each convolution operation to obtain a nonlinear output, and the nonlinear results of the multivariate convolution layer are then concatenated; after the cascade of multi-layer convolution results, one further convolution is performed whose kernel size matches the size of the input data; finally a filter compresses each tensor into a vector; a Dropout layer is added after the fully connected layer, which is then followed by the softmax layer for emotion classification.
2. The emotion classification method of a 3D matrix and multidimensional convolution network according to claim 1, wherein collecting the electroencephalogram signals of the plurality of channels further comprises preprocessing the collected electroencephalogram signals, the specific preprocessing steps being as follows:
A 4-45 Hz band-pass filter is applied to the data to remove DC noise, power-line noise and other artifacts, and blind source separation is then used to remove electro-oculogram interference.
3. The emotion classification method of a 3D matrix and multidimensional convolution network according to claim 1, wherein the specific steps of extracting the original features of the electroencephalogram signal are as follows:
Based on the extracted electroencephalogram signals of the plurality of channels, the electroencephalogram of each trial is segmented without overlap, yielding several samples per trial and hence the total number of samples per subject; each sample contains several sampling points, and each sampling point contains the extracted multi-channel electroencephalogram values, giving the original features.
4. The emotion classification method of a 3D matrix and multidimensional convolution network according to claim 1, wherein the specific steps of converting the one-dimensional electroencephalogram sequences of the time domain and frequency domain features into two-dimensional mesh sequences and then stacking them along a third dimension are as follows:
According to the positions of the different electrodes on the cerebral cortex, the electroencephalogram is mapped into a 9×9 two-dimensional mesh matrix; the other positions of the matrix are filled with 0, and the non-zero values represent the electroencephalogram feature values of the corresponding channels; the 6 different one-dimensional chained features extracted for the time domain and the frequency domain are each converted into a 9×9×1 feature matrix, and the time domain or frequency domain features of each sample are stacked along the third dimension to obtain a 9×9×6 3D time domain feature matrix and a 3D frequency domain feature matrix, respectively.
5. A computer readable storage medium storing one or more programs, wherein the one or more programs comprise instructions, which when executed by a computing device, cause the computing device to perform any of the methods of claims 1-4.
6. A computing device, comprising:
one or more processors, memory, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the one or more processors, the one or more programs comprising instructions for performing any of the methods of claims 1-4.
CN202110821176.3A 2021-07-20 2021-07-20 Emotion classification method, medium and equipment for 3D matrix and multidimensional convolution network Active CN113558644B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110821176.3A CN113558644B (en) 2021-07-20 2021-07-20 Emotion classification method, medium and equipment for 3D matrix and multidimensional convolution network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110821176.3A CN113558644B (en) 2021-07-20 2021-07-20 Emotion classification method, medium and equipment for 3D matrix and multidimensional convolution network

Publications (2)

Publication Number Publication Date
CN113558644A CN113558644A (en) 2021-10-29
CN113558644B (en) 2024-03-22

Family

ID=78165839

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110821176.3A Active CN113558644B (en) 2021-07-20 2021-07-20 Emotion classification method, medium and equipment for 3D matrix and multidimensional convolution network

Country Status (1)

Country Link
CN (1) CN113558644B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114224342B (en) * 2021-12-06 2023-12-15 南京航空航天大学 Multichannel electroencephalogram signal emotion recognition method based on space-time fusion feature network
CN114533083B (en) * 2022-01-24 2023-12-01 江苏省人民医院(南京医科大学第一附属医院) Motor imagery state identification method based on multi-fusion convolutional neural network

Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108664994A (en) * 2018-04-17 2018-10-16 哈尔滨工业大学深圳研究生院 A kind of remote sensing image processing model construction system and method
CN110353675A (en) * 2019-08-14 2019-10-22 东南大学 The EEG signals emotion identification method and device generated based on picture
CN111860410A (en) * 2020-07-29 2020-10-30 南京邮电大学 Myoelectric gesture recognition method based on multi-feature fusion CNN
CN112244873A (en) * 2020-09-29 2021-01-22 陕西科技大学 Electroencephalogram time-space feature learning and emotion classification method based on hybrid neural network
CN112465069A (en) * 2020-12-15 2021-03-09 杭州电子科技大学 Electroencephalogram emotion classification method based on multi-scale convolution kernel CNN
CN112801404A (en) * 2021-02-14 2021-05-14 北京工业大学 Traffic prediction method based on self-adaptive spatial self-attention-seeking convolution
CN112884063A (en) * 2021-03-11 2021-06-01 广东工业大学 P300 signal detection and identification method based on multivariate space-time convolution neural network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10198671B1 (en) * 2016-11-10 2019-02-05 Snap Inc. Dense captioning with joint interference and visual context

Patent Citations (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108664994A (en) * 2018-04-17 2018-10-16 哈尔滨工业大学深圳研究生院 A kind of remote sensing image processing model construction system and method
CN110353675A (en) * 2019-08-14 2019-10-22 东南大学 The EEG signals emotion identification method and device generated based on picture
CN111860410A (en) * 2020-07-29 2020-10-30 南京邮电大学 Myoelectric gesture recognition method based on multi-feature fusion CNN
CN112244873A (en) * 2020-09-29 2021-01-22 陕西科技大学 Electroencephalogram time-space feature learning and emotion classification method based on hybrid neural network
CN112465069A (en) * 2020-12-15 2021-03-09 杭州电子科技大学 Electroencephalogram emotion classification method based on multi-scale convolution kernel CNN
CN112801404A (en) * 2021-02-14 2021-05-14 北京工业大学 Traffic prediction method based on self-adaptive spatial self-attention-seeking convolution
CN112884063A (en) * 2021-03-11 2021-06-01 广东工业大学 P300 signal detection and identification method based on multivariate space-time convolution neural network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
"基于深度学习的脑电信号情感识别算法的研究";王丽艳;《医药卫生科技辑》;第E080-18页 *

Also Published As

Publication number Publication date
CN113558644A (en) 2021-10-29

Similar Documents

Publication Publication Date Title
Sharma et al. Automated emotion recognition based on higher order statistics and deep learning algorithm
Tao et al. EEG-based emotion recognition via channel-wise attention and self attention
CN112244873A (en) Electroencephalogram time-space feature learning and emotion classification method based on hybrid neural network
CN111134666A (en) Emotion recognition method of multi-channel electroencephalogram data and electronic device
CN110693493A (en) Epilepsy electroencephalogram prediction method based on convolution and recurrent neural network combined time multiscale
Hartmann et al. Hierarchical internal representation of spectral features in deep convolutional networks trained for EEG decoding
CN113558644B (en) Emotion classification method, medium and equipment for 3D matrix and multidimensional convolution network
CN110969108A (en) Limb action recognition method based on autonomic motor imagery electroencephalogram
CN111184509A (en) Emotion-induced electroencephalogram signal classification method based on transfer entropy
CN109375776B (en) Electroencephalogram action intention recognition method based on multi-task RNN model
Zhao et al. Interactive local and global feature coupling for EEG-based epileptic seizure detection
CN111832452B (en) Characteristic optimization and recognition method of special gesture instruction based on brain electricity
Jinliang et al. EEG emotion recognition based on granger causality and capsnet neural network
CN115804602A (en) Electroencephalogram emotion signal detection method, equipment and medium based on attention mechanism and with multi-channel feature fusion
CN113128384B (en) Brain-computer interface software key technical method of cerebral apoplexy rehabilitation system based on deep learning
CN113180659B (en) Electroencephalogram emotion recognition method based on three-dimensional feature and cavity full convolution network
CN116602676A (en) Electroencephalogram emotion recognition method and system based on multi-feature fusion and CLSTN
Sharma et al. An automated mdd detection system based on machine learning methods in smart connected healthcare
CN114699078A (en) Emotion recognition method and system based on small number of channel EEG signals
Xu et al. Eeg signal classification and feature extraction methods based on deep learning: A review
Du et al. EEG-based epileptic seizure detection model using CNN feature optimization
Jehosheba Margaret et al. Performance analysis of EEG based emotion recognition using deep learning models
Mahmud et al. EMOCATNET: An efficient CNN-based deep neural architecture with effective channel attention mechanism for emotion recognition using EEG signal
Hassan et al. Review of EEG Signals Classification Using Machine Learning and Deep-Learning Techniques
Yang et al. Subject-independent emotion recognition based on entropy of EEG signals

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant