CN114767130A - Multi-modal feature fusion electroencephalogram emotion recognition method based on multi-scale imaging - Google Patents


Info

Publication number
CN114767130A
CN114767130A (application CN202210440906.XA)
Authority
CN
China
Prior art keywords
electroencephalogram
signal
model
feature fusion
scale
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210440906.XA
Other languages
Chinese (zh)
Inventor
徐华兴
胡飞
常加兴
毛晓波
李立国
郑鹏远
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhengzhou University
Original Assignee
Zhengzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhengzhou University
Priority to CN202210440906.XA
Publication of CN114767130A
Legal status: Pending

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/24 Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
    • A61B5/316 Modalities, i.e. specific diagnostic methods
    • A61B5/369 Electroencephalography [EEG]
    • A61B5/372 Analysis of electroencephalograms
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/16 Devices for psychotechnics; Testing reaction times; Devices for evaluating the psychological state
    • A61B5/165 Evaluating the state of mind, e.g. depression, anxiety
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235 Details of waveform analysis
    • A61B5/7264 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • A61B5/7267 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F2218/00 Aspects of pattern recognition specially adapted for signal processing
    • G06F2218/12 Classification; Matching


Abstract

The invention discloses a multi-modal feature fusion electroencephalogram emotion recognition method based on multi-scale imaging. The method combines multi-scale and time-series imaging algorithms and realizes emotion recognition by converting electroencephalogram signals into images. It preserves the spatial information of the electroencephalogram signals; the multi-scale algorithm reduces the amount of computation and finds latent electroencephalogram signal patterns; high-dimensional information is simultaneously encoded into the images so that they contain rich information; the advantages of machine vision are fully exploited by extracting the high-dimensional features of the images with a 2D CNN model; and better emotion classification results are obtained through different multi-modal feature fusion methods.

Description

Multi-modal feature fusion electroencephalogram emotion recognition method based on multi-scale imaging
Technical Field
The invention relates to the field of physiological signal processing, in particular to a multi-modal feature fusion electroencephalogram emotion recognition method based on multi-scale imaging.
Background
Emotion is a complex psychological and physiological state that affects people's cognition, behavior, and interpersonal interactions. According to cognitive and neurophysiological theories, emotions play an important role in human brain activity and can be detected in electroencephalogram (EEG) signals. Thus, effective emotion recognition can be performed using EEG signals.
Traditional EEG-based emotion recognition methods mainly use 1D CNN (one-dimensional convolutional neural network) techniques to extract signal features from the electroencephalogram and train a classifier to realize emotion recognition. These methods attend only to time-domain or frequency-domain information, so the spatial information of the electroencephalogram is severely lost and classification performance is limited; a great deal of effort is also needed to search the raw EEG signal for emotion-related signal features and to construct the corresponding correlations, the feature computation is time-consuming, and the generalization ability is very limited. In recent years, deep learning has developed vigorously in many fields, providing more possibilities for constructing emotion classification models.
Disclosure of Invention
The invention aims to provide a multi-modal feature fusion electroencephalogram emotion recognition method based on multi-scale imaging, in order to solve the problems of traditional emotion recognition methods that emotion-related features must be constructed by hand, feature computation is time-consuming, model generalization ability is poor, electroencephalogram spatial information is severely lost, and classification performance is limited.
In order to realize this purpose, the invention adopts the following technical scheme. Owing to the success of convolutional neural networks in image classification, time-series classification using time-series encoded imaging has also shown high performance. The invention therefore combines multi-scale and time-series imaging algorithms and realizes emotion recognition by converting electroencephalogram signals into images: it preserves the spatial information of the electroencephalogram signals; the multi-scale algorithm reduces the amount of computation and finds latent electroencephalogram signal patterns; high-dimensional information is simultaneously encoded into the images so that they contain rich information; the advantages of machine vision are fully exploited by extracting the high-dimensional features of the images with a 2D CNN (two-dimensional convolutional neural network) model; and better emotion classification results are obtained through different multi-modal feature fusion methods.
The invention discloses a multi-modal feature fusion electroencephalogram emotion recognition method based on multi-scale imaging, which comprises the following steps:
s1, performing baseline removal on the original electroencephalogram signal by using a python code to obtain a first electroencephalogram signal;
s2, performing multi-scale processing on the first electroencephalogram signal to obtain a second electroencephalogram signal;
s3, converting the second brain electrical signal into an image by using a time series imaging algorithm to obtain N image data sets;
s4, performing data enhancement on the N image data sets to construct N samples and label sets;
s5, obtaining N first feature vectors by the N samples and the label sets through a ResNet model and a DNN-01 model respectively;
s6, combining the N first feature vectors to form 3 second feature vectors;
s7, forming a multi-modal feature fusion electroencephalogram emotion classification model by the 3 second feature vectors through a DNN-02 model respectively;
and S8, randomly dividing the N samples and label sets from step S4 into M parts using a ten-fold cross-validation method, taking M-1 parts as training data and the remaining part as test data, and training the multi-modal feature fusion electroencephalogram emotion classification model to obtain an electroencephalogram emotion classification recognition model.
Further, the baseline removal includes the following: the baseline signal and the experimental signal in the original electroencephalogram signal are divided into K segments and I segments of length L respectively, and the average value of all baseline signal segments is subtracted from each experimental signal segment.
Further, the mathematical definition of the multi-scale processing is:

$$y_j^{(\tau)} = \frac{1}{\tau} \sum_{i=(j-1)\tau + 1}^{j\tau} x_i, \qquad 1 \le j \le \left\lfloor \frac{L}{\tau} \right\rfloor$$

where $\tau$ is the set time scale, $L$ is the length of the original electroencephalogram signal, $x_i$ is the signal value of the original electroencephalogram signal at time $i$, $y_j^{(\tau)}$ is the second electroencephalogram signal, and $j$ is the index of the second electroencephalogram signal.
Further, the data enhancement is Mixup.
Further, the 3 second feature vectors are: a second feature vector formed by adding the N first feature vectors; a second feature vector composed of the maximum values at the same positions of the N first feature vectors; and a second feature vector formed by weighted combination of the N first feature vectors using a fully connected layer.
Further, the weights in the weighted combination are the parameters of the N first feature vectors after fully-connected-layer training.
Further, the loss calculation formula of the multi-modal feature fusion electroencephalogram emotion classification model is:

$$L = \lambda_1 \sum_i L_i + \lambda_2 L_{\mathrm{com}}$$

where $\lambda_1$ and $\lambda_2$ are preset parameters; $L_i$ is the loss of the N samples and label sets passed through the ResNet model and the DNN-01 model, $i$ being an integer in $\{1, \ldots, N\}$; and $L_{\mathrm{com}}$ is the loss of the 3 second feature vectors passed respectively through the DNN-02 model.
The invention has the advantages that it combines multi-scale and time-series imaging algorithms and realizes emotion recognition by converting electroencephalogram signals into images. Compared with traditional EEG-based emotion recognition methods, it not only preserves the spatial information of the electroencephalogram signals but also reduces the amount of computation through the multi-scale algorithm and finds latent electroencephalogram signal patterns; high-dimensional information is encoded into the images so that they contain rich information; the advantages of machine vision are fully exploited by extracting the high-dimensional features of the images with the 2D CNN model; and better emotion classification results are obtained through different multi-modal feature fusion methods.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
Fig. 2 is a graph of 32 electrode positions in an embodiment of the method of the invention.
FIG. 3 is a schematic diagram of spatial information obtained after electroencephalogram signals are converted into images in the embodiment of the method.
Fig. 4 is a schematic diagram of a ResNet network structure in the embodiment of the method of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be described clearly and completely below. Obviously, the described embodiments are only a part of the embodiments of the present invention, not all of them. All other embodiments obtained by a person skilled in the art without creative effort, based on the embodiments of the present invention, fall within the protection scope of the present invention.
As shown in FIG. 1, the multi-modal feature fusion electroencephalogram emotion recognition method based on multi-scale imaging comprises the following steps:
s1, performing baseline removal on the original electroencephalogram signal by using a python code to obtain a first electroencephalogram signal;
this embodiment downloads an electroencephalogram signal data set from the disclosed DEAP as raw data. In the DEAP database, 32 participants participated in the experiment. Each participant was asked to watch 40 one minute music videos and electroencephalographic signals were recorded from 32 electrodes according to the international 10-20 system, the electrode positions being shown in fig. 2. Participants scored the allotment, arousal, disposition and preference on a continuous scale between 1 and 9 after viewing each video. The data recorded by each participant included 40 pieces of electroencephalographic (abbreviated in english: EEG) data and corresponding labels. Each segment of brain electrical data contains 60 seconds of the experimental signal and a 3 second baseline signal in a relaxed state.
Because the human electroencephalogram signal is non-stationary, it is easily influenced by tiny changes in the surrounding environment; moreover, the electroencephalogram signal evoked by an emotional stimulus is also influenced, to a certain extent, by the emotional state before the stimulus is received. Removing the baseline can therefore achieve better classification results. In the invention, the baseline signal and the experimental signal are divided into K segments and I segments of length L respectively, and python code is then used to remove the baseline signal (i.e., the electroencephalogram signal in the relaxed state) from the electroencephalogram signal by subtracting the average value of all baseline signal segments from each experimental signal segment. The mathematical expression is:

$$\bar{B} = \frac{1}{K} \sum_{k=1}^{K} B_k$$

$$\hat{E}_i = E_i - \bar{B}, \qquad i = 1, 2, \ldots, I$$

where $\bar{B}$ is the average value of all baseline signal segments, $B_k$ is the $k$-th baseline signal segment, $E_i$ is the $i$-th experimental signal segment, and $\hat{E}_i$ is the $i$-th experimental signal segment after baseline removal.
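For concreteness, this baseline-removal step can be sketched in a few lines of NumPy. The sketch assumes DEAP-style trials (a 3 s baseline followed by 60 s of signal at 128 Hz) and a segment length of L = 128 samples, i.e. one second; these values are our assumptions, not prescribed by the text above.

```python
import numpy as np

def remove_baseline(raw, fs=128, baseline_sec=3):
    """Step S1 sketch: subtract the mean of all baseline segments from
    every experimental segment. raw: (channels, samples), baseline first.
    Segment length L = fs (one second) is an assumption."""
    L = fs
    baseline = raw[:, :baseline_sec * fs]              # K segments of length L
    trial = raw[:, baseline_sec * fs:]                 # I segments of length L
    B = baseline.reshape(raw.shape[0], -1, L)          # (C, K, L)
    E = trial.reshape(raw.shape[0], -1, L)             # (C, I, L)
    B_mean = B.mean(axis=1, keepdims=True)             # average baseline segment
    return (E - B_mean).reshape(raw.shape[0], -1)      # baseline-removed signal

# Example: one DEAP-style trial, 32 channels, 3 s baseline + 60 s signal at 128 Hz
raw = np.random.randn(32, (3 + 60) * 128)
print(remove_baseline(raw).shape)                      # (32, 7680)
```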
s2, performing multi-scale processing on the first electroencephalogram signal to obtain a second electroencephalogram signal;
because the potential electroencephalogram mode of the electroencephalogram signal is unknown and the relevant time scale is also unknown, the electroencephalogram signal can be processed in a multi-scale mode, the data size can be reduced, different scale modes can be learned by a machine, and the classification performance is improved. The mathematical definition of the multiscale process is:
Figure RE-870281DEST_PATH_IMAGE015
wherein the content of the first and second substances,
Figure RE-158043DEST_PATH_IMAGE016
in order to set the time scale for the device,
Figure RE-636036DEST_PATH_IMAGE017
is the length of the original brain electrical signal,
Figure RE-478090DEST_PATH_IMAGE018
is the signal value of the original brain electrical signal at the ith moment,
Figure RE-201195DEST_PATH_IMAGE019
is the second brain electrical signal, j is the index of the second brain electrical signal.
When $\tau = 1$, $y_j^{(1)} = x_j$; that is, the coarse-grained series is simply the original electroencephalogram signal.

When $\tau = 2$, $y^{(2)}$ is the coarse-grained time series formed by averaging the original electroencephalogram signal values at every two consecutive time points. That is, when $j = 1$, $y_1^{(2)} = (x_1 + x_2)/2$; when $j = 2$, $y_2^{(2)} = (x_3 + x_4)/2$; when $j = 3$, $y_3^{(2)} = (x_5 + x_6)/2$.

When $\tau = 3$, $y^{(3)}$ is the coarse-grained time series formed by averaging the original electroencephalogram signal values at every three consecutive time points. That is, when $j = 1$, $y_1^{(3)} = (x_1 + x_2 + x_3)/3$; when $j = 2$, $y_2^{(3)} = (x_4 + x_5 + x_6)/3$; when $j = 3$, $y_3^{(3)} = (x_7 + x_8 + x_9)/3$.

The coarse-grained time series are the second electroencephalogram signals.
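A minimal NumPy sketch of this coarse-graining (the same construction used in multiscale entropy analysis) follows; the function name is ours.

```python
import numpy as np

def coarse_grain(x, tau):
    """Step S2 multi-scale processing: y_j = mean of x over the j-th
    non-overlapping window of length tau, j = 1..floor(L/tau)."""
    L = len(x)
    n = L // tau                          # floor(L / tau) coarse-grained points
    return x[:n * tau].reshape(n, tau).mean(axis=1)

x = np.arange(1.0, 10.0)                  # x_1..x_9 = 1..9
print(coarse_grain(x, 1))                 # tau=1: the original signal
print(coarse_grain(x, 2))                 # [1.5 3.5 5.5 7.5], e.g. y_1=(x_1+x_2)/2
print(coarse_grain(x, 3))                 # [2. 5. 8.],        e.g. y_1=(x_1+x_2+x_3)/3
```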
S3, converting the second brain electrical signal into an image by using a time series imaging algorithm to obtain N image data sets;
the step of converting the second brain electrical signal into the image can utilize information in the original brain electrical signal and encode high-dimensional information into the image, so that the image contains rich information, the advantages of the existing machine vision can be fully utilized, and a better emotion classification result can be obtained. The invention adopts a time series imaging algorithm (MDF for short) to convert the second brain electrical signal.
First, from a time series after a certain coarse graining
Figure DEST_PATH_IMAGE032
Figure 446946DEST_PATH_IMAGE033
Take n values as a basic time series unit, and record as
Figure DEST_PATH_IMAGE034
An integer of (d); and setting the interval d of the values and the initial index s of the values.
When d =1, it indicates a time series from a certain coarse grain
Figure 126320DEST_PATH_IMAGE035
Taking n values continuously as a basic time sequence unit, then
Figure DEST_PATH_IMAGE036
When the n is =2, the number of the n is more than 2,
Figure 543658DEST_PATH_IMAGE037
when n =3, the number of the bits is increased,
Figure DEST_PATH_IMAGE038
when in use
Figure 825734DEST_PATH_IMAGE039
Time represents a time series from a certain coarse grain
Figure DEST_PATH_IMAGE040
Taking n values as a basic time sequence unit, then
Figure 301144DEST_PATH_IMAGE041
When n =2, the number of the bits is increased,
Figure DEST_PATH_IMAGE042
when the n is =3, the number of the n is more than 3,
Figure 276053DEST_PATH_IMAGE043
namely, it is
Figure DEST_PATH_IMAGE044
Then, the difference between the basic units is calculated according to different intervals d to obtain a new time sequence which is recorded as
Figure 915107DEST_PATH_IMAGE045
The concrete steps are shown as the following formula:
Figure DEST_PATH_IMAGE046
wherein the content of the first and second substances,
Figure 735296DEST_PATH_IMAGE047
as a result of this, it is possible to,
Figure DEST_PATH_IMAGE048
the lengths are different, and a new sequence needs to be constructed
Figure 510616DEST_PATH_IMAGE049
The formula is expressed as:
Figure DEST_PATH_IMAGE050
wherein the content of the first and second substances,
Figure 125268DEST_PATH_IMAGE051
then, an MDF matrix is constructed, the formula being:
Figure DEST_PATH_IMAGE052
when n is determined, the corresponding
Figure 251618DEST_PATH_IMAGE053
Namely have
Figure DEST_PATH_IMAGE054
Is composed of elements capable of generating
Figure 609918DEST_PATH_IMAGE055
A channel data, wherein
Figure DEST_PATH_IMAGE056
The matrix of individual channels can be defined as:
Figure 770904DEST_PATH_IMAGE057
wherein the content of the first and second substances,
Figure DEST_PATH_IMAGE058
to fill in
Figure 556457DEST_PATH_IMAGE059
The element with a value of 0 in the matrix, each channel of the MDF image is defined as:
Figure DEST_PATH_IMAGE060
wherein
Figure 435682DEST_PATH_IMAGE061
Representing a Hadamard product (Hadamard product),
Figure DEST_PATH_IMAGE062
is that
Figure 128832DEST_PATH_IMAGE063
Matrix rotation
Figure DEST_PATH_IMAGE064
The matrix of the latter is then formed,
Figure 239264DEST_PATH_IMAGE065
to prevent from
Figure DEST_PATH_IMAGE066
And
Figure 195718DEST_PATH_IMAGE067
when the two are added, they are overlapped,
Figure DEST_PATH_IMAGE068
meanwhile, after each channel data of the second electroencephalogram signal is converted into an image by using the time series imaging algorithm, the images are spliced into a large image according to the physical positions of the corresponding channels, and the spatial information of the electroencephalogram signal is reserved as much as possible, as shown in fig. 3. The letters in the figure indicate 32 electrode positions for electrode placement according to the international 10-20 system.
S4, performing data enhancement on the N image data sets to construct N samples and label sets;
Mixup is a simple and efficient data enhancement method that constructs new training samples and labels by linear interpolation. Using Mixup can remarkably enhance the generalization ability of the network architecture at very little computational cost. The specific process of this step is expressed by the formulas:

$$\tilde{x} = \lambda x_i + (1 - \lambda) x_j$$

$$\tilde{y} = \lambda y_i + (1 - \lambda) y_j$$

where $x_i$ and $x_j$ are sample data in the image dataset; $y_i$ and $y_j$ are the labels corresponding to $x_i$ and $x_j$; and $\lambda$ is a parameter obeying the $\mathrm{Beta}(\alpha, \alpha)$ distribution, $\lambda \in [0, 1]$, where $\alpha \in (0, \infty)$.
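A minimal NumPy sketch of Mixup under this formulation follows; alpha = 0.2 is a commonly used value, not one stated in the text.

```python
import numpy as np

def mixup(x1, y1, x2, y2, alpha=0.2):
    """Step S4 sketch: lambda ~ Beta(alpha, alpha), then linear
    interpolation of both the samples and their one-hot labels."""
    lam = np.random.beta(alpha, alpha)
    return lam * x1 + (1 - lam) * x2, lam * y1 + (1 - lam) * y2

x1, x2 = np.random.rand(3, 64, 64), np.random.rand(3, 64, 64)   # two images
y1, y2 = np.array([1.0, 0.0]), np.array([0.0, 1.0])             # one-hot labels
x_mix, y_mix = mixup(x1, y1, x2, y2)
```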
S5, passing the N samples and label sets through a ResNet model and a DNN-01 model respectively to obtain N first feature vectors;
The classical ResNet model is used as the feature extraction network. Compared with a traditional CNN (convolutional neural network), it has two differences: (1) the residual structure, which makes it possible to build very deep network structures, and (2) Batch Normalization layers. These solve two problems of traditional convolutional networks: (1) vanishing or exploding gradients, and (2) the degradation problem.
The ResNet18 network is mainly composed of an input layer, convolutional layers, Batch Normalization layers, activation functions, pooling layers, residual structures, a fully connected layer, and a softmax layer; the specific structure is shown in fig. 4.
In the embodiment of the application, when the MDF algorithm is used to convert the second electroencephalogram signal, $n$ is set to 2, 3 and 4 respectively, yielding three different image data sets. Considering that the images converted under different settings contain different information, in order to make full use of all the image data, the invention feeds the three image data sets into a ResNet18 model and a DNN-01 model respectively, extracts the high-level features of the different images, and obtains 3 first feature vectors, denoted $F_1$, $F_2$ and $F_3$.
The network structure of the DNN-01 model is shown in Table 1 below, where FL denotes a fully connected layer, RELU denotes the rectified linear unit activation function, Dr denotes dropout (random deactivation), and the numbers in the Output column denote the dimension of the features output by the layer.
TABLE 1 DNN-01 network structure
Although all three image data sets pass through ResNet18 and DNN-01 models of the same structure, the parameters after training differ because the image data differ.
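A PyTorch sketch of one such branch is given below. Because Table 1 survives only as an image, the DNN-01 layer widths here are placeholders, and torchvision's stock ResNet18 stands in for the ResNet backbone; treat it as a sketch of the branch structure, not the patent's exact configuration.

```python
import torch
import torch.nn as nn
from torchvision.models import resnet18

class Branch(nn.Module):
    """One of the three step-S5 branches: ResNet18 backbone followed by a
    small DNN head (layer widths are placeholders, not Table 1's values)."""
    def __init__(self, feat_dim=128):
        super().__init__()
        backbone = resnet18(weights=None)
        self.backbone = nn.Sequential(*list(backbone.children())[:-1])  # drop FC
        self.dnn01 = nn.Sequential(              # stand-in for DNN-01: FL+RELU+Dr
            nn.Linear(512, 256), nn.ReLU(), nn.Dropout(0.5),
            nn.Linear(256, feat_dim),
        )

    def forward(self, x):                        # x: (B, 3, H, W); 3 channels assumed
        h = self.backbone(x).flatten(1)          # (B, 512) pooled backbone features
        return self.dnn01(h)                     # a first feature vector F_i

branches = nn.ModuleList(Branch() for _ in range(3))  # one per data set (n = 2, 3, 4)
f1, f2, f3 = (b(torch.randn(4, 3, 64, 64)) for b in branches)
```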
S6, combining the 3 first feature vectors to form 3 second feature vectors;

As shown in fig. 4, in the present application a feature combiner (hereinafter Comber) module combines the 3 first feature vectors into 3 second feature vectors. The specific combination methods are as follows:

The first combination method, denoted SUM, adds the three first feature vectors $F_1$, $F_2$ and $F_3$ to obtain a new vector $F_{\mathrm{sum}}$:

$$F_{\mathrm{sum}} = F_1 + F_2 + F_3$$

The second combination method, denoted MAX, takes the maximum value of the three first feature vectors $F_1$, $F_2$ and $F_3$ at each position and forms a new vector $F_{\max}$ from these maxima:

$$F_{\max} = \max(F_1, F_2, F_3)$$

The third combination method, denoted FC, assumes that the three first feature vectors $F_1$, $F_2$ and $F_3$ are linearly related, and combines them with weights using a fully connected layer to obtain a new vector $F_{\mathrm{fc}}$:

$$F_{\mathrm{fc}} = w_1 F_1 + w_2 F_2 + w_3 F_3$$

where $w_1$, $w_2$ and $w_3$ are the weights of the three first feature vectors, i.e. parameters learned by training the fully connected layer.
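The three combination methods can be sketched in PyTorch as follows. The FC method is realized here as a single linear layer over the concatenated vectors, one possible reading of a learned weighted combination; names and dimensions are ours.

```python
import torch
import torch.nn as nn

class Comber(nn.Module):
    """Step S6 feature combiner: SUM, MAX and FC fusion of the three
    first feature vectors (a sketch of the methods described above)."""
    def __init__(self, feat_dim=128):
        super().__init__()
        # FC fusion: a fully connected layer learns the combination weights
        self.fc = nn.Linear(3 * feat_dim, feat_dim, bias=False)

    def forward(self, f1, f2, f3):
        f_sum = f1 + f2 + f3                              # SUM
        f_max = torch.max(torch.max(f1, f2), f3)          # MAX, element-wise
        f_fc = self.fc(torch.cat([f1, f2, f3], dim=1))    # FC, learned weights
        return f_sum, f_max, f_fc

comber = Comber()
f1, f2, f3 = (torch.randn(4, 128) for _ in range(3))
f_sum, f_max, f_fc = comber(f1, f2, f3)
```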
S7, feeding the 3 combined second feature vectors each into a DNN-02 model to form the multi-modal feature fusion electroencephalogram emotion classification model; the DNN-02 network structure is shown in Table 2.

TABLE 2 DNN-02 network structure
And S8, randomly dividing the N samples and label sets from step S4 into 10 parts using a ten-fold cross-validation method, taking 9 parts as training data and the remaining 1 part as test data, and training the multi-modal feature fusion electroencephalogram emotion classification model to obtain an electroencephalogram emotion classification recognition model.
In this embodiment, the electroencephalogram signal of each experimenter is converted using MDF with n = 2, 3 and 4, giving three different image data sets and 2400 pictures in total. The ten-fold cross-validation method randomly divides the image data sets into 10 parts, of which 9 parts are taken in turn as training data and 1 part as test data. The training-set pictures are then fed to the multi-modal feature fusion electroencephalogram emotion classification model obtained in step S7 for training, with the Adam optimizer, a learning rate of 0.0001, the cross-entropy loss function, and a batch size of 32, yielding the electroencephalogram emotion classification recognition model.
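A skeleton of this training setup might look as follows; the model factory, image tensor and label tensor are assumed names, one training epoch per fold is shown for brevity, and per-fold evaluation is elided.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.model_selection import KFold

def train_ten_fold(model_fn, images, labels, device="cpu"):
    """Step S8 sketch: ten-fold cross-validation with Adam (lr = 0.0001),
    cross-entropy loss and batch size 32, as stated in the embodiment."""
    for tr, te in KFold(n_splits=10, shuffle=True).split(np.arange(len(images))):
        model = model_fn().to(device)
        opt = torch.optim.Adam(model.parameters(), lr=1e-4)
        loss_fn = nn.CrossEntropyLoss()
        model.train()
        for i in range(0, len(tr), 32):                  # mini-batches of 32
            idx = tr[i:i + 32]
            opt.zero_grad()
            loss = loss_fn(model(images[idx].to(device)), labels[idx].to(device))
            loss.backward()
            opt.step()
        # ...evaluate on images[te], labels[te] for this fold...
```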
In this embodiment, a total of 4 loss functions are defined, $L_1$, $L_2$, $L_3$ and $L_{\mathrm{com}}$, to train the classification network. $L_1$, $L_2$ and $L_3$ optimize the losses of the three different image data sets passed through the individual ResNet18 and DNN-01 network structures, and $L_{\mathrm{com}}$ optimizes the loss of the second feature vectors passed respectively through the DNN-02 model. The loss $L$ of the entire network structure is optimized according to the formula:

$$L = \lambda_1 \sum_{i=1}^{3} L_i + \lambda_2 L_{\mathrm{com}}$$

where $\lambda_1$ and $\lambda_2$ are preset parameters that focus learning on specific features or on their combination; in this embodiment they are set to 1/3 and 1.0, making the contribution of each image data set equal during training.
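Under this reading, the overall loss can be sketched as below; for brevity a single DNN-02 output is shown for the combined features (the embodiment passes each of the three combined vectors through DNN-02), and all names are ours.

```python
import torch
import torch.nn as nn

def total_loss(branch_logits, fused_logits, targets, lam1=1/3, lam2=1.0):
    """Sketch of the embodiment's overall loss
    L = lam1 * (L1 + L2 + L3) + lam2 * Lcom, with cross-entropy throughout.
    branch_logits: the three per-branch outputs; fused_logits: the DNN-02
    output on the combined features."""
    ce = nn.CrossEntropyLoss()
    branch_term = sum(ce(logits, targets) for logits in branch_logits)  # L1+L2+L3
    return lam1 * branch_term + lam2 * ce(fused_logits, targets)        # + Lcom

targets = torch.randint(0, 2, (4,))
branch_logits = [torch.randn(4, 2, requires_grad=True) for _ in range(3)]
fused_logits = torch.randn(4, 2, requires_grad=True)
loss = total_loss(branch_logits, fused_logits, targets)
loss.backward()
```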

Claims (7)

1. A multi-modal feature fusion electroencephalogram emotion recognition method based on multi-scale imaging, characterized by comprising the following steps:
s1, performing baseline removal on the original electroencephalogram signal by using a python code to obtain a first electroencephalogram signal;
s2, carrying out multi-scale processing on the first electroencephalogram signal to obtain a second electroencephalogram signal;
s3, converting the second brain electrical signal into an image by using a time series imaging algorithm to obtain N image data sets;
s4, performing data enhancement on the N image data sets to construct N samples and label sets;
s5, obtaining N first feature vectors by the N samples and the label sets through a ResNet model and a DNN-01 model respectively;
s6, combining the N first feature vectors to form 3 second feature vectors;
s7, forming a multi-modal feature fusion electroencephalogram emotion classification model by the 3 second feature vectors through a DNN-02 model respectively;
and S8, randomly dividing the N samples and label sets from step S4 into ten parts using a ten-fold cross-validation method, taking nine parts as training data and the remaining part as test data, and training the multi-modal feature fusion electroencephalogram emotion classification model to obtain an electroencephalogram emotion classification recognition model.
2. The multi-modal feature fusion electroencephalogram emotion recognition method based on multi-scale imaging as claimed in claim 1, characterized in that the baseline removal includes the following: the baseline signal and the experimental signal in the original electroencephalogram signal are divided into K segments and I segments of length L respectively, and the average value of all baseline signal segments is subtracted from each experimental signal segment.
3. The multi-modal feature fusion electroencephalogram emotion recognition method based on multi-scale imaging as claimed in claim 1, characterized in that the mathematical definition of the multi-scale processing is:

$$y_j^{(\tau)} = \frac{1}{\tau} \sum_{i=(j-1)\tau + 1}^{j\tau} x_i, \qquad 1 \le j \le \left\lfloor \frac{L}{\tau} \right\rfloor$$

where $\tau$ is the set time scale, $L$ is the length of the original electroencephalogram signal, $x_i$ is the signal value of the original electroencephalogram signal at time $i$, $y_j^{(\tau)}$ is the second electroencephalogram signal, and $j$ is the index of the second electroencephalogram signal.
4. The multi-modality feature fusion electroencephalogram emotion recognition method based on multi-scale imaging as claimed in claim 1, wherein: the data enhancement is Mixup.
5. The multi-modal feature fusion electroencephalogram emotion recognition method based on multi-scale imaging as claimed in claim 1, characterized in that the 3 second feature vectors are: a second feature vector formed by adding the N first feature vectors; a second feature vector composed of the maximum values at the same positions of the N first feature vectors; and a second feature vector formed by weighted combination of the N first feature vectors using a fully connected layer.
6. The multi-modal feature fusion electroencephalogram emotion recognition method based on multi-scale imaging as claimed in claim 5, characterized in that the weights in the weighted combination are the parameters of the N first feature vectors after fully-connected-layer training.
7. The multi-modal feature fusion electroencephalogram emotion recognition method based on multi-scale imaging as claimed in claim 1, characterized in that the loss calculation formula of the multi-modal feature fusion electroencephalogram emotion classification model is:

$$L = \lambda_1 \sum_i L_i + \lambda_2 L_{\mathrm{com}}$$

where $\lambda_1$ and $\lambda_2$ are preset parameters; $L_i$ is the loss of the N samples and label sets passed through the ResNet model and the DNN-01 model, $i$ being an integer in $\{1, \ldots, N\}$; and $L_{\mathrm{com}}$ is the loss of the 3 second feature vectors passed respectively through the DNN-02 model.
CN202210440906.XA 2022-04-26 2022-04-26 Multi-modal feature fusion electroencephalogram emotion recognition method based on multi-scale imaging Pending CN114767130A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210440906.XA CN114767130A (en) 2022-04-26 2022-04-26 Multi-modal feature fusion electroencephalogram emotion recognition method based on multi-scale imaging

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210440906.XA CN114767130A (en) 2022-04-26 2022-04-26 Multi-modal feature fusion electroencephalogram emotion recognition method based on multi-scale imaging

Publications (1)

Publication Number Publication Date
CN114767130A true CN114767130A (en) 2022-07-22

Family

ID=82432172

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210440906.XA Pending CN114767130A (en) 2022-04-26 2022-04-26 Multi-modal feature fusion electroencephalogram emotion recognition method based on multi-scale imaging

Country Status (1)

Country Link
CN (1) CN114767130A (en)


Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115238835A (en) * 2022-09-23 2022-10-25 华南理工大学 Electroencephalogram emotion recognition method, medium and equipment based on double-space adaptive fusion
CN115644870A (en) * 2022-10-21 2023-01-31 东北林业大学 Electroencephalogram signal emotion recognition method based on TSM-ResNet model
CN115644870B (en) * 2022-10-21 2024-03-08 东北林业大学 Electroencephalogram signal emotion recognition method based on TSM-ResNet model


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination