CN114176607B - Electroencephalogram signal classification method based on vision Transformer - Google Patents
Electroencephalogram signal classification method based on vision Transformer
- Publication number
- CN114176607B (application CN202111616915.1A)
- Authority
- CN
- China
- Prior art keywords
- data
- eeg
- num
- module
- size
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- A61B5/369 — Electroencephalography [EEG]
- A61B5/165 — Evaluating the state of mind, e.g. depression, anxiety
- A61B5/168 — Evaluating attention deficit, hyperactivity
- A61B5/18 — Devices for psychotechnics; evaluating the psychological state of vehicle drivers or machine operators
- A61B5/374 — Detecting the frequency distribution of signals, e.g. detecting delta, theta, alpha, beta or gamma waves
- A61B5/7203 — Signal processing for noise prevention, reduction or removal
- A61B5/7235 — Details of waveform analysis
- A61B5/725 — Waveform analysis using specific filters, e.g. Kalman or adaptive filters
- A61B5/7267 — Classification of physiological signals or data involving training the classification device
- G06F18/2415 — Classification techniques based on parametric or probabilistic models
- G06N3/045 — Combinations of networks
- G06N3/08 — Learning methods
- A61B2503/22 — Motor vehicle operators, e.g. drivers, pilots, captains
Abstract
The invention discloses an electroencephalogram (EEG) signal classification method based on a vision Transformer. The method first preprocesses the data to obtain labeled, processed EEG data; then constructs an EEG signal classification model based on Vision Transformer; and finally trains the EEG signal classification model with the preprocessed EEG data. The method embeds EEG samples through a feature embedding scheme suited to EEG data, then learns both the local features of each EEG sample and the long-range temporal dependencies between consecutive EEG signals, achieving good performance on EEG signal classification tasks.
Description
Technical Field
The invention relates to the field of electroencephalogram signal recognition within biometric recognition, and in particular to an electroencephalogram signal classification method based on the vision Transformer (Vision Transformer, ViT).
Background
Electroencephalography (EEG) records brain activity with electrophysiological measurements: the signal is formed by summing the postsynaptic potentials generated synchronously by large populations of neurons when the brain is active. It captures the electrical changes during brain activity and is an overall reflection of the electrophysiological activity of brain nerve cells at the cerebral cortex or scalp surface. EEG signals carry rich, varied, and objective physiological information, and are widely studied and analyzed in brain-computer interfaces (Brain Computer Interface, BCI) to establish a direct connection and information exchange between a human or animal brain and external devices. Because EEG is more objective and convenient than other physiological measurements such as nuclear magnetic resonance, it is also used to judge physiological states, for example analyzing a person's emotion from the EEG signal, or judging whether a driver is fatigued in fatigue-driving applications.
Traditional EEG analysis methods operate in the time domain or the frequency domain. Common time-domain methods include waveform characterization and autoregressive (AR) models. Common frequency-domain methods include the Fourier transform, power spectral density, non-parametric spectrum estimation, and AR-model-based power spectrum estimation. These traditional methods require manual feature extraction from the EEG signal, which is time-consuming and labor-intensive. With the development of artificial intelligence, more and more learning-based methods are applied to EEG analysis: classical machine learning methods such as KNN, SVM, and LDA, and deep learning methods such as CNN, RNN, and LSTM. Machine learning and deep learning methods automatically extract and learn features from EEG signals and have greatly advanced EEG analysis. Although these methods have met with some success, they do not fully exploit the distinctive characteristics of EEG signals.
The EEG signal is a time-series signal with long-range dependencies. The Transformer model proposed in the paper Attention Is All You Need has been extremely successful in the NLP field; it can learn long-range dependencies within sentences, but applying it directly to EEG classification does not achieve ideal results. The paper An Image Is Worth 16x16 Words: Transformers for Image Recognition at Scale proposes the ViT model for the image domain, where it has been extremely successful, and an EEG signal can in a certain sense be viewed as a special image. However, using ViT directly for EEG classification also fails to achieve ideal results, because it has no module for learning local features, which are important for EEG signals.
Disclosure of Invention
To address the poor performance of applying the Transformer to EEG signal classification and its inability to learn local features, the invention provides an EEG signal classification method based on the vision Transformer, EEG Vision Transformer (EEGViT), which learns local features through an EEG Transformer Encoder module and learns the temporal dependencies between consecutive samples through a Sequence In Time Transformer Encoder module, thereby improving classification performance.
To overcome the shortcomings of existing methods, the invention adopts the following technical scheme:
An electroencephalogram signal classification method based on the vision Transformer comprises the following steps:
Step 1: data preprocessing, namely acquiring labeled processed EEG data:
Step 2: establishing an electroencephalogram signal classification model based on Vision Transformer;
step 3: training an electroencephalogram classification model by the preprocessed EEG data:
Further, the specific method of step 1 is as follows:
The public SEED emotion dataset is adopted. The preprocessed EEG data provided in the SEED dataset is segmented into 1-second windows, and differential entropy features are extracted from each EEG channel of the segmented data, yielding differential entropy feature data of 62 channels x 5 frequency bands. The data is finally flattened into a one-dimensional sample of length 310, whose label is the label of the corresponding stimulus emotion. Each group of num consecutive samples forms one input to the model.
Further, the specific method of step 2 is as follows:
The Vision Transformer-based EEG classification model (EEG Vision Transformer) improves on Vision Transformer: a CNN module is added after the multi-head attention in the Transformer Encoder of Vision Transformer to form the EEG Transformer Encoder module, and an MBConv module is added after the multi-head attention in the Transformer Encoder to form the Sequence In Time Transformer Encoder module. The classification model contains num EEG Transformer Encoder modules, which learn the local features of num consecutive samples; the learned local features of the num samples are input to the Sequence In Time Transformer Encoder module to learn the temporal dependencies between the samples, yielding num tokens with temporal dependencies. Finally, the num tokens are input to num classifiers for classification, where each classifier is an MLP module.
Further, the specific method of step 3 is as follows:
Input: the labeled, processed EEG data samples, where each group of num consecutive samples forms one input to the model (total_group groups in total); the maximum iteration number N; the EEG sample size EEG_size; the patch size patch_size; and the embedding dimension embed_dim.
Step 3.1: initializing:
The num EEG Transformer Encoder modules, the num classifiers, and one Sequence In Time Transformer Encoder module are initialized, and the initial iteration number is set to t=1.
Step 3.2: the local features of the individual samples are learned by EEG Transformer Encoder module.
Step 3.3: the time sequence dependence among samples is learned by Sequence In Time Transformer Encoder modules.
Step 3.4: the learned features are classified by a classifier.
Step 3.5: Calculate the cross entropy from the classification result and the true label, and update the model parameters by minimizing the cross-entropy loss.
Step 3.6: steps 3.2 to 3.5 are performed once for each set of data entered;
Step 3.7: after all input data are processed, the iteration times t=t+1 return to the step 3.2 for iteration until the set iteration times are reached;
Output: the predicted probability of each electroencephalogram signal class.
Compared with the prior art, the invention has the beneficial effects that:
(1) The invention provides a feature embedding method suitable for EEG samples.
(2) The invention can learn the local characteristics of the input data through the EEG Transformer Encoder module.
(3) According to the invention, the time dependency relationship between the continuous samples is learned through the Sequence In Time Transformer Encoder module, so that better classification performance is obtained.
In summary, the EEG Vision Transformer (EEGViT) provided by the invention embeds EEG samples through a feature embedding method suited to EEG data, then learns the local features of each EEG sample and the long-range temporal dependencies between consecutive EEG signals, obtaining better performance on EEG signal classification tasks.
Drawings
FIG. 1 is a flow chart of an embodiment of the present invention;
FIG. 2 is a block diagram of EEG Vision Transformer modules;
Fig. 3 is a block diagram of Sequence In Time Transformer Encoder modules.
Detailed Description
The invention is further described below with reference to the drawings and examples.
The flow of the present invention is shown in figure 1.
Step 1: data preprocessing, namely acquiring labeled processed EEG data:
The public SEED emotion dataset is adopted; its Preprocessed_EEG folder contains EEG data downsampled to 200 Hz and preprocessed with a 0-75 Hz band-pass filter. As shown in the data-processing part of Fig. 1, the preprocessed EEG data provided in the SEED dataset is segmented into 1-second windows, and differential entropy features are extracted from each EEG channel of the segmented data, where the differential entropy is defined as follows:
DE(X) = −∫ f(x) log f(x) dx = (1/2) log(2πeσ²), with f(x) = (1/√(2πσ²)) exp(−(x−μ)²/(2σ²))
Where X follows a Gaussian distribution N(μ, σ²), μ is the mean of the distribution, σ is the standard deviation of the distribution, x is the variable, π and e are constants, exp is the exponential operation, and log is the logarithmic operation.
After differential entropy feature extraction, differential entropy feature data of 62 channels x 5 frequency bands is obtained, which is finally flattened into a one-dimensional sample of length 310; the label of the sample is the label of the corresponding stimulus emotion. Each group of num consecutive samples forms one input to the model.
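As a rough illustration of this preprocessing step, the sketch below computes per-channel, per-band differential entropy under the Gaussian assumption and flattens the result into a length-310 sample. The random input array stands in for one second of band-filtered SEED data; its shape (62 channels x 5 bands x 200 samples at 200 Hz) follows the description above, but the data itself is synthetic.

```python
import numpy as np

def differential_entropy(segment):
    """Differential entropy of a 1-D segment, assuming the band-filtered
    samples are approximately Gaussian: DE = 0.5 * log(2 * pi * e * sigma^2)."""
    sigma2 = np.var(segment)
    return 0.5 * np.log(2 * np.pi * np.e * sigma2)

# Synthetic stand-in for one 1-second window: 62 channels x 5 bands x 200 samples.
rng = np.random.default_rng(0)
segment = rng.normal(size=(62, 5, 200))

# One DE value per (channel, band) -> 62 x 5 features,
# flattened into the one-dimensional length-310 sample described above.
de = np.apply_along_axis(differential_entropy, -1, segment)
sample = de.reshape(-1)
print(sample.shape)  # (310,)
```

In the actual method each of these length-310 samples would carry the emotion label of its stimulus, and num consecutive samples would be grouped as one model input.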
Step 2: establishing an electroencephalogram signal classification model based on Vision Transformer;
The Vision Transformer-based EEG classification model EEG Vision Transformer is composed of num EEG Transformer Encoder modules, one Sequence In Time Transformer Encoder module, and num classifiers. The EEG Transformer Encoder is formed by adding a CNN module after the multi-head attention in the Transformer Encoder of Vision Transformer; its structure is shown in Fig. 2. The Embedded Patches are the input of the module: they are first normalized and then fed into the multi-head self-attention module. A shortcut branch starts after the Embedded Patches, so the Embedded Patches are added to the output of the multi-head attention module before being sent on to the following modules: a norm module, the CNN module, another norm module, and an MLP module. Because the input sample is one-dimensional, the CNN module uses one-dimensional convolution. A second shortcut branch connects the data produced by that first residual addition to the final output: it is added to the MLP output to form the final output of the module. The Sequence In Time Transformer Encoder is formed by adding an MBConv module after the multi-head attention in the Transformer Encoder of Vision Transformer; the MBConv module structure is introduced in the paper EfficientNetV2: Smaller Models and Faster Training. The Sequence In Time Transformer Encoder structure is shown in Fig. 3; it is similar to the EEG Transformer Encoder module, with the CNN module replaced by the MBConv module. The classifier is an MLP module.
Step 3: training an electroencephalogram classification model by the preprocessed EEG data:
Input: the labeled processed EEG data samples, the consecutive num samples are 1 set of input models, total_group, maximum iteration number N, EEG sample size EEG_size, patch block size (patch_size needs to be divisible by EEG_size), and embedded dimension size emped_dim.
Step 3.1: initializing: randomly initializing num EEG Transformer Encoder modules, sequence In Time Transformer Encoder modules, hyper-parameters of num classifiers and initial iteration times t=1.
Step 3.2: learning local features of a single sample;
The num input samples learn the local features of the individual samples through the num EEG Transformer Encoder modules; the specific steps of each EEG Transformer Encoder are as follows:
a) Feature segmentation: unlike the two-dimensional image input of ViT, each sample of EEGViT is one-dimensional feature data of length 310 (5 frequency bands x 62 channels); the input one-dimensional feature is segmented into EEG_size/patch_size one-dimensional data segments of size patch_size.
b) Feature embedding: the segmented data segments are projected to the embedding dimension embed_dim, and a token for local feature learning is added, yielding the Embedded Patches. The processed Embedded Patches are input to the EEG Transformer Encoder to learn local features.
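The segmentation and embedding steps above can be sketched as follows. The concrete values patch_size = 10 and embed_dim = 32 are assumptions for illustration (the patent fixes only EEG_size = 310 and requires patch_size to divide it), and the learned projection is replaced by a random matrix:

```python
import numpy as np

# Assumed sizes: eeg_size = 310 (from the patent); patch_size = 10 and
# embed_dim = 32 are illustrative choices, not values from the patent.
eeg_size, patch_size, embed_dim = 310, 10, 32

rng = np.random.default_rng(0)
x = rng.normal(size=eeg_size)                            # one flattened EEG sample

# a) Feature segmentation: eeg_size / patch_size = 31 one-dimensional segments.
patches = x.reshape(eeg_size // patch_size, patch_size)  # (31, 10)

# b) Feature embedding: project each segment to embed_dim and prepend
#    an extra token used for local feature learning.
W = rng.normal(size=(patch_size, embed_dim))             # stand-in projection
cls_token = np.zeros((1, embed_dim))
tokens = np.vstack([cls_token, patches @ W])             # (31 + 1, embed_dim)
print(tokens.shape)  # (32, 32)
```

The resulting (EEG_size/patch_size + 1) x embed_dim token matrix corresponds to the Embedded Patches fed into the EEG Transformer Encoder.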
c) Local feature learning: the local features are learned by the EEG Transformer Encoder. Unlike the Transformer Encoder of ViT, the EEG Transformer Encoder adds a CNN module between the Multi-Head Attention module and the MLP module in order to learn and extract the local feature information of the input EEG data. The Embedded Patches are first normalized and then input to the Multi-Head Attention module, whose formula is:
MultiHead(Q,K,V) = Concat(head_1, …, head_h)W^O
head_i = Attention(QW_i^Q, KW_i^K, VW_i^V)
Where Q is the query matrix, K is the key matrix, V is the value matrix; W_i^Q, W_i^K, and W_i^V are the parameter matrices of the i-th head for the query, key, and value; W^O is the output parameter matrix; and Concat is the concatenation operation.
Wherein the Attention is computed as:
Attention(Q,K,V) = softmax(QK^T / √d_k) V
Where Q is the query matrix, K is the key matrix, V is the value matrix, d_k is the dimension of the query and key matrices, and softmax is the softmax operation.
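As a rough NumPy illustration of the multi-head attention described above: the head count h = 4, token count 32, and embed_dim = 32 are assumed values for this sketch, and all parameter matrices are random stand-ins for learned weights.

```python
import numpy as np

def softmax(z, axis=-1):
    z = z - z.max(axis=axis, keepdims=True)   # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=axis, keepdims=True)

def attention(Q, K, V):
    # Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
    d_k = Q.shape[-1]
    return softmax(Q @ K.T / np.sqrt(d_k)) @ V

def multi_head(X, Wq, Wk, Wv, Wo):
    # head_i = Attention(X Wq_i, X Wk_i, X Wv_i); concatenate, then project by Wo.
    heads = [attention(X @ wq, X @ wk, X @ wv) for wq, wk, wv in zip(Wq, Wk, Wv)]
    return np.concatenate(heads, axis=-1) @ Wo

rng = np.random.default_rng(0)
n_tokens, embed_dim, h = 32, 32, 4        # assumed sizes
d = embed_dim // h
X = rng.normal(size=(n_tokens, embed_dim))
Wq = rng.normal(size=(h, embed_dim, d))
Wk = rng.normal(size=(h, embed_dim, d))
Wv = rng.normal(size=(h, embed_dim, d))
Wo = rng.normal(size=(embed_dim, embed_dim))

out = multi_head(X, Wq, Wk, Wv, Wo)
print(out.shape)  # (32, 32): one updated vector per token
```

Each output row mixes information from all patches, which is how the interdependencies between the segmented data are learned.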
Through the Multi-Head Attention module, the interdependencies between the segmented data are learned.
The output of the Multi-Head Attention module is added to the Embedded Patches, normalized, and then sent to the CNN module to learn the local features of the input data. Thanks to convolution-kernel parameter sharing and sparse inter-layer connections, the CNN module can extract local features of the input data with little computation. The CNN module uses a convolution kernel of size 3 with stride 1.
The output of the CNN module is normalized and input to the MLP module to further learn hidden features in the data. Since a convolution with kernel size 3 and stride 1 reduces the input length by 2, the MLP input layer has size embed_dim−2; the hidden layer is enlarged to learn hidden features and has size embed_dim×4; and the output layer has size embed_dim. The output of the MLP module is added to the data obtained from the first residual addition (Multi-Head Attention output plus Embedded Patches) to form the output of the EEG Transformer Encoder. This output consists of EEG_size/patch_size+1 tokens, numbered 0 through EEG_size/patch_size; the local-feature token numbered 0 is returned for feature learning in subsequent modules.
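The size arithmetic above (kernel 3, stride 1, no padding shrinks each token by 2, giving the MLP input size embed_dim−2) can be checked with a minimal "valid" 1-D convolution; embed_dim = 32 and the averaging kernel are assumed values for illustration:

```python
import numpy as np

def conv1d_valid(x, kernel):
    # 'Valid' 1-D convolution, stride 1, no padding:
    # output length = len(x) - len(kernel) + 1.
    k = len(kernel)
    return np.array([np.dot(x[i:i + k], kernel) for i in range(len(x) - k + 1)])

embed_dim = 32                       # assumed embedding dimension
token = np.arange(embed_dim, dtype=float)
kernel = np.ones(3) / 3.0            # kernel size 3, stride 1, as in the text

y = conv1d_valid(token, kernel)
print(len(y))  # embed_dim - 2 = 30 -> the MLP input layer size
```

This is why the MLP input layer in the EEG Transformer Encoder is embed_dim−2 rather than embed_dim.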
Since the CNN module and the MLP module are well known to those skilled in the art, the structure thereof is not explained in detail.
Step 3.3: The num local features learned by the num EEG Transformer Encoder modules constitute new Embedded Patches and are sent to the Sequence In Time Transformer Encoder module to learn the dependencies among the num consecutive samples. The learning process of the Sequence In Time Transformer Encoder is similar to that of the EEG Transformer Encoder: the num local features learn their interdependencies through the Multi-Head Attention module, the MBConv module then learns further features between samples, and finally the MLP module learns the hidden features among the samples. The MLP input layer has size num, the hidden layer size is set to num×4, and the output layer has size num. The module finally outputs num feature tokens with mutual temporal dependencies.
Step 3.4: The returned num tokens are input to the num classifiers for classification. Each classifier is an MLP module with input layer size embed_dim and output layer size equal to the number of classes (3 for the SEED dataset), with no hidden layer. The num classifiers finally output the predicted probabilities that the num samples belong to each class.
Step 3.5: the cross entropy is calculated according to the classification result and the real label,
Where M is the number of classes, y ic is the sign function, 1 is taken when the class is equal to the real label, otherwise 0 is taken, p ic is the predicted probability that the sample belongs to class c, and num is the number of samples.
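A minimal sketch of this cross-entropy calculation, using made-up predictions for num = 4 samples over the M = 3 SEED emotion classes (the probability values are illustrative only):

```python
import numpy as np

def cross_entropy(p, y):
    """Mean cross-entropy over num samples.
    p: (num, M) predicted class probabilities; y: (num,) integer true labels."""
    num = p.shape[0]
    # Only the y_ic = 1 terms survive the inner sum, i.e. log p[i, y[i]].
    return -np.mean(np.log(p[np.arange(num), y]))

p = np.array([[0.7, 0.2, 0.1],
              [0.1, 0.8, 0.1],
              [0.2, 0.2, 0.6],
              [0.3, 0.4, 0.3]])
y = np.array([0, 1, 2, 1])

loss = cross_entropy(p, y)
print(round(float(loss), 4))  # 0.5017
```

The loss shrinks as the probability assigned to each true class grows, which is what minimizing it during training encourages.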
The model parameters are updated by minimizing the cross-entropy loss; the optimization method adopted is stochastic gradient descent (SGD):
g = (1/m) ∇_θ Σ_{j=1}^{m} L(x_j, y_j; θ)
θ ← θ − εg# (5)
Where m is the number of samples drawn from the input to update the parameters, g is the gradient, ε is the learning rate, and θ denotes the parameters of the model.
Step 3.6: steps 3.2 to 3.5 are performed once for each set of data;
Step 3.7: after all input data are processed, the iteration times t=t+1 return to the step 3.2 for iteration until the set iteration times are reached;
Output: the predicted probability that the electroencephalogram signal belongs to each category; the category with the highest probability is taken as the predicted category.
Claims (7)
1. An electroencephalogram signal classification method based on the vision Transformer, characterized by comprising the following steps:
Step 1: data preprocessing, namely acquiring labeled processed EEG data:
Step 2: establishing an electroencephalogram signal classification model based on Vision Transformer;
The Vision Transformer-based electroencephalogram signal classification model EEG Vision Transformer is composed of num EEG Transformer Encoder modules, one Sequence In Time Transformer Encoder module and num classifiers;
EEG Transformer Encoder is formed by adding a CNN module after multi-head attention in Transformer Encoder of Vision Transformer; sequence In Time Transformer Encoder is constituted by adding a MBConv module to the multi-head attention in Transformer Encoder of Vision Transformer;
the num EEG Transformer Encoder modules are used for learning local features of continuous num samples, and the learned local features of the num samples are input into the Sequence In Time Transformer Encoder module to learn time sequence dependence among the samples to obtain num token with time sequence dependence; finally, inputting num token into num classifiers for classification, wherein the classifiers are MLP modules;
step 3: and training an electroencephalogram signal classification model through the preprocessed EEG data.
2. The electroencephalogram signal classification method based on the vision Transformer according to claim 1, wherein the specific method of step 1 is as follows:
Adopting the public SEED emotion dataset; segmenting the preprocessed EEG data provided in the SEED dataset into 1-second windows, and extracting differential entropy features from each EEG channel of the segmented data to obtain differential entropy feature data of 62 channels x 5 frequency bands; finally flattening the data into a one-dimensional sample of length 310, wherein the label of the sample is the label of the corresponding stimulus emotion; each group of num consecutive samples forms one input to the model.
3. The electroencephalogram signal classification method based on the vision Transformer according to claim 2, wherein the specific method of step 3 is as follows:
Input: the labeled, processed EEG data samples, wherein each group of num consecutive samples forms one input to the model (total_group groups in total); the maximum iteration number N; the EEG sample size EEG_size; the patch size patch_size; and the embedding dimension embed_dim;
step 3.1: initializing:
initializing num EEG Transformer Encoder modules and num classifiers, one Sequence In Time Transformer Encoder module, and performing initial iteration times t=1;
step 3.2: learning the local features of each sample through the EEG Transformer Encoder modules;
step 3.3: learning the time-sequence dependencies among samples through the Sequence In Time Transformer Encoder module;
step 3.4: classifying the learned features through the classifiers;
step 3.5: calculating the cross entropy according to the classification results and the real labels, and updating the model parameters by minimizing the cross entropy loss;
step 3.6: performing steps 3.2 to 3.5 once for each group of input data;
step 3.7: after all input data are processed, incrementing the iteration count t=t+1 and returning to step 3.2 until the set number of iterations is reached;
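The iteration structure of steps 3.1 to 3.7 can be sketched as the skeleton below; `step_fn` is a placeholder for one pass of steps 3.2 to 3.5 (encode, sequence-encode, classify, update) on one group of num consecutive samples, not the claimed model itself.

```python
import numpy as np

def run_training(step_fn, groups, max_iters):
    """Outer training loop: iterate max_iters (N) times over all
    total_group groups, applying step_fn (a stand-in for steps 3.2-3.5)
    once per group. Returns the per-group losses."""
    losses = []
    t = 1                                    # step 3.1: initial iteration count
    while t <= max_iters:                    # step 3.7: repeat until N reached
        for group in groups:                 # step 3.6: each group once
            losses.append(step_fn(group))    # steps 3.2-3.5
        t += 1
    return losses

# toy usage: total_group = 3 groups of num = 4 samples, a dummy "loss"
groups = [np.ones((4, 310)) for _ in range(3)]
losses = run_training(lambda g: float(g.mean()), groups, max_iters=2)
```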
4. The electroencephalogram signal classification method based on a vision Transformer according to claim 3, wherein the specific method of step 3.2 is as follows:
The num input samples learn the local features of each individual sample through the num EEG Transformer Encoder modules; the specific steps of each EEG Transformer Encoder are as follows:
a) Feature segmentation: unlike the two-dimensional picture input of ViT, each sample of EEGViT is one-dimensional feature data of length 310 (62 channels × 5 frequency bands), and the input one-dimensional feature is segmented into EEG_size/patch_size one-dimensional data segments according to the size patch_size;
b) Feature embedding: flattening the segmented data segments according to the embedding dimension embed_dim, and adding a token for local feature learning to obtain the Embedded Patches; the processed Embedded Patches are input into the EEG Transformer Encoder to learn local features;
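Steps a) and b) can be sketched as follows, assuming a linear projection of each patch to embed_dim and a prepended learnable token; `W_patch` and `cls_token` are illustrative trainable parameters, not names from the claim.

```python
import numpy as np

def embed_patches(sample, patch_size, W_patch, cls_token):
    """Split one length-EEG_size sample into EEG_size/patch_size
    one-dimensional segments, project each to embed_dim, and prepend
    a token for local feature learning."""
    eeg_size = sample.shape[0]
    n_patches = eeg_size // patch_size
    patches = sample[:n_patches * patch_size].reshape(n_patches, patch_size)
    tokens = patches @ W_patch                       # (n_patches, embed_dim)
    return np.vstack([cls_token[None, :], tokens])   # (n_patches + 1, embed_dim)

rng = np.random.default_rng(1)
sample = rng.standard_normal(310)            # EEG_size = 310
W = rng.standard_normal((10, 16)) * 0.1      # patch_size = 10, embed_dim = 16
cls = np.zeros(16)
embedded = embed_patches(sample, 10, W, cls)  # 31 patches + 1 token
```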
c) Local feature learning: the local features are learned by the EEG Transformer Encoder; unlike the Transformer Encoder of ViT, in order to learn the local characteristic information of the input brain data, the EEG Transformer Encoder adds a CNN module between the Multi-Head Attention module and the MLP module for learning and extracting local characteristics of the input data; the Embedded Patches are first normalized and then input into the Multi-Head Attention module, wherein the formula of the Multi-Head Attention is:
MultiHead(Q,K,V) = Concat(head_1,…,head_h)W^O, where head_i = Attention(QW_i^Q, KW_i^K, VW_i^V)   (1)
where Q is the query matrix, K is the key matrix, V is the value matrix, W_i^Q is the parameter matrix of the i-th head for the query matrix, W_i^K is the parameter matrix of the i-th head for the key matrix, W_i^V is the parameter matrix of the i-th head for the value matrix, W^O is the output parameter matrix, and Concat is the concatenation operation;
wherein the equation of the Attention is:
Attention(Q,K,V) = softmax(QK^T/√d_k)V   (2)
where Q is the query matrix, K is the key matrix, V is the value matrix, d_k is the dimension of the query and key matrices, and softmax is the softmax operation;
Through the Multi-Head Attention module, the interdependence among the segmented data segments can be learned;
The data output by the Multi-Head Attention module is added to the Embedded Patches data, normalized, and then sent into the CNN module to learn local characteristics of the input data; thanks to convolution kernel parameter sharing and the sparsity of inter-layer connections, the CNN module can extract local features of the input data with a small amount of calculation; the convolution kernel size of the CNN module is 3 and the stride is 1;
The data output by the CNN module is normalized and input into the MLP module to further learn hidden features in the data; after the convolution operation with kernel size 3 and stride 1, the data size is reduced by 2, so the input layer size of the MLP is embed_dim-2; to learn hidden features, the hidden layer is enlarged, with its size set to embed_dim×4, and the output layer size is embed_dim; the output of the MLP module is added to the sum of the Multi-Head Attention output and the Embedded Patches data to form the output of the EEG Transformer Encoder; this output consists of EEG_size/patch_size+1 tokens, numbered 0 through EEG_size/patch_size, and the local-feature token numbered 0 is returned for feature learning in the subsequent module.
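The encoder block of step c) can be sketched in numpy as below. This is a deliberately simplified, single-head version under stated assumptions: LayerNorm and the multi-head projections are omitted, the unpadded kernel-size-3 convolution runs along the embedding axis (shrinking embed_dim by 2 as in the claim), and all weights are illustrative stand-ins.

```python
import numpy as np

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def attention(Q, K, V):
    """Attention(Q,K,V) = softmax(Q K^T / sqrt(d_k)) V."""
    return softmax(Q @ K.T / np.sqrt(K.shape[-1])) @ V

def conv1d_k3(x, kernel):
    """Kernel-size-3, stride-1, unpadded convolution along the embedding
    axis; output width is embed_dim - 2."""
    windows = np.lib.stride_tricks.sliding_window_view(x, 3, axis=-1)
    return windows @ kernel                     # (tokens, embed_dim - 2)

def eeg_encoder_block(x, kernel, W1, W2):
    """Single-head sketch: attention + residual, CNN over features, then
    an MLP of sizes embed_dim-2 -> embed_dim*4 -> embed_dim, added back
    to the residual branch."""
    a = x + attention(x, x, x)       # Multi-Head Attention + skip connection
    h = np.maximum(conv1d_k3(a, kernel) @ W1, 0)  # hidden layer, ReLU
    return a + h @ W2                # output: (tokens, embed_dim)

rng = np.random.default_rng(2)
d = 16                                           # embed_dim
x = rng.standard_normal((32, d))                 # 31 patches + 1 token
out = eeg_encoder_block(x,
                        rng.standard_normal(3) * 0.1,
                        rng.standard_normal((d - 2, 4 * d)) * 0.1,
                        rng.standard_normal((4 * d, d)) * 0.1)
local_token = out[0]                             # token 0, returned onward
```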
5. The electroencephalogram signal classification method based on a vision Transformer according to claim 4, wherein the specific method of step 3.3 is as follows:
The num local features learned by the num EEG Transformer Encoder modules constitute new Embedded Patches and are sent to the Sequence In Time Transformer Encoder module to learn the dependency relationship among the num consecutive samples; the learning process of the Sequence In Time Transformer Encoder is similar to that of the EEG Transformer Encoder: the num local features learn their interdependence through a Multi-Head Attention module, then learn further inter-sample features through an MBConv module, and finally learn hidden features among the samples through an MLP module; the input layer size of this MLP module is num, the hidden layer size is set to num×4, and the output layer size is num; finally, num feature tokens with mutual time-sequence dependency are output.
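The final MLP of this sequence encoder can be sketched as below, reading the claim's layer sizes num → num×4 → num as mixing information across the num stacked token vectors; the attention and MBConv stages are omitted and the weights are illustrative.

```python
import numpy as np

def sequence_mlp(tokens, W1, W2):
    """tokens: (num, embed_dim) stack of the local-feature tokens from the
    num EEG Transformer Encoders. Mixes across the num axis with hidden
    size num*4, returning num feature tokens."""
    h = np.maximum(W1 @ tokens, 0)   # (num*4, embed_dim)
    return W2 @ h                    # (num, embed_dim)

rng = np.random.default_rng(3)
num, d = 4, 16
local_tokens = rng.standard_normal((num, d))   # token 0 of each encoder
W1 = rng.standard_normal((4 * num, num)) * 0.1
W2 = rng.standard_normal((num, 4 * num)) * 0.1
seq_tokens = sequence_mlp(local_tokens, W1, W2)
```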
6. The electroencephalogram signal classification method based on a vision Transformer according to claim 5, wherein the specific method of step 3.4 is as follows:
The returned num tokens are input into the num classifiers for classification; each classifier is an MLP module with input layer size embed_dim, output layer size equal to the number of categories (3 for the SEED dataset), and no hidden layer; finally the num classifiers obtain the predicted probabilities that the num samples belong to each category.
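A sketch of one such hidden-layer-free classifier, with a softmax added here to turn the 3 output logits into class probabilities; `W` and `b` are illustrative parameters.

```python
import numpy as np

def classify(token, W, b):
    """Map one embed_dim feature token to class probabilities
    (3 classes for SEED) via a single linear layer plus softmax."""
    logits = token @ W + b
    e = np.exp(logits - logits.max())
    return e / e.sum()

rng = np.random.default_rng(4)
d, n_classes = 16, 3
probs = classify(rng.standard_normal(d),
                 rng.standard_normal((d, n_classes)) * 0.1,
                 np.zeros(n_classes))
```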
7. The electroencephalogram signal classification method based on a vision Transformer according to claim 6, wherein the specific method of step 3.5 is as follows:
the cross entropy is calculated according to the classification results and the real labels:
L = -(1/num) Σ_{i=1}^{num} Σ_{c=1}^{M} y_ic log(p_ic)   (3)
where M is the number of categories; y_ic is the indicator function, taking 1 when the real label of sample i is category c and 0 otherwise; p_ic is the predicted probability that sample i belongs to category c; and num is the number of samples;
the model parameters are updated by minimizing the cross entropy loss; the adopted optimization method is stochastic gradient descent (SGD):
g = (1/m) ∇_θ Σ_i L(f(x_i; θ), y_i)   (4)
θ = θ - εg   (5)
where m is the number of samples taken from the input samples for updating the parameters, g is the gradient, ε is the learning rate, and θ denotes the parameters in the model.
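The loss and update of step 3.5 can be written directly in numpy; this is a minimal sketch with one-hot labels and a plain SGD step (the gradient is passed in rather than derived).

```python
import numpy as np

def cross_entropy_loss(p, y):
    """Mean cross entropy over num samples:
    L = -(1/num) * sum_i sum_c y_ic * log(p_ic),
    with one-hot labels y and predicted probabilities p."""
    return float(-np.mean(np.sum(y * np.log(p), axis=1)))

def sgd_step(theta, g, eps):
    """Plain SGD update: theta <- theta - eps * g."""
    return theta - eps * g

# near-perfect predictions give a loss near 0
p = np.array([[0.98, 0.01, 0.01],
              [0.01, 0.98, 0.01]])
y = np.array([[1.0, 0.0, 0.0],
              [0.0, 1.0, 0.0]])
loss = cross_entropy_loss(p, y)
theta = sgd_step(np.array([1.0, -2.0]), np.array([0.5, -0.5]), eps=0.1)
```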
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202111616915.1A CN114176607B (en) | 2021-12-27 | 2021-12-27 | Electroencephalogram signal classification method based on vision transducer |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114176607A CN114176607A (en) | 2022-03-15 |
CN114176607B true CN114176607B (en) | 2024-04-19 |
Family
ID=80606165
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202111616915.1A Active CN114176607B (en) | 2021-12-27 | 2021-12-27 | Electroencephalogram signal classification method based on vision transducer |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114176607B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN114782933A (en) * | 2022-05-09 | 2022-07-22 | 合肥综合性国家科学中心人工智能研究院(安徽省人工智能实验室) | Driver fatigue detection system based on multi-mode Transformer network |
CN115422983A (en) * | 2022-11-04 | 2022-12-02 | 智慧眼科技股份有限公司 | Emotion classification method and device based on brain wave signals |
CN115969381B (en) * | 2022-11-16 | 2024-04-30 | 西北工业大学 | Electroencephalogram signal analysis method based on multi-band fusion and space-time transducer |
CN115813409B (en) * | 2022-12-02 | 2024-08-23 | 复旦大学 | Motion image electroencephalogram decoding method with ultralow delay |
CN115844425B (en) * | 2022-12-12 | 2024-05-17 | 天津大学 | DRDS brain electrical signal identification method based on transducer brain region time sequence analysis |
CN117281534B (en) * | 2023-11-22 | 2024-03-22 | 广东省人民医院 | Multi-index anesthesia state monitoring method and system |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111616721A (en) * | 2020-05-31 | 2020-09-04 | 天津大学 | Emotion recognition system based on deep learning and brain-computer interface and application |
CN112101152A (en) * | 2020-09-01 | 2020-12-18 | 西安电子科技大学 | Electroencephalogram emotion recognition method and system, computer equipment and wearable equipment |
KR20210051419A (en) * | 2019-10-30 | 2021-05-10 | 한밭대학교 산학협력단 | System for classificating mental workload using eeg and method thereof |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111460892A (en) * | 2020-03-02 | 2020-07-28 | 五邑大学 | Electroencephalogram mode classification model training method, classification method and system |
- 2021-12-27: CN CN202111616915.1A patent/CN114176607B/en active Active
Non-Patent Citations (1)
Title |
---|
Introducing Attention Mechanism for EEG Signals: Emotion Recognition with Vision Transformers;Arjun 等;2021 43rd Annual International Conference of the IEEE Engineering in Medicine & Biology Society (EMBC);20211104;5723-5726 * |
Also Published As
Publication number | Publication date |
---|---|
CN114176607A (en) | 2022-03-15 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN114176607B (en) | Electroencephalogram signal classification method based on vision transducer | |
CN110532900B (en) | Facial expression recognition method based on U-Net and LS-CNN | |
Yuan et al. | Patients’ EEG data analysis via spectrogram image with a convolution neural network | |
CN114052735B (en) | Deep field self-adaption-based electroencephalogram emotion recognition method and system | |
CN111444960A (en) | Skin disease image classification system based on multi-mode data input | |
Mensch et al. | Learning neural representations of human cognition across many fMRI studies | |
CN112766355B (en) | Electroencephalogram signal emotion recognition method under label noise | |
Kaziha et al. | A convolutional neural network for seizure detection | |
CN113392733B (en) | Multi-source domain self-adaptive cross-tested EEG cognitive state evaluation method based on label alignment | |
CN108256629A (en) | The unsupervised feature learning method of EEG signal based on convolutional network and own coding | |
CN113011330B (en) | Electroencephalogram signal classification method based on multi-scale neural network and cavity convolution | |
CN114564990B (en) | Electroencephalogram signal classification method based on multichannel feedback capsule network | |
CN113554110B (en) | Brain electricity emotion recognition method based on binary capsule network | |
CN112465069A (en) | Electroencephalogram emotion classification method based on multi-scale convolution kernel CNN | |
CN113243924A (en) | Identity recognition method based on electroencephalogram signal channel attention convolution neural network | |
CN113069117A (en) | Electroencephalogram emotion recognition method and system based on time convolution neural network | |
CN114841216B (en) | Electroencephalogram signal classification method based on model uncertainty learning | |
He et al. | What catches the eye? Visualizing and understanding deep saliency models | |
CN107045624B (en) | Electroencephalogram signal preprocessing and classifying method based on maximum weighted cluster | |
CN117332300A (en) | Motor imagery electroencephalogram classification method based on self-attention improved domain adaptation network | |
CN110991554A (en) | Improved PCA (principal component analysis) -based deep network image classification method | |
CN111914922A (en) | Hyperspectral image classification method based on local convolution and cavity convolution | |
CN116746947A (en) | Cross-subject electroencephalogram signal classification method based on online test time domain adaptation | |
CN116269442A (en) | Multi-head attention-based multidimensional motor imagery electroencephalogram signal classification method | |
CN115310491A (en) | Class-imbalance magnetic resonance whole brain data classification method based on deep learning |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||