CN111310783B - Speech state detection method based on electroencephalogram micro-state features and neural network model - Google Patents
- Publication number
- CN111310783B (application CN202010007821.3A)
- Authority
- CN
- China
- Prior art keywords
- layer
- micro
- state
- input
- output
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/214—Generating training patterns; Bootstrap methods, e.g. bagging or boosting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/23—Clustering techniques
- G06F18/232—Non-hierarchical techniques
- G06F18/2321—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions
- G06F18/23213—Non-hierarchical techniques using statistics or function optimisation, e.g. modelling of probability density functions with fixed number of clusters, e.g. K-means clustering
Abstract
A speech state detection method based on electroencephalogram micro-state features and a neural network model comprises the following steps: constructing an improved GoogLeNet neural network model; acquiring multi-channel electroencephalogram signals of normal subjects in the listening, speaking and imagined-speaking states, extracting the micro-state time-series features within a set time window, and attaching corresponding labels to the micro-state time-series features; training the improved GoogLeNet neural network model with the labelled micro-state time-series features; then acquiring multi-channel electroencephalogram signals of a normal subject in a speech state in real time, extracting the micro-state time-series features within the set time window, and feeding them into the trained improved GoogLeNet neural network model, thereby realizing speech state detection. The invention innovatively applies electroencephalogram micro-state time-series features to a neural network model for speech detection, and can effectively improve the accuracy of speech state classification.
Description
Technical Field
The invention relates to speech state detection methods, and in particular to a speech state detection method based on electroencephalogram micro-state features and a neural network model.
Background
The human brain is the most complex system in the human body. Its many neural networks regulate the body's physiological activities, and accomplishing a task generally requires correlation and coordination among multiple brain areas. Research so far has explored only the tip of the iceberg, and much of brain function remains beyond our present understanding. With the development of science and technology, exploring brain function is therefore undoubtedly a very meaningful undertaking.
Many studies have shown that the features of the electroencephalographic micro-state time series differ across behavioral states. Micro-states, also known as the "atoms of thought", are discussed in the document EEG microstates as a tool for studying the temporal dynamics of whole-brain neuronal networks: A review, which notes that the networks activated in a particular micro-state represent different states of consciousness, and that each micro-state is associated with a different class of mental processes constituting consciousness. The time-series features of the micro-states include their frequency of occurrence, duration and switching pattern. In task-oriented brain activity, there is a link between the occurrence of micro-states and specific information-processing functions. The electroencephalographic micro-state can therefore serve as an objective physiological index, providing a novel method for detecting brain function states.
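The three time-series features named above (frequency of occurrence, duration, and switching pattern) can be computed directly from a per-sample micro-state label sequence. The sketch below is illustrative only; the label alphabet 'A'-'D' and the 250 Hz sampling rate are assumptions, not taken from the patent.

```python
# Minimal sketch: micro-state occurrence counts, mean durations, and
# switching pattern from a per-sample label sequence (assumed 250 Hz).
from collections import Counter

def microstate_stats(labels, fs=250.0):
    """Summarise a micro-state label sequence sampled at fs Hz."""
    # Collapse consecutive identical labels into (state, run_length) segments.
    segments = []
    for lab in labels:
        if segments and segments[-1][0] == lab:
            segments[-1][1] += 1
        else:
            segments.append([lab, 1])
    occurrences = Counter(s for s, _ in segments)      # how often each state appears
    total = Counter()
    for s, run in segments:
        total[s] += run
    # Mean duration per occurrence, in milliseconds.
    duration_ms = {s: 1000.0 * total[s] / (fs * occurrences[s]) for s in occurrences}
    # Switching pattern: counts of direct transitions between distinct states.
    switches = Counter((a, b) for (a, _), (b, _) in zip(segments, segments[1:]))
    return occurrences, duration_ms, dict(switches)

seq = ['A'] * 25 + ['B'] * 30 + ['A'] * 20 + ['C'] * 25   # 100 samples = 0.4 s
occ, dur, sw = microstate_stats(seq)
# State 'A' occurs twice, lasting (25 + 20) / 2 = 22.5 samples = 90 ms on average.
```

Note that a 2 s window at this assumed rate holds 500 samples, so with typical 80-120 ms micro-state durations each window contains on the order of 15-25 micro-state segments.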
Speech is the fundamental way humans communicate. Speech states are divided into listening, speaking and imagined speaking. A large number of speech-impaired patients still exist in the world; they can hardly communicate normally with the outside world, which brings great difficulty to their lives. With the rapid development of brain-computer interfaces in recent years, researchers have begun to seek methods that decode the speech state of speech-impaired patients by means of brain science and neural engineering, so that the patient's speech state can be recognized and further applied to brain-computer interaction. The document Microstates in Language-Related Brain Potential Maps Show Noun-Verb Differences reports that language processing involves multiple brain regions and widely distributed neuron populations; in a simple word-reading paradigm, where the brain processes nouns and verbs through different neural populations, the average position of the centre of the subject's micro-state topographic map differs when seeing nouns versus verbs.
With the rise of deep learning, medical researchers have also begun to explore efficient methods for detecting human physiological indices. Compared with traditional machine learning, deep learning saves time and, owing to weight sharing, has advantages in both model accuracy and efficiency. Deep learning has also achieved notable breakthroughs in medical image classification in recent years, and more and more researchers are applying it to medical detection. The GoogLeNet neural network developed by Google has received wide attention from researchers, and the residual block has been recognized as an effective means of preventing vanishing and exploding gradients.
Disclosure of Invention
The invention aims to solve the technical problem of providing a speech state detection method based on electroencephalogram micro-state features and a neural network model, which increases the width and depth of the model, avoids overfitting, and effectively improves classification accuracy.
The technical scheme adopted by the invention is as follows: a speech state detection method based on electroencephalogram micro-state features and a neural network model comprises the following steps: constructing an improved GoogLeNet neural network model; acquiring multi-channel electroencephalogram signals of normal subjects in the listening, speaking and imagined-speaking states, extracting the micro-state time-series features within a set time window, and attaching corresponding labels to the micro-state time-series features; training the improved GoogLeNet neural network model with the labelled micro-state time-series features; then acquiring multi-channel electroencephalogram signals of a normal subject in a speech state in real time, extracting the micro-state time-series features within the set time window, and feeding them into the trained improved GoogLeNet neural network model, thereby realizing speech state detection.
The speech state is one of a listening state, a speaking state and an imagined-speaking state.
The construction of the improved GoogLeNet neural network model adds residual blocks to the GoogLeNet neural network model; the construction specifically comprises:
From the input layer, data flow into the first three parallel branches and from these branches into the first channel-merging layer. The first three parallel branches are, respectively: a 1 × 1 convolutional layer followed by a pooling layer; a 3 × 3 convolutional layer followed by a pooling layer; and a 5 × 5 convolutional layer followed by a pooling layer; each convolutional layer has 32 convolution kernels;
The first channel-merging layer then feeds the second three parallel branches, whose outputs are merged by the second channel-merging layer. The second three parallel branches are, respectively: a 1 × 1 convolutional layer followed by a pooling layer; five 3 × 3 convolutional layers connected in sequence followed by a pooling layer; and five 5 × 5 convolutional layers connected in sequence followed by a pooling layer; the number of convolution kernels of each convolutional layer is again 32. The residual connections are as follows: the input of the first 3 × 3 convolutional layer serves as the input of the first residual connection, and the output of the second 3 × 3 convolutional layer serves as its output; the input of the fourth 3 × 3 convolutional layer serves as the input of the second residual connection, and the output of the fifth 3 × 3 convolutional layer serves as its output; the input of the first 5 × 5 convolutional layer serves as the input of the third residual connection, and the output of the second 5 × 5 convolutional layer serves as its output; the input of the fourth 5 × 5 convolutional layer serves as the input of the fourth residual connection, and the output of the fifth 5 × 5 convolutional layer serves as its output;
The second channel-merging layer then feeds the third three parallel branches, whose outputs are merged by the third channel-merging layer. The third three parallel branches are, respectively: a 1 × 1 convolutional layer followed by a pooling layer; five 3 × 3 convolutional layers connected in sequence followed by a pooling layer; and five 5 × 5 convolutional layers connected in sequence followed by a pooling layer; the number of convolution kernels of each convolutional layer is 64. The residual connections are as follows: the input of the first 3 × 3 convolutional layer serves as the input of the fifth residual connection, and the output of the second 3 × 3 convolutional layer serves as its output; the input of the fourth 3 × 3 convolutional layer serves as the input of the sixth residual connection, and the output of the fifth 3 × 3 convolutional layer serves as its output; the input of the first 5 × 5 convolutional layer serves as the input of the seventh residual connection, and the output of the second 5 × 5 convolutional layer serves as its output; the input of the fourth 5 × 5 convolutional layer serves as the input of the eighth residual connection, and the output of the fifth 5 × 5 convolutional layer serves as its output;
The third channel-merging layer then feeds the fourth three parallel branches, whose outputs are merged by the fourth channel-merging layer. The fourth three parallel branches are, respectively: a 1 × 1 convolutional layer followed by a pooling layer; five 3 × 3 convolutional layers connected in sequence followed by a pooling layer; and five 5 × 5 convolutional layers connected in sequence followed by a pooling layer; the number of convolution kernels of each convolutional layer is 128. The residual connections are as follows: the input of the first 3 × 3 convolutional layer serves as the input of the ninth residual connection, and the output of the second 3 × 3 convolutional layer serves as its output; the input of the fourth 3 × 3 convolutional layer serves as the input of the tenth residual connection, and the output of the fifth 3 × 3 convolutional layer serves as its output; the input of the first 5 × 5 convolutional layer serves as the input of the eleventh residual connection, and the output of the second 5 × 5 convolutional layer serves as its output; the input of the fourth 5 × 5 convolutional layer serves as the input of the twelfth residual connection, and the output of the fifth 5 × 5 convolutional layer serves as its output;
Finally, the data are output in sequence to an average pooling layer and an output layer.
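A quick sanity check on the branch-and-merge structure described above: because each channel-merging layer concatenates the feature maps of its three parallel branches, the merged channel count is the sum of the branches' kernel counts. The sketch below only does this bookkeeping; the per-stage kernel counts (32, 32, 64, 128) come from the text, and everything else is illustrative.

```python
# Channel bookkeeping for the four branch-and-merge stages described above.
# Each stage runs three parallel branches (1x1, 3x3, 5x5 convolutions, each
# ending in pooling); channel merging concatenates their feature maps.
STAGES = [32, 32, 64, 128]   # kernels per convolutional layer, per the text

def merged_channels(kernels_per_branch, n_branches=3):
    # Concatenation along the channel axis: widths add across branches.
    return kernels_per_branch * n_branches

for i, k in enumerate(STAGES, 1):
    print(f"stage {i}: 3 branches x {k} kernels -> {merged_channels(k)} merged channels")
```

So the merged widths grow 96, 96, 192, 384 across the four stages, which is the "increased network width" the summary refers to.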
The set time window is 2 s; that is, each 2 s micro-state time sequence is taken as one input feature, and a micro-state typically lasts 80-120 ms before it transitions into another micro-state.
The extraction of the micro-state time series characteristics in the set time window comprises the following steps:
(1) computing a global field power curve from the multichannel electroencephalogram signals by the following formula:

GFP(t) = sqrt( (1/K) Σ_{i=1}^{K} ( V_i(t) − V_mean(t) )² )

wherein V_i(t) denotes the voltage of electrode i at time t, V_mean(t) denotes the mean instantaneous potential across the electrodes, K denotes the number of electrodes, and GFP denotes the global field power curve;
then plotting the potentials at the local-maximum moments of the global field power curve to generate topographic maps of the electrode array;
(2) submitting the topographic maps corresponding to the local-maximum moments of the global field power curve to a K-means clustering algorithm, which divides them into four classes of micro-state maps;
(3) arranging the four classes of micro-states in time order according to the sequence of the peaks of the global field power curve, thereby obtaining the input features.
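Step (1) above can be sketched directly: the global field power at each sample is the spatial standard deviation of the electrode voltages. The function below is a minimal illustration; the 4-electrode, 5-sample array is synthetic data, not from the patent.

```python
# Minimal sketch of the GFP formula: spatial standard deviation across
# electrodes at each time sample.
import numpy as np

def global_field_power(eeg):
    """eeg: (K electrodes, T samples) array -> GFP curve of length T."""
    v_mean = eeg.mean(axis=0, keepdims=True)            # V_mean(t)
    return np.sqrt(((eeg - v_mean) ** 2).mean(axis=0))  # sqrt of spatial variance

eeg = np.array([[1.0, 2.0, 0.0, 1.0, 3.0],
                [3.0, 2.0, 0.0, 1.0, 1.0],
                [1.0, 2.0, 0.0, 3.0, 3.0],
                [3.0, 2.0, 0.0, 3.0, 1.0]])
gfp = global_field_power(eeg)
# Columns where all electrodes agree give GFP 0; the local maxima of this
# curve mark the time points whose topographies are passed to clustering.
```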
Corresponding labels are attached to the micro-state time-series features: according to the speech state, the features are divided into three classes of labels, namely listening, speaking and imagined speaking.
The method for training the improved GoogLeNet neural network model by using the micro-state time sequence features with the labels comprises the following steps:
(1) dividing the micro-state time sequence features with the labels into a training set, a verification set and a test set according to the ratio of 8:1:1, wherein the labels in the test set are removed;
(2) inputting the training set into the input layer of the improved GoogLeNet neural network model for training and carrying out forward propagation, transforming layer by layer until the data reach the output layer of the network;
(3) detecting the effectiveness of the improved GoogLeNet neural network model with a cross-entropy loss function; adjusting the parameters of the model obtained in step (2) with an Adam optimizer, updating the parameters and weights of each layer by back propagation; verifying the adjusted model with the verification set and tuning until the accuracy of the model no longer changes, then stopping training; and testing the tuned model with the test set, thereby obtaining the trained improved GoogLeNet neural network model.
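Step (1) of this training procedure can be sketched as follows; the shuffling seed and sample format are illustrative assumptions, and the test-set labels are stripped as the text specifies.

```python
# Minimal sketch of the 8:1:1 split of labelled feature windows.
import random

def split_8_1_1(samples, seed=0):
    """Split (feature, label) pairs 8:1:1; test-set labels are removed."""
    idx = list(range(len(samples)))
    random.Random(seed).shuffle(idx)                 # reproducible shuffle
    n = len(samples)
    n_train, n_val = (8 * n) // 10, n // 10
    train = [samples[i] for i in idx[:n_train]]
    val = [samples[i] for i in idx[n_train:n_train + n_val]]
    # Keep only the features for the test set, dropping the labels.
    test = [feat for feat, _lab in (samples[i] for i in idx[n_train + n_val:])]
    return train, val, test

# 100 labelled 2 s windows with the three speech-state labels 0/1/2.
data = [(f"window_{i}", i % 3) for i in range(100)]
train, val, test = split_8_1_1(data)
```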
According to the speech state detection method based on electroencephalogram micro-state features and a neural network model, the improved GoogLeNet neural network adds residual blocks to GoogLeNet, so that the network makes full use of computing resources, increases the network width, and resists overfitting. The invention innovatively applies electroencephalogram micro-state time-series features to a neural network model for speech detection, and can effectively improve the accuracy of speech state classification.
Drawings
FIG. 1 is a block diagram of a speech state detection method based on electroencephalogram micro-state features and a neural network model according to the present invention;
FIG. 2 is a flow chart of a speech state detection method based on electroencephalogram micro-state features and a neural network model according to the present invention;
FIG. 3 is a flow chart of the micro-state time series feature acquisition of the present invention;
fig. 4 is a schematic diagram of a residual block according to an embodiment of the present invention.
Detailed Description
The speech state detection method based on the electroencephalogram micro-state features and the neural network model of the invention is explained in detail below with reference to the embodiments and the accompanying drawings.
The invention discloses a speech state detection method based on electroencephalogram micro-state characteristics and a neural network model, which comprises the following steps of:
1) Constructing an improved GoogLeNet neural network model; acquiring multi-channel electroencephalogram signals of normal subjects in the listening, speaking and imagined-speaking states; extracting the micro-state time-series features within a set time window; and attaching corresponding labels to the micro-state time-series features, specifically dividing them into three classes of labels (listening, speaking and imagined speaking) according to the speech state; wherein,
the construction of the improved GoogLeNet neural network model is characterized in that a residual block network is added on the basis of the GoogLeNet neural network model, and the construction method specifically comprises the following steps:
From the input layer, data flow into the first three parallel branches and from these branches into the first channel-merging layer. The first three parallel branches are, respectively: a 1 × 1 convolutional layer followed by a pooling layer; a 3 × 3 convolutional layer followed by a pooling layer; and a 5 × 5 convolutional layer followed by a pooling layer; each convolutional layer has 32 convolution kernels;
The first channel-merging layer then feeds the second three parallel branches, whose outputs are merged by the second channel-merging layer. The second three parallel branches are, respectively: a 1 × 1 convolutional layer followed by a pooling layer; five 3 × 3 convolutional layers connected in sequence followed by a pooling layer; and five 5 × 5 convolutional layers connected in sequence followed by a pooling layer; the number of convolution kernels of each convolutional layer is again 32. The residual connections are as follows: the input of the first 3 × 3 convolutional layer serves as the input of the first residual connection, and the output of the second 3 × 3 convolutional layer serves as its output; the input of the fourth 3 × 3 convolutional layer serves as the input of the second residual connection, and the output of the fifth 3 × 3 convolutional layer serves as its output; the input of the first 5 × 5 convolutional layer serves as the input of the third residual connection, and the output of the second 5 × 5 convolutional layer serves as its output; the input of the fourth 5 × 5 convolutional layer serves as the input of the fourth residual connection, and the output of the fifth 5 × 5 convolutional layer serves as its output;
The second channel-merging layer then feeds the third three parallel branches, whose outputs are merged by the third channel-merging layer. The third three parallel branches are, respectively: a 1 × 1 convolutional layer followed by a pooling layer; five 3 × 3 convolutional layers connected in sequence followed by a pooling layer; and five 5 × 5 convolutional layers connected in sequence followed by a pooling layer; the number of convolution kernels of each convolutional layer is 64. The residual connections are as follows: the input of the first 3 × 3 convolutional layer serves as the input of the fifth residual connection, and the output of the second 3 × 3 convolutional layer serves as its output; the input of the fourth 3 × 3 convolutional layer serves as the input of the sixth residual connection, and the output of the fifth 3 × 3 convolutional layer serves as its output; the input of the first 5 × 5 convolutional layer serves as the input of the seventh residual connection, and the output of the second 5 × 5 convolutional layer serves as its output; the input of the fourth 5 × 5 convolutional layer serves as the input of the eighth residual connection, and the output of the fifth 5 × 5 convolutional layer serves as its output;
The third channel-merging layer then feeds the fourth three parallel branches, whose outputs are merged by the fourth channel-merging layer. The fourth three parallel branches are, respectively: a 1 × 1 convolutional layer followed by a pooling layer; five 3 × 3 convolutional layers connected in sequence followed by a pooling layer; and five 5 × 5 convolutional layers connected in sequence followed by a pooling layer; the number of convolution kernels of each convolutional layer is 128. The residual connections are as follows: the input of the first 3 × 3 convolutional layer serves as the input of the ninth residual connection, and the output of the second 3 × 3 convolutional layer serves as its output; the input of the fourth 3 × 3 convolutional layer serves as the input of the tenth residual connection, and the output of the fifth 3 × 3 convolutional layer serves as its output; the input of the first 5 × 5 convolutional layer serves as the input of the eleventh residual connection, and the output of the second 5 × 5 convolutional layer serves as its output; the input of the fourth 5 × 5 convolutional layer serves as the input of the twelfth residual connection, and the output of the fifth 5 × 5 convolutional layer serves as its output;
Finally, the data are output in sequence to the average pooling layer and the output layer.
2) Training an improved GoogLeNet neural network model with labeled micro-state time series features, comprising:
(1) dividing the micro-state time sequence features with the labels into a training set, a verification set and a test set according to the ratio of 8:1:1, wherein the labels in the test set are removed;
(2) inputting the training set into the input layer of the improved GoogLeNet neural network model for training and carrying out forward propagation, transforming layer by layer until the data reach the output layer of the network;
(3) detecting the effectiveness of the improved GoogLeNet neural network model with a cross-entropy loss function; adjusting the parameters of the model obtained in step (2) with an Adam optimizer, updating the parameters and weights of each layer by back propagation; verifying the adjusted model with the verification set and tuning until the accuracy of the model no longer changes, then stopping training; and testing the tuned model with the test set, thereby obtaining the trained improved GoogLeNet neural network model.
(1) In the present example, the cross-entropy loss function used is:

L = − Σ_{l=1}^{m} y log(y_l)

where m denotes the number of classes, y denotes the correct label value, and y_l denotes the actual output for class l.
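A minimal numeric check of this loss, using the same symbols (m classes, one-hot label y, network output y_l); the probability values below are illustrative.

```python
# Cross-entropy for one sample: only the true class contributes when the
# label vector is one-hot.
import math

def cross_entropy(y_true, y_pred):
    # L = -sum_l y * log(y_l)
    return -sum(t * math.log(p) for t, p in zip(y_true, y_pred))

# 3 classes: listening / speaking / imagined speaking.
loss = cross_entropy([0, 1, 0], [0.2, 0.7, 0.1])
# loss == -ln(0.7) ≈ 0.357; a confident correct prediction drives it to 0.
```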
(2) The Adam algorithm used in the embodiment of the invention is as follows:

s ← ρ₁ s + (1 − ρ₁) g
γ ← ρ₂ γ + (1 − ρ₂) g ⊙ g

where ρ₁ and ρ₂ are constants (default values ρ₁ = 0.9, ρ₂ = 0.999), g denotes the first-order gradient of the loss function, s denotes the biased first-order moment estimate, and γ denotes the biased second-order moment estimate;

s_l ← s / (1 − ρ₁ᵗ)
γ_l ← γ / (1 − ρ₂ᵗ)

where s_l denotes the bias-corrected first-order moment estimate and γ_l denotes the bias-corrected second-order moment estimate;

Δθ = −ε s_l / ( sqrt(γ_l) + δ )

where ε and δ are constants (default values ε = 0.001, δ = 10⁻⁸) and Δθ denotes the change of the parameter θ.
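The Adam update above, taken as one step on a scalar parameter, can be sketched as follows; the gradient value and step count are illustrative, while the default constants match those given in the text.

```python
# One Adam step: biased moment accumulation, bias correction, parameter step.
import math

def adam_step(theta, g, s, gamma, t, rho1=0.9, rho2=0.999, eps=0.001, delta=1e-8):
    s = rho1 * s + (1 - rho1) * g                 # biased 1st-moment estimate
    gamma = rho2 * gamma + (1 - rho2) * g * g     # biased 2nd-moment estimate
    s_hat = s / (1 - rho1 ** t)                   # bias corrections
    gamma_hat = gamma / (1 - rho2 ** t)
    delta_theta = -eps * s_hat / (math.sqrt(gamma_hat) + delta)
    return theta + delta_theta, s, gamma

theta, s, gamma = 0.5, 0.0, 0.0
theta, s, gamma = adam_step(theta, g=2.0, s=s, gamma=gamma, t=1)
# On the first step s_hat = g and gamma_hat = g^2, so the step is ≈ -eps.
```

The bias correction is what keeps the first step at roughly the learning rate ε even though s and γ start from zero.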
3) Acquiring multi-channel electroencephalogram signals of a normal subject in a speech state in real time, the speech state being one of the listening, speaking and imagined-speaking states; extracting the micro-state time-series features within the set time window; and feeding them into the trained improved GoogLeNet neural network model, thereby realizing speech state detection.
The time window set in step 1) and step 3) of the invention is 2 s; that is, each 2 s micro-state time sequence is used as one input feature, and a micro-state typically lasts 80-120 ms before switching to another micro-state.
The extraction of the micro-state time series characteristics in the set time window in the steps 1) and 3) of the invention comprises the following steps:
(1) computing a Global Field Power (GFP) curve from the multichannel electroencephalogram signals by the following formula:

GFP(t) = sqrt( (1/K) Σ_{i=1}^{K} ( V_i(t) − V_mean(t) )² )

wherein V_i(t) denotes the voltage of electrode i at time t, V_mean(t) denotes the mean instantaneous potential across the electrodes, K denotes the number of electrodes, and GFP denotes the global field power curve;
then plotting the potentials at the local-maximum moments of the global field power curve to generate topographic maps of the electrode array;
(2) submitting the topographic maps corresponding to the local-maximum moments of the global field power curve to a K-means clustering algorithm, which divides them into four classes of micro-state maps;
(3) arranging the four classes of micro-states in time order according to the sequence of the peaks of the global field power curve, thereby obtaining the input features.
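Step (2)'s clustering can be sketched with a minimal k-means (Lloyd's algorithm). The toy two-dimensional "topographies" and the fixed initial centres are illustrative assumptions; a real micro-state pipeline clusters full electrode-space maps, typically with polarity-invariant distances.

```python
# Minimal k-means (Lloyd's algorithm) over peak-GFP topographies.
import numpy as np

def kmeans(maps, centres, n_iter=10):
    """maps: (N, D) topographies; centres: (k, D) initial cluster centres."""
    labels = np.zeros(len(maps), dtype=int)
    for _ in range(n_iter):
        # Assign each topography to its nearest centre (squared distance).
        d = ((maps[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
        labels = d.argmin(axis=1)
        # Move each centre to the mean of its assigned maps.
        for k in range(len(centres)):
            if (labels == k).any():
                centres[k] = maps[labels == k].mean(axis=0)
    return labels, centres

maps = np.array([[0.0, 0.0], [0.2, 0.0], [5.0, 5.0], [5.2, 5.0]])
init = np.array([[0.0, 0.0], [5.0, 5.0]])   # 2 clusters for the toy data
labels, centres = kmeans(maps, init.copy())
```

In the method itself k would be 4, yielding the four micro-state map classes that are then ordered along the GFP peak sequence in step (3).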
Claims (6)
1. A speech state detection method based on electroencephalogram micro-state features and a neural network model, characterized by comprising the following steps: constructing an improved GoogLeNet neural network model; acquiring multi-channel electroencephalogram signals of normal subjects in the listening, speaking and imagined-speaking states, extracting the micro-state time-series features within a set time window, and attaching corresponding labels to the micro-state time-series features; training the improved GoogLeNet neural network model with the labelled micro-state time-series features; acquiring multi-channel electroencephalogram signals of a normal subject in a speech state in real time, extracting the micro-state time-series features within the set time window, and feeding them into the trained improved GoogLeNet neural network model, thereby realizing speech state detection;
the construction of the improved GoogLeNet neural network model is characterized in that a residual block network is added on the basis of the GoogLeNet neural network model, and the construction method specifically comprises the following steps:
from the input layer to the first three groups of parallel structures, and then from the three groups of parallel structures to the first channel merging layer, the first three groups of parallel structures are respectively: a 1 × 1 convolutional layer and a pooling layer connected; a 3 × 3 convolutional layer and a pooling layer connected; connecting the 5 × 5 convolutional layer and the pooling layer; the number of convolution kernels of the convolution layers is 32;
from the first channel merging layer the output is fed uniformly to the second three parallel groups and from these to the second channel merging layer, the second three parallel groups being, respectively: a connected 1 × 1 convolutional layer and pooling layer; five 3 × 3 convolutional layers and one pooling layer connected in sequence; and five 5 × 5 convolutional layers and one pooling layer connected in sequence; each convolutional layer likewise has 32 convolution kernels; the residual networks are connected as follows: the input of the first 3 × 3 convolutional layer serves as the input of the first residual network, and the output of the second 3 × 3 convolutional layer serves as its output; the input of the fourth 3 × 3 convolutional layer serves as the input of the second residual network, and the output of the fifth 3 × 3 convolutional layer serves as its output; the input of the first 5 × 5 convolutional layer serves as the input of the third residual network, and the output of the second 5 × 5 convolutional layer serves as its output; the input of the fourth 5 × 5 convolutional layer serves as the input of the fourth residual network, and the output of the fifth 5 × 5 convolutional layer serves as its output;
from the second channel merging layer the output is fed uniformly to the third three parallel groups and from these to the third channel merging layer, the third three parallel groups being, respectively: a connected 1 × 1 convolutional layer and pooling layer; five 3 × 3 convolutional layers and one pooling layer connected in sequence; and five 5 × 5 convolutional layers and one pooling layer connected in sequence; each convolutional layer has 64 convolution kernels; the residual networks are connected as follows: the input of the first 3 × 3 convolutional layer serves as the input of the fifth residual network, and the output of the second 3 × 3 convolutional layer serves as its output; the input of the fourth 3 × 3 convolutional layer serves as the input of the sixth residual network, and the output of the fifth 3 × 3 convolutional layer serves as its output; the input of the first 5 × 5 convolutional layer serves as the input of the seventh residual network, and the output of the second 5 × 5 convolutional layer serves as its output; the input of the fourth 5 × 5 convolutional layer serves as the input of the eighth residual network, and the output of the fifth 5 × 5 convolutional layer serves as its output;
from the third channel merging layer the output is fed uniformly to the fourth three parallel groups and from these to the fourth channel merging layer, the fourth three parallel groups being, respectively: a connected 1 × 1 convolutional layer and pooling layer; five 3 × 3 convolutional layers and one pooling layer connected in sequence; and five 5 × 5 convolutional layers and one pooling layer connected in sequence; each convolutional layer has 128 convolution kernels;
the residual networks are connected as follows: the input of the first 3 × 3 convolutional layer serves as the input of the ninth residual network, and the output of the second 3 × 3 convolutional layer serves as its output; the input of the fourth 3 × 3 convolutional layer serves as the input of the tenth residual network, and the output of the fifth 3 × 3 convolutional layer serves as its output; the input of the first 5 × 5 convolutional layer serves as the input of the eleventh residual network, and the output of the second 5 × 5 convolutional layer serves as its output; the input of the fourth 5 × 5 convolutional layer serves as the input of the twelfth residual network, and the output of the fifth 5 × 5 convolutional layer serves as its output;
and finally the output is passed in sequence to the average pooling layer and the output layer.
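One of the parallel groups with residual connections (the second through fourth groups of the claim) can be sketched in PyTorch as follows. The claim specifies only the kernel sizes, the layer counts, the kernel numbers, and which convolution inputs/outputs the residual skips join; everything else here — 2D convolutions, ReLU activations, 2 × 2 max pooling, same-padding, and a 1 × 1 projection to match channels on the first skip — is an assumption of this sketch:

```python
import torch
import torch.nn as nn

class ResidualBranch(nn.Module):
    """Five k x k conv layers; residual skips span convs 1-2 and 4-5, then pooling."""
    def __init__(self, in_ch, out_ch, k):
        super().__init__()
        p = k // 2  # same-padding so the residual sums have matching spatial size
        conv = lambda ci, co: nn.Conv2d(ci, co, k, padding=p)
        self.c1, self.c2 = conv(in_ch, out_ch), conv(out_ch, out_ch)
        self.c3 = conv(out_ch, out_ch)
        self.c4, self.c5 = conv(out_ch, out_ch), conv(out_ch, out_ch)
        self.proj = nn.Conv2d(in_ch, out_ch, 1)  # channel match for the first skip
        self.pool = nn.MaxPool2d(2)
        self.act = nn.ReLU()

    def forward(self, x):
        # residual 1: input of conv 1 -> added to output of conv 2
        y = self.act(self.c2(self.act(self.c1(x))) + self.proj(x))
        y = self.act(self.c3(y))
        # residual 2: input of conv 4 -> added to output of conv 5
        y = self.act(self.c5(self.act(self.c4(y))) + y)
        return self.pool(y)

class InceptionResidualGroup(nn.Module):
    """Three parallel branches (1x1, 3x3, 5x5) followed by a channel merging layer."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.b1 = nn.Sequential(nn.Conv2d(in_ch, out_ch, 1), nn.ReLU(), nn.MaxPool2d(2))
        self.b3 = ResidualBranch(in_ch, out_ch, 3)
        self.b5 = ResidualBranch(in_ch, out_ch, 5)

    def forward(self, x):
        # channel merging layer = concatenation along the channel dimension
        return torch.cat([self.b1(x), self.b3(x), self.b5(x)], dim=1)
```

With `out_ch` set to 32, 64 and 128, three such groups mimic the second through fourth groups of the claimed model; the first group has only one convolution per branch and no residual skips.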
2. The speech state detection method based on the electroencephalogram micro-state features and the neural network model as claimed in claim 1, wherein the speech state is one of an auditory state, a speaking state and an imaginary speaking state.
3. The speech state detection method based on electroencephalogram micro-state features and a neural network model according to claim 1, wherein the set time window is 2 s, i.e., every 2 s of the micro-state time series serves as one input feature, and each micro-state persists for 80-120 ms before transitioning into another micro-state.
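The 2 s windowing can be sketched as follows, assuming a per-sample micro-state label sequence and non-overlapping windows (the claim does not state whether windows overlap, and the function name and sampling-rate parameter are illustrative):

```python
import numpy as np

def window_microstates(labels, fs, win_s=2.0):
    """Split a per-sample micro-state label sequence into fixed-length windows.

    labels: sequence of micro-state labels, one per EEG sample
    fs:     sampling rate in Hz
    win_s:  window length in seconds (2 s in the claim)
    """
    n = int(win_s * fs)                 # samples per window
    n_win = len(labels) // n            # trailing partial window is dropped
    return np.asarray(labels[: n_win * n]).reshape(n_win, n)
```

At 250 Hz, for example, each 2 s window holds 500 labels; since each micro-state persists roughly 80-120 ms, a window contains on the order of 17-25 micro-state segments.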
4. The method for detecting the speech state based on the electroencephalogram micro-state features and the neural network model according to claim 1, wherein the step of extracting the micro-state time series features in the set time window comprises the following steps:
(1) calculating the global field power curve of the multichannel electroencephalogram signals by the following formula:

GFP(t) = sqrt( (1/K) · Σ_{i=1}^{K} ( V_i(t) − V_mean(t) )² )

wherein V_i(t) denotes the voltage of the i-th electrode at time t, V_mean(t) denotes the average instantaneous potential across the electrodes, K denotes the number of electrodes, and GFP denotes the global field power curve;
then plotting the electrode potentials at the moments of the local maxima of the global field power curve to generate topographic maps of the electrode array;
(2) submitting the topographic maps corresponding to the local-maximum moments of the global field power curve to a K-means clustering algorithm, which divides them into four classes of micro-state maps;
(3) arranging the four classes of micro-states in temporal order according to the sequence of the global field power curve peaks to obtain the input features.
5. The speech state detection method based on electroencephalogram micro-state features and a neural network model according to claim 1, wherein corresponding labels are attached to the micro-state time-series features, which are divided into three classes of labels, namely listening, speaking and imagined speaking, according to the different speech states.
6. The speech state detection method based on electroencephalogram micro-state features and a neural network model according to claim 1, wherein training the improved GoogLeNet neural network model with the labelled micro-state time-series features comprises the following steps:
(1) dividing the micro-state time sequence features with the labels into a training set, a verification set and a test set according to the ratio of 8:1:1, wherein the labels in the test set are removed;
(2) inputting the training set into the input layer of the improved GoogLeNet neural network model for training with forward propagation, transforming layer by layer and propagating to the output layer of the residual neural network;
(3) evaluating the improved GoogLeNet neural network model with a cross-entropy loss function; adjusting the parameters of the model obtained in step (2) with an Adam optimizer via back propagation to update the parameters and weights of each layer; verifying the adjusted model with the verification set and tuning until the accuracy of the model no longer changes, then stopping training; and testing the tuned model with the test set, thereby obtaining the trained improved GoogLeNet neural network model.
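The training procedure of claim 6 — an 8:1:1 split, cross-entropy loss, and Adam with back propagation — can be sketched in PyTorch as follows. The batch size, learning rate, and fixed epoch count are assumptions of this sketch; the claim's stop-when-accuracy-stops-changing criterion is simplified to a fixed number of epochs:

```python
import torch
from torch import nn, optim
from torch.utils.data import TensorDataset, DataLoader, random_split

def train_model(model, features, labels, epochs=10, lr=1e-3):
    """features: float tensor (N, ...); labels: long tensor (N,) in {0,1,2}."""
    ds = TensorDataset(features, labels)
    n = len(ds)
    # (1) split into training / verification / test sets at 8:1:1
    n_train, n_val = int(0.8 * n), int(0.1 * n)
    train_ds, val_ds, test_ds = random_split(ds, [n_train, n_val, n - n_train - n_val])
    loader = DataLoader(train_ds, batch_size=32, shuffle=True)
    opt = optim.Adam(model.parameters(), lr=lr)   # Adam optimizer, as claimed
    loss_fn = nn.CrossEntropyLoss()               # cross-entropy loss, as claimed
    for _ in range(epochs):
        for xb, yb in loader:
            opt.zero_grad()
            # (2) forward propagation, then (3) back propagation of the loss
            loss_fn(model(xb), yb).backward()
            opt.step()
    return train_ds, val_ds, test_ds
```

In a full implementation the verification set would be scored after each epoch to decide when accuracy has stopped improving, and the test set (with labels withheld, per claim 6) would be used only for the final evaluation.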
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010007821.3A CN111310783B (en) | 2020-01-05 | 2020-01-05 | Speech state detection method based on electroencephalogram micro-state features and neural network model |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010007821.3A CN111310783B (en) | 2020-01-05 | 2020-01-05 | Speech state detection method based on electroencephalogram micro-state features and neural network model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN111310783A CN111310783A (en) | 2020-06-19 |
CN111310783B true CN111310783B (en) | 2022-08-30 |
Family
ID=71146806
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010007821.3A Active CN111310783B (en) | 2020-01-05 | 2020-01-05 | Speech state detection method based on electroencephalogram micro-state features and neural network model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111310783B (en) |
Families Citing this family (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113367705A (en) * | 2021-04-07 | 2021-09-10 | 西北工业大学 | Motor imagery electroencephalogram signal classification method based on improved micro-state analysis |
CN113558637B (en) * | 2021-07-05 | 2024-01-05 | 杭州电子科技大学 | Music perception brain network construction method based on phase transfer entropy |
CN117130490B (en) * | 2023-10-26 | 2024-01-26 | 天津大学 | Brain-computer interface control system, control method and implementation method thereof |
Family Cites Families (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107479702A (en) * | 2017-08-04 | 2017-12-15 | 西南大学 | A kind of human emotion's dominance classifying identification method using EEG signals |
WO2019068200A1 (en) * | 2017-10-06 | 2019-04-11 | Holland Bloorview Kids Rehabilitation Hospital | Brain-computer interface platform and process for classification of covert speech |
CN108022647B (en) * | 2017-11-30 | 2022-01-25 | 东北大学 | Lung nodule benign and malignant prediction method based on ResNet-inclusion model |
CN108577835B (en) * | 2018-05-17 | 2019-07-19 | 太原理工大学 | A kind of brain function network establishing method based on micro- state |
CN108764471B (en) * | 2018-05-17 | 2020-04-14 | 西安电子科技大学 | Neural network cross-layer pruning method based on feature redundancy analysis |
CN109784023B (en) * | 2018-11-28 | 2022-02-25 | 西安电子科技大学 | Steady-state vision-evoked electroencephalogram identity recognition method and system based on deep learning |
CN109846477B (en) * | 2019-01-29 | 2021-08-06 | 北京工业大学 | Electroencephalogram classification method based on frequency band attention residual error network |
CN109620185B (en) * | 2019-01-31 | 2020-07-21 | 山东大学 | Autism auxiliary diagnosis system, device and medium based on multi-modal information |
CN110236533A (en) * | 2019-05-10 | 2019-09-17 | 杭州电子科技大学 | Epileptic seizure prediction method based on the study of more deep neural network migration features |
CN110163180A (en) * | 2019-05-29 | 2019-08-23 | 长春思帕德科技有限公司 | Mental imagery eeg data classification method and system |
- 2020-01-05 CN CN202010007821.3A patent/CN111310783B/en active Active
Also Published As
Publication number | Publication date |
---|---|
CN111310783A (en) | 2020-06-19 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Wang et al. | Linking attention-based multiscale CNN with dynamical GCN for driving fatigue detection | |
CN111310783B (en) | Speech state detection method based on electroencephalogram micro-state features and neural network model | |
Chinara | Automatic classification methods for detecting drowsiness using wavelet packet transform extracted time-domain features from single-channel EEG signal | |
Chao et al. | Recognition of Emotions Using Multichannel EEG Data and DBN‐GC‐Based Ensemble Deep Learning Framework | |
Li et al. | Densely feature fusion based on convolutional neural networks for motor imagery EEG classification | |
CN111553295A (en) | Multi-mode emotion recognition method based on self-attention mechanism | |
Li et al. | Combined long short-term memory based network employing wavelet coefficients for MI-EEG recognition | |
CN113180692A (en) | Electroencephalogram signal classification and identification method based on feature fusion and attention mechanism | |
CN108280414A (en) | A kind of recognition methods of the Mental imagery EEG signals based on energy feature | |
Satapathy et al. | ADASYN and ABC-optimized RBF convergence network for classification of electroencephalograph signal | |
US20230101539A1 (en) | Physiological electric signal classification processing method and apparatus, computer device and storage medium | |
Balam et al. | Statistical channel selection method for detecting drowsiness through single-channel EEG-based BCI system | |
Mo et al. | Motor imagery electroencephalograph classification based on optimized support vector machine by magnetic bacteria optimization algorithm | |
CN114676720B (en) | Mental state identification method and system based on graph neural network | |
Dehzangi et al. | EEG based driver inattention identification via feature profiling and dimensionality reduction | |
Chanu et al. | An automated epileptic seizure detection using optimized neural network from EEG signals | |
Yang et al. | An aggressive driving state recognition model using EEG based on stacking ensemble learning | |
Saranya et al. | An efficient AP-ANN-based multimethod fusion model to detect stress through EEG signal analysis | |
Abdulghani et al. | EEG Classifier Using Wavelet Scattering Transform-Based Features and Deep Learning for Wheelchair Steering | |
Palaniappan et al. | Using genetic algorithm to identify the discriminatory subset of multi-channel spectral bands for visual response | |
Huynh et al. | An investigation of ensemble methods to classify electroencephalogram signaling modes | |
Xu et al. | Emotion Recognition from Multi-channel EEG via an Attention-Based CNN Model | |
Sun et al. | MEEG-Transformer: transformer Network based on Multi-domain EEG for emotion recognition | |
CN114052734A (en) | Electroencephalogram emotion recognition method based on progressive graph convolution neural network | |
Du et al. | Improving motor imagery EEG classification by CNN with data augmentation |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||