CN117338313B - Multi-dimensional characteristic electroencephalogram signal identification method based on stacking integration technology - Google Patents
- Publication number
- CN117338313B; CN202311196950.1A
- Authority
- CN
- China
- Prior art keywords
- model
- electroencephalogram
- matrix
- data
- layer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/24—Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
- A61B5/316—Modalities, i.e. specific diagnostic methods
- A61B5/369—Electroencephalography [EEG]
- A61B5/372—Analysis of electroencephalograms
-
- A—HUMAN NECESSITIES
- A61—MEDICAL OR VETERINARY SCIENCE; HYGIENE
- A61B—DIAGNOSIS; SURGERY; IDENTIFICATION
- A61B5/00—Measuring for diagnostic purposes; Identification of persons
- A61B5/72—Signal processing specially adapted for physiological signals or for diagnostic purposes
- A61B5/7235—Details of waveform analysis
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02A—TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
- Y02A90/00—Technologies having an indirect contribution to adaptation to climate change
- Y02A90/10—Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation
Abstract
The invention discloses a multi-dimensional characteristic electroencephalogram signal identification method based on a stacking integration technology, comprising the following steps: 1) acquiring two different classes of electroencephalogram signals and preprocessing the signal data; 2) performing multidimensional feature extraction on the preprocessed electroencephalogram data and constructing a feature matrix to obtain an original feature matrix; 3) performing dimension reduction on the original feature matrix using a principal component analysis algorithm to obtain a final feature matrix; 4) based on the electroencephalogram data preprocessed in step 1) and a stacking integration learning algorithm, constructing a multi-dimensional characteristic electroencephalogram signal identification model that takes the final feature matrix from step 3) as input; 5) preprocessing the electroencephalogram signals to be identified, extracting their features, and inputting the result into the trained model to obtain an identification result. The invention extracts multidimensional features of the electroencephalogram signals and uses the stacking integration technology to improve the recognition accuracy for the extracted electroencephalogram signals.
Description
Technical Field
The invention relates to the technical field of electroencephalogram signal processing, in particular to a multi-dimensional characteristic electroencephalogram signal identification method based on a stacking integration technology.
Background
As the pace of social life accelerates, mental stress is gradually increasing, and mental health has become one of the major problems facing contemporary society. According to data published by the World Health Organization, the number of people with depression worldwide reached 322 million in 2022, and the prevalence of neuropsychiatric conditions such as autism also continues to rise, with serious consequences for society.
Most psychological diseases have complex causes, are difficult to treat, and lack targeted treatments, so detecting them is particularly important. Existing detection methods rely mainly on the subjective judgment of doctors and the self-reports of patients, and suffer from low diagnostic accuracy, long duration, and high cost.
With advances in science and technology and the emergence of deep learning algorithms, non-invasive electroencephalogram signal recognition has come to play an important role in detecting and treating psychological diseases such as depression. However, existing electroencephalogram signal identification techniques have several problems, such as low recognition accuracy, insufficient extraction of signal features, and an inability to handle the nonlinear and high-dimensional nature of electroencephalogram signals effectively, which impairs the identification of psychological diseases such as depression.
Disclosure of Invention
The invention mainly aims to provide a multi-dimensional characteristic electroencephalogram signal identification method based on a stacking integration technology, which is used for extracting multi-dimensional characteristics of electroencephalogram signals and improving the identification degree of the extracted electroencephalogram signals.
The technical scheme adopted by the invention is as follows:
A multi-dimensional characteristic electroencephalogram signal identification method based on a stacking integration technology comprises the following steps:
1) Acquiring two different classes of electroencephalogram signals, from healthy subjects and from non-healthy subjects, for use in subsequent signal identification, and preprocessing the electroencephalogram signal data;
2) Performing multidimensional feature extraction on the preprocessed electroencephalogram data, and constructing a feature matrix to obtain an original feature matrix;
3) Performing dimension reduction on the original feature matrix using a principal component analysis algorithm to obtain a final feature matrix;
4) Based on the electroencephalogram data preprocessed in step 1) and a stacking integration learning algorithm, constructing a multi-dimensional characteristic electroencephalogram signal identification model that takes the final feature matrix from step 3) as input; training the model to obtain a trained recognition model;
5) Performing data preprocessing and multi-dimensional feature extraction on the electroencephalogram signals to be identified, and inputting the result into the recognition model trained in step 4) to obtain the identification result.
In a further scheme, the step of preprocessing the data of the electroencephalogram signal in the step 1) mainly comprises denoising and normalization so as to eliminate noise components in the signal, improve the signal quality and obtain the subsequently available electroencephalogram data.
In a further scheme, preprocessing the electroencephalogram signal data in step 1) comprises the following steps:
11) Denoising the original electroencephalogram signal to eliminate noise components, specifically: first, filtering with a 0.5-40 Hz band-pass filter; second, removing artifacts such as blinks and eye movements from the data by independent component analysis;
12) Normalizing the denoised signal so that its amplitude range is scaled to between 0 and 1, specifically by Min-Max normalization as follows:
$$ z'_j = \frac{z_j - z_{\min}}{z_{\max} - z_{\min}} $$
where $z'_j$ is the normalized value of $z_j$, $z_j$ is the j-th element in sample z, $z_{\max}$ is the maximum value in the sample data, and $z_{\min}$ is the minimum value in the sample data.
In a further scheme, in step 2), feature extraction is performed on the preprocessed electroencephalogram data in the time-frequency domain and the spatial domain to obtain multidimensional features, from which the original feature matrix $X = \{x_1, x_2, \ldots, x_m\}$ is constructed.
In a further scheme, in step 2), the steps for extracting features from the preprocessed electroencephalogram data in the time-frequency domain and the spatial domain and constructing the original feature matrix are as follows:
21) Performing feature extraction on the preprocessed electroencephalogram signal in the time-frequency domain based on the discrete wavelet transform to obtain time-frequency domain features;
22) Performing feature extraction on the preprocessed electroencephalogram signal in the spatial domain based on the common spatial pattern (CSP) method to obtain spatial domain features; CSP is a feature extraction algorithm for two-class tasks that minimizes the variance of one class while maximizing the variance of the other, yielding the most discriminative feature vectors;
23) Combining the extracted time-frequency domain features and spatial domain features into a combined feature matrix, thereby obtaining the original feature matrix.
In a further scheme, in step 21), the electroencephalogram signal is decomposed by the discrete wavelet transform into approximation components and detail components, and the energy of the detail components is used as the time-frequency domain feature data.
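The detail-energy features of step 21) can be illustrated with a minimal NumPy sketch. It is only an illustration under stated assumptions: Haar filters are swapped in to keep the example dependency-free, so a different wavelet basis would need different filter coefficients, and the function names `haar_dwt_step` and `detail_energies`, the synthetic signal, and the choice of 4 levels are all hypothetical.

```python
import numpy as np

def haar_dwt_step(x):
    """One decomposition step with Haar filters: returns (approximation, detail)."""
    x = np.asarray(x, dtype=float)
    if x.size % 2:                       # pad odd-length input by repeating the last sample
        x = np.append(x, x[-1])
    even, odd = x[0::2], x[1::2]
    approx = (even + odd) / np.sqrt(2.0)  # low-pass branch
    detail = (even - odd) / np.sqrt(2.0)  # high-pass branch
    return approx, detail

def detail_energies(x, levels):
    """Energy of the detail coefficients at each level, finest level first."""
    energies = []
    approx = np.asarray(x, dtype=float)
    for _ in range(levels):
        approx, detail = haar_dwt_step(approx)
        energies.append(float(np.sum(detail ** 2)))
    return np.array(energies), approx

rng = np.random.default_rng(0)
signal = rng.standard_normal(512)        # synthetic single-channel segment (assumption)
features, final_approx = detail_energies(signal, levels=4)
```

Because the Haar transform is orthonormal, the detail energies plus the energy of the final approximation add up to the energy of the input, which is a convenient sanity check on any implementation of this step.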
In a further scheme, in step 3), the original feature matrix is reduced in dimension by a principal component analysis (PCA) algorithm to obtain the final feature matrix. The reduction maps the high-dimensional data to a low-dimensional space, reduces redundant information, and improves data processing efficiency and model accuracy. The specific steps are as follows:
31) Centering the original feature matrix $X = \{x_1, x_2, \ldots, x_m\}$, where m is the number of feature vectors and $x_m$ is the m-th feature vector, to obtain the centered matrix Y;
32) Calculating the covariance matrix D of the centered matrix Y:
$$ D = \frac{1}{m} Y Y^{T} $$
where m is the number of feature vectors, Y is the centered matrix, and $Y^{T}$ is the transpose of Y;
33) Calculating the eigenvalues and eigenvectors of the covariance matrix D by singular value decomposition;
34) Sorting the eigenvalues obtained in step 33) in descending order and selecting the k largest; the k corresponding eigenvectors are then taken as column vectors to form the eigenvector matrix H, where k is determined from the cumulative contribution rate;
35) From the matrix H, obtaining the final feature matrix after PCA dimension reduction, F = HX, where X is the original feature matrix. The final feature matrix retains as much useful information as possible, reduces redundant information in the data, improves data processing efficiency, and is better suited as input to the classifier in the next step.
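Steps 31) to 35) can be sketched in NumPy as follows. This is a sketch under assumptions rather than the patented implementation: feature vectors are stored as columns of X, the selected eigenvectors are stored as rows of H so that the projection can be written literally as F = HX, and the 0.95 cumulative-contribution threshold and the name `pca_reduce` are hypothetical choices that the text does not specify.

```python
import numpy as np

def pca_reduce(X, threshold=0.95):
    """PCA on X of shape (n_dims, m), one feature vector per column."""
    X = np.asarray(X, dtype=float)
    m = X.shape[1]
    Y = X - X.mean(axis=1, keepdims=True)      # step 31: centering
    D = (Y @ Y.T) / m                          # step 32: covariance matrix
    U, S, _ = np.linalg.svd(D)                 # step 33: eigenpairs of the symmetric D, descending
    cumulative = np.cumsum(S) / np.sum(S)      # step 34: cumulative contribution rate
    k = min(int(np.searchsorted(cumulative, threshold)) + 1, S.size)
    H = U[:, :k].T                             # k leading eigenvectors, stored as rows here
    F = H @ X                                  # step 35: F = HX
    return F, H, k

# Synthetic data (assumption): two informative dimensions plus small noise.
rng = np.random.default_rng(0)
latent = rng.standard_normal((2, 200))
X = np.zeros((10, 200))
X[0], X[1] = latent[0], latent[1]
X += 0.01 * rng.standard_normal((10, 200))
F, H, k = pca_reduce(X)
```

Projecting the centered matrix Y instead of X would shift every column of F by the same constant vector, so the choice does not change the relative geometry of the reduced features.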
In a further scheme, in step 4), the electroencephalogram signal recognition model is built with a stacking integration algorithm: a convolutional neural network and a long short-term memory network are selected as the base models of the first layer, and a logistic regression classifier is selected as the meta-model of the second layer. The recognition model thus has two layers. The first layer consists of the two base models; each base model is trained on the training set, and the trained base model then classifies the data and outputs predicted labels. The second layer is the meta-model, which takes the output of the first layer as its input and makes the final prediction. Combined with 5-fold cross-validation, this yields the multi-dimensional characteristic electroencephalogram signal identification model based on the stacking integration technology. Training this model gives the trained recognition model, which can effectively improve the recognition accuracy for electroencephalogram signals.
Further, the specific steps of training the multi-dimensional characteristic electroencephalogram signal recognition model to obtain a trained recognition model are as follows:
41) Dividing the electroencephalogram data preprocessed in step 1) into a training set $D_{train}$ and a test set $D_{test}$; dividing $D_{train}$ equally into five subsets; in each training round, selecting one subset as the validation set $D_m$ (m = 1, 2, 3, 4, 5) and forming a new training set from the other four subsets;
42) Training a base model of the first layer on the different new training sets; for the same base model, the five different training sets yield five models with different parameters;
43) Using each trained base model to predict its corresponding validation set $D_m$, giving prediction results $M_i$ (i = 1, 2, 3, 4, 5); then using the trained base model to predict the test set $D_{test}$, giving prediction results $N_i$ (i = 1, 2, 3, 4, 5);
44) Repeating steps 42) and 43) for every base model, performing model training and prediction with the new training sets and the validation sets; for each base model, the validation-set predictions are collected into a set $M_n$, and the validation-set prediction sets of all base models are denoted M; all base models also predict the test set, the weighted average of the test-set predictions $N_i$ of each base model is denoted $N_n$, and the test-set prediction sets of all base models are denoted N;
45) Training the second-layer meta-model with the validation-set prediction set M of all base models obtained in step 44) as its training set, to obtain a trained meta-model; then testing the model with the test-set prediction set N of all base models as the test set of the second-layer meta-model, thereby obtaining the final multi-dimensional characteristic electroencephalogram signal identification model based on the stacking integration technology.
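The data flow of steps 41) to 45) can be sketched as below. The sketch makes several assumptions that are not in the text: two tiny NumPy classifiers (`LogisticGD` and `NearestCentroid`, both hypothetical names) stand in for the CNN and LSTM base models, the base models emit class-1 scores rather than hard labels, the test-set predictions are averaged with equal weights, and the data are synthetic Gaussian blobs.

```python
import numpy as np

class LogisticGD:
    """Tiny logistic regression trained by gradient descent (stand-in model)."""
    def __init__(self, lr=0.1, epochs=300):
        self.lr, self.epochs = lr, epochs
    def fit(self, X, y):
        Xb = np.hstack([X, np.ones((len(X), 1))])          # append bias column
        self.w = np.zeros(Xb.shape[1])
        for _ in range(self.epochs):
            p = 1.0 / (1.0 + np.exp(-Xb @ self.w))
            self.w -= self.lr * Xb.T @ (p - y) / len(y)
        return self
    def predict_proba(self, X):
        Xb = np.hstack([X, np.ones((len(X), 1))])
        return 1.0 / (1.0 + np.exp(-Xb @ self.w))

class NearestCentroid:
    """Scores a sample by its relative distance to the two class centroids."""
    def fit(self, X, y):
        self.c0, self.c1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
        return self
    def predict_proba(self, X):
        d0 = np.linalg.norm(X - self.c0, axis=1)
        d1 = np.linalg.norm(X - self.c1, axis=1)
        return d0 / (d0 + d1 + 1e-12)                      # closer to c1 -> higher score

def stack_first_layer(make_models, X_train, y_train, X_test, n_folds=5, seed=0):
    """Return out-of-fold predictions M and fold-averaged test predictions N."""
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X_train)), n_folds)   # step 41
    M = np.zeros((len(X_train), len(make_models)))
    N = np.zeros((len(X_test), len(make_models)))
    for j, make in enumerate(make_models):                 # step 44: every base model
        for val_idx in folds:
            train_idx = np.setdiff1d(np.arange(len(X_train)), val_idx)
            model = make().fit(X_train[train_idx], y_train[train_idx])   # step 42
            M[val_idx, j] = model.predict_proba(X_train[val_idx])        # step 43: M_i
            N[:, j] += model.predict_proba(X_test) / n_folds             # average of N_i
    return M, N

rng = np.random.default_rng(1)
X = np.vstack([rng.normal(-2, 1, (100, 2)), rng.normal(2, 1, (100, 2))])
y = np.r_[np.zeros(100), np.ones(100)]
order = rng.permutation(200)
X, y = X[order], y[order]
X_train, y_train, X_test, y_test = X[:160], y[:160], X[160:], y[160:]

M, N = stack_first_layer([LogisticGD, NearestCentroid], X_train, y_train, X_test)
meta = LogisticGD(lr=0.5, epochs=2000).fit(M, y_train)     # step 45: meta-model on M
accuracy = np.mean((meta.predict_proba(N) >= 0.5) == y_test)
```

The key property is that every row of M comes from a model that never saw that sample during training, so the meta-model is fitted on predictions of the same kind it will receive at test time.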
In a further scheme, the first-layer base models of the multi-dimensional characteristic electroencephalogram signal identification model based on the stacking integration technology, and their parameters, are selected as follows:
a. Convolutional neural network (CNN): the model uses 2 convolutional layers; the first convolutional layer is followed by 1 Dropout layer, and the second convolutional layer is followed by 1 max pooling layer and 1 fully connected layer. The kernel size of the first convolutional layer is 64×5, and the kernel size of the second convolutional layer is 128×3. ReLU is used as the activation function after each convolutional layer. The max pooling layer reduces the input size, the memory usage, and the number of parameters, thereby reducing the amount of computation; the Dropout layer prevents the neural network from overfitting; finally, a Softmax function provides the class prediction output of the classification problem;
b. Long short-term memory network (LSTM): the hidden layer size of the LSTM unit is set to 64. The LSTM structure consists of a memory cell that stores information and three gates, namely an input gate, an output gate, and a forget gate, which control the input and output of data. The LSTM uses four different operations, namely sigmoid, tanh, multiplication, and addition, which make the weights easier to update during model training. Finally, in the fully connected layer, a Softmax activation function lets the neural network perform binary classification.
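The gate structure described for the LSTM can be made concrete with a forward-pass sketch in NumPy. It is not a trainable model: the weights are random, the input dimension of 8 and the sequence length of 20 are arbitrary assumptions, and the names `lstm_step`, `W`, `U`, `b`, and `W_out` are hypothetical. Only the hidden size of 64 and the two-class Softmax output come from the text.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x_t, h_prev, c_prev, W, U, b):
    """One LSTM time step. W: (4H, D), U: (4H, H), b: (4H,)."""
    H = h_prev.shape[0]
    gates = W @ x_t + U @ h_prev + b
    i = sigmoid(gates[0:H])              # input gate
    f = sigmoid(gates[H:2 * H])          # forget gate
    o = sigmoid(gates[2 * H:3 * H])      # output gate
    g = np.tanh(gates[3 * H:4 * H])      # candidate content for the memory cell
    c_t = f * c_prev + i * g             # multiplication and addition update the memory cell
    h_t = o * np.tanh(c_t)               # gated output of the cell
    return h_t, c_t

def softmax(z):
    e = np.exp(z - z.max())              # subtract the max for numerical stability
    return e / e.sum()

hidden, n_features, n_steps = 64, 8, 20
rng = np.random.default_rng(0)
W = rng.normal(0, 0.1, (4 * hidden, n_features))
U = rng.normal(0, 0.1, (4 * hidden, hidden))
b = np.zeros(4 * hidden)
W_out = rng.normal(0, 0.1, (2, hidden))  # fully connected layer for two classes

h, c = np.zeros(hidden), np.zeros(hidden)
sequence = rng.standard_normal((n_steps, n_features))
for x_t in sequence:
    h, c = lstm_step(x_t, h, c, W, U, b)
probs = softmax(W_out @ h)               # class probabilities for healthy / non-healthy
```

The additive update of `c_t` is what lets gradients flow across many time steps, which is one common reading of the statement that these operations make the weights easier to update.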
The invention also provides a multi-dimensional characteristic electroencephalogram signal identification system based on the stacking integration technology, which applies the identification method described above and comprises:
The electroencephalogram signal acquisition module is used for acquiring electroencephalogram signals;
The preprocessing module is used for preprocessing the acquired electroencephalogram signals;
The multidimensional feature extraction module is used for performing multidimensional feature extraction on the preprocessed electroencephalogram data and constructing a feature matrix to obtain an original feature matrix;
The dimension reduction processing module is used for performing dimension reduction on the original feature matrix using a principal component analysis algorithm to obtain a final feature matrix;
The stacking integrated learning module is used for constructing, based on the preprocessed electroencephalogram data and a stacking integration learning algorithm, a multi-dimensional characteristic electroencephalogram signal identification model that takes the final feature matrix as input; training the model to obtain a trained recognition model; and, after data preprocessing and multi-dimensional feature extraction of the electroencephalogram signals to be identified, inputting the result into the trained recognition model to obtain the identification result.
The invention has the beneficial effects that:
The invention extracts both time-frequency domain and spatial domain features from the electroencephalogram signals. Compared with extracting a single type of feature, the multidimensional features retain the information contained in the signals as completely as possible and can effectively improve the recognition accuracy and classification performance of the model; the feature matrix is also reduced in dimension, which improves data processing efficiency;
The identification model in the invention combines the advantages of multiple base learners, which effectively improves the generalization capability of the model, compensates for the weaknesses of a single model, and improves classification performance; the model uses cross-validation, which helps prevent overfitting and further improves generalization capability and accuracy;
The invention applies deep learning to electroencephalogram signals and identifies and classifies them with an artificial intelligence algorithm, further improving recognition accuracy and achieving the intended effect of assisting in the diagnosis of psychological diseases such as depression;
The invention extracts the multidimensional features of the electroencephalogram signals and retains the information in the signals as completely as possible;
A multi-dimensional characteristic electroencephalogram signal identification model is built from the electroencephalogram signals of healthy and non-healthy subjects and is trained to improve judgment accuracy;
The invention adopts the stacking integration technology to combine the advantages of multiple models and to improve the generalization capability and accuracy of the model, thereby assisting in the diagnosis of psychological diseases.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the present invention, and other drawings can be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a flow chart of a multi-dimensional characteristic EEG signal recognition method based on a stacked integration technology;
FIG. 2 is a diagram showing specific positions of electrodes during electroencephalogram signal acquisition according to an embodiment of the present invention;
FIG. 3 is a schematic diagram of training and predicting a first tier model of a stacked integration strategy.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
According to the invention, the multi-dimensional characteristic electroencephalogram recognition model based on the stacking integration technology is trained by collecting the electroencephalogram signals of a plurality of subjects (the electroencephalogram signals of healthy subjects and unhealthy subjects), so that a trained recognition model is obtained. Preprocessing the electroencephalogram signals to be identified through denoising, normalization and the like, extracting multidimensional features to obtain an original feature matrix, and performing dimension reduction processing on the matrix by using a principal component analysis algorithm to obtain a final feature matrix. The final feature matrix is input into a trained recognition model, a prediction result 0 or 1 (wherein 0 represents health and 1 represents non-health) is output, whether the subject has psychological diseases or not is judged in an auxiliary mode according to the prediction result given by the model, and the recognition degree of the electroencephalogram signals can be effectively improved.
The invention uses a stacking integration learning algorithm, which improves prediction accuracy by integrating the prediction results of multiple base learners. The invention uses multidimensional feature extraction, reduces the dimension of the feature matrix with a principal component analysis algorithm, and trains a convolutional neural network and a long short-term memory network to achieve accurate identification of electroencephalogram signals.
Example 1
Referring to fig. 1, a flow chart of a multi-dimensional characteristic electroencephalogram signal identification method based on a stacked integration technology comprises the following steps: acquiring two different electroencephalogram signals of a healthy subject and a non-healthy subject and preprocessing data of the electroencephalogram signals; then carrying out multidimensional feature extraction on the preprocessed electroencephalogram data to obtain an original feature matrix; performing dimension reduction treatment on the original feature matrix to obtain a final feature matrix; constructing a multi-dimensional characteristic electroencephalogram signal identification model based on a stacking integration technology by taking the final characteristic matrix as input; training the model to obtain a trained recognition model; the electroencephalogram signal to be identified is subjected to data preprocessing and multidimensional feature extraction and then is input into a trained model, a prediction result 0 or 1 (wherein 0 represents health and 1 represents non-health) is output, whether the subject has psychological diseases or not is judged in an auxiliary mode according to the prediction result given by the model, and the identification degree of the electroencephalogram signal can be effectively improved. The method comprises the following specific steps:
S1: Acquire two different classes of electroencephalogram signals, from healthy subjects and from non-healthy subjects, and preprocess the signal data. Preprocessing mainly comprises denoising and normalization, which eliminate noise components, improve signal quality, and yield electroencephalogram data usable in later steps.
S2: Perform multidimensional feature extraction on the preprocessed data and construct a feature matrix to obtain the original feature matrix. Features are extracted from the preprocessed electroencephalogram data in the time-frequency domain and the spatial domain to obtain the multidimensional features from which the feature matrix is constructed.
S3: Perform dimension reduction on the original feature matrix using a principal component analysis algorithm to obtain the final feature matrix.
S4: Based on the obtained electroencephalogram data and a stacking integration learning algorithm, construct a multi-dimensional characteristic electroencephalogram signal identification model that takes the final feature matrix from step S3 as input; train the model to obtain a trained recognition model.
S5: Perform data preprocessing and multi-dimensional feature extraction on the electroencephalogram signals to be identified, input the result into the recognition model trained in step S4, and output a prediction result of 0 or 1 (where 0 represents healthy and 1 represents non-healthy).
The specific method of step S1 is as follows:
S11: Acquire two classes of electroencephalogram signal data, from healthy subjects and from non-healthy subjects. Electroencephalogram signals are first acquired from the subject with an electroencephalogram acquisition device. To obtain the best signal quality, the following parameters are set: data are acquired on 64 channels, one of which is set as the reference electrode, and the sampling rate is set to 500 Hz. The specific electrode positions are shown in FIG. 2.
S12: Denoise the original electroencephalogram signal to eliminate noise components. First, filter the signal with a 0.5-40 Hz band-pass filter; second, remove artifacts such as blinks and eye movements from the data by independent component analysis.
S13: Normalize the denoised signal so that its amplitude range is scaled to between 0 and 1. The data are preprocessed with Min-Max normalization as follows:
$$ z'_j = \frac{z_j - z_{\min}}{z_{\max} - z_{\min}} $$
where $z'_j$ is the normalized value of $z_j$, $z_j$ is the j-th element in sample z, $z_{\max}$ is the maximum value in the sample data, and $z_{\min}$ is the minimum value in the sample data.
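The filtering in S12 and the normalization in S13 can be sketched together in NumPy. The sketch rests on assumptions: the band-pass is a simple FFT brick-wall mask, swapped in for whatever filter design the acquisition pipeline actually uses; the independent component analysis step is omitted; and the function names and the synthetic two-tone signal are hypothetical. The 0.5-40 Hz band and the 500 Hz sampling rate come from the text.

```python
import numpy as np

def bandpass_fft(signal, fs, low=0.5, high=40.0):
    """Zero out FFT bins outside [low, high] Hz (brick-wall mask, illustration only)."""
    spectrum = np.fft.rfft(signal, axis=-1)
    freqs = np.fft.rfftfreq(signal.shape[-1], d=1.0 / fs)
    mask = (freqs >= low) & (freqs <= high)
    return np.fft.irfft(spectrum * mask, n=signal.shape[-1], axis=-1)

def min_max_normalize(z):
    """Scale one sample to [0, 1] via (z_j - z_min) / (z_max - z_min)."""
    z = np.asarray(z, dtype=float)
    z_min, z_max = z.min(), z.max()
    if z_max == z_min:                   # constant signal: avoid division by zero
        return np.zeros_like(z)
    return (z - z_min) / (z_max - z_min)

fs = 500                                 # sampling rate stated in S11
t = np.arange(1000) / fs                 # 2 s of synthetic data
raw = np.sin(2 * np.pi * 10 * t) + 0.5 * np.sin(2 * np.pi * 60 * t)
filtered = bandpass_fft(raw, fs)         # the 60 Hz component lies outside the pass band
normalized = min_max_normalize(filtered)
```

A brick-wall mask is easy to verify but can ring on real recordings, so a production pipeline would more likely use a designed FIR or IIR filter.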
The specific method of the step 2 is as follows:
S21: Extract features from the preprocessed electroencephalogram signal in the time-frequency domain based on the discrete wavelet transform to obtain time-frequency-domain features: the signal is decomposed by the discrete wavelet transform into an approximation component and detail components, and the energy information of the detail components is used as the time-frequency-domain feature data.
Specifically, the discrete wavelet transform is defined as follows:

$$W_f(a, n) = 2^{-a/2} \int_{-\infty}^{+\infty} f(t)\, \psi\left(2^{-a} t - n\right) dt$$

In this formula, $\psi$ is the wavelet basis function, a and n denote the frequency resolution and the time shift respectively, f(t) denotes the preprocessed electroencephalogram signal, and t denotes the time index. The wavelet function selected by the invention is db4. The signal is decomposed using the Mallat algorithm:

$$x[e] = A_L[e] + \sum_{i=1}^{L} D_i[e]$$

In this formula, x[e] is the discrete output signal, e denotes the time index, L is the number of decomposition layers, $A_L$ is the low-pass approximation component, and $D_i$ is the detail component of the i-th layer.
S22: Extract features from the preprocessed electroencephalogram signal in the spatial domain based on the common spatial pattern (CSP) method to obtain spatial-domain features. CSP is a feature extraction algorithm for two-class tasks: it finds spatial filters that maximize the variance of one class while minimizing the variance of the other, which yields the most discriminative feature vectors.
S23: Combine the extracted time-frequency-domain features and spatial-domain features into a joint feature matrix, which is the original feature matrix.
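Steps S21 to S23 can be sketched as follows, under several stated assumptions. The sketch uses NumPy only, so it substitutes the Haar wavelet for the db4 wavelet named in the text (a library such as PyWavelets would be needed for db4). The CSP part follows the usual whitening-plus-eigendecomposition construction with log-variance features, which the text does not spell out. All function names, the number of decomposition levels, the number of filter pairs, and the synthetic data are hypothetical.

```python
import numpy as np

def haar_detail_energies(signal, levels):
    """Energy of the detail coefficients at each level of a Haar filter-bank decomposition."""
    approx = np.asarray(signal, dtype=float)
    energies = []
    for _ in range(levels):
        if approx.size % 2:                      # pad odd lengths by repeating the last sample
            approx = np.append(approx, approx[-1])
        even, odd = approx[0::2], approx[1::2]
        detail = (even - odd) / np.sqrt(2.0)     # high-pass branch: detail component D_i
        approx = (even + odd) / np.sqrt(2.0)     # low-pass branch, decomposed again next level
        energies.append(float(np.sum(detail ** 2)))
    return np.array(energies)

def csp_filters(trials_a, trials_b, n_pairs=1):
    """trials_*: (n_trials, n_channels, n_samples). Returns (2 * n_pairs, n_channels) filters."""
    def mean_cov(trials):
        covs = [t @ t.T / np.trace(t @ t.T) for t in trials]
        return np.mean(covs, axis=0)
    ca, cb = mean_cov(trials_a), mean_cov(trials_b)
    # Whiten the composite covariance, then diagonalize class A in the whitened space.
    evals, evecs = np.linalg.eigh(ca + cb)
    whitening = np.diag(evals ** -0.5) @ evecs.T
    lam, b = np.linalg.eigh(whitening @ ca @ whitening.T)   # eigenvalues in ascending order
    w = b.T @ whitening                                      # rows are spatial filters
    picks = np.r_[np.arange(n_pairs), np.arange(len(lam) - n_pairs, len(lam))]
    return w[picks]

def csp_features(trial, filters):
    """Log of the normalized variance of each spatially filtered signal."""
    var = (filters @ trial).var(axis=1)
    return np.log(var / var.sum())

# Synthetic two-class data: class A is strong on channel 0, class B on channel 3.
rng = np.random.default_rng(0)
n_trials, n_channels, n_samples = 20, 4, 256
scale_a = np.array([3.0, 1.0, 1.0, 1.0])[:, None]
scale_b = np.array([1.0, 1.0, 1.0, 3.0])[:, None]
trials_a = rng.standard_normal((n_trials, n_channels, n_samples)) * scale_a
trials_b = rng.standard_normal((n_trials, n_channels, n_samples)) * scale_b

filters = csp_filters(trials_a, trials_b, n_pairs=1)

def feature_row(trial, levels=3):
    """S23: concatenate per-channel detail energies with the CSP features."""
    dwt = np.concatenate([haar_detail_energies(ch, levels) for ch in trial])
    return np.concatenate([dwt, csp_features(trial, filters)])

X = np.array([feature_row(t) for t in np.concatenate([trials_a, trials_b])])
print(X.shape)  # (40, 14): 4 channels x 3 levels + 2 CSP features
```

Here each row of `X` is one trial; whether the original feature matrix stores feature vectors as rows or as columns is not stated in the text.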
The specific method of the step 3 is as follows:
S31: Center the original feature matrix $X = \{x_1, x_2, \ldots, x_m\}$ by subtracting the mean, where m is the number of feature vectors and $x_m$ is the m-th feature vector, to obtain the centered matrix Y;
S32: Calculate the covariance matrix D of the centered matrix Y:

$$D = \frac{1}{m} Y Y^{T}$$

where m is the number of feature vectors, Y is the centered matrix, and $Y^{T}$ is the transpose of Y;
S33: Calculate the eigenvalues and eigenvectors of the covariance matrix D through singular value decomposition;
S34: Sort the obtained eigenvalues from largest to smallest and select the k largest. The k corresponding eigenvectors are then used as column vectors to form the eigenvector matrix H, where k is determined from the cumulative contribution rate;
S35: From the matrix H, obtain the final dimension-reduced feature matrix $F = H^{T} X$, where X is the original feature matrix. The final feature matrix after PCA dimension reduction retains as much useful information as possible while reducing redundant information in the data, which improves data processing efficiency and makes it a better input for the classifier in the next step.
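Steps S31 to S35 can be sketched as follows. The sketch assumes NumPy, assumes that the m feature vectors are the columns of X, and assumes a cumulative contribution threshold of 0.95, which the text does not specify; the function name `pca_reduce` and the synthetic data are hypothetical.

```python
import numpy as np

def pca_reduce(X, contribution=0.95):
    """PCA on X of shape (d, m): d features, m feature vectors stored as columns."""
    m = X.shape[1]
    Y = X - X.mean(axis=1, keepdims=True)        # S31: centering
    D = (Y @ Y.T) / m                            # S32: covariance matrix
    # S33: D is symmetric positive semi-definite, so its singular values are its
    # eigenvalues (already sorted in descending order) and the columns of U are eigenvectors.
    U, s, _ = np.linalg.svd(D)
    ratio = np.cumsum(s) / np.sum(s)             # S34: cumulative contribution rate
    k = min(int(np.searchsorted(ratio, contribution)) + 1, len(s))
    H = U[:, :k]                                 # eigenvector matrix, one eigenvector per column
    F = H.T @ X                                  # S35: projection of the original matrix X
    return F, H, k

# Synthetic data: the second feature is almost a multiple of the first, the third is tiny noise.
rng = np.random.default_rng(0)
t = rng.standard_normal(200)
X = np.vstack([t,
               2.0 * t + 0.01 * rng.standard_normal(200),
               0.01 * rng.standard_normal(200)])

F, H, k = pca_reduce(X)
print(k, F.shape)
```

Following S35, the sketch projects the original matrix X; projecting the centered matrix Y instead is also common and differs only by a constant offset per component.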
The specific method of the step 4 is as follows:
An electroencephalogram signal recognition model is constructed using a stacking integration algorithm: two algorithms, a convolutional neural network and a long short-term memory network, are selected as the base models of the first layer, and a logistic regression classifier is selected as the meta model of the second layer. The recognition model therefore has two layers. The first layer consists of the two base models; each base model is trained on the training set, and the trained base model is then used to classify the data and output predicted labels. The second layer is the meta model, which takes the output of the first layer as its input and makes the final prediction. Combined with 5-fold cross-validation, this yields the multi-dimensional characteristic electroencephalogram signal recognition model based on the stacking integration technology, which is then trained to obtain a trained recognition model. The training and prediction steps of the first-layer base models are shown in Fig. 3.
Training the multidimensional characteristic electroencephalogram signal recognition model to obtain a trained recognition model, wherein the specific steps are as follows:
S41: Divide the electroencephalogram data into a training set $D_{train}$ and a test set $D_{test}$, and divide the training set $D_{train}$ equally into five subsets; in each training round, one subset is selected as the validation set $D_m$ (m = 1, 2, 3, 4, 5), and the other four subsets form a new training set;
S42: Train the first-layer base model on the different new training sets. For the same base model, the five different training sets yield five models with different parameters;
S43: Use each trained base model to predict its corresponding validation set $D_m$, obtaining a prediction result $M_i$ (i = 1, 2, 3, 4, 5); then use the same trained base model to predict the test set $D_{test}$, obtaining a prediction result $N_i$ (i = 1, 2, 3, 4, 5);
S44: Repeat steps S42 and S43 for every base model, performing model training and prediction with the new training sets and validation sets. For each base model, the validation-set prediction results are collected into a set $M_n$, and the validation-set prediction sets of all base models are denoted M. For each base model, the weighted average of its test-set prediction results $N_i$ is denoted $N_n$, and the test-set prediction sets of all base models are denoted N;
S45: Use the validation-set prediction set M of all base models obtained in step S44 as the training set of the second-layer meta model, and train it to obtain a trained meta model. The test-set prediction set N of all base models serves as the test set of the second-layer meta model, which gives the final multi-dimensional characteristic electroencephalogram signal recognition model based on the stacking integration technology.
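The data flow of S41 to S45 can be sketched as follows. To stay self-contained, the sketch replaces the CNN and LSTM base models with two simple NumPy stand-ins (a gradient-descent logistic regression and a centroid-distance scorer), passes predicted probabilities rather than hard labels to the meta model, and averages the five test-set predictions with equal weights. All of these are assumptions, as are the class names, the function name `stack_features`, and the synthetic data.

```python
import numpy as np

class LogReg:
    """Gradient-descent logistic regression: the meta model, and one stand-in base model."""
    def __init__(self, lr=0.1, epochs=500):
        self.lr, self.epochs = lr, epochs
    def fit(self, X, y):
        Xb = np.hstack([X, np.ones((len(X), 1))])            # append a bias column
        self.w = np.zeros(Xb.shape[1])
        for _ in range(self.epochs):
            p = 1.0 / (1.0 + np.exp(-Xb @ self.w))
            self.w -= self.lr * Xb.T @ (p - y) / len(y)
        return self
    def predict_proba(self, X):
        Xb = np.hstack([X, np.ones((len(X), 1))])
        return 1.0 / (1.0 + np.exp(-Xb @ self.w))

class CentroidScore:
    """Stand-in base model: sigmoid of the distance difference to the two class centroids."""
    def fit(self, X, y):
        self.c0, self.c1 = X[y == 0].mean(axis=0), X[y == 1].mean(axis=0)
        return self
    def predict_proba(self, X):
        d0 = np.linalg.norm(X - self.c0, axis=1)
        d1 = np.linalg.norm(X - self.c1, axis=1)
        return 1.0 / (1.0 + np.exp(d1 - d0))

def stack_features(model_classes, X_train, y_train, X_test, n_folds=5, seed=0):
    rng = np.random.default_rng(seed)
    folds = np.array_split(rng.permutation(len(X_train)), n_folds)    # S41: five subsets
    M = np.zeros((len(X_train), len(model_classes)))                  # validation-set predictions
    N = np.zeros((len(X_test), len(model_classes)))                   # test-set predictions
    for n, make in enumerate(model_classes):                          # S44: every base model
        test_preds = []
        for m in range(n_folds):
            val_idx = folds[m]
            fit_idx = np.concatenate([folds[j] for j in range(n_folds) if j != m])
            model = make().fit(X_train[fit_idx], y_train[fit_idx])    # S42
            M[val_idx, n] = model.predict_proba(X_train[val_idx])     # S43: M_i on D_m
            test_preds.append(model.predict_proba(X_test))            # S43: N_i on D_test
        N[:, n] = np.mean(test_preds, axis=0)                         # S44: N_n, equal weights
    return M, N

rng = np.random.default_rng(1)
def blobs(n):
    """Two well-separated Gaussian classes in 4 dimensions (synthetic, not EEG features)."""
    y = np.repeat([0, 1], n // 2)
    return rng.standard_normal((n, 4)) + 2.0 * y[:, None], y

X_train, y_train = blobs(100)
X_test, y_test = blobs(40)

M, N = stack_features([LogReg, CentroidScore], X_train, y_train, X_test)
meta = LogReg().fit(M, y_train)                                       # S45: meta model trained on M
y_pred = (meta.predict_proba(N) >= 0.5).astype(int)
accuracy = float(np.mean(y_pred == y_test))
print(M.shape, N.shape, accuracy)
```

Because every row of M is predicted by a model that never saw that row during training, the meta model is fitted on out-of-fold predictions, which is the purpose of combining stacking with 5-fold cross-validation.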
The first layer base model of the multidimensional characteristic EEG signal identification model based on the stacking integration technology and parameters thereof are selected as follows:
a. Convolutional neural network (CNN): the model uses 2 convolutional layers, 1 Dropout layer, 1 max pooling layer, and 1 fully connected layer. The kernel size of the first convolutional layer is 64×5, and that of the second is 128×3. ReLU is used as the activation function after each convolutional layer. The max pooling layer reduces the input size, the memory usage, and the number of parameters, and thereby the amount of computation. The Dropout layer is used to prevent overfitting of the neural network, and the Softmax function is finally used to produce the class prediction output for the classification problem.
b. Long short-term memory network (LSTM): the size of the hidden layer in the LSTM unit is set to 64. The LSTM structure consists of one memory cell for storing information and three gates (an input gate, an output gate, and a forget gate), which control the input and output of data. Four different operations appear in the LSTM, namely sigmoid, tanh, multiplication, and addition, which make it easier to update the weights during model training. Finally, a Softmax activation function in the fully connected layer enables the neural network to perform binary classification.
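The gate structure described in item b can be illustrated with a single forward step of a standard LSTM cell in NumPy. This is only a sketch: the weights are random and untrained, the input dimension and sequence length are illustrative, the gate ordering inside the stacked weight matrices is a convention chosen here, and the names `lstm_step`, `W`, `U`, `b`, and `W_out` are hypothetical. Only the hidden size of 64 and the two-class Softmax output come from the text.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def lstm_step(x, h_prev, c_prev, W, U, b):
    """One LSTM time step. W: (4H, D), U: (4H, H), b: (4H,).
    Gate blocks are stacked in the order [input, forget, output, candidate]."""
    H = h_prev.shape[0]
    z = W @ x + U @ h_prev + b
    i = sigmoid(z[0:H])              # input gate
    f = sigmoid(z[H:2 * H])          # forget gate
    o = sigmoid(z[2 * H:3 * H])      # output gate
    g = np.tanh(z[3 * H:4 * H])      # candidate content for the memory cell
    c = f * c_prev + i * g           # multiplication and addition update the memory cell
    h = o * np.tanh(c)               # gated output of the cell
    return h, c

def softmax(v):
    e = np.exp(v - v.max())          # subtract the maximum for numerical stability
    return e / e.sum()

hidden, n_features, n_steps = 64, 10, 5          # hidden size from the text; the rest illustrative
rng = np.random.default_rng(0)
W = rng.standard_normal((4 * hidden, n_features)) * 0.1
U = rng.standard_normal((4 * hidden, hidden)) * 0.1
b = np.zeros(4 * hidden)
W_out = rng.standard_normal((2, hidden)) * 0.1   # fully connected layer onto the 2 classes

h, c = np.zeros(hidden), np.zeros(hidden)
for x in rng.standard_normal((n_steps, n_features)):
    h, c = lstm_step(x, h, c, W, U, b)

probs = softmax(W_out @ h)                       # class probabilities for labels 0 and 1
print(probs)
```

In practice the weights would be learned by a deep learning framework; the sketch only shows how the sigmoid, tanh, multiplication, and addition operations combine inside one cell.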
Example 2
A multi-dimensional characteristic electroencephalogram signal identification system based on the stacking integration technology adopts the multi-dimensional characteristic electroencephalogram signal identification method based on the stacking integration technology of embodiment 1 and comprises the following modules:
The electroencephalogram signal acquisition module is used for acquiring electroencephalogram signals;
the preprocessing module is used for preprocessing the acquired electroencephalogram signals;
the multidimensional feature extraction module is used for carrying out multidimensional feature extraction on the preprocessed electroencephalogram data, constructing a feature matrix and obtaining an original feature matrix;
The dimension reduction processing module is used for carrying out dimension reduction processing on the original feature matrix by utilizing a principal component analysis algorithm to obtain a final feature matrix;
The stacking integrated learning module is used for constructing, from the preprocessed electroencephalogram data and based on a stacking integrated learning algorithm, a multi-dimensional characteristic electroencephalogram signal recognition model based on the stacking integration technology with the final feature matrix as input; training the model to obtain a trained recognition model; and performing data preprocessing and multi-dimensional feature extraction on the electroencephalogram signal to be identified and inputting the result into the trained multi-dimensional characteristic electroencephalogram signal recognition model to obtain the recognition result.
It will be understood that modifications and variations will be apparent to those skilled in the art from the foregoing description, and it is intended that all such modifications and variations be included within the scope of the following claims.
Claims (8)
1. The multi-dimensional characteristic electroencephalogram signal identification method based on the stacking integration technology is characterized by comprising the following steps of:
1) Acquiring brain electrical signals of a healthy subject and a non-healthy subject, and preprocessing data of the two different brain electrical signals;
2) Performing multidimensional feature extraction on the preprocessed electroencephalogram data, and constructing a feature matrix to obtain an original feature matrix;
3) Performing dimension reduction processing on the original feature matrix by using a principal component analysis algorithm to obtain a final feature matrix;
4) Constructing a multi-dimensional characteristic electroencephalogram signal identification model based on a stacking integration technology by taking the final characteristic matrix in the step 3) as input based on a stacking integration learning algorithm on the preprocessed electroencephalogram data in the step 1); training the model to obtain a trained recognition model;
In step 4), an electroencephalogram signal recognition model is built using a stacking integration algorithm: two algorithms, a convolutional neural network and a long short-term memory network, are selected as the base models of the first layer, and a logistic regression classifier is selected as the meta model of the second layer; the recognition model is divided into two layers, wherein the first layer consists of the two base models, each base model is trained on the training set, and the trained base model is then used to classify the data and output predicted labels; the second layer is the meta model, which takes the output of the first layer as its input and makes the prediction; combined with 5-fold cross-validation, the multi-dimensional characteristic electroencephalogram signal recognition model based on the stacking integration technology is finally obtained; the multi-dimensional characteristic electroencephalogram signal recognition model is trained to obtain a trained recognition model;
training the multidimensional characteristic electroencephalogram signal recognition model to obtain a trained recognition model, wherein the specific steps are as follows:
41) Divide the electroencephalogram data preprocessed in step 1) into a training set $D_{train}$ and a test set $D_{test}$; then divide the training set $D_{train}$ equally into five subsets, one of which is selected as the validation set $D_m$ in each training round, while the other four form a new training set;
42) Train the base model of the first layer on the different new training sets, wherein for the same base model, the five different training sets yield five models with different parameters;
43) Use the trained base model to predict the corresponding validation set $D_m$ to obtain a prediction result $M_i$; then use the trained base model to predict the test set $D_{test}$ to obtain a prediction result $N_i$;
44) Train all base models by repeating steps 42) and 43), performing model training and prediction with the new training sets and validation sets respectively, and finally obtain, for each base model, a set $M_n$ of validation-set prediction results; the validation-set prediction sets of all base models are denoted M; predict the test set with all base models, take the weighted average of the test-set prediction results $N_i$ of each base model, denoted $N_n$, and denote the test-set prediction sets of all base models as N;
45) Use the validation-set prediction set M of all base models obtained in step 44) as the training set of the second-layer meta model, and train it to obtain a trained meta model; use the test-set prediction set N of all base models as the test set of the second-layer meta model, thereby obtaining the final multi-dimensional characteristic electroencephalogram signal recognition model based on the stacking integration technology;
5) Perform data preprocessing and multi-dimensional feature extraction on the electroencephalogram signal to be identified, and input the result into the trained multi-dimensional characteristic electroencephalogram signal recognition model of step 4) to obtain the recognition result.
2. The multi-dimensional characteristic electroencephalogram signal identification method based on stacked integration technology according to claim 1, wherein the method comprises the following steps of: the step of preprocessing the data of the electroencephalogram signals in the step 1) comprises denoising and normalization; the method comprises the following steps:
11) Denoise the original electroencephalogram signal to eliminate noise components in the signal, specifically: first, perform filtering; second, remove blink and eye-movement artifacts from the data using independent component analysis;
12) Normalize the denoised signal, scaling the amplitude range of the signal to between 0 and 1, specifically: the signal is preprocessed using Min-Max normalization as follows:

$$z_j' = \frac{z_j - z_{\min}}{z_{\max} - z_{\min}}$$

where $z_j$ is the j-th element of sample z, $z_{\max}$ is the maximum value in the sample data, and $z_{\min}$ is the minimum value in the sample data.
3. The multi-dimensional characteristic electroencephalogram signal identification method based on stacked integration technology according to claim 1, wherein the method comprises the following steps of: in the step 2), the preprocessed electroencephalogram data is subjected to feature extraction in the time-frequency domain and the spatial domain to obtain multidimensional features, so that an original feature matrix $X = \{x_1, x_2, \ldots, x_m\}$ is constructed.
4. The multi-dimensional characteristic electroencephalogram identification method based on stacking integration technology according to claim 3, wherein the method comprises the following steps of: in the step 2), the preprocessed electroencephalogram data is subjected to feature extraction on a time frequency domain and a space domain to obtain multidimensional features, so that the step of constructing an original feature matrix is as follows:
21) Perform feature extraction on the preprocessed electroencephalogram signal in the time-frequency domain based on the discrete wavelet transform to obtain time-frequency-domain features;
22) Perform feature extraction on the preprocessed electroencephalogram signal in the spatial domain based on the common spatial pattern (CSP) method to obtain spatial-domain features;
23) Combine the extracted time-frequency-domain features and spatial-domain features to construct a joint feature matrix, thereby obtaining the original feature matrix.
5. The multi-dimensional characteristic electroencephalogram signal identification method based on stacking integration technology according to claim 4, wherein the method comprises the following steps of: in step 21), an electroencephalogram signal is extracted based on discrete wavelet transformation, an approximation component and a detail component are obtained, and energy information of the detail component is used as time-frequency domain feature data.
6. The multi-dimensional characteristic electroencephalogram signal identification method based on stacked integration technology according to claim 1, wherein the method comprises the following steps of: in the step 3), the primary feature matrix is subjected to dimension reduction processing by using a principal component analysis algorithm, and the method for obtaining the final feature matrix comprises the following steps:
The method comprises the steps of performing dimension reduction processing on an original feature matrix by using a principal component analysis algorithm, reducing high-dimensional data into a low-dimensional space, reducing redundant information of the data, improving the processing efficiency of the data and the precision of a model, and obtaining a final feature matrix, wherein the method comprises the following specific steps of:
31) Center the original feature matrix $X = \{x_1, x_2, \ldots, x_m\}$, where m denotes the number of feature vectors and $x_m$ denotes the m-th feature vector, to obtain the centered matrix Y;
32) Calculate the covariance matrix D of the centered matrix Y:

$$D = \frac{1}{m} Y Y^{T},$$

where m is the number of feature vectors, Y is the centered matrix, and $Y^{T}$ is the transpose of the matrix Y;
33) Calculate the eigenvalues and eigenvectors of the covariance matrix D through singular value decomposition;
34) Sort the eigenvalues obtained in step 33) from largest to smallest and select the k largest; then use the k corresponding eigenvectors as column vectors to form an eigenvector matrix H, where k is calculated from the cumulative contribution rate;
35) Obtain the final dimension-reduced feature matrix $F = H^{T} X$ from the matrix H, where X is the original feature matrix.
7. The multi-dimensional characteristic electroencephalogram signal identification method based on stacked integration technology according to claim 1, wherein the method comprises the following steps of:
the base model of the first layer and its parameters are chosen as follows:
a. Convolutional neural network: the model uses 2 convolutional layers, 1 Dropout layer, 1 max pooling layer, and 1 fully connected layer; the kernel size of the first convolutional layer is 64×5, and the kernel size of the second convolutional layer is 128×3; ReLU is used as the activation function after each convolutional layer; the max pooling layer uses max pooling to reduce the input size, the memory usage, and the number of parameters, thereby reducing the amount of computation; the Dropout layer is used to prevent overfitting of the neural network, and the Softmax function is finally used as the class prediction output of the classification problem;
b. Long short-term memory network: the size of the hidden layer in the LSTM unit is set to 64, and the LSTM structure consists of one memory cell for storing information and three gates, namely an input gate, an output gate, and a forget gate; the three gates control the input and output of data; there are four different operations in the LSTM, namely sigmoid, tanh, multiplication, and addition, for updating weights during model training; finally, in the fully connected layer, the Softmax activation function enables the neural network to perform binary classification.
8. A multi-dimensional characteristic electroencephalogram signal identification system based on a stacking integration technology, characterized in that: the recognition system adopts the multi-dimensional characteristic electroencephalogram signal identification method based on the stacking integration technology according to any one of claims 1-7, and comprises the following modules:
The electroencephalogram signal acquisition module is used for acquiring electroencephalogram signals;
the preprocessing module is used for preprocessing the acquired electroencephalogram signals;
the multidimensional feature extraction module is used for carrying out multidimensional feature extraction on the preprocessed electroencephalogram data, constructing a feature matrix and obtaining an original feature matrix;
The dimension reduction processing module is used for carrying out dimension reduction processing on the original feature matrix by utilizing a principal component analysis algorithm to obtain a final feature matrix;
The stacking integrated learning module is used for constructing, from the preprocessed electroencephalogram data and based on a stacking integrated learning algorithm, a multi-dimensional characteristic electroencephalogram signal recognition model based on the stacking integration technology with the final feature matrix as input; training the model to obtain a trained recognition model; and performing data preprocessing and multi-dimensional feature extraction on the electroencephalogram signal to be identified and inputting the result into the trained multi-dimensional characteristic electroencephalogram signal recognition model to obtain the recognition result.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202311196950.1A CN117338313B (en) | 2023-09-15 | 2023-09-15 | Multi-dimensional characteristic electroencephalogram signal identification method based on stacking integration technology |
Publications (2)
Publication Number | Publication Date |
---|---|
CN117338313A (en) | 2024-01-05 |
CN117338313B (en) | 2024-05-07 |
Family
ID=89368159
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202311196950.1A Active CN117338313B (en) | 2023-09-15 | 2023-09-15 | Multi-dimensional characteristic electroencephalogram signal identification method based on stacking integration technology |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN117338313B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118141392B (en) * | 2024-05-09 | 2024-07-09 | 西南医科大学附属医院 | Electroencephalogram identification method |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110432899A (en) * | 2019-07-23 | 2019-11-12 | 南京工业大学 | Electroencephalogram signal identification method based on depth stacking support matrix machine |
WO2020248008A1 (en) * | 2019-06-14 | 2020-12-17 | The University Of Adelaide | A method and system for classifying sleep related brain activity |
CN112603334A (en) * | 2020-12-18 | 2021-04-06 | 杭州电子科技大学 | Spike detection method based on time sequence characteristics and stacked Bi-LSTM network |
CN112656427A (en) * | 2020-11-26 | 2021-04-16 | 山西大学 | Electroencephalogram emotion recognition method based on dimension model |
KR102334595B1 (en) * | 2020-12-21 | 2021-12-02 | 건국대학교 산학협력단 | Emotion recongnition method and device |
CN115374812A (en) * | 2022-08-02 | 2022-11-22 | 中山大学附属第三医院 | Signal feature extraction method for multi-feature fusion extraction of electroencephalogram signals |
CN115795346A (en) * | 2022-12-02 | 2023-03-14 | 江苏理工学院 | Classification and identification method of human electroencephalogram signals |
CN116211320A (en) * | 2023-03-16 | 2023-06-06 | 安徽工业大学 | Pattern recognition method of motor imagery brain-computer interface based on ensemble learning |
CN116484290A (en) * | 2023-04-18 | 2023-07-25 | 武汉纺织大学 | Depression recognition model construction method based on Stacking integration |
CN116628420A (en) * | 2023-05-08 | 2023-08-22 | 武汉纺织大学 | Brain wave signal processing method based on LSTM neural network element learning |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20220160286A1 (en) * | 2020-11-20 | 2022-05-26 | Anhui Huami Health Technology Co., Ltd. | Adaptive User Interaction Systems For Interfacing With Cognitive Processes |
Non-Patent Citations (2)
Title |
---|
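EEG signal emotion recognition based on SAE and GNDO-SVM; Chen Chen et al.; Computer Systems & Applications (《计算机系统应用》); 2023-08-23; Vol. 32, No. 10; pp. 284-292 * |
EEG signal recognition with stacked sparse denoising autoencoders; Tang Xianlun et al.; Journal of University of Electronic Science and Technology of China (《电子科技大学学报》); 2019-01-31; Vol. 48, No. 1; pp. 62-67 * |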
Also Published As
Publication number | Publication date |
---|---|
CN117338313A (en) | 2024-01-05 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||