CN112906539B - Object identification method based on EEG data

Info

Publication number: CN112906539B
Application number: CN202110172061.6A
Authority: CN
Inventors: 魏展; 周文晖; 张桦; 黄鸿飞; 杨思学; 施江玮
Original assignee: Hangzhou Dianzi University
Current assignee: Hangzhou Dianzi University
Priority date: 2021-02-08
Filing date: 2021-02-08
Publication date: 2024-04-05
Anticipated expiration: 2041-02-08
Also published as: CN112906539A

Abstract

The invention discloses an object identification method based on EEG data. The method first uses data augmentation to expand the data set. The data are then randomly divided into five equal parts for 5-fold training. A new two-dimensional convolutional neural network is constructed with ResBlock as the basic structure; in the first three layers of the network, dilated (hole) convolution replaces ordinary convolution, PReLU is used as the activation function of the network, and FocalLoss is used as the loss function of the model. Model training is performed on the EEG object recognition data set released by the active laboratory in 2017. Data augmentation allows the data of a small data set to be fully used. In addition, the network model is deeper yet has fewer parameters, so that features useful for identification can be learned from the EEG data even though the data set contains few EEG samples, and the object identification task can be carried out efficiently and accurately.

Description

Object identification method based on EEG data

Technical Field

The invention belongs to the technical field of EEG signal recognition, and particularly relates to a scene recognition system based on EEG signals.

Background

Currently, humans are much better at classifying scenes than computers. Although deep learning algorithms have achieved very good scene recognition accuracy in recent years, they still cannot exceed the accuracy of human recognition. Computer vision methods directly extract distinguishable features from an image for recognition, whereas visual recognition in the human brain also includes a perception process and a cognition process: for example, the color and shape of an object stimulate the brain, and the cerebral cortex responds to the stimulation. Some studies in neuroscience indicate that the human brain exhibits a specific pattern of activity for a specific class of objects. The brain recognizes objects of the same kind on the basis of the same pattern, and recognition is very fast, on a time scale of a few milliseconds. Therefore, a picture of an object can be presented to a subject while the subject's EEG data are recorded, and, thanks to the very high temporal resolution of EEG data, features that can be used for identification can be extracted from the recording. If the human process of perceiving and recognizing an object can be distilled and applied in computer vision, a computer can automatically simulate the human object recognition process.

Decoding vision-related human brain activity is of great significance: it gives researchers a new angle from which to study the visual cognition process, and it can also improve the performance of human-computer interaction systems. Object recognition using human EEG data can not only change traditional object recognition methods but also provide a novel way of labeling unlabeled image data. Using patterns of the human visual system for computer vision tasks has long been a difficult direction; however, the advent of deep learning has enabled researchers to study EEG data and neuroscience in greater depth. To the best of our knowledge, several studies have used EEG signals directly for object recognition tasks. For example, Kapoor et al. proposed a method for object recognition using EEG data that classifies images into three categories (animal, face, inanimate object). Rouby et al. used EEG data for object recognition tasks: they acquired a data set containing 400 images and the corresponding EEG data using a 256-lead EEG acquisition system, and their average accuracy was 82.70%. Viral et al. proposed an image annotation system based on EEG data; they used 2500 images as the training set, and the F1 score reached 0.88. These studies show that object recognition using EEG data is feasible. In 2017, the active laboratory released one of the largest object recognition EEG data sets to date and constructed a series of EEG recognition models based on recurrent neural networks; their best recognition accuracy was 82.9%. Although these methods have achieved very good results, there is still considerable room for improvement in recognition accuracy, mainly because the network structures can still be improved and, secondly, because the amount of data is small, which makes it difficult for a deep learning model to learn distinguishable features.

Disclosure of Invention

The invention aims to provide an object identification method based on EEG data. First, the data set is expanded using data augmentation. Then, a new two-dimensional convolutional neural network is constructed with ResBlock as the basic structure; in the first three layers of the network, dilated (hole) convolution replaces ordinary convolution, PReLU is used as the activation function of the network, and FocalLoss is used as the loss function of the model. Model training is performed on the EEG object recognition data set released by the active laboratory in 2017.

The invention provides an object identification method based on EEG. In the data preprocessing stage, data augmentation is applied to the original EEG data in the form of random zeroing and random flipping, which expands the original data set. The data are then randomly split into five equal parts and 5-fold training is performed. Data augmentation allows the data of a small data set to be fully used. In addition, the network model is deeper yet has fewer parameters, so that features useful for identification can be learned from the EEG data even though the data set contains few EEG samples, and the object identification task can be carried out efficiently and accurately.

In order to achieve the above purpose, the present invention adopts the following technical scheme:

The Python language is preferably used as the basis. For data augmentation, the numpy library is used to apply random zeroing and random flipping to the original data, and the augmented data set is then output. On this basis, the EEG data are randomly divided into 5 equal parts for 5-fold training. The network model is then implemented and built with the open-source deep learning framework PyTorch. 5-fold training is carried out on the data set; for each fold, the model that performs best on the validation set is selected, and the five selected models are fused to obtain the final result.
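The 5-fold split described above can be sketched with numpy as follows. This is a minimal illustration and not the patent's own code: the helper names `five_fold_indices` and `train_val_pairs`, the seed, and the sample count of 100 are all hypothetical.

```python
import numpy as np

def five_fold_indices(n_samples, n_folds=5, seed=0):
    """Shuffle the sample indices and cut them into n_folds near-equal parts."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(n_samples)
    return np.array_split(order, n_folds)

def train_val_pairs(folds):
    """Yield (train_idx, val_idx); each part serves once as the validation set."""
    for k, val_idx in enumerate(folds):
        train_idx = np.concatenate([f for j, f in enumerate(folds) if j != k])
        yield train_idx, val_idx

# Hypothetical data set size, used only to exercise the helpers.
folds = five_fold_indices(n_samples=100)
pairs = list(train_val_pairs(folds))
```

Each of the five `(train_idx, val_idx)` pairs would drive one training run, and the model that scores best on `val_idx` is the one kept for that fold.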

An EEG-based object recognition method comprising the steps of:

Step 1, acquiring an EEG object identification data set, and cleaning the acquired original data set.

The EEG object identification data set includes the EEG data collected from the subjects and, for each recording, a label for the object the subject was viewing while the data were collected.

Step 2, augmenting the cleaned EEG object identification data set by using a data augmentation technique, thereby increasing the number of samples.

Step 3, randomly dividing the EEG object identification data set processed in step 2 into 5 data sets of equal size.

Step 4, constructing an identification network model with ResBlock as the basic module, in which the ResBlock layers forming the first three layers of the identification network model adopt dilated (hole) convolution and the activation function in the network is the PReLU activation function, whose formula is as follows:

$$\mathrm{PReLU}(x_i) = \max(0, x_i) + a_i \min(0, x_i)$$

where $x_i$ is the input of the activation function on the i-th channel, and the learnable parameter $a_i$ controls the slope of the negative half-axis of the activation function.

The recognition network model adopts FocalLoss as its loss function. The formula of the FocalLoss loss function is as follows:

$$\mathrm{Loss} = -\alpha (1 - y')^{\gamma} \log y'$$

In the formula, α and γ are two hyperparameters, used respectively to address data imbalance and to improve classification performance on hard-to-classify samples. y′ is the output of the recognition network model.

Step 5, performing five-fold training on the identification network model with the 5 data sets obtained in step 3, and fusing the best models obtained from each fold of training.
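The PReLU activation and the FocalLoss formula used in step 4 can be checked numerically with a short numpy sketch. The patent does not state values for α, γ or the initial slope, so `alpha=0.25`, `gamma=2.0` and the slope `0.25` below are assumed illustrative values; `y_prob` stands for y′, the predicted probability of the true class.

```python
import numpy as np

def prelu(x, a):
    """PReLU: identity for positive inputs, slope a on the negative half-axis."""
    return np.maximum(0.0, x) + a * np.minimum(0.0, x)

def focal_loss(y_prob, alpha=0.25, gamma=2.0):
    """Loss = -alpha * (1 - y')**gamma * log(y'); alpha and gamma are assumed values."""
    y_prob = np.clip(y_prob, 1e-12, 1.0)  # guard against log(0)
    return -alpha * (1.0 - y_prob) ** gamma * np.log(y_prob)

# Negative inputs are scaled by the slope, positive inputs pass through.
activated = prelu(np.array([-2.0, 0.0, 3.0]), a=0.25)

# A confidently correct sample contributes far less loss than a hard one.
easy = focal_loss(0.9)
hard = focal_loss(0.1)
```

With `gamma=0` and `alpha=1` the expression reduces to the ordinary cross-entropy term `-log(y')`, which makes the role of the `(1 - y')**gamma` factor easy to see: it shrinks the loss of well-classified samples so that training concentrates on the hard ones.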

Further, the step 2 specifically includes the following steps:

Step 2.1, random zeroing: for each EEG sample in the EEG object identification data set, 1 to 10 of its 128 channels are selected at random, a segment 5 to 15 data points long is selected at random in the time dimension, and all of these data points are set to 0.

Step 2.2, random flipping: the EEG data are flipped along the time dimension or the channel dimension.
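Steps 2.1 and 2.2 can be sketched with numpy as below. This is one possible reading and not the patent's own code: it assumes that a single time segment is shared by all selected channels, that a sample is stored as a `(channels, time points)` array, and that a recording has 440 time points. The function names are hypothetical.

```python
import numpy as np

def random_zero(eeg, rng):
    """Step 2.1: zero a 5-15 point time segment on 1-10 randomly chosen channels."""
    out = eeg.copy()
    n_channels, n_times = out.shape
    k = int(rng.integers(1, 11))          # number of channels, 1..10
    length = int(rng.integers(5, 16))     # segment length, 5..15
    channels = rng.choice(n_channels, size=k, replace=False)
    start = int(rng.integers(0, n_times - length + 1))
    out[channels, start:start + length] = 0.0
    return out

def random_flip(eeg, rng):
    """Step 2.2: flip along the channel axis (0) or the time axis (1)."""
    axis = int(rng.integers(0, 2))
    return np.flip(eeg, axis=axis).copy()

rng = np.random.default_rng(0)
eeg = rng.standard_normal((128, 440))     # 128 channels; 440 points is an assumption
zeroed = random_zero(eeg, rng)
flipped = random_flip(eeg, rng)
augmented = [eeg, zeroed, flipped]        # original plus two augmented copies
```

Both functions return new arrays and leave the input untouched, so the original sample can be kept alongside its augmented copies when the data set is expanded.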

The step 4 specifically comprises the following steps:

Step 4.1, the EEG-based object recognition network model adopts ResBlock as its basic structure throughout. The ResBlock layers forming the first three layers of the recognition network model adopt dilated (hole) convolution, through which the EEG data are pre-encoded; the dilation rates of the dilated convolutions in the three ResBlocks are set to 1, 10 and 21 respectively.

Dilated convolution gives the object recognition network model a larger receptive field in the initial stage, which helps extract information across EEG channels and over large time scales.

Step 4.2, a module obtained by stacking MaxPool (max pooling) and ResBlock is used as the basic module of the object recognition network model; the three dilated-convolution ResBlock layers are followed by 5 basic modules of identical structure to form the object recognition network model.

The stride of the MaxPool layer in the first 2 basic modules is set to 2 in the time dimension of the EEG data and 1 in the channel dimension, so that the size of the feature map in the time dimension approaches its size in the channel dimension; the stride of the MaxPool layer in the last 3 basic modules is set to 2 in both dimensions, i.e. downsampling is performed in both dimensions together. All ResBlocks in the object recognition network model use the PReLU activation function. Finally, FocalLoss is used as the loss function.

The object recognition network model first downsamples only in the time dimension because EEG data generally have higher resolution in the time dimension. Downsampling in the time dimension first brings the channel dimension and the time dimension close together, so that the feature map is nearly square and the subsequent convolutions can extract features more effectively. The PReLU activation function is used so that each layer of the network can learn the best activation function for that layer through back propagation. The use of FocalLoss biases the learning of the network towards hard-to-classify samples.
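The effect of the pooling schedule on the feature-map shape can be traced with a few lines of Python. This is only a shape calculation under stated assumptions: the input is taken to be 128 channels by 440 time points (the 440 is assumed, not given in the text), each MaxPool is assumed to use a kernel equal to its stride with no padding, and the ResBlocks are assumed to preserve spatial size.

```python
def pooled_shapes(channels, times, strides):
    """Track (channels, times) through a sequence of (channel, time) pooling strides."""
    shapes = [(channels, times)]
    for stride_c, stride_t in strides:
        channels, times = channels // stride_c, times // stride_t
        shapes.append((channels, times))
    return shapes

# First 2 basic modules pool only in time; the last 3 pool in both dimensions.
strides = [(1, 2)] * 2 + [(2, 2)] * 3
shapes = pooled_shapes(128, 440, strides)
# shapes == [(128, 440), (128, 220), (128, 110), (64, 55), (32, 27), (16, 13)]
```

Under these assumptions the map is 128 by 110 after the two time-only poolings, which is close to square, and the three joint poolings then shrink both dimensions together.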

Compared with the prior art, the invention has the following obvious advantages:

Deep learning models typically require a large amount of data to train the best model, whereas EEG data sets are smaller than natural-image data sets because of the difficulty and cost of acquisition. The model uses FocalLoss to strengthen its generalization capability, so that learning pays more attention to hard-to-classify samples. Meanwhile, the first three layers of the network use ResBlocks with different dilation rates, so that the network obtains a larger receptive field during pre-encoding and fuses information over a larger time scale. The object recognition network model first downsamples only on the time scale; once the resolutions of the feature map on the time scale and the channel scale are close, it downsamples on both scales simultaneously.
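How much the dilation rates 1, 10 and 21 enlarge the receptive field can be estimated with the standard formula for stacked stride-1 convolutions, RF = 1 + Σ (k − 1)·d. The sketch below assumes a kernel size of 3 and counts one convolution per ResBlock; neither is stated in the text, so the numbers are illustrative only.

```python
def receptive_field(layers):
    """Receptive field along one axis for stacked stride-1 convolutions.

    layers is a list of (kernel_size, dilation) pairs.
    """
    rf = 1
    for kernel, dilation in layers:
        rf += (kernel - 1) * dilation
    return rf

# Assumed kernel size 3; dilation rates 1, 10, 21 as given in the description.
dilated = receptive_field([(3, 1), (3, 10), (3, 21)])   # 65
plain = receptive_field([(3, 1)] * 3)                    # 7
```

Under these assumptions the three dilated layers see 65 input positions along an axis instead of 7, with the same number of weights per layer.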

Drawings

FIG. 1 is a flow chart of the method according to the invention;

FIG. 2 is a general block diagram of the EEG object recognition network according to the invention;

FIG. 3 is a network structure diagram of the invention;

FIG. 4 is the ResBlock structure of the invention.

Detailed Description

The present invention will be described in further detail below with reference to the accompanying drawings in conjunction with the detailed description.

The hardware used in the invention is one PC and one 1080 Ti graphics card.

As shown in FIG. 1, the invention provides an EEG object recognition method based on deep learning, which specifically comprises the following steps:

Step 1, acquiring an EEG-based object identification data set in the related field, and cleaning the acquired original data set (e.g. deleting dirty data).

The EEG-based object identification data set comprises the EEG data and the class label of the object the subject was viewing when the EEG data were acquired.

Step 2, augmenting the cleaned EEG data set by using a data augmentation technique, increasing the number of samples and enriching the data content.

Step 3, randomly dividing the data set processed in step 2 into 5 data sets of equal size.

Step 4, building an object recognition model with ResBlock as the basic structure, adding dilated (hole) convolution to the network, using the PReLU activation function as the activation function, and using FocalLoss as the loss function.

FIG. 2 shows the overall block diagram of the EEG-based object recognition model.

FIG. 3 shows the detailed model structure of the invention.

Step 4.1, constructing the model with ResBlock as the basic module. FIG. 4 shows the 2D ResBlock structure of the invention.

Step 4.2, in the first three ResBlock layers of the model, replacing the conventional convolution with dilated (hole) convolution.

Step 4.3, setting the stride of the first two max-pooling layers of the model to 2 on the time scale of the EEG data and 1 on the channel scale of the EEG data.

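Step 4.4, adding the PReLU activation function to the model. The formula of the PReLU activation function is as follows:

$$\mathrm{PReLU}(x_i) = \max(0, x_i) + a_i \min(0, x_i)$$

where $x_i$ is the input of the activation function on the i-th channel and $a_i$ is a learnable parameter that controls the slope of the negative half-axis.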

Step 4.5, using FocalLoss as the loss function of the model. The formula of the FocalLoss function is as follows:

$$\mathrm{Loss} = -\alpha (1 - y')^{\gamma} \log y'$$

In the formula, α and γ are two hyperparameters, used respectively to address data imbalance and to improve classification performance on hard-to-classify samples.

Step 5, performing five-fold training on the object recognition model with the 5 data sets obtained in step 3, and fusing the best models obtained from each fold of training. Specifically, the models are fused by normalizing the output of each model and then averaging the normalized outputs.
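The fusion rule in step 5 (normalize each model's output, then average) can be sketched with numpy. The text does not say which normalization is used, so softmax is an assumption here, as are the function names, the 40 classes, and the random arrays standing in for the outputs of the five selected models.

```python
import numpy as np

def softmax(logits):
    """Row-wise softmax, shifted by the row maximum for numerical stability."""
    z = logits - logits.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def fuse(model_outputs):
    """Normalize each model's output, then average over the models."""
    probs = np.stack([softmax(o) for o in model_outputs])
    return probs.mean(axis=0)

# Stand-in outputs: 5 fold models, 4 samples, 40 classes (all assumed).
rng = np.random.default_rng(0)
outputs = [rng.standard_normal((4, 40)) for _ in range(5)]
fused = fuse(outputs)
pred = fused.argmax(axis=1)   # final class prediction per sample
```

Normalizing before averaging puts the five models on a common scale, so a model with larger raw outputs does not dominate the fused result.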

In experimental tests, the object identification model reaches a maximum accuracy of 93.2% on the test set. This is 10.3 percentage points higher than the 82.9% accuracy of the model proposed in 2017 by the laboratory that released the data set, and 3.6 percentage points higher than the 89.6% accuracy of the knowledge distillation model proposed by Mukherjee et al. in 2019.

Claims

1. An EEG data based object recognition method comprising the steps of:

step 1, acquiring an EEG object identification data set, and cleaning the acquired original image data set;

the EEG object identification data set comprises EEG data acquired from the tested subjects and labels of the objects watched by each tested subject in the process of acquiring the data;

step 2, enhancing the cleaned EEG object identification data set by utilizing a data enhancement technology, and increasing the number of samples;

step 3, randomly dividing the EEG object identification data set processed in the step 2 into 5 data sets with the same quantity;

$x_i$ is the input of the activation function on the i-th channel, and the learnable parameter $a_i$ controls the slope of the negative half-axis of the activation function;

$$\mathrm{Loss} = -\alpha (1 - y')^{\gamma} \log y'$$

in the formula, α and γ are two hyperparameters, used respectively to address data imbalance and to improve classification performance on hard-to-classify samples; y′ is the output of the recognition network model;

step 4.1, the EEG-based object recognition network model adopts ResBlock as its basic structure throughout, the ResBlock layers forming the first three layers of the recognition network model adopt dilated (hole) convolution, the EEG data are pre-encoded through the dilated convolution, and the dilation rates of the dilated convolutions in the three ResBlocks are set to 1, 10 and 21 respectively;

the dilated convolution gives the object recognition network model a larger receptive field in the initial stage, which helps extract information across EEG channels and over large time scales;

step 4.2, a module obtained by stacking MaxPool and ResBlock is used as the basic module of the object recognition network model; the three dilated-convolution ResBlock layers are followed by 5 basic modules of identical structure to form the object recognition network model;

the stride of the MaxPool layer in the first 2 basic modules is set to 2 in the time dimension of the EEG data and 1 in the channel dimension, so that the size of the feature map in the time dimension approaches its size in the channel dimension; the stride of the MaxPool layer in the last 3 basic modules is set to 2 in both dimensions, i.e. downsampling is performed in both dimensions together; all ResBlocks in the object recognition network model use the PReLU activation function; finally, FocalLoss is adopted as the loss function;

2. An EEG data based object recognition method according to claim 1, wherein step 2 comprises the steps of:

step 2.1, random zeroing: for each EEG sample in the EEG object recognition data set, 1 to 10 of its 128 channels are selected at random, a segment 5 to 15 data points long is selected at random in the time dimension, and all of these data points are set to 0;