CN115470810A - Electroencephalogram semantic visualization method based on diffusion model framework - Google Patents

Info

Publication number
CN115470810A
Authority
CN
China
Prior art keywords: convolution, data, layer, semantic, DDPM
Legal status: Pending
Application number: CN202210788432.8A
Other languages: Chinese (zh)
Inventor
曾虹
夏念章
钱东官
欧阳瑜
贾哲
Current Assignee: Hangzhou Dianzi University
Original Assignee: Hangzhou Dianzi University
Priority date: 2022-07-06
Filing date: 2022-07-06
Publication date: 2022-12-13
Application filed by Hangzhou Dianzi University
Priority to CN202210788432.8A


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods

Abstract

The invention provides an electroencephalogram semantic visualization method based on a diffusion model framework, which comprises the following steps: S1, acquiring original data, namely acquiring EEG data through EEG acquisition equipment; S2, processing the data; S3, obtaining a semantic feature vector e through an EVR-Net model; S4, obtaining a visualization image x_0 through a DDPM model. The method performs well in EEG semantic visualization, markedly improves brain-control technology in practical applications, lays a foundation for future practical brain-computer interaction applications, and has broad application prospects.

Description

Electroencephalogram semantic visualization method based on diffusion model framework
Technical Field
The invention relates to the technical field of biological electroencephalogram signals, in particular to an electroencephalogram semantic visualization method based on a diffusion model framework.
Background
An electroencephalogram (EEG) is a brain response signal generated when a person receives external stimuli. The electroencephalogram contains a large amount of semantic information, and this information can be visualized as images with the same semantics. Generally, after semantic feature extraction, electroencephalogram data are used by various models to classify states, emotions, image stimuli and the like. However, little work has visualized the semantic feature information extracted from EEG data. EEG semantic visualization has important prospects, such as resolving communication obstacles, multimedia applications, and severity grading of Alzheimer's disease. However, inter-individual variability in EEG leads to poor performance of current methods. Therefore, designing an effective EEG semantic feature extraction method and an EEG semantic feature visualization method is very important.
Disclosure of Invention
To address the defects of the prior art, the invention provides an EEG semantic visualization method based on a diffusion model framework: EEG semantic features contained in EEG data are extracted by an EEG semantic feature extraction network model (EVR-Net) based on a deep learning method, the extracted semantic features are used as guidance, and a denoising diffusion probabilistic model (DDPM) generates a visualization image with the same semantics as the EEG.
In order to solve the technical problems, the technical scheme of the invention is as follows:
an electroencephalogram semantic visualization method based on a diffusion model framework comprises the following steps:
s1, collecting original data, namely collecting EEG data through EEG collecting equipment
S2, data processing
S2-1, preprocessing data, importing collected EEG data into a python script, and eliminating noise and interference;
s2-2, time slice division, namely segmenting the preprocessed EEG signals into a plurality of sequences with equal time length, and storing the sequences in a file with a format of npy;
s3, obtaining a semantic feature vector e through an EVR-Net model
S3-1, constructing an initial EVR-Net model, wherein the initial EVR-Net model comprises eight layers of networks, namely five convolution layers, two residual blocks and an average pooling layer;
s3-2, training an initial EVR-Net model to obtain an optimal EEG semantic feature extractor EVR-Net parameter;
s3-3, importing the optimal EEG semantic feature extractor EVR-Net parameters into an initial EVR-Net model to obtain a final EVR-Net model;
s3-4, inputting the EEG signal after data processing into a final EVR-Net model after training to obtain a semantic feature vector e;
S4, obtaining a visualization image x_0 through a DDPM model
S4-1, constructing an initial DDPM model, wherein the initial DDPM model comprises nine layers of networks, namely a fully-connected layer, four convolution layers and four deconvolution layers;
s4-2, training an initial DDPM model to obtain an optimal DDPM parameter of a semantic visualizer;
s4-3, importing the optimal DDPM parameters of the semantic visualizer into the initial DDPM model to obtain a final DDPM model;
S4-4, inputting the semantic feature vector e into the trained final DDPM model to obtain a visualization image x_0.
Preferably, in step S2-1, the method for eliminating noise and interference is as follows: the 1-50 Hz EEG signal is extracted by calling the band-pass filtering algorithm in the python library.
Preferably, in step S3-2, before the initial EVR-Net model is trained, the parameter μ_EVR-Net of the initial EVR-Net model needs to be initialized:
the activation function of the initial EVR-Net model after passing through the average pooling layer is a Sigmoid function, and the formula is as follows:
Sigmoid(x) = 1 / (1 + e^(-x))
wherein, x in the Sigmoid function represents a sample, and the activation functions between other network layers or blocks are all ReLU functions, and the formula is as follows:
ReLU(x)=max(0,x)
The CrossEntropyLoss cross-entropy function is used as the loss function Lc during initial EVR-Net model training, and the formula is as follows:
Lc = -Σ_{i=1}^{K} y_i·log(p_i)
where i indexes the classes under the current label state, K is the total number of classes, y_i is the label value, and p_i is the probability of the semantic prediction under the current model parameters.
Preferably, in the step S3-2, after the initialization of μ_EVR-Net in the initial EVR-Net model is completed, the following training is carried out:
data enters a convolution layer of a first layer, convolution processing is carried out on the data through one-dimensional convolution, convolution adopts a convolution kernel with the size of 3, the step length is set to be 2, the number of output channels is 32, and the data is activated through an activation function ReLU after the convolution is finished;
the data enters a convolution layer of a second layer, convolution processing is carried out on the data through one-dimensional convolution, convolution adopts a convolution kernel with the size of 3, the step length is set to be 2, and the number of output channels is 64; activating data by using an activation function ReLU after the convolution is finished;
and the data enters the residual block of the third layer, and the data is processed by the residual block with the convolution kernel size of 3. Activating data by using an activation function ReLU after the data is processed;
the data enters a convolution layer of a fourth layer, convolution processing is carried out on the data through one-dimensional convolution, convolution adopts a convolution kernel with the size of 3, the step length is set to be 2, the number of output channels is 128, and the data are activated through an activation function ReLU after the convolution is finished;
data enters a convolution layer of a fifth layer, convolution processing is carried out on the data through one-dimensional convolution, convolution adopts a convolution kernel with the size of 3, the step length is set to be 2, and the number of output channels is 256; activating data by using an activation function ReLU after the convolution is finished;
and the data enters a residual block of the sixth layer, and the data is processed by using the residual block with the convolution kernel size of 3. Activating data by using an activation function ReLU after the data is processed;
the data enters a convolution layer of a seventh layer, convolution processing is carried out on the data through two-dimensional convolution, convolution adopts a convolution kernel with the size of 3, the step length is set to be 2, the number of output channels is 512, and the data is activated through an activation function ReLU after the convolution is finished;
the data enters an average pooling layer of the eighth layer, is converted into a tensor matrix with the size of 512 after passing through the average pooling layer, and is activated by a Sigmoid function;
The loss function Lc adopted in the training process is the CrossEntropyLoss cross-entropy function; the optimal EVR-Net parameters are obtained through cyclic training with Lc and stored in a local file.
Preferably, in step S4-2, before the initial DDPM model is trained, the parameter μ_DDPM of the initial DDPM model needs to be initialized:
A noise schedule β_t = 1 - α_t is defined and initialized according to a linear formula (linear interpolation between the schedule endpoints β_1 and β_T):
β_t = β_1 + ((t - 1)/(T - 1))·(β_T - β_1)
where T is the total number of initialization steps, t is the current step, and both t and T are integers. To facilitate training of the initial DDPM model, ᾱ_t is defined as the cumulative product of α_1…α_t, with the formula:
ᾱ_t = ∏_{s=1}^{t} α_s
The value of x_t is determined by ᾱ_t, x_0 and the Gaussian noise ε, with the formula:
x_t = √(ᾱ_t)·x_0 + √(1 - ᾱ_t)·ε
The loss function L_DDPM used in initial DDPM model training is as follows:
L_DDPM = E_{x_0, ε, t}[ ‖ε - ε_θ(x_t, t, e)‖² ]
where ε_θ is the diffusion model (noise prediction) function, x_0 is the image, and e is the EEG semantic feature.
Preferably, in the step S4-2, after the initialization of the parameter μ_DDPM of the initial DDPM model is completed, the following training is carried out:
the semantic feature vector e passes through a full connection layer and is subjected to vector splicing with the mapping feature obtained by the current t value passing through the full connection layer, so that the EEG semantic fusion feature of the t step is obtained;
x_t enters the convolution layer of the first layer, convolution processing is carried out on the data through two-dimensional convolution, the convolution adopts a convolution kernel with the size of 4, the step length is set to be 2, and the number of output channels is 64; after the convolution is finished, the EEG semantic fusion feature is position-embedded into the data, and the data is activated by the activation function ReLU;
the data enters a convolution layer of a second layer, convolution processing is carried out on the data through two-dimensional convolution, the convolution adopts a convolution kernel with the size of 4, the step length is set to be 2, the number of output channels is 128, EEG semantic fusion feature positions are embedded into the data after the convolution is finished, and the data are activated through an activation function ReLU;
the data enters a convolution layer of a third layer, convolution processing is carried out on the data through two-dimensional convolution, the convolution adopts a convolution kernel with the size of 4, the step length is set to be 2, the number of output channels is 256, and the data is activated through an activation function ReLU after the convolution is finished;
the data enters a convolution layer of a fourth layer, convolution processing is carried out on the data through two-dimensional convolution, convolution adopts a convolution kernel with the size of 4, the step length is set to be 2, the number of output channels is 512, and the data are activated through an activation function ReLU after the convolution is finished;
the data enters a fifth-layer deconvolution layer, deconvolution processing is carried out on the data through two-dimensional deconvolution, a convolution kernel with the size of 4 is adopted in convolution, the step length is set to be 2, and the number of output channels is 256; activating data by using an activation function ReLU after the convolution is finished; splicing with data obtained by processing the convolutional layer of the third layer, and embedding an EEG semantic fusion characteristic position into the data;
the data enters a deconvolution layer of a sixth layer, deconvolution processing is carried out on the data by two-dimensional deconvolution, a convolution kernel with the size of 4 is adopted in convolution, the step length is set to be 2, and the number of output channels is 128; activating data by using an activation function ReLU after the convolution is finished; splicing with data obtained by processing the convolution layer of the second layer, and embedding an EEG semantic fusion characteristic position into the data;
the data enters the deconvolution layer of the seventh layer, deconvolution processing is carried out on the data through two-dimensional deconvolution, the convolution adopts a convolution kernel with the size of 4, the step length is set to be 2, and the number of output channels is 64; the data is activated by the activation function ReLU after the convolution is finished; the result is then spliced with the data obtained by processing the convolution layer of the first layer (matching the 64-channel skip connection);
the data enters a deconvolution layer of an eighth layer, deconvolution processing is carried out on the data by two-dimensional deconvolution, a convolution kernel with the size of 4 is adopted in convolution, the step length is set to be 2, and the number of output channels is 3; activating data by using an activation function ReLU after the convolution is finished;
The loss function used in the training process is L_DDPM; the optimal semantic visualizer DDPM parameters are obtained through cyclic training with L_DDPM and stored in a local file.
Preferably, the method further comprises a step S5 of testing and result evaluation.
The test method is as follows: the EEG signal is input into the trained final EVR-Net model to obtain the semantic feature vector e; the semantic feature vector e and Gaussian noise x_T are then input into the final DDPM model, and t is iterated from T down to 0 according to the Markov chain to obtain the visualization image x_0. The Markov chain formula is as follows:
x_(t-1) = (1/√α_t)·(x_t - ((1 - α_t)/√(1 - ᾱ_t))·ε_θ(x_t, t, e)) + δ_t·z
where z is Gaussian noise and δ_t is the standard deviation used at step t of the reverse process.
Preferably, the evaluation method in step S5 is as follows:
The visualization images x_0 generated in the testing phase are evaluated with two indices:
(1) semantic accuracy of the generated visualization image x_0, i.e. whether the semantics of the generated image are the same as those of the EEG;
(2) image quality, evaluated with the Inception Score, where a higher value indicates higher quality of the generated images.
Evaluation index implementation:
for semantic accuracy, the number of visualization images x_0 whose semantics match the corresponding EEG semantics is counted;
for image quality, the visualization images x_0 are input into a python script, which calls an Inception Score calculation algorithm from the library to obtain the Inception Score value.
The invention has the following characteristics and beneficial effects:
By adopting the technical scheme, the EEG semantic feature extractor EVR-Net first extracts semantic features from the EEG data, and then, with the EEG semantic features as guidance, the DDPM visualizes the EEG as an image with the same semantics. This overcomes the limitation of conventional machine learning and deep learning methods, which focus only on semantic recognition or on image generation alone, and associates EEG semantics with the visualized images. The invention achieves stable, high-quality results on different data sets and has good generalization performance.
In conclusion, the invention has better effect in the aspect of EEG semantic visualization, obviously improves the brain control technology in practical application, lays a foundation for practical brain-computer interaction application in the future, and has wider application prospect.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a flow chart of a method according to an embodiment of the present invention.
Fig. 2 is a network overview framework diagram of the training phase of an embodiment of the present invention.
FIG. 3 is a model diagram of the EEG semantic feature extractor network model training phase of the present invention.
FIG. 4 is a model diagram of the training phase of the EEG semantic visualizer DDPM of the present invention.
Fig. 5 is a network overview framework diagram of the testing phase of an embodiment of the present invention.
Detailed Description
It should be noted that the embodiments and features of the embodiments may be combined with each other without conflict.
In the description of the present invention, it is to be understood that the terms "central," "longitudinal," "lateral," "upper," "lower," "front," "rear," "left," "right," "vertical," "horizontal," "top," "bottom," "inner," "outer," and the like indicate orientations and positional relationships based on those shown in the drawings; they are used only for convenience and simplicity of description and do not indicate or imply that the device or element referred to must have a particular orientation or be constructed and operated in a particular orientation, and thus should not be construed as limiting the present invention. Furthermore, the terms "first", "second", etc. are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first," "second," etc. may explicitly or implicitly include one or more of that feature. In the description of the present invention, "a plurality" means two or more unless otherwise specified.
In the description of the present invention, it should be noted that, unless otherwise explicitly specified or limited, the terms "mounted," "connected," and "connected" are to be construed broadly, e.g., as meaning either a fixed connection, a removable connection, or an integral connection; can be mechanically or electrically connected; they may be connected directly or indirectly through intervening media, or they may be interconnected between two elements. The specific meaning of the above terms in the present invention can be understood by those of ordinary skill in the art through specific situations.
The invention provides a diffusion model framework-based electroencephalogram semantic visualization method, which comprises the following steps of:
s1, original data acquisition, wherein EEG data are acquired through EEG acquisition equipment.
Specifically, an image stimulation system is built with the EEG paradigm design platform OpenViBE and the C# programming language for acquiring raw data, and an image stimulation paradigm is designed in the system. The data from the stimulation system are recorded by EASYCAP, the data receiving software of the corresponding acquisition equipment, and forwarded to OpenViBE, which stores them in CSV format. The sampling frequency of the EEG acquisition equipment is 128 Hz, with 32 EEG electrode positions in total; the FCz channel serves as the reference, the ground is placed on the two earlobes, and the electrode impedance is kept below 10 kΩ.
Based on this system, several young adults who are healthy and have normal or corrected-to-normal vision are selected as subjects for EEG signal acquisition. The task is arranged in a quiet place free of external interference; the subject sits on a chair in a comfortable posture, keeps a distance of 60 centimetres from the computer display, puts on the EEG data acquisition device and, after the experiment start instruction is issued, watches the paradigm images appearing on the display. During the experiment, the display switches the paradigm image every 0.5 seconds, thereby delivering event-related potential stimulation to the subject. When the image interface disappears and the end prompt appears, the task ends. The experiment comprises multiple acquisition sessions (26 in total).
S2, data processing
S2-1, preprocessing data, importing collected EEG data into a python script, and eliminating noise and interference.
Specifically, the 1-50 Hz EEG signal is extracted by calling the built-in band-pass filtering algorithm in the python library, thereby eliminating noise and other interference.
And S2-2, time slice division, namely segmenting the preprocessed EEG signals into a plurality of sequences with equal time length, storing the sequences in a file with a format of npy, and using the sequences for subsequent model training.
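To make steps S2-1 and S2-2 concrete, the following Python sketch filters and slices a recording under stated assumptions: the 128 Hz sampling rate and 32 channels come from the acquisition description, while the Butterworth filter (the patent only says "a band-pass filtering algorithm in the python library"), its fourth order, the 0.5 s non-overlapping window matching the stimulus switching interval, and the file names are illustrative choices rather than details taken from the patent.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 128                 # sampling frequency of the EEG acquisition equipment (Hz)
WIN = int(0.5 * FS)      # assumed slice length: 0.5 s, matching the stimulus interval

def bandpass(eeg, low=1.0, high=50.0, fs=FS, order=4):
    """Zero-phase Butterworth band-pass over the time axis (channels x samples)."""
    b, a = butter(order, [low / (fs / 2), high / (fs / 2)], btype="band")
    return filtfilt(b, a, eeg, axis=-1)

def slice_epochs(eeg, win=WIN):
    """Cut the filtered recording into equal-length, non-overlapping segments."""
    n = eeg.shape[-1] // win
    return np.stack([eeg[..., i * win:(i + 1) * win] for i in range(n)])

# Hypothetical file names; the CSV is assumed to hold samples x 32 channels.
raw = np.loadtxt("recording.csv", delimiter=",").T
epochs = slice_epochs(bandpass(raw))        # -> (n_epochs, 32, WIN)
np.save("eeg_epochs.npy", epochs)           # stored in .npy format for model training
```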
S3, obtaining a semantic feature vector e through an EVR-Net model, wherein the EVR-Net model refers to an EEG semantic feature extraction network model and specifically comprises the following sub-steps:
S3-1, constructing an initial EVR-Net model, wherein the initial EVR-Net model comprises eight layers of networks, namely five convolution layers, two residual blocks and an average pooling layer (consistent with the layer-by-layer description below).
S3-2, as shown in FIG. 3, training an initial EVR-Net model to obtain an optimal EEG semantic feature extractor EVR-Net parameter.
It can be understood that before the initial EVR-Net model is trained, the parameter μ_EVR-Net of the initial EVR-Net model needs to be initialized:
the activation function of the initial EVR-Net model after passing through the average pooling layer is a Sigmoid function, and the formula is as follows:
Sigmoid(x) = 1 / (1 + e^(-x))
wherein, x in the Sigmoid function represents a sample, and the activation functions between other network layers or blocks are all ReLU functions, and the formula is as follows:
ReLU(x)=max(0,x)
The CrossEntropyLoss cross-entropy function is used as the loss function Lc during initial EVR-Net model training, and the formula is as follows:
Lc = -Σ_{i=1}^{K} y_i·log(p_i)
where i indexes the classes under the current label state, K is the total number of classes, y_i is the label value, and p_i is the probability of the semantic prediction under the current model parameters.
Further, after the initialization of μ_EVR-Net in the initial EVR-Net model is completed, the following training is carried out:
data enters a convolution layer of a first layer, convolution processing is carried out on the data through one-dimensional convolution, convolution adopts a convolution kernel with the size of 3, the step length is set to be 2, the number of output channels is 32, and the data is activated through an activation function ReLU after the convolution is finished;
the data enters a convolution layer of a second layer, convolution processing is carried out on the data through one-dimensional convolution, convolution adopts a convolution kernel with the size of 3, the step length is set to be 2, and the number of output channels is 64; activating data by using an activation function ReLU after the convolution is finished;
and (4) the data enters the residual block of the third layer, and the residual block with the convolution kernel size of 3 is used for processing the data. Activating data by using an activation function ReLU after the data is processed;
the data enters a convolution layer of a fourth layer, convolution processing is carried out on the data through one-dimensional convolution, convolution adopts a convolution kernel with the size of 3, the step length is set to be 2, the number of output channels is 128, and the data are activated through an activation function ReLU after the convolution is finished;
data enters a convolution layer of a fifth layer, convolution processing is carried out on the data through one-dimensional convolution, convolution adopts a convolution kernel with the size of 3, the step length is set to be 2, and the number of output channels is 256; activating data by using an activation function ReLU after the convolution is finished;
and (4) the data enters a residual block of the sixth layer, and the data is processed by using the residual block with the convolution kernel size of 3. Activating data by using an activation function ReLU after the data is processed;
the data enters a convolution layer of a seventh layer, convolution processing is carried out on the data through two-dimensional convolution, convolution adopts a convolution kernel with the size of 3, the step length is set to be 2, the number of output channels is 512, and the data is activated through an activation function ReLU after the convolution is finished;
the data enters an average pooling layer of the eighth layer, is converted into a tensor matrix with the size of 512 after passing through the average pooling layer, and is activated by a Sigmoid function;
The loss function Lc adopted in the training process is the CrossEntropyLoss cross-entropy function; the optimal EVR-Net parameters, namely the optimal μ_EVR-Net, are obtained through cyclic training with Lc and stored in a local file.
Note that the data entered into the EVR-Net model is EEG data in a Python script reading folder.
It will be appreciated that the data for the first layer network is thus the EEG data in the Python script read folder, the data for the second layer network is the data output by the first layer network, and so on.
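A PyTorch sketch of the EVR-Net just described is given below. The layer order, kernel sizes, strides and channel widths follow the text; the internal structure of the residual blocks, the treatment of the seventh (two-dimensional) convolution as acting on a 1-channel view of the preceding feature map, the input layout (batch x 32 electrodes x time samples) and the linear classification head used for the cross-entropy loss Lc are assumptions, since the patent does not specify them.

```python
import torch
import torch.nn as nn

class ResBlock1d(nn.Module):
    """Assumed residual block: a kernel-3 convolution whose output is added to its input."""
    def __init__(self, ch):
        super().__init__()
        self.conv = nn.Conv1d(ch, ch, kernel_size=3, padding=1)

    def forward(self, x):
        return torch.relu(x + self.conv(x))

class EVRNet(nn.Module):
    def __init__(self, in_channels=32, num_classes=10):        # num_classes is hypothetical
        super().__init__()
        self.conv1 = nn.Conv1d(in_channels, 32, 3, stride=2)   # layer 1
        self.conv2 = nn.Conv1d(32, 64, 3, stride=2)            # layer 2
        self.res3 = ResBlock1d(64)                              # layer 3
        self.conv4 = nn.Conv1d(64, 128, 3, stride=2)            # layer 4
        self.conv5 = nn.Conv1d(128, 256, 3, stride=2)           # layer 5
        self.res6 = ResBlock1d(256)                              # layer 6
        self.conv7 = nn.Conv2d(1, 512, 3, stride=2)              # layer 7: 2-D conv over the 256 x L map
        self.pool = nn.AdaptiveAvgPool2d(1)                      # layer 8: average pooling to a 512-vector
        self.head = nn.Linear(512, num_classes)                  # assumed head for the loss Lc

    def forward(self, x):                     # x: (batch, 32 electrodes, time samples)
        x = torch.relu(self.conv1(x))
        x = torch.relu(self.conv2(x))
        x = self.res3(x)
        x = torch.relu(self.conv4(x))
        x = torch.relu(self.conv5(x))
        x = self.res6(x)
        x = torch.relu(self.conv7(x.unsqueeze(1)))    # view (256, L) features as a 1-channel image
        e = torch.sigmoid(self.pool(x).flatten(1))    # semantic feature vector e, size 512
        return e, self.head(e)

# One training step: cross-entropy Lc between predicted logits and stimulus-class labels.
model = EVRNet()
eeg = torch.randn(8, 32, 64)                  # hypothetical batch of 0.5 s epochs at 128 Hz
e, logits = model(eeg)
loss = nn.CrossEntropyLoss()(logits, torch.randint(0, 10, (8,)))
```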
And S3-3, importing the optimal EEG semantic feature extractor EVR-Net parameters into an initial EVR-Net model to obtain a final EVR-Net model.
S3-4, inputting the EEG signal after data processing into a final EVR-Net model after training to obtain a semantic feature vector e;
S4, obtaining a visualization image x_0 through a DDPM model
S4-1, constructing an initial DDPM model, wherein the initial DDPM model comprises nine layers of networks, namely a full-connection layer, four convolution layers and four anti-convolution layers.
S4-2, as shown in FIG. 4, training the initial DDPM model to obtain the optimal DDPM parameters of the semantic visualizer.
It will be appreciated that before the initial DDPM model is trained, its parameter μ_DDPM needs to be initialized:
A noise schedule β_t = 1 - α_t is defined and initialized according to a linear formula (linear interpolation between the schedule endpoints β_1 and β_T):
β_t = β_1 + ((t - 1)/(T - 1))·(β_T - β_1)
where T is the total number of initialization steps, t is the current step, and both t and T are integers. To facilitate DDPM model training, ᾱ_t is defined as the cumulative product of α_1…α_t, with the formula:
ᾱ_t = ∏_{s=1}^{t} α_s
The value of x_t is determined by ᾱ_t, x_0 and the Gaussian noise ε, with the formula:
x_t = √(ᾱ_t)·x_0 + √(1 - ᾱ_t)·ε
The loss function L_DDPM used in DDPM model training is as follows:
L_DDPM = E_{x_0, ε, t}[ ‖ε - ε_θ(x_t, t, e)‖² ]
where ε_θ is the diffusion model (noise prediction) function, x_0 is the image, and e is the EEG semantic feature.
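The initialization and loss above can be sketched in a few lines of PyTorch. The β range (1e-4 to 0.02) and T = 1000 are common DDPM defaults and are assumptions here; the patent only states that β_t is initialized linearly.

```python
import torch

T = 1000                                        # assumed total number of diffusion steps
betas = torch.linspace(1e-4, 0.02, T)           # beta_t: assumed linear schedule endpoints
alphas = 1.0 - betas                            # alpha_t = 1 - beta_t
alpha_bars = torch.cumprod(alphas, dim=0)       # cumulative product of alpha_1..alpha_t

def diffuse(x0, t, noise):
    """Noising step: x_t = sqrt(alpha_bar_t) * x0 + sqrt(1 - alpha_bar_t) * eps."""
    ab = alpha_bars[t].view(-1, 1, 1, 1)
    return ab.sqrt() * x0 + (1.0 - ab).sqrt() * noise

def ddpm_loss(eps_model, x0, e):
    """L_DDPM = E[ || eps - eps_theta(x_t, t, e) ||^2 ], with e the EEG semantic feature."""
    t = torch.randint(0, T, (x0.shape[0],))
    noise = torch.randn_like(x0)
    x_t = diffuse(x0, t, noise)
    return ((noise - eps_model(x_t, t, e)) ** 2).mean()
```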
Further, after the initialization of the parameter μ_DDPM of the DDPM model is completed, the following training is carried out:
the semantic feature vector e passes through a full connection layer and is subjected to vector splicing with the mapping feature obtained by the current t value passing through the full connection layer, so that the EEG semantic fusion feature of the t step is obtained;
x_t enters the convolution layer of the first layer, convolution processing is carried out on the data through two-dimensional convolution, the convolution adopts a convolution kernel with the size of 4, the step length is set to be 2, and the number of output channels is 64; after the convolution is finished, the EEG semantic fusion feature is position-embedded into the data, and the data is activated by the activation function ReLU;
the data enters a convolution layer of a second layer, convolution processing is carried out on the data through two-dimensional convolution, the convolution adopts a convolution kernel with the size of 4, the step length is set to be 2, the number of output channels is 128, EEG semantic fusion feature positions are embedded into the data after the convolution is finished, and the data are activated through an activation function ReLU;
data enters a convolution layer of a third layer, convolution processing is carried out on the data through two-dimensional convolution, the convolution adopts a convolution kernel with the size of 4, the step length is set to be 2, the number of output channels is 256, and the data are activated through an activation function ReLU after the convolution is finished;
the data enters a convolution layer of a fourth layer, convolution processing is carried out on the data through two-dimensional convolution, convolution adopts a convolution kernel with the size of 4, the step length is set to be 2, the number of output channels is 512, and the data are activated through an activation function ReLU after the convolution is finished;
the data enters a fifth-layer deconvolution layer, deconvolution processing is carried out on the data through two-dimensional deconvolution, a convolution kernel with the size of 4 is adopted in convolution, the step length is set to be 2, and the number of output channels is 256; activating data by using an activation function ReLU after the convolution is finished; splicing with data obtained by processing the convolutional layer of the third layer, and embedding an EEG semantic fusion characteristic position into the data;
the data enters a deconvolution layer of a sixth layer, deconvolution processing is carried out on the data by two-dimensional deconvolution, a convolution kernel with the size of 4 is adopted in convolution, the step length is set to be 2, and the number of output channels is 128; activating data by using an activation function ReLU after the convolution is finished; splicing with data obtained by processing the convolution layer of the second layer, and embedding an EEG semantic fusion characteristic position into the data;
the data enters the deconvolution layer of the seventh layer, deconvolution processing is carried out on the data through two-dimensional deconvolution, the convolution adopts a convolution kernel with the size of 4, the step length is set to be 2, and the number of output channels is 64; the data is activated by the activation function ReLU after the convolution is finished; the result is then spliced with the data obtained by processing the convolution layer of the first layer (matching the 64-channel skip connection);
the data enters a deconvolution layer of an eighth layer, deconvolution processing is carried out on the data by two-dimensional deconvolution, a convolution kernel with the size of 4 is adopted in convolution, the step length is set to be 2, and the number of output channels is 3; activating data by using an activation function ReLU after the convolution is finished;
The loss function used in the training process is L_DDPM; the optimal semantic visualizer DDPM parameters are obtained through cyclic training with L_DDPM and stored in a local file.
S4-3, importing the optimal DDPM parameters of the semantic visualizer into the initial DDPM model to obtain a final DDPM model;
S4-4, inputting the semantic feature vector e into the trained final DDPM model to obtain a visualization image x_0.
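The conditional denoising network ε_θ described above in step S4-2 can be sketched as follows in PyTorch. The 4x4 stride-2 (de)convolutions, the channel widths (3-64-128-256-512 and back down to 3), the encoder/decoder skip splices and the stages at which the EEG semantic fusion feature is embedded follow the text; how the fusion feature is injected (here, a learned per-channel bias added to the feature maps), the embedding dimensions and the image resolution are assumptions.

```python
import torch
import torch.nn as nn

class EpsilonTheta(nn.Module):
    def __init__(self, e_dim=512, fuse_dim=128):
        super().__init__()
        self.e_fc = nn.Linear(e_dim, fuse_dim)      # fully-connected layer for the semantic vector e
        self.t_fc = nn.Linear(1, fuse_dim)          # mapping of the current step t
        # Encoder: four 4x4 stride-2 convolutions (3 -> 64 -> 128 -> 256 -> 512 channels).
        self.enc = nn.ModuleList([nn.Conv2d(ci, co, 4, 2, 1)
                                  for ci, co in [(3, 64), (64, 128), (128, 256), (256, 512)]])
        # Decoder: four 4x4 stride-2 deconvolutions; inputs double where a skip splice precedes them.
        self.dec = nn.ModuleList([nn.ConvTranspose2d(ci, co, 4, 2, 1)
                                  for ci, co in [(512, 256), (512, 128), (256, 64), (128, 3)]])
        # Projections of the fusion feature, added per channel where the text embeds it.
        self.proj = nn.ModuleList([nn.Linear(2 * fuse_dim, c) for c in (64, 128, 512, 256)])

    def forward(self, x_t, t, e):
        # EEG semantic fusion feature of step t: concatenation of the two mapped features.
        fuse = torch.cat([self.e_fc(e), self.t_fc(t.float().unsqueeze(1))], dim=1)
        add = lambda h, i: h + self.proj[i](fuse)[:, :, None, None]
        skips, h = [], x_t
        for i, conv in enumerate(self.enc):                  # layers 1-4
            h = conv(h)
            if i < 2:                                        # layers 1-2: embed the fusion feature
                h = add(h, i)
            h = torch.relu(h)
            skips.append(h)
        h = add(torch.cat([torch.relu(self.dec[0](h)), skips[2]], 1), 2)   # layer 5 + splice + embed
        h = add(torch.cat([torch.relu(self.dec[1](h)), skips[1]], 1), 3)   # layer 6 + splice + embed
        h = torch.cat([torch.relu(self.dec[2](h)), skips[0]], 1)           # layer 7 + splice
        return torch.relu(self.dec[3](h))                                   # layer 8: 3-channel output

# Example: predict noise for a batch of 64x64 images conditioned on EEG features e.
eps = EpsilonTheta()(torch.randn(2, 3, 64, 64), torch.randint(0, 1000, (2,)), torch.rand(2, 512))
```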
It can be understood that, in the above technical solution, an artificial intelligence method for finding the relevance between data and extracting semantic features by mapping the data to a high-dimensional space is provided. Compared with the manual method of finding the relevance among data and extracting semantic features, the deep learning method can quickly and accurately complete tasks.
The diffusion model is an image generation model based on likelihood functions. It divides the conversion of an image into random noise into T steps, modeled as a Markov chain; by learning this step-by-step conversion, the model can run the chain in the opposite direction and turn noise into a high-quality image.
A further arrangement of the present invention, as shown in fig. 5, further includes step S5, test and result evaluation,
The test method is as follows: the EEG signal is input into the trained final EVR-Net model to obtain the semantic feature vector e; the semantic feature vector e and Gaussian noise x_T are then input into the final DDPM model, and t is iterated from T down to 0 according to the Markov chain to obtain the visualization image x_0. The Markov chain formula is as follows:
x_(t-1) = (1/√α_t)·(x_t - ((1 - α_t)/√(1 - ᾱ_t))·ε_θ(x_t, t, e)) + δ_t·z
where z is Gaussian noise and δ_t is the standard deviation used at step t of the reverse process.
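A sampling sketch for this test phase is shown below, reusing the linear schedule assumed in the training sketch (T = 1000, β from 1e-4 to 0.02). Taking δ_t = sqrt(β_t) is one common choice for the reverse-step standard deviation; the image shape is illustrative.

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)        # same assumed schedule as in the training sketch
alphas = 1.0 - betas
alpha_bars = torch.cumprod(alphas, dim=0)

@torch.no_grad()
def sample(eps_model, e, shape=(1, 3, 64, 64)):
    """Iterate t from T-1 down to 0 along the reverse Markov chain, starting from x_T ~ N(0, I)."""
    x = torch.randn(shape)
    for t in reversed(range(T)):
        z = torch.randn_like(x) if t > 0 else torch.zeros_like(x)   # no noise at the last step
        t_batch = torch.full((shape[0],), t, dtype=torch.long)
        eps = eps_model(x, t_batch, e)
        x = (x - (1 - alphas[t]) / (1 - alpha_bars[t]).sqrt() * eps) / alphas[t].sqrt() \
            + betas[t].sqrt() * z
    return x                                  # visualization image x_0
```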
Preferably, the evaluation method in step S5 is as follows:
The visualization images x_0 generated in the testing phase are evaluated with two indices:
(1) semantic accuracy of the generated visualization image x_0, i.e. whether the semantics of the generated image are the same as those of the EEG;
(2) image quality, evaluated with the Inception Score, where a higher value indicates higher quality of the generated images.
Evaluation index implementation:
for semantic accuracy, the number of visualization images x_0 whose semantics match the corresponding EEG semantics is counted;
for image quality, the visualization images x_0 are input into a python script, which calls an Inception Score calculation algorithm from the library to obtain the Inception Score value.
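A minimal Inception Score sketch for the image-quality index is given below. It scores the generated images with the torchvision Inception-v3 classifier; the choice of torchvision, the single-split computation and the omission of the usual ImageNet input normalization are simplifying assumptions, since the patent only says that a calculation algorithm from a python library is called.

```python
import torch
import torch.nn.functional as F
from torchvision.models import inception_v3

@torch.no_grad()
def inception_score(images):
    """images: float tensor (N, 3, 299, 299) scaled to [0, 1]; higher scores mean higher quality."""
    net = inception_v3(weights="DEFAULT").eval()
    probs = F.softmax(net(images), dim=1)                       # p(y|x) for each generated image
    marginal = probs.mean(dim=0, keepdim=True)                  # p(y)
    kl = (probs * (probs.log() - marginal.log())).sum(dim=1)    # KL(p(y|x) || p(y))
    return kl.mean().exp().item()                               # IS = exp(E_x[KL])
```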
It can be understood that step S5 can ensure that the method provided by the present invention can achieve stable and high quality effects on different data sets.
The embodiments of the present invention have been described in detail with reference to the accompanying drawings, but the present invention is not limited to the described embodiments. It will be apparent to those skilled in the art that various changes, modifications, substitutions and alterations can be made in these embodiments, including components thereof, without departing from the principles and spirit of the invention, and still fall within the scope of the invention.

Claims (8)

1. A diffusion model framework-based electroencephalogram semantic visualization method is characterized by comprising the following steps:
s1, acquiring original data, namely acquiring EEG data through EEG acquisition equipment;
s2, data processing
S2-1, preprocessing data, importing collected EEG data into a python script, and eliminating noise and interference;
s2-2, time slice division, namely segmenting the preprocessed EEG signals into a plurality of sequences with equal time length, and storing the sequences in a file with a format of npy;
s3, obtaining a semantic feature vector e through an EVR-Net model
S3-1, constructing an initial EVR-Net model, wherein the initial EVR-Net model comprises eight layers of networks, namely five convolution layers, two residual blocks and an average pooling layer;
s3-2, training an initial EVR-Net model to obtain an optimal EEG semantic feature extractor EVR-Net parameter;
s3-3, importing the optimal EEG semantic feature extractor EVR-Net parameters into an initial EVR-Net model to obtain a final EVR-Net model;
s3-4, inputting the EEG signal after data processing into the final EVR-Net model after training to obtain a semantic feature vector e;
S4, obtaining a visualization image x_0 through a DDPM model
S4-1, constructing an initial DDPM model, wherein the initial DDPM model comprises nine layers of networks, namely a full-connection layer, four convolution layers and four deconvolution layers;
s4-2, training an initial DDPM model to obtain an optimal DDPM parameter of a semantic visualizer;
s4-3, importing the optimal DDPM parameters of the semantic visualizer into the initial DDPM model to obtain a final DDPM model;
S4-4, inputting the semantic feature vector e into the trained final DDPM model to obtain a visualization image x_0.
2. The diffusion model framework-based electroencephalogram semantic visualization method according to claim 1, wherein in the step S2-1, the method for eliminating noise and interference is as follows: the built-in band-pass filtering algorithm in a python library is called to extract the 1-50 Hz EEG signal.
3. The diffusion model framework-based electroencephalogram semantic visualization method according to claim 1, wherein in the step S3-2, before the initial EVR-Net model is trained, the parameter μ_EVR-Net of the initial EVR-Net model needs to be initialized:
the activation function of the initial EVR-Net model after passing through the average pooling layer is a Sigmoid function, and the formula is as follows:
Sigmoid(x) = 1 / (1 + e^(-x))
wherein, x in the Sigmoid function represents a sample, and the activation functions between other network layers or blocks are all ReLU functions, and the formula is as follows:
ReLU(x)=max(0,x)
The CrossEntropyLoss cross-entropy function is used as the loss function Lc during initial EVR-Net model training, and the formula is as follows:
Lc = -Σ_{i=1}^{K} y_i·log(p_i)
where i indexes the classes under the current label state, K is the total number of classes, y_i is the label value, and p_i is the probability of the semantic prediction under the current model parameters.
4. The diffusion model framework-based electroencephalogram semantic visualization method according to claim 3, wherein in the step S3-2, after the initialization of μ_EVR-Net in the initial EVR-Net model is completed, the following training is carried out:
data enters a convolution layer of a first layer, convolution processing is carried out on the data through one-dimensional convolution, convolution adopts a convolution kernel with the size of 3, the step length is set to be 2, the number of output channels is 32, and the data is activated through an activation function ReLU after the convolution is finished;
the data enters a convolution layer of a second layer, convolution processing is carried out on the data through one-dimensional convolution, convolution adopts a convolution kernel with the size of 3, the step length is set to be 2, and the number of output channels is 64; activating data by using an activation function ReLU after the convolution is finished;
and the data enters the residual block of the third layer, and the data is processed by the residual block with the convolution kernel size of 3. Activating data by using an activation function ReLU after the data is processed;
the data enters a convolution layer of a fourth layer, convolution processing is carried out on the data through one-dimensional convolution, convolution adopts a convolution kernel with the size of 3, the step length is set to be 2, the number of output channels is 128, and the data are activated through an activation function ReLU after the convolution is finished;
data enters a convolution layer of a fifth layer, convolution processing is carried out on the data through one-dimensional convolution, convolution adopts a convolution kernel with the size of 3, the step length is set to be 2, and the number of output channels is 256; activating data by using an activation function ReLU after the convolution is finished;
and the data enters a residual block of the sixth layer, and the data is processed by using the residual block with the convolution kernel size of 3. Activating data by using an activation function ReLU after the data is processed;
the data enters a convolution layer of a seventh layer, convolution processing is carried out on the data through two-dimensional convolution, convolution adopts a convolution kernel with the size of 3, the step length is set to be 2, the number of output channels is 512, and the data is activated through an activation function ReLU after the convolution is finished;
the data enters an average pooling layer of the eighth layer, is converted into a tensor matrix with the size of 512 after passing through the average pooling layer, and is activated by a Sigmoid function;
The loss function Lc adopted in the training process is the CrossEntropyLoss cross-entropy function; the optimal EVR-Net parameters are obtained through cyclic training with Lc and stored in a local file.
5. The diffusion model framework-based electroencephalogram semantic visualization method according to claim 1, wherein in the step S4-2, before the initial DDPM model is trained, the parameter μ_DDPM of the initial DDPM model needs to be initialized:
A noise schedule β_t = 1 - α_t is defined and initialized according to a linear formula (linear interpolation between the schedule endpoints β_1 and β_T):
β_t = β_1 + ((t - 1)/(T - 1))·(β_T - β_1)
where T is the total number of initialization steps, t is the current step, and both t and T are integers. To facilitate DDPM model training, ᾱ_t is defined as the cumulative product of α_1…α_t, with the formula:
ᾱ_t = ∏_{s=1}^{t} α_s
The value of x_t is determined by ᾱ_t, x_0 and the Gaussian noise ε, with the formula:
x_t = √(ᾱ_t)·x_0 + √(1 - ᾱ_t)·ε
The loss function L_DDPM used in initial DDPM model training is as follows:
L_DDPM = E_{x_0, ε, t}[ ‖ε - ε_θ(x_t, t, e)‖² ]
where ε_θ is the diffusion model (noise prediction) function, x_0 is the image, and e is the EEG semantic feature.
6. The diffusion model framework-based electroencephalogram semantic visualization method according to claim 5, wherein in the step S4-2, after the initialization of the parameter μ_DDPM of the initial DDPM model is completed, the following training is carried out:
the semantic feature vector e passes through a full connection layer and is subjected to vector splicing with the mapping feature obtained by the current t value passing through the full connection layer, so that the EEG semantic fusion feature of the t step is obtained;
x_t enters the convolution layer of the first layer, convolution processing is carried out on the data through two-dimensional convolution, the convolution adopts a convolution kernel with the size of 4, the step length is set to be 2, and the number of output channels is 64; after the convolution is finished, the EEG semantic fusion feature is position-embedded into the data, and the data is activated by the activation function ReLU;
the data enters a convolution layer of a second layer, convolution processing is carried out on the data through two-dimensional convolution, the convolution adopts a convolution kernel with the size of 4, the step length is set to be 2, the number of output channels is 128, EEG semantic fusion feature positions are embedded into the data after the convolution is finished, and the data are activated through an activation function ReLU;
the data enters a convolution layer of a third layer, convolution processing is carried out on the data through two-dimensional convolution, the convolution adopts a convolution kernel with the size of 4, the step length is set to be 2, the number of output channels is 256, and the data is activated through an activation function ReLU after the convolution is finished;
the data enters a convolution layer of a fourth layer, convolution processing is carried out on the data through two-dimensional convolution, convolution adopts a convolution kernel with the size of 4, the step length is set to be 2, the number of output channels is 512, and the data are activated through an activation function ReLU after the convolution is finished;
the data enters a fifth-layer deconvolution layer, deconvolution processing is carried out on the data through two-dimensional deconvolution, a convolution kernel with the size of 4 is adopted in convolution, the step length is set to be 2, and the number of output channels is 256; activating data by using an activation function ReLU after the convolution is finished; splicing with data obtained by processing the convolutional layer of the third layer, and embedding an EEG semantic fusion characteristic position into the data;
the data enters a deconvolution layer of a sixth layer, deconvolution processing is carried out on the data by two-dimensional deconvolution, a convolution kernel with the size of 4 is adopted in convolution, the step length is set to be 2, and the number of output channels is 128; activating data by using an activation function ReLU after the convolution is finished; splicing with data obtained by processing the convolution layer of the second layer, and embedding an EEG semantic fusion characteristic position into the data;
the data enters the deconvolution layer of the seventh layer, deconvolution processing is carried out on the data through two-dimensional deconvolution, the convolution adopts a convolution kernel with the size of 4, the step length is set to be 2, and the number of output channels is 64; the data is activated by the activation function ReLU after the convolution is finished; the result is then spliced with the data obtained by processing the convolution layer of the first layer (matching the 64-channel skip connection);
the data enters a deconvolution layer of an eighth layer, deconvolution processing is carried out on the data through two-dimensional deconvolution, convolution adopts convolution kernels with the size of 4, the step length is set to be 2, and the number of output channels is 3; activating data by using an activation function ReLU after the convolution is finished;
The loss function used in the training process is L_DDPM; the optimal semantic visualizer DDPM parameters are obtained through cyclic training with L_DDPM and stored in a local file.
7. The diffusion model framework-based electroencephalogram semantic visualization method according to claim 6, further comprising
a step S5 of testing and result evaluation,
wherein the test method is as follows: the EEG signal is input into the trained final EVR-Net model to obtain the semantic feature vector e; the semantic feature vector e and Gaussian noise x_T are then input into the final DDPM model, and t is iterated from T down to 0 according to the Markov chain to obtain the visualization image x_0, the Markov chain formula being as follows:
x_(t-1) = (1/√α_t)·(x_t - ((1 - α_t)/√(1 - ᾱ_t))·ε_θ(x_t, t, e)) + δ_t·z
where z is Gaussian noise and δ_t is the standard deviation used at step t of the reverse process.
8. The diffusion model framework-based electroencephalogram semantic visualization method according to claim 7, wherein the evaluation method in the step S5 is as follows:
the visualization images x_0 generated in the testing phase are evaluated with two indices:
(1) semantic accuracy of the generated visualization image x_0, i.e. whether the semantics of the generated image are the same as those of the EEG;
(2) image quality, evaluated with the Inception Score, where a higher value indicates higher quality of the generated images;
evaluation index implementation:
for semantic accuracy, the number of visualization images x_0 whose semantics match the corresponding EEG semantics is counted;
for image quality, the visualization images x_0 are input into a python script, which calls an Inception Score calculation algorithm from the library to obtain the Inception Score value.



Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination