CN116186544A - Single-channel electroencephalogram sleep stage-dividing method based on deep learning - Google Patents


Info

Publication number
CN116186544A
Authority
CN
China
Prior art keywords
sequence
capsule
sleep
layer
duration
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310285377.5A
Other languages
Chinese (zh)
Inventor
彭虎 (Peng Hu)
陈瑾 (Chen Jin)
韩志会 (Han Zhihui)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Hefei University of Technology
Original Assignee
Hefei University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Hefei University of Technology
Priority to CN202310285377.5A
Publication of CN116186544A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/084 Backpropagation, e.g. using gradient descent
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/24 Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
    • A61B5/316 Modalities, i.e. specific diagnostic methods
    • A61B5/369 Electroencephalography [EEG]
    • A61B5/372 Analysis of electroencephalograms
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235 Details of waveform analysis
    • A61B5/7264 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235 Details of waveform analysis
    • A61B5/7264 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • A61B5/7267 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 Reducing energy consumption in communication networks
    • Y02D30/70 Reducing energy consumption in communication networks in wireless communication networks

Landscapes

  • Health & Medical Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • General Health & Medical Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Psychiatry (AREA)
  • Public Health (AREA)
  • Surgery (AREA)
  • Medical Informatics (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Pathology (AREA)
  • Mathematical Physics (AREA)
  • Veterinary Medicine (AREA)
  • Evolutionary Computation (AREA)
  • Animal Behavior & Ethology (AREA)
  • Theoretical Computer Science (AREA)
  • Fuzzy Systems (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physiology (AREA)
  • Signal Processing (AREA)
  • Data Mining & Analysis (AREA)
  • Psychology (AREA)
  • Software Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Computational Linguistics (AREA)
  • Measurement And Recording Of Electrical Phenomena And Electrical Characteristics Of The Living Body (AREA)

Abstract

The invention discloses a single-channel electroencephalogram sleep staging method based on deep learning, which comprises the following steps: 1. constructing a training set from a polysomnogram and its sleep labels; 2. building a deep-learning-based electroencephalogram sleep staging network; 3. constructing an MFE loss function; 4. training the deep learning model; 5. performing sleep staging on the raw single-channel electroencephalogram signal with the trained single-channel sleep staging network, so that automatic sleep staging of electroencephalogram signals can be realized.

Description

Single-channel electroencephalogram sleep stage-dividing method based on deep learning
Technical Field
The invention belongs to the field of electroencephalogram sleep staging, and particularly relates to a single-channel electroencephalogram sleep staging method based on deep learning.
Background
Sleep occupies one third of a person's life. During sleep the electroencephalogram changes continuously, and the changes differ with the depth of sleep. According to the different characteristics of the electroencephalogram, sleep is divided into two states: non-rapid eye movement sleep (NREM sleep, also called orthodox sleep, slow-wave sleep, synchronized sleep, or quiet sleep) and rapid eye movement (REM) sleep, distinguished by the presence or absence of paroxysmal rapid eye movements and by different brain-wave characteristics. Sleep staging is necessary for the evaluation and measurement of sleep quality. Sleep staging is performed on the electrical signals recorded in a polysomnogram (PSG), captured by sensors attached to the scalp. A PSG includes the electroencephalogram (EEG), electrooculogram (EOG), electromyogram (EMG), and electrocardiogram (ECG). Out of concern for subject comfort during sleep monitoring, single-channel electroencephalogram signals are increasingly used for sleep staging. The PSG is divided into 30 s periods, and sleep specialists manually assign each period to a stage according to the decision rules formulated by Rechtschaffen and Kales (R&K) and the American Academy of Sleep Medicine (AASM). In clinical practice, manual sleep staging is very labor- and cost-intensive, so automatic sleep staging has become a topic of intensive study. Existing methods fall into two categories: traditional machine-learning-based approaches, which typically involve manually extracted features, and deep-learning-based approaches.
Traditional approaches typically use hand-crafted feature extraction to obtain time-domain or frequency-domain features from the electroencephalogram signal, and then train a sleep stage classifier on these features with conventional machine learning algorithms such as support vector machines (SVMs), random forests (RFs), decision trees, and hidden Markov models (HMMs). While such methods achieve acceptable performance, they require prior engineering knowledge and expertise in the relevant arts, and manual feature extraction takes a long time. Deep learning methods have also been applied to single-channel electroencephalogram sleep staging. Depending on the composition of the network, these deep-learning-based methods can be divided into two categories: convolutional neural networks (CNNs) and recurrent neural networks (RNNs). CNNs are most often used to extract sleep-related waveform features, and several researchers have successfully applied them to the sleep staging task: Tsinalis et al. use one-dimensional convolutions to extract features and a max-pooling layer to remove redundant information; Sors et al. propose a 14-layer deep CNN for sleep staging; Perslev et al. propose a CNN with an encoder-decoder structure for sleep staging; and Fiorillo et al. propose a multi-scale CNN that extracts features of different scales. There are temporal correlations within electroencephalogram signals that a CNN cannot learn but an RNN can; for example, Michielli et al. propose a cascaded RNN structure for sleep staging. Some methods combine CNN and RNN, extracting features with the CNN and learning temporal dependence with the RNN. For example, Supratak et al. propose a CNN- and RNN-based model that uses a CNN to obtain a representation of each 30-second duration signal and a long short-term memory network (LSTM) to learn the transition rules between duration phases.
In addition, there are methods that combine a CNN with attention mechanisms, extracting features with the CNN and learning the temporal correlation between duration phases with attention. Eldele et al. combine a CNN with channel attention and use an attention-based module to quantify the inter-dependencies between features; Qu et al. study the transition rules between durations by using a CNN to extract features together with attention mechanisms. Current methods thus use CNNs to extract features and RNNs or attention mechanisms to learn the time dependence of the duration signals. However, none of these methods sufficiently extracts the time dependence of the historical signal: the RNN's learning ability for the time dependence of long sequences is insufficient, while the attention mechanism needs to rely on a longer history sequence to learn the temporal dependence of sleep electroencephalogram signals, and its ability to learn the transition relations between durations is weaker than that of the RNN.
Disclosure of Invention
The invention aims to overcome the defects in the prior art by providing a single-channel electroencephalogram sleep staging method based on deep learning, so that the temporal correlation of duration signals can be learned more fully, the generalization capability of the single-channel sleep staging model can be improved, and the accuracy of single-channel sleep staging can be increased.
In order to achieve the aim of the invention, the invention adopts the following technical scheme:
the invention discloses a single-channel electroencephalogram sleep staging method based on deep learning, which is characterized by comprising the following steps:
step 1, acquiring a polysomnogram of a subject and the sleep label set corresponding to the polysomnogram; selecting the electroencephalogram sleep signal of one channel in the polysomnogram, and dividing the electroencephalogram sleep signal of that channel into N non-overlapping duration signal sequences X = {x_1, x_2, ..., x_i, ..., x_N}; building a label sequence set Y = {y_1, y_2, ..., y_i, ..., y_N} according to the sleep label corresponding to each duration, and simultaneously counting the number of labels of each sleep category among the sleep labels of all durations, {N_f | f = 1, 2, ..., D}. Here x_i denotes the i-th duration signal sequence, with x_i = {e_{i,1}, e_{i,2}, ..., e_{i,j}, ..., e_{i,l}}; y_i denotes the sleep label sequence corresponding to the i-th duration signal sequence x_i, with y_i = {y_{i,1}, y_{i,2}, ..., y_{i,j}, ..., y_{i,l}}; e_{i,j} ∈ R^{1×m} denotes the j-th duration signal in the i-th duration signal sequence, and y_{i,j} denotes the label of the j-th duration signal in the i-th duration signal sequence; N_f denotes the number of labels of the f-th sleep category, f ∈ {1, 2, ..., D}, y_{i,j} ∈ {1, 2, ..., D}, where D is the total number of sleep categories; j ∈ {1, 2, ..., l}, where l is the number of duration signals in x_i; m denotes the number of sampling points, with m = θ × δ, where θ is the sampling rate and δ is the sampling time;
step 2, setting up a deep-learning-based electroencephalogram sleep staging network, composed of: a duration feature extraction module, a context encoder, and a capsule network;
step 2.1, the duration feature extraction module processes the j-th duration signal e_{i,j} to obtain its feature map a_{i,j}, thereby obtaining the feature map sequence a_i = {a_{i,1}, a_{i,2}, ..., a_{i,j}, ..., a_{i,l}} of the i-th duration signal sequence x_i;
Step 2.2, the context encoder performs temporal correlation learning on the feature map sequence a_i to obtain the feature map sequence H_i = {H_{i,1}, H_{i,2}, ..., H_{i,j}, ..., H_{i,l}};
Step 2.3, constructing a capsule network, which consists of a primary capsule layer and a digital capsule layer;
step 2.3.1, the primary capsule layer vs. the feature map sequence H i Jth sub-feature map H i,j After treatment, a primary capsule is obtained
Figure BDA0004139619540000031
Thereby obtaining the sequence x with the ith duration i Is the jth duration signal e i,j Corresponding activated primary capsule sequence v i,j ={v 1 i,j ,v 2 i,j ,...,v k i,j, ...,v n i,j };
Step 2.3.2 Primary Capsule sequence v to be activated i,j Inputting into the digital capsule layer for processing to obtain j-th duration signals e related to all D sleep categories i,j Corresponding capsule sequences
Figure BDA0004139619540000032
Step 2.3.3, sequence of capsules U i,j Medium f high-grade characteristic capsule
Figure BDA0004139619540000033
Taking l 2 Norms, thereby obtaining the ith duration sequence x i J-th duration signal e i,j Probability of presence of the f-th sleep class +.>
Figure BDA0004139619540000034
And then obtain the ith duration sequence x i J-th duration signal e i,j Predicting probability vectors of the existence of all sleep categories output via the sleep staging network
Figure BDA0004139619540000035
Step 3, constructing a loss function l of the electroencephalogram sleep stage network by using the formula (7) MFE
Figure BDA0004139619540000036
In the formula (7), the amino acid sequence of the compound,
Figure BDA0004139619540000037
is the j-th duration signal e i,j Is the true label y of (2) i,j The existence probability of the f-th class in the single-hot coding of (a);
step 4, training the sleep staging network with a gradient descent algorithm, minimizing the loss function to optimize the network parameters until the loss function converges, thereby obtaining a sleep staging model for performing sleep staging on single-channel electroencephalogram signals.
The single-channel electroencephalogram sleep staging method based on deep learning is also characterized in that the duration feature extraction module in step 2.1 is stacked, in order, from: a first convolutional layer with one-dimensional stride S_1 and convolution kernel K_1, a first BN normalization layer, a first ReLU nonlinear activation layer, and a first max-pooling layer with pooling size k_{m1} and stride s_{m1}; a second convolutional layer with stride S_2 and convolution kernel K_2, a second BN normalization layer, and a second ReLU nonlinear activation layer; a third convolutional layer with stride S_2 and convolution kernel K_2, a third BN normalization layer, and a third ReLU nonlinear activation layer; a fourth convolutional layer with stride S_2 and convolution kernel K_2, a fourth BN normalization layer, a fourth ReLU nonlinear activation layer, and a second max-pooling layer with pooling size k_{m2} and stride s_{m2}.
The context encoder in step 2.2 comprises a forward LSTM unit and a backward LSTM unit;
the j-th feature map a_{i,j} of the feature map sequence a_i is processed by the forward LSTM unit to obtain the forward hidden state h→_j at time j, and by the backward LSTM unit to obtain the backward hidden state h←_j at time j; h→_j and h←_j are spliced to obtain the j-th sub-feature map output by the context encoder for the feature map sequence a_i, H_{i,j} = [h→_j, h←_j], thereby obtaining the feature map sequence H_i = {H_{i,1}, H_{i,2}, ..., H_{i,j}, ..., H_{i,l}} output by the context encoder.
The primary capsule layer in step 2.3.1 consists of a convolution layer with one-dimensional kernel size K_p × 1 and N_p convolution kernels, and a Reshape operation layer with capsule length l_d;
the j-th sub-feature map H_{i,j} of the feature map sequence H_i is processed by the convolution layer of the primary capsule layer to obtain the primary feature map P_{i,j}, which after the Reshape operation layer outputs the primary capsule sequence s_{i,j} of the j-th sub-feature map H_{i,j}, with s_{i,j} = {s^1_{i,j}, s^2_{i,j}, ..., s^k_{i,j}, ..., s^n_{i,j}} denoting the primary capsule sequence corresponding to the j-th duration signal e_{i,j} of the i-th duration sequence x_i; n is the number of capsules in the primary capsule sequence s_{i,j}, with n = l_P / l_d, where l_d is the length of the k-th capsule s^k_{i,j}, l_P is the length of P_{i,j}, and k ∈ {1, 2, 3, ..., n};
the squash activation function is applied to the k-th capsule s^k_{i,j} of the primary capsule sequence s_{i,j} to obtain the activated primary capsule v^k_{i,j} = squash(s^k_{i,j}), thereby obtaining the activated primary capsule sequence v_{i,j} = {v^1_{i,j}, v^2_{i,j}, ..., v^k_{i,j}, ..., v^n_{i,j}} corresponding to the j-th duration signal e_{i,j} of the i-th duration sequence x_i.
The digital capsule layer in step 2.3.2 calculates, by formula (1), the k-th prediction capsule û^{f,k}_{i,j} of the f-th subspace prediction capsule sequence mapped from the primary capsule sequence v_{i,j}, and thereby calculates, by formula (2), the f-th subspace prediction capsule sequence Û^f_{i,j}:

û^{f,k}_{i,j} = W^f_k v^k_{i,j}   (1)

Û^f_{i,j} = {û^{f,1}_{i,j}, û^{f,2}_{i,j}, ..., û^{f,n}_{i,j}}   (2)

In formula (1) and formula (2), W^f denotes the f-th learnable matrix and W^f_k denotes the k-th sub-matrix of W^f; f ∈ {1, 2, 3, ..., D};

the digital capsule layer calculates, by formula (3), the similarity matrix A^f_{i,j} between the prediction capsules of the f-th subspace prediction capsule sequence Û^f_{i,j}:

A^f_{i,j} = Û^f_{i,j} (Û^f_{i,j})^T   (3)

In formula (3), (Û^f_{i,j})^T denotes the transpose of the prediction capsule sequence Û^f_{i,j};

the digital capsule layer calculates, by formula (4), the weight c^{f,k}_{i,j} of the k-th sub-capsule û^{f,k}_{i,j} in the f-th subspace prediction capsule sequence:

c^{f,k}_{i,j} = exp(Σ_{o=1}^{n} A^f_{k,o}) / Σ_{p=1}^{n} exp(Σ_{o=1}^{n} A^f_{p,o})   (4)

In formula (4), A^f_{p,o} denotes the element in the p-th row and o-th column of the similarity matrix A^f_{i,j} of dimension n × n;

the digital capsule layer calculates, by formula (5), the advanced feature capsule u^f_{i,j} of output length l_{d+1} from the f-th subspace prediction capsule sequence, thereby obtaining the capsule sequence U_{i,j} = {u^1_{i,j}, ..., u^D_{i,j}} corresponding to the j-th duration signal e_{i,j} over all D sleep categories:

u^f_{i,j} = squash(Σ_{k=1}^{n} c^{f,k}_{i,j} û^{f,k}_{i,j})   (5)
The invention also relates to an electronic device comprising a memory and a processor, wherein the memory is used for storing a program that supports the processor in executing any of the above single-channel electroencephalogram sleep staging methods, and the processor is configured to execute the program stored in the memory.
The invention also relates to a computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, performs the steps of any of the above single-channel electroencephalogram sleep staging methods.
Compared with the prior art, the invention has the beneficial effects that:
1. By combining the capsule network with the LSTM, the invention converts the scalar network formed by the traditional combination of LSTM and fully-connected layer into a vector network combining the LSTM with a capsule network, which overcomes the insufficient generalization of the traditional LSTM-plus-fully-connected-layer approach and fully improves the LSTM's ability to learn the internal temporal correlation of sleep signals.
2. The invention completes the routing process from primary capsules to advanced capsules with a routing mechanism based on self-attention, which overcomes the large training cost of the traditional dynamic routing mechanism that cannot be trained in parallel, reduces the complexity of capsule network training, and allows the capsule network to be successfully applied to the electroencephalogram sleep staging task.
Drawings
FIG. 1 is a block diagram of a single channel sleep stage neural network designed in the present invention;
FIG. 2 is a diagram of a CNN module for feature extraction according to the present invention;
FIG. 3 is a schematic diagram of the long short-term memory model unit designed for use in the present invention;
FIG. 4 is a block diagram of a capsule network according to the present invention;
FIG. 5 is a schematic diagram of the self-attention mechanism of a capsule network contemplated for use with the present invention;
FIG. 6 is a diagram of a reconstructed network for regularization of the design model of the present invention;
FIG. 7 is a histogram of ablation experiment evaluation metrics of the single-channel sleep staging model designed in the present invention.
Detailed Description
In this embodiment, a single-channel electroencephalogram sleep staging method based on deep learning includes the following steps:
step 1, acquiring a polysomnogram of a subject and the sleep label set corresponding to the polysomnogram; selecting the electroencephalogram sleep signal of one channel in the polysomnogram, and dividing the electroencephalogram sleep signal of that channel into N non-overlapping duration signal sequences X = {x_1, x_2, ..., x_i, ..., x_N}; constructing a label sequence set Y = {y_1, y_2, ..., y_i, ..., y_N} according to the sleep label corresponding to each duration, and simultaneously counting the number of labels of each sleep category among the sleep labels of all durations, {N_f | f = 1, 2, ..., D}. Here x_i denotes the i-th duration signal sequence, with x_i = {e_{i,1}, e_{i,2}, ..., e_{i,j}, ..., e_{i,l}}; y_i denotes the sleep label sequence corresponding to the i-th duration signal sequence x_i, with y_i = {y_{i,1}, y_{i,2}, ..., y_{i,j}, ..., y_{i,l}}; e_{i,j} ∈ R^{1×m} denotes the j-th duration signal in the i-th duration signal sequence, and y_{i,j} denotes the label of the j-th duration signal in the i-th duration signal sequence; N_f denotes the number of labels of the f-th sleep category, f ∈ {1, 2, ..., D}, y_{i,j} ∈ {1, 2, ..., D}, where D is the total number of sleep categories; j ∈ {1, 2, ..., l}, where l is the number of duration signals in x_i; m denotes the number of sampling points, with m = θ × δ, where θ is the sampling rate and δ is the sampling time;
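The segmentation in step 1 can be sketched with numpy. The sampling rate θ = 100 Hz and epoch length δ = 30 s used below are illustrative assumptions (common for public sleep datasets), not values stated in the patent:

```python
import numpy as np

theta, delta = 100, 30          # sampling rate (Hz) and duration length (s); assumed
m = theta * delta               # m = θ × δ sample points per duration signal

def split_into_epochs(signal, m):
    """Drop the trailing remainder and reshape into (num_epochs, m)
    non-overlapping duration signals."""
    num_epochs = len(signal) // m
    return signal[:num_epochs * m].reshape(num_epochs, m)

recording = np.random.randn(10 * m + 123)   # fake single-channel EEG with a ragged tail
epochs = split_into_epochs(recording, m)
print(epochs.shape)                          # (10, 3000)
```

Each row of `epochs` corresponds to one e_{i,j} ∈ R^{1×m}; grouping consecutive rows into length-l windows then yields the sequences x_i.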
step 2, building a deep-learning-based electroencephalogram sleep staging network, as shown in fig. 1, composed of: a duration feature extraction module, a context encoder, and a capsule network;
step 2.1, the different sleep phases show significant waveform differences: low-amplitude mixed-frequency (LAMF) waves occur in the first phase (N1) of NREM sleep; K-complexes, which occur mainly in the second phase (N2), are together with sleep spindles the two salient features of that phase; and sawtooth waves are the only typical electroencephalographic feature of REM sleep. Convolutional neural networks are widely applied to feature extraction, so a convolutional neural network is designed as the feature extraction module. As shown in fig. 2, the duration feature extraction module is stacked, in order, from: a first convolutional layer with one-dimensional stride S_1 and convolution kernel K_1, a first BN normalization layer, a first ReLU nonlinear activation layer, and a first max-pooling layer with pooling size k_{m1} and stride s_{m1}; a second convolutional layer with stride S_2 and convolution kernel K_2, a second BN normalization layer, and a second ReLU nonlinear activation layer; a third convolutional layer with stride S_2 and convolution kernel K_2, a third BN normalization layer, and a third ReLU nonlinear activation layer; a fourth convolutional layer with stride S_2 and convolution kernel K_2, a fourth BN normalization layer, a fourth ReLU nonlinear activation layer, and a second max-pooling layer with pooling size k_{m2} and stride s_{m2}. The stacked convolution layers can merge receptive fields and also extract deeper features. A dropout layer with drop rate 0.5 is placed between the convolution layers to reduce overfitting, and the optimal parameters of the feature extraction module formed by the convolutional neural network are the result of continuous adjustment in experiments.
The j-th duration signal e_{i,j} in the i-th duration signal sequence x_i is processed by the duration feature extraction module to obtain the feature map a_{i,j} of the j-th duration signal e_{i,j}, thereby obtaining the feature map sequence a_i = {a_{i,1}, a_{i,2}, ..., a_{i,j}, ..., a_{i,l}} of the i-th duration signal sequence x_i;
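A single conv-ReLU-maxpool stage of such a module can be sketched in plain numpy. The kernel length, strides, and pooling sizes below are placeholders (the patent leaves S_1, K_1, k_{m1}, s_{m1} symbolic), and BN and dropout are omitted for brevity:

```python
import numpy as np

def conv1d(x, kernel, stride):
    """Valid 1-D convolution (cross-correlation) with stride."""
    out_len = (len(x) - len(kernel)) // stride + 1
    return np.array([np.dot(x[i*stride : i*stride + len(kernel)], kernel)
                     for i in range(out_len)])

def relu(x):
    return np.maximum(x, 0.0)

def maxpool1d(x, size, stride):
    """1-D max pooling over non-overlapping-or-strided windows."""
    out_len = (len(x) - size) // stride + 1
    return np.array([x[i*stride : i*stride + size].max()
                     for i in range(out_len)])

# Toy single-channel duration signal; K1 = 50, S1 = 6, k_m1 = s_m1 = 8 are assumed.
x = np.random.randn(3000)                       # one 30 s epoch at 100 Hz (assumed)
k = np.random.randn(50)                         # one learnable kernel
h = maxpool1d(relu(conv1d(x, k, stride=6)), size=8, stride=8)
print(h.shape)                                  # (61,)
```

The length arithmetic follows the usual valid-convolution rule: (3000 − 50)//6 + 1 = 492 after the convolution, then (492 − 8)//8 + 1 = 61 after pooling.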
Step 2.2, constructing a context encoder comprising a forward LSTM unit and a backward LSTM unit, which performs temporal correlation learning on the feature map sequence a_i:
According to fig. 3, the LSTM network is designed to learn the time dependence of a time series. It consists of gates with different functions performing different operations. The forget gate determines which information is discarded or retained, expressed as follows:

f_j = σ(W_f ⊙ [h_{j-1}, a_{i,j}] + b_f)   (9)

The purpose of the input gate and the output gate is to determine whether to replace the memory cell with a candidate value and to generate the active part of the current time step, expressed as:

i_j = σ(W_i ⊙ [h_{j-1}, a_{i,j}] + b_i)   (10)

o_j = σ(W_o ⊙ [h_{j-1}, a_{i,j}] + b_o)   (11)

The following equations govern the behavior of the LSTM cell:

C̃_j = tanh(W_C ⊙ [h_{j-1}, a_{i,j}] + b_C)   (12)

C_j = f_j * C_{j-1} + i_j * C̃_j   (13)

h_j = o_j * tanh(C_j)   (14)

where σ is the sigmoid activation function, tanh is the tanh activation function, W denotes a weight matrix, a_{i,j} is the feature vector of the input sequence at time j, h_j denotes the hidden state at the j-th time step, and b denotes a bias;
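As a concreteness check, one step of the standard LSTM cell described by equations (9)-(14) can be exercised with a small numpy sketch; the dimensions and random weights below are illustrative, not the patent's:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(a_j, h_prev, c_prev, W, b):
    """One LSTM time step following eqs. (9)-(14); W holds one weight
    matrix per gate, acting on the concatenated [h_{j-1}, a_{i,j}]."""
    z = np.concatenate([h_prev, a_j])
    f = sigmoid(W["f"] @ z + b["f"])            # forget gate, eq. (9)
    i = sigmoid(W["i"] @ z + b["i"])            # input gate,  eq. (10)
    o = sigmoid(W["o"] @ z + b["o"])            # output gate, eq. (11)
    c_tilde = np.tanh(W["c"] @ z + b["c"])      # candidate,   eq. (12)
    c = f * c_prev + i * c_tilde                # cell update, eq. (13)
    h = o * np.tanh(c)                          # hidden state, eq. (14)
    return h, c

rng = np.random.default_rng(0)
d_in, d_h = 8, 4                                # toy feature and hidden sizes
W = {g: rng.standard_normal((d_h, d_h + d_in)) for g in "fioc"}
b = {g: np.zeros(d_h) for g in "fioc"}
h, c = lstm_step(rng.standard_normal(d_in), np.zeros(d_h), np.zeros(d_h), W, b)
print(h.shape, c.shape)
```

Running a second copy of this cell over the reversed sequence and concatenating the two hidden states per step gives the bidirectional output H_{i,j} used by the context encoder.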
The j-th feature map a_{i,j} of the feature map sequence a_i is processed by the forward LSTM unit to obtain the forward hidden state h→_j at time j, and by the backward LSTM unit to obtain the backward hidden state h←_j at time j; h→_j and h←_j are spliced to obtain the j-th sub-feature map output by the context encoder, H_{i,j} = [h→_j, h←_j]. A bidirectional long short-term memory unit is chosen so that the temporal transition relations of the duration signals can be learned in both directions; thereby the feature map sequence H_i = {H_{i,1}, H_{i,2}, ..., H_{i,j}, ..., H_{i,l}} output by the context encoder is obtained;
Step 2.3, constructing a capsule network which, as shown in fig. 4, consists of a primary capsule layer and a digital capsule layer;
step 2.3.1, the primary capsule layer consists of a convolution layer with one-dimensional kernel size K_p × 1 and N_p convolution kernels, and a Reshape operation layer with capsule length l_d;
the j-th sub-feature map H_{i,j} of the feature map sequence H_i is processed by the convolution layer of the primary capsule layer to obtain the primary feature map P_{i,j}. This convolution layer is a one-dimensional 1×1 convolution layer; the kernel size is set to 1 because a 1×1 convolution can better fuse features between different channels, thereby converting them into primary capsules more effectively and increasing nonlinearity, and the number of convolution kernels is kept consistent with the number of channels of the input features. After processing by the Reshape operation layer, the primary capsule sequence s_{i,j} of the j-th sub-feature map H_{i,j} is output, with s_{i,j} = {s^1_{i,j}, s^2_{i,j}, ..., s^k_{i,j}, ..., s^n_{i,j}} denoting the primary capsule sequence corresponding to the j-th duration signal e_{i,j} of the i-th duration sequence x_i; n is the number of capsules in the primary capsule sequence s_{i,j}, with n = l_P / l_d, where l_d is the length of the k-th capsule s^k_{i,j}, l_P is the length of P_{i,j}, and k ∈ {1, 2, 3, ..., n};
step 2.3.2, the squash activation function is applied to the k-th capsule s^k_{i,j} of the primary capsule sequence s_{i,j} to obtain the activated primary capsule v^k_{i,j} = squash(s^k_{i,j}). The squash activation serves two purposes: normalizing the capsules and improving the nonlinear learning ability of the capsule network. Thereby the activated primary capsule sequence v_{i,j} = {v^1_{i,j}, v^2_{i,j}, ..., v^k_{i,j}, ..., v^n_{i,j}} corresponding to the j-th duration signal e_{i,j} of the i-th duration sequence x_i is obtained;
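A minimal sketch of the squash function in the standard capsule-network form (which the patent's "squash activation" appears to denote) shows both properties: the direction of the capsule is preserved while its norm is mapped into [0, 1), so the norm can act as a probability:

```python
import numpy as np

def squash(s, eps=1e-8):
    """Capsule squash: v = (||s||^2 / (1 + ||s||^2)) * s / ||s||."""
    sq_norm = np.sum(s * s)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

v = squash(np.array([3.0, 4.0]))     # ||s|| = 5
print(np.linalg.norm(v))             # 25/26 ≈ 0.9615
```

Long capsules are squashed toward unit length and short ones toward zero, which is exactly the normalization the text describes.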
Step 2.3.3, as shown in fig. 5, the activated primary capsule sequence v_{i,j} is input into the digital capsule layer, which calculates, by formula (1), the k-th prediction capsule û^{f,k}_{i,j} of the f-th subspace prediction capsule sequence mapped from the primary capsule sequence v_{i,j}, and thereby calculates, by formula (2), the f-th subspace prediction capsule sequence Û^f_{i,j}:

û^{f,k}_{i,j} = W^f_k v^k_{i,j}   (1)

Û^f_{i,j} = {û^{f,1}_{i,j}, û^{f,2}_{i,j}, ..., û^{f,n}_{i,j}}   (2)

In formula (1) and formula (2), W^f denotes the f-th learnable matrix and W^f_k denotes the k-th sub-matrix of W^f; f ∈ {1, 2, 3, ..., D};

Step 2.3.4, the digital capsule layer calculates, by formula (3), the similarity matrix A^f_{i,j} between the prediction capsules of the f-th subspace prediction capsule sequence Û^f_{i,j}:

A^f_{i,j} = Û^f_{i,j} (Û^f_{i,j})^T   (3)

In formula (3), (Û^f_{i,j})^T denotes the transpose of the prediction capsule sequence Û^f_{i,j};
Step 2.3.5: the digital capsule layer computes, via equation (4), the weight c_{f,k}^{i,j} occupied by the k-th prediction capsule û_{f,k}^{i,j} of the f-th subspace prediction capsule sequence Û_f^{i,j}:

c_{f,k}^{i,j} = exp(Σ_{o=1}^{n} A_f^{i,j}[k, o]) / Σ_{p=1}^{n} exp(Σ_{o=1}^{n} A_f^{i,j}[p, o])    (4)

In equation (4), A_f^{i,j}[p, o] denotes the element in the p-th row and o-th column of the n×n similarity matrix A_f^{i,j}.
Step 2.3.6: the digital capsule layer computes, via equation (5), the output of the f-th subspace prediction capsule sequence Û_f^{i,j}: an advanced feature capsule u_f^{i,j} of length l_{d+1}:

u_f^{i,j} = squash( Σ_{k=1}^{n} c_{f,k}^{i,j} û_{f,k}^{i,j} )    (5)

thereby obtaining the capsule sequence U_{i,j} = {u_1^{i,j}, u_2^{i,j}, ..., u_D^{i,j}} corresponding to the j-th duration signal e_{i,j} over all D sleep categories.
The attention-based routing mechanism thus comprises two parts: the first is the matrix W_f, which generates the prediction capsules; the second is the process of assigning weights and generating new capsules through the attention mechanism. Together the two implement routing by a self-attention mechanism.
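The prediction, similarity, weighting and squash steps of the digital capsule layer (equations (1)–(5)) can be sketched as follows in NumPy. The softmax over row sums of the similarity matrix is one plausible reading of the weighting step; shapes and names are illustrative, not the patent's reference implementation:

```python
import numpy as np

def squash(s, eps=1e-8):
    norm_sq = np.sum(s * s, axis=-1, keepdims=True)
    return (norm_sq / (1.0 + norm_sq)) * s / np.sqrt(norm_sq + eps)

def attention_route(v, W):
    """Self-attention routing for one duration signal.

    v: (n, l_d)            activated primary capsules v_{i,j}
    W: (D, n, l_d, l_out)  learnable matrices W_f, one per sleep class
    returns: (D, l_out)    advanced feature capsules u_f
    """
    D, n, l_d, l_out = W.shape
    out = np.zeros((D, l_out))
    for f in range(D):
        # prediction capsules of the f-th subspace: u_hat[k] = W_f^k v^k
        u_hat = np.einsum('kdo,kd->ko', W[f], v)       # (n, l_out)
        # similarity matrix between prediction capsules
        A = u_hat @ u_hat.T                            # (n, n)
        # weights: softmax over row sums of A (assumed form)
        scores = A.sum(axis=1)
        c = np.exp(scores - scores.max())
        c = c / c.sum()
        # weighted sum of predictions, squashed into the advanced capsule
        out[f] = squash(np.einsum('k,ko->o', c, u_hat))
    return out

rng = np.random.default_rng(0)
n, l_d, l_out, D = 16, 8, 16, 5            # D = 5 sleep classes W/N1/N2/N3/REM
u = attention_route(rng.standard_normal((n, l_d)) * 0.1,
                    rng.standard_normal((D, n, l_d, l_out)) * 0.1)
assert u.shape == (D, l_out)
assert np.all(np.linalg.norm(u, axis=-1) < 1.0)
```

Unlike iterative dynamic routing, the weights here are computed in a single pass from the pairwise similarities, which is what keeps the parameter count and runtime low.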
Step 2.3.7: the l2 norm is taken of the f-th advanced feature capsule u_f^{i,j} in the capsule sequence U_{i,j}, giving the probability p_f^{i,j} that the f-th sleep category is present for the j-th duration signal e_{i,j} of the i-th duration sequence x_i; collecting these over f gives the prediction probability vector p^{i,j} = (p_1^{i,j}, p_2^{i,j}, ..., p_D^{i,j}) of all sleep categories output by the sleep staging network.
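The class probabilities are simply the l2 norms of the advanced feature capsules, as described above; a sketch with illustrative capsule values (the argmax gives the predicted sleep stage):

```python
import numpy as np

# Advanced feature capsules for one duration signal, shape (D, l_out);
# the values below are illustrative only.
U = np.array([[0.1, 0.2],    # W
              [0.0, 0.05],   # N1
              [0.6, 0.5],    # N2
              [0.2, 0.1],    # N3
              [0.3, 0.2]])   # REM

probs = np.linalg.norm(U, axis=-1)   # presence probability per sleep class
pred = int(np.argmax(probs))         # index of the predicted stage
assert pred == 2                     # N2 has the longest capsule here
```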
Step 3: the loss function l_MFE of the electroencephalogram sleep staging network is constructed using equation (7):

l_MFE = Σ_{f=1}^{D} (1/N_f) Σ_{i=1}^{N} Σ_{j=1}^{l} 1[y_{i,j} = f] · Σ_{c=1}^{D} (ŷ_{i,j}^c − p_c^{i,j})²    (7)

In equation (7), ŷ_{i,j}^f is the presence probability of the f-th class in the one-hot encoding of the true label y_{i,j} of the j-th duration signal e_{i,j}. Since sleep staging is an imbalanced classification task, a loss function that assumes a balanced distribution would increase the probability of signals being identified as the most frequent classes in the training set, causing errors. The MFE weighted loss function solves this well: it forces the network to prioritize classes with small sample sizes, alleviating the distribution imbalance caused by the sample counts. Moreover, because the distribution of the validation set cannot be determined accurately in advance, it is an advantage that this loss function requires no hyper-parameters to tune.
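A sketch of a mean-false-error style loss, under the assumption that equation (7) averages the per-sample squared error within each class before summing over classes (the exact form is not fully recoverable from this translation):

```python
import numpy as np

def mfe_loss(y_true, y_prob):
    """Mean-false-error style loss: per-class mean squared error, summed
    over classes (an assumed reading of equation (7)).

    y_true: (M,) integer labels in {0, ..., D-1}
    y_prob: (M, D) predicted class-presence probabilities
    """
    M, D = y_prob.shape
    one_hot = np.eye(D)[y_true]
    loss = 0.0
    for f in range(D):
        mask = y_true == f
        if mask.any():
            err = np.sum((one_hot[mask] - y_prob[mask]) ** 2, axis=1)
            loss += err.mean()          # (1 / N_f) * sum over class-f samples
    return loss

y = np.array([0, 0, 0, 1])              # imbalanced toy labels: 3 vs 1
p_a = np.array([[1., 0.], [1., 0.], [1., 0.], [.4, .6]])  # rare class off by 0.4
p_b = np.array([[.6, .4], [.6, .4], [.6, .4], [0., 1.]])  # frequent class off by 0.4
# a single rare-class error costs as much as the whole frequent class being
# equally wrong, so minority-class mistakes are not drowned out
assert np.isclose(mfe_loss(y, p_a), mfe_loss(y, p_b))
```

Because each class contributes its mean error regardless of its sample count, the rare N1 stage carries the same weight as the abundant W and N2 stages.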
To improve network performance, reconstruction regularization is used. As shown in FIG. 6, a fully connected layer is placed at the end of the network to reconstruct the original input signal, which reduces overfitting. The Euclidean distance between the reconstructed signal and the original signal is used to compute the reconstruction loss l_Recon; this regularization loss is multiplied by a coefficient γ and added to the loss function, giving the total loss:
l_total = l_MFE + γ · l_Recon    (15)
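Equation (15) just adds the γ-weighted reconstruction term to the staging loss; a sketch, taking l_Recon as the squared Euclidean distance between the original duration signal and its reconstruction (the squared form is an assumption):

```python
import numpy as np

def total_loss(l_mfe, x, x_recon, gamma=1e-4):
    """l_total = l_MFE + gamma * l_Recon, as in equation (15).

    l_Recon is taken here as the squared Euclidean distance between the
    original duration signal x and its reconstruction x_recon;
    gamma = 1e-4 matches the experimental setting below.
    """
    l_recon = np.sum((x - x_recon) ** 2)
    return l_mfe + gamma * l_recon

x = np.zeros(3000)                       # e.g. a 30 s epoch sampled at 100 Hz
assert total_loss(0.5, x, x) == 0.5      # perfect reconstruction adds nothing
assert total_loss(0.5, x, x + 1.0) > 0.5 # any reconstruction error adds a penalty
```

The small γ keeps reconstruction a regularizer rather than a competing objective, so it nudges the capsules to retain signal information without dominating the staging loss.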
Step 4: the sleep staging network is trained with a gradient descent algorithm, minimizing the loss function to optimize the network parameters until the loss function converges, thereby obtaining a sleep staging model that realizes sleep staging of single-channel electroencephalogram signals.
In this embodiment, an electronic device comprises a memory and a processor: the memory stores a program that supports the processor in executing the above method, and the processor is configured to execute the program stored in the memory.
In this embodiment, a computer-readable storage medium stores a computer program that, when executed by a processor, performs the steps of the method described above.
Examples:
Experimental dataset: the model was evaluated on the Sleep-EDFx dataset, from which 153 PSG recordings were used; these come from a study of age effects in healthy subjects (SC) and were collected from 78 participants. For comparison with existing methods, Sleep-EDFx was divided into two datasets, Sleep-EDF-39 and Sleep-EDF-153. Sleep-EDF-39 comprises the first 20 subjects of SC, of whom subject 13 has only one night of data while the other 19 have two nights each; Sleep-EDF-153 contains all 78 subjects. To demonstrate the broad applicability of the model to different channels, the method of the present invention was tested on the Fpz-Cz channel, the Pz-Oz channel and the EOG channel, respectively.
As shown in Table 1, the model was evaluated with a k-fold cross-validation scheme, with k equal to 20 for Sleep-EDF-39: the data of 19 subjects are used as the training set and the data of 1 subject as the test set, with 10% of the training set held out for validation. The mini-batch size for training was 20.
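The 20-fold scheme described above splits by subject rather than by epoch, so no subject's recordings appear in both the training and test sets; a sketch (subject IDs illustrative):

```python
import numpy as np

def subject_folds(subject_ids, k=20, seed=0):
    """Yield (train, test) subject lists for subject-wise k-fold CV.

    Splitting by subject keeps all recordings of one person in the same
    fold, avoiding leakage between training and test sets.
    """
    rng = np.random.default_rng(seed)
    ids = np.array(subject_ids)
    rng.shuffle(ids)
    for test in np.array_split(ids, k):
        train = np.setdiff1d(ids, test)
        yield train.tolist(), test.tolist()

subjects = list(range(20))               # Sleep-EDF-39: 20 subjects, k = 20
folds = list(subject_folds(subjects, k=20))
assert len(folds) == 20
# each subject is tested exactly once and trained on in the other 19 folds
tested = sorted(s for _, test in folds for s in test)
assert tested == subjects
assert all(len(train) == 19 and len(test) == 1 for train, test in folds)
```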
Table 1: sample distribution and number of subjects for each sleep stage class in each dataset

Dataset        N_s  W      N1     N2     N3     REM    Total
Sleep-EDF-39   20   10197  2804   17799  5703   7717   44220
Sleep-EDF-153  78   69824  21522  69132  13039  25835  199352
Experimental parameters: the length l of each sequence is 15. The model weights are updated with the Adam optimizer, whose parameters lr, β1 and β2 are set to 10^-4, 0.90 and 0.999 respectively. For regularization, an L2 regularization term with weight 10^-3 is added to the loss function; the reconstruction loss of the capsule network is also applied in the loss function, with its weight γ set to 10^-4. The maximum number of training epochs is set to 200; if the validation loss does not decrease within 20 iterations, training stops and the weights with the lowest validation loss are saved as the final weights. TensorFlow was used as the experimental framework, and training was accelerated with an NVIDIA GeForce GTX 1070 graphics card with 8 GB of video memory.
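The stopping rule above (patience 20 on the validation loss, keep the best weights) can be sketched as follows; the precomputed loss list stands in for a real training loop:

```python
def train_with_early_stopping(val_losses, max_epochs=200, patience=20):
    """Simulate the stopping rule: stop once the validation loss has not
    improved for `patience` consecutive epochs, and report the epoch
    whose weights would be kept.

    val_losses: per-epoch validation losses (a stand-in for training).
    """
    best_loss, best_epoch, wait = float('inf'), -1, 0
    for epoch, loss in enumerate(val_losses[:max_epochs]):
        if loss < best_loss:
            best_loss, best_epoch, wait = loss, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                break                    # patience exhausted: stop training
    return best_epoch, best_loss

# loss improves until epoch 4, then plateaus: training stops 20 epochs
# later, keeping the epoch-4 weights
losses = [1.0, 0.8, 0.6, 0.5, 0.4] + [0.45] * 100
best_epoch, best_loss = train_with_early_stopping(losses)
assert (best_epoch, best_loss) == (4, 0.4)
```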
Evaluation parameters: model performance is evaluated with per-class precision (PR), per-class recall (RE), per-class F1 score (F1), macro-averaged F1 score (MF1), overall accuracy (ACC) and Cohen's Kappa coefficient (κ). PR reflects the proportion of true positives among the samples the classifier judges positive; RE reflects the proportion of samples correctly judged positive among all truly positive samples; F1 is a combined evaluation based on PR and RE, and a higher F1 indicates a more effective classifier; MF1, the average of F1 over all classes, reflects the overall performance of the model; Cohen's Kappa coefficient (κ) measures the consistency of two different sleep staging results, here the model output and the labels produced by the sleep experts. The calculation formulas are:
PR_f = TP_f / (TP_f + FP_f)

RE_f = TP_f / (TP_f + FN_f)

F1_f = 2 · PR_f · RE_f / (PR_f + RE_f)

MF1 = (1/D) Σ_{f=1}^{D} F1_f

ACC = (Σ_{f=1}^{D} TP_f) / M, where M is the total number of samples

κ = (ACC − P_e) / (1 − P_e)
wherein TP denotes true positives and TP_f the true positives of class f; TN denotes true negatives and TN_f the true negatives of class f; FP denotes false positives and FP_f the false positives of class f; FN denotes false negatives and FN_f the false negatives of class f; and P_e is the hypothetical probability of chance agreement, computed from the marginal class distributions of the two raters.
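The per-class metrics and Cohen's κ above can be computed directly from a confusion matrix C, where C[t, p] counts epochs of true class t predicted as class p (a generic sketch; P_e is the chance-agreement term from the row and column marginals):

```python
import numpy as np

def staging_metrics(C):
    """PR, RE and F1 per class, plus MF1, ACC and Cohen's kappa, from an
    n_class x n_class confusion matrix C (rows: truth, cols: prediction)."""
    C = np.asarray(C, dtype=float)
    total = C.sum()
    tp = np.diag(C)
    pr = tp / C.sum(axis=0)              # TP_f / (TP_f + FP_f)
    re = tp / C.sum(axis=1)              # TP_f / (TP_f + FN_f)
    f1 = 2 * pr * re / (pr + re)
    acc = tp.sum() / total
    # chance agreement from the marginals of truth and prediction
    p_e = (C.sum(axis=0) * C.sum(axis=1)).sum() / total ** 2
    kappa = (acc - p_e) / (1 - p_e)
    return pr, re, f1, f1.mean(), acc, kappa

# Toy 2-class confusion matrix: 40 + 45 correct out of 100 epochs
C = [[40, 10],
     [ 5, 45]]
pr, re, f1, mf1, acc, kappa = staging_metrics(C)
assert np.isclose(acc, 0.85)
assert np.isclose(pr[0], 40 / 45) and np.isclose(re[0], 40 / 50)
assert 0 < kappa < 1
```

For a class that is never predicted the PR denominator is zero; a production version would guard those divisions, omitted here for brevity.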
Experimental evaluation: classification performance is reported in Tables 2, 3 and 4, which list the confusion matrices of the proposed model applied to the Fpz-Cz, Pz-Oz and EOG channels, respectively. The diagonal elements of a confusion matrix represent agreement between the model's labels and the expert classification, and the right side of each table gives the per-class evaluation parameters; the higher the evaluation parameters, the better. Among the channels used, the Fpz-Cz channel performs best and the EOG channel worst. Performance is best in the W and N2 stages and worst in the N1 stage, which is easily confused with other sleep stages, mainly because of the small amount of N1 data and the similarity of N1 to other sleep stages, making it harder to distinguish.
Table 2: the output confusion matrix for the model of the Fpz-Cz channel signal in the Sleep-EDF-39 dataset is used.
Table 3: the output confusion matrix for the model of the Pz-Oz channel signal in the Sleep-EDF-39 dataset is used.
Table 4: the output confusion matrix for the model of the EOG channel signals in the Sleep-EDF-39 dataset is used.
Ablation experiments were used to show the function of each component of the model, by comparing the model's performance with and without the capsule-network part. Four basic models were compared:

(1) CNN+LSTM, comprising one CNN and one unidirectional LSTM;
(2) CNN+Bi-LSTM, comprising one CNN and one bidirectional LSTM;
(3) CNN+Bi-LSTM+CapsNet, comprising one CNN, one bidirectional LSTM and a capsule network, without reconstruction loss;
(4) the proposed model, SleepCapsuleNet, comprising one CNN, one bidirectional LSTM, a capsule network and the reconstruction loss.
The different structures were evaluated with macro-averaged F1 score (MF1), overall accuracy (ACC) and Cohen's Kappa coefficient. FIG. 7 depicts a histogram of the performance of the various models under these evaluation parameters. The bidirectional LSTM performs better than the unidirectional LSTM model, mainly because a unidirectional LSTM can capture the time dependence in only one direction, whereas a bidirectional LSTM captures it in both directions, which benefits learning of the transition rules between sleep stages. The CNN+Bi-LSTM+CapsNet model outperforms the CNN+Bi-LSTM model on all three evaluation indices, indicating that adding a capsule network improves the performance of a Bi-LSTM network. Comparing the effect of the reconstruction loss shows that reconstruction regularization improves the performance of the network by reducing overfitting; the best performance is achieved when the reconstruction regularization loss is included.
The model was also compared with the most advanced methods, using the macro-averaged F1 score (MF1), overall accuracy (ACC) and Cohen's Kappa coefficient as evaluation parameters.

Table 5 shows the comparison with the current most advanced methods, including DeepSleepNet, SleepEEGNet, TinySleepNet, CCRRSleepNet, AttnNet and EOGNet. The proposed model is superior to the other models in accuracy, reaching 85.84% on Sleep-EDF-39 and 83.4% on Sleep-EDF-153; in particular, it is superior in the following respects.

In terms of W-stage recognition accuracy and generalization on large datasets, when staging with EOG channel signals the model achieves results superior to EOGNet in all respects. The model is a sequence-to-sequence model that takes multiple consecutive durations as input and outputs the sleep stage corresponding to each duration. DeepSleepNet, by contrast, employs a two-step training method: it first learns a representation of each epoch and then trains an RNN to learn the transition relationships between epochs; the proposed model requires no such two-step learning. The first half of the model is comparable to that of TinySleepNet, except that TinySleepNet flattens the features extracted by the CNN before feeding them to the LSTM and requires data augmentation during training. AttnNet is a network that uses multiple attention mechanisms to capture time dependence; unlike the network of the present invention, it achieves its best results by taking three epochs as input and outputting the label of the middle epoch. Furthermore, its loss function is a class-aware loss function whose hyper-parameters lack a reasoned interpretation and must be re-adjusted for different datasets, whereas the network of the present invention requires no such adjustment.
Table 5: comparison of evaluation parameters between the method herein and the most advanced methods
The model in the invention can be applied to different EEG channels (Fpz-Cz and Pz-Oz) and to the EOG channel without modifying its structure or training algorithm. From the experiments on each channel, sleep staging with the Fpz-Cz channel achieves better performance than the other channels, indicating that the Fpz-Cz channel carries substantially more sleep-related information. Performance is best in the W stage and worst in the N1 stage; identifying N1 has always been the difficulty of the sleep staging task, partly because N1 data are relatively limited and partly because the stage itself is hard to score and easily confused with other stages. The model proposed in the invention can perform sleep staging from unprocessed raw single-channel electroencephalogram signals, which is friendly to subjects and to people lacking expertise in the sleep field, since using a single-channel EEG signal reduces the subject's discomfort during measurement. This, however, also increases the risk of inaccurate staging: sleep-related signatures do not always appear in a single channel, and the signal may even contain noise of varying degree and type. For example,
in Sleep-EDF-39, the validation accuracy for subject 11 on the Fpz-Cz channel is only 60%, the worst of the 20-fold cross-validation results, mainly because high-intensity noise in that validation set causes durations of other classes to be identified as W or REM periods. The model in the invention captures the inherent spatial relationships in the signal by introducing a capsule network, making staging more accurate. Compared with a unidirectional LSTM, the Bi-LSTM can consider temporal reference information from both directions, which also makes the staging more accurate. The self-attention routing mechanism used in the invention fully considers the inherent similarity among capsules while using fewer parameters; through this efficient self-attention mechanism, capsule networks can be widely applied to sleep staging tasks based on EEG signals. In clinical practice, sleep experts need to consult the single-channel EEG signal together with other channels, because sleep-related features such as sleep spindles are not simultaneously present in all EEG signals, and eye-movement and electromyographic signals are important for judging the sleep stage. In addition, experts differ in scoring style, and different styles lead to different scoring results; even so, the overall kappa value of the model in the invention on Sleep-EDF-39 exceeds 0.8. The invention therefore has practical value in comparison with the most advanced methods.
When the model of the invention is used for sleep staging, it can be trained on data accumulated in hospitals and complete staging in a short time without manually extracted features, which helps the monitored person quickly make psychological or physiological adjustments according to the result. In particular, the invention can reduce the discomfort of infants during infant sleep monitoring, which benefits the monitoring of infant development; combining the model of the invention with an electroencephalogram signal acquisition device makes individual sleep monitoring more convenient.

Claims (7)

1. A single-channel electroencephalogram sleep stage method based on deep learning is characterized by comprising the following steps:
step 1, acquiring a polysomnogram of a subject and the sleep label set corresponding to the polysomnogram, selecting the electroencephalogram sleep signal of one channel of the polysomnogram, and dividing the electroencephalogram sleep signal of that channel into N non-overlapping duration signal sequences X = {x_1, x_2, ..., x_i, ..., x_N}; building the label sequence set Y = {y_1, y_2, ..., y_i, ..., y_N} according to the sleep label corresponding to each duration, and at the same time counting the number of labels {N_f | f = 1, 2, ..., D} of each sleep category among the sleep labels corresponding to all durations; wherein x_i denotes the i-th duration signal sequence, x_i = {e_{i,1}, e_{i,2}, ..., e_{i,j}, ..., e_{i,l}}; y_i denotes the sleep label sequence corresponding to the i-th duration signal sequence x_i, y_i = {y_{i,1}, y_{i,2}, ..., y_{i,j}, ..., y_{i,l}}; e_{i,j} ∈ R^{1×m} denotes the j-th duration signal in the i-th duration signal sequence; y_{i,j} denotes the label of the j-th duration signal in the i-th duration signal sequence; N_f denotes the number of labels of the f-th sleep category, f ∈ {1, 2, ..., D}; y_{i,j} ∈ {1, 2, ..., D}, D being the total number of sleep stage classes; j ∈ {1, 2, ..., l}, l being the number of duration signals in x_i; m denotes the number of sampling points, m = θ × δ, where θ denotes the sampling rate and δ denotes the sampling duration;
step 2, setting up a deep-learning-based electroencephalogram sleep staging network composed of a duration feature extraction module, a context encoder and a capsule network;
step 2.1, the duration feature extraction module processes the j-th duration signal e_{i,j} to obtain its feature map a_{i,j}, thereby obtaining the feature map sequence a_i = {a_{i,1}, a_{i,2}, ..., a_{i,j}, ..., a_{i,l}} of the i-th duration signal sequence x_i;
step 2.2, the context encoder performs time-correlation learning on the feature map sequence a_i to obtain the feature map sequence H_i = {H_{i,1}, H_{i,2}, ..., H_{i,j}, ..., H_{i,l}};
Step 2.3, constructing a capsule network, which consists of a primary capsule layer and a digital capsule layer;
step 2.3.1, the primary capsule layer processes the j-th sub-feature map H_{i,j} of the feature map sequence H_i to obtain activated primary capsules v_{i,j}^k, thereby obtaining the activated primary capsule sequence v_{i,j} = {v_{i,j}^1, v_{i,j}^2, ..., v_{i,j}^k, ..., v_{i,j}^n} corresponding to the j-th duration signal e_{i,j} of the i-th duration sequence x_i;
step 2.3.2, inputting the activated primary capsule sequence v_{i,j} into the digital capsule layer for processing to obtain the capsule sequence U_{i,j} = {u_1^{i,j}, u_2^{i,j}, ..., u_D^{i,j}} corresponding to the j-th duration signal e_{i,j} over all D sleep categories;
step 2.3.3, taking the l2 norm of the f-th advanced feature capsule u_f^{i,j} in the capsule sequence U_{i,j}, thereby obtaining the probability p_f^{i,j} that the f-th sleep category is present for the j-th duration signal e_{i,j} of the i-th duration sequence x_i, and then the prediction probability vector p^{i,j} = (p_1^{i,j}, p_2^{i,j}, ..., p_D^{i,j}) of all sleep categories output by the sleep staging network;
step 3, constructing the loss function l_MFE of the electroencephalogram sleep staging network using equation (7):

l_MFE = Σ_{f=1}^{D} (1/N_f) Σ_{i=1}^{N} Σ_{j=1}^{l} 1[y_{i,j} = f] · Σ_{c=1}^{D} (ŷ_{i,j}^c − p_c^{i,j})²    (7)

in equation (7), ŷ_{i,j}^f is the presence probability of the f-th class in the one-hot encoding of the true label y_{i,j} of the j-th duration signal e_{i,j};
step 4, training the sleep staging network with a gradient descent algorithm and minimizing the loss function to optimize the network parameters until the loss function converges, thereby obtaining a sleep staging model for realizing sleep staging of single-channel electroencephalogram signals.
2. The method for single-channel electroencephalogram sleep staging based on deep learning according to claim 1, wherein the duration feature extraction module in step 2.1 comprises, in order: a one-dimensional first convolution layer with stride S_1 and kernel size K_1, a first BN normalization layer, a first ReLU nonlinear activation layer, and a first max pooling layer with pooling size k_m1 and stride s_m1; a second convolution layer with stride S_2 and kernel size K_2, a second BN normalization layer, and a second ReLU nonlinear activation layer; a third convolution layer with stride S_2 and kernel size K_2, a third BN normalization layer, and a third ReLU nonlinear activation layer; a fourth convolution layer with stride S_2 and kernel size K_2, a fourth BN normalization layer, a fourth ReLU nonlinear activation layer, and a second max pooling layer with pooling size k_m2 and stride s_m2.
3. The single-channel electroencephalogram sleep staging method according to claim 2, wherein the context encoder in step 2.2 comprises a forward LSTM unit and a backward LSTM unit;

the j-th feature map a_{i,j} of the feature map sequence a_i is processed by the forward LSTM unit to obtain the forward hidden state h_j^→ output at time j, and by the backward LSTM unit to obtain the backward hidden state h_j^← output at time j; h_j^→ and h_j^← are spliced to obtain the j-th sub-feature map H_{i,j} = [h_j^→; h_j^←] output by the context encoder for the feature map sequence a_i, thereby obtaining the feature map sequence H_i = {H_{i,1}, H_{i,2}, ..., H_{i,j}, ..., H_{i,l}} output by the context encoder.
4. The method of deep-learning-based single-channel electroencephalogram sleep staging according to claim 3, wherein the primary capsule layer in step 2.3.1 is composed of a one-dimensional convolution layer with kernel size K_p × 1 and N_p convolution kernels, and a Reshape operation layer of length l_d;

the j-th sub-feature map H_{i,j} of the feature map sequence H_i is processed by the convolution layer of the primary capsule layer to obtain the primary feature map P_{i,j}, and after processing by the Reshape operation layer the primary capsule sequence s_{i,j} of the j-th sub-feature map H_{i,j} is output, with s_{i,j} = {s_{i,j}^1, s_{i,j}^2, ..., s_{i,j}^k, ..., s_{i,j}^n}, where s_{i,j} denotes the primary capsule sequence corresponding to the j-th duration signal e_{i,j} of the i-th duration sequence x_i; n is the number of capsules in s_{i,j}, with n = l_P / l_d; l_d is the length of the k-th capsule s_{i,j}^k; l_P is the length of P_{i,j}; and k ∈ {1, 2, 3, ..., n};

the squash activation function is applied to the k-th capsule s_{i,j}^k of the primary capsule sequence s_{i,j} to obtain the activated primary capsule v_{i,j}^k, thereby obtaining the activated primary capsule sequence v_{i,j} = {v_{i,j}^1, v_{i,j}^2, ..., v_{i,j}^k, ..., v_{i,j}^n} corresponding to the j-th duration signal e_{i,j} of the i-th duration sequence x_i.
5. The method of single-channel electroencephalogram sleep staging based on deep learning according to claim 4, wherein the digital capsule layer in step 2.3.2 computes, via equation (1), the k-th prediction capsule û_{f,k}^{i,j} of the f-th subspace prediction capsule sequence Û_f^{i,j} mapped from the primary capsule sequence v_{i,j}, and via equation (2) the full sequence Û_f^{i,j}:

û_{f,k}^{i,j} = W_f^k v_{i,j}^k    (1)

Û_f^{i,j} = {û_{f,1}^{i,j}, û_{f,2}^{i,j}, ..., û_{f,n}^{i,j}}    (2)

in equations (1) and (2), W_f denotes the f-th learnable matrix and W_f^k denotes the k-th sub-matrix of W_f; f ∈ {1, 2, 3, ..., D};

the digital capsule layer computes, via equation (3), the similarity matrix A_f^{i,j} between the prediction capsules of the f-th subspace prediction capsule sequence Û_f^{i,j}:

A_f^{i,j} = Û_f^{i,j} (Û_f^{i,j})^T    (3)

in equation (3), (Û_f^{i,j})^T denotes the transpose of the prediction capsule sequence Û_f^{i,j};

the digital capsule layer computes, via equation (4), the weight c_{f,k}^{i,j} occupied by the k-th prediction capsule û_{f,k}^{i,j} of the f-th subspace prediction capsule sequence:

c_{f,k}^{i,j} = exp(Σ_{o=1}^{n} A_f^{i,j}[k, o]) / Σ_{p=1}^{n} exp(Σ_{o=1}^{n} A_f^{i,j}[p, o])    (4)

in equation (4), A_f^{i,j}[p, o] denotes the element in the p-th row and o-th column of the n×n similarity matrix A_f^{i,j};

the digital capsule layer computes, via equation (5), the output of the f-th subspace prediction capsule sequence Û_f^{i,j}: an advanced feature capsule u_f^{i,j} of length l_{d+1}:

u_f^{i,j} = squash( Σ_{k=1}^{n} c_{f,k}^{i,j} û_{f,k}^{i,j} )    (5)

thereby obtaining the capsule sequence U_{i,j} = {u_1^{i,j}, u_2^{i,j}, ..., u_D^{i,j}} corresponding to the j-th duration signal e_{i,j} over all D sleep categories.
6. An electronic device comprising a memory and a processor, wherein the memory is configured to store a program that supports the processor in performing the single-channel electroencephalogram sleep staging method of any one of claims 1-5, and the processor is configured to execute the program stored in the memory.
7. A computer readable storage medium having stored thereon a computer program, characterized in that the computer program when executed by a processor performs the steps of the single channel electroencephalogram sleep staging method of any one of claims 1-5.
CN202310285377.5A 2023-03-22 2023-03-22 Single-channel electroencephalogram sleep stage-dividing method based on deep learning Pending CN116186544A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310285377.5A CN116186544A (en) 2023-03-22 2023-03-22 Single-channel electroencephalogram sleep stage-dividing method based on deep learning


Publications (1)

Publication Number Publication Date
CN116186544A true CN116186544A (en) 2023-05-30

Family

ID=86446424

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310285377.5A Pending CN116186544A (en) 2023-03-22 2023-03-22 Single-channel electroencephalogram sleep stage-dividing method based on deep learning

Country Status (1)

Country Link
CN (1) CN116186544A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117747117A (en) * 2024-02-21 2024-03-22 安徽星辰智跃科技有限责任公司 Sound-based sleep respiration evaluation and auxiliary adjustment method, system and device
CN117747117B (en) * 2024-02-21 2024-05-24 安徽星辰智跃科技有限责任公司 Sound-based sleep respiration evaluation and auxiliary adjustment method, system and device


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination