CN115089123A - OSA detection method based on attention and Transformer - Google Patents

OSA detection method based on attention and Transformer

Info

Publication number
CN115089123A
CN115089123A (application CN202210788498.7A)
Authority
CN
China
Prior art keywords
formula
attention
osa
matrix
osa detection
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Withdrawn
Application number
CN202210788498.7A
Other languages
Chinese (zh)
Inventor
石争浩
张治军
周亮
李成建
任晓勇
黑新宏
张一彤
刘海琴
罗靖
尤珍臻
陈敬国
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Xian University of Technology
Original Assignee
Xian University of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Xian University of Technology
Priority to CN202210788498.7A
Publication of CN115089123A
Legal status: Withdrawn

Classifications

    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/48Other medical applications
    • A61B5/4806Sleep evaluation
    • A61B5/4818Sleep apnoea
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/24Detecting, measuring or recording bioelectric or biomagnetic signals of the body or parts thereof
    • A61B5/316Modalities, i.e. specific diagnostic methods
    • A61B5/369Electroencephalography [EEG]
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/72Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235Details of waveform analysis
    • AHUMAN NECESSITIES
    • A61MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61BDIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00Measuring for diagnostic purposes; Identification of persons
    • A61B5/72Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235Details of waveform analysis
    • A61B5/7264Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • A61B5/7267Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Surgery (AREA)
  • Public Health (AREA)
  • Pathology (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Veterinary Medicine (AREA)
  • Animal Behavior & Ethology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biophysics (AREA)
  • Psychiatry (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physiology (AREA)
  • Signal Processing (AREA)
  • Psychology (AREA)
  • Evolutionary Computation (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses an attention- and Transformer-based OSA detection method, which comprises establishing a data set, constructing an OSA detection network, preprocessing the data set, inputting the preprocessed data set into the OSA detection network for training to obtain a trained OSA detection network, and then inputting the preprocessed data set into the trained OSA detection network for classification to obtain the classification detection result. The detection method improves the class-aware loss function to effectively address the class-imbalance problem without additional computation, completes the detection of OSA in the patient's whole-night recording with a classifier consisting of an MLP and Softmax, and improves the accuracy of OSA detection.

Description

OSA detection method based on attention and Transformer
Technical Field
The invention belongs to the technical field of medical signal processing, and particularly relates to an OSA detection method based on attention and a Transformer.
Background
Sleep Apnea Syndrome (SAS) is a common sleep disorder that causes sleep fragmentation and affects human health and quality of life. There are three basic types of sleep apnea: obstructive sleep apnea (OSA), central sleep apnea (CSA) and mixed sleep apnea (MSA), often accompanied by hypopneas (HYP). OSA is one of the most common, most frequent and most severe sleep disorders. In OSA, relaxation of the throat muscles causes complete obstruction of the upper airway, impeding respiratory airflow during sleep. A recent literature-based study estimates that 936 million adults worldwide have undiagnosed OSA. The disease can lead to serious adverse physiological conditions and an increased risk of heart disease, stroke, neurodegenerative diseases such as Alzheimer's disease, and cancer.
According to the AASM, polysomnography (PSG) is considered the "gold standard" for OSA detection, based on a comprehensive assessment of sleep signals. PSG records the patient overnight, measuring signals such as the electroencephalogram (EEG) and electrocardiogram (ECG) with sensors attached to the body; afterwards, sleep-state analysis is performed by sleep-medicine professionals and sleep apnea events are annotated manually. With the growing number of OSA patients in recent years, the shortage of sleep centers and sleep-medicine professionals, together with inter-scorer differences and other human errors, has become a major obstacle to the correct diagnosis and timely treatment of OSA. There is therefore an urgent need to automate the detection of OSA events and help sleep-medicine experts achieve fast and accurate event annotation, providing powerful technical support that circumvents these human errors and the lack of infrastructure.
To achieve this goal, a great deal of research has used various physiological signals, such as oxygen saturation, changes in heart rate and respiration, EEG and ECG. EEG-based analysis in particular has received attention from researchers in recent years, because acquiring multiple physiological signals is time-consuming, labor-intensive and costly, while wearable wireless EEG detection and acquisition systems are developing rapidly. Meanwhile, with the continuous development of deep learning and its application in different fields, it has shown advantages over traditional machine-learning models without requiring domain knowledge. Among deep-learning models, the convolutional neural network (CNN) is popular; owing to its excellent feature-extraction and classification capability in tasks such as visual imaging, speech recognition and text recognition, the CNN has also been applied to biological-signal classification. Some proposals divide the EEG signal into individual subframes, extract features from each subframe with a designed FCNN, and finally perform classification with a fully connected layer. A model built on CNNs alone, however, cannot model the time dependence between EEG data or perform adaptive feature refinement. Currently, recurrent neural networks (RNNs) are often employed to capture the time dependence in EEG data. It has been proposed to divide the EEG signal into subframes of fixed 10 s length, extract features with a CNN, and learn the transition rules with a long short-term memory network (LSTM) for sleep apnea classification. However, due to their recurrent nature, RNNs typically have high model complexity and are therefore difficult to train in parallel. In addition, the class imbalance of the data is one of the important problems affecting OSA detection accuracy.
Disclosure of Invention
The invention aims to provide an OSA detection method based on attention and a Transformer, which is beneficial to improving the accuracy of the detection of obstructive sleep apnea syndrome.
The technical scheme adopted by the invention is an OSA detection method based on attention and a Transformer: a data set is established and an OSA detection network is constructed; the data set is preprocessed and input into the OSA detection network for training to obtain a trained OSA detection network; the preprocessed data set is then input into the trained OSA detection network for classification to obtain the classification detection result.
The present invention is also characterized in that,
the method specifically comprises the following steps:
step 1, establishing an EEG signal data set and constructing an OSA detection network comprising a feature extraction module and a classification module;
step 2, carrying out data preprocessing on the EEG signal data set in the step 1 to obtain an original EEG signal;
step 3, inputting the original EEG signal obtained in the step 2 into an OSA detection network for feature extraction and classification to obtain a classification result;
step 4, constraining the classification result obtained in the step 3 by using a loss function, and then performing iterative training to obtain a trained OSA detection network model;
step 5, the EEG signal to be processed is put into the OSA detection network model trained in step 4, and the classification detection result is finally output.
The feature extraction module in step 1 consists of a two-way convolutional neural network, a convolutional attention module and a Transformer; the convolutional attention module consists of a spatial attention module and a channel attention module; and the classification module in step 1 consists of an MLP (multilayer perceptron) and Softmax.
Step 3, the feature extraction is specifically: the original signals obtained in step 2 are fed into the two-way convolutional neural network, the features extracted by each branch are concatenated, the concatenated features are input into the convolutional attention module to complete adaptive feature refinement, and the dependencies among the refined features are modeled to obtain shallow semantics.
The two-way convolutional neural network uses convolution kernels of two different sizes for the preliminary extraction of features, followed by multi-layer convolution and pooling in the two branches, with a Dropout layer used to prevent model overfitting; the large convolution kernel size in the two-way convolutional neural network is set to 400 and the small convolution kernel size to 50.
During feature extraction, residual blocks are employed to enrich feature details and enhance the feature-extraction capability.
The formula of the residual block is:

x_{l+1} = x_l + F(x_l, W_l)   (10)

In equation (10), x_{l+1} is the output of the (l+1)-th convolutional layer, x_l is the output of the l-th convolutional layer, W_l is the weight of the l-th convolutional layer, and F(x_l, W_l) is the residual mapping.
The preprocessing in step 2 is specifically to decompose, denoise and reconstruct the EEG signal with the FastICA algorithm.
The decomposition, denoising and reconstruction of the EEG signal by the FastICA algorithm specifically comprises the following steps:

S1, centering: calculate the mean of the mixed signal X and subtract it from X, as shown in equation (1):

X = X − E(X)   (1)

In equation (1), X is the mixed signal and E(X) is the signal mean;

S2, whitening: the specific process is shown in equation (2):

E[XX^T] = C_X = UΛU^T   (2)

In equation (2), C_X = E[XX^T] is the covariance matrix of X, Λ = diag(λ_1, λ_2, …, λ_n) is the diagonal matrix whose diagonal elements are the eigenvalues of C_X, U = [u_1, u_2, …, u_n] is the matrix of eigenvectors of C_X, and UΛU^T is the eigendecomposition of the covariance matrix;

Z = K × X   (3)

In equation (3), Z is the whitened matrix, K is the whitening matrix, and X is the de-meaned signal;

K = Λ^{−1/2} U^T   (4)

In equation (4), K is the whitening matrix, and Λ and U are as defined above;

The covariance of the whitened matrix Z is shown in equation (5):

E[ZZ^T] = K C_X K^T = Λ^{−1/2} U^T (UΛU^T) U Λ^{−1/2} = I   (5)

In equation (5), I is the identity matrix, Z is the whitened matrix, K is the whitening matrix, and Λ, U and UΛU^T are as defined above;

S3, initialize W_i;

S4, update W_i as shown in equation (6):

W_i = E{Z g(W_i^T Z)} − E{g′(W_i^T Z)} W_i   (6)

In equation (6), W is the unmixing matrix being estimated and W_i is its i-th column;

S5, orthogonalize W_i as shown in equation (7):

W_i = W_i − Σ_{j<i} (W_i^T W_j) W_j   (7)

In equation (7), W_i is the i-th column of W and W_j is the j-th column of W;

S6, normalize W_i as shown in equation (8):

W_i = W_i / ‖W_i‖   (8)

In equation (8), ‖W_i‖ is the norm of W_i;

S7, check whether the iteration has converged; if not, return to S4; once converged, return to S3 and initialize the next W_i (i++);

S8, reconstruct the separated signals to obtain the source signal S, as shown in equation (9):

S = WKX   (9)

In equation (9), S is the reconstructed source signal, W is the estimated unmixing matrix, K is the whitening matrix, and X is the de-meaned signal.
the loss function in step 4 is shown in equations (11), (12) and (13):
Figure BDA0003732647040000061
ω k =μ k max(1,log(μ k M/M k )) (12)
Figure BDA0003732647040000062
in equations (11), (12) and (13), λ represents a penalty factor representing a degree of penalty, ω, for the entire network weight k Denotes the weight, μ, assigned to the k class k Is an adjustable parameter, M k Is the number of samples for the class k.
The invention has the beneficial effects that:
By decomposing, denoising and reconstructing the signal with FastICA to obtain an interference-free original signal, the invention completes feature extraction through the feature extraction module, refines the features adaptively and models the dependencies between them to extract high-quality information; at the same time, a loss function is designed to solve the class-imbalance problem, and classification is finally completed by the MLP, thereby obtaining a better OSA detection effect and higher evaluation indices.
Drawings
FIG. 1 is a schematic flow diagram of the attention and Transformer based OSA detection method of the present invention;
FIG. 2 is a schematic flow chart of the preprocessing in the attention and Transformer based OSA detection method of the present invention;
FIG. 3 is a schematic structural diagram of a two-way convolutional neural network in the OSA detection method based on attention and Transformer according to the present invention;
FIG. 4 is a schematic diagram of the structure of the convolution attention module in the attention and Transformer based OSA detection method of the present invention;
FIG. 5 is a schematic diagram of the structure of a Transformer in the attention and Transformer based OSA detection method of the present invention;
FIG. 6 is a diagram of a trained OSA detection network model in the attention and Transformer based OSA detection method of the present invention;
FIG. 7 is a graph showing the result of OSA detection in the attention and Transformer based OSA detection method of the present invention.
Detailed Description
The present invention will be described in detail below with reference to the accompanying drawings and detailed description.
As shown in fig. 1, in the OSA detection method based on attention and Transformer, a data set is established and an OSA detection network is constructed, the data set is preprocessed and then input to the OSA detection network for training to obtain a trained OSA detection network, and then the preprocessed data set is input to the trained OSA detection network for classification to obtain a result of classification detection.
The method specifically comprises the following steps:
Step 1, an EEG signal data set is established and an OSA detection network comprising a feature extraction module and a classification module is constructed. The data set is the UCD data set provided in the St. Vincent's University Hospital / University College Dublin Sleep Apnea Database; a total of 25 subjects participated in EEG signal collection, each subject's whole-night PSG signals were recorded at a sampling frequency of 128 Hz, and the subjects' overnight recordings were annotated by experts. The feature extraction module consists of a two-way convolutional neural network, a convolutional attention module and a Transformer, and the classification module consists of an MLP (multilayer perceptron) and Softmax;
Step 2, data preprocessing is carried out on the EEG signal data set of step 1 to obtain the original EEG signals. Each subject's whole-night EEG signal is divided into fixed-length 30 s segments, and the segmented signals are decomposed, denoised and reconstructed by FastICA to obtain original signal data free of the inter-lead interference introduced during signal acquisition, as illustrated by the sketch below.
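For illustration (this sketch is not part of the original disclosure), a 30 s epoch at the 128 Hz sampling rate contains 30 × 128 = 3840 samples; a minimal NumPy segmentation routine, with assumed function and variable names, could read:

```python
import numpy as np

FS = 128          # sampling frequency of the recordings (Hz)
EPOCH_SEC = 30    # fixed epoch length used for segmentation (s)

def segment_epochs(eeg: np.ndarray, fs: int = FS, epoch_sec: int = EPOCH_SEC) -> np.ndarray:
    """Split a whole-night single-channel EEG trace into non-overlapping
    30 s epochs (128 Hz x 30 s = 3840 samples each); a trailing partial
    epoch is dropped."""
    samples = fs * epoch_sec
    n_epochs = len(eeg) // samples
    return eeg[: n_epochs * samples].reshape(n_epochs, samples)
```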
As shown in FIG. 2, the preprocessing specifically consists of decomposing, denoising and reconstructing the original EEG signal with the FastICA algorithm, finally obtaining the denoised signal S;
The decomposition, denoising and reconstruction of the EEG signal by the FastICA algorithm specifically comprises the following steps:

S1, centering: this is the first step of the preprocessing. The mean of the mixed signal X is computed and then subtracted from X, as shown in equation (1):

X = X − E(X)   (1)

In equation (1), X is the mixed signal and E(X) is the signal mean;

S2, whitening: in this step the matrix X is transformed into Z so that its components are uncorrelated and possess unit variance. This involves an eigenvalue decomposition, and the specific process is shown in equation (2):

E[XX^T] = C_X = UΛU^T   (2)

In equation (2), C_X = E[XX^T] is the covariance matrix of X, Λ = diag(λ_1, λ_2, …, λ_n) is the diagonal matrix whose diagonal elements are the eigenvalues of C_X, U = [u_1, u_2, …, u_n] is the matrix of eigenvectors of C_X, and UΛU^T is the eigendecomposition of the covariance matrix;

The whitening is realized as shown in equation (3):

Z = K × X   (3)

In equation (3), Z is the whitened matrix, K is the whitening matrix, and X is the de-meaned signal;

The matrix K is defined as:

K = Λ^{−1/2} U^T   (4)

In equation (4), K is the whitening matrix, and Λ and U are as defined above;

It can further be concluded that the components of Z are uncorrelated after the whitening process. The covariance of the whitened matrix Z is shown in equation (5):

E[ZZ^T] = K C_X K^T = Λ^{−1/2} U^T (UΛU^T) U Λ^{−1/2} = I   (5)

In equation (5), I is the identity matrix, and Z, K, Λ, U and UΛU^T are as defined above;

S3, initialize W_i, the i-th column of W;

S4, update W_i as shown in equation (6):

W_i = E{Z g(W_i^T Z)} − E{g′(W_i^T Z)} W_i   (6)

In equation (6), W is the unmixing matrix being estimated and W_i is its i-th column. In the experiments g(·) = tanh(·), for which the FastICA iteration is fast and robust;

S5, orthogonalize W_i as shown in equation (7):

W_i = W_i − Σ_{j<i} (W_i^T W_j) W_j   (7)

In equation (7), W_i is the i-th column of W and W_j is the j-th column of W;

S6, normalize W_i as shown in equation (8):

W_i = W_i / ‖W_i‖   (8)

In equation (8), ‖W_i‖ is the norm of W_i;

S7, check whether the iteration has converged; if not, return to S4; once converged, return to S3 and initialize the next W_i (i++);

S8, reconstruct the separated signals to obtain the source signal S, as shown in equation (9):

S = WKX   (9)

In equation (9), S is the reconstructed source signal, W is the estimated unmixing matrix, K is the whitening matrix, and X is the de-meaned signal.
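A minimal NumPy sketch of steps S1-S8 follows, for illustration only; it is not part of the original disclosure. The deflation scheme follows equations (6)-(8) with g(·) = tanh(·); the random initialization of each W_i, the component count, the iteration cap and the tolerance are assumptions.

```python
import numpy as np

def fastica_denoise(X, n_components=None, max_iter=200, tol=1e-5):
    """Decompose a multi-channel EEG segment X (channels x samples) with
    FastICA, following steps S1-S8, and return the estimated sources S."""
    n, m = X.shape
    n_components = n_components or n

    # S1: centering -- subtract the per-channel mean (eq. 1)
    X = X - X.mean(axis=1, keepdims=True)

    # S2: whitening via eigendecomposition of the covariance (eqs. 2-5)
    C = X @ X.T / m                       # C_X = E[X X^T]
    lam, U = np.linalg.eigh(C)            # C_X = U diag(lam) U^T
    K = np.diag(lam ** -0.5) @ U.T        # K = Lambda^{-1/2} U^T (eq. 4)
    Z = K @ X                             # whitened signal (eq. 3)

    g = np.tanh                           # g(.) = tanh(.) as in the text
    g_prime = lambda u: 1.0 - np.tanh(u) ** 2

    W = np.zeros((n_components, n))
    for i in range(n_components):
        w = np.random.randn(n)            # S3: initialize W_i (random here)
        w /= np.linalg.norm(w)
        for _ in range(max_iter):
            w_old = w
            # S4: fixed-point update (eq. 6)
            w = (Z * g(w @ Z)).mean(axis=1) - g_prime(w @ Z).mean() * w
            # S5: deflationary orthogonalization against earlier rows (eq. 7)
            w -= W[:i].T @ (W[:i] @ w)
            # S6: normalization (eq. 8)
            w /= np.linalg.norm(w)
            # S7: convergence check on the direction of w
            if abs(abs(w @ w_old) - 1.0) < tol:
                break
        W[i] = w

    # S8: reconstruct the source signals (eq. 9): S = W K X
    return W @ K @ X
```

In a practical denoising pipeline the artifactual components would be zeroed or excluded before reconstruction; the sketch simply returns all estimated sources.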
Step 3, the original EEG signals obtained in step 2 are input into the OSA detection network for feature extraction and classification to obtain classification results. The obtained original signals are fed into the two-way convolutional neural network, and the features extracted by each branch are concatenated and then input into the convolutional attention module, which completes adaptive feature refinement to enhance feature learning. At the same time, context encoding is realized with a Transformer to better capture the contextual dependencies within the features.
To extract features better and enrich their semantic information, the invention adopts a two-way convolutional neural network as the primary feature extractor, in which the two convolutional branches with different kernel sizes are designed according to the sampling rate of the signal. To better capture features across the frequency bands, the large convolution kernel size is set to 400 and the small convolution kernel size to 50 in the experiments. As shown in FIG. 3, each branch consists of three convolutional layers and two max-pooling layers, where each convolutional layer includes a batch-normalization layer and uses GELU as the activation function. To prevent overfitting, a Dropout layer is also applied after the first max pooling in each of the two branches and after the connection of the two branches. The reconstructed signal S is input into the two-way convolutional neural network as shown in equations (11), (12) and (13):

B1 = Maxpool(f_{400×1}(S))   (11)

B2 = Maxpool(f_{50×1}(S))   (12)

F = Dropout(concat(B1, B2))   (13)

In equations (11) and (12), f_{n×n} denotes the convolution operation with kernel size n×n (the full branch of three convolutions and two poolings is abbreviated) and Maxpool denotes max pooling.

In equation (13), concat denotes the concatenation of the two feature maps.
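The following PyTorch sketch illustrates this two-way extractor, for illustration only. Only the 400/50 first-layer kernel sizes, the three-convolution/two-pooling branch layout with BatchNorm and GELU, and the Dropout placement come from the description; the channel widths, strides, remaining kernel sizes and the length-alignment step are assumptions.

```python
import torch
import torch.nn as nn

class ConvBranch(nn.Module):
    """One branch: three conv layers (BatchNorm + GELU) and two max-pool
    layers, with Dropout after the first pooling, per FIG. 3."""
    def __init__(self, first_kernel, channels=64, dropout=0.5):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv1d(1, channels, kernel_size=first_kernel,
                      stride=first_kernel // 8, padding=first_kernel // 2),
            nn.BatchNorm1d(channels), nn.GELU(),
            nn.MaxPool1d(8), nn.Dropout(dropout),
            nn.Conv1d(channels, channels, kernel_size=8, padding=4),
            nn.BatchNorm1d(channels), nn.GELU(),
            nn.Conv1d(channels, channels, kernel_size=8, padding=4),
            nn.BatchNorm1d(channels), nn.GELU(),
            nn.MaxPool1d(4),
        )

    def forward(self, x):                 # x: (batch, 1, 3840)
        return self.net(x)

class TwoBranchCNN(nn.Module):
    """Two-way extractor with large (400) and small (50) first kernels;
    branch outputs are concatenated and passed through Dropout (eq. 13)."""
    def __init__(self, dropout=0.5):
        super().__init__()
        self.big = ConvBranch(first_kernel=400)
        self.small = ConvBranch(first_kernel=50)
        self.dropout = nn.Dropout(dropout)

    def forward(self, s):
        b1, b2 = self.big(s), self.small(s)
        # the branches produce different time lengths; pool to a common one
        t = min(b1.shape[-1], b2.shape[-1])
        b1 = nn.functional.adaptive_max_pool1d(b1, t)
        b2 = nn.functional.adaptive_max_pool1d(b2, t)
        return self.dropout(torch.cat([b1, b2], dim=1))   # feature map F
```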
The convolutional attention module is composed of a spatial attention module and a channel attention module. To extract features better and enhance the feature-extraction capability, a convolutional attention mechanism is introduced to realize adaptive feature refinement. The convolutional attention mechanism is a simple and effective attention module for feed-forward convolutional neural networks. Given an intermediate feature map, the convolutional attention module infers attention maps in turn along two independent dimensions, channel and space, and then multiplies each attention map with the input feature for adaptive feature refinement, as shown in FIG. 4.

The channel attention mechanism is as follows: the feature map F first undergoes global average pooling and global max pooling to aggregate its spatial information, generating two different spatial context descriptors, F^c_avg and F^c_max, which represent the global-average-pooled and global-max-pooled features respectively. These two descriptors are then forwarded through a shared network to generate the channel attention map M_c. The channel attention is calculated as shown in equations (14)-(16):

F^c_avg = AdaptAvgPool(F)   (14)

F^c_max = AdaptMaxPool(F)   (15)

M_c(F) = σ(f^{1×1}(F^c_avg) + f^{1×1}(F^c_max))   (16)

In equations (14)-(16), AdaptAvgPool denotes adaptive average pooling, AdaptMaxPool denotes adaptive max pooling, σ denotes the sigmoid function, and f^{1×1} denotes a convolution with kernel size 1×1.

The spatial attention mechanism is as follows: the channel information of the feature map is aggregated by two pooling operations, generating two feature maps, F^s_avg and F^s_max, which represent the mean-pooled and max-pooled features across channels respectively. These are then concatenated and passed through a standard convolutional layer to generate the spatial attention map M_s. In short, the spatial attention is calculated as shown in equation (17):

M_s(F) = σ(f^{n×n}([F^s_avg; F^s_max]))   (17)

In equation (17), σ denotes the sigmoid function, f^{n×n} denotes a convolution with kernel size n×n, and [·;·] denotes the concatenation of the channel-wise average-pooled and max-pooled features produced by AvgPool and MaxPool.

Given the feature map F as input, CBAM derives the channel attention map M_c and the spatial attention map M_s in order, as shown in FIG. 4. The overall attention process is shown in equation (18):

F′ = M_c(F) ⊗ F,  F″ = M_s(F′) ⊗ F′   (18)

In equation (18), ⊗ denotes element-wise multiplication, during which the attention values are broadcast: channel attention values are broadcast along the spatial dimension and vice versa. F is the input feature map, M_c denotes channel attention, M_s denotes spatial attention, and F″ is the final output.
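A minimal PyTorch sketch of a 1-D CBAM consistent with equations (14)-(18) follows, for illustration only; the reduction ratio, the GELU in the shared network and the 7-tap spatial kernel are assumptions not stated in the text.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Channel attention (eqs. 14-16): shared 1x1 convolutions over the
    adaptively average- and max-pooled descriptors, then sigmoid."""
    def __init__(self, channels, reduction=8):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Conv1d(channels, channels // reduction, kernel_size=1),
            nn.GELU(),
            nn.Conv1d(channels // reduction, channels, kernel_size=1),
        )

    def forward(self, f):                            # f: (batch, C, T)
        avg = self.shared(nn.functional.adaptive_avg_pool1d(f, 1))
        mx = self.shared(nn.functional.adaptive_max_pool1d(f, 1))
        return torch.sigmoid(avg + mx)               # M_c: (batch, C, 1)

class SpatialAttention(nn.Module):
    """Spatial attention (eq. 17): concatenate channel-wise mean and max
    maps, convolve, then sigmoid."""
    def __init__(self, kernel_size=7):
        super().__init__()
        self.conv = nn.Conv1d(2, 1, kernel_size, padding=kernel_size // 2)

    def forward(self, f):
        avg = f.mean(dim=1, keepdim=True)            # F^s_avg: (batch, 1, T)
        mx, _ = f.max(dim=1, keepdim=True)           # F^s_max: (batch, 1, T)
        return torch.sigmoid(self.conv(torch.cat([avg, mx], dim=1)))

class CBAM(nn.Module):
    """Sequential channel-then-spatial refinement (eq. 18)."""
    def __init__(self, channels):
        super().__init__()
        self.ca = ChannelAttention(channels)
        self.sa = SpatialAttention()

    def forward(self, f):
        f = self.ca(f) * f                           # F' = M_c(F) (x) F
        return self.sa(f) * f                        # F'' = M_s(F') (x) F'
```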
To better extract the contextual dependencies between features, a Transformer is used as the context encoding (CE) module. The architecture of the CE module is the Transformer Encoder shown in FIG. 5. It mainly comprises LayerNorm layers, a multi-head attention module (MHA) and an MLP module. The MLP module consists of two fully connected layers with a non-linearity between them, using ReLU as the activation function and a Dropout layer to prevent overfitting. L identical layers are stacked to produce the final feature. Inspired by the BERT model, a learnable token x_class is appended to the input, and its state at the output serves as the context vector. The Transformer first passes F″ through a linear mapping layer Ω_Tran that maps the features to the hidden dimension, i.e. F_Tran = Ω_Tran(F″). The context token is then appended to the feature sequence, so the input features become ψ_0 = [x_class; F_Tran], where the subscript 0 denotes the input to the first layer. ψ_0 is then processed by the Transformer layers, as shown in equations (19) and (20):

ψ̂_l = MHA(Norm(ψ_{l−1})) + ψ_{l−1}   (19)

ψ_l = MLP(Norm(ψ̂_l)) + ψ̂_l   (20)

In equation (19), MHA denotes the multi-head attention module, Norm denotes a LayerNorm layer, ψ_{l−1} denotes the features of the (l−1)-th layer, ψ̂_l denotes the features after multi-head attention, and + denotes a residual connection.

In equation (20), MLP denotes the fully connected network, Norm denotes a LayerNorm layer, ψ_l denotes the output features of the l-th layer, and + denotes a residual connection.
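For illustration, a minimal PyTorch sketch of the CE module under equations (19) and (20) is given below; the hidden dimension, depth, head count and MLP width are assumptions, and nn.MultiheadAttention stands in for the MHA block.

```python
import torch
import torch.nn as nn

class EncoderLayer(nn.Module):
    """Pre-norm Transformer encoder layer implementing eqs. (19)-(20)."""
    def __init__(self, dim, heads=8, mlp_dim=256, dropout=0.1):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.mha = nn.MultiheadAttention(dim, heads, dropout=dropout,
                                         batch_first=True)
        self.norm2 = nn.LayerNorm(dim)
        self.mlp = nn.Sequential(            # two FC layers with ReLU
            nn.Linear(dim, mlp_dim), nn.ReLU(), nn.Dropout(dropout),
            nn.Linear(mlp_dim, dim),
        )

    def forward(self, psi):
        h = self.norm1(psi)
        psi = self.mha(h, h, h, need_weights=False)[0] + psi   # eq. (19)
        return self.mlp(self.norm2(psi)) + psi                 # eq. (20)

class ContextEncoder(nn.Module):
    """CE module: linear mapping Omega_Tran, prepended x_class token,
    L stacked encoder layers; returns the context vector."""
    def __init__(self, in_dim, dim=128, depth=4):
        super().__init__()
        self.proj = nn.Linear(in_dim, dim)               # Omega_Tran
        self.cls = nn.Parameter(torch.zeros(1, 1, dim))  # x_class token
        self.layers = nn.ModuleList(EncoderLayer(dim) for _ in range(depth))

    def forward(self, f):          # f: (batch, tokens, in_dim), e.g. F'' transposed
        x = self.proj(f)
        cls = self.cls.expand(x.shape[0], -1, -1)
        x = torch.cat([cls, x], dim=1)       # psi_0 = [x_class; F_Tran]
        for layer in self.layers:
            x = layer(x)
        return x[:, 0]                       # context vector (class-token state)
```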
The feature extraction is specifically as follows: the original signals obtained in step 2 are fed into the two-way convolutional neural network, the features extracted by each branch are concatenated, the concatenated features are input into the convolutional attention module to complete adaptive feature refinement, and the dependencies among the refined features are modeled to obtain shallow semantics;

During feature extraction, residual blocks are employed to enrich feature details and enhance the feature-extraction capability.

The formula of the residual block is:

x_{l+1} = x_l + F(x_l, W_l)   (21)

In equation (21), x_{l+1} is the output of the (l+1)-th convolutional layer, x_l is the output of the l-th convolutional layer, W_l is the weight of the l-th convolutional layer, and F(x_l, W_l) is the residual mapping.

The two-way convolutional neural network uses convolution kernels of two different sizes for the initial extraction of features, followed by multi-layer convolution and pooling in the two branches, with Dropout layers used to prevent model overfitting. The large convolution kernel size in the two-way convolutional neural network is set to 400 and the small convolution kernel size to 50.
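A minimal 1-D residual block realizing equation (21) might look as follows; the channel count, kernel size and activation placement are assumptions.

```python
import torch.nn as nn

class ResidualBlock(nn.Module):
    """1-D residual block: x_{l+1} = x_l + F(x_l, W_l), per eq. (21)."""
    def __init__(self, channels, kernel_size=7):
        super().__init__()
        self.f = nn.Sequential(                 # residual mapping F
            nn.Conv1d(channels, channels, kernel_size, padding=kernel_size // 2),
            nn.BatchNorm1d(channels), nn.GELU(),
            nn.Conv1d(channels, channels, kernel_size, padding=kernel_size // 2),
            nn.BatchNorm1d(channels),
        )
        self.act = nn.GELU()

    def forward(self, x):
        return self.act(x + self.f(x))          # identity shortcut + F(x)
```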
The context vector is then taken from the final output, i.e. ψ_L^0, the state of the token x_class after the L-th layer. An MLP combined with Softmax is used as the classifier; to achieve the classification effect, the context vector is input into this classifier, yielding the classification results Normal and OSA, as shown in FIG. 6:

ŷ = Softmax(MLP(ψ_L^0))
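A minimal sketch of the MLP-plus-Softmax classifier head, assuming a 128-dimensional context vector and a hidden width of 64 (both illustrative):

```python
import torch.nn as nn

class Classifier(nn.Module):
    """MLP + Softmax head mapping the context vector to {Normal, OSA}."""
    def __init__(self, dim=128, hidden=64, n_classes=2):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(dim, hidden), nn.ReLU(), nn.Linear(hidden, n_classes),
        )
        self.softmax = nn.Softmax(dim=-1)

    def forward(self, ctx):           # ctx: (batch, dim) context vector
        return self.softmax(self.mlp(ctx))
```

During training the Softmax is typically folded into the cross-entropy loss, which consumes the raw logits produced by the MLP.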
Step 4, the results obtained by training the network are constrained with a loss function, and the parameters are then updated by backpropagation. Training is iterated for 100 epochs, where one epoch means training once over the preprocessed signals, finally yielding the trained network model, as shown in FIG. 6.

In conventional multi-classification tasks, the standard multi-class cross entropy is used as the model's loss function:

L_CE = −(1/M) Σ_{i=1}^{M} Σ_{k=1}^{K} y_i^k log(ŷ_i^k)   (22)

In equation (22), y_i^k is the true label of the i-th sample, ŷ_i^k is the predicted probability that the i-th sample belongs to class k, M is the total number of samples, and K is the number of classes. Because the number of samples varies greatly from class to class, the data set as a whole is class-imbalanced. The loss function in equation (22) penalizes the misclassifications of all classes equally, so the trained model may be biased toward classes with a large sample size.
To mitigate the influence of the class-imbalance problem on the classification result, the method reconstructs the whole loss function on the basis of the standard multi-class cross entropy and previous work, as shown in equations (23), (24) and (25):

L = −(1/M) Σ_{i=1}^{M} Σ_{k=1}^{K} ω_k y_i^k log(ŷ_i^k) + λ‖θ‖₂²   (23)

ω_k = μ_k max(1, log(μ_k M / M_k))   (24)

ŷ_i^k = exp(z_i^k) / Σ_{j=1}^{K} exp(z_i^j)   (25)

In equations (23), (24) and (25), λ is a penalty factor controlling the degree of penalty on the network weights θ, ω_k denotes the weight assigned to class k, μ_k is an adjustable parameter, M_k is the number of samples of class k, and z_i^k is the logit of the i-th sample for class k. The choice of the class weights depends on two factors: the sample count of the class, through M/M_k, and the distinction between classes, through μ_k. According to the distribution of sample counts between the Normal and OSA classes in the data set, the highest μ_k is assigned to OSA and the lowest to Normal.
Step 5: the EEG signal to be processed is put into the trained model, and the classification detection result is finally output. As shown in FIG. 7, graphs a, b and c are the EEG signals to be detected, graph d is the classification result of graph a, graph e is the classification result of graph b, and graph f is the classification result of graph c.

Claims (10)

1. The OSA detection method based on attention and a Transformer is characterized in that a data set is established, an OSA detection network is constructed, the data set is preprocessed and then input into the OSA detection network for training, the trained OSA detection network is obtained, then the preprocessed data set is input into the trained OSA detection network for classification, and the classified detection result is obtained.
2. The OSA detection method based on attention and Transformer according to claim 1, characterized by comprising the following steps:
step 1, establishing an EEG signal data set and constructing an OSA detection network comprising a feature extraction module and a classification module;
step 2, carrying out data preprocessing on the EEG signal data set in the step 1 to obtain an original EEG signal;
step 3, inputting the original EEG signal obtained in the step 2 into an OSA detection network for feature extraction and classification to obtain a classification result;
step 4, constraining the classification result obtained in the step 3 by using a loss function, and then performing iterative training to obtain a trained OSA detection network model;
step 5, the EEG signal to be processed is put into the trained OSA detection network model of step 4, and the classification detection result is finally output.
3. The attention and Transformer based OSA detection method as claimed in claim 2, wherein the feature extraction module of step 1 is composed of a two-way convolutional neural network, a convolutional attention module and a Transformer, the convolutional attention module is composed of a spatial attention module and a channel attention module, and the classification module of step 1 is composed of an MLP and Softmax.
4. The attention and Transformer based OSA detection method according to claim 2, wherein the feature extraction in step 3 is specifically: the original signals obtained in step 2 are fed into the two-way convolutional neural network, the features extracted from each branch are concatenated, the concatenated features are input into the convolutional attention module to complete adaptive feature refinement, and the dependencies among the refined features are modeled to obtain shallow semantics.
5. The attention and Transformer based OSA detection method according to claim 3 or 4, wherein the two-way convolutional neural network uses convolution kernels of two different sizes for the preliminary extraction of features, followed by multi-layer convolution and pooling in the two branches, with a Dropout layer used to prevent model overfitting; the large convolution kernel size in the two-way convolutional neural network is set to 400 and the small convolution kernel size to 50.
6. The attention and Transformer based OSA detection method of claim 2, wherein the feature extraction process employs residual blocks to enrich feature details and enhance the feature-extraction capability.
7. The attention and Transformer based OSA detection method of claim 6, wherein the formula of the residual block is:

x_{l+1} = x_l + F(x_l, W_l)   (1)

in equation (1), x_{l+1} is the output of the (l+1)-th convolutional layer, x_l is the output of the l-th convolutional layer, W_l is the weight of the l-th convolutional layer, and F(x_l, W_l) is the residual mapping.
8. The attention and Transformer based OSA detection method according to claim 2, wherein the preprocessing of step 2 is to decompose, denoise and reconstruct the EEG signal with the FastICA algorithm.
9. The attention and Transformer based OSA detection method of claim 8, wherein the FastICA decomposition, denoising and reconstruction of the EEG signal specifically comprises the following steps:

S1, centering: calculate the mean of the mixed signal X and subtract it from X, as shown in equation (2):

X = X − E(X)   (2)

in equation (2), X is the mixed signal and E(X) is the signal mean;

S2, whitening: the specific process is shown in equation (3):

E[XX^T] = C_X = UΛU^T   (3)

in equation (3), C_X = E[XX^T] is the covariance matrix of X, Λ = diag(λ_1, λ_2, …, λ_n) is the diagonal matrix whose diagonal elements are the eigenvalues of C_X, U = [u_1, u_2, …, u_n] is the matrix of eigenvectors of C_X, and UΛU^T is the eigendecomposition of the covariance matrix;

Z = K × X   (4)

in equation (4), Z is the whitened matrix, K is the whitening matrix, and X is the de-meaned signal;

K = Λ^{−1/2} U^T   (5)

in equation (5), K is the whitening matrix, and Λ and U are as defined above;

the covariance of the whitened matrix Z is shown in equation (6):

E[ZZ^T] = K C_X K^T = Λ^{−1/2} U^T (UΛU^T) U Λ^{−1/2} = I   (6)

in equation (6), I is the identity matrix, and Z, K, Λ, U and UΛU^T are as defined above;

S3, initialize W_i;

S4, update W_i as shown in equation (7):

W_i = E{Z g(W_i^T Z)} − E{g′(W_i^T Z)} W_i   (7)

in equation (7), W is the unmixing matrix being estimated and W_i is its i-th column;

S5, orthogonalize W_i as shown in equation (8):

W_i = W_i − Σ_{j<i} (W_i^T W_j) W_j   (8)

in equation (8), W_i is the i-th column of W and W_j is the j-th column of W;

S6, normalize W_i as shown in equation (9):

W_i = W_i / ‖W_i‖   (9)

in equation (9), ‖W_i‖ is the norm of W_i;

S7, check whether the iteration has converged; if not, return to S4; once converged, return to S3 and initialize the next W_i (i++);

S8, reconstruct the separated signals to obtain the source signal S, as shown in equation (10):

S = WKX   (10)

in equation (10), S is the reconstructed source signal, W is the estimated unmixing matrix, K is the whitening matrix, and X is the de-meaned signal.
10. The attention and Transformer based OSA detection method according to claim 2, wherein the loss function in step 4 is shown in equations (11), (12) and (13):

L = −(1/M) Σ_{i=1}^{M} Σ_{k=1}^{K} ω_k y_i^k log(ŷ_i^k) + λ‖θ‖₂²   (11)

ω_k = μ_k max(1, log(μ_k M / M_k))   (12)

ŷ_i^k = exp(z_i^k) / Σ_{j=1}^{K} exp(z_i^j)   (13)

in equations (11), (12) and (13), λ is a penalty factor controlling the degree of penalty on the network weights θ, ω_k denotes the weight assigned to class k, μ_k is an adjustable parameter, M_k is the number of samples of class k, M is the total number of samples, K is the number of classes, y_i^k is the true label of the i-th sample, ŷ_i^k is the predicted probability that the i-th sample belongs to class k, and z_i^k is the corresponding logit.
CN202210788498.7A 2022-07-06 2022-07-06 OSA detection method based on attention and Transformer Withdrawn CN115089123A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210788498.7A CN115089123A (en) 2022-07-06 2022-07-06 OSA detection method based on attention and Transformer

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210788498.7A CN115089123A (en) 2022-07-06 2022-07-06 OSA detection method based on attention and Transformer

Publications (1)

Publication Number Publication Date
CN115089123A true CN115089123A (en) 2022-09-23

Family

ID=83296302

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210788498.7A Withdrawn CN115089123A (en) 2022-07-06 2022-07-06 OSA detection method based on attention and Transformer

Country Status (1)

Country Link
CN (1) CN115089123A (en)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116421152A (en) * 2023-06-13 2023-07-14 长春理工大学 Sleep stage result determining method, device, equipment and medium
CN116421152B (en) * 2023-06-13 2023-08-22 长春理工大学 Sleep stage result determining method, device, equipment and medium
CN117338253A (en) * 2023-12-05 2024-01-05 华南师范大学 Sleep apnea detection method and device based on physiological signals
CN117338253B (en) * 2023-12-05 2024-03-26 华南师范大学 Sleep apnea detection method and device based on physiological signals

Similar Documents

Publication Publication Date Title
Jiang et al. A snapshot research and implementation of multimodal information fusion for data-driven emotion recognition
Qu et al. A residual based attention model for EEG based sleep staging
Rubin et al. Recognizing abnormal heart sounds using deep learning
CN110801221B (en) Sleep apnea fragment detection equipment based on unsupervised feature learning
Gupta et al. OSACN-Net: automated classification of sleep apnea using deep learning model and smoothed Gabor spectrograms of ECG signal
CN115089123A (en) OSA detection method based on attention and Transformer
Salhi et al. Voice disorders identification using multilayer neural network
CN111920420B (en) Patient behavior multi-modal analysis and prediction system based on statistical learning
Deperlioglu Heart sound classification with signal instant energy and stacked autoencoder network
CN116072265B (en) Sleep stage analysis system and method based on convolution of time self-attention and dynamic diagram
Kang et al. 1D convolutional autoencoder-based PPG and GSR signals for real-time emotion classification
Aggarwal et al. A structured learning approach with neural conditional random fields for sleep staging
Dar et al. Spectral features and optimal hierarchical attention networks for pulmonary abnormality detection from the respiratory sound signals
Asatani et al. Classification of respiratory sounds using improved convolutional recurrent neural network
CN113925459A (en) Sleep staging method based on electroencephalogram feature fusion
CN115804602A (en) Electroencephalogram emotion signal detection method, equipment and medium based on attention mechanism and with multi-channel feature fusion
Simply et al. Diagnosis of obstructive sleep apnea using speech signals from awake subjects
Liang et al. Obstructive sleep apnea detection using combination of CNN and LSTM techniques
Lu et al. Speech depression recognition based on attentional residual network
Wang et al. Deep learning for sleep stage classification
Pravin et al. Regularized deep LSTM autoencoder for phonological deviation assessment
Zhang et al. A noninvasive method to detect diabetes mellitus and lung cancer using the stacked sparse autoencoder
Rezaee et al. Can you understand why i am crying? a decision-making system for classifying infants’ cry languages based on deepsvm model
Srivastava et al. ApneaNet: A hybrid 1DCNN-LSTM architecture for detection of Obstructive Sleep Apnea using digitized ECG signals
Liu et al. Respiratory sounds feature learning with deep convolutional neural networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WW01 Invention patent application withdrawn after publication

Application publication date: 20220923