CN115349821A - Sleep staging method and system based on multi-modal physiological signal fusion

Sleep staging method and system based on multi-modal physiological signal fusion

Info

Publication number
CN115349821A
CN115349821A
Authority
CN
China
Prior art keywords
time
domain
frequency
graph
fusion
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210675112.1A
Other languages
Chinese (zh)
Inventor
樊小毛
李宇杰
马文俊
赵淦森
陈莹
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China Normal University
Shenzhen Technology University
Original Assignee
South China Normal University
Shenzhen Technology University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China Normal University and Shenzhen Technology University
Priority to CN202210675112.1A
Publication of CN115349821A
Legal status: Pending

Classifications

    • A: HUMAN NECESSITIES
    • A61: MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B: DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00: Measuring for diagnostic purposes; Identification of persons
    • A61B5/48: Other medical applications
    • A61B5/4806: Sleep evaluation
    • A61B5/4812: Detecting sleep stages or cycles
    • A61B5/72: Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235: Details of waveform analysis
    • A61B5/7253: Details of waveform analysis characterised by using transforms
    • A61B5/7257: Details of waveform analysis characterised by using transforms using Fourier transforms
    • A61B5/7264: Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00: Machine learning

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Medical Informatics (AREA)
  • Artificial Intelligence (AREA)
  • Veterinary Medicine (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Surgery (AREA)
  • Animal Behavior & Ethology (AREA)
  • General Health & Medical Sciences (AREA)
  • Public Health (AREA)
  • Pathology (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Evolutionary Computation (AREA)
  • Software Systems (AREA)
  • Signal Processing (AREA)
  • Psychiatry (AREA)
  • Physiology (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Fuzzy Systems (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a sleep staging method and system based on multi-modal physiological signal fusion. The method comprises the following steps: down-sampling the acquired multi-lead sleep monitoring signals to obtain multi-lead signals; performing a short-time Fourier transform on the multi-lead signals to obtain a time-frequency graph, extracting lead features from the multi-lead signals using one-dimensional convolution, and generating a machine self-learning graph from the lead features; combining the time-frequency graph and the machine self-learning graph, respectively, with the time-domain features of adjacent sleep stages to obtain a frequency domain-time domain fusion feature and a spatial domain-time domain fusion feature; and performing multi-view feature fusion on the two fusion features to obtain the sleep staging result. The invention improves the accuracy of sleep staging and can be widely applied in the technical field of artificial intelligence.

Description

Sleep staging method and system based on multi-modal physiological signal fusion
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to a sleep staging method and a sleep staging system based on multi-modal physiological signal fusion.
Background
Sleep is a basic physiological function of human beings, characterized by a series of changes in brain, muscle, eye, heart and respiratory activity, and plays an important role in restoring the body's various functions. Sleep eliminates the fatigue accumulated during the day, restores energy, and ensures that the brain thinks clearly and responds quickly while awake the next day. Sleep is also closely related to mental health: insufficient sleep causes listlessness, and patients with depression are often accompanied by insomnia, abnormal sleep and similar symptoms. Sleep-state monitoring has in recent years become a research focus at the intersection of physiological signal analysis and artificial intelligence and is of great significance to human health; sleep-related disorders such as insomnia, schizophrenia and autism can be identified by analyzing sleep quality. Sleep staging is an important method for identifying sleep states and helps to better locate the onset of different abnormalities.
Conventional sleep staging mostly relies on medical staff visually inspecting polysomnography (PSG): the patient is recorded overnight while sensors attached to the body measure various physiological signals, such as the electroencephalogram (EEG), electromyogram (EMG), electrocardiogram (ECG) and electrooculogram (EOG), and monitor respiration and other physiological changes. However, this approach is time-consuming and labor-intensive, and the subjectivity of human judgment makes it error-prone, so an automated sleep staging model is needed. Most existing automatic sleep staging models are based on single-modality physiological signals and cannot capture the interconnections among the polysomnography signals, and therefore cannot provide multi-angle physiological information to the user. Most models based on multi-modal physiological signals extract only a single feature type and cannot fuse the complementary information of multiple views, so there is still room to improve their sleep staging accuracy.
Disclosure of Invention
In view of this, embodiments of the present invention provide a sleep staging method and system based on multi-modal physiological signal fusion, which can improve the accuracy of sleep staging.
One aspect of the embodiments of the present invention provides a sleep staging method based on multi-modal physiological signal fusion, including:
performing down-sampling processing on the acquired multi-lead sleep monitoring signal to obtain a multi-lead signal;
carrying out short-time Fourier transform processing on the multi-lead signals, converting to obtain a time-frequency graph, extracting lead characteristics in the multi-lead signals by using one-dimensional convolution, and generating a machine self-learning graph according to the lead characteristics;
respectively combining the time-frequency graph and the machine self-learning graph with time-domain characteristics of adjacent sleep stages to obtain a frequency domain-time domain fusion characteristic and a space domain-time domain fusion characteristic;
and performing multi-view feature fusion on the frequency domain-time domain fusion features and the space domain-time domain fusion features to obtain a sleep stage result.
Optionally, the performing short-time fourier transform processing on the multi-lead signal to obtain a time-frequency diagram by conversion includes:
configuring a sampling frequency and an STFT window width;
according to the configured sampling frequency and the STFT window width, performing a short-time Fourier transform on each 30-second signal segment to obtain an STFT image corresponding to each lead signal;
stacking the STFT images of all the lead signals to obtain a time-frequency graph;
and when the window width intercepted in the short-time Fourier transform process is less than 100, filling 0 at two ends of the time-frequency diagram.
Optionally, the generating a machine self-learning graph from lead characteristics comprises:
viewing each lead as each node in the graph; the characteristics of the nodes are formed by characteristic matrixes extracted from original signals by two one-dimensional convolution kernels;
and generating a self-learning graph according to the extracted feature matrix, and further forming a physiological structure relationship graph among different signal leads.
Optionally, the step of combining the time-frequency graph and the machine self-learning graph respectively with the time-domain features of adjacent sleep stages to obtain a frequency domain-time domain fusion feature and a spatial domain-time domain fusion feature includes:
extracting the characteristics of the time-frequency graph by using frequency domain convolution, and extracting frequency domain characteristics;
extracting the features of the self-learning graph by using space domain convolution, and extracting the spatial features among human physiological structures;
respectively extracting time domain characteristics of the time-frequency graph and the self-learning graph;
and performing multi-view feature fusion according to the extracted frequency domain features, spatial features and time domain features to obtain frequency domain-time domain fusion features and space domain-time domain fusion features.
Optionally, the performing feature extraction on the time-frequency graph by using frequency-domain convolution and extracting frequency-domain features include:
extracting the characteristics of the time-frequency diagram through a VGG-16 network to obtain frequency domain characteristics;
wherein the VGG-16 network comprises 5 convolutional layers, 3 fully-connected layers and 1 SoftMax output layer; max pooling is applied between layers, the ReLU activation function activates all hidden layers, and the resulting features are input into a 128-dimensional fully-connected layer.
Optionally, the extracting the features of the self-learning graph by using a spatial convolution to extract spatial features between human physiological structures includes:
adding a space attention mechanism on the self-learning graph, capturing a topological structure in the self-learning graph by utilizing the convolution of the Chebyshev graph, and extracting space characteristics;
and the adjacency matrix and the spatial attention matrix learned in the Chebyshev graph convolution process are used for dynamically adjusting the node updates.
Optionally, the separately extracting time domain features of the time-frequency graph and the self-learning graph includes:
inputting the currently extracted frequency feature, together with the feature segments before and after it, into a GRU network;
inputting the output of each sequence into an attention network, learning the weight of each sequence, and fusing the characteristics of the five sequences into a 256-dimensional time characteristic;
combining the fused features with the features of the current sleep stage, and inputting the combined features into a 128-dimensional full connection layer;
adding a time attention mechanism to the structural graphs of the current record and the two records before and after it, and applying a temporal convolution to obtain an attention matrix;
and carrying out normalization processing on the attention matrix by utilizing a Softmax operation to obtain time domain characteristics.
In another aspect, an embodiment of the present invention further provides a sleep staging system based on multi-modal physiological signal fusion, including:
the first module is used for performing down-sampling processing on the acquired multi-lead sleep monitoring signal to obtain a multi-lead signal;
the second module is used for carrying out short-time Fourier transform processing on the multi-lead signals, converting the multi-lead signals to obtain a time-frequency graph, extracting lead characteristics in the multi-lead signals by using one-dimensional convolution, and then generating a machine self-learning graph according to the lead characteristics;
the third module is used for respectively combining the time-frequency graph and the machine self-learning graph with the time-domain characteristics of adjacent sleep stages to obtain a frequency domain-time domain fusion characteristic and a space domain-time domain fusion characteristic;
and the fourth module is used for carrying out multi-view feature fusion on the frequency domain-time domain fusion features and the spatial domain-time domain fusion features to obtain a sleep staging result.
Another aspect of the embodiments of the present invention further provides an electronic device, which includes a processor and a memory;
the memory is used for storing programs;
the processor executes the program to implement the method as described above.
Yet another aspect of the embodiments of the present invention provides a computer-readable storage medium, which stores a program, which is executed by a processor to implement the method as described above.
The embodiment of the invention also discloses a computer program product or a computer program, which comprises computer instructions, and the computer instructions are stored in a computer readable storage medium. The computer instructions may be read by a processor of a computer device from a computer-readable storage medium, and the computer instructions executed by the processor cause the computer device to perform the foregoing method.
The embodiment of the invention carries out down-sampling processing on the collected multi-lead sleep monitoring signal to obtain the multi-lead signal; carrying out short-time Fourier transform processing on the multi-lead signals, converting to obtain a time-frequency graph, extracting lead characteristics in the multi-lead signals by using one-dimensional convolution, and generating a machine self-learning graph according to the lead characteristics; respectively combining the time-frequency graph and the machine self-learning graph with time-domain characteristics of adjacent sleep stages to obtain a frequency domain-time domain fusion characteristic and a space domain-time domain fusion characteristic; and performing multi-view feature fusion on the frequency domain-time domain fusion features and the space domain-time domain fusion features to obtain a sleep staging result. The invention can improve the accuracy of sleep staging.
Drawings
In order to more clearly illustrate the technical solutions in the embodiments of the present application, the drawings required to be used in the description of the embodiments are briefly introduced below, and it is obvious that the drawings in the description below are only some embodiments of the present application, and it is obvious for those skilled in the art to obtain other drawings without creative efforts.
FIG. 1 is a flowchart illustrating the overall steps provided by an embodiment of the present invention;
FIG. 2 is a flowchart of an MVF-SleepNet according to an embodiment of the present invention;
fig. 3 is a schematic diagram of generating a self-learning graph according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clearly understood, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
In view of the problems in the prior art, an aspect of the embodiments of the present invention provides a sleep staging method based on multi-modal physiological signal fusion, as shown in fig. 1, the method of the present invention includes the following steps:
performing down-sampling processing on the acquired multi-lead sleep monitoring signal to obtain a multi-lead signal;
carrying out short-time Fourier transform processing on the multi-lead signals, converting to obtain a time-frequency graph, extracting lead characteristics in the multi-lead signals by using one-dimensional convolution, and generating a machine self-learning graph according to the lead characteristics;
respectively combining the time-frequency graph and the machine self-learning graph with time-domain characteristics of adjacent sleep stages to obtain a frequency domain-time domain fusion characteristic and a space domain-time domain fusion characteristic;
and performing multi-view feature fusion on the frequency domain-time domain fusion features and the space domain-time domain fusion features to obtain a sleep staging result.
Optionally, the performing short-time fourier transform processing on the multi-lead signal to obtain a time-frequency diagram by conversion includes:
configuring a sampling frequency and an STFT window width;
according to the configured sampling frequency and the STFT window width, performing a short-time Fourier transform on each 30-second signal segment to obtain an STFT image corresponding to each lead signal;
stacking the STFT images of all the lead signals to obtain a time-frequency graph;
and when the window width intercepted in the short-time Fourier transform process is less than 100, filling 0 to the two ends of the time frequency diagram.
Optionally, the generating a machine self-learning graph from the lead characteristics comprises:
viewing each lead as each node in the graph; the node characteristics are formed by a characteristic matrix extracted from an original signal by two one-dimensional convolution kernels;
and generating a self-learning graph according to the extracted feature matrix, and further forming a physiological structure relationship graph among different signal leads.
Optionally, the step of combining the time-frequency graph and the machine self-learning graph respectively with the time-domain features of adjacent sleep stages to obtain a frequency domain-time domain fusion feature and a spatial domain-time domain fusion feature includes:
extracting the characteristics of the time-frequency graph by using frequency domain convolution, and extracting frequency domain characteristics;
extracting the features of the self-learning graph by using space domain convolution, and extracting the spatial features among human physiological structures;
respectively extracting time domain characteristics of the time frequency graph and the self-learning graph;
and performing multi-view feature fusion according to the extracted frequency domain features, spatial features and time domain features to obtain frequency domain-time domain fusion features and space domain-time domain fusion features.
Optionally, the extracting the features of the time-frequency graph by using frequency domain convolution and extracting the frequency domain features include:
extracting the characteristics of the time-frequency diagram through a VGG-16 network to obtain frequency domain characteristics;
wherein the VGG-16 network comprises 5 convolutional layers, 3 fully-connected layers and 1 SoftMax output layer; max pooling is applied between layers, the ReLU activation function activates all hidden layers, and the resulting features are input into a 128-dimensional fully-connected layer.
Optionally, the extracting the features of the self-learning graph by using a spatial convolution to extract spatial features between human physiological structures includes:
adding a space attention mechanism to the self-learning graph, capturing a topological structure in the self-learning graph by utilizing the convolution of a Chebyshev graph, and extracting space characteristics;
and the adjacency matrix and the spatial attention matrix learned in the Chebyshev graph convolution process are used for dynamically adjusting the node updates.
Optionally, the separately extracting time-domain features of the time-frequency graph and the self-learning graph includes:
inputting the currently extracted frequency feature, together with the feature segments before and after it, into a GRU network;
inputting the output of each sequence into an attention network, learning the weight of each sequence, and fusing the characteristics of the five sequences into a 256-dimensional time characteristic;
combining the fused features with the features of the current sleep stage, and inputting the combined features into a 128-dimensional full connection layer;
adding a time attention mechanism to the structural graphs of the current record and the two records before and after it, and applying a temporal convolution to obtain an attention matrix;
and carrying out normalization processing on the attention matrix by utilizing a Softmax operation to obtain time domain characteristics.
In another aspect, an embodiment of the present invention further provides a sleep staging system based on multi-modal physiological signal fusion, including:
the first module is used for performing down-sampling processing on the acquired multi-lead sleep monitoring signal to obtain a multi-lead signal;
the second module is used for carrying out short-time Fourier transform processing on the multi-lead signals, converting the multi-lead signals to obtain a time-frequency graph, extracting lead characteristics in the multi-lead signals by using one-dimensional convolution, and then generating a machine self-learning graph according to the lead characteristics;
the third module is used for respectively combining the time-frequency graph and the machine self-learning graph with the time-domain characteristics of adjacent sleep stages to obtain a frequency domain-time domain fusion characteristic and a space domain-time domain fusion characteristic;
and the fourth module is used for carrying out multi-view feature fusion on the frequency domain-time domain fusion features and the space domain-time domain fusion features to obtain a sleep stage result.
Another aspect of the embodiments of the present invention further provides an electronic device, which includes a processor and a memory;
the memory is used for storing programs;
the processor executes the program to implement the method as described above.
Yet another aspect of the embodiments of the present invention provides a computer-readable storage medium, which stores a program, which is executed by a processor to implement the method as described above.
The embodiment of the invention also discloses a computer program product or a computer program, which comprises computer instructions, and the computer instructions are stored in a computer readable storage medium. The computer instructions may be read by a processor of a computer device from a computer-readable storage medium, and the computer instructions executed by the processor cause the computer device to perform the foregoing method.
The following detailed description of the invention is made with reference to the accompanying drawings, in which:
to automate sleep staging using polysomnography, some researchers have proposed various sleep staging models using traditional machine learning methods. However, most of their models are based on a signal processing or data mining method, and the frequency domain or time domain features of physiological signals are extracted, and the models are constructed after feature selection. The feature extraction method of these models relies on the prior knowledge of human beings, the feature selection method relies on the experience of researchers, and the overall performance has been inferior to the Deep Learning (Deep Learning) model today when the amount of medical data is becoming huge. In recent years, researchers extract features from multi-derivative sleep signals based on a Convolutional Neural Network (CNN) and a Recurrent Neural Network (RNN) which are two classical methods in a deep learning method, and construct an automatic sleep staging model. However, their research has the problem that some models use only one or two leads in the physiological signal, rather than multiple leads, which ignores the connected information between different lead channels; some models only simply extract features from a time domain, a frequency domain or a space domain, but do not select organic fusion of the features of multiple domains; furthermore, most of the above studies only consider regular grid-like input data types such as time series data and images, but not non-euclidean input data types such as graph networks, which can well combine different physiological structures of the human body.
In recent years, complex networks and Graph Neural Networks (GNNs) have become a focus of machine learning research and have many applications in sleep staging methods based on physiological signals. Graph structures can be used to ideally represent human physiological structures, and the connectivity between different human physiological structures can be captured by graph data types.
Unlike prior-art methods, the present invention proposes a model based on multi-modal physiological signal fusion. (1) The inputs to the model include the EEG, EOG, ECG and EMG signals of a polysomnogram. (2) The invention represents the relationships between regular lead signals and the relationships of irregular physiological structures by constructing a time-frequency graph (TF Image) and a self-learning graph (GL Graph) respectively, extracts multi-view complementary features from the time domain, frequency domain and spatial domain of the physiological signals with a deep learning model, and fuses these features to improve the accuracy of sleep staging. (3) Experiments on the public sleep dataset ISRUC-Sleep S3 verify the superiority of the method.
The present invention uses the ISRUC-Sleep S3 dataset, publicly released by the University of Coimbra, Portugal. The sampling frequency of the polysomnography signals is 200 Hz. The dataset divides each subject's overnight polysomnography recording into 30-second segments, each labeled as Wake, N1, N2, N3 or REM. A total of 10 polysomnography recordings, comprising 8549 segments, were used for the experiments.
The workflow of the proposed multi-modal fusion model MVF-SleepNet is shown in FIG. 2. First, the signals are down-sampled to 100 Hz; a short-time Fourier transform (STFT) converts the multi-lead signals into a time-frequency graph, fusing the regular grid information of the signals, while one-dimensional convolution (1-D CNN) extracts lead features from which a machine self-learning graph is generated, fusing the physiological structure information of the human body. For the time-frequency graph of the current labeled segment, a VGG-16 network first extracts frequency-domain features; these, combined with the frequency-domain features of the two adjacent segments on each side, are fed into a GRU network to extract time-domain features, which are then combined with the frequency-domain features of the original segment to obtain the frequency domain-time domain fusion feature. For the self-learning graph of the current labeled segment, Chebyshev graph convolution first extracts spatial-domain features; these, combined with the features of the adjacent segments on each side, pass through a temporal convolution to extract time-domain features, yielding the spatial domain-time domain fusion feature. Finally, the extracted multi-view features are fused for sleep staging.
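As an illustration of the preprocessing step, the following Python sketch (a minimal example assuming NumPy/SciPy and a recording stored as an array of shape (n_leads, n_samples); the function and variable names are ours, not the patent's) downsamples a 200 Hz recording to 100 Hz and splits it into 30-second epochs:

```python
import numpy as np
from scipy.signal import resample_poly

def preprocess(record: np.ndarray, fs_in: int = 200, fs_out: int = 100,
               epoch_sec: int = 30) -> np.ndarray:
    """Downsample a (n_leads, n_samples) recording and split it into
    30-second epochs of shape (n_epochs, n_leads, epoch_sec * fs_out)."""
    # Polyphase resampling from 200 Hz to 100 Hz.
    sig = resample_poly(record, up=fs_out, down=fs_in, axis=1)
    epoch_len = epoch_sec * fs_out            # 3000 samples per epoch
    n_epochs = sig.shape[1] // epoch_len      # drop any trailing remainder
    sig = sig[:, :n_epochs * epoch_len]
    return sig.reshape(sig.shape[0], n_epochs, epoch_len).transpose(1, 0, 2)
```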
The technical scheme of the invention is explained in detail as follows:
(1) Multi-lead relationship representation:
in order to better represent the relationship among the multichannel data and extract rich information from Euclidean data and non-Euclidean data, the invention provides a time-frequency graph which is constructed and used for representing regular data in the multichannel physiological signal and a machine self-learning graph which is constructed and used for representing non-regular data in the multichannel physiological signal.
a. Time-frequency graph construction
In signal analysis, the Fourier transform can be used to analyze the components of a signal and also to synthesize signals, but it is mainly suited to stationary signals. Although the frequency components of a signal can be obtained by Fourier transform, the time at which each component occurs is unknown. For non-stationary signals, a short-time Fourier transform is needed to know when each frequency occurs. The essence of the short-time Fourier transform is windowing: the time-domain process is decomposed into many short, equal-length processes, each of which is approximately stationary, and each is Fourier-transformed. The STFT is defined as follows:
$$\mathrm{STFT}\{x\}(t,\omega)=\int_{-\infty}^{\infty} x(\tau)\,w(\tau-t)\,e^{-j\omega\tau}\,d\tau$$
where w(t) is the window function and x(t) is the signal to be transformed. In this embodiment, the STFT is applied to each 30-second sleep signal segment, with the sampling frequency of the transform set to 1 Hz. The STFT window width is set to 100, and if a truncated window is shorter than 100 samples, the signal is zero-padded at both ends. Each lead's signal is transformed into a corresponding STFT image, and finally all images are stacked together to obtain a time-frequency graph with a resolution of 100 × 10.
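A minimal Python sketch of this construction (assuming SciPy; the nperseg=100 window and the magnitude stacking follow the description above, while the remaining STFT parameters are assumptions):

```python
import numpy as np
from scipy.signal import stft

def time_frequency_graph(epoch: np.ndarray, fs: int = 100,
                         nperseg: int = 100) -> np.ndarray:
    """Build the stacked time-frequency representation of one 30 s epoch.

    epoch: (n_leads, 3000) array at 100 Hz. Returns the magnitude
    spectrograms of all leads stacked along the first axis.
    """
    images = []
    for lead in epoch:
        # boundary='zeros' / padded=True zero-fill the signal ends, so no
        # truncated window is ever shorter than nperseg (cf. the text above).
        _, _, Z = stft(lead, fs=fs, nperseg=nperseg,
                       boundary='zeros', padded=True)
        images.append(np.abs(Z))          # keep the magnitude spectrogram
    return np.stack(images, axis=0)       # one "image" per lead, stacked
```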
b. Self-learning graph construction
The generation of the self-learning graph is shown in FIG. 3. A graph structure can capture the relationships between different lead signals; however, most previous research relies on manually pre-defining the graph structure, which depends on human prior knowledge. No universal graph structure currently applies across datasets, and different data often require different graph structures, which harms a model's generalization ability. A self-learning graph instead learns and generates the graph structure automatically from the node features of the different leads and the characteristics of the dataset, effectively improving the generalization of the model. In the present invention, each lead is treated as a node in the graph, and the node features are formed by a feature matrix extracted from the original signal by two one-dimensional convolution kernels (of size 32 and 64, respectively). A self-learning graph is then generated from this feature matrix, forming a physiological-structure relationship graph among the different signal leads.
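The following PyTorch sketch shows one plausible reading of this step; the channel counts, strides, pooling, and the softmax-normalized similarity used to form the adjacency matrix are our assumptions, since the patent specifies only two 1-D convolution kernels of size 32 and 64:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SelfLearningGraph(nn.Module):
    """Sketch: two 1-D convolutions extract one feature vector per lead
    (node); a learned pairwise similarity of those vectors forms the
    adjacency matrix of the self-learning graph."""

    def __init__(self, feat_dim: int = 64):
        super().__init__()
        self.conv1 = nn.Conv1d(1, 32, kernel_size=32, stride=4)
        self.conv2 = nn.Conv1d(32, 64, kernel_size=64, stride=4)
        self.pool = nn.AdaptiveAvgPool1d(1)     # one vector per lead
        self.theta = nn.Linear(64, feat_dim)    # projection before similarity

    def forward(self, x: torch.Tensor):
        # x: (batch, n_leads, n_samples) raw epoch
        b, n, t = x.shape
        h = x.reshape(b * n, 1, t)              # process each lead separately
        h = F.relu(self.conv2(F.relu(self.conv1(h))))
        h = self.pool(h).reshape(b, n, -1)      # (batch, n_leads, 64) node features
        e = self.theta(h)
        # Learned adjacency: softmax-normalized pairwise affinities.
        adj = torch.softmax(F.relu(e @ e.transpose(1, 2)), dim=-1)
        return h, adj                           # node features + graph structure
```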
(2) Multi-view feature extraction
After the time-frequency graph and the self-learning graph are constructed, different features need to be extracted from them. Previous studies have shown that multichannel signals carry rich information in the frequency domain, spatial domain and time domain, which can be used to identify different sleep states. To extract multi-view features simultaneously, the invention uses frequency-domain convolution to extract frequency-domain features from the time-frequency graph, and spatial-domain (graph) convolution to extract spatial features among human physiological structures from the self-learning graph. In addition, in practice sleep experts often use the neighborhood information of adjacent sleep stages to identify the current stage; inspired by this, the invention also extracts time-domain features from both graphs. The multi-view feature extraction consists of 4 modules: the frequency-domain feature extraction module, the spatial-domain feature extraction module, the time-domain feature extraction module and the multi-view feature fusion module.
Frequency-domain feature extraction module. After the time-frequency graph is constructed, frequency-domain features are extracted with the VGG-16 network widely used in computer vision. VGG-16 consists of 5 convolutional layers, 3 fully-connected layers and 1 SoftMax output layer. Max pooling is applied between layers, all hidden layers are activated with the ReLU activation function, and the resulting features are input into a 128-dimensional fully-connected layer.
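A hedged PyTorch sketch of this branch, using the torchvision VGG-16 backbone; the 10-channel input surgery and the pooling head are our assumptions, since the patent only names VGG-16 with a 128-d fully-connected output:

```python
import torch
import torch.nn as nn
from torchvision.models import vgg16

class FrequencyBranch(nn.Module):
    """Sketch of the frequency-domain branch: a VGG-16 backbone over the
    stacked time-frequency image, ending in a 128-dimensional embedding."""

    def __init__(self, in_channels: int = 10, out_dim: int = 128):
        super().__init__()
        net = vgg16(weights=None)
        # Replace the first conv so VGG accepts the 10-lead STFT stack
        # instead of a 3-channel RGB image.
        net.features[0] = nn.Conv2d(in_channels, 64, kernel_size=3, padding=1)
        self.features = net.features            # conv blocks with max pooling
        self.head = nn.Sequential(
            nn.AdaptiveAvgPool2d((7, 7)),
            nn.Flatten(),
            nn.Linear(512 * 7 * 7, out_dim),    # 128-d frequency feature
            nn.ReLU(inplace=True),
        )

    def forward(self, tf_image: torch.Tensor) -> torch.Tensor:
        # tf_image: (batch, 10, H, W) stacked STFT magnitudes
        return self.head(self.features(tf_image))
```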
Spatial-domain feature extraction module. After the self-learning graph is constructed, a spatial attention mechanism is added to the graph, and Chebyshev graph convolution (ChebGCN) is used to capture the topological structure of the graph and extract spatial features. Spatial attention is defined as follows:
$$P = V_p \cdot \sigma\!\left(\left(\mathcal{X}^{(l-1)} Z_1\right) Z_2 \left(Z_3\, \mathcal{X}^{(l-1)}\right)^{\mathrm{T}} + b_p\right)$$

where $\mathcal{X}^{(l-1)}$ is the input of the $l$-th layer, $V_p$, $b_p$, $Z_1$, $Z_2$, $Z_3$ are learnable parameters, and $\sigma$ is the Sigmoid activation function. $P$ denotes the spatial attention matrix, dynamically computed from the input of the current layer; the computed attention matrix $P$ is then normalized by a Softmax operation. In the model of the invention, the learned adjacency matrix and the spatial attention matrix $P$ dynamically adjust the node updates during graph convolution. The Chebyshev graph convolution is defined as follows:
$$g_\theta *_G x = \sum_{k=0}^{K-1} \theta_k\, T_k(\tilde{L})\, x, \qquad L = D - A, \qquad \tilde{L} = \frac{2}{\lambda_{\max}} L - I_N$$

where $g_\theta$ is the convolution kernel, $*_G$ denotes the graph convolution operation, $\theta$ is the vector of Chebyshev coefficients, and $x$ is the input data. $\lambda_{\max}$ is the maximum eigenvalue of the Laplacian matrix $L$, $I_N$ is the identity matrix, and $T_k$ is the recursive Chebyshev polynomial. After graph convolution, each channel lead integrates the features of the other channel leads.
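A compact PyTorch sketch of this spatial step, mirroring the formulas above under stated assumptions: the learned adjacency is symmetrized before building the Laplacian, λmax is computed per graph, the spatial attention P is applied as an element-wise modulation, and K and the initialization are illustrative:

```python
import torch
import torch.nn as nn

class ChebConv(nn.Module):
    """K-order Chebyshev graph convolution with optional spatial attention.

    Implements g_theta *_G x = sum_k theta_k T_k(L~) x with the recursion
    T_0 = I, T_1 = L~, T_k = 2 L~ T_{k-1} - T_{k-2}, where
    L~ = 2L/lambda_max - I and L = D - A.
    """

    def __init__(self, in_dim: int, out_dim: int, K: int = 3):
        super().__init__()
        self.K = K
        self.theta = nn.Parameter(torch.randn(K, in_dim, out_dim) * 0.01)

    def forward(self, x, adj, attention=None):
        # x: (batch, n_nodes, in_dim); adj: (batch, n_nodes, n_nodes)
        adj = 0.5 * (adj + adj.transpose(-1, -2))      # symmetrize learned graph
        lap = torch.diag_embed(adj.sum(-1)) - adj      # L = D - A
        lam_max = torch.linalg.eigvalsh(lap).amax(-1)  # largest eigenvalue
        eye = torch.eye(adj.size(-1), device=x.device)
        lap_t = 2.0 * lap / (lam_max.view(-1, 1, 1) + 1e-6) - eye
        if attention is not None:                      # spatial attention P
            lap_t = lap_t * attention                  # modulates node updates
        Tx = [x, lap_t @ x]                            # T_0(L~)x and T_1(L~)x
        for _ in range(2, self.K):
            Tx.append(2.0 * lap_t @ Tx[-1] - Tx[-2])   # Chebyshev recursion
        return torch.relu(sum(Tx[k] @ self.theta[k] for k in range(self.K)))
```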
Time-domain feature extraction module. Inspired by the fact that sleep experts often judge the current sleep stage from the characteristics of adjacent sleep periods, the invention feeds the currently extracted frequency feature, together with the segments before and after it, into a GRU network. The GRU update is defined as follows:
$$h_t = (1 - z) \odot h_{t-1} + z \odot h'$$
where $z$ is the update-gate signal, $h_t$ is the information of the current unit, and $h_{t-1}$ is the information passed from the previous unit. The output of each sequence step is then fed into an attention network that learns a weight for each step, and the features of the five steps are fused into a 256-dimensional time feature. Finally, the fused feature is combined with the feature of the current sleep stage itself and input into a 128-dimensional fully-connected layer (a sketch of this branch follows the temporal convolution definition below). Similarly, to exploit the features of adjacent sleep stages, a time attention mechanism is added to the structural graphs of the current record and the two records before and after it, together with a temporal convolution. Temporal attention is defined as follows:
$$Q = V_q \cdot \sigma\!\left(\left(\left(\mathcal{X}^{(l-1)}\right)^{\mathrm{T}} M_1\right) M_2 \left(M_3\, \mathcal{X}^{(l-1)}\right) + b_q\right)$$
where $V_q$, $b_q$, $M_1$, $M_2$, $M_3$ are learnable parameters, and $Q_{u,v}$ represents the strength of the correlation between the sleeping brain networks $G_u$ and $G_v$. Finally, the attention matrix $Q$ is normalized with a Softmax operation, and temporal attention adjusts the input of the spatio-temporal stream so that it focuses on the most informative time steps. The temporal convolution is defined as follows:
$$\mathcal{X}^{(l)} = \mathrm{ReLU}\!\left(\Phi * \left(\hat{\mathcal{X}}^{(l-1)} Q\right)\right)$$
where ReLU is the activation function, $\Phi$ is the parameter of the convolution kernel, and $*$ is the standard convolution operation. After the temporal convolution, the feature matrix is flattened and input into a 128-dimensional fully-connected layer.
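The following PyTorch sketch covers both halves of the time-domain module under stated assumptions (module names, feature dimensions other than the 256-d fusion and 128-d FC, and the kernel size are illustrative): a GRU with sequence attention over five neighboring epochs for the frequency stream, and temporal attention followed by convolution along time for the graph stream.

```python
import torch
import torch.nn as nn

class TemporalGRU(nn.Module):
    """Frequency stream: GRU over 5 neighboring epochs + attention fusion."""

    def __init__(self, feat_dim: int = 128, hidden: int = 256):
        super().__init__()
        self.gru = nn.GRU(feat_dim, hidden, batch_first=True)
        self.att = nn.Linear(hidden, 1)              # one score per epoch
        self.fc = nn.Linear(hidden + feat_dim, 128)  # 128-d fused output

    def forward(self, seq: torch.Tensor) -> torch.Tensor:
        # seq: (batch, 5, feat_dim), frequency features of epochs t-2..t+2
        out, _ = self.gru(seq)                       # (batch, 5, 256)
        w = torch.softmax(self.att(out), dim=1)      # learned per-epoch weights
        fused = (w * out).sum(dim=1)                 # 256-d time feature
        current = seq[:, 2]                          # the centre (current) epoch
        return torch.relu(self.fc(torch.cat([fused, current], dim=-1)))

class TemporalConvBlock(nn.Module):
    """Graph stream: temporal attention Q, then convolution along time."""

    def __init__(self, n_nodes: int = 10, feat_dim: int = 64, steps: int = 5):
        super().__init__()
        self.conv = nn.Conv2d(feat_dim, feat_dim,
                              kernel_size=(1, 3), padding=(0, 1))
        self.fc = nn.Linear(n_nodes * feat_dim * steps, 128)

    def forward(self, x: torch.Tensor, Q: torch.Tensor) -> torch.Tensor:
        # x: (batch, feat_dim, n_nodes, steps); Q: (batch, steps, steps)
        Q = torch.softmax(Q, dim=-1)                 # normalize the attention
        xq = torch.einsum('bfnt,bts->bfns', x, Q)    # re-weight the time axis
        h = torch.relu(self.conv(xq))                # ReLU(Phi * (X_hat Q))
        return self.fc(h.flatten(1))                 # flatten into 128-d FC
```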
Multi-view feature fusion module. After the model obtains the frequency domain-time domain features from the TF Image and the spatial domain-time domain features from the GL Graph, the feature matrices are concatenated to perform multi-view feature fusion. The concatenation is defined as follows:
$$X = X_{FT} \,\|\, X_{ST}$$
where $X_{FT}$ and $X_{ST}$ represent the features extracted from the frequency domain-time domain and spatial domain-time domain streams, respectively, and $\|$ denotes the concatenation operation. Finally, the fused feature is input into a 128-dimensional fully-connected layer, and after Softmax activation the 5-class sleep staging result is output.
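A minimal sketch of this fusion head, assuming both stream outputs are 128-dimensional as described above; in practice one would train on the pre-Softmax logits with a cross-entropy loss, and the explicit Softmax here simply mirrors the text:

```python
import torch
import torch.nn as nn

class FusionHead(nn.Module):
    """Concatenate X_FT and X_ST, project to a 128-d FC layer, and emit
    Softmax scores over the five sleep stages (Wake, N1, N2, N3, REM)."""

    def __init__(self, ft_dim: int = 128, st_dim: int = 128, n_classes: int = 5):
        super().__init__()
        self.fc = nn.Linear(ft_dim + st_dim, 128)
        self.out = nn.Linear(128, n_classes)

    def forward(self, x_ft: torch.Tensor, x_st: torch.Tensor) -> torch.Tensor:
        x = torch.cat([x_ft, x_st], dim=-1)          # X = X_FT || X_ST
        return torch.softmax(self.out(torch.relu(self.fc(x))), dim=-1)
```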
(3) Experimental verification
To demonstrate its effectiveness, the proposed algorithm is compared with existing machine learning methods (SVM, RF, etc.), classical deep learning methods (MLP+LSTM, CNN, CNN+BiLSTM, etc.) and graph neural network methods (STGCN, MSTGCN, etc.). The experimental results are averages over 10-fold cross-validation; Table 1 compares the per-stage sleep recognition performance on the public ISRUC-Sleep S3 dataset.
In the invention, a novel deep-learning-based multi-modal fusion algorithm, MVF-SleepNet, is developed for sleep staging. MVF-SleepNet represents the relationships between multi-lead signals with a short-time Fourier transform time-frequency graph and a self-learning graph, respectively, and extracts frequency-domain features with a VGG-16 network, spatial-domain features with ChebGCN, and time-domain features with a GRU and temporal convolution. On ISRUC-Sleep S3, the sleep staging accuracy of the invention is 83.3%.
Table 1 below shows the results of comparing the sleep staging performance of the models.
TABLE 1
[Table 1 is reproduced as an image in the original publication; its per-stage comparison figures are not recoverable as text.]
Among them, the first method in Table 1 is that of E. Alickovic and A. Subasi, "Ensemble SVM method for automatic sleep stage classification," IEEE Trans. Instrum. Meas., vol. 67, no. 6, pp. 1258–1265, Jun. 2018.
The second method is that of P. Memar and F. Faradji, "A novel multi-class EEG-based sleep stage classification system," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 26, no. 1, pp. 84–95, Jan. 2018.
The third method is that of H. Dong, A. Supratak, W. Pan, C. Wu, P. M. Matthews, and Y. Guo, "Mixed neural network approach for temporal sleep stage classification," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 26, no. 2, pp. 324–333, Feb. 2018.
The fourth method is that of A. Supratak, H. Dong, C. Wu, and Y. Guo, "DeepSleepNet: A model for automatic sleep stage scoring based on raw single-channel EEG," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 25, no. 11, pp. 1998–2008, Nov. 2017.
The fifth method is that of S. Chambon, M. N. Galtier, P. J. Arnal, G. Wainrib, and A. Gramfort, "A deep learning architecture for temporal sleep stage classification using multivariate and multimodal time series," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 26, no. 4, pp. 758–769, Apr. 2018.
The sixth method is that of H. Phan, F. Andreotti, N. Cooray, O. Y. Chén, and M. De Vos, "SeqSleepNet: End-to-end hierarchical recurrent neural network for sequence-to-sequence automatic sleep staging," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 27, no. 3, pp. 400–410, Mar. 2019.
The seventh method is that of Z. Jia et al., "GraphSleepNet: Adaptive spatial-temporal graph convolutional networks for sleep stage classification," in Proc. 29th Int. Joint Conf. Artif. Intell. (IJCAI), Jul. 2020, pp. 1324–1330.
The eighth method is that of Z. Jia, Y. Lin, J. Wang, X. Ning, Y. He, R. Zhou, Y. Zhou, and H. L. Li-wei, "Multi-view spatial-temporal graph convolutional networks with domain generalization for sleep stage classification," IEEE Trans. Neural Syst. Rehabil. Eng., vol. 29, pp. 1977–1986, 2021.
In alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flow charts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations or logic flows presented by the present invention. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of larger operations are performed independently.
Furthermore, although the present invention is described in the context of functional modules, it should be understood that, unless otherwise stated to the contrary, one or more of the described functions and/or features may be integrated in a single physical device and/or software module, or one or more functions and/or features may be implemented in a separate physical device or software module. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary for an understanding of the present invention. Rather, the actual implementation of the various functional blocks in the apparatus disclosed in the present invention will be understood within the ordinary skill of an engineer in view of the attributes, functionality, and internal relationship of the blocks. Accordingly, those skilled in the art can, using ordinary skill, practice the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative of and not intended to limit the scope of the invention, which is to be determined from the appended claims along with their full scope of equivalents.
The functions may be stored in a computer-readable storage medium if they are implemented in the form of software functional units and sold or used as separate products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, or an optical disk, and various media capable of storing program codes.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the description of the specification, reference to the description of "one embodiment," "some embodiments," "an example," "a specific example," or "some examples" or the like means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, the schematic representations of the terms used above do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined in the appended claims.

Claims (10)

1. A sleep staging method based on multi-modal physiological signal fusion is characterized by comprising the following steps:
performing down-sampling processing on the acquired multi-lead sleep monitoring signal to obtain a multi-lead signal;
performing short-time Fourier transform processing on the multi-lead signals, converting to obtain a time-frequency graph, extracting lead characteristics in the multi-lead signals by using one-dimensional convolution, and generating a machine self-learning graph according to the lead characteristics;
respectively combining the time-frequency graph and the machine self-learning graph with time-domain characteristics of adjacent sleep stages to obtain a frequency domain-time domain fusion characteristic and a space domain-time domain fusion characteristic;
and performing multi-view feature fusion on the frequency domain-time domain fusion features and the space domain-time domain fusion features to obtain a sleep stage result.
2. The sleep staging method based on multi-modal physiological signal fusion as claimed in claim 1, wherein the short-time fourier transform processing of the multi-lead signals to obtain a time-frequency diagram comprises:
configuring a sampling frequency and an STFT window width;
according to the configured sampling frequency and the STFT window width, performing a short-time Fourier transform on each 30-second signal segment to obtain an STFT image corresponding to each lead signal;
stacking the STFT images of all the lead signals to obtain a time-frequency diagram;
and when the window width intercepted in the short-time Fourier transform process is less than 100, filling 0 at two ends of the time-frequency diagram.
3. The sleep staging method based on multi-modal physiological signal fusion as claimed in claim 1, wherein the generating of the machine self-learning map from the lead characteristics comprises:
viewing each lead as each node in the graph; the node characteristics are formed by a characteristic matrix extracted from an original signal by two one-dimensional convolution kernels;
and generating a self-learning diagram according to the extracted characteristic matrix, and further forming a physiological structure relationship diagram among different signal leads.
4. The sleep staging method based on multi-modal physiological signal fusion as claimed in claim 1, wherein the time-frequency graph and the machine self-learning graph are respectively combined with the time-domain features of the adjacent sleep stages to obtain a frequency-domain-time-domain fusion feature and a spatial-time-domain fusion feature, and the method comprises:
extracting the characteristics of the time-frequency graph by using frequency domain convolution, and extracting frequency domain characteristics;
extracting the features of the self-learning graph by using space domain convolution, and extracting the spatial features among human physiological structures;
respectively extracting time domain characteristics of the time-frequency graph and the self-learning graph;
and performing multi-view feature fusion according to the extracted frequency domain features, spatial features and time domain features to obtain frequency domain-time domain fusion features and space domain-time domain fusion features.
5. The sleep staging method based on multi-modal physiological signal fusion as claimed in claim 4, wherein the extracting the features of the time-frequency diagram by using frequency domain convolution comprises:
extracting the characteristics of the time-frequency diagram through a VGG-16 network to obtain frequency domain characteristics;
wherein the VGG-16 network comprises 5 convolutional layers, 3 fully-connected layers and 1 SoftMax output layer; max pooling is applied between layers, the ReLU activation function activates all hidden layers, and the resulting features are input into a 128-dimensional fully-connected layer.
6. The sleep staging method based on multi-modal physiological signal fusion according to claim 4, wherein the extracting the spatial features between the human physiological structures by using spatial convolution to extract the features of the self-learning graph comprises:
adding a space attention mechanism on the self-learning graph, capturing a topological structure in the self-learning graph by utilizing the convolution of the Chebyshev graph, and extracting space characteristics;
and the adjacency matrix and the spatial attention matrix learned in the Chebyshev graph convolution process are used for dynamically adjusting the node updates.
7. The sleep staging method based on multi-modal physiological signal fusion as claimed in claim 4, wherein the extracting the time domain features of the time frequency graph and the self-learning graph respectively comprises:
inputting the currently extracted frequency feature, together with the feature segments before and after it, into a GRU network;
inputting the output of each sequence into an attention network, learning the weight of each sequence, and fusing the characteristics of the five sequences into a 256-dimensional time characteristic;
combining the fused features with the features of the current sleep stage, and inputting the combined features into a 128-dimensional full connection layer;
adding a time attention mechanism to the structural graphs of the current record and the two records before and after it, and applying a temporal convolution to obtain an attention matrix;
and carrying out normalization processing on the attention matrix by utilizing a Softmax operation to obtain time domain characteristics.
8. A sleep staging system based on multimodal physiological signal fusion, comprising:
the first module is used for performing down-sampling processing on the acquired multi-lead sleep monitoring signal to obtain a multi-lead signal;
the second module is used for carrying out short-time Fourier transform processing on the multi-lead signals, converting the multi-lead signals to obtain a time-frequency graph, extracting lead characteristics in the multi-lead signals by using one-dimensional convolution, and then generating a machine self-learning graph according to the lead characteristics;
the third module is used for respectively combining the time-frequency graph and the machine self-learning graph with the time-domain characteristics of adjacent sleep stages to obtain a frequency domain-time domain fusion characteristic and a space domain-time domain fusion characteristic;
and the fourth module is used for carrying out multi-view feature fusion on the frequency domain-time domain fusion features and the space domain-time domain fusion features to obtain a sleep stage result.
9. An electronic device comprising a processor and a memory;
the memory is used for storing programs;
the processor executing the program realizes the method of any one of claims 1 to 7.
10. A computer-readable storage medium, characterized in that the storage medium stores a program, which is executed by a processor to implement the method according to any one of claims 1 to 7.
CN202210675112.1A 2022-06-15 2022-06-15 Sleep staging method and system based on multi-modal physiological signal fusion Pending CN115349821A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210675112.1A CN115349821A (en) 2022-06-15 2022-06-15 Sleep staging method and system based on multi-modal physiological signal fusion

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210675112.1A CN115349821A (en) 2022-06-15 2022-06-15 Sleep staging method and system based on multi-modal physiological signal fusion

Publications (1)

Publication Number Publication Date
CN115349821A true CN115349821A (en) 2022-11-18

Family

ID=84030040

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210675112.1A Pending CN115349821A (en) 2022-06-15 2022-06-15 Sleep staging method and system based on multi-modal physiological signal fusion

Country Status (1)

Country Link
CN (1) CN115349821A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115969329A (en) * 2023-02-08 2023-04-18 长春理工大学 Sleep staging method, system, device and medium


Similar Documents

Publication Publication Date Title
Liu et al. Deep learning in ECG diagnosis: A review
Yang et al. A single-channel EEG based automatic sleep stage classification method leveraging deep one-dimensional convolutional neural network and hidden Markov model
Ganapathy et al. Deep learning on 1-D biosignals: a taxonomy-based survey
Yuan et al. Muvan: A multi-view attention network for multivariate temporal data
Xiao et al. Follow the sound of children’s heart: a deep-learning-based computer-aided pediatric CHDs diagnosis system
US9949714B2 (en) Method, electronic apparatus, and computer readable medium of constructing classifier for disease detection
Chen et al. Multi-information fusion neural networks for arrhythmia automatic detection
CN110897639A (en) Electroencephalogram sleep staging method based on deep convolutional neural network
CN110801221B (en) Sleep apnea fragment detection equipment based on unsupervised feature learning
Mostafa et al. Multi-objective hyperparameter optimization of convolutional neural network for obstructive sleep apnea detection
CN110619322A (en) Multi-lead electrocardio abnormal signal identification method and system based on multi-flow convolution cyclic neural network
CN113095302B (en) Depth model for arrhythmia classification, method and device using same
CN116072265B (en) Sleep stage analysis system and method based on convolution of time self-attention and dynamic diagram
Güler et al. Two-stage classification of respiratory sound patterns
Parsa et al. Staged inference using conditional deep learning for energy efficient real-time smart diagnosis
CN113925459A (en) Sleep staging method based on electroencephalogram feature fusion
Zhang et al. Competition convolutional neural network for sleep stage classification
Liang et al. Obstructive sleep apnea detection using combination of CNN and LSTM techniques
Prakash et al. A system for automatic cardiac arrhythmia recognition using electrocardiogram signal
Katsaouni et al. Energy efficient convolutional neural networks for arrhythmia detection
Chen et al. RAFNet: Restricted attention fusion network for sleep apnea detection
Liao et al. Recognizing diseases with multivariate physiological signals by a DeepCNN-LSTM network
CN115349821A (en) Sleep staging method and system based on multi-modal physiological signal fusion
Walther et al. A systematic comparison of deep learning methods for EEG time series analysis
Kutluana et al. Classification of cardiac disorders using weighted visibility graph features from ECG signals

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination