CN116584891A - Sleep apnea syndrome detection method based on multi-level feature fusion - Google Patents
- Publication number: CN116584891A (application CN202310353366A)
- Authority
- CN
- China
- Prior art keywords
- signal
- respiratory
- cross
- sleep apnea
- signals
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- A61B5/4818 — Sleep apnoea (under A61B5/00 Measuring for diagnostic purposes; A61B5/48 Other medical applications; A61B5/4806 Sleep evaluation)
- A61B5/7235 — Details of waveform analysis (under A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes)
- A61B5/7257 — Details of waveform analysis characterised by using Fourier transforms
- A61B5/7264 — Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
- Y02A90/10 — Information and communication technologies [ICT] supporting adaptation to climate change
Abstract
The invention discloses a sleep apnea syndrome detection method based on multi-level feature fusion. The method comprises the following steps: acquiring different respiratory signal data of the human body to be detected and preprocessing the data; inputting the preprocessed data into a TF-Res2Net module to acquire multi-scale spatial features of the time domain and frequency domain of the respiratory signals; inputting these features into a mixed attention module to acquire the correlation features within each respiratory signal and the cross features between different respiratory signals; connecting these features and inputting them into an adaptive fusion module; and outputting the detection result of sleep apnea syndrome through a fully connected layer with a Softmax activation function. The invention automatically integrates the multi-scale time-frequency domain features of the different respiratory signals of the human body, the correlation features within a single respiratory signal and the cross features between different respiratory signals, filters redundant features between signals, strengthens the generalization capability of the model and improves the efficiency of sleep apnea syndrome detection.
Description
Technical Field
The invention relates to the field of sleep monitoring, in particular to a sleep apnea syndrome detection method based on multi-level feature fusion.
Background
Sleep apnea syndrome (Sleep Apnea Syndrome, SAS) is one of the most common sleep-disordered breathing diseases, characterized by periodic reduction (hypopnea) or cessation (apnea) of airflow during sleep. Currently, about 10% of middle-aged people worldwide are diagnosed with sleep apnea syndrome, and the incidence rises year by year. Polysomnography (PSG) is the gold standard for diagnosing SAS; however, because it is expensive, requires long-term monitoring of patients, and involves tedious data recording and difficult interpretation, a large number of potential patients neither seek professional treatment nor are diagnosed in time, leading to the threat of complications such as daytime sleepiness, cardiovascular disease and cognitive dysfunction. Therefore, the detection of SAS is of great significance for safeguarding human health and preventing related complications.
To improve patient convenience and reduce costs, various physiological signals have been studied to diagnose SAS instead of PSG; among them, respiratory signals are the signals most directly related to respiratory dynamics. Existing respiratory-signal-based SAS detection methods fall into traditional machine learning analysis methods and deep neural network analysis methods. Traditional machine learning methods rely on extensive manually designed feature engineering. Deep neural network models can extract signal features automatically, but existing models are mainly based on convolutional neural networks and recurrent neural networks: the convolutional neural network extracts spatial features of the time domain and the recurrent neural network extracts temporal features of the time domain, so the frequency domain features of the respiratory signals cannot be used effectively, and both the frequency domain features of the respiratory signals and the dependency relationships among different respiratory signals are difficult to capture, which limits the detection effect of sleep apnea syndrome to a certain extent. For example:
After simple concatenation of the different respiratory signals, the input is fed to a model containing six convolutional layers, three max pooling layers and one fully connected layer for sleep apnea syndrome detection (R. Haidar, S. McCloskey, I. Koprinska and B. Jeffries, "Convolutional Neural Networks on Multiple Respiratory Channels to Detect Hypopnea and Obstructive Apnea Events," 2018 International Joint Conference on Neural Networks (IJCNN), Rio de Janeiro, Brazil, 2018, pp. 1-7, doi: 10.1109/IJCNN.2018.8489248). The disadvantage of this method is that only the time domain features of the respiratory signals are extracted and the relationships between different respiratory signals are not considered, so the method has great limitations.
Disclosure of Invention
The invention aims to provide a sleep apnea syndrome detection method based on multi-level feature fusion, which mainly solves the problems of insufficient feature extraction capability and limited detection effect of the existing method. The method can effectively extract the time-frequency domain characteristics of different respiratory signals and the potential dependency relationship among different respiratory signals, is helpful for detecting sleep apnea syndrome, and improves the detection effect of the sleep apnea syndrome.
The invention is realized at least by one of the following technical schemes.
A sleep apnea syndrome detection method based on multi-level feature fusion comprises the following steps:
s1, acquiring different respiratory signal data of a human body to be detected, and preprocessing to obtain preprocessed data;
s2, inputting the preprocessed data into a TF-Res2Net module to acquire spatial features of different scales of a time domain and a frequency domain of the respiratory signal;
s3, inputting the time-frequency domain characteristics of the respiratory signals into a Mixed attention module (Mixed-Attn-Block) so as to obtain the correlation characteristics inside different respiratory signals and the cross characteristics among different respiratory signals;
s4, connecting the correlation characteristics in different respiratory signals with the cross characteristics among different respiratory signals, inputting the connection characteristics into the self-adaptive fusion module, and outputting a detection result of sleep apnea syndrome through a full-connection layer with an activation function of Softmax.
Further, in step S1, the different respiratory signals of the human body include a thoracic activity signal, i.e. a first respiratory signal, an abdominal activity signal, i.e. a second respiratory signal, and an oronasal respiratory airflow, i.e. a third respiratory signal.
Further, in step S1, the data preprocessing step includes:
performing z-score normalization on different respiratory signals of a human body;
uniformly dividing the normalized respiratory signals of the human body into non-overlapping segments of at least 10 seconds in length;
marking each respiratory signal segment with a sleep apnea category: if the segment contains a hypopnea event, obstructive sleep apnea event or central sleep apnea event lasting more than 10 seconds, the segment is marked as hypopnea, obstructive sleep apnea or central sleep apnea, respectively; otherwise, the segment is marked as normal.
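The preprocessing above can be sketched in a few lines of numpy. This is an illustrative sketch only: the function names (`zscore`, `segment`, `label_segment`) are my own, the breathing trace is synthetic, and the labeling helper assumes the event duration within the segment is already known.

```python
import numpy as np

def zscore(signal):
    # z-score normalization: zero mean, unit variance over the recording
    return (signal - signal.mean()) / signal.std()

def segment(signal, fs, seg_seconds):
    # split a 1-D signal into non-overlapping segments of seg_seconds
    seg_len = fs * seg_seconds
    n = len(signal) // seg_len
    return signal[: n * seg_len].reshape(n, seg_len)

def label_segment(event_duration_s, event_type):
    # a segment carries an apnea label only if the event inside it
    # lasts more than 10 seconds; otherwise it is labelled "normal"
    return event_type if event_duration_s > 10 else "normal"

# toy example: 90 s of a synthetic breathing trace sampled at 32 Hz
fs = 32
t = np.arange(90 * fs) / fs
breath = np.sin(2 * np.pi * 0.25 * t) + 0.1 * np.random.randn(len(t))
segs = segment(zscore(breath), fs, 30)   # three 30-second segments
```

With 30-second segments (as in the embodiment below), each row of `segs` holds 30 s × 32 Hz = 960 samples.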
Further, the TF-Res2Net module includes a time domain branch and a frequency domain branch, the frequency domain branch serving as a residual of the time domain branch, specifically as follows:

Let the preprocessed first respiratory signal segment be S:

S = (p_1, p_2, …, p_i, …, p_w)

where p_i is the value at time point i in the respiratory signal segment and w is the number of time points in the segment.

In the time domain branch, the first respiratory signal segment S is input into a one-dimensional convolution layer and then divided evenly into scale parts, where the specific value of scale can be set manually by weighing factors such as model complexity and model performance; the result is denoted S′ = (s_1, s_2, …, s_i, …, s_scale), where S′ represents the segmented first respiratory signal segment. For i ≠ 1, each feature subset s_i, after an additional convolution operation, enters the convolution of the next feature subset s_{i+1} as a residual:

z_1 = s_1
z_i = Con(s_i + z_{i-1}), i = 2, …, scale

where z_i is the output of feature subset s_i after the additional convolution operation and Con(·) is a one-dimensional convolution operation. The outputs of all feature subsets are concatenated and input into a one-dimensional convolution layer as the output Z_T of the time domain branch:

Z_T = Con(z_1 ⊕ z_2 ⊕ … ⊕ z_scale)

where ⊕ is the concatenation operation.

In the frequency domain branch, the first respiratory signal segment S undergoes a discrete Fourier transform (Discrete Fourier Transform, DFT); the real part is retained and denoted D, and the segmentation, convolution, concatenation and convolution operations of the time domain branch are repeated to obtain the output Z_F of the frequency domain branch:

D = DFT(S).Real
D′ = (d_1, d_2, …, d_i, …, d_scale)
z′_1 = d_1
z′_i = Con(d_i + z′_{i-1}), i = 2, …, scale
Z_F = Con(z′_1 ⊕ z′_2 ⊕ … ⊕ z′_scale)

where D′ represents the segmented frequency domain representation of the first respiratory signal, d_i is its i-th feature subset, and z′_i is the output of feature subset d_i after the additional convolution operation.

The output Z_T of the time domain branch and the output Z_F of the frequency domain branch are added and input into a max pooling layer (MaxPool) to obtain the output X of the TF-Res2Net module:

X = MaxPool(Z_T + Z_F).
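The data flow of the TF-Res2Net module can be sketched with numpy stand-ins. This is a shape-level illustration, not the patented module: the fixed smoothing kernel replaces learned convolution weights, and the function names are my own assumptions.

```python
import numpy as np

def conv1d(x, k):
    # stand-in for a learned one-dimensional convolution ('same' padding)
    return np.convolve(x, k, mode="same")

def branch(x, scale, k):
    # Res2Net-style hierarchy: split into `scale` subsets; each subset
    # after the first is summed with the previous output before its conv
    subsets = np.split(x, scale)
    outs = [subsets[0]]                       # z_1 = s_1
    for s in subsets[1:]:
        outs.append(conv1d(s + outs[-1], k))  # z_i = Con(s_i + z_{i-1})
    return conv1d(np.concatenate(outs), k)    # combine, then conv again

def tf_res2net(segment, scale=4, k=np.array([0.25, 0.5, 0.25])):
    z_t = branch(conv1d(segment, k), scale, k)  # time-domain branch
    d = np.fft.fft(segment).real                # DFT, keep the real part
    z_f = branch(d, scale, k)                   # frequency-domain branch
    summed = z_t + z_f                          # frequency branch as residual
    # max pooling with window 3 and stride 3
    n = len(summed) // 3
    return summed[: n * 3].reshape(n, 3).max(axis=1)

x = tf_res2net(np.random.randn(960))  # one 30 s segment at 32 Hz
```

A 960-point segment pooled with window 3 and stride 3 yields a 320-point feature vector per channel.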
Further, in step S3, the mixed attention module Mixed-Attn-Block includes a single-signal self-attention module and an inter-signal multi-head cross-attention module, where the single-signal self-attention module is implemented by a graph attention network:

A1. Assume the output of the first respiratory signal segment through the TF-Res2Net module is X_A = {a_1, a_2, …, a_i, …, a_l}, used as the input of the single-signal self-attention module, where l is the number of time points in X_A and a_i is the vector representation of time point i in X_A;

A2. Compute the attention weight w_ij of time point j to time point i in X_A:

w_ij = softmax_j(ψ(w^T (a_i ⊕ a_j)))

where w^T is the weight matrix shared within X_A, ψ(·) is a nonlinear activation function and ⊕ is the concatenation operation;

A3. Weight the inputs a_i of the single-signal self-attention module to obtain the enhanced matrix representation A′ that fuses the correlation features within the single signal:

a′_i = σ(Σ_{j∈N_i} w_ij a_j)
A′ = {a′_1, a′_2, …, a′_i, …, a′_l}

where A′ is the matrix representation of the first respiratory signal, N_i is the set of neighbor nodes of time point i, σ is the activation function, and a′_i is the vector representation of time point i after fusing the correlation features between time point i and its neighbor nodes j;

A4. Assume the outputs of the second and third respiratory signals through the TF-Res2Net module are X_B = {b_1, b_2, …, b_i, …, b_l} and X_C = {c_1, c_2, …, c_i, …, c_l}, respectively; repeat steps A1 to A3 to obtain the matrix representation B′ of the second respiratory signal and the matrix representation C′ of the third respiratory signal.
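Steps A1-A3 amount to a graph-attention update over time points. The sketch below uses random weights in place of trained ones, tanh as the nonlinearity, and a symmetric neighbor window as one plausible reading of the neighbor set N_i; all of these are assumptions for illustration.

```python
import numpy as np

def softmax(v):
    e = np.exp(v - v.max())
    return e / e.sum()

def self_attention(X, w, window=10):
    # X: (l, d) time-point embeddings; w: shared weight vector applied
    # to the concatenation [a_i ; a_j] (GAT-style score, steps A2-A3)
    l, d = X.shape
    out = np.zeros_like(X)
    for i in range(l):
        nbrs = range(max(0, i - window), min(l, i + window + 1))
        scores = np.array(
            [np.tanh(w @ np.concatenate([X[i], X[j]])) for j in nbrs]
        )
        alpha = softmax(scores)                      # w_ij over neighbors
        out[i] = np.tanh(sum(a * X[j] for a, j in zip(alpha, nbrs)))
    return out                                       # A' row by row

rng = np.random.default_rng(0)
X = rng.standard_normal((30, 8))                     # toy TF-Res2Net output
A_prime = self_attention(X, rng.standard_normal(16))
```

Each output row a′_i is a tanh-squashed, attention-weighted mixture of the neighbors of time point i, so its entries stay within (-1, 1).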
Further, the inter-signal multi-head cross-attention module includes the following operations:

B1. Compute the correlation matrix Z_ab of the matrix representation A′ of the first respiratory signal and the matrix representation B′ of the second respiratory signal according to a multi-head cross-attention mechanism:

Q = A′, K = V = B′
Attention(Q, K, V) = softmax(Q K^T / √d_k) V
h_j = Attention(Q W_j^Q, K W_j^K, V W_j^V)
Z_ab = h_1 ⊕ h_2 ⊕ … ⊕ h_m

where Q is the matrix representation of the querying signal, K is the matrix representation of the relationship between the queried signal and the querying signal, V is the matrix representation of the queried signal, d_k is the dimension of the relationship matrix K, W_j^Q, W_j^K and W_j^V are randomly initialized parameter matrices, h_j is the j-th single-head cross-attention representation, and m is the number of heads in the multi-head cross-attention mechanism;

B2. Add the matrix representation A′ of the first respiratory signal to the correlation matrix Z_ab, perform layer normalization, input the result into a fully connected layer and perform layer normalization again to obtain the cross-signal matrix representation Cross_ab:

Cross_ab = LayerNorm(FC(LayerNorm(A′ + Z_ab)))

B3. Exchange the querying and queried signals, and compute the correlation matrix Z_ba of the matrix representation B′ of the second respiratory signal and the matrix representation A′ of the first respiratory signal according to the multi-head cross-attention mechanism:

Q = B′, K = V = A′
h_j = Attention(Q W_j^Q, K W_j^K, V W_j^V)
Z_ba = h_1 ⊕ h_2 ⊕ … ⊕ h_m

B4. Add the matrix representation B′ of the second respiratory signal to the correlation matrix Z_ba, perform layer normalization, input the result into a fully connected layer and perform layer normalization again to obtain the cross-signal matrix representation Cross_ba of the second and first respiratory signals:

Cross_ba = LayerNorm(FC(LayerNorm(B′ + Z_ba)))

B5. Repeat steps B1 to B4 to obtain the cross-signal matrix representations Cross_bc (second and third respiratory signals), Cross_cb (third and second), Cross_ac (first and third) and Cross_ca (third and first).
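Steps B1-B2 can be sketched as scaled dot-product cross attention in numpy. Random projection matrices stand in for the trained W_j^Q, W_j^K, W_j^V, the fully connected layer of step B2 is omitted, and all names are my own; treat this as a shape-level illustration under those assumptions.

```python
import numpy as np

def sm(z):
    # row-wise softmax
    e = np.exp(z - z.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

def cross_attention(A, B, m=4, seed=0):
    # A queries B: Q = A', K = V = B' (step B1); one projection per head
    l, d = A.shape
    dk = d // m
    rng = np.random.default_rng(seed)
    heads = []
    for _ in range(m):
        Wq, Wk, Wv = (rng.standard_normal((d, dk)) / np.sqrt(d)
                      for _ in range(3))
        Q, K, V = A @ Wq, B @ Wk, B @ Wv
        heads.append(sm(Q @ K.T / np.sqrt(dk)) @ V)   # Attention(Q, K, V)
    return np.concatenate(heads, axis=-1)             # Z_ab = h_1 ⊕ … ⊕ h_m

def layer_norm(x):
    return (x - x.mean(-1, keepdims=True)) / (x.std(-1, keepdims=True) + 1e-6)

rng = np.random.default_rng(1)
A, B = rng.standard_normal((30, 8)), rng.standard_normal((30, 8))
Z_ab = cross_attention(A, B)
Cross_ab = layer_norm(layer_norm(A + Z_ab))  # step B2, FC layer omitted
```

Swapping the arguments (`cross_attention(B, A)`) gives Z_ba as in step B3; repeating over the three signal pairs yields the six cross representations of step B5.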
Further, in step S4, the adaptive fusion module obtains the refined features F for subsequent classification by balancing information complementation and information redundancy between features of different levels:

F_1 = W_11 ⊙ A′ + W_12 ⊙ Cross_ab + W_13 ⊙ Cross_ba + W_14 ⊙ B′
F_2 = W_21 ⊙ B′ + W_22 ⊙ Cross_bc + W_23 ⊙ Cross_cb + W_24 ⊙ C′
F_3 = W_31 ⊙ A′ + W_32 ⊙ Cross_ac + W_33 ⊙ Cross_ca + W_34 ⊙ C′

where W_eg are trainable parameters, e ∈ {1, 2, 3}, g ∈ {1, 2, 3, 4}, and ⊙ is the element-wise product.
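The fusion is an element-wise weighted sum of four feature maps per output. The sketch below uses random placeholders for the trainable weights W_eg; shapes and variable names are illustrative assumptions.

```python
import numpy as np

def fuse(feature_maps, weights):
    # F_e = sum_g W_eg ⊙ feature_g (element-wise product and sum);
    # in the patented method the W_eg are learned, here they are random
    return sum(w * f for w, f in zip(weights, feature_maps))

rng = np.random.default_rng(2)
shape = (30, 8)  # toy (time points, feature dim)
A_p, Cross_ab, Cross_ba, B_p = (rng.standard_normal(shape) for _ in range(4))
W = [rng.standard_normal(shape) for _ in range(4)]  # placeholder W_1g
F1 = fuse([A_p, Cross_ab, Cross_ba, B_p], W)        # refined feature F_1
```

F_2 and F_3 follow the same pattern over the other signal pairs; the three refined features are then concatenated and passed to the Softmax classifier of step S4.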
Further, in step S4, the sleep apnea syndrome detection result of the respiratory signal segment includes normal, hypopnea, obstructive sleep apnea, and central sleep apnea, and the loss function adopted during training is cross entropy.
Compared with the prior art, the invention has the beneficial effects that:
(1) The chest activity signal, the abdominal activity signal and the oronasal respiratory airflow are used instead of PSG to detect sleep apnea syndrome, without requiring experts to manually select features embodying domain expertise, which reduces detection cost and improves convenience;
(2) Time domain spatial features of the respiratory signals are extracted with a convolutional neural network, the signals are extended to the frequency domain via the Fourier transform for analysis, and respiratory signal features from different angles are fused, increasing the generalization capability of the model;
(3) The internal associations of each respiratory signal are extracted by the single-signal self-attention module, the potential dependency relationships between different respiratory signals are extracted by the inter-signal cross-attention module, the respiratory signal features of different levels are adaptively fused, and the complementarity among multi-level features is exploited, improving the detection efficiency of sleep apnea syndrome.
Drawings
FIG. 1 is a flow chart of a sleep apnea syndrome detection method based on multi-level feature fusion, according to an embodiment of the present invention;
FIG. 2 is a schematic diagram of a TF-Res2Net module shown in an embodiment of the present invention;
fig. 3 is a schematic diagram of a hybrid attention module shown in an embodiment of the invention.
Detailed Description
The present invention will be described in further detail with reference to examples and drawings, but embodiments of the present invention are not limited thereto.
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. The implementations described in the following exemplary examples do not represent all implementations consistent with the invention. Rather, they are merely examples of apparatus and methods consistent with aspects of the invention as detailed in the accompanying claims.
The sleep apnea syndrome detection method based on multi-level feature fusion as shown in fig. 1 comprises the following steps:
s1, acquiring different respiratory signal data of a human body to be detected, and preprocessing to obtain preprocessed data;
the data used in this example was from 1000 polysomnography recordings longer than 6 hours in the Multi-Ethnic Study of Atherosclerosis (MESA) dataset, with a sampling frequency of the respiratory signals of 32Hz.
The different respiratory signals of the human body comprise a chest activity signal, namely a first respiratory signal, an abdomen activity signal, namely a second respiratory signal, and an oronasal respiratory airflow, namely a third respiratory signal;
in one embodiment, the data preprocessing step includes:
-z-score normalization of different respiratory signals of the human body;
dividing the standardized different respiratory signals of the human body into non-overlapping segments with the length of 30 seconds;
marking each respiratory signal segment with a sleep apnea category: if the segment contains a hypopnea event, obstructive sleep apnea event or central sleep apnea event lasting more than 10 seconds, the segment is marked as hypopnea, obstructive sleep apnea or central sleep apnea, respectively; otherwise, the segment is marked as normal.
S2, inputting the preprocessed data into a TF-Res2Net module to acquire spatial features of different scales of a time domain and a frequency domain of a respiratory signal;
The TF-Res2Net module, as shown in fig. 2, includes a time domain branch and a frequency domain branch, where the frequency domain branch serves as a residual of the time domain branch:

Assume the preprocessed first respiratory signal segment is S:

S = (p_1, p_2, …, p_i, …, p_960)

where p_i is the value at time point i in the respiratory signal segment (a 30-second segment sampled at 32 Hz contains 960 time points);

In one embodiment, after the respiratory signal segment S is input into the one-dimensional convolution layer in the time domain branch, it is divided evenly into 4 parts, denoted S′ = (s_1, s_2, s_3, s_4), where S′ represents the segmented first respiratory signal segment. For i ≠ 1, each feature subset s_i, after an additional convolution operation, enters the convolution of the next feature subset s_{i+1} as a residual:

z_1 = s_1
z_i = Con(s_i + z_{i-1}), i = 2, 3, 4

where z_i is the output of feature subset s_i after the additional convolution operation and Con(·) is a one-dimensional convolution operation with a 1×3 convolution kernel. The outputs of all feature subsets are concatenated and input into a one-dimensional convolution layer as the output Z_T of the time domain branch:

Z_T = Con(z_1 ⊕ z_2 ⊕ z_3 ⊕ z_4)

where ⊕ is the concatenation operation.

A discrete Fourier transform (Discrete Fourier Transform, DFT) is applied to the respiratory signal segment S in the frequency domain branch; the real part is retained and denoted D, and the segmentation, convolution, concatenation and convolution operations of step S2 are repeated to obtain the output Z_F of the frequency domain branch:

D = DFT(S).Real
D′ = (d_1, d_2, d_3, d_4)
z′_1 = d_1
z′_i = Con(d_i + z′_{i-1}), i = 2, 3, 4
Z_F = Con(z′_1 ⊕ z′_2 ⊕ z′_3 ⊕ z′_4)

where D′ represents the segmented frequency domain representation of the first respiratory signal, d_i is its i-th feature subset, and z′_i is the output of feature subset d_i after the additional convolution operation.

The outputs Z_T and Z_F of the two branches are added and input into a max pooling layer (MaxPool) with pooling size 3 and stride 3 to obtain the output X of the TF-Res2Net module:

X = MaxPool(Z_T + Z_F)
S3. The time-frequency domain features of the respiratory signals are input into the mixed attention module (Mixed-Attn-Block) shown in fig. 3 to acquire the correlation features within each respiratory signal and the cross features between different respiratory signals.

The mixed attention module Mixed-Attn-Block comprises a single-signal self-attention module and an inter-signal multi-head cross-attention module, where the single-signal self-attention module is implemented by a graph attention network:

A1. Assume the output of the first respiratory signal segment through the TF-Res2Net module is X_A = {a_1, a_2, …, a_i, …, a_l}, used as the input of the single-signal self-attention module, where l is the number of time points in X_A and a_i is the vector representation of time point i in X_A;

A2. Compute the attention weight w_ij of time point j to time point i in X_A:

w_ij = softmax_j(LeakyReLU(w^T (a_i ⊕ a_j)))

where w^T is the weight matrix shared within X_A, LeakyReLU is a nonlinear activation function, and ⊕ is the concatenation operation;

A3. Weight the inputs a_i of the single-signal self-attention module to obtain the enhanced matrix representation A′ that fuses the correlation features within the single signal:

a′_i = σ(Σ_{j∈N_i} w_ij a_j)
A′ = {a′_1, a′_2, …, a′_i, …, a′_l}

where A′ is the matrix representation of the first respiratory signal, N_i is the set of neighbor nodes of time point i, σ is the activation function, and a′_i is the vector representation of time point i after fusing the correlation features between time point i and its neighbor nodes j. In this embodiment, since the diagnostic criterion for sleep apnea syndrome is delimited by 10 seconds, the number of neighbor nodes N_i of time point i is set to 10;

A4. Assume the outputs of the second and third respiratory signals through the TF-Res2Net module are X_B = {b_1, b_2, …, b_i, …, b_l} and X_C = {c_1, c_2, …, c_i, …, c_l}, respectively; repeat steps A1 to A3 to obtain the matrix representation B′ of the second respiratory signal and the matrix representation C′ of the third respiratory signal.
The inter-signal multi-head cross-attention module includes the following operations:

B1. Compute the correlation matrix Z_ab of the matrix representation A′ of the first respiratory signal and the matrix representation B′ of the second respiratory signal according to the multi-head cross-attention mechanism:

Q = A′, K = V = B′
Attention(Q, K, V) = softmax(Q K^T / √d_k) V
h_j = Attention(Q W_j^Q, K W_j^K, V W_j^V)
Z_ab = h_1 ⊕ h_2 ⊕ … ⊕ h_m

where Q is the matrix representation of the querying signal, K is the matrix representation of the relationship between the queried signal and the querying signal, V is the matrix representation of the queried signal, d_k is the dimension of the relationship matrix K, W_j^Q, W_j^K and W_j^V are randomly initialized parameter matrices, h_j is the j-th single-head cross-attention representation, and m is the number of heads in the multi-head cross-attention mechanism. In one embodiment, m = 8;

B2. Add the matrix representation A′ of the first respiratory signal to the correlation matrix Z_ab, perform layer normalization, input the result into a fully connected layer and perform layer normalization again to obtain the cross-signal matrix representation Cross_ab:

Cross_ab = LayerNorm(FC(LayerNorm(A′ + Z_ab)))

B3. Exchange the querying and queried signals, and compute the correlation matrix Z_ba of the matrix representation B′ of the second respiratory signal and the matrix representation A′ of the first respiratory signal according to the multi-head cross-attention mechanism:

Q = B′, K = V = A′
h_j = Attention(Q W_j^Q, K W_j^K, V W_j^V)
Z_ba = h_1 ⊕ h_2 ⊕ … ⊕ h_m

B4. Add the matrix representation B′ of the second respiratory signal to the correlation matrix Z_ba, perform layer normalization, input the result into a fully connected layer and perform layer normalization again to obtain the cross-signal matrix representation Cross_ba of the second and first respiratory signals:

Cross_ba = LayerNorm(FC(LayerNorm(B′ + Z_ba)))

B5. Repeat steps B1 to B4 to obtain the cross-signal matrix representations Cross_bc (second and third respiratory signals), Cross_cb (third and second), Cross_ac (first and third) and Cross_ca (third and first);
S4. Concatenate the association features within each respiratory signal and the cross features between different respiratory signals, feed the concatenated features into the adaptive fusion module, and output the sleep apnea syndrome detection result through a fully-connected layer with a Softmax activation function.
The adaptive fusion module obtains refined features F for subsequent classification by balancing information complementarity and information redundancy among features of different levels:
F_1 = W_11 ⊙ A' + W_12 ⊙ Cross_ab + W_13 ⊙ Cross_ba + W_14 ⊙ B'

F_2 = W_21 ⊙ B' + W_22 ⊙ Cross_bc + W_23 ⊙ Cross_cb + W_24 ⊙ C'

F_3 = W_31 ⊙ A' + W_32 ⊙ Cross_ac + W_33 ⊙ Cross_ca + W_34 ⊙ C'
where W_eg is a trainable parameter matrix, e ∈ {1, 2, 3}, g ∈ {1, 2, 3, 4}, and ⊙ denotes the element-wise (Hadamard) product.
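A minimal sketch of the element-wise weighted fusion above, shown for F_1; F_2 and F_3 follow the same pattern with their own weights and inputs. Passing the four trainable weights in as a sequence is an illustrative simplification.

```python
import numpy as np

def adaptive_fusion(A, Cross_ab, Cross_ba, B, W):
    """F_1 = W_11 ⊙ A' + W_12 ⊙ Cross_ab + W_13 ⊙ Cross_ba + W_14 ⊙ B'.

    W: sequence of four trainable weight arrays, each broadcastable to
    the shape of the feature matrices; * implements the element-wise
    product ⊙.
    """
    return W[0] * A + W[1] * Cross_ab + W[2] * Cross_ba + W[3] * B
```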
The sleep apnea syndrome detection result of a respiratory signal segment is one of normal, hypopnea, obstructive sleep apnea and central sleep apnea; the loss function used during training is cross entropy.
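The four-class Softmax output and the cross-entropy loss can be sketched as follows; the class ordering in `CLASSES` is an illustrative assumption.

```python
import numpy as np

# Assumed class order; the patent lists these four categories but fixes no order.
CLASSES = ["normal", "hypopnea", "obstructive apnea", "central apnea"]

def softmax(logits):
    """Numerically stable Softmax over the four class logits."""
    e = np.exp(logits - logits.max())
    return e / e.sum()

def cross_entropy_loss(logits, label_idx):
    """Cross entropy of one segment's Softmax output against its true class."""
    return -np.log(softmax(logits)[label_idx])
```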
In one embodiment, the model is evaluated by five-fold cross-validation with a 4:1 ratio of training set to validation set; the data comprise 1,085,692 normal breathing segments, 122,503 hypopnea segments, 31,252 obstructive sleep apnea segments and 2,876 central sleep apnea segments.
To verify its effectiveness, the sleep apnea syndrome detection method provided by the invention is compared with the following detection methods:
(1) The CNN detection method proposed by Rim Haidar et al. ["Convolutional Neural Networks on Multiple Respiratory Channels to Detect Hypopnea and Obstructive Apnea Events", International Joint Conference on Neural Networks, 2018: 1-7];
(2) The MLF2N detection method proposed by Xingfeng Lv et al. ["A Multi-level Features Fusion Network for Detecting Obstructive Sleep Apnea Hypopnea Syndrome", ICA3PP (3), 2020: 509-519].
Detection performance was evaluated with precision, recall and F1 score. The F1 score is a comprehensive indicator of model performance: the closer its value is to 1, the better the detection method performs. The means and standard deviations of the detection results after five-fold cross-validation are shown in Tables 1, 2, 3 and 4.
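The per-class metrics reported in the tables follow the usual definitions from event-level counts of true positives, false positives and false negatives; for reference:

```python
def precision_recall_f1(tp, fp, fn):
    """Per-class precision, recall and F1 score from event-level counts.

    F1 is the harmonic mean of precision and recall, so it summarises
    both error types in a single number between 0 and 1.
    """
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1
```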
Table 1 Performance (%)
Model | Precision | Recall | F1 score |
CNN | 93.4±0.2 | 97.0±0.3 | 95.2±0.1 |
MLF2N | 94.4±0.0 | 96.7±0.1 | 95.5±0.1 |
The method of the invention | 94.6±0.2 | 96.6±0.1 | 95.7±0.0 |
Table 2 Detection performance for hypopnea events (%)
Table 3 Detection performance for obstructive sleep apnea events on the MESA dataset (%)
Model | Precision | Recall | F1 score |
CNN | 73.7±1.7 | 62.6±2.4 | 67.6±0.9 |
MLF2N | 74.5±0.7 | 68.7±1.3 | 71.5±0.5 |
The method of the invention | 73.7±0.7 | 70.6±1.5 | 72.2±0.7 |
Table 4 Detection performance for central sleep apnea events (%)
Model | Precision | Recall | F1 score |
CNN | 62.4±4.5 | 18.0±5.1 | 27.3±5.7 |
MLF2N | 63.1±3.0 | 30.6±3.3 | 41.1±3.1 |
The method of the invention | 52.5±9.2 | 41.5±13.3 | 43.6±5.8 |
As shown in Tables 1-4, the F1 score of the method of the invention is higher than those of the existing methods for each type of sleep apnea event, indicating that the overall performance of the method of the invention is the best.
While embodiments of the present invention have been illustrated and described above, it will be appreciated that the above embodiments are illustrative and not to be construed as limiting the invention. Various equivalent modifications and substitutions will occur to those skilled in the art, and these are intended to be included within the scope of the present invention.
Claims (10)
1. A sleep apnea syndrome detection method based on multi-level feature fusion is characterized by comprising the following steps:
s1, acquiring different respiratory signal data of a human body to be detected, and preprocessing to obtain preprocessed data;
s2, inputting the preprocessed data into a TF-Res2Net module to acquire spatial features of different scales of a time domain and a frequency domain of the respiratory signal;
s3, inputting the time-frequency domain characteristics of the respiratory signals into a Mixed attention module (Mixed-Attn-Block) so as to obtain the correlation characteristics inside different respiratory signals and the cross characteristics among different respiratory signals;
s4, connecting the correlation characteristics in different respiratory signals with the cross characteristics among different respiratory signals, inputting the connection characteristics into the self-adaptive fusion module, and outputting a detection result of sleep apnea syndrome through a full-connection layer with an activation function of Softmax.
2. The method for detecting sleep apnea syndrome based on multi-level feature fusion according to claim 1, wherein in step S1, the different respiratory signals of the human body include a thoracic activity signal, i.e. a first respiratory signal, an abdominal activity signal, i.e. a second respiratory signal, and an oronasal respiratory airflow, i.e. a third respiratory signal.
3. The sleep apnea syndrome detection method based on multi-level feature fusion of claim 1, wherein in step S1, the data preprocessing step includes:
performing z-score normalization on different respiratory signals of a human body;
uniformly dividing the normalized respiratory signals into non-overlapping segments no shorter than 10 seconds;
labeling each respiratory signal segment with a sleep apnea category: if the segment contains a hypopnea event, an obstructive sleep apnea event or a central sleep apnea event lasting more than 10 seconds, the segment is labeled hypopnea, obstructive sleep apnea or central sleep apnea, respectively; otherwise the segment is labeled normal.
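By way of illustration and not limitation, the normalization and segmentation steps of claim 3 can be sketched as follows. The sampling rate `fs = 10` Hz and the 30 s segment length are illustrative assumptions; the claim only requires non-overlapping segments of at least 10 seconds.

```python
import numpy as np

def preprocess(signal, fs=10, seg_seconds=30):
    """z-score normalise a respiratory signal, then cut it into equal,
    non-overlapping segments.

    fs and seg_seconds are illustrative; any segment length of at
    least 10 s satisfies the claim. The ragged tail is dropped.
    """
    z = (signal - signal.mean()) / signal.std()   # z-score normalisation
    seg_len = fs * seg_seconds
    n_seg = len(z) // seg_len                     # number of whole segments
    return z[:n_seg * seg_len].reshape(n_seg, seg_len)
```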
4. The sleep apnea syndrome detection method based on multi-level feature fusion of claim 1, wherein the TF-Res2Net module includes a time domain branch and a frequency domain branch, the frequency domain branch being a residual of the time domain branch.
5. The method for detecting sleep apnea syndrome based on multi-level feature fusion according to claim 4, wherein the preprocessed first respiratory signal segment is denoted S:

S = (p_1, p_2, …, p_i, …, p_w)

where p_i is the value at time point i in the respiratory signal segment and w is the number of time points in the respiratory signal segment.
After the first respiratory signal segment S is input to a one-dimensional convolutional layer in the time-domain branch, it is divided evenly into scale parts, denoted S' = (s_1, s_2, …, s_i, …, s_scale), where S' is the partitioned first respiratory signal segment. When i ≠ 1, each feature subset s_i, after an additional convolution operation, enters the convolution operation of the next feature subset s_{i+1} as a residual:

z_1 = Con(s_1), z_i = Con(s_i + z_{i-1}) for i ≠ 1

where z_i is the output of feature subset s_i after the additional convolution operation and Con(·) is a one-dimensional convolution operation. The outputs of all feature subsets are concatenated and fed into a one-dimensional convolutional layer as the output Z_T of the time-domain branch:

Z_T = Con(z_1 ⊕ z_2 ⊕ … ⊕ z_scale)

where ⊕ denotes the join (concatenation) operation.
In the frequency-domain branch, the first respiratory signal segment S undergoes a discrete Fourier transform (DFT); the real part is retained and denoted D, and the partition, convolution, concatenation and convolution operations of the time-domain branch are repeated to obtain the output Z_F of the frequency-domain branch:

D = Real(DFT(S))

D' = (d_1, d_2, …, d_i, …, d_scale)

where D' is the partitioned frequency-domain segment of the first respiratory signal, d_i is the i-th feature subset in that segment, and z'_i is the output of the frequency-domain feature subset d_i after the additional convolution operation.
6. The sleep apnea syndrome detection method based on multi-level feature fusion according to claim 5, wherein the output Z_T of the time-domain branch and the output Z_F of the frequency-domain branch are added and fed into a max-pooling layer (MaxPool) to obtain the output X of the TF-Res2Net module:

X = MaxPool(Z_T + Z_F).
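By way of illustration and not limitation, a much simplified NumPy sketch of the TF-Res2Net computation of claims 4-6. A single channel, a fixed smoothing kernel standing in for the learned convolutions, a segment length divisible by scale, and a pool size of 2 are all illustrative assumptions.

```python
import numpy as np

def conv1d(x, kernel):
    """Same-length 1-D convolution (stands in for a learned conv layer)."""
    return np.convolve(x, kernel, mode="same")

def res2net_branch(x, scale=4, kernel=(0.25, 0.5, 0.25)):
    """Res2Net-style branch: split x into `scale` equal subsets; each
    subset after the first receives the previous subset's convolved
    output as a residual (len(x) divisible by scale is assumed)."""
    kernel = np.asarray(kernel)
    subsets = np.split(x, scale)
    outs, prev = [], None
    for s in subsets:
        z = conv1d(s if prev is None else s + prev, kernel)
        outs.append(z)
        prev = z                                    # residual for next subset
    return conv1d(np.concatenate(outs), kernel)     # concatenate, final conv

def tf_res2net(segment, scale=4):
    """Time branch on the raw segment, frequency branch on the real part
    of its DFT; add, then max-pool with an assumed pool size of 2."""
    z_t = res2net_branch(segment, scale)                     # Z_T
    z_f = res2net_branch(np.real(np.fft.fft(segment)), scale)  # Z_F
    x = z_t + z_f
    return x.reshape(-1, 2).max(axis=1)                      # MaxPool(Z_T + Z_F)
```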
7. The sleep apnea syndrome detection method based on multi-level feature fusion of claim 1, wherein in step S3 the mixed attention module Mixed-Attn-Block comprises a single-signal self-attention module and a multi-head cross-attention module between signals, the single-signal self-attention module being implemented by a graph attention network:
A1. Let the output of the first respiratory signal segment after the TF-Res2Net module be X_A = {a_1, a_2, …, a_i, …, a_l}, used as the input to the single-signal self-attention module, where l is the number of time points in X_A and a_i is the vector representation of time point i;
A2. Compute the attention weight w_ij of time point j to time point i in X_A:

w_ij = softmax_j(ψ(w^T (a_i ⊕ a_j)))

where w^T is the weight matrix shared within X_A, ψ(·) is a nonlinear activation function, and ⊕ denotes the join (concatenation) operation;
A3. Weight the inputs a_i of the single-signal self-attention module to obtain an enhanced matrix representation A' that fuses the association features within the single signal:

a'_i = σ(Σ_{j∈N_i} w_ij a_j)

A' = {a'_1, a'_2, …, a'_i, …, a'_l}

where A' is the matrix representation of the first respiratory signal, N_i is the set of neighbor nodes of time point i, σ is the activation function, and a'_i is the vector representation of time point i after the association features between time point i and its neighbor nodes j have been fused;
A4. Let the outputs of the second and third respiratory signals after the TF-Res2Net module be X_B = {b_1, b_2, …, b_i, …, b_l} and X_C = {c_1, c_2, …, c_i, …, c_l}, respectively, and repeat steps A1 to A3 to obtain the matrix representation B' of the second respiratory signal and the matrix representation C' of the third respiratory signal.
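By way of illustration and not limitation, the single-signal self-attention of claim 7 can be sketched as a graph attention layer over time points. Treating every time point as a neighbour of every other (a fully connected graph; the claim only specifies a neighbourhood N_i) and using tanh for both ψ and σ are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def graph_self_attention(X, w):
    """Single-signal self-attention via a graph attention layer.

    X: (l, d) matrix — one row per time point a_i.
    w: (2*d,) weight vector shared within the signal (w^T in the claim).
    Every time point is treated as a neighbour of every other time point.
    """
    l, d = X.shape
    A_prime = np.empty_like(X)
    for i in range(l):
        # e_j = psi(w^T [a_i || a_j]); psi = tanh is an assumed nonlinearity
        e = np.array([np.tanh(w @ np.concatenate([X[i], X[j]]))
                      for j in range(l)])
        w_ij = softmax(e)                 # attention weights of j towards i
        A_prime[i] = np.tanh(w_ij @ X)    # a'_i = sigma(sum_j w_ij * a_j)
    return A_prime
```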
8. The method for detecting sleep apnea syndrome based on multi-level feature fusion according to claim 7, wherein the multi-head cross attention module between signals comprises the following operations:
B1. Compute, according to the multi-head cross-attention mechanism, the association matrix Z_ab between the matrix representation A' of the first respiratory signal and the matrix representation B' of the second respiratory signal:

Q = A', K = V = B'

h_j = Attention(Q W_j^Q, K W_j^K, V W_j^V)

where Q is the matrix representation of the query signal, K is the matrix representation of the relationship between the queried signal and the query signal, V is the matrix representation of the queried signal, and d_k is the dimension of the relationship matrix K; W_j^Q, W_j^K and W_j^V are randomly initialized parameter matrices; h_j is the j-th single-head cross-attention representation; m is the number of heads in the multi-head cross-attention mechanism;
B2. Add the matrix representation A' of the first respiratory signal to the association matrix Z_ab, apply layer normalization, feed the result into a fully-connected layer, and apply layer normalization again to obtain the cross-signal matrix representation Cross_ab:

Cross_ab = LayerNorm(FC(LayerNorm(A' + Z_ab)))
B3. Swap the query signal and the queried signal, and compute, according to the multi-head cross-attention mechanism, the association matrix Z_ba between the matrix representation B' of the second respiratory signal and the matrix representation A' of the first respiratory signal:

Q = B', K = V = A'
B4. Add the matrix representation B' of the second respiratory signal to the association matrix Z_ba, apply layer normalization, feed the result into a fully-connected layer, and apply layer normalization again to obtain the cross-signal matrix representation Cross_ba of the second and first respiratory signals:

Cross_ba = LayerNorm(FC(LayerNorm(B' + Z_ba)))
B5. Repeat steps B1 to B4 for the remaining signal pairs to obtain the cross-signal matrix representation Cross_bc of the second and third respiratory signals, Cross_cb of the third and second respiratory signals, Cross_ac of the first and third respiratory signals, and Cross_ca of the third and first respiratory signals.
9. The sleep apnea syndrome detection method based on multi-level feature fusion according to claim 1, wherein in step S4 the adaptive fusion module obtains refined features F for subsequent classification by balancing information complementarity and information redundancy among features of different levels:
F_1 = W_11 ⊙ A' + W_12 ⊙ Cross_ab + W_13 ⊙ Cross_ba + W_14 ⊙ B'

F_2 = W_21 ⊙ B' + W_22 ⊙ Cross_bc + W_23 ⊙ Cross_cb + W_24 ⊙ C'

F_3 = W_31 ⊙ A' + W_32 ⊙ Cross_ac + W_33 ⊙ Cross_ca + W_34 ⊙ C'
where W_eg is a trainable parameter matrix, e ∈ {1, 2, 3}, g ∈ {1, 2, 3, 4}, and ⊙ denotes the element-wise (Hadamard) product.
10. The method according to claim 8, wherein in step S4 the sleep apnea syndrome detection result of a respiratory signal segment is one of normal, hypopnea, obstructive sleep apnea and central sleep apnea, and the loss function used in training is cross entropy.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310353366.6A CN116584891A (en) | 2023-04-04 | 2023-04-04 | Sleep apnea syndrome detection method based on multi-level feature fusion |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116584891A true CN116584891A (en) | 2023-08-15 |
Family
ID=87594434
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310353366.6A Pending CN116584891A (en) | 2023-04-04 | 2023-04-04 | Sleep apnea syndrome detection method based on multi-level feature fusion |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116584891A (en) |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117017233A (en) * | 2023-10-08 | 2023-11-10 | 华南师范大学 | Sleep apnea detection method, device, equipment and storage medium |
CN117017233B (en) * | 2023-10-08 | 2024-01-09 | 华南师范大学 | Sleep apnea detection method, device, equipment and storage medium |
CN117158912A (en) * | 2023-11-02 | 2023-12-05 | 北京理工大学 | Sleep stage detection system based on graph attention mechanism and space-time graph convolution |
CN117158912B (en) * | 2023-11-02 | 2024-03-19 | 北京理工大学 | Sleep stage detection system based on graph attention mechanism and space-time graph convolution |
CN117338253A (en) * | 2023-12-05 | 2024-01-05 | 华南师范大学 | Sleep apnea detection method and device based on physiological signals |
CN117338253B (en) * | 2023-12-05 | 2024-03-26 | 华南师范大学 | Sleep apnea detection method and device based on physiological signals |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||