CN113995379A - Sleep apnea and hypopnea syndrome evaluation method and device based on target detection framework - Google Patents

Sleep apnea and hypopnea syndrome evaluation method and device based on target detection framework

Info

Publication number
CN113995379A
Authority
CN
China
Prior art keywords
module
candidate
sahs
target detection
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202111227631.3A
Other languages
Chinese (zh)
Other versions
CN113995379B (en)
Inventor
陈丹
张垒
明哲锴
熊明福
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangxi Brain Modulation Technology Development Co ltd
Original Assignee
Jiangxi Brain Modulation Technology Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangxi Brain Modulation Technology Development Co ltd filed Critical Jiangxi Brain Modulation Technology Development Co ltd
Priority to CN202111227631.3A priority Critical patent/CN113995379B/en
Publication of CN113995379A publication Critical patent/CN113995379A/en
Application granted granted Critical
Publication of CN113995379B publication Critical patent/CN113995379B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/48 Other medical applications
    • A61B5/4806 Sleep evaluation
    • A61B5/4818 Sleep apnoea
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/103 Detecting, measuring or recording devices for testing the shape, pattern, colour, size or movement of the body or parts thereof, for diagnostic purposes
    • A61B5/11 Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb
    • A61B5/113 Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb occurring during breathing
    • A61B5/1135 Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb occurring during breathing by monitoring thoracic expansion
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7203 Signal processing specially adapted for physiological signals or for diagnostic purposes for noise prevention, reduction or removal
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7225 Details of analog processing, e.g. isolation amplifier, gain or sensitivity adjustment, filtering, baseline or drift compensation
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235 Details of waveform analysis
    • A61B5/7264 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Abstract

The invention provides a sleep apnea-hypopnea syndrome evaluation method and device based on a target detection framework. The model contains three main parts: first, the oral-nasal airflow and chest pressure data in time-series form are processed separately and the extracted features are fused; second, candidate frames for SAHS segment detection are generated by adaptive regression with a Region Proposal Network (RPN); third, scale-invariant features are generated from the candidate segments and classified. Model training adopts an alternate training mode in which the RPN and the classification network are trained in turn, and the focal loss function is used to alleviate the bias that training on imbalanced samples may cause.

Description

Sleep apnea and hypopnea syndrome evaluation method and device based on target detection framework
Technical Field
The invention relates to the technical field of computers, in particular to a sleep apnea and hypopnea syndrome evaluation method and device based on a target detection framework.
Background
Sleep apnea-hypopnea syndrome (SAHS) refers to a clinical syndrome in which, for various reasons, repeated apnea and/or hypopnea, hypercapnia and sleep interruption during sleep cause a series of pathophysiological changes in the body; it is a typical manifestation of sleep disorder. SAHS evaluation assesses the syndrome from physiological signals (such as polysomnography) recorded while a person sleeps, and the result serves as an important reference for diagnosing sleep-disorder diseases.
Traditional SAHS evaluation methods usually use template matching to search for the salient features of SAHS segments and mark the segments that meet the criteria; however, such evaluation algorithms depend on the specific task, are highly customized, and are difficult to transfer between different signal-segment recognition tasks. Classification algorithms based on feature engineering or feature learning can be applied to different segment recognition tasks: a feature set is obtained by manual feature extraction or feature learning, and classification then judges whether an input segment belongs to a specific signal type. A key limitation of classification-based algorithms, however, is that they can only judge the type of an input segment and cannot locate a specific segment; moreover, because the length of an SAHS segment is uncertain, a sliding-window algorithm based on a single fixed-length time window destroys the integrity of the real segment and cannot meet the requirement of accurate segment localization.
Disclosure of Invention
In order to solve the technical problem, the first aspect of the present invention discloses a method for evaluating sleep apnea hypopnea syndrome based on a target detection framework, comprising:
s1: collecting original sleep physiological index data;
s2: preprocessing the collected original sleep physiological index data, and labeling SAHS segments;
s3: constructing an SAHS target detection framework, wherein the SAHS target detection framework comprises a backbone network module for fusing features, a region candidate module for generating a detection candidate frame and a sequence modeling module for classifying candidate sequences, and the starting point and the ending point of each candidate SAHS fragment can be obtained based on the detection candidate frame;
s4: acquiring training data from the preprocessed and labeled data, and training the SAHS target detection framework by using the training data;
s5: and detecting the data to be recognized by using the trained SAHS target detection framework.
In one embodiment, step S1 includes:
monitoring various physiological indexes of a person in the sleeping process by using a standard polysomnogram, and extracting the air flow of the mouth and the nose and the chest pressure data as original sleeping physiological index data.
In one embodiment, S2 includes:
s2.1: performing down-sampling processing on the chest pressure data based on a polyphase filtering algorithm, and taking the chest pressure data and the oronasal airflow data as the two channels input into the SAHS target detection framework;
s2.2: labeling the data obtained in step S2.1, the labeling specifically comprising three types of SAHS segments: obstructive, central and hypopnea.
In one embodiment, the backbone network module is composed of three bottleneck layer structures, each bottleneck layer structure comprises two sub-blocks, namely an identity mapping module and a convolution module, and each sub-block is a series of one-dimensional convolution layers.
In one embodiment, the region candidate module employs a region candidate network RPN, which includes two branches, one branch for distinguishing whether a signal in a candidate frame is a feature signal, and the other branch for performing regression on the boundary of the candidate frame.
In one embodiment, the output of the region candidate network is an n × 2 matrix, where n represents the number of candidate frames contained in the sample and 2 represents the two elements giving the start and end positions of each candidate frame.
In one embodiment, the sequence modeling module includes a feature processing layer, a timing signal embedding layer, and a fully-connected layer,
the characteristic processing layer is used for intercepting a characteristic sequence of the candidate region from the characteristic diagram to obtain a candidate sequence;
the time sequence signal embedding layer is used for extracting time sequence characteristics in the candidate sequence by adopting an LSTM network to obtain a time sequence signal embedding result;
and the full connection layer is used for obtaining a classification result according to the time sequence signal embedding result.
In one embodiment, S4 includes:
s4.1: the backbone network module and the region candidate module are trained together, and the candidate-frame category loss and the candidate-frame boundary loss of the region candidate module are summed and back-propagated;
s4.2: the backbone network module and the sequence modeling module are trained together, and the classification loss and the boundary loss of the sequence modeling module are summed and back-propagated;
s4.3: the parameters of the backbone network module are kept fixed, the region candidate module continues to be trained, and the candidate-frame category loss and the candidate-frame boundary loss of the region candidate module are summed and back-propagated;
s4.4: the parameters of the backbone network module are kept fixed, the sequence modeling module is trained, and the classification loss and the boundary loss of the sequence modeling module are summed and back-propagated.
In one embodiment, the candidate-frame category loss in S4.1 and S4.3 is calculated using the focal loss function.
Based on the same inventive concept, the second aspect of the present invention discloses a sleep apnea hypopnea syndrome evaluation apparatus based on a target detection framework, comprising:
the data acquisition module is used for acquiring original sleep physiological index data;
the preprocessing module is used for preprocessing the acquired original sleep physiological index data and marking SAHS segments;
the target detection framework construction module is used for constructing an SAHS target detection framework, the SAHS target detection framework comprising a backbone network module for fusing features, a region candidate module for generating detection candidate frames and a sequence modeling module for classifying candidate sequences;
the training module is used for acquiring training data from the preprocessed and labeled data and training the SAHS target detection framework by utilizing the training data;
and the detection module is used for detecting the data to be identified by using the trained SAHS target detection framework.
One or more technical solutions in the embodiments of the present application have at least one or more of the following technical effects:
the invention provides a sleep apnea-hypopnea syndrome evaluation method based on a target detection framework. An SAHS target detection framework is constructed: features in the input signal are extracted and fused by a backbone network module; detection candidate frames are generated by a region candidate module, yielding the start and end points of each candidate SAHS segment; and the candidate sequences are classified by a sequence modeling module. The SAHS target detection framework is trained with training data, and the trained framework is then used to detect the data to be identified and obtain the evaluation result. Segments can thus be accurately located and identified through the SAHS target detection framework.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art will be briefly introduced below, and it is obvious that the drawings in the following description are some embodiments of the present invention, and for those skilled in the art, other drawings can be obtained according to these drawings without creative efforts.
FIG. 1 is a schematic structural diagram of an SAHS target detection framework constructed in the practice of the present invention;
FIG. 2 is a graph of the distribution of actual apnea/hypopnea segment durations in the practice of the present invention.
Detailed Description
In the prior art, classification-based algorithms can only judge the type of an input segment and cannot solve the problem of locating a specific segment; in addition, because the length of an SAHS segment is uncertain, a sliding-window algorithm based on a single fixed-length time window destroys the integrity of the real segment and cannot meet the requirement of accurate segment localization. The present invention is directed at these problems.
The invention introduces the target detection framework into the analysis of data such as sleep signals. Target detection frameworks have been widely applied in computer vision in recent years, and many excellent models have been proposed. Through extensive research and practice, the inventors found that physiological signals differ essentially from images in data formation and semantics. First, physiological signals are time-series data that evolve along the time dimension and carry natural timing information. Existing target detection networks basically take the convolutional layer as the basic structure; although this preserves the internal relative position information of a signal, it cannot integrate the timing information of physiological signals. Therefore, to improve the detection performance of the model on specific physiological signal segments, the invention integrates timing information into the target detection network and provides a lightweight algorithm for detecting specific physiological segments, applied to the automatic evaluation of SAHS, so as to achieve both accurate identification and accurate localization of apnea/hypopnea segments.
The main inventive concept of the present invention is as follows:
the data of the oral-nasal airflow and the chest pressure in the sleep monitoring data are selected as the data basis of the SAHS, and a target detection model (framework) fusing the oral-nasal airflow and the chest pressure data is designed. The model comprises three main parts, wherein the three main parts are used for respectively processing oral-nasal airflow and chest pressure data in a time sequence form and fusing extracted characteristics; generating a candidate frame for SAHS fragment detection based on Region Probable Network (RPN) Network adaptive regression; and generating and classifying the scale-invariant features based on the candidate segments. The model training adopts an alternate training mode to respectively train the RPN network and the classification network; and (3) alleviating the deviation problem possibly caused by model training under the unbalanced samples by using the focal loss function.
In order to make the objects, technical solutions and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are some, but not all, embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Example one
The embodiment of the invention provides a sleep apnea hypopnea syndrome evaluation method based on a target detection framework, which comprises the following steps:
s1: collecting original sleep physiological index data;
s2: preprocessing the collected original sleep physiological index data, and labeling SAHS segments;
s3: constructing an SAHS target detection framework, wherein the SAHS target detection framework comprises a backbone network module for fusing features, a region candidate module for generating a detection candidate frame and a sequence modeling module for classifying candidate sequences, and the starting point and the ending point of each candidate SAHS fragment can be obtained based on the detection candidate frame;
s4: acquiring training data from the preprocessed and labeled data, and training the SAHS target detection framework by using the training data;
s5: and detecting the data to be recognized by using the trained SAHS target detection framework.
The invention provides a sleep apnea-hypopnea syndrome (SAHS) detection method based on a target detection framework. Sleep apnea-hypopnea syndrome refers to a clinical syndrome in which, for various reasons, repeated apnea and/or hypopnea, hypercapnia and sleep interruption during sleep cause a series of pathophysiological changes in the body. The main goal of the SAHS assessment task is to detect, from respiratory indexes, the apnea/hypopnea segments that occur during the sleep of patients with sleep disorders. On this basis, the invention designs a detection framework based on a target detection network that detects apnea/hypopnea segments directly from the original signal, and introduces a sequence model that generates fixed-length features and fuses the timing information of the physiological signals for the subsequent classification of the three SAHS segment types: obstructive, central and hypopnea.
Fig. 1 is a schematic structural diagram of an SAHS target detection framework constructed in the present invention.
In one embodiment, step S1 includes:
monitoring various physiological indexes of a person in the sleeping process by using a standard polysomnogram, and extracting the air flow of the mouth and the nose and the chest pressure data as original sleeping physiological index data.
Specifically, the physiological indexes include electroencephalogram, electrooculogram, oral-nasal airflow, chest and abdominal pressure, and the like; the oral-nasal airflow and chest pressure data are used as the data source for SAHS segment detection.
In one embodiment, S2 includes:
s2.1: performing down-sampling processing on the chest pressure data based on a polyphase filtering algorithm, and taking the chest pressure data and the oronasal airflow data as the two channels input into the SAHS target detection framework;
s2.2: the data obtained in step S2.1 are labeled with the three types of SAHS segments, namely obstructive, central and hypopnea (each SAHS segment is labeled as one of the three types, and segments other than SAHS segments are regarded as non-SAHS segments).
In the specific implementation process, since a person's effective sleep time over a night exceeds 8 hours, the original data need to be divided into segments of 1 hour in length as model input in order to reduce the performance pressure on the model (SAHS target detection framework) during analysis; meanwhile, because the sampling rate of the chest pressure data is higher than that of the oral-nasal airflow data, the chest pressure data are down-sampled with a polyphase filter algorithm to keep the input lengths of the channels consistent, and the chest pressure data and the oral-nasal airflow data are then used as the two channels input into the SAHS target detection framework.
The dimension of a resulting single sample is [n_channel, seg_length × sampling_rate], where n_channel represents the number of channels of electrophysiological data (here 2), seg_length represents the segment time length (here 1 h by default) and sampling_rate represents the sampling rate (here 50 Hz). The finally obtained sample data can be expressed as [n_samples, n_channel, seg_length × sampling_rate], where n_samples represents the number of samples.
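For illustration only, the segmentation and polyphase down-sampling described above might be implemented as in the following sketch; the function name, the SciPy-based resampling call and the exact window arithmetic are assumptions, not the patent's reference implementation.

```python
# Minimal preprocessing sketch (not the patent's reference code). Assumes NumPy/SciPy;
# the resample_poly factors (200 Hz -> 50 Hz) and the 1 h window follow the description
# above, everything else is illustrative.
import numpy as np
from scipy.signal import resample_poly

def preprocess(oronasal_50hz: np.ndarray, chest_200hz: np.ndarray,
               fs: int = 50, seg_hours: int = 1) -> np.ndarray:
    """Return samples shaped [n_samples, n_channel=2, seg_length*fs]."""
    # Polyphase down-sampling of chest pressure from 200 Hz to 50 Hz (factor 1/4).
    chest_50hz = resample_poly(chest_200hz, up=1, down=4)

    # Align lengths and stack the two channels.
    n = min(len(oronasal_50hz), len(chest_50hz))
    signal = np.stack([oronasal_50hz[:n], chest_50hz[:n]])       # [2, n]

    # Cut the overnight recording into non-overlapping 1-hour segments.
    seg_len = seg_hours * 3600 * fs                              # 180 000 points at 50 Hz
    n_samples = n // seg_len
    segments = signal[:, :n_samples * seg_len].reshape(2, n_samples, seg_len)
    return segments.transpose(1, 0, 2)                           # [n_samples, 2, seg_len]
```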
In one embodiment, the backbone network module is composed of three bottleneck layer structures, each bottleneck layer structure comprises two sub-blocks, namely an identity mapping module and a convolution module, and each sub-block is a series of one-dimensional convolution layers.
The identity mapping module and the convolution module have similar structures: each consists of stacked convolution layers plus a bypass that mitigates the degradation problem of deep networks. The identity mapping module keeps the input and output dimensions unchanged and its bypass passes the input through untouched, so such modules can be stacked in series to deepen the network. The convolution module changes the input/output dimension; a single dimension-changing convolution layer is placed on its bypass so that the bypass output matches the output of the stacked convolution layers.
In the specific implementation process, each preprocessed time segment comprises an oral-nasal airflow segment and a chest pressure segment in time-series form, and feature extraction and fusion are realized through the backbone network module formed by one-dimensional convolution computing units. After dimension reduction, nonlinear transformation and dimension-raising processing, the heterogeneity among the different data is eliminated and the signals are converted into high-level, more abstract feature matrices.
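A minimal sketch of such a one-dimensional convolutional backbone is given below, assuming a PyTorch implementation; the block names, channel widths and strides are illustrative choices, not the patent's exact configuration.

```python
# Illustrative 1-D residual backbone with three bottleneck stages, each built from a
# convolution block (dimension-changing bypass) and an identity block (unchanged bypass).
import torch
import torch.nn as nn

class IdentityBlock(nn.Module):
    """Stacked 1-D convolutions whose bypass leaves the input unchanged."""
    def __init__(self, channels: int):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm1d(channels), nn.ReLU(),
            nn.Conv1d(channels, channels, kernel_size=3, padding=1),
            nn.BatchNorm1d(channels),
        )
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.body(x) + x)

class ConvBlock(nn.Module):
    """Stacked 1-D convolutions whose bypass is a single dimension-changing convolution."""
    def __init__(self, in_ch: int, out_ch: int, stride: int = 2):
        super().__init__()
        self.body = nn.Sequential(
            nn.Conv1d(in_ch, out_ch, kernel_size=3, stride=stride, padding=1),
            nn.BatchNorm1d(out_ch), nn.ReLU(),
            nn.Conv1d(out_ch, out_ch, kernel_size=3, padding=1),
            nn.BatchNorm1d(out_ch),
        )
        self.bypass = nn.Conv1d(in_ch, out_ch, kernel_size=1, stride=stride)
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.body(x) + self.bypass(x))

class Backbone(nn.Module):
    """Three bottleneck stages, each a convolution block followed by an identity block."""
    def __init__(self, in_ch: int = 2):
        super().__init__()
        widths = [16, 32, 64]                      # illustrative channel widths
        layers, ch = [], in_ch
        for w in widths:
            layers += [ConvBlock(ch, w), IdentityBlock(w)]
            ch = w
        self.net = nn.Sequential(*layers)

    def forward(self, x):                          # x: [batch, 2, seg_len]
        return self.net(x)                         # fused feature map [batch, 64, seg_len/8]
```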
In one embodiment, the region candidate module employs a region candidate network RPN, which includes two branches, one branch for distinguishing whether a signal in a candidate frame is a feature signal, and the other branch for performing regression on the boundary of the candidate frame.
In one embodiment, the output of the region candidate network is an n × 2 matrix, where n represents the number of candidate frames contained in the sample and 2 represents the two elements giving the start and end positions of each candidate frame.
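For reference, a 1-D region-candidate head with the two branches described above might be sketched as follows; the three anchors per position echo the three preset frame lengths used later in the embodiment, while the channel width and layer layout are assumptions.

```python
# Illustrative 1-D region-proposal head: one branch scores whether the anchored signal is
# an SAHS feature signal, the other regresses the candidate-frame boundary (start, end).
import torch
import torch.nn as nn

class RegionProposalHead1D(nn.Module):
    def __init__(self, in_ch: int = 64, n_anchors: int = 3):
        super().__init__()
        self.shared = nn.Conv1d(in_ch, in_ch, kernel_size=3, padding=1)
        self.cls = nn.Conv1d(in_ch, n_anchors, kernel_size=1)      # feature signal vs. background
        self.reg = nn.Conv1d(in_ch, n_anchors * 2, kernel_size=1)  # two boundary values per anchor

    def forward(self, feat):
        # feat: [batch, in_ch, S] fused feature map from the backbone
        h = torch.relu(self.shared(feat))
        scores = self.cls(h)                                       # [batch, n_anchors, S]
        bounds = self.reg(h)                                       # [batch, n_anchors*2, S]
        b = feat.size(0)
        bounds = bounds.view(b, -1, 2, feat.size(-1))              # [batch, n_anchors, 2, S]
        bounds = bounds.permute(0, 1, 3, 2).reshape(b, -1, 2)      # [batch, n_anchors*S, 2]
        return scores, bounds                                      # (start, end) pairs as an n x 2 layout
```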
Specifically, the invention migrates the conventional two-dimensional region candidate network to time-series data. For each input signal segment, the region candidate network receives a feature tensor X of size N × S, where N is the feature dimension and S is the feature length; its training output is a set of candidate-frame boundary pairs (p1, p2), one per candidate frame, where p1 and p2 denote the start and end positions of that candidate frame within the input segment. Here "position" refers to an element index in the fused feature matrix output by the backbone network module, for example the 2nd to 7th elements. The start and end points of each SAHS segment are obtained from the region candidate network, and the fused signal output by the backbone network module is cut accordingly. After the start and end points of an SAHS segment are obtained through the region candidate module, the duration of the SAHS segment can be obtained by reverse estimation through the backbone network module.
The reverse process is as follows. Assume that the original segment length is L and the sampling rate is f (the signal is actually a two-dimensional matrix formed by two channels, but only the time dimension needs to be considered here), that the segment length after feature fusion by the backbone network module is l, and that the start and end points of a candidate frame extracted by the region candidate module are p1 and p2 respectively. The corresponding start and end points on the original signal (i.e. the start and end points of the SAHS segment), P1 and P2, are

P1 = p1 × L / l
P2 = p2 × L / l

and the duration is obtained by back-calculation as

T = (P2 - P1) / f = (p2 - p1) × L / (l × f).
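A small sketch of this reverse mapping, using the formulas above with illustrative numbers (a 1 h segment at 50 Hz fused down by a factor of 8), is:

```python
# Mapping a candidate frame from feature-map coordinates back to the original signal,
# following the formulas above; variable names mirror the text, values are illustrative.
def to_original(p1: int, p2: int, L: int, l: int, f: float):
    P1 = p1 * L / l                  # start point on the original signal
    P2 = p2 * L / l                  # end point on the original signal
    duration = (P2 - P1) / f         # duration in seconds
    return P1, P2, duration

# Example: L = 180000 points at f = 50 Hz, fused down to l = 22500 features; a candidate
# frame spanning feature positions 100-150 maps back to points 800-1200, i.e. 8 s.
print(to_original(100, 150, L=180000, l=22500, f=50.0))   # (800.0, 1200.0, 8.0)
```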
In one embodiment, the sequence modeling module includes a feature processing layer, a timing signal embedding layer, and a fully-connected layer,
the characteristic processing layer is used for intercepting a characteristic sequence of the candidate region from the characteristic diagram to obtain a candidate sequence;
the time sequence signal embedding layer is used for extracting time sequence characteristics in the candidate sequence by adopting an LSTM network to obtain a time sequence signal embedding result;
and the full connection layer is used for obtaining a classification result according to the time sequence signal embedding result.
In the specific implementation process, to keep the input and output sizes of a batch of samples consistent across the network layers when samples are fed in batches, a completion (padding) scheme is adopted when the feature sequence of each candidate region is intercepted from the feature map, so that all samples in a batch have the same feature length. Assume that the pooling step is fixed to s; for a batch containing N samples, if the feature length of the i-th sample on the feature map is L_i, the number of features produced after pooling is

n_i = ceil(L_i / s).

To keep the features consistent, the number of features of the longest candidate region in the batch is taken as the standard and recorded as

n_max = max_i n_i.

When the features are cut out, the other candidate regions are extended with additional elements from the feature map up to this maximum length, finally yielding the segments to be classified.

The segments to be classified obtained by the feature processing layer are then input into a Long Short-Term Memory (LSTM) network whose number of time steps is set to n_max. The LSTM outputs a corresponding feature at every time step. To ensure that the final output of the LSTM layer corresponds to the real feature sequence of each sample, an N × n_max mask matrix A is constructed in the concrete implementation and initialized as

A[i][j] = 1 if j = n_i, and A[i][j] = 0 otherwise,

where i denotes the i-th sample in the batch, j denotes the coordinate in the time-step dimension (1 ≤ j ≤ n_max), and k below indexes the N_output dimension (the embedded representation of the timing signal).

Assume that the LSTM output at each time step has dimension N_output; the tensor output after the LSTM is then O of size N × n_max × N_output.

The mask matrix is expanded (one-hot coded) along the N_output dimension to obtain a mask tensor A' of the same size as the LSTM output, and the final timing-signal embedding result O', of size N × N_output, is obtained by element-wise multiplication and summation over the time dimension:

O'[i][k] = Σ_j A'[i][j][k] · O[i][j][k].
Finally, the timing-signal embedding result O' produced by the timing-signal embedding layer is input into the fully connected layer to obtain the classification result of the sequence modeling module, of size N × 4: for each segment, the membership probabilities of the 4 classes (obstructive, central, hypopnea and non-SAHS) are computed, the four probabilities sum to 1, and the class with the maximum probability is taken as the classification result.
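Under the interpretation above, the sequence modeling module could be sketched as follows in PyTorch; the feature and embedding dimensions, and the use of a mask to pick the output at each candidate's last valid time step, are illustrative assumptions consistent with the description.

```python
# Sketch of the sequence modeling module: variable-length candidate features are padded
# to the batch maximum, an LSTM produces an output per time step, and a mask selects the
# output at each candidate's last valid step before the fully connected classifier.
import torch
import torch.nn as nn

class SequenceModeling(nn.Module):
    def __init__(self, feat_dim: int = 64, n_output: int = 128, n_classes: int = 4):
        super().__init__()
        self.lstm = nn.LSTM(feat_dim, n_output, batch_first=True)
        self.fc = nn.Linear(n_output, n_classes)    # obstructive / central / hypopnea / non-SAHS

    def forward(self, padded_feats, lengths):
        # padded_feats: [N, n_max, feat_dim]; lengths: LongTensor of true step counts n_i
        O, _ = self.lstm(padded_feats)               # O: [N, n_max, n_output]
        N, n_max, _ = O.shape
        # Mask A: 1 at the last valid time step of each candidate, 0 elsewhere.
        A = torch.zeros(N, n_max, device=O.device)
        A[torch.arange(N), lengths - 1] = 1.0
        # Broadcast A over the output dimension and reduce over time: O' is [N, n_output].
        O_prime = (O * A.unsqueeze(-1)).sum(dim=1)
        return torch.softmax(self.fc(O_prime), dim=-1)   # class probabilities summing to 1
```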
In one embodiment, S4 includes:
s4.1: the backbone network module and the region candidate module are trained together, and the candidate-frame category loss and the candidate-frame boundary loss of the region candidate module are summed and back-propagated;
s4.2: the backbone network module and the sequence modeling module are trained together, and the classification loss and the boundary loss of the sequence modeling module are summed and back-propagated;
s4.3: the parameters of the backbone network module are kept fixed, the region candidate module continues to be trained, and the candidate-frame category loss and the candidate-frame boundary loss of the region candidate module are summed and back-propagated;
s4.4: the parameters of the backbone network module are kept fixed, the sequence modeling module is trained, and the classification loss and the boundary loss of the sequence modeling module are summed and back-propagated.
In one embodiment, the candidate box class penalty in S4.1 and S4.3 is calculated using the focal loss function.
In the specific implementation process, an alternate training method is adopted: first, the backbone network module and the region candidate module are trained together; second, the backbone network module and the sequence modeling module are trained together; then the parameters of the backbone network module are kept fixed and the region candidate module continues to be trained; finally, with the backbone network module parameters still fixed, the sequence modeling module is trained. The candidate-frame category loss measures whether a candidate frame given by the region candidate network is an SAHS segment; the candidate-frame boundary loss is the error between the candidate-frame boundary and the labeled boundary; the classification loss covers the three SAHS classes (obstructive, central and hypopnea) plus the non-SAHS class; and the boundary loss of the sequence modeling module is calculated in the same way as the candidate-frame boundary loss.
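A skeleton of this four-stage alternate training schedule is sketched below; the module and loss-function objects (backbone, rpn, seq_model, rpn_loss, seq_loss, loader) are placeholders for the concrete implementations, and only the freezing/sharing pattern reflects the description above.

```python
# Four-stage alternate training skeleton. Each stage optimizes only the listed modules;
# the loss callable is expected to return the summed category/classification + boundary loss.
import torch

def set_requires_grad(module: torch.nn.Module, flag: bool) -> None:
    for p in module.parameters():
        p.requires_grad = flag

def train_stage(modules, loss_fn, loader, epochs: int = 1, lr: float = 1e-3) -> None:
    params = [p for m in modules for p in m.parameters() if p.requires_grad]
    opt = torch.optim.Adam(params, lr=lr)
    for _ in range(epochs):
        for batch in loader:
            loss = loss_fn(batch)            # summed loss for this stage
            opt.zero_grad()
            loss.backward()
            opt.step()

def alternate_training(backbone, rpn, seq_model, rpn_loss, seq_loss, loader) -> None:
    train_stage([backbone, rpn], rpn_loss, loader)        # S4.1: backbone + region candidate module
    train_stage([backbone, seq_model], seq_loss, loader)  # S4.2: backbone + sequence modeling module
    set_requires_grad(backbone, False)                    # freeze backbone parameters
    train_stage([rpn], rpn_loss, loader)                  # S4.3: region candidate module alone
    train_stage([seq_model], seq_loss, loader)            # S4.4: sequence modeling module alone
```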
The model training loss function of the region candidate network adopts focal loss, defined as

FL(pt) = -αt × (1 - pt)^γ × log(pt),

where pt is the probability that the signal in a candidate frame is judged to be an SAHS feature signal (a positive sample), and αt and γ are the hyper-parameters of focal loss, also called the balance coefficient and the focusing coefficient; in this embodiment αt = 0.75 and γ = 2. The error is propagated back to the region candidate module and the backbone network module by the back-propagation algorithm, and iteration continues so that the model parameters are optimized.
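As a concrete reference, the focal loss term with the αt = 0.75 and γ = 2 values of this embodiment might be written as the following minimal sketch; the tensor layout of p_t is an assumption.

```python
# Focal loss for the candidate-frame category branch, matching the formula above.
import torch

def focal_loss(p_t: torch.Tensor, alpha_t: float = 0.75, gamma: float = 2.0) -> torch.Tensor:
    """p_t: predicted probability assigned to the true class of each candidate frame."""
    eps = 1e-8                                   # numerical floor to avoid log(0)
    return (-alpha_t * (1.0 - p_t) ** gamma * torch.log(p_t + eps)).mean()
```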
The process according to the invention is illustrated by the following specific examples.
Step 1: data acquisition. The physiological indexes of sleep are acquired using a standard polysomnogram.
Indexes used for apnea/hypopnea detection, such as oronasal temperature airflow, nasal pressure and thoracoabdominal pressure, are extracted from a published research data set on childhood sleep apnea syndrome (CHAT), and, with reference to the diagnostic standard for apnea syndrome, the two indexes of oronasal airflow and thoracoabdominal pressure are selected as the basis for judgment. Because the data acquisition equipment differs between sites, the sampling rate of the same monitoring index also differs; therefore, 100 children's overnight monitoring signals with the same index sampling rates were selected as the data source for the experiment, in which the sampling rate of the oronasal airflow is 50 Hz and that of the thoracoabdominal pressure is 200 Hz.
Step 2: data preprocessing. The acquired data set is preprocessed and SAHS segments are labeled.
Because a person's effective sleep time over a night exceeds 8 hours, the single-machine environment of this embodiment cannot bear the memory pressure of inputting a whole recording at once; the whole signal is therefore divided into segments of 1 hour in length as model input. To keep the input lengths of the channels consistent, this embodiment performs polyphase-filter down-sampling on the thoracoabdominal pressure data, after which the sampling rates of the oronasal airflow and thoracoabdominal pressure data are both 50 Hz. The SAHS segments are labeled according to the expert annotation information provided with the original data set, and the start time of each SAHS segment is recorded.
Step 3: constructing the SAHS segment target detection framework.
(3.1) To fuse the features of the preprocessed oral-nasal airflow and chest pressure segments, which are contained in each time segment in time-series form, this embodiment designs a backbone network module composed of one-dimensional convolution computing units. The backbone network module consists of three bottleneck layer structures, each comprising two sub-blocks (an identity mapping module and a convolution module).
(3.2) To generate SAHS segment detection candidate frames, this embodiment designs a region candidate module consisting of a region candidate network. The sizes of the preset (anchor) frames must be set in advance in the region candidate network. Referring to the distribution of real apnea/hypopnea segment durations shown in FIG. 2 (horizontal axis: duration; vertical axis: number of segments of that duration), preset frames of three lengths, 8 s, 13 s and 18 s, are selected, and a preset frame whose overlap with a real segment exceeds 0.6 is marked as a positive sample (an illustrative sketch of this anchor labeling is given after step (3.3) below). For each input sample, the region candidate module generates a series of two-dimensional vectors, each giving the start and end positions of a candidate frame.
(3.3) To classify the candidate segments, this embodiment constructs a sequence modeling module based on an LSTM and a fully connected network. The two-dimensional vectors obtained by the region candidate module are mapped onto the feature matrix generated by the backbone network module, and the feature matrix is cut accordingly. An LSTM network is used to fuse the timing features, ensuring that candidate frames of different sizes yield features of the same length, and the LSTM output is fed into the fully connected layer for the final classification.
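The anchor labeling of step (3.2), referenced above, can be illustrated by the sketch below; only the 8 s / 13 s / 18 s lengths and the 0.6 overlap threshold come from the description, while the anchor stride and helper names are hypothetical.

```python
# Illustrative assignment of 1-D preset (anchor) frames to real segments by temporal
# overlap (IoU), using the three anchor lengths and the 0.6 positive threshold above.
def iou_1d(a, b):
    """Temporal IoU of two intervals a = (a0, a1), b = (b0, b1), in seconds."""
    inter = max(0.0, min(a[1], b[1]) - max(a[0], b[0]))
    union = (a[1] - a[0]) + (b[1] - b[0]) - inter
    return inter / union if union > 0 else 0.0

def label_anchors(true_segments, seg_seconds=3600, stride=4, lengths=(8, 13, 18), thr=0.6):
    """Return (anchor, is_positive) pairs over one 1-hour input segment."""
    labels = []
    for center in range(0, seg_seconds, stride):
        for L in lengths:
            anchor = (center - L / 2, center + L / 2)
            positive = any(iou_1d(anchor, seg) > thr for seg in true_segments)
            labels.append((anchor, positive))
    return labels

# Example: one annotated hypopnea segment from 120 s to 133 s
print(sum(pos for _, pos in label_anchors([(120.0, 133.0)])))   # number of positive anchors
```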
Step 4: training the target detection framework using the training data.
in order to implement parameter sharing and co-training among modules, the embodiment trains the network in an alternating training mode. Specifically, firstly, a backbone network module and a region candidate module are trained together; secondly, training the backbone network module and the sequence modeling module together; then keeping parameters of the backbone network module fixed, and continuing to train the area candidate module; and finally, fixing parameters of the backbone network module and training the sequence modeling module.
Step 5: detecting the data to be identified using the trained SAHS target detection framework.
To verify the effect of this embodiment, data from the CHAT data set that were not used for training are extracted and subjected to the same data cropping and down-sampling operations, yielding a series of data to be identified. The data to be identified are input into the detection framework to obtain the final recognition result.
Compared with the prior art, the invention has the beneficial effects that:
1. The invention provides a general solution to the shortcoming of target detection networks in detecting characteristic waveforms of arbitrary size in biological signals: a pooling strategy combined with sequence modeling is proposed to solve the feature-distortion problem while enhancing temporal information.
2. The invention provides a lightweight deep learning framework that distinguishes the category of SAHS segments and locates the start point, end point and duration of each segment, demonstrating the feasibility of detecting other characteristic waveforms from biological signals.
Example two
Based on the same inventive concept, the present embodiment provides a sleep apnea hypopnea syndrome evaluating apparatus based on a target detection framework, comprising:
the data acquisition module is used for acquiring original sleep physiological index data;
the preprocessing module is used for preprocessing the acquired original sleep physiological index data and marking SAHS segments;
the target detection framework construction module is used for constructing an SAHS target detection framework, the SAHS target detection framework comprising a backbone network module for fusing features, a region candidate module for generating detection candidate frames and a sequence modeling module for classifying candidate sequences;
the training module is used for acquiring training data from the preprocessed and labeled data and training the SAHS target detection framework by utilizing the training data;
and the detection module is used for detecting the data to be identified by using the trained SAHS target detection framework.
Since the device described in the second embodiment of the present invention is a device used for implementing the method for assessing sleep apnea and hypopnea syndrome based on the target detection framework in the first embodiment of the present invention, a person skilled in the art can understand the specific structure and modification of the device based on the method described in the first embodiment of the present invention, and thus the detailed description thereof is omitted. All the devices adopted in the method of the first embodiment of the present invention belong to the protection scope of the present invention.
The above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A method for assessing sleep apnea hypopnea syndrome based on a target detection framework, comprising:
s1: collecting original sleep physiological index data;
s2: preprocessing the collected original sleep physiological index data, and labeling SAHS segments;
s3: constructing an SAHS target detection framework, wherein the SAHS target detection framework comprises a backbone network module for fusing features, a region candidate module for generating a detection candidate frame and a sequence modeling module for classifying candidate sequences, and the starting point and the ending point of each candidate SAHS fragment can be obtained based on the detection candidate frame;
s4: acquiring training data from the preprocessed and labeled data, and training the SAHS target detection framework by using the training data;
s5: and detecting the data to be recognized by using the trained SAHS target detection framework.
2. The method for sleep apnea hypopnea syndrome assessment based on a target detection framework as claimed in claim 1, wherein step S1 includes:
monitoring various physiological indexes of a person in the sleeping process by using a standard polysomnogram, and extracting the air flow of the mouth and the nose and the chest pressure data as original sleeping physiological index data.
3. The target detection framework-based sleep apnea hypopnea syndrome assessment method of claim 2, wherein S2 comprises:
s2.1: performing down-sampling processing on the chest pressure data based on a polyphase filtering algorithm, and taking the chest pressure data and the oronasal airflow data as the two channels input into the SAHS target detection framework;
s2.2: labeling the data obtained in step S2.1, wherein the labeling specifically comprises the labeling of three types of SAHS segments: obstructive, central and hypopnea.
4. The method of claim 2, wherein the backbone network module comprises three bottleneck layer structures, each bottleneck layer structure comprises two sub-blocks, namely an identity mapping module and a convolution module, and each sub-block comprises a series of one-dimensional convolution layers.
5. The method of claim 1, wherein the regional candidate module employs a regional candidate network (RPN) comprising two branches, one branch for distinguishing whether the signal in the candidate frame is a feature signal, and the other branch for performing regression on the boundary of the candidate frame.
6. The method of claim 5, wherein the output of the region candidate network is a matrix of n x 2, n represents the number of candidate boxes contained in the sample, and 2 represents the two elements of the starting and ending positions of the candidate boxes.
7. The object detection framework-based sleep apnea hypopnea syndrome evaluation method of claim 1, wherein the sequence modeling module comprises a feature processing layer, a time sequence signal embedding layer and a full connectivity layer,
the characteristic processing layer is used for intercepting a characteristic sequence of the candidate region from the characteristic diagram to obtain a candidate sequence;
the time sequence signal embedding layer is used for extracting time sequence characteristics in the candidate sequence by adopting an LSTM network to obtain a time sequence signal embedding result;
and the full connection layer is used for obtaining a classification result according to the time sequence signal embedding result.
8. The target detection framework-based sleep apnea hypopnea syndrome assessment method of claim 1, wherein S4 comprises:
s4.1: the backbone network module and the area candidate module are trained together, and the candidate frame category loss and the candidate frame boundary loss of the area candidate module are summed and propagated reversely;
s4.2: the backbone network module and the sequence modeling module are trained together, and the classification loss and the boundary loss of the sequence modeling module are summed and reversely propagated;
s4.3: keeping parameters of the backbone network module fixed, continuing to train the area candidate module, summing the candidate frame category loss and the candidate frame boundary loss of the area candidate module and reversely transmitting;
s4.4: keeping parameters of the backbone network module fixed, training the sequence modeling module, summing the classification loss and the boundary loss of the sequence modeling module and reversely transmitting.
9. The method for objective detection framework-based assessment of sleep apnea hypopnea syndrome according to claim 8, wherein the candidate box class loss in S4.1 and S4.3 is calculated using a focal loss function.
10. An apparatus for assessing sleep apnea hypopnea syndrome based on a target detection framework, comprising:
the data acquisition module is used for acquiring original sleep physiological index data;
the preprocessing module is used for preprocessing the acquired original sleep physiological index data and marking SAHS segments;
the target detection framework construction module is used for constructing an SAHS target detection framework, the SAHS target detection framework comprising a backbone network module for fusing features, a region candidate module for generating detection candidate frames and a sequence modeling module for classifying candidate sequences;
the training module is used for acquiring training data from the preprocessed and labeled data and training the SAHS target detection framework by utilizing the training data;
and the detection module is used for detecting the data to be identified by using the trained SAHS target detection framework.
CN202111227631.3A 2021-10-21 2021-10-21 Sleep apnea-hypopnea syndrome evaluation method and device based on target detection framework Active CN113995379B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111227631.3A CN113995379B (en) 2021-10-21 2021-10-21 Sleep apnea-hypopnea syndrome evaluation method and device based on target detection framework

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111227631.3A CN113995379B (en) 2021-10-21 2021-10-21 Sleep apnea-hypopnea syndrome evaluation method and device based on target detection framework

Publications (2)

Publication Number Publication Date
CN113995379A (en) 2022-02-01
CN113995379B (en) 2023-08-15

Family

ID=79923431

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111227631.3A Active CN113995379B (en) 2021-10-21 2021-10-21 Sleep apnea-hypopnea syndrome evaluation method and device based on target detection framework

Country Status (1)

Country Link
CN (1) CN113995379B (en)

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140194780A1 (en) * 2011-05-17 2014-07-10 University Health Network Method and device for apnea and hypopnea detection
CN111759277A (en) * 2020-06-16 2020-10-13 清华大学深圳国际研究生院 Detection device, method, equipment and storage medium for sleep apnea hypopnea
CN113180691A (en) * 2020-12-28 2021-07-30 天津大学 Three-channel sleep apnea and hypopnea syndrome recognition device

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20140194780A1 (en) * 2011-05-17 2014-07-10 University Health Network Method and device for apnea and hypopnea detection
CN111759277A (en) * 2020-06-16 2020-10-13 清华大学深圳国际研究生院 Detection device, method, equipment and storage medium for sleep apnea hypopnea
CN113180691A (en) * 2020-12-28 2021-07-30 天津大学 Three-channel sleep apnea and hypopnea syndrome recognition device

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
严梅: "Analysis of the onset characteristics of 136 cases of obstructive sleep apnea-hypopnea syndrome", Chinese Journal of Misdiagnostics (中国误诊学杂志), no. 28
余辉; 王硕; 李心蕊; 邓晨阳; 孙敬来; 张力新; 曹玉珍: "Research on a real-time detection algorithm for sleep apnea and hypopnea events based on LSTM-CNN", Chinese Journal of Biomedical Engineering (中国生物医学工程学报), no. 03
贾子锐; 李佳; 黄晶晶; 陈力奋; 杨琳; 张天宇: "Automatic diagnosis of sleep apnea-hypopnea syndrome based on relative entropy", Chinese Journal of Ophthalmology and Otorhinolaryngology (中国眼耳鼻喉科杂志), no. 06

Also Published As

Publication number Publication date
CN113995379B (en) 2023-08-15

Similar Documents

Publication Publication Date Title
Groh et al. Evaluating deep neural networks trained on clinical images in dermatology with the fitzpatrick 17k dataset
Cheng et al. Recurrent neural network based classification of ECG signal features for obstruction of sleep apnea detection
CN111261282A (en) Sepsis early prediction method based on machine learning
CN110232383A (en) A kind of lesion image recognition methods and lesion image identifying system based on deep learning model
CN111493822B (en) Sleep electroencephalogram based rapid eye movement period sleep behavior disorder classification method
Wang et al. Obstructive sleep apnea detection using ecg-sensor with convolutional neural networks
CN113313045A (en) Man-machine asynchronous recognition method based on multi-task learning and class activation graph feedback
US20200365271A1 (en) Method for predicting sleep apnea from neural networks
CN109243604A (en) A kind of construction method and building system of the Kawasaki disease risk evaluation model based on neural network algorithm
CN111598868B (en) Lung ultrasonic image identification method and system
CN113288065A (en) Real-time apnea and hypopnea prediction method based on snore
Manikandan et al. Segmentation and Detection of Pneumothorax using Deep Learning
CN114881105A (en) Sleep staging method and system based on transformer model and contrast learning
Hafke-Dys et al. Artificial intelligence approach to the monitoring of respiratory sounds in asthmatic patients
JP7365747B1 (en) Disease treatment process abnormality identification system based on hierarchical neural network
Navaneeth et al. A deep-learning approach to find respiratory syndromes in infants using thermal imaging
CN113995379A (en) Sleep apnea and hypopnea syndrome evaluation method and device based on target detection framework
WO2023216609A1 (en) Target behavior recognition method and apparatus based on visual-audio feature fusion, and application
CN113951821B (en) Sleep staging method and device
CN115758122A (en) Sleep respiratory event positioning method and device based on multi-scale convolutional neural network
CN116269212A (en) Multi-mode sleep stage prediction method based on deep learning
CN114495265B (en) Human behavior recognition method based on activity graph weighting under multi-cross-domain scene
CN115565245A (en) ICU patient self-unplugging tube behavior early warning method based on RGB video monitoring
Tobias et al. Android Application for Chest X-ray Health Classification From a CNN Deep Learning TensorFlow Model
CN111161872B (en) Intelligent management system for child health

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant