CN113995379B - Sleep apnea-hypopnea syndrome evaluation method and device based on target detection framework - Google Patents


Info

Publication number
CN113995379B
CN113995379B (application CN202111227631.3A)
Authority
CN
China
Prior art keywords
candidate
module
target detection
sahs
data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202111227631.3A
Other languages
Chinese (zh)
Other versions
CN113995379A (en)
Inventor
陈丹
张垒
明哲锴
熊明福
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangxi Brain Modulation Technology Development Co ltd
Original Assignee
Jiangxi Brain Modulation Technology Development Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangxi Brain Modulation Technology Development Co ltd filed Critical Jiangxi Brain Modulation Technology Development Co ltd
Priority to CN202111227631.3A priority Critical patent/CN113995379B/en
Publication of CN113995379A publication Critical patent/CN113995379A/en
Application granted granted Critical
Publication of CN113995379B publication Critical patent/CN113995379B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Classifications

    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B - DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 - Measuring for diagnostic purposes; Identification of persons
    • A61B5/48 - Other medical applications
    • A61B5/4806 - Sleep evaluation
    • A61B5/4818 - Sleep apnoea
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B - DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 - Measuring for diagnostic purposes; Identification of persons
    • A61B5/103 - Detecting, measuring or recording devices for testing the shape, pattern, colour, size or movement of the body or parts thereof, for diagnostic purposes
    • A61B5/11 - Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb
    • A61B5/113 - Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb occurring during breathing
    • A61B5/1135 - Measuring movement of the entire body or parts thereof, e.g. head or hand tremor, mobility of a limb occurring during breathing by monitoring thoracic expansion
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B - DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 - Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 - Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7203 - Signal processing specially adapted for physiological signals or for diagnostic purposes for noise prevention, reduction or removal
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B - DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 - Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 - Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7225 - Details of analog processing, e.g. isolation amplifier, gain or sensitivity adjustment, filtering, baseline or drift compensation
    • A - HUMAN NECESSITIES
    • A61 - MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B - DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 - Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 - Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235 - Details of waveform analysis
    • A61B5/7264 - Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A - TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 - Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 - Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Signal Processing (AREA)
  • Biomedical Technology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Medical Informatics (AREA)
  • Molecular Biology (AREA)
  • Surgery (AREA)
  • Animal Behavior & Ethology (AREA)
  • Biophysics (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Pathology (AREA)
  • Physiology (AREA)
  • Artificial Intelligence (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Psychiatry (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • Evolutionary Computation (AREA)
  • Fuzzy Systems (AREA)
  • Mathematical Physics (AREA)
  • Dentistry (AREA)
  • Power Engineering (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

The application provides a sleep apnea-hypopnea syndrome (SAHS) evaluation method and device based on a target detection framework. Oral-nasal airflow and chest pressure data from sleep monitoring are selected as the data basis for SAHS evaluation, and a target detection model that fuses the two signals is designed. The model contains three main parts: first, the oral-nasal airflow and chest pressure data are processed in time-series form and the extracted features are fused; second, candidate boxes for SAHS fragment detection are generated by adaptive regression with a Region Proposal Network (RPN); third, scale-invariant features are generated from the candidate segments and classified. Model training alternates between the RPN and the classification network, and a focal loss function alleviates the bias that training on unbalanced samples might otherwise introduce.

Description

Sleep apnea-hypopnea syndrome evaluation method and device based on target detection framework
Technical Field
The application relates to the technical field of computers, in particular to a sleep apnea-hypopnea syndrome evaluation method and device based on a target detection framework.
Background
Sleep apnea-hypopnea syndrome (SAHS) is a typical sleep disorder: a clinical syndrome in which, for various reasons, apnea and/or hypopnea, hypercapnia, and sleep interruption occur repeatedly during sleep, causing a series of pathophysiological changes in the body. SAHS evaluation assesses the syndrome from physiological signals recorded during a person's sleep (for example, by polysomnography), and the result serves as an important reference for diagnosing sleep-disorder diseases.
Traditional SAHS evaluation methods usually use template matching to search for salient features of SAHS fragments and mark the fragments that meet the criteria; however, such algorithms are task-specific and highly customized, which makes them difficult to migrate between different segment-recognition tasks. Classification algorithms based on feature engineering or feature learning can be applied to different segment-recognition tasks: a feature set is obtained by manual feature extraction or feature learning, and a classifier then judges whether an input segment belongs to a specific signal type. A key limitation of classification-based algorithms, however, is that they can only judge the type of an input segment and cannot locate a specific segment; moreover, because the SAHS fragment length is uncertain, a sliding-window algorithm with a single fixed-length time window breaks the integrity of real fragments and cannot meet the requirement of accurate fragment localization.
Disclosure of Invention
To solve the above technical problem, a first aspect of the present application discloses a sleep apnea-hypopnea syndrome evaluation method based on a target detection framework, including:
s1: collecting original sleep physiological index data;
s2: preprocessing the acquired original sleep physiological index data, and labeling SAHS fragments;
s3: constructing an SAHS target detection framework, wherein the SAHS target detection framework comprises a backbone network module for fusing features, a region candidate module for generating detection candidate frames and a sequence modeling module for classifying candidate sequences, and the starting point of each candidate SAHS fragment can be obtained based on the detection candidate frames;
s4: acquiring training data from the preprocessed and marked data, and training an SAHS target detection frame by using the training data;
s5: and detecting the data to be identified by using the trained SAHS target detection framework.
In one embodiment, step S1 includes:
and monitoring various physiological indexes in the sleeping process of the person by using a standard polysomnography, and extracting the data of the mouth-nose airflow and the chest pressure as the original sleeping physiological index data.
In one embodiment, S2 comprises:
s2.1: performing downsampling processing on the chest pressure data based on a polyphase filtering algorithm, and taking the chest pressure data and the oral-nasal airflow data as two channels of an input SAHS target detection frame;
s2.2: labeling the data obtained in step S2.1, specifically labeling three types of SAHS fragments: obstructive, central, and hypopnea.
In one embodiment, the backbone network module is formed of three bottleneck layer structures, each comprising two sub-blocks, an identity mapping module and a convolution module, respectively, each sub-block having a series of one-dimensional convolution layers therein.
In one embodiment, the region candidate module employs a region candidate network RPN, which includes two branches, one for distinguishing whether a signal in a candidate box is a feature signal and the other for regressing the boundary of the candidate box.
In one embodiment, the output of the area candidate network is a matrix of n×2, where n represents the number of candidate boxes contained in the sample, and 2 represents two elements of the starting and ending positions of the candidate boxes.
In one embodiment, the sequence modeling module includes a feature processing layer, a timing signal embedding layer, and a full connection layer,
the feature processing layer is used for intercepting the feature sequence of the candidate region from the feature map to obtain a candidate sequence;
the time sequence signal embedding layer is used for extracting time sequence characteristics in the candidate sequence by adopting an LSTM network to obtain a time sequence signal embedding result;
the full connection layer is used for obtaining a classification result according to the time sequence signal embedding result.
In one embodiment, S4 comprises:
s4.1: training the backbone network module and the region candidate module together, summing the candidate frame class loss and the candidate frame boundary loss of the region candidate module, and back-propagating;
s4.2: training the backbone network module and the sequence modeling module together, summing the classification loss and the boundary loss of the sequence modeling module and back-propagating;
s4.3: keeping the parameters of the backbone network module fixed, continuing to train the region candidate module, summing the candidate frame class loss and the candidate frame boundary loss of the region candidate module, and back-propagating;
s4.4: and keeping the parameters of the backbone network module fixed, training the sequence modeling module, summing the classification loss and the boundary loss of the sequence modeling module, and back-propagating.
In one embodiment, the candidate box class loss in S4.1 and S4.3 is calculated using a focal loss function.
Based on the same inventive concept, a second aspect of the present application discloses a sleep apnea-hypopnea syndrome evaluation device based on a target detection framework, comprising:
the data acquisition module is used for acquiring original sleep physiological index data;
the preprocessing module is used for preprocessing the acquired original sleep physiological index data and labeling SAHS fragments;
the target detection framework construction module is used for constructing an SAHS target detection framework, and the SAHS target detection framework comprises a backbone network module for fusing features, a region candidate module for generating detection candidate frames and a sequence modeling module for classifying candidate sequences;
the training module is used for acquiring training data from the preprocessed and marked data and training the SAHS target detection frame by utilizing the training data;
and the detection module is used for detecting the data to be identified by using the trained SAHS target detection frame.
The above technical solutions in the embodiments of the present application at least have one or more of the following technical effects:
According to the sleep apnea-hypopnea syndrome evaluation method based on the target detection framework, an SAHS target detection framework is constructed in which the backbone network module extracts and fuses features from the input signals, the region candidate module generates detection candidate boxes from which the starting point of each candidate SAHS fragment is obtained, and the sequence modeling module classifies the candidate sequences. The SAHS target detection framework is trained with training data, and the trained framework is then used to detect the data to be identified and obtain the evaluation result. The framework thereby achieves accurate localization and accurate identification of the fragments.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required in the embodiments or the description of the prior art will be briefly described, and it is obvious that the drawings in the following description are some embodiments of the present application, and other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic diagram of an SAHS target detection framework constructed in the practice of the present application;
FIG. 2 is a graph of the real apnea/hypopnea segment duration distribution in the practice of the present application.
Detailed Description
The prior-art classification-based algorithms can only judge the type of an input fragment and cannot locate a specific fragment; moreover, because the SAHS fragment length is uncertain, a sliding-window algorithm with a single fixed-length time window breaks the integrity of real fragments and cannot meet the requirement of accurate fragment localization. The present application is directed at these problems.
The application introduces the target detection framework, which has been widely applied in computer vision in recent years and for which many excellent models have been proposed, into the analysis of data such as sleep signals. Through extensive research and practice, the inventors found that physiological signals differ essentially from images in data formation and semantics. Physiological signals are time-series data and carry natural temporal information as they evolve along the time dimension. Existing target detection networks basically use convolution layers as their basic structure; they preserve relative position information inside the signal but cannot integrate the temporal information of physiological signals. Therefore, to improve the model's detection performance on specific physiological signal segments, the application integrates temporal information into the target detection network and proposes a lightweight detection algorithm for specific physiological segments, applied to automatic SAHS evaluation so as to achieve accurate identification and accurate localization of apnea/hypopnea fragments.
The main inventive concept of the present application is as follows:
Oral-nasal airflow and chest pressure data from sleep monitoring are selected as the data basis of SAHS evaluation, and a target detection model (framework) fusing the two is designed. The model comprises three main parts: one processes the oral-nasal airflow and chest pressure data in time-series form and fuses the extracted features; one generates candidate boxes for SAHS fragment detection through adaptive regression with a Region Proposal Network (RPN); and one generates scale-invariant features from the candidate segments and classifies them. Model training alternates between the RPN and the classification network, and a focal loss function alleviates the bias that training on unbalanced samples might otherwise introduce.
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present application more apparent, the technical solutions of the embodiments of the present application will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present application, and it is apparent that the described embodiments are some embodiments of the present application, but not all embodiments of the present application. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
Example 1
The embodiment of the application provides a sleep apnea-hypopnea syndrome evaluation method based on a target detection framework, which comprises the following steps of:
s1: collecting original sleep physiological index data;
s2: preprocessing the acquired original sleep physiological index data, and labeling SAHS fragments;
s3: constructing an SAHS target detection framework, wherein the SAHS target detection framework comprises a backbone network module for fusing features, a region candidate module for generating detection candidate frames and a sequence modeling module for classifying candidate sequences, and the starting point of each candidate SAHS fragment can be obtained based on the detection candidate frames;
s4: acquiring training data from the preprocessed and marked data, and training an SAHS target detection frame by using the training data;
s5: and detecting the data to be identified by using the trained SAHS target detection framework.
The application provides a sleep apnea-hypopnea syndrome (SAHS) detection method based on a target detection framework. SAHS refers to a clinical syndrome in which apnea and/or hypopnea, hypercapnia, and sleep interruption occur repeatedly during sleep for various reasons, causing a series of pathophysiological changes in the body. The main goal of the SAHS assessment task is to detect, from respiratory indices, the apnea/hypopnea fragments that occur during the sleep of a sleep-disordered patient. To this end, a detection framework based on a target detection network is designed to detect the original apnea/hypopnea fragments directly; a sequence model is also introduced to generate fixed-length features and fuse the temporal information of the physiological signals for the subsequent classification of three SAHS fragment types: obstructive, central, and hypopnea.
Please refer to fig. 1, which is a schematic diagram illustrating a structure of an SAHS target detection framework constructed in the present application.
In one embodiment, step S1 includes:
and monitoring various physiological indexes in the sleeping process of the person by using a standard polysomnography, and extracting the data of the mouth-nose airflow and the chest pressure as the original sleeping physiological index data.
Specifically, the physiological indices include electroencephalography, electrooculography, oral-nasal airflow, chest-abdomen pressure, and so on; in this embodiment the oral-nasal airflow and chest pressure data are used as the data sources for SAHS segment detection.
In one embodiment, S2 comprises:
s2.1: performing downsampling processing on the chest pressure data based on a polyphase filtering algorithm, and taking the chest pressure data and the oral-nasal airflow data as two channels of an input SAHS target detection frame;
s2.2: labeling the three types of SAHS fragments, obstructive, central, and hypopnea, in the data obtained in step S2.1 (it should be noted that each SAHS fragment is labeled as one of the three types, and fragments other than SAHS fragments are regarded as non-SAHS fragments).
In the implementation process, as the effective sleeping time of a person exceeds 8 hours throughout the night, in order to reduce the performance pressure during analysis of a model (SAHS target detection frame), the original data is required to be divided into segments with the time length of 1 hour and used as model input; meanwhile, as the sampling rate of the chest pressure data is higher than that of the oral-nasal airflow data, in order to keep the consistency of the input lengths of the channels, the chest pressure data is subjected to downsampling processing based on a polyphase filtering algorithm, and the chest pressure data and the oral-nasal airflow data are used as two channels for inputting an SAHS target detection frame.
The dimension of the resulting single sample is [n_channel, seg_length × sampling_rate]. n_channel represents the number of channels of electrophysiological data, here 2; seg_length represents the segment time length, here 1 h by default; and sampling_rate represents the sampling rate, here 50 Hz. The finally obtained sample data may be expressed as [n_samples, n_channel, seg_length × sampling_rate], where n_samples represents the number of samples.
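As a rough illustration of this preprocessing step, the sketch below downsamples the chest pressure channel with polyphase filtering and assembles the two-channel, 1-hour samples described above. It assumes NumPy and SciPy; the function name make_samples and the way the night-long recordings are passed in are illustrative, not taken from the patent text.

```python
import numpy as np
from scipy.signal import resample_poly

FS_TARGET = 50       # oral-nasal airflow sampling rate (Hz)
FS_CHEST = 200       # chest pressure sampling rate (Hz)
SEG_SECONDS = 3600   # 1-hour segments, as in the text

def make_samples(airflow, chest_pressure):
    """Downsample chest pressure 200 Hz -> 50 Hz (polyphase filtering) and
    split both channels into 1-hour, two-channel samples.

    airflow:        1-D array sampled at 50 Hz
    chest_pressure: 1-D array sampled at 200 Hz
    returns:        array of shape [n_samples, n_channel=2, SEG_SECONDS * FS_TARGET]
    """
    chest_ds = resample_poly(chest_pressure, up=1, down=FS_CHEST // FS_TARGET)
    n = min(len(airflow), len(chest_ds))
    seg_len = SEG_SECONDS * FS_TARGET
    n_samples = n // seg_len
    two_channel = np.stack([airflow[:n_samples * seg_len],
                            chest_ds[:n_samples * seg_len]])   # [2, n_samples * seg_len]
    return two_channel.reshape(2, n_samples, seg_len).transpose(1, 0, 2)
```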
In one embodiment, the backbone network module is formed of three bottleneck layer structures, each comprising two sub-blocks, an identity mapping module and a convolution module, respectively, each sub-block having a series of one-dimensional convolution layers therein.
The identity mapping module and the convolution module have similar structures: each consists of stacked convolution layers plus a bypass that avoids the degradation problem of deep networks. The identity mapping module does not change the input and output dimensions, and its bypass leaves the input unchanged, so such modules can be stacked in series to deepen the network. The convolution module can change the input and output dimensions; a single convolution layer that changes the dimension is placed on its bypass so that the bypass output matches the output dimension of the stacked convolution layers.
In the specific implementation, for the preprocessed data, each time slice contains an oral-nasal airflow segment and a chest pressure segment in time-series form, and feature extraction and fusion are carried out by the backbone network module built from one-dimensional convolution computing units. After dimension reduction, nonlinear transformation, and dimension expansion, the heterogeneity between the different data sources is eliminated and the signals are converted into a high-level, more abstract feature matrix.
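A minimal PyTorch sketch of such a backbone is given below, assuming a ResNet-style arrangement of three bottleneck layers, each a convolution module (dimension-changing, with a single convolution on its bypass) followed by an identity mapping module (dimension-preserving, with an identity bypass). The channel counts, kernel size, and strides are illustrative assumptions; the patent does not specify them.

```python
import torch
import torch.nn as nn

class ConvBlock1D(nn.Module):
    """Sub-block that changes channel count/length; the bypass is a single
    strided 1x1 convolution so its output matches the stacked branch."""
    def __init__(self, in_ch, out_ch, stride=2):
        super().__init__()
        self.branch = nn.Sequential(
            nn.Conv1d(in_ch, out_ch, kernel_size=7, stride=stride, padding=3),
            nn.BatchNorm1d(out_ch), nn.ReLU(),
            nn.Conv1d(out_ch, out_ch, kernel_size=7, padding=3),
            nn.BatchNorm1d(out_ch))
        self.bypass = nn.Conv1d(in_ch, out_ch, kernel_size=1, stride=stride)
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.branch(x) + self.bypass(x))

class IdentityBlock1D(nn.Module):
    """Sub-block that keeps dimensions; the bypass is the identity."""
    def __init__(self, ch):
        super().__init__()
        self.branch = nn.Sequential(
            nn.Conv1d(ch, ch, kernel_size=7, padding=3),
            nn.BatchNorm1d(ch), nn.ReLU(),
            nn.Conv1d(ch, ch, kernel_size=7, padding=3),
            nn.BatchNorm1d(ch))
        self.act = nn.ReLU()

    def forward(self, x):
        return self.act(self.branch(x) + x)

# Three bottleneck layers, each a conv block followed by an identity block.
backbone = nn.Sequential(
    ConvBlock1D(2, 16), IdentityBlock1D(16),
    ConvBlock1D(16, 32), IdentityBlock1D(32),
    ConvBlock1D(32, 64), IdentityBlock1D(64))

features = backbone(torch.randn(1, 2, 18000))  # toy length; real samples are 1 h at 50 Hz
```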
In one embodiment, the region candidate module employs a region candidate network RPN, which includes two branches, one for distinguishing whether a signal in a candidate box is a feature signal and the other for regressing the boundary of the candidate box.
In one embodiment, the output of the area candidate network is a matrix of n×2, where n represents the number of candidate boxes contained in the sample, and 2 represents two elements of the starting and ending positions of the candidate boxes.
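A possible 1-D realization of this two-branch structure is sketched below in PyTorch: a shared convolution feeds a classification branch (feature signal vs. background per anchor) and a regression branch (start/end offsets per anchor). The channel width, the use of three anchors, and the kernel sizes are assumptions; anchor decoding and non-maximum suppression into the final n × 2 matrix of start/end positions are omitted.

```python
import torch
import torch.nn as nn

class RPNHead1D(nn.Module):
    """1-D region-proposal head with a shared convolution and two sibling
    branches: anchor scoring (signal vs. background) and boundary regression."""
    def __init__(self, in_ch=64, n_anchors=3):
        super().__init__()
        self.shared = nn.Conv1d(in_ch, in_ch, kernel_size=3, padding=1)
        self.cls = nn.Conv1d(in_ch, n_anchors * 2, kernel_size=1)  # signal / background
        self.reg = nn.Conv1d(in_ch, n_anchors * 2, kernel_size=1)  # start / end offsets

    def forward(self, feat):                 # feat: [batch, C, S]
        h = torch.relu(self.shared(feat))
        scores = self.cls(h)                 # [batch, n_anchors * 2, S]
        deltas = self.reg(h)                 # [batch, n_anchors * 2, S]
        return scores, deltas

head = RPNHead1D()
scores, deltas = head(torch.randn(4, 64, 2250))
```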
Specifically, the application migrates the traditional two-dimensional region candidate network to time-series data. The region candidate network takes as input the feature tensor X ∈ R^(N×S) of one signal segment at a time, where N is the feature dimension and S is the feature length; its training output is an n × 2 matrix whose rows are the start and end positions (p_1, p_2) of the n candidate boxes of the input segment. A "position" here refers to an element index of the input feature matrix (i.e. the output of the backbone network module), for example the 2nd to 7th elements. The start and end points of SAHS segments are obtained from the region candidate network, and the fused signal output by the backbone network module is then segmented accordingly. After the region candidate module gives the start and end points of an SAHS segment, the segment's duration can be recovered by back-calculation through the backbone network module.
The back-calculation is as follows. Let the original segment length be L (in seconds) and the sampling rate be f (the signal is actually a two-dimensional matrix formed by two channels, but only the time dimension needs to be considered here); let the segment length after feature fusion by the backbone network module be l; and let the candidate-box start and end points extracted by the region candidate module be p_1 and p_2. The corresponding start and end points on the original signal (i.e. the start and end points of the SAHS segment) are then

P_1 = p_1 * L * f / l,  P_2 = p_2 * L * f / l,

and back-calculation gives the duration

duration = (P_2 - P_1) / f = (p_2 - p_1) * L / l.
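The back-calculation can be written as a small helper; the function name and arguments are illustrative, and the formulas follow the reconstruction given above.

```python
def to_original_timeline(p1, p2, seg_seconds, fs, feat_len):
    """Map candidate-box start/end points on the feature map back to the original signal.

    p1, p2:      start/end indices on the feature map
    seg_seconds: original segment length L in seconds
    fs:          sampling rate f in Hz
    feat_len:    feature-map length l after the backbone
    returns:     (P1, P2, duration_in_seconds)
    """
    scale = seg_seconds * fs / feat_len       # original-signal samples per feature element
    P1, P2 = p1 * scale, p2 * scale
    return P1, P2, (p2 - p1) * seg_seconds / feat_len

# e.g. a box spanning feature elements 120-180 of a 1-h, 50 Hz segment with feature length 22500
print(to_original_timeline(120, 180, 3600, 50, 22500))  # duration = 60 * 3600 / 22500 = 9.6 s
```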
In one embodiment, the sequence modeling module includes a feature processing layer, a timing signal embedding layer, and a full connection layer,
the feature processing layer is used for intercepting the feature sequence of the candidate region from the feature map to obtain a candidate sequence;
the time sequence signal embedding layer is used for extracting time sequence characteristics in the candidate sequence by adopting an LSTM network to obtain a time sequence signal embedding result;
the full connection layer is used for obtaining a classification result according to the time sequence signal embedding result.
In the implementation, samples are fed into the model in batches, and to keep the input and output sizes of a batch consistent at each network layer, the candidate-region feature sequences intercepted from the feature map are padded so that all candidates in the batch have the same feature length. Assume the pooling step size is fixed to s and the batch contains N samples, and let the feature length of the i-th sample on the feature map be L_i; the number of features generated after pooling is then ⌈L_i / s⌉. To ensure feature uniformity, the feature count of the longest candidate region in the batch is taken as the standard and denoted T_max = max_i ⌈L_i / s⌉; when the features of the other candidate regions are intercepted from the feature map, extra elements are intercepted to pad them up to this maximum length. The fragments obtained in this way are the fragments to be classified.
The fragments to be classified obtained from the feature processing layer are then fed into a Long Short-Term Memory (LSTM) network whose number of time steps is set to T_max. The LSTM outputs a feature at every time step; to ensure that the final output of the LSTM layer corresponds to the real feature sequence of each sample, a mask matrix A is constructed at implementation time and initialized so that it records, for each sample, the position of its last valid time step, i.e. ⌈L_i / s⌉.
Here i indexes the i-th sample in the batch, j indexes the time dimension, and k indexes the N_output dimension (the embedded representation of the time-series signal).
Assume the LSTM output at every time step has size N_output; the tensor output by the LSTM for the input batch then has size N × T_max × N_output. One-hot encoding the mask matrix along the time dimension yields a mask tensor A' with the same size as the LSTM output, and the final time-series-signal embedding result O', of size N × N_output, is obtained by element-wise multiplication with the LSTM output followed by summation over the time dimension.
and finally, inputting a final result O' of time sequence signal embedding obtained by the time sequence signal embedding layer into a full-connection layer to obtain a classification result of a final sequence modeling module, wherein the size of the classification result is N multiplied by 4, specifically, the probability of being subordinate to 4 types (blocking type, central type, low ventilation type and non SAHS type) is calculated for each segment, the sum of the probabilities of the four types is 1, and the type corresponding to the maximum probability is taken as the classification result.
In one embodiment, S4 comprises:
s4.1: training the backbone network module and the region candidate module together, summing the candidate frame class loss and the candidate frame boundary loss of the region candidate module, and back-propagating;
s4.2: training the backbone network module and the sequence modeling module together, summing the classification loss and the boundary loss of the sequence modeling module and back-propagating;
s4.3: keeping the parameters of the backbone network module fixed, continuing to train the region candidate module, summing the candidate frame class loss and the candidate frame boundary loss of the region candidate module, and back-propagating;
s4.4: and keeping the parameters of the backbone network module fixed, training the sequence modeling module, summing the classification loss and the boundary loss of the sequence modeling module, and back-propagating.
In one embodiment, the candidate box class loss in S4.1 and S4.3 is calculated using a focal loss function.
In the specific implementation, alternating training is adopted: first the backbone network module and the region candidate module are trained together; then the backbone network module and the sequence modeling module are trained together; then the parameters of the backbone network module are kept fixed and the region candidate module continues to be trained; finally, with the backbone parameters still fixed, the sequence modeling module is trained. Here, the candidate-box class loss measures whether the candidate boxes given by the region candidate network are SAHS fragments; the candidate-box boundary loss is the error between the candidate-box boundaries and the annotated boundaries; the classification loss is the loss of classifying the three SAHS fragment types (obstructive, central, hypopnea) and non-SAHS fragments; and the boundary loss of the sequence modeling module is computed in the same way as the candidate-box boundary loss.
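A simplified sketch of this alternating schedule is given below; backbone, rpn, seq_head, rpn_loss, head_loss, and loader in the commented calls are hypothetical placeholders for the three modules, their summed losses, and the training data.

```python
import torch

def set_requires_grad(module, flag):
    """Freeze or unfreeze every parameter of a module."""
    for p in module.parameters():
        p.requires_grad = flag

def train_stage(modules_to_train, modules_frozen, loss_fn, loader, epochs=1, lr=1e-3):
    """One stage of the alternating schedule: only `modules_to_train` are updated;
    `loss_fn(batch)` is expected to return the summed loss (class + boundary)."""
    for m in modules_frozen:
        set_requires_grad(m, False)
    for m in modules_to_train:
        set_requires_grad(m, True)
    params = [p for m in modules_to_train for p in m.parameters()]
    optimizer = torch.optim.Adam(params, lr=lr)
    for _ in range(epochs):
        for batch in loader:
            optimizer.zero_grad()
            loss_fn(batch).backward()
            optimizer.step()

# Four-stage schedule corresponding to S4.1-S4.4 (hypothetical placeholders):
# train_stage([backbone, rpn],      [seq_head],           rpn_loss,  loader)  # S4.1
# train_stage([backbone, seq_head], [rpn],                head_loss, loader)  # S4.2
# train_stage([rpn],                [backbone, seq_head], rpn_loss,  loader)  # S4.3
# train_stage([seq_head],           [backbone, rpn],      head_loss, loader)  # S4.4
```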
The model training loss function of the region candidate network uses the focal loss, defined as FL(p_t) = -α_t (1 - p_t)^γ log(p_t), where p_t is the probability that the signal inside a candidate box is judged to be an SAHS characteristic signal (a positive sample), and α_t and γ are the hyper-parameters of the focal loss, also known as the balance coefficient and the focusing coefficient; in this embodiment α_t = 0.75 and γ = 2. The error is propagated to the region candidate module and the backbone network module by the back-propagation algorithm, and the iteration continues so as to optimize the model parameters.
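For reference, a straightforward implementation of this focal loss on candidate-box scores might look as follows; here α_t is treated as a single constant (0.75 as in the text), whereas a class-balanced variant would switch between α and 1 - α for positive and negative anchors.

```python
import torch

def focal_loss(p, is_positive, alpha_t=0.75, gamma=2.0, eps=1e-7):
    """Focal loss FL(p_t) = -alpha_t * (1 - p_t)^gamma * log(p_t).

    p:           predicted probability that each candidate box contains an
                 SAHS characteristic signal, shape [n_boxes]
    is_positive: 1 for positive anchors, 0 for background, shape [n_boxes]
    """
    p_t = torch.where(is_positive.bool(), p, 1.0 - p)   # probability of the true class
    return (-alpha_t * (1.0 - p_t) ** gamma * torch.log(p_t.clamp(min=eps))).mean()

loss = focal_loss(torch.tensor([0.9, 0.2, 0.7]), torch.tensor([1, 0, 1]))
```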
The method provided by the application is illustrated by the following specific examples.
Step 1: data acquisition refers to acquisition of sleep physiological indexes by using a standard polysomnography.
Indices used to detect apnea/hypopnea, such as oral-nasal thermal airflow, nasal pressure, and chest-abdomen pressure, are extracted from a public research dataset on children with sleep apnea syndrome (Childhood Adenotonsillectomy Trial, CHAT); with reference to the diagnostic criteria for apnea, the two indices of oral-nasal airflow and chest-abdomen pressure are selected as the basis for judgment. Because acquisition equipment differs between sites, the sampling rate of the same monitoring index also differs; the experiment therefore selects 100 children with identical index sampling rates as data sources, with an oral-nasal airflow sampling rate of 50 Hz and a chest-abdomen pressure sampling rate of 200 Hz.
Step 2: the data preprocessing refers to preprocessing the acquired data set and labeling the SAHS fragments.
Because a person's effective sleep time over a whole night exceeds 8 hours, the single-machine environment of this embodiment cannot bear the memory pressure of feeding the data in at once, so the whole signal is split into segments of 1 hour as model inputs. To keep the channel input lengths consistent, the chest-abdomen pressure data are downsampled to the 50 Hz sampling rate of the oral-nasal airflow using a polyphase filtering algorithm. The SAHS segments are labeled by recording their start times according to the expert annotation information provided with the original dataset.
Step 3: constructing an SAHS fragment target detection framework;
(3.1) in order to perform feature fusion on the preprocessed data of each time slice including the time-series form of the oral-nasal airflow slice and the chest pressure slice, the present embodiment designs a backbone network module composed of one-dimensional convolution computing units. The backbone network module is formed by three bottleneck layer structures, each comprising two sub-blocks (identity mapping module and convolution module).
(3.2) To generate SAHS fragment detection candidate boxes, this embodiment designs a region candidate module composed of a region proposal network. The network requires preset (anchor) box sizes; referring to the real apnea/hypopnea segment duration distribution shown in FIG. 2 (horizontal axis: duration; vertical axis: number of segments of that duration), three preset boxes of 8 s, 13 s, and 18 s are selected, and a preset box whose overlap with a real segment exceeds 0.6 is marked as a positive sample. The region candidate module generates a series of two-dimensional vectors for each input sample, each vector giving the start and end positions of one candidate box.
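The anchor labeling rule can be illustrated with the small sketch below, where "overlap" is interpreted as temporal intersection-over-union; that interpretation, the centre-based anchor placement, and the function names are assumptions for illustration only.

```python
def overlap_ratio(anchor, segment):
    """Temporal overlap (IoU) between two 1-D intervals given as (start, end) in seconds."""
    inter = max(0.0, min(anchor[1], segment[1]) - max(anchor[0], segment[0]))
    union = (anchor[1] - anchor[0]) + (segment[1] - segment[0]) - inter
    return inter / union if union > 0 else 0.0

def label_anchors(anchor_centers, true_segments, widths=(8, 13, 18), threshold=0.6):
    """Mark a preset box positive if its overlap with any real segment exceeds the threshold."""
    labels = []
    for c in anchor_centers:
        for w in widths:
            box = (c - w / 2, c + w / 2)
            positive = any(overlap_ratio(box, seg) > threshold for seg in true_segments)
            labels.append((box, int(positive)))
    return labels

# anchors centred at 100 s checked against a 13-second true segment
print(label_anchors([100.0], [(95.0, 108.0)]))
```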
(3.3) in order to detect candidate segments, this embodiment constructs a sequence modeling module based on LSTM and fully-connected layer network, maps the two-dimensional vector obtained in the region candidate module onto the feature matrix generated by the backbone network module, and segments the feature matrix. And (3) fusing time sequence characteristics by using an LSTM network, simultaneously ensuring that candidate frames with different sizes can obtain characteristics with the same length, and inputting the output of the LSTM into a full-connection layer for final classification.
Step 4: training the target detection framework by using training data;
in order to realize parameter sharing and co-training among the modules, the embodiment uses an alternate training mode to train the network. Specifically, firstly, training a backbone network module and a region candidate module together; training the backbone network module and the sequence modeling module together; then keeping the parameters of the backbone network module fixed, and continuing training the region candidate module; finally, parameters of the backbone network module are fixed, and the sequence modeling module is trained.
Step 5: and detecting the data to be identified by using the trained SAHS target detection framework.
In order to confirm the effect of the present embodiment, the present embodiment extracts data in the CHAT dataset that is not used for training, and performs the same data slicing and downsampling operations, resulting in a series of data to be identified. And inputting the data to be identified into a detection framework to obtain a final identification result.
Compared with the prior art, the application has the beneficial effects that:
1. The application provides a general solution to the shortcoming of target detection networks in detecting characteristic waveforms of arbitrary size in biological signals: a pooling strategy combined with sequence modeling is proposed to solve the feature-distortion problem while enhancing the temporal information.
2. The application provides a lightweight deep learning framework for judging the type of SAHS fragments and locating the starting point and duration of each fragment, which also demonstrates the feasibility of detecting other characteristic waveforms from biological signals.
Example two
Based on the same inventive concept, the present embodiment provides a sleep apnea-hypopnea syndrome evaluation device based on a target detection frame, including:
the data acquisition module is used for acquiring original sleep physiological index data;
the preprocessing module is used for preprocessing the acquired original sleep physiological index data and labeling SAHS fragments;
the target detection framework construction module is used for constructing an SAHS target detection framework, and the SAHS target detection framework comprises a backbone network module for fusing features, a region candidate module for generating detection candidate frames and a sequence modeling module for classifying candidate sequences;
the training module is used for acquiring training data from the preprocessed and marked data and training the SAHS target detection frame by utilizing the training data;
and the detection module is used for detecting the data to be identified by using the trained SAHS target detection frame.
Since the device described in the second embodiment of the present application is a device for implementing the sleep apnea-hypopnea syndrome evaluation method based on the target detection frame in the first embodiment of the present application, a person skilled in the art can understand the specific structure and deformation of the device based on the method described in the first embodiment of the present application, and therefore, the detailed description thereof is omitted herein. All devices used in the method according to the first embodiment of the present application are within the scope of the present application.
The above embodiments are only for illustrating the technical solution of the present application, and are not limiting; although the application has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present application.

Claims (8)

1. A method for assessing sleep apnea-hypopnea syndrome based on a target detection framework, comprising:
s1: collecting original sleep physiological index data;
s2: preprocessing the acquired original sleep physiological index data, and labeling SAHS fragments;
s3: constructing an SAHS target detection framework, wherein the SAHS target detection framework comprises a backbone network module for fusing features, a region candidate module for generating detection candidate frames and a sequence modeling module for classifying candidate sequences, the region candidate module adopts a region candidate network RPN, the region candidate network comprises two branches, one branch is used for distinguishing whether the signals in the candidate frames are feature signals and the other branch is used for regressing the boundaries of the candidate frames, the output of the region candidate network is an n × 2 matrix, where n represents the number of candidate frames contained in a sample and 2 represents the two elements of the start and end positions of a candidate frame, and the start and end points of SAHS fragments are obtained from the region candidate network;
s4: acquiring training data from the preprocessed and marked data, and training an SAHS target detection frame by using the training data;
s5: and detecting the data to be identified by using the trained SAHS target detection framework.
2. The sleep apnea hypopnea syndrome assessment method based on a target detection framework of claim 1, wherein step S1 comprises:
and monitoring various physiological indexes in the sleeping process of the person by using a standard polysomnography, and extracting the data of the mouth-nose airflow and the chest pressure as the original sleeping physiological index data.
3. The sleep apnea hypopnea syndrome assessment method based on a target detection framework of claim 2, wherein S2 comprises:
s2.1: performing downsampling processing on the chest pressure data based on a polyphase filtering algorithm, and taking the chest pressure data and the oral-nasal airflow data as two channels of an input SAHS target detection frame;
s2.2: labeling the data obtained in step S2.1, specifically labeling three types of SAHS fragments: obstructive, central, and hypopnea.
4. The sleep apnea hypopnea syndrome assessment method based on a target detection framework according to claim 2, wherein the backbone network module is comprised of three bottleneck layer structures, each bottleneck layer structure comprising two sub-blocks, an identity mapping module and a convolution module, respectively, and each sub-block being a series of one-dimensional convolution layers.
5. The sleep apnea hypopnea syndrome assessment method based on a target detection framework according to claim 1, wherein the sequence modeling module comprises a feature processing layer, a timing signal embedding layer, and a fully connected layer,
the feature processing layer is used for intercepting the feature sequence of the candidate region from the feature map to obtain a candidate sequence;
the time sequence signal embedding layer is used for extracting time sequence characteristics in the candidate sequence by adopting an LSTM network to obtain a time sequence signal embedding result;
the full connection layer is used for obtaining a classification result according to the time sequence signal embedding result.
6. The sleep apnea hypopnea syndrome assessment method based on a target detection framework of claim 1, wherein S4 comprises:
s4.1: training the backbone network module and the region candidate module together, summing the candidate frame class loss and the candidate frame boundary loss of the region candidate module, and back-propagating;
s4.2: training the backbone network module and the sequence modeling module together, summing the classification loss and the boundary loss of the sequence modeling module and back-propagating;
s4.3: keeping the parameters of the backbone network module fixed, continuing to train the region candidate module, summing the candidate frame class loss and the candidate frame boundary loss of the region candidate module, and back-propagating;
s4.4: and keeping the parameters of the backbone network module fixed, training the sequence modeling module, summing the classification loss and the boundary loss of the sequence modeling module, and back-propagating.
7. The sleep apnea hypopnea syndrome assessment method based on a target detection framework of claim 6, wherein the candidate box class loss in S4.1 and S4.3 is calculated using a focal loss function.
8. A sleep apnea-hypopnea syndrome assessment device based on a target detection framework, comprising:
the data acquisition module is used for acquiring original sleep physiological index data;
the preprocessing module is used for preprocessing the acquired original sleep physiological index data and labeling SAHS fragments;
the target detection frame construction module is used for constructing an SAHS target detection frame, the SAHS target detection frame comprises a backbone network module used for fusing features, a region candidate module used for generating detection candidate frames and a sequence modeling module used for classifying candidate sequences, wherein the region candidate module adopts a region candidate network RPN, the region candidate network comprises two branches, one branch is used for distinguishing whether the signals in the candidate frames are characteristic signals and the other branch is used for regressing the boundaries of the candidate frames, the output of the region candidate network is an n × 2 matrix, where n represents the number of candidate frames contained in a sample and 2 represents the two elements of the start and end positions of a candidate frame, and the start and end points of SAHS fragments are obtained from the region candidate network;
the training module is used for acquiring training data from the preprocessed and marked data and training the SAHS target detection frame by utilizing the training data;
and the detection module is used for detecting the data to be identified by using the trained SAHS target detection frame.
CN202111227631.3A 2021-10-21 2021-10-21 Sleep apnea-hypopnea syndrome evaluation method and device based on target detection framework Active CN113995379B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111227631.3A CN113995379B (en) 2021-10-21 2021-10-21 Sleep apnea-hypopnea syndrome evaluation method and device based on target detection framework

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111227631.3A CN113995379B (en) 2021-10-21 2021-10-21 Sleep apnea-hypopnea syndrome evaluation method and device based on target detection framework

Publications (2)

Publication Number Publication Date
CN113995379A CN113995379A (en) 2022-02-01
CN113995379B true CN113995379B (en) 2023-08-15

Family

ID=79923431

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111227631.3A Active CN113995379B (en) 2021-10-21 2021-10-21 Sleep apnea-hypopnea syndrome evaluation method and device based on target detection framework

Country Status (1)

Country Link
CN (1) CN113995379B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111759277A (en) * 2020-06-16 2020-10-13 清华大学深圳国际研究生院 Detection device, method, equipment and storage medium for sleep apnea hypopnea
CN113180691A (en) * 2020-12-28 2021-07-30 天津大学 Three-channel sleep apnea and hypopnea syndrome recognition device

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9649087B2 (en) * 2011-05-17 2017-05-16 University Health Network Method and device for apnea and hypopnea detection

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111759277A (en) * 2020-06-16 2020-10-13 清华大学深圳国际研究生院 Detection device, method, equipment and storage medium for sleep apnea hypopnea
CN113180691A (en) * 2020-12-28 2021-07-30 天津大学 Three-channel sleep apnea and hypopnea syndrome recognition device

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Research on a real-time detection algorithm for sleep apnea and hypopnea events based on LSTM-CNN; 余辉, 王硕, 李心蕊, 邓晨阳, 孙敬来, 张力新, 曹玉珍; Chinese Journal of Biomedical Engineering, No. 3; full text *

Also Published As

Publication number Publication date
CN113995379A (en) 2022-02-01

Similar Documents

Publication Publication Date Title
US11669965B2 (en) AI-based label generating system and methods for use therewith
Shahzadi et al. CNN-LSTM: Cascaded framework for brain tumour classification
WO2020253629A1 (en) Detection model training method and apparatus, computer device, and storage medium
CN109345538A (en) A kind of Segmentation Method of Retinal Blood Vessels based on convolutional neural networks
Perdomo et al. Convolutional network to detect exudates in eye fundus images of diabetic subjects
CN111738302A (en) System for classifying and diagnosing Alzheimer disease based on multi-modal data
CN109273093A (en) A kind of construction method and building system of Kawasaki disease risk evaluation model
Vengatesan et al. Face recognition of identical twins based on support vector machine classifier
Zhang et al. A survey of wound image analysis using deep learning: Classification, detection, and segmentation
CN114822823B (en) Tumor fine classification system based on cloud computing and artificial intelligence fusion multi-dimensional medical data
Manikandan et al. Segmentation and Detection of Pneumothorax using Deep Learning
CN114881105A (en) Sleep staging method and system based on transformer model and contrast learning
CN117218453B (en) Incomplete multi-mode medical image learning method
CN116912253B (en) Lung cancer pathological image classification method based on multi-scale mixed neural network
CN112861881A (en) Honeycomb lung recognition method based on improved MobileNet model
CN113995379B (en) Sleep apnea-hypopnea syndrome evaluation method and device based on target detection framework
CN114841216B (en) Electroencephalogram signal classification method based on model uncertainty learning
Tobias et al. Android Application for Chest X-ray Health Classification From a CNN Deep Learning TensorFlow Model
CN114495265B (en) Human behavior recognition method based on activity graph weighting under multi-cross-domain scene
CN113951821B (en) Sleep staging method and device
CN110443276A (en) Time series classification method based on depth convolutional network Yu the map analysis of gray scale recurrence
Wang et al. A ROI extraction method for wrist imaging applied in smart bone-age assessment system
CN115565245A (en) ICU patient self-unplugging tube behavior early warning method based on RGB video monitoring
Mounika et al. A Deep Hybrid Neural Network Model to Detect Diabetic Retinopathy from Eye Fundus Images
CN116682576B (en) Liver cancer pathological prognosis system and device based on double-layer graph convolutional neural network

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant