CN113599052A - Snore monitoring method and system based on deep learning algorithm and corresponding electric bed control method and system - Google Patents

Snore monitoring method and system based on deep learning algorithm and corresponding electric bed control method and system

Info

Publication number: CN113599052A
Application number: CN202110803746.6A
Authority: CN (China)
Prior art keywords: snore, audio, deep learning, learning algorithm, slice
Priority date / filing date: 2021-07-15; publication date: 2021-11-05
Legal status: Pending (the status is an assumption and is not a legal conclusion)
Other languages: Chinese (zh)
Inventors: Shan Huafeng (单华锋), Ding Shaokang (丁少康), Zhang Jianwei (张建炜), Zheng Jian (郑剑)
Current and original assignee: Keeson Technology Corp Ltd
Application filed by Keeson Technology Corp Ltd
Priority: CN202110803746.6A; PCT/CN2022/105656 (published as WO2023284813A1)

Classifications

    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61F FILTERS IMPLANTABLE INTO BLOOD VESSELS; PROSTHESES; DEVICES PROVIDING PATENCY TO, OR PREVENTING COLLAPSING OF, TUBULAR STRUCTURES OF THE BODY, e.g. STENTS; ORTHOPAEDIC, NURSING OR CONTRACEPTIVE DEVICES; FOMENTATION; TREATMENT OR PROTECTION OF EYES OR EARS; BANDAGES, DRESSINGS OR ABSORBENT PADS; FIRST-AID KITS
    • A61F5/00 Orthopaedic methods or devices for non-surgical treatment of bones or joints; Nursing devices; Anti-rape devices
    • A61F5/56 Devices for preventing snoring
    • A HUMAN NECESSITIES
    • A47 FURNITURE; DOMESTIC ARTICLES OR APPLIANCES; COFFEE MILLS; SPICE MILLS; SUCTION CLEANERS IN GENERAL
    • A47C CHAIRS; SOFAS; BEDS
    • A47C19/00 Bedsteads
    • A47C19/22 Combinations of bedsteads with other furniture or with accessories, e.g. with bedside cabinets
    • A HUMAN NECESSITIES
    • A47 FURNITURE; DOMESTIC ARTICLES OR APPLIANCES; COFFEE MILLS; SPICE MILLS; SUCTION CLEANERS IN GENERAL
    • A47C CHAIRS; SOFAS; BEDS
    • A47C20/00 Head-, foot-, or like rests for beds, sofas or the like
    • A47C20/04 Head-, foot-, or like rests for beds, sofas or the like with adjustable inclination
    • A47C20/041 Head-, foot-, or like rests for beds, sofas or the like with adjustable inclination by electric motors
    • A HUMAN NECESSITIES
    • A47 FURNITURE; DOMESTIC ARTICLES OR APPLIANCES; COFFEE MILLS; SPICE MILLS; SUCTION CLEANERS IN GENERAL
    • A47C CHAIRS; SOFAS; BEDS
    • A47C21/00 Attachments for beds, e.g. sheet holders, bed-cover holders; Ventilating, cooling or heating means in connection with bedsteads or mattresses
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/48 Other medical applications
    • A61B5/4803 Speech analysis specially adapted for diagnostic purposes
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235 Details of waveform analysis
    • A61B5/7264 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • A61B5/7267 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device

Landscapes

  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Animal Behavior & Ethology (AREA)
  • Public Health (AREA)
  • Veterinary Medicine (AREA)
  • Physics & Mathematics (AREA)
  • Biophysics (AREA)
  • Surgery (AREA)
  • Nursing (AREA)
  • Molecular Biology (AREA)
  • Medical Informatics (AREA)
  • Pathology (AREA)
  • Pulmonology (AREA)
  • Mathematical Physics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physiology (AREA)
  • Psychiatry (AREA)
  • Signal Processing (AREA)
  • Fuzzy Systems (AREA)
  • Evolutionary Computation (AREA)
  • Otolaryngology (AREA)
  • Orthopedic Medicine & Surgery (AREA)
  • Vascular Medicine (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

The invention discloses a snore monitoring method based on a deep learning algorithm. Built on deep learning and speech recognition technology, the method comprises: collecting audio signals and slicing them according to a preset sample duration; judging whether each slice contains sound using a silence detection algorithm; extracting acoustic spectral features from the audio slices that contain sound; inputting the generated spectral features into a deep neural network to extract deep learning features; classifying the extracted deep learning features with a fully connected layer; and judging snore events over a preset time length and intervening accordingly. The invention also discloses a related system. Compared with traditional methods, the snore identification method and system based on the deep learning algorithm greatly improve the accuracy of distinguishing snore from non-snore sounds and bring a better user experience (figure 1).

Description

Snore monitoring method and system based on deep learning algorithm and corresponding electric bed control method and system
Technical Field
The invention relates to speech-related technology and deep learning technology, and in particular to a snore monitoring and electric bed control method and device based on a deep learning algorithm.
Background
According to incomplete statistics, people who snore account for nearly 20% of the population in China, and severe snoring can even cause obstructive apnea, seriously affecting physical health.
At present, related snore detection technology mostly adopts the following methods: detection with contact-type wearable equipment, which achieves the highest detection rate and is not prone to false triggering, but affects sleep; and detection of respiratory-tract vibration with a sound sensor or a vibration sensor, which has low accuracy and is prone to false detection. Therefore, there is a need for a snore detection and intervention system that can accurately detect snoring without affecting sleep quality while avoiding false detection to the greatest extent.
Disclosure of Invention
The invention provides a snore detection method based on deep learning and speech recognition technology, implemented in an embedded terminal, which can solve the problems of serious false recognition and missed recognition in the prior art.
To this end, some embodiments of the present application provide a snore monitoring method based on a deep learning algorithm, comprising: collecting sound signals and slicing them to obtain audio slices; judging whether each audio slice contains sound using a silence detection algorithm; and performing Mel frequency cepstrum coefficient (MFCC) feature extraction on the audio slices judged to contain sound, then performing deep feature extraction and classification with a pre-trained deep learning model to classify each audio slice as containing snore or not containing snore.
In some embodiments, the silence detection algorithm comprises calculating at least one of an energy peak, an energy mean, and an energy standard deviation in an audio slice; comparing the calculated parameters with preset silence thresholds; and identifying the audio slice as silent if all the parameters are below their thresholds.
In some embodiments, the snore classification algorithm comprises: performing a framing operation on the audio slice; extracting MFCC features from the framed data, forming a spectrogram of the audio slice from the MFCC features, and inputting the spectrogram into a pre-trained convolutional neural network to extract deep learning features; and then inputting the deep learning features into a fully connected layer for classification, so as to classify each audio slice as containing snore or not containing snore. The deep learning feature is a high-dimensional feature vector.
In some embodiments, the framing is performed by sliding window frame shifting.
In some embodiments, 64-dimensional spectral features are extracted for each frame.
In some embodiments, the method further includes determining the ratio of audio slices classified as containing snore in a certain time period to the total number of audio slices in that period; if the ratio exceeds a preset threshold, it is determined that a snore event exists in that time period.
In some embodiments, the snore event determination method comprises: storing the identification results of the audio slices in a fixed-length queue, the identification results comprising audio slices containing no sound, audio slices containing snore, and audio slices not containing snore; and, if the ratio of the number of audio slices containing snore to the total number of audio slices in the fixed-length queue exceeds a preset threshold, determining that a snore event exists in the time period corresponding to the queue.
In some embodiments, if the ratio of the number of audio slices containing snore to the total number of audio slices in the fixed-length queue exceeds a preset ratio value, for example 75%, it is determined that there is a snore event in the time period corresponding to the fixed-length queue.
The application also provides a snore monitoring system based on a deep learning algorithm, comprising an embedded development board serving as a computer program carrier, the computer program being executed to implement the method of any one of claims 1 to 8.
In some embodiments, the embedded development board includes a recording function, and the system is configured to turn on the recording function of the development board in real time for detection of a snoring event.
In some embodiments, an ARM-architecture i.MX6UL development board is employed to run the trained deep learning model.
Other embodiments of the present application provide an electric bed control method based on a deep learning algorithm, comprising: collecting sound signals and slicing them to obtain audio slices; judging whether each audio slice contains sound using a silence detection algorithm; performing Mel frequency cepstrum coefficient (MFCC) feature extraction on the audio slices judged to contain sound, then performing deep feature extraction and classification with a pre-trained deep learning model to classify each audio slice as containing snore or not containing snore; judging the proportion of audio slices classified as containing snore in a certain time period to the total number of audio slices in that period, and judging that a snore event exists in that period if the proportion exceeds a preset threshold; and, after a snore event is detected, driving at least one component of the electric bed to act so as to interfere with the production of snoring.
In some embodiments, the silence detection algorithm comprises calculating at least one of an energy peak, an energy mean, and an energy standard deviation in an audio slice; comparing the calculated parameters with preset silence thresholds; and identifying the audio slice as silent if all the parameters are below their thresholds.
In some embodiments, the snore classification algorithm comprises: performing a framing operation on the audio slice; extracting MFCC features from the framed data, forming a spectrogram of the audio slice from the MFCC features, and inputting the spectrogram into a pre-trained convolutional neural network to extract deep learning features; and then inputting the deep learning features into a fully connected layer for classification, so as to classify each audio slice as containing snore or not containing snore. The deep learning feature is a high-dimensional feature vector.
In some embodiments, the framing is performed by sliding window frame shifting.
In some embodiments, 64-dimensional spectral features are extracted for each frame.
In some embodiments, the snore event determination method comprises: storing the identification results of the audio slices in a fixed-length queue, the identification results comprising audio slices containing no sound, audio slices containing snore, and audio slices not containing snore; and, if the ratio of the number of audio slices containing snore to the total number of audio slices in the fixed-length queue exceeds a preset threshold, determining that a snore event exists in the time period corresponding to the queue.
In some embodiments, if the ratio of the number of audio slices containing snore to the total number of audio slices in the fixed-length queue exceeds a preset ratio value, for example 75%, it is determined that there is a snore event in the time period corresponding to the fixed-length queue.
In some embodiments, when a snore event is detected, the electric bed is controlled to lift the bed head by a preset angle while the snore event detection algorithm continues to run; if no snore event is detected within a preset time, the bed head is controlled to return flat.
Still further embodiments of the present application provide an electric bed control system based on a deep learning algorithm, comprising an embedded development board serving as a computer program carrier, the computer program being executed to implement the method of any one of claims 12 to 19.
In some embodiments, the embedded development board includes a recording function, and the system is configured to turn on the recording function of the development board in real time for detection of a snoring event.
In some embodiments, an ARM-architecture i.MX6UL development board is employed to run the trained deep learning model.
In some embodiments, the system includes a control terminal storing the computer program such that the computer program is run on the control terminal to control the electric bed.
Compared with traditional methods, the snore identification equipment based on the deep learning algorithm greatly improves the accuracy of distinguishing snore from non-snore sounds and brings a better user experience.
In addition, the structure provided by the application greatly reduces the number of network parameters and the amount of computation, and can be widely applied to mobile terminals such as mobile phones and embedded terminals.
Drawings
Fig. 1 is a flow diagram of a snore monitoring method and intervention method based on a deep learning algorithm according to an embodiment of the present application;
fig. 2 is a schematic flow chart of a snore monitoring method and an intervention method based on a deep learning algorithm according to an embodiment of the present application;
FIG. 3 is a flow diagram of the silence detection algorithm according to an embodiment of the application;
FIG. 4 is a schematic flow diagram of a feature acquisition method according to an embodiment of the present application;
FIG. 5 is a functional schematic comparing depthwise (layer-by-layer) convolution, pointwise (pixel-by-pixel) convolution and conventional convolution;
FIG. 6 is a schematic flow chart illustrating pre-training of a deep neural network according to an embodiment of the present application;
fig. 7 is a schematic illustration of the sampling principle in some embodiments of the present application.
Detailed Description
The following detailed description of embodiments of the present application refers to the accompanying drawings.
Fig. 1 shows a flow chart of an embodiment of the snore detection method based on a deep learning algorithm according to the invention. As shown in the figure, the method comprises the following steps.
Step S101, collecting sound signals. The sampling frequency can be set to 16 kHz; sampling runs continuously from system start-up and does not stop until it is actively shut down.
Step S102, slicing the acquired audio signal to a specified length. In this embodiment the slice length is 6 seconds with a 3-second sliding step, i.e. an audio slice is extracted from the continuously acquired signal each time enough new data has been generated.
Step S103, performing silence-detection analysis on the obtained fixed-length audio slice to judge whether the audio segment contains sound.
Step S104, if the audio slice contains sound, extracting acoustic features.
Step S105, inputting the extracted acoustic features into a pre-trained deep neural network for deep feature extraction, finally encoding the audio segment into fixed-dimension features; in this embodiment, 128-dimensional features are adopted.
Step S106, inputting the obtained fixed-dimension features into a fully connected layer network for final binary classification, obtaining whether the current audio slice is snore or not.
Step S107, storing the classification results in a fixed-length queue, then performing integrated analysis of all the audio slice results in the queue to judge whether a snore event exists in the corresponding fixed time period.
The fixed-length queue may contain only the classification results.
Alternatively, the fixed-length queue may contain both the classification results and the audio slices judged in step S103 not to contain sound.
Audio slices are also referred to as audio data slices in this application.
The judgment result of whether snore exists can serve as a control basis for subsequent actions of the electric bed. For example, a snore event may be established based on the proportion of audio slices in the fixed-length queue that contain snore, and an action or sequence of actions of the electric bed may be controlled based on the snore event.
While the electric bed is acting, audio can continue to be collected and monitored according to the monitoring method; if no audio slice containing snore appears within a preset time period, the action of the electric bed is stopped.
The steps of the above method are explained in detail below.
Specifically, in step S102 the acquired audio signal may be sliced with a slice length of 6 seconds, i.e. an audio slice is extracted from the continuously acquired signal whenever a new 6 seconds of data is available. The duration of the audio slice is preferably set so that, to the greatest extent possible, at least one complete snore falls within one audio slice, improving the detection success rate of the algorithm. Taking a human breathing rate of 10-20 breaths per minute as an example, 6 seconds is sufficient to encompass a complete breath.
The 3-second step can also be arranged in the form of a sliding window, since plain slicing may cut a complete snore into two different 6-second slices and thus affect the snore determination of both. The sliding-window approach avoids this problem to the greatest extent possible. A schematic diagram of the sliding window is shown in fig. 7, where each bin represents one second of data. Adopting the sliding-window mode avoids missing snores.
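As an illustration of this slicing scheme, the following C++ sketch (class and constant names are our own, not from the patent) emits one 6-second slice for every 3 seconds of a continuous 16 kHz stream:

    #include <cstdint>
    #include <deque>
    #include <vector>

    constexpr size_t kSampleRate   = 16000;             // 16 kHz sampling frequency
    constexpr size_t kSliceSamples = 6 * kSampleRate;   // 6-second slice
    constexpr size_t kHopSamples   = 3 * kSampleRate;   // 3-second sliding step

    class SliceExtractor {
    public:
        // Feed newly captured samples; returns every complete 6 s slice,
        // with consecutive slices overlapping by 3 s.
        std::vector<std::vector<int16_t>> push(const int16_t* data, size_t n) {
            buffer_.insert(buffer_.end(), data, data + n);
            std::vector<std::vector<int16_t>> slices;
            while (buffer_.size() >= kSliceSamples) {
                slices.emplace_back(buffer_.begin(), buffer_.begin() + kSliceSamples);
                // Drop only the hop so the next slice keeps a 3 s overlap.
                buffer_.erase(buffer_.begin(), buffer_.begin() + kHopSamples);
            }
            return slices;
        }
    private:
        std::deque<int16_t> buffer_;
    };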
Specifically, step S103, performing silence detection on the intercepted audio slice to determine whether it contains sound, as shown in fig. 3, further comprises:
calculating the maximum energy value, the minimum energy value and the standard deviation of the energy of the sampled values in the audio slice, step S301;
comparing the three values with preset silence thresholds, step S302; if the silence condition is satisfied, determining that the current audio slice contains no sound, discarding the slice and skipping the subsequent snore monitoring algorithm, step S303, thereby greatly reducing the amount of computation.
In the embodiment of the invention, a number of silent signals are collected with the relevant development board in a sleep environment, i.e. an environment in which the human ear cannot hear obvious sound, close to actual sleeping conditions. The signals are sliced into 6-second audio slices, the maximum value, minimum value and standard deviation of the energy in each slice are calculated, and after sorting, the 90th-percentile value is selected as the threshold (for example, with 1000 values sorted from small to large, the 900th value is selected). For example, 100 audio segments of 6 seconds are collected in such an environment; the maximum energy value, minimum energy value and standard deviation are extracted from each; each set of values is sorted from small to large and the 90th value is selected as the respective threshold. If, say, the maximum-energy threshold is set to 2000, then during actual use of the algorithm a 6-second audio slice whose maximum energy value is less than 2000 is considered silent. Of course, the embodiment of the invention uses not only the maximum energy value but also the other thresholds, such as the standard deviation, as criteria for silence.
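A minimal C++ sketch of this silence check and of the percentile-based threshold calibration might look as follows (the function names, the statistics struct and the exact percentile indexing are illustrative assumptions):

    #include <algorithm>
    #include <cmath>
    #include <cstdint>
    #include <vector>

    struct SliceStats { double maxEnergy, minEnergy, stdDev; };

    SliceStats computeStats(const std::vector<int16_t>& slice) {
        if (slice.empty()) return {0.0, 0.0, 0.0};
        double mx = 0.0, mn = 1e300, sum = 0.0, sumSq = 0.0;
        for (int16_t s : slice) {
            double a = std::fabs(static_cast<double>(s));
            mx = std::max(mx, a);
            mn = std::min(mn, a);
            sum += s;
            sumSq += static_cast<double>(s) * s;
        }
        double n = static_cast<double>(slice.size());
        double var = sumSq / n - (sum / n) * (sum / n);
        return {mx, mn, std::sqrt(std::max(var, 0.0))};
    }

    // Steps S302/S303: the slice is discarded as silent when every
    // statistic stays below its calibrated threshold.
    bool isSilent(const SliceStats& s, const SliceStats& thresholds) {
        return s.maxEnergy < thresholds.maxEnergy &&
               s.minEnergy < thresholds.minEnergy &&
               s.stdDev    < thresholds.stdDev;
    }

    // Calibration: sort values measured in a quiet bedroom and take the 90th
    // percentile (e.g. the 900th of 1000 ascending values) as the threshold.
    double percentile90(std::vector<double> values) {
        std::sort(values.begin(), values.end());
        return values[values.size() * 9 / 10];
    }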
Step S104, acquiring acoustic features of the audio slices judged to contain sound. As shown in fig. 4, this step specifically includes the following.
in the embodiment of the present invention, applicants find that, in the step 401 of normalizing the sampled signal, the accuracy rate of model identification can be greatly improved by normalizing the audio slice data of 6 seconds in the training process of identifying the model. If normalization is not carried out, the sampling value of the audio is calculated according to the sampling byte type int16, the range of-32768-32767 of the sampling value of the audio is too large, fitting of a neural network algorithm is not facilitated, the value range of energy after normalization is limited to be minus one to one, and fitting of the algorithm is facilitated. Therefore, in the embodiment of the present invention, for each 6-second audio slice, the maximum value of energy in the 6-second audio slice is first calculated, and then each sample point is divided by the maximum value for normalization. For example, at a sampling frequency of 16KHz, then a total of 9 sixteen thousand sample points for a 6 second audio slice, then dividing these ninety-six thousand points by the maximum value yields the 6 second audio slice data after normalization.
Step S402, the normalized audio slice data is framed; in this embodiment a length of 25 ms is adopted for each frame, with a 10 ms sliding step between frames.
Then, for each frame of audio, a high-dimensional feature is extracted, for example a 64-dimensional MFCC (Mel frequency cepstrum coefficient) feature, step S403. MFCC is one of the most commonly used acoustic features in the speech recognition and voiceprint recognition fields, and its computation comprises a series of steps: windowing, fast Fourier transform, Mel filtering, and discrete cosine transform.
Specifically, in the embodiment of the present invention, the calculation in the feature extraction part proceeds as follows. One sample is 6 seconds long and is first divided into frames with a frame length of 25 ms and a frame shift of 10 ms, giving 598 frames; at a sampling rate of 16 kHz each 25 ms frame contains 400 sample points. In step S4031, each frame is windowed; windowing eliminates the discontinuities at the two ends of each frame that arise in the sliding-window process. The embodiment of the present invention adopts the Hamming window (shown schematically in fig. 7), which strengthens the weight of the non-overlapping middle portion of each frame and weakens the weight at the junctions between frames. In step S4032, the windowed data is subjected to a fast Fourier transform; each frame contains 400 data points, zero-padded to 512 points, since 512 points are better suited to the FFT. In step S4033, Mel filtering is performed on the spectrum obtained in the previous step over the filtering frequency range 125 Hz to 7500 Hz; the frequency is converted to the Mel scale as shown in formula 1, and 64 Mel-frequency features are extracted per frame, where f_hz is the frequency in hertz and f_mel the corresponding Mel frequency. In step S4034, the logarithm of the obtained Mel spectrum is taken and a discrete cosine transform is performed, yielding the final MFCC features.
f_mel = 2595 × log10(1 + f_hz / 700)  (formula 1)
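As code, formula 1 and the placement of the 64 Mel filters over the 125-7500 Hz band could look like this (the filter-edge helper is an assumption based on the standard Mel filter-bank construction, not a detail spelled out in the patent):

    #include <cmath>
    #include <vector>

    // Formula 1 and its inverse.
    double hzToMel(double fHz)  { return 2595.0 * std::log10(1.0 + fHz / 700.0); }
    double melToHz(double fMel) { return 700.0 * (std::pow(10.0, fMel / 2595.0) - 1.0); }

    // 64 triangular filters over 125-7500 Hz need 66 edge points spaced
    // uniformly on the Mel scale; returned here in Hz.
    std::vector<double> melFilterEdgesHz(int nFilters = 64,
                                         double loHz = 125.0, double hiHz = 7500.0) {
        const double loMel = hzToMel(loHz), hiMel = hzToMel(hiHz);
        std::vector<double> edges(nFilters + 2);
        for (int i = 0; i < nFilters + 2; ++i)
            edges[i] = melToHz(loMel + (hiMel - loMel) * i / (nFilters + 1));
        return edges;
    }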
After the MFCC analysis results are obtained, the numerical results can be transformed directly into image dimensions, i.e. length × width × channels, that a convolutional neural network can process. In the embodiment of the present invention, the input to the network comes from the processing of 6-second audio slices: with 25 ms frames, a 10 ms sliding step and 64-dimensional MFCC features per frame, 6 seconds of audio is finally processed into a 598 × 64 array (frames × feature dimension). The last few frames are then discarded so that the array can be reshaped into image dimensions of 96 × 64 × 6, forming the spectrogram input to the neural network. It should be understood that convolutional neural networks of other styles can be designed, with the MFCC features reshaped accordingly to generate the corresponding image dimensions.
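The frame and shape arithmetic above can be checked in a few lines of C++ (a worked verification, not patent code):

    #include <cassert>

    int main() {
        const int samples   = 6 * 16000;                // 96,000 samples per slice
        const int frameLen  = 25 * 16;                  // 25 ms -> 400 samples
        const int frameHop  = 10 * 16;                  // 10 ms -> 160 samples
        const int numFrames = (samples - frameLen) / frameHop + 1;
        assert(numFrames == 598);                       // 598 frames x 64 MFCCs
        const int keptFrames = 96 * 6;                  // reshaped to 96 x 64 x 6
        assert(numFrames - keptFrames == 22);           // trailing frames discarded
        return 0;
    }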
Step 105 is to perform deep learning feature extraction and classification on the MFCC features.
The embodiment of the invention adopts a deep convolutional neural network as the deep learning feature extractor, takes the MFCC spectral features obtained above as the input of the convolutional neural network, and outputs 128-dimensional features.
The convolutional neural network is the dominant neural network structure in the field of computer vision and image processing; it has the properties of translation invariance and local parameter sharing and is very suitable for extracting abstract deep features from image-like data. The MFCC feature is, in theory, the time-varying distribution of the energy of the audio signal over different frequencies, and after being processed into an image-like matrix structure in step S105 it is likewise suitable for processing by a convolutional neural network.
After the 128-dimensional features are obtained from the convolutional neural network, they are sent to the fully connected layer for final classification, step S106. The convolutional neural network serves to extract deeper abstract features from the MFCC features, because the MFCC features alone cannot be directly classified by the fully connected layer; the fully connected layer then self-learns to classify the extracted abstract features. In this example the fully connected layer performs binary classification (1 and 0): its output comprises two nodes, S (snore) and N (no snore). When the value of output node S is greater than that of node N, i.e. snore is judged more likely than non-snore, the audio slice is determined to be snore; otherwise it is determined not to be snore. The structure of the convolutional neural network in this embodiment is shown in table 1.
TABLE 1 (the network structure table is reproduced only as an image in the original publication and is not recoverable here)
FIG. 6 shows a flow diagram of a method of training the convolutional neural network model for feature extraction and the fully connected layer for final classification:
First, snoring data from a large number of different people is collected through various channels, about 10 hours in total, step S601, together with a large variety of sounds that may occur at night, i.e. the sounds most likely to cause misjudgment, such as wind and rain, also about 10 hours in total, step S602.
After the data is collected and processed, training of the model begins. Training can adopt the standard gradient descent method, with cross-entropy loss as the loss function. In the end-to-end self-training process, the convolutional neural network learns how to automatically extract the features of the MFCC most important for judging snore, and the fully connected layer likewise learns how to automatically classify the features extracted by the convolutional neural network, step S603. The resulting detection model is used to detect snore.
Whether a snore event exists can then be judged from the fully connected layer's snore detection results together with the sound-presence results, step S107.
A snore event could be declared every time an audio slice is determined to be snore, optionally with an intervention each time. However, intervening on every snore slice would make the intervention process too frequent. An alternative is to define a time period and judge that the sleeper is snoring when the number of audio slices detected as snore within that period exceeds a certain count. As shown in fig. 1, the embodiment of the present invention may maintain a fixed-length queue structure, for example of length 39, i.e. the queue holds 39 analysis results in total, each representing whether a 6-second sample was judged by the algorithm to be snore. With 3 seconds of overlap between successive samples, the total time required to fill the queue is 3 + 39 × 3 = 120 seconds; that is, one queue stores all analysis results within two minutes, step S701.
If the number of audio slices judged to be snore exceeds a certain percentage, for example one of 70%, 75% and 85%, it is determined that a snore event occurred during that period, step S702, which may optionally initiate subsequent snore intervention actions. In the present example the threshold is two thirds of the queue of length 39: when more than 26 slices are judged to be snoring, a snore event is considered to exist within those two minutes. As shown in figs. 1 and 2, when the maintained queue is determined not to reach the intervention threshold, the data at the head of the queue is dequeued, new data is appended to the queue, and the judgment against the intervention threshold continues.
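A hedged C++ sketch of this fixed-length queue decision follows (the names and the incremental counting are our own; the patent specifies only the queue length of 39 and the threshold of 26 slices):

    #include <deque>

    enum class SliceLabel { Silent, Snore, NotSnore };

    class SnoreEventDetector {
    public:
        // Push one per-slice result; returns true when a snore event is
        // established for the current two-minute window (steps S701/S702).
        bool push(SliceLabel label) {
            queue_.push_back(label);
            if (label == SliceLabel::Snore) ++snoreCount_;
            if (queue_.size() > kQueueLen) {   // dequeue the oldest result
                if (queue_.front() == SliceLabel::Snore) --snoreCount_;
                queue_.pop_front();
            }
            return queue_.size() == kQueueLen && snoreCount_ > kThreshold;
        }
        void clear() { queue_.clear(); snoreCount_ = 0; }  // emptied after intervening
    private:
        static constexpr size_t kQueueLen  = 39;
        static constexpr size_t kThreshold = 26;  // about two thirds of the queue
        std::deque<SliceLabel> queue_;
        size_t snoreCount_ = 0;
    };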
The intervention may be, for example, activating the head motor of the electric bed as shown in fig. 2 to lift the head by an angle, for example 15 degrees; after the intervention step is performed, the queue may be emptied. The intervention state may then be maintained for a period of time, after which a detection period during the intervention is checked for the presence of snore-containing audio slices; if any are present, the intervention continues until there are no more such slices.
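Building on the SnoreEventDetector sketch above, the intervention logic might be wired up as follows; liftHead and flattenHead are hypothetical stand-ins for the bed's real motor interface, and the snore-free period of 40 slices (about two minutes) is an assumed value, since the patent leaves the exact duration as a preset:

    #include <iostream>

    // Hypothetical motor commands (stubs for illustration only).
    void liftHead(int degrees) { std::cout << "lift head " << degrees << " deg\n"; }
    void flattenHead()         { std::cout << "flatten head\n"; }

    // Called once per classified slice; reuses SnoreEventDetector from above.
    void onSlice(SnoreEventDetector& det, SliceLabel label,
                 bool& intervening, int& calmSlices) {
        if (!intervening) {
            if (det.push(label)) {       // snore event established (step S702)
                liftHead(15);            // e.g. raise the head section 15 degrees
                det.clear();             // empty the queue after intervening
                intervening = true;
                calmSlices  = 0;
            }
        } else {
            calmSlices = (label == SliceLabel::Snore) ? 0 : calmSlices + 1;
            if (calmSlices >= 40) {      // ~2 snore-free minutes (assumed preset)
                flattenHead();
                intervening = false;
            }
        }
    }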
Furthermore, the related algorithms of the invention can be ported to embedded equipment to obtain snore detection equipment, which can be combined with products such as electric beds and pillows to intervene in snoring.
The embodiment of the invention adopts an ARM-architecture embedded development board as the carrier of the algorithm and method, with a microphone (mic) carried on the board. The convolutional neural network is designed with a separable convolution structure. The difference and connection between separable convolution and ordinary convolution are shown in fig. 5: separable convolution splits the traditional convolution (a) into two parts, depthwise (layer-by-layer) convolution (b) and pointwise (pixel-by-pixel) convolution (c). The definitions of the related parameters are given in fig. 5.
the original convolution structure is calculated by equations 2 to 4:
Cori=K*K*WOut*Hout*Cin*Cout(formula 2)
The separate convolution structure is calculated as
Cnew=WOut*Hout*Cin*Cout+K*K*Cin*WOut*Hout(formula 3)
The amount of computation reduced compared to the original convolution is
Figure RE-GDA0003268140410000101
That is, if the original convolution kernel size is 3 × 3, the computational cost of the convolution operation can be reduced to around 1/9. This structure greatly reduces the number of network parameters and the amount of computation, and can be widely applied to mobile terminals such as mobile phones and embedded terminals.
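The saving is easy to verify numerically; the layer sizes below are arbitrary example values (a worked check of formulas 2 to 4, not patent code):

    #include <cstdio>

    int main() {
        const double K = 3, Wout = 32, Hout = 32, Cin = 64, Cout = 128;
        const double cOri = K * K * Wout * Hout * Cin * Cout;                     // formula 2
        const double cNew = Wout * Hout * Cin * Cout + K * K * Cin * Wout * Hout; // formula 3
        const double ratio = cNew / cOri;  // formula 4 simplifies to 1/Cout + 1/(K*K)
        std::printf("ratio = %.4f, closed form = %.4f\n",
                    ratio, 1.0 / Cout + 1.0 / (K * K));
        return 0;
    }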
The whole logic is implemented in the C++ language and, after cross-compilation, is ported to the embedded development board to run. Testing shows that the snore detection algorithm of the invention outperforms traditional snore detection technology in both snore false detection and snore missed detection.
In some example embodiments, the functions of any of the methods, processes, signaling diagrams, algorithms, or flow diagrams described herein may be implemented by software and/or computer program code or portions of code stored in memory or other computer-readable or tangible media, and executed by a processor.
In some example embodiments, the apparatus in the present application may be included or associated with at least one software application, module, unit or entity configured as an arithmetic operation, or as a program or portion thereof (including added or updated software routines), executed by at least one operating processor. Programs, also referred to as program products or computer programs, including software routines, applets and macros, may be stored in any device-readable data storage medium and may include program instructions for performing particular tasks.
A computer program product may comprise one or more computer-executable components configured to perform some example embodiments when the program is run. The one or more computer-executable components may be at least one software code or code portion. Changes and configurations to implement the functions of the example embodiments may be performed as routines, which may be implemented as added or updated software routines. In an example, a software routine may be downloaded into the device.
By way of example, the software or computer program code or portions of code may be in source code form, object code form, or in some intermediate form, and may be stored on some type of carrier, distribution medium, or computer-readable medium, which may be any entity or device capable of carrying the program. Such a carrier may comprise, for example, a record medium, computer memory, read-only memory, an optical and/or electrical carrier signal, a telecommunication signal and/or a software distribution package. Depending on the required processing power, the computer program may be executed in a single electronic digital computer or may be distributed over a plurality of computers. The computer-readable medium or computer-readable storage medium may be a non-transitory medium.
In other example embodiments, this functionality may be performed by hardware or circuitry included in the airbag pillow, for example, by using an Application Specific Integrated Circuit (ASIC), a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or any other hardware and software combination. In yet another example embodiment, the functionality may be implemented as a signal, such as a non-tangible means that may be carried by electromagnetic signals downloaded from the Internet or other networks.
According to example embodiments, an apparatus such as a node, device or response means may be configured as a circuit, a computer or a microprocessor (such as a single chip computer element) or a chipset, which may comprise at least a memory for providing storage capacity for arithmetic operations and/or an operation processor for performing arithmetic operations.
Finally, it should be noted that the above embodiments are merely illustrative of the present invention; any variations and modifications that do not require inventive effort by those skilled in the art fall within the scope of the present invention, provided they do not depart from its core.

Claims (23)

1. A snore monitoring method based on a deep learning algorithm is characterized by comprising the following steps:
collecting sound signals and slicing to obtain audio slices;
judging whether the audio slice contains sound by using a silence detection algorithm;
and performing Mel Frequency Cepstrum Coefficient (MFCC) feature extraction on the audio slices judged to contain the sound, and then performing depth feature extraction and classification by using a pre-trained deep learning model to classify each audio slice into an audio slice containing the snore and an audio slice not containing the snore.
2. The snore monitoring method based on the deep learning algorithm according to claim 1, characterized in that: the silence detection algorithm includes calculating at least one of parameters including an energy peak value, an energy mean value, and an energy standard deviation in an audio slice; and comparing the calculated parameters with a preset mute threshold, and if all the parameters are lower than the mute threshold, identifying the audio slice as a mute audio slice.
3. The snore monitoring method based on the deep learning algorithm according to claim 1, characterized in that: the snore classification algorithm comprises the following steps:
performing framing operation on the audio slice;
extracting MFCC features from the framed data, and forming a spectrogram of an audio slice based on the MFCC features;
inputting the spectrogram into a pre-trained convolutional neural network to extract deep learning features; and
and inputting the deep learning characteristics into a full connection layer for classification so as to classify each audio slice into an audio slice containing snore and an audio slice not containing snore.
4. The snore monitoring method based on the deep learning algorithm according to claim 3, wherein: framing is performed by means of sliding-window frame shifting.
5. The snore monitoring method based on the deep learning algorithm according to claim 3, wherein: 64-dimensional spectral features are extracted for each frame.
6. The snore monitoring method based on the deep learning algorithm according to claim 1, characterized in that: the method further comprises the steps of judging the proportion of the audio slices classified as containing the snore in a certain time period to the total number of the audio slices in the time period, and if the proportion exceeds a preset threshold value, judging that the snore event exists in the time period.
7. The snore monitoring method based on the deep learning algorithm according to claim 6, wherein: the snore event judging method comprises the following steps: storing the identification result of the audio slice into a fixed-length queue, wherein the identification result comprises the audio slice without sound; audio slices containing snoring; and audio slices that do not contain snoring; and if the proportion of the number of the audio slices containing the snore to the total number of the audio slices in the fixed length queue exceeds a preset threshold value, judging that the snore event exists in a time period corresponding to the fixed length queue.
8. The snore monitoring method based on the deep learning algorithm according to claim 7, wherein: if the proportion of the number of audio slices containing snore to the total number of audio slices in the fixed-length queue exceeds a preset proportion value, it is judged that a snore event exists in the time period corresponding to the fixed-length queue.
9. A snore monitoring system based on a deep learning algorithm is characterized in that: comprising an embedded development board as a computer program carrier, which computer program is executed to implement the method of any of the preceding claims 1 to 8.
10. The deep learning algorithm-based snore monitoring system of claim 9, wherein: the embedded development board comprises a recording function, and the system is configured to start the recording function of the development board in real time to detect the snore event.
11. The deep learning algorithm-based snore monitoring system of claim 9, wherein: the trained deep learning model is run on an ARM-architecture i.MX6UL development board.
12. A method for controlling an electric bed based on a deep learning algorithm is characterized by comprising the following steps:
collecting sound signals and slicing to obtain audio slices;
judging whether the audio slice contains sound by using a silence detection algorithm;
performing Mel Frequency Cepstrum Coefficient (MFCC) feature extraction on the audio slices judged to contain the sound, and then performing depth feature extraction and classification by using a pre-trained deep learning model to classify each audio slice into an audio slice containing snore and an audio slice not containing snore;
judging the proportion of the audio slices classified as containing snore in a certain time period to the total number of the audio slices in the time period, and if the proportion exceeds a preset threshold value, judging that a snore event exists in the time period; and
and after a snore event is detected, at least one part of the electric bed is driven to act so as to interfere the generation of the snore.
13. The electric bed control method based on the deep learning algorithm according to claim 12, characterized in that: the silence detection algorithm includes calculating at least one of parameters including an energy peak value, an energy mean value, and an energy standard deviation in an audio slice; and comparing the calculated parameters with a preset mute threshold, and if all the parameters are lower than the mute threshold, identifying the audio slice as a mute audio slice.
14. The electric bed control method based on the deep learning algorithm according to claim 12, characterized in that: the snore classification algorithm comprises the following steps:
performing framing operation on the audio slice;
extracting MFCC features from the framed data, and forming a spectrogram of an audio slice based on the MFCC features;
inputting the spectrogram into a pre-trained convolutional neural network to extract deep learning features;
the deep learning features are then input into the full-connected layer for classification to classify each audio slice as an audio slice containing snoring and an audio slice not containing snoring.
15. The electric bed control method based on the deep learning algorithm according to claim 14, characterized in that: framing is performed by means of sliding-window frame shifting.
16. The electric bed control method based on the deep learning algorithm according to claim 14, characterized in that: 64-dimensional spectral features are extracted for each frame.
17. The electric bed control method based on the deep learning algorithm according to claim 12, characterized in that: the snore event judging method comprises the following steps: storing the identification result of the audio slice into a fixed-length queue, wherein the identification result comprises the audio slice without sound; audio slices containing snoring; and audio slices that do not contain snoring; and if the proportion of the number of the audio slices containing the snore to the total number of the audio slices in the fixed length queue exceeds a preset threshold value, judging that the snore event exists in a time period corresponding to the fixed length queue.
18. The electric bed control method based on the deep learning algorithm according to claim 17, characterized in that: if the proportion of the number of audio slices containing snore to the total number of audio slices in the fixed-length queue exceeds a preset proportion value, it is judged that a snore event exists in the time period corresponding to the fixed-length queue.
19. The electric bed control method based on the deep learning algorithm according to claim 12, characterized in that: when a snore event is detected, the electric bed is controlled to lift the bed head by a preset angle while the snore event detection algorithm continues to run; if no snore event is detected within a preset time, the bed head is controlled to return flat.
20. An electric bed control system based on a deep learning algorithm, characterized in that: comprising an embedded development board as a computer program carrier, which computer program is executed to implement the method of any of the preceding claims 12 to 19.
21. The deep learning algorithm-based electric bed control system according to claim 20, wherein: the embedded development board comprises a recording function, and the system is configured to start the recording function of the development board in real time to detect the snore event.
22. The deep learning algorithm-based electric bed control system according to claim 20, wherein: the trained deep learning model is run on an ARM-architecture i.MX6UL development board.
23. The deep learning algorithm-based electric bed control system according to claim 20, wherein: the electric bed control system comprises a control terminal, wherein the control terminal stores the computer program so that the computer program runs on the control terminal to control the electric bed.
CN202110803746.6A 2021-07-15 2021-07-15 Snore monitoring method and system based on deep learning algorithm and corresponding electric bed control method and system Pending CN113599052A (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202110803746.6A CN113599052A (en) 2021-07-15 2021-07-15 Snore monitoring method and system based on deep learning algorithm and corresponding electric bed control method and system
PCT/CN2022/105656 WO2023284813A1 (en) 2021-07-15 2022-07-14 Deep learning algorithm-based snore monitoring method and system, and corresponding electric bed control method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110803746.6A CN113599052A (en) 2021-07-15 2021-07-15 Snore monitoring method and system based on deep learning algorithm and corresponding electric bed control method and system

Publications (1)

Publication Number Publication Date
CN113599052A 2021-11-05

Family

ID=78337665

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110803746.6A Pending CN113599052A (en) 2021-07-15 2021-07-15 Snore monitoring method and system based on deep learning algorithm and corresponding electric bed control method and system

Country Status (2)

Country Link
CN (1) CN113599052A (en)
WO (1) WO2023284813A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023284813A1 (en) * 2021-07-15 2023-01-19 麒盛科技股份有限公司 Deep learning algorithm-based snore monitoring method and system, and corresponding electric bed control method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130184601A1 (en) * 2010-08-26 2013-07-18 Mor Research Applications Ltd. Apparatus and method for diagnosing obstructive sleep apnea
KR20180096176A (en) * 2017-02-20 2018-08-29 가천대학교 산학협력단 Apparatus for analyzing sleeping state and method of analyzing for sleeping state
CN112820319A (en) * 2020-12-30 2021-05-18 麒盛科技股份有限公司 Human snore recognition method and device

Family Cites Families (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1410759A1 (en) * 2002-10-17 2004-04-21 Sibel, S.A. Procedure for analysis of snoring and apnea and apparatus to carry out this analysis
CN107910020B (en) * 2017-10-24 2020-04-14 深圳和而泰智能控制股份有限公司 Snore detection method, device, equipment and storage medium
CN108670200B (en) * 2018-05-30 2021-06-08 华南理工大学 Sleep snore classification detection method and system based on deep learning
CN109431470B (en) * 2018-12-20 2021-05-07 西安交通大学医学院第二附属医院 Sleep respiration monitoring method and device
CN109998766A (en) * 2019-04-15 2019-07-12 深圳市云智眠科技有限公司 A kind of lift alleviates the method and system of snoring
CN211533543U (en) * 2019-05-05 2020-09-22 麒盛科技股份有限公司 Electric bed for snoring intervention
CN110570880B (en) * 2019-09-04 2022-02-18 杭州深蓝睡眠科技有限公司 Snore signal identification method
CN113599052A (en) * 2021-07-15 2021-11-05 麒盛科技股份有限公司 Snore monitoring method and system based on deep learning algorithm and corresponding electric bed control method and system
CN113599051A (en) * 2021-07-15 2021-11-05 麒盛科技股份有限公司 Electric bed control method, system and computer program based on deep learning algorithm
CN113599053B (en) * 2021-07-28 2024-06-25 麒盛科技股份有限公司 Self-adaptive adjusting method, system and computer program for air bag pillow
CN113925464B (en) * 2021-10-19 2024-06-04 麒盛科技股份有限公司 Method for detecting sleep apnea based on mobile equipment

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20130184601A1 (en) * 2010-08-26 2013-07-18 Mor Research Applications Ltd. Apparatus and method for diagnosing obstructive sleep apnea
KR20180096176A (en) * 2017-02-20 2018-08-29 가천대학교 산학협력단 Apparatus for analyzing sleeping state and method of analyzing for sleeping state
CN112820319A (en) * 2020-12-30 2021-05-18 麒盛科技股份有限公司 Human snore recognition method and device

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2023284813A1 (en) * 2021-07-15 2023-01-19 麒盛科技股份有限公司 Deep learning algorithm-based snore monitoring method and system, and corresponding electric bed control method and system

Also Published As

Publication number Publication date
WO2023284813A1 (en) 2023-01-19

Similar Documents

Publication Publication Date Title
US20220071588A1 (en) Sensor fusion to validate sound-producing behaviors
CN107928673B (en) Audio signal processing method, audio signal processing apparatus, storage medium, and computer device
CN110415728B (en) Method and device for recognizing emotion voice
CN106264839A (en) Intelligent snore stopping pillow
CN111667818A (en) Method and device for training awakening model
EP3671743A1 (en) Voice activity detection method
EP3940698A1 (en) A computer-implemented method of providing data for an automated baby cry assessment
CN109497956A (en) Snore relieving system and its control method
CN111685774B (en) OSAHS Diagnosis Method Based on Probability Integrated Regression Model
CN110942784A (en) Snore classification system based on support vector machine
WO2023066135A1 (en) Sleep apnea detection method based on mobile device
CN116517860A (en) Ventilator fault early warning system based on data analysis
CN115081473A (en) Multi-feature fusion brake noise classification and identification method
CN113599053A (en) Self-adaptive adjustment method and system of air bag pillow and computer program
WO2023284813A1 (en) Deep learning algorithm-based snore monitoring method and system, and corresponding electric bed control method and system
CN110795996B (en) Method, device, equipment and storage medium for classifying heart sound signals
CN113448438B (en) Control system and method based on sleep perception
CN111938650A (en) Method and device for monitoring sleep apnea
CN110728993A (en) Voice change identification method and electronic equipment
Porieva et al. Investigation of lung sounds features for detection of bronchitis and COPD using machine learning methods
Lili et al. Research on recognition of CHD heart sound using MFCC and LPCC
CN111862991A (en) Method and system for identifying baby crying
CN106653045A (en) Method and diagnostic apparatus for extracting diagnostic signal from audio signal
Ma et al. Unsupervised snore detection from respiratory sound signals
CN110580917B (en) Voice data quality detection method, device, server and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination