CN116386872B - Training method, method and device for identifying sleep quality, medium and electronic equipment - Google Patents


Info

Publication number
CN116386872B
CN116386872B (application number CN202310391245.0A)
Authority
CN
China
Prior art keywords
decision tree
audio data
training
tree model
snore audio
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202310391245.0A
Other languages
Chinese (zh)
Other versions
CN116386872A (en)
Inventor
黄晶晶
李华伟
田君琦
邱禧荷
方志军
Current Assignee
Hainan Tianmeng Digital Medical Technology Co ltd
Eye and ENT Hospital of Fudan University
Original Assignee
Hainan Tianmeng Digital Medical Technology Co ltd
Eye and ENT Hospital of Fudan University
Priority date
Filing date
Publication date
Application filed by Hainan Tianmeng Digital Medical Technology Co., Ltd. and Eye and ENT Hospital of Fudan University
Priority to CN202310391245.0A
Publication of CN116386872A
Application granted
Publication of CN116386872B
Legal status: Active

Classifications

    • GPHYSICS
    • G16INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
    • G16HHEALTHCARE INFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR THE HANDLING OR PROCESSING OF MEDICAL OR HEALTHCARE DATA
    • G16H50/00ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics
    • G16H50/30ICT specially adapted for medical diagnosis, medical simulation or medical data mining; ICT specially adapted for detecting, monitoring or modelling epidemics or pandemics for calculating health indices; for individual health risk assessment
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00Machine learning
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/24Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/66Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination for extracting parameters related to health condition


Abstract

The application provides a training method, a method and a device for identifying sleep quality, a medium, and electronic equipment. The training method comprises the following steps: obtaining snore audio data; processing the snore audio data to obtain mel-frequency cepstral coefficients of the snore audio data and labels corresponding to the snore audio data; and training a sleep quality assessment model to be trained based on a training data set to obtain a trained sleep quality assessment model, wherein the training data set consists of the mel-frequency cepstral coefficients of the snore audio data and the labels corresponding to the snore audio data, the sleep quality assessment model to be trained is a decision tree model, and the objective function of the decision tree model in the training iteration process is determined by a first regularization term parameter, a second regularization term parameter, and the decision trees corresponding to the iteration count. A sleep quality assessment model trained according to this training method has higher accuracy and stronger generalization ability.

Description

Training method, method and device for identifying sleep quality, medium and electronic equipment
Technical Field
The application belongs to the field of artificial intelligence models and relates to a training method; in particular, it relates to a training method, a method and a device for identifying sleep quality, a medium, and electronic equipment.
Background
One past way of examining a patient's sleep quality is laboratory polysomnography, which requires the patient to stay in the laboratory overnight while connected to a large number of leads. This often causes a first-night effect in the unfamiliar environment and therefore poor sleep monitoring results. This form of examination is gradually being replaced by the increasingly popular approach of assessing sleep quality with artificial intelligence models.
When a machine learning model is used to examine a patient's sleep quality, a corresponding sleep quality assessment model generally has to be trained first. However, current training methods produce sleep quality assessment models of low accuracy, because the quality of the training data set is low and the objective function is not optimized.
Disclosure of Invention
The invention aims to provide a training method, a method and a device for identifying sleep quality, a medium, and electronic equipment, in order to solve the problem that sleep quality assessment models trained by existing training methods are insufficiently accurate.
In a first aspect, the present application provides a training method, the training method comprising: obtaining snore audio data; processing the snore audio data to obtain mel-frequency cepstral coefficients of the snore audio data and labels corresponding to the snore audio data; and training a sleep quality assessment model to be trained based on a training data set to obtain a trained sleep quality assessment model, wherein the training data set consists of the mel-frequency cepstral coefficients of the snore audio data and the labels corresponding to the snore audio data, the sleep quality assessment model to be trained is a decision tree model, and the objective function of the decision tree model in the training iteration process is determined by a first regularization term parameter, a second regularization term parameter, and the decision trees corresponding to the iteration count.
In this training method, because the training data set of the sleep quality assessment model to be trained consists of the mel-frequency cepstral coefficients of the snore audio data, it yields a better training effect than a traditional training data set, and the trained sleep quality assessment model attains higher accuracy. In addition, the first regularization term parameter and the second regularization term parameter of the objective function improve the generalization ability and accuracy of the trained sleep quality assessment model.
In an embodiment of the present application, the method for processing the snore audio data to obtain the mel-frequency cepstral coefficients of the snore audio data and the labels corresponding to the snore audio data comprises: segmenting the snore audio data to obtain segment snore audio data within a plurality of fixed time intervals, wherein the labels corresponding to the snore audio data are the labels corresponding to the segment snore audio data; performing feature extraction on the segment snore audio data to obtain the 13-dimensional static mel-frequency cepstral coefficients of the plurality of segments; and averaging the 13-dimensional static mel-frequency cepstral coefficients of each segment to obtain its averaged 13-dimensional static mel-frequency cepstral coefficients, the mel-frequency cepstral coefficients of the snore audio data being these averaged 13-dimensional static mel-frequency cepstral coefficients.
In an embodiment of the present application, the predicted value in an iteration of the decision tree model is determined by the number of decision trees corresponding to the iteration count and by the function space formed by those decision trees, and the training data set consists of the averaged 13-dimensional static mel-frequency cepstral coefficients of the segment snore audio data and the labels corresponding to the segment snore audio data.
In one embodiment of the present application, the predicted value for the i-th training sample is expressed as:

$$\hat{y}_i = \sum_{k=1}^{K} g_k(x_i), \qquad g_k \in G$$

wherein $K$ is the number of decision trees in the iterative process of the decision tree model, $G$ is the function space formed by the decision trees, $x_i$ is the i-th training sample, and $g_k(x_i)$ is the function of the k-th decision tree with respect to the i-th training sample.
In an embodiment of the present application, the method for generating the objective function of the decision tree model in the training iteration process comprises: generating, based on the training data set, a newly generated decision tree of the decision tree model in each iteration; obtaining the objective function of the decision tree model in the training iteration process based on the newly generated decision tree; and processing that objective function to obtain the improved objective function of the decision tree model.
In an embodiment of the present application, the objective function of the decision tree model in the t-th iteration is:

$$L_t = \sum_{i=1}^{n} l\!\left(y_i,\; \hat{y}_i^{(t-1)} + g_t(x_i)\right) + \alpha H + \tfrac{1}{2}\beta\lVert w\rVert^2$$

wherein $L_t$ denotes the objective function of the decision tree model in the t-th iteration, $l$ is the loss function, $y_i$ denotes the true value, $\hat{y}_i^{(t-1)}$ denotes the predicted value of the decision tree model in the (t-1)-th iteration, $\alpha$ denotes the first regularization term parameter, $\beta$ denotes the second regularization term parameter, $H$ denotes the number of leaf nodes of the decision tree generated in the t-th iteration, $w$ is the weight matrix of the leaf nodes, $g_t(x_i)$ denotes the function of the newly generated decision tree with respect to the i-th training sample in the t-th iteration, and $n$ denotes the number of training samples. The improved objective function of the decision tree model, obtained by a second-order expansion with constant terms dropped, is expressed as:

$$\tilde{L}_t \approx \sum_{j=1}^{H} \left[\left(\sum_{i \in I_j} n_i\right) w_j + \tfrac{1}{2}\left(\sum_{i \in I_j} m_i + \beta\right) w_j^2\right] + \alpha H$$

wherein $n_i$ denotes the first-order partial derivative of the objective function with respect to the prediction of the i-th training sample in the t-th iteration of the decision tree model, $m_i$ denotes the corresponding second-order partial derivative, $w_j$ is the weight of the j-th leaf node, and $I_j$ denotes the set of training samples falling on the j-th leaf node.
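The roles of the regularization parameters and of the per-leaf sums above can be illustrated with a small numerical sketch (not part of the patent's disclosure). It assumes a squared-error loss l(y, ŷ) = ½(ŷ − y)², for which n_i = ŷ_i^(t−1) − y_i and m_i = 1; all data values are made up:

```python
# Sketch: optimal leaf weights and improved objective for one new tree,
# assuming squared-error loss, so n_i = yhat_prev_i - y_i and m_i = 1.0.

def improved_objective(y, yhat_prev, leaves, alpha, beta):
    """leaves: list of index lists, one per leaf (the sets I_j).
    Returns (optimal leaf weights, improved objective value)."""
    n = [yp - yt for yp, yt in zip(yhat_prev, y)]   # first-order partials n_i
    m = [1.0] * len(y)                              # second-order partials m_i
    weights, obj = [], 0.0
    for I_j in leaves:
        N = sum(n[i] for i in I_j)
        M = sum(m[i] for i in I_j)
        w_j = -N / (M + beta)                 # weight minimizing the leaf term
        weights.append(w_j)
        obj += N * w_j + 0.5 * (M + beta) * w_j ** 2
    return weights, obj + alpha * len(leaves)   # + alpha*H complexity penalty

w, L = improved_objective(
    y=[1.0, 0.0, 1.0, 0.0], yhat_prev=[0.5, 0.5, 0.5, 0.5],
    leaves=[[0, 2], [1, 3]], alpha=0.1, beta=1.0)
```

A larger beta shrinks each leaf weight toward zero, and a larger alpha penalizes trees with more leaves — the over-fitting control attributed to the two regularization term parameters.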
In a second aspect, the present application provides a method of identifying sleep quality, comprising: obtaining snore audio data to be identified; and processing the snore audio data to be identified with the trained sleep quality assessment model according to any one of the first aspect to obtain a sleep quality identification result.
In a third aspect, the present application provides a training device comprising: a snore audio data acquisition module, used for acquiring snore audio data; a snore audio data processing module, used for processing the snore audio data to acquire mel-frequency cepstral coefficients of the snore audio data and labels corresponding to the snore audio data; and a sleep quality assessment model training module, used for training a sleep quality assessment model to be trained based on a training data set to obtain a trained sleep quality assessment model, wherein the training data set consists of the mel-frequency cepstral coefficients of the snore audio data and the labels corresponding to the snore audio data, the sleep quality assessment model to be trained is a decision tree model, and the objective function of the decision tree model in the training iteration process is determined by a first regularization term parameter, a second regularization term parameter, and the decision trees corresponding to the iteration count.
In a fourth aspect, the present application provides a computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements the training method according to any one of the first aspects of the present application and/or the method for identifying sleep quality according to any one of the second aspects.
In a fifth aspect, the present application provides an electronic device, including: a memory storing a computer program; and the processor is in communication connection with the memory and executes the training method according to any one of the first aspect and/or the sleep quality identification method according to any one of the second aspect of the application when the computer program is called.
As described above, the training method, the device, the medium and the electronic equipment for identifying sleep quality have the following beneficial effects:
in this training method, because the training data set of the sleep quality assessment model to be trained consists of the mel-frequency cepstral coefficients of the snore audio data, it yields a better training effect than a traditional training data set, and the trained sleep quality assessment model attains higher accuracy. In addition, the first regularization term parameter and the second regularization term parameter of the objective function improve the generalization ability and accuracy of the trained sleep quality assessment model.
Drawings
Fig. 1 is a schematic diagram of a hardware structure of a mobile terminal according to an embodiment of the present application.
Fig. 2 shows a flowchart of a training method according to an embodiment of the present application.
Fig. 3 is a flowchart of processing the snore audio data according to an embodiment of the present application.
Fig. 4 is a flowchart of a method for generating an objective function according to an embodiment of the present application.
Fig. 5 is a flowchart of a method for identifying sleep quality according to an embodiment of the present application.
Fig. 6 is a schematic structural diagram of a training device according to an embodiment of the present application.
Description of element reference numerals
10. Mobile terminal
100. Processor
110. Memory device
120. Communication transmission device
130. Input/output device
600. Training device
610. Snore audio data acquisition module
620. Snore audio data processing module
630. Sleep quality assessment model training module
S11-S13 step
S21-S23 step
S31-S33 step
S41-S42 step
Detailed Description
Other advantages and effects of the present application will become readily apparent to those skilled in the art from the disclosure of this specification, taken together with the following description of the embodiments and the accompanying drawings. The application may also be implemented or applied through other, different specific embodiments, and the details in this specification may be modified or changed in various ways without departing from the spirit of the application. It should be noted that, in the absence of conflict, the following embodiments and the features within them may be combined with one another.
It should be noted that the illustrations provided in the following embodiments merely depict the basic concept of the application in a schematic way. The drawings show only the components related to the application rather than the actual number, shape, and size of components in an implementation; in practice the form, quantity, and proportion of each component may change arbitrarily, and the component layout may be more complex.
The following describes the technical solutions in the embodiments of the present application in detail with reference to the drawings in the embodiments of the present application.
The training method and/or the method of identifying sleep quality provided by the embodiments of the present application can run on a mobile terminal, a computer terminal, or a similar device. Taking a mobile terminal as an example, fig. 1 is a block diagram of the hardware structure of a mobile terminal for the training method and/or the method of identifying sleep quality. As shown in fig. 1, the mobile terminal 10 may include a processor 100 and a memory 110; the processor 100 may be a central processing unit, and the memory 110 is configured to store data. The mobile terminal 10 of fig. 1 is intended to be exemplary only and does not limit the specific structure of the mobile terminal 10.
Optionally, the mobile terminal 10 may further include: a communication transmission device 120 and an input output device 130.
Alternatively, the memory 110 may be used to store computer programs, such as software programs and modules of application software. The memory 110 may include high-speed random access memory and may also include non-volatile memory, such as one or more magnetic storage devices, flash memory, or other non-volatile solid-state memory. In some examples, the memory 110 may further include memory located remotely from the processor 100, which may be connected to the mobile terminal 10 via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
Alternatively, the communication transmission device 120 may be used to receive or transmit data via a network, which may include a wireless network provided by a communication provider of the mobile terminal 10, and the communication transmission device 120 may include a NIC (Network Interface Controller, network adapter) that may be connected to other network devices through a base station so as to communicate with the internet.
As shown in fig. 2, the present embodiment provides a training method, which may be implemented by a processor of a computer device or a processor of a mobile terminal, and the training method includes:
S11, obtaining snore audio data.
Optionally, the snore audio data refers to digitized audio data obtained by recording or collecting a snore signal. Snoring is noise generated by obstruction or narrowing of the airway during sleep and is generally considered a manifestation of sleep-disordered breathing. The snore audio data can be collected with a microphone or similar device and then analyzed and processed using digital signal processing techniques.
S12, processing the snore audio data to obtain the mel-frequency cepstral coefficients of the snore audio data and the labels corresponding to the snore audio data.
Optionally, the mel-frequency cepstral coefficients of the snore audio data may include: the 13-dimensional static mel-frequency cepstral coefficients of the snore audio data, their first-order differential coefficients, and their second-order differential coefficients. Mel-frequency cepstral coefficients are coefficients used to characterize audio signals and are commonly used in fields such as speech recognition and music information retrieval.
Optionally, the labels corresponding to the snore audio data may be the labels corresponding to the segment snore audio data within a plurality of fixed time intervals, where a label may be indicated by 1 or 0: 1 indicates that a respiratory event occurs within the fixed time interval, and 0 indicates that no respiratory event occurs within it. A respiratory event may be an apnea or hypopnea event occurring between normal breaths. For example, if the snore audio data lasts 3 s in total and the fixed time interval is 1 s, the snore audio data may be divided into 3 segments of snore audio data within 1 s time intervals, and each segment has a corresponding label indicating whether a respiratory event occurs in that time interval.
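The 3 s / 1 s example above can be sketched as follows; this is an illustrative reading of the labeling rule, and the event interval used is a hypothetical value, not one taken from the patent:

```python
# Sketch: label fixed-length segments with 1 (respiratory event present) or 0.
# Event intervals (start, end) in seconds are hypothetical example values.

def label_segments(total_s, interval_s, events):
    labels = []
    for k in range(int(total_s // interval_s)):
        seg = (k * interval_s, (k + 1) * interval_s)
        # 1 if any event overlaps this fixed time interval, else 0
        has_event = any(s < seg[1] and e > seg[0] for s, e in events)
        labels.append(1 if has_event else 0)
    return labels

# 3 s of audio, 1 s intervals, one apnea event from 1.2 s to 1.8 s
labels = label_segments(3.0, 1.0, events=[(1.2, 1.8)])
```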
S13, training a sleep quality assessment model to be trained based on a training data set to obtain a trained sleep quality assessment model, wherein the training data set consists of the mel-frequency cepstral coefficients of the snore audio data and the labels corresponding to the snore audio data, the sleep quality assessment model to be trained is a decision tree model, and the objective function of the decision tree model in the training iteration process is determined by a first regularization term parameter, a second regularization term parameter, and the decision trees corresponding to the iteration count.
Optionally, the sleep quality assessment model to be trained may be a sleep quality assessment model in a training process, where the sleep quality assessment model in the training process includes an initialized sleep quality assessment model and a sleep quality assessment model with updated parameters in the training process. The decision tree model is a classification and regression model based on a tree structure and is used for solving the classification and regression problems in supervised learning. The decision tree model builds a tree structure by recursively splitting the data, where each node represents a feature and each leaf node represents a class or a value.
Optionally, the decision trees differ between iterations of the decision tree model. For example, the decision tree model may have 5 decision trees in its third iteration and 6 in its fourth. The decision trees corresponding to an iteration count are the decision trees of the decision tree model at that iteration: the decision trees of the first iteration are those present during the model's first iteration, the decision trees of the second iteration are those present during its second iteration, and so on.
Optionally, the objective function includes a regularization term with a first regularization term parameter and a second regularization term parameter. The regularization term may be a penalty term in the objective function used to constrain the parameters or weights of the sleep quality assessment model to be trained; the first and second regularization term parameters prevent the model from over-fitting the training data set, thereby improving its generalization ability. The specific values of the first and second regularization term parameters are not limited in this embodiment and may be set according to the actual training process.
Optionally, the predicted value in an iteration of the decision tree model is determined by the number of decision trees corresponding to the iteration count and by the function space formed by those decision trees, and the training data set consists of the averaged 13-dimensional static mel-frequency cepstral coefficients of the segment snore audio data and the labels corresponding to the segment snore audio data. The predicted value refers to the result of the decision tree model processing the input data.
Optionally, the predicted value for the i-th training sample is expressed as:

$$\hat{y}_i = \sum_{k=1}^{K} g_k(x_i), \qquad g_k \in G$$

wherein $K$ is the number of decision trees in the iterative process of the decision tree model, $G$ is the function space formed by the decision trees, $x_i$ is the i-th training sample, $g_k(x_i)$ is the function of the k-th decision tree with respect to the i-th training sample, and $\hat{y}_i$ is the predicted value for the i-th training sample. The function of a decision tree and the function space formed by the decision trees are not described in detail in this embodiment.
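As a minimal illustration of the additive prediction above, the following sketch uses two hand-written one-split stumps as the K = 2 decision trees; the thresholds and leaf values are invented for the example and do not come from the patent:

```python
# Sketch: the ensemble prediction yhat_i = sum over k of g_k(x_i).
# The stump thresholds and leaf values below are hypothetical.

def g1(x):  # first decision tree: a one-split stump on feature 0
    return 0.6 if x[0] > 0.5 else -0.6

def g2(x):  # second decision tree, splitting on feature 1
    return 0.3 if x[1] > 0.0 else -0.3

trees = [g1, g2]          # the function space G holds K = 2 trees here

def predict(x):
    return sum(g(x) for g in trees)   # yhat = sum_k g_k(x)

yhat = predict([0.8, -1.0])           # 0.6 from g1, -0.3 from g2
```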
The training method comprises the following steps: obtaining snore audio data; processing the snore audio data to obtain mel-frequency cepstral coefficients of the snore audio data and labels corresponding to the snore audio data; and training a sleep quality assessment model to be trained based on a training data set to obtain a trained sleep quality assessment model, wherein the training data set consists of the mel-frequency cepstral coefficients of the snore audio data and the labels corresponding to the snore audio data, the sleep quality assessment model to be trained is a decision tree model, and the objective function of the decision tree model in the training iteration process is determined by a first regularization term parameter, a second regularization term parameter, and the decision trees corresponding to the iteration count.
In this training method, because the training data set of the sleep quality assessment model to be trained consists of the mel-frequency cepstral coefficients of the snore audio data, it yields a better training effect than a traditional training data set, and the trained sleep quality assessment model attains higher accuracy. In addition, the first regularization term parameter and the second regularization term parameter of the objective function improve the generalization ability and accuracy of the trained sleep quality assessment model.
As shown in fig. 3, this embodiment provides a method for processing the snore audio data to obtain the mel-frequency cepstral coefficients of the snore audio data and the labels corresponding to the snore audio data, comprising:
S21, segmenting the snore audio data to obtain segment snore audio data within a plurality of fixed time intervals, the labels corresponding to the snore audio data being the labels corresponding to the segment snore audio data.
Optionally, the method for segmenting the snore audio data comprises: performing framing and windowing on the snore audio data to obtain several equal-length frames of the snore audio data; and cutting these equal-length frames by the fixed time interval to obtain the segment snore audio data. The framing step divides the snore audio data into several equal-length frames, typically 20-40 ms long; the windowing step applies a Hann window to each frame to reduce spectral leakage and truncation of the time-domain waveform. The segmentation step then cuts the sequence of equal-length frames into segments, where the fixed time interval may be 1 minute.
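A minimal sketch of the framing and windowing step described above, assuming a Hann window; the frame length and hop size in samples are illustrative choices, not values fixed by the patent (the text only suggests 20-40 ms frames, i.e. 320-640 samples at a 16 kHz sampling rate):

```python
import math

# Sketch: split a signal into equal-length frames and apply a Hann window.

def hann(N):
    # Hann window: w[n] = 0.5 * (1 - cos(2*pi*n / (N - 1)))
    return [0.5 * (1 - math.cos(2 * math.pi * n / (N - 1))) for n in range(N)]

def frame_and_window(signal, frame_len, hop):
    win = hann(frame_len)
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = signal[start:start + frame_len]
        frames.append([s * w for s, w in zip(frame, win)])
    return frames

# 100 samples of a constant signal, 32-sample frames, 50% overlap
frames = frame_and_window([1.0] * 100, frame_len=32, hop=16)
```

The window tapers each frame to zero at its edges, which is what reduces spectral leakage from the truncated time-domain waveform.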
S22, performing feature extraction on the segment snore audio data to obtain the 13-dimensional static mel-frequency cepstral coefficients of the plurality of segments.
Optionally, the method for performing feature extraction on the segment snore audio data comprises: extracting several mel-frequency cepstral coefficients from the segment snore audio data; and screening these mel-frequency cepstral coefficients to obtain the 13-dimensional static mel-frequency cepstral coefficients of the segment. The mel-frequency cepstral coefficients include: the 13-dimensional static coefficients, the 13-dimensional first-order differential coefficients, and the 13-dimensional second-order differential coefficients. Feature extraction yields many 13-dimensional static coefficient vectors per segment; for example, feature extraction on 1 minute of snore audio data may yield 1200 13-dimensional static mel-frequency cepstral coefficient vectors.
S23, averaging the 13-dimensional static mel-frequency cepstral coefficients of the segment snore audio data to obtain the averaged 13-dimensional static mel-frequency cepstral coefficients of the segment, the mel-frequency cepstral coefficients of the snore audio data being these averaged 13-dimensional static mel-frequency cepstral coefficients.
As can be seen from the above description, processing the snore audio data to obtain the mel cepstrum coefficient of the snore audio data and the label corresponding to the snore audio data may be implemented as follows: split the snore audio data to obtain segment snore audio data in a plurality of fixed time intervals, wherein the label corresponding to the snore audio data is the label corresponding to the segment snore audio data; carry out feature extraction processing on the segment snore audio data to obtain 13-dimensional mel cepstrum static coefficients of a plurality of segment snore audio data; and carry out average processing on the 13-dimensional mel cepstrum static coefficients of the segment snore audio data to obtain the average 13-dimensional mel cepstrum static coefficient of the segment snore audio data, wherein the mel cepstrum coefficient of the snore audio data is the average 13-dimensional mel cepstrum static coefficient of the segment snore audio data.
According to the above description, by obtaining the average 13-dimensional mel cepstrum static coefficient of the segment snore audio data, training of the sleep quality assessment model to be trained can be facilitated, and accuracy of the sleep quality assessment model to be trained can be improved.
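As an illustration of how the per-segment features above might be computed, the following is a simplified mel cepstrum sketch (FFT power spectrum, triangular mel filterbank, log, DCT-II keeping 13 coefficients) followed by the per-segment averaging. The filter count, FFT size and frame count are assumptions, and a production system would likely use a dedicated audio library rather than this minimal implementation:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_filters, n_fft, sr):
    """Triangular mel filters laid over the FFT bins."""
    mels = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mels) / sr).astype(int)
    fb = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for k in range(l, c):
            fb[i - 1, k] = (k - l) / max(c - l, 1)
        for k in range(c, r):
            fb[i - 1, k] = (r - k) / max(r - c, 1)
    return fb

def dct_ii(x, n_out):
    """Type-II DCT along the last axis, keeping the first n_out coefficients."""
    n = x.shape[-1]
    k = np.arange(n_out)[:, None]
    basis = np.cos(np.pi * k * (2 * np.arange(n) + 1) / (2 * n))
    return x @ basis.T

def static_mfcc(frames, sr, n_mfcc=13, n_filters=26):
    """13-dimensional static mel cepstrum coefficients per windowed frame."""
    n_fft = frames.shape[1]
    power = np.abs(np.fft.rfft(frames, n=n_fft)) ** 2
    fb = mel_filterbank(n_filters, n_fft, sr)
    logmel = np.log(power @ fb.T + 1e-10)  # small epsilon avoids log(0)
    return dct_ii(logmel, n_mfcc)  # shape: (n_frames, 13)

# Per-segment averaging: one 13-dimensional vector per 1-minute segment
sr = 8000
frames = np.random.randn(1200, 256)  # e.g. 1200 windowed frames of one segment
mfcc = static_mfcc(frames, sr)
avg_mfcc = mfcc.mean(axis=0)         # average 13-dim static coefficient
```

The averaging step collapses the per-frame coefficients of a segment into a single 13-dimensional training vector, matching the description above.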
As shown in fig. 4, the present embodiment provides a method for generating an objective function of the decision tree model in the training iteration process, including:
S31, based on the training data set, generating a new generation decision tree of the decision tree model in each iteration process.
Optionally, the decision tree of the decision tree model in each round of iterative process comprises a native decision tree and a new generation decision tree, wherein the native decision tree is a decision tree which is generated by the decision tree model before the round of iterative process, and the new generation decision tree is a decision tree which is generated by the decision tree model in the round of iterative process. The training process of the decision tree model can be realized through an XGBoost algorithm.
Optionally, based on the training data set, generating the new generation decision tree of the decision tree model in each round of the iterative process may be implemented as follows: acquire the importance of each column of coefficients in the average 13-dimensional mel cepstrum static coefficient of the segment snore audio data; based on the importance of each column of coefficients, acquire the probability that each column of coefficients is extracted; and generate the new generation decision tree based on preset parameters and the probability of each column of coefficients being extracted.
Alternatively, the importance may refer to the contribution degree of a coefficient to the prediction result of the decision tree model, and the importance of an attribute $k$ may be expressed as:

$$I_k=\frac{\sum_{i=1}^{m}n_i\left(\bar{x}_k^{(i)}-\mu_k\right)^2}{\sum_{i=1}^{m}\sum_{x\in S_i}\left(x_k-\bar{x}_k^{(i)}\right)^2}$$

wherein $m$ represents the number of categories, $n_i$ represents the number of training samples of category $i$, $S_i$ represents the training samples belonging to category $i$, $\bar{x}_k^{(i)}$ represents the average value of attribute $k$ over category $i$, $n$ represents the number of category samples, $\mu_k$ represents the average value of attribute $k$ over the training samples, and $x_k$ represents the value of attribute $k$ of a training sample. The number of categories may be 2: one category may be respiratory events and the other non-respiratory events. The number of category samples may be the number of training samples of all categories, i.e. the size of the training data set, and an attribute may be a coefficient in the average 13-dimensional mel cepstrum static coefficient of the segment snore audio data, where one attribute corresponds to one coefficient.
Alternatively, the probability that a column of coefficients is extracted may be the ratio of the importance of that column to the sum of the importances of all columns of coefficients.
Alternatively, the preset parameter may be a sampling proportion of a sample set, and the sample set may be the training data set or a subset of the training data set, where the sampling proportion of the sample set may be flexibly set, and the embodiment is not limited explicitly. The realization method for generating the new decision tree comprises the following steps: based on the preset parameters and the probabilities of the coefficients of each column being extracted, obtaining a sample for generating the new decision tree; generating the new decision tree based on the sample of the new decision tree.
Optionally, based on the preset parameters and the probabilities of each column of coefficients being extracted, obtaining the samples for generating the new generation decision tree may be implemented as follows: based on the preset parameters, obtain the extracted samples of the new generation decision tree; based on the extracted samples of the new generation decision tree and the probability that each column of coefficients is extracted, obtain the attribute samples of the new generation decision tree, wherein the attribute samples comprise the attribute data in the extracted samples, and the attribute samples are the samples of the new generation decision tree. Because the sample construction of the new generation decision tree depends on the importance of the attributes, the training effect of the decision tree model can be better.
The implementation method for obtaining the sample for generating the new decision tree is not described herein, and can be flexibly implemented by those skilled in the art.
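One possible realization of the importance-weighted sampling for a new generation decision tree is sketched below. The Fisher-style between/within-class variance ratio is an assumed form of the importance score (the patent's exact formula is not reproduced here), and the row-sampling fraction stands in for the preset sampling proportion:

```python
import numpy as np

def fisher_importance(X, y):
    """Between-class vs. within-class variance ratio of each column
    (one column = one coefficient of the averaged 13-dim features).
    This is an assumed importance score, not the patent's exact one."""
    classes = np.unique(y)
    mu = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - mu) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    return between / (within + 1e-12)

def sample_for_new_tree(X, y, row_fraction=0.8, n_cols=8, rng=None):
    """Draw the sample used to grow a new generation tree: rows by a
    preset sampling proportion, columns with probability proportional
    to their importance."""
    rng = np.random.default_rng(rng)
    imp = fisher_importance(X, y)
    p = imp / imp.sum()  # extraction probability of each column
    rows = rng.choice(len(X), size=int(row_fraction * len(X)), replace=False)
    cols = rng.choice(X.shape[1], size=n_cols, replace=False, p=p)
    return X[np.ix_(rows, cols)], y[rows], cols

# Usage on synthetic 13-dimensional averaged coefficients
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 13))
y = (rng.random(200) > 0.5).astype(int)  # respiratory vs. non-respiratory
Xs, ys, cols = sample_for_new_tree(X, y, rng=0)
```

Columns with higher importance are extracted more often, so each new tree tends to be grown on the more discriminative coefficients.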
S32, based on the newly-generated decision tree, acquiring an objective function of the decision tree model in the training iteration process.
Alternatively, the objective function of the decision tree model during the first round of iteration may be expressed as:

$$L_1=\sum_{i=1}^{n}\left(y_i-\hat{y}_i\right)^2+\alpha H+\frac{\beta}{2}\lVert w\rVert^2$$

wherein $\hat{y}_i$ represents the predicted value, which may be the predicted value for the ith training sample.
Alternatively, the objective function of the decision tree model during the t-th round of iteration may be expressed as:

$$L_t=\sum_{i=1}^{n}\left(y_i-\left(\hat{y}_i^{(t-1)}+g_t(x_i)\right)\right)^2+\alpha H+\frac{\beta}{2}\lVert w\rVert^2$$

wherein $L_t$ represents the objective function of the decision tree model in the t-th round of the iterative process, $y_i$ represents the true value, which may be the true value of the ith training sample, $\hat{y}_i^{(t-1)}$ represents the predicted value of the decision tree model in the t-1-th round of the iterative process, which may be the predicted value of the ith training sample in that round, $\alpha$ represents the first regularization term parameter, $\beta$ represents the second regularization term parameter, $H$ represents the number of leaf nodes of the decision tree in the t-1-th round of the iterative process of the decision tree model, $w$ is a weight matrix of the leaf nodes, $g_t(x_i)$ represents a function of the new generation decision tree with respect to the ith training sample in the t-th iteration process, and $n$ represents the number of training samples, which may be the number of training data in the training data set. The first regularization term parameter and the second regularization term parameter may be flexibly set in an actual debugging process, which is not explicitly limited in this embodiment.
Optionally, the true value may be a real label corresponding to a training sample, the predicted value may be a predicted label corresponding to the training sample, the real label corresponding to the training sample may be a label corresponding to the segment snore audio data, and the predicted label is a label obtained by processing the training sample by the decision tree model.
Alternatively, the second-order Taylor expansion of the objective function under the t-th round of iteration of the decision tree model may be expressed as:

$$L_t\approx\sum_{i=1}^{n}\left[n_i\,g_t(x_i)+\frac{1}{2}m_i\,g_t^{2}(x_i)\right]+\alpha H+\frac{\beta}{2}\sum_{j=1}^{T}\lVert w_j\rVert^2$$

wherein $n_i$ represents the first-order partial derivative of the objective function with respect to the ith training sample in the t-th iteration process of the decision tree model, $m_i$ represents the second-order partial derivative of the objective function with respect to the ith training sample in the t-th iteration process of the decision tree model, $T$ represents the number of iterations the decision tree model has undergone, and $w_j$ represents the weight matrix of the leaf nodes in the j-th round of the training process of the decision tree model.
S33, processing the objective function of the decision tree model in the training iteration process to obtain the improved objective function of the decision tree model.
Alternatively, the improved objective function of the decision tree model may be expressed as:

$$Obj(t)=-\frac{1}{2}\sum_{j=1}^{H}\frac{\left(\sum_{i\in I_j}n_i\right)^2}{\sum_{i\in I_j}m_i+\beta}+\alpha H$$

wherein $Obj(t)$ represents the improved objective function under the t-th round of iteration of the decision tree model, and $I_j\subseteq I$ denotes the set of training samples falling on the jth leaf node, where $I$ represents a sample set, which may be the training data set.
The protection scope of the training method in the embodiment of the present application is not limited to the execution sequence of the steps listed in the embodiment, and all the schemes implemented by adding or removing steps and replacing steps according to the principles of the present application in the prior art are included in the protection scope of the present application.
As shown in fig. 5, the present embodiment provides a method for identifying sleep quality, where the method for identifying sleep quality includes:
s41, obtaining snore audio data to be identified.
S42, processing the snore audio data to be identified according to the trained sleep quality assessment model shown in fig. 2 to obtain a sleep quality identification result.
Optionally, processing the snore audio data to be identified according to the trained sleep quality assessment model to obtain the sleep quality recognition result may be implemented as follows: process the snore audio data to be identified according to the trained sleep quality assessment model to obtain the predicted number of respiratory events per minute of the snore audio data to be identified; and acquire the sleep quality recognition result according to the predicted number of respiratory events per minute. The sleep quality recognition result includes: mild, moderate and severe apnea syndromes, the mild being 5-15 respiratory events per hour, the moderate being 15-30 respiratory events per hour, and the severe being greater than 30 respiratory events per hour.
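The mapping from the predicted number of respiratory events per minute to a severity grade can be sketched as follows. The "normal" class for fewer than 5 events per hour and the handling of the exact boundary values are assumptions not spelled out in the embodiment:

```python
def classify_sleep_quality(events_per_minute):
    """Map the predicted number of respiratory events per minute to an
    apnea severity grade. Thresholds are the per-hour values given in
    the description; boundary handling and the 'normal' class below
    5 events/hour are assumptions."""
    events_per_hour = events_per_minute * 60
    if events_per_hour > 30:
        return "severe"
    if events_per_hour >= 15:
        return "moderate"
    if events_per_hour >= 5:
        return "mild"
    return "normal"

print(classify_sleep_quality(0.2))  # 0.2 events/minute = 12 events/hour
```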
As can be seen from the above description, the method for identifying sleep quality according to the present embodiment includes: obtaining snore audio data to be identified; and processing the snore audio data to be identified according to the trained sleep quality assessment model to obtain a sleep quality identification result. The method is not restricted to a particular site: any terminal with a recording function and the trained sleep quality assessment model can implement it. The method is efficient, can save a large amount of labor cost, can cover most patient groups, is not limited by the physical condition of the patient, and has a wide application range.
As shown in fig. 6, the present embodiment provides a training apparatus 600, which includes:
the snore audio data obtaining module 610 is configured to obtain snore audio data.
The snore audio data processing module 620 is configured to process the snore audio data to obtain a mel-frequency cepstrum coefficient of the snore audio data and a label corresponding to the snore audio data.
The sleep quality assessment model training module 630 is configured to train a sleep quality assessment model to be trained based on a training data set, so as to obtain a trained sleep quality assessment model, where the training data set is a mel cepstrum coefficient of the snore audio data and a label corresponding to the snore audio data, the sleep quality assessment model to be trained is a decision tree model, and an objective function of the decision tree model in an iterative process of training is determined by a first regularization term parameter, a second regularization term parameter and a decision tree corresponding to the number of iterations.
In the training device 600 provided in this embodiment, the snore audio data obtaining module 610, the snore audio data processing module 620, and the sleep quality evaluation model training module 630 are in one-to-one correspondence with steps S11-S13 of the training method shown in fig. 2, and are not described herein.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus or method may be implemented in other manners. For example, the apparatus embodiments described above are merely illustrative, e.g., the division of modules/units is merely a logical function division, and there may be additional divisions when actually implemented, e.g., multiple modules or units may be combined or integrated into another system, or some features may be omitted or not performed. Alternatively, the coupling or direct coupling or communication connection shown or discussed with each other may be an indirect coupling or communication connection via some interfaces, devices or modules or units, which may be in electrical, mechanical or other forms.
The modules/units illustrated as separate components may or may not be physically separate, and components shown as modules/units may or may not be physical modules, i.e., may be located in one place, or may be distributed over a plurality of network elements. Some or all of the modules/units may be selected according to actual needs to achieve the purposes of the embodiments of the present application. For example, functional modules/units in various embodiments of the present application may be integrated into one processing module, or each module/unit may exist alone physically, or two or more modules/units may be integrated into one module/unit.
Those of ordinary skill would further appreciate that the elements and algorithm steps of the examples described in connection with the embodiments disclosed herein may be embodied in electronic hardware, in computer software, or in a combination of the two, and that the elements and steps of the examples have been generally described in terms of function in the foregoing description to clearly illustrate the interchangeability of hardware and software. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application.
The embodiment provides an electronic device, which comprises a memory storing a computer program, and a processor in communication connection with the memory, the processor executing the training method shown in fig. 2 and/or the sleep quality recognition method shown in fig. 5 when the computer program is called.
Embodiments of the present application also provide a computer-readable storage medium. Those of ordinary skill in the art will appreciate that all or part of the steps in the method implementing the above embodiments may be implemented by a program to instruct a processor, where the program may be stored in a computer readable storage medium, where the storage medium is a non-transitory (non-transitory) medium, such as a random access memory, a read only memory, a flash memory, a hard disk, a solid state disk, a magnetic tape (magnetic tape), a floppy disk (floppy disk), an optical disk (optical disk), and any combination thereof. The storage media may be any available media that can be accessed by a computer or a data storage device such as a server, data center, or the like that contains an integration of one or more available media. The usable medium may be a magnetic medium (e.g., a floppy disk, a hard disk, a magnetic tape), an optical medium (e.g., a digital video disc (digital video disc, DVD)), or a semiconductor medium (e.g., a Solid State Disk (SSD)), or the like.
Embodiments of the present application may also provide a computer program product comprising one or more computer instructions. When the computer instructions are loaded and executed on a computing device, the processes or functions described in accordance with the embodiments of the present application are produced in whole or in part. The computer instructions may be stored in a computer-readable storage medium or transmitted from one computer-readable storage medium to another computer-readable storage medium, for example, the computer instructions may be transmitted from one website, computer, or data center to another website, computer, or data center by a wired (e.g., coaxial cable, fiber optic, digital Subscriber Line (DSL)) or wireless (e.g., infrared, wireless, microwave, etc.).
The computer program product is executed by a computer, which performs the method according to the preceding method embodiment. The computer program product may be a software installation package, which may be downloaded and executed on a computer in case the aforementioned method is required.
The description of each process or structure corresponding to the drawings has its own emphasis; for a part of a certain process or structure that is not described in detail, reference may be made to the related descriptions of other processes or structures.
The foregoing embodiments merely illustrate the principles and effects of the present application and are not intended to limit it. Those of ordinary skill in the art may modify or vary the above embodiments without departing from the spirit and scope of the present application. Accordingly, all equivalent modifications and variations accomplished by persons of ordinary skill in the art without departing from the spirit and technical ideas disclosed herein shall be covered by the claims of this application.

Claims (9)

1. A training method, the training method comprising:
obtaining snore audio data;
processing the snore audio data to obtain a mel-frequency cepstrum coefficient of the snore audio data and a label corresponding to the snore audio data;
training a sleep quality assessment model to be trained based on a training data set to obtain a trained sleep quality assessment model, wherein the training data set is a mel cepstrum coefficient of the snore audio data and a label corresponding to the snore audio data, the sleep quality assessment model to be trained is a decision tree model, and an objective function of the decision tree model in a training iteration process is determined by a first regularization term parameter, a second regularization term parameter and a decision tree corresponding to the number of iterations, the objective function of the decision tree model in the t-th iteration process being:

$$L_t=\sum_{i=1}^{n}\left(y_i-\left(\hat{y}_i^{(t-1)}+g_t(x_i)\right)\right)^2+\alpha H+\frac{\beta}{2}\lVert w\rVert^2$$

wherein $L_t$ represents the objective function of the decision tree model in the t-th round of the iterative process, $y_i$ represents the true value, $\hat{y}_i^{(t-1)}$ represents the predicted value of the decision tree model in the t-1-th round of the iterative process, $\alpha$ represents the first regularization term parameter, $\beta$ represents the second regularization term parameter, $H$ represents the number of leaf nodes of the decision tree in the t-1-th round of the iterative process of the decision tree model, $w$ is a weight matrix of the leaf nodes, $g_t(x_i)$ represents a function of a new generation decision tree with respect to the ith training sample in the t-th iteration process of the decision tree model, $x_i$ is the ith training sample, and $n$ represents the number of training samples; and the improved objective function of the decision tree model is expressed as:

$$Obj(t)=-\frac{1}{2}\sum_{j=1}^{H}\frac{\left(\sum_{i\in I_j}n_i\right)^2}{\sum_{i\in I_j}m_i+\beta}+\alpha H$$

wherein $Obj(t)$ represents the improved objective function in the t-th iteration of the decision tree model, $n_i$ represents the first-order partial derivative of the objective function with respect to the ith training sample in the t-th iteration process of the decision tree model, $m_i$ represents the second-order partial derivative of the objective function with respect to the ith training sample in the t-th iteration process of the decision tree model, $T$ represents the number of iterations the decision tree model has undergone, and $I_j\subseteq I$ denotes the training samples falling on the jth leaf node, $I$ representing a sample set.
2. The training method according to claim 1, wherein the implementation method for processing the snore audio data to obtain mel-frequency cepstral coefficients of the snore audio data and the labels corresponding to the snore audio data includes:
Splitting the snore audio data to obtain segment snore audio data in a plurality of fixed time intervals, wherein the label corresponding to the snore audio data is the label corresponding to the segment snore audio data;
feature extraction processing is carried out on the segment snore audio data so as to obtain 13-dimensional mel cepstrum static coefficients of a plurality of segment snore audio data;
and carrying out average processing on the 13-dimensional mel cepstrum static coefficients of the segment snore audio data to obtain the average 13-dimensional mel cepstrum static coefficient of the segment snore audio data, wherein the mel cepstrum coefficient of the snore audio data is the average 13-dimensional mel cepstrum static coefficient of the segment snore audio data.
3. The training method according to claim 2, wherein the predicted value in the iterative process of the decision tree model is determined by a function space composed of the number of decision trees corresponding to the number of iterations and the decision tree corresponding to the number of iterations, and the training data set is an average 13-dimensional mel cepstrum static coefficient of the segment snore audio data and a label corresponding to the segment snore audio data.
4. A training method as claimed in claim 3, characterized in that the predicted value for the ith training sample is expressed as:
$$\hat{y}_i=\sum_{k=1}^{K}g_k(x_i),\quad g_k\in G$$

wherein $K$ is the number of decision trees in the iterative process of the decision tree model, $G$ is the function space formed by the decision trees, $x_i$ is the ith training sample, and $g_k(x_i)$ is the function of the kth decision tree with respect to the ith training sample.
5. The training method of claim 4, wherein the method for generating the objective function of the decision tree model in the iterative process of training comprises:
generating a new decision tree of the decision tree model in each round of iterative process based on the training data set;
acquiring an objective function of the decision tree model in a training iteration process based on the newly-generated decision tree;
and processing the objective function of the decision tree model in the training iteration process to obtain the improved objective function of the decision tree model.
6. A method of identifying sleep quality, the method comprising:
obtaining snore audio data to be identified;
the trained sleep quality assessment model according to any one of claims 1-5 processes the snore audio data to be identified to obtain a sleep quality identification result.
7. A training device, the training device comprising:
The snore audio data acquisition module is used for acquiring snore audio data;
the snore audio data processing module is used for processing the snore audio data to acquire a mel cepstrum coefficient of the snore audio data and a label corresponding to the snore audio data;
the sleep quality assessment model training module is used for training a sleep quality assessment model to be trained based on a training data set to obtain a trained sleep quality assessment model, wherein the training data set is a mel cepstrum coefficient of the snore audio data and a label corresponding to the snore audio data, the sleep quality assessment model to be trained is a decision tree model, and an objective function of the decision tree model in a training iteration process is determined by a first regularization term parameter, a second regularization term parameter and a decision tree corresponding to the number of iterations, the objective function of the decision tree model in the t-th iteration process being:

$$L_t=\sum_{i=1}^{n}\left(y_i-\left(\hat{y}_i^{(t-1)}+g_t(x_i)\right)\right)^2+\alpha H+\frac{\beta}{2}\lVert w\rVert^2$$

wherein $L_t$ represents the objective function of the decision tree model in the t-th round of the iterative process, $y_i$ represents the true value, $\hat{y}_i^{(t-1)}$ represents the predicted value of the decision tree model in the t-1-th round of the iterative process, $\alpha$ represents the first regularization term parameter, $\beta$ represents the second regularization term parameter, $H$ represents the number of leaf nodes of the decision tree in the t-1-th round of the iterative process of the decision tree model, $w$ is a weight matrix of the leaf nodes, $g_t(x_i)$ represents a function of a new generation decision tree with respect to the ith training sample in the t-th iteration process of the decision tree model, $x_i$ is the ith training sample, and $n$ represents the number of training samples; and the improved objective function of the decision tree model is expressed as:

$$Obj(t)=-\frac{1}{2}\sum_{j=1}^{H}\frac{\left(\sum_{i\in I_j}n_i\right)^2}{\sum_{i\in I_j}m_i+\beta}+\alpha H$$

wherein $Obj(t)$ represents the improved objective function in the t-th iteration of the decision tree model, $n_i$ represents the first-order partial derivative of the objective function with respect to the ith training sample in the t-th iteration process of the decision tree model, $m_i$ represents the second-order partial derivative of the objective function with respect to the ith training sample in the t-th iteration process of the decision tree model, $T$ represents the number of iterations the decision tree model has undergone, and $I_j\subseteq I$ denotes the training samples falling on the jth leaf node, $I$ representing a sample set.
8. A computer readable storage medium, on which a computer program is stored, characterized in that the computer program, when being executed by a processor, implements the training method of any one of claims 1-5 and/or the method of identifying sleep quality of claim 6.
9. An electronic device, the electronic device comprising:
a memory storing a computer program;
a processor in communication with the memory, which when invoked executes the training method of any one of claims 1-5 and/or the method of identifying sleep quality of claim 6.
CN202310391245.0A 2023-04-11 2023-04-11 Training method, method and device for identifying sleep quality, medium and electronic equipment Active CN116386872B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310391245.0A CN116386872B (en) 2023-04-11 2023-04-11 Training method, method and device for identifying sleep quality, medium and electronic equipment

Publications (2)

Publication Number Publication Date
CN116386872A CN116386872A (en) 2023-07-04
CN116386872B true CN116386872B (en) 2024-01-26

Family

ID=86974814

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110570880A (en) * 2019-09-04 2019-12-13 杭州深蓝睡眠科技有限公司 Snore signal identification method
CN112232892A (en) * 2020-12-14 2021-01-15 南京华苏科技有限公司 Method for mining accessible users based on satisfaction of mobile operators

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant