CN116110429A - Construction method of recognition model based on daytime voice OSA severity degree discrimination - Google Patents

Construction method of recognition model based on daytime voice OSA severity degree discrimination

Info

Publication number
CN116110429A
CN116110429A
Authority
CN
China
Prior art keywords
osa
model
voice
hour
features
Prior art date
Legal status
Pending
Application number
CN202310031801.3A
Other languages
Chinese (zh)
Inventor
胡霞
陈炜
陈晨
周涛
张佳辰
罗竞春
Current Assignee
Fudan University
Original Assignee
Fudan University
Priority date
Filing date
Publication date
Application filed by Fudan University filed Critical Fudan University
Priority to CN202310031801.3A
Publication of CN116110429A


Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • G10L25/03 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/21 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being power information
    • A HUMAN NECESSITIES
    • A61 MEDICAL OR VETERINARY SCIENCE; HYGIENE
    • A61B DIAGNOSIS; SURGERY; IDENTIFICATION
    • A61B5/00 Measuring for diagnostic purposes; Identification of persons
    • A61B5/48 Other medical applications
    • A61B5/4806 Sleep evaluation
    • A61B5/4818 Sleep apnoea
    • A61B5/72 Signal processing specially adapted for physiological signals or for diagnostic purposes
    • A61B5/7235 Details of waveform analysis
    • A61B5/7264 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems
    • A61B5/7267 Classification of physiological signals or data, e.g. using neural networks, statistical classifiers, expert systems or fuzzy systems involving training the classification device
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N20/00 Machine learning
    • G06N20/20 Ensemble learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02A TECHNOLOGIES FOR ADAPTATION TO CLIMATE CHANGE
    • Y02A90/00 Technologies having an indirect contribution to adaptation to climate change
    • Y02A90/10 Information and communication technologies [ICT] supporting adaptation to climate change, e.g. for weather forecasting or climate simulation

Landscapes

  • Engineering & Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Signal Processing (AREA)
  • Medical Informatics (AREA)
  • Heart & Thoracic Surgery (AREA)
  • Surgery (AREA)
  • Acoustics & Sound (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Veterinary Medicine (AREA)
  • Evolutionary Computation (AREA)
  • Human Computer Interaction (AREA)
  • Public Health (AREA)
  • General Health & Medical Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Multimedia (AREA)
  • Mathematical Physics (AREA)
  • Molecular Biology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biophysics (AREA)
  • Pathology (AREA)
  • Biomedical Technology (AREA)
  • Computational Linguistics (AREA)
  • Theoretical Computer Science (AREA)
  • Software Systems (AREA)
  • Fuzzy Systems (AREA)
  • Physiology (AREA)
  • Psychiatry (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Measurement Of The Respiration, Hearing Ability, Form, And Blood Characteristics Of Living Organisms (AREA)

Abstract

The invention discloses a method for constructing a recognition model that discriminates OSA severity from daytime voice. The method comprises the following steps: collecting a voice signal of a subject; subjecting the voice signal in sequence to preprocessing, feature extraction, feature selection and feature splicing; constructing a balanced data set; constructing machine learning models as base classifiers, evaluating the models, and selecting the several models with the highest accuracy; and integrating these models with a Voting fusion algorithm, in which the probabilities predicted for each class by the multiple base classifiers are averaged and the class with the highest mean probability is taken as the prediction result, thereby strengthening the classification and generalization ability of the model. The model is used to recognize obstructive sleep apnea syndrome and to discriminate its severity, and offers a non-invasive, faster, lower-cost and highly accurate alternative.

Description

Construction method of recognition model based on daytime voice OSA severity degree discrimination
Technical Field
The invention relates to the technical field of diagnosis and screening of obstructive sleep apnea syndrome, and in particular to a method for constructing a recognition model that discriminates OSA severity from daytime voice.
Background
Obstructive sleep apnea syndrome (Obstructive Sleep Apnea Syndrome, OSAS) is one of the most common sleep disorders. It is characterized by repeated collapse of the upper airway during sleep, resulting in apnea and hypopnea events accompanied by snoring, repeated arousals, intermittent hypoxia and disturbed sleep structure, and hence poor sleep quality. Studies have shown that 936 million adults aged 30-69 worldwide have mild to severe OSA and 425 million have moderate to severe OSA, with China having the largest affected population; however, most affected individuals are not diagnosed and treated in a timely manner. A growing body of research shows that OSA patients face higher risk of chronic and acute disease than normal individuals: OSA may cause cardiovascular disease, excessive daytime sleepiness, mental abnormalities, major traffic accidents and increased mortality, and has been shown to increase the risk of sudden nocturnal cardiac death. Currently, the gold standard for OSA diagnosis is polysomnography (PSG), which observes the average number of apnea and hypopnea events per hour of sleep and diagnoses OSA from the resulting Apnea-Hypopnea Index (AHI).
However, PSG requires long appointment and examination times, its results must be analyzed by professional sleep technologists, and cumbersome sensors must remain attached to the patient throughout the recording, which may alter the patient's sleep structure; poor electrode contact can invalidate an entire night's PSG record. For these reasons, the diagnosis rate of OSA in the general population is relatively low. Currently, primary screening for OSA predicts the risk level of OSA through patient history, physical examination, questions about snoring during sleep, and instruments such as the ESS questionnaire, the Berlin questionnaire and the STOP-Bang questionnaire. Although questionnaires are a practical, low-cost and efficient method, their accuracy is affected by the individual, i.e. patients may exhibit self-selection bias; in addition, many studies indicate that questionnaires suffer from low specificity, and they have proved unable to detect sleep-disordered breathing in patients with cardiovascular disease.
A voice signal contains a large amount of human characteristic information, including emotion, voiceprint and rhythm, and has the advantages of being non-invasive, objective and convenient, giving it high application value. The vocal tract is part of the upper airway; its main structures include the vocal cords, pharyngeal cavity, laryngeal cavity, oral cavity, nasal cavity, soft palate, hard palate, tongue, teeth and lips. Owing to long-term collapse of the upper-airway anatomy, OSA patients are characterized by soft-palate relaxation, pharyngeal stenosis, pharyngeal tumors, enlarged tongue base and enlarged soft-tissue structures. Research shows that the mandible, maxilla, skull base, hyoid bone and head-position characteristics of OSA patients, as well as the dimensions of the upper airway and surrounding soft tissue, are abnormal compared with normal subjects. Since voice is produced by the coordinated action of all the organs of the vocal tract, any anatomical or functional change of the vocal tract can influence the acoustic characteristics of speech and may directly affect an individual's speaking rate, pitch, loudness, timbre and so on.
Because OSA has a high incidence, a low diagnosis rate and high disease risk, because the monetary and time costs of the gold-standard PSG are too high, and because the questionnaire-based primary screening commonly used in clinics suffers from non-negligible defects such as self-selection bias and low specificity, rapid and reliable screening of high-risk OSA patients is one of the research hotspots of sleep medicine. A novel OSA screening tool is urgently needed in the clinic, one that is economical, efficient, reliable and contactless. With such a tool, a sleep physician could quickly and reliably identify patients at risk of severe OSA, preferentially perform PSG sleep tests on them, and recognize and treat OSA early so as to prevent many adverse health outcomes.
Disclosure of Invention
In view of the high incidence, low diagnosis rate and high disease risk of OSA, the excessive monetary and time cost of the gold-standard PSG, and the non-negligible defects of self-selection bias and low specificity in the questionnaire-based primary screening commonly used in clinics, the object of the invention is to provide a method for constructing a recognition model that discriminates OSA severity from daytime voice. By analyzing the characteristics of voice signals and combining them with machine learning, the invention recognizes obstructive sleep apnea syndrome and discriminates and grades its severity, providing a non-invasive, faster and lower-cost diagnostic tool; the invention can thereby alleviate the clinical problems of high OSA incidence with a low diagnosis rate, excessive PSG cost and inadequate questionnaire screening.
The technical scheme of the invention is specifically introduced as follows.
A method for constructing a fully automatic model for recognizing obstructive sleep apnea syndrome and discriminating its severity based on daytime voice comprises the following steps:
(1) Collecting voice signals of a subject, wherein the voice signals comprise vowels and designed words and sentences;
(2) Acquiring the AHI index obtained from the subject's overnight PSG, and grading the severity of the subject;
(3) Performing preprocessing, feature extraction and feature selection on the voice signals: rich multidimensional voice features are extracted, a feature selection method is applied, and the 20 features with the highest contribution values are retained and spliced into a one-dimensional vector as the voice features, providing the most effective feature set for subsequent training;
(4) Applying a data balancing algorithm to the samples to construct a balanced data set, mitigating the imbalance between data classes;
(5) Constructing a machine learning model as a base classifier, and evaluating the model;
(6) Model integration
The models are integrated with a Voting fusion algorithm: the probabilities predicted for each class by the multiple base classifiers are averaged, and the class with the highest mean probability is taken as the prediction result, strengthening the classification and generalization ability of the model.
In the present invention, in step (2), the subjects were graded for severity according to the following criteria:
task one: two categories, no OSA and OSA;
No OSA: AHI < 5 events/hour; OSA: AHI ≥ 5 events/hour;
task two: four categories, no OSA, mild OSA, moderate OSA and severe OSA;
No OSA: AHI < 5 events/hour; mild OSA: 5 events/hour ≤ AHI < 15 events/hour; moderate OSA: 15 events/hour ≤ AHI < 30 events/hour; severe OSA: AHI ≥ 30 events/hour.
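As a non-limiting illustration, this grading rule maps directly to code; a minimal Python sketch (the function names are illustrative, not part of the claimed method):

def grade_osa_severity(ahi: float) -> str:
    # Map an Apnea-Hypopnea Index (events/hour) to an OSA severity grade (task two)
    if ahi < 5:
        return "no OSA"
    if ahi < 15:
        return "mild OSA"
    if ahi < 30:
        return "moderate OSA"
    return "severe OSA"

def has_osa(ahi: float) -> bool:
    # Task one (two-class) collapses the four grades into OSA / no OSA
    return ahi >= 5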
In the invention, in the step (3), the extracted features comprise energy features, time domain features, frequency domain features and music theory features; the feature selection method adopts a univariate selection method.
In the present invention, in step (4), the data balancing algorithm comprises the SMOTE algorithm.
In the invention, in the step (5), the machine learning model comprises a K nearest neighbor model, a support vector machine model and an AdaBoost model.
In the invention, in step (4) and step (5), an Easy Ensemble model is adopted to construct the balanced data set and the base classifiers.
In the invention, in the step (5), the accuracy, the precision, the recall and the F1 fraction are used as model evaluation indexes.
In the present invention, in step (6), the Voting fusion algorithm is a soft Voting fusion algorithm.
In summary, compared with the existing OSA diagnosis technology, the invention has significant substantial advantages:
1. aiming at the characteristics of possible pathological voice patterns (especially those of OSA patients), a corresponding speech paradigm is designed, so that the collected voice signals are richer in type, comprising vowels, words and sentences;
2. the invention applies machine learning to voice signals to develop a fully automatic model for discriminating OSA severity from daytime voice; with the discrimination model constructed by the invention, a doctor can quickly and reliably identify patients at risk of severe OSA and preferentially perform PSG sleep tests on them, so that OSA is recognized and treated early to prevent many adverse health problems; this can alleviate the clinical problems of high OSA incidence with a low diagnosis rate, excessive PSG cost and inadequate questionnaire screening;
3. the method extracts rich multidimensional voice features (covering energy, time-domain, frequency-domain and music theory features) and adopts a feature selection method to screen out effective and reliable features, providing the most effective feature set for subsequent training;
4. to address the poor classification performance caused by class imbalance in the data, a data-balancing SMOTE algorithm is adopted, which improves the classification effect to a certain extent;
5. on top of existing machine learning methods, the invention provides an ensemble model based on a Voting fusion algorithm that fuses the discrimination results of multiple base classifiers, realizing high-accuracy recognition and severity discrimination of OSA from daytime voice, comprising: two-class classification (recognizing whether OSA is present) and four-class classification (discriminating OSA severity: none, mild, moderate, severe).
Drawings
FIG. 1 is a flow chart of a fully automatic OSA severity classification method based on speech features.
Fig. 2 is a schematic diagram of a voice-based fully automatic OSA recognition and severity determination system.
Fig. 3 is a model training flowchart of a full-automatic OSA recognition and severity determination method based on voice according to the present embodiment.
Detailed Description
The technical scheme of the invention is described in detail below with reference to the accompanying drawings and examples.
The invention provides a method for constructing a full-automatic obstructive sleep apnea syndrome identification and severity discrimination model based on daytime voice. The method of the invention comprises the following steps:
(1) Collecting voice signals of a subject, wherein the voice signals comprise vowels, designed words and sentences;
(2) An AHI index is obtained from the subject's overnight PSG, and the severity of the subject is graded by the following criteria:
task 1: two classifications, i.e., no OSA (AHI < 5 events/hour) or OSA (AHI ≥ 5 events/hour);
task 2: four classifications, i.e., no OSA (AHI < 5 events/hour), mild OSA (5 events/hour ≤ AHI < 15 events/hour), moderate OSA (15 events/hour ≤ AHI < 30 events/hour) or severe OSA (AHI ≥ 30 events/hour);
(3) The voice signals are preprocessed and subjected to feature extraction and feature selection: rich multidimensional voice features are extracted (covering energy, time-domain, frequency-domain and music theory features), a feature selection method is applied, and the 20 features with the highest contribution values are retained and spliced into a one-dimensional vector as the voice features, providing the most effective feature set for subsequent training;
(4) A data balancing algorithm is applied to the samples, comprising the SMOTE algorithm and the Easy Ensemble model, to construct a balanced data set and mitigate the imbalance between data classes;
(5) The constructed machine learning models comprise the K nearest neighbor model, the support vector machine model, the AdaBoost model, the EasyEnsemble model, etc.;
(6) Model integration: on top of the existing machine learning methods, an ensemble model using a Voting fusion algorithm combines the base classifiers with higher accuracy into the final model, strengthening its classification and generalization ability.
(7) Using 5-fold cross-validation, the data set is randomly divided into 5 parts; each time, 4 parts serve as the training set and the remaining part as the test set. The 4 training parts are fed in turn into the machine learning models and the Voting integration model for training to obtain the optimal models; finally, the averages of accuracy, precision, recall and F1 score over the 5 test sets are taken as the model evaluation indexes to verify the performance of the models.
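A minimal sketch of this evaluation protocol with scikit-learn; the classifier and the synthetic data below are placeholders standing in for the trained models and the extracted voice features:

from sklearn.datasets import make_classification
from sklearn.model_selection import cross_validate
from sklearn.neighbors import KNeighborsClassifier

X, y = make_classification(n_samples=200, n_features=20, random_state=0)  # placeholder data
scoring = ["accuracy", "precision_macro", "recall_macro", "f1_macro"]
# 5-fold CV: each fold trains on 4/5 of the data and tests on the remaining 1/5
results = cross_validate(KNeighborsClassifier(), X, y, cv=5, scoring=scoring)
for metric in scoring:
    print(metric, results["test_" + metric].mean())  # average over the 5 test folds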
In the present invention, the fully automatic voice-based OSA recognition and severity determination model constructed above (Fig. 2) comprises a data acquisition module; a preprocessing, feature extraction and feature selection module; a two-classification module; and a four-classification module; wherein:
the data acquisition module is used to acquire the voice signals;
the preprocessing, feature extraction and feature selection module is used to preprocess the voice signal, extract features, retain the 20 features with the highest contribution values, and splice them into a one-dimensional vector as the voice features input to the subsequent models;
the two-classification module is used to perform two-class classification on the voice features, i.e., to evaluate whether OSA is present;
and the four-classification module is used to perform four-class classification on the voice features, i.e., to evaluate no OSA, mild OSA, moderate OSA or severe OSA.
Further, in an embodiment, a method for constructing a full-automatic OSA recognition and severity determination model based on daytime voice, as shown in fig. 3, specifically includes the steps of:
1. collecting voice signals of a subject, wherein the voice signals comprise vowels, designed words and sentences;
a microphone was fixed about 20 cm from the subject's mouth, at an angle of 15° to the horizontal; digital audio signals were recorded at a sampling rate of 44.1 kHz, and the prescribed voice signals, including vowels, words and sentences, were collected.
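The recording step could be scripted, for example, with the sounddevice library; the prompt duration and file name below are assumptions, while the 44.1 kHz rate follows the embodiment:

import sounddevice as sd
from scipy.io import wavfile

fs = 44100                    # sampling rate specified in the embodiment
duration = 5                  # seconds per prompt (assumed)
audio = sd.rec(int(duration * fs), samplerate=fs, channels=1)
sd.wait()                     # block until the recording finishes
wavfile.write("subject_vowel.wav", fs, audio)  # hypothetical file name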
2. The severity of subjects was graded by the following criteria, with an AHI index derived from the subject's overnight PSG acquisition:
task 1: two classifications, i.e., no OSA (AHI < 5 events/hour) or OSA (AHI ≥ 5 events/hour);
task 2: four classifications, i.e., no OSA (AHI < 5 events/hour), mild OSA (5 events/hour ≤ AHI < 15 events/hour), moderate OSA (15 events/hour ≤ AHI < 30 events/hour) or severe OSA (AHI ≥ 30 events/hour);
3. preprocessing, feature extraction and feature selection are carried out on the voice signals, and the first 20 features with the highest contribution value are reserved and spliced into one-dimensional vectors to serve as voice features.
The preprocessing comprises the steps of downsampling the voice signal and extracting voice fragments by utilizing endpoint detection;
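A minimal sketch of this preprocessing, assuming the librosa library; the 16 kHz target rate and the 30 dB silence threshold are illustrative assumptions:

import numpy as np
import librosa

y, sr = librosa.load("subject_vowel.wav", sr=16000)      # load and downsample
intervals = librosa.effects.split(y, top_db=30)          # energy-based endpoint detection
speech = np.concatenate([y[s:e] for s, e in intervals])  # keep only the voiced segments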
the feature extraction method utilizes python software to extract voice features in batches and can be divided into the following parts according to different feature types:
energy characteristics: root mean square energy;
time domain features: attack time, zero crossing rate, autocorrelation;
frequency domain characteristics: the method comprises the steps of a spectrogram, a spectrum centroid, a spectrum flux, a mean value of each stage of mel cepstrum coefficient, a standard deviation of each stage of mel cepstrum coefficient, a mean value of each stage of difference of each stage of mel cepstrum coefficient and the like;
music theory feature: fundamental frequency, mean value and median of fundamental frequency, the most value of fundamental frequency, the mean value of fundamental frequency disturbance, the minimum value and standard deviation of first formants, second formants and third formants, etc.;
the characteristics are selected, the characteristics which have higher contribution degree and can represent most information are extracted by using a univariate selection method, and the first 20 characteristics with the highest contribution value are reserved, so that the data redundancy is reduced, and the training time is shortened.
4. This embodiment adopts data balancing algorithms, comprising the SMOTE algorithm and the Easy Ensemble model, to construct a balanced data set and mitigate the imbalance between data classes;
SMOTE algorithm
The main principle is to synthesize virtual minority-class samples by interpolating between existing minority-class samples. The method comprises the following steps: for each sample x in the minority class, its k nearest neighbor samples are found, and N of them are randomly selected, denoted y_1, y_2, ..., y_N; random linear interpolation is then performed between the minority-class sample x and each y_j to construct a new minority-class sample p_j, according to the formula:

p_j = x + rand(0,1) × (y_j − x), j = 1, 2, ..., N

wherein: p_j is the newly interpolated sample; x is the selected original sample; rand(0,1) is a random number between 0 and 1.
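A toy sketch implementing the interpolation formula above (in practice a library implementation such as imbalanced-learn's SMOTE would typically be used):

import numpy as np
from sklearn.neighbors import NearestNeighbors

def smote_oversample(X_min, k=5, n_new=2, seed=0):
    # X_min: minority-class samples; returns n_new synthetic samples per original sample
    rng = np.random.default_rng(seed)
    nn = NearestNeighbors(n_neighbors=k + 1).fit(X_min)
    _, idx = nn.kneighbors(X_min)  # idx[i][0] is the point itself, so skip it below
    synthetic = []
    for i, x in enumerate(X_min):
        for j in rng.choice(idx[i][1:], size=n_new, replace=False):
            y_j = X_min[j]
            synthetic.append(x + rng.uniform(0, 1) * (y_j - x))  # p_j = x + rand(0,1)*(y_j - x)
    return np.asarray(synthetic)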
Easy Ensemble model
The main principle is to train multiple classifiers for ensemble learning by repeatedly combining the minority-class samples with an equal number of randomly drawn majority-class samples. The method comprises the following steps: let the minority class contain P samples and the majority class contain N samples. P samples are randomly drawn from the majority class and combined with the minority-class samples, and the combined subset is input to a base classifier for training; the sampling and training are repeated to obtain T base classifiers. The weighted prediction outputs of each base classifier's weak learners are summed and the classification is decided by a sign function:

H_i(x) = sgn( Σ_{j=1}^{s_i} α_{i,j} · h_{i,j}(x) − θ_i )

H(x) = sgn( Σ_{i=1}^{T} Σ_{j=1}^{s_i} α_{i,j} · h_{i,j}(x) − Σ_{i=1}^{T} θ_i )

wherein: h_{i,j}(x) is a base classifier; α_{i,j} is the weight of the corresponding base classifier; s_i is the number of iterations of each base model; H_i is the ensemble of each base model; θ_i is the threshold of the ensemble model.
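This scheme is available off the shelf; a sketch using imbalanced-learn's EasyEnsembleClassifier (the number of subsets and the toy data are assumptions):

from imblearn.ensemble import EasyEnsembleClassifier
from sklearn.datasets import make_classification

X, y = make_classification(n_samples=300, weights=[0.9, 0.1], random_state=0)  # imbalanced toy data
# T = 10 balanced subsets, each training an AdaBoost sub-ensemble by default
easy = EasyEnsembleClassifier(n_estimators=10, random_state=0).fit(X, y)
proba = easy.predict_proba(X)  # averaged class probabilities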
5. The balanced data set obtained after the SMOTE algorithm is input into the machine learning models for training, the models comprising a K nearest neighbor model, a support vector machine model and an AdaBoost model;
k nearest neighbor model
The main principle is to assign the sample to be classified to the class that occurs most often among its K nearest neighbors. The method comprises the following steps: the distance between the test sample and each training sample is computed; the distances are sorted in ascending order and the training samples corresponding to the K smallest distances are taken; the class occurring most frequently among these K nearest neighbors is taken as the predicted class of the test sample.
Support vector machine model
The main principle is to map the data set into a high-dimensional space and search for the optimal classification hyperplane in that space; the hyperplane must correctly separate the training set with the maximum margin. Solving for this hyperplane can be expressed as a convex quadratic optimization problem:

min_{w,b} (1/2) · ||w||²

s.t. y_i · (w^T · x_i + b) ≥ 1, i = 1, 2, ..., m

wherein: w is the weight vector; b is the bias; (x_i, y_i) is a training sample.
AdaBoost model
The AdaBoost model is an ensemble learning algorithm that iteratively combines multiple weak classifiers into one strong classifier to improve the performance of the prediction model. The method comprises the following steps:

First, the weight distribution of the training data is initialized, giving each of the N training samples the same weight:

w_1(i) = 1/N, i = 1, 2, ..., N

so that the initial weight distribution of the training sample set is D_1 = (w_1(1), w_1(2), ..., w_1(N)).

In round t, the weak classifier h_t with the lowest error rate under the current distribution D_t is selected as the t-th base classifier, and its error on the distribution is computed as:

ε_t = Σ_{i=1}^{N} w_t(i) · I(h_t(x_i) ≠ y_i)

The weight of this weak classifier in the final classifier is then computed:

α_t = (1/2) · ln((1 − ε_t) / ε_t)

and the weight distribution of the training samples is updated:

w_{t+1}(i) = w_t(i) · exp(−α_t · y_i · h_t(x_i)) / Z_t

wherein Z_t is the normalization constant:

Z_t = Σ_{i=1}^{N} w_t(i) · exp(−α_t · y_i · h_t(x_i))

Finally, the individual weak classifiers are combined according to their weights α_t:

f(x) = Σ_{t=1}^{T} α_t · h_t(x)

to obtain the strong classifier:

H(x) = sign(f(x))
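The three base classifiers described above can be instantiated with scikit-learn as sketched below; probability=True on the SVM is needed so that all three can supply the class probabilities used by the soft Voting fusion in step 6 (the hyperparameter values are assumptions, not values fixed by the embodiment):

from sklearn.neighbors import KNeighborsClassifier
from sklearn.svm import SVC
from sklearn.ensemble import AdaBoostClassifier

knn = KNeighborsClassifier(n_neighbors=5)   # K nearest neighbor model
svm = SVC(kernel="rbf", probability=True)   # support vector machine model
ada = AdaBoostClassifier(n_estimators=100)  # AdaBoost model
base_classifiers = [("knn", knn), ("svm", svm), ("ada", ada)]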
6. On top of the existing machine learning methods, this embodiment adopts an ensemble model based on a Voting fusion algorithm: the probabilities predicted for each class by the multiple base classifiers are averaged, and the class with the highest mean probability is taken as the prediction result, effectively improving the predictive power of the model and strengthening its generalization ability.
Voting fusion algorithm
The method adopts a soft Voting fusion algorithm. The main principle is to average the class probabilities predicted by all base classifiers and take the class with the highest mean probability as the final class. The method comprises the following steps: assume X is an input sample and C is its corresponding label; T base classifiers are selected to predict the sample, and their prediction results are integrated by the following formula:

H(x) = argmax_c Σ_{i=1}^{T} w_i · h_i^c(x)

wherein: w_i (i = 1, 2, ..., T) is the weight of each base model, and h_i^c(x) is the probability that base classifier i predicts class c for the sample.
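This soft Voting rule corresponds to scikit-learn's VotingClassifier with voting="soft", which averages predict_proba over the base classifiers and picks the class with the highest mean; a sketch reusing base_classifiers from the step-5 sketch, with equal weights w_i = 1 assumed:

from sklearn.ensemble import VotingClassifier

voting = VotingClassifier(estimators=base_classifiers, voting="soft")
voting.fit(X_train, y_train)     # X_train, y_train: the balanced training data from step 4
y_pred = voting.predict(X_test)  # class with the highest mean probability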
In this embodiment, 5-fold cross-validation is adopted: the data set is randomly divided into 5 parts, with 4 parts as the training set and the remaining part as the test set each time; the 4 training parts are fed in turn into the machine learning models for training to obtain the optimal models, the models comprising the K nearest neighbor model, the support vector machine model, the AdaBoost model, the EasyEnsemble model and the Voting integration model. Finally, the averages of accuracy, precision, recall and F1 score over the 5 test sets are taken as the model evaluation indexes to verify the performance of the models. The two-classification verification results are shown in Table 1, and the four-classification verification results are shown in Table 2.
TABLE 1 Two-classification verification results

Metric      K nearest neighbor   Support vector machine   AdaBoost   EasyEnsemble   Voting integration
Accuracy    0.91                 0.90                     0.87       0.62           0.95
Precision   0.92                 0.91                     0.88       0.52           0.96
Recall      0.91                 0.90                     0.87       0.61           0.95
F1 score    0.91                 0.90                     0.87       0.44           0.95
TABLE 2 Four-classification verification results

Metric      K nearest neighbor   Support vector machine   AdaBoost   EasyEnsemble   Voting integration
Accuracy    0.84                 0.89                     0.54       0.42           0.93
Precision   0.87                 0.89                     0.54       0.28           0.94
Recall      0.84                 0.89                     0.54       0.37           0.93
F1 score    0.82                 0.88                     0.53       0.22           0.93

Claims (8)

1. A method for constructing a fully automatic model for recognizing obstructive sleep apnea syndrome and discriminating its severity based on daytime voice, characterized in that the model is used for fully automatic recognition of obstructive sleep apnea syndrome and discrimination of its severity; the method comprises the following steps:
(1) Collecting voice signals of a subject, wherein the voice signals comprise vowels and designed words and sentences;
(2) Grading the severity of the subject according to the AHI index obtained from the subject's overnight PSG;
(3) Performing preprocessing, feature extraction, feature selection and feature splicing on the voice signals: rich multidimensional voice features are extracted, a feature selection method is applied, and the 20 features with the highest contribution values are retained and spliced into a one-dimensional vector as the voice features, providing the most effective feature set for subsequent training;
(4) Applying a data balancing algorithm to the samples to construct a balanced data set, mitigating the imbalance between data classes;
(5) Taking a plurality of built machine learning models as a base classifier, and evaluating the models;
(6) Model integration
The models are integrated with a Voting fusion algorithm: the probabilities predicted for each class by the multiple base classifiers are averaged, and the class with the highest mean probability is taken as the prediction result, strengthening the classification and generalization ability of the model.
2. The method of claim 1, wherein in step (2), the subject is graded for severity according to two criteria:
task one
Two classifications: OSA-free and OSA-free;
No OSA: AHI < 5 events/hour; OSA: AHI ≥ 5 events/hour;
task two
Four classifications: OSA, mild OSA, moderate OSA, and severe OSA;
No OSA: AHI < 5 events/hour; mild OSA: 5 events/hour ≤ AHI < 15 events/hour; moderate OSA: 15 events/hour ≤ AHI < 30 events/hour; severe OSA: AHI ≥ 30 events/hour.
3. The method of claim 1, wherein in step (3), the extracted features include energy features, time domain features, frequency domain features, and music theory features; the feature selection method adopts a univariate selection method.
4. The method of claim 1, wherein in step (4), the data balancing algorithm is the SMOTE algorithm.
5. The method of claim 1, wherein in step (5), the machine learning model includes a K-nearest neighbor model, a support vector machine model, and an AdaBoost model.
6. The construction method according to claim 1, wherein in step (4) and step (5), an Easy Ensemble model is used to construct the balanced data set and the base classifiers.
7. The method according to claim 1, wherein in step (5), the accuracy, the precision, the recall and the F1 score are used as model evaluation indexes.
8. The method of claim 1, wherein in step (6), the Voting fusion algorithm is a soft Voting fusion algorithm.
CN202310031801.3A 2023-01-10 2023-01-10 Construction method of recognition model based on daytime voice OSA severity degree discrimination Pending CN116110429A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310031801.3A CN116110429A (en) 2023-01-10 2023-01-10 Construction method of recognition model based on daytime voice OSA severity degree discrimination

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310031801.3A CN116110429A (en) 2023-01-10 2023-01-10 Construction method of recognition model based on daytime voice OSA severity degree discrimination

Publications (1)

Publication Number Publication Date
CN116110429A true CN116110429A (en) 2023-05-12

Family

ID=86266916

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310031801.3A Pending CN116110429A (en) 2023-01-10 2023-01-10 Construction method of recognition model based on daytime voice OSA severity degree discrimination

Country Status (1)

Country Link
CN (1) CN116110429A (en)


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination