CN109584861A - Screening method for Alzheimer's disease speech signals based on deep learning - Google Patents

Screening method for Alzheimer's disease speech signals based on deep learning Download PDF

Info

Publication number
CN109584861A
Authority
CN
China
Prior art keywords
voice
alzheimer
feature
disease
deep learning
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201811464595.0A
Other languages
Chinese (zh)
Inventor
周青 (Zhou Qing)
顾明亮 (Gu Mingliang)
马勇 (Ma Yong)
朱祖德 (Zhu Zude)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Jiangsu Normal University
Original Assignee
Jiangsu Normal University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Jiangsu Normal University filed Critical Jiangsu Normal University
Priority to CN201811464595.0A priority Critical patent/CN109584861A/en
Publication of CN109584861A publication Critical patent/CN109584861A/en
Pending legal-status Critical Current

Links

Classifications

    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00 - Speech recognition
    • G10L15/06 - Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G10L15/063 - Training
    • G10L15/08 - Speech classification or search
    • G10L15/16 - Speech classification or search using artificial neural networks
    • G10L21/00 - Processing of the speech or voice signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/02 - Speech enhancement, e.g. noise reduction or echo cancellation
    • G10L21/0208 - Noise filtering
    • G10L25/00 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/03 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
    • G10L25/24 - Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum

Abstract

A method for screening Alzheimer's disease speech signals based on deep learning, relating to speech processing technology, comprising the steps of: training a deep belief network model for later use; having the subject perform different spoken-output tasks and collecting the subject's speech; preprocessing the collected speech; extracting pathological features related to Alzheimer's disease from the preprocessed speech and feeding them into the trained deep belief network model to obtain optimized features; and feeding the optimized features into a trained SVM classifier for classification, the classification results being the screening results. The method of the invention uses deep learning to realize rapid AD screening and can make a preliminary judgment from the subject's speech alone; it is simple in method and highly intelligent.

Description

Screening method for Alzheimer's disease speech signals based on deep learning
Technical field
The present invention relates to speech processing technology, and in particular to a method for screening Alzheimer's disease speech signals based on deep learning.
Background technique
Alzheimer's disease (AD) has become one of the focal concerns of aging societies. A national epidemiological survey shows that the prevalence of Alzheimer's disease among China's population over 65 reaches 4.8%. Current clinical AD diagnosis requires 2-3 hours of standardized neuropsychological assessment by experienced clinicians, together with neural-marker examinations of low availability and high cost, such as PET imaging or invasive spinal puncture; screening the nearly ten million potential dementia patients through this conventional route is very difficult.
Earlier studies observing the spoken-output impairments of AD patients found that abnormalities of linguistic function can serve as important evidence for AD assessment and diagnosis. Therefore, by analyzing the phonetic features of the subject's speech signal, a pathological-feature model is trained with a deep neural network algorithm to find effective pathological features of AD patients, and an SVM classifier is used to rapidly screen AD patients in a non-intrusive manner, providing the clinical diagnosis of AD with an objective measurement method that is low-cost, highly feasible, structurally simple, and intelligent.
Summary of the invention
The object of the present invention is to provide a rapid Alzheimer's disease screening technique based on deep-belief-network feature optimization. The subject's speech signal is processed and analyzed to extract relevant pathological features, including fundamental frequency, jitter, shimmer, harmonics-to-noise ratio (HNR), signal-to-noise ratio (SNR), short-time zero-crossing rate, short-time energy, formants, MFCC, LPC, speech pauses, and speech rate. The extracted pathological features are analyzed, and a deep belief network model for feature optimization and an SVM model for classification are established and trained, thereby realizing rapid screening of Alzheimer's disease patients.
To achieve the above object, the technical solution of the present invention is as follows:
A method for screening Alzheimer's disease speech signals based on deep learning, comprising the steps of:
S1: training a deep belief network model for later use;
S2: having the subject perform different spoken-output tasks and collecting the subject's speech;
S3: preprocessing the collected speech;
S4: extracting pathological features related to Alzheimer's disease from the preprocessed speech and feeding them into the trained deep belief network model to obtain optimized features;
S5: feeding the optimized features into a trained SVM classifier for classification, the classification results being the screening results.
As a further improvement of the present invention, step S2 specifically comprises: measuring the on-site noise, eliminating noise sources, and performing speech collection once the noise meets requirements; and, during speech collection, having the subject perform different spoken-output tasks and labeling and organizing the speech.
As a further improvement of the present invention, step S2 specifically comprises: measuring the on-site noise, eliminating noise sources, and performing speech collection once the noise meets requirements; and, during speech collection, having the subject perform different spoken-output tasks, the spoken-output tasks including self-introduction, a verbal fluency test, picture description, and sustained vowel phonation, and labeling and organizing the speech.
As a further improvement of the present invention, step S3 specifically comprises: denoising, parameter normalization, pre-emphasis, windowing, and framing of the collected speech data to obtain a speech frame sequence.
As a further improvement of the present invention, step S3 specifically comprises: denoising, parameter normalization, pre-emphasis, windowing, and framing of the collected speech data to obtain a speech frame sequence, wherein the pre-emphasis, windowing, and framing are performed with openSMILE.
As a further improvement of the present invention, step S4 specifically comprises: extracting the pathological features of each speech frame in the speech frame sequence and computing their first-order and second-order differences to form new multidimensional pathological features; and feeding the multidimensional pathological features into the trained deep belief network model, which outputs the optimized features.
As a further improvement of the present invention, step S4 specifically comprises: extracting the pathological features of each speech frame in the speech frame sequence and computing their first-order and second-order differences to form new multidimensional pathological features, wherein the pathological features include: fundamental frequency, jitter, shimmer, harmonics-to-noise ratio, signal-to-noise ratio, short-time zero-crossing rate, short-time energy, formants, MFCC, LPC, speech pauses, and speech rate; and feeding the multidimensional pathological features into the trained deep belief network model, which outputs the optimized features.
As a further improvement of the present invention, step S5 specifically comprises: feeding the optimized features as input into the trained SVM classifier for classification, the classification results being the detection results, wherein the training process of the SVM classifier model is: preprocessing the data in the training set, extracting pathological features, feeding them into the deep belief network model to obtain optimized features, and feeding those optimized features into the SVM classifier for training to obtain the trained SVM classifier model.
Compared with the prior art, the beneficial effects of the present invention are as follows: the method for screening Alzheimer's disease speech signals based on deep learning realizes rapid AD screening through deep learning, can make a preliminary judgment from the subject's speech alone, is simple in method, and is highly intelligent.
Brief description of the drawings
Fig. 1 is a schematic flow chart of the method of the present invention;
Fig. 2 is a schematic flow chart of speech collection;
Fig. 3 is a schematic flow chart of speech preprocessing;
Fig. 4 is a schematic flow chart of feature extraction;
Fig. 5 is a schematic diagram of training optimized features with the deep network framework;
Fig. 6 is the flow chart of RBM parameter training;
Fig. 7 is the flow chart of SVM classifier training and classification.
Specific embodiment:
The present invention is described further with reference to the accompanying drawings.
Embodiment
Fig. 1 is a schematic flow chart of the method for screening Alzheimer's disease speech signals based on deep learning according to the present invention, comprising the steps of:
1) collecting and organizing the subject's speech while the subject performs different spoken-output tasks;
2) preprocessing the above speech;
3) extracting the acoustic features of the above speech and feeding them into the deep belief network for training to obtain optimized features;
4) feeding the optimized features into the trained SVM classifier for classification, thereby realizing automatic identification of Alzheimer's disease patients from the input speech.
Fig. 2 is a schematic flow chart of speech collection. The purpose of this part is to acquire the raw experimental data and collect the training speech files needed by the subsequent algorithms. The test administrator first measures the on-site noise; if it exceeds 55 dB, noise sources are eliminated, and speech collection proceeds only after the noise has fallen to 55 dB or below.
During speech collection, the subject performs four different spoken-output tasks: "self-introduction", "verbal fluency test", "picture description", and "sustained vowel phonation"; the speech is then saved.
The training-set speech is saved, labeled, and organized as follows: all recordings of each subject are saved in a folder bearing the same number as the subject; no personal information is kept in the saving process, only the distinguishing number and the diagnosis result (young adult, healthy elderly, AD patient, or not yet diagnosed).
Fig. 3 is a schematic flow chart of speech preprocessing. Training data and test data are each denoised and parameter-normalized, then successively pre-emphasized, windowed, and framed to obtain speech frame sequences. Denoising: an automatic segmentation program detects the speech precisely and removes interference such as coughing; the training data are additionally proofread manually, and obvious noise and long silent stretches within the speech segments are annotated and cut. Parameter normalization: since recording environments and equipment differ, after the data are pooled, the sampling rate, bit rate, and other parameters are unified according to the experimental requirements, and amplitude normalization is performed with Audition software to eliminate interference. Cutting: to examine the influence of speech segments of different durations on discrimination, an automatic segmentation program is designed to cut the training data into segments, with the cutting duration settable manually.
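As a concrete illustration of this preprocessing chain, the following is a minimal Python sketch of the pre-emphasis, framing, and windowing steps. The patent performs these with openSMILE; the 0.97 pre-emphasis coefficient, 25 ms frame length, and 10 ms frame shift used here are conventional assumptions, not values stated in the patent.

```python
import numpy as np

def preprocess(signal, sr, pre_coef=0.97, frame_ms=25, shift_ms=10):
    """Pre-emphasize a speech signal, split it into overlapping frames,
    and apply a Hamming window to each frame.

    Parameter defaults are conventional assumptions; the patent itself
    delegates these steps to openSMILE. Assumes len(signal) >= one frame.
    """
    # Pre-emphasis: y[n] = x[n] - a * x[n-1], boosting high frequencies
    emphasized = np.append(signal[0], signal[1:] - pre_coef * signal[:-1])

    frame_len = int(sr * frame_ms / 1000)
    frame_shift = int(sr * shift_ms / 1000)
    n_frames = 1 + max(0, (len(emphasized) - frame_len) // frame_shift)

    # Overlapping frames, each multiplied by a Hamming window
    frames = np.stack([
        emphasized[k * frame_shift : k * frame_shift + frame_len]
        for k in range(n_frames)
    ])
    return frames * np.hamming(frame_len)
```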
After the collected speech signal has been processed, pathological feature extraction is performed. Fig. 4 is the pathological feature extraction flow chart. The extracted features include, but are not limited to: fundamental frequency, jitter, shimmer, harmonics-to-noise ratio (HNR), signal-to-noise ratio (SNR), short-time zero-crossing rate, short-time energy, formants, MFCC, LPC, speech pauses, and speech rate. The training data undergo the same feature extraction.
The MFCC feature is taken as an example below.
To extract the MFCC features of each speech frame, the frequency-domain signal is first obtained by Fourier transform and taking the modulus; the mel-domain output is obtained through triangular filter banks; the logarithm is taken and decorrelated by the discrete cosine transform to obtain 13th-order MFCC parameters; first-order and second-order differences are then computed from them to form 39-dimensional MFCC features.
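A minimal sketch of this 39-dimensional MFCC computation, written here with the librosa library rather than the openSMILE toolchain the patent actually uses; the 16 kHz sampling rate is an assumption.

```python
import numpy as np
import librosa

def mfcc_39(wav_path):
    """Per-frame 13 MFCCs plus first- and second-order differences (39 dims)."""
    y, sr = librosa.load(wav_path, sr=16000)       # assumed 16 kHz sampling rate
    # FFT magnitude -> mel triangular filter bank -> log -> DCT (13 coefficients)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)
    delta1 = librosa.feature.delta(mfcc, order=1)  # first-order difference
    delta2 = librosa.feature.delta(mfcc, order=2)  # second-order difference
    return np.vstack([mfcc, delta1, delta2])       # shape: (39, n_frames)
```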
The features are extracted as follows: openSMILE computes features including fundamental frequency, jitter, shimmer, MFCC, and LPC; the MATLAB voicebox toolkit extracts features including harmonics-to-noise ratio (HNR), signal-to-noise ratio (SNR), short-time zero-crossing rate, and short-time energy; speech rate, speech pauses, and formant features are obtained with Praat scripts.
In particular, for the extraction of speech-pause features, one of the pathological features of AD patients, five measures are used as a global characterization of pausing in the speech: total duration of the speech segment, total phonation duration, total pause duration, number of pauses, and phonation/pause ratio.
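A minimal sketch of these five pause measures, assuming a crude energy-threshold voiced/unvoiced decision in place of the Praat script the patent uses; the energy threshold and the 300 ms minimum pause length are illustrative assumptions.

```python
import numpy as np

def pause_features(frames, frame_shift_s=0.010, min_pause_s=0.3):
    """Total duration, phonation duration, pause duration, pause count,
    and phonation/pause ratio of a speech segment, from windowed frames.

    The energy threshold below is a placeholder for a real voice
    activity detector; the patent derives these measures with Praat.
    """
    energy = np.sum(frames ** 2, axis=1)
    voiced = energy > 0.1 * np.mean(energy)        # illustrative threshold

    total_s = len(voiced) * frame_shift_s
    phonation_s = float(np.sum(voiced)) * frame_shift_s
    pause_s = total_s - phonation_s

    # Count unvoiced runs of at least min_pause_s as pauses
    min_frames = int(min_pause_s / frame_shift_s)
    n_pauses, run = 0, 0
    for v in voiced:
        run = run + 1 if not v else 0
        if run == min_frames:                      # run just reached pause length
            n_pauses += 1

    ratio = phonation_s / pause_s if pause_s > 0 else float("inf")
    return {"total_s": total_s, "phonation_s": phonation_s,
            "pause_s": pause_s, "n_pauses": n_pauses, "ratio": ratio}
```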
To better preserve the effective information of the features, the feature-optimization step here uses a deep belief network model: the pathological features are fed into the pre-trained deep belief network (DBN) model, which outputs the optimized features.
A typical deep belief network is composed of multiple layers of Restricted Boltzmann Machines (RBMs) and one layer of BP neural network. The whole training process can be summarized in two steps, bottom-up unsupervised learning and top-down supervised learning:
1. In the first step, the RBM parameters of each layer are trained layer by layer from the bottom up using unlabeled data; this is the pre-training of the DBN.
2. In the second step, the BP neural network propagates the difference between the network's final output and the labeled data back from the top down, adjusting the network parameters to the optimum; this is the fine-tuning of the network.
Fig. 5 is the flow chart of deep belief network training: the extracted pathological features are input, the RBM parameters of each layer are trained layer by layer from the bottom up, and the output of pre-training serves as the input of the SVM classifier.
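The greedy layer-wise pre-training of Fig. 5 can be sketched as follows; the layer widths and the train_rbm helper are illustrative assumptions (one possible CD-1 implementation of such a helper is sketched after the Fig. 6 description below).

```python
import numpy as np

def pretrain_dbn(X, hidden_sizes, train_rbm):
    """Greedy layer-wise DBN pre-training sketch.

    X:            (n_samples, n_features) pathological features scaled to [0, 1].
    hidden_sizes: widths of the hidden layers, e.g. [256, 128, 64] (assumed).
    train_rbm:    hypothetical helper fitting one RBM and returning (W, a, b).
    """
    data, params = X, []
    for n_hidden in hidden_sizes:
        W, a, b = train_rbm(data, n_hidden)
        params.append((W, a, b))
        # The hidden activations of this RBM become the next layer's input
        data = 1.0 / (1.0 + np.exp(-(data @ W + b)))
    return params, data   # 'data' is the optimized feature representation
```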
The RBM is the essential component of the DBN. It is an undirected generative probabilistic model composed of two layers of neurons (a visible layer v and a hidden layer h). The visible units of an RBM take values in [0, 1], while the hidden units can only take the values 0 or 1. The connectivity of the network leaves the neurons within a layer statistically independent given the other layer, and the energy function of the RBM over v and h is:

$$E(v, h \mid \theta) = -\sum_{i=1}^{I} a_i v_i - \sum_{j=1}^{J} b_j h_j - \sum_{i=1}^{I}\sum_{j=1}^{J} v_i w_{ij} h_j$$

where I and J are the numbers of visible-layer and hidden-layer neurons, v and h are the visible and hidden units, and θ = {a, b, w} are the parameters of the RBM model.

The conditional probabilities that $h_j = 1$ or $v_i = 1$ are

$$P(h_j = 1 \mid v) = \sigma\Big(b_j + \sum_{i} v_i w_{ij}\Big), \qquad P(v_i = 1 \mid h) = \sigma\Big(a_i + \sum_{j} w_{ij} h_j\Big)$$

where the activation function is $\sigma(x) = \frac{1}{1 + e^{-x}}$.
Fig. 6 is the flow chart of RBM parameter training. The goal of RBM learning is to obtain the parameters of its network model, and gradient descent is used to seek the minimum energy of the network structure.
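A minimal numpy sketch of one contrastive-divergence (CD-1) update, the standard gradient approximation for the RBM learning described above; the learning rate and the choice of CD-1 are assumptions, since the patent states only that gradient descent seeks the minimum energy.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def cd1_step(v0, W, a, b, lr=0.01, rng=None):
    """One CD-1 update of the RBM parameters theta = {a, b, W}.

    Shapes: v0 (batch, I) with values in [0, 1], W (I, J), a (I,), b (J,).
    """
    rng = rng or np.random.default_rng()
    # Up pass: P(h_j = 1 | v) = sigmoid(b_j + sum_i v_i w_ij)
    ph0 = sigmoid(v0 @ W + b)
    h0 = (rng.random(ph0.shape) < ph0).astype(float)

    # Down pass (reconstruction): P(v_i = 1 | h) = sigmoid(a_i + sum_j w_ij h_j)
    pv1 = sigmoid(h0 @ W.T + a)
    ph1 = sigmoid(pv1 @ W + b)             # up pass from the reconstruction

    # Parameter update: <v h>_data minus <v h>_reconstruction
    batch = v0.shape[0]
    W += lr * (v0.T @ ph0 - pv1.T @ ph1) / batch
    a += lr * np.mean(v0 - pv1, axis=0)
    b += lr * np.mean(ph0 - ph1, axis=0)
    return W, a, b
```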
The optimized features output by the deep belief network model serve as the input of the SVM classifier. Fig. 7 is the flow chart of SVM classifier training and classification. The optimized features obtained from the test data through the above steps are fed into the trained SVM classifier for classification, and the classification results are the detection results. The training process is: the training data are first preprocessed and their features extracted; the optimized features output by the deep belief network model are then fed into the SVM classifier for training, and the classification performance is verified with 5-fold cross-validation. The SVM is implemented with LIBSVM, and the chosen kernel function is the RBF (Radial Basis Function) kernel.
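A minimal sketch of this classification stage under the stated setup (RBF kernel, 5-fold cross-validation). scikit-learn's SVC, which wraps the LIBSVM library named in the patent, is used here, and X_opt is a hypothetical name for the DBN-optimized feature matrix.

```python
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def train_and_validate(X_opt, y):
    """Train an RBF-kernel SVM on DBN-optimized features and report
    5-fold cross-validation accuracy.

    X_opt: (n_samples, n_features) optimized features (hypothetical name).
    y:     labels, e.g. 0 = control, 1 = AD patient.
    """
    clf = SVC(kernel="rbf")                          # RBF kernel, per the patent
    cv_acc = cross_val_score(clf, X_opt, y, cv=5)    # 5-fold cross-validation
    clf.fit(X_opt, y)                                # final model on all data
    return clf, cv_acc.mean()
```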
Although some embodiments of the present invention have been presented herein, those skilled in the art will appreciate that the embodiments herein may be changed without departing from the spirit of the invention. The above embodiments are merely exemplary and should not be taken as limiting the scope of the present invention.

Claims (8)

1. A method for screening Alzheimer's disease speech signals based on deep learning, characterized by comprising the steps of:
S1: training a deep belief network model for later use;
S2: having the subject perform different spoken-output tasks and collecting the subject's speech;
S3: preprocessing the collected speech;
S4: extracting pathological features related to Alzheimer's disease from the preprocessed speech and feeding them into the trained deep belief network model to obtain optimized features;
S5: feeding the optimized features into a trained SVM classifier for classification, the classification results being the screening results.
2. The method for screening Alzheimer's disease speech signals based on deep learning according to claim 1, characterized in that step S2 specifically comprises: measuring the on-site noise, eliminating noise sources, and performing speech collection once the noise meets requirements; and, during speech collection, having the subject perform different spoken-output tasks and labeling and organizing the speech.
3. The method for screening Alzheimer's disease speech signals based on deep learning according to claim 2, characterized in that step S2 specifically comprises: measuring the on-site noise, eliminating noise sources, and performing speech collection once the noise meets requirements; and, during speech collection, having the subject perform different spoken-output tasks, the spoken-output tasks including self-introduction, a verbal fluency test, picture description, and sustained vowel phonation, and labeling and organizing the speech.
4. The method for screening Alzheimer's disease speech signals based on deep learning according to claim 1, characterized in that step S3 specifically comprises: denoising, parameter normalization, pre-emphasis, windowing, and framing of the collected speech data to obtain a speech frame sequence.
5. The method for screening Alzheimer's disease speech signals based on deep learning according to claim 4, characterized in that step S3 specifically comprises: denoising, parameter normalization, pre-emphasis, windowing, and framing of the collected speech data to obtain a speech frame sequence, wherein the pre-emphasis, windowing, and framing are performed with openSMILE.
6. The method for screening Alzheimer's disease speech signals based on deep learning according to claim 1, characterized in that step S4 specifically comprises: extracting the pathological features of each speech frame in the speech frame sequence and computing their first-order and second-order differences to form new multidimensional pathological features; and feeding the multidimensional pathological features into the trained deep belief network model, which outputs the optimized features.
7. The method for screening Alzheimer's disease speech signals based on deep learning according to claim 6, characterized in that step S4 specifically comprises: extracting the pathological features of each speech frame in the speech frame sequence and computing their first-order and second-order differences to form new multidimensional pathological features, wherein the pathological features include: fundamental frequency, jitter, shimmer, harmonics-to-noise ratio, signal-to-noise ratio, short-time zero-crossing rate, short-time energy, formants, MFCC, LPC, speech pauses, and speech rate; and feeding the multidimensional pathological features into the trained deep belief network model, which outputs the optimized features.
8. The method for screening Alzheimer's disease speech signals based on deep learning according to claim 1, characterized in that step S5 specifically comprises: feeding the optimized features as input into the trained SVM classifier for classification, the classification results being the detection results, wherein the training process of the SVM classifier model is: preprocessing the data in the training set, extracting pathological features, feeding them into the deep belief network model to obtain optimized features, and feeding those optimized features into the SVM classifier for training to obtain the trained SVM classifier model.
CN201811464595.0A 2018-12-03 2018-12-03 Screening method for Alzheimer's disease speech signals based on deep learning Pending CN109584861A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811464595.0A CN109584861A (en) 2018-12-03 2018-12-03 Screening method for Alzheimer's disease speech signals based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201811464595.0A CN109584861A (en) 2018-12-03 2018-12-03 Screening method for Alzheimer's disease speech signals based on deep learning

Publications (1)

Publication Number Publication Date
CN109584861A true CN109584861A (en) 2019-04-05

Family

ID=65926673

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811464595.0A Pending CN109584861A (en) 2018-12-03 2018-12-03 Screening method for Alzheimer's disease speech signals based on deep learning

Country Status (1)

Country Link
CN (1) CN109584861A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111081229A (en) * 2019-12-23 2020-04-28 科大讯飞股份有限公司 Scoring method based on voice and related device
CN113440107A (en) * 2021-07-06 2021-09-28 浙江大学 Alzheimer's symptom diagnosis device based on voice signal analysis

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106725532A (en) * 2016-12-13 2017-05-31 兰州大学 Depression automatic evaluation system and method based on phonetic feature and machine learning
US9763617B2 (en) * 2011-08-02 2017-09-19 Massachusetts Institute Of Technology Phonologically-based biomarkers for major depressive disorder
CN107944360A (en) * 2017-11-13 2018-04-20 中国科学院深圳先进技术研究院 A kind of induced multi-potent stem cell recognition methods, system and electronic equipment
CN108198576A (en) * 2018-02-11 2018-06-22 华南理工大学 A kind of Alzheimer's disease prescreening method based on phonetic feature Non-negative Matrix Factorization
CN108597542A (en) * 2018-03-19 2018-09-28 华南理工大学 A kind of dysarthrosis severity method of estimation based on depth audio frequency characteristics
CN108877917A (en) * 2018-06-14 2018-11-23 杭州电子科技大学 The system and method for network remote monitoring Parkinson's disease severity

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9763617B2 (en) * 2011-08-02 2017-09-19 Massachusetts Institute Of Technology Phonologically-based biomarkers for major depressive disorder
CN106725532A (en) * 2016-12-13 2017-05-31 兰州大学 Depression automatic evaluation system and method based on phonetic feature and machine learning
CN107944360A (en) * 2017-11-13 2018-04-20 中国科学院深圳先进技术研究院 A kind of induced multi-potent stem cell recognition methods, system and electronic equipment
CN108198576A (en) * 2018-02-11 2018-06-22 华南理工大学 A kind of Alzheimer's disease prescreening method based on phonetic feature Non-negative Matrix Factorization
CN108597542A (en) * 2018-03-19 2018-09-28 华南理工大学 A kind of dysarthrosis severity method of estimation based on depth audio frequency characteristics
CN108877917A (en) * 2018-06-14 2018-11-23 杭州电子科技大学 The system and method for network remote monitoring Parkinson's disease severity

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111081229A (en) * 2019-12-23 2020-04-28 科大讯飞股份有限公司 Scoring method based on voice and related device
CN111081229B (en) * 2019-12-23 2022-06-07 科大讯飞股份有限公司 Scoring method based on voice and related device
CN113440107A (en) * 2021-07-06 2021-09-28 浙江大学 Alzheimer's symptom diagnosis device based on voice signal analysis

Similar Documents

Publication Publication Date Title
CN106725532B (en) Depression automatic evaluation system and method based on phonetic feature and machine learning
Godino-Llorente et al. Automatic detection of voice impairments by means of short-term cepstral parameters and neural network based detectors
Orozco et al. Detecting pathologies from infant cry applying scaled conjugate gradient neural networks
Deb et al. Multiscale amplitude feature and significance of enhanced vocal tract information for emotion classification
Wallen et al. A screening test for speech pathology assessment using objective quality measures
CN111951824A (en) Detection method for distinguishing depression based on sound
Chaki Pattern analysis based acoustic signal processing: a survey of the state-of-art
Hasan et al. Emotion recognition from bengali speech using rnn modulation-based categorization
Turan et al. Monitoring Infant's Emotional Cry in Domestic Environments Using the Capsule Network Architecture.
CN113257406A (en) Disaster rescue triage and auxiliary diagnosis method based on intelligent glasses
Warule et al. Significance of voiced and unvoiced speech segments for the detection of common cold
CN109584861A (en) The screening method of Alzheimer's disease voice signal based on deep learning
Sharma et al. Audio texture and age-wise analysis of disordered speech in children having specific language impairment
da Silva et al. Evaluation of a sliding window mechanism as DataAugmentation over emotion detection on speech
Nouhaila et al. An intelligent approach based on the combination of the discrete wavelet transform, delta delta MFCC for Parkinson's disease diagnosis
Wang et al. Continuous speech for improved learning pathological voice disorders
CN112466284B (en) Mask voice identification method
Wang et al. Unsupervised instance discriminative learning for depression detection from speech signals
Esmaili et al. An automatic prolongation detection approach in continuous speech with robustness against speaking rate variations
Wijesinghe et al. Machine learning based automated speech dialog analysis of autistic children
Deepa et al. Speech technology in healthcare
Marck et al. Identification, analysis and characterization of base units of bird vocal communication: The white spectacled bulbul (Pycnonotus xanthopygos) as a case study
Khanum et al. Speech based gender identification using feed forward neural networks
Sheikh et al. Advancing stuttering detection via data augmentation, class-balanced loss and multi-contextual deep learning
Yagnavajjula et al. Detection of neurogenic voice disorders using the fisher vector representation of cepstral features

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication

Application publication date: 20190405