CN103258545A - Pathological voice subdivision method - Google Patents

Pathological voice subdivision method

Info

Publication number
CN103258545A
CN103258545A CN2012105555873A CN201210555587A
Authority
CN
China
Prior art keywords
voice
feature
pathology
threshold value
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012105555873A
Other languages
Chinese (zh)
Inventor
陶智
周强
张晓俊
吴迪
肖仲喆
季晶晶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University
Original Assignee
Suzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou University filed Critical Suzhou University
Priority to CN2012105555873A priority Critical patent/CN103258545A/en
Publication of CN103258545A publication Critical patent/CN103258545A/en
Pending legal-status Critical Current

Abstract

The invention discloses a pathological voice subdivision method comprising a model training module and an identification module. The model training module models the input voice signals, obtains the corresponding likelihoods, calculates and compares the matching probabilities, and selects the voice signals that satisfy the training conditions. The identification module then matches these voice signals against the trained models. The method places no requirement on the length of the input voice signals, accepts characteristic parameters of any type and any dimension, and assigns different weights to different features, so that the advantages of all parameters are exploited. Training can be carried out repeatedly, with retraining applied to voice signals that are difficult to identify, and the thresholds, termination conditions, and identification conditions can be set flexibly. With this method, the categories of pathological voice can be defined automatically and subdivided precisely, enabling pre-diagnosis of voice diseases and timely tracking of a patient's recovery; the method is also suitable for voice-health self-checks by teachers, singers, and the like.

Description

Pathological voice subdivision method
Technical field
The invention belongs to the field of voice signal analysis and specifically relates to a pathological voice subdivision method.
Background art
Surveys of voice health show that at least 100 million people in China suffer from various voice diseases, which involve many causes such as physiology and working environment and are mainly due to functional or organic lesions of the vocal organs that impair phonation. In the early days, detection of pathological voice relied mainly on subjective judgment by medical experts, which has a relatively high error rate. Instrumental examination has the drawbacks that transient events are difficult to capture by eye and that it inconveniences the patient, which can lead to inaccurate diagnosis. With the development of pattern recognition, convenient and non-invasive automatic detection methods have become a research focus.
With the rise of automatic pathological voice recognition, many characteristic parameters have been proposed. They can be roughly divided into four classes: (1) time-domain parameters and their statistics, such as fundamental frequency, frequency perturbation (jitter), and amplitude perturbation (shimmer); (2) transform-domain parameters, such as LPCC and MFCC; (3) noise parameters, such as the harmonics-to-noise ratio and the glottal noise energy; (4) nonlinear parameters, such as the largest Lyapunov exponent and the correlation dimension. Because there are many kinds of pathological voices, each parameter influences the subdivision of pathological voices to a different degree. A feature-extraction sketch covering the first two classes is given below.
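As a concrete illustration of the first two parameter classes, the following Python sketch extracts the fundamental frequency, a simple jitter measure, and MFCCs from a sustained-vowel recording. It is a minimal example under stated assumptions (the librosa library, a hypothetical file path, and an illustrative jitter formula), not part of the claimed method.

```python
import numpy as np
import librosa  # assumed third-party dependency for audio analysis

def extract_example_features(wav_path):
    """Extract example time-domain and transform-domain parameters
    from a sustained-vowel recording (wav_path is a hypothetical file)."""
    y, sr = librosa.load(wav_path, sr=None)

    # Time-domain class: fundamental frequency track and jitter, here
    # approximated as the mean relative change of successive pitch periods.
    f0, voiced_flag, _ = librosa.pyin(y, fmin=60, fmax=500, sr=sr)
    f0 = f0[~np.isnan(f0)]                      # keep voiced frames only
    periods = 1.0 / f0
    jitter = np.mean(np.abs(np.diff(periods))) / np.mean(periods)

    # Transform-domain class: mel-frequency cepstral coefficients.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # shape (13, n_frames)

    return {"f0_mean": float(np.mean(f0)),
            "jitter": float(jitter),
            "mfcc": mfcc.T}                     # frames x 13, ready for GMM training
```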
A typical recognition method is divided into a training part and an identification part. Training first extracts features from the voice signal and then trains a model on them. The identification part likewise begins with feature extraction, matches the features against the trained model to obtain a matching score, compares the score with a preset threshold, and finally outputs the decision. Current methods have two problems: first, they do not subdivide pathological voices precisely but only distinguish normal from pathological voice; second, they do not take full advantage of each feature, and in particular they ignore the complementarity between features.
  
Summary of the invention
To address the above problems, the purpose of the invention is to provide a pathological voice subdivision method that takes full advantage of all kinds of parameters and achieves the subdivision of pathological voices.
To achieve the above technical purpose and attain the above technical effect, the present invention is realized through the following technical solution:
A pathological voice subdivision method comprises a model training module and an identification module, the model training module comprising the following steps:
Step 1) extract various characteristic parameters from each category of pathological voice;
Step 2) train a GMM on each characteristic parameter of each voice category to obtain the trained GMM matrix;
Step 3) feed the characteristic parameters extracted in step 1) into the trained GMMs to obtain the corresponding likelihoods;
Step 4) from the likelihoods of step 3), calculate the matching probability of each category of pathological voice;
Step 5) compute the weighted sum of the per-feature matching probabilities of step 4) to obtain the total matching degree Match;
Step 6) compare the Match obtained in step 5) with a threshold: when it exceeds the threshold, calculate the contribution rate of each feature; when it is below the threshold, count the signal and send it to the termination check;
Step 7) allocate weights according to the contribution rates calculated in step 6), the feature weights summing to 1 (see the sketch following these steps);
Step 8) return to step 1) any voice signal whose Match in step 5) is below the threshold and which does not satisfy the termination condition; when the voice signals satisfy the termination condition, training ends.
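For concreteness, the short Python sketch below shows how the weighted sum of step 5) and the weight reallocation of step 7) might be computed. Here the contribution rate of a feature is simply proxied by its matching probability, which is an assumption, since the specification does not fix a particular formula; the numeric values are illustrative.

```python
import numpy as np

def total_match(feature_probs, weights):
    """Step 5): weighted sum of the per-feature matching probabilities."""
    return float(np.dot(weights, feature_probs))

def reallocate_weights(contributions):
    """Step 7): turn per-feature contribution rates into weights that sum to 1."""
    c = np.asarray(contributions, dtype=float)
    return c / c.sum()

# Illustration with three features and equal initial weights.
probs = np.array([0.70, 0.55, 0.80])     # per-feature matching probabilities (step 4)
weights = np.ones(3) / 3
match = total_match(probs, weights)       # step 5)
threshold = 0.6                           # illustrative threshold value
if match > threshold:                     # step 6): recompute contribution-based weights
    weights = reallocate_weights(probs)   # contribution rate proxied by the probability
```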
The identification module comprises the following steps:
Step 9) perform feature extraction on the voice signals that satisfy the termination condition in step 8);
Step 10) load a trained recognition model;
Step 11) match the features extracted in step 9) against the corresponding trained model;
Step 12) if the matching result satisfies the preset condition, identification ends; otherwise load the next trained model and return to step 11).
The beneficial effects of the invention are as follows:
1. The invention places no requirement on the length of the input voice signal; the characteristic parameters may be of any type and of any dimension, and different features are assigned different weights, so that the advantages of each parameter are fully exploited;
2. The invention supports repeated training: voice signals that are difficult to identify are retrained, and the thresholds, the termination conditions, and the identification conditions can be set flexibly;
3. The invention can define the categories of pathological voice independently and subdivide them precisely, enabling pre-diagnosis of voice diseases and timely tracking of a patient's recovery; it is also suitable for voice-health self-checks by teachers, singers, and the like.
The above description is only an overview of the technical solution of the present invention. So that the technical means of the present invention may be understood more clearly and implemented according to the contents of the specification, preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Description of drawings
Fig. 1 is the modeling flowchart of the present invention;
Fig. 2 is the matching and screening flowchart of the present invention;
Fig. 3 is a workflow diagram of the identification module of the present invention;
Fig. 4 is another workflow diagram of the identification module of the present invention.
Embodiment
The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Referring to Fig. 1 and Fig. 2, a pathological voice subdivision method comprises a model training module and an identification module, the model training module comprising the following steps:
Step 1) extract M kinds of characteristic parameters from each voice sample of the N classes of pathological voice;
Step 2) perform GMM training on the characteristic parameters of step 1); every feature of every voice class yields one trained model, giving N*M GMMs in total;
Step 3) feed each feature of a voice signal into the GMM of the corresponding feature to obtain its likelihood;
Step 4) from the likelihoods of step 3), compute the probability that the voice corresponds to each voice class;
Step 5) compute the weighted sum of the probabilities of every feature to characterize the matching degree Match of the voice signal to each class;
Step 6) compare the Match obtained in step 5) with a threshold: if it exceeds the threshold, calculate the contribution rate of each feature, which is used to adjust the weighting coefficients of the probabilities; if it is below the threshold, count the signal and send it to the termination check;
Step 7) if a voice sent to the termination check in step 6) does not satisfy the termination condition, return it to step 2) for renewed GMM training; otherwise training ends and the trained voice models are obtained (a training sketch is given below).
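The following Python sketch is one possible reading of this training stage: one Gaussian mixture model is fitted per (class, feature) pair, giving the N*M model matrix of step 2), and the per-feature likelihoods are normalized into per-class probabilities as in steps 3) and 4). It assumes scikit-learn, features already extracted as frame-by-dimension arrays, and illustrative hyperparameters (8 mixture components, diagonal covariances); it is not the inventors' exact implementation.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def train_gmm_matrix(data, n_components=8):
    """data[c][m] is a (frames x dims) array of feature m for pathology class c.
    Returns the N x M matrix (bank) of trained GMMs of step 2)."""
    bank = {}
    for c, feature_sets in data.items():
        for m, X in feature_sets.items():
            gmm = GaussianMixture(n_components=n_components,
                                  covariance_type="diag",
                                  max_iter=200, random_state=0)
            bank[(c, m)] = gmm.fit(X)
    return bank

def class_probabilities(bank, features, classes):
    """Steps 3)-4): per-feature likelihoods from the corresponding GMMs,
    normalized into per-class matching probabilities for each feature."""
    probs = {}
    for m, X in features.items():
        loglik = np.array([bank[(c, m)].score(X) for c in classes])  # mean log-likelihood
        p = np.exp(loglik - loglik.max())
        probs[m] = p / p.sum()
    return probs
```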
Identification module embodiment one:
Referring to Fig. 3, the identification module comprises the following steps:
Step 8) extract the M kinds of characteristic parameters from the input voice signal;
Step 9) load the trained GMM matrix;
Step 10) feed the characteristic parameters extracted in step 8) into the trained GMM matrix to obtain the likelihood of every feature against every voice class;
Step 11) from the likelihoods of step 10), compute the probability that the voice belongs to each class of pathological voice;
Step 12) each feature casts one vote for the pathological voice class with the highest probability;
Step 13) combine all the features and tally the total votes. If this is the last model, the voice to be identified belongs to the class with the most votes and identification ends. If it is not the last model and the vote count exceeds a preset threshold, the voice to be identified belongs to the class with the most votes and identification ends; if the count is below the threshold, load the next GMM matrix and return to step 10). A sketch of this voting scheme follows.
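A possible implementation of the voting scheme is sketched below. It reuses the hypothetical class_probabilities helper from the training sketch to supply one probability dictionary per model matrix; the threshold handling and the fallback to the next matrix follow step 13), while names such as prob_banks are illustrative assumptions.

```python
import numpy as np

def identify_by_voting(prob_banks, classes, vote_threshold):
    """prob_banks: list of dicts {feature: per-class probability array}, one per
    trained GMM matrix. Each feature votes for its most probable class (step 12);
    the votes are tallied and checked against the threshold (step 13)."""
    for i, probs in enumerate(prob_banks):
        votes = np.zeros(len(classes), dtype=int)
        for p in probs.values():
            votes[int(np.argmax(p))] += 1        # one vote per feature
        best = int(np.argmax(votes))
        last_model = (i == len(prob_banks) - 1)
        if last_model or votes[best] > vote_threshold:
            return classes[best]                 # accepted pathological voice class
        # otherwise load the next trained GMM matrix and repeat (back to step 10)
    return None
```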
Identification module embodiment two:
Referring to Fig. 4, the identification module comprises the following steps:
Step 8) extract the M kinds of characteristic parameters from the input voice signal;
Step 9) load the trained GMM matrix;
Step 10) feed the characteristic parameters extracted in step 8) into the trained GMM matrix to obtain the likelihood of every feature against every voice class;
Step 11) from the likelihoods of step 10), compute the probability that the voice belongs to each class of pathological voice;
Step 12) compute the weighted sum of the probabilities of all the features as the total matching degree Match;
Step 13) choose the maximum Match among the Match values of all the pathological voice classes. If this is the last model, the voice to be identified belongs to the class with the maximum Match and identification ends; if it is not the last model and the maximum Match exceeds a preset threshold, the voice to be identified belongs to that class and identification ends; if it is below the threshold, load the next GMM matrix and return to step 10). A sketch of this weighted-matching scheme follows.
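The sketch below mirrors this second embodiment: for every model matrix the per-feature probabilities are combined by a weighted sum into one Match value per class, the maximum is compared with the threshold, and the next matrix is tried if the score is too low. The weights, the threshold, and the prob_banks structure are assumptions carried over from the earlier sketches.

```python
import numpy as np

def identify_by_match(prob_banks, classes, weights, match_threshold):
    """prob_banks: list of dicts {feature: per-class probability array}, one per
    trained GMM matrix. Steps 12)-13): weighted sum over features gives a Match
    value per class; the class with the largest Match is accepted if allowed."""
    w = np.asarray(weights, dtype=float)
    for i, probs in enumerate(prob_banks):
        P = np.vstack(list(probs.values()))      # features x classes
        match = w @ P                            # total Match per class (step 12)
        best = int(np.argmax(match))
        last_model = (i == len(prob_banks) - 1)
        if last_model or match[best] > match_threshold:
            return classes[best]
        # below threshold: load the next trained GMM matrix and repeat (step 10)
    return None
```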
The above is only a preferred embodiment of the invention and is not intended to limit the present invention; for those skilled in the art, the present invention may have various changes and variations. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (1)

1. A pathological voice subdivision method, comprising a model training module and an identification module, characterized in that the model training module comprises the following steps:
Step 1) extract various characteristic parameters from each category of pathological voice;
Step 2) train a GMM on each characteristic parameter of each voice category to obtain the trained GMM matrix;
Step 3) feed the characteristic parameters extracted in step 1) into the trained GMMs to obtain the corresponding likelihoods;
Step 4) from the likelihoods of step 3), calculate the matching probability of each category of pathological voice;
Step 5) compute the weighted sum of the per-feature matching probabilities of step 4) to obtain the total matching degree Match;
Step 6) compare the Match obtained in step 5) with a threshold: when it exceeds the threshold, calculate the contribution rate of each feature; when it is below the threshold, count the signal and send it to the termination check;
Step 7) allocate weights according to the contribution rates calculated in step 6), the feature weights summing to 1;
Step 8) return to step 1) any voice signal whose Match in step 5) is below the threshold and which does not satisfy the termination condition; when the voice signals satisfy the termination condition, training ends;
and the identification module comprises the following steps:
Step 9) perform feature extraction on the voice signals that satisfy the termination condition in step 8);
Step 10) load a trained recognition model;
Step 11) match the features extracted in step 9) against the corresponding trained model;
Step 12) if the matching result satisfies the preset condition, identification ends; otherwise load the next trained model and return to step 11).
CN2012105555873A 2012-12-20 2012-12-20 Pathological voice subdivision method Pending CN103258545A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012105555873A CN103258545A (en) 2012-12-20 2012-12-20 Pathological voice subdivision method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2012105555873A CN103258545A (en) 2012-12-20 2012-12-20 Pathological voice subdivision method

Publications (1)

Publication Number Publication Date
CN103258545A true CN103258545A (en) 2013-08-21

Family

ID=48962415

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012105555873A Pending CN103258545A (en) 2012-12-20 2012-12-20 Pathological voice subdivision method

Country Status (1)

Country Link
CN (1) CN103258545A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN2045307U (en) * 1988-10-05 1989-10-04 马洪恩 Electronic laryngopathy treatment apparatus
CN1875877A (en) * 2006-05-15 2006-12-13 西安交通大学 A method for obtaining subglottic pressure value and calculating phonation efficiency
CN101452698A (en) * 2007-11-29 2009-06-10 中国科学院声学研究所 Voice HNR automatic analytical method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
于燕平: "Research on feature extraction and recognition of pathological voice based on wavelet transform and GMM", Master's thesis, Guangxi Normal University, 31 December 2008 (2008-12-31) *
高俊芬 et al.: "Recognition and study of pathological voice based on nonlinear dynamics and GMM", Journal of Guangxi Normal University (Natural Science Edition), 31 December 2011 (2011-12-31) *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103730130A (en) * 2013-12-20 2014-04-16 中国科学院深圳先进技术研究院 Detection method and system for pathological voice
CN103730130B (en) * 2013-12-20 2019-03-01 中国科学院深圳先进技术研究院 A kind of detection system of pathological voice
CN103778913A (en) * 2014-01-22 2014-05-07 苏州大学 Pathologic voice recognizing method
CN106297768A (en) * 2015-05-11 2017-01-04 苏州大学 A kind of audio recognition method
CN106297768B (en) * 2015-05-11 2020-01-17 苏州大学 Speech recognition method
CN108601567A (en) * 2016-02-09 2018-09-28 Pst株式会社 Estimation method, estimating program, estimating unit and hypothetical system
CN108601567B (en) * 2016-02-09 2021-06-11 Pst株式会社 Estimation method, estimation program, estimation device, and estimation system
CN108269590A (en) * 2018-01-17 2018-07-10 广州势必可赢网络科技有限公司 A kind of vocal cords restore methods of marking and device
CN109192226A (en) * 2018-06-26 2019-01-11 深圳大学 A kind of signal processing method and device
CN109036469A (en) * 2018-07-17 2018-12-18 西安交通大学 A kind of autonomic nervous function parameter acquiring method based on sound characteristic
CN111554325A (en) * 2020-05-09 2020-08-18 陕西师范大学 Voice recognition method and system
CN111554325B (en) * 2020-05-09 2023-03-24 陕西师范大学 Voice recognition method and system

Similar Documents

Publication Publication Date Title
CN103258545A (en) Pathological voice subdivision method
CN106878677A (en) Student classroom Grasping level assessment system and method based on multisensor
CN109620152B (en) MutifacolLoss-densenert-based electrocardiosignal classification method
CN108198620A (en) A kind of skin disease intelligent auxiliary diagnosis system based on deep learning
CN109044396B (en) Intelligent heart sound identification method based on bidirectional long-time and short-time memory neural network
CN108899049A (en) A kind of speech-emotion recognition method and system based on convolutional neural networks
Muhammad et al. Convergence of artificial intelligence and internet of things in smart healthcare: a case study of voice pathology detection
CN112006697B (en) Voice signal-based gradient lifting decision tree depression degree recognition system
CN103996155A (en) Intelligent interaction and psychological comfort robot service system
CN107066514A (en) The Emotion identification method and system of the elderly
CN107609736A (en) A kind of teaching diagnostic analysis system and method for integrated application artificial intelligence technology
CN106941005A (en) A kind of vocal cords method for detecting abnormality based on speech acoustics feature
WO2019242155A1 (en) Voice recognition-based health management method and apparatus, and computer device
CN105448291A (en) Parkinsonism detection method and detection system based on voice
CN111920420B (en) Patient behavior multi-modal analysis and prediction system based on statistical learning
CN102623009A (en) Abnormal emotion automatic detection and extraction method and system on basis of short-time analysis
CN107274888A (en) A kind of Emotional speech recognition method based on octave signal intensity and differentiation character subset
CN109727608A (en) A kind of ill voice appraisal procedure based on Chinese speech
CN109272986A (en) A kind of dog sound sensibility classification method based on artificial neural network
CN103578480B (en) The speech-emotion recognition method based on context correction during negative emotions detects
CN110705523B (en) Entrepreneur performance evaluation method and system based on neural network
CN111489736A (en) Automatic seat speech technology scoring device and method
CN116844080A (en) Fatigue degree multi-mode fusion detection method, electronic equipment and storage medium
CN109272262A (en) A kind of analysis method of natural language feature
Radha et al. Automated detection and severity assessment of dysarthria using raw speech

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20130821