CN103258545A - Pathological voice subdivision method - Google Patents

Pathological voice subdivision method

Info

Publication number
CN103258545A
CN103258545A CN2012105555873A CN201210555587A
Authority
CN
China
Prior art keywords
voice
feature
pathology
threshold value
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN2012105555873A
Other languages
Chinese (zh)
Inventor
陶智
周强
张晓俊
吴迪
肖仲喆
季晶晶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Suzhou University
Original Assignee
Suzhou University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Suzhou University filed Critical Suzhou University
Priority to CN2012105555873A priority Critical patent/CN103258545A/en
Publication of CN103258545A publication Critical patent/CN103258545A/en
Pending legal-status Critical Current

Abstract

The invention discloses a pathological voice subdivision method comprising a model training module and an identification module. The model training module models the input voice signals, obtains the corresponding likelihoods, calculates and compares the matching probabilities, and selects the voice signals that satisfy the training conditions. The identification module then matches these voice signals against the trained models. The method places no requirement on the length of the input voice signals, accepts characteristic parameters of any type and any dimension, and assigns different weights to different features, so that the advantages of all parameters are exploited. Training can be carried out repeatedly, with retraining applied to voice signals that are difficult to identify, and the thresholds, termination conditions, and identification conditions can be set flexibly. With this method, the categories of pathological voice can be defined automatically and subdivided precisely, enabling pre-diagnosis of voice diseases and timely tracking of a patient's recovery; the method is also suitable for voice-health self-checks by teachers, singers, and the like.

Description

Pathological voice subdivision method
Technical field
The invention belongs to the field of voice signal analysis and specifically relates to a pathological voice subdivision method.
Background art
Surveys of voice health show that at least 100 million people in China suffer from various voice diseases, which involve many causes such as physiology and working environment and are mainly due to functional or organic lesions of the vocal organs that impair phonation. In the early days, detection of pathological voice relied mainly on subjective judgment by medical experts, which has a relatively high error rate. Instrumental examination has the drawbacks that transient events are difficult to capture by eye and that it inconveniences the patient, which can lead to inaccurate diagnosis. With the development of pattern recognition, convenient and non-invasive automatic detection methods have become a research focus.
With the rise of automatic pathological voice recognition, many characteristic parameters have been proposed. They can be roughly divided into four classes: (1) time-domain parameters and their statistics, such as fundamental frequency, frequency perturbation (jitter), and amplitude perturbation (shimmer); (2) transform-domain parameters, such as LPCC and MFCC; (3) noise parameters, such as the harmonics-to-noise ratio and the glottal noise energy; (4) nonlinear parameters, such as the largest Lyapunov exponent and the correlation dimension. Because there are many kinds of pathological voices, each parameter influences the subdivision of pathological voices to a different degree. A feature-extraction sketch covering the first two classes is given below.
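As a concrete illustration of the first two parameter classes, the following Python sketch extracts the fundamental frequency, a simple jitter measure, and MFCCs from a sustained-vowel recording. It is a minimal example under stated assumptions (the librosa library, a hypothetical file path, and an illustrative jitter formula), not part of the claimed method.

```python
import numpy as np
import librosa  # assumed third-party dependency for audio analysis

def extract_example_features(wav_path):
    """Extract example time-domain and transform-domain parameters
    from a sustained-vowel recording (wav_path is a hypothetical file)."""
    y, sr = librosa.load(wav_path, sr=None)

    # Time-domain class: fundamental frequency track and jitter, here
    # approximated as the mean relative change of successive pitch periods.
    f0, voiced_flag, _ = librosa.pyin(y, fmin=60, fmax=500, sr=sr)
    f0 = f0[~np.isnan(f0)]                      # keep voiced frames only
    periods = 1.0 / f0
    jitter = np.mean(np.abs(np.diff(periods))) / np.mean(periods)

    # Transform-domain class: mel-frequency cepstral coefficients.
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=13)   # shape (13, n_frames)

    return {"f0_mean": float(np.mean(f0)),
            "jitter": float(jitter),
            "mfcc": mfcc.T}                     # frames x 13, ready for GMM training
```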
A typical recognition method is divided into a training part and an identification part. Training first extracts features from the voice signal and then trains a model on them. The identification part likewise begins with feature extraction, matches the features against the trained model to obtain a matching score, compares the score with a preset threshold, and finally outputs the decision. Current methods have two problems: first, they do not subdivide pathological voices precisely but only distinguish normal from pathological voice; second, they do not take full advantage of each feature, and in particular they ignore the complementarity between features.
  
Summary of the invention
To address the above problems, the purpose of the invention is to provide a pathological voice subdivision method that takes full advantage of all kinds of parameters and achieves the subdivision of pathological voices.
To achieve the above technical purpose and attain the above technical effect, the present invention is realized through the following technical solution:
A pathological voice subdivision method comprises a model training module and an identification module, the model training module comprising the following steps:
Step 1) extract various characteristic parameters from each category of pathological voice;
Step 2) train a GMM on each characteristic parameter of each voice category to obtain the trained GMM matrix;
Step 3) feed the characteristic parameters extracted in step 1) into the trained GMMs to obtain the corresponding likelihoods;
Step 4) from the likelihoods of step 3), calculate the matching probability of each category of pathological voice;
Step 5) compute the weighted sum of the per-feature matching probabilities of step 4) to obtain the total matching degree Match;
Step 6) compare the Match obtained in step 5) with a threshold: when it exceeds the threshold, calculate the contribution rate of each feature; when it is below the threshold, count the signal and send it to the termination check;
Step 7) allocate weights according to the contribution rates calculated in step 6), the feature weights summing to 1 (see the sketch following these steps);
Step 8) return to step 1) any voice signal whose Match in step 5) is below the threshold and which does not satisfy the termination condition; when the voice signals satisfy the termination condition, training ends.
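For concreteness, the short Python sketch below shows how the weighted sum of step 5) and the weight reallocation of step 7) might be computed. Here the contribution rate of a feature is simply proxied by its matching probability, which is an assumption, since the specification does not fix a particular formula; the numeric values are illustrative.

```python
import numpy as np

def total_match(feature_probs, weights):
    """Step 5): weighted sum of the per-feature matching probabilities."""
    return float(np.dot(weights, feature_probs))

def reallocate_weights(contributions):
    """Step 7): turn per-feature contribution rates into weights that sum to 1."""
    c = np.asarray(contributions, dtype=float)
    return c / c.sum()

# Illustration with three features and equal initial weights.
probs = np.array([0.70, 0.55, 0.80])     # per-feature matching probabilities (step 4)
weights = np.ones(3) / 3
match = total_match(probs, weights)       # step 5)
threshold = 0.6                           # illustrative threshold value
if match > threshold:                     # step 6): recompute contribution-based weights
    weights = reallocate_weights(probs)   # contribution rate proxied by the probability
```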
The identification module comprises the following steps:
Step 9) perform feature extraction on the voice signals that satisfy the termination condition in step 8);
Step 10) load a trained recognition model;
Step 11) match the features extracted in step 9) against the corresponding trained model;
Step 12) if the matching result satisfies the preset condition, identification ends; otherwise load the next trained model and return to step 11).
The beneficial effects of the invention are as follows:
1. The invention places no requirement on the length of the input voice signal; the characteristic parameters may be of any type and of any dimension, and different features are assigned different weights, so that the advantages of each parameter are fully exploited;
2. The invention supports repeated training: voice signals that are difficult to identify are retrained, and the thresholds, the termination conditions, and the identification conditions can be set flexibly;
3. The invention can define the categories of pathological voice independently and subdivide them precisely, enabling pre-diagnosis of voice diseases and timely tracking of a patient's recovery; it is also suitable for voice-health self-checks by teachers, singers, and the like.
The above description is only an overview of the technical solution of the present invention. So that the technical means of the present invention may be understood more clearly and implemented according to the contents of the specification, preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.
Description of drawings
Fig. 1 is the modeling flowchart of the present invention;
Fig. 2 is the matching and screening flowchart of the present invention;
Fig. 3 is a workflow diagram of the identification module of the present invention;
Fig. 4 is another workflow diagram of the identification module of the present invention.
Embodiment
The present invention is described in detail below with reference to the accompanying drawings and in conjunction with the embodiments.
Referring to Fig. 1 and Fig. 2, a pathological voice subdivision method comprises a model training module and an identification module, the model training module comprising the following steps:
Step 1) extract M kinds of characteristic parameters from each voice sample of the N classes of pathological voice;
Step 2) perform GMM training on the characteristic parameters of step 1); every feature of every voice class yields one trained model, giving N*M GMMs in total;
Step 3) feed each feature of a voice signal into the GMM of the corresponding feature to obtain its likelihood;
Step 4) from the likelihoods of step 3), compute the probability that the voice corresponds to each voice class;
Step 5) compute the weighted sum of the probabilities of every feature to characterize the matching degree Match of the voice signal to each class;
Step 6) compare the Match obtained in step 5) with a threshold: if it exceeds the threshold, calculate the contribution rate of each feature, which is used to adjust the weighting coefficients of the probabilities; if it is below the threshold, count the signal and send it to the termination check;
Step 7) if a voice sent to the termination check in step 6) does not satisfy the termination condition, return it to step 2) for renewed GMM training; otherwise training ends and the trained voice models are obtained (a training sketch is given below).
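The following Python sketch is one possible reading of this training stage: one Gaussian mixture model is fitted per (class, feature) pair, giving the N*M model matrix of step 2), and the per-feature likelihoods are normalized into per-class probabilities as in steps 3) and 4). It assumes scikit-learn, features already extracted as frame-by-dimension arrays, and illustrative hyperparameters (8 mixture components, diagonal covariances); it is not the inventors' exact implementation.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

def train_gmm_matrix(data, n_components=8):
    """data[c][m] is a (frames x dims) array of feature m for pathology class c.
    Returns the N x M matrix (bank) of trained GMMs of step 2)."""
    bank = {}
    for c, feature_sets in data.items():
        for m, X in feature_sets.items():
            gmm = GaussianMixture(n_components=n_components,
                                  covariance_type="diag",
                                  max_iter=200, random_state=0)
            bank[(c, m)] = gmm.fit(X)
    return bank

def class_probabilities(bank, features, classes):
    """Steps 3)-4): per-feature likelihoods from the corresponding GMMs,
    normalized into per-class matching probabilities for each feature."""
    probs = {}
    for m, X in features.items():
        loglik = np.array([bank[(c, m)].score(X) for c in classes])  # mean log-likelihood
        p = np.exp(loglik - loglik.max())
        probs[m] = p / p.sum()
    return probs
```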
Identification module embodiment one:
Referring to Fig. 3, the identification module comprises the following steps:
Step 8) extract the M kinds of characteristic parameters from the input voice signal;
Step 9) load the trained GMM matrix;
Step 10) feed the characteristic parameters extracted in step 8) into the trained GMM matrix to obtain the likelihood of every feature against every voice class;
Step 11) from the likelihoods of step 10), compute the probability that the voice belongs to each class of pathological voice;
Step 12) each feature casts one vote for the pathological voice class with the highest probability;
Step 13) combine all the features and tally the total votes. If this is the last model, the voice to be identified belongs to the class with the most votes and identification ends. If it is not the last model and the vote count exceeds a preset threshold, the voice to be identified belongs to the class with the most votes and identification ends; if the count is below the threshold, load the next GMM matrix and return to step 10). A sketch of this voting scheme follows.
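A possible implementation of the voting scheme is sketched below. It reuses the hypothetical class_probabilities helper from the training sketch to supply one probability dictionary per model matrix; the threshold handling and the fallback to the next matrix follow step 13), while names such as prob_banks are illustrative assumptions.

```python
import numpy as np

def identify_by_voting(prob_banks, classes, vote_threshold):
    """prob_banks: list of dicts {feature: per-class probability array}, one per
    trained GMM matrix. Each feature votes for its most probable class (step 12);
    the votes are tallied and checked against the threshold (step 13)."""
    for i, probs in enumerate(prob_banks):
        votes = np.zeros(len(classes), dtype=int)
        for p in probs.values():
            votes[int(np.argmax(p))] += 1        # one vote per feature
        best = int(np.argmax(votes))
        last_model = (i == len(prob_banks) - 1)
        if last_model or votes[best] > vote_threshold:
            return classes[best]                 # accepted pathological voice class
        # otherwise load the next trained GMM matrix and repeat (back to step 10)
    return None
```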
Identification module embodiment two:
Referring to Fig. 4, the identification module comprises the following steps:
Step 8) extract the M kinds of characteristic parameters from the input voice signal;
Step 9) load the trained GMM matrix;
Step 10) feed the characteristic parameters extracted in step 8) into the trained GMM matrix to obtain the likelihood of every feature against every voice class;
Step 11) from the likelihoods of step 10), compute the probability that the voice belongs to each class of pathological voice;
Step 12) compute the weighted sum of the probabilities of all the features as the total matching degree Match;
Step 13) choose the maximum Match among the Match values of all the pathological voice classes. If this is the last model, the voice to be identified belongs to the class with the maximum Match and identification ends; if it is not the last model and the maximum Match exceeds a preset threshold, the voice to be identified belongs to that class and identification ends; if it is below the threshold, load the next GMM matrix and return to step 10). A sketch of this weighted-matching scheme follows.
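The sketch below mirrors this second embodiment: for every model matrix the per-feature probabilities are combined by a weighted sum into one Match value per class, the maximum is compared with the threshold, and the next matrix is tried if the score is too low. The weights, the threshold, and the prob_banks structure are assumptions carried over from the earlier sketches.

```python
import numpy as np

def identify_by_match(prob_banks, classes, weights, match_threshold):
    """prob_banks: list of dicts {feature: per-class probability array}, one per
    trained GMM matrix. Steps 12)-13): weighted sum over features gives a Match
    value per class; the class with the largest Match is accepted if allowed."""
    w = np.asarray(weights, dtype=float)
    for i, probs in enumerate(prob_banks):
        P = np.vstack(list(probs.values()))      # features x classes
        match = w @ P                            # total Match per class (step 12)
        best = int(np.argmax(match))
        last_model = (i == len(prob_banks) - 1)
        if last_model or match[best] > match_threshold:
            return classes[best]
        # below threshold: load the next trained GMM matrix and repeat (step 10)
    return None
```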
The above is only a preferred embodiment of the invention and is not intended to limit the present invention; for those skilled in the art, the present invention may have various changes and variations. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims (1)

1. A pathological voice subdivision method, comprising a model training module and an identification module, characterized in that the model training module comprises the following steps:
Step 1) extract various characteristic parameters from each category of pathological voice;
Step 2) train a GMM on each characteristic parameter of each voice category to obtain the trained GMM matrix;
Step 3) feed the characteristic parameters extracted in step 1) into the trained GMMs to obtain the corresponding likelihoods;
Step 4) from the likelihoods of step 3), calculate the matching probability of each category of pathological voice;
Step 5) compute the weighted sum of the per-feature matching probabilities of step 4) to obtain the total matching degree Match;
Step 6) compare the Match obtained in step 5) with a threshold: when it exceeds the threshold, calculate the contribution rate of each feature; when it is below the threshold, count the signal and send it to the termination check;
Step 7) allocate weights according to the contribution rates calculated in step 6), the feature weights summing to 1;
Step 8) return to step 1) any voice signal whose Match in step 5) is below the threshold and which does not satisfy the termination condition; when the voice signals satisfy the termination condition, training ends;
and the identification module comprises the following steps:
Step 9) perform feature extraction on the voice signals that satisfy the termination condition in step 8);
Step 10) load a trained recognition model;
Step 11) match the features extracted in step 9) against the corresponding trained model;
Step 12) if the matching result satisfies the preset condition, identification ends; otherwise load the next trained model and return to step 11).
CN2012105555873A 2012-12-20 2012-12-20 Pathological voice subdivision method Pending CN103258545A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2012105555873A CN103258545A (en) 2012-12-20 2012-12-20 Pathological voice subdivision method

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN2012105555873A CN103258545A (en) 2012-12-20 2012-12-20 Pathological voice subdivision method

Publications (1)

Publication Number Publication Date
CN103258545A true CN103258545A (en) 2013-08-21

Family

ID=48962415

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2012105555873A Pending CN103258545A (en) 2012-12-20 2012-12-20 Pathological voice subdivision method

Country Status (1)

Country Link
CN (1) CN103258545A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN2045307U (en) * 1988-10-05 1989-10-04 马洪恩 Electronic laryngopathy treatment apparatus
CN1875877A (en) * 2006-05-15 2006-12-13 西安交通大学 A method for obtaining subglottic pressure value and calculating phonation efficiency
CN101452698A (en) * 2007-11-29 2009-06-10 中国科学院声学研究所 Voice HNR automatic analytical method

Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
于燕平: "Research on feature extraction and recognition of pathological voice based on wavelet transform and GMM", Master's thesis, Guangxi Normal University, 31 December 2008 (2008-12-31) *
高俊芬 et al.: "Recognition and study of pathological voice based on nonlinear dynamics and GMM", Journal of Guangxi Normal University (Natural Science Edition), 31 December 2011 (2011-12-31) *

Cited By (12)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103730130A (en) * 2013-12-20 2014-04-16 中国科学院深圳先进技术研究院 Detection method and system for pathological voice
CN103730130B (en) * 2013-12-20 2019-03-01 中国科学院深圳先进技术研究院 A kind of detection system of pathological voice
CN103778913A (en) * 2014-01-22 2014-05-07 苏州大学 Pathologic voice recognizing method
CN106297768A (en) * 2015-05-11 2017-01-04 苏州大学 A kind of audio recognition method
CN106297768B (en) * 2015-05-11 2020-01-17 苏州大学 Speech recognition method
CN108601567A (en) * 2016-02-09 2018-09-28 Pst株式会社 Estimation method, estimating program, estimating unit and hypothetical system
CN108601567B (en) * 2016-02-09 2021-06-11 Pst株式会社 Estimation method, estimation program, estimation device, and estimation system
CN108269590A (en) * 2018-01-17 2018-07-10 广州势必可赢网络科技有限公司 A kind of vocal cords restore methods of marking and device
CN109192226A (en) * 2018-06-26 2019-01-11 深圳大学 A kind of signal processing method and device
CN109036469A (en) * 2018-07-17 2018-12-18 西安交通大学 A kind of autonomic nervous function parameter acquiring method based on sound characteristic
CN111554325A (en) * 2020-05-09 2020-08-18 陕西师范大学 Voice recognition method and system
CN111554325B (en) * 2020-05-09 2023-03-24 陕西师范大学 Voice recognition method and system

Similar Documents

Publication Publication Date Title
CN103258545A (en) Pathological voice subdivision method
CN106878677A (en) Student classroom Grasping level assessment system and method based on multisensor
CN109620152B (en) MutifacolLoss-densenert-based electrocardiosignal classification method
CN108198620A (en) A kind of skin disease intelligent auxiliary diagnosis system based on deep learning
CN109044396B (en) Intelligent heart sound identification method based on bidirectional long-time and short-time memory neural network
CN108899049A (en) A kind of speech-emotion recognition method and system based on convolutional neural networks
Muhammad et al. Convergence of artificial intelligence and internet of things in smart healthcare: a case study of voice pathology detection
CN112006697B (en) Voice signal-based gradient lifting decision tree depression degree recognition system
CN103996155A (en) Intelligent interaction and psychological comfort robot service system
CN107066514A (en) The Emotion identification method and system of the elderly
CN107609736A (en) A kind of teaching diagnostic analysis system and method for integrated application artificial intelligence technology
CN106941005A (en) A kind of vocal cords method for detecting abnormality based on speech acoustics feature
WO2019242155A1 (en) Voice recognition-based health management method and apparatus, and computer device
CN105448291A (en) Parkinsonism detection method and detection system based on voice
CN111920420B (en) Patient behavior multi-modal analysis and prediction system based on statistical learning
CN102623009A (en) Abnormal emotion automatic detection and extraction method and system on basis of short-time analysis
CN107274888A (en) A kind of Emotional speech recognition method based on octave signal intensity and differentiation character subset
CN109727608A (en) A kind of ill voice appraisal procedure based on Chinese speech
CN109272986A (en) A kind of dog sound sensibility classification method based on artificial neural network
CN103578480B (en) The speech-emotion recognition method based on context correction during negative emotions detects
CN110705523B (en) Entrepreneur performance evaluation method and system based on neural network
CN111489736A (en) Automatic seat speech technology scoring device and method
CN116844080A (en) Fatigue degree multi-mode fusion detection method, electronic equipment and storage medium
CN109272262A (en) A kind of analysis method of natural language feature
Radha et al. Automated detection and severity assessment of dysarthria using raw speech

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20130821