CN101650944A - Method for distinguishing speakers based on protective kernel Fisher distinguishing method - Google Patents
- Publication number
- CN101650944A, CN200910152590A
- Authority
- CN
- China
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/21—Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
- G06F18/213—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
- G06F18/2132—Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on discrimination criteria, e.g. discriminant analysis
Abstract
The invention relates to a speaker identification method based on a class-preserving kernel Fisher discriminant. The method comprises the following steps: (1) pre-process the speech signal; (2) extract characteristic parameters: after framing and endpoint detection of the speech signal, extract Mel-frequency cepstral coefficients as the speaker feature vectors; (3) build the speaker identification model; (4) compute the model's optimal projection vectors: using the optimal solution of the LWFD method, compute the optimal projection vector group; (5) identify the speaker: project the original data x_i to y_i ∈ R^r (1 ≤ r ≤ d) according to the optimal projection classification vectors φ, where r is the reduced dimensionality and the optimal projection dimensionality of an original c-class data space is c − 1; then compute and normalise the centre of each class's data after projection; after projecting the data to be classified into the subspace and normalising it, compute the Euclidean distance from the normalised projected data to the centre of each class in the subspace, and take the nearest class as the identification result. The invention achieves a high identification rate with simple model construction and good speed.
Description
Technical field
The present invention relates to signal processing, machine learning and pattern recognition, and in particular to a method for implementing speaker identification.
Background technology
Speaker recognition (SR), also called voiceprint recognition, refers to techniques that automatically confirm a speaker's identity by analysing and processing the speaker's speech signal. Speaker identification, to which the present invention relates, is an important branch of speaker recognition: given speech to be identified, a speaker identification system must decide which of a set of known individuals produced it, and sometimes must also reject speech from anyone outside that set. Speaker identification is essentially a pattern-recognition process: the computer first builds a speech model from the speaker's voice characteristics, i.e. it analyses the input speech signal, extracts the speaker's personal features, and on this basis builds the model required for identification. A speaker identification system can be divided into several parts: pre-processing of the speech, selection and extraction of characteristic parameters, and training and matching of the recognition model.
At present the comparatively mature algorithms are mainly vector quantization (VQ), support vector machines (SVM), hidden Markov models (HMM) and Gaussian mixture models (GMM). The VQ method is only suitable for text-dependent speaker identification. GMM and HMM methods presuppose a large amount of training speech data for optimising the model parameters. Although SVM can obtain good recognition efficiency, its non-probabilistic output and the inherent limitations of its multi-class extensions restrict its range of application.
A patent search shows that many speaker recognition patents already exist at home and abroad, for example: a speaker recognition method based on a support vector machine model with an embedded GMM kernel (200510061953.X); a speaker recognition method using pitch-envelope elimination of emotional speech (200710157134.4); a speaker recognition method based on conversion between neutral and emotional voiceprint models (200710157133.X); a speaker recognition method based on hybrid support vector machines (200510061954.4); an emotional speaker recognition method based on spectrum translation (200810162450.5); a speaker recognition method based on a mixed t-model (200810162449.2); and a speaker recognition method based on MFCC linear emotion compensation (200510061360.3).
Summary of the invention
To overcome the low identification rate, complicated model construction and slow speed of existing speaker identification methods, the invention provides a speaker identification method based on a class-preserving kernel Fisher discriminant that has a high identification rate, simple model construction and good speed.
The technical solution adopted by the present invention to solve the technical problem is as follows:
A speaker identification method based on a class-preserving kernel Fisher discriminant comprises the following steps:
1. Pre-processing of the speech signal: pre-process the speech signal;
2. Characteristic parameter extraction: after framing and endpoint detection of the speech signal, extract Mel cepstral parameters as the speaker feature vectors; the Mel cepstral parameters are 13th-order cepstral parameters, from which the 0th-order parameter, which contributes little to describing the speaker, is removed, so that each speech frame is converted into a 12-dimensional Mel cepstral feature vector;
3. Speaker identification model construction:
Let x_i ∈ R^d (i = 1, 2, …, N) be d-dimensional sample data and y_i ∈ {1, 2, …, c} the corresponding class labels, where N is the total number of samples, c is the total number of classes and c_l is the number of samples in class l. Then X is the sample matrix, that is:
X ≡ (x_1 | x_2 | … | x_N)
Based on the above preconditions, the speaker identification model is established as
J(φ) = (φ^T S_b φ) / (φ^T S̃_w φ)
where S_b is the between-class scatter matrix, S̃_w is the within-class scatter matrix, and the affinity matrix is A_ij = exp(−‖x_i − x_j‖² / σ), where σ is an adjustable integer constant factor and φ is the optimal projection classification vector to be found;
4. Model optimal projection vector calculation:
Adopt the optimal solution of the LWFD method, i.e. compute the optimal projection vector group according to the criterion formula above. Let nullB and nullW̃ denote the null spaces of S_b and S̃_w respectively; the optimal discriminant subspace of the formula is then taken from nullB⊥, where nullB⊥ is the orthogonal complement of nullB. First project S_b onto nullB⊥; having obtained the nullB⊥ space, project the between-class and within-class scatter matrices into this subspace; the vectors of the resulting subspace are the optimal discriminant feature vectors;
5. Speaker identification:
According to the optimal projection classification vectors, project the original data x_i to y_i ∈ R^r (1 ≤ r ≤ d), where r is the reduced dimensionality, using the projection with transformation matrix T. The optimal classification projection dimensionality of an original c-class data space is c − 1. Then compute and normalise the centre of the data of each class after projection; after projecting the data to be classified into the subspace and normalising it, compute its Euclidean distance to the centre of each class of data in the subspace, and take the nearest class as the identification result.
Further, in step 4, the procedure for finding the optimal discriminant feature vectors is as follows:
First project S_b onto nullB⊥ by rewriting the expression of S_b as
S_b = Φ_b Φ_b^T, with Φ_b = [φ′_1 … φ′_c]
The rank of the matrix S_b is c − 1. Φ_b Φ_b^T and Φ_b^T Φ_b have the same nonzero eigenvalues, and the eigensubspace corresponding to the zero eigenvalues, which is filtered out, is the null space of S_b. Therefore Φ_b^T Φ_b is used in place of Φ_b Φ_b^T and the kernel trick is applied to the derivation, in which each term of the expression is converted to a matrix using the kernel function; here 1_LC denotes an L × C matrix whose elements are all 1, and B is an L × C block-diagonal matrix whose block b_i is a c_i × 1 column vector.
Let λ_i and e_i (i = 1, …, c) be the i-th eigenvalue and eigenvector of Φ_b^T Φ_b, with the eigenvalues sorted in descending order; then v_i = Φ_b e_i are the eigenvectors of the original between-class scatter matrix S_b. Remove the null space of S_b, i.e. discard the eigenvectors whose eigenvalues are zero, and keep the first c − 1 eigenvectors v_i: V = [v_1 … v_(c−1)] = Φ_b E_m = Φ_b [e_1 … e_(c−1)]; then V^T S_b V = Λ_b, where Λ_b = diag[λ_1 … λ_(c−1)] is a (c − 1) × (c − 1) diagonal matrix.
Having obtained the nullB⊥ space, project the between-class and within-class scatter matrices into the subspace according to U = V Λ_b^(−1/2), so that U^T S_b U = I, and use the kernel matrix K to kernelise the projected within-class scatter U^T S̃_w U. In the resulting expression W = diag[w_1 … w_c] is an N × N block-diagonal matrix in which w_i is a c_i × c_i matrix, so the corresponding product is also a c × c matrix; U^T S̃_w U is then a simple (c − 1) × (c − 1) matrix. Compute its eigenvectors p_i and eigenvalues λ′_i, arrange them in ascending order, and take the first m vectors to obtain the feature transformation matrix Q = UP = U[p_1 … p_m], where 1 ≤ m ≤ c − 1, and Λ_w = diag[λ′_1 … λ′_m] is an m × m diagonal matrix;
The optimal discriminant feature vectors preserving the within-class Fisher discriminant are:
Γ = U P = Φ_b E_m Λ_b^(−1/2) P
The transformed features constitute a low-dimensional subspace of the space H.
Further, in step 5, any speaker speech input pattern z to be classified is projected into the feature subspace according to Γ, computed as
y = Γ^T φ(z)
Since Γ is expressed through the mapped training samples, y can be evaluated with the kernel: letting η_z = (k(x_1, z), …, k(x_N, z))^T be an N × 1 kernel vector, the feature vector value is obtained from η_z. Compute the Euclidean distance between y and the centre of each class of data in the subspace; the nearest class is taken as the identification result.
Further again, in step 1, the pre-processing comprises sampling, noise removal, endpoint detection, pre-emphasis, framing and windowing.
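The pre-processing chain listed above (pre-emphasis, framing, windowing) can be sketched as follows. This is an illustrative numpy sketch, not code from the patent; the frame length of 240 samples (30 ms at the 8 kHz sampling rate used in the experiments) and the pre-emphasis coefficient 0.97 are assumed conventional values.

```python
import numpy as np

def preprocess(signal, frame_len=240, hop=120, alpha=0.97):
    """Pre-emphasis, framing and Hamming windowing of a speech signal.
    frame_len=240 is 30 ms at 8 kHz; alpha is the usual pre-emphasis
    coefficient (both assumed, not stated as such in the patent)."""
    # Pre-emphasis: s'[n] = s[n] - alpha * s[n-1]
    emphasized = np.append(signal[0], signal[1:] - alpha * signal[:-1])
    # Split into overlapping frames (50% overlap)
    n_frames = 1 + max(0, (len(emphasized) - frame_len) // hop)
    frames = np.stack([emphasized[i * hop:i * hop + frame_len]
                       for i in range(n_frames)])
    # Apply a Hamming window to each frame
    return frames * np.hamming(frame_len)

frames = preprocess(np.sin(0.1 * np.arange(8000)))  # 1 s test tone
```

Each row of `frames` is then ready for spectral analysis in the feature-extraction step.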
The technical concept of the present invention is as follows. Fisher discriminant analysis (FDA) projects the sample data of a d-dimensional input space onto a line such that the between-class separability of the projected samples on that line is maximised. A speaker's pitch, timbre and volume take on varied forms at different times, and speech characteristic parameters are often nonlinear and multimodal, so applying Fisher discriminant analysis directly cannot obtain ideal recognition results.
Kernel Fisher discriminant analysis (KFDA) combines kernel learning with the idea of the Fisher discriminant. The idea of the KFDA algorithm is: first map the input data into a high-dimensional kernel space through a nonlinear mapping; then carry out linear Fisher discriminant analysis in this high-dimensional kernel space, thereby realising what is, with respect to the original space, a nonlinear discriminant analysis. Although KFDA suits the nonlinear characteristics of speaker identification, it only considers maximising the global between-class separability of the projected data and ignores the multimodal within-class distribution of a given speaker's speech vectors; moreover, an accelerated model-training algorithm is needed to support the large data volumes of speaker identification.
The beneficial effects of the present invention are: 1. The affinity between samples is incorporated into the within-class scatter matrix in the form of weights, yielding a within-class-preserving Fisher discriminant method. Applied to speaker identification, its identification rate is higher than that of traditional generative models (such as Gaussian mixture models) and similar to that of other discriminative models (such as support vector machines); however, a support vector machine is a binary classifier and can only perform multi-class classification by voting among multiple "one-versus-rest" or "one-versus-one" models, whereas the method of the invention performs multi-class classification directly, making model construction more intuitive and faster. 2. The optimal projection classification vectors of the class-preserving kernelised Fisher discriminant model are sought in the subspace excluding the null space of the between-class scatter, which makes the optimal-vector computation faster and suits the large training-sample situation typical of speaker identification.
Embodiment
The present invention is further described below.
A speaker identification method based on a class-preserving kernel Fisher discriminant comprises the following steps:
1. Pre-processing of the speech signal: pre-process the speech signal;
2. Characteristic parameter extraction: after framing and endpoint detection of the speech signal, extract Mel cepstral parameters as the speaker feature vectors; the Mel cepstral parameters are 13th-order cepstral parameters, from which the 0th-order parameter, which contributes little to describing the speaker, is removed, so that each speech frame is converted into a 12-dimensional Mel cepstral feature vector;
3. Speaker identification model construction:
Let x_i ∈ R^d (i = 1, 2, …, N) be d-dimensional sample data and y_i ∈ {1, 2, …, c} the corresponding class labels, where N is the total number of samples, c is the total number of classes and c_l is the number of samples in class l. Then X is the sample matrix, that is:
X ≡ (x_1 | x_2 | … | x_N)
Based on the above preconditions, the speaker identification model is established as
J(φ) = (φ^T S_b φ) / (φ^T S̃_w φ)
where S_b is the between-class scatter matrix, S̃_w is the within-class scatter matrix, and the affinity matrix is A_ij = exp(−‖x_i − x_j‖² / σ), where σ is an adjustable integer constant factor, x̄_i is the mean of the samples of class i, x̄ is the mean of all samples, and φ is the optimal projection classification vector to be found;
4. Model optimal projection vector calculation:
Adopt the optimal solution of the LWFD method, i.e. compute the optimal projection vector group according to the criterion formula above. Let nullB and nullW̃ denote the null spaces of S_b and S̃_w respectively; the optimal discriminant subspace of the formula is then taken from nullB⊥, where nullB⊥ is the orthogonal complement of nullB. First project S_b onto nullB⊥; having obtained the nullB⊥ space, project the between-class and within-class scatter matrices into this subspace; the vectors of the resulting subspace are the optimal discriminant feature vectors;
5. Speaker identification:
According to the optimal projection classification vectors, project the original data x_i to y_i ∈ R^r (1 ≤ r ≤ d), where r is the reduced dimensionality, using the projection with transformation matrix T. The optimal classification projection dimensionality of an original c-class data space is c − 1. Then compute and normalise the centre of the data of each class after projection; after projecting the data to be classified into the subspace and normalising it, compute its Euclidean distance to the centre of each class of data in the subspace, and take the nearest class as the identification result.
The framework of the present embodiment is as follows:
Part 1: feature extraction
Feature extraction basically adopts the prior art. First, a number of speech signals are collected from each speaker at different times and pre-processed, including sample quantisation, centre clipping, pre-emphasis, silence removal, windowing and framing. Feature extraction is then carried out on the pre-processed speech. The present invention adopts Mel-frequency cepstral coefficients (MFCC): the 13th-order Mel cepstral parameters of each speech frame are extracted, the 0th-order parameter, which contributes little to describing the speaker, is removed, and finally each speech frame is converted into a 12-dimensional Mel cepstral feature vector.
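The MFCC extraction just described can be sketched in numpy: power spectrum, mel filterbank, log, and DCT, keeping 13 coefficients and dropping the 0th to leave 12. This is an illustrative sketch, not the patent's implementation; the filter count (26) and FFT size (512) are assumed conventional values.

```python
import numpy as np

def mel_filterbank(n_filt=26, nfft=512, sr=8000):
    # Triangular filters evenly spaced on the mel scale
    mel = lambda f: 2595.0 * np.log10(1.0 + f / 700.0)
    imel = lambda m: 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    pts = imel(np.linspace(mel(0.0), mel(sr / 2), n_filt + 2))
    bins = np.floor((nfft + 1) * pts / sr).astype(int)
    fb = np.zeros((n_filt, nfft // 2 + 1))
    for i in range(n_filt):
        l, c, r = bins[i], bins[i + 1], bins[i + 2]
        fb[i, l:c] = (np.arange(l, c) - l) / max(c - l, 1)   # rising slope
        fb[i, c:r] = (r - np.arange(c, r)) / max(r - c, 1)   # falling slope
    return fb

def mfcc(frame, nfft=512, n_filt=26, n_ceps=13):
    """13 cepstral coefficients of one windowed frame; the 0th is dropped,
    returning the 12-dimensional vector used in the patent."""
    power = np.abs(np.fft.rfft(frame, nfft)) ** 2 / nfft
    energies = np.log(mel_filterbank(n_filt, nfft) @ power + 1e-10)
    n = np.arange(n_filt)
    # DCT-II of the log filterbank energies
    ceps = np.array([np.sum(energies * np.cos(np.pi * k * (2 * n + 1)
                                              / (2 * n_filt)))
                     for k in range(n_ceps)])
    return ceps[1:]

vec = mfcc(np.hamming(240) * np.sin(0.3 * np.arange(240)))
```

In practice a library such as librosa would be used; this sketch only makes the steps of the prior-art pipeline explicit.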
Part 2: the within-class-preserving Fisher discriminant model
The traditional kernel Fisher criterion is as follows:
J(w) = (w^T S_b^φ w) / (w^T S_w^φ w)
where S_b^φ and S_w^φ represent the between-class and within-class scatter matrices in the high-dimensional space H respectively; φ(x) is the projection of the input vector x into the high-dimensional space H, φ̄_i is the mean of the samples of class i in the high-dimensional space, and φ̄ is the mean of all samples in the high-dimensional space. According to reproducing-kernel theory, w can be expressed as an expansion over the mapped training samples, and the original discriminant criterion is then equivalent to the expression
J(α) = (α^T K_b α) / (α^T K_w α)
where the kernel within-class scatter matrix K_w and the kernel between-class scatter matrix K_b are defined through the kernel vectors
η_x = (k(x_1, x), …, k(x_N, x))^T
The optimal kernel discriminant vectors α_1, α_2, …, α_N are obtained from this expression according to the generalised Rayleigh quotient; the set of optimal discriminant vectors then constitutes the projection matrix in the feature space H.
A speaker's speech characteristic parameters exhibit within-class clustering and overlapping samples, so applying the kernel Fisher discriminant method directly cannot obtain ideal recognition results. The present invention proposes a class-preserving Fisher discriminant method that fuses the local features of the data within each class into the within-class scatter matrix: the between-class scatter matrix S_b of the original Fisher discriminant analysis is kept unchanged, and the within-class scatter matrix is adjusted by incorporating the local data features within each class, becoming
S̃_w = Σ_l Σ_{x_i, x_j ∈ class l} A_ij (x_i − x_j)(x_i − x_j)^T, with A_ij = exp(−‖x_i − x_j‖² / σ)
where σ is an adjustable integral factor. That is to say, the distance factor between same-class data is incorporated into the within-class scatter matrix in the form of weights: same-class sample pairs are weighted so that the contribution of distant same-class pairs to the within-class scatter matrix is reduced and the influence of neighbouring data is emphasised, i.e. the within-class characteristics are preserved. Applying S̃_w to the Fisher discriminant analysis, the Fisher criterion formula becomes:
J(φ) = (φ^T S_b φ) / (φ^T S̃_w φ)
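The affinity-weighted within-class scatter just described can be sketched directly. This illustrative implementation assumes the Gaussian affinity A_ij = exp(−‖x_i − x_j‖²/σ) stated above and is written for clarity rather than speed (it is O(N²) per class).

```python
import numpy as np

def weighted_within_scatter(X, y, sigma=1.0):
    """Within-class scatter with pairwise affinity weights
    A_ij = exp(-||x_i - x_j||^2 / sigma): distant same-class pairs
    contribute less, which is the 'class-preserving' adjustment."""
    d = X.shape[1]
    Sw = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        for i in range(len(Xc)):
            for j in range(len(Xc)):
                diff = Xc[i] - Xc[j]
                a = np.exp(-np.dot(diff, diff) / sigma)
                Sw += a * np.outer(diff, diff)   # weighted pair scatter
    return Sw

rng = np.random.default_rng(1)
X = rng.normal(size=(12, 3))
y = np.array([0] * 6 + [1] * 6)
Sw = weighted_within_scatter(X, y)
```

Because every term is a non-negatively weighted outer product, the result is symmetric positive semi-definite, as a scatter matrix must be.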
Part 3: obtaining the optimal Fisher projection vectors
Directly computing the optimal projection of the within-class-preserving Fisher discriminant in the manner of the ordinary kernel Fisher method requires the eigenvector corresponding to the largest eigenvalue of the matrix K_w^(−1) K_b. Applied to speaker identification, where the sample feature vectors easily number in the thousands, the computational load is considerable and real-time use is impossible, so the training algorithm must be improved.
The optimal solution of the within-class-preserving Fisher discriminant method computes the optimal projection vector group from the Fisher criterion formula as follows. Let nullB and nullW̃ denote the null spaces of S_b and S̃_w respectively; the optimal discriminant subspace is then taken from nullB⊥, where nullB⊥ is the orthogonal complement of nullB. First project S_b onto nullB⊥ by rewriting the expression of S_b as
S_b = Φ_b Φ_b^T, with Φ_b = [φ′_1 … φ′_c]
The rank of the matrix S_b is c − 1. Φ_b Φ_b^T and Φ_b^T Φ_b have the same nonzero eigenvalues, and the eigensubspace corresponding to the zero eigenvalues is the null space of S_b, which must be filtered out. Therefore Φ_b^T Φ_b is used in place of Φ_b Φ_b^T and the kernel trick is applied to the derivation, which simplifies the computation: each term of the expression is converted to a matrix using the kernel function, where 1_LC denotes an L × C matrix whose elements are all 1, B is an L × C block-diagonal matrix whose block b_i is a c_i × 1 column vector, and K is the kernel matrix of the input feature vectors.
Let λ_i and e_i (i = 1, …, c) be the i-th eigenvalue and eigenvector of Φ_b^T Φ_b, with the eigenvalues sorted in descending order. Then v_i = Φ_b e_i are the eigenvectors of the original between-class scatter matrix S_b. To obtain the optimal projection, the null space of S_b must be removed, i.e. the eigenvectors whose eigenvalues are zero are discarded and the first c − 1 eigenvectors v_i are kept: V = [v_1 … v_(c−1)] = Φ_b E_m = Φ_b [e_1 … e_(c−1)]. Then V^T S_b V = Λ_b, where Λ_b = diag[λ_1 … λ_(c−1)] is a (c − 1) × (c − 1) diagonal matrix.
Having obtained the nullB⊥ space, the between-class and within-class scatter matrices are projected into the subspace according to U = V Λ_b^(−1/2), so that U^T S_b U = I, and the projected within-class scatter U^T S̃_w U is kernelised using the kernel matrix K. In the resulting expression W = diag[w_1 … w_c] is an N × N block-diagonal matrix in which w_i is a c_i × c_i matrix, so the corresponding product is also a c × c matrix; U^T S̃_w U is then a simple (c − 1) × (c − 1) matrix. Compute its eigenvectors p_i and eigenvalues λ′_i, arrange them in ascending order, and take the first m vectors to obtain the feature-extraction transformation matrix Q = UP = U[p_1 … p_m], where 1 ≤ m ≤ c − 1, and Λ_w = diag[λ′_1 … λ′_m] is an m × m diagonal matrix.
In summary, the optimal discriminant feature vectors preserving the within-class Fisher discriminant are
Γ = U P = Φ_b E_m Λ_b^(−1/2) P
The transformed features constitute a low-dimensional subspace of the space H with maximum separability. Finally, the centre of each class's projected data in the low-dimensional subspace is computed and normalised, in preparation for the next step of speaker identification.
Part 4: speaker identification
Any speaker speech input pattern z to be classified is projected into the feature subspace according to Γ, computed as
y = Γ^T φ(z)
Since Γ is expressed through the mapped training samples, y can be evaluated with the kernel: letting η_z = (k(x_1, z), …, k(x_N, z))^T be an N × 1 kernel vector, the final feature vector value is obtained from η_z. Compute the Euclidean distance between y and the centre of each class of data in the subspace; the nearest class is taken as the identification result.
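The decision rule above (nearest class centre in Euclidean distance) is simple enough to state directly. A minimal sketch, assuming the class centres have already been computed and normalised in the projected subspace as described:

```python
import numpy as np

def nearest_center(y_vec, centers):
    """Assign a projected, normalised feature vector to the class whose
    centre is nearest in Euclidean distance."""
    labels = sorted(centers)
    dists = [np.linalg.norm(y_vec - centers[k]) for k in labels]
    return labels[int(np.argmin(dists))]

# Hypothetical 2-D centres for two speakers, for illustration only
centers = {0: np.array([0.0, 0.0]), 1: np.array([3.0, 3.0])}
pred = nearest_center(np.array([2.5, 2.8]), centers)
```

With per-frame feature vectors, the frame-level decisions (or accumulated distances) would be combined over the whole test utterance.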
Test experiment: the experiment uses a self-recorded corpus of 20 speakers in total, 12 male and 8 female. The data were acquired by A/D conversion at a sampling frequency of 8000 Hz, 16-bit quantisation, mono. Each person's speech signal was recorded and assembled at different times. For each person, speech segments from different times with a total length of 15 s were mixed and extracted as the training signal, and 20 speech segments of length 1.5 s from different times were used as test signals, i.e. 20 training utterances and 400 test utterances. The speech signals were first pre-processed by high-frequency emphasis and centring; then voice activity was detected by VAD (Voice Activity Detection) to extract the effective speech segments and remove redundant silence, and 12-dimensional MFCC characteristic parameters were extracted from 30 ms frames as classification parameters.
Speaker identification comparison tests were carried out with GMM, SVM and the method of the invention. The method of the invention and the SVM adopt the same radial basis kernel function. The error rates were: Gaussian mixture model 3.5%; support vector machine 2.75%; within-class-feature-preserving kernel Fisher discriminant method 2.5%. It can be seen that the method of the invention has a better recognition rate than the classical methods.
Claims (4)
1. A speaker identification method based on a class-preserving kernel Fisher discriminant, characterised in that the speaker identification method comprises the following steps:
1. Pre-processing of the speech signal: pre-process the speech signal;
2. Characteristic parameter extraction: after framing and endpoint detection of the speech signal, extract Mel cepstral parameters as the speaker feature vectors; the Mel cepstral parameters are 13th-order cepstral parameters, from which the 0th-order parameter, which contributes little to describing the speaker, is removed, so that each speech frame is converted into a 12-dimensional Mel cepstral feature vector;
3. Speaker identification model construction:
Let x_i ∈ R^d (i = 1, 2, …, N) be d-dimensional sample data and y_i ∈ {1, 2, …, c} the corresponding class labels, where N is the total number of samples, c is the total number of classes and c_l is the number of samples in class l. Then X is the sample matrix, that is:
X ≡ (x_1 | x_2 | … | x_N)
Based on the above preconditions, the speaker identification model is established as
J(φ) = (φ^T S_b φ) / (φ^T S̃_w φ)
where S_b is the between-class scatter matrix, S̃_w is the within-class scatter matrix, and the affinity matrix is A_ij = exp(−‖x_i − x_j‖² / σ), where σ is an adjustable integer constant factor, x̄_i is the mean of the samples of class i, x̄ is the mean of all samples, and φ is the optimal projection classification vector to be found;
4. Model optimal projection vector calculation:
Adopt the optimal solution of the LWFD method, i.e. compute the optimal projection vector group according to the criterion formula above. Let nullB and nullW̃ denote the null spaces of S_b and S̃_w respectively; the optimal discriminant subspace of the formula is then taken from nullB⊥, where nullB⊥ is the orthogonal complement of nullB. First project S_b onto nullB⊥; having obtained the nullB⊥ space, project the between-class and within-class scatter matrices into this subspace; the vectors of the resulting subspace are the optimal discriminant feature vectors;
5. Speaker identification:
According to the optimal projection classification vectors, project the original data x_i to y_i ∈ R^r (1 ≤ r ≤ d), where r is the reduced dimensionality, using the projection with transformation matrix T. The optimal classification projection dimensionality of an original c-class data space is c − 1. Then compute and normalise the centre of the data of each class after projection; after projecting the data to be classified into the subspace and normalising it, compute its Euclidean distance to the centre of each class of data in the subspace, and take the nearest class as the identification result.
2. The speaker identification method based on a class-preserving kernel Fisher discriminant according to claim 1, characterised in that in step 4 the procedure for finding the optimal discriminant feature vectors is as follows:
First project S_b onto nullB⊥ by rewriting the expression of S_b as
S_b = Φ_b Φ_b^T
where φ̄_i is the mean of the samples of class i in the high-dimensional space, φ̄ is the mean of all samples in the high-dimensional space, and Φ_b = [φ′_1 … φ′_c]. The rank of the matrix S_b is c − 1. Φ_b Φ_b^T and Φ_b^T Φ_b have the same nonzero eigenvalues, and the eigensubspace corresponding to the zero eigenvalues is the null space of S_b; Φ_b^T Φ_b is used in place of Φ_b Φ_b^T and the kernel trick is applied to the derivation, in which each term of the expression is converted to a matrix using the kernel function; here 1_LC denotes an L × C matrix whose elements are all 1, B is an L × C block-diagonal matrix whose block b_i is a c_i × 1 column vector, and K is the kernel matrix of the input feature vectors.
Let λ_i and e_i (i = 1, …, c) be the i-th eigenvalue and eigenvector of Φ_b^T Φ_b, with the eigenvalues sorted in descending order; then v_i = Φ_b e_i are the eigenvectors of the original between-class scatter matrix S_b. Remove the null space of S_b, i.e. discard the eigenvectors whose eigenvalues are zero, and keep the first c − 1 eigenvectors v_i: V = [v_1 … v_(c−1)] = Φ_b E_m = Φ_b [e_1 … e_(c−1)]; then V^T S_b V = Λ_b, where Λ_b = diag[λ_1 … λ_(c−1)] is a (c − 1) × (c − 1) diagonal matrix.
Having obtained the nullB⊥ space, project the between-class and within-class scatter matrices into the subspace according to U = V Λ_b^(−1/2), so that U^T S_b U = I, and use the kernel matrix K to kernelise the projected within-class scatter U^T S̃_w U. In the resulting expression W = diag[w_1 … w_c] is an N × N block-diagonal matrix in which w_i is a c_i × c_i matrix, so the corresponding product is also a c × c matrix; U^T S̃_w U is then a simple (c − 1) × (c − 1) matrix. Compute its eigenvectors p_i and eigenvalues λ′_i, arrange them in ascending order, and take the first m vectors to obtain the feature transformation matrix Q = UP = U[p_1 … p_m], where 1 ≤ m ≤ c − 1, and Λ_w = diag[λ′_1 … λ′_m] is an m × m diagonal matrix;
The optimal discriminant feature vectors preserving the within-class Fisher discriminant are:
Γ = U P = Φ_b E_m Λ_b^(−1/2) P
The transformed features constitute a low-dimensional subspace of the space H.
3. The speaker identification method based on a class-preserving kernel Fisher discriminant according to claim 1 or 2, characterised in that in step 5 any speaker speech input pattern z to be classified is projected into the feature subspace according to Γ, computed as
y = Γ^T φ(z)
which, letting η_z = (k(x_1, z), …, k(x_N, z))^T be an N × 1 kernel vector, gives the feature vector value through the kernel; the Euclidean distance between y and the centre of each class of data in the subspace is computed, and the nearest class is taken as the identification result.
4. The speaker identification method based on a class-preserving kernel Fisher discriminant according to claim 3, characterised in that in step 1 the pre-processing comprises sampling, noise removal, endpoint detection, pre-emphasis, framing and windowing.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN200910152590A CN101650944A (en) | 2009-09-17 | 2009-09-17 | Method for distinguishing speakers based on protective kernel Fisher distinguishing method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN101650944A true CN101650944A (en) | 2010-02-17 |
Family
ID=41673165
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN200910152590A Pending CN101650944A (en) | 2009-09-17 | 2009-09-17 | Method for distinguishing speakers based on protective kernel Fisher distinguishing method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101650944A (en) |
- 2009-09-17: CN application CN200910152590A filed (publication CN101650944A/en), status: Pending
Cited By (19)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103077405A (en) * | 2013-01-18 | 2013-05-01 | 浪潮电子信息产业股份有限公司 | Bayes classification method based on Fisher discriminant analysis |
CN106683661A (en) * | 2015-11-05 | 2017-05-17 | 阿里巴巴集团控股有限公司 | Role separation method and device based on voice |
CN106128466A (en) * | 2016-07-15 | 2016-11-16 | 腾讯科技(深圳)有限公司 | Identity vector processing method and device |
US10650830B2 (en) | 2016-07-15 | 2020-05-12 | Tencent Technology (Shenzhen) Company Limited | Identity vector processing method and computer device |
CN106128466B (en) * | 2016-07-15 | 2019-07-05 | 腾讯科技(深圳)有限公司 | Identity vector processing method and device |
CN106297825A (en) * | 2016-07-25 | 2017-01-04 | 华南理工大学 | A kind of speech-emotion recognition method based on integrated degree of depth belief network |
CN106326927B (en) * | 2016-08-24 | 2019-06-04 | 大连海事大学 | A kind of shoes print new category detection method |
CN106326927A (en) * | 2016-08-24 | 2017-01-11 | 大连海事大学 | Shoeprint new class detection method |
CN107274888A (en) * | 2017-06-14 | 2017-10-20 | 大连海事大学 | A kind of Emotional speech recognition method based on octave signal intensity and differentiation character subset |
CN107274888B (en) * | 2017-06-14 | 2020-09-15 | 大连海事大学 | Emotional voice recognition method based on octave signal strength and differentiated feature subset |
CN109389017A (en) * | 2017-08-11 | 2019-02-26 | 苏州经贸职业技术学院 | Pedestrian's recognition methods again |
CN109389017B (en) * | 2017-08-11 | 2021-11-16 | 苏州经贸职业技术学院 | Pedestrian re-identification method |
CN107633845A (en) * | 2017-09-11 | 2018-01-26 | 清华大学 | A kind of duscriminant local message distance keeps the method for identifying speaker of mapping |
CN110163034A (en) * | 2018-02-27 | 2019-08-23 | 冷霜 | A kind of listed method of aircraft surface positioning extracted based on optimal characteristics |
CN110163034B (en) * | 2018-02-27 | 2021-07-23 | 山东炎黄工业设计有限公司 | Aircraft ground positioning and listing method based on optimal feature extraction |
CN112949671A (en) * | 2019-12-11 | 2021-06-11 | 中国科学院声学研究所 | Signal classification method and system based on unsupervised feature optimization |
CN112949671B (en) * | 2019-12-11 | 2023-06-30 | 中国科学院声学研究所 | Signal classification method and system based on unsupervised feature optimization |
CN115268417A (en) * | 2022-09-29 | 2022-11-01 | 南通艾美瑞智能制造有限公司 | Self-adaptive ECU fault diagnosis control method |
CN115268417B (en) * | 2022-09-29 | 2022-12-16 | 南通艾美瑞智能制造有限公司 | Self-adaptive ECU fault diagnosis control method |
Similar Documents
Publication | Title
---|---
CN101650944A (en) | Method for distinguishing speakers based on protective kernel Fisher distinguishing method
Yu et al. | Spoofing detection in automatic speaker verification systems using DNN classifiers and dynamic acoustic features
CN1975856B (en) | Speech emotion identifying method based on supporting vector machine
CN111243602B (en) | Voiceprint recognition method based on gender, nationality and emotion information
CN106503805A (en) | A bimodal machine-learning-based dialogue sentiment analysis system and method
CN101923855A (en) | Text-independent voiceprint identification system
CN102968990B (en) | Speaker identifying method and system
CN103544963A (en) | Voice emotion recognition method based on kernel semi-supervised discriminant analysis
CN105261367B (en) | A method for speaker recognition
CN103456302B (en) | An emotional speaker recognition method based on emotion GMM model weight synthesis
CN101226743A (en) | Method for recognizing speaker based on conversion of neutral and emotional voiceprint models
CN111724770B (en) | Audio keyword identification method based on deep convolutional generative adversarial network
CN102982803A (en) | Isolated word speech recognition method based on HRSF and improved DTW algorithm
CN105825852A (en) | Oral English reading test scoring method
CN102789779A (en) | Speech recognition system and recognition method thereof
CN106531174A (en) | Animal sound recognition method based on wavelet packet decomposition and spectrogram features
CN106601230A (en) | Logistics sorting place name speech recognition method and system based on continuous Gaussian mixture HMM, and logistics sorting system
CN110070895A (en) | A mixed sound event detection method based on supervised variational encoder factor decomposition
CN102592593B (en) | Emotional-characteristic extraction method considering multilinear group sparsity in speech
CN105609117A (en) | Device and method for identifying voice emotion
CN104464738B (en) | A voiceprint recognition method for intelligent mobile devices
Iqbal et al. | MFCC and machine learning based speech emotion recognition over TESS and IEMOCAP datasets
CN101650945B (en) | Method for recognizing speaker based on multivariate core logistic regression model
Ye et al. | Phoneme classification using naive Bayes classifier in reconstructed phase space
Chen et al. | Automatic recognition of bird songs using time-frequency texture
Legal Events
Code | Title | Description
---|---|---
C06 | Publication |
PB01 | Publication |
C10 | Entry into substantive examination |
SE01 | Entry into force of request for substantive examination |
C12 | Rejection of a patent application after its publication |
RJ01 | Rejection of invention patent application after publication | Application publication date: 20100217