CN101650944A - Speaker identification method based on a within-class-preserving kernel Fisher discriminant method - Google Patents

Speaker identification method based on a within-class-preserving kernel Fisher discriminant method

Info

Publication number
CN101650944A
Authority
CN
China
Prior art date
Legal status: Pending
Application number
CN200910152590A
Other languages
Chinese (zh)
Inventor
王万良
郑建炜
王震宇
韩姗姗
蒋一波
郑泽萍
王磊
陈胜勇
Current Assignee
Zhejiang University of Technology ZJUT
Original Assignee
Zhejiang University of Technology ZJUT
Priority date
Filing date
Publication date
Application filed by Zhejiang University of Technology (ZJUT)
Priority to CN200910152590A
Publication of CN101650944A

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00Pattern recognition
    • G06F18/20Analysing
    • G06F18/21Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/213Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods
    • G06F18/2132Feature extraction, e.g. by transforming the feature space; Summarisation; Mappings, e.g. subspace methods based on discrimination criteria, e.g. discriminant analysis


Abstract

The invention relates to a speaker identification method based on a within-class-preserving kernel Fisher discriminant method. The method comprises the following steps: (1) pre-processing the speech signals; (2) extracting characteristic parameters: after framing and endpoint detection of the speech signals, extracting Mel-frequency cepstral coefficients as the speakers' feature vectors; (3) building the speaker identification model; (4) computing the model's optimal projection vectors: using the optimal solution of the LWFD method, computing the optimal projection vector group; (5) identifying the speaker: projecting the original data x_i to y_i ∈ R^r (1 ≤ r ≤ d) according to the optimal projection classification vector Φ, where r is the reduced dimensionality; the optimal projection classification dimensionality of the original c-class data space is c−1; then computing the centre of each class's projected data and normalizing; after projecting the data to be classified into the subspace and normalizing, computing the Euclidean distance from the normalized projected data to each class centre in the subspace and taking the nearest class as the identification result. The invention achieves a high identification rate, simple model construction and good speed.

Description

Speaker identification method based on a within-class-preserving kernel Fisher discriminant method
Technical field
The present invention relates to the fields of signal processing, machine learning and pattern recognition, and in particular to a method for implementing speaker identification.
Background technology
Speaker recognition (SR), also called voiceprint recognition, refers to the technology of automatically identifying a speaker by analysing and processing the speaker's speech signal. Speaker identification, to which the present invention relates, is an important branch of speaker recognition: the speaker identification system must decide which of a set of enrolled persons produced the speech to be identified, and sometimes must also reject speech from anyone outside this set. Speaker identification is essentially a pattern-matching process. In this process, the computer first builds a speech model from the speaker's voice characteristics; that is, it analyses the input speech signal, extracts the speaker's personal features, and on this basis builds the model required for speaker identification. A speaker identification system can be divided into several parts: pre-processing of the speech, selection and extraction of characteristic parameters, and training and matching of the recognition model.
At present the relatively mature algorithms mainly include Vector Quantization (VQ), Support Vector Machines (SVM), Hidden Markov Models (HMM) and Gaussian Mixture Models (GMM). The VQ method is suitable only for text-dependent speaker identification. GMM and HMM methods require a large amount of training speech data to optimize the model parameters. Although SVM can achieve good recognition efficiency, its non-probabilistic output and the inherent limitations of its multi-class extension restrict its range of application.
A patent search shows that many speaker recognition patents already exist at home and abroad, for example: a speaker recognition method based on an SVM model with an embedded GMM kernel (200510061953.X); a speaker recognition method using pitch envelope to eliminate emotional speech (200710157134.4); a speaker recognition method based on conversion between neutral and emotional voiceprint models (200710157133.X); a speaker recognition method based on a hybrid support vector machine (200510061954.4); an emotional speaker recognition method based on spectrum translation (200810162450.5); a speaker recognition method based on a mixed t model (200810162449.2); and a speaker recognition method based on MFCC linear emotion compensation (200510061360.3).
Summary of the invention
To overcome the low identification rate, complex model construction and slow speed of existing speaker identification methods, the invention provides a speaker identification method based on a within-class-preserving kernel Fisher discriminant method that has a high identification rate, simple model construction and good speed.
The technical solution adopted by the invention to solve this technical problem is as follows:
A speaker identification method based on a within-class-preserving kernel Fisher discriminant method comprises the following steps:
(1) Pre-processing of the speech signal: the speech signal is pre-processed;
(2) Characteristic parameter extraction: after framing and endpoint detection of the speech signal, Mel cepstral parameters are extracted as the speaker feature vectors. The Mel cepstral parameters are 13th-order cepstral parameters; the 0th-order parameter, which contributes little to describing speaker characteristics, is removed, so that each frame of the speech signal is converted into a 12-dimensional Mel cepstral feature vector;
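As an illustration of the pre-processing and framing that precede cepstral analysis, the pre-emphasis, framing and windowing stages can be sketched in numpy as below. This is a minimal sketch, not the patent's implementation: the frame length, hop size and function names are assumptions, and the mel filterbank plus DCT that would yield the 13th-order cepstral parameters is omitted.

```python
import numpy as np

def preemphasis(signal, alpha=0.97):
    # First-order high-pass filter commonly applied before cepstral analysis.
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

def frame_signal(signal, frame_len=400, hop=160):
    # Split the signal into overlapping frames and apply a Hamming window
    # (25 ms frames with 10 ms hop at 16 kHz -- illustrative values).
    n_frames = 1 + (len(signal) - frame_len) // hop
    frames = np.stack([signal[i * hop: i * hop + frame_len]
                       for i in range(n_frames)])
    return frames * np.hamming(frame_len)

# Example: 1 s of 16 kHz noise -> windowed frames ready for MFCC analysis.
rng = np.random.default_rng(0)
sig = preemphasis(rng.standard_normal(16000))
frames = frame_signal(sig)
print(frames.shape)  # (98, 400)
```

Each row of `frames` would then pass through a mel filterbank and DCT to produce the per-frame feature vector described above.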
(3) Construction of the speaker identification model:
Let x_i ∈ R^d (i = 1, 2, …, N) be the d-dimensional sample data and y_i ∈ {1, 2, …, c} the corresponding class labels, where N is the total number of samples, c the total number of classes and c_l the number of samples in class l, so that

    \sum_{l=1}^{c} c_l = N

X is the sample matrix, i.e.

    X \equiv (x_1 \,|\, x_2 \,|\, \cdots \,|\, x_N)

On these premises the speaker identification model is established as

    \Phi_{opt} = \arg\max_{\Phi} \frac{\Phi^T S_b \Phi}{\Phi^T \tilde{S}_w \Phi}

where

    S_b = \frac{1}{N}\sum_{l=1}^{c} c_l (\bar{x}_l - \bar{x})(\bar{x}_l - \bar{x})^T

is the between-class scatter matrix (\bar{x}_l is the mean of the class-l samples and \bar{x} the mean of all samples), and

    \tilde{S}_w \equiv \sum_{i=1}^{N} x_i x_i^T - \sum_{l=1}^{c} \frac{1}{c_l} \sum_{k=1}^{c_l} \sum_{m=1}^{c_l} A_{k,m}\, x_k x_m^T

is the within-class scatter matrix, with affinity matrix

    A_{i,j} = \exp\!\left(-\frac{\|x_i - x_j\|^2}{\sigma^2}\right)

where σ is an adjustable constant factor and Φ is the optimal projection classification vector to be found;
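The two scatter matrices of the model above can be sketched, in their linear (pre-kernel) form, roughly as follows. The function name and toy data are illustrative assumptions, not part of the patent:

```python
import numpy as np

def scatter_matrices(X, y, sigma=1.0):
    # S_b = (1/N) sum_l c_l (mean_l - mean)(mean_l - mean)^T and the
    # affinity-weighted within-class scatter
    # S~_w = sum_i x_i x_i^T - sum_l (1/c_l) sum_{k,m} A_km x_k x_m^T.
    N, d = X.shape
    mean_all = X.mean(axis=0)
    Sb = np.zeros((d, d))
    Sw = X.T @ X                       # sum_i x_i x_i^T
    for l in np.unique(y):
        Xl = X[y == l]
        diff = Xl.mean(axis=0) - mean_all
        Sb += len(Xl) / N * np.outer(diff, diff)
        D2 = ((Xl[:, None, :] - Xl[None, :, :]) ** 2).sum(-1)
        A = np.exp(-D2 / sigma ** 2)   # affinity A_km within class l
        Sw -= (Xl.T @ A @ Xl) / len(Xl)
    return Sb, Sw

rng = np.random.default_rng(0)
X = rng.standard_normal((20, 3))
y = np.repeat([0, 1], 10)
Sb, Sw = scatter_matrices(X, y)
```

With A_km ≡ 1 the within-class term reduces to the ordinary within-class scatter; the Gaussian weights down-weight distant same-class pairs, which is the "within-class preserving" adjustment.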
(4) Computation of the model's optimal projection vectors:
The optimal solution of the LWFD method is adopted, i.e. according to the formula

    \Phi_{opt} = \arg\max_{\Phi} \frac{\Phi^T S_b \Phi}{\Phi^T \tilde{S}_w \Phi}

the optimal projection vector group is computed. Let nullB and null\tilde{W} denote the null spaces of S_b and \tilde{S}_w respectively; the optimal discriminant subspace of the above formula is then taken from nullB^⊥ ∩ null\tilde{W}, where nullB^⊥ is the orthogonal complement of nullB. First S_b is projected onto nullB^⊥; after the nullB^⊥ space is obtained, the between-class and within-class scatter matrices are projected into the null\tilde{W} subspace, and the vectors of the resulting subspace are the optimal discriminant feature vectors;
(5) Speaker identification:
According to the optimal projection classification vector Φ, the original data x_i are projected to y_i ∈ R^r (1 ≤ r ≤ d), where r is the reduced dimensionality, using the projection with transformation matrix T:

    y_i = T^T x_i

The optimal classification projection dimensionality of the original c-class data space is c−1. The centre of each class's projected data is then computed and normalized. The data to be classified are projected into the subspace and normalized, their Euclidean distances to each class centre in the subspace are computed, and the nearest class is taken as the identification result.
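The nearest-centre decision of step (5) can be sketched as follows; the function name and toy centres are illustrative assumptions:

```python
import numpy as np

def classify(y_new, centers):
    # Assign a projected, normalized feature vector to the nearest class
    # centre by Euclidean distance; `centers` has one row per class.
    d = np.linalg.norm(centers - y_new, axis=1)
    return int(np.argmin(d))

# Three class centres in a 2-D discriminant subspace.
centers = np.array([[0.0, 0.0], [3.0, 0.0], [0.0, 3.0]])
print(classify(np.array([2.5, 0.2]), centers))  # 1
```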
Further, in step (4), the optimal discriminant feature vectors are obtained as follows:
First S_b is projected onto nullB^⊥; S_b is rewritten as

    S_b = \sum_{i=1}^{c}\left(\sqrt{\tfrac{c_i}{N}}(\bar{\phi}_i-\bar{\phi})\right)\left(\sqrt{\tfrac{c_i}{N}}(\bar{\phi}_i-\bar{\phi})\right)^T = \sum_{i=1}^{c}\bar{\phi}'_i\,\bar{\phi}'^T_i = \Phi_b\Phi_b^T

where \bar{\phi}'_i = \sqrt{c_i/N}\,(\bar{\phi}_i-\bar{\phi}) and \Phi_b = [\bar{\phi}'_1 \cdots \bar{\phi}'_c]. The rank of S_b is c−1; \Phi_b\Phi_b^T and \Phi_b^T\Phi_b have the same nonzero eigenvalues, and the eigensubspace corresponding to the zero eigenvalues is the null space of S_b, which is filtered out. \Phi_b^T\Phi_b is therefore used in place of \Phi_b\Phi_b^T and the kernel trick is applied in the derivation:

    \Phi_b^T\Phi_b = [\bar{\phi}'_1 \cdots \bar{\phi}'_c]^T[\bar{\phi}'_1 \cdots \bar{\phi}'_c] = \left(\bar{\phi}'^T_i\bar{\phi}'_j\right)_{i,j=1,\dots,c}

where

    \bar{\phi}'^T_i\bar{\phi}'_j = \frac{\sqrt{c_i c_j}}{N}\left(\bar{\phi}_i^T\bar{\phi}_j - \bar{\phi}_i^T\bar{\phi} - \bar{\phi}^T\bar{\phi}_j + \bar{\phi}^T\bar{\phi}\right)

Each term of this formula is converted to matrix form using the kernel function:

    \Phi_b^T\Phi_b = \frac{1}{N}\,B\Big(A_{LC}^T K A_{LC} - \frac{1}{N}\big(A_{LC}^T K 1_{LC}\big) - \frac{1}{N}\big(1_{LC}^T K A_{LC}\big) + \frac{1}{N^2}\big(1_{LC}^T K 1_{LC}\big)\Big)B

where B = \mathrm{diag}[\sqrt{c_1} \cdots \sqrt{c_c}], 1_{LC} is an N×c matrix whose elements are all 1, A_{LC} = \mathrm{diag}[a_{c_1} \cdots a_{c_c}] is an N×c block-diagonal matrix whose block a_{c_i} is a c_i×1 column vector with all elements 1/c_i, and K is the kernel matrix of the input feature vectors.

Let λ_i and e_i (i = 1, …, c) be the i-th eigenvalue and eigenvector of \Phi_b^T\Phi_b, sorted in descending order of eigenvalue; then v_i = \Phi_b e_i is an eigenvector of the original between-class scatter matrix S_b. The null space of S_b is removed, i.e. the eigenvectors whose eigenvalues are zero are discarded and the first c−1 eigenvectors v_i are kept: V = [v_1 \cdots v_{c-1}] = \Phi_b E_m, with E_m = [e_1 \cdots e_{c-1}]; then V^T S_b V = \Lambda_b, where \Lambda_b = \mathrm{diag}[\lambda_1 \cdots \lambda_{c-1}] is a (c−1)×(c−1) diagonal matrix.
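The substitution of \Phi_b^T\Phi_b for \Phi_b\Phi_b^T rests on the fact that the two products share their nonzero eigenvalues, so the small c×c eigenproblem suffices. A small numerical check (toy sizes, not the patent's data) is:

```python
import numpy as np

# Phi plays the role of Phi_b: shape (d, c) with d >> c. Then
# Phi Phi^T is d x d while Phi^T Phi is only c x c, yet their
# nonzero eigenvalues coincide.
rng = np.random.default_rng(1)
Phi = rng.standard_normal((50, 4))
big = np.sort(np.linalg.eigvalsh(Phi @ Phi.T))[-4:]  # top 4 of the 50x50 problem
small = np.sort(np.linalg.eigvalsh(Phi.T @ Phi))     # all 4 of the 4x4 problem
print(np.allclose(big, small))  # True
```

The remaining 46 eigenvalues of the large product are (numerically) zero; they span the null space that the derivation above discards.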
After the nullB^⊥ space is obtained, the between-class and within-class scatter matrices are projected into the subspace via U = V\Lambda_b^{-1/2}, so that U^T S_b U = I and

    U^T\tilde{S}_w U = \left(E_m\Lambda_b^{-1/2}\right)^T\left(\Phi_b^T\tilde{S}_w\Phi_b\right)\left(E_m\Lambda_b^{-1/2}\right)

Using the kernel matrix K, \Phi_b^T\tilde{S}_w\Phi_b is kernelized:

    \Phi_b^T\tilde{S}_w\Phi_b = [\bar{\phi}'_1 \cdots \bar{\phi}'_c]^T\tilde{S}_w[\bar{\phi}'_1 \cdots \bar{\phi}'_c] = \left(\bar{\phi}'^T_i\tilde{S}_w\bar{\phi}'_j\right)_{i,j=1,\dots,c}

where

    \bar{\phi}'^T_i\tilde{S}_w\bar{\phi}'_j = \sum_{k=1}^{N}\bar{\phi}'^T_i\phi(x_k)\phi(x_k)^T\bar{\phi}'_j - \sum_{l=1}^{c}\frac{1}{c_l}\sum_{k=1}^{c_l}\sum_{m=1}^{c_l}A_{k,m}\,\bar{\phi}'^T_i\phi(x_k)\phi(x_m)^T\bar{\phi}'_j

The first term:

    J_1 = \frac{1}{N}B\Big(A_{LC}^T KK A_{LC} - \frac{1}{N}\big(A_{LC}^T KK 1_{LC}\big) - \frac{1}{N}\big(1_{LC}^T KK A_{LC}\big) + \frac{1}{N^2}\big(1_{LC}^T KK 1_{LC}\big)\Big)B

The second term:

    J_2 = \frac{1}{N}B\Big(A_{LC}^T KWK A_{LC} - \frac{1}{N}\big(A_{LC}^T KWK 1_{LC}\big) - \frac{1}{N}\big(1_{LC}^T KWK A_{LC}\big) + \frac{1}{N^2}\big(1_{LC}^T KWK 1_{LC}\big)\Big)B

where W = \mathrm{diag}[w_1 \cdots w_c] is an N×N block-diagonal matrix whose block w_l is a c_l×c_l matrix with elements A_{k,m}/c_l. Hence \Phi_b^T\tilde{S}_w\Phi_b = J_1 - J_2, which is also a c×c matrix. Then U^T\tilde{S}_w U is a simple (c−1)×(c−1) matrix; its eigenvectors p_i and eigenvalues λ'_i are computed and sorted in ascending order of eigenvalue, and the first m vectors are taken to form the transformation matrix Q = UP = U[p_1 \cdots p_m], where 1 ≤ m ≤ c−1, giving Q^T\tilde{S}_w Q = \Lambda_w, with \Lambda_w = \mathrm{diag}[\lambda'_1 \cdots \lambda'_m] an m×m diagonal matrix.

The optimal discriminant feature vectors of the within-class-preserving Fisher discriminant are \Gamma = Q\Lambda_w^{-1/2}; the transformed features form a low-dimensional subspace of the space H.
Further, in step (5), any speaker speech input pattern z to be classified is projected into the feature subspace according to Γ, computed as:

    y = \Gamma^T\phi(z) = \left(E_m\Lambda_b^{-1/2}P\Lambda_w^{-1/2}\right)^T\left(\Phi_b^T\phi(z)\right)

where

    \Phi_b^T\phi(z) = [\bar{\phi}'_1 \cdots \bar{\phi}'_c]^T\phi(z)

Since

    \bar{\phi}'^T_i\phi(z) = \Big(\sqrt{\tfrac{c_i}{N}}(\bar{\phi}_i-\bar{\phi})\Big)^T\phi(z) = \sqrt{\frac{c_i}{N}}\Big(\frac{1}{c_i}\sum_{m=1}^{c_i}\phi_{im}^T\phi(z) - \frac{1}{N}\sum_{p=1}^{c}\sum_{q=1}^{c_p}\phi_{pq}^T\phi(z)\Big)

it follows that

    \Phi_b^T\phi(z) = \frac{1}{\sqrt{N}}B\Big(A_{LC}^T\gamma(\phi(z)) - \frac{1}{N}1_{LC}^T\gamma(\phi(z))\Big)

where \gamma(\phi(z)) = [\phi_{11}^T\phi(z)\,|\,\phi_{12}^T\phi(z)\,|\,\cdots\,|\,\phi_{cc_c}^T\phi(z)]^T is an N×1 kernel vector. The feature vector value is:

    y = \frac{1}{\sqrt{N}}\left(E_m\Lambda_b^{-1/2}P\Lambda_w^{-1/2}\right)^T\Big(B\Big(A_{LC}^T - \frac{1}{N}1_{LC}^T\Big)\Big)\gamma(\phi(z))

The Euclidean distances between y and each class centre in the subspace are computed, and the nearest class is taken as the identification result.
Still further, in step (1), the pre-processing comprises sampling, noise removal, endpoint detection, pre-emphasis, framing and windowing.
The technical idea of the invention is as follows. Fisher Discriminant Analysis (FDA) projects the sample data of the d-dimensional input space onto a line such that the projections on this line have the best class separability. A speaker's pitch, timbre and volume take on varied forms at different times, and speech characteristic parameters are often nonlinear and multimodal, so applying the Fisher discriminant analysis method directly cannot achieve satisfactory recognition results.
Kernel Fisher Discriminant Analysis (KFDA) combines kernel learning methods with the idea of Fisher discriminant analysis. The idea of the KFDA algorithm is: first, through a nonlinear mapping, the input data are mapped into a high-dimensional kernel space; then, linear Fisher discriminant analysis is carried out in this high-dimensional kernel space, thereby realizing what is, relative to the original space, a nonlinear discriminant analysis. Although KFDA suits the nonlinear character of speaker identification, it considers only the global separability of the classified data and ignores the multimodal within-class distribution of a single speaker's speech vectors; it also needs an accelerated model-training algorithm to cope with the large data volumes of speaker identification.
The beneficial effects of the invention are: 1. The affinity between samples is incorporated into the within-class scatter matrix in the form of weights, yielding a within-class-preserving Fisher discriminant method. Applied to speaker identification, its identification rate is higher than that of traditional generative models (such as Gaussian mixture models) and similar to that of other discriminative models (such as support vector machines); however, a support vector machine is a binary classifier that can only perform multi-class classification by building multiple models voting in a "one-versus-rest" or "one-versus-one" manner, whereas the present method performs multi-class classification directly, making model construction more intuitive and faster. 2. The optimal projection classification vector of the within-class-feature-preserving kernelized Fisher discriminant model is sought in the subspace formed by the within-class null space and the orthogonal complement of the between-class null space, which makes the optimal-vector computation faster and suits the large training-sample situation of speaker identification.
Embodiment
The invention is described further below.
The framework of the present embodiment is as follows.
Part 1: Feature extraction
Feature extraction essentially adopts the prior art. First, a number of speech signals from each speaker at different times are collected and pre-processed, including sampling and quantization, centre clipping, pre-emphasis, silence removal, and windowed framing. Feature extraction is then performed on the pre-processed speech signals. The invention adopts Mel-frequency cepstral coefficients (MFCC): the 13th-order Mel cepstral parameters of each frame of the speech signal are extracted, the 0th-order parameter, which contributes little to describing speaker characteristics, is removed, and finally each frame is converted into a 12-dimensional Mel cepstral feature vector.
Part 2: The within-class-preserving Fisher discriminant model
The traditional kernel Fisher criterion is:

    J(\Phi) = \frac{\Phi^T S_b^\Phi \Phi}{\Phi^T S_w^\Phi \Phi}

where

    S_b^\Phi = \frac{1}{N}\sum_{l=1}^{c} c_l (\bar{\phi}_l-\bar{\phi})(\bar{\phi}_l-\bar{\phi})^T

    S_w^\Phi = \frac{1}{N}\sum_{l=1}^{c}\sum_{j=1}^{c_l}\left(\phi(x_j^l)-\bar{\phi}_l\right)\left(\phi(x_j^l)-\bar{\phi}_l\right)^T

are the between-class and within-class scatter matrices in the high-dimensional space H respectively; φ(x) is the projection of the input vector x into the high-dimensional space H, \bar{\phi}_l is the mean of the class-l samples in the high-dimensional space, and \bar{\phi} is the mean of all samples in the high-dimensional space. According to reproducing-kernel theory, Φ can be expressed in the form

    \Phi = \sum_{i=1}^{N}\alpha_i\,\phi(x_i)

so the original discriminant criterion is equivalent to the expression

    J(\alpha) = \frac{\alpha^T K_b \alpha}{\alpha^T K_w \alpha}

The kernel within-class scatter matrix K_w and the kernel between-class scatter matrix K_b are defined as follows:

    K_b = \frac{1}{N}\sum_{l=1}^{c} c_l (\mu_l-\mu_0)(\mu_l-\mu_0)^T

    K_w = \frac{1}{N}\sum_{l=1}^{c}\sum_{j=1}^{c_l}\left(\eta_{x_j^l}-\mu_l\right)\left(\eta_{x_j^l}-\mu_l\right)^T

where

    \eta_x = \left(k(x_1,x), \dots, k(x_N,x)\right)^T

    \mu_l = \left(\frac{1}{c_l}\sum_{m=1}^{c_l}k(x_1,x_m^l),\; \dots,\; \frac{1}{c_l}\sum_{m=1}^{c_l}k(x_N,x_m^l)\right)^T

    \mu_0 = \left(\frac{1}{N}\sum_{m=1}^{N}k(x_1,x_m),\; \dots,\; \frac{1}{N}\sum_{m=1}^{N}k(x_N,x_m)\right)^T

The optimal kernel discriminant vectors α_1, α_2, …, α_N are found from this formula according to the generalized Rayleigh quotient; the set of optimal discriminant vectors then constitutes the projection matrix in the feature space H, and the kernel Fisher discriminant feature of a vector x to be identified is obtained by projecting φ(x) onto it.
Given that a speaker's speech characteristic parameters exhibit within-class clustering and overlapping samples, applying the kernel Fisher discriminant method directly cannot achieve satisfactory recognition results. The invention proposes a within-class-feature-preserving Fisher discriminant method in which local data features are fused into the within-class scatter matrix.
The between-class scatter matrix S_b of the original Fisher discriminant analysis is kept unchanged, and the within-class scatter matrix is adjusted as follows:

    S_w = \sum_{l=1}^{c}\sum_{i=1}^{c_l}\Big(x_i - \frac{1}{c_l}\sum_{j=1}^{c_l}x_j\Big)\Big(x_i - \frac{1}{c_l}\sum_{j=1}^{c_l}x_j\Big)^T = \sum_{i=1}^{N}x_i x_i^T - \sum_{l=1}^{c}\frac{1}{c_l}\sum_{k=1}^{c_l}\sum_{m=1}^{c_l}x_k x_m^T

On the basis of this adjustment, the local data features within each class are incorporated, so that it becomes:

    \tilde{S}_w \equiv \sum_{i=1}^{N}x_i x_i^T - \sum_{l=1}^{c}\frac{1}{c_l}\sum_{k=1}^{c_l}\sum_{m=1}^{c_l}A_{k,m}\,x_k x_m^T

where A_{i,j} = \exp(-\|x_i-x_j\|^2/\sigma^2) and σ is an adjustable factor. That is, the distance factor between same-class data is incorporated into the within-class scatter matrix in weighted form: pairs of samples within a class are weighted so as to reduce the effect of long-distance same-class pairs on the within-class scatter matrix and emphasize the influence of neighbouring data, i.e. the within-class features are preserved. Applying \tilde{S}_w to Fisher discriminant analysis, the Fisher criterion formula becomes:

    J(\Phi) = \frac{\Phi^T S_b \Phi}{\Phi^T \tilde{S}_w \Phi}
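For reference, the classical kernel Fisher criterion J(α) quoted above can be sketched in numpy as follows. This is a sketch under stated assumptions, not the patent's accelerated algorithm: the affinity weighting of the within-class-preserving variant is deliberately omitted, and the regularization constant, function name and toy data are assumptions.

```python
import numpy as np

def kfda_alpha(K, y, reg=1e-6):
    # Build K_b and K_w from the kernel matrix K and labels y, then solve
    # the regularized eigenproblem (K_w + reg I)^-1 K_b alpha = lambda alpha,
    # i.e. maximize the generalized Rayleigh quotient J(alpha).
    N = len(y)
    mu0 = K.mean(axis=1, keepdims=True)                 # mu_0
    Kb = np.zeros((N, N))
    Kw = np.zeros((N, N))
    for l in np.unique(y):
        idx = np.flatnonzero(y == l)
        mul = K[:, idx].mean(axis=1, keepdims=True)     # mu_l
        Kb += len(idx) / N * (mul - mu0) @ (mul - mu0).T
        eta = K[:, idx] - mul                           # eta_{x_j^l} - mu_l
        Kw += eta @ eta.T / N
    vals, vecs = np.linalg.eig(np.linalg.solve(Kw + reg * np.eye(N), Kb))
    order = np.argsort(-vals.real)
    return vals.real[order], vecs.real[:, order]

# Toy two-speaker data with an RBF kernel.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 0.3, (8, 2)), rng.normal(3, 0.3, (8, 2))])
y = np.repeat([0, 1], 8)
D2 = ((X[:, None] - X[None, :]) ** 2).sum(-1)
K = np.exp(-D2)
vals, alphas = kfda_alpha(K, y)
proj = K @ alphas[:, 0]   # 1-D kernel Fisher feature for each sample
```

Note that this direct route solves an N×N eigenproblem; the derivation of Part 3 below avoids exactly this cost.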
Part 3: Obtaining the optimal Fisher projection vectors
Directly computing the optimal within-class-preserving Fisher projection, as in ordinary kernel Fisher discriminant methods, requires the eigenvector corresponding to the largest eigenvalue of the matrix K_w^{-1}K_b. Applied to speaker identification, where the sample feature vectors easily number in the thousands, the computational load is enormous and real-time use is impossible, so the training algorithm must be improved.
For the optimal solution of the within-class-preserving Fisher discriminant method, i.e. the optimal projection vector group computed according to the Fisher criterion formula, let nullB and null\tilde{W} denote the null spaces of S_b and \tilde{S}_w respectively; the optimal discriminant subspace is then taken from nullB^⊥ ∩ null\tilde{W}, where nullB^⊥ is the orthogonal complement of nullB. First S_b is projected onto nullB^⊥; S_b is rewritten as

    S_b = \sum_{i=1}^{c}\left(\sqrt{\tfrac{c_i}{N}}(\bar{\phi}_i-\bar{\phi})\right)\left(\sqrt{\tfrac{c_i}{N}}(\bar{\phi}_i-\bar{\phi})\right)^T = \sum_{i=1}^{c}\bar{\phi}'_i\,\bar{\phi}'^T_i = \Phi_b\Phi_b^T

where \bar{\phi}'_i = \sqrt{c_i/N}\,(\bar{\phi}_i-\bar{\phi}) and \Phi_b = [\bar{\phi}'_1 \cdots \bar{\phi}'_c]. The rank of S_b is c−1; \Phi_b\Phi_b^T and \Phi_b^T\Phi_b have the same nonzero eigenvalues, and the eigensubspace corresponding to the zero eigenvalues is the null space of S_b and must be filtered out. Using \Phi_b^T\Phi_b in place of \Phi_b\Phi_b^T and applying the kernel trick in the derivation therefore simplifies the computation.
    \Phi_b^T\Phi_b = [\bar{\phi}'_1 \cdots \bar{\phi}'_c]^T[\bar{\phi}'_1 \cdots \bar{\phi}'_c] = \left(\bar{\phi}'^T_i\bar{\phi}'_j\right)_{i,j=1,\dots,c}

where

    \bar{\phi}'^T_i\bar{\phi}'_j = \frac{\sqrt{c_i c_j}}{N}\left(\bar{\phi}_i^T\bar{\phi}_j - \bar{\phi}_i^T\bar{\phi} - \bar{\phi}^T\bar{\phi}_j + \bar{\phi}^T\bar{\phi}\right)

Each term of this formula is converted to matrix form using the kernel function:

    \Phi_b^T\Phi_b = \frac{1}{N}\,B\Big(A_{LC}^T K A_{LC} - \frac{1}{N}\big(A_{LC}^T K 1_{LC}\big) - \frac{1}{N}\big(1_{LC}^T K A_{LC}\big) + \frac{1}{N^2}\big(1_{LC}^T K 1_{LC}\big)\Big)B

where B = \mathrm{diag}[\sqrt{c_1} \cdots \sqrt{c_c}], 1_{LC} is an N×c matrix whose elements are all 1, A_{LC} = \mathrm{diag}[a_{c_1} \cdots a_{c_c}] is an N×c block-diagonal matrix whose block a_{c_i} is a c_i×1 column vector with all elements 1/c_i, and K is the kernel matrix of the input feature vectors.
Let λ_i and e_i (i = 1, …, c) be the i-th eigenvalue and eigenvector of \Phi_b^T\Phi_b, sorted in descending order of eigenvalue; then v_i = \Phi_b e_i is an eigenvector of the original between-class scatter matrix S_b. To obtain the optimal projection, the null space of S_b must be removed, i.e. the eigenvectors whose eigenvalues are zero are discarded and the first c−1 eigenvectors v_i are kept: V = [v_1 \cdots v_{c-1}] = \Phi_b E_m, with E_m = [e_1 \cdots e_{c-1}]; then V^T S_b V = \Lambda_b, where \Lambda_b = \mathrm{diag}[\lambda_1 \cdots \lambda_{c-1}] is a (c−1)×(c−1) diagonal matrix.
Obtain nullB Behind the space, divergence matrix between class scatter matrix and the class is complied with U = V &Lambda; b - 1 / 2 The subspace is entered in projection, wherein U TS bU=I, U T S ~ w U = ( E m &Lambda; b - 1 / 2 ) T ( &Phi; b T S ~ w &Phi; b ) ( E m &Lambda; b - 1 / 2 ) , Utilize nuclear matrix K, will
Figure G2009101525909D00116
Carry out the coring conversion:
&Phi; b T S ~ w &Phi; b = [ &phi; &OverBar; &prime; 1 &CenterDot; &CenterDot; &CenterDot; &phi; &OverBar; &prime; c ] T S ~ w [ &phi; &OverBar; &prime; 1 &CenterDot; &CenterDot; &CenterDot; &phi; &OverBar; &prime; c ] = ( &phi; &OverBar; &prime; i T S ~ w &phi; &OverBar; &prime; j ) i = 1 , &CenterDot; &CenterDot; &CenterDot; , c j = 1 , &CenterDot; &CenterDot; &CenterDot; , c
Wherein &phi; &OverBar; &prime; i T S ~ w &phi; &OverBar; &prime; j = &Sigma; i = 1 n &phi; &OverBar; &prime; i T &phi; ( x i ) &phi; ( x i ) T &phi; &OverBar; &prime; j - &Sigma; l = 1 c 1 n l &Sigma; k = 1 n l &Sigma; m = 1 n l A k , m &phi; &OverBar; &prime; i T x k x m T &phi; &OverBar; &prime; j .
Respectively with behind the matrix computations formal representation two be:
&Sigma; i = 1 n &phi; &OverBar; &prime; i T &phi; ( x i ) &phi; ( x i ) T &phi; &OverBar; &prime; j = J 1 = 1 L B ( A LC T KK A LC - 1 L ( A LC T KK 1 LC ) -
1 L ( 1 LC T KK A LC ) + 1 L 2 ( 1 LC T KK 1 LC ) ) B
&Sigma; l = 1 c 1 c l &Sigma; k = 1 n l &Sigma; m = 1 n l A k , m &phi; &OverBar; &prime; i T x k x m T &phi; &OverBar; &prime; j = J 2 = 1 L B ( A LC T KWK A LC - 1 L ( A LC T KWK 1 LC )
- 1 L ( 1 LC T KWK A LC ) + 1 L 2 ( 1 LC T KWK 1 LC ) ) B
W=diag[w in the following formula 1W c] be a N * N partitioned matrix, w iBe that an element is
Figure G2009101525909D001113
C i* c iMatrix, therefore &Phi; b T S ~ w &Phi; b = J 1 - J 2 , It also is a c * c matrix.Then
$U^T\tilde S_wU$ is then simply a $(c-1)\times(c-1)$ matrix. Its eigenvectors $p_i$ and eigenvalues $\lambda'_i$ are computed and arranged in ascending order, and the first $m$ vectors are taken to form the feature-extraction transformation matrix $Q=UP=U[p_1\cdots p_m]$, where $1\le m\le c-1$. This gives $Q^T\tilde S_wQ=\Lambda_w$, where $\Lambda_w=\mathrm{diag}[\lambda'_1\cdots\lambda'_m]$ is an $m\times m$ diagonal matrix.
In summary, the optimal discriminant transformation that preserves the within-class structure in the Fisher discriminant is $\Gamma=Q\Lambda_w^{-1/2}$. The transformed features form a low-dimensional subspace of the feature space $H$ with maximal separability. The center of each class after projection into this subspace is computed and normalized, in preparation for the speaker identification step.
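Composing $\Gamma$'s coefficient matrix $E_m\Lambda_b^{-1/2}P\Lambda_w^{-1/2}$ from the two kernelized $c\times c$ matrices can be sketched as follows (a minimal sketch under the assumption that `M` stands for $\Phi_b^T\Phi_b$ and `Swb` for $\Phi_b^T\tilde S_w\Phi_b$; the function name is hypothetical):

```python
import numpy as np

def discriminant_coefficients(M, Swb, m=None, tol=1e-10):
    """Return the coefficient matrix E_m Lam_b^{-1/2} P Lam_w^{-1/2}."""
    lam, E = np.linalg.eigh(M)
    order = np.argsort(lam)[::-1]
    lam, E = lam[order], E[:, order]
    keep = lam > tol                          # drop the null space of S_b
    U_side = E[:, keep] / np.sqrt(lam[keep])  # E_m Lam_b^{-1/2}
    T = U_side.T @ Swb @ U_side               # (c-1)x(c-1) matrix U^T S_w U
    lam_w, P = np.linalg.eigh(T)              # ascending order, as in the text
    if m is not None:
        lam_w, P = lam_w[:m], P[:, :m]
    return U_side @ (P / np.sqrt(lam_w))      # append Lam_w^{-1/2}

# toy check with a synthetic rank-(c-1) M and a positive-definite Swb
np.random.seed(1)
c = 4
Phi = np.random.randn(8, c)
Phi -= Phi.mean(axis=1, keepdims=True)
M = Phi.T @ Phi
R = np.random.randn(c, c)
Swb = R @ R.T
Coef = discriminant_coefficients(M, Swb, m=2)
```

The whitening property $\Gamma^T\tilde S_w\Gamma=I$ translates into `Coef.T @ Swb @ Coef` being the identity, which is a convenient sanity check on the composition.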
Part 4: Speaker identification
Any speech input pattern $z$ of a speaker to be classified is projected into the feature subspace via $\Gamma$, computed as follows:

$$y=\Gamma^T\varphi(z)=(E_m\Lambda_b^{-1/2}P\Lambda_w^{-1/2})^T(\Phi_b^T\varphi(z))$$

where:

$$\Phi_b^T\varphi(z)=[\bar\varphi'_1\cdots\bar\varphi'_c]^T\varphi(z)$$

Since

$$\bar\varphi'^T_i\varphi(z)=\Big(\sqrt{\tfrac{c_i}{N}}(\bar\varphi_i-\bar\varphi)\Big)^T\varphi(z)=\sqrt{\tfrac{c_i}{N}}\Big(\frac{1}{c_i}\sum_{m=1}^{c_i}\varphi_{im}^T\varphi(z)-\frac{1}{N}\sum_{p=1}^{c}\sum_{q=1}^{c_p}\varphi_{pq}^T\varphi(z)\Big)$$

it follows that:

$$\Phi_b^T\varphi(z)=\frac{1}{\sqrt{N}}B\Big(A_{LC}^T\gamma(\varphi(z))-\frac{1}{N}1_{LC}^T\gamma(\varphi(z))\Big)$$

where $\gamma(\varphi(z))=[\varphi_{11}^T\varphi(z)\,|\,\varphi_{12}^T\varphi(z)\,|\cdots|\,\varphi_{cc_c}^T\varphi(z)]^T$ is an $N\times 1$ kernel vector. The final feature vector value is:

$$y=\frac{1}{\sqrt{N}}(E_m\Lambda_b^{-1/2}P\Lambda_w^{-1/2})^T\Big(B\big(A_{LC}^T-\frac{1}{N}1_{LC}^T\big)\Big)\gamma(\varphi(z))$$
The Euclidean distance between $y$ and each class center point in the subspace is computed, and the nearest one is taken as the recognition result.
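The final decision rule is a nearest-class-center classifier, which can be sketched directly (the centers and query point below are illustrative toy values):

```python
import numpy as np

def classify(y, centers):
    """Return the index of the projected class center closest to y
    in Euclidean distance."""
    return int(np.argmin(np.linalg.norm(centers - y, axis=1)))

# three hypothetical class centers in a 2-D projected subspace
centers = np.array([[0.0, 0.0], [5.0, 5.0], [0.0, 5.0]])
label = classify(np.array([4.2, 4.9]), centers)   # nearest is center 1
```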
Experimental validation: the experiments use a self-recorded corpus of 20 speakers, 12 male and 8 female. The data were acquired by A/D conversion at a sampling rate of 8000 Hz, 16-bit quantization, single channel. Each speaker's speech was recorded over several sessions. For each speaker, speech segments from different sessions totaling 15 s were mixed as the training signal, and 20 segments of 1.5 s from different sessions were used as test signals, giving 20 training utterances and 400 test utterances in total. The speech signals were first preprocessed by pre-emphasis and mean removal, then voice activity detection (VAD) was applied to extract the effective speech segments and remove redundant silence; the signals were split into 30 ms frames, from which 12-dimensional MFCC feature parameters were extracted as classification features.
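The preprocessing chain described above (pre-emphasis, mean removal, voice-activity detection, 30 ms framing) can be sketched as follows; the energy threshold stands in for the unspecified VAD, and all parameter values other than the 8000 Hz rate and 30 ms frames are assumptions:

```python
import numpy as np

def preemphasis(x, alpha=0.97):
    """High-frequency boost: y[n] = x[n] - alpha * x[n-1]."""
    return np.append(x[0], x[1:] - alpha * x[:-1])

def frame_signal(x, sr, frame_ms=30):
    """Split into non-overlapping frames of frame_ms milliseconds."""
    n = int(sr * frame_ms / 1000)
    return x[: (len(x) // n) * n].reshape(-1, n)

def energy_vad(frames, ratio=0.1):
    """Keep frames whose energy exceeds ratio * max frame energy
    (a crude stand-in for the VAD step; the patent does not fix one)."""
    e = (frames ** 2).sum(axis=1)
    return frames[e > ratio * e.max()]

sr = 8000
t = np.arange(sr) / sr                              # 1 s of a 200 Hz tone
x = np.concatenate([np.zeros(sr // 2), np.sin(2 * np.pi * 200 * t)])
x = x - x.mean()                                    # mean removal
active = energy_vad(frame_signal(preemphasis(x), sr))
```

The surviving `active` frames would then be passed to MFCC extraction to obtain the 12-dimensional feature vectors.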
Comparative speaker identification tests were carried out with GMM, SVM, and the proposed method; the proposed method and the SVM use the same radial basis kernel function. The resulting error rates are: Gaussian mixture model, 3.5%; support vector machine, 2.75%; within-class-preserving kernel Fisher discriminant, 2.5%. The proposed method therefore achieves a better recognition rate than the classic methods.

Claims (4)

1. A speaker recognition implementation method based on a within-class-preserving kernel Fisher discriminant method, characterized in that the speaker recognition implementation method comprises the following steps:
(1) speech signal preprocessing: the speech signal is preprocessed;
(2) feature parameter extraction: after framing and endpoint detection of the speech signal, Mel cepstral parameters are extracted as the speaker feature vectors; the Mel cepstral parameters are 13th-order cepstral parameters, from which the 0th-order parameter, which contributes little to describing speaker characteristics, is removed, so that each frame of the speech signal is converted into a 12-dimensional Mel cepstral feature vector;
(3) speaker recognition model construction:
Let $x_i\in R^d$ ($i=1,2,\dots,N$) be the $d$-dimensional sample data and $y_i\in\{1,2,\dots,c\}$ the corresponding class labels, where $N$ is the total number of samples, $c$ is the total number of classes, and $c_l$ is the number of samples in class $l$; then

$$\sum_{l=1}^{c}c_l=N$$

$X$ is the sample matrix, namely:

$$X\equiv(x_1|x_2|\cdots|x_N)$$

Based on the above conditions, the speaker recognition model is established as the Fisher criterion

$$T_{\mathrm{opt}}=\arg\max_T\frac{|T^TS_bT|}{|T^T\tilde S_wT|}$$

where

$$S_b=\frac{1}{N}\sum_{i=1}^{c}c_i(\bar x_i-\bar x)(\bar x_i-\bar x)^T$$

is the between-class scatter matrix,

$$\tilde S_w\equiv\sum_{i=1}^{N}x_ix_i^T-\sum_{l=1}^{c}\frac{1}{c_l}\sum_{k=1}^{c_l}\sum_{m=1}^{c_l}A_{k,m}x_kx_m^T$$

is the within-class scatter matrix, the affinity matrix is

$$A_{i,j}=\exp\Big(-\frac{\|x_i-x_j\|^2}{\sigma^2}\Big)$$

in which $\sigma$ is an adjustable constant factor, $\bar x_i$ is the mean of the class-$i$ samples, $\bar x$ is the mean of all samples, and $T$ is the optimal projection matrix to be solved for;
(4) computation of the model's optimal projection vectors:
The optimal solution of the LWFD method is adopted, i.e., the optimal projection vectors are computed according to the criterion in step (3). Let $\mathrm{null}(S_b)$ and $\mathrm{null}(\tilde S_w)$ denote the null spaces of $S_b$ and $\tilde S_w$ respectively; the optimal discriminant subspace is taken from $\mathrm{null}(S_b)^\perp$, the orthogonal complement of $\mathrm{null}(S_b)$. First $S_b$ is projected onto $\mathrm{null}(S_b)^\perp$; once this subspace is obtained, the between-class and within-class scatter matrices are both projected into it, and the vectors of the resulting subspace are the optimal discriminant feature vectors;
(5) speaker identification:
According to the optimal projection matrix $T$, the original data $x_i$ are projected as $y_i\in R^r$ ($1\le r\le d$), where $r$ is the reduced dimensionality, using the transformation matrix $T$:

$$y_i=T^Tx_i$$

The optimal classification projection dimensionality of the original $c$-class data space is $c-1$. Afterwards the center of each class after projection is computed and normalized; the data to be classified are projected into the subspace and normalized, their Euclidean distances to each class center point in the subspace are computed, and the nearest one is taken as the recognition result.
2. The speaker recognition implementation method based on a within-class-preserving kernel Fisher discriminant method according to claim 1, characterized in that in step (4), the optimal discriminant feature vectors are obtained as follows:
First $S_b$ is projected onto $\mathrm{null}(S_b)^\perp$, rewriting the expression of $S_b$ as:

$$S_b=\sum_{i=1}^{c}\Big(\sqrt{\tfrac{c_i}{N}}(\bar\varphi_i-\bar\varphi)\Big)\Big(\sqrt{\tfrac{c_i}{N}}(\bar\varphi_i-\bar\varphi)\Big)^T=\sum_{i=1}^{c}\bar\varphi'_i\bar\varphi'^T_i=\Phi_b\Phi_b^T$$

where $\bar\varphi_i$ is the mean of the class-$i$ samples in the high-dimensional space, $\bar\varphi$ is the mean of all samples in the high-dimensional space, $\bar\varphi'_i=\sqrt{c_i/N}(\bar\varphi_i-\bar\varphi)$, and $\Phi_b=[\bar\varphi'_1\cdots\bar\varphi'_c]$. The rank of the matrix $S_b$ is $c-1$; $\Phi_b\Phi_b^T$ and $\Phi_b^T\Phi_b$ have the same nonzero eigenvalues, and the eigen-subspace corresponding to the zero eigenvalues is the null space of $S_b$. Therefore $\Phi_b^T\Phi_b$ is used in place of $\Phi_b\Phi_b^T$, and the kernel trick is applied to the derivation:

$$\Phi_b^T\Phi_b=[\bar\varphi'_1\cdots\bar\varphi'_c]^T[\bar\varphi'_1\cdots\bar\varphi'_c]=\big(\bar\varphi'^T_i\bar\varphi'_j\big)_{i=1,\dots,c}^{j=1,\dots,c}$$

where:

$$\bar\varphi'^T_i\bar\varphi'_j=\frac{\sqrt{c_ic_j}}{N}\big(\bar\varphi_i^T\bar\varphi_j-\bar\varphi_i^T\bar\varphi-\bar\varphi^T\bar\varphi_j+\bar\varphi^T\bar\varphi\big)$$

Converting each term in the above formula by means of the kernel function gives:

$$\Phi_b^T\Phi_b=\frac{1}{N}B\Big(A_{LC}^TKA_{LC}-\frac{1}{N}A_{LC}^TK1_{LC}-\frac{1}{N}1_{LC}^TKA_{LC}+\frac{1}{N^2}1_{LC}^TK1_{LC}\Big)B$$

where $B=\mathrm{diag}[\sqrt{c_1}\cdots\sqrt{c_c}]$, $1_{LC}$ is an $N\times c$ matrix whose elements are all 1, $A_{LC}=\mathrm{diag}[a_{c_1}\cdots a_{c_c}]$ is an $N\times c$ block-diagonal matrix whose block $a_{c_i}$ is a $c_i\times 1$ column vector with all elements equal to $1/c_i$, and $K$ is the kernel matrix of the input feature vectors.
Let $\lambda_i$ and $e_i$ ($i=1,\dots,c$) be the $i$-th eigenvalue and eigenvector of $\Phi_b^T\Phi_b$, with the eigenvalues sorted in descending order; then $v_i=\Phi_b e_i$ is an eigenvector of the original between-class scatter matrix $S_b$. The null space of $S_b$ is removed, i.e., the eigenvectors whose eigenvalues are zero are discarded and only the first $c-1$ of the $v_i$ are kept: $V=[v_1\cdots v_{c-1}]=\Phi_bE_m$, $E_m=[e_1\cdots e_{c-1}]$; then $V^TS_bV=\Lambda_b$, where $\Lambda_b=\mathrm{diag}[\lambda_1\cdots\lambda_{c-1}]$ is a $(c-1)\times(c-1)$ diagonal matrix.
After the subspace $\mathrm{null}(S_b)^\perp$ is obtained, the between-class and within-class scatter matrices are projected into it through $U=V\Lambda_b^{-1/2}$, where $U^TS_bU=I$ and $U^T\tilde S_wU=(E_m\Lambda_b^{-1/2})^T(\Phi_b^T\tilde S_w\Phi_b)(E_m\Lambda_b^{-1/2})$. Using the kernel matrix $K$, $\Phi_b^T\tilde S_w\Phi_b$ is kernelized:

$$\Phi_b^T\tilde S_w\Phi_b=[\bar\varphi'_1\cdots\bar\varphi'_c]^T\tilde S_w[\bar\varphi'_1\cdots\bar\varphi'_c]=\big(\bar\varphi'^T_i\tilde S_w\bar\varphi'_j\big)_{i=1,\dots,c}^{j=1,\dots,c}$$

where:

$$\bar\varphi'^T_i\tilde S_w\bar\varphi'_j=\sum_{k=1}^{N}\bar\varphi'^T_i\varphi(x_k)\varphi(x_k)^T\bar\varphi'_j-\sum_{l=1}^{c}\frac{1}{c_l}\sum_{k=1}^{c_l}\sum_{m=1}^{c_l}A_{k,m}\,\bar\varphi'^T_i\varphi(x_k)\varphi(x_m)^T\bar\varphi'_j$$

The first term:

$$\sum_{k=1}^{N}\bar\varphi'^T_i\varphi(x_k)\varphi(x_k)^T\bar\varphi'_j=J_1=\frac{1}{N}B\Big(A_{LC}^TKKA_{LC}-\frac{1}{N}A_{LC}^TKK1_{LC}-\frac{1}{N}1_{LC}^TKKA_{LC}+\frac{1}{N^2}1_{LC}^TKK1_{LC}\Big)B$$

The second term:

$$\sum_{l=1}^{c}\frac{1}{c_l}\sum_{k=1}^{c_l}\sum_{m=1}^{c_l}A_{k,m}\,\bar\varphi'^T_i\varphi(x_k)\varphi(x_m)^T\bar\varphi'_j=J_2=\frac{1}{N}B\Big(A_{LC}^TKWKA_{LC}-\frac{1}{N}A_{LC}^TKWK1_{LC}-\frac{1}{N}1_{LC}^TKWKA_{LC}+\frac{1}{N^2}1_{LC}^TKWK1_{LC}\Big)B$$

In these formulas $W=\mathrm{diag}[w_1\cdots w_c]$ is an $N\times N$ block-diagonal matrix in which $w_l$ is a $c_l\times c_l$ matrix whose $(k,m)$ element is $A_{k,m}/c_l$; therefore $\Phi_b^T\tilde S_w\Phi_b=J_1-J_2$, which is also a $c\times c$ matrix. $U^T\tilde S_wU$ is then simply a $(c-1)\times(c-1)$ matrix; its eigenvectors $p_i$ and eigenvalues $\lambda'_i$ are computed and arranged in ascending order, and the first $m$ vectors are taken to form the feature-extraction transformation matrix $Q=UP=U[p_1\cdots p_m]$, where $1\le m\le c-1$, giving $Q^T\tilde S_wQ=\Lambda_w$, where $\Lambda_w=\mathrm{diag}[\lambda'_1\cdots\lambda'_m]$ is an $m\times m$ diagonal matrix;
The optimal discriminant transformation preserving the within-class structure in the Fisher discriminant is $\Gamma=Q\Lambda_w^{-1/2}$; the transformed features form a low-dimensional subspace of the feature space $H$.
3. The speaker recognition implementation method based on a within-class-preserving kernel Fisher discriminant method according to claim 1 or 2, characterized in that in step (5), any speech input pattern $z$ of a speaker to be classified is projected into the feature subspace via $\Gamma$, computed as follows:

$$y=\Gamma^T\varphi(z)=(E_m\Lambda_b^{-1/2}P\Lambda_w^{-1/2})^T(\Phi_b^T\varphi(z))$$

where:

$$\Phi_b^T\varphi(z)=[\bar\varphi'_1\cdots\bar\varphi'_c]^T\varphi(z)$$

and $\varphi(z)$ is the feature vector corresponding to the input vector $z$ in the high-dimensional space $H$. Since

$$\bar\varphi'^T_i\varphi(z)=\Big(\sqrt{\tfrac{c_i}{N}}(\bar\varphi_i-\bar\varphi)\Big)^T\varphi(z)=\sqrt{\tfrac{c_i}{N}}\Big(\frac{1}{c_i}\sum_{m=1}^{c_i}\varphi_{im}^T\varphi(z)-\frac{1}{N}\sum_{p=1}^{c}\sum_{q=1}^{c_p}\varphi_{pq}^T\varphi(z)\Big)$$

it follows that:

$$\Phi_b^T\varphi(z)=\frac{1}{\sqrt{N}}B\Big(A_{LC}^T\gamma(\varphi(z))-\frac{1}{N}1_{LC}^T\gamma(\varphi(z))\Big)$$

where $\gamma(\varphi(z))=[\varphi_{11}^T\varphi(z)\,|\,\varphi_{12}^T\varphi(z)\,|\cdots|\,\varphi_{cc_c}^T\varphi(z)]^T$ is an $N\times 1$ kernel vector. The feature vector value is:

$$y=\frac{1}{\sqrt{N}}(E_m\Lambda_b^{-1/2}P\Lambda_w^{-1/2})^T\Big(B\big(A_{LC}^T-\frac{1}{N}1_{LC}^T\big)\Big)\gamma(\varphi(z))$$

The Euclidean distance between $y$ and each class center point in the subspace is computed, and the nearest one is taken as the recognition result.
4. The speaker recognition implementation method based on a within-class-preserving kernel Fisher discriminant method according to claim 3, characterized in that in step (1), the preprocessing comprises sampling, noise removal, endpoint detection, pre-emphasis, framing, and windowing.
CN200910152590A 2009-09-17 2009-09-17 Method for distinguishing speakers based on protective kernel Fisher distinguishing method Pending CN101650944A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN200910152590A CN101650944A (en) 2009-09-17 2009-09-17 Method for distinguishing speakers based on protective kernel Fisher distinguishing method


Publications (1)

Publication Number Publication Date
CN101650944A true CN101650944A (en) 2010-02-17

Family

ID=41673165

Family Applications (1)

Application Number Title Priority Date Filing Date
CN200910152590A Pending CN101650944A (en) 2009-09-17 2009-09-17 Method for distinguishing speakers based on protective kernel Fisher distinguishing method

Country Status (1)

Country Link
CN (1) CN101650944A (en)


Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103077405A (en) * 2013-01-18 2013-05-01 浪潮电子信息产业股份有限公司 Bayes classification method based on Fisher discriminant analysis
CN106683661A (en) * 2015-11-05 2017-05-17 阿里巴巴集团控股有限公司 Role separation method and device based on voice
CN106128466A (en) * 2016-07-15 2016-11-16 腾讯科技(深圳)有限公司 Identity vector processing method and device
US10650830B2 (en) 2016-07-15 2020-05-12 Tencent Technology (Shenzhen) Company Limited Identity vector processing method and computer device
CN106128466B (en) * 2016-07-15 2019-07-05 腾讯科技(深圳)有限公司 Identity vector processing method and device
CN106297825A (en) * 2016-07-25 2017-01-04 华南理工大学 A kind of speech-emotion recognition method based on integrated degree of depth belief network
CN106326927B (en) * 2016-08-24 2019-06-04 大连海事大学 A kind of shoes print new category detection method
CN106326927A (en) * 2016-08-24 2017-01-11 大连海事大学 Shoeprint new class detection method
CN107274888A (en) * 2017-06-14 2017-10-20 大连海事大学 A kind of Emotional speech recognition method based on octave signal intensity and differentiation character subset
CN107274888B (en) * 2017-06-14 2020-09-15 大连海事大学 Emotional voice recognition method based on octave signal strength and differentiated feature subset
CN109389017A (en) * 2017-08-11 2019-02-26 苏州经贸职业技术学院 Pedestrian's recognition methods again
CN109389017B (en) * 2017-08-11 2021-11-16 苏州经贸职业技术学院 Pedestrian re-identification method
CN107633845A (en) * 2017-09-11 2018-01-26 清华大学 A kind of duscriminant local message distance keeps the method for identifying speaker of mapping
CN110163034A (en) * 2018-02-27 2019-08-23 冷霜 A kind of listed method of aircraft surface positioning extracted based on optimal characteristics
CN110163034B (en) * 2018-02-27 2021-07-23 山东炎黄工业设计有限公司 Aircraft ground positioning and listing method based on optimal feature extraction
CN112949671A (en) * 2019-12-11 2021-06-11 中国科学院声学研究所 Signal classification method and system based on unsupervised feature optimization
CN112949671B (en) * 2019-12-11 2023-06-30 中国科学院声学研究所 Signal classification method and system based on unsupervised feature optimization
CN115268417A (en) * 2022-09-29 2022-11-01 南通艾美瑞智能制造有限公司 Self-adaptive ECU fault diagnosis control method
CN115268417B (en) * 2022-09-29 2022-12-16 南通艾美瑞智能制造有限公司 Self-adaptive ECU fault diagnosis control method

Similar Documents

Publication Publication Date Title
CN101650944A (en) Method for distinguishing speakers based on protective kernel Fisher distinguishing method
Yu et al. Spoofing detection in automatic speaker verification systems using DNN classifiers and dynamic acoustic features
CN1975856B (en) Speech emotion identifying method based on supporting vector machine
CN111243602B (en) Voiceprint recognition method based on gender, nationality and emotion information
CN106503805A (en) A kind of bimodal based on machine learning everybody talk with sentiment analysis system and method
CN101923855A (en) Test-irrelevant voice print identifying system
CN102968990B (en) Speaker identifying method and system
CN103544963A (en) Voice emotion recognition method based on core semi-supervised discrimination and analysis
CN105261367B (en) A kind of method for distinguishing speek person
CN103456302B (en) A kind of emotional speaker recognition method based on the synthesis of emotion GMM Model Weight
CN101226743A (en) Method for recognizing speaker based on conversion of neutral and affection sound-groove model
CN111724770B (en) Audio keyword identification method for generating confrontation network based on deep convolution
CN102982803A (en) Isolated word speech recognition method based on HRSF and improved DTW algorithm
CN105825852A (en) Oral English reading test scoring method
CN102789779A (en) Speech recognition system and recognition method thereof
CN106531174A (en) Animal sound recognition method based on wavelet packet decomposition and spectrogram features
CN106601230A (en) Logistics sorting place name speech recognition method, system and logistics sorting system based on continuous Gaussian mixture HMM
CN110070895A (en) A kind of mixed sound event detecting method based on supervision variation encoder Factor Decomposition
CN102592593B (en) Emotional-characteristic extraction method implemented through considering sparsity of multilinear group in speech
CN105609117A (en) Device and method for identifying voice emotion
CN104464738B (en) A kind of method for recognizing sound-groove towards Intelligent mobile equipment
Iqbal et al. Mfcc and machine learning based speech emotion recognition over tess and iemocap datasets
CN101650945B (en) Method for recognizing speaker based on multivariate core logistic regression model
Ye et al. Phoneme classification using naive bayes classifier in reconstructed phase space
Chen et al. Automatic recognition of bird songs using time-frequency texture

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C12 Rejection of a patent application after its publication
RJ01 Rejection of invention patent application after publication

Application publication date: 20100217