CN1758263A - Multi-model ID recognition method based on scoring difference weight compromised


Info

Publication number: CN1758263A
Application number: CN200510061359.0A
Other versions: CN100363938C (en)
Authority: CN (China)
Inventors: 吴朝晖, 杨莹春, 李东东
Assignee: Zhejiang University ZJU
Application filed by Zhejiang University ZJU; priority to CNB2005100613590A
Publication of CN1758263A; application granted; publication of CN100363938C
Legal status: Granted; Expired - Fee Related

Landscapes

  • Image Analysis (AREA)

Abstract

This invention relates to a multi-modal identity recognition method based on score-difference weighted fusion. It first uses a group of speaker sample data and obtains, from a conventional single-mode classifier, the score of every speaker model against every sample template. Whenever the highest-scoring model and the sample belong to different speakers, the difference between the two scores is recorded; all such differences within a single classifier are then accumulated, and the accumulated score differences of the classifiers determine the weight of each modality. The benefit is cross-validated identity recognition from multiple biometric traits: a corrective weighting algorithm based on score differences, SDWS, fuses the two biometric recognition modalities and combines their recognition results.

Description

Multi-modal personal identification method based on score-difference weighted fusion
Technical field
The present invention relates to multiple-classifier fusion technology, in particular to a multi-modal personal identification method based on score-difference weighted fusion.
Background technology
In real-life applications, authenticating identity is a complicated job, since it must reach high performance while remaining strongly robust. Biometric technology takes a person's own physical features as the basis of authentication. It differs fundamentally from traditional authentication techniques based on "something you have" or "something you know": the person genuinely serves as his own authentic representative.
Among the many biometric technologies, identification based on voice and identification based on images are the two most popular methods today. Voiceprint recognition, i.e. speaker identification, cannot be lost or forgotten and is convenient, economical, and accurate; face recognition is active, non-intrusive, and user-friendly. Used separately, however, each method is constrained by some performance ceiling or shows instability. Fusing information to combine the advantages of each sub-modality is therefore a valid approach to improving the reliability of recognition.
At present nearly all multi-modal recognition methods perform fusion at the decision level. By fusion rule, decision-level fusion generally follows two strategies. One is fixed-parameter fusion, such as averaging, voting, and addition; the other requires parameter training, such as Dempster-Shafer, the behavior-knowledge space, and naive Bayes.
Fixed-parameter fusion methods are strongly affected by how well the classifiers happen to pair, while the quality and size of the training set often prevent trained decision-level fusion methods from reaching their theoretical effect.
Summary of the invention
The present invention addresses the above defects and provides a multi-modal personal identification method based on score-difference weighted fusion. By studying the identification scores of the single classifiers and taking the score difference between the recognized class and the true class as the basis for the weights, a new weight-training method, score-difference-based weighting SDWS (Scores Difference-Based Weighted Sum Rule), is obtained to fuse the voiceprint classifier and the face classifier and thereby improve speaker-identification performance.
The technical solution adopted by the present invention: this multi-modal personal identification method based on score-difference weighted fusion first takes a group of speaker sample data and obtains, from the original traditional single-mode classifier, the score of every speaker model against every sample. If the highest-scoring model and the sample belong to different speakers, the difference between the two scores is recorded; then all such differences within a single classifier are accumulated; finally the score differences of the classifiers determine the weight of each modality.
The technical solution can be refined further. The traditional single-mode classifiers are a voiceprint-recognition classifier and a face-recognition classifier. The score is the classifier's support for the hypothesis that the input data belongs to a certain class. The score difference arises in the classifier's error case, when the true class of the input data disagrees with the class the classifier hypothesizes: it is the difference between the classifier's supports for those two classes. The score difference of a classifier is the sum, over all speakers' error cases in that single classifier, of the difference between the score of the speaker model the sample truly belongs to and the top score. The score-difference-based weight of a classifier is the ratio of the inverse of that single classifier's score difference to the sum of the inverses of all the classifiers' score differences.
The beneficial effect of the present invention: multiple biometric traits (voiceprint, face) cross-validate the authentication, and a corrective weighting algorithm based on score differences, SDWS, fuses the two biometric modalities and combines the two authentication results. Exploiting the respective advantages and suitable domains of the two kinds of biometric information improves fault tolerance, reduces uncertainty, overcomes the imperfection of any single biometric trait, and strengthens the reliability of the recognition decision, giving it broader security and adaptability.
Description of drawings
Fig. 1 is a frame diagram of the multi-modal identification system based on score-difference weighted fusion (SDWS) of the present invention;
Fig. 2 is a schematic diagram of the topology of the dynamic Bayesian model of the present invention.
Embodiment
The invention is described further below in conjunction with the drawings and embodiments. The method of the present invention is divided into three steps.
The first step: voiceprint recognition
Speaker identification is divided into four parts: speech pre-processing, feature extraction, model training, and identification.
1. Speech pre-processing
Speech pre-processing is divided into four parts: sampling and quantization, zero-drift removal, pre-emphasis, and windowing.
A) Sampling and quantization
I. Filter the audio signal with a sharp filter so that its Nyquist frequency F_N is 4 kHz;
II. Set the audio sampling rate F = 2·F_N;
III. Sample the audio signal s_a(t) periodically to obtain the amplitude sequence of the digital speech signal, s(n) = s_a(n/F);
IV. Quantize s(n) by pulse-code modulation (PCM) to obtain the quantized amplitude sequence s′(n).
B) Zero-drift removal
I. Compute the mean value of the quantized amplitude sequence;
II. Subtract the mean from each amplitude to obtain a zero-mean amplitude sequence s″(n).
C) Pre-emphasis
I. Set the pre-emphasis coefficient α in the digital filter's Z transfer function H(z) = 1 − α·z⁻¹; α takes a value slightly less than 1;
II. Pass s″(n) through the digital filter to obtain an amplitude sequence s(n) in which the high-, mid-, and low-frequency amplitudes of the speech signal are balanced.
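As a concrete illustration of the pre-emphasis step, the following sketch applies H(z) = 1 − α·z⁻¹ to an amplitude sequence (α = 0.97 is an assumed common choice; the patent only requires a value slightly below 1):

```python
import numpy as np

def pre_emphasis(signal, alpha=0.97):
    # y[n] = x[n] - alpha * x[n-1]; the first sample is passed through unchanged
    signal = np.asarray(signal, dtype=float)
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

emphasized = pre_emphasis([1.0, 1.0, 1.0, 1.0])
```

A constant sequence is flattened to small residuals, showing how the filter suppresses the slowly varying (low-frequency) part and emphasizes changes.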
D) Windowing
I. Compute the frame length N (32 milliseconds) and frame shift T (10 milliseconds) of the speech frames, satisfying:
N / F = 0.032
T / F = 0.010
where F is the speech sampling rate in Hz;
II. With frame length N and frame shift T, divide s(n) into a series of speech frames F_m, each containing N speech-signal samples;
III. Compute the Hamming window function:
w(n) = 0.54 − 0.46·cos(2πn/(N−1)), n = 0, 1, …, N−1
IV. Apply the Hamming window to each speech frame F_m.
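The framing and windowing steps (D.I through D.IV) can be sketched as follows; the frame sizes follow N/F = 0.032 and T/F = 0.010:

```python
import numpy as np

def frame_and_window(signal, rate):
    N = int(0.032 * rate)                    # frame length in samples
    T = int(0.010 * rate)                    # frame shift in samples
    window = np.hamming(N)                   # w(n) = 0.54 - 0.46*cos(2*pi*n/(N-1))
    n_frames = 1 + max(0, (len(signal) - N) // T)
    return np.stack([signal[m * T : m * T + N] * window
                     for m in range(n_frames)])

frames = frame_and_window(np.ones(8000), rate=8000)  # one second at 8 kHz
```

At an 8 kHz rate this gives 256-sample frames every 80 samples; each row of the result is one windowed speech frame F_m.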
2. MFCC extraction:
A) Set the order p of the Mel cepstral coefficients;
B) Apply the fast Fourier transform (FFT) to turn the time-domain signal s(n) into the frequency-domain signal X(k);
C) Compute the Mel-domain scale:
M_i = (i/p) × 2595·log10(1 + (8000/2)/700), i = 0, 1, 2, …, p
D) Compute the corresponding frequency-domain scale:
f_i = 700 × (10^(M_i/2595) − 1), i = 0, 1, 2, …, p
E) Compute the logarithmic energy spectrum on each Mel-domain channel φ_j:
E_j = Σ_{k=0}^{K/2−1} φ_j(k)·|X(k)|²
where Σ_{k=0}^{K/2−1} φ_j(k) = 1;
F) Apply the discrete cosine transform (DCT).
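A sketch of the Mel-scale computation in steps C) and D); the 8 kHz sampling rate means the Nyquist frequency 8000/2 = 4000 Hz is used:

```python
import math

def mel_scale(p):
    # M_i = (i/p) * 2595 * log10(1 + 4000/700), i = 0..p
    M = [(i / p) * 2595.0 * math.log10(1.0 + 4000.0 / 700.0)
         for i in range(p + 1)]
    # f_i = 700 * (10**(M_i/2595) - 1): mapping back to Hz
    f = [700.0 * (10.0 ** (Mi / 2595.0) - 1.0) for Mi in M]
    return M, f

M, f = mel_scale(24)
```

The boundaries come out equally spaced on the Mel axis, and f spans 0 Hz up to the Nyquist frequency; the channel weights φ_j would then be built between consecutive f_i and normalized so each channel sums to 1.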
3. DBN model training
The dynamic Bayesian network (DBN) model, like the HMM, is a generative model: one person's speech data alone suffices to model that person and carry out identification.
The purpose of training is that, given the speech data, the model parameters describe the distribution of the speech in feature space as well as possible. DBN training here concentrates on the model parameters; the network topology is not learned.
A) If the likelihood has not converged and the iteration count is below the preset limit, go to step B); otherwise go to E).
Convergence is defined as:
Converged = TRUE if |PreLogLik − CurLogLik| < θ, FALSE otherwise
where PreLogLik is the likelihood of the previous iteration and CurLogLik is the likelihood of the current iteration, both obtained by the forward-backward traversal in step C); θ is a preset threshold. The default maximum iteration count MAXITER can be set arbitrarily. This test keeps the iteration from running without bound.
B) Clear the relevant statistics of each node.
The statistics must be cleared before the forward-backward traversal; "statistics" here means the data needed when learning the nodes' CPDs (conditional probability distributions).
C) Collect the observations, perform the forward-backward traversal, and output the likelihood.
The forward-backward traversal of the network lets the update of an observed node propagate so that the other nodes in the network are updated as well, satisfying the local-consistency and global-consistency conditions. This step is realized with the junction-tree algorithm, and probability diffusion over the intra-frame structure is carried out with COLLECT-EVIDENCE and DISTRIBUTE-EVIDENCE. The traversal outputs the log-likelihood used in step A); the probability output used in identification is also obtained by this traversal.
D) From the observations, compute the relevant statistics and update the probability distributions of the affected nodes; go to A).
Computing the statistics from the observations and updating the node probability distributions is determined by the EM learning algorithm.
E) Save the model.
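The control flow of steps A) through E) is essentially an EM loop guarded by the convergence test. A minimal sketch follows; the DBN forward-backward pass itself is abstracted into a callback that returns the current log-likelihood, which is a simplifying assumption, not the patent's actual implementation:

```python
def train_until_converged(em_step, max_iter=100, theta=1e-4):
    # em_step(): clear statistics, run the forward-backward traversal,
    # update the CPDs, and return the current log-likelihood
    prev = float("-inf")
    for it in range(max_iter):               # MAXITER guard from step A)
        cur = em_step()
        if prev != float("-inf") and abs(prev - cur) < theta:
            return cur, it + 1               # |PreLogLik - CurLogLik| < theta
        prev = cur
    return prev, max_iter                    # stopped by the iteration cap

# toy em_step whose likelihood rises and then flattens out
log_liks = iter([-100.0, -50.0, -49.99999])
final, iters = train_until_converged(lambda: next(log_liks), theta=1e-3)
```

The loop stops as soon as the likelihood gain between two iterations drops below θ, or when MAXITER is reached.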
4. Identification
After the user's speech is input, feature extraction yields a feature-vector sequence C. By Bayes' rule, the likelihood of model M_i given the data C is
P(M_i | C) = P(C | M_i)·P(M_i) / P(C)
Since no prior knowledge is available, we take P(M_i) to be identical for all models, i.e. P(M_i) = 1/N, i = 1, 2, …, N; and P(C) is an unconditional probability, identical for all speakers, so:
P(M_i | C) ∝ P(C | M_i)
Finding the posterior probability of a model is thereby converted into finding the likelihood of the data under the model. The speaker identification test is then the computation
i* = argmax_i P(C | M_i)
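With equal priors, the MAP decision above reduces to an argmax over the per-model likelihoods; in practice log-likelihoods are compared, since the logarithm is monotonic (a sketch):

```python
def identify(log_likelihoods):
    # i* = argmax_i log P(C | M_i)
    return max(range(len(log_likelihoods)), key=log_likelihoods.__getitem__)

best = identify([-1200.5, -1180.2, -1210.9])
```

The returned index names the speaker model that scored highest on the test utterance.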
The second step: face recognition
A 2-D face recognition system mainly comprises three parts: image pre-processing, feature extraction, and classifier classification.
1. Image pre-processing
The general purpose of image pre-processing is to compensate for differences in illumination and geometry among the original images and obtain normalized new images. Pre-processing comprises image alignment and scaling.
2. PCA feature extraction
Through the principal-component transform, a face image is described in a low-dimensional subspace (the principal subspace), aiming to retain the discriminative information useful for classification while rejecting interfering components.
The pre-processed standard images serve as the training sample set, and the covariance matrix of this set is the generating matrix of the principal-component transform:
Σ = (1/M) Σ_{i=0}^{M−1} (x_i − μ)(x_i − μ)^T
where x_i is the image vector of the i-th training sample, μ is the mean image vector of the training set, and M is the total number of training samples. If the image size is K × L, the matrix Σ has dimension KL × KL. When the images are large, directly computing the eigenvalues and eigenvectors of the generating matrix is difficult; when the number of samples M is less than KL, the singular-value decomposition (SVD) theorem converts the problem into the computation of an M-dimensional matrix.
Sort the eigenvalues in decreasing order, λ_0 ≥ λ_1 ≥ … ≥ λ_{R−1}, and let u_i be their corresponding eigenvectors. Each face image can then be projected into the subspace spanned by u_0, u_1, …, u_{M−1}. Of the M eigenvectors obtained, the k largest are kept, such that:
(Σ_{i=0}^{k} λ_i) / (Σ_{i=0}^{M−1} λ_i) = α
where α, called the energy ratio, is the proportion of the sample set's total energy carried by the first k axes.
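The eigen-decomposition and the energy-ratio cutoff can be sketched as follows; a toy 4-dimensional example stands in for the KL-dimensional image vectors:

```python
import numpy as np

def pca_basis(X, alpha=0.95):
    # covariance matrix Sigma of the training set (one sample per row)
    mu = X.mean(axis=0)
    centered = X - mu
    cov = centered.T @ centered / X.shape[0]
    vals, vecs = np.linalg.eigh(cov)          # eigh returns ascending order
    vals, vecs = vals[::-1], vecs[:, ::-1]    # sort lambda_0 >= lambda_1 >= ...
    energy = np.cumsum(vals) / vals.sum()     # cumulative energy ratio
    k = int(np.searchsorted(energy, alpha)) + 1
    return vecs[:, :k], mu, k

rng = np.random.default_rng(0)
# variance concentrated on the first axis, mimicking a dominant principal axis
X = rng.normal(size=(50, 4)) * np.array([10.0, 1.0, 0.1, 0.1])
basis, mu, k = pca_basis(X, alpha=0.9)
```

Projecting an image onto the returned basis, after subtracting the mean, yields its PCA feature vector.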
3. Classifier classification
The nearest-neighbour method is used as the component classifier, with the Euclidean distance formula as the distance metric.
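A minimal nearest-neighbour component classifier with the Euclidean metric:

```python
import numpy as np

def nearest_neighbor(query, templates):
    # Euclidean distance from the query to every template feature vector
    d = np.linalg.norm(np.asarray(templates, dtype=float)
                       - np.asarray(query, dtype=float), axis=1)
    return int(np.argmin(d))                  # index of the closest template

idx = nearest_neighbor([0.0, 1.0], [[5.0, 5.0], [0.0, 0.9], [-3.0, 2.0]])
```

The index of the closest enrolled PCA feature names the recognized person.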
The third step: multiple-classifier fusion based on score-difference weighting
The multiple-classifier fusion algorithm based on score-difference weighting is divided into three parts: formal description of the classifiers, training, and decision.
1. Formal description of the classifiers
A) Classifiers: let D = {D_1, D_2, …, D_L} denote a group of component classifiers;
B) Classes: let Ω = {ω_1, …, ω_c} denote the set of class labels, i.e. all possible classification results;
C) Input: a feature vector x;
D) Output: a vector of length c, D_i(x) = [d_{i,1}(x), d_{i,2}(x), …, d_{i,c}(x)]^T, where d_{i,j}(x) is D_i's support for the hypothesis that x belongs to ω_j. The d_{i,j}(x) are component-classifier outputs normalized to the interval [0, 1], with
Σ_{j=1}^{c} d_{i,j}(x) = 1
E) The outputs of all the classifiers can be assembled into a DP (Decision Profile) matrix:
DP(x) = | d_{1,1}(x)  d_{1,2}(x)  …  d_{1,c}(x) |
        |            …                          |
        | d_{i,1}(x)  d_{i,2}(x)  …  d_{i,c}(x) |
        |            …                          |
        | d_{L,1}(x)  d_{L,2}(x)  …  d_{L,c}(x) |
In this matrix, row i is the output D_i(x) of component classifier D_i, and column j holds every component classifier's support for ω_j.
2. Training
A) Training samples: a training set X = {x_1, x_2, …, x_N} of N elements;
B) The classifiers' recognition results on the samples:
S(X) = | s_{1,1}(X)  …  s_{1,L}(X) |
       |            …              |
       | s_{j,1}(X)  …  s_{j,L}(X) |
       |            …              |
       | s_{N,1}(X)  …  s_{N,L}(X) |
where s_{j,i} is the class to which classifier D_i assigns sample element x_j, and
s_{j,i} = D_i(x_j) = s  ⇔  d_{i,s}(x_j) = max_{o=1,2,…,c} d_{i,o}(x_j)
Here j = 1, …, N indexes the elements of the training set, i = 1, …, L indexes the classifiers, and c is the number of classes, here the number of identities to be recognized.
C) The original classes of the samples: L(X) = [k_1, …, k_N]^T, where k_j is the index of the true class of x_j;
D) The score difference SD_i(X) of the i-th classifier is:
SD_i(X) = Σ_{j=1}^{N} SD_i^j(x_j) = Σ_{j=1}^{N} Σ_{s_{j,i} ≠ k_j} |d_{i,k_j}(x_j) − d_{i,s_{j,i}}(x_j)|
SD_i(X) accumulates, over the error cases (s_{j,i} ≠ k_j, i.e. whenever the true class of the input disagrees with the class the classifier hypothesizes), the difference between the classifier's supports for the two classes; the d_{i,j}(x) are elements of the DP(x) matrix.
E) Each classifier's score-difference-based weight:
W_i = SD_i(X)^{−1} / Σ_{i=1}^{L} SD_i(X)^{−1}
3. Decision
Using the weights, recompute the multi-modal support for each class:
D(x) = [d_1(x), d_2(x), …, d_c(x)]^T = [Σ_{i=1}^{L} W_i·d_{i,1}(x), Σ_{i=1}^{L} W_i·d_{i,2}(x), …, Σ_{i=1}^{L} W_i·d_{i,c}(x)]^T
The fused classifiers assign the test vector x to class ω_s if and only if d_s(x) = max_{i=1,…,c} d_i(x).
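Putting parts 2 and 3 together, the SDWS training and decision can be sketched end-to-end with toy numbers; for simplicity the sketch assumes every classifier makes at least one error on the training set, so each SD_i > 0:

```python
import numpy as np

def sdws_weights(supports, labels):
    # supports[i, j, :] = classifier i's normalized supports d_{i,.}(x_j)
    L, N, c = supports.shape
    SD = np.zeros(L)
    for i in range(L):
        for j in range(N):
            s = int(np.argmax(supports[i, j]))        # class chosen by D_i
            if s != labels[j]:                        # error case s_{j,i} != k_j
                SD[i] += abs(supports[i, j, labels[j]] - supports[i, j, s])
    inv = 1.0 / SD                                    # assumes SD_i > 0
    return inv / inv.sum()                            # weights W_i

def sdws_decide(weights, test_supports):
    # fused support d_s(x) = sum_i W_i * d_{i,s}(x); pick the largest
    fused = np.tensordot(weights, test_supports, axes=1)
    return int(np.argmax(fused))

supports = np.array([
    [[0.2, 0.8], [0.1, 0.9]],     # classifier 0: one large error on sample 0
    [[0.45, 0.55], [0.2, 0.8]],   # classifier 1: one small error on sample 0
])
labels = [0, 1]                    # true classes k_j
w = sdws_weights(supports, labels)
decision = sdws_decide(w, np.array([[0.6, 0.4], [0.3, 0.7]]))
```

Classifier 1's smaller accumulated gap gives it the larger weight (6/7 versus 1/7), so the fused decision follows its stronger vote for class 1.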
Experimental results
The system was tested on a multi-modal database containing the voiceprint and face information of 54 users. The database collects the face and voiceprint data of 54 Zhejiang University students (37 male, 17 female). The entire collection was carried out in a quiet, well-lit environment. In the speech part, each person was asked to speak personal information 3 times and to record 10 each of Mandarin digit strings, dialect digit strings, English digit strings, Mandarin word strings, and picture-talk utterances, plus one short essay. The speech files are in wav/nist format, all normalized to an 8000 Hz sampling rate and 16-bit data. The experiments use the short essay and the personal information for training and the remaining 50 utterances per person for testing. In the face-image part, each person provided 4 photos, two frontal and two profile; one frontal photo is used for training and the other for testing.
On the same database we also ran single-mode voiceprint recognition, single-mode face recognition, and several common decision-level fusion algorithms (addition, weighting, voting, and the behavior-knowledge-space method) under identical conditions, for comparison with our system (SDWS, the fusion algorithm based on score-difference weighting). Voiceprint recognition is based on a person's speech features and face recognition on facial features; the fusion algorithms combine the two kinds of features. Addition and voting belong to the fixed-parameter fusion methods; weighting and the behavior-knowledge-space method belong to the fusion algorithms that require parameter training.
The single-mode voiceprint speaker-identification method follows the first step of this description: the speech is pre-processed, Mel cepstral features are extracted, and a dynamic Bayesian model is built for each speaker. The topology of the dynamic Bayesian model is the structure shown in Fig. 2, where q_i^j (i = 1, 2, 3; j = 1, 2, …, T) are hidden node variables, each assumed to take two discrete values, and o_i^j (i = 1, 2, 3; j = 1, 2, …, T) are observation nodes corresponding to the observation vectors; each observation node, conditioned on its discretely distributed parent q_i^j, follows a Gaussian distribution. Likewise, a test utterance, after pre-processing and Mel cepstral feature extraction, is matched against the enrolled speaker models, and the person corresponding to the highest-scoring model is taken as the identified speaker.
The single-mode face recognition follows the second step of this description: after the face image is manually aligned on the eyes, PCA features are extracted, the Euclidean distances between PCA features are compared, and the person corresponding to the nearest feature is taken as the identified person.
For addition, the idea can be expressed by the formula:
μ_i(x) = F(d_{1,i}(x), …, d_{L,i}(x)), i = 1, …, c
where F represents the sum operation (Sum); the final classification result is the ω_i whose μ_i is largest.
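The sum rule can be sketched for comparison, with supports listed per classifier as in the DP matrix:

```python
def sum_rule(supports):
    # mu_i = sum over classifiers l of d_{l,i}; decide the largest mu_i
    c = len(supports[0])
    mu = [sum(d[i] for d in supports) for i in range(c)]
    return max(range(c), key=mu.__getitem__)

winner = sum_rule([[0.6, 0.4], [0.3, 0.7]])
```

Here μ = [0.9, 1.1], so the fixed-parameter sum rule decides class 1 regardless of how reliable each classifier is.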
The weighting algorithm grows out of addition; the weights express how good or bad each classifier is relative to the others. Here each classifier's equal error rate is adopted as its weight.
The basic idea of voting is majority rule: the voters are all the component classifiers and the candidates are all possible classification results. Each voter casts a ballot for the candidate it supports, and the candidate with the most votes wins.
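The voting baseline, sketched with the standard majority rule (in this sketch a tie resolves to the label encountered first):

```python
from collections import Counter

def majority_vote(predicted_labels):
    # each component classifier casts one ballot; most ballots wins
    return Counter(predicted_labels).most_common(1)[0][0]

winner = majority_vote(["alice", "bob", "alice"])
```

Note that only class labels enter the vote; the classifiers' scores and error rates are discarded, which is exactly the weakness discussed later.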
The behavior-knowledge-space method estimates the posterior probability given the component classifiers' classification results. It counts how many samples of each class fall into each cell of the behavior knowledge space. The training samples are partitioned into cells, each cell defined by one combination of all the component classifiers' results. When an unknown sample is to be classified, the combination of all the component classifiers' results is known, so the corresponding cell can be found; the sample is then assigned to the class that occurs most often among the training samples in that cell.
We assessed the single-mode methods and the fusion algorithms above on speech sets that differ in content and language.
For performance assessment, the identification rate (IR, Identification Rate) is used as the evaluation criterion of the speaker recognition system.
The identification rate IR is computed as:
IR = (number of correctly identified tests / total number of tests) × 100%
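For completeness, a direct transcription of the IR formula:

```python
def identification_rate(n_correct, n_tests):
    # IR = (correctly identified tests / total tests) * 100%
    return 100.0 * n_correct / n_tests

ir = identification_rate(49, 50)
```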
The experimental results are as follows (identification rate, %):

Fusion method                Mandarin   Dialect   English   Word string   Picture talk
Voiceprint recognition         84.63     85.55     91.11       87.78         87.78
Face recognition               85.18     85.18     85.18       85.18         85.18
Addition                       85.37     85.18     86.11       85.18         85
Weighting                      85.37     85.18     86.67       85.18         85
SDWS                           97.96     97.98     98.89       99.26         98.33
Voting                         85.18     85.18     85.18       85.18         85.18
Behavior knowledge space       89.15     89.68     92.33       90.21         88.10

(Face recognition uses only the face image, so its single measured rate, 85.18, applies to every speech set.)
The experimental results show that single-mode biometric authentication cannot reach a good identification rate and cannot satisfy the requirements of security and robustness.
With two classifiers fused, the addition and weighting methods, because they ignore the score distributions of the classifiers, tend instead to let the advantages of the two classifiers cancel each other out.
Voting considers only the class labels output by the classifiers, not their error rates, which to some extent wastes the information in the training samples.
The behavior-knowledge-space method is a direct statistic of the multi-dimensional joint distribution of several classifiers' decisions and can combine the component classifiers' decisions to obtain the best result. However, because the behavior knowledge space is too large relative to the number of training samples, undertrained cells easily appear: no training set is huge enough to fill every cell to sufficient density.
By analyzing the classifier scores, accumulating for each classifier's error cases the difference between the score of the model the classifier chose and the score of the model the sample truly belongs to, and using this as the classifier's weight, this recognition algorithm fuses the classifiers at the decision level with a simple, effective weighting method. The two classifiers complement each other, system performance improves greatly, far exceeding the other fusion methods, with a gain of about 7.8-13.3% over the single-mode methods. The recognition performance of speaker identification is thereby improved.

Claims (7)

1. A multi-modal personal identification method based on score-difference weighted fusion, characterized in that: first a group of speaker sample data is used to obtain, from the original traditional single-mode classifier, the score of every speaker model against every sample template; if the highest-scoring model and the sample belong to different speakers, the difference between the two scores is recorded; then all such differences within a single classifier are accumulated; finally the score differences of the classifiers determine the weight of each modality.
2. The multi-modal personal identification method based on score-difference weighted fusion according to claim 1, characterized in that: the traditional single-mode classifiers are a voiceprint-recognition classifier and a face-recognition classifier.
3. The multi-modal personal identification method based on score-difference weighted fusion according to claim 1, characterized in that: the score is the classifier's support for the hypothesis that the input data belongs to a certain class.
4. The multi-modal personal identification method based on score-difference weighted fusion according to claim 1, characterized in that: the score difference arises in the classifier's error case, when the true class of the input data disagrees with the class the classifier hypothesizes; it is the difference between the classifier's supports for those two classes.
5. The multi-modal personal identification method based on score-difference weighted fusion according to claim 1, characterized in that: the score difference of a classifier is the sum, over all speakers' error cases in that single classifier, of the difference between the score of the speaker model the sample belongs to and the top score.
6. The multi-modal personal identification method based on score-difference weighting according to claim 1, characterized in that: the score-difference-based weight of a classifier is the ratio of the inverse of that single classifier's score difference to the sum of the inverses of all the classifiers' score differences.
7. The multi-modal personal identification method based on score-difference weighting according to any one of claims 1 to 6, characterized in that: the multiple-classifier fusion algorithm based on score-difference weighting is divided into three parts, namely formal description of the classifiers, training, and decision;
1) Formal description of the classifiers
A) Classifiers: let D = {D_1, D_2, …, D_L} denote a group of component classifiers;
B) Classes: let Ω = {ω_1, …, ω_c} denote the set of class labels, i.e. all possible classification results;
C) Input: a feature vector x;
D) Output: a vector of length c, D_i(x) = [d_{i,1}(x), d_{i,2}(x), …, d_{i,c}(x)]^T, where d_{i,j}(x) is D_i's support for the hypothesis that x belongs to ω_j; the d_{i,j}(x) are component-classifier outputs normalized to the interval [0, 1], with
Σ_{j=1}^{c} d_{i,j}(x) = 1;
E) The outputs of all the classifiers are assembled into a DP matrix:
DP(x) = | d_{1,1}(x)  d_{1,2}(x)  …  d_{1,c}(x) |
        |            …                          |
        | d_{i,1}(x)  d_{i,2}(x)  …  d_{i,c}(x) |
        |            …                          |
        | d_{L,1}(x)  d_{L,2}(x)  …  d_{L,c}(x) |
In this matrix, row i is the output D_i(x) of component classifier D_i, and column j holds every component classifier's support for ω_j;
2) Training
A) Training samples: a training set X = {x_1, x_2, …, x_N} of N elements;
B) The classifiers' recognition results on the samples:
S(X) = | s_{1,1}(X)  …  s_{1,L}(X) |
       |            …              |
       | s_{j,1}(X)  …  s_{j,L}(X) |
       |            …              |
       | s_{N,1}(X)  …  s_{N,L}(X) |
where s_{j,i} is the class to which classifier D_i assigns sample element x_j, and
s_{j,i} = D_i(x_j) = s  ⇔  d_{i,s}(x_j) = max_{o=1,2,…,c} d_{i,o}(x_j)
Here j = 1, …, N indexes the elements of the training set, i = 1, …, L indexes the classifiers, and c is the number of classes, here the number of identities to be recognized;
C) The original classes of the samples: L(X) = [k_1, …, k_N]^T, where k_j is the index of the true class of x_j;
D) The score difference SD_i(X) of the i-th classifier is:
SD_i(X) = Σ_{j=1}^{N} SD_i^j(x_j) = Σ_{j=1}^{N} Σ_{s_{j,i} ≠ k_j} |d_{i,k_j}(x_j) − d_{i,s_{j,i}}(x_j)|
SD_i(X) accumulates, over the error cases (s_{j,i} ≠ k_j, i.e. whenever the true class of the input disagrees with the class the classifier hypothesizes), the difference between the classifier's supports for the two classes; the d_{i,j}(x) are elements of the DP(x) matrix;
E) Each classifier's score-difference-based weight:
W_i = SD_i(X)^{−1} / Σ_{i=1}^{L} SD_i(X)^{−1}
3) Decision
Using the weights, recompute the multi-modal support for each class:
D(x) = [d_1(x), d_2(x), …, d_c(x)]^T = [Σ_{i=1}^{L} W_i·d_{i,1}(x), Σ_{i=1}^{L} W_i·d_{i,2}(x), …, Σ_{i=1}^{L} W_i·d_{i,c}(x)]^T
The fused classifiers assign the test vector x to class ω_s if and only if d_s(x) = max_{i=1,…,c} d_i(x).
CNB2005100613590A 2005-10-31 2005-10-31 Multi-model ID recognition method based on scoring difference weight compromised Expired - Fee Related CN100363938C (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CNB2005100613590A CN100363938C (en) 2005-10-31 2005-10-31 Multi-model ID recognition method based on scoring difference weight compromised


Publications (2)

Publication Number Publication Date
CN1758263A true CN1758263A (en) 2006-04-12
CN100363938C CN100363938C (en) 2008-01-23

Family

ID=36703632

Family Applications (1)

Application Number Title Priority Date Filing Date
CNB2005100613590A Expired - Fee Related CN100363938C (en) 2005-10-31 2005-10-31 Multi-model ID recognition method based on scoring difference weight compromised

Country Status (1)

Country Link
CN (1) CN100363938C (en)


Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1304114A (en) * 1999-12-13 2001-07-18 中国科学院自动化研究所 Identity identification method based on multiple biological characteristics
CN1172260C (en) * 2001-12-29 2004-10-20 浙江大学 Fingerprint and soundprint based cross-certification system
US7065465B2 (en) * 2002-03-26 2006-06-20 Lockheed Martin Corporation Method and system for multi-sensor data fusion

Cited By (19)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102810154A (en) * 2011-06-02 2012-12-05 国民技术股份有限公司 Method and system for biological characteristic acquisition and fusion based on trusted module
CN102810154B (en) * 2011-06-02 2016-05-11 国民技术股份有限公司 A kind of physical characteristics collecting fusion method and system based on trusted module
CN104183240A (en) * 2014-08-19 2014-12-03 中国联合网络通信集团有限公司 Vocal print feature fusion method and device
CN105810199A (en) * 2014-12-30 2016-07-27 中国科学院深圳先进技术研究院 Identity verification method and device for speakers
CN104598796A (en) * 2015-01-30 2015-05-06 科大讯飞股份有限公司 Method and system for identifying identity
CN104598795A (en) * 2015-01-30 2015-05-06 科大讯飞股份有限公司 Authentication method and system
CN107249434B (en) * 2015-02-12 2020-12-18 皇家飞利浦有限公司 Robust classifier
CN107249434A (en) * 2015-02-12 2017-10-13 皇家飞利浦有限公司 Robust classification device
WO2017067136A1 (en) * 2015-10-20 2017-04-27 广州广电运通金融电子股份有限公司 Method and device for authenticating identify by means of fusion of multiple biological characteristics
US10346602B2 (en) 2015-10-20 2019-07-09 Grg Banking Equipment Co., Ltd. Method and device for authenticating identify by means of fusion of multiple biological characteristics
CN106127156A (en) * 2016-06-27 2016-11-16 上海元趣信息技术有限公司 Robot interactive method based on vocal print and recognition of face
CN106303797A (en) * 2016-07-30 2017-01-04 杨超坤 A kind of automobile audio with control system
CN111144167A (en) * 2018-11-02 2020-05-12 银河水滴科技(北京)有限公司 Gait information identification optimization method, system and storage medium
CN110008676A (en) * 2019-04-02 2019-07-12 合肥智查数据科技有限公司 A kind of personnel's multidimensional challenge and true identity discrimination system and method
CN110008676B (en) * 2019-04-02 2022-09-16 合肥智查数据科技有限公司 System and method for multi-dimensional identity checking and real identity discrimination of personnel
CN110378414A (en) * 2019-07-19 2019-10-25 中国计量大学 The personal identification method of multi-modal biological characteristic fusion based on evolution strategy
CN110378414B (en) * 2019-07-19 2021-11-09 中国计量大学 Multi-mode biological characteristic fusion identity recognition method based on evolution strategy
CN112990252A (en) * 2019-12-18 2021-06-18 株式会社东芝 Information processing apparatus, information processing method, and program
CN114841293A (en) * 2022-07-04 2022-08-02 国网信息通信产业集团有限公司 Multimode data fusion analysis method and system for power Internet of things

Also Published As

Publication number Publication date
CN100363938C (en) 2008-01-23

Similar Documents

Publication Publication Date Title
CN1758263A (en) Multi-model ID recognition method based on scoring difference weight compromised
CN1236423C (en) Background learning of speaker voices
CN1162839C (en) Method and device for producing acoustics model
CN1296886C (en) Speech recognition system and method
Shriberg et al. Modeling prosodic feature sequences for speaker recognition
CN101136199B (en) Voice data processing method and equipment
CN105469784B (en) A kind of speaker clustering method and system based on probability linear discriminant analysis model
CN108281137A (en) A kind of universal phonetic under whole tone element frame wakes up recognition methods and system
Zhang et al. Automatic mispronunciation detection for Mandarin
CN103531198B (en) A kind of speech emotion feature normalization method based on pseudo-speaker clustering
CN109545189A (en) A kind of spoken language pronunciation error detection and correcting system based on machine learning
CN1188804C (en) Method for recognizing voice print
CN1787076A (en) Method for distinguishing speek person based on hybrid supporting vector machine
CN110111797A (en) Method for distinguishing speek person based on Gauss super vector and deep neural network
CN1763843A (en) Pronunciation quality evaluating method for language learning machine
JPWO2010047019A1 (en) Statistical model learning apparatus, statistical model learning method, and program
CN1293428A (en) Information check method based on speed recognition
CN1787075A (en) Method for distinguishing speek speek person by supporting vector machine model basedon inserted GMM core
CN110110790B (en) Speaker confirmation method adopting unsupervised clustering score normalization
CN103985381A (en) Voice frequency indexing method based on parameter fusion optimized decision
CN112802494B (en) Voice evaluation method, device, computer equipment and medium
CN104347071B (en) Method and system for generating reference answers of spoken language test
CN108109612A (en) Voice recognition classification method based on self-adaptive dimension reduction
CN1787074A (en) Method for distinguishing speak person based on feeling shifting rule and voice correction
JP6996627B2 (en) Information processing equipment, control methods, and programs

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
CF01 Termination of patent right due to non-payment of annual fee

Granted publication date: 20080123

Termination date: 20211031