CN102238190A - Identity authentication method and system - Google Patents


Publication number
CN102238190A
Authority
CN
China
Prior art keywords
voiceprint
model
feature sequence
likelihood
voiceprint feature
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN2011102180452A
Other languages
Chinese (zh)
Other versions
CN102238190B (en)
Inventor
潘逸倩
胡国平
何婷婷
魏思
胡郁
王智国
刘庆峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
iFlytek Co Ltd
Original Assignee
iFlytek Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by iFlytek Co Ltd filed Critical iFlytek Co Ltd
Priority to CN2011102180452A priority Critical patent/CN102238190B/en
Publication of CN102238190A publication Critical patent/CN102238190A/en
Application granted granted Critical
Publication of CN102238190B publication Critical patent/CN102238190B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Landscapes

  • Collating Specific Patterns (AREA)

Abstract

The invention discloses an identity authentication method and system. The method comprises: when a user logs in, receiving a continuous speech signal recorded by the current login user; extracting a voiceprint feature sequence from the continuous speech signal; computing the likelihood of the voiceprint feature sequence with respect to a background model; computing the likelihood of the voiceprint feature sequence with respect to the speaker model of the current login user, wherein the speaker model is a multiple-Gaussian-mixture (multi-GMM) model constructed according to the repetition count and frame lengths of the registration speech signals recorded when the current login user registered; computing a likelihood ratio from the likelihood of the voiceprint feature sequence with respect to the speaker model and its likelihood with respect to the background model; and if the likelihood ratio is greater than a preset threshold, determining that the current login user is a validly authenticated user, otherwise determining that the current login user is a non-authenticated user. The method and system improve the accuracy of identity authentication based on a voiceprint password.

Description

Identity authentication method and system
Technical field
The present invention relates to the field of identity authentication, and in particular to an identity authentication method and system.
Background technology
Voiceprint recognition (VPR), also called speaker recognition, falls into two classes: speaker identification and speaker verification. The former judges which of several persons uttered a given segment of speech — a "choose one of many" problem; the latter confirms whether a segment of speech was uttered by a specified person — a "one-to-one" discrimination problem. Different tasks and applications call for different voiceprint recognition techniques.
Voiceprint verification confirms a speaker's identity from a collected speech signal and thus belongs to the "one-to-one" discrimination problem. Mainstream voiceprint verification systems adopt a hypothesis-testing framework: the likelihoods of the voiceprint signal with respect to a speaker model and a background model are computed separately, and the decision is made by comparing their likelihood ratio against a threshold set empirically in advance. The accuracy of the background model and the speaker model therefore directly affects verification performance; for data-driven statistical models, more training data generally yields a better model.
Voiceprint password authentication is a text-dependent speaker authentication method. The user is required to speak a fixed password text, and the speaker's identity is confirmed accordingly. Because both registration and authentication use speech input of the same fixed password text, the voiceprints tend to be consistent, so better authentication performance can be obtained than with text-independent speaker verification.
The mainstream technical route of current voiceprint password authentication systems is the GMM-UBM algorithm, in which Gaussian mixture models (GMM) are used to model both the background model (Universal Background Model, UBM) and the speaker model. The UBM describes the characteristics that speakers' voiceprints have in common. Since each speaker's voiceprint also has its own specificity, a UBM trained on data from many speakers needs a complex model structure to fit the distribution of such heterogeneous data; current UBMs typically use GMMs with 1024 or even more Gaussians.
The speaker model is trained online by the system from the registration speech when the user registers. Because registration speech samples are usually limited, directly training a complex model from them easily yields an insufficiently accurate model owing to data sparsity. In the prior art, therefore, the background model is usually taken as the initial model, and part of its parameters are adjusted from the small amount of speaker data by an adaptive method — for example the widely used maximum a posteriori (MAP) adaptation algorithm — thereby adapting the shared voiceprint characteristics of the background model toward the individuality of the current speaker.
Such adaptive-update algorithms establish a one-to-one correspondence between each Gaussian of the speaker's mixture model and the Gaussians of the universal background model. The speaker model therefore has too many parameters, which easily causes the following problems in a voiceprint password authentication system with little registration data:
1. Model redundancy: in a voiceprint password authentication system, the speaker model is trained from the few sample recordings obtained by repeating the registration password. With so few samples, the adaptation algorithm can only update part of the Gaussians of the initial background model, and many components remain nearly identical to the background model. These redundant model parameters increase storage and computation pressure, and in turn reduce decoding efficiency.
2. Heavy training load: the adaptation algorithm must compute the sample statistics of each of the 1024 or more Gaussians of the initial background model and update their parameters.
3. Because re-estimating the speaker model's variance is difficult in the adaptation algorithm, the background model's variance is usually adopted directly. But the background model simulates the shared characteristics of many speakers' voiceprints, so its probability-distribution variance tends to be large, whereas the variance of a speaker model should capture the specific characteristics of that speaker's voiceprint. Directly reusing the background model variance fails to reflect the speaker model's characteristics, reduces the discrimination between different speaker models, and thus degrades recognition accuracy.
Summary of the invention
Embodiments of the present invention provide an identity authentication method and system, to improve the accuracy of identity authentication based on a voiceprint password.
In one aspect, an embodiment of the invention provides an identity authentication method, comprising:
when a user logs in, receiving a continuous speech signal recorded by the current login user;
extracting a voiceprint feature sequence from the continuous speech signal, the voiceprint feature sequence comprising a group of voiceprint features;
computing the likelihood of the voiceprint feature sequence with respect to a background model;
computing the likelihood of the voiceprint feature sequence with respect to the speaker model of the current login user, the speaker model being a multi-GMM model constructed according to the repetition count and frame lengths of the registration speech signals recorded when the current login user registered;
computing a likelihood ratio from the likelihood of the voiceprint feature sequence with respect to the speaker model and its likelihood with respect to the background model; and
if the likelihood ratio is greater than a preset threshold, determining that the current login user is a validly authenticated user; otherwise, determining that the current login user is a non-authenticated user.
In another aspect, an embodiment of the invention provides an identity authentication system, comprising:
a speech signal receiving unit, configured to receive, when a user logs in, a continuous speech signal recorded by the current login user;
an extraction unit, configured to extract a voiceprint feature sequence from the continuous speech signal, the voiceprint feature sequence comprising a group of voiceprint features;
a first computing unit, configured to compute the likelihood of the voiceprint feature sequence with respect to a background model;
a second computing unit, configured to compute the likelihood of the voiceprint feature sequence with respect to the speaker model of the current login user, the speaker model being a multi-GMM model constructed according to the repetition count and frame lengths of the registration speech signals recorded when the current login user registered;
a third computing unit, configured to compute a likelihood ratio from the likelihood of the voiceprint feature sequence with respect to the speaker model and its likelihood with respect to the background model; and
a judging unit, configured to determine, when the likelihood ratio computed by the third computing unit is greater than a preset threshold, that the current login user is a validly authenticated user, and otherwise that the current login user is a non-authenticated user.
With the identity authentication method and system provided by embodiments of the invention, the likelihoods of the voiceprint feature sequence extracted from the current login user's continuous speech signal are computed with respect to both the speaker model of the current login user and the background model; a likelihood ratio is then computed, and whether the current login user is a validly authenticated user is decided from that ratio. Because the speaker model used here is a multi-GMM model constructed from the speech signals recorded when the current login user registered, it can simulate the pronunciation variations that exist when the user repeats the same speech signal (i.e., the password), improving the accuracy of identity authentication based on a voiceprint password.
Description of drawings
To illustrate the technical solutions of the embodiments more clearly, the drawings needed in the description of the embodiments are briefly introduced below. The drawings described below are obviously only some embodiments of the invention; those of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a flowchart of the identity authentication method according to an embodiment of the invention;
Fig. 2 is a flowchart of a background model parameter training process in an embodiment of the invention;
Fig. 3 is a flowchart of building a speaker model with a traditional adaptation algorithm;
Fig. 4 is a flowchart of building a speaker model in an embodiment of the invention;
Fig. 5 is a schematic structural diagram of the identity authentication system according to an embodiment of the invention;
Fig. 6 is another schematic structural diagram of the identity authentication system according to an embodiment of the invention.
Embodiment
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are obviously only some, rather than all, of the embodiments of the invention. All other embodiments obtained by those of ordinary skill in the art based on the embodiments of the invention without creative effort fall within the protection scope of the invention.
Fig. 1 is a flowchart of the identity authentication method according to an embodiment of the invention, comprising the following steps:
Step 101: when a user logs in, receive a continuous speech signal recorded by the current login user.
Step 102: extract a voiceprint feature sequence from the continuous speech signal.
The voiceprint feature sequence comprises a group of voiceprint features that can effectively distinguish different speakers while remaining relatively stable across variations of the same speaker.
Typical voiceprint features include spectral-envelope parameters, pitch contour, formant frequency and bandwidth, linear prediction coefficients, cepstral coefficients, and so on. Considering quantifiability, the amount of training samples, and system-performance evaluation, MFCC (Mel Frequency Cepstral Coefficient) features may be selected: each frame of speech, with a 25 ms window shifted by 10 ms, is short-time analyzed to obtain the MFCC parameters and their first- and second-order differences, 39 dimensions in total. Each speech signal can thus be quantized into a sequence X of 39-dimensional voiceprint feature vectors.
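As a minimal sketch of the short-time analysis described above (the full MFCC and delta computation is omitted), the following shows how a 16 kHz signal can be split into 25 ms frames with a 10 ms shift; the sample rate, Hamming window, and function name are assumptions for illustration, not specified by the patent.

```python
import numpy as np

def frame_signal(signal, sample_rate=16000, win_ms=25, shift_ms=10):
    """Split a 1-D speech signal into overlapping short-time analysis frames
    (25 ms window, 10 ms shift, as in the text), with a Hamming window applied."""
    win = sample_rate * win_ms // 1000        # 400 samples at 16 kHz
    shift = sample_rate * shift_ms // 1000    # 160 samples at 16 kHz
    n_frames = 1 + (len(signal) - win) // shift
    idx = np.arange(win)[None, :] + shift * np.arange(n_frames)[:, None]
    return signal[idx] * np.hamming(win)      # shape (n_frames, win)
```

Each windowed frame would then be passed to an MFCC analysis producing 13 coefficients plus first- and second-order differences, giving the 39-dimensional vectors mentioned in the text.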
Step 103: compute the likelihood of the voiceprint feature sequence with respect to the background model.
The likelihood of a voiceprint feature vector sequence X of T frames with respect to the background model (UBM) is

$$p(X\mid \mathrm{UBM}) = \frac{1}{T}\sum_{t=1}^{T}\sum_{m=1}^{M} c_m\, N(X_t;\mu_m,\Sigma_m) \qquad (1)$$

where $c_m$ is the weight coefficient of the m-th Gaussian, satisfying $\sum_{m=1}^{M} c_m = 1$, and $\mu_m$ and $\Sigma_m$ are the mean and covariance of the m-th Gaussian. $N(\cdot)$ is the normal density, used to compute the likelihood of the voiceprint feature vector $X_t$ at time t on a single Gaussian component:

$$N(X_t;\mu_m,\Sigma_m) = \frac{1}{\sqrt{(2\pi)^n\,|\Sigma_m|}}\, e^{-\frac{1}{2}(X_t-\mu_m)'\,\Sigma_m^{-1}\,(X_t-\mu_m)} \qquad (2)$$
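Equations (1) and (2) can be sketched directly in code. This is an illustrative NumPy implementation under the assumption of diagonal covariances (common for UBMs, though the patent does not mandate it); the function names are invented for the example.

```python
import numpy as np

def gaussian_pdf(X, mu, var):
    """Eq. (2) for a diagonal covariance: N(X_t; mu_m, Sigma_m),
    vectorised over the T frames of X (shape (T, n))."""
    n = X.shape[1]
    diff = X - mu
    expo = -0.5 * np.sum(diff ** 2 / var, axis=1)       # Mahalanobis term
    norm = np.sqrt((2 * np.pi) ** n * np.prod(var))     # (2*pi)^n |Sigma|, square-rooted
    return np.exp(expo) / norm                          # shape (T,)

def ubm_likelihood(X, weights, means, variances):
    """Eq. (1): time-averaged mixture likelihood of the sequence X under the UBM."""
    per_frame = sum(c * gaussian_pdf(X, mu, var)
                    for c, mu, var in zip(weights, means, variances))
    return float(per_frame.mean())
```

In practice such products of densities are computed in the log domain to avoid underflow; the raw form is kept here to mirror equations (1) and (2).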
Step 104: compute the likelihood of the voiceprint feature sequence with respect to the speaker model of the current login user, the speaker model being a multi-GMM model constructed according to the repetition count and frame lengths of the registration speech signals recorded when the current login user registered.
Because the speaker model is a multi-GMM model built from the speech signals recorded at registration, computing the likelihood of the voiceprint feature sequence with respect to it requires first computing the likelihood of each voiceprint feature in the sequence against each of the component GMMs, and then determining the sequence's likelihood with respect to the speaker model from all the computed likelihoods. This can be implemented in several ways, for example:
1. First compute the likelihood of the voiceprint feature sequence with respect to each component GMM, then determine the likelihood with respect to the speaker model from those results.
In this approach, the likelihood of each voiceprint feature in the sequence against each component GMM is computed; the time average of the summed likelihoods of the whole group of voiceprint features against one GMM is taken as the likelihood of the sequence with respect to that GMM.
Having obtained the sequence's likelihood with respect to each component GMM, either the maximum or the average of these is taken as the likelihood of the sequence with respect to the speaker model.
2. First compute the likelihood of each voiceprint feature in the sequence with respect to the multi-GMM model, then determine the sequence's likelihood with respect to the speaker model from those results.
In this approach, the likelihood of each voiceprint feature against each component GMM is computed; for each voiceprint feature, either the maximum of its likelihoods against all component GMMs, or their mean, is taken as that feature's likelihood with respect to the multi-GMM model.
Having obtained each feature's likelihood with respect to the multi-GMM model, the time average of the summed likelihoods of all features in the sequence is taken as the likelihood of the sequence with respect to the speaker model.
Of course, other selection schemes are possible, such as a weighted average of all computed likelihoods; embodiments of the invention place no restriction on this.
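The two combination options above differ only in the order of the frame average and the over-model reduction. A small sketch, assuming per-frame likelihoods have already been computed into a matrix (the function and parameter names are illustrative):

```python
import numpy as np

def combine_scores(L, frame_first=False, reduce_fn="max"):
    """L[t, k]: likelihood of the t-th voiceprint feature under the k-th
    component GMM of the speaker model.
    frame_first=False -> option 1: average over frames per GMM,
                         then reduce (max or mean) over the GMMs.
    frame_first=True  -> option 2: reduce over the GMMs per frame,
                         then average over frames."""
    red = np.max if reduce_fn == "max" else np.mean
    if frame_first:
        return float(np.mean(red(L, axis=1)))
    return float(red(np.mean(L, axis=0)))
```

With the mean reduction the two orders coincide; with the max they generally do not, which is why the text lists them as distinct implementations.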
Step 105: compute the likelihood ratio from the likelihood of the voiceprint feature sequence with respect to the speaker model and its likelihood with respect to the background model.
The likelihood ratio is

$$p = \frac{p(X\mid U)}{p(X\mid \mathrm{UBM})} \qquad (3)$$

where $p(X\mid U)$ is the likelihood of the voiceprint feature sequence with respect to the speaker model, and $p(X\mid \mathrm{UBM})$ is its likelihood with respect to the background model.
Step 106: judge whether the likelihood ratio is greater than a preset threshold; if so, execute step 107; otherwise, execute step 108.
The threshold can be preset by the system. In general, the larger the threshold, the more sensitive the system: the user must pronounce the speech signal (password) at login as closely as possible to the pronunciation at registration. A smaller threshold makes the system less sensitive, tolerating some variation between the login pronunciation and the registration pronunciation.
Step 107: determine that the current login user is a validly authenticated user.
Step 108: determine that the current login user is a non-authenticated user.
It should be noted that, to improve system robustness, the continuous speech signal can also be denoised between steps 101 and 102. For example, the continuous speech signal is first segmented into independent speech segments and non-speech segments by short-time energy and short-time zero-crossing-rate analysis; front-end noise reduction then suppresses channel noise and background noise, improving the signal-to-noise ratio and providing a clean signal for subsequent processing.
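The short-time energy and zero-crossing-rate analysis mentioned above can be sketched as follows; the thresholds and the simple frame-level decision rule are assumptions for illustration (the patent leaves the segmentation method unspecified beyond naming the two features).

```python
import numpy as np

def short_time_energy(frame):
    """Sum of squared samples within one analysis frame."""
    return float(np.sum(np.asarray(frame, dtype=float) ** 2))

def short_time_zcr(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    s = np.sign(np.asarray(frame, dtype=float))
    return float(np.mean(s[:-1] * s[1:] < 0))

def is_speech(frame, energy_thresh, zcr_thresh):
    """Crude frame-level speech/non-speech decision: voiced speech tends to
    have high energy, while silence/noise has low energy and high ZCR."""
    return (short_time_energy(frame) > energy_thresh
            and short_time_zcr(frame) < zcr_thresh)
```

A practical segmenter would smooth these per-frame decisions over time before cutting the signal into speech and non-speech segments.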
A user's voiceprint features are relatively stable yet variable: they are easily affected by physical condition, age, and mood on the one hand, and by environmental noise and the speech-acquisition channel on the other. The speaker model therefore needs to distinguish well between the different voiceprint variations of the same speaker. In embodiments of the invention, the speaker model is a multi-GMM model built from the speech signals recorded at registration: the number of component GMMs and the number of Gaussians in each are tied, respectively, to the repetition count of the recorded speech signal and its frame length. Multiple GMMs can thus simulate the pronunciation variations that occur when the user repeats the same password (i.e., the speech signal above), improving the accuracy of identity authentication based on a voiceprint password.
In embodiments of the invention, the background model describes the shared characteristics of speakers' voiceprints and needs to be built in advance, for which prior-art approaches can be used — for example, simulating the background model with a GMM of 1024 or more Gaussians, whose parameter training process is shown in Fig. 2.
Step 201: extract voiceprint features from the training speech signals of many speakers, each voiceprint feature serving as one feature vector.
Step 202: cluster the feature vectors with a clustering algorithm to obtain initial means for K Gaussians, K being the preset number of mixture components.
For example, the traditional LBG (Linde, Buzo, Gray) clustering algorithm can be used, which approaches an optimal reproduction codebook through the training vector set and an iterative algorithm.
Step 203: iteratively update the means, the variances, and each Gaussian's weight coefficient with the EM (Expectation Maximization) algorithm to obtain the background model.
The concrete iterative update process is the same as in the prior art and is not detailed here.
Of course, the background model can also be built in other ways; embodiments of the invention place no restriction on this.
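For concreteness, one EM iteration of step 203 can be sketched for a diagonal-covariance GMM; this is a generic textbook EM update, not the patent's specific implementation, and the function name and array layout are assumptions.

```python
import numpy as np

def em_step(X, weights, means, variances):
    """One EM update of a diagonal-covariance GMM.
    X: (T, d) features; weights: (M,); means, variances: (M, d)."""
    T, d = X.shape
    M = len(weights)
    # E-step: responsibilities gamma[t, m], computed in the log domain
    log_p = np.empty((T, M))
    for m in range(M):
        diff = X - means[m]
        log_p[:, m] = (np.log(weights[m])
                       - 0.5 * np.sum(np.log(2 * np.pi * variances[m]))
                       - 0.5 * np.sum(diff ** 2 / variances[m], axis=1))
    log_p -= log_p.max(axis=1, keepdims=True)
    gamma = np.exp(log_p)
    gamma /= gamma.sum(axis=1, keepdims=True)
    # M-step: re-estimate weights, means, variances from soft counts
    Nm = gamma.sum(axis=0)
    new_w = Nm / T
    new_mu = (gamma.T @ X) / Nm[:, None]
    new_var = np.stack([(gamma[:, m:m + 1] * (X - new_mu[m]) ** 2).sum(0) / Nm[m]
                        for m in range(M)])
    return new_w, new_mu, new_var
```

A 1024-component UBM is simply this update run for several iterations over the pooled multi-speaker features, starting from the LBG cluster centroids of step 202.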
In embodiments of the invention, the system needs to distinguish whether the user is in login mode or registration mode. In login mode, the user is authenticated against the voiceprint password according to the flow of Fig. 1; in registration mode, the registration speech signal recorded by the user is received and the user's speaker model is built from it.
The speaker-model building process in embodiments of the invention is entirely different from that of the traditional speaker model. To illustrate this better, the traditional building process is first described briefly.
Traditionally, the speaker model is built by taking the background model as the initial model and adjusting part of its parameters with an adaptive method, such as the now most commonly used MAP-based adaptation algorithm. The adaptation algorithm adapts the shared voiceprint characteristics of the background model toward the individuality of the current speaker using a small amount of speaker data. Its training flow, shown in Fig. 3, comprises the following steps:
Step 301: extract voiceprint features from the registration speech signal recorded by the user.
Step 302: adaptively update the Gaussian means $\mu_m$ of the background model with the extracted voiceprint features.

Specifically, the new Gaussian mean $\hat{\mu}_m$ is computed as a weighted average of the sample statistics and the original Gaussian mean:

$$\hat{\mu}_m = \frac{\sum_{t=1}^{T}\gamma_m(x_t)\,x_t + \tau\,\mu_m}{\sum_{t=1}^{T}\gamma_m(x_t) + \tau} \qquad (4)$$

where $x_t$ denotes the voiceprint feature of frame t, $\gamma_m(x_t)$ the probability that frame t falls on the m-th Gaussian, and $\tau$ a forgetting factor balancing the historical mean against the sample update. In general, the larger $\tau$, the more the new mean is constrained by the original mean; the smaller $\tau$, the more the new mean is determined by the sample statistics, reflecting the distribution of the new samples.
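Equation (4) translates directly into code. A minimal sketch for one Gaussian, with the posteriors assumed precomputed (the function name and default value of tau are illustrative, not from the patent):

```python
import numpy as np

def map_update_mean(frames, gamma_m, mu_m, tau=16.0):
    """Eq. (4): MAP re-estimation of one Gaussian mean.
    frames: (T, d) adaptation features; gamma_m: (T,) posterior of each
    frame on Gaussian m; mu_m: (d,) prior mean; tau: forgetting factor."""
    num = (gamma_m[:, None] * frames).sum(axis=0) + tau * mu_m
    den = gamma_m.sum() + tau
    return num / den
```

As tau grows, the returned mean approaches the prior mu_m; as tau shrinks, it approaches the posterior-weighted sample mean, exactly the trade-off the text describes.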
Step 303: copy the background model variance as the variance of the user's speaker model.
Step 304: generate the user's speaker model.
In embodiments of the invention, the registration speech signal recorded by the user is received at registration, and the user's speaker model is built from it. This speaker model consists of multiple GMMs, simulating the pronunciation variations that occur when the speaker repeats the same password. Moreover, each component GMM has its own separately trained variance, which solves the problem in the conventional method that directly copying the background model variance yields an excessive variance unsuited to practical use.
Fig. 4 is a flowchart of building the speaker model in an embodiment of the invention, comprising the following steps:
Step 401: save each registration speech signal recorded by the user as a sample sequence.
Assuming the user enters the same password content N times at registration (e.g., N = 2, 3, ...), N independent sample sequences are obtained.
Step 402: extract voiceprint features from the obtained sample sequences.
The concrete process is similar to step 102 above and is not detailed here.
Step 403: determine all component GMMs of the user's speaker model according to the repetition count and frame lengths of the registration speech signals.
In voiceprint password applications, the user speaks a single fixed text as the password. For example, the number of component GMMs of the user's speaker model can be set equal to the repetition count of the registration speech signal, and the number of Gaussians of each component GMM set equal to the frame count of its corresponding registration recording, expressed as

$$p(O\mid M_k) = \sum_{m=1}^{T(k)} c_m^k\, N(O;\mu_m^k,\Sigma_m^k) \qquad (5)$$

where $T(k)$ is the number of Gaussians of the GMM $M_k$, equal to the frame count of the k-th speech sample corresponding to the model, and $c_m^k$, $\mu_m^k$, $\Sigma_m^k$ are respectively the weight, mean, and variance of the m-th Gaussian component of $M_k$.
Of course, embodiments of the invention do not restrict the topology of the speaker model: the number of component GMMs and the number of Gaussians per GMM need not exactly equal the repetition count and frame counts of the registration speech signals. For example, a clustering algorithm can be used to choose fewer GMMs than the repetition count of the registration speech signal, and likewise each GMM can have fewer Gaussians than its frame count.
Step 404: estimate the Gaussian mean parameters of all component GMMs from the extracted voiceprint features.
In embodiments of the invention, the Gaussian mean parameters of each component GMM are determined from its corresponding single training sample. Specifically, each Gaussian mean vector of the GMM is set to the feature vector of the sample, i.e.

$$\mu_m^k = O_m^k$$

where $\mu_m^k$ denotes the mean of the m-th Gaussian of the k-th mixture model, and $O_m^k$ the voiceprint feature vector of the m-th frame of the k-th speech signal.
Step 405: estimate the Gaussian variance parameters of all component GMMs from the extracted voiceprint features.
To make variance re-estimation feasible on little data, the Gaussians within each component GMM of the speaker model can be assumed to share a single global covariance matrix, i.e. $\Sigma_m^k = \Sigma_k$ for all m (the covariance matrices of all Gaussian components of the k-th GMM have identical values). Specifically, for a given sample voiceprint feature sequence $O_k$, the variance of the GMM $M_k$ is re-estimated from the statistics of all the remaining sample voiceprint feature sequences, computed as

$$\Sigma_k = \frac{\sum_{n\neq k}\sum_{i=1}^{T(n)}\sum_{m=1}^{T(k)} \gamma_m^k(O_i^n)\,(O_i^n-\mu_m^k)(O_i^n-\mu_m^k)^T}{\sum_{n\neq k}\sum_{i=1}^{T(n)}\sum_{m=1}^{T(k)} \gamma_m^k(O_i^n)} \qquad (6)$$

where $O_i^n$ denotes the i-th speech frame (i.e., sample) of the n-th password utterance (i.e., registration speech signal), $\mu_m^k$ the m-th Gaussian mean of the k-th GMM, and $\gamma_m^k(O_i^n)$ the probability of the sample $O_i^n$ falling on the Gaussian with mean $\mu_m^k$.

In this way, for each individual GMM $M_k$ of the speaker model, the corresponding variance parameter is obtained from the sample data other than $O_k$. If there are N registration utterances, N different variance matrices are obtained.

In particular, this variance matrix can be assumed to be diagonal to further reduce data sparsity, i.e. $\Sigma_k = \mathrm{diag}(\sigma_k^2)$. Going further, the Gaussians of all component GMMs of the speaker model can be assumed to share one global unified diagonal covariance, to better solve variance re-estimation under sparse data; under this hypothesis, a single diagonal covariance estimated from all registration data is used for every Gaussian of the speaker model.
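The leave-one-out variance of equation (6), under the shared diagonal assumption, can be sketched as follows. One simplification here is an assumption of this example, not of the patent: the posterior gamma is approximated by a hard assignment to the nearest Gaussian (a soft gamma would itself require a variance, creating a chicken-and-egg that the patent does not spell out).

```python
import numpy as np

def loo_shared_variance(utterances, k):
    """Eq. (6) with a shared diagonal variance for the k-th component GMM,
    whose Gaussian means are the frames of utterance k (step 404).
    Estimated from every registration utterance except the k-th.
    Hard-assignment approximation: gamma_m^k puts probability 1 on the
    nearest Gaussian mean."""
    means = np.asarray(utterances[k], dtype=float)   # (T_k, d): one mean per frame
    num = np.zeros(means.shape[1])
    den = 0.0
    for n, O_n in enumerate(utterances):
        if n == k:
            continue                                 # leave utterance k out
        for o in np.asarray(O_n, dtype=float):       # frame O_i^n
            m = int(np.argmin(np.sum((means - o) ** 2, axis=1)))
            num += (o - means[m]) ** 2               # diagonal of the outer product
            den += 1.0
    return num / den
```

With N registration utterances this yields N diagonal variance vectors, one per component GMM, matching the text; averaging them would give the single globally shared variance of the final hypothesis.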
Step 406: estimate the Gaussian weight coefficients of all the Gaussian mixture models.
In this embodiment, the Gaussian means of each mixture model are determined directly by the sample vectors, so each Gaussian occurs on its own sample with probability 1; that is, all Gaussians have the same probability of occurrence. The weight coefficients of the Gaussians within a mixture model can therefore be set equal:

    c_m^k = c^k = 1 / T(k)    (7)
With the flow shown in Fig. 4, the number of Gaussian mixture models in the speaker model and the topology of each model can be set according to the number of sentences and the length of each sentence in the registration voice. By reasonably setting the Gaussian means, variances, and weight coefficients of all the mixture models, the data-sparsity problem in training that exists in traditional voiceprint-password authentication systems is effectively solved, the discrimination between the mixture models is improved, and the accuracy of identity authentication can therefore be improved. Moreover, the resulting mixture models are smaller and more efficient, which greatly improves computation speed and reduces the memory required for storing data compared with the prior art.
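The model construction just summarized — one mixture model per registration sentence, one Gaussian per frame, a shared variance, and the uniform weights of Eq. (7) — can be sketched as follows (a toy illustration; all names and the dictionary representation are assumptions, and the shared variance is taken as given, e.g. from the Step-405 estimate):

```python
import numpy as np

def build_speaker_model(sentences, var):
    """Sketch of the sample-based speaker model described above.

    sentences : list of (T_k, D) arrays, one per password repetition
    var       : (D,) shared diagonal variance (assumed precomputed)
    """
    model = []
    for O_k in sentences:
        T_k = len(O_k)
        model.append({
            "means": np.asarray(O_k),           # one Gaussian per frame
            "var": np.asarray(var),             # globally tied diagonal variance
            "weights": np.full(T_k, 1.0 / T_k)  # equal weights, Eq. (7)
        })
    return model
```

The number of mixture models thus equals the repetition count, and each model's Gaussian count equals its sentence's frame count, matching the topology constraints stated elsewhere in the text.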
Correspondingly, an embodiment of the invention further provides an identity authentication system. Fig. 5 is a schematic structural diagram of an identity authentication system according to an embodiment of the invention.
In this embodiment, the system comprises:
a voice signal receiving unit 501, configured to receive, when a user logs in, the continuous speech signal recorded by the current login user;
an extraction unit 502, configured to extract a voiceprint feature sequence from the continuous speech signal;
a first computing unit 503, configured to compute the likelihood between the voiceprint feature sequence and a background model;
a second computing unit 504, configured to compute the likelihood between the voiceprint feature sequence and the speaker model of the current login user, where the speaker model is a multi-mixture Gaussian model constructed according to the repetition count and frame counts of the registration voice signals recorded when the current login user registered;
a third computing unit 505, configured to compute a likelihood ratio from the likelihood between the voiceprint feature sequence and the speaker model and the likelihood between the voiceprint feature sequence and the background model; and
a judging unit 506, configured to determine that the current login user is a validly authenticated user when the likelihood ratio computed by the third computing unit 505 is greater than a preset threshold, and otherwise determine that the current login user is a non-authenticated user.
The voiceprint feature sequence comprises a group of voiceprint features that can effectively distinguish different speakers while remaining relatively stable against variation within the same speaker's voice.
For example, the voiceprint features that the extraction unit 502 can extract mainly include: spectral envelope parameters, pitch contour, formant frequency and bandwidth features, linear prediction coefficients, cepstral coefficients, and so on. Considering the quantifiability of these voiceprint features, the amount of training data, and the evaluation of system performance, MFCC (Mel Frequency Cepstral Coefficient) features can be selected: short-time analysis is performed on each frame of speech data using a 25 ms window shifted by 10 ms, yielding the MFCC parameters and their first- and second-order differences, 39 dimensions in total. In this way, each voice signal can be quantized into a sequence X of 39-dimensional voiceprint feature vectors.
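For illustration, the 25 ms / 10 ms short-time framing mentioned above can be sketched as follows (a minimal numpy sketch; the full MFCC computation — mel filterbank, DCT, and the first- and second-order differences that complete the 39 dimensions — is omitted, and the function name is an assumption):

```python
import numpy as np

def frame_signal(x, sr, win_ms=25, hop_ms=10):
    """Split a speech signal into the 25 ms windows, shifted by 10 ms,
    used for MFCC short-time analysis."""
    win = int(sr * win_ms / 1000)            # e.g. 400 samples at 16 kHz
    hop = int(sr * hop_ms / 1000)            # e.g. 160 samples at 16 kHz
    n = 1 + max(0, (len(x) - win) // hop)    # number of complete frames
    return np.stack([x[i * hop:i * hop + win] for i in range(n)])
```

At a 16 kHz sampling rate, one second of speech yields 98 frames of 400 samples each; each frame would then be mapped to one 39-dimensional voiceprint vector.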
The background model can be constructed in advance by the system and loaded at initialization; the embodiment of the invention places no restriction on the specific construction process of the background model.
The speaker model is a multi-mixture Gaussian model constructed from the voice signals recorded when the current login user registered. Correspondingly, in the embodiments of the invention, the second computing unit 504 can be implemented in multiple ways, for example:
In one implementation, the second computing unit 504 comprises a first computation subunit and a first determination subunit, where:
the first computation subunit is configured to compute the likelihood between the voiceprint feature sequence and each mixture model respectively; and
the first determination subunit is configured to determine, from the computation results of the first computation subunit, the likelihood between the voiceprint feature sequence and the speaker model of the current login user.
The first computation subunit may comprise a first computing module and a first selection module, where:
the first computing module is configured to compute the likelihood between each voiceprint feature in the voiceprint feature sequence and each mixture model in the multi-mixture Gaussian model respectively; and
the first selection module is configured to take, for each mixture model, the time average of the sum of the likelihoods computed for the group of voiceprint features against that mixture model as the likelihood between the voiceprint feature sequence and that mixture model.
Correspondingly, the first determination subunit can also be implemented in multiple ways. For example, after the first computation subunit obtains the likelihood between the voiceprint feature sequence and each mixture model, the first determination subunit can select the maximum or the average of these likelihoods as the likelihood between the voiceprint feature sequence and the speaker model of the current login user.
In another implementation, the second computing unit 504 comprises a second computation subunit and a second determination subunit, where:
the second computation subunit is configured to compute the likelihood of each voiceprint feature in the voiceprint feature sequence with respect to the multi-mixture Gaussian model respectively; and
the second determination subunit is configured to determine, from the computation results of the second computation subunit, the likelihood between the voiceprint feature sequence and the speaker model of the current login user.
The second computation subunit may comprise a second computing module and a second selection module, where:
the second computing module is configured to compute the likelihood between each voiceprint feature in the voiceprint feature sequence and each mixture model in the multi-mixture Gaussian model respectively; and
the second selection module is configured to take, for each voiceprint feature, either the maximum of the likelihoods computed against all the mixture models in the multi-mixture Gaussian model, or the average of those likelihoods, as the likelihood between that voiceprint feature and the multi-mixture Gaussian model.
Correspondingly, the second determination subunit can also be implemented in multiple ways. For example, after the second computation subunit obtains the likelihood of each voiceprint feature with respect to the multi-mixture Gaussian model, the second determination subunit can take the time average of these per-feature likelihoods as the likelihood between the voiceprint feature sequence and the speaker model of the current login user.
Of course, the second computing unit 504 can also be implemented in other ways; the embodiment of the invention places no restriction on this.
For the specific computation processes of the first computing unit 503, the second computing unit 504, and the third computing unit 505, reference can be made to the description in the identity authentication method embodiments above; they are not repeated here.
In the embodiments of the invention, the speaker model is a multi-mixture Gaussian model constructed from the voice signals recorded when the current login user registered: the number of mixture models and the number of Gaussians in each mixture model are related to the repetition count of the recorded voice signals and the frame counts of those signals. A plurality of mixture models can thus be used to model the pronunciation variation that exists when the user says the same password (i.e., the above voice signal) several times, improving the accuracy of voiceprint-password-based identity authentication.
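A toy numpy sketch of the scoring described above — per-frame log-likelihoods against each mixture model of the speaker model, the "maximum" selection variant, time averaging, and the likelihood-ratio decision against the background model (all names, and the diagonal-covariance dictionary representation of a GMM, are assumptions for illustration):

```python
import numpy as np

def logsumexp(a):
    m = np.max(a)
    return m + np.log(np.sum(np.exp(a - m)))

def log_gmm(x, gmm):
    """Log-likelihood of one voiceprint feature under one diagonal-covariance GMM."""
    d = x - gmm["means"]                               # (M, D) differences
    log_n = -0.5 * (np.sum(d * d / gmm["var"], axis=1)
                    + np.sum(np.log(2 * np.pi * gmm["var"])))
    return logsumexp(np.log(gmm["weights"]) + log_n)   # mix over Gaussians

def verify(X, speaker_gmms, ubm, threshold):
    """Likelihood-ratio test: best mixture model per frame, averaged over
    time, minus the background-model score, compared with the threshold."""
    spk = np.mean([max(log_gmm(x, g) for g in speaker_gmms) for x in X])
    bg = np.mean([log_gmm(x, ubm) for x in X])
    return spk - bg > threshold
```

Swapping `max` for a mean over the mixture models gives the "average" selection variant also described in the text.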
Fig. 6 is another schematic structural diagram of an identity authentication system according to an embodiment of the invention.
Different from the embodiment shown in Fig. 5, in this embodiment the voice signal receiving unit 501 is also configured to receive, when a user registers, the registration voice signals recorded by the user.
In addition, the system further comprises a model construction unit 601, configured to construct the user's speaker model from the registration voice signals. The model construction unit 601 comprises:
a feature extraction subunit 611, configured to extract voiceprint features from the registration voice signals;
a topology determination subunit 612, configured to determine all the mixture models of the user's speaker model according to the repetition count and frame counts of the registration voice signals;
for example, the number of mixture models in the user's speaker model can be set to be less than or equal to the repetition count of the registration voice signals, and the number of Gaussians in each mixture model can be set to be less than or equal to the frame count of the corresponding registration voice signal;
a first estimation subunit 613, configured to estimate, using the voiceprint features extracted by the feature extraction subunit 611, the Gaussian mean parameters of all the mixture models determined by the topology determination subunit 612; and
a second estimation subunit 614, configured to estimate, using the voiceprint features extracted by the feature extraction subunit 611, the Gaussian variance parameters of all the mixture models determined by the topology determination subunit 612.
For the estimation of the relevant parameters of the mixture models by each estimation subunit, reference can be made to the description above; it is not repeated here.
With the identity authentication system of the embodiment of the invention, the number of Gaussian mixture models in the speaker model and the topology of each model can be set according to the number of sentences and the length of each sentence in the registration voice. By reasonably setting the Gaussian means, variances, and weight coefficients of all the mixture models, the data-sparsity problem in training that exists in traditional voiceprint-password authentication systems is effectively solved, the discrimination between the mixture models is improved, and the accuracy of identity authentication can therefore be improved. Moreover, the resulting mixture models are smaller and more efficient, which greatly improves computation speed and reduces the memory required for storing data compared with the prior art.
The embodiments in this specification are described in a progressive manner; for identical or similar parts, the embodiments may refer to one another, and each embodiment focuses on its differences from the others. In particular, since the system embodiments are substantially similar to the method embodiments, their description is relatively brief; for relevant details, refer to the corresponding parts of the method embodiments. The system embodiments described above are merely schematic, and the units and modules described as separate components may or may not be physically separate. Some or all of the units and modules can be selected according to actual needs to achieve the purpose of the embodiment. Those of ordinary skill in the art can understand and implement the embodiments without creative effort.
The above discloses only preferred embodiments of the invention, but the invention is not limited thereto. Any variation that a person skilled in the art can conceive without creative effort, and any improvement or modification made without departing from the principle of the invention, shall fall within the protection scope of the invention.

Claims (18)

1. An identity authentication method, characterized by comprising:
when a user logs in, receiving a continuous speech signal recorded by the current login user;
extracting a voiceprint feature sequence from the continuous speech signal, the voiceprint feature sequence comprising a group of voiceprint features;
computing the likelihood between the voiceprint feature sequence and a background model;
computing the likelihood between the voiceprint feature sequence and a speaker model of the current login user, the speaker model being a multi-mixture Gaussian model constructed according to the repetition count and frame counts of the registration voice signals recorded when the current login user registered;
computing a likelihood ratio from the likelihood between the voiceprint feature sequence and the speaker model and the likelihood between the voiceprint feature sequence and the background model; and
if the likelihood ratio is greater than a preset threshold, determining that the current login user is a validly authenticated user; otherwise, determining that the current login user is a non-authenticated user.
2. The method according to claim 1, characterized in that computing the likelihood between the voiceprint feature sequence and the speaker model of the current login user comprises:
computing the likelihood between the voiceprint feature sequence and each mixture model respectively; and
determining the likelihood between the voiceprint feature sequence and the speaker model of the current login user according to the computation results.
3. The method according to claim 2, characterized in that computing the likelihood between the voiceprint feature sequence and each mixture model respectively comprises:
computing the likelihood between each voiceprint feature in the voiceprint feature sequence and each mixture model in the multi-mixture Gaussian model respectively; and
taking, for each mixture model, the time average of the sum of the likelihoods computed for the group of voiceprint features against that mixture model as the likelihood between the voiceprint feature sequence and that mixture model.
4. The method according to claim 2, characterized in that determining the likelihood between the voiceprint feature sequence and the speaker model of the current login user according to the computation results comprises:
taking the average of the likelihoods computed for the voiceprint feature sequence against all the mixture models as the likelihood between the voiceprint feature sequence and the speaker model of the current login user; or
taking the maximum of the likelihoods computed for the voiceprint feature sequence against all the mixture models as the likelihood between the voiceprint feature sequence and the speaker model of the current login user.
5. The method according to claim 1, characterized in that computing the likelihood between the voiceprint feature sequence and the speaker model of the current login user comprises:
computing the likelihood of each voiceprint feature in the voiceprint feature sequence with respect to the multi-mixture Gaussian model respectively; and
determining the likelihood between the voiceprint feature sequence and the speaker model of the current login user according to the computation results.
6. The method according to claim 5, characterized in that computing the likelihood of each voiceprint feature in the voiceprint feature sequence with respect to the multi-mixture Gaussian model respectively comprises:
computing the likelihood between each voiceprint feature in the voiceprint feature sequence and each mixture model in the multi-mixture Gaussian model respectively; and
taking, for each voiceprint feature, the maximum of the likelihoods computed against all the mixture models in the multi-mixture Gaussian model as the likelihood between that voiceprint feature and the multi-mixture Gaussian model; or taking, for each voiceprint feature, the average of the likelihoods computed against all the mixture models in the multi-mixture Gaussian model as the likelihood between that voiceprint feature and the multi-mixture Gaussian model.
7. The method according to claim 5, characterized in that determining the likelihood between the voiceprint feature sequence and the speaker model of the current login user according to the computation results comprises:
taking the time average of the likelihoods of all the voiceprint features in the voiceprint feature sequence with respect to the multi-mixture Gaussian model as the likelihood between the voiceprint feature sequence and the speaker model of the current login user.
8. The method according to any one of claims 1 to 7, characterized in that the method further comprises:
when a user registers, receiving the registration voice signals recorded by the user; and
constructing the user's speaker model from the registration voice signals, the constructing comprising:
extracting voiceprint features from the registration voice signals;
determining all the mixture models of the user's speaker model according to the repetition count and frame counts of the registration voice signals;
estimating the Gaussian mean parameters of all the mixture models of the user's speaker model according to the voiceprint features extracted from the registration voice signals; and
estimating the Gaussian variance parameters of all the mixture models of the user's speaker model according to the voiceprint features extracted from the registration voice signals.
9. The method according to claim 8, characterized in that determining all the mixture models of the user's speaker model according to the repetition count and frame counts of the registration voice signals comprises:
setting the number of mixture models in the user's speaker model to be less than or equal to the repetition count of the registration voice signals; and
setting the number of Gaussians in each mixture model to be less than or equal to the frame count of the registration voice signal corresponding to that mixture model.
10. An identity authentication system, characterized by comprising:
a voice signal receiving unit, configured to receive, when a user logs in, a continuous speech signal recorded by the current login user;
an extraction unit, configured to extract a voiceprint feature sequence from the continuous speech signal, the voiceprint feature sequence comprising a group of voiceprint features;
a first computing unit, configured to compute the likelihood between the voiceprint feature sequence and a background model;
a second computing unit, configured to compute the likelihood between the voiceprint feature sequence and a speaker model of the current login user, the speaker model being a multi-mixture Gaussian model constructed according to the repetition count and frame counts of the registration voice signals recorded when the current login user registered;
a third computing unit, configured to compute a likelihood ratio from the likelihood between the voiceprint feature sequence and the speaker model and the likelihood between the voiceprint feature sequence and the background model; and
a judging unit, configured to determine that the current login user is a validly authenticated user when the likelihood ratio computed by the third computing unit is greater than a preset threshold, and otherwise determine that the current login user is a non-authenticated user.
11. The system according to claim 10, characterized in that the second computing unit comprises:
a first computation subunit, configured to compute the likelihood between the voiceprint feature sequence and each mixture model respectively; and
a first determination subunit, configured to determine, from the computation results of the first computation subunit, the likelihood between the voiceprint feature sequence and the speaker model of the current login user.
12. The system according to claim 11, characterized in that the first computation subunit comprises:
a first computing module, configured to compute the likelihood between each voiceprint feature in the voiceprint feature sequence and each mixture model in the multi-mixture Gaussian model respectively; and
a first selection module, configured to take, for each mixture model, the time average of the sum of the likelihoods computed for the group of voiceprint features against that mixture model as the likelihood between the voiceprint feature sequence and that mixture model.
13. The system according to claim 11, characterized in that
the first determination subunit is specifically configured to take the average of the likelihoods computed for the voiceprint feature sequence against all the mixture models as the likelihood between the voiceprint feature sequence and the speaker model of the current login user; or to take the maximum of those likelihoods as the likelihood between the voiceprint feature sequence and the speaker model of the current login user.
14. The system according to claim 10, characterized in that the second computing unit comprises:
a second computation subunit, configured to compute the likelihood of each voiceprint feature in the voiceprint feature sequence with respect to the multi-mixture Gaussian model respectively; and
a second determination subunit, configured to determine, from the computation results of the second computation subunit, the likelihood between the voiceprint feature sequence and the speaker model of the current login user.
15. The system according to claim 14, characterized in that the second computation subunit comprises:
a second computing module, configured to compute the likelihood between each voiceprint feature in the voiceprint feature sequence and each mixture model in the multi-mixture Gaussian model respectively; and
a second selection module, configured to take, for each voiceprint feature, the maximum of the likelihoods computed against all the mixture models in the multi-mixture Gaussian model as the likelihood between that voiceprint feature and the multi-mixture Gaussian model; or to take, for each voiceprint feature, the average of those likelihoods as the likelihood between that voiceprint feature and the multi-mixture Gaussian model.
16. The system according to claim 14, characterized in that
the second determination subunit is specifically configured to take the time average of the likelihoods of the voiceprint features in the voiceprint feature sequence with respect to the multi-mixture Gaussian model as the likelihood between the voiceprint feature sequence and the speaker model of the current login user.
17. The system according to any one of claims 10 to 16, characterized in that
the voice signal receiving unit is also configured to receive, when a user registers, the registration voice signals recorded by the user; and
the system further comprises a model construction unit, configured to construct the user's speaker model from the registration voice signals, the model construction unit comprising:
a feature extraction subunit, configured to extract voiceprint features from the registration voice signals;
a topology determination subunit, configured to determine all the mixture models of the user's speaker model according to the repetition count and frame counts of the registration voice signals;
a first estimation subunit, configured to estimate, using the voiceprint features extracted by the feature extraction subunit, the Gaussian mean parameters of all the mixture models determined by the topology determination subunit; and
a second estimation subunit, configured to estimate, using the voiceprint features extracted by the feature extraction subunit, the Gaussian variance parameters of all the mixture models determined by the topology determination subunit.
18. The system according to claim 17, characterized in that
the topology determination subunit is specifically configured to set the number of mixture models in the user's speaker model to be less than or equal to the repetition count of the registration voice signals, and to set the number of Gaussians in each mixture model to be less than or equal to the frame count of the registration voice signal corresponding to that mixture model.
CN2011102180452A 2011-08-01 2011-08-01 Identity authentication method and system Active CN102238190B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2011102180452A CN102238190B (en) 2011-08-01 2011-08-01 Identity authentication method and system


Publications (2)

Publication Number Publication Date
CN102238190A true CN102238190A (en) 2011-11-09
CN102238190B CN102238190B (en) 2013-12-11

Family

ID=44888395

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2011102180452A Active CN102238190B (en) 2011-08-01 2011-08-01 Identity authentication method and system

Country Status (1)

Country Link
CN (1) CN102238190B (en)

Cited By (20)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102496365A (en) * 2011-11-30 2012-06-13 上海博泰悦臻电子设备制造有限公司 User verification method and device
CN102510426A (en) * 2011-11-29 2012-06-20 安徽科大讯飞信息科技股份有限公司 Personal assistant application access method and system
CN102710602A (en) * 2012-04-28 2012-10-03 深圳创维-Rgb电子有限公司 Voice login method and system for electronic equipment, and television
CN102968990A (en) * 2012-11-15 2013-03-13 江苏嘉利德电子科技有限公司 Speaker identifying method and system
CN103226951A (en) * 2013-04-19 2013-07-31 清华大学 Speaker verification system creation method based on model sequence adaptive technique
CN104160441A (en) * 2011-12-29 2014-11-19 罗伯特·博世有限公司 Speaker verification in a health monitoring system
CN104239471A (en) * 2014-09-03 2014-12-24 陈飞 Data query/ exchange device in behavior simulation mode and method thereof
CN104361891A (en) * 2014-11-17 2015-02-18 科大讯飞股份有限公司 Method and system for automatically checking customized polyphonic ringtones of specific population
CN104766607A (en) * 2015-03-05 2015-07-08 广州视源电子科技股份有限公司 Television program recommendation method and system
CN105096954A (en) * 2014-05-06 2015-11-25 中兴通讯股份有限公司 Identity identifying method and device
CN106057206A (en) * 2016-06-01 2016-10-26 腾讯科技(深圳)有限公司 Voiceprint model training method, voiceprint recognition method and device
CN106157135A (en) * 2016-07-14 2016-11-23 微额速达(上海)金融信息服务有限公司 Antifraud system and method based on Application on Voiceprint Recognition Sex, Age
CN106228990A (en) * 2016-07-15 2016-12-14 北京光年无限科技有限公司 Login method and operating system towards intelligent robot
CN107705791A (en) * 2016-08-08 2018-02-16 中国电信股份有限公司 Caller identity confirmation method, device and Voiceprint Recognition System based on Application on Voiceprint Recognition
CN107767863A (en) * 2016-08-22 2018-03-06 科大讯飞股份有限公司 voice awakening method, system and intelligent terminal
WO2018166187A1 (en) * 2017-03-13 2018-09-20 平安科技(深圳)有限公司 Server, identity verification method and system, and a computer-readable storage medium
CN109102810A (en) * 2017-06-21 2018-12-28 北京搜狗科技发展有限公司 Method for recognizing sound-groove and device
CN110223078A (en) * 2019-06-17 2019-09-10 国网电子商务有限公司 Identity authentication method, device, electronic equipment and storage medium
CN111023470A (en) * 2019-12-06 2020-04-17 厦门快商通科技股份有限公司 Air conditioner temperature adjusting method, medium, equipment and device
CN115171727A (en) * 2022-09-08 2022-10-11 北京亮亮视野科技有限公司 Method and device for quantifying communication efficiency

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20060111905A1 (en) * 2004-11-22 2006-05-25 Jiri Navratil Method and apparatus for training a text independent speaker recognition system using speech data with text labels
US20080059156A1 (en) * 2006-08-30 2008-03-06 International Business Machines Corporation Method and apparatus for processing speech data
CN101833951A (en) * 2010-03-04 2010-09-15 清华大学 Multi-background modeling method for speaker recognition
CN102024455A (en) * 2009-09-10 2011-04-20 索尼株式会社 Speaker recognition system and method


Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102510426A (en) * 2011-11-29 2012-06-20 Anhui USTC iFlytek Co., Ltd. Personal assistant application access method and system
CN102496365A (en) * 2011-11-30 2012-06-13 Shanghai Pateo Electronic Equipment Manufacturing Co., Ltd. User verification method and device
CN104160441A (en) * 2011-12-29 2014-11-19 Robert Bosch GmbH Speaker verification in a health monitoring system
CN104160441B (en) * 2011-12-29 2017-12-15 Robert Bosch GmbH Speaker verification in a health monitoring system
CN102710602A (en) * 2012-04-28 2012-10-03 Shenzhen Skyworth-RGB Electronic Co., Ltd. Voice login method and system for electronic equipment, and television
CN102968990B (en) * 2012-11-15 2015-04-15 Zhu Donglai Speaker identification method and system
CN102968990A (en) * 2012-11-15 2013-03-13 Jiangsu Jialide Electronic Technology Co., Ltd. Speaker identification method and system
CN103226951B (en) * 2013-04-19 2015-05-06 Tsinghua University Speaker verification system creation method based on model sequence adaptation technique
CN103226951A (en) * 2013-04-19 2013-07-31 Tsinghua University Speaker verification system creation method based on model sequence adaptation technique
CN105096954A (en) * 2014-05-06 2015-11-25 ZTE Corporation Identity identification method and device
CN104239471A (en) * 2014-09-03 2014-12-24 Chen Fei Data query/exchange device in behavior simulation mode and method thereof
CN104239471B (en) * 2014-09-03 2017-12-19 Chen Fei Data query/exchange device in behavior simulation mode and method thereof
CN104361891A (en) * 2014-11-17 2015-02-18 iFlytek Co., Ltd. Method and system for automatically checking customized polyphonic ringtones for a specific population
CN104766607A (en) * 2015-03-05 2015-07-08 Guangzhou Shiyuan Electronic Technology Co., Ltd. Television program recommendation method and system
CN106057206A (en) * 2016-06-01 2016-10-26 Tencent Technology (Shenzhen) Co., Ltd. Voiceprint model training method, voiceprint recognition method and device
CN106057206B (en) * 2016-06-01 2019-05-03 Tencent Technology (Shenzhen) Co., Ltd. Voiceprint model training method, voiceprint recognition method and device
CN106157135A (en) * 2016-07-14 2016-11-23 Wei'e Suda (Shanghai) Financial Information Service Co., Ltd. Anti-fraud system and method based on voiceprint recognition of gender and age
CN106228990A (en) * 2016-07-15 2016-12-14 Beijing Guangnian Wuxian Technology Co., Ltd. Login method and operating system for intelligent robots
CN107705791A (en) * 2016-08-08 2018-02-16 China Telecom Corporation Limited Caller identity confirmation method and device based on voiceprint recognition, and voiceprint recognition system
CN107767863A (en) * 2016-08-22 2018-03-06 iFlytek Co., Ltd. Voice wake-up method, system and intelligent terminal
WO2018166187A1 (en) * 2017-03-13 2018-09-20 Ping An Technology (Shenzhen) Co., Ltd. Server, identity verification method and system, and a computer-readable storage medium
CN109102810A (en) * 2017-06-21 2018-12-28 Beijing Sogou Technology Development Co., Ltd. Voiceprint recognition method and device
CN109102810B (en) * 2017-06-21 2021-10-15 Beijing Sogou Technology Development Co., Ltd. Voiceprint recognition method and device
CN110223078A (en) * 2019-06-17 2019-09-10 State Grid E-Commerce Co., Ltd. Identity authentication method, device, electronic equipment and storage medium
CN111023470A (en) * 2019-12-06 2020-04-17 Xiamen Kuaishangtong Technology Co., Ltd. Air conditioner temperature adjustment method, medium, equipment and device
CN115171727A (en) * 2022-09-08 2022-10-11 Beijing LLVision Technology Co., Ltd. Method and device for quantifying communication efficiency

Also Published As

Publication number Publication date
CN102238190B (en) 2013-12-11

Similar Documents

Publication Publication Date Title
CN102238190B (en) Identity authentication method and system
CN102238189B (en) Voiceprint password authentication method and system
EP1989701B1 (en) Speaker authentication
CN108417201B (en) Single-channel multi-speaker identity recognition method and system
CN102509547B (en) Method and system for voiceprint recognition based on vector quantization
JP6303971B2 (en) Speaker change detection device, speaker change detection method, and computer program for speaker change detection
Chavan et al. An overview of speech recognition using HMM
CN104900235B (en) Voiceprint recognition method based on pitch-period composite characteristic parameters
CN102142253B (en) Voice emotion identification equipment and method
CN108281137A (en) Universal voice wake-up recognition method and system under a whole-phoneme framework
US20080065380A1 (en) On-line speaker recognition method and apparatus thereof
CN104575490A (en) Spoken language pronunciation detecting and evaluating method based on deep neural network posterior probability algorithm
CN104835498A (en) Voiceprint identification method based on multi-type combination characteristic parameters
CN102270451A (en) Method and system for identifying speaker
WO2014114048A1 (en) Voice recognition method and apparatus
CN104765996B (en) Voiceprint password authentication method and system
CN102324232A (en) Voiceprint recognition method and system based on Gaussian mixture models
CN102024455A (en) Speaker recognition system and method
CN101923855A (en) Text-independent voiceprint recognition system
CN102184654A (en) Reading supervision method and device
Shivakumar et al. Simplified and supervised i-vector modeling for speaker age regression
CN104901807A (en) Voiceprint password method suitable for low-end chips
Beritelli et al. The role of voice activity detection in forensic speaker verification
Trabelsi et al. A multi level data fusion approach for speaker identification on telephone speech
Kockmann et al. Recent progress in prosodic speaker verification

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C56 Change in the name or address of the patentee

Owner name: IFLYTEK CO., LTD.

Free format text: FORMER NAME: ANHUI USTC IFLYTEK CO., LTD.

CP03 Change of name, title or address

Address after: No. 666 Wangjiang West Road, Hefei High-tech Development Zone, Anhui, China (230088)

Patentee after: iFlytek Co., Ltd.

Address before: No. 616 Huangshan Road, High-tech Development Zone, Hefei, Anhui, China (230088)

Patentee before: Anhui USTC iFLYTEK Co., Ltd.