CN104765996B

CN104765996B - Voiceprint password authentication method and system

Info

Publication number: CN104765996B
Application number: CN201410005651.XA
Authority: CN
Inventors: 殷兵; 魏思; 柳林; 刘俊华; 张程风; 王建社; 刘海波; 胡国平; 胡郁; 金鑫
Original assignee: Xun Feizhi Metamessage Science And Technology Ltd
Current assignee: Xun Feizhi Metamessage Science And Technology Ltd
Priority date: 2014-01-06
Filing date: 2014-01-06
Publication date: 2018-04-27
Anticipated expiration: 2034-01-06
Also published as: CN104765996A

Abstract

The invention discloses a voiceprint password authentication method and system, belonging to the technical field of password authentication. The method includes: receiving a voice signal input by a user; performing speech recognition on the voice signal to obtain a password text; determining whether a background model corresponding to the password text exists; if so, obtaining the background model; if not, splicing pre-trained pronunciation unit models according to the password text to obtain the background model corresponding to the password text; and authenticating the user using the voiceprint feature sequence in the voice signal, the background model, and the user's voiceprint password model. The method and system accommodate user-defined passwords and frequent password changes.

Description

Voiceprint password authentication method and system

Technical field

The present invention relates to the technical field of password authentication, and in particular to a voiceprint password authentication method and system.

Background

Voiceprint authentication is a technology that automatically confirms a speaker's identity from the voiceprint information in the user's input voice signal, which reflects the speaker's physiological and behavioral characteristics. Compared with other biometric authentication technologies, voiceprint authentication is simpler, more accurate, more economical, and more scalable, and it can be widely applied in areas such as security verification and control.

Voiceprint password authentication is a text-dependent voiceprint authentication technology: the user speaks a fixed password text, and the speaker's identity is confirmed from the voiceprint features. Because the user must speak the same password text at both registration and authentication, the voiceprint information is more consistent, so the authentication performance is better than that of text-independent voiceprint authentication. The technology is useful in applications such as access control systems, startup passwords, and bank payment passwords.

In the prior art, voiceprint password authentication requires collecting a large amount of voice data for the password text offline in advance and training a background model from it. In practice, this often means that all users must share one password text and cannot change it at will.

Requiring all users to share one password makes the password text easy to leak, which in turn can cause many impostors' voices to be misidentified as the target speaker, so effective security cannot be guaranteed. To improve security and meet users' personalization needs, it is therefore highly desirable to let different users define their own passwords or change them often. However, because prior-art background model training requires a large amount of voice data for the password text, it is unsuitable for voiceprint password authentication systems with user-defined passwords, and it hinders password updates in such systems.

Summary of the invention

Embodiments of the present invention provide a voiceprint password authentication method and system that meet the application needs of user-defined passwords and frequent password changes.

The technical solutions provided by the embodiments of the present invention are as follows:

In one aspect, a voiceprint password authentication method is provided, including:

receiving a voice signal input by a user;

performing speech recognition on the voice signal to obtain a password text;

determining whether there exists a background model corresponding to the password text that was trained in advance from password voice data collected offline;

if so, obtaining the background model; if not, splicing pre-trained pronunciation unit models according to the password text to obtain the background model corresponding to the password text;

authenticating the user using the voiceprint feature sequence in the voice signal, the background model, and the user's voiceprint password model.

Preferably, the pronunciation unit models are trained in advance as follows:

obtaining training voice data;

determining pronunciation units according to the training voice data;

determining the topology of the acoustic model of each pronunciation unit;

training the parameters of the acoustic model to obtain the pronunciation unit model.

Preferably, the acoustic model is a GMM;

splicing the pre-trained pronunciation unit models according to the password text to obtain the background model corresponding to the password text includes:

obtaining the GMM corresponding to each pronunciation unit in the password text to obtain a GMM set;

splicing the model units in the GMM set with equal weights to obtain a new combined GMM;

updating the Gaussian weights of the combined GMM so that they sum to 1, thereby obtaining the background model corresponding to the password text.

Preferably, the acoustic model is a GMM;

obtaining the GMM corresponding to each pronunciation unit in the password text to obtain a GMM sequence;

after splicing at least two model units in the GMM sequence in order, applying transitions with a preset self-transition probability and a preset outgoing-transition probability to obtain the background model corresponding to the password text, where the self-transition probability and the outgoing-transition probability sum to 1.

Preferably, the acoustic model is an HMM;

obtaining the HMM sequence corresponding to the pronunciation units in the password text;

splicing the model units in the HMM sequence in order to obtain the background model corresponding to the password text.

Preferably, authenticating the user using the voiceprint feature sequence in the voice signal, the background model, and the user's voiceprint password model includes:

calculating a first likelihood of the voiceprint feature sequence with respect to the voiceprint password model, and a second likelihood of the voiceprint feature sequence with respect to the background model;

determining whether the user is a legitimate user according to the ratio of the first likelihood to the second likelihood and a preset threshold.

Preferably, the method further includes: extracting the voiceprint feature sequence from the voice signal, either before performing speech recognition on the voice signal or after obtaining the background model corresponding to the password text.

Preferably, the method further includes: obtaining the user's voiceprint password model, either before performing speech recognition on the voice signal or after obtaining the background model corresponding to the password text.

Preferably, the method further includes: before obtaining the user's voiceprint password model, judging whether a voiceprint password model of the user currently exists;

if not, building the user's voiceprint password model from the user's registration voice signal and the background model.

In another aspect, a voiceprint password authentication system is provided, including:

a receiving module, for receiving a voice signal input by a user;

an identification module, for performing speech recognition on the voice signal to obtain a password text;

a determining module, for determining whether there exists a background model corresponding to the password text that was trained in advance from password voice data collected offline;

a background model acquisition module, for obtaining the background model after the determining module determines that the background model corresponding to the password text exists, and, after the determining module determines that no such background model exists, splicing pre-trained pronunciation unit models according to the password text to obtain the background model corresponding to the password text;

an authentication module, for authenticating the user using the voiceprint feature sequence in the voice signal, the background model, and the user's voiceprint password model.

Preferably, the system further includes: a training module, for training the pronunciation unit models in advance; the training module includes:

a voice data acquiring unit, for obtaining training voice data;

a first determination unit, for determining pronunciation units according to the training voice data;

a second determination unit, for determining the topology of the acoustic model of each pronunciation unit;

a parameter training unit, for training the parameters of the acoustic model to obtain the pronunciation unit model.

Preferably, the acoustic model is a GMM, and the background model acquisition module includes:

a GMM acquiring unit, for obtaining the GMM corresponding to each pronunciation unit in the password text to obtain a GMM set;

a first splicing unit, for splicing the model units in the GMM set with equal weights to obtain a new combined GMM;

a weight updating unit, for updating the Gaussian weights of the combined GMM so that they sum to 1, thereby obtaining the background model corresponding to the password text.

Alternatively, the acoustic model is a GMM, and the background model acquisition module includes:

a GMM acquiring unit, for obtaining the GMM corresponding to each pronunciation unit in the password text to obtain a GMM sequence;

a second splicing unit, for, after splicing at least two model units in the GMM sequence in order, applying transitions with a preset self-transition probability and a preset outgoing-transition probability to obtain the background model corresponding to the password text, where the self-transition probability and the outgoing-transition probability sum to 1.

Preferably, the acoustic model is an HMM, and the background model acquisition module includes:

an HMM sequence acquiring unit, for obtaining the HMM sequence corresponding to the pronunciation units in the password text;

a third splicing unit, for splicing the model units in the HMM sequence in order to obtain the background model corresponding to the password text.

Preferably, the authentication module includes:

a computing unit, for calculating a first likelihood of the voiceprint feature sequence with respect to the voiceprint password model, and a second likelihood of the voiceprint feature sequence with respect to the background model;

a determination unit, for determining whether the user is a legitimate user according to the ratio of the first likelihood to the second likelihood and a preset threshold.

Preferably, the system further includes: a voiceprint feature sequence extraction module, for extracting the voiceprint feature sequence from the voice signal, either before speech recognition is performed on the voice signal or after the background model corresponding to the password text is obtained.

Preferably, the system further includes: a voiceprint password model acquisition module, for obtaining the user's voiceprint password model, either before speech recognition is performed on the voice signal or after the background model corresponding to the password text is obtained.

Preferably, the system further includes: a judgment module, for judging, before the user's voiceprint password model is obtained, whether a voiceprint password model of the user currently exists;

a voiceprint password model building module, for building the user's voiceprint password model from the user's registration voice signal and the background model after the judgment module judges that no voiceprint password model currently exists.

The voiceprint password authentication method and system provided by the embodiments of the present invention authenticate the user using the voiceprint feature sequence in the voice signal, the background model, and the user's voiceprint password model. If no background model corresponding to the password text exists, pre-trained pronunciation unit models are spliced according to the password text to obtain one. Because the background model is spliced from pre-trained pronunciation unit models, it can easily be generated online in real time, which accommodates user-defined passwords and frequent password changes.

Brief description of the drawings

To describe the technical solutions of the embodiments of the present invention or of the prior art more clearly, the drawings used in the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present invention; a person of ordinary skill in the art can derive other drawings from them.

Fig. 1 is a schematic diagram of the prior-art voiceprint password authentication process;

Fig. 2 is a flowchart of the voiceprint password authentication method provided by an embodiment of the present invention;

Fig. 3 is a structural diagram of the voiceprint password authentication system provided by an embodiment of the present invention.

Detailed description

To help those skilled in the art better understand the solutions of the embodiments of the present invention, the embodiments are described in further detail below with reference to the drawings and implementations.

The prior-art voiceprint password authentication process is briefly described first.

As shown in Fig. 1, the prior-art voiceprint password authentication process includes the following steps:

Step 101: Receive a voice signal input by the user;

Step 102: Extract the voiceprint feature sequence from the voice signal;

Step 103: Perform speech recognition on the voice signal to obtain the password text;

Step 104: Obtain the pre-trained background model corresponding to the password text;

Step 105: Build the user's voiceprint password model from the user's registration voice signal and the background model;

Step 106: Authenticate the user using the voiceprint feature sequence, the background model, and the voiceprint password model.

Prior-art voiceprint password authentication mainly uses a hypothesis-testing framework: the likelihoods of the user's test speech with respect to the background model and the voiceprint password model are computed separately and used to authenticate the user.

The accuracy of the background model corresponding to the password text and of the voiceprint password model directly affects the authentication performance. In a data-driven statistical modeling setting, more training data generally yields a better model. In the prior art, the background model is usually obtained by collecting a large amount of password voice data offline and training on it in advance. Collecting and training on that much password voice data offline is cumbersome and hard to carry out, so in practice the approach is made workable by using a single universal password that cannot be changed at will.

Under the mainstream GMM-UBM framework, a Gaussian mixture model (GMM) is used to model both the universal password background model (Universal Background Model, UBM) and the voiceprint password model.

Because voiceprint password utterances are usually short, the background model, which describes the voiceprint characteristics that users share when speaking the universal password text, is generally a GMM with 256 or more Gaussians.

The universal password background model is trained as follows:

(1) Extract voiceprint feature sequences from a large number of stored password training utterances to obtain a voiceprint feature vector library;

(2) Cluster the extracted voiceprint feature vectors with a clustering algorithm, such as the traditional LBG algorithm, to obtain the initial means of K Gaussians, where K is the model size preset by the system;

(3) Use the expectation-maximization (EM) algorithm to iteratively update the mean, variance, and weight of each Gaussian of the GMM, obtaining a new UBM:

$$p(O \mid \lambda) = \prod_{t=1}^{T} \sum_{m=1}^{K} c_m \, N(o_t; \mu_m, \Sigma_m)$$

where $O = (o_1, \dots, o_T)$ is the observation sequence, $\lambda$ denotes the model parameters, $T$ is the total number of speech frames, $o_t$ is the observation of frame $t$, $c_m$ is the weight of the $m$-th Gaussian and satisfies $\sum_{m=1}^{K} c_m = 1$, and $\mu_m$ and $\Sigma_m$ are the mean and variance of the $m$-th Gaussian. $N(\cdot)$ is the normal density:

$$N(o; \mu, \Sigma) = \frac{1}{(2\pi)^{D/2} \, |\Sigma|^{1/2}} \exp\!\left(-\frac{1}{2}(o-\mu)^{\top} \Sigma^{-1} (o-\mu)\right)$$

where $D$ is the dimension of the feature vector.
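Step (3) can be illustrated with a minimal NumPy sketch of one EM iteration for a GMM. This is a sketch under stated assumptions, not the patent's implementation: it assumes diagonal covariances, and the function names (`log_gauss_diag`, `gmm_log_likelihood`, `em_step`) and the variance floor `var_floor` are hypothetical choices made for illustration.

```python
import numpy as np

def log_gauss_diag(X, means, variances):
    """log N(x_t; mu_m, Sigma_m) for diagonal Sigma_m; returns shape (T, K)."""
    D = X.shape[1]
    diff = X[:, None, :] - means[None, :, :]                  # (T, K, D)
    maha = np.sum(diff ** 2 / variances[None, :, :], axis=2)  # Mahalanobis terms
    log_det = np.sum(np.log(variances), axis=1)               # (K,)
    return -0.5 * (D * np.log(2 * np.pi) + log_det[None, :] + maha)

def gmm_log_likelihood(X, weights, means, variances):
    """log p(O | lambda) = sum_t log sum_m c_m N(o_t; mu_m, Sigma_m)."""
    log_joint = np.log(weights)[None, :] + log_gauss_diag(X, means, variances)
    return float(np.sum(np.logaddexp.reduce(log_joint, axis=1)))

def em_step(X, weights, means, variances, var_floor=1e-3):
    """One EM iteration: returns updated (weights, means, variances)."""
    # E-step: posterior probability of each Gaussian for each frame.
    log_joint = np.log(weights)[None, :] + log_gauss_diag(X, means, variances)
    log_norm = np.logaddexp.reduce(log_joint, axis=1, keepdims=True)
    gamma = np.exp(log_joint - log_norm)                      # (T, K)
    # M-step: re-estimate weights, means, and variances from soft counts.
    n_m = gamma.sum(axis=0) + 1e-10                           # avoid division by zero
    new_weights = n_m / n_m.sum()
    new_means = (gamma.T @ X) / n_m[:, None]
    second_moment = (gamma.T @ (X ** 2)) / n_m[:, None]
    new_vars = np.maximum(second_moment - new_means ** 2, var_floor)
    return new_weights, new_means, new_vars
```

In practice `em_step` would be called repeatedly, starting from the cluster means of step (2), until the value of `gmm_log_likelihood` stops improving.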

The voiceprint password model is trained online from the user's registration speech. Because registration speech samples are limited, training a complex model directly on them easily leads to data sparsity problems. In the prior art, the voiceprint password model therefore takes the background model as its initial model and adjusts part of the model parameters with an adaptive algorithm. Using a small amount of speaker data, the adaptive algorithm adapts the voiceprint characteristics shared by all users to the individual characteristics of the current speaker. A commonly used adaptive algorithm is maximum a posteriori (MAP) adaptation.

The voiceprint password model is trained with the adaptive algorithm as follows:

(1) Extract the voiceprint feature sequence from the user's registration speech to form a training feature vector library;

(2) Using the training feature vector library, update the means of the universal background model's Gaussians with the adaptive algorithm to obtain the Gaussian means $\hat{\mu}_m$ of the voiceprint password model, where $\hat{\mu}_m$ is a weighted average of the sample statistics and the universal background model mean:

$$\hat{\mu}_m = \frac{\sum_{t=1}^{T} \gamma_m(x_t)\, x_t + \tau \mu_m}{\sum_{t=1}^{T} \gamma_m(x_t) + \tau}$$

where $x_t$ is the observation vector of frame $t$ in the voiceprint feature sequence; $\mu_m$ is the mean of the $m$-th Gaussian of the background model; $\gamma_m(x_t)$ is the probability that the voiceprint feature of frame $t$ belongs to the $m$-th Gaussian; and $\tau$ is the forgetting factor, which balances how strongly the historical mean and the sample statistics each update the Gaussian mean of the voiceprint password model. In general, the larger $\tau$ is, the more the adapted mean is constrained by the original mean; the smaller $\tau$ is, the more the adapted mean is determined by the sample statistics and the better it reflects the distribution of the new samples.

(3) Copy the background model variances as the variances of the voiceprint password model.
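The mean update in step (2) can be sketched in a few lines of NumPy. This is an illustrative sketch only: it assumes diagonal covariances, the function name `map_adapt_means` is hypothetical, and the default `tau=16.0` is an assumed value, not one given in the text.

```python
import numpy as np

def map_adapt_means(X, weights, means, variances, tau=16.0):
    """MAP-adapt the UBM means to registration features X of shape (T, D).

    weights: (K,), means: (K, D), variances: (K, D) describe the UBM.
    Returns the adapted means; weights and variances are reused unchanged,
    as in step (3).
    """
    # Per-frame log density of every Gaussian (diagonal covariance assumed).
    diff = X[:, None, :] - means[None, :, :]
    log_gauss = -0.5 * (np.sum(np.log(2 * np.pi * variances), axis=1)[None, :]
                        + np.sum(diff ** 2 / variances[None, :, :], axis=2))
    log_joint = np.log(weights)[None, :] + log_gauss
    # gamma[t, m]: probability that frame t belongs to Gaussian m.
    gamma = np.exp(log_joint - np.logaddexp.reduce(log_joint, axis=1, keepdims=True))
    n_m = gamma.sum(axis=0)          # sum_t gamma_m(x_t)
    first_order = gamma.T @ X        # sum_t gamma_m(x_t) * x_t
    # Weighted average of sample statistics and the original UBM mean.
    return (first_order + tau * means) / (n_m + tau)[:, None]
```

A Gaussian that receives almost no registration frames keeps a mean close to the UBM mean, which is what makes the method usable with very little registration speech.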

Once the universal password background model and the voiceprint password model have been obtained, the user can be authenticated from the input voice signal. Specifically, from the voiceprint feature sequence of the user's input voice signal, a first likelihood of the voiceprint feature sequence with respect to the voiceprint password model and a second likelihood with respect to the universal password background model are computed. The ratio of the first likelihood to the second likelihood is then compared with a preset threshold: if the ratio is greater than the threshold, the user is judged to be a legitimate user; otherwise, the user is judged to be an illegitimate user.

With a single password, prior-art voiceprint password authentication can obtain sufficient training data, so the trained background model and voiceprint password model are good enough to meet authentication needs. With user-defined voiceprint passwords, however, it is difficult to obtain a large amount of training data, for reasons such as the limited number of users of each password, so the corresponding background model cannot be trained in advance. Traditional voiceprint password authentication therefore cannot be applied where users define their own passwords or change them frequently.

For this reason, the embodiments of the present invention propose a voiceprint password authentication method and system that support user-defined passwords and frequent password changes.

As shown in Fig. 2, the voiceprint password authentication method provided by an embodiment of the present invention can include the following steps:

Step 201: Receive a voice signal input by the user;

Step 202: Perform speech recognition on the voice signal to obtain the password text;

Step 203: Determine whether a background model corresponding to the password text exists;

Step 204: If a background model corresponding to the password text exists, obtain it;

Step 205: If no background model corresponding to the password text exists, splice the pre-trained pronunciation unit models according to the password text to obtain the background model corresponding to the password text;

Step 206: Authenticate the user using the voiceprint feature sequence in the voice signal, the background model, and the user's voiceprint password model.
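The branch in steps 203 to 205 amounts to a lookup with a fallback. The sketch below is a toy illustration only: a "model" is represented as a tuple of component names, and `get_background_model`, `pretrained`, `unit_models`, and `to_units` are hypothetical names introduced here, not part of the described system.

```python
from typing import Callable, Dict, List, Tuple

# Placeholder representation: a "model" is just a tuple of component names.
Model = Tuple[str, ...]

def get_background_model(password_text: str,
                         pretrained: Dict[str, Model],
                         unit_models: Dict[str, Model],
                         to_units: Callable[[str], List[str]]) -> Tuple[Model, str]:
    """Return (background model, source), where source tells which branch ran."""
    # Steps 203-204: use the offline-trained background model if one exists.
    if password_text in pretrained:
        return pretrained[password_text], "pretrained"
    # Step 205: otherwise splice the pronunciation unit models of the text.
    units = to_units(password_text)
    missing = [u for u in units if u not in unit_models]
    if missing:
        raise KeyError(f"no pronunciation unit model for: {missing}")
    spliced: Model = tuple(name for u in units for name in unit_models[u])
    return spliced, "spliced"
```

Because the fallback only concatenates models that already exist, it can run at authentication time without collecting any new speech for the password.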

In the embodiments of the present invention, the voiceprint feature sequence, the background model, and the user's voiceprint password model needed to authenticate the user can be obtained as follows. The voiceprint feature sequence is extracted from the voice signal input by the user; the extraction can be performed before speech recognition is performed on the voice signal, or after the background model corresponding to the password text has been obtained. For the background model, it is first determined whether a background model corresponding to the password text exists: if so, it is obtained directly; if not, the pre-trained pronunciation unit models are spliced according to the password text to obtain it. For the voiceprint password model, it is first determined whether the user's voiceprint password model exists: if so, it is obtained directly; if not, it is built from the user's registration voice signal and the background model obtained in step 204 or step 205.

The user is then authenticated using the voiceprint feature sequence, background model, and voiceprint password model obtained in this way. Specifically, a first likelihood of the voiceprint feature sequence with respect to the voiceprint password model and a second likelihood of the voiceprint feature sequence with respect to the background model are calculated, and whether the user is a legitimate user is determined from the ratio of the first likelihood to the second likelihood and a preset threshold. If the ratio is greater than the preset threshold, the user is determined to be a legitimate user; otherwise, the user is determined to be an illegitimate user.
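The decision rule can be written directly in the log domain, since comparing the likelihood ratio with a threshold is equivalent to comparing the difference of log-likelihoods with the logarithm of that threshold. The following is a minimal sketch; the function name `is_legitimate_user` and its parameters are hypothetical, and the threshold values in any real system would have to be tuned on data.

```python
import math

def is_legitimate_user(loglik_password_model: float,
                       loglik_background_model: float,
                       ratio_threshold: float) -> bool:
    """Accept when (first likelihood / second likelihood) > ratio_threshold.

    Both likelihoods are passed as log-likelihoods to avoid numerical
    underflow on long feature sequences.
    """
    if ratio_threshold <= 0:
        raise ValueError("ratio_threshold must be positive")
    log_ratio = loglik_password_model - loglik_background_model
    return log_ratio > math.log(ratio_threshold)
```

For example, with log-likelihoods of -100.0 and -105.0 the log ratio is 5.0, which exceeds log(10) (about 2.30), so the user would be accepted at a ratio threshold of 10.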

In the embodiments of the present invention, the background model corresponding to the password text can be spliced from pre-trained pronunciation unit models. The pronunciation unit models can be trained as follows: first obtain training voice data, then determine the pronunciation units from the training voice data, then determine the topology of the acoustic model of each pronunciation unit, and finally train the parameters of the acoustic model to obtain the pronunciation unit model. A pronunciation unit can be, for example, a phoneme or a syllable. For example, the phoneme units of the password "my birthday" (in Mandarin, "wo de sheng ri") are "w, o, d, e, sh, eng, r, i". Once the phoneme units have been determined, an acoustic model is established for each of them, which fixes the topology of the acoustic model of each phoneme unit; the parameters of each acoustic model are then trained to obtain the pronunciation unit model. The parameter training can use existing training methods, and the embodiments of the present invention do not limit it.
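The decomposition of a password into phoneme units can be sketched as follows. This is a simplified illustration that assumes the password is already available as space-separated pinyin syllables and that each syllable splits into an initial and a final; the `INITIALS` table and the function names are assumptions made for this sketch, and a real system would use a pronunciation lexicon.

```python
# Pinyin initials, two-letter ones first so "sh" is matched before "s".
INITIALS = ("zh", "ch", "sh", "b", "p", "m", "f", "d", "t", "n", "l",
            "g", "k", "h", "j", "q", "x", "r", "z", "c", "s", "y", "w")

def syllable_to_units(syllable):
    """Split one pinyin syllable into [initial, final], or [syllable] if no initial."""
    for initial in INITIALS:
        if syllable.startswith(initial) and len(syllable) > len(initial):
            return [initial, syllable[len(initial):]]
    return [syllable]

def password_to_units(pinyin_text):
    """Map a space-separated pinyin password to its list of phoneme units."""
    units = []
    for syllable in pinyin_text.lower().split():
        units.extend(syllable_to_units(syllable))
    return units

print(password_to_units("wo de sheng ri"))
# ['w', 'o', 'd', 'e', 'sh', 'eng', 'r', 'i']
```

Each unit in the returned list then selects one pre-trained pronunciation unit model for splicing.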

It should be noted that, in the embodiments of the present invention, the acoustic model can be a GMM or an HMM (Hidden Markov Model).

When the acoustic model is a GMM, splicing the pre-trained pronunciation unit models according to the password text to obtain the background model corresponding to the password text can include: obtaining the GMM corresponding to each pronunciation unit in the password text to obtain a GMM set; splicing the model units in the GMM set with equal weights to obtain a new combined GMM; and updating the Gaussian weights of the combined GMM so that they sum to 1, thereby obtaining the background model corresponding to the password text.

The Gaussian weights of the model units can be updated as follows. During equal-weight splicing, each model unit is assigned a weight $w_n$, $n = 1, \dots, N$, where $N$ is the number of model units; with equal weights, $w_n = 1/N$. The weight of the $i$-th Gaussian of model unit $n$ is updated to

$$w'_{ni} = w_n \, w_{ni}$$

where $w_{ni}$ is the weight of that Gaussian in the original model unit and satisfies $\sum_i w_{ni} = 1$. It follows that, after the weights of every model unit have been updated, the spliced GMM still satisfies

$$\sum_{n=1}^{N} \sum_i w'_{ni} = \sum_{n=1}^{N} w_n \sum_i w_{ni} = 1$$
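The equal-weight splicing and weight update can be sketched with NumPy. This is a minimal illustration that assumes each unit GMM is stored as a `(weights, means, variances)` triple with diagonal covariances; the function name `splice_gmms` is hypothetical.

```python
import numpy as np

def splice_gmms(unit_gmms):
    """Combine per-unit GMMs into one GMM with equal unit weights w_n = 1/N.

    unit_gmms: list of (weights, means, variances) with shapes
               (K_n,), (K_n, D), (K_n, D); each weights vector sums to 1.
    Returns (weights, means, variances) of the combined GMM.
    """
    n_units = len(unit_gmms)
    if n_units == 0:
        raise ValueError("at least one unit GMM is required")
    w_n = 1.0 / n_units
    # w'_{ni} = w_n * w_{ni}: rescale every component weight.
    weights = np.concatenate([w_n * np.asarray(w, dtype=float)
                              for w, _, _ in unit_gmms])
    # Means and variances are carried over unchanged.
    means = np.vstack([m for _, m, _ in unit_gmms])
    variances = np.vstack([v for _, _, v in unit_gmms])
    return weights, means, variances
```

Only the weights change, so the combined model can be built online with no retraining, which is the point of the splicing approach.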

Alternatively, when the acoustic model is a GMM, splicing the pre-trained pronunciation unit models according to the password text to obtain the background model corresponding to the password text can include: obtaining the GMM corresponding to each pronunciation unit in the password text to obtain a GMM sequence; and, after splicing at least two model units in the GMM sequence in order, applying transitions with a preset self-transition probability and a preset outgoing-transition probability to obtain the background model corresponding to the password text, where the self-transition probability and the outgoing-transition probability sum to 1.

When the acoustic model is an HMM, splicing the pre-trained pronunciation unit models according to the password text to obtain the background model corresponding to the password text includes: obtaining the HMM sequence corresponding to the password text; and, after splicing at least two model units in the HMM sequence, applying transitions with a preset self-transition probability and a preset outgoing-transition probability to obtain the background model corresponding to the password text, where the self-transition probability and the outgoing-transition probability sum to 1.

In the embodiments of the present invention, self-transitions and outgoing transitions can be applied at the word level. Two or more model units are therefore spliced to obtain the GMM or HMM of a word, and the transition probabilities are then set per word. The self-transition probability and the outgoing-transition probability can be preset empirically; for example, the self-transition probability can be set to 0.9 and the outgoing-transition probability to 0.1.
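One way to picture the word-level transitions is as a transition matrix over the spliced word models. The sketch below assumes a strict left-to-right chain in which each word either stays in place (self-transition) or moves to the next word (outgoing transition), with an absorbing end state; the text does not specify this topology, so the chain structure, the end state, and the name `word_transition_matrix` are all assumptions made for illustration.

```python
import numpy as np

def word_transition_matrix(num_words, p_self=0.9, p_out=0.1):
    """Build a (num_words + 1) x (num_words + 1) left-to-right transition matrix.

    Row i is word i; the last row and column are an absorbing end state
    (an assumption of this sketch, not stated in the source).
    """
    if num_words < 1:
        raise ValueError("num_words must be at least 1")
    if not np.isclose(p_self + p_out, 1.0):
        raise ValueError("p_self and p_out must sum to 1")
    A = np.zeros((num_words + 1, num_words + 1))
    for i in range(num_words):
        A[i, i] = p_self        # stay in the current word model
        A[i, i + 1] = p_out     # move on to the next word (or to the end state)
    A[num_words, num_words] = 1.0  # end state absorbs
    return A
```

With the example values 0.9 and 0.1, the expected number of frames spent in a word model is 1 / 0.1 = 10 under this chain assumption.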

Because pronunciation extends over time, an HMM can better model the dynamic voiceprint information and the temporal order of the password text, which effectively handles out-of-order text, wrong words, extra words, and missing words.

The voiceprint password authentication method provided by the embodiments of the present invention authenticates the user using the voiceprint feature sequence in the voice signal, the background model, and the user's voiceprint password model. If no background model corresponding to the password text exists, the pre-trained pronunciation unit models are spliced according to the password text to obtain one. Because the background model is spliced from pre-trained pronunciation unit models, it can easily be generated online in real time, which accommodates user-defined passwords and frequent password changes.

Correspondingly, an embodiment of the present invention also provides a voiceprint password authentication system, whose structure is shown in Fig. 3.

In this embodiment, the voiceprint password authentication system can include:

a receiving module 301, for receiving a voice signal input by the user;

an identification module 302, for performing speech recognition on the voice signal to obtain the password text;

a determining module 303, for determining whether a background model corresponding to the password text exists;

a background model acquisition module 304, for obtaining the background model after the determining module 303 determines that a background model corresponding to the password text exists, and, after the determining module 303 determines that no such background model exists, splicing the pre-trained pronunciation unit models according to the password text to obtain the background model corresponding to the password text;

an authentication module 305, for authenticating the user using the voiceprint feature sequence in the voice signal, the background model, and the user's voiceprint password model.

In the embodiments of the present invention, the background model corresponding to the password text can be obtained by splicing pre-trained pronunciation unit models according to the password text, and the pronunciation unit models can be trained in advance. To this end, the system of the embodiment can further include a training module (not shown) for training the pronunciation unit models in advance. The training module can include:

a voice data acquiring unit, for obtaining training voice data;

a first determination unit, for determining pronunciation units according to the training voice data;

a second determination unit, for determining the topology of the acoustic model of each pronunciation unit;

a parameter training unit, for training the parameters of the acoustic model to obtain the pronunciation unit model.

It should be noted that, in the embodiments of the present invention, the acoustic model can be a GMM or an HMM.

When the acoustic model is a GMM model, one specific implementation structure of the background model acquisition module 304 may include:

GMM model acquiring unit, for obtaining the GMM model corresponding to each pronunciation unit in the password text to obtain a GMM model set;

First concatenation unit, for splicing the model units in the GMM model set with equal weight to obtain a new combined GMM model;

Weight updating unit, for updating the Gaussian weights of the combined GMM model so that the Gaussian weights of the combined GMM model sum to 1, to obtain the background model corresponding to the password text.
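The equal-weight splicing and weight update described above can be illustrated with a short sketch. It assumes each unit GMM is a list of `(weight, mean, diagonal variance)` tuples whose weights already sum to 1; this representation and the name `splice_gmms` are assumptions made for illustration only.

```python
from typing import List, Tuple

# Hypothetical representation: (weight, mean vector, diagonal variance vector)
Component = Tuple[float, List[float], List[float]]


def splice_gmms(unit_gmms: List[List[Component]]) -> List[Component]:
    """Pool the Gaussians of all unit GMMs and renormalise their weights."""
    # Equal-weight splicing: every unit model contributes its components unchanged,
    # so the pooled weights sum to the number of unit models.
    pooled = [component for gmm in unit_gmms for component in gmm]
    if not pooled:
        raise ValueError("at least one Gaussian component is required")
    # Weight update: rescale so the Gaussian weights of the combined model sum to 1.
    total = sum(weight for weight, _, _ in pooled)
    return [(weight / total, mean, var) for weight, mean, var in pooled]
```

With two unit models, one having two components weighted 0.5 each and the other a single component, the combined weights become 0.25, 0.25 and 0.5.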

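When the acoustic model is a GMM model, another specific implementation structure of the background model acquisition module 304 may include:

GMM model acquiring unit, for obtaining the GMM model corresponding to each pronunciation unit in the password text to obtain a GMM model sequence;

Second concatenation unit, for sequentially splicing at least two model units in the GMM model sequence and then performing transitions using a preset self-transition probability and a preset outgoing-transition probability, to obtain the background model corresponding to the password text, wherein the self-transition probability and the outgoing-transition probability sum to 1.

When the acoustic model is an HMM model, one specific implementation structure of the background model acquisition module 304 may include:

HMM model sequence acquiring unit, for obtaining the HMM model sequence corresponding to the pronunciation units in the password text;

Third concatenation unit, for sequentially splicing the model units in the HMM model sequence to obtain the background model corresponding to the password text.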
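Sequential splicing can be pictured as chaining the unit models left to right. The sketch below covers the GMM-sequence variant recited in the claims, in which each unit model behaves like a single state with a preset self-transition probability and an outgoing-transition probability that sum to 1. The function name `chain_transitions`, the use of one shared probability for every unit, and the extra final column standing for "leave the last unit" are assumptions for illustration.

```python
from typing import List


def chain_transitions(num_units: int, self_loop: float) -> List[List[float]]:
    """Build a left-to-right transition matrix for num_units chained unit models.

    Row i holds the transition probabilities out of unit i. Column num_units is a
    hypothetical exit state reached after the last unit.
    """
    if num_units < 1:
        raise ValueError("num_units must be at least 1")
    if not 0.0 < self_loop < 1.0:
        raise ValueError("self_loop must lie strictly between 0 and 1")
    outgoing = 1.0 - self_loop  # self-transition and outgoing-transition sum to 1
    matrix = []
    for i in range(num_units):
        row = [0.0] * (num_units + 1)
        row[i] = self_loop      # stay in the current unit
        row[i + 1] = outgoing   # move on to the next unit (or to the exit state)
        matrix.append(row)
    return matrix
```

For full HMM unit models the same idea applies per unit rather than per state: the exit of one unit's last state is connected to the entry of the next unit's first state.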

In the embodiments of the present invention, one specific structure of the authentication module 305 may include:

Computing unit, for calculating a first likelihood of the voiceprint feature sequence relative to the voiceprint password model and a second likelihood of the voiceprint feature sequence relative to the background model;

Determination unit, for determining whether the user is a valid user according to the ratio of the first likelihood to the second likelihood and a preset threshold.
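The likelihood-ratio decision can be sketched as follows for diagonal-covariance GMMs, working in the log domain for numerical stability. The model representation (lists of `(weight, mean, variance)` tuples), the function names, and the rule that the user is accepted when the ratio exceeds the threshold are assumptions for illustration; the text only states that the decision depends on the ratio and a preset threshold.

```python
import math
from typing import List, Tuple

Component = Tuple[float, List[float], List[float]]  # hypothetical (weight, mean, diag variance)


def gmm_log_likelihood(frames: List[List[float]], gmm: List[Component]) -> float:
    """Sum over frames of log p(frame | gmm) for a diagonal-covariance GMM."""
    total = 0.0
    for x in frames:
        comp_logs = []
        for weight, mean, var in gmm:
            ll = math.log(weight)
            for xi, mi, vi in zip(x, mean, var):
                ll -= 0.5 * (math.log(2.0 * math.pi * vi) + (xi - mi) ** 2 / vi)
            comp_logs.append(ll)
        # log-sum-exp over components
        peak = max(comp_logs)
        total += peak + math.log(sum(math.exp(c - peak) for c in comp_logs))
    return total


def is_valid_user(frames, user_model, background_model, threshold: float) -> bool:
    """Accept when (first likelihood / second likelihood) exceeds the threshold (assumed rule)."""
    log_ratio = gmm_log_likelihood(frames, user_model) - gmm_log_likelihood(frames, background_model)
    return log_ratio > math.log(threshold)
```

Comparing the log ratio with the log of the threshold is equivalent to comparing the raw ratio with the threshold, but avoids underflow on long feature sequences.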

The voiceprint password authentication system provided in the embodiment of the present invention may further include:

Voiceprint feature sequence extraction module, for extracting the voiceprint feature sequence in the voice signal before speech recognition is performed on the voice signal or after the background model corresponding to the password text is obtained.

Further, the system may also include:

Voiceprint password model acquisition module, for obtaining the voiceprint password model of the user before speech recognition is performed on the voice signal or after the background model corresponding to the password text is obtained.

Further, the system may also include:

Judgment module, for judging, before the voiceprint password model of the user is obtained, whether a voiceprint password model of the user currently exists;

Voiceprint password model building module, for building the voiceprint password model of the user according to the registration voice signal of the user and the background model after the judgment module judges that no voiceprint password model currently exists.

The voiceprint password authentication system provided in the embodiment of the present invention authenticates the user using the voiceprint feature sequence in the voice signal, the background model, and the user's voiceprint password model. If no background model corresponding to the password text exists, the pre-trained pronunciation unit models are spliced according to the password text to obtain the background model corresponding to the password text. Because the background model is obtained by splicing pre-trained pronunciation unit models, it can be generated online in real time, which meets the need for user-defined passwords and frequent password changes.

The voiceprint password authentication method and system provided in the above embodiments belong to the same inventive concept. For the functions and implementation processes of the modules and units in the system, refer to the description in the method embodiment, which is not repeated here.

The embodiments in this specification are described in a progressive manner. For identical or similar parts, the embodiments may be referred to one another, and each embodiment focuses on its differences from the others. In particular, because the system embodiment is substantially similar to the method embodiment, its description is relatively brief, and the relevant parts may be found in the description of the method embodiment. The system embodiment described above is merely illustrative: units described as separate components may or may not be physically separate, and components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. A person of ordinary skill in the art can understand and implement this without creative effort.

The foregoing describes only preferred embodiments of the present invention and is not intended to limit it. Any modification, equivalent replacement, or improvement made within the spirit and principles of the present invention shall fall within the protection scope of the present invention.

Claims

1. A voiceprint password authentication method, characterized by comprising:

receiving a voice signal input by a user;

performing speech recognition on the voice signal to obtain a password text;

determining whether there exists a background model corresponding to the password text that was trained in advance from password voice data collected offline;

if so, obtaining the background model; if not, splicing pre-trained pronunciation unit models according to the password text to obtain the background model corresponding to the password text;

authenticating the user using the voiceprint feature sequence in the voice signal, the background model, and the voiceprint password model of the user.
2. The method according to claim 1, characterized in that the pronunciation unit models are trained in advance as follows:

obtaining training voice data;

determining pronunciation units according to the training voice data;

determining the topological structure of the acoustic model of the pronunciation units;

performing parameter training on the acoustic model to obtain the pronunciation unit models.
3. The method according to claim 2, characterized in that the acoustic model is a GMM model;

the splicing of the pre-trained pronunciation unit models according to the password text to obtain the background model corresponding to the password text comprises:

obtaining the GMM model corresponding to each pronunciation unit in the password text to obtain a GMM model set;

splicing the model units in the GMM model set with equal weight to obtain a new combined GMM model;

updating the Gaussian weights of the combined GMM model so that the Gaussian weights of the combined GMM model sum to 1, to obtain the background model corresponding to the password text.
4. The method according to claim 2, characterized in that the acoustic model is a GMM model;

the splicing of the pre-trained pronunciation unit models according to the password text to obtain the background model corresponding to the password text comprises:

obtaining the GMM model corresponding to each pronunciation unit in the password text to obtain a GMM model sequence;

sequentially splicing at least two model units in the GMM model sequence and then performing transitions using a preset self-transition probability and a preset outgoing-transition probability, to obtain the background model corresponding to the password text, wherein the self-transition probability and the outgoing-transition probability sum to 1.
5. The method according to claim 2, characterized in that the acoustic model is an HMM model;

the splicing of the pre-trained pronunciation unit models according to the password text to obtain the background model corresponding to the password text comprises:

obtaining the HMM model sequence corresponding to the pronunciation units in the password text;

sequentially splicing the model units in the HMM model sequence to obtain the background model corresponding to the password text.
6. The method according to claim 1, characterized in that the authenticating of the user using the voiceprint feature sequence in the voice signal, the background model, and the voiceprint password model of the user comprises:

calculating a first likelihood of the voiceprint feature sequence relative to the voiceprint password model and a second likelihood of the voiceprint feature sequence relative to the background model;

determining whether the user is a valid user according to the ratio of the first likelihood to the second likelihood and a preset threshold.
7. The method according to any one of claims 1 to 6, characterized in that the method further comprises:

extracting the voiceprint feature sequence in the voice signal before speech recognition is performed on the voice signal or after the background model corresponding to the password text is obtained.
8. The method according to any one of claims 1 to 6, characterized in that the method further comprises:

obtaining the voiceprint password model of the user before speech recognition is performed on the voice signal or after the background model corresponding to the password text is obtained.
9. The method according to claim 8, characterized in that the method further comprises:

before obtaining the voiceprint password model of the user, judging whether a voiceprint password model of the user currently exists;

if not, building the voiceprint password model of the user according to the registration voice signal of the user and the background model.
10. A voiceprint password authentication system, characterized by comprising:

a receiving module, for receiving a voice signal input by a user;

an identification module, for performing speech recognition on the voice signal to obtain a password text;

a determining module, for determining whether there exists a background model corresponding to the password text that was trained in advance from password voice data collected offline;

a background model acquisition module, for obtaining the background model after the determining module determines that a background model corresponding to the password text exists, and, after the determining module determines that no background model corresponding to the password text exists, splicing pre-trained pronunciation unit models according to the password text to obtain the background model corresponding to the password text, the pronunciation unit models being obtained by training on training voice data;

an authentication module, for authenticating the user using the voiceprint feature sequence in the voice signal, the background model, and the voiceprint password model of the user.
11. The system according to claim 10, characterized in that the system further comprises: a training module, for training the pronunciation unit models in advance; the training module comprises:

a voice data acquiring unit, for obtaining training voice data;

a first determination unit, for determining pronunciation units according to the training voice data;

a second determination unit, for determining the topological structure of the acoustic model of the pronunciation units;

a parameter training unit, for performing parameter training on the acoustic model to obtain the pronunciation unit models.
12. The system according to claim 11, characterized in that the acoustic model is a GMM model, and the background model acquisition module comprises:

a GMM model acquiring unit, for obtaining the GMM model corresponding to each pronunciation unit in the password text to obtain a GMM model set;

a first concatenation unit, for splicing the model units in the GMM model set with equal weight to obtain a new combined GMM model;

a weight updating unit, for updating the Gaussian weights of the combined GMM model so that the Gaussian weights of the combined GMM model sum to 1, to obtain the background model corresponding to the password text.
13. The system according to claim 11, characterized in that the acoustic model is a GMM model, and the background model acquisition module comprises:

a GMM model acquiring unit, for obtaining the GMM model corresponding to each pronunciation unit in the password text to obtain a GMM model sequence;

a second concatenation unit, for sequentially splicing at least two model units in the GMM model sequence and then performing transitions using a preset self-transition probability and a preset outgoing-transition probability, to obtain the background model corresponding to the password text, wherein the self-transition probability and the outgoing-transition probability sum to 1.
14. The system according to claim 11, characterized in that the acoustic model is an HMM model, and the background model acquisition module comprises:

an HMM model sequence acquiring unit, for obtaining the HMM model sequence corresponding to the pronunciation units in the password text;

a third concatenation unit, for sequentially splicing the model units in the HMM model sequence to obtain the background model corresponding to the password text.
15. The system according to claim 10, characterized in that the authentication module comprises:

a computing unit, for calculating a first likelihood of the voiceprint feature sequence relative to the voiceprint password model and a second likelihood of the voiceprint feature sequence relative to the background model;

a determination unit, for determining whether the user is a valid user according to the ratio of the first likelihood to the second likelihood and a preset threshold.
16. The system according to any one of claims 10 to 15, characterized in that the system further comprises:

a voiceprint feature sequence extraction module, for extracting the voiceprint feature sequence in the voice signal before speech recognition is performed on the voice signal or after the background model corresponding to the password text is obtained.
17. The system according to any one of claims 10 to 15, characterized in that the system further comprises:

a voiceprint password model acquisition module, for obtaining the voiceprint password model of the user before speech recognition is performed on the voice signal or after the background model corresponding to the password text is obtained.
18. The system according to claim 17, characterized in that the system further comprises:

a judgment module, for judging, before the voiceprint password model of the user is obtained, whether a voiceprint password model of the user currently exists;

a voiceprint password model building module, for building the voiceprint password model of the user according to the registration voice signal of the user and the background model after the judgment module judges that no voiceprint password model currently exists.