CN108694952A - Electronic device, identity verification method, and storage medium - Google Patents
Electronic device, identity verification method, and storage medium Download PDF Info
- Publication number
- CN108694952A CN108694952A CN201810311721.2A CN201810311721A CN108694952A CN 108694952 A CN108694952 A CN 108694952A CN 201810311721 A CN201810311721 A CN 201810311721A CN 108694952 A CN108694952 A CN 108694952A
- Authority
- CN
- China
- Prior art keywords
- user
- reading
- voice
- voiceprint
- voiceprint feature
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/22—Interactive procedures; Man-machine interfaces
- G10L17/24—Interactive procedures; Man-machine interfaces the user being prompted to utter a password or a predefined phrase
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/02—Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/04—Training, enrolment or model building
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/06—Decision making techniques; Pattern matching strategies
- G10L17/08—Use of distortion metrics or a particular distance between probe pattern and reference templates
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/16—Hidden Markov models [HMM]
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L17/00—Speaker identification or verification techniques
- G10L17/18—Artificial neural networks; Connectionist approaches
-
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L25/00—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
- G10L25/03—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters
- G10L25/24—Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 characterised by the type of extracted parameters the extracted parameters being the cepstrum
Abstract
The present invention relates to an electronic device, an identity verification method, and a storage medium. The method includes: when a user transacts business under an IVR scenario, announcing a random code of a first preset number of digits for the user to read aloud, and after the read-aloud, establishing acoustic models of a preset type for the announced random code and for the user's read-aloud speech respectively; performing a forced overall alignment operation on the acoustic model of the announced random code and the acoustic model of the user's read-aloud speech, and calculating with a predetermined algorithm the probability that the two aligned acoustic models are identical; if the probability exceeds a preset first threshold, extracting the voiceprint feature vector of the user's read-aloud speech, obtaining the standard voiceprint feature vector stored for the user after successful registration, and calculating the distance between the voiceprint feature vector of the read-aloud speech and the standard voiceprint feature vector to verify the user's identity. The present invention applies double verification to the user's identity and can confirm the user's identity accurately.
Description
Technical field
The present invention relates to the field of communication technology, and more particularly to an electronic device, an identity verification method, and storage media.
Background technology
At present, in interactive voice response (IVR) scenarios, there are schemes that combine IVR with voiceprint recognition to verify a client's identity. For example, when a client calls to activate a credit card or to change a password after receiving the card, the client's identity needs to be verified. In existing IVR schemes, however, remote voiceprint verification is not conducted face to face, so a client may commit fraud with pre-prepared synthesized speech; the client's identity therefore cannot be confirmed accurately, and the security of identity verification is low.
Summary of the invention
The purpose of the present invention is to provide an electronic device, an identity verification method, and storage media, intended to apply double verification to a user's identity so that the identity can be confirmed accurately.
To achieve the above object, the present invention provides an electronic device comprising a memory and a processor connected to the memory. The memory stores a processing system that can run on the processor; when executed by the processor, the processing system realizes the following steps:
An acoustic-model establishment step: when a user transacts business under an interactive voice response (IVR) scenario, announce a random code of a first preset number of digits for the user to read aloud, and after the read-aloud establish acoustic models of a preset type for the announced random code and for the user's read-aloud speech respectively;
A forced overall alignment step: perform a forced overall alignment operation on the acoustic model of the announced random code and the acoustic model of the user's read-aloud speech, and calculate with a predetermined algorithm the probability that the two aligned acoustic models are identical;
An identity verification step: if the probability that the two aligned acoustic models are identical exceeds a preset first threshold, extract the voiceprint feature vector of the user's read-aloud speech, obtain the standard voiceprint feature vector stored for the user after successful registration, and calculate the distance between the two vectors to verify the user's identity.
Preferably, when executed by the processor, the processing system also realizes the following steps:
When the user performs voiceprint registration under an IVR scenario, announce a random code of a second preset number of digits for the user to read aloud a preset number of times, and after each read-aloud establish acoustic models of the preset type for the announced random code and for the read-aloud speech respectively;
Perform, for each round, a forced overall alignment operation on the acoustic model of the announced random code and the acoustic model of the corresponding read-aloud speech, and calculate with the predetermined algorithm the probability that the two aligned acoustic models are identical;
If every such probability exceeds a preset second threshold, extract the voiceprint feature vector of each read-aloud utterance and calculate the pairwise distances between these vectors, to analyze whether each read-aloud came from the same user;
If so, store the voiceprint feature vector as the user's standard voiceprint feature vector.
Preferably, the acoustic model of the preset type is a deep neural network-hidden Markov model (DNN-HMM).
Preferably, the step of extracting the voiceprint feature vector of the user's read-aloud speech includes:
applying pre-emphasis and windowing to the user's read-aloud speech, performing a Fourier transform on each windowed frame to obtain the corresponding spectrum, and feeding the spectrum into a Mel filter bank to output the Mel spectrum;
performing cepstral analysis on the Mel spectrum to obtain Mel-frequency cepstral coefficients (MFCC), and composing the voiceprint feature vector of the user's read-aloud speech from the MFCCs.
To achieve the above object, the present invention also provides an identity verification method, which includes:
S1: under an IVR scenario, when a user transacts business, announce a random code of a first preset number of digits for the user to read aloud, and after the read-aloud establish acoustic models of a preset type for the announced random code and for the user's read-aloud speech respectively;
S2: perform a forced overall alignment operation on the acoustic model of the announced random code and the acoustic model of the user's read-aloud speech, and calculate with a predetermined algorithm the probability that the two aligned acoustic models are identical;
S3: if the probability that the two aligned acoustic models are identical exceeds a preset first threshold, extract the voiceprint feature vector of the user's read-aloud speech, obtain the standard voiceprint feature vector stored for the user after successful registration, and calculate the distance between the two vectors to verify the user's identity.
Preferably, before step S1, the method further includes:
S01: when the user performs voiceprint registration under an IVR scenario, announce a random code of a second preset number of digits for the user to read aloud a preset number of times, and after each read-aloud establish acoustic models of the preset type for the announced random code and for the read-aloud speech respectively;
S02: perform, for each round, a forced overall alignment operation on the acoustic model of the announced random code and the acoustic model of the corresponding read-aloud speech, and calculate with the predetermined algorithm the probability that the two aligned acoustic models are identical;
S03: if every such probability exceeds a preset second threshold, extract the voiceprint feature vector of each read-aloud utterance and calculate the pairwise distances between these vectors, to analyze whether each read-aloud came from the same user;
S04: if so, store the voiceprint feature vector as the user's standard voiceprint feature vector.
Preferably, the acoustic model of the preset type is a deep neural network-hidden Markov model (DNN-HMM).
Preferably, the step of extracting the voiceprint feature vector of the user's read-aloud speech includes:
applying pre-emphasis and windowing to the user's read-aloud speech, performing a Fourier transform on each windowed frame to obtain the corresponding spectrum, and feeding the spectrum into a Mel filter bank to output the Mel spectrum;
performing cepstral analysis on the Mel spectrum to obtain Mel-frequency cepstral coefficients (MFCC), and composing the voiceprint feature vector of the user's read-aloud speech from the MFCCs.
Preferably, the step of calculating the distance between the voiceprint feature vector of the user's read-aloud speech and the standard voiceprint feature vector computes the cosine distance of the two vectors:
cos θ = (A · B) / (‖A‖ ‖B‖)
where A is the standard voiceprint feature vector and B is the voiceprint feature vector of the user's read-aloud speech.
The present invention also provides a computer-readable storage medium on which a processing system is stored; when executed by a processor, the processing system realizes the steps of the identity verification method described above.
The beneficial effects of the invention are as follows: when identity recognition is performed under an IVR scenario, having the user read a random code aloud effectively prevents fraud with pre-prepared synthesized speech; combining the random code with voiceprint recognition realizes double verification of the user's identity, confirms the identity accurately, and improves the security of identity verification under IVR scenarios. In addition, performing a forced overall alignment operation on the acoustic model of the announced random code and the acoustic model of the user's read-aloud speech reduces the amount of computation and improves recognition efficiency.
Description of the drawings
Fig. 1 is a schematic diagram of an optional application environment for each embodiment of the present invention;
Fig. 2 is a flow diagram of an embodiment of the identity verification method of the present invention.
Detailed description of the embodiments
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is further elaborated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described here are only used to explain the invention, not to limit it. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative work fall within the protection scope of the invention.
It should be noted that descriptions involving "first", "second", and the like in the present invention are used for description purposes only and cannot be interpreted as indicating or implying relative importance or implicitly indicating the quantity of the indicated technical features. Thus, a feature defined by "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the embodiments can be combined with each other, provided the combination can be implemented by a person of ordinary skill in the art; when a combination is contradictory or cannot be achieved, it should be understood that the combination does not exist and is not within the protection scope claimed by the present invention.
As shown in Fig. 1, which is a schematic diagram of the application environment of a preferred embodiment of the identity verification method of the present invention, the application environment includes an electronic device 1 and a terminal device. The electronic device 1 can exchange data with the terminal device through a suitable technology such as a network or near-field communication technology. In this embodiment, a user logs into the IVR system of the electronic device 1 through the terminal device to perform voiceprint registration and voiceprint recognition.
The terminal device includes, but is not limited to, any electronic product that can interact with a user through a keyboard, mouse, remote control, touchpad, voice-control device, or the like, for example a personal computer, tablet computer, smartphone, personal digital assistant (PDA), game machine, Internet Protocol Television (IPTV), smart wearable device, navigation device, or other mobile equipment, or a fixed terminal such as a digital TV, desktop computer, notebook, or server.
The electronic device 1 is a device that can automatically perform numerical computation and/or information processing according to preset or stored instructions. The electronic device 1 may be a computer, a single network server, a server group composed of multiple network servers, or a cloud composed of a large number of hosts or network servers based on cloud computing, where cloud computing is a kind of distributed computing: a super virtual computer composed of a group of loosely coupled computers.
In this embodiment, the electronic device 1 may include, but is not limited to, a memory 11, a processor 12, and a network interface 13 that can be communicatively connected through a system bus; the memory 11 stores a processing system that can run on the processor 12. It should be noted that Fig. 1 only illustrates the electronic device 1 with components 11-13; it should be understood that not all illustrated components are required, and more or fewer components may be implemented instead.
The memory 11 includes internal memory and at least one type of readable storage medium. The internal memory provides a cache for the operation of the electronic device 1; the readable storage medium may be a non-volatile storage medium such as a flash memory, hard disk, multimedia card, card-type memory (for example, SD or DX memory), random-access memory (RAM), static random-access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, or optical disk. In some embodiments, the readable storage medium may be an internal storage unit of the electronic device 1, such as the hard disk of the electronic device 1; in other embodiments, the readable storage medium may be an external storage device of the electronic device 1, such as a plug-in hard disk, smart media card (SMC), secure digital (SD) card, or flash card equipped on the electronic device 1. In this embodiment, the readable storage medium of the memory 11 is generally used to store the operating system and various application software installed on the electronic device 1, such as the program code of the processing system in one embodiment of the invention. In addition, the memory 11 can also be used to temporarily store various types of data that have been output or are to be output.
The processor 12 may, in some embodiments, be a central processing unit (CPU), controller, microcontroller, microprocessor, or other data processing chip. The processor 12 is generally used to control the overall operation of the electronic device 1, such as performing control and processing related to data exchange or communication with the terminal device. In this embodiment, the processor 12 is used to run the program code stored in the memory 11 or to process data, for example to run the processing system.
The network interface 13 may include a wireless network interface or a wired network interface and is generally used to establish a communication connection between the electronic device 1 and other electronic equipment. In this embodiment, the network interface 13 is mainly used to connect the electronic device 1 with one or more terminal devices and to establish data transmission channels and communication connections between the electronic device 1 and the one or more terminal devices.
The processing system is stored in the memory 11 and includes at least one computer-readable instruction stored in the memory 11; the at least one computer-readable instruction can be executed by the processor 12 to realize the methods of the embodiments of the present application, and can be divided into different logical modules according to the functions its parts realize.
In one embodiment, when the above processing system is executed by the processor 12, the following steps are realized:
An acoustic-model establishment step: when a user transacts business under an IVR scenario, announce a random code of a first preset number of digits for the user to read aloud, and after the read-aloud establish acoustic models of a preset type for the announced random code and for the user's read-aloud speech respectively.
Under an IVR scenario, when requesting to transact business the user sends an identity code, such as an identity card number. After receiving the user's request, the system analyzes whether the business being handled needs further identity verification, and analyzes according to the identity code whether the user has a registered voiceprint. If further identity verification is needed and the user has a registered voiceprint, the system generates a random code of a first preset number of digits (for example, 8 digits), announces the random code in speech form using speech synthesis technology, and guides the user to read it aloud.
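The challenge step above can be sketched in Python. This is a minimal illustration under stated assumptions: the function names and the digit-by-digit prompt text are not from the patent, and a real IVR system would hand the prompt text to its speech synthesis engine rather than print it.

```python
import secrets

def generate_random_code(num_digits=8):
    """Generate a random numeric challenge code of the given length.

    Each digit is drawn independently from a cryptographic source, so the
    code cannot be predicted or pre-recorded by a caller.
    """
    return "".join(str(secrets.randbelow(10)) for _ in range(num_digits))

def prompt_text(code):
    """Build the announcement text handed to a TTS engine (hypothetical format)."""
    spaced = " ".join(code)  # read digit by digit, e.g. "3 1 4 1 ..."
    return f"Please read the following code aloud: {spaced}"

code = generate_random_code(8)
print(prompt_text(code))
```

Generating a fresh code per session is what defeats replayed or pre-synthesized audio: the attacker cannot know the digits in advance.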
After the user reads aloud, an acoustic model of the preset type is established for the speech of the announced random code, and an acoustic model of the preset type is established for the user's read-aloud speech. In a preferred embodiment, the acoustic model of the preset type is a deep neural network-hidden Markov acoustic model, i.e. a DNN-HMM acoustic model. In other embodiments, the acoustic model of the preset type may be another acoustic model, for example a hidden Markov acoustic model.
In a specific example, taking the DNN-HMM acoustic model: the HMM is used to describe the dynamic change of the speech signal, and each output node of the DNN is used to estimate the posterior probability of a state of a continuous-density HMM, which yields the DNN-HMM model. The speech of the announced random code and the user's read-aloud speech each consist of a series of syllables, which are recognized into words and then into a series of characters. When establishing the DNN-HMM acoustic models, this embodiment performs global character-level acoustic adaptive training based on a predetermined character speech library to obtain the DNN-HMM acoustic models of the announced random code's speech and of the user's read-aloud speech.
A forced overall alignment step: perform a forced overall alignment operation on the acoustic model of the announced random code and the acoustic model of the user's read-aloud speech, and calculate with a predetermined algorithm the probability that the two aligned acoustic models are identical.
Compared with the traditional method of comparing word by word, performing a forced overall alignment (force alignment) of the acoustic model of the announced random code with the acoustic model of the user's read-aloud speech substantially reduces the amount of computation in this embodiment, which is advantageous for improving recognition efficiency.
In one embodiment, the predetermined algorithm is a posterior probability algorithm. In other embodiments it may be a similarity algorithm, such as the edit distance between the characters in the two aligned acoustic models: the smaller the edit distance, the greater the probability that the two aligned acoustic models are identical. The similarity algorithm may also be a longest-common-subsequence algorithm: the closer the length of the obtained longest common subsequence is to the lengths of the character sequences in the two aligned acoustic models, the greater the probability that the two aligned acoustic models are identical.
An identity verification step: if the probability that the two aligned acoustic models are identical exceeds a preset first threshold, extract the voiceprint feature vector of the user's read-aloud speech, obtain the standard voiceprint feature vector stored for the user after successful registration, and calculate the distance between the two vectors to verify the user's identity.
In this embodiment, if the probability that the two aligned acoustic models are identical exceeds the preset first threshold (for example, a preset first threshold of 0.985), the characters read aloud by the user are deemed consistent with the random code announced this time. Because what is announced is a random code, fraud with pre-prepared synthesized speech is effectively prevented, which improves the security of identity recognition.
In one embodiment, the step of extracting the voiceprint feature vector of the user's read-aloud speech includes: applying pre-emphasis and windowing to the user's read-aloud speech, performing a Fourier transform on each windowed frame to obtain the corresponding spectrum, and feeding the spectrum into a Mel filter bank to output the Mel spectrum; then performing cepstral analysis on the Mel spectrum to obtain Mel-frequency cepstral coefficients (MFCC), and composing the voiceprint feature vector of the user's read-aloud speech from the MFCCs.
Specifically, the user's read-aloud speech is first framed, and pre-emphasis is then applied to the framed speech data. Pre-emphasis is in fact high-pass filtering that filters out low-frequency data so that the high-frequency characteristics of the speech data become more prominent; the transfer function of the high-pass filter is H(z) = 1 - αz^(-1), where z is the speech data and α is a constant coefficient, preferably with a value of 0.97. Because the framed speech deviates to some extent from the raw speech, windowing must then be applied to the speech data.
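A minimal NumPy sketch of the framing, pre-emphasis (y[t] = x[t] - αx[t-1] with α = 0.97, i.e. H(z) = 1 - αz^(-1)), and windowing just described. The frame length and hop size are assumed values typical for 16 kHz speech, not parameters from the patent.

```python
import numpy as np

def pre_emphasize(signal: np.ndarray, alpha: float = 0.97) -> np.ndarray:
    """Apply the high-pass filter H(z) = 1 - alpha*z^(-1): y[t] = x[t] - alpha*x[t-1]."""
    return np.append(signal[0], signal[1:] - alpha * signal[:-1])

def frame_and_window(signal: np.ndarray, frame_len: int = 400, hop: int = 160) -> np.ndarray:
    """Split into overlapping frames and apply a Hamming window to each frame
    (windowing compensates for the discontinuities introduced by framing)."""
    n_frames = 1 + max(0, (len(signal) - frame_len) // hop)
    window = np.hamming(frame_len)
    frames = np.stack([signal[i * hop : i * hop + frame_len] for i in range(n_frames)])
    return frames * window

rng = np.random.default_rng(0)
x = rng.standard_normal(16000)            # 1 s of placeholder "speech" at 16 kHz
frames = frame_and_window(pre_emphasize(x))
print(frames.shape)                        # (n_frames, 400)
```

Each row of `frames` is one windowed frame, ready for the Fourier transform and Mel filtering described above.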
In this embodiment, the cepstral analysis performed on the Mel spectrum consists, for example, of taking the logarithm and performing an inverse transform; the inverse transform is usually realized by a discrete cosine transform (DCT), and the 2nd through 13th coefficients after the DCT are taken as the Mel-frequency cepstral coefficients (MFCC). The MFCCs are the voiceprint features of one frame of speech data; the MFCCs of all frames compose a feature matrix, and this feature matrix is the voiceprint feature vector of the user's read-aloud speech.
This embodiment composes the voiceprint feature vector from the MFCCs of the speech data because the Mel-spaced frequency bands approximate the human auditory system more closely than the linearly spaced bands of the normal cepstrum, which can improve the accuracy of identity verification.
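The Mel-spectrum-to-MFCC path above (power spectrum, Mel filter bank, logarithm, then an inverse transform realized by a DCT, keeping the 2nd through 13th coefficients) can be sketched for a single frame as follows. The toy triangular filter bank and all parameter values here are assumptions for illustration only.

```python
import numpy as np

def mel_filterbank(n_filters=26, n_fft=512, sr=16000):
    """Toy triangular Mel filter bank (a simplified stand-in for a production one)."""
    def hz_to_mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel_to_hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mel_pts = np.linspace(hz_to_mel(0), hz_to_mel(sr / 2), n_filters + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_filters, n_fft // 2 + 1))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        for j in range(l, c):
            fbank[i - 1, j] = (j - l) / max(c - l, 1)   # rising edge
        for j in range(c, r):
            fbank[i - 1, j] = (r - j) / max(r - c, 1)   # falling edge
    return fbank

def mfcc(frame, sr=16000, n_fft=512, n_filters=26):
    """MFCCs of one windowed frame: power spectrum -> Mel filter bank -> log -> DCT,
    keeping the 2nd through 13th coefficients as in the embodiment above."""
    spec = np.abs(np.fft.rfft(frame, n_fft)) ** 2
    mel_energies = mel_filterbank(n_filters, n_fft, sr) @ spec
    log_mel = np.log(mel_energies + 1e-10)
    n = np.arange(n_filters)
    dct = np.array([np.sum(log_mel * np.cos(np.pi * k * (n + 0.5) / n_filters))
                    for k in range(n_filters)])          # DCT-II
    return dct[1:13]                                     # 2nd..13th coefficients

frame = np.hamming(400) * np.sin(2 * np.pi * 440 * np.arange(400) / 16000)
print(mfcc(frame).shape)  # (12,)
```

Stacking these 12-dimensional rows over all frames yields the feature matrix that the embodiment treats as the voiceprint feature vector.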
In one embodiment, the distance between the voiceprint feature vector of the user's read-aloud speech and the standard voiceprint feature vector is the cosine distance of the two, computed as:
cos θ = (A · B) / (‖A‖ ‖B‖)
where A is the standard voiceprint feature vector and B is the voiceprint feature vector of the user's read-aloud speech.
If the cosine distance is less than or equal to a preset distance threshold, identity verification passes; if the cosine distance is greater than the preset distance threshold, identity verification fails.
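A small sketch of the cosine-distance check just described, with the distance defined as 1 minus the cosine similarity; the threshold value 0.1 is an assumed placeholder, not a value from the patent.

```python
import numpy as np

def cosine_distance(a: np.ndarray, b: np.ndarray) -> float:
    """1 - cos(theta); 0 means identical direction, larger means less similar."""
    a, b = a.ravel(), b.ravel()
    cos_sim = float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))
    return 1.0 - cos_sim

def verify(standard_vec, probe_vec, threshold=0.1):
    """Pass verification when the cosine distance does not exceed the threshold."""
    return cosine_distance(standard_vec, probe_vec) <= threshold

enrolled = np.array([1.0, 2.0, 3.0])     # stored standard voiceprint vector
same     = np.array([1.1, 2.0, 2.9])     # close to enrolled -> passes
other    = np.array([-3.0, 0.5, 1.0])    # far from enrolled -> fails
print(verify(enrolled, same), verify(enrolled, other))  # True False
```

Flattening with `ravel()` lets the same check apply when the "vector" is actually the per-frame MFCC matrix described earlier.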
In one embodiment, the standard voiceprint feature vector stored after successful registration is obtained through the following voiceprint registration steps:
when the user performs voiceprint registration under an IVR scenario, announce a random code of a second preset number of digits for the user to read aloud a preset number of times, and after each read-aloud establish acoustic models of the preset type for the announced random code and for the read-aloud speech respectively;
perform, for each round, a forced overall alignment operation on the acoustic model of the announced random code and the acoustic model of the corresponding read-aloud speech, and calculate with the predetermined algorithm the probability that the two aligned acoustic models are identical;
if every such probability exceeds a preset second threshold, extract the voiceprint feature vector of each read-aloud utterance and calculate the pairwise distances between these vectors, to analyze whether each read-aloud came from the same user;
if so, store the voiceprint feature vector as the user's standard voiceprint feature vector;
if not, prompt the user to re-enter and perform the voiceprint registration steps again.
In the IVR scenario, the user sends an identity code, such as an ID card number, when requesting registration. After receiving the user's request, a random code of the second preset number of digits is generated and announced in speech form using speech synthesis technology, and the user is guided to read it aloud the preset number of times (for example, 3 times). The second preset number of digits is, for example, 8.
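The random-code announcement during registration might be sketched as follows. `announce_for_registration` and its prompt strings are hypothetical, since the patent does not name a specific speech-synthesis API; a real IVR system would hand the code to its TTS engine.

```python
import secrets

def generate_random_code(n_digits=8):
    # Cryptographically strong digits, so an attacker cannot predict
    # the code and pre-record a matching utterance.
    return ''.join(secrets.choice('0123456789') for _ in range(n_digits))

def announce_for_registration(repeat_times=3):
    # Placeholder for the speech-synthesis announcement: one prompt per
    # required follow-reading of the same code.
    code = generate_random_code()
    prompts = [f"Please read aloud: {' '.join(code)}"] * repeat_times
    return code, prompts

code, prompts = announce_for_registration()
print(len(code), len(prompts))  # 8 3
```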
After each follow-reading, an acoustic model of the preset type is established for the voice of the announced random code, and an acoustic model of the preset type is established for the user's follow-read voice. In a preferred embodiment, the acoustic model of the preset type is a deep neural network-hidden Markov model, i.e., a DNN-HMM acoustic model. In other embodiments, it may be another acoustic model, for example a hidden Markov acoustic model. Specific examples can refer to the above embodiments and are not repeated here.
In a specific example, taking the DNN-HMM acoustic model, the HMM describes the dynamic variation of the speech signal, and each output node of the DNN estimates the posterior probability of a state of the continuous-density HMM, which yields the DNN-HMM model. The voice of each announced random code and the user's follow-read voice are each a series of syllables which, once recognized as words, form a series of characters. When establishing the DNN-HMM acoustic models, this embodiment starts from a predetermined character speech library and obtains, through global character-acoustic adaptive training, the DNN-HMM acoustic model of the announced random code's voice and the DNN-HMM acoustic model of the user's follow-read voice.
Performing a forced overall alignment (force alignment) operation on the acoustic model of each announced random code and the acoustic model of the user's follow-read voice substantially reduces the amount of computation compared with the traditional word-by-word comparison method, which helps improve recognition efficiency.
In one embodiment, the predetermined algorithm is a posterior probability algorithm; in other embodiments, it may also be a similarity algorithm. Specific examples can refer to the above embodiments and are not repeated here.
In this embodiment, if the probability that the two aligned acoustic models are identical exceeds the preset second threshold for every follow-reading (for example, a second threshold of 0.985), the characters read aloud by the user are considered consistent with the announced random code each time. Because the announced code is random, spoofing with voice synthesized by the user in advance is effectively prevented, which improves the security of identity recognition.
In one embodiment, the step of extracting the voiceprint feature vector of each follow-read voice is essentially the same as the method of extracting the voiceprint feature vector of a voice in the above embodiment and is not repeated here.
In one embodiment, the step of calculating the pairwise distances between the voiceprint feature vectors is essentially the same as the cosine distance calculation step described above and is not repeated here.
If every cosine distance is less than or equal to the preset distance threshold, the user reading aloud is the same user each time, and the voiceprint feature vector is stored as that user's standard voiceprint feature vector; if any cosine distance is greater than the preset distance threshold, the user reading aloud is not the same user each time, and the user is prompted to re-register.
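The pairwise consistency check over the enrollment readings can be sketched as below; the cosine-distance convention and the 0.4 threshold are illustrative assumptions, not values from the patent.

```python
from itertools import combinations
import numpy as np

def same_speaker(vectors, threshold=0.4):
    # Registration succeeds only if every pair of follow-read voiceprint
    # vectors is within the preset distance threshold.
    for a, b in combinations(vectors, 2):
        sim = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        if 1.0 - sim > threshold:
            return False  # some pair too far apart: prompt re-registration
    return True

v = np.array([1.0, 0.5, 0.2])
enrol = [v, v * 1.1, v * 0.9]        # three consistent readings
print(same_speaker(enrol))           # True
print(same_speaker(enrol + [-v]))    # False: one reading points the other way
```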
Compared with the prior art, when the present invention performs identity recognition in the IVR scenario, having the user read a random code aloud effectively prevents spoofing with pre-prepared synthesized voice, and combining the random code with voiceprint recognition realizes double verification of the user's identity. The user's identity can therefore be confirmed accurately, improving the security of identity verification in the IVR scenario. In addition, performing the forced overall alignment operation on the acoustic model of the announced random code and the acoustic model of the user's follow-read voice reduces the amount of computation and improves recognition efficiency.
As shown in Fig. 2, which is a flow diagram of an embodiment of the identity verification method of the present invention, the identity verification method includes the following steps:
Step S1: in the interactive voice response (IVR) scenario, when the user transacts business, announcing a random code of a first preset number of digits for the user to read aloud, and after the follow-reading, establishing acoustic models of a preset type for the currently announced random code and for the user's current follow-read voice respectively;
In the IVR scenario, the user sends an identity code, such as an ID card number, when requesting to transact business. After receiving the user's request, the system analyzes whether the business being transacted requires further identity verification and, according to the user's identity code, whether the user has a registered voiceprint. If further identity verification is required and the user has a registered voiceprint, a random code of the first preset number of digits is generated and announced in speech form using speech synthesis technology, and the user is guided to read it aloud. The first preset number of digits is, for example, 8.
After the user's follow-reading, an acoustic model of the preset type is established for the voice of the currently announced random code, and an acoustic model of the preset type is established for the user's current follow-read voice. In a preferred embodiment, the acoustic model of the preset type is a deep neural network-hidden Markov model, i.e., a DNN-HMM acoustic model. In other embodiments, it may be another acoustic model, for example a hidden Markov acoustic model.
In a specific example, taking the DNN-HMM acoustic model, the HMM describes the dynamic variation of the speech signal, and each output node of the DNN estimates the posterior probability of a state of the continuous-density HMM, which yields the DNN-HMM model. The voice of the currently announced random code and the user's current follow-read voice are each a series of syllables which, once recognized as words, form a series of characters. When establishing the DNN-HMM acoustic models, this embodiment starts from a predetermined character speech library and obtains, through global character-acoustic adaptive training, the DNN-HMM acoustic model of the currently announced random code's voice and the DNN-HMM acoustic model of the user's current follow-read voice.
Step S2: performing a forced overall alignment operation on the acoustic model of the currently announced random code and the acoustic model of the user's current follow-read voice, and calculating, with a predetermined algorithm, the probability that the two aligned acoustic models are identical;
Performing a forced overall alignment (force alignment) operation on the acoustic model of the currently announced random code and the acoustic model of the user's current follow-read voice substantially reduces the amount of computation compared with the traditional word-by-word comparison method, which helps improve recognition efficiency.
In one embodiment, the predetermined algorithm is a posterior probability algorithm; in other embodiments, it may also be a similarity algorithm. For example, the similarity algorithm may compute the edit distance between the characters of the two aligned acoustic models: the smaller the edit distance, the greater the probability that the two aligned acoustic models are identical. The similarity algorithm may also be a longest-common-subsequence algorithm: the smaller the difference between the length of the obtained longest common subsequence and the character lengths of the two aligned acoustic models, the greater the probability that the two aligned acoustic models are identical.
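The two similarity measures mentioned above can be sketched as plain dynamic programs; mapping either score to an "identical probability" is application-specific and not fixed by the text.

```python
def edit_distance(s, t):
    # Levenshtein distance: a smaller distance means the two character
    # sequences are more likely identical.
    m, n = len(s), len(t)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        dp[i][0] = i
    for j in range(n + 1):
        dp[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if s[i - 1] == t[j - 1] else 1
            dp[i][j] = min(dp[i - 1][j] + 1,          # deletion
                           dp[i][j - 1] + 1,          # insertion
                           dp[i - 1][j - 1] + cost)   # substitution
    return dp[m][n]

def lcs_length(s, t):
    # Longest common subsequence: a length close to both inputs' lengths
    # indicates the sequences are nearly identical.
    m, n = len(s), len(t)
    dp = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            dp[i][j] = (dp[i - 1][j - 1] + 1 if s[i - 1] == t[j - 1]
                        else max(dp[i - 1][j], dp[i][j - 1]))
    return dp[m][n]

print(edit_distance("80312467", "80312467"))  # 0
print(edit_distance("80312467", "80312567"))  # 1
print(lcs_length("80312467", "80312567"))     # 7
```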
Step S3: if the probability that the two aligned acoustic models are identical is greater than a preset first threshold, extracting the voiceprint feature vector of the user's current follow-read voice, obtaining the standard voiceprint feature vector pre-stored after the user registered successfully, and calculating the distance between the voiceprint feature vector of the user's current follow-read voice and the standard voiceprint feature vector so as to verify the user's identity.
In this embodiment, if the probability that the two aligned acoustic models are identical is greater than the preset first threshold (for example, a first threshold of 0.985), the characters read aloud by the user are considered consistent with the currently announced random code. Because the announced code is random, spoofing with voice synthesized by the user in advance is effectively prevented, which improves the security of identity recognition.
In one embodiment, the step of extracting the voiceprint feature vector of the user's current follow-read voice includes: performing pre-emphasis and windowing on the user's current follow-read voice, applying a Fourier transform to each windowed frame to obtain the corresponding spectrum, and feeding the spectrum into a Mel filter bank, whose output is the Mel spectrum; performing cepstral analysis on the Mel spectrum to obtain the Mel-frequency cepstral coefficients (MFCC), and forming the voiceprint feature vector of the user's current follow-read voice based on the MFCCs.
The user's current follow-read voice is first divided into frames, and pre-emphasis is then applied to the framed voice data. Pre-emphasis is essentially a high-pass filtering operation: it attenuates low-frequency components so that the high-frequency characteristics of the voice data stand out more clearly. Specifically, the transfer function of the high-pass filter is H(Z) = 1 - αZ⁻¹, where Z is the voice data and α is a constant factor, preferably with a value of 0.97. Since framing causes the speech to deviate from the original speech to a certain extent, windowing is then applied to the framed voice data.
In the present embodiment, the cepstral analysis performed on the Mel spectrum consists, for example, of taking the logarithm and applying an inverse transform. The inverse transform is usually realized by a discrete cosine transform (DCT), and the 2nd through 13th DCT coefficients are taken as the Mel-frequency cepstral coefficients (MFCC). The MFCCs are the voiceprint features of one frame of voice data; the per-frame MFCCs are assembled into a feature matrix, and this feature matrix is the voiceprint feature vector of the user's current follow-read voice.
This embodiment uses the MFCCs of the voice data to form the corresponding voiceprint feature vector. Because the Mel-scale frequency bands approximate the human auditory system more closely than the linearly spaced bands of the normal cepstrum, the accuracy of identity verification can be improved.
In one embodiment, the distance between the voiceprint feature vector of the user's current follow-read voice and the standard voiceprint feature vector is the cosine distance of the two, computed from:

cos(A, B) = (A · B) / (‖A‖ ‖B‖)

where A is the standard voiceprint feature vector and B is the voiceprint feature vector of the user's current follow-read voice.

If the cosine distance is less than or equal to a preset distance threshold, identity verification passes; if the cosine distance is greater than the preset distance threshold, identity verification fails.
In one embodiment, the standard voiceprint feature vector is pre-stored after the user registers successfully. The voiceprint registration step includes:

when the user performs voiceprint registration in the interactive voice response (IVR) scenario, announcing a random code of a second preset number of digits for the user to read aloud a preset number of times, and after each follow-reading, establishing acoustic models of a preset type for the announced random code and for the user's follow-read voice respectively;

performing a forced overall alignment operation on the acoustic model of each announced random code and the acoustic model of the corresponding user's follow-read voice, and calculating, with a predetermined algorithm, the probability that the two aligned acoustic models are identical;

if the probability that the two aligned acoustic models are identical exceeds a preset second threshold for every follow-reading, extracting the voiceprint feature vector of each follow-read voice and calculating the pairwise distances between the voiceprint feature vectors, to analyze whether the user reading aloud is the same user each time;

if so, storing that voiceprint feature vector as the user's standard voiceprint feature vector;

if not, prompting the user to re-enter and performing the voiceprint registration step again.
In the IVR scenario, the user sends an identity code, such as an ID card number, when requesting registration. After receiving the user's request, a random code of the second preset number of digits is generated and announced in speech form using speech synthesis technology, and the user is guided to read it aloud the preset number of times (for example, 3 times). The second preset number of digits is, for example, 8.
After each follow-reading, an acoustic model of the preset type is established for the voice of the announced random code, and an acoustic model of the preset type is established for the user's follow-read voice. In a preferred embodiment, the acoustic model of the preset type is a deep neural network-hidden Markov model, i.e., a DNN-HMM acoustic model. In other embodiments, it may be another acoustic model, for example a hidden Markov acoustic model. Specific examples can refer to the above embodiments and are not repeated here.
In a specific example, taking the DNN-HMM acoustic model, the HMM describes the dynamic variation of the speech signal, and each output node of the DNN estimates the posterior probability of a state of the continuous-density HMM, which yields the DNN-HMM model. The voice of each announced random code and the user's follow-read voice are each a series of syllables which, once recognized as words, form a series of characters. When establishing the DNN-HMM acoustic models, this embodiment starts from a predetermined character speech library and obtains, through global character-acoustic adaptive training, the DNN-HMM acoustic model of the announced random code's voice and the DNN-HMM acoustic model of the user's follow-read voice.
Performing a forced overall alignment (force alignment) operation on the acoustic model of each announced random code and the acoustic model of the user's follow-read voice substantially reduces the amount of computation compared with the traditional word-by-word comparison method, which helps improve recognition efficiency.
In one embodiment, the predetermined algorithm is a posterior probability algorithm; in other embodiments, it may also be a similarity algorithm. Specific examples can refer to the above embodiments and are not repeated here.
In this embodiment, if the probability that the two aligned acoustic models are identical exceeds the preset second threshold for every follow-reading (for example, a second threshold of 0.985), the characters read aloud by the user are considered consistent with the announced random code each time. Because the announced code is random, spoofing with voice synthesized by the user in advance is effectively prevented, which improves the security of identity recognition.
In one embodiment, the step of extracting the voiceprint feature vector of each follow-read voice is essentially the same as the method of extracting the voiceprint feature vector of a voice in the above embodiment and is not repeated here.
In one embodiment, the step of calculating the pairwise distances between the voiceprint feature vectors is essentially the same as the cosine distance calculation step described above and is not repeated here.
If every cosine distance is less than or equal to the preset distance threshold, the user reading aloud is the same user each time, and the voiceprint feature vector is stored as that user's standard voiceprint feature vector; if any cosine distance is greater than the preset distance threshold, the user reading aloud is not the same user each time, and the user is prompted to re-register.
The present invention also provides a computer-readable storage medium on which a processing system is stored; when executed by a processor, the processing system implements the steps of the identity verification method described above.
The serial numbers of the above embodiments of the present invention are for description only and do not represent the relative merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the methods of the above embodiments can be implemented by software together with the necessary general-purpose hardware platform, or of course by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present invention, or the part of it that contributes over the prior art, can be embodied in the form of a software product. The software product is stored in a storage medium (such as ROM/RAM, a magnetic disk, or an optical disc) and includes instructions that cause a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to execute the methods described in the embodiments of the present invention.
The above are only preferred embodiments of the present invention and are not intended to limit its scope. Any equivalent structural or process transformation made using the contents of the specification and drawings of the present invention, applied directly or indirectly in other related technical fields, is likewise included within the scope of protection of the present invention.
Claims (10)
1. An electronic device, characterized in that the electronic device comprises a memory and a processor connected to the memory, the memory storing a processing system runnable on the processor, the processing system implementing the following steps when executed by the processor:

an acoustic model establishment step: in the interactive voice response (IVR) scenario, when the user transacts business, announcing a random code of a first preset number of digits for the user to read aloud, and after the follow-reading, establishing acoustic models of a preset type for the currently announced random code and for the user's current follow-read voice respectively;

a forced overall alignment step: performing a forced overall alignment operation on the acoustic model of the currently announced random code and the acoustic model of the user's current follow-read voice, and calculating, with a predetermined algorithm, the probability that the two aligned acoustic models are identical;

an identity verification step: if the probability that the two aligned acoustic models are identical is greater than a preset first threshold, extracting the voiceprint feature vector of the user's current follow-read voice, obtaining the standard voiceprint feature vector pre-stored after the user registered successfully, and calculating the distance between the voiceprint feature vector of the user's current follow-read voice and the standard voiceprint feature vector so as to verify the user's identity.
2. The electronic device according to claim 1, characterized in that the processing system, when executed by the processor, further implements the following steps:

when the user performs voiceprint registration in the interactive voice response (IVR) scenario, announcing a random code of a second preset number of digits for the user to read aloud a preset number of times, and after each follow-reading, establishing acoustic models of the preset type for the announced random code and for the user's follow-read voice respectively;

performing a forced overall alignment operation on the acoustic model of each announced random code and the acoustic model of the corresponding user's follow-read voice, and calculating, with a predetermined algorithm, the probability that the two aligned acoustic models are identical;

if the probability that the two aligned acoustic models are identical exceeds a preset second threshold for every follow-reading, extracting the voiceprint feature vector of each follow-read voice and calculating the pairwise distances between the voiceprint feature vectors, to analyze whether the user reading aloud is the same user each time;

if so, storing that voiceprint feature vector as the user's standard voiceprint feature vector.
3. The electronic device according to claim 1 or 2, characterized in that the acoustic model of the preset type is a deep neural network-hidden Markov model.
4. The electronic device according to claim 1 or 2, characterized in that the step of extracting the voiceprint feature vector of the user's current follow-read voice comprises:

performing pre-emphasis and windowing on the user's current follow-read voice, applying a Fourier transform to each windowed frame to obtain the corresponding spectrum, and feeding the spectrum into a Mel filter bank, whose output is the Mel spectrum;

performing cepstral analysis on the Mel spectrum to obtain the Mel-frequency cepstral coefficients (MFCC), and forming the voiceprint feature vector of the user's current follow-read voice based on the MFCCs.
5. A method of identity verification, characterized in that the method comprises:

S1: in the interactive voice response (IVR) scenario, when the user transacts business, announcing a random code of a first preset number of digits for the user to read aloud, and after the follow-reading, establishing acoustic models of a preset type for the currently announced random code and for the user's current follow-read voice respectively;

S2: performing a forced overall alignment operation on the acoustic model of the currently announced random code and the acoustic model of the user's current follow-read voice, and calculating, with a predetermined algorithm, the probability that the two aligned acoustic models are identical;

S3: if the probability that the two aligned acoustic models are identical is greater than a preset first threshold, extracting the voiceprint feature vector of the user's current follow-read voice, obtaining the standard voiceprint feature vector pre-stored after the user registered successfully, and calculating the distance between the voiceprint feature vector of the user's current follow-read voice and the standard voiceprint feature vector so as to verify the user's identity.
6. The method of identity verification according to claim 5, characterized in that, before step S1, the method further comprises:

S01: when the user performs voiceprint registration in the interactive voice response (IVR) scenario, announcing a random code of a second preset number of digits for the user to read aloud a preset number of times, and after each follow-reading, establishing acoustic models of the preset type for the announced random code and for the user's follow-read voice respectively;

S02: performing a forced overall alignment operation on the acoustic model of each announced random code and the acoustic model of the corresponding user's follow-read voice, and calculating, with a predetermined algorithm, the probability that the two aligned acoustic models are identical;

S03: if the probability that the two aligned acoustic models are identical exceeds a preset second threshold for every follow-reading, extracting the voiceprint feature vector of each follow-read voice and calculating the pairwise distances between the voiceprint feature vectors, to analyze whether the user reading aloud is the same user each time;

S04: if so, storing that voiceprint feature vector as the user's standard voiceprint feature vector.
7. The method of identity verification according to claim 5 or 6, characterized in that the acoustic model of the preset type is a deep neural network-hidden Markov model.
8. The method of identity verification according to claim 5 or 6, characterized in that the step of extracting the voiceprint feature vector of the user's current follow-read voice comprises:

performing pre-emphasis and windowing on the user's current follow-read voice, applying a Fourier transform to each windowed frame to obtain the corresponding spectrum, and feeding the spectrum into a Mel filter bank, whose output is the Mel spectrum;

performing cepstral analysis on the Mel spectrum to obtain the Mel-frequency cepstral coefficients (MFCC), and forming the voiceprint feature vector of the user's current follow-read voice based on the MFCCs.
9. The method of identity verification according to claim 5 or 6, characterized in that the step of calculating the distance between the voiceprint feature vector of the user's current follow-read voice and the standard voiceprint feature vector comprises computing the cosine distance:

cos(A, B) = (A · B) / (‖A‖ ‖B‖)

where A is the standard voiceprint feature vector and B is the voiceprint feature vector of the user's current follow-read voice.
10. A computer-readable storage medium, characterized in that a processing system is stored on the computer-readable storage medium, and when the processing system is executed by a processor, the steps of the method of identity verification according to any one of claims 5 to 9 are implemented.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810311721.2A CN108694952B (en) | 2018-04-09 | 2018-04-09 | Electronic device, identity authentication method and storage medium |
PCT/CN2018/102208 WO2019196305A1 (en) | 2018-04-09 | 2018-08-24 | Electronic device, identity verification method, and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810311721.2A CN108694952B (en) | 2018-04-09 | 2018-04-09 | Electronic device, identity authentication method and storage medium |
Publications (2)
Publication Number | Publication Date |
---|---|
CN108694952A true CN108694952A (en) | 2018-10-23 |
CN108694952B CN108694952B (en) | 2020-04-28 |
Family
ID=63844884
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810311721.2A Active CN108694952B (en) | 2018-04-09 | 2018-04-09 | Electronic device, identity authentication method and storage medium |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN108694952B (en) |
WO (1) | WO2019196305A1 (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103680497A (en) * | 2012-08-31 | 2014-03-26 | 百度在线网络技术(北京)有限公司 | Voice recognition system and voice recognition method based on video |
CN103986725A (en) * | 2014-05-29 | 2014-08-13 | 中国农业银行股份有限公司 | Client side, server side and identity authentication system and method |
CN107517207A (en) * | 2017-03-13 | 2017-12-26 | 平安科技(深圳)有限公司 | Server, auth method and computer-readable recording medium |
2018
- 2018-04-09: CN application CN201810311721.2A filed (granted as CN108694952B, active)
- 2018-08-24: PCT application PCT/CN2018/102208 filed (WO2019196305A1)
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109448732A (en) * | 2018-12-27 | 2019-03-08 | 科大讯飞股份有限公司 | A kind of digit string processing method and processing device |
CN109448732B (en) * | 2018-12-27 | 2021-06-08 | 科大讯飞股份有限公司 | Digital string voice processing method and device |
CN110536029A (en) * | 2019-08-15 | 2019-12-03 | 咪咕音乐有限公司 | A kind of exchange method, network side equipment, terminal device, storage medium and system |
CN110491393A (en) * | 2019-08-30 | 2019-11-22 | 科大讯飞股份有限公司 | The training method and relevant apparatus of vocal print characterization model |
CN110491393B (en) * | 2019-08-30 | 2022-04-22 | 科大讯飞股份有限公司 | Training method of voiceprint representation model and related device |
CN111161746A (en) * | 2019-12-31 | 2020-05-15 | 苏州思必驰信息科技有限公司 | Voiceprint registration method and system |
Also Published As
Publication number | Publication date |
---|---|
WO2019196305A1 (en) | 2019-10-17 |
CN108694952B (en) | 2020-04-28 |
CN115223569A (en) | Speaker verification method based on deep neural network, terminal and storage medium | |
CN116403585A (en) | Outbound customer identification method and system based on robustness characteristics |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||