WO2020199473A1 - Voice password verification method and apparatus, storage medium, and computer device

Info

Publication number: WO2020199473A1
Application number: PCT/CN2019/103048
Authority: WO
Inventors: 张丝潆; 王健宗
Original assignee: 平安科技（深圳）有限公司
Priority date: 2019-04-04
Filing date: 2019-08-28
Publication date: 2020-10-08
Also published as: CN109994118A; CN109994118B

Abstract

The present invention relates to the technical field of security verification, and in particular to a voice password verification method and apparatus, a storage medium, and a computer device. The voice password verification method comprises: receiving voice information input by a user, and parsing the voice information to acquire voiceprint information of the user (S210); inputting the voiceprint information into a pre-trained recognition model and acquiring the degree of similarity between the voiceprint information and preset identity information (S220), wherein the recognition model is the association between voiceprint information and user identity formed from training samples containing an interference factor; and, on the basis of the degree of similarity, scoring the voiceprint information to acquire a score, verification passing if the score exceeds a preset threshold (S230). The provided solution can increase both the interference resistance and the accuracy of voice password verification.

Description

Voice password verification method, device, storage medium and computer equipment

This application claims priority to the Chinese patent application No. 201910270003.X, filed with the Chinese Patent Office on April 4, 2019 and entitled "Voice Password Verification Method, Device, Storage Medium and Computer Equipment", the entire contents of which are incorporated in this application by reference.

Technical field

This application relates to the technical field of security verification. Specifically, this application relates to a voice password verification method, device, storage medium, and computer equipment.

Background

With the advancement of technology and the rise of the smart home concept, more and more smart products have appeared on the market, such as sweeping robots, smart locks, and smart water heaters. Because voiceprints are unique biometric features, technologies for password verification based on voiceprint information have also appeared on the market.

A voice password is a technology that doubly encrypts user information using the text information and the speaker information in a voice segment. It offers good security and convenience, and has good application scenarios in the fields of finance, insurance, public security, and smart devices. The inventor realizes that, in current technical research, the acoustic features used in traditional voiceprint password recognition mainly carry text information and channel information, while speaker information is only weakly represented. As a result, the password recognition process still suffers from deficiencies such as poor interference resistance.

Summary of the invention

This application provides a voice password verification method, device, computer readable storage medium, and computer equipment to improve the anti-interference of voice password recognition.

The embodiment of the application first provides a voice password verification method, including:

Receiving voice information input by a user, and parsing the voice information to obtain voiceprint information of the user;

Inputting the voiceprint information into a pre-trained recognition model to obtain the similarity between the voiceprint information and preset identity information, wherein the recognition model is the association relationship between voiceprint information and user identity formed from training samples containing interference factors;

The voiceprint information is scored according to the similarity to obtain a score value of the voiceprint information, and if the score value exceeds a preset threshold, the verification is passed.

To solve the above technical problems, an embodiment of the present application also provides a voice password verification device, including:

The parsing module is used to receive the voice information input by the user, and parse the voice information to obtain the user's voiceprint information;

The similarity obtaining module is used to input the voiceprint information into a pre-trained recognition model to obtain the similarity between the voiceprint information and preset identity information, wherein the recognition model is the association relationship between voiceprint information and user identity formed from training samples containing interference factors;

The verification module is configured to score the voiceprint information according to the similarity to obtain a score value of the voiceprint information, and if the score value exceeds a preset threshold, the verification is passed.

In order to solve the above-mentioned problems, embodiments of the present application also provide a non-volatile computer-readable storage medium. The computer-readable storage medium stores computer instructions which, when run on a computer, enable the computer to execute the steps of the voice password verification method described in any of the above technical solutions.

Furthermore, an embodiment of the present application also provides a computer device, and the computer device includes:

One or more processors;

Storage device for storing one or more programs,

When the one or more programs are executed by the one or more processors, the one or more processors implement the steps of the voice password verification method described above.

The voice password verification method provided in the embodiment of the application obtains the user's voiceprint information by parsing voice information, inputs the voiceprint information into a pre-trained recognition model, and performs identity verification based on the similarity between the voiceprint information and preset identity information. If the similarity between the currently received voiceprint information and the preset identity information exceeds a preset threshold, the verification is passed. Since the recognition model used to obtain this similarity is trained on samples containing interference factors, it has a certain resistance to interference when processing voiceprint information, which improves the recognition accuracy for voiceprint information. Further, the present application uses a joint probability recognition model based on recognition features to recognize voiceprint information. This algorithm strengthens the speaker information in the speech information, which further improves both the interference resistance and the recognition performance of the recognition model.

The additional aspects and advantages of this application will be partly given in the following description, which will become obvious from the following description, or be understood through the practice of this application.

Description of the drawings

FIG. 1 is a diagram of the implementation environment of a voice password verification method provided by an embodiment of the application;

FIG. 2 is a schematic flowchart of a voice password verification method provided by an embodiment of the application;

FIG. 3 is a schematic flowchart of password verification in the voice password verification method provided by an embodiment of the application;

FIG. 4 is a schematic flowchart of establishing a joint probability model with interference factors added, according to an embodiment of the application;

FIG. 5 is a schematic flowchart of scoring voiceprint information according to the feature likelihood, provided by an embodiment of the application;

FIG. 6 is a schematic structural diagram of a voice password verification device provided by an embodiment of the application;

FIG. 7 is a schematic structural diagram of a computer device provided by an embodiment of the application.

Detailed description

The embodiments of the present application are described in detail below. Examples of the embodiments are shown in the accompanying drawings, wherein the same or similar reference numerals indicate the same or similar elements or elements with the same or similar functions. The embodiments described below with reference to the drawings are exemplary, and are only used to explain the present application, and cannot be construed as a limitation to the present application.

Those skilled in the art can understand that, unless specifically stated, the singular forms "a", "an", "said" and "the" used herein may also include plural forms. It should be further understood that the term "comprising" used in the specification of this application refers to the presence of the described features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.

Fig. 1 is an implementation environment diagram of a voice password verification method provided in an embodiment. The implementation environment includes a user terminal and a server side.

The voice password verification method provided in this embodiment can be applied to the server. The server receives the voice information input by the user and parses it to obtain the user's voiceprint information. It inputs the voiceprint information into a pre-trained recognition model and, based on the recognition model, obtains the similarity between the voiceprint information and the preset identity information, wherein the recognition model is the association relationship between voiceprint information and user identity formed from training samples containing interference factors. The server then scores the voice information according to the similarity to obtain a score value; if the score value exceeds a preset threshold, the verification is passed.

It should be noted that the user terminal can be a smart phone, a tablet computer, a notebook computer, a desktop computer, etc., and the server side can be implemented by a computer device with processing functions, but is not limited to this. The server and the user terminal can be connected to the network through Bluetooth, USB (Universal Serial Bus) or other communication connection methods, and this application is not limited here.

In one embodiment, FIG. 2 is a schematic flowchart of a voice password verification method provided in an embodiment of this application. The voice password verification method can be applied to the server side described above and includes the following steps:

Step S210: Receive voice information input by a user, and parse the voice information to obtain voiceprint information of the user;

Step S220: Input the voiceprint information into a pre-trained recognition model, and obtain the similarity between the voiceprint information and preset identity information based on the recognition model; wherein the recognition model is the association relationship between voiceprint information and user identity formed from training samples containing interference factors;

Step S230: Score the voice information according to the similarity to obtain a score value of the voice information. If the score value exceeds a preset threshold, the verification is passed.

The voice password verification method provided by the embodiment of the application performs identity verification based on voiceprint information. The training samples used to train the recognition model that identifies the user contain interference factors, so the recognition model formed from these training samples has a certain resistance to interference. When the recognition model is used to recognize voiceprint information, the voiceprint information can be recognized accurately, which improves the accuracy and efficiency of voiceprint recognition.
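As a rough illustration of steps S210 to S230, the sketch below reduces the flow to a threshold decision. The names `extract_voiceprint`, `similarity` and `verify`, the list-of-floats voiceprint, and the 0 to 100 score scale are all hypothetical, and cosine similarity merely stands in for the pre-trained recognition model, which the application does not specify at this level.

```python
import math

def extract_voiceprint(voice_samples):
    """S210 stand-in (hypothetical): treat the raw input as the voiceprint vector."""
    return [float(v) for v in voice_samples]

def similarity(voiceprint, enrolled):
    """S220 stand-in: cosine similarity instead of the trained recognition model."""
    dot = sum(a * b for a, b in zip(voiceprint, enrolled))
    norm = math.sqrt(sum(a * a for a in voiceprint)) * math.sqrt(sum(b * b for b in enrolled))
    return dot / norm if norm else 0.0

def verify(voice_samples, enrolled, threshold=0.8):
    """S230: map similarity to a score and pass only if it exceeds the threshold."""
    score = 100.0 * similarity(extract_voiceprint(voice_samples), enrolled)
    return score > 100.0 * threshold, score
```

Calling `verify(received, stored)` returns a pass/fail flag together with the score, so a caller could log the score even when verification fails.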

In order to be more clear about the voice password verification solution provided by this application and its technical effects, the specific solution will be described in detail in several embodiments below.

Before the step of parsing the voice information to obtain the voiceprint information of the user in step S210, a password verification may be performed first to improve the security of the verification scheme. The flow diagram is shown in Fig. 3 and includes the following sub-steps:

S310: Receive identity verification request information, retrieve a preset question from the database in response to the request information, and send it to the user;

S320: Receive voice information for the preset question sent by the user, and parse the voice information to obtain semantic information therein;

S330: Determine whether the semantic information is consistent with a preset answer;

S331: If they are consistent, perform parsing of the voice information in step S210 to obtain voiceprint information of the user.

The preset question can be set by the system and combined with a preset answer provided by the user, with an association relationship established between the preset question and the preset answer. Alternatively, both the preset question and the corresponding preset answer can be customized by the user. In either case, the association between the two is established, and the preset question, the preset answer, and the association between them are stored in the database.

One or more preset questions can be set. If there are multiple preset questions in the database, the preset questions are randomly selected, and the extracted preset questions are sent to the user. Compared with a single question or a scheme that uses static features for identification, this scheme of randomly selecting preset questions is beneficial to improve the security of password verification.

The solution provided by the embodiment of the present application further includes: if the semantic information is inconsistent with the preset answer, the verification is terminated.

The solution provided by the embodiment of this application works as follows. The identity verification request information sent by the user is received and, in response, the password verification phase is entered first: a preset question is retrieved from the database and returned to the user. On receiving the preset question, the user inputs a response through a voice input module, which may be a microphone. The voice information containing the response is received and parsed to obtain its semantic information, which is used to check whether the user's answer is correct. If the semantic information is inconsistent with the preset answer, the user is not a stored standard user and the verification fails. If the semantic information is consistent with the preset answer, the current password verification is passed, and a second verification can then be performed through the voiceprint verification process described above to enhance the security and accuracy of the voice password verification process.
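The password verification stage (S310 to S331) can be sketched as follows. The question store `PRESET_QA`, its contents, and the functions `pick_question`, `normalize` and `password_stage` are hypothetical, and the transcript is assumed to come from a speech recognizer that is not shown here.

```python
import random

# Hypothetical question store: preset question -> preset answer.
PRESET_QA = {
    "What is the name of your first pet?": "snowball",
    "In which city were you born?": "shenzhen",
}

def pick_question(rng=random):
    """S310: randomly select one preset question to send to the user."""
    return rng.choice(sorted(PRESET_QA))

def normalize(text):
    """Reduce a transcript to lowercase words so trivial differences are ignored."""
    return " ".join(text.lower().split())

def password_stage(question, transcript):
    """S320-S331: True means proceed to voiceprint parsing; False ends verification."""
    return normalize(transcript) == normalize(PRESET_QA[question])
```

Exact string matching after normalization is the simplest possible consistency check; a real system would likely need a more tolerant comparison of semantic information.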

Further, if the voice information consistent with the preset answer is not received within the preset time, the following operations can also be performed to increase the verification pass rate:

S340: Retrieve prompt information associated with the preset question, and send the prompt information to the user.

In this embodiment, the prompt information is associated with the preset question, and the reference answer to the preset question is the preset answer. The preset question is also associated with the preset answer in advance, so that the prompt information or the preset answer can be retrieved according to the preset question.

The preset answer is the standard answer associated with the preset question. Not receiving voice information consistent with the preset answer within the preset time covers two situations: first, no voice information from the user is received within the preset time; second, voice information is received within the preset time but is inconsistent with the preset answer. In the first case the user may not remember the preset answer and so inputs nothing; in the second the user's input does not match the preset answer. In either case, the embodiment of this application sends prompt information associated with the preset question to the user to improve the verification success rate and verification efficiency.

One or more pieces of prompt information can be set, all associated with the preset question in advance. When it is detected that no voice information consistent with the preset answer has been received within the preset time, the prompt information associated with the preset question is retrieved and sent to the user. If there are multiple pieces of prompt information, the one with the highest priority is sent first. If, after it is sent, voice information consistent with the preset answer is still not received within the preset time, the prompt information with the second highest priority is sent, and so on in order of priority. If the pieces of prompt information have no priority among them, they can instead be sent to the user in random order, which helps reduce the complexity of the prompt process.
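The prompt ordering described above can be sketched as a small helper. The function name `prompt_order`, the (text, priority) pair representation, and the convention that a larger number means higher priority are assumptions made for illustration only.

```python
import random

def prompt_order(prompts, rng=random):
    """Return prompt texts in sending order.

    `prompts` is a list of (text, priority) pairs; a larger number means higher
    priority, and None means no priority has been assigned (both are assumptions).
    """
    if all(priority is None for _, priority in prompts):
        shuffled = [text for text, _ in prompts]
        rng.shuffle(shuffled)  # no priorities: send in random order
        return shuffled
    ranked = sorted(
        prompts,
        key=lambda p: p[1] if p[1] is not None else float("-inf"),
        reverse=True,
    )
    return [text for text, _ in ranked]
```

The caller would send the first entry, wait for the preset time, and move on to the next entry only if no consistent answer arrives.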

Further, if the number of wrong answers received exceeds a preset threshold, the verification process is terminated. The verification process can be terminated by locking the verification interface, which prevents password guessing through repeated trial and error and avoids the power consumption such behavior causes.
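A minimal sketch of the wrong-answer limit follows. The class name `AttemptLimiter` and the default limit of three are hypothetical; the application only states that verification terminates once the count exceeds a preset threshold.

```python
class AttemptLimiter:
    """Counts wrong answers and locks verification once the limit is exceeded."""

    def __init__(self, max_wrong=3):
        self.max_wrong = max_wrong  # hypothetical preset threshold
        self.wrong = 0
        self.locked = False

    def record(self, answer_correct):
        """Record one attempt; return True while further attempts are allowed."""
        if self.locked:
            return False
        if not answer_correct:
            self.wrong += 1
            if self.wrong > self.max_wrong:
                self.locked = True
        return not self.locked
```

Once `locked` is set, every later call returns False, which corresponds to keeping the verification interface locked.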

Whether the semantic information is consistent with the preset answer is then judged. If it is consistent, the voice information is parsed to obtain the voiceprint information of the user; if it is inconsistent, the verification fails, indicating that the user requesting identity verification is an illegal user.

In the solution provided by the embodiment of the present application, after the identity verification request information sent by the user is received, verification is first performed by comparing the semantic information in the voice information with the preset answer; that is, the text information is used for password verification. If the password verification is passed, the user's identity is then verified through the voiceprint information, as described in the voiceprint verification scheme above. Combining password verification with identity verification in this way helps improve the interference resistance and security of voice password verification.

In an embodiment, before the step of inputting the voiceprint information into the pre-trained recognition model in step S220, the method further includes establishing a joint probability model with interference factors added. The flow chart is shown in FIG. 4, and the establishment process is as follows:

S410: Retrieve voice samples stored in the database, add an interference factor to each voice sample, and generate training samples, where the interference factors correspond to multiple different interference types;

S420: Extract feature information of the voiceprint information in the training sample, and establish an association relationship between the voiceprint information and the user identity according to the feature information.

Interference factors in the voiceprint recognition process, such as noise or multiple people talking, are collected in advance. The interference types in the embodiment of the present application therefore include multiple languages, microphone types, noise, and so on. The user's voice samples are obtained and the above interference types are added to each user's voice samples; adding one interference type to a voice sample forms one training sample. The number of training samples corresponding to each user is not less than the number of interference types. For example, if a user has N voice samples and there are M interference types, the user has at least M training samples.
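The construction of training samples in S410 can be sketched as below. The two interference functions, their parameters, and the list-of-floats sample representation are assumptions chosen to keep the example small; real interference such as a different microphone or language would need recorded or simulated audio.

```python
import random

def add_noise(sample, rng, scale=0.05):
    """Interference type 'noise': add small Gaussian noise to every value."""
    return [v + rng.gauss(0.0, scale) for v in sample]

def attenuate(sample, rng, gain=0.5):
    """Interference type 'microphone': crude stand-in that scales the amplitude."""
    return [v * gain for v in sample]

# Hypothetical registry of interference types (M = 2 here).
INTERFERENCE_TYPES = {"noise": add_noise, "microphone": attenuate}

def build_training_samples(voice_samples, seed=0):
    """S410: pair every voice sample with every interference type."""
    rng = random.Random(seed)
    return [
        {"interference": name, "data": apply(sample, rng)}
        for sample in voice_samples
        for name, apply in INTERFERENCE_TYPES.items()
    ]
```

With N voice samples and M interference types this yields N × M training samples, which satisfies the stated lower bound of M per user.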

The training samples obtained above are used to establish the association relationship: the feature parameters in the training samples are extracted, and the weight coefficient of each feature parameter is determined iteratively from the training samples. Once the weight coefficients of all feature parameters are determined, the recognition model is obtained.

Since interference factors are added to the training data used to establish the recognition model, the resulting recognition model has a certain resistance to interference, which in turn improves its recognition accuracy.

The voice password verification method provided in the foregoing embodiment can improve the anti-interference performance of the voice password verification process, but in order to further improve the security of the voice password verification method, this application provides the following solutions:

In an embodiment, the user inputs voice information through a microphone, and the voice information is parsed to obtain the user's voiceprint information. A joint probability recognition model based on recognition features (i-vector features) is used; the model is obtained with the probabilistic linear discriminant analysis (PLDA) algorithm. This algorithm has good channel compensation performance and can strengthen the speaker information. The purpose of channel compensation is to reduce the interference of channel information with speaker information in the i-vector features, which further improves the interference resistance of the recognition model. From the perspective of pattern recognition, the algorithm increases the between-class scatter and reduces the within-class scatter, giving higher discriminability and better recognition performance.

In the solution provided by this application, interference resistance in the verification process is improved in two ways. First, interference factors are added while the recognition model is being established, so that the model itself has a certain resistance to interference. Second, the PLDA algorithm, which can perform channel compensation, is used to score the voiceprint information. The acoustic features used in voiceprint recognition mainly carry text information and channel information, while speaker information is only weakly represented. Because the PLDA algorithm used in this application strengthens the speaker information, it can further improve the interference resistance of the voice password verification scheme.

The formula of the joint probability recognition model provided by this application is as follows:

In this formula, the vector with subscript mi is the voice sample vector of speaker s_i, where i is the speaker number; μ is the global mean of the training data; y_si is the feature representation of that sample vector in the speaker space; and V represents the feature vectors of the inter-class space. x_i is the interference variable of size Rx and ε_i is the noise variable. The index j = 1, 2, ..., N is a positive integer, the variability of the speaker being decomposed into N different interference types. U represents the feature vectors of the intra-class space, and W_j represents the feature parameter.
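For orientation only, the textbook PLDA decomposition can be written with the symbols defined above. This is the standard model, not the application's own formula: the symbol η for the sample vector and the indices s (speaker) and m (sample) are notation chosen here, and the standard form does not show how the weights W_j over the N interference types enter.

$$\eta_{s,m} = \mu + V\,y_{s} + U\,x_{s,m} + \varepsilon_{s,m}$$

In this standard form, V y_s carries the between-speaker (inter-class) variation shared by all samples of speaker s, U x_{s,m} carries the within-speaker (intra-class) variation of an individual sample, and ε_{s,m} is residual noise.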

Using the training samples in the training sample set and the EM algorithm, the weights corresponding to the interference factors in the recognition model and the PLDA model parameters are obtained. The EM algorithm is essentially maximum likelihood estimation for the parameters of a probability model containing hidden variables. In each iteration, the expectation over the hidden variables given the training data is first computed in the E-step, and this expectation is then maximized in the M-step; the iterations converge to a local optimum.
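To make the E-step and M-step concrete, here is a toy EM loop. It fits the two means of a one-dimensional Gaussian mixture with unit variance and equal weights, which is a deliberately simpler model than PLDA; the function name `em_two_means` and all numeric settings are assumptions for illustration.

```python
import math

def em_two_means(data, mu_a, mu_b, iterations=20):
    """EM for a two-component 1-D Gaussian mixture with unit variance and equal weights.

    Only the two means are estimated; the hidden variable is which component
    generated each point.
    """
    for _ in range(iterations):
        # E-step: responsibility of component A for each point.
        resp = []
        for x in data:
            pa = math.exp(-0.5 * (x - mu_a) ** 2)
            pb = math.exp(-0.5 * (x - mu_b) ** 2)
            resp.append(pa / (pa + pb))
        # M-step: responsibility-weighted means maximize the expected log-likelihood.
        mu_a = sum(r * x for r, x in zip(resp, data)) / sum(resp)
        mu_b = sum((1 - r) * x for r, x in zip(resp, data)) / sum(1 - r for r in resp)
    return mu_a, mu_b
```

Fitting PLDA follows the same alternation, with the latent speaker and interference variables playing the role of the hidden component labels.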

After the model parameters of the PLDA model are obtained according to the above scheme, the feature likelihood between the currently obtained voiceprint information, that is, the voiceprint information to be verified, and the preset identity information is calculated. The process of scoring voiceprint information according to the feature likelihood is shown in FIG. 5. The specific process is as follows:

S510: Retrieve preset identity information, and compare the feature likelihood between the voiceprint information and the preset identity information;

S520: Score the voiceprint information according to the obtained feature likelihood, and obtain a score value of the voiceprint information.

The preset identity information is the pre-stored identity information of a user, and identity information is pre-stored for at least one user. If this application is applied to a door lock, the preset identity information is the pre-stored identity information of the users who may open the door lock. The preset identity information stored in the database is retrieved, the feature likelihood between the currently obtained voiceprint information and the preset identity information is obtained, the expectation maximization algorithm is used to solve iteratively, and the log-likelihood ratio is used to calculate the score value.

Preferably, the following formula is used to calculate the score value of the voiceprint information:

$$\text{score} = \log \frac{p(\eta_1, \eta_2 \mid H_s)}{p(\eta_1 \mid H_d)\, p(\eta_2 \mid H_d)}$$

In the above formula, η₁ and η₂ are the recognition feature vectors of the two speech segments. H_s is the hypothesis that the two speech segments come from the same speaker, and H_d is the hypothesis that they come from different speakers. p(η₁, η₂ | H_s) is the likelihood of the two speech segments coming from the same speaker; p(η₁ | H_d) and p(η₂ | H_d) are the likelihoods of η₁ and η₂, respectively, coming from different speakers. Calculating the log-likelihood ratio measures the similarity of the two speech segments. The degree of similarity between the voiceprint information to be verified and the preset identity information is proportional to the score: the higher the ratio, the higher the score and the greater the probability that the two speech segments belong to the same speaker; the lower the ratio, the lower the score and the smaller that probability.
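The log-likelihood ratio score can be worked through on a scalar toy model. In this sketch each recognition feature is a single number equal to a shared speaker term plus independent noise; the variances `between` and `within`, the function names, and the zero threshold are assumptions, and a real PLDA back end works on vectors with learned matrices.

```python
import math

def log_gauss(x, mean, var):
    """Log density of a 1-D Gaussian."""
    return -0.5 * (math.log(2 * math.pi * var) + (x - mean) ** 2 / var)

def llr_score(eta1, eta2, mu=0.0, between=4.0, within=0.25):
    """Log-likelihood ratio for a scalar two-covariance model (toy values).

    Same speaker (H_s): eta1 and eta2 share one latent speaker term, so they are
    jointly Gaussian with variance between+within and covariance between.
    Different speakers (H_d): eta1 and eta2 are independent.
    """
    total = between + within
    det = total * total - between * between
    d1, d2 = eta1 - mu, eta2 - mu
    quad = (total * d1 * d1 - 2 * between * d1 * d2 + total * d2 * d2) / det
    log_same = -math.log(2 * math.pi) - 0.5 * math.log(det) - 0.5 * quad
    log_diff = log_gauss(eta1, mu, total) + log_gauss(eta2, mu, total)
    return log_same - log_diff

def passes(eta1, eta2, threshold=0.0):
    """Verification passes when the score exceeds the preset threshold."""
    return llr_score(eta1, eta2) > threshold
```

Two nearby features produce a positive score and two distant ones a negative score, which is the behavior the threshold comparison relies on.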

Each training sample contains one type of interference. The inter-class distance between different training samples is calculated, and scoring is based on the distance between the sample to be tested and the stored standard sample. The greater the likelihood that the voice features of the two samples are the same, the more likely the two samples belong to the same speaker.

The above is an embodiment of the voice password verification method provided by this application. For this method, the following describes the embodiment of the corresponding voice password verification device.

The embodiment of the present application also provides a voice password verification device. The structure diagram is shown in FIG. 6, and the device includes a parsing module 610, a similarity obtaining module 620, and a verification module 630, as follows:

The parsing module 610 is configured to receive voice information input by a user, and parse the voice information to obtain voiceprint information of the user;

The similarity obtaining module 620 is configured to input the voiceprint information into a pre-trained recognition model to obtain the similarity between the voiceprint information and preset identity information, wherein the recognition model is the association relationship between voiceprint information and user identity formed from training samples containing interference factors;

The verification module 630 is configured to score the voiceprint information according to the similarity to obtain a score value of the voiceprint information, and if the score value exceeds a preset threshold, the verification is passed.

Regarding the voice password verification device in the above-mentioned embodiment, the specific manner in which each module performs the operation has been described in detail in the embodiment of the method, and will not be elaborated here.

Further, an embodiment of the present application also provides a non-volatile computer-readable storage medium having computer instructions stored thereon; when the computer instructions are executed by a processor, the steps of any one of the above-mentioned voice password verification methods are implemented. The storage medium includes, but is not limited to, any type of disk (including floppy disk, hard disk, optical disk, CD-ROM, and magneto-optical disk), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic card or optical card. That is, the storage medium includes any medium that stores or transmits information in a form readable by a device (for example, a computer). It can be a read-only memory, magnetic disk, optical disk, etc.

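Furthermore, an embodiment of the present application also provides a computer device, and the computer device includes:

One or more processors;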

Storage device for storing one or more programs,

When the one or more programs are executed by the one or more processors, the one or more processors implement the steps of the voice password verification method described in any one of the foregoing.

Fig. 7 is a block diagram showing a computer device 700 according to an exemplary embodiment. For example, the computer device 700 may be provided as a server. Referring to FIG. 7, the computer device 700 includes a processing component 722, which further includes one or more processors, and a memory resource represented by a memory 732, for storing instructions executable by the processing component 722, such as an application program. The application program stored in the memory 732 may include one or more modules each corresponding to a set of instructions. In addition, the processing component 722 is configured to execute instructions to execute the steps of the voice password verification method described above.

The computer device 700 may also include a power supply component 726 configured to perform power management of the computer device 700, a wired or wireless network interface 750 configured to connect the computer device 700 to a network, and an input/output (I/O) interface 758. The computer device 700 can operate based on an operating system stored in the memory 732, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™ or the like.

It should be understood that, although the steps in the flowcharts of the drawings are shown in sequence as indicated by the arrows, these steps are not necessarily executed in the order indicated by the arrows. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and they can be executed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages. These sub-steps or stages are not necessarily executed at the same time, but can be executed at different times, and they are not necessarily performed sequentially, but may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.

It should be understood that the functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units may be integrated into one module. The above-mentioned integrated modules can be implemented in the form of hardware or software functional modules.

The above are only some of the implementations of this application. It should be pointed out that those of ordinary skill in the art can make several improvements and modifications without departing from the principle of this application, and these improvements and modifications should also be regarded as falling within the scope of protection of this application.

Claims

A voice password verification method, comprising:

Receiving voice information input by a user, and parsing the voice information to obtain voiceprint information of the user;

Inputting the voiceprint information into a pre-trained recognition model to obtain the similarity between the voiceprint information and preset identity information; wherein the recognition model is based on an association between voiceprint information and user identity formed from training samples containing interference factors;

Scoring the voiceprint information according to the similarity to obtain a score value of the voiceprint information, wherein the verification passes if the score value exceeds a preset threshold.
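
The scoring and threshold step of the method above can be illustrated with a minimal sketch. The claim does not specify how the similarity is computed or how it is mapped to a score, so the cosine similarity, the 0 to 100 score scale, the default threshold of 90 and all function names below are assumptions made purely for illustration; they stand in for the pre-trained recognition model rather than reproduce it.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors.

    Assumption: this stands in for the similarity output of the
    pre-trained recognition model, which the source does not detail.
    """
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0  # an all-zero vector carries no speaker information
    return dot / (norm_a * norm_b)


def score_from_similarity(similarity):
    """Map a similarity in [-1, 1] onto a hypothetical 0..100 score scale."""
    return 50.0 * (similarity + 1.0)


def verify(voiceprint, enrolled_voiceprint, threshold=90.0):
    """Return (passed, score); verification passes only if score > threshold."""
    similarity = cosine_similarity(voiceprint, enrolled_voiceprint)
    score = score_from_similarity(similarity)
    return score > threshold, score
```

Here `enrolled_voiceprint` plays the role of the preset identity information. The comparison is strict, matching the wording that the score value must exceed the preset threshold.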
The voice password verification method according to claim 1, wherein, before the step of inputting the voiceprint information into a pre-trained recognition model, the method further comprises:

Retrieving the voice samples stored in the database, and adding interference factors to each of the voice samples to generate training samples; wherein the interference factors correspond to multiple different types of interference;

Extracting the feature information of the voiceprint information in the training samples, and establishing the association between the voiceprint information and the user identity according to the feature information.
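
The generation of training samples with interference factors can be sketched as follows. The source does not name the interference types, so the additive white noise at two signal-to-noise ratios and the volume reduction used here are illustrative assumptions, as are all function names and the representation of a voice sample as a non-empty list of floats labelled with a speaker identifier.

```python
import math
import random


def add_white_noise(samples, snr_db, rng):
    """Add Gaussian noise so the result has roughly the requested SNR in dB.

    Assumes `samples` is a non-empty list of floats.
    """
    signal_power = sum(s * s for s in samples) / len(samples)
    noise_power = signal_power / (10.0 ** (snr_db / 10.0))
    sigma = math.sqrt(noise_power)
    return [s + rng.gauss(0.0, sigma) for s in samples]


def apply_gain(samples, gain):
    """Scale the amplitude, e.g. to imitate a speaker far from the microphone."""
    return [s * gain for s in samples]


def build_training_samples(voice_samples, seed=0):
    """Create one training sample per (voice sample, interference type) pair.

    `voice_samples` is a list of (speaker_id, samples) tuples. Each output
    keeps the speaker_id, so the link between voiceprint and user identity
    is preserved for later feature extraction.
    """
    rng = random.Random(seed)
    # Hypothetical interference types; the source only says there are several.
    interference_types = {
        "white_noise_20db": lambda x: add_white_noise(x, 20.0, rng),
        "white_noise_5db": lambda x: add_white_noise(x, 5.0, rng),
        "low_volume": lambda x: apply_gain(x, 0.3),
    }
    training = []
    for speaker_id, samples in voice_samples:
        for label, transform in interference_types.items():
            training.append((speaker_id, label, transform(samples)))
    return training
```

Feature extraction and model training would then run on the returned list; those steps depend on the recognition model and are not sketched here.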
The voice password verification method according to claim 1, wherein the recognition model is a joint probability recognition model based on recognition characteristics, and the formula of the joint probability recognition model is expressed as follows:

Where m_i represents the voice sample vector of the speaker s_i, i represents the number of speakers, μ is the global average of the training data, y_{s_i} is the feature representation of m_i in the speaker space, V represents the feature vector of the inter-class space, x_i is the interference variable with size Rx, ε_i is the noise variable, j = 1, 2, ..., N is a positive integer, the variability of the speaker is decomposed into N different interference types, U represents the feature vector of the intra-class space, and W_j represents the feature parameter.
The voice password verification method according to claim 1, wherein, before the step of parsing the voice information to obtain the user's voiceprint information, the method further comprises:

Receiving identity verification request information, retrieving a preset question from the database in response to the request information, and sending the preset question to the user;

Receiving the voice information sent by the user in response to the preset question, and parsing the voice information to obtain the semantic information therein.
The voice password verification method according to claim 4, wherein, if a plurality of preset questions are set in the database, the step of retrieving the preset questions in the database and sending them to the user comprises:

Randomly extracting preset questions and sending the extracted preset questions to the user.
The voice password verification method according to claim 4, wherein, before the step of receiving the response information sent by the user to the preset question, the method further comprises:

If voice information consistent with the preset answer is not received within a preset time, retrieving the prompt information associated with the preset question and sending the prompt information to the user; wherein the preset answer is the standard answer associated with the preset question.
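
The question-and-prompt flow of the preceding claims (random selection of a preset question, then a prompt when no matching answer arrives within the preset time) can be sketched as follows. The question records, the `send` and `receive_answer` callbacks and the text normalisation are hypothetical; `receive_answer` is assumed to return the recognised answer text, or `None` when the preset time elapses without a reply.

```python
import secrets

# Hypothetical preset questions with their standard answers and prompts.
QUESTIONS = [
    {"question": "What is your favourite colour?", "answer": "blue",
     "hint": "It is the colour of a clear sky."},
    {"question": "What city were you born in?", "answer": "shenzhen",
     "hint": "It is a city in Guangdong."},
]


def normalize(text):
    """Compare answers case-insensitively and ignore surrounding whitespace."""
    return text.strip().casefold()


def pick_question(questions):
    """Randomly extract one preset question (unpredictable to an attacker)."""
    return secrets.choice(questions)


def run_challenge(question, receive_answer, send, timeout_s=10.0):
    """Ask the question; if no matching answer arrives in time, send the prompt once."""
    expected = normalize(question["answer"])
    send(question["question"])
    reply = receive_answer(timeout_s)  # None means the preset time elapsed
    if reply is not None and normalize(reply) == expected:
        return True
    # No consistent answer within the preset time: send the associated prompt.
    send(question["hint"])
    reply = receive_answer(timeout_s)
    return reply is not None and normalize(reply) == expected
```

Only the semantic check is sketched here; the voiceprint of the spoken answer would still be scored by the recognition model as in the first claim.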
The voice password verification method according to claim 1, wherein the step of obtaining the score value of the voiceprint information comprises:

Retrieving the preset identity information, and comparing the feature likelihood between the voiceprint information and the preset identity information;

Scoring the voiceprint information according to the obtained feature likelihood to obtain the score value of the voiceprint information.
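
One common way to turn a feature likelihood comparison into a score is a log-likelihood ratio between a model of the enrolled speaker and a background model. The source does not state which likelihood model is used, so the diagonal Gaussian models, the (mean, variance) model representation and the function names below are assumptions for illustration only.

```python
import math


def diag_gaussian_log_likelihood(x, mean, var):
    """Log-likelihood of feature vector x under a diagonal Gaussian."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def likelihood_ratio_score(x, speaker_model, background_model):
    """Score = log p(x | enrolled speaker) - log p(x | background).

    Each model is a (mean, variance) pair of equal-length lists. A positive
    score means x is better explained by the enrolled speaker's model.
    """
    return (diag_gaussian_log_likelihood(x, *speaker_model)
            - diag_gaussian_log_likelihood(x, *background_model))
```

The resulting score would then be compared with the preset threshold in the same way as any other score value.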
A voice password verification device, comprising:

A parsing module, configured to receive the voice information input by the user, and parse the voice information to obtain the user's voiceprint information;

A similarity obtaining module, configured to input the voiceprint information into a pre-trained recognition model to obtain the similarity between the voiceprint information and preset identity information; wherein the recognition model is based on an association between voiceprint information and user identity formed from training samples containing interference factors;

A verification module, configured to score the voiceprint information according to the similarity to obtain a score value of the voiceprint information, wherein the verification passes if the score value exceeds a preset threshold.
A non-volatile computer-readable storage medium storing computer instructions which, when run on a computer, enable the computer to execute the steps of a voice password verification method, wherein the steps of the voice password verification method comprise:

Receiving voice information input by a user, and parsing the voice information to obtain voiceprint information of the user;

Inputting the voiceprint information into a pre-trained recognition model to obtain the similarity between the voiceprint information and preset identity information; wherein the recognition model is based on an association between voiceprint information and user identity formed from training samples containing interference factors;

Scoring the voiceprint information according to the similarity to obtain a score value of the voiceprint information, wherein the verification passes if the score value exceeds a preset threshold.
The non-volatile computer-readable storage medium according to claim 9, wherein, before the step of inputting the voiceprint information into a pre-trained recognition model, the steps further comprise:

Retrieving the voice samples stored in the database, and adding interference factors to each of the voice samples to generate training samples; wherein the interference factors correspond to multiple different types of interference;

Extracting the feature information of the voiceprint information in the training samples, and establishing the association between the voiceprint information and the user identity according to the feature information.
The non-volatile computer-readable storage medium according to claim 9, wherein the recognition model is a joint probability recognition model based on recognition characteristics, and the formula of the joint probability recognition model is expressed as follows:

Where m_i represents the voice sample vector of the speaker s_i, i represents the number of speakers, μ is the global average of the training data, y_{s_i} is the feature representation of m_i in the speaker space, V represents the feature vector of the inter-class space, x_i is the interference variable with size Rx, ε_i is the noise variable, j = 1, 2, ..., N is a positive integer, the variability of the speaker is decomposed into N different interference types, U represents the feature vector of the intra-class space, and W_j represents the feature parameter.
The non-volatile computer-readable storage medium according to claim 9, wherein, before the step of parsing the voice information to obtain the user's voiceprint information, the steps further comprise:

Receiving identity verification request information, retrieving a preset question from the database in response to the request information, and sending the preset question to the user;

Receiving the voice information sent by the user in response to the preset question, and parsing the voice information to obtain the semantic information therein.
The non-volatile computer-readable storage medium according to claim 12, wherein, if a plurality of preset questions are set in the database, the step of retrieving the preset questions in the database and sending them to the user comprises:

Randomly extracting preset questions and sending the extracted preset questions to the user.
The non-volatile computer-readable storage medium according to claim 12, wherein, before the step of receiving the response information sent by the user to the preset question, the steps further comprise:

If voice information consistent with the preset answer is not received within a preset time, retrieving the prompt information associated with the preset question and sending the prompt information to the user; wherein the preset answer is the standard answer associated with the preset question.
A computer device, comprising:

One or more processors;

A storage device for storing one or more programs;

Wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the steps of a voice password verification method, and the steps of the voice password verification method comprise:

Receiving voice information input by a user, and parsing the voice information to obtain voiceprint information of the user;

Inputting the voiceprint information into a pre-trained recognition model to obtain the similarity between the voiceprint information and preset identity information; wherein the recognition model is based on an association between voiceprint information and user identity formed from training samples containing interference factors;

Scoring the voiceprint information according to the similarity to obtain a score value of the voiceprint information, wherein the verification passes if the score value exceeds a preset threshold.
The computer device according to claim 15, wherein, before the step of inputting the voiceprint information into a pre-trained recognition model, the steps further comprise:

Retrieving the voice samples stored in the database, and adding interference factors to each of the voice samples to generate training samples; wherein the interference factors correspond to multiple different types of interference;

Extracting the feature information of the voiceprint information in the training samples, and establishing the association between the voiceprint information and the user identity according to the feature information.
The computer device according to claim 15, wherein the recognition model is a joint probability recognition model based on recognition characteristics, and the formula of the joint probability recognition model is expressed as follows:

Where m_i represents the voice sample vector of the speaker s_i, i represents the number of speakers, μ is the global average of the training data, y_{s_i} is the feature representation of m_i in the speaker space, V represents the feature vector of the inter-class space, x_i is the interference variable with size Rx, ε_i is the noise variable, j = 1, 2, ..., N is a positive integer, the variability of the speaker is decomposed into N different interference types, U represents the feature vector of the intra-class space, and W_j represents the feature parameter.
The computer device according to claim 15, wherein, before the step of parsing the voice information to obtain the user's voiceprint information, the steps further comprise:

Receiving identity verification request information, retrieving a preset question from the database in response to the request information, and sending the preset question to the user;

Receiving the voice information sent by the user in response to the preset question, and parsing the voice information to obtain the semantic information therein.
The computer device according to claim 18, wherein, if a plurality of preset questions are set in the database, the step of retrieving the preset questions in the database and sending them to the user comprises:

Randomly extracting preset questions and sending the extracted preset questions to the user.
The computer device according to claim 18, wherein, before the step of receiving the response information sent by the user to the preset question, the steps further comprise:

If voice information consistent with the preset answer is not received within a preset time, retrieving the prompt information associated with the preset question and sending the prompt information to the user; wherein the preset answer is the standard answer associated with the preset question.