WO2016054991A1 - Voiceprint information management method and device as well as identity authentication method and system - Google Patents

Voiceprint information management method and device as well as identity authentication method and system

Info

Publication number
WO2016054991A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
voice
user
text
reference voiceprint
Prior art date
Application number
PCT/CN2015/091260
Other languages
French (fr)
Chinese (zh)
Inventor
熊剑
Original Assignee
阿里巴巴集团控股有限公司
熊剑
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司, 熊剑
Priority to JP2017518071A (published as JP6671356B2)
Priority to EP15848463.4A (published as EP3206205B1)
Priority to KR1020177012683A (published as KR20170069258A)
Priority to SG11201702919UA
Publication of WO2016054991A1
Priority to US15/484,082 (published as US10593334B2)

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification techniques
    • G10L 17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification techniques
    • G10L 17/22 Interactive procedures; Man-machine interfaces
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F 21/31 User authentication
    • G06F 21/32 User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification techniques
    • G10L 17/04 Training, enrolment or model building
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification techniques
    • G10L 17/06 Decision making techniques; Pattern matching strategies
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L 21/003 Changing voice quality, e.g. pitch or formants
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification techniques
    • G10L 17/22 Interactive procedures; Man-machine interfaces
    • G10L 17/24 Interactive procedures; Man-machine interfaces the user being prompted to utter a password or a predefined phrase

Definitions

  • the present application relates to the field of voiceprint recognition technology, and in particular, to a voiceprint information management method and apparatus, and an identity authentication method and system.
  • Voiceprint refers to the spectrum of sound waves carrying speech information, as displayed by electroacoustic instruments. When different people say the same words, the sound waves they produce are different, and the corresponding sound wave spectra, that is, the voiceprint information, are also different. Therefore, by comparing voiceprint information it can be determined whether the corresponding speakers are the same person, that is, identity authentication based on voiceprint recognition can be implemented; this identity authentication method can be widely applied to various account management systems to ensure account security.
  • In the related art, before identity authentication is implemented using voiceprint recognition technology, the user first needs to read out preset text information; the user's voice signal is collected at that time and analyzed to obtain the corresponding voiceprint information, which is stored in a voiceprint library as the user's reference voiceprint information. When identity authentication is performed, the person being authenticated is likewise required to read out the preset text information; the person's voice signal is collected and analyzed to obtain the corresponding voiceprint information, and by comparing that voiceprint information with the reference voiceprint information in the voiceprint library, it can be determined whether the person being authenticated is the user himself or herself.
  • However, the text information used for identity authentication is disclosed when the voiceprint library is established, so the text information that the person being authenticated is required to read during identity authentication is also known in advance. Since the text information is known, anyone can record the corresponding sound file in advance and play the pre-recorded sound file to make the authentication succeed. It can be seen that the existing identity authentication method based on voiceprint recognition has serious security risks.
  • the present application provides a voiceprint information management method and apparatus, and an identity authentication method and system.
  • A first aspect of the present application provides a voiceprint information management method, the method comprising the following steps:
  • acquiring a historical voice file generated by a call between a first user and a second user, and filtering the historical voice file to obtain voice information of the first user;
  • performing text recognition processing on the voice information to obtain text information corresponding to the voice information;
  • editing the voice information and the corresponding text information into reference voiceprint information of the first user, and storing the reference voiceprint information and an identity identifier of the first user.
  • The voiceprint information management method further includes: dividing the text information into a plurality of sub-text information and marking the start and end time of each sub-text information; and intercepting, from the voice information according to the start and end time of each sub-text information, the sub-voice information corresponding to that sub-text information.
  • the editing the voice information and the corresponding text information into the reference voiceprint information of the first user includes:
  • Each pair of sub-speech information and sub-text information is separately edited as a reference voiceprint information of the first user.
  • the storing the reference voiceprint information and the identity identifier of the first user includes:
  • searching for second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored and whose corresponding second identity identifier is the same as the first identity identifier corresponding to the first reference voiceprint information;
  • if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
  • if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with the quality of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
  • if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
  • a second aspect of the present application provides a voiceprint information management apparatus, the apparatus comprising:
  • a voice filter configured to acquire a historical voice file generated by the first user and the second user, and perform filtering processing on the historical voice file to obtain voice information of the first user
  • a text identifier configured to perform text recognition processing on the voice information, to obtain text information corresponding to the voice information
  • a voiceprint generator for editing the voice information and corresponding text information into reference voiceprint information of the first user, and storing the reference voiceprint information and an identity identifier of the first user.
  • the voiceprint information management apparatus further includes:
  • a text cutter for dividing the text information into a plurality of sub-text information, and marking a start and end time of each sub-text information
  • the voiceprint cutter is configured to separately intercept the sub-voice information corresponding to each sub-text information from the voice information according to the start and end time of the sub-text information.
  • The voiceprint generator edits the voice information and the corresponding text information into the reference voiceprint information of the first user, including:
  • Each pair of sub-speech information and sub-text information is separately edited as a reference voiceprint information of the first user.
  • the voiceprint generator stores the reference voiceprint information and the identity identifier of the first user, including:
  • searching for second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored and whose corresponding second identity identifier is the same as the first identity identifier corresponding to the first reference voiceprint information;
  • if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
  • if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with the quality of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
  • if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
  • A third aspect of the present application provides an identity authentication method, the method comprising the following steps:
  • acquiring reference voiceprint information corresponding to an identity identifier of a user to be authenticated;
  • outputting the text information in the acquired reference voiceprint information, and receiving the corresponding voice information to be authenticated;
  • matching the voice information in the acquired reference voiceprint information with the voice information to be authenticated; if the matching succeeds, it is determined that the authentication of the user to be authenticated is successful, and if the matching fails, it is determined that the authentication of the user to be authenticated fails.
  • the identity authentication method further includes:
  • the sub-voice information corresponding to each sub-text information is respectively intercepted from the voice information according to the start and end time of the sub-text information.
  • the editing the voice information and the corresponding text information into the reference voiceprint information of the first user includes:
  • Each pair of sub-speech information and sub-text information is separately edited as a reference voiceprint information of the first user.
  • the storing the reference voiceprint information and the identity identifier of the first user includes:
  • searching for second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored and whose corresponding second identity identifier is the same as the first identity identifier corresponding to the first reference voiceprint information;
  • if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
  • if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with the quality of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
  • if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
  • a fourth aspect of the present application provides an identity authentication system; the system includes:
  • a voice filter configured to acquire a historical voice file generated by the first user and the second user, and perform filtering processing on the historical voice file to obtain voice information of the first user
  • a text identifier configured to perform text recognition processing on the voice information, to obtain text information corresponding to the voice information
  • a voiceprint generator configured to edit the voice information and corresponding text information into reference voiceprint information of the first user, and store the reference voiceprint information and an identity identifier of the first user;
  • a voiceprint extractor configured to acquire reference voiceprint information corresponding to an identity identifier of a user to be authenticated
  • a recognition front end configured to output the text information in the acquired reference voiceprint information, and receive the corresponding voice information to be authenticated;
  • a voiceprint matching device configured to match the voice information in the acquired reference voiceprint information with the voice information to be authenticated; if the matching is successful, it is determined that the authentication of the user to be authenticated is successful, and if the matching fails, it is determined that the authentication of the user to be authenticated fails.
  • the identity authentication system further includes:
  • a text cutter for dividing the text information into a plurality of sub-text information, and marking a start and end time of each sub-text information
  • the voiceprint cutter is configured to separately intercept the sub-voice information corresponding to each sub-text information from the voice information according to the start and end time of the sub-text information.
  • The voiceprint generator edits the voice information and the corresponding text information into the reference voiceprint information of the first user, including:
  • Each pair of sub-speech information and sub-text information is separately edited as a reference voiceprint information of the first user.
  • the voiceprint generator stores the reference voiceprint information and the identity identifier of the first user, including:
  • searching for second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored and whose corresponding second identity identifier is the same as the first identity identifier corresponding to the first reference voiceprint information;
  • if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
  • if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with the quality of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
  • if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
  • In the present application, the voice information of the first user is obtained by filtering a historical voice file stored in the related system, the text information corresponding to the voice information is obtained through text recognition processing, and the voice information and the corresponding text information are edited into the reference voiceprint information of the first user. Since the text information and the voice information in the reference voiceprint information are obtained from the historical voice file rather than preset by the related system, they are non-public: neither the first user, nor the second user, nor any other user can predict the specific content of the text information that needs to be read out when identity authentication is performed, so the corresponding sound file cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded sound file.
  • Therefore, when identity authentication is performed based on the voiceprint information management method provided by the present application, the authentication result is more accurate, there is no security risk, and the account is more secure.
  • FIG. 1 is a flowchart of a method for managing voiceprint information provided by an embodiment of the present application.
  • FIG. 2 is a flowchart of another method for managing voiceprint information provided by an embodiment of the present application.
  • FIG. 3 is a flowchart of a method for storing reference voiceprint information provided by an embodiment of the present application.
  • FIG. 4 is a structural block diagram of a voiceprint information management system provided by an embodiment of the present application.
  • FIG. 5 is a structural block diagram of another voiceprint information management system provided by an embodiment of the present application.
  • FIG. 6 is a flowchart of an identity authentication method according to an embodiment of the present application.
  • FIG. 7 is a flowchart of another identity authentication method provided by an embodiment of the present application.
  • FIG. 8 is a structural block diagram of an identity authentication system according to an embodiment of the present application.
  • FIG. 9 is a structural block diagram of another identity authentication system according to an embodiment of the present application.
  • Referring to FIG. 1, the voiceprint information management method includes the following steps.
  • the first user may be a registered user who has a corresponding private account in the account management system, and correspondingly, the second user may be a service personnel of the account management system.
  • the account management system records the voice call process between the registered user and the service personnel and stores the corresponding voice file.
  • The embodiment of the present application filters out the machine prompt sounds, the voice information of the service personnel, and the like from the historical voice files stored by the account management system to obtain the voice information of the registered user, and performs text recognition processing on that voice information to obtain the corresponding text information; the voice information and the corresponding text information can then be used as a set of reference voiceprint information of the registered user. By performing the above steps for each registered user separately, the reference voiceprint information corresponding to each registered user can be obtained and the voiceprint library created.
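  • The following is a minimal sketch of that enrollment flow; it is an illustration only, and the filtering and speech-recognition steps are passed in as hypothetical callables, since the patent does not name any particular implementation.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ReferenceVoiceprint:
    user_id: str      # identity identifier of the first user
    text: str         # text information recognized from the voice information
    audio_path: str   # the filtered voice information, e.g. a WAV file path

def build_reference_voiceprint(
    history_file: str,
    user_id: str,
    filter_user_voice: Callable[[str], str],  # assumed: keeps only the first user's speech
    recognize_text: Callable[[str], str],     # assumed: speech-to-text on that speech
) -> ReferenceVoiceprint:
    """Filter a historical call recording, transcribe it, and package the result."""
    user_audio = filter_user_voice(history_file)  # drop machine prompts / agent speech
    text = recognize_text(user_audio)             # text information for the voice information
    return ReferenceVoiceprint(user_id=user_id, text=text, audio_path=user_audio)
```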
  • The embodiment of the present application obtains the voice information of the first user by filtering a historical voice file stored in the related system, obtains the text information corresponding to the voice information through text recognition processing, and edits the voice information and the corresponding text information into the reference voiceprint information of the first user. Since the text information and the voice information in the reference voiceprint information are obtained from the historical voice file rather than preset by the related system, they are non-public: neither the first user, nor the second user, nor any other user can predict the specific content of the text information that needs to be read out when identity authentication is performed, so the corresponding sound file cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded sound file.
  • Therefore, when identity authentication is performed based on the voiceprint information management method provided by the embodiment of the present application, the authentication result is more accurate, there is no security risk, and the account is more secure.
  • a historical voice file corresponding to an arbitrary call process of the first user and the second user may be randomly obtained, so that the identity identifiers in the voiceprint library are in one-to-one correspondence with the reference voiceprint information.
  • Since the historical voice file that is actually obtained cannot be predicted, the specific content of the text information in the resulting reference voiceprint information cannot be predicted either. Therefore, performing identity authentication based on this embodiment can ensure the accuracy of the authentication result and improve the security of the account.
  • Alternatively, all historical voice files corresponding to the first user may be acquired, and each historical voice file may correspond to at least one set of reference voiceprint information, so that one identity identifier in the voiceprint library may correspond to multiple sets of reference voiceprint information (that is, the first user has multiple sets of reference voiceprint information); correspondingly, any set of reference voiceprint information can be obtained at random to perform identity authentication. Since the text information in each set of reference voiceprint information is non-public and the set obtained during identity authentication cannot be predicted, the specific content of the text information used for identity authentication cannot be predicted, the corresponding sound file cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded sound file. Therefore, performing identity authentication based on this embodiment can ensure the accuracy of the authentication result and improve the security of the account.
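  • A minimal sketch of that random selection step, assuming the voiceprint library is exposed as a plain mapping from identity identifier to stored (text, voice) pairs; the names are illustrative only.

```python
import random
from typing import Dict, List, Tuple

def pick_reference_voiceprint(
    voiceprint_library: Dict[str, List[Tuple[str, str]]],  # user ID -> [(text, audio_path), ...]
    user_id: str,
) -> Tuple[str, str]:
    """Randomly pick one of the user's stored reference voiceprints.

    The random choice is what keeps the prompted text unpredictable to the
    person being authenticated.
    """
    candidates = voiceprint_library.get(user_id, [])
    if not candidates:
        raise KeyError(f"no reference voiceprint stored for user {user_id}")
    return random.choice(candidates)
```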
  • Referring to FIG. 2, the voiceprint information management method includes the following steps.
  • the text information is divided into a plurality of sub-text information, and the start and end time of each sub-text information is marked.
  • the sub-voice information corresponding to each sub-text information is separately intercepted from the voice information according to the start and end time of the sub-text information.
  • The filtered voice information includes multiple pieces of voice information of the first user, and the corresponding text information obtained by text recognition includes multiple sentences or phrases. The embodiment of the present application therefore divides the text information into a plurality of sub-text information (each piece of sub-text information may be a sentence, a phrase or a word) and marks the start and end time of each piece of sub-text information obtained by the division; according to the start and end time, the sub-voice information corresponding to each piece of sub-text information is intercepted from the voice information (that is, the voice information is segmented according to the sub-text information).
  • For example, if the sentence "My account is locked" in the text information is recognized from the 00:03 to 00:05 period of the voice information, then "My account is locked" is taken as one piece of sub-text information whose start and end time is 00:03 to 00:05, and the voice information in the 00:03 to 00:05 period is intercepted as the sub-voice information corresponding to "My account is locked". By segmenting the text information and the voice information in this way, a plurality of pairs of sub-text information and sub-voice information can be obtained and separately edited into reference voiceprint information according to a predetermined format, thereby obtaining a plurality of pieces of reference voiceprint information corresponding to the same user.
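  • The segmentation step can be sketched as follows, assuming the text recognition step already yields per-sentence start and end times; the data layout (a list of samples plus a sample rate) is an assumption made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

@dataclass
class SubText:
    text: str       # one sentence or phrase, e.g. "My account is locked"
    start_s: float  # start time within the voice information, in seconds
    end_s: float    # end time within the voice information, in seconds

def cut_sub_voice(
    samples: Sequence[float],     # stands in for the filtered voice information
    sample_rate: int,
    sub_texts: List[SubText],     # timestamps assumed to come from text recognition
) -> List[Tuple[SubText, Sequence[float]]]:
    """Intercept the sub-voice information for each sub-text by its start/end time."""
    pairs = []
    for sub in sub_texts:
        lo = int(sub.start_s * sample_rate)
        hi = int(sub.end_s * sample_rate)
        sub_voice = samples[lo:hi]      # e.g. the 00:03-00:05 slice in the example above
        pairs.append((sub, sub_voice))  # each pair becomes one reference voiceprint
    return pairs
```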
  • Editing the sub-voice information and the corresponding sub-text information into reference voiceprint information may include: processing the sub-voice information into corresponding sub-voiceprint information and setting a file name for the sub-voiceprint information, where the format of the file name may be "voiceprint number.file format suffix", such as 0989X.WAV; and storing the sub-voiceprint information together with the identity identifier of the first user and the sub-text information corresponding to the sub-voiceprint information. The storage structure of the voiceprint library obtained based on the above voiceprint information management method is shown in Table 1.
  • In Table 1, each row corresponds to one piece of reference voiceprint information in the voiceprint library; the identity identifier (that is, the user ID) is used as the primary key for querying and invoking voiceprint information; and the user voiceprint number is used to mark the number of pieces of reference voiceprint information corresponding to the same user ID.
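  • A minimal sketch of a storage layout consistent with the description of Table 1; the column names and the use of SQLite are assumptions, not taken from the patent, and the sample row reuses the example values given in this document.

```python
import sqlite3

# Create an in-memory voiceprint library with the fields implied by Table 1: the
# identity identifier (user ID) is the query key, the user voiceprint number
# distinguishes multiple reference voiceprints of the same user, and each row also
# carries the sub-text information and the stored voiceprint file name.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE voiceprint_library (
        user_id         TEXT    NOT NULL,  -- identity identifier, e.g. "139XXXXXXX"
        voiceprint_no   INTEGER NOT NULL,  -- user voiceprint number (1, 2, 3, ...)
        sub_text        TEXT    NOT NULL,  -- sub-text information
        voiceprint_file TEXT    NOT NULL,  -- e.g. "0989X.WAV"
        PRIMARY KEY (user_id, voiceprint_no)
    )
""")
conn.execute(
    "INSERT INTO voiceprint_library VALUES (?, ?, ?, ?)",
    ("139XXXXXXX", 1, "Why is there no refund", "0389X.WAV"),
)
```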
  • For example, the sub-text information "Why is there no refund" is output; the voice information to be authenticated, obtained when the user to be authenticated re-reads this sub-text information, is processed into voiceprint information to be authenticated. The voiceprint information to be authenticated is compared with the sub-voiceprint information "0389X.WAV" extracted from the voiceprint library; if the two match, the identity authentication is determined to be successful, that is, the user to be authenticated is the first user corresponding to "139XXXXXXX"; otherwise, if the two do not match, the identity authentication is determined to have failed.
  • The embodiment of the present application obtains the voice information of the first user by filtering the historical voice files stored in the system, performs text recognition processing on the voice information to obtain the corresponding text information, divides the text information into a plurality of sub-text information, intercepts the corresponding sub-voice information from the voice information according to the start and end time of each piece of sub-text information, edits each pair of sub-text information and sub-voice information into one piece of reference voiceprint information, and saves it in the voiceprint library, so that each first user has a plurality of pieces of reference voiceprint information. When identity authentication needs to be performed, one of the plurality of pieces of reference voiceprint information corresponding to the identity identifier to be authenticated is selected at random. Since the reference voiceprint information obtained during identity authentication is random, the specific content of the text information that the user to be authenticated needs to read cannot be predicted. Therefore, performing identity authentication based on the voiceprint library obtained in this embodiment can ensure the accuracy of the authentication result and improve account security.
  • the sub-text information in each reference voiceprint information is short, which can reduce the time required for re-reading the text information, reduce the time consumed by the voiceprint comparison, and improve the authentication efficiency.
  • The voiceprint information management method provided by the embodiment of the present application can not only create a new voiceprint library but also update a created voiceprint library, for example by adding reference voiceprint information corresponding to a new user or adding new reference voiceprint information for an existing user. For a new user, it is only necessary to obtain the historical voice file corresponding to the new user and perform the above steps S12 to S14, or steps S22 to S26, to obtain the reference voiceprint information corresponding to the new user. Since the historical voice files corresponding to the same user increase over time, for an existing user the newly generated historical voice files can be obtained and the above steps performed to add new reference voiceprint information for that user.
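  • A minimal sketch of that update loop; the build and save_with_dedup callables are hypothetical stand-ins for the enrollment pipeline sketched earlier and for the FIG. 3 storage procedure described below.

```python
from typing import Callable, Iterable, Tuple

def update_voiceprint_library(
    new_history_files: Iterable[Tuple[str, str]],  # (identity identifier, path to new call recording)
    build: Callable[[str, str], object],           # e.g. the build_reference_voiceprint sketch above
    save_with_dedup: Callable[[object], None],     # hypothetical hook for the FIG. 3 storage procedure
) -> None:
    """Add reference voiceprints as new historical voice files accumulate.

    The same loop covers a brand-new user (first file) and an existing user whose
    call history has grown; nothing here is prescribed by the patent itself.
    """
    for user_id, history_file in new_history_files:
        reference = build(history_file, user_id)   # filter -> recognize -> package
        save_with_dedup(reference)                 # keep only the best-quality entry per text
```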
  • one or more pieces of reference voiceprint information may be set for the first user.
  • When a plurality of pieces of reference voiceprint information are set for the same first user, it is necessary to ensure that the text information in any two pieces of reference voiceprint information corresponding to the first user is different.
  • In practice, different historical voice files may yield the same recognized text information, or the same text information may be cut into multiple pieces of sub-text information with the same content, so that the same sub-text information corresponds to multiple pieces of sub-voice information. In this case, the embodiment of the present application uses the method shown in FIG. 3 to complete the storage of the reference voiceprint information.
  • Assume that the reference voiceprint information to be stored is first reference voiceprint information composed of first text information and first voice information. In the embodiment of the present application, the process of storing the first reference voiceprint information includes the following steps:
  • Step S31: Determine whether there is second reference voiceprint information that satisfies the comparison condition; if yes, execute step S32, otherwise execute step S34.
  • The comparison condition includes: the second text information corresponding to the second reference voiceprint information is the same as the first text information in the first reference voiceprint information, and the second identity identifier corresponding to the second reference voiceprint information is the same as the first identity identifier corresponding to the first reference voiceprint information.
  • Step S32: Determine whether the quality of the first voice information in the first reference voiceprint information is higher than the quality of the second voice information in the second reference voiceprint information; if yes, perform step S33, otherwise perform step S35.
  • Step S33: Delete the second reference voiceprint information, and store the first reference voiceprint information and the first identity identifier.
  • Step S34: Directly store the first reference voiceprint information and the first identity identifier.
  • Step S35: Delete the first reference voiceprint information.
  • In step S31, when determining whether the second reference voiceprint information exists, the search range includes at least the reference voiceprint information already stored in the voiceprint library, and may also include reference voiceprint information generated at the same time as the first reference voiceprint information but not yet stored. If the second reference voiceprint information does not exist, the first reference voiceprint information is stored directly. If the second reference voiceprint information is found, then the same first user and the same text information correspond to at least two different pieces of voice information; in that case the quality of the first voice information in the first reference voiceprint information is compared with the quality of the second voice information in the second reference voiceprint information. If the quality of the first voice information is higher than that of the second voice information, the first reference voiceprint information is stored and the second reference voiceprint information is deleted; if the quality of the first voice information is lower than that of the second voice information, the first reference voiceprint information is deleted directly. In other words, only the voice information with the highest quality is retained for the same text information, so as to improve the accuracy of the voice information comparison result in the identity authentication process and reduce the difficulty of the comparison.
  • In this way, the following three voiceprint library update modes can be implemented: 1) adding reference voiceprint information for a new user; 2) adding reference voiceprint information with new text information for an existing user; 3) replacing reference voiceprint information whose voice information quality is lower with reference voiceprint information whose voice information quality is higher.
  • That is, newly obtained reference voiceprint information is not stored directly into the voiceprint library in the embodiment of the present application; it is first determined whether another piece of reference voiceprint information having the same text information and the same identity identifier is already stored. If such a piece exists, the quality of the voice information in the two pieces of reference voiceprint information is compared, the reference voiceprint information with the higher voice information quality is retained, and the reference voiceprint information with the lower voice information quality is deleted. Therefore, the embodiment of the present application can ensure that, among the stored reference voiceprint information, the text information in any two pieces of reference voiceprint information corresponding to the same identity identifier (that is, the same first user) is different, and that the voice information corresponding to each piece of text information is of the highest quality. When identity authentication is performed based on the embodiment of the present application, voiceprint comparison based on the higher-quality voice information can ensure the accuracy of the authentication and improve authentication efficiency.
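  • The storage logic of steps S31 to S35 can be sketched as follows; the single-number quality score is an assumption (the patent does not fix a particular quality metric), and the in-memory dict stands in for the voiceprint library.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

@dataclass
class StoredVoiceprint:
    user_id: str     # identity identifier
    text: str        # text information
    audio_path: str  # voice information
    quality: float   # assumed quality score; the patent does not fix a metric

def store_with_comparison(
    library: Dict[Tuple[str, str], StoredVoiceprint],  # keyed by (user ID, text)
    candidate: StoredVoiceprint,
) -> None:
    """Steps S31-S35: keep only the higher-quality voice per (user, text) pair."""
    key = (candidate.user_id, candidate.text)
    existing = library.get(key)          # S31: look for second reference voiceprint
    if existing is None:
        library[key] = candidate         # S34: store the candidate directly
    elif candidate.quality > existing.quality:
        library[key] = candidate         # S33: replace the lower-quality entry
    # otherwise (S35): discard the candidate and keep the stored entry
```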
  • FIG. 4 is a structural block diagram of a voiceprint information management system according to an embodiment of the present application; the voiceprint information management system can be applied to an account management system.
  • the voiceprint information management system 100 includes a voice filter 110, a text recognizer 120, and a voiceprint generator 130.
  • the voice filter 110 is configured to acquire a historical voice file generated by a call between the first user and the second user, and perform filtering processing on the historical voice file to obtain voice information of the first user.
  • the text recognizer 120 is configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
  • The voiceprint generator 130 is configured to edit the voice information and the corresponding text information into reference voiceprint information of the first user, and to store the reference voiceprint information and the identity identifier of the first user.
  • In the embodiment of the present application, the historical voice file stored in the related system is filtered to obtain the voice information of the first user, the text information corresponding to the voice information is obtained through text recognition processing, and the voice information and the corresponding text information are edited into the reference voiceprint information of the first user. Since the text information and the voice information in the reference voiceprint information are obtained from the historical voice file rather than preset by the related system, they are non-public: neither the first user, nor the second user, nor any other user can predict the specific content of the text information that needs to be read out when identity authentication is performed, so the corresponding sound file cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded sound file.
  • Therefore, when identity authentication is performed based on the embodiment of the present application, the authentication result is more accurate, there is no security risk, and the account is more secure.
  • FIG. 5 is a structural block diagram of another voiceprint information management system according to an embodiment of the present disclosure; the voiceprint information management system can be applied to an account management system.
  • the voiceprint information management system 200 includes a voice filter 210, a text recognizer 220, a text cutter 240, a voiceprint cutter 250, and a voiceprint generator 230.
  • the voice filter 210 is configured to acquire a historical voice file generated by a call between the first user and the second user, and perform filtering processing on the historical voice file to obtain voice information of the first user.
  • the text recognizer 220 is configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
  • the text cutter 240 is configured to slice the text information into a plurality of sub-text information and mark the start and end time of each sub-text information.
  • the voiceprint cutter 250 is configured to respectively intercept sub-voice information corresponding to each sub-text information from the voice information according to the start and end time of the sub-text information.
  • The voiceprint generator 230 is configured to edit each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user, and to store each piece of reference voiceprint information together with the identity identifier of the first user.
  • The embodiment of the present application obtains the voice information of the first user by filtering the historical voice files stored in the system, performs text recognition processing on the voice information to obtain the corresponding text information, divides the recognized text information into a plurality of sub-text information, intercepts the corresponding sub-voice information from the voice information according to the start and end time of each piece of sub-text information, edits each pair of sub-text information and sub-voice information into one piece of reference voiceprint information, and stores it in the voiceprint library, so that each first user has multiple pieces of reference voiceprint information. When identity authentication needs to be performed, one of the plurality of pieces of reference voiceprint information corresponding to the identity identifier to be authenticated may be selected at random. Since the reference voiceprint information obtained during identity authentication is random, the specific content of the text information that the user to be authenticated needs to read cannot be predicted, so the corresponding sound file cannot be recorded in advance and authentication cannot be passed by playing a pre-recorded sound file. Therefore, performing identity authentication based on the voiceprint library obtained in this embodiment can ensure the accuracy of the authentication result and improve account security.
  • the sub-text information in each reference voiceprint information is short, which can reduce the time required for re-reading the text information, reduce the time consumed by the voiceprint comparison, and improve the authentication efficiency.
  • In addition, the voiceprint generator 130 and the voiceprint generator 230 may be configured to:
  • determine whether there exists second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored and whose corresponding second identity identifier is the same as the first identity identifier corresponding to the first reference voiceprint information;
  • if the second reference voiceprint information does not exist, directly store the first reference voiceprint information and the first identity identifier;
  • if the second reference voiceprint information exists, compare the quality of the first voice information in the first reference voiceprint information with that of the second voice information in the second reference voiceprint information, and directly delete the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
  • if the quality of the first voice information is higher than that of the second voice information, delete the second reference voiceprint information and store the first reference voiceprint information and the first identity identifier.
  • In this way, the embodiment of the present application can ensure that, among the stored reference voiceprint information, the text information in any two pieces of reference voiceprint information corresponding to the same user is different, and that the voice information corresponding to each piece of text information is of the highest quality. Thus, when identity authentication is performed based on the embodiment of the present application, voiceprint comparison based on the higher-quality voice information can ensure the accuracy of the authentication and improve authentication efficiency.
  • FIG. 6 is a flowchart of an identity authentication method according to an embodiment of the present application; the identity authentication method may be applied to an account management system. Referring to FIG. 6, the identity authentication method includes the following steps.
  • the first user may be a registered user who has a corresponding private account in the account management system.
  • the second user may be a service personnel of the account management system.
  • The embodiment of the present application obtains the voice information of the first user by filtering a historical voice file stored in the related system, obtains the text information corresponding to the voice information through text recognition processing, and edits the voice information and the corresponding text information into the reference voiceprint information of the first user. Since the text information and the voice information in the reference voiceprint information are obtained from the historical voice file rather than preset by the related system, they are non-public: neither the first user, nor the second user, nor any other user can predict the specific content of the text information that needs to be read out when identity authentication is performed, so the corresponding sound file cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded sound file.
  • Therefore, when identity authentication is performed based on this embodiment, the authentication result is more accurate, there is no security risk, and the account is more secure.
  • FIG. 7 is a flowchart of another identity authentication method according to an embodiment of the present application; the identity authentication method may be applied to an account management system.
  • Referring to FIG. 7, the identity authentication method includes the following steps.
  • the text information is divided into a plurality of sub-text information, and the start and end time of each sub-text information is marked.
  • the sub-voice information corresponding to each sub-text information is separately extracted from the voice information according to the start and end time of the sub-text information.
  • In the present application, the text information is divided into a plurality of sub-text information, the corresponding sub-voice information is intercepted according to the start and end times, and each piece of sub-text information and the corresponding sub-voice information are edited into one piece of reference voiceprint information, so that the first user has a plurality of pieces of reference voiceprint information. When identity authentication needs to be performed, one of the plurality of pieces of reference voiceprint information corresponding to the identity identifier to be authenticated is selected at random. Since the reference voiceprint information obtained during identity authentication is random, the specific content of the text information that the user to be authenticated needs to read cannot be predicted, so the corresponding sound file cannot be recorded in advance and authentication cannot be passed by playing a pre-recorded sound file. Therefore, the identity authentication method provided in this embodiment can ensure the accuracy of the authentication result and improve the security of the account.
  • the sub-text information in each reference voiceprint information is short, which can reduce the time required for re-reading the text information, reduce the time consumed by the voiceprint comparison, and improve the authentication efficiency.
  • In addition, the identity authentication method provided by the embodiment of the present application can also complete the storage of the reference voiceprint information by using the method shown in FIG. 3, which ensures not only that, among the stored reference voiceprint information, the text information in any two pieces of reference voiceprint information corresponding to the same user is different, but also that the voice information corresponding to each piece of text information is of the highest quality. When identity authentication is performed based on the embodiment of the present application, voiceprint comparison based on the higher-quality voice information can ensure the accuracy of the authentication and improve authentication efficiency.
  • FIG. 8 is a structural block diagram of an identity authentication system according to an embodiment of the present application, where the identity authentication system can be applied to an account management system.
  • the identity authentication system 300 includes a voice filter 310, a text recognizer 320, a voiceprint generator 330, a voiceprint extractor 360, a recognition front end 370, and a voiceprint matcher 380.
  • the voice filter 310 is configured to acquire a historical voice file generated by the first user and the second user, and perform filtering processing on the historical voice file to obtain voice information of the first user.
  • the text recognizer 320 is configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
  • The voiceprint generator 330 is configured to edit the voice information and the corresponding text information into reference voiceprint information of the first user, and to store the reference voiceprint information and the identity identifier of the first user.
  • the voiceprint extractor 360 is configured to acquire reference voiceprint information corresponding to an identity identifier of a user to be authenticated.
  • the recognition front end 370 is configured to output the text information in the acquired reference voiceprint information and receive the corresponding voice information to be authenticated.
  • The voiceprint matcher 380 is configured to match the voice information in the acquired reference voiceprint information with the voice information to be authenticated; if the matching is successful, it is determined that the authentication of the user to be authenticated is successful, and if the matching fails, it is determined that the authentication of the user to be authenticated fails.
  • The recognition front end 370 is configured to implement the interaction between the identity authentication system and the user to be authenticated. In addition to outputting the text information in the reference voiceprint information acquired by the voiceprint extractor 360 and receiving the voice information to be authenticated that is input by the user to be authenticated, it may also receive an identity authentication request from the user to be authenticated, trigger the voiceprint extractor 360 after receiving the identity authentication request, and output the authentication result obtained by the voiceprint matcher 380 to the user to be authenticated.
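  • A minimal sketch of the end-to-end authentication flow around these components; prompt_user, capture_voice and voices_match are assumed stand-ins for the recognition front end and the voiceprint matcher, not APIs defined by the patent.

```python
import random
from typing import Callable, Dict, List, Tuple

def authenticate(
    user_id: str,
    voiceprint_library: Dict[str, List[Tuple[str, str]]],  # user ID -> [(text, reference audio), ...]
    prompt_user: Callable[[str], None],        # recognition front end: output the text information
    capture_voice: Callable[[], str],          # recognition front end: receive the voice to authenticate
    voices_match: Callable[[str, str], bool],  # voiceprint matcher
) -> bool:
    """Authenticate a user against a randomly chosen reference voiceprint."""
    candidates = voiceprint_library.get(user_id, [])
    if not candidates:
        return False                                   # no reference voiceprint on record
    text, reference_audio = random.choice(candidates)  # voiceprint extractor
    prompt_user(text)                                  # ask the user to read the text out loud
    candidate_audio = capture_voice()                  # the voice information to be authenticated
    return voices_match(reference_audio, candidate_audio)
```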
  • In the embodiment of the present application, the historical voice file stored in the related system is filtered to obtain the voice information of the first user, the text information corresponding to the voice information is obtained through text recognition processing, and the voice information and the corresponding text information are edited into the reference voiceprint information of the first user. Since the text information and the voice information in the reference voiceprint information are obtained from the historical voice file rather than preset by the related system, they are non-public: neither the first user, nor the second user, nor any other user can predict the specific content of the text information that needs to be read out when identity authentication is performed, so the corresponding sound file cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded sound file.
  • Therefore, when identity authentication is performed based on the embodiment of the present application, the authentication result is more accurate, there is no security risk, and the account is more secure.
  • FIG. 9 is a structural block diagram of an identity authentication system according to an embodiment of the present application.
  • the identity authentication system can be applied to an account management system.
  • the identity authentication system 400 includes a voice filter 410, a text recognizer 420, a text cutter 440, a voiceprint cutter 450, a voiceprint generator 430, a voiceprint extractor 460, a recognition front end 470, and Voiceprint matcher 480.
  • the voice filter 410 is configured to acquire a historical voice file generated by the first user and the second user, and perform filtering processing on the historical voice file to obtain voice information of the first user.
  • the text recognizer 420 is configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
  • the text cutter 440 is configured to slice the text information into a plurality of sub-text information and mark the start and end time of each sub-text information.
  • the voiceprint cutter 450 is configured to respectively intercept the sub-voice information corresponding to each sub-text information from the voice information according to the start and end time of the sub-text information.
  • The voiceprint generator 430 is configured to edit each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user, and to store each piece of reference voiceprint information together with the identity identifier of the first user.
  • the voiceprint extractor 460 is configured to acquire reference voiceprint information corresponding to an identity identifier of a user to be authenticated.
  • The recognition front end 470 is configured to output the sub-text information in the acquired reference voiceprint information and receive the corresponding voice information to be authenticated.
  • The voiceprint matcher 480 is configured to match the sub-voice information in the acquired reference voiceprint information with the voice information to be authenticated; if the matching is successful, it is determined that the authentication of the user to be authenticated is successful, and if the matching fails, it is determined that the authentication of the user to be authenticated fails.
  • In the embodiment of the present application, the recognized text information is divided into a plurality of sub-text information, the corresponding sub-voice information is intercepted according to the start and end times, and each piece of sub-text information and the corresponding sub-voice information are edited into one piece of reference voiceprint information, so that the first user has a plurality of pieces of reference voiceprint information. When identity authentication needs to be performed, the corresponding plurality of pieces of reference voiceprint information are determined from the identity identifier of the user to be authenticated, and one of them is selected at random for this identity authentication.
  • the identity authentication system provided in this embodiment can ensure the accuracy of the authentication result and improve the security of the account.
  • the sub-text information in each reference voiceprint information is short, which can reduce the time required for re-reading the text information, reduce the time consumed by the voiceprint comparison, and improve the authentication efficiency.
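  • The patent does not prescribe how the voiceprint matcher compares voice information; one common choice, used here purely as an assumed illustration, is to compare fixed-length voiceprint feature vectors (for example speaker embeddings) against a similarity threshold.

```python
import math
from typing import Sequence

def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def voices_match(
    reference_features: Sequence[float],  # assumed fixed-length voiceprint features
    candidate_features: Sequence[float],
    threshold: float = 0.8,               # illustrative threshold, not from the patent
) -> bool:
    """Declare a match when the feature vectors are similar enough."""
    return cosine_similarity(reference_features, candidate_features) >= threshold
```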
  • the voiceprint generator 330 and the voiceprint generator 430 may be configured to:
  • determine whether there exists second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored and whose corresponding second identity identifier is the same as the first identity identifier corresponding to the first reference voiceprint information;
  • if the second reference voiceprint information does not exist, directly store the first reference voiceprint information and the identity identifier of the first user;
  • if the second reference voiceprint information exists, compare the quality of the first voice information in the first reference voiceprint information with that of the second voice information in the second reference voiceprint information, and delete the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
  • if the quality of the first voice information is higher than that of the second voice information, delete the second reference voiceprint information and store the first reference voiceprint information and the corresponding user identity identifier.
  • In this way, the embodiment of the present application can ensure that, among the stored reference voiceprint information, the text information in any two pieces of reference voiceprint information corresponding to the same identity identifier is different, and that the voice information corresponding to each piece of text information is of the highest quality. When identity authentication is performed based on the embodiment of the present application, voiceprint comparison based on the higher-quality voice information can ensure the accuracy of the authentication and improve authentication efficiency.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Telephonic Communication Services (AREA)
  • Storage Device Security (AREA)

Abstract

A voiceprint information management method and device, and an identity authentication method and system. The voiceprint information management method comprises: filtering a historical voice file stored in a related system to obtain voice information of a first user (S12); performing text recognition processing to obtain text information corresponding to the voice information (S13); and editing the voice information and the corresponding text information into reference voiceprint information of the first user. Because the text information and the voice information in the reference voiceprint information are both obtained from the historical voice file rather than preset by the related system, they are not public, and no user can predict the specific content of the text information that must be read back when identity authentication is executed. A corresponding audio file therefore cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded audio file. Identity authentication performed on the basis of this voiceprint information management yields a more accurate authentication result, eliminates this potential safety hazard, and provides higher account security.

Description

Voiceprint information management method and device, and identity authentication method and system
Technical Field
The present application relates to the field of voiceprint recognition technology, and in particular, to a voiceprint information management method and device, and an identity authentication method and system.
Background
A voiceprint is the spectrum of sound waves carrying speech information, as displayed by an electro-acoustic instrument. When different people speak the same words, the sound waves they produce differ, and the corresponding sound-wave spectra, that is, the voiceprint information, also differ. Therefore, by comparing voiceprint information it can be determined whether the corresponding speakers are the same person; in other words, identity authentication based on voiceprint recognition can be implemented. This authentication approach can be widely applied in various account management systems to guarantee account security.
In the related art, before identity authentication is implemented using voiceprint recognition technology, the user is first required to read out preset text information; the user's voice signal is collected at that time and analyzed to obtain the corresponding voiceprint information, which is stored in a voiceprint library as the user's reference voiceprint information. When identity authentication is performed, the person to be authenticated is likewise required to read out the preset text information; that person's voice signal is collected and analyzed to obtain the corresponding voiceprint information, and by comparing this voiceprint information with the reference voiceprint information in the voiceprint library it can be determined whether the person to be authenticated is the user himself or herself.
In the above technology, the text information used for identity authentication is already disclosed when the voiceprint library is established; correspondingly, the text information that the person to be authenticated is required to read out during authentication is also known. If an audio file of the user reading out that text information is recorded in advance, anyone can pass the authentication by playing the pre-recorded audio file. It can be seen that the existing voiceprint-recognition-based identity authentication method has serious potential security risks.
发明内容Summary of the invention
为克服相关技术中存在的问题,本申请提供一种声纹信息管理方法、装置以及身份认证方法、系统。To overcome the problems in the related art, the present application provides a voiceprint information management method and apparatus, and an identity authentication method and system.
本申请第一方面提供一种声纹信息管理方法,该方法包括如下步骤:A first aspect of the present application provides a voiceprint information management method, the method comprising the following steps:
获取第一用户与第二用户通话产生的历史语音文件;Obtaining a historical voice file generated by the first user and the second user;
对所述历史语音文件执行过滤处理,得到所述第一用户的语音信息;Performing a filtering process on the historical voice file to obtain voice information of the first user;
对所述语音信息执行文本识别处理,得到所述语音信息对应的文本信息;Performing text recognition processing on the voice information to obtain text information corresponding to the voice information;
将所述语音信息和对应的文本信息编辑为所述第一用户的基准声纹信息,并存储所述基准声纹信息和所述第一用户的身份标识符。Editing the voice information and corresponding text information into reference voiceprint information of the first user, and storing the reference voiceprint information and an identity identifier of the first user.
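The flow of the first-aspect method can be sketched as below; this is only an illustration, and the placeholder helpers filter_first_user_voice and recognize_text stand in for the filtering and text recognition steps, whose concrete implementations are not specified here.

```python
from dataclasses import dataclass

@dataclass
class ReferenceVoiceprint:
    user_id: str     # identity identifier of the first user
    text_info: str   # text information recognized from the user's voice information
    voice_file: str  # path of the voice information paired with the text

def filter_first_user_voice(call_recording: str) -> str:
    # Placeholder: a real system would strip machine prompts and the second
    # user's (agent's) speech, keeping only the first user's voice information.
    return call_recording.replace(".wav", "_user.wav")

def recognize_text(voice_file: str) -> str:
    # Placeholder for the speech-to-text step over the filtered voice information.
    return "why is there no refund"

def enroll_from_history(call_recording: str, user_id: str, library: dict) -> ReferenceVoiceprint:
    """Edit the filtered voice information and its recognized text into one
    piece of reference voiceprint information and store it under the user ID."""
    voice_file = filter_first_user_voice(call_recording)
    text_info = recognize_text(voice_file)
    record = ReferenceVoiceprint(user_id, text_info, voice_file)
    library.setdefault(user_id, []).append(record)
    return record

library: dict = {}
enroll_from_history("call_20150930.wav", "139XXXXXXXX", library)
```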
With reference to the first aspect, in a first feasible implementation of the first aspect, the voiceprint information management method further comprises:
dividing the text information into a plurality of pieces of sub-text information, and marking the start and end time of each piece of sub-text information; and
intercepting, from the voice information according to the start and end times of the sub-text information, the sub-voice information corresponding to each piece of sub-text information.
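As an illustration of how sub-text information and its start and end times might be represented, the sketch below assumes a speech recognizer that returns word-level timestamps; the data layout and the grouping rule are assumptions made for the example only.

```python
from typing import List, Tuple

Word = Tuple[str, float, float]      # (word, start time in seconds, end time in seconds)
SubText = Tuple[str, float, float]   # (sub-text information, start time, end time)

def split_text_info(words: List[Word], max_words: int = 6) -> List[SubText]:
    """Divide the recognized text information into short pieces of sub-text
    information and mark each piece's start and end time, taken from its
    first and last word."""
    sub_texts: List[SubText] = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        text = " ".join(w for w, _, _ in chunk)
        sub_texts.append((text, chunk[0][1], chunk[-1][2]))
    return sub_texts

# Toy transcript with word-level timestamps (seconds).
words = [("my", 3.0, 3.3), ("account", 3.3, 3.8), ("is", 3.8, 4.0), ("locked", 4.0, 5.0)]
print(split_text_info(words))   # [('my account is locked', 3.0, 5.0)]
```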
With reference to the first feasible implementation of the first aspect, in a second feasible implementation of the first aspect, editing the voice information and the corresponding text information into the reference voiceprint information of the first user comprises:
editing each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
With reference to the first aspect, in a third feasible implementation of the first aspect, storing the reference voiceprint information and the identity identifier of the first user comprises:
determining whether there is second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored, and whose corresponding second identity identifier is also the same as the first identity identifier corresponding to the first reference voiceprint information;
if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with the quality of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information; and
if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
A second aspect of the present application provides a voiceprint information management device, comprising:
a voice filter, configured to acquire a historical voice file generated by a call between a first user and a second user, and perform filtering processing on the historical voice file to obtain voice information of the first user;
a text recognizer, configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information; and
a voiceprint generator, configured to edit the voice information and the corresponding text information into reference voiceprint information of the first user, and store the reference voiceprint information and an identity identifier of the first user.
With reference to the second aspect, in a first feasible implementation of the second aspect, the voiceprint information management device further comprises:
a text cutter, configured to divide the text information into a plurality of pieces of sub-text information, and mark the start and end time of each piece of sub-text information; and
a voiceprint cutter, configured to intercept, from the voice information according to the start and end times of the sub-text information, the sub-voice information corresponding to each piece of sub-text information.
With reference to the first feasible implementation of the second aspect, in a second feasible implementation of the second aspect, the voiceprint generator editing the voice information and the corresponding text information into the reference voiceprint information of the first user comprises:
editing each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
With reference to the second aspect, in a third feasible implementation of the second aspect, the voiceprint generator storing the reference voiceprint information and the identity identifier of the first user comprises:
determining whether there is second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored, and whose corresponding second identity identifier is also the same as the first identity identifier corresponding to the first reference voiceprint information;
if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with the quality of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information; and
if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
A third aspect of the present application provides an identity authentication method, comprising the following steps:
acquiring a historical voice file generated by a call between a first user and a second user;
performing filtering processing on the historical voice file to obtain voice information of the first user;
performing text recognition processing on the voice information to obtain text information corresponding to the voice information;
editing the voice information and the corresponding text information into reference voiceprint information of the first user, and storing the reference voiceprint information and an identity identifier of the first user;
acquiring reference voiceprint information corresponding to an identity identifier of a user to be authenticated;
outputting the text information in the acquired reference voiceprint information, and receiving corresponding voice information to be authenticated; and
matching the voice information in the acquired reference voiceprint information against the voice information to be authenticated; if the matching succeeds, determining that the authentication of the user to be authenticated is successful, and if the matching fails, determining that the authentication of the user to be authenticated has failed.
With reference to the third aspect, in a first feasible implementation of the third aspect, the identity authentication method further comprises:
dividing the text information into a plurality of pieces of sub-text information, and marking the start and end time of each piece of sub-text information; and
intercepting, from the voice information according to the start and end times of the sub-text information, the sub-voice information corresponding to each piece of sub-text information.
With reference to the first feasible implementation of the third aspect, in a second feasible implementation of the third aspect, editing the voice information and the corresponding text information into the reference voiceprint information of the first user comprises:
editing each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
With reference to the third aspect, in a third feasible implementation of the third aspect, storing the reference voiceprint information and the identity identifier of the first user comprises:
determining whether there is second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored, and whose corresponding second identity identifier is also the same as the first identity identifier corresponding to the first reference voiceprint information;
if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with the quality of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information; and
if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
A fourth aspect of the present application provides an identity authentication system, comprising:
a voice filter, configured to acquire a historical voice file generated by a call between a first user and a second user, and perform filtering processing on the historical voice file to obtain voice information of the first user;
a text recognizer, configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information;
a voiceprint generator, configured to edit the voice information and the corresponding text information into reference voiceprint information of the first user, and store the reference voiceprint information and an identity identifier of the first user;
a voiceprint extractor, configured to acquire reference voiceprint information corresponding to an identity identifier of a user to be authenticated;
a recognition front-end, configured to output the text information in the acquired reference voiceprint information, and receive corresponding voice information to be authenticated; and
a voiceprint matcher, configured to match the voice information in the acquired reference voiceprint information against the voice information to be authenticated; if the matching succeeds, determine that the authentication of the user to be authenticated is successful, and if the matching fails, determine that the authentication of the user to be authenticated has failed.
With reference to the fourth aspect, in a first feasible implementation of the fourth aspect, the identity authentication system further comprises:
a text cutter, configured to divide the text information into a plurality of pieces of sub-text information, and mark the start and end time of each piece of sub-text information; and
a voiceprint cutter, configured to intercept, from the voice information according to the start and end times of the sub-text information, the sub-voice information corresponding to each piece of sub-text information.
With reference to the first feasible implementation of the fourth aspect, in a second feasible implementation of the fourth aspect, the voiceprint generator editing the voice information and the corresponding text information into the reference voiceprint information of the first user comprises:
editing each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
With reference to the fourth aspect, in a third feasible implementation of the fourth aspect, the voiceprint generator storing the reference voiceprint information and the identity identifier of the first user comprises:
determining whether there is second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored, and whose corresponding second identity identifier is also the same as the first identity identifier corresponding to the first reference voiceprint information;
if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with the quality of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information; and
if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
It can be seen from the above technical solutions that the present application filters the historical voice files stored by the related system to obtain the voice information of the first user, obtains the text information corresponding to that voice information through text recognition processing, and edits the voice information and the corresponding text information into the reference voiceprint information of the first user. Because the text information and the voice information in the reference voiceprint information are both obtained from the historical voice file rather than preset by the related system, that is, they are not public, neither the first user, nor the second user, nor any other user can predict the specific content of the text information that must be read back when identity authentication is executed; a corresponding audio file therefore cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded audio file. Accordingly, compared with the existing voiceprint-recognition-based identity authentication method, identity authentication based on the voiceprint information management method provided by the present application yields a more accurate authentication result, has no such potential safety hazard, and provides higher account security.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present application.
Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present invention and, together with the description, serve to explain the principles of the present invention.
FIG. 1 is a flowchart of a voiceprint information management method provided by an embodiment of the present application.
FIG. 2 is a flowchart of another voiceprint information management method provided by an embodiment of the present application.
FIG. 3 is a flowchart of a method for storing reference voiceprint information provided by an embodiment of the present application.
FIG. 4 is a structural block diagram of a voiceprint information management system provided by an embodiment of the present application.
FIG. 5 is a structural block diagram of another voiceprint information management system provided by an embodiment of the present application.
FIG. 6 is a flowchart of an identity authentication method provided by an embodiment of the present application.
FIG. 7 is a flowchart of another identity authentication method provided by an embodiment of the present application.
FIG. 8 is a structural block diagram of an identity authentication system provided by an embodiment of the present application.
FIG. 9 is a structural block diagram of another identity authentication system provided by an embodiment of the present application.
Detailed Description
Exemplary embodiments will be described in detail here, examples of which are illustrated in the accompanying drawings. When the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention; rather, they are merely examples of devices and methods consistent with some aspects of the present invention as detailed in the appended claims.
FIG. 1 is a flowchart of a voiceprint information management method provided by an embodiment of the present application; the method is applied to an account management system. As shown in FIG. 1, the voiceprint information management method comprises the following steps.
S11. Acquire a historical voice file generated by a call between a first user and a second user.
The first user may be a registered user who has a corresponding private account in the account management system; correspondingly, the second user may be a service agent of the account management system.
S12. Perform filtering processing on the historical voice file to obtain voice information of the first user.
S13. Perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
S14. Edit the voice information and the corresponding text information into reference voiceprint information of the first user, and store the reference voiceprint information and an identity identifier of the first user.
In general, to facilitate performance statistics, service quality evaluation, dispute handling and the like, the account management system records the voice calls between registered users and service agents and stores the corresponding voice files. In view of this, this embodiment of the present application filters the machine prompt tones, the service agent's voice and other content out of the historical voice files stored by the account management system to obtain the registered user's voice information, performs text recognition processing on that voice information to obtain the corresponding text information, and uses the voice information and the corresponding text information as one set of reference voiceprint information of the registered user. By performing the above steps for each registered user, the reference voiceprint information corresponding to each registered user can be obtained, completing the creation of the voiceprint library.
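How the filtering of step S12 is carried out is not prescribed here. Purely as an illustration, the sketch below assumes the call recording is a two-channel (stereo) 16-bit PCM file with the registered user on one channel, so that the first user's voice information can be obtained by keeping that channel; real deployments may instead require speaker separation or prompt/agent removal.

```python
import wave

def extract_channel(stereo_path: str, mono_path: str, channel: int = 0) -> None:
    """Keep only one channel of a stereo call recording (assumed 16-bit PCM),
    treating that channel as the first user's voice information."""
    with wave.open(stereo_path, "rb") as src:
        assert src.getnchannels() == 2 and src.getsampwidth() == 2
        framerate = src.getframerate()
        frames = src.readframes(src.getnframes())

    samples = bytearray()
    frame_size = 4                     # 2 channels * 2 bytes per sample
    offset = channel * 2               # byte offset of the wanted channel in each frame
    for i in range(0, len(frames), frame_size):
        samples += frames[i + offset:i + offset + 2]

    with wave.open(mono_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(framerate)
        dst.writeframes(bytes(samples))

# extract_channel("call_20150930.wav", "first_user_voice.wav", channel=0)
```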
It can be seen from the above method that this embodiment of the present application obtains the voice information of the first user by filtering the historical voice file stored by the related system, obtains the text information corresponding to that voice information through text recognition processing, and edits the voice information and the corresponding text information into the reference voiceprint information of the first user. Because the text information and the voice information in the reference voiceprint information are both obtained from the historical voice file rather than preset by the related system, that is, they are not public, neither the first user, nor the second user, nor any other user can predict the specific content of the text information that must be read back when identity authentication is executed; a corresponding audio file therefore cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded audio file. Accordingly, compared with the existing voiceprint-recognition-based identity authentication method, identity authentication based on the voiceprint information management method provided by this embodiment yields a more accurate authentication result, has no such potential safety hazard, and provides higher account security.
In one feasible embodiment of the present application, one historical voice file corresponding to any single call between the first user and the second user may be acquired at random, so that the identity identifiers in the voiceprint library correspond one-to-one with the reference voiceprint information. Because it cannot be predicted which call the acquired historical voice file corresponds to, the specific content of the text information in the resulting reference voiceprint information cannot be predicted either; performing identity authentication based on this embodiment can therefore ensure the accuracy of the authentication result and improve the security of the account.
In another feasible embodiment of the present application, all historical voice files corresponding to the first user may be acquired, and each historical voice file may correspond to at least one set of reference voiceprint information, so that one identity identifier in the voiceprint library may correspond to multiple sets of reference voiceprint information (that is, the first user has multiple sets of reference voiceprint information); correspondingly, any one set of reference voiceprint information may be acquired at random to perform identity authentication. Because the text information in each set of reference voiceprint information is not public, the reference voiceprint information acquired during identity authentication cannot be predicted, and so the specific content of the text information used for the authentication cannot be predicted either; a corresponding audio file therefore cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded audio file. Performing identity authentication based on this embodiment can thus ensure the accuracy of the authentication result and improve the security of the account.
FIG. 2 is a flowchart of a voiceprint information management method provided by another embodiment of the present application; the method is applied to an account management system. As shown in FIG. 2, the voiceprint information management method comprises the following steps.
S21. Acquire a historical voice file generated by a call between a first user and a second user.
S22. Perform filtering processing on the historical voice file to obtain voice information of the first user.
S23. Perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
S24. Divide the text information into a plurality of pieces of sub-text information, and mark the start and end time of each piece of sub-text information.
S25. Intercept, from the voice information according to the start and end times of the sub-text information, the sub-voice information corresponding to each piece of sub-text information.
S26. Edit each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user, and store each piece of reference voiceprint information and the identity identifier of the first user.
Because the historical voice file is a recording of calls between the first user and the second user over a period of time, the filtered voice information contains multiple segments of the first user's speech, and the corresponding text information obtained through text recognition contains multiple sentences or phrases. This embodiment of the present application divides the text information into a plurality of pieces of sub-text information (each piece may be a sentence, a phrase or a word); at the same time, a start and end time is marked for each piece of sub-text information obtained by the division, and the sub-voice information corresponding to that piece is intercepted from the voice information according to the start and end time (that is, the voice information is segmented according to the sub-text information). For example, if the sentence "My account is locked" in the text information was recognized from the 00:03 to 00:05 portion of the voice information, then "My account is locked" is split off as one piece of sub-text information with a start and end time of 00:03 to 00:05; correspondingly, the 00:03 to 00:05 portion of the voice information is cut out, yielding the sub-voice information corresponding to the sub-text information "My account is locked". By segmenting the text information and the voice information in this way, multiple pairs of sub-text information and sub-voice information are obtained, and editing each pair into reference voiceprint information in a predetermined format yields multiple pieces of reference voiceprint information corresponding to the same user.
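The interception of the 00:03 to 00:05 segment could, for instance, be done with Python's standard wave module as sketched below. This is only an illustration under the assumption of a mono PCM recording; the file names are hypothetical.

```python
import wave

def intercept_sub_voice(voice_path: str, out_path: str, start_s: float, end_s: float) -> None:
    """Cut the sub-voice information between start_s and end_s (in seconds)
    out of the first user's voice information, e.g. 3.0-5.0 for the
    sub-text "My account is locked"."""
    with wave.open(voice_path, "rb") as src:
        rate = src.getframerate()
        src.setpos(int(start_s * rate))                          # jump to 00:03
        frames = src.readframes(int((end_s - start_s) * rate))   # read up to 00:05
        params = src.getparams()

    with wave.open(out_path, "wb") as dst:
        dst.setparams(params)      # same channels / sample width / frame rate
        dst.writeframes(frames)    # the frame count in the header is fixed up on close

# Example call (file names are illustrative only):
# intercept_sub_voice("first_user_voice.wav", "0361X.WAV", 3.0, 5.0)
```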
In this embodiment of the present application, editing the sub-voice information and the corresponding sub-text information into reference voiceprint information may comprise: processing the sub-voice information into corresponding sub-voiceprint information and setting a file name for the sub-voiceprint information, where the file name format may be "voiceprint number.file-format suffix", for example 0989X.WAV; and storing the sub-voiceprint information together with the identity identifier of the first user, the sub-text information and other information corresponding to that sub-voiceprint information. The storage structure of a voiceprint library obtained with the above voiceprint information management method is shown in Table 1.
Table 1. Example of the voiceprint library storage structure
User ID | User voiceprint No. | Sub-text information | Sub-voiceprint information
139XXXXXXXX | 1 | Very satisfied | 0989X.WAV
139XXXXXXXX | 2 | Why is there no refund | 0389X.WAV
189XXXXXXXX | 1 | I am very angry | 0687X.WAV
189XXXXXXXX | 2 | Account is locked | 0361X.WAV
In Table 1, each row corresponds to one piece of reference voiceprint information in the voiceprint library; the identity identifier (that is, the user ID) serves as the primary key for querying and retrieving voiceprint information, and the user voiceprint number marks the number of pieces of reference voiceprint information corresponding to the same user ID. Taking the user ID "139XXXXXXXX" as an example, when an identity authentication request for this user ID is received, the reference voiceprint information corresponding to "139XXXXXXXX" is queried from the voiceprint library, yielding multiple results, from which one piece is randomly extracted as the reference voiceprint information for this authentication. For example, if reference voiceprint No. 2 of this user ID is extracted, its sub-text information "Why is there no refund" is output; the voice information to be authenticated, obtained when the user to be authenticated reads back this sub-text information, is received and processed into voiceprint information to be authenticated, which is compared with the sub-voiceprint information "0389X.WAV" extracted from the voiceprint library. If the two match, the identity authentication is determined to be successful, that is, the user to be authenticated is considered to be the first user corresponding to "139XXXXXXXX"; conversely, if the two do not match, the identity authentication is determined to have failed.
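The storage structure of Table 1 could be mirrored, for example, by a simple record type keyed on the user ID; the layout below is an assumption for illustration only, not the application's actual schema.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class VoiceprintRecord:
    user_id: str          # identity identifier, primary key for queries
    voiceprint_no: int    # numbers the records belonging to one user ID
    sub_text: str         # sub-text information the user will be asked to read back
    sub_voiceprint: str   # file name of the sub-voiceprint information

# One record per row of Table 1, grouped under the user ID.
voiceprint_library: Dict[str, List[VoiceprintRecord]] = {
    "139XXXXXXXX": [
        VoiceprintRecord("139XXXXXXXX", 1, "Very satisfied", "0989X.WAV"),
        VoiceprintRecord("139XXXXXXXX", 2, "Why is there no refund", "0389X.WAV"),
    ],
    "189XXXXXXXX": [
        VoiceprintRecord("189XXXXXXXX", 1, "I am very angry", "0687X.WAV"),
        VoiceprintRecord("189XXXXXXXX", 2, "Account is locked", "0361X.WAV"),
    ],
}

# Query by the primary key (user ID):
print(len(voiceprint_library["139XXXXXXXX"]))   # 2 pieces of reference voiceprint information
```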
It can be seen from the above technical solution that this embodiment of the present application obtains the voice information of the first user by filtering the historical voice files stored in the system, obtains the corresponding text information by performing text recognition processing on that voice information, divides the recognized text information into a plurality of pieces of sub-text information, intercepts the corresponding sub-voice information from the voice information according to the start and end time of each piece of sub-text information, edits each pair of sub-text information and sub-voice information into one piece of reference voiceprint information, and stores it in the voiceprint library, so that each first user has multiple pieces of reference voiceprint information. When identity authentication needs to be performed, one piece is randomly selected from the multiple pieces of reference voiceprint information corresponding to the identity identifier to be authenticated. Because the reference voiceprint information acquired during identity authentication is random, the specific content of the text information that the user to be authenticated will be required to read back cannot be predicted; therefore, performing identity authentication with the voiceprint library obtained in this embodiment can ensure the accuracy of the authentication result and improve the security of the account. In addition, in this embodiment the sub-text information in each piece of reference voiceprint information is short, which reduces the time needed to read back the text information, reduces the time consumed by the voiceprint comparison, and improves authentication efficiency.
The voiceprint information management method provided by the embodiments of the present application can not only create a new voiceprint library but also update an existing one, for example by adding reference voiceprint information for a new user or adding new reference voiceprint information for an existing user. For a new user, it is only necessary to acquire the historical voice files corresponding to that user and perform the above steps S12 to S14, or steps S22 to S26, to obtain the reference voiceprint information corresponding to the new user. Because the historical voice files corresponding to the same user keep accumulating over time, for an existing user the newly added historical voice files can be acquired and the above steps performed, thereby adding new reference voiceprint information for that user.
Based on the voiceprint information management method provided by the embodiments of the present application, one or more pieces of reference voiceprint information may be set for the first user. When multiple pieces of reference voiceprint information are set for the same first user, it must be ensured that the text information in any two pieces of reference voiceprint information corresponding to that user is different. In practice, however, the following situations are unavoidable: different historical voice files yield text information with identical content, or the same text information is divided into multiple pieces of sub-text information with identical content, so that the same sub-text information corresponds to multiple pieces of sub-voice information. In such cases, this embodiment of the present application completes the storage of the reference voiceprint information using the method shown in FIG. 3. For ease of description, assume that the reference voiceprint information to be stored is first reference voiceprint information consisting of first text information and first voice information. As shown in FIG. 3, the process of storing the first reference voiceprint information in this embodiment comprises the following steps.
S31. Determine whether there is second reference voiceprint information that satisfies the comparison condition; if it exists, perform step S32, otherwise perform step S34.
The comparison condition includes: the second text information corresponding to the second reference voiceprint information is the same as the first text information in the first reference voiceprint information, and the second identity identifier corresponding to the second reference voiceprint information is also the same as the first identity identifier corresponding to the first reference voiceprint information.
S32. Determine whether the quality of the first voice information in the first reference voiceprint information is higher than the quality of the second voice information in the second reference voiceprint information; if so, perform step S33, otherwise perform step S35.
S33. Delete the second reference voiceprint information, and perform step S34.
S34. Store the first reference voiceprint information and the corresponding first identity identifier.
S35. Delete the first reference voiceprint information.
In step S31, when determining whether the second reference voiceprint information exists, the search scope includes at least the reference voiceprint information already stored in the voiceprint library, and may also include reference voiceprint information generated together with the first reference voiceprint information but not yet stored. If the second reference voiceprint information does not exist, the first reference voiceprint information is stored directly. If the second reference voiceprint information is found, it means that the same first user and the same text information correspond to at least two different pieces of voice information; in this case, the quality of the first voice information in the first reference voiceprint information is compared with the quality of the second voice information in the second reference voiceprint information. If the quality of the first voice information is higher than that of the second voice information, the first reference voiceprint information is stored and the second reference voiceprint information is deleted; if the quality of the first voice information is lower than that of the second voice information, the first reference voiceprint information is deleted directly. In other words, for the same text information only the voice information of the highest quality is retained, so as to improve the accuracy of the voice information comparison during identity authentication and reduce the difficulty of the comparison.
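A minimal sketch of steps S31-S35 follows. It assumes that the quality of a piece of voice information has already been summarized as a single score (for example, a signal-to-noise estimate); that metric and the record layout are assumptions made for the illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ReferenceVoiceprint:
    user_id: str      # identity identifier
    text_info: str    # text information
    voice_file: str   # voice information (file name)
    quality: float    # assumed per-segment quality score, e.g. an SNR estimate

def store_reference_voiceprint(new: ReferenceVoiceprint,
                               library: List[ReferenceVoiceprint]) -> None:
    """Steps S31-S35: keep, per (identity identifier, text information), only
    the reference voiceprint whose voice information has the highest quality."""
    existing: Optional[ReferenceVoiceprint] = next(
        (r for r in library
         if r.user_id == new.user_id and r.text_info == new.text_info),
        None)
    if existing is None:                  # S31 -> S34: no duplicate, store directly
        library.append(new)
    elif new.quality > existing.quality:  # S32 -> S33 -> S34: replace the lower-quality record
        library.remove(existing)
        library.append(new)
    # else S35: discard the new, lower-quality reference voiceprint information
```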
Based on the above storage process, the following three ways of updating the voiceprint library can be implemented: 1) adding reference voiceprint information for a new user; 2) adding, for an existing user, reference voiceprint information whose text information differs from what is already stored; and 3) replacing reference voiceprint information whose voice information is of lower quality with reference voiceprint information whose voice information is of higher quality.
It can be seen from the above technical solution that, in this embodiment of the present application, a newly obtained piece of reference voiceprint information is not stored into the voiceprint library directly; instead, it is first determined whether another piece of reference voiceprint information with the same text information and the same identity identifier is already stored. If so, the quality of the voice information in the two pieces of reference voiceprint information is compared, the piece with the higher-quality voice information is retained, and the piece with the lower-quality voice information is deleted. Therefore, this embodiment not only ensures that, among the stored reference voiceprint information, the text information in any two pieces corresponding to the same identity identifier (that is, the same first user) is different, but also ensures that the voice information corresponding to each piece of text information has the highest quality. When identity authentication is performed based on this embodiment, the voiceprint comparison is carried out on higher-quality voice information, which ensures the accuracy of the authentication and improves authentication efficiency.
FIG. 4 is a structural block diagram of a voiceprint information management system provided by an embodiment of the present application; the voiceprint information management system can be applied to an account management system. As shown in FIG. 4, the voiceprint information management system 100 comprises a voice filter 110, a text recognizer 120 and a voiceprint generator 130.
The voice filter 110 is configured to acquire a historical voice file generated by a call between a first user and a second user, and perform filtering processing on the historical voice file to obtain voice information of the first user.
The text recognizer 120 is configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
The voiceprint generator 130 is configured to edit the voice information and the corresponding text information into reference voiceprint information of the first user, and store the reference voiceprint information and an identity identifier of the first user.
It can be seen from the above structure that this embodiment of the present application obtains the voice information of the first user by filtering the historical voice file stored by the related system, obtains the text information corresponding to that voice information through text recognition processing, and edits the voice information and the corresponding text information into the reference voiceprint information of the first user. Because the text information and the voice information in the reference voiceprint information are both obtained from the historical voice file rather than preset by the related system, that is, they are not public, neither the first user, nor the second user, nor any other user can predict the specific content of the text information that must be read back when identity authentication is executed; a corresponding audio file therefore cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded audio file. Accordingly, compared with the existing voiceprint-recognition-based identity authentication method, identity authentication based on the voiceprint information management method provided by this embodiment yields a more accurate authentication result, has no such potential safety hazard, and provides higher account security.
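One way the three components of system 100 might be wired together is sketched below, mirroring the flow of FIG. 1 at the component level. The class interfaces and placeholder bodies are assumptions for illustration, not the application's actual implementation.

```python
class VoiceFilter:                     # cf. voice filter 110
    def filter(self, history_file: str) -> str:
        # Placeholder: return a file containing only the first user's voice information.
        return history_file.replace(".wav", "_user.wav")

class TextRecognizer:                  # cf. text recognizer 120
    def recognize(self, voice_file: str) -> str:
        # Placeholder for the text recognition processing step.
        return "why is there no refund"

class VoiceprintGenerator:             # cf. voiceprint generator 130
    def __init__(self) -> None:
        self.library: dict = {}        # user ID -> list of (text information, voice information)

    def generate(self, user_id: str, voice_file: str, text_info: str) -> None:
        self.library.setdefault(user_id, []).append((text_info, voice_file))

class VoiceprintManagementSystem:      # cf. system 100
    def __init__(self) -> None:
        self.voice_filter = VoiceFilter()
        self.text_recognizer = TextRecognizer()
        self.voiceprint_generator = VoiceprintGenerator()

    def process(self, history_file: str, user_id: str) -> None:
        voice = self.voice_filter.filter(history_file)
        text = self.text_recognizer.recognize(voice)
        self.voiceprint_generator.generate(user_id, voice, text)

VoiceprintManagementSystem().process("call_20150930.wav", "139XXXXXXXX")
```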
FIG. 5 is a structural block diagram of another voiceprint information management system provided by an embodiment of the present application; the voiceprint information management system can be applied to an account management system. As shown in FIG. 5, the voiceprint information management system 200 comprises a voice filter 210, a text recognizer 220, a text cutter 240, a voiceprint cutter 250 and a voiceprint generator 230.
The voice filter 210 is configured to acquire a historical voice file generated by a call between a first user and a second user, and perform filtering processing on the historical voice file to obtain voice information of the first user.
The text recognizer 220 is configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
The text cutter 240 is configured to divide the text information into a plurality of pieces of sub-text information and mark the start and end time of each piece of sub-text information.
The voiceprint cutter 250 is configured to intercept, from the voice information according to the start and end times of the sub-text information, the sub-voice information corresponding to each piece of sub-text information.
The voiceprint generator 230 is configured to edit each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user, and store each piece of reference voiceprint information and the identity identifier of the first user.
It can be seen from the above structure that this embodiment of the present application obtains the voice information of the first user by filtering the historical voice files stored in the system, obtains the corresponding text information by performing text recognition processing on that voice information, divides the recognized text information into a plurality of pieces of sub-text information, intercepts the corresponding sub-voice information from the voice information according to the start and end time of each piece of sub-text information, edits each pair of sub-text information and sub-voice information into one piece of reference voiceprint information, and stores it in the voiceprint library, so that each first user has multiple pieces of reference voiceprint information. When identity authentication needs to be performed, one piece is randomly selected from the multiple pieces of reference voiceprint information corresponding to the identity identifier to be authenticated. Because the reference voiceprint information acquired during identity authentication is random, the specific content of the text information that the user to be authenticated will be required to read back cannot be predicted, so a corresponding audio file cannot be recorded in advance and authentication cannot be passed by playing a pre-recorded audio file; performing identity authentication with the voiceprint library obtained in this embodiment can therefore ensure the accuracy of the authentication result and improve the security of the account. In addition, in this embodiment the sub-text information in each piece of reference voiceprint information is short, which reduces the time needed to read back the text information, reduces the time consumed by the voiceprint comparison, and improves authentication efficiency.
本申请实施例提供的声纹信息管理系统中,为实现存储所述基准声纹信息和所述第一用户的身份标识符的功能,上述声纹生成器130及声纹生成器230可以被配置为:In the voiceprint information management system provided by the embodiment of the present application, in order to implement the function of storing the reference voiceprint information and the identity identifier of the first user, the voiceprint generator 130 and the voiceprint generator 230 may be configured. for:
determine whether there exists second reference voiceprint information whose corresponding second text information is identical to the first text information in the first reference voiceprint information to be stored, and whose corresponding second identity identifier is also identical to the first identity identifier corresponding to the first reference voiceprint information;
if no such second reference voiceprint information exists, directly store the first reference voiceprint information and the first identity identifier;
if the second reference voiceprint information exists, compare the quality of the first voice information in the first reference voiceprint information with that of the second voice information in the second reference voiceprint information; if the quality of the first voice information is lower than that of the second voice information, directly delete the first reference voiceprint information;
if the quality of the first voice information is higher than that of the second voice information, delete the second reference voiceprint information, and store the first reference voiceprint information and the first identity identifier.
With a voiceprint generator configured as above, this embodiment not only ensures that, within the stored reference voiceprint information, the text information in any two pieces of reference voiceprint information corresponding to the same user is different, but also ensures that the voice information stored for each kind of text information is of the highest quality. When identity authentication is performed based on the embodiments of the present application, the voiceprint comparison is therefore carried out against higher-quality voice information, which ensures the accuracy of authentication and improves authentication efficiency.
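As a rough sketch of this storage rule, the snippet below keeps, for each (identity identifier, text) pair, only the candidate whose voice quality scores higher. The dictionary-based store and the `quality()` scoring callback are assumptions for illustration; the embodiments do not prescribe a particular quality measure or storage backend.

```python
from typing import Callable, Dict, Tuple

# voiceprint store: (identity_id, text) -> voice bytes
VoiceprintStore = Dict[Tuple[str, str], bytes]

def store_reference_voiceprint(store: VoiceprintStore,
                               identity_id: str,
                               text: str,
                               voice: bytes,
                               quality: Callable[[bytes], float]) -> None:
    """Store a (text, voice) pair, keeping only the higher-quality voice when
    the same user already has a reference voiceprint with the same text."""
    key = (identity_id, text)
    existing = store.get(key)
    if existing is None:
        store[key] = voice          # no duplicate text for this user: store directly
    elif quality(voice) > quality(existing):
        store[key] = voice          # replace the lower-quality recording
    # otherwise the new, lower-quality candidate is discarded
```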
FIG. 6 is a flowchart of an identity authentication method according to an embodiment of the present application; the identity authentication method may be applied to an account management system. Referring to FIG. 6, the identity authentication method includes the following steps.
S41. Acquire a historical voice file generated by a call between a first user and a second user.
Here, the first user may be a registered user who has a corresponding private account in the account management system; correspondingly, the second user may be service personnel of the account management system.
S42. Perform filtering processing on the historical voice file to obtain voice information of the first user.
S43. Perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
S44. Edit the text information and the corresponding voice information into reference voiceprint information of the first user, and store the reference voiceprint information and the identity identifier of the first user.
S45. Acquire reference voiceprint information corresponding to the identity identifier of a user to be authenticated.
S46. Output the text information in the acquired reference voiceprint information, and receive corresponding voice information to be authenticated.
S47. Match the voice information in the acquired reference voiceprint information against the voice information to be authenticated; if the matching succeeds, determine that the user to be authenticated is authenticated successfully; if the matching fails, determine that authentication of the user to be authenticated fails.
As can be seen from the above method, this embodiment of the present application obtains the voice information of the first user by filtering a historical voice file stored in the relevant system, obtains the text information corresponding to that voice information through text recognition, and edits the voice information and the corresponding text information into the reference voiceprint information of the first user. Because both the text information and the voice information in the reference voiceprint information are derived from the historical voice file rather than preset by the relevant system, they are non-public; neither the first user, nor the second user, nor any other user can predict the specific content of the text information that must be read aloud during identity authentication, so the corresponding sound file cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded sound file. Therefore, compared with existing voiceprint-based identity authentication approaches, performing identity authentication based on the voiceprint information management method provided by the embodiments of the present application yields a more accurate authentication result, avoids this security risk, and makes the account more secure.
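Steps S45 to S47 can be pictured with the following sketch, assuming the reference voiceprints built above are already stored per identity identifier. The `prompt_user` helper (which would display or play the text and capture the user's spoken reply), the `voiceprint_match` scoring function, and the acceptance threshold are hypothetical placeholders added for this illustration.

```python
import random
from typing import Callable, Dict, List, Tuple

# per-user reference voiceprints: identity_id -> list of (text, voice) pairs
ReferenceStore = Dict[str, List[Tuple[str, bytes]]]

def authenticate(identity_id: str,
                 store: ReferenceStore,
                 prompt_user: Callable[[str], bytes],
                 voiceprint_match: Callable[[bytes, bytes], float],
                 threshold: float = 0.8) -> bool:
    """S45: fetch a reference voiceprint for the claimed identity;
    S46: output its text and collect the spoken reply;
    S47: match the reply against the reference voice."""
    references = store.get(identity_id)
    if not references:
        return False
    text, reference_voice = random.choice(references)  # random pick defeats replay
    candidate_voice = prompt_user(text)
    return voiceprint_match(reference_voice, candidate_voice) >= threshold
```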
FIG. 7 is a flowchart of another identity authentication method according to an embodiment of the present application; the identity authentication method may be applied to an account management system. Referring to FIG. 7, the identity authentication method includes the following steps.
S51. Acquire a historical voice file generated by a call between a first user and a second user.
S52. Perform filtering processing on the historical voice file to obtain voice information of the first user.
S53. Perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
S54. Split the text information into multiple pieces of sub-text information, and mark the start and end times of each piece of sub-text information.
S55. Extract, from the voice information, the sub-voice information corresponding to each piece of sub-text information according to the start and end times of that sub-text information.
S56. Edit each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user, and store each piece of reference voiceprint information and the identity identifier of the first user.
S57. Acquire reference voiceprint information corresponding to the identity identifier of a user to be authenticated.
S58. Output the sub-text information in the acquired reference voiceprint information, and receive corresponding voice information to be authenticated.
S59. Match the sub-voice information in the acquired reference voiceprint information against the voice information to be authenticated; if the matching succeeds, determine that the user to be authenticated is authenticated successfully; if the matching fails, determine that authentication of the user to be authenticated fails.
As can be seen from the above method, this embodiment of the present application splits the recognized text information into multiple pieces of sub-text information, extracts the corresponding sub-voice information according to their start and end times, and edits each piece of sub-text information and the corresponding sub-voice information into one piece of reference voiceprint information, so that the first user has multiple pieces of reference voiceprint information. When identity authentication needs to be performed, one piece is randomly selected from the multiple pieces of reference voiceprint information corresponding to the identity identifier to be authenticated. Because the reference voiceprint information obtained during authentication is random, the specific content of the text information that the user to be authenticated will be asked to read aloud cannot be predicted, so the corresponding sound file cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded sound file. Therefore, the identity authentication method provided in this embodiment ensures the accuracy of the authentication result and improves account security. In addition, in this embodiment the sub-text information in each piece of reference voiceprint information is short, which reduces the time needed to read the text aloud, reduces the time consumed by voiceprint comparison, and improves authentication efficiency.
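One possible way to picture the text splitting and start/end marking of S54 is sketched below: word-level recognition results (a hypothetical `Word` structure with per-word timestamps, which many speech recognizers can emit) are grouped into short sub-texts at sentence-ending punctuation. The segmentation rule is an assumption made for this illustration; the embodiments do not prescribe one.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds

def split_into_subtexts(words: List[Word],
                        breakers: str = "。！？.!?") -> List[Tuple[str, float, float]]:
    """Group recognized words into short sub-texts and mark each one's
    start and end time, so the matching sub-voice can be cut out later."""
    subtexts, buffer = [], []
    for word in words:
        buffer.append(word)
        if word.text and word.text[-1] in breakers:
            subtexts.append(("".join(w.text for w in buffer),
                             buffer[0].start, buffer[-1].end))
            buffer = []
    if buffer:  # trailing words without a closing punctuation mark
        subtexts.append(("".join(w.text for w in buffer),
                         buffer[0].start, buffer[-1].end))
    return subtexts
```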
The identity authentication method provided by the embodiments of the present application may also store the reference voiceprint information using the method shown in FIG. 3, which not only ensures that, within the stored reference voiceprint information, the text information in any two pieces of reference voiceprint information corresponding to the same user is different, but also ensures that the voice information stored for each kind of text information is of the highest quality. When identity authentication is performed based on the embodiments of the present application, the voiceprint comparison is carried out against higher-quality voice information, which ensures the accuracy of authentication and improves authentication efficiency.
FIG. 8 is a structural block diagram of an identity authentication system according to an embodiment of the present application; the identity authentication system may be applied to an account management system. Referring to FIG. 8, the identity authentication system 300 includes: a voice filter 310, a text recognizer 320, a voiceprint generator 330, a voiceprint extractor 360, a recognition front end 370, and a voiceprint matcher 380.
The voice filter 310 is configured to acquire a historical voice file generated by a call between the first user and the second user, and to perform filtering processing on the historical voice file to obtain voice information of the first user.
The text recognizer 320 is configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
The voiceprint generator 330 is configured to edit the voice information and the corresponding text information into reference voiceprint information of the first user, and to store the reference voiceprint information and the identity identifier of the first user.
The voiceprint extractor 360 is configured to acquire reference voiceprint information corresponding to the identity identifier of a user to be authenticated.
The recognition front end 370 is configured to output the text information in the acquired reference voiceprint information, and to receive corresponding voice information to be authenticated.
The voiceprint matcher 380 is configured to match the voice information in the acquired reference voiceprint information against the voice information to be authenticated; if the matching succeeds, it determines that the user to be authenticated is authenticated successfully; if the matching fails, it determines that authentication of the user to be authenticated fails.
In the above structure, the recognition front end 370 implements the interaction between the identity authentication system and the user to be authenticated. Besides outputting the text information in the reference voiceprint information acquired by the voiceprint extractor 360 and receiving the voice information to be authenticated entered by the user, it can also receive the identity authentication request of the user to be authenticated, trigger the voiceprint extractor 360 after receiving that request, and output to the user the authentication result obtained by the voiceprint matcher 380.
As can be seen from the above structure, this embodiment of the present application obtains the voice information of the first user by filtering a historical voice file stored in the relevant system, obtains the text information corresponding to that voice information through text recognition, and edits the voice information and the corresponding text information into the reference voiceprint information of the first user. Because both the text information and the voice information in the reference voiceprint information are derived from the historical voice file rather than preset by the relevant system, they are non-public; neither the first user, nor the second user, nor any other user can predict the specific content of the text information that must be read aloud during identity authentication, so the corresponding sound file cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded sound file. Therefore, compared with existing voiceprint-based identity authentication approaches, performing identity authentication based on the voiceprint information management method provided by the embodiments of the present application yields a more accurate authentication result, avoids this security risk, and makes the account more secure.
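The description does not pin down how the voice filter separates the first user's speech from the call. As a minimal sketch, assuming the historical voice file is a two-channel recording with the first user (the customer) on a known channel, the filter could simply keep that channel:

```python
import wave

def extract_first_user_channel(history_file: str, user_channel: int = 0) -> bytes:
    """Minimal filtering sketch: keep only the channel assumed to carry the
    first user's voice from a stereo call recording (PCM WAV assumed)."""
    with wave.open(history_file, "rb") as wav:
        n_channels = wav.getnchannels()
        sample_width = wav.getsampwidth()   # bytes per sample
        frames = wav.readframes(wav.getnframes())
    if n_channels == 1:
        return frames                       # already single-channel
    out = bytearray()
    frame_size = n_channels * sample_width
    offset = user_channel * sample_width
    for i in range(0, len(frames), frame_size):
        out += frames[i + offset:i + offset + sample_width]
    return bytes(out)
```

In a single-channel recording, separating the two speakers would instead require a diarization step, which is outside the scope of this sketch.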
FIG. 9 is a structural block diagram of another identity authentication system according to an embodiment of the present application; the identity authentication system may be applied to an account management system. Referring to FIG. 9, the identity authentication system 400 includes: a voice filter 410, a text recognizer 420, a text cutter 440, a voiceprint cutter 450, a voiceprint generator 430, a voiceprint extractor 460, a recognition front end 470, and a voiceprint matcher 480.
The voice filter 410 is configured to acquire a historical voice file generated by a call between the first user and the second user, and to perform filtering processing on the historical voice file to obtain voice information of the first user.
The text recognizer 420 is configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
The text cutter 440 is configured to split the text information into multiple pieces of sub-text information and to mark the start and end times of each piece of sub-text information.
The voiceprint cutter 450 is configured to extract, from the voice information, the sub-voice information corresponding to each piece of sub-text information according to the start and end times of that sub-text information.
The voiceprint generator 430 is configured to edit each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user, and to store each piece of reference voiceprint information and the identity identifier of the first user.
The voiceprint extractor 460 is configured to acquire reference voiceprint information corresponding to the identity identifier of a user to be authenticated.
The recognition front end 470 is configured to output the sub-text information in the acquired reference voiceprint information, and to receive corresponding voice information to be authenticated.
The voiceprint matcher 480 is configured to match the sub-voice information in the acquired reference voiceprint information against the voice information to be authenticated; if the matching succeeds, it determines that the user to be authenticated is authenticated successfully; if the matching fails, it determines that authentication of the user to be authenticated fails.
As can be seen from the above structure, this embodiment of the present application splits the recognized text information into multiple pieces of sub-text information, extracts the corresponding sub-voice information according to their start and end times, and edits each piece of sub-text information and the corresponding sub-voice information into one piece of reference voiceprint information, so that the first user has multiple pieces of reference voiceprint information. When identity authentication needs to be performed, the multiple pieces of reference voiceprint information corresponding to the identity identifier of the user to be authenticated are determined, and one of them is randomly selected for this authentication. Because the reference voiceprint information obtained during authentication is random, the specific content of the text information that the user to be authenticated will be asked to read aloud cannot be predicted, so the corresponding sound file cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded sound file. Therefore, the identity authentication system provided in this embodiment ensures the accuracy of the authentication result and improves account security. In addition, in this embodiment the sub-text information in each piece of reference voiceprint information is short, which reduces the time needed to read the text aloud, reduces the time consumed by voiceprint comparison, and improves authentication efficiency.
In the identity authentication system provided by the embodiments of the present application, in order to implement the function of storing the reference voiceprint information and the corresponding user identity identifier, the voiceprint generator 330 and the voiceprint generator 430 may be configured to:
determine whether there exists second reference voiceprint information whose corresponding second text information is identical to the first text information in the first reference voiceprint information to be stored, and whose corresponding second identity identifier is also identical to the first identity identifier corresponding to the first reference voiceprint information;
if no such second reference voiceprint information exists, directly store the first reference voiceprint information and the identity identifier of the first user;
if the second reference voiceprint information exists, compare the quality of the first voice information in the first reference voiceprint information with that of the second voice information in the second reference voiceprint information; if the quality of the first voice information is lower than that of the second voice information, delete the first reference voiceprint information;
if the quality of the first voice information is higher than that of the second voice information, delete the second reference voiceprint information, and store the first reference voiceprint information and the corresponding user identity identifier.
With a voiceprint generator configured as above, this embodiment not only ensures that, within the stored reference voiceprint information, the text information in any two pieces of reference voiceprint information corresponding to the same identity identifier is different, but also ensures that the voice information stored for each kind of text information is of the highest quality. When identity authentication is performed based on the embodiments of the present application, the voiceprint comparison is carried out against higher-quality voice information, which ensures the accuracy of authentication and improves authentication efficiency.
Other embodiments of the invention will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. The present application is intended to cover any variations, uses, or adaptations of the invention that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the invention being indicated by the following claims.
It should be understood that the invention is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the invention is limited only by the appended claims.

Claims (16)

  1. A voiceprint information management method, comprising:
    acquiring a historical voice file generated by a call between a first user and a second user;
    performing filtering processing on the historical voice file to obtain voice information of the first user;
    performing text recognition processing on the voice information to obtain text information corresponding to the voice information;
    editing the voice information and the corresponding text information into reference voiceprint information of the first user, and storing the reference voiceprint information and an identity identifier of the first user.
  2. The voiceprint information management method according to claim 1, further comprising:
    splitting the text information into multiple pieces of sub-text information, and marking a start and end time of each piece of sub-text information;
    extracting, from the voice information, sub-voice information corresponding to each piece of sub-text information according to the start and end time of the sub-text information.
  3. The voiceprint information management method according to claim 2, wherein editing the voice information and the corresponding text information into the reference voiceprint information of the first user comprises:
    editing each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
  4. The voiceprint information management method according to claim 1, wherein storing the reference voiceprint information and the identity identifier of the first user comprises:
    determining whether there exists second reference voiceprint information whose corresponding second text information is identical to first text information in first reference voiceprint information to be stored, and whose corresponding second identity identifier is also identical to a first identity identifier corresponding to the first reference voiceprint information;
    if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
    if the second reference voiceprint information exists, comparing quality of first voice information in the first reference voiceprint information with quality of second voice information in the second reference voiceprint information; if the quality of the first voice information is lower than that of the second voice information, deleting the first reference voiceprint information;
    if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
  5. A voiceprint information management system, comprising:
    a voice filter, configured to acquire a historical voice file generated by a call between a first user and a second user, and to perform filtering processing on the historical voice file to obtain voice information of the first user;
    a text recognizer, configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information;
    a voiceprint generator, configured to edit the voice information and the corresponding text information into reference voiceprint information of the first user, and to store the reference voiceprint information and an identity identifier of the first user.
  6. The voiceprint information management system according to claim 5, further comprising:
    a text cutter, configured to split the text information into multiple pieces of sub-text information and to mark a start and end time of each piece of sub-text information;
    a voiceprint cutter, configured to extract, from the voice information, sub-voice information corresponding to each piece of sub-text information according to the start and end time of the sub-text information.
  7. The voiceprint information management system according to claim 6, wherein the voiceprint generator editing the voice information and the corresponding text information into the reference voiceprint information of the first user comprises:
    editing each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
  8. The voiceprint information management system according to claim 5, wherein the voiceprint generator storing the reference voiceprint information and the identity identifier of the first user comprises:
    determining whether there exists second reference voiceprint information whose corresponding second text information is identical to first text information in first reference voiceprint information to be stored, and whose corresponding second identity identifier is also identical to a first identity identifier corresponding to the first reference voiceprint information;
    if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
    if the second reference voiceprint information exists, comparing quality of first voice information in the first reference voiceprint information with quality of second voice information in the second reference voiceprint information; if the quality of the first voice information is lower than that of the second voice information, deleting the first reference voiceprint information;
    if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
  9. An identity authentication method, comprising:
    acquiring a historical voice file generated by a call between a first user and a second user;
    performing filtering processing on the historical voice file to obtain voice information of the first user;
    performing text recognition processing on the voice information to obtain text information corresponding to the voice information;
    editing the voice information and the corresponding text information into reference voiceprint information of the first user, and storing the reference voiceprint information and an identity identifier of the first user;
    acquiring reference voiceprint information corresponding to an identity identifier of a user to be authenticated;
    outputting the text information in the acquired reference voiceprint information, and receiving corresponding voice information to be authenticated;
    matching the voice information in the acquired reference voiceprint information against the voice information to be authenticated; if the matching succeeds, determining that the user to be authenticated is authenticated successfully; if the matching fails, determining that authentication of the user to be authenticated fails.
  10. The identity authentication method according to claim 9, further comprising:
    splitting the text information into multiple pieces of sub-text information, and marking a start and end time of each piece of sub-text information;
    extracting, from the voice information, sub-voice information corresponding to each piece of sub-text information according to the start and end time of the sub-text information.
  11. The identity authentication method according to claim 10, wherein editing the voice information and the corresponding text information into the reference voiceprint information of the first user comprises:
    editing each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
  12. The identity authentication method according to claim 9, wherein storing the reference voiceprint information and the identity identifier of the first user comprises:
    determining whether there exists second reference voiceprint information whose corresponding second text information is identical to first text information in first reference voiceprint information to be stored, and whose corresponding second identity identifier is also identical to a first identity identifier corresponding to the first reference voiceprint information;
    if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
    if the second reference voiceprint information exists, comparing quality of first voice information in the first reference voiceprint information with quality of second voice information in the second reference voiceprint information; if the quality of the first voice information is lower than that of the second voice information, deleting the first reference voiceprint information;
    if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
  13. An identity authentication system, comprising:
    a voice filter, configured to acquire a historical voice file generated by a call between a first user and a second user, and to perform filtering processing on the historical voice file to obtain voice information of the first user;
    a text recognizer, configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information;
    a voiceprint generator, configured to edit the voice information and the corresponding text information into reference voiceprint information of the first user, and to store the reference voiceprint information and an identity identifier of the first user;
    a voiceprint extractor, configured to acquire reference voiceprint information corresponding to an identity identifier of a user to be authenticated;
    a recognition front end, configured to output the text information in the acquired reference voiceprint information, and to receive corresponding voice information to be authenticated;
    a voiceprint matcher, configured to match the voice information in the acquired reference voiceprint information against the voice information to be authenticated; if the matching succeeds, to determine that the user to be authenticated is authenticated successfully; and if the matching fails, to determine that authentication of the user to be authenticated fails.
  14. The identity authentication system according to claim 13, further comprising:
    a text cutter, configured to split the text information into multiple pieces of sub-text information and to mark a start and end time of each piece of sub-text information;
    a voiceprint cutter, configured to extract, from the voice information, sub-voice information corresponding to each piece of sub-text information according to the start and end time of the sub-text information.
  15. The identity authentication system according to claim 14, wherein the voiceprint generator editing the voice information and the corresponding text information into the reference voiceprint information of the first user comprises:
    editing each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
  16. The identity authentication system according to claim 13, wherein the voiceprint generator storing the reference voiceprint information and the identity identifier of the first user comprises:
    determining whether there exists second reference voiceprint information whose corresponding second text information is identical to first text information in first reference voiceprint information to be stored, and whose corresponding second identity identifier is also identical to a first identity identifier corresponding to the first reference voiceprint information;
    if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
    if the second reference voiceprint information exists, comparing quality of first voice information in the first reference voiceprint information with quality of second voice information in the second reference voiceprint information; if the quality of the first voice information is lower than that of the second voice information, deleting the first reference voiceprint information;
    if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
PCT/CN2015/091260 2014-10-10 2015-09-30 Voiceprint information management method and device as well as identity authentication method and system WO2016054991A1 (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
JP2017518071A JP6671356B2 (en) 2014-10-10 2015-09-30 Voiceprint information management method and voiceprint information management apparatus, and personal authentication method and personal authentication system
EP15848463.4A EP3206205B1 (en) 2014-10-10 2015-09-30 Voiceprint information management method and device as well as identity authentication method and system
KR1020177012683A KR20170069258A (en) 2014-10-10 2015-09-30 Voiceprint information management method and device as well as identity authentication method and system
SG11201702919UA SG11201702919UA (en) 2014-10-10 2015-09-30 Voiceprint information management method and apparatus, and identity authentication method and system
US15/484,082 US10593334B2 (en) 2014-10-10 2017-04-10 Method and apparatus for generating voiceprint information comprised of reference pieces each used for authentication

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410532530.0A CN105575391B (en) 2014-10-10 2014-10-10 Voiceprint information management method and device and identity authentication method and system
CN201410532530.0 2014-10-10

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/484,082 Continuation US10593334B2 (en) 2014-10-10 2017-04-10 Method and apparatus for generating voiceprint information comprised of reference pieces each used for authentication

Publications (1)

Publication Number Publication Date
WO2016054991A1 true WO2016054991A1 (en) 2016-04-14

Family

ID=55652587

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/091260 WO2016054991A1 (en) 2014-10-10 2015-09-30 Voiceprint information management method and device as well as identity authentication method and system

Country Status (8)

Country Link
US (1) US10593334B2 (en)
EP (1) EP3206205B1 (en)
JP (1) JP6671356B2 (en)
KR (1) KR20170069258A (en)
CN (1) CN105575391B (en)
HK (1) HK1224074A1 (en)
SG (2) SG11201702919UA (en)
WO (1) WO2016054991A1 (en)

Families Citing this family (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106156583A (en) * 2016-06-03 2016-11-23 深圳市金立通信设备有限公司 A kind of method of speech unlocking and terminal
CN106549947A (en) * 2016-10-19 2017-03-29 陆腾蛟 A kind of voiceprint authentication method and system of immediate updating
CN106782564B (en) * 2016-11-18 2018-09-11 百度在线网络技术(北京)有限公司 Method and apparatus for handling voice data
US10592649B2 (en) 2017-08-09 2020-03-17 Nice Ltd. Authentication via a dynamic passphrase
CN107564531A (en) * 2017-08-25 2018-01-09 百度在线网络技术(北京)有限公司 Minutes method, apparatus and computer equipment based on vocal print feature
US10490195B1 (en) * 2017-09-26 2019-11-26 Amazon Technologies, Inc. Using system command utterances to generate a speaker profile
CN107863108B (en) * 2017-11-16 2021-03-23 百度在线网络技术(北京)有限公司 Information output method and device
CN108121210A (en) * 2017-11-20 2018-06-05 珠海格力电器股份有限公司 Permission distribution method and device for household appliance, storage medium and processor
CN108257604B (en) * 2017-12-08 2021-01-08 平安普惠企业管理有限公司 Speech recognition method, terminal device and computer-readable storage medium
CN107871236B (en) * 2017-12-26 2021-05-07 广州势必可赢网络科技有限公司 Electronic equipment voiceprint payment method and device
KR102483834B1 (en) * 2018-01-17 2023-01-03 삼성전자주식회사 Method for authenticating user based on voice command and electronic dvice thereof
CN111177329A (en) * 2018-11-13 2020-05-19 奇酷互联网络科技(深圳)有限公司 User interaction method of intelligent terminal, intelligent terminal and storage medium
CN111292733A (en) * 2018-12-06 2020-06-16 阿里巴巴集团控股有限公司 Voice interaction method and device
CN110660398B (en) * 2019-09-19 2020-11-20 北京三快在线科技有限公司 Voiceprint feature updating method and device, computer equipment and storage medium
CN112580390B (en) * 2019-09-27 2023-10-17 百度在线网络技术(北京)有限公司 Security monitoring method and device based on intelligent sound box, sound box and medium
CN110970036B (en) * 2019-12-24 2022-07-12 网易(杭州)网络有限公司 Voiceprint recognition method and device, computer storage medium and electronic equipment
US11516197B2 (en) * 2020-04-30 2022-11-29 Capital One Services, Llc Techniques to provide sensitive information over a voice connection
CN111785280B (en) * 2020-06-10 2024-09-10 北京三快在线科技有限公司 Identity authentication method and device, storage medium and electronic equipment
US11817113B2 (en) 2020-09-09 2023-11-14 Rovi Guides, Inc. Systems and methods for filtering unwanted sounds from a conference call
US11450334B2 (en) * 2020-09-09 2022-09-20 Rovi Guides, Inc. Systems and methods for filtering unwanted sounds from a conference call using voice synthesis
US12008091B2 (en) * 2020-09-11 2024-06-11 Cisco Technology, Inc. Single input voice authentication
US11522994B2 (en) 2020-11-23 2022-12-06 Bank Of America Corporation Voice analysis platform for voiceprint tracking and anomaly detection
CN112565242B (en) * 2020-12-02 2023-04-07 携程计算机技术(上海)有限公司 Remote authorization method, system, equipment and storage medium based on voiceprint recognition
US12020711B2 (en) * 2021-02-03 2024-06-25 Nice Ltd. System and method for detecting fraudsters
US20240054235A1 (en) * 2022-08-15 2024-02-15 Bank Of America Corporation Systems and methods for encrypting dialogue based data in a data storage system
CN115426632A (en) * 2022-08-30 2022-12-02 上汽通用五菱汽车股份有限公司 Voice transmission method, device, vehicle-mounted host and storage medium
CN115565539B (en) * 2022-11-21 2023-02-07 中网道科技集团股份有限公司 Data processing method for realizing self-help correction terminal anti-counterfeiting identity verification
CN117059092B (en) * 2023-10-11 2024-06-04 深圳普一同创科技有限公司 Intelligent medical interactive intelligent diagnosis method and system based on blockchain

Family Cites Families (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11344992A (en) * 1998-06-01 1999-12-14 Ntt Data Corp Voice dictionary creating method, personal authentication device and record medium
US20040236699A1 (en) 2001-07-10 2004-11-25 American Express Travel Related Services Company, Inc. Method and system for hand geometry recognition biometrics on a fob
IL154733A0 (en) 2003-03-04 2003-10-31 Financial transaction authorization apparatus and method
WO2005013263A1 (en) 2003-07-31 2005-02-10 Fujitsu Limited Voice authentication system
US7386448B1 (en) 2004-06-24 2008-06-10 T-Netix, Inc. Biometric voice authentication
US8014496B2 (en) 2004-07-28 2011-09-06 Verizon Business Global Llc Systems and methods for providing network-based voice authentication
US7536304B2 (en) 2005-05-27 2009-05-19 Porticus, Inc. Method and system for bio-metric voice print authentication
US20060277043A1 (en) 2005-06-06 2006-12-07 Edward Tomes Voice authentication system and methods therefor
CN101228770B (en) * 2005-07-27 2011-12-14 国际商业机器公司 Systems and method for secure delivery of files to authorized recipients
JP4466572B2 (en) * 2006-01-16 2010-05-26 コニカミノルタビジネステクノロジーズ株式会社 Image forming apparatus, voice command execution program, and voice command execution method
CN1808567A (en) 2006-01-26 2006-07-26 覃文华 Voice-print authentication device and method of authenticating people presence
US8396711B2 (en) * 2006-05-01 2013-03-12 Microsoft Corporation Voice authentication system and method
US20080256613A1 (en) 2007-03-13 2008-10-16 Grover Noel J Voice print identification portal
WO2010025523A1 (en) 2008-09-05 2010-03-11 Auraya Pty Ltd Voice authentication system and methods
US8537978B2 (en) * 2008-10-06 2013-09-17 International Business Machines Corporation Method and system for using conversational biometrics and speaker identification/verification to filter voice streams
US8655660B2 (en) * 2008-12-11 2014-02-18 International Business Machines Corporation Method for dynamic learning of individual voice patterns
CN102404287A (en) 2010-09-14 2012-04-04 盛乐信息技术(上海)有限公司 Voiceprint identification system and method for determining voiceprint authentication threshold value through data multiplexing method
US9318114B2 (en) * 2010-11-24 2016-04-19 At&T Intellectual Property I, L.P. System and method for generating challenge utterances for speaker verification
CN102222502A (en) * 2011-05-16 2011-10-19 上海先先信息科技有限公司 Effective way for voice verification by Chinese text-prompted mode
KR101304112B1 (en) * 2011-12-27 2013-09-05 현대캐피탈 주식회사 Real time speaker recognition system and method using voice separation
US10134401B2 (en) * 2012-11-21 2018-11-20 Verint Systems Ltd. Diarization using linguistic labeling
JP5646675B2 (en) * 2013-03-19 2014-12-24 ヤフー株式会社 Information processing apparatus and method
US20140359736A1 (en) 2013-05-31 2014-12-04 Deviceauthority, Inc. Dynamic voiceprint authentication
CN103679452A (en) * 2013-06-20 2014-03-26 腾讯科技(深圳)有限公司 Payment authentication method, device thereof and system thereof
GB2517952B (en) * 2013-09-05 2017-05-31 Barclays Bank Plc Biometric verification using predicted signatures
US8812320B1 (en) * 2014-04-01 2014-08-19 Google Inc. Segment-based speaker verification using dynamically generated phrases
CN105575391B (en) 2014-10-10 2020-04-03 阿里巴巴集团控股有限公司 Voiceprint information management method and device and identity authentication method and system

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7158776B1 (en) * 2001-09-18 2007-01-02 Cisco Technology, Inc. Techniques for voice-based user authentication for mobile access to network services
CN1547191A (en) * 2003-12-12 2004-11-17 北京大学 Semantic and sound groove information combined speaking person identity system
CN1852354A (en) * 2005-10-17 2006-10-25 华为技术有限公司 Method and device for collecting user behavior characteristics
CN102708867A (en) * 2012-05-30 2012-10-03 北京正鹰科技有限责任公司 Method and system for identifying faked identity by preventing faked recordings based on voiceprint and voice
CN102760434A (en) * 2012-07-09 2012-10-31 华为终端有限公司 Method for updating voiceprint feature model and terminal
CN103258535A (en) * 2013-05-30 2013-08-21 中国人民财产保险股份有限公司 Identity recognition method and system based on voiceprint recognition

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10593334B2 (en) 2014-10-10 2020-03-17 Alibaba Group Holding Limited Method and apparatus for generating voiceprint information comprised of reference pieces each used for authentication
EP3611895A4 (en) * 2017-04-10 2020-04-08 Beijing Orion Star Technology Co., Ltd. Method and device for user registration, and electronic device
US11568876B2 (en) 2017-04-10 2023-01-31 Beijing Orion Star Technology Co., Ltd. Method and device for user registration, and electronic device
CN111862933A (en) * 2020-07-20 2020-10-30 北京字节跳动网络技术有限公司 Method, apparatus, device and medium for generating synthesized speech

Also Published As

Publication number Publication date
US20170221488A1 (en) 2017-08-03
JP2017534905A (en) 2017-11-24
EP3206205A1 (en) 2017-08-16
HK1224074A1 (en) 2017-08-11
KR20170069258A (en) 2017-06-20
SG10201903085YA (en) 2019-05-30
JP6671356B2 (en) 2020-03-25
SG11201702919UA (en) 2017-05-30
EP3206205A4 (en) 2017-11-01
US10593334B2 (en) 2020-03-17
CN105575391A (en) 2016-05-11
CN105575391B (en) 2020-04-03
EP3206205B1 (en) 2020-01-15

Similar Documents

Publication Publication Date Title
WO2016054991A1 (en) Voiceprint information management method and device as well as identity authentication method and system
US10685657B2 (en) Biometrics platform
US10135818B2 (en) User biological feature authentication method and system
US10276168B2 (en) Voiceprint verification method and device
US8812319B2 (en) Dynamic pass phrase security system (DPSS)
CN102985965B (en) Voice print identification
CN105069874B (en) A kind of mobile Internet sound-groove gate inhibition system and its implementation
US20160014120A1 (en) Method, server, client and system for verifying verification codes
WO2019127897A1 (en) Updating method and device for self-learning voiceprint recognition
CN110533288A (en) Business handling process detection method, device, computer equipment and storage medium
US20120284026A1 (en) Speaker verification system
CN109036436A (en) A kind of voice print database method for building up, method for recognizing sound-groove, apparatus and system
US11076043B2 (en) Systems and methods of voiceprint generation and use in enforcing compliance policies
CN106982344A (en) video information processing method and device
WO2016107415A1 (en) Auxiliary identity authentication method based on user network behavior feature
KR101181060B1 (en) Voice recognition system and method for speaker recognition using thereof
US20120330663A1 (en) Identity authentication system and method
US20140163986A1 (en) Voice-based captcha method and apparatus
US11705134B2 (en) Graph-based approach for voice authentication
KR102291113B1 (en) Apparatus and method for producing conference record
KR20220166465A (en) Meeting minutes creating system and method using multi-channel receiver
Yakovlev et al. LRPD: Large Replay Parallel Dataset
Portêlo et al. Privacy-preserving query-by-example speech search
CN114125368B (en) Conference audio participant association method and device and electronic equipment
RU2628118C2 (en) Method for forming and usage of the inverted index of audio recording and machinescent of information storage device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15848463

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2017518071

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 11201702919U

Country of ref document: SG

ENP Entry into the national phase

Ref document number: 20177012683

Country of ref document: KR

Kind code of ref document: A

REEP Request for entry into the european phase

Ref document number: 2015848463

Country of ref document: EP