WO2016054991A1 - Voiceprint information management method and device as well as identity authentication method and system - Google Patents

Voiceprint information management method and device as well as identity authentication method and system

Info

Publication number
WO2016054991A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
voice
user
text
reference voiceprint
Prior art date
Application number
PCT/CN2015/091260
Other languages
French (fr)
Chinese (zh)
Inventor
熊剑
Original Assignee
阿里巴巴集团控股有限公司
熊剑
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司, 熊剑
Priority to JP2017518071A (published as JP6671356B2)
Priority to EP15848463.4A (published as EP3206205B1)
Priority to KR1020177012683A (published as KR20170069258A)
Priority to SG11201702919UA
Publication of WO2016054991A1
Priority to US15/484,082 (published as US10593334B2)

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification techniques
    • G10L 17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification techniques
    • G10L 17/22 Interactive procedures; Man-machine interfaces
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 21/00 Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F 21/30 Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F 21/31 User authentication
    • G06F 21/32 User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 15/00 Speech recognition
    • G10L 15/26 Speech to text systems
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification techniques
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification techniques
    • G10L 17/04 Training, enrolment or model building
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification techniques
    • G10L 17/06 Decision making techniques; Pattern matching strategies
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 21/00 Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L 21/003 Changing voice quality, e.g. pitch or formants
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L 17/00 Speaker identification or verification techniques
    • G10L 17/22 Interactive procedures; Man-machine interfaces
    • G10L 17/24 Interactive procedures; Man-machine interfaces the user being prompted to utter a password or a predefined phrase

Definitions

  • the present application relates to the field of voiceprint recognition technology, and in particular, to a voiceprint information management method and apparatus, and an identity authentication method and system.
  • Voiceprint refers to the spectrum of sound waves carrying speech information, as displayed by electroacoustic instruments. When different people say the same words, the sound waves they produce are different, and the corresponding sound wave spectra, that is, the voiceprint information, are also different. Therefore, by comparing voiceprint information it can be determined whether the corresponding speakers are the same person, that is, identity authentication based on voiceprint recognition can be implemented; this identity authentication method can be widely applied to various account management systems to ensure account security.
  • In the related art, before identity authentication is implemented using voiceprint recognition technology, the user first needs to read out preset text information; the user's voice signal is collected at that time and analyzed to obtain the corresponding voiceprint information, which is stored in a voiceprint library as the user's reference voiceprint information. When identity authentication is performed, the person being authenticated is likewise required to read out the preset text information; the person's voice signal is collected and analyzed to obtain the corresponding voiceprint information, and by comparing that voiceprint information with the reference voiceprint information in the voiceprint library, it can be determined whether the person being authenticated is the user himself or herself.
  • However, the text information used for identity authentication is disclosed when the voiceprint library is established, so the text information that the person being authenticated is required to read during identity authentication is also known in advance. Since the text information is known, anyone can record the corresponding sound file in advance and play the pre-recorded sound file to make the authentication succeed. It can be seen that the existing identity authentication method based on voiceprint recognition has serious security risks.
  • the present application provides a voiceprint information management method and apparatus, and an identity authentication method and system.
  • A first aspect of the present application provides a voiceprint information management method, the method comprising the following steps:
  • acquiring a historical voice file generated by a call between a first user and a second user, and filtering the historical voice file to obtain voice information of the first user;
  • performing text recognition processing on the voice information to obtain text information corresponding to the voice information;
  • editing the voice information and the corresponding text information into reference voiceprint information of the first user, and storing the reference voiceprint information and an identity identifier of the first user.
  • The voiceprint information management method further includes: dividing the text information into a plurality of sub-text information and marking the start and end time of each sub-text information; and intercepting, from the voice information according to the start and end time of each sub-text information, the sub-voice information corresponding to that sub-text information.
  • the editing the voice information and the corresponding text information into the reference voiceprint information of the first user includes:
  • Each pair of sub-speech information and sub-text information is separately edited as a reference voiceprint information of the first user.
  • the storing the reference voiceprint information and the identity identifier of the first user includes:
  • searching for second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored and whose corresponding second identity identifier is the same as the first identity identifier corresponding to the first reference voiceprint information;
  • if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
  • if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with the quality of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
  • if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
  • a second aspect of the present application provides a voiceprint information management apparatus, the apparatus comprising:
  • a voice filter configured to acquire a historical voice file generated by the first user and the second user, and perform filtering processing on the historical voice file to obtain voice information of the first user
  • a text identifier configured to perform text recognition processing on the voice information, to obtain text information corresponding to the voice information
  • a voiceprint generator for editing the voice information and corresponding text information into reference voiceprint information of the first user, and storing the reference voiceprint information and an identity identifier of the first user.
  • the voiceprint information management apparatus further includes:
  • a text cutter for dividing the text information into a plurality of sub-text information, and marking a start and end time of each sub-text information
  • the voiceprint cutter is configured to separately intercept the sub-voice information corresponding to each sub-text information from the voice information according to the start and end time of the sub-text information.
  • The voiceprint generator edits the voice information and the corresponding text information into the reference voiceprint information of the first user, including:
  • Each pair of sub-speech information and sub-text information is separately edited as a reference voiceprint information of the first user.
  • the voiceprint generator stores the reference voiceprint information and the identity identifier of the first user, including:
  • searching for second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored and whose corresponding second identity identifier is the same as the first identity identifier corresponding to the first reference voiceprint information;
  • if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
  • if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with the quality of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
  • if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
  • A third aspect of the present application provides an identity authentication method, the method comprising the following steps:
  • acquiring reference voiceprint information corresponding to an identity identifier of a user to be authenticated;
  • outputting the text information in the acquired reference voiceprint information, and receiving the corresponding voice information to be authenticated;
  • matching the voice information in the acquired reference voiceprint information with the voice information to be authenticated; if the matching succeeds, it is determined that the authentication of the user to be authenticated is successful, and if the matching fails, it is determined that the authentication of the user to be authenticated fails.
  • the identity authentication method further includes:
  • the sub-voice information corresponding to each sub-text information is respectively intercepted from the voice information according to the start and end time of the sub-text information.
  • the editing the voice information and the corresponding text information into the reference voiceprint information of the first user includes:
  • Each pair of sub-speech information and sub-text information is separately edited as a reference voiceprint information of the first user.
  • the storing the reference voiceprint information and the identity identifier of the first user includes:
  • searching for second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored and whose corresponding second identity identifier is the same as the first identity identifier corresponding to the first reference voiceprint information;
  • if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
  • if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with the quality of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
  • if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
  • a fourth aspect of the present application provides an identity authentication system; the system includes:
  • a voice filter configured to acquire a historical voice file generated by the first user and the second user, and perform filtering processing on the historical voice file to obtain voice information of the first user
  • a text identifier configured to perform text recognition processing on the voice information, to obtain text information corresponding to the voice information
  • a voiceprint generator configured to edit the voice information and corresponding text information into reference voiceprint information of the first user, and store the reference voiceprint information and an identity identifier of the first user;
  • a voiceprint extractor configured to acquire reference voiceprint information corresponding to an identity identifier of a user to be authenticated
  • a recognition front end configured to output the text information in the acquired reference voiceprint information, and receive the corresponding voice information to be authenticated;
  • a voiceprint matching device configured to match the voice information in the acquired reference voiceprint information with the voice information to be authenticated; if the matching is successful, it is determined that the authentication of the user to be authenticated is successful, and if the matching fails, it is determined that the authentication of the user to be authenticated fails.
  • the identity authentication system further includes:
  • a text cutter for dividing the text information into a plurality of sub-text information, and marking a start and end time of each sub-text information
  • the voiceprint cutter is configured to separately intercept the sub-voice information corresponding to each sub-text information from the voice information according to the start and end time of the sub-text information.
  • The voiceprint generator edits the voice information and the corresponding text information into the reference voiceprint information of the first user, including:
  • Each pair of sub-speech information and sub-text information is separately edited as a reference voiceprint information of the first user.
  • the voiceprint generator stores the reference voiceprint information and the identity identifier of the first user, including:
  • searching for second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored and whose corresponding second identity identifier is the same as the first identity identifier corresponding to the first reference voiceprint information;
  • if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
  • if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with the quality of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
  • if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
  • In the present application, the voice information of the first user is obtained by filtering a historical voice file stored in the related system, the text information corresponding to the voice information is obtained through text recognition processing, and the voice information and the corresponding text information are edited into the reference voiceprint information of the first user. Since the text information and the voice information in the reference voiceprint information are obtained from the historical voice file rather than preset by the related system, they are non-public: neither the first user, nor the second user, nor any other user can predict the specific content of the text information that needs to be read out when identity authentication is performed, so the corresponding sound file cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded sound file.
  • Therefore, when identity authentication is performed based on the voiceprint information management method provided by the present application, the authentication result is more accurate, there is no security risk, and the account is more secure.
  • FIG. 1 is a flowchart of a method for managing voiceprint information provided by an embodiment of the present application.
  • FIG. 2 is a flowchart of another method for managing voiceprint information provided by an embodiment of the present application.
  • FIG. 3 is a flowchart of a method for storing reference voiceprint information provided by an embodiment of the present application.
  • FIG. 4 is a structural block diagram of a voiceprint information management system provided by an embodiment of the present application.
  • FIG. 5 is a structural block diagram of another voiceprint information management system provided by an embodiment of the present application.
  • FIG. 6 is a flowchart of an identity authentication method according to an embodiment of the present application.
  • FIG. 7 is a flowchart of another identity authentication method provided by an embodiment of the present application.
  • FIG. 8 is a structural block diagram of an identity authentication system according to an embodiment of the present application.
  • FIG. 9 is a structural block diagram of another identity authentication system according to an embodiment of the present application.
  • Referring to FIG. 1, the voiceprint information management method includes the following steps.
  • the first user may be a registered user who has a corresponding private account in the account management system, and correspondingly, the second user may be a service personnel of the account management system.
  • the account management system records the voice call process between the registered user and the service personnel and stores the corresponding voice file.
  • The embodiment of the present application filters out the machine prompt sounds, the voice information of the service personnel, and the like from the historical voice files stored by the account management system to obtain the voice information of the registered user, and performs text recognition processing on that voice information to obtain the corresponding text information; the voice information and the corresponding text information can then be used as a set of reference voiceprint information of the registered user. By performing the above steps for each registered user separately, the reference voiceprint information corresponding to each registered user can be obtained and the voiceprint library created.
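  • The following is a minimal sketch of that enrollment flow; it is an illustration only, and the filtering and speech-recognition steps are passed in as hypothetical callables, since the patent does not name any particular implementation.

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class ReferenceVoiceprint:
    user_id: str      # identity identifier of the first user
    text: str         # text information recognized from the voice information
    audio_path: str   # the filtered voice information, e.g. a WAV file path

def build_reference_voiceprint(
    history_file: str,
    user_id: str,
    filter_user_voice: Callable[[str], str],  # assumed: keeps only the first user's speech
    recognize_text: Callable[[str], str],     # assumed: speech-to-text on that speech
) -> ReferenceVoiceprint:
    """Filter a historical call recording, transcribe it, and package the result."""
    user_audio = filter_user_voice(history_file)  # drop machine prompts / agent speech
    text = recognize_text(user_audio)             # text information for the voice information
    return ReferenceVoiceprint(user_id=user_id, text=text, audio_path=user_audio)
```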
  • The embodiment of the present application obtains the voice information of the first user by filtering a historical voice file stored in the related system, obtains the text information corresponding to the voice information through text recognition processing, and edits the voice information and the corresponding text information into the reference voiceprint information of the first user. Since the text information and the voice information in the reference voiceprint information are obtained from the historical voice file rather than preset by the related system, they are non-public: neither the first user, nor the second user, nor any other user can predict the specific content of the text information that needs to be read out when identity authentication is performed, so the corresponding sound file cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded sound file.
  • Therefore, when identity authentication is performed based on the voiceprint information management method provided by the embodiment of the present application, the authentication result is more accurate, there is no security risk, and the account is more secure.
  • a historical voice file corresponding to an arbitrary call process of the first user and the second user may be randomly obtained, so that the identity identifiers in the voiceprint library are in one-to-one correspondence with the reference voiceprint information.
  • Since the historical voice file that is actually obtained cannot be predicted, the specific content of the text information in the resulting reference voiceprint information cannot be predicted either. Therefore, performing identity authentication based on this embodiment can ensure the accuracy of the authentication result and improve the security of the account.
  • Alternatively, all historical voice files corresponding to the first user may be acquired, and each historical voice file may correspond to at least one set of reference voiceprint information, so that one identity identifier in the voiceprint library may correspond to multiple sets of reference voiceprint information (that is, the first user has multiple sets of reference voiceprint information); correspondingly, any set of reference voiceprint information can be obtained at random to perform identity authentication. Since the text information in each set of reference voiceprint information is non-public and the set obtained during identity authentication cannot be predicted, the specific content of the text information used for identity authentication cannot be predicted, the corresponding sound file cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded sound file. Therefore, performing identity authentication based on this embodiment can ensure the accuracy of the authentication result and improve the security of the account.
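  • A minimal sketch of that random selection step, assuming the voiceprint library is exposed as a plain mapping from identity identifier to stored (text, voice) pairs; the names are illustrative only.

```python
import random
from typing import Dict, List, Tuple

def pick_reference_voiceprint(
    voiceprint_library: Dict[str, List[Tuple[str, str]]],  # user ID -> [(text, audio_path), ...]
    user_id: str,
) -> Tuple[str, str]:
    """Randomly pick one of the user's stored reference voiceprints.

    The random choice is what keeps the prompted text unpredictable to the
    person being authenticated.
    """
    candidates = voiceprint_library.get(user_id, [])
    if not candidates:
        raise KeyError(f"no reference voiceprint stored for user {user_id}")
    return random.choice(candidates)
```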
  • Referring to FIG. 2, the voiceprint information management method includes the following steps.
  • the text information is divided into a plurality of sub-text information, and the start and end time of each sub-text information is marked.
  • the sub-voice information corresponding to each sub-text information is separately intercepted from the voice information according to the start and end time of the sub-text information.
  • The filtered voice information includes multiple pieces of voice information of the first user, and the corresponding text information obtained by text recognition includes multiple sentences or phrases. The embodiment of the present application therefore divides the text information into a plurality of sub-text information (each piece of sub-text information may be a sentence, a phrase or a word) and marks the start and end time of each piece of sub-text information obtained by the division; according to the start and end time, the sub-voice information corresponding to each piece of sub-text information is intercepted from the voice information (that is, the voice information is segmented according to the sub-text information).
  • For example, if the sentence "My account is locked" in the text information is recognized from the 00:03 to 00:05 period of the voice information, then "My account is locked" is taken as one piece of sub-text information whose start and end time is 00:03 to 00:05, and the voice information in the 00:03 to 00:05 period is intercepted as the sub-voice information corresponding to "My account is locked". By segmenting the text information and the voice information in this way, a plurality of pairs of sub-text information and sub-voice information can be obtained and separately edited into reference voiceprint information according to a predetermined format, thereby obtaining a plurality of pieces of reference voiceprint information corresponding to the same user.
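  • The segmentation step can be sketched as follows, assuming the text recognition step already yields per-sentence start and end times; the data layout (a list of samples plus a sample rate) is an assumption made purely for illustration.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple

@dataclass
class SubText:
    text: str       # one sentence or phrase, e.g. "My account is locked"
    start_s: float  # start time within the voice information, in seconds
    end_s: float    # end time within the voice information, in seconds

def cut_sub_voice(
    samples: Sequence[float],     # stands in for the filtered voice information
    sample_rate: int,
    sub_texts: List[SubText],     # timestamps assumed to come from text recognition
) -> List[Tuple[SubText, Sequence[float]]]:
    """Intercept the sub-voice information for each sub-text by its start/end time."""
    pairs = []
    for sub in sub_texts:
        lo = int(sub.start_s * sample_rate)
        hi = int(sub.end_s * sample_rate)
        sub_voice = samples[lo:hi]      # e.g. the 00:03-00:05 slice in the example above
        pairs.append((sub, sub_voice))  # each pair becomes one reference voiceprint
    return pairs
```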
  • Editing the sub-voice information and the corresponding sub-text information into reference voiceprint information may include: processing the sub-voice information into corresponding sub-voiceprint information and setting a file name for the sub-voiceprint information, where the format of the file name may be "voiceprint number.file format suffix", such as 0989X.WAV; and storing the sub-voiceprint information together with the identity identifier of the first user and the sub-text information corresponding to the sub-voiceprint information. The storage structure of the voiceprint library obtained based on the above voiceprint information management method is shown in Table 1.
  • In Table 1, each row corresponds to one piece of reference voiceprint information in the voiceprint library; the identity identifier (that is, the user ID) is used as the primary key for querying and invoking voiceprint information; and the user voiceprint number is used to mark the number of pieces of reference voiceprint information corresponding to the same user ID.
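  • A minimal sketch of a storage layout consistent with the description of Table 1; the column names and the use of SQLite are assumptions, not taken from the patent, and the sample row reuses the example values given in this document.

```python
import sqlite3

# Create an in-memory voiceprint library with the fields implied by Table 1: the
# identity identifier (user ID) is the query key, the user voiceprint number
# distinguishes multiple reference voiceprints of the same user, and each row also
# carries the sub-text information and the stored voiceprint file name.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE voiceprint_library (
        user_id         TEXT    NOT NULL,  -- identity identifier, e.g. "139XXXXXXX"
        voiceprint_no   INTEGER NOT NULL,  -- user voiceprint number (1, 2, 3, ...)
        sub_text        TEXT    NOT NULL,  -- sub-text information
        voiceprint_file TEXT    NOT NULL,  -- e.g. "0989X.WAV"
        PRIMARY KEY (user_id, voiceprint_no)
    )
""")
conn.execute(
    "INSERT INTO voiceprint_library VALUES (?, ?, ?, ?)",
    ("139XXXXXXX", 1, "Why is there no refund", "0389X.WAV"),
)
```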
  • For example, the sub-text information "Why is there no refund" is output; the voice information to be authenticated, obtained when the user to be authenticated re-reads this sub-text information, is processed into voiceprint information to be authenticated. The voiceprint information to be authenticated is compared with the sub-voiceprint information "0389X.WAV" extracted from the voiceprint library; if the two match, the identity authentication is determined to be successful, that is, the user to be authenticated is the first user corresponding to "139XXXXXXX"; otherwise, if the two do not match, the identity authentication is determined to have failed.
  • The embodiment of the present application obtains the voice information of the first user by filtering the historical voice files stored in the system, performs text recognition processing on the voice information to obtain the corresponding text information, divides the text information into a plurality of sub-text information, intercepts the corresponding sub-voice information from the voice information according to the start and end time of each piece of sub-text information, edits each pair of sub-text information and sub-voice information into one piece of reference voiceprint information, and saves it in the voiceprint library, so that each first user has a plurality of pieces of reference voiceprint information. When identity authentication needs to be performed, one of the plurality of pieces of reference voiceprint information corresponding to the identity identifier to be authenticated is selected at random. Since the reference voiceprint information obtained during identity authentication is random, the specific content of the text information that the user to be authenticated needs to read cannot be predicted. Therefore, performing identity authentication based on the voiceprint library obtained in this embodiment can ensure the accuracy of the authentication result and improve account security.
  • the sub-text information in each reference voiceprint information is short, which can reduce the time required for re-reading the text information, reduce the time consumed by the voiceprint comparison, and improve the authentication efficiency.
  • The voiceprint information management method provided by the embodiment of the present application can not only create a new voiceprint library but also update a created voiceprint library, for example by adding reference voiceprint information corresponding to a new user or adding new reference voiceprint information for an existing user. For a new user, it is only necessary to obtain the historical voice file corresponding to the new user and perform the above steps S12 to S14, or steps S22 to S26, to obtain the reference voiceprint information corresponding to the new user. Since the historical voice files corresponding to the same user increase over time, for an existing user the newly generated historical voice files can be obtained and the above steps performed to add new reference voiceprint information for that user.
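  • A minimal sketch of that update loop; the build and save_with_dedup callables are hypothetical stand-ins for the enrollment pipeline sketched earlier and for the FIG. 3 storage procedure described below.

```python
from typing import Callable, Iterable, Tuple

def update_voiceprint_library(
    new_history_files: Iterable[Tuple[str, str]],  # (identity identifier, path to new call recording)
    build: Callable[[str, str], object],           # e.g. the build_reference_voiceprint sketch above
    save_with_dedup: Callable[[object], None],     # hypothetical hook for the FIG. 3 storage procedure
) -> None:
    """Add reference voiceprints as new historical voice files accumulate.

    The same loop covers a brand-new user (first file) and an existing user whose
    call history has grown; nothing here is prescribed by the patent itself.
    """
    for user_id, history_file in new_history_files:
        reference = build(history_file, user_id)   # filter -> recognize -> package
        save_with_dedup(reference)                 # keep only the best-quality entry per text
```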
  • one or more pieces of reference voiceprint information may be set for the first user.
  • When a plurality of pieces of reference voiceprint information are set for the same first user, it is necessary to ensure that the text information in any two pieces of reference voiceprint information corresponding to the first user is different.
  • In practice, different historical voice files may yield the same recognized text information, or the same text information may be cut into multiple pieces of sub-text information with the same content, so that the same sub-text information corresponds to multiple pieces of sub-voice information. In this case, the embodiment of the present application uses the method shown in FIG. 3 to complete the storage of the reference voiceprint information.
  • Assume that the reference voiceprint information to be stored is first reference voiceprint information composed of first text information and first voice information. In the embodiment of the present application, the process of storing the first reference voiceprint information includes the following steps:
  • Step S31: Determine whether there is second reference voiceprint information that satisfies the comparison condition; if yes, execute step S32, otherwise execute step S34.
  • The comparison condition includes: the second text information corresponding to the second reference voiceprint information is the same as the first text information in the first reference voiceprint information, and the second identity identifier corresponding to the second reference voiceprint information is the same as the first identity identifier corresponding to the first reference voiceprint information.
  • Step S32: Determine whether the quality of the first voice information in the first reference voiceprint information is higher than the quality of the second voice information in the second reference voiceprint information; if yes, perform step S33, otherwise perform step S35.
  • Step S33: Delete the second reference voiceprint information, and store the first reference voiceprint information and the first identity identifier.
  • Step S34: Directly store the first reference voiceprint information and the first identity identifier.
  • Step S35: Delete the first reference voiceprint information.
  • In step S31, when determining whether the second reference voiceprint information exists, the search range includes at least the reference voiceprint information already stored in the voiceprint library, and may also include reference voiceprint information generated at the same time as the first reference voiceprint information but not yet stored. If the second reference voiceprint information does not exist, the first reference voiceprint information is stored directly. If the second reference voiceprint information is found, then the same first user and the same text information correspond to at least two different pieces of voice information; in that case the quality of the first voice information in the first reference voiceprint information is compared with the quality of the second voice information in the second reference voiceprint information. If the quality of the first voice information is higher than that of the second voice information, the first reference voiceprint information is stored and the second reference voiceprint information is deleted; if the quality of the first voice information is lower than that of the second voice information, the first reference voiceprint information is deleted directly. In other words, only the voice information with the highest quality is retained for the same text information, so as to improve the accuracy of the voice information comparison result in the identity authentication process and reduce the difficulty of the comparison.
  • In this way, the following three voiceprint library update modes can be implemented: 1) adding reference voiceprint information for a new user; 2) adding reference voiceprint information with new text information for an existing user; 3) replacing reference voiceprint information whose voice information quality is lower with reference voiceprint information whose voice information quality is higher.
  • That is, newly obtained reference voiceprint information is not stored directly into the voiceprint library in the embodiment of the present application; it is first determined whether another piece of reference voiceprint information having the same text information and the same identity identifier is already stored. If such a piece exists, the quality of the voice information in the two pieces of reference voiceprint information is compared, the reference voiceprint information with the higher voice information quality is retained, and the reference voiceprint information with the lower voice information quality is deleted. Therefore, the embodiment of the present application can ensure that, among the stored reference voiceprint information, the text information in any two pieces of reference voiceprint information corresponding to the same identity identifier (that is, the same first user) is different, and that the voice information corresponding to each piece of text information is of the highest quality. When identity authentication is performed based on the embodiment of the present application, voiceprint comparison based on the higher-quality voice information can ensure the accuracy of the authentication and improve authentication efficiency.
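  • The storage logic of steps S31 to S35 can be sketched as follows; the single-number quality score is an assumption (the patent does not fix a particular quality metric), and the in-memory dict stands in for the voiceprint library.

```python
from dataclasses import dataclass
from typing import Dict, Tuple

@dataclass
class StoredVoiceprint:
    user_id: str     # identity identifier
    text: str        # text information
    audio_path: str  # voice information
    quality: float   # assumed quality score; the patent does not fix a metric

def store_with_comparison(
    library: Dict[Tuple[str, str], StoredVoiceprint],  # keyed by (user ID, text)
    candidate: StoredVoiceprint,
) -> None:
    """Steps S31-S35: keep only the higher-quality voice per (user, text) pair."""
    key = (candidate.user_id, candidate.text)
    existing = library.get(key)          # S31: look for second reference voiceprint
    if existing is None:
        library[key] = candidate         # S34: store the candidate directly
    elif candidate.quality > existing.quality:
        library[key] = candidate         # S33: replace the lower-quality entry
    # otherwise (S35): discard the candidate and keep the stored entry
```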
  • FIG. 4 is a structural block diagram of a voiceprint information management system according to an embodiment of the present application; the voiceprint information management system can be applied to an account management system.
  • the voiceprint information management system 100 includes a voice filter 110, a text recognizer 120, and a voiceprint generator 130.
  • the voice filter 110 is configured to acquire a historical voice file generated by a call between the first user and the second user, and perform filtering processing on the historical voice file to obtain voice information of the first user.
  • the text recognizer 120 is configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
  • The voiceprint generator 130 is configured to edit the voice information and the corresponding text information into reference voiceprint information of the first user, and to store the reference voiceprint information and the identity identifier of the first user.
  • In the embodiment of the present application, the historical voice file stored in the related system is filtered to obtain the voice information of the first user, the text information corresponding to the voice information is obtained through text recognition processing, and the voice information and the corresponding text information are edited into the reference voiceprint information of the first user. Since the text information and the voice information in the reference voiceprint information are obtained from the historical voice file rather than preset by the related system, they are non-public: neither the first user, nor the second user, nor any other user can predict the specific content of the text information that needs to be read out when identity authentication is performed, so the corresponding sound file cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded sound file.
  • Therefore, when identity authentication is performed based on the embodiment of the present application, the authentication result is more accurate, there is no security risk, and the account is more secure.
  • FIG. 5 is a structural block diagram of another voiceprint information management system according to an embodiment of the present disclosure; the voiceprint information management system can be applied to an account management system.
  • the voiceprint information management system 200 includes a voice filter 210, a text recognizer 220, a text cutter 240, a voiceprint cutter 250, and a voiceprint generator 230.
  • the voice filter 210 is configured to acquire a historical voice file generated by a call between the first user and the second user, and perform filtering processing on the historical voice file to obtain voice information of the first user.
  • the text recognizer 220 is configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
  • the text cutter 240 is configured to slice the text information into a plurality of sub-text information and mark the start and end time of each sub-text information.
  • the voiceprint cutter 250 is configured to respectively intercept sub-voice information corresponding to each sub-text information from the voice information according to the start and end time of the sub-text information.
  • The voiceprint generator 230 is configured to edit each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user, and to store each piece of reference voiceprint information together with the identity identifier of the first user.
  • The embodiment of the present application obtains the voice information of the first user by filtering the historical voice files stored in the system, performs text recognition processing on the voice information to obtain the corresponding text information, divides the recognized text information into a plurality of sub-text information, intercepts the corresponding sub-voice information from the voice information according to the start and end time of each piece of sub-text information, edits each pair of sub-text information and sub-voice information into one piece of reference voiceprint information, and stores it in the voiceprint library, so that each first user has multiple pieces of reference voiceprint information. When identity authentication needs to be performed, one of the plurality of pieces of reference voiceprint information corresponding to the identity identifier to be authenticated may be selected at random. Since the reference voiceprint information obtained during identity authentication is random, the specific content of the text information that the user to be authenticated needs to read cannot be predicted, so the corresponding sound file cannot be recorded in advance and authentication cannot be passed by playing a pre-recorded sound file. Therefore, performing identity authentication based on the voiceprint library obtained in this embodiment can ensure the accuracy of the authentication result and improve account security.
  • the sub-text information in each reference voiceprint information is short, which can reduce the time required for re-reading the text information, reduce the time consumed by the voiceprint comparison, and improve the authentication efficiency.
  • In addition, the voiceprint generator 130 and the voiceprint generator 230 may be configured to:
  • determine whether there exists second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored and whose corresponding second identity identifier is the same as the first identity identifier corresponding to the first reference voiceprint information;
  • if the second reference voiceprint information does not exist, directly store the first reference voiceprint information and the first identity identifier;
  • if the second reference voiceprint information exists, compare the quality of the first voice information in the first reference voiceprint information with that of the second voice information in the second reference voiceprint information, and directly delete the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
  • if the quality of the first voice information is higher than that of the second voice information, delete the second reference voiceprint information and store the first reference voiceprint information and the first identity identifier.
  • In this way, the embodiment of the present application can ensure that, among the stored reference voiceprint information, the text information in any two pieces of reference voiceprint information corresponding to the same user is different, and that the voice information corresponding to each piece of text information is of the highest quality. Thus, when identity authentication is performed based on the embodiment of the present application, voiceprint comparison based on the higher-quality voice information can ensure the accuracy of the authentication and improve authentication efficiency.
  • FIG. 6 is a flowchart of an identity authentication method according to an embodiment of the present application; the identity authentication method may be applied to an account management system. Referring to FIG. 6, the identity authentication method includes the following steps.
  • the first user may be a registered user who has a corresponding private account in the account management system.
  • the second user may be a service personnel of the account management system.
  • The embodiment of the present application obtains the voice information of the first user by filtering a historical voice file stored in the related system, obtains the text information corresponding to the voice information through text recognition processing, and edits the voice information and the corresponding text information into the reference voiceprint information of the first user. Since the text information and the voice information in the reference voiceprint information are obtained from the historical voice file rather than preset by the related system, they are non-public: neither the first user, nor the second user, nor any other user can predict the specific content of the text information that needs to be read out when identity authentication is performed, so the corresponding sound file cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded sound file.
  • Therefore, when identity authentication is performed based on this embodiment, the authentication result is more accurate, there is no security risk, and the account is more secure.
  • FIG. 7 is a flowchart of another identity authentication method according to an embodiment of the present application; the identity authentication method may be applied to an account management system.
  • Referring to FIG. 7, the identity authentication method includes the following steps.
  • the text information is divided into a plurality of sub-text information, and the start and end time of each sub-text information is marked.
  • the sub-voice information corresponding to each sub-text information is separately extracted from the voice information according to the start and end time of the sub-text information.
  • In the present application, the text information is divided into a plurality of sub-text information, the corresponding sub-voice information is intercepted according to the start and end times, and each piece of sub-text information and the corresponding sub-voice information are edited into one piece of reference voiceprint information, so that the first user has a plurality of pieces of reference voiceprint information. When identity authentication needs to be performed, one of the plurality of pieces of reference voiceprint information corresponding to the identity identifier to be authenticated is selected at random. Since the reference voiceprint information obtained during identity authentication is random, the specific content of the text information that the user to be authenticated needs to read cannot be predicted, so the corresponding sound file cannot be recorded in advance and authentication cannot be passed by playing a pre-recorded sound file. Therefore, the identity authentication method provided in this embodiment can ensure the accuracy of the authentication result and improve the security of the account.
  • the sub-text information in each reference voiceprint information is short, which can reduce the time required for re-reading the text information, reduce the time consumed by the voiceprint comparison, and improve the authentication efficiency.
  • In addition, the identity authentication method provided by the embodiment of the present application can also complete the storage of the reference voiceprint information by using the method shown in FIG. 3, which ensures not only that, among the stored reference voiceprint information, the text information in any two pieces of reference voiceprint information corresponding to the same user is different, but also that the voice information corresponding to each piece of text information is of the highest quality. When identity authentication is performed based on the embodiment of the present application, voiceprint comparison based on the higher-quality voice information can ensure the accuracy of the authentication and improve authentication efficiency.
  • FIG. 8 is a structural block diagram of an identity authentication system according to an embodiment of the present application, where the identity authentication system can be applied to an account management system.
  • the identity authentication system 300 includes a voice filter 310, a text recognizer 320, a voiceprint generator 330, a voiceprint extractor 360, a recognition front end 370, and a voiceprint matcher 380.
  • the voice filter 310 is configured to acquire a historical voice file generated by the first user and the second user, and perform filtering processing on the historical voice file to obtain voice information of the first user.
  • the text recognizer 320 is configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
  • The voiceprint generator 330 is configured to edit the voice information and the corresponding text information into reference voiceprint information of the first user, and to store the reference voiceprint information and the identity identifier of the first user.
  • the voiceprint extractor 360 is configured to acquire reference voiceprint information corresponding to an identity identifier of a user to be authenticated.
  • the recognition front end 370 is configured to output the text information in the acquired reference voiceprint information and receive the corresponding voice information to be authenticated.
  • The voiceprint matcher 380 is configured to match the voice information in the acquired reference voiceprint information with the voice information to be authenticated; if the matching is successful, it is determined that the authentication of the user to be authenticated is successful, and if the matching fails, it is determined that the authentication of the user to be authenticated fails.
  • The recognition front end 370 is configured to implement the interaction between the identity authentication system and the user to be authenticated. In addition to outputting the text information in the reference voiceprint information acquired by the voiceprint extractor 360 and receiving the voice information to be authenticated that is input by the user to be authenticated, it may also receive an identity authentication request from the user to be authenticated, trigger the voiceprint extractor 360 after receiving the identity authentication request, and output the authentication result obtained by the voiceprint matcher 380 to the user to be authenticated.
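  • A minimal sketch of the end-to-end authentication flow around these components; prompt_user, capture_voice and voices_match are assumed stand-ins for the recognition front end and the voiceprint matcher, not APIs defined by the patent.

```python
import random
from typing import Callable, Dict, List, Tuple

def authenticate(
    user_id: str,
    voiceprint_library: Dict[str, List[Tuple[str, str]]],  # user ID -> [(text, reference audio), ...]
    prompt_user: Callable[[str], None],        # recognition front end: output the text information
    capture_voice: Callable[[], str],          # recognition front end: receive the voice to authenticate
    voices_match: Callable[[str, str], bool],  # voiceprint matcher
) -> bool:
    """Authenticate a user against a randomly chosen reference voiceprint."""
    candidates = voiceprint_library.get(user_id, [])
    if not candidates:
        return False                                   # no reference voiceprint on record
    text, reference_audio = random.choice(candidates)  # voiceprint extractor
    prompt_user(text)                                  # ask the user to read the text out loud
    candidate_audio = capture_voice()                  # the voice information to be authenticated
    return voices_match(reference_audio, candidate_audio)
```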
  • In the embodiment of the present application, the historical voice file stored in the related system is filtered to obtain the voice information of the first user, the text information corresponding to the voice information is obtained through text recognition processing, and the voice information and the corresponding text information are edited into the reference voiceprint information of the first user. Since the text information and the voice information in the reference voiceprint information are obtained from the historical voice file rather than preset by the related system, they are non-public: neither the first user, nor the second user, nor any other user can predict the specific content of the text information that needs to be read out when identity authentication is performed, so the corresponding sound file cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded sound file.
  • Therefore, when identity authentication is performed based on the embodiment of the present application, the authentication result is more accurate, there is no security risk, and the account is more secure.
  • FIG. 9 is a structural block diagram of an identity authentication system according to an embodiment of the present application.
  • the identity authentication system can be applied to an account management system.
  • the identity authentication system 400 includes a voice filter 410, a text recognizer 420, a text cutter 440, a voiceprint cutter 450, a voiceprint generator 430, a voiceprint extractor 460, a recognition front end 470, and Voiceprint matcher 480.
  • the voice filter 410 is configured to acquire a historical voice file generated by the first user and the second user, and perform filtering processing on the historical voice file to obtain voice information of the first user.
  • the text recognizer 420 is configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
  • the text cutter 440 is configured to slice the text information into a plurality of sub-text information and mark the start and end time of each sub-text information.
  • the voiceprint cutter 450 is configured to respectively intercept the sub-voice information corresponding to each sub-text information from the voice information according to the start and end time of the sub-text information.
  • The voiceprint generator 430 is configured to edit each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user, and to store each piece of reference voiceprint information together with the identity identifier of the first user.
  • the voiceprint extractor 460 is configured to acquire reference voiceprint information corresponding to an identity identifier of a user to be authenticated.
  • The recognition front end 470 is configured to output the sub-text information in the acquired reference voiceprint information and receive the corresponding voice information to be authenticated.
  • The voiceprint matcher 480 is configured to match the sub-voice information in the acquired reference voiceprint information with the voice information to be authenticated; if the matching is successful, it is determined that the authentication of the user to be authenticated is successful, and if the matching fails, it is determined that the authentication of the user to be authenticated fails.
  • In the embodiment of the present application, the recognized text information is divided into a plurality of sub-text information, the corresponding sub-voice information is intercepted according to the start and end times, and each piece of sub-text information and the corresponding sub-voice information are edited into one piece of reference voiceprint information, so that the first user has a plurality of pieces of reference voiceprint information. When identity authentication needs to be performed, the corresponding plurality of pieces of reference voiceprint information are determined from the identity identifier of the user to be authenticated, and one of them is selected at random for this identity authentication.
  • the identity authentication system provided in this embodiment can ensure the accuracy of the authentication result and improve the security of the account.
  • the sub-text information in each reference voiceprint information is short, which can reduce the time required for re-reading the text information, reduce the time consumed by the voiceprint comparison, and improve the authentication efficiency.
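  • The patent does not prescribe how the voiceprint matcher compares voice information; one common choice, used here purely as an assumed illustration, is to compare fixed-length voiceprint feature vectors (for example speaker embeddings) against a similarity threshold.

```python
import math
from typing import Sequence

def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def voices_match(
    reference_features: Sequence[float],  # assumed fixed-length voiceprint features
    candidate_features: Sequence[float],
    threshold: float = 0.8,               # illustrative threshold, not from the patent
) -> bool:
    """Declare a match when the feature vectors are similar enough."""
    return cosine_similarity(reference_features, candidate_features) >= threshold
```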
  • the voiceprint generator 330 and the voiceprint generator 430 may be configured to:
  • determine whether there exists second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored and whose corresponding second identity identifier is the same as the first identity identifier corresponding to the first reference voiceprint information;
  • if the second reference voiceprint information does not exist, directly store the first reference voiceprint information and the identity identifier of the first user;
  • if the second reference voiceprint information exists, compare the quality of the first voice information in the first reference voiceprint information with that of the second voice information in the second reference voiceprint information, and delete the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
  • if the quality of the first voice information is higher than that of the second voice information, delete the second reference voiceprint information and store the first reference voiceprint information and the corresponding user identity identifier.
  • In this way, the embodiment of the present application can ensure that, among the stored reference voiceprint information, the text information in any two pieces of reference voiceprint information corresponding to the same identity identifier is different, and that the voice information corresponding to each piece of text information is of the highest quality. When identity authentication is performed based on the embodiment of the present application, voiceprint comparison based on the higher-quality voice information can ensure the accuracy of the authentication and improve authentication efficiency.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Computer Security & Cryptography (AREA)
  • Theoretical Computer Science (AREA)
  • Business, Economics & Management (AREA)
  • Game Theory and Decision Science (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Telephonic Communication Services (AREA)
  • Storage Device Security (AREA)

Abstract

A voiceprint information management method and device, and an identity authentication method and system. The voiceprint information management method comprises: filtering a historical voice file stored in a related system to obtain voice information of a first user (S12); performing text recognition processing to obtain text information corresponding to the voice information (S13); and editing the voice information and the corresponding text information into reference voiceprint information of the first user. Because the text information and the voice information in the reference voiceprint information are both obtained from the historical voice file rather than preset by the related system, they are not public, and no user can predict the specific content of the text information that must be read back when identity authentication is executed. A corresponding audio file therefore cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded audio file. Identity authentication performed on the basis of this voiceprint information management yields a more accurate authentication result, eliminates this potential safety hazard, and provides higher account security.

Description

Voiceprint information management method and device, and identity authentication method and system
Technical Field
The present application relates to the field of voiceprint recognition technology, and in particular, to a voiceprint information management method and device, and an identity authentication method and system.
Background
A voiceprint is the spectrum of sound waves carrying speech information, as displayed by an electro-acoustic instrument. When different people speak the same words, the sound waves they produce differ, and the corresponding sound-wave spectra, that is, the voiceprint information, also differ. Therefore, by comparing voiceprint information it can be determined whether the corresponding speakers are the same person; in other words, identity authentication based on voiceprint recognition can be implemented. This authentication approach can be widely applied in various account management systems to guarantee account security.
In the related art, before identity authentication is implemented using voiceprint recognition technology, the user is first required to read out preset text information; the user's voice signal is collected at that time and analyzed to obtain the corresponding voiceprint information, which is stored in a voiceprint library as the user's reference voiceprint information. When identity authentication is performed, the person to be authenticated is likewise required to read out the preset text information; that person's voice signal is collected and analyzed to obtain the corresponding voiceprint information, and by comparing this voiceprint information with the reference voiceprint information in the voiceprint library it can be determined whether the person to be authenticated is the user himself or herself.
In the above technology, the text information used for identity authentication is already disclosed when the voiceprint library is established; correspondingly, the text information that the person to be authenticated is required to read out during authentication is also known. If an audio file of the user reading out that text information is recorded in advance, anyone can pass the authentication by playing the pre-recorded audio file. It can be seen that the existing voiceprint-recognition-based identity authentication method has serious potential security risks.
发明内容Summary of the invention
为克服相关技术中存在的问题,本申请提供一种声纹信息管理方法、装置以及身份认证方法、系统。To overcome the problems in the related art, the present application provides a voiceprint information management method and apparatus, and an identity authentication method and system.
本申请第一方面提供一种声纹信息管理方法,该方法包括如下步骤:A first aspect of the present application provides a voiceprint information management method, the method comprising the following steps:
获取第一用户与第二用户通话产生的历史语音文件;Obtaining a historical voice file generated by the first user and the second user;
对所述历史语音文件执行过滤处理,得到所述第一用户的语音信息;Performing a filtering process on the historical voice file to obtain voice information of the first user;
对所述语音信息执行文本识别处理,得到所述语音信息对应的文本信息;Performing text recognition processing on the voice information to obtain text information corresponding to the voice information;
将所述语音信息和对应的文本信息编辑为所述第一用户的基准声纹信息,并存储所述基准声纹信息和所述第一用户的身份标识符。Editing the voice information and corresponding text information into reference voiceprint information of the first user, and storing the reference voiceprint information and an identity identifier of the first user.
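The flow of the first-aspect method can be sketched as below; this is only an illustration, and the placeholder helpers filter_first_user_voice and recognize_text stand in for the filtering and text recognition steps, whose concrete implementations are not specified here.

```python
from dataclasses import dataclass

@dataclass
class ReferenceVoiceprint:
    user_id: str     # identity identifier of the first user
    text_info: str   # text information recognized from the user's voice information
    voice_file: str  # path of the voice information paired with the text

def filter_first_user_voice(call_recording: str) -> str:
    # Placeholder: a real system would strip machine prompts and the second
    # user's (agent's) speech, keeping only the first user's voice information.
    return call_recording.replace(".wav", "_user.wav")

def recognize_text(voice_file: str) -> str:
    # Placeholder for the speech-to-text step over the filtered voice information.
    return "why is there no refund"

def enroll_from_history(call_recording: str, user_id: str, library: dict) -> ReferenceVoiceprint:
    """Edit the filtered voice information and its recognized text into one
    piece of reference voiceprint information and store it under the user ID."""
    voice_file = filter_first_user_voice(call_recording)
    text_info = recognize_text(voice_file)
    record = ReferenceVoiceprint(user_id, text_info, voice_file)
    library.setdefault(user_id, []).append(record)
    return record

library: dict = {}
enroll_from_history("call_20150930.wav", "139XXXXXXXX", library)
```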
With reference to the first aspect, in a first feasible implementation of the first aspect, the voiceprint information management method further comprises:
dividing the text information into a plurality of pieces of sub-text information, and marking the start and end time of each piece of sub-text information; and
intercepting, from the voice information according to the start and end times of the sub-text information, the sub-voice information corresponding to each piece of sub-text information.
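As an illustration of how sub-text information and its start and end times might be represented, the sketch below assumes a speech recognizer that returns word-level timestamps; the data layout and the grouping rule are assumptions made for the example only.

```python
from typing import List, Tuple

Word = Tuple[str, float, float]      # (word, start time in seconds, end time in seconds)
SubText = Tuple[str, float, float]   # (sub-text information, start time, end time)

def split_text_info(words: List[Word], max_words: int = 6) -> List[SubText]:
    """Divide the recognized text information into short pieces of sub-text
    information and mark each piece's start and end time, taken from its
    first and last word."""
    sub_texts: List[SubText] = []
    for i in range(0, len(words), max_words):
        chunk = words[i:i + max_words]
        text = " ".join(w for w, _, _ in chunk)
        sub_texts.append((text, chunk[0][1], chunk[-1][2]))
    return sub_texts

# Toy transcript with word-level timestamps (seconds).
words = [("my", 3.0, 3.3), ("account", 3.3, 3.8), ("is", 3.8, 4.0), ("locked", 4.0, 5.0)]
print(split_text_info(words))   # [('my account is locked', 3.0, 5.0)]
```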
With reference to the first feasible implementation of the first aspect, in a second feasible implementation of the first aspect, editing the voice information and the corresponding text information into the reference voiceprint information of the first user comprises:
editing each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
With reference to the first aspect, in a third feasible implementation of the first aspect, storing the reference voiceprint information and the identity identifier of the first user comprises:
determining whether there is second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored, and whose corresponding second identity identifier is also the same as the first identity identifier corresponding to the first reference voiceprint information;
if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with the quality of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information; and
if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
A second aspect of the present application provides a voiceprint information management device, comprising:
a voice filter, configured to acquire a historical voice file generated by a call between a first user and a second user, and perform filtering processing on the historical voice file to obtain voice information of the first user;
a text recognizer, configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information; and
a voiceprint generator, configured to edit the voice information and the corresponding text information into reference voiceprint information of the first user, and store the reference voiceprint information and an identity identifier of the first user.
With reference to the second aspect, in a first feasible implementation of the second aspect, the voiceprint information management device further comprises:
a text cutter, configured to divide the text information into a plurality of pieces of sub-text information, and mark the start and end time of each piece of sub-text information; and
a voiceprint cutter, configured to intercept, from the voice information according to the start and end times of the sub-text information, the sub-voice information corresponding to each piece of sub-text information.
With reference to the first feasible implementation of the second aspect, in a second feasible implementation of the second aspect, the voiceprint generator editing the voice information and the corresponding text information into the reference voiceprint information of the first user comprises:
editing each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
With reference to the second aspect, in a third feasible implementation of the second aspect, the voiceprint generator storing the reference voiceprint information and the identity identifier of the first user comprises:
determining whether there is second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored, and whose corresponding second identity identifier is also the same as the first identity identifier corresponding to the first reference voiceprint information;
if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with the quality of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information; and
if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
A third aspect of the present application provides an identity authentication method, comprising the following steps:
acquiring a historical voice file generated by a call between a first user and a second user;
performing filtering processing on the historical voice file to obtain voice information of the first user;
performing text recognition processing on the voice information to obtain text information corresponding to the voice information;
editing the voice information and the corresponding text information into reference voiceprint information of the first user, and storing the reference voiceprint information and an identity identifier of the first user;
acquiring reference voiceprint information corresponding to an identity identifier of a user to be authenticated;
outputting the text information in the acquired reference voiceprint information, and receiving corresponding voice information to be authenticated; and
matching the voice information in the acquired reference voiceprint information against the voice information to be authenticated; if the matching succeeds, determining that the authentication of the user to be authenticated is successful, and if the matching fails, determining that the authentication of the user to be authenticated has failed.
With reference to the third aspect, in a first feasible implementation of the third aspect, the identity authentication method further comprises:
dividing the text information into a plurality of pieces of sub-text information, and marking the start and end time of each piece of sub-text information; and
intercepting, from the voice information according to the start and end times of the sub-text information, the sub-voice information corresponding to each piece of sub-text information.
With reference to the first feasible implementation of the third aspect, in a second feasible implementation of the third aspect, editing the voice information and the corresponding text information into the reference voiceprint information of the first user comprises:
editing each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
With reference to the third aspect, in a third feasible implementation of the third aspect, storing the reference voiceprint information and the identity identifier of the first user comprises:
determining whether there is second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored, and whose corresponding second identity identifier is also the same as the first identity identifier corresponding to the first reference voiceprint information;
if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with the quality of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information; and
if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
A fourth aspect of the present application provides an identity authentication system, comprising:
a voice filter, configured to acquire a historical voice file generated by a call between a first user and a second user, and perform filtering processing on the historical voice file to obtain voice information of the first user;
a text recognizer, configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information;
a voiceprint generator, configured to edit the voice information and the corresponding text information into reference voiceprint information of the first user, and store the reference voiceprint information and an identity identifier of the first user;
a voiceprint extractor, configured to acquire reference voiceprint information corresponding to an identity identifier of a user to be authenticated;
a recognition front-end, configured to output the text information in the acquired reference voiceprint information, and receive corresponding voice information to be authenticated; and
a voiceprint matcher, configured to match the voice information in the acquired reference voiceprint information against the voice information to be authenticated; if the matching succeeds, determine that the authentication of the user to be authenticated is successful, and if the matching fails, determine that the authentication of the user to be authenticated has failed.
With reference to the fourth aspect, in a first feasible implementation of the fourth aspect, the identity authentication system further comprises:
a text cutter, configured to divide the text information into a plurality of pieces of sub-text information, and mark the start and end time of each piece of sub-text information; and
a voiceprint cutter, configured to intercept, from the voice information according to the start and end times of the sub-text information, the sub-voice information corresponding to each piece of sub-text information.
With reference to the first feasible implementation of the fourth aspect, in a second feasible implementation of the fourth aspect, the voiceprint generator editing the voice information and the corresponding text information into the reference voiceprint information of the first user comprises:
editing each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
With reference to the fourth aspect, in a third feasible implementation of the fourth aspect, the voiceprint generator storing the reference voiceprint information and the identity identifier of the first user comprises:
determining whether there is second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored, and whose corresponding second identity identifier is also the same as the first identity identifier corresponding to the first reference voiceprint information;
if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with the quality of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information; and
if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
It can be seen from the above technical solutions that the present application filters the historical voice files stored by the related system to obtain the voice information of the first user, obtains the text information corresponding to that voice information through text recognition processing, and edits the voice information and the corresponding text information into the reference voiceprint information of the first user. Because the text information and the voice information in the reference voiceprint information are both obtained from the historical voice file rather than preset by the related system, that is, they are not public, neither the first user, nor the second user, nor any other user can predict the specific content of the text information that must be read back when identity authentication is executed; a corresponding audio file therefore cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded audio file. Accordingly, compared with the existing voiceprint-recognition-based identity authentication method, identity authentication based on the voiceprint information management method provided by the present application yields a more accurate authentication result, has no such potential safety hazard, and provides higher account security.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present application.
Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present invention and, together with the description, serve to explain the principles of the present invention.
FIG. 1 is a flowchart of a voiceprint information management method provided by an embodiment of the present application.
FIG. 2 is a flowchart of another voiceprint information management method provided by an embodiment of the present application.
FIG. 3 is a flowchart of a method for storing reference voiceprint information provided by an embodiment of the present application.
FIG. 4 is a structural block diagram of a voiceprint information management system provided by an embodiment of the present application.
FIG. 5 is a structural block diagram of another voiceprint information management system provided by an embodiment of the present application.
FIG. 6 is a flowchart of an identity authentication method provided by an embodiment of the present application.
FIG. 7 is a flowchart of another identity authentication method provided by an embodiment of the present application.
FIG. 8 is a structural block diagram of an identity authentication system provided by an embodiment of the present application.
FIG. 9 is a structural block diagram of another identity authentication system provided by an embodiment of the present application.
Detailed Description
Exemplary embodiments will be described in detail here, examples of which are illustrated in the accompanying drawings. When the following description refers to the drawings, the same numbers in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention; rather, they are merely examples of devices and methods consistent with some aspects of the present invention as detailed in the appended claims.
FIG. 1 is a flowchart of a voiceprint information management method provided by an embodiment of the present application; the method is applied to an account management system. As shown in FIG. 1, the voiceprint information management method comprises the following steps.
S11. Acquire a historical voice file generated by a call between a first user and a second user.
The first user may be a registered user who has a corresponding private account in the account management system; correspondingly, the second user may be a service agent of the account management system.
S12. Perform filtering processing on the historical voice file to obtain voice information of the first user.
S13. Perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
S14. Edit the voice information and the corresponding text information into reference voiceprint information of the first user, and store the reference voiceprint information and an identity identifier of the first user.
In general, to facilitate performance statistics, service quality evaluation, dispute handling and the like, the account management system records the voice calls between registered users and service agents and stores the corresponding voice files. In view of this, this embodiment of the present application filters the machine prompt tones, the service agent's voice and other content out of the historical voice files stored by the account management system to obtain the registered user's voice information, performs text recognition processing on that voice information to obtain the corresponding text information, and uses the voice information and the corresponding text information as one set of reference voiceprint information of the registered user. By performing the above steps for each registered user, the reference voiceprint information corresponding to each registered user can be obtained, completing the creation of the voiceprint library.
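How the filtering of step S12 is carried out is not prescribed here. Purely as an illustration, the sketch below assumes the call recording is a two-channel (stereo) 16-bit PCM file with the registered user on one channel, so that the first user's voice information can be obtained by keeping that channel; real deployments may instead require speaker separation or prompt/agent removal.

```python
import wave

def extract_channel(stereo_path: str, mono_path: str, channel: int = 0) -> None:
    """Keep only one channel of a stereo call recording (assumed 16-bit PCM),
    treating that channel as the first user's voice information."""
    with wave.open(stereo_path, "rb") as src:
        assert src.getnchannels() == 2 and src.getsampwidth() == 2
        framerate = src.getframerate()
        frames = src.readframes(src.getnframes())

    samples = bytearray()
    frame_size = 4                     # 2 channels * 2 bytes per sample
    offset = channel * 2               # byte offset of the wanted channel in each frame
    for i in range(0, len(frames), frame_size):
        samples += frames[i + offset:i + offset + 2]

    with wave.open(mono_path, "wb") as dst:
        dst.setnchannels(1)
        dst.setsampwidth(2)
        dst.setframerate(framerate)
        dst.writeframes(bytes(samples))

# extract_channel("call_20150930.wav", "first_user_voice.wav", channel=0)
```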
It can be seen from the above method that this embodiment of the present application obtains the voice information of the first user by filtering the historical voice file stored by the related system, obtains the text information corresponding to that voice information through text recognition processing, and edits the voice information and the corresponding text information into the reference voiceprint information of the first user. Because the text information and the voice information in the reference voiceprint information are both obtained from the historical voice file rather than preset by the related system, that is, they are not public, neither the first user, nor the second user, nor any other user can predict the specific content of the text information that must be read back when identity authentication is executed; a corresponding audio file therefore cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded audio file. Accordingly, compared with the existing voiceprint-recognition-based identity authentication method, identity authentication based on the voiceprint information management method provided by this embodiment yields a more accurate authentication result, has no such potential safety hazard, and provides higher account security.
In one feasible embodiment of the present application, one historical voice file corresponding to any single call between the first user and the second user may be acquired at random, so that the identity identifiers in the voiceprint library correspond one-to-one with the reference voiceprint information. Because it cannot be predicted which call the acquired historical voice file corresponds to, the specific content of the text information in the resulting reference voiceprint information cannot be predicted either; performing identity authentication based on this embodiment can therefore ensure the accuracy of the authentication result and improve the security of the account.
In another feasible embodiment of the present application, all historical voice files corresponding to the first user may be acquired, and each historical voice file may correspond to at least one set of reference voiceprint information, so that one identity identifier in the voiceprint library may correspond to multiple sets of reference voiceprint information (that is, the first user has multiple sets of reference voiceprint information); correspondingly, any one set of reference voiceprint information may be acquired at random to perform identity authentication. Because the text information in each set of reference voiceprint information is not public, the reference voiceprint information acquired during identity authentication cannot be predicted, and so the specific content of the text information used for the authentication cannot be predicted either; a corresponding audio file therefore cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded audio file. Performing identity authentication based on this embodiment can thus ensure the accuracy of the authentication result and improve the security of the account.
FIG. 2 is a flowchart of a voiceprint information management method provided by another embodiment of the present application; the method is applied to an account management system. As shown in FIG. 2, the voiceprint information management method comprises the following steps.
S21. Acquire a historical voice file generated by a call between a first user and a second user.
S22. Perform filtering processing on the historical voice file to obtain voice information of the first user.
S23. Perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
S24. Divide the text information into a plurality of pieces of sub-text information, and mark the start and end time of each piece of sub-text information.
S25. Intercept, from the voice information according to the start and end times of the sub-text information, the sub-voice information corresponding to each piece of sub-text information.
S26. Edit each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user, and store each piece of reference voiceprint information and the identity identifier of the first user.
Because the historical voice file is a recording of calls between the first user and the second user over a period of time, the filtered voice information contains multiple segments of the first user's speech, and the corresponding text information obtained through text recognition contains multiple sentences or phrases. This embodiment of the present application divides the text information into a plurality of pieces of sub-text information (each piece may be a sentence, a phrase or a word); at the same time, a start and end time is marked for each piece of sub-text information obtained by the division, and the sub-voice information corresponding to that piece is intercepted from the voice information according to the start and end time (that is, the voice information is segmented according to the sub-text information). For example, if the sentence "My account is locked" in the text information was recognized from the 00:03 to 00:05 portion of the voice information, then "My account is locked" is split off as one piece of sub-text information with a start and end time of 00:03 to 00:05; correspondingly, the 00:03 to 00:05 portion of the voice information is cut out, yielding the sub-voice information corresponding to the sub-text information "My account is locked". By segmenting the text information and the voice information in this way, multiple pairs of sub-text information and sub-voice information are obtained, and editing each pair into reference voiceprint information in a predetermined format yields multiple pieces of reference voiceprint information corresponding to the same user.
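The interception of the 00:03 to 00:05 segment could, for instance, be done with Python's standard wave module as sketched below. This is only an illustration under the assumption of a mono PCM recording; the file names are hypothetical.

```python
import wave

def intercept_sub_voice(voice_path: str, out_path: str, start_s: float, end_s: float) -> None:
    """Cut the sub-voice information between start_s and end_s (in seconds)
    out of the first user's voice information, e.g. 3.0-5.0 for the
    sub-text "My account is locked"."""
    with wave.open(voice_path, "rb") as src:
        rate = src.getframerate()
        src.setpos(int(start_s * rate))                          # jump to 00:03
        frames = src.readframes(int((end_s - start_s) * rate))   # read up to 00:05
        params = src.getparams()

    with wave.open(out_path, "wb") as dst:
        dst.setparams(params)      # same channels / sample width / frame rate
        dst.writeframes(frames)    # the frame count in the header is fixed up on close

# Example call (file names are illustrative only):
# intercept_sub_voice("first_user_voice.wav", "0361X.WAV", 3.0, 5.0)
```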
In this embodiment of the present application, editing the sub-voice information and the corresponding sub-text information into reference voiceprint information may comprise: processing the sub-voice information into corresponding sub-voiceprint information and setting a file name for the sub-voiceprint information, where the file name format may be "voiceprint number.file-format suffix", for example 0989X.WAV; and storing the sub-voiceprint information together with the identity identifier of the first user, the sub-text information and other information corresponding to that sub-voiceprint information. The storage structure of a voiceprint library obtained with the above voiceprint information management method is shown in Table 1.
Table 1. Example of the voiceprint library storage structure
User ID | User voiceprint No. | Sub-text information | Sub-voiceprint information
139XXXXXXXX | 1 | Very satisfied | 0989X.WAV
139XXXXXXXX | 2 | Why is there no refund | 0389X.WAV
189XXXXXXXX | 1 | I am very angry | 0687X.WAV
189XXXXXXXX | 2 | Account is locked | 0361X.WAV
In Table 1, each row corresponds to one piece of reference voiceprint information in the voiceprint library; the identity identifier (that is, the user ID) serves as the primary key for querying and retrieving voiceprint information, and the user voiceprint number marks the number of pieces of reference voiceprint information corresponding to the same user ID. Taking the user ID "139XXXXXXXX" as an example, when an identity authentication request for this user ID is received, the reference voiceprint information corresponding to "139XXXXXXXX" is queried from the voiceprint library, yielding multiple results, from which one piece is randomly extracted as the reference voiceprint information for this authentication. For example, if reference voiceprint No. 2 of this user ID is extracted, its sub-text information "Why is there no refund" is output; the voice information to be authenticated, obtained when the user to be authenticated reads back this sub-text information, is received and processed into voiceprint information to be authenticated, which is compared with the sub-voiceprint information "0389X.WAV" extracted from the voiceprint library. If the two match, the identity authentication is determined to be successful, that is, the user to be authenticated is considered to be the first user corresponding to "139XXXXXXXX"; conversely, if the two do not match, the identity authentication is determined to have failed.
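The storage structure of Table 1 could be mirrored, for example, by a simple record type keyed on the user ID; the layout below is an assumption for illustration only, not the application's actual schema.

```python
from dataclasses import dataclass
from typing import Dict, List

@dataclass
class VoiceprintRecord:
    user_id: str          # identity identifier, primary key for queries
    voiceprint_no: int    # numbers the records belonging to one user ID
    sub_text: str         # sub-text information the user will be asked to read back
    sub_voiceprint: str   # file name of the sub-voiceprint information

# One record per row of Table 1, grouped under the user ID.
voiceprint_library: Dict[str, List[VoiceprintRecord]] = {
    "139XXXXXXXX": [
        VoiceprintRecord("139XXXXXXXX", 1, "Very satisfied", "0989X.WAV"),
        VoiceprintRecord("139XXXXXXXX", 2, "Why is there no refund", "0389X.WAV"),
    ],
    "189XXXXXXXX": [
        VoiceprintRecord("189XXXXXXXX", 1, "I am very angry", "0687X.WAV"),
        VoiceprintRecord("189XXXXXXXX", 2, "Account is locked", "0361X.WAV"),
    ],
}

# Query by the primary key (user ID):
print(len(voiceprint_library["139XXXXXXXX"]))   # 2 pieces of reference voiceprint information
```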
It can be seen from the above technical solution that this embodiment of the present application obtains the voice information of the first user by filtering the historical voice files stored in the system, obtains the corresponding text information by performing text recognition processing on that voice information, divides the recognized text information into a plurality of pieces of sub-text information, intercepts the corresponding sub-voice information from the voice information according to the start and end time of each piece of sub-text information, edits each pair of sub-text information and sub-voice information into one piece of reference voiceprint information, and stores it in the voiceprint library, so that each first user has multiple pieces of reference voiceprint information. When identity authentication needs to be performed, one piece is randomly selected from the multiple pieces of reference voiceprint information corresponding to the identity identifier to be authenticated. Because the reference voiceprint information acquired during identity authentication is random, the specific content of the text information that the user to be authenticated will be required to read back cannot be predicted; therefore, performing identity authentication with the voiceprint library obtained in this embodiment can ensure the accuracy of the authentication result and improve the security of the account. In addition, in this embodiment the sub-text information in each piece of reference voiceprint information is short, which reduces the time needed to read back the text information, reduces the time consumed by the voiceprint comparison, and improves authentication efficiency.
The voiceprint information management method provided by the embodiments of the present application can not only create a new voiceprint library but also update an existing one, for example by adding reference voiceprint information for a new user or adding new reference voiceprint information for an existing user. For a new user, it is only necessary to acquire the historical voice files corresponding to that user and perform the above steps S12 to S14, or steps S22 to S26, to obtain the reference voiceprint information corresponding to the new user. Because the historical voice files corresponding to the same user keep accumulating over time, for an existing user the newly added historical voice files can be acquired and the above steps performed, thereby adding new reference voiceprint information for that user.
Based on the voiceprint information management method provided by the embodiments of the present application, one or more pieces of reference voiceprint information may be set for the first user. When multiple pieces of reference voiceprint information are set for the same first user, it must be ensured that the text information in any two pieces of reference voiceprint information corresponding to that user is different. In practice, however, the following situations are unavoidable: different historical voice files yield text information with identical content, or the same text information is divided into multiple pieces of sub-text information with identical content, so that the same sub-text information corresponds to multiple pieces of sub-voice information. In such cases, this embodiment of the present application completes the storage of the reference voiceprint information using the method shown in FIG. 3. For ease of description, assume that the reference voiceprint information to be stored is first reference voiceprint information consisting of first text information and first voice information. As shown in FIG. 3, the process of storing the first reference voiceprint information in this embodiment comprises the following steps.
S31. Determine whether there is second reference voiceprint information that satisfies the comparison condition; if it exists, perform step S32, otherwise perform step S34.
The comparison condition includes: the second text information corresponding to the second reference voiceprint information is the same as the first text information in the first reference voiceprint information, and the second identity identifier corresponding to the second reference voiceprint information is also the same as the first identity identifier corresponding to the first reference voiceprint information.
S32. Determine whether the quality of the first voice information in the first reference voiceprint information is higher than the quality of the second voice information in the second reference voiceprint information; if so, perform step S33, otherwise perform step S35.
S33. Delete the second reference voiceprint information, and perform step S34.
S34. Store the first reference voiceprint information and the corresponding first identity identifier.
S35. Delete the first reference voiceprint information.
In step S31, when determining whether the second reference voiceprint information exists, the search scope includes at least the reference voiceprint information already stored in the voiceprint library, and may also include reference voiceprint information generated together with the first reference voiceprint information but not yet stored. If the second reference voiceprint information does not exist, the first reference voiceprint information is stored directly. If the second reference voiceprint information is found, it means that the same first user and the same text information correspond to at least two different pieces of voice information; in this case, the quality of the first voice information in the first reference voiceprint information is compared with the quality of the second voice information in the second reference voiceprint information. If the quality of the first voice information is higher than that of the second voice information, the first reference voiceprint information is stored and the second reference voiceprint information is deleted; if the quality of the first voice information is lower than that of the second voice information, the first reference voiceprint information is deleted directly. In other words, for the same text information only the voice information of the highest quality is retained, so as to improve the accuracy of the voice information comparison during identity authentication and reduce the difficulty of the comparison.
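A minimal sketch of steps S31-S35 follows. It assumes that the quality of a piece of voice information has already been summarized as a single score (for example, a signal-to-noise estimate); that metric and the record layout are assumptions made for the illustration.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class ReferenceVoiceprint:
    user_id: str      # identity identifier
    text_info: str    # text information
    voice_file: str   # voice information (file name)
    quality: float    # assumed per-segment quality score, e.g. an SNR estimate

def store_reference_voiceprint(new: ReferenceVoiceprint,
                               library: List[ReferenceVoiceprint]) -> None:
    """Steps S31-S35: keep, per (identity identifier, text information), only
    the reference voiceprint whose voice information has the highest quality."""
    existing: Optional[ReferenceVoiceprint] = next(
        (r for r in library
         if r.user_id == new.user_id and r.text_info == new.text_info),
        None)
    if existing is None:                  # S31 -> S34: no duplicate, store directly
        library.append(new)
    elif new.quality > existing.quality:  # S32 -> S33 -> S34: replace the lower-quality record
        library.remove(existing)
        library.append(new)
    # else S35: discard the new, lower-quality reference voiceprint information
```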
Based on the above storage process, the following three ways of updating the voiceprint library can be implemented: 1) adding reference voiceprint information for a new user; 2) adding, for an existing user, reference voiceprint information whose text information differs from what is already stored; and 3) replacing reference voiceprint information whose voice information is of lower quality with reference voiceprint information whose voice information is of higher quality.
It can be seen from the above technical solution that, in this embodiment of the present application, a newly obtained piece of reference voiceprint information is not stored into the voiceprint library directly; instead, it is first determined whether another piece of reference voiceprint information with the same text information and the same identity identifier is already stored. If so, the quality of the voice information in the two pieces of reference voiceprint information is compared, the piece with the higher-quality voice information is retained, and the piece with the lower-quality voice information is deleted. Therefore, this embodiment not only ensures that, among the stored reference voiceprint information, the text information in any two pieces corresponding to the same identity identifier (that is, the same first user) is different, but also ensures that the voice information corresponding to each piece of text information has the highest quality. When identity authentication is performed based on this embodiment, the voiceprint comparison is carried out on higher-quality voice information, which ensures the accuracy of the authentication and improves authentication efficiency.
FIG. 4 is a structural block diagram of a voiceprint information management system provided by an embodiment of the present application; the voiceprint information management system can be applied to an account management system. As shown in FIG. 4, the voiceprint information management system 100 comprises a voice filter 110, a text recognizer 120 and a voiceprint generator 130.
The voice filter 110 is configured to acquire a historical voice file generated by a call between a first user and a second user, and perform filtering processing on the historical voice file to obtain voice information of the first user.
The text recognizer 120 is configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
The voiceprint generator 130 is configured to edit the voice information and the corresponding text information into reference voiceprint information of the first user, and store the reference voiceprint information and an identity identifier of the first user.
It can be seen from the above structure that this embodiment of the present application obtains the voice information of the first user by filtering the historical voice file stored by the related system, obtains the text information corresponding to that voice information through text recognition processing, and edits the voice information and the corresponding text information into the reference voiceprint information of the first user. Because the text information and the voice information in the reference voiceprint information are both obtained from the historical voice file rather than preset by the related system, that is, they are not public, neither the first user, nor the second user, nor any other user can predict the specific content of the text information that must be read back when identity authentication is executed; a corresponding audio file therefore cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded audio file. Accordingly, compared with the existing voiceprint-recognition-based identity authentication method, identity authentication based on the voiceprint information management method provided by this embodiment yields a more accurate authentication result, has no such potential safety hazard, and provides higher account security.
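One way the three components of system 100 might be wired together is sketched below, mirroring the flow of FIG. 1 at the component level. The class interfaces and placeholder bodies are assumptions for illustration, not the application's actual implementation.

```python
class VoiceFilter:                     # cf. voice filter 110
    def filter(self, history_file: str) -> str:
        # Placeholder: return a file containing only the first user's voice information.
        return history_file.replace(".wav", "_user.wav")

class TextRecognizer:                  # cf. text recognizer 120
    def recognize(self, voice_file: str) -> str:
        # Placeholder for the text recognition processing step.
        return "why is there no refund"

class VoiceprintGenerator:             # cf. voiceprint generator 130
    def __init__(self) -> None:
        self.library: dict = {}        # user ID -> list of (text information, voice information)

    def generate(self, user_id: str, voice_file: str, text_info: str) -> None:
        self.library.setdefault(user_id, []).append((text_info, voice_file))

class VoiceprintManagementSystem:      # cf. system 100
    def __init__(self) -> None:
        self.voice_filter = VoiceFilter()
        self.text_recognizer = TextRecognizer()
        self.voiceprint_generator = VoiceprintGenerator()

    def process(self, history_file: str, user_id: str) -> None:
        voice = self.voice_filter.filter(history_file)
        text = self.text_recognizer.recognize(voice)
        self.voiceprint_generator.generate(user_id, voice, text)

VoiceprintManagementSystem().process("call_20150930.wav", "139XXXXXXXX")
```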
FIG. 5 is a structural block diagram of another voiceprint information management system provided by an embodiment of the present application; the voiceprint information management system can be applied to an account management system. As shown in FIG. 5, the voiceprint information management system 200 comprises a voice filter 210, a text recognizer 220, a text cutter 240, a voiceprint cutter 250 and a voiceprint generator 230.
The voice filter 210 is configured to acquire a historical voice file generated by a call between a first user and a second user, and perform filtering processing on the historical voice file to obtain voice information of the first user.
The text recognizer 220 is configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
The text cutter 240 is configured to divide the text information into a plurality of pieces of sub-text information and mark the start and end time of each piece of sub-text information.
The voiceprint cutter 250 is configured to intercept, from the voice information according to the start and end times of the sub-text information, the sub-voice information corresponding to each piece of sub-text information.
The voiceprint generator 230 is configured to edit each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user, and store each piece of reference voiceprint information and the identity identifier of the first user.
It can be seen from the above structure that this embodiment of the present application obtains the voice information of the first user by filtering the historical voice files stored in the system, obtains the corresponding text information by performing text recognition processing on that voice information, divides the recognized text information into a plurality of pieces of sub-text information, intercepts the corresponding sub-voice information from the voice information according to the start and end time of each piece of sub-text information, edits each pair of sub-text information and sub-voice information into one piece of reference voiceprint information, and stores it in the voiceprint library, so that each first user has multiple pieces of reference voiceprint information. When identity authentication needs to be performed, one piece is randomly selected from the multiple pieces of reference voiceprint information corresponding to the identity identifier to be authenticated. Because the reference voiceprint information acquired during identity authentication is random, the specific content of the text information that the user to be authenticated will be required to read back cannot be predicted, so a corresponding audio file cannot be recorded in advance and authentication cannot be passed by playing a pre-recorded audio file; performing identity authentication with the voiceprint library obtained in this embodiment can therefore ensure the accuracy of the authentication result and improve the security of the account. In addition, in this embodiment the sub-text information in each piece of reference voiceprint information is short, which reduces the time needed to read back the text information, reduces the time consumed by the voiceprint comparison, and improves authentication efficiency.
本申请实施例提供的声纹信息管理系统中,为实现存储所述基准声纹信息和所述第一用户的身份标识符的功能,上述声纹生成器130及声纹生成器230可以被配置为:In the voiceprint information management system provided by the embodiment of the present application, in order to implement the function of storing the reference voiceprint information and the identity identifier of the first user, the voiceprint generator 130 and the voiceprint generator 230 may be configured. for:
determine whether there exists second reference voiceprint information whose corresponding second text information is identical to the first text information in the first reference voiceprint information to be stored, and whose corresponding second identity identifier is also identical to the first identity identifier corresponding to the first reference voiceprint information;
if no such second reference voiceprint information exists, directly store the first reference voiceprint information and the first identity identifier;
if the second reference voiceprint information exists, compare the quality of the first voice information in the first reference voiceprint information with that of the second voice information in the second reference voiceprint information; if the quality of the first voice information is lower than that of the second voice information, directly delete the first reference voiceprint information;
if the quality of the first voice information is higher than that of the second voice information, delete the second reference voiceprint information, and store the first reference voiceprint information and the first identity identifier.
With a voiceprint generator configured as above, this embodiment not only ensures that, within the stored reference voiceprint information, the text information in any two pieces of reference voiceprint information corresponding to the same user is different, but also ensures that the voice information stored for each kind of text information is of the highest quality. When identity authentication is performed based on the embodiments of the present application, the voiceprint comparison is therefore carried out against higher-quality voice information, which ensures the accuracy of authentication and improves authentication efficiency.
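As a rough sketch of this storage rule, the snippet below keeps, for each (identity identifier, text) pair, only the candidate whose voice quality scores higher. The dictionary-based store and the `quality()` scoring callback are assumptions for illustration; the embodiments do not prescribe a particular quality measure or storage backend.

```python
from typing import Callable, Dict, Tuple

# voiceprint store: (identity_id, text) -> voice bytes
VoiceprintStore = Dict[Tuple[str, str], bytes]

def store_reference_voiceprint(store: VoiceprintStore,
                               identity_id: str,
                               text: str,
                               voice: bytes,
                               quality: Callable[[bytes], float]) -> None:
    """Store a (text, voice) pair, keeping only the higher-quality voice when
    the same user already has a reference voiceprint with the same text."""
    key = (identity_id, text)
    existing = store.get(key)
    if existing is None:
        store[key] = voice          # no duplicate text for this user: store directly
    elif quality(voice) > quality(existing):
        store[key] = voice          # replace the lower-quality recording
    # otherwise the new, lower-quality candidate is discarded
```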
FIG. 6 is a flowchart of an identity authentication method according to an embodiment of the present application; the identity authentication method may be applied to an account management system. Referring to FIG. 6, the identity authentication method includes the following steps.
S41. Acquire a historical voice file generated by a call between a first user and a second user.
Here, the first user may be a registered user who has a corresponding private account in the account management system; correspondingly, the second user may be service personnel of the account management system.
S42. Perform filtering processing on the historical voice file to obtain voice information of the first user.
S43. Perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
S44. Edit the text information and the corresponding voice information into reference voiceprint information of the first user, and store the reference voiceprint information and the identity identifier of the first user.
S45. Acquire reference voiceprint information corresponding to the identity identifier of a user to be authenticated.
S46. Output the text information in the acquired reference voiceprint information, and receive corresponding voice information to be authenticated.
S47. Match the voice information in the acquired reference voiceprint information against the voice information to be authenticated; if the matching succeeds, determine that the user to be authenticated is authenticated successfully; if the matching fails, determine that authentication of the user to be authenticated fails.
As can be seen from the above method, this embodiment of the present application obtains the voice information of the first user by filtering a historical voice file stored in the relevant system, obtains the text information corresponding to that voice information through text recognition, and edits the voice information and the corresponding text information into the reference voiceprint information of the first user. Because both the text information and the voice information in the reference voiceprint information are derived from the historical voice file rather than preset by the relevant system, they are non-public; neither the first user, nor the second user, nor any other user can predict the specific content of the text information that must be read aloud during identity authentication, so the corresponding sound file cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded sound file. Therefore, compared with existing voiceprint-based identity authentication approaches, performing identity authentication based on the voiceprint information management method provided by the embodiments of the present application yields a more accurate authentication result, avoids this security risk, and makes the account more secure.
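Steps S45 to S47 can be pictured with the following sketch, assuming the reference voiceprints built above are already stored per identity identifier. The `prompt_user` helper (which would display or play the text and capture the user's spoken reply), the `voiceprint_match` scoring function, and the acceptance threshold are hypothetical placeholders added for this illustration.

```python
import random
from typing import Callable, Dict, List, Tuple

# per-user reference voiceprints: identity_id -> list of (text, voice) pairs
ReferenceStore = Dict[str, List[Tuple[str, bytes]]]

def authenticate(identity_id: str,
                 store: ReferenceStore,
                 prompt_user: Callable[[str], bytes],
                 voiceprint_match: Callable[[bytes, bytes], float],
                 threshold: float = 0.8) -> bool:
    """S45: fetch a reference voiceprint for the claimed identity;
    S46: output its text and collect the spoken reply;
    S47: match the reply against the reference voice."""
    references = store.get(identity_id)
    if not references:
        return False
    text, reference_voice = random.choice(references)  # random pick defeats replay
    candidate_voice = prompt_user(text)
    return voiceprint_match(reference_voice, candidate_voice) >= threshold
```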
FIG. 7 is a flowchart of another identity authentication method according to an embodiment of the present application; the identity authentication method may be applied to an account management system. Referring to FIG. 7, the identity authentication method includes the following steps.
S51. Acquire a historical voice file generated by a call between a first user and a second user.
S52. Perform filtering processing on the historical voice file to obtain voice information of the first user.
S53. Perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
S54. Split the text information into multiple pieces of sub-text information, and mark the start and end times of each piece of sub-text information.
S55. Extract, from the voice information, the sub-voice information corresponding to each piece of sub-text information according to the start and end times of that sub-text information.
S56. Edit each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user, and store each piece of reference voiceprint information and the identity identifier of the first user.
S57. Acquire reference voiceprint information corresponding to the identity identifier of a user to be authenticated.
S58. Output the sub-text information in the acquired reference voiceprint information, and receive corresponding voice information to be authenticated.
S59. Match the sub-voice information in the acquired reference voiceprint information against the voice information to be authenticated; if the matching succeeds, determine that the user to be authenticated is authenticated successfully; if the matching fails, determine that authentication of the user to be authenticated fails.
As can be seen from the above method, this embodiment of the present application splits the recognized text information into multiple pieces of sub-text information, extracts the corresponding sub-voice information according to their start and end times, and edits each piece of sub-text information and the corresponding sub-voice information into one piece of reference voiceprint information, so that the first user has multiple pieces of reference voiceprint information. When identity authentication needs to be performed, one piece is randomly selected from the multiple pieces of reference voiceprint information corresponding to the identity identifier to be authenticated. Because the reference voiceprint information obtained during authentication is random, the specific content of the text information that the user to be authenticated will be asked to read aloud cannot be predicted, so the corresponding sound file cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded sound file. Therefore, the identity authentication method provided in this embodiment ensures the accuracy of the authentication result and improves account security. In addition, in this embodiment the sub-text information in each piece of reference voiceprint information is short, which reduces the time needed to read the text aloud, reduces the time consumed by voiceprint comparison, and improves authentication efficiency.
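One possible way to picture the text splitting and start/end marking of S54 is sketched below: word-level recognition results (a hypothetical `Word` structure with per-word timestamps, which many speech recognizers can emit) are grouped into short sub-texts at sentence-ending punctuation. The segmentation rule is an assumption made for this illustration; the embodiments do not prescribe one.

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class Word:
    text: str
    start: float  # seconds
    end: float    # seconds

def split_into_subtexts(words: List[Word],
                        breakers: str = "。！？.!?") -> List[Tuple[str, float, float]]:
    """Group recognized words into short sub-texts and mark each one's
    start and end time, so the matching sub-voice can be cut out later."""
    subtexts, buffer = [], []
    for word in words:
        buffer.append(word)
        if word.text and word.text[-1] in breakers:
            subtexts.append(("".join(w.text for w in buffer),
                             buffer[0].start, buffer[-1].end))
            buffer = []
    if buffer:  # trailing words without a closing punctuation mark
        subtexts.append(("".join(w.text for w in buffer),
                         buffer[0].start, buffer[-1].end))
    return subtexts
```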
The identity authentication method provided by the embodiments of the present application may also store the reference voiceprint information using the method shown in FIG. 3, which not only ensures that, within the stored reference voiceprint information, the text information in any two pieces of reference voiceprint information corresponding to the same user is different, but also ensures that the voice information stored for each kind of text information is of the highest quality. When identity authentication is performed based on the embodiments of the present application, the voiceprint comparison is carried out against higher-quality voice information, which ensures the accuracy of authentication and improves authentication efficiency.
FIG. 8 is a structural block diagram of an identity authentication system according to an embodiment of the present application; the identity authentication system may be applied to an account management system. Referring to FIG. 8, the identity authentication system 300 includes: a voice filter 310, a text recognizer 320, a voiceprint generator 330, a voiceprint extractor 360, a recognition front end 370, and a voiceprint matcher 380.
The voice filter 310 is configured to acquire a historical voice file generated by a call between the first user and the second user, and to perform filtering processing on the historical voice file to obtain voice information of the first user.
The text recognizer 320 is configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
The voiceprint generator 330 is configured to edit the voice information and the corresponding text information into reference voiceprint information of the first user, and to store the reference voiceprint information and the identity identifier of the first user.
The voiceprint extractor 360 is configured to acquire reference voiceprint information corresponding to the identity identifier of a user to be authenticated.
The recognition front end 370 is configured to output the text information in the acquired reference voiceprint information, and to receive corresponding voice information to be authenticated.
The voiceprint matcher 380 is configured to match the voice information in the acquired reference voiceprint information against the voice information to be authenticated; if the matching succeeds, it determines that the user to be authenticated is authenticated successfully; if the matching fails, it determines that authentication of the user to be authenticated fails.
In the above structure, the recognition front end 370 implements the interaction between the identity authentication system and the user to be authenticated. Besides outputting the text information in the reference voiceprint information acquired by the voiceprint extractor 360 and receiving the voice information to be authenticated entered by the user, it can also receive the identity authentication request of the user to be authenticated, trigger the voiceprint extractor 360 after receiving that request, and output to the user the authentication result obtained by the voiceprint matcher 380.
As can be seen from the above structure, this embodiment of the present application obtains the voice information of the first user by filtering a historical voice file stored in the relevant system, obtains the text information corresponding to that voice information through text recognition, and edits the voice information and the corresponding text information into the reference voiceprint information of the first user. Because both the text information and the voice information in the reference voiceprint information are derived from the historical voice file rather than preset by the relevant system, they are non-public; neither the first user, nor the second user, nor any other user can predict the specific content of the text information that must be read aloud during identity authentication, so the corresponding sound file cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded sound file. Therefore, compared with existing voiceprint-based identity authentication approaches, performing identity authentication based on the voiceprint information management method provided by the embodiments of the present application yields a more accurate authentication result, avoids this security risk, and makes the account more secure.
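The description does not pin down how the voice filter separates the first user's speech from the call. As a minimal sketch, assuming the historical voice file is a two-channel recording with the first user (the customer) on a known channel, the filter could simply keep that channel:

```python
import wave

def extract_first_user_channel(history_file: str, user_channel: int = 0) -> bytes:
    """Minimal filtering sketch: keep only the channel assumed to carry the
    first user's voice from a stereo call recording (PCM WAV assumed)."""
    with wave.open(history_file, "rb") as wav:
        n_channels = wav.getnchannels()
        sample_width = wav.getsampwidth()   # bytes per sample
        frames = wav.readframes(wav.getnframes())
    if n_channels == 1:
        return frames                       # already single-channel
    out = bytearray()
    frame_size = n_channels * sample_width
    offset = user_channel * sample_width
    for i in range(0, len(frames), frame_size):
        out += frames[i + offset:i + offset + sample_width]
    return bytes(out)
```

In a single-channel recording, separating the two speakers would instead require a diarization step, which is outside the scope of this sketch.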
FIG. 9 is a structural block diagram of another identity authentication system according to an embodiment of the present application; the identity authentication system may be applied to an account management system. Referring to FIG. 9, the identity authentication system 400 includes: a voice filter 410, a text recognizer 420, a text cutter 440, a voiceprint cutter 450, a voiceprint generator 430, a voiceprint extractor 460, a recognition front end 470, and a voiceprint matcher 480.
The voice filter 410 is configured to acquire a historical voice file generated by a call between the first user and the second user, and to perform filtering processing on the historical voice file to obtain voice information of the first user.
The text recognizer 420 is configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information.
The text cutter 440 is configured to split the text information into multiple pieces of sub-text information and to mark the start and end times of each piece of sub-text information.
The voiceprint cutter 450 is configured to extract, from the voice information, the sub-voice information corresponding to each piece of sub-text information according to the start and end times of that sub-text information.
The voiceprint generator 430 is configured to edit each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user, and to store each piece of reference voiceprint information and the identity identifier of the first user.
The voiceprint extractor 460 is configured to acquire reference voiceprint information corresponding to the identity identifier of a user to be authenticated.
The recognition front end 470 is configured to output the sub-text information in the acquired reference voiceprint information, and to receive corresponding voice information to be authenticated.
The voiceprint matcher 480 is configured to match the sub-voice information in the acquired reference voiceprint information against the voice information to be authenticated; if the matching succeeds, it determines that the user to be authenticated is authenticated successfully; if the matching fails, it determines that authentication of the user to be authenticated fails.
As can be seen from the above structure, this embodiment of the present application splits the recognized text information into multiple pieces of sub-text information, extracts the corresponding sub-voice information according to their start and end times, and edits each piece of sub-text information and the corresponding sub-voice information into one piece of reference voiceprint information, so that the first user has multiple pieces of reference voiceprint information. When identity authentication needs to be performed, the multiple pieces of reference voiceprint information corresponding to the identity identifier of the user to be authenticated are determined, and one of them is randomly selected for this authentication. Because the reference voiceprint information obtained during authentication is random, the specific content of the text information that the user to be authenticated will be asked to read aloud cannot be predicted, so the corresponding sound file cannot be recorded in advance, and authentication cannot be passed by playing a pre-recorded sound file. Therefore, the identity authentication system provided in this embodiment ensures the accuracy of the authentication result and improves account security. In addition, in this embodiment the sub-text information in each piece of reference voiceprint information is short, which reduces the time needed to read the text aloud, reduces the time consumed by voiceprint comparison, and improves authentication efficiency.
In the identity authentication system provided by the embodiments of the present application, in order to implement the function of storing the reference voiceprint information and the corresponding user identity identifier, the voiceprint generator 330 and the voiceprint generator 430 may be configured to:
determine whether there exists second reference voiceprint information whose corresponding second text information is identical to the first text information in the first reference voiceprint information to be stored, and whose corresponding second identity identifier is also identical to the first identity identifier corresponding to the first reference voiceprint information;
if no such second reference voiceprint information exists, directly store the first reference voiceprint information and the identity identifier of the first user;
if the second reference voiceprint information exists, compare the quality of the first voice information in the first reference voiceprint information with that of the second voice information in the second reference voiceprint information; if the quality of the first voice information is lower than that of the second voice information, delete the first reference voiceprint information;
if the quality of the first voice information is higher than that of the second voice information, delete the second reference voiceprint information, and store the first reference voiceprint information and the corresponding user identity identifier.
With a voiceprint generator configured as above, this embodiment not only ensures that, within the stored reference voiceprint information, the text information in any two pieces of reference voiceprint information corresponding to the same identity identifier is different, but also ensures that the voice information stored for each kind of text information is of the highest quality. When identity authentication is performed based on the embodiments of the present application, the voiceprint comparison is carried out against higher-quality voice information, which ensures the accuracy of authentication and improves authentication efficiency.
Other embodiments of the invention will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. The present application is intended to cover any variations, uses, or adaptations of the invention that follow its general principles and include common knowledge or customary technical means in the art not disclosed herein. The specification and embodiments are to be regarded as exemplary only, with the true scope and spirit of the invention being indicated by the following claims.
It should be understood that the invention is not limited to the precise structures described above and shown in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the invention is limited only by the appended claims.

Claims (16)

  1. A voiceprint information management method, comprising:
    acquiring a historical voice file generated by a call between a first user and a second user;
    performing filtering processing on the historical voice file to obtain voice information of the first user;
    performing text recognition processing on the voice information to obtain text information corresponding to the voice information;
    editing the voice information and the corresponding text information into reference voiceprint information of the first user, and storing the reference voiceprint information and an identity identifier of the first user.
  2. The voiceprint information management method according to claim 1, further comprising:
    splitting the text information into multiple pieces of sub-text information, and marking a start and end time of each piece of sub-text information;
    extracting, from the voice information, sub-voice information corresponding to each piece of sub-text information according to the start and end time of the sub-text information.
  3. The voiceprint information management method according to claim 2, wherein editing the voice information and the corresponding text information into the reference voiceprint information of the first user comprises:
    editing each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
  4. The voiceprint information management method according to claim 1, wherein storing the reference voiceprint information and the identity identifier of the first user comprises:
    determining whether there exists second reference voiceprint information whose corresponding second text information is identical to first text information in first reference voiceprint information to be stored, and whose corresponding second identity identifier is also identical to a first identity identifier corresponding to the first reference voiceprint information;
    if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
    if the second reference voiceprint information exists, comparing quality of first voice information in the first reference voiceprint information with quality of second voice information in the second reference voiceprint information; if the quality of the first voice information is lower than that of the second voice information, deleting the first reference voiceprint information;
    if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
  5. A voiceprint information management system, comprising:
    a voice filter, configured to acquire a historical voice file generated by a call between a first user and a second user, and to perform filtering processing on the historical voice file to obtain voice information of the first user;
    a text recognizer, configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information;
    a voiceprint generator, configured to edit the voice information and the corresponding text information into reference voiceprint information of the first user, and to store the reference voiceprint information and an identity identifier of the first user.
  6. The voiceprint information management system according to claim 5, further comprising:
    a text cutter, configured to split the text information into multiple pieces of sub-text information and to mark a start and end time of each piece of sub-text information;
    a voiceprint cutter, configured to extract, from the voice information, sub-voice information corresponding to each piece of sub-text information according to the start and end time of the sub-text information.
  7. The voiceprint information management system according to claim 6, wherein the voiceprint generator editing the voice information and the corresponding text information into the reference voiceprint information of the first user comprises:
    editing each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
  8. The voiceprint information management system according to claim 5, wherein the voiceprint generator storing the reference voiceprint information and the identity identifier of the first user comprises:
    determining whether there exists second reference voiceprint information whose corresponding second text information is identical to first text information in first reference voiceprint information to be stored, and whose corresponding second identity identifier is also identical to a first identity identifier corresponding to the first reference voiceprint information;
    if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
    if the second reference voiceprint information exists, comparing quality of first voice information in the first reference voiceprint information with quality of second voice information in the second reference voiceprint information; if the quality of the first voice information is lower than that of the second voice information, deleting the first reference voiceprint information;
    if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
  9. An identity authentication method, comprising:
    acquiring a historical voice file generated by a call between a first user and a second user;
    performing filtering processing on the historical voice file to obtain voice information of the first user;
    performing text recognition processing on the voice information to obtain text information corresponding to the voice information;
    editing the voice information and the corresponding text information into reference voiceprint information of the first user, and storing the reference voiceprint information and an identity identifier of the first user;
    acquiring reference voiceprint information corresponding to an identity identifier of a user to be authenticated;
    outputting the text information in the acquired reference voiceprint information, and receiving corresponding voice information to be authenticated;
    matching the voice information in the acquired reference voiceprint information against the voice information to be authenticated; if the matching succeeds, determining that the user to be authenticated is authenticated successfully; if the matching fails, determining that authentication of the user to be authenticated fails.
  10. The identity authentication method according to claim 9, further comprising:
    splitting the text information into multiple pieces of sub-text information, and marking a start and end time of each piece of sub-text information;
    extracting, from the voice information, sub-voice information corresponding to each piece of sub-text information according to the start and end time of the sub-text information.
  11. The identity authentication method according to claim 10, wherein editing the voice information and the corresponding text information into the reference voiceprint information of the first user comprises:
    editing each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
  12. The identity authentication method according to claim 9, wherein storing the reference voiceprint information and the identity identifier of the first user comprises:
    determining whether there exists second reference voiceprint information whose corresponding second text information is identical to first text information in first reference voiceprint information to be stored, and whose corresponding second identity identifier is also identical to a first identity identifier corresponding to the first reference voiceprint information;
    if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
    if the second reference voiceprint information exists, comparing quality of first voice information in the first reference voiceprint information with quality of second voice information in the second reference voiceprint information; if the quality of the first voice information is lower than that of the second voice information, deleting the first reference voiceprint information;
    if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
  13. An identity authentication system, comprising:
    a voice filter, configured to acquire a historical voice file generated by a call between a first user and a second user, and to perform filtering processing on the historical voice file to obtain voice information of the first user;
    a text recognizer, configured to perform text recognition processing on the voice information to obtain text information corresponding to the voice information;
    a voiceprint generator, configured to edit the voice information and the corresponding text information into reference voiceprint information of the first user, and to store the reference voiceprint information and an identity identifier of the first user;
    a voiceprint extractor, configured to acquire reference voiceprint information corresponding to an identity identifier of a user to be authenticated;
    a recognition front end, configured to output the text information in the acquired reference voiceprint information, and to receive corresponding voice information to be authenticated;
    a voiceprint matcher, configured to match the voice information in the acquired reference voiceprint information against the voice information to be authenticated; if the matching succeeds, to determine that the user to be authenticated is authenticated successfully; and if the matching fails, to determine that authentication of the user to be authenticated fails.
  14. The identity authentication system according to claim 13, further comprising:
    a text cutter, configured to split the text information into multiple pieces of sub-text information and to mark a start and end time of each piece of sub-text information;
    a voiceprint cutter, configured to extract, from the voice information, sub-voice information corresponding to each piece of sub-text information according to the start and end time of the sub-text information.
  15. The identity authentication system according to claim 14, wherein the voiceprint generator editing the voice information and the corresponding text information into the reference voiceprint information of the first user comprises:
    editing each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
  16. The identity authentication system according to claim 13, wherein the voiceprint generator storing the reference voiceprint information and the identity identifier of the first user comprises:
    determining whether there exists second reference voiceprint information whose corresponding second text information is identical to first text information in first reference voiceprint information to be stored, and whose corresponding second identity identifier is also identical to a first identity identifier corresponding to the first reference voiceprint information;
    if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
    if the second reference voiceprint information exists, comparing quality of first voice information in the first reference voiceprint information with quality of second voice information in the second reference voiceprint information; if the quality of the first voice information is lower than that of the second voice information, deleting the first reference voiceprint information;
    if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information, and storing the first reference voiceprint information and the first identity identifier.
PCT/CN2015/091260 2014-10-10 2015-09-30 Voiceprint information management method and device as well as identity authentication method and system WO2016054991A1 (en)

Priority Applications (5)

Application Number Priority Date Filing Date Title
JP2017518071A JP6671356B2 (en) 2014-10-10 2015-09-30 Voiceprint information management method and voiceprint information management apparatus, and personal authentication method and personal authentication system
EP15848463.4A EP3206205B1 (en) 2014-10-10 2015-09-30 Voiceprint information management method and device as well as identity authentication method and system
KR1020177012683A KR20170069258A (en) 2014-10-10 2015-09-30 Voiceprint information management method and device as well as identity authentication method and system
SG11201702919UA SG11201702919UA (en) 2014-10-10 2015-09-30 Voiceprint information management method and apparatus, and identity authentication method and system
US15/484,082 US10593334B2 (en) 2014-10-10 2017-04-10 Method and apparatus for generating voiceprint information comprised of reference pieces each used for authentication

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410532530.0A CN105575391B (en) 2014-10-10 2014-10-10 Voiceprint information management method and device and identity authentication method and system
CN201410532530.0 2014-10-10

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/484,082 Continuation US10593334B2 (en) 2014-10-10 2017-04-10 Method and apparatus for generating voiceprint information comprised of reference pieces each used for authentication

Publications (1)

Publication Number Publication Date
WO2016054991A1 true WO2016054991A1 (en) 2016-04-14

Family

ID=55652587

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/091260 WO2016054991A1 (en) 2014-10-10 2015-09-30 Voiceprint information management method and device as well as identity authentication method and system

Country Status (8)

Country Link
US (1) US10593334B2 (en)
EP (1) EP3206205B1 (en)
JP (1) JP6671356B2 (en)
KR (1) KR20170069258A (en)
CN (1) CN105575391B (en)
HK (1) HK1224074A1 (en)
SG (2) SG11201702919UA (en)
WO (1) WO2016054991A1 (en)

Families Citing this family (28)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106156583A (en) * 2016-06-03 2016-11-23 深圳市金立通信设备有限公司 A kind of method of speech unlocking and terminal
CN106549947A (en) * 2016-10-19 2017-03-29 陆腾蛟 A kind of voiceprint authentication method and system of immediate updating
CN106782564B (en) * 2016-11-18 2018-09-11 百度在线网络技术(北京)有限公司 Method and apparatus for handling voice data
US10592649B2 (en) 2017-08-09 2020-03-17 Nice Ltd. Authentication via a dynamic passphrase
CN107564531A (en) * 2017-08-25 2018-01-09 百度在线网络技术(北京)有限公司 Minutes method, apparatus and computer equipment based on vocal print feature
US10490195B1 (en) * 2017-09-26 2019-11-26 Amazon Technologies, Inc. Using system command utterances to generate a speaker profile
CN107863108B (en) * 2017-11-16 2021-03-23 百度在线网络技术(北京)有限公司 Information output method and device
CN108121210A (en) * 2017-11-20 2018-06-05 珠海格力电器股份有限公司 Permission distribution method and device for household appliance, storage medium and processor
CN108257604B (en) * 2017-12-08 2021-01-08 平安普惠企业管理有限公司 Speech recognition method, terminal device and computer-readable storage medium
CN107871236B (en) * 2017-12-26 2021-05-07 广州势必可赢网络科技有限公司 Electronic equipment voiceprint payment method and device
KR102483834B1 (en) * 2018-01-17 2023-01-03 삼성전자주식회사 Method for authenticating user based on voice command and electronic dvice thereof
CN111177329A (en) * 2018-11-13 2020-05-19 奇酷互联网络科技(深圳)有限公司 User interaction method of intelligent terminal, intelligent terminal and storage medium
CN111292733A (en) * 2018-12-06 2020-06-16 阿里巴巴集团控股有限公司 Voice interaction method and device
CN110660398B (en) * 2019-09-19 2020-11-20 北京三快在线科技有限公司 Voiceprint feature updating method and device, computer equipment and storage medium
CN112580390B (en) * 2019-09-27 2023-10-17 百度在线网络技术(北京)有限公司 Security monitoring method and device based on intelligent sound box, sound box and medium
CN110970036B (en) * 2019-12-24 2022-07-12 网易(杭州)网络有限公司 Voiceprint recognition method and device, computer storage medium and electronic equipment
US11516197B2 (en) * 2020-04-30 2022-11-29 Capital One Services, Llc Techniques to provide sensitive information over a voice connection
CN111785280B (en) * 2020-06-10 2024-09-10 北京三快在线科技有限公司 Identity authentication method and device, storage medium and electronic equipment
US11817113B2 (en) 2020-09-09 2023-11-14 Rovi Guides, Inc. Systems and methods for filtering unwanted sounds from a conference call
US11450334B2 (en) * 2020-09-09 2022-09-20 Rovi Guides, Inc. Systems and methods for filtering unwanted sounds from a conference call using voice synthesis
US12008091B2 (en) * 2020-09-11 2024-06-11 Cisco Technology, Inc. Single input voice authentication
US11522994B2 (en) 2020-11-23 2022-12-06 Bank Of America Corporation Voice analysis platform for voiceprint tracking and anomaly detection
CN112565242B (en) * 2020-12-02 2023-04-07 携程计算机技术(上海)有限公司 Remote authorization method, system, equipment and storage medium based on voiceprint recognition
US12020711B2 (en) * 2021-02-03 2024-06-25 Nice Ltd. System and method for detecting fraudsters
US20240054235A1 (en) * 2022-08-15 2024-02-15 Bank Of America Corporation Systems and methods for encrypting dialogue based data in a data storage system
CN115426632A (en) * 2022-08-30 2022-12-02 上汽通用五菱汽车股份有限公司 Voice transmission method, device, vehicle-mounted host and storage medium
CN115565539B (en) * 2022-11-21 2023-02-07 中网道科技集团股份有限公司 Data processing method for realizing self-help correction terminal anti-counterfeiting identity verification
CN117059092B (en) * 2023-10-11 2024-06-04 深圳普一同创科技有限公司 Intelligent medical interactive intelligent diagnosis method and system based on blockchain

Family Cites Families (27)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JPH11344992A (en) * 1998-06-01 1999-12-14 Ntt Data Corp Voice dictionary creating method, personal authentication device and record medium
US20040236699A1 (en) 2001-07-10 2004-11-25 American Express Travel Related Services Company, Inc. Method and system for hand geometry recognition biometrics on a fob
IL154733A0 (en) 2003-03-04 2003-10-31 Financial transaction authorization apparatus and method
WO2005013263A1 (en) 2003-07-31 2005-02-10 Fujitsu Limited Voice authentication system
US7386448B1 (en) 2004-06-24 2008-06-10 T-Netix, Inc. Biometric voice authentication
US8014496B2 (en) 2004-07-28 2011-09-06 Verizon Business Global Llc Systems and methods for providing network-based voice authentication
US7536304B2 (en) 2005-05-27 2009-05-19 Porticus, Inc. Method and system for bio-metric voice print authentication
US20060277043A1 (en) 2005-06-06 2006-12-07 Edward Tomes Voice authentication system and methods therefor
CN101228770B (en) * 2005-07-27 2011-12-14 国际商业机器公司 Systems and method for secure delivery of files to authorized recipients
JP4466572B2 (en) * 2006-01-16 2010-05-26 コニカミノルタビジネステクノロジーズ株式会社 Image forming apparatus, voice command execution program, and voice command execution method
CN1808567A (en) 2006-01-26 2006-07-26 覃文华 Voice-print authentication device and method of authenticating people presence
US8396711B2 (en) * 2006-05-01 2013-03-12 Microsoft Corporation Voice authentication system and method
US20080256613A1 (en) 2007-03-13 2008-10-16 Grover Noel J Voice print identification portal
WO2010025523A1 (en) 2008-09-05 2010-03-11 Auraya Pty Ltd Voice authentication system and methods
US8537978B2 (en) * 2008-10-06 2013-09-17 International Business Machines Corporation Method and system for using conversational biometrics and speaker identification/verification to filter voice streams
US8655660B2 (en) * 2008-12-11 2014-02-18 International Business Machines Corporation Method for dynamic learning of individual voice patterns
CN102404287A (en) 2010-09-14 2012-04-04 盛乐信息技术(上海)有限公司 Voiceprint identification system and method for determining voiceprint authentication threshold value through data multiplexing method
US9318114B2 (en) * 2010-11-24 2016-04-19 At&T Intellectual Property I, L.P. System and method for generating challenge utterances for speaker verification
CN102222502A (en) * 2011-05-16 2011-10-19 上海先先信息科技有限公司 Effective way for voice verification by Chinese text-prompted mode
KR101304112B1 (en) * 2011-12-27 2013-09-05 현대캐피탈 주식회사 Real time speaker recognition system and method using voice separation
US10134401B2 (en) * 2012-11-21 2018-11-20 Verint Systems Ltd. Diarization using linguistic labeling
JP5646675B2 (en) * 2013-03-19 2014-12-24 ヤフー株式会社 Information processing apparatus and method
US20140359736A1 (en) 2013-05-31 2014-12-04 Deviceauthority, Inc. Dynamic voiceprint authentication
CN103679452A (en) * 2013-06-20 2014-03-26 腾讯科技(深圳)有限公司 Payment authentication method, device thereof and system thereof
GB2517952B (en) * 2013-09-05 2017-05-31 Barclays Bank Plc Biometric verification using predicted signatures
US8812320B1 (en) * 2014-04-01 2014-08-19 Google Inc. Segment-based speaker verification using dynamically generated phrases
CN105575391B (en) 2014-10-10 2020-04-03 阿里巴巴集团控股有限公司 Voiceprint information management method and device and identity authentication method and system

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7158776B1 (en) * 2001-09-18 2007-01-02 Cisco Technology, Inc. Techniques for voice-based user authentication for mobile access to network services
CN1547191A (en) * 2003-12-12 2004-11-17 北京大学 Semantic and sound groove information combined speaking person identity system
CN1852354A (en) * 2005-10-17 2006-10-25 华为技术有限公司 Method and device for collecting user behavior characteristics
CN102708867A (en) * 2012-05-30 2012-10-03 北京正鹰科技有限责任公司 Method and system for identifying faked identity by preventing faked recordings based on voiceprint and voice
CN102760434A (en) * 2012-07-09 2012-10-31 华为终端有限公司 Method for updating voiceprint feature model and terminal
CN103258535A (en) * 2013-05-30 2013-08-21 中国人民财产保险股份有限公司 Identity recognition method and system based on voiceprint recognition

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10593334B2 (en) 2014-10-10 2020-03-17 Alibaba Group Holding Limited Method and apparatus for generating voiceprint information comprised of reference pieces each used for authentication
EP3611895A4 (en) * 2017-04-10 2020-04-08 Beijing Orion Star Technology Co., Ltd. Method and device for user registration, and electronic device
US11568876B2 (en) 2017-04-10 2023-01-31 Beijing Orion Star Technology Co., Ltd. Method and device for user registration, and electronic device
CN111862933A (en) * 2020-07-20 2020-10-30 北京字节跳动网络技术有限公司 Method, apparatus, device and medium for generating synthesized speech

Also Published As

Publication number Publication date
US20170221488A1 (en) 2017-08-03
JP2017534905A (en) 2017-11-24
EP3206205A1 (en) 2017-08-16
HK1224074A1 (en) 2017-08-11
KR20170069258A (en) 2017-06-20
SG10201903085YA (en) 2019-05-30
JP6671356B2 (en) 2020-03-25
SG11201702919UA (en) 2017-05-30
EP3206205A4 (en) 2017-11-01
US10593334B2 (en) 2020-03-17
CN105575391A (en) 2016-05-11
CN105575391B (en) 2020-04-03
EP3206205B1 (en) 2020-01-15

Similar Documents

Publication Publication Date Title
WO2016054991A1 (en) Voiceprint information management method and device as well as identity authentication method and system
US10685657B2 (en) Biometrics platform
US10135818B2 (en) User biological feature authentication method and system
US10276168B2 (en) Voiceprint verification method and device
US8812319B2 (en) Dynamic pass phrase security system (DPSS)
CN102985965B (en) Voice print identification
CN105069874B (en) A kind of mobile Internet sound-groove gate inhibition system and its implementation
US20160014120A1 (en) Method, server, client and system for verifying verification codes
WO2019127897A1 (en) Updating method and device for self-learning voiceprint recognition
CN110533288A (en) Business handling process detection method, device, computer equipment and storage medium
US20120284026A1 (en) Speaker verification system
CN109036436A (en) A kind of voice print database method for building up, method for recognizing sound-groove, apparatus and system
US11076043B2 (en) Systems and methods of voiceprint generation and use in enforcing compliance policies
CN106982344A (en) video information processing method and device
WO2016107415A1 (en) Auxiliary identity authentication method based on user network behavior feature
KR101181060B1 (en) Voice recognition system and method for speaker recognition using thereof
US20120330663A1 (en) Identity authentication system and method
US20140163986A1 (en) Voice-based captcha method and apparatus
US11705134B2 (en) Graph-based approach for voice authentication
KR102291113B1 (en) Apparatus and method for producing conference record
KR20220166465A (en) Meeting minutes creating system and method using multi-channel receiver
Yakovlev et al. LRPD: Large Replay Parallel Dataset
Portêlo et al. Privacy-preserving query-by-example speech search
CN114125368B (en) Conference audio participant association method and device and electronic equipment
RU2628118C2 (en) Method for forming and usage of the inverted index of audio recording and machinescent of information storage device

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 15848463

Country of ref document: EP

Kind code of ref document: A1

ENP Entry into the national phase

Ref document number: 2017518071

Country of ref document: JP

Kind code of ref document: A

NENP Non-entry into the national phase

Ref country code: DE

WWE Wipo information: entry into national phase

Ref document number: 11201702919U

Country of ref document: SG

ENP Entry into the national phase

Ref document number: 20177012683

Country of ref document: KR

Kind code of ref document: A

REEP Request for entry into the european phase

Ref document number: 2015848463

Country of ref document: EP