WO2016054991A1 - Voiceprint information management method and apparatus, and identity authentication method and system - Google Patents

Voiceprint information management method and apparatus, and identity authentication method and system

Info

Publication number
WO2016054991A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
voice
user
text
reference voiceprint
Prior art date
Application number
PCT/CN2015/091260
Other languages
English (en)
French (fr)
Inventor
熊剑
Original Assignee
阿里巴巴集团控股有限公司
熊剑
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 阿里巴巴集团控股有限公司 (Alibaba Group Holding Limited) and 熊剑 (Xiong Jian)
Priority to JP2017518071A (granted as JP6671356B2)
Priority to KR1020177012683A (published as KR20170069258A)
Priority to SG11201702919UA
Priority to EP15848463.4A (granted as EP3206205B1)
Publication of WO2016054991A1
Priority to US15/484,082 (granted as US10593334B2)

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F21/00: Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
    • G06F21/30: Authentication, i.e. establishing the identity or authorisation of security principals
    • G06F21/31: User authentication
    • G06F21/32: User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/26: Speech to text systems
    • G10L17/00: Speaker identification or verification techniques
    • G10L17/02: Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G10L17/04: Training, enrolment or model building
    • G10L17/06: Decision making techniques; Pattern matching strategies
    • G10L17/22: Interactive procedures; Man-machine interfaces
    • G10L17/24: Interactive procedures; Man-machine interfaces, the user being prompted to utter a password or a predefined phrase
    • G10L21/00: Speech or voice signal processing techniques to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility
    • G10L21/003: Changing voice quality, e.g. pitch or formants

Definitions

  • the present application relates to the field of voiceprint recognition technology, and in particular, to a voiceprint information management method and apparatus, and an identity authentication method and system.
  • Voiceprint refers to the spectrum of sound waves carrying speech information, as displayed by electroacoustic instruments. When different people say the same words, the sound waves they produce differ, and the corresponding sound wave spectra, i.e. the voiceprint information, differ as well. Therefore, by comparing voiceprint information, it can be determined whether the corresponding speakers are the same person, which is the basis of identity authentication by voiceprint recognition; such authentication methods can be widely applied in account management systems to secure accounts.
  • Before identity authentication can be implemented with voiceprint recognition technology, the user first needs to read preset text information; the user's voice signal is collected at that time, analyzed to obtain the corresponding voiceprint information, and stored in the voiceprint library as the user's reference voiceprint information. When identity authentication is performed, the person being authenticated is likewise required to read the preset text information; their voice signal is collected and analyzed to obtain the corresponding voiceprint information, and comparing that voiceprint information with the reference voiceprint information in the voiceprint library determines whether the person being authenticated is the user himself or herself.
  • However, the text information used for identity authentication is disclosed when the voiceprint library is established, so the text information the person being authenticated must read during identity authentication is also known in advance. Since that text can be captured as a pre-recorded sound file, anyone can play the recording to make authentication succeed. The existing identity authentication method based on voiceprint recognition therefore carries serious security risks.
  • the present application provides a voiceprint information management method and apparatus, and an identity authentication method and system.
  • a first aspect of the present application provides a voiceprint information management method, the method comprising the following steps:
  • the voiceprint information management method further includes:
  • the sub-voice information corresponding to each sub-text information is respectively intercepted from the voice information according to the start and end time of the sub-text information.
  • the editing the voice information and the corresponding text information into the reference voiceprint information of the first user includes:
  • Each pair of sub-speech information and sub-text information is separately edited as a reference voiceprint information of the first user.
  • storing the reference voiceprint information and the identity identifier of the first user includes:
  • searching for second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored, and whose corresponding second identity identifier is the same as the first identity identifier corresponding to the first reference voiceprint information;
  • if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
  • if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with that of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
  • if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information and storing the first reference voiceprint information and the first identity identifier.
  • a second aspect of the present application provides a voiceprint information management apparatus, the apparatus comprising:
  • a voice filter configured to acquire a historical voice file generated by the first user and the second user, and perform filtering processing on the historical voice file to obtain voice information of the first user
  • a text identifier configured to perform text recognition processing on the voice information, to obtain text information corresponding to the voice information
  • a voiceprint generator for editing the voice information and corresponding text information into reference voiceprint information of the first user, and storing the reference voiceprint information and an identity identifier of the first user.
  • the voiceprint information management apparatus further includes:
  • a text cutter for dividing the text information into a plurality of sub-text information, and marking a start and end time of each sub-text information
  • the voiceprint cutter is configured to separately intercept the sub-voice information corresponding to each sub-text information from the voice information according to the start and end time of the sub-text information.
  • the voiceprint generator editing the voice information and the corresponding text information into reference voiceprint information of the first user includes:
  • Each pair of sub-speech information and sub-text information is separately edited as a reference voiceprint information of the first user.
  • the voiceprint generator stores the reference voiceprint information and the identity identifier of the first user, including:
  • searching for second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored, and whose corresponding second identity identifier is the same as the first identity identifier corresponding to the first reference voiceprint information;
  • if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
  • if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with that of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
  • if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information and storing the first reference voiceprint information and the first identity identifier.
  • a third aspect of the present application provides an identity authentication method, the method comprising the following steps:
  • the voice information in the obtained reference voiceprint information is matched with the voice information to be authenticated. If the matching succeeds, it is determined that the authentication of the user to be authenticated is successful. If the matching fails, it is determined that the authentication of the user to be authenticated fails.
  • the identity authentication method further includes:
  • the sub-voice information corresponding to each sub-text information is respectively intercepted from the voice information according to the start and end time of the sub-text information.
  • the editing the voice information and the corresponding text information into the reference voiceprint information of the first user includes:
  • Each pair of sub-speech information and sub-text information is separately edited as a reference voiceprint information of the first user.
  • storing the reference voiceprint information and the identity identifier of the first user includes:
  • searching for second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored, and whose corresponding second identity identifier is the same as the first identity identifier corresponding to the first reference voiceprint information;
  • if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
  • if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with that of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
  • if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information and storing the first reference voiceprint information and the first identity identifier.
  • a fourth aspect of the present application provides an identity authentication system; the system includes:
  • a voice filter configured to acquire a historical voice file generated by the first user and the second user, and perform filtering processing on the historical voice file to obtain voice information of the first user
  • a text identifier configured to perform text recognition processing on the voice information, to obtain text information corresponding to the voice information
  • a voiceprint generator configured to edit the voice information and corresponding text information into reference voiceprint information of the first user, and store the reference voiceprint information and an identity identifier of the first user;
  • a voiceprint extractor configured to acquire reference voiceprint information corresponding to an identity identifier of a user to be authenticated
  • Identifying a front end device configured to output text information in the obtained reference voiceprint information, and receive corresponding voice information to be authenticated
  • a voiceprint matching device configured to match the voice information in the obtained reference voiceprint information with the voice information to be authenticated, determine that the authentication of the user to be authenticated succeeds if the matching is successful, and determine that the authentication of the user to be authenticated fails if the matching fails.
  • the identity authentication system further includes:
  • a text cutter for dividing the text information into a plurality of sub-text information, and marking a start and end time of each sub-text information
  • the voiceprint cutter is configured to separately intercept the sub-voice information corresponding to each sub-text information from the voice information according to the start and end time of the sub-text information.
  • the voiceprint generator editing the voice information and the corresponding text information into reference voiceprint information of the first user includes:
  • Each pair of sub-speech information and sub-text information is separately edited as a reference voiceprint information of the first user.
  • the voiceprint generator stores the reference voiceprint information and the identity identifier of the first user, including:
  • searching for second reference voiceprint information whose corresponding second text information is the same as the first text information in the first reference voiceprint information to be stored, and whose corresponding second identity identifier is the same as the first identity identifier corresponding to the first reference voiceprint information;
  • if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
  • if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with that of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
  • if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information and storing the first reference voiceprint information and the first identity identifier.
  • The present application obtains the voice information of the first user by filtering historical voice files stored in the related system, obtains the corresponding text information through text recognition processing, and edits the voice information and the corresponding text information into the reference voiceprint information of the first user. Because the text information and voice information in the reference voiceprint information are obtained from historical voice files rather than preset by the related system, they are non-public: neither the first user, the second user, nor any other user can predict the specific content of the text information that must be read during identity authentication, so the corresponding sound file cannot be recorded in advance, and a pre-recorded sound file cannot pass authentication.
  • Identity authentication performed with the voiceprint information management method provided by the present application therefore yields more accurate results, avoids this security risk, and provides higher account security.
  • FIG. 1 is a flowchart of a method for managing voiceprint information provided by an embodiment of the present application.
  • FIG. 2 is a flowchart of another method for managing voiceprint information provided by an embodiment of the present application.
  • FIG. 3 is a flowchart of a method for storing reference voiceprint information provided by an embodiment of the present application.
  • FIG. 4 is a structural block diagram of a voiceprint information management system provided by an embodiment of the present application.
  • FIG. 5 is a structural block diagram of another voiceprint information management system provided by an embodiment of the present application.
  • FIG. 6 is a flowchart of an identity authentication method according to an embodiment of the present application.
  • FIG. 7 is a flowchart of another identity authentication method provided by an embodiment of the present application.
  • FIG. 8 is a structural block diagram of an identity authentication system according to an embodiment of the present application.
  • FIG. 9 is a structural block diagram of another identity authentication system according to an embodiment of the present application.
  • the voiceprint information management method includes the following steps.
  • the first user may be a registered user who has a corresponding private account in the account management system, and correspondingly, the second user may be a service personnel of the account management system.
  • the account management system records the voice call process between the registered user and the service personnel and stores the corresponding voice file.
  • An embodiment of the present application filters out the machine prompt sounds, the service personnel's voice, and the like from the historical voice files stored by the account management system to obtain the voice information of the registered user, and then performs text recognition processing on that voice information to obtain the corresponding text information. The voice information and the corresponding text information can then serve as a set of reference voiceprint information for the registered user. Performing the above steps for each registered user yields the reference voiceprint information corresponding to each registered user, creating the voiceprint library.
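The enrollment flow just described (filter the historical voice file, recognize its text, pair voice with text as reference voiceprint information) can be sketched roughly as follows. The `filter_call` and `recognize_text` helpers are hypothetical stand-ins for real filtering and speech-recognition components, not anything specified in this application; here the call is assumed to be pre-labelled by speaker, and each segment is assumed to already carry its transcript.

```python
def filter_call(call_audio, first_user_id):
    """Keep only the registered user's speech, dropping machine prompts
    and the service agent's turns (placeholder: assumes labelled turns)."""
    return [seg for speaker, seg in call_audio if speaker == first_user_id]

def recognize_text(voice_segments):
    """Placeholder ASR: each segment is assumed to carry its transcript."""
    return [seg["text"] for seg in voice_segments]

def build_reference_voiceprint(call_audio, first_user_id):
    """Pair the filtered voice information with its recognized text and
    attach the identity identifier, as the method above describes."""
    voice_info = filter_call(call_audio, first_user_id)
    text_info = recognize_text(voice_info)
    return {"user_id": first_user_id,
            "pairs": list(zip(text_info, voice_info))}
```

A real system would replace both helpers with diarization/filtering and a speech recognizer; the pairing and storage step is the part the method itself prescribes.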
  • The embodiment of the present application obtains the voice information of the first user by filtering the historical voice files stored in the related system, obtains the corresponding text information through text recognition processing, and edits the voice information and the corresponding text information into the reference voiceprint information of the first user. Because the text information and voice information in the reference voiceprint information are obtained from historical voice files rather than preset by the relevant system, they are non-public: neither the first user, the second user, nor any other user can predict the specific content of the text information that must be read during identity authentication, so the corresponding sound file cannot be recorded in advance and a pre-recorded sound file cannot be played to pass authentication.
  • Identity authentication performed with the voiceprint information management method provided by this embodiment therefore yields more accurate results, avoids this security risk, and provides higher account security.
  • A historical voice file corresponding to an arbitrary call between the first user and the second user may be obtained at random, so that the identity identifiers in the voiceprint library correspond one-to-one with the reference voiceprint information.
  • Because the historical voice file actually obtained cannot be predicted, the specific content of the text information in the resulting reference voiceprint information cannot be predicted either; performing identity authentication based on this embodiment therefore ensures the accuracy of the authentication result and improves account security.
  • Alternatively, all historical voice files corresponding to the first user may be acquired, with each historical voice file corresponding to at least one set of reference voiceprint information, so that one identity identifier in the voiceprint library corresponds to multiple sets of reference voiceprint information (i.e., the first user has multiple sets of reference voiceprint information); correspondingly, any set of reference voiceprint information can be selected at random for identity authentication. Since the text information in each set of reference voiceprint information is non-public, the reference voiceprint information obtained during identity authentication cannot be predicted, so the specific content of the text information used for identity authentication cannot be predicted, the corresponding sound file cannot be recorded in advance, and playing a pre-recorded sound file cannot achieve successful authentication. Performing identity authentication based on this embodiment therefore ensures the accuracy of the authentication result and improves account security.
  • the voiceprint information management method includes the following steps.
  • the text information is divided into a plurality of sub-text information, and the start and end time of each sub-text information is marked.
  • the sub-voice information corresponding to each sub-text information is separately intercepted from the voice information according to the start and end time of the sub-text information.
  • The filtered voice information may include multiple pieces of voice information of the first user, and the corresponding text information obtained by text recognition may include multiple sentences or phrases. The embodiment of the present application divides the text information into multiple pieces of sub-text information (each piece may be a sentence, a phrase, or a word), marks each piece of sub-text information with a start and end time, and intercepts the sub-voice information corresponding to each piece of sub-text information from the voice information according to that start and end time (that is, the voice information is segmented according to the sub-text information).
  • For example, suppose the sentence "My account is locked" in the text information is recognized from the 00:03 to 00:05 period of the voice information. "My account is locked" is then split out as one piece of sub-text information whose start and end time is 00:03 to 00:05, and the voice within the 00:03 to 00:05 period of the voice information is intercepted as the sub-voice information corresponding to the sub-text information "My account is locked". By segmenting the text information and the voice information in this way, multiple pairs of sub-text information and sub-voice information are obtained and separately edited into reference voiceprint information according to a predetermined format, yielding multiple pieces of reference voiceprint information corresponding to the same user.
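As an illustration of the interception step, here is a minimal sketch assuming the audio is a flat sample array and the start/end times are given in seconds; both assumptions, and the function names, are for illustration only and are not details from this application.

```python
def cut_sub_voice(samples, sample_rate, start_s, end_s):
    """Intercept the sub-voice span matching one sub-text's start/end time."""
    return samples[int(start_s * sample_rate):int(end_s * sample_rate)]

def pair_sub_segments(samples, sample_rate, sub_texts):
    """sub_texts: list of (text, start_seconds, end_seconds) markers.
    Returns the (sub-text, sub-voice) pairs used as reference voiceprints."""
    return [(text, cut_sub_voice(samples, sample_rate, start_s, end_s))
            for text, start_s, end_s in sub_texts]
```

With a 00:03 to 00:05 marker for "My account is locked", this cuts exactly the samples within that two-second span, mirroring the example above.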
  • Editing the sub-voice information and the corresponding sub-text information into reference voiceprint information may include: processing the sub-voice information into corresponding sub-voiceprint information and setting a file name for the sub-voiceprint information. The format of the file name may be "voiceprint number.file-format suffix", such as 0989X.WAV. The sub-voiceprint information is then stored together with the identity identifier of the first user and the corresponding sub-text information.
  • The storage structure of the voiceprint library obtained by the above voiceprint information management method is as shown in Table 1.
  • In Table 1, each row corresponds to one piece of reference voiceprint information in the voiceprint library; the identity identifier (i.e., user ID) serves as the primary key for querying and invoking voiceprint information; the user voiceprint number marks how many pieces of reference voiceprint information correspond to the same user ID.
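Since the body of Table 1 is not reproduced here, the storage structure it describes (user ID as query key, a per-user voiceprint number, the sub-text information, and the voiceprint file name) might be sketched as a relational table. All column names below are assumptions inferred from the description, not the patent's actual schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE voiceprint_library (
        user_id         TEXT    NOT NULL,  -- identity identifier, query key
        voiceprint_no   INTEGER NOT NULL,  -- numbers this user's references
        sub_text        TEXT    NOT NULL,  -- the sub-text information
        voiceprint_file TEXT    NOT NULL,  -- e.g. '0989X.WAV'
        PRIMARY KEY (user_id, voiceprint_no)
    )""")
# One row = one piece of reference voiceprint information.
conn.execute("INSERT INTO voiceprint_library VALUES (?, ?, ?, ?)",
             ("139XXXXXXX", 1, "Why is there no refund", "0389X.WAV"))
```

The composite primary key reflects the description: the user ID is the lookup key, and the voiceprint number distinguishes the multiple reference voiceprints stored for one user.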
  • For example, the sub-text information "Why is there no refund" is output; the voice information to be authenticated, obtained when the user to be authenticated re-reads this sub-text information, is processed into voiceprint information to be authenticated. That voiceprint information is compared with the sub-voiceprint information "0389X.WAV" extracted from the voiceprint library: if the two match, the identity authentication succeeds, i.e., the user to be authenticated is the first user corresponding to "139XXXXXXX"; otherwise, the identity authentication fails.
  • The embodiment of the present application obtains the voice information of the first user by filtering the historical voice files stored in the system, and performs text recognition processing on the voice information to obtain the corresponding text information. The text information is divided into multiple pieces of sub-text information, the corresponding sub-voice information is intercepted from the voice information according to the start and end time of each piece of sub-text information, and each pair of sub-text information and sub-voice information is edited into a piece of reference voiceprint information and saved to the voiceprint library, so that each first user has multiple pieces of reference voiceprint information. When identity authentication is required, one of the multiple pieces of reference voiceprint information corresponding to the identity identifier to be authenticated is selected at random. Because the reference voiceprint information obtained during identity authentication is random, the specific content of the text information the user to be authenticated must read cannot be predicted; identity authentication performed with the voiceprint library of this embodiment therefore guarantees the accuracy of the authentication result and improves account security.
  • Moreover, the sub-text information in each piece of reference voiceprint information is short, which reduces the time needed to re-read the text information, reduces the time consumed by voiceprint comparison, and improves authentication efficiency.
  • The voiceprint information management method provided by this embodiment can not only create a new voiceprint library but also update an existing one, for example by adding reference voiceprint information for a new user or adding new reference voiceprint information for an old user. For a new user, it suffices to obtain the historical voice file corresponding to that user and perform steps S12 to S14 (or steps S22 to S26) to obtain the corresponding reference voiceprint information. Since the historical voice files corresponding to the same user accumulate over time, a new historical voice file for an old user can likewise be obtained and the above steps performed to add new reference voiceprint information for that user.
  • One or more pieces of reference voiceprint information may be set for the first user.
  • When multiple pieces of reference voiceprint information are set for the same first user, it must be ensured that the text information in any two pieces of reference voiceprint information corresponding to that user is different.
  • In practice, different historical voice files may yield the same recognized text information, or the same text information may be cut into multiple pieces of sub-text information with identical content, so that the same piece of sub-text information corresponds to multiple pieces of sub-voice information. In this case, the embodiment of the present application completes the storage of the reference voiceprint information using the method shown in FIG. 3.
  • Suppose the reference voiceprint information to be stored is first reference voiceprint information composed of first text information and first voice information. In this embodiment, the process of storing the first reference voiceprint information includes the following steps:
  • Step S31: Determine whether second reference voiceprint information satisfying the comparison condition exists. If yes, execute step S32; otherwise, execute step S34.
  • The comparison condition includes: the second text information corresponding to the second reference voiceprint information is the same as the first text information in the first reference voiceprint information, and the second identity identifier corresponding to the second reference voiceprint information is the same as the first identity identifier corresponding to the first reference voiceprint information.
  • Step S32: Determine whether the quality of the first voice information in the first reference voiceprint information is higher than that of the second voice information in the second reference voiceprint information. If yes, execute step S33; otherwise, execute step S35.
  • When determining in step S31 whether second reference voiceprint information exists, the search range includes at least the reference voiceprint information already stored in the voiceprint library, and may also include reference voiceprint information generated alongside the first reference voiceprint information and not yet stored. If no second reference voiceprint information exists, the first reference voiceprint information is stored directly. If second reference voiceprint information is found, the same first user and the same text information have at least two different pieces of voice information.
  • In that case, the quality of the first voice information in the first reference voiceprint information is compared with that of the second voice information in the second reference voiceprint information. If the quality of the first voice information is higher, the first reference voiceprint information is stored and the second reference voiceprint information is deleted; if it is lower, the first reference voiceprint information is deleted directly. Only the highest-quality voice information is thus retained for each piece of text information, which improves the accuracy of voice information comparison during identity authentication and reduces comparison difficulty.
  • the following three voiceprint library update modes can be implemented: 1) adding reference voiceprint information of the new user; 2) increasing the reference voiceprint information of the text information corresponding to the old user; 3) The reference voiceprint information with lower voice information quality is replaced with the reference voiceprint information with higher voice information quality.
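The storage procedure of steps S31 to S35 can be sketched as follows. The record layout, the numeric quality score, and the `store_reference` helper are illustrative assumptions; the patent only requires that voice information qualities be comparable and does not specify a quality metric.

```python
# Sketch of the FIG. 3 storage procedure (steps S31-S35).
# A reference voiceprint record holds an identity identifier, text
# information, voice information, and an assumed quality score.

def store_reference(library, new_record):
    """Store new_record, keeping only the highest-quality voice
    information for each (identity, text) pair."""
    key = (new_record["identity"], new_record["text"])
    existing = library.get(key)          # step S31: look for second reference
    if existing is None:
        library[key] = new_record        # step S34: store directly
        return "stored"
    if new_record["quality"] > existing["quality"]:   # step S32
        library[key] = new_record        # steps S33 + S34: replace old record
        return "replaced"
    return "discarded"                   # step S35: drop lower-quality record
```

Keying the library on the (identity, text) pair makes the uniqueness constraint (no two pieces of reference voiceprint information with the same text for the same user) hold by construction.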
In other words, newly obtained reference voiceprint information is not stored directly in the voiceprint library in the embodiment of the present application. It is first determined whether another piece of reference voiceprint information with the same text information and the same identity identifier has already been stored; if so, the quality of the voice information in the two pieces is compared, the piece with the higher-quality voice information is retained, and the piece with the lower-quality voice information is deleted. The embodiment of the present application thus ensures that, among the stored reference voiceprint information, the text information in any two pieces corresponding to the same identity identifier (that is, the same first user) is different, and that the voice information retained for each piece of text information is of the highest available quality. When identity authentication is performed based on the embodiment of the present application, voiceprint comparison against higher-quality voice information ensures the accuracy of the authentication and improves authentication efficiency.
FIG. 4 is a structural block diagram of a voiceprint information management system according to an embodiment of the present application; the voiceprint information management system can be applied to an account management system. As shown in FIG. 4, the voiceprint information management system 100 includes a voice filter 110, a text recognizer 120, and a voiceprint generator 130.
The voice filter 110 is configured to acquire a historical voice file generated by a call between a first user and a second user, and to filter the historical voice file to obtain voice information of the first user.
The text recognizer 120 is configured to perform text recognition on the voice information to obtain text information corresponding to the voice information.
The voiceprint generator 130 is configured to edit the voice information and the corresponding text information into reference voiceprint information of the first user, and to store the reference voiceprint information together with the identity identifier of the first user.
In the embodiment of the present application, the historical voice files stored in the related system are filtered to obtain the voice information of the first user, the text information corresponding to that voice information is obtained through text recognition, and the voice information and corresponding text information are edited into the reference voiceprint information of the first user. Because both the text information and the voice information in the reference voiceprint information are derived from the historical voice files rather than preset by the related system, they are non-public: neither the first user, nor the second user, nor any other user can predict the specific content of the text information that must be read aloud during identity authentication, so the corresponding sound file cannot be recorded in advance, and playing a pre-recorded sound file cannot make authentication succeed. Therefore, when identity authentication is performed based on the voiceprint information management provided by the embodiment of the present application, the authentication result is more accurate, the aforementioned security risk is removed, and account security is higher.
FIG. 5 is a structural block diagram of another voiceprint information management system according to an embodiment of the present application; the voiceprint information management system can be applied to an account management system. As shown in FIG. 5, the voiceprint information management system 200 includes a voice filter 210, a text recognizer 220, a text cutter 240, a voiceprint cutter 250, and a voiceprint generator 230.
The voice filter 210 is configured to acquire a historical voice file generated by a call between a first user and a second user, and to filter the historical voice file to obtain voice information of the first user.
The text recognizer 220 is configured to perform text recognition on the voice information to obtain text information corresponding to the voice information.
The text cutter 240 is configured to cut the text information into a plurality of pieces of sub-text information and to mark the start and end time of each piece of sub-text information.
The voiceprint cutter 250 is configured to intercept, from the voice information according to the start and end times of the sub-text information, the sub-voice information corresponding to each piece of sub-text information.
The voiceprint generator 230 is configured to edit each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user, and to store each piece of reference voiceprint information together with the identity identifier of the first user.
The embodiment of the present application obtains the first user's voice information by filtering the historical voice files stored in the system, performs text recognition on that voice information to obtain the corresponding text information, cuts the recognized text information into multiple pieces of sub-text information, and intercepts the corresponding sub-voice information from the voice information according to each piece's start and end time. Each pair of sub-text information and sub-voice information is edited into one piece of reference voiceprint information and stored in the voiceprint library, so that every first user has multiple pieces of reference voiceprint information; when identity authentication must be performed, one piece can be selected at random from the multiple pieces corresponding to the identity identifier to be authenticated. Because the reference voiceprint information acquired during authentication is random, the specific content of the text information the user to be authenticated must read back cannot be predicted, so the corresponding sound file cannot be recorded in advance, and playing a pre-recorded sound file cannot make authentication succeed. Performing identity authentication based on the voiceprint library obtained in this embodiment therefore ensures accurate authentication results and improves account security. In addition, the sub-text information in each piece of reference voiceprint information is short, which reduces the time needed to read the text back, reduces the time consumed by voiceprint comparison, and improves authentication efficiency.
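The text cutter / voiceprint cutter stage described above (components 240 and 250 of FIG. 5) might look like the following sketch. The recognizer output format of (sub-text, start, end) triples and the per-second frame list standing in for real audio are assumptions for illustration, not part of the patent.

```python
# Sketch of the cutter pipeline of FIG. 5. Assume the text recognizer
# emits (sub_text, start_sec, end_sec) triples, and the first user's
# voice information is a list of one frame per second.

def cut_reference_voiceprints(recognized, voice_frames):
    """Pair each piece of sub-text information with the sub-voice span
    it was recognized from, yielding one reference voiceprint per pair."""
    references = []
    for sub_text, start, end in recognized:
        sub_voice = voice_frames[start:end]   # intercept by start/end time
        references.append({"text": sub_text, "voice": sub_voice})
    return references
```

Each returned pair would then be handed to the voiceprint generator for storage alongside the user's identity identifier.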
To complete the storage of reference voiceprint information, the voiceprint generator 130 and the voiceprint generator 230 may be configured to:
determine whether there is second reference voiceprint information whose corresponding second text information is the same as the first text information in first reference voiceprint information to be stored, and whose corresponding second identity identifier is the same as the first identity identifier corresponding to the first reference voiceprint information;
if the second reference voiceprint information does not exist, directly store the first reference voiceprint information and the first identity identifier;
if the second reference voiceprint information exists, compare the quality of the first voice information in the first reference voiceprint information with that of the second voice information in the second reference voiceprint information, and directly delete the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
if the quality of the first voice information is higher than that of the second voice information, delete the second reference voiceprint information and store the first reference voiceprint information and the first identity identifier.
The embodiment of the present application thus ensures that, among the stored reference voiceprint information, the text information in any two pieces corresponding to the same user is different, and that the voice information retained for each piece of text information is of the highest available quality; when identity authentication is performed based on the embodiment of the present application, voiceprint comparison against higher-quality voice information ensures the accuracy of the authentication and improves authentication efficiency.
FIG. 6 is a flowchart of an identity authentication method according to an embodiment of the present application; the identity authentication method may be applied to an account management system. Referring to FIG. 6, the identity authentication method includes the following steps: acquiring a historical voice file generated by a call between a first user and a second user; filtering the historical voice file to obtain voice information of the first user; performing text recognition on the voice information to obtain corresponding text information; editing the voice information and the corresponding text information into reference voiceprint information of the first user, and storing the reference voiceprint information and an identity identifier of the first user; acquiring the reference voiceprint information corresponding to the identity identifier of a user to be authenticated; outputting the text information in the acquired reference voiceprint information and receiving the corresponding voice information to be authenticated; and matching the voice information in the acquired reference voiceprint information against the voice information to be authenticated, where authentication of the user to be authenticated is judged successful if the matching succeeds and failed if the matching fails.
The first user may be a registered user who has a corresponding private account in the account management system; the second user may be a service person of the account management system.
The embodiment of the present application obtains the first user's voice information by filtering the historical voice files stored in the related system, obtains the text information corresponding to that voice information through text recognition, and edits the voice information and the corresponding text information into the reference voiceprint information of the first user. Because both the text information and the voice information in the reference voiceprint information are derived from the historical voice files rather than preset by the related system, they are non-public: neither the first user, nor the second user, nor any other user can predict the specific content of the text information that must be read aloud during identity authentication, so the corresponding sound file cannot be recorded in advance, and playing a pre-recorded sound file cannot make authentication succeed. Identity authentication based on the voiceprint information management provided by the embodiment of the present application therefore yields more accurate results, removes the aforementioned security risk, and makes accounts more secure.
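The authentication steps just described (fetch a reference voiceprint for the identity, prompt with its text, match the spoken reply) can be sketched as below. The `similarity` scorer, its 0.8 threshold, and the callback for capturing the user's voice are placeholders for a real voiceprint matcher and front end, which the patent does not specify.

```python
import random

def authenticate(voiceprint_library, identity, capture_voice, similarity, threshold=0.8):
    """FIG. 6 flow sketch: fetch a reference voiceprint for the identity
    identifier, prompt the user with its text information, and match the
    spoken reply against the stored voice information."""
    candidates = voiceprint_library.get(identity, [])
    if not candidates:
        return False                          # unknown identity identifier
    reference = random.choice(candidates)     # unpredictable prompt text
    spoken = capture_voice(reference["text"]) # user reads the text aloud
    return similarity(reference["voice"], spoken) >= threshold
```

Because the prompt is drawn at random from non-public text, an attacker cannot prepare a matching recording ahead of time, which is the security property the embodiment relies on.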
FIG. 7 is a flowchart of another identity authentication method according to an embodiment of the present application; the identity authentication method may be applied to an account management system. Referring to FIG. 7, in addition to the steps shown in FIG. 6, the identity authentication method includes the following steps: the text information is cut into a plurality of pieces of sub-text information, and the start and end time of each piece of sub-text information is marked; the sub-voice information corresponding to each piece of sub-text information is intercepted from the voice information according to the start and end times of the sub-text information.
In the present application, the text information is cut into multiple pieces of sub-text information, the corresponding sub-voice information is intercepted according to the start and end times, and each piece of sub-text information and its corresponding sub-voice information are edited into one piece of reference voiceprint information, so that the first user has multiple pieces of reference voiceprint information. When identity authentication must be performed, one piece is selected at random from the multiple pieces of reference voiceprint information corresponding to the identity identifier to be authenticated. Because the reference voiceprint information acquired during authentication is random, the specific content of the text information the user to be authenticated must read back cannot be predicted, so the corresponding sound file cannot be recorded in advance, and playing a pre-recorded sound file cannot make authentication succeed.
The identity authentication method provided in this embodiment therefore ensures accurate authentication results and improves account security. In addition, the sub-text information in each piece of reference voiceprint information is short, which reduces the time needed to read the text back, reduces the time consumed by voiceprint comparison, and improves authentication efficiency.
The identity authentication method provided by the embodiment of the present application can also complete the storage of reference voiceprint information by the method shown in FIG. 3, which ensures not only that the text information in any two pieces of stored reference voiceprint information corresponding to the same user is different, but also that the voice information retained for each piece of text information is of the highest available quality; when identity authentication is performed on this basis, voiceprint comparison against higher-quality voice information ensures the accuracy of the authentication and improves authentication efficiency.
FIG. 8 is a structural block diagram of an identity authentication system according to an embodiment of the present application; the identity authentication system can be applied to an account management system. As shown in FIG. 8, the identity authentication system 300 includes a voice filter 310, a text recognizer 320, a voiceprint generator 330, a voiceprint extractor 360, a recognition front end 370, and a voiceprint matcher 380.
The voice filter 310 is configured to acquire a historical voice file generated by a call between the first user and the second user, and to filter the historical voice file to obtain voice information of the first user.
The text recognizer 320 is configured to perform text recognition on the voice information to obtain text information corresponding to the voice information.
The voiceprint generator 330 is configured to edit the voice information and the corresponding text information into reference voiceprint information of the first user, and to store the reference voiceprint information together with the identity identifier of the first user.
The voiceprint extractor 360 is configured to acquire the reference voiceprint information corresponding to the identity identifier of a user to be authenticated.
The recognition front end 370 is configured to output the text information in the acquired reference voiceprint information and to receive the corresponding voice information to be authenticated.
The voiceprint matcher 380 is configured to match the voice information in the acquired reference voiceprint information against the voice information to be authenticated; if the matching succeeds, authentication of the user to be authenticated is judged successful, and if the matching fails, authentication of the user to be authenticated is judged failed.
The recognition front end 370 implements the interaction between the identity authentication system and the user to be authenticated: in addition to outputting the text information in the reference voiceprint information acquired by the voiceprint extractor 360 and receiving the voice information input by the user to be authenticated, it may also receive the identity authentication request of the user to be authenticated, trigger the voiceprint extractor 360 after receiving that request, and output the authentication result obtained by the voiceprint matcher 380 to the user to be authenticated.
In the embodiment of the present application, the historical voice files stored in the related system are filtered to obtain the voice information of the first user, the text information corresponding to that voice information is obtained through text recognition, and the voice information and corresponding text information are edited into the reference voiceprint information of the first user. Because both the text information and the voice information in the reference voiceprint information are derived from the historical voice files rather than preset by the related system, they are non-public: neither the first user, nor the second user, nor any other user can predict the specific content of the text information that must be read aloud during identity authentication, so the corresponding sound file cannot be recorded in advance, and playing a pre-recorded sound file cannot make authentication succeed. Identity authentication based on this embodiment therefore yields more accurate results, removes the aforementioned security risk, and makes accounts more secure.
FIG. 9 is a structural block diagram of another identity authentication system according to an embodiment of the present application; the identity authentication system can be applied to an account management system. As shown in FIG. 9, the identity authentication system 400 includes a voice filter 410, a text recognizer 420, a text cutter 440, a voiceprint cutter 450, a voiceprint generator 430, a voiceprint extractor 460, a recognition front end 470, and a voiceprint matcher 480.
The voice filter 410 is configured to acquire a historical voice file generated by a call between the first user and the second user, and to filter the historical voice file to obtain voice information of the first user.
The text recognizer 420 is configured to perform text recognition on the voice information to obtain text information corresponding to the voice information.
The text cutter 440 is configured to cut the text information into a plurality of pieces of sub-text information and to mark the start and end time of each piece of sub-text information.
The voiceprint cutter 450 is configured to intercept, from the voice information according to the start and end times of the sub-text information, the sub-voice information corresponding to each piece of sub-text information.
The voiceprint generator 430 is configured to edit each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user, and to store each piece of reference voiceprint information together with the identity identifier of the first user.
The voiceprint extractor 460 is configured to acquire the reference voiceprint information corresponding to the identity identifier of a user to be authenticated.
The recognition front end 470 is configured to output the sub-text information in the acquired reference voiceprint information and to receive the corresponding voice information to be authenticated.
The voiceprint matcher 480 is configured to match the sub-voice information in the acquired reference voiceprint information against the voice information to be authenticated; if the matching succeeds, authentication of the user to be authenticated is judged successful, and if the matching fails, authentication of the user to be authenticated is judged failed.
In the embodiment of the present application, the recognized text information is cut into multiple pieces of sub-text information, the corresponding sub-voice information is intercepted according to the start and end times, and each piece of sub-text information and its corresponding sub-voice information are edited into one piece of reference voiceprint information, so that the first user has multiple pieces of reference voiceprint information. When identity authentication must be performed, the multiple pieces of reference voiceprint information corresponding to the identity identifier of the user to be authenticated are determined, and one of them is selected at random for this authentication. Because the reference voiceprint information used is random, the specific content of the text information the user to be authenticated must read back cannot be predicted, so a matching sound file cannot be recorded in advance.
The identity authentication system provided in this embodiment therefore ensures accurate authentication results and improves account security. In addition, the sub-text information in each piece of reference voiceprint information is short, which reduces the time needed to read the text back, reduces the time consumed by voiceprint comparison, and improves authentication efficiency.
To complete the storage of reference voiceprint information, the voiceprint generator 330 and the voiceprint generator 430 may be configured to:
determine whether there is second reference voiceprint information whose corresponding second text information is the same as the first text information in first reference voiceprint information to be stored, and whose corresponding second identity identifier is the same as the first identity identifier corresponding to the first reference voiceprint information;
if the second reference voiceprint information does not exist, directly store the first reference voiceprint information and the identity identifier of the first user;
if the second reference voiceprint information exists, compare the quality of the first voice information in the first reference voiceprint information with that of the second voice information in the second reference voiceprint information, and delete the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
if the quality of the first voice information is higher than that of the second voice information, delete the second reference voiceprint information and store the first reference voiceprint information and the corresponding user identity identifier.
The embodiment of the present application thus ensures that, among the stored reference voiceprint information, the text information in any two pieces corresponding to the same identity identifier is different, and that the voice information retained for each piece of text information is of the highest available quality; when identity authentication is performed on this basis, voiceprint comparison against higher-quality voice information ensures the accuracy of the authentication and improves authentication efficiency.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Quality & Reliability (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Game Theory and Decision Science (AREA)
  • Business, Economics & Management (AREA)
  • Computer Hardware Design (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Signal Processing (AREA)
  • Telephonic Communication Services (AREA)
  • Storage Device Security (AREA)

Abstract

A voiceprint information management method and apparatus, and an identity authentication method and system. Historical voice files stored by a related system are filtered to obtain voice information of a first user (S12); text information corresponding to the voice information is obtained through text recognition (S13); and the voice information and the corresponding text information are edited into reference voiceprint information of the first user. Because both the text information and the voice information in the reference voiceprint information are derived from the historical voice files rather than preset by the related system, they are non-public: no user can predict the specific content of the text information that must be read back during identity authentication, so the corresponding sound file cannot be recorded in advance, and playing a pre-recorded sound file cannot make authentication succeed. Identity authentication based on this voiceprint information management yields more accurate authentication results, removes the security risk, and makes accounts more secure.

Description

Voiceprint information management method and apparatus, and identity authentication method and system
Technical Field
The present application relates to the technical field of voiceprint recognition, and in particular to a voiceprint information management method and apparatus and an identity authentication method and system.
Background
A voiceprint is the spectrum of a sound wave carrying speech information, as displayed by an electro-acoustic instrument. When different people speak the same words, the sound waves they produce differ, and so do the corresponding spectra, that is, the voiceprint information. Whether two utterances come from the same speaker can therefore be judged by comparing voiceprint information, which enables identity authentication based on voiceprint recognition; such authentication can be widely applied in account management systems to keep accounts secure.
In the related art, before voiceprint recognition can be used for identity authentication, the user is first required to read preset text information aloud; the user's sound signal is collected and analyzed to obtain the corresponding voiceprint information, which is stored in a voiceprint library as the user's reference voiceprint information. During authentication, the person to be authenticated is likewise required to read the preset text information aloud; the collected sound signal is analyzed into voiceprint information, and comparing it with the reference voiceprint information in the voiceprint library determines whether the person to be authenticated is the user.
In the above technique, the text information used for identity authentication is disclosed when the voiceprint library is created, so the text the person to be authenticated must read during authentication is also known. If a sound file of the user reading that text is recorded in advance, anyone can make authentication succeed by playing the pre-recorded file. Existing voiceprint-based identity authentication therefore carries a serious security risk.
Summary
To overcome the problems in the related art, the present application provides a voiceprint information management method and apparatus and an identity authentication method and system.
A first aspect of the present application provides a voiceprint information management method, including the following steps:
acquiring a historical voice file generated by a call between a first user and a second user;
filtering the historical voice file to obtain voice information of the first user;
performing text recognition on the voice information to obtain text information corresponding to the voice information;
editing the voice information and the corresponding text information into reference voiceprint information of the first user, and storing the reference voiceprint information and an identity identifier of the first user.
With reference to the first aspect, in a first feasible implementation of the first aspect, the voiceprint information management method further includes:
cutting the text information into a plurality of pieces of sub-text information and marking the start and end time of each piece of sub-text information;
intercepting, from the voice information according to the start and end times of the sub-text information, the sub-voice information corresponding to each piece of sub-text information.
With reference to the first feasible implementation of the first aspect, in a second feasible implementation of the first aspect, editing the voice information and the corresponding text information into reference voiceprint information of the first user includes:
editing each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
With reference to the first aspect, in a third feasible implementation of the first aspect, storing the reference voiceprint information and the identity identifier of the first user includes:
determining whether there is second reference voiceprint information whose corresponding second text information is the same as the first text information in first reference voiceprint information to be stored, and whose corresponding second identity identifier is the same as the first identity identifier corresponding to the first reference voiceprint information;
if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with that of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information and storing the first reference voiceprint information and the first identity identifier.
A second aspect of the present application provides a voiceprint information management apparatus, including:
a voice filter configured to acquire a historical voice file generated by a call between a first user and a second user, and to filter the historical voice file to obtain voice information of the first user;
a text recognizer configured to perform text recognition on the voice information to obtain text information corresponding to the voice information; and
a voiceprint generator configured to edit the voice information and the corresponding text information into reference voiceprint information of the first user, and to store the reference voiceprint information and an identity identifier of the first user.
With reference to the second aspect, in a first feasible implementation of the second aspect, the voiceprint information management apparatus further includes:
a text cutter configured to cut the text information into a plurality of pieces of sub-text information and to mark the start and end time of each piece of sub-text information; and
a voiceprint cutter configured to intercept, from the voice information according to the start and end times of the sub-text information, the sub-voice information corresponding to each piece of sub-text information.
With reference to the first feasible implementation of the second aspect, in a second feasible implementation of the second aspect, the voiceprint generator editing the voice information and the corresponding text information into reference voiceprint information of the first user includes:
editing each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
With reference to the second aspect, in a third feasible implementation of the second aspect, the voiceprint generator storing the reference voiceprint information and the identity identifier of the first user includes:
determining whether there is second reference voiceprint information whose corresponding second text information is the same as the first text information in first reference voiceprint information to be stored, and whose corresponding second identity identifier is the same as the first identity identifier corresponding to the first reference voiceprint information;
if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with that of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information and storing the first reference voiceprint information and the first identity identifier.
A third aspect of the present application provides an identity authentication method, including the following steps:
acquiring a historical voice file generated by a call between a first user and a second user;
filtering the historical voice file to obtain voice information of the first user;
performing text recognition on the voice information to obtain text information corresponding to the voice information;
editing the voice information and the corresponding text information into reference voiceprint information of the first user, and storing the reference voiceprint information and an identity identifier of the first user;
acquiring the reference voiceprint information corresponding to the identity identifier of a user to be authenticated;
outputting the text information in the acquired reference voiceprint information, and receiving the corresponding voice information to be authenticated;
matching the voice information in the acquired reference voiceprint information against the voice information to be authenticated; if the matching succeeds, judging that authentication of the user to be authenticated succeeds, and if the matching fails, judging that authentication of the user to be authenticated fails.
With reference to the third aspect, in a first feasible implementation of the third aspect, the identity authentication method further includes:
cutting the text information into a plurality of pieces of sub-text information and marking the start and end time of each piece of sub-text information;
intercepting, from the voice information according to the start and end times of the sub-text information, the sub-voice information corresponding to each piece of sub-text information.
With reference to the first feasible implementation of the third aspect, in a second feasible implementation of the third aspect, editing the voice information and the corresponding text information into reference voiceprint information of the first user includes:
editing each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
With reference to the third aspect, in a third feasible implementation of the third aspect, storing the reference voiceprint information and the identity identifier of the first user includes:
determining whether there is second reference voiceprint information whose corresponding second text information is the same as the first text information in first reference voiceprint information to be stored, and whose corresponding second identity identifier is the same as the first identity identifier corresponding to the first reference voiceprint information;
if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with that of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information and storing the first reference voiceprint information and the first identity identifier.
A fourth aspect of the present application provides an identity authentication system, including:
a voice filter configured to acquire a historical voice file generated by a call between a first user and a second user, and to filter the historical voice file to obtain voice information of the first user;
a text recognizer configured to perform text recognition on the voice information to obtain text information corresponding to the voice information;
a voiceprint generator configured to edit the voice information and the corresponding text information into reference voiceprint information of the first user, and to store the reference voiceprint information and an identity identifier of the first user;
a voiceprint extractor configured to acquire the reference voiceprint information corresponding to the identity identifier of a user to be authenticated;
a recognition front end configured to output the text information in the acquired reference voiceprint information and to receive the corresponding voice information to be authenticated; and
a voiceprint matcher configured to match the voice information in the acquired reference voiceprint information against the voice information to be authenticated, to judge that authentication of the user to be authenticated succeeds if the matching succeeds, and to judge that authentication of the user to be authenticated fails if the matching fails.
With reference to the fourth aspect, in a first feasible implementation of the fourth aspect, the identity authentication system further includes:
a text cutter configured to cut the text information into a plurality of pieces of sub-text information and to mark the start and end time of each piece of sub-text information; and
a voiceprint cutter configured to intercept, from the voice information according to the start and end times of the sub-text information, the sub-voice information corresponding to each piece of sub-text information.
With reference to the first feasible implementation of the fourth aspect, in a second feasible implementation of the fourth aspect, the voiceprint generator editing the voice information and the corresponding text information into reference voiceprint information of the first user includes:
editing each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
With reference to the fourth aspect, in a third feasible implementation of the fourth aspect, the voiceprint generator storing the reference voiceprint information and the identity identifier of the first user includes:
determining whether there is second reference voiceprint information whose corresponding second text information is the same as the first text information in first reference voiceprint information to be stored, and whose corresponding second identity identifier is the same as the first identity identifier corresponding to the first reference voiceprint information;
if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
if the second reference voiceprint information exists, comparing the quality of the first voice information in the first reference voiceprint information with that of the second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information and storing the first reference voiceprint information and the first identity identifier.
As can be seen from the above technical solutions, the present application filters historical voice files stored by a related system to obtain voice information of a first user, obtains text information corresponding to that voice information through text recognition, and edits the voice information and the corresponding text information into reference voiceprint information of the first user. Because both the text information and the voice information in the reference voiceprint information are derived from the historical voice files rather than preset by the related system, they are non-public: neither the first user, nor the second user, nor any other user can predict the specific content of the text information that must be read back during identity authentication, so the corresponding sound file cannot be recorded in advance, and playing a pre-recorded sound file cannot make authentication succeed. Compared with existing voiceprint-based identity authentication, authentication based on the voiceprint information management method provided by the present application therefore yields more accurate results, removes the security risk, and makes accounts more secure.
It should be understood that the above general description and the following detailed description are merely exemplary and explanatory, and do not limit the present application.
Brief Description of the Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present invention and, together with the description, serve to explain the principles of the present invention.
FIG. 1 is a flowchart of a voiceprint information management method according to an embodiment of the present application.
FIG. 2 is a flowchart of another voiceprint information management method according to an embodiment of the present application.
FIG. 3 is a flowchart of a method for storing reference voiceprint information according to an embodiment of the present application.
FIG. 4 is a structural block diagram of a voiceprint information management system according to an embodiment of the present application.
FIG. 5 is a structural block diagram of another voiceprint information management system according to an embodiment of the present application.
FIG. 6 is a flowchart of an identity authentication method according to an embodiment of the present application.
FIG. 7 is a flowchart of another identity authentication method according to an embodiment of the present application.
FIG. 8 is a structural block diagram of an identity authentication system according to an embodiment of the present application.
FIG. 9 is a structural block diagram of another identity authentication system according to an embodiment of the present application.
Detailed Description
Exemplary embodiments will now be described in detail, examples of which are illustrated in the accompanying drawings. Where the following description refers to the drawings, the same numerals in different drawings denote the same or similar elements unless otherwise indicated. The implementations described in the following exemplary embodiments do not represent all implementations consistent with the present invention; rather, they are merely examples of apparatuses and methods consistent with some aspects of the present invention as detailed in the appended claims.
FIG. 1 is a flowchart of a voiceprint information management method according to an embodiment of the present application; the voiceprint information management method is applied to an account management system. As shown in FIG. 1, the voiceprint information management method includes the following steps.
S11: Acquire a historical voice file generated by a call between a first user and a second user.
The first user may be a registered user who has a corresponding private account in the account management system; correspondingly, the second user may be a service person of the account management system.
S12: Filter the historical voice file to obtain voice information of the first user.
S13: Perform text recognition on the voice information to obtain text information corresponding to the voice information.
S14: Edit the voice information and the corresponding text information into reference voiceprint information of the first user, and store the reference voiceprint information and an identity identifier of the first user.
Generally, to facilitate performance statistics, service quality evaluation, dispute handling and the like, an account management system records the voice calls between registered users and service personnel and stores the corresponding voice files. In view of this, the embodiment of the present application filters machine prompt tones, the service person's voice and other sounds out of the historical voice files stored by the account management system to obtain the registered user's voice information, and performs text recognition on that voice information to obtain the corresponding text information; the voice information and the corresponding text information can then serve as a set of reference voiceprint information for that registered user. Performing the above steps for each registered user yields the reference voiceprint information corresponding to every registered user and completes the creation of the voiceprint library.
As can be seen from the above method, the embodiment of the present application filters the historical voice files stored by the related system to obtain the first user's voice information, obtains the corresponding text information through text recognition, and edits the voice information and corresponding text information into the first user's reference voiceprint information. Because both the text information and the voice information in the reference voiceprint information are derived from the historical voice files rather than preset by the related system, they are non-public: neither the first user, nor the second user, nor any other user can predict the specific content of the text information that must be read back during identity authentication, so the corresponding sound file cannot be recorded in advance, and playing a pre-recorded sound file cannot make authentication succeed. Compared with existing voiceprint-based identity authentication, authentication based on the voiceprint information management method provided by the embodiment of the present application therefore yields more accurate results, removes the security risk, and makes accounts more secure.
In one feasible embodiment of the present application, one historical voice file corresponding to an arbitrary call between the first user and the second user may be acquired at random, so that identity identifiers in the voiceprint library correspond one-to-one with reference voiceprint information. Because it cannot be predicted which call the acquired historical voice file corresponds to, the specific content of the text information in the resulting reference voiceprint information cannot be predicted either; performing identity authentication based on this embodiment therefore ensures accurate authentication results and improves account security.
In another feasible embodiment of the present application, all historical voice files corresponding to the first user may be acquired, with each historical voice file corresponding to at least one set of reference voiceprint information, so that one identity identifier in the voiceprint library can correspond to multiple sets of reference voiceprint information (that is, the first user has multiple sets of reference voiceprint information); correspondingly, any one set of reference voiceprint information can be acquired at random to perform identity authentication. Because the text information in every set of reference voiceprint information is non-public, and the set acquired during authentication cannot be predicted, the specific content of the text information used for authentication cannot be predicted either; the corresponding sound file therefore cannot be recorded in advance, and playing a pre-recorded sound file cannot make authentication succeed. Performing identity authentication based on this embodiment thus ensures accurate authentication results and improves account security.
FIG. 2 is a flowchart of a voiceprint information management method according to another embodiment of the present application; the voiceprint information management method is applied to an account management system. As shown in FIG. 2, the voiceprint information management method includes the following steps.
S21: Acquire a historical voice file generated by a call between a first user and a second user.
S22: Filter the historical voice file to obtain voice information of the first user.
S23: Perform text recognition on the voice information to obtain text information corresponding to the voice information.
S24: Cut the text information into a plurality of pieces of sub-text information, and mark the start and end time of each piece of sub-text information.
S25: Intercept, from the voice information according to the start and end times of the sub-text information, the sub-voice information corresponding to each piece of sub-text information.
S26: Edit each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user, and store each piece of reference voiceprint information and the identity identifier of the first user.
Because a historical voice file records calls between the first user and the second user over a period of time, the filtered voice information contains multiple segments of the first user's speech, and the text information obtained through text recognition accordingly contains multiple sentences or phrases. The embodiment of the present application cuts the text information into multiple pieces of sub-text information (each of which may be a sentence, a phrase, or a word) and marks a start and end time for each piece; according to that start and end time, the sub-voice information corresponding to the sub-text information is intercepted from the voice information (that is, the voice information is cut according to the sub-text information). For example, if the sentence "My account has been locked" ("我的账号被锁定了") in the text information was recognized from the 00:03 to 00:05 span of the voice information, that sentence is cut out as one piece of sub-text information with a start and end time of 00:03 to 00:05; correspondingly, the 00:03 to 00:05 span of the voice information is intercepted, giving the sub-voice information corresponding to this piece of sub-text information. By cutting the text information and the voice information in this way, multiple pairs of sub-text information and sub-voice information are obtained; editing each pair into reference voiceprint information in a predetermined format yields multiple pieces of reference voiceprint information for the same user.
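As a worked sketch of the time-based interception just described (the 00:03 to 00:05 span), the code below converts the marked start and end times into sample offsets and slices the voice information. The 16 kHz sample rate and the in-memory sample list are assumptions for illustration; the patent does not specify an audio format.

```python
# Slice one piece of sub-voice information out of the full voice
# information, given "mm:ss" start/end timestamps. A 16 kHz sample
# rate is an assumption used only for this sketch.
SAMPLE_RATE = 16000

def to_offset(timestamp):
    """Convert an "mm:ss" timestamp into a sample offset."""
    minutes, seconds = timestamp.split(":")
    return (int(minutes) * 60 + int(seconds)) * SAMPLE_RATE

def intercept(voice_samples, start, end):
    """Return the samples between the start and end timestamps."""
    return voice_samples[to_offset(start):to_offset(end)]
```

The intercepted span and its sub-text would then be edited together into one piece of reference voiceprint information.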
In the embodiment of the present application, editing sub-voice information and the corresponding sub-text information into reference voiceprint information may include: processing the sub-voice information into corresponding sub-voiceprint information and assigning it a file name, which may take the format "voiceprint number.file format suffix", such as 0989X.WAV; and storing the sub-voiceprint information together with the first user's identity identifier, the sub-text information and other information. The storage structure of a voiceprint library obtained by the above voiceprint information management method is shown in Table 1.
Table 1. Example voiceprint library storage structure
User ID     | User voiceprint number | Sub-text information                        | Sub-voiceprint information
139XXXXXXXX | 1                      | Very satisfied (非常满意)                    | 0989X.WAV
139XXXXXXXX | 2                      | Why is there still no refund (为什么还没有退款) | 0389X.WAV
189XXXXXXXX | 1                      | I am very angry (我很生气)                   | 0687X.WAV
189XXXXXXXX | 2                      | The account has been locked (账号被锁定)      | 0361X.WAV
In Table 1, each row corresponds to one piece of reference voiceprint information in the voiceprint library; the identity identifier (i.e., the user ID) serves as the primary key for querying and retrieving voiceprint information, and the user voiceprint number marks the count of reference voiceprint pieces under the same user ID. Taking the user ID "139XXXXXXXX" as an example: when an identity authentication request for this user ID is received, the voiceprint library is queried for the reference voiceprint information corresponding to "139XXXXXXXX", which may return multiple results; one is extracted at random as the reference for the current authentication, for example piece No. 2 of this user ID, whose sub-text information "Why has the refund not arrived" is output. The to-be-authenticated voice information produced when the user to be authenticated reads this sub-text information aloud is then received and processed into to-be-authenticated voiceprint information, which is compared against the extracted sub-voiceprint information "0389X.WAV". If the two match, authentication is judged successful, i.e., the user to be authenticated is taken to be the first user corresponding to "139XXXXXXXX"; conversely, if the two do not match, authentication is judged to have failed.
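This challenge-response flow over the Table 1 structure can be sketched as below; the in-memory dictionary stands in for the voiceprint library, and the compare callable stands in for whatever voiceprint-matching backend is used (its implementation is outside the application's scope):

```python
import random

# In-memory stand-in for the Table 1 voiceprint library.
VOICEPRINT_DB = {
    "139XXXXXXXX": [
        {"no": 1, "sub_text": "Very satisfied", "sub_voiceprint": "0989X.WAV"},
        {"no": 2, "sub_text": "Why has the refund not arrived",
         "sub_voiceprint": "0389X.WAV"},
    ],
}

def issue_challenge(user_id: str) -> dict:
    """Fetch a random reference piece and output its sub-text as the prompt."""
    piece = random.choice(VOICEPRINT_DB[user_id])
    print(piece["sub_text"])          # the user is asked to read this aloud
    return piece

def verify(piece: dict, reply_voiceprint: str, compare) -> bool:
    """Match the reply voiceprint against the stored sub-voiceprint."""
    return compare(reply_voiceprint, piece["sub_voiceprint"])
```

Randomly drawing the challenge per attempt is what prevents an attacker from pre-recording the expected phrase.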
As can be seen from the above technical solution, the embodiments of the present application filter the historical voice files stored by the system to obtain the first user's voice information; perform text recognition on that voice information to obtain the corresponding text information; split the recognized text information into multiple pieces of sub-text information and, according to each piece's start and end times, extract the corresponding sub-voice information from the voice information; and compile each pair of sub-text and sub-voice information into one piece of reference voiceprint information stored in the voiceprint library, so that every first user has multiple pieces of reference voiceprint information. When identity authentication is required, one piece is simply selected at random from those corresponding to the identity identifier to be authenticated. Because the reference voiceprint information obtained at authentication time is random, the specific content of the text the user to be authenticated will be asked to read aloud cannot be predicted; identity authentication based on the voiceprint library of this embodiment therefore ensures accurate authentication results and improves account security. In addition, in this embodiment every piece of sub-text information is short, which reduces the time needed to read the text aloud and the time consumed by voiceprint comparison, improving authentication efficiency.
The voiceprint information management method provided by the embodiments of the present application can not only create a new voiceprint library but also update an existing one, for example by adding reference voiceprint information for a new user or adding new reference voiceprint information for an existing user. For a new user, it suffices to obtain that user's historical voice files and perform steps S12 through S14, or steps S22 through S26, to obtain the corresponding reference voiceprint information. Because the historical voice files for a given user keep accumulating over time, new reference voiceprint information can be added for an existing user by obtaining the newly added historical voice files and performing the same steps.
Based on the voiceprint information management method provided by the embodiments of the present application, one or more pieces of reference voiceprint information may be set for a first user. When multiple pieces are set for the same first user, it must be ensured that no two of that user's pieces of reference voiceprint information contain the same text information. In practice, however, the following situations inevitably arise: different historical voice files yield text information with identical content, or the same text information is split into multiple pieces of sub-text information with identical content, so that the same sub-text information corresponds to multiple pieces of sub-voice information. In such cases, the embodiments of the present application store reference voiceprint information using the method shown in FIG. 3. For ease of description, assume the reference voiceprint information to be stored is first reference voiceprint information composed of first text information and first voice information; as shown in FIG. 3, the process of storing the first reference voiceprint information in the embodiments of the present application includes the following steps:
S31: Determine whether second reference voiceprint information satisfying the comparison condition exists; if it does, perform step S32, otherwise perform step S34.
Here, the comparison condition includes: the second text information corresponding to the second reference voiceprint information is identical to the first text information in the first reference voiceprint information, and the second identity identifier corresponding to the second reference voiceprint information is identical to the first identity identifier corresponding to the first reference voiceprint information.
S32: Determine whether the quality of the first voice information in the first reference voiceprint information is higher than that of the second voice information in the second reference voiceprint information; if so, perform step S33, otherwise perform step S35.
S33: Delete the second reference voiceprint information, and perform step S34.
S34: Store the first reference voiceprint information and the corresponding first identity identifier.
S35: Delete the first reference voiceprint information.
In step S31 above, the search scope for the second reference voiceprint information includes at least the reference voiceprint information already stored in the voiceprint library, and may also include reference voiceprint information generated at the same time as the first reference voiceprint information but not yet stored. If no such second reference voiceprint information exists, the first reference voiceprint information is stored directly. If second reference voiceprint information is found, at least two different pieces of voice information exist for the same first user and the same text information; in that case, the quality of the first voice information in the first reference voiceprint information is compared with that of the second voice information in the second reference voiceprint information. If the first voice information is of higher quality, the first reference voiceprint information is stored and the second is deleted; if the first voice information is of lower quality, the first reference voiceprint information is deleted directly. That is, for a given piece of text information only the highest-quality voice information is retained, which improves the accuracy of voice-information comparison during identity authentication and reduces comparison difficulty.
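Steps S31 through S35 amount to a keep-the-best-quality upsert keyed on the (identity identifier, text) pair. A minimal sketch, with the quality score left abstract (the application does not specify how voice quality is measured):

```python
def store_reference(library: dict, user_id: str, text: str,
                    voice: bytes, quality: float) -> None:
    """Keep, per (user_id, text), only the highest-quality voice information."""
    key = (user_id, text)
    existing = library.get(key)      # S31: look for a matching second piece
    if existing is None or quality > existing["quality"]:
        # S33/S34: replace the lower-quality piece (or store the first one)
        library[key] = {"voice": voice, "quality": quality}
    # else: S35, discard the new, lower-quality piece
```

Using the pair as the key also enforces, by construction, that no two stored pieces for the same user share the same text.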
Based on the above storage process, three ways of updating the voiceprint library can be realized: 1) adding reference voiceprint information for new users; 2) adding reference voiceprint information with different text information for existing users; and 3) replacing reference voiceprint information whose voice information is of lower quality with reference voiceprint information whose voice information is of higher quality.
As can be seen from the above technical solution, when a new piece of reference voiceprint information is obtained, the embodiments of the present application do not store it in the voiceprint library directly; instead, they first determine whether another piece of reference voiceprint information with the same text information and the same corresponding identity identifier is already stored. If one exists, the quality of the voice information in the two pieces is compared, the piece with the higher-quality voice information is retained, and the piece with the lower-quality voice information is deleted. The embodiments of the present application thus ensure not only that, among the stored reference voiceprint information, no two pieces corresponding to the same identity identifier (i.e., the same first user) contain the same text information, but also that the voice information corresponding to each text is of the highest available quality. When identity authentication is performed on this basis, comparing voiceprints against higher-quality voice information ensures authentication accuracy and improves authentication efficiency.
FIG. 4 is a structural block diagram of a voiceprint information management system provided by an embodiment of the present application; the system may be applied to an account management system. As shown in FIG. 4, the voiceprint information management system 100 includes a voice filter 110, a text recognizer 120, and a voiceprint generator 130.
The voice filter 110 is configured to obtain a historical voice file produced by a call between a first user and a second user, and to perform filtering on the historical voice file to obtain the first user's voice information.
The text recognizer 120 is configured to perform text recognition on the voice information to obtain the corresponding text information.
The voiceprint generator 130 is configured to compile the voice information and the corresponding text information into reference voiceprint information of the first user, and to store the reference voiceprint information and the first user's identity identifier.
As can be seen from the above structure, the embodiments of the present application filter the historical voice files stored by the relevant system to obtain the first user's voice information, obtain the corresponding text information through text recognition, and compile the voice information and the corresponding text information into the first user's reference voiceprint information. Because both the text information and the voice information in the reference voiceprint information are derived from the historical voice files rather than preset by the relevant system, they are non-public: neither the first user, nor the second user, nor any other user can predict the specific content of the text that must be read aloud during identity authentication, and therefore no one can record a matching sound file in advance or pass authentication by playing back a pre-recorded file. Compared with existing voiceprint-based identity authentication methods, identity authentication based on the voiceprint information management system provided by the embodiments of the present application therefore produces more accurate results, removes this security risk, and makes accounts more secure.
FIG. 5 is a structural block diagram of another voiceprint information management system provided by an embodiment of the present application; the system may be applied to an account management system. As shown in FIG. 5, the voiceprint information management system 200 includes a voice filter 210, a text recognizer 220, a text cutter 240, a voiceprint cutter 250, and a voiceprint generator 230.
The voice filter 210 is configured to obtain a historical voice file produced by a call between a first user and a second user, and to perform filtering on the historical voice file to obtain the first user's voice information.
The text recognizer 220 is configured to perform text recognition on the voice information to obtain the corresponding text information.
The text cutter 240 is configured to split the text information into multiple pieces of sub-text information and to mark the start and end times of each piece of sub-text information.
The voiceprint cutter 250 is configured to extract, from the voice information according to the start and end times of each piece of sub-text information, the corresponding sub-voice information.
The voiceprint generator 230 is configured to compile each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user, and to store each piece of reference voiceprint information together with the first user's identity identifier.
As can be seen from the above structure, the embodiments of the present application filter the historical voice files stored by the system to obtain the first user's voice information; perform text recognition on that voice information to obtain the corresponding text information; split the recognized text information into multiple pieces of sub-text information and, according to each piece's start and end times, extract the corresponding sub-voice information from the voice information; and compile each pair of sub-text and sub-voice information into one piece of reference voiceprint information stored in the voiceprint library, so that every first user has multiple pieces of reference voiceprint information. When identity authentication is required, one piece is simply selected at random from those corresponding to the identity identifier to be authenticated. Because the reference voiceprint information obtained at authentication time is random, the specific content of the text the user to be authenticated will be asked to read aloud cannot be predicted, a matching sound file cannot be recorded in advance, and authentication cannot be passed by playing back a pre-recorded file; identity authentication based on the voiceprint library of this embodiment therefore ensures accurate authentication results and improves account security. In addition, in this embodiment every piece of sub-text information is short, which reduces the time needed to read the text aloud and the time consumed by voiceprint comparison, improving authentication efficiency.
In the voiceprint information management systems provided by the embodiments of the present application, to realize the function of storing the reference voiceprint information and the first user's identity identifier, the voiceprint generator 130 and the voiceprint generator 230 may be configured to:
determine whether second reference voiceprint information exists whose corresponding second text information is identical to the first text information in the first reference voiceprint information to be stored, and whose corresponding second identity identifier is identical to the first identity identifier corresponding to the first reference voiceprint information;
if the second reference voiceprint information does not exist, directly store the first reference voiceprint information and the first identity identifier;
if the second reference voiceprint information exists, compare the quality of the first voice information in the first reference voiceprint information with that of the second voice information in the second reference voiceprint information, and directly delete the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
if the quality of the first voice information is higher than that of the second voice information, delete the second reference voiceprint information and store the first reference voiceprint information and the first identity identifier.
With a voiceprint generator configured as above, the embodiments of the present application ensure not only that no two pieces of stored reference voiceprint information corresponding to the same user contain the same text information, but also that the voice information corresponding to each text is of the highest available quality; when identity authentication is performed on this basis, comparing voiceprints against higher-quality voice information ensures authentication accuracy and improves authentication efficiency.
FIG. 6 is a flowchart of an identity authentication method provided by an embodiment of the present application; the method may be applied to an account management system. Referring to FIG. 6, the identity authentication method includes the following steps.
S41: Obtain a historical voice file produced by a call between a first user and a second user.
Here, the first user may be a registered user with a corresponding private account in the account management system, and the second user may accordingly be a service staff member of the account management system.
S42: Perform filtering on the historical voice file to obtain the first user's voice information.
S43: Perform text recognition on the voice information to obtain the corresponding text information.
S44: Compile the text information and the corresponding voice information into reference voiceprint information of the first user, and store the reference voiceprint information and the first user's identity identifier.
S45: Obtain the reference voiceprint information corresponding to the identity identifier of the user to be authenticated.
S46: Output the text information in the obtained reference voiceprint information, and receive the corresponding to-be-authenticated voice information.
S47: Match the voice information in the obtained reference voiceprint information against the to-be-authenticated voice information; if the match succeeds, judge that the user to be authenticated passes authentication, and if the match fails, judge that the user to be authenticated fails authentication.
As can be seen from the above method, the embodiments of the present application filter the historical voice files stored by the relevant system to obtain the first user's voice information, obtain the corresponding text information through text recognition, and compile the voice information and the corresponding text information into the first user's reference voiceprint information. Because both the text information and the voice information in the reference voiceprint information are derived from the historical voice files rather than preset by the relevant system, they are non-public: neither the first user, nor the second user, nor any other user can predict the specific content of the text that must be read aloud during identity authentication, and therefore no one can record a matching sound file in advance or pass authentication by playing back a pre-recorded file. Compared with existing voiceprint-based identity authentication methods, identity authentication based on the method provided by the embodiments of the present application therefore produces more accurate results, removes this security risk, and makes accounts more secure.
FIG. 7 is a flowchart of another identity authentication method provided by an embodiment of the present application; the method may be applied to an account management system. Referring to FIG. 7, the identity authentication method includes the following steps.
S51: Obtain a historical voice file produced by a call between a first user and a second user.
S52: Perform filtering on the historical voice file to obtain the first user's voice information.
S53: Perform text recognition on the voice information to obtain the corresponding text information.
S54: Split the text information into multiple pieces of sub-text information, and mark the start and end times of each piece of sub-text information.
S55: According to the start and end times of each piece of sub-text information, extract the corresponding sub-voice information from the voice information.
S56: Compile each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user, and store each piece of reference voiceprint information together with the first user's identity identifier.
S57: Obtain the reference voiceprint information corresponding to the identity identifier of the user to be authenticated.
S58: Output the sub-text information in the obtained reference voiceprint information, and receive the corresponding to-be-authenticated voice information.
S59: Match the sub-voice information in the obtained reference voiceprint information against the to-be-authenticated voice information; if the match succeeds, judge that the user to be authenticated passes authentication, and if the match fails, judge that the user to be authenticated fails authentication.
As can be seen from the above method, the embodiments of the present application split the recognized text information into multiple pieces of sub-text information, extract the corresponding sub-voice information according to their start and end times, and compile each piece of sub-text information and its corresponding sub-voice information into one piece of reference voiceprint information, so that the first user has multiple pieces of reference voiceprint information. When identity authentication is required, one piece is simply selected at random from those corresponding to the identity identifier to be authenticated. Because the reference voiceprint information obtained at authentication time is random, the specific content of the text the user to be authenticated will be asked to read aloud cannot be predicted, a matching sound file cannot be recorded in advance, and authentication cannot be passed by playing back a pre-recorded file; the identity authentication method provided by this embodiment therefore ensures accurate authentication results and improves account security. In addition, in this embodiment every piece of sub-text information is short, which reduces the time needed to read the text aloud and the time consumed by voiceprint comparison, improving authentication efficiency.
The identity authentication method provided by the embodiments of the present application may also store reference voiceprint information using the method shown in FIG. 3, which ensures not only that no two pieces of stored reference voiceprint information corresponding to the same user contain the same text information, but also that the voice information corresponding to each text is of the highest available quality; when identity authentication is performed on this basis, comparing voiceprints against higher-quality voice information ensures authentication accuracy and improves authentication efficiency.
FIG. 8 is a structural block diagram of an identity authentication system provided by an embodiment of the present application; the system may be applied to an account management system. Referring to FIG. 8, the identity authentication system 300 includes a voice filter 310, a text recognizer 320, a voiceprint generator 330, a voiceprint extractor 360, a recognition front-end 370, and a voiceprint matcher 380.
The voice filter 310 is configured to obtain a historical voice file produced by a call between a first user and a second user, and to perform filtering on the historical voice file to obtain the first user's voice information.
The text recognizer 320 is configured to perform text recognition on the voice information to obtain the corresponding text information.
The voiceprint generator 330 is configured to compile the voice information and the corresponding text information into reference voiceprint information of the first user, and to store the reference voiceprint information and the first user's identity identifier.
The voiceprint extractor 360 is configured to obtain the reference voiceprint information corresponding to the identity identifier of the user to be authenticated.
The recognition front-end 370 is configured to output the text information in the obtained reference voiceprint information, and to receive the corresponding to-be-authenticated voice information.
The voiceprint matcher 380 is configured to match the voice information in the obtained reference voiceprint information against the to-be-authenticated voice information; if the match succeeds, the user to be authenticated is judged to pass authentication, and if the match fails, the user to be authenticated is judged to fail authentication.
In the above structure, the recognition front-end 370 implements the interaction between the identity authentication system and the user to be authenticated. Besides outputting the text information in the reference voiceprint information obtained by the voiceprint extractor 360 and receiving the to-be-authenticated voice information input by the user, it may also receive the user's identity authentication request, trigger the voiceprint extractor 360 upon receiving that request, and output to the user the authentication result produced by the voiceprint matcher 380.
As can be seen from the above structure, the embodiments of the present application filter the historical voice files stored by the relevant system to obtain the first user's voice information, obtain the corresponding text information through text recognition, and compile the voice information and the corresponding text information into the first user's reference voiceprint information. Because both the text information and the voice information in the reference voiceprint information are derived from the historical voice files rather than preset by the relevant system, they are non-public: neither the first user, nor the second user, nor any other user can predict the specific content of the text that must be read aloud during identity authentication, and therefore no one can record a matching sound file in advance or pass authentication by playing back a pre-recorded file. Compared with existing voiceprint-based identity authentication methods, identity authentication based on the system provided by the embodiments of the present application therefore produces more accurate results, removes this security risk, and makes accounts more secure.
FIG. 9 is a structural block diagram of another identity authentication system provided by an embodiment of the present application; the system may be applied to an account management system. Referring to FIG. 9, the identity authentication system 400 includes a voice filter 410, a text recognizer 420, a text cutter 440, a voiceprint cutter 450, a voiceprint generator 430, a voiceprint extractor 460, a recognition front-end 470, and a voiceprint matcher 480.
The voice filter 410 is configured to obtain a historical voice file produced by a call between a first user and a second user, and to perform filtering on the historical voice file to obtain the first user's voice information.
The text recognizer 420 is configured to perform text recognition on the voice information to obtain the corresponding text information.
The text cutter 440 is configured to split the text information into multiple pieces of sub-text information and to mark the start and end times of each piece of sub-text information.
The voiceprint cutter 450 is configured to extract, from the voice information according to the start and end times of each piece of sub-text information, the corresponding sub-voice information.
The voiceprint generator 430 is configured to compile each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user, and to store each piece of reference voiceprint information together with the first user's identity identifier.
The voiceprint extractor 460 is configured to obtain the reference voiceprint information corresponding to the identity identifier of the user to be authenticated.
The recognition front-end 470 is configured to output the sub-text information in the obtained reference voiceprint information, and to receive the corresponding to-be-authenticated voice information.
The voiceprint matcher 480 is configured to match the sub-voice information in the obtained reference voiceprint information against the to-be-authenticated voice information; if the match succeeds, the user to be authenticated is judged to pass authentication, and if the match fails, the user to be authenticated is judged to fail authentication.
As can be seen from the above structure, the embodiments of the present application split the recognized text information into multiple pieces of sub-text information, extract the corresponding sub-voice information according to their start and end times, and compile each piece of sub-text information and its corresponding sub-voice information into one piece of reference voiceprint information, so that the first user has multiple pieces of reference voiceprint information. When identity authentication is required, the multiple pieces of reference voiceprint information corresponding to the user to be authenticated are determined according to that user's identity identifier, and one of them is selected at random for the current authentication. Because the reference voiceprint information obtained at authentication time is random, the specific content of the text the user to be authenticated will be asked to read aloud cannot be predicted, a matching sound file cannot be recorded in advance, and authentication cannot be passed by playing back a pre-recorded file; the identity authentication system provided by this embodiment therefore ensures accurate authentication results and improves account security. In addition, in this embodiment every piece of sub-text information is short, which reduces the time needed to read the text aloud and the time consumed by voiceprint comparison, improving authentication efficiency.
In the identity authentication systems provided by the embodiments of the present application, to realize the function of storing the reference voiceprint information and the corresponding user identity identifier, the voiceprint generator 330 and the voiceprint generator 430 may be configured to:
determine whether second reference voiceprint information exists whose corresponding second text information is identical to the first text information in the first reference voiceprint information to be stored, and whose corresponding second identity identifier is identical to the first identity identifier corresponding to the first reference voiceprint information;
if the second reference voiceprint information does not exist, directly store the first reference voiceprint information and the first user's identity identifier;
if the second reference voiceprint information exists, compare the quality of the first voice information in the first reference voiceprint information with that of the second voice information in the second reference voiceprint information, and delete the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information;
if the quality of the first voice information is higher than that of the second voice information, delete the second reference voiceprint information and store the first reference voiceprint information and the corresponding user identity identifier.
With a voiceprint generator configured as above, the embodiments of the present application ensure not only that no two pieces of stored reference voiceprint information corresponding to the same identity identifier contain the same text information, but also that the voice information corresponding to each text is of the highest available quality; when identity authentication is performed on this basis, comparing voiceprints against higher-quality voice information ensures authentication accuracy and improves authentication efficiency.
Other embodiments of the invention will readily occur to those skilled in the art upon consideration of the specification and practice of the invention disclosed herein. The present application is intended to cover any variations, uses, or adaptations of the invention that follow its general principles and include such departures from the present disclosure as come within known or customary practice in the art. The specification and examples are to be considered exemplary only, with the true scope and spirit of the invention being indicated by the following claims.
It is to be understood that the invention is not limited to the exact construction described above and illustrated in the accompanying drawings, and that various modifications and changes may be made without departing from its scope. The scope of the invention is limited only by the appended claims.

Claims (16)

  1. A voiceprint information management method, comprising:
    obtaining a historical voice file produced by a call between a first user and a second user;
    performing filtering on the historical voice file to obtain voice information of the first user;
    performing text recognition on the voice information to obtain text information corresponding to the voice information; and
    compiling the voice information and the corresponding text information into reference voiceprint information of the first user, and storing the reference voiceprint information and an identity identifier of the first user.
  2. The voiceprint information management method according to claim 1, further comprising:
    splitting the text information into multiple pieces of sub-text information, and marking start and end times of each piece of sub-text information; and
    extracting, from the voice information according to the start and end times of each piece of sub-text information, the sub-voice information corresponding to each piece of sub-text information.
  3. The voiceprint information management method according to claim 2, wherein compiling the voice information and the corresponding text information into the reference voiceprint information of the first user comprises:
    compiling each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
  4. The voiceprint information management method according to claim 1, wherein storing the reference voiceprint information and the identity identifier of the first user comprises:
    determining whether second reference voiceprint information exists whose corresponding second text information is identical to first text information in first reference voiceprint information to be stored, and whose corresponding second identity identifier is identical to a first identity identifier corresponding to the first reference voiceprint information;
    if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
    if the second reference voiceprint information exists, comparing the quality of first voice information in the first reference voiceprint information with that of second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information; and
    if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information and storing the first reference voiceprint information and the first identity identifier.
  5. A voiceprint information management system, comprising:
    a voice filter configured to obtain a historical voice file produced by a call between a first user and a second user, and to perform filtering on the historical voice file to obtain voice information of the first user;
    a text recognizer configured to perform text recognition on the voice information to obtain text information corresponding to the voice information; and
    a voiceprint generator configured to compile the voice information and the corresponding text information into reference voiceprint information of the first user, and to store the reference voiceprint information and an identity identifier of the first user.
  6. The voiceprint information management system according to claim 5, further comprising:
    a text cutter configured to split the text information into multiple pieces of sub-text information, and to mark start and end times of each piece of sub-text information; and
    a voiceprint cutter configured to extract, from the voice information according to the start and end times of each piece of sub-text information, the sub-voice information corresponding to each piece of sub-text information.
  7. The voiceprint information management system according to claim 6, wherein the voiceprint generator compiling the voice information and the corresponding text information into the reference voiceprint information of the first user comprises:
    compiling each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
  8. The voiceprint information management system according to claim 5, wherein the voiceprint generator storing the reference voiceprint information and the identity identifier of the first user comprises:
    determining whether second reference voiceprint information exists whose corresponding second text information is identical to first text information in first reference voiceprint information to be stored, and whose corresponding second identity identifier is identical to a first identity identifier corresponding to the first reference voiceprint information;
    if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
    if the second reference voiceprint information exists, comparing the quality of first voice information in the first reference voiceprint information with that of second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information; and
    if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information and storing the first reference voiceprint information and the first identity identifier.
  9. An identity authentication method, comprising:
    obtaining a historical voice file produced by a call between a first user and a second user;
    performing filtering on the historical voice file to obtain voice information of the first user;
    performing text recognition on the voice information to obtain text information corresponding to the voice information;
    compiling the voice information and the corresponding text information into reference voiceprint information of the first user, and storing the reference voiceprint information and an identity identifier of the first user;
    obtaining the reference voiceprint information corresponding to an identity identifier of a user to be authenticated;
    outputting the text information in the obtained reference voiceprint information, and receiving corresponding to-be-authenticated voice information; and
    matching the voice information in the obtained reference voiceprint information against the to-be-authenticated voice information, judging that the user to be authenticated passes authentication if the match succeeds, and judging that the user to be authenticated fails authentication if the match fails.
  10. The identity authentication method according to claim 9, further comprising:
    splitting the text information into multiple pieces of sub-text information, and marking start and end times of each piece of sub-text information; and
    extracting, from the voice information according to the start and end times of each piece of sub-text information, the sub-voice information corresponding to each piece of sub-text information.
  11. The identity authentication method according to claim 10, wherein compiling the voice information and the corresponding text information into the reference voiceprint information of the first user comprises:
    compiling each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
  12. The identity authentication method according to claim 9, wherein storing the reference voiceprint information and the identity identifier of the first user comprises:
    determining whether second reference voiceprint information exists whose corresponding second text information is identical to first text information in first reference voiceprint information to be stored, and whose corresponding second identity identifier is identical to a first identity identifier corresponding to the first reference voiceprint information;
    if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
    if the second reference voiceprint information exists, comparing the quality of first voice information in the first reference voiceprint information with that of second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information; and
    if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information and storing the first reference voiceprint information and the first identity identifier.
  13. An identity authentication system, comprising:
    a voice filter configured to obtain a historical voice file produced by a call between a first user and a second user, and to perform filtering on the historical voice file to obtain voice information of the first user;
    a text recognizer configured to perform text recognition on the voice information to obtain text information corresponding to the voice information;
    a voiceprint generator configured to compile the voice information and the corresponding text information into reference voiceprint information of the first user, and to store the reference voiceprint information and an identity identifier of the first user;
    a voiceprint extractor configured to obtain the reference voiceprint information corresponding to an identity identifier of a user to be authenticated;
    a recognition front-end configured to output the text information in the obtained reference voiceprint information, and to receive corresponding to-be-authenticated voice information; and
    a voiceprint matcher configured to match the voice information in the obtained reference voiceprint information against the to-be-authenticated voice information, to judge that the user to be authenticated passes authentication if the match succeeds, and to judge that the user to be authenticated fails authentication if the match fails.
  14. The identity authentication system according to claim 13, further comprising:
    a text cutter configured to split the text information into multiple pieces of sub-text information, and to mark start and end times of each piece of sub-text information; and
    a voiceprint cutter configured to extract, from the voice information according to the start and end times of each piece of sub-text information, the sub-voice information corresponding to each piece of sub-text information.
  15. The identity authentication system according to claim 14, wherein the voiceprint generator compiling the voice information and the corresponding text information into the reference voiceprint information of the first user comprises:
    compiling each pair of sub-voice information and sub-text information into one piece of reference voiceprint information of the first user.
  16. The identity authentication system according to claim 13, wherein the voiceprint generator storing the reference voiceprint information and the identity identifier of the first user comprises:
    determining whether second reference voiceprint information exists whose corresponding second text information is identical to first text information in first reference voiceprint information to be stored, and whose corresponding second identity identifier is identical to a first identity identifier corresponding to the first reference voiceprint information;
    if the second reference voiceprint information does not exist, directly storing the first reference voiceprint information and the first identity identifier;
    if the second reference voiceprint information exists, comparing the quality of first voice information in the first reference voiceprint information with that of second voice information in the second reference voiceprint information, and deleting the first reference voiceprint information if the quality of the first voice information is lower than that of the second voice information; and
    if the quality of the first voice information is higher than that of the second voice information, deleting the second reference voiceprint information and storing the first reference voiceprint information and the first identity identifier.
PCT/CN2015/091260 2014-10-10 2015-09-30 声纹信息管理方法、装置以及身份认证方法、系统 WO2016054991A1 (zh)

Priority Applications (5)

Application Number Priority Date Filing Date Title
JP2017518071A JP6671356B2 (ja) 2014-10-10 2015-09-30 声紋情報管理方法および声紋情報管理装置、ならびに本人認証方法および本人認証システム
KR1020177012683A KR20170069258A (ko) 2014-10-10 2015-09-30 성문 정보 관리 방법 및 장치, 및 신원 인증 방법 및 시스템
SG11201702919UA SG11201702919UA (en) 2014-10-10 2015-09-30 Voiceprint information management method and apparatus, and identity authentication method and system
EP15848463.4A EP3206205B1 (en) 2014-10-10 2015-09-30 Voiceprint information management method and device as well as identity authentication method and system
US15/484,082 US10593334B2 (en) 2014-10-10 2017-04-10 Method and apparatus for generating voiceprint information comprised of reference pieces each used for authentication

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201410532530.0A CN105575391B (zh) 2014-10-10 2014-10-10 声纹信息管理方法、装置以及身份认证方法、系统
CN201410532530.0 2014-10-10

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US15/484,082 Continuation US10593334B2 (en) 2014-10-10 2017-04-10 Method and apparatus for generating voiceprint information comprised of reference pieces each used for authentication

Publications (1)

Publication Number Publication Date
WO2016054991A1 true WO2016054991A1 (zh) 2016-04-14

Family

ID=55652587

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2015/091260 WO2016054991A1 (zh) 2014-10-10 2015-09-30 声纹信息管理方法、装置以及身份认证方法、系统

Country Status (8)

Country Link
US (1) US10593334B2 (zh)
EP (1) EP3206205B1 (zh)
JP (1) JP6671356B2 (zh)
KR (1) KR20170069258A (zh)
CN (1) CN105575391B (zh)
HK (1) HK1224074A1 (zh)
SG (2) SG11201702919UA (zh)
WO (1) WO2016054991A1 (zh)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10593334B2 (en) 2014-10-10 2020-03-17 Alibaba Group Holding Limited Method and apparatus for generating voiceprint information comprised of reference pieces each used for authentication
EP3611895A4 (en) * 2017-04-10 2020-04-08 Beijing Orion Star Technology Co., Ltd. METHOD AND DEVICE FOR USER REGISTRATION AND ELECTRONIC DEVICE
CN111862933A (zh) * 2020-07-20 2020-10-30 北京字节跳动网络技术有限公司 用于生成合成语音的方法、装置、设备和介质

Families Citing this family (28)

Publication number Priority date Publication date Assignee Title
CN106156583A (zh) * 2016-06-03 2016-11-23 深圳市金立通信设备有限公司 一种语音解锁的方法及终端
CN106549947A (zh) * 2016-10-19 2017-03-29 陆腾蛟 一种即时更新的声纹认证方法及系统
CN106782564B (zh) * 2016-11-18 2018-09-11 百度在线网络技术(北京)有限公司 用于处理语音数据的方法和装置
US10592649B2 (en) 2017-08-09 2020-03-17 Nice Ltd. Authentication via a dynamic passphrase
CN107564531A (zh) * 2017-08-25 2018-01-09 百度在线网络技术(北京)有限公司 基于声纹特征的会议记录方法、装置及计算机设备
US10490195B1 (en) * 2017-09-26 2019-11-26 Amazon Technologies, Inc. Using system command utterances to generate a speaker profile
CN107863108B (zh) * 2017-11-16 2021-03-23 百度在线网络技术(北京)有限公司 信息输出方法和装置
CN108121210A (zh) * 2017-11-20 2018-06-05 珠海格力电器股份有限公司 家电设备的权限分配方法和装置、存储介质、处理器
CN108257604B (zh) * 2017-12-08 2021-01-08 平安普惠企业管理有限公司 语音识别方法、终端设备及计算机可读存储介质
CN107871236B (zh) * 2017-12-26 2021-05-07 广州势必可赢网络科技有限公司 一种电子设备声纹支付方法及装置
KR102483834B1 (ko) 2018-01-17 2023-01-03 삼성전자주식회사 음성 명령을 이용한 사용자 인증 방법 및 전자 장치
CN111177329A (zh) * 2018-11-13 2020-05-19 奇酷互联网络科技(深圳)有限公司 一种智能终端的用户交互方法、智能终端及存储介质
CN111292733A (zh) * 2018-12-06 2020-06-16 阿里巴巴集团控股有限公司 一种语音交互方法和装置
CN110660398B (zh) * 2019-09-19 2020-11-20 北京三快在线科技有限公司 声纹特征更新方法、装置、计算机设备及存储介质
CN112580390B (zh) * 2019-09-27 2023-10-17 百度在线网络技术(北京)有限公司 基于智能音箱的安防监控方法、装置、音箱和介质
CN110970036B (zh) * 2019-12-24 2022-07-12 网易(杭州)网络有限公司 声纹识别方法及装置、计算机存储介质、电子设备
US11516197B2 (en) * 2020-04-30 2022-11-29 Capital One Services, Llc Techniques to provide sensitive information over a voice connection
CN111785280B (zh) * 2020-06-10 2024-09-10 北京三快在线科技有限公司 身份认证方法和装置、存储介质和电子设备
US11817113B2 (en) 2020-09-09 2023-11-14 Rovi Guides, Inc. Systems and methods for filtering unwanted sounds from a conference call
US11450334B2 (en) * 2020-09-09 2022-09-20 Rovi Guides, Inc. Systems and methods for filtering unwanted sounds from a conference call using voice synthesis
US12008091B2 (en) * 2020-09-11 2024-06-11 Cisco Technology, Inc. Single input voice authentication
US11522994B2 (en) 2020-11-23 2022-12-06 Bank Of America Corporation Voice analysis platform for voiceprint tracking and anomaly detection
CN112565242B (zh) * 2020-12-02 2023-04-07 携程计算机技术(上海)有限公司 基于声纹识别的远程授权方法、系统、设备及存储介质
US12020711B2 (en) * 2021-02-03 2024-06-25 Nice Ltd. System and method for detecting fraudsters
US20240054235A1 (en) * 2022-08-15 2024-02-15 Bank Of America Corporation Systems and methods for encrypting dialogue based data in a data storage system
CN115426632A (zh) * 2022-08-30 2022-12-02 上汽通用五菱汽车股份有限公司 语音传输方法、装置、车载主机以及存储介质
CN115565539B (zh) * 2022-11-21 2023-02-07 中网道科技集团股份有限公司 一种实现自助矫正终端防伪身份验证的数据处理方法
CN117059092B (zh) * 2023-10-11 2024-06-04 深圳普一同创科技有限公司 基于区块链的智慧医疗交互式智能分诊方法及系统

Citations (6)

Publication number Priority date Publication date Assignee Title
CN1547191A (zh) * 2003-12-12 2004-11-17 北京大学 结合语义和声纹信息的说话人身份确认系统
CN1852354A (zh) * 2005-10-17 2006-10-25 华为技术有限公司 收集用户行为特征的方法和装置
US7158776B1 (en) * 2001-09-18 2007-01-02 Cisco Technology, Inc. Techniques for voice-based user authentication for mobile access to network services
CN102708867A (zh) * 2012-05-30 2012-10-03 北京正鹰科技有限责任公司 一种基于声纹和语音的防录音假冒身份识别方法及系统
CN102760434A (zh) * 2012-07-09 2012-10-31 华为终端有限公司 一种声纹特征模型更新方法及终端
CN103258535A (zh) * 2013-05-30 2013-08-21 中国人民财产保险股份有限公司 基于声纹识别的身份识别方法及系统

Family Cites Families (27)

Publication number Priority date Publication date Assignee Title
JPH11344992A (ja) * 1998-06-01 1999-12-14 Ntt Data Corp 音声辞書作成方法、個人認証装置および記録媒体
US20040236699A1 (en) 2001-07-10 2004-11-25 American Express Travel Related Services Company, Inc. Method and system for hand geometry recognition biometrics on a fob
IL154733A0 (en) 2003-03-04 2003-10-31 Financial transaction authorization apparatus and method
JP4213716B2 (ja) 2003-07-31 2009-01-21 富士通株式会社 音声認証システム
US7386448B1 (en) 2004-06-24 2008-06-10 T-Netix, Inc. Biometric voice authentication
US8014496B2 (en) 2004-07-28 2011-09-06 Verizon Business Global Llc Systems and methods for providing network-based voice authentication
US7536304B2 (en) 2005-05-27 2009-05-19 Porticus, Inc. Method and system for bio-metric voice print authentication
US20060277043A1 (en) 2005-06-06 2006-12-07 Edward Tomes Voice authentication system and methods therefor
JP4755689B2 (ja) * 2005-07-27 2011-08-24 インターナショナル・ビジネス・マシーンズ・コーポレーション 正規受信者への安全なファイル配信のためのシステムおよび方法
JP4466572B2 (ja) * 2006-01-16 2010-05-26 コニカミノルタビジネステクノロジーズ株式会社 画像形成装置、音声コマンド実行プログラムおよび音声コマンド実行方法
CN1808567A (zh) 2006-01-26 2006-07-26 覃文华 验证真人在场状态的声纹认证设备和其认证方法
US8396711B2 (en) * 2006-05-01 2013-03-12 Microsoft Corporation Voice authentication system and method
US20080256613A1 (en) 2007-03-13 2008-10-16 Grover Noel J Voice print identification portal
US8775187B2 (en) 2008-09-05 2014-07-08 Auraya Pty Ltd Voice authentication system and methods
US8537978B2 (en) * 2008-10-06 2013-09-17 International Business Machines Corporation Method and system for using conversational biometrics and speaker identification/verification to filter voice streams
US8655660B2 (en) * 2008-12-11 2014-02-18 International Business Machines Corporation Method for dynamic learning of individual voice patterns
CN102404287A (zh) 2010-09-14 2012-04-04 盛乐信息技术(上海)有限公司 用数据复用法确定声纹认证阈值的声纹认证系统及方法
US9318114B2 (en) * 2010-11-24 2016-04-19 At&T Intellectual Property I, L.P. System and method for generating challenge utterances for speaker verification
CN102222502A (zh) * 2011-05-16 2011-10-19 上海先先信息科技有限公司 一种汉语随机提示声纹验证的有效方式
KR101304112B1 (ko) * 2011-12-27 2013-09-05 현대캐피탈 주식회사 음성 분리를 이용한 실시간 화자인식 시스템 및 방법
US10134401B2 (en) * 2012-11-21 2018-11-20 Verint Systems Ltd. Diarization using linguistic labeling
JP5646675B2 (ja) * 2013-03-19 2014-12-24 ヤフー株式会社 情報処理装置及び方法
US20140359736A1 (en) 2013-05-31 2014-12-04 Deviceauthority, Inc. Dynamic voiceprint authentication
CN103679452A (zh) * 2013-06-20 2014-03-26 腾讯科技(深圳)有限公司 支付验证方法、装置及系统
GB2517952B (en) * 2013-09-05 2017-05-31 Barclays Bank Plc Biometric verification using predicted signatures
US8812320B1 (en) * 2014-04-01 2014-08-19 Google Inc. Segment-based speaker verification using dynamically generated phrases
CN105575391B (zh) 2014-10-10 2020-04-03 阿里巴巴集团控股有限公司 声纹信息管理方法、装置以及身份认证方法、系统


Cited By (4)

Publication number Priority date Publication date Assignee Title
US10593334B2 (en) 2014-10-10 2020-03-17 Alibaba Group Holding Limited Method and apparatus for generating voiceprint information comprised of reference pieces each used for authentication
EP3611895A4 (en) * 2017-04-10 2020-04-08 Beijing Orion Star Technology Co., Ltd. METHOD AND DEVICE FOR USER REGISTRATION AND ELECTRONIC DEVICE
US11568876B2 (en) 2017-04-10 2023-01-31 Beijing Orion Star Technology Co., Ltd. Method and device for user registration, and electronic device
CN111862933A (zh) * 2020-07-20 2020-10-30 北京字节跳动网络技术有限公司 用于生成合成语音的方法、装置、设备和介质

Also Published As

Publication number Publication date
CN105575391A (zh) 2016-05-11
EP3206205B1 (en) 2020-01-15
JP6671356B2 (ja) 2020-03-25
EP3206205A1 (en) 2017-08-16
US10593334B2 (en) 2020-03-17
SG10201903085YA (en) 2019-05-30
US20170221488A1 (en) 2017-08-03
KR20170069258A (ko) 2017-06-20
SG11201702919UA (en) 2017-05-30
HK1224074A1 (zh) 2017-08-11
CN105575391B (zh) 2020-04-03
JP2017534905A (ja) 2017-11-24
EP3206205A4 (en) 2017-11-01

Similar Documents

Publication Publication Date Title
WO2016054991A1 (zh) 声纹信息管理方法、装置以及身份认证方法、系统
US10685657B2 (en) Biometrics platform
US10135818B2 (en) User biological feature authentication method and system
US10276168B2 (en) Voiceprint verification method and device
CN102985965B (zh) 声纹标识
CN105069874B (zh) 一种移动互联网声纹门禁系统及其实现方法
US20160014120A1 (en) Method, server, client and system for verifying verification codes
WO2019127897A1 (zh) 一种自学习声纹识别的更新方法和装置
CN110533288A (zh) 业务办理流程检测方法、装置、计算机设备和存储介质
US20130132091A1 (en) Dynamic Pass Phrase Security System (DPSS)
CN109036436A (zh) 一种声纹数据库建立方法、声纹识别方法、装置及系统
US11076043B2 (en) Systems and methods of voiceprint generation and use in enforcing compliance policies
CN106982344A (zh) 视频信息处理方法及装置
WO2016107415A1 (zh) 基于用户网络行为特征的辅助身份验证方法
KR101181060B1 (ko) 음성 인식 시스템 및 이를 이용한 화자 인증 방법
US20120330663A1 (en) Identity authentication system and method
US20140163986A1 (en) Voice-based captcha method and apparatus
US11705134B2 (en) Graph-based approach for voice authentication
KR102291113B1 (ko) 회의록 작성 장치 및 방법
KR20220166465A (ko) 다채널 수신기를 이용한 회의록 생성 시스템 및 방법
US10572636B2 (en) Authentication by familiar media fragments
Yakovlev et al. LRPD: Large Replay Parallel Dataset
Portêlo et al. Privacy-preserving query-by-example speech search
CN114125368B (zh) 会议音频的参会人关联方法、装置及电子设备
RU2628118C2 (ru) Способ формирования и использования инвертированного индекса аудиозаписи и машиночитаемый носитель информации

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 15848463; Country of ref document: EP; Kind code of ref document: A1)
ENP Entry into the national phase (Ref document number: 2017518071; Country of ref document: JP; Kind code of ref document: A)
NENP Non-entry into the national phase (Ref country code: DE)
WWE Wipo information: entry into national phase (Ref document number: 11201702919U; Country of ref document: SG)
ENP Entry into the national phase (Ref document number: 20177012683; Country of ref document: KR; Kind code of ref document: A)
REEP Request for entry into the european phase (Ref document number: 2015848463; Country of ref document: EP)