WO2019127897A1 - Update method and apparatus for self-learning voiceprint recognition (一种自学习声纹识别的更新方法和装置) - Google Patents

Update method and apparatus for self-learning voiceprint recognition

Info

Publication number: WO2019127897A1
Application number: PCT/CN2018/077535
Authority: WIPO (PCT)
Prior art keywords: voiceprint, verified, preset, voiceprint feature, voice
Other languages: English (en), French (fr)
Inventor: 陈书东
Original Assignee: 广州势必可赢网络科技有限公司
Application filed by 广州势必可赢网络科技有限公司
Publication of WO2019127897A1

Classifications

    • G — PHYSICS
    • G10 — MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L — SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 — Speaker identification or verification techniques
    • G10L17/02 — Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction

Definitions

  • the present invention relates to the field of voiceprint recognition technology, and in particular, to a method and apparatus for updating self-learning voiceprint recognition.
  • With the development of biometrics, voiceprint recognition has great application prospects in fields such as banking, smart home and mobile payment, owing to its convenience, stability and high security.
  • In voiceprint recognition, the user registers a certain amount of voice according to rules preset by the system, and records a verification voice anew for the system to perform identity recognition whenever verification is needed.
  • Existing voiceprint recognition technology registers a voiceprint once for long-term use. However, the human voice changes somewhat over time, a phenomenon called voiceprint drift, and voiceprint drift affects the accuracy of identity recognition.
  • Current voiceprint recognition technology therefore suffers from the technical problem of reduced accuracy due to voiceprint drift.
  • Embodiments of the present invention provide an update method and apparatus for self-learning voiceprint recognition, which solve the technical problem that current voiceprint recognition technology loses accuracy due to voiceprint drift.
  • The invention provides an update method for self-learning voiceprint recognition, comprising:
  • S1: receiving a verification instruction and a voice to be verified, performing voiceprint feature extraction on the voice to be verified according to the verification instruction to obtain a voiceprint feature to be verified, and comparing the voiceprint feature to be verified for similarity with the newest first preset comparison quantity of fused voiceprint features in the user voiceprint library to obtain a first preset comparison quantity of match values;
  • Step S2 specifically includes: determining whether the match values meet the requirements of the preset voiceprint evaluation standard; if so, the voiceprint verification passes and the voiceprint feature to be verified is added to the self-learning observation voiceprint feature library; if not, the voiceprint verification fails.
  • Step S3 specifically includes: determining whether a feedback instruction cancelling or reporting the verification instruction is received within the preset time; if not, adding the voiceprint feature to be verified to the user voiceprint library as a new material voiceprint feature; if so, removing the voiceprint feature to be verified from the self-learning observation voiceprint feature library.
  • Step S1 specifically includes:
  • S11: receiving a verification instruction and a voice to be verified;
  • S12: detecting whether the voice to be verified meets the requirements of the preset voice quality standard; if so, performing step S13; if not, the voiceprint verification fails;
  • S13: performing voiceprint feature extraction on the voice to be verified to obtain a voiceprint feature to be verified, and comparing it for similarity with the newest first preset comparison quantity of fused voiceprint features in the user voiceprint library to obtain a first preset comparison quantity of match values.
  • Before step S11, the method further comprises:
  • S01: receiving a registration instruction;
  • S02: receiving a registration voice;
  • S03: detecting whether the registration voice meets the preset voice quality standard; if so, proceeding to step S04; if not, prompting the user to continue inputting registration voice and returning to step S02;
  • S04: extracting the registered voiceprint feature of the registration voice and adding it to the user voiceprint library;
  • S05: determining whether the number of registered voiceprint features in the user voiceprint library equals the second preset fusion quantity; if so, executing step S06; if not, prompting the user to continue inputting registration voice and returning to step S02;
  • S06: selecting the second preset fusion quantity of registered voiceprint features for fusion to obtain a preset fused voiceprint feature.
  • The invention further provides an update apparatus for self-learning voiceprint recognition, comprising:
  • a first comparison unit configured to receive a verification instruction and a voice to be verified, perform voiceprint feature extraction on the voice to be verified according to the verification instruction to obtain a voiceprint feature to be verified, and compare the voiceprint feature to be verified for similarity with the newest first preset comparison quantity of fused voiceprint features in the user voiceprint library to obtain a first preset comparison quantity of match values;
  • a first determining unit configured to determine whether the matching value meets a requirement of a preset voiceprint evaluation standard, and if yes, the voiceprint verification passes and adds the to-be-verified voiceprint feature to the self-learning observation voiceprint feature library;
  • a second determining unit configured to determine whether a feedback instruction cancelling or reporting the verification instruction is received within a preset time, and if not, to add the voiceprint feature to be verified to the user voiceprint library as a new material voiceprint feature;
  • a first fusion unit configured to select the newest first preset fusion quantity of material voiceprint features in the user voiceprint library and fuse them to obtain a new fused voiceprint feature.
  • the first determining unit specifically includes:
  • the second determining unit specifically includes:
  • the first comparison unit specifically includes:
  • a first receiving subunit configured to receive a verification instruction and a voice to be verified
  • the first detecting subunit is configured to detect whether the voice to be verified meets the requirement of the preset voice quality standard, and if yes, trigger the first comparing subunit, and if not, the voiceprint verification fails;
  • a first comparison subunit configured to perform voiceprint feature extraction on the voice to be verified to obtain a voiceprint feature to be verified, and to compare it for similarity with the newest first preset comparison quantity of fused voiceprint features in the user voiceprint library to obtain a first preset comparison quantity of match values.
  • a voice registration unit is further included;
  • a second receiving subunit configured to receive a registration instruction
  • a third receiving subunit configured to receive a registration voice
  • a second detecting subunit configured to detect whether the registered voice meets a preset voice quality standard, and if yes, trigger a first extracting subunit, and if not, prompt the user to re-enter the registered voice and trigger the third receiving subunit;
  • a first extracting subunit configured to extract a registered voiceprint feature of the registered voice, and add the registered voiceprint feature to a user voiceprint library
  • a third determining subunit configured to determine whether the number of registered voiceprint features in the user voiceprint library equals the second preset fusion quantity, and if so, trigger the second fusion subunit; if not, prompt the user to continue inputting registration voice and trigger the third receiving subunit;
  • a second fusion subunit configured to select the second preset fusion quantity of registered voiceprint features and fuse them to obtain a preset fused voiceprint feature.
  • The present invention provides an update method for self-learning voiceprint recognition, comprising: S1: receiving a verification instruction and a voice to be verified, and performing voiceprint feature extraction on the voice to be verified according to the verification instruction to obtain a voiceprint feature to be verified, then comparing it for similarity with the newest first preset comparison quantity of fused voiceprint features in the user voiceprint library to obtain a first preset comparison quantity of match values; S2: determining whether the match values meet the requirements of the preset voiceprint evaluation standard, and if so, passing the voiceprint verification and adding the voiceprint feature to be verified to the self-learning observation voiceprint feature library; S3: determining whether a feedback instruction cancelling or reporting the verification instruction is received within the preset time, and if not, adding the voiceprint feature to be verified to the user voiceprint library as a new material voiceprint feature; S4: selecting the newest first preset fusion quantity of material voiceprint features in the user voiceprint library and fusing them to obtain a new fused voiceprint feature.
  • When the user initiates verification, the voice to be verified is received and the voiceprint feature to be verified is extracted; its match values against the newest fused voiceprint features are checked against the requirements of the preset voiceprint evaluation standard.
  • After verification passes, the voiceprint feature to be verified is added to the self-learning observation voiceprint feature library, and once no feedback instruction cancelling or reporting the verification instruction is received within the preset time, it is added to the user voiceprint library as a new material voiceprint feature.
  • This ensures that every material voiceprint feature eligible for fusion came from a verification operation initiated by the user himself.
  • Throughout the process, the newest material voiceprint features are fused into a new fused voiceprint feature, and the newest fused voiceprint feature is used to verify incoming voiceprint features.
  • Because the fused voiceprint feature is continually updated, it always matches the user's current voice, solving the technical problem that current voiceprint recognition technology loses accuracy due to voiceprint drift.
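The S1-S4 cycle above can be sketched in Python. Everything in this sketch is an illustrative assumption: `extract_features`, `similarity` and `fuse` stand in for the MFCC/PLDA/GMM-UBM-style algorithms the patent leaves open, and the class and parameter names are invented.

```python
from collections import deque

def extract_features(voice):
    # stand-in: treat the "voice" as a fixed-length feature vector directly
    return list(voice)

def similarity(a, b):
    # stand-in similarity in (0, 1]: inverse of mean absolute difference
    diff = sum(abs(x - y) for x, y in zip(a, b)) / len(a)
    return 1.0 / (1.0 + diff)

def fuse(features):
    # stand-in fusion: element-wise average of the selected features
    n = len(features)
    return [sum(col) / n for col in zip(*features)]

class SelfLearningVoiceprint:
    def __init__(self, fused_store, n_compare=3, n_fuse=5, threshold=0.8):
        self.fused = deque(fused_store, maxlen=n_compare)  # newest fused voiceprints
        self.materials = deque(maxlen=n_fuse)              # material voiceprint library
        self.observation = []                              # self-learning observation library
        self.n_fuse = n_fuse
        self.threshold = threshold

    def verify(self, voice):
        # S1: extract, then compare with the newest fused voiceprint features
        feat = extract_features(voice)
        matches = [similarity(feat, f) for f in self.fused]
        # S2: preset evaluation standard (here: mean match above a threshold)
        if sum(matches) / len(matches) < self.threshold:
            return False
        self.observation.append(feat)
        return True

    def promote(self, feat):
        # S3: no cancel/report feedback received, so promote to material library
        self.materials.append(feat)
        # S4: fuse the newest materials into a new fused voiceprint feature
        if len(self.materials) == self.n_fuse:
            self.fused.append(fuse(list(self.materials)))
```

The `deque(maxlen=...)` containers keep only the newest entries, mirroring the "newest first preset quantity" selections in the claims.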
  • FIG. 1 is a schematic flowchart of an embodiment of a method for updating a self-learning voiceprint recognition according to an embodiment of the present invention
  • FIG. 2 is a schematic flowchart of another embodiment of a method for updating self-learning voiceprint recognition according to an embodiment of the present invention;
  • FIG. 3 is a schematic structural diagram of an embodiment of an apparatus for updating a self-learning voiceprint recognition according to an embodiment of the present invention.
  • The embodiments of the invention provide an update method and apparatus for self-learning voiceprint recognition, which solve the technical problem that voiceprint drift affects the accuracy of identity recognition.
  • an embodiment of an update method for self-learning voiceprint recognition includes:
  • Step 101: Receive a verification instruction and a voice to be verified, perform voiceprint feature extraction on the voice to be verified according to the verification instruction to obtain a voiceprint feature to be verified, and compare the voiceprint feature to be verified for similarity with the newest first preset comparison quantity of fused voiceprint features in the user voiceprint library to obtain a first preset comparison quantity of match values;
  • The voiceprint feature is extracted according to the verification instruction to obtain the voiceprint feature to be verified, and a similarity algorithm compares the voiceprint feature to be verified with the newest first preset comparison quantity of fused voiceprint features in the user voiceprint library to obtain a first preset comparison quantity of match values;
  • Feature extraction algorithms include: MFCC algorithm, FBank algorithm and D-vector algorithm;
  • the similarity algorithm includes: SVM algorithm, Cosine Distance (CDS) algorithm, LDA algorithm and PLDA algorithm;
  • An appropriate feature extraction algorithm and similarity algorithm are selected as needed for fusion and calculation.
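As a concrete illustration of the similarity comparison in step 101, here is a minimal cosine distance scoring (CDS) sketch in pure Python. The fixed-length feature vectors and the `match_values` helper name are assumptions for illustration, since the patent leaves the feature representation open.

```python
import math

def cosine_score(a, b):
    # cosine distance scoring (CDS) between two voiceprint feature vectors
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def match_values(candidate, fused_library, n_compare):
    # compare the feature to be verified against the newest n_compare
    # fused voiceprint features, yielding one match value per comparison
    return [cosine_score(candidate, fused) for fused in fused_library[-n_compare:]]
```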
  • Step 102 Determine whether the matching value of the first preset comparison quantity meets the requirement of the preset voiceprint evaluation standard, and if so, the voiceprint verification passes and adds the voiceprint feature to be verified to the self-learning observation voiceprint feature library;
  • If the match values of the first preset comparison quantity meet the requirements of the preset voiceprint evaluation standard, the voiceprint verification passes and the voiceprint feature to be verified is added to the self-learning observation voiceprint feature library;
  • The preset voiceprint evaluation standard includes: the average of the first preset comparison quantity of match values is greater than a preset threshold, 90% of the first preset comparison quantity of match values are greater than a preset threshold, or the maximum match value is greater than a preset threshold;
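The three variants of the evaluation standard just listed can be sketched directly; the function name and the `mode` switch are illustrative assumptions, not terminology from the patent.

```python
def meets_evaluation_standard(matches, threshold, mode="mean"):
    # Three variants of the preset voiceprint evaluation standard:
    #   "mean"  - average match value greater than the preset threshold
    #   "ratio" - at least 90% of match values greater than the threshold
    #   "max"   - maximum match value greater than the threshold
    if mode == "mean":
        return sum(matches) / len(matches) > threshold
    if mode == "ratio":
        return sum(1 for m in matches if m > threshold) / len(matches) >= 0.9
    if mode == "max":
        return max(matches) > threshold
    raise ValueError("unknown evaluation mode: %s" % mode)
```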
  • Step 103: Determine whether a feedback instruction cancelling or reporting the verification instruction is received within the preset time, and if not, add the voiceprint feature to be verified to the user voiceprint library as a new material voiceprint feature;
  • After the voiceprint feature to be verified is added to the self-learning observation voiceprint feature library, it is observed whether a feedback instruction cancelling or reporting the verification instruction is received within the preset time. If not, the verification operation corresponding to the voiceprint feature satisfying the learning condition was initiated by the user himself, and the voiceprint feature to be verified is added to the user voiceprint library as a new material voiceprint feature;
  • Observing the voiceprint features that meet the learning condition ensures that the verification operation was initiated by the user himself, protecting the security and reliability of the material voiceprint features. For example, if someone else obtains the user's voiceprint by some means and passes verification on the user's mobile phone, the user can report that verification in time upon discovering it, preserving the security and reliability of the user voiceprint library.
  • Step 104: Select the newest first preset fusion quantity of material voiceprint features in the user voiceprint library and fuse them to obtain a new fused voiceprint feature.
  • After the voiceprint feature to be verified is added to the user voiceprint library as a new material voiceprint feature, the newest first preset fusion quantity of material voiceprint features are selected and fused to obtain a new fused voiceprint feature.
  • Voiceprint feature fusion algorithms include: the GMM-UBM algorithm, the DNN i-vector algorithm and the JFA algorithm;
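A real implementation would update a GMM-UBM, DNN i-vector or JFA model as named above; as a hedged stand-in, fusing fixed-length embeddings by element-wise averaging illustrates the idea of producing one fused feature from several materials.

```python
def fuse_voiceprints(features):
    # Stand-in fusion: element-wise mean of equal-length material
    # voiceprint embeddings. Actual systems would refit/adapt a
    # GMM-UBM, i-vector or JFA model instead of averaging.
    n = len(features)
    return [sum(column) / n for column in zip(*features)]
```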
  • In this embodiment, the voice to be verified is received and the voiceprint feature to be verified is extracted, and its match values against the newest fused voiceprint features are checked against the requirements of the preset voiceprint evaluation standard.
  • After verification passes, the voiceprint feature to be verified is added to the self-learning observation voiceprint feature library; once it is determined that no feedback instruction cancelling or reporting the verification instruction was received within the preset time, the voiceprint feature to be verified is added to the user voiceprint library.
  • This ensures that every material voiceprint feature satisfying the fusion condition came from a verification operation initiated by the user himself. The newest material voiceprint features in the user voiceprint library are then selected and fused to obtain a new fused voiceprint feature.
  • Throughout the process, the newest material voiceprint features produce the new fused voiceprint feature, and the newest fused voiceprint feature is used to verify incoming voiceprint features; because the fused voiceprint feature keeps being updated, it always matches the user's current voice, solving the technical problem that current voiceprint recognition technology loses accuracy due to voiceprint drift.
  • the above is an embodiment of the method for updating the self-learning voiceprint recognition provided by the embodiment of the present invention.
  • the following is another embodiment of the method for updating the self-learning voiceprint recognition provided by the embodiment of the present invention.
  • another embodiment of a method for updating self-learning voiceprint recognition includes:
  • Step 201 Receive a registration instruction.
  • Step 202 Receive a registration voice.
  • the registered voice may be recorded live through a recording device such as a microphone, or may be a recorded voice audio.
  • Step 203: Detect whether the registration voice meets the preset voice quality standard; if so, execute step 204; if not, execute step 205;
  • Preset voice quality standards include: preset signal-to-noise ratio standard, preset volume standard, and preset effective duration standard;
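A minimal sketch of such a quality gate, assuming the recording arrives as a list of normalized samples. The noise-floor estimate (quietest decile of frame energies) and the default thresholds are illustrative assumptions, not values from the patent.

```python
import math

def check_voice_quality(samples, sample_rate,
                        min_snr_db=15.0, min_rms=0.01, min_duration_s=2.0):
    # Gate a recording on the three preset voice quality standards:
    # effective duration, volume (RMS) and signal-to-noise ratio.
    duration = len(samples) / sample_rate
    if duration < min_duration_s:
        return False
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms < min_rms:
        return False
    # crude SNR: frame energies, quietest decile taken as the noise floor
    frame = max(1, sample_rate // 100)
    energies = sorted(
        sum(s * s for s in samples[i:i + frame]) / frame
        for i in range(0, len(samples) - frame + 1, frame)
    )
    noise = max(energies[: max(1, len(energies) // 10)][-1], 1e-12)
    snr_db = 10.0 * math.log10(energies[-1] / noise)
    return snr_db >= min_snr_db
```

A production system would typically use a voice activity detector for the effective-duration check rather than raw length.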
  • Step 204 Extract the registered voiceprint feature of the registered voice, and add the registered voiceprint feature to the user voiceprint library;
  • the registered voiceprint feature of the registered voice is extracted, and the registered voiceprint feature is added to the user voiceprint library;
  • Feature extraction algorithms include: MFCC algorithm, FBank algorithm and D-vector algorithm.
  • Step 205 prompt the user to continue to input the registration voice and return to step 202;
  • When the registration voice is detected not to meet the preset voice quality standard, the user is prompted to continue inputting registration voice, and the process returns to step 202 to continue the registration operation.
  • Step 206: Determine whether the number of registered voiceprint features in the user voiceprint library equals the second preset fusion quantity; if so, execute step 207; if not, execute step 205;
  • If the number of registered voiceprint features has reached the second preset fusion quantity, fusion can proceed in step 207; otherwise step 205 prompts the user for more registration voice.
  • Step 207: Select the second preset fusion quantity of registered voiceprint features and fuse them to obtain a preset fused voiceprint feature.
  • The second preset fusion quantity of registered voiceprint features are selected and fused to obtain the preset fused voiceprint feature;
  • Voiceprint feature fusion algorithms include: the GMM-UBM algorithm, the DNN i-vector algorithm and the JFA algorithm;
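The registration flow of steps 201-207 can be sketched as a small driver; `check_quality`, `extract` and `fuse` are injected stand-ins for the quality standard, feature extraction and fusion algorithms named above, and the function name is invented.

```python
def register(voices, n_fuse, check_quality, extract, fuse):
    # Collect registration voices until n_fuse voiceprint features
    # pass the quality gate, then fuse them into the preset fused
    # voiceprint feature (steps 203-207).
    library = []
    for voice in voices:
        if not check_quality(voice):
            continue                         # step 205: ask the user to record again
        library.append(extract(voice))       # step 204: add to the voiceprint library
        if len(library) == n_fuse:           # step 206: enough features collected?
            return library, fuse(library)    # step 207: preset fused voiceprint feature
    raise ValueError("not enough registration voices passed the quality standard")
```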
  • Step 208 Receive a verification instruction and a voice to be verified
  • When the user needs to verify, the user clicks the verification option, and the system receives the verification instruction and the voice to be verified;
  • the verification voice can be recorded live through a recording device such as a microphone, or it can be a recorded voice audio.
  • Step 209: According to the verification instruction, detect whether the voice to be verified meets the requirements of the preset voice quality standard; if so, perform step 210; if not, perform step 211;
  • Preset voice quality standards include: preset signal to noise ratio standard, preset volume standard, and preset effective duration standard;
  • Step 210: Perform voiceprint feature extraction on the voice to be verified to obtain a voiceprint feature to be verified, and compare it for similarity with the newest first preset comparison quantity of fused voiceprint features in the user voiceprint library to obtain a first preset comparison quantity of match values;
  • When the voice to be verified meets the requirements of the preset voice quality standard, voiceprint features are extracted to obtain the voiceprint feature to be verified, and a similarity algorithm compares it with the newest first preset comparison quantity of fused voiceprint features in the user voiceprint library to obtain a first preset comparison quantity of match values;
  • Feature extraction algorithms include: MFCC algorithm, FBank algorithm and D-vector algorithm;
  • the similarity algorithm includes: SVM algorithm, Cosine Distance (CDS) algorithm, LDA algorithm and PLDA algorithm;
  • An appropriate feature extraction algorithm and similarity algorithm are selected as needed for fusion and calculation.
  • Step 211 prompting the user that the voiceprint verification fails
  • When, according to the verification instruction, the voice to be verified is detected not to meet the requirements of the preset voice quality standard, the user is prompted that the voiceprint verification fails.
  • Step 212 Determine whether the matching value of the first preset comparison quantity meets the requirements of the preset voiceprint evaluation standard, and if yes, execute step 213, and if not, execute step 211;
  • If the match values meet the requirements, the voice corresponding to the voiceprint feature to be verified is the user's own voice, and step 213 is performed; if not, the voice to be verified may not be the user's voice, and the voiceprint verification fails;
  • The preset voiceprint evaluation standard includes: the average of the first preset comparison quantity of match values is greater than a preset threshold, 90% of the first preset comparison quantity of match values are greater than a preset threshold, or the maximum match value is greater than a preset threshold;
  • Step 213 Voiceprint verification passes and adds the voiceprint feature to be verified to the self-learning observation voiceprint feature library
  • voiceprint verification is passed, and the voiceprint feature to be verified is added to the self-learning observation voiceprint feature library.
  • Step 214: Determine whether a feedback instruction cancelling or reporting the verification instruction is received within the preset time; if not, proceed to step 215; if so, proceed to step 216;
  • Within the preset time, the system watches for a feedback instruction cancelling or reporting the verification instruction; the absence of such feedback leads to step 215, while its presence leads to step 216.
  • Step 215 Add the voiceprint feature to be verified as a new material voiceprint feature to the user voiceprint library
  • If no feedback instruction is received, the verification operation corresponding to the voiceprint feature satisfying the learning condition was initiated by the user himself, and the voiceprint feature to be verified is added to the user voiceprint library as a new material voiceprint feature.
  • Step 216: Remove the voiceprint feature to be verified from the self-learning observation voiceprint feature library;
  • If a feedback instruction is received, the verification operation corresponding to the voiceprint feature satisfying the learning condition was not initiated by the user himself, and the voiceprint feature to be verified is removed from the self-learning observation voiceprint feature library;
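Steps 214-216 amount to an observation library with a feedback window. The sketch below uses invented names (`ObservationLibrary`, `promote_expired`) and an explicit `now` parameter for testability; the patent does not prescribe this structure.

```python
import time

class ObservationLibrary:
    def __init__(self, window_seconds):
        self.window = window_seconds
        self.pending = {}  # feature id -> (feature, time it entered observation)

    def observe(self, feature_id, feature, now=None):
        # step 213: a verified feature enters the observation library
        self.pending[feature_id] = (feature, time.time() if now is None else now)

    def report(self, feature_id):
        # step 216: a cancel/report feedback instruction removes the feature
        self.pending.pop(feature_id, None)

    def promote_expired(self, material_library, now=None):
        # step 215: features whose feedback window elapsed without a
        # cancel/report become new material voiceprint features
        now = time.time() if now is None else now
        for fid in list(self.pending):
            feature, observed_at = self.pending[fid]
            if now - observed_at >= self.window:
                material_library.append(feature)
                del self.pending[fid]
```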
  • Step 217: Select the newest first preset fusion quantity of material voiceprint features in the user voiceprint library and fuse them to obtain a new fused voiceprint feature.
  • After the voiceprint feature to be verified is added to the user voiceprint library as a new material voiceprint feature, the newest first preset fusion quantity of material voiceprint features are selected and fused to obtain a new fused voiceprint feature.
  • Voiceprint feature fusion algorithms include: the GMM-UBM algorithm, the DNN i-vector algorithm and the JFA algorithm;
  • In this embodiment, the voice to be verified is received and the voiceprint feature to be verified is extracted, and its match values against the newest fused voiceprint features are checked against the requirements of the preset voiceprint evaluation standard.
  • After verification passes, the voiceprint feature to be verified is added to the self-learning observation voiceprint feature library; once it is determined that no feedback instruction cancelling or reporting the verification instruction was received within the preset time, the voiceprint feature to be verified is added to the user voiceprint library.
  • This ensures that every material voiceprint feature satisfying the fusion condition came from a verification operation initiated by the user himself, and the newest material voiceprint features in the user voiceprint library are selected and fused to obtain a new fused voiceprint feature.
  • Throughout the process, the newest material voiceprint features produce the new fused voiceprint feature, and the newest fused voiceprint feature is used to verify incoming voiceprint features, solving the technical problem that current voiceprint recognition technology loses accuracy due to voiceprint drift.
  • the above is another embodiment of the method for updating the self-learning voiceprint recognition provided by the embodiment of the present invention.
  • the following is an embodiment of the apparatus for updating the self-learning voiceprint recognition provided by the embodiment of the present invention.
  • The first comparison unit 301 is configured to receive the verification instruction and the voice to be verified, perform voiceprint feature extraction on the voice to be verified according to the verification instruction to obtain a voiceprint feature to be verified, and compare it for similarity with the newest first preset comparison quantity of fused voiceprint features in the user voiceprint library to obtain a first preset comparison quantity of match values;
  • The first determining unit 302 is configured to determine whether the match values of the first preset comparison quantity meet the requirements of the preset voiceprint evaluation standard, and if so, pass the voiceprint verification and add the voiceprint feature to be verified to the self-learning observation voiceprint feature library;
  • the second determining unit 303 is configured to determine whether a feedback instruction for canceling or reporting the verification instruction is received within the preset time, and if not, adding the voiceprint feature to be verified as a new material voiceprint feature to the user voiceprint library;
  • The first fusion unit 304 is configured to select the newest first preset fusion quantity of material voiceprint features in the user voiceprint library and fuse them to obtain a new fused voiceprint feature.
  • the first determining unit 302 specifically includes:
  • the second determining unit 303 specifically includes:
  • If no feedback instruction is received, the voiceprint feature to be verified is added to the user voiceprint library as a new material voiceprint feature; if one is received, the voiceprint feature to be verified is removed from the self-learning observation voiceprint feature library.
  • The first comparison unit 301 specifically includes:
  • a first receiving subunit 3011 configured to receive a verification instruction and a voice to be verified
  • the first detecting subunit 3012 is configured to detect whether the voice to be verified meets the requirement of the preset voice quality standard, and if yes, trigger the first comparing subunit 3013, and if not, the voiceprint verification fails;
  • The first comparison subunit 3013 is configured to perform voiceprint feature extraction on the voice to be verified to obtain a voiceprint feature to be verified, and to compare it for similarity with the newest first preset comparison quantity of fused voiceprint features in the user voiceprint library to obtain a first preset comparison quantity of match values.
  • a voice registration unit 300 is further included;
  • a second receiving subunit 3001 configured to receive a registration instruction
  • a third receiving subunit 3002 configured to receive a registration voice
  • The second detecting subunit 3003 is configured to detect whether the registration voice meets the preset voice quality standard, and if so, trigger the first extracting subunit 3004; if not, prompt the user to re-enter the registration voice and trigger the third receiving subunit 3002;
  • a first extracting subunit 3004 configured to extract a registered voiceprint feature of the registered voice, and add the registered voiceprint feature to the user voiceprint library;
  • The third determining subunit 3005 is configured to determine whether the number of registered voiceprint features in the user voiceprint library equals the second preset fusion quantity, and if so, trigger the second fusion subunit 3006; if not, prompt the user to continue inputting registration voice and trigger the third receiving subunit 3002;
  • The second fusion subunit 3006 is configured to select the second preset fusion quantity of registered voiceprint features and fuse them to obtain a preset fused voiceprint feature.
  • the disclosed system, apparatus, and method may be implemented in other manners.
  • the device embodiments described above are merely illustrative.
  • The division into modules is only a logical functional division; in actual implementation there may be other divisions, and multiple modules or components may be combined or integrated into another system, or some features may be ignored or not executed.
  • the mutual coupling or direct coupling or communication connection shown or discussed may be an indirect coupling or communication connection through some interface, device or module, and may be electrical, mechanical or otherwise.
  • the modules described as separate components may or may not be physically separate.
  • the components displayed as modules may or may not be physical modules, that is, may be located in one place, or may be distributed to multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the embodiment.
  • each functional module in each embodiment of the present invention may be integrated into one processing module, or each module may exist physically separately, or two or more modules may be integrated into one module.
  • the above integrated modules can be implemented in the form of hardware or in the form of software functional modules.
  • An integrated module if implemented as a software functional module and sold or used as a standalone product, can be stored in a computer readable storage medium.
  • The technical solution of the present invention, in essence or in the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product stored in a storage medium.
  • a number of instructions are included to cause a computer device (which may be a personal computer, server, or network device, etc.) to perform all or part of the steps of the various embodiments of the present invention.
  • The foregoing storage medium includes: a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disk, and the like.


Abstract

The present invention discloses a self-learning update method and apparatus for voiceprint recognition. In the present invention, when a user chooses to verify, a voice to be verified is received and a voiceprint feature to be verified is extracted. If the match values between the feature to be verified and the most recent fused voiceprint features meet the requirements of a preset voiceprint evaluation standard, the voiceprint verification passes; after the feature to be verified is determined to satisfy the fusion conditions, it is used as a material voiceprint feature, ensuring that material voiceprint features satisfying the fusion conditions come from verification operations initiated by the user in person. The most recent material voiceprint features are then selected and fused to obtain a new fused voiceprint feature. Throughout the process, the most recent material voiceprint features are fused into new fused voiceprint features, while the most recent fused voiceprint features are used to verify incoming features, solving the technical problem that current voiceprint recognition technology loses accuracy due to voiceprint drift.

Description

A self-learning update method and apparatus for voiceprint recognition
This application claims priority to Chinese patent application No. 201711477151.6, entitled "A self-learning update method and apparatus for voiceprint recognition", filed with the China Patent Office on December 29, 2017, the entire contents of which are incorporated herein by reference.
Technical Field
The present invention relates to the field of voiceprint recognition technology, and in particular to a self-learning update method and apparatus for voiceprint recognition.
Background
With the development of biometrics, voiceprint recognition has broad application prospects in banking, smart home, mobile payment, and other fields, owing to its convenience, stability, and high security. In voiceprint recognition, a user registers a certain amount of speech according to rules preset by the system; when verification is needed, the user records a new verification voice and submits it to the system for identity recognition.
Existing voiceprint recognition technology registers a voiceprint once and uses it long-term. However, a person's voice changes over time, a phenomenon known as voiceprint drift, which degrades the accuracy of identity recognition.
This leads to the technical problem that current voiceprint recognition technology loses accuracy due to voiceprint drift.
Summary of the Invention
Embodiments of the present invention provide a self-learning update method and apparatus for voiceprint recognition, solving the technical problem that current voiceprint recognition technology loses accuracy due to voiceprint drift.
The present invention provides a self-learning update method for voiceprint recognition, comprising:
S1. receiving a verification instruction and a voice to be verified, extracting a voiceprint feature from the voice to be verified according to the verification instruction to obtain a voiceprint feature to be verified, and comparing the similarity of the voiceprint feature to be verified with a first preset comparison number of the most recent fused voiceprint features in a user voiceprint library to obtain a first preset comparison number of match values;
S2. determining whether the match values meet the requirements of a preset voiceprint evaluation standard, and if so, passing the voiceprint verification and adding the voiceprint feature to be verified to a self-learning observation voiceprint feature library;
S3. determining whether a feedback instruction revoking or reporting the verification instruction is received within a preset time, and if not, adding the voiceprint feature to be verified to the user voiceprint library as a new material voiceprint feature;
S4. selecting the most recent first preset fusion number of material voiceprint features in the user voiceprint library and fusing them to obtain a new fused voiceprint feature.
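As a reading aid, steps S1–S4 can be sketched in code. Everything below — the function names, the in-memory lists standing in for the voiceprint libraries, and the pluggable `extract`/`similarity`/`evaluate`/`fuse` callables — is an illustrative assumption, not the patent's actual implementation:

```python
def handle_verification(voice, fused_history, extract, similarity,
                        evaluate, compare_count, observation_store):
    # S1: extract the feature to be verified and score it against the
    #     most recent `compare_count` fused voiceprint features
    #     (fused_history is ordered oldest -> newest).
    feature = extract(voice)
    scores = [similarity(feature, f) for f in fused_history[-compare_count:]]
    # S2: apply the preset voiceprint evaluation standard.
    if not evaluate(scores):
        return False  # verification fails
    observation_store.append(feature)  # observe for revoke/report feedback
    return True

def commit_after_observation(feature, revoked, user_library,
                             fused_history, fuse, fusion_count,
                             observation_store):
    # S3: after the preset observation time, drop the feature if a
    #     revoke/report feedback instruction arrived; otherwise it
    #     becomes a new material voiceprint feature.
    observation_store.remove(feature)
    if revoked:
        return
    user_library.append(feature)
    # S4: fuse the most recent `fusion_count` material features into
    #     a new fused voiceprint feature.
    fused_history.append(fuse(user_library[-fusion_count:]))
```

In this sketch the features are opaque values; a real system would use fixed-length embeddings and persistent storage, and would schedule `commit_after_observation` only after the preset observation time elapses.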
Preferably, step S2 specifically comprises:
determining whether the match values meet the requirements of the preset voiceprint evaluation standard; if so, the voiceprint verification passes and the voiceprint feature to be verified is added to the self-learning observation voiceprint feature library; if not, the voiceprint verification fails.
Preferably, step S3 specifically comprises:
determining whether a feedback instruction revoking or reporting the verification instruction is received within the preset time; if not, adding the voiceprint feature to be verified to the user voiceprint library as a new material voiceprint feature; if so, removing the voiceprint feature to be verified from the self-learning observation voiceprint feature library.
Preferably, step S1 specifically comprises:
S11. receiving a verification instruction and a voice to be verified;
S12. detecting whether the voice to be verified meets the requirements of a preset voice quality standard; if so, executing step S13; if not, the voiceprint verification fails;
S13. extracting a voiceprint feature from the voice to be verified to obtain a voiceprint feature to be verified, and comparing its similarity with the first preset comparison number of the most recent fused voiceprint features in the user voiceprint library to obtain the first preset comparison number of match values.
Preferably, before step S11 the method further comprises:
S01. receiving a registration instruction;
S02. receiving a registration voice;
S03. detecting whether the registration voice satisfies the preset voice quality standard; if so, executing step S04; if not, prompting the user to continue inputting registration voice and returning to step S02;
S04. extracting a registration voiceprint feature from the registration voice and adding the registration voiceprint feature to the user voiceprint library;
S05. determining whether the number of registration voiceprint features in the user voiceprint library equals a second preset fusion number; if so, executing step S06; if not, prompting the user to continue inputting registration voice and returning to step S02;
S06. selecting the second preset fusion number of registration voiceprint features and fusing them to obtain a preset fused voiceprint feature.
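Under the same assumptions as before (plain Python values standing in for voices and features, pluggable callables), the registration loop S01–S06 reduces to collecting quality-checked features until the second preset fusion number is reached, then fusing them:

```python
def register(voice_stream, extract, quality_ok, fusion_count, fuse):
    # S01-S06: collect registration voices; each voice that satisfies
    # the preset voice quality standard contributes one registration
    # voiceprint feature. Once the library holds `fusion_count`
    # (the second preset fusion number) features, fuse them into the
    # preset fused voiceprint feature.
    library = []
    for voice in voice_stream:
        if not quality_ok(voice):
            continue  # prompt the user and wait for another voice
        library.append(extract(voice))
        if len(library) == fusion_count:
            return fuse(library), library
    raise RuntimeError("registration voice stream ended before enough "
                       "quality voices were collected")
```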
The present invention also provides a self-learning update apparatus for voiceprint recognition, comprising:
a first comparison unit, configured to receive a verification instruction and a voice to be verified, extract a voiceprint feature from the voice to be verified according to the verification instruction to obtain a voiceprint feature to be verified, and compare the similarity of the voiceprint feature to be verified with a first preset comparison number of fused voiceprint features in the user voiceprint library to obtain a first preset comparison number of match values;
a first judgment unit, configured to determine whether the match values meet the requirements of a preset voiceprint evaluation standard, and if so, pass the voiceprint verification and add the voiceprint feature to be verified to a self-learning observation voiceprint feature library;
a second judgment unit, configured to determine whether a feedback instruction revoking or reporting the verification instruction is received within a preset time, and if not, add the voiceprint feature to be verified to the user voiceprint library as a new material voiceprint feature;
a first fusion unit, configured to select the most recent first preset fusion number of material voiceprint features in the user voiceprint library and fuse them to obtain a new fused voiceprint feature.
Preferably, the first judgment unit is specifically configured to:
determine whether the first preset comparison number of match values meet the requirements of the preset voiceprint evaluation standard; if so, the voiceprint verification passes and the voiceprint feature to be verified is added to the self-learning observation voiceprint feature library; if not, the voiceprint verification fails.
Preferably, the second judgment unit is specifically configured to:
determine whether a feedback instruction revoking or reporting the verification instruction is received within the preset time; if not, add the voiceprint feature to be verified to the user voiceprint library as a new material voiceprint feature; if so, remove the voiceprint feature to be verified from the self-learning observation voiceprint feature library.
Preferably, the first comparison unit specifically comprises:
a first receiving subunit, configured to receive a verification instruction and a voice to be verified;
a first detection subunit, configured to detect whether the voice to be verified meets the requirements of the preset voice quality standard; if so, trigger the first comparison subunit; if not, the voiceprint verification fails;
a first comparison subunit, configured to extract a voiceprint feature from the voice to be verified to obtain a voiceprint feature to be verified, and compare its similarity with the first preset comparison number of the most recent fused voiceprint features in the user voiceprint library to obtain the first preset comparison number of match values.
Preferably, the apparatus further comprises a voice registration unit, which comprises:
a second receiving subunit, configured to receive a registration instruction;
a third receiving subunit, configured to receive a registration voice;
a second detection subunit, configured to detect whether the registration voice satisfies the preset voice quality standard; if so, trigger the first extraction subunit; if not, prompt the user to re-input the registration voice and trigger the third receiving subunit;
a first extraction subunit, configured to extract a registration voiceprint feature from the registration voice and add the registration voiceprint feature to the user voiceprint library;
a third judgment subunit, configured to determine whether the number of registration voiceprint features in the user voiceprint library equals the second preset fusion number; if so, trigger the second fusion subunit; if not, prompt the user to continue inputting registration voice and trigger the third receiving subunit;
a second fusion subunit, configured to select the second preset fusion number of registration voiceprint features and fuse them to obtain a preset fused voiceprint feature.
It can be seen from the above technical solutions that the embodiments of the present invention have the following advantages:
The present invention provides a self-learning update method for voiceprint recognition, comprising: S1, receiving a verification instruction and a voice to be verified, extracting a voiceprint feature from the voice to be verified according to the verification instruction to obtain a voiceprint feature to be verified, and comparing its similarity with a first preset comparison number of the most recent fused voiceprint features in a user voiceprint library to obtain a first preset comparison number of match values; S2, determining whether the match values meet the requirements of a preset voiceprint evaluation standard, and if so, passing the voiceprint verification and adding the voiceprint feature to be verified to a self-learning observation voiceprint feature library; S3, determining whether a feedback instruction revoking or reporting the verification instruction is received within a preset time, and if not, adding the voiceprint feature to be verified to the user voiceprint library as a new material voiceprint feature; S4, selecting the most recent first preset fusion number of voiceprint features in the user voiceprint library and fusing them to obtain a new fused voiceprint feature.
In the present invention, when the user chooses to verify, the voice to be verified is received and the voiceprint feature to be verified is extracted. When the match values between the feature to be verified and the most recent fused voiceprint features meet the requirements of the preset voiceprint evaluation standard, the voiceprint verification passes and the feature is added to the self-learning observation voiceprint feature library. After it is determined that no feedback instruction revoking or reporting the verification instruction has been received within the preset time, the feature is added to the user voiceprint library, ensuring that material voiceprint features satisfying the fusion conditions come from verification operations initiated by the user in person. The most recent material voiceprint features in the user voiceprint library are then fused to obtain a new fused voiceprint feature. Throughout the process, the most recent material features are fused into new fused features, while the most recent fused features are used to verify incoming features, so that the fused voiceprint feature is continuously updated and always matches the user's voice, solving the technical problem that current voiceprint recognition technology loses accuracy due to voiceprint drift.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the accompanying drawings required for the description of the embodiments or the prior art are briefly introduced below. Obviously, the drawings described below are only some embodiments of the present invention; a person of ordinary skill in the art may derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of one embodiment of a self-learning update method for voiceprint recognition provided by an embodiment of the present invention;
Fig. 2 is a schematic flowchart of another embodiment of a self-learning update method for voiceprint recognition provided by an embodiment of the present invention;
Fig. 3 is a schematic structural diagram of one embodiment of a self-learning update apparatus for voiceprint recognition provided by an embodiment of the present invention.
Detailed Description
Embodiments of the present invention provide a self-learning update method and apparatus for voiceprint recognition, solving the technical problem that voiceprint drift degrades the accuracy of identity recognition.
To make the objects, features, and advantages of the present invention clearer and easier to understand, the technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. Obviously, the embodiments described below are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the scope of protection of the present invention.
Referring to Fig. 1, one embodiment of a self-learning update method for voiceprint recognition provided by an embodiment of the present invention comprises:
Step 101: receiving a verification instruction and a voice to be verified, extracting a voiceprint feature from the voice to be verified according to the verification instruction to obtain a voiceprint feature to be verified, and comparing its similarity with a first preset comparison number of the most recent fused voiceprint features in the user voiceprint library to obtain a first preset comparison number of match values.
It should be noted that when the user needs to verify, the user clicks the verification option. After the verification instruction and the voice to be verified are received, a voiceprint feature is extracted from the voice to be verified according to the verification instruction, and a similarity algorithm is used to compare the feature to be verified with the first preset comparison number of the most recent fused voiceprint features in the user voiceprint library to obtain the first preset comparison number of match values.
Feature extraction algorithms include the MFCC, FBank, and d-vector algorithms, among others.
Similarity algorithms include the SVM, Cosine Distance (CDS), LDA, and PLDA algorithms, among others.
In practice, a suitable voiceprint feature and similarity algorithm are selected as needed for fusion and computation.
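For concreteness, the Cosine Distance (CDS) option named above can be sketched as follows. The plain-list representation of embeddings and the function names are illustrative assumptions; PLDA or SVM scoring would replace `cosine_similarity` in a production system:

```python
import math

def cosine_similarity(a, b):
    # Cosine scoring between two fixed-length voiceprint embeddings
    # (e.g. i-vectors or d-vectors): dot(a, b) / (|a| * |b|).
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return dot / (norm_a * norm_b)

def match_values(candidate, fused_features):
    # Score the feature to be verified against each of the selected
    # fused voiceprint features, yielding one match value per feature.
    return [cosine_similarity(candidate, f) for f in fused_features]
```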
Step 102: determining whether the first preset comparison number of match values meet the requirements of a preset voiceprint evaluation standard; if so, the voiceprint verification passes and the voiceprint feature to be verified is added to the self-learning observation voiceprint feature library.
It should be noted that if the first preset comparison number of match values meet the requirements of the preset voiceprint evaluation standard, the verification voice corresponding to the voiceprint to be verified is the user's own voice; the voiceprint verification passes and the feature to be verified is added to the self-learning observation voiceprint feature library.
Preset voiceprint evaluation standards include: the average of the first preset comparison number of match values is greater than a preset threshold; 90% of the first preset comparison number of match values are greater than the preset threshold; or the largest match value is greater than the preset threshold.
In practice, different preset voiceprint evaluation standards can be selected as needed for evaluation and verification.
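The three example preset voiceprint evaluation standards listed above might be checked like this; the `rule` keyword and the structure of the function are illustrative assumptions, while the 90% fraction mirrors the text:

```python
def meets_evaluation_standard(scores, threshold, rule="mean"):
    # Three example preset voiceprint evaluation standards:
    #   "mean":     the average match value exceeds the threshold
    #   "fraction": at least 90% of match values exceed the threshold
    #   "max":      the single largest match value exceeds the threshold
    if rule == "mean":
        return sum(scores) / len(scores) > threshold
    if rule == "fraction":
        passing = sum(1 for s in scores if s > threshold)
        return passing >= 0.9 * len(scores)
    if rule == "max":
        return max(scores) > threshold
    raise ValueError(f"unknown rule: {rule}")
```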
Step 103: determining whether a feedback instruction revoking or reporting the verification instruction is received within a preset time; if not, adding the voiceprint feature to be verified to the user voiceprint library as a new material voiceprint feature.
It should be noted that after the feature to be verified is added to the self-learning observation voiceprint feature library, it is observed: if no feedback instruction revoking or reporting the verification instruction is received within the preset time, the verification operation corresponding to the feature satisfying the learning conditions was initiated by the user in person, and the feature to be verified is added to the user voiceprint library as a new material voiceprint feature.
Observing the features that satisfy the learning conditions ensures that the verification operation was initiated by the user in person, guaranteeing the security and reliability of the material voiceprint features. For example, if someone else obtains the user's voiceprint feature by some means and successfully passes verification on the user's phone, the user can promptly report that verification upon discovering it, preserving the security and reliability of the user voiceprint library.
Step 104: selecting the most recent first preset fusion number of material voiceprint features in the user voiceprint library and fusing them to obtain a new fused voiceprint feature.
It should be noted that after the feature to be verified is added to the user voiceprint library as a new material voiceprint feature, the most recent first preset fusion number of voiceprint features are selected and fused to obtain a new fused voiceprint feature, achieving a high degree of match between the fused voiceprint feature and the user's voice.
Voiceprint feature fusion algorithms include the GMM-UBM, DNN i-vector, and JFA algorithms, among others.
In practice, different voiceprint feature fusion algorithms can be selected according to different needs and voiceprint features.
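A deliberately simple stand-in for the fusion step is an element-wise average of fixed-length embeddings. The GMM-UBM, DNN i-vector, and JFA algorithms named above are far more involved, so treat this only as a sketch of "select the most recent N material features and fuse them":

```python
def fuse_features(features):
    # Element-wise mean of the selected material voiceprint features
    # (each feature is a fixed-length list of floats).
    n = len(features)
    dim = len(features[0])
    return [sum(f[i] for f in features) / n for i in range(dim)]

def update_fused_feature(library, fusion_count):
    # Select the most recent `fusion_count` material features
    # (library is ordered oldest -> newest) and fuse them.
    newest = library[-fusion_count:]
    return fuse_features(newest)
```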
In this embodiment, when the user chooses to verify, the voice to be verified is received and the voiceprint feature to be verified is extracted. When the match values between the feature to be verified and the most recent fused voiceprint features meet the requirements of the preset voiceprint evaluation standard, the voiceprint verification passes and the feature is added to the self-learning observation voiceprint feature library. After it is determined that no feedback instruction revoking or reporting the verification instruction has been received within the preset time, the feature is added to the user voiceprint library, ensuring that material voiceprint features satisfying the fusion conditions come from verification operations initiated by the user in person. The most recent material voiceprint features in the user voiceprint library are then fused to obtain a new fused voiceprint feature. Throughout the process, the most recent material features are fused into new fused features, while the most recent fused features are used to verify incoming features, so that the fused voiceprint feature is continuously updated and always matches the user's voice, solving the technical problem that current voiceprint recognition technology loses accuracy due to voiceprint drift.
The above is one embodiment of the self-learning update method for voiceprint recognition provided by the embodiments of the present invention; another embodiment is described below.
Referring to Fig. 2, another embodiment of a self-learning update method for voiceprint recognition provided by an embodiment of the present invention comprises:
Step 201: receiving a registration instruction.
It should be noted that when the user needs to register, the user clicks the registration option; upon receiving the registration instruction, the registration process begins.
Step 202: receiving a registration voice.
It should be noted that the registration voice may be recorded on the spot with a recording device such as a microphone, or may be a previously recorded voice audio segment.
Step 203: detecting whether the registration voice satisfies a preset voice quality standard; if so, executing step 204; if not, executing step 205.
Preset voice quality standards include a preset signal-to-noise ratio standard, a preset volume standard, and a preset effective duration standard, among others.
In practice, different preset voice quality standards can be selected as needed for evaluation and verification.
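One hypothetical way to realize the three quality checks above (signal-to-noise ratio, volume, and effective duration) on raw audio samples; every threshold, the fixed noise floor, and the crude SNR estimate are assumptions made for the sketch:

```python
import math

def passes_quality_standard(samples, sample_rate,
                            min_snr_db=15.0,   # preset SNR standard
                            min_rms=0.01,      # preset volume standard
                            min_seconds=2.0,   # preset duration standard
                            noise_floor=1e-4): # assumed background level
    # Check effective duration first.
    duration = len(samples) / sample_rate
    if duration < min_seconds:
        return False
    # Check volume via root-mean-square amplitude.
    rms = math.sqrt(sum(s * s for s in samples) / len(samples))
    if rms < min_rms:
        return False
    # Crude SNR estimate against an assumed fixed noise floor.
    snr_db = 20 * math.log10(rms / noise_floor)
    return snr_db >= min_snr_db
```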
Step 204: extracting a registration voiceprint feature from the registration voice and adding the registration voiceprint feature to the user voiceprint library.
It should be noted that after the registration voice is detected to satisfy the preset voice quality standard, a registration voiceprint feature is extracted from it and added to the user voiceprint library.
Feature extraction algorithms include the MFCC, FBank, and d-vector algorithms, among others.
Step 205: prompting the user to continue inputting registration voice and returning to step 202.
It should be noted that when the registration voice does not satisfy the preset voice quality standard, the user is prompted to continue inputting registration voice, and the process returns to step 202 to continue the registration operation.
Step 206: determining whether the number of registration voiceprint features in the user voiceprint library equals a second preset fusion number; if so, executing step 207; if not, executing step 205.
It should be noted that if the number of registration voiceprint features equals the second preset fusion number, step 207 is executed; otherwise, step 205 is executed.
Step 207: selecting the second preset fusion number of registration voiceprint features and fusing them to obtain a preset fused voiceprint feature.
Voiceprint feature fusion algorithms include the GMM-UBM, DNN i-vector, and JFA algorithms, among others.
In practice, different voiceprint feature fusion algorithms can be selected according to different needs and voiceprint features.
Step 208: receiving a verification instruction and a voice to be verified.
It should be noted that when the user needs to verify and clicks the verification option, the verification instruction and the voice to be verified are received.
The verification voice may be recorded on the spot with a recording device such as a microphone, or may be a previously recorded voice audio segment.
Step 209: detecting, according to the verification instruction, whether the voice to be verified meets the requirements of the preset voice quality standard; if so, executing step 210; if not, executing step 211.
Preset voice quality standards include a preset signal-to-noise ratio standard, a preset volume standard, and a preset effective duration standard.
In practice, different preset voice quality standards can be selected as needed for evaluation and verification.
Step 210: extracting a voiceprint feature from the voice to be verified to obtain a voiceprint feature to be verified, and comparing its similarity with the first preset comparison number of the most recent fused voiceprint features in the user voiceprint library to obtain the first preset comparison number of match values.
It should be noted that after the voice to be verified is detected, according to the verification instruction, to meet the requirements of the preset voice quality standard and the voiceprint feature to be verified is extracted, a similarity algorithm is used to compare the feature to be verified with the first preset comparison number of the most recent fused voiceprint features in the user voiceprint library to obtain the first preset comparison number of match values.
Feature extraction algorithms include the MFCC, FBank, and d-vector algorithms, among others.
Similarity algorithms include the SVM, Cosine Distance (CDS), LDA, and PLDA algorithms, among others.
In practice, a suitable voiceprint feature and similarity algorithm are selected as needed for fusion and computation.
Step 211: prompting the user that the voiceprint verification fails.
It should be noted that when the voice to be verified is detected, according to the verification instruction, not to meet the requirements of the preset voice quality standard, the user is prompted that the voiceprint verification fails.
Step 212: determining whether the first preset comparison number of match values meet the requirements of the preset voiceprint evaluation standard; if so, executing step 213; if not, executing step 211.
It should be noted that if the match values meet the requirements of the preset voiceprint evaluation standard, the verification voice corresponding to the voiceprint to be verified is the user's own voice, and step 213 is executed; if not, the voice to be verified may not be the user's voice, and the voiceprint verification fails.
Preset voiceprint evaluation standards include: the average of the first preset comparison number of match values is greater than a preset threshold; 90% of the first preset comparison number of match values are greater than the preset threshold; or the largest match value is greater than the preset threshold.
In practice, different preset voiceprint evaluation standards can be selected as needed for evaluation and verification.
Step 213: passing the voiceprint verification and adding the voiceprint feature to be verified to the self-learning observation voiceprint feature library.
It should be noted that when the voiceprint verification passes, the feature to be verified is simultaneously added to the self-learning observation voiceprint feature library.
Step 214: determining whether a feedback instruction revoking or reporting the verification instruction is received within a preset time; if not, executing step 215; if so, executing step 216.
Step 215: adding the voiceprint feature to be verified to the user voiceprint library as a new material voiceprint feature.
It should be noted that if no feedback instruction revoking or reporting the verification instruction is received within the preset time, the verification operation corresponding to the feature satisfying the learning conditions was initiated by the user in person, and the feature to be verified is added to the user voiceprint library as a new material voiceprint feature.
Step 216: removing the voiceprint feature to be verified from the self-learning observation voiceprint feature library.
It should be noted that if a feedback instruction revoking or reporting the verification instruction is received within the preset time, the verification operation corresponding to the feature satisfying the learning conditions was not initiated by the user in person, and the feature to be verified is removed from the self-learning observation voiceprint feature library.
Observing the features that satisfy the learning conditions ensures that the verification operation was initiated by the user in person, guaranteeing the security and reliability of the material voiceprint features.
Step 217: selecting the most recent first preset fusion number of material voiceprint features in the user voiceprint library and fusing them to obtain a new fused voiceprint feature.
It should be noted that after the feature to be verified is added to the user voiceprint library as a new material voiceprint feature, the most recent first preset fusion number of voiceprint features are selected and fused to obtain a new fused voiceprint feature, achieving a high degree of match between the fused voiceprint feature and the user's voice.
Voiceprint feature fusion algorithms include the GMM-UBM, DNN i-vector, and JFA algorithms, among others.
In practice, different voiceprint feature fusion algorithms can be selected according to different needs and voiceprint features.
In this embodiment, when the user chooses to verify, the voice to be verified is received and the voiceprint feature to be verified is extracted. When the match values between the feature to be verified and the most recent fused voiceprint features meet the requirements of the preset voiceprint evaluation standard, the voiceprint verification passes and the feature is added to the self-learning observation voiceprint feature library. After it is determined that no feedback instruction revoking or reporting the verification instruction has been received within the preset time, the feature is added to the user voiceprint library, ensuring that material voiceprint features satisfying the fusion conditions come from verification operations initiated by the user in person. The most recent material voiceprint features in the user voiceprint library are then fused to obtain a new fused voiceprint feature. Throughout the process, the most recent material features are fused into new fused features, while the most recent fused features are used to verify incoming features, solving the technical problem that current voiceprint recognition technology loses accuracy due to voiceprint drift.
The above is another embodiment of the self-learning update method for voiceprint recognition provided by the embodiments of the present invention; an embodiment of a self-learning update apparatus for voiceprint recognition, shown in Fig. 3, is described below.
A first comparison unit 301, configured to receive a verification instruction and a voice to be verified, extract a voiceprint feature from the voice to be verified according to the verification instruction to obtain a voiceprint feature to be verified, and compare its similarity with a first preset comparison number of fused voiceprint features in the user voiceprint library to obtain a first preset comparison number of match values;
a first judgment unit 302, configured to determine whether the first preset comparison number of match values meet the requirements of the preset voiceprint evaluation standard, and if so, pass the voiceprint verification and add the voiceprint feature to be verified to the self-learning observation voiceprint feature library;
a second judgment unit 303, configured to determine whether a feedback instruction revoking or reporting the verification instruction is received within a preset time, and if not, add the voiceprint feature to be verified to the user voiceprint library as a new material voiceprint feature;
a first fusion unit 304, configured to select the most recent first preset fusion number of material voiceprint features in the user voiceprint library and fuse them to obtain a new fused voiceprint feature.
Further, the first judgment unit 302 is specifically configured to:
determine whether the first preset comparison number of match values meet the requirements of the preset voiceprint evaluation standard; if so, the voiceprint verification passes and the feature to be verified is added to the self-learning observation voiceprint feature library; if not, the voiceprint verification fails.
Further, the second judgment unit 303 is specifically configured to:
determine whether a feedback instruction revoking or reporting the verification instruction is received within the preset time; if not, add the feature to be verified to the user voiceprint library as a new material voiceprint feature; if so, remove the feature to be verified from the self-learning observation voiceprint feature library.
Further, the first comparison unit 301 specifically comprises:
a first receiving subunit 3011, configured to receive a verification instruction and a voice to be verified;
a first detection subunit 3012, configured to detect whether the voice to be verified meets the requirements of the preset voice quality standard; if so, trigger the first comparison subunit 3013; if not, the voiceprint verification fails;
a first comparison subunit 3013, configured to extract a voiceprint feature from the voice to be verified to obtain a voiceprint feature to be verified, and compare its similarity with the first preset comparison number of the most recent fused voiceprint features in the user voiceprint library to obtain the first preset comparison number of match values.
Further, the apparatus comprises a voice registration unit 300, which comprises:
a second receiving subunit 3001, configured to receive a registration instruction;
a third receiving subunit 3002, configured to receive a registration voice;
a second detection subunit 3003, configured to detect whether the registration voice satisfies the preset voice quality standard; if so, trigger the first extraction subunit; if not, prompt the user to re-input the registration voice and trigger the third receiving subunit 3002;
a first extraction subunit 3004, configured to extract a registration voiceprint feature from the registration voice and add it to the user voiceprint library;
a third judgment subunit 3005, configured to determine whether the number of registration voiceprint features in the user voiceprint library equals the second preset fusion number; if so, trigger the second fusion subunit; if not, prompt the user to re-input registration voice and trigger the third receiving subunit 3002;
a second fusion subunit 3006, configured to select the second preset fusion number of registration voiceprint features and fuse them to obtain a preset fused voiceprint feature.
The above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, a person of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced by equivalents, without departing in essence from the spirit and scope of the technical solutions of the embodiments of the present invention.
A person skilled in the art can clearly understand that, for convenience and brevity of description, the specific working processes of the systems, apparatuses, and modules described above may refer to the corresponding processes in the foregoing method embodiments and are not repeated here.
In the several embodiments provided in this application, it should be understood that the disclosed systems, apparatuses, and methods may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative: the division into modules is only a division by logical function, and there may be other divisions in actual implementation; for example, multiple modules or components may be combined or integrated into another system, or some features may be omitted or not executed. Furthermore, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through interfaces, apparatuses, or modules, and may be electrical, mechanical, or in other forms.
Modules described as separate components may or may not be physically separate, and components shown as modules may or may not be physical modules; they may be located in one place or distributed across multiple network modules. Some or all of the modules may be selected according to actual needs to achieve the purpose of the embodiment's solution.
In addition, the functional modules in the embodiments of the present invention may be integrated into one processing module, each module may exist physically on its own, or two or more modules may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module.
If the integrated module is implemented as a software functional module and sold or used as a standalone product, it may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes to the prior art, or in whole or in part, may be embodied as a software product stored in a storage medium, including a number of instructions that cause a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the methods of the various embodiments of the present invention. The foregoing storage medium includes any medium capable of storing program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.

Claims (10)

  1. A self-learning update method for voiceprint recognition, characterized by comprising:
    S1. receiving a verification instruction and a voice to be verified, extracting a voiceprint feature from the voice to be verified according to the verification instruction to obtain a voiceprint feature to be verified, and comparing the similarity of the voiceprint feature to be verified with a first preset comparison number of the most recent fused voiceprint features in a user voiceprint library to obtain a first preset comparison number of match values;
    S2. determining whether the match values meet the requirements of a preset voiceprint evaluation standard, and if so, passing the voiceprint verification and adding the voiceprint feature to be verified to a self-learning observation voiceprint feature library;
    S3. determining whether a feedback instruction revoking or reporting the verification instruction is received within a preset time, and if not, adding the voiceprint feature to be verified to the user voiceprint library as a new material voiceprint feature;
    S4. selecting the most recent first preset fusion number of material voiceprint features in the user voiceprint library and fusing them to obtain a new fused voiceprint feature.
  2. The self-learning update method for voiceprint recognition according to claim 1, wherein step S2 specifically comprises:
    determining whether the match values meet the requirements of the preset voiceprint evaluation standard; if so, the voiceprint verification passes and the voiceprint feature to be verified is added to the self-learning observation voiceprint feature library; if not, the voiceprint verification fails.
  3. The self-learning update method for voiceprint recognition according to claim 1, wherein step S3 specifically comprises:
    determining whether a feedback instruction revoking or reporting the verification instruction is received within the preset time; if not, adding the voiceprint feature to be verified to the user voiceprint library as a new material voiceprint feature; if so, removing the voiceprint feature to be verified from the self-learning observation voiceprint feature library.
  4. The self-learning update method for voiceprint recognition according to claim 1, wherein step S1 specifically comprises:
    S11. receiving a verification instruction and a voice to be verified;
    S12. detecting whether the voice to be verified meets the requirements of a preset voice quality standard; if so, executing step S13; if not, the voiceprint verification fails;
    S13. extracting a voiceprint feature from the voice to be verified to obtain a voiceprint feature to be verified, and comparing its similarity with the first preset comparison number of the most recent fused voiceprint features in the user voiceprint library to obtain the first preset comparison number of match values.
  5. The self-learning update method for voiceprint recognition according to claim 4, further comprising, before step S11:
    S01. receiving a registration instruction;
    S02. receiving a registration voice;
    S03. detecting whether the registration voice satisfies the preset voice quality standard; if so, executing step S04; if not, prompting the user to continue inputting registration voice and returning to step S02;
    S04. extracting a registration voiceprint feature from the registration voice and adding the registration voiceprint feature to the user voiceprint library;
    S05. determining whether the number of registration voiceprint features in the user voiceprint library equals a second preset fusion number; if so, executing step S06; if not, prompting the user to continue inputting registration voice and returning to step S02;
    S06. selecting the second preset fusion number of registration voiceprint features and fusing them to obtain a preset fused voiceprint feature.
  6. A self-learning update apparatus for voiceprint recognition, characterized by comprising:
    a first comparison unit, configured to receive a verification instruction and a voice to be verified, extract a voiceprint feature from the voice to be verified according to the verification instruction to obtain a voiceprint feature to be verified, and compare the similarity of the voiceprint feature to be verified with a first preset comparison number of fused voiceprint features in a user voiceprint library to obtain a first preset comparison number of match values;
    a first judgment unit, configured to determine whether the match values meet the requirements of a preset voiceprint evaluation standard, and if so, pass the voiceprint verification and add the voiceprint feature to be verified to a self-learning observation voiceprint feature library;
    a second judgment unit, configured to determine whether a feedback instruction revoking or reporting the verification instruction is received within a preset time, and if not, add the voiceprint feature to be verified to the user voiceprint library as a new material voiceprint feature;
    a first fusion unit, configured to select the most recent first preset fusion number of material voiceprint features in the user voiceprint library and fuse them to obtain a new fused voiceprint feature.
  7. The self-learning update apparatus for voiceprint recognition according to claim 6, wherein the first judgment unit is specifically configured to:
    determine whether the match values meet the requirements of the preset voiceprint evaluation standard; if so, the voiceprint verification passes and the voiceprint feature to be verified is added to the self-learning observation voiceprint feature library; if not, the voiceprint verification fails.
  8. The self-learning update apparatus for voiceprint recognition according to claim 6, wherein the second judgment unit is specifically configured to:
    determine whether a feedback instruction revoking or reporting the verification instruction is received within the preset time; if not, add the voiceprint feature to be verified to the user voiceprint library as a new material voiceprint feature; if so, remove the voiceprint feature to be verified from the self-learning observation voiceprint feature library.
  9. The self-learning update apparatus for voiceprint recognition according to claim 6, wherein the first comparison unit specifically comprises:
    a first receiving subunit, configured to receive a verification instruction and a voice to be verified;
    a first detection subunit, configured to detect whether the voice to be verified meets the requirements of the preset voice quality standard; if so, trigger the first comparison subunit; if not, the voiceprint verification fails;
    a first comparison subunit, configured to extract a voiceprint feature from the voice to be verified to obtain a voiceprint feature to be verified, and compare its similarity with the first preset comparison number of the most recent fused voiceprint features in the user voiceprint library to obtain the first preset comparison number of match values.
  10. The self-learning update apparatus for voiceprint recognition according to claim 6, further comprising a voice registration unit, which comprises:
    a second receiving subunit, configured to receive a registration instruction;
    a third receiving subunit, configured to receive a registration voice;
    a second detection subunit, configured to detect whether the registration voice satisfies the preset voice quality standard; if so, trigger the first extraction subunit; if not, prompt the user to re-input the registration voice and trigger the third receiving subunit;
    a first extraction subunit, configured to extract a registration voiceprint feature from the registration voice and add the registration voiceprint feature to the user voiceprint library;
    a third judgment subunit, configured to determine whether the number of registration voiceprint features in the user voiceprint library equals the second preset fusion number; if so, trigger the second fusion subunit; if not, prompt the user to continue inputting registration voice and trigger the third receiving subunit;
    a second fusion subunit, configured to select the second preset fusion number of registration voiceprint features and fuse them to obtain a preset fused voiceprint feature.
PCT/CN2018/077535 2017-12-29 2018-02-28 A self-learning update method and apparatus for voiceprint recognition WO2019127897A1 (zh)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201711477151.6A CN108231082A (zh) 2017-12-29 2017-12-29 A self-learning update method and apparatus for voiceprint recognition
CN201711477151.6 2017-12-29

Publications (1)

Publication Number Publication Date
WO2019127897A1 true WO2019127897A1 (zh) 2019-07-04

Family

ID=62647095

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/077535 WO2019127897A1 (zh) 2017-12-29 2018-02-28 一种自学习声纹识别的更新方法和装置

Country Status (2)

Country Link
CN (1) CN108231082A (zh)
WO (1) WO2019127897A1 (zh)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113257266A (zh) * 2021-05-21 2021-08-13 特斯联科技集团有限公司 基于声纹多特征融合的复杂环境门禁方法及装置

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020019176A1 (zh) * 2018-07-24 2020-01-30 华为技术有限公司 一种终端更新语音助手的唤醒语音的方法及终端
CN109448731A (zh) * 2018-11-20 2019-03-08 北京网众共创科技有限公司 声纹信息的比对方法及装置、储存介质、电子装置
CN110489659A (zh) * 2019-07-18 2019-11-22 平安科技(深圳)有限公司 数据匹配方法和装置
CN110491373A (zh) * 2019-08-19 2019-11-22 Oppo广东移动通信有限公司 模型训练方法、装置、存储介质及电子设备
CN110600029A (zh) * 2019-09-17 2019-12-20 苏州思必驰信息科技有限公司 用于智能语音设备的自定义唤醒方法和装置
CN110660398B (zh) * 2019-09-19 2020-11-20 北京三快在线科技有限公司 声纹特征更新方法、装置、计算机设备及存储介质
CN111091837A (zh) * 2019-12-27 2020-05-01 中国人民解放军陆军工程大学 一种基于在线学习的时变声纹认证方法及系统
CN111063360B (zh) * 2020-01-21 2022-08-19 北京爱数智慧科技有限公司 一种声纹库的生成方法和装置
CN111428576B (zh) * 2020-03-02 2024-04-26 广州微盾科技股份有限公司 特征信息学习方法、电子设备及存储介质
CN112331210B (zh) * 2021-01-05 2021-05-18 太极计算机股份有限公司 一种语音识别装置
WO2022236827A1 (zh) * 2021-05-14 2022-11-17 华为技术有限公司 一种声纹管理方法及装置
CN113327618B (zh) * 2021-05-17 2024-04-19 西安讯飞超脑信息科技有限公司 声纹判别方法、装置、计算机设备和存储介质
CN113327617B (zh) * 2021-05-17 2024-04-19 西安讯飞超脑信息科技有限公司 声纹判别方法、装置、计算机设备和存储介质

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106506524A (zh) * 2016-11-30 2017-03-15 百度在线网络技术(北京)有限公司 用于验证用户的方法和装置
US20170140760A1 (en) * 2015-11-18 2017-05-18 Uniphore Software Systems Adaptive voice authentication system and method
CN107331400A (zh) * 2017-08-25 2017-11-07 百度在线网络技术(北京)有限公司 一种声纹识别性能提升方法、装置、终端及存储介质
CN107424614A (zh) * 2017-07-17 2017-12-01 广东讯飞启明科技发展有限公司 一种声纹模型更新方法

Family Cites Families (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN1905445B (zh) * 2005-07-27 2012-02-15 国际商业机器公司 使用可移动的语音标识卡的语音认证系统及语音认证方法
US8694315B1 (en) * 2013-02-05 2014-04-08 Visa International Service Association System and method for authentication using speaker verification techniques and fraud model
CN104424277B (zh) * 2013-08-29 2020-10-16 深圳市腾讯计算机系统有限公司 举报信息的处理方法及装置
GB2517952B (en) * 2013-09-05 2017-05-31 Barclays Bank Plc Biometric verification using predicted signatures
CN104183240A (zh) * 2014-08-19 2014-12-03 中国联合网络通信集团有限公司 一种声纹特征融合方法及装置
CN104616110A (zh) * 2015-02-10 2015-05-13 五八同城信息技术有限公司 一种信息的处理方法及装置
CN105096121B (zh) * 2015-06-25 2017-07-25 百度在线网络技术(北京)有限公司 声纹认证方法和装置
CN106469553A (zh) * 2015-08-13 2017-03-01 中兴通讯股份有限公司 语音识别方法及装置
CN105654343A (zh) * 2015-09-16 2016-06-08 颜陈煜 实现基于客户网络行为定点发送广告及联系客户的系统和方法
CN106782564B (zh) * 2016-11-18 2018-09-11 百度在线网络技术(北京)有限公司 用于处理语音数据的方法和装置
CN106782565A (zh) * 2016-11-29 2017-05-31 重庆重智机器人研究院有限公司 一种声纹特征识别方法及系统
CN106506563B (zh) * 2016-12-30 2019-11-19 中国建设银行股份有限公司 账户设置方法、装置和银行服务系统
CN107180632A (zh) * 2017-06-19 2017-09-19 微鲸科技有限公司 语音控制方法、装置及可读存储介质


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113257266A (zh) * 2021-05-21 2021-08-13 特斯联科技集团有限公司 基于声纹多特征融合的复杂环境门禁方法及装置
CN113257266B (zh) * 2021-05-21 2021-12-24 特斯联科技集团有限公司 基于声纹多特征融合的复杂环境门禁方法及装置

Also Published As

Publication number Publication date
CN108231082A (zh) 2018-06-29

Similar Documents

Publication Publication Date Title
WO2019127897A1 (zh) A self-learning update method and apparatus for voiceprint recognition
US10593334B2 (en) Method and apparatus for generating voiceprint information comprised of reference pieces each used for authentication
JP7109634B2 (ja) アイデンティティ認証方法及び装置
KR102339594B1 (ko) 객체 인식 방법, 컴퓨터 디바이스 및 컴퓨터 판독 가능 저장 매체
US10601821B2 (en) Identity authentication method and apparatus, terminal and server
US20170110125A1 (en) Method and apparatus for initiating an operation using voice data
US9373330B2 (en) Fast speaker recognition scoring using I-vector posteriors and probabilistic linear discriminant analysis
WO2017113658A1 (zh) 基于人工智能的声纹认证方法以及装置
DK2713367T3 (en) Speech Recognition
CN103475490B (zh) 一种身份验证方法及装置
WO2017197953A1 (zh) 基于声纹的身份识别方法及装置
US11869513B2 (en) Authenticating a user
KR20160147280A (ko) 인공 지능을 기반으로 하는 성문 로그인 방법 및 장치
US11430449B2 (en) Voice-controlled management of user profiles
WO2017162053A1 (zh) 一种身份认证的方法和装置
US9646613B2 (en) Methods and systems for splitting a digital signal
WO2019127929A1 (zh) 一种电子设备声纹支付方法及装置
KR101805437B1 (ko) 배경 화자 데이터를 이용한 화자 인증 방법 및 화자 인증 시스템
WO2018227584A1 (zh) 指纹识别的方法、装置和设备
US20220375476A1 (en) Speaker authentication system, method, and program
US9837080B2 (en) Detection of target and non-target users using multi-session information
KR101925252B1 (ko) 음성 특징벡터 및 파라미터를 활용한 화자확인 이중화 방법 및 장치
CN112530441A (zh) 合法用户的验证方法、装置、计算机设备和存储介质
Liu et al. Feature selection for fusion of speaker verification via maximum kullback-leibler distance
US20230153815A1 (en) Methods and systems for training a machine learning model and authenticating a user with the model

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18893501

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

32PN Ep: public notification in the ep bulletin as address of the adressee cannot be established

Free format text: NOTING OF LOSS OF RIGHTS PURSUANT TO RULE 112(1) EPC (EPO FORM 1205A DATED 04.11.2020)

122 Ep: pct application non-entry in european phase

Ref document number: 18893501

Country of ref document: EP

Kind code of ref document: A1