WO2020199473A1 - Voice password verification method and apparatus, storage medium, and computer device - Google Patents


Info

Publication number
WO2020199473A1
Authority
WO
WIPO (PCT)
Prior art keywords
information
voice
preset
user
voiceprint
Prior art date
Application number
PCT/CN2019/103048
Other languages
French (fr)
Chinese (zh)
Inventor
张丝潆
王健宗
Original Assignee
平安科技(深圳)有限公司
Application filed by 平安科技(深圳)有限公司
Publication of WO2020199473A1

Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/04 Training, enrolment or model building
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification
    • G10L17/22 Interactive procedures; Man-machine interfaces
    • G10L17/24 Interactive procedures; Man-machine interfaces the user being prompted to utter a password or a predefined phrase
    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/08 Network architectures or network communication protocols for network security for authentication of entities
    • H04L63/0861 Network architectures or network communication protocols for network security for authentication of entities using biometrical features, e.g. fingerprint, retina-scan

Definitions

  • This application relates to the technical field of security verification. Specifically, this application relates to a voice password verification method, device, storage medium, and computer equipment.
  • Voice password is a technology that doubly protects user information by using both the text information and the speaker information in a speech segment. It offers good security and convenience and is well suited to applications in finance, insurance, public security, and smart devices. The inventors recognized that, in current research, the acoustic features used in traditional voiceprint password recognition mainly carry text information and channel information, while speaker information is only a weak component of them. As a result, the password recognition process still suffers from shortcomings such as poor resistance to interference.
  • This application provides a voice password verification method, device, computer readable storage medium, and computer equipment to improve the anti-interference of voice password recognition.
  • The embodiment of the application first provides a voice password verification method, including:
  • receiving voice information input by a user, and parsing the voice information to obtain the user's voiceprint information;
  • inputting the voiceprint information into a pre-trained recognition model to obtain the similarity between the voiceprint information and preset identity information, wherein the recognition model is an association between voiceprint information and user identity formed from training samples containing interference factors;
  • scoring the voiceprint information according to the similarity to obtain a score value of the voiceprint information, and passing the verification if the score value exceeds a preset threshold.
  • An embodiment of the present application also provides a voice password verification device, including:
  • a parsing module, used to receive the voice information input by the user and parse the voice information to obtain the user's voiceprint information;
  • a similarity obtaining module, used to input the voiceprint information into a pre-trained recognition model to obtain the similarity between the voiceprint information and preset identity information, wherein the recognition model is an association between voiceprint information and user identity formed from training samples containing interference factors;
  • a verification module, configured to score the voiceprint information according to the similarity to obtain a score value of the voiceprint information, and to pass the verification if the score value exceeds a preset threshold.
  • Embodiments of the present application also provide a non-volatile computer-readable storage medium. The computer-readable storage medium is used to store computer instructions. When the computer instructions run on a computer, the computer can execute the steps of the voice password verification method described in any of the above technical solutions, wherein the steps of the voice password verification method include:
  • receiving voice information input by a user, and parsing the voice information to obtain the user's voiceprint information;
  • inputting the voiceprint information into a pre-trained recognition model to obtain the similarity between the voiceprint information and preset identity information, wherein the recognition model is an association between voiceprint information and user identity formed from training samples containing interference factors;
  • scoring the voiceprint information according to the similarity to obtain a score value of the voiceprint information, and passing the verification if the score value exceeds a preset threshold.
  • An embodiment of the present application also provides a computer device, and the computer device includes:
  • one or more processors;
  • a storage device for storing one or more programs;
  • when the one or more programs are executed by the one or more processors, the one or more processors implement the steps of the voice password verification method described above, wherein the steps of the voice password verification method include:
  • receiving voice information input by a user, and parsing the voice information to obtain the user's voiceprint information;
  • inputting the voiceprint information into a pre-trained recognition model to obtain the similarity between the voiceprint information and preset identity information, wherein the recognition model is an association between voiceprint information and user identity formed from training samples containing interference factors;
  • scoring the voiceprint information according to the similarity to obtain a score value of the voiceprint information, and passing the verification if the score value exceeds a preset threshold.
  • The voice password verification method obtains the user's voiceprint information by parsing voice information, inputs the voiceprint information into a pre-trained recognition model, and performs identity verification based on the similarity between the voiceprint information and preset identity information. If the similarity between the currently received voiceprint information and the preset identity information exceeds a preset threshold, the verification is passed.
  • Because the recognition model used to obtain the similarity between voiceprint information and preset identity information is obtained from training samples containing interference factors, the recognition model has a degree of anti-interference when processing voiceprint information, which improves the recognition accuracy of voiceprint information.
  • The present application uses a joint probability recognition model based on recognition features to recognize voiceprint information. This algorithm strengthens the speaker information in the speech information, further improves the anti-interference of the recognition model, and also improves the recognition performance of the recognition model.
  • FIG. 1 is a diagram of the implementation environment of a voice password verification method provided by an embodiment of this application;
  • FIG. 2 is a schematic flowchart of a voice password verification method provided by an embodiment of this application;
  • FIG. 3 is a schematic diagram of the process of password verification in the voice password verification method provided by an embodiment of this application;
  • FIG. 4 is a schematic flowchart of establishing a joint probability model with added interference factors according to an embodiment of this application;
  • FIG. 5 is a schematic diagram of the process of scoring voiceprint information according to the feature likelihood provided by an embodiment of this application;
  • FIG. 6 is a schematic structural diagram of a voice password verification device provided by an embodiment of this application;
  • FIG. 7 is a schematic structural diagram of a computer device provided by an embodiment of this application.
  • FIG. 1 is an implementation environment diagram of a voice password verification method provided in an embodiment. The implementation environment includes a user terminal and a server.
  • The voice password verification method provided in this embodiment can be applied to the server. The server receives the voice information input by the user and parses it to obtain the user's voiceprint information; inputs the voiceprint information into a pre-trained recognition model and obtains, based on the recognition model, the similarity between the voiceprint information and the preset identity information, wherein the recognition model is an association between voiceprint information and user identity formed from training samples containing interference factors; and scores the voice information according to the similarity to obtain a score value of the voice information. If the score value exceeds a preset threshold, the verification is passed.
  • The user terminal can be a smartphone, a tablet computer, a notebook computer, a desktop computer, etc. The server can be implemented by a computer device with processing functions, but is not limited to this.
  • The server and the user terminal can be connected to the network through Bluetooth, USB (Universal Serial Bus), or other communication connection methods, which this application does not limit.
  • FIG. 2 is a schematic flowchart of a voice password verification method provided in an embodiment of this application.
  • The voice password verification method can be applied to the server described above and includes the following steps:
  • Step S210: Receive voice information input by a user, and parse the voice information to obtain voiceprint information of the user;
  • Step S220: Input the voiceprint information into a pre-trained recognition model, and obtain the similarity between the voiceprint information and preset identity information based on the recognition model, wherein the recognition model is an association between voiceprint information and user identity formed from training samples containing interference factors;
  • Step S230: Score the voice information according to the similarity to obtain a score value of the voice information. If the score value exceeds a preset threshold, the verification is passed.
  • The voice password verification method provided by the embodiment of the application performs identity verification based on voiceprint information, using a recognition model to identify the user's identity. Because the training samples used in the training process contain interference factors, the recognition model formed from those samples has a degree of anti-interference. When this recognition model is used to recognize voiceprint information, the voiceprint information can be recognized accurately, which improves the accuracy and efficiency of voiceprint recognition.
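  • Steps S220 and S230 can be sketched as follows. This is a minimal illustration under stated assumptions, not the implementation described in the application: the voiceprint is assumed to be already extracted as a fixed-length vector, and cosine similarity stands in for the pre-trained recognition model. The names `cosine_similarity` and `verify`, the enrolled-user dictionary, and the threshold of 0.8 are hypothetical.

```python
import numpy as np


def cosine_similarity(a, b):
    """Cosine similarity between two voiceprint vectors (stand-in for the model)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))


def verify(voiceprint, enrolled, threshold=0.8):
    """S220/S230: score the voiceprint against every preset identity.

    Returns (passed, best_user, best_score); verification passes only if
    the best score exceeds the preset threshold.
    """
    best_user, best_score = None, float("-inf")
    for user, reference in enrolled.items():
        score = cosine_similarity(voiceprint, reference)
        if score > best_score:
            best_user, best_score = user, score
    return best_score > threshold, best_user, best_score


# Hypothetical enrolled voiceprints (toy 3-dimensional vectors).
enrolled = {"alice": [0.9, 0.1, 0.0], "bob": [0.0, 0.2, 0.9]}
passed, user, score = verify([0.8, 0.2, 0.1], enrolled)
```

  • In a real system the score would come from the trained model rather than from a raw vector comparison; the threshold comparison at the end is the part that corresponds directly to step S230.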
  • A password verification may be performed first to improve the security of the verification scheme. The flow is shown in FIG. 3 and includes the following sub-steps:
  • S310: Receive identity verification request information, retrieve a preset question from the database in response to the request information, and send it to the user;
  • S320: Receive the voice information sent by the user in reply to the preset question, and parse the voice information to obtain its semantic information;
  • S331: If the semantic information is consistent with the preset answer, parse the voice information as in step S210 to obtain the voiceprint information of the user.
  • The preset question can be set by the system and combined with a preset answer provided by the user, with an association established between the preset question and the preset answer; alternatively, both the preset question and its preset answer can be customized by the user. In either case, the association between the two is established, and the preset question, the preset answer, and their association are stored in the database.
  • One or more preset questions can be set. If there are multiple preset questions in the database, the preset questions are randomly selected, and the extracted preset questions are sent to the user. Compared with a single question or a scheme that uses static features for identification, this scheme of randomly selecting preset questions is beneficial to improve the security of password verification.
  • The solution provided by the embodiment of the present application further includes: if the semantic information is inconsistent with the preset answer, the verification is terminated.
  • The solution provided by the embodiment of this application works as follows. On receiving the identity verification request information sent by the user, the server first enters the password verification phase: it retrieves a preset question from the database and returns it to the user. The user receives the preset question and inputs a response through a voice input module, which may be a microphone. The server receives the voice information containing the response, parses it to obtain the semantic information, and uses the semantic information to verify whether the user's answer is correct. If the semantic information is inconsistent with the preset answer, the user is not a stored standard user and the verification fails. If it is consistent with the preset answer, the current password verification is passed, and a second verification can then be performed with the voiceprint verification process described above to enhance the security and accuracy of the voice password verification process.
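  • The question-and-answer stage can be sketched as follows. This is an illustrative sketch only: speech recognition is assumed to have already produced the text of the reply, and the helper names (`pick_question`, `normalize`, `password_check`), the sample questions, and the case-insensitive comparison are assumptions that do not come from the application.

```python
import random


def pick_question(qa_pairs, rng=random):
    """S310: randomly select one preset question from the stored pairs."""
    return rng.choice(sorted(qa_pairs))


def normalize(text):
    """Hypothetical normalization: ignore case and surrounding whitespace."""
    return text.strip().lower()


def password_check(qa_pairs, question, recognized_text):
    """S320/S331: compare the recognized semantic text with the preset answer."""
    return normalize(recognized_text) == normalize(qa_pairs[question])


# Hypothetical preset questions and answers stored for one user.
qa_pairs = {
    "What is your favorite color?": "blue",
    "What city were you born in?": "shenzhen",
}
question = pick_question(qa_pairs, random.Random(0))
ok = password_check(qa_pairs, question, "  " + qa_pairs[question].upper())
```

  • Only when `password_check` returns `True` would the flow continue to the voiceprint stage; a mismatch terminates the verification.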
  • The following operations can also be performed to increase the verification pass rate:
  • S340: Retrieve prompt information associated with the preset question, and send the prompt information to the user.
  • The prompt information is associated with the preset question, and the reference answer of the preset question is the preset answer. The preset question is also associated with the preset answer in advance, so that the prompt information or the preset answer can be retrieved according to the preset question.
  • The preset answer is the standard answer associated with the preset question. Failing to receive voice information consistent with the preset answer within the preset time covers two situations: first, no voice information from the user is received within the preset time; second, voice information is received within the preset time but is inconsistent with the preset answer. In these two cases, the user may not have remembered the preset answer and so did not input voice information, or the voice information the user input did not match the preset answer. The embodiment of this application then sends prompt information related to the preset question to the user, to improve the verification success rate and verification efficiency.
  • One or more pieces of prompt information can be set, all associated with the preset question in advance. When it is detected that voice information consistent with the preset answer has not been received within the preset time, the prompt information associated with the preset question is retrieved and sent to the user. If there are multiple prompts, the prompt with the highest priority is sent first. If voice information consistent with the preset answer is still not received within the preset time after that prompt is sent, the prompt with the second-highest priority is sent, and so on. If the prompts have no priority order, they can also be sent to the user at random, which helps reduce the complexity of the prompt process.
  • the verification process is terminated.
  • The verification process can be terminated by locking the verification interface, which prevents password guessing through repeated trial and error and avoids the power loss caused by such behavior.
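  • The prioritized prompting described above can be sketched as follows. This is a hedged sketch: the function `prompt_until_match`, the `(priority, text)` representation, the convention that a lower number means higher priority, and the rule that verification ends once every prompt has been used are all assumptions made for illustration.

```python
def prompt_until_match(hints, ask_user, preset_answer):
    """Send prompts in priority order until the reply matches the preset answer.

    hints: list of (priority, text) pairs; a lower number means higher
           priority (an assumption made for this sketch).
    ask_user: callable that sends one prompt and returns the recognized
              reply, or None if nothing arrived within the preset time.
    Returns True on a match, False if every prompt was used without one.
    """
    for _, text in sorted(hints, key=lambda pair: pair[0]):
        reply = ask_user(text)
        if reply is not None and reply == preset_answer:
            return True
    return False


sent = []


def fake_user(prompt):
    """Simulated user who only remembers the answer after the second prompt."""
    sent.append(prompt)
    return "blue" if prompt == "It is also the color of the sea" else None


hints = [(2, "It is also the color of the sea"), (1, "It is the color of the sky")]
matched = prompt_until_match(hints, fake_user, "blue")
```

  • A `False` result is the point at which a caller could lock the verification interface.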
  • Verification is first performed by comparing the semantic information in the voice message with the preset answer, that is, the text information is used for password verification. If the password verification is passed, user identity verification is then performed using the voiceprint information, as described in the voiceprint verification scheme above. Combining password verification with identity verification in this way helps improve the anti-interference and security of voice password verification.
  • Before the step of inputting the voiceprint information into the pre-trained recognition model in step S220, the method further includes establishing a joint probability model with interference factors added. The flow is shown in FIG. 4, and the establishment process is as follows:
  • S410: Retrieve voice samples stored in the database and add an interference factor to each voice sample to generate training samples, where the interference factors correspond to multiple different interference types;
  • S420: Extract feature information of the voiceprint information in the training samples, and establish an association between the voiceprint information and the user identity according to the feature information.
  • The interference factors encountered in the voiceprint recognition process, such as noise or multiple people talking, are collected in advance. The interference types in the embodiment of the present application therefore include multiple languages, microphone types, noise, and so on. The user's voice samples are obtained and the above interference types are added to each user's voice samples, with one interference type added to a voice sample to form one training sample. The number of training samples for each user is not less than the number of interference types. For example, if a user has N voice samples and there are M interference types, the user has at least M training samples.
  • The training samples obtained above are used to establish the association: the feature parameters in the training samples are extracted, and the weight coefficient of each feature parameter is determined iteratively from the training samples. Once the weight coefficients of all feature parameters are determined, the recognition model is obtained.
  • A recognition model formed from training data with added interference factors has a degree of anti-interference, which in turn improves the recognition accuracy of the recognition model.
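  • Step S410 can be illustrated with a small augmentation sketch. It assumes voice samples are one-dimensional NumPy arrays and uses two toy interference functions, additive Gaussian noise and a gain change standing in for a different microphone. The function names, the 10 dB signal-to-noise ratio, and the gain of 0.5 are hypothetical and are not taken from the application.

```python
import numpy as np


def add_noise(signal, rng, snr_db=10.0):
    """Add white Gaussian noise at a given signal-to-noise ratio (in dB)."""
    power = np.mean(signal ** 2)
    noise_power = power / (10.0 ** (snr_db / 10.0))
    return signal + rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)


def attenuate(signal, rng, gain=0.5):
    """Crude stand-in for a different microphone: scale the amplitude."""
    return gain * signal


def make_training_samples(voice_samples, interferences, seed=0):
    """S410: apply each interference type to each voice sample.

    Each training sample carries exactly one interference type, so a user
    with N voice samples and M interference types gets N * M samples.
    """
    rng = np.random.default_rng(seed)
    training = []
    for user, samples in voice_samples.items():
        for sample in samples:
            clean = np.asarray(sample, dtype=float)
            for name, apply in interferences.items():
                training.append((user, name, apply(clean, rng)))
    return training


# Two toy voice samples for one hypothetical user, two interference types.
voice_samples = {"alice": [np.sin(np.linspace(0, 6.28, 100))] * 2}
interferences = {"noise": add_noise, "microphone": attenuate}
training = make_training_samples(voice_samples, interferences)
```

  • With N = 2 samples and M = 2 interference types this produces four training samples, which satisfies the requirement that each user has at least M.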
  • The voice password verification method provided in the foregoing embodiment can improve the anti-interference performance of the voice password verification process. To further improve the security of the voice password verification method, this application provides the following solution:
  • The user inputs voice information through a microphone, and the voice information is parsed to obtain the user's voiceprint information. A joint probability recognition model based on recognition features (I-Vector features) is used, which is built on the probabilistic linear discriminant analysis (PLDA) algorithm. This algorithm has good channel compensation performance and can strengthen the speaker information. The purpose of channel compensation is to reduce the interference of channel information with the speaker information in the I-Vector feature, which further improves the anti-interference of the recognition model. From the perspective of pattern recognition, the algorithm increases the dispersion between classes and reduces the dispersion within classes, which yields higher discrimination and improves the recognition performance of the recognition model.
  • This application uses the PLDA algorithm to score voiceprint information, and the algorithm can perform channel compensation. The acoustic features used in voiceprint recognition mainly contain text information and channel information, and speaker information is only a weak component of them. Because the PLDA algorithm used in this application strengthens the speaker information, it can further improve the anti-interference of the voice password verification scheme.
  • $m_i$ represents the voice sample vector of the speaker $s_i$
  • $i$ represents the number of speakers
  • $\mu$ is the global average of the training data
  • $y_{s_i}$ is the feature representation of $m_i$ in the speaker space
  • $V$ represents the feature vector of the inter-class space
  • $x_i$ is the interference variable with size $R_x$
  • $\epsilon_i$ is the noise variable
  • $U$ represents the intra-class space
  • $W_j$ represents the feature parameter
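  • Under the standard PLDA formulation, the symbols defined above would combine as shown below. This is a sketch that assumes the usual decomposition of a sample into a global mean, a speaker term, an interference term, and residual noise; it is not a verbatim equation from the application, and the role of the feature parameter $W_j$ is not specified by the definitions, so it is left out.

```latex
m_i = \mu + V\, y_{s_i} + U\, x_i + \epsilon_i
```

  • In this form, $V y_{s_i}$ carries the between-speaker (inter-class) variation and $U x_i$ carries the within-speaker (intra-class) variation caused by interference.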
  • The weights and PLDA model parameters corresponding to the interference factors in the recognition model are obtained.
  • The EM algorithm essentially uses maximum likelihood estimation to solve for the parameters of a probability model that contains hidden variables. In each iteration, the E-step first computes the expectation of the hidden variables given the training data, and the M-step then maximizes this expectation. The iterations converge to a local optimum.
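  • The alternation between the E-step and the M-step can be seen in a toy example. The sketch below is not the PLDA training procedure: it runs EM on a one-dimensional mixture of two unit-variance, equally weighted Gaussians, where the hidden variable is which component produced each point. The function name `em_two_means`, the initial means, and the synthetic data are assumptions chosen only to illustrate the iteration.

```python
import numpy as np


def em_two_means(x, mu_init=(-1.0, 1.0), iterations=50):
    """EM for a toy 1-D mixture of two unit-variance, equally weighted Gaussians.

    Only the two component means are estimated.
    """
    x = np.asarray(x, dtype=float)
    mu = np.array(mu_init, dtype=float)
    for _ in range(iterations):
        # E-step: posterior probability of each component for each point.
        log_lik = -0.5 * (x[:, None] - mu[None, :]) ** 2
        log_lik -= log_lik.max(axis=1, keepdims=True)  # numerical stability
        resp = np.exp(log_lik)
        resp /= resp.sum(axis=1, keepdims=True)
        # M-step: means that maximize the expected log-likelihood.
        mu = (resp * x[:, None]).sum(axis=0) / resp.sum(axis=0)
    return mu


# Synthetic data drawn around -5 and +5 (hypothetical values).
rng = np.random.default_rng(0)
data = np.concatenate([rng.normal(-5.0, 1.0, 200), rng.normal(5.0, 1.0, 200)])
means = em_two_means(data)
```

  • Starting from the poor initial guesses of -1 and 1, the estimated means move toward the two cluster centres. As the text notes, EM only guarantees a local optimum, so a different initialization can give a different result on harder data.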
  • The feature likelihood between the currently obtained voiceprint information (that is, the voiceprint information to be verified) and the preset identity information is calculated, and the voiceprint information is scored according to this feature likelihood. The process is shown in FIG. 5 and is as follows:
  • S510: Retrieve the preset identity information, and obtain the feature likelihood between the voiceprint information and the preset identity information;
  • S520: Score the voiceprint information according to the obtained feature likelihood to obtain a score value of the voiceprint information.
  • The preset identity information is identity information stored in advance for at least one user. For example, if this application is applied to a door lock, the preset identity information is the stored identity information of the users who are allowed to open the lock.
  • The preset identity information stored in the database is retrieved, the feature likelihood between the currently obtained voiceprint information and the preset identity information is obtained, the expectation-maximization algorithm is used to solve iteratively, and the log-likelihood ratio is used to calculate the score value.
  • The following log-likelihood-ratio formula is used to calculate the score value of the voiceprint information:
  • $$\mathrm{score} = \log \frac{p(\eta_1, \eta_2 \mid H_s)}{p(\eta_1 \mid H_d)\, p(\eta_2 \mid H_d)}$$
  • Here $\eta_1$ and $\eta_2$ are the recognition feature vectors of the two utterances; $H_s$ is the hypothesis that the two utterances come from the same speaker, and $H_d$ is the hypothesis that they come from different speakers; $p(\eta_1, \eta_2 \mid H_s)$ is the likelihood of the two utterances coming from the same speaker; and $p(\eta_1 \mid H_d)$ and $p(\eta_2 \mid H_d)$ are the likelihoods of $\eta_1$ and $\eta_2$, respectively, coming from different speakers.
  • The degree of similarity between the voiceprint information to be verified and the preset identity information increases with the score: the higher the ratio, the higher the score, and the greater the probability that the two utterances belong to the same speaker; the lower the ratio, the lower the score, and the less likely the two utterances are to belong to the same speaker.
  • Each training sample contains one type of interference. The inter-class distance of different training samples is calculated, and the score is based on the distance between the sample to be tested and the stored standard sample. The greater the likelihood that the voice features of the two samples are the same, the more likely it is that the two samples belong to the same speaker.
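  • A log-likelihood-ratio score of this kind can be sketched with a simplified Gaussian model. The code below assumes a two-covariance variant of PLDA, in which a speaker variable has between-class covariance `between` and each utterance adds within-class variation with covariance `within`. The function names, the toy covariances, and the two-dimensional feature vectors are assumptions for illustration, not values from the application.

```python
import numpy as np


def log_gaussian(x, mean, cov):
    """Log density of a multivariate normal distribution."""
    diff = x - mean
    _, logdet = np.linalg.slogdet(cov)
    quad = diff @ np.linalg.solve(cov, diff)
    return -0.5 * (len(x) * np.log(2.0 * np.pi) + logdet + quad)


def llr_score(eta1, eta2, mu, between, within):
    """Log-likelihood ratio of 'same speaker' versus 'different speakers'."""
    eta1 = np.asarray(eta1, dtype=float)
    eta2 = np.asarray(eta2, dtype=float)
    total = between + within
    # Same speaker: both vectors share one speaker variable, so they are correlated.
    joint_cov = np.block([[total, between], [between, total]])
    same = log_gaussian(np.concatenate([eta1, eta2]),
                        np.concatenate([mu, mu]), joint_cov)
    # Different speakers: the two vectors are independent.
    different = log_gaussian(eta1, mu, total) + log_gaussian(eta2, mu, total)
    return same - different


mu = np.zeros(2)
between = np.eye(2)        # between-class (speaker) covariance, assumed
within = 0.1 * np.eye(2)   # within-class (interference) covariance, assumed
same_pair = llr_score([1.0, 1.0], [1.0, 1.0], mu, between, within)
opposite_pair = llr_score([1.0, 1.0], [-1.0, -1.0], mu, between, within)
```

  • Two identical vectors give a positive score and two opposite vectors give a negative one, which matches the rule that a higher ratio means the utterances are more likely to come from the same speaker. The score would then be compared with the preset threshold.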
  • The embodiment of the present application also provides a voice password verification device. Its structure is shown in FIG. 6 and includes a parsing module 610, a similarity obtaining module 620, and a verification module 630, as follows:
  • The parsing module 610 is configured to receive voice information input by a user, and parse the voice information to obtain voiceprint information of the user;
  • The similarity obtaining module 620 is configured to input the voiceprint information into a pre-trained recognition model to obtain the similarity between the voiceprint information and preset identity information, wherein the recognition model is an association between voiceprint information and user identity formed from training samples containing interference factors;
  • The verification module 630 is configured to score the voiceprint information according to the similarity to obtain a score value of the voiceprint information, and if the score value exceeds a preset threshold, the verification is passed.
  • An embodiment of the present application also provides a non-volatile computer-readable storage medium having computer instructions stored thereon. When the computer instructions are executed by a processor, the steps of any one of the above-mentioned voice password verification methods are implemented.
  • The storage medium includes, but is not limited to, any type of disk (including floppy disk, hard disk, optical disk, CD-ROM, and magneto-optical disk), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic card, or optical card. That is, the storage medium includes any medium that stores or transmits information in a form readable by a device (for example, a computer). It can be a read-only memory, magnetic disk, optical disk, etc.
  • An embodiment of the present application also provides a computer device, and the computer device includes:
  • one or more processors;
  • a storage device for storing one or more programs;
  • when the one or more programs are executed by the one or more processors, the one or more processors implement the steps of the voice password verification method described in any one of the foregoing.
  • FIG. 7 is a block diagram showing a computer device 700 according to an exemplary embodiment. For example, the computer device 700 may be provided as a server.
  • The computer device 700 includes a processing component 722, which further includes one or more processors, and a memory resource represented by a memory 732 for storing instructions executable by the processing component 722, such as an application program. The application program stored in the memory 732 may include one or more modules, each corresponding to a set of instructions. The processing component 722 is configured to execute the instructions to perform the steps of the voice password verification method described above.
  • The computer device 700 may also include a power supply component 726 configured to perform power management of the computer device 700, a wired or wireless network interface 750 configured to connect the computer device 700 to a network, and an input/output (I/O) interface 758.
  • The computer device 700 can operate based on an operating system stored in the memory 732, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
  • It should be understood that, although the steps in the flowcharts of the drawings are shown in the sequence indicated by the arrows, these steps are not necessarily executed in that sequence. Unless explicitly stated herein, the execution of these steps is not strictly limited in order, and they can be executed in other orders.
  • Steps in the flowcharts of the drawings may include multiple sub-steps or multiple stages. These sub-steps or stages are not necessarily executed at the same time, but can be executed at different times, and they are not necessarily performed sequentially; they may be performed in turn or alternately with other steps, or with at least a part of the sub-steps or stages of other steps.

Abstract

The present invention relates to the technical field of security verification, and particularly relates to a voice password verification method and apparatus, a storage medium, and a computer device. The voice password verification method comprises: receiving voice information inputted by a user, and parsing the voice information to acquire voiceprint information of the user (S210); inputting the voiceprint information into a pre-trained recognition model and acquiring the degree of similarity between the voiceprint information and preset identity information (S220), wherein the recognition model is an association between user identity and voiceprint information formed from training samples containing an interference factor; and, on the basis of the degree of similarity, scoring the voiceprint information to acquire a score of the voiceprint information, the verification passing if the score exceeds a preset threshold (S230). The provided solution can increase the anti-interference and the accuracy of voice password verification.

Description

Voice password verification method, apparatus, storage medium, and computer device
This application claims priority to Chinese patent application No. 201910270003.X, filed with the Chinese Patent Office on April 4, 2019 and entitled "Voice password verification method, apparatus, storage medium, and computer device", the entire contents of which are incorporated herein by reference.
Technical field
This application relates to the technical field of security verification, and specifically to a voice password verification method, an apparatus, a storage medium, and a computer device.
Background
With advances in technology and the rise of the smart home concept, more and more smart products have appeared on the market, such as robot vacuum cleaners, smart locks, and smart water heaters. Because a voiceprint is a unique biometric characteristic, techniques that verify passwords on the basis of voiceprint information have also appeared.
A voice password is a technique that doubly encrypts user information using both the text information and the speaker information in a speech segment. It offers good security and convenience and has promising applications in finance, insurance, public security, smart devices, and other fields. The inventors realized that, in current technical research, the acoustic features used in conventional voiceprint password recognition mainly carry text information and channel information, while speaker information is only weakly represented. As a result, the password recognition process still suffers from shortcomings such as poor robustness to interference.
Summary of the invention
This application provides a voice password verification method, an apparatus, a computer-readable storage medium, and a computer device, so as to improve the robustness of voice password recognition to interference.
An embodiment of this application first provides a voice password verification method, including:
receiving voice information input by a user, and parsing the voice information to obtain voiceprint information of the user;
inputting the voiceprint information into a pre-trained recognition model to obtain the similarity between the voiceprint information and preset identity information, wherein the recognition model is an association between voiceprint information and user identity formed from training samples containing interference factors;
scoring the voiceprint information according to the similarity to obtain a score value of the voiceprint information, verification passing if the score value exceeds a preset threshold.
To solve the above technical problem, an embodiment of this application further provides a voice password verification apparatus, including:
a parsing module, configured to receive voice information input by a user and parse the voice information to obtain voiceprint information of the user;
a similarity obtaining module, configured to input the voiceprint information into a pre-trained recognition model to obtain the similarity between the voiceprint information and preset identity information, wherein the recognition model is an association between voiceprint information and user identity formed from training samples containing interference factors;
a verification module, configured to score the voiceprint information according to the similarity to obtain a score value of the voiceprint information, verification passing if the score value exceeds a preset threshold.
To solve the above problem, an embodiment of this application further provides a non-volatile computer-readable storage medium storing computer instructions which, when run on a computer, cause the computer to perform the steps of the voice password verification method of any of the above technical solutions, the steps including:
receiving voice information input by a user, and parsing the voice information to obtain voiceprint information of the user;
inputting the voiceprint information into a pre-trained recognition model to obtain the similarity between the voiceprint information and preset identity information, wherein the recognition model is an association between voiceprint information and user identity formed from training samples containing interference factors;
scoring the voiceprint information according to the similarity to obtain a score value of the voiceprint information, verification passing if the score value exceeds a preset threshold.
Furthermore, an embodiment of this application provides a computer device, including:
one or more processors; and
a storage device configured to store one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the steps of the voice password verification method described above, the steps including:
receiving voice information input by a user, and parsing the voice information to obtain voiceprint information of the user;
inputting the voiceprint information into a pre-trained recognition model to obtain the similarity between the voiceprint information and preset identity information, wherein the recognition model is an association between voiceprint information and user identity formed from training samples containing interference factors;
scoring the voiceprint information according to the similarity to obtain a score value of the voiceprint information, verification passing if the score value exceeds a preset threshold.
The voice password verification method provided by the embodiments of this application obtains the user's voiceprint information by parsing the voice information, inputs the voiceprint information into a pre-trained recognition model, and verifies identity according to the similarity between the voiceprint information and preset identity information: if the similarity between the currently received voiceprint information and the preset identity information exceeds a preset threshold, verification passes. Because the recognition model used to obtain this similarity is trained on samples containing interference factors, the model has a degree of robustness to interference when processing voiceprint information, which improves recognition accuracy. Further, this application recognizes voiceprint information with a joint probability recognition model based on recognition features. This algorithm strengthens the speaker information in the voice information, further improving both the robustness to interference and the recognition performance of the model.
Additional aspects and advantages of this application will be given in part in the following description; they will become apparent from that description or be learned through practice of this application.
Description of the drawings
FIG. 1 is a diagram of the implementation environment of a voice password verification method provided by an embodiment of this application;
FIG. 2 is a schematic flowchart of a voice password verification method provided by an embodiment of this application;
FIG. 3 is a schematic flowchart of password verification in the voice password verification method provided by an embodiment of this application;
FIG. 4 is a schematic flowchart of establishing a joint probability model with added interference factors, provided by an embodiment of this application;
FIG. 5 is a schematic flowchart of scoring voiceprint information according to the feature likelihood, provided by an embodiment of this application;
FIG. 6 is a schematic structural diagram of a voice password verification apparatus provided by an embodiment of this application;
FIG. 7 is a schematic structural diagram of a computer device provided by an embodiment of this application.
Detailed description
The embodiments of this application are described in detail below. Examples of the embodiments are shown in the accompanying drawings, in which the same or similar reference numerals denote, throughout, the same or similar elements or elements having the same or similar functions. The embodiments described below with reference to the drawings are exemplary; they serve only to explain this application and are not to be construed as limiting it.
Those skilled in the art will understand that, unless expressly stated otherwise, the singular forms "a", "an", "said", and "the" used herein may also include the plural. It should be further understood that the term "comprising" used in this specification refers to the presence of the stated features, integers, steps, operations, elements, and/or components, but does not exclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof.
FIG. 1 is a diagram of the implementation environment of the voice password verification method provided in one embodiment. The implementation environment includes a user terminal and a server.
The voice password verification method provided by this embodiment can be applied on the server. The server receives voice information input by a user and parses the voice information to obtain voiceprint information of the user; inputs the voiceprint information into a pre-trained recognition model and, based on the recognition model, obtains the similarity between the voiceprint information and preset identity information, wherein the recognition model is an association between voiceprint information and user identity formed from training samples containing interference factors; scores the voice information according to the similarity to obtain a score value of the voice information; and passes verification if the score value exceeds a preset threshold.
It should be noted that the user terminal may be a smartphone, a tablet computer, a notebook computer, a desktop computer, or the like, and the server may be implemented by a computer device with processing capability, but neither is limited to these. The server and the user terminal may be connected via Bluetooth, USB (Universal Serial Bus), or another communication connection; this application imposes no limitation in this respect.
In one embodiment, FIG. 2 is a schematic flowchart of the voice password verification method provided by an embodiment of this application. The method can be applied on the server described above and includes the following steps:
Step S210: receive voice information input by a user, and parse the voice information to obtain voiceprint information of the user;
Step S220: input the voiceprint information into a pre-trained recognition model and, based on the recognition model, obtain the similarity between the voiceprint information and preset identity information, wherein the recognition model is an association between voiceprint information and user identity formed from training samples containing interference factors;
Step S230: score the voice information according to the similarity to obtain a score value of the voice information; if the score value exceeds a preset threshold, verification passes.
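The control flow of steps S210 to S230 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation described in this application: the names `verify_voice`, `extract_voiceprint`, and `score` are hypothetical, and the feature extractor and scoring model are passed in as callables because the steps above do not fix them.

```python
from typing import Callable, Dict, Sequence

Vector = Sequence[float]


def verify_voice(
    audio: bytes,
    extract_voiceprint: Callable[[bytes], Vector],  # hypothetical front end (S210)
    score: Callable[[Vector, Vector], float],       # hypothetical recognition model (S220)
    enrolled: Dict[str, Vector],                    # preset identity information
    threshold: float,                               # preset threshold (S230)
) -> bool:
    """Return True when the best score against any enrolled identity exceeds the threshold."""
    voiceprint = extract_voiceprint(audio)                           # S210
    scores = [score(voiceprint, ref) for ref in enrolled.values()]   # S220
    return bool(scores) and max(scores) > threshold                  # S230
```

Injecting the extractor and the scorer keeps the threshold decision separate from the model, so the same flow applies whichever recognition model is trained.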
The voice password verification method provided by this embodiment performs identity verification based on voiceprint information. The training samples used to train the recognition model that identifies the user contain interference factors, so the resulting recognition model has a degree of robustness to interference. When this model is used to recognize voiceprint information, it recognizes the voiceprint accurately, which improves the accuracy and efficiency of voiceprint recognition.
To make the voice password verification scheme provided by this application and its technical effects clearer, specific solutions are set out in detail below through several embodiments.
Before the step in S210 of parsing the voice information to obtain the user's voiceprint information, password verification may be performed first to improve the security of the verification scheme. Its flow is shown in FIG. 3 and includes the following sub-steps:
S310: receive identity verification request information and, in response to the request, retrieve a preset question from the database and send it to the user;
S320: receive the voice information sent by the user in answer to the preset question, and parse the voice information to obtain the semantic information in it;
S330: determine whether the semantic information is consistent with the preset answer;
S331: if it is consistent, proceed to the parsing of the voice information in step S210 to obtain the user's voiceprint information.
The preset question may be set by the system and combined with a preset answer provided by the user, with an association established between the preset question and the preset answer; alternatively, both the preset question and its preset answer may be defined by the user. In either case, the preset question, the preset answer, and the association between them are stored in the database.
One or more preset questions may be set. If the database holds several preset questions, one is selected at random and sent to the user. Compared with a scheme that uses a single question or static features for recognition, selecting the preset question at random helps improve the security of password verification.
The solution provided by this embodiment further includes: if the semantic information is inconsistent with the preset answer, verification is terminated.
The solution provided by this embodiment works as follows. On receiving identity verification request information from the user, the server first enters the password verification stage: it retrieves a preset question from the database and returns it to the user. The user receives the preset question and gives a response through a voice input module, which may be a microphone. The server receives the voice information containing the response, parses it to obtain the semantic information, and uses the semantic information to check whether the user's answer is correct. If the semantic information is inconsistent with the preset answer, the user is not a stored standard user and verification fails. If it is consistent, this round of password verification is passed, and the voiceprint verification process described above can then serve as a second check, enhancing the security and accuracy of voice password verification.
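The password stage (S310 to S331) can be sketched as below. The sketch assumes that a speech recognizer, not shown, has already turned the user's reply into text, and it compares that text with the preset answer by exact match after normalizing case and whitespace. That comparison rule and the names `pick_question`, `normalize`, and `answer_matches` are illustrative assumptions; the description above does not specify how consistency is judged.

```python
import random
from typing import Dict, Optional


def pick_question(questions: Dict[str, str], rng: Optional[random.Random] = None) -> str:
    """Randomly select one preset question (S310); keys are questions, values are answers."""
    rng = rng or random.Random()
    return rng.choice(sorted(questions))


def normalize(text: str) -> str:
    """Lower-case the text and collapse whitespace (an assumed comparison rule)."""
    return " ".join(text.lower().split())


def answer_matches(questions: Dict[str, str], question: str, recognized_text: str) -> bool:
    """Check the recognized reply against the preset answer for the question (S330)."""
    return normalize(recognized_text) == normalize(questions[question])
```

Only when `answer_matches` returns True would the flow continue to voiceprint extraction in S210; otherwise verification ends.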
Further, if no voice information consistent with the preset answer is received within a preset time, the following operation may be performed to increase the verification pass rate:
S340: retrieve prompt information associated with the preset question and send the prompt information to the user.
In this embodiment, the prompt information is associated with the preset question, and the reference answer to the preset question is the preset answer. The preset question is likewise associated with the preset answer in advance, so that either the prompt information or the preset answer can be retrieved by way of the preset question.
The preset answer is the standard answer associated with the preset question. Failure to receive voice information consistent with the preset answer within the preset time covers two cases: first, no voice information is received from the user within the preset time; second, voice information is received within the preset time but does not match the preset answer. In the first case the user may not have recalled the preset answer and so gave no input; in the second the user's input did not match the preset answer. In either case, this embodiment sends prompt information related to the preset question to the user, to improve the success rate and efficiency of verification.
One or more items of prompt information may be set, each associated with the preset question in advance. When it is detected that no voice information consistent with the preset answer has been received within the preset time, the prompt information associated with the preset question is retrieved and sent to the user. If there are several items, they are ordered by priority and the highest-priority item is sent first; if no matching voice information is received within the preset time after that item is sent, the item with the next-highest priority is sent, and so on. If the items have no priority order, one may instead be sent at random, which reduces the complexity of the prompting process.
Further, if the number of wrong answers received exceeds a preset threshold, the verification process is terminated. Verification may be terminated by locking the verification interface, which prevents password guessing through repeated trial and error and avoids the power consumption that such attempts would cause.
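The prompting and lockout behavior described above can be sketched as a small session object. The class name `ChallengeSession`, its fields, and the default limit of three wrong answers are hypothetical choices made for illustration; the text only requires that prompts be sent in priority order and that verification stop once the number of wrong answers exceeds a preset threshold.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class ChallengeSession:
    hints: List[str]             # prompt information, highest priority first
    max_wrong_answers: int = 3   # hypothetical preset threshold
    wrong_answers: int = 0
    hints_sent: int = 0
    locked: bool = False

    def next_hint(self) -> Optional[str]:
        """Call when no matching answer arrived within the preset time (S340)."""
        if self.locked or self.hints_sent >= len(self.hints):
            return None
        hint = self.hints[self.hints_sent]
        self.hints_sent += 1
        return hint

    def record_wrong_answer(self) -> bool:
        """Count a wrong answer; return True once the session is locked."""
        self.wrong_answers += 1
        if self.wrong_answers > self.max_wrong_answers:
            self.locked = True
        return self.locked
```

A caller would invoke `next_hint` on each timeout and `record_wrong_answer` on each mismatched reply, and would lock the verification interface as soon as the latter returns True.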
It is determined whether the semantic information is consistent with the preset answer. If it is, the voice information is parsed to obtain the user's voiceprint information; if not, verification fails, indicating that the user requesting identity verification is an unauthorized user.
In the solution provided by this embodiment, after the identity verification request information sent by the user is received, verification is first performed by comparing the semantic information in the voice information with the preset answer, that is, password verification is performed on the text information. If password verification passes, the user's identity is then verified using the voiceprint information. Together with the voiceprint verification scheme described above, this solution combines password verification with identity verification, which helps improve the robustness to interference and the security of voice password verification.
In one embodiment, before the step in S220 of inputting the voiceprint information into the pre-trained recognition model, the method further includes establishing a joint probability model with added interference factors. Its flow is shown in FIG. 4, and the model is established as follows:
S410: retrieve the voice samples stored in the database and add an interference factor to each voice sample to generate training samples, wherein the interference factors correspond to several different interference types;
S420: extract feature information of the voiceprint information in the training samples, and establish the association between voiceprint information and user identity according to the feature information.
Interference factors encountered in voiceprint recognition, such as noise and multiple people speaking, are collected in advance; accordingly, the interference types in this embodiment include multiple languages, microphone type, noise, and the like. Voice samples of each user are obtained, and the above interference types are added to each user's voice samples. Adding one interference type to one voice sample yields one training sample, and the number of training samples for each user is no less than the number of interference types. For example, if a user has N voice samples and there are M interference types, that user has no fewer than M training samples.
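Step S410 can be sketched as follows. The sketch is a minimal illustration under stated assumptions: signals are plain lists of floats, white Gaussian noise at a chosen signal-to-noise ratio stands in for the noise interference type, and a simple gain change stands in for a microphone difference. The function names and both interference models are assumptions, not the augmentation used in this application.

```python
import math
import random
from typing import Callable, Dict, List, Sequence, Tuple

Signal = List[float]


def add_noise(signal: Sequence[float], snr_db: float, rng: random.Random) -> Signal:
    """Add white Gaussian noise so that signal power / noise power equals snr_db in dB."""
    power = sum(x * x for x in signal) / len(signal)
    noise_std = math.sqrt(power / (10 ** (snr_db / 10)))
    return [x + rng.gauss(0.0, noise_std) for x in signal]


def scale_gain(signal: Sequence[float], gain: float) -> Signal:
    """Crude stand-in for a different microphone: scale the amplitude."""
    return [gain * x for x in signal]


def build_training_set(
    samples: Dict[str, List[Signal]],
    interferences: Dict[str, Callable[[Signal], Signal]],
) -> List[Tuple[str, str, Signal]]:
    """Apply every interference type to every voice sample of every user (S410).

    Returns (user_id, interference_name, signal) triples, so a user with N samples
    and M interference types contributes N * M >= M training samples.
    """
    training = []
    for user_id, user_samples in samples.items():
        for sample in user_samples:
            for name, apply in interferences.items():
                training.append((user_id, name, apply(sample)))
    return training
```

Keeping the interference name with each training sample makes it possible to group samples by interference type later, when the model parameters are estimated.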
The training samples obtained above are used to establish the association: feature parameters are extracted from the training samples, the weight coefficient of each feature parameter is iteratively determined from the training samples, and once the weight coefficients are determined the recognition model is obtained.
Because interference factors are added to the training data used to build the recognition model, the resulting model has a degree of robustness to interference, which in turn improves its recognition accuracy.
The voice password verification method provided in the above embodiments improves the robustness of the verification process to interference. To further improve the security of the method, this application provides the following solution:
In one embodiment, the user inputs voice information through a microphone, the voice information is parsed to obtain the user's voiceprint information, and a joint probability recognition model based on recognition features (I-Vector features) is used. The model is obtained with probabilistic linear discriminant analysis (PLDA), which has good channel compensation performance and can strengthen the speaker information. The purpose of channel compensation is to reduce the interference of channel information with speaker information in the I-Vector features, further improving the robustness of the recognition model. From a pattern recognition perspective, the algorithm increases the between-class scatter and reduces the within-class scatter, which yields better discrimination and improves the recognition performance of the model.
To improve robustness to interference during verification, the solution provided by this application not only adds interference factors while building the recognition model, but also scores the voiceprint information with the PLDA algorithm, which performs channel compensation. The acoustic features used in voiceprint recognition mainly carry text information and channel information, with speaker information only weakly represented; the PLDA algorithm used in this application strengthens the speaker information and can therefore further improve the robustness of the voice password verification scheme.
The formula of the joint probability recognition model provided by this application is as follows:
Figure PCTCN2019103048-appb-000001
Here m_i denotes the voice sample vector of speaker s_i; i denotes the speaker number; μ is the global mean of the training data; y_{s_i} is the representation of m_i in the speaker space; V denotes the eigenvectors of the between-class space; x_i is the interference variable of size Rx; ε_i is the noise variable; j = 1, 2, …, N, where N is a positive integer and the speaker variability is decomposed into components corresponding to N different interference types; U denotes the eigenvectors of the within-class space; and W_j denotes a feature parameter.
Using the training samples in the training set and the EM (expectation-maximization) algorithm, the weight of each interference factor in the recognition model and the PLDA model parameters are obtained. The EM algorithm in essence uses maximum likelihood estimation to solve for the parameters of a probabilistic model with hidden variables. In each iteration, the E-step computes the expectation over the hidden variables given the training data, and the M-step maximizes that expectation; the iterations gradually converge to a local optimum.
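The two EM steps described above can be written in their generic textbook form. This is the standard statement of the algorithm, not the PLDA-specific update equations, which the text does not give. The symbols are assumptions introduced for illustration: X stands for the observed training vectors, Z for the hidden variables (here, the speaker and interference factors), and θ for the model parameters.

```latex
% E-step: expected complete-data log-likelihood under the current parameters
Q\bigl(\theta \mid \theta^{(t)}\bigr)
  = \mathbb{E}_{Z \mid X,\, \theta^{(t)}}\bigl[\log p(X, Z \mid \theta)\bigr]

% M-step: choose the parameters that maximize this expectation
\theta^{(t+1)} = \arg\max_{\theta} \; Q\bigl(\theta \mid \theta^{(t)}\bigr)
```

Each iteration does not decrease the observed-data likelihood p(X | θ), which is why the procedure converges to a local rather than a guaranteed global optimum.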
After the PLDA model parameters are obtained as described above, the feature likelihood between the currently obtained voiceprint information, which is the voiceprint information to be verified, and the preset identity information is computed, and the voiceprint information is scored according to the feature likelihood. The flow is shown in FIG. 5, and the specific process is as follows:
S510: retrieve the preset identity information, and compute the feature likelihood between the voiceprint information and the preset identity information;
S520: score the voiceprint information according to the feature likelihood obtained, to obtain the score value of the voiceprint information.
The preset identity information is the pre-stored identity information of users, of which there is at least one. If this application is applied to a door lock, for example, the preset identity information is the pre-stored identity information of the users permitted to open that lock. The preset identity information stored in the database is retrieved, the feature likelihood between the currently obtained voiceprint information and each item of preset identity information is obtained, the expectation-maximization algorithm is used for the iterative solution, and the score value is computed with the log-likelihood ratio.
Preferably, the score value of the voiceprint information is calculated with the following formula:
Figure PCTCN2019103048-appb-000002
Figure PCTCN2019103048-appb-000002
上述公式中,η 1和η 2分别是两端语音的识别特征矢量,两条语音来自同一说话人的概率假设为H s,来自不同说话人的概率为H d,p(η 12|H s)为两条语音来自同一说话人的似然函数;p(η 1|H d),p(η 2|H d)分别为η 1和η 2来子不同说话人的似然函数。通过计算对数似然比,就能衡量两条语音的相似程度。待验证的声纹信息与预设身份信息的相似程度与评分高低成正比:比值越高,得分越高,两条语音属于同一说话人的可能性越大;比值越低,得分越低,则两条语音属于同一说话人的可能性越小。 In the above formula, η 1 and η 2 are the recognition feature vectors of the speech at both ends respectively. The probability that the two speeches come from the same speaker is assumed to be H s , and the probability of coming from different speakers is H d , p(η 12 |H s ) is the likelihood function of two voices from the same speaker; p(η 1 |H d ), p(η 2 |H d) are the likelihood functions of η 1 and η 2 from different speakers, respectively . By calculating the log-likelihood ratio, the similarity of two voices can be measured. The degree of similarity between the voiceprint information to be verified and the preset identity information is proportional to the score: the higher the ratio, the higher the score, and the greater the probability that the two voices belong to the same speaker; the lower the ratio, the lower the score, then The two voices are less likely to belong to the same speaker.
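The log-likelihood ratio can be made concrete with a deliberately simplified case. The sketch below assumes scalar features and a two-covariance Gaussian model in which each feature is the sum of a speaker term (variance `between_var`) and a session or noise term (variance `within_var`). These assumptions, the function name `plda_llr`, and its parameters are illustrative and are not the model trained in the application.

```python
import math


def plda_llr(eta1, eta2, mean, between_var, within_var):
    """Log-likelihood ratio log p(eta1, eta2 | Hs) / (p(eta1 | Hd) p(eta2 | Hd)).

    Assumed model: eta = mean + y + e, with y ~ N(0, between_var) shared by both
    features under Hs and e ~ N(0, within_var) independent per feature.
    """
    if within_var <= 0 or between_var < 0:
        raise ValueError("within_var must be > 0 and between_var must be >= 0")
    total_var = between_var + within_var
    d1, d2 = eta1 - mean, eta2 - mean
    # Same-speaker hypothesis: bivariate Gaussian with covariance
    # [[total_var, between_var], [between_var, total_var]].
    det = total_var ** 2 - between_var ** 2
    quad = (total_var * d1 * d1 - 2.0 * between_var * d1 * d2 + total_var * d2 * d2) / det
    log_same = -math.log(2.0 * math.pi) - 0.5 * math.log(det) - 0.5 * quad
    # Different-speaker hypothesis: two independent Gaussians with variance total_var.
    log_diff = sum(
        -0.5 * math.log(2.0 * math.pi * total_var) - 0.5 * d * d / total_var
        for d in (d1, d2)
    )
    return log_same - log_diff
```

Under these assumptions, two features that deviate from the mean in the same direction yield a positive score, while features on opposite sides of the mean yield a negative score, which matches the interpretation of the ratio given in the text.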
Each training sample contains one interference type. The inter-class distances between different training samples are calculated, and scoring is based on the distance between the sample under test and the stored standard sample: the greater the likelihood that the speech features represented by two samples are the same, the more likely the two samples belong to the same speaker.
The above are embodiments of the voice password verification method provided by this application. Embodiments of the corresponding voice password verification apparatus are described below.
An embodiment of the present application also provides a voice password verification apparatus, shown schematically in Figure 6, which includes a parsing module 610, a similarity obtaining module 620, and a verification module 630, as follows:
The parsing module 610 is configured to receive voice information input by a user and parse the voice information to obtain the user's voiceprint information;
The similarity obtaining module 620 is configured to input the voiceprint information into a pre-trained recognition model to obtain the similarity between the voiceprint information and preset identity information, wherein the recognition model is an association between voiceprint information and user identity formed from training samples that contain interference factors;
The verification module 630 is configured to score the voiceprint information according to the similarity to obtain a score value of the voiceprint information, and if the score value exceeds a preset threshold, the verification passes.
For the voice password verification apparatus in the above embodiment, the specific manner in which each module performs its operations has been described in detail in the method embodiments and is not repeated here.
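The division of work among the three modules can be sketched as a small composition of callables. This is only an illustration under stated assumptions: the class name `VoicePasswordVerifier`, its field names, and the choice to use the similarity value directly as the score are hypothetical and are not specified by the application.

```python
from dataclasses import dataclass
from typing import Callable, Sequence

Vector = Sequence[float]


@dataclass
class VoicePasswordVerifier:
    # Plays the role of parsing module 610: raw audio -> voiceprint features.
    parse: Callable[[bytes], Vector]
    # Plays the role of similarity obtaining module 620: voiceprint -> similarity.
    similarity: Callable[[Vector], float]
    # Preset threshold applied by the role of verification module 630.
    threshold: float

    def verify(self, audio: bytes) -> bool:
        voiceprint = self.parse(audio)
        # Assumption: the score value is taken to be the similarity itself.
        score = self.similarity(voiceprint)
        return score > self.threshold
```

In a real system `parse` would wrap feature extraction and `similarity` would wrap the pre-trained recognition model; here both are left as injected functions so the structure can be exercised with toy stand-ins.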
Further, an embodiment of the present application also provides a non-volatile computer-readable storage medium storing computer instructions that, when executed by a processor, implement the steps of the voice password verification method described in any one of the above. The storage medium includes, but is not limited to, any type of disk (including floppy disks, hard disks, optical disks, CD-ROMs, and magneto-optical disks), ROM (Read-Only Memory), RAM (Random Access Memory), EPROM (Erasable Programmable Read-Only Memory), EEPROM (Electrically Erasable Programmable Read-Only Memory), flash memory, magnetic cards, or optical cards. That is, the storage medium includes any medium that stores or transmits information in a form readable by a device (for example, a computer). It may be a read-only memory, a magnetic disk, an optical disk, or the like.
Furthermore, an embodiment of the present application also provides a computer device, the computer device including:
One or more processors;
A storage device for storing one or more programs,
Wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the steps of the voice password verification method described in any one of the above.
Figure 7 is a block diagram of a computer device 700 according to an exemplary embodiment. For example, the computer device 700 may be provided as a server. Referring to Figure 7, the computer device 700 includes a processing component 722, which further includes one or more processors, and memory resources represented by a memory 732 for storing instructions executable by the processing component 722, such as application programs. An application program stored in the memory 732 may include one or more modules, each corresponding to a set of instructions. In addition, the processing component 722 is configured to execute the instructions so as to perform the steps of the voice password verification method described above.
The computer device 700 may also include a power supply component 726 configured to perform power management of the computer device 700, a wired or wireless network interface 750 configured to connect the computer device 700 to a network, and an input/output (I/O) interface 758. The computer device 700 can operate based on an operating system stored in the memory 732, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, or the like.
It should be understood that although the steps in the flowcharts of the drawings are shown in sequence as indicated by the arrows, these steps are not necessarily executed in that order. Unless explicitly stated herein, there is no strict restriction on the order of execution, and the steps may be executed in other orders. Moreover, at least some of the steps in the flowcharts may include multiple sub-steps or stages. These sub-steps or stages are not necessarily completed at the same moment but may be executed at different times, and they are not necessarily executed sequentially but may be executed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
It should be understood that the functional units in the embodiments of the present application may be integrated into one processing module, may each exist physically on their own, or two or more units may be integrated into one module. The integrated module may be implemented in the form of hardware or in the form of a software functional module.
The above are only some of the embodiments of this application. It should be noted that those of ordinary skill in the art can make several improvements and refinements without departing from the principles of this application, and these improvements and refinements should also be regarded as falling within the scope of protection of this application.

Claims (20)

  1. A voice password verification method, comprising:
    receiving voice information input by a user, and parsing the voice information to obtain voiceprint information of the user;
    inputting the voiceprint information into a pre-trained recognition model to obtain the similarity between the voiceprint information and preset identity information, wherein the recognition model is an association between voiceprint information and user identity formed from training samples that contain interference factors;
    scoring the voiceprint information according to the similarity to obtain a score value of the voiceprint information, and if the score value exceeds a preset threshold, the verification passes.
  2. The voice password verification method according to claim 1, further comprising, before the step of inputting the voiceprint information into a pre-trained recognition model:
    retrieving voice samples stored in a database, and adding interference factors to each of the voice samples to generate training samples, wherein the interference factors correspond to multiple different interference types;
    extracting feature information of the voiceprint information in the training samples, and establishing the association between voiceprint information and user identity according to the feature information.
  3. The voice password verification method according to claim 1, wherein the recognition model is a joint probability recognition model based on recognition features, and the formula of the joint probability recognition model is expressed as follows:
    Figure PCTCN2019103048-appb-100001
    wherein m_i denotes the voice sample vector of speaker s_i, i denotes the speaker number, μ is the global mean of the training data, y_si is the feature representation of m_i in the speaker space, V denotes the eigenvectors of the between-class space, x_i is an interference variable of size Rx, ε_i is a noise variable, j = 1, 2, …, N is a positive integer, the speaker variability is decomposed into components corresponding to N different interference types, U denotes the eigenvectors of the within-class space, and W_j denotes the feature parameters.
  4. The voice password verification method according to claim 1, further comprising, before the step of parsing the voice information to obtain voiceprint information of the user:
    receiving identity verification request information, and in response to the request information, retrieving a preset question from a database and sending it to the user;
    receiving voice information sent by the user in response to the preset question, and parsing the voice information to obtain the semantic information therein.
  5. The voice password verification method according to claim 4, wherein, if a plurality of preset questions are set in the database, the step of retrieving a preset question from the database and sending it to the user comprises:
    randomly selecting a preset question and sending the selected preset question to the user.
  6. The voice password verification method according to claim 4, further comprising, before the step of receiving the response information sent by the user for the preset question:
    if voice information consistent with a preset answer is not received within a preset time, retrieving prompt information associated with the preset question and sending the prompt information to the user, wherein the preset answer is the standard answer associated with the preset question.
  7. The voice password verification method according to claim 1, wherein the step of obtaining the score value of the voiceprint information comprises:
    retrieving the preset identity information, and comparing the feature likelihood between the voiceprint information and the preset identity information;
    scoring the voiceprint information according to the obtained feature likelihood to obtain the score value of the voiceprint information.
  8. A voice password verification apparatus, comprising:
    a parsing module, configured to receive voice information input by a user and parse the voice information to obtain voiceprint information of the user;
    a similarity obtaining module, configured to input the voiceprint information into a pre-trained recognition model to obtain the similarity between the voiceprint information and preset identity information, wherein the recognition model is an association between voiceprint information and user identity formed from training samples that contain interference factors;
    a verification module, configured to score the voiceprint information according to the similarity to obtain a score value of the voiceprint information, and if the score value exceeds a preset threshold, the verification passes.
  9. A non-volatile computer-readable storage medium for storing computer instructions which, when run on a computer, cause the computer to perform the steps of a voice password verification method, wherein the steps of the voice password verification method comprise:
    receiving voice information input by a user, and parsing the voice information to obtain voiceprint information of the user;
    inputting the voiceprint information into a pre-trained recognition model to obtain the similarity between the voiceprint information and preset identity information, wherein the recognition model is an association between voiceprint information and user identity formed from training samples that contain interference factors;
    scoring the voiceprint information according to the similarity to obtain a score value of the voiceprint information, and if the score value exceeds a preset threshold, the verification passes.
  10. The non-volatile computer-readable storage medium according to claim 9, wherein the steps further comprise, before the step of inputting the voiceprint information into a pre-trained recognition model:
    retrieving voice samples stored in a database, and adding interference factors to each of the voice samples to generate training samples, wherein the interference factors correspond to multiple different interference types;
    extracting feature information of the voiceprint information in the training samples, and establishing the association between voiceprint information and user identity according to the feature information.
  11. The non-volatile computer-readable storage medium according to claim 9, wherein the recognition model is a joint probability recognition model based on recognition features, and the formula of the joint probability recognition model is expressed as follows:
    Figure PCTCN2019103048-appb-100002
    wherein m_i denotes the voice sample vector of speaker s_i, i denotes the speaker number, μ is the global mean of the training data, y_si is the feature representation of m_i in the speaker space, V denotes the eigenvectors of the between-class space, x_i is an interference variable of size Rx, ε_i is a noise variable, j = 1, 2, …, N is a positive integer, the speaker variability is decomposed into components corresponding to N different interference types, U denotes the eigenvectors of the within-class space, and W_j denotes the feature parameters.
  12. The non-volatile computer-readable storage medium according to claim 9, wherein the steps further comprise, before the step of parsing the voice information to obtain voiceprint information of the user:
    receiving identity verification request information, and in response to the request information, retrieving a preset question from a database and sending it to the user;
    receiving voice information sent by the user in response to the preset question, and parsing the voice information to obtain the semantic information therein.
  13. The non-volatile computer-readable storage medium according to claim 12, wherein, if a plurality of preset questions are set in the database, the step of retrieving a preset question from the database and sending it to the user comprises:
    randomly selecting a preset question and sending the selected preset question to the user.
  14. The non-volatile computer-readable storage medium according to claim 12, wherein the steps further comprise, before the step of receiving the response information sent by the user for the preset question:
    if voice information consistent with a preset answer is not received within a preset time, retrieving prompt information associated with the preset question and sending the prompt information to the user, wherein the preset answer is the standard answer associated with the preset question.
  15. A computer device, the computer device comprising:
    one or more processors;
    a storage device for storing one or more programs,
    wherein, when the one or more programs are executed by the one or more processors, the one or more processors implement the steps of a voice password verification method, wherein the steps of the voice password verification method comprise:
    receiving voice information input by a user, and parsing the voice information to obtain voiceprint information of the user;
    inputting the voiceprint information into a pre-trained recognition model to obtain the similarity between the voiceprint information and preset identity information, wherein the recognition model is an association between voiceprint information and user identity formed from training samples that contain interference factors;
    scoring the voiceprint information according to the similarity to obtain a score value of the voiceprint information, and if the score value exceeds a preset threshold, the verification passes.
  16. The computer device according to claim 15, wherein the steps further comprise, before the step of inputting the voiceprint information into a pre-trained recognition model:
    retrieving voice samples stored in a database, and adding interference factors to each of the voice samples to generate training samples, wherein the interference factors correspond to multiple different interference types;
    extracting feature information of the voiceprint information in the training samples, and establishing the association between voiceprint information and user identity according to the feature information.
  17. The computer device according to claim 15, wherein the recognition model is a joint probability recognition model based on recognition features, and the formula of the joint probability recognition model is expressed as follows:
    Figure PCTCN2019103048-appb-100003
    wherein m_i denotes the voice sample vector of speaker s_i, i denotes the speaker number, μ is the global mean of the training data, y_si is the feature representation of m_i in the speaker space, V denotes the eigenvectors of the between-class space, x_i is an interference variable of size Rx, ε_i is a noise variable, j = 1, 2, …, N is a positive integer, the speaker variability is decomposed into components corresponding to N different interference types, U denotes the eigenvectors of the within-class space, and W_j denotes the feature parameters.
  18. The computer device according to claim 15, wherein the steps further comprise, before the step of parsing the voice information to obtain voiceprint information of the user:
    receiving identity verification request information, and in response to the request information, retrieving a preset question from a database and sending it to the user;
    receiving voice information sent by the user in response to the preset question, and parsing the voice information to obtain the semantic information therein.
  19. The computer device according to claim 18, wherein, if a plurality of preset questions are set in the database, the step of retrieving a preset question from the database and sending it to the user comprises:
    randomly selecting a preset question and sending the selected preset question to the user.
  20. The computer device according to claim 18, wherein the steps further comprise, before the step of receiving the response information sent by the user for the preset question:
    if voice information consistent with a preset answer is not received within a preset time, retrieving prompt information associated with the preset question and sending the prompt information to the user, wherein the preset answer is the standard answer associated with the preset question.
PCT/CN2019/103048 2019-04-04 2019-08-28 Voice password verification method and apparatus, storage medium, and computer device WO2020199473A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201910270003.XA CN109994118B (en) 2019-04-04 2019-04-04 Voice password verification method and device, storage medium and computer equipment
CN201910270003.X 2019-04-04

Publications (1)

Publication Number Publication Date
WO2020199473A1 true WO2020199473A1 (en) 2020-10-08

Family

ID=67132399

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/103048 WO2020199473A1 (en) 2019-04-04 2019-08-28 Voice password verification method and apparatus, storage medium, and computer device

Country Status (2)

Country Link
CN (1) CN109994118B (en)
WO (1) WO2020199473A1 (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109994118B (en) * 2019-04-04 2022-10-11 平安科技(深圳)有限公司 Voice password verification method and device, storage medium and computer equipment
CN111784899A (en) * 2020-06-17 2020-10-16 深圳南亿科技股份有限公司 Building intercom system and access control method thereof
CN111816191A (en) * 2020-07-08 2020-10-23 珠海格力电器股份有限公司 Voice processing method, device, system and storage medium
CN112565242B (en) * 2020-12-02 2023-04-07 携程计算机技术(上海)有限公司 Remote authorization method, system, equipment and storage medium based on voiceprint recognition
CN113593581B (en) * 2021-07-12 2024-04-19 西安讯飞超脑信息科技有限公司 Voiceprint discrimination method, voiceprint discrimination device, computer device and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20120330663A1 (en) * 2011-06-27 2012-12-27 Hon Hai Precision Industry Co., Ltd. Identity authentication system and method
CN108766444A (en) * 2018-04-09 2018-11-06 平安科技(深圳)有限公司 User ID authentication method, server and storage medium
CN108989349A (en) * 2018-08-31 2018-12-11 平安科技(深圳)有限公司 User account number unlocking method, device, computer equipment and storage medium
CN109243465A (en) * 2018-12-06 2019-01-18 平安科技(深圳)有限公司 Voiceprint authentication method, device, computer equipment and storage medium
CN109994118A (en) * 2019-04-04 2019-07-09 平安科技(深圳)有限公司 Speech cipher verification method, device, storage medium and computer equipment

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CA3036561C (en) * 2016-09-19 2021-06-29 Pindrop Security, Inc. Channel-compensated low-level features for speaker recognition
CN107274906A (en) * 2017-06-28 2017-10-20 百度在线网络技术(北京)有限公司 Voice information processing method, device, terminal and storage medium
CN108417216B (en) * 2018-03-15 2021-01-08 深圳市声扬科技有限公司 Voice verification method and device, computer equipment and storage medium
CN108768654B (en) * 2018-04-09 2020-04-21 平安科技(深圳)有限公司 Identity verification method based on voiceprint recognition, server and storage medium
CN108806695A (en) * 2018-04-17 2018-11-13 平安科技(深圳)有限公司 Anti- fraud method, apparatus, computer equipment and the storage medium of self refresh

Also Published As

Publication number Publication date
CN109994118A (en) 2019-07-09
CN109994118B (en) 2022-10-11

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19922462

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 19922462

Country of ref document: EP

Kind code of ref document: A1