WO2017197953A1 - Voiceprint-based identity recognition method and device - Google Patents

Voiceprint-based identity recognition method and device

Info

Publication number
WO2017197953A1
Authority
WO
WIPO (PCT)
Prior art keywords
user account
target
voice data
voiceprint
voiceprint feature
Prior art date
Application number
PCT/CN2017/075346
Other languages
French (fr)
Chinese (zh)
Inventor
彭丹丹
Original Assignee
腾讯科技(深圳)有限公司
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 腾讯科技(深圳)有限公司
Publication of WO2017197953A1

Classifications

    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/02: Feature extraction for speech recognition; Selection of recognition unit
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3226: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using a predetermined code, e.g. password, passphrase or PIN
    • H04L9/3231: Biological data, e.g. fingerprint, voice or retina
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00: Speech recognition
    • G10L15/06: Creation of reference templates; Training of speech recognition systems, e.g. adaptation to the characteristics of the speaker's voice
    • G: PHYSICS
    • G10: MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L: SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00: Speaker identification or verification techniques
    • G10L17/02: Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • H: ELECTRICITY
    • H04: ELECTRIC COMMUNICATION TECHNIQUE
    • H04L: TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32: Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials

Definitions

  • The present invention relates to the field of computer technologies, and in particular to a voiceprint-based identification method and apparatus.
  • Voiceprint recognition, that is, speaker recognition, is a technology that extracts from the voice signal features representing the speaker's identity, such as fundamental frequency features reflecting the opening and closing frequency of the glottis and spectral features reflecting the size and shape of the oral cavity and the length of the vocal tract, and uses them to identify the speaker. It can be widely used in fields such as information security, telephone banking, smart access control, and entertainment value-added services.
  • The security provided by voiceprint recognition is comparable to that of other biometric technologies (fingerprint, palm shape, and iris), and it requires only a telephone or a microphone. No special equipment is needed, data collection is extremely convenient, and the cost is low, making it the most economical, reliable, simple and secure means of identification. At any time, the speaker can be securely identified from voice input alone, relying on the unique voiceprint.
  • Voiceprint recognition performs particularly well over the telephone channel and is the only non-contact biometric technology that can be used for remote control.
  • An embodiment of the present invention proposes a voiceprint-based identification method.
  • A voiceprint-based identification method, comprising: collecting voice data transmitted by a user account as the sender in an instant messaging application; performing voiceprint recognition model training according to the collected voice data, and creating a voiceprint feature library corresponding to the user account; receiving an initiated identity verification request, and obtaining the input target user account and target voice data; and searching for the voiceprint feature library that matches the target user account, and determining that the identity verification of the target user account passes when the target voice data matches the found voiceprint feature library.
  • An embodiment of the present invention also proposes a voiceprint-based identification device.
  • A voiceprint-based identification device, comprising:
  • a voice data collection module, configured to collect voice data transmitted by a user account as the sender in an instant messaging application;
  • a voiceprint feature library creation module, configured to perform voiceprint recognition model training according to the collected voice data, and create a voiceprint feature library corresponding to the user account;
  • a target information acquisition module, configured to receive an initiated identity verification request, and obtain the input target user account and target voice data;
  • a voiceprint comparison module, configured to search for the voiceprint feature library that matches the target user account, and determine that the identity verification of the target user account passes when the target voice data matches the found voiceprint feature library.
  • With the voiceprint-based identification method and device described above, the user does not need to read a large amount of training text aloud to record voiceprint features and build a voiceprint feature library. Instead, the terminal or server can collect the voice data in the instant messaging messages sent by the user and use it as training samples for the user's voiceprint features, which saves the user the time of recording voiceprint features and makes operation more convenient.
  • FIG. 1 is a schematic flow chart of a voiceprint based identification method in an embodiment
  • FIG. 2 is a schematic diagram of an instant messaging application interface for transmitting a voice segment in an embodiment
  • FIG. 3 is a schematic diagram of an interface for providing random code reading verification in one embodiment
  • FIG. 4 is a schematic structural diagram of a voiceprint based identification device in an embodiment
  • FIG. 5 is a schematic structural diagram of a computer device that runs the aforementioned voiceprint-based identification method in one embodiment.
  • An embodiment of the present invention provides a voiceprint-based identification method.
  • The method may be implemented by a computer program running on a computer system based on the von Neumann architecture. The computer program may be a client program or a server program of an instant messaging application, or of a social networking application with instant messaging functionality.
  • The computer system may be a terminal device running the client program of the instant messaging application or of the social networking application with instant messaging functionality, or a server device running the corresponding server program.
  • the voiceprint-based identification method includes:
  • Step S102: Collect voice data transmitted by a user account as the sender in an instant messaging application.
  • Instant messaging applications such as WeChat and QQ provide a voice clip function.
  • The user can record a segment of voice data through the microphone of a mobile phone by long-pressing a virtual button; when the virtual button is released, the voice data is sent to the receiving user.
  • To use the instant messaging application, the user must first log in to a user account.
  • The terminal collects only voice data sent by the logged-in user account, and does not collect voice data received by the logged-in user account.
  • When the instant messaging application captures voice data input by the user through the mobile phone microphone, the data is usually cached at a preset storage address. Once a complete voice input has been captured (that is, when the user releases the virtual button, one recording is complete and the corresponding voice data file is generated), the file is sent to the server or to another terminal.
  • When the terminal performs the voiceprint-based identification method, the voice data can be obtained from the cached storage address.
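
The sender-only collection rule in step S102 can be illustrated with a minimal sketch. The names `VoiceClip` and `SentClipCollector` are hypothetical and do not come from the patent; the sketch only assumes that each clip carries its sender account and one complete cached recording.

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class VoiceClip:
    sender: str    # account that sent the clip
    receiver: str  # account that received the clip
    audio: bytes   # one complete cached recording


class SentClipCollector:
    """Keeps only clips sent by the logged-in account (hypothetical helper)."""

    def __init__(self, logged_in_account: str):
        self.logged_in_account = logged_in_account
        self.samples = defaultdict(list)  # account -> list of audio samples

    def observe(self, clip: VoiceClip) -> bool:
        """Store the clip if the logged-in account sent it; report whether it was stored."""
        if clip.sender != self.logged_in_account:
            return False  # received clips are never used as training samples
        self.samples[clip.sender].append(clip.audio)
        return True


collector = SentClipCollector("account_a")
collector.observe(VoiceClip("account_a", "account_b", b"\x01\x02"))  # stored
collector.observe(VoiceClip("account_b", "account_a", b"\x03\x04"))  # ignored
```

Keying the samples by sender account keeps the later training step simple, since every stored sample is already associated with the account whose voiceprint it represents.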
  • Step S104: Perform voiceprint recognition model training according to the collected voice data, and create a voiceprint feature library corresponding to the user account (that is, a database including one or more voiceprint features).
  • Many algorithms can be used for voiceprint recognition modeling, such as Dynamic Time Warping (DTW), Artificial Neural Network (ANN), Hidden Markov Model (HMM), and Gaussian Mixture Model (GMM). Because a GMM fits the distribution of acoustic speech features well, the GMM method has become the mainstream method in speech recognition systems. This description uses the GMM as an example modeling method, with the aim of improving recognition accuracy and efficiency.
  • The input voice data sequence (a Pulse Code Modulation (PCM) code stream) may be preprocessed to remove non-speech and silent segments, and the voice signal is divided into frames for subsequent processing. The Mel-Frequency Cepstral Coefficients (MFCC) of each frame are extracted and saved, and the extracted MFCC parameters are used to train a GMM for the user (that is, the speaker), yielding a GMM voiceprint model specific to that user.
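
The preprocessing stage described above (framing and removal of silent segments) might look like the following sketch. The frame length, hop size, and energy threshold are illustrative assumptions, not values from the patent, and the MFCC extraction and GMM training that would follow are omitted.

```python
import math


def split_into_frames(samples, frame_len=400, hop=160):
    """Cut a sample sequence into overlapping fixed-length frames."""
    return [samples[start:start + frame_len]
            for start in range(0, len(samples) - frame_len + 1, hop)]


def mean_energy(frame):
    """Average squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def drop_silent_frames(frames, min_energy=1e-4):
    """Keep only frames whose energy reaches the (assumed) silence threshold."""
    return [f for f in frames if mean_energy(f) >= min_energy]


# 800 silent samples followed by 800 samples of a 440 Hz tone at 16 kHz.
signal = [0.0] * 800 + [0.5 * math.sin(2 * math.pi * 440 * n / 16000)
                        for n in range(800)]
frames = split_into_frames(signal)
voiced = drop_silent_frames(frames)
```

A simple energy gate is only one way to discard silence; a production system would more likely use a dedicated voice activity detector.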
  • Because users frequently use instant messaging applications such as WeChat and QQ, they also send voice clips frequently. Step S102 is therefore performed many times, collecting multiple voice clips (multiple pieces of voice data) corresponding to the same logged-in user account.
  • The collected pieces of voice data can be used as samples and input into the voiceprint recognition model for machine learning.
  • Feature values of each collected piece of voice data, such as spectrum, cepstrum, formants, pitch, reflection coefficients, rhythm, speed, intonation, and volume, can be extracted and used to train an existing voiceprint recognition model, yielding the voiceprint feature library corresponding to the logged-in user account.
  • Step S106: Receive an initiated identity verification request, and obtain the input target user account and target voice data.
  • Step S108: Search for the voiceprint feature library that matches the target user account, and determine that the identity verification of the target user account passes when the target voice data matches the found voiceprint feature library.
  • After the voiceprint feature library is created, user identity verification can be performed with it (when too little voice data has been collected, or the voiceprint feature library has not yet been created, the user can be prompted to use another verification method).
  • When logging in on the terminal, the user can select voiceprint verification, enter the target user account, and input a segment of speech (the target voice data) through the microphone.
  • The terminal may first search for the voiceprint feature library corresponding to the input target user account, and then match the target voice data against that library. If the match succeeds, the identity verification of the target user account is determined to pass.
  • A matching function between the input speech and the GMM voiceprint model (configured as needed) can be provided to determine whether the input target voice data matches the voiceprint (that is, the model). Specifically, the matching process can be implemented using the maximum a posteriori (MAP) probability criterion.
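
As a hedged illustration of such a matching function, the sketch below scores feature frames against a diagonal-covariance GMM and accepts when the average log-likelihood reaches a threshold. This is an assumption about one possible realization: the patent names the MAP criterion but gives no formula, and with a single claimed account and no competing models a likelihood threshold stands in for the posterior comparison. The function names and the threshold value are hypothetical.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of x under a Gaussian with diagonal covariance."""
    return sum(-0.5 * (math.log(2 * math.pi * v) + (xi - m) ** 2 / v)
               for xi, m, v in zip(x, mean, var))


def gmm_log_likelihood(x, weights, means, variances):
    """Log of the weighted sum of component densities (log-sum-exp for stability)."""
    logs = [math.log(w) + log_gaussian_diag(x, m, v)
            for w, m, v in zip(weights, means, variances)]
    peak = max(logs)
    return peak + math.log(sum(math.exp(value - peak) for value in logs))


def average_score(frames, weights, means, variances):
    """Mean per-frame log-likelihood of an utterance under the model."""
    return sum(gmm_log_likelihood(f, weights, means, variances)
               for f in frames) / len(frames)


def voice_matches_model(frames, weights, means, variances, threshold=-5.0):
    """Accept when the average score reaches the (assumed) threshold."""
    return average_score(frames, weights, means, variances) >= threshold
```

In practice the threshold would be tuned on held-out data, and scores are often normalized against a background model; both are outside what the source describes.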
  • Voice clips transmitted between terminals must be forwarded through the server; audio data cannot be transmitted directly between terminals.
  • When forwarding voice data between terminals, the server may collect the voice data sent by the terminal on which the sender's user account is logged in, and establish a mapping between the collected voice data and the sender's user account.
  • For example, the server can collect the voice data sent by user account A and generate a voiceprint feature library corresponding to user account A. When the user logs in to user account A on another terminal, the user inputs target voice data through that terminal, and the terminal uploads it to the server.
  • The server searches for the voiceprint feature library corresponding to user account A and determines whether the uploaded target voice data matches it; if so, user account A completes its login on the server.
  • The voiceprint-based identification method is not limited to logging in to a user account; it may also be used for password recovery or an appeal for a user account.
  • For example, the user accounts of the instant messaging applications QQ and WeChat are linked to each other.
  • When recovering a QQ password, the user can choose to verify identity through the linked WeChat account.
  • The server can look up the WeChat account linked to the QQ number whose password is being recovered, search for the voiceprint feature library corresponding to that WeChat account, and receive the target voice data that the user inputs through the microphone for identity verification. If the match succeeds, the server determines that verification passes and prompts the user to reset the QQ password, or sends the password to a pre-bound mailbox.
  • The server may also generate target text content, display it to the user on the terminal, and prompt the user to read it aloud. The target voice data then received is the speech input while the user reads the displayed target text content.
  • When determining whether the identity verification of the target user account passes, the target voice data may also be converted into text data by speech recognition; when the text data matches the target text content, the identity verification of the target user account is determined to pass.
  • For example, when the user performs identity verification, the terminal displays a string of text content, "85274196", generated by the terminal or the server, and prompts the user to read the digits aloud.
  • The target voice data produced when the user reads the digits is uploaded to the server.
  • The server not only extracts feature vectors of the target voice data (spectrum, cepstrum, formants, pitch, reflection coefficients, rhythm, speed, tone, or volume), but also performs speech recognition on it to identify its semantic content.
  • When both the voiceprint and the recognized content match, the user's identity verification passes.
  • Combining voiceprint verification with semantic verification prevents criminals from authenticating with recordings of other users. For example, if only the voiceprint were used for identity verification, user B, holding a recording of user A, could attempt to log in with user A's account and input the target voice data by playing the recording, thereby passing authentication and logging in to the system as user A to steal user A's private information.
  • When voiceprint verification and semantic verification are combined, even if user B holds a recording of user A, the text that user B is prompted to read can be randomly generated. By playing the recording, user B can pass only the voiceprint verification, not the semantic verification, which improves the security of authentication.
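
The combination of a random prompt with the two checks can be sketched as follows. The function names are hypothetical, and the voiceprint result and the transcript are assumed to come from a separate voiceprint matcher and speech recognizer that are not shown here.

```python
import secrets


def generate_prompt(length=8):
    """Random digit string for the user to read aloud, e.g. '85274196'."""
    return "".join(secrets.choice("0123456789") for _ in range(length))


def digits_only(text):
    """Strip spaces and punctuation that a speech recognizer might insert."""
    return "".join(ch for ch in text if ch.isdigit())


def verify_identity(voiceprint_ok, transcript, prompt):
    """Pass only if the voiceprint matched AND the spoken digits equal the prompt."""
    return voiceprint_ok and digits_only(transcript) == prompt


prompt = generate_prompt()
```

Because the prompt changes on every attempt, a replayed recording can satisfy the voiceprint check but will almost never contain the requested digits.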
  • After the voiceprint feature library corresponding to the user account is created, it may also be determined whether the confidence of the created library is greater than or equal to a threshold; if so, collection of voice data transmitted by the user account as the sender in the instant messaging application stops.
  • For example, suppose the server has collected 100 samples of voice data and generated a voiceprint feature library.
  • The 101st voice data sample can then be matched against the created voiceprint feature library, and the probability of a successful match is taken as the confidence of the library. A high confidence means that the library can already identify the voiceprint accurately, so collection of sample voice data can stop, saving computer resources.
  • Obtaining the input target user account and target voice data may include receiving input target voice data at least once. Before determining that the identity verification of the target user account passes, the method further includes: determining the number or proportion of the received target voice data inputs that match the found voiceprint feature library, and, when that number or proportion is greater than or equal to a threshold, determining that the target voice data matches the found voiceprint feature library.
  • That is, authentication can be performed over multiple matching attempts.
  • When most, or a large proportion, of the target voice data inputs pass verification, the identity verification is determined to pass, which improves the accuracy of identity verification.
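
A minimal sketch of the count-or-proportion decision over several inputs is given below. The function name and parameters are hypothetical, and each per-input result is assumed to come from a voiceprint matcher that is not shown.

```python
def inputs_match_library(results, min_count=None, min_ratio=None):
    """Decide from per-input match results (booleans) whether the voice matches.

    At least one of min_count / min_ratio must be given; both are applied if both are set.
    """
    if min_count is None and min_ratio is None:
        raise ValueError("specify min_count, min_ratio, or both")
    if not results:
        return False  # no input received, nothing to accept
    matched = sum(1 for ok in results if ok)
    if min_count is not None and matched < min_count:
        return False
    if min_ratio is not None and matched / len(results) < min_ratio:
        return False
    return True
```

For instance, `inputs_match_library([True, True, False], min_count=2)` accepts, while `inputs_match_library([True, False, False], min_ratio=0.5)` rejects.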
  • When the target voice data does not match the found voiceprint feature library, the target user account may be locked.
  • For example, the account may be locked when the number of consecutive mismatches between the target voice data and the found voiceprint feature library is greater than or equal to a threshold.
  • The account that the user is attempting to log in to can then be locked so that further login attempts are refused until the user unlocks it through another authentication method.
  • Alternatively, the target user account can be locked for a certain period of time. When the lock period expires, the account is unlocked and login is allowed again. This prevents criminals from passing verification by repeatedly attempting to imitate the voice, which improves security.
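
The timed lock after consecutive mismatches might be sketched like this. The class name, the failure limit, and the lock duration are illustrative assumptions; the clock is injectable so the behavior can be exercised without waiting.

```python
import time


class AccountLock:
    """Locks an account for a fixed period after too many consecutive mismatches."""

    def __init__(self, max_failures=3, lock_seconds=600.0, clock=time.monotonic):
        self.max_failures = max_failures
        self.lock_seconds = lock_seconds
        self.clock = clock
        self.failures = {}      # account -> consecutive mismatch count
        self.locked_until = {}  # account -> time at which the lock expires

    def is_locked(self, account):
        until = self.locked_until.get(account)
        if until is None:
            return False
        if self.clock() >= until:
            # Lock period expired: unlock and reset the counter.
            del self.locked_until[account]
            self.failures[account] = 0
            return False
        return True

    def record_attempt(self, account, matched):
        """Update the counter after one verification attempt."""
        if matched:
            self.failures[account] = 0
            return
        self.failures[account] = self.failures.get(account, 0) + 1
        if self.failures[account] >= self.max_failures:
            self.locked_until[account] = self.clock() + self.lock_seconds
```

A caller would check `is_locked(account)` before running voiceprint matching and call `record_attempt` afterwards with the outcome.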
  • A voiceprint-based identification device is also provided. As shown in FIG. 4, the device includes a voice data collection module 102, a voiceprint feature library creation module 104, a target information acquisition module 106, and a voiceprint comparison module 108, wherein:
  • the voice data collection module 102 is configured to collect voice data transmitted by a user account as the sender in an instant messaging application;
  • the voiceprint feature library creation module 104 is configured to perform voiceprint recognition model training according to the collected voice data, and create a voiceprint feature library corresponding to the user account;
  • the target information acquisition module 106 is configured to receive an initiated identity verification request, and obtain the input target user account and target voice data;
  • the voiceprint comparison module 108 is configured to search for the voiceprint feature library that matches the target user account, and determine that the identity verification of the target user account passes when the target voice data matches the found voiceprint feature library.
  • The target information acquisition module 106 is further configured to generate and display target text content, obtain the input target user account, and receive the target voice data input in correspondence with the displayed target text content.
  • The voiceprint comparison module 108 is further configured to convert the target voice data into text data by speech recognition, and determine that the identity verification of the target user account passes when the text data matches the target text content.
  • The device further includes a voice data collection stop module 110, configured to determine whether the confidence of the created voiceprint feature library corresponding to the user account is greater than or equal to a threshold, and if so, stop collecting voice data transmitted by the user account as the sender in the instant messaging application.
  • The target information acquisition module 106 is further configured to receive input target voice data at least once; the voiceprint comparison module 108 is further configured to determine the number or proportion of the received target voice data inputs that match the found voiceprint feature library, and, when that number or proportion is greater than or equal to a threshold, determine that the target voice data matches the found voiceprint feature library.
  • The device further includes a target user account locking module 112, configured to lock the target user account when the target voice data does not match the found voiceprint feature library.
  • With the voiceprint-based identification method and device described above, the user does not need to read a large amount of training text aloud to record voiceprint features and build a voiceprint feature library. Instead, the terminal or server can collect the voice data in the instant messaging messages sent by the user and use it as training samples for the user's voiceprint features, which saves the user the time of recording voiceprint features and makes operation more convenient.
  • FIG. 5 illustrates a terminal 10 of a von Neumann system-based computer system that operates the voiceprint-based identification method described above.
  • the computer system can be a terminal device such as a smart phone, a tablet computer, a palmtop computer, a notebook computer or a personal computer.
  • the terminal 10 may include an external input interface 1001, a processor 1002, a memory 1003, and an output interface 1004 connected through a system bus.
  • the external input interface 1001 can optionally include at least a network interface 10012.
  • the memory 1003 may include an external memory 10032 (eg, a hard disk, an optical disk, or a floppy disk, etc.) and an internal memory 10034.
  • the output interface 1004 can include at least a device such as a display 10042.
  • The processor 1002 (a Central Processing Unit, CPU) is the computing core and control core of the terminal 10; it can parse various instructions in the terminal 10 and process various data of the terminal 10.
  • The memory 1003 is the storage device of the terminal 10, used to store programs and data; it may include, but is not limited to, ROM, RAM, CD-ROM, and other removable memories.
  • The memory 1003 provides storage space that can hold the operating system of the terminal 10 as well as program code, function modules, and the like.
  • The operating system may include, but is not limited to, Windows, Android, and the like.
  • The method according to an embodiment of the present invention may run as a computer program whose program files are stored in the external memory 10032 of the von Neumann system-based computer system 10, loaded into the internal memory 10034 at runtime, and then compiled into machine code and passed to the processor 1002 for execution, thereby logically forming, within the computer system 10, the voice data collection module 102 and the other modules described above.
  • Input parameters are received through the external input interface 1001 and transferred to a buffer in the memory 1003, then input to the processor 1002 for processing. The processed result data is either cached in the memory 1003 for subsequent processing or passed to the output interface 1004 for output.
  • the storage medium mentioned in the text may be a magnetic disk, an optical disk, a read-only memory (ROM), or a random access memory (RAM).
  • the above computer readable storage medium may also be various types of recording media that the computer device can access through a network or a communication link, for example, a recording medium that can extract data therein through a router, the Internet, a local area network, or the like.
  • the computer readable storage medium described above may also be a plurality of computer readable storage media located in the same computer system, or a computer readable storage medium distributed across a plurality of computer systems or computing devices.

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Physics & Mathematics (AREA)
  • Acoustics & Sound (AREA)
  • Multimedia (AREA)
  • Computational Linguistics (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Signal Processing (AREA)
  • Artificial Intelligence (AREA)
  • Telephonic Communication Services (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Biodiversity & Conservation Biology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Telephone Function (AREA)

Abstract

An embodiment of the present invention discloses a voiceprint-based identity recognition method. The method comprises: acquiring voice data transmitted by a user account as a sender in an instant messaging application; training a voiceprint recognition model according to the acquired voice data, and creating a voiceprint feature library corresponding to the user account; receiving an initiated identity verification request, and acquiring an input target user account and target voice data; finding a voiceprint feature library matching the target user account, and, if the target voice data matches the found voiceprint feature library, confirming verification of the identity of the target user account. In addition, an embodiment of the present invention also correspondingly discloses a voiceprint-based identity recognition device. The present invention can improve operational convenience in the recording of sample voiceprints of users.

Description

Voiceprint-based identification method and device
Cross-reference to related applications
This application claims priority to Chinese invention patent application No. 201610321746.1, entitled "Voiceprint-based identification method and device", filed with the Chinese Patent Office on May 16, 2016, the entire contents of which are incorporated herein by reference.
Technical field
The present invention relates to the field of computer technologies, and in particular to a voiceprint-based identification method and device.
Background
Voiceprint recognition, that is, speaker recognition, is a technology that extracts from the voice signal features representing the speaker's identity, such as fundamental frequency features reflecting the opening and closing frequency of the glottis and spectral features reflecting the size and shape of the oral cavity and the length of the vocal tract, and uses them to identify the speaker. It can be widely used in fields such as information security, telephone banking, smart access control, and entertainment value-added services. The security provided by voiceprint recognition is comparable to that of other biometric technologies (fingerprint, palm shape, and iris), and it requires only a telephone or a microphone. No special equipment is needed, data collection is extremely convenient, and the cost is low, making it the most economical, reliable, simple and secure means of identification. At any time, the speaker can be securely identified from voice input alone, relying on the unique voiceprint. Voiceprint recognition performs particularly well over the telephone channel and is the only non-contact biometric technology that can be used for remote control.
However, to raise the confidence of the voiceprint features used as samples, that is, to improve the accuracy of voiceprint recognition, the user is usually required to read a large amount of text aloud when recording a sample voiceprint so that relatively complete voiceprint features can be extracted. Recording a sample voiceprint therefore takes the user a long time, which makes operation inconvenient.
Summary of the invention
In view of this, to solve the technical problem in conventional technology that extracting relatively complete voiceprint features requires the user to read a large amount of text aloud when recording a sample voiceprint, which makes operation inconvenient, an embodiment of the present invention proposes a voiceprint-based identification method.
A voiceprint-based identification method, comprising:
collecting voice data transmitted by a user account as the sender in an instant messaging application;
performing voiceprint recognition model training according to the collected voice data, and creating a voiceprint feature library corresponding to the user account;
receiving an initiated identity verification request, and obtaining the input target user account and target voice data; and
searching for the voiceprint feature library that matches the target user account, and determining that the identity verification of the target user account passes when the target voice data matches the found voiceprint feature library.
In addition, to solve the same technical problem in the conventional art, namely that the user must read a large amount of text when enrolling a sample voiceprint in order to extract relatively complete voiceprint features, which makes operation inconvenient, an embodiment of the present invention further provides a voiceprint-based identity recognition device.
A voiceprint-based identity recognition device, comprising:
a voice data collection module, configured to collect voice data transmitted by a user account acting as the sender in an instant messaging application;
a voiceprint feature library creation module, configured to train a voiceprint recognition model on the collected voice data and create a voiceprint feature library corresponding to the user account;
a target information acquisition module, configured to receive an initiated identity verification request and obtain an input target user account and target voice data; and
a voiceprint comparison module, configured to search for the voiceprint feature library matching the target user account and, when the target voice data matches the found voiceprint feature library, determine that identity verification of the target user account has passed.
With the above voiceprint-based identity recognition method and device, the user does not need to read a large amount of training text in advance to enroll voiceprint features and build a voiceprint feature library. Instead, the terminal or the server collects the voice data in the instant messages that the user sends in daily use and treats it as training samples of the user's voiceprint features, which saves the time the user would spend enrolling voiceprint features and makes operation more convenient.
Brief Description of the Drawings
To describe the technical solutions in the embodiments of the present invention or in the prior art more clearly, the drawings needed for describing the embodiments or the prior art are briefly introduced below. The drawings described below are merely some embodiments of the present invention, and a person of ordinary skill in the art may derive other drawings from them without creative effort.
FIG. 1 is a schematic flowchart of a voiceprint-based identity recognition method in one embodiment;
FIG. 2 is a schematic diagram of an instant messaging application interface for sending a voice clip in one embodiment;
FIG. 3 is a schematic diagram of an interface that provides verification by reading a random code aloud in one embodiment;
FIG. 4 is a schematic structural diagram of a voiceprint-based identity recognition device in one embodiment;
FIG. 5 is a schematic structural diagram of a computer device that runs the foregoing voiceprint-based identity recognition method in one embodiment.
Detailed Description
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.
To solve the technical problem in the conventional art that, in order to extract relatively complete voiceprint features, the user must read a large amount of text when enrolling a sample voiceprint, which makes operation inconvenient, an embodiment of the present invention provides a voiceprint-based identity recognition method. The method may be implemented by a computer program running on a computer system based on the von Neumann architecture. The computer program may be a client program or a server program of an instant messaging application or of a social networking application with instant messaging functionality. The computer system executing the computer program may be a terminal device running the client program of such an application, or a server device running the server program of such an application.
Specifically, as shown in FIG. 1, the voiceprint-based identity recognition method includes the following steps.
Step S102: Collect voice data transmitted by a user account acting as the sender in an instant messaging application.
Instant messaging applications such as WeChat and QQ provide a voice clip messaging function. As shown in FIG. 2, the user presses and holds a virtual button to record a clip of voice data through the phone's microphone. When the user releases the virtual button, the voice data is sent to the receiving user.
Before using the instant messaging application, the user must log in to a user account. In this embodiment, the terminal collects only the voice data sent by the logged-in user account and does not collect the voice data received by that account. When the instant messaging application records voice data that the user inputs through the phone's microphone, it usually caches the data at a preset storage address. Only after one complete voice input has been recorded (that is, when the user releases the virtual button, one recording is finished and the corresponding voice data file is generated) is the data sent to the server or to another terminal. When executing the voiceprint-based identity recognition method, the terminal can obtain the voice data from this cached storage address.
Step S104: Train a voiceprint recognition model on the collected voice data, and create a voiceprint feature library corresponding to the user account (that is, a database containing one or more voiceprint features). Many algorithms can be used for voiceprint recognition modeling, for example Dynamic Time Warping (DTW), Artificial Neural Networks (ANN), Hidden Markov Models (HMM), and Gaussian Mixture Models (GMM). Because a GMM fits the distribution of acoustic speech features well, the GMM approach has become the mainstream method in voice recognition systems. For the sake of recognition accuracy and efficiency, this document uses GMM as the example modeling method.
For example, in one specific implementation of voiceprint recognition model training, the input voice data sequence (a PCM (Pulse Code Modulation) stream) is preprocessed to remove non-speech and silent segments, and the speech signal is split into frames for subsequent processing. The Mel-Frequency Cepstral Coefficients (MFCC) of each speech frame are extracted and saved. The extracted MFCC parameters are then used to train a GMM for the user (that is, the speaker), yielding a GMM voiceprint model specific to that user.
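The preprocessing stage described above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the 400-sample frame length, the 160-sample hop, and the energy threshold are all assumptions chosen for the example (they correspond to 25 ms frames with a 10 ms hop at a 16 kHz sampling rate). MFCC extraction and GMM training would operate on the frames that survive this step and are left out here.

```python
def split_into_frames(samples, frame_len=400, hop=160):
    """Split a sequence of PCM samples into overlapping fixed-length frames.

    frame_len and hop are illustrative defaults (assumed, not from the source).
    """
    frames = []
    for start in range(0, len(samples) - frame_len + 1, hop):
        frames.append(samples[start:start + frame_len])
    return frames


def frame_energy(frame):
    """Mean squared amplitude of one frame."""
    return sum(s * s for s in frame) / len(frame)


def drop_silent_frames(frames, energy_threshold=1e-4):
    """Keep only frames whose energy reaches the (assumed) threshold."""
    return [f for f in frames if frame_energy(f) >= energy_threshold]


# Usage: 800 silent samples followed by 800 samples of constant amplitude.
samples = [0.0] * 800 + [0.5] * 800
frames = split_into_frames(samples)
voiced = drop_silent_frames(frames)
```

A simple energy gate is only one way to discard silence; a production system would more likely use a dedicated voice activity detector.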
Because users use instant messaging applications such as WeChat and QQ frequently in daily life and send voice clips often, repeated execution of step S102 collects multiple voice clips corresponding to the same logged-in user account. These collected clips can be fed as samples into the voiceprint recognition model for machine learning.
For example, for each collected voice clip, feature values can be extracted over feature dimensions such as spectrum, cepstrum, formants, pitch, reflection coefficients, prosody, rhythm, speed, intonation, and volume, and an existing voiceprint recognition model can then be trained on them to obtain the voiceprint feature library corresponding to the logged-in user account.
Step S106: Receive an initiated identity verification request, and obtain an input target user account and target voice data.
Step S108: Search for the voiceprint feature library matching the target user account, and determine that identity verification of the target user account has passed when the target voice data matches the found voiceprint feature library.
Once the voiceprint feature library has been created, it can be used for user identity verification. (If too little voice feature data has been collected, or the voiceprint feature library has not yet been created, the user can be prompted to switch to another verification method.) When logging in on a terminal, the user can choose voiceprint verification, enter the target user account, and speak a passage into the microphone (the target voice data). The terminal first looks up the voiceprint feature library corresponding to the entered target user account and then matches the target voice data against it. If the match succeeds, identity verification of the target user account is determined to have passed. Taking GMM as the example again, this step can provide a matching function between the input speech and the GMM voiceprint model (configured as needed) to decide whether the input target voice data matches the voiceprint (that is, the model). In a specific implementation, the matching process can be realized with the maximum a posteriori (MAP) criterion.
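As a rough illustration of such a matching function, the sketch below scores feature frames against a diagonal-covariance GMM and accepts the speaker when the average per-frame log-likelihood reaches a threshold. This is a simplification swapped in for the example: it is a plain likelihood threshold, not the MAP criterion the text mentions, and the function names, the model layout `(weights, means, variances)`, and the threshold value are all assumptions.

```python
import math


def log_gaussian_diag(x, mean, var):
    """Log density of a diagonal-covariance Gaussian at point x."""
    return -0.5 * sum(
        math.log(2 * math.pi * v) + (xi - m) ** 2 / v
        for xi, m, v in zip(x, mean, var)
    )


def gmm_log_likelihood(x, weights, means, variances):
    """Log-likelihood of x under the mixture, using log-sum-exp for stability."""
    terms = [
        math.log(w) + log_gaussian_diag(x, m, v)
        for w, m, v in zip(weights, means, variances)
    ]
    peak = max(terms)
    return peak + math.log(sum(math.exp(t - peak) for t in terms))


def average_score(frames, gmm):
    """Average per-frame log-likelihood of the feature frames under the GMM."""
    weights, means, variances = gmm
    total = sum(gmm_log_likelihood(f, weights, means, variances) for f in frames)
    return total / len(frames)


def voiceprint_matches(frames, gmm, threshold):
    """Accept when the average score reaches the (assumed) threshold."""
    return average_score(frames, gmm) >= threshold


# Usage with a toy two-component model over 2-dimensional features.
toy_gmm = ([0.5, 0.5], [[0.0, 0.0], [3.0, 3.0]], [[1.0, 1.0], [1.0, 1.0]])
near = [[0.1, -0.1], [2.9, 3.1]]    # frames close to the model's components
far = [[10.0, 10.0], [-8.0, 9.0]]   # frames far from both components
```

In practice the threshold would be tuned on held-out data, and scores are often normalized against a background model rather than compared to a fixed value.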
It should be noted that the above method may also be performed by an instant messaging application or by a social application with instant messaging functionality:
In an application scenario in which the server performs the method, voice clips sent between terminals must be forwarded by the server, and terminals cannot transmit audio data to each other directly. While forwarding voice data between terminals, the server can collect the voice data sent by the terminal on which the sending user account is logged in, and establish a mapping between the collected voice data and the sender's user account.
For example, all voice data that user account A sends to friends after logging in on a terminal must be forwarded by the server, so the server can collect the voice data sent by user account A and generate a voiceprint feature library corresponding to user account A. The user can then log in to the server with user account A on another terminal by recording target voice data on that terminal and uploading it to the server. The server looks up the voiceprint feature library corresponding to user account A and determines whether the uploaded target voice data matches it. If so, user account A completes its login on the server.
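The server-side collection described above can be sketched as a relay that stores each forwarded clip under the sender's account before delivering it. The class name `RelayServer`, its attributes, and the in-memory dictionaries are hypothetical and serve only to illustrate the mapping between voice data and sender accounts.

```python
from collections import defaultdict


class RelayServer:
    """Toy relay that forwards voice clips and keeps per-sender samples."""

    def __init__(self):
        # sender account -> list of voice clips that account has sent
        self.samples_by_sender = defaultdict(list)
        # receiver account -> list of (sender, clip) awaiting delivery
        self.inbox = defaultdict(list)

    def forward_voice(self, sender, receiver, voice_bytes):
        """Record the clip as a training sample for the sender, then forward it."""
        self.samples_by_sender[sender].append(voice_bytes)
        self.inbox[receiver].append((sender, voice_bytes))


# Usage: only clips that an account sends count as samples for that account.
server = RelayServer()
server.forward_voice("A", "B", b"\x01")
server.forward_voice("B", "A", b"\x02")
server.forward_voice("A", "C", b"\x03")
```

Keying the samples by sender rather than receiver mirrors the requirement that only voice data sent by an account is attributed to it.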
In addition, the voiceprint-based identity recognition method is not limited to user account login. It can also be used in scenarios such as password recovery or account appeals. For example, in one application scenario, the user accounts of the instant messaging applications QQ and WeChat are linked to each other. When using QQ's password recovery function, the user can choose verification through a linked account and select the WeChat account. The server then looks up the WeChat account corresponding to the QQ number whose password is to be recovered, looks up the voiceprint feature library corresponding to that WeChat account, and receives the target voice data that the user inputs through the microphone for identity verification. If the match succeeds, identity verification is determined to have passed, and the user is prompted to reset the QQ password, or the password is sent to a pre-bound mailbox.
Further, in one application scenario, after receiving the initiated identity verification request, the server may also generate target text content, display it to the user on the terminal, and prompt the user to read it aloud. The server then receives the target voice data input corresponding to the displayed target text content, that is, the target voice data recorded while the user reads the target text content shown on the terminal.
In this embodiment, when determining whether identity verification of the target user account passes, the target voice data may also be converted into text data by speech recognition. Identity verification of the target user account is determined to have passed when the text data matches the target text content.
As shown in FIG. 3, during identity verification the terminal also displays a string of text generated by the terminal or the server, "85274196", and prompts the user to read these digits aloud. The target voice data produced when the user reads the digits is uploaded to the server. The server not only extracts features of the target voice data such as spectrum, cepstrum, formants, pitch, reflection coefficients, prosody, rhythm, speed, intonation, and volume, but also performs speech recognition on the voice data to recognize its semantic content. The user passes identity verification only if the voiceprint matches and, in addition, the recognized semantic content is "85274196" or the recognized pinyin is the pinyin string of "85274196".
Verifying the user with this combination of voiceprint verification and semantic verification prevents attackers from passing verification with a recording of another user. For example, if only the voiceprint were used, user B, holding a recording of user A, could log in with user A's account, play the recording as the target voice data, pass verification, log in to the system as user A, and steal user A's private information. With combined voiceprint and semantic verification, even if user B holds a recording of user A, the text that user B is prompted to read can be randomly generated. Playing the recording may pass voiceprint verification but will not pass semantic verification, so the security of identity verification is improved.
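The combined check can be sketched as below. The sketch assumes that the voiceprint decision and the speech recognition transcript are produced elsewhere and passed in as arguments. The function names and the eight-digit challenge length are illustrative assumptions, and Python's standard `secrets` module is used for the random challenge.

```python
import secrets


def generate_challenge(num_digits=8):
    """Generate a random digit string for the user to read aloud."""
    return "".join(secrets.choice("0123456789") for _ in range(num_digits))


def normalize(text):
    """Keep only digit characters so spacing or punctuation does not matter."""
    return "".join(ch for ch in text if ch.isdigit())


def verify_identity(voiceprint_ok, recognized_text, challenge):
    """Pass only if the voiceprint matched AND the spoken content is the challenge."""
    return voiceprint_ok and normalize(recognized_text) == challenge


# Usage: a replayed recording may match the voiceprint but not the fresh challenge.
challenge = generate_challenge()
replay_passes = verify_identity(True, "12345678", "85274196")
genuine_passes = verify_identity(True, "8527 4196", "85274196")
```

Because the challenge changes on every request, a previously captured recording is very unlikely to contain the required digits in the required order.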
In this embodiment, to save computing resources, after the voiceprint feature library corresponding to the user account is created, it may further be determined whether the confidence of the created voiceprint feature library is greater than or equal to a threshold. If so, collection of the voice data transmitted by the user account acting as the sender in the instant messaging application is stopped.
For example, suppose the server has collected 100 voice samples and generated a voiceprint feature library. When the 101st voice sample is collected, it can be matched against the existing voiceprint feature library, and the probability of a successful match is taken as the confidence of the library. A high confidence means that the voiceprint feature library can already identify the voiceprint fairly accurately, so collection of voice samples can stop, saving computing resources.
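One way to read the confidence check is as the fraction of newly collected samples that match the existing library. The sketch below follows that reading. The function names, the 0.95 threshold, and the minimum number of trials are assumptions for illustration, since the source does not fix any of them.

```python
def library_confidence(match_results):
    """Fraction of newly collected samples that matched the existing library."""
    if not match_results:
        return 0.0
    return sum(match_results) / len(match_results)


def should_stop_collecting(match_results, threshold=0.95, min_trials=20):
    """Stop once enough trials exist and the match rate reaches the threshold."""
    enough_trials = len(match_results) >= min_trials
    return enough_trials and library_confidence(match_results) >= threshold


# Usage: each entry records whether one new sample matched the library.
early = [True] * 5                  # too few trials to decide
mature = [True] * 20                # consistently matching
noisy = [True] * 10 + [False] * 10  # library not yet reliable
```

Requiring a minimum number of trials avoids stopping collection after a single lucky match.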
In this embodiment, obtaining the input target user account and target voice data includes receiving input target voice data at least once. Before determining that identity verification of the target user account has passed, the method further includes: determining the number or proportion of matches between the target voice data received at least once and the found voiceprint feature library, and determining that the target voice data matches the found voiceprint feature library when the number or proportion of matches is greater than or equal to a threshold.
Because voiceprint feature matching may be inaccurate when there are few samples, identity verification can be based on multiple matching attempts. Verification is determined to have passed only when most, or a sufficiently large proportion, of the target voice inputs provided by the user during verification are matched, which improves the accuracy of identity verification.
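A minimal sketch of the proportion-based decision follows. The function name and the 0.6 ratio are assumptions, and each entry in the input list stands for the outcome of matching one target voice input against the library.

```python
def passes_by_ratio(attempt_results, min_ratio=0.6):
    """Pass when the share of matching attempts reaches the (assumed) ratio."""
    if not attempt_results:
        return False
    return sum(attempt_results) / len(attempt_results) >= min_ratio


# Usage: two of three attempts matching is enough at a 0.6 ratio.
mostly_matched = passes_by_ratio([True, True, False])
mostly_failed = passes_by_ratio([True, False, False])
```

A count-based variant would compare `sum(attempt_results)` to an integer threshold instead of a ratio.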
In one embodiment, after the voiceprint feature library matching the target user account is found, the target user account may be locked when the number of consecutive mismatches between the target voice data and the found voiceprint feature library is greater than or equal to a threshold.
In other words, if voice-based identity verification fails several times in a row, the account being logged in to can be locked so that further login attempts are refused and the user must unlock it through another verification method. Alternatively, the target user account can be locked for a fixed period and unlocked when the period expires, after which login is allowed again. This prevents attackers from passing verification through repeated attempts at imitating the voice and so improves security.
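The timed-lock variant can be sketched as a small state holder. The class name `AccountLock`, the limit of three consecutive failures, and the 600-second lock period are assumptions. The current time is passed in explicitly so that the behavior is easy to follow and test.

```python
class AccountLock:
    """Locks an account after consecutive failed voiceprint matches."""

    def __init__(self, max_failures=3, lock_seconds=600.0):
        self.max_failures = max_failures
        self.lock_seconds = lock_seconds
        self.failures = {}      # account -> current consecutive failure count
        self.locked_until = {}  # account -> time at which the lock expires

    def is_locked(self, account, now):
        """True while the account's lock period has not yet expired."""
        return now < self.locked_until.get(account, 0.0)

    def record_attempt(self, account, matched, now):
        """Record one verification attempt; return True only if login is allowed."""
        if self.is_locked(account, now):
            return False  # attempts during the lock period are refused outright
        if matched:
            self.failures[account] = 0  # a success resets the consecutive count
            return True
        count = self.failures.get(account, 0) + 1
        self.failures[account] = count
        if count >= self.max_failures:
            self.locked_until[account] = now + self.lock_seconds
            self.failures[account] = 0
        return False


# Usage: three consecutive failures at t = 0, 1, 2 lock account "A" until t = 602.
lock = AccountLock(max_failures=3, lock_seconds=600.0)
for t in (0.0, 1.0, 2.0):
    lock.record_attempt("A", False, t)
```

To require unlocking through another verification method instead of a timer, the expiry could be replaced by an explicit unlock call.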
In addition, to solve the technical problem in the conventional art that, in order to extract relatively complete voiceprint features, the user must read a large amount of text when enrolling a sample voiceprint, which makes operation inconvenient, one embodiment further provides a voiceprint-based identity recognition device. As shown in FIG. 4, the device includes a voice data collection module 102, a voiceprint feature library creation module 104, a target information acquisition module 106, and a voiceprint comparison module 108, wherein:
the voice data collection module 102 is configured to collect voice data transmitted by a user account acting as the sender in an instant messaging application;
the voiceprint feature library creation module 104 is configured to train a voiceprint recognition model on the collected voice data and create a voiceprint feature library corresponding to the user account;
the target information acquisition module 106 is configured to receive an initiated identity verification request and obtain an input target user account and target voice data; and
the voiceprint comparison module 108 is configured to search for the voiceprint feature library matching the target user account and, when the target voice data matches the found voiceprint feature library, determine that identity verification of the target user account has passed.
Optionally, in one embodiment, as shown in FIG. 4, the target information acquisition module 106 is further configured to generate and display target text content, obtain the input target user account, and receive target voice data input corresponding to the displayed target text content.
Optionally, in one embodiment, the voiceprint comparison module 108 is further configured to convert the target voice data into text data by speech recognition and, when the text data matches the target text content, determine that identity verification of the target user account has passed.
Optionally, in one embodiment, as shown in FIG. 4, the device further includes a voice data collection stop module 110, configured to determine whether the confidence of the created voiceprint feature library corresponding to the user account is greater than or equal to a threshold and, if so, stop collecting the voice data transmitted by the user account acting as the sender in the instant messaging application.
Optionally, in one embodiment, the target information acquisition module 106 is further configured to receive input target voice data at least once. The voiceprint comparison module 108 is further configured to determine the number or proportion of matches between the target voice data received at least once and the found voiceprint feature library and, when the number or proportion of matches is greater than or equal to a threshold, determine that the target voice data matches the found voiceprint feature library.
Optionally, in one embodiment, as shown in FIG. 4, the device further includes a target user account locking module 112, configured to lock the target user account when the target voice data does not match the found voiceprint feature library.
With the above voiceprint-based identity recognition method and device, the user does not need to read a large amount of training text in advance to enroll voiceprint features and build a voiceprint feature library. Instead, the terminal or the server collects the voice data in the instant messages that the user sends in daily use and treats it as training samples of the user's voiceprint features, which saves the time the user would spend enrolling voiceprint features and makes operation more convenient.
In one embodiment, FIG. 5 shows a terminal 10, a computer system based on the von Neumann architecture that runs the above voiceprint-based identity recognition method. The computer system may be a terminal device such as a smartphone, a tablet computer, a handheld computer, a notebook computer, or a personal computer. Specifically, the terminal 10 may include an external input interface 1001, a processor 1002, a memory 1003, and an output interface 1004 connected through a system bus. The external input interface 1001 may optionally include at least a network interface 10012. The memory 1003 may include an external memory 10032 (for example, a hard disk, an optical disc, or a floppy disk) and an internal memory 10034. The output interface 1004 may include at least a device such as a display screen 10042.
The processor 1002 (also called a CPU, Central Processing Unit) is the computing core and control core of the terminal 10. It can parse the various instructions within the terminal 10 and process the various types of data of the terminal 10. The memory 1003 is the storage device of the terminal 10 and is used to store programs and data. It may include, but is not limited to, ROM, RAM, CD-ROM, and other removable storage. The memory 1003 provides storage space that can hold the operating system of the terminal 10 as well as program code, functional modules, and so on. The operating system may include, but is not limited to, Windows and Android.
The method according to the embodiments of the present invention may run as a computer program. The program file of the computer program is stored in the external memory 10032 of the foregoing von Neumann architecture computer system 10, loaded into the internal memory 10034 at run time, compiled into machine code, and passed to the processor 1002 for execution. This forms, within the computer system 10, the logical voice data collection module 102, voiceprint feature library creation module 104, target information acquisition module 106, voiceprint comparison module 108, voice data collection stop module 110, and target user account locking module 112. During execution of the voiceprint-based identity recognition method, all input parameters are received through the external input interface 1001, passed to the memory 1003 for caching, and then fed to the processor 1002 for processing. The resulting data is either cached in the memory 1003 for subsequent processing or passed to the output interface 1004 for output.
A person of ordinary skill in the art will understand that all or part of the procedures of the above method embodiments can be completed by a computer program instructing the relevant hardware. The program may be stored in a computer-readable storage medium and, when run by a data processing device, causes the data processing device to perform the procedures or steps of the above method embodiments. For details, refer to the description of the embodiments above with reference to the drawings, which is not repeated here.
The storage medium mentioned herein may be a magnetic disk, an optical disc, a read-only memory (ROM), a random access memory (RAM), or the like.
In addition, the computer-readable storage medium may be any of various types of recording media that a computer device can access through a network or a communication link, for example a recording medium whose data can be retrieved through a router, the Internet, or a local area network. The computer-readable storage medium may also refer to multiple computer-readable storage media located in the same computer system, or to computer-readable storage media distributed across multiple computer systems or computing devices.
What is disclosed above is merely a preferred embodiment of the present invention and certainly cannot be used to limit the scope of the claims of the present invention. Equivalent changes made in accordance with the claims of the present invention therefore still fall within the scope of the present invention.

Claims (20)

  1. A voiceprint-based identity recognition method, characterized by comprising:
    collecting voice data transmitted by a user account acting as a sender in an instant messaging application;
    performing voiceprint recognition model training according to the collected voice data, and creating a voiceprint feature library corresponding to the user account;
    receiving an initiated identity verification request, and obtaining an input target user account and target voice data; and
    looking up the voiceprint feature library matching the target user account, and, when the target voice data matches the found voiceprint feature library, determining that identity verification of the target user account passes.
  2. The voiceprint-based identity recognition method according to claim 1, characterized in that,
    after receiving the initiated identity verification request, the method further comprises:
    generating and displaying target text content;
    and obtaining the input target user account and target voice data comprises:
    obtaining the input target user account, and receiving input of target voice data corresponding to the displayed target text content.
  3. The voiceprint-based identity recognition method according to claim 2, characterized in that determining that identity verification of the target user account passes further comprises:
    converting the target voice data into text data through speech recognition; and
    when the text data matches the target text content, determining that identity verification of the target user account passes.
  4. The voiceprint-based identity recognition method according to claim 1, characterized in that, after creating the voiceprint feature library corresponding to the user account, the method further comprises:
    determining whether a confidence level of the created voiceprint feature library corresponding to the user account is greater than or equal to a threshold, and if so, stopping collection of the voice data transmitted by the user account acting as a sender in the instant messaging application.
  5. The voiceprint-based identity recognition method according to claim 1, characterized in that
    obtaining the input target user account and target voice data comprises:
    receiving input target voice data at least once;
    and, before determining that identity verification of the target user account passes, the method further comprises:
    determining the number or proportion of matches between the target voice data received at least once and the found voiceprint feature library, and, when the number or proportion of matches is greater than or equal to a threshold, determining that the target voice data matches the found voiceprint feature library.
  6. The voiceprint-based identity recognition method according to claim 1, characterized in that, after looking up the voiceprint feature library matching the target user account, the method further comprises:
    locking the target user account when the target voice data does not match the found voiceprint feature library.
  7. The voiceprint-based identity recognition method according to claim 1, characterized by further comprising:
    when account verification is subsequently performed on the target user account, setting the verification mode to verification through a first user account, wherein the target user account and the first user account are associated accounts of the same user;
    looking up the first user account, and looking up a first voiceprint feature library corresponding to the first user account;
    receiving first voice data input by the user for identity verification; and
    matching the first voice data against the first voiceprint feature library, and, when the match succeeds, passing the account verification of the target user account.
  8. A voiceprint-based identity recognition device, characterized by comprising:
    a voice data collection module, configured to collect voice data transmitted by a user account acting as a sender in an instant messaging application;
    a voiceprint feature library creation module, configured to perform voiceprint recognition model training according to the collected voice data and create a voiceprint feature library corresponding to the user account;
    a target information acquisition module, configured to receive an initiated identity verification request and obtain an input target user account and target voice data; and
    a voiceprint comparison module, configured to look up the voiceprint feature library matching the target user account and, when the target voice data matches the found voiceprint feature library, determine that identity verification of the target user account passes.
  9. The voiceprint-based identity recognition device according to claim 8, characterized in that the target information acquisition module is further configured to generate and display target text content, obtain the input target user account, and receive input of target voice data corresponding to the displayed target text content.
  10. The voiceprint-based identity recognition device according to claim 9, characterized in that the voiceprint comparison module is further configured to convert the target voice data into text data through speech recognition and, when the text data matches the target text content, determine that identity verification of the target user account passes.
  11. The voiceprint-based identity recognition device according to claim 8, characterized in that the device further comprises a voice data collection stop module, configured to determine whether a confidence level of the created voiceprint feature library corresponding to the user account is greater than or equal to a threshold and, if so, stop collection of the voice data transmitted by the user account acting as a sender in the instant messaging application.
  12. The voiceprint-based identity recognition device according to claim 8, characterized in that the target information acquisition module is further configured to receive input target voice data at least once;
    and the voiceprint comparison module is further configured to determine the number or proportion of matches between the target voice data received at least once and the found voiceprint feature library and, when the number or proportion of matches is greater than or equal to a threshold, determine that the target voice data matches the found voiceprint feature library.
  13. The voiceprint-based identity recognition device according to claim 8, characterized in that the device further comprises a target user account locking module, configured to lock the target user account when the target voice data does not match the found voiceprint feature library.
  14. The voiceprint-based identity recognition device according to claim 8, characterized in that
    the target information acquisition module is further configured to, when account verification is subsequently performed on the target user account, set the verification mode to verification through a first user account, wherein the target user account and the first user account are associated accounts of the same user;
    the voiceprint comparison module is further configured to look up the first user account and look up a first voiceprint feature library corresponding to the first user account;
    the target information acquisition module is further configured to receive first voice data input by the user for identity verification; and
    the voiceprint comparison module is further configured to match the first voice data against the first voiceprint feature library and, when the match succeeds, pass the account verification of the target user account.
  15. A computer-readable storage medium, configured to store computer-readable instructions which, when executed on a data processing device, cause the data processing device to perform predetermined operations, the predetermined operations comprising:
    collecting voice data transmitted by a user account acting as a sender in an instant messaging application;
    performing voiceprint recognition model training according to the collected voice data, and creating a voiceprint feature library corresponding to the user account;
    receiving an initiated identity verification request, and obtaining an input target user account and target voice data; and
    looking up the voiceprint feature library matching the target user account, and, when the target voice data matches the found voiceprint feature library, determining that identity verification of the target user account passes.
  16. The computer-readable storage medium according to claim 15, characterized in that,
    after receiving the initiated identity verification request, the predetermined operations further comprise:
    generating and displaying target text content;
    and obtaining the input target user account and target voice data comprises:
    obtaining the input target user account, and receiving input of target voice data corresponding to the displayed target text content.
  17. The computer-readable storage medium according to claim 16, characterized in that determining that identity verification of the target user account passes further comprises:
    converting the target voice data into text data through speech recognition; and
    when the text data matches the target text content, determining that identity verification of the target user account passes.
  18. The computer-readable storage medium according to claim 15, characterized in that, after creating the voiceprint feature library corresponding to the user account, the predetermined operations further comprise:
    determining whether a confidence level of the created voiceprint feature library corresponding to the user account is greater than or equal to a threshold, and if so, stopping collection of the voice data transmitted by the user account acting as a sender in the instant messaging application.
  19. The computer-readable storage medium according to claim 15, characterized in that
    obtaining the input target user account and target voice data comprises:
    receiving input target voice data at least once;
    and, before determining that identity verification of the target user account passes, the predetermined operations further comprise:
    determining the number or proportion of matches between the target voice data received at least once and the found voiceprint feature library, and, when the number or proportion of matches is greater than or equal to a threshold, determining that the target voice data matches the found voiceprint feature library.
  20. The computer-readable storage medium according to claim 15, characterized in that, after looking up the voiceprint feature library matching the target user account, the predetermined operations further comprise:
    locking the target user account when the target voice data does not match the found voiceprint feature library.
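
The decision logic recited in claims 3 and 5 (and mirrored in claims 10, 12, 17 and 19) can be illustrated with a short Python sketch. It is an illustrative assumption only: the claims do not specify how similarity scores are produced or how text is compared, so the per-attempt `scores`, the default thresholds, and the `normalize` helper are all hypothetical, and `recognized_text` is assumed to come from some unspecified speech recognition step.

```python
def voiceprint_matches(scores, score_threshold, ratio_threshold):
    """Claim 5 style decision: the proportion of attempts whose similarity
    score reaches score_threshold must be at least ratio_threshold."""
    if not scores:
        return False
    hits = sum(1 for s in scores if s >= score_threshold)
    return hits / len(scores) >= ratio_threshold


def normalize(text):
    """Lowercase and drop non-alphanumeric characters before comparing
    (an assumed, deliberately simple notion of 'text matches')."""
    return "".join(ch for ch in text.lower() if ch.isalnum())


def identity_check(scores, recognized_text, prompt_text,
                   score_threshold=0.8, ratio_threshold=0.6):
    """Claim 3 style decision: pass only if the voiceprint matches and the
    recognized speech matches the displayed target text content."""
    voice_ok = voiceprint_matches(scores, score_threshold, ratio_threshold)
    text_ok = normalize(recognized_text) == normalize(prompt_text)
    return voice_ok and text_ok
```

Requiring the spoken text to match a freshly generated prompt is what lets such a check reject a replayed recording of the account owner, while the proportion threshold tolerates an occasional poor-quality attempt.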
PCT/CN2017/075346 2016-05-16 2017-03-01 Voiceprint-based identity recognition method and device WO2017197953A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201610321746.1 2016-05-16
CN201610321746.1A CN107395352B (en) 2016-05-16 2016-05-16 Personal identification method and device based on vocal print

Publications (1)

Publication Number Publication Date
WO2017197953A1 true WO2017197953A1 (en) 2017-11-23

Family

ID=60324810

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2017/075346 WO2017197953A1 (en) 2016-05-16 2017-03-01 Voiceprint-based identity recognition method and device

Country Status (2)

Country Link
CN (1) CN107395352B (en)
WO (1) WO2017197953A1 (en)

Cited By (26)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107918771A (en) * 2017-12-07 2018-04-17 河北工业大学 Character recognition method and Worn type person recognition system
CN108417216A (en) * 2018-03-15 2018-08-17 深圳市声扬科技有限公司 Speech verification method, apparatus, computer equipment and storage medium
CN109192213A (en) * 2018-08-21 2019-01-11 平安科技(深圳)有限公司 The real-time transfer method of court's trial voice, device, computer equipment and storage medium
CN109346083A (en) * 2018-11-28 2019-02-15 北京猎户星空科技有限公司 A kind of intelligent sound exchange method and device, relevant device and storage medium
CN109344587A (en) * 2018-08-09 2019-02-15 平安科技(深圳)有限公司 Case handling suggestion input method, device, computer equipment and storage medium
CN109462482A (en) * 2018-11-09 2019-03-12 深圳壹账通智能科技有限公司 Method for recognizing sound-groove, device, electronic equipment and computer readable storage medium
CN109473108A (en) * 2018-12-15 2019-03-15 深圳壹账通智能科技有限公司 Auth method, device, equipment and storage medium based on Application on Voiceprint Recognition
CN110134830A (en) * 2019-04-15 2019-08-16 深圳壹账通智能科技有限公司 Video information data processing method, device, computer equipment and storage medium
CN110246503A (en) * 2019-05-20 2019-09-17 平安科技(深圳)有限公司 Blacklist vocal print base construction method, device, computer equipment and storage medium
CN110634492A (en) * 2019-06-13 2019-12-31 中信银行股份有限公司 Login verification method and device, electronic equipment and computer readable storage medium
CN111210828A (en) * 2019-12-23 2020-05-29 秒针信息技术有限公司 Equipment binding method, device and system and storage medium
CN111224972A (en) * 2019-12-31 2020-06-02 航天信息股份有限公司 Method and system for retrieving account/password based on facial features
CN111464644A (en) * 2020-04-01 2020-07-28 北京声智科技有限公司 Data transmission method and electronic equipment
CN111862933A (en) * 2020-07-20 2020-10-30 北京字节跳动网络技术有限公司 Method, apparatus, device and medium for generating synthesized speech
CN112312084A (en) * 2020-10-16 2021-02-02 李小丽 Intelligent image monitoring system
CN113038420A (en) * 2021-03-03 2021-06-25 恒大新能源汽车投资控股集团有限公司 Service method and device based on Internet of vehicles
CN113056784A (en) * 2019-01-29 2021-06-29 深圳市欢太科技有限公司 Voice information processing method and device, storage medium and electronic equipment
CN113160961A (en) * 2020-01-20 2021-07-23 深圳市理邦精密仪器股份有限公司 Authority management method of mobile electrocardiogram equipment, mobile electrocardiogram equipment and storage medium
CN113409795A (en) * 2021-08-19 2021-09-17 北京世纪好未来教育科技有限公司 Training method, voiceprint recognition method and device and electronic equipment
CN113421573A (en) * 2021-06-18 2021-09-21 马上消费金融股份有限公司 Identity recognition model training method, identity recognition method and device
CN113489668A (en) * 2020-07-14 2021-10-08 青岛海信电子产业控股股份有限公司 Data processing method and intelligent equipment
CN113835345A (en) * 2020-06-24 2021-12-24 青岛海尔多媒体有限公司 Method, device, equipment and system for voice home control
CN114038087A (en) * 2020-07-20 2022-02-11 阜阳万瑞斯电子锁业有限公司 Unlocking system and method for voice recognition of electronic lock
CN114842146A (en) * 2022-05-10 2022-08-02 中国民用航空飞行学院 Civil aviation engine maintenance manual and work card modeling method and storable medium
CN115860749A (en) * 2023-02-09 2023-03-28 支付宝(杭州)信息技术有限公司 Data processing method, device and equipment
CN115952482A (en) * 2023-03-13 2023-04-11 山东博奥克生物科技有限公司 Medical equipment data management system and method

Families Citing this family (29)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108039177A (en) * 2017-12-20 2018-05-15 广州势必可赢网络科技有限公司 User identity verification method and device for network real-name ticket purchasing
CN110010135B (en) * 2018-01-05 2024-05-07 北京搜狗科技发展有限公司 Speech-based identity recognition method and device and electronic equipment
CN108648758B (en) * 2018-03-12 2020-09-01 北京云知声信息技术有限公司 Method and system for separating invalid voice in medical scene
CN108766444B (en) * 2018-04-09 2020-11-03 平安科技(深圳)有限公司 User identity authentication method, server and storage medium
CN108831484A (en) * 2018-05-29 2018-11-16 广东声将军科技有限公司 A kind of offline and unrelated with category of language method for recognizing sound-groove and device
CN108831476A (en) * 2018-05-31 2018-11-16 平安科技(深圳)有限公司 Voice acquisition method, device, computer equipment and storage medium
CN108959163B (en) * 2018-06-28 2020-01-21 掌阅科技股份有限公司 Subtitle display method for audio electronic book, electronic device and computer storage medium
CN110880325B (en) * 2018-09-05 2022-06-28 华为技术有限公司 Identity recognition method and equipment
CN109256147B (en) * 2018-10-30 2022-06-10 腾讯音乐娱乐科技(深圳)有限公司 Audio beat detection method, device and storage medium
CN109801637A (en) * 2018-12-03 2019-05-24 厦门快商通信息技术有限公司 Model Fusion method and system based on hiding factor
CN114938360B (en) * 2019-04-12 2023-04-18 腾讯科技(深圳)有限公司 Data processing method and device based on instant messaging application
CN111046220B (en) * 2019-04-29 2024-06-21 广东小天才科技有限公司 Playback method of newspaper-reading voice in dictation process and electronic equipment
CN110164450B (en) * 2019-05-09 2023-11-28 腾讯科技(深圳)有限公司 Login method, login device, playing equipment and storage medium
CN110232917A (en) * 2019-05-21 2019-09-13 平安科技(深圳)有限公司 Voice login method, device, equipment and storage medium based on artificial intelligence
CN110351329A (en) * 2019-05-27 2019-10-18 深圳壹账通智能科技有限公司 A kind of outer forwarding method of file and file outgoing system
CN110443018A (en) * 2019-07-30 2019-11-12 深圳力维智联技术有限公司 Method, terminal, system and readable storage medium storing program for executing based on Application on Voiceprint Recognition identity
CN110704822A (en) * 2019-08-30 2020-01-17 深圳市声扬科技有限公司 Method, device, server and system for improving user identity authentication security
CN110767237A (en) * 2019-10-25 2020-02-07 深圳市声扬科技有限公司 Voice transmission method and device, first interphone and system
CN111028845A (en) * 2019-12-06 2020-04-17 广州国音智能科技有限公司 Multi-audio recognition method, device, equipment and readable storage medium
CN111339517B (en) * 2020-05-15 2021-08-13 支付宝(杭州)信息技术有限公司 Voiceprint feature sampling method, user identification method, device and electronic equipment
CN111899744A (en) * 2020-07-16 2020-11-06 中国联合网络通信集团有限公司 Voice information processing method, device, server and storage medium
WO2022028207A1 (en) * 2020-08-03 2022-02-10 华为技术有限公司 Speech recognition method, apparatus, device and system, and computer readable storage medium
CN112116911B (en) * 2020-09-22 2023-12-19 深圳易美诺科技有限公司 Sound control method and device and computer readable storage medium
CN114638559A (en) * 2020-12-15 2022-06-17 深圳顺丰快运科技有限公司 Logistics piece signing method, device, equipment and computer readable storage medium
CN112687295A (en) * 2020-12-22 2021-04-20 联想(北京)有限公司 Input control method and electronic equipment
CN113037918B (en) * 2021-03-02 2022-06-17 四川速宝网络科技有限公司 Non-invasive sound changing method for Android client
CN112802482B (en) * 2021-04-15 2021-07-23 北京远鉴信息技术有限公司 Voiceprint serial-parallel identification method, individual soldier system and storage medium
CN113259373B (en) * 2021-06-08 2021-10-01 支付宝(杭州)信息技术有限公司 Resource transfer method, device and system and Internet of things equipment
CN118212927A (en) * 2024-03-07 2024-06-18 中科世通亨奇(北京)科技有限公司 Identity recognition method and system based on sound characteristics, storage medium and electronic equipment

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102737634A (en) * 2012-05-29 2012-10-17 百度在线网络技术(北京)有限公司 Authentication method and device based on voice
US8374865B1 (en) * 2012-04-26 2013-02-12 Google Inc. Sampling training data for an automatic speech recognition system based on a benchmark classification distribution
CN103679452A (en) * 2013-06-20 2014-03-26 腾讯科技(深圳)有限公司 Payment authentication method, device thereof and system thereof
CN104331650A (en) * 2013-07-22 2015-02-04 联想(北京)有限公司 Information processing method and electronic equipment
CN105357006A (en) * 2014-08-20 2016-02-24 中兴通讯股份有限公司 Method and equipment for performing security authentication based on voiceprint feature

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8374865B1 (en) * 2012-04-26 2013-02-12 Google Inc. Sampling training data for an automatic speech recognition system based on a benchmark classification distribution
CN102737634A (en) * 2012-05-29 2012-10-17 百度在线网络技术(北京)有限公司 Authentication method and device based on voice
CN103679452A (en) * 2013-06-20 2014-03-26 腾讯科技(深圳)有限公司 Payment authentication method, device thereof and system thereof
CN104331650A (en) * 2013-07-22 2015-02-04 联想(北京)有限公司 Information processing method and electronic equipment
CN105357006A (en) * 2014-08-20 2016-02-24 中兴通讯股份有限公司 Method and equipment for performing security authentication based on voiceprint feature

Cited By (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107918771A (en) * 2017-12-07 2018-04-17 河北工业大学 Character recognition method and Worn type person recognition system
CN107918771B (en) * 2017-12-07 2023-11-24 河北工业大学 Person identification method and wearable person identification system
CN108417216A (en) * 2018-03-15 2018-08-17 深圳市声扬科技有限公司 Speech verification method, apparatus, computer equipment and storage medium
CN108417216B (en) * 2018-03-15 2021-01-08 深圳市声扬科技有限公司 Voice verification method and device, computer equipment and storage medium
CN109344587A (en) * 2018-08-09 2019-02-15 平安科技(深圳)有限公司 Case handling suggestion input method, device, computer equipment and storage medium
CN109192213A (en) * 2018-08-21 2019-01-11 平安科技(深圳)有限公司 The real-time transfer method of court's trial voice, device, computer equipment and storage medium
CN109192213B (en) * 2018-08-21 2023-10-20 平安科技(深圳)有限公司 Method and device for real-time transcription of court trial voice, computer equipment and storage medium
CN109462482A (en) * 2018-11-09 2019-03-12 深圳壹账通智能科技有限公司 Method for recognizing sound-groove, device, electronic equipment and computer readable storage medium
CN109462482B (en) * 2018-11-09 2023-08-08 深圳壹账通智能科技有限公司 Voiceprint recognition method, voiceprint recognition device, electronic equipment and computer readable storage medium
CN109346083A (en) * 2018-11-28 2019-02-15 北京猎户星空科技有限公司 A kind of intelligent sound exchange method and device, relevant device and storage medium
CN109473108A (en) * 2018-12-15 2019-03-15 深圳壹账通智能科技有限公司 Auth method, device, equipment and storage medium based on Application on Voiceprint Recognition
CN113056784A (en) * 2019-01-29 2021-06-29 深圳市欢太科技有限公司 Voice information processing method and device, storage medium and electronic equipment
CN110134830A (en) * 2019-04-15 2019-08-16 深圳壹账通智能科技有限公司 Video information data processing method, device, computer equipment and storage medium
CN110246503A (en) * 2019-05-20 2019-09-17 平安科技(深圳)有限公司 Blacklist vocal print base construction method, device, computer equipment and storage medium
CN110634492B (en) * 2019-06-13 2023-08-25 中信银行股份有限公司 Login verification method, login verification device, electronic equipment and computer readable storage medium
CN110634492A (en) * 2019-06-13 2019-12-31 中信银行股份有限公司 Login verification method and device, electronic equipment and computer readable storage medium
CN111210828A (en) * 2019-12-23 2020-05-29 秒针信息技术有限公司 Equipment binding method, device and system and storage medium
CN111224972A (en) * 2019-12-31 2020-06-02 航天信息股份有限公司 Method and system for retrieving account/password based on facial features
CN113160961A (en) * 2020-01-20 2021-07-23 深圳市理邦精密仪器股份有限公司 Authority management method of mobile electrocardiogram equipment, mobile electrocardiogram equipment and storage medium
CN111464644A (en) * 2020-04-01 2020-07-28 北京声智科技有限公司 Data transmission method and electronic equipment
CN113835345A (en) * 2020-06-24 2021-12-24 青岛海尔多媒体有限公司 Method, device, equipment and system for voice home control
CN113489668A (en) * 2020-07-14 2021-10-08 青岛海信电子产业控股股份有限公司 Data processing method and intelligent equipment
CN114038087B (en) * 2020-07-20 2024-03-15 阜阳万瑞斯电子锁业有限公司 Unlocking system and method for electronic lock voice recognition
CN114038087A (en) * 2020-07-20 2022-02-11 阜阳万瑞斯电子锁业有限公司 Unlocking system and method for voice recognition of electronic lock
CN111862933A (en) * 2020-07-20 2020-10-30 北京字节跳动网络技术有限公司 Method, apparatus, device and medium for generating synthesized speech
CN112312084A (en) * 2020-10-16 2021-02-02 李小丽 Intelligent image monitoring system
CN113038420A (en) * 2021-03-03 2021-06-25 恒大新能源汽车投资控股集团有限公司 Service method and device based on Internet of vehicles
CN113421573A (en) * 2021-06-18 2021-09-21 马上消费金融股份有限公司 Identity recognition model training method, identity recognition method and device
CN113421573B (en) * 2021-06-18 2024-03-19 马上消费金融股份有限公司 Identity recognition model training method, identity recognition method and device
CN113409795A (en) * 2021-08-19 2021-09-17 北京世纪好未来教育科技有限公司 Training method, voiceprint recognition method and device and electronic equipment
CN114842146A (en) * 2022-05-10 2022-08-02 中国民用航空飞行学院 Civil aviation engine maintenance manual and work card modeling method and storable medium
CN115860749A (en) * 2023-02-09 2023-03-28 支付宝(杭州)信息技术有限公司 Data processing method, device and equipment
CN115952482A (en) * 2023-03-13 2023-04-11 山东博奥克生物科技有限公司 Medical equipment data management system and method

Also Published As

Publication number Publication date
CN107395352A (en) 2017-11-24
CN107395352B (en) 2019-05-07

Similar Documents

Publication Publication Date Title
WO2017197953A1 (en) Voiceprint-based identity recognition method and device
KR102250460B1 (en) Methods, devices and systems for building user glottal models
Liu et al. An MFCC‐based text‐independent speaker identification system for access control
WO2020211354A1 (en) Speaker identity recognition method and device based on speech content, and storage medium
CN108074310B (en) Voice interaction method based on voice recognition module and intelligent lock management system
WO2017113658A1 (en) Artificial intelligence-based method and device for voiceprint authentication
WO2019153404A1 (en) Smart classroom voice control system
US20110320202A1 (en) Location verification system using sound templates
CN104143326A (en) Voice command recognition method and device
JP2007133414A (en) Method and apparatus for estimating discrimination capability of voice and method and apparatus for registration and evaluation of speaker authentication
CN103678977A (en) Method and electronic device for protecting information security
US20230401338A1 (en) Method for detecting an audio adversarial attack with respect to a voice input processed by an automatic speech recognition system, corresponding device, computer program product and computer-readable carrier medium
JP2004101901A (en) Speech interaction system and speech interaction program
TW202018696A (en) Voice recognition method and device and computing device
CN109920435A (en) A kind of method for recognizing sound-groove and voice print identification device
CN109727342A (en) Recognition methods, device, access control system and the storage medium of access control system
KR20230116886A (en) Self-supervised speech representation for fake audio detection
CN112417412A (en) Bank account balance inquiry method, device and system
Kuznetsov et al. Methods of countering speech synthesis attacks on voice biometric systems in banking
KR101181060B1 (en) Voice recognition system and method for speaker recognition using thereof
US20240126851A1 (en) Authentication system and method
KR20040068548A (en) Method and system for non-intrusive speaker verification using behavior models
WO2018137426A1 (en) Method and apparatus for recognizing voice information of user
Hari et al. Comprehensive Research on Speaker Recognition and its Challenges
CN112530441A (en) Method and device for authenticating legal user, computer equipment and storage medium

Legal Events

Date Code Title Description
NENP Non-entry into the national phase (Ref country code: DE)
121 Ep: the epo has been informed by wipo that ep was designated in this application (Ref document number: 17798520; Country of ref document: EP; Kind code of ref document: A1)
122 Ep: pct application non-entry in european phase (Ref document number: 17798520; Country of ref document: EP; Kind code of ref document: A1)