WO2019196303A1 - User identity authentication method, server and storage medium - Google Patents


Info

Publication number
WO2019196303A1
Authority
WO
WIPO (PCT)
Prior art keywords
user identity
feature vector
target user
voiceprint
distance
Prior art date
Application number
PCT/CN2018/102123
Other languages
French (fr)
Chinese (zh)
Inventor
王健宗
胡秋涵
李梦迪
郑斯奇
肖京
Original Assignee
平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by 平安科技(深圳)有限公司 (Ping An Technology (Shenzhen) Co., Ltd.)
Publication of WO2019196303A1 publication Critical patent/WO2019196303A1/en


Classifications

    • G PHYSICS
    • G10 MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 Speaker identification or verification techniques
    • G10L17/02 Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • G10L17/04 Training, enrolment or model building
    • G10L17/06 Decision making techniques; Pattern matching strategies
    • G10L17/08 Use of distortion metrics or a particular distance between probe pattern and reference templates
    • G10L17/20 Pattern transformations or operations aimed at increasing system robustness, e.g. against channel noise or different working conditions
    • G10L17/22 Interactive procedures; Man-machine interfaces
    • G10L25/00 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00
    • G10L25/48 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use
    • G10L25/51 Speech or voice analysis techniques not restricted to a single one of groups G10L15/00 - G10L21/00 specially adapted for particular use for comparison or discrimination

Definitions

  • the present application relates to the field of computer technologies, and in particular, to a user identity verification method, a server, and a computer readable storage medium.
  • Using voiceprint verification technology to verify user identity has become an important means of authentication for major customer service companies (e.g., banks, insurance companies, game companies, etc.).
  • The traditional business solution for realizing user authentication with voiceprint verification technology is as follows: the existing voiceprint recognition technology usually trains a voiceprint verification model on data collected from a single channel, and then uses the trained voiceprint verification model to perform voiceprint verification on voiceprint data from different channels.
  • The present application provides a user identity verification method, a server, and a computer readable storage medium. The main purpose is to avoid the problem of a large difference between the extracted voiceprint discrimination vector and the actual voiceprint discrimination vector caused by different voice data collection channels, thereby improving the accuracy of identity verification.
  • the present application provides a user identity verification method, including:
  • receiving an identity verification request carrying a target user identity, and acquiring current voice data of the target user from a client; inputting the current voice data into a trained voiceprint recognition model, determining a current voiceprint feature vector of the target user, and determining, according to a predetermined mapping relationship between user identities and standard voiceprint feature vectors, the standard voiceprint feature vector corresponding to the target user identity; calculating the distance between the current voiceprint feature vector and the standard voiceprint feature vector by using a predetermined distance calculation formula; and analyzing, according to the distance, whether the target user passes the identity verification, and sending the identity verification result to the client.
  • The present application further provides an identity verification server, where the server includes a memory and a processor, the memory stores a user identity verification program executable on the processor, and when the program is executed by the processor, the following steps are implemented:
  • receiving an identity verification request carrying a target user identity, and acquiring current voice data of the target user from a client; inputting the current voice data into a trained voiceprint recognition model, determining a current voiceprint feature vector of the target user, and determining, according to a predetermined mapping relationship between user identities and standard voiceprint feature vectors, the standard voiceprint feature vector corresponding to the target user identity; calculating the distance between the current voiceprint feature vector and the standard voiceprint feature vector by using a predetermined distance calculation formula; and analyzing, according to the distance, whether the target user passes the identity verification, and sending the identity verification result to the client.
  • The present application further provides a computer readable storage medium having a user identity verification program stored thereon, and when the program is executed by a processor, any step of the user identity verification method described above is implemented.
  • The voiceprint recognition model extracts the current voiceprint discrimination vector of the target user from the current voice data, which avoids the problem that the extracted voiceprint discrimination vector differs greatly from the actual voiceprint discrimination vector due to different voice data collection channels, and improves the accuracy of the extracted voiceprint discrimination vector.
  • FIG. 1 is a schematic diagram of a preferred embodiment of a user identity verification server of the present application
  • FIG. 2 is a schematic diagram of a program module of the user identity verification program in FIG. 1;
  • FIG. 3 is a flow chart of a preferred embodiment of a user identity verification method of the present application.
  • the application provides a user identity verification server 1.
  • a schematic diagram of a preferred embodiment of the identity verification server 1 of the present application is shown.
  • The authentication server 1 may be a rack server, a blade server, a tower server, or a cabinet server.
  • the authentication server 1 includes a memory 11, a processor 12, a communication bus 13, and a network interface 14.
  • the memory 11 includes at least one type of readable storage medium including a flash memory, a hard disk, a multimedia card, a card type memory (for example, an SD or DX memory, etc.), a magnetic memory, a magnetic disk, an optical disk, and the like.
  • the memory 11 may in some embodiments be an internal storage unit of the authentication server 1, such as the hard disk of the authentication server 1.
  • The memory 11 may also be, in other embodiments, an external storage device of the authentication server 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, a flash card, etc., equipped on the authentication server 1. Further, the memory 11 may also include both an internal storage unit and an external storage device of the authentication server 1.
  • The memory 11 can be used not only for storing the application software installed on the identity verification server 1 and various types of data, such as the user identity verification program 10 and the predetermined mapping relationship between user identities and standard voiceprint authentication vectors, but also for temporarily storing data that has been output or will be output.
  • The processor 12 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip for running program code stored in the memory 11 or processing data, such as the user identity verification program 10.
  • The communication bus 13 is used to implement connections and communication between these components.
  • The network interface 14 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface), and is generally used to establish a communication connection between the authentication server 1 and other electronic devices; for example, the authentication server 1 receives, through the network interface 14, the identity verification request carrying the target user identity sent by the user through the client (not shown in the figure), and feeds the authentication result back to the client.
  • FIG. 1 shows only the authentication server 1 with the components 11-14, but it should be understood that not all of the illustrated components are required to be implemented; more or fewer components may be implemented instead.
  • the authentication server 1 may further include a user interface, and the user interface may include a display, an input unit such as a keyboard, and the optional user interface may further include a standard wired interface and a wireless interface.
  • the display may be an LED display, a liquid crystal display, a touch liquid crystal display, and an Organic Light-Emitting Diode (OLED) touch device.
  • the display may also be referred to as a display screen or display unit for displaying information processed in the authentication server 1 and a user interface for displaying visualizations.
  • a user identity verification program 10 is stored in the memory 11.
  • the processor 12 executes the user identity verification program 10 stored in the memory 11, the following steps are implemented:
  • receiving an identity verification request carrying a target user identity, and acquiring current voice data of the target user from a client; inputting the current voice data into a trained voiceprint recognition model, determining a current voiceprint feature vector of the target user, and determining, according to a predetermined mapping relationship between user identities and standard voiceprint feature vectors, the standard voiceprint feature vector corresponding to the target user identity; calculating the distance between the current voiceprint feature vector and the standard voiceprint feature vector by using a predetermined distance calculation formula; and analyzing, according to the distance, whether the target user passes the identity verification, and sending the identity verification result to the client.
  • the client is a client computer or a mobile terminal with voice collection function used by the target user, and the target user sends an identity verification request through the client.
  • The target user identity is, for example, the user's ID number.
  • When the identity verification request is received, the real-time voice data of the user who currently sends the request is collected; that is, the client collects the current voice data of the target user, and a corresponding current voiceprint discrimination vector is constructed from the collected current voice data.
  • a corresponding standard voiceprint authentication vector is set in advance for a predetermined user identity, and a mapping relationship between the predetermined user identity and the standard voiceprint authentication vector is obtained, and the mapping relationship is saved to a database (not identified in the figure).
  • The predetermined user identities include the target user identity.
  • For example, the user identity M1 corresponds to one standard voiceprint discrimination vector, and the user identity M2 corresponds to another standard voiceprint discrimination vector.
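As a minimal illustration, the predetermined mapping between user identities and standard voiceprint feature vectors can be kept as a simple lookup table. Everything below (identifiers such as M1/M2, the 4-dimensional toy vectors, the function name) is an assumption for the sketch, not part of the application:

```python
import numpy as np

# Hypothetical enrollment store: predetermined user identity -> standard
# voiceprint feature vector. A real system would use the dimensionality
# of the trained model's speaker space, not 4-dimensional toy vectors.
standard_vectors = {
    "M1": np.array([0.9, 0.1, 0.0, 0.2]),
    "M2": np.array([0.1, 0.8, 0.3, 0.0]),
}

def lookup_standard_vector(user_id):
    """Return the standard voiceprint feature vector for a user identity,
    or None if the identity has not been enrolled."""
    return standard_vectors.get(user_id)

print(lookup_standard_vector("M1"))  # the enrolled vector for M1
print(lookup_standard_vector("M3"))  # None: identity not enrolled
```

In practice the patent stores this mapping in a database rather than in memory, but the lookup semantics are the same.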
  • After the current voice data of the target user is collected, the current voice data is input into the pre-trained voiceprint recognition model to determine the current voiceprint discrimination vector corresponding to the current voice data.
  • The voiceprint recognition model is obtained by acquiring the voice samples of a first preset number (for example, 5000) of users, where each user's voice sample includes a second preset number (for example, 10) of different voice segment samples, and the different voice segment samples are acquired through different channels (for example, different terminals). The preset voiceprint recognition model is then trained by using the acquired voice samples of each user.
  • The trained voiceprint recognition model can be used to obtain the voiceprint discrimination vectors of voice data from different channels, which avoids, to some extent, the problem that the voiceprint discrimination vector differs greatly from the actual voiceprint discrimination vector due to different voice data collection channels, and improves the accuracy of the recognized voiceprint discrimination vector.
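The channel-diverse training set described above can be organized as follows; the counts, the three-channel layout, and the `(user, segment, channel)` record shape are illustrative assumptions rather than the application's data format:

```python
# Toy organization of multi-channel training samples: each user (the first
# preset number of users, 5000 in the text's example) contributes a second
# preset number of voice segments recorded through different channels.
NUM_USERS = 50          # small stand-in for the example value of 5000
SEGMENTS_PER_USER = 10  # the second preset number from the text

def collect_training_samples(num_users, segments_per_user):
    samples = []
    for user in range(num_users):
        for seg in range(segments_per_user):
            # In a real pipeline this record would hold an audio recording;
            # here we only note which collection channel the segment used.
            channel = seg % 3  # pretend there are 3 distinct channels
            samples.append({"user": user, "segment": seg, "channel": channel})
    return samples

samples = collect_training_samples(NUM_USERS, SEGMENTS_PER_USER)
print(len(samples))  # 500 segments in total
```

The point of the layout is that every speaker appears under several channels, so the trained model can separate speaker identity from channel effects.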
  • the voiceprint recognition model needs to be defined before training the voiceprint recognition model.
  • The voiceprint recognition model includes a speaker space feature item, represented by the eigenvoice space matrix, and a channel space feature item, represented by the eigenchannel space matrix. It should be noted that the speaker space feature item is related only to the speaker and is independent of the specific content spoken; it expresses the inter-class difference between speakers. To facilitate computation, this feature item is summarized in matrix form as the eigenvoice space matrix, whose content is defined as the speaker feature item and contains the information unique to the corresponding speaker; this feature item is different for each person. The channel space feature item represents the differences of the same speaker across channels, that is, the noise differences caused by different channels. To facilitate computation, this feature item is likewise summarized in matrix form as the eigenchannel space matrix, whose content is defined as the channel space feature item and contains the voiceprint difference information produced when the same speaker speaks through different channels; that is, for the same voice of the same person passing through different channels, this feature item differs.
  • the speaker spatial feature item includes a speaker voiceprint feature vector
  • the channel spatial feature item includes a channel factor feature vector.
  • The model formula of the voiceprint recognition model is:
  • X_ij = μ + F·h_i + G·w_ij + ε_ij
  • where X_ij represents the j-th speech of the i-th speaker; μ represents the mean of all speech sample data; F represents the identity space and contains the bases used to represent the various identities, each column of F being equivalent to a feature vector of the inter-class space; h_i represents the voiceprint feature vector of the i-th speaker; G represents the error space and contains the bases used to represent the different changes of the same identity, each column of G being equivalent to a feature vector of the intra-class space; w_ij represents the channel factor feature vector of the j-th speech of the i-th speaker; and ε_ij represents the residual noise term, denoting factors not yet explained, which may be assumed to follow a zero-mean Gaussian distribution. The term "μ + F·h_i" represents the speaker space feature item, and "G·w_ij + ε_ij" represents the channel space feature item. It should be noted that the voiceprint feature vectors h_i corresponding to different speech segments of the same speaker are the same.
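The generative model above, with a speaker term μ + F·h_i and a channel term G·w_ij + ε_ij, can be sketched numerically. All dimensions, the random parameters, and the NumPy framing are illustrative assumptions, not the application's implementation:

```python
import numpy as np

rng = np.random.default_rng(0)

D, P, Q = 8, 3, 2             # feature dim, speaker-space and channel-space ranks (toy sizes)
mu = rng.normal(size=D)       # mean of all speech sample data
F = rng.normal(size=(D, P))   # identity space: columns span inter-class variation
G = rng.normal(size=(D, Q))   # error space: columns span intra-class / channel variation

def generate_speech(h_i, w_ij, noise_scale=0.01):
    """X_ij = mu + F h_i + G w_ij + eps_ij for one speech segment."""
    eps = rng.normal(scale=noise_scale, size=D)  # zero-mean Gaussian residual
    return mu + F @ h_i + G @ w_ij + eps

# The voiceprint feature vector h_i is shared by all segments of speaker i;
# only the channel factor w_ij and the residual differ between segments.
h_1 = rng.normal(size=P)
x_11 = generate_speech(h_1, rng.normal(size=Q))  # segment 1 of speaker 1
x_12 = generate_speech(h_1, rng.normal(size=Q))  # segment 2, same speaker, new channel factor
```

The two segments differ only through the channel term, which is exactly the property that lets verification compare speakers while discounting channel effects.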
  • The predetermined distance calculation formula computes D, where D represents the distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector corresponding to the target user identity; its two inputs are the standard voiceprint discrimination vector corresponding to the target user identity carried in the authentication request and the current voiceprint discrimination vector extracted from the current voice data.
  • A distance threshold is preset. When the calculated distance D is within the preset threshold, the voiceprint verification result is determined to be that the voiceprint verification passes, that is, the target user passes the identity verification; otherwise, the voiceprint verification result is that the voiceprint verification fails, that is, the target user fails the identity verification. The identity verification result is fed back to the client.
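A minimal sketch of the threshold decision. The application does not reproduce its predetermined distance formula here, so Euclidean distance and the threshold value 0.5 are stand-in assumptions:

```python
import numpy as np

def verify_by_threshold(current_vec, standard_vec, threshold=0.5):
    """Pass voiceprint verification iff the distance D between the current
    voiceprint feature vector and the standard voiceprint feature vector
    is within the preset threshold (distance and threshold are assumptions)."""
    d = float(np.linalg.norm(np.asarray(current_vec) - np.asarray(standard_vec)))
    return d <= threshold

print(verify_by_threshold([0.9, 0.1], [0.85, 0.12]))  # True: vectors are close
print(verify_by_threshold([0.9, 0.1], [0.1, 0.9]))    # False: vectors are far apart
```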
  • In another embodiment, while the distance between the current voiceprint feature vector and the standard voiceprint feature vector corresponding to the target user identity is calculated by using the predetermined distance calculation formula, the distances between the current voiceprint feature vector and the pre-stored standard voiceprint feature vectors corresponding to each of the predetermined other users (for example, n users, where n is an integer and n > 0) are also calculated; that is, the distance D_i between the current voiceprint discrimination vector and the standard voiceprint discrimination vector corresponding to each of the predetermined user identifiers is calculated respectively, where i is an integer and 0 < i ≤ n. The specific calculation manner is the same as in the above embodiment and is not repeated here.
  • The distances between the current voiceprint feature vector and the standard voiceprint feature vectors corresponding to the predetermined user identities are sorted, and the user identities corresponding to the third preset number (for example, 5) of smallest distances are filtered out from the n distances. It is then determined whether the target user identity is included among this third preset number of user identities. When the third preset number of user identities includes the target user identity, the voiceprint verification result is determined to be that the voiceprint verification passes, that is, the target user passes the identity verification; otherwise, the voiceprint verification result is that the voiceprint verification fails, that is, the target user fails the identity verification, and the identity verification result is fed back to the client. The third preset number used after sorting may also be adjusted (for example, the third preset number may be adjusted to 2).
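The alternative ranking check can be sketched as follows; the Euclidean distance, the sample vectors, and the identifiers are illustrative assumptions:

```python
import numpy as np

def verify_by_ranking(current_vec, standard_vectors, target_id, top_k=5):
    """Compute the distance from the current voiceprint feature vector to every
    predetermined user's standard vector, keep the user identities with the
    top_k smallest distances, and pass iff the target identity is among them."""
    current = np.asarray(current_vec, dtype=float)
    dists = {uid: float(np.linalg.norm(current - np.asarray(vec, dtype=float)))
             for uid, vec in standard_vectors.items()}
    nearest = sorted(dists, key=dists.get)[:top_k]  # identities with smallest distances
    return target_id in nearest

vectors = {
    "M1": [0.9, 0.1],
    "M2": [0.1, 0.9],
    "M3": [0.5, 0.5],
}
print(verify_by_ranking([0.88, 0.12], vectors, "M1", top_k=1))  # True: M1 is nearest
print(verify_by_ranking([0.88, 0.12], vectors, "M2", top_k=1))  # False: M2 is not in the top 1
```

Shrinking `top_k` (the third preset number) tightens the check, which mirrors the adjustment the text describes.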
  • The server 1 proposed in the above embodiment redefines the voiceprint recognition model and trains it with voiceprint data collected through different channels, and uses the trained model to extract the current voiceprint discrimination vector of the target user from the current voice data. This avoids, to some extent, the problem of a large difference between the extracted voiceprint discrimination vector and the actual voiceprint discrimination vector caused by different voice data collection channels, and improves the accuracy of the extracted voiceprint discrimination vector. By calculating the distances between the current voiceprint discrimination vector and the standard voiceprint discrimination vectors corresponding to the predetermined user identities, and determining whether the target user identity is included among the user identities corresponding to the preset number of smallest distances, it is determined whether the target user passes the identity verification, which improves the success rate of user authentication to a certain extent.
  • The user identity verification program 10 may also be divided into one or more modules, where the one or more modules are stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to complete the present application.
  • a module referred to herein refers to a series of computer program instructions that are capable of performing a particular function.
  • FIG. 2 it is a schematic diagram of a program module of the user identity verification program 10 in FIG. 1.
  • The user identity verification program 10 can be divided into an acquisition module 110, a vector extraction module 120, a calculation module 130, and an analysis module 140. The functions or operational steps implemented by the modules 110-140 are similar to those described above and are not detailed here. Exemplarily:
  • the obtaining module 110 is configured to receive an identity verification request with a target user identity, and obtain current voice data of the target user from the client.
  • the vector extraction module 120 is configured to input the current voice data into the trained voiceprint recognition model, determine a current voiceprint feature vector of the target user, and according to a mapping relationship between the predetermined user identity and the standard voiceprint feature vector, Determining a standard voiceprint feature vector corresponding to the target user identity;
  • the calculating module 130 is configured to calculate a distance between the current voiceprint feature vector and the standard voiceprint feature vector by using a predetermined distance calculation formula
  • the analyzing module 140 is configured to analyze, according to the distance, whether the target user passes the identity verification, and send the identity verification result to the client.
  • the present application also provides a user identity verification method.
  • FIG. 3 it is a flowchart of a preferred embodiment of the user identity verification method of the present application.
  • The method can be performed by a device, and the device can be implemented by software and/or hardware.
  • the user identity verification method includes steps S1-S4:
  • Step S1: Receive an identity verification request carrying a target user identity, and obtain current voice data of the target user from the client.
  • Step S2: Input the current voice data into the trained voiceprint recognition model, determine a current voiceprint feature vector of the target user, and determine, according to a predetermined mapping relationship between user identities and standard voiceprint feature vectors, the standard voiceprint feature vector corresponding to the target user identity.
  • Step S3: Calculate the distance between the current voiceprint feature vector and the standard voiceprint feature vector by using a predetermined distance calculation formula.
  • Step S4: Analyze, according to the distance, whether the target user passes the identity verification, and send the identity verification result to the client.
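Steps S1-S4 can be connected in a short end-to-end sketch. The voiceprint model is replaced by a stub, and the identifiers, vectors, distance metric, and threshold are all assumptions made for illustration:

```python
import numpy as np

STANDARD_VECTORS = {"M1": np.array([0.9, 0.1])}  # predetermined mapping (toy values)
THRESHOLD = 0.5                                  # assumed preset distance threshold

def extract_voiceprint(voice_data):
    # Stub for the trained voiceprint recognition model of step S2; a real
    # system would run the PLDA-style model on the collected audio.
    return np.asarray(voice_data, dtype=float)

def authenticate(target_id, voice_data):
    standard = STANDARD_VECTORS.get(target_id)        # S1/S2: look up the standard vector
    if standard is None:
        return False                                  # unknown identity cannot pass
    current = extract_voiceprint(voice_data)          # S2: current voiceprint feature vector
    distance = float(np.linalg.norm(current - standard))  # S3: distance calculation
    return distance <= THRESHOLD                      # S4: analyze and return the result

print(authenticate("M1", [0.88, 0.12]))  # True: matches the enrolled voiceprint
print(authenticate("M1", [0.1, 0.9]))    # False: too far from the enrolled voiceprint
```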
  • the client is a client computer or a mobile terminal with voice collection function used by the target user, and the target user sends an identity verification request through the client.
  • The target user identity is, for example, the user's ID number.
  • When the identity verification request is received, the real-time voice data of the user who currently sends the request is collected; that is, the client collects the current voice data of the target user, and a corresponding current voiceprint discrimination vector is constructed from the collected current voice data.
  • a corresponding standard voiceprint authentication vector is set in advance for a predetermined user identity, and a mapping relationship between the predetermined user identity and the standard voiceprint authentication vector is obtained, and the mapping relationship is saved to a database (not identified in the figure).
  • The predetermined user identities include the target user identity.
  • For example, the user identity M1 corresponds to one standard voiceprint discrimination vector, and the user identity M2 corresponds to another standard voiceprint discrimination vector.
  • After the current voice data of the target user is collected, the current voice data is input into the pre-trained voiceprint recognition model to determine the current voiceprint discrimination vector corresponding to the current voice data.
  • The voiceprint recognition model is obtained by acquiring the voice samples of a first preset number (for example, 5000) of users, where each user's voice sample includes a second preset number (for example, 10) of different voice segment samples, and the different voice segment samples are acquired through different channels (for example, different terminals). The preset voiceprint recognition model is then trained by using the acquired voice samples of each user.
  • The trained voiceprint recognition model can be used to obtain the voiceprint discrimination vectors of voice data from different channels, which avoids, to some extent, the problem that the voiceprint discrimination vector differs greatly from the actual voiceprint discrimination vector due to different voice data collection channels, and improves the accuracy of the recognized voiceprint discrimination vector.
  • the voiceprint recognition model needs to be defined before training the voiceprint recognition model.
  • The voiceprint recognition model includes a speaker space feature item, represented by the eigenvoice space matrix, and a channel space feature item, represented by the eigenchannel space matrix. It should be noted that the speaker space feature item is related only to the speaker and is independent of the specific content spoken; it expresses the inter-class difference between speakers. To facilitate computation, this feature item is summarized in matrix form as the eigenvoice space matrix, whose content is defined as the speaker feature item and contains the information unique to the corresponding speaker; this feature item is different for each person. The channel space feature item represents the differences of the same speaker across channels, that is, the noise differences caused by different channels. To facilitate computation, this feature item is likewise summarized in matrix form as the eigenchannel space matrix, whose content is defined as the channel space feature item and contains the voiceprint difference information produced when the same speaker speaks through different channels; that is, for the same voice of the same person passing through different channels, this feature item differs.
  • the speaker spatial feature item includes a speaker voiceprint feature vector
  • the channel spatial feature item includes a channel factor feature vector.
  • The model formula of the voiceprint recognition model is:
  • X_ij = μ + F·h_i + G·w_ij + ε_ij
  • where X_ij represents the j-th speech of the i-th speaker; μ represents the mean of all speech sample data; F represents the identity space and contains the bases used to represent the various identities, each column of F being equivalent to a feature vector of the inter-class space; h_i represents the voiceprint feature vector of the i-th speaker; G represents the error space and contains the bases used to represent the different changes of the same identity, each column of G being equivalent to a feature vector of the intra-class space; w_ij represents the channel factor feature vector of the j-th speech of the i-th speaker; and ε_ij represents the residual noise term, denoting factors not yet explained, which may be assumed to follow a zero-mean Gaussian distribution. The term "μ + F·h_i" represents the speaker space feature item, and "G·w_ij + ε_ij" represents the channel space feature item. It should be noted that the voiceprint feature vectors h_i corresponding to different speech segments of the same speaker are the same.
  • The predetermined distance calculation formula computes D, where D represents the distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector corresponding to the target user identity; its two inputs are the standard voiceprint discrimination vector corresponding to the target user identity carried in the authentication request and the current voiceprint discrimination vector extracted from the current voice data.
  • A distance threshold is preset. When the calculated distance D is within the preset threshold, the voiceprint verification result is determined to be that the voiceprint verification passes, that is, the target user passes the identity verification; otherwise, the voiceprint verification result is that the voiceprint verification fails, that is, the target user fails the identity verification. The identity verification result is fed back to the client.
  • The step S3 may be replaced by: calculating, by using the predetermined distance calculation formula, not only the distance between the current voiceprint feature vector and the standard voiceprint feature vector corresponding to the target user identity, but also the distances between the current voiceprint feature vector and the pre-stored standard voiceprint feature vectors corresponding to each of the predetermined other users (for example, n users, where n is an integer and n > 0).
  • The step S4 may be replaced by: sorting the distances between the current voiceprint feature vector and the standard voiceprint feature vectors corresponding to the predetermined user identities; selecting, from the n distances, the user identities corresponding to the third preset number (for example, 5) of smallest distances; and determining whether the target user identity is included among this third preset number of user identities. When the third preset number of user identities includes the target user identity, the voiceprint verification result is determined to be that the voiceprint verification passes, that is, the target user passes the identity verification; otherwise, the voiceprint verification result is determined to be that the voiceprint verification fails, that is, the target user fails the identity verification, and the identity verification result is fed back to the client. The third preset number used after sorting may also be adjusted (for example, the third preset number may be adjusted to 2).
  • In the user identity verification method proposed by the above embodiment, the voiceprint recognition model is redefined and trained with voiceprint data collected through different channels, and the trained model is used to extract the current voiceprint discrimination vector of the target user from the current voice data. This avoids, to some extent, the problem that the extracted voiceprint discrimination vector differs greatly from the actual voiceprint discrimination vector due to different voice data collection channels, and improves the accuracy of the extracted voiceprint discrimination vector. By calculating the distances between the current voiceprint discrimination vector and the standard voiceprint discrimination vectors corresponding to the predetermined user identities, and determining whether the target user identity is included among the user identities corresponding to the preset number of smallest distances, it is determined whether the target user passes the identity verification, which improves the success rate of user authentication to some extent.
  • The embodiment of the present application further provides a computer readable storage medium in which the user identity verification program 10 is stored, and when the program is executed by the processor, the following operations are implemented:
  • receiving an identity verification request carrying a target user identity, and acquiring current voice data of the target user from a client; inputting the current voice data into a trained voiceprint recognition model, determining a current voiceprint feature vector of the target user, and determining, according to a predetermined mapping relationship between user identities and standard voiceprint feature vectors, the standard voiceprint feature vector corresponding to the target user identity; calculating the distance between the current voiceprint feature vector and the standard voiceprint feature vector by using a predetermined distance calculation formula; and analyzing, according to the distance, whether the target user passes the identity verification, and sending the identity verification result to the client.
  • The technical solution of the present application, in essence or in the part that contributes to the prior art, may be embodied in the form of a software product stored in a storage medium (such as the ROM/RAM described above, a magnetic disk, or an optical disc), including a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, or a network device, etc.) to perform the methods described in the various embodiments of the present application.


Abstract

Provided is a user identity authentication method, comprising: receiving an identity authentication request carrying an identity identifier of a target user, and acquiring current voice data of the target user from a client; inputting the current voice data into a trained voiceprint recognition model, determining a current voiceprint feature vector of the target user, and determining a standard voiceprint feature vector corresponding to the identity identifier of the target user; calculating the distance between the current voiceprint feature vector and the standard voiceprint feature vector; and analyzing, according to the distance, whether the target user passes the identity authentication, and sending an identity authentication result to the client. Further provided are an identity authentication server and a computer-readable storage medium. The present application avoids large differences between the extracted voiceprint identification vector and the actual voiceprint identification vector caused by differences between voice data collection channels, thereby improving the accuracy of identity authentication.

Description

User identity verification method, server and storage medium
This application claims priority under the Paris Convention to Chinese Patent Application No. CN 2018103110980, filed on April 9, 2018 and entitled "User Identity Verification Method, Server and Storage Medium", the entire contents of which are incorporated herein by reference.
Technical Field
The present application relates to the field of computer technologies, and in particular, to a user identity verification method, a server, and a computer readable storage medium.
Background
At present, with the continuous development of voiceprint recognition technology, using voiceprint verification to confirm user identity has become an important authentication means for major customer service companies (for example, banks, insurance companies, and game companies).
In the traditional business solution for user identity verification based on voiceprint technology, the voiceprint verification model is usually trained on voiceprint data collected from a single channel, and the trained model is then used to perform voiceprint verification on voiceprint data from different channels.
However, the drawback of this conventional voiceprint verification scheme is that, when used across devices, differences between different types of devices easily lead to large differences in the collected voiceprint data, so the recognition accuracy cannot meet the requirements.
Summary of the Invention
The present application provides a user identity verification method, a server, and a computer readable storage medium, whose main purpose is to avoid large differences between the extracted voiceprint authentication vector and the actual voiceprint authentication vector caused by differences in voice data collection channels, thereby improving the accuracy of identity verification.
To achieve the above objective, the present application provides a user identity verification method, including:
receiving an identity verification request carrying a target user identity, and acquiring current voice data of the target user from a client;
inputting the current voice data into a trained voiceprint recognition model, determining a current voiceprint feature vector of the target user, and determining a standard voiceprint feature vector corresponding to the target user identity according to a predetermined mapping relationship between user identities and standard voiceprint feature vectors;
calculating a distance between the current voiceprint feature vector and the standard voiceprint feature vector using a predetermined distance calculation formula; and
analyzing, according to the distance, whether the target user passes the identity verification, and sending the identity verification result to the client.
In addition, to achieve the above objective, the present application further provides an identity verification server, including a memory and a processor, where the memory stores a user identity verification program executable on the processor, and the program, when executed by the processor, implements the following steps:
receiving an identity verification request carrying a target user identity, and acquiring current voice data of the target user from a client;
inputting the current voice data into a trained voiceprint recognition model, determining a current voiceprint feature vector of the target user, and determining a standard voiceprint feature vector corresponding to the target user identity according to a predetermined mapping relationship between user identities and standard voiceprint feature vectors;
calculating a distance between the current voiceprint feature vector and the standard voiceprint feature vector using a predetermined distance calculation formula; and
analyzing, according to the distance, whether the target user passes the identity verification, and sending the identity verification result to the client.
In addition, to achieve the above objective, the present application further provides a computer readable storage medium on which a user identity verification program is stored; when executed by a processor, the program implements any step of the user identity verification method described above.
Compared with the prior art, the user identity verification method, server, and computer readable storage medium proposed by the present application redefine the voiceprint recognition model and train it on voiceprint data collected through different channels, then use the trained model to extract the current voiceprint authentication vector of the target user from the current voice data. This avoids, to some extent, large differences between the extracted voiceprint authentication vector and the actual one caused by differences in voice data collection channels, and improves the accuracy of extraction. By calculating the distance between the current voiceprint authentication vector and the standard voiceprint authentication vectors corresponding to the predetermined user identities, and analyzing whether the target user identity is among the user identities corresponding to a preset number of minimum distances, the method improves, to some extent, the success rate of user identity verification.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of a preferred embodiment of the user identity verification server of the present application;
FIG. 2 is a schematic diagram of the program modules of the user identity verification program in FIG. 1;
FIG. 3 is a flowchart of a preferred embodiment of the user identity verification method of the present application.
The implementation, functional features, and advantages of the present application will be further described with reference to the accompanying drawings in conjunction with the embodiments.
Detailed Description
It should be understood that the specific embodiments described herein are merely illustrative of the application and are not intended to limit it.
The present application provides a user identity verification server 1. FIG. 1 is a schematic diagram of a preferred embodiment of the identity verification server 1 of the present application.
In this embodiment, the identity verification server 1 may be a rack server, a blade server, a tower server, or a cabinet server.
The identity verification server 1 includes a memory 11, a processor 12, a communication bus 13, and a network interface 14.
The memory 11 includes at least one type of readable storage medium, including a flash memory, a hard disk, a multimedia card, a card-type memory (for example, SD or DX memory), a magnetic memory, a magnetic disk, an optical disc, and the like. In some embodiments, the memory 11 may be an internal storage unit of the identity verification server 1, such as a hard disk of the identity verification server 1. In other embodiments, the memory 11 may also be an external storage device of the identity verification server 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card equipped on the identity verification server 1. Further, the memory 11 may include both an internal storage unit and an external storage device of the identity verification server 1. The memory 11 can be used not only to store application software installed on the identity verification server 1 and various types of data, such as the user identity verification program 10 and the predetermined mapping relationship between user identities and standard voiceprint authentication vectors, but also to temporarily store data that has been output or is to be output.
The processor 12 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data processing chip, and is used to run program code or process data stored in the memory 11, such as the user identity verification program 10.
The communication bus 13 is used to implement connection and communication between these components.
The network interface 14 may optionally include a standard wired interface and a wireless interface (such as a WI-FI interface), and is generally used to establish a communication connection between the identity verification server 1 and other electronic devices. For example, the identity verification server 1 receives, through the network interface 14, an identity verification request carrying the target identity sent by a user through a client (not shown in the figure), and feeds the identity verification result back to the client.
FIG. 1 shows only the identity verification server 1 with components 11-14, but it should be understood that not all illustrated components are required; more or fewer components may be implemented instead.
Optionally, the identity verification server 1 may further include a user interface, which may include a display and an input unit such as a keyboard; the optional user interface may further include a standard wired interface and a wireless interface.
Optionally, in some embodiments, the display may be an LED display, a liquid crystal display, a touch liquid crystal display, an Organic Light-Emitting Diode (OLED) touch display, or the like. The display may also be referred to as a display screen or display unit, and is used to display information processed in the identity verification server 1 and to display a visualized user interface.
In the embodiment shown in FIG. 1, the user identity verification program 10 is stored in the memory 11. When the processor 12 executes the user identity verification program 10 stored in the memory 11, the following steps are implemented:
receiving an identity verification request carrying a target user identity, and acquiring current voice data of the target user from a client;
inputting the current voice data into a trained voiceprint recognition model, determining a current voiceprint feature vector of the target user, and determining a standard voiceprint feature vector corresponding to the target user identity according to a predetermined mapping relationship between user identities and standard voiceprint feature vectors;
calculating a distance between the current voiceprint feature vector and the standard voiceprint feature vector using a predetermined distance calculation formula; and
analyzing, according to the distance, whether the target user passes the identity verification, and sending the identity verification result to the client.
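The four steps above can be sketched as a single server-side routine. This is a minimal illustration under stated assumptions, not the patent's implementation: `extract_voiceprint` stands in for the trained voiceprint recognition model, and the Euclidean distance and threshold value are placeholders (the patent's actual distance formula is shown only as an image).

```python
# Minimal sketch of steps S1-S4. All names, the Euclidean distance, and
# the threshold value are illustrative assumptions.

def extract_voiceprint(voice_data):
    # Stand-in for the trained voiceprint recognition model of step S2;
    # a real system would run the model on raw audio.
    return voice_data["feature_vector"]

def euclidean_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5

def authenticate(request, standard_vectors, threshold=1.0):
    user_id = request["user_id"]                          # S1: identity in the request
    current = extract_voiceprint(request["voice_data"])   # S2: current vector
    standard = standard_vectors[user_id]                  # S2: mapped standard vector
    distance = euclidean_distance(current, standard)      # S3: distance
    return {"user_id": user_id, "passed": distance <= threshold}  # S4: result to client

standard_vectors = {"M1": [0.1, 0.9, 0.3]}
request = {"user_id": "M1", "voice_data": {"feature_vector": [0.2, 0.8, 0.3]}}
result = authenticate(request, standard_vectors)
print(result)
```

The verification result dictionary is what would be sent back to the client in step S4.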
In this embodiment, the client is a client computer or mobile terminal with a voice collection function used by the target user, and the target user sends an identity verification request through the client. After the identity verification request carrying the target user identity (for example, an ID card number) is received from the client, in order to prevent the target user from performing a fraudulent operation, the real-time voice data of the user who currently issues the request must be collected; that is, the client collects the current voice data of the target user, and a corresponding current voiceprint authentication vector is constructed from the collected data. In addition, a corresponding standard voiceprint authentication vector is set in advance for each predetermined user identity, a mapping relationship between the predetermined user identities and the standard voiceprint authentication vectors is obtained, and the mapping relationship is saved in a database (not shown in the figure); the predetermined user identities include the target user identity. For example, the user identity M1 corresponds to one standard voiceprint authentication vector and the user identity M2 corresponds to another (the vectors are rendered as images in the source: PCTCN2018102123-appb-000001 and PCTCN2018102123-appb-000002). Then, according to the target user identity carried in the identity verification request, the mapping relationship between user identities and standard voiceprint authentication vectors is retrieved from the database, and the standard voiceprint authentication vector corresponding to the target user identity is determined.
As an implementation manner, after the current voice data of the target user is collected, the current voice data is input into the pre-trained voiceprint recognition model to determine the current voiceprint authentication vector corresponding to the current voice data.
Specifically, the voiceprint recognition model is obtained through the following steps: voice samples of a first preset number (for example, 5000) of users are acquired in advance, where each user's voice sample includes a second preset number (for example, 10) of different voice segment samples, and the different voice segment samples are acquired through different channels (for example, different terminals); the preset type of voiceprint recognition model is then trained with the acquired voice samples of each user to generate the trained voiceprint recognition model. By training the voiceprint recognition model with voiceprint data collected through different channels and subsequently using it to obtain voiceprint authentication vectors for voice data from different channels, large differences between the extracted voiceprint authentication vector and the actual one caused by differences in voice data collection channels can be avoided to some extent, improving the accuracy of recognizing voiceprint authentication vectors.
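The training-set construction described above (a first preset number of speakers, each contributing a second preset number of voice segments collected over different channels) can be sketched as follows. The speaker/segment counts and channel names here are assumptions for illustration (the text's example uses 5000 users and 10 segments).

```python
# Illustrative layout of the multi-channel training corpus: each of the
# first-preset-number speakers has second-preset-number segments, each
# tagged with the channel it was collected from.
from collections import defaultdict

NUM_SPEAKERS = 5           # first preset number (5000 in the text's example)
SEGMENTS_PER_SPEAKER = 10  # second preset number
CHANNELS = ["landline", "mobile_app", "web_mic", "ivr", "desktop_mic"]

corpus = []
for speaker_id in range(NUM_SPEAKERS):
    for j in range(SEGMENTS_PER_SPEAKER):
        corpus.append({
            "speaker": speaker_id,
            "segment": j,
            # rotate channels so every speaker is recorded on all of them
            "channel": CHANNELS[j % len(CHANNELS)],
        })

channels_seen = defaultdict(set)
for sample in corpus:
    channels_seen[sample["speaker"]].add(sample["channel"])
print(len(corpus), all(len(c) == len(CHANNELS) for c in channels_seen.values()))
```

Covering every speaker on several channels is what lets the trained model separate speaker identity from channel effects.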
Further, before the voiceprint recognition model is trained, it must be defined. In this embodiment, the voiceprint recognition model includes a speaker-space feature term representing the eigenvoice space matrix and a channel-space feature term representing the eigenchannel space matrix. It should be noted that the speaker-space feature term is related only to the speaker and is independent of what the speaker says; it expresses the between-class differences among speakers. For convenience of computation, these feature terms are collected into matrix form and represented as the eigenvoice space matrix, whose content is defined as the speaker feature term and contains information unique to the corresponding speaker; this term differs from person to person. The channel-space feature term represents the differences within the same speaker, that is, the noise differences caused by different channels. For convenience of computation, these feature terms are likewise collected into matrix form and represented as the eigenchannel space matrix, whose content is defined as the channel-space feature term and contains the voiceprint difference information produced when the same speaker uses different channels; in other words, for the same speech of the same person passed through different channels, this feature term differs. The speaker-space feature term includes the speaker voiceprint feature vector, and the channel-space feature term includes the channel factor feature vector.
Preferably, the model formula of the voiceprint recognition model is:
X_ij = μ + F h_i + G w_ij + ε_ij
where X_ij denotes the j-th speech segment of the i-th speaker; μ denotes the mean of all voice sample data; F denotes the identity space and contains the bases used to represent various identities, each column of F corresponding to a feature vector of the between-class space; h_i denotes the voiceprint feature vector of the i-th speaker; G denotes the error space and contains the bases used to represent different variations of the same identity, each column of G corresponding to a feature vector of the within-class space; w_ij denotes the channel factor feature vector of the j-th speech segment of the i-th speaker; and ε_ij denotes the residual noise term, representing factors not yet explained, which may follow a zero-mean Gaussian distribution. The term μ + F h_i is the speaker-space feature term, and G w_ij + ε_ij is the channel-space feature term. It should be noted that the voiceprint feature vectors h_i corresponding to different speech segments of the same speaker are identical, and the G w_ij + ε_ij factor relationship can be learned through model training.
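The decomposition X_ij = μ + F h_i + G w_ij + ε_ij can be illustrated by sampling from it. The sketch below uses small, made-up dimensions (the text does not specify any) and numpy purely for the linear algebra; it shows how h_i stays fixed across a speaker's utterances while w_ij and the residual vary per utterance.

```python
import numpy as np

# Sketch of sampling from X_ij = mu + F h_i + G w_ij + eps_ij.
# Dimensions are illustrative assumptions, not from the patent.
rng = np.random.default_rng(0)
D, P, Q = 8, 3, 2  # feature dim, identity-space rank, channel-space rank

mu = rng.normal(size=D)      # mean of all voice sample data
F = rng.normal(size=(D, P))  # identity space: between-class bases
G = rng.normal(size=(D, Q))  # error space: within-class (channel) bases

def sample_utterance(h_i):
    """One utterance of speaker i: h_i is fixed per speaker, while the
    channel factor w_ij and residual noise eps_ij are drawn fresh."""
    w_ij = rng.normal(size=Q)
    eps_ij = 0.1 * rng.normal(size=D)  # zero-mean Gaussian residual term
    return mu + F @ h_i + G @ w_ij + eps_ij

h_1 = rng.normal(size=P)      # speaker 1's voiceprint feature vector
x_11 = sample_utterance(h_1)  # same speaker...
x_12 = sample_utterance(h_1)  # ...under a different channel draw
print(x_11.shape, np.allclose(x_11, x_12))
```

The two utterances share the speaker term F h_1 but differ in their channel terms, which is exactly the within-speaker variation the model attributes to G w_ij + ε_ij.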
After the current voiceprint authentication vector corresponding to the target user's current voice data is extracted with the above voiceprint recognition model, the distance between the current voiceprint authentication vector and the standard voiceprint authentication vector corresponding to the target user identity is calculated according to a predetermined distance calculation formula. As an implementation manner, the predetermined distance calculation formula may be the one rendered as an image in the source (PCTCN2018102123-appb-000003), where D denotes the distance between the current voiceprint authentication vector extracted from the current voice data (PCTCN2018102123-appb-000005) and the standard voiceprint authentication vector corresponding to the target user identity carried in the identity verification request (PCTCN2018102123-appb-000004).
It can be understood that the larger the distance between the current voiceprint authentication vector and the standard voiceprint authentication vector, the less likely it is that the speakers corresponding to the two vectors are the same person. Therefore, a distance threshold is preset: when the calculated distance is less than or equal to the preset distance threshold, the voiceprint verification result is determined to be that the voiceprint verification passes, that is, the target user identity verification passes; otherwise, the voiceprint verification result is that the voiceprint verification fails, that is, the target user identity verification fails, and the identity verification result is fed back to the client.
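The distance formula itself appears only as an image in the source, so its exact form is not recoverable here. Cosine distance is a common choice for comparing voiceprint vectors and is used in this sketch purely as an assumption, together with the threshold decision just described.

```python
import math

def cosine_distance(a, b):
    # Assumed metric: 1 - cosine similarity. The patent's actual formula
    # is shown only as an image (PCTCN2018102123-appb-000003) and may differ.
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (norm_a * norm_b)

def voiceprint_passes(current_vec, standard_vec, threshold=0.3):
    # Smaller distance means the vectors are more likely to come from
    # the same speaker, so verification passes at or below the threshold.
    return cosine_distance(current_vec, standard_vec) <= threshold

print(voiceprint_passes([1.0, 0.0], [0.9, 0.1]))  # nearly parallel vectors
print(voiceprint_passes([1.0, 0.0], [0.0, 1.0]))  # orthogonal vectors
```

The threshold value is a tuning parameter: lowering it rejects more impostors at the cost of rejecting more genuine users.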
In other embodiments, after the current voiceprint authentication vector corresponding to the target user's current voice data and the standard voiceprint authentication vector corresponding to the target user identity are determined, in addition to calculating the distance between the current voiceprint feature vector and the standard voiceprint feature vector corresponding to the target user identity with the predetermined distance calculation formula, the distances between the current voiceprint feature vector and the pre-stored standard voiceprint feature vectors of each of the other predetermined users (for example, n users, where n is an integer and n > 0) are also calculated; that is, the distances D_i between the current voiceprint authentication vector and the standard voiceprint authentication vectors corresponding to all of the predetermined user identities are calculated respectively, where i is an integer and 0 < i ≤ n. The specific calculation is the same as in the above embodiment and is not repeated here.
Further, the distances between the current voiceprint feature vector and the standard voiceprint feature vectors corresponding to the predetermined user identities (which include the target user identity) are sorted in ascending order; the user identities corresponding to a third preset number (for example, 5) of top-ranked (smallest) distances are selected from the n distances, and it is determined whether the third preset number (for example, 5) of user identities includes the target user identity. When it does, the voiceprint verification result is determined to be that the voiceprint verification passes, that is, the target user identity verification passes; otherwise, the voiceprint verification result is that the voiceprint verification fails, that is, the target user identity verification fails, and the identity verification result is fed back to the client. It should be noted that the larger the third preset number, the more likely the voiceprint verification is to pass; however, recognition accuracy then cannot be guaranteed. Therefore, to improve the accuracy of voiceprint verification, the third preset number can be adjusted according to actual requirements (for example, adjusted to 2).
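The ranking check just described can be sketched as: compute the distance from the current vector to every enrolled identity, keep the third-preset-number of smallest distances, and test whether the target identity is among them. The identity names and distance values below are illustrative assumptions.

```python
def top_k_verification(distances, target_id, k=5):
    """distances maps each predetermined user identity to the distance
    between its standard vector and the current voiceprint vector.

    Verification passes when the target identity is among the k
    identities with the smallest distances (smaller = more similar);
    k corresponds to the third preset number and can be tightened
    (e.g. k=2) to trade pass rate for accuracy."""
    ranked = sorted(distances, key=distances.get)  # ascending by distance
    return target_id in ranked[:k]

# Illustrative identities and distances (not from the patent).
distances = {"M1": 0.12, "M2": 0.85, "M3": 0.40, "M4": 0.95, "M5": 0.33}
print(top_k_verification(distances, "M1", k=2))  # smallest distance of all
print(top_k_verification(distances, "M4", k=2))  # largest distance
```

With k=2 only the two closest identities pass, matching the remark that a smaller third preset number makes the check stricter.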
The server 1 proposed by the above embodiment redefines the voiceprint recognition model and trains it on voiceprint data collected through different channels, then uses the trained model to extract the current voiceprint authentication vector of the target user from the current voice data. This avoids, to some extent, large differences between the extracted voiceprint authentication vector and the actual one caused by differences in voice data collection channels, improving the accuracy of extraction. By calculating the distance between the current voiceprint authentication vector and the standard voiceprint authentication vectors corresponding to the predetermined user identities, and analyzing whether the target user identity is among the user identities corresponding to a preset number of minimum distances, it also improves, to some extent, the success rate of user identity verification.
Optionally, in other embodiments, the user identity verification program 10 may also be divided into one or more modules, the one or more modules being stored in the memory 11 and executed by one or more processors (the processor 12 in this embodiment) to complete the present application. A module referred to in the present application is a series of computer program instruction segments capable of performing a specific function. For example, FIG. 2 is a schematic diagram of the program modules of the user identity verification program 10 in FIG. 1. In this embodiment, the user identity verification program 10 can be divided into an acquisition module 110, a vector extraction module 120, a calculation module 130, and an analysis module 140. The functions or operation steps implemented by the modules 110-140 are similar to those described above and are not detailed here; exemplarily:
The acquisition module 110 is configured to receive an identity verification request carrying a target user identity, and acquire current voice data of the target user from a client;
The vector extraction module 120 is configured to input the current voice data into the trained voiceprint recognition model, determine the current voiceprint feature vector of the target user, and determine the standard voiceprint feature vector corresponding to the target user identity according to a predetermined mapping relationship between user identities and standard voiceprint feature vectors;
The calculation module 130 is configured to calculate the distance between the current voiceprint feature vector and the standard voiceprint feature vector using a predetermined distance calculation formula; and
The analysis module 140 is configured to analyze, according to the distance, whether the target user passes the identity verification, and send the identity verification result to the client.
In addition, the present application further provides a user identity verification method. FIG. 3 is a flowchart of a preferred embodiment of the user identity verification method of the present application. The method may be performed by an apparatus, and the apparatus may be implemented in software and/or hardware.
In this embodiment, the user identity verification method includes steps S1-S4:
Step S1: receive an identity verification request carrying a target user identity, and acquire the current voice data of the target user from the client.
Step S2: input the current voice data into a trained voiceprint recognition model to determine the current voiceprint feature vector of the target user, and determine the standard voiceprint feature vector corresponding to the target user identity according to a predetermined mapping between user identities and standard voiceprint feature vectors.
Step S3: calculate the distance between the current voiceprint feature vector and the standard voiceprint feature vector using a predetermined distance calculation formula.
Step S4: analyze, according to the distance, whether the target user passes the identity verification, and send the verification result to the client.
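Steps S1-S4 can be sketched end to end as follows. This is a minimal illustration, not the patent's implementation: the feature extractor, the mapping store, the threshold value and the Euclidean distance are all stand-in assumptions (the patent's own distance formula appears only as a figure).

```python
import math

# Hypothetical stand-ins for the pieces of the flow above; all names are
# illustrative, not taken from the patent.
def extract_vector(voice_data):
    # Placeholder "trained voiceprint recognition model": an L2-normalised
    # projection of the first few samples stands in for real feature extraction.
    v = voice_data[:4]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

STANDARD_VECTORS = {"M1": [1.0, 0.0, 0.0, 0.0]}  # ID -> enrolled standard vector

def verify(user_id, voice_data, threshold=0.5):
    current = extract_vector(voice_data)          # S1/S2: capture + extract
    standard = STANDARD_VECTORS[user_id]          # S2: mapping lookup
    distance = math.sqrt(sum((a - b) ** 2         # S3: distance (Euclidean here)
                             for a, b in zip(current, standard)))
    return distance <= threshold                  # S4: pass iff within threshold

print(verify("M1", [10.0, 0.1, 0.1, 0.1]))  # True: close to M1's standard vector
```

A rejected request follows the same path: a voice sample whose extracted vector lies far from the enrolled one exceeds the threshold and fails.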
In this embodiment, the client is a client computer or mobile terminal with a voice collection function used by the target user, and the target user sends the identity verification request through the client. After an identity verification request carrying a target user identity (for example, an ID card number) is received from the client, the real-time voice data of the user who issued the request must be collected in order to prevent the target user from performing a fraudulent operation: the client collects the current voice data of the target user, and a corresponding current voiceprint discrimination vector is constructed from the collected voice data. In addition, a corresponding standard voiceprint discrimination vector is set in advance for each predetermined user identity, yielding a predetermined mapping between user identities and standard voiceprint discrimination vectors, and this mapping is saved in a database (not shown in the figures); the predetermined user identities include the target user identity. For example, user identity M1 corresponds to the standard voiceprint discrimination vector shown as Figure PCTCN2018102123-appb-000006, and user identity M2 corresponds to the standard voiceprint discrimination vector shown as Figure PCTCN2018102123-appb-000007.
Then, according to the target user identity carried in the identity verification request, the mapping between user identities and standard voiceprint discrimination vectors is retrieved from the database, and the standard voiceprint discrimination vector corresponding to the target user identity is determined.
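The retrieval of the standard vector for the identity carried in the request can be sketched as a keyed lookup. A dict stands in for the database, and the vector values are made up for illustration:

```python
# Illustrative mapping store: user identity -> standard voiceprint vector.
# In the patent this mapping lives in a database; a dict stands in here.
standard_vectors = {
    "M1": [0.12, -0.48, 0.31],
    "M2": [-0.05, 0.77, -0.22],
}

def lookup_standard_vector(target_user_id):
    # Retrieve the standard vector enrolled for the ID carried in the request;
    # an unknown ID means no enrolment record exists for that user.
    if target_user_id not in standard_vectors:
        raise KeyError(f"no standard voiceprint enrolled for {target_user_id}")
    return standard_vectors[target_user_id]

print(lookup_standard_vector("M1"))  # -> [0.12, -0.48, 0.31]
```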
As an implementation, after the current voice data of the target user is collected, it is input into the pre-trained voiceprint recognition model to determine the current voiceprint discrimination vector corresponding to the current voice data.
Specifically, the voiceprint recognition model is obtained as follows: voice samples of a first preset number (for example, 5000) of users are acquired in advance, where each user's voice sample includes a second preset number (for example, 10) of different voice segment samples acquired through different channels (for example, different terminals); a voiceprint recognition model of the preset type is then trained with the acquired voice samples of each user to generate the trained voiceprint recognition model. Because the model is trained on voiceprint data collected through different channels, using it to obtain voiceprint discrimination vectors for voice data from different channels avoids, to a certain extent, large discrepancies between the extracted voiceprint discrimination vector and the actual one caused by differences in the voice data collection channel, improving the accuracy of voiceprint discrimination vector recognition.
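The assembly of such a multi-channel training set can be sketched as follows. Only the two preset counts come from the text; the channel names and the index layout are illustrative assumptions:

```python
# Sketch of assembling the training set described above: a first preset number
# of users, each contributing a second preset number of voice segments drawn
# from different channels (e.g. different terminal types).
NUM_USERS = 5000      # first preset number (example value from the text)
NUM_SEGMENTS = 10     # second preset number per user (example value)
CHANNELS = ["handset", "landline", "app_mic", "web_mic", "car_kit"]

def build_training_index(num_users, num_segments):
    index = []
    for user in range(num_users):
        for seg in range(num_segments):
            # Rotate through channels so each user's segments span several
            # channels, which is what lets the model separate speaker
            # characteristics from channel effects.
            channel = CHANNELS[seg % len(CHANNELS)]
            index.append({"user": user, "segment": seg, "channel": channel})
    return index

index = build_training_index(3, 10)  # small numbers for the demo
print(len(index), index[0])
```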
Further, before the voiceprint recognition model is trained, it needs to be defined. In this embodiment, the voiceprint recognition model includes a speaker space feature term representing an eigenvoice space matrix and a channel space feature term representing an eigenchannel space matrix. It should be noted that the speaker space feature term relates only to the speaker and not to what the speaker says; it expresses the between-class differences among speakers. For convenience of computation, these feature terms are collected into a matrix, the eigenvoice space matrix, whose content is defined as the speaker feature term and contains information unique to the corresponding speaker; this term differs from person to person. The channel space feature term expresses the variation within a single speaker, that is, the noise differences caused by different channels. For convenience of computation, these feature terms are likewise collected into a matrix, the eigenchannel space matrix, whose content is defined as the channel space feature term and contains the voiceprint differences introduced when the same speaker is recorded through different channels; that is, for the same speech of the same person passed through different channels, this term differs. The speaker space feature term includes the speaker voiceprint feature vector, and the channel space feature term includes the channel factor feature vector.
Preferably, the model formula of the voiceprint recognition model is:
X_ij = μ + F·h_i + G·w_ij + ε_ij
where X_ij denotes the j-th speech segment of the i-th speaker; μ denotes the mean of all voice sample data; F denotes the identity space and contains the bases used to represent the various identities, each column of F corresponding to a feature vector of the between-class space; h_i denotes the voiceprint feature vector of the i-th speaker; G denotes the error space and contains the bases used to represent the different variations of the same identity, each column of G corresponding to a feature vector of the within-class space; w_ij denotes the channel factor feature vector of the j-th speech segment of the i-th speaker; and ε_ij denotes a residual noise term representing factors not yet explained, which may follow a zero-mean Gaussian distribution. "μ + F·h_i" is the speaker space feature term, and "G·w_ij + ε_ij" is the channel space feature term. It should be noted that the voiceprint feature vectors h_i corresponding to different speech segments of the same speaker are identical, and the G·w_ij + ε_ij factor relationship can be learned through model training.
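The model equation X_ij = μ + F·h_i + G·w_ij + ε_ij can be illustrated numerically. The sketch below synthesizes two segments of one speaker: both share the same speaker vector h_i, while the channel factors w_ij differ. The dimensions, the random bases and the NumPy usage are illustrative assumptions, not taken from the patent:

```python
import numpy as np

rng = np.random.default_rng(0)
D, Q, R = 8, 3, 2            # feature dim, identity dim, channel dim (illustrative)
mu = rng.normal(size=D)      # mean of all voice sample data
F = rng.normal(size=(D, Q))  # identity space: columns span between-class variation
G = rng.normal(size=(D, R))  # error space: columns span within-class (channel) variation

def synthesize(h_i, w_ij, noise_scale=0.01):
    # X_ij = mu + F h_i + G w_ij + eps_ij -- the model equation above
    eps = rng.normal(scale=noise_scale, size=D)
    return mu + F @ h_i + G @ w_ij + eps

h_speaker = rng.normal(size=Q)                  # h_i is shared by all the speaker's segments
x1 = synthesize(h_speaker, rng.normal(size=R))  # one segment, one channel factor
x2 = synthesize(h_speaker, rng.normal(size=R))  # same speaker, different channel factor
print(x1.shape, x2.shape)
```

The two synthesized segments differ only through G·w_ij and the noise term, which is exactly the within-speaker variation the eigenchannel space is meant to absorb.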
After the current voiceprint discrimination vector corresponding to the target user's current voice data has been extracted with the above voiceprint recognition model, the distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector corresponding to the target user identity is calculated according to the predetermined distance calculation formula. As an implementation, the predetermined distance calculation formula may be:
D = [formula shown as Figure PCTCN2018102123-appb-000008]
where D denotes the distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector corresponding to the target user identity, [Figure PCTCN2018102123-appb-000009] denotes the standard voiceprint discrimination vector corresponding to the target user identity carried in the identity verification request, and [Figure PCTCN2018102123-appb-000010] denotes the current voiceprint discrimination vector extracted from the current voice data.
Understandably, the larger the distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector, the less likely it is that the speakers corresponding to the two vectors are the same person. Therefore, a distance threshold is preset: when the calculated distance is less than or equal to the preset distance threshold, the voiceprint verification result is that the verification passes, i.e., the target user passes the identity verification; otherwise, the voiceprint verification result is that the verification fails, i.e., the target user fails the identity verification. The verification result is then fed back to the client.
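The threshold rule above amounts to a one-line decision; the threshold value itself is a tunable assumption:

```python
def voiceprint_decision(distance, threshold):
    # Larger distance => less likely the same speaker, so verification passes
    # only while the distance stays at or below the preset threshold.
    return "pass" if distance <= threshold else "fail"

print(voiceprint_decision(0.3, 0.5))  # pass
print(voiceprint_decision(0.9, 0.5))  # fail
```

Note that the boundary case (distance exactly equal to the threshold) passes, matching the "less than or equal to" wording above.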
In other embodiments, step S3 may be replaced by: calculating, using the predetermined distance calculation formula, the distances between the current voiceprint feature vector and the standard voiceprint feature vectors corresponding to each of the predetermined user identities.
After the current voiceprint discrimination vector corresponding to the target user's current voice data and the standard voiceprint discrimination vector corresponding to the target user identity have been determined, in addition to the distance between the current voiceprint feature vector and the standard voiceprint feature vector corresponding to the target user identity, the distances between the current voiceprint feature vector and the pre-stored standard voiceprint feature vectors of each of the other predetermined users (for example, n users, where n is an integer and n > 0) are also calculated. That is, the distances D_i between the current voiceprint discrimination vector and the standard voiceprint discrimination vectors corresponding to all of the above predetermined user identities are calculated, where i is an integer and 0 < i ≤ n; the calculation is the same as in the above embodiment and is not repeated here.
Further, step S4 may be replaced by: sorting the distances between the current voiceprint feature vector and the standard voiceprint feature vectors corresponding to each predetermined user identity in ascending order, where the predetermined user identities include the target user identity; selecting from the n distances the user identities corresponding to the third preset number (for example, 5) of smallest distances; and checking whether this third preset number (for example, 5) of user identities includes the target user identity. When the third preset number (for example, 5) of user identities includes the target user identity, the voiceprint verification result is that the verification passes, i.e., the target user passes the identity verification; otherwise, the voiceprint verification result is that the verification fails, i.e., the target user fails the identity verification, and the verification result is fed back to the client. It should be noted that the larger the third preset number, the more likely the voiceprint verification is to pass; however, the accuracy of the recognition then cannot be guaranteed. Therefore, to improve the accuracy of voiceprint verification, the third preset number can be adjusted according to actual needs (for example, reduced to 2).
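This ranking-based variant of step S4 can be sketched as follows. The Euclidean distance and the example vectors are illustrative stand-ins (the patent's distance formula appears only as a figure); k plays the role of the third preset number:

```python
import math

def euclidean(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def verify_by_ranking(current, standard_vectors, target_id, k):
    # Rank every enrolled identity by its distance to the current vector
    # (smallest first) and pass the target user iff their identity is among
    # the k nearest enrolled identities.
    ranked = sorted(standard_vectors,
                    key=lambda uid: euclidean(current, standard_vectors[uid]))
    return target_id in ranked[:k]

vectors = {  # illustrative enrolled standard voiceprint vectors
    "M1": [0.0, 0.0], "M2": [1.0, 0.0], "M3": [0.0, 1.0], "M4": [5.0, 5.0],
}
print(verify_by_ranking([0.1, 0.1], vectors, "M1", k=2))  # True: M1 is nearest
```

Raising k makes passing easier at the cost of accuracy, which is the trade-off noted above.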
The user identity verification method proposed in the above embodiment redefines the voiceprint recognition model and extracts the target user's current voiceprint discrimination vector from the current voice data using a voiceprint recognition model trained on voiceprint data collected through different channels, which to a certain extent avoids large discrepancies between the extracted voiceprint discrimination vector and the actual one caused by differences in the voice data collection channel, improving the accuracy of voiceprint discrimination vector extraction. By calculating the distances between the current voiceprint discrimination vector and the standard voiceprint discrimination vectors corresponding to the predetermined user identities, and analyzing whether the target user passes the identity verification according to whether the user identities corresponding to the preset number of smallest distances include the target user identity, the success rate of user identity verification is improved to a certain extent.
In addition, an embodiment of the present application further provides a computer-readable storage medium on which a user identity verification program 10 is stored. When executed by a processor, the program implements the following operations:
receiving an identity verification request carrying a target user identity, and acquiring the current voice data of the target user from the client;
inputting the current voice data into a trained voiceprint recognition model to determine the current voiceprint feature vector of the target user, and determining the standard voiceprint feature vector corresponding to the target user identity according to a predetermined mapping between user identities and standard voiceprint feature vectors;
calculating the distance between the current voiceprint feature vector and the standard voiceprint feature vector using a predetermined distance calculation formula; and
analyzing, according to the distance, whether the target user passes the identity verification, and sending the verification result to the client.
The specific implementation of the computer-readable storage medium of the present application is substantially the same as the embodiments of the user identity verification method described above and is not repeated here.
It should be noted that the serial numbers of the above embodiments of the present application are for description only and do not indicate the relative merits of the embodiments. The terms "comprise", "include" and any variants thereof herein are intended to cover a non-exclusive inclusion, so that a process, apparatus, article or method that includes a series of elements includes not only those elements but also other elements not expressly listed, or elements inherent to such a process, apparatus, article or method. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of additional identical elements in the process, apparatus, article or method that includes that element.
Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the above embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part contributing to the prior art, can be embodied in the form of a software product. The computer software product is stored on a storage medium as described above (such as a ROM/RAM, a magnetic disk or an optical disc) and includes instructions for causing a terminal device (which may be a mobile phone, a computer, a server, a network device or the like) to perform the methods described in the embodiments of the present application.
The above are only preferred embodiments of the present application and do not thereby limit its patent scope. Any equivalent structural or process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included within the patent protection scope of the present application.

Claims (20)

  1. A user identity verification method, comprising:
    receiving an identity verification request carrying a target user identity, and acquiring the current voice data of the target user from the client;
    inputting the current voice data into a trained voiceprint recognition model to determine the current voiceprint feature vector of the target user, and determining the standard voiceprint feature vector corresponding to the target user identity according to a predetermined mapping between user identities and standard voiceprint feature vectors;
    calculating the distance between the current voiceprint feature vector and the standard voiceprint feature vector using a predetermined distance calculation formula; and
    analyzing, according to the distance, whether the target user passes the identity verification, and sending the verification result to the client.
  2. The user identity verification method according to claim 1, wherein the step of "analyzing, according to the distance, whether the target user passes the identity verification" comprises:
    when the calculated distance is less than or equal to a preset threshold, determining that the target user passes the identity verification; or
    when the calculated distance is greater than the preset threshold, determining that the target user fails the identity verification.
  3. The user identity verification method according to claim 1 or 2, wherein the training process of the voiceprint recognition model comprises:
    acquiring in advance voice samples of a first preset number of users, each user's voice sample including a second preset number of different voice segment samples, and training a voiceprint recognition model of a preset type with the acquired voice samples of each user to generate the trained voiceprint recognition model.
  4. The user identity verification method according to claim 3, wherein the voiceprint recognition model includes a user space feature term representing an eigenvoice space matrix and a channel space feature term representing an eigenchannel space matrix, the user space feature term including a user voiceprint feature vector and the channel space feature term including a channel factor feature vector.
  5. The user identity verification method according to claim 4, wherein the formula of the voiceprint recognition model is:
    X_ij = μ + F·h_i + G·w_ij + ε_ij
    where X_ij denotes the j-th speech segment of the i-th speaker; μ denotes the mean of all voice sample data; F denotes the identity space and contains the bases used to represent the various identities, each column of F corresponding to a feature vector of the between-class space; h_i denotes the voiceprint feature vector of the i-th speaker; G denotes the error space and contains the bases used to represent the different variations of the same identity, each column of G corresponding to a feature vector of the within-class space; w_ij denotes the channel factor feature vector of the j-th speech segment of the i-th speaker; ε_ij denotes a residual noise term; "μ + F·h_i" is the speaker space feature term; and "G·w_ij + ε_ij" is the channel space feature term.
  6. The user identity verification method according to claim 1, wherein the step of "calculating the distance between the current voiceprint feature vector and the standard voiceprint feature vector using a predetermined distance calculation formula" may be replaced by:
    calculating, using the predetermined distance calculation formula, the distances between the current voiceprint feature vector and the standard voiceprint feature vectors corresponding to each predetermined user identity.
  7. The user identity verification method according to claim 6, wherein the step of "analyzing, according to the distance, whether the target user passes the identity verification" comprises:
    sorting the distances between the current voiceprint feature vector and the standard voiceprint feature vectors corresponding to each predetermined user identity in ascending order, the predetermined user identities including the target user identity;
    selecting the user identities corresponding to the third preset number of smallest distances, and checking whether this third preset number of user identities includes the target user identity;
    when the third preset number of user identities includes the target user identity, determining that the target user passes the identity verification; or
    when the third preset number of user identities does not include the target user identity, determining that the target user fails the identity verification.
  8. A user identity verification server, comprising a memory and a processor, the memory storing a user identity verification program executable on the processor, the program, when executed by the processor, implementing the following steps:
    receiving an identity verification request carrying a target user identity, and acquiring the current voice data of the target user from the client;
    inputting the current voice data into a trained voiceprint recognition model to determine the current voiceprint feature vector of the target user, and determining the standard voiceprint feature vector corresponding to the target user identity according to a predetermined mapping between user identities and standard voiceprint feature vectors;
    calculating the distance between the current voiceprint feature vector and the standard voiceprint feature vector using a predetermined distance calculation formula; and
    analyzing, according to the distance, whether the target user passes the identity verification, and sending the verification result to the client.
  9. The user identity verification server according to claim 8, wherein the step of "analyzing, according to the distance, whether the target user passes the identity verification" comprises:
    when the calculated distance is less than or equal to a preset threshold, determining that the target user passes the identity verification; or
    when the calculated distance is greater than the preset threshold, determining that the target user fails the identity verification.
  10. The user identity verification server according to claim 8 or 9, wherein the training process of the voiceprint recognition model comprises:
    acquiring in advance voice samples of a first preset number of users, each user's voice sample including a second preset number of different voice segment samples, and training a voiceprint recognition model of a preset type with the acquired voice samples of each user to generate the trained voiceprint recognition model.
  11. The user identity verification server according to claim 10, wherein the voiceprint recognition model includes a user space feature term representing an eigenvoice space matrix and a channel space feature term representing an eigenchannel space matrix, the user space feature term including a user voiceprint feature vector and the channel space feature term including a channel factor feature vector.
  12. The user identity verification server according to claim 11, wherein the formula of the voiceprint recognition model is:
    X_ij = μ + F·h_i + G·w_ij + ε_ij
    where X_ij denotes the j-th speech segment of the i-th speaker; μ denotes the mean of all voice sample data; F denotes the identity space and contains the bases used to represent the various identities, each column of F corresponding to a feature vector of the between-class space; h_i denotes the voiceprint feature vector of the i-th speaker; G denotes the error space and contains the bases used to represent the different variations of the same identity, each column of G corresponding to a feature vector of the within-class space; w_ij denotes the channel factor feature vector of the j-th speech segment of the i-th speaker; ε_ij denotes a residual noise term; "μ + F·h_i" is the speaker space feature term; and "G·w_ij + ε_ij" is the channel space feature term.
  13. The user identity verification server according to claim 8, wherein the step of "calculating the distance between the current voiceprint feature vector and the standard voiceprint feature vector using a predetermined distance calculation formula" may be replaced by:
    calculating, using the predetermined distance calculation formula, the distances between the current voiceprint feature vector and the standard voiceprint feature vectors corresponding to each predetermined user identity.
  14. 如权利要求13所述的用户身份验证服务器,其特征在于,所述“根据所述距离分析目标用户是否通过身份验证”的步骤包括:The user identity verification server according to claim 13, wherein the step of "analysing whether the target user is authenticated according to the distance" comprises:
    按照从大到小的顺序,对所述当前声纹特征向量与各个预先确定的用户身份标识对应的标准声纹特征向量之间的距离进行排序,所述各个预先确定的用户身份标识中包括目标用户身份标识;Sorting the distance between the current voiceprint feature vector and the standard voiceprint feature vector corresponding to each predetermined user identity in an order from large to small, wherein each predetermined user identity includes a target User identity
    筛选出排序在前的距离对应的第三预设数量的用户身份标识,判断该第三预设数量的用户身份标识中是否包含目标用户身份标识;And filtering a third preset number of user identifiers corresponding to the previous distance, and determining whether the third preset number of user identifiers include the target user identifier;
    当所述第三预设数量的用户身份标识中包含目标用户身份标识时,判断目标用户身份验证通过;或When the third preset number of user identifiers include the target user identity, determining that the target user identity is verified; or
    当所述第三预设数量的用户身份标识中不包含目标用户身份标识时,判断目标用户身份验证失败。When the target user identity is not included in the third preset number of user identifiers, it is determined that the target user identity verification fails.
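The ranked-selection decision in claim 14 can be sketched as follows. Note the claim orders values from large to small before taking the leading entries, which fits a similarity-style score (larger means closer); the sketch adopts that reading. The score values, user IDs, and the function name are hypothetical, and the claims do not fix the comparison formula.

```python
def verify_top_n(scores, target_id, n):
    """scores: dict mapping each enrolled user identity to the score between
    that user's standard voiceprint feature vector and the current one
    (larger = closer, matching the claim's large-to-small ordering).
    Returns True if target_id ranks among the top n identities."""
    ranked = sorted(scores, key=scores.get, reverse=True)  # large to small
    return target_id in ranked[:n]   # pass iff target is in the shortlist

# Hypothetical enrolled identities and scores:
scores = {"user_a": 0.92, "user_b": 0.40, "user_c": 0.77, "user_d": 0.15}
print(verify_top_n(scores, "user_c", 2))  # among the 2 highest -> passes
print(verify_top_n(scores, "user_d", 2))  # not in the shortlist -> fails
```

With n equal to the "third preset number", the target user is verified exactly when its identity survives the shortlist filter.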
  15. A computer-readable storage medium, wherein the computer-readable storage medium stores a user identity verification program which, when executed by a processor, implements the following steps:
    receiving an identity verification request carrying a target user identity, and obtaining current voice data of the target user from a client;
    inputting the current voice data into a trained voiceprint recognition model to determine a current voiceprint feature vector of the target user, and determining, according to a predetermined mapping relationship between user identities and standard voiceprint feature vectors, the standard voiceprint feature vector corresponding to the target user identity;
    calculating the distance between the current voiceprint feature vector and the standard voiceprint feature vector using a predetermined distance calculation formula; and
    analyzing, according to the distance, whether the target user passes identity verification, and sending the identity verification result to the client.
  16. The computer-readable storage medium according to claim 15, wherein the step of "analyzing, according to the distance, whether the target user passes identity verification" comprises:
    when the calculated distance is less than or equal to a preset threshold, determining that the target user passes identity verification; or
    when the calculated distance is greater than the preset threshold, determining that the target user fails identity verification.
  17. The computer-readable storage medium according to claim 15 or 16, wherein the training process of the voiceprint recognition model comprises:
    pre-acquiring voice samples of a first preset number of users, the voice samples of each user including a second preset number of different voice segment samples, and training a voiceprint recognition model of the preset type with the acquired voice samples of each user to generate the trained voiceprint recognition model.
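The training-data layout described above can be sketched as a speakers-by-utterances array, from which the global mean and the between-/within-class scatter that underlie the identity space F and error space G are computed. This is a simplified moment-based illustration under assumed preset sizes, not the actual estimation procedure (which would fit F and G iteratively, e.g. with EM).

```python
import numpy as np

# Hypothetical preset sizes: 4 users (first preset number), each
# contributing 3 voice segment samples (second preset number).
rng = np.random.default_rng(1)
n_speakers, n_utts, dim = 4, 3, 6

# samples[i][j] is the feature vector of user i's j-th voice segment
samples = rng.normal(size=(n_speakers, n_utts, dim))

mu = samples.reshape(-1, dim).mean(axis=0)   # mean of all speech sample data
class_means = samples.mean(axis=1)           # per-speaker means
within = samples - class_means[:, None, :]   # within-speaker residuals

# Between-class scatter informs the identity space F; within-class
# scatter informs the error space G.
between_scatter = np.cov((class_means - mu).T)
within_scatter = np.cov(within.reshape(-1, dim).T)
```

The point of the layout is that each speaker must contribute several segments: with only one segment per speaker, the within-class residuals vanish and the error space cannot be estimated.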
  18. The computer-readable storage medium according to claim 17, wherein the voiceprint recognition model comprises a user-space feature term representing an eigenvoice space matrix and a channel-space feature term representing an eigenchannel space matrix, the user-space feature term including a user voiceprint feature vector and the channel-space feature term including a channel-factor feature vector.
  19. The computer-readable storage medium according to claim 15, wherein the step of "calculating the distance between the current voiceprint feature vector and the standard voiceprint feature vector using a predetermined distance calculation formula" may be replaced with:
    calculating, using the predetermined distance calculation formula, the distances between the current voiceprint feature vector and the standard voiceprint feature vectors corresponding to the respective predetermined user identities.
  20. The computer-readable storage medium according to claim 19, wherein the step of "analyzing, according to the distance, whether the target user passes identity verification" comprises:
    sorting, in descending order, the distances between the current voiceprint feature vector and the standard voiceprint feature vectors corresponding to the respective predetermined user identities, the respective predetermined user identities including the target user identity;
    selecting a third preset number of user identities corresponding to the top-ranked distances, and determining whether the third preset number of user identities include the target user identity;
    when the third preset number of user identities include the target user identity, determining that the target user passes identity verification; or
    when the third preset number of user identities do not include the target user identity, determining that the target user fails identity verification.
PCT/CN2018/102123 2018-04-09 2018-08-24 User identity authentication method, server and storage medium WO2019196303A1 (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810311098.0 2018-04-09
CN201810311098.0A CN108766444B (en) 2018-04-09 2018-04-09 User identity authentication method, server and storage medium

Publications (1)

Publication Number Publication Date
WO2019196303A1 true WO2019196303A1 (en) 2019-10-17

Family

ID=63981534

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2018/102123 WO2019196303A1 (en) 2018-04-09 2018-08-24 User identity authentication method, server and storage medium

Country Status (2)

Country Link
CN (1) CN108766444B (en)
WO (1) WO2019196303A1 (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US12019718B2 (en) 2019-04-24 2024-06-25 Tencent Technology (Shenzhen) Company Limited Identity verification method and apparatus, computer device and storage medium

Families Citing this family (11)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111199742A (en) * 2018-11-20 2020-05-26 阿里巴巴集团控股有限公司 Identity verification method and device and computing equipment
CN109462603A (en) * 2018-12-14 2019-03-12 平安城市建设科技(深圳)有限公司 Voiceprint authentication method, equipment, storage medium and device based on blind Detecting
CN109753778A (en) * 2018-12-30 2019-05-14 北京城市网邻信息技术有限公司 Checking method, device, equipment and the storage medium of user
CN109994118B (en) * 2019-04-04 2022-10-11 平安科技(深圳)有限公司 Voice password verification method and device, storage medium and computer equipment
CN110059465B (en) * 2019-04-24 2023-07-25 腾讯科技(深圳)有限公司 Identity verification method, device and equipment
CN110400567B (en) * 2019-07-30 2021-10-19 深圳秋田微电子股份有限公司 Dynamic update method for registered voiceprint and computer storage medium
CN111402899B (en) * 2020-03-25 2023-10-13 中国工商银行股份有限公司 Cross-channel voiceprint recognition method and device
CN111833068A (en) * 2020-07-31 2020-10-27 重庆富民银行股份有限公司 Identity verification system and method based on voiceprint recognition
CN112509586A (en) * 2020-12-17 2021-03-16 中国工商银行股份有限公司 Method and device for recognizing voice print of telephone channel
CN112652314A (en) * 2020-12-30 2021-04-13 太平金融科技服务(上海)有限公司 Method, device, equipment and medium for verifying disabled object based on voiceprint shading
CN113282072B (en) * 2021-07-19 2021-11-02 江铃汽车股份有限公司 Vehicle remote diagnosis method, device, storage medium and system

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2013154805A1 (en) * 2012-04-09 2013-10-17 Sony Computer Entertainment Inc. Text dependent speaker recognition with long-term feature
US20160027444A1 (en) * 2014-07-22 2016-01-28 Nuance Communications, Inc. Method and apparatus for detecting splicing attacks on a speaker verification system
CN106453205A (en) * 2015-08-07 2017-02-22 阿里巴巴集团控股有限公司 Identity verification method and identity verification device
CN106506524A (en) * 2016-11-30 2017-03-15 百度在线网络技术(北京)有限公司 Method and apparatus for verifying user
CN107395352A (en) * 2016-05-16 2017-11-24 腾讯科技(深圳)有限公司 Personal identification method and device based on vocal print

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2000058947A1 (en) * 1999-03-31 2000-10-05 Veritel Corporation User authentication for consumer electronics
CN103730114A (en) * 2013-12-31 2014-04-16 上海交通大学无锡研究院 Mobile equipment voiceprint recognition method based on joint factor analysis model
CN107068154A (en) * 2017-03-13 2017-08-18 平安科技(深圳)有限公司 The method and system of authentication based on Application on Voiceprint Recognition
CN107690036A (en) * 2017-06-24 2018-02-13 平安科技(深圳)有限公司 Electronic installation, inlet wire personal identification method and computer-readable recording medium
CN107610709B (en) * 2017-08-01 2021-03-19 百度在线网络技术(北京)有限公司 Method and system for training voiceprint recognition model

Also Published As

Publication number Publication date
CN108766444A (en) 2018-11-06
CN108766444B (en) 2020-11-03

Similar Documents

Publication Publication Date Title
WO2019196303A1 (en) User identity authentication method, server and storage medium
JP6429945B2 (en) Method and apparatus for processing audio data
WO2019109526A1 (en) Method and device for age recognition of face image, storage medium
CN108768654B (en) Identity verification method based on voiceprint recognition, server and storage medium
WO2019218514A1 (en) Method for extracting webpage target information, device, and storage medium
WO2019205369A1 (en) Electronic device, identity recognition method based on human face image and voiceprint information, and storage medium
US9979721B2 (en) Method, server, client and system for verifying verification codes
WO2018090641A1 (en) Method, apparatus and device for identifying insurance policy number, and computer-readable storage medium
CN107680294B (en) House property information inquiry method, system, terminal equipment and storage medium
WO2019033525A1 (en) Au feature recognition method, device and storage medium
US20220012512A1 (en) Intelligent gallery management for biometrics
CN110348362B (en) Label generation method, video processing method, device, electronic equipment and storage medium
WO2014082496A1 (en) Method and apparatus for identifying client characteristic and storage medium
CN110795714A (en) Identity authentication method and device, computer equipment and storage medium
US10489637B2 (en) Method and device for obtaining similar face images and face image information
CN109636582B (en) Credit information management method, apparatus, device and storage medium
CN106056083B (en) A kind of information processing method and terminal
CN108491709A (en) The method and apparatus of permission for identification
CN116305076B (en) Signature-based identity information registration sample online updating method, system and storage medium
CN107766868A (en) A kind of classifier training method and device
CN112330331A (en) Identity verification method, device and equipment based on face recognition and storage medium
CN112418167A (en) Image clustering method, device, equipment and storage medium
JP5812505B2 (en) Demographic analysis method and system based on multimodal information
CN111553241A (en) Method, device and equipment for rejecting mismatching points of palm print and storage medium
CN111062301A (en) Identity authentication method and device, electronic equipment and computer readable storage medium

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 18914857

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

122 Ep: pct application non-entry in european phase

Ref document number: 18914857

Country of ref document: EP

Kind code of ref document: A1