CN108650266B - Server, voiceprint verification method and storage medium - Google Patents


Info

Publication number
CN108650266B
Authority
CN
China
Prior art keywords
voiceprint
graphic code
verification
current
voice data
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810457267.1A
Other languages
Chinese (zh)
Other versions
CN108650266A (en)
Inventor
程序
彭俊清
王健宗
肖京
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd filed Critical Ping An Technology Shenzhen Co Ltd
Priority to CN201810457267.1A priority Critical patent/CN108650266B/en
Priority to PCT/CN2018/102049 priority patent/WO2019218512A1/en
Publication of CN108650266A publication Critical patent/CN108650266A/en
Application granted granted Critical
Publication of CN108650266B publication Critical patent/CN108650266B/en

Classifications

    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 - Network architectures or network communication protocols for network security
    • H04L63/08 - Network architectures or network communication protocols for network security for authentication of entities
    • H04L63/0861 - Network architectures or network communication protocols for network security for authentication of entities using biometrical features, e.g. fingerprint, retina-scan
    • G - PHYSICS
    • G10 - MUSICAL INSTRUMENTS; ACOUSTICS
    • G10L - SPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00 - Speaker identification or verification
    • G10L17/22 - Interactive procedures; Man-machine interfaces
    • G10L17/24 - Interactive procedures; Man-machine interfaces; the user being prompted to utter a password or a predefined phrase
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 - Network architectures or network communication protocols for network security
    • H04L63/08 - Network architectures or network communication protocols for network security for authentication of entities
    • H04L63/0807 - Network architectures or network communication protocols for network security for authentication of entities using tickets, e.g. Kerberos
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 - Network architectures or network communication protocols for network security
    • H04L63/08 - Network architectures or network communication protocols for network security for authentication of entities
    • H04L63/083 - Network architectures or network communication protocols for network security for authentication of entities using passwords
    • H04L63/0838 - Network architectures or network communication protocols for network security for authentication of entities using passwords using one-time-passwords
    • H - ELECTRICITY
    • H04 - ELECTRIC COMMUNICATION TECHNIQUE
    • H04L - TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 - Network architectures or network communication protocols for network security
    • H04L63/08 - Network architectures or network communication protocols for network security for authentication of entities
    • H04L63/0876 - Network architectures or network communication protocols for network security for authentication of entities based on the identity of the terminal or configuration, e.g. MAC address, hardware or software configuration or device fingerprint

Abstract

The invention relates to a server, a voiceprint verification method and a storage medium. The method comprises the following steps: after receiving an identity authentication request, generating a graphic code parameter of a graphic code corresponding to the user identity and sending the graphic code parameter to the client computer; after the handheld terminal parses the graphic code, receiving a voiceprint verification request carrying a random key sent by the handheld terminal through a voiceprint data acquisition link address, and analyzing whether the two random keys are consistent; if so, establishing a voice data acquisition channel with the handheld terminal and obtaining, over that channel, the user's current voiceprint verification voice data collected by the handheld terminal; and constructing the corresponding current voiceprint authentication vector, determining the standard voiceprint authentication vector corresponding to the user identity, calculating the distance between the current and standard voiceprint authentication vectors, and generating an identity verification result based on the calculated distance. The method can improve the flexibility of voiceprint verification and prevent voice hijacking.

Description

Server, voiceprint verification method and storage medium
Technical Field
The present invention relates to the field of communications technologies, and in particular, to a server, a voiceprint verification method, and a storage medium.
Background
Currently, verifying user identity with voiceprint verification technology has become an important verification means for large customer-service companies (e.g., banks, insurance companies and game companies). The traditional scheme for user identity authentication with voiceprint verification technology is as follows: a dedicated client program is developed against the interface of the voiceprint verification server; the developed client program collects and pre-processes the user's voice, and then transmits the pre-processed voiceprint data to the voiceprint verification server, which authenticates and processes the transmitted voiceprint data.
However, the conventional voiceprint authentication scheme has disadvantages: the user's voice must be collected through the specially developed client program, which in practice is inflexible and easily disturbed by background voices; moreover, when a client computer is used to collect the voice, the audio is easily hijacked, so the authenticity of the voiceprint verification cannot be reliably controlled and security cannot be guaranteed.
Disclosure of Invention
The invention aims to provide a server, a voiceprint verification method and a storage medium that improve the flexibility of voiceprint verification and prevent voice hijacking.
In order to achieve the above object, the present invention provides a server, including a memory and a processor connected to the memory, wherein the memory stores a processing program executable on the processor, and the processing program, when executed by the processor, implements the following steps:
a generating step: after receiving an identity authentication request carrying a user identity sent by a client computer, generating a graphic code parameter of a graphic code corresponding to the user identity, and sending the graphic code parameter to the client computer, so that the client computer generates and displays the corresponding graphic code, wherein the graphic code parameter comprises a random key and a voiceprint data acquisition link address;
an analysis step: after the handheld terminal parses the graphic code to obtain the random key and the voiceprint data acquisition link address, receiving the voiceprint verification request carrying the random key sent by the handheld terminal through the voiceprint data acquisition link address, and analyzing whether the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal;
if so, establishing a voice data acquisition channel with the handheld terminal, and acquiring current voiceprint verification voice data of the user acquired from the handheld terminal based on the voice data acquisition channel;
and a verification step, namely constructing a current voiceprint authentication vector corresponding to the current voiceprint authentication voice data, determining a standard voiceprint authentication vector corresponding to the user identity according to a mapping relation between a preset user identity and the standard voiceprint authentication vector, calculating the distance between the current voiceprint authentication vector and the standard voiceprint authentication vector, generating an authentication result based on the calculated distance, and sending the authentication result to the client computer.
Preferably, the analyzing step specifically includes:
the server receives the voiceprint verification request carrying the random key sent by the handheld terminal through the voiceprint data acquisition link address, and analyzes whether the number of times the random key has been received is greater than a preset number of times;
and if that number is less than or equal to the preset number of times, analyzing whether the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal.
Preferably, the graphic code parameters further include an effective time of the graphic code, and the analyzing step specifically includes:
the server receives the voiceprint verification request carrying the random key sent by the handheld terminal through the voiceprint data acquisition link address, and analyzes whether the time at which the random key was received falls within the valid time range of the graphic code;
if it does, analyzing whether the number of times the random key has been received is greater than a preset number of times;
and if that number is less than or equal to the preset number of times, analyzing whether the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal.
Preferably, the step of constructing the current voiceprint authentication vector corresponding to the current voiceprint authentication voice data specifically includes:
processing the current voiceprint verification voice data to extract preset type voiceprint features, and constructing corresponding voiceprint feature vectors based on the preset type voiceprint features;
inputting the voiceprint feature vector into a pre-trained background channel model to construct a current voiceprint identification vector corresponding to the current voiceprint verification voice data;
the step of calculating the distance between the current voiceprint authentication vector and the standard voiceprint authentication vector and generating an identity verification result based on the calculated distance comprises:
calculating the cosine distance between the current voiceprint identification vector and the standard voiceprint identification vector:
d(x̄, ȳ) = 1 - (x̄ · ȳ) / (||x̄|| ||ȳ||)
wherein x̄ is the standard voiceprint identification vector, and ȳ is the current voiceprint identification vector;
if the cosine distance is smaller than or equal to a preset distance threshold, generating information that the verification is passed;
and if the cosine distance is greater than a preset distance threshold, generating information that the verification fails.
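As a rough illustration of this decision rule, the comparison might look like the following sketch, where the identification vectors are taken to be plain float arrays, the distance is computed as 1 minus the cosine similarity, and the threshold value 0.3 is purely illustrative (the patent leaves the preset distance threshold unspecified):

```python
import numpy as np

def cosine_distance(standard_vec, current_vec):
    """1 minus the cosine similarity of the two voiceprint identification vectors."""
    standard_vec = np.asarray(standard_vec, dtype=float)
    current_vec = np.asarray(current_vec, dtype=float)
    cos_sim = np.dot(standard_vec, current_vec) / (
        np.linalg.norm(standard_vec) * np.linalg.norm(current_vec))
    return 1.0 - cos_sim

def verify(standard_vec, current_vec, threshold=0.3):
    """Verification passes when the distance does not exceed the preset threshold."""
    return cosine_distance(standard_vec, current_vec) <= threshold
```

Identical vectors give a distance of 0 (pass); orthogonal vectors give a distance of 1 (fail for any threshold below 1).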
In order to achieve the above object, the present invention further provides a method for voiceprint authentication, where the method for voiceprint authentication includes:
s1, after receiving an identity authentication request carrying a user identity sent by a client computer, a server generates a graphic code parameter of a graphic code corresponding to the user identity, and sends the graphic code parameter to the client computer for the client computer to generate and display the graphic code corresponding to the graphic code parameter, wherein the graphic code parameter comprises a random key and a voiceprint data acquisition link address;
s2, after the hand-held terminal analyzes the graphic code to obtain the random secret key and the voiceprint data acquisition link address, the server receives a voiceprint verification request which is sent by the hand-held terminal through the voiceprint data acquisition link address and carries the random secret key, and analyzes whether the random secret key in the graphic code parameters sent to the client computer is consistent with the random secret key received from the hand-held terminal;
s3, if yes, the server establishes a voice data acquisition channel with the handheld terminal, and acquires the current voiceprint verification voice data of the user acquired from the handheld terminal based on the voice data acquisition channel;
s4, constructing a current voiceprint authentication vector corresponding to the current voiceprint authentication voice data, determining a standard voiceprint authentication vector corresponding to the user identity according to the mapping relation between the preset user identity and the standard voiceprint authentication vector, calculating the distance between the current voiceprint authentication vector and the standard voiceprint authentication vector, generating an authentication result based on the calculated distance, and sending the authentication result to the client computer.
Preferably, the step S2 specifically includes:
the server receives the voiceprint verification request carrying the random key sent by the handheld terminal through the voiceprint data acquisition link address, and analyzes whether the number of times the random key has been received is greater than a preset number of times;
and if that number is less than or equal to the preset number of times, analyzing whether the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal.
Preferably, the graphic code parameter further includes an effective time of the graphic code, and the step S2 specifically includes:
the server receives the voiceprint verification request carrying the random key sent by the handheld terminal through the voiceprint data acquisition link address, and analyzes whether the time at which the random key was received falls within the valid time range of the graphic code;
if it does, analyzing whether the number of times the random key has been received is greater than a preset number of times;
and if that number is less than or equal to the preset number of times, analyzing whether the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal.
Preferably, the step of constructing the current voiceprint authentication vector corresponding to the current voiceprint authentication voice data specifically includes:
processing the current voiceprint verification voice data to extract preset type voiceprint features, and constructing corresponding voiceprint feature vectors based on the preset type voiceprint features;
inputting the voiceprint feature vector into a pre-trained background channel model to construct a current voiceprint identification vector corresponding to the current voiceprint verification voice data;
the step of calculating the distance between the current voiceprint authentication vector and the standard voiceprint authentication vector and generating an identity verification result based on the calculated distance comprises:
calculating the cosine distance between the current voiceprint identification vector and the standard voiceprint identification vector:
d(x̄, ȳ) = 1 - (x̄ · ȳ) / (||x̄|| ||ȳ||)
wherein x̄ is the standard voiceprint identification vector, and ȳ is the current voiceprint identification vector;
if the cosine distance is smaller than or equal to a preset distance threshold, generating information that the verification is passed;
and if the cosine distance is greater than a preset distance threshold, generating information that the verification fails.
Preferably, the step of processing the current voiceprint verification voice data to extract a preset type voiceprint feature and construct a corresponding voiceprint feature vector based on the preset type voiceprint feature specifically includes:
performing pre-emphasis, framing and windowing on the current voiceprint verification voice data, applying a Fourier transform to each windowed frame to obtain the corresponding spectrum, and passing the spectrum through a Mel filter bank to obtain the Mel spectrum;
performing cepstral analysis on the Mel spectrum to obtain the Mel-frequency cepstral coefficients (MFCC), and composing the corresponding voiceprint feature vector from the Mel-frequency cepstral coefficients.
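The pre-emphasis, framing, windowing, FFT, Mel-filtering and cepstral-analysis pipeline described above could be sketched as below. All numeric parameters (16 kHz sampling rate, 25 ms frames with 10 ms hop, 0.97 pre-emphasis, 26 Mel filters, 13 coefficients) are conventional defaults assumed for illustration, not values stated in the patent:

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mfcc(signal, sr=16000, frame_len=400, hop=160, n_fft=512, n_mels=26, n_ceps=13):
    """Return an (n_frames, n_ceps) MFCC matrix. Assumes len(signal) >= frame_len."""
    # Pre-emphasis to boost high frequencies
    emphasized = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])
    # Framing
    n_frames = 1 + (len(emphasized) - frame_len) // hop
    frames = np.stack([emphasized[i * hop : i * hop + frame_len]
                       for i in range(n_frames)])
    # Windowing (Hamming)
    frames *= np.hamming(frame_len)
    # Fourier transform of each windowed frame -> power spectrum
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Triangular Mel filter bank, equally spaced on the Mel scale
    mel_pts = np.linspace(hz_to_mel(0.0), hz_to_mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for m in range(1, n_mels + 1):
        l, c, r = bins[m - 1], bins[m], bins[m + 1]
        fbank[m - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[m - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    log_mel = np.log(power @ fbank.T + 1e-10)  # Mel spectrum, log-compressed
    # Cepstral analysis: DCT-II of the log Mel spectrum, keep the first n_ceps
    n = np.arange(n_mels)
    dct = np.cos(np.pi / n_mels * (n[:, None] + 0.5) * np.arange(n_ceps)[None, :])
    return log_mel @ dct
```

One second of 16 kHz audio yields 98 frames of 13 coefficients each, which are then assembled into the voiceprint feature vector.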
The present invention also provides a computer-readable storage medium having stored thereon a processing program which, when executed by a processor, implements the steps of the method of voiceprint authentication described above.
The invention has the beneficial effects that: when performing voiceprint verification, the invention adopts an architecture composed of a client computer, a server and a handheld terminal. The client computer sends a request carrying the user identity to the server; the server generates the graphic code parameter corresponding to the user identity and sends it to the client computer, which displays the corresponding graphic code; the user scans the graphic code with the handheld terminal, which sends the random key to the server through the link address for verification; after the verification passes, a channel can be established with the server, and the voice data collected by the handheld terminal is obtained for voiceprint verification. Because the server and the handheld terminal are bound, voice hijacking is avoided, and the authenticity and security of the voiceprint verification are improved.
Drawings
FIG. 1 is a schematic diagram of an alternative application environment according to various embodiments of the present invention;
fig. 2 is a flowchart illustrating a method for voiceprint authentication according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the invention and are not intended to limit the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
It should be noted that the description relating to "first", "second", etc. in the present invention is for descriptive purposes only and is not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, technical solutions between various embodiments may be combined with each other, but must be realized by a person skilled in the art, and when the technical solutions are contradictory or cannot be realized, such a combination should not be considered to exist, and is not within the protection scope of the present invention.
Fig. 1 is a schematic diagram of an application environment of the method for voiceprint authentication according to the preferred embodiment of the invention. The application environment schematic diagram comprises a server 1, a client computer 2 and a handheld terminal 3. The server 1 may perform data interaction with the client computer 2 and the handheld terminal 3 through a suitable technology such as a network and a near field communication technology.
The client computer 2 includes, but is not limited to, any electronic product capable of human-computer interaction with a user through a keyboard, a mouse, a remote controller, a touch panel or a voice control device, for example a mobile device such as a personal computer, a tablet computer, a smart phone, a Personal Digital Assistant (PDA), a game machine, an Internet Protocol Television (IPTV) device, an intelligent wearable device or a navigation device, or a fixed terminal such as a digital TV, a desktop computer, a notebook or a server. The handheld terminal 3 may be a tablet computer, a smart phone, or the like.
The server 1 is a device capable of automatically performing numerical calculation and/or information processing according to a preset or stored instruction. The server 1 may be a single network server, a server group composed of a plurality of network servers, or a cloud composed of a large number of hosts or network servers based on cloud computing, wherein the cloud computing is one of distributed computing and is a super virtual computer composed of a group of loosely coupled computers.
In the present embodiment, the server 1 may include, but is not limited to, a memory 11, a processor 12, and a network interface 13, which are communicatively connected to each other through a system bus, and the memory 11 stores a processing program that can be executed on the processor 12. It is noted that fig. 1 only shows the server 1 with components 11-13, but it is to be understood that not all of the shown components are required to be implemented, and that more or fewer components may be implemented instead.
The memory 11 includes an internal memory and at least one type of readable storage medium. The internal memory provides a cache for the operation of the server 1; the readable storage medium may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, a card-type memory (e.g., SD or DX memory), a Random Access Memory (RAM), a Static Random Access Memory (SRAM), a Read-Only Memory (ROM), an Electrically Erasable Programmable Read-Only Memory (EEPROM), a Programmable Read-Only Memory (PROM), a magnetic memory, a magnetic disk or an optical disk. In some embodiments, the readable storage medium may be an internal storage unit of the server 1, such as a hard disk of the server 1; in other embodiments, it may also be an external storage device of the server 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card or a Flash Card provided on the server 1. In this embodiment, the readable storage medium of the memory 11 is generally used for storing the operating system and the application software installed on the server 1, such as the program code of the processing program in an embodiment of the present invention. Further, the memory 11 may also be used to temporarily store various types of data that have been output or are to be output.
The processor 12 may be a Central Processing Unit (CPU), controller, microcontroller, microprocessor, or other data Processing chip in some embodiments. The processor 12 is generally used for controlling the overall operation of the server 1, such as performing control and processing related to data interaction or communication with the client computer 2 and the handheld terminal 3. In this embodiment, the processor 12 is configured to run the program codes or process data stored in the memory 11, for example, run a processing program.
The network interface 13 may comprise a wireless network interface or a wired network interface, and the network interface 13 is typically used for establishing a communication connection between the server 1 and other electronic devices. In this embodiment, the network interface 13 is mainly used to connect the server 1 with the client computer 2 and the handheld terminal 3, and establish a data transmission channel and a communication connection between the server 1 and the client computer 2 and the handheld terminal 3.
The processing program is stored in the memory 11 and includes at least one computer readable instruction stored in the memory 11, which is executable by the processor 12 to implement the method of the embodiments of the present application; and the at least one computer readable instruction may be divided into different logic blocks depending on the functions implemented by the respective portions.
In an embodiment, the above processing program when executed by the processor 12 implements the following steps:
a generating step, namely generating a graphic code parameter of a graphic code corresponding to a user identity after receiving an identity authentication request carrying the user identity sent by a client computer, and sending the graphic code parameter to the client computer for the client computer to generate and display the graphic code corresponding to the graphic code parameter;
the user identity is an identity for uniquely identifying the identity of the user, and preferably, the user identity is an identity card number. The graphic code is preferably a two-dimensional code, but is not limited thereto, and may be a barcode or the like, for example. The graphic code parameters are used for generating corresponding graphic codes, for example, the two-dimensional code parameters generate corresponding two-dimensional codes, and the bar code parameters generate corresponding bar codes. The parameter of the graphic code comprises a random secret key and a sound stripe data acquisition link address, and further comprises the valid time of the graphic code, the detailed information of the graphic code, the scene value ID of the graphic code and the like, wherein the random secret key can be a random number string or a random character string and the like.
The client computer sends an identity authentication request carrying the user identity to the server. After receiving the identity authentication request, the server generates the graphic code parameters corresponding to the user identity, such as the random key, the voiceprint data acquisition link address of the server, the valid time of the graphic code, the detailed information of the graphic code and the scene value ID of the graphic code, and sends these graphic code parameters to the client computer. After receiving them, the client computer generates the corresponding graphic code and displays it for the handheld terminal to scan.
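As a loose illustration of the generating step, the server-side construction of the graphic code parameters could resemble the sketch below; the field names, URL and five-minute validity window are assumptions for illustration only, not values fixed by the patent:

```python
import json
import secrets
import time

def make_graphic_code_params(user_id,
                             server_base="https://voiceprint.example.com",
                             valid_seconds=300):
    """Build the graphic code (e.g. QR) payload for one authentication attempt."""
    now = int(time.time())
    params = {
        "user_id": user_id,
        "random_key": secrets.token_hex(16),                 # one-time random key
        "collect_url": f"{server_base}/voiceprint/collect",  # voiceprint data acquisition link address
        "valid_from": now,
        "valid_until": now + valid_seconds,                  # valid time of the graphic code
        "scene_id": "voiceprint_login",                      # scene value ID of the graphic code
    }
    return json.dumps(params)  # string payload to encode into the two-dimensional code
```

The client computer would render this payload as a two-dimensional code; the handheld terminal parses it back into the same fields after scanning.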
An analysis step, after the handheld terminal analyzes the graphic code to obtain a random key and a voiceprint data acquisition link address, receiving a voiceprint verification request which is sent by the handheld terminal through the voiceprint data acquisition link address and carries the random key, and analyzing whether the random key in the graphic code parameters sent to the client computer is consistent with the random key received from the handheld terminal;
after scanning the graphical code, the handheld terminal analyzes the graphical code by using a self functional module for analyzing the graphical code to obtain a corresponding random secret key, a voiceprint data acquisition link address of the server, the valid time of the graphical code, detailed information of the graphical code, a scene value ID of the graphical code and other graphical code parameters, and sends a voiceprint verification request carrying the random secret key to the server through the voiceprint data acquisition link address.
To prevent another handheld terminal from stealing the current random key and then performing voiceprint verification with the server, and to improve the accuracy of the voiceprint verification, in one embodiment the server receives the voiceprint verification request carrying the random key sent by the handheld terminal through the voiceprint data acquisition link address and first analyzes whether the number of times the random key has been received is greater than a preset number of times. If that number is greater than the preset number of times, for example greater than 1, the server refuses to respond to the voiceprint verification request, and may record the relevant information of the handheld terminal as a reference basis for later judging whether the voiceprint verification is fraudulent. If the number is less than or equal to the preset number of times, for example 1, the server analyzes whether the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal.
To prevent another handheld terminal from stealing the current random key and then performing voiceprint verification with the server, and to further improve the accuracy of the voiceprint verification, in another embodiment the server receives the voiceprint verification request carrying the random key sent by the handheld terminal through the voiceprint data acquisition link address and first analyzes whether the time at which the random key was received falls within the valid time range of the graphic code. For example, if the valid time of the graphic code is 2018.03.01-2018.03.10 and the server receives the handheld terminal's random key on 2018.03.08, the time is within the valid time range. If the time is within the valid time range of the graphic code, the server then analyzes whether the number of times the random key has been received is greater than a preset number of times, for example greater than 1. If that number is greater than the preset number of times, the server refuses to respond to the voiceprint verification request, and may record the relevant information of the handheld terminal as a reference basis for judging whether the voiceprint verification is fraudulent. If the number is less than or equal to the preset number of times, the server finally analyzes whether the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal.
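The layered checks in the two embodiments above (validity window first, then use count, then key consistency) could be sketched as follows; the class name, method names and single-use limit are illustrative assumptions rather than anything specified in the patent:

```python
import time

class KeyVerifier:
    """Server-side replay checks before accepting a voiceprint verification request."""

    def __init__(self, max_uses=1):
        self.issued = {}       # random_key -> (valid_from, valid_until)
        self.use_count = {}    # random_key -> times the key was presented
        self.max_uses = max_uses

    def issue(self, key, valid_seconds=300, now=None):
        """Register a freshly generated random key with its validity window."""
        now = time.time() if now is None else now
        self.issued[key] = (now, now + valid_seconds)
        self.use_count[key] = 0

    def check(self, key, now=None):
        """Accept the request only if the key is known, in time, and not replayed."""
        now = time.time() if now is None else now
        window = self.issued.get(key)
        if window is None:
            return False  # key was never issued: inconsistent random key
        if not (window[0] <= now <= window[1]):
            return False  # outside the graphic code's valid time range
        self.use_count[key] += 1
        if self.use_count[key] > self.max_uses:
            return False  # presented more than the preset number of times
        return True
```

With `max_uses=1`, a second presentation of the same key is refused, matching the "greater than 1 time" example in the text.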
If so, establishing a voice data acquisition channel with the handheld terminal, and acquiring current voiceprint verification voice data of the user acquired from the handheld terminal based on the voice data acquisition channel;
If the random key in the graphic code parameters sent to the client computer is consistent with the random key received from the handheld terminal, a voice data acquisition channel is established with the handheld terminal. The handheld terminal collects the user's current voiceprint verification voice data in real time through a voice acquisition device such as a microphone. When collecting the current voiceprint verification voice data, interference from environmental noise and from the handheld terminal itself should be minimized: the handheld terminal should be kept at a suitable distance from the user, a terminal with large distortion should be avoided, the power supply should preferably be mains power with a stable current, and a suitable recording sensor should be used. Before framing and sampling, the current voiceprint verification voice data can be denoised to further reduce interference. So that voiceprint features can be extracted, the collected current voiceprint verification voice data is voice data of a preset data length, or voice data longer than the preset data length.
And a verification step, namely constructing a current voiceprint authentication vector corresponding to the current voiceprint authentication voice data, determining a standard voiceprint authentication vector corresponding to the user identity according to a mapping relation between a preset user identity and the standard voiceprint authentication vector, calculating the distance between the current voiceprint authentication vector and the standard voiceprint authentication vector, generating an authentication result based on the calculated distance, and sending the authentication result to the client computer.
In order to effectively reduce the amount of computation for voiceprint recognition and increase the speed of voiceprint recognition, in an embodiment, the step of constructing the current voiceprint identification vector corresponding to the current voiceprint verification voice data specifically includes: processing the current voiceprint verification voice data to extract preset type voiceprint features, and constructing corresponding voiceprint feature vectors based on the preset type voiceprint features; and inputting the voiceprint characteristic vector into a pre-trained background channel model to construct a current voiceprint identification vector corresponding to the current voiceprint verification voice data.
The voiceprint features include multiple types, such as wideband voiceprints, narrowband voiceprints, and amplitude voiceprints. The preset type voiceprint features in this embodiment are preferably the Mel-frequency cepstral coefficients (MFCC) of the current voiceprint verification voice data, and the preset filter is a Mel filter. When constructing the corresponding voiceprint feature vector, the voiceprint features of the current voiceprint verification voice data are formed into a feature data matrix, and this feature data matrix is the corresponding voiceprint feature vector.
Specifically, pre-emphasis and windowing are performed on the current voiceprint verification voice data, a Fourier transform is performed on each windowed frame to obtain the corresponding frequency spectrum, and the frequency spectrum is passed through a Mel filter to output a Mel spectrum. Cepstral analysis is then performed on the Mel spectrum to obtain the Mel-frequency cepstral coefficients MFCC, and the corresponding voiceprint feature vectors are composed from these coefficients.
The pre-emphasis processing is in effect high-pass filtering that attenuates low-frequency data so as to make the high-frequency characteristics in the current voiceprint verification voice data more prominent. Specifically, the transfer function of the high-pass filter is H(z) = 1 - αz^-1, where z is the voice data and α is a constant coefficient, preferably α = 0.97. The voice data is divided into frames, and windowing is performed on each frame. The cepstral analysis performed on the Mel spectrum consists of, for example, taking the logarithm and then applying an inverse transform; the inverse transform is generally realized by a DCT (discrete cosine transform), and the 2nd to 13th coefficients after the DCT are taken as the Mel-frequency cepstral coefficients MFCC, which are the voiceprint features of that frame of voice data. The MFCC of each frame are then formed into a feature data matrix, and this feature data matrix is the voiceprint feature vector of the sampled voice data.
In this embodiment, the Mel-frequency cepstral coefficients MFCC of the voice data are used to form the corresponding voiceprint feature vectors. Because the Mel frequency bands approximate the human auditory system more closely than the linearly spaced frequency bands used in the normal log cepstrum, this can improve the accuracy of identity verification.
Then, the voiceprint feature vector is input into a pre-trained background channel model to construct a current voiceprint identification vector corresponding to the current voiceprint verification voice data, for example, a feature matrix corresponding to the current voiceprint verification voice data is calculated by using the pre-trained background channel model to determine the current voiceprint identification vector corresponding to the current voiceprint verification voice data.
In order to construct the current voiceprint identification vector corresponding to the current voiceprint verification voice data efficiently and with high quality, in a preferred embodiment the background channel model is a set of Gaussian mixture models, and the training process of the background channel model includes the following steps: 1. Acquire a preset number of voice data samples, where each voice data sample corresponds to a standard voiceprint identification vector. 2. Process each voice data sample to extract the preset type voiceprint features corresponding to that sample, and construct the voiceprint feature vector corresponding to each sample based on those features. 3. Divide all the extracted preset type voiceprint feature vectors into a training set of a first percentage and a verification set of a second percentage, where the sum of the first percentage and the second percentage is less than or equal to 100%. 4. Train the set of Gaussian mixture models using the preset type voiceprint feature vectors in the training set, and after training, verify the accuracy of the trained set of Gaussian mixture models using the verification set. If the accuracy is greater than a preset threshold (for example, 98.5%), training ends and the trained set of Gaussian mixture models is used as the background channel model; if the accuracy is less than or equal to the preset threshold, the number of voice data samples is increased and training is repeated until the accuracy of the set of Gaussian mixture models exceeds the preset threshold.
The pre-trained background channel model is obtained by mining and comparative training on a large amount of voice data. While preserving the user's voiceprint characteristics to the maximum extent, the model can accurately characterize the background channel features present when the user speaks, and can remove those features during recognition to extract the inherent characteristics of the user's voice, thereby greatly improving the accuracy and efficiency of user identity verification.
In an embodiment, the step of calculating a distance between the current voiceprint authentication vector and the standard voiceprint authentication vector and generating the authentication result based on the calculated distance includes:
calculating the cosine distance between the current voiceprint identification vector and the standard voiceprint identification vector:

distance = (V_standard · V_current) / (‖V_standard‖ ‖V_current‖)

where V_standard is the standard voiceprint identification vector and V_current is the current voiceprint identification vector. If the cosine distance is smaller than or equal to a preset distance threshold, information that the verification has passed is generated; if the cosine distance is greater than the preset distance threshold, information that the verification has failed is generated.
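The decision rule above can be sketched in Python. The patent leaves the exact distance convention implicit; in this illustrative sketch the cosine distance is taken as 1 minus the cosine similarity, so that a smaller distance means the vectors are more alike, consistent with "distance ≤ threshold → verification passed". The threshold value is an assumption for illustration only.

```python
import numpy as np

# Illustrative sketch of the verification decision. "cosine_distance" is
# computed here as 1 - cosine similarity (an assumption), so smaller
# values indicate more similar identification vectors.
def cosine_distance(standard_vec, current_vec):
    a = np.asarray(standard_vec, dtype=float)
    b = np.asarray(current_vec, dtype=float)
    similarity = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
    return 1.0 - similarity

def verify(standard_vec, current_vec, threshold=0.1):
    # threshold is a hypothetical preset distance threshold
    return "pass" if cosine_distance(standard_vec, current_vec) <= threshold else "fail"
```

Identical vectors yield a distance of 0 and pass; orthogonal vectors yield a distance of 1 and fail.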
When verifying the user's identity, the corresponding standard voiceprint authentication vector is obtained by matching against the identification information of the current voiceprint authentication vector, the cosine distance between the current voiceprint authentication vector and the matched standard voiceprint authentication vector is calculated, and the identity of the target user is verified according to this cosine distance, which improves the accuracy of identity verification.
Compared with the prior art, the present invention adopts an architecture composed of a client computer, a server and a handheld terminal when performing voiceprint verification. The client computer sends a request carrying the user identity to the server; the server generates graphic code parameters corresponding to the user identity and sends them to the client computer, which displays the graphic code corresponding to those parameters. The user scans the graphic code with his or her own handheld terminal and sends the random key to the server through the link address for verification; after the verification passes, a channel can be established with the server, and the user's voice data collected by the handheld terminal is obtained for voiceprint verification. The invention does not require a specially developed client program to collect the user's voice data, offers high flexibility, and is not easily interfered with when voiceprint verification is performed with the handheld terminal. By binding the server and the client computer with the user identity, and then binding the client computer, the server and the handheld terminal with the random key, sound hijacking is avoided and the authenticity and security of voiceprint verification are improved.
As shown in fig. 2, fig. 2 is a schematic flow chart of an embodiment of a method for voiceprint authentication according to the present invention, and the method for voiceprint authentication includes the following steps:
step S1, after receiving an identity authentication request carrying a user identity sent by a client computer, a server generates a graphic code parameter of a graphic code corresponding to the user identity, and sends the graphic code parameter to the client computer for the client computer to generate and display the graphic code corresponding to the graphic code parameter, wherein the graphic code parameter comprises a random key and a voiceprint data acquisition link address;
the user identity is an identity for uniquely identifying the identity of the user, and preferably, the user identity is an identity card number. The graphic code is preferably a two-dimensional code, but is not limited thereto, and may be a barcode or the like, for example. The graphic code parameters are used for generating corresponding graphic codes, for example, the two-dimensional code parameters generate corresponding two-dimensional codes, and the bar code parameters generate corresponding bar codes. The parameter of the graphic code comprises a random secret key and a sound stripe data acquisition link address, and further comprises the valid time of the graphic code, the detailed information of the graphic code, the scene value ID of the graphic code and the like, wherein the random secret key can be a random number string or a random character string and the like.
The client computer sends an identity authentication request carrying the user identity to the server. After receiving the request, the server generates graphic code parameters corresponding to the user identity, such as a random key, the voiceprint data acquisition link address of the server, the valid time of the graphic code, the detailed information of the graphic code, and the scene value ID of the graphic code, and sends these graphic code parameters to the client computer. After receiving them, the client computer generates the corresponding graphic code from the parameters and displays it for the handheld terminal to scan.
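The server-side parameter generation described above can be sketched as follows. This is a minimal sketch under stated assumptions: the field names, the URL, the fixed timestamp, and the 10-day validity window are all illustrative and not taken from the patent.

```python
import secrets
from datetime import datetime, timedelta

# Hypothetical sketch of the graphic code parameters a server might return
# for an identity authentication request. Field names and values are
# illustrative assumptions.
def make_graphic_code_params(user_id, base_url="https://example.com/voiceprint"):
    now = datetime(2018, 3, 1)   # fixed for reproducibility; a real server would use the current time
    return {
        "user_id": user_id,
        "random_key": secrets.token_hex(16),       # random character string
        "voiceprint_link": f"{base_url}/collect",  # voiceprint data acquisition link address
        "valid_from": now.isoformat(),
        "valid_until": (now + timedelta(days=9)).isoformat(),  # e.g. 2018.03.01-2018.03.10
        "scene_id": "login",                       # scene value ID of the graphic code
    }
```

The client computer would serialize such a dictionary into a two-dimensional code for the handheld terminal to scan.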
Step S2, after the hand-held terminal analyzes the graphic code to obtain the random secret key and the voiceprint data acquisition link address, the server receives a voiceprint verification request which is sent by the hand-held terminal through the voiceprint data acquisition link address and carries the random secret key, and analyzes whether the random secret key in the graphic code parameters sent to the client computer is consistent with the random secret key received from the hand-held terminal;
After scanning the graphic code, the handheld terminal parses it using its own graphic code parsing module to obtain the corresponding random key, the voiceprint data acquisition link address of the server, the valid time of the graphic code, the detailed information of the graphic code, the scene value ID of the graphic code, and other graphic code parameters, and then sends a voiceprint verification request carrying the random key to the server through the voiceprint data acquisition link address.
In order to prevent another handheld terminal from stealing the current random key and then performing voiceprint verification with the server, and to improve the accuracy of voiceprint verification, in one embodiment the server, upon receiving the voiceprint verification request carrying the random key sent by the handheld terminal through the voiceprint data acquisition link address, first analyzes whether the number of times the random key has been received is greater than a preset number of times. If the number of times is greater than the preset number of times, for example greater than 1, the server refuses to respond to the voiceprint verification request, and may record the relevant information of the handheld terminal for subsequent use as a reference basis for determining whether the voiceprint verification is fraudulent. If the number of times is less than or equal to the preset number of times, for example 1, the server analyzes whether the random key in the graphic code parameters sent to the client computer is consistent with the random key received from the handheld terminal.
In order to prevent another handheld terminal from stealing the current random key and then performing voiceprint verification with the server, and thus further improve the accuracy of voiceprint verification, in another embodiment the server, upon receiving the voiceprint verification request carrying the random key sent by the handheld terminal through the voiceprint data acquisition link address, first analyzes whether the time at which the random key is received falls within the valid time range of the graphic code. For example, if the valid time of the graphic code is 2018.03.01-2018.03.10 and the server receives the handheld terminal's random key at 2018.03.08, the time falls within the valid time range. If so, the server analyzes whether the number of times the random key has been received is greater than a preset number of times, for example greater than 1. If the number of times is greater than the preset number of times, the server refuses to respond to the voiceprint verification request, and may record the relevant information of the handheld terminal for subsequent use as a reference basis for determining whether the voiceprint verification is fraudulent. If the number of times is less than or equal to the preset number of times, the server finally analyzes whether the random key in the graphic code parameters sent to the client computer is consistent with the random key received from the handheld terminal.
Step S3, if yes, the server establishes a voice data acquisition channel with the handheld terminal, and acquires the current voiceprint verification voice data of the user acquired from the handheld terminal based on the voice data acquisition channel;
If the random key in the graphic code parameters sent to the client computer is consistent with the random key received from the handheld terminal, a voice data acquisition channel is established with the handheld terminal. The handheld terminal collects the user's current voiceprint verification voice data in real time through a voice acquisition device such as a microphone. When collecting the current voiceprint verification voice data, interference from environmental noise and from the handheld terminal itself should be minimized: the handheld terminal should be kept at a suitable distance from the user, a terminal with large distortion should be avoided, the power supply should preferably be mains power with a stable current, and a suitable recording sensor should be used. Before framing and sampling, the current voiceprint verification voice data can be denoised to further reduce interference. So that voiceprint features can be extracted, the collected current voiceprint verification voice data is voice data of a preset data length, or voice data longer than the preset data length.
Step S4, constructing a current voiceprint authentication vector corresponding to the current voiceprint authentication voice data, determining a standard voiceprint authentication vector corresponding to the user identity according to a mapping relationship between a predetermined user identity and the standard voiceprint authentication vector, calculating a distance between the current voiceprint authentication vector and the standard voiceprint authentication vector, generating an authentication result based on the calculated distance, and sending the authentication result to the client computer.
In order to effectively reduce the amount of computation for voiceprint recognition and increase the speed of voiceprint recognition, in an embodiment, the step of constructing the current voiceprint identification vector corresponding to the current voiceprint verification voice data specifically includes: processing the current voiceprint verification voice data to extract preset type voiceprint features, and constructing corresponding voiceprint feature vectors based on the preset type voiceprint features; and inputting the voiceprint characteristic vector into a pre-trained background channel model to construct a current voiceprint identification vector corresponding to the current voiceprint verification voice data.
The voiceprint features include multiple types, such as wideband voiceprints, narrowband voiceprints, and amplitude voiceprints. The preset type voiceprint features in this embodiment are preferably the Mel-frequency cepstral coefficients (MFCC) of the current voiceprint verification voice data, and the preset filter is a Mel filter. When constructing the corresponding voiceprint feature vector, the voiceprint features of the current voiceprint verification voice data are formed into a feature data matrix, and this feature data matrix is the corresponding voiceprint feature vector.
Specifically, pre-emphasis and windowing are performed on the current voiceprint verification voice data, a Fourier transform is performed on each windowed frame to obtain the corresponding frequency spectrum, and the frequency spectrum is passed through a Mel filter to output a Mel spectrum. Cepstral analysis is then performed on the Mel spectrum to obtain the Mel-frequency cepstral coefficients MFCC, and the corresponding voiceprint feature vectors are composed from these coefficients.
The pre-emphasis processing is in effect high-pass filtering that attenuates low-frequency data so as to make the high-frequency characteristics in the current voiceprint verification voice data more prominent. Specifically, the transfer function of the high-pass filter is H(z) = 1 - αz^-1, where z is the voice data and α is a constant coefficient, preferably α = 0.97. The voice data is divided into frames, and windowing is performed on each frame. The cepstral analysis performed on the Mel spectrum consists of, for example, taking the logarithm and then applying an inverse transform; the inverse transform is generally realized by a DCT (discrete cosine transform), and the 2nd to 13th coefficients after the DCT are taken as the Mel-frequency cepstral coefficients MFCC, which are the voiceprint features of that frame of voice data. The MFCC of each frame are then formed into a feature data matrix, and this feature data matrix is the voiceprint feature vector of the sampled voice data.
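The MFCC pipeline just described can be sketched compactly in Python. This is an illustrative sketch only: the sample rate, frame size, hop, and filter count are assumptions, and a production system would use a tested signal-processing library rather than this hand-rolled filter bank.

```python
import numpy as np

# Compact sketch of the MFCC pipeline: pre-emphasis with H(z) = 1 - 0.97 z^-1,
# framing and Hamming windowing, FFT magnitude spectrum, a small triangular
# Mel filter bank, logarithm, then a DCT keeping the 2nd-13th coefficients.
def mfcc(signal, sr=8000, frame_len=200, hop=100, n_filters=20, n_coeffs=12):
    signal = np.append(signal[0], signal[1:] - 0.97 * signal[:-1])  # pre-emphasis
    frames = [signal[i:i + frame_len] * np.hamming(frame_len)
              for i in range(0, len(signal) - frame_len + 1, hop)]  # framing + windowing
    spectra = np.abs(np.fft.rfft(frames, axis=1))                   # magnitude spectrum
    # Triangular Mel filter bank between 0 Hz and sr/2
    mel = lambda f: 2595 * np.log10(1 + f / 700.0)
    inv_mel = lambda m: 700 * (10 ** (m / 2595.0) - 1)
    pts = inv_mel(np.linspace(mel(0), mel(sr / 2), n_filters + 2))
    bins = np.floor((frame_len + 1) * pts / sr).astype(int)
    fbank = np.zeros((n_filters, spectra.shape[1]))
    for i in range(1, n_filters + 1):
        l, c, r = bins[i - 1], bins[i], bins[i + 1]
        fbank[i - 1, l:c] = (np.arange(l, c) - l) / max(c - l, 1)
        fbank[i - 1, c:r] = (r - np.arange(c, r)) / max(r - c, 1)
    log_mel = np.log(spectra @ fbank.T + 1e-10)                     # Mel spectrum -> log
    # DCT-II along the filter axis; keep the 2nd-13th coefficients as MFCC
    n = np.arange(n_filters)
    dct = np.cos(np.pi / n_filters * (n[:, None] + 0.5) * np.arange(n_filters)[None, :])
    return (log_mel @ dct)[:, 1:1 + n_coeffs]                       # feature data matrix
```

The returned matrix has one row of 12 MFCC per frame, which is the feature data matrix (voiceprint feature vector) described above.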
In this embodiment, the Mel-frequency cepstral coefficients MFCC of the voice data are used to form the corresponding voiceprint feature vectors. Because the Mel frequency bands approximate the human auditory system more closely than the linearly spaced frequency bands used in the normal log cepstrum, this can improve the accuracy of identity verification.
Then, the voiceprint feature vector is input into a pre-trained background channel model to construct a current voiceprint identification vector corresponding to the current voiceprint verification voice data, for example, a feature matrix corresponding to the current voiceprint verification voice data is calculated by using the pre-trained background channel model to determine the current voiceprint identification vector corresponding to the current voiceprint verification voice data.
In order to construct the current voiceprint identification vector corresponding to the current voiceprint verification voice data efficiently and with high quality, in a preferred embodiment the background channel model is a set of Gaussian mixture models, and the training process of the background channel model includes the following steps: 1. Acquire a preset number of voice data samples, where each voice data sample corresponds to a standard voiceprint identification vector. 2. Process each voice data sample to extract the preset type voiceprint features corresponding to that sample, and construct the voiceprint feature vector corresponding to each sample based on those features. 3. Divide all the extracted preset type voiceprint feature vectors into a training set of a first percentage and a verification set of a second percentage, where the sum of the first percentage and the second percentage is less than or equal to 100%. 4. Train the set of Gaussian mixture models using the preset type voiceprint feature vectors in the training set, and after training, verify the accuracy of the trained set of Gaussian mixture models using the verification set. If the accuracy is greater than a preset threshold (for example, 98.5%), training ends and the trained set of Gaussian mixture models is used as the background channel model; if the accuracy is less than or equal to the preset threshold, the number of voice data samples is increased and training is repeated until the accuracy of the set of Gaussian mixture models exceeds the preset threshold.
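The train/validate loop above can be sketched with scikit-learn's Gaussian mixture implementation. This is a toy sketch under stated assumptions: the speaker data is synthetic, the 80%/20% split, component count, and decision rule (highest per-speaker likelihood wins) are all illustrative choices, not the patent's procedure.

```python
import numpy as np
from sklearn.mixture import GaussianMixture

# Illustrative sketch: one Gaussian mixture model per enrolled speaker,
# a train/validation split of the feature vectors, and a validation
# accuracy check against a preset threshold. Data is synthetic.
rng = np.random.default_rng(0)
speakers = {"A": rng.normal(0.0, 1.0, (300, 12)),   # stand-ins for MFCC feature vectors
            "B": rng.normal(5.0, 1.0, (300, 12))}

models, val_sets = {}, {}
for name, feats in speakers.items():
    train, val = feats[:240], feats[240:]           # e.g. 80% training / 20% verification
    models[name] = GaussianMixture(n_components=2, random_state=0).fit(train)
    val_sets[name] = val

# Verification: a sample counts as correct if its own speaker's model scores it highest
correct = total = 0
for name, val in val_sets.items():
    for x in val:
        scores = {m: models[m].score(x[None, :]) for m in models}
        correct += max(scores, key=scores.get) == name
        total += 1
accuracy = correct / total
# if accuracy > preset threshold (e.g. 0.985): use the models as the background channel model
```

If the accuracy fell below the threshold, the loop would be rerun with more voice data samples, as step 4 describes.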
The pre-trained background channel model is obtained by mining and comparative training on a large amount of voice data. While preserving the user's voiceprint characteristics to the maximum extent, the model can accurately characterize the background channel features present when the user speaks, and can remove those features during recognition to extract the inherent characteristics of the user's voice, thereby greatly improving the accuracy and efficiency of user identity verification.
In an embodiment, the step of calculating a distance between the current voiceprint authentication vector and the standard voiceprint authentication vector and generating the authentication result based on the calculated distance includes:
calculating the cosine distance between the current voiceprint identification vector and the standard voiceprint identification vector:

distance = (V_standard · V_current) / (‖V_standard‖ ‖V_current‖)

where V_standard is the standard voiceprint identification vector and V_current is the current voiceprint identification vector. If the cosine distance is smaller than or equal to a preset distance threshold, information that the verification has passed is generated; if the cosine distance is greater than the preset distance threshold, information that the verification has failed is generated.
When verifying the user's identity, the corresponding standard voiceprint authentication vector is obtained by matching against the identification information of the current voiceprint authentication vector, the cosine distance between the current voiceprint authentication vector and the matched standard voiceprint authentication vector is calculated, and the identity of the target user is verified according to this cosine distance, which improves the accuracy of identity verification.
Compared with the prior art, the present invention adopts an architecture composed of a client computer, a server and a handheld terminal when performing voiceprint verification. The client computer sends a request carrying the user identity to the server; the server generates graphic code parameters corresponding to the user identity and sends them to the client computer, which displays the graphic code corresponding to those parameters. The user scans the graphic code with his or her own handheld terminal and sends the random key to the server through the link address for verification; after the verification passes, a channel can be established with the server, and the user's voice data collected by the handheld terminal is obtained for voiceprint verification. The invention does not require a specially developed client program to collect the user's voice data, offers high flexibility, and is not easily interfered with when voiceprint verification is performed with the handheld terminal. By binding the server and the client computer with the user identity, and then binding the client computer, the server and the handheld terminal with the random key, sound hijacking is avoided and the authenticity and security of voiceprint verification are improved.
The present invention also provides a computer-readable storage medium having stored thereon a processing program which, when executed by a processor, implements the steps of the method of voiceprint authentication described above.
The above-mentioned serial numbers of the embodiments of the present invention are merely for description and do not represent the merits of the embodiments.
Through the above description of the embodiments, those skilled in the art will clearly understand that the method of the above embodiments can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware, but in many cases, the former is a better implementation manner. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which is stored in a storage medium (such as ROM/RAM, magnetic disk, optical disk) and includes instructions for enabling a terminal device (such as a mobile phone, a computer, a server, an air conditioner, or a network device) to execute the method according to the embodiments of the present invention.
The above description is only a preferred embodiment of the present invention, and not intended to limit the scope of the present invention, and all modifications of equivalent structures and equivalent processes, which are made by using the contents of the present specification and the accompanying drawings, or directly or indirectly applied to other related technical fields, are included in the scope of the present invention.

Claims (10)

1. A server, comprising a memory and a processor coupled to the memory, wherein the memory stores a processing program operable on the processor, and wherein the processing program when executed by the processor performs the steps of:
generating, after receiving an identity authentication request carrying a user identity sent by a client computer, a graphic code parameter of a graphic code corresponding to the user identity, and sending the graphic code parameter to the client computer, so that the client computer generates and displays the graphic code corresponding to the graphic code parameter, wherein the graphic code parameter comprises a random key and a voiceprint data acquisition link address;
an analysis step, after the handheld terminal analyzes the graphic code to obtain a random key and a voiceprint data acquisition link address, receiving a voiceprint verification request which is sent by the handheld terminal through the voiceprint data acquisition link address and carries the random key, and analyzing whether the random key in the graphic code parameters sent to the client computer is consistent with the random key received from the handheld terminal;
if so, establishing a voice data acquisition channel with the handheld terminal, and acquiring current voiceprint verification voice data of the user acquired from the handheld terminal based on the voice data acquisition channel;
and a verification step, namely constructing a current voiceprint authentication vector corresponding to the current voiceprint authentication voice data, determining a standard voiceprint authentication vector corresponding to the user identity according to a mapping relation between a preset user identity and the standard voiceprint authentication vector, calculating the distance between the current voiceprint authentication vector and the standard voiceprint authentication vector, generating an authentication result based on the calculated distance, and sending the authentication result to the client computer.
2. The server according to claim 1, wherein the analyzing step specifically comprises:
the server receives a voiceprint verification request which is sent by the handheld terminal through the voiceprint data acquisition link address and carries a random key, and analyzes whether the number of times of receiving the random key is greater than a preset number of times;
and if the number of times is less than or equal to the preset number of times, analyzing whether the random secret key in the graphic code parameters sent to the client computer is consistent with the random secret key received from the handheld terminal.
3. The server according to claim 1, wherein the graphic code parameters further include a valid time of the graphic code, and the analyzing step specifically includes:
the server receives a voiceprint verification request which is sent by the handheld terminal through the voiceprint data acquisition link address and carries the random key, and analyzes whether the time at which the random key is received falls within the valid time range of the graphic code;
if the time is within the valid time range of the graphic code, analyzing whether the number of times the random key has been received is greater than a preset number of times;
and if the number of times is less than or equal to the preset number of times, analyzing whether the random key in the graphic code parameters sent to the client computer is consistent with the random key received from the handheld terminal.
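The layered checks of claims 2 and 3 (validity window, then attempt count, then key comparison) can be sketched as below. The session dictionary, the 120-second validity window, and the 3-attempt limit are illustrative assumptions, not values from the patent.

```python
def validate_request(session: dict, received_key: str, now: float,
                     valid_seconds: float = 120.0, max_attempts: int = 3) -> bool:
    """Apply the claimed checks in order: validity window, attempt count, key match."""
    if now - session["issued_at"] > valid_seconds:       # graphic code expired
        return False
    session["attempts"] = session.get("attempts", 0) + 1
    if session["attempts"] > max_attempts:               # replayed too many times
        return False
    return session["random_key"] == received_key         # keys must be consistent
```

Only when all three checks pass would the server open the voice data acquisition channel with the handheld terminal.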
4. The server according to any one of claims 1 to 3, wherein the step of constructing the current voiceprint authentication vector corresponding to the current voiceprint verification speech data specifically comprises:
processing the current voiceprint verification voice data to extract preset type voiceprint features, and constructing corresponding voiceprint feature vectors based on the preset type voiceprint features;
inputting the voiceprint feature vector into a pre-trained background channel model to construct a current voiceprint identification vector corresponding to the current voiceprint verification voice data;
the step of calculating the distance between the current voiceprint authentication vector and the standard voiceprint authentication vector and generating an identity verification result based on the calculated distance comprises:
calculating the cosine distance between the current voiceprint identification vector and the standard voiceprint identification vector:
$d = 1 - \dfrac{\vec{w}_1 \cdot \vec{w}_2}{\lVert\vec{w}_1\rVert \, \lVert\vec{w}_2\rVert}$

where $\vec{w}_1$ is the standard voiceprint identification vector and $\vec{w}_2$ is the current voiceprint identification vector;
if the cosine distance is smaller than or equal to a preset distance threshold, generating information that the verification is passed;
and if the cosine distance is greater than a preset distance threshold, generating information that the verification fails.
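A minimal sketch of the cosine-distance decision rule described above, assuming the distance is defined as one minus cosine similarity (consistent with "passes when the distance is small") and using an illustrative threshold of 0.35 that is not taken from the patent:

```python
import math

def cosine_distance(x, y):
    """1 - cosine similarity: 0 for identical directions, 1 for orthogonal vectors."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / (norm_x * norm_y)

def verify(current_vec, standard_vec, threshold=0.35):
    """Pass verification when the vectors are close enough."""
    return cosine_distance(current_vec, standard_vec) <= threshold
```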
5. A method of voiceprint authentication, the method comprising:
S1, after receiving an identity authentication request carrying a user identity sent by a client computer, a server generates a graphic code parameter of a graphic code corresponding to the user identity, and sends the graphic code parameter to the client computer for the client computer to generate and display the graphic code corresponding to the graphic code parameter, wherein the graphic code parameter comprises a random key and a voiceprint data acquisition link address;
S2, after the handheld terminal analyzes the graphic code to obtain the random key and the voiceprint data acquisition link address, the server receives a voiceprint verification request which is sent by the handheld terminal through the voiceprint data acquisition link address and carries the random key, and analyzes whether the random key in the graphic code parameters sent to the client computer is consistent with the random key received from the handheld terminal;
S3, if yes, the server establishes a voice data acquisition channel with the handheld terminal, and acquires the user's current voiceprint verification voice data from the handheld terminal through the voice data acquisition channel;
S4, constructing a current voiceprint authentication vector corresponding to the current voiceprint verification voice data, determining a standard voiceprint authentication vector corresponding to the user identity according to the mapping relation between the preset user identity and the standard voiceprint authentication vector, calculating the distance between the current voiceprint authentication vector and the standard voiceprint authentication vector, generating an authentication result based on the calculated distance, and sending the authentication result to the client computer.
6. The method for voiceprint authentication according to claim 5, wherein the step S2 specifically includes:
the server receives a voiceprint verification request which is sent by the handheld terminal through the voiceprint data acquisition link address and carries the random key, and analyzes whether the number of times the random key has been received is greater than a preset number of times;
and if the number of times is less than or equal to the preset number of times, analyzing whether the random key in the graphic code parameters sent to the client computer is consistent with the random key received from the handheld terminal.
7. The method for voiceprint authentication according to claim 5, wherein the graphic code parameter further includes a valid time of the graphic code, and the step S2 specifically includes:
the server receives a voiceprint verification request which is sent by the handheld terminal through the voiceprint data acquisition link address and carries the random key, and analyzes whether the time at which the random key is received falls within the valid time range of the graphic code;
if the time is within the valid time range of the graphic code, analyzing whether the number of times the random key has been received is greater than a preset number of times;
and if the number of times is less than or equal to the preset number of times, analyzing whether the random key in the graphic code parameters sent to the client computer is consistent with the random key received from the handheld terminal.
8. The method according to any one of claims 5 to 7, wherein the step of constructing the current voiceprint authentication vector corresponding to the current voiceprint authentication speech data specifically comprises:
processing the current voiceprint verification voice data to extract preset type voiceprint features, and constructing corresponding voiceprint feature vectors based on the preset type voiceprint features;
inputting the voiceprint feature vector into a pre-trained background channel model to construct a current voiceprint identification vector corresponding to the current voiceprint verification voice data;
the step of calculating the distance between the current voiceprint authentication vector and the standard voiceprint authentication vector and generating an identity verification result based on the calculated distance comprises:
calculating the cosine distance between the current voiceprint identification vector and the standard voiceprint identification vector:
Figure FDA0002302945650000041
Figure FDA0002302945650000042
for the standard voiceprint authentication vector(s),
Figure FDA0002302945650000043
identifying a vector for the current voiceprint;
if the cosine distance is smaller than or equal to a preset distance threshold, generating information that the verification is passed;
and if the cosine distance is greater than a preset distance threshold, generating information that the verification fails.
9. The method according to claim 8, wherein the step of processing the current voiceprint verification speech data to extract a preset type voiceprint feature and construct a corresponding voiceprint feature vector based on the preset type voiceprint feature specifically comprises:
pre-emphasis, framing and windowing are performed on the current voiceprint verification voice data, a Fourier transform is performed on each windowed frame to obtain the corresponding spectrum, and the spectrum is passed through a Mel filter bank to obtain a Mel spectrum;
performing cepstral analysis on the Mel spectrum to obtain Mel-frequency cepstral coefficients (MFCCs), and composing corresponding voiceprint feature vectors based on the MFCCs.
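The MFCC extraction pipeline of claim 9 (pre-emphasis, framing, windowing, FFT, Mel filtering, cepstral analysis) can be sketched with NumPy as below. The sample rate, frame sizes, and filter counts are common defaults assumed for illustration, not values specified by the patent.

```python
import numpy as np

def mfcc(signal, sr=16000, frame_len=400, hop=160, n_fft=512,
         n_mels=26, n_ceps=13, pre=0.97):
    """Return an (n_frames, n_ceps) MFCC matrix for a 1-D speech signal."""
    # Pre-emphasis: boost high frequencies.
    sig = np.append(signal[0], signal[1:] - pre * signal[:-1])
    # Framing: overlapping frames of frame_len samples every hop samples.
    n_frames = 1 + max(0, len(sig) - frame_len) // hop
    frames = np.stack([sig[i * hop : i * hop + frame_len] for i in range(n_frames)])
    # Windowing (Hamming), then FFT -> power spectrum.
    frames = frames * np.hamming(frame_len)
    power = np.abs(np.fft.rfft(frames, n_fft)) ** 2 / n_fft
    # Triangular Mel filter bank spanning 0 .. sr/2.
    def hz2mel(f): return 2595.0 * np.log10(1.0 + f / 700.0)
    def mel2hz(m): return 700.0 * (10.0 ** (m / 2595.0) - 1.0)
    mels = np.linspace(hz2mel(0.0), hz2mel(sr / 2.0), n_mels + 2)
    bins = np.floor((n_fft + 1) * mel2hz(mels) / sr).astype(int)
    fbank = np.zeros((n_mels, n_fft // 2 + 1))
    for i in range(1, n_mels + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        for k in range(left, center):
            fbank[i - 1, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):
            fbank[i - 1, k] = (right - k) / max(right - center, 1)
    mel_spec = np.log(power @ fbank.T + 1e-10)   # log Mel spectrum
    # Cepstral analysis: DCT-II of the log Mel spectrum, keep n_ceps coefficients.
    n = np.arange(n_mels)
    dct = np.cos(np.pi / n_mels * (n + 0.5)[None, :] * np.arange(n_ceps)[:, None])
    return mel_spec @ dct.T
```

Per claim 9, the per-frame coefficients (or statistics over them) would then be composed into the voiceprint feature vector fed to the background channel model.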
10. A computer-readable storage medium, characterized in that the computer-readable storage medium has stored thereon a processing program which, when being executed by a processor, carries out the steps of the method of voiceprint authentication according to any one of claims 5 to 9.
CN201810457267.1A 2018-05-14 2018-05-14 Server, voiceprint verification method and storage medium Active CN108650266B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN201810457267.1A CN108650266B (en) 2018-05-14 2018-05-14 Server, voiceprint verification method and storage medium
PCT/CN2018/102049 WO2019218512A1 (en) 2018-05-14 2018-08-24 Server, voiceprint verification method, and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810457267.1A CN108650266B (en) 2018-05-14 2018-05-14 Server, voiceprint verification method and storage medium

Publications (2)

Publication Number Publication Date
CN108650266A CN108650266A (en) 2018-10-12
CN108650266B true CN108650266B (en) 2020-02-18

Family

ID=63755329

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810457267.1A Active CN108650266B (en) 2018-05-14 2018-05-14 Server, voiceprint verification method and storage medium

Country Status (2)

Country Link
CN (1) CN108650266B (en)
WO (1) WO2019218512A1 (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109462482B (en) * 2018-11-09 2023-08-08 深圳壹账通智能科技有限公司 Voiceprint recognition method, voiceprint recognition device, electronic equipment and computer readable storage medium
CN113129903A (en) * 2019-12-31 2021-07-16 深圳市航盛电子股份有限公司 Automatic audio test method and device, computer equipment and storage medium
CN113973299B (en) * 2020-07-22 2023-09-29 中国石油化工股份有限公司 Wireless sensor with identity authentication function and identity authentication method
CN111931146B (en) * 2020-07-24 2024-01-19 捷德(中国)科技有限公司 Identity verification method, device, equipment and storage medium

Citations (2)

Publication number Priority date Publication date Assignee Title
CN107610707A (en) * 2016-12-15 2018-01-19 平安科技(深圳)有限公司 A kind of method for recognizing sound-groove and device
CN107993071A (en) * 2017-11-21 2018-05-04 平安科技(深圳)有限公司 Electronic device, auth method and storage medium based on vocal print

Family Cites Families (4)

Publication number Priority date Publication date Assignee Title
JP4463526B2 (en) * 2003-10-24 2010-05-19 株式会社ユニバーサルエンターテインメント Voiceprint authentication system
GB2519571A (en) * 2013-10-25 2015-04-29 Aplcomp Oy Audiovisual associative authentication method and related system
CN105100123A (en) * 2015-09-11 2015-11-25 深圳市亚略特生物识别科技有限公司 Application registration method and system
CN107068154A (en) * 2017-03-13 2017-08-18 平安科技(深圳)有限公司 The method and system of authentication based on Application on Voiceprint Recognition

Patent Citations (2)

Publication number Priority date Publication date Assignee Title
CN107610707A (en) * 2016-12-15 2018-01-19 平安科技(深圳)有限公司 A kind of method for recognizing sound-groove and device
CN107993071A (en) * 2017-11-21 2018-05-04 平安科技(深圳)有限公司 Electronic device, auth method and storage medium based on vocal print

Also Published As

Publication number Publication date
WO2019218512A1 (en) 2019-11-21
CN108650266A (en) 2018-10-12

Similar Documents

Publication Publication Date Title
US11068571B2 (en) Electronic device, method and system of identity verification and computer readable storage medium
CN108650266B (en) Server, voiceprint verification method and storage medium
WO2018166187A1 (en) Server, identity verification method and system, and a computer-readable storage medium
WO2019100606A1 (en) Electronic device, voiceprint-based identity verification method and system, and storage medium
JP6429945B2 (en) Method and apparatus for processing audio data
WO2019205369A1 (en) Electronic device, identity recognition method based on human face image and voiceprint information, and storage medium
CN107977776B (en) Information processing method, device, server and computer readable storage medium
CN108694952B (en) Electronic device, identity authentication method and storage medium
CN109409349B (en) Credit certificate authentication method, credit certificate authentication device, credit certificate authentication terminal and computer readable storage medium
WO2019136912A1 (en) Electronic device, identity authentication method and system, and storage medium
CN109816521A (en) A kind of banking processing method, apparatus and system
CN110247898B (en) Identity verification method, identity verification device, identity verification medium and electronic equipment
CN110556126A (en) Voice recognition method and device and computer equipment
CN110033365A (en) A kind of loan on personal security approval system and method
CN110795714A (en) Identity authentication method and device, computer equipment and storage medium
CN108630208B (en) Server, voiceprint-based identity authentication method and storage medium
CN112073407A (en) System, method and storage medium for real-time judgment of abnormal equipment in high-concurrency service
US10083696B1 (en) Methods and systems for determining user liveness
CN111583935A (en) Loan intelligent delivery method, device and storage medium
CN115690920B (en) Credible living body detection method for medical identity authentication and related equipment
CN113421575B (en) Voiceprint recognition method, voiceprint recognition device, voiceprint recognition equipment and storage medium
CN114386002B (en) Biometric identification method, biometric identification device, biometric identification equipment and readable storage medium
CN115910071A (en) Identity authentication method and device, computer equipment and storage medium
CN114358790A (en) Method and device for identifying object, computer readable storage medium and electronic equipment
CN113436633A (en) Speaker recognition method, speaker recognition device, computer equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant