WO2019218512A1 - Server, voiceprint verification method, and storage medium

Info

Publication number: WO2019218512A1
Application number: PCT/CN2018/102049
Authority: WO
Inventors: 程序; 彭俊清; 王健宗; 肖京
Original assignee: 平安科技（深圳）有限公司
Priority date: 2018-05-14
Filing date: 2018-08-24
Publication date: 2019-11-21
Also published as: CN108650266A; CN108650266B

Abstract

The present application relates to a server, a voiceprint verification method, and a storage medium. The method comprises: after receiving an identity verification request, generating a graphic code parameter of a graphic code corresponding to a user identity, and sending the graphic code parameter to a client computer; after a handheld terminal parses the graphic code, receiving a voiceprint verification request, sent by the handheld terminal by means of a voiceprint data acquisition link address, carrying random keys, and analyzing whether the two random keys are consistent; if yes, establishing a voice data acquisition channel with the handheld terminal, and obtaining, on the basis of the channel, user's current voiceprint verification voice data acquired from the handheld terminal; and constructing a corresponding current voiceprint discrimination vector, determining a standard voiceprint discrimination vector corresponding to the user identity, calculating the distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector, and generating an identity verification result on the basis of the calculated distance. The present application can improve the flexibility of voiceprint verification, and avoid sound hijacking.

Description

Server, voiceprint verification method and storage medium

Priority claim

This application claims priority to Chinese Patent Application No. CN2018104572671, entitled "Server, Voiceprint Verification Method and Storage Medium" and filed on May 14, 2018, the entire contents of which are incorporated herein by reference.

Technical field

The present application relates to the field of communications technologies, and in particular, to a server, a method for voiceprint verification, and a storage medium.

Background

At present, verifying user identity by voiceprint has become an important verification method for large customer-service companies (e.g., banks, insurance companies, and game companies). The traditional business solution for voiceprint-based user authentication is to develop a dedicated client program against the interface of the voiceprint verification server. The client program collects and pre-processes the user's voice and then transmits the pre-processed voiceprint data to the voiceprint verification server, which performs the authentication computation on the transmitted voiceprint data.

However, the drawback of this traditional voiceprint verification scheme is that the user's voice must be collected through the specially developed client program. In practice, this offers little flexibility and is easily disturbed by other human voices. Moreover, when the client computer collects the sound, the audio is easily hijacked, so the authenticity of the voiceprint verification cannot be accurately controlled and security cannot be guaranteed.

Summary of the invention

The purpose of the present application is to provide a server, a voiceprint verification method and a storage medium, which aim to improve the flexibility of voiceprint verification and avoid sound hijacking.

To achieve the above object, the present application provides a server including a memory and a processor coupled to the memory, the memory storing a processing system operable on the processor, the processing system implementing the following steps when executed by the processor:

a generating step: after receiving an identity verification request that is sent by a client computer and carries a user identity, generating a graphic code parameter of a graphic code corresponding to the user identity, and sending the graphic code parameter to the client computer so that the client computer generates and displays the graphic code corresponding to the graphic code parameter, where the graphic code parameter includes a random key and a voiceprint data collection link address;

an analyzing step: after a handheld terminal parses the graphic code to obtain the random key and the voiceprint data collection link address, receiving a voiceprint verification request that is sent by the handheld terminal through the voiceprint data collection link address and carries the random key, and analyzing whether the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal;

an obtaining step: if so, establishing a voice data collection channel with the handheld terminal, and acquiring, based on the voice data collection channel, the current voiceprint verification voice data of the user collected by the handheld terminal;

a verification step: constructing a current voiceprint discrimination vector corresponding to the current voiceprint verification voice data, determining a standard voiceprint discrimination vector corresponding to the user identity according to a predetermined mapping relationship between user identities and standard voiceprint discrimination vectors, calculating the distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector, generating an identity verification result based on the calculated distance, and sending the identity verification result to the client computer.

To achieve the above object, the present application further provides a method for voiceprint verification, and the method for voiceprint verification includes:

S1. After receiving an identity verification request that is sent by a client computer and carries a user identity, the server generates a graphic code parameter of a graphic code corresponding to the user identity, and sends the graphic code parameter to the client computer so that the client computer generates and displays the graphic code corresponding to the graphic code parameter, where the graphic code parameter includes a random key and a voiceprint data collection link address;

S2. After a handheld terminal parses the graphic code to obtain the random key and the voiceprint data collection link address, the server receives a voiceprint verification request that is sent by the handheld terminal through the voiceprint data collection link address and carries the random key, and analyzes whether the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal;

S3. If so, the server establishes a voice data collection channel with the handheld terminal, and acquires, based on the voice data collection channel, the current voiceprint verification voice data of the user collected by the handheld terminal;

S4. The server constructs a current voiceprint discrimination vector corresponding to the current voiceprint verification voice data, determines a standard voiceprint discrimination vector corresponding to the user identity according to a predetermined mapping relationship between user identities and standard voiceprint discrimination vectors, calculates the distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector, generates an identity verification result based on the calculated distance, and sends the identity verification result to the client computer.

The present application also provides a computer readable storage medium having stored thereon a processing system, the processing system being executed by a processor to implement the steps of the method of voiceprint verification described above.

The beneficial effects of the present application are as follows: no specially developed client program is needed to collect the user's voice data, and voiceprint verification through the handheld terminal is highly flexible and not easily disturbed; the user identity binds the server to the client computer, and the random key then binds the client computer, the server, and the handheld terminal together, which avoids sound hijacking and improves the authenticity and security of the voiceprint verification.

Brief description of the drawings

FIG. 1 is a schematic diagram of an optional application environment of each embodiment of the present application;

FIG. 2 is a schematic flowchart of an embodiment of the voiceprint verification method of the present application.

Detailed description

In order to make the objects, technical solutions, and advantages of the present application clearer, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein merely illustrate the application and are not intended to limit it. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present application without creative effort fall within the protection scope of the present application.

It should be noted that the descriptions "first", "second", and the like in the present application are for the purpose of description only, and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features referred to. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In addition, the technical solutions of the various embodiments may be combined with each other, but only on the basis that a person skilled in the art can implement the combination. When a combination of technical solutions is contradictory or impossible to implement, the combination should be considered not to exist and does not fall within the protection scope claimed by this application.

FIG. 1 is a schematic diagram of the application environment of a preferred embodiment of the voiceprint verification method of the present application. The application environment includes a server 1, a client computer 2, and a handheld terminal 3. The server 1 can exchange data with the client computer 2 and the handheld terminal 3 through a suitable technology such as a network or near field communication.

The client computer 2 includes, but is not limited to, any electronic product that can interact with a user through a keyboard, a mouse, a remote controller, a touch panel, or a voice control device, for example, mobile devices such as a personal computer, a tablet computer, a smartphone, a personal digital assistant (PDA), a game console, an Internet Protocol Television (IPTV), a smart wearable device, or a navigation device, or fixed terminals such as a digital TV, a desktop computer, a notebook, or a server. The handheld terminal 3 can be a tablet computer, a smartphone, or the like.

The server 1 is a device capable of automatically performing numerical calculation and/or information processing in accordance with instructions that are set or stored in advance. The server 1 may be a single network server, a server group composed of multiple network servers, or a cloud composed of a large number of hosts or network servers based on cloud computing, where cloud computing is a kind of distributed computing: a super virtual computer consisting of a group of loosely coupled computers.

In the present embodiment, the server 1 may include, but is not limited to, a memory 11, a processor 12, and a network interface 13 that are communicably connected to each other through a system bus, and the memory 11 stores a processing system operable on the processor 12. It should be noted that FIG. 1 shows only the server 1 with components 11-13; not all of the illustrated components are required, and more or fewer components may be implemented instead.

The memory 11 includes an internal memory and at least one type of readable storage medium. The internal memory provides a cache for the operation of the server 1. The readable storage medium may be a non-volatile storage medium such as a flash memory, a hard disk, a multimedia card, a card-type memory (for example, SD or DX memory), a random access memory (RAM), a static random access memory (SRAM), a read-only memory (ROM), an electrically erasable programmable read-only memory (EEPROM), a programmable read-only memory (PROM), a magnetic memory, a magnetic disk, or an optical disk. In some embodiments, the readable storage medium may be an internal storage unit of the server 1, such as a hard disk of the server 1; in other embodiments, the non-volatile storage medium may also be an external storage device of the server 1, such as a plug-in hard disk, a smart memory card (SMC), a Secure Digital (SD) card, or a flash card equipped on the server 1. In this embodiment, the readable storage medium of the memory 11 is generally used to store the operating system and the various types of application software installed on the server 1, for example, the program code of the processing system in an embodiment of the present application. Further, the memory 11 can also be used to temporarily store various types of data that have been output or are to be output.

In some embodiments, the processor 12 may be a Central Processing Unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 12 is typically used to control the overall operation of the server 1, for example, to perform control and processing related to data interaction or communication with the client computer 2 and the handheld terminal 3. In this embodiment, the processor 12 is configured to run the program code or process the data stored in the memory 11, for example, to run the processing system.

The network interface 13 may comprise a wireless network interface or a wired network interface, which is typically used to establish a communication connection between the server 1 and other electronic devices. In this embodiment, the network interface 13 is mainly used to connect the server 1 to the client computer 2 and the handheld terminal 3, and establish a data transmission channel and a communication connection between the server 1 and the client computer 2 and the handheld terminal 3.

The processing system is stored in the memory 11 and includes at least one computer readable instruction stored in the memory 11. The at least one computer readable instruction is executable by the processor 12 to implement the methods of the various embodiments of the present application, and can be divided into different logic modules according to the functions implemented by its various parts.

In an embodiment, when the processing system is executed by the processor 12, the following steps are implemented:

a generating step: after receiving an identity verification request that is sent by the client computer and carries a user identity, generating a graphic code parameter of a graphic code corresponding to the user identity, and sending the graphic code parameter to the client computer so that the client computer generates and displays the graphic code corresponding to the graphic code parameter;

The user identity is an identifier that uniquely identifies the user; preferably, the user identity is an identity card number. The graphic code is preferably a two-dimensional code, but is not limited thereto and may be, for example, a barcode. The graphic code parameter is used to generate the corresponding graphic code: for example, a two-dimensional code parameter generates a corresponding two-dimensional code, and a barcode parameter generates a corresponding barcode. The graphic code parameter includes a random key and a voiceprint data collection link address, and may further include the valid time of the graphic code, detailed information of the graphic code, a scene value ID of the graphic code, and so on. The random key may be a random number string, a random character string, or the like.

The client computer sends an identity verification request carrying the user identity to the server. After receiving the identity verification request, the server generates the graphic code parameter corresponding to the user identity, including a random key, the voiceprint data collection link address of the server, the valid time of the graphic code, detailed information of the graphic code, the scene value ID of the graphic code, and so on, and sends the graphic code parameter to the client computer. After receiving the graphic code parameter, the client computer generates the corresponding graphic code according to the graphic code parameter and displays it for scanning by the handheld terminal.

After scanning the graphic code, the handheld terminal parses it with its own graphic code parsing module and obtains the random key, the voiceprint data collection link address of the server, the valid time of the graphic code, the detailed information of the graphic code, the scene value ID of the graphic code, and so on. The handheld terminal then sends a voiceprint verification request carrying the random key to the server through the voiceprint data collection link address.
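The exchange described in the two paragraphs above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation of the present application: the URL, the JSON payload layout, the field names, the 300-second validity window, and all function names are hypothetical, and rendering the payload as an actual two-dimensional code is left out.

```python
import json
import secrets
import time

# Hypothetical endpoint; the text only says that the parameter carries a
# "voiceprint data collection link address".
COLLECTION_URL = "https://voiceprint.example.com/collect"


def generate_graphic_code_parameter(user_id, valid_seconds=300, now=None):
    """Server side: build the parameter that the client computer turns into a graphic code."""
    issued_at = time.time() if now is None else now
    return {
        "user_id": user_id,                       # kept on the server side
        "random_key": secrets.token_urlsafe(16),  # random character string
        "collection_url": COLLECTION_URL,
        "valid_from": issued_at,
        "valid_until": issued_at + valid_seconds,
    }


def encode_for_graphic_code(parameter):
    """Client computer side: serialize the payload that the graphic code carries."""
    payload = {k: parameter[k] for k in ("random_key", "collection_url", "valid_until")}
    return json.dumps(payload, sort_keys=True)


def build_voiceprint_request(scanned_text):
    """Handheld terminal side: parse the scanned payload into (link address, request body)."""
    payload = json.loads(scanned_text)
    return payload["collection_url"], {"random_key": payload["random_key"]}


parameter = generate_graphic_code_parameter("user-001", now=1_000.0)
url, body = build_voiceprint_request(encode_for_graphic_code(parameter))
```

In this sketch the user identity stays in the server-side record and only the random key, the link address, and the expiry time travel through the graphic code; that split is a design assumption of the example.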

After receiving the voiceprint verification request, the server analyzes whether the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal. To prevent another handheld terminal from stealing the current random key and then performing voiceprint verification with the server, and thus to improve the accuracy of the voiceprint verification, in an embodiment, after receiving the voiceprint verification request that is sent by the handheld terminal through the voiceprint data collection link address and carries the random key, the server first analyzes whether the number of times the random key has been received is greater than a preset number. If the number of times is greater than the preset number, for example, greater than one, the server refuses to respond to the voiceprint verification request, and the related information of the handheld terminal may be sent to the server for subsequent use as a reference for judging whether the voiceprint verification is fraudulent. If the number of times is less than or equal to the preset number, the server proceeds to analyze whether the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal.

To prevent another handheld terminal from stealing the current random key and then performing voiceprint verification with the server, and thus to further improve the accuracy of the voiceprint verification, in another embodiment, after receiving the voiceprint verification request that is sent by the handheld terminal through the voiceprint data collection link address and carries the random key, the server first analyzes whether the time at which the random key is received falls within the valid time range of the graphic code. For example, if the valid time of the graphic code is 2018.03.01-2018.03.10 and the server receives the random key from the handheld terminal on 2018.03.08, the time of receipt is within the valid time range. If the time of receipt is within the valid time range of the graphic code, the server analyzes whether the number of times the random key has been received is greater than a preset number, for example, greater than one. If the number of times is greater than the preset number, the server refuses to respond to the voiceprint verification request, and the related information of the handheld terminal may be sent to the server for subsequent use as a reference for judging whether the voiceprint verification is fraudulent. If the number of times is less than or equal to the preset number, the server proceeds to analyze whether the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal.
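The three checks described above (valid time range, number of receipts, key consistency) can be sketched as follows. The record type, the function name, and the reason strings are hypothetical. Rejecting a request that arrives outside the valid time range is an assumption of the sketch, since the text does not state what happens in that case; the constant-time comparison is likewise a choice of the example.

```python
import hmac
from dataclasses import dataclass


@dataclass
class IssuedKey:
    """Hypothetical server-side record of a random key sent to the client computer."""
    random_key: str
    valid_from: float
    valid_until: float
    times_received: int = 0


def check_voiceprint_request(issued, received_key, received_at, max_uses=1):
    """Return (accepted, reason) for a voiceprint verification request."""
    # 1. The request must arrive within the valid time range of the graphic code
    #    (assumption: requests outside the range are refused).
    if not (issued.valid_from <= received_at <= issued.valid_until):
        return False, "outside valid time"
    # 2. Count this receipt and refuse if the key has been received too often.
    #    As in the text, the count is checked before the keys are compared.
    issued.times_received += 1
    if issued.times_received > max_uses:
        return False, "key received too many times"
    # 3. Compare the issued key with the received key in constant time.
    if not hmac.compare_digest(issued.random_key.encode(), received_key.encode()):
        return False, "key mismatch"
    return True, "ok"
```

With `max_uses=1`, a second request carrying the same key is refused even if the key matches, which corresponds to the "greater than one" example in the text.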

If the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal, a voice data collection channel is established with the handheld terminal. The handheld terminal collects the current voiceprint verification voice data of the user through a voice collection device such as a microphone. When the current voiceprint verification voice data is collected, environmental noise and interference from the handheld terminal should be avoided as far as possible: the handheld terminal should be kept at an appropriate distance from the user, a large handheld terminal should be avoided where possible, the power supply is preferably mains power with a stable current, and a sensor should be used when recording. The current voiceprint verification voice data can be denoised before framing and sampling to further reduce interference. So that the voiceprint feature of the current voiceprint verification voice data can be extracted, the collected voiceprint verification voice data is voice data of a preset data length, or voice data longer than the preset data length.

To effectively reduce the amount of calculation in voiceprint recognition and increase its speed, in an embodiment, the step of constructing the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data includes: processing the current voiceprint verification voice data to extract a voiceprint feature of a preset type, and constructing a corresponding voiceprint feature vector based on the voiceprint feature of the preset type; and inputting the voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data.

Voiceprint features are of several types, such as wide-band voiceprints, narrow-band voiceprints, and amplitude voiceprints. In this embodiment, the voiceprint feature of the preset type is preferably the Mel Frequency Cepstrum Coefficient (MFCC) of the current voiceprint verification voice data, and the preset filter is a Mel filter. When the corresponding voiceprint feature vector is constructed, the voiceprint features of the current voiceprint verification voice data are composed into a feature data matrix, and this feature data matrix is the corresponding voiceprint feature vector.

Specifically, the current voiceprint verification voice data is pre-emphasized and windowed, a Fourier transform is performed on each window to obtain the corresponding spectrum, and the spectrum is input into the Mel filter to output a Mel spectrum; a cepstrum analysis is then performed on the Mel spectrum to obtain the Mel frequency cepstral coefficients (MFCC), and the corresponding voiceprint feature vector is formed based on the MFCC.

The pre-emphasis processing is in effect a high-pass filtering process that filters out low-frequency data so that the high-frequency characteristics of the current voiceprint verification voice data become more prominent. Specifically, the transfer function of the high-pass filter is H(Z) = 1 - αZ^{-1}, where Z is the voice data and α is a constant coefficient, preferably 0.97. Because the voice data deviates from the original voice to some extent after framing, the voice data needs to be windowed. The cepstrum analysis on the Mel spectrum consists of, for example, taking the logarithm and applying an inverse transform; the inverse transform is generally realized by a discrete cosine transform (DCT), and the 2nd to 13th coefficients after the DCT are taken as the Mel frequency cepstral coefficients (MFCC). The MFCC is the voiceprint feature of one frame of voice data; the MFCCs of all frames are composed into a feature data matrix, which is the voiceprint feature vector of the voice data.
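The processing chain described above (pre-emphasis, framing and windowing, Fourier transform, Mel filtering, logarithm, DCT, and keeping the 2nd to 13th coefficients) can be sketched with the Python standard library alone. Only α = 0.97 and the choice of coefficients come from the text. The Hamming window, the triangular filter bank, the conversion mel = 2595·log10(1 + f/700), the frame length of 64 samples, the hop of 32 samples, and the 20 filters are assumptions picked so that the example runs quickly; the naive DFT is for illustration only.

```python
import cmath
import math


def pre_emphasize(signal, alpha=0.97):
    """y[n] = x[n] - alpha * x[n-1], the time-domain form of H(Z) = 1 - alpha * Z^-1."""
    if not signal:
        return []
    return [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]


def frames_with_hamming(signal, frame_len, hop):
    """Cut the signal into overlapping frames and apply a Hamming window to each."""
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1)) for n in range(frame_len)]
    return [[signal[start + n] * window[n] for n in range(frame_len)]
            for start in range(0, len(signal) - frame_len + 1, hop)]


def power_spectrum(frame):
    """Naive DFT power spectrum for bins 0 .. N/2."""
    n = len(frame)
    return [abs(sum(frame[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))) ** 2 / n
            for k in range(n // 2 + 1)]


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters spaced evenly on the Mel scale."""
    low, high = hz_to_mel(0.0), hz_to_mel(sample_rate / 2.0)
    mel_points = [low + (high - low) * i / (n_filters + 1) for i in range(n_filters + 2)]
    bins = [int(math.floor((n_fft + 1) * mel_to_hz(m) / sample_rate)) for m in mel_points]
    bank = []
    for i in range(1, n_filters + 1):
        left, center, right = bins[i - 1], bins[i], bins[i + 1]
        row = [0.0] * (n_fft // 2 + 1)
        for k in range(left, center):      # rising edge
            row[k] = (k - left) / (center - left)
        for k in range(center, right):     # falling edge
            row[k] = (right - k) / (right - center)
        bank.append(row)
    return bank


def dct2(values):
    """Unnormalized DCT-II, used as the inverse transform of the cepstrum analysis."""
    n = len(values)
    return [sum(v * math.cos(math.pi * k * (2 * i + 1) / (2 * n)) for i, v in enumerate(values))
            for k in range(n)]


def mfcc(signal, sample_rate, frame_len=64, hop=32, n_filters=20):
    """Return one row of 12 coefficients (2nd to 13th DCT outputs) per frame."""
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    features = []
    for frame in frames_with_hamming(pre_emphasize(signal), frame_len, hop):
        spectrum = power_spectrum(frame)
        mel_energies = [sum(w * p for w, p in zip(row, spectrum)) for row in bank]
        log_mel = [math.log(e + 1e-10) for e in mel_energies]  # small offset avoids log(0)
        features.append(dct2(log_mel)[1:13])
    return features


# Example input: 256 samples of a 440 Hz tone sampled at 8 kHz.
signal = [math.sin(2 * math.pi * 440 * n / 8000) for n in range(256)]
features = mfcc(signal, 8000)
```

The list `features` plays the role of the feature data matrix: one row per frame, one column per retained coefficient.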

In this embodiment, the Mel frequency cepstral coefficients (MFCC) of the voice data are used to form the corresponding voiceprint feature vector. Because the Mel frequency bands approximate the human auditory system more closely than the linearly spaced frequency bands used in the normal cepstrum, the accuracy of the authentication can be improved.

Then, the voiceprint feature vector is input into the pre-trained background channel model to construct the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data. For example, the pre-trained background channel model is used to calculate the feature matrix corresponding to the current voiceprint verification voice data, so as to determine the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data.

To construct the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data efficiently and with high quality, in a preferred embodiment the background channel model is a set of Gaussian mixture models, and the training process of the background channel model includes the following steps: 1. obtaining a preset number of voice data samples, each of which corresponds to a standard voiceprint discrimination vector; 2. processing each voice data sample separately to extract its voiceprint feature of the preset type, and constructing the voiceprint feature vector of each voice data sample based on that feature; 3. dividing all the extracted voiceprint feature vectors of the preset type into a training set of a first percentage and a verification set of a second percentage, the sum of the first percentage and the second percentage being less than or equal to 100%; 4. training the set of Gaussian mixture models with the voiceprint feature vectors in the training set, and after training is completed, verifying the accuracy of the trained set of Gaussian mixture models with the verification set; 5. if the accuracy is greater than a preset threshold (for example, 98.5%), ending the training and using the trained set of Gaussian mixture models as the background channel model; or, if the accuracy is less than or equal to the preset threshold, increasing the number of voice data samples and retraining until the accuracy of the set of Gaussian mixture models is greater than the preset threshold.
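The control flow of steps 3 to 5 can be sketched as follows. The sketch covers only the percentage split and the retrain-until-accurate loop; fitting and scoring the Gaussian mixture models are passed in as the callables `train` and `evaluate`, which are hypothetical placeholders, as are `load_vectors`, the 70/30 split, the sample increment of 100, and the cap on retraining rounds. Only the 98.5% threshold comes from the text.

```python
import random


def split_dataset(vectors, first_pct, second_pct, seed=0):
    """Split feature vectors into a training set and a verification set by percentage."""
    if first_pct < 0 or second_pct < 0 or first_pct + second_pct > 100:
        raise ValueError("percentages must be non-negative and sum to at most 100")
    shuffled = list(vectors)
    random.Random(seed).shuffle(shuffled)
    n_train = len(shuffled) * first_pct // 100
    n_verify = len(shuffled) * second_pct // 100
    return shuffled[:n_train], shuffled[n_train:n_train + n_verify]


def train_until_accurate(load_vectors, train, evaluate, n_samples,
                         threshold=0.985, step=100, max_rounds=10):
    """Retrain with more samples until the verification accuracy exceeds the threshold."""
    for _ in range(max_rounds):
        training_set, verification_set = split_dataset(load_vectors(n_samples), 70, 30)
        model = train(training_set)                   # e.g. fit the Gaussian mixture models
        accuracy = evaluate(model, verification_set)  # fraction verified correctly
        if accuracy > threshold:
            return model, accuracy, n_samples
        n_samples += step                             # step 5: add samples and retrain
    raise RuntimeError("accuracy threshold not reached")
```

Because the two percentages may sum to less than 100%, `split_dataset` simply leaves the remaining vectors unused.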

The background channel model pre-trained in this embodiment is obtained by mining and comparing a large amount of voice data. While retaining the voiceprint features of the user to the greatest extent, the model can accurately characterize the background voiceprint features present when the user speaks. These background features can be removed at identification time so that the inherent features of the user's voice are extracted, which can greatly improve the accuracy and efficiency of user identity verification.

In an embodiment, the step of calculating the distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector and generating the identity verification result based on the calculated distance includes:

calculating the cosine distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector, the two operands of the distance being the standard voiceprint discrimination vector and the current voiceprint discrimination vector;

if the cosine distance is less than or equal to a preset distance threshold, generating information indicating that the verification passes; if the cosine distance is greater than the preset distance threshold, generating information indicating that the verification fails.

When the standard voiceprint discrimination vector of a user is stored, it may carry the user identity. When the identity of the user is verified, the corresponding standard voiceprint discrimination vector is obtained by matching the user identity associated with the current voiceprint discrimination vector, and the cosine distance between the current voiceprint discrimination vector and the matched standard voiceprint discrimination vector is calculated. Using the cosine distance to verify the identity of the target user improves the accuracy of the authentication.
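The lookup and the threshold decision can be sketched as follows. The sketch assumes the common definition of cosine distance as one minus the cosine similarity, that is, 1 - (x·y)/(|x||y|), which agrees with the rule that a smaller distance means a closer match. The in-memory dictionary, the example vector, the threshold of 0.3, and the returned strings are hypothetical.

```python
import math


def cosine_distance(x, y):
    """1 - (x . y) / (|x| * |y|); 0 for identical directions, 2 for opposite ones."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    if norm_x == 0.0 or norm_y == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / (norm_x * norm_y)


# Hypothetical mapping from user identity to standard voiceprint discrimination vector.
STANDARD_VECTORS = {"user-001": [0.2, 0.9, -0.4]}


def verify_identity(user_id, current_vector, threshold=0.3):
    """Look up the standard vector by user identity and apply the distance threshold."""
    standard = STANDARD_VECTORS.get(user_id)
    if standard is None or len(standard) != len(current_vector):
        return "verification failed"
    distance = cosine_distance(current_vector, standard)
    return "verification passed" if distance <= threshold else "verification failed"
```

Treating an unknown user identity or a vector of the wrong length as a failed verification is a choice of the example rather than something stated in the text.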

Compared with the prior art, the present application performs voiceprint verification with an architecture composed of a client computer, a server, and a handheld terminal. The client computer sends a request carrying the user identity to the server, and the server generates the graphic code parameter corresponding to the user identity and sends it to the client computer, which displays the graphic code corresponding to the graphic code parameter. The user scans the graphic code with the handheld terminal he or she carries, and the handheld terminal then sends the random key to the server through the link address for verification. After this verification passes, a channel can be established with the server, through which the server obtains the voice data of the user collected by the handheld terminal and performs voiceprint verification. The application does not require a specially developed client program to collect the user's voice data, and voiceprint verification through the handheld terminal is highly flexible and not easily disturbed. The user identity binds the server to the client computer, and the random key then binds the client computer, the server, and the handheld terminal together, which avoids sound hijacking and improves the authenticity and security of the voiceprint verification.

FIG. 2 is a schematic flowchart of an embodiment of the voiceprint verification method of the present application. The voiceprint verification method includes the following steps:

Step S1: after receiving an identity verification request that is sent by the client computer and carries a user identity identifier, the server generates a graphic code parameter of the graphic code corresponding to the user identity identifier and sends the graphic code parameter to the client computer, so that the client computer generates and displays the graphic code corresponding to the graphic code parameter, where the graphic code parameter comprises a random key and a voiceprint data collection link address;

Step S2: after the handheld terminal parses the graphic code to obtain the random key and the voiceprint data collection link address, the server receives a voiceprint verification request that is sent by the handheld terminal through the voiceprint data collection link address and carries the random key, and analyzes whether the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal;

Step S3: if so, the server establishes a voice data collection channel with the handheld terminal and, based on the voice data collection channel, acquires the user's current voiceprint verification voice data collected by the handheld terminal;

Step S4: the server constructs a current voiceprint discrimination vector corresponding to the current voiceprint verification voice data, determines the standard voiceprint discrimination vector corresponding to the user identity identifier according to a predetermined mapping relationship between user identity identifiers and standard voiceprint discrimination vectors, calculates the distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector, generates an identity verification result based on the calculated distance, and sends the identity verification result to the client computer.
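Steps S1 and S2 can be illustrated with a minimal Python sketch. It assumes an in-memory dictionary as the store of issued keys, a hypothetical link address `https://server.example/voiceprint/collect`, and hypothetical function names; none of these details come from the application itself.

```python
import secrets
from typing import Optional

# Hypothetical address; the application only speaks of a
# "voiceprint data collection link address".
COLLECT_URL = "https://server.example/voiceprint/collect"

# Assumed in-memory store: issued random key -> user identity identifier.
issued_keys = {}


def generate_graphic_code_parameter(user_id: str) -> dict:
    """Step S1: build the parameter that the client computer renders as a graphic code."""
    random_key = secrets.token_urlsafe(16)
    issued_keys[random_key] = user_id
    return {"random_key": random_key, "link": COLLECT_URL}


def match_random_key(received_key: str) -> Optional[str]:
    """Step S2: return the bound user identity identifier if the key was issued, else None."""
    return issued_keys.get(received_key)
```

In this sketch a successful match is the condition under which the server would go on to open the voice data collection channel of step S3.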

The application does not require a specially developed client program to collect the user's voice data, and recording the voice for verification with the handheld terminal is highly flexible and not easily interfered with. The user identity identifier binds the server to the client computer, and the random key then binds the client computer, the server, and the handheld terminal together, which avoids sound hijacking and improves the authenticity and security of the voiceprint verification.

The serial numbers of the embodiments of the present application are for description only and do not indicate the relative merits of the embodiments.

Through the description of the above embodiments, those skilled in the art can clearly understand that the methods of the foregoing embodiments can be implemented by means of software plus a necessary general-purpose hardware platform, and of course also by hardware, although in many cases the former is the better implementation. Based on this understanding, the technical solution of the present application, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product that is stored in a storage medium (such as a ROM/RAM, a magnetic disk, or an optical disc) and includes a number of instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, a network device, or the like) to perform the methods described in the embodiments of the present application.

The above are only preferred embodiments of the present application and do not thereby limit its patent scope. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present application, or any direct or indirect application in other related technical fields, is likewise included within the scope of patent protection of the present application.

Claims

A server, comprising a memory and a processor coupled to the memory, the memory storing a processing system operable on the processor, wherein the processing system, when executed by the processor, implements the following steps:

a generating step: after receiving an identity verification request that is sent by a client computer and carries a user identity identifier, generating a graphic code parameter of the graphic code corresponding to the user identity identifier, and sending the graphic code parameter to the client computer so that the client computer generates and displays the graphic code corresponding to the graphic code parameter, where the graphic code parameter includes a random key and a voiceprint data collection link address;

an analyzing step: after a handheld terminal parses the graphic code to obtain the random key and the voiceprint data collection link address, receiving a voiceprint verification request that is sent by the handheld terminal through the voiceprint data collection link address and carries the random key, and analyzing whether the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal;

an obtaining step: if so, establishing a voice data collection channel with the handheld terminal, and acquiring, based on the voice data collection channel, the user's current voiceprint verification voice data collected by the handheld terminal;

a verification step: constructing a current voiceprint discrimination vector corresponding to the current voiceprint verification voice data, determining the standard voiceprint discrimination vector corresponding to the user identity identifier according to a predetermined mapping relationship between user identity identifiers and standard voiceprint discrimination vectors, calculating the distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector, generating an identity verification result based on the calculated distance, and sending the identity verification result to the client computer.
The server according to claim 1, wherein the analyzing step comprises:

receiving, by the server, the voiceprint verification request that is sent by the handheld terminal through the voiceprint data collection link address and carries the random key, and analyzing whether the number of times the random key has been received is greater than a preset number of times;

if the number of times is less than or equal to the preset number of times, analyzing whether the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal.
The server according to claim 1, wherein the graphic code parameter further comprises a valid time of the graphic code, and the analyzing step comprises:

receiving, by the server, the voiceprint verification request that is sent by the handheld terminal through the voiceprint data collection link address and carries the random key, and analyzing whether the time at which the random key is received is within the valid time range of the graphic code;

if it is within the valid time range of the graphic code, analyzing whether the number of times the random key has been received is greater than a preset number of times;

if the number of times is less than or equal to the preset number of times, analyzing whether the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal.
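As an illustration only, and not as part of the claims, the order of checks described above (valid time first, then receipt count, then key comparison) might be sketched in Python as follows. The class name, field names, the 60-second validity period, and the limit of 3 receipts are all assumptions chosen for the example.

```python
import hmac
import time
from dataclasses import dataclass
from typing import Optional


@dataclass
class IssuedCode:
    random_key: str
    issued_at: float              # time at which the graphic code parameter was generated
    valid_seconds: float = 60.0   # assumed valid time of the graphic code
    max_receipts: int = 3         # assumed preset number of times
    receipts: int = 0             # how many times a key has been received so far


def check_request(code: IssuedCode, received_key: str, now: Optional[float] = None) -> bool:
    """Return True only if the time, count, and key checks all pass, in that order."""
    now = time.time() if now is None else now
    # 1. Is the key received within the valid time range of the graphic code?
    if not (code.issued_at <= now <= code.issued_at + code.valid_seconds):
        return False
    # 2. Has the key been received more than the preset number of times?
    code.receipts += 1
    if code.receipts > code.max_receipts:
        return False
    # 3. Is the received key consistent with the issued one (constant-time comparison)?
    return hmac.compare_digest(code.random_key.encode(), received_key.encode())
```

Passing `now` explicitly keeps the sketch easy to exercise without waiting for real time to elapse.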
The server according to claim 1, wherein the step of constructing the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data comprises:

processing the current voiceprint verification voice data to extract voiceprint features of a preset type, and constructing a corresponding voiceprint feature vector based on the voiceprint features of the preset type;

inputting the voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data;

and wherein the step of calculating the distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector and generating the identity verification result based on the calculated distance comprises:

calculating the cosine distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector, the two operands of the cosine distance being the standard voiceprint discrimination vector and the current voiceprint discrimination vector;

if the cosine distance is less than or equal to a preset distance threshold, generating information that the verification passes;

if the cosine distance is greater than the preset distance threshold, generating information that the verification fails.
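As an illustration only, and not as part of the claims, the threshold decision can be sketched in Python. The distance formula itself is not reproduced in this text, so the sketch assumes the conventional definition of cosine distance (one minus the cosine similarity), which is consistent with a smaller distance meaning a closer match. The function names and the threshold value of 0.3 are hypothetical.

```python
import math
from typing import Sequence


def cosine_distance(x: Sequence[float], y: Sequence[float]) -> float:
    """Assumed definition: 1 - (x . y) / (|x| * |y|)."""
    dot = sum(a * b for a, b in zip(x, y))
    norm = math.sqrt(sum(a * a for a in x)) * math.sqrt(sum(b * b for b in y))
    if norm == 0.0:
        raise ValueError("cosine distance is undefined for a zero vector")
    return 1.0 - dot / norm


def verify(current: Sequence[float], standard: Sequence[float], threshold: float = 0.3) -> bool:
    """Pass when the distance is less than or equal to the preset threshold."""
    return cosine_distance(current, standard) <= threshold
```

Under this assumed definition, vectors pointing in the same direction have a distance of 0 and orthogonal vectors have a distance of 1.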
The server according to claim 2, wherein the step of constructing the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data comprises:

processing the current voiceprint verification voice data to extract voiceprint features of a preset type, and constructing a corresponding voiceprint feature vector based on the voiceprint features of the preset type;

inputting the voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data;

and wherein the step of calculating the distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector and generating the identity verification result based on the calculated distance comprises:

calculating the cosine distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector, the two operands of the cosine distance being the standard voiceprint discrimination vector and the current voiceprint discrimination vector;

if the cosine distance is less than or equal to a preset distance threshold, generating information that the verification passes;

if the cosine distance is greater than the preset distance threshold, generating information that the verification fails.
The server according to claim 3, wherein the step of constructing the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data comprises:

processing the current voiceprint verification voice data to extract voiceprint features of a preset type, and constructing a corresponding voiceprint feature vector based on the voiceprint features of the preset type;

inputting the voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data;

and wherein the step of calculating the distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector and generating the identity verification result based on the calculated distance comprises:

calculating the cosine distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector, the two operands of the cosine distance being the standard voiceprint discrimination vector and the current voiceprint discrimination vector;

if the cosine distance is less than or equal to a preset distance threshold, generating information that the verification passes;

if the cosine distance is greater than the preset distance threshold, generating information that the verification fails.
The server according to claim 4, 5 or 6, wherein the step of processing the current voiceprint verification voice data to extract voiceprint features of a preset type and constructing a corresponding voiceprint feature vector based on the voiceprint features of the preset type comprises:

performing pre-emphasis, framing, and windowing on the current voiceprint verification voice data, performing a Fourier transform on each window to obtain a corresponding spectrum, and inputting the spectrum into a Mel filter to output a Mel spectrum;

performing cepstrum analysis on the Mel spectrum to obtain Mel frequency cepstral coefficients (MFCC), and forming the corresponding voiceprint feature vector based on the MFCC.
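As an illustration only, and not as part of the claims, the pipeline of pre-emphasis, framing, windowing, Fourier transform, Mel filtering, and cepstrum analysis can be sketched in plain Python. Everything numeric here is an assumption made for the example: the 0.97 pre-emphasis coefficient, the Hamming window, the frame and hop sizes, the filter and coefficient counts, and the common `2595 * log10(1 + f / 700)` Mel mapping. A naive DFT is used for readability, where a practical implementation would use an FFT library.

```python
import cmath
import math


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_fft, sample_rate):
    """Triangular filters whose centres are evenly spaced on the Mel scale."""
    n_bins = n_fft // 2 + 1
    top = hz_to_mel(sample_rate / 2.0)
    hz = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    pts = [int(math.floor((n_fft + 1) * f / sample_rate)) for f in hz]
    bank = [[0.0] * n_bins for _ in range(n_filters)]
    for i in range(1, n_filters + 1):
        lo, mid, hi = pts[i - 1], pts[i], pts[i + 1]
        for k in range(lo, mid):          # rising edge
            bank[i - 1][k] = (k - lo) / (mid - lo)
        for k in range(mid, hi):          # falling edge
            bank[i - 1][k] = (hi - k) / (hi - mid)
    return bank


def mfcc(signal, sample_rate, frame_len=64, hop=32, n_filters=10, n_ceps=5, alpha=0.97):
    """Return one list of n_ceps coefficients per frame."""
    # Pre-emphasis: boost high frequencies.
    emph = [signal[0]] + [signal[n] - alpha * signal[n - 1] for n in range(1, len(signal))]
    # Hamming window applied to every frame.
    window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (frame_len - 1)) for n in range(frame_len)]
    bank = mel_filterbank(n_filters, frame_len, sample_rate)
    n_bins = frame_len // 2 + 1
    features = []
    for start in range(0, len(emph) - frame_len + 1, hop):   # framing
        frame = [emph[start + n] * window[n] for n in range(frame_len)]
        # Fourier transform of the windowed frame (naive DFT), as a power spectrum.
        power = []
        for k in range(n_bins):
            s = sum(frame[n] * cmath.exp(-2j * math.pi * k * n / frame_len)
                    for n in range(frame_len))
            power.append(abs(s) ** 2 / frame_len)
        # Mel spectrum: log energy in each triangular filter.
        log_mel = [math.log(sum(w * p for w, p in zip(filt, power)) + 1e-10) for filt in bank]
        # Cepstrum analysis: DCT-II of the log Mel energies.
        ceps = [sum(log_mel[m] * math.cos(math.pi * c * (m + 0.5) / n_filters)
                    for m in range(n_filters)) for c in range(n_ceps)]
        features.append(ceps)
    return features
```

The per-frame coefficient lists returned by `mfcc` play the role of the voiceprint feature vector in this sketch; how the frames are combined before being fed to the background channel model is not specified here.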
A method for voiceprint verification, comprising:

S1: after receiving an identity verification request that is sent by a client computer and carries a user identity identifier, a server generates a graphic code parameter of the graphic code corresponding to the user identity identifier and sends the graphic code parameter to the client computer, so that the client computer generates and displays the graphic code corresponding to the graphic code parameter, where the graphic code parameter includes a random key and a voiceprint data collection link address;

S2: after a handheld terminal parses the graphic code to obtain the random key and the voiceprint data collection link address, the server receives a voiceprint verification request that is sent by the handheld terminal through the voiceprint data collection link address and carries the random key, and analyzes whether the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal;

S3: if so, the server establishes a voice data collection channel with the handheld terminal and, based on the voice data collection channel, acquires the user's current voiceprint verification voice data collected by the handheld terminal;

S4: the server constructs a current voiceprint discrimination vector corresponding to the current voiceprint verification voice data, determines the standard voiceprint discrimination vector corresponding to the user identity identifier according to a predetermined mapping relationship between user identity identifiers and standard voiceprint discrimination vectors, calculates the distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector, generates an identity verification result based on the calculated distance, and sends the identity verification result to the client computer.
The method for voiceprint verification according to claim 8, wherein step S2 comprises:

receiving, by the server, the voiceprint verification request that is sent by the handheld terminal through the voiceprint data collection link address and carries the random key, and analyzing whether the number of times the random key has been received is greater than a preset number of times;

if the number of times is less than or equal to the preset number of times, analyzing whether the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal.
The method for voiceprint verification according to claim 8, wherein the graphic code parameter further comprises a valid time of the graphic code, and step S2 comprises:

receiving, by the server, the voiceprint verification request that is sent by the handheld terminal through the voiceprint data collection link address and carries the random key, and analyzing whether the time at which the random key is received is within the valid time range of the graphic code;

if it is within the valid time range of the graphic code, analyzing whether the number of times the random key has been received is greater than a preset number of times;

if the number of times is less than or equal to the preset number of times, analyzing whether the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal.
The method for voiceprint verification according to claim 8, wherein the step of constructing the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data comprises:

processing the current voiceprint verification voice data to extract voiceprint features of a preset type, and constructing a corresponding voiceprint feature vector based on the voiceprint features of the preset type;

inputting the voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data;

and wherein the step of calculating the distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector and generating the identity verification result based on the calculated distance comprises:

calculating the cosine distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector, the two operands of the cosine distance being the standard voiceprint discrimination vector and the current voiceprint discrimination vector;

if the cosine distance is less than or equal to a preset distance threshold, generating information that the verification passes;

if the cosine distance is greater than the preset distance threshold, generating information that the verification fails.
The method for voiceprint verification according to claim 9, wherein the step of constructing the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data comprises:

processing the current voiceprint verification voice data to extract voiceprint features of a preset type, and constructing a corresponding voiceprint feature vector based on the voiceprint features of the preset type;

inputting the voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data;

and wherein the step of calculating the distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector and generating the identity verification result based on the calculated distance comprises:

calculating the cosine distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector, the two operands of the cosine distance being the standard voiceprint discrimination vector and the current voiceprint discrimination vector;

if the cosine distance is less than or equal to a preset distance threshold, generating information that the verification passes;

if the cosine distance is greater than the preset distance threshold, generating information that the verification fails.
The method for voiceprint verification according to claim 10, wherein the step of constructing the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data comprises:

processing the current voiceprint verification voice data to extract voiceprint features of a preset type, and constructing a corresponding voiceprint feature vector based on the voiceprint features of the preset type;

inputting the voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data;

and wherein the step of calculating the distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector and generating the identity verification result based on the calculated distance comprises:

calculating the cosine distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector, the two operands of the cosine distance being the standard voiceprint discrimination vector and the current voiceprint discrimination vector;

if the cosine distance is less than or equal to a preset distance threshold, generating information that the verification passes;

if the cosine distance is greater than the preset distance threshold, generating information that the verification fails.
The method for voiceprint verification according to claim 11, 12 or 13, wherein the step of processing the current voiceprint verification voice data to extract voiceprint features of a preset type and constructing a corresponding voiceprint feature vector based on the voiceprint features of the preset type comprises:

performing pre-emphasis, framing, and windowing on the current voiceprint verification voice data, performing a Fourier transform on each window to obtain a corresponding spectrum, and inputting the spectrum into a Mel filter to output a Mel spectrum;

performing cepstrum analysis on the Mel spectrum to obtain Mel frequency cepstral coefficients (MFCC), and forming the corresponding voiceprint feature vector based on the MFCC.
A computer readable storage medium, wherein the computer readable storage medium stores a processing system which, when executed by a processor, implements the following steps:

a generating step: after receiving an identity verification request that is sent by a client computer and carries a user identity identifier, generating a graphic code parameter of the graphic code corresponding to the user identity identifier, and sending the graphic code parameter to the client computer so that the client computer generates and displays the graphic code corresponding to the graphic code parameter, where the graphic code parameter includes a random key and a voiceprint data collection link address;

an analyzing step: after a handheld terminal parses the graphic code to obtain the random key and the voiceprint data collection link address, receiving a voiceprint verification request that is sent by the handheld terminal through the voiceprint data collection link address and carries the random key, and analyzing whether the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal;

an obtaining step: if so, establishing a voice data collection channel with the handheld terminal, and acquiring, based on the voice data collection channel, the user's current voiceprint verification voice data collected by the handheld terminal;

a verification step: constructing a current voiceprint discrimination vector corresponding to the current voiceprint verification voice data, determining the standard voiceprint discrimination vector corresponding to the user identity identifier according to a predetermined mapping relationship between user identity identifiers and standard voiceprint discrimination vectors, calculating the distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector, generating an identity verification result based on the calculated distance, and sending the identity verification result to the client computer.
The computer readable storage medium according to claim 15, wherein the analyzing step comprises:

receiving, by the server, the voiceprint verification request that is sent by the handheld terminal through the voiceprint data collection link address and carries the random key, and analyzing whether the number of times the random key has been received is greater than a preset number of times;

if the number of times is less than or equal to the preset number of times, analyzing whether the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal.
The computer readable storage medium according to claim 15, wherein the graphic code parameter further comprises a valid time of the graphic code, and the analyzing step comprises:

receiving, by the server, the voiceprint verification request that is sent by the handheld terminal through the voiceprint data collection link address and carries the random key, and analyzing whether the time at which the random key is received is within the valid time range of the graphic code;

if it is within the valid time range of the graphic code, analyzing whether the number of times the random key has been received is greater than a preset number of times;

if the number of times is less than or equal to the preset number of times, analyzing whether the random key in the graphic code parameter sent to the client computer is consistent with the random key received from the handheld terminal.
The computer readable storage medium according to claim 15, wherein the step of constructing the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data comprises:

processing the current voiceprint verification voice data to extract voiceprint features of a preset type, and constructing a corresponding voiceprint feature vector based on the voiceprint features of the preset type;

inputting the voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data;

and wherein the step of calculating the distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector and generating the identity verification result based on the calculated distance comprises:

calculating the cosine distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector, the two operands of the cosine distance being the standard voiceprint discrimination vector and the current voiceprint discrimination vector;

if the cosine distance is less than or equal to a preset distance threshold, generating information that the verification passes;

if the cosine distance is greater than the preset distance threshold, generating information that the verification fails.
The computer readable storage medium according to claim 16, wherein the step of constructing the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data comprises:

processing the current voiceprint verification voice data to extract voiceprint features of a preset type, and constructing a corresponding voiceprint feature vector based on the voiceprint features of the preset type;

inputting the voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data;

and wherein the step of calculating the distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector and generating the identity verification result based on the calculated distance comprises:

calculating the cosine distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector, the two operands of the cosine distance being the standard voiceprint discrimination vector and the current voiceprint discrimination vector;

if the cosine distance is less than or equal to a preset distance threshold, generating information that the verification passes;

if the cosine distance is greater than the preset distance threshold, generating information that the verification fails.
The computer readable storage medium according to claim 17, wherein the step of constructing the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data comprises:

processing the current voiceprint verification voice data to extract voiceprint features of a preset type, and constructing a corresponding voiceprint feature vector based on the voiceprint features of the preset type;

inputting the voiceprint feature vector into a pre-trained background channel model to construct the current voiceprint discrimination vector corresponding to the current voiceprint verification voice data;

and wherein the step of calculating the distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector and generating the identity verification result based on the calculated distance comprises:

calculating the cosine distance between the current voiceprint discrimination vector and the standard voiceprint discrimination vector, the two operands of the cosine distance being the standard voiceprint discrimination vector and the current voiceprint discrimination vector;

if the cosine distance is less than or equal to a preset distance threshold, generating information that the verification passes;

if the cosine distance is greater than the preset distance threshold, generating information that the verification fails.