CN111883140B - Authentication method, device, equipment and medium based on knowledge graph and voiceprint recognition - Google Patents

Authentication method, device, equipment and medium based on knowledge graph and voiceprint recognition

Info

Publication number
CN111883140B
CN111883140B (application CN202010723015.6A)
Authority
CN
China
Prior art keywords
reply
user
result
voiceprint
authentication
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010723015.6A
Other languages
Chinese (zh)
Other versions
CN111883140A (en)
Inventor
邹芳
李俊蓉
李沛恒
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Life Insurance Company of China Ltd
Original Assignee
Ping An Life Insurance Company of China Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Life Insurance Company of China Ltd filed Critical Ping An Life Insurance Company of China Ltd
Priority to CN202010723015.6A priority Critical patent/CN111883140B/en
Publication of CN111883140A publication Critical patent/CN111883140A/en
Application granted granted Critical
Publication of CN111883140B publication Critical patent/CN111883140B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L15/00Speech recognition
    • G10L15/08Speech classification or search
    • G10L15/18Speech classification or search using natural language modelling
    • G10L15/1822Parsing for meaning understanding
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification
    • G10L17/02Preprocessing operations, e.g. segment selection; Pattern representation or modelling, e.g. based on linear discriminant analysis [LDA] or principal components; Feature selection or extraction
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification
    • G10L17/18Artificial neural networks; Connectionist approaches
    • GPHYSICS
    • G10MUSICAL INSTRUMENTS; ACOUSTICS
    • G10LSPEECH ANALYSIS OR SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING; SPEECH OR AUDIO CODING OR DECODING
    • G10L17/00Speaker identification or verification
    • G10L17/22Interactive procedures; Man-machine interfaces
    • G10L17/24Interactive procedures; Man-machine interfaces the user being prompted to utter a password or a predefined phrase
    • HELECTRICITY
    • H04ELECTRIC COMMUNICATION TECHNIQUE
    • H04LTRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L9/00Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols
    • H04L9/32Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials
    • H04L9/3226Cryptographic mechanisms or cryptographic arrangements for secret or secure communications; Network security protocols including means for verifying the identity or authority of a user of the system or for message authentication, e.g. authorization, entity authentication, data integrity or data verification, non-repudiation, key authentication or verification of credentials using a predetermined code, e.g. password, passphrase or PIN
    • H04L9/3231Biological data, e.g. fingerprint, voice or retina

Abstract

The invention relates to the field of artificial intelligence, and provides an authentication method, device, equipment and medium based on knowledge graph and voiceprint recognition, wherein the method comprises the following steps: acquiring user information in an identity verification instruction; acquiring user sample voice information and a user knowledge graph; generating a problem to be confirmed through a first authentication problem generation model; converting the problem to be confirmed into authentication problem voice through a voice conversion model; receiving reply voice information, and acquiring a voiceprint matching result output by a voiceprint recognition model according to the reply voice information and the user sample voice information; performing text recognition and intention recognition on the reply voice information through a reply recognition model to obtain a reply result, and then recognizing the matching degree between the problem to be confirmed and the reply result to obtain a reply comprehensive result; and determining an identity authentication result for the round of dialogue. The invention realizes double authentication and enhances the security of user information. The invention also relates to blockchain technology: the user knowledge graph can be stored in a blockchain.

Description

Authentication method, device, equipment and medium based on knowledge graph and voiceprint recognition
Technical Field
The invention relates to the field of artificial intelligence voice processing, in particular to an authentication method, device, equipment and medium based on knowledge graph and voiceprint recognition.
Background
Voiceprint recognition technology is currently widely used in authentication scenarios. However, because a person's voice is variable and easily influenced by factors such as physical condition, age and emotion, or by external factors (such as microphones, channels, environmental noise and overlapping speech), identity verification may fail, which greatly reduces user satisfaction. In addition, lawless persons may impersonate a user's voice by means such as acquaintance imitation attacks, replay attacks, speech synthesis attacks and voice conversion attacks to pass identity verification and perform illegal operations, thereby threatening the security of user information.
Disclosure of Invention
The invention provides an authentication method, an authentication device, computer equipment and a storage medium based on knowledge graph and voiceprint recognition. User sample voice information and a user knowledge graph of a call user are acquired to generate authentication problem voice; reply voice information is received, and a voiceprint matching result and a reply comprehensive result are obtained through voiceprint recognition, text recognition and intention recognition; the voiceprint matching result and the reply comprehensive result jointly confirm the identity authentication result in a round of dialogue, thereby achieving a double authentication effect, authenticating the identity of the call user more accurately and enhancing the security of user information.
An authentication method based on knowledge graph and voiceprint recognition comprises the following steps:
receiving an identity verification instruction of a call user, and acquiring user information in the identity verification instruction;
determining a user identification code of the call user according to the user information, and acquiring user sample voice information and a user knowledge graph associated with the user identification code; the user knowledge graph comprises graph nodes of a tree structure;
inputting the user knowledge graph into a first authentication problem generation model, and acquiring a to-be-confirmed problem which is generated by the first authentication problem generation model and contains a ternary tree structure; the problem to be confirmed is generated by the first authentication problem generation model according to the map node and a first node problem template;
inputting the to-be-confirmed problem into a preset voice conversion model, acquiring authentication problem voice obtained by converting the voice conversion model through a voice synthesis technology, and broadcasting the authentication problem voice to the call user;
receiving reply voice information of the call user aiming at the authentication problem voice, inputting the reply voice information and the user sample voice information into a voiceprint recognition model, and inputting the reply voice information and the problem to be confirmed into a reply recognition model;
extracting voiceprint characteristics in the reply voice information through the voiceprint recognition model, and obtaining a voiceprint matching result output by the voiceprint recognition model according to the voiceprint characteristics; the voiceprint matching result refers to a confidence value of the voiceprint feature matching the user sample voice information;
performing text recognition and intention recognition on the reply voice information through the reply recognition model to obtain a reply result, and recognizing the matching degree of the to-be-confirmed problem and the reply result through the reply recognition model to obtain a reply comprehensive result;
and determining an identity authentication result of the call user in the round of dialogue according to the voiceprint matching result and the answer comprehensive result.
An authentication device based on knowledge graph and voiceprint recognition, comprising:
the acquisition module is used for determining a user identification code of the call user according to the user information and acquiring user sample voice information and a user knowledge graph associated with the user identification code; the user knowledge graph comprises graph nodes of a tree structure;
the generation module is used for inputting the user knowledge graph into a first authentication problem generation model and acquiring a to-be-confirmed problem which is generated by the first authentication problem generation model and contains a ternary tree structure; the problem to be confirmed is generated by the first authentication problem generation model according to the map node and a first node problem template;
the conversion module is used for inputting the to-be-confirmed problem into a preset voice conversion model, acquiring authentication problem voice obtained by conversion of the voice conversion model through a voice synthesis technology, and broadcasting the authentication problem voice to the call user;
the input module is used for receiving the reply voice information of the call user aiming at the authentication problem voice, inputting the reply voice information and the user sample voice information into a voiceprint recognition model, and inputting the reply voice information and the problem to be confirmed into a reply recognition model;
the extraction module is used for extracting voiceprint characteristics in the reply voice information through the voiceprint recognition model and obtaining a voiceprint matching result output by the voiceprint recognition model according to the voiceprint characteristics; the voiceprint matching result refers to a confidence value of the voiceprint feature matching the user sample voice information;
the recognition module is used for carrying out text recognition and intention recognition on the reply voice information through the reply recognition model to obtain a reply result, and then recognizing the matching degree of the to-be-confirmed problem and the reply result through the reply recognition model to obtain a reply comprehensive result;
and the authentication module is used for determining an identity authentication result of the call user in the round of dialogue according to the voiceprint matching result and the answer comprehensive result.
A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, the processor implementing the steps of the above-described knowledge-graph-and voiceprint recognition based authentication method when the computer program is executed.
A computer readable storage medium storing a computer program which, when executed by a processor, implements the steps of the above-described authentication method based on knowledge-graph and voiceprint recognition.
According to the authentication method, the authentication device, the computer equipment and the storage medium based on the knowledge graph and the voiceprint recognition, which are provided by the invention, the user information in the authentication instruction is obtained by receiving the authentication instruction of the call user; determining a user identification code of the call user according to the user information, and acquiring user sample voice information and a user knowledge graph associated with the user identification code; the user knowledge graph comprises graph nodes of a tree structure; inputting the user knowledge graph into a first authentication problem generation model, and acquiring a to-be-confirmed problem which is generated by the first authentication problem generation model and contains a ternary tree structure; the problem to be confirmed is generated by the first authentication problem generation model according to the map node and a first node problem template; inputting the to-be-confirmed problem into a preset voice conversion model, acquiring authentication problem voice obtained by conversion of the voice conversion model through a voice synthesis technology, and broadcasting the authentication problem voice to the call user; receiving reply voice information of the call user aiming at the authentication problem voice, inputting the reply voice information and the user sample voice information into a voiceprint recognition model, and inputting the reply voice information and the problem to be confirmed into a reply recognition model; extracting voiceprint characteristics in the reply voice information through the voiceprint recognition model, and obtaining a voiceprint matching result output by the voiceprint recognition model according to the voiceprint characteristics; the voiceprint matching result refers to a confidence value of the voiceprint feature matching the user sample voice information; performing text recognition and intention recognition on the reply voice information through the reply recognition model to obtain a reply result, and recognizing the matching degree of the to-be-confirmed problem and the reply result through the reply recognition model to obtain a reply comprehensive result; and determining an identity authentication result of the call user in the round of dialogue according to the voiceprint matching result and the answer comprehensive result.
The invention acquires the user information in the identity verification instruction; determines a user identification code of the call user according to the user information, and acquires user sample voice information and a user knowledge graph associated with the user identification code; acquires a to-be-confirmed problem which is generated by a first authentication problem generation model according to the user knowledge graph and contains a ternary tree structure; converts the to-be-confirmed problem into authentication problem voice through a voice conversion model using a voice synthesis technology; receives reply voice information of the call user aiming at the authentication problem voice, inputs the reply voice information and the user sample voice information into a voiceprint recognition model, and inputs the reply voice information and the problem to be confirmed into a reply recognition model; extracts voiceprint characteristics in the reply voice information through the voiceprint recognition model and obtains a voiceprint matching result; performs text recognition and intention recognition on the reply voice information through the reply recognition model to obtain a reply result, and obtains a reply comprehensive result through the reply recognition model; and determines the identity authentication result of the call user in the round of dialogue according to the voiceprint matching result and the reply comprehensive result. In this way, the user sample voice information and the user knowledge graph of the call user are obtained, the to-be-confirmed problem is generated through the first authentication problem generation model, the authentication problem voice is obtained through conversion by the voice synthesis technology, the reply voice information is received, voiceprint recognition is carried out through the voiceprint recognition model, text recognition and intention recognition are carried out through the reply recognition model, the voiceprint matching result and the reply comprehensive result are obtained, and the identity authentication result in the round of dialogue is jointly confirmed through the voiceprint matching result and the reply comprehensive result, so that a double authentication effect is achieved, the identity of the call user is authenticated more accurately, and the security of user information is enhanced.
Drawings
In order to more clearly illustrate the technical solutions of the embodiments of the present invention, the drawings that are needed in the description of the embodiments of the present invention will be briefly described below, it being obvious that the drawings in the following description are only some embodiments of the present invention, and that other drawings may be obtained according to these drawings without inventive effort for a person skilled in the art.
FIG. 1 is a schematic view of an application environment of an authentication method based on knowledge-graph and voiceprint recognition in an embodiment of the present invention;
FIG. 2 is a flow chart of an authentication method based on knowledge-graph and voiceprint recognition in an embodiment of the invention;
FIG. 3 is a flowchart of step S80 of an authentication method based on knowledge-graph and voiceprint recognition in one embodiment of the present invention;
FIG. 4 is a flowchart of step S20 of an authentication method based on knowledge-graph and voiceprint recognition in one embodiment of the present invention;
FIG. 5 is a flowchart of step S30 of an authentication method based on knowledge-graph and voiceprint recognition in one embodiment of the present invention;
FIG. 6 is a flowchart of step S60 of an authentication method based on knowledge-graph and voiceprint recognition in one embodiment of the present invention;
FIG. 7 is a flowchart of step S70 of an authentication method based on knowledge-graph and voiceprint recognition in one embodiment of the present invention;
FIG. 8 is a flowchart of step S90 of an authentication method based on knowledge-graph and voiceprint recognition in one embodiment of the present invention;
FIG. 9 is a schematic block diagram of an authentication device based on knowledge-graph and voiceprint recognition in an embodiment of the present invention;
FIG. 10 is a schematic diagram of a computer device in accordance with an embodiment of the invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and fully with reference to the accompanying drawings, in which it is evident that the embodiments described are some, but not all embodiments of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The authentication method based on knowledge graph and voiceprint recognition provided by the invention can be applied to an application environment as shown in fig. 1, wherein a client (computer equipment) communicates with a server through a network. Among them, clients (computer devices) include, but are not limited to, personal computers, notebook computers, smartphones, tablet computers, cameras, and portable wearable devices. The server may be implemented as a stand-alone server or as a server cluster composed of a plurality of servers.
In an embodiment, as shown in fig. 2, an authentication method based on knowledge graph and voiceprint recognition is provided, and the technical scheme mainly includes the following steps S10-S80:
s10, receiving an authentication instruction of a call user, and acquiring user information in the authentication instruction.
Understandably, the call user is a user who needs to perform identity verification and is in a call, the identity verification instruction is an instruction triggered by the call user needing to perform identity verification, and the user information is information related to the call user, such as an identity card number, a mobile phone number, and the like of the call user.
S20, determining a user identification code of the call user according to the user information, and acquiring user sample voice information and a user knowledge graph associated with the user identification code; the user knowledge graph comprises graph nodes of a tree structure.
Understandably, the user identification code is a unique code for identifying the call user and can be set according to requirements; the user sample voice information is voiceprint feature data extracted from a recording made by the call user according to sample voice content, and is associated with the user identification code; the user knowledge graph is a tree-structured knowledge graph associated with the user identification code, which is constructed in a triplet manner from all the graph nodes obtained by carrying out knowledge fusion and relation extraction on user data associated with the user identification code.
In one embodiment, as shown in fig. 4, before the step S20, that is, before the step of acquiring the user sample voice information and the user knowledge graph associated with the user identification code, the method includes:
s201, user data associated with the user identification code is acquired.
It will be appreciated that the user data includes structured data and unstructured data associated with the user identification code. The structured data is information such as numbers and symbols that can be represented with a unified data structure; the structured data has clear relationships that make it convenient to use, for example: credit card numbers, dates, financial amounts, telephone numbers, addresses, product names, etc. The unstructured data does not conform to any predefined model and is stored in a non-relational database; it may be text or non-text, or an artificial or machine-generated image or video, etc.
S202, converting the structured data in the user data to obtain first data, and simultaneously extracting text from unstructured data in the user data to obtain second data.
Understandably, the structured data is data logically expressed and realized through a two-dimensional table structure in a database on the acquisition server, and is mainly stored and managed through a relational database; knowledge such as entities, events and related attributes is acquired by converting the structured data according to preset rules, so that the first data is obtained. The unstructured data is obtained by removing the structured data from the user data, and is usually the content or comments of websites visited in association with the user identification code; the second data is obtained by extracting text from the unstructured data, where the text extraction refers to entity knowledge extraction, event extraction and attribute extraction from the unstructured data.
And S203, obtaining a map node by carrying out knowledge fusion and relation extraction on all the first data and all the second data, constructing a user knowledge map which is associated with the user identification code and contains the map node according to a triplet mode, and storing the user knowledge map in a blockchain.
It is understood that the knowledge fusion fuses together the same entities from different knowledge bases, that is, all the first data and all the second data are fused, or stacked, together; the relation extraction extracts specific event or fact information from natural language text and connects two entities according to that event or fact information, establishing a relation between them. The triplet manner follows RDF (Resource Description Framework) in the knowledge graph, such as (Zhang San, height, 185) and (Zhang San, occupation, teacher), and the user knowledge graph is stored in a blockchain.
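For illustration, the following minimal sketch shows how fused (entity, attribute, value) triples in the RDF style described above might be assembled into a tree-structured user knowledge graph rooted at the user identification code. The GraphNode class, the build_user_knowledge_graph function and the sample triples are hypothetical names introduced only for this sketch and are not the patent's actual implementation.

from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class GraphNode:
    # One graph node holding an (entity, attribute, value) triple.
    entity: str
    attribute: str
    value: str
    children: List["GraphNode"] = field(default_factory=list)


def build_user_knowledge_graph(user_id: str,
                               triples: List[Tuple[str, str, str]]) -> GraphNode:
    # Triples whose entity is the user identification code become first-layer
    # tree nodes; triples whose entity equals the value of an existing node
    # become that node's next-layer children.
    root = GraphNode(entity=user_id, attribute="root", value=user_id)
    index = {user_id: root}
    for entity, attribute, value in triples:
        parent = index.get(entity, root)
        node = GraphNode(entity, attribute, value)
        parent.children.append(node)
        index.setdefault(value, node)
    return root


# Sample triples in the (entity, attribute, value) form used in the text.
graph = build_user_knowledge_graph(
    "user_001",
    [("user_001", "height", "185"),
     ("user_001", "occupation", "teacher"),
     ("teacher", "occupation detail", "junior middle school teacher")],
)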
It should be emphasized that, to further ensure the privacy and security of the user knowledge graph, the user knowledge graph may also be stored in the nodes of the blockchain.
The blockchain is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralised database: a string of data blocks generated in association by cryptographic means, each data block containing a batch of network transaction information used to verify the validity of the information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like. The decentralised, fully distributed DNS service provided by the blockchain can realise the query and resolution of domain names through point-to-point data transmission among all nodes in the network, can be used to ensure that the operating system and firmware of important infrastructure are not tampered with, to monitor the state and integrity of software, to detect malicious tampering, and to ensure that transmitted data are not tampered with; storing the user knowledge graph in the blockchain ensures the privacy and security of the user knowledge graph.
According to the invention, the user knowledge graph of the call user is constructed so as to extract information that is important and frequently associated with the call user, which enhances the accuracy of subsequent recognition.
S30, inputting the user knowledge graph into a first authentication problem generation model, and acquiring a to-be-confirmed problem containing a ternary tree structure generated by the first authentication problem generation model; and the problem to be confirmed is generated by the first authentication problem generation model according to the map node and the first node problem template.
The first node problem template is a node problem template corresponding to the map node; a node problem template is a template generated according to the attributes of a map node and phrased as a question. The first authentication problem generation model extracts one map node from the user knowledge graph, determines the first node problem template according to that map node, and combines the node attribute in the map node with the first node problem template to generate the problem to be confirmed; the problem to be confirmed forms a ternary tree structure with three child nodes, and the node attribute is the content described in the triplet manner, such as (Zhang San, height, 185) or (Zhang San, occupation, teacher).
In an embodiment, as shown in fig. 5, in the step S30, that is, the obtaining the problem to be confirmed containing the ternary tree structure generated by the first authentication problem generation model includes:
s301, randomly acquiring a first-layer tree node from the user knowledge graph through the first authentication problem model; the user knowledge graph comprises a plurality of first-layer tree nodes.
Understandably, the first layer tree node is a graph node in the tree structure in the knowledge graph, which is next to the user identification code, that is, a graph node in the first layer in the user knowledge graph in the tree structure, wherein the tree structure in the knowledge graph includes multiple layers of graph nodes, and is divided into a first layer tree node, a second layer tree node and the like, and the root in the user knowledge graph is the user identification code.
S302, acquiring a node problem template corresponding to the node attribute in the first-layer tree node, and determining the acquired node problem template as the first node problem template.
Understandably, the node problem template is a template which is set up by extracting only one or several node attributes in tree nodes and is asked in a problem mode, and the acquired node problem template corresponding to the node attributes in the first-layer tree nodes is determined as the first node problem template through the node attribute matching corresponding to the node problem template.
S303, combining the node attribute with the first node problem template through the first authentication problem model to generate the problem to be confirmed; the question to be confirmed comprises three sub-nodes, wherein the sub-nodes are associated with one tree node in the user knowledge graph.
It can be understood that, according to the requirements of the first node problem template, the relevant node attributes in the first-layer tree node are extracted to form the attribute question of the problem to be confirmed, and the remaining node attributes serve as the attribute answer to the problem to be confirmed; the node attributes are associated with the second-layer tree nodes corresponding to the first-layer tree node in the user knowledge graph, the second-layer tree nodes being branches of the first-layer tree node, that is, in a master-slave relationship. The child nodes comprise a correct node (whose associated content is correct), an incorrect node (whose associated content is incorrect) and an unknown node (whose associated content is unknown); the attribute answer is associated with the correct node, the incorrect node is associated with content other than the attribute answer, and the unknown node is associated with unanswered or unknown content.
In this way, the invention generates identity authentication problems based on the knowledge graph, which allows identity authentication problems to be posed from multiple aspects and safeguards the security of user information.
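As a concrete illustration of the steps above, the sketch below combines a node attribute with a first node problem template and attaches the three child nodes (correct, incorrect, unknown) of the ternary tree structure. The template string, the QuestionToConfirm class and generate_question are illustrative assumptions, not the patent's actual implementation.

from dataclasses import dataclass
from typing import Dict


@dataclass
class QuestionToConfirm:
    # A problem to be confirmed together with its three child nodes.
    text: str
    answer: str
    children: Dict[str, str]  # keys: "correct", "incorrect", "unknown"


def generate_question(node_attribute: str, attribute_answer: str,
                      node_question_template: str) -> QuestionToConfirm:
    # Fill the node attribute into the first node problem template,
    # e.g. "Is your {attribute} {answer}?" -> "Is your occupation teacher?"
    text = node_question_template.format(attribute=node_attribute,
                                         answer=attribute_answer)
    # The three child nodes of the ternary tree structure: the correct node
    # holds the attribute answer, the incorrect node covers any other content,
    # and the unknown node covers unanswered or unknown content.
    children = {
        "correct": attribute_answer,
        "incorrect": "any content other than '%s'" % attribute_answer,
        "unknown": "no answer or answer unknown",
    }
    return QuestionToConfirm(text, attribute_answer, children)


question = generate_question("occupation", "teacher",
                             "Is your {attribute} {answer}?")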
S40, inputting the to-be-confirmed problem into a preset voice conversion model, acquiring authentication problem voice obtained by converting the voice conversion model through a voice synthesis technology, and broadcasting the authentication problem voice to the call user.
The voice conversion model is a trained deep convolutional neural network model, and the processing procedure of the voice synthesis technology can be set according to requirements. For example, the voice synthesis technology may perform text analysis on the input problem to be confirmed, take semantic, syntactic, part-of-speech and other information into consideration using a deep bidirectional long short-term memory network (abbreviated Bi-LSTM), and obtain the authentication problem voice through a vocoder, where the vocoder encodes the input information into sound.
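To make the Bi-LSTM part of the speech synthesis step concrete, here is a minimal PyTorch sketch of a bidirectional LSTM acoustic front end that maps linguistic features to acoustic frames; the layer sizes, the 80-dimensional output and the class name are assumptions, and the text-analysis stage and the vocoder mentioned above are not implemented here.

import torch
import torch.nn as nn


class BiLSTMAcousticModel(nn.Module):
    # Toy bidirectional LSTM front end: linguistic features -> acoustic frames.
    def __init__(self, in_dim: int = 128, hidden: int = 256, out_dim: int = 80):
        super().__init__()
        self.lstm = nn.LSTM(in_dim, hidden, num_layers=2,
                            batch_first=True, bidirectional=True)
        self.proj = nn.Linear(2 * hidden, out_dim)  # e.g. 80-dim mel frames

    def forward(self, linguistic_features: torch.Tensor) -> torch.Tensor:
        out, _ = self.lstm(linguistic_features)
        return self.proj(out)


# Shape check with dummy linguistic features (batch=1, 50 frames, 128 dims).
model = BiLSTMAcousticModel()
mel = model(torch.randn(1, 50, 128))  # (1, 50, 80); a vocoder would then
                                      # turn these frames into audio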
S50, receiving the reply voice information of the call user aiming at the authentication problem voice, inputting the reply voice information and the user sample voice information into a voiceprint recognition model, and inputting the reply voice information and the problem to be confirmed into a reply recognition model.
Understandably, after receiving the reply of the call user to the authentication problem voice, the reply voice information is recorded, and the reply voice information and the user sample voice information are input into the voiceprint recognition model together. The voiceprint recognition model is a trained deep convolutional neural network model, and its network structure may be set according to requirements; for example, it may adopt the network structure of a GMM (Gaussian mixture) model, a UBM (universal background) model, a GMM-UBM (Gaussian mixture-universal background) model, and so on.
In one embodiment, before the step S50, that is, before the step of inputting the reply voice information and the user sample voice information into the voiceprint recognition model, the method includes:
s501, acquiring a voice sample set; the set of speech samples comprises a plurality of speech samples, one of the speech samples being associated with one speech tag.
The voice sample set is understandably a set of voice samples, the voice samples are voice files of users collected in a history, one voice sample is associated with one voice tag, the voice tag is a voiceprint characteristic value corresponding to a user speaking the voice sample, and the voiceprint characteristic value can be information manually extracted from the voice sample.
S502, inputting the voice sample into a deep convolutional neural network model containing initial parameters.
Understandably, the deep convolutional neural network model contains the initial parameters.
S503, extracting the voiceprint features in the voice sample through the deep convolutional neural network model, and obtaining a sample identification result output by the deep convolutional neural network model.
Understandably, the voiceprint features are features related to the sonic spectrum of sound, the voiceprint features include tone quality, tone length, tone intensity, tone pitch, and the like, and the sample recognition result recognized by the deep convolutional neural network model according to the extracted voiceprint features is obtained.
S504, determining a loss value according to the sample recognition result and the voice tag.
It can be understood that the loss value is calculated from the sample recognition result and the voice tag using the loss function of the deep convolutional neural network model; the loss function can be set according to requirements, for example a cross-entropy loss function.
And S505, recording the depth convolution neural network model after convergence as a voiceprint recognition model when the loss value reaches a preset convergence condition.
It is understood that the convergence condition may be a condition that the loss value is smaller than a set threshold, that is, when the loss value is smaller than the set threshold, the deep convolutional neural network model after convergence is recorded as a voiceprint recognition model.
In an embodiment, after the step S504, that is, after the determining the loss value according to the sample recognition result and the voice tag, the method further includes:
and S506, when the loss value does not reach a preset convergence condition, iteratively updating initial parameters of the deep convolutional neural network model until the loss value reaches the preset convergence condition, and recording the converged deep convolutional neural network model as a voiceprint recognition model.
It can be understood that the convergence condition may also be that the loss value remains small and no longer decreases after 10000 further iterations; that is, when the loss value is small and does not decrease again after 10000 iterations, training is stopped, and the converged deep convolutional neural network model is recorded as the voiceprint recognition model.
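The two convergence conditions described in steps S504-S506 can be sketched as a training loop like the one below; the optimizer, learning rate, threshold and patience values, and the data loader are assumptions made only for illustration.

import torch
import torch.nn as nn


def train_voiceprint_model(model: nn.Module, loader, epochs: int = 100,
                           loss_threshold: float = 0.05,
                           patience: int = 10000) -> nn.Module:
    # Train until the loss falls below a threshold, or stops decreasing
    # for `patience` consecutive updates (the "no longer drops" condition).
    criterion = nn.CrossEntropyLoss()
    optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
    best_loss, steps_without_improvement = float("inf"), 0

    for _ in range(epochs):
        for samples, labels in loader:
            optimizer.zero_grad()
            loss = criterion(model(samples), labels)
            loss.backward()
            optimizer.step()  # iteratively update the model parameters

            if loss.item() < loss_threshold:
                return model  # converged: loss below the preset threshold
            if loss.item() < best_loss - 1e-6:
                best_loss, steps_without_improvement = loss.item(), 0
            else:
                steps_without_improvement += 1
                if steps_without_improvement >= patience:
                    return model  # converged: loss no longer decreasing
    return model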
S60, extracting voiceprint features in the reply voice information through the voiceprint recognition model, and obtaining a voiceprint matching result output by the voiceprint recognition model according to the voiceprint features; the voiceprint matching result refers to a confidence value of the voiceprint feature matching the user sample voice information.
Understandably, the voiceprint recognition model outputs a recognition result according to the voiceprint features extracted from the reply voice information; the voiceprint features are features related to the acoustic spectrum of sound and include tone quality, tone length, tone intensity, tone pitch and the like. The voiceprint recognition model compares and verifies the recognition result with the user sample voice information to obtain the confidence value after comparison and verification, where the confidence value indicates the probability that the recognition result matches the user sample voice information; the voiceprint matching result is then determined according to the confidence value, and characterizes the degree of voiceprint matching between the reply voice information and the user sample voice information.
In an embodiment, as shown in fig. 6, in step S60, that is, the extracting, by the voiceprint recognition model, the voiceprint feature in the reply voice information, obtaining a voiceprint matching result output by the voiceprint recognition model according to the voiceprint feature includes:
s601, acquiring a recognition result output by the voiceprint recognition model according to the extracted voiceprint features.
Understandably, the extracting process of the voiceprint recognition model includes preprocessing the reply voice information, convolving the preprocessed reply voice information according to the network structure of the voiceprint recognition model, extracting voiceprint features in the preprocessed reply voice information through convolution, and outputting voiceprint feature vectors corresponding to the voiceprint features, which is the recognition result, where the preprocessing can be set according to requirements, for example, the preprocessing includes VAD, denoising, reverberation removal, speaker separation, and the like.
S602, comparing and verifying the identification result with the user sample voice information to obtain the confidence value after comparison and verification.
Understandably, the matching method for comparison verification may be set according to requirements, for example a probability statistical matching method, a vector quantization matching method, a VQ clustering matching method, or the like; preferably, the matching method is a probability statistical matching method. The recognition result and the user sample voice information are compared and verified by the chosen matching method to obtain a probability value for the degree of matching between the recognition result and the user sample voice information, that is, the confidence value after comparison verification.
S603, determining the voiceprint matching result according to the confidence value, wherein the voiceprint matching result characterizes the voiceprint matching degree between the reply voice information and the user sample voice information.
Understandably, if the confidence value is greater than or equal to a preset confidence threshold, determining that the voiceprint matching result is a matching pass, and if the confidence value is less than the preset confidence threshold, determining that the voiceprint matching result is a matching fail, where the voiceprint matching result includes the confidence value.
According to the invention, the voiceprint features are extracted through the voiceprint recognition model and compared and verified with the user sample voice information to obtain the voiceprint matching result, which improves the accuracy of voiceprint recognition.
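A minimal sketch of the comparison-verification step is shown below, using cosine similarity between a voiceprint feature vector and the stored user sample voice information as the confidence value; the patent prefers a probability-statistical matching method, so the similarity measure and the 0.75 threshold here are only illustrative assumptions.

import numpy as np


def voiceprint_match(reply_embedding: np.ndarray,
                     sample_embedding: np.ndarray,
                     confidence_threshold: float = 0.75) -> dict:
    # Compare the recognition result (a voiceprint feature vector) with the
    # user sample voice information and return the voiceprint matching result.
    confidence = float(
        np.dot(reply_embedding, sample_embedding)
        / (np.linalg.norm(reply_embedding) * np.linalg.norm(sample_embedding))
    )
    return {
        "confidence": confidence,                      # the confidence value
        "passed": confidence >= confidence_threshold,  # matching pass / fail
    }


result = voiceprint_match(np.random.rand(256), np.random.rand(256))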
S70, carrying out text recognition and intention recognition on the reply voice information through the reply recognition model to obtain a reply result, and recognizing the matching degree of the to-be-confirmed problem and the reply result through the reply recognition model to obtain a reply comprehensive result.
Understandably, the text recognition is performed by a speech recognition technology and the intention recognition by a natural language understanding technology. The speech recognition technology, abbreviated ASR, analyzes the parameters of the speech features of an input speech file and recognizes a result according to those parameters; the natural language understanding technology, abbreviated NLU, extracts semantic features by performing lexical analysis, syntactic analysis and semantic analysis on the input speech file and finally analyzes a result according to the extracted semantic features. The reply recognition model comprises a reply text recognition model, a reply intention recognition model and a dialogue management module.
In an embodiment, as shown in fig. 7, in step S70, that is, the text recognition and the intention recognition are performed on the reply voice information through the reply recognition model to obtain a reply result, and then the matching degree value of the to-be-confirmed problem and the reply result is recognized through the reply recognition model to obtain a reply comprehensive result, which includes:
s701, carrying out text recognition on the reply voice information through the reply text recognition model according to a voice recognition technology to obtain a reply text result; the reply recognition model comprises a reply text recognition model, a reply intention recognition model and a dialog management module.
Understandably, the text recognition is performed by a voice recognition technology, where the voice recognition technology analyzes the parameters of the voice features of the input voice file and recognizes a result according to those parameters; the text content in the reply voice information is recognized through the reply text recognition model to obtain the reply text result, that is, the reply text result includes the Chinese content with which the call user in the reply voice information answers the authentication question voice.
S702, according to natural language understanding technology, intention recognition is carried out on the reply voice information through the reply intention recognition model, and a reply state result is obtained.
Understandably, according to the natural language understanding technology, semantic features in the reply voice information are extracted through the reply intention recognition model; the semantic features are the textual meanings in the reply voice information and include emotion, intonation and the like. Intention recognition is performed on the extracted semantic features to obtain the reply state result, which characterizes the emotional features of the call user when answering the authentication question voice, for example whether the reply voice information is delivered in an affirmative tone or contains impatient or bored emotion.
S703, determining the reply text result and the reply state result as the reply result through the dialogue management module.
Understandably, the reply text result and the reply status result are combined into the reply result by the dialog management module, where the dialog management module is a module for managing the results of multiple rounds of dialog, and the dialog management module includes a confirmation of the reply result.
S704, matching the reply text with the child node in the to-be-confirmed problem through the dialogue management module to obtain a matching degree value of the reply text and the to-be-confirmed problem.
It may be understood that the dialogue management module further obtains the content in the child nodes and matches the reply text with that content to obtain the matching degree value. The matching degree value may be calculated by a text similarity matching algorithm, or by a simple containment rule: if the reply text includes the content in the child node, the matching degree value is determined to be 100%, and if the reply text does not include the content in the child node, the matching degree value is determined to be 0%.
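Both ways of computing the matching degree value described above (a text-similarity score, or the simple containment rule giving 100% or 0%) can be sketched as follows; difflib's ratio is used here as one possible similarity algorithm and is an assumption, not the patent's choice.

from difflib import SequenceMatcher


def matching_degree(reply_text: str, child_node_content: str,
                    use_similarity: bool = False) -> float:
    # Containment rule: 1.0 if the reply text contains the child node content,
    # otherwise 0.0; or, optionally, a text-similarity score in [0, 1].
    if use_similarity:
        return SequenceMatcher(None, reply_text, child_node_content).ratio()
    return 1.0 if child_node_content in reply_text else 0.0


print(matching_degree("Yes, I am a teacher", "teacher"))        # 1.0
print(matching_degree("I work in a school", "teacher", True))   # similarity score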
And S705, obtaining the answer comprehensive result by the dialogue management module according to the matching degree value and the answer state result.
Understandably, the dialogue management module further evaluates the authenticity of the user identity from the reply state to the problem to be confirmed (including emotion and intonation) together with the voiceprint matching result. The reply comprehensive result includes the matching degree value, the reply state result and a confirmation result, where the confirmation result is one of correct, incorrect or unknown and indicates how the problem to be confirmed was answered.
According to the invention, the reply comprehensive result is obtained through the reply recognition model by combining the voice recognition technology, the natural language understanding technology and dialogue management, which improves recognition accuracy.
S80, according to the voiceprint matching result and the answer comprehensive result, determining an identity authentication result of the call user in the round of dialogue.
Understandably, the voiceprint matching result and the answer comprehensive result are input into a preset weighted verification model. The weighted verification model converts the voiceprint matching result into a measurable value and then performs weighted processing on it together with the matching degree value in the answer comprehensive result to obtain a verification probability value, which represents the probability that the call user passes verification. If the verification probability value is larger than a preset probability threshold, the identity authentication result of the call user in the current round of dialogue is confirmed as passing the current round of authentication; otherwise, it is confirmed as failing the current round of authentication. The identity authentication result comprises a current round authentication identifier and a current round authentication probability value, and the current round authentication identifier comprises current round authentication passed and current round authentication failed.
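The weighted verification model can be sketched as the simple fusion below; the 0.6/0.4 weights and the 0.7 probability threshold are illustrative assumptions, since the patent only states that the two measurable values are weighted and compared with a preset probability threshold.

def authenticate_round(voiceprint_confidence: float,
                       matching_degree_value: float,
                       w_voiceprint: float = 0.6,
                       w_reply: float = 0.4,
                       probability_threshold: float = 0.7) -> dict:
    # Weighted fusion of the voiceprint matching result and the matching
    # degree value from the reply comprehensive result for one round.
    verification_probability = (w_voiceprint * voiceprint_confidence
                                + w_reply * matching_degree_value)
    return {
        "round_authentication_probability": verification_probability,
        "round_authentication_passed":
            verification_probability > probability_threshold,
    }


print(authenticate_round(voiceprint_confidence=0.82, matching_degree_value=1.0))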
The invention acquires the user information in the identity verification instruction; determines a user identification code of the call user according to the user information, and acquires user sample voice information and a user knowledge graph associated with the user identification code; acquires a to-be-confirmed problem which is generated by a first authentication problem generation model according to the user knowledge graph and contains a ternary tree structure; converts the to-be-confirmed problem into authentication problem voice through a voice conversion model using a voice synthesis technology; receives reply voice information of the call user aiming at the authentication problem voice, inputs the reply voice information and the user sample voice information into a voiceprint recognition model, and inputs the reply voice information and the problem to be confirmed into a reply recognition model; extracts voiceprint characteristics in the reply voice information through the voiceprint recognition model and obtains a voiceprint matching result; performs text recognition and intention recognition on the reply voice information through the reply recognition model to obtain a reply result, and obtains a reply comprehensive result through the reply recognition model; and determines the identity authentication result of the call user in the round of dialogue according to the voiceprint matching result and the reply comprehensive result. In this way, the user sample voice information and the user knowledge graph of the call user are obtained, the to-be-confirmed problem is generated through the first authentication problem generation model, the authentication problem voice is obtained through conversion by the voice synthesis technology, the reply voice information is received, voiceprint recognition is carried out through the voiceprint recognition model, text recognition and intention recognition are carried out through the reply recognition model, the voiceprint matching result and the reply comprehensive result are obtained, and the identity authentication result in the round of dialogue is jointly confirmed through the voiceprint matching result and the reply comprehensive result, so that a double authentication effect is achieved, the identity of the call user is authenticated more accurately, and the security of user information is enhanced.
In one embodiment, as shown in fig. 3, after the step S80, that is, after the step of determining the identity authentication result of the current session according to the confidence value and the answer comprehensive result, the method includes:
s90, inputting the to-be-confirmed questions, the user knowledge graph and the reply result into a second authentication question generation model, and acquiring the next round of to-be-confirmed questions generated by the second authentication question generation model through a knowledge decision method.
The knowledge decision method includes: matching the reply text with each child node respectively, obtaining the tree node associated with the child node matched with the reply text, and obtaining the next-layer tree node under that same tree node in the user knowledge graph; the tree nodes are graph nodes with association relations in the user knowledge graph and comprise first-layer tree nodes and next-layer tree nodes.
The next round of questions to be confirmed are generated by the second authentication question generation model according to the next round of atlas nodes and a second node question template, and the next round of atlas nodes are the knowledge atlas nodes determined according to the user knowledge atlas, the questions to be confirmed and the reply result through the knowledge decision method.
In an embodiment, as shown in fig. 8, in step S90, that is, inputting the to-be-confirmed question, the user knowledge graph, and the reply result into a second authentication question generation model, obtaining a next round of to-be-confirmed question generated by the second authentication question generation model includes:
and S901, inquiring the child node matched with the reply comprehensive result, and acquiring a tree node associated with the child node.
It may be understood that the child node matched with the confirmation result in the reply comprehensive result is queried, that is, the child node whose content matches the confirmation result is queried, and then the tree node associated with that child node is acquired. In an embodiment: if the confirmation result is "unknown", the matched child node is the child node with "unknown" content, and the acquired tree node is any first-layer tree node other than the first-layer tree node corresponding to the problem to be confirmed; if the confirmation result is "incorrect", the matched child node is the child node with "incorrect" content, and the acquired tree node may be any first-layer tree node marked with a sensitive identifier, indicating that the call user needs to be authenticated with a more sensitive authentication problem; if the confirmation result is "correct", the matched child node is the child node with "correct" content, and the acquired tree node is the first-layer tree node corresponding to the problem to be confirmed, so that the next round descends to the second-layer tree nodes under it.
S902, searching a next-layer tree node which is the same as the tree node in the user knowledge graph, determining the searched next-layer tree node as the next-round graph node, acquiring node attributes in the next-round graph node, and determining the node attributes as the next-round node attributes.
Understandably, the tree node identical to the acquired tree node is found in the user knowledge graph and its next-layer tree node is determined; this next-layer tree node is used as the next-round graph node, the node attribute in the next-round graph node is obtained, and that node attribute is recorded as the next-round node attribute.
S903, acquiring the node problem template corresponding to the next round of node attribute, and determining the acquired node problem template as a second node problem template.
Understandably, the node question template is a template which is set up by extracting only one or several node attributes in tree nodes and is asked in a question mode, and the node question template corresponding to the next round of node attributes is determined to be the second node question template through matching the next round of node attributes with the corresponding node question template.
S904, combining the next round of node attributes with the second node problem template through the second authentication problem model to generate the next round of problems to be confirmed.
Understandably, the contents of the next round of node attributes are filled into the corresponding positions in the second node problem template, and the next round of problem to be confirmed is generated by the combination. For example: the problem to be confirmed is "Is your occupation teacher?"; if the call user replies "Yes, a teacher", the confirmation result in the reply comprehensive result is "correct", the next-layer tree node obtained through the user knowledge graph is (Zhang San, occupation, junior middle school teacher), and its corresponding second node problem template is "Is your occupation XXXX?", so the next round of problem to be confirmed is "Is your occupation junior middle school teacher?".
The invention thus queries the child node matched with the reply comprehensive result and acquires the tree node associated with the child node; searches the user knowledge graph for the next-layer tree node that is the same as the tree node, determines the found next-layer tree node as the next-round graph node, and acquires the next-round node attribute; acquires the node problem template corresponding to the next round of node attributes and determines it as the second node problem template; and combines the next round of node attributes with the second node problem template through the second authentication problem model to generate the next round of problems to be confirmed. In this way, based on the association relations of the tree nodes in the knowledge graph, the next round of problems to be confirmed is generated in a layer-by-layer progressive manner, the identity authentication of the call user is performed more accurately, and recognition accuracy and reliability are improved.
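By way of illustration only, the following Python sketch shows one possible reading of steps S901 to S904; the TreeNode fields, the NODE_QUESTION_TEMPLATES dictionary and the candidate-selection rules are assumptions introduced for this example and are not taken from the embodiment itself.

```python
from dataclasses import dataclass
from typing import Optional

@dataclass
class TreeNode:
    subject: str                      # e.g. "Zhang San"
    attribute: str                    # e.g. "occupation"
    value: str                        # e.g. "junior middle school teacher"
    layer: int                        # 1 = first-layer tree node, 2 = next layer, ...
    sensitive: bool = False           # marked with a sensitive identifier
    parent: Optional["TreeNode"] = None

# Hypothetical node question templates keyed by node attribute.
NODE_QUESTION_TEMPLATES = {
    "occupation": "Is your occupation {value}?",
    "birthplace": "Were you born in {value}?",
}

def next_round_question(confirmation: str, current_node: TreeNode, graph: list) -> Optional[str]:
    """S901-S904: pick the tree node associated with the matched child node,
    then fill the second node question template with its node attribute."""
    if confirmation == "correct":
        # Go one layer deeper under the node of the question just confirmed.
        candidates = [n for n in graph if n.parent is current_node]
    elif confirmation == "incorrect":
        # Switch to a first-layer tree node marked as sensitive.
        candidates = [n for n in graph if n.layer == 1 and n.sensitive]
    else:  # "unknown"
        # Any other first-layer tree node except the one just asked.
        candidates = [n for n in graph if n.layer == 1 and n is not current_node]
    if not candidates:
        return None
    node = candidates[0]                                    # next-round graph node (S902)
    template = NODE_QUESTION_TEMPLATES.get(                 # second node problem template (S903)
        node.attribute, "Is your {attribute} {value}?")
    return template.format(attribute=node.attribute, value=node.value)   # S904
```

Under these assumptions, a graph in which (Zhang San, occupation, teacher) is a first-layer node and (Zhang San, occupation, junior middle school teacher) is its child would, after a "correct" reply, produce the question "Is your occupation junior middle school teacher?", consistent with the example above.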
S100, inputting the reply state result in the reply comprehensive result and the next round of questions to be confirmed into a preset personalized voice conversion model, acquiring the next round of authentication questions converted by the personalized voice conversion model through a voice synthesis technology, and broadcasting the next round of authentication questions to the call user.
Understandably, the personalized voice conversion model is a trained neural network model. By extracting the reply state from the reply result, the model selects a speech script, a speaking speed and a style matched with the extracted reply state, fuses them into the next round of questions to be confirmed through a voice synthesis technology, and converts and synthesizes the next round of authentication problem voice; the next round of authentication problem voice is voice data personally designed for the call user and is broadcast to the call user.
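The embodiment does not disclose the internal structure of the personalized voice conversion model; the sketch below only illustrates the idea of mapping a recognized reply state to speech-script, speed and style parameters before synthesis. The REPLY_STATE_STYLES table, the state names and the synthesize() stub are all hypothetical.

```python
REPLY_STATE_STYLES = {
    "impatient": {"rate": 1.15, "tone": "concise"},
    "hesitant":  {"rate": 0.90, "tone": "reassuring"},
    "neutral":   {"rate": 1.00, "tone": "standard"},
}

def synthesize(text: str, rate: float, tone: str) -> bytes:
    # Placeholder for a real text-to-speech engine; returns fake audio bytes.
    return f"<audio rate={rate} tone={tone}>{text}</audio>".encode("utf-8")

def personalized_question_voice(reply_state: str, next_question: str) -> bytes:
    # Map the recognized reply state to style parameters, then fuse them into
    # the synthesized next round of authentication problem voice.
    style = REPLY_STATE_STYLES.get(reply_state, REPLY_STATE_STYLES["neutral"])
    return synthesize(next_question, style["rate"], style["tone"])
```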
S110, receiving the next round of reply voice information with which the call user answers the next round of authentication problem voice, inputting the next round of reply voice information and the user sample voice information into the voiceprint recognition model, and simultaneously inputting the next round of reply voice information and the next round of to-be-confirmed problem into the reply recognition model.
Understandably, the next round of reply voice information is collected in a recording mode and is the voice information with which the call user answers the next round of authentication problem voice; the next round of reply voice information and the user sample voice information are input into the voiceprint recognition model together, and the next round of reply voice information and the next round of questions to be confirmed are input into the reply recognition model.
S120, obtaining the next round of voiceprint matching result output by the voiceprint recognition model according to the voiceprint features extracted from the next round of reply voice information; the next round of voiceprint matching result refers to a confidence value of the voiceprint features in the next round of reply voice information matching the user sample voice information.
Understandably, the voiceprint recognition model extracts the voiceprint features in the next round of reply voice information and performs recognition according to the extracted voiceprint features to obtain the next round of voiceprint matching result, where the next round of voiceprint matching result refers to a confidence value of the voiceprint features in the next round of reply voice information matching the user sample voice information.
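As a rough illustration of how such a confidence value might be computed, the sketch below compares speaker embeddings with cosine similarity; the patent does not prescribe this measure, and extract_voiceprint() merely stands in for the trained voiceprint recognition model.

```python
import numpy as np

def extract_voiceprint(waveform: np.ndarray) -> np.ndarray:
    # Placeholder: a real voiceprint recognition model would map audio to a
    # fixed-length speaker embedding; here the waveform is simply resized.
    vec = np.resize(waveform.astype(float), 128)
    return vec / (np.linalg.norm(vec) + 1e-9)

def voiceprint_confidence(reply_audio: np.ndarray, sample_audio: np.ndarray) -> float:
    # Confidence value that the reply voiceprint matches the user sample voice.
    a = extract_voiceprint(reply_audio)
    b = extract_voiceprint(sample_audio)
    cosine = float(np.dot(a, b))          # both vectors are already unit-normalized
    return (cosine + 1.0) / 2.0           # map [-1, 1] to a [0, 1] confidence value
```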
S130, carrying out text recognition and intention recognition on the next round of reply voice information through the reply recognition model to obtain a next round of reply result, and recognizing the matching degree of the next round of to-be-confirmed problem and the next round of reply result through the reply recognition model to obtain a next round of reply comprehensive result.
Understandably, text recognition and intention recognition are performed on the next round of reply voice information through the reply recognition model, the next round of reply result is output, and the matching degree of the next round of to-be-confirmed problem and the next round of reply result is recognized through the reply recognition model, so that a next round of reply comprehensive result is obtained.
And S140, determining the identity authentication result of the next dialog according to the next voiceprint matching result and the next reply comprehensive result.
Understandably, by inputting the next round of voiceprint matching results and the next round of reply comprehensive results into the weighted verification model, the identity authentication result of the next round of dialogue is confirmed through the weighted verification model.
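A minimal sketch of this weighted fusion is given below; the weights and the pass threshold are invented for illustration, since the embodiment only states that the voiceprint matching result and the reply comprehensive result are combined by a weighted verification model.

```python
def round_authentication(voiceprint_confidence: float,
                         reply_match_degree: float,
                         w_voice: float = 0.6,
                         w_reply: float = 0.4,
                         threshold: float = 0.75) -> bool:
    # Weighted fusion of the two per-round scores into one pass/fail decision.
    score = w_voice * voiceprint_confidence + w_reply * reply_match_degree
    return score >= threshold
```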
S150, when the completion of the conversations of the preset rounds is detected, the identity authentication result of each round of conversations is obtained, and the final identity authentication result of the conversation user is determined according to all the identity authentication results; the final identity authentication result refers to the identity authentication result of the talking user determined through multiple rounds of conversations.
It is to be understood that, when completion of the preset rounds of dialogue is not detected, steps S90 to S140 are repeated until completion of the preset rounds of dialogue is detected. The preset rounds may be set according to requirements; for example, the preset rounds may be set to three rounds of dialogue, or it may be determined whether to add two further rounds of dialogue according to the identity authentication result of the current round of dialogue and the identity authentication result of the next round of dialogue. When completion of the preset rounds of dialogue is detected, whether the final identity authentication result of the call user passes is determined according to the identity authentication result obtained in each round of dialogue.
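The loop over the preset rounds could, under these assumptions, look like the sketch below; ask_round() stands in for one complete round of steps S90 to S140, and the majority rule at the end is an assumption, the embodiment leaving the exact aggregation of per-round results open.

```python
from typing import Callable, List

def authenticate_user(ask_round: Callable[[], bool], preset_rounds: int = 3) -> bool:
    # ask_round() performs one full round of dialogue and returns that round's
    # identity authentication result; the final result aggregates all rounds.
    round_results: List[bool] = [ask_round() for _ in range(preset_rounds)]
    return sum(round_results) > len(round_results) // 2   # simple majority rule
```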
Inputting the to-be-confirmed question, the user knowledge graph and the reply comprehensive result into a second authentication question generation model, and obtaining the next round of to-be-confirmed questions generated by the second authentication question generation model through a knowledge decision method; inputting the reply state result in the reply comprehensive result and the next round of questions to be confirmed into a preset personalized voice conversion model, acquiring the next round of authentication questions converted by the personalized voice conversion model through a voice synthesis technology, and broadcasting the next round of authentication questions to the call user; receiving the next round of reply voice information with which the call user answers the next round of authentication problem voice, inputting the next round of reply voice information and the user sample voice information into the voiceprint recognition model, and simultaneously inputting the next round of reply voice information and the next round of questions to be confirmed into the reply recognition model; acquiring the next round of voiceprint matching result output by the voiceprint recognition model according to the voiceprint features extracted from the next round of reply voice information; carrying out text recognition and intention recognition on the next round of reply voice information through the reply recognition model to obtain the next round of reply result, and then recognizing the matching degree of the next round of questions to be confirmed and the next round of reply result through the reply recognition model to obtain the next round of reply comprehensive result; determining the identity authentication result of the next round of dialogue according to the next round of voiceprint matching result and the next round of reply comprehensive result; when completion of the preset rounds of dialogue is detected, obtaining the identity authentication result of each round of dialogue, and determining the final identity authentication result of the call user according to all the identity authentication results; the final identity authentication result refers to the identity authentication result of the call user determined through multiple rounds of dialogue.
According to the method, the next round of questions to be confirmed generated by the second authentication question generation model is obtained by a knowledge decision method according to the question to be confirmed, the user knowledge graph and the reply comprehensive result; through a voice synthesis technology, the personalized voice conversion model converts the next round of questions to be confirmed into the next round of authentication questions; the next round of reply voice information is received, the next round of voiceprint matching result is identified through the voiceprint recognition model, and the next round of reply comprehensive result is identified through the reply recognition model; finally, the identity authentication result of the next round of dialogue is determined according to the next round of voiceprint matching result and the next round of reply comprehensive result, and after the preset rounds of dialogue are completed, whether the call user passes the identity authentication is determined according to the identity authentication result of each round of dialogue. In this way, the next round of authentication question and voice personally designed for the call user are generated through the knowledge decision method and the personalized voice conversion model, identity recognition of the call user can be carried out accurately through voiceprint recognition, text recognition and intention recognition over multiple rounds of dialogue, the accuracy is improved, and user experience satisfaction can be improved through the personalized voice.
In an embodiment, an authentication device based on knowledge-graph and voiceprint recognition is provided, where the authentication device based on knowledge-graph and voiceprint recognition corresponds to the authentication method based on knowledge-graph and voiceprint recognition in the above embodiment one by one. As shown in fig. 9, the authentication device based on knowledge graph and voiceprint recognition includes a receiving module 11, an acquiring module 12, a generating module 13, a converting module 14, an input module 15, an extracting module 16, a recognizing module 17, and an authenticating module 18. The functional modules are described in detail as follows:
the receiving module 11 is configured to receive an authentication instruction of a call user, and obtain user information in the authentication instruction;
an obtaining module 12, configured to determine a user identifier of the call user according to the user information, and obtain user sample tone information and a user knowledge graph associated with the user identifier; the user knowledge graph comprises graph nodes of a tree structure;
the generating module 13 is configured to input the user knowledge graph into a first authentication problem generating model, and obtain a problem to be confirmed, which is generated by the first authentication problem generating model and contains a ternary tree structure; the problem to be confirmed is generated by the first authentication problem generation model according to the map node and a first node problem template;
The conversion module 14 is configured to input the to-be-confirmed problem into a preset voice conversion model, obtain an authentication problem voice obtained by converting the voice conversion model through a voice synthesis technology, and broadcast the authentication problem voice to the call user;
the input module 15 is configured to receive reply voice information of the call user for the authentication problem voice, input the reply voice information and the user sample voice information into a voiceprint recognition model, and input the reply voice information and the problem to be confirmed into a reply recognition model;
the extracting module 16 is configured to extract voiceprint features in the reply voice information through the voiceprint recognition model, and obtain a voiceprint matching result output by the voiceprint recognition model according to the voiceprint features; the voiceprint matching result refers to a confidence value of the voiceprint feature matching the user sample sound information;
the recognition module 17 is configured to perform text recognition and intention recognition on the reply voice information through the reply recognition model to obtain a reply result, and then recognize, through the reply recognition model, the matching degree between the to-be-confirmed problem and the reply result to obtain a reply comprehensive result;
And the authentication module 18 is configured to determine an identity authentication result of the call user in the current round of dialogue according to the voiceprint matching result and the reply comprehensive result.
In one embodiment, the authentication module 18 includes:
the generating unit is used for inputting the to-be-confirmed problem, the user knowledge graph and the reply result into a second authentication problem generating model, and acquiring the next round of to-be-confirmed problem generated by the second authentication problem generating model through a knowledge decision method;
the personalized unit is used for inputting a reply state result in the reply result and the next round of questions to be confirmed into a preset personalized voice conversion model, acquiring the next round of authentication questions converted by the personalized voice conversion model through a voice synthesis technology, and broadcasting the next round of authentication questions to the call user;
the receiving unit is used for receiving the next round of reply voice information with which the call user answers the next round of authentication problem voice, inputting the next round of reply voice information and the user sample voice information into the voiceprint recognition model, and inputting the next round of reply voice information and the next round of problem to be confirmed into the reply recognition model;
The first acquisition unit is used for acquiring the next round of voiceprint matching result output by the voiceprint recognition model according to the voiceprint features extracted from the next round of reply voice information; the next round of voiceprint matching result refers to a confidence value of the voiceprint features in the next round of reply voice information matching the user sample voice information;
the recognition unit is used for carrying out text recognition and intention recognition on the next round of reply voice information through the reply recognition model to obtain a next round of reply result, and then recognizing the matching degree of the next round of to-be-confirmed problem and the next round of reply result through the reply recognition model to obtain a next round of reply comprehensive result;
the determining unit is used for determining the identity authentication result of the next round of dialogue according to the next round of voiceprint matching result and the next round of reply comprehensive result;
the final authentication unit is used for obtaining the identity authentication result of each round of dialogue when the completion of the dialogue of the preset round is detected, and determining the final identity authentication result of the talking user according to all the identity authentication results; the final identity authentication result refers to the identity authentication result of the talking user determined through multiple rounds of conversations.
In one embodiment, the acquisition module 12 includes:
a second acquisition unit configured to acquire user data associated with the user identification code;
the conversion unit is used for converting the structured data in the user data to obtain first data, and extracting text from unstructured data in the user data to obtain second data;
the construction unit is used for obtaining the map nodes through knowledge fusion and relation extraction of all the first data and all the second data, constructing the user knowledge maps which are associated with the user identification codes and contain the map nodes according to a triplet mode, and storing the user knowledge maps in a blockchain.
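As an aside, a toy version of the triple construction performed by the construction unit might look as follows; the extraction rules and the in-memory dictionary are assumptions, and the blockchain storage mentioned above is omitted here.

```python
def build_user_knowledge_graph(user_id: str, first_data: dict, second_data: dict) -> dict:
    # Fuse the converted structured data (first data) and the text-extracted
    # unstructured data (second data) into (subject, attribute, value) triples.
    triples = []
    for source in (first_data, second_data):
        for attribute, value in source.items():
            triples.append((user_id, attribute, value))   # one graph node per triple
    return {"user_id": user_id, "triples": triples}

graph = build_user_knowledge_graph(
    "user-001",
    {"occupation": "teacher"},          # first data (from structured records)
    {"birthplace": "Shenzhen"},         # second data (extracted from text)
)
```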
In one embodiment, the generating module 13 includes:
the random unit is used for randomly acquiring a first-layer tree node from the user knowledge graph through the first authentication problem model; the user knowledge graph comprises a plurality of first-layer tree nodes;
a third obtaining unit, configured to obtain a node problem template corresponding to a node attribute in the first-layer tree node, and determine the obtained node problem template as the first node problem template;
The combination unit is used for combining the node attribute with the first node problem template through the first authentication problem model to generate the problem to be confirmed; the question to be confirmed comprises three sub-nodes, wherein the sub-nodes are associated with one tree node in the user knowledge graph.
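A possible sketch of this first-round generation step is shown below; treating every triple of the toy graph as a first-layer tree node and leaving the three child nodes unlinked are simplifications made only for illustration.

```python
import random

def first_question(graph: dict, templates: dict) -> dict:
    # Randomly pick a first-layer tree node (here: any triple of the toy graph),
    # select the first node question template by attribute, and attach the three
    # child nodes that the ternary question to be confirmed carries.
    subject, attribute, value = random.choice(graph["triples"])
    question_text = templates.get(attribute, "Is your {attribute} {value}?").format(
        attribute=attribute, value=value)
    return {
        "question": question_text,
        "children": {"correct": None, "incorrect": None, "unknown": None},
        "source_node": (subject, attribute, value),
    }
```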
In one embodiment, the extraction module 16 includes:
a fourth obtaining unit, configured to obtain a recognition result output by the voiceprint recognition model according to the extracted voiceprint feature;
the comparison unit is used for comparing and verifying the identification result with the user sample sound information to obtain the confidence value after comparison and verification;
and the first output unit is used for determining the voiceprint matching result according to the confidence value, and the voiceprint matching result characterizes the voiceprint matching degree between the reply voice information and the user sample voice information.
In one embodiment, the identification module 17 includes:
the text recognition unit is used for carrying out text recognition on the reply voice information through the reply text recognition model according to a voice recognition technology to obtain a reply text result; the reply identification model comprises a reply text identification model, a reply intention identification model and a dialogue management module;
The intention recognition unit is used for carrying out intention recognition on the reply voice information through the reply intention recognition model according to a natural language understanding technology to obtain a reply state result;
the management unit is used for determining the reply text result and the reply state result as the reply result through the dialogue management module;
the matching unit is used for matching the reply text with the child node in the to-be-confirmed problem through the dialogue management module to obtain a matching degree value of the reply text and the to-be-confirmed problem;
and the second output unit is used for obtaining the reply comprehensive result according to the matching degree value and the reply state result through the dialogue management module.
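The sketch below outlines such a reply recognition pipeline under heavy simplification: recognize_text() and recognize_intent() are stand-ins for the reply text recognition model and the reply intention recognition model, and the token-overlap matching degree is an assumption rather than the method actually used by the dialogue management module.

```python
def recognize_text(reply_audio) -> str:
    return "yes, I am a teacher"         # stand-in for the reply text recognition model

def recognize_intent(reply_text: str) -> str:
    return "calm"                        # stand-in for the reply intention recognition model

def reply_comprehensive_result(reply_audio, question: str) -> dict:
    # Dialogue management: combine the reply text result, the reply state result
    # and a crude matching degree between the reply and the question to be confirmed.
    reply_text = recognize_text(reply_audio)
    reply_state = recognize_intent(reply_text)
    q_tokens = set(question.lower().split())
    r_tokens = set(reply_text.lower().split())
    match_degree = len(q_tokens & r_tokens) / max(len(q_tokens), 1)
    return {"reply_text": reply_text,
            "reply_state": reply_state,
            "match_degree": match_degree}
```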
In an embodiment, the generating unit includes:
a query subunit, configured to query the child node that matches the reply comprehensive result, and obtain a tree node associated with the child node;
the searching subunit is used for searching the next-layer tree node which is the same as the tree node in the user knowledge graph, determining the searched next-layer tree node as the next-round graph node, acquiring the node attribute in the next-round graph node, and determining the node attribute as the next-round node attribute;
The obtaining subunit is used for obtaining the node problem template corresponding to the next round of node attribute and determining the obtained node problem template as a second node problem template;
and the synthesis subunit is used for combining the next round of node attributes with the second node problem template through the second authentication problem model to generate the next round of problems to be confirmed.
For specific limitations on the authentication device based on the knowledge-graph and the voiceprint recognition, reference may be made to the above limitation on the authentication method based on the knowledge-graph and the voiceprint recognition, and the details are not repeated here. The above-mentioned respective modules in the authentication apparatus based on knowledge-graph and voiceprint recognition may be implemented in whole or in part by software, hardware, and combinations thereof. The above modules may be embedded in hardware or may be independent of a processor in the computer device, or may be stored in software in a memory in the computer device, so that the processor may call and execute operations corresponding to the above modules.
In one embodiment, a computer device is provided, which may be a server, and the internal structure of which may be as shown in fig. 10. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device includes a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, computer programs, and a database. The internal memory provides an environment for the operation of the operating system and computer programs in the non-volatile storage media. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program, when executed by a processor, implements an authentication method based on knowledge-graph and voiceprint recognition.
In one embodiment, a computer device is provided, including a memory, a processor, and a computer program stored on the memory and executable on the processor, where the processor implements the authentication method based on knowledge-graph and voiceprint recognition in the above embodiments when executing the computer program.
In one embodiment, a computer readable storage medium is provided, on which a computer program is stored, which when executed by a processor implements the authentication method based on knowledge-graph and voiceprint recognition in the above embodiments.
Those skilled in the art will appreciate that implementing all or part of the above described methods may be accomplished by way of a computer program stored on a non-transitory computer readable storage medium, which when executed, may comprise the steps of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in embodiments provided herein may include non-volatile and/or volatile memory. The nonvolatile memory can include Read Only Memory (ROM), programmable ROM (PROM), electrically Programmable ROM (EPROM), electrically Erasable Programmable ROM (EEPROM), or flash memory. Volatile memory can include Random Access Memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as Static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double Data Rate SDRAM (DDRSDRAM), enhanced SDRAM (ESDRAM), synchronous Link DRAM (SLDRAM), memory bus direct RAM (RDRAM), direct memory bus dynamic RAM (DRDRAM), and memory bus dynamic RAM (RDRAM), among others.
It will be apparent to those skilled in the art that, for convenience and brevity of description, only the above-described division of the functional units and modules is illustrated, and in practical application, the above-described functional distribution may be performed by different functional units and modules according to needs, i.e. the internal structure of the apparatus is divided into different functional units or modules to perform all or part of the above-described functions.
The above embodiments are only for illustrating the technical solution of the present invention, and not for limiting the same; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims (10)

1. An authentication method based on knowledge graph and voiceprint recognition is characterized by comprising the following steps:
receiving an identity verification instruction of a call user, and acquiring user information in the identity verification instruction;
Determining a user identification code of the call user according to the user information, and acquiring user sample tone information and a user knowledge graph associated with the user identification code; the user knowledge graph comprises graph nodes of a tree structure;
inputting the user knowledge graph into a first authentication problem generation model, and acquiring a to-be-confirmed problem which is generated by the first authentication problem generation model and contains a ternary tree structure; the problem to be confirmed is generated by the first authentication problem generation model according to the map node and a first node problem template;
inputting the to-be-confirmed problem into a preset voice conversion model, acquiring authentication problem voice obtained by converting the voice conversion model through a voice synthesis technology, and broadcasting the authentication problem voice to the call user;
receiving reply voice information of the call user aiming at the authentication problem voice, inputting the reply voice information and the user sample voice information into a voiceprint recognition model, and inputting the reply voice information and the problem to be confirmed into a reply recognition model;
extracting voiceprint characteristics in the reply voice information through the voiceprint recognition model, and obtaining a voiceprint matching result output by the voiceprint recognition model according to the voiceprint characteristics; the voiceprint matching result refers to a confidence value of the voiceprint feature matching the user sample sound information;
Performing text recognition and intention recognition on the reply voice information through the reply recognition model to obtain a reply result, and recognizing the matching degree of the to-be-confirmed problem and the reply result through the reply recognition model to obtain a reply comprehensive result;
and determining an identity authentication result of the call user in the round of dialogue according to the voiceprint matching result and the answer comprehensive result.
2. The authentication method based on knowledge-graph and voiceprint recognition according to claim 1, wherein before the step of obtaining the user sample tone information and the user knowledge graph associated with the user identification code, the method comprises:
acquiring user data associated with the user identification code;
converting the structured data in the user data to obtain first data, and simultaneously extracting text from unstructured data in the user data to obtain second data;
and carrying out knowledge fusion and relation extraction on all the first data and all the second data to obtain a map node, constructing a user knowledge map which is associated with the user identification code and contains the map node according to a triplet mode, and storing the user knowledge map in a block chain.
3. The authentication method based on knowledge graph and voiceprint recognition according to claim 1, wherein the obtaining the to-be-confirmed question having the ternary tree structure generated by the first authentication question generation model includes:
randomly acquiring a first-layer tree node from the user knowledge graph through the first authentication problem model; the user knowledge graph comprises a plurality of first-layer tree nodes;
acquiring a node problem template corresponding to the node attribute in the first-layer tree node, and determining the acquired node problem template as the first node problem template;
combining the node attribute with the first node problem template through the first authentication problem model to generate the problem to be confirmed; the question to be confirmed comprises three sub-nodes, wherein the sub-nodes are associated with one tree node in the user knowledge graph.
4. The authentication method based on knowledge graph and voiceprint recognition according to claim 1, wherein the extracting voiceprint features in the reply voice information by the voiceprint recognition model, obtaining a voiceprint matching result output by the voiceprint recognition model according to the voiceprint features, comprises:
Acquiring an identification result output by the voiceprint identification model according to the extracted voiceprint characteristics;
comparing and verifying the identification result with the user sample sound information to obtain the confidence value after comparison and verification;
and determining the voiceprint matching result according to the confidence value, wherein the voiceprint matching result characterizes the voiceprint matching degree between the reply voice information and the user sample sound information.
5. The authentication method based on knowledge graph and voiceprint recognition according to claim 3, wherein the text recognition and intention recognition are performed on the reply voice information through the reply recognition model to obtain a reply result, and the matching degree value of the to-be-confirmed problem and the reply result is recognized through the reply recognition model to obtain a reply comprehensive result, which comprises:
according to the voice recognition technology, carrying out text recognition on the reply voice information through a reply text recognition model to obtain a reply text result; the reply identification model comprises a reply text identification model, a reply intention identification model and a dialogue management module;
according to natural language understanding technology, intention recognition is carried out on the reply voice information through the reply intention recognition model, and a reply state result is obtained;
Determining the reply text result and the reply state result as the reply result through the dialogue management module;
matching the reply text result with the child node in the to-be-confirmed problem through the dialogue management module to obtain a matching degree value of the reply text result and the to-be-confirmed problem;
and obtaining the reply comprehensive result according to the matching degree value and the reply state result through the dialogue management module.
6. The authentication method based on knowledge-graph and voiceprint recognition according to claim 5, wherein after determining the identity authentication result of the talking user in the current round of dialogue according to the voiceprint matching result and the reply comprehensive result, the method comprises:
inputting the to-be-confirmed problem, the user knowledge graph and the reply result into a second authentication problem generation model, and acquiring the next round of to-be-confirmed problem generated by the second authentication problem generation model through a knowledge decision method;
inputting a reply state result in the reply result and the next round of questions to be confirmed into a preset personalized voice conversion model, acquiring the next round of authentication questions converted by the personalized voice conversion model through a voice synthesis technology, and broadcasting the next round of authentication questions to the call user;
receiving the next round of reply voice information with which the call user answers the next round of authentication problem voice, inputting the next round of reply voice information and the user sample voice information into the voiceprint recognition model, and simultaneously inputting the next round of reply voice information and the next round of problem to be confirmed into the reply recognition model;
acquiring the next round of voiceprint matching result output by the voiceprint recognition model according to the voiceprint features extracted from the next round of reply voice information; the next round of voiceprint matching result refers to a confidence value of the voiceprint features in the next round of reply voice information matching the user sample voice information;
text recognition and intention recognition are carried out on the next round of reply voice information through the reply recognition model, a next round of reply result is obtained, and then the matching degree of the next round of to-be-confirmed problem and the next round of reply result is recognized through the reply recognition model, so that a next round of reply comprehensive result is obtained;
determining an identity authentication result of the next dialog according to the next voiceprint matching result and the next reply comprehensive result;
when the completion of the conversations of the preset rounds is detected, the identity authentication result of each round of conversations is obtained, and the final identity authentication result of the conversation user is determined according to all the identity authentication results; the final identity authentication result refers to the identity authentication result of the talking user determined through multiple rounds of conversations.
7. The authentication method based on knowledge graph and voiceprint recognition according to claim 6, wherein the inputting the to-be-confirmed question, the user knowledge graph and the reply result into a second authentication question generation model, and obtaining the next round of to-be-confirmed question generated by the second authentication question generation model through a knowledge decision method comprises:
inquiring a child node matched with the reply comprehensive result, and acquiring a tree node associated with the child node;
searching a next-layer tree node which is the same as the tree node in the user knowledge graph, determining the searched next-layer tree node as a next-round graph node, acquiring node attributes in the next-round graph node, and determining the node attributes as a next-round node attribute;
acquiring the node problem template corresponding to the next round of node attribute, and determining the acquired node problem template as a second node problem template;
and combining the next round of node attributes with the second node problem template through the second authentication problem model to generate the next round of problems to be confirmed.
8. An authentication device based on knowledge graph and voiceprint recognition, comprising:
The receiving module is used for receiving an identity verification instruction of a call user and acquiring user information in the identity verification instruction;
the acquisition module is used for determining a user identification code of the call user according to the user information and acquiring user sample sound information and a user knowledge graph associated with the user identification code; the user knowledge graph comprises graph nodes of a tree structure;
the generation module is used for inputting the user knowledge graph into a first authentication problem generation model and acquiring a to-be-confirmed problem which is generated by the first authentication problem generation model and contains a ternary tree structure; the problem to be confirmed is generated by the first authentication problem generation model according to the map node and a first node problem template;
the conversion module is used for inputting the to-be-confirmed problem into a preset voice conversion model, acquiring authentication problem voice obtained by conversion of the voice conversion model through a voice synthesis technology, and broadcasting the authentication problem voice to the call user;
the input module is used for receiving the reply voice information of the call user aiming at the authentication problem voice, inputting the reply voice information and the user sample voice information into a voiceprint recognition model, and inputting the reply voice information and the problem to be confirmed into a reply recognition model;
The extraction module is used for extracting voiceprint characteristics in the reply voice information through the voiceprint recognition model and obtaining a voiceprint matching result output by the voiceprint recognition model according to the voiceprint characteristics; the voiceprint matching result refers to a confidence value of the voiceprint feature matching the user sample sound information;
the recognition module is used for carrying out text recognition and intention recognition on the reply voice information through the reply recognition model to obtain a reply result, and then recognizing the matching degree of the to-be-confirmed problem and the reply result through the reply recognition model to obtain a reply comprehensive result;
and the authentication module is used for determining an identity authentication result of the call user in the round of dialogue according to the voiceprint matching result and the answer comprehensive result.
9. A computer device comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the knowledge-graph and voiceprint recognition based authentication method according to any one of claims 1 to 7 when the computer program is executed.
10. A computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the knowledge-graph and voiceprint recognition based authentication method according to any one of claims 1 to 7.
CN202010723015.6A 2020-07-24 2020-07-24 Authentication method, device, equipment and medium based on knowledge graph and voiceprint recognition Active CN111883140B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010723015.6A CN111883140B (en) 2020-07-24 2020-07-24 Authentication method, device, equipment and medium based on knowledge graph and voiceprint recognition

Publications (2)

Publication Number Publication Date
CN111883140A CN111883140A (en) 2020-11-03
CN111883140B true CN111883140B (en) 2023-07-21

Family

ID=73201347

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010723015.6A Active CN111883140B (en) 2020-07-24 2020-07-24 Authentication method, device, equipment and medium based on knowledge graph and voiceprint recognition

Country Status (1)

Country Link
CN (1) CN111883140B (en)

Families Citing this family (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11405180B2 (en) * 2019-01-15 2022-08-02 Fisher-Rosemount Systems, Inc. Blockchain-based automation architecture cybersecurity
CN112530438B (en) * 2020-11-27 2023-04-07 贵州电网有限责任公司 Identity authentication method based on knowledge graph assisted voiceprint recognition
CN114363277B (en) * 2020-12-31 2023-08-01 万翼科技有限公司 Intelligent chat method and device based on social relationship and related products
CN112951215A (en) * 2021-04-27 2021-06-11 平安科技(深圳)有限公司 Intelligent voice customer service answering method and device and computer equipment
CN113314125A (en) * 2021-05-28 2021-08-27 深圳市展拓电子技术有限公司 Voiceprint identification method, system and memory for monitoring room interphone
CN113781059A (en) * 2021-11-12 2021-12-10 百融至信(北京)征信有限公司 Identity authentication anti-fraud method and system based on intelligent voice
CN114615062A (en) * 2022-03-14 2022-06-10 河南应用技术职业学院 Computer network engineering safety control system

Citations (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5897616A (en) * 1997-06-11 1999-04-27 International Business Machines Corporation Apparatus and methods for speaker verification/identification/classification employing non-acoustic and/or acoustic models and databases
CN102457845A (en) * 2010-10-14 2012-05-16 阿里巴巴集团控股有限公司 Method, equipment and system for authenticating identity by wireless service
CN102737634A (en) * 2012-05-29 2012-10-17 百度在线网络技术(北京)有限公司 Authentication method and device based on voice
WO2016015687A1 (en) * 2014-07-31 2016-02-04 腾讯科技(深圳)有限公司 Voiceprint verification method and device
CN110134795A (en) * 2019-04-17 2019-08-16 深圳壹账通智能科技有限公司 Generate method, apparatus, computer equipment and the storage medium of validation problem group
CN110164455A (en) * 2018-02-14 2019-08-23 阿里巴巴集团控股有限公司 Device, method and the storage medium of user identity identification
CN110931016A (en) * 2019-11-15 2020-03-27 深圳供电局有限公司 Voice recognition method and system for offline quality inspection
CN111046133A (en) * 2019-10-29 2020-04-21 平安科技(深圳)有限公司 Question-answering method, question-answering equipment, storage medium and device based on atlas knowledge base

Also Published As

Publication number Publication date
CN111883140A (en) 2020-11-03

Similar Documents

Publication Publication Date Title
CN111883140B (en) Authentication method, device, equipment and medium based on knowledge graph and voiceprint recognition
KR101963993B1 (en) Identification system and method with self-learning function based on dynamic password voice
US20200151310A1 (en) Method and apparatus for identity authentication
CN111858892B (en) Voice interaction method, device, equipment and medium based on knowledge graph
US10777207B2 (en) Method and apparatus for verifying information
US9361891B1 (en) Method for converting speech to text, performing natural language processing on the text output, extracting data values and matching to an electronic ticket form
US20170345424A1 (en) Voice dialog device and voice dialog method
US8812319B2 (en) Dynamic pass phrase security system (DPSS)
CN110169014A (en) Device, method and computer program product for certification
US20160014120A1 (en) Method, server, client and system for verifying verification codes
CN108989349B (en) User account unlocking method and device, computer equipment and storage medium
CN113724695B (en) Electronic medical record generation method, device, equipment and medium based on artificial intelligence
CN112633003A (en) Address recognition method and device, computer equipment and storage medium
US11816609B2 (en) Intelligent task completion detection at a computing device
CN112863489B (en) Speech recognition method, apparatus, device and medium
CN113436614A (en) Speech recognition method, apparatus, device, system and storage medium
WO2013056343A1 (en) System, method and computer program for correcting speech recognition information
CN113873088B (en) Interactive method and device for voice call, computer equipment and storage medium
CN112541446B (en) Biological feature library updating method and device and electronic equipment
CN111933117A (en) Voice verification method and device, storage medium and electronic device
CN111785280A (en) Identity authentication method and device, storage medium and electronic equipment
CN112820323B (en) Method and system for adjusting response queue priority based on client voice
CN115952482B (en) Medical equipment data management system and method
CN114095883B (en) Fixed telephone terminal communication method, device, computer equipment and storage medium
CN113111658B (en) Method, device, equipment and storage medium for checking information

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant