CN111897935B - Knowledge graph-based conversational path selection method and device and computer equipment - Google Patents

Knowledge graph-based conversational path selection method and device and computer equipment

Info

Publication number
CN111897935B
Authority
CN
China
Prior art keywords
path
input
speech
full
intention
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010751546.6A
Other languages
Chinese (zh)
Other versions
CN111897935A (en)
Inventor
唐文军
贾晓谦
宋子岳
王冉
Current Assignee
Zhongdian Jinxin Software Co Ltd
Original Assignee
Zhongdian Jinxin Software Co Ltd
Priority date
Filing date
Publication date
Application filed by Zhongdian Jinxin Software Co Ltd filed Critical Zhongdian Jinxin Software Co Ltd
Priority to CN202010751546.6A
Publication of CN111897935A
Application granted
Publication of CN111897935B
Legal status: Active
Anticipated expiration

Classifications

    All classifications fall under G (Physics), G06 (Computing; Calculating or Counting):
    • G06F16/3329: Natural language query formulation or dialogue systems
    • G06F16/367: Ontology
    • G06F18/214: Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G06F40/126: Character encoding
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/30: Semantic analysis
    • G06N3/045: Combinations of networks
    • G06N3/047: Probabilistic or stochastic networks
    • G06N3/049: Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G06N3/084: Backpropagation, e.g. using gradient descent

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Human Computer Interaction (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Animal Behavior & Ethology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Machine Translation (AREA)

Abstract

The application relates to a method, an apparatus, computer equipment, and a storage medium for knowledge-graph-based conversational path selection. The method comprises the following steps: acquiring the corresponding intent knowledge graph for the current business scenario, the intent knowledge graph having been created in advance from the intentions in that scenario; acquiring an input utterance, and predicting the confidence of each conversational path corresponding to the input utterance in the intent knowledge graph; determining a target conversational path from the highest confidence, and determining the output utterance for the input utterance from the target conversational path. Because path selection is based on an intent knowledge graph that matches the current business scenario, it is constrained by that specific scenario; a path fitting the scenario can therefore be chosen, improving the matching degree and accuracy of the utterances.

Description

Knowledge graph-based conversational path selection method and device and computer equipment
Technical Field
The application relates to the technical field of artificial intelligence, and in particular to a knowledge-graph-based conversational path selection method, apparatus, and computer equipment.
Background
Well-crafted utterances play an important role in sales scenarios: a good script raises the success rate of a sale. To improve the verbal skills of telemarketers and customer-service agents, human-machine script training can be carried out, in which a machine simulates a customer and trains the agent, improving the agent's ability to answer questions.
Script training requires a degree of flexibility: the direction a dialogue takes differs with the feedback given to each utterance, and that direction is determined by conversational path selection. Conventional path selection mechanisms decide the direction by scoring utterances with single-factor analyses such as similarity retrieval, sentiment analysis, and context analysis.
In the prior art, however, conversational paths from different domains are mixed together, so the answers given easily fail to match the questions asked.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a knowledge-graph-based conversational path selection method, apparatus, computer device, and storage medium capable of improving how well output utterances match input utterances.
A knowledge-graph-based conversational path selection method, the method comprising:
acquiring the corresponding intent knowledge graph for a current business scenario, wherein the intent knowledge graph is pre-created from the intentions in that scenario;
acquiring an input utterance, and predicting the confidence of each conversational path corresponding to the input utterance in the intent knowledge graph;
determining a target conversational path based on the highest confidence, and determining the output utterance for the input utterance based on the target conversational path.
In one embodiment, if the highest confidence among the conversational paths is less than a threshold, a plurality of preset guiding utterances corresponding to the input utterance are obtained;
the probability of each preset guiding utterance serving as the output utterance for the input utterance is predicted;
and the preset guiding utterance with the highest probability is taken as the output utterance for the input utterance.
In one embodiment, obtaining an input utterance and predicting the confidence of its corresponding conversational paths in the intent knowledge graph includes:
feeding the input utterance into a pre-trained, semantic-analysis-based conversational path selection model to obtain the full set of conversational paths for the input utterance together with their confidences;
and selecting, from the full set, the conversational paths that correspond to the intent knowledge graph.
In one embodiment, predicting the probability of a preset guiding utterance serving as the output utterance for the input utterance includes:
taking the input utterance as the preceding sentence and a preset guiding utterance as the following sentence, and feeding the pair into a pre-trained prediction model;
and obtaining, from the prediction model, the probability of each preset guiding utterance serving as the output utterance.
In one embodiment, feeding the input utterance into the pre-trained, semantic-analysis-based conversational path selection model to obtain the full set of conversational paths and their confidences comprises:
taking the input utterance as the input of the pre-trained, semantic-analysis-based conversational path selection model;
performing word segmentation on the input utterance with the model's tokenizer;
acquiring a feature vector for each token;
feeding the token feature vectors, in order, into forward and backward long short-term memory networks to obtain a forward hidden vector and a backward hidden vector, respectively;
and concatenating the forward and backward hidden vectors, then outputting the full set of conversational paths for the input utterance together with their confidences.
In one embodiment, obtaining the feature vector of each token includes:
adding a start tag before the first token of the input sentence and an end tag after the last token;
feeding the tagged token sequence into a token embedding model to obtain token encodings;
obtaining segment encodings from the start and end tags, and position encodings from each token's position;
obtaining an encoding vector from the token, segment, and position encodings;
and processing the encoding vector through an attention mechanism to obtain the feature vector.
In one embodiment, training the prediction model includes:
obtaining utterance samples;
according to utterance similarity, taking the guiding utterance that best matches a sample as a positive example and the guiding utterances that do not match it as negative examples;
and feeding the positive and negative examples into a semantic model for training to obtain the prediction model.
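As an illustrative sketch (not part of the patent), the positive/negative sample construction described above might look as follows in Python. The patent does not specify the similarity measure, so difflib's string ratio stands in for it here, and all names are hypothetical:

```python
from difflib import SequenceMatcher

def build_training_pairs(sample_utterance, candidate_guides):
    """Split candidate guiding utterances into one positive example (the
    best match for the sample utterance) and negative examples (the rest).
    Illustrative only: the similarity measure here is difflib's ratio."""
    scored = [(guide, SequenceMatcher(None, sample_utterance, guide).ratio())
              for guide in candidate_guides]
    best_guide, _ = max(scored, key=lambda pair: pair[1])
    positives = [best_guide]
    negatives = [guide for guide, _ in scored if guide != best_guide]
    return positives, negatives
```

The resulting pairs would then be fed to the semantic model as labelled training data.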
A knowledge-graph-based conversational path selection apparatus, the apparatus comprising:
a graph acquisition module, configured to acquire the corresponding intent knowledge graph for a current business scenario, the intent knowledge graph being pre-created from the intentions in that scenario;
a prediction module, configured to obtain an input utterance and predict the confidence of each conversational path corresponding to the input utterance in the intent knowledge graph;
and an output module, configured to determine a target conversational path from the highest confidence and to determine the output utterance for the input utterance from the target conversational path.
A computer device comprising a memory storing a computer program and a processor implementing the steps of the method according to the embodiments when the processor executes the computer program.
A computer-readable storage medium, on which a computer program is stored which, when being executed by a processor, carries out the steps of the method according to the above-mentioned embodiments.
With the knowledge-graph-based conversational path selection method, apparatus, computer equipment, and storage medium, scenario-specific customer utterances in the customer-service or telemarketing field are constrained by the knowledge graph, and the next utterance with the highest matching probability is selected from the scenario's utterance options set by the graph as the utterance the system returns. Because path selection is based on an intent knowledge graph that matches the current business scenario, it is constrained by that specific scenario; a path fitting the scenario can therefore be chosen, improving the matching degree and accuracy of the utterances.
Drawings
FIG. 1 is a diagram of an embodiment of an application environment for a knowledge-graph based conversational path selection method;
FIG. 2 is a flow diagram illustrating a knowledge-graph-based conversational path selection method in one embodiment;
FIG. 3 is a diagram illustrating the structure of an established intent knowledge graph in one embodiment;
FIG. 4 is a diagram illustrating the structure of a conversational path selection model in another embodiment;
FIG. 5 is a flow diagram of a knowledge-graph-based conversational path selection method in another embodiment;
FIG. 6 is a schematic diagram of knowledge-graph-based conversational path selection in one embodiment;
FIG. 7 is a block diagram of a knowledge-graph-based conversational path selection apparatus in one embodiment;
FIG. 8 is a diagram of an internal structure of a computer device in one embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more clearly understood, the present application is further described in detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The knowledge-graph-based conversational path selection method provided by the application can be applied to the application environment shown in FIG. 1, which includes a training terminal 102, an administrator terminal 104, and a server 106. The training terminal 102 and the administrator terminal 104 each communicate with the server 106 via a network. The human-machine-training administrator configures, through the administrator terminal 104, the intent knowledge graph that matches the current business scenario; the server then acquires that intent knowledge graph, acquires an input utterance, predicts the confidence of each conversational path corresponding to the input utterance in the intent knowledge graph, determines a target conversational path from the highest confidence, and determines the output utterance for the input utterance from the target conversational path. The server sends the output utterance to the training terminal 102, through which the trainee carries out human-machine training. The training terminal 102 and the administrator terminal 104 may be, but are not limited to, personal computers, notebook computers, smartphones, and tablet computers; the server 106 may be implemented as an independent server or as a server cluster formed by a plurality of servers.
In one embodiment, as shown in FIG. 2, a knowledge-graph-based conversational path selection method is provided, illustrated here as applied to the server 106 in FIG. 1, and includes the following steps:
Step 202: acquiring the corresponding intent knowledge graph for the current business scenario, wherein the intent knowledge graph is created in advance from the intentions in that scenario.
Specifically, the knowledge-graph-based conversational path selection method is implemented on top of an established intent knowledge graph. The method can be used to train the verbal skills of customer-service agents (seats), selecting conversational paths in real time from a trainee's utterances during training, and it can also select conversational paths in real time from real customers' utterances during customer-service and telemarketing work.
The intent knowledge graph is built from the intent motivations behind conversational questions; it covers all intentions in a script-training or telemarketing scenario, classified and linked according to the business mode and its stages. Specifically, the intent motivation of each conversational question is extracted, the intentions are classified by business mode and stage, and the associations between intentions are established by imitating the actual sales process, finally forming the intent knowledge graph. Intent motivation refers to the most direct reason a consumer buys and consumes a product; the motivations are therefore abstracted from the questions consumers raise during the sales process, and the abstracted intentions are logically linked according to that process, so that the resulting intent knowledge graph stays close to the actual flow of a sale.
It can be understood that the conversational questions differ across business scenarios, and so do the established intent knowledge graphs. In one embodiment, the questions included in the intent knowledge graph may be: "I want to know the purpose and identity of the caller", "I want to know the price of the product", "I want to know the product's services", "I want to know how to settle a claim", and so on. An intent knowledge graph established in one embodiment is shown in FIG. 3.
An intent knowledge graph is constructed from a plurality of intent nodes and the relations between them. Each dialogue scenario corresponds to one graph; each node in the graph records an intention, the list of questions corresponding to that intention, and a list of context-guided questions, and each relation is a selectable conversational path. For a given business scenario, the conversational paths therefore carry the scenario's logical constraints, and a corresponding intent knowledge graph can be set up for each business scenario.
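To make the node-and-relation structure concrete, here is a minimal hypothetical Python sketch of an intent knowledge graph: each node holds an intention, its question list, and its context-guided question list, and each directed relation is a selectable conversational path. Class and field names are illustrative, not from the patent:

```python
from dataclasses import dataclass, field

@dataclass
class IntentNode:
    """One graph node: an intention, the questions expressing it, and the
    context-guided questions for the next turn (names are hypothetical)."""
    intention: str
    questions: list = field(default_factory=list)
    guide_questions: list = field(default_factory=list)

class IntentKnowledgeGraph:
    """Intent nodes plus directed relations; each relation between two
    intent nodes is a selectable conversational path."""
    def __init__(self):
        self.nodes = {}  # intention name -> IntentNode
        self.edges = {}  # intention name -> set of next intention names

    def add_node(self, node):
        self.nodes[node.intention] = node
        self.edges.setdefault(node.intention, set())

    def add_path(self, src_intention, dst_intention):
        self.edges[src_intention].add(dst_intention)

    def candidate_paths(self, intention):
        """The conversational paths this business scenario allows from here."""
        return self.edges.get(intention, set())
```

One such graph would be built per dialogue scenario, encoding that scenario's logical constraints.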
Step 204: acquiring an input utterance, and predicting the confidence of each conversational path corresponding to the input utterance in the intent knowledge graph.
An input utterance is the spoken content fed into the dialogue. In the customer-service training scenario, the input utterance is what the trainee says to the consumer during the sales process; it may be a question posed to the consumer or an answer to a question the consumer raised. In a live customer-service system, the input utterance is what the customer says, such as a question put to the agent or a reply to the agent's question. Through human-machine script training, a customer-service agent or seat can master these utterances.
The input utterance is captured as speech and must first be converted to text by speech recognition. Taking script training as an example: a trainee practices a specific scenario, the trainee's speech is converted to Chinese text by an ASR speech-to-text function, and after conversion the text might read, "May I ask which year your car was bought?"
A conversational path is a feedback branch for an input utterance; different branches instruct the trainee or telemarketer to continue the conversation with the appropriate output utterance. If the input utterance is a question, each conversational path is a possible reply to it. For example, the utterance "May I ask which year your car was bought?" has two corresponding paths, one for "the previous year" and one for "car insurance I bought". Different paths point to different next utterances: for the path "the previous year", indicating the user bought the car the previous year, the next utterance may recommend car insurance to the user; for the path "car insurance I bought", indicating the user has already purchased insurance, the next utterance may concern the insurance's expiration date, and so on.
The conversational paths are tied to, and constrained by, the business scenario's intent knowledge graph. For the intent node corresponding to an input utterance, all utterances contained in the next intent nodes reachable along that node's relations in the graph are taken as the candidate output utterances. An output utterance is the response that continues the conversation: in the customer-service training scenario, it is the reply to the trainee's input utterance; in a live customer-service system, it is the agent's reply to the customer's input utterance. The route from an input utterance to a candidate output utterance is a conversational path.
Prediction is semantic: the conversational paths corresponding to the input utterance are analyzed along semantic dimensions, and a semantic model can be employed to predict each path's confidence.
In one embodiment, a pre-trained conversational path selection model is used to predict the conversational paths for an input utterance. The model can be trained as a RoBERTa-BiLSTM neural network, where RoBERTa is RoBERTa-wwm-ext, the Chinese whole-word-masking pre-trained model released by the joint laboratory of HIT and iFLYTEK, RoBERTa for short. Trainee utterances in the sample dialogue data are used as input and machine utterances as the prediction target; with a GELU activation function, gradient descent is performed to minimize the error, yielding the conversational path selection model.
Specifically, predicting the confidence of each conversational path corresponding to the input utterance in the intent knowledge graph comprises: feeding the input utterance into a pre-trained, semantic-analysis-based conversational path selection model to obtain the full set of conversational paths for the input utterance together with their confidences; and selecting, from the full set, the conversational paths that correspond to the intent knowledge graph.
The full set of conversational paths means all paths the model supports predicting; in one embodiment, it may be all intent branch paths in the intent knowledge graph. The input utterance is fed to the pre-trained conversational path selection model, which is a semantic analysis model; through the model, the full set of paths is output, based on semantic analysis, together with a confidence for selecting each path. The higher the confidence, the higher the probability that the path is selected.
A conversational path supported by the intent knowledge graph is one that can lead from the input utterance to an output utterance; the connections between the intentions of the graph are the path relations. Not all of the full set of paths predicted by the path selection model have a corresponding direction in the graph; that is, some of the full-set paths are not candidate paths the graph supports. In the present scheme, under the constraint of the business scenario's intent knowledge graph, the conversational paths corresponding to the graph are selected from the full set.
For example, with the input utterance "May I ask which year your car was bought?", the path selection model predicts likelihoods over the full set of paths; if there are five paths in the full set, a confidence is computed for each. Through the scenario constraint of the knowledge graph, this input utterance corresponds to two candidates, "the previous year" and "car insurance I bought". Filtering the full results down to the paths the graph supports gives: probability 0.1 for "the previous year" as the output branch and probability 0.2 for "car insurance I bought"; the path with the highest probability, "car insurance I bought", is selected to determine the machine's output utterance.
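The filter-then-argmax step in this example can be sketched as follows (hypothetical Python; the real system's interfaces are not specified in the patent):

```python
def select_script_path(full_path_confidences, allowed_paths):
    """Constrain the model's full-set path confidences to the paths the
    scenario's intent knowledge graph supports, then pick the most likely.
    Returns None when the graph permits none of the predicted paths
    (the caller could then fall back to preset guiding utterances)."""
    constrained = {path: conf for path, conf in full_path_confidences.items()
                   if path in allowed_paths}
    if not constrained:
        return None
    return max(constrained, key=constrained.get)
```

With the five-path example above, only the two graph-supported branches survive the filter, and the 0.2-confidence branch wins.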
Step 206: determining a target conversational path from the highest confidence, and determining the output utterance for the input utterance from the target conversational path.
The conversational path with the highest confidence is the branch the consumer is most likely to take in the current scenario. The target conversational path is determined from the highest confidence, and the utterance that the target path points to is used as the output utterance for the input utterance, guiding the trainee during training, or selecting in real time the conversational path for a real customer's utterance during customer-service and telemarketing work.
With the knowledge-graph-based conversational path selection method, scenario-specific customer utterances in the customer-service or telemarketing field are constrained by the knowledge graph, and the next utterance with the highest matching probability is selected from the scenario's utterance options set by the graph as the utterance the system returns. Because path selection is based on an intent knowledge graph that matches the current business scenario, it is constrained by that specific scenario; a path fitting the scenario can therefore be chosen, improving the matching degree and accuracy of the utterances.
The specific prediction process of the conversational path selection model is as follows: the input utterance is taken as the input of the pre-trained, semantic-analysis-based conversational path selection model; the model's tokenizer segments the input utterance into tokens; a feature vector is acquired for each token; the token feature vectors are fed, in order, into forward and backward long short-term memory networks to obtain a forward hidden vector and a backward hidden vector, respectively; and the forward and backward hidden vectors are concatenated to output the full set of conversational paths for the input utterance together with their confidences.
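A minimal sketch of the final stage of this pipeline, assuming the internals described above: concatenate the forward and backward hidden vectors, then apply a linear layer and a softmax over the candidate paths. Shapes and names are illustrative, not from the patent:

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of raw path scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def path_distribution(forward_hidden, backward_hidden, weight_rows, biases):
    """Concatenate the forward and backward LSTM hidden vectors, apply one
    linear layer (a weight row per candidate path), and softmax the result
    into a confidence per full-set conversational path."""
    hidden = forward_hidden + backward_hidden  # list concatenation = vector concat
    scores = [sum(w * x for w, x in zip(row, hidden)) + b
              for row, b in zip(weight_rows, biases)]
    return softmax(scores)
```

In the real model, the hidden vectors would come from the BiLSTM over RoBERTa token features; here they are supplied directly.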
As shown in FIG. 4, in the structure of the pre-trained conversational path selection model, the input first passes through the tokenizer 401 for word segmentation. Taking "May I ask which year your car was bought?" as an example, the pre-trained model's dedicated tokenizer segments the original Chinese sentence into the tokens "may I ask", "your car", "is", "which year", "buy", "de (的)", "?".
Next, the feature vector of each token is obtained. Specifically, a start tag is added before the first token of the input sentence and an end tag after the last token; the tagged token sequence is fed into the token embedding model to obtain token encodings; segment encodings are obtained from the start and end tags, and position encodings from each token's position; an encoding vector is obtained from the token, segment, and position encodings; and the encoding vector is processed through an attention mechanism to obtain the feature vector.
The start tag and end tag identify the beginning and end of the input utterance, that is, the sentence boundaries. For example, with start tag "[CLS]" and end tag "[SEP]", token embedding (Token Embedding), segment embedding (Segment Embedding), and position embedding (Position Embedding) are performed to obtain the token, segment, and position encodings. "[CLS]" and "[SEP]" are added at the beginning and end of each sentence to mark its boundaries, giving the sequence "[CLS]", "may I ask", "your car", "is", "which year", "buy", "de (的)", "?", "[SEP]". The pre-trained token embedding model 402 then generates a 768-dimensional vector for each token. The segment encoding is an all-zero index vector with one entry per embedded token, i.e., [0, 0, ..., 0], and the position encoding is computed as follows:
PE(pos, 2i) = sin(pos / 10000^(2i / d_model))
PE(pos, 2i+1) = cos(pos / 10000^(2i / d_model))
where pos is the position of the token in the sequence, i indexes the encoding dimension, and d_model is the dimension of the model's encodings; in this embodiment d_model may be 512, with i ranging from 0 to 255. The position encoding thus characterizes the position of each token within the sequence. The three encodings are summed and fed into the 12 attention units 403. Each attention unit has a self-attention layer and a feed-forward neural network layer, and the self-attention layer computes, over the encoding vectors:
Attention(Q, K, V) = softmax(Q·K^T / sqrt(d_k))·V
where Attention(Q, K, V) is the output vector of one attention head; Q is the query matrix, K the key matrix, and V the value matrix, each obtained by multiplying the input encoding vectors by a learned weight matrix; and d_k is the dimension of the key vectors, used as a scaling constant. RoBERTa has 12 attention heads; the matrices computed by the 12 heads are concatenated into one matrix, which is added to the original input matrix and normalized before being fed into the feed-forward network. The output is again added and normalized and passed to the next attention unit, and the process repeats to obtain the feature vectors.
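As an illustration of the two computations above, the following NumPy sketch builds summed encoding vectors (token + segment + position) and runs one self-attention head over them. The dimensions and random weight matrices are illustrative stand-ins for the model's learned parameters, not the patented implementation.

```python
import numpy as np

def position_encoding(seq_len, d_model=512):
    """PE(pos, 2i) = sin(pos/10000^(2i/d_model)), PE(pos, 2i+1) = cos(...); d_model assumed even."""
    pos = np.arange(seq_len)[:, None]          # (seq_len, 1)
    i = np.arange(d_model // 2)[None, :]       # (1, d_model/2)
    angles = pos / np.power(10000.0, 2 * i / d_model)
    pe = np.zeros((seq_len, d_model))
    pe[:, 0::2] = np.sin(angles)               # even dimensions
    pe[:, 1::2] = np.cos(angles)               # odd dimensions
    return pe

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def attention_head(X, Wq, Wk, Wv):
    """One head: Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V."""
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[-1])    # (seq_len, seq_len)
    weights = softmax(scores, axis=-1)         # each row sums to 1
    return weights @ V, weights

rng = np.random.default_rng(0)
d_model = 512
token_enc = rng.normal(size=(9, d_model))      # token encodings for 9 tokens incl. [CLS]/[SEP]
segment_enc = np.zeros((9, d_model))           # single sentence: all-zero segment encoding
enc = token_enc + segment_enc + position_encoding(9, d_model)  # summed encoding vectors
Wq, Wk, Wv = (rng.normal(size=(d_model, 64)) for _ in range(3))
out, w = attention_head(enc, Wq, Wk, Wv)
```

In the real model, 12 such heads are concatenated and passed through the add-and-normalize plus feed-forward stages described above.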
Further, the RoBERTa outputs are fed in order into the forward LSTM unit 404 and the backward LSTM unit 405 for computation. Specifically, the attention-unit output vectors corresponding to "[CLS]", "may I ask", "your car", "was", "which year", "bought", "?", "[SEP]" are fed into the LSTM unit in that order to obtain the forward hidden vector; the attention-unit output vectors corresponding to "[SEP]", "?", "bought", "which year", "was", "your car", "may I ask", "[CLS]" are fed into the LSTM unit in that order to obtain the backward hidden vector. After the forward and backward hidden vectors are concatenated, the result is fed into the softmax layer 406 to obtain the selection probability distribution over the utterance paths.
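The forward/backward reading and the concatenate-then-softmax step above can be sketched as follows. A plain tanh recurrent cell stands in for the LSTM units to keep the example short, the five output classes are hypothetical utterance paths, and all weights are random placeholders for trained parameters.

```python
import numpy as np

def run_rnn(seq, Wx, Wh):
    """Simplified recurrent cell (stand-in for an LSTM unit): h_t = tanh(Wx x_t + Wh h_{t-1})."""
    h = np.zeros(Wh.shape[0])
    for x in seq:
        h = np.tanh(Wx @ x + Wh @ h)
    return h

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

rng = np.random.default_rng(1)
feats = rng.normal(size=(9, 768))                          # attention-unit outputs per token
Wx_f, Wh_f = rng.normal(size=(128, 768)) * 0.01, rng.normal(size=(128, 128)) * 0.01
Wx_b, Wh_b = rng.normal(size=(128, 768)) * 0.01, rng.normal(size=(128, 128)) * 0.01
W_out = rng.normal(size=(5, 256)) * 0.01                   # 5 hypothetical utterance paths

h_fwd = run_rnn(feats, Wx_f, Wh_f)                         # read tokens left to right
h_bwd = run_rnn(feats[::-1], Wx_b, Wh_b)                   # read tokens right to left
h = np.concatenate([h_fwd, h_bwd])                         # concatenated hidden vector, dim 256
path_probs = softmax(W_out @ h)                            # confidence per utterance path
```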
In another embodiment, as shown in fig. 5, a knowledge-graph-based utterance path selection method comprises:
Step 502, acquiring a corresponding intention knowledge graph according to the current business scenario, wherein the intention knowledge graph is created in advance according to the intentions in the current business scenario.
Step 504, acquiring an input utterance, and predicting the confidence of each utterance path corresponding to the input utterance in the intention knowledge graph.
Step 506, determining whether the highest confidence among the utterance paths is less than a threshold. If not, step 507 is executed; if so, step 508 is executed.
Step 507, determining a target utterance path according to the highest confidence, and determining the output utterance for the input utterance according to the target utterance path.
For example, suppose the RoBERTa model computes that the probability of "I bought the car the year before last" being the next utterance after the input is 0.1, and that of "I bought car insurance" is 0.2. Neither probability is high, indicating that no input question with this semantic pattern occurs in the training samples, so step 508 is executed.
Step 508, acquiring a plurality of preset guiding utterances corresponding to the input utterance.
A preset guiding utterance is speech content configured in advance to keep the conversation going. A complete set of guiding utterances can be configured, or guiding utterances can be preset for each intention node. For example, a guiding utterance may be a question posed as a prompt.
Step 510, predicting the probability of each preset guiding utterance being the output utterance for the input utterance.
Specifically, the probability of each preset guiding utterance being the output utterance for the input utterance is predicted with a pre-built next-sentence prediction model: the input utterance is taken as the preceding sentence and each preset guiding utterance as the following sentence, and the pair is fed into the pre-trained prediction model; the probability, output by the prediction model, of each preset guiding utterance being the output utterance is then obtained.
Specifically, the input utterance and each system-preset guiding utterance are fed into the prediction model in turn. The prediction model separates the two utterances with a "[SEP]" token and adds "[CLS]" and "[SEP]" marks at the head and tail. Suppose the two configured guiding utterances are "I bought the car the year before last" and "I did not buy car insurance"; there are then two input passes. The first input is "[CLS]", followed by the characters of "May I ask which year your car was bought?", "[SEP]", the characters of "I bought the car the year before last", and "[SEP]"; the second input replaces the second sentence with the characters of "I did not buy car insurance". The BERT structure here is consistent with RoBERTa. Following BERT's convention, the first sentence together with "[CLS]" and the first "[SEP]" receives segment index 0, and the second sentence together with the final "[SEP]" receives segment index 1, so the segment vectors of the two passes look like [0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1,1] and [0,0,0,0,0,0,0,0,0,0,0,0,0,1,1,1,1,1,1]. On the downstream output side, the vector at the "[CLS]" position in the last attention layer is taken and passed through the softmax layer to output a binary probability distribution, i.e. the probability that the second sentence of each pass is the next sentence. Because "[CLS]" carries no distinct semantic content of its own, it fuses the semantic contributions of every component of the sentences more fairly and serves as a global representation of the preceding and following sentences.
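A minimal sketch of the sentence-pair input construction described above, assuming character-level tokenization in place of the model's real tokenizer:

```python
def build_nsp_input(first: str, second: str):
    """Build BERT-style [CLS]/[SEP] tokens and 0/1 segment ids for a sentence pair.

    Character-level splitting stands in for the real tokenizer; the convention is
    that the first sentence plus [CLS] and the first [SEP] get segment id 0,
    and the second sentence plus the final [SEP] get segment id 1.
    """
    tokens = ["[CLS]", *first, "[SEP]", *second, "[SEP]"]
    boundary = len(first) + 2                      # [CLS] + first sentence + first [SEP]
    segment_ids = [0] * boundary + [1] * (len(second) + 1)
    return tokens, segment_ids

tokens, segs = build_nsp_input("which year", "year before last")
```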
Step 512, taking the preset guiding utterance with the highest probability as the output utterance for the input utterance.
As shown in fig. 6, the next sentence for the input utterance is determined by the prediction model: for the first input, the probability that "I bought the car the year before last" is the next sentence is 0.8; for the second input, the probability that "I did not buy car insurance" is the next sentence is 0.3. "I bought the car the year before last" is therefore selected as the next sentence to return.
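The routing logic of steps 506-512 can be sketched as a small function. The threshold value and the candidate dictionaries below are illustrative, echoing the scores in the worked example above.

```python
def select_output(path_confidences: dict, guide_probs: dict, threshold: float = 0.5):
    """Return the best utterance path when its confidence clears the threshold,
    otherwise fall back to the most probable preset guiding utterance.
    The 0.5 threshold is an assumed example value, not taken from the patent."""
    best_path, best_conf = max(path_confidences.items(), key=lambda kv: kv[1])
    if best_conf >= threshold:
        return ("path", best_path)                     # step 507: target utterance path
    best_guide, _ = max(guide_probs.items(), key=lambda kv: kv[1])
    return ("guide", best_guide)                       # steps 508-512: guiding utterance

# Mirrors the worked example: path confidences 0.1 and 0.2 both fall below the
# threshold, so the guiding utterance scored 0.8 is returned instead.
paths = {"bought year before last": 0.1, "bought car insurance": 0.2}
guides = {"I bought the car the year before last": 0.8, "I did not buy car insurance": 0.3}
```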
The prediction model is bidirectional: it can deeply mine the causal and continuation relations between a specific preceding sentence and following sentence, overcoming the limitation that the RoBERTa-BiLSTM model mines only the features of the preceding utterance in one direction. By weighing the semantic relation between the two sentences more carefully, it can effectively improve prediction performance on local utterance selection. Moreover, the prediction model can score any two utterances, overcoming the limitation that the RoBERTa-BiLSTM model can only predict among a finite set of utterance outcomes.
This context semantic compensation mechanism largely mitigates the inaccuracy of a single-model approach. By configuring questions that actively steer the conversation topic, objective behaviors in a dialogue, such as a user changing their mind, an agent answering off-topic, or jumps of thought, can be simulated more realistically; the conversation topic stays open, the trainee's line of response is drawn out, and the training effect in the customer service and tele-marketing field is improved. In real-time utterance selection, configuring reasonable guiding questions makes the dialogue smoother and more human, and gracefully defuses the awkwardness of an irrelevant answer, so the conversation proceeds in a more harmonious atmosphere.
In another embodiment, a method of training the prediction model comprises: obtaining utterance samples; according to utterance similarity, taking the guiding utterance that best matches an utterance sample as a positive example and a guiding utterance that does not match it as a negative example; and feeding the positive and negative examples into a semantic model for training to obtain the prediction model.
Here an utterance sample is an utterance used to train the prediction model. The positive and negative examples of an utterance sample are determined with a similarity judgment model. A positive example is a next sentence that has some relevance to the input utterance; a negative example is a next sentence that is irrelevant to the input utterance and would not be output. Specifically, RoBERTa is used as the pre-trained model to encode the input utterance into a semantic vector, which serves as the reference for sentence-pair similarity comparison. Each guiding question is encoded into a 768-dimensional semantic vector, and the vectors are mapped into buckets with a locality-sensitive hashing (LSH) algorithm, so that guiding questions within a bucket are similar with high probability and those in different buckets are similar with low probability, which facilitates similarity retrieval. The bucketed data is stored as the utterance similarity judgment model.
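The bucketing step can be illustrated with random-hyperplane LSH, one common locality-sensitive hashing scheme for cosine similarity; the patent does not specify which LSH family is used, so this choice is an assumption.

```python
import numpy as np

def lsh_bucket(vec: np.ndarray, planes: np.ndarray) -> int:
    """Map a semantic vector to a bucket id via random-hyperplane LSH: the sign
    pattern of the vector against each hyperplane forms the bucket key, so
    vectors with high cosine similarity tend to share a bucket."""
    bits = (planes @ vec) > 0
    return int("".join("1" if b else "0" for b in bits), 2)

rng = np.random.default_rng(2)
planes = rng.normal(size=(8, 768))        # 8 random hyperplanes -> up to 256 buckets
v = rng.normal(size=768)                  # a 768-dim guiding-question vector
near = v + 0.001 * rng.normal(size=768)   # a nearly identical vector, likely same bucket
```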
Next-sentence prediction (NSP) based on the BERT model is adopted: a trainee utterance paired with its best-matching guiding question forms a positive example, and the same utterance paired with a completely unmatched guiding question forms a negative example, with a 1:1 ratio of positive to negative examples. The mismatches for negative examples are screened by the utterance similarity judgment module: besides the guiding question matched to the trainee utterance, guiding questions whose semantic similarity to it falls below a threshold, i.e. those lying in different buckets from the matched question, are retrieved as the unmatched guiding questions. The positive and negative samples are fed into the BERT pre-trained model and fine-tuned to obtain a prediction model fitted to the sample data.
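A minimal sketch of the 1:1 positive/negative sample construction described above, assuming a precomputed bucket id per guiding question; the `bucket_of` mapping and the example pairs are hypothetical.

```python
import random

def build_nsp_samples(pairs, bucket_of, seed=0):
    """Build 1:1 NSP training samples: each (utterance, matched_guide) pair yields
    one positive example (label 1) plus one negative example (label 0) drawn from
    a guiding question in a different LSH bucket than the matched one."""
    rng = random.Random(seed)
    guides = [g for _, g in pairs]
    samples = []
    for utt, guide in pairs:
        samples.append((utt, guide, 1))                       # positive example
        candidates = [g for g in guides
                      if bucket_of[g] != bucket_of[guide]]    # different bucket only
        samples.append((utt, rng.choice(candidates), 0))      # negative example
    return samples

pairs = [("which year was the car bought", "I bought the car the year before last"),
         ("did you buy car insurance", "I did not buy car insurance")]
buckets = {"I bought the car the year before last": 3, "I did not buy car insurance": 7}
samples = build_nsp_samples(pairs, buckets)
```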
The knowledge-graph-based utterance path selection method is grounded in customer utterances of specific scenes in the customer service or tele-marketing field. Domain utterance constraints are applied through the knowledge graph; a neural-network-based utterance path selection model weighs the semantic relevance of question and answer, and the output utterance with the highest matching probability is selected from the scene utterance options defined by the knowledge graph as the utterance the system returns. For arbitrary return utterances, a context semantic compensation mechanism is provided: when the credibility of the return utterance falls below the threshold, the question most closely related to the utterance's context can be selected from a system-preset list of guiding questions and returned, so that the system handles unknown situations in a more human, more intelligent way that better matches how people deal with questions. Through the context semantic compensation mechanism, a more reliable next utterance can be provided when the model's judgment is not credible; the semantic relevance within the locally bounded set of system guiding questions is fine-tuned, effectively improving utterance matching degree and accuracy. At the same time, the fixed-result-domain limitation of the semantic understanding model can be overcome: questions that are strongly leading and span a wide train of thought, yet retain some continuity with the context, can be added to the guiding-question list, breaking the template pattern in which the utterance path selection model simply follows the scene flow when selecting utterances.
Because the method mines the causal connections of an open context well, it can actively steer the direction of the conversation in practical applications, simulating the shifts of thought and topic changes that occur during dialogue and enhancing the flexibility and initiative of the dialogue system.
In one embodiment, as shown in fig. 7, a knowledge-graph-based utterance path selection apparatus comprises:
the graph acquisition module 702, configured to acquire a corresponding intention knowledge graph according to the current business scenario, wherein the intention knowledge graph is created in advance according to the intentions in the current business scenario;
the prediction module 704, configured to acquire an input utterance and predict the confidence of each utterance path corresponding to the input utterance in the intention knowledge graph; and
the output module 706, configured to determine a target utterance path according to the highest confidence, and determine the output utterance for the input utterance according to the target utterance path.
The knowledge-graph-based utterance path selection apparatus is grounded in customer utterances of specific scenes in the customer service or tele-marketing field; business-scenario utterance constraints are applied through the knowledge graph, and the next utterance with the highest matching probability is selected from the scene utterance options defined by the knowledge graph as the utterance the system returns. Because utterance path selection is based on an intention knowledge graph that fits the current business scenario, it is constrained by the specific business scenario, so path choices that suit the current business scenario can be made and the matching degree and accuracy of the utterances are improved.
In another embodiment, the apparatus further comprises:
the guiding utterance acquisition module, configured to acquire a plurality of preset guiding utterances corresponding to the input utterance; and
the guiding prediction module, configured to predict the probability of each preset guiding utterance being the output utterance for the input utterance.
The output module is further configured to take the preset guiding utterance with the highest probability as the output utterance for the input utterance.
In another embodiment, the prediction module comprises:
the semantic analysis module, configured to input the input utterance into a pre-trained, semantic-analysis-based utterance path selection model to obtain the full set of utterance paths corresponding to the input utterance and their confidences; and
the constraint module, configured to select the utterance paths corresponding to the intention knowledge graph from the full set of utterance paths.
In another embodiment, the guiding prediction module is configured to take the input utterance as the preceding sentence and each preset guiding utterance as the following sentence, feed them into a pre-trained prediction model, and acquire the probability, output by the prediction model, of each preset guiding utterance being the output utterance.
In another embodiment, the semantic analysis module is configured to take the input utterance as the input of the pre-trained, semantic-analysis-based utterance path selection model; tokenize the input utterance with a tokenizer of the utterance path selection model; acquire a feature vector for each token; feed the feature vectors of the tokens, in order, into forward and backward long short-term memory networks to obtain a forward hidden vector and a backward hidden vector respectively; and concatenate the forward and backward hidden vectors and output the full set of utterance paths for the input utterance and their confidences.
In another embodiment, the manner of acquiring the feature vector for each token comprises: adding a start tag before the first token of the input utterance and an end tag after the last token; feeding the tokens with the start and end tags added, in order, into a token embedding model to obtain token encodings; obtaining segment encodings according to the start and end tags, and position encodings according to the positions of the tokens; obtaining encoding vectors from the token, segment, and position encodings; and processing the encoding vectors with an attention mechanism to obtain the feature vectors.
In another embodiment, the apparatus further comprises a prediction training module, configured to obtain utterance samples; according to utterance similarity, take the guiding utterance that best matches an utterance sample as a positive example and a guiding utterance that does not match it as a negative example; and feed the positive and negative examples into a semantic model for training to obtain the prediction model.
For specific limitations of the knowledge-graph-based utterance path selection apparatus, reference may be made to the limitations of the knowledge-graph-based utterance path selection method above, which are not repeated here. Each module of the above apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in, or independent of, a processor in the computer device in hardware form, or stored in a memory in the computer device in software form, so that the processor can invoke and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a server, and whose internal structure may be as shown in fig. 8. The computer device includes a processor, a memory, and a network interface connected by a system bus. The processor of the computer device provides computing and control capabilities. The memory of the computer device comprises a non-volatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for running the operating system and the computer program in the non-volatile storage medium. The network interface of the computer device is used to communicate with external terminals over a network connection. The computer program, when executed by the processor, implements a knowledge-graph-based utterance path selection method.
Those skilled in the art will appreciate that the architecture shown in fig. 8 is merely a block diagram of part of the structure related to the present disclosure and does not limit the computer devices to which the present disclosure applies; a particular computer device may include more or fewer components than shown, combine certain components, or arrange the components differently.
In one embodiment, a computer device is provided, comprising a memory and a processor, the memory storing a computer program, and the processor implementing the following steps when executing the computer program:
acquiring a corresponding intention knowledge graph according to the current business scenario, wherein the intention knowledge graph is created in advance according to the intentions in the current business scenario;
acquiring an input utterance, and predicting the confidence of each utterance path corresponding to the input utterance in the intention knowledge graph;
determining a target utterance path according to the highest confidence, and determining the output utterance for the input utterance according to the target utterance path.
In another embodiment, the processor, when executing the computer program, further implements the following steps:
if the highest confidence among the utterance paths is less than a threshold, acquiring a plurality of preset guiding utterances corresponding to the input utterance;
predicting the probability of each preset guiding utterance being the output utterance for the input utterance;
taking the preset guiding utterance with the highest probability as the output utterance for the input utterance.
In another embodiment, acquiring an input utterance and predicting the confidence of each utterance path corresponding to the input utterance in the intention knowledge graph comprises:
inputting the input utterance into a pre-trained, semantic-analysis-based utterance path selection model to obtain the full set of utterance paths corresponding to the input utterance and their confidences;
selecting the utterance paths corresponding to the intention knowledge graph from the full set of utterance paths.
In another embodiment, predicting the probability of a preset guiding utterance being the output utterance for the input utterance comprises:
taking the input utterance as the preceding sentence and a preset guiding utterance as the following sentence, and feeding them into a pre-trained prediction model;
acquiring the probability, output by the prediction model, of each preset guiding utterance being the output utterance.
In another embodiment, inputting the input utterance into a pre-trained, semantic-analysis-based utterance path selection model to obtain the full set of utterance paths corresponding to the input utterance and their confidences comprises:
taking the input utterance as the input of the pre-trained, semantic-analysis-based utterance path selection model;
tokenizing the input utterance with a tokenizer of the utterance path selection model;
acquiring a feature vector for each token;
feeding the feature vectors of the tokens, in order, into forward and backward long short-term memory networks to obtain a forward hidden vector and a backward hidden vector respectively;
concatenating the forward hidden vector and the backward hidden vector, and outputting the full set of utterance paths for the input utterance and their confidences.
In another embodiment, acquiring the feature vector for each token comprises:
adding a start tag before the first token of the input utterance and an end tag after the last token;
feeding the tokens with the start and end tags added, in order, into a token embedding model to obtain token encodings;
obtaining segment encodings according to the start and end tags, and position encodings according to the positions of the tokens;
obtaining encoding vectors from the token, segment, and position encodings;
processing the encoding vectors with an attention mechanism to obtain the feature vectors.
In another embodiment, a method of training the prediction model comprises:
obtaining utterance samples;
according to utterance similarity, taking the guiding utterance that best matches an utterance sample as a positive example and a guiding utterance that does not match it as a negative example;
feeding the positive and negative examples into a semantic model for training to obtain the prediction model.
In one embodiment, a computer-readable storage medium is provided, on which a computer program is stored; the computer program, when executed by a processor, implements the following steps:
acquiring a corresponding intention knowledge graph according to the current business scenario, wherein the intention knowledge graph is created in advance according to the intentions in the current business scenario;
acquiring an input utterance, and predicting the confidence of each utterance path corresponding to the input utterance in the intention knowledge graph;
determining a target utterance path according to the highest confidence, and determining the output utterance for the input utterance according to the target utterance path.
In another embodiment, if the highest confidence among the utterance paths is less than a threshold, a plurality of preset guiding utterances corresponding to the input utterance are acquired;
the probability of each preset guiding utterance being the output utterance for the input utterance is predicted;
the preset guiding utterance with the highest probability is taken as the output utterance for the input utterance.
In another embodiment, acquiring an input utterance and predicting the confidence of each utterance path corresponding to the input utterance in the intention knowledge graph comprises:
inputting the input utterance into a pre-trained, semantic-analysis-based utterance path selection model to obtain the full set of utterance paths corresponding to the input utterance and their confidences;
selecting the utterance paths corresponding to the intention knowledge graph from the full set of utterance paths.
In another embodiment, predicting the probability of a preset guiding utterance being the output utterance for the input utterance comprises:
taking the input utterance as the preceding sentence and a preset guiding utterance as the following sentence, and feeding them into a pre-trained prediction model;
acquiring the probability, output by the prediction model, of each preset guiding utterance being the output utterance.
In another embodiment, inputting the input utterance into a pre-trained, semantic-analysis-based utterance path selection model to obtain the full set of utterance paths corresponding to the input utterance and their confidences comprises:
taking the input utterance as the input of the pre-trained, semantic-analysis-based utterance path selection model;
tokenizing the input utterance with a tokenizer of the utterance path selection model;
acquiring a feature vector for each token;
feeding the feature vectors of the tokens, in order, into forward and backward long short-term memory networks to obtain a forward hidden vector and a backward hidden vector respectively;
concatenating the forward hidden vector and the backward hidden vector, and outputting the full set of utterance paths for the input utterance and their confidences.
In another embodiment, acquiring the feature vector for each token comprises:
adding a start tag before the first token of the input utterance and an end tag after the last token;
feeding the tokens with the start and end tags added, in order, into a token embedding model to obtain token encodings;
obtaining segment encodings according to the start and end tags, and position encodings according to the positions of the tokens;
obtaining encoding vectors from the token, segment, and position encodings;
processing the encoding vectors with an attention mechanism to obtain the feature vectors.
In another embodiment, a method of training the prediction model comprises:
obtaining utterance samples;
according to utterance similarity, taking the guiding utterance that best matches an utterance sample as a positive example and a guiding utterance that does not match it as a negative example;
feeding the positive and negative examples into a semantic model for training to obtain the prediction model.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing the relevant hardware; the computer program can be stored in a non-volatile computer-readable storage medium, and when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be arbitrarily combined, and for the sake of brevity, all possible combinations of the technical features in the above embodiments are not described, but should be considered as the scope of the present specification as long as there is no contradiction between the combinations of the technical features.
The above-mentioned embodiments only express several embodiments of the present application, and the description thereof is specific and detailed, but not to be understood as limiting the scope of the invention. It should be noted that, for a person skilled in the art, several variations and modifications can be made without departing from the concept of the present application, and these are all within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims (10)

1. A knowledge-graph-based utterance path selection method, the method comprising:
acquiring a corresponding intention knowledge graph according to a current business scenario, wherein the intention knowledge graph is created in advance according to the intentions in the current business scenario; the intention knowledge graph comprises a plurality of intention nodes and relationships between the intention nodes;
acquiring an input utterance, predicting confidences for the full set of utterance paths corresponding to the input utterance, and selecting, from the full set of utterance paths, the utterance paths pointed to by the intention node corresponding to the input utterance in the intention knowledge graph; the full set of utterance paths comprises all intention branch paths in the intention knowledge graph;
if the highest confidence among the utterance paths with path pointing in the full set of utterance paths is greater than or equal to a threshold, determining a target utterance path according to the highest confidence, and determining the output utterance for the input utterance according to the target utterance path;
if the highest confidence among the utterance paths with path pointing in the full set of utterance paths is less than the threshold, acquiring a plurality of preset guiding utterances corresponding to the input utterance;
taking the input utterance as the preceding sentence and each preset guiding utterance as the following sentence, feeding them into a pre-trained prediction model, and acquiring the probability, output by the prediction model, of each preset guiding utterance being the output utterance;
and taking the preset guiding utterance with the highest probability as the output utterance for the input utterance.
2. The method of claim 1, wherein acquiring the input utterance, predicting confidences for the full set of utterance paths corresponding to the input utterance, and selecting the utterance paths corresponding to the intention knowledge graph from the full set of utterance paths comprises:
inputting the input utterance into a pre-trained, semantic-analysis-based utterance path selection model to obtain the full set of utterance paths corresponding to the input utterance and their confidences;
and selecting the utterance paths corresponding to the intention knowledge graph from the full set of utterance paths.
3. The method of claim 2, wherein the inputting the input speech into a pre-trained conversational path selection model based on semantic analysis to obtain the full-scale conversational paths corresponding to the input speech and their confidence degrees comprises:
taking the input speech as the input of the pre-trained conversational path selection model based on semantic analysis;
performing word segmentation on the input speech with a tokenizer of the conversational path selection model;
acquiring a feature vector of each segmented word;
sequentially inputting the feature vector of each segmented word into forward and backward long short-term memory networks to obtain a forward hidden-layer vector and a backward hidden-layer vector, respectively;
and concatenating the forward hidden-layer vector and the backward hidden-layer vector, and outputting the full-scale conversational paths of the input speech and their confidence degrees.
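The bidirectional pass of claim 3 can be outlined as below. This is a toy sketch, not the claimed model: `toy_step` is a stand-in recurrence (a real implementation would apply LSTM gates, e.g. via a bidirectional LSTM layer in a deep-learning framework), and the dimension and update rule are invented for illustration.

```python
def toy_step(h, x):
    # stand-in for one LSTM time step; real gates omitted for brevity
    return [hi * 0.5 + xi for hi, xi in zip(h, x)]

def bilstm_features(token_vecs, dim=4):
    """token_vecs: feature vectors of the segmented words, in sentence order."""
    h_f = [0.0] * dim
    for x in token_vecs:             # forward pass over the segmented words
        h_f = toy_step(h_f, x)
    h_b = [0.0] * dim
    for x in reversed(token_vecs):   # backward pass over the same words
        h_b = toy_step(h_b, x)
    return h_f + h_b                 # concatenated vector fed to the path classifier
```

The concatenated forward and backward hidden-layer vectors give the classifier context from both directions of the sentence, which is the point of running two networks rather than one.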
4. The method of claim 3, wherein the acquiring a feature vector of each segmented word comprises:
adding a start tag before the first segmented word of the input speech and an end tag after the last segmented word;
sequentially inputting each segmented word, with the start tag and end tag added, into a token embedding model to obtain a token encoding;
obtaining a segment encoding according to the start tag and the end tag, and obtaining a position encoding according to the position of each segmented word;
obtaining an encoding vector according to the token encoding, the segment encoding and the position encoding;
and processing the encoding vector through an attention mechanism to obtain the feature vector.
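The encoding-vector construction in claim 4 resembles the BERT-style sum of token, segment and position embeddings. The sketch below shows only that summation; the embedding values, tag names (`[CLS]`/`[SEP]`), and the attention step that follows are assumptions for illustration, not the patented model.

```python
def encode(tokens, dim=4):
    """Build one encoding vector per token (including start/end tags)."""
    toks = ["[CLS]"] + tokens + ["[SEP]"]            # start tag / end tag
    # toy token-embedding table; a real model looks these up from trained weights
    token_emb = {t: [(hash(t) % 7) * 0.1] * dim for t in toks}
    seg_emb = [0.01] * dim                            # single-segment encoding
    vecs = []
    for pos, t in enumerate(toks):
        pos_emb = [pos * 0.001] * dim                 # position encoding
        # encoding vector = token encoding + segment encoding + position encoding
        vecs.append([a + b + c for a, b, c in zip(token_emb[t], seg_emb, pos_emb)])
    return vecs
```

In the claimed method these encoding vectors would then pass through an attention mechanism to yield the per-word feature vectors consumed by the bidirectional network of claim 3.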
5. The method of claim 1, wherein training the prediction model comprises:
acquiring speech samples;
according to speech similarity, determining the guide speech that best matches a speech sample as a positive sample, and determining a guide speech that does not match the speech sample as a negative sample;
and inputting the positive samples and the negative samples into a semantic model for training to obtain the prediction model.
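The sample construction of claim 5 can be sketched as below. The `similarity` function here is a deliberately simple word-overlap stand-in; the claim itself does not specify the similarity measure, and a real system would likely score similarity with the semantic model being trained.

```python
def similarity(a, b):
    # Jaccard word overlap as an assumed stand-in for speech similarity
    sa, sb = set(a.split()), set(b.split())
    return len(sa & sb) / max(len(sa | sb), 1)

def build_pairs(sample, guide_speeches):
    """Pair a speech sample with its best-matching guide speech (label 1)
    and a non-matching guide speech (label 0) for semantic-model training."""
    ranked = sorted(guide_speeches, key=lambda g: similarity(sample, g), reverse=True)
    positive = (sample, ranked[0], 1)    # best match -> positive sample
    negative = (sample, ranked[-1], 0)   # worst match -> negative sample
    return [positive, negative]
```

Each (preceding sentence, following sentence, label) triple matches the upper-sentence/lower-sentence input format that claim 1 feeds to the prediction model at inference time.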
6. A knowledge-graph-based conversational path selection apparatus, the apparatus comprising:
a graph acquisition module, configured to acquire a corresponding intention knowledge graph according to a current service scenario, wherein the intention knowledge graph is pre-created according to intentions in the current service scenario; the intention knowledge graph comprises a plurality of intention nodes and relationships between the intention nodes;
a prediction module, configured to acquire an input speech, predict confidence degrees of full-scale conversational paths corresponding to the input speech, and select, from the full-scale conversational paths, the conversational paths to which the intention node corresponding to the input speech points in the intention knowledge graph; wherein the full-scale conversational paths are all intention branch paths in the intention knowledge graph;
a guide speech acquisition module, configured to acquire a plurality of preset guide speeches corresponding to the input speech when the highest confidence degree among the conversational paths with path direction in the full-scale conversational paths is smaller than a threshold;
a guide prediction module, configured to take the input speech as a preceding sentence and each preset guide speech as a following sentence, input them into a pre-trained prediction model, and acquire, from the prediction model, the probability of each preset guide speech being the output speech;
and an output module, configured to determine a target conversational path according to the highest confidence degree if the highest confidence degree among the conversational paths with path direction in the full-scale conversational paths is greater than or equal to the threshold, and determine the output speech of the input speech according to the target conversational path; and to take the preset guide speech with the highest probability as the output speech of the input speech if the highest confidence degree is smaller than the threshold.
7. The apparatus of claim 6, wherein the prediction module comprises:
a semantic analysis module, configured to input the input speech into a pre-trained conversational path selection model based on semantic analysis to obtain the full-scale conversational paths corresponding to the input speech and their confidence degrees;
and a constraint module, configured to select the conversational paths corresponding to the intention knowledge graph from the full-scale conversational paths.
8. The apparatus of claim 7, wherein the semantic analysis module is configured to: take the input speech as the input of the pre-trained conversational path selection model based on semantic analysis; perform word segmentation on the input speech with a tokenizer of the conversational path selection model; acquire a feature vector of each segmented word; sequentially input the feature vector of each segmented word into forward and backward long short-term memory networks to obtain a forward hidden-layer vector and a backward hidden-layer vector, respectively; and concatenate the forward hidden-layer vector and the backward hidden-layer vector, and output the full-scale conversational paths of the input speech and their confidence degrees.
9. A computer device comprising a memory and a processor, the memory storing a computer program, characterized in that the processor, when executing the computer program, implements the steps of the method of any one of claims 1 to 5.
10. A computer-readable storage medium on which a computer program is stored, characterized in that the computer program, when executed by a processor, implements the steps of the method of any one of claims 1 to 5.
CN202010751546.6A 2020-07-30 2020-07-30 Knowledge graph-based conversational path selection method and device and computer equipment Active CN111897935B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010751546.6A CN111897935B (en) 2020-07-30 2020-07-30 Knowledge graph-based conversational path selection method and device and computer equipment


Publications (2)

Publication Number Publication Date
CN111897935A CN111897935A (en) 2020-11-06
CN111897935B true CN111897935B (en) 2023-04-07

Family

ID=73183436

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010751546.6A Active CN111897935B (en) 2020-07-30 2020-07-30 Knowledge graph-based conversational path selection method and device and computer equipment

Country Status (1)

Country Link
CN (1) CN111897935B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112732882A (en) * 2020-12-30 2021-04-30 平安科技(深圳)有限公司 User intention identification method, device, equipment and computer readable storage medium
CN113032538A (en) * 2021-03-11 2021-06-25 五邑大学 Topic transfer method based on knowledge graph, controller and storage medium
CN113268610B (en) * 2021-06-22 2023-10-03 中国平安人寿保险股份有限公司 Intent jump method, device, equipment and storage medium based on knowledge graph
CN113297367A (en) * 2021-06-29 2021-08-24 中国平安人寿保险股份有限公司 Method for generating user conversation linking language and related equipment
CN113268580B (en) * 2021-07-15 2021-11-09 中国平安人寿保险股份有限公司 Session subject migration path mining method and device, computer equipment and medium

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110008322A (en) * 2019-03-25 2019-07-12 阿里巴巴集团控股有限公司 Art recommended method and device under more wheel session operational scenarios
CN110765244A (en) * 2019-09-18 2020-02-07 平安科技(深圳)有限公司 Method and device for acquiring answering, computer equipment and storage medium
CN110955764A (en) * 2019-11-19 2020-04-03 百度在线网络技术(北京)有限公司 Scene knowledge graph generation method, man-machine conversation method and related equipment
CN111444313A (en) * 2020-03-04 2020-07-24 深圳追一科技有限公司 Knowledge graph-based question and answer method and device, computer equipment and storage medium

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108897848A (en) * 2018-06-28 2018-11-27 北京百度网讯科技有限公司 Robot interactive approach, device and equipment



Similar Documents

Publication Publication Date Title
CN111897935B (en) Knowledge graph-based conversational path selection method and device and computer equipment
US10559225B1 (en) Computer-implemented systems and methods for automatically generating an assessment of oral recitations of assessment items
US11093813B2 (en) Answer to question neural networks
KR102199423B1 (en) An apparatus for machine learning the psychological counseling data and a method thereof
US20190005951A1 (en) Method of processing dialogue based on dialog act information
KR20200015418A (en) Method and computer readable storage medium for performing text-to-speech synthesis using machine learning based on sequential prosody feature
US20120221339A1 (en) Method, apparatus for synthesizing speech and acoustic model training method for speech synthesis
US11636272B2 (en) Hybrid natural language understanding
CN115599901B (en) Machine question-answering method, device, equipment and storage medium based on semantic prompt
CN112885336A (en) Training and recognition method and device of voice recognition system, and electronic equipment
CN114220461A (en) Customer service call guiding method, device, equipment and storage medium
CN112395887A (en) Dialogue response method, dialogue response device, computer equipment and storage medium
CN112084769A (en) Dependency syntax model optimization method, device, equipment and readable storage medium
US11615787B2 (en) Dialogue system and method of controlling the same
US11798578B2 (en) Paralinguistic information estimation apparatus, paralinguistic information estimation method, and program
Bai et al. Integrating knowledge into end-to-end speech recognition from external text-only data
CN113012685B (en) Audio recognition method and device, electronic equipment and storage medium
CN113096646B (en) Audio recognition method and device, electronic equipment and storage medium
Griol et al. Modeling users emotional state for an enhanced human-machine interaction
KR102551296B1 (en) Dialogue system and its method for learning to speak foreign language
CN113555006B (en) Voice information identification method and device, electronic equipment and storage medium
CN113254621B (en) Seat call prompting method and device, computer equipment and storage medium
US20240111960A1 (en) Assessing and improving the deployment of large language models in specific domains
CN117609574A (en) Speaking recommendation method and device, computer equipment and storage medium
CN113111652A (en) Data processing method and device and computing equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 100192 Room 401, building 4, area C, Dongsheng Science Park, 66 xixiaokou Road, Haidian District, Beijing

Applicant after: Zhongdian Jinxin Software Co.,Ltd.

Address before: 100192 Room 401, building 4, area C, Dongsheng Science Park, 66 xixiaokou Road, Haidian District, Beijing

Applicant before: Beijing Wensi Haihui Jinxin Software Co.,Ltd.

GR01 Patent grant