CN113762451B - Task type question-answering robot based on scene and keyword rules - Google Patents

Task type question-answering robot based on scene and keyword rules

Info

Publication number
CN113762451B
CN113762451B
Authority
CN
China
Prior art keywords
question
rules
answering
nodes
tree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110995597.8A
Other languages
Chinese (zh)
Other versions
CN113762451A (en)
Inventor
陈再蝶
朱晓秋
周杰
樊伟东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kangxu Technology Co ltd
Original Assignee
Kangxu Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kangxu Technology Co ltd filed Critical Kangxu Technology Co ltd
Priority to CN202110995597.8A
Publication of CN113762451A
Application granted
Publication of CN113762451B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/008 Artificial life, i.e. computing arrangements simulating life based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. based on robots replicating pets or humans in their appearance or behaviour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Robotics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a task-type question-answering robot based on scene and keyword rules. It comprises a multi-round question-answering tree, keyword rules, text similarity rules, a question emotion analysis mechanism and a slot configuration mechanism; the text similarity rules comprise a bert model and a cosine similarity model, and the question emotion analysis mechanism comprises a gbdt classifier. Different scenes are distinguished through primary keywords: each scene is an independent tree, and each tree is configured with its own primary and secondary keywords. Beyond the keyword combination rules at each node of each tree, algorithmic components such as text vectorization, text similarity and emotion analysis are added through the bert model, the cosine similarity model and the gbdt classifier, so the rule-based task question-answering robot need not rely entirely on rule configuration, overcoming the rigidity of traditional rule-based robots.

Description

Task type question-answering robot based on scene and keyword rules
Technical Field
The invention relates to the technical field of task type robots, in particular to a task type question-answering robot based on scene and keyword rules.
Background
A task robot is a robot that provides information or services under specific conditions. It typically serves users with a definite purpose in task scenes such as checking data usage, checking phone bills, ordering meals, booking tickets and consultation. Because user demands are complex, interaction generally takes multiple rounds, during which users may keep modifying and refining their demands; the task robot must help users clarify their purpose by querying, clarifying and confirming. Two implementation methods therefore exist in the industry: rule-based implementations and end-to-end implementations.
An end-to-end multi-round dialogue task robot tries to learn an overall mapping from natural language input on the user side to natural language output on the machine side, which improves the flexibility and extensibility of the system; however, such models demand very high data quality and quantity and lack interpretability, so the industry today mostly adopts rule-based implementations.
Rule-based multi-round dialogue task robots come in two forms. One is based on regular-expression matching, which is too strict on the questioner and relatively rigid.
The other resembles a commercial dialogue system: the input text is mapped to a semantic frame composed of several semantic slots, and the matching rule of one semantic slot consists of several slot value types and connectives, together expressing a complete piece of information. The disadvantages of this approach are: (1) rule development is error-prone; (2) adjusting rules requires multiple iterations; (3) rule conflicts are difficult to maintain; (4) complete dependence on rules prevents flexible, lively understanding of the user.
The task-type question-answering robot based on scene and keyword rules is therefore proposed, overcoming the defects of traditional rule-based question-answering robots: (1) relatively rigid handling of user questions; (2) rather limited application scenarios; (3) error-prone rule configuration that easily produces rule conflicts; (4) low accuracy of rule-based positioning.
Disclosure of Invention
In order to solve the technical problems mentioned in the background art, a task type question-answering robot based on scene and keyword rules is provided.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
the task type question-answering robot based on scene and keyword rules comprises a multi-round question-answering tree, keyword rules, text similarity rules, a question emotion analysis mechanism and a slot configuration mechanism, wherein the text similarity rules comprise a bert model and a cosine similarity model, the question emotion analysis mechanism comprises a gbdt classifier, and the bert model is subjected to fine tuning through a plurality of scene text data so as to be suitable for the cosine similarity model and the gbdt classifier;
the multi-round question-answering tree comprises a plurality of nodes, wherein the nodes are configured with corresponding keyword rules or emotion analysis mechanisms, the keyword rules comprise main keywords for positioning the multi-round question-answering tree and auxiliary keywords for positioning the nodes, question sentences are input to the question-answering robot, the nodes in the multi-round question-answering tree are positioned through the keyword rules, and corresponding question replies are matched;
the node also comprises a node for judging whether the previous round of node is a question or not, the node is not configured with a keyword rule, an emotion analysis mechanism and a text similarity rule, in the next round of node, the question sentence of the current round is divided into three types of yes, no and no positive answer questions through the question emotion analysis mechanism, corresponding answer matching is carried out for the clear yes and no categories, and multiple rounds of question-answering tree and node positioning are carried out again for the category without positive answer questions.
As a further description of the above technical solution:
the nodes configured with the keyword rules are synchronously configured with text similarity rules, the priority of the text similarity rules is lower than that of the keyword rules, namely when any node is not hit by the keyword rules, the corresponding node is hit by the text similarity rules, and the multi-round question-answering tree is reversely positioned.
As a further description of the above technical solution:
in the process of locating nodes and multiple rounds of question-answering trees through a text similarity rule, vectorizing question sentences through a bert model, and then carrying out similarity calculation and sequencing by using a cosine similarity model, wherein the node with the highest similarity and higher than a similarity threshold is used as a hit node;
the cosine similarity model is expressed as follows:
wherein A and B are two n-dimensional vectors that compute similarity.
As a further description of the above technical solution:
in the process of classifying question types through a question emotion analysis mechanism, a bert model is used for vectorizing the questions, and then a gbdt classifier is used for classifying the questions into three categories of yes, no and no positive answer questions.
As a further description of the above technical solution:
the gbdt classifier generates a weak classifier through multiple iterations, each classifier is trained on the residual error basis of the previous classifier, and the final gbdt classifier is obtained by weighting and summing the weak classifiers obtained through each training;
the weak classifier selects a classification regression tree, and the formula of the residual error is expressed as follows:
wherein the data (x i ,r im ) I=1, 2, ··, N is used as training data for the next round of classification regression tree, obtaining a new classification regression tree, wherein the corresponding leaf node area is R jm J=1, 2, the contents of (J), J is the number of leaf nodes;
the formula of the gbdt classifier is expressed as follows:
wherein f 0 (x) Gamma, the initial weak classifier jm For best fit value calculated for leaf area, R jm For the area of the leaf node, m represents the number of iterations, i.e. generationL is a loss function, c is an initial random given constant, I (x ε R) jm ) Indicating that it is determined whether x belongs to a leaf, to return 1, and not to return 0.
As a further description of the above technical solution:
the bert model extracts features, i.e., embedded vectors of words and sentences, from the text data, which are used as cosine similarity models or input features for the gbdt classifier.
As a further description of the above technical solution:
the nodes are provided with a slot configuration mechanism, namely, a dynamically configured substitute is arranged in the question reply of each node, and specific contents are matched according to different multi-round question-answering trees.
In summary, due to the adoption of the technical scheme, the beneficial effects of the invention are as follows:
through the multi-round question-answering tree, different scenes are distinguished through the primary key words while the flow of the question-answering robot is normalized, each scene is an independent tree, each tree is configured with independent auxiliary key words and primary key words, the same words can show different rules in different scenes, under the design, the configuration of the rule robot is more normalized, the contradiction condition of rule conflict in the traditional rule robot is avoided to a great extent due to the fact that the scenes are distinguished, the rule configuration is simpler and more efficient and the responsibility is clear, in each node of each tree, algorithm contents such as text vectorization, text similarity and emotion analysis are added through a bert model, a cosine similarity model and a gbdt classifier, the task-type question-answering robot based on rules is not required to be completely dependent on the rule configuration, the defect that the traditional rule robot is inflexible relative to a dead plate is overcome, finally, the self-immobilized in the nodes can be returned to different environments through the different configuration, and the different answer conditions can be returned to the scene.
Drawings
Fig. 1 shows a schematic question-answering flow diagram of a task-type question-answering robot based on scene and keyword rules, provided according to an embodiment of the present invention;
fig. 2 shows a schematic view of a scene positioning flow of a task type question-answering robot based on scene and keyword rules according to an embodiment of the present invention;
fig. 3 shows a schematic structural diagram of a bert model of a task type question-answering robot based on scene and keyword rules according to an embodiment of the present invention;
fig. 4 shows a schematic sentence input diagram of a bert model of a task type question-answering robot based on scene and keyword rules according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
Referring to figs. 1-4, the present invention provides a technical solution: the task type question-answering robot based on scene and keyword rules comprises a multi-round question-answering tree, keyword rules, text similarity rules, a question emotion analysis mechanism and a slot configuration mechanism. The text similarity rules comprise a bert model and a cosine similarity model, and the question emotion analysis mechanism comprises a gbdt classifier; the bert model is fine-tuned on text data from a plurality of scenes so as to suit the cosine similarity model and the gbdt classifier. The bert model extracts features from text data, namely embedded vectors of words and sentences, and these vectors serve as input features for the cosine similarity model or the gbdt classifier;
specifically, the bert model is a method for pre-training language representation, which can be used for extracting high-quality language features from text data, and can also be used for fine tuning the models by using own data to complete specific tasks (classification, entity identification, question answer and the like), so as to generate the most advanced prediction;
we use the bert model to extract features, i.e. embedded vectors of words and sentences, from the text data; these serve as high-quality feature inputs for downstream models. NLP models (e.g., LSTMs or CNNs) require input in the form of numeric vectors, which typically means converting the vocabulary and certain linguistic features into numeric representations. In the past, words were represented as unique index values (one-hot encoding), or, more usefully, as neural word embeddings, where vocabulary words are matched to fixed-length feature embeddings produced by models such as Word2Vec or fastText. The bert model offers an advantage over models such as Word2Vec: whereas Word2Vec gives each word a fixed representation regardless of the context in which it appears, the word representations generated by the bert model are dynamically informed by the words surrounding the word;
for example, given two sentences: "The man was accused of robbing a bank" and "The man went fishing by the bank of the river," Word2Vec will generate the same Word embeddings for the Word "bank" in both sentences, and different Word embeddings for "bank" in the bert model. In addition to capturing obvious differences such as word ambiguity, the contextually relevant words empeddings capture other forms of information that can yield more accurate feature representations, thereby improving model performance.
In principle: the bert model is a deep bidirectional pre-trained language understanding model that uses a Transformer as its feature extractor. It is a pre-training model, a language representation model trained on a large amount of data with a bidirectional Transformer. This general model is applied to downstream tasks, including classification, regression, machine translation and question-answering systems, by fine-tuning; the downstream model applied in this example is the cosine similarity model or the gbdt classifier.
The structure of the bert model is shown in figs. 3 and 4. In the Transformer Encoder, the attention computation at each moment can see the inputs of all moments. Compared with other models, the input of the bert model is the sum of three embeddings, Token Embeddings, Segment Embeddings and Position Embeddings, which serves the pre-training objective that includes predicting the next sentence;
the Input of the bert model is two sentences: "my dog is cure", "helike playing". First, a special Token [ CLS ] is added to the beginning of the first sentence to mark the beginning of the sentence, the [ SEP ] is used to mark the end of the sentence, then 3 references are made to each Token, the Embedding (Token Embeddings) of the word, the Embedding (Position Embeddings) position, and the Embedding (Segment Embeddings) sentence, and finally the three references are input to the next layer in a summation mode.
The multi-round question-answering tree comprises a plurality of nodes, the nodes are provided with corresponding keyword rules or emotion analysis mechanisms, the keyword rules comprise main keywords for positioning the multi-round question-answering tree and auxiliary keywords for positioning the nodes, question sentences are input to the question-answering robot, the nodes in the multi-round question-answering tree are positioned through the keyword rules, and corresponding question replies are matched;
the node further comprises a node for judging whether the previous round of node is a question or not, the node is not provided with a keyword rule, an emotion analysis mechanism and a text similarity rule, in the next round of node, the question of the round is divided into three categories of yes, no and no positive answer questions through the question emotion analysis mechanism, corresponding answer matching is carried out on the clear yes and no categories, multiple rounds of question-answering tree and node positioning are carried out again on the non-positive answer questions, specifically, in the process of classifying the question types through the question emotion analysis mechanism, the vectorization is carried out on the question through a bert model, and the gbdt classifier is used for classifying the question into three categories of yes, no and no positive answer questions;
first, the basic framework of the task type question-answering robot is a plurality of question-answering trees, each tree can be understood as a scene or a major class, such as financial, fund, deposit, loan and the like, and each tree is configured with a plurality of primary key words to help locate the scene, as shown in the following table 1:
TABLE 1
Scene   | Primary keywords
Loan    | loan, borrow, borrow money
Deposit | deposit, cash, banknote, money
...     | ...
As shown in fig. 2, the first step of the question-answering robot is to locate the scene. If a question-answering record exists, the previous round's scene is inherited preferentially; if no keyword matches under that scene and no question node or text similarity rule hits, the scene is relocated from the new question; if no primary keyword matches at all, the general scene is entered;
for each scene or each node under the tree, the corresponding keyword rules and text similarity rules are configured except that the previous question-answering image is a next node for asking whether the client asks a problem or not, and a plurality of auxiliary keywords are additionally configured in the keyword rules of the nodes besides the main keywords used for locating the scene, wherein the auxiliary keywords are as shown in the following table 2:
TABLE 2
Scene   | Keywords (primary and secondary)
Loan    | loan, house loan, car loan, transact
Deposit | deposit, how to transact
...     | ...
As Table 2 shows, the same words, such as "transact" or "how", appear in multiple scenes; after a specific scene is located, the specific node keywords or keyword combinations are matched, as shown in Table 3 below:
TABLE 3
The invention adopts a multi-round question-answering tree mechanism: the scene is located through the primary keywords, and the node within the scene is then located with the assistance of the secondary keywords. This keeps the robot's logic clear and easy to configure, largely avoids rule conflicts, and reduces the difficulty of rule configuration;
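The two-level lookup can be sketched as a small dictionary structure: primary keywords locate the scene (tree), then secondary keywords locate a node inside it, with the previous round's scene tried first. The scene table, node names and keyword lists below are illustrative assumptions, not the patent's actual configuration:

```python
# Scene trees: primary keywords locate the tree, per-node secondary
# keywords locate the node within it.
SCENES = {
    "loan":    {"primary": {"loan", "borrow"},
                "nodes": {"apply": {"house loan", "car loan", "transact"},
                          "rates": {"interest", "rate"}}},
    "deposit": {"primary": {"deposit", "cash", "banknote"},
                "nodes": {"open": {"how", "transact"}}},
}

def locate(question: str, last_scene=None):
    q = question.lower()

    def node_in(scene):
        for node, keys in SCENES[scene]["nodes"].items():
            if any(k in q for k in keys):
                return node
        return None

    # 1. Prefer the scene inherited from the previous round.
    if last_scene in SCENES:
        node = node_in(last_scene)
        if node:
            return last_scene, node
    # 2. Otherwise relocate via primary keywords, then node keywords.
    for scene, cfg in SCENES.items():
        if any(k in q for k in cfg["primary"]):
            return scene, node_in(scene)
    return None, None   # fall through to the general scene
```

Because "transact" and "how" live only under node keywords, the same word can behave differently in different scenes, which is the point of separating the two keyword levels.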
secondly, for the nodes within a scene that ask whether the client has a question, classification is performed by the question emotion analysis mechanism; the rules are shown in Table 4 below:
TABLE 4
Specifically, the nodes configured with the keyword rules are synchronously configured with text similarity rules, the priority of the text similarity rules is lower than that of the keyword rules, namely when the keyword rules do not hit any node, the corresponding node is hit through the text similarity rules, and the multi-round question-answering tree is reversely positioned;
among the nodes of each scene, every node configured with keyword rules, apart from the question-asking nodes, can also be synchronously configured with sample texts, so the node can be located through text similarity; this remedies the relative rigidity of keyword rules, as shown in Table 5 below:
TABLE 5
The text similarity rule only needs to consider nodes, not scenes: the scene is located after the node is hit, the reverse of the keyword rules' order.
Specifically, in the process of locating nodes and multiple rounds of question-answering trees through a text similarity rule, firstly vectorizing question sentences through a bert model, and then carrying out similarity calculation and sorting by using a cosine similarity model, wherein the node with the highest similarity and higher than a similarity threshold is used as a hit node;
the cosine similarity model is formulated as follows:
$$\cos(\theta) = \frac{A \cdot B}{\|A\|\,\|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2}\,\sqrt{\sum_{i=1}^{n} B_i^2}}$$
where A and B are the two n-dimensional vectors whose similarity is computed;
to determine whether two texts match, it is actually calculated whether the similarity of word vectors expressing the semantics of the two texts is close.
In the scene, a cosine similarity model is the most suitable and most widely used method, and the principle is that the cosine value of the included angle of two vectors in a vector space is used as a measure for measuring the individual difference of the two vectors, so that the similarity of the two vectors in the dimension direction can be obtained through the model, and the method can be widely applied.
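The formula and the threshold-based node selection described above can be rendered directly in code. The function names and the 0.8 threshold are illustrative assumptions; in the real pipeline the vectors would come from the fine-tuned bert model:

```python
import math

def cosine_similarity(a, b):
    """cos(theta) = (A . B) / (|A| * |B|) for two n-dimensional vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def best_node(query_vec, node_vecs, threshold=0.8):
    """Return the node with the highest similarity at or above the threshold,
    or None if no node clears it (so keyword rules keep the final say)."""
    scored = {n: cosine_similarity(query_vec, v) for n, v in node_vecs.items()}
    node, score = max(scored.items(), key=lambda kv: kv[1])
    return node if score >= threshold else None
```

Ranking then reduces to sorting `scored` by value; only the top node above the threshold is treated as a hit, matching the rule in the text.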
Specifically, the gbdt classifier generates a weak classifier in each of multiple iterations; each classifier is trained on the residuals of the previous one, and the final gbdt classifier is the weighted sum of the weak classifiers obtained in each round. The weak classifiers are generally required to be simple enough, with low variance and high bias; training then improves the precision of the final classifier by progressively reducing the bias. The weak classifier chosen is the classification regression tree, and because of the low-variance, high-bias requirement, each classification regression tree is kept shallow;
the formula of the residual is expressed as follows:
wherein the data (x i ,r im ) I=1, 2, ··, N is used as training data for the next round of classification regression tree, obtaining a new classification regression tree, wherein the corresponding leaf node area is R jm J=1, 2, the contents of (J), J is the number of leaf nodes;
the formula of the gbdt classifier is expressed as follows:
wherein f 0 (x) Gamma, the initial weak classifier jm For best fit value calculated for leaf area, R jm For the area of the leaf node, m represents the iteration number, i.e. the number of weak learners generated, L is the loss function, c is the initial random given constant, I (x ε R jm ) Indicating that it is determined whether x belongs to a leaf, to return 1, and not to return 0.
A negative gradient, i.e. a residual, is computed for each sample with the residual formula; the residual from the previous round is taken as each sample's target value, a best fit value is computed for each leaf region of the new classification regression tree, and the strong classifier is then updated with $f_m(x) = f_{m-1}(x) + \sum_j \gamma_{jm} I(x \in R_{jm})$; iterating this update yields the final gbdt classifier.
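The boosting loop above can be illustrated with a deliberately minimal sketch: 1-D data, squared loss (so the negative gradient is simply y - f(x)), and depth-1 regression stumps as the shallow, high-bias weak learners. All names here are invented, and this fits a regression target rather than the patent's three-class question classifier; it shows only the residual-fitting mechanics:

```python
def fit_stump(xs, residuals):
    """Find the split that best fits the residuals with two leaf means
    (the gamma values of the stump's two leaf regions)."""
    best = None
    for split in xs:
        left = [r for x, r in zip(xs, residuals) if x <= split]
        right = [r for x, r in zip(xs, residuals) if x > split]
        gl = sum(left) / len(left) if left else 0.0
        gr = sum(right) / len(right) if right else 0.0
        err = sum((r - (gl if x <= split else gr)) ** 2
                  for x, r in zip(xs, residuals))
        if best is None or err < best[0]:
            best = (err, split, gl, gr)
    _, split, gl, gr = best
    return lambda x: gl if x <= split else gr

def gbdt_predict(f0, trees, x):
    """f_M(x) = f_0 + sum of all stump contributions."""
    return f0 + sum(t(x) for t in trees)

def gbdt_fit(xs, ys, m_rounds=5):
    f0 = sum(ys) / len(ys)            # initial constant classifier c
    trees = []
    for _ in range(m_rounds):
        preds = [gbdt_predict(f0, trees, x) for x in xs]
        residuals = [y - p for y, p in zip(ys, preds)]  # negative gradient
        trees.append(fit_stump(xs, residuals))          # fit the residuals
    return f0, trees
```

Each round trains its stump on the previous round's residuals, exactly the update $f_m = f_{m-1} + \sum_j \gamma_{jm} I(x \in R_{jm})$ described in the text; a real gbdt generalizes this to arbitrary losses and multi-class outputs.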
According to the invention, a text similarity rule and a question emotion mechanism are added alongside the keyword rules, enabling text vectorization, sentiment tendency analysis and text similarity calculation, so the question-answering robot is not as rigid as a purely rule-based robot and can hit nodes more accurately.
Specifically, the nodes are provided with a slot configuration mechanism: the question reply of each node contains a dynamically configured placeholder whose specific content is matched according to the particular multi-round question-answering tree. After a node is matched by any of the above means, its dynamic slot is filled; the configurator can configure different content for different deployment environments, such as the address of a given place or the names of relevant personnel in the reply, allowing flexible replies for different environments and scenes;
for example, if a client locates a node such as "find client manager" at 2 different sites, the configuration of this node may be "may go to the hall to find xx manager for further consultation", where "xx" this slot may be "Zhang Sano" at site A and "Liqu" at site B, so that the question-answering robot may be flexibly used in various environments.
The foregoing is only a preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art, who is within the scope of the present invention, should make equivalent substitutions or modifications according to the technical scheme of the present invention and the inventive concept thereof, and should be covered by the scope of the present invention.

Claims (5)

1. The task type question-answering robot based on scene and keyword rules is characterized by comprising a multi-round question-answering tree, keyword rules, text similarity rules, a question emotion analysis mechanism and a slot configuration mechanism, wherein the text similarity rules comprise a bert model and a cosine similarity model, the question emotion analysis mechanism comprises a gbdt classifier, and the bert model is subjected to fine tuning through a plurality of scene text data so as to be suitable for the cosine similarity model and the gbdt classifier;
the multi-round question-answering tree comprises a plurality of nodes, wherein the nodes are configured with corresponding keyword rules or emotion analysis mechanisms, the keyword rules comprise main keywords for positioning the multi-round question-answering tree and auxiliary keywords for positioning the nodes, question sentences are input to the question-answering robot, the nodes in the multi-round question-answering tree are positioned through the keyword rules, and corresponding question replies are matched;
the node further comprises a node for judging whether the previous round of nodes are questions or not, wherein keyword rules, emotion analysis mechanisms and text similarity rules are not configured in the node, in the next round of nodes, question sentences of the current round are divided into three categories of yes, no and no positive answer questions through the question emotion analysis mechanisms, corresponding answer matching is carried out on clear yes and no categories, and multiple rounds of question-answer tree and node positioning are carried out again on the categories of no positive answer questions;
the nodes configured with the keyword rules are synchronously configured with text similarity rules, the priority of the text similarity rules is lower than that of the keyword rules, namely when any node is not hit by the keyword rules, the corresponding node is hit by the text similarity rules, and the multi-round question-answering tree is reversely positioned;
in the process of classifying question types through a question emotion analysis mechanism, a bert model is used for vectorizing the questions, and then a gbdt classifier is used for classifying the questions into three categories of yes, no and no positive answer questions.
2. The task-type question-answering robot based on scene and keyword rules according to claim 1, wherein, when locating nodes and multi-round question-answering trees through the text similarity rules, the question is vectorized by the BERT model, similarity computation and ranking are then performed by the cosine similarity model, and the node with the highest similarity that also exceeds a similarity threshold is taken as the hit node;
the cosine similarity model is expressed as follows:

$$\cos(A, B) = \frac{A \cdot B}{\|A\|\,\|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2}\,\sqrt{\sum_{i=1}^{n} B_i^2}}$$

where A and B are the two n-dimensional vectors whose similarity is computed.
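The similarity matching of claim 2 follows directly from the cosine formula. In the sketch below, the `hit_node` helper and the 0.8 default threshold are illustrative assumptions, not values from the patent:

```python
import math

def cosine_similarity(a, b):
    """cos(A, B) = A.B / (|A| |B|) for two n-dimensional vectors A and B."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def hit_node(question_vec, node_vecs, threshold=0.8):
    """Rank nodes by similarity; return the best one only if it clears the threshold."""
    scored = sorted(((cosine_similarity(question_vec, v), name)
                     for name, v in node_vecs.items()), reverse=True)
    best_score, best_name = scored[0]
    return best_name if best_score >= threshold else None
```

The `None` return corresponds to the miss case, where no node is similar enough to count as a hit.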
3. The task-type question-answering robot based on scene and keyword rules according to claim 1, wherein the GBDT classifier generates one weak classifier per iteration over multiple iterations, each classifier being trained on the residuals of the previous one, and the final GBDT classifier is obtained as a weighted sum of the weak classifiers produced in each round;
the weak classifier is a classification and regression tree, and the residual is expressed as follows:

$$r_{mi} = -\left[\frac{\partial L\bigl(y_i, f(x_i)\bigr)}{\partial f(x_i)}\right]_{f(x) = f_{m-1}(x)}$$

where the residuals $r_{mi}$ serve as the training data of the next round's classification regression tree; fitting them yields a new tree whose leaf node regions are $R_{mj}$, $j = 1, 2, \dots, J$, with $J$ the number of leaf nodes;
the formula of the GBDT classifier is expressed as follows:

$$f_M(x) = f_0(x) + \sum_{m=1}^{M} \sum_{j=1}^{J} c_{mj}\, I\bigl(x \in R_{mj}\bigr)$$

where $f_0(x) = \arg\min_c \sum_{i=1}^{N} L(y_i, c)$ is the initial weak classifier, $c_{mj}$ is the best fitting value computed for the leaf region, $R_{mj}$ is the leaf node region, $m$ indexes the iterations (i.e. the weak classifiers generated, $M$ in total), $L$ is the loss function, $c$ is an initially given constant, and $I(x \in R_{mj})$ indicates whether $x$ belongs to the leaf, returning 1 if it does and 0 otherwise.
4. The task-type question-answering robot based on scene and keyword rules according to claim 1, wherein the BERT model extracts features from text data, namely embedding vectors of words and sentences, which serve as the input features of the cosine similarity model or the GBDT classifier.
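A stand-in illustration of claim 4's feature-extraction idea: an encoder turns text into word and sentence embedding vectors that downstream models consume as features. The deterministic character-hash encoder below merely replaces BERT for the sake of a runnable sketch, so the vectors carry no real semantics; `DIM`, the mean-pooling choice, and all names are assumptions:

```python
DIM = 8  # illustrative embedding width; BERT-base would use 768

def word_embedding(word: str):
    # Deterministic pseudo-embedding derived from the word's characters,
    # padded/truncated to DIM positions. A placeholder for real BERT features.
    padded = word.ljust(DIM)[:DIM]
    return [(ord(c) * (i + 1)) % 7 / 7.0 for i, c in enumerate(padded)]

def sentence_embedding(sentence: str):
    # Mean pooling over word vectors yields a fixed-size sentence vector,
    # the kind of feature the claim feeds to the similarity model or classifier.
    vecs = [word_embedding(w) for w in sentence.split()]
    return [sum(col) / len(vecs) for col in zip(*vecs)]
```

The point of the sketch is only the data flow: word vectors, then a pooled sentence vector, then downstream consumption.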
5. The task-type question-answering robot based on scene and keyword rules according to claim 1, wherein the nodes are provided with a slot configuration mechanism, that is, dynamically configured placeholders are embedded in each node's question reply, and the specific content filled in depends on which multi-round question-answering tree is active.
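A minimal sketch of the slot configuration mechanism of claim 5, assuming replies are format strings whose placeholder values are looked up per tree; the template, tree names, and slot values are all invented for illustration:

```python
# A node's reply holds dynamically configured placeholders; which values fill
# them depends on the active multi-round question-answering tree.
REPLY_TEMPLATE = "Your {product} order will arrive in {days} days."

SLOTS_BY_TREE = {           # illustrative per-tree slot configurations
    "logistics": {"product": "phone", "days": "3"},
    "returns":   {"product": "phone", "days": "7"},
}

def fill_reply(tree_name: str) -> str:
    """Render the node reply with the slot values configured for the given tree."""
    return REPLY_TEMPLATE.format(**SLOTS_BY_TREE[tree_name])
```

The same node template thus produces different concrete replies depending on the tree, which is the substitution behavior the claim describes.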
CN202110995597.8A 2021-08-27 2021-08-27 Task type question-answering robot based on scene and keyword rules Active CN113762451B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110995597.8A CN113762451B (en) 2021-08-27 2021-08-27 Task type question-answering robot based on scene and keyword rules


Publications (2)

Publication Number Publication Date
CN113762451A CN113762451A (en) 2021-12-07
CN113762451B true CN113762451B (en) 2024-02-27

Family

ID=78791550


Country Status (1)

Country Link
CN (1) CN113762451B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018030672A1 (en) * 2016-08-09 2018-02-15 주식회사 피노텍 Robot automation consultation method and system for consulting with customer according to predetermined scenario by using machine learning
CN110033851A (en) * 2019-04-02 2019-07-19 腾讯科技(深圳)有限公司 Information recommendation method, device, storage medium and server
CN111242431A (en) * 2019-12-31 2020-06-05 联想(北京)有限公司 Information processing method and device, and method and device for constructing customer service conversation workflow
CN112948553A (en) * 2021-02-26 2021-06-11 平安国际智慧城市科技股份有限公司 Legal intelligent question and answer method and device, electronic equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10679011B2 (en) * 2017-05-10 2020-06-09 Oracle International Corporation Enabling chatbots by detecting and supporting argumentation
US11782910B2 (en) * 2019-11-15 2023-10-10 Samsung Electronics Co., Ltd. System and method for dynamic inference collaboration




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 310000 2-206, 1399 liangmu Road, Cangqian street, Yuhang District, Hangzhou City, Zhejiang Province

Applicant after: Kangxu Technology Co.,Ltd.

Address before: 310000 2-206, 1399 liangmu Road, Cangqian street, Yuhang District, Hangzhou City, Zhejiang Province

Applicant before: Zhejiang kangxu Technology Co.,Ltd.

GR01 Patent grant