CN113762451B - Task type question-answering robot based on scene and keyword rules - Google Patents

Task type question-answering robot based on scene and keyword rules

Info

Publication number
CN113762451B
CN113762451B
Authority
CN
China
Prior art keywords
question
rules
answering
nodes
tree
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202110995597.8A
Other languages
Chinese (zh)
Other versions
CN113762451A (en)
Inventor
陈再蝶
朱晓秋
周杰
樊伟东
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kangxu Technology Co ltd
Original Assignee
Kangxu Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kangxu Technology Co ltd filed Critical Kangxu Technology Co ltd
Priority to CN202110995597.8A
Publication of CN113762451A
Application granted
Publication of CN113762451B
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/004 Artificial life, i.e. computing arrangements simulating life
    • G06N3/008 Artificial life, i.e. computing arrangements simulating life based on physical entities controlled by simulated intelligence so as to replicate intelligent life forms, e.g. based on robots replicating pets or humans in their appearance or behaviour
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/243 Classification techniques relating to the number of classes
    • G06F18/24323 Tree-organised classifiers
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N5/00 Computing arrangements using knowledge-based models
    • G06N5/02 Knowledge representation; Symbolic representation
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Evolutionary Biology (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Robotics (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Molecular Biology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a task-type question-answering robot based on scene and keyword rules. It comprises a multi-round question-answering tree, keyword rules, text similarity rules, a question emotion analysis mechanism and a slot configuration mechanism; the text similarity rules comprise a bert model and a cosine similarity model, and the question emotion analysis mechanism comprises a gbdt classifier. Different scenes are distinguished through primary keywords: each scene is an independent tree, and each tree is configured with its own primary and secondary keywords. Beyond the keyword combination rules at each node of each tree, algorithmic components such as text vectorization, text similarity and emotion analysis are added through the bert model, the cosine similarity model and the gbdt classifier, so the rule-based task question-answering robot need not rely entirely on rule configuration, overcoming the rigidity of traditional rule-based robots.

Description

Task type question-answering robot based on scene and keyword rules
Technical Field
The invention relates to the technical field of task type robots, in particular to a task type question-answering robot based on scene and keyword rules.
Background
A task robot is a robot that provides information or services under specific conditions. It typically serves users with a definite purpose in task scenes such as checking data usage, checking phone bills, ordering meals, booking tickets and consultation. Because user demands are complex, interaction generally takes multiple rounds, during which users may keep modifying and refining their demands; the task robot must help users clarify their purpose by querying, clarifying and confirming. Two implementation methods therefore exist in the industry: rule-based implementations and end-to-end implementations.
An end-to-end multi-round dialogue task robot tries to learn an overall mapping from natural language input on the user side to natural language output on the machine side, which improves the flexibility and extensibility of the system; however, such models demand very high data quality and quantity and lack interpretability, so the industry today mostly adopts rule-based implementations.
Rule-based multi-round dialogue task robots come in two forms. One is based on regular-expression matching, which is too strict on the questioner and relatively rigid.
The other resembles a commercial dialogue system: the input text is mapped to a semantic frame composed of several semantic slots, and the matching rule of one semantic slot consists of several slot value types and connectives, together expressing a complete piece of information. The disadvantages of this approach are: (1) rule development is error-prone; (2) adjusting rules requires multiple iterations; (3) rule conflicts are difficult to maintain; (4) complete dependence on rules prevents flexible, lively understanding of the user.
The task-type question-answering robot based on scene and keyword rules is therefore proposed, overcoming the defects of traditional rule-based question-answering robots: (1) relatively rigid handling of user questions; (2) rather limited application scenarios; (3) error-prone rule configuration that easily produces rule conflicts; (4) low accuracy of rule-based positioning.
Disclosure of Invention
In order to solve the technical problems mentioned in the background art, a task type question-answering robot based on scene and keyword rules is provided.
In order to achieve the above purpose, the present invention adopts the following technical scheme:
the task type question-answering robot based on scene and keyword rules comprises a multi-round question-answering tree, keyword rules, text similarity rules, a question emotion analysis mechanism and a slot configuration mechanism, wherein the text similarity rules comprise a bert model and a cosine similarity model, the question emotion analysis mechanism comprises a gbdt classifier, and the bert model is subjected to fine tuning through a plurality of scene text data so as to be suitable for the cosine similarity model and the gbdt classifier;
the multi-round question-answering tree comprises a plurality of nodes, wherein the nodes are configured with corresponding keyword rules or emotion analysis mechanisms, the keyword rules comprise main keywords for positioning the multi-round question-answering tree and auxiliary keywords for positioning the nodes, question sentences are input to the question-answering robot, the nodes in the multi-round question-answering tree are positioned through the keyword rules, and corresponding question replies are matched;
the node also comprises a node for judging whether the previous round of node is a question or not, the node is not configured with a keyword rule, an emotion analysis mechanism and a text similarity rule, in the next round of node, the question sentence of the current round is divided into three types of yes, no and no positive answer questions through the question emotion analysis mechanism, corresponding answer matching is carried out for the clear yes and no categories, and multiple rounds of question-answering tree and node positioning are carried out again for the category without positive answer questions.
As a further description of the above technical solution:
the nodes configured with the keyword rules are synchronously configured with text similarity rules, the priority of the text similarity rules is lower than that of the keyword rules, namely when any node is not hit by the keyword rules, the corresponding node is hit by the text similarity rules, and the multi-round question-answering tree is reversely positioned.
As a further description of the above technical solution:
in the process of locating nodes and multiple rounds of question-answering trees through a text similarity rule, vectorizing question sentences through a bert model, and then carrying out similarity calculation and sequencing by using a cosine similarity model, wherein the node with the highest similarity and higher than a similarity threshold is used as a hit node;
the cosine similarity model is expressed as follows:
wherein A and B are two n-dimensional vectors that compute similarity.
As a further description of the above technical solution:
in the process of classifying question types through a question emotion analysis mechanism, a bert model is used for vectorizing the questions, and then a gbdt classifier is used for classifying the questions into three categories of yes, no and no positive answer questions.
As a further description of the above technical solution:
the gbdt classifier generates a weak classifier through multiple iterations, each classifier is trained on the residual error basis of the previous classifier, and the final gbdt classifier is obtained by weighting and summing the weak classifiers obtained through each training;
the weak classifier selects a classification regression tree, and the formula of the residual error is expressed as follows:
wherein the data (x i ,r im ) I=1, 2, ··, N is used as training data for the next round of classification regression tree, obtaining a new classification regression tree, wherein the corresponding leaf node area is R jm J=1, 2, the contents of (J), J is the number of leaf nodes;
the formula of the gbdt classifier is expressed as follows:
wherein f 0 (x) Gamma, the initial weak classifier jm For best fit value calculated for leaf area, R jm For the area of the leaf node, m represents the number of iterations, i.e. generationL is a loss function, c is an initial random given constant, I (x ε R) jm ) Indicating that it is determined whether x belongs to a leaf, to return 1, and not to return 0.
As a further description of the above technical solution:
the bert model extracts features, i.e., embedded vectors of words and sentences, from the text data, which are used as cosine similarity models or input features for the gbdt classifier.
As a further description of the above technical solution:
the nodes are provided with a slot configuration mechanism, namely, a dynamically configured substitute is arranged in the question reply of each node, and specific contents are matched according to different multi-round question-answering trees.
In summary, due to the adoption of the technical scheme, the beneficial effects of the invention are as follows:
through the multi-round question-answering tree, different scenes are distinguished through the primary key words while the flow of the question-answering robot is normalized, each scene is an independent tree, each tree is configured with independent auxiliary key words and primary key words, the same words can show different rules in different scenes, under the design, the configuration of the rule robot is more normalized, the contradiction condition of rule conflict in the traditional rule robot is avoided to a great extent due to the fact that the scenes are distinguished, the rule configuration is simpler and more efficient and the responsibility is clear, in each node of each tree, algorithm contents such as text vectorization, text similarity and emotion analysis are added through a bert model, a cosine similarity model and a gbdt classifier, the task-type question-answering robot based on rules is not required to be completely dependent on the rule configuration, the defect that the traditional rule robot is inflexible relative to a dead plate is overcome, finally, the self-immobilized in the nodes can be returned to different environments through the different configuration, and the different answer conditions can be returned to the scene.
Drawings
Fig. 1 shows a schematic question-answering flow diagram of a task-type question-answering robot based on scene and keyword rules, provided according to an embodiment of the present invention;
fig. 2 shows a schematic view of a scene positioning flow of a task type question-answering robot based on scene and keyword rules according to an embodiment of the present invention;
fig. 3 shows a schematic structural diagram of a bert model of a task type question-answering robot based on scene and keyword rules according to an embodiment of the present invention;
fig. 4 shows a schematic sentence input diagram of a bert model of a task type question-answering robot based on scene and keyword rules according to an embodiment of the present invention.
Detailed Description
The following description of the embodiments of the present invention will be made clearly and completely with reference to the accompanying drawings, in which it is apparent that the embodiments described are only some embodiments of the present invention, but not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Example 1
Referring to figs. 1-4, the present invention provides a technical solution: the task type question-answering robot based on scene and keyword rules comprises a multi-round question-answering tree, keyword rules, text similarity rules, a question emotion analysis mechanism and a slot configuration mechanism. The text similarity rules comprise a bert model and a cosine similarity model, and the question emotion analysis mechanism comprises a gbdt classifier; the bert model is fine-tuned on text data from a plurality of scenes so as to suit the cosine similarity model and the gbdt classifier. The bert model extracts features from text data, namely embedded vectors of words and sentences, and these vectors serve as input features for the cosine similarity model or the gbdt classifier;
specifically, the bert model is a method for pre-training language representation, which can be used for extracting high-quality language features from text data, and can also be used for fine tuning the models by using own data to complete specific tasks (classification, entity identification, question answer and the like), so as to generate the most advanced prediction;
we use the bert model to extract features, i.e. embedded vectors of words and sentences, from the text data; these serve as high-quality feature inputs for downstream models. NLP models (e.g., LSTMs or CNNs) require input in the form of numeric vectors, which typically means converting the vocabulary and certain linguistic features into numeric representations. In the past, words were represented as unique index values (one-hot encoding), or, more usefully, as neural word embeddings, where vocabulary words are matched to fixed-length feature embeddings produced by models such as Word2Vec or fastText. The bert model offers an advantage over models such as Word2Vec: whereas Word2Vec gives each word a fixed representation regardless of the context in which it appears, the word representations generated by the bert model are dynamically informed by the words surrounding the word;
for example, given two sentences: "The man was accused of robbing a bank" and "The man went fishing by the bank of the river," Word2Vec will generate the same Word embeddings for the Word "bank" in both sentences, and different Word embeddings for "bank" in the bert model. In addition to capturing obvious differences such as word ambiguity, the contextually relevant words empeddings capture other forms of information that can yield more accurate feature representations, thereby improving model performance.
In principle: the bert model is a deep bidirectional pre-trained language understanding model that uses a Transformer as its feature extractor. It is a pre-training model, a language representation model trained on a large amount of data with a bidirectional Transformer. This general model is applied to downstream tasks, including classification, regression, machine translation and question-answering systems, by fine-tuning; the downstream model applied in this example is the cosine similarity model or the gbdt classifier.
The structure of the bert model is shown in figs. 3 and 4. In the Transformer Encoder, the attention computation at each moment can see the inputs of all moments. Compared with other models, the input of the bert model is the sum of three embeddings, Token Embeddings, Segment Embeddings and Position Embeddings, which serves the pre-training objective that includes predicting the next sentence;
the Input of the bert model is two sentences: "my dog is cure", "helike playing". First, a special Token [ CLS ] is added to the beginning of the first sentence to mark the beginning of the sentence, the [ SEP ] is used to mark the end of the sentence, then 3 references are made to each Token, the Embedding (Token Embeddings) of the word, the Embedding (Position Embeddings) position, and the Embedding (Segment Embeddings) sentence, and finally the three references are input to the next layer in a summation mode.
The multi-round question-answering tree comprises a plurality of nodes, the nodes are provided with corresponding keyword rules or emotion analysis mechanisms, the keyword rules comprise main keywords for positioning the multi-round question-answering tree and auxiliary keywords for positioning the nodes, question sentences are input to the question-answering robot, the nodes in the multi-round question-answering tree are positioned through the keyword rules, and corresponding question replies are matched;
the node further comprises a node for judging whether the previous round of node is a question or not, the node is not provided with a keyword rule, an emotion analysis mechanism and a text similarity rule, in the next round of node, the question of the round is divided into three categories of yes, no and no positive answer questions through the question emotion analysis mechanism, corresponding answer matching is carried out on the clear yes and no categories, multiple rounds of question-answering tree and node positioning are carried out again on the non-positive answer questions, specifically, in the process of classifying the question types through the question emotion analysis mechanism, the vectorization is carried out on the question through a bert model, and the gbdt classifier is used for classifying the question into three categories of yes, no and no positive answer questions;
first, the basic framework of the task type question-answering robot is a plurality of question-answering trees, each tree can be understood as a scene or a major class, such as financial, fund, deposit, loan and the like, and each tree is configured with a plurality of primary key words to help locate the scene, as shown in the following table 1:
TABLE 1
Scene   | Primary keywords
Loan    | loan, borrow, borrow money
Deposit | deposit, cash, banknote, money
...     | ...
As shown in fig. 2, the first step of the question-answering robot is to locate the scene. If a question-answering record exists, the previous round's scene is inherited preferentially; if no keyword matches under that scene and no question node or text similarity rule hits, the scene is relocated from the new question; if no primary keyword matches at all, the general scene is entered;
for each scene or each node under the tree, the corresponding keyword rules and text similarity rules are configured except that the previous question-answering image is a next node for asking whether the client asks a problem or not, and a plurality of auxiliary keywords are additionally configured in the keyword rules of the nodes besides the main keywords used for locating the scene, wherein the auxiliary keywords are as shown in the following table 2:
TABLE 2
Scene   | Keywords (primary and secondary)
Loan    | loan, house loan, car loan, transact
Deposit | deposit, how to transact
...     | ...
As Table 2 shows, the same words, such as "transact" or "how", appear in multiple scenes; after a specific scene is located, the specific node keywords or keyword combinations are matched, as shown in Table 3 below:
TABLE 3
The invention adopts a multi-round question-answering tree mechanism: the scene is located through the primary keywords, and the node within the scene is then located with the assistance of the secondary keywords. This keeps the robot's logic clear and easy to configure, largely avoids rule conflicts, and reduces the difficulty of rule configuration;
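The two-level lookup can be sketched as a small dictionary structure: primary keywords locate the scene (tree), then secondary keywords locate a node inside it, with the previous round's scene tried first. The scene table, node names and keyword lists below are illustrative assumptions, not the patent's actual configuration:

```python
# Scene trees: primary keywords locate the tree, per-node secondary
# keywords locate the node within it.
SCENES = {
    "loan":    {"primary": {"loan", "borrow"},
                "nodes": {"apply": {"house loan", "car loan", "transact"},
                          "rates": {"interest", "rate"}}},
    "deposit": {"primary": {"deposit", "cash", "banknote"},
                "nodes": {"open": {"how", "transact"}}},
}

def locate(question: str, last_scene=None):
    q = question.lower()

    def node_in(scene):
        for node, keys in SCENES[scene]["nodes"].items():
            if any(k in q for k in keys):
                return node
        return None

    # 1. Prefer the scene inherited from the previous round.
    if last_scene in SCENES:
        node = node_in(last_scene)
        if node:
            return last_scene, node
    # 2. Otherwise relocate via primary keywords, then node keywords.
    for scene, cfg in SCENES.items():
        if any(k in q for k in cfg["primary"]):
            return scene, node_in(scene)
    return None, None   # fall through to the general scene
```

Because "transact" and "how" live only under node keywords, the same word can behave differently in different scenes, which is the point of separating the two keyword levels.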
secondly, for the nodes within a scene that ask whether the client has a question, classification is performed by the question emotion analysis mechanism; the rules are shown in Table 4 below:
TABLE 4
Specifically, the nodes configured with the keyword rules are synchronously configured with text similarity rules, the priority of the text similarity rules is lower than that of the keyword rules, namely when the keyword rules do not hit any node, the corresponding node is hit through the text similarity rules, and the multi-round question-answering tree is reversely positioned;
among the nodes of each scene, every node configured with keyword rules, apart from the question-asking nodes, can also be synchronously configured with sample texts, so the node can be located through text similarity; this remedies the relative rigidity of keyword rules, as shown in Table 5 below:
TABLE 5
The text similarity rule only needs to consider nodes, not scenes: the scene is located after the node is hit, the reverse of the keyword rules' order.
Specifically, in the process of locating nodes and multiple rounds of question-answering trees through a text similarity rule, firstly vectorizing question sentences through a bert model, and then carrying out similarity calculation and sorting by using a cosine similarity model, wherein the node with the highest similarity and higher than a similarity threshold is used as a hit node;
the cosine similarity model is formulated as follows:
$$\cos(\theta) = \frac{A \cdot B}{\|A\|\,\|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2}\,\sqrt{\sum_{i=1}^{n} B_i^2}}$$
where A and B are the two n-dimensional vectors whose similarity is computed;
to determine whether two texts match, it is actually calculated whether the similarity of word vectors expressing the semantics of the two texts is close.
In the scene, a cosine similarity model is the most suitable and most widely used method, and the principle is that the cosine value of the included angle of two vectors in a vector space is used as a measure for measuring the individual difference of the two vectors, so that the similarity of the two vectors in the dimension direction can be obtained through the model, and the method can be widely applied.
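The formula and the threshold-based node selection described above can be rendered directly in code. The function names and the 0.8 threshold are illustrative assumptions; in the real pipeline the vectors would come from the fine-tuned bert model:

```python
import math

def cosine_similarity(a, b):
    """cos(theta) = (A . B) / (|A| * |B|) for two n-dimensional vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm

def best_node(query_vec, node_vecs, threshold=0.8):
    """Return the node with the highest similarity at or above the threshold,
    or None if no node clears it (so keyword rules keep the final say)."""
    scored = {n: cosine_similarity(query_vec, v) for n, v in node_vecs.items()}
    node, score = max(scored.items(), key=lambda kv: kv[1])
    return node if score >= threshold else None
```

Ranking then reduces to sorting `scored` by value; only the top node above the threshold is treated as a hit, matching the rule in the text.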
Specifically, the gbdt classifier generates a weak classifier in each of multiple iterations; each classifier is trained on the residuals of the previous one, and the final gbdt classifier is the weighted sum of the weak classifiers obtained in each round. The weak classifiers are generally required to be simple enough, with low variance and high bias; training then improves the precision of the final classifier by progressively reducing the bias. The weak classifier chosen is the classification regression tree, and because of the low-variance, high-bias requirement, each classification regression tree is kept shallow;
the formula of the residual is expressed as follows:
wherein the data (x i ,r im ) I=1, 2, ··, N is used as training data for the next round of classification regression tree, obtaining a new classification regression tree, wherein the corresponding leaf node area is R jm J=1, 2, the contents of (J), J is the number of leaf nodes;
the formula of the gbdt classifier is expressed as follows:
wherein f 0 (x) Gamma, the initial weak classifier jm For best fit value calculated for leaf area, R jm For the area of the leaf node, m represents the iteration number, i.e. the number of weak learners generated, L is the loss function, c is the initial random given constant, I (x ε R jm ) Indicating that it is determined whether x belongs to a leaf, to return 1, and not to return 0.
A negative gradient, i.e. a residual, is computed for each sample with the residual formula; the residual from the previous round is taken as each sample's target value, a best fit value is computed for each leaf region of the new classification regression tree, and the strong classifier is then updated with $f_m(x) = f_{m-1}(x) + \sum_j \gamma_{jm} I(x \in R_{jm})$; iterating this update yields the final gbdt classifier.
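The boosting loop above can be illustrated with a deliberately minimal sketch: 1-D data, squared loss (so the negative gradient is simply y - f(x)), and depth-1 regression stumps as the shallow, high-bias weak learners. All names here are invented, and this fits a regression target rather than the patent's three-class question classifier; it shows only the residual-fitting mechanics:

```python
def fit_stump(xs, residuals):
    """Find the split that best fits the residuals with two leaf means
    (the gamma values of the stump's two leaf regions)."""
    best = None
    for split in xs:
        left = [r for x, r in zip(xs, residuals) if x <= split]
        right = [r for x, r in zip(xs, residuals) if x > split]
        gl = sum(left) / len(left) if left else 0.0
        gr = sum(right) / len(right) if right else 0.0
        err = sum((r - (gl if x <= split else gr)) ** 2
                  for x, r in zip(xs, residuals))
        if best is None or err < best[0]:
            best = (err, split, gl, gr)
    _, split, gl, gr = best
    return lambda x: gl if x <= split else gr

def gbdt_predict(f0, trees, x):
    """f_M(x) = f_0 + sum of all stump contributions."""
    return f0 + sum(t(x) for t in trees)

def gbdt_fit(xs, ys, m_rounds=5):
    f0 = sum(ys) / len(ys)            # initial constant classifier c
    trees = []
    for _ in range(m_rounds):
        preds = [gbdt_predict(f0, trees, x) for x in xs]
        residuals = [y - p for y, p in zip(ys, preds)]  # negative gradient
        trees.append(fit_stump(xs, residuals))          # fit the residuals
    return f0, trees
```

Each round trains its stump on the previous round's residuals, exactly the update $f_m = f_{m-1} + \sum_j \gamma_{jm} I(x \in R_{jm})$ described in the text; a real gbdt generalizes this to arbitrary losses and multi-class outputs.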
According to the invention, a text similarity rule and a question emotion mechanism are added alongside the keyword rules, enabling text vectorization, sentiment tendency analysis and text similarity calculation, so the question-answering robot is not as rigid as a purely rule-based robot and can hit nodes more accurately.
Specifically, the nodes are provided with a slot configuration mechanism: the question reply of each node contains a dynamically configured placeholder whose specific content is matched according to the particular multi-round question-answering tree. After a node is matched by any of the above means, its dynamic slot is filled; the configurator can configure different content for different deployment environments, such as the address of a given place or the names of relevant personnel in the reply, allowing flexible replies for different environments and scenes;
for example, if a client locates a node such as "find client manager" at 2 different sites, the configuration of this node may be "may go to the hall to find xx manager for further consultation", where "xx" this slot may be "Zhang Sano" at site A and "Liqu" at site B, so that the question-answering robot may be flexibly used in various environments.
The foregoing is only a preferred embodiment of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art, who is within the scope of the present invention, should make equivalent substitutions or modifications according to the technical scheme of the present invention and the inventive concept thereof, and should be covered by the scope of the present invention.

Claims (5)

1. The task type question-answering robot based on scene and keyword rules is characterized by comprising a multi-round question-answering tree, keyword rules, text similarity rules, a question emotion analysis mechanism and a slot configuration mechanism, wherein the text similarity rules comprise a bert model and a cosine similarity model, the question emotion analysis mechanism comprises a gbdt classifier, and the bert model is subjected to fine tuning through a plurality of scene text data so as to be suitable for the cosine similarity model and the gbdt classifier;
the multi-round question-answering tree comprises a plurality of nodes, wherein the nodes are configured with corresponding keyword rules or emotion analysis mechanisms, the keyword rules comprise main keywords for positioning the multi-round question-answering tree and auxiliary keywords for positioning the nodes, question sentences are input to the question-answering robot, the nodes in the multi-round question-answering tree are positioned through the keyword rules, and corresponding question replies are matched;
the node further comprises a node for judging whether the previous round of nodes are questions or not, wherein keyword rules, emotion analysis mechanisms and text similarity rules are not configured in the node, in the next round of nodes, question sentences of the current round are divided into three categories of yes, no and no positive answer questions through the question emotion analysis mechanisms, corresponding answer matching is carried out on clear yes and no categories, and multiple rounds of question-answer tree and node positioning are carried out again on the categories of no positive answer questions;
the nodes configured with the keyword rules are synchronously configured with text similarity rules, the priority of the text similarity rules is lower than that of the keyword rules, namely when any node is not hit by the keyword rules, the corresponding node is hit by the text similarity rules, and the multi-round question-answering tree is reversely positioned;
in the process of classifying question types through a question emotion analysis mechanism, a bert model is used for vectorizing the questions, and then a gbdt classifier is used for classifying the questions into three categories of yes, no and no positive answer questions.
2. The task-type question-answering robot based on scene and keyword rules according to claim 1, wherein, when locating nodes and multi-round question-answering trees through the text similarity rules, the question is vectorized by the BERT model, similarity computation and ranking are then performed by the cosine similarity model, and the node with the highest similarity that also exceeds a similarity threshold is taken as the hit node;
the cosine similarity model is expressed as follows:

$$\cos(A, B) = \frac{A \cdot B}{\|A\|\,\|B\|} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2}\,\sqrt{\sum_{i=1}^{n} B_i^2}}$$

where A and B are the two n-dimensional vectors whose similarity is computed.
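The similarity matching of claim 2 follows directly from the cosine formula. In the sketch below, the `hit_node` helper and the 0.8 default threshold are illustrative assumptions, not values from the patent:

```python
import math

def cosine_similarity(a, b):
    """cos(A, B) = A.B / (|A| |B|) for two n-dimensional vectors A and B."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def hit_node(question_vec, node_vecs, threshold=0.8):
    """Rank nodes by similarity; return the best one only if it clears the threshold."""
    scored = sorted(((cosine_similarity(question_vec, v), name)
                     for name, v in node_vecs.items()), reverse=True)
    best_score, best_name = scored[0]
    return best_name if best_score >= threshold else None
```

The `None` return corresponds to the miss case, where no node is similar enough to count as a hit.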
3. The task-type question-answering robot based on scene and keyword rules according to claim 1, wherein the GBDT classifier generates one weak classifier per iteration over multiple iterations, each classifier being trained on the residuals of the previous one, and the final GBDT classifier is obtained as a weighted sum of the weak classifiers produced in each round;
the weak classifier is a classification and regression tree, and the residual is expressed as follows:

$$r_{mi} = -\left[\frac{\partial L\bigl(y_i, f(x_i)\bigr)}{\partial f(x_i)}\right]_{f(x) = f_{m-1}(x)}$$

where the residuals $r_{mi}$ serve as the training data of the next round's classification regression tree; fitting them yields a new tree whose leaf node regions are $R_{mj}$, $j = 1, 2, \dots, J$, with $J$ the number of leaf nodes;
the formula of the GBDT classifier is expressed as follows:

$$f_M(x) = f_0(x) + \sum_{m=1}^{M} \sum_{j=1}^{J} c_{mj}\, I\bigl(x \in R_{mj}\bigr)$$

where $f_0(x) = \arg\min_c \sum_{i=1}^{N} L(y_i, c)$ is the initial weak classifier, $c_{mj}$ is the best fitting value computed for the leaf region, $R_{mj}$ is the leaf node region, $m$ indexes the iterations (i.e. the weak classifiers generated, $M$ in total), $L$ is the loss function, $c$ is an initially given constant, and $I(x \in R_{mj})$ indicates whether $x$ belongs to the leaf, returning 1 if it does and 0 otherwise.
4. The task-type question-answering robot based on scene and keyword rules according to claim 1, wherein the BERT model extracts features from text data, namely embedding vectors of words and sentences, which serve as the input features of the cosine similarity model or the GBDT classifier.
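A stand-in illustration of claim 4's feature-extraction idea: an encoder turns text into word and sentence embedding vectors that downstream models consume as features. The deterministic character-hash encoder below merely replaces BERT for the sake of a runnable sketch, so the vectors carry no real semantics; `DIM`, the mean-pooling choice, and all names are assumptions:

```python
DIM = 8  # illustrative embedding width; BERT-base would use 768

def word_embedding(word: str):
    # Deterministic pseudo-embedding derived from the word's characters,
    # padded/truncated to DIM positions. A placeholder for real BERT features.
    padded = word.ljust(DIM)[:DIM]
    return [(ord(c) * (i + 1)) % 7 / 7.0 for i, c in enumerate(padded)]

def sentence_embedding(sentence: str):
    # Mean pooling over word vectors yields a fixed-size sentence vector,
    # the kind of feature the claim feeds to the similarity model or classifier.
    vecs = [word_embedding(w) for w in sentence.split()]
    return [sum(col) / len(vecs) for col in zip(*vecs)]
```

The point of the sketch is only the data flow: word vectors, then a pooled sentence vector, then downstream consumption.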
5. The task-type question-answering robot based on scene and keyword rules according to claim 1, wherein the nodes are provided with a slot configuration mechanism, that is, dynamically configured placeholders are embedded in each node's question reply, and the specific content filled in depends on which multi-round question-answering tree is active.
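A minimal sketch of the slot configuration mechanism of claim 5, assuming replies are format strings whose placeholder values are looked up per tree; the template, tree names, and slot values are all invented for illustration:

```python
# A node's reply holds dynamically configured placeholders; which values fill
# them depends on the active multi-round question-answering tree.
REPLY_TEMPLATE = "Your {product} order will arrive in {days} days."

SLOTS_BY_TREE = {           # illustrative per-tree slot configurations
    "logistics": {"product": "phone", "days": "3"},
    "returns":   {"product": "phone", "days": "7"},
}

def fill_reply(tree_name: str) -> str:
    """Render the node reply with the slot values configured for the given tree."""
    return REPLY_TEMPLATE.format(**SLOTS_BY_TREE[tree_name])
```

The same node template thus produces different concrete replies depending on the tree, which is the substitution behavior the claim describes.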
CN202110995597.8A 2021-08-27 2021-08-27 Task type question-answering robot based on scene and keyword rules Active CN113762451B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110995597.8A CN113762451B (en) 2021-08-27 2021-08-27 Task type question-answering robot based on scene and keyword rules


Publications (2)

Publication Number Publication Date
CN113762451A CN113762451A (en) 2021-12-07
CN113762451B true CN113762451B (en) 2024-02-27

Family

ID=78791550


Country Status (1)

Country Link
CN (1) CN113762451B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2018030672A1 (en) * 2016-08-09 2018-02-15 주식회사 피노텍 Robot automation consultation method and system for consulting with customer according to predetermined scenario by using machine learning
CN110033851A (en) * 2019-04-02 2019-07-19 腾讯科技(深圳)有限公司 Information recommendation method, device, storage medium and server
CN111242431A (en) * 2019-12-31 2020-06-05 联想(北京)有限公司 Information processing method and device, and method and device for constructing customer service conversation workflow
CN112948553A (en) * 2021-02-26 2021-06-11 平安国际智慧城市科技股份有限公司 Legal intelligent question and answer method and device, electronic equipment and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10679011B2 (en) * 2017-05-10 2020-06-09 Oracle International Corporation Enabling chatbots by detecting and supporting argumentation
US11782910B2 (en) * 2019-11-15 2023-10-10 Samsung Electronics Co., Ltd. System and method for dynamic inference collaboration




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: 310000 2-206, 1399 liangmu Road, Cangqian street, Yuhang District, Hangzhou City, Zhejiang Province

Applicant after: Kangxu Technology Co.,Ltd.

Address before: 310000 2-206, 1399 liangmu Road, Cangqian street, Yuhang District, Hangzhou City, Zhejiang Province

Applicant before: Zhejiang kangxu Technology Co.,Ltd.

GR01 Patent grant