CN112905749B - Task-based multi-turn dialogue method based on intention-slot value rule tree - Google Patents


Publication number: CN112905749B
Authority: CN (China)
Prior art keywords: node, slot, intention, value, slot value
Legal status: Active
Application number: CN202110267900.2A
Other languages: Chinese (zh)
Other versions: CN112905749A (en)
Inventors: 甘涛, 喜宇辉, 李春昂, 何艳敏
Current Assignee: University of Electronic Science and Technology of China
Original Assignee: University of Electronic Science and Technology of China
Application filed by University of Electronic Science and Technology of China
Priority to CN202110267900.2A
Publication of CN112905749A
Application granted
Publication of CN112905749B


Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation or dialogue systems
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G06F16/322 Trees
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G06F16/316 Indexing structures
    • G06F16/325 Hash tables
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/289 Phrasal analysis, e.g. finite state techniques or chunking
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06Q INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES; SYSTEMS OR METHODS SPECIALLY ADAPTED FOR ADMINISTRATIVE, COMMERCIAL, FINANCIAL, MANAGERIAL OR SUPERVISORY PURPOSES, NOT OTHERWISE PROVIDED FOR
    • G06Q40/00 Finance; Insurance; Tax strategies; Processing of corporate or income taxes
    • G06Q40/02 Banking, e.g. interest calculation or account maintenance

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • Mathematical Physics (AREA)
  • Business, Economics & Management (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Accounting & Taxation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Computing Systems (AREA)
  • Finance (AREA)
  • Economics (AREA)
  • Development Economics (AREA)
  • Marketing (AREA)
  • Strategic Management (AREA)
  • Technology Law (AREA)
  • General Business, Economics & Management (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Machine Translation (AREA)

Abstract

The invention belongs to the technical field of natural language processing and provides a task-based multi-turn dialogue method based on an intention-slot value rule tree. The method first builds, from the business rules embodied in a standard multi-turn dialogue corpus, an intention-slot value rule tree whose root and leaf nodes carry joint intention-slot value information and whose intermediate nodes carry slot value information. During a dialogue, a neural network extracts the intention and the slot values from the user's statements, and the dialogue proceeds by depth-first traversal of the intention-slot value rule tree. Business rules and the neural network method are thereby combined effectively, which addresses both the heavy manual workload and poor generalization of traditional rule-based methods and the drop in dialogue quality that neural network methods suffer when training samples are few. In addition, the dialogue start node is located by first matching the root and leaf nodes of the tree and then matching its intermediate nodes, which further improves the accuracy of the initial dialogue.

Description

Task-type multi-turn dialogue method based on intention-slot value rule tree
Technical Field
The invention belongs to the technical field of natural language processing, relates to a task type robot dialogue method, and particularly relates to a task type multi-turn dialogue method based on an intention-slot value rule tree.
Background
Since the advent of computers, how to achieve better human-computer interaction has been a topic of interest. Dialogue systems are an important application of natural language processing: they enable machines to communicate with people in natural language. Task-based dialogue systems are intended to help users complete specific tasks, such as business consulting or airline ticket booking.
A task-based dialogue system generally comprises three modules: natural language understanding, dialogue management and natural language generation. The natural language understanding module extracts sentence-level semantic information (the intention) and word-level semantic information (slot information, hereinafter the slot value) from the statement input by the user. The dialogue management module organizes a dialogue strategy from the currently extracted intention and slot values together with the context, and generates the next dialogue action. Finally, the natural language generation module maps the dialogue action produced by dialogue management to a natural language expression with which to reply to the user.
Different implementations of these modules give rise to different task-based dialogue methods. The traditional approach is rule-based: standard multi-turn dialogue texts between a robot and a user are first produced from the business rules, and the corresponding dialogue rules and standard replies are then formulated from those texts. Its main problems are that formulating rules by hand is labor-intensive and that generalization is weak: when the user's statement does not fully match the formulated rules, dialogue quality tends to drop. Neural network-based methods, by contrast, have markedly improved dialogue quality in recent years; a trained neural network model captures the semantic and contextual information of the text, so the user's intention is estimated more accurately and a reasonable dialogue strategy is formed. The main problem of neural network-based methods is that dialogue performance depends on the number of labeled training samples: when training samples are few, model accuracy falls noticeably and dialogue quality is unsatisfactory. In a specific business field, effective training samples are often insufficient, and extensive manual collection and labeling are needed.
Therefore, for a specific business field, the task-based dialog method needs to solve the problem of how to implement high-quality dialog under the condition of limited dialog corpus.
Disclosure of Invention
The invention aims to provide a task-based multi-turn dialogue method based on an intention-slot value rule tree that addresses the problems of the existing rule-based and neural network-based task-based dialogue methods, and thereby to achieve high-quality task-based multi-turn dialogue.
In order to achieve the purpose, the invention adopts the technical scheme that:
A task-based multi-turn dialogue method based on an intention-slot value rule tree, comprising two stages: constructing the intention-slot value rule tree and models, and conducting the multi-turn dialogue; characterized in that
the construction of the intention-slot value rule tree and models comprises the following steps:
A1. Constructing the intention-slot value rule tree:
A1-1. Initialization:
A1-1-1. Set up a slot value set S and an intention-slot value set U, each initialized to contain a single string element with the value "no category";
A1-1-2. Set up an intention-slot value hash table UH, initialized to empty;
A1-2. For each multi-turn dialogue tree in the multi-turn dialogue corpus:
the number of levels of the multi-turn dialogue tree is even; odd-level nodes of the tree store user statements and each has exactly one child node; even-level nodes of the tree store robot statements and each has one or more child nodes;
A1-2-1. Set up an intention-slot value rule tree T, initially empty;
A1-2-2. Set up an initially empty slot value cascade string and an initially empty intention-slot value joint string;
A1-2-3. Annotate the user intention u of the current multi-turn dialogue tree and append it to the intention-slot value joint string;
A1-2-4. Number the even-level nodes of the tree from top to bottom and from left to right;
A1-2-5. Create the root node:
A1-2-5-1. Let q be the user statement of the level-1 node (the root node) of the current multi-turn dialogue tree; extract the slot values of q, and let I be the number of slot values extracted;
A1-2-5-2. For each extracted slot value s_i, i = 1, 2, ..., I: add s_i to the slot value set S and append it to the intention-slot value joint string;
A1-2-5-3. Add the intention-slot value joint string to the intention-slot value set U;
A1-2-5-4. Let num be the number of the level-2 node of the current multi-turn dialogue tree and a the robot statement of that node; create a node as the root node root of the intention-slot value rule tree T, form the triple (num, intention-slot value joint string, a), and store the triple in root as its node data;
A1-2-5-5. With the intention-slot value joint string as the key and the memory address of root, &root, as the value, create the key-value pair <intention-slot value joint string, &root> and store it in the intention-slot value hash table UH as a hash table entry;
A1-2-6. Create the remaining nodes:
Let the current multi-turn dialogue tree have N odd levels; for each node e on the 2nd through N-th odd levels:
A1-2-6-1. Select a slot value s' for the user statement q' of node e, and add s' to the slot value set S;
A1-2-6-2. Let num_p be the number of the parent node of e; find the node e_T in the intention-slot value rule tree T whose node data carries the number num_p;
A1-2-6-3. Let num_c be the number of the child node of e; create a child node e_T' of e_T, form the triple (num_c, s', a') from the number num_c, the slot value s' and the robot statement a' of the child node of e, and store the triple in e_T' as its node data;
A1-2-7. Update the leaf nodes:
For each leaf node of the current multi-turn dialogue tree:
A1-2-7-1. Let num' be the number of the current leaf node; find the node leaf in the intention-slot value rule tree T whose node number is num';
A1-2-7-2. Set up a new intention-slot value string for leaf, with the intention-slot value joint string as its initial value;
A1-2-7-3. For each node other than the root on the path from root to leaf in the intention-slot value rule tree T, read the slot value s' in its node data and append it to this string;
A1-2-7-4. Add this string to the intention-slot value set U;
A1-2-7-5. With this string as the key and the memory address of leaf, &leaf, as the value, create the key-value pair <string, &leaf> and store it in the intention-slot value hash table UH as a hash table entry;
A1-2-7-6. Read the triple (num_c, s', a') of leaf, form a quadruple from num_c, s', a' and this string, and update the node data of leaf to the quadruple;
A2. Train the intention-slot value recognition model:
With the intention-slot value set U as the class label set, for each user statement in the training corpus, select one element of U as the label of the statement; train a general neural network classification model with the user statements and their labels as training samples to obtain the intention-slot value recognition model UM;
A3. Train the slot value recognition model:
A3-1. With the slot value set S as the class label set, for each user statement in the training corpus: segment the statement with a Chinese word segmentation tool, and select one element of S as the label of each resulting word;
A3-2. Train a general sequence labeling model with the statements and their labels as training samples to obtain the slot value recognition model SM;
the multi-turn dialogue comprises the following steps:
B1. Intention-slot value matching:
B1-1. Input the user's statement into the intention-slot value recognition model UM obtained in step A2 to obtain the predicted intention-slot value type and the confidence probability c of the prediction;
B1-2. Initialize the dialogue start node e_x to null; if c is less than the preset confidence probability threshold c_th, go to step B4-1;
B1-3. Find the entry in the intention-slot value hash table UH whose key is the predicted intention-slot value type, read the value of the entry, and let e_p be the node it refers to, located in the intention-slot value rule tree T_p;
B2. If e_p is a leaf node, set the dialogue start node e_x to e_p and go to step B4-1;
B3. Locate the dialogue start node:
B3-1. Initialize the slot value set S' to empty and the dialogue start node e_x to the root node of the intention-slot value rule tree T_p;
B3-2. Input the user's statement into the slot value recognition model SM obtained in step A3 to obtain one or more predicted slot values, and add all of them to the slot value set S';
B3-3. Let T_p have M leaf nodes, so that there are M distinct paths from the root node of T_p to its leaf nodes; for each path l_m (m = 1, 2, ..., M):
B3-3-1. Initialize the matching length of l_m to d_m = 0 and the matched tail node e_m of l_m to null;
B3-3-2. Starting from the child node of the root node, for each node on l_m in turn: if the slot value of the current node is an element of the slot value set S', update d_m = d_m + 1 and mark the current node as the matched tail node e_m of l_m;
B3-4. Find the maximum among all d_m (m = 1, 2, ..., M) and take the matched tail node e_m of the corresponding path as the dialogue start node e_x;
B4. Tree-slot value matching dialogue:
B4-1. If node e_x is null, reply with a fallback statement indicating that the robot cannot understand the user's question, and end the current multi-turn dialogue;
B4-2. Initialize the current dialogue node e_y to the dialogue start node e_x;
B4-3. Read the robot statement a' in the node data of e_y and return it to the user as the robot's reply;
B4-4. If node e_y is a leaf node, end the current multi-turn dialogue;
B4-5. Read the new statement q_x that the user inputs in response to the robot's reply, input q_x into the slot value recognition model obtained in step A3, and obtain the slot value s_x with the highest predicted confidence probability;
B4-6. Among all child nodes of e_y, search for the node whose slot value is s_x; if it is found, set it as the current dialogue node e_y and go to step B4-3; otherwise, reply with a statement asking the user to re-enter and go to step B4-5.
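Step B3 can be illustrated with a minimal Python sketch. The `RuleNode` class, the helper names and the placeholder replies are assumptions made for this illustration; the example tree mirrors the "deduction failure" embodiment. Two points are choices of the sketch rather than statements of the method: ties between paths go to the first path found, and the root is kept as the start node when no slot value matches.

```python
from dataclasses import dataclass, field
from typing import Iterator, List, Set, Tuple


@dataclass
class RuleNode:
    """Hypothetical rule-tree node; data = (number, slot value or joint string, reply)."""
    data: Tuple
    children: List["RuleNode"] = field(default_factory=list)


def root_to_leaf_paths(node: RuleNode, prefix: Tuple = ()) -> Iterator[Tuple]:
    """Yield every root-to-leaf path as a tuple of nodes, left to right."""
    path = prefix + (node,)
    if not node.children:
        yield path
    for child in node.children:
        yield from root_to_leaf_paths(child, path)


def locate_start(root: RuleNode, predicted_slots: Set[str]) -> RuleNode:
    """Step B3: return the matched tail node of the path with the largest d_m."""
    best_len, best_node = 0, root      # assumption: stay at the root if nothing matches
    for path in root_to_leaf_paths(root):
        d_m, e_m = 0, None             # B3-3-1
        for node in path[1:]:          # B3-3-2: start from the child of the root
            if node.data[1] in predicted_slots:
                d_m += 1
                e_m = node
        if d_m > best_len:             # B3-4: keep the path with the maximum d_m
            best_len, best_node = d_m, e_m
    return best_node


def node(num: int, slot: str, *children: RuleNode) -> RuleNode:
    """Build a node with a placeholder robot reply."""
    return RuleNode((num, slot, f"<robot reply {num}>"), list(children))


root = node(1, "consultation deduction failure",
            node(2, "good person loan", node(4, "WeChat"), node(5, "Payment treasure")),
            node(3, "vehicle loan", node(6, "construction bank"), node(7, "agricultural bank")))

start = locate_start(root, {"good person loan"})
print(start.data[0])
```

With the predicted slot set {"good person loan"}, both paths through node 2 reach a matching length of 1 and share the same matched tail node, so the dialogue resumes at node 2.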
Further, the confidence probability threshold c_th takes a value in the range 0.85 ≤ c_th ≤ 0.95.
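The confidence gate of steps B1-2 and B1-3 and the traversal loop of step B4 might be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: `RuleNode`, `start_node`, `run_dialogue`, the fallback and retry wording, and the keyword-based `predict_slot` stub (standing in for the trained model SM) are all hypothetical, and step B3 is omitted for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, Iterable, List, Optional, Tuple

FALLBACK = "Sorry, I cannot understand the question."   # hypothetical wording
RETRY = "Please enter that again."                       # hypothetical wording
SLOTS = ("good person loan", "vehicle loan", "WeChat",
         "Payment treasure", "construction bank", "agricultural bank")


@dataclass
class RuleNode:
    data: Tuple                        # (number, slot value or joint string, reply)
    children: List["RuleNode"] = field(default_factory=list)


def start_node(pred_type: str, c: float, UH: Dict[str, RuleNode],
               c_th: float = 0.9) -> Optional[RuleNode]:
    """Steps B1-2 and B1-3: gate on the confidence c, then look the type up in UH."""
    if not 0.85 <= c_th <= 0.95:
        raise ValueError("c_th is expected to lie in [0.85, 0.95]")
    if c < c_th:
        return None                    # e_x stays null, so B4-1 replies with the fallback
    return UH.get(pred_type)


def run_dialogue(e_x: Optional[RuleNode], user_turns: Iterable[str],
                 predict_slot: Callable[[str], str]) -> List[str]:
    """Step B4: walk down the rule tree and collect the robot's replies."""
    if e_x is None:                    # B4-1
        return [FALLBACK]
    replies, e_y, turns = [], e_x, iter(user_turns)      # B4-2
    while True:
        replies.append(e_y.data[2])    # B4-3
        if not e_y.children:           # B4-4
            return replies
        while True:
            q_x = next(turns, None)    # B4-5
            if q_x is None:            # scripted input exhausted (sketch only)
                return replies
            s_x = predict_slot(q_x)
            match = next((ch for ch in e_y.children if ch.data[1] == s_x), None)
            if match is not None:      # B4-6: descend to the matching child
                e_y = match
                break
            replies.append(RETRY)


def predict_slot(q: str) -> str:
    """Keyword stub standing in for the slot value recognition model SM."""
    return next((s for s in SLOTS if s.lower() in q.lower()), "no category")


def node(num: int, slot: str, reply: str, *children: RuleNode) -> RuleNode:
    return RuleNode((num, slot, reply), list(children))


root = node(1, "consultation deduction failure", "May I ask which platform",
            node(2, "good person loan", "<robot reply 2>",
                 node(4, "WeChat", "<robot reply 4>"),
                 node(5, "Payment treasure", "<robot reply 5>")))
UH = {"consultation deduction failure": root}

e_x = start_node("consultation deduction failure", 0.923, UH)
replies = run_dialogue(e_x, ["Good person loan platform", "hello", "WeChat change"],
                       predict_slot)
print(replies)
```

The scripted turn "hello" matches no child slot value, so the loop asks the user to re-enter before descending to the leaf numbered 4.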
The invention has the beneficial effects that:
The invention provides a multi-turn dialogue method based on an intention-slot value rule tree. The method first builds, from the business rules of a standard multi-turn dialogue corpus, an intention-slot value rule tree whose root and leaf nodes carry joint intention-slot value information and whose intermediate nodes carry slot value information. During the dialogue, a neural network method extracts the intention and the slot values from the user's statements, and the dialogue proceeds by depth-first traversal of the intention-slot value rule tree. The business rules and the neural network method are thereby combined effectively, which avoids both the heavy manual workload and weak generalization of the traditional rule-based method and the reduced dialogue quality of the neural network method when training samples are few. In addition, the dialogue start node is located by first matching the root and leaf nodes of the tree and then matching its intermediate nodes, which further improves the accuracy of the initial dialogue.
Drawings
FIG. 1 is a flow chart of the task-based multi-turn dialog method based on the intent-slot rule tree according to the present invention.
FIG. 2 is a diagram of a multi-turn dialog tree input according to an embodiment of the present invention.
FIG. 3 is a diagram illustrating an intent-slot rule tree generated in an embodiment of the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples.
The embodiment provides a task-based multi-turn dialog method based on an intention-slot value rule tree, the flow of which is shown in fig. 1, and the method comprises two stages of constructing the intention-slot value rule tree and a model and multi-turn dialog; the embodiment describes a multi-turn conversation method applied to a bank intelligent customer service robot for banking business.
First, the construction of the intention-slot value rule tree and models is explained in further detail; it comprises the following steps A1 to A3:
A1. Constructing the intention-slot value rule tree:
A1-1. Initialization:
A1-1-1. Define two sets of strings, a slot value set S and an intention-slot value set U, each initialized to contain a single string element with the value "no category";
A1-1-2. Initialize the intention-slot value hash table UH to empty;
A1-2. Let L be the input multi-turn dialogue corpus, which contains a number of standard multi-turn dialogues. A multi-turn dialogue is defined as an alternating dialogue between a user and a robot whose content is stored in a tree structure (hereinafter the multi-turn dialogue tree). The number of levels of the tree is even; odd-level nodes store user statements and each has exactly one child node; even-level nodes store robot statements and each has one or more child nodes. For each multi-turn dialogue tree in L:
In this embodiment, the input multi-turn dialogue corpus contains 43 standard multi-turn dialogue trees. "Standard" means that the user's initial statement at the root node of each multi-turn dialogue tree is a common question in the field, and that all robot statements are accurate replies given by domain experts to the user's statements. In what follows, a multi-turn dialogue tree on the topic "consultation on deduction matters" is used as the example; it has 6 levels, stores the alternating dialogue between a user and a robot, and has the structure shown in fig. 2;
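The structural rules of a multi-turn dialogue tree can be expressed as a small Python sketch. The `DialogNode` class, the `user` and `robot` helpers and every statement text below are placeholders assumed for illustration; only the shape (6 levels, two platforms, four leaves) follows the embodiment.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class DialogNode:
    """Hypothetical node: a user statement on odd levels, a robot statement on even levels."""
    text: str
    children: List["DialogNode"] = field(default_factory=list)


def check_dialog_tree(root: DialogNode) -> int:
    """Verify that every odd-level (user) node has exactly one child; return the level count.

    Because a user node can then never be a leaf, every leaf lies on an even level.
    """
    depth = 0
    stack = [(root, 1)]                # the root is level 1, a user statement
    while stack:
        node, level = stack.pop()
        depth = max(depth, level)
        if level % 2 == 1 and len(node.children) != 1:
            raise ValueError(f"user node at level {level} must have exactly one child")
        stack.extend((child, level + 1) for child in node.children)
    return depth


def user(text: str, reply: DialogNode) -> DialogNode:
    return DialogNode(text, [reply])


def robot(text: str, *followups: DialogNode) -> DialogNode:
    return DialogNode(text, list(followups))


tree = user("<initial user question>",
            robot("<robot reply 1>",
                  user("Good person loan platform",
                       robot("<robot reply 2>",
                             user("WeChat change", robot("<robot reply 4>")),
                             user("Payment balance", robot("<robot reply 5>")))),
                  user("Vehicle loan platform",
                       robot("<robot reply 3>",
                             user("Construction bank", robot("<robot reply 6>")),
                             user("Agricultural bank", robot("<robot reply 7>"))))))

print(check_dialog_tree(tree))
```

The robot replies are numbered here in the top-to-bottom, left-to-right order that step A1-2-4 prescribes for even-level nodes.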
A1-2-1. Set up an intention-slot value rule tree T, initially empty;
A1-2-2. Set up an initially empty slot value cascade string and an initially empty intention-slot value joint string;
A1-2-3. Manually annotate the user intention u of the current multi-turn dialogue and append it to the intention-slot value joint string;
In this embodiment, the manually annotated user intention of the current multi-turn dialogue is "consultation"; appending it to the initially empty string gives the intention-slot value joint string "consultation";
A1-2-4. Number the even-level nodes of the tree from top to bottom and from left to right;
In this embodiment, the result of numbering the nodes of the even-numbered levels of the tree is shown in table 1;
TABLE 1. Numbering of the even-level nodes of the multi-turn dialogue tree (table image not reproduced)
The number of each node of the current multi-turn dialog tree is shown in FIG. 2;
A1-2-5. Create the root node:
A1-2-5-1. Let q be the user statement of the level-1 node (the root node) of the current multi-turn dialogue tree; manually extract the slot values of q, and let I be the number of slot values extracted;
In this embodiment, the user statement q of the level-1 node (the root node) of the current multi-turn dialogue tree is "May I ask why the deduction failed", and I = 2 slot values are manually extracted: "deduction" and "failure";
A1-2-5-2. For each extracted slot value s_i, i = 1, 2, ..., I: add s_i to the slot value set S and append it to the intention-slot value joint string;
In this embodiment, the extracted slot values s_1 = "deduction" and s_2 = "failure" are added to the set S and appended to the intention-slot value joint string, which becomes "consultation deduction failure";
A1-2-5-3. Add the intention-slot value joint string to the intention-slot value set U;
In this embodiment, adding the joint string to the intention-slot value set U gives U = {"no category", "consultation deduction failure"};
A1-2-5-4. Let num be the number of the level-2 node of the current multi-turn dialogue tree and a the robot statement of that node; create a node as the root node root of the intention-slot value rule tree T, form the triple (num, intention-slot value joint string, a), and store the triple in root as its node data;
In this embodiment, the level-2 node of the current multi-turn dialogue tree has the number num = 1 and the robot statement "May I ask which platform"; a node is created as the root node root of the intention-slot value rule tree T, the triple (1, "consultation deduction failure", "May I ask which platform") is formed, and the triple is stored in root as its node data;
A1-2-5-5. With the intention-slot value joint string as the key and the memory address of root, &root, as the value, create the key-value pair <intention-slot value joint string, &root> and store it in the intention-slot value hash table UH as a hash table entry;
In this embodiment, with "consultation deduction failure" as the key and the memory address of root, &root, as the value, the key-value pair <"consultation deduction failure", &root> is created and stored in the intention-slot value hash table UH as a hash table entry;
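Steps A1-2-3 and A1-2-5 can be traced with a short Python sketch using the values of this embodiment. The `RuleNode` class and `create_root` function are hypothetical names; joining the parts with a space is an assumption made for the English rendering; and a Python object reference stands in for the memory address &root.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class RuleNode:
    """Hypothetical rule-tree node; `data` holds the tuple described in the text."""
    data: Tuple
    children: List["RuleNode"] = field(default_factory=list)


def create_root(intention: str, slot_values: List[str], num: int, reply: str,
                S: Set[str], U: Set[str], UH: Dict[str, RuleNode],
                sep: str = " ") -> RuleNode:
    """Build the joint string, the root triple and the root's hash table entry."""
    joint = intention                  # A1-2-3: start from the annotated intention
    for s in slot_values:              # A1-2-5-2: record and append each slot value
        S.add(s)
        joint = joint + sep + s
    U.add(joint)                       # A1-2-5-3
    root = RuleNode(data=(num, joint, reply))   # A1-2-5-4: triple (num, joint string, a)
    UH[joint] = root                   # A1-2-5-5: the reference plays the role of &root
    return root


S = {"no category"}
U = {"no category"}
UH: Dict[str, RuleNode] = {}
root = create_root("consultation", ["deduction", "failure"], 1,
                   "May I ask which platform", S, U, UH)
print(root.data)
```

Keying UH by the joint string lets the dialogue stage jump from a predicted intention-slot value type straight to a tree node without searching the tree.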
A1-2-6. Create the remaining nodes: let the current multi-turn dialogue tree have N odd levels; for each node e on the 2nd through N-th odd levels:
In this embodiment, the current multi-turn dialogue tree has N = 3 odd levels, and the operations are performed on each node e of the 2nd and 3rd odd levels, that is, of levels 3 and 5 of the multi-turn dialogue tree;
A1-2-6-1. Select a slot value s' for the user statement q' of node e (the slot value reflects the main semantic feature of q'), and add s' to the slot value set S;
In this embodiment, the slot values selected for the nodes of levels 3 and 5 of the multi-turn dialogue tree are shown in table 2;
TABLE 2
Tree level   Node index   User statement              Slot value
3            1            Good person loan platform   good person loan
3            2            Vehicle loan platform       vehicle loan
5            1            WeChat change               WeChat
5            2            Payment balance             Payment treasure
5            3            Construction bank           construction bank
5            4            Agricultural bank           agricultural bank
A1-2-6-2. Let num_p be the number of the parent node of e; find the node e_T in the intention-slot value rule tree T whose node data carries the number num_p;
A1-2-6-3. Let num_c be the number of the child node of e; create a child node e_T' of e_T, form the triple (num_c, s', a') from the number num_c, the slot value s' and the robot statement a' of the child node of e, and store the triple in e_T' as its node data;
In this embodiment, after step A1-2-6 is executed, the obtained slot value set S is {"no category", "good person loan", "vehicle loan", "WeChat", "Payment treasure", "construction bank", "agricultural bank"}; the numbering of each node of the current intention-slot value rule tree is shown in fig. 3;
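Step A1-2-6 amounts to looking up a parent by number and attaching a child triple. The sketch below uses hypothetical names (`RuleNode`, `find_by_number`, `add_child`) and placeholder robot replies. The parent and child numbers in `rows` are inferred from the top-to-bottom, left-to-right numbering rule and from the leaf numbers 4 to 7 of this embodiment; they are not quoted from Table 1.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set, Tuple


@dataclass
class RuleNode:
    data: Tuple                        # (number, slot value or joint string, robot reply)
    children: List["RuleNode"] = field(default_factory=list)


def find_by_number(node: RuleNode, num: int) -> Optional[RuleNode]:
    """Depth-first search for the node whose data carries `num` (step A1-2-6-2)."""
    if node.data[0] == num:
        return node
    for child in node.children:
        hit = find_by_number(child, num)
        if hit is not None:
            return hit
    return None


def add_child(root: RuleNode, S: Set[str], num_p: int, num_c: int,
              slot: str, reply: str) -> RuleNode:
    """Steps A1-2-6-1 to A1-2-6-3 for one odd-level node e of the dialogue tree."""
    S.add(slot)                                        # A1-2-6-1
    parent = find_by_number(root, num_p)               # A1-2-6-2
    if parent is None:
        raise KeyError(f"no rule-tree node numbered {num_p}")
    child = RuleNode(data=(num_c, slot, reply))        # A1-2-6-3: triple (num_c, s', a')
    parent.children.append(child)
    return child


S = {"no category"}
root = RuleNode(data=(1, "consultation deduction failure", "May I ask which platform"))
# (parent number, child number, slot value); level-3 rows come before level-5 rows
rows = [(1, 2, "good person loan"), (1, 3, "vehicle loan"),
        (2, 4, "WeChat"), (2, 5, "Payment treasure"),
        (3, 6, "construction bank"), (3, 7, "agricultural bank")]
for num_p, num_c, slot in rows:
    add_child(root, S, num_p, num_c, slot, reply=f"<robot reply {num_c}>")

print(sorted(S))
```

Processing the odd levels in increasing order guarantees that the parent of each new node already exists when it is looked up.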
A1-2-7. Update the leaf nodes: for each leaf node of the current multi-turn dialogue tree:
A1-2-7-1. Let num' be the number of the current leaf node; find the node leaf in the intention-slot value rule tree T whose node number is num';
In this embodiment, 4 leaf nodes are found in the intention-slot value rule tree T, with node numbers num' of 4, 5, 6 and 7 respectively.
A1-2-7-2. Set up a new intention-slot value string for leaf, with the intention-slot value joint string as its initial value;
A1-2-7-3. For each node other than the root on the path from the root node root of T to leaf, read the slot value s' in its node data and append it to this string;
In this embodiment, the slot values read from each node other than the root on the path from root to leaf, and the resulting intention-slot value strings, are shown in table 3;
TABLE 3. Slot values read along each root-to-leaf path and the resulting intention-slot value strings (table image not reproduced)
A1-2-7-4. Add this string to the intention-slot value set U;
A1-2-7-5. With this string as the key and the memory address of leaf, &leaf, as the value, create the key-value pair <string, &leaf> and store it in the intention-slot value hash table UH as a hash table entry;
A1-2-7-6. Read the triple (num_c, s', a') of leaf, form a quadruple from num_c, s', a' and this string, and update the node data of leaf to the quadruple;
In this embodiment, after step A1 is executed, the finally generated intention-slot value rule tree is as shown in fig. 3, and the obtained intention-slot value set U is {"no category", "consultation deduction failure", "consultation deduction failure good person loan WeChat", "consultation deduction failure good person loan Payment treasure", "consultation deduction failure vehicle loan construction bank", "consultation deduction failure vehicle loan agricultural bank"}.
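The leaf update of step A1-2-7 is a path concatenation, which the following Python sketch reproduces for the example tree. `RuleNode`, `update_leaves` and the `node` helper are hypothetical names, the robot replies are placeholders, the space separator is an assumption of the English rendering, and placing the string last in the quadruple is a choice of the sketch.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Set, Tuple


@dataclass
class RuleNode:
    data: Tuple                        # triple, or quadruple once a leaf has been updated
    children: List["RuleNode"] = field(default_factory=list)


def update_leaves(root: RuleNode, U: Set[str], UH: Dict[str, RuleNode],
                  sep: str = " ") -> None:
    """Step A1-2-7: build each leaf's string, register it in U and UH, extend the leaf data."""
    def walk(current: RuleNode, prefix: str) -> None:
        for child in current.children:
            path = prefix + sep + child.data[1]      # A1-2-7-3: append the slot value s'
            if child.children:
                walk(child, path)
            else:
                U.add(path)                          # A1-2-7-4
                UH[path] = child                     # A1-2-7-5: reference stands in for &leaf
                child.data = child.data + (path,)    # A1-2-7-6: triple becomes quadruple
    walk(root, root.data[1])                         # A1-2-7-2: start from the joint string


def node(num: int, slot: str, *children: RuleNode) -> RuleNode:
    """Build a node with a placeholder robot reply."""
    return RuleNode((num, slot, f"<robot reply {num}>"), list(children))


root = node(1, "consultation deduction failure",
            node(2, "good person loan", node(4, "WeChat"), node(5, "Payment treasure")),
            node(3, "vehicle loan", node(6, "construction bank"), node(7, "agricultural bank")))
U = {"no category", root.data[1]}
UH = {root.data[1]: root}
update_leaves(root, U, UH)

for key in sorted(UH):
    print(UH[key].data[0], key)
```

After the update, UH maps one key to the root and one key to each of the four leaves, so a predicted type can resolve either to the start of a dialogue or directly to its final answer.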
A2. Train the intention-slot value recognition model: with the intention-slot value set U as the class label set, label each user statement in the training corpus by selecting one element of U as its label; train a general neural network classification model with all labeled user statements as training samples to obtain the intention-slot value recognition model UM;
In this embodiment, the intention-slot value set U obtained in step A1 is used as the class label set, and training yields the intention-slot value recognition model UM;
A3. Training the slot value recognition model:
A3-1. Take the slot value set S as the class label set and perform the following on each user statement in the training corpus: segment the user statement into words with a Chinese word segmentation tool, and select one element of S as the label of each resulting word;
In this embodiment, the slot value set S = { "no category", "good person loan", "vehicle loan", "WeChat", "Payment treasure", "construction Bank", "agricultural Bank" } is used as the class label set;
A3-2. Using all labeled user statements as training samples, train a general sequence labeling model to obtain the slot value recognition model SM;
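The two kinds of training sample described in steps A2 and A3 differ only in granularity: one label per sentence for UM, one label per word for SM. The sketch below shows one possible way to encode them; the helper names, the example sentences, the pre-segmented word list and the choice of "no category" for words that carry no slot value are all assumptions made for illustration, and no particular model or segmentation tool is implied.

```python
from typing import List, Tuple

# Label sets (U is an excerpt; S is the set given in the embodiment).
U = ["no category", "consultation deduction failure good credit WeChat"]
S = ["no category", "good person loan", "vehicle loan", "WeChat",
     "Payment treasure", "construction Bank", "agricultural Bank"]


def make_classification_sample(sentence: str, label: str,
                               labels: List[str]) -> Tuple[str, int]:
    """Sentence-level sample for UM: (sentence, class index)."""
    return sentence, labels.index(label)   # ValueError if label is unknown


def make_sequence_sample(words: List[str], word_labels: List[str],
                         labels: List[str]) -> List[Tuple[str, int]]:
    """Word-level sample for SM: one (word, class index) pair per word."""
    if len(words) != len(word_labels):
        raise ValueError("each word needs exactly one label")
    return [(w, labels.index(l)) for w, l in zip(words, word_labels)]


# Hypothetical sentence, used only to show the data shape.
um_sample = make_classification_sample(
    "good person loan WeChat deduction failed", U[1], U)

# Words are assumed to come from a segmentation tool; here they are given.
words = ["good person loan", "has", "not", "deducted", "my", "money"]
sm_sample = make_sequence_sample(
    words, ["good person loan"] + ["no category"] * 5, S)
```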
The multi-turn dialogue comprises the following steps,
which are described below using the example in which the user initially inputs "good person loan has not deducted my money all the time":
B1. Intention-slot value matching:
B1-1. Input the user's sentence into the intention-slot value recognition model obtained in step A2 to obtain the predicted intention-slot value type [Figure BDA0002972728960000103] and the confidence probability c of the prediction;
In this embodiment, the user's input "good person loan has not deducted my money all the time" is fed into the intention-slot value recognition model obtained in step A2, giving the predicted intention-slot value type [Figure BDA0002972728960000101] with confidence probability c = 0.923;
B1-2. Initialize node e_x to null; if c is less than the preset confidence probability threshold c_th, go to step B4-1;
In this embodiment, the confidence probability threshold c_th takes values in the range 0.85 ≤ c_th ≤ 0.95 and is set here to c_th = 0.9; since the current confidence probability c = 0.923 is greater than c_th, execution continues with the next step;
B1-3. In the intention-slot value hash table UH, find the entry whose key is [Figure BDA0002972728960000104], read the entry's value, and use it to locate the corresponding node e_p in the intention-slot value rule tree T_p;
In this embodiment, the entry whose key is [Figure BDA0002972728960000102] is found in the intention-slot value hash table UH, and the entry's value identifies the corresponding node e_p as the root node of the intention-slot value rule tree T_p;
B2. If e_p is a leaf node, set the conversation start node e_x to e_p and go to step B4-1;
In this embodiment, e_p is not a leaf node, so execution continues with the next step;
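Steps B1 and B2 amount to a thresholded classification followed by a hash lookup. A minimal Python sketch is given below; the `Node` class, the `match_intent_slot` function, its return convention and the `classify` callable (a stand-in for the model UM) are hypothetical names introduced here, and treating a label that is missing from UH like a low-confidence prediction is an assumption, since the description does not cover that case.

```python
from typing import Callable, Dict, List, Optional, Tuple


class Node:
    """Minimal rule-tree node: only what B1/B2 need."""
    def __init__(self, num: int, children: Optional[List["Node"]] = None):
        self.num = num
        self.children = children or []


def match_intent_slot(
    sentence: str,
    classify: Callable[[str], Tuple[str, float]],   # stand-in for UM
    UH: Dict[str, Node],
    c_th: float = 0.9,
) -> Tuple[str, Optional[Node], Optional[Node]]:
    """Return (next step, conversation start node e_x, matched node e_p)."""
    e_x = None                                  # B1-2: e_x starts as null
    label, c = classify(sentence)               # B1-1
    if c < c_th:
        return "B4-1", e_x, None                # fallback path
    e_p = UH.get(label)                         # B1-3: hash lookup
    if e_p is None:                             # assumption: unknown key -> fallback
        return "B4-1", e_x, None
    if not e_p.children:                        # B2: leaf answers directly
        return "B4-1", e_p, e_p
    return "B3", e_x, e_p                       # otherwise locate start node


# Example with stub classifiers returning fixed (label, confidence) pairs.
leaf = Node(5)
root = Node(1, [Node(2, [Node(4), leaf]), Node(3, [Node(6), Node(7)])])
UH = {"root label": root, "leaf label": leaf}

to_root = match_intent_slot("...", lambda s: ("root label", 0.923), UH)
to_leaf = match_intent_slot("...", lambda s: ("leaf label", 0.95), UH)
too_low = match_intent_slot("...", lambda s: ("root label", 0.5), UH)
```

With the embodiment's values (c = 0.923, c_th = 0.9, matched node is the root), the sketch proceeds to step B3, as the description states.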
B3. Locating the conversation start node:
B3-1. Initialize the slot value set S' to empty and the conversation start node e_x to the root node of the intention-slot value rule tree T_p;
B3-2. Input the user's sentence into the slot value recognition model obtained in step A3 to obtain one or more predicted slot values, and add them to the slot value set S';
In this embodiment, the user's input "good person loan has not deducted my money all the time" is fed into the slot value recognition model obtained in step A3, giving 3 predicted slot values, "good person loan", "deductions" and "failures", which are added to the slot value set S', i.e., S' = { "good person loan", "deductions", "failures" };
B3-3. For the intention-slot value rule tree T_p, let T_p have M leaf nodes, so that there are M different paths from the root node of T_p to the leaf nodes; for each path l_m (m = 1, 2, 3, …, M), do the following:
In this embodiment, the intention-slot value rule tree T_p has M = 4 leaf nodes, so there are 4 different paths from the root node of T_p to the leaf nodes: l_1 is "node 1 - node 2 - node 4", l_2 is "node 1 - node 2 - node 5", l_3 is "node 1 - node 3 - node 6", and l_4 is "node 1 - node 3 - node 7".
B3-3-1. Initialize the matching length of l_m to d_m = 0 and the matching tail node e_m of l_m to empty;
B3-3-2. Starting from the child node of the root node, do the following for each node on l_m in order: if the slot value of the current node is an element of the slot value set S', update d_m = d_m + 1 and mark the current node as the matching tail node e_m of l_m;
In this embodiment, the calculated matching lengths d_m and matching tail nodes e_m of the 4 paths are shown in Table 4;
TABLE 4
m | Path | Matching length d_m | Matching tail node number
1 | Node 1 - node 2 - node 4 | 1 | 2
2 | Node 1 - node 2 - node 5 | 1 | 2
3 | Node 1 - node 3 - node 6 | 0 | empty
4 | Node 1 - node 3 - node 7 | 0 | empty
B3-4. Find the maximum of all d_m (m = 1, 2, 3, …, M) and take the matching tail node e_m of the corresponding path as the conversation start node e_x;
In this embodiment, d_1 is found to be the maximum; the matching tail node e_m of its corresponding path l_1 is the node numbered 2, so the conversation start node e_x is set to the node numbered 2;
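The path-matching procedure of step B3 can be sketched in Python as shown below. The `Node` class and function names are illustrative assumptions. Two behaviours are interpretations rather than statements of the description: when several paths share the maximum matching length, the first one is taken, and when no node matches at all, e_x keeps its B3-1 initial value (the root), which is consistent with the Table 6 example where the conversation starts at the root node.

```python
from typing import List, Optional, Set, Tuple


class Node:
    """Minimal rule-tree node with a number and a slot value."""
    def __init__(self, num: int, slot: str,
                 children: Optional[List["Node"]] = None):
        self.num = num
        self.slot = slot
        self.children = children or []


def root_to_leaf_paths(root: Node) -> List[List[Node]]:
    """Enumerate all root-to-leaf paths, left to right."""
    paths: List[List[Node]] = []

    def walk(node: Node, prefix: List[Node]) -> None:
        prefix = prefix + [node]
        if not node.children:
            paths.append(prefix)
        for child in node.children:
            walk(child, prefix)

    walk(root, [])
    return paths


def match_path(path: List[Node], s_prime: Set[str]) -> Tuple[int, Optional[Node]]:
    """B3-3-1 / B3-3-2: matching length d_m and matching tail node e_m."""
    d_m, e_m = 0, None
    for node in path[1:]:              # start from the child of the root
        if node.slot in s_prime:
            d_m += 1
            e_m = node
    return d_m, e_m


def locate_start(root: Node, s_prime: Set[str]) -> Node:
    """B3-1 and B3-4: pick the tail node of the best-matching path."""
    e_x, best = root, 0                # B3-1: e_x starts at the root
    for path in root_to_leaf_paths(root):
        d_m, e_m = match_path(path, s_prime)
        if d_m > best:                 # first maximum wins (assumption)
            best, e_x = d_m, e_m
    return e_x


# The 7-node tree of the embodiment.
root = Node(1, "", [
    Node(2, "good person loan", [Node(4, "WeChat"), Node(5, "Payment treasure")]),
    Node(3, "vehicle loan", [Node(6, "construction Bank"), Node(7, "agricultural Bank")]),
])
s_prime = {"good person loan", "deductions", "failures"}
table4 = [(d, e.num if e else None)
          for d, e in (match_path(p, s_prime) for p in root_to_leaf_paths(root))]
start = locate_start(root, s_prime)
```

For the embodiment's S', `table4` reproduces the matching lengths and tail nodes of Table 4, and `start` is the node numbered 2.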
B4. Tree-slot value matching dialogue:
B4-1. If node e_x is null, reply with a fallback sentence indicating that the robot cannot understand the user's question, and end the current multi-turn dialogue;
In this embodiment, e_x is not null, so execution continues with the next step;
B4-2. Initialize the current conversation node e_y to the conversation start node e_x;
In this embodiment, the current conversation node e_y is initialized to the node numbered 2;
B4-3. Read the robot statement a' in node e_y and reply it to the user;
In this embodiment, the robot statement read from node e_y is "May I ask which deduction method?", which is replied to the user;
B4-4. If node e_y is a leaf node, end the current multi-turn dialogue;
In this embodiment, node e_y is not a leaf node, so execution continues with the next step;
B4-5. Read the new sentence q_x that the user inputs in response to the robot's reply, and input q_x into the slot value recognition model obtained in step A3 to obtain the slot value s_x with the highest predicted confidence probability;
In this embodiment, the current user statement q_x is input into the slot value recognition model obtained in step A3, and the slot value with the highest predicted confidence probability is "Payment treasure", i.e., s_x = "Payment treasure";
B4-6. Among all child nodes of node e_y, search for the node whose slot value is s_x; if it is found, set it as the current conversation node e_y and go to step B4-3; otherwise, reply with a sentence asking the user to re-enter the input and go to step B4-5.
In this embodiment, among all child nodes of node e_y, the node numbered 5 is found to have a slot value equal to s_x, so the node numbered 5 is set as the current conversation node e_y and execution goes to step B4-3;
When step B4-3 is executed the second time, the robot statement read from node e_y is "Payment treasure deductions are made on the 10th of each month", which is replied to the user; step B4-4 is then executed the second time, and since e_y is a leaf node, the current multi-turn dialogue ends.
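The loop formed by steps B4-1 to B4-6 can be sketched in Python as follows. The `Node` class, `run_dialogue`, the fallback and retry texts, and `keyword_slot` (a simple keyword lookup standing in for the model SM) are hypothetical and serve only to make the control flow concrete; stopping when the scripted user turns run out is likewise an assumption needed to make the example terminate.

```python
from typing import Callable, Iterable, List, Optional

FALLBACK = "Sorry, I do not understand the question."   # illustrative text
RETRY = "Please enter your answer again."               # illustrative text


class Node:
    """Minimal rule-tree node with slot value and robot statement."""
    def __init__(self, num: int, slot: str, reply: str,
                 children: Optional[List["Node"]] = None):
        self.num = num
        self.slot = slot
        self.reply = reply
        self.children = children or []


def run_dialogue(e_x: Optional[Node], user_turns: Iterable[str],
                 recognize_slot: Callable[[str], str]) -> List[str]:
    """Return the robot replies produced by steps B4-1 to B4-6."""
    if e_x is None:                                  # B4-1
        return [FALLBACK]
    replies: List[str] = []
    turns = iter(user_turns)
    e_y = e_x                                        # B4-2
    while True:
        replies.append(e_y.reply)                    # B4-3
        if not e_y.children:                         # B4-4: leaf ends dialogue
            return replies
        while True:
            q_x = next(turns, None)                  # B4-5
            if q_x is None:                          # no more input (assumption)
                return replies
            s_x = recognize_slot(q_x)
            child = next((c for c in e_y.children if c.slot == s_x), None)
            if child is not None:                    # B4-6: descend one level
                e_y = child
                break
            replies.append(RETRY)                    # B4-6: ask again


def keyword_slot(sentence: str) -> str:
    """Stand-in for SM: return the first known slot value in the sentence."""
    for slot in ("WeChat", "Payment treasure"):
        if slot in sentence:
            return slot
    return "no category"


n5 = Node(5, "Payment treasure",
          "Payment treasure deductions are made on the 10th of each month")
n2 = Node(2, "good person loan", "May I ask which deduction method?",
          [Node(4, "WeChat", "(placeholder reply)"), n5])

replies = run_dialogue(n2, ["I have bound Payment treasure"], keyword_slot)
```

Starting from the node numbered 2, the sketch yields the two robot statements of the embodiment in order; an unrecognized user turn would add a retry prompt between them.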
Based on this, in this embodiment, the user's initial input "good person loan has not deducted my money all the time" passes through steps B1 to B4 to form a complete multi-turn dialogue, as shown in Table 5;
TABLE 5
Serial number | User statement | Robot statement
1 | Good person loan has not deducted my money all the time |
2 | | May I ask which deduction method?
3 | I have bound Payment treasure |
4 | | Payment treasure deductions are made on the 10th of each month
Table 6 shows the multi-turn dialogue formed when the user initially inputs "Why has my money not been deducted?". In this example, the conversation start node is located at the root node.
TABLE 6
Serial number | User statement | Robot statement
1 | Why has my money not been deducted? |
2 | | May I ask which platform?
3 | Vehicle loan platform |
4 | | May I ask which bank card is bound?
5 | Construction Bank |
6 | | Please contact construction bank customer service 95533 for consultation
From the results in tables 5 and 6, it can be seen that for different user inputs, the robot can give correct responses according to the business rules, and finally complete multiple rounds of conversations with the user, thereby verifying the validity of the method of the present invention.
While the invention has been described with reference to specific embodiments, any feature disclosed in this specification may be replaced by alternative features serving the same, equivalent or similar purpose, unless expressly stated otherwise; all of the disclosed features, or all of the method or process steps, may be combined in any combination, except mutually exclusive features and/or steps.

Claims (2)

1. A task-based multi-turn dialogue method based on an intention-slot value rule tree, comprising: construction of the intention-slot value rule tree and models, and multi-turn dialogue; characterized in that,
the construction of the intention-slot value rule tree and models comprises the following steps:
A1. Intention-slot value rule tree construction:
A1-1. Initialization:
A1-1-1. Set a slot value set S and an intention-slot value set U, each initialized to contain a single string element with the value "no category";
A1-1-2. Set an intention-slot value hash table UH, initialized to empty;
A1-2. For each multi-turn dialogue tree in the multi-turn dialogue corpus:
the number of levels of the multi-turn dialogue tree is even; odd-level nodes of the tree store user statements, and each odd-level node has exactly one child node; even-level nodes of the tree store robot statements, and each even-level node has one or more child nodes;
A1-2-1. Set an initially empty intention-slot value rule tree T;
A1-2-2. Set an initially empty slot value cascade string [Figure FDA0003663128170000012] and an initially empty intention-slot value union string [Figure FDA0003663128170000011];
A1-2-3. Annotate the user intention u of the current multi-turn dialogue tree and append it to the intention-slot value union string [Figure FDA0003663128170000019];
A1-2-4. Number the even-level nodes of the tree in top-to-bottom, left-to-right order;
A1-2-5. Create the root node:
A1-2-5-1. Extract the slot values of the user statement q corresponding to the level-1 node of the current multi-turn dialogue tree, the number of extracted slot values being I;
A1-2-5-2. For each extracted slot value s_i, where i = 1, 2, 3, …, I: add s_i to the slot value set S, and append it to the intention-slot value union string [Figure FDA0003663128170000017];
A1-2-5-3. Add the intention-slot value union string [Figure FDA0003663128170000018] to the intention-slot value set U;
A1-2-5-4. Let num be the number of the level-2 node of the current multi-turn dialogue tree and a the robot statement corresponding to that node; create a node as the root node root of the intention-slot value rule tree T, form the triple [Figure FDA0003663128170000014] from num, [Figure FDA0003663128170000013] and a, and store the triple in root as node data;
A1-2-5-5. With [Figure FDA0003663128170000015] as the key and the storage address &root of root as the value, create the key-value pair [Figure FDA0003663128170000016] and store it as an entry in the intention-slot value hash table UH;
A1-2-6. Create the remaining nodes:
let the current multi-turn dialogue tree have N odd-numbered levels, and perform the following on each node e from the 2nd odd-numbered level to the N-th odd-numbered level:
A1-2-6-1. Select a slot value s' for the user statement q' corresponding to node e, and add s' to the slot value set S;
A1-2-6-2. Let num_p be the number of the parent node of node e, and find in the intention-slot value rule tree T the node e_T whose node data contains the number num_p;
A1-2-6-3. Let num_c be the number of the child node of e; create a child node e_T' of e_T, set the number of node e_T' to num_c, form the triple (num_c, s', a') from the number num_c, the slot value s' and the robot statement a' corresponding to the child node of e, and store the triple in e_T' as node data;
A1-2-7. Update the leaf nodes:
for each leaf node of the current multi-turn dialogue tree:
A1-2-7-1. Let num' be the number of the current leaf node, and find in the intention-slot value rule tree T the node leaf whose node number is num';
A1-2-7-2. Set a string [Figure FDA0003663128170000028] whose initial value is [Figure FDA0003663128170000021];
A1-2-7-3. For each node other than the root on the path from root to leaf in the intention-slot value rule tree T, read the slot value s' from the node data and append it to [Figure FDA0003663128170000024];
A1-2-7-4. Add [Figure FDA0003663128170000022] to the intention-slot value set U;
A1-2-7-5. With [Figure FDA0003663128170000023] as the key and the storage address &leaf of leaf as the value, create the key-value pair [Figure FDA0003663128170000025] and store it as an entry in the intention-slot value hash table UH;
A1-2-7-6. Read the triple (num_c, s', a') stored in leaf, combine num_c, s', a' and [Figure FDA0003663128170000026] into the quadruple [Figure FDA0003663128170000027], and update the node data of leaf to this quadruple;
A2. Training the intention-slot value recognition model:
take the intention-slot value set U as the class label set and perform the following on each user statement in the training corpus: select one element of the intention-slot value set U as the label of the statement; then, using the user statements and their corresponding labels as training samples, train a general neural network classification model to obtain the intention-slot value recognition model UM;
A3. Training the slot value recognition model:
A3-1. Take the slot value set S as the class label set and perform the following on each user statement in the training corpus: segment the user statement into words with a Chinese word segmentation tool, and select one element of the slot value set S as the label of each resulting word;
A3-2. Using the statements and their corresponding labels as training samples, train a general sequence labeling model to obtain the slot value recognition model SM;
the multi-turn dialogue comprises the following steps:
B1. Intention-slot value matching:
B1-1. Input the user's sentence into the intention-slot value recognition model UM obtained in step A2 to obtain the predicted intention-slot value type [Figure FDA0003663128170000031] and the confidence probability c of the prediction;
B1-2. Initialize node e_x to null; if c is less than the preset confidence probability threshold c_th, go to step B4-1;
B1-3. In the intention-slot value hash table UH, find the entry whose key is [Figure FDA0003663128170000032], read the entry's value, and use it to locate the corresponding node e_p in the intention-slot value rule tree T_p;
B2. If e_p is a leaf node, set the conversation start node e_x to e_p and go to step B4-1;
B3. Locating the conversation start node:
B3-1. Initialize the slot value set S' to empty and the conversation start node e_x to the root node of the intention-slot value rule tree T_p;
B3-2. Input the user's sentence into the slot value recognition model SM obtained in step A3 to obtain one or more predicted slot values, and add all of them to the slot value set S';
B3-3. For the intention-slot value rule tree T_p, let T_p have M leaf nodes, so that there are M different paths from the root node of T_p to the leaf nodes; for each path l_m, do the following:
B3-3-1. Initialize the matching length of l_m to d_m = 0 and the matching tail node e_m of l_m to empty, where m = 1, 2, 3, …, M;
B3-3-2. Starting from the child node of the root node, do the following for each node on l_m in order: if the slot value of the current node is an element of the slot value set S', update d_m = d_m + 1 and mark the current node as the matching tail node e_m of l_m;
B3-4. Find the maximum of all d_m and take the matching tail node e_m of the corresponding path as the conversation start node e_x;
B4. Tree-slot value matching dialogue:
B4-1. If node e_x is null, reply with a fallback sentence indicating that the robot cannot understand the user's question, and end the current multi-turn dialogue;
B4-2. Initialize the current conversation node e_y to the conversation start node e_x;
B4-3. Read the robot statement a' in node e_y and reply it to the user;
B4-4. If node e_y is a leaf node, end the current multi-turn dialogue;
B4-5. Read the new sentence q_x that the user inputs in response to the robot's reply, and input q_x into the slot value recognition model obtained in step A3 to obtain the slot value s_x with the highest predicted confidence probability;
B4-6. Among all child nodes of node e_y, search for the node whose slot value is s_x; if it is found, set it as the current conversation node e_y and go to step B4-3; otherwise, reply with a sentence asking the user to re-enter the input and go to step B4-5.
2. The task-based multi-turn dialogue method based on an intention-slot value rule tree according to claim 1, wherein the confidence probability threshold c_th has the value range 0.85 ≤ c_th ≤ 0.95.
CN202110267900.2A 2021-03-12 2021-03-12 Task-based multi-turn dialogue method based on intention-slot value rule tree Active CN112905749B (en)

Publications (2)

Publication Number Publication Date
CN112905749A CN112905749A (en) 2021-06-04
CN112905749B true CN112905749B (en) 2022-07-29

Family

ID=76104959

Country Status (1)

Country Link
CN (1) CN112905749B (en)

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant