CN109460450B - Dialog state tracking method and device, computer equipment and storage medium - Google Patents

Dialog state tracking method and device, computer equipment and storage medium Download PDF

Info

Publication number
CN109460450B
CN109460450B CN201811131847.8A
Authority
CN
China
Prior art keywords
label
text
dialog
slot
value
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201811131847.8A
Other languages
Chinese (zh)
Other versions
CN109460450A (en)
Inventor
欧智坚
戴音培
张毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Tsinghua University
Original Assignee
Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Tsinghua University filed Critical Tsinghua University
Priority to CN201811131847.8A priority Critical patent/CN109460450B/en
Publication of CN109460450A publication Critical patent/CN109460450A/en
Application granted granted Critical
Publication of CN109460450B publication Critical patent/CN109460450B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/30: Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Machine Translation (AREA)

Abstract

The application relates to a dialog state tracking method and apparatus, a computer device, and a storage medium. With the method, the robustness of the dialog can be improved, one slot can take multiple values, and the user's preferences over those values can be expressed.

Description

Dialog state tracking method and device, computer equipment and storage medium
Technical Field
The present application relates to the field of dialog technologies, and in particular, to a dialog state tracking method and apparatus, a computer device, and a storage medium.
Background
With the development of dialog technology, dialog state tracking techniques have emerged that track the dialog state based on system rules and extract the information contained in user utterances with recurrent neural networks. In the slot-value pairs that existing dialog state systems use to represent the dialog state, each slot can take only one value, and that value cannot express the user's preference.
However, current dialog state tracking methods suffer from low robustness.
Disclosure of Invention
In view of the foregoing, it is desirable to provide a dialog state tracking method, apparatus, computer device, and storage medium that address the above technical problems.
A dialog state tracking method, the method comprising:
acquiring the current round of dialog text;
determining the current round of dialog semantics according to the dialog text and a rich dialog state tracking rule;
and updating the current round of dialog state according to the dialog semantics and the previous round of dialog state.
In one embodiment, the determining the current round of dialog semantics comprises:
parsing the dialog text according to the rich dialog state tracking rule to obtain a domain label of the dialog text.
In one embodiment, the parsing the dialog text according to the rich dialog state tracking rule to obtain the domain label of the dialog text comprises:
acquiring the probability distribution of the label corresponding to each domain according to the dialog text and the current system behavior;
and selecting the label with the highest probability value in the probability distribution as the domain label.
In one embodiment, after the selecting the label with the highest probability value in the probability distribution as the domain label, the method comprises:
judging whether the domain label is a preset label,
and if so, parsing the dialog text according to the rich dialog state tracking rule to obtain a slot label of the dialog text.
In one embodiment, the parsing the dialog text according to the rich dialog state tracking rule to obtain the slot label of the dialog text comprises:
acquiring the probability distribution of the label corresponding to each informable slot in the domain according to the dialog text and the current system behavior;
and selecting the label with the highest probability value in the probability distribution as the slot label.
In one embodiment, after the selecting the label with the highest probability value in the probability distribution as the slot label, the method comprises:
judging whether the slot label is a preset label,
and if so, parsing the dialog text according to the rich dialog state tracking rule to obtain a value label of the dialog text.
In one embodiment, the parsing the dialog text according to the rich dialog state tracking rule to obtain the value label of the dialog text comprises:
acquiring the probability distribution of the label corresponding to each value in the slot according to the dialog text and the current system behavior;
and selecting the label with the highest probability value in the probability distribution as the value label.
In one embodiment, after the selecting the label with the highest probability value in the probability distribution as the value label, the method comprises:
determining the current round of dialog semantics according to the domain, the slot, the value, the domain label, the slot label, and the value label.
A dialog state tracking apparatus, the apparatus comprising:
a text acquisition module, configured to acquire the current round of dialog text;
a text processing module, configured to determine the current round of dialog semantics according to the dialog text and the rich dialog state tracking rule;
and a state update module, configured to update the current round of dialog state according to the dialog semantics and the previous round of dialog state.
A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the above method when executing the computer program.
A computer-readable storage medium on which a computer program is stored, wherein the computer program, when executed by a processor, implements the steps of the above method.
According to the dialog state tracking method and apparatus, the computer device, and the storage medium, the current round of dialog semantics is determined according to the dialog text and the rich dialog state tracking rule, and the current round of dialog state is updated according to the dialog semantics and the previous round of dialog state, so that the robustness of the dialog can be improved, one slot can take multiple values, and the user's preferences over those values can be expressed.
Drawings
FIG. 1 is a diagram of an application environment for a dialog state tracking method in one embodiment;
FIG. 2 is a flow diagram illustrating a dialog state tracking method in one embodiment;
FIG. 3 is a schematic diagram illustrating the structure of the rich dialog state tracking rule in a dialog state tracking method in one embodiment;
FIG. 4 is a flowchart illustrating step S21 in one embodiment;
FIG. 5 is a flow diagram illustrating a method for obtaining a domain label in one embodiment;
FIG. 6 is a schematic diagram of the structure of a convolutional neural network model in one embodiment;
FIG. 7 is a flowchart illustrating step S22 in one embodiment;
FIG. 8 is a flowchart illustrating step S23 in one embodiment;
FIG. 9 is a block diagram of a dialog state tracking device in one embodiment;
FIG. 10 is a diagram showing an internal structure of a computer device according to an embodiment.
Detailed Description
In order to make the objects, technical solutions and advantages of the present application more apparent, the present application is described in further detail below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application.
The dialog state tracking method provided by the application can be applied to the application environment shown in fig. 1. Wherein the terminal 102 communicates with the server 104 via a network.
The terminal 102 may be, but is not limited to, a personal computer, a notebook computer, a smart phone, a tablet computer, or a portable wearable device, and the server 104 may be implemented as an independent server or as a server cluster formed by a plurality of servers.
In one embodiment, as shown in fig. 2, a dialog state tracking method is provided, which is described by taking its application in the environment of fig. 1 as an example and includes the following steps:
step S1: and acquiring the current wheel conversation text.
In step S1, the current wheel dialog text refers to the text input by the terminal 102, which includes the voice text and the document text. The server 104 receives the current wheel-to-speech text sent by the terminal 102 through the network.
Step S2: determining the current round of dialog semantics according to the dialog text and the rich dialog state tracking rule.
In step S2, the rich dialog state tracking rule is expressed as follows, with reference to fig. 3:
d denotes a domain entity, s denotes a slot entity, and v denotes a value entity.
D represents the set of domains, S_d represents the set of slots in domain d, and V_s represents the set of values in slot s.
For each domain d, μ_d is the tracking variable of the domain entity in the current round of the dialog; its possible labels are "mentioned" and "not mentioned" (default "not mentioned"). μ represents the set {μ_d : d ∈ D}. Slots are labeled similarly: ξ_s is the tracking variable of the slot entity in the current round of the dialog; its possible labels are "mentioned", "not mentioned", and "don't care" (default "not mentioned"). ξ_d represents the set {ξ_s : s ∈ S_d}, and ξ represents the set {ξ_d : d ∈ D}. Similarly, η_v is the tracking variable of value entity v in the current round of the dialog; its possible labels are "like" and "dislike" (default "dislike"). η_s represents the set {η_v : v ∈ V_s}, η_d represents the set {η_s : s ∈ S_d}, and η represents the set {η_d : d ∈ D}. The dialog state is denoted (μ, ξ, η), and the rule that maintains it is called the "rich dialog state tracking rule". A dialog state in this form solves the problems that one slot of a traditional dialog state cannot take multiple values and cannot reflect user preferences, and it contains richer information than the traditional dialog state, so it is called a "rich dialog state".
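The rich dialog state (μ, ξ, η) described above can be sketched as nested label dictionaries; the domain, slot, and value inventory below is hypothetical and purely for illustration.

```python
# A minimal sketch of the rich dialog state (mu, xi, eta): every domain, slot,
# and value carries its own tracking label, so one slot can hold several
# values and each value records the user's preference.
DOMAINS = {"movie": {"actor": ["Ge You", "Huang Bo"], "genre": ["comedy"]}}

def empty_rich_state(domains):
    """Default state: every domain and slot 'not mentioned', every value 'dislike'."""
    mu = {d: "not mentioned" for d in domains}                            # mu_d
    xi = {d: {s: "not mentioned" for s in domains[d]} for d in domains}   # xi_s
    eta = {d: {s: {v: "dislike" for v in domains[d][s]}
               for s in domains[d]} for d in domains}                     # eta_v
    return mu, xi, eta

mu, xi, eta = empty_rich_state(DOMAINS)
```

Because the three layers are kept separate, traversing them later (domains, then slots, then values) mirrors the hierarchical labeling the rule performs.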
After receiving the current round of dialog text, the server 104 processes it with the rich dialog state tracking rule to determine the current round of dialog semantics.
Step S3: updating the current round of dialog state according to the dialog semantics and the previous round of dialog state.
In step S3, the dialog text is parsed by the rich dialog state tracking rule and combined with the previous round of dialog state to form a new dialog state, which summarizes the dialog with the terminal 102 up to the current round, so that the user's intention is tracked across multiple rounds of dialog.
According to the above dialog state tracking method, the current round of dialog text is obtained, the current round of dialog semantics is determined according to the dialog text and the rich dialog state tracking rule, and the current round of dialog state is then updated according to the dialog semantics and the previous round of dialog state, so that the robustness of the dialog can be improved, one slot can take multiple values, and the user's preferences over those values can be expressed.
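Under the assumption that the current round's semantics arrive as partial label dictionaries of the same shape as the rich state, the update of step S3 can be sketched as an overlay of this round's labels onto the previous state; all names here are illustrative.

```python
def update_rich_state(prev, turn):
    """Overlay this round's parsed labels onto the previous rich state (mu, xi, eta)."""
    prev_mu, prev_xi, prev_eta = prev
    mu = {d: turn.get("mu", {}).get(d, lbl) for d, lbl in prev_mu.items()}
    xi = {d: {s: turn.get("xi", {}).get(d, {}).get(s, lbl)
              for s, lbl in slots.items()} for d, slots in prev_xi.items()}
    eta = {d: {s: {v: turn.get("eta", {}).get(d, {}).get(s, {}).get(v, lbl)
                   for v, lbl in vals.items()}
               for s, vals in slots.items()} for d, slots in prev_eta.items()}
    return mu, xi, eta

prev = ({"movie": "not mentioned"},
        {"movie": {"actor": "not mentioned"}},
        {"movie": {"actor": {"Ge You": "dislike", "Huang Bo": "dislike"}}})
turn = {"mu": {"movie": "mentioned"},
        "xi": {"movie": {"actor": "mentioned"}},
        "eta": {"movie": {"actor": {"Ge You": "like"}}}}
mu, xi, eta = update_rich_state(prev, turn)
```

Labels not mentioned this round keep their previous values, which is what lets the state accumulate user intentions across rounds.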
In one embodiment, step S2 includes:
step S21: and analyzing the dialog text according to the rich dialog state tracking rule to obtain a field label of the dialog text.
In step S21, the user text of the current round is denoted u, d denotes a domain entity, and D denotes the set of domains. For each domain d, μ_d is the tracking variable of the domain entity in the current round; its possible labels are "mentioned" and "not mentioned". All domain entities d in the domain set D are traversed, and the label of each d is computed. Suppose the current round of user text u is "I want to watch a movie with Ge You and Huang Bo" and the domain set D includes movie, music, time, weather, and so on; then the movie domain label is "mentioned" and the other domain labels are "not mentioned".
In one embodiment, in conjunction with fig. 4, the step S21 includes:
step S211: and acquiring the probability distribution of the corresponding label of each field according to the dialog text and the current system behavior.
In step S211, with reference to fig. 5, the probability distribution of the corresponding label in each domain is obtained according to the user text u of the current round and the current system behavior a. The implementation process can be broken down into the following steps.
(I) Convert the current round of user text u into a domain-specific embedding matrix f_1(d, u).
First, the user text is segmented into words. Suppose the current round of user text contains k_u words u_1, u_2, …, u_{k_u}, and the word embedding of each word has dimension d_ms. X ∈ R^{k_u × d_ms} denotes the word embedding matrix of the current round of user text u. A word embedding is a vector used to represent an individual word. Compared with traditional one-hot coding, word embeddings have lower dimensionality, and the word vectors of words with similar meanings lie relatively close together, which helps reduce overfitting and capture relations between words. Before each discriminator is trained, a dictionary of commonly used words is established, and the word embedding of each word is trained.
x_str(d, u) ∈ {0, 1}^{k_u} indicates whether each word in the current round of user text u is related to domain d: a semantic dictionary containing common domain-specific words is established for each domain, and each word of u is matched against its keywords. Concatenating X with x_str(d, u) gives the domain-specific embedding matrix f_1(d, u) = [X, x_str(d, u)] ∈ R^{k_u × (d_ms + 1)}.
(II) Convert the current round of system behavior a into a domain-specific action vector f_2(d, a). f_2(d, a) is a 7-dimensional vector whose i-th element equals 1 when the i-th of the following conditions on a and d holds, and 0 otherwise: (1) a asks the user about an informable slot in d; (2) a informs the user about a requestable slot in d; (3) a confirms an informable slot in d with the user; (4) a informs the user that no information related to d was found; (5) a informs the user that one piece of information related to d was found; (6) a informs the user that several pieces of information related to d were found and asks the user which one to select; (7) none of the first six conditions holds.
(III) Extract a domain-specific embedding vector from the domain-specific embedding matrix f_1(d, u) with a convolutional neural network. L convolution filters each with window sizes of 1, 2, and 3 are used to convolve f_1(d, u); after a ReLU (Rectified Linear Unit) activation function and max pooling, 3 vectors are obtained, which can be regarded as feature representations of a unigram, bigram, and trigram model, respectively. Finally, the three vectors are concatenated to form the domain-specific embedding vector.
In fig. 6, only 3 convolution filters are drawn per n-gram model in order to save space.
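The convolution-and-max-pooling step can be sketched with a single hypothetical scalar filter per n-gram window; a real model applies L learned filters over embedding rows, but the pooling logic is the same.

```python
def relu(x):
    return max(0.0, x)

def ngram_max_pool(scores, window):
    """Slide a window over per-word filter responses, sum each window,
    apply ReLU, then take the max over time (max pooling)."""
    pooled = [relu(sum(scores[i:i + window]))
              for i in range(len(scores) - window + 1)]
    return max(pooled)

scores = [0.2, -0.5, 1.0, 0.3]  # one toy filter response per word
# unigram / bigram / trigram features, later concatenated
feature = [ngram_max_pool(scores, w) for w in (1, 2, 3)]
```

Each window size captures a different phrase length, which is why the three pooled outputs are read as unigram, bigram, and trigram features before concatenation.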
(IV) Using a gate mechanism, constrain the embedding vector with the action vector to obtain the semantic feature vector h, computed as:
h_i = f_2(d, a)[i] · CNN(f_1(d, u)), i = 1, …, 7, h = [h_1, h_2, …, h_7]
This operation can be understood as using the action vector to control where the embedding vector appears within the semantic feature vector.
(V) Feed the semantic feature vector into a fully connected network and compute the probability distribution p(μ_d) ∈ R^2 of the domain-specific label. The fully connected network is denoted FC_{μ_d}; its hidden layer has the same number of nodes as its input layer, its output activation function is softmax, and its output is a 2-dimensional vector representing the probabilities that the domain label is "mentioned" and "not mentioned":
p(μ_d) = FC_{μ_d}(h)
Step S212: selecting the label with the highest probability value in the probability distribution as the domain label.
In step S212, after the probability distribution of the label corresponding to each domain is obtained, the label with the highest probability value in the distribution is selected as the domain label. The output probability for a domain label is a two-dimensional vector summing to 1 that represents the probabilities of "mentioned" and "not mentioned" respectively, for example (0.8, 0.2); the label with the highest probability is selected as the output, in this case "mentioned". All domains are traversed during testing, so each domain independently decides whether it is "mentioned" or "not mentioned" in the current round; any number of domains, or none at all, may be mentioned.
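Steps (IV) and (V) can be sketched as follows, assuming the CNN output is already a feature vector and using illustrative, untrained weights for the fully connected layer.

```python
import math

def domain_label_probs(cnn_feat, action_vec, W, b):
    """Gate the CNN feature by each of the 7 action conditions, concatenate,
    then softmax over ('mentioned', 'not mentioned')."""
    h = [a * f for a in action_vec for f in cnn_feat]  # h_i = f2[i] * CNN(f1)
    logits = [sum(w * x for w, x in zip(row, h)) + bi
              for row, bi in zip(W, b)]
    m = max(logits)                                    # stabilize softmax
    exps = [math.exp(z - m) for z in logits]
    s = sum(exps)
    return [e / s for e in exps]

cnn_feat = [0.5, -0.2]              # toy 2-dim CNN output
action_vec = [1, 0, 0, 0, 0, 0, 0]  # only action condition (1) holds
W = [[1.0] * 14, [-1.0] * 14]       # 2 x (7*2) illustrative weights
b = [0.0, 0.0]
probs = domain_label_probs(cnn_feat, action_vec, W, b)
```

Taking the argmax over `probs` then yields the domain label, here "mentioned", matching the selection rule of step S212.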
In one embodiment, said step S212 is followed by:
step S213: and judging whether the field label is a preset label or not.
In step S213, the label with the highest probability value in the probability distribution is used as the domain label, and there are two cases of the domain label, namely mentioned and not mentioned. In the application, a preset label is set as 'mentioned', and when a field label is mentioned, namely the field label is the same as the preset label, a slot label of the dialog text is continuously acquired; and when the field label is not mentioned, namely the field label is different from the preset label, the rich conversation state tracking rule stops continuously analyzing the conversation text.
Step S22: if so, parsing the dialog text according to the rich dialog state tracking rule to obtain the slot label of the dialog text.
The user text of the current round is denoted u, s denotes a slot entity, and S_d denotes the set of slots in domain d. For each slot s, ξ_s is the tracking variable of the slot entity in the current round; its possible labels are "mentioned", "not mentioned", and "don't care" (default "not mentioned").
According to the current round of user text, the rich dialog state tracking rule traverses the slot set S_d in domain d and acquires the slot labels of the dialog text. For example, for "I want to watch a movie with Ge You and Huang Bo", d is movie and S_d includes actor, genre, age, and so on; the actor slot is labeled "mentioned" and the other slots are labeled "not mentioned".
In one embodiment, in conjunction with fig. 7, the step S22 includes:
step S221: and acquiring the probability distribution of the corresponding label of each informing slot in the field according to the dialog text and the current system behavior.
In step S221, the probability distribution of the corresponding label of each notification slot is obtained according to the user text u of the current round and the current system behavior a. The implementation process can be broken down into the following steps.
(I) Convert the current round of user text u into a slot-specific embedding matrix f_1(s, u).
First, the text is segmented into words. Suppose the current round of user text contains k_u words u_1, u_2, …, u_{k_u}, and the word embedding of each word has dimension d_ms. X ∈ R^{k_u × d_ms} denotes the word embedding matrix of the current round of user text u. A word embedding is a vector used to represent an individual word. Compared with traditional one-hot coding, word embeddings have lower dimensionality, and the word vectors of words with similar meanings lie relatively close together, which helps reduce overfitting and capture relations between words. Before each discriminator is trained, a dictionary of commonly used words is established, and the word embedding of each word is trained.
x_str(s, u) ∈ {0, 1}^{k_u} indicates whether each word in the current round of user text u is related to slot s: a semantic dictionary containing common slot-specific words is established for each slot, and each word of u is matched against its keywords. Concatenating X with x_str(s, u) gives the slot-specific embedding matrix f_1(s, u) = [X, x_str(s, u)] ∈ R^{k_u × (d_ms + 1)}.
(II) Convert the current round of system behavior a into a slot-specific action vector f_2(s, a). f_2(s, a) is a 7-dimensional vector whose i-th element equals 1 when the i-th of the following conditions on a and s holds, and 0 otherwise: (1) a asks the user about a value in s; (2) a informs the user of a value in s; (3) a confirms a value in s with the user; (4) a informs the user that no information related to s was found; (5) a informs the user that one piece of information related to s was found; (6) a informs the user that several pieces of information related to s were found and asks the user which one to select; (7) none of the first six conditions holds.
(III) Extract a slot-specific embedding vector from the slot-specific embedding matrix f_1(s, u) with a convolutional neural network. L convolution filters each with window sizes of 1, 2, and 3 are used to convolve f_1(s, u); after a ReLU (Rectified Linear Unit) activation function and max pooling, 3 vectors are obtained, which can be regarded as feature representations of a unigram, bigram, and trigram model, respectively. Finally, the three vectors are concatenated to form the slot-specific embedding vector.
As in fig. 6, only 3 convolution filters are drawn per n-gram model to save space.
(IV) Using a gate mechanism, constrain the embedding vector with the action vector to obtain the semantic feature vector h, computed as:
h_i = f_2(s, a)[i] · CNN(f_1(s, u)), i = 1, …, 7, h = [h_1, h_2, …, h_7]
This operation can be understood as using the action vector to control where the embedding vector appears within the semantic feature vector.
(V) Finally, feed the semantic feature vector into a fully connected network and compute the probability distribution p(ξ_s) ∈ R^3 of the slot-specific label. The fully connected network is denoted FC_{ξ_s}; its hidden layer has the same number of nodes as its input layer, its output activation function is softmax, and its output is a 3-dimensional vector representing the probabilities that the slot label is "mentioned", "not mentioned", and "don't care":
p(ξ_s) = FC_{ξ_s}(h)
Step S222: selecting the label with the highest probability value in the probability distribution as the slot label.
In step S222, after the probability distribution of the label corresponding to each informable slot is obtained, the label with the highest probability value in the distribution is selected as the slot label. The output probability for a slot label is a three-dimensional vector summing to 1 that represents the probabilities of "mentioned", "not mentioned", and "don't care" respectively, for example (0.8, 0.1, 0.1); the label with the highest probability is selected as the slot's label at test time, in this case "mentioned". The informable slots in all domains are traversed during testing, so each informable slot independently decides whether it is "mentioned", "not mentioned", or "don't care" in the current round. By traversing all informable slots, we obtain which slots are mentioned, which are not mentioned, and which slots may take any of their values in the current dialog.
In one embodiment, said step S222 is followed by:
step S223: judging whether the slot label is a preset label or not,
in step S223, the label with the highest probability value in the probability distribution is used as the slot label, and there are three cases of "mention", "not mention" and "not intention". In the application, a preset label is set as 'mention', and when the slot label is 'mention', namely the slot label is the same as the preset label, the value label of the dialog text is continuously acquired; when the slot label is "not mentioned" or "not intended", i.e., the slot label is different from the preset label, the rich dialog state tracking rule stops parsing the dialog text continuously.
Step S23: if so, parsing the dialog text according to the rich dialog state tracking rule to obtain the value label of the dialog text.
The user text of the current round is denoted u, v denotes a value entity, and V_s denotes the set of values in slot s. For each value v, η_v is the tracking variable of the value entity in the current round; its possible labels are "like" and "dislike".
According to the current round of user text, the rich dialog state tracking rule traverses the domain set and the slot set of the text, then traverses the value set V_s to obtain the value labels. For example, for "I want to watch movies with Ge You and Huang Bo", where the domain label is [movie = mentioned] and the slot label is [actor = mentioned], the value labels are [Ge You = like] and [Huang Bo = like].
In one embodiment, in conjunction with fig. 8, the step S23 includes:
step S231: and acquiring the probability distribution of the label corresponding to each dereferencing value in the slot according to the dialog text and the current system behavior.
In step S231, the probability distribution of the label corresponding to each value is obtained according to the user text u and the current system behavior a of the current round. The implementation process can be broken down into the following steps.
(I) Convert the current round of user text u into a value-specific embedding matrix f_1(v, u).
First, the text is segmented into words. Suppose the current round of user text contains k_u words u_1, u_2, …, u_{k_u}, and the word embedding of each word has dimension d_ms. X ∈ R^{k_u × d_ms} denotes the word embedding matrix of the current round of user text u. A word embedding is a vector used to represent an individual word. Compared with traditional one-hot coding, word embeddings have lower dimensionality, and the word vectors of words with similar meanings lie relatively close together, which helps reduce overfitting and capture relations between words. Before each discriminator is trained, a dictionary of commonly used words is established, and the word embedding of each word is trained. To locate v in the sentence, the word vector of v in f_1(v, u) is replaced with an all-ones vector.
x_str(v, u) indicates whether each word in the current round of user text u is related to value v: a semantic dictionary containing words that express positive and negative sentiment is established, and each word of u is matched against its keywords. If the i-th word expresses positive sentiment, x_str(v, u)_i = 1; if it expresses negative sentiment, x_str(v, u)_i = -1; otherwise x_str(v, u)_i = 0. Concatenating X with x_str(v, u) gives the value-specific embedding matrix f_1(v, u) = [X, x_str(v, u)] ∈ R^{k_u × (d_ms + 1)}.
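The sentiment feature x_str(v, u) can be sketched with two tiny illustrative word lists standing in for the semantic dictionary of positive and negative sentiment words described above.

```python
POSITIVE = {"want", "like", "love"}      # illustrative sentiment dictionaries,
NEGATIVE = {"hate", "dislike", "avoid"}  # not the patent's actual word lists

def sentiment_feature(words):
    """x_str(v, u)_i: +1 for a positive-sentiment word, -1 for negative, else 0."""
    return [1 if w in POSITIVE else -1 if w in NEGATIVE else 0 for w in words]

feats = sentiment_feature(["i", "want", "Ge You", "movies"])
```

Appended as an extra column of the embedding matrix, this signed feature gives the value discriminator a direct cue for deciding between the "like" and "dislike" labels.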
(II) Extract a value-specific embedding vector from the value-specific embedding matrix f_1(v, u) with a convolutional neural network. L convolution filters each with window sizes of 1, 2, and 3 are used to convolve f_1(v, u); after a ReLU (Rectified Linear Unit) activation function and max pooling, 3 vectors are obtained, which can be regarded as feature representations of a unigram, bigram, and trigram model, respectively. Finally, the three vectors are concatenated to form the value-specific embedding vector.
As in fig. 6, only 3 convolution filters are drawn per n-gram model to save space.
(III) Obtain the semantic feature vector h from the embedding vector; for value labels no action vector is applied, and the computation is:
h = CNN(f_1(v, u)) (3)
(IV) Finally, feed the semantic feature vector into a fully connected network and compute the probability distribution p(η_v) ∈ R^2 of the value-specific label. The fully connected network is denoted FC_{η_v}; its hidden layer has the same number of nodes as its input layer, its output activation function is softmax, and its output is a 2-dimensional vector representing the probabilities that the value label is "like" and "dislike":
p(η_v) = FC_{η_v}(h)
Step S232: selecting the label with the highest probability in the probability distribution as the value label.
In step S232, the label with the highest probability value in the probability distribution is used as the value label, and the value label has two possible cases: "like" and "dislike".
In one embodiment, said step S232 is followed by:
step S24: and determining the current round of dialogue semantics according to the field, the slot, the value, the field label, the slot label and the value label.
In step S24, after the current round of text is parsed by the rich dialog state tracking rule, a domain label corresponding to the domain, a slot label corresponding to the slot, and a value label corresponding to the value are obtained. And combining the acquired field, slot, value, field label, slot label and value label to acquire the current round of dialogue semantics. If the current round text is "i want to see movies of gooey and bohai", the corresponding dialog state may be expressed as [ movie ═ mention, actor ═ mention, gooey ═ like, and bohai ═ like ], or "i do not want to see movies of gooey" may be expressed as [ movie ═ mention, actor ═ mention, gooey ═ dislike ] or simply expressed as [ movie (actor (gooey ═ dislike) ].
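The combination performed in step S24 can be sketched as follows; the input label dictionaries are an assumed representation matching the example above, not the patent's internal format.

```python
def turn_semantics(domain_labels, slot_labels, value_labels):
    """Keep only 'mentioned' domains and slots, with their value preferences,
    e.g. [movie = mentioned, actor = mentioned, Ge You = like]."""
    sem = {}
    for d, dl in domain_labels.items():
        if dl != "mentioned":
            continue
        sem[d] = {s: dict(value_labels.get(d, {}).get(s, {}))
                  for s, sl in slot_labels.get(d, {}).items()
                  if sl == "mentioned"}
    return sem

sem = turn_semantics(
    {"movie": "mentioned", "music": "not mentioned"},
    {"movie": {"actor": "mentioned", "genre": "not mentioned"}},
    {"movie": {"actor": {"Ge You": "like", "Huang Bo": "like"}}},
)
```

The pruning mirrors the hierarchical early-stopping of steps S213 and S223: unmentioned domains and slots contribute nothing to the round's semantics.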
It should be understood that although the steps in the flowchart of fig. 2 are shown sequentially as indicated by the arrows, they are not necessarily performed in that order. Unless explicitly stated otherwise herein, the steps are not strictly limited to the order shown and may be performed in other orders. Moreover, at least some of the steps in fig. 2 may include multiple sub-steps or stages, which are not necessarily completed at the same moment but may be performed at different times; their order of execution is not necessarily sequential, and they may be performed in turn or alternately with other steps or with at least some of the sub-steps or stages of other steps.
In one embodiment, as shown in fig. 9, a dialog state tracking apparatus is provided, comprising an acquisition module, a processing module, and an updating module, wherein:
the text acquisition module 1 is used for acquiring a current-round dialog text;
the text processing module 2 is used for determining the current round of dialog semantics according to the dialog text and the rich dialog state tracking rule;
and the state updating module 3 is used for updating the current-round dialog state according to the dialog semantics and the previous-round dialog state.
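The three-module pipeline above (acquire text, derive per-turn semantics, update the cumulative state) could be sketched as follows. The class, its method names, and the trivial stand-in parse function are illustrative assumptions, not the patent's implementation.

```python
class DialogStateTracker:
    """Minimal sketch: text processing produces per-turn semantics,
    then the state update merges them into the previous-round state."""

    def __init__(self, parse_fn):
        self.parse_fn = parse_fn   # stands in for the rich DST rules
        self.state = {}            # previous-round dialog state

    def track(self, utterance, system_act):
        semantics = self.parse_fn(utterance, system_act)  # text processing
        self.state.update(semantics)                      # state update
        return self.state

# Hypothetical parse function for demonstration only.
tracker = DialogStateTracker(lambda text, act: {"actor": "mentioned"})
result = tracker.track("I want to see a movie", "request(actor)")
print(result)
```

Because `update` merges new semantics over the old state, slots mentioned in earlier rounds persist unless the current round overrides them, which matches the cumulative nature of dialog state tracking.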
In one embodiment, the text processing module 2 includes:
and the first processing module 21 is configured to parse the dialog text according to the rich dialog state tracking rule and obtain a domain label of the dialog text.
In one embodiment, the first processing module 21 includes:
the second processing module 211 is configured to acquire the probability distribution of the label corresponding to each domain according to the dialog text and the current system behavior;
a first selecting module 212, configured to select a label with a maximum probability value in the probability distribution as a domain label.
In one embodiment, after the first selecting module 212, the apparatus further comprises:
a first judging module 213, configured to judge whether the domain label is a preset label,
and the third processing module 22 is configured to, if the judgment is yes, parse the dialog text according to the rich dialog state tracking rule and obtain a slot label of the dialog text.
In one embodiment, the third processing module 22 includes:
a fourth processing module 221, configured to acquire, according to the dialog text and the current system behavior, the probability distribution of the label corresponding to each informable slot in the domain;
a second selecting module 222, configured to select a label with a highest probability value in the probability distribution as the slot label.
In one embodiment, after the second selecting module 222, the apparatus further comprises:
a second judging module 223 for judging whether the slot label is a preset label,
and the fifth processing module 23 is configured to, if the judgment is yes, parse the dialog text according to the rich dialog state tracking rule and obtain a value label of the dialog text.
In one embodiment, the fifth processing module 23 includes:
a sixth processing module 231, configured to acquire, according to the dialog text and the current system behavior, the probability distribution of the label corresponding to each possible value in the slot;
a third selecting module 232, configured to select a label with a highest probability in the probability distribution as the value label.
In one embodiment, after the third selecting module 232, the apparatus further comprises:
and the seventh processing module 24 is configured to determine the current-round dialog semantics according to the domain, the slot, the value, the domain label, the slot label, and the value label.
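The cascaded decision implied by modules 21 through 24 — a domain's slots are examined only when the domain label equals a preset label, and a slot's values only when the slot label does — might look like this sketch. All function names, the dictionary layouts, and the use of "mentioned" as the preset trigger are assumptions built on the labels named in the text.

```python
def argmax_label(dist):
    # Pick the label with the highest probability in the distribution.
    return max(dist, key=dist.get)

def cascaded_parse(domain_dists, slot_dists, value_dists, preset="mentioned"):
    """Hypothetical cascade: descend from a domain to its slots, and from
    a slot to its values, only when the enclosing label is the preset."""
    semantics = {}
    for domain, ddist in domain_dists.items():
        dlabel = argmax_label(ddist)
        semantics[domain] = {"label": dlabel}
        if dlabel != preset:
            continue                       # domain not mentioned: skip slots
        for slot, sdist in slot_dists.get(domain, {}).items():
            slabel = argmax_label(sdist)
            semantics[domain][slot] = {"label": slabel}
            if slabel != preset:
                continue                   # slot not mentioned: skip values
            for value, vdist in value_dists.get((domain, slot), {}).items():
                semantics[domain][slot][value] = argmax_label(vdist)
    return semantics

sem = cascaded_parse(
    {"movie": {"mentioned": 0.9, "not mentioned": 0.1}},
    {"movie": {"actor": {"mentioned": 0.8, "not mentioned": 0.2}}},
    {("movie", "actor"): {"gooey": {"like": 0.7, "dislike": 0.3}}},
)
print(sem)
```

The early `continue` statements mirror the "judge whether the label is a preset label" checks: classification work for slots and values is done only inside branches that were actually mentioned.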
For the specific definition of the dialog state tracking apparatus, reference may be made to the definition of the dialog state tracking method above, which is not repeated here. Each module in the dialog state tracking apparatus may be implemented in whole or in part by software, hardware, or a combination thereof. The modules may be embedded in hardware form in, or independent of, a processor in the computer device, or stored in software form in a memory of the computer device, so that the processor can invoke and execute the operations corresponding to each module.
In one embodiment, a computer device is provided, which may be a server, and its internal structure diagram may be as shown in fig. 10. The computer device includes a processor, a memory, a network interface, and a database connected by a system bus. Wherein the processor of the computer device is configured to provide computing and control capabilities. The memory of the computer device comprises a nonvolatile storage medium and an internal memory. The non-volatile storage medium stores an operating system, a computer program, and a database. The internal memory provides an environment for the operation of an operating system and computer programs in the non-volatile storage medium. The database of the computer device is used for storing dialogue state processing data. The network interface of the computer device is used for communicating with an external terminal through a network connection. The computer program is executed by a processor to implement a method for tracking dialog states.
Those skilled in the art will appreciate that the architecture shown in fig. 10 is merely a block diagram of some of the structures associated with the disclosed aspects and does not limit the computer devices to which the disclosed aspects apply; a particular computer device may include more or fewer components than shown, combine certain components, or have a different arrangement of components.
In an embodiment, a computer device is provided, comprising a memory in which a computer program is stored and a processor which, when executing the computer program, realizes the steps of the method as described above.
In an embodiment, a computer-readable storage medium is provided, on which a computer program is stored, which computer program, when being executed by a processor, carries out the steps of the method as described above.
It will be understood by those skilled in the art that all or part of the processes of the methods of the embodiments described above can be implemented by a computer program instructing related hardware; the computer program can be stored in a non-volatile computer-readable storage medium and, when executed, can include the processes of the embodiments of the methods described above. Any reference to memory, storage, database, or other medium used in the embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), synchronous link DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM).
The technical features of the above embodiments can be combined arbitrarily. For the sake of brevity, not all possible combinations of the technical features in the above embodiments are described, but any such combination should be considered within the scope of this specification as long as it contains no contradiction.
The above-mentioned embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of this patent shall be subject to the appended claims.

Claims (11)

1. A dialog state tracking method, the method comprising:
acquiring a current-round dialog text;
determining the current-round dialog semantics according to the dialog text and a rich dialog state tracking rule; wherein the rich dialog state tracking rule comprises: defining tracking variables for the domain entities, slot entities, and value entities in the current round of dialog to obtain a domain label for each domain, a slot label for each slot, and a value label for each value; the slot entities are included in the domain entities, and the domain label comprises: "mentioned", "not mentioned"; wherein determining the domain label comprises: converting the dialog text into a domain-specific embedding matrix; converting the current-round system behavior into a domain-specific action vector; extracting a domain-specific embedding vector from the domain-specific embedding matrix using a convolutional neural network; constraining the domain-specific embedding vector with the domain-specific action vector through a gate mechanism to obtain a semantic feature vector; and determining the domain label according to the semantic feature vector;
and updating the current-round dialog state according to the dialog semantics and a previous-round dialog state.
2. The method of claim 1, wherein determining the current-round dialog semantics according to the dialog text and the rich dialog state tracking rule comprises:
parsing the dialog text according to the rich dialog state tracking rule to obtain a domain label of the dialog text.
3. The method of claim 2, wherein parsing the dialog text according to the rich dialog state tracking rule to obtain a domain label of the dialog text comprises:
acquiring the probability distribution of the label corresponding to each domain according to the dialog text and the current system behavior;
and selecting the label with the maximum probability value in the probability distribution as the domain label.
4. The method of claim 3, wherein selecting the label with the highest probability value in the probability distribution as the domain label comprises:
judging whether the domain label is a preset label or not,
and if so, parsing the dialog text according to the rich dialog state tracking rule to obtain a slot label of the dialog text.
5. The method of claim 4, wherein parsing the dialog text according to the rich dialog state tracking rule to obtain a slot label of the dialog text comprises:
acquiring the probability distribution of the label corresponding to each informable slot in the domain according to the dialog text and the current system behavior;
and selecting the label with the highest probability value in the probability distribution as the slot label.
6. The method of claim 5, wherein the selecting the tag with the highest probability value in the probability distribution as the slot tag comprises:
judging whether the slot label is a preset label or not,
and if so, parsing the dialog text according to the rich dialog state tracking rule to obtain a value label of the dialog text.
7. The method of claim 6, wherein parsing the dialog text according to the rich dialog state tracking rule to obtain a value label of the dialog text comprises:
acquiring the probability distribution of the label corresponding to each possible value in the slot according to the dialog text and the current system behavior;
and selecting the label with the highest probability in the probability distribution as the value label.
8. The method of claim 7, wherein selecting the label with the highest probability in the probability distribution as the value label comprises:
determining the current-round dialog semantics according to the domain, the slot, the value, the domain label, the slot label, and the value label.
9. A dialog state tracking apparatus, the apparatus comprising:
the text acquisition module is used for acquiring a current-round dialog text;
the text processing module is used for determining the current-round dialog semantics according to the dialog text and a rich dialog state tracking rule; wherein the rich dialog state tracking rule comprises: defining tracking variables for the domain entities, slot entities, and value entities in the current round of dialog to obtain a domain label for each domain, a slot label for each slot, and a value label for each value; the slot entities are included in the domain entities, and the domain label comprises: "mentioned", "not mentioned"; wherein determining the domain label comprises: converting the dialog text into a domain-specific embedding matrix; converting the current-round system behavior into a domain-specific action vector; extracting a domain-specific embedding vector from the domain-specific embedding matrix using a convolutional neural network; constraining the domain-specific embedding vector with the domain-specific action vector through a gate mechanism to obtain a semantic feature vector; and determining the domain label according to the semantic feature vector;
and the state updating module is used for updating the current-round dialog state according to the dialog semantics and the previous-round dialog state.
10. A computer device comprising a memory and a processor, the memory storing a computer program, wherein the processor implements the steps of the method of any one of claims 1 to 8 when executing the computer program.
11. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the steps of the method of any one of claims 1 to 8.
CN201811131847.8A 2018-09-27 2018-09-27 Dialog state tracking method and device, computer equipment and storage medium Active CN109460450B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811131847.8A CN109460450B (en) 2018-09-27 2018-09-27 Dialog state tracking method and device, computer equipment and storage medium


Publications (2)

Publication Number Publication Date
CN109460450A CN109460450A (en) 2019-03-12
CN109460450B true CN109460450B (en) 2021-07-09

Family

ID=65607017

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811131847.8A Active CN109460450B (en) 2018-09-27 2018-09-27 Dialog state tracking method and device, computer equipment and storage medium

Country Status (1)

Country Link
CN (1) CN109460450B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11526674B2 (en) * 2019-03-01 2022-12-13 Rakuten Group, Inc. Sentence extraction system, sentence extraction method, and information storage medium
CN110096516B (en) * 2019-03-25 2022-01-28 北京邮电大学 User-defined database interaction dialog generation method and system
CN110245221B (en) * 2019-05-13 2023-05-23 华为技术有限公司 Method and computer device for training dialogue state tracking classifier
CN111475616B (en) * 2020-03-13 2023-08-22 平安科技(深圳)有限公司 Multi-round dialogue method and device based on dialogue state prediction and computer equipment
CN112380875A (en) * 2020-11-18 2021-02-19 杭州大搜车汽车服务有限公司 Conversation label tracking method, device, electronic device and storage medium

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102411563B (en) * 2010-09-26 2015-06-17 阿里巴巴集团控股有限公司 Method, device and system for identifying target words
CN108536679B (en) * 2018-04-13 2022-05-20 腾讯科技(成都)有限公司 Named entity recognition method, device, equipment and computer readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Tracking of Enriched Dialog States for Flexible Conversational Information Access; Yinpei Dai et al.; 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2018-09-13; pp. 6139-6143 *



Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant