CN111177331B

CN111177331B - Dialog intention recognition method and device

Info

Publication number: CN111177331B
Application number: CN201911168331.5A
Authority: CN
Inventors: 曾祥荣
Original assignee: Unisound Intelligent Technology Co Ltd
Current assignee: Unisound Intelligent Technology Co Ltd
Priority date: 2019-11-25
Filing date: 2019-11-25
Publication date: 2023-04-18
Anticipated expiration: 2039-11-25
Also published as: CN111177331A

Abstract

The invention discloses a dialog intention recognition method and device. The method comprises: counting the probability of occurrence of each historical user intention in historical dialogue data to serve as a prior knowledge sequence; predicting, according to a preset classification model, the probability q1 that the currently input text belongs to user intention a_i; calculating, by using the prior knowledge sequence, the probability q2 that the currently input text belongs to user intention a_i; and selecting, according to the probability q1 and the probability q2, the intention with the highest probability as the current intention of the currently input text. Compared with existing dialog intention recognition methods, the method combines the prior knowledge sequence with the model's prediction for the currently input text when predicting the user's current intention. This addresses the problem in the prior art that the user's intention cannot be accurately judged from only the text input at the current moment, and improves the user experience.

Description

Dialog intention recognition method and device

Technical Field

The invention relates to the technical field of artificial intelligence, and in particular to a dialog intention recognition method and device.

Background

With the development of search engine technology, modern search engines, question-answering systems and dialogue robots are expected not merely to retrieve relevant information but to deeply understand users' information needs so as to provide the required services accurately. Correctly recognizing the user's intention is a key step toward this goal.

The prior art dialog intention recognition methods mainly include methods based on keywords, template matching and machine learning models. However, in these methods, the intention of the user is determined only by using the text information input by the user at the current time in the conversation, and the current conversation intention of the user cannot be accurately determined.

Disclosure of Invention

In view of the above problem, the method identifies the user's current dialog intention by combining the prediction of the user intention from the currently input text with the probabilities of user intentions in historical dialogue data.

A dialog intention recognition method, comprising the steps of:

counting the probability of occurrence of each historical user intention in historical dialogue data to serve as a prior knowledge sequence;

predicting, according to a preset classification model, the probability q1 that the currently input text belongs to user intention a_i;

calculating, by using the prior knowledge sequence, the probability q2 that the currently input text belongs to user intention a_i;

and selecting, according to the probability q1 and the probability q2, the intention with the highest probability as the current intention of the currently input text.

Preferably, counting the probability of occurrence of each historical user intention in the historical dialogue data as the prior knowledge sequence comprises:

collecting historical conversation data by using a conversation log, and acquiring keywords in the historical conversation data;

determining each historical user intention according to the keywords;

labeling the dialogue sentences containing the intentions of all historical users in the historical dialogue data;

calculating the probability of user intention a_i occurring in the historical dialogue data by using the following formula:

$$p(a_i) = \frac{m_i}{\sum_{j=1}^{n} m_j}$$

wherein a_i is the i-th user intention among the n historical user intentions, m_i is the number of occurrences of user intention a_i in the historical dialogue data, and p(a_i) is the probability of user intention a_i occurring in the historical dialogue data.

Preferably, counting the probability of occurrence of each historical user intention in the historical dialogue data as the prior knowledge sequence further comprises:

calculating the probability of user intention a_i occurring under a first specific condition in the historical dialogue data according to the following formula:

$$p(a_i \mid a_j) = \frac{m_{ij}}{m_j}$$

wherein a_j is the j-th user intention among the historical user intentions, m_ij is the number of times a_i occurs after a_j in the historical dialogue data, and m_j is the number of occurrences of a_j in the historical dialogue data;

calculating the probability of user intention a_i occurring under a second specific condition in the historical dialogue data according to the following formula:

$$p(a_i \mid a_j, a_k) = \frac{m_{ijk}}{m_{jk}}$$

wherein a_k is the k-th user intention among the historical user intentions, m_ijk is the number of times a_i occurs after a_j and a_k in the historical dialogue data, and m_jk is the number of times a_j and a_k occur together in the historical dialogue data.

Preferably, calculating, by using the prior knowledge sequence, the probability q2 that the currently input text belongs to user intention a_i comprises:

predicting, according to the prior knowledge sequence, the probability q2 that the currently input text belongs to user intention a_i;

if the sequence length is greater than or equal to a first preset threshold, q2 = p(a_i | b_1, b_2);

if the sequence length is equal to a second preset threshold, q2 = p(a_i | b_1);

if the sequence length is equal to a third preset threshold, q2 = p(a_i);

wherein b_1 and b_2 are historical intentions of the dialog to which the currently input text belongs.

Preferably, according to the probability q1 and the probability q2, selecting the intention with the highest probability as the current intention of the currently input text comprises:

calculating, by using the following formula, the total probability q that the currently input text belongs to user intention a_i:

q = q1 × q2;

and outputting the user intention with the maximum total probability q as the current intention of the currently input text.

A dialog intention recognition apparatus, the apparatus comprising:

a statistical module for counting the probability of occurrence of each historical user intention in historical dialogue data to serve as a prior knowledge sequence;

a prediction module for predicting, according to a preset classification model, the probability q1 that the currently input text belongs to user intention a_i;

a calculation module for calculating, by using the prior knowledge sequence, the probability q2 that the currently input text belongs to user intention a_i;

and a selection module for selecting, according to the probability q1 and the probability q2, the intention with the highest probability as the current intention of the currently input text.

Preferably, the statistical module includes:

the acquisition submodule is used for collecting historical dialogue data by using the dialogue logs and acquiring keywords in the historical dialogue data;

the determining submodule is used for determining each historical user intention according to the keywords;

the marking submodule is used for marking the dialogue sentences containing the intentions of all historical users in the historical dialogue data;

a first calculation submodule for calculating the probability of user intention a_i occurring in the historical dialogue data by using the following formula:

$$p(a_i) = \frac{m_i}{\sum_{j=1}^{n} m_j}$$

wherein a_i is the i-th user intention among the n historical user intentions, m_i is the number of occurrences of user intention a_i in the historical dialogue data, and p(a_i) is the probability of user intention a_i occurring in the historical dialogue data.

Preferably, the statistical module further includes:

a second calculation submodule for calculating the probability of user intention a_i occurring under a first specific condition in the historical dialogue data according to the following formula:

$$p(a_i \mid a_j) = \frac{m_{ij}}{m_j}$$

wherein a_j is the j-th user intention among the historical user intentions, m_ij is the number of times a_i occurs after a_j in the historical dialogue data, and m_j is the number of occurrences of a_j in the historical dialogue data;

a third calculation submodule for calculating the probability of user intention a_i occurring under a second specific condition in the historical dialogue data according to the following formula:

$$p(a_i \mid a_j, a_k) = \frac{m_{ijk}}{m_{jk}}$$

wherein a_k is the k-th user intention among the historical user intentions, m_ijk is the number of times a_i occurs after a_j and a_k in the historical dialogue data, and m_jk is the number of times a_j and a_k occur together in the historical dialogue data.

Preferably, the calculation module includes:

a prediction submodule for predicting, according to the prior knowledge sequence, the probability q2 that the currently input text belongs to user intention a_i;

a first output submodule for outputting q2 = p(a_i | b_1, b_2) if the sequence length is greater than or equal to a first preset threshold, outputting q2 = p(a_i | b_1) if the sequence length is equal to a second preset threshold, and outputting q2 = p(a_i) if the sequence length is equal to a third preset threshold.

Preferably, the selection module includes:

a fourth calculation submodule for calculating, by using the following formula, the total probability q that the currently input text belongs to user intention a_i:

q = q1 × q2;

and the second output submodule is used for outputting the user intention with the maximum total probability q as the current intention of the currently input text.

Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.

The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.

Drawings

FIG. 1 is a flowchart illustrating a dialog intention recognition method according to the present invention;

FIG. 2 is another flowchart illustrating a dialog intention recognition method according to the present invention;

FIG. 3 is a block diagram of a dialog intention recognition device according to the present invention;

FIG. 4 is another structural diagram of a dialog intention recognition device according to the present invention.

Detailed Description

Reference will now be made in detail to the exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, like numbers in different drawings represent the same or similar elements unless otherwise indicated. The implementations described in the exemplary embodiments below are not intended to represent all implementations consistent with the present disclosure. Rather, they are merely examples of apparatus and methods consistent with certain aspects of the present disclosure, as detailed in the appended claims.

The prior art dialog intention recognition methods mainly include methods based on keywords, template matching and machine learning models. However, in these methods, the intention of the user is determined only by using the text information input by the user at the current time in the conversation, and the current conversation intention of the user cannot be accurately determined. In order to solve the above-described problems, the present embodiment provides a method of identifying a current dialog intention of a user based on predicting the user intention in a currently input text using a probability of the user intention in historical dialog data.

A dialog intention recognition method, as shown in fig. 1, comprising the steps of:

Step S101, counting the probability of occurrence of each historical user intention in historical dialogue data to serve as a prior knowledge sequence;

Step S102, predicting, according to a preset classification model, the probability q1 that the currently input text belongs to user intention a_i;

Step S103, calculating, by using the prior knowledge sequence, the probability q2 that the currently input text belongs to user intention a_i;

Step S104, selecting, according to the probability q1 and the probability q2, the intention with the highest probability as the current intention of the currently input text.

The working principle of the above technical scheme is as follows: the probability of occurrence of each historical intention in the user's historical dialogue data is counted and used as a prior knowledge sequence; the probability q1 that the currently input text belongs to user intention a_i is predicted according to the trained model; the probability q2 that the currently input text belongs to user intention a_i is then calculated by using the prior knowledge sequence; and, according to the probability q1 and the probability q2, the intention with the highest probability is selected as the current intention of the currently input text.

The beneficial effects of the above technical scheme are: compared with existing dialog intention recognition methods, the prior knowledge sequence is combined with the trained model's predicted probability that the currently input text belongs to a given user intention when predicting the user's current intention. This addresses the problem in the prior art that the user's intention cannot be accurately judged from only the text input at the current moment, and improves the user experience.

In one embodiment, as shown in fig. 2, counting the probability of occurrence of each historical user intention in the historical dialogue data as a priori knowledge sequence includes:

step S201, collecting historical dialogue data by using a dialogue log, and acquiring keywords in the historical dialogue data;

step S202, determining each historical user intention according to the keywords;

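Step S203, labeling the dialogue sentences in the historical dialogue data that contain each historical user intention;

Step S204, calculating the probability of user intention a_i occurring in the historical dialogue data by using the following formula:

$$p(a_i) = \frac{m_i}{\sum_{j=1}^{n} m_j}$$

wherein a_i is the i-th user intention among the n historical user intentions, m_i is the number of occurrences of user intention a_i in the historical dialogue data, and p(a_i) is the probability of user intention a_i occurring in the historical dialogue data.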

The beneficial effects of the above technical scheme are: each historical user intention in the user's historical dialogue data is collected accurately, the probability of each user intention is calculated according to the formula, and the calculated probabilities serve as the prior knowledge sequence for predicting the user intention of the currently input text.
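Steps S201 to S204 can be illustrated with a minimal Python sketch. It assumes that an intention is assigned by simple keyword lookup and that p(a_i) is the count of a_i divided by the total number of labeled sentences; the keyword table `KEYWORD_TO_INTENT`, the function names and the sample sentences are hypothetical and are not taken from the patent.

```python
from collections import Counter

# Hypothetical keyword table; the patent does not specify the keywords.
KEYWORD_TO_INTENT = {
    "hello": "greeting",
    "flight": "flight",
    "how much": "price",
    "when": "time",
}


def label_sentence(sentence):
    """Return the first intent whose keyword occurs in the sentence, else None."""
    text = sentence.lower()
    for keyword, intent in KEYWORD_TO_INTENT.items():
        if keyword in text:
            return intent
    return None


def intent_prior(dialog_logs):
    """Compute p(a_i) = m_i / (total number of labeled sentences)."""
    counts = Counter()
    for dialog in dialog_logs:
        for sentence in dialog:
            intent = label_sentence(sentence)
            if intent is not None:
                counts[intent] += 1  # m_i for this intent
    total = sum(counts.values())
    if total == 0:
        return {}
    return {intent: m / total for intent, m in counts.items()}


# Made-up dialogue logs for illustration only.
logs = [
    ["Hello", "Is there a flight to Beijing?", "How much is it?"],
    ["Any flight tomorrow?", "When does it leave?", "How much is it?"],
]
prior = intent_prior(logs)
```

In this toy log, "flight" and "price" each account for two of the six labeled sentences, so each receives a prior of one third.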

In one embodiment, counting the probability of the occurrence of each historical user intention in the historical dialogue data as a priori knowledge sequence further comprises:

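calculating the probability of user intention a_i occurring under a first specific condition in the historical dialogue data according to the following formula:

$$p(a_i \mid a_j) = \frac{m_{ij}}{m_j}$$

wherein a_j is the j-th user intention among the historical user intentions, m_ij is the number of times a_i occurs after a_j in the historical dialogue data, and m_j is the number of occurrences of a_j in the historical dialogue data;

calculating the probability of user intention a_i occurring under a second specific condition in the historical dialogue data according to the following formula:

$$p(a_i \mid a_j, a_k) = \frac{m_{ijk}}{m_{jk}}$$

wherein a_k is the k-th user intention among the historical user intentions, m_ijk is the number of times a_i occurs after a_j and a_k in the historical dialogue data, and m_jk is the number of times a_j and a_k occur together in the historical dialogue data;

The first specific condition may be, specifically, that user intention a_j is given, and the second specific condition may be that user intention a_j and user intention a_k are given.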

The beneficial effects of the above technical scheme are: calculating the probability of a_i occurring under several conditions enriches the kinds of prior knowledge sequence available, so that the user intention can be predicted according to the number of historical intentions available for the currently input text. This narrows the prediction range to a certain extent, increases the kinds of prediction, and improves prediction accuracy.
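The two conditional probabilities can be estimated from labeled intention sequences as in the following sketch. It makes two assumptions that the text does not state explicitly: "occurs after" is read as "occurs immediately after", and a_j is the most recent preceding intention while a_k is the one before it. The function name and the sample sequences are hypothetical.

```python
from collections import Counter


def conditional_priors(intent_sequences):
    """Estimate p(a_i | a_j) = m_ij / m_j and p(a_i | a_j, a_k) = m_ijk / m_jk."""
    single = Counter()  # m_j: occurrences of each intent
    pair = Counter()    # (earlier, later) adjacent pairs; gives m_ij and m_jk
    triple = Counter()  # (earliest, middle, latest) adjacent triples; gives m_ijk
    for seq in intent_sequences:
        for t, current in enumerate(seq):
            single[current] += 1
            if t >= 1:
                pair[(seq[t - 1], current)] += 1
            if t >= 2:
                triple[(seq[t - 2], seq[t - 1], current)] += 1
    # Key (a_i, a_j): a_j immediately precedes a_i.
    p1 = {(cur, prev): m / single[prev] for (prev, cur), m in pair.items()}
    # Key (a_i, a_j, a_k): the intents occur in the order a_k, a_j, a_i.
    p2 = {
        (cur, prev1, prev2): m / pair[(prev2, prev1)]
        for (prev2, prev1, cur), m in triple.items()
    }
    return p1, p2


# Made-up labeled intention sequences, one list per dialog.
sequences = [
    ["greeting", "flight", "price"],
    ["greeting", "flight", "time", "price"],
    ["flight", "price"],
]
p1, p2 = conditional_priors(sequences)
```

Because m_j counts every occurrence of a_j, including those at the end of a dialog, the values p(a_i | a_j) for a fixed a_j need not sum to 1. This follows the definitions given above rather than a normalized transition matrix.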

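In one embodiment, calculating, by using the prior knowledge sequence, the probability q2 that the currently input text belongs to user intention a_i comprises:

if the sequence length is greater than or equal to a first preset threshold, q2 = p(a_i | b_1, b_2);

if the sequence length is equal to a second preset threshold, q2 = p(a_i | b_1);

if the sequence length is equal to a third preset threshold, q2 = p(a_i);

Specifically, the first preset threshold may be 2, the second preset threshold may be 1, and the third preset threshold may be 0.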

The beneficial effects of the above technical scheme are: the user's current intention is predicted according to how the currently input text combines with the prior knowledge sequence, so the user intention of the currently input text is predicted more effectively and, compared with the prior art, more accurately.
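The selection of q2 by history length can be sketched as follows, using the example thresholds 2, 1 and 0. The function name, the table layout (dictionaries keyed by tuples) and all probability values are assumptions made for illustration; the history list is assumed to hold the most recent intention first.

```python
def prior_probability(intent, history, p0, p1, p2):
    """Return q2 for `intent` given the dialog's historical intention sequence.

    history[0] is b_1 (the most recent intention) and history[1] is b_2.
    p0 maps a_i -> p(a_i); p1 maps (a_i, b_1) -> p(a_i | b_1);
    p2 maps (a_i, b_1, b_2) -> p(a_i | b_1, b_2).
    """
    if len(history) >= 2:  # first preset threshold: 2
        return p2.get((intent, history[0], history[1]), 0.0)
    if len(history) == 1:  # second preset threshold: 1
        return p1.get((intent, history[0]), 0.0)
    return p0.get(intent, 0.0)  # third preset threshold: 0


# Hypothetical probability tables.
p0 = {"price": 0.3}
p1 = {("price", "flight"): 0.6}
p2 = {("price", "flight", "greeting"): 0.5}

q2_long = prior_probability("price", ["flight", "greeting"], p0, p1, p2)
q2_short = prior_probability("price", ["flight"], p0, p1, p2)
q2_empty = prior_probability("price", [], p0, p1, p2)
```

In this sketch, a context that never appeared in the historical data yields 0.0, which would also zero the combined score. Any smoothing or fallback to a shorter context would be an addition beyond what the text describes.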

In one embodiment, selecting the most probable intent as the current intent of the currently input text based on the probabilities q1 and q2 comprises:

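calculating, by using the following formula, the total probability q that the currently input text belongs to user intention a_i:

q = q1 × q2;

and outputting the user intention with the largest total probability q as the current intention of the currently input text.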

The beneficial effects of the above technical scheme are: combining the probability predicted by the trained model with the probability predicted from the prior knowledge sequence allows the user intention of the currently input text to be predicted more accurately.
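A small Python sketch of this combination step is given below. The function name and the probability values are hypothetical; q1 stands for the classifier output and q2 for the prior derived from the dialog history.

```python
def select_intent(q1, q2):
    """Combine q = q1 * q2 per intent and return the intent with the largest q."""
    scores = {intent: q1[intent] * q2.get(intent, 0.0) for intent in q1}
    best = max(scores, key=scores.get)
    return best, scores


# Hypothetical values: the classifier slightly prefers "price",
# but the dialog history makes "time" much more likely.
q1 = {"price": 0.5, "time": 0.4, "flight": 0.1}
q2 = {"price": 0.2, "time": 0.7, "flight": 0.1}
best, scores = select_intent(q1, q2)
```

With these numbers the classifier alone would choose "price", while the product q1 × q2 selects "time". This illustrates how the prior knowledge sequence can change the decision.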

In one embodiment, the method comprises the following steps:

Step 1: Collect and label data.

Step 1.1: Collect a large amount of dialogue data by using information such as dialogue logs.

Step 1.2: Define the user intentions involved in the current domain. For example, in an air ticket booking task, the user intentions include greetings, fares, flights, time, price, and so on. Assume there are n different intentions in total, the i-th of which is denoted a_i.

Step 1.3: For part of the dialogue data, manually label each sentence in each dialog with the intention, among those defined in Step 1.2, to which it belongs.

Step 2: Acquire prior knowledge.

Step 2.1: From the data labeled in Step 1, count the number of occurrences of intention a_i, denoted m_i, and calculate the probability of each intention occurring:

$$p(a_i) = \frac{m_i}{\sum_{j=1}^{n} m_j}$$

Step 2.2: From the data labeled in Step 1, count the number of times intention a_i occurs after intention a_j, denoted m_ij, and calculate the probability of intention a_i occurring given intention a_j:

$$p(a_i \mid a_j) = \frac{m_{ij}}{m_j}$$

Step 2.3: From the data labeled in Step 1, count the number of times intention a_i occurs after intention a_j and intention a_k, denoted m_ijk, and calculate the probability of intention a_i occurring given intention a_j and intention a_k, where m_jk is the number of times a_j and a_k occur together:

$$p(a_i \mid a_j, a_k) = \frac{m_{ijk}}{m_{jk}}$$

Step 3: Fuse the prior knowledge into dialog intention recognition.

Step 3.1: From the text input by the user, use a classification model from machine learning to predict the probability q′(a_i) that the user's input belongs to intention a_i.

Step 3.2: From the intention sequence in the history of the current dialog, predict the probability q″(a_i) that the user's intention at the current moment is a_i. Assume the historical intention sequence of this dialog is [b_1, b_2, …], where b_1 denotes the intention at the previous moment. If the length of the historical intention sequence is greater than or equal to 2, q″(a_i) = p(a_i | b_1, b_2). If the length is 1, q″(a_i) = p(a_i | b_1). If the length is 0, q″(a_i) = p(a_i).

Step 3.3: Predict the probability that the user's input belongs to intention a_i as q(a_i) = q′(a_i) × q″(a_i).

Step 3.4: Select the intention with the highest probability as the user's intention.

The working principle and beneficial effects of the above technical scheme are: a dialog intention recognition method that fuses prior knowledge is provided, in which intention recognition is regarded as a Markov process; that is, when the current user intention is predicted, the preceding intentions are taken into account so that the prediction is more accurate. Compared with the prior art, which predicts the user intention only from the current text by using a trained model, the method of the invention additionally incorporates prior knowledge.
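As a compact restatement of Steps 3.2 to 3.4, the decision rule can be written as below. The symbol \hat{a} for the selected intention and the symbol L for the length of the historical intention sequence are notation introduced here for illustration; they do not appear in the original text.

```latex
\hat{a} = \arg\max_{a_i} \; q'(a_i)\, q''(a_i),
\qquad
q''(a_i) =
\begin{cases}
p(a_i \mid b_1, b_2), & L \ge 2,\\
p(a_i \mid b_1),      & L = 1,\\
p(a_i),               & L = 0.
\end{cases}
```

Written this way, the prior term depends on at most the two most recent intentions of the dialog, however long the history is.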

The present embodiment also provides a dialog intention recognition apparatus, as shown in fig. 3, the apparatus including:

a statistical module 301, configured to count probabilities of occurrence of each historical user intention in the historical dialog data as a priori knowledge sequence;

a prediction module 302, configured to predict, according to a preset classification model, the probability q1 that the currently input text belongs to user intention a_i;

a calculating module 303, configured to calculate, by using the prior knowledge sequence, the probability q2 that the currently input text belongs to user intention a_i;

and a selecting module 304, configured to select, according to the probability q1 and the probability q2, the intention with the highest probability as the current intention of the currently input text.
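One way to picture how the four modules fit together is the following Python sketch, in which each method stands for one module. It is a simplified illustration and not the patented device: the class and method names are hypothetical, the classifier is an arbitrary callable supplied by the caller, and for brevity the calculation step uses only the unconditional prior p(a_i).

```python
from collections import Counter


class DialogIntentRecognizer:
    """Toy composition of the statistical, prediction, calculation and selection modules."""

    def __init__(self, classifier):
        # Hypothetical classifier: a callable mapping text -> {intent: q1}.
        self.classifier = classifier
        self.prior = {}

    def fit_statistics(self, labeled_dialogs):
        """Statistical module (301): count intent frequencies as prior knowledge."""
        counts = Counter(i for dialog in labeled_dialogs for i in dialog)
        total = sum(counts.values())
        self.prior = {i: m / total for i, m in counts.items()} if total else {}

    def predict_q1(self, text):
        """Prediction module (302): classifier probabilities for the current text."""
        return self.classifier(text)

    def compute_q2(self, intent):
        """Calculation module (303): simplified here to the unconditional prior."""
        return self.prior.get(intent, 0.0)

    def select(self, text):
        """Selection module (304): pick the intent maximizing q1 * q2."""
        q1 = self.predict_q1(text)
        scores = {i: p * self.compute_q2(i) for i, p in q1.items()}
        return max(scores, key=scores.get)


# Stub classifier and made-up labeled data for illustration.
recognizer = DialogIntentRecognizer(lambda text: {"price": 0.6, "time": 0.4})
recognizer.fit_statistics([["time", "time", "price"], ["time"]])
chosen = recognizer.select("what about tomorrow")
```

Here the stub classifier favors "price", but "time" dominates the labeled history (three of four labels), so the combined score selects "time".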

In one embodiment, as shown in fig. 4, the statistics module includes:

the obtaining sub-module 401 is configured to collect historical dialogue data by using the dialogue log, and obtain a keyword in the historical dialogue data;

a determining submodule 402 configured to determine each historical user intention according to the keyword;

a labeling submodule 403, configured to label the dialogue sentences in the historical dialogue data that contain each historical user intention;

a first calculation submodule 404, configured to calculate the probability of user intention a_i occurring in the historical dialogue data by using the following formula:

$$p(a_i) = \frac{m_i}{\sum_{j=1}^{n} m_j}$$

wherein a_i is the i-th user intention among the n historical user intentions, m_i is the number of occurrences of user intention a_i in the historical dialogue data, and p(a_i) is the probability of user intention a_i occurring in the historical dialogue data.

In one embodiment, the statistics module further comprises:

a second calculation submodule for calculating the probability of user intention a_i occurring under a first specific condition in the historical dialogue data according to the following formula:

$$p(a_i \mid a_j) = \frac{m_{ij}}{m_j}$$

wherein a_j is the j-th user intention among the historical user intentions, m_ij is the number of times a_i occurs after a_j in the historical dialogue data, and m_j is the number of occurrences of a_j in the historical dialogue data;

a third calculation submodule for calculating the probability of user intention a_i occurring under a second specific condition in the historical dialogue data according to the following formula:

$$p(a_i \mid a_j, a_k) = \frac{m_{ijk}}{m_{jk}}$$

wherein a_k is the k-th user intention among the historical user intentions, m_ijk is the number of times a_i occurs after a_j and a_k in the historical dialogue data, and m_jk is the number of times a_j and a_k occur together in the historical dialogue data.

In one embodiment, the calculation module includes:

a prediction submodule for predicting, according to the prior knowledge sequence, the probability q2 that the currently input text belongs to user intention a_i;

a first output submodule for outputting q2 = p(a_i | b_1, b_2) if the sequence length is greater than or equal to a first preset threshold, outputting q2 = p(a_i | b_1) if the sequence length is equal to a second preset threshold, and outputting q2 = p(a_i) if the sequence length is equal to a third preset threshold.

In one embodiment, the selection module includes:

a fourth calculation submodule for calculating, by using the following formula, the total probability q that the currently input text belongs to user intention a_i:

q = q1 × q2;

and the second output sub-module is used for outputting the user intention with the maximum total probability q as the current intention of the currently input text.

Other embodiments of the disclosure will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure disclosed herein. This application is intended to cover any variations, uses, or adaptations of the disclosure following, in general, the principles of the disclosure and including such departures from the present disclosure as come within known or customary practice in the art to which the disclosure pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the disclosure being indicated by the following claims.

It will be understood that the present disclosure is not limited to the precise arrangements described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present disclosure is limited only by the appended claims.

Claims

1. A dialog intention recognition method, comprising the steps of:

counting the probability of occurrence of each historical user intention in historical dialogue data to serve as a prior knowledge sequence;

predicting, according to a preset classification model, the probability q1 that the currently input text belongs to user intention a_i;

calculating, by using the prior knowledge sequence, the probability q2 that the currently input text belongs to user intention a_i;

according to the probability q1 and the probability q2, selecting an intention with the highest probability as a current intention of the currently input text;

wherein counting the probability of occurrence of each historical user intention in the historical dialogue data as a prior knowledge sequence comprises:

collecting the historical dialogue data by using a dialogue log, and acquiring keywords in the historical dialogue data;

determining each historical user intention according to the keywords;

labeling the dialogue sentences in the historical dialogue data that contain each historical user intention;

calculating the probability of user intention a_i occurring in the historical dialogue data by using the following formula:

$$p(a_i) = \frac{m_i}{\sum_{j=1}^{n} m_j}$$

wherein a_i is the i-th user intention among the n historical user intentions, m_i is the number of occurrences of the user intention a_i in the historical dialogue data, and p(a_i) is the probability of the user intention a_i occurring in the historical dialogue data;

wherein counting the probability of occurrence of each historical user intention in the historical dialogue data as a prior knowledge sequence further comprises:

calculating the probability of user intention a_i occurring under a first specific condition in the historical dialogue data according to the following formula:

$$p(a_i \mid a_j) = \frac{m_{ij}}{m_j}$$

wherein a_j is the j-th user intention among the historical user intentions, m_ij is the number of times a_i occurs after a_j in the historical dialogue data, and m_j is the number of occurrences of a_j in the historical dialogue data;

calculating the probability of user intention a_i occurring under a second specific condition in the historical dialogue data according to the following formula:

$$p(a_i \mid a_j, a_k) = \frac{m_{ijk}}{m_{jk}}$$

wherein a_k is the k-th user intention among the historical user intentions, m_ijk is the number of times a_i occurs after a_j and a_k in the historical dialogue data, and m_jk is the number of times a_j and a_k occur together in the historical dialogue data;

wherein calculating, by using the prior knowledge sequence, the probability q2 that the currently input text belongs to user intention a_i comprises:

if the sequence length is greater than or equal to a first preset threshold,

q2 = p(a_i | b_1, b_2);

if the sequence length is equal to a second preset threshold,

q2 = p(a_i | b_1);

if the sequence length is equal to a third preset threshold,

q2 = p(a_i);

wherein b_1 and b_2 are historical intentions of the dialog to which the currently input text belongs.

2. The dialog intention recognition method of claim 1 wherein selecting the most probable intention as the current intention of the currently input text based on the probability q1 and the probability q2 comprises:

calculating, by using the following formula, the total probability q that the currently input text belongs to user intention a_i:

q = q1 × q2;

3. A dialog intention recognition apparatus, characterized in that the apparatus comprises:

a statistic module for counting the probability of occurrence of each historical user intention in historical dialogue data to serve as a prior knowledge sequence;

a prediction module for predicting, according to a preset classification model, the probability q1 that the currently input text belongs to user intention a_i;

a calculation module for calculating, by using the prior knowledge sequence, the probability q2 that the currently input text belongs to user intention a_i;

the selection module is used for selecting the intention with the maximum probability as the current intention of the currently input text according to the probability q1 and the probability q2;

wherein, the statistic module comprises:

the acquisition sub-module is used for collecting the historical dialogue data by using a dialogue log and acquiring keywords in the historical dialogue data;

the determining submodule is used for determining the intentions of the historical users according to the keywords;

the marking submodule is used for marking the dialogue sentences containing the intentions of the historical user in the historical dialogue data;

a first calculation submodule for calculating the probability of user intention a_i occurring in the historical dialogue data by using the following formula:

$$p(a_i) = \frac{m_i}{\sum_{j=1}^{n} m_j}$$

wherein a_i is the i-th user intention among the n historical user intentions, m_i is the number of occurrences of the user intention a_i in the historical dialogue data, and p(a_i) is the probability of the user intention a_i occurring in the historical dialogue data;

wherein, the statistic module further comprises:

a second calculation submodule for calculating the probability of user intention a_i occurring under a first specific condition in the historical dialogue data according to the following formula:

$$p(a_i \mid a_j) = \frac{m_{ij}}{m_j}$$

wherein a_j is the j-th user intention among the historical user intentions, m_ij is the number of times a_i occurs after a_j in the historical dialogue data, and m_j is the number of occurrences of a_j in the historical dialogue data;

a third calculation submodule for calculating the probability of user intention a_i occurring under a second specific condition in the historical dialogue data according to the following formula:

$$p(a_i \mid a_j, a_k) = \frac{m_{ijk}}{m_{jk}}$$

wherein a_k is the k-th user intention among the historical user intentions, m_ijk is the number of times a_i occurs after a_j and a_k in the historical dialogue data, and m_jk is the number of times a_j and a_k occur together in the historical dialogue data;

wherein the computing module comprises:

a prediction submodule for predicting, according to the prior knowledge sequence, the probability q2 that the currently input text belongs to user intention a_i;

a first output submodule for outputting q2 = p(a_i | b_1, b_2) if the sequence length is greater than or equal to a first preset threshold, outputting q2 = p(a_i | b_1) if the sequence length is equal to a second preset threshold, and outputting q2 = p(a_i) if the sequence length is equal to a third preset threshold.

4. The dialog intent recognition device of claim 3 wherein the selection module comprises:

a fourth calculation submodule for calculating, by using the following formula, the total probability q that the currently input text belongs to user intention a_i:

q = q1 × q2;