CN110858226A

CN110858226A - Conversation management method and device

Info

Publication number: CN110858226A
Application number: CN201810893032.7A
Authority: CN
Inventors: 王颖帅; 李晓霞; 苗诗雨
Original assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Current assignee: Beijing Jingdong Century Trading Co Ltd; Beijing Jingdong Shangke Information Technology Co Ltd
Priority date: 2018-08-07
Filing date: 2018-08-07
Publication date: 2020-03-03

Abstract

The invention discloses a conversation management method and device, and relates to the technical field of computers. One embodiment of the method comprises: processing user input data to be processed according to a natural language understanding model to obtain characteristic information of the user input data to be processed; determining historical record information corresponding to the user input data to be processed, and determining the current conversation state according to the characteristic information and the historical record information; and generating response data corresponding to the user input data to be processed based on a neural network algorithm according to the current conversation state. In this embodiment, the characteristic information of the user input data is acquired by using the natural language understanding model, the current conversation state is then determined in combination with the historical record information, and the response data is then generated, so that the cost of designing and maintaining a large number of rule-based programs is reduced, the accuracy of recognizing the user input data is improved, and the user experience is improved.

Description

Conversation management method and device

Technical Field

The present invention relates to the field of computer technologies, and in particular, to a conversation management method and apparatus.

Background

In recent years, with continuous innovation and progress of artificial intelligence, the modes of purchasing and selling goods, providing or receiving services and performing other business activities of users are gradually changed to the mode of human-computer interaction. In the interaction process, the user intention is determined according to the request of the user by a conversation management method, and then the corresponding response result is fed back to the user. Therefore, the dialogue management method plays a crucial role in the human-computer interaction process, and the quality of the dialogue management method directly influences the human-computer interaction efficiency and the user experience.

The prior-art conversation management method is based on conversation template matching. Specifically, program code for matching sentence patterns is written in advance and used to match the user's input request; the sentence that the robot should reply with is then retrieved from a back-end database according to the matching result. For example, when the user says that he or she wants to buy a mobile phone, the machine asks which brand of mobile phone the user wants.

In the process of implementing the invention, the inventors found at least the following problems in the prior art. First, programmers need to develop a large number of rule-based dialogs to match the possible inputs of users, and the back-end server program needs to be rebuilt whenever new dialogs are added, so project management is not flexible enough. Second, because the program cannot match in advance every statement that a user may make, some inputs in the real online service cannot be matched.

Disclosure of Invention

In view of this, embodiments of the present invention provide a conversation management method and apparatus, which can reduce the cost of designing and maintaining a large number of rule-based programs, improve the accuracy of identifying user input data, and improve user experience.

To achieve the above object, according to an aspect of an embodiment of the present invention, there is provided a dialog management method.

The conversation management method of the embodiment of the invention comprises the following steps: processing user input data to be processed according to a natural language understanding model to acquire characteristic information of the user input data to be processed; determining historical record information corresponding to the user input data to be processed, and determining a current conversation state according to the characteristic information and the historical record information; and generating response data corresponding to the user input data to be processed based on a neural network algorithm according to the current conversation state.

Optionally, before analyzing the user input data to be processed according to the natural language understanding model, the method further comprises: obtaining a first sample set, wherein the first sample set comprises at least one user input sample data; labeling the user input sample data to obtain characteristic information of the user input sample data so as to generate a second sample set, wherein the second sample set comprises the characteristic information of the user input sample data; constructing a training sample set by using the first sample set and the second sample set; and training the training sample set to obtain the natural language understanding model, wherein the natural language understanding model inputs user input data and outputs characteristic information of the user input data.

Optionally, the history information includes: user historical input data, characteristic information of the user historical input data, and a dialogue state of the user historical input data, wherein the user historical input data are the preset n rounds of user input data preceding the user input data to be processed, and n is an integer not less than zero.

Optionally, determining the current dialog state according to the feature information and the history information includes: judging whether the user input data to be processed is related to the user historical input data or not according to an overlapping degree principle, a scene classification principle or a model correlation principle; when the user input data to be processed is related to the user historical input data, updating the current conversation state according to the conversation state of the user historical input data and the user input data to be processed; and when the user input data to be processed is not related to the user historical input data, generating a new conversation state according to the user input data to be processed, and determining the new conversation state as the current conversation state.

Optionally, the characteristic information includes slot value information; and judging whether the user input data to be processed and the user historical input data are related according to an overlapping degree principle comprises the following steps: and calculating the overlapping degree of the slot value information of the user input data to be processed and the slot value information of the user historical input data, judging whether the overlapping degree is greater than a preset overlapping degree, and if so, confirming that the user input data to be processed is related to the user historical input data.

Optionally, the feature information includes scene information; and judging whether the user input data to be processed and the user historical input data are related according to a scene classification principle comprises the following steps: judging whether the scene information of the user input data to be processed is the same as the scene information of the user historical input data, if so, considering that the scene of the user input data to be processed is the same as the scene of the user historical input data, and confirming that the user input data to be processed is related to the user historical input data.

Optionally, judging whether the to-be-processed user input data and the user historical input data are related according to a model correlation principle includes: and performing correlation analysis on the user input data to be processed and the user historical input data according to a pre-constructed context correlation model, and if the correlation analysis result is correlation, confirming that the user input data to be processed is correlated with the user historical input data.

Optionally, before performing a correlation analysis on the to-be-processed user input data and the user historical input data according to a pre-constructed context correlation model, the method further includes: and constructing a context correlation model based on a dynamic memory network algorithm.

Optionally, after determining whether the user input data to be processed and the user historical input data are related according to an overlapping degree principle, a scene classification principle, or a model correlation principle, the method further includes: calculating the gate weight of the user input data to be processed by utilizing a neural network gate function; calculating the correlation value of the user input data to be processed and the user historical input data according to the gate weight; and if the correlation value is larger than a preset correlation threshold value, confirming that the user input data to be processed is related to the user historical input data.

To achieve the above object, according to another aspect of the embodiments of the present invention, a conversation management apparatus is provided.

A session management apparatus according to an embodiment of the present invention includes: the acquisition module is used for processing the user input data to be processed according to the natural language understanding model and acquiring the characteristic information of the user input data to be processed; the determining module is used for determining historical record information corresponding to the user input data to be processed and determining the current conversation state according to the characteristic information and the historical record information; and the generating module is used for generating response data corresponding to the user input data to be processed based on a neural network algorithm according to the current conversation state.

Optionally, the obtaining module is further configured to: obtaining a first sample set, wherein the first sample set comprises at least one user input sample data; labeling the user input sample data to obtain characteristic information of the user input sample data so as to generate a second sample set, wherein the second sample set comprises the characteristic information of the user input sample data; constructing a training sample set by using the first sample set and the second sample set; and training the training sample set to obtain the natural language understanding model, wherein the natural language understanding model inputs user input data and outputs characteristic information of the user input data.

Optionally, the determining module is further configured to: judging whether the user input data to be processed is related to the user historical input data or not according to an overlapping degree principle, a scene classification principle or a model correlation principle; when the user input data to be processed is related to the user historical input data, updating the current conversation state according to the conversation state of the user historical input data and the user input data to be processed; and when the user input data to be processed is not related to the user historical input data, generating a new conversation state according to the user input data to be processed, and determining the new conversation state as the current conversation state.

Optionally, the characteristic information includes slot value information; and the determining module is further configured to: and calculating the overlapping degree of the slot value information of the user input data to be processed and the slot value information of the user historical input data, judging whether the overlapping degree is greater than a preset overlapping degree, and if so, confirming that the user input data to be processed is related to the user historical input data.

Optionally, the feature information includes scene information; and the determining module is further configured to: judging whether the scene information of the user input data to be processed is the same as the scene information of the user historical input data, if so, considering that the scene of the user input data to be processed is the same as the scene of the user historical input data, and confirming that the user input data to be processed is related to the user historical input data.

Optionally, the determining module is further configured to: and performing correlation analysis on the user input data to be processed and the user historical input data according to a pre-constructed context correlation model, and if the correlation analysis result is correlation, confirming that the user input data to be processed is correlated with the user historical input data.

Optionally, the determining module is further configured to: and constructing a context correlation model based on a dynamic memory network algorithm.

Optionally, the determining module is further configured to: calculating the gate weight of the user input data to be processed by utilizing a neural network gate function; calculating the correlation value of the user input data to be processed and the user historical input data according to the gate weight; and if the correlation value is larger than a preset correlation threshold value, confirming that the user input data to be processed is related to the user historical input data.

To achieve the above object, according to still another aspect of embodiments of the present invention, there is provided an electronic apparatus.

An electronic device of an embodiment of the present invention includes: one or more processors; the storage device is used for storing one or more programs, and when the one or more programs are executed by one or more processors, the one or more processors realize the conversation management method of the embodiment of the invention.

To achieve the above object, according to still another aspect of an embodiment of the present invention, there is provided a computer-readable medium.

A computer-readable medium of an embodiment of the present invention has stored thereon a computer program that, when executed by a processor, implements a dialog management method of an embodiment of the present invention.

One embodiment of the above invention has the following advantages or benefits: the characteristic information of the user input data is obtained by using the natural language understanding model, the current conversation state is then determined in combination with the historical record information, and the response data is then generated, so that the cost of designing and maintaining a large number of rule-based programs can be reduced, the accuracy of recognizing the user input data is improved, and the user experience is improved; in the embodiment of the invention, the training sample set consisting of the first sample set and the second sample set is trained to obtain the natural language understanding model, so that the natural language understanding model can be constructed from massive sample data, and the accuracy of identifying the user input data is improved; in the embodiment of the invention, the correlation between the user input data to be processed and the user historical input data is judged from multiple angles according to an overlapping degree principle, a scene classification principle or a model correlation principle, so that the accuracy of predicting the context relationship can be improved, and the user experience is further improved; in the embodiment of the invention, the overlapping degree of the user input data to be processed and the user historical input data is judged according to the slot value information, so that the context correlation can be predicted from the perspective of the keywords of the input data; in the embodiment of the invention, the correlation between the user input data to be processed and the user historical input data is judged according to the scene information, so that the context correlation can be predicted from the scene of the input data; in the embodiment of the invention, the correlation between the user input data to be processed and the user historical input data is judged according to the pre-constructed context correlation model, so that the context correlation can be predicted by means of a model generated from massive data; in the embodiment of the invention, the correlation value of the user input data to be processed and the user historical input data is calculated according to the neural network gate function, so that the accuracy of predicting the context correlation can be improved.

Further effects of the above non-conventional optional implementations will be described below in connection with the embodiments.

Drawings

The drawings are included to provide a better understanding of the invention and are not to be construed as unduly limiting the invention. Wherein:

FIG. 1 is a schematic diagram of the main steps of a dialog management method according to an embodiment of the invention;

FIG. 2 is a schematic diagram of slot value information under a class of cell phone products according to an embodiment of the invention;

FIG. 3 is a diagram illustrating a format of information for labeling a slot value according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of the main flow of a dialog management method according to a reference embodiment of the present invention;

FIG. 5 is an overall architecture diagram of a session management method embodying the present invention;

FIG. 6 is a schematic diagram of the main modules of a dialog management device according to an embodiment of the present invention;

FIG. 7 is an exemplary system architecture diagram in which embodiments of the present invention may be employed;

FIG. 8 is a schematic structural diagram of a computer system suitable for implementing a terminal device or a server according to an embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present invention are described below with reference to the accompanying drawings, in which various details of embodiments of the invention are included to assist understanding, and which are to be considered as merely exemplary. Accordingly, those of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the invention. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.

Fig. 1 is a schematic diagram of the main steps of a dialog management method according to an embodiment of the invention. As an embodiment of the present invention, as shown in fig. 1, the main steps of the dialog management method of the embodiment of the present invention may include:

Step S101: processing the user input data to be processed according to the natural language understanding model to acquire the characteristic information of the user input data to be processed.

Human-computer interaction proceeds as an alternation of one sentence of user input data and one sentence of machine response data. The user input data may be text data entered directly by the user or voice data spoken by the user. When the user input data is voice data, it first needs to be converted into the corresponding text data by speech recognition software. The response data is the data that the machine feeds back to the user in reply to the user input data, and may be response text data or response voice data. The machine in the invention may be, for example, an intelligent robot, an intelligent navigation customer service in a shopping application, or intelligent customer service software on a computer, which is not limited by the invention.

One sentence of user input data and the corresponding sentence of response data form one pair of dialogue data. A human-computer interaction may include a single round or multiple rounds of dialogue data. In the invention, the user input data in each pair of dialogue data can be used as the user input data to be processed. In this step, the user input data to be processed is fed to a pre-constructed natural language understanding model, which outputs the characteristic information of the user input data to be processed. For example, if the user enters "I want to buy a computer", the characteristic information obtained is "buy" and "computer".

As another embodiment of the present invention, before analyzing the user input data to be processed according to the natural language understanding model in step S101, the dialog management method may further include: a natural language understanding model is constructed in advance. Wherein, constructing the natural language understanding model specifically may include:

Step S1011: obtaining a first sample set, wherein the first sample set may contain at least one user input sample data;

Step S1012: labeling the user input sample data to acquire characteristic information of the user input sample data so as to generate a second sample set, wherein the second sample set may contain the characteristic information of the user input sample data;

Step S1013: constructing a training sample set by using the first sample set and the second sample set;

Step S1014: training on the training sample set to obtain a natural language understanding model, wherein the natural language understanding model takes user input data as input and outputs characteristic information of the user input data.
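As a minimal sketch of step S1013, and assuming that the first and second sample sets are kept in the same order so that each utterance can be paired with its labels by position, the training sample set could be assembled as follows. The function name and the dictionary layout of the labels are illustrative assumptions and are not specified by the source.

```python
def build_training_set(first_sample_set, second_sample_set):
    """Pair each user input sample with its labeled characteristic information.

    Assumption: both sets are lists in the same order, one label per sample.
    """
    if len(first_sample_set) != len(second_sample_set):
        raise ValueError("every sample needs exactly one label entry")
    return list(zip(first_sample_set, second_sample_set))


# Hypothetical example data: one utterance and its labeled features.
samples = ["I want to buy a computer"]
labels = [{"scene": "purchase", "in_slot": {"product": "computer"}}]
training_set = build_training_set(samples, labels)
```

Each resulting pair holds the model input (the utterance) and the expected output (its characteristic information), which is the shape that step S1014 would train on.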

The user input sample data in the first sample set can be real online user data. After the real user data is obtained, it needs to be cleaned: words that carry no information, such as "haha" and "hey", are removed, and blacklisted data, such as prohibited internet terms and profanity, are also removed. The invention then applies simple labeling followed by optimized (manual) labeling to the cleaned real user data to obtain its characteristic information. Because real online user data accumulates over a long period into a massive data set, training on it improves the accuracy of the model.
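The cleaning step can be illustrated with a short sketch. It assumes whitespace tokenization (Chinese text would need a word segmenter instead), a hypothetical filler-word list, and a hypothetical blacklist; it also assumes that an utterance containing a blacklisted token is discarded entirely, which the source does not specify.

```python
# Hypothetical vocabularies; the source only names a few examples.
FILLER_WORDS = {"haha", "hey"}
BLACKLIST = {"badword"}


def clean_utterance(text):
    """Return the utterance without filler tokens, or None if it must be dropped."""
    tokens = text.lower().split()
    if any(tok in BLACKLIST for tok in tokens):
        return None  # assumption: blacklisted utterances are removed as a whole
    kept = [tok for tok in tokens if tok not in FILLER_WORDS]
    return " ".join(kept) if kept else None


def clean_samples(samples):
    """Clean every sample and keep only the ones that still carry content."""
    cleaned = (clean_utterance(s) for s in samples)
    return [c for c in cleaned if c is not None]


raw = ["haha I want to buy a computer", "hey", "this is badword"]
cleaned = clean_samples(raw)
```

Here only the first utterance survives, with its filler word removed.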

In the embodiment of the present invention, the characteristic information may include scene information and slot value information. Scene information refers to the user's intention as determined from the user data. For example, if a user enters "I want to buy a computer" through the intelligent customer service, it can be determined that the user's purpose is purchasing and that the user is making a pre-sale consultation; if the user enters "I want to return the goods I ordered before", it can be determined that the user's purpose is to return goods and that the user is making an after-sale return consultation. Slot value information refers to keyword information in the data, and is divided into in_slot, the slot value information contained in the user input data, and request_slot, the slot value information that the machine still needs to request. For example, a user enters "I want to buy a Xiaomi mobile phone and want the screen to be a little bigger"; "Xiaomi" is labeled as a brand word in in_slot, and "mobile phone" is labeled as a product word in in_slot. Knowing only that the user wants a Xiaomi mobile phone, the machine cannot make a recommendation immediately; it also needs request_slot information such as price, color, and memory before it can make a better commodity recommendation. FIG. 2 is a schematic diagram of slot value information under the mobile phone category according to an embodiment of the present invention. The slot value information may take the form of key-value pairs; for example, the key is "brand word" and the value is "Huawei".
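One possible in-memory representation of this characteristic information is sketched below. The class name, field names other than in_slot and request_slot, the scene label, and the list of required slots are all assumptions made for illustration; the source only states that slot values may be key-value pairs.

```python
from dataclasses import dataclass, field


@dataclass
class FeatureInfo:
    scene: str                                        # e.g. a pre-sale purchase scene
    in_slot: dict = field(default_factory=dict)       # slots found in the user input
    request_slot: list = field(default_factory=list)  # slots the machine still needs


# Hypothetical slots a recommender might need for the mobile phone category.
REQUIRED_SLOTS = ["brand", "product", "price", "color", "memory"]


def missing_slots(in_slot, required=REQUIRED_SLOTS):
    """List the required slot keys that the user has not supplied yet."""
    return [name for name in required if name not in in_slot]


features = FeatureInfo(
    scene="pre_sale_purchase",
    in_slot={"brand": "brand A", "product": "mobile phone"},
)
features.request_slot = missing_slots(features.in_slot)
```

With the brand and product known, the remaining keys (price, color, memory) end up in request_slot, which is what the machine would ask about next.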

Simple labeling of the cleaned real user data means that a program preliminarily labels the data according to rules formulated in advance. Optimized labeling means that the simply labeled data is sent to annotators, who manually correct errors and retain correct labels, thereby generating the characteristic information of the real user data. Because the present invention involves multi-round conversations, the slot value information to be labeled includes not only product words, brand words, modifiers, query ranges, channel numbers, and the like, but also attribute slot values, such as the size, color, and usage of a product, and dynamic slot values, that is, slot values that may change, such as the memory size for mobile phone products or the size for clothing products. FIG. 3 is a schematic diagram of the format of the labeled slot value information according to an embodiment of the invention.

Step S102: determining historical record information corresponding to the user input data to be processed, and determining the current conversation state according to the characteristic information and the historical record information. The conversation state is what the machine derives from the user's input data over the whole multi-round conversation. For example, the user's first-round input data is "I want to buy a computer"; the machine derives the dialog state "the user needs to buy a computer" from it, and the response data asks which brand of computer the user needs. The user's second-round input data is "brand A", and the machine derives the current dialog state "the user needs to buy a brand A computer" from the second-round input data and the previous dialog state.

In this embodiment of the present invention, the history information may include: user history input data, characteristic information of the user history input data, and a dialog state of the user history input data. The historical user input data is preset n rounds of user input data before the user input data to be processed, and n is an integer not less than zero. If n is zero, the history information is set to null. The history information is used to make the machine have memory knowledge, for example, in the above example, if the second round of input data of the user is input data of the user to be processed, the machine needs to remember what the first round of input data of the user is, so that in the subsequent human-computer interaction, the machine remembers that the user aims to buy one computer, and consults about specific information of the computer needed by the user, such as brand, price, size, and the like. The purpose of recording the feature information of the first round of input data is to determine context correlation using the feature information, which is described in detail below and is not specifically explained herein. The purpose of recording the dialog state of the user history input data is to determine whether the dialog state of the user input data to be processed has changed, which is described in detail below and is not explained in detail herein.
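A bounded history of this kind can be sketched with a fixed-length queue. The class and field names below are hypothetical; the only behavior taken from the text is that at most n preceding rounds are kept and that n = 0 yields an empty history.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class TurnRecord:
    user_input: str     # the user's historical input data
    features: dict      # its characteristic information
    dialog_state: dict  # the dialog state after that round


class History:
    """Keeps the n most recent rounds; with n == 0 the history is always empty."""

    def __init__(self, n):
        if n < 0:
            raise ValueError("n must be an integer not less than zero")
        self._turns = deque(maxlen=n)

    def add(self, record):
        self._turns.append(record)  # the oldest round is dropped automatically

    def recent(self):
        """Return the stored rounds, most recent first."""
        return list(reversed(self._turns))


history = History(n=2)
for text in ["round 1", "round 2", "round 3"]:
    history.add(TurnRecord(text, {}, {}))
recent_inputs = [t.user_input for t in history.recent()]
```

Returning the rounds most recent first matches the order in which a context check would naturally examine them.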

As still another embodiment of the present invention, the determining of the current dialog state according to the feature information and the history information in step S102 may include:

Step S1021: judging whether the user input data to be processed is related to the user historical input data according to an overlapping degree principle, a scene classification principle or a model correlation principle;

Step S1022: when the user input data to be processed is related to the user historical input data, updating the current conversation state according to the conversation state of the user historical input data and the user input data to be processed;

Step S1023: when the user input data to be processed is not related to the user historical input data, generating a new conversation state according to the user input data to be processed, and determining the new conversation state as the current conversation state.

In the method, whether the user historical input data and the user input data to be processed are related is judged according to at least one of the overlapping degree principle, the scene classification principle, and the model correlation principle; if they are related, the previous conversation state is updated, and otherwise a new conversation state is started. For example, the user's round 1 input data is "I want to buy a computer"; the machine determines from it that the dialog state is "the user needs to buy a computer" and sends response data asking which brand of computer the user needs. The user's round 2 input data is "I want to buy a mobile phone"; the machine determines that the round 1 and round 2 input data are not related, so the new dialog state is "the user needs to buy a mobile phone", and the machine sends response data asking which brand of mobile phone the user needs. The user's round 3 input data is "brand B"; the machine determines that the round 2 and round 3 input data are related, so the updated dialog state is "the user needs to purchase a mobile phone of brand B".
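The update-or-restart rule of steps S1022 and S1023 can be sketched as follows, assuming that a dialog state is represented as a dictionary of slot values and that the relatedness decision is supplied by the caller. Both are simplifying assumptions for illustration.

```python
def update_dialog_state(previous_state, current_slots, is_related):
    """Merge into the previous state when related, otherwise start a new state."""
    if is_related and previous_state is not None:
        merged = dict(previous_state)
        merged.update(current_slots)  # newer slot values override older ones
        return merged
    return dict(current_slots)


# Replaying the three-round example: computer, then mobile phone, then brand B.
state = update_dialog_state(None, {"product": "computer"}, is_related=False)
state = update_dialog_state(state, {"product": "mobile phone"}, is_related=False)
state = update_dialog_state(state, {"brand": "brand B"}, is_related=True)
```

After round 3 the state holds the mobile phone from round 2 together with brand B, and nothing from the unrelated round 1 remains.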

It should be noted that, in the present invention, the user historical input data is the preset n rounds of user input data preceding the user input data to be processed, where n is an integer not less than zero. Therefore, if the user input data to be processed is the 5th round of user input data and n is 3, whether the 5th round of user input data is related to the user historical input data is judged as follows:

firstly, judging whether the 5 th round of user input data is related to the 4 th round of user input data, if so, updating the current conversation state according to the conversation state corresponding to the 4 th round of user input data and the 5 th round of user input data;

when the 5 th round of user input data is not related to the 4 th round of user input data, judging whether the 5 th round of user input data is related to the 3 rd round of user input data, if so, updating the current conversation state according to the conversation state corresponding to the 3 rd round of user input data and the 5 th round of user input data;

when the 5 th round of user input data is not related to the 3 rd round of user input data, judging whether the 5 th round of user input data is related to the 2 nd round of user input data, if so, updating the current conversation state according to the conversation state corresponding to the 3 rd round of user input data and the 5 th round of user input data;

when the 5 th round of user input data and the 2 nd round of user input data are not correlated, the current dialog state is generated directly from the 5 th round of user input data.
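The backward search over the preceding n rounds can be sketched as follows. This is a minimal illustration, not the patented implementation: the function name `find_related_round` and the `is_related` callback (standing in for whichever relatedness principle is used) are hypothetical.

```python
from typing import Callable, List, Optional


def find_related_round(
    current: str,
    history: List[str],
    n: int,
    is_related: Callable[[str, str], bool],
) -> Optional[int]:
    """Return the index in `history` of the most recent related round, or None.

    Only the last `n` rounds are examined, newest first, mirroring the
    5th -> 4th -> 3rd -> 2nd round walk described above for n = 3.
    """
    oldest = max(0, len(history) - n)  # first index still inside the window
    for idx in range(len(history) - 1, oldest - 1, -1):
        if is_related(current, history[idx]):
            return idx  # update the state attached to this round
    return None  # no related round: start a new dialog state
```

When the function returns an index, the dialog state stored for that round would be updated with the current input; when it returns `None`, a new dialog state would be generated from the current input alone.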

In an embodiment of the present invention, after step S1021 determines whether the user input data to be processed is related to the user history input data, the dialog management method may further include: calculating the gate weight of the user input data to be processed by using a neural network gate function; calculating, according to the gate weight, a correlation value between the user input data to be processed and the user history input data; and, if the correlation value is greater than a preset correlation threshold, confirming that the user input data to be processed is related to the user history input data. In other words, after the user input data to be processed has been judged to be related to the user history input data, a correlation value can additionally be calculated to quantify the degree of correlation; the two are considered related only when this value exceeds the threshold, which further safeguards the accuracy of context correlation prediction.

As still another embodiment of the present invention, the characteristic information may include slot value information. Accordingly, determining in step S1021 whether the user input data to be processed and the user history input data are related according to the overlapping degree principle may include: calculating the overlapping degree of the slot value information of the user input data to be processed and the slot value information of the user history input data, judging whether the overlapping degree is greater than a preset overlapping degree, and if so, confirming that the user input data to be processed is related to the user history input data. For example, if the user's 1st-round input data is "I want to buy a computer", its slot value information is "computer"; if the user's 2nd-round input data is "I want to buy a computer of brand A", its slot value information is "computer" and "brand A". The overlapping degree between the slot value information of the two rounds is determined to be high, and therefore the user's 1st-round and 2nd-round input data are considered related.
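The overlap check can be illustrated with a short sketch. The text does not specify how the overlapping degree is computed, so the overlap coefficient used here (shared slot values divided by the size of the smaller set) and the default threshold of 0.5 are assumptions chosen only for illustration; `slot_overlap` and `related_by_overlap` are hypothetical names.

```python
from typing import Iterable


def slot_overlap(current_slots: Iterable[str], history_slots: Iterable[str]) -> float:
    """Overlap coefficient between two collections of slot values (assumed metric)."""
    current, history = set(current_slots), set(history_slots)
    if not current or not history:
        return 0.0  # nothing to compare against
    return len(current & history) / min(len(current), len(history))


def related_by_overlap(
    current_slots: Iterable[str],
    history_slots: Iterable[str],
    threshold: float = 0.5,  # hypothetical "preset overlapping degree"
) -> bool:
    """True when the overlapping degree exceeds the preset threshold."""
    return slot_overlap(current_slots, history_slots) > threshold
```

With the example above, the slot values {"computer", "brand A"} and {"computer"} share one value out of a smaller set of size one, so the two rounds would be treated as related under this assumed metric.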

As another embodiment of the present invention, the characteristic information may include scene information. Accordingly, determining in step S1021 whether the user input data to be processed and the user history input data are related according to the scene classification principle may include: judging whether the scene information of the user input data to be processed is the same as the scene information of the user history input data, and if so, determining that the two belong to the same scene and confirming that the user input data to be processed is related to the user history input data. For example, if the user's 1st-round input data is "I want to buy a computer", its scene information is "buy a computer"; if the user's 2nd-round input data is "I want to buy a mobile phone", its scene information is "buy a mobile phone". The scene information of the two rounds is different, and therefore the user's 1st-round and 2nd-round input data are considered unrelated.

As still another embodiment of the present invention, determining in step S1021 whether the user input data to be processed and the user history input data are related according to the model correlation principle may include: performing correlation analysis on the user input data to be processed and the user history input data according to a pre-constructed context correlation model, and if the result of the correlation analysis is "related", determining that the user input data to be processed is related to the user history input data.

In this embodiment of the present invention, before the correlation analysis is performed according to the pre-constructed context correlation model, the dialog management method may further include: constructing the context correlation model based on a dynamic memory network algorithm. The dynamic memory network algorithm of the invention comprises four submodules: an input text submodule, a question submodule, a scene memory network submodule, and an answer generation submodule. The question submodule takes the user input data to be processed; the scene memory network submodule constructs the relationship between the user input data to be processed and the user history input data by means of a deep learning neural network; and the answer generation submodule predicts whether the user input data to be processed and the user history input data are related, outputting "yes" if they are related and "no" if they are not.

The input text submodule converts Chinese characters and phrases into distributed numeric vectors that a computer can process. The input may be voice data or text data; voice data is first converted into text by a speech recognition model and then processed with natural language processing techniques. In the present invention, an input sequence of length $T_I$ is composed of characters $w_1, w_2, \dots, w_{T_I}$. The input sequence is encoded by a Recurrent Neural Network (abbreviated as RNN); the encoding may be, but is not limited to, an RNN encoding, and other encoding forms may also be used, which is not limited by the present invention. At each time $t$, the RNN updates the hidden state $h_t = \mathrm{RNN}(L[w_t], h_{t-1})$, where $L$ is the encoding matrix and $w_t$ is the character index of the $t$-th Chinese character of the input sequence. If the text input by the user is a single short sentence, it enters the RNN directly and the hidden state is output; if the text is a relatively long sequence, the invention extracts the slot values, stores them in a list, and outputs the vector of hidden states. The invention may also encode with a GRU (Gated Recurrent Unit, a variant of the recurrent neural network with a simpler structure, consisting of an update gate and a reset gate), which has lower computational complexity. Suppose that at time $t$ the input text sequence is $x_t$ and the hidden state is $h_t$; the structure of the GRU is then defined as follows:

$$z_t = \sigma\left(W^{(z)} x_t + U^{(z)} h_{t-1} + b^{(z)}\right)$$

$$r_t = \sigma\left(W^{(r)} x_t + U^{(r)} h_{t-1} + b^{(r)}\right)$$

where $\circ$ is a self-defined operator (the element-wise product), $W$ is the weight of the input layer of the neural network, $U$ is the weight of the hidden layer, and $b$ is a constant. $W^{(z)}$, $W^{(r)}$, $W$, $U^{(z)}$, $U^{(r)}$, $U$ and $n$ are 7 hyper-parameters, and the whole computation can be summarized as $h_t = \mathrm{GRU}(x_t, h_{t-1})$.
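For reference, the two gate equations are normally accompanied by a candidate state and an interpolation step. The widely used GRU formulation is sketched below; it is an assumption that the embodiment follows this form, and the candidate symbol $\tilde{h}_t$ and the bias name $b^{(h)}$ are introduced here only for illustration:

$$\tilde{h}_t = \tanh\left(W x_t + r_t \circ U h_{t-1} + b^{(h)}\right)$$

$$h_t = z_t \circ \tilde{h}_t + (1 - z_t) \circ h_{t-1}$$

Here $\circ$ is the element-wise product, $r_t$ controls how much of the previous hidden state enters the candidate, and $z_t$ controls how much of the candidate replaces the previous hidden state. Some references swap the roles of $z_t$ and $1 - z_t$; the two conventions differ only in how the update gate is labeled.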

The question submodule, like the input text sequence, is also input into the model in the form of characters and encoded by the recurrent neural network RNN. The question, of length $T_Q$, is composed of a series of characters and is encoded into numeric vectors as $q_t = \mathrm{RNN}(L[w_t^Q], q_{t-1})$, where $L$ represents the character/phrase encoding matrix and $w_t^Q$ is the index of the $t$-th character or word in the question.

The scene memory network submodule is used for updating the scene memory over the output of the input text submodule; its core component performs iterative memory updates through an attention mechanism. In each iteration $i$, the attention mechanism operates on the comprehensive vector $c$ while considering the current question $q$ and the previous memory $m^{i-1}$, and generates the current scene $e^i$. The scene vector $e^i$ and the previous memory $m^{i-1}$ are then used to update the scene memory according to $m^i = \mathrm{GRU}(e^i, m^{i-1})$, where the initial vector of the GRU is the question itself, $m^0 = q$. As the network is built, multiple scene memories may be needed. For example, a user of the intelligent assistant says "I browsed a mobile phone of brand B yesterday" and then "but I want to buy a mobile phone of brand C today"; the two sentences can be regarded as multiple scenes. The user's current input "I want to buy a mobile phone of brand C with a slightly larger screen" is equivalent to the question, and the dynamic memory network needs to identify, from the user's two previous sentences, whether the current input is related to the previous input. The invention uses a gate function as the attention mechanism; the gate is computed as a function $G(c_t, m^{i-1}, q)$, where $c_t$ is the input text, $m^{i-1}$ is the memory, and $q$ is the question, which in the present invention refers to the current user input. When the gate function reaches the preset threshold or the maximum number of iterations is reached, the network is considered to have converged and the iteration stops.

The answer generation submodule is used for generating answers through vector calculation. The invention may set up another GRU network to generate answers. The last state $m^{T_M}$ of the memory network serves as the input of this GRU network; at each time step, the input also includes the question vector $q$, the previous hidden state $a_{t-1}$, and the previous prediction output $y_{t-1}$. The calculation formulas are $y_t = \mathrm{softmax}(W^{(a)} a_t)$ and $a_t = \mathrm{GRU}([y_{t-1}, q], a_{t-1})$. The loss function of the invention is cross entropy, and a two-class prediction of "yes" or "no" is made, indicating that the context is "relevant" or "irrelevant".
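The output step $y_t = \mathrm{softmax}(W^{(a)} a_t)$ and the cross-entropy loss for the two-class "yes"/"no" prediction can be sketched in plain Python as below. The GRU update itself is omitted; the function names and the toy weight matrix in the usage note are hypothetical and only illustrate the arithmetic.

```python
import math
from typing import List, Sequence, Tuple


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_relevance(
    W_a: Sequence[Sequence[float]],  # one weight row per class
    a_t: Sequence[float],            # hidden state of the answer GRU
    labels: Tuple[str, str] = ("yes", "no"),
) -> Tuple[str, List[float]]:
    """Compute y_t = softmax(W_a a_t) and return the most probable label."""
    logits = [sum(w * a for w, a in zip(row, a_t)) for row in W_a]
    probs = softmax(logits)
    best = max(range(len(probs)), key=probs.__getitem__)
    return labels[best], probs


def cross_entropy(probs: Sequence[float], target_index: int) -> float:
    """Cross-entropy loss for a single example with a one-hot target."""
    return -math.log(probs[target_index])
```

For instance, calling `predict_relevance([[1.0, 0.0], [0.0, 1.0]], [2.0, 0.0])` yields the label "yes", since the first class receives the larger logit; the loss is smaller when the target index matches that class.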

In the invention, the parameters of the context correlation model constructed based on the dynamic memory network algorithm are set as follows:

(1) the number of cells of the recurrent neural network, denoted recurrent cell size, which may be set to 128;

(2) the embedding vector dimension of text characters, denoted D, which may be set to 50;

(3) the learning rate of the optimization algorithm in deep learning, denoted Learning_rate, which may be set to 0.005;

(4) the random neuron dropout rate of the input layer, denoted Input_p, which may be set to 0.5;

(5) the random neuron dropout rate of the output layer, denoted Output_p, which may be set to 0.5;

(6) the amount of data in one training batch, denoted Batch_size, which may be set to 128;

(7) the number of memory passes in the scene memory network, denoted Passes, which may be set to 4;

(8) the size of the feed-forward hidden layer, denoted Ff_hidden_size, which may be set to 256;

(9) the weight decay rate, denoted Weight_decay, which may be set to 0.00000001;

(10) the number of questions on which the network can be trained, denoted Training_iterations_count, which may be set to 400000;

(11) the interval, in iterations, at which the validation result is displayed, denoted Display_step, which may be set to 100;

(12) the session context, denoted Context;

(13) the sentence end flag, denoted Input_presence_ends;

(14) the gate function of the neural network, denoted Input_gru.
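The numeric settings listed above can be collected in a single configuration object, for example as follows. The class name `DMNConfig` and the lower-case field names are hypothetical; only the values come from the list, and items (12) to (14) are omitted because no values are given for them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DMNConfig:
    """Hyper-parameters (1)-(11) of the context correlation model."""

    recurrent_cell_size: int = 128            # (1) RNN cell count
    d: int = 50                               # (2) character vector dimension
    learning_rate: float = 0.005              # (3) optimizer learning rate
    input_p: float = 0.5                      # (4) input-layer dropout rate
    output_p: float = 0.5                     # (5) output-layer dropout rate
    batch_size: int = 128                     # (6) examples per training batch
    passes: int = 4                           # (7) scene memory passes
    ff_hidden_size: int = 256                 # (8) feed-forward hidden size
    weight_decay: float = 0.00000001          # (9) weight decay rate
    training_iterations_count: int = 400000   # (10) training question count
    display_step: int = 100                   # (11) validation display interval


config = DMNConfig()
```

Making the dataclass frozen keeps one training run from silently mutating the settings; a variant configuration can be created with `dataclasses.replace(config, passes=2)`.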

Step S103: generating response data corresponding to the user input data to be processed based on a neural network algorithm according to the current dialog state. For example, if the user's input data is "I want to buy a mobile phone of brand A", the brand slot value is "A" and the product slot value is "mobile phone". After acquiring the slot value information info_slot of the user's input data, the machine cannot make a recommendation immediately and needs to learn other slot values such as price, color, and memory. The machine therefore generates the response data "What price range of mobile phone do you want?". After receiving this response data, the user continues with "I want to buy a mobile phone priced between 1000 yuan and 2000 yuan", and the machine obtains the dialog state "the user wants to buy a mobile phone of brand A priced between 1000 yuan and 2000 yuan". The machine then asks about the remaining slot values and recommends a mobile phone to the user; if the machine finds no mobile phone that suits the user, it recommends other similar mobile phones.
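The slot-filling behaviour in this example (ask for a missing slot value, recommend once enough is known) can be illustrated with the small stand-in below. Note that this is a hand-written rule used only to make the example concrete; the method itself generates response data with a neural network algorithm. The slot names in `REQUIRED_SLOTS` and the function `next_action` are hypothetical.

```python
from typing import Dict, Tuple

# Hypothetical slots that must be known before a recommendation is made.
REQUIRED_SLOTS: Tuple[str, ...] = ("product", "brand", "price")


def next_action(dialog_state: Dict[str, str]) -> str:
    """Return 'ask:<slot>' for the first missing slot, else 'recommend'."""
    for slot in REQUIRED_SLOTS:
        if slot not in dialog_state:
            return f"ask:{slot}"  # the machine asks the user for this slot value
    return "recommend"            # all required slot values are known


state = {"product": "mobile phone", "brand": "A"}
first = next_action(state)       # price is still unknown
state["price"] = "1000-2000 yuan"
second = next_action(state)      # every required slot is filled
```

In the example from the text, the state after the first input lacks a price, so the machine asks about price; once the price range is supplied, it can move on to the recommendation.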

Fig. 4 is a schematic diagram of a main flow of a dialogue management method according to a referential embodiment of the present invention. As shown in fig. 4, the main flow of the dialog management method may include:

step S401: a natural language understanding model is constructed, wherein the constructed natural language understanding model inputs user input data and outputs scene information and slot value information of the user input data;

step S402: processing user input data to be processed according to the natural language understanding model, and acquiring scene information and slot value information of the user input data to be processed;

step S403: determining historical record information corresponding to the user input data to be processed, wherein the determined historical record information may include: the method comprises the steps of obtaining user historical input data, characteristic information of the user historical input data and a dialogue state of the user historical input data, wherein the user historical input data are preset n rounds of user input data before the user input data to be processed are obtained, and n is an integer not less than zero;

step S404: calculating the overlapping degree of the slot value information of the user input data to be processed and the slot value information of the user historical input data;

step S405: judging whether the overlapping degree is larger than a preset overlapping degree, if so, executing a step S409, otherwise, executing a step S411;

step S406: judging whether the scene information of the user input data to be processed is the same as the scene information of the user historical input data, if so, executing a step S409, otherwise, executing a step S411;

step S407: constructing a context correlation model based on a dynamic memory network algorithm, and performing correlation analysis on the user input data to be processed and the user historical input data according to the pre-constructed context correlation model;

step S408: judging, according to the correlation analysis result, whether the user input data to be processed is related to the user historical input data, if so, executing a step S409, otherwise, executing a step S411;

step S409: calculating a correlation value of the user input data to be processed and the historical user input data, and judging whether the calculated correlation value is greater than a preset correlation threshold, if so, executing a step S410, otherwise, executing a step S411;

step S410: updating the current conversation state according to the conversation state of the historical input data of the user and the input data of the user to be processed;

step S411: generating a new conversation state according to the input data of the user to be processed, and determining the new conversation state as the current conversation state;

step S412: and generating response data corresponding to the user input data to be processed based on a neural network algorithm according to the current conversation state.
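Steps S404 to S411 can be summarized in a compact sketch. It assumes, as noted for these steps, that satisfying any one of the three checks is enough to proceed to step S409, and it represents a dialog state simply as a dictionary of slot values. All names are hypothetical, and the overlap, model, and correlation computations are passed in as callbacks because their internals are described elsewhere.

```python
from typing import Callable, Dict, Optional

Slots = Dict[str, str]


def current_dialog_state(
    turn: dict,                      # {"slots": Slots, "scene": str}
    prev_turn: Optional[dict],       # same shape, or None if no history
    prev_state: Slots,
    overlap: Callable[[Slots, Slots], float],
    overlap_threshold: float,
    model_related: Callable[[dict, dict], bool],
    correlation: Callable[[dict, dict], float],
    correlation_threshold: float,
) -> Slots:
    """Decide between updating the previous state and starting a new one."""
    candidate = prev_turn is not None and (
        overlap(turn["slots"], prev_turn["slots"]) > overlap_threshold  # S404-S405
        or turn["scene"] == prev_turn["scene"]                          # S406
        or model_related(turn, prev_turn)                               # S407-S408
    )
    # S409: confirm the candidate with the correlation value.
    if candidate and correlation(turn, prev_turn) > correlation_threshold:
        return {**prev_state, **turn["slots"]}  # S410: update previous state
    return dict(turn["slots"])                  # S411: new dialog state
```

The returned dictionary would then be handed to step S412 for response generation.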

It should be noted that the construction of the natural language understanding model in step S401 is explained in detail above (steps S1011 to S1014) and is not repeated here. How to calculate the correlation value between the user input data to be processed and the user history input data is likewise explained in detail above, so the description of step S409 is not repeated. It should also be noted that, in the above steps, the user input data to be processed and the user history input data are confirmed to be related as long as they meet the requirement in any one of the slot value overlapping degree, the scene information, or the correlation analysis result, after which step S409 is executed. In practical applications, it may instead be required that all three aspects meet the requirements before the two are determined to be related, or weights may be set for the three aspects and the two determined to be related when the weighted value meets a certain condition (for example, is greater than a certain value). Other methods may also be used, which is not limited by the present invention.
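The three combination strategies mentioned here (any one aspect, all three aspects, or a weighted score) can be sketched as below. The weights and the threshold are illustrative assumptions, since the text gives no concrete values, and the function names are hypothetical.

```python
from typing import Sequence


def related_any(overlap_ok: bool, scene_ok: bool, model_ok: bool) -> bool:
    """Related if at least one of the three aspects is satisfied."""
    return overlap_ok or scene_ok or model_ok


def related_all(overlap_ok: bool, scene_ok: bool, model_ok: bool) -> bool:
    """Related only if all three aspects are satisfied."""
    return overlap_ok and scene_ok and model_ok


def related_weighted(
    overlap_ok: bool,
    scene_ok: bool,
    model_ok: bool,
    weights: Sequence[float] = (0.3, 0.3, 0.4),  # assumed weights
    threshold: float = 0.5,                      # assumed cut-off
) -> bool:
    """Related if the summed weight of satisfied aspects exceeds the threshold."""
    flags = (overlap_ok, scene_ok, model_ok)
    score = sum(w for w, ok in zip(weights, flags) if ok)
    return score > threshold
```

With the assumed weights, slot overlap together with a matching scene is enough to pass the threshold, while a positive model result alone is not.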

For convenience of understanding, the overall framework for implementing the dialog management method of the present invention is described in detail by taking a "machine" as an example of an "intelligent assistant", and an overall framework diagram for implementing the dialog management method of the present invention is shown in fig. 5. As shown in fig. 5, the overall framework for implementing the dialog management method of the present invention may include: the intelligent assistant multi-turn dialogue labeling module, the intelligent assistant multi-turn dialogue framework module, the dialogue management context extraction module, the dynamic memory network algorithm module, the multi-turn dialogue effect module and the online test cut-flow module.

The intelligent assistant multi-turn dialogue labeling module includes three units: voice assistant log list design, dialogue cleaning design, and slot value labeling design. The voice assistant log list unit is used for extracting real online user data, which is labeled by annotators to obtain a labeled corpus; the real data and the labeled corpus are input into a model for training to generate the natural language understanding model. The dialogue cleaning unit is used for cleaning the user's input data and the intelligent assistant's response data. The slot value labeling unit defines which slot value information needs to be obtained.

The intelligent assistant multi-turn dialog framework module is the overall logical abstraction of the multi-turn dialog framework and includes a natural language understanding submodule, a dialogue management submodule, and a natural language generation submodule. The natural language understanding submodule is used for identifying the intention of the user (that is, identifying the user's scene information) and extracting slot value information such as product words, brand words, and modifiers. It may be subdivided into a scene intention classification model and a user natural language understanding model. The scene intention classification model classifies the user's input sentences in the intelligent assistant and determines which service scene they belong to; the user natural language understanding model parses the user's input sentences and identifies slot value information such as brand words, product words, and modifiers. The dialogue management submodule is used for identifying the user's current dialog state and judging whether the dialog state needs to be switched. It may be subdivided into a multi-turn dialogue unit, a knowledge graph unit, and a dialog state maintenance unit. The multi-turn dialogue unit enables the intelligent assistant to interact with the user over multiple rounds; the knowledge graph unit gives the intelligent assistant memorized knowledge, for example, when a user says "capital of China", the intelligent assistant knows that the answer is Beijing; the dialog state maintenance unit ensures that the dialog proceeds normally throughout the multi-turn dialogue. The natural language generation submodule is used for generating the intelligent assistant's response data.

The dialogue management context extraction module is used for managing context: identifying whether the user's current input belongs to the previous dialog, and judging which extracted product words and brand words need to be remembered and which can be forgotten. The module is divided into three layers, namely slot value overlapping degree, intention overlapping degree, and the dynamic memory network algorithm, which predict the context relationship from three different angles.

The dynamic memory network algorithm module is used for managing the context of multi-turn dialogs. What the user says in the intelligent assistant is matched against the previous utterances by the dynamic memory network; if the network features match, the utterance is predicted to be context-dependent and the current dialog is not closed; if they do not match, the utterance is predicted to be context-independent, the current dialog is closed, and the next dialog state begins. In this module, the user's context information and the user's current input are fed into a dynamic network with scene memory to obtain a prediction result, from which it is judged whether the context is related.

The multi-round dialogue effect module comprises algorithm evaluation indexes and service evaluation indexes and is used for evaluating the accuracy of the dialogue management algorithm. The online test cut-flow module is used for testing the intelligent assistant before it goes online: various user input data are tested and online model test results are obtained, so that the effect of the dialogue management method can be verified, problems can be found and corrected according to the test results, and the efficiency and reliability of online testing can be ensured.

According to the technical scheme of the dialogue management of the embodiment of the present invention, the characteristic information of the user input data can be obtained by using the natural language understanding model, the current dialog state is then determined according to the historical record information, and the response data is generated, so that the design and maintenance cost of a large number of rule-based programs can be reduced, the accuracy of recognizing the user input data is improved, and the user experience is improved. In the embodiment of the invention, the training sample set consisting of the first sample set and the second sample set is trained to obtain the natural language understanding model, so that the model can be constructed from massive sample data and the accuracy of recognizing the user input data is improved. The correlation between the user input data to be processed and the user history input data is judged from multiple angles according to an overlapping degree principle, a scene classification principle, or a model correlation principle, so that the accuracy of predicting the context relationship can be improved and the user experience is further improved. The overlapping degree of the user input data to be processed and the user history input data is judged according to the slot value information, so that the context correlation can be predicted from the perspective of the keywords of the input data. The relevance of the two is judged according to the scene information, so that the context correlation can be predicted according to the scene of the input data. The correlation of the two is judged according to the pre-constructed context correlation model, so that the context correlation can be predicted by means of a model generated from massive data. The correlation value of the two is calculated according to the neural network gate function, so that the accuracy of predicting the context correlation can be improved.

Fig. 6 is a schematic diagram of main modules of a dialog management device according to an embodiment of the present invention. As shown in fig. 6, the session management apparatus 600 according to the embodiment of the present invention mainly includes the following modules: an acquisition module 601, a determination module 602, and a generation module 603.

The obtaining module 601 may be configured to process the user input data to be processed according to the natural language understanding model, and obtain feature information of the user input data to be processed. The determining module 602 may be configured to determine history information corresponding to the to-be-processed user input data, and determine a current session state according to the feature information and the history information. The generating module 603 may be configured to generate response data corresponding to the to-be-processed user input data based on a neural network algorithm according to the current dialog state.

In this embodiment of the present invention, the obtaining module 601 may further be configured to: acquiring a first sample set, wherein the first sample set comprises at least one user input sample data; labeling the user input sample data to obtain the characteristic information of the user input sample data so as to generate a second sample set, wherein the second sample set comprises the characteristic information of the user input sample data; constructing a training sample set by using the first sample set and the second sample set; training the training sample set to obtain a natural language understanding model, wherein the natural language understanding model inputs user input data and outputs characteristic information of the user input data.

In this embodiment of the present invention, the history information may include: user history input data, characteristic information of the user history input data, and a dialog state of the user history input data. The user historical input data is preset n rounds of user input data before the user input data to be processed, and n is an integer not less than zero.

In this embodiment of the present invention, the determining module 602 may further be configured to: judging whether the user input data to be processed is related to the user historical input data or not according to an overlapping degree principle, a scene classification principle or a model correlation principle; when the user input data to be processed is related to the user historical input data, updating the current conversation state according to the conversation state of the user historical input data and the user input data to be processed; and when the user input data to be processed is not related to the user historical input data, generating a new conversation state according to the user input data to be processed, and determining the new conversation state as the current conversation state.

In the embodiment of the present invention, the feature information may include: slot value information. The determination module 602 may also be configured to: and calculating the overlapping degree of the slot value information of the user input data to be processed and the slot value information of the historical user input data, judging whether the overlapping degree is greater than a preset overlapping degree, and if so, confirming that the user input data to be processed is related to the historical user input data.

In this embodiment of the present invention, the feature information may further include: and (4) scene information. The determination module 602 may also be configured to: judging whether the scene information of the user input data to be processed is the same as the scene information of the user historical input data, if so, determining that the scene of the user input data to be processed is the same as the scene of the user historical input data, and confirming that the user input data to be processed is related to the user historical input data.

In this embodiment of the present invention, the determining module 602 may further be configured to: and performing correlation analysis on the user input data to be processed and the user historical input data according to a pre-constructed context correlation model, and if the correlation analysis result is correlation, determining that the user input data to be processed is correlated with the user historical input data.

In this embodiment of the present invention, the determining module 602 may further be configured to: and constructing a context correlation model based on a dynamic memory network algorithm.

In this embodiment of the present invention, the determining module 602 may further be configured to: calculate the gate weight of the user input data to be processed by using a neural network gate function; calculate, according to the gate weight, a correlation value between the user input data to be processed and the user history input data; and, if the correlation value is greater than the preset correlation threshold, confirm that the user input data to be processed is related to the user history input data.

From the above description, it can be seen that the characteristic information of the user input data can be obtained by using the natural language understanding model, the current conversation state is then determined according to the historical record information, and the response data is generated accordingly. This reduces the cost of designing and maintaining a large number of rule-based programs, improves the accuracy of recognizing user input data, and improves the user experience. In the embodiment of the invention, the natural language understanding model is obtained by training on a training sample set composed of the first sample set and the second sample set, so the model can be built from massive sample data, which improves the accuracy of recognizing user input data. The correlation between the user input data to be processed and the user historical input data is judged from multiple angles, according to an overlapping degree principle, a scene classification principle, or a model correlation principle, which improves the accuracy of predicting the context relationship and further improves the user experience. Judging the overlapping degree of the user input data to be processed and the user historical input data according to the slot value information allows the context correlation to be predicted from the perspective of the keywords of the input data. Judging the correlation according to the scene information allows the context correlation to be predicted from the scene of the input data. Judging the correlation according to the pre-constructed context correlation model allows the context correlation to be predicted by means of a model generated from massive data. Calculating the correlation value of the user input data to be processed and the user historical input data according to the neural network gate function improves the accuracy of predicting the context correlation.
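As a rough illustration of the overlapping degree principle and the scene classification principle summarized above, the following sketch compares slot value information and scene information between two turns. The Jaccard measure, the dictionary representation of slot values, the default preset value of 0.3, and the function names are assumptions made for this example; the text only states that an overlapping degree is compared against a preset overlapping degree and that scene information is compared for equality.

```python
def overlap_degree(current_slots: dict, history_slots: dict) -> float:
    """Jaccard overlap of (slot, value) pairs; the Jaccard measure is an assumption."""
    current_pairs = set(current_slots.items())
    history_pairs = set(history_slots.items())
    union = current_pairs | history_pairs
    return len(current_pairs & history_pairs) / len(union) if union else 0.0


def related_by_overlap(current_slots: dict, history_slots: dict,
                       preset_overlap: float = 0.3) -> bool:
    """Overlapping degree principle: related if the overlap exceeds the preset value."""
    return overlap_degree(current_slots, history_slots) > preset_overlap


def related_by_scene(current_scene: str, history_scene: str) -> bool:
    """Scene classification principle: related if both turns share the same scene."""
    return current_scene == history_scene
```

For example, a turn with slots `{"brand": "A", "category": "phone"}` and a previous turn with slots `{"category": "phone", "color": "red"}` share one of three distinct slot-value pairs, so their overlap under this measure is one third.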

FIG. 7 shows an exemplary system architecture 700 to which the dialog management method or dialog management device of an embodiment of the invention may be applied.

As shown in FIG. 7, the system architecture 700 may include terminal devices 701, 702, 703, a network 704, and a server 705. The network 704 serves to provide a medium for communication links between the terminal devices 701, 702, 703 and the server 705. The network 704 may include various connection types, such as wired or wireless communication links, or fiber optic cables.

A user may use the terminal devices 701, 702, 703 to interact with the server 705 over the network 704, to receive or send messages and the like. The terminal devices 701, 702, 703 may have various communication client applications installed thereon, such as a shopping application, a web browser application, a search application, an instant messaging tool, a mailbox client, social platform software, etc. (by way of example only).

The terminal devices 701, 702, 703 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smart phones, tablet computers, laptop computers, desktop computers, and the like.

The server 705 may be a server providing various services, such as a background management server (for example only) providing support for shopping websites browsed by users using the terminal devices 701, 702, 703. The background management server may analyze and otherwise process received data, such as a product information query request, and feed back a processing result (for example, target push information or product information, by way of example only) to the terminal device.

It should be noted that the session management method provided by the embodiment of the present invention is generally executed by the server 705, and accordingly, the session management apparatus is generally disposed in the server 705.

It should be understood that the number of terminal devices, networks, and servers in FIG. 7 is merely illustrative. There may be any number of terminal devices, networks, and servers, as required by the implementation.

Referring now to FIG. 8, a block diagram of a computer system 800 suitable for implementing a terminal device of an embodiment of the present invention is shown. The terminal device shown in FIG. 8 is only an example, and should not impose any limitation on the functions and the scope of use of the embodiments of the present invention.

As shown in FIG. 8, the computer system 800 includes a Central Processing Unit (CPU) 801 that can perform various appropriate actions and processes in accordance with a program stored in a Read Only Memory (ROM) 802 or a program loaded from a storage section 808 into a Random Access Memory (RAM) 803. In the RAM 803, various programs and data necessary for the operation of the system 800 are also stored. The CPU 801, ROM 802, and RAM 803 are connected to each other via a bus 804. An input/output (I/O) interface 805 is also connected to the bus 804.

The following components are connected to the I/O interface 805: an input section 806 including a keyboard, a mouse, and the like; an output section 807 including a display such as a Cathode Ray Tube (CRT) or a Liquid Crystal Display (LCD), and a speaker; a storage section 808 including a hard disk and the like; and a communication section 809 including a network interface card such as a LAN card, a modem, or the like. The communication section 809 performs communication processing via a network such as the Internet. A drive 810 is also connected to the I/O interface 805 as necessary. A removable medium 811, such as a magnetic disk, an optical disk, a magneto-optical disk, a semiconductor memory, or the like, is mounted on the drive 810 as necessary, so that a computer program read out therefrom is installed into the storage section 808 as necessary.

In particular, according to the embodiments of the present disclosure, the processes described above with reference to the flowcharts may be implemented as computer software programs. For example, embodiments of the present disclosure include a computer program product comprising a computer program embodied on a computer readable medium, the computer program comprising program code for performing the method illustrated in the flow chart. In such an embodiment, the computer program can be downloaded and installed from a network through the communication section 809 and/or installed from the removable medium 811. The computer program executes the above-described functions defined in the system of the present invention when executed by the Central Processing Unit (CPU) 801.

It should be noted that the computer readable medium shown in the present invention can be a computer readable signal medium or a computer readable storage medium or any combination of the two. A computer readable storage medium may be, for example, but not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples of the computer readable storage medium may include, but are not limited to: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing. In the present invention, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. In the present invention, however, a computer readable signal medium may include a propagated data signal with computer readable program code embodied therein, for example, in baseband or as part of a carrier wave. Such a propagated data signal may take many forms, including, but not limited to, electro-magnetic, optical, or any suitable combination thereof. A computer readable signal medium may also be any computer readable medium that is not a computer readable storage medium and that can communicate, propagate, or transport a program for use by or in connection with an instruction execution system, apparatus, or device. Program code embodied on a computer readable medium may be transmitted using any appropriate medium, including but not limited to: wireless, wire, fiber optic cable, RF, etc., or any suitable combination of the foregoing.

The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams or flowchart illustration, and combinations of blocks in the block diagrams or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.

The modules described in the embodiments of the present invention may be implemented by software or hardware. The described modules may also be provided in a processor, which may be described as: a processor including an acquisition module, a determining module, and a generating module. In some cases the names of the modules do not limit the modules themselves; for example, the acquisition module may also be described as a module that processes the user input data to be processed according to the natural language understanding model and obtains the characteristic information of the user input data to be processed.

As another aspect, the present invention also provides a computer-readable medium, which may be contained in the device described in the above embodiments, or may exist separately without being incorporated into the device. The computer-readable medium carries one or more programs which, when executed by a device, cause the device to perform: processing user input data to be processed according to the natural language understanding model to obtain characteristic information of the user input data to be processed; determining historical record information corresponding to the user input data to be processed, and determining the current conversation state according to the characteristic information and the historical record information; and generating response data corresponding to the user input data to be processed based on a neural network algorithm according to the current conversation state.
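The three steps carried by the program (obtaining characteristic information, determining the current conversation state, and generating response data) can be sketched end to end as below. This is only a toy outline under stated assumptions: `understand` is a keyword-matching placeholder standing in for the trained natural language understanding model, `respond` is a template placeholder standing in for the neural network response generator, relatedness is decided by scene equality alone, and all class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Features:
    scene: str   # scene information
    slots: dict  # slot value information


@dataclass
class DialogState:
    scene: str
    slots: dict = field(default_factory=dict)


def understand(text: str) -> Features:
    """Placeholder for the trained natural language understanding model."""
    lowered = text.lower()
    slots = {}
    for color in ("red", "black"):
        if color in lowered:
            slots["color"] = color
    if "phone" in lowered:
        slots["category"] = "phone"
    return Features("shopping" if slots else "chitchat", slots)


def track_state(features: Features, previous: Optional[DialogState]) -> DialogState:
    """Update the previous state if the turn is related to it, else start a new state."""
    if previous is not None and features.scene == previous.scene:
        return DialogState(previous.scene, {**previous.slots, **features.slots})
    return DialogState(features.scene, dict(features.slots))


def respond(state: DialogState) -> str:
    """Placeholder for the neural network response generator."""
    if state.scene == "shopping":
        details = ", ".join(f"{k}={v}" for k, v in sorted(state.slots.items()))
        return f"Searching products with {details}"
    return "How can I help you?"


def handle_turn(text: str, previous: Optional[DialogState] = None):
    """Run one turn: features -> current conversation state -> response data."""
    state = track_state(understand(text), previous)
    return state, respond(state)
```

Calling `handle_turn("I want a phone")` and then passing the returned state into `handle_turn("a red one", state)` carries the `category` slot forward, which is the behavior the related-turn branch is meant to illustrate.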

According to the technical scheme of the embodiment of the invention, the characteristic information of the user input data can be obtained by using the natural language understanding model, the current conversation state is then determined according to the historical record information, and the response data is generated accordingly. This reduces the cost of designing and maintaining a large number of rule-based programs, improves the accuracy of recognizing user input data, and improves the user experience. In the embodiment of the invention, the natural language understanding model is obtained by training on a training sample set composed of the first sample set and the second sample set, so the model can be built from massive sample data, which improves the accuracy of recognizing user input data. The correlation between the user input data to be processed and the user historical input data is judged from multiple angles, according to an overlapping degree principle, a scene classification principle, or a model correlation principle, which improves the accuracy of predicting the context relationship and further improves the user experience. Judging the overlapping degree of the user input data to be processed and the user historical input data according to the slot value information allows the context correlation to be predicted from the perspective of the keywords of the input data. Judging the correlation according to the scene information allows the context correlation to be predicted from the scene of the input data. Judging the correlation according to the pre-constructed context correlation model allows the context correlation to be predicted by means of a model generated from massive data. Calculating the correlation value of the user input data to be processed and the user historical input data according to the neural network gate function improves the accuracy of predicting the context correlation.

The above-described embodiments should not be construed as limiting the scope of the invention. Those skilled in the art will appreciate that various modifications, combinations, sub-combinations, and substitutions can occur, depending on design requirements and other factors. Any modification, equivalent replacement, and improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims

1. A dialog management method, comprising:

processing user input data to be processed according to a natural language understanding model to acquire characteristic information of the user input data to be processed;

determining historical record information corresponding to the user input data to be processed, and determining a current conversation state according to the characteristic information and the historical record information;

and generating response data corresponding to the user input data to be processed based on a neural network algorithm according to the current conversation state.

2. The method of claim 1, wherein prior to processing the user input data to be processed according to the natural language understanding model, the method further comprises:

obtaining a first sample set, wherein the first sample set comprises at least one user input sample data;

labeling the user input sample data to obtain characteristic information of the user input sample data so as to generate a second sample set, wherein the second sample set comprises the characteristic information of the user input sample data;

constructing a training sample set by using the first sample set and the second sample set;

and training on the training sample set to obtain the natural language understanding model, wherein the natural language understanding model takes user input data as input and outputs characteristic information of the user input data.

3. The method of claim 1, wherein the historical record information comprises: user historical input data, characteristic information of the user historical input data, and a conversation state of the user historical input data, wherein the user historical input data is the user input data of a preset n turns before the user input data to be processed, and n is an integer not less than zero.

4. The method of claim 3, wherein determining a current conversation state according to the characteristic information and the historical record information comprises:

judging whether the user input data to be processed is related to the user historical input data or not according to an overlapping degree principle, a scene classification principle or a model correlation principle;

when the user input data to be processed is related to the user historical input data, updating the current conversation state according to the conversation state of the user historical input data and the user input data to be processed;

and when the user input data to be processed is not related to the user historical input data, generating a new conversation state according to the user input data to be processed, and determining the new conversation state as the current conversation state.

5. The method of claim 4, wherein the characteristic information comprises slot value information; and judging whether the user input data to be processed is related to the user historical input data according to the overlapping degree principle comprises:

calculating the overlapping degree of the slot value information of the user input data to be processed and the slot value information of the user historical input data, judging whether the overlapping degree is greater than a preset overlapping degree, and if so, confirming that the user input data to be processed is related to the user historical input data.

6. The method of claim 4, wherein the characteristic information comprises scene information; and judging whether the user input data to be processed is related to the user historical input data according to the scene classification principle comprises:

judging whether the scene information of the user input data to be processed is the same as the scene information of the user historical input data, if so, considering that the scene of the user input data to be processed is the same as the scene of the user historical input data, and confirming that the user input data to be processed is related to the user historical input data.

7. The method of claim 4, wherein judging whether the user input data to be processed is related to the user historical input data according to the model correlation principle comprises:

performing correlation analysis on the user input data to be processed and the user historical input data according to a pre-constructed context correlation model, and if the result of the correlation analysis is that they are correlated, confirming that the user input data to be processed is related to the user historical input data.

8. The method of claim 7, wherein prior to performing correlation analysis on the user input data to be processed and the user historical input data according to the pre-constructed context correlation model, the method further comprises: constructing the context correlation model based on a dynamic memory network algorithm.

9. The method of claim 4, wherein after determining whether the user input data to be processed and the user historical input data are related according to an overlap degree principle, a scene classification principle, or a model correlation principle, the method further comprises:

calculating the gate weight of the user input data to be processed by utilizing a neural network gate function;

calculating the correlation value of the user input data to be processed and the user historical input data according to the gate weight;

and if the correlation value is larger than a preset correlation threshold value, confirming that the user input data to be processed is related to the user historical input data.

10. A dialog management device, comprising:

the acquisition module is used for processing the user input data to be processed according to the natural language understanding model and acquiring the characteristic information of the user input data to be processed;

the determining module is used for determining historical record information corresponding to the user input data to be processed and determining the current conversation state according to the characteristic information and the historical record information;

and the generating module is used for generating response data corresponding to the user input data to be processed based on a neural network algorithm according to the current conversation state.

11. The apparatus of claim 10, wherein the obtaining module is further configured to:

12. The apparatus of claim 10, wherein the historical record information comprises: user historical input data, characteristic information of the user historical input data, and a conversation state of the user historical input data, wherein the user historical input data is the user input data of a preset n turns before the user input data to be processed, and n is an integer not less than zero.

13. The apparatus of claim 12, wherein the determining module is further configured to:

14. The apparatus of claim 13, wherein the characteristic information comprises slot value information; and the determining module is further configured to:

15. The apparatus of claim 13, wherein the characteristic information comprises scene information; and the determining module is further configured to:

16. The apparatus of claim 13, wherein the determining module is further configured to:

17. The apparatus of claim 16, wherein the determining module is further configured to: construct a context correlation model based on a dynamic memory network algorithm.

18. The apparatus of claim 13, wherein the determining module is further configured to:

19. An electronic device, comprising:

one or more processors;

a storage device for storing one or more programs,

wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any one of claims 1-9.

20. A computer-readable medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1-9.