CN112860862A - Method and device for generating intelligent agent dialogue sentences in man-machine dialogue - Google Patents

Method and device for generating intelligent agent dialogue sentences in man-machine dialogue

Info

Publication number
CN112860862A
Authority
CN
China
Prior art keywords
knowledge
dialogue
vector
candidate
scene
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202110133448.0A
Other languages
Chinese (zh)
Other versions
CN112860862B (en)
Inventor
宇洋
袁彩霞
王小捷
刘咏彬
李蕾
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications filed Critical Beijing University of Posts and Telecommunications
Priority to CN202110133448.0A priority Critical patent/CN112860862B/en
Publication of CN112860862A publication Critical patent/CN112860862A/en
Application granted granted Critical
Publication of CN112860862B publication Critical patent/CN112860862B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification

Abstract

The application discloses a method and a device for generating intelligent agent dialogue sentences in man-machine dialogue, wherein the method comprises the following steps: extracting attribute values and scene categories in a preset knowledge base from conversation historical data of the current man-machine conversation by using a pre-trained natural language understanding model; wherein the knowledge base is composed of knowledge triples; based on the attribute values and the scene categories, relevant knowledge triples are screened out from the knowledge base to obtain candidate knowledge subsets; and generating and outputting a current response sentence for the intelligent agent by utilizing a pre-trained dialogue generation model based on the dialogue historical data and the candidate knowledge subset. By adopting the invention, the man-machine conversation of a multi-task scene can be supported.

Description

Method and device for generating intelligent agent dialogue sentences in man-machine dialogue
Technical Field
The invention relates to an artificial intelligence technology, in particular to a method and a device for generating intelligent agent dialogue sentences in man-machine dialogue.
Background
Existing man-machine dialogue implementations are typically designed for a particular scenario. Such scenarios can be divided into four types: chatting, question answering, recommendation, and task-based dialogue. Chatting means the agent can chat with the user without an explicit goal; question answering means that when the user asks the agent a question, the agent can answer it; recommendation means the agent can recommend suitable information to the user based on the knowledge base and the chat with the user; task-based dialogue means the agent can converse with the user around a specific goal, for example helping the user buy movie tickets or book hotels.
Because different types of scenarios have different dialogue goals, a man-machine dialogue implementation designed for one type of scenario can only adapt to the corresponding application scenario and is unsuitable for other types. In real life, the boundaries between application scenarios of man-machine dialogue are not clear. For example, people may insert greetings, error reports, and the like while completing a ticket-ordering task, and in a chatting scenario they may initiate specific service requests; for instance, when chatting about a movie topic, the user may need the agent to help order movie tickets, query orders, request recommendations, or answer questions. Therefore, it is desirable to provide a man-machine dialogue scheme that can serve multiple task scenarios to meet these application requirements.
Disclosure of Invention
In view of this, the main objective of the present invention is to provide a method and an apparatus for generating an intelligent agent dialog statement in a human-computer dialog, which can support a human-computer dialog in a multitask scenario.
In order to achieve the above purpose, the embodiment of the present invention provides a technical solution:
a method for generating intelligent agent dialogue sentences in man-machine dialogue comprises the following steps:
extracting attribute values and scene categories in a preset knowledge base from conversation historical data of the current man-machine conversation by using a pre-trained natural language understanding model; wherein the knowledge base is composed of knowledge triples;
based on the attribute values and the scene categories, relevant knowledge triples are screened out from the knowledge base to obtain candidate knowledge subsets;
and generating and outputting a current response sentence for the intelligent agent by utilizing a pre-trained dialogue generation model based on the dialogue historical data and the candidate knowledge subset.
Preferably, the extracting the attribute values and the scene categories in the preset knowledge base includes:
splicing the dialogue history data with a preset special mark, and inputting the dialogue history data into an encoder of the natural language understanding model for encoding to obtain a corresponding dialogue history vector and a corresponding scene information vector;
inputting the dialogue history vector into a CRF layer of the natural language understanding model for sequence labeling to obtain the attribute values contained in the dialogue history data;
and inputting the scene information vector into a multilayer perceptron of the natural language understanding model for scene classification to obtain the scene category of the man-machine conversation.
Preferably, the screening out relevant knowledge triples from the knowledge base based on the attribute values and the scene categories to obtain candidate knowledge subsets includes:
if the scene category is chatting, traversing each attribute value, searching a knowledge triple containing the attribute value from the knowledge base, and constructing the candidate knowledge subset by using all the searched knowledge triples;
if the scene category is question answering, traversing each attribute value contained in the latest dialog in the dialog historical data, and searching a knowledge triple containing the attribute value from the knowledge base; constructing the candidate knowledge subsets by using all searched knowledge triples;
if the scene category is recommendation, combining all the primary key entity values in the attribute values pairwise, traversing each combination, determining a common attribute value of the attribute values in the combination, and searching a knowledge triple containing the common attribute value from the knowledge base for each common attribute value; constructing the candidate knowledge subset by using all searched knowledge triples;
and if the scene type is a task type conversation, traversing each key entity value in the attribute values, searching a knowledge triple which contains the key entity value and is related to the current man-machine conversation task from the knowledge base, and constructing the candidate knowledge subset by using all the searched knowledge triples.
Preferably, the generating a current response sentence for the agent by using a pre-trained dialogue generating model based on the dialogue history data and the candidate knowledge subset includes:
inputting the dialogue historical data into a dialogue coder of the dialogue generating model for coding to obtain a comprehensive characterization vector C of the dialogue historical data and word vectors of all words contained in the dialogue historical data;
inputting the candidate knowledge subsets into a knowledge encoder of the dialogue generation model for encoding to obtain a comprehensive characterization vector kg of the candidate knowledge subsets and a vector representation of each knowledge triple in the candidate knowledge subsets;
and generating the response sentence by utilizing a natural language generator of the dialogue generation model based on the comprehensive characterization vector C of the dialogue historical data, the comprehensive characterization vector kg of the candidate knowledge subset, the word vector and the vector representation of the knowledge triplet.
Preferably, the inputting the dialogue history data into the dialogue coder of the dialogue generating model for coding includes:
expanding the conversation history data by adding conversation role information and conversation turn information to which each word belongs in the conversation history data;
dividing the expanded dialogue historical data according to dialogue turns;
coding each turn of dialogue data obtained by the division with a sentence-level bidirectional gated recurrent unit network (BiGRU) to obtain word vectors of all words contained in each turn of dialogue;
calculating a first dialogue vector of each turn of dialogue by adopting a self-attention mechanism based on the word vectors of all words contained in that turn;
encoding the first dialogue vectors of all the turns with a turn-level BiGRU to obtain a second dialogue vector of each turn of dialogue;
and calculating a comprehensive characterization vector C of the dialogue historical data by adopting a self-attention mechanism based on the second dialogue vector.
Preferably, the encoding the candidate knowledge subset input to the knowledge encoder of the dialog generation model comprises:
calculating an entity word vector of each knowledge triple in the candidate knowledge subset by using a TransE model;
obtaining a vector representation of each knowledge triple in the candidate knowledge subset by using a multilayer perceptron based on the entity word vector of each knowledge triple;
and obtaining a comprehensive characterization vector kg of the candidate knowledge subset by using a self-attention mechanism based on the vector representation of each knowledge triple.
Preferably, generating the response sentence using a natural language generator of the dialog generation model includes:
splicing the vector representation of the knowledge triples with the word vectors, and writing the spliced result M into a memory network of the natural language generator; wherein M = [(h_1, ..., h_n); (k_1, ..., k_g)] = [M_1, ..., M_{n+g}], h_n denotes the n-th word vector; k_g denotes the vector representation of the g-th knowledge triple; n denotes the number of word vectors; g denotes the number of knowledge triples;
initializing the initial query vector s_0 of the GRU used by the natural language generator for decoding to the concatenation result of the comprehensive characterization vector C and the comprehensive characterization vector kg;
at each decoding time t, generating, by the GRU, the query vector s_t of the current time t based on the query vector s_{t-1} of the previous time and the generated word y_{t-1} of the previous time; computing, with an attention mechanism, the correlation between the query vector s_t and each storage unit of the memory network to obtain the correlation α_i^t between the query vector s_t and each word in the dialogue history data and the correlation β_r^t between the query vector s_t and each knowledge triple in the candidate knowledge subset; computing the joint representation c_t of the dialogue history data by weighted summation based on the correlations α_i^t, and computing the joint representation g_t of the candidate knowledge subset by weighted summation based on the correlations β_r^t; taking c_t as a query vector, accessing the memory network in a multi-hop manner to obtain the knowledge distribution p_ptr; taking g_t as a query vector, accessing a preset dictionary with a multilayer perceptron to obtain the dictionary distribution p_vocab; and obtaining the generated word y_t at the current time t with a gating mechanism based on the knowledge distribution p_ptr and the dictionary distribution p_vocab;
And the GRU generates a current response statement for the intelligent agent based on the generated words at all the moments.
The embodiment of the invention also discloses a device for generating the intelligent agent dialogue sentences in the man-machine dialogue, which comprises the following steps:
the information extraction module is used for extracting attribute values and scene categories in a preset knowledge base from conversation historical data of the current man-machine conversation by utilizing a pre-trained natural language understanding model; wherein the knowledge base is composed of knowledge triples;
the knowledge screening module is used for screening out related knowledge triples from the knowledge base based on the attribute values and the scene categories to obtain candidate knowledge subsets;
and the dialogue response module is used for generating and outputting a current response statement for the intelligent agent by utilizing a pre-trained dialogue generation model based on the dialogue historical data and the candidate knowledge subset.
The embodiment of the invention also discloses equipment for generating the intelligent agent dialogue sentences in the man-machine dialogue, which comprises a processor and a memory;
the memory stores an application program executable by the processor, and the application program is used for enabling the processor to execute the method for generating the intelligent agent dialogue statement in the man-machine dialogue.
A computer-readable storage medium having stored therein computer-readable instructions for executing the method for generating an agent dialog statement in a human-computer dialog as described above.
According to the technical solution above, the scheme for generating intelligent agent dialogue sentences in man-machine dialogue provided by the embodiment of the invention extracts the attribute values of the knowledge base from the dialogue history data and identifies the dialogue scene category, selects a candidate knowledge subset related to the dialogue from the knowledge base based on these attribute values and the scene category, and then generates the current agent response sentence based on the candidate knowledge subset and the current dialogue history data. On one hand, constructing the candidate knowledge subset effectively reduces the number of knowledge triples used for generating the response sentence, which reduces the computational overhead of response generation and improves generation efficiency. On the other hand, screening the knowledge triples in the knowledge base based on the scene category makes the candidate knowledge subset match the current scene category, so that the generated response sentence matches the current man-machine dialogue scene, which improves the intelligence of the response sentences and the user's dialogue experience. Therefore, the scheme for generating intelligent agent dialogue sentences provided by the embodiment of the invention is applicable to a variety of task scenarios.
Drawings
FIG. 1 is a schematic flow chart of a method according to an embodiment of the present invention.
Fig. 2 is a schematic structural diagram of an apparatus according to an embodiment of the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention more apparent, the present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
Fig. 1 is a schematic flow diagram of an embodiment of the present invention, and as shown in fig. 1, a method for generating an agent dialog statement in a human-computer dialog implemented by the embodiment mainly includes:
101, extracting attribute values and scene categories in a preset knowledge base from conversation historical data of current man-machine conversation by using a pre-trained natural language understanding model; wherein the knowledge base is composed of knowledge triples.
Here, the dialogue history data of the current man-machine dialogue consists of the dialogue sentences that have already been produced in the current man-machine dialogue. This step identifies the attribute values and the scene category in the dialogue history data, so that in step 102 a candidate knowledge subset related to the current dialogue can be constructed based on them; in the subsequent steps, the response sentence only needs to be generated based on this candidate knowledge subset rather than the whole knowledge base, which improves the generation efficiency of the response sentence.
The dialogue history data may be expressed as X = (x_1, ..., x_n), where each element corresponds to a word and n denotes the number of words contained in the dialogue history data.
It should be noted that the embodiment of the present invention requires a large-scale data set usable for multi-task scenario dialogue to be constructed in advance. The training data can be expanded to large scale according to templates and the knowledge base. A domain-specific database is given, containing a primary key field and several attribute fields describing the primary key. The primary key field refers to a specific entity object that uniquely identifies a record in the database, such as a movie, a hotel, or a tourist attraction; the attribute fields are the elements describing the primary key entity, for example a movie has attributes such as "Director", "Actor", and "Show Time", and each attribute has an attribute value, which is also regarded as an entity. The attribute name usually describes the type of semantic relation between an entity and one of its attribute values; for example, the movie "Under the Hawthorn Tree" has the attribute "Director" with the attribute value "Zhang Yimou", so "Director" describes the semantic relation between "Under the Hawthorn Tree" and "Zhang Yimou". Thus an entity and one of its attribute values can be represented as a knowledge triple consisting of a head entity, a relation and a tail entity; for example, <Under the Hawthorn Tree, Director, Zhang Yimou> indicates that the relation between the head entity "Under the Hawthorn Tree" and the tail entity "Zhang Yimou" is "Director". The knowledge base is composed of a large number of such knowledge triples. The relation types between entities can be defined for different domains; for example, in the movie domain, 12 relation types can be designed: Movie_Name, Actor, Director, Writer, Release, Genre, Language, Plot, Date, Num_Tickets, Theatre_Name, and Time.
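For illustration only, the following Python sketch shows one way such a triple knowledge base might be represented and indexed in code; the Triple and KnowledgeBase names and the indexing scheme are assumptions of this sketch rather than part of the invention.

```python
from collections import defaultdict
from typing import NamedTuple

class Triple(NamedTuple):
    head: str      # primary-key entity, e.g. a movie title
    relation: str  # attribute name, e.g. "Director"
    tail: str      # attribute value, also treated as an entity

class KnowledgeBase:
    """Stores knowledge triples and indexes them by the entities they mention."""
    def __init__(self, triples):
        self.triples = list(triples)
        self.by_entity = defaultdict(list)   # entity value -> triples containing it
        for t in self.triples:
            self.by_entity[t.head].append(t)
            self.by_entity[t.tail].append(t)

    def lookup(self, value):
        """Return all triples whose head or tail matches the given attribute value."""
        return self.by_entity.get(value, [])

# Example: the triple <Under the Hawthorn Tree, Director, Zhang Yimou>
kb = KnowledgeBase([Triple("Under the Hawthorn Tree", "Director", "Zhang Yimou")])
```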
Specifically, the method for constructing the knowledge base based on the knowledge triples is known to those skilled in the art, and is not described herein again.
In one embodiment, the following method may be specifically adopted to extract the attribute values and the scene categories in the preset knowledge base:
and step 1011, after the dialogue history data is spliced with a preset special mark, inputting the dialogue history data into an encoder of the natural language understanding model for encoding, and obtaining a corresponding dialogue history vector and a corresponding scene information vector.
In this step, the dialogue history data X = (x_1, ..., x_n) is concatenated with a special mark (denoted x_@), and the concatenation result (x_1, ..., x_n, x_@) is encoded to obtain the vector representation H = (h_1, ..., h_n, h_@). The scene information vector h_@ obtained by encoding the special mark x_@ carries scene information related to the dialogue history data, so that scene classification can be performed in the subsequent steps based on this vector.
Specifically, the encoder may be a BiGRU, and the specific encoding method is the same as that in the prior art, and is not described herein again.
Step 1012, inputting the dialogue history vector into a Conditional Random Field (CRF) layer of the natural language understanding model for sequence labeling, so as to obtain the attribute values included in the dialogue history data.
This step can be expressed by the formula Y_logit = CRF(h_1, ..., h_n), where Y_logit denotes the sequence labeling result.
And 1013, inputting the scene information vector into a multilayer perceptron of the natural language understanding model to perform scene classification, so as to obtain the scene category of the man-machine conversation.
This step can be expressed by the formula Sce_logit = softmax(MLP(h_@)), where Sce_logit denotes the output scene category.
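For illustration, a minimal PyTorch-style sketch of the natural language understanding model described in steps 1011 to 1013 is given below. The hidden sizes and the tag and scene counts are illustrative, and the CRF layer of step 1012 is replaced here by a simple per-token linear projection, since full CRF decoding is omitted for brevity.

```python
import torch
import torch.nn as nn

class NLUModel(nn.Module):
    """Sketch of the understanding model: BiGRU encoder, token labeler, scene classifier."""
    def __init__(self, vocab_size, emb_dim=128, hidden=128, num_tags=25, num_scenes=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.encoder = nn.GRU(emb_dim, hidden, bidirectional=True, batch_first=True)
        # Stand-in for the CRF sequence-labeling layer described above:
        # a per-token tag projection (a real CRF would also model tag transitions).
        self.tagger = nn.Linear(2 * hidden, num_tags)
        self.scene_mlp = nn.Sequential(
            nn.Linear(2 * hidden, hidden), nn.Tanh(), nn.Linear(hidden, num_scenes))

    def forward(self, token_ids):
        # token_ids: (batch, n + 1), where the last position is the special mark x_@
        h, _ = self.encoder(self.embed(token_ids))        # (batch, n + 1, 2*hidden)
        dialog_vecs, scene_vec = h[:, :-1, :], h[:, -1, :]
        tag_logits = self.tagger(dialog_vecs)             # sequence labeling -> attribute values
        scene_logits = torch.softmax(self.scene_mlp(scene_vec), dim=-1)  # scene category
        return tag_logits, scene_logits
```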
And 102, screening related knowledge triples from the knowledge base based on the attribute values and the scene categories to obtain candidate knowledge subsets.
In an embodiment, the following method may be specifically adopted to screen out relevant knowledge triples from the knowledge base based on the attribute values and the scene categories, so as to obtain a candidate knowledge subset:
a. and if the scene category is chatting, traversing each attribute value, searching a knowledge triple containing the attribute value from the knowledge base, and constructing the candidate knowledge subset by using all the searched knowledge triples.
b. If the scene category is question answering, traversing each attribute value contained in the latest dialog in the dialog historical data, and searching a knowledge triple containing the attribute value from the knowledge base; and constructing the candidate knowledge subsets by using all the found knowledge triples.
Considering that in the question-answering scenario the agent's response should answer the latest question posed by the user, the relevant knowledge triples need to be found based on each attribute value contained in the latest round of dialogue in the dialogue history data, so that the constructed candidate knowledge subset matches the requirements of the dialogue scene.
c. If the scene category is recommendation, combining all the primary key entity values in the attribute values pairwise, traversing each combination, determining a common attribute value of the attribute values in the combination, and searching a knowledge triple containing the common attribute value from the knowledge base for each common attribute value; and constructing the candidate knowledge subset by using all the found knowledge triples.
Considering that in the recommendation scenario the agent needs to provide information the user is interested in, all the primary key entity values among the attribute values in the dialogue history are combined pairwise to obtain all possible entity-value pairs. The user's points of interest are found by searching for common attribute values between the two primary key entity values of each pair, and the knowledge triples are then screened based on these common attribute values, so that the screened knowledge triples provide information of interest to the user and meet the task requirements of the recommendation scenario.
d. And if the scene type is a task type conversation, traversing each key entity value in the attribute values, searching a knowledge triple which contains the key entity value and is related to the current man-machine conversation task from the knowledge base, and constructing the candidate knowledge subset by using all the searched knowledge triples.
In consideration of the fact that the human-computer conversation in the task-based conversation scene needs to complete a predetermined task, in this case, when the knowledge triples are screened, it is necessary to ensure that the screened knowledge triples are related to the current human-computer conversation task so as to meet the task requirements of the task-based conversation scene.
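For illustration, the following Python sketch outlines the scene-dependent screening rules a to d above. The scene labels, the task_filter predicate and the kb.lookup helper (from the earlier KnowledgeBase sketch) are assumptions of this sketch, not names defined by the invention.

```python
from itertools import combinations

def build_candidate_subset(scene, attr_values, latest_turn_values,
                           primary_keys, kb, task_filter=None):
    """Select knowledge triples according to the recognized scene category (a sketch;
    `kb` is the KnowledgeBase defined earlier, `task_filter` is a hypothetical predicate
    marking triples relevant to the current task)."""
    candidates = []
    if scene == "chat":
        for v in attr_values:                       # (a) every attribute value in the history
            candidates += kb.lookup(v)
    elif scene == "qa":
        for v in latest_turn_values:                # (b) only values from the latest turn
            candidates += kb.lookup(v)
    elif scene == "recommend":
        for a, b in combinations(primary_keys, 2):  # (c) shared attribute values of entity pairs
            common = {t.tail for t in kb.lookup(a)} & {t.tail for t in kb.lookup(b)}
            for v in common:
                candidates += kb.lookup(v)
    elif scene == "task":
        for v in primary_keys:                      # (d) task-relevant triples for key entities
            candidates += [t for t in kb.lookup(v)
                           if task_filter is None or task_filter(t)]
    return list(dict.fromkeys(candidates))          # de-duplicate while keeping order
```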
And 103, generating and outputting a current response sentence for the intelligent agent by utilizing a pre-trained dialogue generation model based on the dialogue historical data and the candidate knowledge subset.
In one embodiment, the following method may be specifically adopted to generate a current response statement for the agent based on the dialogue history data and the candidate knowledge subset:
step 1031, inputting the dialogue history data into a dialogue encoder of the dialogue generating model for encoding, and obtaining a comprehensive characterization vector C of the dialogue history data and word vectors of all words contained in the dialogue history data.
In one embodiment, the step may input the dialogue history data to a dialogue coder of the dialogue generating model to perform coding processing by using the following method:
step x1, adding the dialogue role information and the dialogue turn information of each word in the dialogue history data to expand the dialogue history data.
This step expands the dialogue history data by appending the corresponding dialogue role information and dialogue turn information to each word, i.e., the dialogue history data X = (x_1, ..., x_n) is extended to X = (c_1, ..., c_n), where c_i = (x_i, u/s, t), 1 ≤ i ≤ n, u denotes a user utterance, s denotes an utterance returned by the agent, and t denotes the dialogue turn. The expanded dialogue history allows the model to capture more dialogue-related information during encoding, which helps generate a reply sentence that better matches the user's utterance and thus improves the effectiveness of the reply.
And step x2, dividing the expanded dialogue history data according to the dialogue turns.
And step x3, coding each turn of dialogue data obtained by the division with a sentence-level bidirectional gated recurrent unit network (BiGRU) to obtain word vectors of all words contained in each turn of dialogue.
Step x4, a self-attention mechanism is used to calculate the first dialogue vector of each turn of dialogue based on the word vectors of all words contained in that turn.
Step x5, encoding the first dialogue vectors of all the turns with a turn-level BiGRU to obtain the second dialogue vector of each turn of dialogue.
And step x6, calculating a comprehensive characterization vector C of the dialogue historical data by adopting a self-attention mechanism based on the second dialogue vector.
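For illustration, a minimal PyTorch-style sketch of the hierarchical dialogue encoder of steps x3 to x6 is shown below. It assumes the role- and turn-expanded history of steps x1 and x2 has already been embedded and split into per-turn tensors; the dimensions and module names are illustrative.

```python
import torch
import torch.nn as nn

class SelfAttentionPool(nn.Module):
    """Collapses a sequence of vectors into one vector with learned attention weights."""
    def __init__(self, dim):
        super().__init__()
        self.score = nn.Linear(dim, 1)

    def forward(self, x):                        # x: (batch, seq, dim)
        w = torch.softmax(self.score(x), dim=1)
        return (w * x).sum(dim=1)                # (batch, dim)

class HierarchicalDialogueEncoder(nn.Module):
    """Sketch of steps x3-x6: word-level BiGRU per turn, then a turn-level BiGRU."""
    def __init__(self, emb_dim=128, hidden=128):
        super().__init__()
        self.word_gru = nn.GRU(emb_dim, hidden, bidirectional=True, batch_first=True)
        self.word_pool = SelfAttentionPool(2 * hidden)
        self.turn_gru = nn.GRU(2 * hidden, hidden, bidirectional=True, batch_first=True)
        self.turn_pool = SelfAttentionPool(2 * hidden)

    def forward(self, turns):                    # turns: list of (1, words_in_turn, emb_dim)
        word_vecs, first_turn_vecs = [], []
        for turn in turns:
            h, _ = self.word_gru(turn)                      # word vectors of one turn (x3)
            word_vecs.append(h)
            first_turn_vecs.append(self.word_pool(h))       # first dialogue vector (x4)
        seq = torch.stack(first_turn_vecs, dim=1)           # (1, turns, 2*hidden)
        second_turn_vecs, _ = self.turn_gru(seq)            # second dialogue vectors (x5)
        C = self.turn_pool(second_turn_vecs)                # comprehensive vector C (x6)
        return torch.cat(word_vecs, dim=1), C
```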
Step 1032, inputting the candidate knowledge subsets into a knowledge encoder of the dialogue generating model for encoding, so as to obtain a comprehensive characterization vector kg of the candidate knowledge subsets and a vector representation of each knowledge triple in the candidate knowledge subsets.
In one embodiment, this step may encode the candidate knowledge subset by inputting the candidate knowledge subset to a knowledge encoder of the dialog generation model by:
and step y1, calculating an entity word vector of each knowledge triple in the candidate knowledge subset by using a TransE model.
And y2, obtaining the vector representation of each knowledge triple in the candidate knowledge subset by using a multilayer perceptron based on the entity word vector of each knowledge triple.
And step y3, obtaining a comprehensive characterization vector kg of the candidate knowledge subset by using a self-attention mechanism based on the vector representation of each knowledge triple.
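For illustration, the following PyTorch-style sketch corresponds to steps y2 and y3. The TransE entity and relation embeddings of step y1 are assumed to be pretrained and supplied as input, and the dimensions are illustrative.

```python
import torch
import torch.nn as nn

class KnowledgeEncoder(nn.Module):
    """Sketch of steps y2-y3: encode each triple from pretrained TransE embeddings
    with an MLP, then pool with self-attention into the comprehensive vector kg."""
    def __init__(self, transe_dim=100, hidden=256):
        super().__init__()
        self.mlp = nn.Sequential(
            nn.Linear(3 * transe_dim, hidden), nn.Tanh(), nn.Linear(hidden, hidden))
        self.score = nn.Linear(hidden, 1)

    def forward(self, triple_embs):              # (batch, g, 3*transe_dim): [head; rel; tail]
        k = self.mlp(triple_embs)                # vector representation of each triple (y2)
        w = torch.softmax(self.score(k), dim=1)  # self-attention weights over triples
        kg = (w * k).sum(dim=1)                  # comprehensive characterization vector (y3)
        return k, kg
```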
And 1033, generating the response sentence by using a natural language generator of the dialogue generating model based on the comprehensive characterization vector C of the dialogue history data, the comprehensive characterization vector kg of the candidate knowledge subset, the word vector and the vector representation of the knowledge triplet.
Preferably, the natural language generator of the dialogue generation model may generate the response sentence through dynamic interaction between a memory network and a GRU, using the following method:
and step z1, splicing the vector representation of the knowledge triple with the word vector, and writing the spliced result M into the memory network of the natural language generator.
Here, the vector representations of the knowledge triples KG = [k_1, ..., k_g] and the word vectors of the words in the dialogue history H = (h_1, ..., h_n) are concatenated and written into the memory network as input.
Specifically, M = [(h_1, ..., h_n); (k_1, ..., k_g)] = [M_1, ..., M_{n+g}], where h_n denotes the n-th word vector, k_g denotes the vector representation of the g-th knowledge triple, n denotes the number of word vectors, and g denotes the number of knowledge triples.
The memory network, i.e. the memory, supports reading and writing; it mainly stores the output H = (h_1, ..., h_n) of the dialogue encoder and the output KG = [k_1, ..., k_g] of the knowledge encoder, so that the GRU can query them when dynamically generating a word at each time step.
Step z2, initializing the initial query vector s_0 of the GRU used by the natural language generator for decoding to the concatenation result of the comprehensive characterization vector C and the comprehensive characterization vector kg.
Step z3, at each decoding time t, the GRU generates the query vector s_t of the current time t based on the query vector s_{t-1} of the previous time and the generated word y_{t-1} of the previous time, i.e., the query vector is obtained according to s_t = GRU(s_{t-1}, e(y_{t-1})).
Using an attention mechanism, the correlation between the query vector s_t and each storage unit of the memory network is computed, yielding the correlation α_i^t between the query vector s_t and each word in the dialogue history data, where i denotes the word index and 1 ≤ i ≤ n, and the correlation β_r^t between the query vector s_t and each knowledge triple in the candidate knowledge subset, where r denotes the knowledge triple index and 1 ≤ r ≤ g.
Based on the correlations α_i^t, the joint representation c_t of the dialogue history data is computed by weighted summation; based on the correlations β_r^t, the joint representation g_t of the candidate knowledge subset is computed by weighted summation.
Taking c_t as a query vector, the memory network is accessed in a multi-hop manner to obtain the knowledge distribution p_ptr, i.e., p_ptr = multihop([s_t, g_t], M); the specific method of accessing the memory network in a multi-hop manner is known to those skilled in the art and is not described again here.
Taking g_t as a query vector, a preset dictionary is accessed with a multilayer perceptron to obtain the dictionary distribution p_vocab, i.e., p_vocab = mlp([s_t, g_t], V).
Based on the knowledge distribution p_ptr and the dictionary distribution p_vocab, the generated word y_t at the current time t is obtained with a gating mechanism.
In this step, the joint representation c_t of the dialogue history data is used as a query vector to access the memory network in a multi-hop manner to obtain the knowledge distribution p_ptr, and using the concatenation of the knowledge triple vector representations and the word vectors as the memory content can improve the accuracy of the knowledge distribution p_ptr.
And step z4, the GRU generates a current response statement for the agent based on the generated words at all the time.
In this step, the system response Y = (y_1, ..., y_t, ..., y_m) is generated from the words selected by the GRU at all time steps, where y_t denotes the word generated by the GRU at time t.
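For illustration, the following PyTorch-style sketch shows a single decoding step in the spirit of step z3. The exact attention functions and the multi-hop memory access are not fully reproduced here; simple dot-product attention and a single hop are used as stand-ins, so this is a simplified sketch under those assumptions rather than the method as claimed.

```python
import torch
import torch.nn as nn

class GeneratorStep(nn.Module):
    """Sketch of one decoding step: attend over the memory M = [H; KG], build c_t and
    g_t, produce a pointer distribution over memory cells and a vocabulary distribution,
    and mix them with a scalar gate."""
    def __init__(self, dim, vocab_size):
        super().__init__()
        self.gru_cell = nn.GRUCell(dim, dim)
        self.vocab_mlp = nn.Sequential(nn.Linear(2 * dim, dim), nn.Tanh(),
                                       nn.Linear(dim, vocab_size))
        self.gate = nn.Linear(3 * dim, 1)

    def forward(self, s_prev, y_prev_emb, H, KG):
        # H: (n, dim) word vectors; KG: (g, dim) triple vectors; memory M = [H; KG]
        s_t = self.gru_cell(y_prev_emb.unsqueeze(0), s_prev.unsqueeze(0)).squeeze(0)
        alpha = torch.softmax(H @ s_t, dim=0)     # relevance to each history word
        beta = torch.softmax(KG @ s_t, dim=0)     # relevance to each knowledge triple
        c_t = alpha @ H                           # joint representation of the history
        g_t = beta @ KG                           # joint representation of the knowledge
        M = torch.cat([H, KG], dim=0)
        p_ptr = torch.softmax(M @ c_t, dim=0)     # pointer distribution over memory cells
        p_vocab = torch.softmax(self.vocab_mlp(torch.cat([s_t, g_t])), dim=-1)
        gate = torch.sigmoid(self.gate(torch.cat([s_t, c_t, g_t])))  # copy-vs-generate gate
        return s_t, p_ptr, p_vocab, gate
```

In this sketch the gate plays the role of the gating mechanism: it decides, at each time step, how much probability mass comes from the pointer distribution over the memory and how much from the vocabulary distribution.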
Based on the above embodiment, it can be seen that this technical solution identifies the attribute values and the dialogue scene category in the dialogue history data, screens knowledge triples related to the dialogue from the knowledge base based on these attribute values and the scene category to construct a candidate knowledge subset, and then generates the current system response sentence based on the candidate knowledge subset and the current dialogue history data. On one hand, constructing the candidate knowledge subset effectively reduces the number of knowledge triples used for generating the response sentence, which reduces the computational overhead of response generation and improves generation efficiency. On the other hand, screening the knowledge triples in the knowledge base based on the scene category makes the candidate knowledge subset match the current scene category, so that the generated response sentence matches the current man-machine dialogue scene, which improves the intelligence and accuracy of the response sentences and effectively improves the user's man-machine dialogue experience. Therefore, the invention is applicable to a variety of task scenarios.
Corresponding to the above method embodiment, the embodiment of the present invention further discloses a device for generating an intelligent agent dialog statement in a human-computer dialog, as shown in fig. 2, the device includes:
the information extraction module is used for extracting attribute values and scene categories in a preset knowledge base from conversation historical data of the current man-machine conversation by utilizing a pre-trained natural language understanding model; wherein the knowledge base is composed of knowledge triples;
the knowledge screening module is used for screening out related knowledge triples from the knowledge base based on the attribute values and the scene categories to obtain candidate knowledge subsets;
and the dialogue response module is used for generating and outputting a current response statement for the intelligent agent by utilizing a pre-trained dialogue generation model based on the dialogue historical data and the candidate knowledge subset.
The embodiment of the invention also discloses equipment for generating the intelligent agent dialogue sentences in the man-machine dialogue, which comprises a processor and a memory; the memory stores an application program executable by the processor, and the application program is used for enabling the processor to execute the method for generating the intelligent agent dialogue statement in the man-machine dialogue.
The memory may be embodied as various storage media such as an Electrically Erasable Programmable Read Only Memory (EEPROM), a Flash memory (Flash memory), and a Programmable Read Only Memory (PROM). The processor may be implemented to include one or more central processors or one or more field programmable gate arrays, wherein the field programmable gate arrays integrate one or more central processor cores. In particular, the central processor or central processor core may be implemented as a CPU or MCU.
It should be noted that not all steps and modules in the above flows and structures are necessary, and some steps or modules may be omitted according to actual needs. The execution order of the steps is not fixed and can be adjusted as required. The division of each module is only for convenience of describing adopted functional division, and in actual implementation, one module may be divided into multiple modules, and the functions of multiple modules may also be implemented by the same module, and these modules may be located in the same device or in different devices.
The hardware modules in the various embodiments may be implemented mechanically or electronically. For example, a hardware module may include a specially designed permanent circuit or logic device (e.g., a special purpose processor such as an FPGA or ASIC) for performing specific operations. A hardware module may also include programmable logic devices or circuits (e.g., including a general-purpose processor or other programmable processor) that are temporarily configured by software to perform certain operations. The implementation of the hardware module in a mechanical manner, or in a dedicated permanent circuit, or in a temporarily configured circuit (e.g., configured by software), may be determined based on cost and time considerations.
The present invention also provides a machine-readable storage medium storing instructions for causing a machine to perform a method as described herein. Specifically, a system or an apparatus equipped with a storage medium on which a software program code that realizes the functions of any of the embodiments described above is stored may be provided, and a computer (or a CPU or MPU) of the system or the apparatus is caused to read out and execute the program code stored in the storage medium. Further, part or all of the actual operations may be performed by an operating system or the like operating on the computer by instructions based on the program code. The functions of any of the above-described embodiments may also be implemented by writing the program code read out from the storage medium to a memory provided in an expansion board inserted into the computer or to a memory provided in an expansion unit connected to the computer, and then causing a CPU or the like mounted on the expansion board or the expansion unit to perform part or all of the actual operations based on the instructions of the program code.
Examples of the storage medium for supplying the program code include floppy disks, hard disks, magneto-optical disks, optical disks (e.g., CD-ROMs, CD-R, CD-RWs, DVD-ROMs, DVD-RAMs, DVD-RWs, DVD + RWs), magnetic tapes, nonvolatile memory cards, and ROMs. Alternatively, the program code may be downloaded from a server computer or the cloud by a communication network.
"exemplary" means "serving as an example, instance, or illustration" herein, and any illustration, embodiment, or steps described as "exemplary" herein should not be construed as a preferred or advantageous alternative. For the sake of simplicity, the drawings are only schematic representations of the parts relevant to the invention, and do not represent the actual structure of the product. In addition, in order to make the drawings concise and understandable, components having the same structure or function in some of the drawings are only schematically illustrated or only labeled. In this document, "a" does not mean that the number of the relevant portions of the present invention is limited to "only one", and "a" does not mean that the number of the relevant portions of the present invention "more than one" is excluded. In this document, "upper", "lower", "front", "rear", "left", "right", "inner", "outer", and the like are used only to indicate relative positional relationships between relevant portions, and do not limit absolute positions of the relevant portions.
The above description is only a preferred embodiment of the present invention, and is not intended to limit the scope of the present invention. Any modification, equivalent replacement, or improvement made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method for generating intelligent agent dialogue sentences in man-machine dialogue is characterized by comprising the following steps:
extracting attribute values and scene categories in a preset knowledge base from conversation historical data of the current man-machine conversation by using a pre-trained natural language understanding model; wherein the knowledge base is composed of knowledge triples;
based on the attribute values and the scene categories, relevant knowledge triples are screened out from the knowledge base to obtain candidate knowledge subsets;
and generating and outputting a current response sentence for the intelligent agent by utilizing a pre-trained dialogue generation model based on the dialogue historical data and the candidate knowledge subset.
2. The method of claim 1, wherein extracting the attribute values and scene categories in the predetermined knowledge base comprises:
splicing the dialogue history data with a preset special mark, and inputting the dialogue history data into an encoder of the natural language understanding model for encoding to obtain a corresponding dialogue history vector and a corresponding scene information vector;
inputting the dialogue history vector into a CRF layer of the natural language understanding model for sequence labeling to obtain the attribute values contained in the dialogue history data;
and inputting the scene information vector into a multilayer perceptron of the natural language understanding model for scene classification to obtain the scene category of the man-machine conversation.
3. The method of claim 1, wherein the screening of relevant knowledge triples from the knowledge base based on the attribute values and the scene categories to obtain candidate knowledge subsets comprises:
if the scene category is chatting, traversing each attribute value, searching a knowledge triple containing the attribute value from the knowledge base, and constructing the candidate knowledge subset by using all the searched knowledge triples;
if the scene category is question answering, traversing each attribute value contained in the latest dialog in the dialog historical data, and searching a knowledge triple containing the attribute value from the knowledge base; constructing the candidate knowledge subsets by using all searched knowledge triples;
if the scene category is recommendation, combining all the primary key entity values in the attribute values pairwise, traversing each combination, determining a common attribute value of the attribute values in the combination, and searching a knowledge triple containing the common attribute value from the knowledge base for each common attribute value; constructing the candidate knowledge subset by using all searched knowledge triples;
and if the scene type is a task type conversation, traversing each key entity value in the attribute values, searching a knowledge triple which contains the key entity value and is related to the current man-machine conversation task from the knowledge base, and constructing the candidate knowledge subset by using all the searched knowledge triples.
4. The method of claim 1, wherein generating a current response statement for an agent using a pre-trained dialogue generation model based on the dialogue history data and the candidate knowledge subset comprises:
inputting the dialogue historical data into a dialogue coder of the dialogue generating model for coding to obtain a comprehensive characterization vector C of the dialogue historical data and word vectors of all words contained in the dialogue historical data;
inputting the candidate knowledge subsets into a knowledge encoder of the dialogue generation model for encoding to obtain a comprehensive characterization vector kg of the candidate knowledge subsets and a vector representation of each knowledge triple in the candidate knowledge subsets;
and generating the response sentence by utilizing a natural language generator of the dialogue generation model based on the comprehensive characterization vector C of the dialogue historical data, the comprehensive characterization vector kg of the candidate knowledge subset, the word vector and the vector representation of the knowledge triplet.
5. The method of claim 4, wherein the inputting the dialogue history data into the dialogue coder of the dialogue generating model for coding comprises:
expanding the conversation history data by adding conversation role information and conversation turn information to which each word belongs in the conversation history data;
dividing the expanded dialogue historical data according to dialogue turns;
coding each turn of dialogue data obtained by the division with a sentence-level bidirectional gated recurrent unit network (BiGRU) to obtain word vectors of all words contained in each turn of dialogue;
calculating a first dialogue vector of each turn of dialogue by adopting a self-attention mechanism based on the word vectors of all words contained in that turn;
encoding the first dialogue vectors of all the turns with a turn-level BiGRU to obtain a second dialogue vector of each turn of dialogue;
and calculating a comprehensive characterization vector C of the dialogue historical data by adopting a self-attention mechanism based on the second dialogue vector.
6. The method of claim 4, wherein the inputting the subset of candidate knowledge into the knowledge coder of the dialog generation model for encoding comprises:
calculating an entity word vector of each knowledge triple in the candidate knowledge subset by using a TransE model;
obtaining a vector representation of each knowledge triple in the candidate knowledge subset by using a multilayer perceptron based on the entity word vector of each knowledge triple;
and obtaining a comprehensive characterization vector kg of the candidate knowledge subset by using a self-attention mechanism based on the vector representation of each knowledge triple.
7. The method of claim 4, wherein generating the response sentence using a natural language generator of the dialog generation model comprises:
splicing the vector representation of the knowledge triples with the word vectors, and writing the spliced result M into a memory network of the natural language generator; wherein M = [(h_1, ..., h_n); (k_1, ..., k_g)] = [M_1, ..., M_{n+g}], h_n denotes the n-th word vector; k_g denotes the vector representation of the g-th knowledge triple; n denotes the number of word vectors; g denotes the number of knowledge triples;
initializing the initial query vector s_0 of the GRU used by the natural language generator for decoding to the concatenation result of the comprehensive characterization vector C and the comprehensive characterization vector kg;
at each decoding time t, generating, by the GRU, the query vector s_t of the current time t based on the query vector s_{t-1} of the previous time and the generated word y_{t-1} of the previous time; computing, with an attention mechanism, the correlation between the query vector s_t and each storage unit of the memory network to obtain the correlation α_i^t between the query vector s_t and each word in the dialogue history data and the correlation β_r^t between the query vector s_t and each knowledge triple in the candidate knowledge subset; computing the joint representation c_t of the dialogue history data by weighted summation based on the correlations α_i^t, and computing the joint representation g_t of the candidate knowledge subset by weighted summation based on the correlations β_r^t; taking c_t as a query vector, accessing the memory network in a multi-hop manner to obtain the knowledge distribution p_ptr; taking g_t as a query vector, accessing a preset dictionary with a multilayer perceptron to obtain the dictionary distribution p_vocab; and obtaining the generated word y_t at the current time t with a gating mechanism based on the knowledge distribution p_ptr and the dictionary distribution p_vocab;
And the GRU generates a current response statement for the intelligent agent based on the generated words at all the moments.
8. An apparatus for generating dialog sentences of an agent in a human-computer dialog, comprising:
the information extraction module is used for extracting attribute values and scene categories in a preset knowledge base from conversation historical data of the current man-machine conversation by utilizing a pre-trained natural language understanding model; wherein the knowledge base is composed of knowledge triples;
the knowledge screening module is used for screening out related knowledge triples from the knowledge base based on the attribute values and the scene categories to obtain candidate knowledge subsets;
and the dialogue response module is used for generating and outputting a current response statement for the intelligent agent by utilizing a pre-trained dialogue generation model based on the dialogue historical data and the candidate knowledge subset.
9. The generation equipment of the intelligent agent dialogue sentences in the man-machine dialogue is characterized by comprising a processor and a memory;
the memory stores an application program executable by the processor for causing the processor to execute the method for generating an intelligent agent dialogue statement in a man-machine dialogue according to any one of claims 1 to 7.
10. A computer-readable storage medium having stored therein computer-readable instructions for executing the method for generating an agent dialog statement in a human-computer dialog according to any one of claims 1 to 7.
CN202110133448.0A 2021-02-01 2021-02-01 Method and device for generating intelligent agent dialogue sentences in man-machine dialogue Active CN112860862B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110133448.0A CN112860862B (en) 2021-02-01 2021-02-01 Method and device for generating intelligent agent dialogue sentences in man-machine dialogue

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110133448.0A CN112860862B (en) 2021-02-01 2021-02-01 Method and device for generating intelligent agent dialogue sentences in man-machine dialogue

Publications (2)

Publication Number Publication Date
CN112860862A true CN112860862A (en) 2021-05-28
CN112860862B CN112860862B (en) 2022-11-11

Family

ID=75987310

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110133448.0A Active CN112860862B (en) 2021-02-01 2021-02-01 Method and device for generating intelligent agent dialogue sentences in man-machine dialogue

Country Status (1)

Country Link
CN (1) CN112860862B (en)

Cited By (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113268577A (en) * 2021-06-04 2021-08-17 厦门快商通科技股份有限公司 Training data processing method and device based on dialogue relation and readable medium
CN113268609A (en) * 2021-06-22 2021-08-17 中国平安人寿保险股份有限公司 Dialog content recommendation method, device, equipment and medium based on knowledge graph
CN113656566A (en) * 2021-08-18 2021-11-16 中国平安人寿保险股份有限公司 Intelligent dialogue processing method and device, computer equipment and storage medium
CN116009827A (en) * 2023-03-28 2023-04-25 杭州实在智能科技有限公司 Intelligent generation and recommendation method and system for RPA (remote procedure association) flow and guiding course
CN116775815A (en) * 2022-03-07 2023-09-19 腾讯科技(深圳)有限公司 Dialogue data processing method and device, electronic equipment and storage medium
CN116775848A (en) * 2023-08-23 2023-09-19 宁波吉利汽车研究开发有限公司 Control method, device, computing equipment and storage medium for generating dialogue information
CN116775815B (en) * 2022-03-07 2024-04-26 腾讯科技(深圳)有限公司 Dialogue data processing method and device, electronic equipment and storage medium

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170228372A1 (en) * 2016-02-08 2017-08-10 Taiger Spain Sl System and method for querying questions and answers
CN109033223A (en) * 2018-06-29 2018-12-18 北京百度网讯科技有限公司 For method, apparatus, equipment and computer readable storage medium across type session
CN110245224A (en) * 2019-06-20 2019-09-17 网易(杭州)网络有限公司 Talk with generation method and device
CN110955675A (en) * 2019-10-30 2020-04-03 中国银联股份有限公司 Robot dialogue method, device, equipment and computer readable storage medium
CN111414465A (en) * 2020-03-16 2020-07-14 北京明略软件系统有限公司 Processing method and device in question-answering system based on knowledge graph

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20170228372A1 (en) * 2016-02-08 2017-08-10 Taiger Spain Sl System and method for querying questions and answers
CN109033223A (en) * 2018-06-29 2018-12-18 北京百度网讯科技有限公司 For method, apparatus, equipment and computer readable storage medium across type session
CN110245224A (en) * 2019-06-20 2019-09-17 网易(杭州)网络有限公司 Talk with generation method and device
CN110955675A (en) * 2019-10-30 2020-04-03 中国银联股份有限公司 Robot dialogue method, device, equipment and computer readable storage medium
CN111414465A (en) * 2020-03-16 2020-07-14 北京明略软件系统有限公司 Processing method and device in question-answering system based on knowledge graph

Cited By (9)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113268577A (en) * 2021-06-04 2021-08-17 厦门快商通科技股份有限公司 Training data processing method and device based on dialogue relation and readable medium
CN113268609A (en) * 2021-06-22 2021-08-17 中国平安人寿保险股份有限公司 Dialog content recommendation method, device, equipment and medium based on knowledge graph
CN113268609B (en) * 2021-06-22 2023-12-01 中国平安人寿保险股份有限公司 Knowledge graph-based dialogue content recommendation method, device, equipment and medium
CN113656566A (en) * 2021-08-18 2021-11-16 中国平安人寿保险股份有限公司 Intelligent dialogue processing method and device, computer equipment and storage medium
CN116775815A (en) * 2022-03-07 2023-09-19 腾讯科技(深圳)有限公司 Dialogue data processing method and device, electronic equipment and storage medium
CN116775815B (en) * 2022-03-07 2024-04-26 腾讯科技(深圳)有限公司 Dialogue data processing method and device, electronic equipment and storage medium
CN116009827A (en) * 2023-03-28 2023-04-25 杭州实在智能科技有限公司 Intelligent generation and recommendation method and system for RPA (remote procedure association) flow and guiding course
CN116775848A (en) * 2023-08-23 2023-09-19 宁波吉利汽车研究开发有限公司 Control method, device, computing equipment and storage medium for generating dialogue information
CN116775848B (en) * 2023-08-23 2023-11-07 宁波吉利汽车研究开发有限公司 Control method, device, computing equipment and storage medium for generating dialogue information

Also Published As

Publication number Publication date
CN112860862B (en) 2022-11-11

Similar Documents

Publication Publication Date Title
CN112860862B (en) Method and device for generating intelligent agent dialogue sentences in man-machine dialogue
Mathews et al. Semstyle: Learning to generate stylised image captions using unaligned text
US11734375B2 (en) Automatic navigation of interactive web documents
CN108334487B (en) Missing semantic information completion method and device, computer equipment and storage medium
Lewis et al. Generative question answering: Learning to answer the whole question
CN106202010B (en) Method and apparatus based on deep neural network building Law Text syntax tree
US11568000B2 (en) System and method for automatic task-oriented dialog system
CN112771531A (en) Global to local memory pointer network for task oriented dialog
CN109376222B (en) Question-answer matching degree calculation method, question-answer automatic matching method and device
CN109284397A (en) A kind of construction method of domain lexicon, device, equipment and storage medium
CN109582767A (en) Conversational system processing method, device, equipment and readable storage medium storing program for executing
CN110990555B (en) End-to-end retrieval type dialogue method and system and computer equipment
CN112100332A (en) Word embedding expression learning method and device and text recall method and device
CN111651573B (en) Intelligent customer service dialogue reply generation method and device and electronic equipment
CN113705313A (en) Text recognition method, device, equipment and medium
JP2021106016A (en) Dialog generation method, device, electronic equipment, and medium
CN113268609A (en) Dialog content recommendation method, device, equipment and medium based on knowledge graph
CN110678882A (en) Selecting answer spans from electronic documents using machine learning
CN112949758A (en) Response model training method, response method, device, equipment and storage medium
CN113408284A (en) Training method and device of text processing model, electronic equipment and storage medium
Khurram et al. Dense-captionnet: a sentence generation architecture for fine-grained description of image semantics
Jhunjhunwala et al. Multi-action dialog policy learning with interactive human teaching
CN113569017A (en) Model processing method and device, electronic equipment and storage medium
CN112380861A (en) Model training method and device and intention identification method and device
CN116958738A (en) Training method and device of picture recognition model, storage medium and electronic equipment

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant