CN110704703A

CN110704703A - Man-machine conversation method and device

Info

Publication number: CN110704703A
Application number: CN201910926286.9A
Authority: CN
Inventors: 吴文权; 郭振; 刘占一; 张喜媛; 吴华
Original assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Current assignee: Beijing Baidu Netcom Science and Technology Co Ltd
Priority date: 2019-09-27
Filing date: 2019-09-27
Publication date: 2020-01-17

Abstract

The application provides a man-machine conversation method and device. The method comprises: determining conversation target information; acquiring guidance information and a plurality of pieces of knowledge information for the conversation target information; initiating conversation interaction with a user based on the guidance information, and performing multiple rounds of conversation interaction with the user according to the conversation target information; and, in the Nth round of conversation, determining target reply information for the conversation information input by the user in that round according to the historical conversation information before the Nth round, the conversation target information and the plurality of pieces of knowledge information. In this way, the machine in the man-machine conversation system can initiate a conversation on its own initiative and actively guide the conversation based on the conversation target information, which improves the machine's active interaction capability, makes the interaction more intelligent, and improves the user experience.

Description

Man-machine conversation method and device

Technical Field

The application relates to the technical field of artificial intelligence, in particular to a man-machine conversation method and device.

Background

A man-machine conversation system is a system in which a person and a machine exchange information through natural language. Many products on the market are currently based on man-machine conversation systems: some take the form of personal assistants (Siri, Duer and the like), some take the form of chat robots, and others are built into terminals such as smart speakers, smart in-vehicle equipment and smart televisions.

The general interaction process of a man-machine conversation system is as follows: the user initiates a chat, and the conversation system then answers the content input by the user, thereby realizing man-machine interaction. That is, after receiving question information input by a user, the machine in the man-machine conversation system outputs corresponding reply information according to the question information. In the related art, therefore, most machines in man-machine conversation systems interact passively: if the user does not input interaction information, the system cannot carry out any intelligent interaction with the user, and the user's interaction experience is not ideal.

Disclosure of Invention

The present application is directed to solving, at least to some extent, one of the technical problems in the related art.

To this end, a first object of the present application is to propose a man-machine interaction method.

A second object of the present application is to provide a human-machine interaction device.

A third object of the present application is to provide an electronic device.

A fourth object of the present application is to propose a computer readable storage medium.

A fifth object of the present application is to propose a computer program product.

To achieve the above object, an embodiment of a first aspect of the present application provides a man-machine conversation method, including: determining conversation target information; acquiring necessary information corresponding to the conversation target information, wherein the necessary information comprises guidance information and a plurality of pieces of knowledge information; initiating conversation interaction with a user based on the guidance information, and performing multiple rounds of conversation interaction with the user according to the conversation target information; and, in the Nth round of the multi-round conversation interaction, determining target reply information for the conversation information input by the user in the Nth round according to the historical conversation information before the Nth round, the conversation target information and the plurality of pieces of knowledge information, wherein N is a positive integer.

The man-machine conversation method determines conversation target information, acquires guidance information and a plurality of pieces of knowledge information for the conversation target information, initiates conversation interaction with a user based on the guidance information, performs multiple rounds of conversation interaction with the user according to the conversation target information, and, in each Nth round of the multi-round conversation interaction, determines target reply information for the conversation information input by the user in that round according to the historical conversation information before the Nth round, the conversation target information and the plurality of pieces of knowledge information. In this way, the machine in the man-machine conversation system can initiate a conversation on its own initiative and actively guide the conversation based on the conversation target information, which improves the machine's active interaction capability, makes the interaction more intelligent, and improves the user experience.

To achieve the above object, an embodiment of a second aspect of the present application provides a man-machine conversation device, including: a determining module, configured to determine conversation target information; a first acquisition module, configured to acquire necessary information corresponding to the conversation target information, wherein the necessary information comprises guidance information and a plurality of pieces of knowledge information; a conversation interaction module, configured to initiate conversation interaction with a user based on the guidance information and to perform multiple rounds of conversation interaction with the user according to the conversation target information; and a reply determining module, configured to determine, in the Nth round of the multi-round conversation interaction, target reply information for the conversation information input by the user in the Nth round according to the historical conversation information before the Nth round, the conversation target information and the plurality of pieces of knowledge information, wherein N is a positive integer.

The man-machine conversation device provided by the embodiment of the application determines conversation target information, acquires guidance information and a plurality of pieces of knowledge information for the conversation target information, initiates conversation interaction with a user based on the guidance information, performs multiple rounds of conversation interaction with the user according to the conversation target information, and, in each Nth round of the multi-round conversation interaction, determines target reply information for the conversation information input by the user in that round according to the historical conversation information before the Nth round, the conversation target information and the plurality of pieces of knowledge information. In this way, the machine in the man-machine conversation system can initiate a conversation on its own initiative and actively guide the conversation based on the conversation target information, which improves the machine's active interaction capability, makes the interaction more intelligent, and improves the user experience.

To achieve the above object, an embodiment of a third aspect of the present application provides an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and the processor implements the man-machine interaction method as described above when executing the computer program.

In order to achieve the above object, a fourth aspect of the present application provides a computer-readable storage medium, where instructions of the storage medium, when executed by a processor, implement the man-machine interaction method as described above.

In order to achieve the above object, an embodiment of a fifth aspect of the present application provides a computer program product, wherein when instructions in the computer program product are executed by a processor, the man-machine conversation method as described above is implemented.

Additional aspects and advantages of the present application will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the present application.

Drawings

The foregoing and/or additional aspects and advantages of the present application will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:

FIG. 1 is a flow diagram of a human-machine dialog method according to one embodiment of the present application;

FIG. 2 is a first flowchart of a refinement of step 104 in the embodiment shown in FIG. 1;

FIG. 3 is a schematic flow chart of training a first dialogue model;

FIG. 4 is a diagram of an example model structure of a first dialogue model;

FIG. 5 is a second flowchart of a refinement of step 104 in the embodiment shown in FIG. 1;

FIG. 6 is a detailed flow chart of step 502;

FIG. 7 is a schematic flow chart of training a second dialogue model;

FIG. 8 is a diagram showing an example of a model structure of a second dialogue model;

FIG. 9 is a flowchart illustrating a human-machine dialog method according to an embodiment of the present application;

FIG. 10 is a schematic diagram of a human-machine dialog device according to an embodiment of the present application;

FIG. 11 is a schematic diagram of a human-machine interaction device according to another embodiment of the present application;

FIG. 12 is a schematic diagram of a human-machine dialog device according to another embodiment of the present application;

FIG. 13 is a schematic diagram of a human-machine dialog device according to another embodiment of the present application;

FIG. 14 is a schematic structural diagram of an electronic device according to one embodiment of the present application.

Detailed Description

Reference will now be made in detail to the embodiments of the present application, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to the same or similar elements or elements having the same or similar function throughout. The embodiments described below with reference to the drawings are exemplary and intended to be used for explaining the present application and should not be construed as limiting the present application.

Machines in the man-machine dialog systems of the related art generally adopt a passive dialog form: if no user input is received, the machine cannot lead the user into a dialog on its own initiative. Because such a system supports only passive interaction, the user's man-machine interaction experience suffers.

To address this, the method of the present application determines conversation target information, acquires necessary information of the conversation target information, initiates conversation interaction with a user based on the guidance information of the conversation target information, performs multiple rounds of conversation interaction with the user according to the conversation target information, and, in each Nth round of conversation, determines target reply information for the conversation information input by the user in that round according to the historical conversation information before the Nth round, the conversation target information and a plurality of pieces of knowledge information. In this way, the machine in the man-machine conversation system can initiate a conversation on its own initiative and actively guide the conversation based on the conversation target information, which improves the machine's active interaction capability, makes the interaction more intelligent, and improves the user experience.

The man-machine conversation method, the man-machine conversation device and the electronic equipment in the embodiment of the application are described below with reference to the attached drawings.

FIG. 1 is a flow diagram of a human-machine conversation method according to one embodiment of the present application. It should be noted that, the man-machine interaction method provided in this embodiment is applied to a man-machine interaction system, and an execution subject of the man-machine interaction method may be a controller of a machine in the man-machine interaction system or a server of the man-machine interaction system, which is not limited in this embodiment.

As shown in fig. 1, the man-machine conversation method may include:

Step 101, determining dialog target information.

In particular, upon determining that a machine-initiated active dialog in a human-machine dialog system is required, dialog target information required by the machine to initiate the active dialog may be determined.

For example, a user sends a request for initiating the active dialog by triggering a corresponding control, or dialog information input by the user is not received within a preset time after the start of the man-machine dialog system is detected, and at this time, the machine can be determined to be required to initiate the active dialog.
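The two trigger conditions described above can be sketched as a small predicate. This is only an illustration: the function name, its parameters and the default timeout of 10 seconds are assumptions and do not come from the application.

```python
from typing import Optional


def should_initiate_dialog(user_requested: bool,
                           seconds_since_start: float,
                           last_user_input_at: Optional[float],
                           idle_timeout: float = 10.0) -> bool:
    """Return True when the machine should start an active dialog.

    All names and the default timeout are hypothetical.
    """
    # Trigger 1: the user explicitly requested an active dialog via a control.
    if user_requested:
        return True
    # Trigger 2: no user input arrived within the preset time after start-up.
    no_input_yet = last_user_input_at is None
    return no_input_yet and seconds_since_start >= idle_timeout
```

A real system would presumably read the timeout from configuration rather than hard-coding it.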

In one embodiment of the application, in order to enable the determined conversation target information to meet the user requirement, the interest information of a user to be interacted can be determined, and the conversation target information of the machine can be determined by combining the interest information of the user.

The conversation target information comprises conversation topic information required by the machine to actively lead the conversation in the man-machine conversation system.

The conversation topic information may include one conversation topic and may also include a plurality of topics. It can be understood that, when the conversation topic information includes a plurality of topics, the conversation topic information may further include topic skipping indication information, and the topic skipping indication information is used for indicating a topic skipping order of the plurality of topics.

Topics of conversation may include, but are not limited to, economics, movies, novels, etc., and the implementation is not limited to topics of conversation.

For example, suppose analysis of the user's interest information shows that the user likes to watch movies starring "star A", and star A has recently starred in a new movie B. When it is determined that the machine is required to initiate an active dialog, the new movie B can then be used as the dialog target information.
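One possible in-memory representation of the dialog target information, together with the interest-based selection from the example above, is sketched below. The class, its fields and the `pick_target` helper are hypothetical; the application does not prescribe a data structure.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class DialogTarget:
    """Hypothetical container for dialog target information."""
    topics: List[str]
    # Topic jump order, as indices into `topics`; defaults to listing order.
    jump_order: List[int] = field(default_factory=list)

    def __post_init__(self) -> None:
        if not self.jump_order:
            self.jump_order = list(range(len(self.topics)))


def pick_target(favorite_stars: List[str],
                recent_movies: Dict[str, str]) -> Optional[DialogTarget]:
    """Pick the newest movie of the first favorite star that has one."""
    for star in favorite_stars:
        movie = recent_movies.get(star)
        if movie is not None:
            return DialogTarget(topics=[movie])
    return None
```

For instance, `pick_target(["star A"], {"star A": "movie B"})` yields a target whose only topic is `"movie B"`.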

Step 102, acquiring necessary information corresponding to the conversation target information, wherein the necessary information comprises guide information and a plurality of knowledge information.

Specifically, the necessary information corresponding to the dialog target information may be acquired according to a correspondence between the pre-stored dialog target information and the necessary information.

In this embodiment, in different application scenarios, the manner of acquiring the plurality of pieces of knowledge information of the dialog target information is different, which is illustrated as follows:

As a possible implementation manner, a plurality of pieces of knowledge information of the dialog target information may be acquired according to a correspondence relationship between the dialog target information and the knowledge information that is stored in advance.

As another possible implementation manner, a plurality of pieces of knowledge information of the dialog target information may be acquired in combination with a pre-constructed knowledge graph.

The knowledge graph comprises nodes and edges, and relationship information between entities and attribute information corresponding to each entity are stored in the knowledge graph.

Specifically, an entity in the conversation topic can be determined, and the attribute information of the entity is acquired from a pre-constructed knowledge graph, and the attribute information of the entity is used as a plurality of pieces of knowledge information of the conversation target information.

Continuing the example above, after the new movie B is used as the dialog target information, suppose the attribute information corresponding to the node of movie B in the knowledge graph includes release time information, director information, review information, behind-the-scenes information and the like. The knowledge information of the new movie B acquired from the knowledge graph may then include the release time, director, reviews, behind-the-scenes information and the like.
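The knowledge-graph lookup can be illustrated with a toy graph held in a dictionary. The graph layout, the attribute names and the placeholder values are all assumptions made for this sketch; a production knowledge graph would sit behind a dedicated store.

```python
from typing import Dict, List, Tuple

# Toy knowledge graph: entity -> {attribute name: value}.
# Values are placeholders, not real data.
KNOWLEDGE_GRAPH: Dict[str, Dict[str, str]] = {
    "movie B": {
        "release_time": "<release time>",
        "director": "<director>",
        "reviews": "<review summary>",
        "behind_the_scenes": "<behind-the-scenes note>",
    },
}


def knowledge_for(entity: str,
                  graph: Dict[str, Dict[str, str]]) -> List[Tuple[str, str, str]]:
    """Return (entity, attribute, value) triples for one entity."""
    attributes = graph.get(entity, {})
    return [(entity, name, value) for name, value in attributes.items()]
```

Each returned triple corresponds to one piece of knowledge information for the dialog target.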

Wherein the guidance information is preset according to the conversation topic in the conversation target information.

It is understood that different topic types of the conversation topics in the conversation target information usually correspond to different guidance information. For example, if the topic type is novel, the corresponding guidance information may be "What type of novel do you like to read?"; if the topic type is movie, the corresponding guidance information may be "What type of movie do you like to watch?", and so on.

In the present embodiment, there are various ways to obtain the guidance information of the conversation target information, for example, the guidance information of the conversation target information may be obtained according to a correspondence relationship between the conversation target information and the guidance information that is stored in advance, or the guidance information of the conversation target information may be generated based on a preset guidance information generation rule according to a topic type of a conversation topic in the conversation target information.
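The two ways of obtaining guidance information (a stored correspondence, or a generation rule keyed on topic type) can be combined as a lookup with a rule-based fallback. The table contents and the fallback template below are assumptions chosen for illustration.

```python
from typing import Dict

# Hypothetical pre-stored correspondence: topic type -> guidance utterance.
STORED_GUIDANCE: Dict[str, str] = {
    "novel": "What type of novel do you like to read?",
    "movie": "What type of movie do you like to watch?",
}


def guidance_for(topic_type: str,
                 stored: Dict[str, str] = STORED_GUIDANCE) -> str:
    """Look up guidance for a topic type, else generate it from a rule."""
    if topic_type in stored:
        return stored[topic_type]
    # Assumed generation rule: fill the topic type into a fixed template.
    return f"What type of {topic_type} do you like?"
```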

Step 103, initiating conversation interaction with the user based on the guidance information, and performing multiple rounds of conversation interaction with the user according to the conversation target information.

In this embodiment, after the guidance information of the conversation target information is obtained, the machine may be controlled to initiate conversation interaction with the user using the guidance information, and to perform multiple rounds of conversation interaction with the user according to the conversation target information.

Wherein the machine may output the guidance information in voice and/or text form to initiate an active conversational interaction to the user.

Step 104, in the Nth round of the multi-round conversation interaction, determining target reply information for the conversation information input by the user in the Nth round according to the historical conversation information before the Nth round, the conversation target information and the plurality of pieces of knowledge information, wherein N is a positive integer.

The historical conversation information before the Nth round consists of all man-machine conversation records related to the conversation target information before the Nth round.
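As a small illustration of what "historical conversation information before the Nth round" means, the sketch below flattens all turns of rounds 1 to N-1. Storing the record as a list of rounds, each a list of (speaker, utterance) pairs, is an assumption of this example.

```python
from typing import List, Tuple

Turn = Tuple[str, str]  # (speaker, utterance); hypothetical record format


def history_before(turns_by_round: List[List[Turn]], n: int) -> List[Turn]:
    """Return every turn from rounds 1..n-1, in order (n is 1-based)."""
    if n < 1:
        raise ValueError("n must be a positive integer")
    flat: List[Turn] = []
    for round_turns in turns_by_round[: n - 1]:
        flat.extend(round_turns)
    return flat
```

For the first round (N = 1) the history is empty, which matches the case where the machine has only just opened the conversation.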

In this embodiment, there are various ways to determine, in the Nth round of the multi-round conversation interaction, the target reply information for the conversation information input by the user in that round according to the historical conversation information before the Nth round, the conversation target information and the plurality of pieces of knowledge information. For example, the historical conversation information before the Nth round, the conversation target information and the plurality of pieces of knowledge information may be input into a sequence-to-sequence (Seq2Seq) model, that is, a model whose input and output sequences may differ in length, built on a recurrent neural network with a copy mechanism, and the target reply information is determined by the Seq2Seq model. Of course, in a specific implementation, the target reply information for the conversation information input by the user in the Nth round may also be determined in other manners.

Other specific implementations of step 104 will be described in detail in the following embodiments.

In an embodiment of the present application, in order to accurately determine the target reply information corresponding to the dialog information through the Seq2Seq model, the Seq2Seq model may be trained before use with the dialog target sample data together with its sample dialog information and knowledge information. Training of the Seq2Seq model ends when the reply information output by the model matches the sample reply information of the corresponding sample dialog information.

It can be understood that, in this embodiment, after completing the dialog target information by performing multiple dialog interactions with the user according to the dialog target information, the next dialog target information of the human-computer dialog can be determined, and the human-computer dialog is continued based on the next dialog target information until the human-computer dialog is ended.

As an example, in order to improve the fluency of the conversation interaction, when the next conversation target information is determined, another conversation topic having an association relationship with the conversation topic is determined according to the conversation topic information in the conversation target information, the determined corresponding conversation topic is used as the next conversation target information, and the machine is controlled to continue the human-computer interaction with the user according to the determined conversation topic.
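Choosing the next conversation topic by association can be sketched as a lookup over a relation table that skips topics already covered. The table contents and the first-match policy are assumptions; the application only requires that the next topic be associated with the current one.

```python
from typing import Dict, List, Optional, Set

# Hypothetical association table: topic -> associated topics.
RELATED: Dict[str, List[str]] = {
    "movie A": ["actor B"],
    "actor B": ["person C"],
    "person C": ["movie D"],
}


def next_topic(current: str,
               covered: Set[str],
               related: Dict[str, List[str]] = RELATED) -> Optional[str]:
    """Return the first associated topic that has not been discussed yet."""
    for candidate in related.get(current, []):
        if candidate not in covered:
            return candidate
    return None
```

When no uncovered associated topic exists the function returns `None`, which a caller could treat as a signal to end the conversation or to pick a fresh target.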

For example, suppose the conversation target information includes two conversation topics: conversation topic 1 is a movie A, and conversation topic 2 is B, the starring actor of movie A. After the machine has been controlled to carry out multiple rounds of conversation with the user and the conversation target information has been completed, the next conversation target information can be determined based on the current conversation target information. Suppose that, in the next conversation target information, conversation topic 3 is C, the partner of starring actor B, and conversation topic 4 is information about movies starring C.

The man-machine conversation method determines conversation target information, acquires guidance information and a plurality of pieces of knowledge information for the conversation target information, initiates conversation interaction with a user based on the guidance information, performs multiple rounds of conversation interaction with the user according to the conversation target information, and, in each Nth round of the multi-round conversation interaction, determines target reply information for the conversation information input by the user in that round according to the historical conversation information before the Nth round, the conversation target information and the plurality of pieces of knowledge information. In this way, the machine in the man-machine conversation system can initiate a conversation on its own initiative and actively guide the conversation based on the conversation target information, which improves the machine's active interaction capability, makes the interaction more intelligent, and improves the user experience.

As shown in FIG. 2, in an embodiment, when the necessary information further includes a dialog target type, the specific implementation process of step 104 may include:

Step 201, determining a plurality of candidate reply information of the dialog information according to a preset dialog corpus and the dialog target type.

The dialogue corpus stores a large amount of dialogue corpus information, the dialogue corpus information comprises dialogue target information and a plurality of man-machine dialogues corresponding to the dialogue target information, and each man-machine dialog consists of user dialogue information and reply information returned to a user by a man-machine dialogue system.

The dialogue target types may be classified according to the topic types in the dialogue target information, or according to the jump relationship between topic types in the dialogue target information. For example, when the dialogue target information includes two dialogue topics, dialogue target information that jumps from a movie topic to a person-related topic corresponds to one dialogue target type, while dialogue target information that jumps from one movie topic to another movie topic corresponds to another dialogue target type.
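A minimal sketch of step 201 under these definitions: the dialog target type is derived from the sequence of topic types, and candidate replies are retrieved from the corpus entries stored under that type. The corpus layout, the `"->"` label format and the sample replies are hypothetical.

```python
from typing import Dict, List, Sequence


def target_type(topic_types: Sequence[str]) -> str:
    """Label a dialog target by its topic types in jump order."""
    return "->".join(topic_types)


# Hypothetical corpus index: dialog target type -> replies seen for it.
CORPUS: Dict[str, List[str]] = {
    "movie->person": ["Do you know who stars in it?"],
    "movie->movie": ["Have you seen the sequel?"],
}


def candidate_replies(topic_types: Sequence[str],
                      corpus: Dict[str, List[str]] = CORPUS) -> List[str]:
    """Return candidate replies recorded for the matching target type."""
    return list(corpus.get(target_type(topic_types), []))
```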

The dialog corpus information in the dialog corpus can be predetermined in various ways, for example, the dialog corpus can be obtained in a way of automatically mining dialog data, and the dialog corpus can also be obtained according to dialog data between corresponding annotating personnel in the annotation platform.

In this embodiment, in order to improve the accuracy of the dialog corpus, the description takes as an example the case where the dialog corpus is obtained from dialog data between corresponding annotators on the annotation platform.

The specific process by which the annotation platform carries out dialog annotation is as follows: the annotation platform generates a conversation annotation task and sends it to two corresponding annotators. According to the provided conversation annotation task, one of the two annotators plays the role of the machine and actively leads the conversation toward the set conversation target, while the other annotator plays the role of a real user and responds to the machine role. The conversation annotation task comprises conversation target information and knowledge information of the conversation target information.

Step 202, splicing the Mth candidate reply information among the plurality of candidate reply information, the historical conversation information before the Nth round and the conversation information to obtain input text information, wherein M is a positive integer.
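The splicing in step 202 can be pictured as simple string concatenation with a separator. The separator token and the ordering (history, then the current user input, then the candidate reply) are assumptions of this sketch; the application does not fix either.

```python
from typing import List

SEP = " [SEP] "  # hypothetical separator token


def build_input_text(candidate: str, history: List[str], utterance: str) -> str:
    """Splice history, the current user utterance and one candidate reply."""
    return SEP.join([*history, utterance, candidate])
```

One input text is built per candidate, so M candidates produce M inputs that are then scored separately.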

Step 203, inputting the input text information, the dialogue target information and the plurality of knowledge information into a first dialogue model trained in advance, determining the probability value of the Mth candidate reply information as the target reply information through the first dialogue model, and selecting the candidate reply information with the maximum probability value from the plurality of candidate reply information as the target reply information and outputting the candidate reply information.

It can be understood that, when the conversation target information includes one conversation topic, the entity name information of the entity in the conversation topic may be input into the first dialog model. When the conversation target information includes two conversation topics, the entity name of each conversation topic and the relationship information between the entities in the two conversation topics may be input into the first dialog model.

In one embodiment, the first dialog model includes an attention layer and an output layer, and the specific implementation manner of determining the probability value of the mth candidate reply information as the target reply information through the first dialog model may be as follows: determining target knowledge information used for replying the dialog information from the knowledge information through an attention layer according to the input text information, the dialog target information and the knowledge information; and determining the probability value of the candidate reply information as the target reply information through the output layer according to the target knowledge information.

Specifically, after the first dialog model receives the input text information, the dialog target information and the plurality of knowledge information, a first encoding layer in the first dialog model may encode the plurality of knowledge information in combination with the dialog target information to obtain a first representation vector for each piece of knowledge information, and a second encoding layer may encode the input text information to obtain a second representation vector of the input text information. The attention layer in the first dialog model then combines the first representation vectors and the second representation vector to determine a third representation vector for replying to the dialog information, and determines the target knowledge information required for replying to the dialog information according to the third representation vector. Finally, the output layer determines the degree of correlation between the target knowledge information and the candidate reply information: the higher this correlation, the higher the probability value of the corresponding candidate reply information being the target reply information.
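The role of the attention layer can be illustrated with plain dot-product attention over the knowledge representation vectors. The application does not state which scoring function the attention layer uses, so dot-product scoring with a softmax is an assumption here, and the vectors stand in for encoder outputs.

```python
import math
from typing import List, Sequence, Tuple


def softmax(scores: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query: Sequence[float],
           knowledge_vectors: Sequence[Sequence[float]]
           ) -> Tuple[List[float], List[float]]:
    """Weight each knowledge vector by its similarity to the query vector."""
    # Assumed scoring: dot product between the query and each knowledge vector.
    scores = [sum(q * k for q, k in zip(query, vec)) for vec in knowledge_vectors]
    weights = softmax(scores)
    # The context vector is the attention-weighted sum of knowledge vectors.
    context = [sum(w * vec[i] for w, vec in zip(weights, knowledge_vectors))
               for i in range(len(query))]
    return weights, context
```

Here `query` plays the part of the second representation vector (the encoded input text), `knowledge_vectors` the first representation vectors, and `context` a stand-in for the third representation vector; the piece of knowledge with the largest weight would be the natural choice of target knowledge information.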

In this embodiment, after obtaining the probability value of each candidate reply information as the target reply information, the candidate reply information may be ranked based on the probability values, and the candidate reply information with the highest probability value is selected as the target reply information.
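Ranking and selection reduce to sorting the candidates by their probability values. In the sketch below `score_fn` is a stand-in for the trained first dialog model; it is a hypothetical callable, not an interface defined by the application.

```python
from typing import Callable, List, Sequence, Tuple


def select_reply(candidates: Sequence[str],
                 score_fn: Callable[[str], float]
                 ) -> Tuple[str, List[Tuple[float, str]]]:
    """Rank candidates by score and return the best one plus the ranking."""
    if not candidates:
        raise ValueError("at least one candidate reply is required")
    scored = [(score_fn(c), c) for c in candidates]
    # Sort on the score only, highest first; ties keep their original order.
    ranked = sorted(scored, key=lambda pair: pair[0], reverse=True)
    return ranked[0][1], ranked
```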

Correspondingly, after the target reply information for the dialog information input by the user in the Nth round is determined, the man-machine dialog system can be controlled to output the target reply information.

The first dialogue model is trained in advance; the specific training process is described in the following embodiments.

The training process of the first dialogue model is described below with reference to fig. 3 and 4.

As shown in fig. 3, the training process may include:

Step 301, obtaining sample dialogue information of the dialogue target sample data according to a preset dialogue corpus, and obtaining sample reply information and dialogue history sample data corresponding to the sample dialogue information.

Step 302, splicing the sample dialogue information and the dialogue history sample data to obtain sample input information.

Step 303, acquiring knowledge information of the dialogue target sample data.

In this embodiment, the knowledge information of the dialog target sample data may be acquired according to a pre-constructed knowledge graph.

Step 304, training the first dialogue model according to the dialogue target sample data, the sample reply information and the sample input information until the reply information output by the first dialogue model matches the sample reply information of the sample dialogue information, at which point the training of the first dialogue model is complete.

The model structure of the first dialogue model is shown in fig. 4, where knowledge 1 to knowledge 3 represent knowledge information of the dialogue target Goal G. The first coding layer in this example is a knowledge encoder, which encodes the dialogue target Goal G together with the corresponding knowledge information to obtain a representation vector for each piece of knowledge; k1 to k3 denote these representation vectors. The sample input information X and the sample reply information Y are encoded separately by an encoder in the first encoding layer, and the network parameters of the multilayer neural network (MLP) in the first dialogue model are trained in combination with the obtained representation vectors until the reply information output by the first dialogue model matches the sample reply information of the sample dialogue information, at which point the training of the first dialogue model is complete. The network parameters are fixed when the training of the first dialogue model is finished.

As shown in fig. 5, in another embodiment of the present application, the implementation process of step 104 may include:

Step 501, splicing the dialogue target information, the historical dialogue information and the dialogue information to obtain input text information.

Step 502, inputting the input text information and the plurality of knowledge information into a pre-trained second dialogue model to obtain the target reply information of the dialogue information.
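The splicing in step 501 can be illustrated with a short sketch. The separator token, the function name and the ordering of the parts are assumptions made for illustration; the application does not specify how the pieces are delimited.

```python
from typing import List

SEP = " [SEP] "  # hypothetical separator token, not taken from the application


def build_input_text(goal: str, history: List[str], utterance: str) -> str:
    """Splice the dialogue target, the historical dialogue turns and the
    current user utterance into a single input text string."""
    parts = [goal, *history, utterance]
    return SEP.join(p.strip() for p in parts if p.strip())


example = build_input_text(
    "talk about a science fiction movie",
    ["Hello!", "Hi, what kinds of movies do you like?"],
    "I like science fiction movies",
)
```

The resulting string, together with the plurality of knowledge information, would then be passed to the pre-trained second dialogue model as in step 502.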

In an embodiment of the present application, the second dialogue model may include a first coding layer, a second coding layer, an attention layer, and a decoding layer. As shown in fig. 6, a specific implementation process of step 502 may be:

Step 601, inputting the input text information to the first coding layer to obtain a first representation vector of the input text information.

Step 602, inputting the plurality of knowledge information into the second coding layer to obtain a second representation vector of each knowledge information.

Step 603, inputting the first representation vector and the second representation vectors into the attention layer, determining, through the attention layer, a third representation vector for replying to the dialogue information from the second representation vectors, and determining a position weight of the third representation vector.

Step 604, inputting the third representation vector and the position weight into the decoding layer to generate the target reply information of the dialogue information.
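The selection performed in step 603 can be sketched as follows. This is one possible reading, stated as an assumption: the attention layer is modelled as dot-product attention, the third representation vector is taken to be the knowledge vector with the largest attention weight, and the "position weight" is taken to be that weight. The function names are hypothetical, and the final generation step (step 604) is omitted.

```python
import math
from typing import List, Sequence, Tuple

Vector = Sequence[float]


def attention_weights(text_vec: Vector,
                      knowledge_vecs: Sequence[Vector]) -> List[float]:
    """Softmax over dot-product scores between the input text vector
    and each knowledge vector."""
    scores = [sum(a * b for a, b in zip(text_vec, k)) for k in knowledge_vecs]
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def select_knowledge(text_vec: Vector,
                     knowledge_vecs: Sequence[Vector]) -> Tuple[int, Vector, float]:
    """Return the index, the vector and the attention weight of the
    knowledge item that best matches the input text vector."""
    weights = attention_weights(text_vec, knowledge_vecs)
    index = max(range(len(weights)), key=weights.__getitem__)
    return index, knowledge_vecs[index], weights[index]


# Made-up vectors for illustration only.
index, third_vec, weight = select_knowledge([0.0, 3.0], [[1.0, 0.0], [0.0, 1.0]])
```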

A specific process of training the second dialogue model is described below with reference to fig. 7 and 8.

Step 701, obtaining sample dialogue information of the dialogue target sample data according to a preset dialogue corpus, and obtaining sample reply information and dialogue history sample data corresponding to the sample dialogue information.

Step 702, splicing the dialogue target sample data, the sample dialogue information and the dialogue history sample data to obtain sample input information.

Step 703, acquiring knowledge information of the dialogue target sample data.

Step 704, training the second dialogue model according to the knowledge information of the dialogue target sample data, the sample input information and the sample reply information corresponding to the sample dialogue information until the reply information output by the second dialogue model matches the sample reply information of the sample dialogue information, at which point the training of the second dialogue model is complete.

As can be seen from fig. 8, in this example, when the second dialogue model is trained, the posterior knowledge information in the sample reply information Y is used to guide the model in prior knowledge selection; that is, the prior knowledge distribution p(k_i | x) is fitted to the posterior knowledge distribution p(k_i | x, y), and the KL divergence between the two distributions is used as a part of the loss during training. The KL divergence is calculated as follows:
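A minimal sketch of this term, assuming the textbook definition of the KL divergence from the posterior knowledge distribution to the prior knowledge distribution and a sum over the index i of k_i:

```latex
\mathcal{L}_{KL} = \sum_{i=1}^{N} p(k_i \mid x, y)\,
  \log \frac{p(k_i \mid x, y)}{p(k_i \mid x)}
```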

where N represents the total number of samples.

In order to avoid serious information loss when calculating the posterior knowledge distribution, the idea of auto-encoding is borrowed: in the training stage, the posterior knowledge distribution calculated from the sample reply information Y should be able to decode the sample reply information Y, that is, each word of Y is predicted from the posterior distribution. The BOW loss of the prediction result is also taken as a part of the overall loss, and is calculated as follows:
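A minimal sketch of this term, assuming the usual bag-of-words formulation in which each word of Y is predicted independently from the selected knowledge vector, with m (an assumed symbol) denoting the number of words in Y:

```latex
\mathcal{L}_{BOW} = -\frac{1}{m} \sum_{t=1}^{m} \log p(y_t \mid k_c)
```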

where y_t denotes the t-th word in the predicted reply information, and k_c denotes the knowledge information vector used for the prediction. The dialogue target, as a part of the input information, participates in both the selection of the knowledge information and the decoding of the reply.

Here, in fig. 8, Goal G denotes the sample dialogue target data, X denotes the sample dialogue information, and Y denotes the sample reply information.

For example, suppose the conversation history and the user's current input are "... I like science fiction movies", and the sample reply information is "What do you think of the movie 'A Certain Earth'? Its box office reached 4.6 billion". The conversation topic of the sample dialogue target information is "A Certain Earth", and the knowledge information about it includes "the lead actor of 'A Certain Earth' is Wu" and "the box office of 'A Certain Earth' is 4.6 billion". The encoder encodes the dialogue history and the current input into a vector x = [0.023, 0.011, 0.045, ...], and the knowledge encoder encodes each piece of knowledge into a vector, giving k1 and k2. The knowledge encoder also encodes the standard reply Y into a vector y. The prior knowledge distribution is computed as the weights [w1, w2] over the target vectors k1 and k2 given x, and the posterior knowledge distribution is computed as the weights [w1', w2'] over the same vectors given y. The goal of model training is to narrow the difference between [w1, w2] and [w1', w2']. That is, in this example the KL divergence may be used to measure the distance between the two weight vectors, and this distance is used as a loss function.
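The example above can be worked through numerically with a small sketch. The vectors below are made up for illustration, dot-product attention followed by a softmax is an assumed way of turning a given vector into weights over k1 and k2, and the function names are hypothetical.

```python
import math
from typing import List, Sequence

Vector = Sequence[float]


def softmax(scores: Sequence[float]) -> List[float]:
    peak = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def knowledge_distribution(given: Vector,
                           knowledge: Sequence[Vector]) -> List[float]:
    """Weights over the knowledge vectors, computed from a given vector."""
    scores = [sum(a * b for a, b in zip(given, k)) for k in knowledge]
    return softmax(scores)


def kl_divergence(p: Sequence[float], q: Sequence[float],
                  eps: float = 1e-12) -> float:
    """KL(p || q); eps guards against log(0)."""
    return sum(pi * math.log((pi + eps) / (qi + eps)) for pi, qi in zip(p, q))


k1, k2 = [1.0, 0.0], [0.0, 1.0]  # made-up knowledge vectors
x = [0.2, 0.1]                   # made-up encoding of history + current input
y = [0.1, 2.0]                   # made-up encoding of the standard reply Y

prior = knowledge_distribution(x, [k1, k2])      # [w1, w2]
posterior = knowledge_distribution(y, [k1, k2])  # [w1', w2']
kl_loss = kl_divergence(posterior, prior)        # term to be minimised in training
```

Here the posterior puts most of its weight on k2 because y was chosen to align with it, while the prior is nearly uniform; training would push the prior weights toward the posterior ones, which drives the KL term toward zero.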

In order to make the present application clearer to those skilled in the art, the man-machine conversation method of the embodiment of the present application is further described below with reference to fig. 9.

FIG. 9 is a flowchart illustrating a man-machine interaction method according to an embodiment of the present application.

Step 901, after the system is started, a dialogue model which is trained in advance by an offline process is loaded.

It should be noted that the dialogue model in this embodiment may be any one of the first dialogue model, the second dialogue model, and the Seq2Seq model in this embodiment.

Step 902, judging whether the dialogue target information of the machine needs to be reset, and if so, executing step 903.

Step 903, determining the dialog target information of the machine and the corresponding knowledge information.

In this embodiment, the dialog target information of the machine may be determined according to interest information of the user to be dialogged.

Step 904, predicting the reply information of the current dialogue by using the dialogue model according to the dialogue history, the dialogue target information and the knowledge information.

The conversation history is the conversation record of the machine and the human after the machine conversation target information is set.

Step 905, controlling the machine to output reply information to wait for the input of the user.

Step 906, determining whether the user responds to the reply information, and if so, returning to step 902.

Specifically, after the machine is controlled to output the reply information, if the user's input in response to the reply information is acquired, the process returns to step 902 so that the next reply of the machine is predicted according to the new dialogue history.
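The control flow of steps 901 to 906 can be sketched as a simple loop. This is an illustrative outline under stated assumptions: the dialogue model, the goal selection, the reset check and the input/output functions are passed in as hypothetical callables, and the loop ends when no user input is received.

```python
from typing import Callable, List, Optional, Tuple

Goal = Tuple[str, List[str]]  # (dialogue target information, knowledge information)


def run_dialogue(
    model: Callable[[List[str], str, List[str]], str],
    choose_goal: Callable[[], Goal],
    needs_reset: Callable[[str, List[str]], bool],
    read_user: Callable[[], Optional[str]],
    write: Callable[[str], None],
) -> List[str]:
    """Run the loop of steps 902-906 with an already loaded model (step 901)."""
    goal: Optional[str] = None
    knowledge: List[str] = []
    history: List[str] = []
    while True:
        # Steps 902-903: (re)set the dialogue target and its knowledge if needed.
        if goal is None or needs_reset(goal, history):
            goal, knowledge = choose_goal()
        # Step 904: predict the reply from history, target and knowledge.
        reply = model(history, goal, knowledge)
        # Step 905: output the reply, then wait for the user.
        write(reply)
        history.append(reply)
        # Step 906: continue only if the user responds.
        user_text = read_user()
        if user_text is None:
            return history
        history.append(user_text)
```

Because the machine produces a reply before reading any user input, the first turn of this loop corresponds to the machine actively initiating the conversation.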

Fig. 10 is a schematic structural diagram of a man-machine interaction device according to an embodiment of the present application.

As shown in fig. 10, the human-machine conversation apparatus includes a determination module 110, a first obtaining module 120, a conversation interaction module 130, and a reply determination module 140, wherein:

A determining module 110, configured to determine the dialogue target information.

The first obtaining module 120 is configured to obtain necessary information corresponding to the dialogue target information, where the necessary information includes guidance information and a plurality of knowledge information.

The dialogue interaction module 130 is configured to initiate dialogue interaction with the user based on the guidance information and perform multiple rounds of dialogue interaction with the user according to the dialogue target information.

The reply determining module 140 is configured to determine, in the Nth round of dialogue of the multi-round dialogue interaction, target reply information for the dialogue information input by the user in the Nth round of dialogue according to the historical dialogue information before the Nth round of dialogue, the dialogue target information, and the plurality of knowledge information, where N is a positive integer.

In an embodiment of the present application, on the basis of the embodiment of the apparatus shown in fig. 10, as shown in fig. 11, the necessary information further includes a dialog target type, and the reply determination module 140 may include:

The first determining unit 141 is configured to determine a plurality of candidate reply messages of the dialog information according to a preset dialog corpus and the dialog target type of the dialog target information.

The first splicing unit 142 is configured to splice an Mth candidate reply message of the plurality of candidate reply messages, the historical dialog information before the Nth round of dialog, and the dialog information to obtain the input text information, where M is a positive integer.

The second determining unit 143 is configured to input the input text information, the dialog target information, and the plurality of knowledge information into a first dialog model trained in advance, determine a probability value of the Mth candidate reply message as the target reply information through the first dialog model, and select and output the candidate reply message with the maximum probability value from the plurality of candidate reply messages as the target reply information.

In one embodiment of the present application, the first dialogue model includes an attention layer and an output layer.

The second determining unit 143 is specifically configured to: determining target knowledge information used for replying the dialog information from the knowledge information through an attention layer according to the input text information, the dialog target information and the knowledge information; and determining the probability value of the Mth candidate reply information as the target reply information through the output layer according to the target knowledge information, and selecting the candidate reply information with the maximum probability value from the plurality of candidate reply information as the target reply information and outputting the target reply information.

In an embodiment of the present application, based on the embodiment of the apparatus shown in fig. 10, as shown in fig. 12, the reply determination module 140 may include:

A second splicing unit 144, configured to splice the dialog target information, the historical dialog information, and the dialog information to obtain input text information.

The third determining unit 145 is configured to input the input text information and the plurality of knowledge information into a second pre-trained dialogue model, so as to obtain target reply information of the dialogue information.

In an embodiment of the present application, the second dialogue model includes a first coding layer, a second coding layer, an attention layer, and a decoding layer, and the third determining unit 145 is specifically configured to: input the input text information into the first coding layer to obtain a first representation vector of the input text information; input the plurality of knowledge information into the second coding layer to obtain a second representation vector of each knowledge information; input the first representation vector and the second representation vectors into the attention layer, determine, through the attention layer, a third representation vector for replying to the dialogue information from the second representation vectors, and determine a position weight of the third representation vector; and input the third representation vector and the position weight into the decoding layer to generate the target reply information of the dialogue information.

In an embodiment of the present application, on the basis of the embodiment of the apparatus shown in fig. 12, as shown in fig. 13, the apparatus may further include:

The second obtaining module 150 is configured to obtain sample dialog information of the dialog target sample data according to a preset dialog corpus, and obtain sample reply information and dialog history sample data corresponding to the sample dialog information.

The splicing module 160 is configured to splice the dialog target sample data, the sample dialog information, and the dialog history sample data to obtain sample input information.

The third obtaining module 170 is configured to obtain knowledge information of the dialog target sample data.

The training module 180 is configured to train the second dialog model according to the knowledge information of the dialog target sample data, the sample input information, and the sample reply information corresponding to the sample dialog information until the reply information output by the second dialog model matches the sample reply information of the sample dialog information, at which point the training of the second dialog model is complete.

It should be noted that the explanation of the embodiment of the man-machine conversation method is also applicable to the man-machine conversation device of the embodiment, and the implementation principle is similar, and is not repeated here.

The man-machine conversation device determines conversation target information, acquires a plurality of knowledge information and guidance information of the conversation target information, initiates conversation interaction with a user based on the guidance information, and performs multi-turn conversation interaction with the user according to the conversation target information. In the Nth turn of the multi-turn conversation interaction, the device determines target reply information for the conversation information input by the user in the Nth turn according to the historical conversation information before the Nth turn, the conversation target information and the plurality of knowledge information. Therefore, in the process of human-computer interaction, the machine in the human-computer interaction system can actively initiate a conversation and actively guide the human-computer interaction process based on the conversation target information, so that the active interaction capability of the machine is improved, the intelligence of the human-computer interaction is further improved, and the user experience of the human-computer interaction is improved.

FIG. 14 is a schematic structural diagram of an electronic device according to one embodiment of the present application. The electronic device includes:

memory 1001, processor 1002, and computer programs stored on memory 1001 and executable on processor 1002.

The processor 1002, when executing the program, implements the man-machine interaction method provided in the above-described embodiments.

Further, the electronic device further includes:

a communication interface 1003 for communicating between the memory 1001 and the processor 1002.

A memory 1001 for storing computer programs that may be run on the processor 1002.

Memory 1001 may include high-speed RAM memory and may also include non-volatile memory (e.g., at least one disk memory).

The processor 1002 is configured to implement the human-computer interaction method according to the above embodiments when executing a program.

If the memory 1001, the processor 1002, and the communication interface 1003 are implemented independently, the communication interface 1003, the memory 1001, and the processor 1002 may be connected to each other through a bus and communicate with each other. The bus may be an Industry Standard Architecture (ISA) bus, a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like. The bus may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one thick line is shown in FIG. 14, but this does not mean that there is only one bus or one type of bus.

Optionally, in a specific implementation, if the memory 1001, the processor 1002, and the communication interface 1003 are integrated on one chip, the memory 1001, the processor 1002, and the communication interface 1003 may complete communication with each other through an internal interface.

The processor 1002 may be a Central Processing Unit (CPU), an Application Specific Integrated Circuit (ASIC), or one or more integrated circuits configured to implement embodiments of the present application.

The present embodiment also provides a computer-readable storage medium having stored thereon a computer program, characterized in that the program, when executed by a processor, implements the human-machine conversation method as above.

In the description herein, reference to the description of the term "one embodiment," "some embodiments," "an example," "a specific example," or "some examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the application. In this specification, the schematic representations of the terms used above are not necessarily intended to refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples. Furthermore, various embodiments or examples and features of different embodiments or examples described in this specification can be combined by one skilled in the art without contradiction.

Furthermore, the terms "first" and "second" are used for descriptive purposes only and are not to be construed as indicating or implying relative importance or implicitly indicating the number of technical features indicated. Thus, a feature defined as "first" or "second" may explicitly or implicitly include at least one such feature. In the description of the present application, "plurality" means at least two, e.g., two, three, etc., unless specifically limited otherwise.

Any process or method descriptions in flow charts or otherwise described herein may be understood as representing modules, segments, or portions of code which include one or more executable instructions for implementing steps of a custom logic function or process, and alternate implementations are included within the scope of the preferred embodiment of the present application in which functions may be executed out of order from that shown or discussed, including substantially concurrently or in reverse order, depending on the functionality involved, as would be understood by those reasonably skilled in the art of the present application.

The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device. More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.

It should be understood that portions of the present application may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. If implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.

It will be understood by those skilled in the art that all or part of the steps carried by the method for implementing the above embodiments may be implemented by hardware that is related to instructions of a program, and the program may be stored in a computer-readable storage medium, and when executed, the program includes one or a combination of the steps of the method embodiments.

In addition, functional units in the embodiments of the present application may be integrated into one processing module, or each unit may exist alone physically, or two or more units are integrated into one module. The integrated module can be realized in a hardware mode, and can also be realized in a software functional module mode. The integrated module, if implemented in the form of a software functional module and sold or used as a separate product, may also be stored in a computer readable storage medium.

The storage medium mentioned above may be a read-only memory, a magnetic or optical disk, etc. Although embodiments of the present application have been shown and described above, it is understood that the above embodiments are exemplary and should not be construed as limiting the present application, and that variations, modifications, substitutions and alterations may be made to the above embodiments by those of ordinary skill in the art within the scope of the present application.

Claims

1. A method of human-computer interaction, the method comprising:

determining conversation target information;

acquiring necessary information corresponding to the conversation target information, wherein the necessary information comprises guide information and a plurality of knowledge information;

initiating conversation interaction to a user based on the guide information, and performing multiple rounds of conversation interaction with the user according to the conversation target information;

and in the Nth round of conversation of the multi-round conversation interaction, determining target reply information for the conversation information input by the user in the Nth round of conversation according to the historical conversation information before the Nth round of conversation, the conversation target information and the plurality of knowledge information, wherein N is a positive integer.

2. The method of claim 1, wherein the necessary information further includes a dialog target type, and the determining target reply information for the dialog information input by the user in the Nth round of dialog according to the historical dialog information before the Nth round of dialog, the dialog target information and the plurality of knowledge information comprises:

determining a plurality of candidate reply messages of the dialog information according to a preset dialog corpus and the dialog target type;

splicing the Mth candidate reply message in the plurality of candidate reply messages, the historical dialogue information before the Nth round of dialogue and the dialogue information to obtain input text information, wherein M is a positive integer;

inputting the input text information, the dialogue target information and the knowledge information into a pre-trained first dialogue model, determining the probability value of the Mth candidate reply information as the target reply information through the first dialogue model, and selecting the candidate reply information with the maximum probability value from the candidate reply information as the target reply information and outputting the candidate reply information.

3. The method of claim 2, wherein the first dialogue model comprises an attention layer and an output layer;

the determining, by the first dialogue model, a probability value of the Mth candidate reply message as the target reply message includes:

according to the input text information, the conversation target information and the plurality of knowledge information, determining target knowledge information used for replying the conversation information from the plurality of knowledge information through the attention layer;

and determining the probability value of the Mth candidate reply information as the target reply information through the output layer according to the target knowledge information, and selecting the candidate reply information with the maximum probability value from the plurality of candidate reply information as the target reply information and outputting the candidate reply information.

4. The method of claim 1, wherein the determining target reply information for the dialog information input by the user in the Nth round of dialog according to the historical dialog information before the Nth round of dialog, the dialog target information and the plurality of knowledge information comprises:

splicing the conversation target information, the historical conversation information and the conversation information to obtain input text information;

and inputting the input text information and the plurality of knowledge information into a pre-trained second dialogue model to obtain target reply information of the dialogue information.

5. The method of claim 4, wherein the second dialogue model comprises a first coding layer, a second coding layer, an attention layer and a decoding layer, and the inputting the input text information and the plurality of knowledge information into a second dialogue model trained in advance to obtain the target reply information of the dialogue information comprises:

inputting the input text information into the first coding layer to obtain a first expression vector of the input text information;

inputting the knowledge information into the second coding layer to obtain a second expression vector of each knowledge information;

inputting the first representation vector and the second representation vector into the attention layer, determining a third representation vector for replying the dialog information from the second representation vector through the attention layer, and determining a position weight of the third representation vector;

inputting the third representation vector and the position weight to the decoding layer to generate target reply information of the dialog information.

6. The method according to claim 4 or 5, further comprising, before the inputting the input text information and the plurality of knowledge information into a pre-trained second dialogue model to obtain target reply information of the dialogue information:

obtaining sample dialogue information of dialogue target sample data according to a preset dialogue corpus, and obtaining sample reply information and dialogue history sample data corresponding to the sample dialogue information;

splicing the conversation target sample data, the sample conversation information and the conversation history sample data to obtain sample input information;

acquiring knowledge information of the dialogue target sample data;

and training the second dialogue model according to the knowledge information of the dialogue target sample data, the sample input information and the sample reply information corresponding to the sample dialogue information until the reply information output by the second dialogue model matches the sample reply information of the sample dialogue information, at which point the training of the second dialogue model is complete.

7. A human-machine interaction device, characterized in that it comprises:

the determining module is used for determining the conversation target information;

the first acquisition module is used for acquiring necessary information corresponding to the conversation target information, wherein the necessary information comprises guide information and a plurality of knowledge information;

the dialogue interaction module is used for initiating dialogue interaction to a user based on the guide information and carrying out multi-turn dialogue interaction with the user according to the dialogue target information;

and the reply determining module is used for determining, in the Nth round of dialogue of the multi-round dialogue interaction, target reply information for the dialogue information input by the user in the Nth round of dialogue according to the historical dialogue information before the Nth round of dialogue, the dialogue target information and the plurality of knowledge information, wherein N is a positive integer.

8. The apparatus of claim 7, wherein the necessary information further comprises a dialog target type, and wherein the reply determination module comprises:

the first determining unit is used for determining a plurality of pieces of candidate reply information for the dialogue information according to a preset dialogue corpus and the dialogue target type;

the first splicing unit is used for splicing the Mth candidate reply information among the plurality of pieces of candidate reply information, the historical dialogue information before the Nth round of dialogue and the dialogue information to obtain input text information, wherein M is a positive integer;

the input unit is used for inputting the input text information, the dialogue target information and the plurality of knowledge information into a first dialogue model trained in advance;

and the second determining unit is used for inputting the input text information, the dialogue target information and the plurality of knowledge information into the first dialogue model trained in advance, determining, through the first dialogue model, the probability value that the Mth candidate reply information is the target reply information, and selecting, from the plurality of candidate reply information, the candidate reply information with the maximum probability value as the target reply information and outputting it.
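The candidate-ranking flow of the units above can be sketched as follows. This is a hedged illustration, not the claimed model: `SEP`, `splice`, `select_reply` and `toy_model` are hypothetical names, the splicing order is an assumption, and `toy_model` is a simple word-overlap score standing in for the pre-trained first dialogue model, whose internals are not reproduced here.

```python
from typing import Callable, List

SEP = " [SEP] "  # hypothetical separator; the claims do not name one

# Hypothetical scorer signature: (input_text, target, knowledge) -> score
Scorer = Callable[[str, str, List[str]], float]


def splice(candidate: str, history: List[str], utterance: str) -> str:
    """Join the history, the user's utterance and the Mth candidate into one input text."""
    return SEP.join([*history, utterance, candidate])


def select_reply(candidates: List[str], history: List[str], utterance: str,
                 target: str, knowledge: List[str], model: Scorer) -> str:
    """Score every candidate and return the one with the highest score."""
    scored = [(model(splice(c, history, utterance), target, knowledge), c)
              for c in candidates]
    return max(scored, key=lambda pair: pair[0])[1]


def toy_model(input_text: str, target: str, knowledge: List[str]) -> float:
    # Stand-in scorer: fraction of target/knowledge words found in the input text.
    words = set(input_text.lower().split())
    reference = set(" ".join([target, *knowledge]).lower().split())
    return len(words & reference) / (len(reference) or 1)


best = select_reply(
    candidates=["I like tea", "The film Inception is directed by Nolan"],
    history=["Hello"],
    utterance="Any movie tips?",
    target="Inception",
    knowledge=["Inception directed by Nolan"],
    model=toy_model,
)
```

Keeping the scorer behind a plain callable means a trained model could be swapped in for `toy_model` without changing the selection logic.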

9. The apparatus of claim 8, wherein the first dialogue model comprises an attention layer and an output layer;

the second determining unit is specifically configured to:

10. The apparatus of claim 7, wherein the reply determination module comprises:

the second splicing unit is used for splicing the dialogue target information, the historical dialogue information and the dialogue information to obtain input text information;

and the third determining unit is used for inputting the input text information and the plurality of knowledge information into a pre-trained second dialogue model to obtain target reply information of the dialogue information.

11. The apparatus according to claim 10, wherein the second dialogue model comprises a first coding layer, a second coding layer, an attention layer and a decoding layer, and the third determining unit is specifically configured to:

12. The apparatus of claim 10 or 11, further comprising:

the second acquisition module is used for acquiring sample dialogue information of dialogue target sample data according to a preset dialogue corpus, and acquiring sample reply information and dialogue history sample data corresponding to the sample dialogue information;

the splicing module is used for splicing the dialogue target sample data, the sample dialogue information and the dialogue history sample data to obtain sample input information;

the third acquisition module is used for acquiring knowledge information of the dialogue target sample data;

and the training module is used for training the second dialogue model according to the knowledge information of the dialogue target sample data, the sample input information and the sample reply information corresponding to the sample dialogue information, the training of the second dialogue model being finished when the reply information output by the second dialogue model matches the sample reply information of the sample dialogue information.

13. An electronic device, comprising:

a memory, a processor and a computer program stored on the memory and executable on the processor, characterized in that the processor, when executing the program, implements the man-machine interaction method according to any one of claims 1 to 6.

14. A computer-readable storage medium on which a computer program is stored, wherein the program, when executed by a processor, implements the man-machine interaction method according to any one of claims 1 to 6.