CN116521832A - Dialogue interaction method, device and system, electronic equipment and storage medium - Google Patents

Dialogue interaction method, device and system, electronic equipment and storage medium

Info

Publication number
CN116521832A
Authority
CN
China
Prior art keywords
dialogue
information
dialog
layer
knowledge
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202310249564.8A
Other languages
Chinese (zh)
Inventor
焦振宇
孙叔琦
张红阳
吴华
王海峰
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Baidu Netcom Science and Technology Co Ltd
Original Assignee
Beijing Baidu Netcom Science and Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Baidu Netcom Science and Technology Co Ltd filed Critical Beijing Baidu Netcom Science and Technology Co Ltd
Priority to CN202310249564.8A
Publication of CN116521832A
Legal status: Pending

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/332Query formulation
    • G06F16/3329Natural language query formulation or dialogue systems
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a dialogue interaction method, device and system, electronic equipment and a storage medium, and relates to the technical field of computers, in particular to the technical field of big data. The method comprises the following steps: acquiring dialogue context information, wherein the dialogue context information is a historical dialogue record of a user and a dialogue system; inputting the dialogue context information into a pre-trained multi-stage dialogue model to determine a target dialogue decision from a plurality of candidate dialogue decisions according to the dialogue context information; generating query information for the target dialogue decision through the multi-stage dialogue model, and acquiring dialogue knowledge based on the query information, wherein the dialogue knowledge is the knowledge used to generate dialogue reply information for the dialogue context information; and generating the dialogue reply information for the dialogue context information according to the dialogue knowledge so as to complete the dialogue interaction. The method and the device realize cross-type dialogue interaction through the multi-stage dialogue model, improve the dialogue experience and reduce the cost of optimization and configuration.

Description

Dialogue interaction method, device and system, electronic equipment and storage medium
Technical Field
The present disclosure relates to the technical field of computers, in particular to the technical field of big data, and specifically to a dialogue interaction method, a dialogue interaction device, a dialogue interaction system, an electronic device and a storage medium.
Background
Human-computer intelligent dialogues can generally be classified into task-type dialogues, knowledge-type dialogues and open-domain dialogues according to the application scenario. Task-type dialogues mainly support the completion of certain tasks, knowledge-type dialogues mainly support user queries about certain knowledge, and open-domain dialogues mainly provide casual chat and meet the user's need for emotional companionship. In practical applications, different types of dialogue are often required at the same time, and the boundaries between the different types are also somewhat blurred.
Related technologies generally realize cross-type dialogue interaction by jointly deploying multiple dialogue skills under a dialogue central control, and suffer from problems such as a poor dialogue effect, a high cost of optimization and configuration, and noticeably mechanical interaction.
Disclosure of Invention
The present disclosure provides a dialogue interaction method, a dialogue interaction device, a dialogue interaction system, an electronic device and a storage medium, aiming to solve the problems in the related art of a poor dialogue effect, a high cost of optimization and configuration, and noticeably mechanical interaction when cross-type dialogue interaction is realized through joint deployment under a dialogue central control.
According to a first aspect of the present disclosure, there is provided a dialogue interaction method, including: acquiring dialogue context information, wherein the dialogue context information is a historical dialogue record of a user and a dialogue system; inputting the dialogue context information into a pre-trained multi-stage dialogue model to determine a target dialogue decision from a plurality of candidate dialogue decisions according to the dialogue context information; generating query information for the target dialogue decision through the multi-stage dialogue model, and acquiring dialogue knowledge based on the query information, wherein the dialogue knowledge is the knowledge used to generate dialogue reply information for the dialogue context information; and generating the dialogue reply information for the dialogue context information according to the dialogue knowledge so as to complete the dialogue interaction.
According to a second aspect of the present disclosure, there is provided a multi-stage dialogue model training method, the multi-stage dialogue model comprising a dialogue decision layer, an information query layer and a dialogue reply layer, the method comprising: acquiring a plurality of pieces of sample dialogue information; training the dialogue decision layer according to the sample dialogue information so as to determine a target dialogue decision from a plurality of candidate dialogue decisions through the dialogue decision layer; inputting the target dialogue decision into the information query layer, and training the information query layer so as to generate query information for the target dialogue decision through the information query layer; and acquiring dialogue knowledge based on the query information, inputting the dialogue knowledge into the dialogue reply layer, and training the dialogue reply layer so as to generate dialogue reply information through the dialogue reply layer.
According to a third aspect of the present disclosure, there is provided a dialogue interaction device, comprising: a dialogue context acquisition module configured to acquire dialogue context information, the dialogue context information being a historical dialogue record of a user and a dialogue system; a target dialogue decision module configured to input the dialogue context information into a pre-trained multi-stage dialogue model to determine a target dialogue decision from a plurality of candidate dialogue decisions according to the dialogue context information; a dialogue knowledge query module configured to generate query information for the target dialogue decision through the multi-stage dialogue model and acquire dialogue knowledge based on the query information, the dialogue knowledge being the knowledge used to generate the dialogue reply information for the dialogue context information; and a dialogue reply generation module configured to generate the dialogue reply information for the dialogue context information according to the dialogue knowledge to complete the dialogue interaction.
According to a fourth aspect of the present disclosure, there is provided a multi-stage dialogue model training device, the multi-stage dialogue model comprising a dialogue decision layer, an information query layer and a dialogue reply layer, the device comprising: a sample dialogue information acquisition module configured to acquire a plurality of pieces of sample dialogue information; a dialogue decision layer training module configured to train the dialogue decision layer according to the sample dialogue information so as to determine a target dialogue decision from a plurality of candidate dialogue decisions through the dialogue decision layer; an information query layer training module configured to input the target dialogue decision into the information query layer and train the information query layer so as to generate query information for the target dialogue decision through the information query layer; and a dialogue reply layer query module configured to acquire dialogue knowledge based on the query information, input the dialogue knowledge into the dialogue reply layer, and train the dialogue reply layer so as to generate dialogue reply information through the dialogue reply layer.
According to a fifth aspect of the present disclosure, there is provided a dialogue interaction system, comprising: an interaction module for receiving the dialogue context information from interaction with a user and outputting the dialogue reply information generated by a processing module; the processing module, which comprises a dialogue decision unit, an information query unit and a dialogue reply unit, wherein the dialogue decision unit is used for determining a target dialogue decision from a plurality of candidate dialogue decisions according to the dialogue context information, the information query unit is used for generating query information for the target dialogue decision and acquiring dialogue knowledge based on the query information, and the dialogue reply unit is used for generating dialogue reply information for the dialogue context information according to the dialogue knowledge; and a knowledge warehouse layer for storing the dialogue knowledge.
According to a sixth aspect of the present disclosure, there is provided an electronic device comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of the preceding aspects.
According to a seventh aspect of the present disclosure, there is provided a non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the method of any one of the preceding aspects.
According to an eighth aspect of the present disclosure, there is provided a computer program product comprising a computer program which, when executed by a processor, implements the method of any one of the preceding aspects.
In one or more embodiments of the present disclosure, dialogue context information is obtained, the dialogue context information being a historical dialogue record of a user and a dialogue system; the dialogue context information is input into a pre-trained multi-stage dialogue model to determine a target dialogue decision from a plurality of candidate dialogue decisions according to the dialogue context information; query information for the target dialogue decision is generated through the multi-stage dialogue model, and dialogue knowledge is acquired based on the query information; and dialogue reply information for the dialogue context information is generated according to the dialogue knowledge so as to complete the dialogue interaction. The embodiments of the present disclosure perform dialogue interaction through the pre-trained multi-stage dialogue model, realize cross-type dialogue interaction, improve the flexibility of human-machine dialogue interaction, improve the dialogue effect, and reduce the cost of optimization and configuration.
It should be understood that the description in this section is not intended to identify key or critical features of the embodiments of the disclosure, nor is it intended to be used to limit the scope of the disclosure. Other features of the present disclosure will become apparent from the following specification.
Drawings
The drawings are for a better understanding of the present solution and are not to be construed as limiting the present disclosure. Wherein:
fig. 1 is a schematic diagram of a dialogue interaction method in the related art;
FIG. 2 is a flow diagram of a conversational interaction method according to a first embodiment of the disclosure;
fig. 3 is a schematic diagram of a dialogue example of a dialogue interaction method according to a first embodiment of the present disclosure;
FIG. 4 is a schematic diagram of a multi-stage OnePass parallel reasoning mechanism of a conversational interaction method according to a first embodiment of the disclosure;
FIG. 5 is a schematic diagram of a dialog example of a dialog interaction method in accordance with a second embodiment of the present disclosure;
FIG. 6 is a flow diagram of a multi-stage dialog model training method in accordance with an embodiment of the present disclosure;
FIG. 7 is a schematic diagram of generating sample dialog information for a multi-stage dialog model training method in accordance with an embodiment of the disclosure;
FIG. 8 is a schematic diagram of a multi-stage conversation model trained by a multi-stage conversation model training method in accordance with embodiments of the present disclosure;
FIG. 9 is a schematic diagram of a dialog interaction device for implementing an embodiment of the present disclosure;
FIG. 10 is a schematic diagram of a multi-stage dialog model training device for implementing an embodiment of the present disclosure;
fig. 11 is a block diagram of an electronic device used to implement an embodiment of the present disclosure.
Detailed Description
Exemplary embodiments of the present disclosure are described below in conjunction with the accompanying drawings, which include various details of the embodiments of the present disclosure to facilitate understanding, and should be considered as merely exemplary. Accordingly, one of ordinary skill in the art will recognize that various changes and modifications of the embodiments described herein can be made without departing from the scope and spirit of the present disclosure. Also, descriptions of well-known functions and constructions are omitted in the following description for clarity and conciseness.
Human-computer intelligent dialogues can generally be classified into task-type dialogues, knowledge-type dialogues and open-domain dialogues according to the application scenario. Task-type dialogues mainly support the completion of certain types of tasks, such as booking train tickets, purchasing goods, ordering meals, etc. Knowledge-type dialogues mainly support user queries about certain knowledge, such as consulting bank deposit rates or asking how to cook fish-fragrant shredded pork. Open-domain dialogues mainly provide casual chat and meet the user's need for emotional companionship. Each single-type dialogue system has its own modeling approach. Task-type dialogues are generally divided into dialogue understanding, dialogue state tracking, dialogue management and dialogue generation modules that process a dialogue request in cascade. Knowledge-type dialogues are generally handled by three modules: coarse screening, fine ranking and replying. Chat-type dialogues are generally realized by building an end-to-end seq2seq model, and a configurable intervention knowledge base can be added to keep part of the chat content controllable. However, in actual application scenarios, different types of dialogue are often required at the same time, and the boundaries between the different types are somewhat blurred.
The related art combines a number of single-type dialogue skills through a dialogue central control to achieve multi-type dialogue. As shown in fig. 1, there are two types of dialogue central control. One is distributed central control: a dialogue request is sent to all single-type dialogue skills, and, based on the results returned by the different skills, the result of one skill is selected by rules and returned to the requester. The other is flow-based central control, which requires a developer to configure dialogue flow rules in advance; a dialogue request then uses the reply of the single-type dialogue skill specified by the flow, according to the current dialogue flow, the current dialogue state and the flow rules.
However, in practical applications, the above-described related art solution has the following problems:
(1) The upper limit of the dialogue effect is low. The dialogue strategy of distributed central control is simple and rule-based, and cannot cope well with complex real dialogue situations. Although flow-based central control can be configured with complex flow rules, the dialogue flow is configured manually, and the flow complexity that a person can effectively control is limited, so overly complex dialogues cannot be configured in practice. In addition, in the modeling process of traditional task-type and knowledge-type dialogues, the configurator must grasp concepts such as intents, word slots, standard questions and similar questions, and divide the target scenario into mutually distinguishable intents and standard questions that can be matched against each other. This places extremely high demands on the configurator's understanding of the scenario and ability to abstract, and complex scenarios are difficult to divide reasonably in practice, so they cannot be modeled well. As a result, the upper limit of the dialogue effect is low, and an ideal dialogue effect cannot be achieved, especially in complex scenarios.
(2) The cost of optimization and configuration is high. Because possible user intents, the corresponding samples and the dialogue flows all have to be configured manually, configuration personnel need to spend a great deal of time combing through the scenario, confirming the required intents, word slots, standard questions and the like, configuring a large number of samples and labelled questions for each intent so as to cover as many user expressions as possible, and stringing together the interaction logic that may appear by configuring the dialogue flows. The whole configuration process is complex, uncovered intents and dialogue flows have to be supplemented continuously, and the cost of optimization is high.
(3) The dialogue is relatively rigid. Because all dialogue flows are configured by rules, the whole interaction feels noticeably mechanical, and the interaction remains the same in the face of different user expressions.
In order to solve the above-mentioned problems of the related art, embodiments of the present disclosure provide a dialogue interaction method, a dialogue interaction device, a dialogue interaction system, an electronic device, and a storage medium. The present disclosure is described in detail below with reference to specific examples.
In a first embodiment, fig. 2 is a flow chart of a dialogue interaction method according to the first embodiment of the present disclosure. The method may be implemented by a computer program and may be run on a device for dialogue interaction. The computer program may be integrated into an application or may run as a stand-alone utility application.
The device for dialogue interaction may be an electronic device with a human-machine interaction function, where the electronic device includes but is not limited to: wearable devices, handheld devices, personal computers, tablet computers, vehicle-mounted devices, smart phones, computing devices, or other processing devices connected to a wireless modem. Electronic devices in different networks may be called different names, for example: user equipment, access terminals, subscriber units, subscriber stations, mobile stations, remote terminals, mobile devices, user terminals, wireless communication devices, user agents, cellular telephones, cordless telephones, personal digital assistants (PDAs), or electronic devices in fifth-generation mobile communication technology (5G) networks, fourth-generation mobile communication technology (4G) networks, third-generation mobile communication technology (3G) networks, or future evolved networks.
The dialogue interaction method of the first embodiment will be described in detail. As shown in fig. 2, the dialogue interaction method includes the following steps:
S201: acquiring dialogue context information, wherein the dialogue context information is a historical dialogue record of a user and a dialogue system;
S202: inputting the dialogue context information into a pre-trained multi-stage dialogue model to determine a target dialogue decision from a plurality of candidate dialogue decisions according to the dialogue context information;
S203: generating query information for the target dialogue decision through the multi-stage dialogue model, and acquiring dialogue knowledge based on the query information, wherein the dialogue knowledge is the knowledge used to generate dialogue reply information for the dialogue context information;
S204: generating the dialogue reply information for the dialogue context information according to the dialogue knowledge so as to complete the dialogue interaction.
The embodiments of the present disclosure perform dialogue interaction through the pre-trained multi-stage dialogue model, realize cross-type dialogue interaction, improve the flexibility of human-machine dialogue interaction, improve the dialogue effect, and reduce the cost of optimization and configuration.
Each step of the dialogue interaction method is described below. Specifically, the dialogue interaction method includes:
s201: dialogue context information is obtained, which is a historical dialogue record of a user and a dialogue system.
The dialogue context information is the historical dialogue record of the user and the dialogue system, and includes the dialogue records generated by the several interaction rounds preceding the current dialogue interaction. In the dialogue example shown in fig. 3, at the first interaction the dialogue context information is the user input "today's good impression", and at the fifth interaction the dialogue context information is the dialogue content from "today's good impression" to "what scenic spots are interesting in Beijing". It should be noted that the above scenario is only an exemplary scenario, and the scope of the embodiments of the present disclosure is not limited thereto.
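Purely for illustration, the dialogue context information can be pictured as an ordered list of user and system turns that is flattened into a single text before being fed to the model. The sketch below is a minimal Python representation under that assumption; the class and field names are invented for the example and are not taken from the disclosure.

```python
from dataclasses import dataclass, field
from typing import List

@dataclass
class DialogueTurn:
    speaker: str   # "user" or "system"
    text: str

@dataclass
class DialogueContext:
    turns: List[DialogueTurn] = field(default_factory=list)

    def append(self, speaker: str, text: str) -> None:
        self.turns.append(DialogueTurn(speaker, text))

    def as_text(self) -> str:
        # Flatten the historical dialogue record into one string for the model input.
        return "\n".join(f"{t.speaker}: {t.text}" for t in self.turns)

# The context grows with every interaction between the user and the dialogue system.
context = DialogueContext()
context.append("user", "Help me check whether tomorrow's weather in Beijing is suitable for playing ball")
print(context.as_text())
```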
S202: the dialog context information is input into a pre-trained multi-stage dialog model to determine a target dialog decision from a plurality of candidate dialog decisions based on the dialog context information.
The embodiments of the present disclosure implement the dialogue interaction process through the pre-trained multi-stage dialogue model described above, which may adopt the multi-stage OnePass parallel reasoning mechanism shown in fig. 4. Under this mechanism the dialogue is divided into multiple stages, but because there is only one model and it is executed only once, the stages only need to be performed once from left to right. Compared with completing the modeling with several separate models, or with a single model executed over multiple passes, this has obvious advantages in terms of computation and efficiency, so the running cost of the service is lower.
Illustratively, the multi-stage dialogue model may include a dialogue decision layer, an information query layer and a dialogue reply layer. The dialogue interaction method provided by the embodiments of the present disclosure can be applied to interactive dialogue across multiple scenarios; the candidate dialogue decisions are the dialogue decisions of all scenarios, and the target dialogue decision is the dialogue decision required by the current dialogue interaction. The process of determining a target dialogue decision from a plurality of candidate dialogue decisions based on the dialogue context information may be implemented as follows: determining the similarity between the dialogue context information and the candidate dialogue decisions through the dialogue decision layer, and determining a plurality of relevant dialogue decisions from the candidate dialogue decisions according to the similarity; and matching the dialogue context information with the relevant dialogue decisions through the dialogue decision layer to determine the target dialogue decision from the relevant dialogue decisions.
Specifically, the above process is described in detail taking a dialogue system that supports multiple domains at the same time as an example. Assume the dialogue context information input by the user is "help me check whether tomorrow's weather in Beijing is suitable for playing ball". The candidate dialogue decisions are coarsely screened by their similarity to the dialogue context information, and N relevant dialogue decisions, such as querying the weather API, calling the court reservation API, calling the shopping API and calling the encyclopedia knowledge base, are determined from the candidate dialogue decisions. Further, the matching result and the classification result are combined, for example by a weighted average, to determine that the final target dialogue decision is to query the weather API and call the encyclopedia knowledge base. It should be noted that the above scenario is only an exemplary scenario, and the scope of the embodiments of the present disclosure is not limited thereto.
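As a hedged sketch of this two-step selection, the function below first coarsely screens the candidate dialogue decisions by similarity and then combines a matching score with a classification score by weighted average. The scoring callables, the 0.5/0.5 weights and the threshold are assumptions made for the example, not values taken from the disclosure.

```python
from typing import Callable, Dict, List

def select_target_decisions(
    context_vector,                          # encoded dialogue context (e.g. an embedding)
    candidates: List[str],                   # all candidate dialogue decisions
    similarity: Callable[[object, str], float],
    match_score: Callable[[object, str], float],
    class_score: Callable[[object, str], float],
    top_n: int = 4,
    w_match: float = 0.5,
    w_class: float = 0.5,
    threshold: float = 0.6,
) -> List[str]:
    # Step 1: coarse screening -- keep the N candidates most similar to the context.
    relevant = sorted(candidates, key=lambda c: similarity(context_vector, c), reverse=True)[:top_n]
    # Step 2: combine the matching result and the classification result by weighted
    # average and keep every relevant decision whose score clears the threshold.
    scores: Dict[str, float] = {
        c: w_match * match_score(context_vector, c) + w_class * class_score(context_vector, c)
        for c in relevant
    }
    return [c for c, s in scores.items() if s >= threshold]

# Hypothetical usage with toy scoring functions.
cands = ["query weather API", "call court reservation API",
         "call shopping API", "call encyclopedia knowledge base"]
sim = lambda v, c: 1.0 if "weather" in c or "encyclopedia" in c else 0.3
print(select_target_decisions(None, cands, sim, sim, sim))
```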
S203: query information for the target dialogue decision is generated by the multi-stage dialogue model, and dialogue knowledge, which is the knowledge used to generate the dialogue reply information for the dialogue context information, is obtained based on the query information.
In the embodiments of the present disclosure, the query information is used to query the corresponding knowledge base for the knowledge of the target dialogue decision, that is, the dialogue knowledge described above, so that dialogue reply information can be generated based on the queried knowledge. Illustratively, these knowledge bases may form the knowledge warehouse layer of the multi-stage dialogue model and may store various kinds of knowledge, for example an API service, a general knowledge base, a stored image base, a dialogue recommendation base, and the like.
Illustratively, the process of generating the query information for the target dialogue decision through the multi-stage dialogue model and acquiring the dialogue knowledge based on the query information may be implemented as follows: the information query layer of the multi-stage dialogue model generates the corresponding query information according to the type of the target dialogue decision, and the dialogue knowledge is acquired from the corresponding knowledge base based on the query information. For example, a query statement for a database is generated by the information query layer, and the dialogue knowledge is queried from the database. Alternatively, a command statement for an API (call interface) is generated by the information query layer, and the dialogue knowledge is acquired based on the command statement.
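A minimal sketch of how an information query layer could emit different forms of query information depending on the decision type is shown below; the decision-type labels, the SQL template and the command format are illustrative assumptions rather than the model's actual output format.

```python
def build_query(decision: str, slots: dict) -> dict:
    """Return query information whose form depends on the type of the target dialogue decision."""
    if decision == "database":
        # Database-type decisions become SQL query statements.
        return {"kind": "sql",
                "statement": "SELECT answer FROM faq WHERE topic = %s",
                "params": (slots.get("topic"),)}
    if decision == "api":
        # API-type decisions become call-interface command statements.
        args = ", ".join(f"{k}={v}" for k, v in slots["args"].items())
        return {"kind": "api_command", "command": f"{slots['api_name']}({args})"}
    # Anything else falls back to a plain search question for a knowledge base.
    return {"kind": "search_question", "question": slots.get("question", "")}

# Example: an API-type decision yields a command statement.
print(build_query("api", {"api_name": "look_up_weather",
                          "args": {"time": "tomorrow", "place": "Beijing"}}))
```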
Specifically, continuing the example of the dialogue system described above that supports multiple domains at the same time: after determining that the target dialogue decision is to query the weather API and call the encyclopedia knowledge base, this step generates, through the information query layer, the parameters required to query the weather API. In combination with fig. 3, the API parameters are "time is tomorrow, place is Beijing", and the question required to call the encyclopedia knowledge base is "what weather is suitable for playing ball". The combined API command "look up weather (time = tomorrow, place = Beijing)" is used to request the weather API service, which returns the result "weather fine, temperature 15-25 ℃"; the generated question is used to query the encyclopedia knowledge base, which returns the result "clear weather with a moderate temperature is suitable for playing ball". It should be noted that the above scenario is only an exemplary scenario, and the scope of the embodiments of the present disclosure is not limited thereto.
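To make the interaction with the knowledge bases concrete, the sketch below dispatches the generated query information to a stubbed weather service and a stubbed encyclopedia knowledge base and collects the returned dialogue knowledge. All function names and canned responses are hypothetical stand-ins for whatever services a real deployment would use.

```python
from typing import Dict, List

# Stub backends standing in for the real knowledge warehouse layer.
def request_weather_api(time: str, place: str) -> str:
    return "clear, 15-25 degrees C"          # canned response for the sketch

def query_encyclopedia(question: str) -> str:
    return "Clear weather with a moderate temperature is suitable for playing ball."

def gather_dialogue_knowledge(queries: List[Dict]) -> List[str]:
    knowledge = []
    for q in queries:
        if q["kind"] == "api_command" and q["name"] == "look_up_weather":
            knowledge.append(request_weather_api(**q["args"]))
        elif q["kind"] == "search_question":
            knowledge.append(query_encyclopedia(q["question"]))
    return knowledge

print(gather_dialogue_knowledge([
    {"kind": "api_command", "name": "look_up_weather",
     "args": {"time": "tomorrow", "place": "Beijing"}},
    {"kind": "search_question", "question": "What weather is suitable for playing ball?"},
]))
```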
S204: and generating dialogue reply information of dialogue context information according to dialogue knowledge so as to complete dialogue interaction.
After the dialogue knowledge is acquired, in step S204 the embodiment of the present disclosure generates the dialogue reply information that answers the dialogue context information of the current interaction and feeds it back to the user. Illustratively, this process may be implemented as follows: the dialogue decision layer of the multi-stage dialogue model determines, from the dialogue knowledge and according to the dialogue context information, the input knowledge for the dialogue reply layer; the dialogue reply layer then generates the dialogue reply information for the dialogue context information according to the input knowledge.
Specifically, continuing the example of the dialogue system that supports multiple domains at the same time, the knowledge obtained by the queries is used to generate the dialogue reply information "Tomorrow's weather in Beijing is fine, 15-25 degrees, suitable for playing ball", which is returned to the user to complete the interaction. It should be noted that the above scenario is only an exemplary scenario, and the scope of the embodiments of the present disclosure is not limited thereto.
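The two sub-steps of S204 can be sketched as follows: a simple posterior selection of the input knowledge for the dialogue reply layer, followed by reply generation. The word-overlap scoring and the template reply are placeholders for the model's learned behaviour and are only meant to show the data flow.

```python
from typing import List

def select_input_knowledge(context: str, knowledge: List[str], top_k: int = 2) -> List[str]:
    # Posterior decision stand-in: keep the knowledge items that overlap most with the context.
    def overlap(item: str) -> int:
        return len(set(item.lower().split()) & set(context.lower().split()))
    return sorted(knowledge, key=overlap, reverse=True)[:top_k]

def generate_reply(context: str, input_knowledge: List[str]) -> str:
    # Stand-in for the dialogue reply layer; a real system would condition a
    # generation model on the context and the selected knowledge.
    return "Tomorrow's weather in Beijing is " + "; ".join(input_knowledge) + " - suitable for playing ball."

ctx = "Help me check whether tomorrow's weather in Beijing is suitable for playing ball"
know = ["clear, 15-25 degrees C",
        "Clear weather with a moderate temperature is suitable for playing ball."]
print(generate_reply(ctx, select_input_knowledge(ctx, know)))
```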
Referring to fig. 5, fig. 5 is a flow chart illustrating a dialogue interaction method according to a second embodiment of the disclosure. Specifically:
s501: dialogue context information is obtained, which is a historical dialogue record of a user and a dialogue system.
S502: and encoding the dialogue context information to generate a target dialogue vector corresponding to the dialogue context information.
In the embodiments of the present disclosure, the multi-stage dialogue model further includes a dialogue coding layer, which can be used to encode the dialogue context information and generate a target dialogue vector that the model can process and that is reused by the subsequent stages. By sharing the encoding, the different stages of the model can reinforce one another, and a faster encoding speed can be obtained.
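For illustration only, the shared dialogue coding layer can be sketched as a single encoder that turns the flattened history record into one vector that every later stage reuses. The toy hashing encoder below stands in for the neural encoder; its behaviour is an assumption made to keep the example self-contained.

```python
import hashlib
import numpy as np

def encode_dialogue(context_text: str, dim: int = 128) -> np.ndarray:
    # Toy shared encoder: hash each token into a fixed-size bag-of-words vector.
    # A real multi-stage dialogue model would use a neural dialogue coding layer,
    # but the point is the same: encode once, reuse the vector in every stage.
    vec = np.zeros(dim)
    for token in context_text.lower().split():
        idx = int(hashlib.md5(token.encode()).hexdigest(), 16) % dim
        vec[idx] += 1.0
    norm = np.linalg.norm(vec)
    return vec / norm if norm else vec

# The same target dialogue vector is then consumed by the decision, query and reply stages.
target_dialogue_vector = encode_dialogue("user: what scenic spots are interesting in Beijing")
```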
S503: the dialog context information is input into a pre-trained multi-stage dialog model to determine a target dialog decision from a plurality of candidate dialog decisions based on the dialog context information.
S504: query information for the target dialogue decision is generated by the multi-stage dialogue model, and dialogue knowledge, which is the knowledge used to generate the dialogue reply information for the dialogue context information, is obtained based on the query information.
S505: the dialogue reply information for the dialogue context information is generated according to the dialogue knowledge so as to complete the dialogue interaction.
In addition, the implementation details of steps S501 and S503 to S505 have already been described in detail for the corresponding steps S201 to S204, and are not repeated here.
In addition, the embodiments of the present disclosure also provide a multi-stage dialogue model training method for training the multi-stage dialogue model involved in the above dialogue interaction method. As shown in fig. 6, the multi-stage dialogue model training method includes the following steps:
s601: a plurality of sample dialogue information is acquired.
In the disclosed embodiment, this step is used to obtain sample dialogue information for model training. For example, a history of a user's dialogue with the dialogue system may be acquired as the sample dialogue information.
However, dialogue data is difficult to collect, and if only dialogue data is used for training, the model cannot fully learn the basic capabilities that may be needed across the vast range of application scenarios. Data for other natural language processing tasks is comparatively plentiful; by processing these task data and the dialogue task data in a unified format, the other task data can be used to improve the general capability of the model and to strengthen its adaptation to this format.
Preferably, the embodiments of the present disclosure may also generate sample dialogue information from a target task, where the target task may include one or more of a generation task, a classification task and a matching task. Fig. 7 shows an example of enhancing the model through multiple natural language processing tasks: generation tasks (such as dialogue summarization, reading comprehension and entity recognition), classification tasks (intent classification) and matching tasks (semantic matching and label matching) can all be used, in a unified format, to further train the multi-stage dialogue model and strengthen its capability.
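The unification described above can be pictured as casting every auxiliary task into the same input/output text format used for the dialogue data. The field names and prompt prefixes in the sketch below are illustrative assumptions, not the format used by the disclosed model.

```python
def to_unified_example(task_type: str, payload: dict) -> dict:
    """Cast generation, classification and matching tasks into one input/output text format."""
    if task_type == "generation":            # e.g. dialogue summarization, reading comprehension
        return {"input": f"summarise: {payload['text']}", "output": payload["summary"]}
    if task_type == "classification":        # e.g. intent classification
        return {"input": f"intent of: {payload['utterance']}", "output": payload["intent"]}
    if task_type == "matching":              # e.g. semantic matching
        return {"input": f"match: {payload['text_a']} ||| {payload['text_b']}",
                "output": "yes" if payload["is_match"] else "no"}
    raise ValueError(f"unknown task type: {task_type}")

samples = [
    to_unified_example("classification",
                       {"utterance": "book a court for tomorrow", "intent": "court_reservation"}),
    to_unified_example("matching",
                       {"text_a": "weather tomorrow", "text_b": "tomorrow's forecast", "is_match": True}),
]
print(samples[0])
```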
S602: the dialog decision layer is trained in accordance with the sample dialog information to determine a target dialog decision from a plurality of candidate dialog decisions by the dialog decision layer.
S603: the target dialogue decision is input into the information inquiry layer, and the information inquiry layer is trained to generate inquiry information of the target dialogue decision through the information inquiry layer.
S604: dialogue knowledge is acquired based on the query information, and the dialogue knowledge is input into a dialogue reply layer, and the dialogue reply layer is trained to generate dialogue reply information through the dialogue reply layer.
In an embodiment of the present disclosure, the multi-stage dialog model further includes a dialog coding layer, and the model training method may train the dialog coding layer by: the dialogue encoding layer is trained according to the sample dialogue information to encode the sample dialogue information through the dialogue encoding layer to generate a target dialogue vector of the sample dialogue information.
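Purely as an outline of the training order in S602 to S604, plus the coding-layer step above, the sketch below wires together placeholder layer objects. The `fit` interface and the print statements are assumptions used to keep the example runnable; they do not reflect the training API of the disclosed model.

```python
class Layer:
    """Placeholder for a trainable stage of the multi-stage dialogue model."""
    def __init__(self, name: str):
        self.name = name
    def fit(self, inputs, targets=None):
        print(f"training {self.name} on {len(inputs)} samples")
    def __call__(self, inputs):
        return inputs                      # identity stand-in for the real forward pass

def train_multi_stage_model(sample_dialogues, knowledge_lookup):
    coding, decision, query, reply = (Layer("coding layer"), Layer("decision layer"),
                                      Layer("query layer"), Layer("reply layer"))
    coding.fit(sample_dialogues)                       # train the dialogue coding layer
    vectors = coding(sample_dialogues)
    decision.fit(vectors)                              # S602: train the dialogue decision layer
    target_decisions = decision(vectors)
    query.fit(target_decisions)                        # S603: train the information query layer
    queries = query(target_decisions)
    knowledge = [knowledge_lookup(q) for q in queries] # acquire dialogue knowledge
    reply.fit(knowledge)                               # S604: train the dialogue reply layer
    return coding, decision, query, reply

train_multi_stage_model(["hello", "book a court"], lambda q: f"knowledge for {q}")
```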
In a specific embodiment, a multi-stage dialogue model trained by the method provided by the embodiment of the disclosure may be shown in fig. 8, where the multi-stage dialogue model includes a dialogue coding layer, a dialogue decision layer, an information query layer, a knowledge repository layer, and a dialogue reply layer.
The unified dialogue coding layer encodes the dialogue context information, through a neural network, into a vector representation that the model can process for subsequent use. By sharing the context encoding, the different levels of the model can reinforce one another, and a faster encoding speed can be obtained.
The dialogue decision layer comprises dialogue prior decisions and dialogue posterior decisions. A dialogue prior decision decides, based on the dialogue context, what dialogue action should be taken in the current situation. A dialogue posterior decision decides, after the various kinds of knowledge have been acquired, which key knowledge needs to be fed into the dialogue generation model, according to the knowledge and the dialogue context.
The information query layer generates corresponding query statements according to the dialogue decision result and the dialogue context. Different query statements are generated depending on the knowledge base to be queried. For example, if a search engine is to be queried, a corresponding search question is generated; if a database is to be queried, a corresponding SQL statement is generated; and if an API is to be called, a corresponding API command is generated.
The knowledge warehouse layer contains the various knowledge bases needed by the dialogue. These knowledge bases hold the various kinds of information the dialogue requires. The knowledge warehouse layer may differ between applications, depending mainly on the actual application needs.
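A small sketch of a knowledge warehouse layer viewed as a registry of heterogeneous knowledge bases keyed by name is given below; the registered backends are placeholders chosen for illustration.

```python
from typing import Callable, Dict

class KnowledgeWarehouse:
    """Registry of knowledge bases the dialogue model can query (APIs, databases, search, ...)."""
    def __init__(self):
        self._sources: Dict[str, Callable[[dict], str]] = {}
    def register(self, name: str, handler: Callable[[dict], str]) -> None:
        self._sources[name] = handler
    def query(self, name: str, query_info: dict) -> str:
        return self._sources[name](query_info)

warehouse = KnowledgeWarehouse()
warehouse.register("weather_api", lambda q: "clear, 15-25 degrees C")
warehouse.register("encyclopedia", lambda q: "Clear weather with a moderate temperature suits ball games.")
print(warehouse.query("weather_api", {"time": "tomorrow", "place": "Beijing"}))
```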
The dialogue reply layer provides the dialogue reply capability, producing replies that conform to the dialogue logic according to the dialogue context and the queried knowledge. Knowledge helps the dialogue system give more accurate and satisfactory replies, meets the demand of task-type and knowledge-type dialogues for accurate replies, and at the same time improves the quality of open-domain dialogue replies.
In this embodiment, a complete dialogue request passes in turn through dialogue encoding, dialogue prior decision, dialogue query information generation, knowledge warehouse interaction, dialogue posterior decision and answer generation to complete the whole dialogue interaction process, and the whole process only needs to be executed once by the model.
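Tying the stages together, the sketch below shows the order of that single pass (encoding, prior decision, query generation, knowledge warehouse interaction, posterior decision, answer generation). Every stage is a trivial stub, so the pipeline only illustrates the control flow and is not the disclosed model.

```python
def dialogue_pipeline(context_text: str) -> str:
    encode   = lambda text: text.lower()                                          # dialogue coding layer (stub)
    decide   = lambda vec: ["weather_api"] if "weather" in vec else ["chitchat"]  # prior decision (stub)
    to_query = lambda decision: {"weather_api": "look_up_weather(time=tomorrow, place=Beijing)",
                                 "chitchat": ""}[decision]                        # information query layer (stub)
    lookup   = lambda query: "clear, 15-25 degrees C" if query else ""            # knowledge warehouse (stub)
    select   = lambda items: [k for k in items if k]                              # posterior decision (stub)
    reply    = lambda vec, knowledge: ("Tomorrow looks " + knowledge[0] + " - enjoy your game!"
                                       if knowledge else "Happy to chat!")        # dialogue reply layer (stub)

    vec = encode(context_text)                     # 1. dialogue encoding
    decisions = decide(vec)                        # 2. dialogue prior decision
    queries = [to_query(d) for d in decisions]     # 3. dialogue query information generation
    knowledge = [lookup(q) for q in queries]       # 4. knowledge warehouse interaction
    chosen = select(knowledge)                     # 5. dialogue posterior decision
    return reply(vec, chosen)                      # 6. answer generation

print(dialogue_pipeline("Is tomorrow's weather in Beijing good for playing ball?"))
```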
Task-type dialogue, knowledge-type dialogue and open-domain dialogue are unified into a single cross-type dialogue system, so that any single-type dialogue system can be realized by this unified cross-type dialogue system. Compared with a traditional cross-type dialogue system formed by combining several single-type dialogue systems through a dialogue central control, the unified cross-type dialogue system achieves greater advantages in effect and performance.
In the technical solution of the present disclosure, the collection, storage, use, processing, transmission, provision, disclosure and other handling of the user's personal information comply with the relevant laws and regulations and do not violate public order and good customs.
The following are device embodiments of the present disclosure that may be used to perform method embodiments of the present disclosure. For details not disclosed in the embodiments of the apparatus of the present disclosure, please refer to the embodiments of the method of the present disclosure.
Referring to fig. 9, a dialogue interaction device for implementing an embodiment of the present disclosure is shown. The dialogue interaction device may be implemented as all or part of a device by software, hardware or a combination of the two. The dialogue interaction device 900 includes a dialogue context acquisition module 901, a target dialogue decision module 902, a dialogue knowledge query module 903 and a dialogue reply generation module 904, where:
a dialogue context acquisition module 901 configured to perform acquisition of dialogue context information, which is a history dialogue record of a user and a dialogue system;
a target dialog decision module 902 configured to perform inputting dialog context information into a pre-trained multi-stage dialog model to determine a target dialog decision from a plurality of candidate dialog decisions in accordance with the dialog context information;
a dialogue knowledge query module 903 configured to generate query information for the target dialogue decision through the multi-stage dialogue model, and acquire dialogue knowledge based on the query information, the dialogue knowledge being the knowledge used to generate the dialogue reply information for the dialogue context information;
The dialogue reply generation module 904 is configured to generate the dialogue reply information for the dialogue context information according to the dialogue knowledge to complete the dialogue interaction.
In the embodiments of the present disclosure, the multi-stage dialogue model includes a dialogue coding layer, and the process of inputting the dialogue context information into the pre-trained multi-stage dialogue model to determine a target dialogue decision from a plurality of candidate dialogue decisions according to the dialogue context information may be implemented as follows: encoding the dialogue context information through the dialogue coding layer to generate a target dialogue vector corresponding to the dialogue context information; and determining a target dialogue decision from a plurality of candidate dialogue decisions according to the target dialogue vector.
In an embodiment of the disclosure, the multi-stage dialog model includes a dialog decision layer; the above-described process of determining a target dialog decision from a plurality of candidate dialog decisions based on the target dialog vector may be implemented as follows: determining the similarity between the target dialogue vector and the candidate dialogue decision through a dialogue decision layer, and determining a plurality of related dialogue decisions from the candidate dialogue decisions according to the similarity; the target dialog vector and the related dialog decision are matched by a dialog decision layer to determine a target dialog decision from the related dialog decisions.
In an embodiment of the disclosure, the multi-stage dialogue model includes an information query layer; the process of generating the query information of the target dialogue decision through the multi-stage dialogue model and acquiring the dialogue knowledge based on the query information can be implemented as follows: and generating corresponding query information in the information query layer according to the type of the target dialogue decision, and acquiring dialogue knowledge in a corresponding knowledge base based on the query information.
Specifically, when the target dialogue decision is a database query, a query statement for the database is generated through the information query layer, and the dialogue knowledge is queried from the database. When the target dialogue decision is a call-interface query, a command statement for the call interface is generated through the information query layer, and the dialogue knowledge is acquired based on the command statement.
In the embodiments of the present disclosure, the multi-stage dialogue model includes a dialogue reply layer; the process of generating the dialogue reply information for the dialogue context information according to the dialogue knowledge may be implemented as follows: the dialogue decision layer determines, from the dialogue knowledge and according to the target dialogue vector of the dialogue context information, the input knowledge for the dialogue reply layer; and the dialogue reply layer generates the dialogue reply information for the dialogue context information according to the input knowledge.
In an embodiment of the disclosure, the multi-stage dialog model includes a knowledge repository layer for storing the dialog knowledge.
Referring to fig. 10, a multi-stage dialogue model training device for implementing an embodiment of the present disclosure is shown. The multi-stage dialogue model training device may be implemented as all or part of a device by software, hardware or a combination of the two. The multi-stage dialogue model training device 1000 includes a sample dialogue information acquisition module 1001, a dialogue decision layer training module 1002, an information query layer training module 1003 and a dialogue reply layer query training module 1004, where:
a sample dialogue information acquisition module 1001 configured to perform acquisition of a plurality of sample dialogue information;
a dialog decision layer training module 1002 configured to perform training of a dialog decision layer in accordance with sample dialog information to determine a target dialog decision from a plurality of candidate dialog decisions by the dialog decision layer;
an information query layer training module 1003 configured to perform inputting of the target dialog decision into the information query layer, and train the information query layer to generate query information of the target dialog decision through the information query layer;
the dialogue reply layer query training module 1004 is configured to perform acquiring dialogue knowledge based on the query information, and input the dialogue knowledge into the dialogue reply layer, and train the dialogue reply layer to generate dialogue reply information through the dialogue reply layer.
In an embodiment of the disclosure, the multi-stage conversation model may further include a conversation coding layer, and the apparatus further includes a conversation coding layer training module configured to train the conversation coding layer according to the sample conversation information, so as to encode the sample conversation information by the conversation coding layer, and generate a target conversation vector of the sample conversation information.
In an embodiment of the present disclosure, the apparatus further includes a sample dialogue information generating module configured to generate sample dialogue information according to a target task, where the target task may include one or more of a generating task, a classifying task, and a matching task.
In an embodiment of the disclosure, the multi-stage dialog model includes a knowledge repository layer for storing dialog knowledge.
It should be noted that when the dialogue interaction device and the multi-stage dialogue model training device provided in the above embodiments perform the dialogue interaction method and the multi-stage dialogue model training method, the division into the above functional modules is only used as an example. In practical applications, the above functions may be assigned to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above. In addition, the device embodiments and the method embodiments provided above belong to the same concept; the details of their implementation are given in the method embodiments and are not repeated here.
The embodiments of the present disclosure also provide a dialogue interaction system, which comprises an interaction module, a processing module and a knowledge warehouse layer. Wherein:
the interaction module is used for receiving the dialogue context information from interaction with a user and outputting the dialogue reply information generated by the processing module;
the processing module comprises a dialogue decision unit, an information inquiry unit and a dialogue reply unit, wherein the dialogue decision unit is used for determining a target dialogue decision from a plurality of candidate dialogue decisions according to dialogue context information; the information inquiry unit is used for generating inquiry information of a target dialogue decision and acquiring dialogue knowledge based on the inquiry information; the dialogue reply unit is used for generating dialogue reply information of dialogue context information according to dialogue knowledge;
the knowledge warehouse layer is used for storing dialogue knowledge.
In an embodiment of the disclosure, the processing module further includes a session encoding unit, configured to encode session context information, and generate a target session vector corresponding to the session context information.
The foregoing embodiment numbers of the present disclosure are merely for description and do not represent advantages or disadvantages of the embodiments.
In the technical solution of the present disclosure, the acquisition, storage, application and other handling of the user's personal information comply with the relevant laws and regulations and do not violate public order and good customs.
According to embodiments of the present disclosure, the present disclosure also provides an electronic device, a readable storage medium and a computer program product.
Fig. 11 illustrates a schematic block diagram of an example electronic device 1100 that can be used to implement embodiments of the present disclosure. Electronic devices are intended to represent various forms of digital computers, such as laptops, desktops, workstations, personal digital assistants, servers, blade servers, mainframes, and other appropriate computers. The electronic device may also represent various forms of mobile devices, such as personal digital processing, cellular telephones, smartphones, wearable devices, and other similar computing devices. The components shown herein, their connections and relationships, and their functions, are meant to be exemplary only, and are not meant to limit implementations of the disclosure described and/or claimed herein.
As shown in fig. 11, the apparatus 1100 includes a computing unit 1101 that can perform various appropriate actions and processes according to a computer program stored in a Read Only Memory (ROM) 1102 or a computer program loaded from a storage unit 1108 into a Random Access Memory (RAM) 1103. In the RAM 1103, various programs and data required for the operation of the device 1100 can also be stored. The computing unit 1101, ROM 1102, and RAM 1103 are connected to each other by a bus 1104. An input/output (I/O) interface 1105 is also connected to bus 1104.
Various components in device 1100 are connected to I/O interface 1105, including: an input unit 1106 such as a keyboard, a mouse, etc.; an output unit 1107 such as various types of displays, speakers, and the like; a storage unit 1108, such as a magnetic disk, optical disk, etc.; and a communication unit 1109 such as a network card, modem, wireless communication transceiver, or the like. The communication unit 1109 allows the device 1100 to exchange information/data with other devices through a computer network such as the internet and/or various telecommunication networks.
The computing unit 1101 may be a variety of general purpose and/or special purpose processing components having processing and computing capabilities. Some examples of computing unit 1101 include, but are not limited to, a Central Processing Unit (CPU), a Graphics Processing Unit (GPU), various specialized Artificial Intelligence (AI) computing chips, various computing units running machine learning model algorithms, a Digital Signal Processor (DSP), and any suitable processor, controller, microcontroller, etc. The computing unit 1101 performs the respective methods and processes described above, such as a data processing method. For example, in some embodiments, the data processing method may be implemented as a computer software program tangibly embodied on a machine-readable medium, such as storage unit 1108. In some embodiments, some or all of the computer programs may be loaded and/or installed onto device 1100 via ROM1102 and/or communication unit 1109. When a computer program is loaded into the RAM 1103 and executed by the computing unit 1101, one or more steps of the data processing method described above may be performed. Alternatively, in other embodiments, the computing unit 1101 may be configured to perform the data processing method by any other suitable means (e.g. by means of firmware).
Various implementations of the systems and techniques described above may be implemented in digital electronic circuitry, integrated circuit systems, field programmable gate arrays (FPGAs), application specific integrated circuits (ASICs), application specific standard products (ASSPs), systems on chip (SOCs), complex programmable logic devices (CPLDs), computer hardware, firmware, software, and/or combinations thereof. These various embodiments may include: implementation in one or more computer programs, the one or more computer programs being executable and/or interpretable on a programmable system including at least one programmable processor, which may be a special-purpose or general-purpose programmable processor, and which can receive data and instructions from, and transmit data and instructions to, a storage system, at least one input device, and at least one output device.
Program code for carrying out methods of the present disclosure may be written in any combination of one or more programming languages. These program code may be provided to a processor or controller of a general purpose computer, special purpose computer, or other programmable data processing apparatus such that the program code, when executed by the processor or controller, causes the functions/operations specified in the flowchart and/or block diagram to be implemented. The program code may execute entirely on the machine, partly on the machine, as a stand-alone software package, partly on the machine and partly on a remote machine or entirely on the remote machine or server.
In the context of this disclosure, a machine-readable medium may be a tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device. The machine-readable medium may be a machine-readable signal medium or a machine-readable storage medium. The machine-readable medium may include, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any suitable combination of the foregoing. More specific examples of a machine-readable storage medium would include an electrical connection based on one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber, a portable compact disc read-only memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
To provide for interaction with a user, the systems and techniques described here can be implemented on a computer having: a display device (e.g., a CRT (cathode ray tube) or LCD (liquid crystal display) monitor) for displaying information to a user; and a keyboard and pointing device (e.g., a mouse or trackball) by which a user can provide input to the computer. Other kinds of devices may also be used to provide for interaction with a user; for example, feedback provided to the user may be any form of sensory feedback (e.g., visual feedback, auditory feedback, or tactile feedback); and input from the user may be received in any form, including acoustic input, speech input, or tactile input.
The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a user computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication (e.g., a communication network). Examples of communication networks include: Local Area Networks (LANs), Wide Area Networks (WANs), the internet, and blockchain networks.
The computer system may include a client and a server. The client and server are typically remote from each other and typically interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. The server may be a cloud server, also called a cloud computing server or cloud host, which is a host product in a cloud computing service system and overcomes the defects of difficult management and weak service scalability in traditional physical hosts and VPS ("Virtual Private Server") services. The server may also be a server of a distributed system or a server that incorporates a blockchain.
It should be appreciated that steps may be reordered, added, or deleted using the various forms of flows shown above. For example, the steps recited in the present disclosure may be performed in parallel, sequentially, or in a different order, provided that the desired results of the technical solutions of the present disclosure are achieved; no limitation is imposed herein.
The above detailed description should not be taken as limiting the scope of the present disclosure. It will be apparent to those skilled in the art that various modifications, combinations, sub-combinations and alternatives are possible, depending on design requirements and other factors. Any modifications, equivalent substitutions and improvements made within the spirit and principles of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (30)

1. A dialogue interaction method, comprising:
acquiring dialogue context information, wherein the dialogue context information is a historical dialogue record between a user and a dialogue system;
inputting the dialogue context information into a pre-trained multi-stage dialogue model to determine a target dialogue decision from a plurality of candidate dialogue decisions according to the dialogue context information;
generating query information of the target dialogue decision through the multi-stage dialogue model, and acquiring dialogue knowledge based on the query information, wherein the dialogue knowledge is knowledge used for generating dialogue reply information of the dialogue context information; and
generating the dialogue reply information of the dialogue context information according to the dialogue knowledge, so as to complete dialogue interaction.
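By way of illustration only, the claimed method can be read as a three-stage pipeline: decide, query, reply. The sketch below is not the patented implementation; the component names and methods (decide, build_query, fetch, generate_reply) are assumptions introduced for readability.

# Minimal sketch of the claimed pipeline (hypothetical component APIs).
def dialogue_interaction(dialogue_context: list[str], model, knowledge_source) -> str:
    """dialogue_context is the historical user/system turns (the dialogue context information)."""
    # Stage 1: determine a target dialogue decision from the candidate decisions.
    decision = model.decide(dialogue_context)
    # Stage 2: turn the decision into query information and acquire dialogue knowledge.
    query = model.build_query(decision, dialogue_context)
    knowledge = knowledge_source.fetch(query)
    # Stage 3: generate the dialogue reply conditioned on context and knowledge.
    return model.generate_reply(dialogue_context, knowledge)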
2. The dialogue interaction method of claim 1, wherein the multi-stage dialogue model comprises a dialogue coding layer, and the inputting the dialogue context information into the pre-trained multi-stage dialogue model to determine the target dialogue decision from the plurality of candidate dialogue decisions according to the dialogue context information comprises:
encoding the dialogue context information through the dialogue coding layer to generate a target dialogue vector corresponding to the dialogue context information; and
determining the target dialogue decision from the plurality of candidate dialogue decisions according to the target dialogue vector.
3. The dialogue interaction method of claim 2, wherein the multi-stage dialogue model comprises a dialogue decision layer, and the determining the target dialogue decision from the plurality of candidate dialogue decisions according to the target dialogue vector comprises:
determining similarities between the target dialogue vector and the candidate dialogue decisions through the dialogue decision layer, and determining a plurality of related dialogue decisions from the candidate dialogue decisions according to the similarities; and
matching the target dialogue vector with the related dialogue decisions through the dialogue decision layer to determine the target dialogue decision from the related dialogue decisions.
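As an informal illustration of claims 2-3, a coarse similarity shortlist followed by a finer match could look like the following; the cosine-similarity scoring and the top-k value are assumptions, not details taken from the claims.

import numpy as np

def shortlist_and_match(target_vec: np.ndarray, candidate_vecs: np.ndarray, top_k: int = 5) -> int:
    """Shortlist candidate decisions by similarity to the target dialogue vector, then match."""
    # Cosine similarity between the target dialogue vector and each candidate decision vector.
    sims = candidate_vecs @ target_vec / (
        np.linalg.norm(candidate_vecs, axis=1) * np.linalg.norm(target_vec) + 1e-9)
    related = np.argsort(-sims)[:top_k]  # the "related dialogue decisions"
    # A trained matcher would re-score the shortlist; here the top-scoring candidate stands in for it.
    return int(related[0])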
4. The dialogue interaction method of claim 3, wherein the multi-stage dialogue model comprises an information query layer, and the generating query information of the target dialogue decision through the multi-stage dialogue model and acquiring dialogue knowledge based on the query information comprises:
generating corresponding query information at the information query layer according to the type of the target dialogue decision, and acquiring the dialogue knowledge from a corresponding knowledge base based on the query information.
5. The dialogue interaction method of claim 4, wherein the target dialogue decision is a database query, and the generating the corresponding query information at the information query layer according to the type of the target dialogue decision and acquiring the dialogue knowledge from the corresponding knowledge base based on the query information comprises:
generating a query statement for a database through the information query layer, and querying the dialogue knowledge in the database.
6. The dialogue interaction method of claim 4, wherein the target dialogue decision is a call interface query, and the generating the corresponding query information at the information query layer according to the type of the target dialogue decision and acquiring the dialogue knowledge from the corresponding knowledge base based on the query information comprises:
generating a command statement for a call interface through the information query layer, and acquiring the dialogue knowledge based on the command statement.
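For the call-interface branch of claim 6, one reading is that the generated command statement is an HTTP request to an external interface. The endpoint URL and parameters below are hypothetical; only the standard-library calls are real.

import json
import urllib.parse
import urllib.request

def call_interface(base_url: str, params: dict) -> dict:
    """Issue the generated interface call and return the parsed response as dialogue knowledge."""
    url = base_url + "?" + urllib.parse.urlencode(params)
    with urllib.request.urlopen(url, timeout=5) as resp:
        return json.load(resp)

# Hypothetical weather interface; the information query layer would produce the URL and parameters.
# knowledge = call_interface("https://api.example.com/weather", {"city": "Beijing"})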
7. The dialogue interaction method of claim 4, wherein the multi-stage dialogue model comprises a dialogue reply layer, and the generating the dialogue reply information of the dialogue context information according to the dialogue knowledge comprises:
determining, at the dialogue decision layer, input knowledge for the dialogue reply layer from the dialogue knowledge according to the target dialogue vector of the dialogue context information; and
generating the dialogue reply information of the dialogue context information according to the input knowledge at the dialogue reply layer.
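An informal sketch of claim 7 follows: the retrieved dialogue knowledge is filtered down to the most relevant items (the input knowledge), and the reply is built from them. The scoring interface and the template reply are placeholders, not the claimed reply layer.

def reply_from_knowledge(context: list[str],
                         scored_knowledge: list[tuple[float, str]],
                         top_k: int = 3) -> str:
    """Keep the highest-scoring knowledge items as input knowledge, then build a reply."""
    input_knowledge = [text for _, text in sorted(scored_knowledge, reverse=True)[:top_k]]
    # A trained dialogue reply layer would generate fluent text conditioned on context + knowledge;
    # a simple template stands in for it here.
    return "Based on what I found: " + "; ".join(input_knowledge)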
8. The dialogue interaction method of any one of claims 1-7, wherein the multi-stage dialogue model comprises a knowledge warehouse layer configured to store the dialogue knowledge.
9. A multi-stage dialogue model training method, wherein the multi-stage dialogue model comprises a dialogue decision layer, an information query layer, and a dialogue reply layer, the method comprising:
acquiring a plurality of pieces of sample dialogue information;
training the dialogue decision layer according to the sample dialogue information to determine a target dialogue decision from a plurality of candidate dialogue decisions through the dialogue decision layer;
inputting the target dialogue decision into the information query layer, and training the information query layer to generate query information of the target dialogue decision through the information query layer; and
acquiring dialogue knowledge based on the query information, inputting the dialogue knowledge into the dialogue reply layer, and training the dialogue reply layer to generate dialogue reply information through the dialogue reply layer.
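A minimal sketch of the stage-wise schedule in claim 9 is shown below, assuming small PyTorch stand-ins for the three layers; the layer sizes, losses, and optimizer are illustrative choices, not taken from the disclosure.

from torch import nn, optim

# Toy stand-ins for the three trainable layers (real layers would be far larger).
decision_layer = nn.Linear(768, 16)        # scores over 16 candidate dialogue decisions
query_layer = nn.Linear(768 + 16, 32)      # emits a query representation
reply_layer = nn.Linear(768 + 32, 1000)    # emits reply-token logits over a toy vocabulary

def train_stage(module: nn.Module, batches, loss_fn, epochs: int = 1) -> None:
    """Train one stage while the other layers stay fixed (one simple reading of the claimed order)."""
    opt = optim.Adam(module.parameters(), lr=1e-4)
    for _ in range(epochs):
        for inputs, targets in batches:
            opt.zero_grad()
            loss = loss_fn(module(inputs), targets)
            loss.backward()
            opt.step()

# Claimed order: decision layer first, then the query layer on its outputs, then the reply layer.
# train_stage(decision_layer, decision_batches, nn.CrossEntropyLoss())
# train_stage(query_layer, query_batches, nn.MSELoss())
# train_stage(reply_layer, reply_batches, nn.CrossEntropyLoss())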
10. The multi-stage dialogue model training method of claim 9, wherein the multi-stage dialogue model comprises a dialogue coding layer, and the method further comprises:
training the dialogue coding layer according to the sample dialogue information to encode the sample dialogue information through the dialogue coding layer and generate a target dialogue vector of the sample dialogue information.
11. The multi-stage dialogue model training method of claim 9 or 10, wherein the acquiring a plurality of pieces of sample dialogue information comprises:
generating the sample dialogue information according to target tasks, wherein the target tasks comprise one or more of a generation task, a classification task, and a matching task.
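One informal way to read claim 11 is that a single dialogue can be turned into different training samples depending on the target task. The sample shapes and the placeholder labels below are assumptions for illustration only.

def make_sample(turns: list[str], task: str) -> dict:
    """Derive one training sample from a dialogue, shaped by the target task (illustrative only)."""
    if task == "generation":      # context -> next reply
        return {"input": turns[:-1], "target": turns[-1]}
    if task == "classification":  # context -> decision label (placeholder label here)
        return {"input": turns, "label": 0}
    if task == "matching":        # context paired with a candidate reply, matched or not
        return {"input": (turns[:-1], turns[-1]), "label": 1}
    raise ValueError(f"unknown task: {task}")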
12. The multi-stage dialogue model training method of claim 11, wherein the multi-stage dialogue model comprises a knowledge warehouse layer configured to store the dialogue knowledge.
13. A dialogue interaction device, comprising:
a dialogue context acquisition module configured to acquire dialogue context information, the dialogue context information being a historical dialogue record between a user and a dialogue system;
a target dialogue decision module configured to input the dialogue context information into a pre-trained multi-stage dialogue model to determine a target dialogue decision from a plurality of candidate dialogue decisions according to the dialogue context information;
a dialogue knowledge query module configured to generate query information of the target dialogue decision through the multi-stage dialogue model and acquire dialogue knowledge based on the query information, the dialogue knowledge being knowledge used for generating dialogue reply information of the dialogue context information; and
a dialogue reply generation module configured to generate the dialogue reply information of the dialogue context information according to the dialogue knowledge, so as to complete dialogue interaction.
14. The dialogue interaction device of claim 13, wherein the multi-stage dialogue model comprises a dialogue coding layer, and the target dialogue decision module is specifically configured to:
encode the dialogue context information through the dialogue coding layer to generate a target dialogue vector corresponding to the dialogue context information; and
determine the target dialogue decision from the plurality of candidate dialogue decisions according to the target dialogue vector.
15. The dialogue interaction device of claim 14, wherein the multi-stage dialogue model comprises a dialogue decision layer, and the target dialogue decision module is further configured to:
determine similarities between the target dialogue vector and the candidate dialogue decisions through the dialogue decision layer, and determine a plurality of related dialogue decisions from the candidate dialogue decisions according to the similarities; and
match the target dialogue vector with the related dialogue decisions through the dialogue decision layer to determine the target dialogue decision from the related dialogue decisions.
16. The dialogue interaction device of claim 15, wherein the multi-stage dialogue model comprises an information query layer, and the dialogue knowledge query module is specifically configured to:
generate corresponding query information at the information query layer according to the type of the target dialogue decision, and acquire the dialogue knowledge from a corresponding knowledge base based on the query information.
17. The dialogue interaction device of claim 16, wherein the target dialogue decision is a database query, and the dialogue knowledge query module is further configured to:
generate a query statement for a database through the information query layer, and query the dialogue knowledge in the database.
18. The dialogue interaction device of claim 16, wherein the target dialogue decision is a call interface query, and the dialogue knowledge query module is further configured to:
generate a command statement for a call interface through the information query layer, and acquire the dialogue knowledge based on the command statement.
19. The dialogue interaction device of claim 16, wherein the multi-stage dialogue model comprises a dialogue reply layer, and the dialogue reply generation module is configured to:
determine, at the dialogue decision layer, input knowledge for the dialogue reply layer from the dialogue knowledge according to the target dialogue vector of the dialogue context information; and
generate the dialogue reply information of the dialogue context information according to the input knowledge at the dialogue reply layer.
20. The dialogue interaction device of any one of claims 13-19, wherein the multi-stage dialogue model comprises a knowledge warehouse layer configured to store the dialogue knowledge.
21. A multi-stage dialogue model training device, wherein the multi-stage dialogue model comprises a dialogue decision layer, an information query layer, and a dialogue reply layer, the device comprising:
a sample dialogue information acquisition module configured to acquire a plurality of pieces of sample dialogue information;
a dialogue decision layer training module configured to train the dialogue decision layer according to the sample dialogue information to determine a target dialogue decision from a plurality of candidate dialogue decisions through the dialogue decision layer;
an information query layer training module configured to input the target dialogue decision into the information query layer and train the information query layer to generate query information of the target dialogue decision through the information query layer; and
a dialogue reply layer training module configured to acquire dialogue knowledge based on the query information, input the dialogue knowledge into the dialogue reply layer, and train the dialogue reply layer to generate dialogue reply information through the dialogue reply layer.
22. The multi-stage dialogue model training device of claim 21, wherein the multi-stage dialogue model comprises a dialogue coding layer, and the device further comprises a dialogue coding layer training module configured to:
train the dialogue coding layer according to the sample dialogue information to encode the sample dialogue information through the dialogue coding layer and generate a target dialogue vector of the sample dialogue information.
23. The multi-stage dialogue model training device of claim 21 or 22, further comprising a sample dialogue information generation module configured to:
generate the sample dialogue information according to target tasks, wherein the target tasks comprise one or more of a generation task, a classification task, and a matching task.
24. The multi-stage dialogue model training device of claim 23, wherein the multi-stage dialogue model comprises a knowledge warehouse layer configured to store the dialogue knowledge.
25. A dialogue interaction system, comprising:
an interaction module configured to receive dialogue context information from interaction with a user and to output dialogue reply information generated by a processing module;
the processing module, comprising a dialogue decision unit, an information query unit, and a dialogue reply unit, wherein the dialogue decision unit is configured to determine a target dialogue decision from a plurality of candidate dialogue decisions according to the dialogue context information, the information query unit is configured to generate query information of the target dialogue decision and acquire dialogue knowledge based on the query information, and the dialogue reply unit is configured to generate the dialogue reply information of the dialogue context information according to the dialogue knowledge; and
a knowledge warehouse configured to store the dialogue knowledge.
26. The dialogue interaction system of claim 25, wherein the processing module further comprises a dialogue encoding unit configured to encode the dialogue context information to generate a target dialogue vector corresponding to the dialogue context information.
27. The dialogue interaction system of claim 25, wherein the knowledge warehouse comprises a call interface library and a universal knowledge base.
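To show how the modules of claims 25-27 could fit together, here is a minimal wiring sketch; the class and attribute names are assumptions, and the decide/build_query/reply callables stand in for the trained units.

class KnowledgeWarehouse:
    """Holds a call interface library and a universal knowledge base (both hypothetical dicts here)."""
    def __init__(self, interfaces: dict, knowledge_base: dict):
        self.interfaces = interfaces
        self.knowledge_base = knowledge_base

class ProcessingModule:
    def __init__(self, decide, build_query, reply, warehouse: KnowledgeWarehouse):
        self.decide, self.build_query, self.reply, self.warehouse = decide, build_query, reply, warehouse

    def respond(self, context: list[str]) -> str:
        decision = self.decide(context)                # dialogue decision unit
        query = self.build_query(decision, context)    # information query unit
        knowledge = self.warehouse.knowledge_base.get(query, "")
        return self.reply(context, knowledge)          # dialogue reply unit

class InteractionModule:
    """Receives user turns and returns the processing module's reply."""
    def __init__(self, processor: ProcessingModule):
        self.processor = processor
        self.history: list[str] = []

    def handle(self, user_turn: str) -> str:
        self.history.append(user_turn)
        reply = self.processor.respond(self.history)
        self.history.append(reply)
        return reply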
28. An electronic device, comprising:
at least one processor; and
a memory communicatively coupled to the at least one processor; wherein
the memory stores instructions executable by the at least one processor to enable the at least one processor to perform the method of any one of claims 1-12.
29. A non-transitory computer readable storage medium storing computer instructions for causing the computer to perform the method of any one of claims 1-12.
30. A computer program product comprising a computer program which, when executed by a processor, implements the method according to any one of claims 1-12.
CN202310249564.8A 2023-03-10 2023-03-10 Dialogue interaction method, device and system, electronic equipment and storage medium Pending CN116521832A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202310249564.8A CN116521832A (en) 2023-03-10 2023-03-10 Dialogue interaction method, device and system, electronic equipment and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202310249564.8A CN116521832A (en) 2023-03-10 2023-03-10 Dialogue interaction method, device and system, electronic equipment and storage medium

Publications (1)

Publication Number Publication Date
CN116521832A true CN116521832A (en) 2023-08-01

Family

ID=87407155

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202310249564.8A Pending CN116521832A (en) 2023-03-10 2023-03-10 Dialogue interaction method, device and system, electronic equipment and storage medium

Country Status (1)

Country Link
CN (1) CN116521832A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116775848A (en) * 2023-08-23 2023-09-19 宁波吉利汽车研究开发有限公司 Control method, device, computing equipment and storage medium for generating dialogue information
CN116775848B (en) * 2023-08-23 2023-11-07 宁波吉利汽车研究开发有限公司 Control method, device, computing equipment and storage medium for generating dialogue information

Similar Documents

Publication Publication Date Title
CN110704641B (en) Ten-thousand-level intention classification method and device, storage medium and electronic equipment
CN109658928B (en) Cloud multi-mode conversation method, device and system for home service robot
WO2020177282A1 (en) Machine dialogue method and apparatus, computer device, and storage medium
CN114549874B (en) Training method of multi-target image-text matching model, image-text retrieval method and device
CN110347863A (en) Talk about art recommended method and device and storage medium
WO2020155619A1 (en) Method and apparatus for chatting with machine with sentiment, computer device and storage medium
CN114036398B (en) Content recommendation and ranking model training method, device, equipment and storage medium
JP7488871B2 (en) Dialogue recommendation method, device, electronic device, storage medium, and computer program
US20210141862A1 (en) Cognitive orchestration of multi-task dialogue system
CN113569017B (en) Model processing method and device, electronic equipment and storage medium
CN111506717B (en) Question answering method, device, equipment and storage medium
WO2023155678A1 (en) Method and apparatus for determining information
CN116775815B (en) Dialogue data processing method and device, electronic equipment and storage medium
CN113961679A (en) Intelligent question and answer processing method and system, electronic equipment and storage medium
CN115481227A (en) Man-machine interaction dialogue method, device and equipment
CN116521832A (en) Dialogue interaction method, device and system, electronic equipment and storage medium
CN113868451A (en) Cross-modal social network conversation method and device based on context cascade perception
CN113360683A (en) Method for training cross-modal retrieval model and cross-modal retrieval method and device
CN117555897A (en) Data query method, device, equipment and storage medium based on large model
CN117312641A (en) Method, device, equipment and storage medium for intelligently acquiring information
CN116414964A (en) Intelligent customer service question-answer knowledge base construction method, device, equipment and medium
CN114970666B (en) Spoken language processing method and device, electronic equipment and storage medium
CN113806541A (en) Emotion classification method and emotion classification model training method and device
CN112559727A (en) Method, apparatus, device, storage medium, and program for outputting information
CN113572679B (en) Account intimacy generation method and device, electronic equipment and storage medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination