CN109616108B - Multi-turn dialogue interaction processing method and device, electronic equipment and storage medium

Info

Publication number: CN109616108B
Application number: CN201811446940.8A
Authority: CN
Inventors: 王晓雪; 牛嘉斌; 林士翔
Original assignee: Volkswagen China Investment Co Ltd; Mobvoi Innovation Technology Co Ltd
Current assignee: Volkswagen China Investment Co Ltd; Mobvoi Innovation Technology Co Ltd
Priority date: 2018-11-29
Filing date: 2018-11-29
Publication date: 2022-05-31
Anticipated expiration: 2038-11-29
Also published as: CN109616108A

Abstract

An embodiment of the invention discloses a multi-turn dialogue interaction processing method and device, electronic equipment, and a storage medium. The method comprises: when the current dialogue turn ends, acquiring the target interaction entity determined by the current dialogue turn; updating an interaction knowledge graph according to the knowledge corresponding to the target interaction entity, wherein the interaction knowledge graph is associated with knowledge corresponding to at least one interaction entity determined by a finished dialogue turn of the multi-turn dialogue; adjusting a dictionary and/or a model used by at least one processing module of the dialogue interaction system according to the updated interaction knowledge graph; and, when dialogue information corresponding to the next dialogue turn is received, responding to that dialogue information with the adjusted dialogue interaction system. With this technical scheme, the modules of the dialogue interaction system and the dialogue context share the same semantic information and knowledge, so the dialogue can switch between tasks arbitrarily with seamless continuity.

Description

Multi-turn dialogue interaction processing method and device, electronic equipment and storage medium

Technical Field

The present invention relates to information processing technologies, and in particular, to a method and an apparatus for processing multi-turn dialog interactions, an electronic device, and a storage medium.

Background

Human-machine dialogue is a subfield of artificial intelligence. Put simply, it lets people interact with computers through human language (i.e. natural language).

From the user's dialogue input to the final response output, a dialogue interaction system usually passes through a series of processing modules: speech recognition, intent classification, semantic slot sequence labeling, dialogue management, candidate response acquisition, and language generation. The whole chain is long and complex.

In practice, the inventors found that information passed through a dialogue interaction system can become inconsistent, and that this is more likely in multi-turn dialogues, which degrades the user experience. Consider an ideal dialogue interaction:

"I want to listen to Zheng Jun's songs" -> (system plays a song) -> "Who is his lover?" -> (system answers the knowledge question about the singer Zheng Jun) -> "Acted in any TV dramas?" -> (system answers the knowledge question about Zheng Jun/Liu Yun) -> "Back to Lhasa" -> (system plays Zheng Jun's "Back to Lhasa")

The interaction above is the ideal case; in practice things may go differently. Take the user's second turn, "Who is his lover?". Although the dialogue management module can determine through coreference resolution that "he" refers to "Zheng Jun", a plain query of the knowledge graph for the entity "Zheng Jun" finds both a singer "Zheng Jun" and a physicist "Zheng Jun", and the dialogue interaction system will not necessarily pick the correct entity to answer about. As another example, for the user's fourth turn the system may misclassify the intent of "Back to Lhasa" as a navigation task.

Disclosure of Invention

In view of this, embodiments of the present invention provide a method, an apparatus, an electronic device, and a storage medium for processing multi-round dialog interactions, so as to solve the problem of missing or inconsistent information transfer in a dialog interaction system.

In order to solve the above problems, embodiments of the present invention mainly provide the following technical solutions:

in a first aspect, an embodiment of the present invention provides a method for processing multiple rounds of dialog interactions, where the method includes:

when the current conversation turn is finished, acquiring a target interaction entity determined by the current conversation turn;

updating an interaction knowledge graph according to knowledge corresponding to the target interaction entity; wherein the interaction knowledge graph is associated with knowledge corresponding to at least one interaction entity determined by a finished conversation turn in the multi-turn conversation;

adjusting a dictionary and/or a model used by at least one processing module in the dialogue interaction system according to the updated interaction knowledge graph;

and when receiving the dialogue information corresponding to the next dialogue turn, making a response corresponding to the dialogue information by using the adjusted dialogue interaction system.

In a second aspect, an embodiment of the present invention further provides a multi-turn dialog interaction processing apparatus, where the apparatus includes:

the interaction entity acquisition module is used for acquiring a target interaction entity determined by the current conversation turn when the current conversation turn is finished;

the interaction knowledge graph updating module is used for updating the interaction knowledge graph according to the knowledge corresponding to the target interaction entity; wherein the interaction knowledge graph is associated with knowledge corresponding to at least one interaction entity determined by a finished conversation turn in the multi-turn conversation;

the dialogue interaction system adjusting module is used for adjusting a dictionary and/or a model used by at least one processing module in the dialogue interaction system according to the updated interaction knowledge graph;

and the dialogue interaction system processing module is used for making a response corresponding to the dialogue information by using the adjusted dialogue interaction system when the dialogue information corresponding to the next dialogue turn is received.

In a third aspect, an embodiment of the present invention further provides an electronic device, including: at least one processor; and at least one memory and a bus connected with the processor; the processor and the memory communicate with each other through the bus; the processor is used for calling the program instructions in the memory so as to execute the multi-round dialogue interaction processing method in any embodiment of the invention.

In a fourth aspect, an embodiment of the present invention further provides a non-transitory computer-readable storage medium storing computer instructions, where the computer instructions cause the computer to execute the multi-turn dialog interaction processing method according to any embodiment of the present invention.

By the technical scheme, the technical scheme provided by the embodiment of the invention at least has the following advantages:

According to the embodiment of the invention, each time the current conversation turn ends, the target interaction entity determined by that turn is obtained, the interaction knowledge graph is updated according to the knowledge corresponding to the target interaction entity, and the dictionary and/or the model used by at least one processing module in the conversation interaction system is then adjusted according to the updated interaction knowledge graph, so that when the conversation information corresponding to the next conversation turn is received, the adjusted conversation interaction system can be used to respond. With this technical scheme, the modules of the conversation interaction system and the conversation context share the same semantic information and knowledge, so the conversation can switch between tasks arbitrarily with seamless continuity. This avoids the loss or inconsistency of semantic information across modules that results from errors introduced when semantic information is passed between the modules of the conversation interaction system.

The foregoing is only an overview of the technical solutions of the embodiments of the present invention. So that the technical means can be understood more clearly and implemented according to the description, and so that the foregoing and other objects, features, and advantages are easier to understand, specific embodiments are described in detail below.

Drawings

Various other advantages and benefits will become apparent to those of ordinary skill in the art upon reading the following detailed description of the preferred embodiments. The drawings are only for purposes of illustrating the preferred embodiments and are not to be construed as limiting the embodiments of the invention. Also, like reference numerals are used to refer to like parts throughout the drawings. In the drawings:

fig. 1 is a flowchart illustrating a multi-turn dialog interaction processing method according to an embodiment of the present invention;

fig. 2 is a flowchart illustrating a multi-turn dialog interaction processing method according to a second embodiment of the present invention;

fig. 3 is a flowchart illustrating a multi-turn dialog interaction processing method according to a third embodiment of the present invention;

fig. 4 is a schematic structural diagram illustrating a multi-turn dialogue interaction processing apparatus according to a fourth embodiment of the present invention;

fig. 5 shows a schematic structural diagram of an electronic device according to a fifth embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present disclosure will be described in more detail below with reference to the accompanying drawings. While exemplary embodiments of the present disclosure are shown in the drawings, it should be understood that the present disclosure may be embodied in various forms and should not be limited to the embodiments set forth herein. Rather, these embodiments are provided so that this disclosure will be thorough and complete, and will fully convey the scope of the disclosure to those skilled in the art.

Example one

Fig. 1 is a flowchart of a multi-turn dialog interaction processing method according to a first embodiment of the present invention. This embodiment is applicable to cases where semantic information is passed inconsistently when a dialog interaction system processes a multi-turn dialog. The method may be executed by the multi-turn dialog interaction processing apparatus provided in an embodiment of the present invention; the apparatus may be implemented in software and/or hardware and is typically integrated in a processor. As shown in fig. 1, the method of the embodiment of the present invention specifically includes:

S110, when the current conversation turn is finished, acquiring the target interaction entity determined by the current conversation turn.

When the human-machine dialogue spans multiple turns, the content of each user turn is analyzed so that semantic information stays consistent across turns, and the interaction content of the current turn serves as the basis for analyzing the user's dialogue information in the next turn, or even in several subsequent turns.

Therefore, at the end of each turn, that is, once the dialogue interaction system has responded to the user's current turn, the target interaction entity of that turn can be determined. The target interaction entity is determined by the dialogue content of the current turn and may be, for example, a person name, a place name, an organization name, a number, or a date.

For example, if the user's dialogue information in the current turn is "I want to listen to Zheng Jun's songs", then the target interaction entity of the current turn can be determined as "Zheng Jun" or "singer Zheng Jun" from "Zheng Jun" and "songs".

S120, updating the interaction knowledge graph according to knowledge corresponding to the target interaction entity; wherein the interaction knowledge graph is associated with knowledge corresponding to at least one interaction entity determined by a finished conversation turn in the multi-turn conversation.

The interaction knowledge graph is associated with the knowledge corresponding to the interaction entities determined by the conversation turns that have already finished in the multi-turn conversation. When the current conversation turn finishes, the interaction knowledge graph is updated according to the target interaction entity determined by that turn, so that the updated graph is also associated with the knowledge corresponding to the target interaction entity.

Take the following dialogue interaction as an example: "I want to listen to Zheng Jun's songs" -> (system plays a song) -> "Who is his lover?" -> (system answers the knowledge question about Zheng Jun). After the first dialogue turn ends, the interaction knowledge graph is associated with the knowledge corresponding to Zheng Jun. After the second turn ends, the graph is updated, and the updated graph is associated with the knowledge corresponding to both Zheng Jun and Zheng Jun's lover.

Note that the interaction knowledge graph is updated dynamically: it may be associated with no knowledge before the multi-turn conversation starts, and it grows as the conversation proceeds. The interaction knowledge graph may be emptied when the multi-turn conversation ends, or when, after the current conversation turn ends, no conversation information for a next turn is received within a set time period.
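The lifecycle described above can be sketched as follows. This is only a minimal illustration under assumed names: the class `InteractionKnowledgeGraph`, its `idle_timeout_s` parameter, and the representation of knowledge as (subject, relation, object) triples are hypothetical and are not specified by the embodiment.

```python
import time
from dataclasses import dataclass, field
from typing import Optional, Set, Tuple

Triple = Tuple[str, str, str]  # (subject, relation, object), an assumed format


@dataclass
class InteractionKnowledgeGraph:
    """Session-scoped knowledge that grows turn by turn (hypothetical sketch)."""
    idle_timeout_s: float = 60.0  # stands in for the "set time period"
    triples: Set[Triple] = field(default_factory=set)
    last_turn_end: Optional[float] = None

    def update(self, new_triples, now: Optional[float] = None) -> None:
        """Associate the knowledge of the entity found in the finished turn."""
        self.triples.update(new_triples)
        self.last_turn_end = time.monotonic() if now is None else now

    def clear(self) -> None:
        """Empty the graph, e.g. when the multi-turn conversation ends."""
        self.triples.clear()
        self.last_turn_end = None

    def expire_if_idle(self, now: Optional[float] = None) -> bool:
        """Empty the graph if no next turn arrived within the timeout."""
        now = time.monotonic() if now is None else now
        idle = self.last_turn_end is not None and (
            now - self.last_turn_end > self.idle_timeout_s
        )
        if idle:
            self.clear()
        return idle
```

The graph starts empty, `update` is called once per finished turn, and `expire_if_idle` would be polled (or scheduled) while waiting for the next turn.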

S130, adjusting the dictionary and/or model used by at least one processing module in the dialogue interaction system according to the updated interaction knowledge graph.

Each module in the dialogue interaction system generally needs a defined dictionary, and a model derived from that dictionary, to complete its specific task. For example, the speech recognition module recognizes the user's speech input based on a speech recognition model and a language model, and the intent classification module classifies the intent of the text output by the speech recognition module based on a general domain dictionary and a task domain dictionary.

After the dictionaries and/or models used by the processing modules in the dialogue interaction system are adjusted according to the updated interaction knowledge graph, the processing modules can share the same semantic information and knowledge.

For example, in the speech recognition module, the related extended vocabulary and language model statistics are obtained from the updated interaction knowledge graph and merged with the basic dictionary and the general language model.

As another example, in modules such as intent classification and semantic slot sequence labeling, the related extended entities are obtained from the updated interaction knowledge graph, and a domain-related dictionary is loaded dynamically for Chinese word segmentation and for feature extraction or feature weight modification in the statistical model, which improves the recognition rate of intent classification and semantic slot sequence labeling.

S140, when receiving the dialogue information corresponding to the next dialogue turn, making a response corresponding to the dialogue information by using the adjusted dialogue interaction system.

When the dialogue information corresponding to the next dialogue turn is received, it is processed by the adjusted dialogue interaction system. The turns of the dialogue thus share the same semantic information and knowledge, and the system gives a response that corresponds to the dialogue information and matches the user's intention.

Continuing the dialogue example, "I want to listen to Zheng Jun's songs" -> (system plays a song) -> "Who is his lover?" -> (system answers the knowledge question about the singer Zheng Jun) -> "Acted in any TV dramas?" -> (system answers the knowledge question about Zheng Jun/Liu Yun) -> "Back to Lhasa" -> (system plays Zheng Jun's "Back to Lhasa"): by the fourth dialogue turn, the interaction knowledge graph is associated with knowledge corresponding to "the singer Zheng Jun", "Zheng Jun's lover", and so on. When the adjusted dialogue interaction system processes the user's dialogue information in the fourth turn, it therefore gives priority to treating "Back to Lhasa" as a song sung by the singer Zheng Jun, rather than treating "Lhasa" as a place name and misclassifying "Back to Lhasa" as a navigation task.

In the multi-turn dialogue interaction processing method provided by the embodiment of the invention, each time the current dialogue turn ends, the target interaction entity determined by that turn is obtained, the interaction knowledge graph is updated according to the knowledge corresponding to the target interaction entity, and the dictionary and/or the model used by at least one processing module in the dialogue interaction system is adjusted according to the updated interaction knowledge graph, so that the processing modules share the same semantic information and knowledge. When the dialogue information corresponding to the next dialogue turn is received, the adjusted dialogue interaction system is used to respond, so that the dialogue contexts also share the same semantic information and knowledge. The dialogue can therefore switch between tasks arbitrarily with seamless continuity, and the loss or inconsistency of semantic information across modules, caused by errors introduced when semantic information is passed between modules, is avoided.
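Steps S110 to S140 can be illustrated end to end with a toy sketch. Everything in it is an assumption made for illustration: the `ToyDialogueSystem` class, the `GENERAL_KG` data, and the rule that an utterance starting with "back to" is a navigation request stand in for real processing modules, which the embodiment does not specify at this level.

```python
from typing import Optional

# Toy stand-in for a general knowledge graph (hypothetical data).
GENERAL_KG = {
    "singer Zheng Jun": {("Zheng Jun", "song", "Back to Lhasa")},
}


class ToyDialogueSystem:
    def __init__(self) -> None:
        self.interaction_kg = set()   # starts with no associated knowledge
        self.song_dictionary = set()  # dictionary of one processing module

    def end_turn(self, target_entity: Optional[str]) -> None:
        # S110: the target interaction entity of the finished turn is given.
        if target_entity is None:
            return
        # S120: update the interaction knowledge graph with its knowledge.
        self.interaction_kg |= GENERAL_KG.get(target_entity, set())
        # S130: adjust a dictionary according to the updated graph.
        self.song_dictionary = {
            obj for _subj, rel, obj in self.interaction_kg if rel == "song"
        }

    def respond(self, utterance: str) -> str:
        # S140: respond to the next turn with the adjusted system.
        if utterance in self.song_dictionary:
            return "play_song"
        if utterance.lower().startswith("back to "):
            return "navigate"  # naive default without dialogue context
        return "unknown"
```

Before any turn has finished, `respond("Back to Lhasa")` falls through to the navigation rule; after `end_turn("singer Zheng Jun")`, the same utterance is handled as a song request.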

Example two

Fig. 2 is a flowchart of a multi-turn dialog interaction processing method according to a second embodiment of the present invention. On the basis of the technical scheme, the embodiment of the invention updates the interaction knowledge map according to the knowledge corresponding to the target interaction entity, and specifically comprises the following steps:

querying a general knowledge graph according to the target interaction entity to acquire knowledge corresponding to the target interaction entity; and updating the interaction knowledge graph according to the knowledge corresponding to the target interaction entity.

As shown in fig. 2, the method provided in the embodiment of the present invention specifically includes:

S210, when the current conversation turn is finished, acquiring the target interaction entity determined by the current conversation turn.

S220, querying the general knowledge graph according to the target interaction entity, and acquiring knowledge corresponding to the target interaction entity.

After the target interaction entity is determined from the current conversation turn, the knowledge related to the target interaction entity is queried in the general knowledge graph, and this knowledge then serves as the reference for adjusting the dictionary and/or model used by a processing module in the conversation interaction system.

S230, updating the interaction knowledge graph according to the knowledge corresponding to the target interaction entity, wherein the interaction knowledge graph is associated with the knowledge corresponding to at least one interaction entity determined by a finished conversation turn in the multi-turn conversation.

The acquired knowledge related to the target interaction entity is associated with the interaction knowledge graph, and the interaction knowledge graph then serves as the reference for adjusting the dictionary and/or model used by a processing module in the conversation interaction system.

Specifically, the interaction knowledge graph may be associated with this knowledge in one of two ways. In the first, the interaction knowledge graph itself contains the knowledge corresponding to at least one interaction entity determined by a finished conversation turn in the multi-turn conversation: an interaction knowledge graph is constructed, that knowledge is added to it, and the dictionary and/or model used by at least one processing module in the conversation interaction system is then adjusted according to the knowledge in the graph.

In the second, the interaction knowledge graph contains links to the knowledge corresponding to at least one interaction entity determined by a finished conversation turn in the multi-turn conversation. The graph does not contain the knowledge directly; instead, the corresponding knowledge is retrieved through the links in the graph and then used to adjust the dictionary and/or model used by at least one processing module in the conversation interaction system.
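The two association modes can be contrasted in a short sketch. The class names `EmbeddedGraph` and `LinkedGraph`, the `GENERAL_KG` dictionary, and the use of an entity name as the "link" are all assumptions made for illustration; the embodiment does not say how links are represented.

```python
# Toy stand-in for the general knowledge graph (hypothetical data).
GENERAL_KG = {
    "singer Zheng Jun": [
        ("Zheng Jun", "lover", "Liu Yun"),
        ("Zheng Jun", "song", "Back to Lhasa"),
    ],
}


class EmbeddedGraph:
    """Mode 1: the interaction graph stores a copy of the knowledge itself."""

    def __init__(self) -> None:
        self.knowledge = {}

    def associate(self, entity: str) -> None:
        self.knowledge[entity] = list(GENERAL_KG.get(entity, []))

    def knowledge_for(self, entity: str):
        return self.knowledge.get(entity, [])


class LinkedGraph:
    """Mode 2: the interaction graph stores only links, resolved on demand."""

    def __init__(self) -> None:
        self.links = set()

    def associate(self, entity: str) -> None:
        self.links.add(entity)  # the entity key plays the role of the link

    def knowledge_for(self, entity: str):
        if entity not in self.links:
            return []
        return GENERAL_KG.get(entity, [])  # follow the link at read time
```

Both variants expose the same knowledge to the adjustment step; the embedded form trades memory for independence from the general graph, while the linked form always reflects the general graph's current content.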

S240, adjusting the dictionary and/or the model used by at least one processing module in the dialogue interaction system according to the updated interaction knowledge graph.

For example, if the determined target interaction entity is "singer Zheng Jun", all knowledge corresponding to the entity "singer Zheng Jun" in the general knowledge graph is associated with the interaction knowledge graph, such as which songs the singer Zheng Jun has sung, who his lover is, and which events he has recently taken part in.

The dictionary and/or model used by at least one processing module in the dialogue interaction system may then be adjusted based on the interaction knowledge graph. The processing modules in the dialogue interaction system comprise a voice recognition module, a semantic understanding module, a dialogue management module, and a language generation module. The semantic understanding module can be further divided into an intention classification submodule and a semantic slot sequence labeling submodule.

Each processing module in the dialogue interaction system completes its specific task based on its own dictionaries and/or models. For example, the voice recognition module performs voice recognition based on the voice recognition dictionary and the voice model; the intention classification submodule classifies intentions based on the general field dictionary and the task field dictionary; the semantic slot sequence labeling submodule labels semantic slots based on the task semantic slot dictionary and the named entity dictionary; the dialogue management module determines the computer's response operation for the user dialogue information based on the semantic slot transfer and inheritance dictionary, the task semantic slot dictionary, and the named entity dictionary; and the language generation module generates the response to the user dialogue information based on the language generation dictionary and the task field dictionary.

Taking the voice recognition module as an example, the voice recognition dictionary stores the pronunciations of all words and connects the acoustic model to the language model: once the phonemes in the user dialogue information are recognized, the corresponding words can be looked up in the voice recognition dictionary. The core of the language model is to predict the probability of a sentence or word sequence, which is used to generate the text corresponding to the user dialogue information.

Further, adjusting a lexicon and/or a model used by at least one processing module in the dialog interaction system according to the updated interaction knowledge graph, wherein the lexicon and/or the model comprises at least one of the following items:

adjusting a relevant dictionary and a language model weight in the voice recognition module according to the updated interactive knowledge map;

adjusting a relevant dictionary and an intention classification model weight in a semantic understanding module according to the updated interactive knowledge map; and

and adjusting the relevant dictionaries in the dialogue management module and the language generation module according to the updated interactive knowledge map.

For example, for the target interaction entity "singer Zheng Jun", the knowledge triple "Zheng Jun-lover-Liu Yun" is obtained from the general knowledge graph. The updated interaction knowledge graph is then associated with this triple, and the voice recognition dictionary and the language model weights in the voice recognition module are adjusted according to the interaction knowledge graph. If the voice recognition dictionary does not contain the entry "Liu Yun-liuyun", the entry is added to the voice recognition dictionary, and the language model is adjusted to raise the prediction probability of the word "Liu Yun".
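The dictionary and language model adjustment in this example can be sketched as below. This is a minimal illustration under strong assumptions: the language model is reduced to a unigram probability table, the function name `adjust_asr`, the `pronunciations` lookup, and the `boost` factor are hypothetical, and a production recognizer would adjust its models quite differently.

```python
def adjust_asr(lexicon, unigram, triples, pronunciations, boost=5.0):
    """Return adjusted copies of a pronunciation lexicon and a unigram LM.

    lexicon:        word -> pronunciation (the recognition dictionary)
    unigram:        word -> probability (toy stand-in for the language model)
    triples:        (subject, relation, object) from the interaction graph
    pronunciations: assumed external source of pronunciations for new words
    """
    new_lexicon = dict(lexicon)
    weights = dict(unigram)
    # Unseen words start from the smallest existing probability mass.
    floor = min(weights.values(), default=1.0)
    # Each entity is boosted once, however many triples mention it.
    words = {w for subj, _rel, obj in triples for w in (subj, obj)}
    for word in words:
        if word not in new_lexicon and word in pronunciations:
            new_lexicon[word] = pronunciations[word]  # add the missing entry
        weights[word] = weights.get(word, floor) * boost
    total = sum(weights.values())
    new_unigram = {w: p / total for w, p in weights.items()}  # renormalize
    return new_lexicon, new_unigram
```

Called with the triple ("Zheng Jun", "lover", "Liu Yun"), the function adds "Liu Yun" to the lexicon if a pronunciation is available and raises its relative probability, leaving the input tables unmodified.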

As another example, for the target interaction entity "singer Wang Fei", the updated interaction knowledge graph may be associated with the knowledge triple "Wang Fei-song-Red Bean". The general field dictionary and the task field dictionary in the intention classification submodule are adjusted according to the interaction knowledge graph, so that the weight with which the submodule identifies "Red Bean" as a song-field word increases and the weight with which it identifies "Red Bean" as a food-field word decreases.
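A sketch of this weight shift follows. The per-word field weights, the `relation_to_field` mapping, and the multiplicative `factor` are illustrative assumptions; the embodiment only states that one weight increases and the other decreases.

```python
def adjust_field_weights(field_weights, triples, relation_to_field, factor=2.0):
    """Shift per-word field weights toward fields supported by the graph.

    field_weights:     word -> {field: weight}, an assumed dictionary layout
    relation_to_field: maps a triple's relation to a field, e.g. song -> song
    """
    adjusted = {word: dict(scores) for word, scores in field_weights.items()}
    for _subj, relation, obj in triples:
        target = relation_to_field.get(relation)
        if target is None:
            continue  # this relation says nothing about a field
        scores = adjusted.setdefault(obj, {})
        scores[target] = scores.get(target, 1.0) * factor  # raise target field
        for name in scores:
            if name != target:
                scores[name] /= factor  # lower the competing fields
    return adjusted


def classify(word, field_weights):
    """Pick the field with the highest weight for a word."""
    scores = field_weights[word]
    return max(scores, key=scores.get)
```

With initial weights of 0.7 for food and 0.3 for song, the triple ("Wang Fei", "song", "Red Bean") moves "Red Bean" to 0.35 for food and 0.6 for song, so `classify` switches from the food field to the song field.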

S250, when the dialogue information corresponding to the next dialogue turn is received, making a response corresponding to the dialogue information by using the adjusted dialogue interaction system.

For example, when the dialogue interaction is "I want to listen to Wang Fei's songs" -> (system plays a song) -> "Red Bean" -> (system response), and the adjusted dialogue interaction system responds to the user dialogue information "Red Bean" in the second interaction turn, it plays the song "Red Bean" instead of providing knowledge about red beans as a food.

Note that the interaction knowledge graph has been associated with the knowledge corresponding to the interaction entity "singer Wang Fei" since the end of the first interaction turn. If the user switches in the n-th interaction turn to a topic unrelated to "singer Wang Fei" and then, in the (n+m)-th interaction turn, says something related to that entity, such as "Red Bean", the adjusted dialogue interaction system can still give the correct response.

In this technical scheme, the dialogue contexts share the same semantic information and knowledge, so the dialogue can switch between tasks arbitrarily with seamless continuity, and the loss or inconsistency of semantic information across modules, caused by errors introduced when semantic information is passed between the modules of the dialogue interaction system, is avoided.

Example three

Fig. 3 is a flowchart of a multi-turn dialog interaction processing method according to a third embodiment of the present invention. On the basis of the above technical solution, the embodiment of the present invention uses the adjusted dialog interaction system to make a response corresponding to the dialog information, specifically:

performing voice recognition on the dialogue information based on the adjusted related dictionary and the adjusted voice model by using a voice recognition module in the adjusted dialogue interaction system;

performing semantic analysis on the recognition result output by the voice recognition by using a semantic understanding module in the adjusted dialogue interaction system based on the adjusted related dictionary and intention classification model;

performing reference resolution, semantic slot inheritance and task parameter analysis operations on analysis results output by a semantic understanding module by using a dialog management module in the adjusted dialog interaction system based on the adjusted related dictionary and the adjusted interaction knowledge map, generating candidate responses corresponding to dialog information, and making final responses corresponding to the dialog information;

and generating natural language text corresponding to the final response based on the adjusted related dictionary by using a language generation module in the adjusted dialogue interaction system.

As shown in fig. 3, the method provided in the embodiment of the present invention specifically includes:

S310, when the current conversation turn is finished, acquiring the target interaction entity determined by the current conversation turn.

S320, querying the general knowledge graph according to the target interaction entity to acquire knowledge corresponding to the target interaction entity.

S330, updating the interaction knowledge graph according to the knowledge corresponding to the target interaction entity, wherein the interaction knowledge graph is associated with the knowledge corresponding to at least one interaction entity determined by a finished conversation turn in the multi-turn conversation.

S340, adjusting a dictionary and/or a model used by at least one processing module in the dialogue interaction system according to the updated interaction knowledge graph.

For the explanation of S310-S340, please refer to the foregoing embodiments, which are not described herein.

And S350, when receiving the dialogue information corresponding to the next dialogue turn, performing voice recognition on the dialogue information based on the adjusted related dictionary and the adjusted voice model by using the voice recognition module in the adjusted dialogue interaction system.

The voice recognition module recognizes the voice of the user into a text through the acoustic model and the language model, and after the dictionary and the language model corresponding to the voice recognition module are adjusted according to the updated interaction knowledge map, the voice recognition module can better accord with the intention of the user when recognizing the dialogue information of the next dialogue turn, thereby greatly improving the accuracy of the voice recognition.

S360, performing semantic analysis on the recognition result output by the voice recognition based on the adjusted related dictionary and intention classification model by using the semantic understanding module in the adjusted dialogue interaction system.

The intention classification submodule completes the intention classification task based on the adjusted general-domain dictionary and task-domain dictionary; the semantic slot sequence labeling submodule then completes the semantic slot sequence labeling task based on the task semantic slot dictionary and the named entity dictionary, according to the intention classification result. The named entity dictionary stores entities such as person names, place names, facility names, numbers, and dates.

Furthermore, a semantic analysis result of the semantic understanding module can be obtained. For example, from the sentence "How is the weather in Beijing tomorrow?", the semantic understanding module can obtain the following semantic result:

Domain: weather

Intent: weather inquiry

Slot: city = Beijing

Slot: date = tomorrow

When most of the knowledge associated with the interaction knowledge graph is music knowledge, that is, when most of the topics in the user's multi-turn dialogue concern music, and the user says "Chengdu", the semantic understanding module classifies "Chengdu" as a song rather than a place, and accordingly determines that the intent is to play the song.
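The "Chengdu" case can be sketched as choosing, among the possible readings of an ambiguous term, the one whose domain dominates the dialogue so far. The lexicon, the intent labels, and the majority-count rule below are illustrative assumptions; the embodiment itself only states that the dictionary and the intention classification model weights are adjusted.

```python
from collections import Counter

# Hypothetical lexicon: surface form -> candidate (entity_type, intent) readings.
LEXICON = {
    "Chengdu": [("place", "navigate"), ("song", "play_music")],
}

# Assumed mapping from entity type to dialogue domain.
TYPE_DOMAIN = {"place": "geography", "song": "music"}


def classify(utterance, domain_history):
    """Pick the reading whose domain occurs most often in the dialogue history.

    Ties keep the first reading listed in the lexicon.
    """
    counts = Counter(domain_history)
    readings = LEXICON[utterance]
    return max(readings, key=lambda reading: counts[TYPE_DOMAIN[reading[0]]])


music_heavy_history = ["music", "music", "music", "geography"]
with_context = classify("Chengdu", music_heavy_history)
without_context = classify("Chengdu", [])
```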

S370, performing reference resolution, semantic slot inheritance and cross-task parameter analysis operations on the analysis result output by the semantic understanding module, based on the adjusted related dictionary and the adjusted interaction knowledge graph, by using the dialogue management module in the adjusted dialogue interaction system; generating candidate responses corresponding to the dialogue information; and then making a final response corresponding to the dialogue information.

The session types of the candidate answers may include a task type, a chatting type and a question and answer type.

Consider the following dialogue interaction example: "I want to listen to Zheng Jun's songs" -> (the system plays a song) -> "Who is his spouse?" -> (the system answers a knowledge question about the singer Zheng Jun) -> "What TV dramas has she acted in?" -> (the system answers a knowledge question about Zheng Jun/Liu Yun) -> "Play 'Back to Lhasa'" -> (the system plays Zheng Jun's "Back to Lhasa"). For the utterance "Who is his spouse?" in the second dialogue turn, the dialogue management module in the adjusted dialogue interaction system can determine through reference resolution that "his" refers to "Zheng Jun", and can further determine through the interaction knowledge graph that "Zheng Jun" refers to "the singer Zheng Jun" rather than "the physicist Zheng Jun". It can therefore query the general knowledge graph for the answer based on "the singer Zheng Jun" and generate the correct response.
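The two steps in this example, resolving the pronoun and then selecting the right "Zheng Jun", can be sketched as follows. The records, identifiers, pronoun list, and the "most recently added entity" rule are toy assumptions for illustration only and are not taken from the embodiment.

```python
PRONOUNS = {"he", "his", "him", "she", "her"}

# Toy general knowledge graph in which two people share one name (illustrative data).
GENERAL_KG = [
    {"id": "zheng_jun_singer", "name": "Zheng Jun", "occupation": "singer", "spouse": "Liu Yun"},
    {"id": "zheng_jun_physicist", "name": "Zheng Jun", "occupation": "physicist", "spouse": None},
]


def resolve_reference(tokens, interaction_entities):
    """Return the most recently added interaction entity if a pronoun is present."""
    if interaction_entities and any(token.lower() in PRONOUNS for token in tokens):
        return interaction_entities[-1]
    return None


def lookup(entity_id, attribute):
    """Fetch one attribute of an entity from the general knowledge graph."""
    for record in GENERAL_KG:
        if record["id"] == entity_id:
            return record[attribute]
    return None


# After the first turn the interaction knowledge graph already holds the singer,
# so the shared name is no longer ambiguous in later turns.
interaction_entities = ["zheng_jun_singer"]
referent = resolve_reference("who is his spouse".split(), interaction_entities)
answer = lookup(referent, "spouse")
```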

Reference resolution, semantic slot inheritance and task parameter analysis are all highly dependent on the specific task scenario. For example, a navigation task requires a destination (to) entity, while a hotel search requires a location (location) entity. Although the semantic slot names of the two tasks differ (to/location), the dialogue management module can learn from the knowledge associated with the interaction knowledge graph that both semantic slots refer to geographic entities, which makes the ambiguity problem easier to handle.
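Semantic slot inheritance across tasks can be sketched by mapping task-specific slot names to a shared entity type. The table `SLOT_TYPES`, the task names, and the function `inherit_slots` are hypothetical; in the embodiment the shared type would be derived from knowledge associated with the interaction knowledge graph rather than from a fixed table.

```python
# Assumed mapping from (task, slot name) to a shared entity type.
SLOT_TYPES = {
    ("navigation", "to"): "geo",
    ("hotel_search", "location"): "geo",
    ("music", "artist"): "person",
}


def inherit_slots(previous_task, previous_slots, new_task, required_slots):
    """Fill required slots of the new task from previous slots of the same entity type."""
    inherited = {}
    for slot in required_slots:
        wanted = SLOT_TYPES.get((new_task, slot))
        if wanted is None:
            continue
        for old_slot, value in previous_slots.items():
            if SLOT_TYPES.get((previous_task, old_slot)) == wanted:
                inherited[slot] = value
                break
    return inherited


# The navigation destination is reused as the hotel search location.
filled = inherit_slots(
    "navigation", {"to": "Beijing West Railway Station"},
    "hotel_search", ["location"],
)

# A person entity is not inherited into a geographic slot.
not_filled = inherit_slots("music", {"artist": "Wang Feng"}, "hotel_search", ["location"])
```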

For example, in the dialogue "I want to listen to Wang Feng's songs" -> (the system plays a song) -> "Red Beans" -> (the system responds), the user utterance "Red Beans" in the second dialogue turn is ambiguous. When the dialogue management module responds to "Red Beans", it ranks the candidate responses generated for "the song Red Beans" ahead of those generated for "the food red beans" according to the associated knowledge in the interaction knowledge graph, and then outputs the candidate response generated for "the song Red Beans" as the final response.
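The ranking step in the "Red Beans" example can be sketched as sorting candidate responses by how strongly their domain is represented in the interaction knowledge graph. The candidate structure, the response texts, and the count-based score are assumptions for illustration; the embodiment does not prescribe a scoring function.

```python
def rank_candidates(candidates, domain_counts):
    """Sort candidates so that those from frequently discussed domains come first.

    sorted() is stable, so candidates with equal counts keep their original order.
    """
    return sorted(
        candidates,
        key=lambda candidate: domain_counts.get(candidate["domain"], 0),
        reverse=True,
    )


candidates = [
    {"domain": "food", "response": "Red beans are a kind of legume."},
    {"domain": "music", "response": "Playing the song Red Beans."},
]

# After the first turn the interaction knowledge graph holds one music entity.
ranked = rank_candidates(candidates, {"music": 1})
final_response = ranked[0]["response"]
```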

As another example, consider a chat task. If the dialogue system is configured with a persona whose favorite fruit is apples, and the user asks "Do you like strawberries?", the dialogue system can recognize that strawberries are a fruit and at the same time match the persona's preference for apples, so that it can give a correct and accurate response such as "I like to eat apples".

S380, generating a natural language text corresponding to the final response based on the adjusted related dictionary by using the language generation module in the adjusted dialogue interaction system.

If the dialogue management module determines that the computer's next operation is to output voice information as the response to the user's utterance, the language generation module generates a natural language text based on the adjusted related dictionary, and the text is then converted into fluent speech by a speech synthesis technology and played to the user.

According to the technical solution provided by this embodiment of the present invention, the interaction knowledge graph is used as reference information to adjust the dictionaries and/or models used by the processing modules in the dialogue interaction system, so that the semantic information passed among the modules of the dialogue interaction system remains consistent. As a result, when the adjusted dialogue interaction system refers to the interaction knowledge graph to respond to dialogue information, ambiguity is easier to resolve.

Example four

Fig. 4 is a schematic structural diagram of a multi-turn dialogue interaction processing apparatus according to the fourth embodiment of the present invention. The apparatus is suitable for situations in which semantic information is passed inconsistently when a dialogue interaction system processes multi-turn dialogues; it may be implemented in software and/or hardware and may generally be integrated in a processor. As shown in Fig. 4, the apparatus specifically includes: an interaction entity obtaining module 410, an interaction knowledge graph updating module 420, a dialogue interaction system adjusting module 430, and a dialogue interaction system processing module 440, wherein:

an interactive entity obtaining module 410, configured to obtain a target interactive entity determined by a current conversation turn when the current conversation turn is ended;

an interaction knowledge graph updating module 420, configured to update an interaction knowledge graph according to knowledge corresponding to the target interaction entity; wherein the interaction knowledge graph is associated with knowledge corresponding to at least one interaction entity determined by an ended conversation turn in a plurality of conversations;

a dialogue interaction system adjusting module 430, configured to adjust a dictionary and/or a model used by at least one processing module in the dialogue interaction system according to the updated interaction knowledge graph;

and the dialogue interaction system processing module 440 is configured to, when receiving the dialogue information corresponding to the next dialogue turn, make a response corresponding to the dialogue information using the adjusted dialogue interaction system.

The multi-turn dialogue interaction processing apparatus provided by this embodiment of the present invention acquires, each time the current dialogue turn ends, the target interaction entity determined by that turn, and updates the interaction knowledge graph according to the knowledge corresponding to the target interaction entity. It then adjusts the dictionaries and/or models used by at least one processing module in the dialogue interaction system according to the updated interaction knowledge graph, so that the processing modules share the same semantic information and knowledge. When the dialogue information corresponding to the next dialogue turn is received, the adjusted dialogue interaction system is used to make the corresponding response. In this way, the dialogue contexts share the same semantic information and knowledge, dialogue tasks can be switched arbitrarily with seamless connection, and the problem of semantic information being lost or becoming inconsistent across modules, caused by errors in passing semantic information among the modules of the dialogue interaction system, is solved.

Further, the interaction knowledge graph updating module 420 is specifically configured to query a general knowledge graph according to the target interaction entity and acquire the knowledge corresponding to the target interaction entity, and to update the interaction knowledge graph according to the knowledge corresponding to the target interaction entity.

Wherein the interaction knowledge-graph comprises knowledge corresponding to at least one interaction entity determined by an ended conversation turn in a plurality of conversations, or,

the interaction knowledge graph includes links to knowledge corresponding to at least one interactive entity determined by an ended turn of dialog in a plurality of turns of dialog.

Specifically, the processing modules in the dialogue interaction system include: a voice recognition module, a semantic understanding module, a dialogue management module and a language generation module.

Further, the dialogue interaction system adjusting module 430 specifically includes at least one of the following units:

a first adjusting unit, configured to adjust a relevant dictionary and a language model weight in the voice recognition module according to the updated interaction knowledge graph;

a second adjusting unit, configured to adjust a relevant dictionary and an intention classification model weight in the semantic understanding module according to the updated interaction knowledge graph; and

a third adjusting unit, configured to adjust the relevant dictionaries in the dialogue management module and the language generation module according to the updated interaction knowledge graph.

Specifically, the dialogue interaction system processing module 440 is configured to, when receiving dialogue information corresponding to a next dialogue turn, perform voice recognition on the dialogue information based on the adjusted related dictionary and voice model by using the adjusted voice recognition module in the dialogue interaction system;

performing semantic analysis on the recognition result output by the voice recognition based on the adjusted related dictionary and intention classification model by using an adjusted semantic understanding module in the dialogue interaction system;

performing reference resolution, semantic slot inheritance and task parameter analysis operations on the analysis result output by the semantic understanding module, based on the adjusted related dictionary and the adjusted interaction knowledge graph, by using the adjusted dialogue management module in the dialogue interaction system, generating a candidate response corresponding to the dialogue information, and further making a final response corresponding to the dialogue information;

and generating a natural language text corresponding to the final response based on the adjusted relevant dictionary by using the adjusted language generation module in the dialogue interaction system.

Wherein the session type of the candidate answer includes at least one of: task type, chatting type, and question and answer type.
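The processing chain described above (voice recognition, semantic understanding, dialogue management, language generation) can be sketched as four pluggable callables applied in sequence. The class name, field names, and the trivial stand-in functions are hypothetical; real modules would use the dictionaries and models adjusted according to the interaction knowledge graph.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class DialogueInteractionSystem:
    """Chains the four processing modules; each field is a replaceable callable."""

    recognize: Callable[[bytes], str]   # voice recognition module
    understand: Callable[[str], dict]   # semantic understanding module
    manage: Callable[[dict], dict]      # dialogue management module
    generate: Callable[[dict], str]     # language generation module

    def respond(self, audio: bytes) -> str:
        text = self.recognize(audio)
        frame = self.understand(text)
        action = self.manage(frame)
        return self.generate(action)


# Trivial stand-ins so the sketch runs; they do no real recognition or parsing.
system = DialogueInteractionSystem(
    recognize=lambda audio: audio.decode("utf-8"),
    understand=lambda text: {"intent": "play_music", "song": text[len("play "):]},
    manage=lambda frame: {"act": "confirm_play", "song": frame["song"]},
    generate=lambda action: f"Playing {action['song']}.",
)

reply = system.respond(b"play Red Beans")
```

Keeping each module behind a plain callable makes it straightforward to swap in an adjusted module after every turn without changing the rest of the chain.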

Since the multi-turn dialogue interaction processing apparatus described in this embodiment of the present invention is an apparatus capable of executing the multi-turn dialogue interaction processing method described in the embodiments of the present invention, a person skilled in the art can, based on that method, understand the specific implementation of the apparatus and its variations. How the apparatus implements the method is therefore not described in detail here. Any apparatus used by a person skilled in the art to implement the multi-turn dialogue interaction processing method in the embodiments of the present invention falls within the intended scope of protection of the present application.

Example five

An embodiment of the present invention provides an electronic device, as shown in fig. 5, including: at least one processor (processor) 51; and at least one memory (memory) 52 and a bus 53 connected to the processor 51; wherein,

the processor 51 and the memory 52 complete mutual communication through the bus 53;

the memory 52 is a non-transitory computer-readable storage medium, and can be used for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to a multi-turn dialog interaction processing method according to an embodiment of the present invention (for example, as shown in fig. 4, the interactive entity acquiring module 410, the interactive knowledge map updating module 420, the dialog interaction system adjusting module 430, and the dialog interaction system processing module 440). The processor 51 is configured to call program instructions/modules in the memory 52 to execute the steps in one of the multi-round dialog interaction processing methods in the above-described method embodiments.

The memory 52 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created according to use of the electronic device, and the like. Further, the memory 52 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid-state storage device. In some embodiments, the memory 52 may optionally include memory located remotely from the processor 51, which may be connected to the terminal device via a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.

Example six

An embodiment of the present invention provides a non-transitory computer-readable storage medium, where the non-transitory computer-readable storage medium stores computer instructions, and the computer instructions cause a computer to execute the multi-turn dialogue interaction processing method provided in the foregoing method embodiments.

As will be appreciated by one skilled in the art, embodiments of the present application may be provided as a method, system, or computer program product. Accordingly, the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present application may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, and the like) having computer-usable program code embodied therein.

The present application is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.

These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.

In a typical configuration, a computing device includes one or more processors (CPUs), input/output interfaces, network interfaces, and memory.

The memory may include forms of volatile memory in a computer readable medium, Random Access Memory (RAM) and/or non-volatile memory, such as Read Only Memory (ROM) or flash memory (flash RAM). The memory is an example of a computer-readable medium.

Computer-readable media include permanent and non-permanent, removable and non-removable media, and may implement information storage by any method or technology. The information may be computer-readable instructions, data structures, modules of a program, or other data. Examples of computer storage media include, but are not limited to, phase change memory (PRAM), Static Random Access Memory (SRAM), Dynamic Random Access Memory (DRAM), other types of Random Access Memory (RAM), Read Only Memory (ROM), Electrically Erasable Programmable Read Only Memory (EEPROM), flash memory or other memory technology, compact disc read only memory (CD-ROM), Digital Versatile Discs (DVD) or other optical storage, magnetic cassettes, magnetic tape or magnetic disk storage or other magnetic storage devices, or any other non-transmission medium that can be used to store information that can be accessed by a computing device. As defined herein, computer-readable media do not include transitory computer-readable media such as modulated data signals and carrier waves.

It should also be noted that the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising a ..." does not exclude the presence of other identical elements in the process, method, article, or apparatus that comprises the element.

The above embodiments are merely examples of the present application, and are not intended to limit the present application, and the technical features of the embodiments may be combined and arranged within the scope of the present invention. Various modifications and changes may occur to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the scope of the claims of the present application.

Claims

1. A multi-turn dialogue interaction processing method is characterized by comprising the following steps:

updating an interaction knowledge graph according to knowledge corresponding to the target interaction entity; wherein the interaction knowledge-graph is associated with knowledge corresponding to at least one interaction entity determined by an ended conversation turn in a plurality of conversations; the interactive knowledge map is dynamically updated along with the increase of the conversation content, and the interactive knowledge map does not relate to any knowledge before the start of multiple rounds of conversations;

when receiving the dialogue information corresponding to the next dialogue turn, making a response corresponding to the dialogue information by using the adjusted dialogue interaction system;

wherein, the processing module in the dialogue interaction system comprises: the system comprises a voice recognition module, a semantic understanding module, a dialogue management module and a language generation module;

the adjusting of the lexicon and/or model used by at least one processing module in the dialog interaction system according to the updated interaction knowledge graph comprises at least one of the following items:

adjusting a relevant dictionary and a language model weight in the voice recognition module according to the updated interaction knowledge graph;

adjusting a relevant dictionary and an intention classification model weight in the semantic understanding module according to the updated interactive knowledge map; and

and adjusting the relevant dictionaries in the dialogue management module and the language generation module according to the updated interactive knowledge graph.

2. The method of claim 1, wherein updating the interaction knowledge graph based on knowledge corresponding to the target interaction entity comprises:

inquiring a general knowledge graph according to the target interaction entity to acquire knowledge corresponding to the target interaction entity;

and updating the interaction knowledge graph according to the knowledge corresponding to the target interaction entity.

3. The method of claim 1 or 2, wherein the interaction knowledge graph comprises knowledge corresponding to at least one interaction entity determined by an ended dialog turn in a plurality of dialogs, or,

4. The method of claim 1, wherein making a response corresponding to the dialog information using the adjusted dialog interaction system comprises:

performing voice recognition on the dialogue information based on the adjusted relevant dictionary and the adjusted voice model by using the adjusted voice recognition module in the dialogue interaction system;

5. The method of claim 4, wherein the session type of the candidate answer comprises at least one of: task type, chatting type, and question and answer type.

6. A multi-turn dialog interaction processing apparatus, comprising:

the interactive knowledge map updating module is used for updating the interactive knowledge map according to the knowledge corresponding to the target interactive entity; wherein the interaction knowledge-graph is associated with knowledge corresponding to at least one interaction entity determined by an ended conversation turn in a plurality of conversations; the interactive knowledge map is dynamically updated along with the increase of the conversation content, and the interactive knowledge map does not relate to any knowledge before the start of multiple rounds of conversations;

the dialogue interaction system processing module is used for making a response corresponding to the dialogue information by using the adjusted dialogue interaction system when the dialogue information corresponding to the next dialogue turn is received;

the dialogue interaction system adjusting module comprises at least one of the following units:

the first adjusting unit is used for adjusting a relevant dictionary and language model weight in the voice recognition module according to the updated interaction knowledge graph;

7. An electronic device, comprising:

at least one processor;

and at least one memory and a bus connected with the processor; wherein,

the processor and the memory complete mutual communication through the bus;

the processor is used for calling the program instructions in the memory to execute the multi-turn dialogue interaction processing method of any one of the claims 1 to 5.

8. A non-transitory computer-readable storage medium storing computer instructions for causing a computer to perform the multi-turn dialog interaction processing method of any one of claims 1 to 5.