CN114297352A - Conversation state tracking method and device, man-machine conversation system and working machine - Google Patents

Conversation state tracking method and device, man-machine conversation system and working machine

Publication number: CN114297352A
Authority: CN (China)
Prior art keywords: dialog, information, conversation, coding, text
Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Application number: CN202111404933.3A
Other languages: Chinese (zh)
Inventors: 邓伟杰, 任景彪, 蒋华晨
Current Assignee: Shengjing Intelligent Technology Jiaxing Co., Ltd. (the listed assignees may be inaccurate)
Original Assignee: Shengjing Intelligent Technology Jiaxing Co., Ltd.
Application filed by Shengjing Intelligent Technology Jiaxing Co., Ltd.
Priority: CN202111404933.3A
Classifications: User Interface Of Digital Computer; Machine Translation

Abstract

The invention provides a dialog state tracking method and device, a man-machine conversation system and a working machine. The method considers not only the user input text of the current turn of dialog but also the preset dialog schema information corresponding to the dialog robot, and generates the dialog state information of the current turn by means of feature fusion. Multi-source information can therefore be fully utilized, the tracked dialog state information of the current turn is more accurate, and a good user experience in the subsequent man-machine dialog is further ensured.

Description

Conversation state tracking method and device, man-machine conversation system and working machine
Technical Field
The invention relates to the technical field of data processing, and in particular to a dialog state tracking method and device, a man-machine conversation system and a working machine.
Background
In recent years, man-machine conversation systems have been applied in more and more scenarios. Dialog state tracking is an important function of such systems: its goal is to recognize the dialog state accurately, since only an accurately recognized dialog state allows the system to respond correctly to user instructions.
Existing dialog state tracking methods applied to multi-turn dialog generally use only the semantic information input by the user in the current turn: they first encode this semantic information, predict the user intention, determine the slot type, and finally generate the slot value and return a dialog state that meets the requirements of the man-machine conversation system.
However, because such a method uses only the semantic information currently input by the user, the accuracy of the tracked dialog state may be low, which in turn may lead to a poor user experience in the subsequent man-machine dialog.
Disclosure of Invention
The invention provides a dialog state tracking method and device, a man-machine conversation system and a working machine, which are intended to overcome the above defects in the prior art.
The invention provides a dialogue state tracking method, which comprises the following steps:
acquiring a user input text of a current turn of conversation, and determining a first coding characteristic based on the user input text;
determining preset dialog outline information corresponding to the dialog robot, and determining a second coding characteristic based on the preset dialog outline information;
determining a fusion feature based on the first coding feature and the second coding feature;
and generating the dialog state information of the current turn of dialog based on the fusion characteristics.
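As an illustrative sketch only (the encoder below and the concatenation-style fusion are hypothetical stand-ins, not the trained models the invention describes), the four steps can be expressed as:

```python
from typing import List

def encode_text(text: str, dim: int = 8) -> List[float]:
    # Hypothetical stand-in for a real text encoder (a trained model in practice):
    # hash character bigrams into a fixed-size count vector.
    vec = [0.0] * dim
    for i in range(len(text) - 1):
        vec[hash(text[i:i + 2]) % dim] += 1.0
    return vec

def track_dialog_state(user_text: str, schema_info: str) -> List[float]:
    first = encode_text(user_text)     # step 1: first coding feature from user input text
    second = encode_text(schema_info)  # step 2: second coding feature from preset schema info
    fused = first + second             # step 3: fusion feature (here: simple concatenation)
    return fused                       # step 4: a decoder would map this to dialog state info
```

The returned vector stands in for the fused representation from which the dialog state information of the current turn would be generated.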
The dialog state tracking method provided by the invention further comprises the following steps:
obtaining the dialog state information of the dialog of the previous turn of the current turn of the dialog, and determining a third coding characteristic based on the dialog state information of the dialog of the previous turn;
correspondingly, the determining a fusion feature based on the first coding feature and the second coding feature specifically includes:
determining the fused feature based on the first encoding feature, the second encoding feature, and the third encoding feature.
According to the dialog state tracking method provided by the present invention, the determining of the third encoding characteristic based on the dialog state information of the previous dialog includes:
and coding the dialog state information of the previous dialog based on the dialog state map corresponding to the dialog robot to obtain the third coding feature.
According to the dialog state tracking method provided by the invention, the dialog state map is constructed based on the following method:
determining the nodes of the dialog state graph according to the user intentions, slot types and slot values corresponding to the plurality of dialog robots;
defining user intention connecting edges, slot type connecting edges and slot value connecting edges, and constructing the dialog state graph based on the nodes and these connecting edges.
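A minimal sketch of such a graph, assuming a triple-based representation (the node names and edge labels here are illustrative, not taken from the patent):

```python
class DialogStateGraph:
    """Nodes are robots, user intentions, slot types and slot values; labeled edges connect them."""

    def __init__(self):
        self.nodes = set()
        self.edges = set()  # (head, edge_label, tail) triples

    def add_intent(self, robot: str, intent: str) -> None:
        # User intention connecting edge: robot -> intent.
        self.nodes.update([robot, intent])
        self.edges.add((robot, "user_intent", intent))

    def add_slot(self, intent: str, slot_type: str, slot_value: str) -> None:
        # Slot type and slot value connecting edges.
        self.nodes.update([slot_type, slot_value])
        self.edges.add((intent, "slot_type", slot_type))
        self.edges.add((slot_type, "slot_value", slot_value))

graph = DialogStateGraph()
graph.add_intent("ticket-booking robot", "book ticket")
graph.add_slot("book ticket", "time", "January 1, 2005")
```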
According to the dialog state tracking method provided by the invention, the preset dialog outline information comprises preset user intention information, preset slot position type information and preset slot position value information;
the preset user intention information is obtained by splicing the name text of the conversation robot and the intention description text corresponding to the conversation robot;
the preset slot position type information is obtained by splicing the name text and a slot position type description text corresponding to the conversation robot;
and the preset slot value information is obtained by splicing the slot type description text and the slot value corresponding to the conversation robot.
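Assuming simple space-separated concatenation (the patent does not fix the splicing order or separator, so both are assumptions), the three pieces of preset schema information could be built as:

```python
def build_schema_info(robot_name: str, intent_desc: str,
                      slot_type_desc: str, slot_value: str):
    """Return (preset intent info, preset slot type info, preset slot value info)."""
    preset_intent = robot_name + " " + intent_desc          # name text + intention description
    preset_slot_type = robot_name + " " + slot_type_desc    # name text + slot type description
    preset_slot_value = slot_type_desc + " " + slot_value   # slot type description + slot value
    return preset_intent, preset_slot_type, preset_slot_value

info = build_schema_info("ticket-booking robot",
                         "obtain a ticket by cash or equal-value exchange",
                         "time is a representation of a sequence of events",
                         "January 1, 2005")
```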
According to the dialog state tracking method provided by the present invention, the determining a second encoding characteristic based on the preset dialog schema information specifically includes:
and respectively coding the preset user intention information, the preset slot position type information and the preset slot position value information to obtain the second coding characteristic.
According to the dialog state tracking method provided by the present invention, the determining a first encoding characteristic based on the user input text specifically includes:
acquiring a historical dialogue text of historical round dialogue, and splicing the user input text with the historical dialogue text to obtain a spliced text;
and coding the spliced text to obtain the first coding feature.
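A sketch of this step, assuming a `[SEP]`-style separator between turns (the separator and the toy encoder are assumptions; a real system would use a trained text encoder):

```python
from typing import Callable, List

def build_first_feature(user_text: str, history: List[str],
                        encode: Callable[[str], List[float]]) -> List[float]:
    # Splice the historical dialog text (user and robot turns) with the current
    # user input, then encode the spliced text to obtain the first coding feature.
    spliced = " [SEP] ".join(history + [user_text])
    return encode(spliced)

# Trivial stand-in encoder for demonstration: character count and word count.
toy_encode = lambda text: [float(len(text)), float(len(text.split()))]

feature = build_first_feature("one for tomorrow",
                              ["I want to book a ticket", "Which day?"],
                              toy_encode)
```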
According to the dialog state tracking method provided by the present invention, the generating of the dialog state information of the current turn of dialog based on the fusion feature specifically includes:
inputting the fusion feature into an attention-based long short-term memory (LSTM) neural network model to obtain the context vector output by the model;
inputting the context vector into a deep learning model to obtain the dialog state information of the current turn of dialog output by the deep learning model;
wherein the LSTM model is trained on feature samples carrying context vector labels, and the deep learning model is trained on context vector samples carrying dialog state labels.
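The attention step can be illustrated with a simplified, dependency-free stand-in (a real implementation would use the trained attention-based LSTM; the dot-product scoring and softmax below are only a sketch):

```python
import math
from typing import List

def attention_context(hidden_states: List[List[float]],
                      query: List[float]) -> List[float]:
    # Score each hidden state against the query (dot product), softmax the
    # scores, then return the attention-weighted sum as the context vector.
    scores = [sum(h * q for h, q in zip(state, query)) for state in hidden_states]
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(hidden_states[0])
    return [sum(w * state[i] for w, state in zip(weights, hidden_states))
            for i in range(dim)]
```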
The present invention also provides a dialog state tracking device, including:
the first coding module is used for acquiring a user input text of a current turn of conversation and determining a first coding characteristic based on the user input text;
the second coding module is used for determining preset dialog schema information corresponding to the dialog robot and determining a second coding feature based on the preset dialog schema information;
the fusion module is used for obtaining fusion characteristics based on the first coding characteristics and the second coding characteristics;
and the generating module is used for generating the conversation state information of the current turn of conversation based on the fusion characteristics.
The invention also provides a man-machine conversation system which comprises the conversation state tracking device.
The invention also provides a working machine comprising the man-machine conversation system.
The present invention also provides an electronic device comprising a memory, a processor and a computer program stored in the memory and runnable on the processor, wherein the processor, when executing the program, implements the steps of the dialog state tracking method according to any of the above embodiments.
The present invention also provides a non-transitory computer-readable storage medium having stored thereon a computer program which, when executed by a processor, implements the steps of the dialog state tracking method described above.
The present invention also provides a computer program product comprising a computer program which, when executed by a processor, implements the steps of the dialog state tracking method described above.
According to the dialog state tracking method and device, the man-machine conversation system and the working machine provided by the invention, the user input text of the current turn of dialog is first obtained and the first coding feature is determined from it; the preset dialog schema information corresponding to the dialog robot is then determined and the second coding feature is determined from it; a fusion feature is obtained from the first and second coding features; and finally the dialog state information of the current turn of dialog is generated from the fusion feature. Because the method considers not only the user input text of the current turn but also the preset dialog schema information corresponding to the dialog robot, and generates the dialog state information of the current turn through feature fusion, multi-source information can be fully utilized, the tracked dialog state information of the current turn is more accurate, and a good user experience in the subsequent man-machine dialog is further ensured.
Drawings
In order to more clearly illustrate the technical solutions of the present invention or of the prior art, the drawings needed for describing the embodiments or the prior art are briefly introduced below. It is obvious that other drawings can be obtained from these drawings by those skilled in the art without creative effort.
FIG. 1 is a flow chart illustrating a dialog state tracking method according to the present invention;
FIG. 2 is a schematic structural diagram of a dialog state tracking device according to the present invention;
fig. 3 is a schematic structural diagram of an electronic device provided in the present invention.
Detailed Description
In order to make the objects, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is obvious that the described embodiments are some, but not all embodiments of the present invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
Because the existing dialog state tracking methods applied to multi-turn dialog use only the semantic information currently input by the user, the accuracy of the tracked dialog state is low, and the user experience in the subsequent man-machine dialog may therefore be poor. To solve these problems in the prior art, an embodiment of the invention provides a dialog state tracking method.
Fig. 1 is a schematic flowchart of a dialog state tracking method according to an embodiment of the present invention, as shown in fig. 1, the method includes:
s1, acquiring a user input text of the current turn of conversation, and determining a first coding characteristic based on the user input text;
s2, determining preset dialog outline information corresponding to the dialog robot, and determining a second coding feature based on the preset dialog outline information;
s3, obtaining a fusion feature based on the first coding feature and the second coding feature;
and S4, generating the dialog state information of the current turn of dialog based on the fusion characteristics.
Specifically, the dialog state tracking method provided in the embodiment of the present invention is applied to dialog state tracking in a multi-turn dialog scenario. A dialog state is a representation of one or more phases of a dialog in which a user provides input to the dialog robot. The input provided by the user may be in voice or text form, which is not specifically limited in the embodiment of the present invention. When the input is in voice form, the dialog robot first converts it to text form for subsequent processing.
The execution subject of the method is a dialog state tracking device. The device may be configured in a server, which may be a local server or a cloud server; the local server may specifically be a computer, a tablet computer, a smart phone or the like, which is not specifically limited in the embodiment of the present invention.
Step S1 is executed to obtain the user input text of the current turn of dialog, where the current turn of dialog may be any turn of dialog in multiple turns of dialog, for example, the current turn of dialog may be an initial turn of dialog, a final turn of dialog, or a turn of dialog between the initial turn of dialog and the final turn of dialog, and this is not limited in this respect.
The user input text is a textual representation of the input provided by the user in the current turn of dialog which, as described above, may be in voice or text form. When the input is in voice form, it may be converted to text to obtain the user input text; when it is already in text form, it can be used directly as the user input text.
After obtaining the user input text, a first encoding characteristic may be determined from the user input text. The first coding feature refers to a text coding feature, and the first coding feature may be obtained by directly coding a text input by a user, or may be obtained by coding the text input by the user after performing corresponding processing, and is not specifically limited herein.
Step S2 is executed to determine the preset dialog schema information corresponding to the dialog robot, i.e. the robot involved in the multi-turn dialog. The preset dialog schema information is the key information the dialog robot needs in order to carry out its dialog function, and may include, for example, preset user intention information, preset slot type information and preset slot value information. The preset user intention information is the information related to user intentions that corresponds to the dialog robot, the preset slot type information is the corresponding information related to slot types, and the preset slot value information is the corresponding information related to slot values. The preset dialog schema information relates only to the dialog robot, is independent of the user input text, and belongs to the attribute information of the dialog robot.
After the preset dialog schema information corresponding to the dialog robot is determined, the second encoding feature may be determined according to the preset dialog schema information corresponding to the dialog robot. The second coding feature refers to a preset dialog schema coding feature, and the second coding feature may be obtained by directly coding the preset dialog schema information, or by coding the preset dialog schema information after performing corresponding processing, which is not limited herein.
It is understood that the execution sequence of steps S1 and S2 can be set as required: step S2 can be executed after step S1, step S1 can be executed after step S2, or the two steps can be executed simultaneously, which is not limited in detail herein.
Then, step S3 is executed to obtain a fused feature based on the first encoding feature and the second encoding feature. In the embodiment of the present invention, the first coding feature and the second coding feature may be input to the fusion model, respectively, to obtain the fusion feature output by the fusion model. The fusion model can be used for fusing the first coding feature and the second coding feature and coding to obtain a fusion feature.
The fusion model may be a deep learning model, and may include, but is not limited to, a Long Short-Term Memory (LSTM) model, a Convolutional Neural Network (CNN) model, an attention model and the like. The fusion model may be obtained through supervised or unsupervised training, which is not specifically limited in the embodiment of the invention.
Finally, step S4 is executed to generate the dialog state information of the current turn of dialog from the fusion feature. When generating this information, the fusion feature may first be decoded, and the dialog state information of the current turn may then be determined by applying a trained neural network model to the decoding result. The dialog state information of the current turn likewise includes information such as the user intention, slot type and slot value.
The dialog state tracking method provided by the embodiment of the invention thus first obtains the user input text of the current turn of dialog and determines the first coding feature from it; then determines the preset dialog schema information corresponding to the dialog robot and determines the second coding feature from it; obtains a fusion feature from the first and second coding features; and finally generates the dialog state information of the current turn of dialog from the fusion feature. Because both the user input text of the current turn and the preset dialog schema information of the dialog robot are considered, and the dialog state information is generated through feature fusion, multi-source information can be fully utilized, the tracked dialog state information of the current turn is more accurate, and a good user experience in the subsequent man-machine dialog is further ensured.
On the basis of the foregoing embodiment, the dialog state tracking method provided in the embodiment of the present invention further includes:
obtaining the dialog state information of the dialog of the previous turn of the current turn of the dialog, and determining a third coding characteristic based on the dialog state information of the dialog of the previous turn;
correspondingly, the determining a fusion feature based on the first coding feature and the second coding feature specifically includes:
determining the fused feature based on the first encoding feature, the second encoding feature, and the third encoding feature.
Specifically, in the embodiment of the present invention, the dialog state information of the turn of dialog preceding the current turn may also be obtained. The previous turn of dialog is the turn performed immediately before the current turn. The dialog state information of each turn can be characterized jointly by the user intention, the slot type and the slot value. The user intention indicates the function the user wants the dialog robot to perform, such as "booking tickets" or "querying", i.e. the domain the dialog relates to. The slot type represents the attribute type of a slot, such as "time", "place" or "person". The slot value represents the specific information of a slot, such as "January 1, 2005", "Beijing" or "Zhang".
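As an illustrative data structure only (the patent does not prescribe a concrete representation), a turn's dialog state might be held as:

```python
from dataclasses import dataclass, field
from typing import Dict

@dataclass
class DialogState:
    """Dialog state of one turn: user intention plus slot type -> slot value pairs."""
    intent: str
    slots: Dict[str, str] = field(default_factory=dict)

previous_state = DialogState(intent="booking tickets",
                             slots={"time": "January 1, 2005", "place": "Beijing"})
```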
After the session state information of the previous session is obtained, the third encoding characteristic may be determined according to the session state information of the previous session. The third coding feature refers to a dialog state coding feature, and the third coding feature may be obtained by directly coding the dialog state information of the previous dialog, or may be obtained by coding the dialog state information of the previous dialog after performing corresponding processing, and is not limited in this respect.
It is understood that the order of the step of determining the third encoding feature relative to steps S1 and S2 may be set as required: the three steps may be performed in any order, or simultaneously, which is not limited in detail herein.
Further, when the fusion feature is determined, the first coding feature, the second coding feature, and the third coding feature may be combined, that is, the first coding feature, the second coding feature, and the third coding feature may be respectively input to the fusion model, so as to obtain the fusion feature output by the fusion model. The fusion model can be used for fusing and coding the first coding feature, the second coding feature and the third coding feature to obtain a fusion feature.
It will be appreciated that the fusion model has two inputs and one output if the first coding feature is fused with the second coding feature, and three inputs and one output if the first coding feature, the second coding feature and the third coding feature are fused.
In the embodiment of the invention, in addition to the user input text of the current turn and the preset dialog schema information corresponding to the dialog robot, the dialog state information of the previous turn is also considered. Multi-source information can thus be exploited even more fully, the accuracy of the dialog state information of the current turn is improved, and the user experience in the subsequent man-machine dialog is further improved.
On the basis of the foregoing embodiment, in the dialog state tracking method provided in the embodiment of the present invention, the preset dialog schema information includes preset user intention information, preset slot type information, and preset slot value information;
the preset user intention information is obtained by splicing the name text of the conversation robot and the intention description text corresponding to the conversation robot;
the preset slot position type information is obtained by splicing the name text and a slot position type description text corresponding to the conversation robot;
and the preset slot value information is obtained by splicing the slot type description text and the slot value corresponding to the conversation robot.
Specifically, in the embodiment of the present invention, the preset dialog schema information corresponding to the dialog robot may include preset user intention information, preset slot type information, and preset slot value information.
The preset user intention information may be obtained by concatenating the name text of the dialog robot with the intention description text corresponding to the dialog robot. The name text is a textual representation of the robot's name, such as "ticket-booking robot" or "query robot". The intention description text is a textual representation of the description of the user intention corresponding to the dialog robot, i.e. its detailed explanation. For example, if the dialog robot is a "ticket-booking robot" and the corresponding user intention is "booking tickets", the intention description text may be "obtaining a ticket by cash or equal-value exchange", where tickets may include performance tickets, airplane tickets, train tickets, movie tickets, meal tickets, grain coupons, admission tickets, exhibition tickets, bus tickets and the like.
The intention description text can be queried from a pre-constructed intention dictionary in which each intention and its corresponding detailed interpretation information are stored.
The method for splicing the name text of the conversation robot and the intention description text corresponding to the conversation robot may be set as required, for example, the name text and the intention description text may be spliced in sequence, which is not specifically limited in the embodiment of the present invention.
The preset slot type information may be obtained by concatenating the name text with the slot type description text corresponding to the dialog robot. The slot type description text is a textual representation of the description of the slot type, i.e. its detailed explanation. For example, if a slot type corresponding to the dialog robot is "time", the slot type description text may be "time is the representation of the continuity and sequence of the motion and change of matter".
The slot type description text can be obtained by searching a pre-constructed slot type dictionary, and each slot type and the corresponding detailed explanation information are stored in the slot type dictionary.
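Both lookups can be pictured as plain dictionaries (the entries below are illustrative, not from the patent):

```python
# Pre-built dictionaries mapping each intent / slot type to its detailed explanation.
INTENT_DICT = {
    "booking tickets": "obtaining a ticket by cash or equal-value exchange",
    "querying": "looking up information on the user's behalf",
}
SLOT_TYPE_DICT = {
    "time": "time is the representation of the continuity and sequence "
            "of the motion and change of matter",
}

def describe(intent: str, slot_type: str):
    """Return (intention description text, slot type description text)."""
    return INTENT_DICT.get(intent, ""), SLOT_TYPE_DICT.get(slot_type, "")
```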
The splicing manner of the name text and the slot type description text corresponding to the conversation robot may be set as required, for example, the name text and the slot type description text may be spliced in sequence, which is not specifically limited in the embodiment of the present invention.
The preset slot value information may be obtained by concatenating the slot type description text with a slot value corresponding to the dialog robot, where the slot value is the concrete information filling a slot of that slot type; for example, a slot value corresponding to the dialog robot may be "January 1, 2005".
The splicing manner of the slot type description text and the slot position value corresponding to the conversation robot may be set according to needs, for example, the slots may be spliced in sequence, which is not specifically limited in the embodiment of the present invention.
In the embodiment of the invention, text concatenation is introduced when determining the preset user intention information, the preset slot type information and the preset slot value information, and the semantic information contained in the user intention, slot type and slot value is taken into account. The generalization ability of dialog state tracking is therefore stronger and its extensibility higher, and the method also applies when a new intention or a slot type in a new domain is created.
On the basis of the foregoing embodiment, the dialog state tracking method provided in an embodiment of the present invention, where the determining a second encoding characteristic based on the preset dialog schema information specifically includes:
and respectively coding the preset user intention information, the preset slot position type information and the preset slot position value information to obtain the second coding characteristic.
Specifically, in the embodiment of the present invention, when determining the second coding feature, the preset user intention information, the preset slot type information and the preset slot value information may be encoded separately, yielding a first encoding result, a second encoding result and a third encoding result respectively; together, these three results constitute the second coding feature.
Here, the encoding may be implemented by using a text encoder, that is, the preset user intention information, the preset slot type information, and the preset slot position value information may be respectively input to the text encoder, so as to obtain a second encoding characteristic output by the text encoder.
The text encoder may include, but is not limited to, models that can semantically encode words and sentences, such as TF-IDF, Word2vec, BERT and other BERT-like models. The text encoder may be obtained by supervised training on information samples carrying encoding feature labels, or by unsupervised training on information samples to be encoded, which is not specifically limited here.
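A minimal dependency-free stand-in for such an encoder (a real system would use TF-IDF, Word2vec or a BERT-style model; the bag-of-words scheme below is only a sketch):

```python
from collections import Counter
from typing import List

class BagOfWordsEncoder:
    """Minimal stand-in for the text encoder; counts vocabulary words in the input."""

    def __init__(self, vocabulary: List[str]):
        self.vocabulary = list(vocabulary)

    def encode(self, text: str) -> List[float]:
        counts = Counter(text.lower().split())
        return [float(counts[word]) for word in self.vocabulary]

encoder = BagOfWordsEncoder(["ticket", "time", "place"])
```

Each of the three preset information strings would be passed through `encode` to obtain its part of the second coding feature.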
In the embodiment of the invention, the preset user intention information, the preset slot position type information and the preset slot position value information are respectively coded to obtain the second coding characteristics, so that the diversity and the richness of the second coding characteristics can be ensured.
On the basis of the foregoing embodiment, in the dialog state tracking method provided in an embodiment of the present invention, determining the first coding feature based on the user input text specifically includes:
acquiring the historical dialog text of the historical turns of dialog, and splicing the user input text with the historical dialog text to obtain a spliced text;
encoding the spliced text to obtain the first coding feature.
Specifically, in the embodiment of the present invention, when determining the first coding feature from the user input text, the historical dialog text of the historical turns of dialog may be obtained first. The historical turns of dialog are all turns before the current turn of dialog, and the historical dialog text is the dialog text of those turns, which may include both user input text and the feedback text of the dialog robot.
The user input text of the current turn is then spliced with the historical dialog text to obtain a spliced text, which is encoded to obtain the first coding feature. Here, the spliced text may be input to a text encoder, which outputs the first coding feature.
In the embodiment of the invention, splicing the user input text of the current turn with the historical dialog text makes full use of the historical dialog text and improves the reliability of the dialog state tracking result.
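The splice-then-encode step can be sketched as below. The sample turns, the speaker tags, the `[SEP]` separator, and the toy encoder are all illustrative assumptions; the patent does not prescribe a separator format.

```python
def encode_text(text: str, dim: int = 16) -> list:
    """Toy hashed bag-of-words encoder standing in for a real text encoder."""
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    return vec

# Historical turns: (speaker, text) pairs for every turn before the current one,
# covering both user input and the dialog robot's feedback.
history = [
    ("user", "my excavator shows fault code E001"),
    ("robot", "which model is the excavator"),
]
user_input = "it is the SY215 model"

# Splice the history and the current user input into one text.
spliced_text = " [SEP] ".join(
    [f"{speaker}: {text}" for speaker, text in history] + [f"user: {user_input}"]
)
first_coding_feature = encode_text(spliced_text)
```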
On the basis of the foregoing embodiment, in the dialog state tracking method provided in an embodiment of the present invention, determining the third coding feature based on the dialog state information of the previous turn of dialog includes:
encoding the dialog state information of the previous turn of dialog based on the dialog state graph corresponding to the dialog robot to obtain the third coding feature.
Specifically, in the embodiment of the present invention, when the third coding feature is determined, the dialog state information of the previous turn of dialog may be encoded according to the dialog state graph corresponding to the dialog robot, yielding the third coding feature. The dialog state graph is a pre-constructed graph representing the associations among the user intentions, slot types, and slot values corresponding to each dialog robot. In combination with the dialog state graph, the dialog state information of the previous turn can be encoded with a graph neural network model, including but not limited to GCN, GAT, and GraphSAGE.
In the embodiment of the invention, when the third coding feature is determined, the associations among the user intentions, slot types, and slot values corresponding to each dialog robot are taken into account through the dialog state graph, which improves the accuracy of dialog state tracking.
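A minimal, pure-Python sketch of this graph-based encoding: one round of mean-aggregation message passing (a simplified stand-in for the GCN/GAT/GraphSAGE models named above) over a tiny dialog state graph. The graph, the 2-dimensional node features, and the final mean pooling are all toy assumptions.

```python
graph = {  # adjacency: node -> neighbors (intent <-> slot type <-> slot value)
    "intent:query_fault": ["slot_type:fault_code"],
    "slot_type:fault_code": ["intent:query_fault", "slot_value:E001"],
    "slot_value:E001": ["slot_type:fault_code"],
}
features = {  # toy 2-d node features; first component marks "active" nodes
    "intent:query_fault": [1.0, 0.0],
    "slot_type:fault_code": [1.0, 0.0],
    "slot_value:E001": [0.0, 1.0],
}

def gcn_layer(graph, features):
    """Replace each node's feature with the mean over itself and its neighbors."""
    out = {}
    for node, neighbors in graph.items():
        group = [features[node]] + [features[n] for n in neighbors]
        out[node] = [sum(col) / len(group) for col in zip(*group)]
    return out

encoded = gcn_layer(graph, features)
# Pool the node features into a single third coding feature for the state.
third_coding_feature = [
    sum(v[i] for v in encoded.values()) / len(encoded) for i in range(2)
]
```

A real implementation would use learned weight matrices and several layers; this sketch only shows how neighbor information flows through the graph structure.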
On the basis of the above embodiment, in the dialog state tracking method provided in the embodiment of the present invention, the dialog state graph is constructed based on the following method:
determining the nodes of the dialog state graph according to the user intentions, slot types, and slot values corresponding to the plurality of dialog robots;
defining user-intention connecting edges, slot-type connecting edges, and slot-value connecting edges, and constructing the dialog state graph based on the nodes, the user-intention connecting edges, the slot-type connecting edges, and the slot-value connecting edges.
Specifically, in the embodiment of the present invention, the dialog state graph used in determining the third coding feature may be constructed as follows. First, the nodes of the dialog state graph are determined according to the user intentions, slot types, and slot values corresponding to the plurality of dialog robots; that is, these user intentions, slot types, and slot values serve as the nodes. Then, three types of connecting edges of the dialog state graph are defined: user-intention connecting edges, slot-type connecting edges, and slot-value connecting edges. Finally, the dialog state graph is constructed from the determined nodes and the three defined kinds of connecting edges.
The embodiment of the invention thus provides a construction method for the dialog state graph, which ensures the feasibility of the scheme.
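The construction steps above can be sketched as follows. The two toy robots, their schemas, and the way each edge type links nodes (intent to slot type, slot type to slot value, slot value back to intent) are assumptions made for illustration; the patent does not specify which node pairs each edge type connects.

```python
robots = {
    "excavator_bot": {"intent": "query_fault",
                      "slots": {"fault_code": ["E001", "E002"]}},
    "crane_bot": {"intent": "book_service",
                  "slots": {"service_date": ["tomorrow"]}},
}

nodes, edges = set(), []
for name, schema in robots.items():
    intent = f"intent:{schema['intent']}"
    nodes.add(intent)
    for slot_type, values in schema["slots"].items():
        type_node = f"slot_type:{slot_type}"
        nodes.add(type_node)
        # user-intention connecting edge: intent <-> slot type (assumed linkage)
        edges.append(("intent_edge", intent, type_node))
        for value in values:
            value_node = f"slot_value:{value}"
            nodes.add(value_node)
            # slot-type connecting edge: slot type <-> slot value
            edges.append(("slot_type_edge", type_node, value_node))
            # slot-value connecting edge: slot value <-> intent (assumed linkage)
            edges.append(("slot_value_edge", value_node, intent))

dialog_state_graph = {"nodes": nodes, "edges": edges}
```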
On the basis of the foregoing embodiment, in the dialog state tracking method provided in an embodiment of the present invention, generating the dialog state information of the current turn of dialog based on the fusion feature specifically includes:
inputting the fusion feature into a long short-term memory (LSTM) artificial neural network model based on an attention mechanism to obtain the context vector output by the model;
inputting the context vector into a deep learning model to obtain the dialog state information of the current turn of dialog output by the deep learning model;
where the LSTM artificial neural network model is trained on feature samples carrying context vector labels, and the deep learning model is trained on context vector samples carrying dialog state labels.
Specifically, in the embodiment of the present invention, when generating the dialog state information of the current turn of dialog from the fusion feature, the fusion feature may first be decoded. Decoding may be implemented by inputting the fusion feature into an attention-based Long Short-Term Memory (LSTM) model; the context vector output by the LSTM model is the decoding result.
The context vector is then input into the deep learning model, which outputs the dialog state information of the current turn of dialog. The deep learning model may be, for example, an LSTM model, an attention model, a Transformer model, or a pointer network model.
Here, the LSTM model may be trained on feature samples carrying context vector labels, and the deep learning model may be trained on context vector samples carrying dialog state labels.
In the embodiment of the invention, the dialog state information of the current turn of dialog is tracked through the deep learning model, which ensures the reliability and accuracy of the tracking result.
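The decoding stage can be illustrated with a minimal attention step that condenses the fusion feature into a context vector, followed by a toy scoring head standing in for the trained deep learning model. The fixed query vector, the candidate states, and all weights are assumptions; the patent trains an attention-based LSTM and a separate labeled model instead.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

# Fusion feature represented as one vector per fused source
# (e.g. text encoding, schema encoding, previous-state encoding).
fused_features = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
query = [1.0, 0.0]  # toy decoder query (in an LSTM this comes from the hidden state)

# Attention: score each feature against the query, then mix the features.
scores = softmax([sum(q * f for q, f in zip(query, feat))
                  for feat in fused_features])
context_vector = [sum(w * feat[i] for w, feat in zip(scores, fused_features))
                  for i in range(2)]

# Toy "deep learning model": pick the candidate state best aligned
# with the context vector.
candidates = {"intent=query_fault": [1.0, 0.2], "intent=book_service": [0.0, 1.0]}
dialog_state = max(candidates,
                   key=lambda c: sum(x * y for x, y in zip(candidates[c],
                                                           context_vector)))
```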
On the basis of the foregoing embodiment, the dialog state tracking method provided in the embodiment of the present invention further includes:
acquiring a user portrait text, and encoding the user portrait text to obtain a user portrait feature;
fusing the first coding feature, the second coding feature, the third coding feature, and the user portrait feature to obtain a final fusion feature;
generating the dialog state information of the current turn of dialog based on the final fusion feature.
Specifically, in the embodiment of the present invention, the user portrait text is a textual representation of the user-portrait information related to the current turn of dialog. The user portrait text may be input to a text encoder, which encodes it to obtain the user portrait feature. The first coding feature, the second coding feature, the third coding feature, and the user portrait feature are then fused to obtain the final fusion feature.
The final fusion feature can be regarded as the result of further fusing the user portrait feature with the fusion feature obtained by fusing the first, second, and third coding features.
In the embodiment of the invention, when the user is well modeled, the user portrait text can be introduced to enrich the fusion feature and further improve the accuracy of the tracked dialog state information of the current turn of dialog.
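The final fusion step can be sketched as below, using simple concatenation as a stand-in for whatever fusion the trained model performs. The three fixed coding features, the portrait text, and the toy encoder are all illustrative assumptions.

```python
def encode_text(text: str, dim: int = 4) -> list:
    """Toy hashed bag-of-words encoder standing in for a real text encoder."""
    vec = [0.0] * dim
    for token in text.lower().split():
        vec[hash(token) % dim] += 1.0
    return vec

# Toy stand-ins for the three coding features described above.
first_coding_feature = [0.1, 0.2, 0.3, 0.4]    # from user input + history
second_coding_feature = [0.5, 0.5, 0.0, 0.0]   # from the dialog schema
third_coding_feature = [0.0, 0.0, 1.0, 0.0]    # from the previous dialog state

user_portrait_feature = encode_text("experienced operator prefers voice replies")

# Fuse by concatenation; the current-turn state is then generated from this.
final_fused_feature = (first_coding_feature + second_coding_feature
                       + third_coding_feature + user_portrait_feature)
```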
On the basis of the above embodiments, the dialog state tracking method provided in the embodiments of the present invention can, when the data volume is large, directly obtain a dense dialog state tracking vector; that is, the dialog state information of the current turn of dialog is represented by the dialog state tracking vector, so there is no need to extract specific dialog state information such as the user intention, slot type, and slot value. This saves computation and reduces the tracking cost.
As shown in fig. 2, on the basis of the above embodiment, an embodiment of the present invention provides a dialog state tracking apparatus, including:
the first encoding module 21 is configured to acquire a user input text of a current turn of a dialog, and determine a first encoding characteristic based on the user input text;
the second encoding module 22 is configured to determine the preset dialog schema information corresponding to the dialog robot, and determine a second coding feature based on the preset dialog schema information;
a fusion module 23, configured to determine a fusion feature based on the first coding feature and the second coding feature;
and the generating module 24 is configured to generate the dialog state information of the current turn of dialog based on the fusion feature.
On the basis of the foregoing embodiment, the dialog state tracking apparatus provided in the embodiment of the present invention further includes a third encoding module, configured to:
obtaining the dialog state information of the dialog of the previous turn of the current turn of the dialog, and determining a third coding characteristic based on the dialog state information of the dialog of the previous turn;
correspondingly, the fusion module is specifically configured to:
determining the fused feature based on the first encoding feature, the second encoding feature, and the third encoding feature.
On the basis of the foregoing embodiment, in the dialog state tracking apparatus provided in the embodiment of the present invention, the preset dialog schema information includes preset user intention information, preset slot type information, and preset slot value information;
the preset user intention information is obtained by splicing the name text of the conversation robot and the intention description text corresponding to the conversation robot;
the preset slot position type information is obtained by splicing the name text and a slot position type description text corresponding to the conversation robot;
and the preset slot value information is obtained by splicing the slot type description text and the slot value corresponding to the conversation robot.
On the basis of the foregoing embodiment, in the dialog state tracking apparatus provided in the embodiment of the present invention, the second encoding module is specifically configured to:
and respectively coding the preset user intention information, the preset slot position type information and the preset slot position value information to obtain the second coding characteristic.
On the basis of the foregoing embodiment, in the dialog state tracking apparatus provided in the embodiment of the present invention, the first encoding module is specifically configured to:
acquiring a historical dialogue text of historical round dialogue, and splicing the user input text with the historical dialogue text to obtain a spliced text;
and coding the spliced text to obtain the first coding feature.
On the basis of the foregoing embodiment, in the dialog state tracking apparatus provided in the embodiment of the present invention, the third encoding module is specifically configured to:
encode the dialog state information of the previous turn of dialog based on the dialog state graph corresponding to the dialog robot to obtain the third coding feature.
On the basis of the foregoing embodiment, the dialog state tracking apparatus provided in the embodiment of the present invention further includes a graph building module, configured to:
determine the nodes of the dialog state graph according to the user intentions, slot types, and slot values corresponding to the plurality of dialog robots;
define user-intention connecting edges, slot-type connecting edges, and slot-value connecting edges, and construct the dialog state graph based on the nodes and these connecting edges.
On the basis of the foregoing embodiment, in the dialog state tracking apparatus provided in the embodiment of the present invention, the generating module is specifically configured to:
input the fusion feature into a long short-term memory (LSTM) artificial neural network model based on an attention mechanism to obtain the context vector output by the model;
input the context vector into a deep learning model to obtain the dialog state information of the current turn of dialog output by the deep learning model;
where the LSTM artificial neural network model is trained on feature samples carrying context vector labels, and the deep learning model is trained on context vector samples carrying dialog state labels.
Specifically, the functions of the modules in the dialog state tracking apparatus provided in the embodiment of the present invention correspond one-to-one to the steps of the method embodiments above, with the same implementation effects.
On the basis of the foregoing embodiments, an embodiment of the present invention provides a human-machine conversation system, which includes the conversation state tracking device provided in the foregoing embodiments, so as to track the conversation state information of the current turn of conversation through the conversation state tracking device.
On the basis of the above embodiments, an embodiment of the present invention provides a working machine including the human-machine conversation system provided in the above embodiments, so that human-machine conversation is implemented through that system and the dialog state information of the current turn of dialog can be tracked during the conversation.
Fig. 3 illustrates a physical structure diagram of an electronic device. As shown in fig. 3, the electronic device may include: a processor 310, a communication interface 320, a memory 330, and a communication bus 340, where the processor 310, the communication interface 320, and the memory 330 communicate with each other via the communication bus 340. The processor 310 may call logic instructions in the memory 330 to perform the dialog state tracking method provided in the above embodiments, the method including: acquiring the user input text of the current turn of dialog, and determining a first coding feature based on the user input text; determining the preset dialog schema information corresponding to the dialog robot, and determining a second coding feature based on the preset dialog schema information; determining a fusion feature based on the first coding feature and the second coding feature; and generating the dialog state information of the current turn of dialog based on the fusion feature.
In addition, the logic instructions in the memory 330 may be implemented in the form of software functional units and stored in a computer readable storage medium when the software functional units are sold or used as independent products. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
In another aspect, the present invention also provides a computer program product. The computer program product includes a computer program that can be stored on a non-transitory computer-readable storage medium; when executed by a processor, the computer program can perform the dialog state tracking method provided in the above embodiments, the method including: acquiring the user input text of the current turn of dialog, and determining a first coding feature based on the user input text; determining the preset dialog schema information corresponding to the dialog robot, and determining a second coding feature based on the preset dialog schema information; determining a fusion feature based on the first coding feature and the second coding feature; and generating the dialog state information of the current turn of dialog based on the fusion feature.
In yet another aspect, the present invention also provides a non-transitory computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the dialog state tracking method provided in the above embodiments, the method including: acquiring the user input text of the current turn of dialog, and determining a first coding feature based on the user input text; determining the preset dialog schema information corresponding to the dialog robot, and determining a second coding feature based on the preset dialog schema information; determining a fusion feature based on the first coding feature and the second coding feature; and generating the dialog state information of the current turn of dialog based on the fusion feature.
The above-described embodiments of the apparatus are merely illustrative, and the units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units, may be located in one place, or may be distributed on a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of the present embodiment. One of ordinary skill in the art can understand and implement it without inventive effort.
Through the above description of the embodiments, those skilled in the art will clearly understand that each embodiment can be implemented by software plus a necessary general hardware platform, and certainly can also be implemented by hardware. With this understanding in mind, the above-described technical solutions may be embodied in the form of a software product, which can be stored in a computer-readable storage medium such as ROM/RAM, magnetic disk, optical disk, etc., and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the methods described in the embodiments or some parts of the embodiments.
Finally, it should be noted that: the above examples are only intended to illustrate the technical solution of the present invention, but not to limit it; although the present invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical solutions described in the foregoing embodiments may still be modified, or some technical features may be equivalently replaced; and such modifications or substitutions do not depart from the spirit and scope of the corresponding technical solutions of the embodiments of the present invention.

Claims (10)

1. A dialog state tracking method, comprising:
acquiring a user input text of a current turn of conversation, and determining a first coding characteristic based on the user input text;
determining preset dialog schema information corresponding to the dialog robot, and determining a second coding characteristic based on the preset dialog schema information;
determining a fusion feature based on the first coding feature and the second coding feature;
and generating the dialog state information of the current turn of dialog based on the fusion characteristics.
2. The dialog state tracking method of claim 1 further comprising:
obtaining the dialog state information of the dialog of the previous turn of the current turn of the dialog, and determining a third coding characteristic based on the dialog state information of the dialog of the previous turn;
correspondingly, the determining a fusion feature based on the first coding feature and the second coding feature specifically includes:
determining the fused feature based on the first encoding feature, the second encoding feature, and the third encoding feature.
3. The method for tracking dialog state of claim 2, wherein the determining a third encoding characteristic based on the dialog state information of the previous dialog turn comprises:
encoding the dialog state information of the previous turn of dialog based on the dialog state graph corresponding to the dialog robot to obtain the third coding characteristic.
4. The dialog state tracking method of claim 3, wherein the dialog state graph is constructed based on:
determining the nodes of the dialog state graph according to the user intentions, slot types, and slot values corresponding to the plurality of dialog robots;
defining user-intention connecting edges, slot-type connecting edges, and slot-value connecting edges, and constructing the dialog state graph based on the nodes and these connecting edges.
5. The dialog state tracking method according to any one of claims 1 to 4, wherein the preset dialog schema information includes preset user intention information, preset slot type information, and preset slot value information;
the preset user intention information is obtained by splicing the name text of the conversation robot and the intention description text corresponding to the conversation robot;
the preset slot position type information is obtained by splicing the name text and a slot position type description text corresponding to the conversation robot;
and the preset slot value information is obtained by splicing the slot type description text and the slot value corresponding to the conversation robot.
6. The dialog state tracking method of any of claims 1-4 wherein determining a first coding feature based on the user-entered text comprises:
acquiring a historical dialogue text of historical round dialogue, and splicing the user input text with the historical dialogue text to obtain a spliced text;
and coding the spliced text to obtain the first coding feature.
7. The dialog state tracking method according to any one of claims 1 to 4, wherein generating the dialog state information of the current turn of dialog based on the fusion feature specifically comprises:
inputting the fusion characteristics into a long short-term memory (LSTM) artificial neural network model based on an attention mechanism to obtain the context vector output by the model;
inputting the context vector into a deep learning model to obtain the dialog state information of the current turn of dialog output by the deep learning model;
wherein the LSTM artificial neural network model is trained on feature samples carrying context vector labels, and the deep learning model is trained on context vector samples carrying dialog state labels.
8. A dialog state tracking device, comprising:
the first coding module is used for acquiring a user input text of a current turn of conversation and determining a first coding characteristic based on the user input text;
the second coding module is configured to determine preset dialog schema information corresponding to the dialog robot, and determine second coding characteristics based on the preset dialog schema information;
the fusion module is used for obtaining fusion characteristics based on the first coding characteristics and the second coding characteristics;
and the generating module is used for generating the conversation state information of the current turn of conversation based on the fusion characteristics.
9. A human-computer dialog system comprising a dialog state tracking device according to claim 8.
10. A work machine comprising a human-machine dialog system as claimed in claim 9.
CN202111404933.3A 2021-11-24 2021-11-24 Conversation state tracking method and device, man-machine conversation system and working machine Pending CN114297352A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111404933.3A CN114297352A (en) 2021-11-24 2021-11-24 Conversation state tracking method and device, man-machine conversation system and working machine

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111404933.3A CN114297352A (en) 2021-11-24 2021-11-24 Conversation state tracking method and device, man-machine conversation system and working machine

Publications (1)

Publication Number Publication Date
CN114297352A true CN114297352A (en) 2022-04-08

Family

ID=80966065

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111404933.3A Pending CN114297352A (en) 2021-11-24 2021-11-24 Conversation state tracking method and device, man-machine conversation system and working machine

Country Status (1)

Country Link
CN (1) CN114297352A (en)

Cited By (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117216230A (en) * 2023-11-09 2023-12-12 智慧眼科技股份有限公司 AI psychological doctor dialogue interactive processing method, system, terminal and medium


Similar Documents

Publication Publication Date Title
CN112100349B (en) Multi-round dialogue method and device, electronic equipment and storage medium
Bapna et al. Sequential dialogue context modeling for spoken language understanding
CN110111780B (en) Data processing method and server
CN111241237B (en) Intelligent question-answer data processing method and device based on operation and maintenance service
CN111309889A (en) Method and device for text processing
CN111708869B (en) Processing method and device for man-machine conversation
CN111428025B (en) Text summarization method and device, electronic equipment and storage medium
EP4113357A1 (en) Method and apparatus for recognizing entity, electronic device and storage medium
CN112860862A (en) Method and device for generating intelligent body dialogue sentences in man-machine dialogue
CN114880441A (en) Visual content generation method, device, system, equipment and medium
CN117521675A (en) Information processing method, device, equipment and storage medium based on large language model
CN115952272A (en) Method, device and equipment for generating dialogue information and readable storage medium
US20230094730A1 (en) Model training method and method for human-machine interaction
CN116644168A (en) Interactive data construction method, device, equipment and storage medium
CN112507103A (en) Task type dialogue and model training method, device, equipment and storage medium
CN113486659B (en) Text matching method, device, computer equipment and storage medium
CN115114419A (en) Question and answer processing method and device, electronic equipment and computer readable medium
CN114297352A (en) Conversation state tracking method and device, man-machine conversation system and working machine
CN116522905B (en) Text error correction method, apparatus, device, readable storage medium, and program product
CN115481222A (en) Training of semantic vector extraction model and semantic vector representation method and device
CN116662495A (en) Question-answering processing method, and method and device for training question-answering processing model
CN116775815B (en) Dialogue data processing method and device, electronic equipment and storage medium
CN116108918A (en) Training method and related device for dialogue pre-training model
CN114860869A (en) Controllable universal dialogue model with generalized intentions
CN113705194A (en) Extraction method and electronic equipment for short

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination