CN113158062A - User intention identification method and device based on heterogeneous graph neural network - Google Patents


Info

Publication number
CN113158062A
Authority
CN
China
Prior art keywords
user
neural network
sentence
intention
recognition result
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110502094.2A
Other languages
Chinese (zh)
Inventor
郑海涛
王栋
李自然
沈颖
肖喜
江勇
夏树涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shenzhen International Graduate School of Tsinghua University
Original Assignee
Shenzhen International Graduate School of Tsinghua University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shenzhen International Graduate School of Tsinghua University filed Critical Shenzhen International Graduate School of Tsinghua University
Priority to CN202110502094.2A priority Critical patent/CN113158062A/en
Publication of CN113158062A publication Critical patent/CN113158062A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9535Search customisation based on user profiles and personalisation
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/211Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/044Recurrent networks, e.g. Hopfield networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Machine Translation (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and a device for identifying user intentions based on a heterogeneous graph neural network. The intention recognition device directly recognizes the user intention from the conversation content to obtain a preliminary recognition result; the historical information screening device screens the user's historical information according to the conversation content and the preliminary recognition result to obtain the historical information related to the conversation content; and the recognition result adjusting device adjusts the recognition result by combining the conversation content, the preliminary recognition result and the related historical information, so as to recognize the user intention more accurately. The invention addresses the problem that traditional deep learning methods cannot effectively recognize users' personalized expressions: it designs a two-stage intention recognition strategy and uses the heterogeneous graph neural network together with the user's historical speech to recognize the user's personalized speech, which improves the accuracy of intention recognition and in turn helps the system produce accurate replies.

Description

User intention identification method and device based on heterogeneous graph neural network
Technical Field
The invention relates to the technical field of computer application, computer systems and technical products thereof, in particular to a user intention identification method and device based on a heterogeneous graph neural network.
Background
Natural language understanding is one of the core topics of artificial intelligence. A computer is used to simulate the process of language interaction between humans, so that it can understand and use human natural language and provide information services such as answering questions and querying data. Conventional natural language understanding techniques have been applied in various fields, including search systems, question-answering systems and dialogue systems. A search system identifies the user intention from the user input and returns the most relevant content. Examples include search engines such as Baidu, Google and Bing, as well as in-site search; even when users shop for products they like on Jingdong (JD.com) and Taobao, they cannot do without a search system. Question-answering systems and dialogue systems, such as intelligent customer service and chat robots, are also gradually appearing in daily life. The rapid development of natural language understanding technology enables computers to understand user intentions more accurately and thus to provide users with convenient and fast service.
Conventional natural language understanding techniques generally parse user input with rule templates written manually by domain experts. As a result, the models do not generalize: a search system built for the automotive field cannot be applied to the computer field, and each system takes a great deal of time and effort to design. In addition, because natural language is colloquial and diverse, input text is usually irregular and may even contain spelling or grammatical errors. Models based on rule templates are therefore not robust and require accurate user input for recognition, which is inconvenient for users.
In recent years, natural language understanding techniques based on deep learning have received a great deal of attention from both academia and industry and have shown great commercial potential. Compared with methods based on rule templates, deep-learning-based methods build end-to-end models in a data-driven way, which reduces manual intervention during model training and makes it easier to transfer a model from one field to another. Traditional deep-learning-based methods generally use a sequence model to capture the semantics of utterances, which overlooks the flexible and frequent interaction among multiple users in a conversation; to address this, neural networks have been adopted to model the conversation. However, because natural language is arbitrary and diverse, people express their intentions in various personalized ways, which makes it harder to recognize the user's intention accurately.
Disclosure of Invention
The invention aims to make up for the defect that the prior art cannot accurately identify the personalized expression of a user, and provides a user intention identification method and device based on a heterogeneous graph neural network.
The invention is realized by the following technical scheme:
a user intention recognition device based on a heterogeneous graph neural network comprises an intention recognition device, a historical information screening device and a recognition result adjusting device;
the intention recognition device directly recognizes the intention of the user according to the conversation content of the user to obtain a preliminary recognition result;
the historical information screening device screens the user historical information according to the conversation content and the preliminary recognition result to obtain the user historical information related to the conversation content;
the recognition result adjusting device uses a heterogeneous graph neural network to encode the current conversation content and the user history information related to the conversation content, so as to adjust the preliminary recognition result and recognize the final user intention.
The intention recognition device directly recognizes the intention of the user according to the conversation content to obtain a preliminary recognition result, specifically as follows: the intention recognition device converts the dialogue content into corresponding feature vectors and obtains a preliminary recognition result for each sentence in the dialogue content.
The conversion of the dialogue content into feature vectors and the preliminary recognition of each sentence proceed as follows: the intention recognition device uses a convolutional neural network and a bidirectional long short-term memory network to obtain sentence feature vectors containing context information, and uses a fully connected layer as a classifier to classify each sentence in the dialogue content, thereby obtaining a preliminary recognition result for each sentence.
The user history information refers to the historical speech of the user in other conversations.
The history information screening device screens the user history information according to the conversation content and the primary recognition result to obtain the user history information related to the conversation content, and the history information screening device specifically comprises the following steps:
the history information screening device first screens the user's history information with a coarse-grained history selection module: discrete words in the text are converted into continuous word vector representations using pre-trained word encodings, and the vector representation of the current conversation content and that of each historical speech of the user are obtained by max pooling; the cosine similarity between the two representations is calculated to obtain a relevance score between the user's historical speech and the current conversation content; the user's historical speeches are classified by intention category, and for each category the top K historical speeches most relevant to the current conversation content are selected according to the relevance score.
A relevance recalculation module then calculates the relevance scores between each sentence in the current conversation content and the K historical speeches in each category to obtain a similarity matrix, and uses the preliminary recognition result and the similarity matrix to recalculate the relevance scores between the K historical speeches in each category and the current conversation content as a whole.
The heterogeneous graph neural network comprises two types of nodes: sentence nodes and label nodes. The sentence nodes are represented by the sentence feature vectors obtained by the intention recognition device, and each label node is represented by the average, weighted by relevance score, of the vectors of the K historical speeches most relevant to the current conversation content obtained by the historical information screening device.
The heterogeneous graph neural network comprises two types of edges: user edges and label edges. The user edges represent the relations between sentence nodes, and their weights are initialized by a similarity-based attention module. The label edges connect label nodes and sentence nodes; their weights represent the relation between a sentence node and the different labels, and are initialized with the preliminary recognition result obtained by the intention recognition device. The node representations are updated using a message-passing strategy specific to each edge relation. Each sentence node is then classified by a fully connected layer according to the sentence node representation obtained by the heterogeneous graph neural network, yielding the adjusted intention recognition result for each sentence.
In the conversation content, the user edges are divided into 4 types according to the speaker and the speaking order: an earlier sentence by the same user, a later sentence by the same user, an earlier sentence by another user, and a later sentence by another user.
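The four user-edge types can be illustrated with a short sketch. The type names (`self_before`, `self_after`, `other_before`, `other_after`), the function names and the choice to connect every pair of sentences are assumptions made for illustration; the text only states that edges are typed by speaker and speaking order.

```python
from typing import List, Tuple

def user_edge_type(speakers: List[str], i: int, j: int) -> str:
    """Type of the user edge carrying information from sentence j to sentence i."""
    if i == j:
        raise ValueError("user edges connect two different sentences")
    who = "self" if speakers[i] == speakers[j] else "other"
    when = "before" if j < i else "after"
    return f"{who}_{when}"

def build_user_edges(speakers: List[str]) -> List[Tuple[int, int, str]]:
    """Return (source j, target i, edge type) for every ordered pair of sentences."""
    n = len(speakers)
    return [
        (j, i, user_edge_type(speakers, i, j))
        for i in range(n)
        for j in range(n)
        if i != j
    ]

# Hypothetical three-turn dialogue: user A, then user B, then user A again.
edges = build_user_edges(["A", "B", "A"])
```

For the third sentence, the edge from the first sentence has type `self_before` and the edge from the second has type `other_before`.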
A user intention identification method based on a heterogeneous graph neural network comprises the following specific steps:
S1, directly recognizing the user intention according to the dialogue content of the user to obtain a preliminary recognition result;
S2, screening the user history information according to the dialogue content and the preliminary recognition result obtained in step S1 to obtain the user history information related to the dialogue content;
S3, encoding the current dialogue content and the user history information related to the dialogue content obtained in step S2 with a heterogeneous graph neural network, and thereby adjusting the preliminary recognition result obtained in step S1 to recognize the final user intention.
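The control flow of steps S1 to S3 can be sketched as a composition of three stages. Everything here is a hypothetical stand-in: the function names and the toy rules replace the neural components described in the text, and only the order of the stages and the data passed between them follow the method.

```python
from typing import Callable, List

def identify_intents(
    dialogue: List[str],
    history: List[str],
    recognize: Callable,
    screen: Callable,
    adjust: Callable,
) -> List[str]:
    preliminary = recognize(dialogue)                 # S1: preliminary recognition
    related = screen(dialogue, preliminary, history)  # S2: history screening
    return adjust(dialogue, preliminary, related)     # S3: result adjustment

# Toy stand-ins (assumptions): punctuation-based labels and word-overlap screening.
def toy_recognize(dialogue):
    return ["question" if s.endswith("?") else "statement" for s in dialogue]

def toy_screen(dialogue, preliminary, history):
    words = {w for s in dialogue for w in s.lower().split()}
    return [h for h in history if words & set(h.lower().split())]

def toy_adjust(dialogue, preliminary, related):
    # A "statement" that opens like a related historical question is relabelled.
    starts = {h.split()[0].lower() for h in related if h.endswith("?")}
    return [
        "question" if label == "statement" and s.split()[0].lower() in starts else label
        for s, label in zip(dialogue, preliminary)
    ]

result = identify_intents(
    ["any update on my order", "it ships tomorrow"],
    ["any news today?", "thanks a lot"],
    toy_recognize, toy_screen, toy_adjust,
)
```

In this toy run the first sentence is initially labelled a statement and is relabelled a question once the related historical utterance is taken into account, which mirrors the two-stage idea on a very small scale.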
The invention has the advantages that:
the invention addresses the problem that traditional deep learning methods cannot effectively recognize users' personalized expressions. It designs a two-stage intention recognition strategy: the user intention is first recognized preliminarily from the context of the conversation, and a heterogeneous graph neural network then combines the user's current conversation with the user's historical speech in other conversations to recognize the intention more accurately. The invention also designs a noise reduction mechanism to avoid the noise introduced by irrelevant historical information, which further improves the accuracy of recognizing users' personalized expressions.
In some embodiments of the method, the heterogeneous graph neural network is combined with the historical speech of the user, so that the personalized speech of the user is effectively identified, the accuracy of intention identification is improved, and further, the help is provided for the system to make an accurate reply.
Drawings
FIG. 1 is a block diagram of an apparatus according to an embodiment of the present invention.
Fig. 2 is a schematic diagram of an intent recognition apparatus according to an embodiment of the present invention.
FIG. 3 is a schematic diagram of a coarse-grained history selection module according to an embodiment of the invention.
Fig. 4 is a block diagram of a correlation recalculation module and an identification result adjustment apparatus according to an embodiment of the present invention.
FIG. 5 is a flowchart of a method according to an embodiment of the present invention.
Detailed Description
In a user intention recognition device based on a heterogeneous graph neural network, the intention recognition device 1 directly recognizes the user intention according to the conversation content to obtain a preliminary recognition result; the historical information screening device 2 screens the user historical information according to the conversation content and the preliminary recognition result to obtain the user historical information related to the conversation content; and the recognition result adjusting device 3 adjusts the recognition result according to the conversation content, the preliminary recognition result and the related historical information, so as to recognize the user intention more accurately.
The present invention comprises a two-stage intention recognition process: preliminary recognition and recognition result adjustment. In the preliminary recognition stage, the user intention is recognized from the context alone, so the user's personalized expressions cannot be recognized and the result has certain defects. In the recognition result adjustment stage, the user's historical speech is used to learn the user's personalized expressions to some extent, and the preliminary recognition result is adjusted accordingly, so that the user intention can be recognized more accurately.
The invention aims to solve the problem of user intention recognition in a dialogue system: given a user input and a segment of conversation history, the method automatically recognizes the user intention, and the system generates a corresponding reply according to the recognized intention, so that the whole conversation is continuous and smooth. The method can also be applied to an emotion analysis system: by analyzing the user's current emotional state, the user's underlying needs can be understood, targeted measures can be taken, and products better suited to the user's needs can be designed. The specific embodiment is as follows:
as shown in FIG. 1, the present invention is composed of an intention recognition device 1, a historical information screening device 2 and a recognition result adjusting device 3. The intention recognition device 1 directly recognizes the intention of the user according to the dialogue content of the user to obtain a preliminary recognition result; the historical information screening device 2 screens the user historical information according to the conversation content and the preliminary recognition result to obtain the user historical information related to the conversation content; the recognition result adjusting device 3 encodes the current dialogue content and the user history information related to the dialogue content with a heterogeneous graph neural network, and thereby adjusts the preliminary recognition result to recognize the final user intention. The details of the implementation of each device are as follows:
Intention recognition device 1:
the intention recognition device 1 converts the user dialogue into corresponding feature vectors and obtains a preliminary intention recognition result for each sentence. Since the input arrives sentence by sentence, the intention recognition device extracts sentence features with a convolutional neural network (CNN) and converts each input sentence into a feature vector. To model the context information between sentences, the intention recognition device uses a bidirectional recurrent network structure whose units are long short-term memory (LSTM) cells. The LSTM handles sequence information effectively and mitigates the vanishing and exploding gradient problems in deep learning. The LSTM yields sentence feature vectors containing context information; a fully connected layer (FFN) is then used as a classifier to classify each sentence in the conversation, giving a preliminary intention recognition result for each sentence.
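The sentence feature extraction step (convolution over word vectors followed by max pooling) can be illustrated with a minimal pure-Python sketch. The function name `conv_max_pool`, the filter layout and the ReLU applied before pooling are assumptions for illustration; the text specifies only a CNN with max pooling.

```python
from typing import List

Matrix = List[List[float]]

def conv_max_pool(words: Matrix, filters: List[Matrix]) -> List[float]:
    """words: L x d_e word vectors of one sentence; each filter: k x d_e.

    Returns one feature per filter: the largest ReLU activation over all
    windows of k consecutive words (max-over-time pooling).
    """
    features = []
    for filt in filters:
        k = len(filt)
        if len(words) < k:
            raise ValueError("sentence is shorter than the filter width")
        best = 0.0  # ReLU floor: negative activations never exceed this
        for start in range(len(words) - k + 1):
            window = words[start:start + k]
            act = sum(
                w * f
                for w_row, f_row in zip(window, filt)
                for w, f in zip(w_row, f_row)
            )
            best = max(best, act)
        features.append(best)
    return features

# Hypothetical sentence of 3 words with 2-dimensional word vectors.
sentence = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
filters = [
    [[1.0, 0.0], [0.0, 1.0]],  # width-2 filter
    [[-1.0, -1.0]],            # width-1 filter
]
u_t = conv_max_pool(sentence, filters)
```

The number of filters plays the role of the sentence vector dimension, since each filter contributes exactly one pooled feature.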
As shown in FIG. 2, the intention recognition device 1 first uses a pre-trained word encoding matrix $e$ to convert the discrete words $x_t$ of each sentence in the dialogue into continuous encoding vectors $U_t \in \mathbb{R}^{L \times d_e}$:
$$U_t = e(x_t)$$
where the dialogue consists of $N$ sentences $\{x_1, x_2, \ldots, x_N\}$, $t$ denotes the $t$-th sentence in the dialogue, $L$ is the sentence length, and $d_e$ is the word encoding dimension. Then $U_t$ is input into a convolutional neural network (CNN) with max pooling to extract the local features of the sentence, giving the sentence representation $u_t \in \mathbb{R}^{d_c}$:
$$u_t = \mathrm{CNN}(U_t)$$
where $d_c$ is the dimension of the sentence vector output by the CNN. Then $\{u_1, u_2, \ldots, u_N\}$ is input into a bidirectional long short-term memory network (BiLSTM) to obtain sentence representations containing context information:
$$\overrightarrow{h_t} = \overrightarrow{\mathrm{LSTM}}(u_t, \overrightarrow{h_{t-1}})$$
$$\overleftarrow{h_t} = \overleftarrow{\mathrm{LSTM}}(u_t, \overleftarrow{h_{t+1}})$$
$$h_t = [\overrightarrow{h_t}, \overleftarrow{h_t}]$$
where the BiLSTM forward cell $\overrightarrow{\mathrm{LSTM}}$ and backward cell $\overleftarrow{\mathrm{LSTM}}$ produce the forward representation $\overrightarrow{h_t}$ and the backward representation $\overleftarrow{h_t}$, which are concatenated to obtain the context-aware sentence representation $h_t$ of the $t$-th sentence in the conversation; $d_h$ is the hidden layer dimension of the BiLSTM. Finally, the local feature representation $u_t$ and the context-aware representation $h_t$ of the sentence are input into the classifier to obtain the preliminary recognition result of the user intention:
$$p_t = W_\alpha [u_t, h_t] + b_\alpha$$
where $[u_t, h_t]$ denotes the concatenation of $u_t$ and $h_t$, $W_\alpha$ and $b_\alpha$ are parameters to be learned, $p_t \in \mathbb{R}^{S}$ is the preliminary recognition result of the $t$-th sentence in the dialogue, $S$ is the number of labels, and $y_t$ denotes the normalized result.
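The preliminary classifier above amounts to a linear layer applied to the concatenation of the two sentence representations. The sketch below shows this in plain Python; the function names are hypothetical, and the use of a sigmoid for the normalized result is an assumption, chosen only because the final classifier described later uses a sigmoid.

```python
import math
from typing import List, Tuple

def linear(W: List[List[float]], x: List[float], b: List[float]) -> List[float]:
    """Compute W x + b for a weight matrix given as a list of rows."""
    return [sum(w * v for w, v in zip(row, x)) + b_i for row, b_i in zip(W, b)]

def preliminary_result(
    u_t: List[float], h_t: List[float],
    W_alpha: List[List[float]], b_alpha: List[float],
) -> Tuple[List[float], List[float]]:
    """Return (p_t, y_t): raw scores over the S labels and their normalized form."""
    p_t = linear(W_alpha, u_t + h_t, b_alpha)  # list "+" concatenates [u_t, h_t]
    y_t = [1.0 / (1.0 + math.exp(-p)) for p in p_t]  # assumed normalization
    return p_t, y_t

# Hypothetical sizes: d_c = 2, context representation of size 1, S = 2 labels.
p_t, y_t = preliminary_result(
    [1.0, 2.0], [0.5],
    [[1.0, 0.0, 0.0], [0.0, 1.0, -2.0]],
    [-1.0, 0.0],
)
```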
History information screening apparatus 2:
the historical information screening device 2 screens the historical information of the user according to the conversation content and the primary recognition result, and avoids introducing a large amount of noise information to damage the system performance. The user history information refers to the historical speech of the user in other conversations, and the historical speech contains the expression habit of the user and is helpful for identifying the personalized expression of the user.
To screen the content related to the current conversation out of a large amount of user history information, the history information screening device first uses a coarse-grained history selection module. Discrete words in the text are converted into continuous word vector representations using pre-trained word encodings, and the vector representation of the current conversation and that of each historical speech of the user are obtained by max pooling. The cosine similarity between the two representations is calculated to obtain a relevance score between the user's historical speech and the current conversation. The user's historical speeches are also classified by intention category, and for each category the top K historical speeches most relevant to the current conversation are selected according to the relevance score.
However, the above is only a coarse-grained filtering process. To calculate the relevance between the user's historical speech and the current conversation more accurately, the history information screening device uses a relevance recalculation module 4 to calculate the relevance scores between each sentence in the conversation and the K historical speeches in each category, obtaining a similarity matrix. Using the preliminary recognition result and the similarity matrix, the device recalculates the relevance scores between the K historical speeches in each category and the current conversation as a whole.
The history information screening device 2 screens the history speech of the user according to the conversation content and the primary recognition result obtained by the intention recognition device, and mainly comprises two modules: a coarse-grained history selection module and a relevance recalculation module 4.
As shown in FIG. 3, the coarse-grained history selection module first selects from the database all the historical speeches of the users participating in the current conversation, according to the list of those users. The current dialogue and each user historical speech are then converted into continuous vectors $U \in \mathbb{R}^{N \times L \times d_e}$ and $U^{h} \in \mathbb{R}^{L_h \times d_e}$ using the pre-trained word encoding matrix, where $N$ is the number of sentences in the conversation, $L$ is the maximum length of a sentence in the dialogue, $L_h$ is the length of the user's historical speech, and $d_e$ is the dimension of the word encoding matrix. Max pooling gives the representation of the current dialogue $c \in \mathbb{R}^{d_e}$ and the representation of the user's historical speech $c^{h} \in \mathbb{R}^{d_e}$, and the cosine similarity between the two gives the relevance between the user's historical speech and the current conversation:
$$\mathrm{score}(c, c^{h}) = \frac{c \cdot c^{h}}{\lVert c \rVert \, \lVert c^{h} \rVert}$$
and then classifying the historical speeches of the user according to the intention categories, and selecting the top K historical speeches which are most relevant to the current conversation for each category according to the relevance scores.
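The coarse-grained selection (max pooling, cosine similarity, then top K per intention category) can be sketched as follows. The function names, the flattening of the dialogue into one list of word vectors and the `(label, word_vectors)` layout of the history are assumptions for illustration.

```python
import math
from collections import defaultdict
from typing import Dict, List, Tuple

Vector = List[float]

def max_pool(vectors: List[Vector]) -> Vector:
    """Element-wise maximum over a list of word vectors."""
    return [max(column) for column in zip(*vectors)]

def cosine(a: Vector, b: Vector) -> float:
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0

def select_top_k(
    dialogue_words: List[Vector],
    history: List[Tuple[str, List[Vector]]],
    k: int,
) -> Dict[str, List[Tuple[float, int]]]:
    """Return, per intention label, the k best (score, history index) pairs."""
    pooled_dialogue = max_pool(dialogue_words)
    by_label = defaultdict(list)
    for index, (label, words) in enumerate(history):
        by_label[label].append((cosine(pooled_dialogue, max_pool(words)), index))
    return {
        label: sorted(items, reverse=True)[:k]
        for label, items in by_label.items()
    }

# Hypothetical data: 2-dimensional word vectors and two intention categories.
dialogue_words = [[1.0, 0.0], [0.5, 0.0]]
history = [
    ("ask", [[1.0, 0.0]]),
    ("ask", [[0.0, 1.0]]),
    ("ask", [[1.0, 1.0]]),
    ("thank", [[0.0, 1.0]]),
]
selected = select_top_k(dialogue_words, history, k=2)
```

Grouping by category before ranking ensures that every intention label keeps its own K candidates, even when one category dominates the overall similarity ranking.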
As shown in FIG. 4, the relevance recalculation module recalculates the relevance score between the user's historical speech and the current conversation based on the conversation content and the preliminary recognition result obtained by the intention recognition device. A similarity matrix is first obtained by calculating the correlations between the K sentences in the $j$-th category and the N sentences in the current dialogue:
[Equation image BDA0003056805120000077]
[Equation image BDA0003056805120000078]
where $W_s$ is a parameter to be learned, $h_n$ is the sentence representation of the $n$-th sentence in the dialogue, and the remaining term is the sentence representation of the $k$-th sentence in the $j$-th category; both sentence representations are context-aware sentence representations obtained by the intention recognition device. From the similarity matrix and the preliminary recognition result of each sentence in the dialogue with respect to the $j$-th category, obtained by the intention recognition device, the relevance score is recalculated:
[Equation image BDA0003056805120000084]
[Equation image BDA0003056805120000085]
where the result represents the relevance scores between the K sentences in the $j$-th category and the current conversation.
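The exact recalculation formulas are not reproduced in the text above, so the following is only one plausible reading, shown to make the data flow concrete. It assumes a bilinear similarity through $W_s$ and assumes that each sentence's similarity is weighted by its preliminary score for the category; both choices, and all function names, are illustrative assumptions rather than the method of the publication.

```python
from typing import List

Vector = List[float]
Matrix = List[List[float]]

def bilinear(h: Vector, W: Matrix, g: Vector) -> float:
    """Assumed similarity h^T W g between two sentence representations."""
    return sum(h[a] * W[a][b] * g[b] for a in range(len(h)) for b in range(len(g)))

def similarity_matrix(dialogue: List[Vector], history: List[Vector], W_s: Matrix) -> Matrix:
    """N x K matrix: rows are dialogue sentences, columns are history sentences."""
    return [[bilinear(h_n, W_s, g_k) for g_k in history] for h_n in dialogue]

def recalculated_relevance(M: Matrix, p_j: List[float]) -> List[float]:
    """Assumed aggregation: average each column of M, weighting sentence n by
    its preliminary score p_j[n] for category j."""
    total = sum(p_j)
    if total == 0:
        return [0.0] * len(M[0])
    return [
        sum(p_j[n] * M[n][k] for n in range(len(M))) / total
        for k in range(len(M[0]))
    ]

# Hypothetical example: N = 2 dialogue sentences, K = 2 history sentences.
M = similarity_matrix(
    [[1.0, 0.0], [0.0, 1.0]],
    [[1.0, 0.0], [1.0, 1.0]],
    [[1.0, 0.0], [0.0, 1.0]],
)
scores = recalculated_relevance(M, [0.75, 0.25])
```

Under this reading, dialogue sentences that the first stage already associates with category j have more influence on which historical speeches of that category are kept, which is one way to reduce noise from irrelevant history.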
Recognition result adjusting means 3:
the recognition result adjusting device 3 adjusts the recognition result according to the dialogue content, the preliminary recognition result obtained by the intention recognition device, the history information obtained by the history information screening device, and the correlation score, and further more accurately recognizes the intention of the user.
The recognition result adjusting device 3 uses a heterogeneous graph neural network to encode the current conversation content and the user's historical speech, and thereby recognizes the user's personalized expressions. The heterogeneous graph contains two types of nodes: sentence nodes and label nodes. The sentence node vectors come from the sentence vector representations obtained by the intention recognition device. Each label node vector is obtained from the K historical speeches most relevant to the current dialogue, as selected by the historical information screening device, by averaging them weighted by their relevance scores.
The heterogeneous graph also contains two types of edges: user edges and label edges. The user edges represent the relations between sentence nodes. In a conversation, each user is influenced both by himself or herself and by other users, and also by the speaking order, so the user edges are divided into 4 categories according to the speaker and the speaking order (an earlier sentence by the same user, a later sentence by the same user, an earlier sentence by another user, and a later sentence by another user). To represent the strength of the influence between users, a similarity-based attention module is used to initialize the weights of the user edges. The label edges connect label nodes and sentence nodes; their weights represent the relation between a sentence node and the different labels, and the preliminary recognition result obtained by the intention recognition device is used to initialize them. Note that the heterogeneous graph neural network also updates the edge weights during training. Since the heterogeneous graph contains multiple edge relations, RGCN, a message-passing strategy specific to each edge relation, is used to update the node representations.
Finally, based on the sentence node representations obtained by the heterogeneous graph neural network, each sentence node is classified by a fully connected layer (FFN), and a more accurate intention recognition result is obtained for each sentence.
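The initialization of label nodes and label edges can be sketched in a few lines. The division by the sum of the relevance scores and the use of a softmax to make the label-edge weights sum to 1 are assumptions; the text states only that label nodes are relevance-weighted averages and that label edges are initialized from the preliminary recognition result.

```python
import math
from typing import List

Vector = List[float]

def softmax(scores: Vector) -> Vector:
    """Normalize scores into positive weights that sum to 1."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]  # shift for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

def label_node(history_reprs: List[Vector], relevance: Vector) -> Vector:
    """Relevance-weighted average of the K history sentence representations."""
    total = sum(relevance)
    if total == 0:
        raise ValueError("relevance scores must not sum to zero")
    dim = len(history_reprs[0])
    return [
        sum(r * h[d] for r, h in zip(relevance, history_reprs)) / total
        for d in range(dim)
    ]

def label_edge_weights(p_i: Vector) -> Vector:
    """Initial weights of the edges from the S label nodes into sentence i,
    derived from its preliminary recognition result (assumed softmax)."""
    return softmax(p_i)

# Hypothetical values: K = 2 history sentences, S = 2 labels.
e_j = label_node([[1.0, 0.0], [0.0, 1.0]], [3.0, 1.0])
alpha = label_edge_weights([0.0, 0.0])
```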
As shown in FIG. 4, the recognition result adjusting device 3 uses a heterogeneous graph neural network to encode the current conversation content and the user's historical speech, and thereby recognizes the user's personalized expressions. The heterogeneous graph contains two types of nodes: sentence nodes and label nodes. The sentence node representations are initialized with the context-aware sentence representations $\{h_1, h_2, \ldots, h_N\}$ obtained by the intention recognition device, where $N$ is the number of sentences in the dialogue. Each of the label node representations $\{e_1, e_2, \ldots, e_S\}$ is a weighted average of the K sentence representations under the corresponding label:
[Equation image BDA0003056805120000091]
where $S$ is the number of labels, the weight of the $k$-th sentence in the $j$-th category is its relevance score to the current dialogue as calculated by the relevance recalculation module of the historical information screening device, and the quantity being averaged is the sentence representation of the $k$-th sentence in the $j$-th category. The heterogeneous graph also contains two types of edges: user edges and label edges. The user edge weights are initialized as follows:
[Equation image BDA0003056805120000094]
such that the weights of the messages that a sentence node $h_i$ receives from the other sentence nodes $\{h_1, \ldots, h_N\}$ sum to 1. The label edge weights are initialized with the preliminary recognition result $p_i$ obtained by the intention recognition device:
[Equation image BDA0003056805120000095]
such that the weights of the messages that a sentence node $h_i$ receives from the label nodes $\{e_1, \ldots, e_S\}$ sum to 1. Since the heterogeneous graph contains multiple edge relations, the node representations are updated with RGCN, a message-passing strategy specific to each edge relation:
$$z_i' = \sigma\Big(\sum_{r \in \mathcal{R}} \sum_{j \in N_i^{r}} \frac{\alpha_{i,j}}{c_{i,j}} W_r z_j + \alpha_{i,i} W_0 z_i\Big)$$
where $z_j$ denotes a node in the graph, $\alpha_{i,j}$ and $\alpha_{i,i}$ are edge weights, $N_i^{r}$ denotes the set of neighbor nodes connected to the $i$-th node by relation $r \in \mathcal{R}$, $c_{i,j}$ is a learnable normalization constant, $\sigma$ is the ReLU activation function, and $W_r$ and $W_0$ are learnable parameters. The heterogeneous graph neural network yields sentence representations $\{g_1, g_2, \ldots, g_N\}$ that fuse the user's historical information. Finally, the local sentence features $\{u_1, u_2, \ldots, u_N\}$, the context-aware sentence features $\{h_1, h_2, \ldots, h_N\}$ and the sentence representations fusing user history information $\{g_1, g_2, \ldots, g_N\}$ are concatenated and input into a classifier to obtain the final prediction result:
$$y_t = \mathrm{sigmoid}\left(W_\alpha [u_t, h_t, g_t] + b_\alpha\right)$$
where $W_\alpha$ comes from the learned parameters of the classifier in the intention recognition device, sigmoid is used as the activation function, and cross entropy is used as the loss function.
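As a rough illustration of the update and classification steps, the following Python sketch performs one relation-specific message-passing step for the sentence nodes and then applies the sigmoid classifier to the concatenated features. It is a sketch under stated assumptions, not the patented formula: the exact update equation is given only as an image in the source, so the sketch fixes the normalization constant $c_{i,j}$ to 1, updates only the sentence nodes, and uses hypothetical names (`update_sentence_nodes`, `classify`, `w_rel`, `w_label`, `w_self`) with identity matrices standing in for learned parameters.

```python
import math


def dot(a, b):
    return sum(x * y for x, y in zip(a, b))


def matvec(w, v):
    """Multiply matrix w (a list of rows) by vector v."""
    return [dot(row, v) for row in w]


def add_scaled(acc, vec, scale):
    """Return acc + scale * vec, element-wise."""
    return [a + scale * x for a, x in zip(acc, vec)]


def update_sentence_nodes(h, e, a_user, a_label, relation, w_rel, w_label, w_self):
    """One relation-specific message-passing step for sentence nodes.

    Assumption: the normalization constant c_{i,j} is fixed to 1.
    """
    g = []
    for i, h_i in enumerate(h):
        # Self-connection, weighted by alpha_{i,i}.
        acc = [a_user[i][i] * x for x in matvec(w_self, h_i)]
        # Messages from other sentence nodes, one weight matrix per relation.
        for j, h_j in enumerate(h):
            if j != i:
                acc = add_scaled(acc, matvec(w_rel[relation(i, j)], h_j), a_user[i][j])
        # Messages from label nodes over label edges.
        for s, e_s in enumerate(e):
            acc = add_scaled(acc, matvec(w_label, e_s), a_label[i][s])
        g.append([max(0.0, x) for x in acc])  # ReLU
    return g


def classify(u_t, h_t, g_t, w, b):
    """y_t = sigmoid(W [u_t, h_t, g_t] + b); list concatenation plays the role of [., ., .]."""
    x = u_t + h_t + g_t
    return [1.0 / (1.0 + math.exp(-(dot(row, x) + b_k))) for row, b_k in zip(w, b)]


# Toy example: 2 sentences, 1 label, 2-dimensional vectors, identity weights.
identity = [[1.0, 0.0], [0.0, 1.0]]
h = [[1.0, 0.0], [0.0, 1.0]]
e = [[1.0, 1.0]]
a_user = [[0.5, 0.5], [0.5, 0.5]]
a_label = [[1.0], [1.0]]
g = update_sentence_nodes(
    h, e, a_user, a_label,
    relation=lambda i, j: "before" if j < i else "after",
    w_rel={"before": identity, "after": identity},
    w_label=identity, w_self=identity,
)
y = classify(u_t=[0.0, 0.0], h_t=h[0], g_t=g[0], w=[[0.0] * 6], b=[0.0])
```

In a trained model the weight matrices and edge weights would be learned jointly, and a real implementation would use a tensor library rather than nested lists; the sketch only makes the data flow explicit.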
As shown in Fig. 5, a user intention identification method based on a heterogeneous graph neural network comprises the following specific steps:
S1, directly recognizing the user intention according to the dialogue content of the user to obtain a preliminary recognition result;
S2, screening the user history information according to the dialogue content and the preliminary recognition result obtained in step S1 to obtain the user history information related to the dialogue content;
S3, encoding the current dialogue content and the user history information related to the dialogue content obtained in step S2 using a heterogeneous graph neural network, and then adjusting the preliminary recognition result obtained in step S1 to recognize the final user intention.

Claims (10)

1. A user intention identification method based on a heterogeneous graph neural network, characterized by comprising the following steps:
S1, directly recognizing the user intention according to the dialogue content of the user to obtain a preliminary recognition result;
S2, screening the user history information according to the dialogue content and the preliminary recognition result obtained in step S1 to obtain the user history information related to the dialogue content;
S3, encoding the current dialogue content and the user history information related to the dialogue content obtained in step S2 using a heterogeneous graph neural network, and then adjusting the preliminary recognition result obtained in step S1 to recognize the final user intention.
2. The user intention identification method based on a heterogeneous graph neural network as claimed in claim 1, wherein in step S1, directly recognizing the user intention according to the dialogue content to obtain a preliminary recognition result specifically comprises: converting the dialogue content into corresponding feature vectors, and obtaining a preliminary recognition result for each sentence in the dialogue content.
3. The user intention identification method based on a heterogeneous graph neural network as claimed in claim 2, wherein converting the dialogue content into corresponding feature vectors and obtaining a preliminary recognition result for each sentence in the dialogue content is specifically as follows: sentence feature vectors containing context information are obtained using a convolutional neural network and a bidirectional long short-term memory network, and a fully connected layer is used as a classifier to classify each sentence in the dialogue content, thereby obtaining a preliminary recognition result for each sentence in the dialogue content.
4. The user intention identification method based on a heterogeneous graph neural network as claimed in claim 3, wherein the user history information refers to the user's historical utterances in other conversations.
5. The user intention identification method based on a heterogeneous graph neural network as claimed in claim 4, wherein in step S2, screening the user history information according to the dialogue content and the preliminary recognition result to obtain the user history information related to the dialogue content is specifically as follows: the user history information is screened using a coarse-grained history selection module; first, pre-trained word embeddings are used to convert the discrete words in the text into continuous word vector representations, and a max pooling operation is applied to obtain a vector representation of the current dialogue content and a vector representation of each of the user's historical utterances; the cosine similarity between the two vector representations is calculated to obtain the relevance score between each historical utterance and the current dialogue content; the user's historical utterances are grouped by intention category, and for each category the top K historical utterances most relevant to the current dialogue content are selected according to the relevance score.
6. The user intention identification method based on a heterogeneous graph neural network as claimed in claim 5, wherein a relevance recalculation method is used to calculate the relevance scores between each sentence in the current dialogue content and the K historical utterances in each category, yielding a similarity matrix; and the preliminary recognition result and the similarity matrix are used to recalculate the relevance scores between the K historical utterances in each category and the current dialogue content as a whole.
7. The user intention identification method based on a heterogeneous graph neural network as claimed in claim 6, wherein the heterogeneous graph neural network comprises two types of nodes: sentence nodes and label nodes, wherein the vector representations of the sentence nodes come from the sentence feature vector representations obtained in step S1, and the vector representation of each label node is obtained as a weighted average, according to the relevance scores, of the vector representations of the K historical utterances most relevant to the current dialogue content obtained in step S2.
8. The user intention identification method based on a heterogeneous graph neural network as claimed in claim 7, wherein the heterogeneous graph neural network comprises two types of edges: user edges and label edges, wherein the user edges represent the relationships between sentence nodes, and the user edge weights are initialized by a similarity-based attention module; the label edges connect label nodes and sentence nodes, the weight of a label edge represents the relationship between a sentence node and a given label, and the preliminary recognition result obtained in step S1 is used to initialize the label edge weights; the node representations are updated using a relation-specific message-passing strategy; and according to the sentence node representations obtained by the heterogeneous graph neural network, each sentence node is classified using a fully connected layer, thereby obtaining an adjusted intention recognition result for each sentence.
9. The user intention identification method based on a heterogeneous graph neural network as claimed in claim 8, wherein, in the dialogue content, the user edges are divided into four types according to the speaker and the speaking order: the same speaker's earlier utterance, the same speaker's later utterance, another speaker's earlier utterance, and another speaker's later utterance.
10. An apparatus for the user intention identification method based on a heterogeneous graph neural network as claimed in claim 1, comprising an intention recognition device, a historical information screening device, and a recognition result adjusting device;
the intention recognition device directly recognizes the user intention according to the dialogue content of the user to obtain a preliminary recognition result;
the historical information screening device screens the user history information according to the dialogue content and the preliminary recognition result to obtain the user history information related to the dialogue content;
the recognition result adjusting device uses a heterogeneous graph neural network to encode the current dialogue content and the user history information related to the dialogue content, and then adjusts the preliminary recognition result to recognize the final user intention.
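For illustration only, the coarse-grained history selection recited in claim 5 (max pooling over word vectors, cosine similarity, and top-K selection per intention category) can be sketched in Python as follows. The function names and the toy two-dimensional word vectors are hypothetical; in practice the word vectors would come from pre-trained word embeddings, and the fine-grained relevance recalculation of claim 6 is not covered here.

```python
import math
from collections import defaultdict


def max_pool(word_vectors):
    """Element-wise maximum over the word vectors of one text."""
    return [max(column) for column in zip(*word_vectors)]


def cosine(a, b):
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return sum(x * y for x, y in zip(a, b)) / norm if norm else 0.0


def select_history(dialogue_words, history, k):
    """Pick the top-k most relevant history utterances for each intention category.

    history is a list of (intention_label, word_vectors) pairs; the result maps
    each label to a list of (relevance_score, history_index), best first.
    """
    query = max_pool(dialogue_words)
    by_label = defaultdict(list)
    for index, (label, words) in enumerate(history):
        by_label[label].append((cosine(query, max_pool(words)), index))
    return {label: sorted(items, reverse=True)[:k] for label, items in by_label.items()}


# Toy data: the current dialogue pools to [1.0, 0.0].
dialogue_words = [[1.0, 0.0], [0.5, 0.0]]
history = [
    ("question", [[1.0, 0.0]]),
    ("question", [[0.0, 1.0]]),
    ("question", [[1.0, 1.0]]),
    ("statement", [[0.0, 2.0]]),
]
selected = select_history(dialogue_words, history, k=2)
```

Here `selected["question"]` keeps the two "question" utterances closest to the current dialogue, and categories with fewer than K utterances simply return all of them.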

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110502094.2A CN113158062A (en) 2021-05-08 2021-05-08 User intention identification method and device based on heterogeneous graph neural network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110502094.2A CN113158062A (en) 2021-05-08 2021-05-08 User intention identification method and device based on heterogeneous graph neural network

Publications (1)

Publication Number Publication Date
CN113158062A (en) 2021-07-23

Family

ID=76873967

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110502094.2A Pending CN113158062A (en) 2021-05-08 2021-05-08 User intention identification method and device based on heterogeneous graph neural network

Country Status (1)

Country Link
CN (1) CN113158062A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109241255A (en) * 2018-08-20 2019-01-18 华中师范大学 A kind of intension recognizing method based on deep learning
US20190228070A1 (en) * 2016-09-30 2019-07-25 Huawei Technologies Co., Ltd. Deep learning based dialog method, apparatus, and device
CN112271001A (en) * 2020-11-17 2021-01-26 中山大学 Medical consultation dialogue system and method applying heterogeneous graph neural network
WO2021042543A1 (en) * 2019-09-04 2021-03-11 平安科技(深圳)有限公司 Multi-round dialogue semantic analysis method and system based on long short-term memory network
CN112613308A (en) * 2020-12-17 2021-04-06 中国平安人寿保险股份有限公司 User intention identification method and device, terminal equipment and storage medium


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
DONG WANG: ""Integrating User History into Heterogeneous Graph for Dialogue Act Recognition"", 28TH INTERNATIONAL CONFERENCE ON COMPUTATIONAL LINGUISTICS *


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination