CN114065769B - Method, device, equipment and medium for training emotion reason pair extraction model - Google Patents
- Publication number
- Publication number: CN114065769B (application CN202210039899.2A)
- Authority
- CN
- China
- Prior art keywords
- clause
- reason
- emotion
- output
- network
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/044—Recurrent networks, e.g. Hopfield networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Abstract
The embodiment of the invention discloses a method, a device, equipment and a medium for training an emotion reason pair extraction model, relating to the field of neural network models. The method comprises the following steps: inputting a document sample into a first coding network to encode words and clauses, obtaining an emotion clause representation and a reason clause representation of each clause; making a preliminary prediction from the two clause representations of each clause to obtain two clause prediction results, and obtaining from them the emotion output and reason output of each clause; inputting the emotion output and reason output into a graph attention network for updating; based on a pairing network, obtaining corresponding emotion representations and reason representations from the two updated outputs, and pairing the emotion representations with the reason representations to obtain emotion reason pairs; obtaining an emotion reason pair prediction result through the prediction network; and calculating a loss value from the prediction result and updating the model. The invention thus uses the graph attention network to extract the interrelation among clauses, enriching the information contained in the emotion output and reason output of each clause and improving prediction accuracy.
Description
Technical Field
The invention relates to the field of neural network models, in particular to a method, a device, equipment and a medium for training an emotion reason pair extraction model.
Background
The emotion classification task mainly classifies the emotion expressed in a text. With the continuous development of natural language processing technology, however, merely classifying emotion can no longer meet real-life requirements, so researchers have turned their attention to the reasons behind the emotion.
Therefore, how to determine the emotion words/sentences in the text and the reason words/sentences pointing to the emotion words/sentences is one of the current main research directions.
Disclosure of Invention
In view of this, the present invention provides a method, an apparatus, a device and a medium for training an emotion reason pair extraction model, which are used to determine an emotion clause and a reason clause pointing to the emotion clause in a text.
In a first aspect, an embodiment of the present invention provides a method for training an emotion reason pair extraction model, including:
inputting a document sample into the first coding network to obtain emotion clause representation and reason clause representation of each clause in the document sample;
obtaining a first emotion clause prediction result and a first reason clause prediction result of each clause according to the emotion clause expression and the reason clause expression of each clause, and obtaining emotion output and reason output of each clause based on a second coding network, wherein the emotion output of each clause is obtained through the first reason clause prediction result and the emotion clause expression of the clause, and the reason output of each clause is obtained through the first emotion clause prediction result and the reason clause expression of the clause;
inputting the emotion output and the reason output of each clause into a graph attention network to obtain updated emotion output and updated reason output of each clause, wherein the graph attention network is used for updating the emotion output and the reason output of each clause according to the emotion output and the reason output of each clause;
based on the pairing network, obtaining emotion expression of each clause according to the updated emotion output of each clause, obtaining reason expression of each clause according to the updated reason output of each clause, and pairing the emotion expression and the reason expression of all the clauses in pairs to obtain emotion reason pairs;
inputting all the emotional reason pairs into a prediction network to obtain the prediction result of the emotional reason pairs;
and obtaining an emotional cause pair prediction loss value corresponding to the emotional cause pair prediction result according to a first preset formula, and updating the emotional cause pair extraction model by using the emotional cause pair prediction loss value.
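For illustration only (this is not part of the claimed method), the steps above can be sketched as a single training step in which each of the five sub-networks is treated as an opaque callable; every name in the sketch is a placeholder, not the patent's actual implementation:

```python
def train_step(document, model, update):
    """One training step of the emotion reason pair extraction model.

    `model` maps sub-network names to callables; all names are illustrative.
    """
    # First coding network: per-clause emotion / reason clause representations
    emo_reps, cause_reps = model["encode1"](document)
    # Second coding network: preliminary predictions cross-feed into outputs
    emo_out, cause_out = model["encode2"](emo_reps, cause_reps)
    # Graph attention network: update both outputs with inter-clause relations
    emo_out, cause_out = model["gat"](emo_out, cause_out)
    # Pairing network: pairwise emotion/reason representations -> candidates
    pairs = model["pair"](emo_out, cause_out)
    # Prediction network, then loss on the pair predictions
    preds = model["predict"](pairs)
    loss = model["loss"](preds)
    update(loss)
    return loss
```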
Optionally, in an implementation manner provided by the embodiment of the present invention, the graph attention network includes a preset number of layers of graph attention layers connected in sequence, where the first layer of graph attention layer is used to update the emotion output and the reason output of each clause, and the remaining layers of graph attention layers of the graph attention network are used to update the emotion output and the reason output of the previous graph attention layer.
Optionally, in an implementation manner provided by the embodiment of the present invention, the obtaining, based on the pairing network, an emotion representation of each clause according to an updated emotion output of each clause, and obtaining a reason representation of each clause according to an updated reason output of each clause includes:
inputting the updated emotion output of each clause into a second preset formula of the pairing network to obtain emotion expression, and inputting the updated reason output of each clause into a third preset formula of the pairing network to obtain reason expression;
the second preset formula comprises:

$$r_i^e = \sigma\big(W_e \tilde{h}_i^e + b_e\big)$$

where $r_i^e$ denotes the emotion representation of the $i$-th clause, $\sigma$ denotes the relu function, $\tilde{h}_i^e$ denotes the updated emotion output of the $i$-th clause, $W_e$ denotes the trainable weight corresponding to the emotion representation, and $b_e$ denotes the trainable bias corresponding to the emotion representation;

the third preset formula comprises:

$$r_i^c = \sigma\big(W_c \tilde{h}_i^c + b_c\big)$$

where $r_i^c$ denotes the reason representation of the $i$-th clause, $\sigma$ denotes the relu function, $\tilde{h}_i^c$ denotes the updated reason output of the $i$-th clause, $W_c$ denotes the trainable weight corresponding to the reason representation, and $b_c$ denotes the trainable bias corresponding to the reason representation.
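For illustration, both transformations reduce to a relu-activated affine map. A minimal plain-Python sketch (weight shapes and values are hypothetical):

```python
def affine(W, x, b):
    # y = W x + b, with W given as a list of rows
    return [sum(w * v for w, v in zip(row, x)) + bias
            for row, bias in zip(W, b)]

def clause_representation(W, b, updated_output):
    # r_i = relu(W h_i + b): the same form serves the emotion
    # representation (W_e, b_e) and the reason representation (W_c, b_c)
    return [max(0.0, y) for y in affine(W, updated_output, b)]
```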
Further, in an implementation manner provided by the embodiment of the present invention, pairing the emotion expressions and reason expressions of all the clauses in pairs to obtain an emotion reason pair, includes:
based on a biaffine (double affine) mechanism, taking the emotion representation of each clause as the central item and the reason representation of each clause as the auxiliary item, and pairing all central items pairwise with all auxiliary items based on a preset formula set in the pairing network to obtain the corresponding emotion reason pairs.
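For illustration, a biaffine score for one candidate pair is commonly a bilinear term plus a linear term plus a bias; since the patent only names the mechanism, the exact form and parameter names below are assumptions:

```python
def biaffine_score(r_e, r_c, U, w, b):
    """score(r_e, r_c) = r_e^T U r_c + w . [r_e ; r_c] + b

    r_e is the central item (emotion representation), r_c the auxiliary
    item (reason representation); U, w, b stand in for trainable parameters.
    """
    bilinear = sum(r_e[i] * U[i][j] * r_c[j]
                   for i in range(len(r_e)) for j in range(len(r_c)))
    linear = sum(wv * v for wv, v in zip(w, r_e + r_c))
    return bilinear + linear + b
```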
Optionally, in an implementation manner provided by the embodiment of the present invention, the first preset formula includes:

$$L_{pair} = -\sum_{p}\sum_{q}\Big[Y_{p,q}\log \hat{Y}_{p,q} + \big(1-Y_{p,q}\big)\log\big(1-\hat{Y}_{p,q}\big)\Big]$$

where $L_{pair}$ denotes the emotion reason pair prediction loss value, $\hat{Y}_{p,q}$ denotes the prediction for the emotion reason pair formed from the emotion representation of the $p$-th clause and the reason representation of the $q$-th clause, and $Y_{p,q}$ takes 1 when $\hat{Y}_{p,q}$ is a correct prediction and 0 otherwise.
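A plain-Python sketch of this pair-level loss as a cross-entropy over all candidate pairs, under the assumption that each prediction is a probability strictly between 0 and 1:

```python
import math

def pair_loss(y_hat, y_true):
    """Cross-entropy over all candidate emotion reason pairs.

    y_hat[p][q]: predicted probability that the pair formed from the p-th
    emotion representation and the q-th reason representation is correct;
    y_true[p][q]: 1 if that pair is a true emotion reason pair, else 0.
    """
    loss = 0.0
    for p in range(len(y_hat)):
        for q in range(len(y_hat[p])):
            y, t = y_hat[p][q], y_true[p][q]
            loss -= t * math.log(y) + (1 - t) * math.log(1 - y)
    return loss
```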
Optionally, in an implementation manner provided by the embodiment of the present invention, after obtaining, based on the second coding network, a first emotion clause prediction result and a first reason clause prediction result of each clause according to an emotion clause representation and a reason clause representation of each clause, and obtaining an emotion output of each clause and a reason output of each clause, the method further includes:
obtaining a clause classification loss value corresponding to the document sample according to a first emotion clause prediction result and a first reason clause prediction result of each clause;
the updating the emotional cause pair extraction model by using the emotional cause pair prediction loss value comprises the following steps:
and updating the emotional cause pair extraction model by using the emotional cause pair prediction loss value and the clause classification loss value.
Optionally, in an implementation manner provided by the embodiment of the present invention, after obtaining, based on the second coding network, a first emotion clause prediction result and a first reason clause prediction result of each clause according to an emotion clause representation and a reason clause representation of each clause, and obtaining an emotion output of each clause and a reason output of each clause, the method further includes:
obtaining a second emotion clause prediction result and a second reason clause prediction result of each clause by utilizing emotion output of each clause and reason output of each clause based on the prediction network;
obtaining a clause prediction loss value corresponding to the document sample according to a second emotion clause prediction result and a second reason clause prediction result of each clause;
the updating the emotion reason pair extraction model by using the emotion reason pair prediction loss value and the clause classification loss value comprises the following steps:
and updating the emotional cause pair extraction model by utilizing the emotional cause pair prediction loss value, the clause classification loss value and the clause prediction loss value.
In a second aspect, an embodiment of the present invention provides an apparatus for training an emotional cause pair extraction model, where the emotional cause pair extraction model includes a first coding network, a second coding network, a graph attention network, a pairing network, and a prediction network, and the apparatus includes:
the first coding module is used for inputting the document sample into the first coding network to obtain the emotion clause representation and the reason clause representation of each clause in the document sample;
a second coding module, configured to obtain, based on a second coding network, a first emotion clause prediction result and a first reason clause prediction result for each clause according to an emotion clause expression and a reason clause expression of each clause, and obtain an emotion output for each clause and a reason output for each clause, where the emotion output for each clause is obtained by the first reason clause prediction result and the emotion clause expression of the clause, and the reason output for each clause is obtained by the first emotion clause prediction result and the reason clause expression of the clause;
the graph attention module is used for inputting the emotion output and the reason output of each clause into a graph attention network to obtain updated emotion output and updated reason output of each clause, wherein the graph attention network is used for updating the emotion output and the reason output of each clause according to the emotion output and the reason output of each clause;
the matching module is used for obtaining the emotion expression of each clause according to the updated emotion output of each clause and obtaining the reason expression of each clause according to the updated reason output of each clause based on the matching network, and pairwise matching the emotion expression and the reason expression of all the clauses to obtain an emotion reason pair;
the prediction module is used for inputting all the emotional reason pairs into a prediction network to obtain the prediction result of the emotional reason pairs;
and the updating module is used for obtaining the emotional cause pair prediction loss value corresponding to the emotional cause pair prediction result according to a first preset formula and updating the emotional cause pair extraction model by utilizing the emotional cause pair prediction loss value.
In a third aspect, an embodiment of the present invention provides a computer device, including a memory and a processor, where the memory stores a computer program, and the computer program, when running on the processor, executes the method for training the emotion reason pair extraction model as disclosed in any one of the first aspects.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium, on which a computer program is stored, where the computer program, when running on a processor, performs the method for training the emotion cause pair extraction model as disclosed in any one of the first aspects.
The method for training the emotion reason pair extraction model provided by the embodiment of the invention inputs a document sample into a first coding network to encode the words and clauses in the document sample, thereby obtaining the emotion clause representation and the reason clause representation of each clause, that is, generating two representations for each clause. Each clause is then preliminarily predicted from its emotion clause representation and reason clause representation, yielding a first emotion clause prediction result and a first reason clause prediction result for each clause. The emotion output is obtained from the first reason clause prediction result and the emotion clause representation of each clause, namely by inputting them into a Bi-LSTM (Bi-directional Long Short-Term Memory) network to obtain the corresponding output; the reason output of each clause is similarly obtained from the first emotion clause prediction result and the reason clause representation of the clause. Next, the emotion output and the reason output of each clause are input into a graph attention network for updating, obtaining the updated emotion output and updated reason output of each clause. Based on the pairing network, the emotion representation of each clause is obtained from its updated emotion output and the reason representation from its updated reason output, and the emotion representations and reason representations are paired pairwise to obtain emotion reason pairs. All emotion reason pairs are then predicted by the prediction network to obtain the emotion reason pair prediction result. Finally, a prediction loss value is calculated from the emotion reason pair prediction result so as to update the emotion reason pair extraction model.
Based on the above, the embodiment of the invention uses the graph attention network to extract the interrelation between clauses, so that the emotion output and reason output of each clause are fused and updated and the information they contain is enriched; the model can thus complete prediction based on richer information, improving prediction accuracy. Furthermore, the embodiment of the invention combines the emotion extraction task and the reason extraction task: the reason representation is updated with the emotion extraction result, i.e. the first emotion clause prediction result, to obtain the reason output, and the emotion representation is updated with the reason clause extraction result, i.e. the first reason clause prediction result, to obtain the emotion output, thereby realizing the interaction of the two clause extraction tasks.
Drawings
In order to more clearly illustrate the technical solution of the present invention, the drawings required to be used in the embodiments will be briefly described below, and it should be understood that the following drawings only illustrate some embodiments of the present invention, and therefore should not be considered as limiting the scope of the present invention. Like components are numbered similarly in the various figures.
FIG. 1 is a schematic flow chart of a first method for training an emotional cause pair extraction model according to an embodiment of the present invention;
FIG. 2 is a flowchart illustrating a second method for training an emotional cause pair extraction model according to an embodiment of the present invention;
FIG. 3 is a flow chart of a third method for training an emotional cause pair extraction model according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a training device for emotion reason pair extraction models according to an embodiment of the present invention.
Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments.
The components of embodiments of the present invention generally described and illustrated in the figures herein may be arranged and designed in a wide variety of different configurations. Thus, the following detailed description of the embodiments of the present invention, presented in the figures, is not intended to limit the scope of the invention, as claimed, but is merely representative of selected embodiments of the invention. All other embodiments, which can be derived by a person skilled in the art from the embodiments of the present invention without making any creative effort, shall fall within the protection scope of the present invention.
Hereinafter, the terms "including", "having", and their derivatives, as used in various embodiments of the present invention, are only intended to indicate specific features, numbers, steps, operations, elements, components, or combinations of the foregoing, and should not be construed as excluding the existence or addition of one or more other features, numbers, steps, operations, elements, components, or combinations of the foregoing.
Furthermore, the terms "first," "second," "third," and the like are used solely to distinguish one from another and are not to be construed as indicating or implying relative importance.
Unless otherwise defined, all terms (including technical and scientific terms) used herein have the same meaning as commonly understood by one of ordinary skill in the art to which various embodiments of the present invention belong. The terms (such as those defined in commonly used dictionaries) should be interpreted as having a meaning that is consistent with their contextual meaning in the relevant art and will not be interpreted in an idealized or overly formal sense unless expressly so defined herein in various embodiments of the present invention.
Referring to fig. 1, fig. 1 is a schematic flowchart illustrating a method for training an emotion reason pair extraction model according to an embodiment of the present invention. The emotion reason pair extraction model in this method includes a first coding network, a second coding network, a graph attention network, a pairing network, and a prediction network, and the method includes:
s110, inputting the document sample into the first coding network to obtain the emotion clause representation and the reason clause representation of each clause in the document sample.
Specifically, after a document sample is input into the first coding network, the first coding network performs embedding processing on each word in the document sample, that is, converts each word into a vector representation of a preset dimension. The word vectors of the preset dimension corresponding to each clause are then input into two Bi-LSTM (Bi-directional Long Short-Term Memory) networks to obtain, respectively, representations of words expressing emotion, i.e. emotion word representations, and representations of words expressing reasons, i.e. reason word representations. It can be understood that emotion clauses and reason clauses do not attend to the same words; therefore, after the word vectors of the preset dimension are obtained, the embodiment of the present invention obtains the corresponding emotion word representations and reason word representations through two separate Bi-LSTM networks.
Then, after the emotion word representations and reason word representations are obtained, the words are aggregated and the context information inside each clause, i.e. between its words, is encoded, so that the model can focus on the words that are important to the clause, yielding the emotion clause representation and the reason clause representation. It can likewise be understood that the importance of each word in a clause differs between the emotion view and the reason view; therefore, the embodiment of the present invention generates both a reason clause representation and an emotion clause representation for each clause.
To better illustrate the specific process of obtaining the reason clause representation and the emotion clause representation, reference may be made to the following formulas:

$$u_{ij}^e = \tanh\big(W_1^e h_{ij}^e + b^e\big), \qquad \alpha_{ij}^e = \frac{\exp\big(W_2^e u_{ij}^e\big)}{\sum_{k}\exp\big(W_2^e u_{ik}^e\big)}, \qquad s_i^e = \sum_{j}\alpha_{ij}^e h_{ij}^e$$

where $u_{ij}^e$ denotes the feature output of the $j$-th emotion word in the $i$-th clause, $W_1^e$ and $W_2^e$ denote the first and second weight matrices corresponding to the emotion clause representation, $h_{ij}^e$ denotes the $j$-th emotion word representation in the $i$-th clause, $b^e$ denotes the bias parameter corresponding to the emotion clause representation, $\alpha_{ij}^e$ denotes the weight of the $j$-th word in the $i$-th clause, and $s_i^e$ denotes the emotion clause representation of the $i$-th clause.

$$u_{ij}^c = \tanh\big(W_1^c h_{ij}^c + b^c\big), \qquad \alpha_{ij}^c = \frac{\exp\big(W_2^c u_{ij}^c\big)}{\sum_{k}\exp\big(W_2^c u_{ik}^c\big)}, \qquad s_i^c = \sum_{j}\alpha_{ij}^c h_{ij}^c$$

where $u_{ij}^c$ denotes the feature output of the $j$-th reason word in the $i$-th clause, $W_1^c$ and $W_2^c$ denote the first and second weight matrices corresponding to the reason clause representation, $h_{ij}^c$ denotes the $j$-th reason word representation in the $i$-th clause, $b^c$ denotes the bias parameter corresponding to the reason clause representation, $\alpha_{ij}^c$ denotes the weight of the $j$-th word in the $i$-th clause, and $s_i^c$ denotes the reason clause representation of the $i$-th clause.
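This aggregation (per-word feature output, softmax word weight, weighted sum) can be sketched in plain Python; the two weight matrices are collapsed to small illustrative lists, and all values are hypothetical:

```python
import math

def softmax(scores):
    m = max(scores)  # shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attend(word_vecs, W1, b, w2):
    """Aggregate word representations h_ij into one clause representation.

    u_ij = tanh(W1 h_ij + b)      per-word feature output
    a_ij = softmax_j(w2 . u_ij)   per-word weight
    s_i  = sum_j a_ij * h_ij      clause representation
    """
    def feature(h):
        return [math.tanh(sum(w * v for w, v in zip(row, h)) + bias)
                for row, bias in zip(W1, b)]
    scores = [sum(wv * uv for wv, uv in zip(w2, feature(h)))
              for h in word_vecs]
    weights = softmax(scores)
    dim = len(word_vecs[0])
    return [sum(weights[j] * word_vecs[j][d] for j in range(len(word_vecs)))
            for d in range(dim)]
```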
It can be understood that the emotion word representation and the reason word representation are not limited to being obtained through a Bi-LSTM network; in another implementation provided by the embodiment of the present invention, the emotion word representation and the reason word representation are obtained through a BERT (Bidirectional Encoder Representations from Transformers) network.
And S120, based on a second coding network, obtaining a first emotion clause prediction result and a first reason clause prediction result of each clause according to the emotion clause representation and the reason clause representation of each clause, and obtaining an emotion output of each clause and a reason output of each clause, wherein the emotion output of each clause is obtained through the first reason clause prediction result and the emotion clause representation of the clause, and the reason output of each clause is obtained through the first emotion clause prediction result and the reason clause representation of the clause.
It should be understood that emotion extraction and reason extraction of clauses are not mutually independent tasks; there is a mutually indicative relationship between emotion and reason. The result of emotion extraction helps to better locate the reason, and the result of reason extraction helps to better identify whether a clause is an emotion clause.
Therefore, the embodiment of the present invention provides two auxiliary tasks, namely, an emotion extraction auxiliary task and a reason extraction auxiliary task, which are respectively used for performing preliminary prediction on the emotion clause representation and the reason clause representation of each clause to obtain corresponding prediction results.
After obtaining preliminary prediction results, namely a first emotion clause prediction result and a first reason clause prediction result, updating the reason clause representation of each clause by using the first emotion clause prediction result of each clause so as to obtain the reason output of each clause; similarly, the emotion clause representation of each clause is updated by using the first reason clause prediction result of each clause, and then emotion output of each clause is obtained.
To better illustrate the specific process of obtaining the emotion output of each clause and the reason output of each clause according to the embodiment of the present invention, the following formulas can be referred to:

$$h_i^e = \big[\,LSTM_f\big([s_i^e;\hat{y}_i^c]\big)\,;\,LSTM_b\big([s_i^e;\hat{y}_i^c]\big)\,\big]$$

$$h_i^c = \big[\,LSTM_f\big([s_i^c;\hat{y}_i^e]\big)\,;\,LSTM_b\big([s_i^c;\hat{y}_i^e]\big)\,\big]$$

where $h_i^e$ denotes the emotion output of the $i$-th clause, $h_i^c$ denotes the reason output of the $i$-th clause, $s_i^e$ denotes the emotion clause representation of the $i$-th clause, $s_i^c$ denotes the reason clause representation of the $i$-th clause, $\hat{y}_i^c$ denotes the result of the reason auxiliary task, i.e. the first reason clause prediction result obtained from the reason clause representation of the $i$-th clause, $\hat{y}_i^e$ denotes the result of the emotion extraction auxiliary task, i.e. the first emotion clause prediction result obtained from the emotion clause representation of the $i$-th clause; $LSTM_f$ and $LSTM_b$ respectively denote the forward LSTM network and the backward LSTM network.
It can be understood that the positions of the clauses, i.e. their order, carry relatively important information, and a Bi-LSTM network can represent the order of the clauses well; therefore, in this implementation manner the embodiment of the present invention adopts a Bi-LSTM network to obtain the emotion output of each clause and the reason output of each clause. It can further be understood that the Bi-LSTM network is only an optional choice, and the specific network/algorithm used to complete the emotion auxiliary task and the reason auxiliary task can be set and/or selected according to the actual situation. In one embodiment, the emotion auxiliary task and the reason auxiliary task are completed through an LSTM (Long Short-Term Memory) network.
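The cross-feeding step itself, concatenating each clause representation with the other task's preliminary prediction before the Bi-LSTM, is a simple per-clause concatenation. A sketch (the function name is assumed):

```python
def second_encoder_inputs(clause_reps, other_task_preds):
    """Build the Bi-LSTM input sequence for one task of the second coding
    network: the i-th input is [s_i ; y_hat_i], i.e. the clause
    representation joined with the other task's preliminary prediction."""
    return [rep + pred for rep, pred in zip(clause_reps, other_task_preds)]
```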
It is to be understood that, regarding how $\hat{y}_i^e$ and $\hat{y}_i^c$ are obtained, the embodiment of the present invention imposes no additional limitation; they can be set and/or selected according to the actual situation. By way of example, in one implementation provided by the embodiment of the present invention, the acquisition process of $\hat{y}_i^e$ and $\hat{y}_i^c$ comprises the following steps:
acquiring emotion characteristics and reason characteristics of each clause from emotion clause expression and reason clause expression of each clause based on a first characteristic extraction formula;
and inputting the emotional characteristic and the reason characteristic of each clause into two fully-connected networks which are connected in sequence, and calculating a first emotional clause prediction result and a first reason clause prediction result of each clause by using a first prediction formula.
Wherein the first feature extraction formula comprises:

$$a_i^e = \sigma\big(W_{ae} s_i^e + b_{ae}\big), \qquad a_i^c = \sigma\big(W_{ac} s_i^c + b_{ac}\big)$$

where $a_i^e$ denotes the emotion feature of the emotion clause representation of the $i$-th clause, $a_i^c$ denotes the reason feature of the reason clause representation of the $i$-th clause, $s_i^e$ denotes the emotion clause representation of the $i$-th clause, $s_i^c$ denotes the reason clause representation of the $i$-th clause, $\sigma$ denotes the relu function, $W_{ae}$ denotes the emotion feature extraction weight, $W_{ac}$ denotes the reason feature extraction weight, $b_{ae}$ denotes the emotion feature extraction bias, and $b_{ac}$ denotes the reason feature extraction bias.
Wherein the first prediction formula comprises:

$$\hat{y}_i^e = \mathrm{softmax}\big(\hat{W}_e a_i^e + \hat{b}_e\big), \qquad \hat{y}_i^c = \mathrm{softmax}\big(\hat{W}_c a_i^c + \hat{b}_c\big)$$

where $\hat{y}_i^e$ denotes the emotion clause prediction result of the $i$-th clause, i.e. the first emotion clause prediction result, $\hat{y}_i^c$ denotes the reason clause prediction result of the $i$-th clause, i.e. the first reason clause prediction result, $\hat{W}_e$ denotes the emotion prediction weight, $\hat{W}_c$ denotes the reason prediction weight, $\hat{b}_e$ denotes the emotion prediction bias, and $\hat{b}_c$ denotes the reason prediction bias.
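For illustration, the two steps chain a relu feature extraction into a softmax classifier. A plain-Python sketch (all shapes and values are hypothetical):

```python
import math

def dense(W, x, b):
    # y = W x + b, with W given as a list of rows
    return [sum(w * v for w, v in zip(row, x)) + bias
            for row, bias in zip(W, b)]

def softmax(scores):
    m = max(scores)  # shift for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def predict_clause(s, W_a, b_a, W_y, b_y):
    """First feature extraction a_i = relu(W_a s_i + b_a), then the
    first prediction y_hat_i = softmax(W_y a_i + b_y)."""
    a = [max(0.0, v) for v in dense(W_a, s, b_a)]
    return softmax(dense(W_y, a, b_y))
```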
And S130, inputting the emotion output and the reason output of each clause into a graph attention network to obtain updated emotion output and updated reason output of each clause, wherein the graph attention network is used for updating the emotion output and the reason output of each clause according to the emotion output and the reason output of each clause.
It should be understood that the correlation between clauses has a large influence on extracting emotion reason pairs, and the Graph Attention Network (GAT) can capture information between clauses well and learn attention weights autonomously. Compared with an LSTM or Bi-LSTM, the clause representations obtained from the graph attention network naturally incorporate higher-order information. It will be appreciated that when multiple graph attention layers are stacked, each node has already encoded its neighbors' information once in the previous layer, so when the current layer encodes it again, information from more distant neighbor nodes, i.e. higher-order information, is actually encoded. Therefore, the embodiment of the invention adopts the graph attention network to further update the clause representations, namely the emotion output and the reason output of each clause.
Optionally, in an implementation manner provided by the embodiment of the present invention, the graph attention network includes a preset number of layers of graph attention layers connected in sequence, where the first layer of graph attention layer is used to update the emotion output and the reason output of each clause, and the remaining layers of graph attention layers of the graph attention network are used to update the emotion output and the reason output of the previous graph attention layer.
That is, embodiments of the present invention enhance the association between clause representations by stacking a preset number of layers of graph attention layers.
To better illustrate the specific process of updating the emotion output and the reason output by using the graph attention layers in the embodiment of the present invention, reference can be made to the following 3 formulas:
In the formula, h_i^e(t) represents the updated emotion output of the i-th clause produced by the t-th graph attention layer; N(i) represents the set of clauses connected to the i-th clause by a directly adjacent edge, plus a self-loop edge, and in this embodiment N(i) contains all clauses in the document; α_ij^(t) represents the graph attention weight corresponding to the emotion output, reflecting the degree of relevance between the i-th clause and the j-th clause; W_e^(t) represents the emotion weight of the t-th layer; b_e^(t) represents the emotion bias of the t-th layer; MLP^(t) represents the multi-layer perceptron of the t-th graph attention layer, which maps the concatenated vector to a real number; e_ik^(t) represents the emotion relevance score between vertex i and its associated vertex k in the t-th layer.
It should be understood that the updated reason output is similar to the updated emotion output acquisition process, and therefore the reason output update process and the corresponding formula description are not repeated.
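As an illustrative sketch of one graph attention layer update under the description above (every clause attends to every other clause and to itself via a self-loop, and a single weight vector stands in for the multi-layer perceptron that scores each concatenated pair), one might write:

```python
import numpy as np

def gat_layer_update(H, W, b, mlp_w):
    """One graph-attention update over clause outputs H of shape (n, d).

    mlp_w: (2*d,) vector standing in for the MLP that maps a concatenated
    pair of clause outputs to a real-valued relevance score e_ik.
    All names are illustrative assumptions, not the patent's notation."""
    n, d = H.shape
    scores = np.empty((n, n))
    for i in range(n):
        for k in range(n):
            # Relevance score between clause i and clause k.
            scores[i, k] = np.concatenate([H[i], H[k]]) @ mlp_w
    # Softmax over each clause's neighborhood gives the attention weights.
    scores = np.exp(scores - scores.max(axis=1, keepdims=True))
    alpha = scores / scores.sum(axis=1, keepdims=True)
    # Weighted aggregation followed by a linear map and ReLU.
    return np.maximum(alpha @ H @ W + b, 0.0)
```

Stacking two such layers corresponds to the two-layer configuration adopted in one implementation of the embodiment, letting each clause encode second-order neighbor information.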
Optionally, in an implementation manner provided by the embodiment of the present invention, the preset number of layers is 2 layers.
S140, based on the pairing network, obtaining emotion expression of each clause according to the updated emotion output of each clause, obtaining reason expression of each clause according to the updated reason output of each clause, and pairing the emotion expression and the reason expression of all the clauses in pairs to obtain emotion reason pairs.
It can be understood that the updated emotion output and reason output of each clause are actually Bi-LSTM outputs fused with higher-order information; therefore, to obtain emotion cause pairs, the emotion output and the reason output need to be converted into an emotion representation and a reason representation respectively, i.e., the emotion clause representation and the reason clause representation corresponding to each clause.
It will also be appreciated that the embodiment of the present invention treats the emotion cause pair extraction task as a link prediction task, i.e., predicting potential edges between vertices in a graph. The extraction task is thus converted into determining, for each vertex in the graph (i.e., each clause), whether an edge pointing to another vertex exists; if such an edge exists, a correct emotion cause pair is considered to have been obtained.
It should be further understood that the process of obtaining the emotional cause pairs may be a cartesian product of the emotional representations and the cause representations of all the clauses, so as to obtain all the possible emotional cause pairs. The specific process for acquiring the emotional cause pair can be selected according to actual conditions.
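A minimal sketch of the Cartesian-product pairing mentioned above, which enumerates every (emotion representation, reason representation) combination as a candidate pair; the function name and return layout are illustrative assumptions:

```python
from itertools import product

def candidate_pairs(emotion_reprs, cause_reprs):
    """All |D| x |D| candidate emotion cause pairs via a Cartesian product.

    Returns a list of ((p, emotion_repr), (q, cause_repr)) tuples, where p
    indexes the emotion clause and q indexes the reason clause."""
    return [((p, e), (q, c))
            for (p, e), (q, c) in product(enumerate(emotion_reprs),
                                          enumerate(cause_reprs))]
```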
And S150, inputting all the emotional reason pairs into a prediction network to obtain the emotional reason pair prediction result.
Thus, all emotion cause pairs are predicted; when the reason representation in an emotion cause pair points to the emotion representation, an emotion clause and the reason clause pointing to that emotion clause in the document sample are obtained.
And S160, obtaining the emotional cause pair prediction loss value corresponding to the emotional cause pair prediction result according to a first preset formula, and updating the emotional cause pair extraction model by using the emotional cause pair prediction loss value.
And finally, calculating a corresponding loss value according to the emotional cause pair prediction result to update the emotional cause pair extraction model, and stopping training until a preset condition is reached to obtain the trained emotional cause pair extraction model.
Optionally, in an implementation manner provided by the embodiment of the present invention, the first preset equation includes:
In the formula, L_pair represents the emotion cause pair prediction loss value; M_{p,q} represents the emotion cause pair obtained from the emotion representation of the p-th clause and the reason representation of the q-th clause; when M_{p,q} is predicted correctly, Y_{p,q} takes 1, and otherwise takes 0.
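The first preset formula itself is not reproduced in this text. A standard binary cross-entropy over the |D| × |D| pair matrix, consistent with the described Y_{p,q} labels, might look like the following sketch; this is an assumption for illustration, not necessarily the patent's exact formula:

```python
import numpy as np

def pair_loss(M, Y, eps=1e-12):
    """Binary cross-entropy over the |D| x |D| pair matrix.

    M: predicted pair scores in (0, 1); Y: 1 where the pair is a true
    emotion cause pair, else 0. BCE is assumed here as one standard
    loss consistent with the described labels."""
    M = np.clip(M, eps, 1 - eps)  # guard against log(0)
    return float(-(Y * np.log(M) + (1 - Y) * np.log(1 - M)).mean())
```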
Therefore, the embodiment of the invention extracts the interrelation between the clauses by using the graph attention network, so that the emotion output and the reason output of each clause are fused and updated. This enriches the information contained in the emotion output and the reason output of each clause, allowing the model to complete its prediction based on richer information and improving the prediction accuracy. Furthermore, the embodiment of the invention combines the emotion extraction task and the reason extraction task: the reason representation is updated with the emotion extraction result, i.e., the first emotion clause prediction result, to obtain the reason output, and the emotion representation is updated with the reason clause extraction result, i.e., the first reason clause prediction result, to obtain the emotion output, thereby realizing the interaction of the two clause extraction tasks.
Optionally, referring to fig. 2, which shows a second implementation manner provided by the embodiment of the present invention, S140 includes:
s141, inputting the emotion output after each clause is updated to a second preset formula of the pairing network to obtain emotion expression, and inputting the reason output after each clause is updated to a third preset formula of the pairing network to obtain reason expression;
the second preset equation comprises:
s_i^e = σ(W_e · h_i^e + b_e)

In the formula, s_i^e represents the emotion representation of the i-th clause, σ represents the relu function, h_i^e represents the updated emotion output of the i-th clause, W_e represents the trainable weight corresponding to the emotion representation, and b_e represents the trainable bias corresponding to the emotion representation;
the third preset equation comprises:
r_i^c = σ(W_c · h_i^c + b_c)

In the formula, r_i^c represents the reason representation of the i-th clause, σ represents the relu function, h_i^c represents the updated reason output of the i-th clause, W_c represents the trainable weight corresponding to the reason representation, and b_c represents the trainable bias corresponding to the reason representation.
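The two preset equations above share one form, a ReLU-activated linear map from the updated clause output to the corresponding representation. A sketch with illustrative names:

```python
import numpy as np

def to_representation(h_updated, W, b):
    """ReLU-activated linear map from an updated clause output to its
    emotion (with W_e, b_e) or reason (with W_c, b_c) representation.

    h_updated: (n_clauses, d); W: (d, d_out); b: (d_out,)."""
    return np.maximum(h_updated @ W + b, 0.0)
```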
Further, referring to fig. 3, fig. 3 is a flowchart illustrating a method for training a third emotion reason pair extraction model according to an embodiment of the present invention, that is, in an implementation manner provided by the embodiment of the present invention, S140 includes:
and S142, based on a double affine mechanism, taking the emotion expression of each clause as a central item, taking the reason expression of each clause as a subordinate item, and pairwise pairing all the central items and all the subordinate items based on a preset formula set in a pairing network to obtain a corresponding emotion reason pair.
Wherein, the preset formula set comprises:
In the formula, M̂_{p,q} represents, in the emotion cause pair matrix formed by all emotion cause pairs, the emotion cause pair obtained from the emotion representation of the p-th clause and the reason representation of the q-th clause; for a document sample of length |D|, all possible emotion cause pairs form a |D| × |D| emotion cause pair matrix, in which the rows, indexed by p, correspond to emotion clauses and the columns, indexed by q, correspond to reason clauses; Ŝ_{p,q} represents the initial emotion cause pair obtained from the emotion representation of the p-th clause and the reason representation of the q-th clause; A_{p,q} represents a distance weight between the emotion clause and the reason clause: the closer the emotion representation is to the reason representation, the larger A_{p,q} is; g represents a sigmoid function; M_{p,q} represents the emotion cause item in row p and column q of the emotion cause pair matrix; ϵ is a smoothing parameter.
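The exact biaffine combination is given only in the patent's figures. The following sketch shows one plausible form in which the initial biaffine score Ŝ_{p,q} is modulated by the distance weight A_{p,q} and the smoothing parameter ϵ before the sigmoid g; the specific way these terms are combined here is an assumption for illustration:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def biaffine_pair_matrix(S_emo, S_cause, U, A, eps=1e-6):
    """Biaffine pairing: emotion representations as central items (rows p),
    reason representations as subordinate items (columns q).

    S_emo, S_cause: (|D|, d); U: (d, d) biaffine weight; A: (|D|, |D|)
    distance weights. The gating of the score by (A + eps) is a guess at
    how the distance weight and smoothing parameter enter the formula."""
    S_hat = S_emo @ U @ S_cause.T      # initial emotion cause pair scores
    return sigmoid(S_hat * (A + eps))  # M[p, q] in (0, 1)
```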
Optionally, in an implementation manner provided by the embodiment of the present invention, after S120, the method further includes:
obtaining a clause classification loss value corresponding to the document sample according to a first emotion clause prediction result and a first reason clause prediction result of each clause;
further, S160 includes:
and updating the emotional cause pair extraction model by using the emotional cause pair prediction loss value and the clause classification loss value.
It can be understood that the embodiment of the present invention provides two auxiliary tasks to assist in completing emotion extraction and reason extraction; further, the embodiment also computes losses for the results of the two auxiliary tasks to assist in updating the parameters of the model.
Further, the manner of obtaining the clause classification loss value can be described according to the following formula:
In the formula, L_aux represents the clause classification loss value, ŷ_i^e represents the first emotion clause prediction result, and ŷ_i^c represents the first reason clause prediction result; when ŷ_i^e is predicted correctly, y_i^e takes 1, and otherwise takes 0; when ŷ_i^c is predicted correctly, y_i^c takes 1, and otherwise takes 0.
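As with the pair loss, the clause classification formula is not reproduced in this text. A negative log-likelihood over the first predictions, consistent with the described indicator variables, could be sketched as follows (illustrative, not the patent's exact formula):

```python
import numpy as np

def aux_loss(p_e, p_c, y_e, y_c, eps=1e-12):
    """Clause classification loss over the first emotion / reason clause
    predictions.

    p_e, p_c: (n_clauses,) predicted probabilities for the correct labels;
    y_e, y_c: indicator arrays, 1 where the prediction is correct, else 0."""
    p_e = np.clip(p_e, eps, 1.0)  # guard against log(0)
    p_c = np.clip(p_c, eps, 1.0)
    return float(-(y_e * np.log(p_e)).sum() - (y_c * np.log(p_c)).sum())
```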
Further, in an implementation manner provided by the embodiment of the present invention, after S120, the method further includes:
obtaining a second emotion clause prediction result and a second reason clause prediction result of each clause by utilizing emotion output of each clause and reason output of each clause based on the prediction network;
obtaining a clause prediction loss value corresponding to the document sample according to a second emotion clause prediction result and a second reason clause prediction result of each clause;
the updating the emotion reason pair extraction model by using the emotion reason pair prediction loss value and the clause classification loss value comprises the following steps:
and updating the emotional cause pair extraction model by utilizing the emotional cause pair prediction loss value, the clause classification loss value and the clause prediction loss value.
Specifically, after the emotion clause representation and the reason clause representation of each clause are updated by using each first emotion clause prediction result and first reason clause prediction result to obtain the emotion output and reason output of each clause, the embodiment of the invention also predicts the emotion output and reason output of each clause again according to the updated emotion output and reason output of each clause to obtain a second emotion clause prediction result and a second reason clause prediction result of each clause, and calculates corresponding loss according to the second emotion clause prediction result and the second reason clause prediction result, so that each parameter in the model is updated.
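Combining the three loss values for the parameter update can be sketched as follows; the weighting coefficients are hypothetical, since the patent only states that all three losses are used to update the model:

```python
def total_loss(l_pair, l_aux, l_pred, w_aux=1.0, w_pred=1.0):
    """Combined training objective: the emotion cause pair prediction loss
    plus the clause classification loss and the clause prediction loss.
    The weights w_aux and w_pred are illustrative assumptions."""
    return l_pair + w_aux * l_aux + w_pred * l_pred
```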
Corresponding to the method for training the emotion reason pair extraction model provided by the embodiment of the present invention, an embodiment of the present invention further provides a device for training the emotion reason pair extraction model, referring to fig. 4, fig. 4 shows a schematic structural diagram of the device for training the emotion reason pair extraction model provided by the embodiment of the present invention, in the device 200 for training the emotion reason pair extraction model provided by the embodiment of the present invention, the emotion reason pair extraction model includes a first coding network, a second coding network, a graph attention network, a pairing network and a prediction network, and the device includes:
a first encoding module 210, configured to input a document sample to the first encoding network, so as to obtain an emotion clause representation of each clause and a reason clause representation of each clause in the document sample;
a second encoding module 220, configured to obtain a first emotion clause prediction result and a first reason clause prediction result of each clause, and obtain an emotion output of each clause and a reason output of each clause according to the emotion clause expression and the reason clause expression of each clause, based on a second encoding network, wherein the emotion output of each clause is obtained by the first reason clause prediction result and the emotion clause expression of the clause, and the reason output of each clause is obtained by the first emotion clause prediction result and the reason clause expression of the clause;
a graph attention module 230, configured to input the emotion output and the reason output of each clause into a graph attention network, so as to obtain an updated emotion output and an updated reason output of each clause, where the graph attention network is configured to update the emotion output and the reason output of each clause according to the emotion output and the reason output of each clause;
a matching module 240, configured to, based on the matching network, obtain an emotion representation of each clause according to the updated emotion output of each clause, obtain a reason representation of each clause according to the updated reason output of each clause, and pair-pair match the emotion representations and the reason representations of all clauses to obtain an emotion reason pair;
the prediction module 250 is configured to input all the emotional cause pairs to a prediction network to obtain an emotional cause pair prediction result;
and the updating module 260 is configured to obtain an emotional cause pair prediction loss value corresponding to the emotional cause pair prediction result according to a first preset equation, and update the emotional cause pair extraction model by using the emotional cause pair prediction loss value.
It can be understood that, in the technical solution of the training apparatus for emotion reason pair extraction model disclosed in the embodiment of the present invention, through the synergistic effect of each functional module, the method for training emotion reason pair extraction model shown in fig. 1 in the embodiment is executed, and the implementation and beneficial effects related to the method for training emotion reason pair extraction model are also applicable in this embodiment, and are not described herein again.
The embodiment of the invention also provides computer equipment which comprises a memory and a processor, wherein the memory stores a computer program, and the computer program executes the method for training the emotion reason pair extraction model disclosed in the embodiment when running on the processor.
The embodiment of the invention also provides a computer readable storage medium, wherein a computer program is stored on the computer readable storage medium, and when the computer program runs on a processor, the method for training the emotion reason pair extraction model disclosed in the embodiment is executed.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus and method can be implemented in other ways. The apparatus embodiments described above are merely illustrative and, for example, the flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of apparatus, methods and computer program products according to various embodiments of the present invention. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It should also be noted that, in alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.
In addition, each functional module or unit in each embodiment of the present invention may be integrated together to form an independent part, or each module may exist separately, or two or more modules may be integrated to form an independent part.
The functions, if implemented in the form of software functional modules and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention or a part of the technical solution that contributes to the prior art in essence can be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a smart phone, a personal computer, a server, or a network device, etc.) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The above description is only for the specific embodiments of the present invention, but the scope of the present invention is not limited thereto, and any person skilled in the art can easily conceive of the changes or substitutions within the technical scope of the present invention, and all the changes or substitutions should be covered within the scope of the present invention.
Claims (9)
1. A method for training an emotional cause pair extraction model, wherein the emotional cause pair extraction model comprises a first coding network, a second coding network, a graph attention network, a pairing network and a prediction network, and the method comprises the following steps:
inputting a document sample into the first coding network to obtain emotion clause representation and reason clause representation of each clause in the document sample;
obtaining a first emotion clause prediction result and a first reason clause prediction result of each clause according to the emotion clause expression and the reason clause expression of each clause, and obtaining emotion output and reason output of each clause based on a second coding network, wherein the emotion output of each clause is obtained through the first reason clause prediction result and the emotion clause expression of the clause, and the reason output of each clause is obtained through the first emotion clause prediction result and the reason clause expression of the clause;
inputting the emotion output and the reason output of each clause into a graph attention network to obtain updated emotion output and updated reason output of each clause, wherein the graph attention network is used for updating the emotion output and the reason output of each clause according to the emotion output and the reason output of each clause;
based on the pairing network, obtaining emotion expression of each clause according to the updated emotion output of each clause, obtaining reason expression of each clause according to the updated reason output of each clause, and pairing the emotion expression and the reason expression of all the clauses in pairs to obtain emotion reason pairs;
inputting all the emotional reason pairs into a prediction network to obtain the prediction result of the emotional reason pairs;
obtaining an emotional cause pair prediction loss value corresponding to the emotional cause pair prediction result according to a first preset formula, and updating the emotional cause pair extraction model by using the emotional cause pair prediction loss value, wherein the first preset formula comprises:
In the formula, L_pair represents the emotion cause pair prediction loss value; M_{p,q} represents the emotion cause pair obtained from the emotion representation of the p-th clause and the reason representation of the q-th clause; when M_{p,q} is predicted correctly, Y_{p,q} takes 1, and otherwise takes 0.
2. The method of claim 1, wherein the graph attention network comprises a preset number of layers of graph attention layers connected in sequence, wherein the first layer of graph attention layer is used for updating the emotion output and the reason output of each clause, and each of the remaining layers of graph attention layers of the graph attention network is used for updating the emotion output and the reason output of the previous graph attention layer.
3. The method of claim 1, wherein said deriving an emotion representation for each of said clauses based on said pairing network based on an updated emotion output for each of said clauses and a reason representation for each of said clauses based on an updated reason output for each of said clauses comprises:
inputting the updated emotion output of each clause into a second preset formula of the pairing network to obtain emotion expression, and inputting the updated reason output of each clause into a third preset formula of the pairing network to obtain reason expression;
the second preset equation comprises:
s_i^e = σ(W_e · h_i^e + b_e)

In the formula, s_i^e represents the emotion representation of the i-th clause, σ represents the relu function, h_i^e represents the updated emotion output of the i-th clause, W_e represents the trainable weight corresponding to the emotion representation, and b_e represents the trainable bias corresponding to the emotion representation;
the third preset equation comprises:
r_i^c = σ(W_c · h_i^c + b_c)

In the formula, r_i^c represents the reason representation of the i-th clause, σ represents the relu function, h_i^c represents the updated reason output of the i-th clause, W_c represents the trainable weight corresponding to the reason representation, and b_c represents the trainable bias corresponding to the reason representation.
4. The method of claim 3, wherein pairing the emotion and reason representations of all the clauses pairwise to obtain an emotion reason pair comprises:
based on a double affine mechanism, representing the emotion of each clause as a central item, representing the reason of each clause as an auxiliary item, and pairwise pairing all the central items and all the auxiliary items based on a preset formula set in a pairing network to obtain a corresponding emotion reason pair.
5. The method of claim 1, wherein after obtaining the first emotion clause prediction result and the first reason clause prediction result for each of the clauses based on the emotion clause representation and the reason clause representation for each of the clauses and obtaining the emotion output for each of the clauses and the reason output for each of the clauses based on the second coding network, the method further comprises:
obtaining a clause classification loss value corresponding to the document sample according to a first emotion clause prediction result and a first reason clause prediction result of each clause;
the updating the emotional cause pair extraction model by using the emotional cause pair prediction loss value comprises the following steps:
and updating the emotional cause pair extraction model by using the emotional cause pair prediction loss value and the clause classification loss value.
6. The method of claim 5, wherein after obtaining the first emotion clause prediction result and the first reason clause prediction result for each of the clauses based on the emotion clause representation and the reason clause representation for each of the clauses and obtaining the emotion output for each of the clauses and the reason output for each of the clauses based on the second coding network, the method further comprises:
obtaining a second emotion clause prediction result and a second reason clause prediction result of each clause by utilizing emotion output of each clause and reason output of each clause based on the prediction network;
obtaining a clause prediction loss value corresponding to the document sample according to a second emotion clause prediction result and a second reason clause prediction result of each clause;
the updating the emotion reason pair extraction model by using the emotion reason pair prediction loss value and the clause classification loss value comprises the following steps:
and updating the emotional cause pair extraction model by utilizing the emotional cause pair prediction loss value, the clause classification loss value and the clause prediction loss value.
7. An apparatus for training an emotional cause pair extraction model, wherein the emotional cause pair extraction model comprises a first coding network, a second coding network, a graph attention network, a pairing network and a prediction network, the apparatus comprising:
the first coding module is used for inputting the document sample into the first coding network to obtain the emotion clause representation and the reason clause representation of each clause in the document sample;
a second coding module, configured to obtain, based on a second coding network, a first emotion clause prediction result and a first reason clause prediction result for each clause according to an emotion clause expression and a reason clause expression of each clause, and obtain an emotion output for each clause and a reason output for each clause, where the emotion output for each clause is obtained by the first reason clause prediction result and the emotion clause expression of the clause, and the reason output for each clause is obtained by the first emotion clause prediction result and the reason clause expression of the clause;
the graph attention module is used for inputting the emotion output and the reason output of each clause into a graph attention network to obtain updated emotion output and updated reason output of each clause, wherein the graph attention network is used for updating the emotion output and the reason output of each clause according to the emotion output and the reason output of each clause;
the matching module is used for obtaining the emotion expression of each clause according to the updated emotion output of each clause and obtaining the reason expression of each clause according to the updated reason output of each clause based on the matching network, and pairwise matching the emotion expression and the reason expression of all the clauses to obtain an emotion reason pair;
the prediction module is used for inputting all the emotional reason pairs into a prediction network to obtain the prediction result of the emotional reason pairs;
an updating module, configured to obtain an emotional cause pair prediction loss value corresponding to the emotional cause pair prediction result according to a first preset formula, and update the emotional cause pair extraction model by using the emotional cause pair prediction loss value, where the first preset formula includes:
In the formula, L_pair represents the emotion cause pair prediction loss value; M_{p,q} represents the emotion cause pair obtained from the emotion representation of the p-th clause and the reason representation of the q-th clause; when M_{p,q} is predicted correctly, Y_{p,q} takes 1, and otherwise takes 0.
8. A computer device comprising a memory and a processor, the memory storing a computer program which, when run on the processor, performs the method of training the affective cause pair extraction model according to any of claims 1-6.
9. A computer-readable storage medium, having stored thereon a computer program which, when run on a processor, performs the method of training the affective-cause-pair extraction model according to any of claims 1-6.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210039899.2A CN114065769B (en) | 2022-01-14 | 2022-01-14 | Method, device, equipment and medium for training emotion reason pair extraction model |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114065769A CN114065769A (en) | 2022-02-18 |
CN114065769B true CN114065769B (en) | 2022-04-08 |
Family
ID=80230893
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210039899.2A Active CN114065769B (en) | 2022-01-14 | 2022-01-14 | Method, device, equipment and medium for training emotion reason pair extraction model |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114065769B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2604690A1 (en) * | 2006-10-06 | 2008-04-06 | Accenture Global Services Gmbh | Technology event detection, analysis, and reporting system |
CN103646088A (en) * | 2013-12-13 | 2014-03-19 | 合肥工业大学 | Product comment fine-grained emotional element extraction method based on CRFs and SVM |
CN106484767A (en) * | 2016-09-08 | 2017-03-08 | 中国科学院信息工程研究所 | A kind of event extraction method across media |
CN111382565A (en) * | 2020-03-09 | 2020-07-07 | 南京理工大学 | Multi-label-based emotion-reason pair extraction method and system |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20090234711A1 (en) * | 2005-09-14 | 2009-09-17 | Jorey Ramer | Aggregation of behavioral profile data using a monetization platform |
US20090234861A1 (en) * | 2005-09-14 | 2009-09-17 | Jorey Ramer | Using mobile application data within a monetization platform |
US20140358523A1 (en) * | 2013-05-30 | 2014-12-04 | Wright State University | Topic-specific sentiment extraction |
CN104679825B (en) * | 2015-01-06 | 2018-10-09 | 中国农业大学 | Macroscopic abnormity of earthquake acquisition of information based on network text and screening technique |
CN104731923A (en) * | 2015-03-26 | 2015-06-24 | 无锡中科泛在信息技术研发中心有限公司 | Construction method for Internet product review excavation noumenon lexicon |
- 2022-01-14: Application CN202210039899.2A filed; patent granted as CN114065769B (status: Active)
Non-Patent Citations (2)
Title |
---|
Kiichi Tago et al.; "Influence Analysis of Emotional Behaviors and User Relationships Based on Twitter Data"; Tsinghua Science and Technology; 2018-02-15; Vol. 23, No. 1; pp. 104-113 * |
Dai Jianhua et al.; "Emotion-Cause Pair Extraction Based on Emotion-Dilated Gated CNN"; Data Analysis and Knowledge Discovery; 2020-06-09; Vol. 4, No. 8; pp. 232-237 * |
Also Published As
Publication number | Publication date |
---|---|
CN114065769A (en) | 2022-02-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN110969020B (en) | CNN and attention mechanism-based Chinese named entity identification method, system and medium | |
CN112084331A (en) | Text processing method, text processing device, model training method, model training device, computer equipment and storage medium | |
CN110413844A (en) | Dynamic link prediction technique based on space-time attention depth model | |
JP7417679B2 (en) | Information extraction methods, devices, electronic devices and storage media | |
CN113435211B (en) | Text implicit emotion analysis method combined with external knowledge | |
CN112765370B (en) | Entity alignment method and device of knowledge graph, computer equipment and storage medium | |
CN114358007A (en) | Multi-label identification method and device, electronic equipment and storage medium | |
CN114358201A (en) | Text-based emotion classification method and device, computer equipment and storage medium | |
CN112380835A (en) | Question answer extraction method fusing entity and sentence reasoning information and electronic device | |
CN116258137A (en) | Text error correction method, device, equipment and storage medium | |
CN114358020A (en) | Disease part identification method and device, electronic device and storage medium | |
CN115964459A (en) | Multi-hop inference question-answering method and system based on food safety cognitive map | |
US11941360B2 (en) | Acronym definition network | |
CN113505583A (en) | Sentiment reason clause pair extraction method based on semantic decision diagram neural network | |
CN111368531A (en) | Translation text processing method and device, computer equipment and storage medium | |
CN116414988A (en) | Graph convolution aspect emotion classification method and system based on dependency relation enhancement | |
CN114065769B (en) | Method, device, equipment and medium for training emotion reason pair extraction model | |
CN113239184B (en) | Knowledge base acquisition method and device, computer equipment and storage medium | |
CN114998041A (en) | Method and device for training claim settlement prediction model, electronic equipment and storage medium | |
CN112380326B (en) | Question answer extraction method based on multilayer perception and electronic device | |
CN114090778A (en) | Retrieval method and device based on knowledge anchor point, electronic equipment and storage medium | |
CN114627282A (en) | Target detection model establishing method, target detection model application method, target detection model establishing device, target detection model application device and target detection model establishing medium | |
CN113836910A (en) | Text recognition method and system based on multilevel semantics | |
CN113821610A (en) | Information matching method, device, equipment and storage medium | |
CN113641789A (en) | Viewpoint retrieval method and system based on hierarchical fusion of multi-head attention network and convolutional network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||