CN114969318A - Multi-task standpoint detection method based on multi-graph sparse interaction network - Google Patents
Multi-task standpoint detection method based on multi-graph sparse interaction network
- Publication number
- CN114969318A, CN202210069686.4A
- Authority
- CN
- China
- Prior art keywords
- task
- graph
- emotion
- text
- relation
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/36—Creation of semantic tools, e.g. ontology or thesauri
- G06F16/367—Ontology
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/284—Lexical analysis, e.g. tokenisation or collocates
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
-
- Y—GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
- Y02—TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
- Y02D—CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
- Y02D10/00—Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a multi-task stance detection method based on a multi-graph sparse interaction network. An input text is fed into a multi-graph sparse interaction network model to obtain the stance detection polarity and the emotion classification polarity of the input text. The model comprises a text encoding module, a multi-graph construction module, a multi-graph sparse interaction module and a task-related attention module. The multi-graph construction module constructs a stance task graph, an emotion task graph and a task relation graph; the multi-graph sparse interaction module updates the intra-graph node features of the stance task graph, the emotion task graph and the task relation graph, and updates the sparse interaction of node features between the graphs; the task-related attention module calculates the stance detection polarity and the emotion classification polarity of the input text. The technical scheme of the invention improves the accuracy of stance detection on tweet text.
Description
Technical Field
The invention relates to the technical field of stance detection, in particular to a multi-task stance detection method based on a multi-graph sparse interaction network.
Background
Existing stance detection methods mainly use machine learning and deep learning. Machine learning methods require extensive feature engineering: features are extracted manually, and a machine learning model such as a support vector machine (SVM), a decision tree or a random forest is then trained on the extracted features. The main disadvantages are that feature engineering is time-consuming and that manually selected features carry limited information, which reduces model performance to some extent; in addition, most machine learning methods have many hyperparameters whose optimal values must be selected manually, which is time-consuming and labor-intensive and prevents large-scale application. Early deep-learning-based stance detection methods mainly used recurrent neural networks (RNNs), convolutional neural networks (CNNs) and attention mechanisms to capture the features of a news text automatically, thereby improving stance detection performance. Following the popularity of Transformers, some recent work improves the performance of the stance detection task with the BERT model, mainly by exploiting BERT's strong word embedding capability. The deep-learning-based stance detection methods most relevant to the invention use auxiliary information, such as emotion information, expression information, or the subjective or objective nature of the text, to construct an auxiliary task that helps improve performance on the stance detection task.
In such a method, an auxiliary LSTM network is designed to extract emotion features and a main LSTM network extracts the features of the main task; the emotion features and the stance features are then simply concatenated to predict the stance of a news text.
In the prior art, deep-learning-based stance detection methods such as those built on RNNs and CNNs cannot capture the relevance between the target expression and the stance expression. Auxiliary-task-based stance detection methods simply concatenate the emotion features and the stance features, ignoring the complexity of the relationship between the tasks and treating the two tasks as equally important, which produces substantial negative transfer and reduces model performance. Moreover, when an auxiliary task is used, only the performance of the main task is considered, and no attention is paid to improving the performance of the auxiliary task. Therefore, the accuracy of prior-art stance detection on tweet text is low.
Disclosure of Invention
The invention provides a multi-task stance detection method based on a multi-graph sparse interaction network, which improves the accuracy of stance detection on tweet text.
An embodiment of the invention provides a multi-task stance detection method based on a multi-graph sparse interaction network, which comprises the following steps:
inputting an input text into a multi-graph sparse interaction network model to obtain the stance detection polarity and the emotion classification polarity of the input text; the multi-graph sparse interaction network model is a graph convolutional neural network model and comprises a text encoding module, a multi-graph construction module, a multi-graph sparse interaction module and a task-related attention module; the input text comprises a tweet text and a target text;
the text encoding module is used for processing the input text into a plurality of word vectors for use by the multi-graph construction module, the multi-graph sparse interaction module and the task-related attention module;
the multi-graph construction module is used for constructing a stance task graph, an emotion task graph and a task relation graph of the multi-graph sparse interaction network model; the stance task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the stance detection task; the emotion task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the emotion classification task; the task relation graph is constructed according to the word category importance weights of the words of the input text and the syntactic dependency tree of the input text;
the multi-graph sparse interaction module is used for updating the intra-graph node features of the stance task graph, the emotion task graph and the task relation graph and for updating the sparse interaction of node features between the graphs; the sparse interaction between the graphs comprises sparse interaction between the stance task graph and the task relation graph and sparse interaction between the emotion task graph and the task relation graph;
and the task-related attention module is used for calculating the stance detection polarity and the emotion classification polarity of the input text according to the stance feature representation of the stance task graph and the emotion feature representation of the emotion task graph.
Further, constructing the stance task graph and the emotion task graph of the multi-graph sparse interaction network model comprises the following steps:
constructing a first syntactic dependency tree according to the syntactic structure of the tweet text, and acquiring a root word set of the first syntactic dependency tree;
adding the words of the target text to the first syntactic dependency tree according to the word connection relation between the target text and the root word set, to obtain a second syntactic dependency tree of the input text;
calculating a first pragmatic weight and a second pragmatic weight of the tweet text and a third pragmatic weight of the target text, wherein the first pragmatic weight is the pragmatic weight of each word of the tweet text in the stance detection task, and the second pragmatic weight is the pragmatic weight of each word of the tweet text in the emotion classification task;
and constructing the stance task graph according to the first pragmatic weight, the third pragmatic weight and the second syntactic dependency tree, and constructing the emotion task graph according to the second pragmatic weight and the second syntactic dependency tree.
Further, constructing the task relation graph of the multi-graph sparse interaction network model specifically comprises:
constructing word category importance weights, and constructing a stance label importance relation vector and an emotion label importance relation vector according to the word category importance weights; the word category importance weight expresses the relation between a word and the label categories of the different tasks;
and calculating a task interaction relation according to the stance label importance relation vector and the emotion label importance relation vector, and constructing the task relation graph according to the task interaction relation.
Further, updating the intra-graph node features of the stance task graph, the emotion task graph and the task relation graph specifically comprises:
performing intra-graph node feature updating on the stance task graph, the emotion task graph and the task relation graph respectively according to a graph convolution formula, in which task ∈ {st, se, re} denotes the stance task graph, the emotion task graph and the task relation graph respectively, I is an identity matrix, σ is a nonlinear activation function, and l denotes the index of the current network layer; the remaining terms of the formula are the parameters of the l-th network layer and the adjacency matrix of the stance task graph, the emotion task graph or the task relation graph, whose diagonal is filled with 1.
Further, the sparse interaction of node features between the graphs is updated according to formulas whose node feature matrices are the matrix representations of the node features of the stance task graph $g_{st}$, the emotion task graph $g_{se}$ and the task relation graph $g_{re}$ after intra-graph updating with the GCN; α is a hyperparameter that controls how much node information from the stance task graph and the emotion task graph is fused into the task relation graph; and l is the index of the current network layer.
Furthermore, a first loss function encourages the interaction between the stance task graph and the task relation graph, and between the emotion task graph and the task relation graph, to be sparse; a second loss function encourages multiple interactions between the stance task graph and the task relation graph and between the emotion task graph and the task relation graph.
Further, the first loss function and the second loss function are defined over the sparse mask matrix of the stance task and the sparse mask matrix of the emotion task, where l denotes the index of the current network layer and L denotes the preset total number of network layers.
Further, the stance feature representation $r_{st}$ of the stance task graph is calculated by a formula in which $\alpha$ is the attention weight of the stance feature representation, $h_i$ is the feature vector of the i-th word after encoding by the BERT model, and m + n is the length of the input text.
Further, the emotion feature representation $r_{se}$ of the emotion task graph is calculated by a formula in which $\alpha'$ is the attention weight of the emotion feature representation, the node term is the feature representation of the i-th node output by the emotion task graph $g_{se}$, and m + n is the length of the input text.
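The formulas for the two feature representations are not reproduced in this text. The sketch below assumes they are attention-weighted sums of the per-word feature vectors, which is the usual reading of "attention weight" but is not confirmed by the source. The function name `attention_pool` and the toy numbers are hypothetical, and how the weights themselves are produced is left open.

```python
from typing import List, Sequence


def attention_pool(weights: Sequence[float],
                   features: Sequence[Sequence[float]]) -> List[float]:
    """Return sum_i weights[i] * features[i] as a single vector.

    Assumption: the representation is a weighted sum over the m + n
    word (node) features; the weights are taken as given.
    """
    if len(weights) != len(features):
        raise ValueError("one weight per node feature is required")
    dim = len(features[0])
    pooled = [0.0] * dim
    for a, h in zip(weights, features):
        for k in range(dim):
            pooled[k] += a * h[k]
    return pooled


# Toy input: m + n = 3 words, feature dimension 2, weights summing to 1.
h = [[1.0, 0.0], [0.0, 1.0], [2.0, 2.0]]
alpha = [0.5, 0.25, 0.25]
r_st = attention_pool(alpha, h)
```

The same helper would serve for the emotion representation by passing the emotion attention weights and the node features output by the emotion task graph.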
Further, the stance detection polarity and the emotion classification polarity of the input text are calculated according to the following formula:
$$y_{task} = \mathrm{softmax}(W_{task}\, r_{task} + b_{task})$$
where $y_{task}$, $task \in \{st, se\}$, is the stance detection polarity or the emotion classification polarity predicted by the multi-graph sparse interaction network model, $W_{task}$ is the weight of the fully connected layer, $b_{task}$ is the bias corresponding to $W_{task}$, $r_{task}$ is the stance or emotion feature representation, and softmax is the activation function.
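As a minimal illustration of the classification formula above, the following sketch applies a fully connected layer followed by softmax to a feature representation. The weight values, the three-class output and the helper names are hypothetical and chosen only to make the example concrete.

```python
import math
from typing import List, Sequence


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of logits."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def classify(W: Sequence[Sequence[float]], b: Sequence[float],
             r: Sequence[float]) -> List[float]:
    """Compute softmax(W r + b) for one task."""
    logits = [sum(w_k * r_k for w_k, r_k in zip(row, r)) + b_j
              for row, b_j in zip(W, b)]
    return softmax(logits)


# Hypothetical 3-class head over a 2-dimensional feature representation.
W_st = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
b_st = [0.0, 0.0, 0.0]
r_st = [2.0, 1.0]
y_st = classify(W_st, b_st, r_st)
predicted = max(range(len(y_st)), key=y_st.__getitem__)
```

Each task (st and se) would have its own weight matrix and bias, so the same function is called once per task.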
The embodiment of the invention has the following beneficial effects:
the invention provides a multi-task stance detection method based on a multi-graph sparse interaction network. By jointly training the emotion analysis task and the stance detection task, the invention constructs a task-specific graph for each task (the stance task graph and the emotion task graph) and a task-related graph (the task relation graph), together with a sparse interaction module between the graphs. This realizes sparse interaction between the stance task graph and the emotion task graph, facilitates information sharing between the tasks, and improves the performance of the multi-graph sparse interaction network model on each task.
Drawings
Fig. 1 is a schematic structural diagram of a multi-graph sparse interaction network model according to an embodiment of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it is obvious that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
As shown in fig. 1, in the multi-task stance detection method based on the multi-graph sparse interaction network provided by an embodiment of the present invention, an input text is fed into a multi-graph sparse interaction network model to obtain the stance detection polarity and the emotion classification polarity of the input text; the multi-graph sparse interaction network model is a graph convolutional neural network model and comprises a text encoding module, a multi-graph construction module, a multi-graph sparse interaction module and a task-related attention module; the input text comprises a tweet text and a target text;
the text encoding module is used for processing the input text into a plurality of word vectors for use by the multi-graph construction module, the multi-graph sparse interaction module and the task-related attention module;
the multi-graph construction module is used for constructing a stance task graph, an emotion task graph and a task relation graph of the multi-graph sparse interaction network model; the stance task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the stance detection task; the emotion task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the emotion classification task; the task relation graph is constructed according to the word category importance weights of the words of the input text and the syntactic dependency tree of the input text;
the multi-graph sparse interaction module is used for updating the intra-graph node features of the stance task graph, the emotion task graph and the task relation graph and for updating the sparse interaction of node features between the graphs; the sparse interaction between the graphs comprises sparse interaction between the stance task graph and the task relation graph and sparse interaction between the emotion task graph and the task relation graph;
and the task-related attention module is used for calculating the stance detection polarity and the emotion classification polarity of the input text according to the stance feature representation of the stance task graph and the emotion feature representation of the emotion task graph.
As an embodiment, the task relation graph of the model is constructed by the following steps:
constructing word category importance weights, and constructing a stance label importance relation vector and an emotion label importance relation vector according to the word category importance weights;
and calculating a task interaction relation according to the stance label importance relation vector and the emotion label importance relation vector, and constructing the task relation graph according to the task interaction relation.
As one of the embodiments, the text encoding module uses BERT to encode the target words T of the target text and the tweet text content C. First, the tweet text content C and the target words T are concatenated to form the input text S, where $S = \{t_1, \dots, t_m, w_1, w_2, \dots, w_n\}$. S is then converted into the input format of the BERT model: [CLS] $t_1 \dots t_m\ w_1\ w_2 \dots w_n$ [SEP]. It is then fed into the BERT network model to capture the contextual features of the input text. The process is defined by the following formula:
H=BERT(S)
where H is the output of the BERT network model, $H = \{h_1, h_2, \dots, h_{m+n}\}$. Each element of H is the feature representation of one word of the input text; $h_t$ is the feature representation of the t-th word, including its context information.
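The input formatting step can be sketched as follows. This only assembles the token sequence [CLS] target words, tweet words [SEP] described above; it does not call a BERT implementation, and a real tokenizer would further split words into subword pieces. The function name and the example words are hypothetical.

```python
from typing import List, Sequence


def build_bert_input(target_words: Sequence[str],
                     tweet_words: Sequence[str]) -> List[str]:
    """Return [CLS] t_1..t_m w_1..w_n [SEP] as a flat token list."""
    return ["[CLS]"] + list(target_words) + list(tweet_words) + ["[SEP]"]


# Hypothetical target (m = 2) and tweet (n = 4).
target = ["climate", "change"]
tweet = ["we", "must", "act", "now"]
tokens = build_bert_input(target, tweet)
```

The encoder output H would then contain one feature vector per word position between the two special tokens, m + n vectors in total.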
As one embodiment, the stance task graph and the emotion task graph are constructed by the following steps:
constructing a first syntactic dependency tree according to the syntactic structure of the tweet text, and acquiring a root word set of the first syntactic dependency tree;
adding the words of the target text to the first syntactic dependency tree according to the word connection relation between the target text and the root word set, to obtain a second syntactic dependency tree of the input text; calculating a first pragmatic weight and a second pragmatic weight of the tweet text and a third pragmatic weight of the target text, wherein the first pragmatic weight is the pragmatic weight of each word of the tweet text in the stance detection task (that is, the pragmatic weight of the word for the stance task, calculated from the word frequency and the relative co-occurrence frequency of the word with the stance labels (support and oppose) over the whole corpus); the second pragmatic weight is the pragmatic weight of each word of the tweet text in the emotion classification task (that is, the pragmatic weight of the word for the emotion task, calculated from the word frequency and the relative co-occurrence frequency of the word with the emotion labels over the whole corpus). The pragmatic weight refers to the dependency (or influence) of a word of the text on a specific target.
And constructing the stance task graph according to the first pragmatic weight, the third pragmatic weight and the second syntactic dependency tree, and constructing the emotion task graph according to the second pragmatic weight and the second syntactic dependency tree.
Specifically, each tweet text is parsed with a syntactic parsing tool to construct a syntactic dependency tree. Words are represented as nodes of the graph (that is, the nodes are the word vectors of the words in the tweet text), and the relations between words in the syntactic dependency tree are represented as edges between the nodes, so as to construct a base graph of the target text T and the tweet text C. The steps are as follows:
constructing a first syntactic dependency tree according to the syntactic structure of the tweet text C, and obtaining the root word set $W_r$ of the first syntactic dependency tree through the syntactic parser.
Since the target text T is not a complete sentence but a phrase or a word, it cannot be modeled as a syntactic dependency tree. Therefore, according to the word connection relation between the target text and the root word set, the embodiment of the present invention adds the words of the target text to the first syntactic dependency tree to obtain a second syntactic dependency tree of the input text S.
In the formula defining the second syntactic dependency tree, $W_r$ denotes the root words of the first syntactic dependency tree, and $w_i$ and $w_j$ are any two different words of the input text S, which comprises the tweet text and the target text.
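The formula for the second syntactic dependency tree is not reproduced in this text. The sketch below shows one plausible reading, assuming that every target word is linked to every root word of the first tree and that edges are undirected. Both are assumptions, and the function names and example words are hypothetical.

```python
from typing import Iterable, Set, Tuple

Edge = Tuple[str, str]


def undirected(a: str, b: str) -> Edge:
    """Store an undirected edge in a canonical (sorted) order."""
    return (a, b) if a <= b else (b, a)


def extend_dependency_edges(tree_edges: Iterable[Edge],
                            root_words: Iterable[str],
                            target_words: Iterable[str]) -> Set[Edge]:
    """Add target-word/root-word edges to the edges of the first tree.

    Assumption: each target word connects to each root word.
    """
    roots = list(root_words)
    edges = {undirected(a, b) for a, b in tree_edges}
    for t in target_words:
        for r in roots:
            edges.add(undirected(t, r))
    return edges


# Hypothetical first tree for the tweet "we must act now", rooted at "act".
first_tree = [("act", "we"), ("act", "must"), ("act", "now")]
roots = ["act"]
target = ["climate", "change"]
second_tree = extend_dependency_edges(first_tree, roots, target)
```

In practice the first tree would come from a dependency parser, and nodes would be word positions rather than strings so that repeated words stay distinct.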
In order to capture the importance of words in the input text and the interaction features between words, the word frequency and the task-specific pragmatic weights of each word need to be calculated. The frequency with which each word of the input text appears in the whole corpus is $N(w_i)/N$, where $N(w_i)$ is the number of times the word $w_i$ appears in the corpus and N is the total number of words in the corpus. For the different tasks, the embodiment of the invention calculates the pragmatic weights of a word, comprising a first pragmatic weight $\phi_{task}(w_i)|_{task=stance}$ and a second pragmatic weight $\phi_{task}(w_i)|_{task=sentiment}$. When the first and second pragmatic weights are calculated, only categories that carry practical meaning are considered; the two label categories that contain no useful information, namely neutral stance and neutral emotion, are omitted.
In the calculation, $N(w_i, label^+)$ and $N(w_i, label^-)$ denote the number of times the word $w_i$ appears under the stance task labels "support" and "oppose", or under the emotion task labels "positive" and "negative", respectively; $N(label^+)$ and $N(label^-)$ denote the total counts of the stance task labels "support" and "oppose", or of the emotion task labels "positive" and "negative", respectively; μ is the mean and δ is the standard deviation.
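The pragmatic weight formula itself is not reproduced in this text, so the sketch below only gathers the corpus statistics it is defined over: the word frequency and the per-label co-occurrence counts, with the neutral label skipped. The corpus layout (a list of word-list and label pairs), the label strings and the function name are assumptions made for illustration.

```python
from collections import Counter
from typing import Dict, List, Sequence, Tuple


def corpus_statistics(corpus: Sequence[Tuple[List[str], str]],
                      positive_label: str, negative_label: str):
    """Collect word frequency and label co-occurrence counts.

    Returns (freq, word_pos, word_neg, label_count), where
    freq[w] = N(w) / N, word_pos[w] = N(w, label+),
    word_neg[w] = N(w, label-), label_count[c] = N(c).
    """
    word_count: Counter = Counter()
    word_pos: Counter = Counter()
    word_neg: Counter = Counter()
    label_count: Counter = Counter()
    for words, label in corpus:
        word_count.update(words)
        label_count[label] += 1
        if label == positive_label:
            word_pos.update(words)
        elif label == negative_label:
            word_neg.update(words)
        # Any other label (for example neutral) adds no co-occurrence counts.
    total = sum(word_count.values())
    freq: Dict[str, float] = {w: c / total for w, c in word_count.items()}
    return freq, word_pos, word_neg, label_count


# Hypothetical three-tweet corpus for the stance task.
corpus = [(["tax", "good"], "support"),
          (["tax", "bad"], "oppose"),
          (["tax", "law"], "neutral")]
freq, word_pos, word_neg, label_count = corpus_statistics(
    corpus, "support", "oppose")
```

The same routine, called with the emotion labels instead, would yield the counts for the second pragmatic weight.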
The third pragmatic weight of the target text is calculated by a separate formula. It is used to establish the relation between the target text and the tweet text, that is, to construct the edges between the target text and the tweet text in the graph.
A first adjacency matrix of the stance task graph is calculated according to the first pragmatic weight, the third pragmatic weight and the second syntactic dependency tree, and the stance task graph is obtained from the first adjacency matrix.
A second adjacency matrix of the emotion task graph is calculated according to the second pragmatic weight and the second syntactic dependency tree, and the emotion task graph is obtained from the second adjacency matrix.
Specifically, in the formulas for the first adjacency matrix and the second adjacency matrix, $s_i$ and $s_j$ are words of the input text.
As an embodiment, the task relation graph of the multi-graph sparse interaction network model is constructed by the following steps:
constructing word category importance weights, and constructing a stance label importance relation vector and an emotion label importance relation vector according to the word category importance weights;
and calculating a task interaction relation according to the stance label importance relation vector and the emotion label importance relation vector, and constructing the task relation graph according to the task interaction relation.
Specifically, considering the fine-grained interaction between different tasks, that is, that the relation between the tasks differs across categories, the embodiment of the present invention uses the word category importance weight to relate each word to the label categories of the different tasks. This yields label importance relation features of different words for different tasks, which helps information interaction between the tasks and improves the task performance of the multi-graph sparse interaction network model. The word category importance weight is calculated for each word $w_i \in C$ of the tweet text and each label category $c_i$ of the corresponding task (for example, when the task is stance detection, $c_i \in \{support, oppose\}$) from three counts: the number of times $w_i$ appears under label category $c_i$ of the task; the total number of word occurrences under label category $c_i$ of the task; and $|w_i|$, the number of times $w_i$ appears in the corpus. Here task denotes the stance task or the emotion task. The word category importance weight expresses how important a word is to the different label categories of the different tasks, and is used to capture the similarity between the two tasks at the task-label level. First, the relation between a word and each label category of the stance task is calculated to form a vector, the stance label importance relation vector; then the relation between the word and each label category of the emotion task is calculated to form a vector, the emotion label importance relation vector.
Finally, the similarity of the two vectors is calculated and used as the word's importance weight across the label categories of the tasks, measuring the similarity relation of the word between the two tasks.
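The text does not state which similarity measure is applied to the two label importance relation vectors. The sketch below assumes cosine similarity, a common choice for comparing vectors of equal length. The measure, the function name and the example values are all assumptions.

```python
import math
from typing import Sequence


def cosine_similarity(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine of the angle between u and v; 0.0 if either is all zeros."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0.0 or norm_v == 0.0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical vectors for one word: (support, oppose) and (positive, negative).
stance_vec = [0.8, 0.2]
emotion_vec = [0.6, 0.4]
xi = cosine_similarity(stance_vec, emotion_vec)
```

This presumes both tasks have the same number of non-neutral label categories, so that the two vectors can be compared component by component.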
Specifically, a label importance relation vector is constructed for each task from the calculated word category importance weights: the stance label importance relation vector $\Phi_{stance}(w_i)$ and the emotion label importance relation vector $\Phi_{sentiment}(w_i)$.
The importance relation vector represents the importance relation of each word to the different label categories. Here task denotes the stance task or the emotion task, $c_i$ denotes a label of the task, and each component is the normalized form of the word category importance weight, normalized using the mean and standard deviation of the weights. Through this calculation, the importance relation vectors of a word with respect to the label categories under the different tasks are obtained, namely the stance label importance relation vector $\Phi_{stance}(w_i)$ and the emotion label importance relation vector $\Phi_{sentiment}(w_i)$.
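The normalization formula is not reproduced in this text. Because it is described in terms of a mean and a standard deviation, the sketch below assumes a z-score, (x - mean) / std, using the population standard deviation. The choice of z-score, the population form, and the set of values being normalized together are all assumptions.

```python
from statistics import mean, pstdev
from typing import List, Sequence


def z_normalize(values: Sequence[float]) -> List[float]:
    """Return (x - mean) / std for each x; all zeros if std is 0."""
    mu = mean(values)
    sd = pstdev(values)  # population standard deviation (assumed)
    if sd == 0:
        return [0.0] * len(values)
    return [(v - mu) / sd for v in values]


# Hypothetical raw word category importance weights.
weights = [1.0, 2.0, 3.0]
normalized = z_normalize(weights)
```

After this step the normalized weights have zero mean and unit variance, which puts the stance and emotion vectors on a comparable scale.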
The word-level task interaction relation $\xi(w_i)$ is calculated from the stance label importance relation vector and the emotion label importance relation vector, where sta denotes the stance detection task and se denotes the emotion classification task. Based on the second syntactic dependency tree of the input text S, the task interaction relation $\xi(w_i)$ is standardized and then used to construct the adjacency matrix of the task relation graph, from which the task relation graph is obtained.
As one embodiment, the intra-graph node features of the stance task graph, the emotion task graph and the task relation graph are updated, that is, a horizontal intra-graph update is performed. Specifically, the embodiment of the present invention uses a graph convolutional network (GCN) to update each graph iteratively; the update is independent for each graph (only the horizontal update within the graph is considered), that is, each graph updates its node features on its own. The intra-graph node features of the stance task graph, the emotion task graph and the task relation graph are each updated by a graph convolution formula.
In the formula, task ∈ {st, se, re} denotes the stance task graph, the emotion task graph and the task relation graph respectively; the input is the node feature matrix output by the (l-1)-th network layer; the weight term holds the parameters of the l-th network layer; I is an identity matrix; σ is a nonlinear activation function; the adjacency matrix has its diagonal filled with 1; the remaining term characterizes the convolution kernel of the graph convolutional network, as in a conventional graph convolutional network; and l denotes the index of the current network layer. The initial node features are the word vectors output by the text encoding module.
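The update formula is not reproduced in this text. The sketch below assumes the widely used GCN propagation rule, $\sigma(\tilde{D}^{-1/2}\tilde{A}\tilde{D}^{-1/2} H^{(l-1)} W^{(l)})$ with $\tilde{A} = A + I$, and ReLU as the activation. Whether the patent applies this exact normalization is not confirmed by the source. It also assumes non-negative edge weights so that the degrees stay positive.

```python
import math
from typing import List

Matrix = List[List[float]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Plain dense matrix product a @ b."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]


def gcn_layer(adj: Matrix, feats: Matrix, weight: Matrix) -> Matrix:
    """One GCN layer: ReLU(D^-1/2 (A + I) D^-1/2 @ feats @ weight)."""
    n = len(adj)
    # Fill the diagonal with 1 (add self-loops).
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    deg = [sum(row) for row in a_tilde]
    # Symmetric degree normalization (assumed, not stated in the source).
    norm = [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]
    out = matmul(matmul(norm, feats), weight)
    return [[max(0.0, x) for x in row] for row in out]


# Hypothetical two-node graph with one edge and identity weights.
adj = [[0.0, 1.0], [1.0, 0.0]]
feats = [[1.0, 0.0], [0.0, 1.0]]
weight = [[1.0, 0.0], [0.0, 1.0]]
h_next = gcn_layer(adj, feats, weight)
```

Each of the three graphs would call this layer with its own adjacency matrix and its own weights, which matches the statement that the graphs are updated independently.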
The stance task graph $g_{st}$ and the emotion task graph $g_{se}$ each interact sparsely with the task relation graph $g_{re}$. This realizes the interaction between $g_{st}$ and $g_{se}$ and helps the different tasks share information. The connections between $g_{st}$ and $g_{re}$ and between $g_{se}$ and $g_{re}$ are initialized, and a sparse mask matrix is defined for the stance task and another for the emotion task. Each mask matrix z represents a set of random binary variables: one controls the interaction between the stance task graph and the task relation graph, and the other controls the interaction between the emotion task graph and the task relation graph. Therefore, during the graph interaction iteration at layer l, the sparse interaction of the node features between the graphs is updated according to formulas (1) to (3), i.e., the sparse interaction between the stance task graph and the task relation graph and the sparse interaction between the emotion task graph and the task relation graph:
where the two edge parameters are those of the connecting edges between the stance task graph and the task relation graph and between the emotion task graph and the task relation graph, respectively; l denotes the index of the current network layer; the three feature matrices are the matrix representations of the node features of the stance task graph $g_{st}$, the emotion task graph $g_{se}$, and the task relation graph $g_{re}$ after the intra-graph update by the GCN; and α is a hyper-parameter that controls how much node information from the stance task graph and from the emotion task graph is fused into the task relation graph.
Because the random binary variables represented by the mask matrix z are discrete, they are not differentiable and cannot be optimized directly by gradient-based training of a deep learning model. The embodiment therefore uses the Gumbel-Softmax distribution to relax the discrete random variable z into a continuous variable v, so that it fits the update process of the multi-graph sparse interaction network model. The calculation formula is:
where i and j take the values 0 and 1, τ is a scaling (temperature) parameter, $G_l$ is an independent and identically distributed sample drawn from the standard Gumbel distribution Gumbel(0, 1), and $\pi_l = [1 - z_l, z_l]$, in which $z_l$ is the mask matrix z expressed as a set of random binary variables.
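The relaxation can be illustrated with the standard Gumbel-Softmax form, in which the soft gate is $v(j) = \exp((\log \pi(j) + G(j))/\tau) / \sum_i \exp((\log \pi(i) + G(i))/\tau)$ for $i, j \in \{0, 1\}$. The sketch below assumes that the patent uses this standard form. The function names, the `eps` guard, and the example values of `z` and `tau` are hypothetical.

```python
import math
import random

def sample_gumbel(rng):
    """Draw one sample from the standard Gumbel(0, 1) distribution."""
    u = rng.random()
    # Keep u strictly inside (0, 1) so both logarithms are defined.
    u = min(max(u, 1e-12), 1.0 - 1e-12)
    return -math.log(-math.log(u))

def gumbel_softmax_gate(z, tau, rng):
    """Relax a binary gate with keep-probability z into a soft pair [v0, v1]."""
    eps = 1e-12
    pi = [1.0 - z, z]                       # pi = [1 - z, z]
    logits = [(math.log(p + eps) + sample_gumbel(rng)) / tau for p in pi]
    m = max(logits)                         # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
v = gumbel_softmax_gate(z=0.8, tau=0.5, rng=rng)
```

A smaller `tau` pushes `v` closer to a one-hot (nearly binary) gate, while a larger `tau` makes it smoother; the output stays differentiable with respect to `z` in either case.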
To improve the training efficiency of the model, a first loss function (sparse regularization) encourages the interaction between the stance task graph and the task relation graph and between the emotion task graph and the task relation graph to be sparse. The first loss function is:
where l denotes the index of the current network layer and L denotes the total number of network layers. To prevent the interaction among the graphs from becoming so sparse that the whole multi-graph sparse interaction network model splits into three independent graph networks, a second loss function (sharing regularization) encourages more interaction between the stance task graph and the task relation graph and between the emotion task graph and the task relation graph, i.e., more interaction at the lower layers so that low-level information is shared among the tasks. The second loss function is:
The multi-graph sparse interaction module shares information between the two tasks so that each obtains information that helps its training, and it filters out noise that would harm training. This improves the efficiency and quality of sharing between the tasks and, in turn, the performance of the multi-graph sparse interaction network model on each task.
In one embodiment, the task-related attention module receives the stance feature representation $g_{st}$ of the stance task graph and the emotion feature representation $g_{se}$ of the emotion task graph. For the stance detection task, a mask mechanism filters out the non-target words in order to obtain a stance feature representation related to the target. Specifically, a mask matrix is designed that sets the positions of target words to 1 and the positions of non-target words to 0, which yields the masked feature representation of the stance task graph. A retrieval-based attention mechanism is then used to obtain a richer stance feature representation related to the target words, whose attention weight α is calculated according to the following formula:
where h is the output of the BERT network model and $h_t$ is the feature representation of the t-th word after encoding by the BERT network model; m + n is the length of the input text in words (the target text has length m and the tweet text has length n); the masked output of the stance task graph supplies the feature representation of the i-th node (i.e., the i-th word vector); $\beta_t$ is the attention weight of the t-th word vector, $\beta_i$ is the attention weight of the stance feature for the i-th word vector, and $\alpha_t$ is the normalized attention weight of the t-th word vector. The attention weights of the stance feature representation express how much attention each word in the context pays to the stance features.
Then, the stance feature representation $r_{st}$ of the stance task graph is calculated according to the following formula:
where $h_i$ is the feature vector of the i-th word encoded by the BERT model and $\alpha_i$ is the attention weight of the stance feature representation for the i-th word vector.
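The masking, attention, and pooling steps described above can be sketched as follows. The sketch assumes a common retrieval-based attention formulation: the raw score $\beta_t$ is the sum of dot products between the BERT vector $h_t$ and the masked graph features, $\alpha_t$ is the softmax of $\beta_t$, and $r_{st} = \sum_t \alpha_t h_t$. The patent's own formulas are not reproduced in this text, so the scoring function and all names in the code are assumptions.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def stance_representation(h, g, target_mask):
    """Compute attention weights and the pooled stance representation.

    h: BERT-encoded word vectors, one per word (length m + n).
    g: GCN output node features of the stance task graph, same length.
    target_mask: 1 for target words, 0 for non-target words.
    """
    # Mask mechanism: zero out the graph features of non-target words.
    g_masked = [[m * x for x in vec] for vec, m in zip(g, target_mask)]
    # Raw score beta_t: similarity of word t to all masked graph nodes (assumed form).
    beta = [sum(dot(h_t, g_i) for g_i in g_masked) for h_t in h]
    # Normalize the scores into attention weights alpha_t with a softmax.
    top = max(beta)
    exps = [math.exp(b - top) for b in beta]
    total = sum(exps)
    alpha = [e / total for e in exps]
    # Pooled representation r_st: attention-weighted sum of the BERT vectors.
    dim = len(h[0])
    r = [sum(alpha[t] * h[t][k] for t in range(len(h))) for k in range(dim)]
    return alpha, r

# Toy input: the first word is the target, the other two are tweet words.
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
g = [[0.5, 0.5], [0.2, 0.1], [0.3, 0.9]]
alpha, r_st = stance_representation(h, g, target_mask=[1, 0, 0])
```

Under the same assumptions, the emotion feature representation would follow the same steps on the output of the emotion task graph.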
Similarly, the emotion feature representation $r_{se}$ of the emotion task graph is calculated according to formulas (4) to (6):
where the node term denotes the feature representation of the i-th node (i.e., word vector) in the output $g_{se}$ of the emotion task graph, $\alpha'$ is the attention weight of the emotion feature representation, $\beta'_t$ is the attention weight of the emotion feature for the t-th word vector, and $\beta'_i$ is the attention weight of the emotion feature for the i-th word vector.
After the final stance feature representation $r_{st}$ and emotion feature representation $r_{se}$ are obtained, a fully connected layer fuses the text features with the rich context features and outputs the stance detection polarity and the emotion classification polarity of the input text:
$$y_{task} = \mathrm{softmax}(W_{task}\, r_{task} + b_{task})$$
where $y_{task}$, $task \in \{st, se\}$, is the stance detection polarity or the emotion classification polarity predicted by the multi-graph sparse interaction network model, $W_{task}$ is the weight of the fully connected layer, $b_{task}$ is the bias corresponding to $W_{task}$, and softmax is the activation function.
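The classification step $y_{task} = \mathrm{softmax}(W_{task} r_{task} + b_{task})$ can be sketched in a few lines. The three-class output, the two-dimensional representation, and the weight values below are hypothetical and chosen only for readability; the patent text does not state the number of classes.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    top = max(scores)
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def classify(W, r, b):
    """y = softmax(W r + b) for one task head (st or se)."""
    logits = [sum(w * x for w, x in zip(row, r)) + bias for row, bias in zip(W, b)]
    return softmax(logits)

# Hypothetical 3-class stance head on a 2-dimensional representation r_st.
W_st = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b_st = [0.0, 0.0, 0.0]
y_st = classify(W_st, [2.0, 0.5], b_st)
# The predicted polarity is the class with the highest probability.
predicted = max(range(len(y_st)), key=lambda k: y_st[k])
```

The emotion head would use the same function with its own weights and bias applied to $r_{se}$.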
Finally, the objective function of the whole multi-graph sparse interaction module is a linear combination of the loss functions of the stance detection task and the emotion analysis task:
where θ denotes the parameters of the multi-task graph network model, and $\lambda_1$, $\lambda_2$, $\lambda_3$ are the coefficients of the corresponding loss terms; d denotes the d-th tweet, taken from the set of all tweets; the two predicted labels are the emotion task label and the stance task label that the model predicts for the d-th tweet; and the two regularization terms are the first loss function (sparse regularization) and the second loss function (sharing regularization), the latter encouraging sharing among the tasks.
The task-related attention module computes the final stance feature representation of the stance task graph and the final emotion feature representation of the emotion task graph, and performs stance detection and emotion classification based on these representations. The embodiment constructs three graph structures to capture the interaction between the tasks. Among them, the task relation graph captures the word-level correlation between the stance detection task and the emotion analysis task, which helps the tasks share information and reduces the noise introduced during information sharing.
The embodiment predicts the public's stance toward a government's epidemic prevention measures from short texts that the public posts on a social platform. Because such texts are short and carry little information, a single-task learning approach struggles to reach high accuracy. The invention therefore provides a multi-task stance detection method based on a multi-graph sparse interaction network.
The foregoing is directed to the preferred embodiment of the present invention, and it is understood that various changes and modifications may be made by one skilled in the art without departing from the spirit of the invention, and it is intended that such changes and modifications be considered as within the scope of the invention.
It will be understood by those skilled in the art that all or part of the processes of the above embodiments may be implemented by a computer program, which can be stored in a computer readable storage medium and can include the processes of the above embodiments when executed. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.
Claims (10)
1. A multi-task stance detection method based on a multi-graph sparse interaction network, characterized in that:
an input text is input into a multi-graph sparse interaction network model to obtain the stance detection polarity and the emotion classification polarity of the input text; the multi-graph sparse interaction network model is a graph convolutional neural network model and comprises a text encoding module, a multi-graph construction module, a multi-graph sparse interaction module, and a task-related attention module; the input text comprises a tweet text and a target text;
the text encoding module is used for processing the input text into a plurality of word vectors for use by the multi-graph construction module, the multi-graph sparse interaction module, and the task-related attention module;
the multi-graph construction module is used for constructing a stance task graph, an emotion task graph, and a task relation graph of the multi-graph sparse interaction network model; the stance task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the stance detection task; the emotion task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the emotion classification task; the task relation graph is constructed according to the word category importance weights of the words of the input text and the syntactic dependency tree of the input text;
the multi-graph sparse interaction module is used for updating the intra-graph node features of the stance task graph, the emotion task graph, and the task relation graph, and for updating the sparse interaction of the node features between the graphs; the sparse interaction between the graphs comprises the sparse interaction between the stance task graph and the task relation graph and the sparse interaction between the emotion task graph and the task relation graph;
and the task-related attention module is used for calculating the stance detection polarity and the emotion classification polarity of the input text according to the stance feature representation of the stance task graph and the emotion feature representation of the emotion task graph.
2. The multi-task stance detection method based on a multi-graph sparse interaction network according to claim 1, wherein constructing the stance task graph and the emotion task graph of the multi-graph sparse interaction network model comprises the following steps:
constructing a first syntactic dependency tree according to the syntactic structure of the tweet text, and acquiring the root word set of the first syntactic dependency tree;
adding the words of the target text to the first syntactic dependency tree according to the word connection relation between the target text and the root word set, to obtain a second syntactic dependency tree of the input text;
calculating a first pragmatic weight and a second pragmatic weight of the tweet text and a third pragmatic weight of the target text, wherein the first pragmatic weight is the pragmatic weight of each word of the tweet text in the stance detection task, and the second pragmatic weight is the pragmatic weight of each word of the tweet text in the emotion classification task;
and constructing the stance task graph according to the first pragmatic weight, the third pragmatic weight, and the second syntactic dependency tree, and constructing the emotion task graph according to the second pragmatic weight and the second syntactic dependency tree.
3. The multi-task stance detection method based on a multi-graph sparse interaction network according to claim 2, wherein the task relation graph of the multi-graph sparse interaction network model is constructed, specifically by:
constructing word category importance weights, and constructing a stance label importance relation vector and an emotion label importance relation vector according to the word category importance weights; the word category importance weight is the relation between a word and the label categories of the different tasks;
and calculating a task interaction relation according to the stance label importance relation vector and the emotion label importance relation vector, and constructing the task relation graph according to the task interaction relation.
4. The multi-task stance detection method based on a multi-graph sparse interaction network according to claim 3, wherein the intra-graph node features of the stance task graph, the emotion task graph, and the task relation graph are updated, specifically:
the intra-graph node features of the stance task graph, the emotion task graph, and the task relation graph are each updated according to the formula, where $task \in \{st, se, re\}$ denotes the stance task graph, the emotion task graph, and the task relation graph, respectively; I is the identity matrix; σ is a non-linear activation function; l denotes the index of the current network layer; the layer parameters are those of layer l of the network; the adjacency matrix is that of the stance task graph, the emotion task graph, or the task relation graph; and the adjacency matrix is used with its diagonal filled with 1.
5. The multi-task stance detection method based on a multi-graph sparse interaction network according to claim 4, wherein the sparse interaction of the node features between the graphs is updated, specifically:
according to the following formula:
the node features between the graphs are updated through sparse interaction, where the three feature matrices are the matrix representations of the node features of the stance task graph $g_{st}$, the emotion task graph $g_{se}$, and the task relation graph $g_{re}$ after the intra-graph update by the GCN; α is a hyper-parameter that controls how much node information from the stance task graph and from the emotion task graph is fused into the task relation graph; and l is the index of the current network layer.
6. The multi-task stance detection method based on a multi-graph sparse interaction network according to claim 5, wherein sparse interaction between the stance task graph and the task relation graph and between the emotion task graph and the task relation graph is encouraged according to a first loss function; and more interaction between the stance task graph and the task relation graph and between the emotion task graph and the task relation graph is encouraged according to a second loss function.
7. The multi-task stance detection method based on a multi-graph sparse interaction network according to claim 6, wherein the first loss function is specifically:
the second loss function is specifically:
8. The multi-task stance detection method based on a multi-graph sparse interaction network according to claim 7, wherein
the stance feature representation $r_{st}$ of the stance task graph is calculated according to the following formula:
where α is the attention weight of the stance feature representation, $h_i$ is the feature vector of the i-th word encoded by the BERT model, and m + n is the length of the input text.
9. The multi-task stance detection method based on a multi-graph sparse interaction network according to claim 8, wherein the emotion feature representation $r_{se}$ of the emotion task graph is calculated according to the following formula:
10. The multi-task stance detection method based on a multi-graph sparse interaction network according to any one of claims 1 to 9, wherein the stance detection polarity and the emotion classification polarity of the input text are calculated according to the following formula:
$$y_{task} = \mathrm{softmax}(W_{task}\, r_{task} + b_{task});$$
where $y_{task}$, $task \in \{st, se\}$, is the stance detection polarity or the emotion classification polarity predicted by the multi-graph sparse interaction network model, $W_{task}$ is the weight of the fully connected layer, $b_{task}$ is the bias corresponding to $W_{task}$, $r_{task}$ is the stance feature representation or the emotion feature representation, and softmax is the activation function.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210069686.4A CN114969318B (en) | 2022-01-21 | 2022-01-21 | Multi-task standpoint detection method based on multi-graph sparse interaction network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN114969318A (en) | 2022-08-30 |
CN114969318B (en) | 2023-04-07 |
Family
ID=82974812
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210069686.4A Active CN114969318B (en) | 2022-01-21 | 2022-01-21 | Multi-task standpoint detection method based on multi-graph sparse interaction network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114969318B (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20180218253A1 (en) * | 2017-01-31 | 2018-08-02 | Conduent Business Services, Llc | Stance classification of multi-perspective consumer health information |
US20210089936A1 (en) * | 2019-09-24 | 2021-03-25 | International Business Machines Corporation | Opinion snippet detection for aspect-based sentiment analysis |
CN112925907A (en) * | 2021-02-05 | 2021-06-08 | 昆明理工大学 | Microblog comment viewpoint object classification method based on event graph convolutional neural network |
CN112926337A (en) * | 2021-02-05 | 2021-06-08 | 昆明理工大学 | End-to-end aspect level emotion analysis method combined with reconstructed syntax information |
Non-Patent Citations (1)
Title |
---|
Leng Jia (冷佳): "Research on the Joint Analysis of Topic Discovery and Sentiment Classification" (主题发现和情感分类的联合分析研究), CNKI (中国知网) * |
Also Published As
Publication number | Publication date |
---|---|
CN114969318B (en) | 2023-04-07 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||