CN114969318B

CN114969318B - Multi-task standpoint detection method based on multi-graph sparse interaction network

Info

Publication number: CN114969318B
Application number: CN202210069686.4A
Authority: CN
Inventors: 廖清; 柴合言; 丁烨; 方滨兴; 高翠芸; 王晔; 王轩
Original assignee: Dongguan University of Technology; Shenzhen Graduate School Harbin Institute of Technology
Current assignee: Dongguan University of Technology; Shenzhen Graduate School Harbin Institute of Technology
Priority date: 2022-01-21
Filing date: 2022-01-21
Publication date: 2023-04-07
Anticipated expiration: 2042-01-21
Also published as: CN114969318A

Abstract

The invention discloses a multi-task standpoint detection method based on a multi-graph sparse interaction network. In the method, an input text is fed into a multi-graph sparse interaction network model to obtain the standpoint detection polarity and the emotion classification polarity of the input text. The multi-graph sparse interaction network model comprises a text coding module, a multi-graph construction module, a multi-graph sparse interaction module and a task-related attention module. The multi-graph construction module is used for constructing a standpoint task graph, an emotion task graph and a task relation graph of the model; the multi-graph sparse interaction module is used for updating the intra-graph node features of the standpoint task graph, the emotion task graph and the task relation graph and for updating the sparse interaction of node features among the graphs; the task-related attention module is used for calculating the standpoint detection polarity and the emotion classification polarity of the input text. The technical scheme of the invention improves the accuracy of standpoint detection for tweet text.

Description

Multi-task standpoint detection method based on multi-graph sparse interaction network

Technical Field

The invention relates to the technical field of standpoint detection, in particular to a multi-task standpoint detection method based on a multi-graph sparse interaction network.

Background

Existing standpoint detection methods mainly rely on machine learning and deep learning. Machine learning methods require extensive feature engineering: features are extracted manually, and a machine learning model such as a Support Vector Machine, a decision tree or a random forest is then trained on the extracted features. The main disadvantages are that feature engineering consumes a large amount of time and that the information contained in manually selected features is limited, which reduces model performance to some extent. In addition, most machine learning methods involve a large number of hyperparameters whose optimal values must be selected manually, which is time-consuming and labor-intensive and prevents large-scale application. Early deep-learning-based standpoint detection methods mainly used Recurrent Neural Networks (RNN), Convolutional Neural Networks (CNN) and attention mechanisms to automatically capture the features of news text, thereby improving standpoint detection performance. Some recent work, influenced by the popularity of Transformers, has focused on using the BERT model to improve the standpoint detection task, mainly by exploiting the powerful word embedding capability of BERT.

The deep-learning-based standpoint detection methods most closely related to the invention use auxiliary information, such as emotion information, expression information and the subjective or objective nature of the text, to construct an auxiliary task that helps improve performance on the standpoint detection task. In these methods, an auxiliary LSTM network is designed to extract emotion features and a main LSTM network is used to extract the features of the main task; the emotion features and the standpoint features are then simply concatenated to predict the standpoint of a news text.

In the prior art, deep-learning-based standpoint detection methods such as RNN and CNN cannot capture the relevance between the target expression and the standpoint expression. Auxiliary-task-based standpoint detection methods simply concatenate emotion features and standpoint features and ignore the complexity of the relationship between the tasks; treating the two tasks as equally important causes considerable negative transfer and reduces model performance. Moreover, when an auxiliary task is used, only the performance of the main task is considered, and no attention is paid to improving the performance of the auxiliary task. Therefore, the accuracy of the prior art in performing standpoint detection on tweet text is low.

Disclosure of Invention

The invention provides a multi-task standpoint detection method based on a multi-graph sparse interaction network, which improves the accuracy of standpoint detection for tweet text.

An embodiment of the invention provides a multi-task standpoint detection method based on a multi-graph sparse interaction network, which comprises the following steps:

inputting an input text into a multi-graph sparse interaction network model to obtain the standpoint detection polarity and the emotion classification polarity of the input text; the multi-graph sparse interaction network model is a graph convolutional neural network model and comprises a text coding module, a multi-graph construction module, a multi-graph sparse interaction module and a task-related attention module; the input text comprises a tweet text and a target text;

the text coding module is used for processing the input text into a plurality of word vectors for use by the multi-graph construction module, the multi-graph sparse interaction module and the task-related attention module;

the multi-graph construction module is used for constructing a standpoint task graph, an emotion task graph and a task relation graph of the multi-graph sparse interaction network model; the standpoint task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the standpoint detection task; the emotion task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the emotion classification task; the task relation graph is constructed according to the word category importance weights of the words of the input text and the syntactic dependency tree of the input text;

the multi-graph sparse interaction module is used for updating the intra-graph node features of the standpoint task graph, the emotion task graph and the task relation graph and for updating the sparse interaction of node features among the graphs; the sparse interaction among the graphs comprises sparse interaction between the standpoint task graph and the task relation graph and sparse interaction between the emotion task graph and the task relation graph;

and the task-related attention module is used for calculating the standpoint detection polarity and the emotion classification polarity of the input text according to the standpoint feature representation of the standpoint task graph and the emotion feature representation of the emotion task graph.

Further, constructing the standpoint task graph and the emotion task graph of the multi-graph sparse interaction network model comprises the following steps:

constructing a first syntactic dependency tree according to the syntactic structure of the tweet text, and acquiring a root word set of the first syntactic dependency tree;

adding the words of the target text to the first syntactic dependency tree according to the word connection relation between the target text and the root word set to obtain a second syntactic dependency tree of the input text;

calculating a first pragmatic weight and a second pragmatic weight of the tweet text and a third pragmatic weight of the target text, wherein the first pragmatic weight is the pragmatic weight of each word of the tweet text in the standpoint detection task, and the second pragmatic weight is the pragmatic weight of each word of the tweet text in the emotion classification task;

and constructing the standpoint task graph according to the first pragmatic weight, the third pragmatic weight and the second syntactic dependency tree, and constructing the emotion task graph according to the second pragmatic weight and the second syntactic dependency tree.

Further, constructing a task relationship graph of the multi-graph sparse interaction network model specifically comprises:

constructing a word category importance weight, and constructing a standpoint label importance relation vector and an emotion label importance relation vector according to the word category importance weight; the word category importance weight describes the relation between a word and the label categories of the different tasks;

and calculating a task interaction relation according to the standpoint label importance relation vector and the emotion label importance relation vector, and constructing the task relation graph according to the task interaction relation.

Further, the intra-graph node features of the standpoint task graph, the emotion task graph and the task relation graph are updated, specifically:

according to the formula

[formula not reproduced]

intra-graph node feature updating is carried out on the standpoint task graph, the emotion task graph and the task relation graph respectively, where $task \in \{st, se, re\}$ denotes the standpoint task graph, the emotion task graph and the task relation graph respectively; $I$ is the identity matrix; $\sigma$ is a nonlinear activation function; and $l$ denotes the index of the network layer in the current iteration. The formula further involves the parameters of the $l$-th network layer, the adjacency matrix of the standpoint task graph, emotion task graph or task relation graph, and that adjacency matrix with its diagonal filled with 1.

Further, the sparse interaction of node features among the graphs, namely among the standpoint task graph, the emotion task graph and the task relation graph, is updated according to the following formulas:

[formulas not reproduced]

where the formulas involve the matrix representations of the node features of the standpoint task graph $g^{st}$, the emotion task graph $g^{se}$ and the task relation graph $g^{re}$ after intra-graph node feature updating by the GCN graph neural network; $\alpha$ is a hyperparameter that controls how much node information of the standpoint task graph and of the emotion task graph is fused into the task relation graph; and $l$ is the index of the network layer in the current iteration.

Furthermore, sparse interaction between the standpoint task graph and the task relation graph and between the emotion task graph and the task relation graph is encouraged according to a first loss function; and more interaction between the standpoint task graph and the task relation graph and between the emotion task graph and the task relation graph is encouraged according to a second loss function.

Further, the first loss function is specifically:

[formula not reproduced]

the second loss function is specifically:

[formula not reproduced]

where the formulas involve the sparse mask matrix of the standpoint task and the sparse mask matrix of the emotion task; $l$ denotes the index of the network layer in the current iteration, and $L$ denotes the preset total number of network layers.

Further, the standpoint feature representation $r^{st}$ of the standpoint task graph is calculated according to the following formula:

[formula not reproduced]

where $\alpha$ is the attention weight of the standpoint feature representation, $h_i$ is the feature vector of the $i$-th word after encoding by the BERT model, and $m+n$ is the length of the input text.

Further, the emotion feature representation $r^{se}$ of the emotion task graph is calculated according to the following formula:

[formula not reproduced]

where $\alpha'$ is the attention weight of the emotion feature representation, the formula involves the feature of the $i$-th node in the output of the emotion task graph $g^{se}$, and $m+n$ is the length of the input text.

Further, the standpoint detection polarity and the emotion classification polarity of the input text are calculated according to the following formula:

$y^{task} = \mathrm{softmax}(W^{task} r^{task} + b^{task})$;

where, for $task \in \{st, se\}$, $y^{task}$ is the standpoint detection polarity or the emotion classification polarity predicted by the multi-graph sparse interaction network model, $W^{task}$ is the weight of the fully connected layer, $b^{task}$ is the bias corresponding to $W^{task}$, $r^{task}$ is the standpoint or emotion feature representation, and softmax is the activation function.
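The classification step above can be illustrated with a minimal sketch. It assumes that the feature representation $r^{task}$ is an attention-weighted sum of node feature vectors, which is an assumption because the formulas for $r^{st}$ and $r^{se}$ are not reproduced in this text; the function names and the toy numbers are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_sum(weights, vectors):
    """Assumed pooling: r = sum_i alpha_i * h_i."""
    dim = len(vectors[0])
    return [sum(w * v[d] for w, v in zip(weights, vectors)) for d in range(dim)]

def classify(r, W, b):
    """y = softmax(W r + b) with W given as a list of rows."""
    logits = [sum(w_k * r_k for w_k, r_k in zip(row, r)) + b_c
              for row, b_c in zip(W, b)]
    return softmax(logits)

# Toy example: three nodes with 2-dimensional features, three polarity classes.
h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
alpha = [0.5, 0.25, 0.25]          # hypothetical attention weights
r = weighted_sum(alpha, h)
W = [[1.0, 0.0], [0.0, 1.0], [0.5, 0.5]]
b = [0.0, 0.0, 0.0]
y = classify(r, W, b)
```

The same `classify` function would be applied once with the standpoint parameters and once with the emotion parameters, since the formula is shared across $task \in \{st, se\}$.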

The embodiment of the invention has the following beneficial effects:

The invention provides a multi-task standpoint detection method based on a multi-graph sparse interaction network. By jointly training the emotion analysis task and the standpoint detection task, the invention constructs a task-specific graph for each task (namely, the standpoint task graph and the emotion task graph) and a task-related graph (namely, the task relation graph), and constructs a sparse interaction module between the graphs. Sparse interaction between the standpoint task graph and the emotion task graph is thereby realized, which facilitates information sharing between the tasks and improves the expressive power of the multi-graph sparse interaction network model on each task.

Drawings

Fig. 1 is a schematic structural diagram of a multi-graph sparse interaction network model according to an embodiment of the present invention.

Detailed Description

The technical solutions in the present invention will be described clearly and completely with reference to the drawings in the present invention, and it should be apparent that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.

As shown in fig. 1, in the multi-task standpoint detection method based on the multi-graph sparse interaction network provided by an embodiment of the present invention, an input text is input into a multi-graph sparse interaction network model, and the standpoint detection polarity and the emotion classification polarity of the input text are obtained; the multi-graph sparse interaction network model is a graph convolutional neural network model and comprises a text coding module, a multi-graph construction module, a multi-graph sparse interaction module and a task-related attention module; the input text comprises a tweet text and a target text;

the multi-graph construction module is used for constructing a standpoint task graph, an emotion task graph and a task relation graph of the multi-graph sparse interaction network model; the standpoint task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the standpoint detection task; the emotion task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the emotion classification task; the task relation graph is constructed according to the word category importance weights of the words of the input text and the syntactic dependency tree of the input text;

the multi-graph sparse interaction module is used for updating the intra-graph node features of the standpoint task graph, the emotion task graph and the task relation graph and for updating the sparse interaction of node features among the graphs; the sparse interaction among the graphs comprises sparse interaction between the standpoint task graph and the task relation graph and sparse interaction between the emotion task graph and the task relation graph;

As one embodiment, the method for constructing the task relation graph of the model comprises the following steps:

constructing a word category importance weight, and constructing a standpoint label importance relation vector and an emotion label importance relation vector according to the word category importance weight;

As one of the embodiments, the text coding module uses BERT to encode the tweet text content $C$ and the target words $T$ of the target text. First, the target words $T$ and the tweet text content $C$ are concatenated to form the input text $S = \{t_1, \dots, t_m, w_1, w_2, \dots, w_n\}$. The input text is then converted into the input format of the BERT model: [CLS] $t_1 \dots t_m\ w_1\ w_2 \dots w_n$ [SEP]. It is then fed into the BERT network model to capture the contextual features of the input text. The process can be defined by the following formula:

H＝BERT(S)

where $H = \{h_1, h_2, \dots, h_{m+n}\}$ is the output of the BERT network model. Each element of $H$ is the feature representation of one word of the input text; the element $h_t$ is the feature representation of the $t$-th word and contains its contextual information.
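The input formatting described above can be sketched as follows. This is only an illustration of the token layout stated in the text ([CLS], target words, tweet words, [SEP]); the function name and example tokens are hypothetical, and the actual encoding into $H$ requires a BERT model and its tokenizer, which are not shown.

```python
def build_bert_input(target_tokens, tweet_tokens):
    """Lay out the sequence as [CLS] t_1 ... t_m w_1 ... w_n [SEP]."""
    return ["[CLS]"] + list(target_tokens) + list(tweet_tokens) + ["[SEP]"]

# Hypothetical example: m = 2 target words, n = 4 tweet words.
tokens = build_bert_input(["climate", "change"], ["we", "must", "act", "now"])
```

The encoder then yields one feature vector per word, so that $H$ contains $m+n$ vectors for this layout.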

As one embodiment, constructing the standpoint task graph and the emotion task graph comprises the following steps:

constructing a first syntactic dependency tree according to the syntactic structure of the tweet text, and acquiring a root word set of the first syntactic dependency tree;

adding the words of the target text to the first syntactic dependency tree according to the word connection relation between the target text and the root word set to obtain a second syntactic dependency tree of the input text; calculating a first pragmatic weight and a second pragmatic weight of the tweet text and a third pragmatic weight of the target text. The first pragmatic weight is the pragmatic weight of each word of the tweet text in the standpoint detection task (namely, the pragmatic weight of a word with respect to the standpoint task is calculated from the word frequency and from the relative co-occurrence frequency of the word with the standpoint labels "support" and "oppose" in the whole corpus). The second pragmatic weight is the pragmatic weight of each word of the tweet text in the emotion classification task (namely, the pragmatic weight of a word with respect to the emotion task is calculated from the word frequency and from the relative co-occurrence frequency of the word with the emotion labels in the whole corpus). The pragmatic weight refers to the dependency (or influence) of a word of the tweet text on a specific target.

Specifically, each tweet text is parsed using a syntactic parsing tool to construct a syntactic dependency tree. Words are represented as nodes of the graph (namely, the nodes of the graph are the word vectors of the words of the tweet text), and the relations between words in the syntactic dependency tree are represented as edges between the nodes, so as to construct a base graph of the target text $T$ and the tweet text $C$.

The method comprises the following steps:

constructing a first syntactic dependency tree according to the syntactic structure of the tweet text $C$, and obtaining the root word set $w^r$ of the first syntactic dependency tree through a syntactic parser.

Since the target text $T$ is not a complete sentence but a phrase or a word, it cannot be modeled as a syntactic dependency tree. Therefore, according to the word connection relation between the target text and the root word set, the embodiment of the present invention adds the words of the target text to the first syntactic dependency tree to obtain a second syntactic dependency tree of the input text $S$.

The second syntactic dependency tree is calculated as follows:

[formula not reproduced]

where $w^r$ denotes the root words of the first syntactic dependency tree, $w_i$ and $w_j$ are any two different words of the input text $S$, and the input text $S$ comprises the tweet text and the target text.
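One way to picture the construction of the second syntactic dependency tree is the sketch below. It is an illustration under stated assumptions, not the formula of the invention: the tree is stored as an undirected 0/1 adjacency matrix, the dependency edges of the tweet are assumed to be supplied by an external parser, and every target word is assumed to be linked to every root word. The function name and the index layout (target words first, then tweet words) are hypothetical.

```python
def build_second_tree(n_target, n_tweet, tweet_edges, root_indices):
    """Return an adjacency matrix over [target words | tweet words].

    tweet_edges: pairs (i, j) of tweet-word indices joined by a dependency arc.
    root_indices: tweet-word indices that belong to the root word set.
    """
    size = n_target + n_tweet
    adj = [[0] * size for _ in range(size)]
    # Copy the edges of the first dependency tree, shifted past the target words.
    for i, j in tweet_edges:
        a, b = i + n_target, j + n_target
        adj[a][b] = adj[b][a] = 1
    # Assumption: attach each target word to each root word of the tweet.
    for t in range(n_target):
        for r in root_indices:
            rr = r + n_target
            adj[t][rr] = adj[rr][t] = 1
    return adj

# Hypothetical example: 1 target word, 3 tweet words, root is tweet word 1.
adj = build_second_tree(1, 3, [(0, 1), (1, 2)], [1])
```

In this example the target word (index 0) is connected only to the root word (index 2 after shifting), while the tweet keeps its original two dependency edges.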

In order to capture the importance of the words of the input text and the interaction features between words, the word frequency of each word and its pragmatic weight in the different tasks need to be calculated. The frequency of each word of the input text in the whole corpus is calculated as:

[formula not reproduced]

where $N(w_i)$ is the number of times the word $w_i$ appears in the corpus and $N$ is the number of all words in the corpus. The embodiment of the invention calculates the pragmatic weight of a word separately for each task, comprising a first pragmatic weight $\phi^{task}(w_i)|_{task=stance}$ and a second pragmatic weight $\phi^{task}(w_i)|_{task=sentiment}$. When the first pragmatic weight and the second pragmatic weight are calculated, only the categories with practical significance are considered; the two label categories that contain no useful information, namely neutral standpoint and neutral emotion, are ignored. The specific calculation is shown in the following formula:

[formula not reproduced]

where $N(w_i, label_+)$ and $N(w_i, label_-)$ denote the number of times the word $w_i$ appears under the standpoint task labels "support" and "oppose", or under the emotion task labels "positive" and "negative", respectively; $N(label_+)$ and $N(label_-)$ denote the total number of the standpoint task labels "support" and "oppose", or of the emotion task labels "positive" and "negative", respectively; $\mu$ is the mean and $\delta$ is the standard deviation.
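The counts that feed these formulas can be gathered as in the sketch below. It only collects the quantities named in the text, namely $N(w_i)$, $N$, $N(w_i, label)$ and $N(label)$; it does not compute the pragmatic weight itself, because that formula is not reproduced here. Reading the word frequency as the ratio $N(w_i)/N$, counting $N(w_i, label)$ per token occurrence and counting $N(label)$ per labelled text are assumptions, and the function name and toy corpus are hypothetical.

```python
from collections import Counter

def corpus_statistics(corpus, ignore_labels=("neutral",)):
    """corpus: list of (tokens, label) pairs for one task."""
    word_count = Counter()        # N(w_i)
    word_label_count = Counter()  # N(w_i, label)
    label_count = Counter()       # N(label)
    for tokens, label in corpus:
        for tok in tokens:
            word_count[tok] += 1
        if label in ignore_labels:
            continue  # neutral categories carry no useful label information
        label_count[label] += 1
        for tok in tokens:
            word_label_count[(tok, label)] += 1
    total_words = sum(word_count.values())  # N
    freq = {w: c / total_words for w, c in word_count.items()}
    return freq, word_label_count, label_count

# Hypothetical toy corpus for the standpoint task.
corpus = [
    (["tax", "cuts", "help"], "support"),
    (["tax", "cuts", "hurt"], "oppose"),
    (["help", "people"], "support"),
    (["tax", "report"], "neutral"),
]
freq, word_label_count, label_count = corpus_statistics(corpus)
```

The same function would be run a second time on the emotion labels to obtain the counts for the second pragmatic weight.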

The third pragmatic weight of the target text is calculated according to the formula

[formula not reproduced]

The third pragmatic weight of the target text is used to establish the relation between the target text and the tweet text, that is, to construct the edges between the target text and the tweet text in the graph.

A first adjacency matrix of the standpoint task graph is calculated according to the first pragmatic weight, the third pragmatic weight and the second syntactic dependency tree, and the standpoint task graph is then obtained from the first adjacency matrix.

A second adjacency matrix of the emotion task graph is calculated according to the second pragmatic weight and the second syntactic dependency tree, and the emotion task graph is then obtained from the second adjacency matrix.

Specifically, the first adjacency matrix and the second adjacency matrix are calculated according to the following formula:

[formula not reproduced]

where $s_j$ and $s_i$ are words of the input text.

As an embodiment, the task relation graph of the multi-graph sparse interaction network model is constructed, and the method comprises the following steps:

constructing a word category importance weight, and constructing a standpoint label importance relation vector and an emotion label importance relation vector according to the word category importance weight;

Specifically, considering the fine-grained interaction relations between different tasks, that is, that the relations between tasks differ across categories, the embodiment of the present invention uses the word category importance weight to establish the relation between a word and the label categories of the different tasks. The label importance relation features of different words for different tasks are thereby obtained, which facilitates information interaction between the tasks and improves the task performance of the multi-graph sparse interaction network model. The word category importance weight is calculated according to the formula

[formula not reproduced]

where $w_i \in C$ is a word of the tweet text; $c_i$ is a label category of the corresponding task (for example, for the standpoint detection task, $c_i \in \{\text{support}, \text{oppose}\}$); the formula involves the number of times $w_i$ appears under label category $c_i$ of the task and the total number of words appearing under label category $c_i$ of the task; $|w_i|$ denotes the number of occurrences of the word $w_i$ in the corpus; and task denotes the standpoint task or the emotion task. The word category importance weight refers to the importance relation between a word and the different label categories of the different tasks, that is, it represents the importance relation of the word across the two tasks, and it is used for capturing the similarity relation between the two tasks at the task label level. First, the relations between the word and each label category of the standpoint task are calculated to form a vector, namely the standpoint label importance relation vector; then the relations between the word and each label category of the emotion task are calculated to form a vector, namely the emotion label importance relation vector. Finally, the similarity of the two vectors is calculated and used as the importance weight of the word for the label categories of the tasks, measuring the similarity relation of the word between the two tasks.

Specifically, a label importance relation vector is constructed for each task according to the calculated word category importance weights, comprising the standpoint label importance relation vector $\Phi^{stance}(w_i)$ and the emotion label importance relation vector $\Phi^{sentiment}(w_i)$.

The importance relation vector represents the importance relation of each word to the different label categories and is denoted as

[formula not reproduced]

where task denotes the standpoint task or the emotion task, $c_i$ denotes a label of the task, and each component is the normalized form of the word category importance weight, obtained using the mean and the standard deviation of the word category importance weight. Through this calculation, the importance relation vectors of a word for the different label categories under the different tasks can be obtained, namely the standpoint label importance relation vector $\Phi^{stance}(w_i)$ and the emotion label importance relation vector $\Phi^{sentiment}(w_i)$.

A word-level task interaction relation $\xi(w_i)$ is calculated according to the standpoint label importance relation vector and the emotion label importance relation vector:

[formula not reproduced]

where sta denotes the standpoint detection task and se denotes the emotion classification task. The adjacency matrix of the task relation graph is constructed from the second syntactic dependency tree of the input text $S$ and the normalized task interaction relation $\xi(w_i)$, and the task relation graph is then obtained from this adjacency matrix. The adjacency matrix is calculated as follows:

[formula not reproduced]
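The two generic operations named in this passage, normalization by mean and standard deviation and a similarity between the two label importance relation vectors, can be sketched as follows. Z-score normalization and cosine similarity are assumptions chosen for illustration; the text does not state which similarity measure is used, and the function names and example vectors are hypothetical.

```python
import math

def z_score(values):
    """Normalize values with their mean and (population) standard deviation."""
    mean = sum(values) / len(values)
    var = sum((v - mean) ** 2 for v in values) / len(values)
    std = math.sqrt(var)
    if std == 0:
        return [0.0 for _ in values]
    return [(v - mean) / std for v in values]

def cosine_similarity(u, v):
    """Assumed similarity between the two label importance relation vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    if nu == 0 or nv == 0:
        return 0.0
    return dot / (nu * nv)

# Hypothetical vectors for one word: components for the two non-neutral labels.
phi_stance = [0.8, -0.2]      # e.g. ("support", "oppose")
phi_sentiment = [0.6, -0.1]   # e.g. ("positive", "negative")
xi = cosine_similarity(phi_stance, phi_sentiment)
normalized = z_score([1.0, 2.0, 3.0])
```

A value of `xi` close to 1 would indicate that the word behaves similarly under both tasks, which is the kind of word-level relation the task relation graph is meant to encode.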

As one embodiment, the intra-graph node features of the standpoint task graph, the emotion task graph and the task relation graph are updated, that is, horizontal intra-graph updating is performed. Specifically, the embodiment of the present invention uses a Graph Convolutional Network (GCN) to perform iterative intra-graph updating. Broadly speaking, the update process is independent for each graph (only the horizontal update of the graph is considered), that is, each graph updates its node features independently. The intra-graph node features of the standpoint task graph, the emotion task graph and the task relation graph are updated respectively according to the following formula:

[formula not reproduced]

where $task \in \{st, se, re\}$ denotes the standpoint task graph, the emotion task graph and the task relation graph respectively; the formula involves the node features output by the $(l-1)$-th network layer, the parameters of the $l$-th network layer, and the adjacency matrix with its diagonal filled with 1; $I$ is the identity matrix; $\sigma$ is a nonlinear activation function; and $l$ denotes the index of the network layer in the current iteration. The feature vectors of the convolution kernels of the graph convolutional network are updated in the same way as the parameters of a conventional graph convolutional neural network. The initial node features are given by [formula not reproduced].
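For orientation, a single graph convolution step of the commonly used form $H^{l} = \sigma(\hat{A} H^{l-1} W^{l})$ can be sketched as below. This is a generic GCN layer under stated assumptions, not necessarily the exact formula of the invention: the diagonal of the adjacency matrix is set to 1, symmetric degree normalization is applied, and ReLU is used as the activation $\sigma$. All names are hypothetical, and edge weights are assumed to be non-negative.

```python
import math

def matmul(a, b):
    """Plain matrix product of two lists of rows."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def gcn_layer(adj, h, w):
    """One intra-graph update for a single task graph."""
    n = len(adj)
    # Adjacency matrix with its diagonal filled with 1 (self-loops).
    a_tilde = [[1.0 if i == j else float(adj[i][j]) for j in range(n)]
               for i in range(n)]
    # Assumed symmetric normalization D^{-1/2} A D^{-1/2}.
    deg = [sum(row) for row in a_tilde]
    a_hat = [[a_tilde[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
             for i in range(n)]
    z = matmul(matmul(a_hat, h), w)
    return [[max(0.0, x) for x in row] for row in z]  # ReLU as sigma

# Toy graph with two connected nodes, identity features and identity weights.
adj = [[0, 1], [1, 0]]
h0 = [[1.0, 0.0], [0.0, 1.0]]
w1 = [[1.0, 0.0], [0.0, 1.0]]
h1 = gcn_layer(adj, h0, w1)
```

Because each graph is updated independently, the same function would be called three times per layer, once with the adjacency matrix and parameters of each of the three graphs.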

The standpoint task graph $g^{st}$ and the emotion task graph $g^{se}$ each interact sparsely with the task relation graph $g^{re}$, so that interaction between $g^{st}$ and $g^{se}$ is achieved, which helps the different tasks share information. The connecting edges between $g^{st}$ and $g^{re}$ and between $g^{se}$ and $g^{re}$ are initialized, and a sparse mask matrix is defined for the standpoint task and for the emotion task respectively. The mask matrix $z$ represents a group of random binary variables and is responsible for controlling the interaction between the standpoint task graph and the task relation graph and the interaction between the emotion task graph and the task relation graph. Therefore, during the iterative interaction of the $l$-th graph layer, the sparse interaction of node features among the graphs is updated according to formulas (1) to (3), that is, the sparse interaction between the standpoint task graph and the task relation graph and the sparse interaction between the emotion task graph and the task relation graph are updated:

[formulas (1) to (3) not reproduced]

where the formulas involve the parameters of the connecting edges between the standpoint task graph and the task relation graph and between the emotion task graph and the task relation graph; $l$ denotes the index of the network layer in the current iteration; the formulas further involve the matrix representations of the node features of the standpoint task graph $g^{st}$, the emotion task graph $g^{se}$ and the task relation graph $g^{re}$ after intra-graph node feature updating by the GCN graph neural network; and $\alpha$ is a hyperparameter that controls how much node information of the standpoint task graph and of the emotion task graph is fused into the task relation graph.

Since the random binary variables represented by the mask matrix $z$ are discrete and therefore not differentiable, they are not suitable for the gradient-based training of a deep learning model. Therefore, in the embodiment of the present invention, the Gumbel-Softmax distribution is adopted to relax the discrete random variable $z$ into a continuous variable $v$, so as to fit the update process of the multi-graph sparse interaction network model. The calculation formula is as follows:

[formula not reproduced]

where $i$ and $j$ take the values 0 and 1, $\tau$ is the scaling (temperature) parameter, $G_l$ is an independent and identically distributed sample drawn from the standard Gumbel distribution Gumbel(0, 1), and $\pi_l = [1 - z_l, z_l]$ represents the mask $z_l$ as a random binary variable.
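The relaxation can be illustrated with the standard binary Gumbel-Softmax, $v_l(i) = \exp((\log \pi_l(i) + G_l(i))/\tau) / \sum_{j} \exp((\log \pi_l(j) + G_l(j))/\tau)$. The sketch below implements this standard form in plain Python; treating it as the formula of the invention is an assumption, and the function names and parameter values are hypothetical.

```python
import math
import random

def sample_gumbel(rng):
    """Draw one sample from Gumbel(0, 1) via inverse transform sampling."""
    u = min(max(rng.random(), 1e-12), 1.0 - 1e-12)  # keep both logs finite
    return -math.log(-math.log(u))

def gumbel_softmax_binary(z_prob, tau, rng):
    """Relax a binary mask with P(z = 1) = z_prob into a soft 2-vector v."""
    pi = [1.0 - z_prob, z_prob]
    logits = [(math.log(max(p, 1e-12)) + sample_gumbel(rng)) / tau for p in pi]
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
v = gumbel_softmax_binary(z_prob=0.7, tau=0.5, rng=rng)
# With a small temperature the samples are close to one-hot.
samples = [gumbel_softmax_binary(0.9, 0.1, rng)[1] for _ in range(2000)]
mean_open = sum(samples) / len(samples)
```

A smaller $\tau$ pushes $v$ towards a hard 0/1 mask, while a larger $\tau$ yields smoother values; in a training framework the same computation would be expressed with differentiable tensor operations so that gradients reach $\pi_l$.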

In order to improve the training efficiency of the model, sparse interaction between the position task graph and the task relation graph and sparse interaction between the emotion task graph and the task relation graph are encouraged according to a first loss function (namely sparse regularization), wherein the first loss function is as follows:

in the formula, l denotes the index of the current network layer and L denotes the total number of network layers. To prevent the interaction among the graphs from becoming so sparse that the whole multi-graph sparse interaction network model splits into three independent graph networks, more interaction between the position task graph and the task relation graph and between the emotion task graph and the task relation graph is encouraged according to a second loss function (namely, sharing regularization); that is, more interaction is encouraged at the bottom layers so that low-level information is shared among the tasks. The second loss function is as follows:

The multi-graph sparse interaction module shares information between the two tasks so that each task obtains information helpful for training while noise that would harm task training is filtered out. This improves the efficiency and quality of sharing between the tasks and, in turn, the performance of the multi-graph sparse interaction network model on each task.

In one embodiment, the task-related attention module receives the position feature representation $g^{st}$ of the position task graph and the emotional feature representation $g^{se}$ of the emotion task graph. For the position detection task, a mask mechanism is used to filter out non-target words so that the position feature representation is related to the target. Specifically, a mask matrix is designed in which the positions corresponding to target words are set to 1 and the positions corresponding to non-target words are set to 0, which yields the feature representation of the position task graph after mask matrix conversion

A retrieval-based attention mechanism is then used to obtain a richer position feature representation related to the target word, where the attention weight α of the position feature representation is calculated according to the following formula:

where h is the output of the BERT network model and $h_t$ is the feature of the t-th word after encoding by the BERT network model; m + n is the length of the input text in words (the target text has length m and the tweet text has length n);

denotes the output of the position task graph after mask matrix conversion, and

denotes the feature representation of its i-th node (i.e., the i-th word vector); $\beta_t$ is the attention weight of the t-th word vector, $\beta_i$ is the attention weight of the position feature for the i-th word vector, and $\alpha_t$ is the normalized attention weight of the t-th word vector. The attention weight of the position feature representation expresses how much attention every word in the context pays to the position feature.

Then the position feature representation $r^{st}$ of the position task graph is calculated according to the following formula:

where $h_i$ is the feature vector of the i-th word after encoding by the BERT model and $\alpha_i$ is the attention weight of the position feature representation for the i-th word vector.
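The masking, attention and pooling steps described above can be sketched as follows. This is an illustration only: it assumes a dot-product retrieval score $\beta_t = \sum_i h_t^\top \tilde{g}_i$ over the masked graph output, softmax normalization to obtain $\alpha_t$, and the weighted sum $r^{st} = \sum_{i=1}^{m+n} \alpha_i h_i$, which is inferred from the symbol definitions rather than quoted from the patent. It also assumes that the BERT features and the graph output have the same dimension; the function names are hypothetical.

```python
import numpy as np

def mask_non_target(g_st, target_idx):
    """Keep target-word nodes (mask = 1) and zero all other nodes (mask = 0)."""
    mask = np.zeros((g_st.shape[0], 1))
    mask[list(target_idx)] = 1.0
    return g_st * mask

def position_representation(h, g_st, target_idx):
    """Return (alpha, r_st) for BERT features h and graph output g_st.

    h and g_st both have shape (m + n, d): one row per word of the input text.
    """
    g_masked = mask_non_target(g_st, target_idx)
    # ASSUMED score: beta_t = sum_i h_t . g_masked_i (dot-product retrieval).
    beta = (h @ g_masked.T).sum(axis=1)
    # Softmax over all m + n words gives the normalized weights alpha_t.
    e = np.exp(beta - beta.max())
    alpha = e / e.sum()
    # Weighted sum of the BERT features: r_st = sum_i alpha_i * h_i.
    r_st = alpha @ h
    return alpha, r_st

# Usage: a target of m = 2 words followed by a tweet of n = 3 words, d = 4.
rng = np.random.default_rng(0)
h = rng.normal(size=(5, 4))
g_st = rng.normal(size=(5, 4))
alpha, r_st = position_representation(h, g_st, target_idx=[0, 1])
```

Because only target-word rows survive the mask, each word's score measures how strongly its BERT feature matches the target nodes of the graph output, while the pooled vector is still built from all m + n words.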

Similarly, the emotional feature representation $r^{se}$ of the emotion task graph is calculated according to formulas (4) to (6):

In these formulas, the feature representation of the i-th node (or word vector) is taken from the output $g^{se}$ of the emotion task graph; $\alpha'$ is the attention weight of the emotional feature representation, $\beta'_t$ is the attention weight of the emotional feature for the t-th word vector, and $\beta'_i$ is the attention weight of the emotional feature for the i-th word vector.

After the final position feature representation $r^{st}$ and emotional feature representation $r^{se}$ are obtained, a fully connected layer fuses the text features with the rich context features to produce the position detection polarity and the emotion classification polarity of the input text:

$$y^{task} = \mathrm{softmax}\left(W^{task} r^{task} + b^{task}\right)$$

in the formula, $y^{task}$, with $task \in \{st, se\}$, is the position detection polarity or the emotion classification polarity predicted by the multi-graph sparse interaction network model, $W^{task}$ is the weight of the fully connected layer, $b^{task}$ is the bias corresponding to $W^{task}$, and softmax is the activation function.
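The classification step above is a single fully connected layer followed by softmax, applied once per task. A minimal NumPy sketch is given below; the function names, the three-class output and the feature dimension are illustrative assumptions, not values stated in the patent.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def predict_polarity(r_task, W_task, b_task):
    """y_task = softmax(W_task r_task + b_task) for task in {st, se}."""
    return softmax(W_task @ r_task + b_task)

# Usage: ASSUMED 3 polarity classes and a 4-dimensional feature vector.
rng = np.random.default_rng(0)
r_st = rng.normal(size=4)
W_st = rng.normal(size=(3, 4))
b_st = np.zeros(3)
y_st = predict_polarity(r_st, W_st, b_st)
predicted_class = int(np.argmax(y_st))
```

The position task and the emotion task each use their own weight matrix and bias, so the same function is called twice with $(r^{st}, W^{st}, b^{st})$ and $(r^{se}, W^{se}, b^{se})$.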

Finally, the objective function of the whole multi-graph sparse interaction module is a linear combination of the loss functions of the position detection task and the emotion analysis task:

in the formula, θ denotes the parameters of the multi-task graph network model, and λ₁, λ₂ and λ₃ are the coefficients of the corresponding loss terms; d denotes the d-th tweet,

is the set of all tweets,

is the emotion task label of the d-th tweet predicted by the model,

is the position task label of the d-th tweet predicted by the model,

denotes the first loss function (i.e., sparse regularization), and

denotes the second loss function (i.e., sharing regularization), which encourages sharing between tasks.

The task-related attention module calculates the final position feature representation of the position task graph and the final emotional feature representation of the emotion task graph, and performs position detection and emotion classification according to these representations. The embodiment of the invention constructs three graph structures to capture the interaction between tasks; among them, the task relation graph captures the word-level correlation between the position detection task and the emotion analysis task, which helps each task share information and reduces the noise introduced during information sharing.

According to the embodiment of the invention, the position and attitude expressed in short news texts published by the public on social platforms are predicted. Because such texts are short and carry little information, a single-task learning approach struggles to achieve high accuracy; the invention therefore provides a multi-task position detection method based on a multi-graph sparse interaction network.

While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.

It will be understood by those skilled in the art that all or part of the processes of the above embodiments may be implemented by hardware related to instructions of a computer program, and the computer program may be stored in a computer readable storage medium, and when executed, may include the processes of the above embodiments. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

Claims

1. A multi-task position detection method based on a multi-graph sparse interaction network is characterized in that,

inputting an input text into a multi-graph sparse interaction network model to obtain the position detection polarity and the emotion classification polarity of the input text; the multi-graph sparse interaction network model is a graph convolutional neural network model and comprises a text coding module, a multi-graph construction module, a multi-graph sparse interaction module and a task-related attention module; the input text comprises a tweet text and a target text;

the multi-graph construction module is used for constructing a position task graph, an emotion task graph and a task relation graph of the multi-graph sparse interaction network model; the position task graph is a graph constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the position detection task; the emotion task graph is a graph constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the emotion classification task; the task relation graph is a graph constructed according to the word category importance weights of the words of the input text and the syntactic dependency tree of the input text;

constructing the position task graph and the emotion task graph of the multi-graph sparse interaction network model comprises: constructing a first syntactic dependency tree according to the syntactic structure of the tweet text, and obtaining a root word set of the first syntactic dependency tree; adding the words of the target text to the first syntactic dependency tree according to the word connection relation between the target text and the root word set, to obtain a second syntactic dependency tree of the input text; calculating a first pragmatic weight and a second pragmatic weight of the tweet text and a third pragmatic weight of the target text, wherein the first pragmatic weight is the pragmatic weight of each word of the tweet text in the position detection task, and the second pragmatic weight is the pragmatic weight of each word of the tweet text in the emotion classification task; constructing the position task graph according to the first pragmatic weight, the third pragmatic weight and the second syntactic dependency tree, and constructing the emotion task graph according to the second pragmatic weight and the second syntactic dependency tree;

constructing the task relation graph of the multi-graph sparse interaction network model comprises: constructing word category importance weights, and constructing a position label importance relation vector and an emotion label importance relation vector according to the word category importance weights, the word category importance weight being the relationship between a word and the label categories of the different tasks; calculating a task interaction relation according to the position label importance relation vector and the emotion label importance relation vector, and constructing the task relation graph according to the task interaction relation;

2. The multi-graph sparse interaction network-based multi-task position detection method according to claim 1, wherein the intra-graph node features of the position task graph, the emotion task graph and the task relation graph are updated, specifically:

according to the formula

denotes the parameters of the l-th network layer;

denotes the adjacency matrix of the position task graph, the emotion task graph or the task relation graph; and

denotes the adjacency matrix with its diagonal filled with 1.

3. The multi-graph sparse interaction network-based multi-task position detection method according to claim 2, wherein the sparse interaction of the node features among the graphs is updated, specifically:

according to the following formula:

and

respectively denote the node-feature matrices of the position task graph $g^{st}$, the emotion task graph $g^{se}$ and the task relation graph $g^{re}$ after their intra-graph node features have been updated by the GCN graph neural network; α is a hyper-parameter that controls how much node information of the position task graph and of the emotion task graph is fused by the task relation graph; and l denotes the index of the current network layer.

4. The multi-graph sparse interaction network-based multi-task position detection method according to claim 3, wherein sparse interaction between the position task graph and the task relationship graph and between the emotion task graph and the task relationship graph is encouraged according to a first loss function; and encouraging the interaction between the position task graph and the task relation graph and the interaction between the emotion task graph and the task relation graph according to the second loss function.

5. The multi-graph sparse interaction network-based multi-task position detection method according to claim 4, wherein the first loss function is specifically:

the second loss function is specifically:

in the formula,

is the sparse mask matrix of the position task,

6. The multi-graph sparse interaction network based multitask position detection method according to claim 5,

calculating the position feature representation $r^{st}$ of the position task graph according to the following formula:

in the formula, $\alpha_i$ is the attention weight of the position feature representation, $h_i$ is the feature vector of the i-th word after encoding by the BERT model, and m + n is the length of the input text.

7. The multi-graph sparse interaction network-based multi-task position detection method according to claim 6, wherein the emotional feature representation $r^{se}$ of the emotion task graph is calculated according to the following formula:

in the formula, $\alpha'_i$ is the attention weight of the emotional feature representation,

8. The multi-graph sparse interaction network-based multi-task position detection method according to any one of claims 1 to 7, wherein the position detection polarity and the emotion classification polarity of the input text are calculated according to the following formula:

$$y^{task} = \mathrm{softmax}\left(W^{task} r^{task} + b^{task}\right);$$

in the formula, $y^{task}$ is the position detection polarity or the emotion classification polarity predicted by the multi-graph sparse interaction network model, $W^{task}$ is the weight of the fully connected layer, $b^{task}$ is the bias corresponding to $W^{task}$, $r^{task}$ is the position feature representation or the emotional feature representation, and softmax is the activation function.