CN114969318A

CN114969318A - Multi-task standpoint detection method based on multi-graph sparse interaction network

Info

Publication number: CN114969318A
Application number: CN202210069686.4A
Authority: CN
Inventors: 廖清; 柴合言; 丁烨; 方滨兴; 高翠芸; 王晔; 王轩
Original assignee: Dongguan University of Technology; Shenzhen Graduate School Harbin Institute of Technology
Current assignee: Dongguan University of Technology; Shenzhen Graduate School Harbin Institute of Technology
Priority date: 2022-01-21
Filing date: 2022-01-21
Publication date: 2022-08-30
Anticipated expiration: 2042-01-21
Also published as: CN114969318B

Abstract

The invention discloses a multi-task stance detection method based on a multi-graph sparse interaction network. In the method, an input text is fed into a multi-graph sparse interaction network model to obtain the stance detection polarity and the emotion classification polarity of the input text. The multi-graph sparse interaction network model comprises a text encoding module, a multi-graph construction module, a multi-graph sparse interaction module and a task-related attention module. The multi-graph construction module constructs the stance task graph, the emotion task graph and the task relation graph of the model. The multi-graph sparse interaction module updates the intra-graph node features of the stance task graph, the emotion task graph and the task relation graph, and updates the sparse interaction of node features between the graphs. The task-related attention module calculates the stance detection polarity and the emotion classification polarity of the input text. The technical scheme of the invention improves the accuracy of stance detection on tweet text.

Description

Multi-task standpoint detection method based on multi-graph sparse interaction network

Technical Field

The invention relates to the technical field of stance detection, in particular to a multi-task stance detection method based on a multi-graph sparse interaction network.

Background

Existing stance detection methods mainly use machine learning and deep learning. Machine learning methods require extensive feature engineering: features are extracted manually, and a machine learning model such as a support vector machine (SVM), a decision tree or a random forest is then trained on the extracted features. The main disadvantages are that feature engineering consumes a lot of time and that manually selected features carry limited information, which reduces model performance to a certain extent. In addition, most machine learning methods have a large number of hyper-parameters whose optimal values must be selected manually, which is time-consuming and labor-intensive and prevents large-scale application. Early deep-learning-based stance detection methods mainly used recurrent neural networks (RNN), convolutional neural networks (CNN) and attention mechanisms to automatically capture the features of a news text, thereby improving stance detection performance. Influenced by the popularity of Transformers, some recent work improves performance on the stance detection task with the BERT model, mainly by exploiting BERT's powerful word embedding capability. The deep-learning-based stance detection methods most relevant to the invention use auxiliary information, such as emotional information, expression information, or the subjective or objective nature of the text, to construct an auxiliary task that helps improve performance on the stance detection task. In such methods, an auxiliary LSTM network is designed to extract emotional features, a main LSTM network extracts the features of the main task, and the emotional features and the stance features are then simply concatenated to predict the stance of a news text.

In the prior art, deep-learning-based stance detection methods such as RNN and CNN cannot capture the relevance between the target expression and the stance expression. Auxiliary-task-based stance detection methods simply concatenate the emotional features and the stance features, ignoring the complexity of the relationship between the tasks and treating the two tasks as equally important; this produces large negative transfer and reduces model performance. Moreover, when an auxiliary task is used, only the performance of the main task is considered, and no attention is paid to improving the performance of the auxiliary task. As a result, the accuracy of the prior art in stance detection on tweet text is low.

Disclosure of Invention

The invention provides a multi-task stance detection method based on a multi-graph sparse interaction network, which improves the accuracy of stance detection on tweet text.

An embodiment of the invention provides a multi-task stance detection method based on a multi-graph sparse interaction network, which comprises the following steps:

inputting an input text into a multi-graph sparse interaction network model to obtain the stance detection polarity and the emotion classification polarity of the input text; the multi-graph sparse interaction network model is a graph convolutional neural network model and comprises a text encoding module, a multi-graph construction module, a multi-graph sparse interaction module and a task-related attention module; the input text comprises a tweet text and a target text;

the text encoding module is used for processing the input text into a plurality of word vectors for use by the multi-graph construction module, the multi-graph sparse interaction module and the task-related attention module;

the multi-graph construction module is used for constructing the stance task graph, the emotion task graph and the task relation graph of the multi-graph sparse interaction network model; the stance task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the stance detection task; the emotion task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the emotion classification task; the task relation graph is constructed according to the word category importance weights of the words of the input text and the syntactic dependency tree of the input text;

the multi-graph sparse interaction module is used for updating the intra-graph node features of the stance task graph, the emotion task graph and the task relation graph and for updating the sparse interaction of node features between the graphs; the sparse interaction between the graphs comprises sparse interaction between the stance task graph and the task relation graph and sparse interaction between the emotion task graph and the task relation graph;

and the task-related attention module is used for calculating the stance detection polarity and the emotion classification polarity of the input text according to the stance feature representation of the stance task graph and the emotion feature representation of the emotion task graph.

Further, constructing the stance task graph and the emotion task graph of the multi-graph sparse interaction network model comprises the following steps:

constructing a first syntactic dependency tree according to the syntactic structure of the tweet text, and acquiring a root word set of the first syntactic dependency tree;

adding the words of the target text to the first syntactic dependency tree according to the word connection relation between the target text and the root word set, to obtain a second syntactic dependency tree of the input text;

calculating a first pragmatic weight and a second pragmatic weight of the tweet text and a third pragmatic weight of the target text, wherein the first pragmatic weight is the pragmatic weight of each word of the tweet text in the stance detection task, and the second pragmatic weight is the pragmatic weight of each word of the tweet text in the emotion classification task;

and constructing the stance task graph according to the first pragmatic weight, the third pragmatic weight and the second syntactic dependency tree, and constructing the emotion task graph according to the second pragmatic weight and the second syntactic dependency tree.

Further, constructing the task relation graph of the multi-graph sparse interaction network model specifically comprises:

constructing word category importance weights, and constructing a stance label importance relation vector and an emotion label importance relation vector according to the word category importance weights; the word category importance weight describes the relationship between a word and the label categories of the different tasks;

and calculating a task interaction relation according to the stance label importance relation vector and the emotion label importance relation vector, and constructing the task relation graph according to the task interaction relation.

Further, the intra-graph node features of the stance task graph, the emotion task graph and the task relation graph are updated separately for each graph according to a graph convolution formula, wherein $task \in \{st, se, re\}$ denotes the stance task graph, the emotion task graph and the task relation graph respectively, $I$ is an identity matrix, $\sigma$ is a nonlinear activation function, $l$ is the index of the network layer of the current iteration, the weight matrix holds the parameters of the $l$-th network layer, the adjacency matrix is that of the stance task graph, the emotion task graph or the task relation graph, and the modified adjacency matrix is the adjacency matrix with its diagonal filled with 1.
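A hedged reading of the symbol definitions above, assuming the standard graph convolution layer rather than the exact expression of the filing, is the following. The name $H_{task}^{(l)}$ for the node feature matrix at layer $l$ is an assumption introduced here, and any degree normalization of the modified adjacency matrix is left out because the text does not state one.

```latex
H_{task}^{(l)} = \sigma\!\left(\tilde{A}^{task}\, H_{task}^{(l-1)}\, W_{task}^{(l)}\right),
\qquad \tilde{A}^{task} = A^{task} + I,
\qquad task \in \{st, se, re\}
```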

Further, the sparse interaction of node features between the graphs is updated according to formulas in which the three node feature matrices are the matrix representations of the node features of the stance task graph $g^{st}$, the emotion task graph $g^{se}$ and the task relation graph $g^{re}$ after intra-graph node feature updating by the GCN graph neural network; $\alpha$ is a hyper-parameter that controls how much node information from the stance task graph and from the emotion task graph is fused into the task relation graph; and $l$ is the index of the network layer of the current iteration.

Furthermore, sparse interaction between the stance task graph and the task relation graph and between the emotion task graph and the task relation graph is encouraged by a first loss function; and multiple interactions between the stance task graph and the task relation graph and between the emotion task graph and the task relation graph are encouraged by a second loss function.

Further, the first loss function and the second loss function are both defined in terms of the sparse mask matrix of the stance task and the sparse mask matrix of the emotion task, where $l$ denotes the index of the network layer of the current iteration and $L$ denotes the preset total number of network layers.

Further, the stance feature representation $r^{st}$ of the stance task graph is calculated according to the following formula:

$r^{st} = \sum_{i=1}^{m+n} \alpha_i h_i$

where $\alpha_i$ is the attention weight of the stance feature representation for the $i$-th word, $h_i$ is the feature vector of the $i$-th word after encoding by the BERT model, and $m+n$ is the length of the input text.

Further, the emotion feature representation $r^{se}$ of the emotion task graph is calculated according to the following formula:

$r^{se} = \sum_{i=1}^{m+n} \alpha'_i\, g_i^{se}$

where $\alpha'_i$ is the attention weight of the emotion feature representation, $g_i^{se}$ is the feature representation of the $i$-th node in the output $g^{se}$ of the emotion task graph, and $m+n$ is the length of the input text.

Further, the stance detection polarity and the emotion classification polarity of the input text are calculated according to the following formula:

$y^{task} = \mathrm{softmax}(W^{task} r^{task} + b^{task})$;

where $y^{task}$, with $task \in \{st, se\}$, is the stance detection polarity or the emotion classification polarity predicted by the multi-graph sparse interaction network model, $W^{task}$ is the weight of the fully connected layer, $b^{task}$ is the bias corresponding to $W^{task}$, $r^{task}$ is the stance or emotion feature representation, and softmax is the activation function.
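The classification step above can be sketched in a few lines of NumPy. This is only an illustration of $y = \mathrm{softmax}(W r + b)$: the function names, the feature dimension (4) and the number of polarity classes (3) are assumptions made for the example, not values taken from the text.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over the last axis."""
    shifted = x - np.max(x, axis=-1, keepdims=True)
    e = np.exp(shifted)
    return e / np.sum(e, axis=-1, keepdims=True)

def predict_polarity(r, W, b):
    """y = softmax(W r + b) for one task (stance or emotion)."""
    return softmax(W @ r + b)

# Hypothetical sizes: 4-dimensional feature, 3 polarity classes.
rng = np.random.default_rng(0)
r_st = rng.normal(size=4)          # stand-in for the stance feature r^st
W_st = rng.normal(size=(3, 4))     # fully connected layer weight
b_st = np.zeros(3)                 # fully connected layer bias
y_st = predict_polarity(r_st, W_st, b_st)
```

The same function would be called a second time with the emotion feature and its own weight and bias, since each task has a separate fully connected layer.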

The embodiment of the invention has the following beneficial effects:

The invention provides a multi-task stance detection method based on a multi-graph sparse interaction network. By jointly training the emotion analysis task and the stance detection task, a task-specific graph (the stance task graph and the emotion task graph) is constructed for each task together with a task-related graph (the task relation graph), and a sparse interaction module between the graphs is constructed. Sparse interaction between the stance task graph and the emotion task graph is thereby realized, information sharing between the tasks is facilitated, and the expressive power of the multi-graph sparse interaction network model on each task is improved.

Drawings

Fig. 1 is a schematic structural diagram of a multi-graph sparse interaction network model according to an embodiment of the present invention.

Detailed Description

The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it is obvious that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.

As shown in Fig. 1, in the multi-task stance detection method based on the multi-graph sparse interaction network provided by an embodiment of the present invention, an input text is fed into a multi-graph sparse interaction network model to obtain the stance detection polarity and the emotion classification polarity of the input text; the multi-graph sparse interaction network model is a graph convolutional neural network model and comprises a text encoding module, a multi-graph construction module, a multi-graph sparse interaction module and a task-related attention module; the input text comprises a tweet text and a target text;

As an embodiment, constructing the task relation graph of the model comprises the following steps:

constructing word category importance weights, and constructing a stance label importance relation vector and an emotion label importance relation vector according to the word category importance weights;

As one embodiment, the text encoding module uses BERT to encode the target words T in the target text and the tweet text content C. First, the target words T and the tweet text content C are concatenated to form the input text S, where $S = \{t_1, \dots, t_m, w_1, w_2, \dots, w_n\}$. S is then converted into the input format of the BERT model, $[\mathrm{CLS}]\; t_1 \dots t_m\; w_1 w_2 \dots w_n\; [\mathrm{SEP}]$, and fed into the BERT network model to capture the contextual features of the input text. The process is defined by the following formula:

$H = \mathrm{BERT}(S)$

where H is the output of the BERT network model, $H = \{h_1, h_2, \dots, h_{m+n}\}$. Each element of H is the feature representation of one word of the input text; $h_t$ is the feature representation of the $t$-th word and includes its context information.
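The input format described above can be sketched as follows. This only illustrates how the token sequence is laid out; the helper name `build_bert_input` and the example words are made up, and the call to an actual BERT encoder is deliberately left out.

```python
def build_bert_input(target_tokens, tweet_tokens):
    """Return the token sequence [CLS] t_1..t_m w_1..w_n [SEP]."""
    return ["[CLS]"] + list(target_tokens) + list(tweet_tokens) + ["[SEP]"]

target = ["climate", "change"]            # t_1..t_m, m = 2 (made-up example)
tweet = ["we", "must", "act", "now"]      # w_1..w_n, n = 4 (made-up example)
tokens = build_bert_input(target, tweet)

# Once the [CLS] and [SEP] positions are set aside, the encoder output
# is expected to hold one vector per word: h_1..h_{m+n}.
num_word_vectors = len(tokens) - 2
```

In practice a subword tokenizer may split a word into several pieces, so mapping encoder outputs back to $m+n$ word vectors needs an extra alignment step that the text does not describe.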

As one embodiment, constructing the stance task graph and the emotion task graph comprises the following steps:

adding the words of the target text to the first syntactic dependency tree according to the word connection relation between the target text and the root word set, to obtain a second syntactic dependency tree of the input text; and calculating a first pragmatic weight and a second pragmatic weight of the tweet text and a third pragmatic weight of the target text. The first pragmatic weight is the pragmatic weight of each word of the tweet text in the stance detection task (that is, the pragmatic weight of the word with respect to the stance task is calculated from the word frequency and the relative co-occurrence frequency of the word with the stance labels "support" and "oppose" over the whole corpus). The second pragmatic weight is the pragmatic weight of each word of the tweet text in the emotion classification task (that is, the pragmatic weight of the word with respect to the emotion task is calculated from the word frequency and the relative co-occurrence frequency of the word with the emotion labels over the whole corpus). The pragmatic weight refers to the influence of a word of the text on the inference made about a specific target.

Specifically, each tweet text is parsed with a syntactic parsing tool to construct a syntactic dependency tree. Words are represented as nodes of a graph (the node features are the word vectors of the words in the tweet text), and the relations between words in the syntactic dependency tree are represented as edges between the nodes, so as to construct a base graph of the target text T and the tweet text C. The method comprises the following steps:

A first syntactic dependency tree is constructed according to the syntactic structure of the tweet text C, and the root word set $W^r$ of the first syntactic dependency tree is obtained by the sentence parser.

Since the target text T is not a complete sentence but a phrase or a word, it cannot be modeled as a syntactic dependency tree. Therefore, according to the word connection relationship between the target text and the root word set, the embodiment of the present invention adds the words of the target text to the first syntactic dependency tree to obtain a second syntactic dependency tree of the input text S.

In the calculation formula of the second syntactic dependency tree, $W^r$ denotes the root words of the first syntactic dependency tree, and $w_i$ and $w_j$ are any two different words in the input text S, which consists of the tweet text and the target text.
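One way to picture the second syntactic dependency tree is as an adjacency matrix over the whole input text. The sketch below is an assumption-laden illustration, not the formula of the filing: it assumes undirected edges, assumes that every target word is linked to every root word, and uses a hypothetical helper name and a made-up example.

```python
import numpy as np

def second_dependency_adjacency(m, n, tweet_edges, root_indices):
    """Adjacency over the input text S = [t_1..t_m, w_1..w_n].

    tweet_edges: (i, j) pairs of tweet-word indices (0-based within the
        tweet) that are linked in the first syntactic dependency tree.
    root_indices: tweet-word indices belonging to the root word set W^r.
    Assumption: every target word is linked to every root word.
    """
    A = np.zeros((m + n, m + n), dtype=int)
    for i, j in tweet_edges:                 # edges inside the tweet
        A[m + i, m + j] = A[m + j, m + i] = 1
    for t in range(m):                       # target word -> root word edges
        for r in root_indices:
            A[t, m + r] = A[m + r, t] = 1
    return A

# Made-up example: 1 target word, 3 tweet words, root is tweet word 1.
A = second_dependency_adjacency(
    m=1, n=3, tweet_edges=[(0, 1), (1, 2)], root_indices=[1]
)
```

The dependency edges themselves would come from a syntactic parser; they are passed in as plain index pairs here so that the example stays self-contained.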

In order to capture the importance of the words in the input text and the interaction features between the words, the pragmatic weight of each word for the different tasks and the word frequency of each word need to be calculated. The frequency with which each word of the input text appears in the whole corpus is calculated from $N(w_i)$, the number of times the word $w_i$ appears in the corpus, and $N$, the number of all words in the corpus. The embodiment of the invention calculates, for each task, the pragmatic weight of each word in that task, comprising a first pragmatic weight $\phi^{task}(w_i)|_{task=stance}$ and a second pragmatic weight $\phi^{task}(w_i)|_{task=sentiment}$. When the first and second pragmatic weights are calculated, only categories with practical significance are considered; the two label categories that carry no useful information, namely neutral stance and neutral emotion, are omitted. The calculation is defined for $w_i \in C$.

In the formula, $N(w_i, label_+)$ and $N(w_i, label_-)$ denote the number of times the word $w_i$ appears under the stance task labels "support" and "oppose", or under the emotion task labels "positive" and "negative", respectively; $N(label_+)$ and $N(label_-)$ denote the total numbers of the stance task labels "support" and "oppose", or of the emotion task labels "positive" and "negative", respectively; $\mu$ is the mean and $\delta$ is the standard deviation.
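The corpus statistics that feed the pragmatic weights can be gathered as sketched below. The sketch assumes that the word frequency is the ratio $N(w_i)/N$ and that $N(w_i, label)$ counts word occurrences in texts carrying that label; it stops at the counts and does not attempt the final pragmatic weight, whose formula is not reproduced here. Function names and the toy corpus are made up.

```python
from collections import Counter

def word_frequency(corpus):
    """corpus: list of tokenized texts. Returns {word: N(w) / N}."""
    counts = Counter(w for text in corpus for w in text)
    total = sum(counts.values())
    return {w: c / total for w, c in counts.items()}

def label_cooccurrence(corpus, labels, keep):
    """Count N(w, label) for the labels in `keep`.

    Texts whose label is not in `keep` (for example "neutral") are
    skipped, mirroring the omission of neutral categories.
    """
    counts = {lab: Counter() for lab in keep}
    for text, lab in zip(corpus, labels):
        if lab in counts:
            counts[lab].update(text)
    return counts

corpus = [["tax", "cuts", "help"], ["tax", "cuts", "hurt"], ["tax", "day"]]
labels = ["support", "oppose", "neutral"]
freq = word_frequency(corpus)
cooc = label_cooccurrence(corpus, labels, keep=("support", "oppose"))
```

The same two functions apply to the emotion task by passing emotion labels and `keep=("positive", "negative")`.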

The third pragmatic weight of the target text is calculated by a corresponding formula. It is used to establish the relation between the target text and the tweet text, that is, to construct the edges between the target text and the tweet text in the graph.

A first adjacency matrix of the stance task graph is calculated according to the first pragmatic weight, the third pragmatic weight and the second syntactic dependency tree, and the stance task graph is obtained from the first adjacency matrix.

A second adjacency matrix of the emotion task graph is calculated according to the second pragmatic weight and the second syntactic dependency tree, and the emotion task graph is obtained from the second adjacency matrix.

Specifically, the first adjacency matrix and the second adjacency matrix are calculated according to formulas in which $s_j$ and $s_i$ are words in the input text.

As an embodiment, the task relation graph of the multi-graph sparse interaction network model is constructed as follows:

Specifically, the interaction between different tasks is fine-grained, that is, the relationship between the tasks differs from one label category to another. The embodiment of the present invention therefore uses the word category importance weight to establish the relation between a word and the label categories of the different tasks, so as to obtain the label importance relation features of different words for different tasks; this facilitates information interaction between the tasks and improves the expressive power of the multi-graph sparse interaction network model on each task. In the formula for the word category importance weight, $w_i \in C$ is a word of the tweet text; $c_i$ is a label category of the corresponding task (for example, for the stance detection task, $c_i \in \{\text{support}, \text{oppose}\}$); one count is the number of times the word $w_i$ appears under label category $c_i$ of the task; another is the total number of word occurrences under label category $c_i$ of the task; $|w_i|$ is the number of times the word $w_i$ appears in the corpus; and $task$ denotes the stance task or the emotion task. The word category importance weight expresses the importance relationship between a word and the different label categories of the different tasks, that is, the importance of the word in the two tasks, and is used to capture the similarity between the two tasks at the level of task labels. First, the relation between the word and each label category of the stance task is calculated to form a vector, the stance label importance relation vector; then the relation between the word and each label category of the emotion task is calculated to form a vector, the emotion label importance relation vector. Finally, the similarity of the two vectors is calculated and used as the importance weight of the word with respect to the label categories of each task, which measures the similarity relationship of the word between the two tasks.

Specifically, a label importance relation vector is constructed for each task from the calculated word category importance weights, giving the stance label importance relation vector $\Phi^{stance}(w_i)$ and the emotion label importance relation vector $\Phi^{sentiment}(w_i)$.

The importance relation vector represents the importance relationship of each word to the different label categories. In its definition, $task$ denotes the stance task or the emotion task, $c_i$ denotes a label of the task, and each entry is the normalized form of the corresponding word category importance weight, obtained with the mean and the standard deviation of the word category importance weights. Through this calculation, the importance relation vectors of a word for the different label categories under the different tasks are obtained, namely the stance label importance relation vector $\Phi^{stance}(w_i)$ and the emotion label importance relation vector $\Phi^{sentiment}(w_i)$.
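The normalization step can be illustrated with a z-score, which is the usual way to normalize with a mean and a standard deviation. Two points are assumptions of this sketch rather than statements of the text: that the normalization is exactly $(x - \mu)/\sigma$, and that the mean and standard deviation are taken over all words for one label. The raw weights are made-up numbers.

```python
import numpy as np

def zscore(values):
    """Standardize an array: (x - mean) / std, guarding against std == 0."""
    values = np.asarray(values, dtype=float)
    std = values.std()
    return (values - values.mean()) / (std if std > 0 else 1.0)

def label_importance_vectors(weights):
    """weights: array of shape (num_words, num_labels) holding the raw
    word category importance weights of one task. Each label column is
    standardized; row i is then the importance vector of word i."""
    weights = np.asarray(weights, dtype=float)
    columns = [zscore(weights[:, c]) for c in range(weights.shape[1])]
    return np.stack(columns, axis=1)

# Made-up raw weights for 3 words and the labels (support, oppose).
raw_stance = np.array([[0.9, 0.1], [0.5, 0.5], [0.1, 0.9]])
phi_stance = label_importance_vectors(raw_stance)
```

Calling the same function on the emotion task's raw weights would give the emotion label importance relation vectors.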

The word-level task interaction relation $\xi(w_i)$ is calculated from the stance label importance relation vector and the emotion label importance relation vector, where $st$ denotes the stance detection task and $se$ denotes the emotion classification task. Based on the second syntactic dependency tree of the input text S, the task interaction relation $\xi(w_i)$ is standardized and used to construct the adjacency matrix of the task relation graph, from which the task relation graph is obtained.

As one embodiment, the intra-graph node features of the stance task graph, the emotion task graph and the task relation graph are updated, that is, a horizontal intra-graph update is performed. Specifically, the embodiment of the present invention uses a graph convolutional network (GCN) to perform iterative intra-graph updates. Considering only this horizontal update, the update process is independent for each graph, that is, each graph updates its node features independently. The stance task graph, the emotion task graph and the task relation graph each update their intra-graph node features according to a formula in which $task \in \{st, se, re\}$ denotes the stance task graph, the emotion task graph and the task relation graph respectively; the node features output by the $(l-1)$-th network layer serve as input; the weight matrix holds the parameters of the $l$-th network layer, namely the convolution kernel of the graph convolutional network, and is updated in the conventional way for graph convolutional networks; $I$ is an identity matrix; $\sigma$ is a nonlinear activation function; the modified adjacency matrix is the adjacency matrix with its diagonal filled with 1; and $l$ is the index of the network layer of the current iteration. An initial node feature representation is defined for the first layer.
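The independent per-graph update can be sketched with NumPy as below. The sketch assumes the plain layer $\sigma((A + I) H W)$ with ReLU as the activation, omits any degree normalization, and assumes that all three graphs start from the same initial node features; the adjacency matrices and sizes are made up for the example.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One layer: ReLU((A + I) H W). Degree normalization is omitted."""
    A_tilde = A + np.eye(A.shape[0])        # fill the diagonal with 1
    return np.maximum(A_tilde @ H @ W, 0.0)

rng = np.random.default_rng(0)
num_nodes, dim = 4, 8

# Made-up adjacency matrices standing in for the three graphs.
graphs = {
    name: np.ones((num_nodes, num_nodes)) - np.eye(num_nodes)
    for name in ("st", "se", "re")
}
H0 = rng.normal(size=(num_nodes, dim))      # shared initial features (assumed)
weights = {name: rng.normal(size=(dim, dim)) for name in graphs}

# Each graph updates its own node features independently of the others.
H1 = {name: gcn_layer(graphs[name], H0, weights[name]) for name in graphs}
```

Because each graph has its own weight matrix, the three outputs differ even when the adjacency matrices and the input features coincide, as they do in this toy setup.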

The stance task graph $g^{st}$ and the emotion task graph $g^{se}$ each interact sparsely with the task relation graph $g^{re}$, which realizes interaction between $g^{st}$ and $g^{se}$ and thus helps the different tasks share information. The connection between $g^{st}$ and $g^{re}$ and the connection between $g^{se}$ and $g^{re}$ are initialized, and a sparse mask matrix is defined for the stance task and another for the emotion task. A mask matrix $z$ represents a group of random binary variables; the two mask matrices control the interaction between the stance task graph and the task relation graph and the interaction between the emotion task graph and the task relation graph, respectively. Therefore, during the iterative interaction of the $l$-th graph layer, the sparse interaction of node features between the graphs, namely between the stance task graph and the task relation graph and between the emotion task graph and the task relation graph, is updated according to formulas (1) to (3):

where the two connection parameters are the parameters of the connecting edges between the stance task graph and the task relation graph and between the emotion task graph and the task relation graph, respectively; $l$ is the index of the network layer of the current iteration; the three node feature matrices are the matrix representations of the node features of the stance task graph $g^{st}$, the emotion task graph $g^{se}$ and the task relation graph $g^{re}$ after intra-graph node feature updating by the GCN graph neural network; and $\alpha$ is a hyper-parameter that controls how much node information from the stance task graph and from the emotion task graph is fused into the task relation graph.

Since the random binary variables represented by the mask matrix $z$ are discrete and therefore not differentiable, they cannot be optimized directly in a deep learning model. Therefore, the embodiment of the present invention uses the Gumbel-Softmax distribution to relax the discrete random variable $z$ into a continuous variable $v$, so that it fits the update process of the multi-graph sparse interaction network model. In the calculation formula, $i$ and $j$ take the values 0 and 1, $\tau$ is a scaling (temperature) parameter, $G_l$ is an independent and identically distributed sample drawn from the standard Gumbel distribution $\mathrm{Gumbel}(0, 1)$, and $\pi_l = [1 - z_l, z_l]$ is the distribution given by the mask matrix $z$, which represents a set of random binary variables.
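The relaxation can be sketched with the commonly used Gumbel-Softmax form $v = \mathrm{softmax}((\log \pi + G)/\tau)$. Whether the filing uses exactly this expression is an assumption; the function name, the probability 0.7 and the temperature 0.5 are made-up values.

```python
import numpy as np

def gumbel_softmax(pi, tau, rng):
    """Relax a categorical sample with probabilities `pi` (last axis).

    v = softmax((log(pi) + G) / tau), with G ~ Gumbel(0, 1) i.i.d.
    """
    u = rng.uniform(low=1e-10, high=1.0, size=pi.shape)
    g = -np.log(-np.log(u))                       # standard Gumbel noise
    logits = (np.log(pi + 1e-10) + g) / tau
    logits -= logits.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(logits)
    return e / e.sum(axis=-1, keepdims=True)

rng = np.random.default_rng(0)
z_l = 0.7                              # made-up interaction probability at layer l
pi_l = np.array([1.0 - z_l, z_l])      # pi_l = [1 - z_l, z_l]
v_l = gumbel_softmax(pi_l, tau=0.5, rng=rng)
```

A small $\tau$ pushes `v_l` toward a one-hot vector (close to a hard 0/1 mask), while a large $\tau$ makes it smoother; in both cases the output stays differentiable with respect to `pi_l`.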

To improve the training efficiency of the model, sparse interaction between the stance task graph and the task relation graph and between the emotion task graph and the task relation graph is encouraged by a first loss function (sparsity regularization), in which $l$ denotes the index of the network layer of the current iteration and $L$ denotes the preset total number of network layers.

If the interaction between the graphs becomes too sparse, the whole multi-graph sparse interaction network model degenerates into three independent graph networks. To avoid this, multiple interactions between the stance task graph and the task relation graph and between the emotion task graph and the task relation graph are encouraged by a second loss function (sharing regularization); that is, more interaction is encouraged at the lower layers so that low-level information is shared among the tasks.

The multi-graph sparse interaction module shares information between the two tasks so that information helpful for training is obtained and noise that would harm task training is filtered out, which improves the efficiency and quality of sharing between the tasks and further improves the performance of the multi-graph sparse interaction network model on each task.

As one of the embodiments, a vertical feature representation g of a vertical task graph is obtained at the task-related attention module ^st And emotional feature representation g of emotional task graph ^se . For the position detection task, in order to obtain the position feature representation related to the target, a mask mechanism is required to be adopted to filter out non-target words, and specifically, the design is adoptedThe mask matrix is used for setting the corresponding position of the target word to be 1 and setting the corresponding position of the non-target word to be 0 so as to obtain the characteristic representation of the elevation task graph after the mask matrix is converted

A retrieval-based attention mechanism is then used to obtain a richer position feature representation related to the target words, wherein the attention weight α of the position feature representation is calculated according to the following formula:

where h is the output of the BERT network model and h_t is the feature representation of the t-th word after encoding by the BERT network model; m + n is the number of words in the input text (the target text has length m and the tweet text has length n); the node term in the formula is the feature representation of the i-th node (i.e., the i-th word vector) of the position task graph output after mask matrix conversion; β_t is the attention weight of the t-th word vector and β_i is the attention weight of the position feature representation of the i-th word vector; and α_t is the normalized attention weight of the t-th word vector. The attention weights of the position feature representation express how much attention each word in the context pays to the position feature.

Then, the position feature representation r^st of the position task graph is calculated according to the following formula:

where h_i is the feature vector of the i-th word encoded by the BERT model, and α_i is the attention weight of the position feature representation of the i-th word vector.
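The attention computation described above can be sketched as follows. Because the formula images are not reproduced in this text, the sketch rests on three assumptions: the score β_t is taken to be the sum of dot products between h_t and every masked node feature, the normalization that produces α_t is taken to be a softmax, and r^st is taken to be the α-weighted sum of the BERT outputs h_t. The function name `retrieval_attention` and the toy values are hypothetical.

```python
import math


def retrieval_attention(h, g_masked):
    """Return (alphas, r) for BERT outputs h and masked node features g_masked."""
    # Assumed score: beta_t = sum_i <h_t, g_masked_i>.
    betas = [sum(sum(a * b for a, b in zip(h_t, g_i)) for g_i in g_masked)
             for h_t in h]
    # Assumed normalization: softmax over all m + n words (shifted for stability).
    peak = max(betas)
    exps = [math.exp(beta - peak) for beta in betas]
    total = sum(exps)
    alphas = [e / total for e in exps]
    # Assumed pooling: r = sum_t alpha_t * h_t.
    dim = len(h[0])
    r = [sum(alphas[t] * h[t][k] for t in range(len(h))) for k in range(dim)]
    return alphas, r


h = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]          # toy BERT outputs
g_masked = [[1.0, 0.0], [0.0, 0.0], [0.0, 0.0]]   # only word 0 is a target word
alphas, r_st = retrieval_attention(h, g_masked)
```

Under these assumptions, words whose encodings align with the target-word node features receive larger weights, so the pooled vector r^st emphasizes target-relevant context.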

Similarly, the emotion feature representation r^se of the emotion task graph is calculated according to formulas (4) to (6):

Here the node term in the formulas is the feature representation of the i-th node (or word vector) of the output g^se of the emotion task graph, α' is the attention weight of the emotion feature representation, β'_t is the attention weight of the emotion feature representation of the t-th word vector, and β'_i is the attention weight of the emotion feature representation of the i-th word vector.

After the final position feature representation r^st and emotion feature representation r^se are obtained, a fully connected layer fuses the text features with the rich context features to yield the position detection polarity and the emotion classification polarity of the input text:

y^{task} = softmax(W^{task} r^{task} + b^{task})

where y^{task}, with task ∈ {st, se}, is the position detection polarity or the emotion classification polarity predicted by the multi-graph sparse interaction network model, W^{task} is the weight of the fully connected layer, b^{task} is the bias corresponding to W^{task}, and softmax is the activation function.
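The classification formula y^{task} = softmax(W^{task} r^{task} + b^{task}) can be illustrated with a short sketch. The function names, the three-class output size, and all numeric values are assumptions chosen for illustration; the source does not specify the number of polarity classes or the layer dimensions.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a list of logits."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def predict_polarity(W, r, b):
    """Apply one fully connected layer followed by softmax: softmax(W r + b)."""
    logits = [sum(w * x for w, x in zip(row, r)) + bias
              for row, bias in zip(W, b)]
    return softmax(logits)


# Hypothetical 3-class layer applied to a 2-dimensional feature vector r^st.
W_st = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b_st = [0.0, 0.0, 0.0]
r_st = [0.5, 2.0]
y_st = predict_polarity(W_st, r_st, b_st)
predicted_class = max(range(len(y_st)), key=y_st.__getitem__)
```

The same function would be applied with a separate weight matrix and bias for the emotion task, since the formula is indexed by task ∈ {st, se}.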

Finally, the objective function of the whole multi-graph sparse interaction network model is a linear combination of the loss functions of the position detection task and the emotion analysis task:

In the formula, θ denotes the parameters of the multi-task graph network model; λ₁, λ₂ and λ₃ are the coefficients of the corresponding loss terms; and d denotes the d-th tweet. The remaining terms of the formula denote, respectively, the set of all tweets, the emotion task label predicted by the model for the d-th tweet, the position task label predicted by the model for the d-th tweet, the first loss function (namely, sparse regularization), and the second loss function (namely, sharing regularization), which encourages task sharing.
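The linear combination of loss terms can be sketched for a single tweet as follows. The objective formula itself is not reproduced in this text, so the sketch makes two explicit assumptions: each task loss is a cross-entropy between the predicted distribution and the true label, and λ₁, λ₂ and λ₃ weight the emotion loss, the sparse regularization term and the sharing regularization term, respectively. The function names and numeric values are hypothetical, and the two regularization values are passed in as plain numbers because their formulas are not available here.

```python
import math


def cross_entropy(probs, label):
    """Assumed per-task loss: negative log-probability of the true label."""
    return -math.log(probs[label])


def total_objective(loss_st, loss_se, sparse_reg, share_reg, lambdas):
    """Assumed pairing of the coefficients with the loss terms."""
    lam1, lam2, lam3 = lambdas
    return loss_st + lam1 * loss_se + lam2 * sparse_reg + lam3 * share_reg


# Toy predictions for one tweet: position label 0, emotion label 1.
loss_st = cross_entropy([0.7, 0.2, 0.1], 0)
loss_se = cross_entropy([0.1, 0.8, 0.1], 1)
objective = total_objective(loss_st, loss_se,
                            sparse_reg=0.5, share_reg=0.25,
                            lambdas=(0.5, 0.1, 0.2))
```

Over a data set, this per-tweet value would be accumulated across the set of all tweets before the model parameters θ are updated.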

The task-related attention module calculates the final position feature representation of the position task graph and the final emotion feature representation of the emotion task graph, and performs position detection and emotion classification according to these two representations. The embodiment of the invention constructs three graph structures to capture the interaction between tasks. Among them, the task relation graph captures the word-level correlation between the position detection task and the emotion analysis task, helping the tasks share information while reducing the noise generated during information sharing.

The embodiment of the invention predicts the public's position on, and attitude toward, a government's epidemic prevention measures from short news texts published by the public on a social platform. Because such texts are short and carry little information, a single-task learning approach has difficulty achieving high accuracy; the invention therefore provides a multi-task position detection method based on a multi-graph sparse interaction network.

The foregoing is directed to the preferred embodiment of the present invention, and it is understood that various changes and modifications may be made by one skilled in the art without departing from the spirit of the invention, and it is intended that such changes and modifications be considered as within the scope of the invention.

It will be understood by those skilled in the art that all or part of the processes of the above embodiments may be implemented by a computer program, which can be stored in a computer readable storage medium and can include the processes of the above embodiments when executed. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

Claims

1. A multi-task position detection method based on a multi-graph sparse interaction network is characterized in that,

inputting an input text into a multi-graph sparse interaction network model to obtain the position detection polarity and the emotion classification polarity of the input text; the multi-graph sparse interaction network model is a graph convolutional neural network model and comprises a text encoding module, a multi-graph construction module, a multi-graph sparse interaction module and a task-related attention module; the input text comprises a tweet text and a target text;

the multi-graph construction module is used for constructing a position task graph, an emotion task graph and a task relation graph of the multi-graph sparse interaction network model; the position task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text for the position detection task; the emotion task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text for the emotion classification task; the task relation graph is constructed according to the word category importance weights of the words of the input text and the syntactic dependency tree of the input text;

2. The multi-graph sparse interaction network-based multi-task position detection method according to claim 1, wherein the establishment of the position task graph and the emotion task graph of the multi-graph sparse interaction network model comprises the following steps:

3. The multi-task position detection method based on the multi-graph sparse interaction network as claimed in claim 2, wherein the task relationship graph of the multi-graph sparse interaction network model is constructed, specifically:

constructing word category importance weights, and constructing a position label importance relation vector and an emotion label importance relation vector according to the word category importance weights; the word category importance weight is the relationship between words and the label categories of the different tasks;

4. The multi-graph sparse interaction network-based multi-task position detection method according to claim 3, wherein the intra-graph node features of the position task graph, the emotion task graph and the task relation graph are updated, specifically:

according to the formula

the intra-graph node features of the position task graph, the emotion task graph and the task relation graph are respectively updated, where task ∈ {st, se, re} denotes the position task graph, the emotion task graph and the task relation graph respectively, I is the identity matrix, σ is a nonlinear activation function, l denotes the current network layer, the weight term denotes the parameters of the l-th network layer, and the adjacency term denotes the adjacency matrix with its diagonal filled with 1.

5. The multi-graph sparse interaction network-based multi-task position detection method according to claim 4, wherein the sparse interaction update of the node features among the graphs is specifically:

according to the following formula:

the node features among the graphs are updated through sparse interaction, where the three feature terms in the formula respectively denote the node feature matrices of the position task graph g^st, the emotion task graph g^se and the task relation graph g^re after the intra-graph node feature update by the GCN graph neural network; α is a hyper-parameter controlling how much node information of the position task graph and of the emotion task graph is fused into the task relation graph; and l is the current network layer.

6. The multi-graph sparse interaction network-based multi-task position detection method according to claim 5, wherein sparse interaction between the position task graph and the task relationship graph and between the emotion task graph and the task relationship graph is encouraged according to a first loss function; and multiple interactions between the position task graph and the task relation graph and between the emotion task graph and the task relation graph are encouraged according to the second loss function.

7. The multi-graph sparse interaction network-based multi-task position detection method according to claim 6, wherein the first loss function is specifically:

the second loss function is specifically:

in the formula

For the sparse mask matrix of the position task,

8. The multi-graph sparse interaction network based multitask position detection method according to claim 7,

calculating the position feature representation r^st of the position task graph according to the following formula:

9. The multi-graph sparse interaction network-based multi-task position detection method according to claim 8, wherein the emotion feature representation r^se of the emotion task graph is calculated according to the following formula:

where α' is the attention weight of the emotion feature representation, the node term in the formula is the feature representation of the i-th node of the output g^se of the emotion task graph, and m + n represents the length of the input text.

10. The multi-graph sparse interaction network-based multi-task position detection method according to any one of claims 1 to 9, wherein the position detection polarity and the emotion classification polarity of the input text are calculated according to the following formula:

y^{task} = softmax(W^{task} r^{task} + b^{task});

where y^{task}, with task ∈ {st, se}, is the position detection polarity or the emotion classification polarity predicted by the multi-graph sparse interaction network model, W^{task} is the weight of the fully connected layer, b^{task} is the bias corresponding to W^{task}, r^{task} is the position feature representation or the emotion feature representation, and softmax is the activation function.