CN114969318A - Multi-task standpoint detection method based on multi-graph sparse interaction network - Google Patents

Multi-task standpoint detection method based on multi-graph sparse interaction network

Info

Publication number
CN114969318A
Authority
CN
China
Prior art keywords
task
graph
emotion
text
relation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210069686.4A
Other languages
Chinese (zh)
Other versions
CN114969318B (en
Inventor
廖清
柴合言
丁烨
方滨兴
高翠芸
王晔
王轩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dongguan University of Technology
Shenzhen Graduate School Harbin Institute of Technology
Original Assignee
Dongguan University of Technology
Shenzhen Graduate School Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dongguan University of Technology, Shenzhen Graduate School Harbin Institute of Technology filed Critical Dongguan University of Technology
Priority to CN202210069686.4A priority Critical patent/CN114969318B/en
Publication of CN114969318A publication Critical patent/CN114969318A/en
Application granted granted Critical
Publication of CN114969318B publication Critical patent/CN114969318B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/279 Recognition of textual entities
    • G06F40/284 Lexical analysis, e.g. tokenisation or collocates
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Abstract

The invention discloses a multi-task stance (standpoint) detection method based on a multi-graph sparse interaction network. An input text is fed into a multi-graph sparse interaction network model to obtain the stance detection polarity and the emotion classification polarity of the input text. The model comprises a text encoding module, a multi-graph construction module, a multi-graph sparse interaction module and a task-related attention module. The multi-graph construction module builds a stance task graph, an emotion task graph and a task relation graph. The multi-graph sparse interaction module updates the node features within the stance task graph, the emotion task graph and the task relation graph, and updates the sparse interaction of node features between the graphs. The task-related attention module calculates the stance detection polarity and the emotion classification polarity of the input text. The technical scheme of the invention improves the accuracy of stance detection on tweet text.

Description

Multi-task standpoint detection method based on multi-graph sparse interaction network
Technical Field
The invention relates to the technical field of stance detection, and in particular to a multi-task stance detection method based on a multi-graph sparse interaction network.
Background
Existing stance detection methods mainly rely on machine learning and deep learning. Machine learning methods require extensive feature engineering: features are extracted manually, and a machine learning model such as a support vector machine (SVM), a decision tree or a random forest is then trained on them. The main disadvantages are that feature engineering is time-consuming and that manually selected features carry limited information, which degrades model performance to some extent. In addition, most machine learning methods have many hyperparameters whose optimal values must be selected manually, which is time-consuming and labor-intensive and prevents large-scale application. Early deep-learning-based stance detection methods mainly used recurrent neural networks (RNN), convolutional neural networks (CNN) and attention mechanisms to capture the features of a news text automatically, thereby improving stance detection performance. With the popularity of Transformers, some recent work improves the stance detection task with the BERT model, mainly by exploiting BERT's powerful word embedding capability. The deep-learning-based stance detection methods most relevant to the invention use auxiliary information, such as emotional information, expression information, or the subjective or objective nature of the text, to construct an auxiliary task that helps improve performance on the stance detection task.
In such methods, an auxiliary LSTM network is designed to extract emotional features and a main LSTM network extracts the features of the main task; the emotional features and the stance features are then simply concatenated to predict the stance of a news text.
In the prior art, deep-learning-based stance detection methods such as RNN and CNN cannot capture the relevance between the target expression and the stance expression. Auxiliary-task-based stance detection methods simply concatenate emotional features and stance features; they ignore the complexity of the relationship between the tasks and treat the two tasks as equally important, which produces large negative transfer and reduces model performance. Moreover, when an auxiliary task is used, only the performance of the main task is considered, and no attention is paid to improving the auxiliary task. Therefore, the accuracy of the prior art in stance detection on tweet text is low.
Disclosure of Invention
The invention provides a multi-task stance detection method based on a multi-graph sparse interaction network, which improves the accuracy of stance detection on tweet text.
An embodiment of the invention provides a multi-task stance detection method based on a multi-graph sparse interaction network, which comprises the following steps:
inputting an input text into a multi-graph sparse interaction network model to obtain the stance detection polarity and the emotion classification polarity of the input text; the multi-graph sparse interaction network model is a graph convolutional neural network model and comprises a text encoding module, a multi-graph construction module, a multi-graph sparse interaction module and a task-related attention module; the input text comprises a tweet text and a target text;
the text encoding module is used for processing the input text into a plurality of word vectors for use by the multi-graph construction module, the multi-graph sparse interaction module and the task-related attention module;
the multi-graph construction module is used for constructing a stance task graph, an emotion task graph and a task relation graph of the multi-graph sparse interaction network model; the stance task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the stance detection task; the emotion task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the emotion classification task; the task relation graph is constructed according to the word category importance weights of the words of the input text and the syntactic dependency tree of the input text;
the multi-graph sparse interaction module is used for updating the intra-graph node features of the stance task graph, the emotion task graph and the task relation graph, and for updating the sparse interaction of node features between the graphs; the sparse interaction between the graphs comprises sparse interaction between the stance task graph and the task relation graph and sparse interaction between the emotion task graph and the task relation graph;
and the task-related attention module is used for calculating the stance detection polarity and the emotion classification polarity of the input text according to the stance feature representation of the stance task graph and the emotion feature representation of the emotion task graph.
Further, constructing the stance task graph and the emotion task graph of the multi-graph sparse interaction network model comprises the following steps:
constructing a first syntactic dependency tree according to the syntactic structure of the tweet text, and obtaining the root word set of the first syntactic dependency tree;
adding the words of the target text to the first syntactic dependency tree according to the word connection relation between the target text and the root word set, to obtain a second syntactic dependency tree of the input text;
calculating a first pragmatic weight and a second pragmatic weight of the tweet text and a third pragmatic weight of the target text, wherein the first pragmatic weight is the pragmatic weight of each word of the tweet text in the stance detection task, and the second pragmatic weight is the pragmatic weight of each word of the tweet text in the emotion classification task;
and constructing the stance task graph according to the first pragmatic weight, the third pragmatic weight and the second syntactic dependency tree, and constructing the emotion task graph according to the second pragmatic weight and the second syntactic dependency tree.
Further, constructing the task relation graph of the multi-graph sparse interaction network model specifically comprises:
constructing word category importance weights, and constructing a stance label importance relation vector and an emotion label importance relation vector according to the word category importance weights; the word category importance weight expresses the relation between a word and the label categories of the different tasks;
and calculating a task interaction relation according to the stance label importance relation vector and the emotion label importance relation vector, and constructing the task relation graph according to the task interaction relation.
Further, updating the intra-graph node features of the stance task graph, the emotion task graph and the task relation graph specifically comprises:
according to the formula
Figure BDA0003481563650000041
respectively updating the intra-graph node features of the stance task graph, the emotion task graph and the task relation graph, wherein task ∈ {st, se, re} denotes the stance task graph, the emotion task graph and the task relation graph respectively, I is an identity matrix, σ is a nonlinear activation function, l denotes the network layer of the current iteration,
Figure BDA0003481563650000042
denotes the parameters of the l-th network layer,
Figure BDA0003481563650000043
denotes the adjacency matrix of the stance task graph, the emotion task graph or the task relation graph, and
Figure BDA0003481563650000044
denotes that adjacency matrix with its diagonal filled with 1.
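The intra-graph update described above can be sketched in plain Python. The formula itself is only available as an image, so this is a minimal sketch that assumes the common GCN form σ((A + I) H W), with ReLU standing in for σ; whether the patent's formula also applies degree normalisation cannot be determined from the text, so none is applied here. The helper names `matmul` and `gcn_layer` and all numeric values are hypothetical.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, feats, weight):
    """One intra-graph update, assumed form: ReLU((A + I) @ H @ W)."""
    n = len(adj)
    # A + I: with a zero diagonal in A this fills the diagonal with 1,
    # so every node keeps its own features.
    a_tilde = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
               for i in range(n)]
    out = matmul(matmul(a_tilde, feats), weight)
    return [[max(v, 0.0) for v in row] for row in out]  # ReLU stands in for sigma

# Each graph (st, se, re) is updated independently with its own adjacency matrix.
# For brevity the toy example shares one feature matrix and one weight matrix.
adj = {
    "st": [[0.0, 0.5], [0.5, 0.0]],
    "se": [[0.0, 0.2], [0.2, 0.0]],
    "re": [[0.0, 1.0], [1.0, 0.0]],
}
feats = [[1.0, 0.0], [0.0, 1.0]]    # toy node features of the current layer
weight = [[1.0, 0.0], [0.0, 1.0]]   # toy layer parameters
updated = {task: gcn_layer(a, feats, weight) for task, a in adj.items()}
```

With identity features and weights, each updated matrix is simply A + I, which makes the effect of the self-loops easy to inspect.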
Further, updating the sparse interaction of node features between the graphs specifically comprises:
according to the following formulas:
Figure BDA0003481563650000045
Figure BDA0003481563650000046
Figure BDA0003481563650000047
updating the sparse interaction of node features between the stance task graph, the emotion task graph and the task relation graph, wherein
Figure BDA0003481563650000048
and
Figure BDA0003481563650000049
denote the node feature matrices of the stance task graph $g_{st}$, the emotion task graph $g_{se}$ and the task relation graph $g_{re}$ after the intra-graph node feature update by the GCN graph neural network, α is a hyperparameter that controls how much node information from the stance task graph and the emotion task graph is fused into the task relation graph, and l is the network layer of the current iteration.
Furthermore, a first loss function encourages sparse interaction between the stance task graph and the task relation graph and between the emotion task graph and the task relation graph; a second loss function encourages multiple interactions between the stance task graph and the task relation graph and between the emotion task graph and the task relation graph.
Further, the first loss function is specifically:
Figure BDA0003481563650000051
the second loss function is specifically:
Figure BDA0003481563650000052
wherein
Figure BDA0003481563650000053
is the sparse mask matrix of the stance task,
Figure BDA0003481563650000054
is the sparse mask matrix of the emotion task, l denotes the network layer of the current iteration, and L denotes the preset total number of network layers.
Further, the stance feature representation $r_{st}$ of the stance task graph is calculated according to the following formula:
Figure BDA0003481563650000055
wherein α is the attention weight of the stance feature representation, $h_i$ is the feature vector of the i-th word after encoding by the BERT model, and m + n denotes the length of the input text.
Further, the emotion feature representation $r_{se}$ of the emotion task graph is calculated according to the following formula:
Figure BDA0003481563650000056
wherein α is the attention weight of the emotion feature representation,
Figure BDA0003481563650000057
is the feature representation of the i-th node output by the emotion task graph $g_{se}$, and m + n denotes the length of the input text.
Further, the stance detection polarity and the emotion classification polarity of the input text are calculated according to the following formula:
$y_{task} = \mathrm{softmax}(W_{task}\, r_{task} + b_{task})$;
wherein $y_{task}$, task ∈ {st, se}, is the stance detection polarity or the emotion classification polarity predicted by the multi-graph sparse interaction network model, $W_{task}$ is the weight of the fully connected layer, $b_{task}$ is the bias corresponding to $W_{task}$, $r_{task}$ is the stance or emotion feature representation, and softmax is the activation function.
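The pooling and classification steps above can be illustrated with a short plain-Python sketch. It assumes that the feature representation is an attention-weighted sum of node features over the m + n words and applies the stated formula y = softmax(W r + b). How the attention weights themselves are computed is not recoverable from the text, so uniform weights are used here; the function names, the three-class output and all numeric values are hypothetical.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(alpha, node_feats):
    """Assumed pooling: r = sum_i alpha_i * h_i over the m + n words."""
    dim = len(node_feats[0])
    return [sum(a * h[d] for a, h in zip(alpha, node_feats)) for d in range(dim)]

def classify(w, b, r):
    """y = softmax(W r + b), with W given as a list of rows."""
    logits = [sum(wi * ri for wi, ri in zip(row, r)) + bi for row, bi in zip(w, b)]
    return softmax(logits)

node_feats = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # toy features, m + n = 3
alpha = softmax([0.0, 0.0, 0.0])                   # uniform toy attention weights
r_st = attention_pool(alpha, node_feats)           # pooled stance representation
w_st = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]        # hypothetical 3-class layer
b_st = [0.0, 0.0, 0.0]
y_st = classify(w_st, b_st, r_st)                  # predicted polarity distribution
```

The same `classify` call would be reused with a separate weight matrix and bias for the emotion task, since the formula is indexed by task.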
The embodiment of the invention has the following beneficial effects:
The invention provides a multi-task stance detection method based on a multi-graph sparse interaction network. By jointly training the emotion analysis task and the stance detection task, the invention constructs a task-specific graph for each task (namely the stance task graph and the emotion task graph) and a task-related graph (namely the task relation graph), and constructs a sparse interaction module between the graphs. This realizes sparse interaction between the stance task graph and the emotion task graph, facilitates information sharing between the tasks, and improves the performance of the multi-graph sparse interaction network model on each task.
Drawings
Fig. 1 is a schematic structural diagram of a multi-graph sparse interaction network model according to an embodiment of the present invention.
Detailed Description
The technical solutions of the present invention will be described clearly and completely with reference to the accompanying drawings, and it is obvious that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be obtained by a person skilled in the art without any inventive step based on the embodiments of the present invention, are within the scope of the present invention.
As shown in Fig. 1, in the multi-task stance detection method based on the multi-graph sparse interaction network provided by an embodiment of the present invention, an input text is fed into a multi-graph sparse interaction network model to obtain the stance detection polarity and the emotion classification polarity of the input text; the multi-graph sparse interaction network model is a graph convolutional neural network model and comprises a text encoding module, a multi-graph construction module, a multi-graph sparse interaction module and a task-related attention module; the input text comprises a tweet text and a target text;
the text encoding module is used for processing the input text into a plurality of word vectors for use by the multi-graph construction module, the multi-graph sparse interaction module and the task-related attention module;
the multi-graph construction module is used for constructing a stance task graph, an emotion task graph and a task relation graph of the multi-graph sparse interaction network model; the stance task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the stance detection task; the emotion task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the emotion classification task; the task relation graph is constructed according to the word category importance weights of the words of the input text and the syntactic dependency tree of the input text;
the multi-graph sparse interaction module is used for updating the intra-graph node features of the stance task graph, the emotion task graph and the task relation graph, and for updating the sparse interaction of node features between the graphs; the sparse interaction between the graphs comprises sparse interaction between the stance task graph and the task relation graph and sparse interaction between the emotion task graph and the task relation graph;
and the task-related attention module is used for calculating the stance detection polarity and the emotion classification polarity of the input text according to the stance feature representation of the stance task graph and the emotion feature representation of the emotion task graph.
As an embodiment, the method for constructing the task relation graph of the model comprises the following steps:
constructing word category importance weights, and constructing a stance label importance relation vector and an emotion label importance relation vector according to the word category importance weights;
and calculating a task interaction relation according to the stance label importance relation vector and the emotion label importance relation vector, and constructing the task relation graph according to the task interaction relation.
As one embodiment, the text encoding module uses BERT to encode the target words T of the target text and the tweet text content C. First, the tweet text content C and the target words T are concatenated to form the input text S, where $S = \{t_1, \dots, t_m, w_1, w_2, \dots, w_n\}$. S is then processed into the input format of the BERT model, [CLS] $t_1 \dots t_m\ w_1 w_2 \dots w_n$ [SEP], and fed into the BERT network model to capture the contextual features of the input text. The process is defined by the following formula:
$H = \mathrm{BERT}(S)$
where H is the output of the BERT network model, $H = \{h_1, h_2, \dots, h_{m+n}\}$. Each element of H is the feature representation of one word of the input text;
Figure BDA0003481563650000071
is the feature representation of the t-th word, including its context information.
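The input format described above can be shown with a small plain-Python sketch. It only assembles the token sequence and does not call a BERT model; a real BERT tokenizer would additionally split words into subword pieces, which is ignored here. The function name `build_bert_input` and the example words are hypothetical.

```python
def build_bert_input(target_tokens, tweet_tokens):
    """Return the sequence [CLS] t_1..t_m w_1..w_n [SEP] as a token list."""
    return ["[CLS]", *target_tokens, *tweet_tokens, "[SEP]"]

target = ["climate", "change"]          # T, m = 2 (toy example)
tweet = ["we", "must", "act", "now"]    # C, n = 4 (toy example)
tokens = build_bert_input(target, tweet)

# H = BERT(S) would then provide one vector per word of S,
# i.e. m + n vectors once the two special tokens are set aside.
num_word_vectors = len(tokens) - 2
```

Placing the target words before the tweet words follows the order given in the definition of S.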
As one embodiment, the method for constructing the stance task graph and the emotion task graph comprises the following steps:
constructing a first syntactic dependency tree according to the syntactic structure of the tweet text, and obtaining the root word set of the first syntactic dependency tree;
adding the words of the target text to the first syntactic dependency tree according to the word connection relation between the target text and the root word set, to obtain a second syntactic dependency tree of the input text; calculating a first pragmatic weight and a second pragmatic weight of the tweet text and a third pragmatic weight of the target text, wherein the first pragmatic weight is the pragmatic weight of each word of the tweet text in the stance detection task (that is, the pragmatic weight of the word with respect to the stance task is calculated from the relative co-occurrence frequency of the word with the stance labels (support and oppose) and the word frequency in the whole corpus); the second pragmatic weight is the pragmatic weight of each word of the tweet text in the emotion classification task (that is, the pragmatic weight of the word with respect to the emotion task is calculated from the relative co-occurrence frequency of the word with the emotion labels and the word frequency in the whole corpus). The pragmatic weight refers to the dependency (or influence) of a word in the text on inferring a specific target.
The stance task graph is constructed according to the first pragmatic weight, the third pragmatic weight and the second syntactic dependency tree, and the emotion task graph is constructed according to the second pragmatic weight and the second syntactic dependency tree.
Specifically, each tweet text is parsed with a syntactic parsing tool to construct a syntactic dependency tree. Words are represented as the nodes of a graph (that is, the nodes of the graph are the word vectors of the words of the tweet text), and the relations between words in the syntactic dependency tree are represented as edges between the nodes, so as to construct the base graph of the target text T and the tweet text C
Figure BDA0003481563650000081
The steps are as follows:
constructing a first syntactic dependency tree according to the syntactic structure of the tweet text C
Figure BDA0003481563650000082
and obtaining the root word set $W_r$ of the first syntactic dependency tree through the syntactic parser.
Since the target text T is not a complete sentence but a phrase or a word, it cannot be modeled as a syntactic dependency tree. Therefore, according to the word connection relation between the target text and the root word set, the embodiment of the present invention adds the words of the target text to the first syntactic dependency tree to obtain the second syntactic dependency tree of the input text S
Figure BDA0003481563650000083
The second syntactic dependency tree
Figure BDA0003481563650000084
is calculated as follows:
Figure BDA0003481563650000085
wherein $W_r$ denotes the root words of the first syntactic dependency tree
Figure BDA0003481563650000091
and
Figure BDA0003481563650000092
denotes the second syntactic dependency tree of the input text S
Figure BDA0003481563650000093
$w_i$ and $w_j$ are any two different words of the input text S, which consists of the tweet text and the target text.
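One way to picture the construction of the second syntactic dependency tree is the plain-Python sketch below. The exact formula is only available as an image, so the sketch assumes the simplest reading of the text: the edges of the tweet's dependency tree are kept, and every target word is connected to every root word of that tree. The function names and the toy parse are hypothetical; words are used as node keys for readability, whereas token positions would be needed when a word occurs more than once.

```python
def build_second_tree(tweet_edges, root_words, target_words):
    """Return an undirected edge set covering tweet words and target words."""
    edges = {frozenset(edge) for edge in tweet_edges}   # first dependency tree
    for t in target_words:
        for r in root_words:
            edges.add(frozenset((t, r)))                # attach target word to root
    return edges

def connected(edges, a, b):
    """True if words a and b share an edge in the tree."""
    return frozenset((a, b)) in edges

# Toy parse of "we must act now" with root word "act".
tweet_edges = [("act", "we"), ("act", "must"), ("act", "now")]
tree = build_second_tree(tweet_edges,
                         root_words={"act"},
                         target_words=["climate", "change"])
```

In this reading, target words reach the other tweet words only through the root, so `connected(tree, "climate", "we")` is false while `connected(tree, "climate", "act")` is true.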
To capture the importance of the words in the input text and the interaction features between them, the word frequency of each word and its pragmatic weight in each task need to be calculated. The frequency with which each word of the input text appears in the whole corpus is calculated as
Figure BDA0003481563650000094
wherein $N(w_i)$ is the number of times the word $w_i$ appears in the corpus and N is the number of all words in the corpus. The embodiment of the invention calculates the pragmatic weight of a word separately for each task, giving a first pragmatic weight $\varphi_{task}(w_i)|_{task=stance}$ and a second pragmatic weight $\varphi_{task}(w_i)|_{task=sentiment}$. When the first and second pragmatic weights are calculated, only categories with practical significance are considered; the two label categories that carry no useful information, namely neutral stance and neutral emotion, are omitted. The calculation is given by the following formulas:
Figure BDA0003481563650000095
Figure BDA0003481563650000096
$w_i \in C$;
wherein $N(w_i, label^+)$ and $N(w_i, label^-)$ denote the number of times the word $w_i$ appears under the stance task labels "support" and "oppose", or under the emotion task labels "positive" and "negative", respectively; $N(label^+)$ and $N(label^-)$ denote the total number of the stance task labels "support" and "oppose", or of the emotion task labels "positive" and "negative", respectively; μ is the mean and δ is the standard deviation.
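The counts that feed these formulas can be gathered as in the plain-Python sketch below. It covers only the quantities named in the text: N(w_i), N(w_i, label) and N(label), plus the word frequency, which is assumed here to be N(w_i) / N. The pragmatic weight formula itself is only available as an image and is not reproduced. The function names, the label encoding ('+', '-', 'neutral') and the toy corpus are hypothetical, and N(label) is counted as the number of labelled texts, which is also an assumption.

```python
from collections import Counter

def corpus_counts(corpus):
    """corpus: list of (tokens, label) pairs; label is '+', '-' or 'neutral'."""
    word_total = Counter()    # N(w_i)
    word_label = Counter()    # N(w_i, label), keyed by (word, label)
    label_total = Counter()   # N(label), counted here as labelled texts
    for tokens, label in corpus:
        word_total.update(tokens)
        if label != "neutral":            # neutral classes are ignored, as in the text
            label_total[label] += 1
            word_label.update((w, label) for w in tokens)
    return word_total, word_label, label_total

def word_frequency(word, word_total):
    """Assumed relative frequency N(w_i) / N over the whole corpus."""
    return word_total[word] / sum(word_total.values())

corpus = [
    (["ban", "guns"], "+"),
    (["guns", "save", "lives"], "-"),
    (["guns", "exist"], "neutral"),
]
word_total, word_label, label_total = corpus_counts(corpus)
freq_guns = word_frequency("guns", word_total)
```

The same counting routine would be run once with stance labels and once with emotion labels to obtain the inputs of the first and second pragmatic weights.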
The third pragmatic weight of the target text is calculated according to the formula
Figure BDA0003481563650000097
The third pragmatic weight of the target text is used to establish the relation between the target text and the tweet text, that is, to construct the edges between the target text and the tweet text in the graph.
A first adjacency matrix of the stance task graph is calculated according to the first pragmatic weight, the third pragmatic weight and the second syntactic dependency tree, and the stance task graph is obtained from the first adjacency matrix.
A second adjacency matrix of the emotion task graph is calculated according to the second pragmatic weight and the second syntactic dependency tree, and the emotion task graph is obtained from the second adjacency matrix.
Specifically, the first adjacency matrix and the second adjacency matrix are calculated according to the following formula:
Figure BDA0003481563650000101
wherein $s_j$ and $s_i$ are words of the input text.
As an embodiment, constructing the task relation graph of the multi-graph sparse interaction network model comprises the following steps:
constructing word category importance weights, and constructing a stance label importance relation vector and an emotion label importance relation vector according to the word category importance weights;
and calculating a task interaction relation according to the stance label importance relation vector and the emotion label importance relation vector, and constructing the task relation graph according to the task interaction relation.
Specifically, considering the fine-grained interaction between different tasks, that is, that the relationship between the tasks differs across label categories, the embodiment of the present invention uses the word category importance weight
Figure BDA0003481563650000102
to establish the relation between a word and the label categories of the different tasks. This yields the label importance features of each word for each task, facilitates information interaction between the tasks, and improves the performance of the multi-graph sparse interaction network model on each task. The word category importance weight is calculated according to the formula
Figure BDA0003481563650000103
wherein $w_i \in C$ is a word of the tweet text, $c_i$ is a label category of the corresponding task (for example, if the task is the stance detection task, $c_i \in$ {support, oppose}),
Figure BDA0003481563650000104
denotes the number of times the word $w_i$ appears under label category $c_i$ of the task;
Figure BDA0003481563650000105
denotes the total number of occurrences of words under label category $c_i$ of the task; $|w_i|$ denotes the number of occurrences of the word $w_i$ in the corpus; task denotes the stance task or the emotion task. The word category importance weight expresses the importance relationship between a word and the different label categories of the different tasks, that is, the importance of the word in the two tasks, and is used to capture the similarity between the two tasks at the task-label level. First, the relation between a word and each label category of the stance task is calculated to form a vector, the stance label importance relation vector; then the relation between the word and each label category of the emotion task is calculated to form a vector, the emotion label importance relation vector. Finally, the similarity of the two vectors is calculated and used as the word's importance weight with respect to the label categories of each task, measuring the similarity relation of the word between the two tasks.
Specifically, a label importance relation vector is constructed for each task from the calculated word category importance weights. These vectors are the position label importance relation vector Φ_stance(w_i) and the emotion label importance relation vector Φ_sentiment(w_i).
The label importance relation vector represents the importance of each word to the different label categories and is denoted as
Figure BDA0003481563650000111
where task denotes the position task or the emotion task, c_i denotes a label of the task, and
Figure BDA0003481563650000112
is the normalized form of the word category importance weight,
Figure BDA0003481563650000113
where
Figure BDA0003481563650000114
and
Figure BDA0003481563650000115
are, respectively, the mean and standard deviation of
Figure BDA0003481563650000116
Through this calculation, the importance relation vectors of a word with respect to the label categories of each task are obtained, namely the position label importance relation vector Φ_stance(w_i) and the emotion label importance relation vector Φ_sentiment(w_i).
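The normalization described above (subtract the mean, divide by the standard deviation) is a standard z-score. The sketch below illustrates it under stated assumptions: the function name `normalize_weights` is hypothetical, the population standard deviation is used, and the set of weights over which the mean and standard deviation are taken is passed in by the caller, since the text does not make that set explicit.

```python
import statistics

def normalize_weights(weights):
    """Z-score normalize a list of word category importance weights.

    Assumption: `weights` is the population over which the mean and the
    standard deviation are computed.
    """
    mu = statistics.mean(weights)
    sigma = statistics.pstdev(weights)  # population standard deviation
    if sigma == 0:
        # All weights are equal, so no label category stands out.
        return [0.0 for _ in weights]
    return [(w - mu) / sigma for w in weights]

# Illustrative raw weights of one word for three label categories.
normalized = normalize_weights([1.0, 2.0, 3.0])
```

The normalized values, ordered by label category, would then form the entries of a label importance relation vector for that word.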
The word-level task interaction relation ξ(w_i) is calculated from the position label importance relation vector and the emotion label importance relation vector:
Figure BDA0003481563650000117
where sta denotes the position detection task and se denotes the emotion classification task. According to the second syntactic dependency tree of the input text S,
Figure BDA0003481563650000118
the task interaction relation ξ(w_i) is normalized and used to construct the adjacency matrix of the task relation graph, denoted
Figure BDA0003481563650000119
Then, according to the adjacency matrix
Figure BDA00034815636500001110
the task relation graph is obtained. The adjacency matrix
Figure BDA00034815636500001111
is calculated as:
Figure BDA00034815636500001112
As one embodiment, the intra-graph node features of the position task graph, the emotion task graph and the task relation graph are updated, i.e., a horizontal, intra-graph update is performed. Specifically, the embodiment of the present invention uses a graph convolutional network (GCN) for the iterative intra-graph update. Broadly speaking, the update is independent for each graph (only the horizontal update within a graph is considered), i.e., each graph updates its node features independently. The intra-graph node features of the position task graph, the emotion task graph and the task relation graph are each updated according to the following formula:
Figure BDA0003481563650000121
where task ∈ {st, se, re} denotes the position task graph, the emotion task graph and the task relation graph, respectively,
Figure BDA0003481563650000122
denotes the node features output by the (l-1)-th network layer,
Figure BDA0003481563650000123
denotes the parameters of the l-th network layer, I is the identity matrix, and σ is a nonlinear activation function;
Figure BDA0003481563650000124
denotes the adjacency matrix with its diagonal filled with 1,
Figure BDA0003481563650000125
is the feature vector of the convolution kernel of the graph convolutional network, i.e., the parameters updated in a conventional graph convolutional neural network; l denotes the index of the network layer of the current iteration. The initial node features are expressed as
Figure BDA0003481563650000126
Figure BDA0003481563650000127
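The independent intra-graph update can be illustrated with a plain graph-convolution step applied separately to each of the three graphs. This is a minimal sketch rather than the formula in the figure: the helper names `matmul` and `gcn_layer`, the choice of ReLU as the nonlinear activation σ, the absence of degree normalization and all numeric values are assumptions made for illustration.

```python
def matmul(a, b):
    """Multiply two matrices given as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]

def gcn_layer(adj, feats, weight):
    """One intra-graph update: activation(adj_with_self_loops @ feats @ weight).

    adj:    adjacency matrix of one graph (n x n)
    feats:  node features output by the previous layer (n x d_in)
    weight: layer parameters (d_in x d_out)
    """
    n = len(adj)
    # Fill the diagonal with 1 so every node also keeps its own features.
    adj_self = [[1.0 if i == j else adj[i][j] for j in range(n)] for i in range(n)]
    hidden = matmul(matmul(adj_self, feats), weight)
    # ReLU is assumed here as the nonlinear activation.
    return [[max(0.0, v) for v in row] for row in hidden]

# Each graph (st, se, re) is updated independently with its own adjacency matrix.
adjacency = {
    "st": [[0.0, 1.0], [1.0, 0.0]],
    "se": [[0.0, 0.5], [0.5, 0.0]],
    "re": [[0.0, 0.2], [0.2, 0.0]],
}
identity = [[1.0, 0.0], [0.0, 1.0]]
updated = {task: gcn_layer(adj, identity, identity) for task, adj in adjacency.items()}
```

Because the three calls share no state, the update stays "horizontal": information moves between graphs only in the separate sparse interaction step.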
The position task graph g_st and the emotion task graph g_se each interact sparsely with the task relation graph g_re, which realizes the interaction between g_st and g_se and helps the different tasks share information. The connection between g_st and g_re is initialized as
Figure BDA0003481563650000128
and the connection between g_se and g_re is initialized as
Figure BDA0003481563650000129
The sparse mask matrix of the position task is defined as
Figure BDA00034815636500001210
and the sparse mask matrix of the emotion task is defined as
Figure BDA00034815636500001211
The mask matrix z represents a set of random binary variables. It controls the interaction between the position task graph and the task relation graph and the interaction between the emotion task graph and the task relation graph. Therefore, during the iterative interaction at the l-th layer, the sparse interaction of node features among the graphs, i.e., between the position task graph and the task relation graph and between the emotion task graph and the task relation graph, is updated according to formulas (1) to (3):
Figure BDA00034815636500001212
Figure BDA00034815636500001213
Figure BDA00034815636500001214
where
Figure BDA00034815636500001215
and
Figure BDA00034815636500001216
denote the parameters of the connecting edges between the position task graph and the task relation graph and between the emotion task graph and the task relation graph, respectively; l denotes the index of the network layer of the current iteration;
Figure BDA00034815636500001217
and
Figure BDA00034815636500001218
denote the matrix representations of the node features of the position task graph g_st, the emotion task graph g_se and the task relation graph g_re, respectively, after the intra-graph node features have been updated by the GCN; α is a hyperparameter that controls how much node information from the position task graph and the emotion task graph is fused into the task relation graph.
Because the random binary variables represented by the mask matrix z are discrete and therefore not differentiable, they cannot be used directly in a deep learning model. The embodiment of the present invention therefore adopts the Gumbel-Softmax distribution to relax the discrete random variable z into a continuous variable v, so that it fits the update process of the multi-graph sparse interaction network model. The calculation formula is as follows:
Figure BDA0003481563650000131
where i and j take the values 0 and 1, τ is a scaling (temperature) parameter, G_l is an independent, identically distributed sample drawn from the standard Gumbel distribution Gumbel(0, 1), and π_l = [1 − z_l, z_l] is the representation of the mask matrix z as a set of random binary variables.
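The relaxation can be sketched with the usual Gumbel-Softmax recipe: add Gumbel(0, 1) noise to the log-probabilities of the two states and apply a temperature-scaled softmax. This is an illustrative sketch for a single mask entry; the function names `sample_gumbel` and `gumbel_softmax_mask`, the clamping constants and the example values are assumptions, not taken from the patent.

```python
import math
import random

def sample_gumbel(rng):
    """Draw one sample from the standard Gumbel distribution Gumbel(0, 1)."""
    u = rng.random()
    u = min(max(u, 1e-12), 1.0 - 1e-12)  # keep u strictly inside (0, 1)
    return -math.log(-math.log(u))

def gumbel_softmax_mask(z, tau, rng):
    """Relax one binary mask entry with probability z into a continuous pair.

    Returns [v_0, v_1]; v_1 is the soft version of "interaction switched on".
    """
    pi = [1.0 - z, z]
    logits = [(math.log(max(p, 1e-12)) + sample_gumbel(rng)) / tau for p in pi]
    top = max(logits)  # subtract the maximum for numerical stability
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

rng = random.Random(0)
v = gumbel_softmax_mask(z=0.9, tau=0.5, rng=rng)
```

A small τ pushes v towards a one-hot vector, while a large τ makes it smoother, which is why τ is commonly annealed during training.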
To improve the training efficiency of the model, a first loss function (i.e., sparse regularization) encourages the interaction between the position task graph and the task relation graph, and between the emotion task graph and the task relation graph, to be sparse. The first loss function is:
Figure BDA0003481563650000132
where l denotes the index of the network layer of the current iteration and L denotes the preset total number of network layers. To prevent the interaction among the graphs from becoming so sparse that the whole multi-graph sparse interaction network model degenerates into three independent graph networks, a second loss function (i.e., sharing regularization) encourages more interaction between the position task graph and the task relation graph and between the emotion task graph and the task relation graph. In particular, it encourages more interaction at the lower layers, so that low-level information is shared between the tasks. The second loss function is:
Figure BDA0003481563650000133
The multi-graph sparse interaction module shares information between the two tasks so that each task obtains information helpful for its training, while noise that would harm task training is filtered out. This improves the efficiency and quality of sharing between the tasks and, in turn, the performance of the multi-graph sparse interaction network model on each task.
As one embodiment, the task-related attention module obtains the position feature representation g_st of the position task graph and the emotion feature representation g_se of the emotion task graph. For the position detection task, a mask mechanism filters out non-target words in order to obtain a target-related position feature representation. Specifically, a mask matrix is designed that sets the positions corresponding to target words to 1 and the positions corresponding to non-target words to 0, which yields the mask-transformed feature representation of the position task graph
Figure BDA0003481563650000141
A retrieval-based attention mechanism is then used to obtain a richer position feature representation related to the target words, where the attention weight α of the position feature representation is calculated according to the following formulas:
Figure BDA0003481563650000142
Figure BDA0003481563650000143
where h is the output of the BERT network model,
Figure BDA0003481563650000144
denotes the feature representation of the t-th word after encoding by the BERT network model; m + n denotes the length, in words, of the input text (the target text has length m and the tweet text has length n),
Figure BDA0003481563650000145
denotes the feature representation of the i-th node (i.e., the i-th word vector) in the mask-transformed output of the position task graph, namely
Figure BDA0003481563650000146
β_t denotes the attention weight of the t-th word vector, β_i denotes the attention weight of the position feature of the i-th word vector, and α_t denotes the normalized attention weight of the t-th word vector. The attention weights of the position feature representation express how much attention each word in the context pays to the position feature.
Then, the position feature representation r_st of the position task graph is calculated according to the following formula:
Figure BDA0003481563650000147
where h_i is the feature vector of the i-th word after encoding by the BERT model, and α_i is the attention weight of the position feature of the i-th word vector.
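The masking and retrieval-based attention steps can be sketched end to end on toy vectors. This is a sketch under stated assumptions: the raw score β_t is taken to be the sum of dot products between the BERT vector h_t and every masked graph output, as in common retrieval-based attention over GCN outputs; the exact formulas are in the figures, and the function names and numbers below are hypothetical.

```python
import math

def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def position_representation(h, g, target_mask):
    """Return (alpha, r) for the position task.

    h:           BERT feature vectors, one per word of the input text
    g:           position task graph outputs, one per word
    target_mask: 1 for target words, 0 for non-target words
    """
    # Mask step: keep only the graph outputs at target-word positions.
    g_masked = [[m * v for v in row] for row, m in zip(g, target_mask)]
    # Assumed retrieval score: compare each h_t with all masked graph outputs.
    beta = [sum(dot(h_t, g_i) for g_i in g_masked) for h_t in h]
    # Normalize the scores with a softmax to obtain the attention weights.
    top = max(beta)
    exps = [math.exp(b - top) for b in beta]
    total = sum(exps)
    alpha = [e / total for e in exps]
    # Weighted sum of the BERT vectors gives the final representation.
    dim = len(h[0])
    r = [sum(alpha[t] * h[t][k] for t in range(len(h))) for k in range(dim)]
    return alpha, r

h = [[1.0, 0.0], [0.0, 1.0]]
g = [[2.0, 0.0], [0.0, 3.0]]
alpha, r_st = position_representation(h, g, target_mask=[1, 0])
```

With the mask `[1, 0]`, only the first word's graph output contributes to the scores, so the word whose BERT vector aligns with it receives the larger weight.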
Similarly, the emotion feature representation r_se of the emotion task graph is calculated according to formulas (4) to (6):
Figure BDA0003481563650000148
Figure BDA0003481563650000149
Figure BDA00034815636500001410
Figure BDA00034815636500001411
denotes the feature representation of the i-th node (i.e., the i-th word vector) in the output g_se of the emotion task graph; α′ is the attention weight of the emotion feature representation, β′_t is the attention weight of the emotion feature of the t-th word vector, and β′_i is the attention weight of the emotion feature of the i-th word vector.
After the final position feature representation r_st and emotion feature representation r_se are obtained, a fully connected layer fuses the text features with the rich context features to obtain the position detection polarity and the emotion classification polarity of the input text:
y_task = softmax(W_task r_task + b_task)
where y_task, task ∈ {st, se}, is the position detection polarity or the emotion classification polarity predicted by the multi-graph sparse interaction network model, W_task, task ∈ {st, se}, is the weight of the fully connected layer, b_task is the bias corresponding to W_task, and softmax is the activation function.
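The classification step above is a single fully connected layer followed by softmax, applied once per task. The following minimal sketch uses hypothetical weights and a hypothetical two-dimensional feature vector; the names `softmax` and `classify` are illustrative.

```python
import math

def softmax(logits):
    """Numerically stable softmax over a list of scores."""
    top = max(logits)
    exps = [math.exp(x - top) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def classify(weight, r, bias):
    """Compute y_task = softmax(W_task r_task + b_task) for one task.

    weight: one row per label category
    r:      feature representation r_st or r_se
    bias:   one bias value per label category
    """
    logits = [sum(w * x for w, x in zip(row, r)) + b for row, b in zip(weight, bias)]
    return softmax(logits)

# Hypothetical position-task head with two label categories.
W_st = [[1.0, 0.0], [0.0, 1.0]]
b_st = [0.0, 0.0]
y_st = classify(W_st, [2.0, 0.0], b_st)
predicted_label = max(range(len(y_st)), key=lambda k: y_st[k])
```

The emotion task would use its own W_se and b_se with r_se in the same way.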
Finally, the objective function of the whole multi-graph sparse interaction network model is a linear combination of the loss functions of the position detection task and the emotion analysis task together with the two regularization terms:
Figure BDA0003481563650000151
where Θ denotes the parameters of the multi-task graph network model, and λ_1, λ_2 and λ_3 are the coefficients of the corresponding loss terms; d denotes the d-th tweet,
Figure BDA0003481563650000152
is the set of all tweets,
Figure BDA0003481563650000153
is the emotion task label predicted by the model for the d-th tweet,
Figure BDA0003481563650000154
is the position task label predicted by the model for the d-th tweet,
Figure BDA0003481563650000155
denotes the first loss function (i.e., sparse regularization), and
Figure BDA0003481563650000156
denotes the second loss function (i.e., sharing regularization), which encourages sharing between the tasks.
The task-related attention module calculates the final position feature representation of the position task graph and the final emotion feature representation of the emotion task graph, and performs position detection and emotion classification according to these two representations. The embodiment of the invention constructs three graph structures to capture the interaction between the tasks. Among them, the task relation graph captures the word-level correlation between the position detection task and the emotion analysis task, which helps the tasks share information and reduces the noise generated during information sharing.
The embodiment of the invention predicts the public's position on a government's epidemic prevention measures from short texts published by the public on a social platform. Because such texts are short and carry little information, it is difficult to reach high accuracy with single-task learning. The invention therefore provides a multi-task position detection method based on a multi-graph sparse interaction network.
The foregoing is directed to the preferred embodiment of the present invention, and it is understood that various changes and modifications may be made by one skilled in the art without departing from the spirit of the invention, and it is intended that such changes and modifications be considered as within the scope of the invention.
It will be understood by those skilled in the art that all or part of the processes of the above embodiments may be implemented by a computer program, which can be stored in a computer readable storage medium and can include the processes of the above embodiments when executed. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

Claims (10)

1. A multi-task position detection method based on a multi-graph sparse interaction network is characterized in that,
inputting an input text into a multi-graph sparse interaction network model to obtain the position detection polarity and the emotion classification polarity of the input text; the multi-graph sparse interaction network model is a graph convolutional neural network model and comprises a text encoding module, a multi-graph construction module, a multi-graph sparse interaction module and a task-related attention module; the input text comprises a tweet text and a target text;
the text encoding module is used for processing input text into a plurality of word vectors for the multi-graph building module, the multi-graph sparse interaction module and the task related attention module to use;
the multi-graph construction module is used for constructing the position task graph, the emotion task graph and the task relation graph of the multi-graph sparse interaction network model; the position task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the position detection task; the emotion task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the emotion classification task; the task relation graph is constructed according to the word category importance weights of the words of the input text and the syntactic dependency tree of the input text;
the multi-graph sparse interaction module is used for updating the intra-graph node features of the position task graph, the emotion task graph and the task relation graph, and for updating the sparse interaction of node features among the graphs; the sparse interaction among the graphs comprises sparse interaction between the position task graph and the task relation graph and sparse interaction between the emotion task graph and the task relation graph;
and the task-related attention module is used for calculating the position detection polarity and the emotion classification polarity of the input text according to the position feature representation of the position task graph and the emotion feature representation of the emotion task graph.
2. The multi-graph sparse interaction network-based multi-task position detection method according to claim 1, wherein the establishment of the position task graph and the emotion task graph of the multi-graph sparse interaction network model comprises the following steps:
constructing a first syntactic dependency tree according to the syntactic structure of the tweet text, and acquiring a root word set of the first syntactic dependency tree;
adding words in the target text into the first syntactic dependency tree according to the word connection relation between the target text and the root word set to obtain a second syntactic dependency tree of the input text;
calculating a first pragmatic weight and a second pragmatic weight of the tweet text and a third pragmatic weight of the target text, wherein the first pragmatic weight is the pragmatic weight of each word of the tweet text in the position detection task, and the second pragmatic weight is the pragmatic weight of each word of the tweet text in the emotion classification task;
and constructing the position task graph according to the first pragmatic weight, the third pragmatic weight and the second syntactic dependency tree, and constructing the emotion task graph according to the second pragmatic weight and the second syntactic dependency tree.
3. The multi-task position detection method based on the multi-graph sparse interaction network as claimed in claim 2, wherein the task relationship graph of the multi-graph sparse interaction network model is constructed, specifically:
constructing a word category importance weight, and constructing a position label importance relation vector and an emotion label importance relation vector according to the word category importance weight; the word category importance weight is the relation between a word and the label categories of the different tasks;
and calculating a task interaction relation according to the position label importance relation vector and the emotion label importance relation vector, and constructing the task relation graph according to the task interaction relation.
4. The multi-graph sparse interaction network-based multi-task position detection method according to claim 3, characterized by updating the intra-graph node features of the position task graph, the emotion task graph and the task relation graph, specifically:
according to the formula
Figure FDA0003481563640000021
the intra-graph node features of the position task graph, the emotion task graph and the task relation graph are each updated, wherein task ∈ {st, se, re} denotes the position task graph, the emotion task graph and the task relation graph, respectively, I is the identity matrix, σ is a nonlinear activation function, l denotes the index of the network layer of the current iteration,
Figure FDA0003481563640000022
denotes the parameters of the l-th network layer,
Figure FDA0003481563640000023
denotes the adjacency matrix of the position task graph, the emotion task graph or the task relation graph;
Figure FDA0003481563640000024
denotes the adjacency matrix with its diagonal filled with 1.
5. The multi-graph sparse interaction network-based multi-task position detection method according to claim 4, wherein the sparse interaction of the node features among the graphs is updated, specifically:
according to the following formulas:
Figure FDA0003481563640000031
Figure FDA0003481563640000032
Figure FDA0003481563640000033
the node features are updated through sparse interaction among the position task graph, the emotion task graph and the task relation graph, wherein
Figure FDA0003481563640000034
And
Figure FDA0003481563640000035
denote the matrix representations of the node features of the position task graph g_st, the emotion task graph g_se and the task relation graph g_re, respectively, after the intra-graph node features have been updated by the GCN; α is a hyperparameter that controls how much node information from the position task graph and the emotion task graph is fused into the task relation graph; and l is the index of the network layer of the current iteration.
6. The multi-graph sparse interaction network-based multi-task position detection method according to claim 5, wherein sparse interaction between the position task graph and the task relationship graph and between the emotion task graph and the task relationship graph is encouraged according to a first loss function; and multiple interactions between the position task graph and the task relation graph and between the emotion task graph and the task relation graph are encouraged according to the second loss function.
7. The multi-graph sparse interaction network-based multi-task position detection method according to claim 6, wherein the first loss function is specifically:
Figure FDA0003481563640000036
the second loss function is specifically:
Figure FDA0003481563640000037
in the formula
Figure FDA0003481563640000038
is the sparse mask matrix of the position task,
Figure FDA0003481563640000039
is the sparse mask matrix of the emotion task, l denotes the index of the network layer of the current iteration, and L denotes the preset total number of network layers.
8. The multi-graph sparse interaction network based multitask position detection method according to claim 7,
wherein the position feature representation r_st of the position task graph is calculated according to the following formula:
Figure FDA0003481563640000041
where α is the attention weight of the position feature representation, h_i is the feature vector of the i-th word after encoding by a BERT model, and m + n denotes the length of the input text.
9. The multi-graph sparse interaction network-based multi-task position detection method according to claim 8, wherein the emotion feature representation r_se of the emotion task graph is calculated according to the following formula:
Figure FDA0003481563640000042
where α′ is the attention weight of the emotion feature representation,
Figure FDA0003481563640000043
is the feature representation of the i-th node in the output g_se of the emotion task graph, and m + n denotes the length of the input text.
10. The multi-graph sparse interaction network-based multi-task position detection method according to any one of claims 1 to 9, wherein the position detection polarity and the emotion classification polarity of the input text are calculated according to the following formula:
y_task = softmax(W_task r_task + b_task);
where y_task, task ∈ {st, se}, is the position detection polarity or the emotion classification polarity predicted by the multi-graph sparse interaction network model, W_task, task ∈ {st, se}, is the weight of the fully connected layer, b_task is the bias corresponding to W_task, r_task, task ∈ {st, se}, is the position or emotion feature representation, and softmax is the activation function.
CN202210069686.4A 2022-01-21 2022-01-21 Multi-task standpoint detection method based on multi-graph sparse interaction network Active CN114969318B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210069686.4A CN114969318B (en) 2022-01-21 2022-01-21 Multi-task standpoint detection method based on multi-graph sparse interaction network

Publications (2)

Publication Number Publication Date
CN114969318A true CN114969318A (en) 2022-08-30
CN114969318B CN114969318B (en) 2023-04-07

Family

ID=82974812

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210069686.4A Active CN114969318B (en) 2022-01-21 2022-01-21 Multi-task standpoint detection method based on multi-graph sparse interaction network

Country Status (1)

Country Link
CN (1) CN114969318B (en)

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20180218253A1 (en) * 2017-01-31 2018-08-02 Conduent Business Services, Llc Stance classification of multi-perspective consumer health information
US20210089936A1 (en) * 2019-09-24 2021-03-25 International Business Machines Corporation Opinion snippet detection for aspect-based sentiment analysis
CN112925907A (en) * 2021-02-05 2021-06-08 昆明理工大学 Microblog comment viewpoint object classification method based on event graph convolutional neural network
CN112926337A (en) * 2021-02-05 2021-06-08 昆明理工大学 End-to-end aspect level emotion analysis method combined with reconstructed syntax information

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
冷佳 (Leng Jia): "主题发现和情感分类的联合分析研究" [Research on the joint analysis of topic discovery and sentiment classification], 《CNKI中国知网》 *

Also Published As

Publication number Publication date
CN114969318B (en) 2023-04-07

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant