CN114969318B - Multi-task standpoint detection method based on multi-graph sparse interaction network - Google Patents

Publication number
CN114969318B
Authority
CN
China
Prior art keywords
task
graph
emotion
text
relation
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210069686.4A
Other languages
Chinese (zh)
Other versions
CN114969318A (en)
Inventor
廖清
柴合言
丁烨
方滨兴
高翠芸
王晔
王轩
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dongguan University of Technology
Shenzhen Graduate School Harbin Institute of Technology
Original Assignee
Dongguan University of Technology
Shenzhen Graduate School Harbin Institute of Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dongguan University of Technology, Shenzhen Graduate School Harbin Institute of Technology filed Critical Dongguan University of Technology
Priority to CN202210069686.4A priority Critical patent/CN114969318B/en
Publication of CN114969318A publication Critical patent/CN114969318A/en
Application granted granted Critical
Publication of CN114969318B publication Critical patent/CN114969318B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367Ontology
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/211Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/04Architecture, e.g. interconnection topology
    • G06N3/045Combinations of networks
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06NCOMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00Computing arrangements based on biological models
    • G06N3/02Neural networks
    • G06N3/08Learning methods
    • YGENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02DCLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Data Mining & Analysis (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Databases & Information Systems (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a multi-task stance (standpoint) detection method based on a multi-graph sparse interaction network. An input text is fed into a multi-graph sparse interaction network model to obtain the stance detection polarity and the emotion classification polarity of the input text. The model comprises a text encoding module, a multi-graph construction module, a multi-graph sparse interaction module and a task-related attention module. The multi-graph construction module builds a stance task graph, an emotion task graph and a task relation graph. The multi-graph sparse interaction module updates the intra-graph node features of the stance task graph, the emotion task graph and the task relation graph, and performs sparse interaction updates of node features between the graphs. The task-related attention module calculates the stance detection polarity and the emotion classification polarity of the input text. The technical scheme of the invention improves the accuracy of stance detection on tweet text.

Description

Multi-task standpoint detection method based on multi-graph sparse interaction network
Technical Field
The invention relates to the technical field of stance detection, and in particular to a multi-task stance detection method based on a multi-graph sparse interaction network.
Background
Existing stance detection methods mainly rely on either traditional machine learning or deep learning. Traditional machine learning methods require extensive feature engineering: features are extracted manually, and a model such as a Support Vector Machine (SVM), a decision tree or a random forest is then trained on them. The main drawbacks are that feature engineering is time-consuming, that manually selected features carry limited information, which limits model performance, and that most of these methods have many hyperparameters whose optimal values must be selected manually, which is laborious and hinders large-scale application. Early deep-learning-based stance detection methods mainly used Recurrent Neural Networks (RNN), Convolutional Neural Networks (CNN) and attention mechanisms to capture text features automatically and thereby improve stance detection performance. More recent work, influenced by the popularity of Transformers, has focused on using the BERT model, mainly exploiting its powerful word embedding capability. The deep-learning-based stance detection methods most relevant to the invention use auxiliary information, such as emotion information, expression information, or the subjective or objective nature of the text, to construct an auxiliary task that helps improve performance on the stance detection task.
In such methods, an auxiliary LSTM network is typically designed to extract emotion features and a main LSTM network to extract the features of the main task; the emotion features and the stance features are then simply concatenated to predict the stance of the text.
In the prior art, deep-learning-based stance detection methods such as those using RNN and CNN cannot capture the correlation between the target expression and the stance expression. Auxiliary-task-based methods simply concatenate emotion features and stance features, ignoring the complexity of the relationship between the tasks and treating the two tasks as equally important, which causes considerable negative transfer and degrades model performance. Moreover, when an auxiliary task is used, only the performance of the main task is considered and no attention is paid to improving the auxiliary task. As a result, the accuracy of prior-art stance detection on tweet text is low.
Disclosure of Invention
The invention provides a multi-task stance detection method based on a multi-graph sparse interaction network, which improves the accuracy of stance detection on tweet text.
An embodiment of the invention provides a multi-task stance detection method based on a multi-graph sparse interaction network, comprising the following steps:
inputting an input text into a multi-graph sparse interaction network model to obtain the stance detection polarity and the emotion classification polarity of the input text; the multi-graph sparse interaction network model is a graph convolutional neural network model and comprises a text encoding module, a multi-graph construction module, a multi-graph sparse interaction module and a task-related attention module; the input text comprises a tweet text and a target text;
the text encoding module is used for processing the input text into a plurality of word vectors for use by the multi-graph construction module, the multi-graph sparse interaction module and the task-related attention module;
the multi-graph construction module is used for constructing a stance task graph, an emotion task graph and a task relation graph of the multi-graph sparse interaction network model; the stance task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the stance detection task; the emotion task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the emotion classification task; the task relation graph is constructed according to the word category importance weights of the words of the input text and the syntactic dependency tree of the input text;
the multi-graph sparse interaction module is used for updating the intra-graph node features of the stance task graph, the emotion task graph and the task relation graph, and for performing sparse interaction updates of node features between the graphs; the sparse interaction between the graphs comprises sparse interaction between the stance task graph and the task relation graph and sparse interaction between the emotion task graph and the task relation graph;
and the task-related attention module is used for calculating the stance detection polarity and the emotion classification polarity of the input text according to the stance feature representation of the stance task graph and the emotion feature representation of the emotion task graph.
Further, constructing the stance task graph and the emotion task graph of the multi-graph sparse interaction network model comprises the following steps:
constructing a first syntactic dependency tree according to the syntactic structure of the tweet text, and acquiring a root word set of the first syntactic dependency tree;
adding the words of the target text into the first syntactic dependency tree according to the word connection relation between the target text and the root word set, to obtain a second syntactic dependency tree of the input text;
calculating a first pragmatic weight and a second pragmatic weight of the tweet text and a third pragmatic weight of the target text, wherein the first pragmatic weight is the pragmatic weight of each word of the tweet text in the stance detection task, and the second pragmatic weight is the pragmatic weight of each word of the tweet text in the emotion classification task;
and constructing the stance task graph according to the first pragmatic weight, the third pragmatic weight and the second syntactic dependency tree, and constructing the emotion task graph according to the second pragmatic weight and the second syntactic dependency tree.
Further, constructing the task relation graph of the multi-graph sparse interaction network model specifically comprises:
constructing word category importance weights, and constructing a stance label importance relation vector and an emotion label importance relation vector according to the word category importance weights; a word category importance weight describes the relationship between a word and the label categories of the different tasks;
and calculating a task interaction relation according to the stance label importance relation vector and the emotion label importance relation vector, and constructing the task relation graph according to the task interaction relation.
Further, updating the intra-graph node features of the stance task graph, the emotion task graph and the task relation graph is specifically performed according to the formula
Figure GDA0003996728260000031
where task ∈ {st, se, re} denotes the stance task graph, the emotion task graph and the task relation graph respectively, I is the identity matrix, σ is a nonlinear activation function, l is the index of the current network layer,
Figure GDA0003996728260000041
denotes the parameters of the l-th network layer,
Figure GDA0003996728260000042
denotes the adjacency matrix of the stance task graph, the emotion task graph or the task relation graph, and
Figure GDA0003996728260000043
denotes that adjacency matrix with its diagonal filled with 1.
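As an illustration of the intra-graph update described above, the following is a minimal sketch that assumes the standard symmetrically normalized GCN propagation rule, H^(l+1) = σ(D̃^(-1/2) Ã D̃^(-1/2) H^(l) W^(l)) with Ã = A + I. The patent's exact formula is contained in the figure and is not reproduced here; the function name `gcn_layer`, the choice of ReLU for σ, and the toy matrices are assumptions made for illustration only.

```python
import numpy as np

def gcn_layer(adj, feats, weight):
    """One GCN layer under the assumed rule sigma(D^-1/2 (A + I) D^-1/2 H W)."""
    a_tilde = adj + np.eye(adj.shape[0])             # fill the diagonal with 1: A~ = A + I
    deg = a_tilde.sum(axis=1)                        # node degrees of A~ (assumes non-negative edge weights)
    d_inv_sqrt = np.diag(1.0 / np.sqrt(deg))         # D~^(-1/2)
    a_norm = d_inv_sqrt @ a_tilde @ d_inv_sqrt       # symmetric normalization
    return np.maximum(a_norm @ feats @ weight, 0.0)  # ReLU is assumed as the activation sigma

# Toy graph with 3 nodes and 2-dimensional node features (hypothetical values).
adj = np.array([[0.0, 1.0, 0.0],
                [1.0, 0.0, 1.0],
                [0.0, 1.0, 0.0]])
feats = np.array([[1.0, 0.0],
                  [0.0, 1.0],
                  [1.0, 1.0]])
weight = np.eye(2)
updated = gcn_layer(adj, feats, weight)
print(updated.shape)
```

In this sketch the same function would be applied separately to the adjacency matrix of each of the three graphs, which matches the statement that each graph updates its node features independently.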
Further, the sparse interaction update of node features between the graphs, namely the stance task graph, the emotion task graph and the task relation graph, is specifically performed according to the following formulas:
Figure GDA0003996728260000044
Figure GDA0003996728260000045
Figure GDA0003996728260000046
where
Figure GDA0003996728260000047
and
Figure GDA0003996728260000048
denote, for the stance task graph $g_{st}$, the emotion task graph $g_{se}$ and the task relation graph $g_{re}$ respectively, the matrix representations of the node features after the intra-graph node feature update by the GCN graph neural network; α is a hyperparameter that controls how much node information of the stance task graph and of the emotion task graph is fused into the task relation graph; and l is the index of the current network layer.
Furthermore, sparse interaction between the stance task graph and the task relation graph and between the emotion task graph and the task relation graph is encouraged by a first loss function, and multi-interaction between the stance task graph and the task relation graph and between the emotion task graph and the task relation graph is encouraged by a second loss function.
Further, the first loss function is specifically:
Figure GDA0003996728260000049
the second loss function is specifically:
Figure GDA00039967282600000410
where
Figure GDA0003996728260000051
is the sparse mask matrix of the stance task,
Figure GDA0003996728260000052
is the sparse mask matrix of the emotion task, l is the index of the current network layer, and L is the preset total number of network layers.
Further, the stance feature representation $r_{st}$ of the stance task graph is calculated according to the following formula:
Figure GDA0003996728260000053
where α is the attention weight of the stance feature representation, $h_i$ is the feature vector of the i-th word after encoding by the BERT model, and m + n is the length of the input text.
Further, the emotion feature representation $r_{se}$ of the emotion task graph is calculated according to the following formula:
Figure GDA0003996728260000054
where α' is the attention weight of the emotion feature representation,
Figure GDA0003996728260000055
is the feature of the i-th node in the output of the emotion task graph $g_{se}$, and m + n is the length of the input text.
Further, the stance detection polarity and the emotion classification polarity of the input text are calculated according to the following formula:
$y_{task} = \mathrm{softmax}(W_{task}\, r_{task} + b_{task})$;
where $y_{task}$, with task ∈ {st, se}, is the stance detection polarity or the emotion classification polarity predicted by the multi-graph sparse interaction network model, $W_{task}$ is the weight of the fully connected layer, $b_{task}$ is the bias corresponding to $W_{task}$, $r_{task}$ is the stance or emotion feature representation, and softmax is the activation function.
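The attention pooling and classification steps above can be sketched as follows. The classification line follows the stated formula y_task = softmax(W_task r_task + b_task). The pooling step assumes that the feature representation is the attention-weighted sum r = Σ α_i h_i over the m + n positions; the way the attention weights are scored here (dot product with a query vector `q`) is a hypothetical choice, since the patent's attention formula is contained in the figures. All array shapes and the number of classes are illustrative assumptions.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D array."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def attention_pool(node_feats, query):
    """Assumed pooling: r = sum_i alpha_i * h_i with softmax-normalized weights."""
    scores = node_feats @ query   # hypothetical scoring function (dot product with a query)
    alpha = softmax(scores)       # attention weights over the m + n positions
    return alpha @ node_feats     # weighted sum of the feature vectors

def classify(r, w, b):
    """y_task = softmax(W_task r_task + b_task), as stated in the text."""
    return softmax(w @ r + b)

rng = np.random.default_rng(0)
h = rng.normal(size=(5, 4))       # m + n = 5 positions, feature dimension 4 (toy values)
q = rng.normal(size=4)            # hypothetical query vector
w_st = rng.normal(size=(3, 4))    # 3 stance classes assumed for illustration
b_st = np.zeros(3)

r_st = attention_pool(h, q)
y_st = classify(r_st, w_st, b_st)
print(y_st)
```

The same two functions would be reused with a separate weight matrix and bias for the emotion task, since the formula is shared across task ∈ {st, se}.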
The embodiment of the invention has the following beneficial effects:
The invention provides a multi-task stance detection method based on a multi-graph sparse interaction network. By jointly training the emotion analysis task and the stance detection task, the invention constructs a task-specific graph for each task (namely the stance task graph and the emotion task graph) and a task-related graph (namely the task relation graph), together with a sparse interaction module between the graphs. This realizes sparse interaction between the stance task graph and the emotion task graph, facilitates information sharing between the tasks, and improves the performance of the multi-graph sparse interaction network model on each task.
Drawings
Fig. 1 is a schematic structural diagram of a multi-graph sparse interaction network model according to an embodiment of the present invention.
Detailed Description
The technical solutions in the present invention will be described clearly and completely with reference to the drawings in the present invention, and it should be apparent that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be obtained by a person skilled in the art without making any creative effort based on the embodiments in the present invention, belong to the protection scope of the present invention.
As shown in Fig. 1, in the multi-task stance detection method based on the multi-graph sparse interaction network provided by an embodiment of the invention, an input text is input into a multi-graph sparse interaction network model to obtain the stance detection polarity and the emotion classification polarity of the input text; the multi-graph sparse interaction network model is a graph convolutional neural network model and comprises a text encoding module, a multi-graph construction module, a multi-graph sparse interaction module and a task-related attention module; the input text comprises a tweet text and a target text;
the text encoding module is used for processing the input text into a plurality of word vectors for use by the multi-graph construction module, the multi-graph sparse interaction module and the task-related attention module;
the multi-graph construction module is used for constructing a stance task graph, an emotion task graph and a task relation graph of the multi-graph sparse interaction network model; the stance task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the stance detection task; the emotion task graph is constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the emotion classification task; the task relation graph is constructed according to the word category importance weights of the words of the input text and the syntactic dependency tree of the input text;
the multi-graph sparse interaction module is used for updating the intra-graph node features of the stance task graph, the emotion task graph and the task relation graph, and for performing sparse interaction updates of node features between the graphs; the sparse interaction between the graphs comprises sparse interaction between the stance task graph and the task relation graph and sparse interaction between the emotion task graph and the task relation graph;
and the task-related attention module is used for calculating the stance detection polarity and the emotion classification polarity of the input text according to the stance feature representation of the stance task graph and the emotion feature representation of the emotion task graph.
As one embodiment, constructing the task relation graph of the model comprises the following steps:
constructing word category importance weights, and constructing a stance label importance relation vector and an emotion label importance relation vector according to the word category importance weights;
and calculating a task interaction relation according to the stance label importance relation vector and the emotion label importance relation vector, and constructing the task relation graph according to the task interaction relation.
As one embodiment, the text encoding module uses BERT to encode the tweet text content C and the target words T of the target text. First, the target words T and the tweet text content C are concatenated to form the input text S, where $S = \{t_1, \ldots, t_m, w_1, w_2, \ldots, w_n\}$. S is then converted into the input format of the BERT model, [CLS] $t_1 \ldots t_m\, w_1 w_2 \ldots w_n$ [SEP], and fed into the BERT network model to capture the contextual features of the input text. The process is defined by the following formula:
$H = \mathrm{BERT}(S)$
where H is the output of the BERT network model, $H = \{h_1, h_2, \ldots, h_{m+n}\}$. Each element of H is the feature representation of one word of the input text;
Figure GDA0003996728260000071
is the feature representation of the t-th word, including its context information.
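The input formatting step can be sketched as below. This only illustrates how the sequence [CLS] t_1 … t_m w_1 … w_n [SEP] is assembled from already tokenized text; the function name `build_bert_input` and the example tokens are hypothetical, and no actual BERT tokenizer or encoder is invoked.

```python
def build_bert_input(target_tokens, tweet_tokens):
    """Assemble [CLS] t_1 ... t_m w_1 ... w_n [SEP] from target and tweet tokens."""
    return ["[CLS]"] + list(target_tokens) + list(tweet_tokens) + ["[SEP]"]

# Hypothetical example: a two-word target and a three-word tweet.
sequence = build_bert_input(["climate", "change"], ["we", "must", "act"])
print(sequence)
```

In a full pipeline this sequence would be passed to a BERT encoder, whose outputs for the m + n word positions form H.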
As one embodiment, constructing the stance task graph and the emotion task graph comprises the following steps:
constructing a first syntactic dependency tree according to the syntactic structure of the tweet text, and acquiring a root word set of the first syntactic dependency tree;
adding the words of the target text into the first syntactic dependency tree according to the word connection relation between the target text and the root word set, to obtain a second syntactic dependency tree of the input text; calculating a first pragmatic weight and a second pragmatic weight of the tweet text and a third pragmatic weight of the target text. The first pragmatic weight is the pragmatic weight of each word of the tweet text in the stance detection task (that is, the pragmatic weight of a word with respect to the stance task is calculated from the word frequency and from the relative co-occurrence frequency of the word with the stance labels "support" and "oppose" over the whole corpus). The second pragmatic weight is the pragmatic weight of each word of the tweet text in the emotion classification task (that is, the pragmatic weight of a word with respect to the emotion task is calculated from the word frequency and from the relative co-occurrence frequency of the word with the emotion labels over the whole corpus). The pragmatic weight expresses how strongly a word of the tweet text depends on (or is influenced by) a specific target.
The stance task graph is constructed according to the first pragmatic weight, the third pragmatic weight and the second syntactic dependency tree, and the emotion task graph is constructed according to the second pragmatic weight and the second syntactic dependency tree.
Specifically, each tweet text is parsed with a syntactic parsing tool to construct a syntactic dependency tree. Words are represented as nodes of a graph (that is, the nodes of the graph are the word vectors of the words of the tweet text), and the relations between words in the syntactic dependency tree are represented as edges between the nodes, so as to construct the base graph of the target text T and the tweet text C
Figure GDA0003996728260000081
The steps are as follows:
constructing a first syntactic dependency tree
Figure GDA0003996728260000082
according to the syntactic structure of the tweet text C, and obtaining the root word set $w_r$ of the first syntactic dependency tree through a syntactic parser.
Since the target text T is not a complete sentence but a phrase or a word, it cannot be modeled as a syntactic dependency tree by itself. The embodiment of the invention therefore adds the words of the target text to the first syntactic dependency tree according to the word connection relation between the target text and the root word set, obtaining a second syntactic dependency tree of the input text S
Figure GDA0003996728260000083
The second syntactic dependency tree
Figure GDA0003996728260000084
is calculated as follows:
Figure GDA0003996728260000085
where $w_r$ denotes the root words of the first syntactic dependency tree
Figure GDA0003996728260000086
and
Figure GDA0003996728260000087
denotes the second syntactic dependency tree of the input text S;
Figure GDA0003996728260000088
$w_i$ and $w_j$ are any two different words of the input text S, which comprises the tweet text and the target text.
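The construction of the base graph can be sketched as follows. The sketch assumes that the tweet's dependency relations become undirected edges and that every target word is connected to every root word of the tweet's dependency tree; the patent's exact definition is contained in the figure, so this connection rule, the function name `build_base_adjacency` and the toy parse are assumptions. Node indices follow the order of S: the m target words first, then the n tweet words.

```python
import numpy as np

def build_base_adjacency(num_target, num_tweet, dep_edges, root_indices):
    """Binary adjacency over the m + n words of S (assumed construction).

    dep_edges: (head, dependent) index pairs within the tweet, 0-based.
    root_indices: indices of the root words within the tweet, 0-based.
    """
    size = num_target + num_tweet
    adj = np.zeros((size, size))
    for head, dep in dep_edges:
        i, j = num_target + head, num_target + dep
        adj[i, j] = adj[j, i] = 1.0      # dependency relation as an undirected edge
    for t in range(num_target):
        for r in root_indices:
            j = num_target + r
            adj[t, j] = adj[j, t] = 1.0  # assumed: target word attached to each root word
    return adj

# Hypothetical parse of the tweet "we must act" with root "act" (index 2),
# and a two-word target such as "climate change".
adj = build_base_adjacency(2, 3, dep_edges=[(2, 0), (2, 1)], root_indices=[2])
print(adj)
```

Weighting these binary edges with the task-specific pragmatic weights would then give the adjacency matrices of the stance and emotion task graphs.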
To capture the importance of words in the input text and the interaction features between words, the word frequency and the task-specific pragmatic weight of each word must be calculated. The frequency of each word of the input text over the whole corpus is calculated as
Figure GDA0003996728260000089
where $N(w_i)$ is the number of times the word $w_i$ appears in the corpus and N is the total number of words in the corpus. The embodiment of the invention calculates the pragmatic weight of each word separately for each task, giving a first pragmatic weight $\phi_{task}(w_i)|_{task=stance}$ and a second pragmatic weight $\phi_{task}(w_i)|_{task=sentiment}$. When these weights are calculated, only the label categories with practical significance are considered; the two categories that carry no useful information, namely neutral stance and neutral emotion, are ignored. The calculation is given by the following formulas:
Figure GDA0003996728260000091
Figure GDA0003996728260000092
where $N(w_i, label^+)$ and $N(w_i, label^-)$ denote the number of occurrences of the word $w_i$ under the stance task labels "support" and "oppose", or under the emotion task labels "positive" and "negative", respectively; $N(label^+)$ and $N(label^-)$ denote the total counts for the stance task labels "support" and "oppose", or for the emotion task labels "positive" and "negative", respectively; μ is the mean and δ is the standard deviation.
The third pragmatic weight of the target text is calculated according to the formula
Figure GDA0003996728260000093
The third pragmatic weight of the target text is used to establish the relation between the target text and the tweet text, that is, to construct the edges between the target text and the tweet text in the graph.
A first adjacency matrix of the stance task graph is calculated according to the first pragmatic weight, the third pragmatic weight and the second syntactic dependency tree, and the stance task graph is obtained from the first adjacency matrix.
A second adjacency matrix of the emotion task graph is calculated according to the second pragmatic weight and the second syntactic dependency tree, and the emotion task graph is obtained from the second adjacency matrix.
Specifically, the first adjacency matrix and the second adjacency matrix are calculated according to the following formula:
Figure GDA0003996728260000094
where $s_j$ and $s_i$ are words of the input text.
As one embodiment, constructing the task relation graph of the multi-graph sparse interaction network model comprises the following steps:
constructing word category importance weights, and constructing a stance label importance relation vector and an emotion label importance relation vector according to the word category importance weights;
and calculating a task interaction relation according to the stance label importance relation vector and the emotion label importance relation vector, and constructing the task relation graph according to the task interaction relation.
Specifically, considering the fine-grained interaction between different tasks, that is, that the relationship between the tasks differs across label categories, the embodiment of the invention uses the word category importance weight
Figure GDA0003996728260000101
to establish the relationship between a word and the label categories of the different tasks. This yields the label importance relation features of each word for each task, facilitates information interaction between the tasks, and improves the performance of the multi-graph sparse interaction network model on each task. The word category importance weight is calculated according to the formula
Figure GDA0003996728260000102
where $w_i \in C$ is a word of the tweet text and $c_i$ is a label category of the corresponding task (for example, if the task is the stance detection task, $c_i \in$ {support, oppose});
Figure GDA0003996728260000103
denotes the number of times the word $w_i$ appears under label category $c_i$ of the task;
Figure GDA0003996728260000104
denotes the total number of word occurrences under label category $c_i$ of the task; $|w_i|$ denotes the number of occurrences of the word $w_i$ in the corpus; and task denotes the stance task or the emotion task. The word category importance weight expresses how important a word is to the different label categories of the different tasks, that is, it represents the importance relationship of the word in the two tasks, and it is used to capture the similarity between the two tasks at the task label level. First, the relationship between the word and each label category of the stance task is calculated to form a vector, the stance label importance relation vector. Then the relationship between the word and each label category of the emotion task is calculated to form a vector, the emotion label importance relation vector. Finally, the similarity of the two vectors is calculated and used as the importance weight of the word with respect to the label categories of the tasks, measuring how similarly the word behaves in the two tasks.
Specifically, a label importance relation vector is constructed for each task from the calculated word category importance weights. These comprise the position label importance relation vector $\Phi_{stance}(w_i)$ and the emotion label importance relation vector $\Phi_{sentiment}(w_i)$.
The importance relationship vector represents the importance relationship of each word to different label categories, denoted as
Figure GDA0003996728260000105
where task denotes the position task or the emotion task, $c_i$ denotes a label of the task, and
Figure GDA0003996728260000106
is the normalized form of the word category importance weight:
Figure GDA0003996728260000111
where
Figure GDA0003996728260000112
and
Figure GDA0003996728260000113
are the mean and standard deviation of
Figure GDA0003996728260000114
Through the above calculation, the importance relation vectors of a word for the label categories of the different tasks are obtained, namely the position label importance relation vector $\Phi_{stance}(w_i)$ and the emotion label importance relation vector $\Phi_{sentiment}(w_i)$.
According to the position label importance relation vector and the emotion label importance relation vector, the word-level task interaction relation $\xi(w_i)$ is calculated:
Figure GDA0003996728260000115
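The normalization and similarity steps can be illustrated with a short NumPy sketch. Two assumptions are made that the text does not confirm: the normalization is taken to be a standard z-score (subtract the mean, divide by the standard deviation), and the "similarity" of the two label importance relation vectors is taken to be cosine similarity, which requires both vectors to have the same length. The function names are hypothetical.

```python
import numpy as np

def zscore(weights):
    """Standardize raw weights: subtract the mean, divide by the std.

    Returns all zeros when the std is zero to avoid division by zero.
    """
    w = np.asarray(weights, dtype=float)
    std = w.std()
    return (w - w.mean()) / std if std > 0 else np.zeros_like(w)

def task_interaction(phi_stance, phi_sentiment):
    """Cosine similarity between the two label importance relation vectors.

    Assumed similarity measure; both vectors must have equal length.
    """
    a = np.asarray(phi_stance, dtype=float)
    b = np.asarray(phi_sentiment, dtype=float)
    denom = np.linalg.norm(a) * np.linalg.norm(b)
    return float(a @ b / denom) if denom > 0 else 0.0
```

Under these assumptions, a word whose importance profile over the position labels points in the same direction as its profile over the emotion labels receives a value near 1, and a word with unrelated profiles receives a value near 0.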
where sta denotes the position detection task and se denotes the emotion classification task. From the second syntactic dependency tree of the input text S,
Figure GDA00039967282600001115
and the normalized task interaction relation $\xi(w_i)$, the adjacency matrix of the task relation graph,
Figure GDA0003996728260000116
is constructed. The task relation graph is then obtained from the adjacency matrix
Figure GDA0003996728260000117
The adjacency matrix
Figure GDA0003996728260000118
is calculated as follows:
Figure GDA0003996728260000119
As one embodiment, the intra-graph node features of the position task graph, the emotion task graph and the task relation graph are updated, i.e., a horizontal intra-graph update is performed. Specifically, the embodiment of the present invention uses a graph convolutional network (GCN) to update each graph iteratively. Broadly speaking, the update process is independent for each graph (only the horizontal update within a graph is considered), that is, each graph updates its node features independently. The intra-graph node features of the position task graph, the emotion task graph and the task relation graph are each updated according to the following formula:
Figure GDA00039967282600001110
In the formula, task ∈ {st, se, re} denotes the position task graph, the emotion task graph and the task relation graph, respectively;
Figure GDA00039967282600001111
denotes the node features output by the (l-1)-th network layer;
Figure GDA00039967282600001112
denotes the parameters of the l-th network layer; I is the identity matrix; σ is a nonlinear activation function;
Figure GDA00039967282600001113
denotes the adjacency matrix with its diagonal filled with 1;
Figure GDA00039967282600001114
denotes the feature vectors and convolution kernel parameters updated as in a conventional graph convolutional network; and l denotes the layer index of the current iteration. The initial node features are expressed as
Figure GDA0003996728260000121
Figure GDA0003996728260000122
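The intra-graph update can be illustrated with a single graph-convolution step in NumPy. This is a sketch under assumptions: it uses the widely used symmetric normalization of the self-loop-augmented adjacency matrix and ReLU as the activation σ, neither of which is confirmed by the formula image, and it assumes non-negative edge weights. The function name `gcn_layer` is hypothetical.

```python
import numpy as np

def gcn_layer(adj, h_prev, weight):
    """One graph-convolution step: relu(D^-1/2 (A + I) D^-1/2 H W).

    adj:    (N, N) non-negative adjacency matrix of one graph
    h_prev: (N, d_in) node features from the previous layer
    weight: (d_in, d_out) layer parameters
    """
    a_tilde = adj + np.eye(adj.shape[0])      # fill the diagonal with 1 (self-loops)
    deg = a_tilde.sum(axis=1)                 # row sums, at least 1 after self-loops
    d_inv_sqrt = 1.0 / np.sqrt(deg)
    a_norm = a_tilde * d_inv_sqrt[:, None] * d_inv_sqrt[None, :]
    return np.maximum(a_norm @ h_prev @ weight, 0.0)   # ReLU stands in for sigma
```

Because each graph is updated independently, the same function would be called three times per layer, once with the adjacency matrix, features and parameters of each of the three graphs.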
The position task graph $g_{st}$ and the emotion task graph $g_{se}$ each interact sparsely with the task relation graph $g_{re}$, which realizes interaction between $g_{st}$ and $g_{se}$ and helps the different tasks share information. The connecting edges between $g_{st}$ and $g_{re}$ are initialized as
Figure GDA0003996728260000123
and the connecting edges between $g_{se}$ and $g_{re}$ as
Figure GDA0003996728260000124
The sparse mask matrix of the position task is defined as
Figure GDA0003996728260000125
and the sparse mask matrix of the emotion task as
Figure GDA0003996728260000126
The mask matrix z represents a set of random binary variables that control the interaction between the position task graph and the task relation graph and the interaction between the emotion task graph and the task relation graph. Therefore, during the iterative interaction of the l-th graph layer, the sparse interaction of node features between graphs, i.e., between the position task graph and the task relation graph and between the emotion task graph and the task relation graph, is updated according to formulas (1) to (3):
Figure GDA0003996728260000127
Figure GDA0003996728260000128
Figure GDA0003996728260000129
where
Figure GDA00039967282600001210
and
Figure GDA00039967282600001211
denote the parameters of the connecting edges between the position task graph and the task relation graph and between the emotion task graph and the task relation graph, respectively; l denotes the layer index of the current iteration;
Figure GDA00039967282600001212
and
Figure GDA00039967282600001213
denote the matrix representations of the node features of the position task graph $g_{st}$, the emotion task graph $g_{se}$ and the task relation graph $g_{re}$ after the intra-graph node features have been updated by the GCN; and α is a hyper-parameter that controls how much node information from the position task graph and from the emotion task graph is fused into the task relation graph.
Because the random binary variables represented by the mask matrix z are discrete and therefore not differentiable, they cannot be trained directly in a deep learning model. The embodiment of the present invention therefore uses the Gumbel-Softmax distribution to relax the discrete random variable z into a continuous variable v, so that it fits the update process of the multi-graph sparse interaction network model. The calculation formula is as follows:
Figure GDA00039967282600001214
where i and j take values in {0, 1}, τ is the scaling (temperature) parameter, $G_l$ is an independent and identically distributed sample drawn from the standard Gumbel distribution Gumbel(0, 1), and $\pi_l = [1 - z_l, z_l]$ expresses the mask matrix z as a set of random binary variables.
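The relaxation can be sketched in NumPy as follows. This is an illustrative sketch of the standard Gumbel-Softmax trick for a two-state variable, not the exact formula from the image: the function name `gumbel_softmax_mask`, the clipping constant and the choice to return only the probability of state 1 are assumptions made for the example.

```python
import numpy as np

def gumbel_softmax_mask(z, tau=1.0, rng=None):
    """Relax Bernoulli probabilities z (any shape) into soft masks v in [0, 1].

    For each entry, pi = [1 - z, z]; Gumbel(0, 1) noise is added to log(pi)
    and a softmax with temperature tau is taken over the two states.
    """
    rng = np.random.default_rng() if rng is None else rng
    z = np.clip(np.asarray(z, dtype=float), 1e-8, 1.0 - 1e-8)  # keep log finite
    log_pi = np.stack([np.log(1.0 - z), np.log(z)], axis=-1)
    g = rng.gumbel(loc=0.0, scale=1.0, size=log_pi.shape)       # G ~ Gumbel(0, 1)
    logits = (log_pi + g) / tau
    logits = logits - logits.max(axis=-1, keepdims=True)        # numerical stability
    soft = np.exp(logits)
    soft = soft / soft.sum(axis=-1, keepdims=True)
    return soft[..., 1]   # relaxed probability of state 1 (edge kept)

# Example: a 3 x 4 mask whose entries all start at probability 0.5.
v = gumbel_softmax_mask(np.full((3, 4), 0.5), tau=0.5, rng=np.random.default_rng(0))
```

A smaller τ pushes the entries of v toward 0 or 1, while a larger τ keeps them closer to 0.5; in both cases v remains a differentiable function of z in an automatic-differentiation framework.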
In order to improve the training efficiency of the model, sparse interaction between the position task graph and the task relation graph and sparse interaction between the emotion task graph and the task relation graph are encouraged according to a first loss function (namely sparse regularization), wherein the first loss function is as follows:
Figure GDA0003996728260000131
In the formula, l denotes the layer index of the current iteration and L denotes the preset total number of network layers. To prevent the interaction between graphs from becoming so sparse that the whole multi-graph sparse interaction network model splits into three independent graph networks, more interaction between the position task graph and the task relation graph and between the emotion task graph and the task relation graph is encouraged according to a second loss function (sharing regularization). That is, more interaction is encouraged at the lower layers so that low-level information is shared among the tasks. The second loss function is as follows:
Figure GDA0003996728260000132
The multi-graph sparse interaction module shares information between the two tasks so that each task obtains information helpful for its training, while noise that would harm training is filtered out. This improves the efficiency and quality of sharing between the tasks and, in turn, the performance of the multi-graph sparse interaction network model on each task.
As one embodiment, the task-related attention module obtains the position feature representation $g_{st}$ of the position task graph and the emotional feature representation $g_{se}$ of the emotion task graph. For the position detection task, a mask mechanism is needed to filter out non-target words so that the position feature representation is related to the target. Specifically, a mask matrix is designed in which the positions of target words are set to 1 and the positions of non-target words are set to 0, which yields the feature representation of the position task graph after mask matrix conversion:
Figure GDA0003996728260000133
A retrieval-based attention mechanism is then used to obtain a richer position feature representation related to the target words, where the attention weight $\alpha$ of the position feature representation is calculated according to the following formulas:
Figure GDA0003996728260000134
Figure GDA0003996728260000141
where h is the output of the BERT network model;
Figure GDA0003996728260000142
denotes the feature representation of the t-th word after encoding by the BERT network model; m + n denotes the length of the input text in words (the target text has length m and the tweet text has length n);
Figure GDA0003996728260000143
denotes the feature representation of the i-th node (i.e., the i-th word vector) in the output
Figure GDA0003996728260000144
of the position task graph after mask matrix conversion; $\beta_t$ denotes the attention weight of the t-th word vector; $\beta_i$ denotes the attention weight of the position feature of the i-th word vector; and $\alpha_t$ denotes the normalized attention weight of the t-th word vector. The attention weight of the position feature representation expresses how much attention each word in the context pays to the position feature.
Then, the position feature representation $r_{st}$ of the position task graph is calculated according to the following formula:
Figure GDA0003996728260000145
where $h_i$ is the feature vector of the i-th word encoded by the BERT model and $\alpha_i$ is the attention weight of the position feature representation of the i-th word vector.
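The mask step and the attention computation for the position task can be sketched together in NumPy. This is a minimal sketch under assumptions: the unnormalized weight $\beta_t$ is taken to be the sum of dot products between the BERT feature of word t and the masked graph outputs, and the normalization is taken to be a softmax over all words; the exact formulas are in the images above. The function name `target_attention` and the argument layout are hypothetical.

```python
import numpy as np

def target_attention(h, g, target_mask):
    """Retrieval-style attention over BERT features, guided by masked graph output.

    h:           (T, d) BERT-encoded word features, T = m + n
    g:           (T, d) node features output by the task graph
    target_mask: (T,) array with 1 at target-word positions and 0 elsewhere
    Returns (alpha, r): normalized weights of shape (T,) and the pooled
    feature representation of shape (d,).
    """
    g_masked = g * target_mask[:, None]          # zero out non-target nodes
    beta = h @ g_masked.sum(axis=0)              # beta_t = sum_i h_t . g_i
    beta = beta - beta.max()                     # numerical stability
    alpha = np.exp(beta) / np.exp(beta).sum()    # softmax over the T words
    r = alpha @ h                                # r = sum_t alpha_t * h_t
    return alpha, r
```

For the emotion task the same routine would be applied to the output of the emotion task graph; whether a target mask is also used there is not stated in the text, so an all-ones mask would be the neutral choice in this sketch.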
Similarly, the emotional feature representation $r_{se}$ of the emotion task graph is calculated according to formulas (4) to (6):
Figure GDA0003996728260000146
Figure GDA0003996728260000147
Figure GDA0003996728260000148
Figure GDA0003996728260000149
denotes the feature representation of the i-th node (i.e., word vector) in the output $g_{se}$ of the emotion task graph; $\alpha'$ is the attention weight of the emotional feature representation; $\beta'_t$ is the attention weight of the emotional feature of the t-th word vector; and $\beta'_i$ is the attention weight of the emotional feature of the i-th word vector.
After the final position feature representation $r_{st}$ and emotional feature representation $r_{se}$ are obtained, a fully connected layer fuses the text features with the rich context features to produce the position detection polarity and the emotion classification polarity of the input text:
$$y_{task} = \mathrm{softmax}(W_{task}\, r_{task} + b_{task})$$
In the formula, $y_{task}$ with task ∈ {st, se} is the position detection polarity or the emotion classification polarity predicted by the multi-graph sparse interaction network model, $W_{task}$ is the weight of the fully connected layer, $b_{task}$ is the bias corresponding to $W_{task}$, and softmax is the activation function.
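The classification step is a single affine map followed by softmax. The sketch below shows it in NumPy; the function names, the feature dimension and the number of classes in the example are hypothetical.

```python
import numpy as np

def softmax(x):
    """Numerically stable softmax over a 1-D score vector."""
    e = np.exp(x - np.max(x))
    return e / e.sum()

def classify(r_task, w_task, b_task):
    """y_task = softmax(W_task r_task + b_task).

    r_task: (d,) pooled feature representation for one task
    w_task: (num_classes, d) fully connected weights
    b_task: (num_classes,) bias
    """
    return softmax(w_task @ r_task + b_task)

# Example with a hypothetical 4-dimensional feature and 3 polarity classes.
y_st = classify(np.array([1.0, 0.0, 0.0, 0.0]),
                np.array([[2.0, 0.0, 0.0, 0.0],
                          [0.0, 1.0, 0.0, 0.0],
                          [0.0, 0.0, 1.0, 0.0]]),
                np.zeros(3))
```

Each task has its own weight matrix and bias, so the function would be called once with the position parameters and once with the emotion parameters.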
Finally, the objective function of the whole multi-graph sparse interaction network model is a linear combination of the loss functions of the position detection task and the emotion analysis task:
Figure GDA0003996728260000151
In the formula, θ denotes the parameters of the multi-task graph network model; $\lambda_1$, $\lambda_2$ and $\lambda_3$ are the coefficients of the corresponding loss terms; d denotes a tweet;
Figure GDA0003996728260000153
is the set of all tweets;
Figure GDA0003996728260000154
is the emotion task label of tweet d predicted by the model;
Figure GDA0003996728260000155
is the position task label of tweet d predicted by the model;
Figure GDA0003996728260000156
denotes the first loss function (sparse regularization); and
Figure GDA0003996728260000157
denotes the second loss function (sharing regularization), which encourages sharing between tasks.
The task-related attention module calculates the final position feature representation of the position task graph and the final emotional feature representation of the emotion task graph, and performs position detection and emotion classification according to these representations. The embodiment of the invention constructs three graph structures to capture the interaction between tasks. Among them, the task relation graph captures the word-level correlation between the position detection task and the emotion analysis task, which helps the tasks share information and reduces the noise introduced during information sharing.
The embodiment of the invention predicts the position and attitude that the public expresses in short news texts published on social platforms. Because such texts are short and carry little information, single-task learning rarely achieves high accuracy. The invention therefore provides a multi-task position detection method based on a multi-graph sparse interaction network.
While the foregoing is directed to the preferred embodiment of the present invention, it will be understood by those skilled in the art that various changes and modifications may be made without departing from the spirit and scope of the invention.
It will be understood by those skilled in the art that all or part of the processes of the above embodiments may be implemented by hardware related to instructions of a computer program, and the computer program may be stored in a computer readable storage medium, and when executed, may include the processes of the above embodiments. The storage medium may be a magnetic disk, an optical disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), or the like.

Claims (8)

1. A multi-task position detection method based on a multi-graph sparse interaction network is characterized in that,
inputting an input text into a multi-graph sparse interaction network model to obtain the position detection polarity and the emotion classification polarity of the input text; the multi-graph sparse interaction network model is a graph convolutional neural network model and comprises a text encoding module, a multi-graph construction module, a multi-graph sparse interaction module and a task-related attention module; the input text comprises a tweet text and a target text;
the text encoding module is used for processing input text into a plurality of word vectors for the multi-graph building module, the multi-graph sparse interaction module and the task related attention module to use;
the multi-graph construction module is used for constructing a position task graph, an emotion task graph and a task relation graph of the multi-graph sparse interaction network model; the position task graph is a graph constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the position detection task; the emotion task graph is a graph constructed according to the syntactic dependency tree of the input text and the pragmatic weights of the words of the input text in the emotion classification task; the task relation graph is a graph constructed according to the word category importance weights of the words of the input text and the syntactic dependency tree of the input text; constructing the position task graph and the emotion task graph of the multi-graph sparse interaction network model comprises the following steps: constructing a first syntactic dependency tree according to the syntactic structure of the tweet text, and acquiring a root word set of the first syntactic dependency tree; adding the words of the target text to the first syntactic dependency tree according to the word connection relation between the target text and the root word set to obtain a second syntactic dependency tree of the input text; calculating a first pragmatic weight and a second pragmatic weight of the tweet text and a third pragmatic weight of the target text, wherein the first pragmatic weight is the pragmatic weight of each word of the tweet text in the position detection task, and the second pragmatic weight is the pragmatic weight of each word of the tweet text in the emotion classification task; constructing the position task graph according to the first pragmatic weight, the third pragmatic weight and the second syntactic dependency tree, and constructing the emotion task graph according to the second pragmatic weight and the second syntactic dependency tree; constructing the task relation graph of the multi-graph sparse interaction network model comprises the following steps: constructing word category importance weights, and constructing a position label importance relation vector and an emotion label importance relation vector according to the word category importance weights; the word category importance weight is the relation between a word and the label categories of the different tasks; calculating a task interaction relation according to the position label importance relation vector and the emotion label importance relation vector, and constructing the task relation graph according to the task interaction relation;
the multi-graph sparse interaction module is used for updating the intra-graph node features of the position task graph, the emotion task graph and the task relation graph and for updating the sparse interaction of the node features among the graphs; the sparse interaction among the graphs comprises sparse interaction between the position task graph and the task relation graph and sparse interaction between the emotion task graph and the task relation graph;
and the task-related attention module is used for calculating the position detection polarity and the emotion classification polarity of the input text according to the position feature representation of the position task graph and the emotional feature representation of the emotion task graph.
2. The multi-graph sparse interaction network-based multi-task position detection method according to claim 1, wherein the intra-graph node features of the position task graph, the emotion task graph and the task relation graph are updated, specifically:
according to the formula
Figure FDA0003996728250000021
the intra-graph node features of the position task graph, the emotion task graph and the task relation graph are respectively updated, where task ∈ {st, se, re} denotes the position task graph, the emotion task graph and the task relation graph, respectively, I is the identity matrix, σ is a nonlinear activation function, l denotes the layer index of the current iteration,
Figure FDA0003996728250000022
denotes the parameters of the l-th network layer,
Figure FDA0003996728250000023
denotes the adjacency matrix of the position task graph, the emotion task graph or the task relation graph, and
Figure FDA0003996728250000024
denotes the adjacency matrix with its diagonal filled with 1.
3. The multi-graph sparse interaction network-based multi-task position detection method according to claim 2, wherein the sparse interaction of the node features among the graphs is updated, specifically:
according to the following formulas:
Figure FDA0003996728250000025
Figure FDA0003996728250000026
Figure FDA0003996728250000027
the sparse interaction of the node features among the graphs is updated, the graphs comprising the position task graph, the emotion task graph and the task relation graph, where
Figure FDA0003996728250000031
and
Figure FDA0003996728250000032
respectively denote the matrix representations of the node features of the position task graph $g_{st}$, the emotion task graph $g_{se}$ and the task relation graph $g_{re}$ after the intra-graph node features have been updated by the GCN, α is a hyper-parameter used to control how much node information of the position task graph and of the emotion task graph is fused into the task relation graph, and l denotes the layer index of the current iteration.
4. The multi-graph sparse interaction network-based multi-task position detection method according to claim 3, wherein sparse interaction between the position task graph and the task relationship graph and between the emotion task graph and the task relationship graph is encouraged according to a first loss function; and encouraging the interaction between the position task graph and the task relation graph and the interaction between the emotion task graph and the task relation graph according to the second loss function.
5. The multi-graph sparse interaction network-based multi-task position detection method according to claim 4, wherein the first loss function is specifically:
Figure FDA0003996728250000033
the second loss function is specifically:
Figure FDA0003996728250000034
where
Figure FDA0003996728250000035
is the sparse mask matrix of the position task,
Figure FDA0003996728250000036
is the sparse mask matrix of the emotion task, l denotes the layer index of the current iteration, and L denotes the preset total number of network layers.
6. The multi-graph sparse interaction network based multitask position detection method according to claim 5,
wherein the position feature representation $r_{st}$ of the position task graph is calculated according to the following formula:
Figure FDA0003996728250000037
where $\alpha_i$ is the attention weight of the position feature representation, $h_i$ is the feature vector of the i-th word after encoding by the BERT model, and m + n denotes the length of the input text.
7. The multi-graph sparse interaction network-based multi-task position detection method according to claim 6, wherein the emotional feature representation $r_{se}$ of the emotion task graph is calculated according to the following formula:
Figure FDA0003996728250000041
where $\alpha'_i$ is the attention weight of the emotional feature representation,
Figure FDA0003996728250000042
is the feature representation of the i-th node in the output $g_{se}$ of the emotion task graph, and m + n denotes the length of the input text.
8. The multi-graph sparse interaction network-based multi-task position detection method according to any one of claims 1 to 7, wherein the position detection polarity and the emotion classification polarity of the input text are calculated according to the following formula:
$$y_{task} = \mathrm{softmax}(W_{task}\, r_{task} + b_{task});$$
where $y_{task}$ is the position detection polarity or the emotion classification polarity predicted by the multi-graph sparse interaction network model, $W_{task}$ is the weight of the fully connected layer, $b_{task}$ is the bias corresponding to $W_{task}$, $r_{task}$ is the position feature representation or the emotional feature representation, and softmax is the activation function.
CN202210069686.4A 2022-01-21 2022-01-21 Multi-task standpoint detection method based on multi-graph sparse interaction network Active CN114969318B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210069686.4A CN114969318B (en) 2022-01-21 2022-01-21 Multi-task standpoint detection method based on multi-graph sparse interaction network


Publications (2)

Publication Number Publication Date
CN114969318A CN114969318A (en) 2022-08-30
CN114969318B true CN114969318B (en) 2023-04-07

Family

ID=82974812


Country Status (1)

Country Link
CN (1) CN114969318B (en)




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant