CN114970557A - Knowledge enhancement-based cross-language structured emotion analysis method - Google Patents

Knowledge enhancement-based cross-language structured emotion analysis method

Info

Publication number
CN114970557A
Authority
CN
China
Prior art keywords
word
sentence
training
embedding
language
Prior art date
Legal status
Pending
Application number
CN202210423028.0A
Other languages
Chinese (zh)
Inventor
张旗
杨向东
冯石路
Current Assignee
Oriental Fortune Information Co ltd
Original Assignee
Oriental Fortune Information Co ltd
Priority date
Filing date
Publication date
Application filed by Oriental Fortune Information Co ltd
Priority to CN202210423028.0A
Publication of CN114970557A
Legal status: Pending

Classifications

    • G06F40/30 Semantic analysis; G06F40/35 Discourse or dialogue representation
    • G06F16/35 Clustering; Classification; G06F16/353 Classification into predefined classes
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G06F40/216 Parsing using statistical methods
    • G06F40/253 Grammatical analysis; Style critique
    • G06N3/042 Knowledge-based neural networks; Logical representations of neural networks
    • G06N3/045 Combinations of networks
    • G06N3/047 Probabilistic or stochastic networks
    • G06N3/08 Learning methods


Abstract

The invention relates to a knowledge enhancement-based cross-language structured emotion analysis method. The invention designs a countermeasure-based embedding adapter: semantically rich word-embedding representations are dynamically learned through a word-level attention mechanism, and, to improve the robustness of the representation, an adversarial mechanism adds perturbations to the word embeddings. The invention also designs an encoding layer based on a graph neural network: structured knowledge (e.g., syntactic parse trees) is important for the structured sentiment analysis task, and although word order differs across languages, the syntactic structures are similar. To this end, the invention incorporates structured knowledge (e.g., syntactic structures) into the model to learn a structured representation. Finally, the invention performs the decoding operation through a decoding layer to extract the target, the holder, the opinion words and the sentiment polarity information contained in the text.

Description

Knowledge enhancement-based cross-language structured emotion analysis method
Technical Field
The invention relates to a structured emotion analysis method.
Background
With the continued rise of social media, both users and the content they produce are growing at an explosive rate, which has fundamentally changed how the public and enterprises receive and disseminate information. With data on the order of tens of millions of news items per day, structured sentiment analysis is highly valuable work. For example: a media worker can train a sentiment analysis model on the large number of movie reviews on the internet to learn which movies people like and dislike; an investor can build a model that helps predict the stock market, gauging optimism about stocks from people's posts in forums; a government worker can use a sentiment analysis model to assess how audiences' emotions change while watching Tutt speeches, so as to analyze how well the speeches are received. Structured sentiment analysis is therefore proposed: it can identify the sentiment users express on social platforms about real-time events such as financial news, sports, weather and entertainment, and it is crucial for many applications.
Specifically, structured sentiment analysis refers to extracting structured knowledge (such as targets, opinion words, holders, etc.) from text and predicting its sentiment, and is an important research direction in the field of Natural Language Processing (NLP). The task comprises two subtasks: structured extraction and sentiment analysis. First, the structured extraction subtask automatically extracts the main body and each component from the text and gives the relationships existing between the parts. Then, for the given structured data, the corresponding sentiment is predicted. The method depends on entity extraction and relation extraction, but is more difficult than either, and involves methods and technologies from multiple disciplines such as natural language processing, machine learning and pattern matching. In recent years, with the development of deep neural networks, and especially the wide application of large-scale pre-training methods, the performance of the structured sentiment analysis task has improved greatly.
However, because the annotation of the structured sentiment analysis task is complex, acquisition costs are high and datasets are small, which greatly limits the performance of neural network models. For this reason, cross-language transfer methods have been proposed for structured extraction, thereby reducing the need for annotated data. Most cross-language structure transfer suffers from language-specific problems: it depends too heavily on bilingual dictionaries and parallel corpora, which require additional resources or tools. Feng et al. propose applying cross-language transfer to sequence labeling tasks without considering such complexity. Wang et al. use similarly distributed cross-language representation spaces for relation extraction.
In recent years, multilingual pre-training models (e.g., mBERT, XLM, etc.) have enjoyed great success in cross-language transfer. Liu et al. and Nguyen et al. apply multilingual word embeddings to cross-language tasks such as semantic role labeling, dependency parsing and named entity recognition. However, most current approaches model a single pre-trained word vector, while in fact many cross-language pre-training models exist and carry different semantic information, because each pre-trained model has different optimization objectives and different training data. Moreover, although structured knowledge is widely used for structured extraction tasks, how to use it for the cross-language structured sentiment analysis task has not been fully researched.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: with the development of pre-trained language models, many pre-training model structures, training objectives and training datasets have been successively proposed for cross-language structured extraction, so that different cross-language pre-training models carry different semantic information; how to fully utilize all cross-language pre-training models to improve the word embedding representation has therefore not been well researched. Meanwhile, structured knowledge is important for the structured sentiment analysis task: the pieces of structured information (such as targets, holders, opinion words, etc.) are close to each other in the syntax tree, and different languages are annotated based on the same syntactic rules, so their syntactic structures are similar; however, current work ignores how to combine syntactic structure into the cross-language structured sentiment analysis task.
In order to solve the above technical problems, the technical scheme of the invention is to provide a knowledge enhancement-based cross-language structured emotion analysis method, which is characterized in that a training corpus of the source language D = {x_1, x_2, ..., x_N} is adopted, whose labels are the sets of opinion tuples O_j = {o_1, o_2, ..., o_{m_j}}, wherein N indicates the number of samples and m_j indicates the number of opinion tuples contained in the j-th sample. In the set of opinion tuples O_j, the k-th opinion tuple o_k = (h_k, t_k, e_k, p_k) represents that the holder h_k of the k-th opinion tuple o_k expresses the sentiment polarity p_k toward the target t_k through the opinion term e_k, wherein: h_k, t_k and e_k are substrings of the j-th sentence x_j of the training corpus D, i.e. h_k = (w_{b_k^h}, ..., w_{f_k^h}), t_k = (w_{b_k^t}, ..., w_{f_k^t}) and e_k = (w_{b_k^e}, ..., w_{f_k^e}); b_k^h, b_k^t and b_k^e are respectively the start positions of the holder h_k, the target t_k and the opinion term e_k in sentence x_j; w_{b_k^h}, w_{b_k^t} and w_{b_k^e} are respectively the words of sentence x_j at the positions b_k^h, b_k^t and b_k^e; f_k^h, f_k^t and f_k^e are respectively the end positions of the holder h_k, the target t_k and the opinion term e_k in sentence x_j; and w_{f_k^h}, w_{f_k^t} and w_{f_k^e} are respectively the words of sentence x_j at the positions f_k^h, f_k^t and f_k^e. The cross-language structured emotion analysis method specifically comprises the following steps:
S101, constructing and training a countermeasure embedding adapter: when constructing the countermeasure embedding adapter, a word-level attention mechanism is designed to capture the important implicit distributed semantics of multiple embeddings pre-trained with different training strategies and tasks on different corpora, and an adversarial training strategy is then adopted to improve the robustness of the word embeddings;
when training the countermeasure embedding adapter, a set M = {M_1, M_2, ..., M_K} of K cross-language pre-training models is obtained; each sentence of the training corpus D is input into the K cross-language pre-training models of the set M, so as to obtain the word embedding vectors of each sentence; for any sentence of the training corpus D, the word embedding vectors obtained by the K cross-language pre-training models are fused through the word-level attention mechanism to obtain the final word embedding vector corresponding to the sentence, and finally a perturbation is added to the final word embedding vector;
step S102, constructing and training a grammar GCN encoder:
a training corpus D is obtained; a graph is constructed for each sentence based on its syntactic parse tree, the degree matrix of the graph is then computed, and the grammar GCN (Graph Convolutional Network) encoder is obtained from the graph and the degree matrix; the perturbed word embedding vectors obtained in step S101 are input into the grammar GCN encoder to obtain a structured representation in a unified space, thereby obtaining an information-rich and robust structured hidden representation;
step S103, constructing and training a decoder:
based on the information-rich and robust structured hidden representation, the opinion terms are extracted by predicting the start and end positions of the opinion terms, and each opinion term is regarded as the trigger word of its opinion; then, the target and the holder are extracted, and the sentiment polarity of the given expression is predicted;
and step S104, for any sentence x obtained in real time, the trained countermeasure embedding adapter is used to obtain the perturbed word embedding vector, which is input into the trained grammar GCN encoder to obtain the hidden-layer representation of each word of sentence x, and finally the trained decoder is used to extract all the opinion tuples contained in sentence x.
Preferably, the step S101 specifically includes the following steps:
step S1011, obtaining the word embedding vector of the j-th sentence x_j of the training corpus D, comprising the following steps:

sentence x_j is respectively input into each of the K cross-language pre-training models (M_1, ..., M_K), after which K different word embedding vectors are obtained, wherein the word embedding vector obtained after sentence x_j is input into the i-th cross-language pre-training model M_i is represented as E_j^i = (e_1^i, e_2^i, ..., e_{|x_j|}^i), in which e_l^i represents the word embedding of the l-th word w_l of sentence x_j obtained through the cross-language pre-training model M_i, and |x_j| represents the total number of words of sentence x_j;
the K different word embedding vectors obtained for sentence x_j are fused through the word-level attention mechanism to obtain the final word embedding vector E_j = (e_1, e_2, ..., e_{|x_j|}), wherein e_l, the embedding of the l-th word in E_j, is computed as:

a_l^i = v_a^T tanh(W_a e_l^i + b_a)

α_l^i = exp(a_l^i) / Σ_{i'=1}^K exp(a_l^{i'}), e_l = Σ_{i=1}^K α_l^i e_l^i

in the formulas, v_a, W_a and b_a are trainable parameters, and v_a^T denotes the transpose of v_a;
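The fusion above can be sketched in a few lines of numpy. This is a minimal sketch under assumed shapes, not the patented implementation: the embeddings and parameters are random stand-ins and `word_level_attention` is a hypothetical name. For one word it scores each model's embedding with v_a^T tanh(W_a e_l^i + b_a), softmax-normalizes the K scores, and returns the weighted sum.

```python
import numpy as np

def word_level_attention(embeddings, v_a, W_a, b_a):
    """Fuse K per-model embeddings of one word into a single vector.

    embeddings: (K, d) array -- e_l^i from K cross-language pre-trained models.
    v_a: (d,), W_a: (d, d), b_a: (d,) -- trainable parameters (random here).
    Returns the fused embedding e_l of shape (d,).
    """
    scores = np.tanh(embeddings @ W_a.T + b_a) @ v_a   # one score per model, (K,)
    weights = np.exp(scores - scores.max())
    weights = weights / weights.sum()                  # softmax over the K models
    return weights @ embeddings                        # attention-weighted sum, (d,)

rng = np.random.default_rng(0)
K, d = 3, 4                        # 3 pre-trained models, embedding dimension 4
e = rng.normal(size=(K, d))        # stand-in embeddings of one word
fused = word_level_attention(e, rng.normal(size=d),
                             rng.normal(size=(d, d)), rng.normal(size=d))
print(fused.shape)  # (4,)
```

In training the softmax weights would be learned end to end; here they only illustrate that the fused vector is a convex combination of the per-model embeddings.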
step S1012, let r_j = (r_1, r_2, ..., r_{|x_j|}) denote the perturbation for sentence x_j, wherein r_l denotes the perturbation for the word embedding e_l of the l-th word of sentence x_j; the word embedding of sentence x_j after adding the perturbation r_j is represented as E_j + r_j; further, the worst-case perturbation r_j* for sentence x_j is defined by

r_j* = argmax_{r_j, ||r_j||_2 ≤ ε} L(E_j + r_j)

and r_j* is computed using an estimation method as:

r_j* = ε g / ||g||_2, g_l = ∇_{e_l} L(E_j)

wherein g is the concatenation of the |x_j| per-word gradients g_l, ∇_{e_l} denotes the gradient computed with respect to the word embedding e_l, ||·||_2 denotes the L2 norm, L(·) represents the loss for a single sample, and ε is a parameter used to control the degree of the perturbation;
based on the adversarial perturbation r_j*, the adversarial training minimizes the maximized adversarial loss, thereby training against the worst-case perturbation for each sentence x_j; the setup of the adversarial training is as follows:

min_θ Σ_j max_{||r_j||_2 ≤ ε} L(E_j + r_j)

during training, the perturbation r_j* is added to sentence x_j to obtain the perturbed word embedding vector E_j' = E_j + r_j*.
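The estimated worst-case perturbation of step S1012 amounts to a normalized-gradient (FGSM-style) step. Below is a minimal numpy sketch, assuming the per-sample loss gradient with respect to the embeddings has already been obtained by backpropagation (here it is random stand-in data, and `adversarial_perturbation` is a hypothetical name):

```python
import numpy as np

def adversarial_perturbation(E, grad, eps):
    """Estimate the worst-case perturbation r* = eps * g / ||g||_2.

    E: (n, d) word embeddings of one sentence (used only for its shape);
    grad: (n, d) gradient of the per-sample loss w.r.t. E;
    eps controls the perturbation magnitude.
    """
    g = grad.reshape(-1)                          # concatenate the n per-word gradients
    r = eps * g / (np.linalg.norm(g, 2) + 1e-12)  # small constant avoids divide-by-zero
    return r.reshape(E.shape)                     # r*, same shape as the embeddings

rng = np.random.default_rng(1)
E = rng.normal(size=(5, 4))                       # 5 words, dimension 4
grad = rng.normal(size=(5, 4))                    # stand-in dL/dE from backprop
r_star = adversarial_perturbation(E, grad, eps=0.1)
E_adv = E + r_star                                # perturbed embeddings fed to the encoder
print(round(float(np.linalg.norm(r_star)), 6))    # ~= 0.1
```

The resulting perturbation always has L2 norm of approximately ε, so ε directly bounds the perturbation budget regardless of the gradient's scale.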
Preferably, the step S102 includes the following steps:

during training, the relation set of the syntactic parse tree of sentence x_j in the training corpus is denoted as E_j; for sentence x_j a graph G_j = (V_j, E_j) is constructed, V_j = {v_1, ..., v_{|x_j|}}, wherein the l-th node v_l is the l-th word w_l of sentence x_j; based on the graph G_j, an adjacency matrix A is established: if there is a connecting edge between node v_m and node v_n in graph G_j, then A_mn = 1, otherwise A_mn = 0, A_mn being the element in the m-th row and n-th column of the adjacency matrix A; the degree matrix D of the graph G_j is obtained as D_mm = Σ_n A_mn, wherein D_mm is the element in the m-th row and m-th column of the degree matrix D and all off-diagonal elements of D are 0.

Based on the graph G_j, the adjacency matrix A and the degree matrix D, the grammar GCN encoder for sentence x_j is constructed. The grammar GCN encoder has P+1 graph convolution layers in total (layers 0 to P), and the hidden representation H^(p) = (h_1^(p), ..., h_{|x_j|}^(p)) of the p-th graph convolution layer is learned from the hidden representation of the adjacent (p-1)-th graph convolution layer, wherein h_l^(p) is the hidden representation of the l-th word w_l of sentence x_j at the p-th layer. H^(p) is computed as follows:

H^(p) = ReLU(D^{-1/2} A D^{-1/2} H^(p-1) W^(p))

in the formula, W^(p) is a trainable parameter and H^(0) is the perturbed word embedding vector. The perturbed word embedding vector E_j' of sentence x_j obtained in step S101 is input into the grammar GCN encoder, and the structured representation of the unified space H^(P) = (h_1^(P), ..., h_{|x_j|}^(P)) is obtained from the last layer, wherein h_l^(P) represents the finally obtained hidden-layer representation of the l-th word w_l of sentence x_j; in this way, an information-rich and robust structured hidden representation is obtained.
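A single graph-convolution layer over the dependency graph can be sketched as follows. The patent renders its exact layer formula as an equation image, so this sketch assumes the common normalized form ReLU(D^{-1/2} A D^{-1/2} H W) with self-loops added; the toy adjacency matrix, random weights and the name `gcn_layer` are stand-ins, not the claimed implementation.

```python
import numpy as np

def gcn_layer(A, H, W):
    """One graph-convolution layer: H' = ReLU(D^{-1/2} A D^{-1/2} H W).

    A: (n, n) adjacency matrix built from the dependency parse (self-loops are
    added here so every word keeps its own features); H: (n, d_in); W: (d_in, d_out).
    """
    A_hat = A + np.eye(A.shape[0])             # add self-loops (a common choice)
    deg = A_hat.sum(axis=1)                    # degree: D_mm = sum_n A_mn
    D_inv_sqrt = np.diag(1.0 / np.sqrt(deg))   # symmetric normalization
    return np.maximum(0.0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W)

# toy 4-word sentence whose parse links word 1 (the head) to words 0, 2 and 3
A = np.zeros((4, 4))
for head, dep in [(1, 0), (1, 2), (1, 3)]:
    A[head, dep] = A[dep, head] = 1.0          # treat dependency edges as undirected

rng = np.random.default_rng(2)
H0 = rng.normal(size=(4, 8))                   # stands in for the perturbed embeddings
H1 = gcn_layer(A, H0, rng.normal(size=(8, 8)))
print(H1.shape)  # (4, 8)
```

Stacking P+1 such layers lets each word's representation mix information from words up to P+1 dependency hops away, which is why syntactically close tuple elements end up with related representations.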
Preferably, the step S103 specifically includes the following steps:
step S1031, opinion term extraction: to extract the opinion terms in the sentence, during training two binary classifiers are used to predict the probability P_b^e(l) that the l-th word w_l of sentence x_j is the start position of the opinion term e_k, or the probability P_f^e(l) that it is the end position, as shown in the following formulas:

P_b^e(l) = sigmoid(W_b^e h_l^(P) + b_b^e)

P_f^e(l) = sigmoid(W_f^e h_l^(P) + b_f^e)

in the formulas, W_b^e, b_b^e, W_f^e and b_f^e are learnable parameters;

cross entropy is used as the loss function L_e, then:

L_e = Σ_{l=1}^{|x_j|} [CE(P_b^e(l), y_b^e(l)) + CE(P_f^e(l), y_f^e(l))]

in the formula, CE(·) represents the cross-entropy function; y_b^e(l) and y_f^e(l) are the labels of the start and end positions of the sample opinion term e_k: if l is the start or end position of the opinion term e_k, then the corresponding label is equal to 1;
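The start/end prediction can be sketched with two per-token sigmoid classifiers over the encoder states plus a binary cross-entropy loss. This is a minimal numpy sketch with random stand-in states and gold labels; `span_boundary_probs` and `bce` are hypothetical names, and the exact classifier form in the patent is rendered as an equation image.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def span_boundary_probs(H, w_b, b_b, w_f, b_f):
    """Two binary classifiers over encoder states H (n, d): for every word, the
    probability of being the start (P_b) or end (P_f) of an opinion term."""
    return sigmoid(H @ w_b + b_b), sigmoid(H @ w_f + b_f)

def bce(p, y):
    """Binary cross-entropy, i.e. the CE(.) terms summed over all positions."""
    return float(-(y * np.log(p) + (1 - y) * np.log(1 - p)).sum())

rng = np.random.default_rng(3)
n, d = 6, 8
H = rng.normal(size=(n, d))                     # stand-in states from the GCN encoder
p_start, p_end = span_boundary_probs(H, rng.normal(size=d), 0.0,
                                     rng.normal(size=d), 0.0)
y_start = np.zeros(n); y_start[2] = 1.0         # gold: opinion term starts at word 2
y_end = np.zeros(n); y_end[3] = 1.0             # ... and ends at word 3
loss = bce(p_start, y_start) + bce(p_end, y_end)
print(p_start.shape, loss > 0)
```

At inference time, pairing each predicted start with the nearest following predicted end is one simple way to decode spans from these two probability sequences.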
step S1032, target extraction: to take the opinion information into account when extracting the target, during training the probability P_b^t(l) that the l-th word w_l of sentence x_j is the start position of the target t_k, or the probability P_f^t(l) that it is the end position, is predicted from the opinion term representation, as shown in the following formulas:

P_b^t(l) = sigmoid(W_b^t [h_l^(P); h_{b_k^e}^(P); h_{f_k^e}^(P)] + b_b^t)

P_f^t(l) = sigmoid(W_f^t [h_l^(P); h_{b_k^e}^(P); h_{f_k^e}^(P)] + b_f^t)

in the formulas, W_b^t, b_b^t, W_f^t and b_f^t are learnable parameters; h_{b_k^e}^(P) is the hidden-layer representation, obtained by the grammar GCN encoder, of the word at the start position of the opinion term e_k, and h_{f_k^e}^(P) is the hidden-layer representation of the word at its end position; [a; b] denotes the concatenation of a and b;

the loss function of the target extraction is L_t, then:

L_t = Σ_{l=1}^{|x_j|} [CE(P_b^t(l), y_b^t(l)) + CE(P_f^t(l), y_f^t(l))]

in the formula, y_b^t(l) and y_f^t(l) are the labels of the start and end positions of the sample target t_k: if l is the start or end position of the target t_k, then the corresponding label is equal to 1;
step S1033, holder extraction: during training, the probability P_b^h(l) that the l-th word w_l of sentence x_j is the start position of the holder h_k, or the probability P_f^h(l) that it is the end position, is predicted from the opinion term representation, as shown in the following formulas:

P_b^h(l) = sigmoid(W_b^h [h_l^(P); h_{b_k^e}^(P); h_{f_k^e}^(P)] + b_b^h)

P_f^h(l) = sigmoid(W_f^h [h_l^(P); h_{b_k^e}^(P); h_{f_k^e}^(P)] + b_f^h)

in the formulas, W_b^h, b_b^h, W_f^h and b_f^h are learnable parameters.

The loss function of the holder extraction is L_h, then:

L_h = Σ_{l=1}^{|x_j|} [CE(P_b^h(l), y_b^h(l)) + CE(P_f^h(l), y_f^h(l))]

in the formula, y_b^h(l) and y_f^h(l) are the labels of the start and end positions of the sample holder h_k: if l is the start or end position of the holder h_k, then the corresponding label is equal to 1;
step S1034, sentiment polarity prediction:

during training, max pooling is used to obtain the sentence representation r_s = Maxpooling(H^(P)) of sentence x_j, and r_s is concatenated with the opinion term representation for polarity classification, obtaining the probability that sentence x_j expresses sentiment polarity p_k:

P(p_k) = softmax(W_p [r_s; h_{b_k^e}^(P); h_{f_k^e}^(P)] + b_p)

in the formula, W_p and b_p are learnable parameters;

based on the sentiment probability distribution, the loss function L_p is as follows:

L_p = CE(P(p_k), y_p^k)

in the formula, y_p^k is the label of the sample sentiment polarity p_k.
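The polarity head can be sketched as follows: max-pool the encoder states into a sentence vector, concatenate it with the opinion-term boundary representations, and apply a softmax classifier. This is a minimal numpy sketch with random stand-in weights; the three polarity classes and the name `predict_polarity` are assumptions for illustration.

```python
import numpy as np

def softmax(x):
    z = np.exp(x - x.max())
    return z / z.sum()

def predict_polarity(H, b_e, f_e, W_p, b_p):
    """Max-pool the encoder states into a sentence vector r_s, concatenate it
    with the opinion-term boundary representations, and classify polarity."""
    r_s = H.max(axis=0)                              # r_s = Maxpooling(H^(P))
    feat = np.concatenate([r_s, H[b_e], H[f_e]])     # [r_s; h_{b_e}; h_{f_e}]
    return softmax(W_p @ feat + b_p)                 # distribution over polarities

rng = np.random.default_rng(4)
n, d, n_classes = 6, 8, 3                            # e.g. positive/neutral/negative
H = rng.normal(size=(n, d))                          # stand-in encoder states
probs = predict_polarity(H, b_e=2, f_e=3,
                         W_p=rng.normal(size=(n_classes, 3 * d)),
                         b_p=np.zeros(n_classes))
print(probs.shape, round(float(probs.sum()), 6))     # (3,) 1.0
```

Conditioning on the opinion-term boundaries lets the same sentence yield different polarities for different opinion tuples, which a sentence-only classifier could not do.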
In the present invention, a countermeasure-based embedding adapter is designed: semantically rich word-embedding representations are dynamically learned through a word-level attention mechanism, and, to improve the robustness of the representation, an adversarial mechanism adds perturbations to the word embeddings. The invention also designs an encoding layer based on a graph neural network: structured knowledge (e.g., syntactic parse trees) is important for the structured sentiment analysis task, and although word order differs across languages, the syntactic structures are similar. To this end, the invention incorporates structured knowledge (e.g., syntactic structures) into the model to learn a structured representation. Finally, the invention performs the decoding operation through a decoding layer to extract the target, the holder, the opinion words and the sentiment polarity information contained in the text.
The invention focuses on cross-language structured sentiment analysis: it proposes knowledge-enhanced cross-language structured sentiment analysis that trains on a source language and transfers to a target language for testing, thereby reducing the need for target-language annotation data and improving the extraction performance of the neural network model. To this end, the invention provides a knowledge-enhanced cross-language structured sentiment analysis model that adds both implicit and explicit knowledge to the cross-language structured sentiment analysis task. First, the invention designs a countermeasure-based embedding adapter that adaptively combines multiple embedding representations using a word-level attention mechanism and an adversarial strategy, learning semantically rich and robust word representations. Furthermore, inspired by existing work, the invention integrates universal syntactic dependencies into cross-language structured sentiment analysis. The syntax tree is of great significance to the structured sentiment analysis task, and different languages have similar syntactic structures, so combining syntactic structure into the task helps the model learn a good structured representation.
Drawings
FIG. 1 is a diagram of a knowledge-based enhanced cross-language structured migration model of the present invention.
Detailed Description
The invention will be further illustrated with reference to the following specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Further, it should be understood that various changes or modifications of the present invention may be made by those skilled in the art after reading the teaching of the present invention, and such equivalents may fall within the scope of the present invention as defined in the appended claims.
The existing structured sentiment analysis datasets cover few languages and are small in scale, which greatly limits the performance of neural network models. Therefore, the knowledge enhancement-based cross-language structured emotion analysis method of the invention proposes a knowledge-enhanced cross-language structured transfer, as shown in FIG. 1. First, the invention designs a countermeasure-based embedding adapter to adaptively capture implicit semantic information from multiple multilingual embeddings and learn semantically rich and robust representations. In addition, the invention incorporates a syntactic GCN encoder to learn structural representations shared between different languages. Finally, the invention performs structured extraction based on the sentence representations learned by these two parts.
In the present invention, formalization is defined as follows:
given a corpus of a source language
Figure BDA0003608720270000081
Whose labels are sets of view tuples
Figure BDA0003608720270000082
Wherein the content of the first and second substances,
Figure BDA0003608720270000083
which is indicative of the number of samples,
Figure BDA0003608720270000084
the representative sample contains the viewpoint tuple number. Set of viewpoint tuples
Figure BDA0003608720270000085
The kth viewpoint tuple o k =(h k ,t k ,t k ,p k ) Represents the kth view tuple o k Holder of h k By the term of opinion e k For the target t k Expressing emotion polesProperty p k Wherein:
Figure BDA0003608720270000086
Figure BDA0003608720270000087
is a training corpus
Figure BDA0003608720270000088
The jth sentence x in j The sub-string of (a) is,
Figure BDA0003608720270000089
Figure BDA00036087202700000810
respectively the bearer h k Target t k Term "viewpoint" e k In sentence x j In the above-mentioned position of the start position,
Figure BDA00036087202700000811
are respectively sentences x j In
Figure BDA00036087202700000812
The word at the location of the location,
Figure BDA00036087202700000813
Figure BDA00036087202700000814
respectively the bearer h k Target t k Term "viewpoint" e k In sentence x j In the end position of (a) to (b),
Figure BDA00036087202700000815
Figure BDA00036087202700000816
are respectively sentences x j In (1)
Figure BDA00036087202700000817
The word at the location.
Based on the definition, the cross-language structured emotion analysis method based on knowledge enhancement provided by the invention specifically comprises the following steps of:
and S101, constructing and training a counterattack embedding adapter, and learning a rich and robust cross-language migration word embedding method through the counterattack embedding adapter. To this end, in building the anti-embedding adapter, the invention first designs a word attention mechanism to capture the important implicit distributed semantics of multiple embeddings pre-trained with different training strategies and tasks on different corpora. The invention then employs an antagonistic training strategy to improve the robustness of word embedding.
Step S101 specifically includes the following steps:
step S1011, designing a word level attention mechanism:
multilingual pre-training language models, such as mBERT and XLM, have been widely used for different cross-language tasks with great success. In fact, developers and researchers have released a large number of multilingual pre-trained language models. There are more than 100 models on Hugging Face website. These models have different semantic information because they are trained on different large-scale datasets using different targets and settings. For example, mBERT-base-cast and mBERT-base-uncased are trained on the first 104 languages with the largest Wikipedia using a Mask Language Modeling (MLM) target based on lower case or upper case text. The XLM-RoBERTA model was pre-trained on these 100 languages with 2.5TB of data. However, most existing work uses one of these models as word embedding. In order to obtain better word representation, the invention designs a word-level attention mechanism to better combine multiple cross-language pre-training word embedding.
Specifically, a set of N cross-language pre-training models {M_1, M_2, …, M_N} is obtained. During training, each sentence in the training corpus is input into each of the N cross-language pre-training models, so as to obtain word embedding vectors for each sentence. Since different cross-language pre-training models split the same sentence into different sub-words, for any sentence in the training corpus the invention takes, for each word and each cross-language pre-training model, the average of that model's sub-word embeddings as the word's embedding under that model, thereby obtaining the final word embedding vector corresponding to each sentence.
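The per-word averaging of sub-word embeddings described above can be sketched as follows. This is a minimal numpy illustration; the sub-word-to-word alignment list word_ids is an assumed input (modern tokenizers expose such an alignment), not something the patent specifies.

```python
import numpy as np

def pool_subwords_to_words(subword_embs, word_ids):
    """Average the sub-word embeddings that belong to the same word.

    subword_embs : (num_subwords, dim) array from one pre-trained model.
    word_ids     : list mapping each sub-word to its word index.
    Returns a (num_words, dim) array of word-level embeddings.
    """
    num_words = max(word_ids) + 1
    dim = subword_embs.shape[1]
    out = np.zeros((num_words, dim))
    counts = np.zeros(num_words)
    for row, w in zip(subword_embs, word_ids):
        out[w] += row       # accumulate sub-word vectors per word
        counts[w] += 1
    return out / counts[:, None]

# toy example: 4 sub-words covering 2 words (e.g. "play"+"##ing" -> word 0)
embs = np.array([[1.0, 0.0], [3.0, 2.0], [5.0, 5.0], [7.0, 7.0]])
word_ids = [0, 0, 1, 1]
words = pool_subwords_to_words(embs, word_ids)
```

Repeating this per model gives each model's word-level embedding vector for the sentence, which the word-level attention mechanism then fuses.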
In this embodiment, during training, the word embedding vector of the j-th sentence x_j in the training corpus is obtained through the following steps.
Sentence x_j is input into each of the N cross-language pre-training models, yielding N different word embedding vectors; the word embedding vector obtained after inputting sentence x_j into the i-th cross-language pre-training model M_i is denoted as

E_j^i = (e_1^i, e_2^i, …, e_{|x_j|}^i)

where e_l^i denotes the embedding of the l-th word w_l of sentence x_j obtained from the cross-language pre-training model M_i, and |x_j| denotes the total number of words in sentence x_j.
The N different word embedding vectors obtained for sentence x_j are fused through the word-level attention mechanism into the final word embedding vector E_j = (e_1, e_2, …, e_{|x_j|}), where e_l, the l-th word embedding in E_j, is computed as:

α_l^i = softmax_i( v_a^T tanh(W_a e_l^i + b_a) )

e_l = Σ_{i=1}^{N} α_l^i e_l^i

where v_a, W_a and b_a are trainable parameters and v_a^T denotes the transpose of v_a.
In the word-level attention mechanism provided by the invention, the weights differ across dimensions and across words; for example, a sentiment expression word attends to sentiment information, while a target word attends to entity information.
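As an illustration, the word-level attention fusion can be sketched in numpy roughly as follows. The scalar-score formulation v_a^T tanh(W_a e + b_a) followed by a softmax over the N models is one plausible reading of the mechanism, and all shapes and variable names here are illustrative assumptions.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def word_attention(E, W_a, b_a, v_a):
    """Fuse the embeddings of one word coming from N pre-trained models.

    E   : (N, d) embeddings e_l^i of word l from the N models
    W_a : (d_a, d), b_a : (d_a,), v_a : (d_a,) trainable parameters
    Returns the fused embedding e_l of shape (d,).
    """
    scores = np.tanh(E @ W_a.T + b_a) @ v_a   # one scalar score per model
    alpha = softmax(scores)                   # attention weights over models
    return alpha @ E                          # weighted sum of the N embeddings

rng = np.random.default_rng(0)
N, d, d_a = 3, 4, 5
E = rng.normal(size=(N, d))
e_l = word_attention(E, rng.normal(size=(d_a, d)),
                     rng.normal(size=d_a), rng.normal(size=d_a))
```

Because the weights form a convex combination, the fused embedding always lies within the per-dimension range of the N model embeddings.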
Step S1012, adversarial word embedding: the word-level attention mechanism yields semantically rich word representations; however, typological and semantic differences between the source and target languages make the cross-language transfer model unstable. Therefore, to improve the robustness of the word embeddings, the invention applies adversarial training to the input word embedding space of the cross-language transfer. Existing research has shown that adversarial training is an effective regularization technique that improves robustness by adding small perturbations to the input.
Specifically, during training, let r_j = (r_1, r_2, …, r_{|x_j|}) denote the perturbation for sentence x_j, where r_l denotes the perturbation of the word embedding e_l of the l-th word of x_j. The word embedding vector of sentence x_j after adding the perturbation r_j is denoted E_j^adv, with

E_j^adv = E_j + r_j = (e_1 + r_1, e_2 + r_2, …, e_{|x_j|} + r_{|x_j|})
In training, the worst-case perturbation r_j* for sentence x_j is obtained as:

r_j* = argmax_{r_j} L_j(E_j + r_j; θ̂)    s.t. ||r_j|| < ε

where θ̂ denotes the current estimate of the model parameters, L_j(·) denotes the loss on the j-th sample, ||·|| denotes the L1 norm, and ε is a parameter used to control the degree of the perturbation.
Since r_j* cannot be computed exactly, it is approximated as:

g_l = ∇_{e_l} L_j(E_j; θ̂)

r_l* = ε · g_l / ||g||_2

where g is the concatenation of the |x_j| per-word gradients g_l, ∇_{e_l} denotes the gradient with respect to the word embedding e_l, and ||·||_2 denotes the L2 norm.
Based on the adversarial perturbation r_j*, adversarial training minimizes the worst-case (maximum) adversarial loss induced by r_j* for sentence x_j. The adversarial training objective is set up as:

min_θ Σ_j L_j(E_j + r_j*; θ)

During training, the perturbation r_j* is added to sentence x_j to obtain the perturbed word embedding vector

E_j^adv = (e_1 + r_1*, e_2 + r_2*, …, e_{|x_j|} + r_{|x_j|}*)
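The gradient-based approximation of the worst-case perturbation (scaling the gradient to L2 norm ε) can be sketched as follows. In a real model the gradients would come from backpropagation; here they are passed in as an assumed input.

```python
import numpy as np

def adversarial_perturbation(grads, eps):
    """Approximate the worst-case perturbation r* = eps * g / ||g||_2.

    grads : (T, d) gradient of the loss w.r.t. each of the T word embeddings
    eps   : scalar controlling the perturbation magnitude
    """
    g = grads.reshape(-1)              # concatenation of the per-word gradients
    norm = np.linalg.norm(g) + 1e-12   # L2 norm; small guard against zero grads
    return eps * grads / norm

grads = np.array([[3.0, 0.0], [0.0, 4.0]])   # ||g||_2 = 5
r = adversarial_perturbation(grads, eps=1.0)
E_adv = np.ones((2, 2)) + r                  # perturbed embeddings E + r*
```

The resulting perturbation has total L2 norm exactly ε, so the constraint on the perturbation degree is satisfied by construction.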
Step S102, constructing and training a syntactic GCN encoder: the emphasis of the adversarial embedding adapter is on learning a rich and robust distributed representation, while explicit knowledge is ignored. Therefore, to learn a cross-language structural representation, the invention introduces a syntactic GCN encoder that integrates the dependency parse tree into cross-language structured sentiment analysis, since the syntactic parse tree plays a crucial role in structured extraction. As shown in Fig. 1, it can be seen from the syntax tree that the holder ("the long-term observer group of the Southern African community") and the target ("President Mugabe") lie on one subtree. Furthermore, on the parse tree the expression is closer to the target (or holder) than it is in the linear sentence. All this suggests that a model can learn the structural relationships among targets, holders and expressions from words that are close on the parse tree. Second, the tree structures of two sentences with similar semantics are similar across languages, because they are annotated according to the same linguistic syntactic rules. Thus, the invention proposes a syntactic GCN encoder to model explicit structural knowledge for cross-language transfer.
During training, the training corpus is first obtained, and a graph is constructed for each sentence based on its syntactic parse tree; the degree matrix of the graph is then computed, and the syntactic GCN encoder is obtained from the graph and the degree matrix. The perturbed word embedding vectors obtained in step S101 are input into the syntactic GCN encoder to obtain structured representations in a unified space, i.e., informative and robust structured hidden representations.
During training, this embodiment uses the open-source tool Stanza to obtain the syntactic parse tree of each sentence x_j in the training corpus; the set of relations (edges) of the parse tree is denoted E_j. A graph G_j = (V_j, E_j) is constructed for sentence x_j, where V_j = {v_1, v_2, …, v_{|x_j|}} and v_l, the l-th node, corresponds to the l-th word w_l of sentence x_j. Based on graph G_j, an adjacency matrix A is built: if there is an edge between node v_m and node v_n in graph G_j, then A_mn = 1, otherwise A_mn = 0, where A_mn is the element in the m-th row and n-th column of A. The degree matrix D of graph G_j is then obtained, with D_mm = Σ_n A_mn, where D_mm is the element in the m-th row and m-th column of D and all off-diagonal elements of D are 0.
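Building the adjacency matrix and degree matrix from the dependency edges of a parse can be sketched as follows; treating the dependency edges as undirected is an assumption of this sketch, and the edge list format is illustrative (a parser such as Stanza would supply head/dependent pairs).

```python
import numpy as np

def build_graph_matrices(num_words, edges):
    """Adjacency matrix A and degree matrix D for a dependency parse graph.

    edges : list of (head, dependent) word-index pairs from the parse tree.
    """
    A = np.zeros((num_words, num_words))
    for m, n in edges:
        A[m, n] = 1.0
        A[n, m] = 1.0              # assume undirected edges for the sketch
    D = np.diag(A.sum(axis=1))     # D_mm = sum_n A_mn, off-diagonal zero
    return A, D

# toy 3-word sentence: word 1 is the root; words 0 and 2 depend on it
A, D = build_graph_matrices(3, [(1, 0), (1, 2)])
```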
Based on graph G_j, the adjacency matrix A and the degree matrix D, the syntactic GCN encoder for sentence x_j is constructed. The encoder has P+1 graph convolution layers in total; the hidden representation H^(p) of the p-th graph convolution layer is learned from the hidden representation of the adjacent (p-1)-th graph convolution layer, and h_l^(p) denotes the hidden representation of the l-th word w_l of sentence x_j at the p-th layer. H^(p) is computed as:

H^(p) = σ( D^(-1/2) A D^(-1/2) H^(p-1) W^(p) )

where σ(·) is a nonlinear activation function, W^(p) is a trainable parameter, and H^(0) is the perturbed word embedding vector E_j^adv.
The perturbed word embedding vector E_j^adv of sentence x_j obtained in step S101 is input into the syntactic GCN encoder, and the structured representation of the unified space, H^(P) = (h_1^(P), h_2^(P), …, h_{|x_j|}^(P)), is obtained from the (P+1)-th layer, where h_l^(P) denotes the finally obtained hidden representation of the l-th word w_l of sentence x_j; this yields informative and robust structured hidden representations.
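A single graph-convolution layer of the kind used by the syntactic GCN encoder can be sketched as follows. Adding self-loops (A + I) and using ReLU as the activation are common choices assumed for this sketch, not prescribed by the patent text.

```python
import numpy as np

def gcn_layer(H, A, W):
    """One graph-convolution layer: H' = ReLU(D^{-1/2} A~ D^{-1/2} H W),
    where A~ = A + I adds self-loops so each word keeps its own features."""
    A_hat = A + np.eye(A.shape[0])
    d = A_hat.sum(axis=1)                    # node degrees of A~
    D_inv_sqrt = np.diag(1.0 / np.sqrt(d))   # symmetric normalization
    return np.maximum(0.0, D_inv_sqrt @ A_hat @ D_inv_sqrt @ H @ W)

# toy 3-word chain graph; identity embeddings and weights keep it traceable
A = np.array([[0., 1., 0.], [1., 0., 1.], [0., 1., 0.]])
H0 = np.eye(3)    # H^(0): toy word embeddings
W = np.eye(3)     # identity weights for the sketch
H1 = gcn_layer(H0, A, W)
```

Stacking P+1 such layers, with H^(0) set to the perturbed embeddings, yields the structured hidden representations H^(P).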
Step S103, constructing and training a decoder: based on the informative and robust structured hidden representations, the invention adopts a simple decoding strategy for the four subtasks. First, the invention extracts opinion words by predicting their start and end positions; these opinion words are treated as the trigger words of each opinion. The invention then extracts the target and the holder, and predicts the sentiment polarity of the given expression.
Step S103 specifically includes the following steps:
Step S1031, opinion word extraction: to extract the opinion words in a sentence, during training the invention uses two binary classifiers to predict the probability p_l^{e_k,s} that the l-th word w_l of sentence x_j is the start position of opinion term e_k, and the probability p_l^{e_k,e} that it is the end position, as follows:

p_l^{e_k,s} = sigmoid( W_s h_l^(P) + b_s )

p_l^{e_k,e} = sigmoid( W_e h_l^(P) + b_e )

where W_s, b_s, W_e and b_e are learnable parameters.

Cross entropy is used as the loss function L_e, giving:

L_e = Σ_l [ CE( y_l^{e_k,s}, p_l^{e_k,s} ) + CE( y_l^{e_k,e}, p_l^{e_k,e} ) ]

where CE(·) denotes the cross-entropy function, and y_l^{e_k,s} and y_l^{e_k,e} are the labels of the start and end positions of the sample opinion term e_k: if l is the start (or end) position of e_k, the corresponding label equals 1.

In prediction, if p_l^{e_k,s} > 0.5, l is taken as a start position of opinion term e_k; if p_l^{e_k,e} > 0.5, l is taken as an end position of opinion term e_k.
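The per-word start/end probability prediction and the 0.5-threshold decoding can be sketched as follows; pairing each predicted start with the nearest following predicted end is an assumed decoding heuristic for this example, and all parameter names are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def predict_spans(H, w_s, b_s, w_e, b_e, thresh=0.5):
    """Start/end probabilities per word, plus decoded (start, end) spans."""
    p_start = sigmoid(H @ w_s + b_s)   # P(word l is a span start)
    p_end = sigmoid(H @ w_e + b_e)     # P(word l is a span end)
    spans = []
    for s in np.flatnonzero(p_start > thresh):
        ends = np.flatnonzero(p_end > thresh)
        ends = ends[ends >= s]         # nearest end at or after the start
        if ends.size:
            spans.append((int(s), int(ends[0])))
    return p_start, p_end, spans

# toy hidden states (3 words, 1-dim) with hand-set weights so that
# word 0 is a start and word 1 is an end
H = np.array([[1.0], [-1.0], [0.0]])
p_s, p_e, spans = predict_spans(H, np.array([4.0]), -2.0,
                                np.array([-4.0]), -2.0)
```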
Step S1032, target word extraction: to take the opinion information into account when extracting the target, during training the invention predicts, conditioned on the opinion term representation, the probability p_l^{t_k,s} that the l-th word w_l of sentence x_j is the start position of target t_k, and the probability p_l^{t_k,e} that it is the end position, as follows:

p_l^{t_k,s} = sigmoid( W_s^t [ h_l^(P); h_{s(e_k)}^(P); h_{e(e_k)}^(P) ] + b_s^t )

p_l^{t_k,e} = sigmoid( W_e^t [ h_l^(P); h_{s(e_k)}^(P); h_{e(e_k)}^(P) ] + b_e^t )

where W_s^t, b_s^t, W_e^t and b_e^t are learnable parameters; h_{s(e_k)}^(P) is the hidden-layer representation, obtained by the syntactic GCN encoder, of the word at the start position of opinion term e_k, and h_{e(e_k)}^(P) is that of the word at the end position of e_k; [a; b] denotes the concatenation of a and b.

The loss function of target word extraction is L_t, giving:

L_t = Σ_l [ CE( y_l^{t_k,s}, p_l^{t_k,s} ) + CE( y_l^{t_k,e}, p_l^{t_k,e} ) ]

where y_l^{t_k,s} and y_l^{t_k,e} are the labels of the start and end positions of the sample target t_k: if l is the start (or end) position of t_k, the corresponding label equals 1.

In prediction, if p_l^{t_k,s} > 0.5, l is taken as a start position of target t_k; if p_l^{t_k,e} > 0.5, l is taken as an end position of target t_k.
Step S1033, holder extraction: as with target word extraction, during training the invention predicts, based on the opinion term representation, the probability p_l^{h_k,s} that the l-th word w_l of sentence x_j is the start position of holder h_k, and the probability p_l^{h_k,e} that it is the end position, as follows:

p_l^{h_k,s} = sigmoid( W_s^h [ h_l^(P); h_{s(e_k)}^(P); h_{e(e_k)}^(P) ] + b_s^h )

p_l^{h_k,e} = sigmoid( W_e^h [ h_l^(P); h_{s(e_k)}^(P); h_{e(e_k)}^(P) ] + b_e^h )

where W_s^h, b_s^h, W_e^h and b_e^h are learnable parameters.

The loss function of holder extraction is L_h, giving:

L_h = Σ_l [ CE( y_l^{h_k,s}, p_l^{h_k,s} ) + CE( y_l^{h_k,e}, p_l^{h_k,e} ) ]

where y_l^{h_k,s} and y_l^{h_k,e} are the labels of the start and end positions of the sample holder h_k: if l is the start (or end) position of h_k, the corresponding label equals 1.

In prediction, if p_l^{h_k,s} > 0.5, l is taken as a start position of holder h_k; if p_l^{h_k,e} > 0.5, l is taken as an end position of holder h_k.
Step S1034, sentiment polarity prediction:

Finally, the invention aims to predict the sentiment polarity p_k associated with the given opinion term e_k. In training, the invention uses max-pooling to obtain the sentence representation r_s = Maxpooling(H^(P)) of sentence x_j, and concatenates it with the opinion term representation for polarity classification, obtaining the probability that sentence x_j expresses sentiment polarity p_k:

P(p_k) = softmax( W_p [ r_s; h_{s(e_k)}^(P); h_{e(e_k)}^(P) ] + b_p )

where W_p and b_p are learnable parameters.

Based on this sentiment probability distribution, the loss function designed by the invention is:

L_p = CE( y_{p_k}, P(p_k) )

where y_{p_k} is the label of the sample sentiment polarity p_k.
It should be noted that, since the opinion tuple (target t_k, opinion term e_k, holder h_k) is expressed through the opinion term, the invention incorporates the representation of the opinion term e_k when predicting the sentiment.
Finally, through the decoder model, the method can extract all the opinion tuples contained in the sentence, where each opinion tuple contains the holder, the target word, the opinion word and the sentiment polarity.
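The polarity-prediction step (max-pooling over word hidden states, concatenation with the opinion-term representations, then a softmax classifier) can be sketched as follows; the three-way label set and all parameter values are assumptions of this example, not specified by the patent.

```python
import numpy as np

def softmax(z):
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def predict_polarity(H, h_start, h_end, W_p, b_p,
                     labels=("positive", "neutral", "negative")):
    """Sentence representation via max-pooling, concatenated with the
    opinion-term start/end representations, classified with softmax."""
    r_s = H.max(axis=0)                          # Maxpooling(H^(P))
    feat = np.concatenate([r_s, h_start, h_end]) # [r_s; h_start; h_end]
    probs = softmax(W_p @ feat + b_p)
    return labels[int(np.argmax(probs))], probs

# toy inputs: 2 words with 2-dim hidden states; weights hand-set so that
# the first ("positive") logit dominates
H = np.array([[1.0, 2.0], [3.0, 0.0]])
h_start = np.array([1.0, 1.0])
h_end = np.array([0.0, 0.0])
W_p = np.zeros((3, 6)); W_p[0] = 1.0
label, probs = predict_polarity(H, h_start, h_end, W_p, np.zeros(3))
```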
Step S104: for any sentence x obtained in real time, the trained adversarial embedding adapter is used to obtain its perturbed word embedding vector, which is input into the trained syntactic GCN encoder to obtain the hidden-layer representation of each word in sentence x; finally, the trained decoder extracts all opinion tuples contained in sentence x.
Aiming at the problem of missing data for structured events, the invention provides a knowledge-enhanced method for cross-language structured transfer. The invention combines recent techniques such as graph convolutional networks and adversarial models, and builds a cross-language, cross-form structured extraction model according to the characteristics of the structured extraction task. The invention mainly differs from traditional structured extraction methods in the following two aspects.
(1) The adversarial embedding adapter:

The goal of the invention is to learn word embeddings that are informative and robust for cross-language transfer. First, a word-level attention mechanism is designed to capture the important implicit distributed semantic information of multiple embeddings pre-trained on different corpora with different training strategies and tasks. Then, an adversarial training strategy is employed to improve the robustness of the embeddings.
(2) GCN-based structured representation learning:

The emphasis of the adversarial embedding adapter is on learning a rich and robust distributed representation, while explicit knowledge is ignored. Therefore, modeling explicit semantic and structural information with a GCN is well suited to structured sentiment analysis.
In the present invention, a model for cross-language structured sentiment analysis is presented. An adversarial embedding adapter is designed to learn informative and robust embeddings, and a syntactic GCN encoder is then introduced to learn a parse-tree-based structural representation. The model is compared with supervised and unsupervised baselines on five datasets in four languages. Experimental results show that the model has clear advantages in cross-language transfer. Ablation studies were also conducted to demonstrate the effectiveness of each module of the model.
(1) Performance of the knowledge-enhanced cross-language structured sentiment analysis model

The invention aims to use knowledge learned in one language to improve generalization ability in another. To verify the effectiveness of the model, it is evaluated on cross-language transfer: trained on the source language and tested on the target language without labeled data. Twenty-five transfer tasks were performed on five datasets in four low-resource languages, comparing the model of the invention with supervised and unsupervised baselines, as shown in Table 1 below.
Table 1 (rendered as an image in the original document) is an evaluation table of the experimental results.
From this table, the following observations can be made. The transfer model of the invention clearly outperforms the unsupervised baselines in cross-language structured sentiment analysis. In particular, the model of the invention achieves the best performance on all metrics across the datasets; notably, target F1 improves by more than three points on three of the datasets. All this shows that the informative structural representation of the invention helps the model transfer structured sentiment between different languages.
(2) Effectiveness of the adversarial embedding adapter and GCN-based structured representation learning
To investigate the effectiveness of each module of the model of the invention, ablation experiments were performed on the five datasets, as shown in Table 2 below; the average score over all source datasets other than the target dataset is reported. Specifically, the invention removes the adversarial embedding adapter (-AEA) and the syntactic GCN encoder (-SGCNE) from the model, respectively.
Table 2 (rendered as an image in the original document) shows the results of the ablation experiments.
The results indicate that both the adversarial embedding adapter and the syntactic GCN encoder are important for this task. In particular, the adversarial embedding adapter captures the diversity of the underlying features from various multilingual embeddings that carry different semantic information: it learns informative embeddings through the attention mechanism and improves robustness through the adversarial training strategy, and the resulting word embeddings substantially improve cross-language transfer performance. Meanwhile, the structural representation learned by the syntactic GCN encoder further improves the model, since the dependency parse tree is crucial for structured sentiment.

Claims (4)

1. A knowledge-enhancement-based cross-language structured sentiment analysis method, characterized in that a training corpus of the source language is adopted, whose labels are sets of opinion tuples, where j denotes the sample index and each sample contains a number of opinion tuples; in the set of opinion tuples, the k-th opinion tuple o_k = (h_k, t_k, e_k, p_k) represents that the holder h_k of the k-th opinion tuple o_k expresses sentiment polarity p_k toward the target t_k through the opinion term e_k, wherein the holder h_k, the target t_k and the opinion term e_k are substrings of the j-th sentence x_j in the training corpus, each characterized by its start position in sentence x_j, the word of x_j at that start position, its end position in sentence x_j, and the word of x_j at that end position; the cross-language structured sentiment analysis method specifically comprises the following steps:
step S101, constructing and training an adversarial embedding adapter: when constructing the adversarial embedding adapter, a word-level attention mechanism is designed to capture the important implicit distributed semantics of multiple embeddings pre-trained on different corpora with different training strategies and tasks, and an adversarial training strategy is then adopted to improve the robustness of the word embeddings; when training the adversarial embedding adapter, a set of N cross-language pre-training models is obtained, and each sentence in the training corpus is input into each of the N cross-language pre-training models, so as to obtain word embedding vectors for each sentence; for any sentence in the training corpus, the word embedding vectors obtained from the N cross-language pre-training models are fused through the word-level attention mechanism to obtain the final word embedding vector corresponding to each sentence, and finally a perturbation is added to the final word embedding vector;
step S102, constructing and training a syntactic GCN encoder:

the training corpus is obtained, a graph is constructed for each sentence based on its syntactic parse tree, the degree matrix of the graph is then computed, and the syntactic GCN encoder is obtained from the graph and the degree matrix; the perturbed word embedding vectors obtained in step S101 are input into the syntactic GCN encoder to obtain structured representations in a unified space, i.e., informative and robust structured hidden representations;
step S103, constructing and training a decoder:

based on the informative and robust structured hidden representations, opinion words are extracted by predicting their start and end positions and are treated as the trigger words of each opinion; the target and the holder are then extracted, and the sentiment polarity of the given expression is predicted;

step S104: for any sentence x obtained in real time, the trained adversarial embedding adapter is used to obtain its perturbed word embedding vector, which is input into the trained syntactic GCN encoder to obtain the hidden-layer representation of each word in sentence x; finally, the trained decoder extracts all opinion tuples contained in sentence x.
2. The knowledge-enhancement-based cross-language structured sentiment analysis method of claim 1, wherein step S101 specifically comprises the following steps:
step S1011, obtaining the word embedding vector of the j-th sentence x_j in the training corpus, comprising the following steps:

sentence x_j is input into each of the N cross-language pre-training models, yielding N different word embedding vectors; the word embedding vector obtained after inputting sentence x_j into the i-th cross-language pre-training model M_i is denoted as

E_j^i = (e_1^i, e_2^i, …, e_{|x_j|}^i)

where e_l^i denotes the embedding of the l-th word w_l of sentence x_j obtained from the cross-language pre-training model M_i, and |x_j| denotes the total number of words in sentence x_j;

the N different word embedding vectors obtained for sentence x_j are fused through the word-level attention mechanism into the final word embedding vector E_j = (e_1, e_2, …, e_{|x_j|}), where e_l, the l-th word embedding in E_j, is computed as:

α_l^i = softmax_i( v_a^T tanh(W_a e_l^i + b_a) )

e_l = Σ_{i=1}^{N} α_l^i e_l^i

where v_a, W_a and b_a are trainable parameters and v_a^T denotes the transpose of v_a;
step S1012, let r_j = (r_1, r_2, …, r_{|x_j|}) denote the perturbation for sentence x_j, where r_l denotes the perturbation of the word embedding e_l of the l-th word of x_j; the word embedding vector of sentence x_j after adding the perturbation r_j is denoted E_j^adv, with

E_j^adv = E_j + r_j = (e_1 + r_1, e_2 + r_2, …, e_{|x_j|} + r_{|x_j|});

the worst-case perturbation r_j* for sentence x_j is obtained as:

r_j* = argmax_{r_j} L_j(E_j + r_j; θ̂)    s.t. ||r_j|| < ε;

r_j* is approximated as:

g_l = ∇_{e_l} L_j(E_j; θ̂)

r_l* = ε · g_l / ||g||_2

where g is the concatenation of the |x_j| per-word gradients g_l, ∇_{e_l} denotes the gradient with respect to the word embedding e_l, ||·||_2 denotes the L2 norm, L_j(·) denotes the loss on the j-th sample, and ε is a parameter used to control the degree of the perturbation;

based on the adversarial perturbation r_j*, adversarial training minimizes the worst-case (maximum) adversarial loss induced by r_j* for sentence x_j, with the adversarial training objective set up as:

min_θ Σ_j L_j(E_j + r_j*; θ);

during training, the perturbation r_j* is added to sentence x_j to obtain the perturbed word embedding vector

E_j^adv = (e_1 + r_1*, e_2 + r_2*, …, e_{|x_j|} + r_{|x_j|}*).
3. The knowledge-enhancement-based cross-language structured sentiment analysis method of claim 1, wherein step S102 comprises the following steps:
during training, the open-source tool Stanza is used to obtain the syntactic parse tree of each sentence x_j in the training corpus; the set of relations (edges) of the parse tree is denoted E_j; a graph G_j = (V_j, E_j) is constructed for sentence x_j, where V_j = {v_1, v_2, …, v_{|x_j|}} and v_l, the l-th node, corresponds to the l-th word w_l of sentence x_j; based on graph G_j, an adjacency matrix A is built: if there is an edge between node v_m and node v_n in graph G_j, then A_mn = 1, otherwise A_mn = 0, where A_mn is the element in the m-th row and n-th column of A; the degree matrix D of graph G_j is then obtained, with D_mm = Σ_n A_mn, where D_mm is the element in the m-th row and m-th column of D and all off-diagonal elements of D are 0;

based on graph G_j, the adjacency matrix A and the degree matrix D, the syntactic GCN encoder for sentence x_j is constructed; the hidden representation H^(p) of the p-th graph convolution layer is learned from the hidden representation of the adjacent (p-1)-th graph convolution layer, and h_l^(p) denotes the hidden representation of the l-th word w_l of sentence x_j at the p-th layer; H^(p) is computed as:

H^(p) = σ( D^(-1/2) A D^(-1/2) H^(p-1) W^(p) )

where σ(·) is a nonlinear activation function, W^(p) is a trainable parameter, and H^(0) is the perturbed word embedding vector E_j^adv;

the perturbed word embedding vector E_j^adv of sentence x_j obtained in step S101 is input into the syntactic GCN encoder, and the structured representation of the unified space, H^(P) = (h_1^(P), h_2^(P), …, h_{|x_j|}^(P)), is obtained from the (P+1)-th layer, where h_l^(P) denotes the finally obtained hidden representation of the l-th word w_l of sentence x_j; this yields informative and robust structured hidden representations.
4. The knowledge-enhancement-based cross-language structured sentiment analysis method of claim 3, wherein step S103 specifically comprises the following steps:
step S1031, viewpoint word extraction: to extract the opinion words in sentences, during training, two binary classifiers are used to predict sentence x j The first word w in l Is the term of point of view e k Probability of starting position of
Figure FDA0003608720260000046
Or probability of end position
Figure FDA0003608720260000047
As shown in the following formula:
Figure FDA0003608720260000048
Figure FDA0003608720260000049
in the formula (I), the compound is shown in the specification,
Figure FDA00036087202600000410
are parameters that can be learned;
cross entropy is used as a loss function
Figure FDA00036087202600000411
Then there are:
Figure FDA00036087202600000412
in the formula: CE (·) represents a cross-entropy function;
Figure FDA00036087202600000413
is a sample viewpoint word e k Labels for the start and end positions, if l is the term e k A start or end position of, then
Figure FDA00036087202600000414
Equal to 1;
step S1032, target word extraction: to take into account the opinion information when extracting the target, the predicted sentence x is predicted from the opinion term tokens when training j The first word w in l Is a target t k Probability of starting position of
Figure FDA00036087202600000415
Or probability of end position
Figure FDA00036087202600000416
As shown in the following formula:
Figure FDA00036087202600000417
Figure FDA00036087202600000418
in the formula,
Figure FDA0003608720260000051
Are parameters that can be learned;
Figure FDA0003608720260000052
is a term of view e obtained by a syntactic GCN encoder k A hidden layer representation of the word at the start position,
Figure FDA0003608720260000053
is a term of view e obtained by a syntactic GCN encoder k A hidden layer representation of the word at the end position of (a); [ a; b]The connection operation of a and b is shown;
the loss function of the target word extraction is
Figure FDA0003608720260000054
Then there are:
Figure FDA0003608720260000055
in the formula (I), the compound is shown in the specification,
Figure FDA0003608720260000056
is the sample object t k Labels for start and end positions, if l is the target t k A start or end position of, then
Figure FDA0003608720260000057
Equal to 1;
step S1033, holder extraction: during training, the probability $\hat{p}_l^{h_s}$ that the $l$-th word $w_l$ in sentence $x_j$ is the start position of holder $h_k$, or the probability $\hat{p}_l^{h_e}$ that it is the end position, is predicted from the opinion word representation, as shown in the following formulas:
$\hat{p}_l^{h_s} = \operatorname{softmax}(W^{h_s} [h_l; h_{e_s}; h_{e_e}] + b^{h_s})$
$\hat{p}_l^{h_e} = \operatorname{softmax}(W^{h_e} [h_l; h_{e_s}; h_{e_e}] + b^{h_e})$
in the formulas, $W^{h_s}$, $W^{h_e}$, $b^{h_s}$ and $b^{h_e}$ are learnable parameters.
The loss function of holder extraction is $\mathcal{L}_h$, then:
$\mathcal{L}_h = \sum_{l=1}^{n} \left[ \operatorname{CE}(\hat{p}_l^{h_s}, y_l^{h_s}) + \operatorname{CE}(\hat{p}_l^{h_e}, y_l^{h_e}) \right]$
in the formula, $y_l^{h_s}$ and $y_l^{h_e}$ are the labels of the start and end positions of the sample holder $h_k$: if $l$ is the start or end position of $h_k$, the corresponding label equals 1;
step S1034, emotion polarity prediction:
during training, max pooling is used to obtain the sentence representation $r_s = \operatorname{Maxpooling}(H^{(P)})$ of sentence $x_j$, which is concatenated with the opinion word representation for polarity classification, giving the probability that sentence $x_j$ expresses emotion polarity $p_k$:
$\hat{p}_k = \operatorname{softmax}(W^{p} [r_s; h_{e_s}; h_{e_e}] + b^{p})$
in the formula, $W^{p}$ and $b^{p}$ are learnable parameters;
based on the predicted emotion probability distribution, the loss function $\mathcal{L}_p$ is as follows:
$\mathcal{L}_p = \operatorname{CE}(\hat{p}_k, y_k^{p})$
in the formula, $y_k^{p}$ is the label of the sample emotion polarity $p_k$.
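The polarity head can be sketched the same way: element-wise max pooling over the word hidden states gives $r_s$, which is concatenated with the opinion boundary representations and classified. A numpy illustration follows; the three polarity classes, dimensions, and random weights are assumptions for the sketch, not details from the patent.

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over the last axis."""
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def polarity_probs(H, es, ee, W, b):
    """r_s = Maxpooling(H); p_k = softmax(W [r_s; h_es; h_ee] + b)."""
    r_s = H.max(axis=0)                    # element-wise max over all words
    feats = np.concatenate([r_s, H[es], H[ee]])
    return softmax(feats @ W + b)

rng = np.random.default_rng(2)
n, d, n_pol = 6, 8, 3          # 3 polarities assumed (e.g. pos / neg / neutral)
H = rng.normal(size=(n, d))    # hidden states H^{(P)}
W, b = rng.normal(size=(3 * d, n_pol)), np.zeros(n_pol)
p_k = polarity_probs(H, es=2, ee=3, W=W, b=b)   # opinion spans words 2..3
y_k = np.eye(n_pol)[0]                          # one-hot gold polarity label
loss_p = -np.sum(y_k * np.log(p_k + 1e-12))     # CE(p_k, y_k)
print(p_k.shape, round(float(loss_p), 4))
```

In a full training run, the four losses from steps S1031–S1034 would typically be combined (e.g. summed) and minimized jointly over the learnable parameters.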
CN202210423028.0A 2022-04-21 2022-04-21 Knowledge enhancement-based cross-language structured emotion analysis method Pending CN114970557A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210423028.0A CN114970557A (en) 2022-04-21 2022-04-21 Knowledge enhancement-based cross-language structured emotion analysis method

Publications (1)

Publication Number Publication Date
CN114970557A 2022-08-30

Family

ID=82978789

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210423028.0A Pending CN114970557A (en) 2022-04-21 2022-04-21 Knowledge enhancement-based cross-language structured emotion analysis method

Country Status (1)

Country Link
CN (1) CN114970557A (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115204183A (en) * 2022-09-19 2022-10-18 华南师范大学 Knowledge enhancement based dual-channel emotion analysis method, device and equipment
CN115204183B (en) * 2022-09-19 2022-12-27 华南师范大学 Knowledge enhancement-based two-channel emotion analysis method, device and equipment


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination