CN114970557A - Knowledge enhancement-based cross-language structured emotion analysis method - Google Patents
Knowledge enhancement-based cross-language structured emotion analysis method
- Publication number
- CN114970557A (application CN202210423028.0A)
- Authority
- CN
- China
- Prior art keywords
- word
- sentence
- training
- embedding
- language
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/30—Semantic analysis
- G06F40/35—Discourse or dialogue representation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/30—Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
- G06F16/35—Clustering; Classification
- G06F16/353—Clustering; Classification into predefined classes
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/211—Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/205—Parsing
- G06F40/216—Parsing using statistical methods
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/253—Grammatical analysis; Style critique
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/042—Knowledge-based neural networks; Logical representations of neural networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/045—Combinations of networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/04—Architecture, e.g. interconnection topology
- G06N3/047—Probabilistic or stochastic networks
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06N—COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00—Computing arrangements based on biological models
- G06N3/02—Neural networks
- G06N3/08—Learning methods
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- Artificial Intelligence (AREA)
- General Health & Medical Sciences (AREA)
- Health & Medical Sciences (AREA)
- Computational Linguistics (AREA)
- Data Mining & Analysis (AREA)
- Life Sciences & Earth Sciences (AREA)
- Evolutionary Computation (AREA)
- Molecular Biology (AREA)
- Computing Systems (AREA)
- Biomedical Technology (AREA)
- Biophysics (AREA)
- Mathematical Physics (AREA)
- Software Systems (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Probability & Statistics with Applications (AREA)
- Databases & Information Systems (AREA)
- Machine Translation (AREA)
Abstract
The invention relates to a knowledge enhancement-based cross-language structured emotion analysis method. In the present invention, an adversarial embedding adapter is designed: semantically rich word-embedding representations are dynamically learned through a word-level attention mechanism. Meanwhile, in order to improve the robustness of the representation, an adversarial mechanism is used to add perturbations to the word embeddings. The invention also designs an encoding layer based on a graph neural network: structured knowledge (e.g., syntactic parse trees) is important to the structured emotion analysis task, and although word order differs between languages, their syntactic structures are similar. To this end, the present invention incorporates structured knowledge (e.g., syntactic structures) into the model to learn a structured representation. Finally, the invention performs a decoding operation through a decoding layer to extract the target, the holder, the viewpoint words and the emotion polarity information contained in the text.
Description
Technical Field
The invention relates to a structured emotion analysis method.
Background
With the continued rise of social media, both users and the content they produce are growing at an explosive rate, which has radically changed the way information is received and disseminated by the public and by enterprises. With data on the order of tens of millions of news items per day, structured sentiment analysis is highly valuable work. For example, a media worker can train a sentiment analysis model on the large number of movie reviews on the internet to learn which movies people like and dislike; an investor can build a model that helps with stock market prediction, estimating how optimistic people are about a stock from their posts in forums; a government worker can use a sentiment analysis model to evaluate the emotional reactions of people watching Tutt speeches and thereby analyze how well the speeches are received. For this reason, structured sentiment analysis has been proposed; it can identify the sentiment that users express on social platforms about real-time events such as financial news, sports, weather and entertainment, and is crucial for many applications.
Specifically, structured emotion analysis refers to extracting structured knowledge (such as targets, opinion words, holders, etc.) from text and predicting their emotions, and is an important research direction in the field of Natural Language Processing (NLP). The task comprises two subtasks: structured extraction and emotion analysis. First, the structured extraction subtask automatically extracts the main body and each component from the text and gives the relationships existing between the parts. Then, for the given structured data, its corresponding emotion is predicted. The task builds on entity extraction and relation extraction, but is more difficult than either, and involves methods and techniques from multiple disciplines such as natural language processing, machine learning and pattern matching. In recent years, with the development of deep neural networks, and especially the wide application of large-scale pre-training methods, the performance of the structured emotion analysis task has improved greatly.
However, because the annotation of the structured emotion analysis task is complex, its acquisition cost is high and the datasets are small, which greatly limits the effectiveness of neural network models. To this end, cross-language migration methods have been proposed for structured extraction, thereby reducing the need for annotated data. Most cross-language structure migration methods suffer from language-specific problems: they depend too heavily on bilingual dictionaries and parallel corpora, which require additional resources or tools. Feng et al. propose applying cross-language migration to the sequence labeling task without considering its complexity. Wang et al. use similarly distributed representation spaces across languages for relation extraction.
In recent years, multilingual pre-training models (e.g., mBERT, XLM, etc.) have enjoyed great success in cross-language migration. Liu et al. and Nguyen et al. apply multilingual word embeddings to cross-language tasks such as semantic role labeling, dependency parsing and named entity recognition. However, most current approaches model only a single pre-trained word vector. In fact, there are many cross-language pre-training models, each carrying different semantic information, because each pre-trained model has a different optimization objective and is trained on different datasets. Moreover, structured knowledge is widely used in structured extraction tasks, but how to use it for the cross-language structured emotion analysis task has not been fully studied.
Disclosure of Invention
The technical problem to be solved by the invention is as follows: with the development of pre-trained language models, many pre-training model structures, training objectives and training data have been successively proposed for cross-language structured extraction, so that different cross-language pre-training models carry different semantic information; however, how to make full use of all cross-language pre-training models to improve the word embedding representation has not yet been well studied. Meanwhile, structured knowledge is important for the structured emotion analysis task: structured elements (such as targets, holders, viewpoint words, etc.) are close to each other in the syntax tree, and different languages are annotated under the same syntactic rules so that their syntactic structures are similar; yet how to combine syntactic structure for the cross-language structured emotion analysis task has been ignored by current work.
In order to solve the above technical problems, the technical scheme of the invention is to provide a knowledge enhancement-based cross-language structured emotion analysis method, characterized in that a training corpus of a source language is adopted, whose labels are sets of viewpoint tuples, where the number of samples and the number of viewpoint tuples contained in each sample are given; in a set of viewpoint tuples, the k-th viewpoint tuple o_k = (h_k, t_k, e_k, p_k) represents that the holder h_k of the k-th viewpoint tuple o_k expresses, through the viewpoint word e_k, the emotional polarity p_k towards the target t_k, wherein: the holder h_k, the target t_k and the viewpoint word e_k are each a sub-string of the j-th sentence x_j in the training corpus, defined by their start positions in sentence x_j, the words at those start positions, their end positions in sentence x_j, and the words at those end positions. The cross-language structured emotion analysis method specifically comprises the following steps:
S101, constructing and training an adversarial embedding adapter: when constructing the adversarial embedding adapter, a word-level attention mechanism is designed to capture the important implicit distributed semantics of multiple embeddings pre-trained on different corpora with different training strategies and tasks, and an adversarial training strategy is then adopted to improve the robustness of word embedding;
when training the adversarial embedding adapter, a set of cross-language pre-training models is obtained, and each sentence in the training corpus is input into every cross-language pre-training model in the set so as to obtain word embedding vectors for each sentence; for any sentence in the training corpus, the word embedding vectors obtained from the cross-language pre-training models are fused through a word-level attention mechanism to obtain the final word embedding vector corresponding to each sentence, and finally a perturbation is added to the final word embedding vector;
step S102, constructing and training a grammar GCN encoder:
obtaining the training corpus, constructing a graph for each sentence based on its syntactic parse tree, then calculating the degree matrix of the graph, and obtaining the syntactic GCN encoder from the graph and the degree matrix; the perturbed word embedding vectors obtained in step S101 are input into the syntactic GCN encoder to obtain a structured representation in a unified space, thereby obtaining an informative and robust structured hidden representation;
step S103, constructing and training a decoder:
based on the informative and robust structured hidden representation, extracting viewpoint words by predicting the start and end positions of the viewpoint words, and regarding the viewpoint words as the trigger words of each viewpoint; then, extracting the target and the holder, and predicting the emotional polarity of the given expression;
step S104, for any sentence x obtained in real time, the trained adversarial embedding adapter is used to obtain the perturbed word embedding vector, which is input into the trained syntactic GCN encoder to obtain the hidden-layer representation of each word in sentence x; finally, the trained decoder is used to extract all viewpoint tuples contained in sentence x.
Preferably, the step S101 specifically includes the following steps:
step S1011, obtaining the word embedding vector of the j-th sentence x_j in the training corpus, comprising the following steps:
sentence x_j is input into each of the cross-language pre-training models respectively to obtain different word embedding vectors, where the word embedding vector obtained after inputting sentence x_j into the i-th cross-language pre-training model M_i is denoted accordingly; in the formula, the corresponding element represents the word embedding of the l-th word w_l in sentence x_j obtained by the cross-language pre-training model M_i, and |x_j| represents the total number of words in sentence x_j;
the different word embedding vectors obtained for sentence x_j are fused through the word-level attention mechanism to obtain the final word embedding vector E_j, where e_l is the embedding of the l-th word in the word embedding vector E_j, computed as:
in the formula, v_a, W_a and b_a are trainable parameters, and v_a^T denotes the transpose of v_a;
step S1012, let the perturbation for sentence x_j be given, where r_l denotes the perturbation for the word embedding e_l of the l-th word in sentence x_j; the word embedding of sentence x_j after adding the perturbation r_j is denoted accordingly, wherein:
wherein g is the concatenation of the |x_j| vectors g_l, g_l denotes the gradient of the loss with respect to the word embedding e_l, ||·||_2 denotes the L2 norm, ℓ(·) represents the loss for a single sample, and ε is a parameter used to control the degree of perturbation;
based on the adversarial perturbation, the adversarial training minimizes the worst-case (maximum) adversarial loss, thereby obtaining the worst-case perturbation for sentence x_j; the setup of the adversarial training is as follows:
during training, the perturbation is added to sentence x_j to obtain the perturbed word embedding vector.
Preferably, the step S102 includes the steps of:
during training, the relation set of the syntactic parse tree of sentence x_j in the training corpus is obtained and denoted as E_j; a graph G_j = (V_j, E_j) is constructed for sentence x_j, where v_l is the l-th node and corresponds to the l-th word w_l in sentence x_j; based on the graph G_j, an adjacency matrix A is established: if there is a connecting edge between node v_m and node v_n in the graph G_j, then A_mn = 1, otherwise A_mn = 0, where A_mn is the element in the m-th row and n-th column of the adjacency matrix A; the degree matrix D of the graph G_j is then obtained with D_mm = Σ_n A_mn, where D_mm is the element in the m-th row and m-th column of the degree matrix D and all off-diagonal elements of D are 0.
Based on the graph G_j, the adjacency matrix A and the degree matrix D, the syntactic GCN encoder for sentence x_j is constructed; the syntactic GCN encoder has P+1 graph convolution layers in total, and the hidden representation H^(p) of the p-th graph convolution layer is learned from the hidden representation of its adjacent (p-1)-th graph convolution layer; the hidden representation of the l-th word w_l of sentence x_j at the p-th layer is an element of H^(p), which is computed as follows:
The perturbed word embedding vector of sentence x_j obtained in step S101 is input into the syntactic GCN encoder, and the structured representation in a unified space is obtained from the (P+1)-th layer, which gives the final hidden-layer representation of each word w_l in sentence x_j; in this way an informative and robust structured hidden representation is obtained.
Preferably, the step S103 specifically includes the following steps:
step S1031, viewpoint word extraction: to extract the viewpoint words in a sentence, during training, two binary classifiers are used to predict the probability that the l-th word w_l in sentence x_j is the start position of the viewpoint word e_k, or the probability that it is the end position, as shown in the following formula:
in the formula, CE(·) represents the cross-entropy function; the labels are the gold start and end positions of the sample viewpoint word e_k, and if l is the start or end position of the viewpoint word e_k, the corresponding label is equal to 1;
step S1032, target word extraction: to take the viewpoint information into account when extracting the target, during training, the probability that the l-th word w_l in sentence x_j is the start position of the target t_k, or the probability that it is the end position, is predicted from the viewpoint word representation, as shown in the following formula:
in the formula, the weight matrices are learnable parameters; the two hidden-layer representations are, respectively, the hidden-layer representation produced by the syntactic GCN encoder for the word at the start position of the viewpoint word e_k and for the word at its end position; [a; b] denotes the concatenation of a and b;
in the formula, the labels are the gold start and end positions of the sample target t_k, and if l is the start or end position of the target t_k, the corresponding label is equal to 1;
step S1033, holder extraction: during training, the probability that the l-th word w_l in sentence x_j is the start position of the holder h_k, or the probability that it is the end position, is predicted from the viewpoint word representation, as shown in the following formula:
in the formula, the labels are the gold start and end positions of the sample holder h_k, and if l is the start or end position of the holder h_k, the corresponding label is equal to 1;
step S1034, emotion polarity prediction:
during training, max pooling is used to obtain the sentence representation r_s = Maxpooling(H^(P)) of sentence x_j, which is concatenated with the viewpoint word representation for polarity classification, giving the probability that sentence x_j expresses the emotional polarity p_k:
in the formula, the label is that of the sample emotional polarity p_k.
In the present invention, an adversarial embedding adapter is designed: semantically rich word-embedding representations are dynamically learned through a word-level attention mechanism. Meanwhile, in order to improve the robustness of the representation, an adversarial mechanism is used to add perturbations to the word embeddings. The invention also designs an encoding layer based on a graph neural network: structured knowledge (e.g., syntactic parse trees) is important to the structured emotion analysis task, and although word order differs between languages, their syntactic structures are similar. To this end, the present invention incorporates structured knowledge (e.g., syntactic structures) into the model to learn a structured representation. Finally, the invention performs a decoding operation through a decoding layer to extract the target, the holder, the viewpoint words and the emotion polarity information contained in the text.
The invention focuses on cross-language structured emotion analysis and provides a knowledge enhancement-based approach that trains on a source language and migrates to a target language for testing, thereby reducing the need for target-language annotation data and improving the extraction performance of the neural network model. To this end, the invention provides a knowledge-enhanced cross-language structured emotion analysis model which adds both implicit knowledge and explicit knowledge to the cross-language structured emotion analysis task. First, the invention designs an adversarial embedding adapter that adaptively combines multiple embedding representations using a word-level attention mechanism and an adversarial strategy, learning semantically rich and robust word representations. Furthermore, inspired by existing work, the invention integrates universal syntactic dependencies into cross-language structured emotion analysis. The syntax tree is important for the structured emotion analysis task, and at the same time different languages have similar syntactic structures, so that incorporating the syntactic structure into the task allows the structured representation to be learned well.
Drawings
FIG. 1 is a diagram of the knowledge-enhanced cross-language structured migration model of the present invention.
Detailed Description
The invention will be further illustrated with reference to the following specific examples. It should be understood that these examples are for illustrative purposes only and are not intended to limit the scope of the present invention. Further, it should be understood that various changes or modifications of the present invention may be made by those skilled in the art after reading the teaching of the present invention, and such equivalents may fall within the scope of the present invention as defined in the appended claims.
Existing structured emotion analysis datasets cover few languages and are small in scale, which greatly limits the performance of neural network models. Therefore, the knowledge enhancement-based cross-language structured emotion analysis method disclosed by the invention proposes a knowledge-enhanced cross-language structured migration, as shown in fig. 1. First, the invention designs an adversarial embedding adapter to adaptively capture implicit semantic information from multilingual embeddings and learn semantically rich and robust representations. In addition, the invention incorporates a syntactic GCN encoder to learn structural representations shared between different languages. Finally, the invention performs structured extraction based on the sentence representations learned by these two parts.
In the present invention, formalization is defined as follows:
given a corpus of a source languageWhose labels are sets of view tuplesWherein the content of the first and second substances,which is indicative of the number of samples,the representative sample contains the viewpoint tuple number. Set of viewpoint tuplesThe kth viewpoint tuple o k =(h k ,t k ,t k ,p k ) Represents the kth view tuple o k Holder of h k By the term of opinion e k For the target t k Expressing emotion polesProperty p k Wherein: is a training corpusThe jth sentence x in j The sub-string of (a) is, respectively the bearer h k Target t k Term "viewpoint" e k In sentence x j In the above-mentioned position of the start position,are respectively sentences x j InThe word at the location of the location, respectively the bearer h k Target t k Term "viewpoint" e k In sentence x j In the end position of (a) to (b), are respectively sentences x j In (1)The word at the location.
Based on the definition, the cross-language structured emotion analysis method based on knowledge enhancement provided by the invention specifically comprises the following steps of:
Step S101, constructing and training an adversarial embedding adapter, through which rich and robust word embeddings for cross-language migration are learned. To this end, when building the adversarial embedding adapter, the invention first designs a word-level attention mechanism to capture the important implicit distributed semantics of multiple embeddings pre-trained on different corpora with different training strategies and tasks. The invention then employs an adversarial training strategy to improve the robustness of word embedding.
Step S101 specifically includes the following steps:
step S1011, designing a word-level attention mechanism:
Multilingual pre-trained language models, such as mBERT and XLM, have been widely used for different cross-language tasks with great success. In fact, developers and researchers have released a large number of multilingual pre-trained language models; there are more than 100 such models on the Hugging Face website. These models carry different semantic information because they are trained on different large-scale datasets with different objectives and settings. For example, mBERT-base-cased and mBERT-base-uncased are trained on the 104 languages with the largest Wikipedias, using a Masked Language Modeling (MLM) objective on cased or uncased text respectively. The XLM-RoBERTa model was pre-trained on 100 languages with 2.5TB of data. However, most existing work uses only one of these models for word embedding. In order to obtain better word representations, the invention designs a word-level attention mechanism to better combine multiple cross-language pre-trained word embeddings.
Specifically, a set of cross-language pre-training models is obtained. During training, each sentence in the training corpus is input into every cross-language pre-training model in the set, so as to obtain the word embedding vectors of each sentence. Since different cross-language pre-training models split the same sentence into different sub-words, for any sentence in the training corpus, the invention takes the average of the sub-word embeddings produced by a cross-language pre-training model as that model's embedding of the word, thereby obtaining the word embedding vector corresponding to each sentence.
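As an illustration of this sub-word averaging step (not part of the patent's specification), the following sketch assumes the HuggingFace transformers library; the model name and function name are hypothetical choices for demonstration only.

```python
import torch
from transformers import AutoTokenizer, AutoModel

def word_embeddings(words, model_name="bert-base-multilingual-cased"):
    """Embed a pre-tokenized sentence with one cross-language pre-training model
    and average the sub-word vectors of each word back to word level."""
    tokenizer = AutoTokenizer.from_pretrained(model_name)
    model = AutoModel.from_pretrained(model_name)
    enc = tokenizer(words, is_split_into_words=True, return_tensors="pt")
    with torch.no_grad():
        hidden = model(**enc).last_hidden_state[0]   # (num_subwords, dim)
    word_ids = enc.word_ids(0)                       # maps each sub-word to its word index
    embs = []
    for l in range(len(words)):
        rows = [i for i, w in enumerate(word_ids) if w == l]
        embs.append(hidden[rows].mean(dim=0))        # average the sub-word vectors
    return torch.stack(embs)                         # (|x_j|, dim)
```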
In this embodiment, during training, obtaining the word embedding vector of the j-th sentence x_j in the training corpus specifically comprises the following steps:
Sentence x_j is input into each of the cross-language pre-training models respectively to obtain different word embedding vectors, where the word embedding vector obtained after inputting sentence x_j into the i-th cross-language pre-training model M_i is denoted accordingly; in the formula, the corresponding element represents the word embedding of the l-th word w_l in sentence x_j obtained by the cross-language pre-training model M_i, and |x_j| represents the total number of words in sentence x_j.
The different word embedding vectors obtained for sentence x_j are fused through the word-level attention mechanism to obtain the final word embedding vector E_j, where e_l is the embedding of the l-th word in the word embedding vector E_j, computed as:
In the formula, v_a, W_a and b_a are trainable parameters, and v_a^T denotes the transpose of v_a.
In the word-level attention mechanism provided by the invention, the weights of different dimensions of different words differ; for example, an emotion-expressing word focuses on emotion information, while a target word focuses on entity information.
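The exact fusion formula is given by the patent's equation above (not reproduced in this text); as a hedged illustration, the following PyTorch sketch scores each model's embedding of a word with v_a^T tanh(W_a e + b_a) and softmax-normalises the scores over the models — one plausible reading of the word-level attention mechanism, with all class and variable names being illustrative.

```python
import torch
import torch.nn as nn

class WordLevelAttentionFusion(nn.Module):
    """Fuse word embeddings produced by several cross-language pre-training models
    into one embedding per word (sketch of the word-level attention mechanism)."""
    def __init__(self, dim: int):
        super().__init__()
        self.W_a = nn.Linear(dim, dim)             # W_a and bias b_a
        self.v_a = nn.Linear(dim, 1, bias=False)   # v_a^T

    def forward(self, embs: torch.Tensor) -> torch.Tensor:
        # embs: (num_models, seq_len, dim), one embedding per model and per word
        scores = self.v_a(torch.tanh(self.W_a(embs)))   # (num_models, seq_len, 1)
        alpha = torch.softmax(scores, dim=0)            # attention weights over the models
        return (alpha * embs).sum(dim=0)                # fused E_j: (seq_len, dim)
```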
Step S1012, adversarial word embedding: the word-level attention mechanism obtains semantically rich word representations; however, typological and semantic differences between the source language and the target language make the cross-language transfer model unstable. Therefore, to improve the robustness of word embedding, the invention applies adversarial training to the input word embedding space of the cross-language transfer. Existing research has shown that adversarial training is a regularization technique that improves robustness by adding small perturbations to the input.
Specifically, during training, let the perturbation for sentence x_j be given, where r_l denotes the perturbation for the word embedding e_l of the l-th word in sentence x_j. The word embedding of sentence x_j after adding the perturbation r_j is denoted accordingly, wherein:
s.t. ||r_j|| < ε
in the formula, ℓ(·) denotes the loss for the j-th sample, ||·|| denotes the L1 norm, and ε is the parameter used to control the degree of perturbation.
wherein g is the concatenation of the |x_j| vectors g_l, g_l denotes the gradient of the loss with respect to the word embedding e_l, and ||·||_2 denotes the L2 norm.
Based on the adversarial perturbation, the adversarial training minimizes the worst-case (maximum) adversarial loss, thereby obtaining the worst-case perturbation for sentence x_j. The setup of the adversarial training is as follows:
During training, the perturbation is added to sentence x_j to obtain the perturbed word embedding vector.
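An illustrative sketch of this perturbation step, assuming PyTorch; the FGM-style normalisation ε·g/||g||_2 follows the description above, while the function name is hypothetical.

```python
import torch

def adversarial_perturbation(loss: torch.Tensor,
                             embeddings: torch.Tensor,
                             epsilon: float = 1.0) -> torch.Tensor:
    """Approximate the worst-case perturbation r_j = epsilon * g / ||g||_2,
    where g is the gradient of the sentence loss with respect to the fused
    word embeddings (embeddings must have requires_grad=True)."""
    g = torch.autograd.grad(loss, embeddings, retain_graph=True)[0]
    r = epsilon * g / (g.norm(p=2) + 1e-12)   # L2-normalise and scale by epsilon
    return r.detach()

# usage sketch: E_adv = E_j + adversarial_perturbation(loss_j, E_j)
```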
Step S102, constructing and training a syntactic GCN encoder: the adversarial embedding adapter focuses on learning a rich and robust distributed representation, while explicit knowledge is ignored. Therefore, in order to learn a cross-language structural representation, the invention introduces a syntactic GCN encoder, which integrates the dependency parse tree into cross-language structured emotion analysis, since the syntactic parse tree plays a crucial role in structured extraction. As shown in fig. 1, it can be found from the syntax tree that the holder "long-level observer group of the south african community" and the target "muga president" lie on the same subtree. Furthermore, in the parse tree the distance between the expression and the target (or holder) is closer than in the sentence. All this shows that the model can learn the structural relationships between targets, holders and expressions from words that are close on the parse tree. Second, two sentences with similar semantics have similar tree structures across languages, because they are annotated according to the same linguists' syntactic rules. Thus, the present invention proposes a syntactic GCN encoder to model explicit structural knowledge for cross-language migration.
During training, the training corpus is first obtained, a graph is constructed for each sentence based on its syntactic parse tree, then the degree matrix of the graph is calculated, and the syntactic GCN encoder is obtained from the graph and the degree matrix. The perturbed word embedding vectors obtained in step S101 are then input into the syntactic GCN encoder to obtain a structured representation in a unified space, thereby obtaining an informative and robust structured hidden representation.
Wherein, during training, this embodiment obtains the relation set of the syntactic parse tree of sentence x_j in the training corpus through the open-source tool Stanza, denoted as E_j. A graph G_j = (V_j, E_j) is constructed for sentence x_j, where v_l is the l-th node and corresponds to the l-th word w_l in sentence x_j. Based on the graph G_j, an adjacency matrix A is established: if there is a connecting edge between node v_m and node v_n in the graph G_j, then A_mn = 1, otherwise A_mn = 0, where A_mn is the element in the m-th row and n-th column of the adjacency matrix A. The degree matrix D of the graph G_j is then obtained with D_mm = Σ_n A_mn, where D_mm is the element in the m-th row and m-th column of the degree matrix D and all off-diagonal elements of D are 0.
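A sketch of building the adjacency matrix A and the degree matrix D from a Stanza dependency parse (assuming the stanza package is installed and its models downloaded; the pipeline configuration is illustrative):

```python
import torch
import stanza

# stanza.download("en") may be required once before creating the pipeline
nlp = stanza.Pipeline(lang="en", processors="tokenize,pos,lemma,depparse")

def build_graph(sentence: str):
    """Return the adjacency matrix A and degree matrix D of the dependency tree."""
    sent = nlp(sentence).sentences[0]
    n = len(sent.words)
    A = torch.zeros(n, n)
    for word in sent.words:
        if word.head > 0:                          # word.head is 1-based; 0 means root
            A[word.id - 1, word.head - 1] = 1.0    # undirected edge between the
            A[word.head - 1, word.id - 1] = 1.0    # dependent and its head
    D = torch.diag(A.sum(dim=1))                   # D_mm = sum_n A_mn
    return A, D
```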
Based on the graph G_j, the adjacency matrix A and the degree matrix D, the syntactic GCN encoder for sentence x_j is constructed; the syntactic GCN encoder has P+1 graph convolution layers in total, and the hidden representation H^(p) of the p-th graph convolution layer is learned from the hidden representation of its adjacent (p-1)-th graph convolution layer; the hidden representation of the l-th word w_l of sentence x_j at the p-th layer is an element of H^(p), which is computed as follows:
The perturbed word embedding vector of sentence x_j obtained in step S101 is input into the syntactic GCN encoder, and the structured representation in a unified space is obtained from the (P+1)-th layer, which gives the final hidden-layer representation of each word w_l in sentence x_j; in this way an informative and robust structured hidden representation is obtained.
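Since the propagation formula itself appears only in the figure, the following PyTorch sketch uses a standard degree-normalised GCN update with self-loops as a stand-in; it is an assumed form, not the patent's exact equation.

```python
import torch
import torch.nn as nn

class SyntacticGCNLayer(nn.Module):
    """One graph convolution layer over the dependency graph (illustrative rule:
    H^(p) = ReLU(D^-1 (A + I) H^(p-1) W), an assumed stand-in formulation)."""
    def __init__(self, dim: int):
        super().__init__()
        self.W = nn.Linear(dim, dim)

    def forward(self, H: torch.Tensor, A: torch.Tensor) -> torch.Tensor:
        A_hat = A + torch.eye(A.size(0), device=A.device)   # add self-loops
        A_norm = A_hat / A_hat.sum(dim=1, keepdim=True)     # row-normalise: D^-1 (A + I)
        return torch.relu(self.W(A_norm @ H))               # next-layer hidden states
```

Stacking P+1 such layers on top of the perturbed word embeddings would yield the representation H^(P) used by the decoder.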
Step S103, constructing and training a decoder: based on the informative and robust structured hidden representation, the invention adopts a simple decoding strategy for the four subtasks. First, the invention extracts viewpoint words by predicting the start and end positions of the viewpoint words; these viewpoint words are treated as the trigger words of each viewpoint. The invention then extracts the target and the holder, and predicts the emotional polarity for a given expression.
Step S103 specifically includes the following steps:
step S1031, viewpoint word extraction: to extract the viewpoint words in a sentence, during training the invention uses two binary classifiers to predict the probability that the l-th word w_l in sentence x_j is the start position of the viewpoint word e_k, or the probability that it is the end position, as shown in the following formula:
In the formula, CE(·) represents the cross-entropy function; the labels are the gold start and end positions of the sample viewpoint word e_k; if l is the start or end position of the viewpoint word e_k, the corresponding label equals 1.
In the prediction process, if the predicted start probability is greater than 0.5, then l is the start position of the viewpoint word e_k; if the predicted end probability is greater than 0.5, then l is the end position of the viewpoint word e_k.
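An illustrative sketch of the two binary classifiers and the 0.5 decision threshold (assuming PyTorch; training would use binary cross-entropy against the start/end labels described above, and the class and function names are illustrative):

```python
import torch
import torch.nn as nn

class SpanPointer(nn.Module):
    """Two binary classifiers over the GCN hidden states: one scores each word as a
    start position and one as an end position of a viewpoint word (sketch)."""
    def __init__(self, dim: int):
        super().__init__()
        self.start = nn.Linear(dim, 1)
        self.end = nn.Linear(dim, 1)

    def forward(self, H: torch.Tensor):
        # H: (seq_len, dim) hidden representations from the syntactic GCN encoder
        p_start = torch.sigmoid(self.start(H)).squeeze(-1)   # (seq_len,)
        p_end = torch.sigmoid(self.end(H)).squeeze(-1)
        return p_start, p_end

def decode_spans(p_start, p_end, threshold=0.5):
    """Positions whose probability exceeds 0.5 are taken as start/end points."""
    starts = (p_start > threshold).nonzero(as_tuple=True)[0].tolist()
    ends = (p_end > threshold).nonzero(as_tuple=True)[0].tolist()
    return starts, ends
```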
Step S1032, target word extraction: to take the viewpoint information into account when extracting the target, during training the invention predicts, from the viewpoint word representation, the probability that the l-th word w_l in sentence x_j is the start position of the target t_k, or the probability that it is the end position, as shown in the following formula:
In the formula, the weight matrices are learnable parameters; the two hidden-layer representations are, respectively, the hidden-layer representation produced by the syntactic GCN encoder for the word at the start position of the viewpoint word e_k and for the word at its end position; [a; b] denotes the concatenation of a and b.
In the formula, the labels are the gold start and end positions of the sample target t_k; if l is the start or end position of the target t_k, the corresponding label equals 1.
In the prediction process, if the predicted start probability is greater than 0.5, then l is the start position of the target t_k; if the predicted end probability is greater than 0.5, then l is the end position of the target t_k.
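A hedged sketch of this conditioning: each word representation is concatenated with the hidden states of the viewpoint word's start and end positions before the start/end classifiers are applied (the layer sizes and names are assumptions):

```python
import torch
import torch.nn as nn

class ConditionedSpanPointer(nn.Module):
    """Predict target (or holder) start/end positions conditioned on the viewpoint
    word e_k, by concatenating [H_l ; h_start ; h_end] for every word (sketch)."""
    def __init__(self, dim: int):
        super().__init__()
        self.start = nn.Linear(3 * dim, 1)
        self.end = nn.Linear(3 * dim, 1)

    def forward(self, H: torch.Tensor, e_start: int, e_end: int):
        # H: (seq_len, dim); e_start / e_end index the viewpoint word's boundaries
        cond = torch.cat([H[e_start], H[e_end]], dim=-1)       # (2*dim,)
        cond = cond.unsqueeze(0).expand(H.size(0), -1)         # broadcast to all words
        feats = torch.cat([H, cond], dim=-1)                   # [H_l ; h_start ; h_end]
        p_start = torch.sigmoid(self.start(feats)).squeeze(-1)
        p_end = torch.sigmoid(self.end(feats)).squeeze(-1)
        return p_start, p_end
```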
Step S1033, holder extraction: as with target word extraction, during training the invention predicts, from the viewpoint word representation, the probability that the l-th word w_l in sentence x_j is the start position of the holder h_k, or the probability that it is the end position, as shown in the following formula:
In the formula, the labels are the gold start and end positions of the sample holder h_k; if l is the start or end position of the holder h_k, the corresponding label equals 1.
In the prediction process, if the predicted start probability is greater than 0.5, then l is the start position of the holder h_k; if the predicted end probability is greater than 0.5, then l is the end position of the holder h_k.
Step S1034, emotion polarity prediction:
Finally, the invention predicts the emotional polarity p_k associated with the viewpoint word e_k. During training, the invention uses max pooling to obtain the sentence representation r_s = Maxpooling(H^(P)) of sentence x_j, which is concatenated with the viewpoint word representation for polarity classification, giving the probability that sentence x_j expresses the emotional polarity p_k:
Based on the emotional probability distribution, the loss function designed by the invention is as follows:
In the formula, the label is that of the sample emotional polarity p_k.
It should be noted that, since the viewpoint word expresses the emotion of the tuple (target t_k, viewpoint word e_k and holder h_k), the invention incorporates the viewpoint word e_k to predict the emotion.
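An illustrative polarity classifier following the description above (assuming PyTorch; the number of polarity classes and the use of a mean over the viewpoint span as its representation are assumptions):

```python
import torch
import torch.nn as nn

class PolarityClassifier(nn.Module):
    """Classify emotional polarity from the max-pooled sentence representation r_s
    concatenated with the viewpoint word representation (sketch)."""
    def __init__(self, dim: int, num_polarities: int = 3):
        super().__init__()
        self.cls = nn.Linear(2 * dim, num_polarities)

    def forward(self, H: torch.Tensor, e_start: int, e_end: int) -> torch.Tensor:
        r_s = H.max(dim=0).values                  # r_s = Maxpooling(H^(P))
        r_e = H[e_start:e_end + 1].mean(dim=0)     # viewpoint representation (assumed mean over its span)
        logits = self.cls(torch.cat([r_s, r_e], dim=-1))
        return torch.log_softmax(logits, dim=-1)   # trained with the cross-entropy loss above
```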
Finally, through the decoder, the method extracts all viewpoint tuples contained in the sentence, where each viewpoint tuple contains the holder, the target, the viewpoint word and the emotion polarity.
Step S104, for any sentence x obtained in real time, the trained adversarial embedding adapter is used to obtain the perturbed word embedding vector, which is input into the trained syntactic GCN encoder to obtain the hidden-layer representation of each word in sentence x; finally, the trained decoder is used to extract all viewpoint tuples contained in sentence x.
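Putting the pieces together, the following sketch composes the trained components at inference time for step S104; all function and class names refer to the illustrative sketches above and are assumptions rather than the patent's reference implementation.

```python
import torch

def analyze(words, embed_models, fusion, gcn_layers,
            opinion_head, target_head, holder_head, polarity_head):
    """Sketch of step S104: embed, fuse, encode with the syntactic GCN encoder,
    then decode viewpoint tuples (holder, target, viewpoint word, polarity)."""
    embs = torch.stack([word_embeddings(words, m) for m in embed_models])
    E = fusion(embs)                                  # output of the embedding adapter
    A, _ = build_graph(" ".join(words))               # assumes matching tokenization
    H = E
    for layer in gcn_layers:                          # syntactic GCN encoder
        H = layer(H, A)
    tuples = []
    o_starts, o_ends = decode_spans(*opinion_head(H))
    for s, e in zip(o_starts, o_ends):                # naive start/end pairing for the sketch
        t_span = decode_spans(*target_head(H, s, e))
        h_span = decode_spans(*holder_head(H, s, e))
        polarity = polarity_head(H, s, e).argmax().item()
        tuples.append((h_span, t_span, (s, e), polarity))
    return tuples
```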
Aiming at the problem of scarce annotated data for structured sentiment analysis, the invention provides a knowledge-enhanced cross-language structured migration method. The invention combines recent techniques such as graph convolutional networks and adversarial models, and builds a cross-language, cross-form structured extraction model according to the characteristics of the structured extraction task. The invention mainly differs from traditional structured extraction methods in the following two aspects:
(1) The adversarial embedding adapter:
The goal is to learn word embeddings that are informative and robust for cross-language transfer. First, a word-level attention mechanism is designed to capture the important implicit distributed semantics of multiple embeddings pre-trained on different corpora with different training strategies and tasks. Then, an adversarial training strategy is employed to improve the robustness of the embeddings.
(2) GCN-based structured representation learning:
The adversarial embedding adapter focuses on learning a rich and robust distributed representation, while explicit knowledge is ignored. Therefore, modeling explicit structural information with a GCN is a natural fit for structured emotion analysis.
In the present invention, a model for cross-language structured sentiment analysis is presented. An adversarial embedding adapter is designed to learn informative and robust embeddings. Then, a syntactic GCN encoder is introduced to learn a structural representation based on the parse tree. We compare the model with supervised and unsupervised baselines on five datasets in four languages. Experimental results show that the model has a clear advantage in cross-language migration. Ablation studies are also conducted to demonstrate the effectiveness of each module in the model.
(1) Knowledge-enhanced cross-language structured sentiment analysis model performance
The present invention is directed to the use of knowledge learned in one language to improve generalization ability in another language. To verify the validity of the model, it is evaluated for cross-language migration, trained on the source language, and tested on the target language without the tagged data. Twenty-five migration tasks were performed on five data sets in four low-resource languages, and the model of the present invention was compared to supervised and unsupervised baselines, as shown in table 1 below.
Table 1 is an evaluation table of the results of the experiment
From this table, the following observations can be made. The migration model of the invention performs markedly better than the unsupervised baselines in cross-language structured emotion analysis. In particular, the model of the invention achieves the best performance on all metrics across the datasets, and the target F1 is improved by more than three points on three of the datasets. All this shows that the informative structural representation of the invention helps the model migrate structured sentiment between different languages.
(2) Robust embedded encoder and GCN-based structured representation learning effectiveness
To investigate the effectiveness of each module composing the model of the present invention, ablation tests were performed on five datasets, as shown in Table 2 below. The average score over all source datasets except the target dataset is reported. Specifically, the invention removes the adversarial embedding adapter (-AEA) and the syntactic GCN encoder (-SGCNE), respectively, from the model.
Table 2 shows the results of the ablation experiment
The results indicate that both the adversarial embedding adapter and the syntactic GCN encoder are important to this task. In particular, the adversarial embedding adapter can capture diverse underlying features from the various multilingual embeddings, which contain different semantic information. It learns informative embeddings through the attention mechanism and improves robustness through the adversarial training strategy; the resulting word embeddings improve cross-language migration performance well. At the same time, the structural representation learned by the syntactic GCN encoder further improves the model, since the dependency parse tree is crucial for structured sentiment.
Claims (4)
1. A cross-language structured emotion analysis method based on knowledge enhancement, characterized in that a training corpus of a source language is adopted, whose labels are sets of viewpoint tuples, where the number of samples and the number of viewpoint tuples contained in each sample are given; in a set of viewpoint tuples, the k-th viewpoint tuple o_k = (h_k, t_k, e_k, p_k) represents that the holder h_k of the k-th viewpoint tuple o_k expresses, through the viewpoint word e_k, the emotional polarity p_k towards the target t_k, wherein: the holder h_k, the target t_k and the viewpoint word e_k are each a sub-string of the j-th sentence x_j in the training corpus, defined by their start positions in sentence x_j, the words at those start positions, their end positions in sentence x_j, and the words at those end positions; the cross-language structured emotion analysis method specifically comprises the following steps:
S101, constructing and training an adversarial embedding adapter: when constructing the adversarial embedding adapter, a word-level attention mechanism is designed to capture the important implicit distributed semantics of multiple embeddings pre-trained on different corpora with different training strategies and tasks, and an adversarial training strategy is then adopted to improve the robustness of word embedding; when training the adversarial embedding adapter, a set of cross-language pre-training models is obtained, and each sentence in the training corpus is input into every cross-language pre-training model in the set so as to obtain word embedding vectors for each sentence; for any sentence in the training corpus, the word embedding vectors obtained from the cross-language pre-training models are fused through a word-level attention mechanism to obtain the final word embedding vector corresponding to each sentence, and finally a perturbation is added to the final word embedding vector;
step S102, constructing and training a grammar GCN encoder:
obtaining the training corpus, constructing a graph for each sentence based on its syntactic parse tree, then calculating the degree matrix of the graph, and obtaining the syntactic GCN encoder from the graph and the degree matrix; the perturbed word embedding vectors obtained in step S101 are input into the syntactic GCN encoder to obtain a structured representation in a unified space, thereby obtaining an informative and robust structured hidden representation;
step S103, constructing and training a decoder:
based on the informative and robust structured hidden representation, extracting viewpoint words by predicting the start and end positions of the viewpoint words, and regarding the viewpoint words as the trigger words of each viewpoint; then, extracting the target and the holder, and predicting the emotional polarity of the given expression;
step S104, for any sentence x obtained in real time, the trained adversarial embedding adapter is used to obtain the perturbed word embedding vector, which is input into the trained syntactic GCN encoder to obtain the hidden-layer representation of each word in sentence x; finally, the trained decoder is used to extract all viewpoint tuples contained in sentence x.
2. The knowledge-enhancement-based cross-language structured emotion analysis method of claim 1, wherein the step S101 specifically comprises the steps of:
step S1011, obtaining the word embedding vector of the j-th sentence x_j in the training corpus, comprising the following steps:
sentence x_j is input into each of the cross-language pre-training models respectively to obtain different word embedding vectors, where the word embedding vector obtained after inputting sentence x_j into the i-th cross-language pre-training model M_i is denoted accordingly; in the formula, the corresponding element represents the word embedding of the l-th word w_l in sentence x_j obtained by the cross-language pre-training model M_i, and |x_j| represents the total number of words in sentence x_j;
the different word embedding vectors obtained for sentence x_j are fused through the word-level attention mechanism to obtain the final word embedding vector E_j, where e_l is the embedding of the l-th word in the word embedding vector E_j, computed as:
in the formula, v_a, W_a and b_a are trainable parameters, and v_a^T denotes the transpose of v_a;
step S1012, let the perturbation for sentence x_j be given, where r_l denotes the perturbation for the word embedding e_l of the l-th word in sentence x_j; the word embedding of sentence x_j after adding the perturbation r_j is denoted accordingly, wherein:
wherein g is the concatenation of the |x_j| vectors g_l, g_l denotes the gradient of the loss with respect to the word embedding e_l, ||·||_2 denotes the L2 norm, ℓ(·) denotes the loss for the j-th sample, and ε is the parameter used to control the degree of perturbation;
based on the adversarial perturbation, the adversarial training minimizes the worst-case (maximum) adversarial loss, thereby obtaining the worst-case perturbation for sentence x_j; the setup of the adversarial training is as follows:
3. The knowledge-enhancement-based cross-language structured emotion analysis method of claim 1, wherein the step S102 comprises the steps of:
during training, the relation set of the syntactic parse tree of sentence x_j in the training corpus is obtained and denoted as E_j; a graph G_j = (V_j, E_j) is constructed for sentence x_j, where v_l is the l-th node and corresponds to the l-th word w_l in sentence x_j; based on the graph G_j, an adjacency matrix A is established: if there is a connecting edge between node v_m and node v_n in the graph G_j, then A_mn = 1, otherwise A_mn = 0, where A_mn is the element in the m-th row and n-th column of the adjacency matrix A; the degree matrix D of the graph G_j is then obtained with D_mm = Σ_n A_mn, where D_mm is the element in the m-th row and m-th column of the degree matrix D and all off-diagonal elements of D are 0.
Based on the graph G_j, the adjacency matrix A and the degree matrix D, the syntactic GCN encoder for sentence x_j is constructed; the hidden representation H^(p) of the p-th graph convolution layer is learned from the hidden representation of its adjacent (p-1)-th graph convolution layer; the hidden representation of the l-th word w_l of sentence x_j at the p-th layer is an element of H^(p), which is computed as follows:
The perturbed word embedding vector of sentence x_j obtained in step S101 is input into the syntactic GCN encoder, and the structured representation in a unified space is obtained from the (P+1)-th layer, which gives the final hidden-layer representation of each word w_l in sentence x_j; in this way an informative and robust structured hidden representation is obtained.
4. The knowledge-enhancement-based cross-language structured emotion analysis method of claim 3, wherein the step S103 specifically comprises the steps of:
step S1031, viewpoint word extraction: to extract the viewpoint words in a sentence, during training, two binary classifiers are used to predict the probability that the l-th word w_l in sentence x_j is the start position of the viewpoint word e_k, or the probability that it is the end position, as shown in the following formula:
in the formula, CE(·) represents the cross-entropy function; the labels are the gold start and end positions of the sample viewpoint word e_k, and if l is the start or end position of the viewpoint word e_k, the corresponding label is equal to 1;
step S1032, target word extraction: to take the viewpoint information into account when extracting the target, during training, the probability that the l-th word w_l in sentence x_j is the start position of the target t_k, or the probability that it is the end position, is predicted from the viewpoint word representation, as shown in the following formula:
in the formula, the weight matrices are learnable parameters; the two hidden-layer representations are, respectively, the hidden-layer representation produced by the syntactic GCN encoder for the word at the start position of the viewpoint word e_k and for the word at its end position; [a; b] denotes the concatenation of a and b;
in the formula, the labels are the gold start and end positions of the sample target t_k, and if l is the start or end position of the target t_k, the corresponding label is equal to 1;
step S1033, holder extraction: during training, the probability that the l-th word w_l in sentence x_j is the start position of the holder h_k, or the probability that it is the end position, is predicted from the viewpoint word representation, as shown in the following formula:
in the formula, the labels are the gold start and end positions of the sample holder h_k, and if l is the start or end position of the holder h_k, the corresponding label is equal to 1;
step S1034, emotion polarity prediction:
during training, max pooling is used to obtain the sentence representation r_s = Maxpooling(H^(P)) of sentence x_j, which is concatenated with the viewpoint word representation for polarity classification, giving the probability that sentence x_j expresses the emotional polarity p_k:
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210423028.0A CN114970557A (en) | 2022-04-21 | 2022-04-21 | Knowledge enhancement-based cross-language structured emotion analysis method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210423028.0A CN114970557A (en) | 2022-04-21 | 2022-04-21 | Knowledge enhancement-based cross-language structured emotion analysis method |
Publications (1)
Publication Number | Publication Date |
---|---|
CN114970557A true CN114970557A (en) | 2022-08-30 |
Family
ID=82978789
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210423028.0A Pending CN114970557A (en) | 2022-04-21 | 2022-04-21 | Knowledge enhancement-based cross-language structured emotion analysis method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN114970557A (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115204183A (en) * | 2022-09-19 | 2022-10-18 | 华南师范大学 | Knowledge enhancement based dual-channel emotion analysis method, device and equipment |
CN115204183B (en) * | 2022-09-19 | 2022-12-27 | 华南师范大学 | Knowledge enhancement-based two-channel emotion analysis method, device and equipment |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||