CN115080688A - Method and device for few-sample cross-domain emotion analysis - Google Patents
Method and device for few-sample cross-domain emotion analysis
- Publication number
- CN115080688A (application number CN202210661020.8A)
- Authority
- CN
- China
- Prior art keywords
- emotion
- sentence
- vector
- node
- feature vector
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Granted
Classifications
- G06F16/3344: Query execution using natural language analysis
- G06F16/35: Clustering; Classification
- G06F16/367: Ontology
- G06F40/211: Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
Abstract
The invention discloses a method and device for few-sample cross-domain emotion analysis, wherein the method comprises the following steps: obtaining sentence data and inputting it into a trained BERT encoder to obtain a first feature vector; inputting the sentence data into a trained GCN encoder to obtain a second feature vector; performing feature fusion on the first and second feature vectors to obtain a vector representation of the sentence; and inputting the vector representation of the sentence into a trained few-sample prototype network model to output the emotion polarity of the sentence. By capturing both domain-shared and domain-specific features with few-sample learning, the method improves the emotion prediction performance of a model transferred from the source domain to the target domain. The invention can be widely applied in the technical field of natural language processing.
Description
Technical Field
The invention relates to the technical field of natural language processing, in particular to a method and device for few-sample cross-domain emotion analysis.
Background
Emotion analysis is the task of automatically classifying the emotion polarity of text data. Emotion analysis models based on deep neural networks currently achieve remarkable performance, but methods based on neural networks need a large number of labeled samples to reach an ideal prediction effect, and labeling training samples requires considerable manpower and time.
To alleviate the reliance on large amounts of manually labeled data, cross-domain emotion analysis has recently become a research focus; its goal is to transfer knowledge from label-rich source domains to label-sparse target domains. The main challenge of cross-domain emotion analysis is overcoming the gap between the source domain and the target domain, especially when the two differ significantly. Facing this challenge, many studies propose extracting domain-invariant grammatical features as bridges for domain transfer, thereby reducing inter-domain differences. However, linguistic expression is diverse, and because of this diversity of expression, domain-invariant grammatical information can propagate errors during emotion transfer. Meanwhile, current cross-domain emotion analysis methods usually focus on learning and mining domain-invariant features while ignoring domain-specific features; as domain differences grow, the available domain-invariant features become limited, degrading cross-domain emotion analysis performance.
Disclosure of Invention
In order to solve at least one of the above technical problems in the prior art to a certain extent, the invention aims to provide a method and device for few-sample cross-domain emotion analysis.
The technical scheme adopted by the invention is as follows:
A few-sample cross-domain emotion analysis method comprises the following steps:
sentence data is obtained, and the sentence data is input into a trained BERT encoder to obtain a first feature vector;
inputting sentence data into a trained GCN encoder to obtain a second feature vector;
performing feature fusion on the first feature vector and the second feature vector to obtain vector representation of the sentence;
inputting the vector representation of the sentence into the trained few-sample prototype network model, and outputting the emotion polarity of the sentence;
wherein the few-sample prototype network model is trained as follows: obtaining labeled samples of preset positive and negative emotion polarities, obtaining sentence vector representations of the labeled samples, mapping the sentence vector representations into the same feature space, and taking the mean vector of the sentence vectors sharing a polarity as the prototype representation of the corresponding emotion polarity.
Further, the BERT encoder is trained by:
acquiring a text of a source domain or a target domain, and training a BERT encoder to acquire rich domain feature knowledge; wherein the feature vector of each sentence in the text is represented as:
x_w = h_[CLS] = BERT(x)

where x is the input sentence, h_[CLS] is the hidden vector of the special [CLS] token prepended to the sentence by the BERT encoder, and BERT(·) is the sentence encoder.
Further, the GCN encoder is trained by:
designing two self-supervision tasks, and training a GCN encoder; the two self-supervision tasks comprise a relation classification task and an emotion alignment classification task;
the method comprises the following steps that a relation classification task is given to any two nodes, and the relation of the two nodes can be judged based on a relation classification model of a GCN encoder; the emotion alignment task gives two aspect words and viewpoint words, and whether the two aspect words and the viewpoint words have the same emotion polarity needs to be judged based on an emotion alignment model of a GCN encoder; the two self-supervision tasks aim at the common domain relation knowledge and learn the emotional alignment characteristics between the aspect viewpoint pairs so as to obtain a characteristic vector containing the common background knowledge and the aspect viewpoint word emotional alignment.
Further, in the relation classification task, the node feature vector is obtained by fusing the representations of the node's neighbors; the fusion process of the feature vectors is expressed as:

h_i^(l+1) = σ( Σ_{r∈R} Σ_{j∈N_i^r} (1/c_{i,r}) W_r^(l) x_j + W_0^(l) x_i )

where N_i^r denotes all neighbor nodes of node i under relation r; g_i is the randomly initialized initial node feature vector, which is converted into h_i after the two-step graph convolution process; σ denotes the ReLU activation function; l denotes the l-th graph convolution layer; c_{i,r} denotes the number of neighbor nodes of node i; x_j denotes the feature vector representation of node j; and W_r^(l), W_0^(l) denote parameter matrices to be trained.
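As a concrete illustration of the relational graph convolution described above, the following is a minimal NumPy sketch; it is not the patented implementation, and the toy graph, dimensions, and random weights are assumptions for illustration only:

```python
import numpy as np

def rgcn_layer(h, neighbors, W_rel, W_self):
    """One relational graph-convolution step: each node aggregates its
    neighbors per relation, normalized by the neighbor count c_{i,r},
    plus a self-connection term, followed by a ReLU activation."""
    n, d = h.shape
    out = h @ W_self  # self-connection term W_0 x_i
    for i in range(n):
        for r, nbrs in neighbors.get(i, {}).items():
            c_ir = len(nbrs)  # normalization constant c_{i,r}
            for j in nbrs:
                out[i] += (h[j] @ W_rel[r]) / c_ir
    return np.maximum(out, 0.0)  # ReLU

rng = np.random.default_rng(0)
d = 8
h0 = rng.normal(size=(4, d))                       # random initial node features g_i
W_rel = {"RelatedTo": rng.normal(size=(d, d)) * 0.1}
W_self = rng.normal(size=(d, d)) * 0.1
neighbors = {0: {"RelatedTo": [1, 2]}, 1: {"RelatedTo": [0]}}  # toy graph
h1 = rgcn_layer(h0, neighbors, W_rel, W_self)      # first convolution step
h2 = rgcn_layer(h1, neighbors, W_rel, W_self)      # second step, yielding h_i
```

The two applications of `rgcn_layer` mirror the two-step graph convolution that turns the initial vectors g_i into the final representations h_i.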
the penalty function generated by the relationship classification task is:
wherein, s (v) i ,r i,j ,v j ) Expressed is a matrix analysis score function; r r The representation is a vector representation of the relationship r; t denotes the set of nodes of the graph G, and y denotes whether a relationship r exists between a given node i and a given node j i,j If yes, y takes a value of 1, and if not, y takes a value of 0;
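The matrix factorization score and the binary classification loss over triples can be sketched as follows; a DistMult-style score s(v_i, r, v_j) = Σ (v_i ∘ R_r ∘ v_j) is assumed here, and all vectors are toy stand-ins:

```python
import numpy as np

def score(vi, r_vec, vj):
    """DistMult-style factorization score: sum of v_i * diag(R_r) * v_j."""
    return float(np.sum(vi * r_vec * vj))

def relation_loss(triples):
    """Binary cross-entropy over (v_i, R_r, v_j, y) tuples, where y = 1
    if the relation holds between the two nodes and 0 otherwise."""
    total = 0.0
    for vi, r_vec, vj, y in triples:
        p = 1.0 / (1.0 + np.exp(-score(vi, r_vec, vj)))  # sigmoid of the score
        total -= y * np.log(p) + (1 - y) * np.log(1 - p)
    return total / len(triples)

rng = np.random.default_rng(0)
v = {k: rng.normal(size=4) for k in "abc"}   # toy node embeddings
r_vec = rng.normal(size=4)                   # toy relation vector R_r
triples = [(v["a"], r_vec, v["b"], 1),       # positive triple
           (v["a"], r_vec, v["c"], 0)]       # negative triple
loss = relation_loss(triples)
```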
the emotion alignment classification task generates a loss function as follows:
wherein N represents the number of unlabeled samples of the source field and the target field; p k The aspect word-emotion word pair contained in the kth unlabeled sample is shown.
Further, the method also comprises the step of constructing the common sense knowledge graph:
based on the unlabeled samples of the source and target domains, taking the sentence as the unit, the words whose part of speech is noun, verb, or adjective are taken as link seeds, and the knowledge triples one hop away are linked through the ConceptNet common-sense knowledge base; finally, the subgraphs linked from all sentences are de-duplicated and merged to form a domain common-sense knowledge graph, providing knowledge support for cross-domain emotion analysis;
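A minimal sketch of the seed linking and de-duplication step described above; the tiny in-memory knowledge base below is a hypothetical stand-in for ConceptNet (real ConceptNet lookups are not modeled):

```python
# Toy stand-in for a ConceptNet lookup: maps a seed word to one-hop triples.
TOY_KB = {
    "battery": [("battery", "RelatedTo", "power"), ("battery", "UsedFor", "phone")],
    "great":   [("great", "Synonym", "excellent")],
}
CONTENT_POS = {"NOUN", "VERB", "ADJ"}  # parts of speech kept as link seeds

def link_sentence(tagged_sentence, kb):
    """Keep nouns/verbs/adjectives as link seeds and pull their one-hop triples."""
    triples = set()
    for word, pos in tagged_sentence:
        if pos in CONTENT_POS:
            triples.update(kb.get(word, []))
    return triples

def build_domain_graph(tagged_sentences, kb):
    """Union (i.e., de-duplicate and merge) per-sentence subgraphs into one domain graph."""
    graph = set()
    for sent in tagged_sentences:
        graph |= link_sentence(sent, kb)
    return graph

sentences = [
    [("the", "DET"), ("battery", "NOUN"), ("is", "AUX"), ("great", "ADJ")],
    [("battery", "NOUN"), ("drains", "VERB")],
]
domain_graph = build_domain_graph(sentences, TOY_KB)
```

Because the graph is a set, triples linked from multiple sentences appear only once, which is the de-duplication step the patent describes.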
wherein the constructed domain common-sense knowledge graph is represented as:

G = (V, Φ)

where each node v_i ∈ V, each relation triple (v_i, r_{i,j}, v_j) ∈ Φ, and r_{i,j} ∈ R denotes the relation between the two nodes v_i and v_j.
Further, the method also comprises the following steps:
and calculating the importance of the nodes in the knowledge graph with an attention mechanism, wherein the importance of each node is expressed as:

α_i = exp(e_i) / Σ_{k∈N_i} exp(e_k)

where e_i is the attention score of the i-th node (derived from its vector representation), α_i denotes the importance of the i-th node, e_k denotes the attention score of the k-th node, and N_i denotes the set of all neighbor nodes of the i-th node.
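The neighborhood softmax above can be sketched as follows; the per-node attention scores e_i are assumed to be scalars for illustration:

```python
import numpy as np

def node_importance(scores, neighbor_sets):
    """alpha_i = exp(e_i) / sum_{k in N_i} exp(e_k): softmax of each node's
    attention score over its own neighborhood."""
    alpha = {}
    for i, nbrs in neighbor_sets.items():
        denom = sum(np.exp(scores[k]) for k in nbrs)
        alpha[i] = float(np.exp(scores[i]) / denom)
    return alpha

scores = {0: 1.2, 1: 0.3, 2: -0.5}                    # hypothetical attention scores e_i
neighbor_sets = {0: [0, 1, 2], 1: [0, 1], 2: [0, 2]}  # N_i includes the node itself here
alpha = node_importance(scores, neighbor_sets)
```

Nodes with higher scores relative to their neighborhood receive higher importance, which down-weights linked nodes that contribute little to the emotion analysis task.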
Further, the performing feature fusion on the first feature vector and the second feature vector to obtain a vector representation of the sentence includes:
splicing the first feature vector and the second feature vector, calculating the probabilities of all possible polarities of the input text, and selecting the emotion label with the highest probability as the final predicted emotion label, thereby completing the emotion analysis task; the feature vector of each sentence is expressed as:
x = [x_w ; x_g]

where x_g is the common-sense knowledge vector capturing aspect-opinion emotion alignment, x_w is the context-aware sentence vector generated by the BERT encoder, and [;] denotes vector concatenation.
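A sketch of the feature fusion and polarity prediction step, with random stand-ins for the BERT and GCN vectors and a hypothetical linear classifier head (the dimensions and weights are illustrative assumptions):

```python
import numpy as np

rng = np.random.default_rng(0)
x_w = rng.normal(size=768)        # stand-in for the BERT sentence vector
x_g = rng.normal(size=128)        # stand-in for the GCN knowledge vector
x = np.concatenate([x_w, x_g])    # x = [x_w ; x_g]

# Hypothetical linear classifier head over the fused vector.
W = rng.normal(size=(2, x.size)) * 0.01        # two polarities: positive, negative
logits = W @ x
probs = np.exp(logits) / np.exp(logits).sum()  # softmax over polarities
label = ["positive", "negative"][int(np.argmax(probs))]
```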
Further, after the sentence vector representations of the samples are input into the few-sample prototype network model, the following steps are performed:

For k samples of each of the positive and negative emotion categories, the prototype of each emotion category is calculated:

c_± = (1/k) Σ_{j=1}^{k} f(x_j^±)

where f(x_j^±) denotes the feature vector representation of the j-th sample of the positive/negative emotion category;

and the emotion probability of sentence x is output, calculated as:

p(y = c_i | q) = exp(-d(q, c_i)) / Σ_j exp(-d(q, c_j))

where c_i is the i-th emotion polarity (prototype); q denotes the sample to be tested; and d(·,·) denotes the Euclidean distance between two given vectors.
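The prototype computation and the distance-based softmax can be sketched as follows, with Gaussian toy embeddings standing in for the fused sentence vectors:

```python
import numpy as np

def prototypes(support):
    """Class prototype = mean of the k support embeddings per emotion class.
    support: {label: array of shape (k, d)}."""
    return {c: v.mean(axis=0) for c, v in support.items()}

def classify(query, protos):
    """Softmax over negative Euclidean distances to each prototype."""
    labels = list(protos)
    dist = np.array([np.linalg.norm(query - protos[c]) for c in labels])
    p = np.exp(-dist) / np.exp(-dist).sum()
    return dict(zip(labels, p))

rng = np.random.default_rng(1)
support = {
    "positive": rng.normal(loc=+1.0, size=(5, 16)),  # k = 5 labeled positive samples
    "negative": rng.normal(loc=-1.0, size=(5, 16)),  # k = 5 labeled negative samples
}
protos = prototypes(support)
query = rng.normal(loc=+1.0, size=16)                # a "positive-like" query sentence
probs = classify(query, protos)
```

Because only the k-shot support means are learned per episode, this metric-based classifier needs no per-domain fine-tuning of a softmax layer.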
Further, the method also comprises the following steps:
and (3) training the few-sample prototype network model by adopting an Adam optimizer, wherein a loss function in the training process is represented as follows:
L = L_recon + L_softmax

where L_recon denotes the reconstruction loss of the sentence vector representation, and L_softmax denotes the cross-entropy loss of the emotion classification task.
The other technical scheme adopted by the invention is as follows:
a small sample cross-domain emotion analysis device, comprising:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method described above.
The other technical scheme adopted by the invention is as follows:
a computer readable storage medium in which a processor executable program is stored, which when executed by a processor is for performing the method as described above.
The invention has the beneficial effects that: according to the method, the field sharing characteristics and the field specific characteristics are captured by using the few-sample learning technology, so that the emotion prediction effect of the model transferred from the source field to the target field is improved.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the drawings needed for describing the embodiments or the prior art are briefly introduced below. It should be understood that the drawings described below show only some embodiments of the present invention, and those skilled in the art can derive other drawings from them without creative effort.
FIG. 1 is a flowchart illustrating steps of a method for cross-domain emotion analysis with few samples according to an embodiment of the present invention;
FIG. 2 is a flow chart of a method for knowledge-enhancement-based small-sample cross-domain emotion analysis in an embodiment of the present invention;
FIG. 3 is a schematic model structure diagram of a knowledge enhancement-based small-sample cross-domain emotion analysis method in the embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention. The step numbers in the following embodiments are provided only for convenience of illustration, the order between the steps is not limited at all, and the execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art.
In the description of the present invention, it should be understood that the orientation or positional relationship referred to in the description of the orientation, such as upper, lower, front, rear, left, right, etc., is based on the orientation or positional relationship shown in the drawings only for the convenience of description of the present invention and simplification of the description, and does not indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention.
In the description of the present invention, "several" means one or more, and "a plurality of" means two or more; "greater than", "less than", "exceeding", and the like are understood to exclude the stated number, while "above", "below", "within", and the like are understood to include it. Where "first" and "second" are used, they serve only to distinguish technical features and are not to be understood as indicating or implying relative importance, the number of technical features indicated, or the precedence of the technical features indicated.
In the description of the present invention, unless otherwise explicitly limited, terms such as arrangement, installation, connection and the like should be understood in a broad sense, and those skilled in the art can reasonably determine the specific meanings of the above terms in the present invention in combination with the specific contents of the technical solutions.
As shown in fig. 1, the present embodiment provides a method for analyzing emotion with few samples across domains, including the following steps:
s1, obtaining sentence data, inputting the sentence data into a trained BERT coder, and obtaining a first feature vector;
s2, inputting sentence data into the trained GCN encoder to obtain a second feature vector;
s3, performing feature fusion on the first feature vector and the second feature vector to obtain vector representation of the sentence;
and S4, inputting the vector representation of the sentence into the trained few-sample prototype network model, and outputting the emotion polarity of the sentence; wherein the few-sample prototype network model is trained as follows: obtaining labeled samples of preset positive and negative emotion polarities, obtaining sentence vector representations of the labeled samples, mapping the sentence vector representations into the same feature space, and taking the mean vector of the sentence vectors sharing a polarity as the prototype representation of the corresponding emotion polarity.
As an alternative embodiment, the BERT encoder is trained by:
acquiring a text of a source domain or a target domain, and training a BERT encoder to acquire rich domain feature knowledge; wherein the feature vector of each sentence in the text is represented as:
x_w = h_[CLS] = BERT(x)

where x represents the input sentence and BERT(·) is the sentence encoder.
As an alternative embodiment, the GCN encoder is trained by:
designing two self-supervision tasks, and training a GCN encoder; the two self-supervision tasks comprise a relation classification task and an emotion alignment classification task;
the relation classification task is given any two nodes, and the model can judge the relation of the two nodes; the emotion alignment task gives two aspect words and viewpoint words, and the model needs to judge whether the two aspect words and the viewpoint words have the same emotion polarity; the two self-supervision tasks aim at the common domain relation knowledge and learn the emotional alignment characteristics between the aspect viewpoint pairs so as to obtain a characteristic vector containing the common background knowledge and the aspect viewpoint word emotional alignment.
Two self-supervision tasks, a relation classification task and an emotion alignment classification task, are designed to pre-train the GCN autoencoder: the relations between nodes are predicted to obtain common-sense knowledge feature vectors, and the emotion alignment binary classification task is used to learn emotion alignment features between aspect-opinion pairs, so as to obtain feature vectors containing background common sense and aspect-opinion emotion alignment. The conversion process of the feature vectors may be expressed as:

h_i^(l+1) = σ( Σ_{r∈R} Σ_{j∈N_i^r} (1/c_{i,r}) W_r^(l) x_j + W_0^(l) x_i )

where N_i^r denotes all neighbor nodes of node i under relation r, c_{i,r} is a normalization constant that can be preset, g_i is the randomly initialized initial node feature vector, which is converted into h_i after the two-step graph convolution process, W_r^(l) and W_0^(l) are the weight matrices of the l-th layer, and σ denotes the ReLU activation function. Meanwhile, the loss function generated by the relation classification self-supervised learning task is:
L_rel = - Σ_{(i,j)∈T} [ y_{i,j} log σ(s(v_i, r_{i,j}, v_j)) + (1 - y_{i,j}) log(1 - σ(s(v_i, r_{i,j}, v_j))) ]

where s(v_i, r_{i,j}, v_j) denotes a matrix factorization score function and R_r denotes the vector representation of relation r.
The loss function generated by the emotion alignment classification self-supervision learning task is as follows:
L_align = -(1/N) Σ_{k=1}^{N} Σ_{(a,o)∈P_k} [ y_{a,o} log p(a,o) + (1 - y_{a,o}) log(1 - p(a,o)) ]

where N denotes the number of unlabeled samples from the source and target domains, and P_k denotes the set of aspect word-opinion word pairs contained in the k-th unlabeled sample.
As an alternative embodiment, the method also comprises the step of constructing the common sense knowledge map:
based on the unlabeled samples of the source and target domains, taking the sentence as the unit, the words whose part of speech is noun, verb, or adjective are taken as link seeds, and the knowledge triples one hop away are linked through the ConceptNet common-sense knowledge base; finally, the subgraphs linked from all sentences are de-duplicated and merged to form a domain common-sense knowledge graph, providing knowledge support for cross-domain emotion analysis;
wherein the constructed common sense knowledge map of the field is represented as:
wherein, a node v in the map is constructed i E.v, relationship triplet (V) i ,r i,j ,v j ) E.g. phi, whereinDenoted as two nodes v i And v j And (4) relationship.
As an alternative embodiment, the importance of a given graph node is calculated using an attention mechanism; the importance of each node is expressed as:

α_i = exp(e_i) / Σ_{k∈N_i} exp(e_k)

where e_i is the attention score of the i-th node (derived from its vector representation), α_i denotes the importance of the i-th node and ranges over [0,1], and N_i denotes the set of neighbor nodes of the i-th node.
A knowledge-aware attention mechanism module: given a sentence, the external knowledge base ConceptNet can be used to link to a subgraph, but not every linked node contributes equally to the cross-domain emotion analysis task. For this reason, a knowledge-aware attention mechanism module is designed: given the representations of all linked nodes, the importance of each linked node is calculated.
As an alternative embodiment, feature fusion: a feature fusion module is designed; given a sentence, the BERT encoder and the GCN encoder perform feature representation respectively, yielding two feature vectors, and the vector representation of the sentence is obtained by concatenation.

The vectors generated by the two encoders (the BERT encoder and the GCN graph encoder) are concatenated, the probabilities of all possible polarities of the input text are calculated, and the emotion label with the highest probability is selected as the final predicted emotion label, completing the emotion analysis task; the feature vector of each sentence can be expressed as:
x = [x_w ; x_g]

where x_g is the common-sense knowledge vector capturing aspect-opinion emotion alignment, x_w is the context-aware sentence vector generated by the BERT encoder, and [;] denotes vector concatenation.
As an alternative embodiment, the few-sample prototype network learning adopts a prototype network model based on metric learning. Given k samples each of the positive and negative emotion categories, the prototype of each emotion category is calculated:

c_± = (1/k) Σ_{j=1}^{k} f(x_j^±)

where f(x_j^±) denotes the feature vector representation of the j-th sample of the positive/negative emotion category.

Meanwhile, the emotion probability of a given sentence x is output, calculated as:

p(y = c_i | q) = exp(-d(q, c_i)) / Σ_j exp(-d(q, c_j))

where c_i is the i-th emotion polarity (prototype), q denotes the sample to be tested, and d(·,·) denotes the Euclidean distance between two given vectors.
As an optional embodiment, the aspect-opinion emotion alignment task and the emotion analysis task are trained jointly, and an Adam optimizer is used to train the model to obtain optimal parameters; the loss function may be expressed as:
L = L_recon + L_softmax

where L_recon denotes the reconstruction loss of the sentence vector representation, and L_softmax denotes the cross-entropy loss of the emotion classification task.
As shown in fig. 2 and fig. 3, this embodiment provides a knowledge-enhanced few-sample cross-domain emotion analysis method. The method uses a small number of labeled samples in the target domain and introduces a large amount of common-sense knowledge, so that it can capture domain-invariant features as well as domain-specific features. Model parameters are optimized in a few-sample learning manner, which avoids the overfitting problem during model training, effectively improves the domain adaptability of the model, and further improves cross-domain emotion analysis performance. The model comprises a pre-trained domain BERT encoder, a pre-trained GCN autoencoder, a classifier trained on the feature vector representation obtained by concatenating the vectors generated by the two encoders as the sentence representation, and a prototype network module for few-sample learning. The method comprises the following steps:
(1) Unlabeled samples from the source or target domain are input to pre-train the BERT encoder.
The pre-training BERT encoder is used for obtaining rich domain knowledge through large-scale label-free data pre-training, and the feature vector of each sentence in the text is represented as follows:
x_w = h_[CLS] = BERT(x)

where x represents the input sentence and BERT(·) is the sentence encoder.
(2) Based on the unlabeled samples (i.e., review sentences) of the source and target domains, taking the sentence as the unit, the words whose part of speech is noun, verb, or adjective are taken as link seeds, and the knowledge triples one hop away are linked through the ConceptNet common-sense knowledge base; finally, the subgraphs linked from all sentences are de-duplicated and merged to form a domain common-sense knowledge graph, providing knowledge support for cross-domain emotion analysis. The dependency relationship means that if the dependency syntactic relation between specified words in a sentence is "nsubj", "amod", or "xcomp", they are connected with a "description" relation. Finally, the seeds filter ConceptNet to create subgraphs, and the subgraphs of all sentences are merged into a domain common-sense graph, which can be expressed as:
G = (V, Φ)

where each node v_i ∈ V, each relation triple (v_i, r_{i,j}, v_j) ∈ Φ, and r_{i,j} refers to the relation between the two nodes in ConceptNet.
(3) Two self-supervision tasks, a relation classification task and an emotion alignment classification task, are designed to train the GCN (graph convolutional network) autoencoder: the relations between nodes are predicted to obtain common-sense knowledge feature vectors, and the emotion alignment binary classification task is used to learn emotion alignment features between aspect-opinion pairs, so as to obtain feature vectors containing background common sense and aspect-opinion emotion alignment.
The transformation process of the feature vectors can be expressed as:

h_i^(l+1) = σ( Σ_{r∈R} Σ_{j∈N_i^r} (1/c_{i,r}) W_r^(l) x_j + W_0^(l) x_i )

where N_i^r denotes all neighbor nodes of node i under relation r, c_{i,r} is a normalization constant that can be preset, g_i is the randomly initialized initial node feature vector, which is converted into h_i after the two-step graph convolution process (i.e., after aggregating the feature vectors of the neighborhood), and W_r^(l), W_0^(l) refer to the weight matrices of the l-th layer.
(4) The feature vectors generated by the GCN autoencoder are input into a graph feature reconstructor, which adapts the graph-node-level feature vectors to word-level vectors.
The feature mapping layer and the graph feature reconstructor are designed and expressed by taking sentences as units as follows:
x c =W c x’ c +b c
x’ recon =W recon x c +b recon
where x'_c is the sentence feature vector obtained by averaging the representations of all nodes in the graph after a subgraph is constructed for the sentence x and passed through the GCN auto-encoder; W_c, b_c, W_recon, and b_recon are trainable weight matrices and bias vectors; x'_recon is the sentence feature vector representation, obtained through the feature reconstructor, that is adapted to the word-level distribution space; and x_c serves as the final GCN vector representation of the sentence x.
Therefore, using a cosine similarity function, the loss of the reconstruction function is expressed as:
L_recon = 1 - cos(x'_recon, x_w)
that is, after the subgraph is constructed for the sentence x, it is input into the GCN auto-encoder to obtain the sentence feature vector representation, the reconstructed representation x'_recon is obtained through the reconstruction function, and the loss aligns x'_recon with the word-level sentence vector x_w.
(5) The vectors generated by the two encoders are concatenated as the vector of the sentence and input into a classifier, the probabilities of all possible polarities of the input text are calculated, and the emotion label with the maximum probability is selected as the final predicted emotion label, completing the emotion analysis task.
The feature vector of a sentence can be represented as:
x = [x_c; x_w]
wherein x_c is the common-sense knowledge vector encoding the emotion alignment of aspect-opinion words, x_w is the sentence vector with context information generated by the BERT encoder, and [;] denotes vector concatenation.
Therefore, in the step of completing the emotion analysis task, the emotion probability is calculated as:
p = softmax(Wx + b)
where W and b are the classifier parameters and C is the set of possible emotion polarities over which p is defined.
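The concatenation and classification of step (5) can be sketched as follows, with random placeholder weights and two assumed polarities (positive, negative):

```python
import numpy as np

# Sketch of step (5): concatenate the GCN vector x_c and the BERT vector x_w,
# score the C = 2 polarities with a linear layer, and take a softmax.
rng = np.random.default_rng(2)
x_c, x_w = rng.normal(size=8), rng.normal(size=16)
x = np.concatenate([x_c, x_w])            # x = [x_c; x_w]

W, b = rng.normal(size=(2, 24)), np.zeros(2)
logits = W @ x + b
probs = np.exp(logits - logits.max())     # numerically stable softmax
probs /= probs.sum()
label = int(np.argmax(probs))             # emotion label with maximum probability
```

The argmax picks the final predicted emotion label as described above.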
(6) Few-sample prototype network training: a prototype network model based on metric learning is adopted; given k samples of the positive emotion category and k samples of the negative emotion category, the prototype of each emotion category is calculated as:
c_± = (1/k) Σ_{j=1..k} x_j^±
wherein x_j^± denotes the feature vector representation of the j-th sample of the positive/negative emotion category;
meanwhile, the emotion probability of a given sentence x is output, and the calculation formula is as follows:
wherein, is the ith emotion polarity; q denotes the sample to be tested; d () denotes the euclidean distance between the given two vectors.
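The prototype computation and distance-based probability of step (6) can be sketched as follows, on synthetic support and query vectors:

```python
import numpy as np

# Sketch of step (6): each class prototype is the mean of its k support vectors;
# the query probability is a softmax over negative Euclidean distances.
# All data below is synthetic.
rng = np.random.default_rng(3)
k, d = 5, 24
pos_support = rng.normal(loc=+1.0, size=(k, d))   # k positive-class samples
neg_support = rng.normal(loc=-1.0, size=(k, d))   # k negative-class samples

prototypes = np.stack([pos_support.mean(axis=0),  # c_pos
                       neg_support.mean(axis=0)]) # c_neg

q = rng.normal(loc=+1.0, size=d)                  # query sample near the positive class
dists = np.linalg.norm(prototypes - q, axis=1)    # d(q, c_i)
scores = np.exp(-dists)
probs = scores / scores.sum()                     # softmax over negative distances
```

Because the query is drawn near the positive cluster, its probability mass concentrates on the positive prototype.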
(7) An Adam optimizer is adopted to jointly train the aspect-opinion emotion alignment task and the emotion analysis task to obtain the optimal parameters; the loss function can be expressed as:
L = L_recon + L_softmax
wherein L_recon denotes the loss of the sentence vector representation reconstruction, and L_softmax denotes the cross-entropy loss function of the emotion classification task.
In summary, the present application utilizes knowledge enhancement and few-sample learning techniques to capture both domain-shared and domain-specific features, effectively solving the problem that existing methods cannot capture domain-invariant features and domain-specific features at the same time, and further improving the prediction effect of emotion analysis on target-domain data.
The embodiment also provides a few-sample cross-domain emotion analysis device, which comprises:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method illustrated in fig. 1.
The few-sample cross-domain emotion analysis device can execute the few-sample cross-domain emotion analysis method provided by the method embodiment of the invention, can execute any combination of the implementation steps of the method embodiment, and has the corresponding functions and beneficial effects of the method.
The embodiment of the application also discloses a computer program product or a computer program, which comprises computer instructions, and the computer instructions are stored in a computer readable storage medium. The computer instructions may be read by a processor of a computer device from a computer-readable storage medium, and executed by the processor to cause the computer device to perform the method illustrated in fig. 1.
In alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flow charts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of larger operations are performed independently.
Furthermore, although the present invention is described in the context of functional modules, it should be understood that, unless otherwise stated to the contrary, one or more of the described functions and/or features may be integrated in a single physical device and/or software module, or one or more functions and/or features may be implemented in a separate physical device or software module. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary for an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be understood within the ordinary skill of an engineer, given the nature, function, and internal relationship of the modules. Accordingly, those of ordinary skill in the art will be able to practice the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative of and not intended to limit the scope of the invention, which is defined by the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the foregoing description of the specification, reference to the description of "one embodiment/example," "another embodiment/example," or "certain embodiments/examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.
Claims (10)
1. A few-sample cross-domain emotion analysis method, characterized by comprising the following steps:
sentence data is obtained, the sentence data is input into a trained BERT encoder, and a first feature vector is obtained;
inputting sentence data into a trained GCN encoder to obtain a second feature vector;
performing feature fusion on the first feature vector and the second feature vector to obtain vector representation of the sentence;
inputting the vector representation of the sentence into the trained few-sample prototype network model, and outputting the emotion polarity of the sentence;
wherein the few-sample prototype network model is obtained by training in the following way: acquiring labeled samples of preset positive and negative emotions, obtaining the sentence vector representations of the labeled samples, mapping the sentence vector representations to a feature space, and taking the average of the sentence vector representations with the same polarity as the prototype representation of the corresponding emotion polarity.
2. The method of claim 1, wherein the BERT encoder is trained by:
acquiring a text of a source domain or a target domain, and training a BERT encoder to acquire rich domain feature knowledge; wherein the feature vector of each sentence in the text is represented as:
x_w = h_[cls] = BERT(x)
in the formula, x represents the input sentence, h_[cls] is the hidden vector representation of the special [CLS] token prepended to the sentence by the BERT encoder, and BERT(·) denotes the sentence encoder.
3. The method of claim 1, wherein the GCN encoder is trained by:
designing two self-supervision tasks, and training a GCN encoder; the two self-supervision tasks comprise a relation classification task and an emotion alignment classification task;
the method comprises the following steps that a relation classification task is given to any two nodes, and the relation of the two nodes can be judged based on a relation classification model of a GCN encoder; the emotion alignment task gives two aspect words and viewpoint words, and whether the two aspect words and the viewpoint words have the same emotion polarity needs to be judged based on an emotion alignment model of a GCN encoder; the two self-supervision tasks aim at the common domain relation knowledge and learn the emotional alignment characteristics between the aspect viewpoint pairs so as to obtain a characteristic vector containing the common background knowledge and the aspect viewpoint word emotional alignment.
4. The method for analyzing emotion of a small sample across domains as recited in claim 3, wherein in the relation classification task, the feature vector of a node is obtained by fusing the representations of the neighbor nodes of the node, and the fusion process is represented as:
h_i^(l+1) = σ( Σ_{r∈R} Σ_{j∈N_i^r} (1/c_{i,r}) W_r^(l) x_j + W_0^(l) x_i )
wherein N_i^r represents all neighbor nodes of node i under the relation r; g_i is the randomly initialized initial node feature vector, which is transformed into h_i after the two-step graph convolution process; σ denotes the ReLU activation function; l denotes the l-th graph convolution layer; c_{i,r} denotes the number of neighbor nodes of node i; W_r^(l) and W_0^(l) represent the parameter matrices to be trained; and x_j represents the feature vector representation of node j;
the penalty function generated by the relationship classification task is:
wherein, s (v) i ,r i,j ,v j ) Expressed is a matrix analysis score function; r r The representation is a vector representation of the relationship r; t denotes the set of nodes of the graph G, and y denotes whether a relationship r exists between a given node i and a given node j i,j If yes, y takes a value of 1, and if not, y takes a value of 0;
the emotion alignment classification task generates a loss function as follows:
wherein N represents the number of unlabeled samples of the source field and the target field; p k The aspect word-emotion word pair contained in the kth unlabeled sample is shown.
5. The method for analyzing emotion of small sample across domains as claimed in claim 1, further comprising the step of constructing a common sense knowledge graph:
based on the unlabeled samples of the source domain and the target domain, taking the sentence as the unit, taking the words of the specified parts of speech, namely nouns, verbs, and adjectives, in the sentence as link seeds, and linking out the next-hop knowledge triples through the ConceptNet common-sense knowledge base; and finally deduplicating and merging the subgraphs linked from all sentences to form a domain common-sense knowledge graph, providing knowledge support for cross-domain emotion analysis;
wherein the constructed domain common-sense knowledge graph is represented as:
G = (V, Φ)
where V is the set of nodes v_i and Φ is the set of relation triplets (v_i, r_{i,j}, v_j).
6. The method for analyzing emotion of small sample across domains as recited in claim 5, further comprising the steps of:
and calculating the importance degree of the nodes in the knowledge graph by adopting an attention mechanism, wherein the importance degree of each node is expressed as:
wherein e_i denotes the vector representation of the i-th node, α_i denotes the importance of the i-th node, e_k denotes the vector representation of the k-th node, and N_i denotes the set of all neighbor nodes of the i-th node.
7. The method of claim 1, wherein the performing feature fusion on the first feature vector and the second feature vector to obtain a vector representation of a sentence comprises:
and splicing the first feature vector and the second feature vector, calculating the probability of all possible polarities of the input text, selecting the emotion label with the maximum probability as a final predicted emotion label, and finishing the emotion analysis task, wherein the feature vector of each sentence is expressed as follows:
x = [x_w; x_g]
wherein x_g is the common-sense knowledge vector encoding the emotion alignment of aspect-opinion words, x_w is the sentence vector with context information generated by the BERT encoder, and [;] denotes vector concatenation.
8. The method of claim 1, wherein the sentence vector representation of the sample is inputted into the prototype network model with less samples, and the following steps are performed:
for k samples of the positive and k samples of the negative emotion category, the prototype of each emotion category is calculated as:
c_± = (1/k) Σ_{j=1..k} x_j^±
in the formula, x_j^± denotes the feature vector representation of the j-th sample of the positive/negative emotion category;
and outputting the emotion probability of the sentence x, the emotion probability being calculated as:
p(y = c_i | q) = exp(-d(q, c_i)) / Σ_{i'} exp(-d(q, c_{i'}))
in the formula, c_i is the prototype of the i-th emotion polarity; q denotes the sample to be tested; and d(·, ·) denotes the Euclidean distance between the two given vectors.
9. The method for analyzing emotion of small sample across domains as recited in claim 1, further comprising the steps of:
and training the few-sample prototype network model by adopting an Adam optimizer, wherein the loss function in the training process is represented as:
L=L recon +L softmax
wherein L is recon Representing the loss of sentence vector representation reconstruction; l is softmax The cross entropy loss function of the emotion classification task is shown.
10. A small sample cross-domain emotion analysis device, comprising:
at least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method according to any one of claims 1-9.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210661020.8A CN115080688B (en) | 2022-06-13 | 2022-06-13 | Cross-domain emotion analysis method and device for few samples |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202210661020.8A CN115080688B (en) | 2022-06-13 | 2022-06-13 | Cross-domain emotion analysis method and device for few samples |
Publications (2)
Publication Number | Publication Date |
---|---|
CN115080688A true CN115080688A (en) | 2022-09-20 |
CN115080688B CN115080688B (en) | 2024-06-04 |
Family
ID=83251179
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202210661020.8A Active CN115080688B (en) | 2022-06-13 | 2022-06-13 | Cross-domain emotion analysis method and device for few samples |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115080688B (en) |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115905518A (en) * | 2022-10-17 | 2023-04-04 | 华南师范大学 | Emotion classification method, device and equipment based on knowledge graph and storage medium |
CN116562305A (en) * | 2023-07-10 | 2023-08-08 | 江西财经大学 | Aspect emotion four-tuple prediction method and system |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN110008338A (en) * | 2019-03-04 | 2019-07-12 | 华南理工大学 | A kind of electric business evaluation sentiment analysis method of fusion GAN and transfer learning |
CN112860901A (en) * | 2021-03-31 | 2021-05-28 | 中国工商银行股份有限公司 | Emotion analysis method and device integrating emotion dictionaries |
CN113722439A (en) * | 2021-08-31 | 2021-11-30 | 福州大学 | Cross-domain emotion classification method and system based on antagonism type alignment network |
US20220092267A1 (en) * | 2020-09-23 | 2022-03-24 | Jingdong Digits Technology Holding Co., Ltd. | Method and system for aspect-level sentiment classification by graph diffusion transformer |
Non-Patent Citations (1)
Title |
---|
YU SHANSHAN; SU JINDIAN; LI PENGFEI: "A sentence emotion classification method based on self-attention", Computer Science, no. 04, 30 April 2020 (2020-04-30), pages 205 - 207 *
Cited By (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115905518A (en) * | 2022-10-17 | 2023-04-04 | 华南师范大学 | Emotion classification method, device and equipment based on knowledge graph and storage medium |
CN115905518B (en) * | 2022-10-17 | 2023-10-20 | 华南师范大学 | Emotion classification method, device, equipment and storage medium based on knowledge graph |
CN116562305A (en) * | 2023-07-10 | 2023-08-08 | 江西财经大学 | Aspect emotion four-tuple prediction method and system |
CN116562305B (en) * | 2023-07-10 | 2023-09-12 | 江西财经大学 | Aspect emotion four-tuple prediction method and system |
Also Published As
Publication number | Publication date |
---|---|
CN115080688B (en) | 2024-06-04 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Li et al. | Natural language generation using deep learning to support MOOC learners | |
Yang et al. | SGM: sequence generation model for multi-label classification | |
Gu et al. | Stack-captioning: Coarse-to-fine learning for image captioning | |
CN110188202B (en) | Training method and device of semantic relation recognition model and terminal | |
CN109344404B (en) | Context-aware dual-attention natural language reasoning method | |
CN115080688A (en) | Method and device for analyzing low-sample cross-domain emotion | |
CN112115700A (en) | Dependency syntax tree and deep learning based aspect level emotion analysis method | |
CN111368082A (en) | Emotion analysis method for domain adaptive word embedding based on hierarchical network | |
JP2022128441A (en) | Augmenting textual data for sentence classification using weakly-supervised multi-reward reinforcement learning | |
CN115687638A (en) | Entity relation combined extraction method and system based on triple forest | |
Jafaritazehjani et al. | Style versus Content: A distinction without a (learnable) difference? | |
Mu et al. | A BERT model generates diagnostically relevant semantic embeddings from pathology synopses with active learning | |
Nelimarkka | Computational thinking and social science: Combining programming, methodologies and fundamental concepts | |
CN113283488B (en) | Learning behavior-based cognitive diagnosis method and system | |
CN114443846A (en) | Classification method and device based on multi-level text abnormal composition and electronic equipment | |
CN117370736A (en) | Fine granularity emotion recognition method, electronic equipment and storage medium | |
CN112395858A (en) | Multi-knowledge point marking method and system fusing test question data and answer data | |
CN117350286A (en) | Natural language intention translation method oriented to intention driving data link network | |
CN117111952A (en) | Code complement method and device based on generation type artificial intelligence and medium | |
CN115730608A (en) | Learner online communication information analysis method and system | |
CN114626529A (en) | Natural language reasoning fine-tuning method, system, device and storage medium | |
CN111723301B (en) | Attention relation identification and labeling method based on hierarchical theme preference semantic matrix | |
CN113536808A (en) | Reading understanding test question difficulty automatic prediction method introducing multiple text relations | |
CN113792144A (en) | Text classification method based on semi-supervised graph convolution neural network | |
CN113869518A (en) | Visual common sense reasoning method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||
GR01 | Patent grant |