CN115080688A - Few-sample cross-domain emotion analysis method and device - Google Patents

Few-sample cross-domain emotion analysis method and device Download PDF

Info

Publication number
CN115080688A
Authority
CN
China
Prior art keywords
emotion
sentence
vector
node
feature vector
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202210661020.8A
Other languages
Chinese (zh)
Other versions
CN115080688B (en)
Inventor
蔡毅
任浩鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT filed Critical South China University of Technology SCUT
Priority to CN202210661020.8A priority Critical patent/CN115080688B/en
Publication of CN115080688A publication Critical patent/CN115080688A/en
Application granted granted Critical
Publication of CN115080688B publication Critical patent/CN115080688B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/3331 Query processing
    • G06F 16/334 Query execution
    • G06F 16/3344 Query execution using natural language analysis
    • G06F 16/35 Clustering; Classification
    • G06F 16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F 16/367 Ontology
    • G06F 40/00 Handling natural language data
    • G06F 40/20 Natural language analysis
    • G06F 40/205 Parsing
    • G06F 40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a few-sample cross-domain emotion analysis method and device. The method comprises the following steps: obtaining sentence data and inputting the sentence data into a trained BERT encoder to obtain a first feature vector; inputting the sentence data into a trained GCN encoder to obtain a second feature vector; performing feature fusion on the first feature vector and the second feature vector to obtain a vector representation of the sentence; and inputting the vector representation of the sentence into a trained few-sample prototype network model, which outputs the emotion polarity of the sentence. By capturing domain-shared features and domain-specific features with the few-sample learning technique, the method improves the emotion prediction effect of a model transferred from the source domain to the target domain. The invention can be widely applied in the technical field of natural language processing.

Description

Few-sample cross-domain emotion analysis method and device
Technical Field
The invention relates to the technical field of natural language processing, and in particular to a few-sample cross-domain emotion analysis method and device.
Background
Emotion analysis is the task of automatically classifying the emotion polarity of text data. Emotion analysis models based on deep neural networks currently achieve remarkable performance, but such neural methods require a large number of labeled samples to reach an ideal prediction effect, and labeling training samples costs considerable manpower and time.
To alleviate the reliance on large amounts of manually labeled data, the cross-domain emotion analysis task has recently become a research focus; its goal is to transfer knowledge from a label-rich source domain to a label-scarce target domain. The main challenge of cross-domain emotion analysis is overcoming the difference between the source domain and the target domain, especially when the two domains differ significantly. Facing this challenge, many studies propose extracting domain-invariant grammatical features as a bridge for domain transfer, thereby reducing inter-domain differences. However, language expression is diverse, so domain-invariant grammatical information alone can still cause errors when transferring emotion knowledge. Meanwhile, current cross-domain emotion analysis methods usually focus on learning and mining domain-invariant features while ignoring domain-specific features; as the domain gap grows, the available domain-invariant features become limited, which degrades cross-domain emotion analysis performance.
Disclosure of Invention
In order to solve, at least to a certain extent, at least one of the technical problems in the prior art, the invention aims to provide a few-sample cross-domain emotion analysis method and device.
The technical scheme adopted by the invention is as follows:
a few-sample cross-domain emotion analysis method comprises the following steps:
sentence data is obtained, and the sentence data is input into a trained BERT encoder to obtain a first feature vector;
inputting sentence data into a trained GCN encoder to obtain a second feature vector;
performing feature fusion on the first feature vector and the second feature vector to obtain vector representation of the sentence;
inputting the vector representation of the sentence into the trained few-sample prototype network model, and outputting the emotion polarity of the sentence;
wherein the few-sample prototype network model is trained as follows: labeled samples of preset positive and negative emotions are obtained, sentence vector representations of the labeled samples are computed and mapped into the same feature space, and the average of the sentence vectors sharing the same polarity is taken as the prototype representation of the corresponding emotion polarity.
Further, the BERT encoder is trained by:
acquiring text of the source domain or the target domain, and training the BERT encoder on it to acquire rich domain feature knowledge; the feature vector of each sentence in the text is represented as:
x_w = h_[CLS] = BERT(x)
where x represents the input sentence, h_[CLS] is the hidden vector of the special [CLS] token prepended to the sentence by the BERT encoder, and the BERT encoder serves as the sentence encoder.
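For illustration, the sentence-level feature x_w can be extracted as the [CLS] hidden vector of a BERT model. The following is a minimal sketch assuming PyTorch and the HuggingFace transformers library; the checkpoint name, maximum length, and example sentence are illustrative assumptions rather than details taken from the patent.

```python
import torch
from transformers import BertModel, BertTokenizer

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")  # checkpoint name is an assumption
bert = BertModel.from_pretrained("bert-base-uncased")

def encode_sentence(sentence: str) -> torch.Tensor:
    """Return x_w = h_[CLS], the hidden vector of the [CLS] token prepended to the sentence."""
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True, max_length=128)
    with torch.no_grad():
        outputs = bert(**inputs)
    return outputs.last_hidden_state[:, 0, :]  # shape: (1, hidden_size)

x_w = encode_sentence("The battery life of this laptop is great.")
```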
Further, the GCN encoder is trained by:
designing two self-supervised tasks to train the GCN encoder, the two tasks comprising a relation classification task and an emotion alignment classification task;
in the relation classification task, given any two nodes, a relation classification model based on the GCN encoder judges the relation between them; in the emotion alignment task, given an aspect word and an opinion word, an emotion alignment model based on the GCN encoder judges whether they carry the same emotion polarity; through these two self-supervised tasks, the encoder learns common domain relation knowledge and the emotion alignment features between aspect-opinion pairs, so as to obtain feature vectors containing both common background knowledge and aspect-opinion emotion alignment information.
Further, in the relation classification task, the feature vector of a node is obtained by fusing the representations of its neighboring nodes, and the fusion process is expressed as:
x_i^(l+1) = σ( Σ_{r∈R} Σ_{j∈N_i^r} (1/c_{i,r}) W_r^(l) x_j^(l) + W_0^(l) x_i^(l) )
h_i = x_i^(2), with x_i^(0) = g_i
where N_i^r denotes all neighbor nodes of node i under relation r; g_i is the randomly initialized node feature vector, which is converted into h_i after the two-layer graph convolution; σ denotes the ReLU activation function; l denotes the l-th graph convolution layer; c_{i,r} is the number of neighbor nodes of node i under relation r; x_j^(l) is the feature vector representation of node j; and W_r^(l) and W_0^(l) are parameter matrices to be trained.
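As a sketch of the update rule above, the following relation-aware graph convolution layer (PyTorch) averages the neighbor features under each relation, applies a relation-specific weight matrix, and adds a self-loop transform. The class name, the adjacency-list input format, and the explicit self-loop weight W_0 are assumptions made for illustration.

```python
import torch
import torch.nn as nn

class RelGraphConvLayer(nn.Module):
    """One layer of the relation-aware graph convolution sketched above:
    x_i^(l+1) = ReLU( sum_r sum_{j in N_i^r} (1/c_{i,r}) W_r x_j + W_0 x_i )."""

    def __init__(self, in_dim: int, out_dim: int, num_relations: int):
        super().__init__()
        self.w_rel = nn.Parameter(torch.empty(num_relations, in_dim, out_dim))  # W_r, one per relation
        self.w_self = nn.Linear(in_dim, out_dim, bias=False)                    # W_0, self-loop transform
        nn.init.xavier_uniform_(self.w_rel)

    def forward(self, x: torch.Tensor, neighbours: list) -> torch.Tensor:
        # x: (num_nodes, in_dim); neighbours[i] maps relation id r -> list of neighbour node ids N_i^r
        out = []
        for i in range(x.size(0)):
            h_i = self.w_self(x[i])
            for r, nbrs in neighbours[i].items():
                if nbrs:
                    # mean over the neighbours under relation r implements the 1/c_{i,r} normalisation
                    h_i = h_i + x[nbrs].mean(dim=0) @ self.w_rel[r]
            out.append(torch.relu(h_i))
        return torch.stack(out)

# Two stacked layers turn the random initial features g into the node representations h:
# h = layer2(layer1(g, neighbours), neighbours)
```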
The loss function generated by the relation classification task is:
s(v_i, r_{i,j}, v_j) = h_i^T diag(R_r) h_j
L_rel = - Σ_{(i,j)∈T} [ y_{i,j} log sigmoid(s(v_i, r_{i,j}, v_j)) + (1 - y_{i,j}) log(1 - sigmoid(s(v_i, r_{i,j}, v_j))) ]
where s(v_i, r_{i,j}, v_j) is a matrix-factorization score function; R_r is the vector representation of relation r; T denotes the set of node pairs sampled from graph G; and y_{i,j} indicates whether relation r exists between node i and node j, taking the value 1 if it exists and 0 otherwise.
the emotion alignment classification task generates a loss function as follows:
Figure BDA0003690851950000031
wherein N represents the number of unlabeled samples of the source field and the target field; p k The aspect word-emotion word pair contained in the kth unlabeled sample is shown.
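A sketch of the two self-supervised pre-training objectives is given below (PyTorch). The DistMult-style score s(v_i, r, v_j) = h_i^T diag(R_r) h_j and the bilinear pair classifier are assumptions chosen to match the description of a matrix-factorization score function and a binary emotion-alignment classifier; they are not claimed to be the patent's exact parameterization.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class GraphPretrainLosses(nn.Module):
    """Relation-classification and emotion-alignment losses over GCN node representations h."""

    def __init__(self, num_relations: int, dim: int):
        super().__init__()
        self.rel_emb = nn.Embedding(num_relations, dim)  # R_r, one vector per relation
        self.align_head = nn.Bilinear(dim, dim, 1)       # scores an (aspect, opinion) vector pair

    def relation_loss(self, h, triples, labels):
        # h: (num_nodes, dim); triples: LongTensor (m, 3) of (i, r, j); labels: (m,), 1 if relation r holds
        i, r, j = triples[:, 0], triples[:, 1], triples[:, 2]
        score = (h[i] * self.rel_emb(r) * h[j]).sum(dim=-1)   # s(v_i, r_{i,j}, v_j)
        return F.binary_cross_entropy_with_logits(score, labels.float())

    def alignment_loss(self, aspect_vecs, opinion_vecs, same_polarity):
        # predicts whether an aspect word and an opinion word carry the same emotion polarity
        logit = self.align_head(aspect_vecs, opinion_vecs).squeeze(-1)
        return F.binary_cross_entropy_with_logits(logit, same_polarity.float())
```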
Further, the method also comprises the step of constructing the common sense knowledge graph:
based on the unlabeled samples of the source domain and the target domain, taking each sentence as a unit, the words whose parts of speech are nouns, verbs, or adjectives in the sentence are used as link seeds, and the knowledge triples one hop away are linked through the ConceptNet common-sense knowledge base; finally, the subgraphs linked from all sentences are de-duplicated and merged to form the domain common-sense knowledge graph, providing knowledge support for cross-domain emotion analysis;
wherein the constructed domain common-sense knowledge graph is represented as:
G = (V, Φ)
where v_i ∈ V is a node of the constructed graph, (v_i, r_{i,j}, v_j) ∈ Φ is a relation triple, and r_{i,j} denotes the relation between nodes v_i and v_j.
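The subgraph-linking step can be sketched as follows: POS-tag each sentence, keep noun/verb/adjective seeds, query ConceptNet for one-hop triples, and merge the de-duplicated triples of all sentences. The spaCy model name and the use of ConceptNet's public REST API (with a per-word result limit) are illustrative assumptions.

```python
import requests
import spacy

nlp = spacy.load("en_core_web_sm")      # POS tagger; model name is an assumption
SEED_POS = {"NOUN", "VERB", "ADJ"}      # the specified seed parts of speech

def one_hop_triples(word, limit=20):
    """Return one-hop (start, relation, end) triples for a seed word from ConceptNet."""
    resp = requests.get(f"https://api.conceptnet.io/c/en/{word.lower()}", params={"limit": limit})
    triples = set()
    for edge in resp.json().get("edges", []):
        triples.add((edge["start"]["label"], edge["rel"]["label"], edge["end"]["label"]))
    return triples

def build_domain_graph(sentences):
    """Union of the per-sentence subgraphs, de-duplicated into one domain common-sense graph."""
    graph = set()
    for sentence in sentences:
        seeds = [tok.text for tok in nlp(sentence) if tok.pos_ in SEED_POS]
        for seed in seeds:
            graph |= one_hop_triples(seed)
    return graph
```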
Further, the method also comprises the following steps:
calculating the importance of the nodes in the knowledge graph using an attention mechanism, wherein the importance of each node is expressed as:
α_i = exp(f(e_i)) / Σ_{k∈N_i} exp(f(e_k))
where e_i is the vector representation of the i-th node, f(·) is a learnable scoring function, α_i is the importance of the i-th node, e_k is the vector representation of the k-th node, and N_i denotes the set of all neighbor nodes of the i-th node.
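A minimal sketch of this knowledge-aware attention is shown below; the dot-product scoring vector is an assumption standing in for whatever scoring function f(·) the model uses.

```python
import torch
import torch.nn as nn

class KnowledgeAttention(nn.Module):
    """Computes importance weights alpha_i in [0, 1] over the nodes linked for one sentence."""

    def __init__(self, dim: int):
        super().__init__()
        self.score = nn.Linear(dim, 1, bias=False)  # f(e_i): a learned scoring vector (assumption)

    def forward(self, node_vecs: torch.Tensor) -> torch.Tensor:
        # node_vecs: (num_linked_nodes, dim) -> alpha: (num_linked_nodes,), summing to 1
        return torch.softmax(self.score(node_vecs).squeeze(-1), dim=0)

# Weighted pooling of the linked nodes into one knowledge vector, e.g. x_g:
# alpha = KnowledgeAttention(128)(node_vecs); x_g = (alpha.unsqueeze(-1) * node_vecs).sum(dim=0)
```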
Further, the performing feature fusion on the first feature vector and the second feature vector to obtain a vector representation of the sentence includes:
splicing the first feature vector and the second feature vector, calculating the probabilities of all possible polarities of the input text, and selecting the emotion label with the largest probability as the final predicted emotion label to complete the emotion analysis task; in this step, the feature vector of each sentence is expressed as:
x = [x_w ; x_g]
where x_g is the common-sense knowledge vector concerning aspect-opinion emotion alignment obtained from the GCN encoder, x_w is the context-aware sentence vector generated by the BERT encoder, and [;] denotes vector concatenation.
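A sketch of the fusion-and-classification step follows; the hidden sizes and the single linear head are assumptions, the point being only the concatenation x = [x_w ; x_g] followed by a softmax over polarities.

```python
import torch
import torch.nn as nn

class FusionClassifier(nn.Module):
    """Concatenates the BERT sentence vector x_w and the knowledge vector x_g and scores polarities."""

    def __init__(self, bert_dim: int = 768, graph_dim: int = 128, num_polarities: int = 2):
        super().__init__()
        self.head = nn.Linear(bert_dim + graph_dim, num_polarities)

    def forward(self, x_w: torch.Tensor, x_g: torch.Tensor) -> torch.Tensor:
        x = torch.cat([x_w, x_g], dim=-1)              # x = [x_w ; x_g]
        return torch.softmax(self.head(x), dim=-1)     # argmax gives the predicted emotion label

probs = FusionClassifier()(torch.randn(1, 768), torch.randn(1, 128))
```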
Further, after the sentence vector representation of a sample is input into the few-sample prototype network model, the following steps are performed:
for the k samples of the positive and of the negative emotion category, the prototype of each emotion category is calculated:
c_pos/neg = (1/k) Σ_{j=1}^{k} x_j^(pos/neg)
where x_j^(pos/neg) is the feature vector representation of the j-th sample of the positive/negative emotion category;
and outputting the emotion probability of the sentence x, the emotion probability being calculated as:
p(y = c_i | x) = exp(-d(q, c_i)) / Σ_{i'} exp(-d(q, c_{i'}))
where c_i is the prototype of the i-th emotion polarity, q denotes the sample to be tested, and d(·,·) denotes the Euclidean distance between two given vectors.
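The prototype computation and the distance-based prediction can be sketched as follows (PyTorch); the dimensions and the random example tensors are placeholders.

```python
import torch

def prototypes(support_vecs: torch.Tensor) -> torch.Tensor:
    """support_vecs: (num_classes, k, dim), k labelled sentence vectors per polarity.
    Each prototype is the mean of the k vectors of its polarity."""
    return support_vecs.mean(dim=1)                            # (num_classes, dim)

def polarity_probs(query: torch.Tensor, protos: torch.Tensor) -> torch.Tensor:
    """query: (dim,). Softmax over the negative Euclidean distances to the prototypes."""
    d = torch.cdist(query.unsqueeze(0), protos).squeeze(0)     # (num_classes,)
    return torch.softmax(-d, dim=0)

# Example with 2 polarities, k = 5 support samples and 896-dim fused sentence vectors
support = torch.randn(2, 5, 896)
probs = polarity_probs(torch.randn(896), prototypes(support))
```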
Further, the method also comprises the following steps:
training the few-sample prototype network model by adopting an Adam optimizer, wherein the loss function in the training process is represented as:
L = L_recon + L_softmax
where L_recon represents the sentence-vector reconstruction loss and L_softmax represents the cross-entropy loss of the emotion classification task.
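A minimal runnable sketch of optimising the joint objective L = L_recon + L_softmax with Adam is given below. The tiny linear layers, the random tensors, and the learning rate are placeholders for the full model and data; the reconstruction term mirrors the cosine-based graph-feature reconstructor described in the embodiment below.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

mapper = nn.Linear(128, 768)           # maps GCN sentence features toward the word-level space
reconstructor = nn.Linear(768, 128)    # reconstructs the GCN sentence feature back again
classifier = nn.Linear(768 + 768, 2)   # emotion classifier over the fused representation
params = list(mapper.parameters()) + list(reconstructor.parameters()) + list(classifier.parameters())
optimizer = torch.optim.Adam(params, lr=2e-5)

for step in range(100):
    x_g = torch.randn(8, 128)          # stand-in GCN sentence vectors
    x_w = torch.randn(8, 768)          # stand-in BERT sentence vectors
    y = torch.randint(0, 2, (8,))      # stand-in emotion labels
    x_c = mapper(x_g)
    l_recon = (1 - F.cosine_similarity(reconstructor(x_c), x_g, dim=-1)).mean()
    l_softmax = F.cross_entropy(classifier(torch.cat([x_w, x_c], dim=-1)), y)
    loss = l_recon + l_softmax         # L = L_recon + L_softmax
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
```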
The other technical scheme adopted by the invention is as follows:
a small sample cross-domain emotion analysis device, comprising:
at least one processor;
at least one memory for storing at least one program;
wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method described above.
The other technical scheme adopted by the invention is as follows:
a computer readable storage medium in which a processor executable program is stored, which when executed by a processor is for performing the method as described above.
The beneficial effects of the invention are as follows: by capturing domain-shared features and domain-specific features with the few-sample learning technique, the method improves the emotion prediction effect of a model transferred from the source domain to the target domain.
Drawings
To illustrate the embodiments of the present invention or the technical solutions in the prior art more clearly, the drawings of the embodiments of the present invention or of the related prior art are described below. It should be understood that the drawings in the following description are provided only for the convenience and clarity of describing some embodiments of the technical solutions of the present invention, and that those skilled in the art can obtain other drawings from these drawings without creative effort.
FIG. 1 is a flowchart illustrating steps of a method for cross-domain emotion analysis with few samples according to an embodiment of the present invention;
FIG. 2 is a flowchart of a knowledge-enhancement-based few-sample cross-domain emotion analysis method in an embodiment of the present invention;
FIG. 3 is a schematic diagram of the model structure of the knowledge-enhancement-based few-sample cross-domain emotion analysis method in the embodiment of the present invention.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention. The step numbers in the following embodiments are provided only for convenience of illustration, the order between the steps is not limited at all, and the execution order of each step in the embodiments can be adapted according to the understanding of those skilled in the art.
In the description of the present invention, it should be understood that the orientation or positional relationship referred to in the description of the orientation, such as upper, lower, front, rear, left, right, etc., is based on the orientation or positional relationship shown in the drawings only for the convenience of description of the present invention and simplification of the description, and does not indicate or imply that the device or element referred to must have a specific orientation, be constructed and operated in a specific orientation, and thus, should not be construed as limiting the present invention.
In the description of the present invention, "several" means one or more, "a plurality of" means two or more, and terms such as "greater than", "less than" and "exceeding" are understood as excluding the stated number, while terms such as "above", "below" and "within" are understood as including the stated number. If the terms "first" and "second" are used, they are only for distinguishing technical features and are not to be understood as indicating or implying relative importance, implicitly indicating the number of the indicated technical features, or implicitly indicating the precedence of the indicated technical features.
In the description of the present invention, unless otherwise explicitly limited, terms such as arrangement, installation, connection and the like should be understood in a broad sense, and those skilled in the art can reasonably determine the specific meanings of the above terms in the present invention in combination with the specific contents of the technical solutions.
As shown in FIG. 1, the present embodiment provides a few-sample cross-domain emotion analysis method, comprising the following steps:
S1, obtaining sentence data, inputting the sentence data into a trained BERT encoder, and obtaining a first feature vector;
s2, inputting sentence data into the trained GCN encoder to obtain a second feature vector;
s3, performing feature fusion on the first feature vector and the second feature vector to obtain vector representation of the sentence;
and S4, inputting the vector representation of the sentence into the trained few-sample prototype network model, and outputting the emotion polarity of the sentence. The few-sample prototype network model is trained as follows: labeled samples of preset positive and negative emotions are obtained, sentence vector representations of the labeled samples are computed and mapped into the same feature space, and the average of the sentence vectors sharing the same polarity is taken as the prototype representation of the corresponding emotion polarity.
As an alternative embodiment, the BERT encoder is trained by:
acquiring text of the source domain or the target domain, and training the BERT encoder on it to acquire rich domain feature knowledge; the feature vector of each sentence in the text is represented as:
x_w = h_[CLS] = BERT(x)
where x represents the input sentence and BERT(·) is the sentence encoder.
As an alternative embodiment, the GCN encoder is trained by:
designing two self-supervision tasks, and training a GCN encoder; the two self-supervision tasks comprise a relation classification task and an emotion alignment classification task;
In the relation classification task, given any two nodes, the model must judge the relation between them; in the emotion alignment task, given an aspect word and an opinion word, the model must judge whether they carry the same emotion polarity. Through these two self-supervised tasks, the encoder learns common domain relation knowledge and the emotion alignment features between aspect-opinion pairs, yielding feature vectors that contain both common background knowledge and aspect-opinion emotion alignment information.
Specifically, the two self-supervised tasks, namely the relation classification task and the emotion alignment classification task, are designed to pre-train the GCN auto-encoder: the relations between nodes are predicted to obtain common-sense knowledge feature vectors, and the emotion alignment binary classification task is used to learn the emotion alignment features between aspect-opinion pairs, so as to obtain feature vectors containing both background common sense and aspect-opinion emotion alignment. The conversion process of the feature vectors may be expressed as:
x_i^(l+1) = σ( Σ_{r∈R} Σ_{j∈N_i^r} (1/c_{i,r}) W_r^(l) x_j^(l) + W_0^(l) x_i^(l) )
h_i = x_i^(2), with x_i^(0) = g_i
where N_i^r denotes all neighbor nodes of node i under relation r, c_{i,r} is a normalization constant that can be preset, g_i is the randomly initialized node feature vector, which is converted into h_i after the two graph-convolution layers, W_r^(l) and W_0^(l) are the weight matrices of the l-th layer, and σ denotes the ReLU activation function. Meanwhile, the loss function generated by the relation classification self-supervised task is:
s(v_i, r_{i,j}, v_j) = h_i^T diag(R_r) h_j
L_rel = - Σ_{(i,j)∈T} [ y_{i,j} log sigmoid(s(v_i, r_{i,j}, v_j)) + (1 - y_{i,j}) log(1 - sigmoid(s(v_i, r_{i,j}, v_j))) ]
where s(v_i, r_{i,j}, v_j) is a matrix-factorization score function and R_r is the vector representation of relation r.
The loss function generated by the emotion alignment classification self-supervised learning task is:
L_align = - (1/N) Σ_{k=1}^{N} Σ_{(a,o)∈P_k} [ y_{a,o} log p_{a,o} + (1 - y_{a,o}) log(1 - p_{a,o}) ]
where N represents the number of unlabeled samples of the source domain and the target domain; P_k is the set of aspect word-opinion word pairs contained in the k-th unlabeled sample; p_{a,o} is the predicted probability that the pair shares the same emotion polarity; and y_{a,o} is the corresponding binary label.
As an alternative embodiment, the method also comprises the step of constructing the common sense knowledge map:
based on the unlabeled samples of the source domain and the target domain, taking each sentence as a unit, the words whose parts of speech are nouns, verbs, or adjectives in the sentence are used as link seeds, and the knowledge triples one hop away are linked through the ConceptNet common-sense knowledge base; finally, the subgraphs linked from all sentences are de-duplicated and merged to form the domain common-sense knowledge graph, providing knowledge support for cross-domain emotion analysis;
wherein the constructed domain common-sense knowledge graph is represented as:
G = (V, Φ)
where v_i ∈ V is a node of the constructed graph, (v_i, r_{i,j}, v_j) ∈ Φ is a relation triple, and r_{i,j} denotes the relation between nodes v_i and v_j.
As an alternative embodiment, the importance of the linked graph nodes is calculated using an attention mechanism; the importance of each node is expressed as:
α_i = exp(f(e_i)) / Σ_{k∈N_i} exp(f(e_k))
where e_i is the vector representation of the i-th node, f(·) is a learnable scoring function, and α_i is the importance of the i-th node, with values in [0, 1].
Knowledge-aware attention module: given a sentence, the external knowledge base ConceptNet can be used to link it to a subgraph, but not every linked node contributes equally to the cross-domain emotion analysis task. For this reason, a knowledge-aware attention module is designed: given the representations of all linked nodes, it computes the importance of each linked node.
As an alternative embodiment, feature fusion is performed as follows: a feature fusion module is designed; given a sentence, the BERT encoder and the GCN encoder are used to produce two feature vectors, and the vector representation of the sentence is finally obtained by concatenation.
The vectors generated by the two encoders (namely the BERT encoder and the GCN graph encoder) are concatenated, the probabilities of all possible polarities of the input text are calculated, and the emotion label with the largest probability is selected as the final predicted emotion label. In the step of completing the emotion analysis task, the feature vector of each sentence can be expressed as:
x = [x_w ; x_g]
where x_g is the common-sense knowledge vector concerning aspect-opinion emotion alignment obtained from the GCN encoder, x_w is the context-aware sentence vector generated by the BERT encoder, and [;] denotes vector concatenation.
As an alternative embodiment, few-sample prototype network learning adopts a metric-learning-based prototype network model. Given k samples of the positive emotion category and k samples of the negative emotion category, the prototype of each emotion category is calculated:
c_pos/neg = (1/k) Σ_{j=1}^{k} x_j^(pos/neg)
where x_j^(pos/neg) is the feature vector representation of the j-th sample of the positive/negative emotion category;
meanwhile, the emotion probability of a given sentence x is output, calculated as:
p(y = c_i | x) = exp(-d(q, c_i)) / Σ_{i'} exp(-d(q, c_{i'}))
where c_i is the prototype of the i-th emotion polarity, q denotes the sample to be tested, and d(·,·) denotes the Euclidean distance between two given vectors.
As an optional embodiment, the aspect-opinion emotion alignment task and the emotion analysis task are jointly trained, and an Adam optimizer is used to obtain the optimal model parameters; the loss function may be expressed as:
L = L_recon + L_softmax
where L_recon represents the sentence-vector reconstruction loss and L_softmax represents the cross-entropy loss of the emotion classification task.
As shown in FIG. 2 and FIG. 3, this embodiment provides a knowledge-enhanced few-sample cross-domain emotion analysis method. The method introduces a large amount of common-sense knowledge while using only a small number of labeled samples in the target domain, so that it can capture both domain-invariant features and domain-specific features. Model parameters are optimized in a few-sample learning manner, which avoids the overfitting problem that arises during model training, effectively improves the domain adaptability of the model, and further improves cross-domain emotion analysis performance. The model comprises a pre-trained domain BERT encoder, a pre-trained GCN auto-encoder, a classifier trained on the sentence feature vector obtained by concatenating the vectors generated by the two encoders, and a prototype network module for few-sample learning. The method comprises the following steps:
(1) Unlabeled samples from the source domain or the target domain are input to pre-train the BERT encoder.
The pre-trained BERT encoder obtains rich domain knowledge through large-scale pre-training on unlabeled data, and the feature vector of each sentence in the text is represented as:
x_w = h_[CLS] = BERT(x)
where x represents the input sentence and BERT(·) is the sentence encoder.
(2) Based on the unlabeled samples (i.e., review sentences) of the source domain and the target domain, taking each sentence as a unit, the words whose parts of speech are nouns, verbs, or adjectives are used as link seeds, and the knowledge triples one hop away are linked through the ConceptNet common-sense knowledge base; the subgraphs linked from all sentences are then de-duplicated and merged to form the domain common-sense knowledge graph, which provides knowledge support for cross-domain emotion analysis. Dependency relations are used as follows: if the dependency syntactic relation between specified words in a sentence is "nsubj", "amod", or "xcomp", the words are connected with a "description" relation. Finally, the seeds filtered through ConceptNet create the subgraphs, and the subgraphs of all sentences are merged into the domain common-sense graph, which can be expressed as:
G = (V, Φ)
where v_i ∈ V is a node of the constructed graph, (v_i, r_{i,j}, v_j) ∈ Φ is a relation triple, and r_{i,j} refers to the relation between the two nodes in ConceptNet.
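The dependency-based "description" edges can be sketched as follows, assuming spaCy for dependency parsing; the model name and the simple head-dependent pairing are illustrative assumptions.

```python
import spacy

nlp = spacy.load("en_core_web_sm")          # dependency parser; model name is an assumption
DESC_DEPS = {"nsubj", "amod", "xcomp"}      # the dependency relations named above

def description_edges(sentence):
    """Connect a dependent word and its head with a 'description' relation."""
    edges = []
    for tok in nlp(sentence):
        if tok.dep_ in DESC_DEPS:
            edges.append((tok.text, "description", tok.head.text))
    return edges

print(description_edges("The screen of this phone is bright."))
```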
(3) Two self-supervised tasks, namely a relation classification task and an emotion alignment classification task, are designed to train the GCN (graph convolutional network) auto-encoder: the relations between nodes are predicted to obtain common-sense knowledge feature vectors, and the emotion alignment binary classification task is used to learn the emotion alignment features between aspect-opinion pairs, so as to obtain feature vectors containing both background common sense and aspect-opinion emotion alignment.
The conversion process of the feature vectors can be expressed as:
x_i^(l+1) = σ( Σ_{r∈R} Σ_{j∈N_i^r} (1/c_{i,r}) W_r^(l) x_j^(l) + W_0^(l) x_i^(l) )
h_i = x_i^(2), with x_i^(0) = g_i
where N_i^r denotes all neighbor nodes of node i under relation r, c_{i,r} is a normalization constant that can be preset, g_i is the randomly initialized node feature vector, which is converted into h_i after the two graph-convolution layers (i.e., by aggregating the feature vectors of the neighborhood), and W_r^(l) and W_0^(l) are the weight matrices of the l-th layer.
(4) The feature vectors generated by the GCN auto-encoder are input into a graph feature reconstructor, which adapts the graph node-level feature vectors to word-level vectors.
The feature mapping layer and the graph feature reconstructor are designed per sentence and expressed as follows:
x_c = W_c x'_c + b_c
x'_recon = W_recon x_c + b_recon
where x'_c is the sentence feature vector obtained by averaging the representations of all nodes of the subgraph constructed for sentence x after the GCN auto-encoder, W_c, b_c, W_recon and b_recon are trainable weights, x_c is the sentence feature vector adapted to the word-level distribution space by the feature reconstructor, and x_c is used as the final vector representation of the GCN encoder for sentence x.
Accordingly, the loss of the reconstruction function is expressed with a cosine similarity function:
L_recon = 1 - cos(x'_recon, x'_c)
That is, after the subgraph is constructed for sentence x and input into the GCN auto-encoder to obtain the sentence feature vector representation, the reconstruction function constrains the reconstructed vector to remain close to it.
(5) The vectors generated by the two encoders are concatenated and input into the classifier as the vector of the sentence, the probabilities of all possible polarities of the input text are calculated, and the emotion label with the largest probability is selected as the final predicted emotion label, completing the emotion analysis task.
The feature vector of a sentence can be represented as:
x = [x_c ; x_w]
where x_c is the common-sense knowledge vector concerning aspect-opinion emotion alignment obtained from the GCN encoder, x_w is the context-aware sentence vector generated by the BERT encoder, and [;] denotes vector concatenation.
Therefore, in the step of completing the emotion analysis task, the emotion probability is calculated as:
p(y | x) = softmax(W x + b)
where C is the set of possible emotion polarities and W and b are the parameters of the classifier.
(6) Few-sample prototype network learning is trained with a metric-learning-based prototype network model. Given k samples of the positive emotion category and k samples of the negative emotion category, the prototype of each emotion category is calculated:
c_pos/neg = (1/k) Σ_{j=1}^{k} x_j^(pos/neg)
where x_j^(pos/neg) is the feature vector representation of the j-th sample of the positive/negative emotion category;
meanwhile, the emotion probability of a given sentence x is output, calculated as:
p(y = c_i | x) = exp(-d(q, c_i)) / Σ_{i'} exp(-d(q, c_{i'}))
where c_i is the prototype of the i-th emotion polarity, q denotes the sample to be tested, and d(·,·) denotes the Euclidean distance between two given vectors.
(7) The aspect-opinion emotion alignment task and the emotion analysis task are trained jointly with an Adam optimizer to obtain the optimal parameters; the loss function can be expressed as:
L = L_recon + L_softmax
where L_recon represents the sentence-vector reconstruction loss and L_softmax represents the cross-entropy loss of the emotion classification task.
In summary, the present application utilizes knowledge enhancement and few-sample learning techniques to capture domain-shared features and domain-specific features, effectively solving the problem that existing methods cannot capture domain-invariant features and domain-specific features at the same time, and further improving the prediction effect of emotion analysis on target-domain data.
The embodiment also provides a few-sample cross-domain emotion analysis device, which comprises:
at least one processor;
at least one memory for storing at least one program;
wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method shown in FIG. 1.
The device for analyzing the low-sample cross-domain emotion can execute the method for analyzing the low-sample cross-domain emotion provided by the method embodiment of the invention, can execute any combination of the implementation steps of the method embodiment, and has corresponding functions and beneficial effects of the method.
The embodiment of the application also discloses a computer program product or a computer program, which comprises computer instructions, and the computer instructions are stored in a computer readable storage medium. The computer instructions may be read by a processor of a computer device from a computer-readable storage medium, and executed by the processor to cause the computer device to perform the method illustrated in fig. 1.
In alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flow charts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed and in which sub-operations described as part of larger operations are performed independently.
Furthermore, although the present invention is described in the context of functional modules, it should be understood that, unless otherwise stated to the contrary, one or more of the described functions and/or features may be integrated in a single physical device and/or software module, or one or more functions and/or features may be implemented in a separate physical device or software module. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary for an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be understood within the ordinary skill of an engineer, given the nature, function, and internal relationship of the modules. Accordingly, those of ordinary skill in the art will be able to practice the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative of and not intended to limit the scope of the invention, which is defined by the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer readable storage medium. Based on such understanding, the technical solution of the present invention may be embodied in the form of a software product, which is stored in a storage medium and includes instructions for causing a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a U-disk, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk or an optical disk, and other various media capable of storing program codes.
The logic and/or steps represented in the flowcharts or otherwise described herein, e.g., an ordered listing of executable instructions that can be considered to implement logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). Additionally, the computer-readable medium could even be paper or another suitable medium upon which the program is printed, as the program can be electronically captured, via for instance optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner if necessary, and then stored in a computer memory.
It should be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above embodiments, the various steps or methods may be implemented in software or firmware stored in memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, any one or combination of the following techniques, which are known in the art, may be used: a discrete logic circuit having a logic gate circuit for implementing a logic function on a data signal, an application specific integrated circuit having an appropriate combinational logic gate circuit, a Programmable Gate Array (PGA), a Field Programmable Gate Array (FPGA), or the like.
In the foregoing description of the specification, reference to the description of "one embodiment/example," "another embodiment/example," or "certain embodiments/examples," etc., means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiments of the present invention have been illustrated and described, it will be understood by those skilled in the art that various changes in form and details may be made therein without departing from the spirit and scope of the invention as defined by the appended claims.

Claims (10)

1. A few-sample cross-domain emotion analysis method, characterized by comprising the following steps:
sentence data is obtained, the sentence data is input into a trained BERT encoder, and a first feature vector is obtained;
inputting sentence data into a trained GCN encoder to obtain a second feature vector;
performing feature fusion on the first feature vector and the second feature vector to obtain vector representation of the sentence;
inputting the vector representation of the sentence into the trained few-sample prototype network model, and outputting the emotion polarity of the sentence;
wherein the few-sample prototype network model is trained as follows: labeled samples of preset positive and negative emotions are obtained, sentence vector representations of the labeled samples are computed and mapped into the same feature space, and the average of the sentence vectors sharing the same polarity is taken as the prototype representation of the corresponding emotion polarity.
2. The method of claim 1, wherein the BERT encoder is trained by:
acquiring text of the source domain or the target domain, and training the BERT encoder on it to acquire rich domain feature knowledge; the feature vector of each sentence in the text is represented as:
x_w = h_[CLS] = BERT(x)
where x represents the input sentence, h_[CLS] is the hidden vector of the special [CLS] token prepended to the sentence by the BERT encoder, and the BERT encoder serves as the sentence encoder.
3. The method of claim 1, wherein the GCN encoder is trained by:
designing two self-supervised tasks to train the GCN encoder, the two tasks comprising a relation classification task and an emotion alignment classification task;
in the relation classification task, given any two nodes, a relation classification model based on the GCN encoder judges the relation between them; in the emotion alignment task, given an aspect word and an opinion word, an emotion alignment model based on the GCN encoder judges whether they carry the same emotion polarity; through these two self-supervised tasks, the encoder learns common domain relation knowledge and the emotion alignment features between aspect-opinion pairs, so as to obtain feature vectors containing both common background knowledge and aspect-opinion emotion alignment information.
4. The few-sample cross-domain emotion analysis method according to claim 3, wherein in the relation classification task, the feature vector of a node is obtained by fusing the representations of its neighboring nodes, and the fusion process is expressed as:
x_i^(l+1) = σ( Σ_{r∈R} Σ_{j∈N_i^r} (1/c_{i,r}) W_r^(l) x_j^(l) + W_0^(l) x_i^(l) )
h_i = x_i^(2), with x_i^(0) = g_i
where N_i^r denotes all neighbor nodes of node i under relation r; g_i is the randomly initialized node feature vector, which is converted into h_i after the two-layer graph convolution; σ denotes the ReLU activation function; l denotes the l-th graph convolution layer; c_{i,r} is the number of neighbor nodes of node i under relation r; x_j^(l) is the feature vector representation of node j; and W_r^(l) and W_0^(l) are parameter matrices to be trained;
the loss function generated by the relation classification task is:
s(v_i, r_{i,j}, v_j) = h_i^T diag(R_r) h_j
L_rel = - Σ_{(i,j)∈T} [ y_{i,j} log sigmoid(s(v_i, r_{i,j}, v_j)) + (1 - y_{i,j}) log(1 - sigmoid(s(v_i, r_{i,j}, v_j))) ]
where s(v_i, r_{i,j}, v_j) is a matrix-factorization score function; R_r is the vector representation of relation r; T denotes the set of node pairs sampled from graph G; and y_{i,j} indicates whether relation r exists between node i and node j, taking the value 1 if it exists and 0 otherwise;
the loss function generated by the emotion alignment classification task is:
L_align = - (1/N) Σ_{k=1}^{N} Σ_{(a,o)∈P_k} [ y_{a,o} log p_{a,o} + (1 - y_{a,o}) log(1 - p_{a,o}) ]
where N represents the number of unlabeled samples of the source domain and the target domain; P_k is the set of aspect word-opinion word pairs contained in the k-th unlabeled sample; p_{a,o} is the predicted probability that the pair shares the same emotion polarity; and y_{a,o} is the corresponding binary label.
5. The few-sample cross-domain emotion analysis method according to claim 1, further comprising the step of constructing the common-sense knowledge graph:
based on the unlabeled samples of the source domain and the target domain, taking each sentence as a unit, the words whose parts of speech are nouns, verbs, or adjectives in the sentence are used as link seeds, and the knowledge triples one hop away are linked through the ConceptNet common-sense knowledge base; finally, the subgraphs linked from all sentences are de-duplicated and merged to form the domain common-sense knowledge graph, providing knowledge support for cross-domain emotion analysis;
wherein the constructed domain common-sense knowledge graph is represented as:
G = (V, Φ)
where v_i ∈ V is a node of the constructed graph, (v_i, r_{i,j}, v_j) ∈ Φ is a relation triple, r_{i,j} denotes the relation between nodes v_i and v_j, and Φ denotes the set of all triples contained in the graph G.
6. The few-sample cross-domain emotion analysis method according to claim 5, further comprising the following steps:
calculating the importance of the nodes in the knowledge graph using an attention mechanism, wherein the importance of each node is expressed as:
α_i = exp(f(e_i)) / Σ_{k∈N_i} exp(f(e_k))
where e_i is the vector representation of the i-th node, f(·) is a learnable scoring function, α_i is the importance of the i-th node, e_k is the vector representation of the k-th node, and N_i denotes the set of all neighbor nodes of the i-th node.
7. The method of claim 1, wherein the performing feature fusion on the first feature vector and the second feature vector to obtain a vector representation of a sentence comprises:
splicing the first feature vector and the second feature vector, calculating the probabilities of all possible polarities of the input text, and selecting the emotion label with the largest probability as the final predicted emotion label to complete the emotion analysis task, wherein the feature vector of each sentence is expressed as:
x = [x_w ; x_g]
where x_g is the common-sense knowledge vector concerning aspect-opinion emotion alignment obtained from the GCN encoder, x_w is the context-aware sentence vector generated by the BERT encoder, and [;] denotes vector concatenation.
8. The few-sample cross-domain emotion analysis method according to claim 1, wherein after the sentence vector representation of a sample is input into the few-sample prototype network model, the following steps are performed:
for the k samples of the positive and of the negative emotion category, the prototype of each emotion category is calculated:
c_pos/neg = (1/k) Σ_{j=1}^{k} x_j^(pos/neg)
where x_j^(pos/neg) is the feature vector representation of the j-th sample of the positive/negative emotion category;
and the emotion probability of the sentence x is output, the emotion probability being calculated as:
p(y = c_i | x) = exp(-d(q, c_i)) / Σ_{i'} exp(-d(q, c_{i'}))
where c_i is the prototype of the i-th emotion polarity, q denotes the sample to be tested, and d(·,·) denotes the Euclidean distance between two given vectors.
9. The few-sample cross-domain emotion analysis method according to claim 1, further comprising the following steps:
training the few-sample prototype network model by adopting an Adam optimizer, wherein the loss function in the training process is represented as:
L = L_recon + L_softmax
where L_recon represents the sentence-vector reconstruction loss and L_softmax represents the cross-entropy loss of the emotion classification task.
10. A few-sample cross-domain emotion analysis device, comprising:
at least one processor;
at least one memory for storing at least one program;
wherein the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method of any one of claims 1-9.
CN202210661020.8A 2022-06-13 2022-06-13 Cross-domain emotion analysis method and device for few samples Active CN115080688B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210661020.8A CN115080688B (en) 2022-06-13 2022-06-13 Cross-domain emotion analysis method and device for few samples

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210661020.8A CN115080688B (en) 2022-06-13 2022-06-13 Cross-domain emotion analysis method and device for few samples

Publications (2)

Publication Number Publication Date
CN115080688A true CN115080688A (en) 2022-09-20
CN115080688B CN115080688B (en) 2024-06-04

Family

ID=83251179

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210661020.8A Active CN115080688B (en) 2022-06-13 2022-06-13 Cross-domain emotion analysis method and device for few samples

Country Status (1)

Country Link
CN (1) CN115080688B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115905518A (en) * 2022-10-17 2023-04-04 华南师范大学 Emotion classification method, device and equipment based on knowledge graph and storage medium
CN116562305A (en) * 2023-07-10 2023-08-08 江西财经大学 Aspect emotion four-tuple prediction method and system

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110008338A (en) * 2019-03-04 2019-07-12 华南理工大学 A kind of electric business evaluation sentiment analysis method of fusion GAN and transfer learning
CN112860901A (en) * 2021-03-31 2021-05-28 中国工商银行股份有限公司 Emotion analysis method and device integrating emotion dictionaries
CN113722439A (en) * 2021-08-31 2021-11-30 福州大学 Cross-domain emotion classification method and system based on antagonism type alignment network
US20220092267A1 (en) * 2020-09-23 2022-03-24 Jingdong Digits Technology Holding Co., Ltd. Method and system for aspect-level sentiment classification by graph diffusion transformer

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110008338A (en) * 2019-03-04 2019-07-12 华南理工大学 A kind of electric business evaluation sentiment analysis method of fusion GAN and transfer learning
US20220092267A1 (en) * 2020-09-23 2022-03-24 Jingdong Digits Technology Holding Co., Ltd. Method and system for aspect-level sentiment classification by graph diffusion transformer
CN112860901A (en) * 2021-03-31 2021-05-28 中国工商银行股份有限公司 Emotion analysis method and device integrating emotion dictionaries
CN113722439A (en) * 2021-08-31 2021-11-30 福州大学 Cross-domain emotion classification method and system based on antagonism type alignment network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
余珊珊; 苏锦钿; 李鹏飞: "A Sentence Sentiment Classification Method Based on Self-Attention" (一种基于自注意力的句子情感分类方法), Computer Science (计算机科学), no. 04, 30 April 2020 (2020-04-30), pages 205-207 *

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115905518A (en) * 2022-10-17 2023-04-04 华南师范大学 Emotion classification method, device and equipment based on knowledge graph and storage medium
CN115905518B (en) * 2022-10-17 2023-10-20 华南师范大学 Emotion classification method, device, equipment and storage medium based on knowledge graph
CN116562305A (en) * 2023-07-10 2023-08-08 江西财经大学 Aspect emotion four-tuple prediction method and system
CN116562305B (en) * 2023-07-10 2023-09-12 江西财经大学 Aspect emotion four-tuple prediction method and system

Also Published As

Publication number Publication date
CN115080688B (en) 2024-06-04


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant