CN115080688B - Cross-domain emotion analysis method and device for few samples - Google Patents

Cross-domain emotion analysis method and device for few samples

Info

Publication number
CN115080688B
CN115080688B (application CN202210661020.8A)
Authority
CN
China
Prior art keywords
emotion
sentence
node
vector
encoder
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210661020.8A
Other languages
Chinese (zh)
Other versions
CN115080688A (en)
Inventor
蔡毅
任浩鹏
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
South China University of Technology SCUT
Original Assignee
South China University of Technology SCUT
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by South China University of Technology SCUT
Priority to CN202210661020.8A
Publication of CN115080688A
Application granted
Publication of CN115080688B
Legal status: Active

Links

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00: Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30: Information retrieval of unstructured textual data
    • G06F16/33: Querying
    • G06F16/3331: Query processing
    • G06F16/334: Query execution
    • G06F16/3344: Query execution using natural language analysis
    • G06F16/35: Clustering; Classification
    • G06F16/36: Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367: Ontology
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/205: Parsing
    • G06F40/211: Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Machine Translation (AREA)

Abstract

The invention discloses a few-sample cross-domain emotion analysis method and device. The method comprises the following steps: acquiring sentence data and inputting it into a trained BERT encoder to obtain a first feature vector; inputting the sentence data into a trained GCN encoder to obtain a second feature vector; fusing the first feature vector and the second feature vector to obtain the vector representation of the sentence; and inputting the vector representation of the sentence into a trained few-sample prototype network model, which outputs the emotion polarity of the sentence. By capturing domain-shared features and domain-specific features with a few-shot learning technique, the invention improves the emotion prediction performance of a model transferred from the source domain to the target domain. The invention can be widely applied in the technical field of natural language processing.

Description

Cross-domain emotion analysis method and device for few samples
Technical Field
The invention relates to the technical field of natural language processing, and in particular to a few-sample cross-domain emotion analysis method and device.
Background
Emotion analysis is the task of automatically classifying the emotion polarity of text data. Emotion analysis models based on deep neural networks currently achieve remarkable performance, but these neural methods need a large number of labeled samples to reach an ideal prediction accuracy, and annotating training samples costs considerable manpower and time.
To alleviate the reliance on large amounts of manually labeled data, cross-domain emotion analysis has recently become a research hotspot; its purpose is to transfer knowledge from label-rich source domains to label-scarce target domains. The main challenge of cross-domain emotion analysis is overcoming the difference between the source and target domains, especially when the two domains differ significantly. Facing this challenge, many studies propose extracting domain-invariant grammatical features as a bridge for domain transfer, thereby reducing inter-domain differences. However, language expression is diverse, and domain-invariant grammatical information can therefore introduce errors when emotion knowledge is transferred. Meanwhile, current cross-domain emotion analysis methods usually focus on learning and mining domain-invariant features while ignoring domain-specific features; as the domain gap grows, the available domain-invariant features become limited and the performance of cross-domain emotion analysis degrades.
Disclosure of Invention
In order to solve, at least to some extent, one of the technical problems existing in the prior art, the invention aims to provide a few-sample cross-domain emotion analysis method and device.
The technical scheme adopted by the invention is as follows:
A few-sample cross-domain emotion analysis method comprises the following steps:
acquiring sentence data and inputting it into a trained BERT encoder to obtain a first feature vector;
inputting the sentence data into a trained GCN encoder to obtain a second feature vector;
fusing the first feature vector and the second feature vector to obtain the vector representation of the sentence;
inputting the vector representation of the sentence into a trained few-sample prototype network model and outputting the emotion polarity of the sentence;
wherein the few-sample prototype network model is trained as follows: preset labeled samples of positive and negative emotion are obtained, their sentence vector representations are computed and mapped into the same feature space, and the average vector of the sentence vectors with the same polarity is taken as the prototype representation of the corresponding emotion polarity.
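As an illustrative sketch only (the patent does not prescribe a framework), the prototype of each polarity can be computed as the mean of the support-set sentence vectors of that polarity; the function name and tensor shapes below are assumptions:

```python
import torch

def build_prototypes(pos_vectors: torch.Tensor, neg_vectors: torch.Tensor) -> torch.Tensor:
    """pos_vectors, neg_vectors: (k, d) sentence representations of the labeled support set."""
    pos_proto = pos_vectors.mean(dim=0)   # prototype of the positive polarity
    neg_proto = neg_vectors.mean(dim=0)   # prototype of the negative polarity
    return torch.stack([pos_proto, neg_proto])  # shape (2, d): one prototype per polarity
```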
Further, the BERT encoder is trained by:
Acquiring texts of a source domain or a target domain, and training a BERT encoder to acquire rich domain feature knowledge; wherein, the feature vector of each sentence in the text is expressed as:
x_w = h_[CLS] = BERT(x)
where x is the input sentence, h_[CLS] is the hidden vector of the special [CLS] token prepended to the sentence by the BERT encoder, and BERT(·) denotes the sentence encoder.
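A minimal sketch of obtaining x_w with the Hugging Face transformers library follows; the "bert-base-uncased" checkpoint is an assumption, since the patent does not name a specific pre-trained model:

```python
import torch
from transformers import BertTokenizer, BertModel

tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")

def encode_sentence(sentence: str) -> torch.Tensor:
    inputs = tokenizer(sentence, return_tensors="pt", truncation=True)
    with torch.no_grad():
        outputs = bert(**inputs)
    # hidden state of the leading [CLS] token, i.e. x_w = h_[CLS]
    return outputs.last_hidden_state[:, 0]
```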
Further, the GCN encoder is trained by:
Two self-supervised tasks are designed to train the GCN encoder: a relation classification task and an emotion alignment classification task;
in the relation classification task, given any two nodes, a relation classification model based on the GCN encoder must judge the relation between them; in the emotion alignment task, given an aspect word and an opinion word, an emotion alignment model based on the GCN encoder must judge whether they carry the same emotion polarity; the goal of these two self-supervised tasks is to learn domain relation common sense and the emotion alignment features between aspect-opinion pairs, thereby obtaining feature vectors that encode background common sense and aspect-opinion emotion alignment.
Further, in the relation classification task, the node feature vector is obtained by fusion of neighboring node representations of the node, and the fusion process of the feature vector is represented as follows:
h_i^(l+1) = σ( Σ_{r∈R} Σ_{j∈N_i^r} (1/c_{i,r}) W_r^(l) x_j + W_0^(l) x_i )
where N_i^r denotes all neighbor nodes of node i under relation r; g_i is the randomly initialized initial feature vector of node i, which is converted into h_i after a two-step graph convolution; σ is the ReLU activation function; l indexes the graph convolution layer; c_{i,r} is the number of neighbor nodes of node i; W_r^(l) is a parameter matrix to be trained; x_j is the feature vector representation of node j; and W_0^(l) is a parameter matrix to be trained;
the loss function generated by the relation classification task is:
L_rel = - Σ_{i,j ∈ T} [ y log σ(s(v_i, r_{i,j}, v_j)) + (1 - y) log(1 - σ(s(v_i, r_{i,j}, v_j))) ]
where s(v_i, r_{i,j}, v_j) is the matrix factorization score function; R_r is the vector representation of the relation r; T is the node set of the graph G; and y indicates whether the relation r_{i,j} holds between the given nodes i and j (y takes the value 1 if it does and 0 otherwise);
the loss function generated by the emotion alignment classification task is:
L_align = - (1/N) Σ_{k=1}^{N} Σ_{p∈P_k} [ y_p log ŷ_p + (1 - y_p) log(1 - ŷ_p) ]
where N is the number of unlabeled samples in the source and target domains; P_k is the set of aspect word-opinion word pairs contained in the k-th unlabeled sample; y_p indicates whether the pair p shares the same emotion polarity; and ŷ_p is the predicted alignment probability.
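The following sketch illustrates, under assumptions, one way the relational graph convolution and the relation-classification scoring could be implemented; the layer form (R-GCN style) and the DistMult-like score are reconstructions in the spirit of the formulas above, not the patent's verbatim implementation:

```python
import torch
import torch.nn as nn

class RelGraphLayer(nn.Module):
    """One relational graph convolution layer over (i, r, j) triples."""
    def __init__(self, dim: int, num_relations: int):
        super().__init__()
        self.w_rel = nn.Parameter(torch.randn(num_relations, dim, dim) * 0.01)  # W_r per relation
        self.w_self = nn.Linear(dim, dim, bias=False)                           # W_0 self-loop

    def forward(self, x: torch.Tensor, edges):
        """x: (n, d) node features; edges: iterable of (i, r, j) index triples."""
        out = self.w_self(x)
        msgs = torch.zeros_like(x)
        counts = torch.zeros(x.size(0))
        for i, r, j in edges:
            msgs[i] = msgs[i] + x[j] @ self.w_rel[r]   # message from neighbor j under relation r
            counts[i] += 1
        out = out + msgs / counts.clamp(min=1).unsqueeze(-1)  # normalize by neighbor count
        return torch.relu(out)                                # σ = ReLU

def relation_score(h_i: torch.Tensor, rel_vec: torch.Tensor, h_j: torch.Tensor) -> torch.Tensor:
    # DistMult-style factorization score s(v_i, r, v_j), used for the relation classification loss
    return (h_i * rel_vec * h_j).sum(-1)
```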
Further, the method also comprises the step of constructing a common sense knowledge graph:
Based on unlabeled samples of the source domain and the target domain, sentence by sentence, words of the specified parts of speech (nouns, verbs and adjectives) are taken as link seeds, and the next-hop knowledge triples are linked out through the ConceptNet common sense knowledge base; finally, the subgraphs linked from all sentences are deduplicated and merged to form a domain common sense knowledge graph, providing knowledge support for cross-domain emotion analysis;
wherein the constructed domain common sense knowledge graph is expressed as:
G = (V, Φ)
where v_i ∈ V is a node of the graph and (v_i, r_{i,j}, v_j) ∈ Φ is a relation triple, with r_{i,j} ∈ R denoting the relation between the two nodes v_i and v_j.
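For illustration, a one-hop subgraph for a single sentence could be linked roughly as below; the public api.conceptnet.io endpoint, the spaCy en_core_web_sm part-of-speech tagger and the JSON field names are tooling assumptions, since the patent only states that ConceptNet is queried with noun/verb/adjective seeds:

```python
import requests
import spacy

nlp = spacy.load("en_core_web_sm")

def link_sentence(sentence: str, limit: int = 20):
    """Return one-hop ConceptNet triples seeded by the sentence's nouns, verbs and adjectives."""
    seeds = [t.lemma_.lower() for t in nlp(sentence) if t.pos_ in {"NOUN", "VERB", "ADJ"}]
    triples = set()
    for seed in seeds:
        resp = requests.get(f"http://api.conceptnet.io/c/en/{seed}",
                            params={"limit": limit}).json()
        for edge in resp.get("edges", []):
            triples.add((edge["start"]["label"], edge["rel"]["label"], edge["end"]["label"]))
    return triples  # subgraphs of all sentences are later deduplicated and merged
```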
Further, the method also comprises the following steps:
Calculating the importance degree of the nodes in the knowledge graph by adopting an attention mechanism, wherein the importance degree of each node is expressed as follows:
where e_i is the vector representation of the i-th node, α_i is the importance of the i-th node, e_k is the vector representation of the k-th node, and N_i is the set of all neighbor nodes of the i-th node.
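A minimal sketch of such knowledge-aware attention is given below; scoring each node against the sentence vector with a dot product is an assumed choice, as the patent does not spell out the scoring function:

```python
import torch

def node_attention(node_vecs: torch.Tensor, sentence_vec: torch.Tensor) -> torch.Tensor:
    """node_vecs: (n, d) embeddings e_i of the linked nodes; sentence_vec: (d,) query vector."""
    scores = node_vecs @ sentence_vec            # relevance of each node to the sentence
    alpha = torch.softmax(scores, dim=0)         # importance weights alpha_i in [0, 1]
    return (alpha.unsqueeze(-1) * node_vecs).sum(dim=0)  # attention-pooled knowledge vector
```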
Further, the feature fusion is performed on the first feature vector and the second feature vector to obtain a vector representation of the sentence, including:
The first feature vector and the second feature vector are concatenated, the probabilities of all possible polarities of the input text are calculated, and the emotion label with the highest probability is selected as the final predicted emotion label. In the step of completing the emotion analysis task, the feature vector of each sentence is expressed as:
x = [x_w ; x_g]
where x_g is the common sense knowledge vector with aspect-opinion emotion alignment, x_w is the context-aware sentence vector generated by the BERT encoder, and [ ; ] denotes vector concatenation.
Further, after the sentence vector representation of a sample is input into the few-sample prototype network model, the following steps are performed:
for k samples of each of the positive and negative emotion categories, the prototype of each emotion category is calculated:
c_pos = (1/k) Σ_{j=1}^{k} x_j^pos,   c_neg = (1/k) Σ_{j=1}^{k} x_j^neg
where x_j^{pos/neg} is the feature vector representation of the j-th sample of the positive/negative emotion category;
the emotion probability of the sentence x is then output, calculated as:
p(y = c_i | q) = exp(-d(q, c_i)) / Σ_{i'} exp(-d(q, c_{i'}))
where c_i is the i-th emotion polarity (prototype); q is the sample to be tested; and d(·,·) is the Euclidean distance between two vectors.
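The inference step can be sketched as the standard prototypical-network softmax over negative Euclidean distances; the helper reuses prototypes computed as above:

```python
import torch

def emotion_probs(query: torch.Tensor, prototypes: torch.Tensor) -> torch.Tensor:
    """query: (d,) fused sentence vector; prototypes: (2, d) positive/negative prototypes."""
    dists = torch.cdist(query.unsqueeze(0), prototypes).squeeze(0)  # d(q, c_i) for each polarity
    return torch.softmax(-dists, dim=0)                             # p(y = c_i | q)
```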
Further, the method also comprises the following steps:
The few-sample prototype network model is trained with an Adam optimizer, and the loss function during training is expressed as:
L = L_recon + L_softmax
where L_recon is the loss of sentence vector representation reconstruction and L_softmax is the cross-entropy loss of the emotion classification task.
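A self-contained toy training loop with Adam over L = L_recon + L_softmax is sketched below; ToyModel stands in for the fused BERT+GCN model and is purely illustrative:

```python
import torch
import torch.nn as nn

class ToyModel(nn.Module):
    """Stand-in for the fused encoder + classifier; returns the two loss terms."""
    def __init__(self, dim: int = 8, classes: int = 2):
        super().__init__()
        self.recon = nn.Linear(dim, dim)   # toy reconstruction head
        self.clf = nn.Linear(dim, classes) # toy emotion classifier

    def forward(self, x, y):
        l_recon = 1 - torch.cosine_similarity(self.recon(x), x, dim=-1).mean()
        l_softmax = nn.functional.cross_entropy(self.clf(x), y)
        return l_recon, l_softmax

model = ToyModel()
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3)
x, y = torch.randn(16, 8), torch.randint(0, 2, (16,))  # dummy features and polarity labels
for _ in range(3):  # a few illustrative steps
    optimizer.zero_grad()
    l_recon, l_softmax = model(x, y)
    (l_recon + l_softmax).backward()  # L = L_recon + L_softmax
    optimizer.step()
```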
The invention adopts another technical scheme that:
a small sample cross-domain emotion analysis device, comprising:
At least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method described above.
The invention adopts another technical scheme that:
a computer readable storage medium, in which a processor executable program is stored, which when executed by a processor is adapted to carry out the method as described above.
The beneficial effects of the invention are as follows: by using a few-shot learning technique to capture both domain-shared features and domain-specific features, the invention improves the emotion prediction performance of a model transferred from the source domain to the target domain.
Drawings
In order to more clearly illustrate the embodiments of the present invention or the technical solutions in the prior art, the following description is made with reference to the accompanying drawings of the embodiments of the present invention or the related technical solutions in the prior art, and it should be understood that the drawings in the following description are only for convenience and clarity of describing some embodiments in the technical solutions of the present invention, and other drawings may be obtained according to these drawings without the need of inventive labor for those skilled in the art.
FIG. 1 is a flow chart of steps of a method for cross-domain emotion analysis with few samples in an embodiment of the present invention;
FIG. 2 is a flow chart of a knowledge-based enhanced few-sample cross-domain emotion analysis method in an embodiment of the present invention;
FIG. 3 is a schematic diagram of a model structure of a knowledge-enhancement-based few-sample cross-domain emotion analysis method in an embodiment of the invention.
Detailed Description
Embodiments of the present invention are described in detail below, examples of which are illustrated in the accompanying drawings, wherein like or similar reference numerals refer to like or similar elements or elements having like or similar functions throughout. The embodiments described below by referring to the drawings are illustrative only and are not to be construed as limiting the invention. The step numbers in the following embodiments are set for convenience of illustration only, and the order between the steps is not limited in any way, and the execution order of the steps in the embodiments may be adaptively adjusted according to the understanding of those skilled in the art.
In the description of the present invention, it should be understood that references to orientation descriptions such as upper, lower, front, rear, left, right, etc. are based on the orientation or positional relationship shown in the drawings, are merely for convenience of description of the present invention and to simplify the description, and do not indicate or imply that the apparatus or elements referred to must have a particular orientation, be constructed and operated in a particular orientation, and thus should not be construed as limiting the present invention.
In the description of the present invention, "several" means one or more and "a plurality of" means two or more; greater than, less than, exceeding, etc. are understood as excluding the stated number, while above, below, within, etc. are understood as including the stated number. The terms first and second are used only to distinguish technical features and should not be construed as indicating or implying relative importance, the number of the indicated technical features, or their precedence.
In the description of the present invention, unless explicitly defined otherwise, terms such as arrangement, installation, connection, etc. should be construed broadly and the specific meaning of the terms in the present invention can be reasonably determined by a person skilled in the art in combination with the specific contents of the technical scheme.
As shown in fig. 1, the embodiment provides a cross-domain emotion analysis method with few samples, which includes the following steps:
S1, acquiring sentence data, and inputting the sentence data into a trained BERT encoder to acquire a first feature vector;
s2, inputting sentence data into a trained GCN encoder to obtain a second feature vector;
s3, carrying out feature fusion on the first feature vector and the second feature vector to obtain the vector representation of the sentence;
S4, inputting the vector representation of the sentence into the trained few-sample prototype network model and outputting the emotion polarity of the sentence. The few-sample prototype network model is trained as follows: preset labeled samples of positive and negative emotion are obtained, their sentence vector representations are computed and mapped into the same feature space, and the average vector of the sentence vectors with the same polarity is taken as the prototype representation of the corresponding emotion polarity.
As an alternative embodiment, the BERT encoder is trained by:
Acquiring texts of a source domain or a target domain, and training a BERT encoder to acquire rich domain feature knowledge; wherein, the feature vector of each sentence in the text is expressed as:
x_w = h_[CLS] = BERT(x)
where x is the input sentence and BERT(·) is the sentence encoder.
As an alternative embodiment, the GCN encoder is trained by:
Designing two self-supervision tasks, and training a GCN encoder; the two self-supervision tasks comprise a relationship classification task and an emotion alignment classification task;
In the relation classification task, given any two nodes, the model must judge the relation between them; in the emotion alignment task, given an aspect word and an opinion word, the model must judge whether they carry the same emotion polarity; the goal of these two self-supervised tasks is to learn domain relation common sense and the emotion alignment features between aspect-opinion pairs, thereby obtaining feature vectors that encode background common sense and aspect-opinion emotion alignment.
Two self-supervised tasks, a relation classification task and an emotion alignment classification task, are designed to pre-train the GCN auto-encoder: the relation between nodes is predicted to obtain common sense knowledge feature vectors, and the emotion alignment binary classification task is used to learn the emotion alignment features between aspect-opinion pairs, so as to obtain feature vectors containing background common sense and aspect-opinion emotion alignment. The conversion of the feature vectors can be expressed as:
h_i^(l+1) = σ( Σ_{r∈R} Σ_{j∈N_i^r} (1/c_{i,r}) W_r^(l) x_j + W_0^(l) x_i )
where N_i^r denotes all neighbor nodes of node i under relation r, c_{i,r} is a normalization constant that can be preset, g_i is the randomly initialized initial node feature vector, which is converted into the domain-aggregated feature vector h_i after a two-step graph convolution, W_r^(l) and W_0^(l) are the weight matrices of the l-th layer, and σ is the ReLU activation function. Meanwhile, the loss function generated by the relation classification self-supervised task is:
L_rel = - Σ_{i,j ∈ T} [ y log σ(s(v_i, r_{i,j}, v_j)) + (1 - y) log(1 - σ(s(v_i, r_{i,j}, v_j))) ]
where s(v_i, r_{i,j}, v_j) is the matrix factorization score function and R_r is the vector representation of the relation r.
The loss function generated by the emotion alignment classification self-supervised task is:
L_align = - (1/N) Σ_{k=1}^{N} Σ_{p∈P_k} [ y_p log ŷ_p + (1 - y_p) log(1 - ŷ_p) ]
where N is the number of unlabeled samples in the source and target domains, P_k is the set of aspect word-opinion word pairs contained in the k-th unlabeled sample, y_p indicates whether the pair p shares the same emotion polarity, and ŷ_p is the predicted alignment probability.
As an alternative embodiment, the method further comprises the step of constructing a common sense knowledge graph:
Based on unlabeled samples of the source domain and the target domain, sentence by sentence, words of the specified parts of speech (nouns, verbs and adjectives) are taken as link seeds, and the next-hop knowledge triples are linked out through the ConceptNet common sense knowledge base; finally, the subgraphs linked from all sentences are deduplicated and merged to form a domain common sense knowledge graph, providing knowledge support for cross-domain emotion analysis;
wherein the constructed domain common sense knowledge graph is expressed as:
G = (V, Φ)
where v_i ∈ V is a node of the graph and (v_i, r_{i,j}, v_j) ∈ Φ is a relation triple, with r_{i,j} ∈ R denoting the relation between the two nodes v_i and v_j.
As an alternative embodiment, the importance of each linked graph node is calculated using an attention mechanism, and the importance of each node is expressed as:
where e_i is the vector representation of the i-th node and α_i ∈ [0, 1] is the importance of the i-th node.
Knowledge-aware attention mechanism module: given a sentence, the external knowledge base ConceptNet can be used to link it to a subgraph, but not every linked node contributes equally to the cross-domain emotion analysis task. For this reason, a knowledge-aware attention module is designed: given the representations of all linked nodes, the importance of each linked node is calculated.
As an alternative embodiment, feature fusion: a feature fusion module is designed; given a sentence, the BERT encoder and the GCN encoder are used to produce two feature vectors respectively, and the vector representation of the sentence is obtained by concatenation.
The vectors generated by the two encoders (the BERT encoder and the GCN graph encoder) are concatenated, the probabilities of all possible polarities of the input text are calculated, and the emotion label with the highest probability is selected as the final predicted emotion label to complete the emotion analysis task. The feature vector of each sentence can be expressed as:
x = [x_w ; x_g]
where x_g is the common sense knowledge vector with aspect-opinion emotion alignment, x_w is the context-aware sentence vector generated by the BERT encoder, and [ ; ] denotes vector concatenation.
As an alternative embodiment, the few-sample prototype network learning adopts a metric-learning-based prototype network model: given k samples of each of the positive and negative emotion categories, the prototype of each emotion category is calculated:
c_pos = (1/k) Σ_{j=1}^{k} x_j^pos,   c_neg = (1/k) Σ_{j=1}^{k} x_j^neg
where x_j^{pos/neg} is the feature vector representation of the j-th sample of the positive/negative emotion category;
meanwhile, the emotion probability of a given sentence x is output, calculated as:
p(y = c_i | q) = exp(-d(q, c_i)) / Σ_{i'} exp(-d(q, c_{i'}))
where c_i is the i-th emotion polarity (prototype); q is the sample to be tested; and d(·,·) is the Euclidean distance between two vectors.
As an alternative embodiment, the aspect-opinion emotion alignment task and the emotion analysis task are trained jointly, and an Adam optimizer is used to train the model to its optimal parameters; during this process the loss function can be expressed as:
L = L_recon + L_softmax
where L_recon is the loss of sentence vector representation reconstruction and L_softmax is the cross-entropy loss of the emotion classification task.
As shown in fig. 2 and fig. 3, the present embodiment provides a knowledge-enhanced few-sample cross-domain emotion analysis method. It uses a small number of labeled samples in the target domain and introduces a large amount of common sense knowledge, so as to capture domain-invariant features while also obtaining domain-specific features, and it optimizes the model parameters in a few-shot learning manner, which avoids the over-fitting that would otherwise occur during training, effectively improves the domain adaptability of the model, and further improves the performance of cross-domain emotion analysis. The model includes a pre-trained domain BERT encoder, a pre-trained GCN auto-encoder, a classifier trained on the concatenation of the vectors generated by the two encoders as the feature vector representation of the sentence, and a prototype network module for few-shot learning. The method comprises the following steps:
(1) And (5) inputting unlabeled samples in the source field or the target field to pretrain the BERT encoder.
Rich domain knowledge is obtained by pre-training the BERT encoder on large-scale unlabeled data; the feature vector of each sentence in the text is expressed as:
x_w = h_[CLS] = BERT(x)
where x is the input sentence and BERT is the sentence encoder.
(2) Based on unlabeled samples (i.e. comment sentences) of the source domain and the target domain, sentence by sentence, words of the specified parts of speech (nouns, verbs and adjectives) are taken as link seeds, and the next-hop knowledge triples are linked out through the ConceptNet common sense knowledge base; finally, the subgraphs linked from all sentences are deduplicated and merged to form a domain common sense knowledge graph, providing knowledge support for cross-domain emotion analysis. The dependency relation means that if the dependency syntactic relation between specified words in a sentence is "nsubj", "amod" or "xcomp", they are connected with a "description" relation (see the sketch after this step). Finally, the seed-filtered ConceptNet subgraphs of all sentences are merged into the domain common sense graph, which can be expressed as:
G = (V, Φ)
where v_i ∈ V is a node of the constructed graph and (v_i, r_{i,j}, v_j) ∈ Φ is a relation triple, with r_{i,j} referring to the relation between the two nodes in ConceptNet.
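The "description" relation mentioned above can be sketched as follows; spaCy is an assumed dependency parser, used only to show how nsubj/amod/xcomp arcs would be turned into edges before the ConceptNet expansion:

```python
import spacy

nlp = spacy.load("en_core_web_sm")

def description_edges(sentence: str):
    """Connect head and dependent words linked by nsubj, amod or xcomp as 'description' edges."""
    doc = nlp(sentence)
    return [(tok.head.text, "description", tok.text)
            for tok in doc
            if tok.dep_ in {"nsubj", "amod", "xcomp"}]
```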
(3) Designing two self-supervision tasks, a relation classification task and an emotion alignment classification task to train a GCN automatic encoder, namely predicting the relation between nodes to obtain a common sense knowledge feature vector, and learning emotion alignment features between aspect viewpoint pairs by using an emotion alignment binary classification task to obtain a feature vector containing background common sense and aspect viewpoint word emotion alignment.
The process of converting the feature vector can be expressed as:
h_i^(l+1) = σ( Σ_{r∈R} Σ_{j∈N_i^r} (1/c_{i,r}) W_r^(l) x_j + W_0^(l) x_i )
where N_i^r denotes all neighbor nodes of node i under relation r, c_{i,r} is a normalization constant that can be preset, g_i is the randomly initialized initial node feature vector, which is converted into the domain-aggregated feature vector h_i after a two-step graph convolution, and W_r^(l) and W_0^(l) are the weight matrices of the l-th layer.
(4) The feature vector generated by the GCN automatic encoder is input into a graph feature reconstructor, and the graph node level feature vector is adapted to the word level vector through the graph feature reconstructor.
The feature mapping layer and the graph feature reconstructor take sentences as units, and the design is expressed as follows:
x_c = W_c x'_c + b_c
x'_recon = W_recon x_c + b_recon
where x'_c is the sentence-level graph feature obtained by averaging all node representations of the subgraph constructed for sentence x after it passes through the GCN auto-encoder; W_c, b_c, W_recon and b_recon are trainable weights; x_c is the sentence feature representation mapped by the feature reconstructor into the word-level distribution space and is used as the final GCN vector representation of the sentence x; and x'_recon is the reconstructed graph feature.
Accordingly, the reconstruction loss uses a cosine similarity function of the form:
L_recon = 1 - cos(x'_c, x'_recon)
where x'_c is the sentence feature vector obtained from the GCN auto-encoder after the subgraph of sentence x is constructed, and x'_recon is the sentence feature vector produced by the reconstruction function.
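The mapping layer, reconstructor and cosine reconstruction loss can be sketched as below; the layer sizes are placeholders and the 1 - cos(·,·) form of L_recon is an assumption consistent with the cosine-similarity loss described above:

```python
import torch
import torch.nn as nn

class GraphFeatureReconstructor(nn.Module):
    """Maps graph-level sentence features to the word-level space and reconstructs them back."""
    def __init__(self, graph_dim: int, word_dim: int):
        super().__init__()
        self.mapper = nn.Linear(graph_dim, word_dim)  # x_c = W_c x'_c + b_c
        self.recon = nn.Linear(word_dim, graph_dim)   # x'_recon = W_recon x_c + b_recon

    def forward(self, graph_feat: torch.Tensor):
        x_c = self.mapper(graph_feat)
        x_recon = self.recon(x_c)
        l_recon = 1 - torch.cosine_similarity(x_recon, graph_feat, dim=-1).mean()
        return x_c, l_recon  # mapped feature and reconstruction loss
```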
(5) And splicing vectors generated by the two encoders to serve as vectors of sentences to be input into a classifier, calculating probabilities of all possible polarities of input texts, and selecting an emotion label with the highest probability as a final predictive emotion label to complete emotion analysis tasks.
The feature vector of a sentence can be expressed as:
x = [x_c ; x_w]
where x_c is the common sense knowledge vector with aspect-opinion emotion alignment, x_w is the context-aware sentence vector generated by the BERT encoder, and [ ; ] denotes vector concatenation.
Therefore, in the step of completing the emotion analysis task, the emotion probability is calculated as follows:
where C is the set of possible emotion polarities.
(6) Few-sample prototype network training adopts a metric-learning-based prototype network model: given k samples of each of the positive and negative emotion categories, the prototype of each emotion category is calculated:
c_pos = (1/k) Σ_{j=1}^{k} x_j^pos,   c_neg = (1/k) Σ_{j=1}^{k} x_j^neg
where x_j^{pos/neg} is the feature vector representation of the j-th sample of the positive/negative emotion category;
meanwhile, the emotion probability of a given sentence x is output, calculated as:
p(y = c_i | q) = exp(-d(q, c_i)) / Σ_{i'} exp(-d(q, c_{i'}))
where c_i is the i-th emotion polarity; q is the sample to be tested; and d(·,·) is the Euclidean distance between two vectors.
(7) The aspect-opinion emotion alignment task and the emotion analysis task are trained jointly, and an Adam optimizer is used to obtain the optimal parameters; during this process the loss function can be expressed as:
L = L_recon + L_softmax
where L_recon is the loss of sentence vector representation reconstruction and L_softmax is the cross-entropy loss function of the emotion classification task.
In summary, the method and the device use knowledge enhancement and few-shot learning techniques to capture domain-shared features and domain-specific features, effectively solving the problem that existing methods cannot capture domain-invariant and domain-specific features at the same time, and thereby improving the prediction performance of emotion analysis on target-domain data.
The embodiment also provides a small sample cross-domain emotion analysis device, which comprises:
At least one processor;
at least one memory for storing at least one program;
the at least one program, when executed by the at least one processor, causes the at least one processor to implement the method illustrated in fig. 1.
The small-sample cross-domain emotion analysis device provided by the embodiment of the invention can be used for executing any combination implementation steps of the small-sample cross-domain emotion analysis method provided by the embodiment of the method, and has the corresponding functions and beneficial effects.
Embodiments of the present application also disclose a computer program product or computer program comprising computer instructions stored in a computer readable storage medium. The computer instructions may be read from a computer-readable storage medium by a processor of a computer device, and executed by the processor, to cause the computer device to perform the method shown in fig. 1.
In some alternative embodiments, the functions/acts noted in the block diagrams may occur out of the order noted in the operational illustrations. For example, two blocks shown in succession may in fact be executed substantially concurrently or the blocks may sometimes be executed in the reverse order, depending upon the functionality/acts involved. Furthermore, the embodiments presented and described in the flowcharts of the present invention are provided by way of example in order to provide a more thorough understanding of the technology. The disclosed methods are not limited to the operations and logic flows presented herein. Alternative embodiments are contemplated in which the order of various operations is changed, and in which sub-operations described as part of a larger operation are performed independently.
Furthermore, while the invention is described in the context of functional modules, it should be appreciated that, unless otherwise indicated, one or more of the described functions and/or features may be integrated in a single physical device and/or software module or one or more functions and/or features may be implemented in separate physical devices or software modules. It will also be appreciated that a detailed discussion of the actual implementation of each module is not necessary to an understanding of the present invention. Rather, the actual implementation of the various functional modules in the apparatus disclosed herein will be apparent to those skilled in the art from consideration of their attributes, functions and internal relationships. Accordingly, one of ordinary skill in the art can implement the invention as set forth in the claims without undue experimentation. It is also to be understood that the specific concepts disclosed are merely illustrative and are not intended to be limiting upon the scope of the invention, which is to be defined in the appended claims and their full scope of equivalents.
The functions, if implemented in the form of software functional units and sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention may be embodied essentially or in a part contributing to the prior art or in a part of the technical solution, in the form of a software product stored in a storage medium, comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. And the aforementioned storage medium includes: a usb disk, a removable hard disk, a Read-Only Memory (ROM), a random access Memory (RAM, random Access Memory), a magnetic disk, or an optical disk, or other various media capable of storing program codes.
Logic and/or steps represented in the flowcharts or otherwise described herein, e.g., a ordered listing of executable instructions for implementing logical functions, can be embodied in any computer-readable medium for use by or in connection with an instruction execution system, apparatus, or device, such as a computer-based system, processor-containing system, or other system that can fetch the instructions from the instruction execution system, apparatus, or device and execute the instructions. For the purposes of this description, a "computer-readable medium" can be any means that can contain, store, communicate, propagate, or transport the program for use by or in connection with the instruction execution system, apparatus, or device.
More specific examples (a non-exhaustive list) of the computer-readable medium would include the following: an electrical connection (electronic device) having one or more wires, a portable computer diskette (magnetic device), a Random Access Memory (RAM), a read-only memory (ROM), an erasable programmable read-only memory (EPROM or flash memory), an optical fiber device, and a portable compact disc read-only memory (CDROM). In addition, the computer readable medium may even be paper or other suitable medium on which the program is printed, as the program may be electronically captured, via, for instance, optical scanning of the paper or other medium, then compiled, interpreted or otherwise processed in a suitable manner, if necessary, and then stored in a computer memory.
It is to be understood that portions of the present invention may be implemented in hardware, software, firmware, or a combination thereof. In the above-described embodiments, the various steps or methods may be implemented in software or firmware stored in a memory and executed by a suitable instruction execution system. For example, if implemented in hardware, as in another embodiment, may be implemented using any one or combination of the following techniques, as is well known in the art: discrete logic circuits having logic gates for implementing logic functions on data signals, application specific integrated circuits having suitable combinational logic gates, programmable Gate Arrays (PGAs), field Programmable Gate Arrays (FPGAs), and the like.
In the foregoing description of the present specification, reference to the terms "one embodiment/example", "another embodiment/example", "certain embodiments/examples", and the like means that a particular feature, structure, material, or characteristic described in connection with the embodiment or example is included in at least one embodiment or example of the invention. In this specification, schematic representations of the above terms do not necessarily refer to the same embodiment or example. Furthermore, the particular features, structures, materials, or characteristics described may be combined in any suitable manner in any one or more embodiments or examples.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: many changes, modifications, substitutions and variations may be made to the embodiments without departing from the spirit and principles of the invention, the scope of which is defined by the claims and their equivalents.
While the preferred embodiment of the present application has been described in detail, the present application is not limited to the above embodiments, and various equivalent modifications and substitutions can be made by those skilled in the art without departing from the spirit of the present application, and these equivalent modifications and substitutions are intended to be included in the scope of the present application as defined in the appended claims.

Claims (3)

1. A few-sample cross-domain emotion analysis method, characterized by comprising the following steps:
acquiring sentence data and inputting it into a trained BERT encoder to obtain a first feature vector;
inputting the sentence data into a trained GCN encoder to obtain a second feature vector;
fusing the first feature vector and the second feature vector to obtain the vector representation of the sentence;
inputting the vector representation of the sentence into a trained few-sample prototype network model and outputting the emotion polarity of the sentence;
wherein the few-sample prototype network model is trained as follows: preset labeled samples of positive and negative emotion are obtained, their sentence vector representations are computed and mapped into a feature space, and the average vector of the sentence vector representations with the same polarity is taken as the prototype representation of the corresponding emotion polarity;
the BERT encoder is trained by:
Acquiring texts of a source field or a target field, and training a BERT encoder to acquire rich field feature knowledge; wherein, the feature vector of each sentence in the text is expressed as:
x_w = h_[CLS] = BERT(x)
wherein x is the input sentence, h_[CLS] is the hidden vector representation of the special [CLS] token prepended to the sentence by the BERT encoder, and BERT is the sentence encoder;
The GCN encoder is trained by:
Designing two self-supervision tasks, and training a GCN encoder; the two self-supervision tasks comprise a relationship classification task and an emotion alignment classification task;
in the relation classification task, given any two nodes, a relation classification model based on the GCN encoder must judge the relation between them; in the emotion alignment task, given an aspect word and an opinion word, an emotion alignment model based on the GCN encoder must judge whether they carry the same emotion polarity; the goal of the two self-supervised tasks is to learn domain relation common sense and the emotion alignment features between aspect-opinion pairs, so as to obtain feature vectors containing background common sense and aspect-opinion emotion alignment;
In the relation classification task, the node characteristic vector is obtained by fusion of the neighbor node representations of the nodes, and the fusion process is represented as follows:
h_i^(l+1) = σ( Σ_{r∈R} Σ_{j∈N_i^r} (1/c_{i,r}) W_r^(l) x_j + W_0^(l) x_i )
wherein N_i^r denotes all neighbor nodes of node i under relation r; g_i is the randomly initialized initial node feature vector, which is converted into h_i after a two-step graph convolution; σ is the ReLU activation function; l indexes the graph convolution layer; c_{i,r} is the number of neighbor nodes of node i; W_r^(l) is a parameter matrix to be trained; x_j is the feature vector representation of node j; x_i is the feature vector representation of node i; and W_0^(l) is a parameter matrix to be trained;
the loss function generated by the relation classification task is:
L_rel = - Σ_{i,j ∈ T} [ y log σ(s(v_i, r_{i,j}, v_j)) + (1 - y) log(1 - σ(s(v_i, r_{i,j}, v_j))) ]
wherein s(v_i, r_{i,j}, v_j) is the matrix factorization score function; R_r is the vector representation of the relation r; T is the node set of the graph G; and y indicates whether the relation r_{i,j} holds between the given nodes i and j (y takes the value 1 if it does and 0 otherwise);
the loss function generated by the emotion alignment classification task is:
L_align = - (1/N) Σ_{k=1}^{N} Σ_{p∈P_k} [ y_p log ŷ_p + (1 - y_p) log(1 - ŷ_p) ]
wherein N is the number of unlabeled samples in the source and target domains; P_k is the set of aspect word-emotion word pairs contained in the k-th unlabeled sample; y_p indicates whether the pair p shares the same emotion polarity; and ŷ_p is the predicted alignment probability;
The method also comprises the steps of constructing a common sense knowledge graph:
based on unlabeled samples of the source domain and the target domain, sentence by sentence, words of the specified parts of speech (nouns, verbs and adjectives) are taken as link seeds, and the next-hop knowledge triples are linked out through the ConceptNet common sense knowledge base; finally, the subgraphs linked from all sentences are deduplicated and merged to form a domain common sense knowledge graph, providing knowledge support for cross-domain emotion analysis;
wherein the constructed domain common sense knowledge graph is expressed as:
G = (V, Φ)
wherein v_i ∈ V is a node of the constructed graph and (v_i, r_{i,j}, v_j) ∈ Φ is a relation triple, with r_{i,j} ∈ R denoting the relation between the two nodes v_i and v_j and Φ denoting all triples contained in the graph G;
The method also comprises the following steps:
Calculating the importance degree of the nodes in the knowledge graph by adopting an attention mechanism, wherein the importance degree of each node is expressed as follows:
wherein e_j is the vector representation of the j-th node, α_i is the importance of the i-th node, e_k is the vector representation of the k-th node, and N_i is the set of all neighbor nodes of the i-th node;
The feature fusion is performed on the first feature vector and the second feature vector to obtain a vector representation of a sentence, including:
the first feature vector and the second feature vector are concatenated, the probabilities of all possible polarities of the input text are calculated, and the emotion label with the highest probability is selected as the final predicted emotion label; in the step of completing the emotion analysis task, the feature vector of each sentence is expressed as:
x = [x_w ; x_g]
wherein x_g is the common sense knowledge vector with aspect-opinion emotion alignment, x_w is the context-aware sentence vector generated by the BERT encoder, and [ ; ] denotes vector concatenation;
after the sentence vector representation of a sample is input into the few-sample prototype network model, the following steps are executed:
for k samples of each of the positive and negative emotion categories, the prototype of each emotion category is calculated:
c_pos = (1/k) Σ_{j=1}^{k} x_j^pos,   c_neg = (1/k) Σ_{j=1}^{k} x_j^neg
wherein x_j^{pos/neg} is the feature vector representation of the j-th sample of the positive/negative emotion category;
the emotion probability of the sentence x is output, calculated as:
p(y = c_i | q) = exp(-d(q, c_i)) / Σ_{i'} exp(-d(q, c_{i'}))
wherein c_i is the i-th emotion polarity; q is the sample to be tested; and d(·,·) is the Euclidean distance between two vectors.
2. The method of claim 1, further comprising the steps of:
training the few-sample prototype network model with an Adam optimizer, wherein the loss function during training is expressed as:
L = L_recon + L_softmax
wherein L_recon is the loss of sentence vector representation reconstruction and L_softmax is the cross-entropy loss function of the emotion classification task.
3. A few-sample cross-domain emotion analysis device, characterized by comprising:
At least one processor;
at least one memory for storing at least one program;
The at least one program, when executed by the at least one processor, causes the at least one processor to implement the method of any of claims 1-2.
CN202210661020.8A 2022-06-13 2022-06-13 Cross-domain emotion analysis method and device for few samples Active CN115080688B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210661020.8A CN115080688B (en) 2022-06-13 2022-06-13 Cross-domain emotion analysis method and device for few samples

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210661020.8A CN115080688B (en) 2022-06-13 2022-06-13 Cross-domain emotion analysis method and device for few samples

Publications (2)

Publication Number Publication Date
CN115080688A CN115080688A (en) 2022-09-20
CN115080688B true CN115080688B (en) 2024-06-04

Family

ID=83251179

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210661020.8A Active CN115080688B (en) 2022-06-13 2022-06-13 Cross-domain emotion analysis method and device for few samples

Country Status (1)

Country Link
CN (1) CN115080688B (en)

Families Citing this family (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115905518B (en) * 2022-10-17 2023-10-20 华南师范大学 Emotion classification method, device, equipment and storage medium based on knowledge graph
CN116562305B (en) * 2023-07-10 2023-09-12 江西财经大学 Aspect emotion four-tuple prediction method and system

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110008338A (en) * 2019-03-04 2019-07-12 华南理工大学 A kind of electric business evaluation sentiment analysis method of fusion GAN and transfer learning
CN112860901A (en) * 2021-03-31 2021-05-28 中国工商银行股份有限公司 Emotion analysis method and device integrating emotion dictionaries
CN113722439A (en) * 2021-08-31 2021-11-30 福州大学 Cross-domain emotion classification method and system based on antagonism type alignment network

Family Cites Families (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US11868730B2 (en) * 2020-09-23 2024-01-09 Jingdong Digits Technology Holding Co., Ltd. Method and system for aspect-level sentiment classification by graph diffusion transformer

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110008338A (en) * 2019-03-04 2019-07-12 华南理工大学 A kind of electric business evaluation sentiment analysis method of fusion GAN and transfer learning
CN112860901A (en) * 2021-03-31 2021-05-28 中国工商银行股份有限公司 Emotion analysis method and device integrating emotion dictionaries
CN113722439A (en) * 2021-08-31 2021-11-30 福州大学 Cross-domain emotion classification method and system based on antagonism type alignment network

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
A sentence sentiment classification method based on self-attention; Yu Shanshan; Su Jindian; Li Pengfei; Computer Science; 2020-04-30 (No. 04); pp. 205-207 *

Also Published As

Publication number Publication date
CN115080688A (en) 2022-09-20

Similar Documents

Publication Publication Date Title
CN108984724B (en) Method for improving emotion classification accuracy of specific attributes by using high-dimensional representation
Li et al. Natural language generation using deep learning to support MOOC learners
CN110008338B (en) E-commerce evaluation emotion analysis method integrating GAN and transfer learning
CN113672708B (en) Language model training method, question-answer pair generation method, device and equipment
CN111126386B (en) Sequence domain adaptation method based on countermeasure learning in scene text recognition
CN115080688B (en) Cross-domain emotion analysis method and device for few samples
CN108664589B (en) Text information extraction method, device, system and medium based on domain self-adaptation
CN106980608A (en) A kind of Chinese electronic health record participle and name entity recognition method and system
CN117149989B (en) Training method for large language model, text processing method and device
CN111914085B (en) Text fine granularity emotion classification method, system, device and storage medium
CN107590127B (en) Automatic marking method and system for question bank knowledge points
CN110309514A (en) A kind of method for recognizing semantics and device
CN112749274B (en) Chinese text classification method based on attention mechanism and interference word deletion
CN113626589B (en) Multi-label text classification method based on mixed attention mechanism
CN106682387A (en) Method and device used for outputting information
JP2022128441A (en) Augmenting textual data for sentence classification using weakly-supervised multi-reward reinforcement learning
CN111507093A (en) Text attack method and device based on similar dictionary and storage medium
CN117370736B (en) Fine granularity emotion recognition method, electronic equipment and storage medium
CN113988079A (en) Low-data-oriented dynamic enhanced multi-hop text reading recognition processing method
CN114626529A (en) Natural language reasoning fine-tuning method, system, device and storage medium
CN114462418A (en) Event detection method, system, intelligent terminal and computer readable storage medium
CN116757195B (en) Implicit emotion recognition method based on prompt learning
CN114626378A (en) Named entity recognition method and device, electronic equipment and computer readable storage medium
Zeng Intelligent test algorithm for English writing using English semantic and neural networks
WO2023159759A1 (en) Model training method and apparatus, emotion message generation method and apparatus, device and medium

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant