CN116361420A - Comment data enhancement and aspect-level emotion analysis method based on multi-prompt learning - Google Patents
- Publication number: CN116361420A (application CN202310340273.XA)
- Authority: CN (China)
- Prior art keywords: prompt, template, emotion, bert, polarity
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F16/3344 — Query execution using natural language analysis
- G06F16/353 — Clustering; classification into predefined classes
- G06F17/16 — Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
- G06F40/289 — Phrasal analysis, e.g. finite state techniques or chunking
- G06F40/30 — Semantic analysis
- G06N3/04 — Neural networks; architecture, e.g. interconnection topology
- G06N3/08 — Neural networks; learning methods
- Y02D10/00 — Energy efficient computing, e.g. low power processors, power management or thermal management
Abstract
The invention discloses a comment data enhancement and aspect-level emotion analysis method based on multi-prompt learning, relating to natural language processing in artificial intelligence. First, a prompt template structure composed of an input slot, an aspect slot, a polarity answer slot, an emotion keyword slot and a template pattern is proposed, and on this basis an automatic prompt template generation method based on the BERT model is provided. Second, the proposed generation method produces an optimized prompt template for a specified aspect-level emotion analysis dataset, and this optimized template is used to augment the dataset's training set. Finally, the BERT model is fine-tuned with multi-prompt learning on the augmented training set, yielding an aspect-level emotion analysis BERT model based on multi-prompt learning that addresses the aspect-level emotion analysis problem more effectively.
Description
Technical Field
The invention relates to emotion analysis in the field of natural language processing, in particular to a comment data enhancement and aspect-level emotion analysis method based on multi-prompt learning, which can be widely applied to aspect-level emotion analysis tasks in various fields.
Background
The purpose of aspect-level emotion classification is to predict the emotion polarity of aspect words in a sentence or document. It is a fine-grained emotion analysis task: unlike traditional emotion analysis, it performs polarity analysis (typically a three-way classification into positive, negative and neutral) on individual aspect words rather than on the whole text. Aspect-level emotion classification is commonly applied to reviewers' comment sentences, such as shopping comments in a mall, food comments and movie comments, and a single sentence typically involves multiple aspect words with their associated emotion polarities.
With the continued development of artificial neural network technology, various neural networks, such as Long Short-Term Memory (LSTM), the Deep Memory Network, and the Bidirectional Encoder Representations from Transformers (BERT) neural network language model proposed by Google AI Language, have been applied to aspect polarity classification, providing end-to-end classification methods that require no feature engineering. However, when a sentence contains multiple targets, the aspect polarity classification task must distinguish the emotions of the different aspects. Compared with sentence-level emotion analysis, aspect polarity classification is therefore more complex, and although a pre-trained neural network language model can understand comments deeply, it needs more corpus for fine-tuning. Unfortunately, because aspect polarity labeling is time-consuming and labor-intensive, corpora for aspect-level emotion analysis are usually small, and their samples are unevenly distributed across polarities. To address this problem, the invention proposes a comment data enhancement and aspect-level emotion analysis method based on multi-prompt learning.
A prompt refers to a question, a cloze completion, an explanation or a demonstration attached to the original input. Prompting is a new learning technique that emerged with the wide adoption of pre-trained neural network language models, and it aims to probe the kinds of knowledge a pre-trained model has learned. The invention defines a reasonable prompt template structure, proposes an automatic prompt template generation method based on the pre-trained BERT neural network model, generates optimized template pattern sequences for different datasets, and thereby provides a comment data enhancement and aspect-level emotion analysis method based on multi-prompt learning.
Disclosure of Invention
The invention discloses a comment data enhancement and aspect emotion analysis method based on multi-prompt learning, which is characterized by comprising the following steps of:
s1, defining the structure of a prompt template for aspect emotion analysis to be composed of an input slot, an aspect slot, a polarity answer slot, an emotion keyword slot and a prompt mode;
s2, providing an automatic prompting template generation method based on the BERT model based on the structure of the prompting template defined in the step S1;
s3, generating an optimized prompting template for the appointed aspect-level emotion analysis data set psi by using the automatic prompting template generation method based on the BERT model provided in the step S2;
s4, carrying out data enhancement on a training set of the aspect-level emotion analysis data set psi by using the optimized prompt template generated in the step S3;
s5, performing fine adjustment of multi-prompt learning on the BERT model by using the training set of the enhanced aspect emotion analysis data set ψ of the data obtained in the step S4 to obtain an aspect emotion analysis BERT model based on multi-prompt learning;
s6, performing emotion prediction on the aspect targets in the test set of the aspect emotion analysis data set ψ by using the BERT model finely tuned in the step S5;
the BERT model refers to the Bidirectional Encoder Representations from Transformers (BERT) neural network language model proposed by Google AI Language.
Further, the step S1 specifically includes:
s1.1, defining the structure of a prompt template of the aspect emotion analysis into the following form:
T=f(X,A,Z,K;P) (1)
wherein T is a defined prompting template, X is an original comment sentence, A is an aspect target to be predicted in X, Z is a potential emotion polarity answer, K is an emotion keyword, P is a template mode, and f (X, A, Z, K; P) is a constructor for filling X, A, K and Z into P;
the emotion keyword is an emotion noun reflecting emotion characteristics;
the template mode is a sentence frame comprising an input slot [ X ], an aspect slot [ A ], a polarity answer slot [ Z ] and an emotion keyword slot [ K ];
the input groove [ X ] is used for filling X, the aspect groove [ A ] is used for filling A, the polarity answer groove [ Z ] is used for filling Z, and the emotion keyword groove [ K ] is used for filling K;
s1.2, classifying the types of the prompt templates of the aspect emotion analysis into two main types of modification prompt templates and prefix prompt templates;
the modification prompt template is a prompt template in which the prompt appears in the middle of the evaluation sentence as a modifier; for example, for the English comment "the staff was so horrible", the modification cue is: "the staff that gets a [Z] comment was so horrible";
The template mode of the modification prompt template is defined as follows:
P_m = [X1] + [A] + f_m(Z, K) + [X2] (2)
wherein P_m is the defined template pattern, [X1] and [X2] are two input sub-slots, X1 is the sentence component to the left of the aspect target A to be predicted in X, X2 is the sentence component to its right, and f_m(Z, K) is a constructor that uses Z and K to form a modification cue;
the prefix prompt template is a prompt template that presents the prompt as an independent sentence after the evaluation sentence, for example: "the staff was so horrible. the staff gets a [Z] comment";
The template mode of the prefix hint template is defined as follows:
P_p = [X]. + f_p(A, Z, K). (3)
wherein P_p is the defined template pattern, and f_p(A, Z, K) is a constructor that uses A, Z and K to form a prefix prompt.
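As a minimal illustration (not the patent's actual implementation), the two template patterns of step S1 can be sketched as plain string constructors; the joining logic of f_m and f_p, and the example fillers, are assumptions based on the examples above:

```python
# Sketch of the two prompt-template patterns defined in step S1.
# How f_m and f_p join the slots is an assumption for illustration.

def modification_template(x1, aspect, z, keyword, x2):
    """P_m = [X1] + [A] + f_m(Z, K) + [X2]: the prompt appears
    mid-sentence as a modifier of the aspect target."""
    return f"{x1} {aspect} that gets a {z} {keyword} {x2}"

def prefix_template(x, aspect, z, keyword):
    """P_p = [X]. + f_p(A, Z, K): the prompt is an independent
    sentence appended after the evaluation sentence."""
    return f"{x}. {aspect} gets a {z} {keyword}"

if __name__ == "__main__":
    # [Z] is left unfilled: it is the polarity answer slot to be predicted.
    print(modification_template("the", "staff", "[Z]", "comment", "was so horrible"))
    print(prefix_template("the staff was so horrible", "the staff", "[Z]", "comment"))
```

Filling [Z] with a concrete polarity answer ("positive", "negative", "neutral") turns either template into a complete sentence whose plausibility the BERT model can score.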
Further, the step S2 specifically includes:
s2.1, screening out common emotion keywords from emotion analysis corpus to form an emotion keyword set D;
S2.2 using a simplified BERT-based sentence-pair prompt pattern P_0 = [CLS] + [X] + [SEP] + [A] + [Z] + [K] + [SEP], test the emotion keyword set D on the training set of the specified aspect-level emotion analysis dataset ψ with the pre-trained BERT model and the classification layer of its next-sentence-prediction task, to generate the optimal emotion keyword k̂ of ψ. The calculation process is as follows:
x_k = BertTokenizer(f(x, a, z_a, k; P_0)) (4)
H_k = BERT(x_k) (5)
o_k = W_b · h_k^[CLS] + b_b (6)
p(y = ŷ | x_k; θ_b) = exp(o_{k,ŷ}) / Σ_{y∈B} exp(o_{k,y}) (7)
k̂ = argmax_{k∈D} Σ_{i=1}^{|E|} log p(y_yes | x_k^(i); θ_b) (8)
wherein [CLS] is the classifier token in the BERT model, [SEP] is the separator in the BERT model, k is any emotion keyword in D, x is a comment sample with aspect target a, z_a is the true emotion polarity answer of aspect target a, x_k ∈ R^{n×e} is the comment-and-prompt sentence pair formed by appending to x the prompt filled with emotion keyword k, n is the number of tokens of x_k in BERT, e is the dimension of word encoding in the BERT model, BERT(·) denotes the pre-trained BERT model, H_k ∈ R^{n×d} is the hidden state sequence of x_k after BERT processing, d is the dimension of the BERT hidden state, h_k^[CLS] is the hidden state corresponding to the classifier token [CLS] in x_k, o_k ∈ R^{|B|} is the confidence vector of x_k for filling emotion keyword k, B = {yes, no} is the set of logical values, |B| is the number of elements in B, W_b ∈ R^{|B|×d} is the representation matrix of the logical values in B, b_b ∈ R^{|B|} is the bias vector of the BERT classification layer, ŷ is the predicted logical value, y is a logical value in B, o_{k,ŷ} is the confidence score of x_k when the established logical value is ŷ, o_{k,y} is the confidence score of x_k when y is taken as the established logical value, p(y = ŷ | x_k; θ_b) is the predicted probability that the logical value of x_k is ŷ, θ_b denotes all parameters of the BERT model, exp(·) denotes the exponential function with base e, x_k^(i) is the i-th x_k in E, E is the training set of the specified aspect-level emotion analysis dataset ψ, |E| is the number of comment samples in E, y_yes is the logical label with logical value "yes", the function argmax_k(·) returns the k that maximizes its argument, and the function BertTokenizer(·) is the tokenizer of the BERT model;
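The keyword search of equations (4)–(8) reduces to an argmax over per-keyword sums of log-probabilities. A sketch follows, with a stubbed scorer standing in for BERT's next-sentence-prediction head; the keyword candidates and the scores are invented for illustration:

```python
import math

# Stub for p(y_yes | x_k): in the patent this comes from the BERT
# next-sentence-prediction classification layer; here it is faked so
# the selection logic can run without a model.
def p_yes(sample, keyword, fake_scores):
    return fake_scores[keyword]

def best_keyword(keywords, train_set, fake_scores):
    """k_hat = argmax_k sum_i log p(y_yes | x_k^(i))  -- equation (8)."""
    def total_log_prob(k):
        return sum(math.log(p_yes(x, k, fake_scores)) for x in train_set)
    return max(keywords, key=total_log_prob)

if __name__ == "__main__":
    keywords = ["comment", "evaluation", "opinion"]   # candidate set D (hypothetical)
    train = ["sample1", "sample2"]                    # placeholder training set E
    scores = {"comment": 0.9, "evaluation": 0.6, "opinion": 0.4}
    print(best_keyword(keywords, train, scores))      # keyword with highest summed log-prob
```

Summing log-probabilities over the whole training set rewards a keyword that makes the prompt plausible on average, not just on a few samples.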
s2.3, respectively designing three prompting modes for the prefix prompting template and the modification prompting template according to the position relation between the aspect target and other words to form a discrete space M of the template modes, as shown in a table 1:
TABLE 1 discrete spaces M of template patterns
S2.4 with the optimal emotion keyword k̂ of the specified aspect-level emotion analysis dataset ψ generated in step S2.2, and using the pre-trained BERT model and the classification layer of its next-sentence-prediction task, test the discrete space M of template patterns on the training set of ψ to generate an optimized template pattern sequence P̂ of ψ. The calculation process is as follows:
x_p = BertTokenizer(f(x', a', z_{a'}, k̂; p)) (9)
H_p = BERT(x_p) (10)
o_p = W_b · h_p^[CLS] + b_b (11)
p(y = ŷ | x_p; θ_b) = exp(o_{p,ŷ}) / Σ_{y∈B} exp(o_{p,y}) (12)
P̂ = argsort_{p∈M} Σ_{i=1}^{|E|} log p(y_yes | x_p^(i); θ_b) (13)
wherein p is any template pattern in M, x' is a comment sample with aspect target a', z_{a'} is the true emotion polarity answer of aspect target a', x_p ∈ R^u is the comment input with prompt formed by appending to x' the prompt of template pattern p, u is the number of tokens of x_p in BERT, H_p ∈ R^{u×d} is the hidden state sequence of x_p after BERT processing, h_p^[CLS] is the hidden state corresponding to the classifier token [CLS] in x_p, o_p ∈ R^{|B|} is the confidence vector of x_p using template pattern p, o_{p,ŷ} is the confidence score of x_p when the established logical value is ŷ, o_{p,y} is the confidence score of x_p when y is taken as the established logical value, p(y = ŷ | x_p; θ_b) is the predicted probability that the logical value of x_p is ŷ, x_p^(i) is the i-th x_p in E, and the function argsort_p(·) ranks p so that the arguments are ordered in descending order.
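Equation (13) differs from equation (8) only in returning a full descending ranking rather than a single maximizer. A sketch with a stubbed scorer in place of the BERT next-sentence-prediction head; the pattern names and probabilities are invented:

```python
import math

def rank_patterns(patterns, train_set, score_fn):
    """P_hat = argsort_p sum_i log p(y_yes | x_p^(i))  -- equation (13):
    template patterns ordered by descending summed log-probability."""
    def total_log_prob(p):
        return sum(math.log(score_fn(x, p)) for x in train_set)
    return sorted(patterns, key=total_log_prob, reverse=True)

if __name__ == "__main__":
    # Invented per-pattern probabilities standing in for the BERT NSP head.
    fake = {"prefix-1": 0.8, "prefix-2": 0.5, "modif-1": 0.7}
    ranked = rank_patterns(list(fake), ["x1", "x2"], lambda x, p: fake[p])
    print(ranked)  # best-scoring pattern first
```

Keeping the whole ranked sequence, rather than only the best pattern, is what later lets step S4 pair minority-polarity samples with several top patterns at once.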
Further, the step S3 specifically includes:
generating an optimized prompt template (k̂, P̂) for the specified aspect-level emotion analysis dataset ψ by using the automatic prompt template generation method based on the BERT model proposed in step S2, wherein k̂ denotes the optimal emotion keyword of ψ, calculated by equation (8) in step S2.2, and P̂ denotes the optimized template pattern sequence of ψ, i.e. the template patterns in M ranked by equation (13).
Further, in the step S4, the training set of the aspect emotion analysis data set ψ is subjected to data enhancement, which follows the following principles:
(1) The data enhancement of the training set refers to the expansion of comment samples in the training set and the pairing of prompt modes;
(2) To avoid overfitting, the moderation principle is followed when using multi-hint learning augmentation data, i.e. only the training subset of polarity with a small number of samples is extended and at least one of the training subsets of polarity is kept unchanged;
(3) When expanding a training subset, each original comment sentence is paired, as required, with several top-ranked prompt patterns from the sequence P̂ generated in step S3, forming several comment samples with different prompt patterns; the subsets that are not expanded use only the first-ranked pattern in P̂, paired with each original comment sentence, to form the corresponding comment sample with a prompt pattern. For example, for the 14Lap dataset, the top three prompt patterns are used to expand the training subset on neutral, and only the first prompt pattern is used on the other polarities, as shown in Table 2:
TABLE 2 prompt template for extending training samples in a 14Lap dataset
(4) Only the training samples of the aspect-level emotion analysis dataset ψ are expanded while the number of test samples is kept unchanged.
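The moderation principle above, expand only the minority-polarity subsets with several top-ranked patterns while the other subsets keep a single pattern, can be sketched as follows; the sentences, polarity labels and pattern names are placeholders:

```python
def augment(train_set, ranked_patterns, expand_polarities, top_n):
    """Pair each training sentence with prompt patterns per step S4's
    moderation principle: polarities in expand_polarities get the
    top_n ranked patterns, every other polarity only the first one."""
    augmented = []
    for sentence, polarity in train_set:
        n = top_n if polarity in expand_polarities else 1
        for pattern in ranked_patterns[:n]:
            augmented.append((sentence, polarity, pattern))
    return augmented

if __name__ == "__main__":
    train = [("great battery", "positive"), ("screen is ok", "neutral")]
    patterns = ["p1", "p2", "p3"]   # P_hat from step S3, best first (hypothetical)
    out = augment(train, patterns, expand_polarities={"neutral"}, top_n=3)
    print(len(out))  # 1 positive sample + 3 neutral samples = 4
```

Only the training split is ever expanded; the test split keeps one randomly chosen pattern per sentence at prediction time (step S6), so evaluation sizes stay comparable.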
Further, the step S5 specifically includes:
S5.1 take a comment sample with prompt pattern from the training set of the aspect-level emotion analysis dataset ψ expanded in step S4 and feed it into the BERT model BERTA to be fine-tuned for aspect-level emotion analysis, obtaining the BERT-based input sequence x_s and its hidden state sequence H_s in BERTA. The calculation process is as follows:
x_s = BertTokenizer(f(x', a', [MASK], k̂; p')) (14)
H_s = BERTA(x_s) (15)
wherein x' is the original comment sentence in the taken comment sample, p' is the prompt pattern paired with x', a' is the aspect target to be evaluated in x', k̂ is the optimal emotion keyword of the dataset ψ, [MASK] is the mask token in the BERT model, and BERTA(·) denotes the BERT model to be fine-tuned for aspect-level emotion analysis;
S5.2 feed the hidden state h_s^[CLS] corresponding to the classifier token [CLS] in H_s into the BERTA classification layer to obtain the confidence vector o_s over the polarity answer set Ω = {positive, negative, neutral}, where |Ω| is the number of polarity answers in Ω, and predict the answer of the specified polarity for [MASK]. The calculation process is as follows:
o_s = W_o · h_s^[CLS] + b_o (16)
p(z = w | x_s; θ_a) = exp(o_{s,w}) / Σ_{w'∈Ω} exp(o_{s,w'}) (17)
wherein W_o ∈ R^{|Ω|×d} is the representation matrix of the polarity answers in Ω, b_o ∈ R^{|Ω|} is the bias vector of the BERTA classification layer, θ_a denotes all parameters of the BERTA model, p(z = w | x_s; θ_a) is the predicted probability that [MASK] in x_s is the polarity answer w, and w is any polarity answer in Ω;
S5.3 fine-tune the BERTA model using the following cross-entropy loss function:
L(θ_a) = -Σ_i log p(z = z_i | x_s^(i); θ_a) (18)
wherein x_s^(i) is the i-th sample in the expanded training set and z_i is its true polarity answer;
steps S5.1 to S5.3 are repeated until all of the expanded, prompt-paired training set samples have been learned.
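The fine-tuning objective of step S5.3 is a standard cross-entropy (negative log-likelihood) over the polarity answers. A sketch of the loss on stubbed predicted distributions, with no actual BERTA model involved; whether the patent sums or averages over a batch is not stated, so this sketch uses the summed form:

```python
import math

def cross_entropy_loss(batch):
    """Summed negative log-likelihood of the true polarity answer
    under the model's softmax output: L = -sum_i log p(z_i | x_i)."""
    return -sum(math.log(probs[true_z]) for probs, true_z in batch)

if __name__ == "__main__":
    # Each element: (predicted distribution over Omega, true polarity answer).
    # The probabilities below are invented stand-ins for equation (17).
    batch = [
        ({"positive": 0.7, "negative": 0.2, "neutral": 0.1}, "positive"),
        ({"positive": 0.1, "negative": 0.3, "neutral": 0.6}, "neutral"),
    ]
    print(round(cross_entropy_loss(batch), 4))  # -(ln 0.7 + ln 0.6) ≈ 0.8675
```

Minimizing this loss pushes probability mass toward the true polarity answer at the [MASK] position for every prompt-paired training sample.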
Further, the step S6 specifically includes:
randomly select a prompt pattern from P̂, pair it with the original comment sentence to be tested to form a comment input with prompt, feed it into the BERTA model fine-tuned in step S5, process it with formulas (14) to (17), and obtain the emotion polarity of the comment sentence to be tested on the specified aspect target by the following formula (19):
ẑ = argmax_{z∈Ω} p(z | x; θ_a) (19)
wherein x is the comment input with prompt, z is any polarity answer in Ω, ẑ is the computed emotion polarity, and the function argmax_z(·) returns the z that maximizes its argument.
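Prediction in step S6, formula (19), is just a softmax over the three polarity answers followed by an argmax. A sketch with hypothetical confidence scores standing in for the BERTA classification layer output of formula (16):

```python
import math

def predict_polarity(logits):
    """z_hat = argmax_z p(z | x) over Omega = {positive, negative, neutral},
    where p is the softmax of the classification-layer confidence vector."""
    total = sum(math.exp(v) for v in logits.values())
    probs = {z: math.exp(v) / total for z, v in logits.items()}
    return max(probs, key=probs.get), probs

if __name__ == "__main__":
    # Invented confidence scores o for one comment input with prompt.
    o = {"positive": 0.3, "negative": 2.1, "neutral": -0.5}
    label, probs = predict_polarity(o)
    print(label)  # the largest confidence score wins the argmax
```

Since softmax is monotone, the argmax over probabilities equals the argmax over raw confidence scores; computing the full distribution is only needed when a confidence estimate is wanted alongside the label.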
The invention has the following advantages:
(1) A prompt model is proposed to probe which knowledge related to the aspect-level emotion analysis task the pre-trained BERT model has mastered, and the input representation is brought close to this knowledge to improve the model's understanding of comments;
(2) An automatic template generation method is proposed to optimize the prompt patterns for different aspect-level emotion analysis datasets, enhancing the model's adaptability to different aspect-level emotion analysis application scenarios;
(3) To address the imbalance in training and performance among different polarities in existing aspect-level emotion analysis models, a data enhancement method based on multi-prompt learning is proposed to expand the training set of the aspect-level emotion analysis task;
(4) It is confirmed that multi-prompt learning is meaningful for training aspect-level emotion analysis task models.
Drawings
FIG. 1 is a schematic flow chart of the method of the present invention.
Detailed Description
The present invention is further illustrated below with reference to specific examples, but the scope of the present invention is not limited to the following examples.
For a specified aspect-level emotion analysis data set ψ, according to the flow chart of the method of the present invention shown in fig. 1, the enhancement of comment data based on multi-prompt learning and aspect-level emotion analysis are completed by the following steps:
s1, defining the structure of a prompt template for aspect emotion analysis to be composed of an input slot, an aspect slot, a polarity answer slot, an emotion keyword slot and a prompt mode;
s2, providing an automatic prompting template generation method based on the BERT model based on the structure of the prompting template defined in the step S1;
s3, generating an optimized prompting template for the appointed aspect-level emotion analysis data set psi by using the automatic prompting template generation method based on the BERT model provided in the step S2;
s4, carrying out data enhancement on a training set of the aspect-level emotion analysis data set psi by using the optimized prompt template generated in the step S3;
s5, performing fine adjustment of multi-prompt learning on the BERT model by using the training set of the enhanced aspect emotion analysis data set ψ of the data obtained in the step S4 to obtain an aspect emotion analysis BERT model based on multi-prompt learning;
s6, performing emotion prediction on the aspect targets in the test set of the aspect emotion analysis data set ψ by using the BERT model finely tuned in the step S5;
the BERT model refers to Bidirectional Encoder Representations from Transformers (BERT, bi-directional encoder representation based on transducers) neural network language model proposed by Google AI Language.
Further, the step S1 specifically includes:
s1.1, defining the structure of a prompt template of the aspect emotion analysis into the following form:
T=f(X,A,Z,K;P) (1)
wherein T is a defined prompting template, X is an original comment sentence, A is an aspect target to be predicted in X, Z is a potential emotion polarity answer, K is an emotion keyword, P is a template mode, and f (X, A, Z, K; P) is a constructor for filling X, A, K and Z into P;
the emotion keyword is an emotion noun reflecting emotion characteristics;
the template mode is a sentence frame comprising an input slot [ X ], an aspect slot [ A ], a polarity answer slot [ Z ] and an emotion keyword slot [ K ];
the input groove [ X ] is used for filling X, the aspect groove [ A ] is used for filling A, the polarity answer groove [ Z ] is used for filling Z, and the emotion keyword groove [ K ] is used for filling K;
s1.2, classifying the types of the prompt templates of the aspect emotion analysis into two main types of modification prompt templates and prefix prompt templates;
the modifier alert template is an alert template in which alerts appear in the middle of an evaluation statement in the form of a stationary language, for example, for an english comment: "the)staff was so horrible ", modified cues were: "the)staff that gets a[Z]comment was so horrible";
The template mode of the modification prompt template is defined as follows:
P m =[X1]+[A]+f m (Z,K)+[X2]. (2)
wherein P is m For the defined template pattern, [ X1 ]]And [ X2 ]]For two input subslots, X1 is the left sentence component of the aspect target A to be predicted in X, X2 is the right sentence component of the aspect target A to be predicted in X, f m (Z, K) is a constructor that uses Z and K to form a modification cue;
the prefix hint template is a hint template that presents hints as independent sentences behind an evaluation sentence, for example: "the staff ws so rigidstaff gets a[Z]comment";
The template mode of the prefix hint template is defined as follows:
P p =[X].+f p (A,Z,K). (3)
wherein P is p For the defined template pattern, f p { A, Z, K) is a constructor that uses A, Z and K to form prefix hints.
Further, the step S2 specifically includes:
s2.1, screening out common emotion keywords from emotion analysis corpus to form an emotion keyword set D;
s2.2 using a simplified BERT-based sentence pair hint mode: p (P) 0 =[CLS]+[X]+[SEP]+[A]+[Z]+[K]+[SEP]Testing the emotion keyword set D on a training set of a designated aspect-level emotion analysis data set ψ by using a pre-training BERT model and a classification layer of a next sentence prediction task of the BERT model to generate an optimal emotion keyword of ψThe calculation process is as follows:
x k =BertTokenizer(f(x,a,z a ,k;p 0 )) (4)
H k =BERT(x k ) (5)
wherein, [ CLS ]]For classifier in BERT model, [ SEP ]]For separators in the BERT model, k is any emotion keyword in D, x is a comment sample with aspect target a, z a For true emotion polarity answer of aspect target a, x k ∈R n×e Comment and prompt sentence pairs formed by adding prompt filled with emotion keyword k for x, n is x k The number of words in the BERT, e, is the dimension of word encoding in the BERT model, BERT (&) represents the pre-trained BERT model, H k ∈R n×d Is x k The hidden state sequence after BERT processing, d is the dimension of the hidden state of the BERT model,is x k Middle classifier [ CLS ]]Corresponding hidden state o k ∈R |B| Is x k Confidence vector of filling emotion keyword k, B= { yes, no } is set of logic values, b|is number of elements in set B, W b ∈R |B|×d Is a representation matrix of logical values in B, B b ∈R |B| Is the bias vector of the BERT classification layer, +.>For the logical value of the probability, y is a logical value in B,/and y is a logical value in B>Is x k The logical value of true is +.>Confidence score of time, o k,y Is x k Confidence score when y is taken as the established logical value, +.>For predicting x k The logical value of true is +.>Probability of θ b All parameters representing the BERT model, exp (. Cndot.) representing the exponential function with base e,/->Is the ith x in E k E is a training set of a specified aspect-level emotion analysis dataset ψ, |E| is the number of comment samples in E, y yes A logical tag with a logical value "yes", function->Solving k which enables the function argument to be maximum, wherein the function BertTokenizer (·) is a word segmentation device of the BERT model;
s2.3, respectively designing three prompting modes for the prefix prompting template and the modification prompting template according to the position relation between the aspect target and other words to form a discrete space M of the template modes, as shown in a table 1:
TABLE 1 discrete spaces M of template patterns
S2.4 optimal emotion keywords for the specified aspect level emotion analysis dataset ψ generated using step S2.2And using the pre-trained BERT model and the classification layer of the next sentence prediction task of the BERT model, testing the discrete space M of the template model on the training set of ψ to generate an optimized template pattern sequence of the specified aspect level emotion analysis data set ψ>The calculation process is as follows:
H p =BERT(x p ) (10)
wherein p is any template pattern in M, x 'is a comment sample with aspect target a', z a′ For true emotion polarity answer of aspect target a', x p ∈R u Comment input with prompt formed by adding prompt of template mode p for x', u is x p Number of words in BERT, H p ∈R u×d Is x p The hidden state sequence after BERT processing,is x p Middle classifier [ CLS ]]Corresponding hidden state o p ∈R |B| Is x p Confidence vector using template pattern p, +.>Is x p The logical value of true is +.>Confidence score of time, o p,y Is x p Confidence score when y is taken as the established logical value, +.>For finding x p The logical value of true is +.>Prediction probability of +.>Is the ith x in E p Function->The ranking of p is found such that the arguments are ordered in descending order.
Further, the step S3 specifically includes:
generating an optimized prompt template for the designated aspect-level emotion analysis data set ψ by using the automatic prompt template generation method based on the BERT model proposed in the step S2Wherein (1)>Optimal emotion keyword representing ψ ++>Calculated by equation (8) in step S2.2,/is>Representing an optimized template pattern sequence of ψ, which refers to the template pattern sequence ranked by equation (13) in M.
Further, in the step S4, the training set of the aspect emotion analysis data set is subjected to data enhancement, which follows the following principles:
(1) The data enhancement of the training set refers to the expansion of comment samples in the training set and the pairing of prompt modes;
(2) To avoid overfitting, a moderation principle is followed when augmenting data with multi-prompt learning, i.e. only the polarity training subsets with few samples are expanded, and at least one polarity training subset is kept unchanged;
(3) When expanding a training subset, each original comment sentence is paired, as required, with several top-ranked prompt patterns from the optimized sequence P̂_ψ generated in step S3, forming several comment samples with different prompt patterns, while for the unexpanded subsets only the first-ranked prompt pattern is paired with each original comment sentence to form the corresponding comment sample with a prompt pattern; for example, for the 14Lap dataset the top three prompt patterns are used to expand the neutral training subset, and only the first-ranked prompt pattern is used on the other polarities, as shown in Table 2:
TABLE 2 prompt template for extending training samples in a 14Lap dataset
(4) Only the training samples of the aspect-level emotion analysis dataset are expanded while the number of test samples is kept unchanged.
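Principles (1)–(3) above can be sketched as follows. This is an illustrative simplification with invented comment sentences, and it mirrors the 14Lap setting in which only the neutral subset is expanded with the top three patterns while the other polarities keep one pattern each.

```python
def augment_training_set(subsets_by_polarity, ranked_patterns, k_by_polarity):
    """Pair each original comment sentence with prompt patterns.

    subsets_by_polarity: {polarity: [comment sentence, ...]}
    ranked_patterns:     pattern ids from step S3, best-ranked first
    k_by_polarity:       {polarity: how many top patterns to use};
                         minority polarities get k > 1, the rest default to 1.
    Returns a flat list of (comment, pattern) training samples.
    """
    samples = []
    for polarity, comments in subsets_by_polarity.items():
        top_k = ranked_patterns[:k_by_polarity.get(polarity, 1)]
        for comment in comments:
            for pattern in top_k:
                samples.append((comment, pattern))
    return samples

# Hypothetical comments; neutral is expanded with the top three patterns,
# positive and negative keep only the first-ranked pattern.
subsets = {
    "positive": ["great battery life"],
    "negative": ["keyboard feels cheap"],
    "neutral":  ["screen is 14 inches"],
}
augmented = augment_training_set(subsets, ["P1", "P2", "P3"], {"neutral": 3})
print(len(augmented))  # 1 + 1 + 3 = 5 comment samples with prompt patterns
```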
Further, the step S5 specifically includes:
S5.1, taking a comment sample with prompt pattern (x̃, p̃) from the training set of the aspect-level emotion analysis dataset ψ expanded in step S4, and feeding it into the BERT model BERTA to be fine-tuned for aspect-level emotion analysis, obtaining the BERT-based input sequence x̃_p and the hidden state sequence H̃_p of x̃_p in BERTA; the calculation process is as follows:

x̃_p = BertTokenizer(f(x̃, ã, [MASK], k̂_ψ; p̃)) (14)

H̃_p = BERTA(x̃_p) (15)

wherein x̃ is the original comment sentence in the taken comment sample with prompt pattern, p̃ is the prompt pattern paired with x̃, ã is the aspect target of x̃ to be evaluated, k̂_ψ is the optimal emotion keyword of the dataset ψ to which x̃ belongs, [MASK] is the mask symbol in the BERT model, filled into the polarity answer slot, and BERTA(·) represents the BERT model to be fine-tuned for aspect-level emotion analysis;
S5.2, feeding the hidden state H̃_p^[CLS] corresponding to the classifier [CLS] in H̃_p into the classification layer of BERTA to obtain the confidence vector õ_p of x̃_p on the polarity answer set Ω = {positive, negative, neutral}, where |Ω| is the number of polarity answers in Ω, and predicting the probability that [MASK] is the specified polarity answer w; the calculation process is as follows:

õ_p = W_Ω · H̃_p^[CLS] + b_Ω (16)

P(w | x̃_p; θ_Ω) = exp(õ_p,w) / Σ_w'∈Ω exp(õ_p,w') (17)

wherein W_Ω ∈ R^(|Ω|×d) is the representation matrix of the polarity answers in Ω, b_Ω ∈ R^|Ω| is the bias vector of the BERTA classification layer, θ_Ω represents all parameters of the BERTA model, P(w | x̃_p; θ_Ω) is the predicted probability that [MASK] in x̃_p is the polarity answer w, and w is any polarity answer in Ω;
S5.3, fine-tuning the BERTA model using the following cross entropy loss function:

L(θ_Ω) = −Σ_i=1..|Ω| y_i · log P(ω_i | x̃_p; θ_Ω) (18)

wherein ω_i is the i-th polarity answer in Ω and y_i is the true probability label of [MASK] in x̃_p being ω_i;
Repeating steps S5.1 to S5.3 until all comment samples with prompt patterns in the expanded training set have been learned.
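The cross-entropy objective of step S5.3 can be illustrated in pure Python. This is only a sketch with hypothetical logits; an actual fine-tuning run would compute these quantities inside a deep learning framework and backpropagate through BERTA.

```python
import math

OMEGA = ["positive", "negative", "neutral"]  # polarity answer set

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - m) for v in logits]
    s = sum(exps)
    return [e / s for e in exps]

def cross_entropy(logits, true_answer):
    """Loss for one sample: -log P(true polarity answer | input),
    i.e. the cross entropy against a one-hot label over OMEGA."""
    probs = softmax(logits)
    return -math.log(probs[OMEGA.index(true_answer)])

# Hypothetical [MASK] logits over (positive, negative, neutral).
loss = cross_entropy([2.0, 0.5, 0.1], "positive")
print(round(loss, 4))  # prints 0.3169 - small loss, model favours 'positive'
```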
Further, the step S6 specifically includes:
From the optimized template pattern sequence P̂_ψ, randomly selecting a prompt pattern and pairing it with the original comment sentence to be tested to form a comment input with prompt, sending the comment input to the BERTA model fine-tuned in step S5, processing it by equations (14) to (17), and obtaining the emotion polarity of the original comment sentence to be tested on the specified aspect target through the following equation (19):

ẑ = argmax_z∈Ω P(z | x̃_p; θ_Ω) (19)

wherein z is any polarity answer in Ω, ẑ is the calculated emotion polarity, and the function argmax_z(·) finds the z that maximizes the function argument.
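The prediction procedure of step S6 — random selection of a prompt pattern followed by the argmax of equation (19) — can be sketched as follows, with hypothetical per-pattern logits standing in for BERTA's [MASK] predictions:

```python
import math
import random

OMEGA = ["positive", "negative", "neutral"]

def predict_polarity(logits_for_patterns, seed=None):
    """Pick one prompt pattern at random, then return the argmax polarity
    answer for [MASK] over OMEGA (equation (19)).

    logits_for_patterns: {pattern_id: [logit for each answer in OMEGA]}
    """
    rng = random.Random(seed)
    pattern = rng.choice(sorted(logits_for_patterns))
    logits = logits_for_patterns[pattern]
    probs = [math.exp(v) for v in logits]   # softmax numerators...
    total = sum(probs)
    probs = [p / total for p in probs]      # ...normalised to probabilities
    return OMEGA[probs.index(max(probs))]

# Hypothetical logits; both candidate patterns favour 'positive'.
logits = {"P1": [1.8, 0.2, 0.4], "P2": [2.1, 0.3, 0.2]}
print(predict_polarity(logits, seed=0))  # prints positive
```

Because the argmax is taken over the same answer set regardless of which pattern is drawn, the random choice only varies the prompt wording, not the label space.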
Application instance
1. Example Environment
The hyperparameters of the examples are shown in Table 3.
Table 3 Hyperparameters of the examples
2. Data set and optimized prompt template thereof
This example evaluates the model of the present invention on five benchmark datasets: four taken from three consecutive tasks of the International Workshop on Semantic Evaluation, namely 14Lap and 14Rest from SemEval-2014 Task 4, 15Rest from SemEval-2015 Task 12 and 16Rest from SemEval-2016 Task 5, plus the Tweet dataset, as shown in Table 4;
table 4 evaluation data set
Then, using the automatic prompt template generation method based on the BERT model proposed in step S2 of the present invention, the best emotion keyword and the top-k ranked prompt patterns are selected for each of the five aspect-level emotion analysis datasets, as shown in Table 5. The correspondence between the pattern numbers and the prompt patterns is shown in Table 1.
Table 5 prompt template to evaluate optimization of a dataset
3. Data enhancement
According to the data enhancement method provided in step S4, data enhancement is performed on the training sets of the five aspect-level emotion analysis datasets respectively, and the resulting comment samples with prompt patterns are shown in Table 6.
Table 6 data enhanced dataset used in the evaluation, the numbers in the table represent the number of comment samples with hint patterns
4. Contrast method
This example compares the model of the present invention to 6 aspect level emotion classification methods, including 3 non-BERT methods and 3 BERT-based methods, as follows:
(1) Non-BERT methods

MemNet [1] uses a multi-layer memory network combined with attention to capture the importance of each context word for aspect polarity classification.

IAN [2] extracts features of the aspect and its context with two separate LSTM networks, interactively generates their attention vectors, and finally concatenates the two attention vectors for aspect polarity classification.

TNet-LF [3] employs a CNN layer to extract salient features from word representations transformed by a bidirectional LSTM layer, and proposes a correlation-based component to generate target-specific representations of the words in a sentence; the model also employs a position decay technique.
(2) BERT-based methods

BERT-BASE [4] is the base version of BERT developed by Google AI Language, which uses a single-sentence input structure "[CLS] + comment sentence + [SEP]" for aspect polarity classification.

AEN-BERT [5] models the context and the aspect target with BERT-based multi-head attention.

BERT-SPC [5] employs the input structure of sentence pair classification (SPC): "[CLS] + comment sentence + [SEP] + target + [SEP]".
Reference is made to:
[1] Tang D, Qin B, Liu T (2016) Aspect Level Sentiment Classification with Deep Memory Network. In: Empirical Methods in Natural Language Processing, pp 214-224
[2] Ma D, Li S, Zhang X, Wang H (2017) Interactive attention networks for aspect-level sentiment classification. In: Proceedings of the 26th International Joint Conference on Artificial Intelligence, Melbourne, Australia, 19-25 August 2017, pp 4068-4074
[3] Li X, Bing L, Lam W, Shi B (2018) Transformation Networks for Target-Oriented Sentiment Classification. In: Proceedings of ACL, pp 946-956
[4] Devlin J, Chang MW, Lee K, Toutanova K (2019) BERT: Pre-training of Deep Bidirectional Transformers for Language Understanding. In: Proceedings of the 2019 Conference of NAACL, pp 4171-4186
[5] Song Y, Wang J, Jiang T, Liu Z, Rao Y (2019) Attentional encoder network for targeted sentiment classification. arXiv preprint arXiv:1902.09314
5. example comparison results
The data-enhanced training sets of the five aspect-level emotion analysis datasets provided by this example are used to fine-tune BERT respectively, and the fine-tuned BERT models are then tested on the test sets of the five datasets, yielding the comparison results shown in Table 7;
table 7 example comparison results
The results in Table 7 show that the method of the present invention implemented by this example is significantly superior to the various non-BERT and BERT-based aspect-level emotion classification methods in both accuracy and M-F1, which fully demonstrates that the comment data enhancement and aspect-level emotion analysis method based on multi-prompt learning provided by the present invention is feasible and effective.
Claims (1)
1. A comment data enhancement and aspect-level emotion analysis method based on multi-prompt learning is characterized by comprising the following steps:
s1, defining the structure of a prompt template for aspect emotion analysis to be composed of an input slot, an aspect slot, a polarity answer slot, an emotion keyword slot and a prompt mode;
s2, providing an automatic prompting template generation method based on the BERT model based on the structure of the prompting template defined in the step S1;
s3, generating an optimized prompting template for the appointed aspect-level emotion analysis data set psi by using the automatic prompting template generation method based on the BERT model provided in the step S2;
s4, carrying out data enhancement on a training set of the aspect-level emotion analysis data set psi by using the optimized prompt template generated in the step S3;
s5, performing fine-tuning of multi-prompt learning on the BERT model by using the data-enhanced training set of the aspect-level emotion analysis dataset ψ obtained in step S4, to obtain an aspect-level emotion analysis BERT model based on multi-prompt learning;
s6, performing emotion prediction on the aspect targets in the test set of the aspect emotion analysis data set ψ by using the BERT model finely tuned in the step S5;
the BERT model refers to the Bidirectional Encoder Representations from Transformers (BERT) neural network language model proposed by Google AI Language.
The step S1 specifically includes:
s1.1, defining the structure of a prompt template of the aspect emotion analysis into the following form:
T=f(X,A,Z,K;P) (1)
wherein T is a defined prompting template, X is an original comment sentence, A is an aspect target to be predicted in X, Z is a potential emotion polarity answer, K is an emotion keyword, P is a template mode, and f (X, A, Z, K; P) is a constructor for filling X, A, K and Z into P;
the emotion keyword is an emotion noun reflecting emotion characteristics;
the template mode is a sentence frame comprising an input slot [ X ], an aspect slot [ A ], a polarity answer slot [ Z ] and an emotion keyword slot [ K ];
the input groove [ X ] is used for filling X, the aspect groove [ A ] is used for filling A, the polarity answer groove [ Z ] is used for filling Z, and the emotion keyword groove [ K ] is used for filling K;
s1.2, classifying the types of the prompt templates of the aspect emotion analysis into two main types of modification prompt templates and prefix prompt templates;
The modification prompt template is a prompt template whose prompt appears, in a fixed language form, in the middle of the evaluation sentence, and is defined as follows:

P_m = [X1] + [A] + f_m(Z, K) + [X2] (2)

wherein P_m is the defined template pattern, [X1] and [X2] are two input sub-slots, X1 is the sentence component to the left of the aspect target A to be predicted in X, X2 is the sentence component to the right of the aspect target A to be predicted in X, and f_m(Z, K) is a constructor that uses Z and K to form the modification prompt;
The prefix prompt template is a prompt template whose prompt is generated as an independent sentence after the evaluation sentence, and is defined as follows:

P_p = [X]. + f_p(A, Z, K) (3)

wherein P_p is the defined template pattern and f_p(A, Z, K) is a constructor that uses A, Z and K to form the prefix prompt.
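The two template types can be illustrated with a small Python sketch. The fixed cue wordings used by `f_m` and `f_p` below are hypothetical examples for illustration, not the templates of the invention:

```python
def modification_template(x, a, z_slot="[Z]", k="emotion"):
    """P_m = [X1] + [A] + f_m(Z, K) + [X2]: the cue is spliced in right
    after the aspect target inside the sentence (equation (2)).
    The wording 'which is a [Z] emotion point' is a hypothetical f_m."""
    left, _, right = x.partition(a)          # [X1], A, [X2]
    cue = f", which is a {z_slot} {k} point,"
    return f"{left}{a}{cue}{right}"

def prefix_template(x, a, z_slot="[Z]", k="emotion"):
    """P_p = [X]. + f_p(A, Z, K): the cue is an independent sentence
    appended after the comment (equation (3)); the wording is hypothetical."""
    return f"{x}. The {k} of {a} is {z_slot}."

x = "the food was amazing but the service was slow"
print(prefix_template(x, "service"))
print(modification_template(x, "service"))
```

In training, the polarity answer slot `[Z]` would be filled with a polarity answer (or `[MASK]` during fine-tuning), and `k` with the optimal emotion keyword of the dataset.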
The step S2 specifically includes:
s2.1, screening out common emotion keywords from emotion analysis corpus to form an emotion keyword set D;
S2.2, using a simplified BERT-based sentence-pair prompt pattern p_0 = [CLS] + [X] + [SEP] + [A] + [Z] + [K] + [SEP], and using the pre-trained BERT model and the classification layer of its next-sentence prediction task, testing the emotion keyword set D on the training set of the specified aspect-level emotion analysis dataset ψ to generate the optimal emotion keyword k̂_ψ of ψ; the calculation process is as follows:

x_k = BertTokenizer(f(x, a, z_a, k; p_0)) (4)

H_k = BERT(x_k) (5)

o_k = W_b · H_k^[CLS] + b_b (6)

P(y | x_k; θ_b) = exp(o_k,y) / Σ_y'∈B exp(o_k,y') (7)

k̂_ψ = argmax_k (1/|E|) Σ_i=1..|E| P(y_yes | x_k^(i); θ_b) (8)

wherein [CLS] is the classifier symbol in the BERT model, [SEP] is the separator in the BERT model, k is any emotion keyword in D, x is an original comment sample with aspect target a in the training set of ψ, z_a is the true emotion polarity answer of the aspect target a, x_k ∈ R^(n×e) is the comment-and-prompt sentence pair formed by adding to x the prompt filled with the emotion keyword k, n is the number of words of x_k in BERT, e is the dimension of word encoding in the BERT model, BERT(·) represents the pre-trained BERT model, H_k ∈ R^(n×d) is the hidden state sequence of x_k after BERT processing, d is the dimension of the hidden states of the BERT model, H_k^[CLS] is the hidden state in H_k corresponding to the classifier [CLS], o_k ∈ R^|B| is the confidence vector of x_k filled with the emotion keyword k, B = {yes, no} is the set of logical values, |B| is the number of elements in B, W_b ∈ R^(|B|×d) is the representation matrix of the logical values in B, b_b ∈ R^|B| is the bias vector of the BERT classification layer, y is a logical value in B, o_k,y_yes is the confidence score of x_k when the logical value "yes" holds, o_k,y is the confidence score of x_k when y is taken as the established logical value, P(y_yes | x_k; θ_b) is the predicted probability that the logical value of x_k is "yes", θ_b represents all parameters of the BERT model, exp(·) represents the exponential function with base e, x_k^(i) is the i-th x_k in E, E is the training set of the specified aspect-level emotion analysis dataset ψ, |E| is the number of comment samples in E, y_yes is the logical label with logical value "yes", the function argmax_k(·) finds the k that maximizes the function argument, and BertTokenizer(·) is the tokenizer of the BERT model;
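The keyword selection of step S2.2 — scoring each candidate keyword by its "yes" probability under the next-sentence prediction head, averaged over the training set, then taking the argmax of equation (8) — can be sketched in pure Python with hypothetical precomputed logits:

```python
import math

def p_yes(o_yes, o_no):
    """Softmax over the logical-value logits for B = {yes, no}."""
    return math.exp(o_yes) / (math.exp(o_yes) + math.exp(o_no))

def best_keyword(logits_by_keyword):
    """Equation (8): argmax over keywords of the mean P(yes | x_k) on E.

    logits_by_keyword: {keyword: [(o_yes, o_no) per training sample]}
    """
    def avg(samples):
        return sum(p_yes(*o) for o in samples) / len(samples)
    return max(logits_by_keyword, key=lambda k: avg(logits_by_keyword[k]))

# Toy next-sentence logits for two candidate emotion keywords
# over a three-sample training set (values are invented).
logits = {
    "sentiment": [(1.2, 0.4), (0.8, 0.9), (1.5, 0.2)],
    "feeling":   [(0.3, 1.1), (0.2, 1.4), (0.6, 0.8)],
}
print(best_keyword(logits))  # prints sentiment
```

In the method itself these logit pairs would come from the pre-trained BERT next-sentence prediction classifier applied to each comment-and-prompt sentence pair x_k.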
S2.3, respectively designing three prompt patterns for the prefix prompt template and the modification prompt template according to the positional relation between the aspect target and the other words, forming the discrete space M of template patterns, as shown in Table 1:
Table 1 Discrete space M of template patterns
S2.4, using the optimal emotion keyword k̂_ψ of the specified aspect-level emotion analysis dataset ψ generated in step S2.2, and using the pre-trained BERT model and the classification layer of its next-sentence prediction task, testing the discrete space M of template patterns on the training set of ψ to generate an optimized template pattern sequence P̂_ψ of ψ; the calculation process is as follows:

x_p = BertTokenizer(f(x', a', z_a', k̂_ψ; p)) (9)

H_p = BERT(x_p) (10)

o_p = W_b · H_p^[CLS] + b_b (11)

P(y | x_p; θ_b) = exp(o_p,y) / Σ_y'∈B exp(o_p,y') (12)

P̂_ψ = rank_p (1/|E|) Σ_i=1..|E| P(y_yes | x_p^(i); θ_b) (13)

wherein p is any template pattern in M, x' is an original comment sample with aspect target a' in the training set of ψ, z_a' is the true emotion polarity answer of the aspect target a', x_p ∈ R^u is the comment input with prompt formed by adding the prompt of template pattern p to x', u is the number of words of x_p in BERT, H_p ∈ R^(u×d) is the hidden state sequence of x_p after BERT processing, H_p^[CLS] is the hidden state in H_p corresponding to the classifier [CLS], o_p ∈ R^|B| is the confidence vector of x_p using template pattern p, o_p,y_yes is the confidence score of x_p when the logical value "yes" holds, o_p,y is the confidence score of x_p when y is taken as the established logical value, P(y_yes | x_p; θ_b) is the predicted probability that the logical value of x_p is "yes", x_p^(i) is the i-th x_p in E, and the function rank_p(·) ranks the template patterns p in descending order of the function argument.
The step S3 specifically includes:
Using the automatic prompt template generation method based on the BERT model proposed in step S2 to generate, for the designated aspect-level emotion analysis dataset ψ, an optimized prompt template T̂_ψ, consisting of the optimal emotion keyword k̂_ψ of ψ, calculated by equation (8) in step S2.2, and the optimized template pattern sequence P̂_ψ of ψ, which refers to the template pattern sequence in M ranked by equation (13).
In the step S4, the training set of the aspect emotion analysis data set ψ is subjected to data enhancement, and the following principles are followed:
(1) The data enhancement of the training set refers to the expansion of comment samples in the training set and the pairing of prompt modes;
(2) To avoid overfitting, a moderation principle is followed when augmenting data with multi-prompt learning, i.e. only the polarity training subsets with few samples are expanded, and at least one polarity training subset is kept unchanged;
(3) When expanding a training subset, each original comment sentence is paired, as required, with several top-ranked prompt patterns from the optimized sequence P̂_ψ generated in step S3, forming several comment samples with different prompt patterns, while for the unexpanded subsets only the first-ranked prompt pattern is paired with each original comment sentence to form the corresponding comment sample with a prompt pattern;
(4) Only the training samples of the aspect-level emotion analysis dataset ψ are expanded while the number of test samples is kept unchanged.
The step S5 specifically includes:
S5.1, taking a comment sample with prompt pattern (x̃, p̃) from the training set of the aspect-level emotion analysis dataset ψ expanded in step S4, and feeding it into the BERT model BERTA to be fine-tuned for aspect-level emotion analysis, obtaining the BERT-based input sequence x̃_p and the hidden state sequence H̃_p of x̃_p in BERTA; the calculation process is as follows:

x̃_p = BertTokenizer(f(x̃, ã, [MASK], k̂_ψ; p̃)) (14)

H̃_p = BERTA(x̃_p) (15)

wherein x̃ is the original comment sentence in the taken comment sample with prompt pattern, p̃ is the prompt pattern paired with x̃, ã is the aspect target of x̃ to be evaluated, k̂_ψ is the optimal emotion keyword of the dataset ψ to which x̃ belongs, [MASK] is the mask symbol in the BERT model, filled into the polarity answer slot, and BERTA(·) represents the BERT model to be fine-tuned for aspect-level emotion analysis;
S5.2, feeding the hidden state H̃_p^[CLS] corresponding to the classifier [CLS] in H̃_p into the classification layer of BERTA to obtain the confidence vector õ_p of x̃_p on the polarity answer set Ω = {positive, negative, neutral}, where |Ω| is the number of polarity answers in Ω, and predicting the probability that [MASK] is the specified polarity answer w; the calculation process is as follows:

õ_p = W_Ω · H̃_p^[CLS] + b_Ω (16)

P(w | x̃_p; θ_Ω) = exp(õ_p,w) / Σ_w'∈Ω exp(õ_p,w') (17)

wherein W_Ω ∈ R^(|Ω|×d) is the representation matrix of the polarity answers in Ω, b_Ω ∈ R^|Ω| is the bias vector of the BERTA classification layer, θ_Ω represents all parameters of the BERTA model, P(w | x̃_p; θ_Ω) is the predicted probability that [MASK] in x̃_p is the polarity answer w, and w is any polarity answer in Ω;
S5.3, fine-tuning the BERTA model using the following cross entropy loss function:

L(θ_Ω) = −Σ_i=1..|Ω| y_i · log P(ω_i | x̃_p; θ_Ω) (18)

wherein ω_i is the i-th polarity answer in Ω and y_i is the true probability label of [MASK] in x̃_p being ω_i;
Repeating steps S5.1 to S5.3 until all comment samples with prompt patterns in the expanded training set have been learned.
The step S6 specifically includes:
From the optimized template pattern sequence P̂_ψ, randomly selecting a prompt pattern and pairing it with the original comment sentence to be tested to form a comment input with prompt, sending the comment input to the BERTA model fine-tuned in step S5, processing it by equations (14) to (17), and obtaining the emotion polarity of the original comment sentence to be tested on the specified aspect target through the following equation (19):

ẑ = argmax_z∈Ω P(z | x̃_p; θ_Ω) (19)

wherein z is any polarity answer in Ω and ẑ is the calculated emotion polarity.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310340273.XA CN116361420A (en) | 2023-03-31 | 2023-03-31 | Comment data enhancement and aspect-level emotion analysis method based on multi-prompt learning |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202310340273.XA CN116361420A (en) | 2023-03-31 | 2023-03-31 | Comment data enhancement and aspect-level emotion analysis method based on multi-prompt learning |
Publications (1)
Publication Number | Publication Date |
---|---|
CN116361420A true CN116361420A (en) | 2023-06-30 |
Family
ID=86920611
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202310340273.XA Pending CN116361420A (en) | 2023-03-31 | 2023-03-31 | Comment data enhancement and aspect-level emotion analysis method based on multi-prompt learning |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN116361420A (en) |
Cited By (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117473083A (en) * | 2023-09-30 | 2024-01-30 | 齐齐哈尔大学 | Aspect-level emotion classification model based on prompt knowledge and hybrid neural network |
CN117497140A (en) * | 2023-10-09 | 2024-02-02 | 合肥工业大学 | Multi-level depression state detection method based on fine granularity prompt learning |
CN117763128A (en) * | 2024-01-18 | 2024-03-26 | 杭州阿里云飞天信息技术有限公司 | Man-machine interaction data processing method, server, storage medium and program product |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117473083A (en) * | 2023-09-30 | 2024-01-30 | 齐齐哈尔大学 | Aspect-level emotion classification model based on prompt knowledge and hybrid neural network |
CN117473083B (en) * | 2023-09-30 | 2024-05-28 | 齐齐哈尔大学 | Aspect-level emotion classification model based on prompt knowledge and hybrid neural network |
CN117497140A (en) * | 2023-10-09 | 2024-02-02 | 合肥工业大学 | Multi-level depression state detection method based on fine granularity prompt learning |
CN117497140B (en) * | 2023-10-09 | 2024-05-31 | 合肥工业大学 | Multi-level depression state detection method based on fine granularity prompt learning |
CN117763128A (en) * | 2024-01-18 | 2024-03-26 | 杭州阿里云飞天信息技术有限公司 | Man-machine interaction data processing method, server, storage medium and program product |
CN117763128B (en) * | 2024-01-18 | 2024-06-04 | 杭州阿里云飞天信息技术有限公司 | Man-machine interaction data processing method, server, storage medium and program product |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Dong et al. | A survey of natural language generation | |
Tu et al. | Multi-hop reading comprehension across multiple documents by reasoning over heterogeneous graphs | |
Gong et al. | Efficient training of bert by progressively stacking | |
Koncel-Kedziorski et al. | Text generation from knowledge graphs with graph transformers | |
Du et al. | Convolution-based neural attention with applications to sentiment classification | |
Bajaj et al. | G3raphground: Graph-based language grounding | |
US20230274420A1 (en) | Method and system for automated generation of text captions from medical images | |
CN116361420A (en) | Comment data enhancement and aspect-level emotion analysis method based on multi-prompt learning | |
Yang et al. | Neural attentive network for cross-domain aspect-level sentiment classification | |
Ye et al. | Few-shot learning with a strong teacher | |
Chen et al. | Commonsense knowledge aware concept selection for diverse and informative visual storytelling | |
Nagaraj et al. | Kannada to English Machine Translation Using Deep Neural Network. | |
Zhou et al. | Multi-label image classification via category prototype compositional learning | |
Farazi et al. | Accuracy vs. complexity: a trade-off in visual question answering models | |
Wang et al. | Representation learning from limited educational data with crowdsourced labels | |
Wu et al. | Visual Question Answering | |
Yusuf et al. | Evaluation of graph convolutional networks performance for visual question answering on reasoning datasets | |
Xu et al. | CNN-based skip-gram method for improving classification accuracy of chinese text | |
Song et al. | Lsun: Construction of a large-scale image dataset using deep learning with humans in the loop | |
Llopis et al. | Matching user queries in natural language with Cyber-Physical Systems using deep learning through a Transformer approach | |
Du et al. | Hierarchical multi-layer transfer learning model for biomedical question answering | |
Sahoo et al. | Transformer based multimodal similarity search method for E-Commerce platforms | |
Wang et al. | PAIC: Parallelised attentive image captioning | |
Sri Neha et al. | A Comparative Analysis on Image Caption Generator Using Deep Learning Architecture—ResNet and VGG16 | |
Hao et al. | Layered feature representation for differentiable architecture search |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||