CN113723075B - Specific-target sentiment analysis method fusing word-masking data augmentation and adversarial learning

Specific-target sentiment analysis method fusing word-masking data augmentation and adversarial learning

Info

Publication number
CN113723075B
CN113723075B (application CN202110999219.7A)
Authority
CN
China
Prior art keywords
word
sample
representing
adv
clean
Prior art date
Legal status
Active
Application number
CN202110999219.7A
Other languages
Chinese (zh)
Other versions
CN113723075A (en)
Inventor
刘小洋
代尚宏
高绿苑
Current Assignee
Chongqing University of Technology
Original Assignee
Chongqing University of Technology
Priority date
Filing date
Publication date
Application filed by Chongqing University of Technology filed Critical Chongqing University of Technology
Priority to CN202110999219.7A
Publication of CN113723075A
Application granted
Publication of CN113723075B

Classifications

    • G06F40/247: Handling natural language data; natural language analysis; lexical tools; thesauruses, synonyms
    • G06F16/3344: Information retrieval; querying; query execution using natural language analysis
    • G06F18/214: Pattern recognition; analysing; generating training patterns; bootstrap methods, e.g. bagging or boosting
    • G06F40/211: Natural language analysis; parsing; syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G06F40/284: Natural language analysis; recognition of textual entities; lexical analysis, e.g. tokenisation or collocates
    • G06N3/02: Computing arrangements based on biological models; neural networks
    • Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management


Abstract

The invention provides a specific-target sentiment analysis method that fuses word-masking data augmentation with adversarial learning, comprising the following steps: S1, performing synonym replacement and random word insertion on sentences while masking the target entity, generating valid augmented samples and fusing them with the original samples, thereby achieving word-masking data augmentation; S2, constructing a BERT-BASE-based adversarial-learning sentiment classification model for specific targets, and training the sentiment classification model jointly on clean samples and adversarial samples, so that the model acquires an adversarial-defense capability; and S3, performing adversarial learning on the original samples and the augmented samples respectively. By adopting data augmentation and adversarial training, the invention has stronger robustness and obtains better results.

Description

Specific-target sentiment analysis method fusing word-masking data augmentation and adversarial learning
Technical Field
The invention relates to the field of natural language processing, and in particular to a specific-target sentiment analysis method that fuses word-masking data augmentation with adversarial learning.
Background
With the rapid development of social media (Microblog, Twitter, Facebook, etc.), sentiment analysis has become an extremely important task. Aspect-Based Sentiment Analysis (ABSA), also called specific-target sentiment analysis, is a basic task in the field of text classification. It aims to analyze the fine-grained sentiment tendency of online social network text data using deep learning and Natural Language Processing (NLP) techniques, so that a user can clearly see the sentiment polarity and attitude expressed toward each specific entity (aspect) in social network comment data. One sentence may contain one or more entities, each with its own sentiment polarity. For example, in the comment "Great food but the service was dreadful!", the sentiment polarity of the entity "food" is positive while that of the entity "service" is negative. Compared with sentence-level sentiment analysis, ABSA presents more precise, fine-grained, entity-level sentiment information to the user.
Machine learning and deep learning have achieved remarkable success on the sentiment analysis task. For example, Kiritchenko et al. adopted a machine learning approach to build a hand-crafted feature extraction model and trained a sentiment classifier on the extracted features with a Support Vector Machine (SVM), but manual feature engineering is cumbersome and inefficient. To avoid the complexity of manual feature extraction, deep learning methods are adopted to automatically extract more complex deep features. For example, Li et al. used an adaptive recursive neural network to transform the dependency tree for different targets in a sentence, obtaining multiple feature-combination functions to train the model with a neural network. Because sentences are sequential, many models use Long Short-Term Memory networks (LSTMs) to capture long-term dependency information; Tang et al. used two LSTMs to concatenate the context feature vectors around the specific target entity to build a sentiment classification model, but sentiment information carried by distant words can be lost. To capture long-range feature information in text, Bahdanau et al. first applied the attention mechanism to natural language processing, after which many researchers introduced attention into the sentiment analysis task. Wang et al. used LSTM with attention weighting to obtain sentence representation vectors; Tang et al. added the relative distance between feature words and the target entity to the attention mechanism and used multiple attention hops to obtain the final entity representation. Chen et al. constructed a RAM (Recurrent Attention on Memory) structure to capture contextual semantic information and focus attention, fusing the important features of long, difficult sentences through a multi-attention mechanism.
Because the parallelism of Recurrent Neural Networks (RNNs) is limited, Vaswani et al. designed the Transformer structure, which completely abandons the RNN design, improves parallelism, adopts self-attention and multi-head attention mechanisms, and adds positional-embedding information to help the model understand word order, so that long-distance dependency information can be captured better. Devlin et al. designed Bidirectional Encoder Representations from Transformers (BERT) using the encoder part of the Transformer structure; BERT shows superior results on text classification tasks and improves significantly over other models on the ABSA task.
However, deep learning models are vulnerable to attacks by adversarial examples; Fig. 1 shows the recognition behavior of a model on the ABSA task after an adversarial-example attack. Recent research shows that robust neural network models can be built by training on adversarial samples, thereby improving model robustness. The adversarial-learning process generates adversarial samples from the input samples by adding small gradient-based perturbations and feeds these adversarial samples back into the model for further learning. In the field of natural language processing, traditional textual adversarial samples are obtained by perturbing words or sentences; for example, Eger et al. replace characters with visually similar neighboring characters, and Jin et al. replace original words using a greedy word-substitution method. Adversarial defenses against such attacks also exist; for example, Goodfellow et al. proposed Fast Gradient Method (FGM) adversarial training, which outperforms other baseline methods on text classification tasks.
In the above studies, data augmentation and adversarial training have not been applied to the ABSA task; all of them improve the effect purely at the model level. On the currently published ABSA datasets it is difficult to achieve sufficient generalization ability, model robustness and model efficiency at the same time.
Disclosure of Invention
The invention aims to solve at least the above technical problems in the prior art, and in particular creatively provides a specific-target sentiment analysis method that fuses word-masking data augmentation with adversarial learning.
To achieve the above object, the invention provides a specific-target sentiment analysis method fusing word-masking data augmentation and adversarial learning, comprising the following steps:
S1, performing synonym replacement and random word insertion on sentences while masking the target entity, generating valid augmented samples and fusing them with the original samples, thereby achieving word-masking data augmentation;
S2, constructing a BERT-BASE-based adversarial-learning sentiment classification model for specific targets, taking the data fused in S1 as the model input, and training the sentiment classification model jointly on clean samples and adversarial samples;
and S3, finally obtaining the specific-target sentiment analysis result, so that the model acquires an adversarial-defense capability.
Further, the method also comprises the following step:
S4, performing adversarial learning on the original samples and on the augmented samples respectively, and evaluating the results with evaluation metrics; the evaluation metrics include accuracy and/or the F1 value.
Further, the synonym replacement in step S1 is computed as:

S_{Sr} = F_{SR}(S_{In}) = \mathrm{Rep}\left(w^{In}_{id_{Sr}},\ \mathrm{Ran}(\mathrm{Wordnet},\ num_1)\right), \qquad w^{In}_{i} \neq aspect

wherein S_Sr represents the data after synonym replacement;
F_SR(·) represents the synonym-replacement data-augmentation function;
S_In represents the input of the original corpus;
w^In_i is the i-th word of an original sample;
aspect represents the specific target entity;
Rep(·) represents the word-replacement function;
w^In_{id_Sr} is the id_Sr-th word that needs to be replaced;
id_Sr represents the position of the word replacement;
Ran(Wordnet, num_1) denotes randomly selecting num_1 synonyms of the i-th word w^In_i from the WordNet library;
!= means "not equal", i.e. the word to be replaced must not be the target entity.
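A minimal sketch of the two augmentation operations used in step S1 (the synonym replacement just defined and the random insertion defined next), assuming NLTK's WordNet interface as the synonym source; all function and variable names below are illustrative and not taken from the patent.

```python
# Illustrative sketch of the word-masking augmentation step (F_SR and F_RI), assuming
# NLTK's WordNet as the thesaurus; names are ours, not the patent's.
import random
from nltk.corpus import wordnet

def wordnet_synonyms(word, num):
    """Collect up to `num` WordNet synonyms of `word` (excluding the word itself)."""
    syns = {l.name().replace("_", " ") for s in wordnet.synsets(word) for l in s.lemmas()}
    syns.discard(word)
    return random.sample(sorted(syns), min(num, len(syns)))

def synonym_replace(tokens, aspect_tokens, num_1=1):
    """F_SR: replace one non-aspect word with a WordNet synonym (the aspect term is masked)."""
    candidates = [i for i, w in enumerate(tokens)
                  if w not in aspect_tokens and wordnet_synonyms(w, 1)]
    if not candidates:
        return tokens
    idx = random.choice(candidates)                       # id_Sr
    new = tokens[:]
    new[idx] = wordnet_synonyms(tokens[idx], num_1)[0]    # Rep(w_idSr, Ran(Wordnet, num_1))
    return new

def random_insert(tokens, aspect_tokens, num_2=1):
    """F_RI: insert random WordNet words after positions outside the aspect term.
    (The patent restricts insertions to adverbs; any lemma is drawn here for brevity.)"""
    new = tokens[:]
    for _ in range(num_2):
        word = random.choice(list(wordnet.all_lemma_names())).replace("_", " ")
        idx = random.choice([i for i, w in enumerate(new) if w not in aspect_tokens])  # id_RI
        new.insert(idx + 1, word)                          # Insert(w_idRI, Ran(Wordnet, num_2))
    return new
```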
Further, the random word insertion in step S1 is computed as:

S_{Ri} = F_{RI}(S_{In}) = \mathrm{Insert}\left(w^{In}_{id_{RI}},\ \mathrm{Ran}(\mathrm{Wordnet},\ num_2)\right)

wherein S_Ri represents the data after random insertion;
F_RI(·) represents the random-insertion data-augmentation function;
Insert(·) denotes inserting a word after the id_RI-th word;
w^In_{id_RI} is the id_RI-th word after which the insertion is made;
id_RI represents the position in the sentence after which the word is inserted;
Ran(Wordnet, num_2) denotes randomly selecting num_2 words from the WordNet library.
Further, S2 comprises:
the augmented data Da_Out are used as the clean samples, where Da_Out = S_Sr ∪ S_Ri. For each batch of clean samples, the clean samples are first used to generate the adversarial perturbation r_adv of the word-embedding layer, thereby producing the adversarial samples; each batch of adversarial samples is then trained with the Adv-BERT model, while each batch of clean samples is trained with BERT.
Further, the per-batch loss function on clean samples is calculated as follows:

L_{clean}(\theta) = -\frac{1}{N_{batch}} \sum_{i=1}^{N_{batch}} \log p\left(y_i \mid E_i,\ aspect_i;\ \theta\right)

wherein L_clean(·) represents the loss function on clean samples, N_batch denotes the batch size, θ denotes the neural network parameters, and p(y_i | E_i, aspect_i; θ) represents the sentiment prediction probability of the i-th sample in a batch;
the per-batch loss function on adversarial samples is calculated as follows:

L_{adv}(\theta) = -\frac{1}{N_{batch}} \sum_{i=1}^{N_{batch}} \log p\left(y_i \mid E_{adv(i)},\ aspect_i;\ \theta\right)

wherein L_adv(·) represents the loss function on adversarial samples, N_batch denotes the batch size, θ denotes the neural network parameters, and p(y_i | E_adv(i), aspect_i; θ) represents the sentiment prediction probability of the i-th adversarial sample.
Further, the method also comprises:
minimizing the sum of the per-batch loss functions on clean and adversarial samples:

\hat{\theta} = \arg\min_{\theta}\ L(\theta) = \arg\min_{\theta}\left[ L_{clean}(\theta) + L_{adv}(\theta) \right]

wherein L(·) represents the model loss function, \hat{\theta} represents the value of the model parameters θ that minimizes the loss function, L_clean(θ) represents the per-batch loss on clean samples, and L_adv(θ) represents the per-batch loss on adversarial samples.
Further, the hidden layers of the BERT model use the Gaussian error linear unit as the activation function:

\mathrm{gelu}(\theta) = 0.5\,\theta\left(1 + \tanh\!\left(\sqrt{2/\pi}\,\big(\theta + 0.044715\,\theta^{3}\big)\right)\right)

wherein gelu(·) denotes the Gaussian error linear unit, θ denotes its input (the pre-activation value of the hidden layer), and tanh is the hyperbolic tangent function.
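The tanh approximation above can be written out directly; the sketch below mirrors the standard formulation and is not code from the patent.

```python
# Illustrative implementation of the tanh approximation of GELU used by BERT's hidden layers.
import math
import torch

def gelu(x: torch.Tensor) -> torch.Tensor:
    """0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))"""
    return 0.5 * x * (1.0 + torch.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x.pow(3))))
```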
Further, the adversarial learning includes:
applying adversarial learning to the ABSA task by adding adversarial perturbations in the embedding layer of the model. The probability that the sentiment of the target entity aspect in a sentence is y is p(y | S_BertIn, aspect), so the loss function after adding the adversarial perturbation to the embedding layer of the model is:

-\log p\left(y \mid E_w + r_{adv},\ aspect;\ \theta\right) \qquad (1)

wherein

r_{adv} = \arg\min_{r,\ \|r\| \le \alpha}\ \log p\left(y \mid E_w + r,\ aspect;\ \hat{\theta}\right) \qquad (2)

p(y | E_w + r_adv, aspect; θ) represents the sentiment prediction probability after adding the adversarial perturbation r_adv; r_adv denotes the adversarial perturbation; r denotes a candidate perturbation of the input; α denotes the perturbation scaling factor; ||·|| denotes a norm; argmin selects the variable r that minimizes the objective function, and this value of r is then assigned to r_adv; p(y | E_w + r, aspect; \hat{\theta}) represents the prediction probability after adding the perturbation r.
Further, the adversarial learning also includes:
finding the adversarial perturbation with the fast gradient method, computing it through back-propagation in the neural network, and adding it to the word vectors of the original embedding layer to obtain the adversarial sample. The adversarial perturbation r_adv is calculated as:

r_{adv} = \alpha \cdot \frac{g_w}{\|g_w\|_2} \qquad (3)

wherein

g_w = \nabla_{E_w}\left(-\log p\left(y \mid E_w,\ aspect;\ \hat{\theta}\right)\right), \qquad \|g_w\|_2 = \sqrt{\lambda_{\max}\!\left(g_w^{H} g_w\right)}

The adversarial-sample loss function for ABSA is therefore:

L_{adv}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \log p\left(y_i \mid E_{adv(i)},\ aspect_i;\ \theta\right) \qquad (4)

wherein the adversarial sample E_adv is represented as:

E_{adv} = \left(e_1 \oplus r_1,\ e_2 \oplus r_2,\ \ldots,\ e_{i+1} \oplus r_{i+1},\ \ldots,\ e_n \oplus r_n,\ e_{n+1} \oplus r_{n+1}\right) \oplus E_{seg} \oplus E_{pos} \qquad (5)

where α represents the adversarial-perturbation scaling factor; g_w represents the gradient of the word-embedding layer in the model; ||·||_2 represents the two-norm; ∇ represents the gradient operator; E_w represents the word-embedding tensor of the clean sample; p(y | E_w, aspect; \hat{\theta}) represents the sentiment prediction probability of the clean sample; aspect denotes the specific target entity; \hat{\theta} represents the constant set of current parameters of the neural network classifier; λ_i(·) denotes taking an eigenvalue of a matrix (λ_max its largest eigenvalue); g_w^H denotes the conjugate transpose of g_w; N denotes the total number of samples; p(y_i | E_adv(i), aspect_i; θ) represents the prediction probability of the i-th adversarial sample; y_i is the true label of the i-th sample; E_adv(i) is the embedding-layer tensor of the i-th adversarial sample; aspect_i is the i-th specific target entity; and θ denotes the neural network parameters;
r_adv denotes the adversarial perturbation; E_seg denotes the segment-embedding tensor of the clean sample; E_pos denotes the position-embedding tensor of the clean sample; ⊕ denotes tensor addition; e_1 denotes the word embedding of the 1st word of a sample and r_1 the adversarial perturbation added to it, and likewise e_2/r_2, …, e_{i+1}/r_{i+1}, …, e_n/r_n and e_{n+1}/r_{n+1} for the 2nd, (i+1)-th, n-th and (n+1)-th words.
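A minimal PyTorch sketch of the FGM-style perturbation in equations (3) to (5): the gradient of the loss with respect to the word embeddings is normalized, scaled by α, and added to the word embeddings only, while segment and position embeddings stay unchanged. The `classifier` interface and variable names are assumptions made for illustration.

```python
# Illustrative FGM-style sketch of eqs. (3)-(5); the classifier interface is assumed.
import torch
import torch.nn.functional as F

def fgm_word_embedding_perturbation(classifier, word_emb, seg_emb, pos_emb, labels, alpha=0.01):
    """Return adversarially perturbed word embeddings E_w + r_adv.

    word_emb is assumed to come from the embedding layer (part of the autograd graph)."""
    logits = classifier(word_emb + seg_emb + pos_emb)   # clean forward pass
    loss = F.cross_entropy(logits, labels)              # -log p(y | E_w, aspect; theta_hat)
    g_w = torch.autograd.grad(loss, word_emb)[0]        # gradient w.r.t. the word embeddings
    r_adv = alpha * g_w / (g_w.norm(p=2) + 1e-12)       # eq. (3): alpha * g_w / ||g_w||_2
    return word_emb + r_adv                             # segment/position embeddings untouched
```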
In summary, owing to the adopted technical scheme, the invention has the following advantages:
(1) BERT-BASE is selected as the reference model and experiments are carried out on the Laptop and Restaurant datasets. A text word-masking data-augmentation method is proposed on top of the existing specific-target sentiment analysis corpora (synonym replacement that preserves semantics and syntactic structure, random insertion of sentiment words, etc.), constructing an effective new corpus for specific-target sentiment analysis;
(2) A data-augmented adversarial training method is proposed: the generated new corpus (the original corpus fused with the augmented corpus) is fed into the model, and small adversarial perturbations are then applied to the word-embedding layer of the model, yielding a highly robust specific-target sentiment analysis model;
(3) With the proposed specific-target sentiment classification models fusing word-masking data augmentation and adversarial learning, it is verified that the augmented samples alone can effectively improve model performance, that adversarial training on the original data alone can also improve performance, and that fusing the two kinds of data for adversarial learning finally yields the best results.
Additional aspects and advantages of the invention will be set forth in part in the description which follows and, in part, will be obvious from the description, or may be learned by practice of the invention.
Drawings
The above and/or additional aspects and advantages of the present invention will become apparent and readily appreciated from the following description of the embodiments, taken in conjunction with the accompanying drawings of which:
FIG. 1 is a diagram of sentiment recognition by a deep learning model under adversarial-example attack in the prior art.
FIG. 2 is a schematic diagram of the network structure of the WMDE-AL model of the present invention.
FIG. 3 shows the adversarial-training accuracy of the present invention for different perturbation sizes α.
FIG. 4 shows the growth of the adversarial-training evaluation metrics of the present invention for different perturbation sizes α.
Detailed Description
Reference will now be made in detail to embodiments of the present invention, examples of which are illustrated in the accompanying drawings, wherein like reference numerals refer to the same or similar elements or elements having the same or similar functions throughout. The embodiments described below with reference to the accompanying drawings are illustrative only for the purpose of explaining the present invention, and are not to be construed as limiting the present invention.
1. Related work
ABSA, also called fine-grained sentiment analysis, has long been a research hotspot both in China and abroad. Its main task is to determine the sentiment class of each specific target entity in a sentence. Although current research obtains well-performing results with deep learning methods, problems remain: limited training data leads to weak generalization ability, and model accuracy and robustness are difficult to achieve at the same time.
Traditional deep learning methods use complex neural network structures to extract features; for text classification, Recurrent Neural Networks (RNNs), Long Short-Term Memory networks (LSTMs) and the like are used to obtain contextual feature information, but these networks do not take the specific target entity into account. To add the specific target entity to the feature encoding, Tang et al. proposed TD-LSTM, which encodes the preceding and following contexts of the target entity with two separate LSTMs. Wang et al. proposed using an attention mechanism after LSTM encoding to obtain the important information related to the target entity. Chen et al. proposed RAM, which uses a multi-attention mechanism to capture target-entity feature information in long, difficult sentences and alleviates the problem of attention being dispersed by the context. However, these models do not consider syntactic constraints or long-distance word dependencies, which leads to misjudging the sentiment of the target entity. Zhang et al. proposed AS-GCN, which builds a graph convolutional network over the sentence dependency tree to obtain syntactic information and word dependencies. Karimi et al. proposed a BERT adversarial-training architecture that fine-tunes the BERT model with adversarial training and improves the generalization ability of the neural network.
Adversarial learning refers to training a neural network with adversarial samples to achieve adversarial defense. Goodfellow et al. proposed FGM, borrowing the adversarial-training idea from image classification to perturb the embedding layer of an LSTM model and train on the resulting adversarial samples, but the adversarial gradient perturbation is not constrained. Sato et al. proposed the iAdvT-Text adversarial-learning method, which generates adversarial samples at the word-embedding layer, constrains the direction of the gradient change according to distances in the word-embedding space, and finally improves the generalization ability of the model through adversarial learning. Li et al. proposed a fine-grained virtual adversarial training method that introduces character-level adversarial perturbations to improve the initialization of adversarial training and solves the perturbation-constraint problem by forcing character-level normalization of the perturbation magnitude. To overcome the inconsistent semantics and disfluency of generated adversarial samples, Li et al. proposed the BERT-ATTACK adversarial-sample generator, which first finds the words of the input sequence that are easiest to attack, then lets BERT generate substitutes for those vulnerable words, exploiting BERT's ability to capture contextual semantics to produce fluent and plausible adversarial samples. The above studies can improve model robustness, but research on improving model accuracy with adversarial training is lacking. Xie et al. proposed the AdvProp adversarial training method, training with adversarial samples and clean samples together on an image classification task to solve the problem of mismatched feature distributions, verifying that image adversarial samples can improve classification accuracy.
The idea of text data augmentation comes from the image domain, but unlike image data augmentation, it is mainly applied to prevent the network from overfitting when the dataset is small. Commonly used augmentation techniques in NLP tasks include translation, synonym replacement, sentence abbreviation and so on, and recent research shows that text data augmentation can improve performance on NLP tasks. For example, Zhu et al. proposed automatically generating related unanswerable questions from answerable questions, the original text and the answers, using this as a data-augmentation method to further improve the performance of reading-comprehension systems. Besides reading comprehension, data augmentation is also needed in text classification. Wei et al. proposed four data-augmentation techniques (synonym replacement, random insertion, random swap and random deletion), which modify only the original text and not the data labels; if the modified semantics have changed, the data are invalid.
2. Proposed method
The definitions of the symbols used in the formulas and model of this patent are shown in Table 1. The framework of the proposed Word-Masking Data Enhancement and Adversarial Learning model (WMDE-AL) is shown in Fig. 2. WMDE-AL draws on a simple text data-augmentation method and on textual adversarial training, and augments the specific-target corpus by improving the simple data-augmentation method.
Table 1. Definitions of all symbols used in the model (presented as an image in the original document; not reproduced here).
Fig. 2 includes two modules: Word-Masking Data Enhancement (WMDE) and Adversarial Learning (AL). (1) The WMDE module performs data augmentation on the samples S_In of the original corpus through synonym replacement (constraint: keep the sentence fluent and the semantics unchanged) and random insertion (constraint: keep the sentence structure unchanged) while masking the aspect, and then merges the generated data with the original data to obtain the BERT input S_BertIn. (2) The AL module combines the BERT model and the Adv-BERT model to learn the features of clean samples and adversarial samples simultaneously, solving the problem of mismatched sample feature distributions.
3.1 ABSA adversarial learning
Adversarial learning is a method for improving model robustness in classification problems; its goal is to add adversarial perturbations to the raw data and optimize the parameters θ so as to minimize the worst-case classification error. Adversarial learning is applied to the ABSA task by adding adversarial perturbations in the embedding layer of the model. Assuming that the probability that the sentiment of the target entity aspect in a sentence is y is p(y | S_BertIn, aspect), the loss function after adding the adversarial perturbation to the embedding layer of the model is:

-\log p\left(y \mid E_w + r_{adv},\ aspect;\ \theta\right) \qquad (1)

wherein

r_{adv} = \arg\min_{r,\ \|r\| \le \alpha}\ \log p\left(y \mid E_w + r,\ aspect;\ \hat{\theta}\right) \qquad (2)

where p(y | E_w + r_adv, aspect; θ) represents the sentiment prediction probability after adding the adversarial perturbation r_adv, E_w denotes the word-embedding tensor of the clean sample, r_adv denotes the adversarial perturbation, aspect denotes the specific target entity, θ denotes the neural network parameters, r denotes a candidate perturbation of the input, α denotes the perturbation scaling factor, ||·|| denotes a norm, argmin selects the variable r that minimizes the objective function (this value of r is then assigned to r_adv), and p(y | E_w + r, aspect; \hat{\theta}) represents the prediction probability after adding the perturbation r. The meaning of equation (2) is that a perturbation is added to the sample so as to minimize the log-likelihood, i.e. the final perturbation r_adv is the one that maximizes the loss function.
To solve the above minimization problem, the worst-case perturbation that minimizes the likelihood is sought. The adversarial perturbation is found with the fast gradient method: it can be computed by back-propagation in the neural network and then added to the word vectors of the original embedding layer to obtain the adversarial sample. The adversarial perturbation r_adv is computed as:

r_{adv} = \alpha \cdot \frac{g_w}{\|g_w\|_2} \qquad (3)

wherein

g_w = \nabla_{E_w}\left(-\log p\left(y \mid E_w,\ aspect;\ \hat{\theta}\right)\right), \qquad \|g_w\|_2 = \sqrt{\lambda_{\max}\!\left(g_w^{H} g_w\right)}

The adversarial-sample loss function for ABSA is therefore:

L_{adv}(\theta) = -\frac{1}{N} \sum_{i=1}^{N} \log p\left(y_i \mid E_{adv(i)},\ aspect_i;\ \theta\right) \qquad (4)

wherein the adversarial sample E_adv is represented as:

E_{adv} = \left(e_1 \oplus r_1,\ e_2 \oplus r_2,\ \ldots,\ e_{i+1} \oplus r_{i+1},\ \ldots,\ e_n \oplus r_n,\ e_{n+1} \oplus r_{n+1}\right) \oplus E_{seg} \oplus E_{pos} \qquad (5)

where α represents the adversarial-perturbation scaling factor; g_w represents the gradient of the word-embedding layer in the model; ||·||_2 represents the two-norm; ∇ represents the gradient operator; E_w represents the word-embedding tensor of the clean sample; p(y | E_w, aspect; \hat{\theta}) represents the sentiment prediction probability of the clean sample; aspect denotes the specific target entity; \hat{\theta} represents the constant set of current parameters of the neural network classifier; λ_i(·) denotes taking an eigenvalue of a matrix (λ_max its largest eigenvalue); g_w^H denotes the conjugate transpose of g_w; N denotes the total number of samples; p(y_i | E_adv(i), aspect_i; θ) represents the prediction probability of the i-th adversarial sample; y_i is the true label of the i-th sample; E_adv(i) is the embedding-layer tensor of the i-th adversarial sample; aspect_i is the i-th specific target entity; and θ denotes the neural network parameters.
r_adv denotes the adversarial perturbation; E_seg denotes the segment-embedding tensor of the clean sample; E_pos denotes the position-embedding tensor of the clean sample; ⊕ denotes tensor addition; e_1 is the word embedding of the 1st word of a sample and r_1 the adversarial perturbation added to it, and likewise e_2/r_2, …, e_{i+1}/r_{i+1}, …, e_n/r_n and e_{n+1}/r_{n+1} for the 2nd, (i+1)-th, n-th and (n+1)-th words.
Through the above adversarial-training method, the loss function of the adversarial samples and the adversarial-sample features (i.e. the features extracted from the adversarial samples by the model) can be obtained. Studying whether the joint feature distribution of clean samples and adversarial samples can improve model robustness and accuracy, and how to extract effective features, is the main work of this invention. Next, we describe how the problem of mismatched feature distributions is solved by means of Adv-BERT.
3.2 The proposed WMDE-AL model
For a small dataset, data augmentation is the simplest strategy for increasing feature diversity, so synonym replacement and random insertion are used for augmentation. To keep the target entities in the sentences unchanged, the WMDE method is adopted for data augmentation; statistics of the augmented samples are shown in Table 2.
The F_SR equation is:

S_{Sr} = F_{SR}(S_{In}) = \mathrm{Rep}\left(w^{In}_{id_{Sr}},\ \mathrm{Ran}(\mathrm{Wordnet},\ num_1)\right), \qquad w^{In}_{i} \neq aspect

where S_Sr represents the data after synonym replacement, F_SR(·) represents the synonym-replacement data-augmentation function, S_In represents the input of the original corpus, w^In_i is the i-th word of an original sample, aspect represents the specific target entity, Rep(·) is the word-replacement function, w^In_{id_Sr} is the id_Sr-th word to be replaced, id_Sr is the position of the word replacement, Ran(Wordnet, num_1) denotes randomly selecting num_1 synonyms of the i-th word w^In_i from the WordNet library, and != means "not equal".
The F_RI equation is:

S_{Ri} = F_{RI}(S_{In}) = \mathrm{Insert}\left(w^{In}_{id_{RI}},\ \mathrm{Ran}(\mathrm{Wordnet},\ num_2)\right)

where S_Ri represents the data after random insertion, F_RI(·) represents the random-insertion data-augmentation function, Insert(·) denotes inserting a word after the id_RI-th word, w^In_{id_RI} is the id_RI-th word after which the insertion is made, id_RI denotes the position in the sentence after which the word is inserted, and Ran(Wordnet, num_2) denotes randomly selecting num_2 words from the WordNet library.
Table 2. Sample statistics after data augmentation (presented as an image in the original document; not reproduced here).
The augmented data Da_Out are used as the clean samples, where Da_Out = S_Sr ∪ S_Ri. For each batch of clean samples, the clean samples are first used to generate the adversarial perturbation r_adv of the word-embedding layer, thereby producing the adversarial samples, which are then trained with Adv-BERT. Each batch of clean samples is trained with BERT, and each batch of adversarial samples is trained with Adv-BERT. The per-batch loss function on clean samples is calculated as follows:

L_{clean}(\theta) = -\frac{1}{N_{batch}} \sum_{i=1}^{N_{batch}} \log p\left(y_i \mid E_i,\ aspect_i;\ \theta\right)

where L_clean(·) represents the loss function on clean samples, N_batch denotes the batch size, p(y_i | E_i, aspect_i; θ) represents the sentiment prediction probability of the i-th sample in a batch, y_i is the true label of the i-th sample, E_i is the embedding-layer tensor of the i-th clean sample, aspect_i is the specific target entity of the i-th clean sample, and θ denotes the neural network parameters.
The per-batch loss function on adversarial samples is calculated as follows:

L_{adv}(\theta) = -\frac{1}{N_{batch}} \sum_{i=1}^{N_{batch}} \log p\left(y_i \mid E_{adv(i)},\ aspect_i;\ \theta\right)

where L_adv(·) represents the loss function on adversarial samples, N_batch denotes the batch size, θ denotes the neural network parameters, p(y_i | E_adv(i), aspect_i; θ) represents the sentiment prediction probability of the i-th adversarial sample, y_i is the true label of the i-th sample, E_adv(i) is the embedding-layer tensor of the i-th adversarial sample, and aspect_i is the specific target entity of the i-th sample.
Finally, the sum of the per-batch losses on the two kinds of samples is minimized:

\hat{\theta} = \arg\min_{\theta}\ L(\theta) = \arg\min_{\theta}\left[ L_{clean}(\theta) + L_{adv}(\theta) \right]

where L(·) represents the model loss function, \hat{\theta} represents the value of the model parameters θ that minimizes the loss function, L_clean(θ) represents the per-batch loss on clean samples, and L_adv(θ) represents the per-batch loss on adversarial samples.
according to the invention, a specific target emotion classification of BERT-BASE is selected as a baseline, an individual data enhancement mode, an individual confrontation learning mode and a data enhancement confrontation learning mode experiment are respectively carried out, and compared with a BERT-BASE reference model, because an embedding layer of the BERT model has three vectors which are respectively Word embedding (Word embedding), segment embedding (Segment embedding) and Position embedding (Position embedding). In the experiment, counterattack is only carried out aiming at word embedding, so that a word embedding countersample is generated, and the other two embeddings are not changed. WMDE-AL Algorithm 1 shows:
(Algorithm 1 is presented as an image in the original document and is not reproduced here.)
Algorithm 1 comprises a WMDE function and an AL function: the WMDE function describes the text word-masking data-augmentation procedure, and the AL function describes the ABSA adversarial-learning procedure.
4. Analysis of experimental results
4.1 Experimental setup
(1) Datasets: the experiments of this patent use the Laptop and Restaurant datasets from SemEval-2014. A specific target entity has four sentiment polarities: positive, neutral, negative and conflict. Because the proportion of the conflict polarity is small, preprocessing follows other researchers' practice of removing the conflict-polarity corpus; the statistics of the three remaining sentiment polarities for each dataset are shown in Table 2. The datasets are tokenized with the BertTokenizer in the pytorch-transformers toolkit, and data augmentation uses the WordNet thesaurus in the NLTK toolkit.
(2) Adversarial attack: the invention uses FGM as the adversarial-attack method. FGM attacks are run with different values of α: FGM applies adversarial perturbations to the word-embedding layer of the BERT model to generate adversarial samples, which are then trained with Adv-BERT.
(3) Reference model: this patent uses BERT-BASE (L=12, H=768, A=12, total parameters=110M) as the reference model for ABSA, where L denotes the number of hidden layers, H denotes the hidden size, and A denotes the number of self-attention heads. The activation function of the hidden layers of the BERT model is the Gaussian error linear unit (gelu), calculated as:

\mathrm{gelu}(\theta) = 0.5\,\theta\left(1 + \tanh\!\left(\sqrt{2/\pi}\,\big(\theta + 0.044715\,\theta^{3}\big)\right)\right)

where gelu(·) denotes the Gaussian error linear unit, θ denotes its input (the pre-activation value), and tanh denotes the hyperbolic tangent function;
(4) Experimental environment and hyper-parameter settings: the experiments of this patent are implemented with a GeForce RTX 3090 GPU with 24 GB of video memory and the PyTorch 1.8.1 framework. The hyper-parameter settings are shown in Table 3, followed by an illustrative setup sketch.
Table 3. Experimental hyper-parameter settings

Parameter                                      Value
Batch size                                     16
Learning rate                                  2e-5
Adversarial-perturbation scaling factor α      α ∈ [0.01, 0.09]
L2 regularization                              0.01
Dropout rate                                   0.1
Initializer                                    xavier_uniform_
Optimizer                                      Adam
Number of training epochs                      5
Maximum sequence length                        128
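The preprocessing of Section 4.1 and the settings of Table 3 can be wired together as in the sketch below, assuming the pytorch-transformers BertTokenizer and the NLTK WordNet corpus named above; the calls and the grouping of the Table 3 values into a dictionary are illustrative, not the patent's code.

```python
# Illustrative setup sketch: tokenization (pytorch-transformers), WordNet access (NLTK),
# and the Table 3 hyper-parameters grouped into a config dict. Names are ours.
import nltk
from nltk.corpus import wordnet
from pytorch_transformers import BertTokenizer

nltk.download("wordnet")                                    # thesaurus used by the WMDE module
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

sentence = "Great food but the service was dreadful!"
tokens = tokenizer.tokenize(sentence)                       # WordPiece tokens for BERT-BASE
input_ids = tokenizer.convert_tokens_to_ids(tokens)
synonyms = wordnet.synsets("dreadful")                      # synonym candidates for augmentation

config = {
    "batch_size": 16,
    "learning_rate": 2e-5,
    "adv_alpha_range": (0.01, 0.09),    # adversarial-perturbation scaling factor alpha
    "l2_regularization": 0.01,
    "dropout": 0.1,
    "initializer": "xavier_uniform_",
    "optimizer": "adam",
    "epochs": 5,
    "max_seq_length": 128,
}
```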
4.2 Analysis of results
To verify the effectiveness of the proposed WMDE method, data augmentation is performed on the Laptop and Restaurant datasets: for each sentence, synonym replacement that masks the specific target word and random word insertion each generate one augmented sentence, producing two pieces of augmented data. The original data are then merged in and fed to the BERT-BASE model for specific-target sentiment classification. Table 5 compares the WMDE method with other models and with the performance of the BERT-BASE reference model. With BERT-BASE as the reference model, the sentiment classification accuracy on the Laptop dataset after WMDE augmentation is 79.00%, an improvement of 2.35% over the 76.65% obtained by training on the original data. Likewise, the accuracy on the Restaurant dataset after WMDE augmentation is 84.38%, an improvement of 0.36% over the 84.02% obtained on the original data. The experimental results show that applying WMDE augmentation to the Laptop and Restaurant aspect-level sentiment classification datasets to generate new training data can effectively improve the model, with the larger gain on the small Laptop dataset.
Comparing the experimental results above: in the WMDE method, synonym replacement is performed with word masking, the aspect is kept unchanged, and the replacement preserves the part of speech, which ensures that the generated sentences remain fluent and their semantics unchanged. For random insertion, adverbs are inserted, ensuring that the meaning and syntactic structure of the generated sentence do not differ from the original, and a similarity computation between the new samples and the original samples is carried out when the final samples are merged. The WMDE method therefore generates more accurate augmented samples from the original data without changing its meaning or syntactic structure, so that the model can learn more effective features. During the experiments, the effectiveness of the synonym-replacement and random-insertion algorithms for model feature learning was verified separately, and finally the two augmentation modes were combined; the results show that combining the two yields the best effect.
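The patent mentions a similarity computation between each augmented sample and its original when the samples are merged, without specifying the measure; the sketch below assumes a simple cosine similarity over mean word embeddings as one plausible realization, with a threshold chosen purely for illustration.

```python
# Illustrative filter for merging augmented samples, assuming cosine similarity of
# mean word embeddings as the (unspecified) similarity measure; the threshold is ours.
import torch
import torch.nn.functional as F

def keep_augmented(orig_emb: torch.Tensor, aug_emb: torch.Tensor, threshold: float = 0.8) -> bool:
    """orig_emb/aug_emb: (seq_len, hidden) word-embedding matrices of the two sentences."""
    sim = F.cosine_similarity(orig_emb.mean(dim=0), aug_emb.mean(dim=0), dim=0)
    return sim.item() >= threshold   # keep the augmented sample only if it stays close to the original
```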
The augmented samples and the original samples are fused as the input samples of the model, and on both the Laptop and the Restaurant datasets the perturbation coefficient is varied from 0.01 to 0.09 with a step of 0.01. FIG. 3 shows the accuracy of the comparison experiments between the proposed WMDE-AL model with different perturbation coefficients α and the BERT-BASE model: sub-figure (a) of FIG. 3 shows the adversarial-training accuracy on the Laptop dataset for different α, and sub-figure (b) of FIG. 3 shows the adversarial-training accuracy on the Restaurant dataset for different α; each sub-figure of FIG. 3 includes the performance of the four methods BERT-BASE, BERT-WMDE, BERT-AL and BERT-WMDE-AL. The growth of the adversarial-training evaluation metrics over BERT-BASE for different α is shown in FIG. 4, where the scale of the radar chart represents the size of the adversarial perturbation α: sub-figure (a) of FIG. 4 shows the growth of accuracy and F1 on the Laptop dataset, from which it can be seen that the growth of the WMDE-AL method is higher than that of the AL method; sub-figure (b) of FIG. 4 shows the growth of accuracy and F1 on the Restaurant dataset, from which it can be seen that the growth of the WMDE-AL method is slightly lower than that of the AL method. For the Laptop dataset, with the BERT-AL training mode the accuracy reaches its maximum of 79.94% at α=0.02, an improvement of 3.29% over the BERT-BASE training mode; with the BERT-WMDE-AL training mode the accuracy reaches its maximum of 80.88% at α=0.01, improvements of 4.23% and 1.88% over the BERT-BASE and BERT-WMDE training modes respectively, and the accuracy of adversarial training with word-masking data augmentation is 0.94% higher than without it. For the Restaurant dataset, with the BERT-AL training mode the accuracy reaches its maximum of 85.71% at α=0.08, 1.69% higher than with the BERT-BASE training mode; with the BERT-WMDE-AL training mode the accuracy reaches its maximum of 85.27% at α=0.02, improvements of 1.25% and 0.89% over the BERT-BASE and BERT-WMDE training modes respectively, while the accuracy of adversarial training with word-masking data augmentation is 0.44% lower than without it. The adversarial-training experiments show that WMDE-AL performs better on the small dataset, with a clear improvement over the reference model, whereas for the slightly larger dataset the AL method alone performs slightly better than WMDE-AL. From this analysis, both the AL and the WMDE-AL methods can effectively increase sample feature diversity and use adversarial samples to improve the quality of the text representation, thereby improving specific-target sentiment classification performance.
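The α sweep described above can be scripted as below; `train_and_eval` is a hypothetical stand-in for the full WMDE-AL training and evaluation pipeline, while the metric calls follow the scikit-learn API used for the accuracy and F1 values reported here.

```python
# Illustrative sweep over the adversarial-perturbation factor alpha (0.01..0.09, step 0.01).
# `train_and_eval` is a hypothetical placeholder for the WMDE-AL training/evaluation pipeline.
import numpy as np
from sklearn.metrics import accuracy_score, f1_score

results = {}
for alpha in np.arange(0.01, 0.10, 0.01):
    y_true, y_pred = train_and_eval(dataset="Laptop", method="BERT-WMDE-AL", alpha=round(alpha, 2))
    results[round(alpha, 2)] = (
        accuracy_score(y_true, y_pred),             # accuracy reported in Fig. 3 / Table 4
        f1_score(y_true, y_pred, average="macro"),  # macro-F1 reported in Fig. 4
    )
best_alpha = max(results, key=lambda a: results[a][0])   # alpha with the highest accuracy
```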
Table 4. Accuracy of adversarial training with word-masking-augmented BERT-BASE (presented as an image in the original document; not reproduced here).
4.3 Model comparison
The model performance comparison includes: (1) comparing the performance of the WMDE, AL and WMDE-AL methods proposed in this patent with BERT-BASE, as well as with one another; (2) comparing against models that currently perform well on the Laptop and Restaurant datasets. The evaluation metrics are accuracy and the F1 value, and the performance of the reference models is shown in Table 5, from which the following can be observed:
(1) TD-LSTM (Tang et al. 2016) extracts features with a deep learning model, achieving 71.83% and 78.00% accuracy on the Laptop and Restaurant datasets respectively. Deep-learning specific-target sentiment classification models avoid the complex manual feature engineering of machine learning, and their performance mostly exceeds the best machine-learning results.
(2) MemNet (Tang et al. 2016) combines multiple attention hops through linear combination to extract target-entity feature information, improving specific-target sentiment classification performance, with accuracies of 72.20% and 81.00% respectively.
(3) RAM (Chen et al. 2017) uses a GRU network structure to combine multiple attention weights, combining the different attention sentiment feature vectors non-linearly, with accuracies of 74.49% and 80.23% respectively.
(4) MGAN (Fan et al. 2018) captures the word-level interaction between target entities and sentences with fine-grained and coarse-grained attention before performing specific-target sentiment classification, with accuracies of 75.39% and 81.25% respectively.
(5) RepWalk (Zheng et al. 2020) uses a replicated random walk over the syntax tree to capture the contextual features of the sentence, effectively exploiting the syntactic structure to improve the sentence representation, with accuracies of 78.20% and 83.80% respectively.
(6) BERT-PT (Xu et al. 2019) post-trains the contextual BERT model on a large-scale corpus of the specific target domain, improving the quality of the word representations for the final task, with accuracies of 78.07% and 84.95% respectively.
Table 5. Overall performance comparison of specific-target sentiment classification models on the Laptop and Restaurant datasets (presented as an image in the original document; not reproduced here).
In Table 5, # marks the experimental results of this patent, "-" indicates that the value is not reported in the cited reference, and the remaining figures are taken from the original publications.
To evaluate the performance of the method proposed in this patent on the specific-target sentiment analysis task, BERT-BASE is adopted as the target model of adversarial training and the following three comparative experiments are carried out: (1) first, the effectiveness of the new corpus generated by WMDE is verified; the best of five WMDE runs is taken as the experimental result, with accuracies of 79.00% and 84.38% on the Laptop and Restaurant datasets respectively; (2) the effectiveness of AL on the original dataset is verified by adversarial training with clean samples and adversarial samples, with accuracies of 79.94% and 85.71%, which are 1.87% and 0.76% higher than the BERT-PT model respectively; (3) it is verified that generating new training samples and fusing them with the original samples for adversarial training, i.e. the WMDE-AL method proposed by the invention, achieves accuracies of 80.88% and 85.27%, which are 2.81% and 0.32% higher than the BERT-PT model respectively.
5. Conclusion
A specific-target sentiment analysis model with word-masking data augmentation and adversarial learning is proposed, aiming to increase the diversity of the clean-sample feature distribution and to use adversarial training to improve the sentiment classification performance of BERT-BASE. Experimental results on the Laptop and Restaurant datasets show that adding WMDE, AL or WMDE-AL to BERT-BASE clearly improves specific-target sentiment analysis over plain BERT-BASE, and that the AL and WMDE-AL methods outperform the currently advanced BERT-PT model. The main conclusions are: (1) word-masking data augmentation of the specific-target-domain corpus through synonym replacement and random word insertion keeps the semantics and grammatical structure of sentences unchanged while preventing the entity words from being replaced, and effectively augments the domain dataset; (2) adversarial learning is performed on the original data and on the augmented data respectively, and clean samples together with adversarial samples improve the specific-target sentiment classification accuracy, achieving the goal of adversarial defense.
While embodiments of the present invention have been shown and described, it will be understood by those of ordinary skill in the art that: various changes, modifications, substitutions and alterations can be made to the embodiments without departing from the principles and spirit of the invention, the scope of which is defined by the claims and their equivalents.

Claims (8)

1. A specific-target sentiment analysis method fusing word-masking data augmentation and adversarial learning, characterized by comprising the following steps:
S1, performing synonym replacement and random word insertion on sentences while masking the target entity, generating valid augmented samples and fusing them with the original samples;
S2, constructing a BERT-BASE-based adversarial-learning sentiment classification model for specific targets, taking the data fused in S1 as the model input, and training the sentiment classification model jointly on clean samples and adversarial samples:
the augmented data Da_Out are used as the clean samples, where Da_Out = S_Sr ∪ S_Ri, S_Sr being the data after synonym replacement and S_Ri the data after random insertion; for each batch of clean samples, the clean samples are first used to generate the adversarial perturbation r_adv of the word-embedding layer, thereby producing the adversarial samples; the Adv-BERT model performs each batch of training on the adversarial samples, and each batch of training on clean samples is performed with the BERT model;
the sentiment classification model comprises the BERT model and the Adv-BERT model;
the adversarial learning comprises: applying adversarial learning to the ABSA task by adding adversarial perturbations in the embedding layer of the model, the probability that the sentiment of the target entity aspect in a sentence is y being p(y | S_BertIn, aspect), where S_BertIn represents the input of the BERT model; the loss function after adding the adversarial perturbation to the embedding layer of the model is therefore:

-\log p\left(y \mid E_w + r_{adv},\ aspect;\ \theta\right) \qquad (1)

wherein

r_{adv} = \arg\min_{r,\ \|r\| \le \alpha}\ \log p\left(y \mid E_w + r,\ aspect;\ \hat{\theta}\right) \qquad (2)

p(y | E_w + r_adv, aspect; θ) represents the sentiment prediction probability after adding the adversarial perturbation r_adv, r_adv represents the adversarial perturbation, r represents a candidate perturbation of the input, α represents the perturbation scaling factor, ||·|| represents a norm, argmin selects the r variable that minimizes the objective function, this value of r then being assigned to r_adv, p(y | E_w + r, aspect; \hat{\theta}) represents the prediction probability after adding the perturbation r, E_w represents the word-embedding tensor of the clean sample, θ represents the neural network parameters, and \hat{\theta} represents the constant set of current parameters of the neural network classifier;
and S3, finally obtaining the specific-target sentiment analysis result.
2. The specific-target sentiment analysis method fusing word-masking data augmentation and adversarial learning according to claim 1, characterized by further comprising the step of:
S4, performing adversarial learning on the original samples and on the augmented samples respectively, and evaluating the results with evaluation metrics; the evaluation metrics comprise accuracy and/or the F1 value.
3. The method for analyzing emotion of a specific target in fused word mask data enhancement and antagonistic learning according to claim 1, wherein the calculation method of synonym substitution in step S1 is:
S_Sr = F_SR(S_In) = Rep(w_{id_Sr}, Ran(Wordnet(w_i), num_1)),  w_i != aspect
wherein S_Sr represents the data after synonym replacement;
F_SR(·) represents the synonym replacement data enhancement function;
S_In represents the input of the original corpus;
w_i is the i-th word of an original sample;
aspect represents the specific target entity;
Rep(·) represents the word replacement function;
w_{id_Sr} represents the id_Sr-th word to be replaced;
id_Sr represents the position of the word replacement;
Ran(Wordnet(w_i), num_1) represents randomly finding num_1 synonyms of the i-th word w_i in the Wordnet library;
!= means not equal.
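For illustration only, not part of the claims: a minimal sketch of the synonym replacement of claim 3 using the NLTK interface to WordNet, assuming the WordNet corpus has been downloaded; the function name and the handling of num_1 are illustrative.

import random
from nltk.corpus import wordnet   # requires: nltk.download('wordnet')

def synonym_replace(tokens, aspect_terms, num_1=3):
    # Replace one non-aspect word with a WordNet synonym; the target entity is shielded.
    positions = [i for i, w in enumerate(tokens) if w.lower() not in aspect_terms]
    random.shuffle(positions)
    out = list(tokens)
    for idx in positions:                                   # candidate id_Sr positions
        synonyms = sorted({lemma.name().replace("_", " ")
                           for syn in wordnet.synsets(out[idx])
                           for lemma in syn.lemmas()} - {out[idx]})
        if synonyms:                                        # Ran(Wordnet(w_i), num_1)
            pool = random.sample(synonyms, min(num_1, len(synonyms)))
            out[idx] = random.choice(pool)                  # Rep(w_{id_Sr}, ...)
            break
    return out

For example, synonym_replace("the battery life is great".split(), {"battery", "life"}) may replace "great" but never the aspect words.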
4. The specific target emotion analysis method fusing word shielding data enhancement and adversarial learning according to claim 1, characterized in that the random word insertion in step S1 is calculated as:
S_Ri = F_RI(S_In) = Insert(w_{id_RI}, Ran(Wordnet, num_2))
wherein S_Ri represents the data after random insertion;
F_RI(·) represents the random insertion data enhancement function;
Insert(·) represents inserting a word after the id_RI-th word;
w_{id_RI} represents the id_RI-th word after which the insertion is made;
id_RI represents the position in the sentence after which the word is inserted;
Ran(Wordnet, num_2) represents randomly finding num_2 words in the Wordnet library.
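For illustration only, not part of the claims: a minimal sketch of the random insertion of claim 4, again assuming the NLTK WordNet corpus; drawing the inserted words from the synonyms of words already in the sentence is one simple way to realise Ran(Wordnet, num_2) and is an assumption of this sketch rather than a requirement of the claim.

import random
from nltk.corpus import wordnet   # requires: nltk.download('wordnet')

def random_insert(tokens, num_2=1):
    # Insert num_2 WordNet words, each after a randomly chosen position id_RI.
    out = list(tokens)
    for _ in range(num_2):
        base = random.choice(out)
        candidates = [lemma.name().replace("_", " ")
                      for syn in wordnet.synsets(base)
                      for lemma in syn.lemmas()]
        if not candidates:
            continue
        id_ri = random.randrange(len(out))        # position after which to insert
        out.insert(id_ri + 1, random.choice(candidates))
    return out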
5. The specific target emotion analysis method fusing word shielding data enhancement and adversarial learning according to claim 1, characterized in that the loss function of each batch of clean samples is calculated as follows:
L_clean(θ) = -(1/N_batch) Σ_{i=1}^{N_batch} log p(y_i|E_i, aspect_i; θ)
wherein L_clean(·) represents the loss function of the clean samples, N_batch represents the size of a batch, θ represents the neural network parameters, p(y_i|E_i, aspect_i; θ) represents the emotion prediction probability of the i-th sample in a batch, y_i represents the true label of the i-th sample, E_i represents the embedding layer tensor of the i-th clean sample, and aspect_i represents the specific target entity of the i-th sample;
the loss function of each batch of adversarial samples is calculated as follows:
L_adv(θ) = -(1/N_batch) Σ_{i=1}^{N_batch} log p(y_i|E_adv(i), aspect_i; θ)
wherein L_adv(·) represents the loss function of the adversarial samples, N_batch represents the size of a batch, θ represents the neural network parameters, p(y_i|E_adv(i), aspect_i; θ) represents the emotion prediction probability of the i-th adversarial sample, and E_adv(i) represents the embedding layer tensor of the i-th adversarial sample.
6. The specific target emotion analysis method fusing word shielding data enhancement and adversarial learning according to claim 5, further comprising:
minimizing the loss functions of each batch of clean samples and adversarial samples jointly:
θ* = argmin_θ L(θ) = argmin_θ (L_clean(θ) + L_adv(θ))
wherein L(·) represents the model loss function, θ* represents the value of the model parameter θ when the loss function is minimized, L_clean(θ) represents the loss function of each batch of clean samples, and L_adv(θ) represents the loss function of each batch of adversarial samples.
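For illustration only, not part of the claims: a minimal sketch of the joint per-batch objective of claims 5 and 6, assuming PyTorch; cross_entropy already averages the -log p terms over the batch, i.e. the (1/N_batch)·Σ factor.

import torch.nn.functional as F

def joint_loss(logits_clean, logits_adv, labels):
    # L(theta) = L_clean(theta) + L_adv(theta), minimised jointly over theta.
    loss_clean = F.cross_entropy(logits_clean, labels)   # L_clean(theta)
    loss_adv = F.cross_entropy(logits_adv, labels)       # L_adv(theta)
    return loss_clean + loss_adv

# Typical training-loop usage (the optimizer name is illustrative):
#   loss = joint_loss(logits_clean, logits_adv, labels)
#   loss.backward()
#   optimizer.step()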
7. The specific target emotion analysis method fusing word shielding data enhancement and adversarial learning according to claim 1, characterized in that the hidden layer of the BERT model adopts the Gaussian error linear unit as the activation function:
gelu(θ) = 0.5θ(1 + tanh(√(2/π)·(θ + 0.044715θ³)))
wherein gelu(·) represents the Gaussian error linear unit, θ represents a neural network parameter, and tanh is the hyperbolic tangent function.
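For illustration only, not part of the claims: the tanh approximation of GELU in claim 7 written as a small PyTorch function; the argument is named x here for readability.

import math
import torch

def gelu(x: torch.Tensor) -> torch.Tensor:
    # tanh approximation of the Gaussian error linear unit, as used in BERT.
    return 0.5 * x * (1.0 + torch.tanh(
        math.sqrt(2.0 / math.pi) * (x + 0.044715 * torch.pow(x, 3.0))))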
8. The specific target emotion analysis method fusing word shielding data enhancement and adversarial learning according to claim 1, characterized in that the adversarial learning further comprises:
finding the adversarial perturbation by a fast gradient descent method, computing the adversarial perturbation by back propagation in the neural network, and then adding the adversarial perturbation to the word vectors of the original embedding layer to obtain the adversarial sample; the adversarial perturbation r_adv is calculated as follows:
r_adv = -α·g_w / ||g_w||_2
wherein
g_w = ∇_{E_w} log p(y|E_w, aspect; θ̂)
||g_w||_2 = √(max_i λ_i(g_w^H·g_w))
the adversarial sample loss function for ABSA is therefore as follows:
L_adv(θ) = -(1/N) Σ_{i=1}^{N} log p(y_i|E_adv(i), aspect_i; θ)
wherein the adversarial sample E_adv is expressed as follows:
E_adv = (E_w ⊕ r_adv) + E_seg + E_pos = [e_w^(1)+r_adv^(1), e_w^(2)+r_adv^(2), ..., e_w^(i+1)+r_adv^(i+1), ..., e_w^(n)+r_adv^(n), e_w^(n+1)+r_adv^(n+1)] + E_seg + E_pos
wherein α represents the adversarial perturbation scaling factor, g_w represents the gradient of the word embedding layer in the model, and ||·||_2 represents the two-norm;
∇ represents the gradient operator, E_w represents the word embedding tensor of the clean sample, p(y|E_w, aspect; θ̂) represents the emotion prediction probability of the clean sample, aspect represents the specific target entity, and θ̂ represents a constant set to the current parameters of the neural network classifier; λ_i(·) represents the eigenvalues of a matrix, and g_w^H represents the conjugate transpose of g_w; N represents the total number of samples, p(y_i|E_adv(i), aspect_i; θ) represents the prediction probability of the i-th adversarial sample, y_i represents the true label of the i-th sample, E_adv(i) represents the embedding layer tensor of the i-th adversarial sample, aspect_i represents the i-th specific target entity, and θ represents the neural network parameters;
r_adv represents the adversarial perturbation, E_seg represents the segment embedding tensor of the clean sample, E_pos represents the position embedding tensor of the clean sample, and ⊕ represents tensor addition; e_w^(1) represents the word embedding of the 1st word of a sample and r_adv^(1) represents the adversarial perturbation corresponding to the 1st word embedding of the sample; e_w^(2) represents the word embedding of the 2nd word of a sample and r_adv^(2) represents the adversarial perturbation corresponding to the 2nd word embedding of the sample; e_w^(i+1) represents the word embedding of the (i+1)-th word of a sample and r_adv^(i+1) represents the adversarial perturbation corresponding to the (i+1)-th word embedding of the sample; e_w^(n) represents the word embedding of the n-th word of a sample and r_adv^(n) represents the adversarial perturbation corresponding to the n-th word embedding of the sample; e_w^(n+1) represents the word embedding of the (n+1)-th word of a sample and r_adv^(n+1) represents the adversarial perturbation corresponding to the (n+1)-th word embedding of the sample.
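For illustration only, not part of the claims: a minimal PyTorch sketch of assembling the adversarial sample E_adv of claim 8 from the clean word, segment and position embedding tensors; the (batch, seq_len, hidden) shapes, the function name, and the use of a flattened 2-norm instead of the spectral norm in the claim are assumptions of this sketch.

import torch

def build_adv_embedding(e_w, e_seg, e_pos, log_prob_clean, alpha=1.0):
    # log_prob_clean: scalar log p(y | E_w, aspect) of the clean batch.
    # g_w: gradient of the clean log-probability w.r.t. the word embeddings.
    g_w = torch.autograd.grad(log_prob_clean, e_w, retain_graph=True)[0]
    # r_adv = -alpha * g_w / ||g_w||_2 (flattened 2-norm used for simplicity).
    r_adv = -alpha * g_w / (g_w.norm(p=2) + 1e-12)
    # Word-wise addition of the perturbation, then the segment and position
    # embeddings of the clean sample: E_adv = (E_w (+) r_adv) + E_seg + E_pos.
    return (e_w + r_adv) + e_seg + e_pos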
CN202110999219.7A 2021-08-28 2021-08-28 Specific target emotion analysis method for enhancing and resisting learning by fusing word shielding data Active CN113723075B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202110999219.7A CN113723075B (en) 2021-08-28 2021-08-28 Specific target emotion analysis method for enhancing and resisting learning by fusing word shielding data

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202110999219.7A CN113723075B (en) 2021-08-28 2021-08-28 Specific target emotion analysis method for enhancing and resisting learning by fusing word shielding data

Publications (2)

Publication Number Publication Date
CN113723075A (en) 2021-11-30
CN113723075B (en) 2023-04-07

Family

ID=78678668

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202110999219.7A Active CN113723075B (en) 2021-08-28 2021-08-28 Specific target emotion analysis method for enhancing and resisting learning by fusing word shielding data

Country Status (1)

Country Link
CN (1) CN113723075B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114781352A (en) * 2022-04-07 2022-07-22 重庆邮电大学 Emotion analysis method based on association between grammar dependency type and aspect
CN114880586A (en) * 2022-06-07 2022-08-09 电子科技大学 Confrontation-based social circle inference method through mobility context awareness
CN115392259B (en) * 2022-10-27 2023-04-07 暨南大学 Microblog text sentiment analysis method and system based on confrontation training fusion BERT
CN115858791B (en) * 2023-02-17 2023-09-15 成都信息工程大学 Short text classification method, device, electronic equipment and storage medium
CN116776884A (en) * 2023-06-26 2023-09-19 中山大学 Data enhancement method and system for medical named entity recognition

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111159416A (en) * 2020-04-02 2020-05-15 腾讯科技(深圳)有限公司 Language task model training method and device, electronic equipment and storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109117482B (en) * 2018-09-17 2021-07-06 武汉大学 Confrontation sample generation method for Chinese text emotion orientation detection
CN110245229B (en) * 2019-04-30 2023-03-28 中山大学 Deep learning theme emotion classification method based on data enhancement
US20210142181A1 (en) * 2019-11-07 2021-05-13 Microsoft Technology Licensing, Llc Adversarial training of machine learning models
CN110909164A (en) * 2019-11-22 2020-03-24 科大国创软件股份有限公司 Text enhancement semantic classification method and system based on convolutional neural network
CN111324744B (en) * 2020-02-17 2023-04-07 中山大学 Data enhancement method based on target emotion analysis data set
CN112528675A (en) * 2020-12-14 2021-03-19 成都易书桥科技有限公司 Confrontation sample defense algorithm based on local disturbance
CN112580337A (en) * 2020-12-29 2021-03-30 南京航空航天大学 Emotion classification model and emotion classification method based on data enhancement

Also Published As

Publication number Publication date
CN113723075A (en) 2021-11-30

Similar Documents

Publication Publication Date Title
CN113723075B (en) Specific target emotion analysis method for enhancing and resisting learning by fusing word shielding data
Logeswaran et al. Sentence ordering and coherence modeling using recurrent neural networks
Xia et al. Xgpt: Cross-modal generative pre-training for image captioning
Kant et al. Practical text classification with large pre-trained language models
CN113705678B (en) Specific target emotion analysis method for enhancing antagonism learning by using word shielding data
CN111738003B (en) Named entity recognition model training method, named entity recognition method and medium
Zhuo et al. Segment-level sequence modeling using gated recursive semi-markov conditional random fields
CN113723076A (en) Specific target emotion analysis method based on word mask data enhancement and counterstudy
CN115658954B (en) Cross-modal search countermeasure method based on prompt learning
Choi et al. Mem-kgc: Masked entity model for knowledge graph completion with pre-trained language model
Guo et al. Implicit discourse relation recognition via a BiLSTM-CNN architecture with dynamic chunk-based max pooling
Zhou et al. Robust reading comprehension with linguistic constraints via posterior regularization
Alsmadi et al. Adversarial machine learning in text processing: a literature survey
Kitada et al. Making attention mechanisms more robust and interpretable with virtual adversarial training
CN113220865B (en) Text similar vocabulary retrieval method, system, medium and electronic equipment
US20240119716A1 (en) Method for multimodal emotion classification based on modal space assimilation and contrastive learning
Xue et al. Variational Causal Inference Network for Explanatory Visual Question Answering
Wu et al. Multi-tasking for Aspect-based Sentiment Analysis via Constructing Auxiliary Self-Supervision ACOP task
Zhu et al. Generating semantically valid adversarial questions for tableqa
Chen et al. Self-discriminative learning for unsupervised document embedding
Duan et al. A Parameter-Adaptive Convolution Neural Network for Capturing the Context-Specific Information in Natural Language Understanding
Li et al. Adaptive feature discrimination and denoising for asymmetric text matching
Im et al. Multilayer CARU Model for Text Summarization
Li et al. Textual Adversarial Attacks on Named Entity Recognition in a Hard Label Black Box Setting
Chatzigianellis Greek news topics classification using graph neural networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant