CN114722798A - Ironic recognition model based on convolutional neural network and attention mechanism - Google Patents

Ironic recognition model based on convolutional neural network and attention mechanism Download PDF

Info

Publication number
CN114722798A
CN114722798A CN202210108214.5A CN202210108214A CN114722798A CN 114722798 A CN114722798 A CN 114722798A CN 202210108214 A CN202210108214 A CN 202210108214A CN 114722798 A CN114722798 A CN 114722798A
Authority
CN
China
Prior art keywords
semantic
ironic
layer
text sequence
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210108214.5A
Other languages
Chinese (zh)
Inventor
孟佳娜
朱彦霖
刘爽
孙世昶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Minzu University
Original Assignee
Dalian Minzu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian Minzu University filed Critical Dalian Minzu University
Priority to CN202210108214.5A priority Critical patent/CN114722798A/en
Publication of CN114722798A publication Critical patent/CN114722798A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9536Search customisation based on social or collaborative filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis

Abstract

The invention belongs to the field of natural language processing, and relates to an ironic recognition model based on a convolutional neural network and an attention mechanism. The model comprises the following steps: S1, a text representation layer; S2, a semantic feature extraction layer; S3, an ironic semantic relation modeling layer; and S4, an ironic intention discrimination layer. Advantageous effects: the model can identify potential irony expressions in social media and mine the real emotional tendency of a user; it makes up for the defects of traditional sequence models and realizes the modeling of emotional-semantic relations between sequences within a sentence; compared with existing irony identification methods based on word-pair contradiction, the model more easily captures the semantic contradiction information that produces the ironic effect and improves the accuracy of the irony identification task; and by implicitly dividing sentences into phrase fragments through the convolution operation it extracts higher-level semantic features, so that semantic inconsistency information between the phrase fragments of a sentence can be captured more accurately.

Description

Ironic recognition model based on convolutional neural network and attention mechanism
Technical Field
The invention belongs to the field of natural language processing, and relates to an ironic recognition model based on a convolutional neural network and an attention mechanism.
Background
Irony is a common figurative use of language in human social life. It is widely favored by social media users because of the strong and distinctive linguistic effect it produces, but factors such as the ambiguity of ironic expression and the gap between literal meaning and real emotion pose difficulties and challenges for text sentiment analysis; if the real emotional information contained in a text cannot be accurately identified, the accuracy of text sentiment analysis and opinion mining tasks is seriously affected.
Intra-sentence semantic conflict is an important characteristic of irony. When a traditional sequence model such as a long short-term memory network is used for ironic semantic modeling, it is difficult to accurately capture the emotional-semantic relationships embedded in a sentence, so the performance of such models has been hard to improve. In recent years, a small number of researchers have proposed improved algorithms aimed at these shortcomings of sequence models; the main idea is to capture the semantic conflict relationships of ironic text by modeling the semantic relationships between the words in a sentence. This overcomes the shortcomings of conventional sequence models and can accurately focus on the semantic association information between sequences within a sentence, greatly improving the accuracy of the irony recognition task. However, it has been found that when the semantic relationships between words are modeled with this method, it is sometimes difficult to capture semantic conflict information between words because a single word carries too little semantic information. For example, in the ironic sentence "Going in to work for 2 hours in traffic in the 50min drive", semantic conflict is hard to find at the word level; but when the phrase fragment "work for 2 hours" is compared with the phrase fragment "the 50min drive", a strong emotional contrast between the two phrases can be observed.
Thus, when modeling the linguistic structural features that produce the ironic effect, it is more reasonable to capture semantic inconsistency information between the phrase fragments of a sentence than between its words, because a phrase fragment contains far more semantic information than a single word.
Disclosure of Invention
In order to overcome the defects and shortcomings of existing ironic recognition models, the invention provides the following technical scheme: an ironic recognition model based on a convolutional neural network and an attention mechanism, comprising:
S1, a text representation layer: generating word vectors of the words through the pre-trained language model BERT;
S2, a semantic feature extraction layer: extracting semantic features by implicitly segmenting a sentence into phrase fragments with a text convolutional neural network (CNN) layer; an Attention mechanism then assists the model in capturing features relevant to the irony task, assigns different weight scores according to the importance of the features, and combines the features with their weights by weighting;
S3, an ironic semantic relation modeling layer: capturing semantic conflict information within the sentence by modeling the semantic association information among the semantic features of the sentence;
S4, an ironic intention discrimination layer: reducing the dimensionality of the ironic semantic feature through a linear layer, predicting the output of the linear layer through a softmax layer, and judging whether the text sequence contains ironic rhetoric;
the model receives a text sequence S at the text representation layer, and a word vector representation matrix E of the text sequence is obtained by adding segmentation marks, format conversion, and training;
the word vector representation matrix E serves as the input of the semantic feature extraction layer: features are extracted with the convolution operation, and the obtained single phrase-fragment features are spliced into a feature matrix M; the attention network analyzes the importance of the different semantic features of the matrix M, assigns corresponding weights to them, and finally combines the weights with the features by weighting to obtain the final semantic feature V;
the semantic feature V serves as the input of the ironic semantic relation modeling layer: the ironic semantic representation f_a of the text sequence is obtained by modeling the semantic association information, pooling, and weighted summation;
the ironic representation f_a of the text sequence is the input of the ironic discrimination layer: the linear layer reduces the dimensionality of the ironic feature representation f_a, and the softmax layer predicts the output of the linear layer, judges whether the text sequence contains ironic rhetoric, and obtains the classification result ŷ of the model.
Further, in the text representation layer: ironic texts under different contexts are trained with the pre-trained language model BERT to obtain word embedding representations containing context information;
the steps of using the pre-trained language model BERT are as follows:
first, an input text sequence S is defined, S = {s_1, s_2, ..., s_n}, where s_i represents the i-th word of the text sequence and n represents the number of words contained in the text sequence;
secondly, a "[CLS]" segmentation mark and a "[SEP]" segmentation mark are added at the beginning and the end of the text sequence respectively, converting the text sequence into the specific format that the BERT model can process;
finally, the processed text sequence is sent to the BERT model for training to obtain the word vector representation matrix E of the text sequence; the specific process is shown in Equation (3.1):
E = BERT("[CLS]" + s_1, s_2, s_3, ..., s_n + "[SEP]")    (3.1)
where E = {e_1, e_2, e_3, ..., e_n}, e_i ∈ R^k is the word vector representation of the i-th word in the text sequence, and k is the dimension of the word vector.
Further, the pre-trained language model BERT comprises a token embedding model (Token Embedding), a position embedding model (Position Embedding), and a segment embedding model (Segment Embedding).
Further, the CNN layer feature extraction step includes:
(1) inputting the two-dimensional word vector matrix of a text sequence;
(2) performing convolution operations between different convolution kernels and the word vector matrix to extract feature maps of the text sequence;
(3) processing the feature maps of different sizes with max pooling to obtain single features of fixed length;
(4) splicing all the single features into one final feature vector;
the process is as follows:
first, the two-dimensional word vector matrix E of a text sequence is input, E = {e_1, e_2, e_3, ..., e_n}, where n represents the number of words contained in the text sequence, e_i ∈ R^k is the word vector representation of the i-th word, and k is the dimension of the word vector;
secondly, the two-dimensional word vector matrix E is fed into the convolution module, and the features of the phrase fragments are extracted by the convolution operation between a convolution kernel and the part of the text sequence inside the convolution window. In the experiments, the size of the convolution kernel W is defined as h × k, where h is the size of the convolution window and determines the number of words contained in a phrase fragment; the feature extraction process of a phrase fragment can be expressed as Equation (3.2):
c_i = f(W · E_{i:i+h-1} + b)    (3.2)
where f is the convolution activation function, c_i is the phrase feature extracted by convolving the convolution kernel W with the phrase fragment formed by the i-th to (i+h-1)-th words of the word vector matrix E, and b is the bias;
thirdly, after the convolution kernel W completes the convolution operation with the entire word vector matrix E from top to bottom, the single feature map c of all the phrase fragments in the text sequence is obtained, c = (c_1, c_2, c_3, ..., c_{n-h+1});
finally, the above operation is repeated with k convolution kernels of the same size as W, and the resulting feature maps are spliced to obtain the semantic feature matrix M ∈ R^{(n-h+1)×k} of the phrase fragments, M = (m_1, m_2, m_3, ..., m_{n-h+1}), where m_i ∈ R^k is the semantic feature representation of the i-th phrase fragment in the text sequence.
Furthermore, the Attention layer reinforces the key semantic features by giving weight scores to different semantic features, and the steps are as follows:
firstly, the semantic feature m_i extracted by the feature extraction layer is input into a linear layer, and the result is processed by a tanh function to obtain the hidden-layer feature representation m'_i of the i-th phrase, where b is the bias; the specific process is shown in Equation (3.3):
m'_i = tanh(W · m_i + b)    (3.3)
secondly, the weight score s_i of the semantic feature m'_i is calculated from the similarity between m'_i and a context vector q; the weight scores of all features are then normalized with a softmax function to obtain the final weight matrix a, as shown in Equations (3.4), (3.5) and (3.6):
s_i = m'_i · q    (3.4)
a = softmax(s_1, s_2, s_3, ..., s_{n-h+1})    (3.5)
a = (a_1, a_2, a_3, ..., a_{n-h+1})    (3.6)
finally, each weight score a_i is multiplied by the corresponding semantic feature m_i to obtain the final semantic feature V, as shown in Equations (3.7) and (3.8):
v_i = a_i * m_i    (3.7)
V = (v_1, v_2, v_3, ..., v_{n-h+1}),  V ∈ R^{(n-h+1)×k}    (3.8)
has the advantages that: the method can identify potential irony expressions in social media and excavate real emotional tendency of a user, makes up the defects of the traditional sequence model, realizes the modeling of emotional semantic relations between sequences in sentences, and simultaneously compared with the existing irony identification method based on word pair contradiction, the model can more easily capture semantic contradiction information generated by the effect of inverse mock, improves the accuracy of irony identification tasks, and implicitly divides sentences into phrase segments through convolution operation to extract higher-level semantic features; semantic inconsistency information between sentence phrase fragments can be captured more accurately.
Drawings
FIG. 1 is a structure diagram of the irony recognition model;
FIG. 2 is a block diagram of the BERT model;
FIG. 3 is an input structure diagram of the BERT model;
FIG. 4 is a diagram of a textual convolutional neural network architecture;
FIG. 5 is a diagram of semantic feature extraction architecture;
FIG. 6 is a diagram of an attention layer model architecture;
FIG. 7 is a structure diagram of the ironic semantic modeling module.
Detailed Description
The invention provides an ironic recognition model based on a convolutional neural network and an attention mechanism. In the model of the invention, semantic information is not extracted with the word as the minimum unit; instead, the sentence is implicitly segmented into phrase fragments through the convolution operation so as to extract higher-level semantic features. After feature extraction, an attention model further processes all the features: semantic information related to the current irony recognition task is strengthened and assigned a higher weight, while other irrelevant information is weakened and assigned a lower weight. Then, according to the internal linguistic structure of ironic rhetoric, the semantic association information among the features is modeled so that the model captures the emotional inconsistency information embedded between the internal sequences of the sentence and forms a sentence-level ironic representation. Finally, the ironic representation of the text sequence is sent to a classifier for ironic discrimination.
1. Overall framework of the ironic recognition model
1.1 problem definition
In the field of natural language processing, irony recognition is generally defined as a binary classification task whose purpose is to identify whether a given text contains irony. Typically, given a text sequence S and a category set Y = {0, 1}, the model should correctly predict the input text S as either the ironic category (y = 1) or the non-ironic category (y = 0). For example, "I love it when my son rolls his eyes at me" is predicted as ironic by the model, while "I love it when my son sends me a gift" is predicted as non-ironic.
1.2 overall framework
In order to identify potential irony expressions in social media and mine the real emotional tendencies of users, the invention proposes an irony recognition model based on a convolutional neural network and attention. The model makes up for the deficiencies of conventional sequence models and enables modeling of the emotional-semantic relationships between sequences within a sentence; at the same time, compared with existing irony identification methods based on word-pair contradiction, it more easily captures the semantic contradictions created by the ironic effect.
The overall framework of the model is shown in FIG. 1; it consists of four parts: the first part is the text representation layer; the second part is the semantic feature extraction layer; the third part is the ironic semantic relation modeling layer; and the fourth part is the ironic discrimination layer.
In the whole model, first, the pre-trained language model BERT is used to train the ironic texts under different contexts to obtain word embedding representations containing context information; secondly, the text vector matrix obtained from BERT training is fed sequentially into the convolutional neural network and the attention network, semantic features with richer emotional information are extracted with the convolution and attention operations, and corresponding weights are assigned to the features; then, according to the linguistic structure of irony, the semantic association information among the different semantic features is modeled to extract the semantic contradiction information that creates the ironic effect and form a sentence-level ironic text representation; finally, the feature representation of the ironic text is sent to a classifier for ironic discrimination.
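To make this data flow concrete, the following is a minimal PyTorch sketch that pushes a random word-vector matrix through the four layers and tracks the tensor shapes. The sentence length, the window size, the activation choice, and the scalar-per-pair reading of the parameter matrix M_{i,j} are illustrative assumptions, not the exact implementation of the model.

```python
import torch
import torch.nn.functional as F

# Hypothetical sizes: a sentence of n = 12 tokens, word-vector dimension k = 768,
# convolution window h = 3, giving n - h + 1 = 10 phrase fragments.
n, k, h = 12, 768, 3

E = torch.randn(n, k)                          # word-vector matrix from the text representation layer

# Semantic feature extraction layer: CNN over the word dimension (Eq. 3.2)
conv = torch.nn.Conv1d(in_channels=k, out_channels=k, kernel_size=h)
M = conv(E.t().unsqueeze(0)).squeeze(0).t()    # (n-h+1, k) phrase-fragment features m_i

# Attention layer: weight each phrase fragment (Eqs. 3.3-3.8)
q = torch.randn(k)                             # context vector q
m_prime = torch.tanh(torch.nn.Linear(k, k)(M))
a = F.softmax(m_prime @ q, dim=0)              # weight scores a_i
V = a.unsqueeze(1) * M                         # (n-h+1, k) weighted semantic features v_i

# Ironic semantic relation modeling layer (Eqs. 3.9-3.13):
# pairwise scores, zeroed diagonal, row-max pooling, weighted sum.
pair_param = torch.randn(n - h + 1, n - h + 1) # one scalar per phrase pair (one reading of M_{i,j})
W = torch.tanh(pair_param * (V @ V.t()))       # score matrix of w_{i,j}
W = W - torch.diag(torch.diagonal(W))          # no conflict of a phrase with itself
att = F.softmax(W.max(dim=1).values, dim=0)    # row-max pooling + softmax
f_a = (att.unsqueeze(1) * V).sum(dim=0)        # (k,) ironic representation f_a

# Ironic discrimination layer: linear + softmax (Eq. 3.14)
y_hat = F.softmax(torch.nn.Linear(k, 2)(f_a), dim=-1)
print(y_hat)                                   # probabilities for the non-ironic / ironic classes
```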
1.2.1 text representation layer
A machine cannot understand natural language the way a human does, so before a computer can process the input data, the data must first be converted into a machine-readable form. The simplest way to do this at present is to map words into a vector space, converting them into word vectors that contain semantic information.
However, because of the complexity and diversity of language, the same word may have many different meanings. For example, the word "bother" in the ironic sentence "I love to be bothered" means "to disturb", while "bother" in the non-ironic sentence "I don't know why you bother with that" means "to waste one's breath". A word vector model must therefore be able to represent the meaning of a word accurately according to its context during training. More importantly, the semantics of the words in the irony recognition task are crucial to whether the ironic intent can be correctly recognized: in the ironic example above, if the word vector of "bother" does not accurately express the sense of "disturbing", it will be difficult for the model to determine that the sentence is ironic. Therefore, in the text representation layer of the model, in order to accurately characterize the semantic information of words in different contexts and improve the accuracy of the irony recognition task, the pre-trained language model BERT is used to generate the word vectors.
BERT is a deep contextualized word vector representation method. It pre-trains on massive data with a masked language model (MLM) objective and a stack of deep bidirectional Transformer components, so that the model learns the context of a word from both the left and the right. This greatly improves the representational capacity of the word vectors and finally produces text representations that fully reflect the complex semantics and context of the words. The main structure of BERT is shown in FIG. 2:
Here E_1, E_2, ..., E_n are the inputs of the model, each composed of a Token Embedding, a Position Embedding and a Segment Embedding: the Token Embedding represents the sub-word, which is the minimum unit; the Position Embedding represents the position of the word in the text sequence; and the Segment Embedding is used to distinguish two sentences. Their detailed structure is shown in FIG. 3. Trm denotes the encoder part of the Transformer, which consists of a multi-head attention mechanism and a fully connected layer and converts the input data into feature vectors. T_1, T_2, ..., T_n are the outputs of the model, i.e., the outputs of the last Transformer encoder corresponding to each input.
The BERT model used in the experiments is the pre-trained English BERT-BASE released by Google, with 12 hidden layers, a hidden vector size of 768, and 12 attention heads. The specific steps of training with this model are as follows:
first, an input text sequence S is defined, S = {s_1, s_2, ..., s_n}, where s_i represents the i-th word of the text sequence and n represents the number of words contained in the text sequence;
secondly, a "[CLS]" segmentation mark and a "[SEP]" segmentation mark are added at the beginning and the end of the text sequence respectively, converting the text sequence into the specific format that the BERT model can process;
finally, the processed text sequence is sent to the BERT model for training to obtain the word vector representation matrix E of the text sequence; the specific process is shown in Equation (3.1):
E = BERT("[CLS]" + s_1, s_2, s_3, ..., s_n + "[SEP]")    (3.1)
where E = {e_1, e_2, e_3, ..., e_n}, e_i ∈ R^k is the word vector representation of the i-th word in the text sequence, and k is the dimension of the word vector.
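As a concrete illustration of these three steps, the following is a minimal sketch that obtains the word-vector matrix E with the Hugging Face transformers library; the library and the bert-base-uncased checkpoint name are assumptions, since the text only specifies Google's English BERT-BASE (12 layers, hidden size 768, 12 heads).

```python
import torch
from transformers import BertTokenizer, BertModel

# Assumed checkpoint standing in for the English BERT-BASE described in the text.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")

sentence = "I love it when my son rolls his eyes at me"
# The tokenizer adds the "[CLS]" and "[SEP]" segmentation marks automatically,
# matching the input format of Equation (3.1).
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = bert(**inputs)

# E: contextual word-vector matrix of the text sequence, shape (1, sequence length, 768),
# i.e. each e_i ∈ R^k with k = 768.
E = outputs.last_hidden_state
print(E.shape)
```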
1.2.2 semantic feature extraction layer
The relatively advanced ironic recognition models currently recognize the ironic intent of a text sequence by capturing semantic conflict information between the words within a sentence. However, it has been found that when semantic conflicts between sentence features are modeled with this method, the conflict between a word and other features sometimes cannot be captured because the semantic information contained in a single word is too little. Therefore, to compensate for this problem, a semantic feature representation method based on phrase fragments is proposed: a convolutional neural network implicitly divides the sentence into phrase fragments to extract higher-level semantic features, an attention mechanism then assists the model in capturing the features most relevant to the irony task, different weight scores are assigned to the features according to their importance, and the features and weights are recombined by weighting.
1.2.2.1 CNN layer
To solve the problem that semantic information extracted with the word as the input unit cannot comprehensively summarize the text content, a semantic feature representation method based on phrase fragments is proposed. Because a phrase fragment amounts to local information of the whole text sequence, a text convolutional neural network is adopted to extract these local features.
The text convolutional neural network is a commonly used local feature extractor in the field of natural language processing; it generally comprises four modules, namely an input module, a convolution module, a pooling module and a fully connected module.
The specific way of feature extraction is as follows:
(1) inputting the two-dimensional word vector matrix of a text sequence;
(2) performing convolution operations between different convolution kernels and the word vector matrix to extract feature maps of the text sequence;
(3) processing the feature maps of different sizes with max pooling to obtain single features of fixed length;
(4) splicing all the single features into one final feature vector. The detailed process of text convolution is shown in FIG. 4:
In the feature extraction process for phrase fragments, the convolution kernel moves continuously over the whole text sequence, implicitly dividing it into several phrase fragments whose length is determined by the size of the convolution window; the phrase features are extracted by the convolution operation between the kernel and the part of the text sequence inside the window. The specific process is as follows:
First, the two-dimensional word vector matrix E of a text sequence is input, E = {e_1, e_2, e_3, ..., e_n}, where n represents the number of words contained in the text sequence, e_i ∈ R^k is the word vector representation of the i-th word, and k is the dimension of the word vector.
Secondly, the two-dimensional word vector matrix E is fed into the convolution module, and the features of the phrase fragments are extracted by the convolution operation between a convolution kernel and the part of the text sequence inside the convolution window. In the experiments, the size of the convolution kernel W is defined as h × k, where h is the size of the convolution window and determines the number of words contained in a phrase fragment; the feature extraction process of a phrase fragment can be expressed as Equation (3.2):
c_i = f(W · E_{i:i+h-1} + b)    (3.2)
where f is the convolution activation function, c_i is the phrase feature extracted by convolving the convolution kernel W with the phrase fragment formed by the i-th to (i+h-1)-th words of the word vector matrix E, and b is the bias.
Thirdly, after the convolution kernel W completes the convolution operation with the entire word vector matrix E from top to bottom, the single feature map c of all the phrase fragments in the text sequence is obtained, c = (c_1, c_2, c_3, ..., c_{n-h+1}).
Finally, the above operation is repeated with k convolution kernels of the same size as W, and the resulting feature maps are spliced to obtain the semantic feature matrix M ∈ R^{(n-h+1)×k} of the phrase fragments, M = (m_1, m_2, m_3, ..., m_{n-h+1}), where m_i ∈ R^k is the semantic feature representation of the i-th phrase fragment in the text sequence.
The specific process of feature extraction is shown in fig. 5 below.
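The following is a minimal PyTorch sketch of this phrase-fragment feature extraction, i.e. Equation (3.2) applied with k kernels of window size h so that the feature matrix M keeps the word-vector dimensionality; the choice of ReLU for the convolution function f and the example sizes are assumptions, and the max-pooling step of the generic TextCNN recipe is omitted because the attention layer consumes the full matrix M.

```python
import torch
import torch.nn as nn

class PhraseCNN(nn.Module):
    """Extracts phrase-fragment features M = (m_1, ..., m_{n-h+1}) as in Eq. (3.2)."""

    def __init__(self, k: int = 768, h: int = 3):
        super().__init__()
        # k convolution kernels of size h x k, slid over the word dimension.
        self.conv = nn.Conv1d(in_channels=k, out_channels=k, kernel_size=h)

    def forward(self, E: torch.Tensor) -> torch.Tensor:
        # E: (batch, n, k) word-vector matrix of the text sequence.
        x = E.transpose(1, 2)            # (batch, k, n), the layout Conv1d expects
        c = torch.relu(self.conv(x))     # c_i = f(W · E_{i:i+h-1} + b), with f = ReLU here
        return c.transpose(1, 2)         # M: (batch, n-h+1, k) phrase-fragment features

# Usage with the BERT output of the text representation layer:
E = torch.randn(1, 14, 768)              # hypothetical batch of one 14-token sequence
M = PhraseCNN()(E)
print(M.shape)                           # torch.Size([1, 12, 768])
```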
1.2.2.2 Attention layer
After the convolutional neural network, the model has extracted phrase-fragment features with richer semantic information. To highlight the feature information most relevant to the current irony task among all semantic features, an attention model is added after the convolutional neural network; it analyzes the importance of the different semantic features, assigns corresponding weights to them, and finally recombines the weights with the features by weighting.
In this work, inspired by the hierarchical attention mechanism proposed by Yang et al., a method is designed to analyze the importance of each phrase feature in the text sequence; key semantic features are strengthened by assigning weight scores to the different semantic features. The specific steps are as follows:
Firstly, the semantic feature m_i extracted by the feature extraction layer is input into a linear layer, and the result is processed by a tanh function to obtain the hidden-layer feature representation m'_i of the i-th phrase, where b is the bias; the specific process is shown in Equation (3.3):
m'_i = tanh(W · m_i + b)    (3.3)
Secondly, the weight score s_i of the semantic feature m'_i is calculated from the similarity between m'_i and a context vector q; the weight scores of all features are then normalized with a softmax function to obtain the final weight matrix a, as shown in Equations (3.4), (3.5) and (3.6):
s_i = m'_i · q    (3.4)
a = softmax(s_1, s_2, s_3, ..., s_{n-h+1})    (3.5)
a = (a_1, a_2, a_3, ..., a_{n-h+1})    (3.6)
Finally, each weight score a_i is multiplied by the corresponding semantic feature m_i to obtain the final semantic feature V, as shown in Equations (3.7) and (3.8):
v_i = a_i * m_i    (3.7)
V = (v_1, v_2, v_3, ..., v_{n-h+1}),  V ∈ R^{(n-h+1)×k}    (3.8)
the calculation process of the entire attention layer is shown in fig. 6 below.
1.2.3 ironic relationship modeling layer
Semantic contradiction or emotional inconsistency between the internal features of a sentence is an important cause of the ironic effect. In the present study, based on the linguistic structure of ironic expression, an irony modeling method based on an intra-sentence attention mechanism is proposed; it captures the semantic conflict information inside a sentence by modeling the semantic association between the sentence's internal semantic features, thereby recognizing the ironic effect of the text.
The method mainly comprises the following steps:
1. The semantic association information among all phrase fragments in the text sequence is modeled with an intra-sentence attention mechanism, and the attention scores between all phrase features are calculated to obtain a two-dimensional score matrix; the size of a score represents the degree of semantic conflict between the corresponding phrase features;
2. Row max pooling is used to find the largest attention score for the current phrase feature; the larger the score, the more likely it is that the phrase feature is in semantic conflict with some other phrase feature;
3. All phrase features are weighted and summed with their corresponding maximum row conflict scores to obtain the sentence-level vector representation of the text sequence. The specific design steps are as follows:
First, for the phrase features v_i and v_j, the attention score w_{i,j} between them is calculated as shown in Equation (3.9):
w_{i,j} = tanh(v_i · M_{i,j} · v_j^T)    (3.9)
where w_{i,j} represents the degree of semantic conflict between the semantic features v_i and v_j, and M_{i,j} ∈ R^{(n-h+1)×(n-h+1)} is a parameter matrix. This calculation is inspired by the method of Xiong et al. for computing semantic conflict information between words.
Secondly, calculating semantic conflict scores among all phrase features in the text sequence to obtain a score matrix W as shown in formula (3.10):
W = [ w_{i,j} ] ∈ R^{(n-h+1)×(n-h+1)}    (3.10)
where, when i = j, the corresponding score w_{i,j} is set to 0: a phrase feature has no semantic conflict with itself, and setting these entries to 0 avoids their influence on the overall semantic conflict information.
Thirdly, row max pooling is applied to the score matrix W, and the pooled scores are normalized with a softmax function to obtain the feature score vector a, as shown in Equations (3.11) and (3.12):
a_i = softmax(max_j(w_{i,j}))    (3.11)
a = (a_1, a_2, ..., a_{n-h+1})    (3.12)
after performing a row-wise pooling operation on the score matrix W, the model may be aided in finding the value of the current phrase at which the verbal conflict is greatest, with a greater value indicating a greater likelihood of ironic conflict in the text sequence.
Finally, the attention scores a_i are weighted and summed with the corresponding phrase features to obtain the ironic semantic representation f_a of the text sequence, f_a ∈ R^k, as shown in Equation (3.13):
f_a = Σ_{i=1}^{n-h+1} a_i * v_i    (3.13)
the calculation of the semantic conflict score for the entire irony relationship modeling module is shown in figure 7.
1.2.4 ironic discrimination module
The ironic discrimination layer consists of a linear layer and a softmax classification layer. The linear layer reduces the dimensionality of the ironic feature representation f_a, and the softmax layer predicts the output of the linear layer to determine whether the text sequence contains an ironic expression, as shown in Equation (3.14):
ŷ = softmax(f_a · W + b)    (3.14)
where ŷ is the classification result of the model, and W ∈ R^{k×2}, b ∈ R^2 are parameters of the linear layer that are learned continuously as the model is trained.
The model is optimized with a cross-entropy loss function during training; the loss function is defined in Equation (3.15):
J = -(1/N) Σ_{i=1}^{N} [ y_i log ŷ_i + (1 - y_i) log(1 - ŷ_i) ] + λR    (3.15)
where J is the cost function, ŷ_i is the model's prediction for sample i, y_i is the true label of sample i, N is the size of the training data, R is the standard L2 regularization term, and λ is the weight of R.
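A minimal PyTorch sketch of the discrimination layer and of the training objective of Equations (3.14) and (3.15) is given below; the batch size, the regularization weight λ, and the use of logits with F.cross_entropy are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IronyClassifier(nn.Module):
    """Ironic discrimination layer: linear projection of f_a, Eq. (3.14)."""

    def __init__(self, k: int = 768):
        super().__init__()
        self.linear = nn.Linear(k, 2)      # W ∈ R^{k x 2}, b ∈ R^2

    def forward(self, f_a: torch.Tensor) -> torch.Tensor:
        return self.linear(f_a)            # logits; softmax is applied below

model = IronyClassifier()
f_a = torch.randn(8, 768)                  # a batch of N = 8 ironic representations f_a
y = torch.randint(0, 2, (8,))              # true labels y_i ∈ {0, 1}

logits = model(f_a)
y_hat = F.softmax(logits, dim=-1)          # Eq. (3.14): classification result of the model
ce = F.cross_entropy(logits, y)            # cross-entropy part of J
l2 = sum((p ** 2).sum() for p in model.parameters())   # standard L2 regularization term R
loss = ce + 1e-4 * l2                      # J = cross-entropy + lambda * R (lambda = 1e-4 here)
loss.backward()
print(float(loss))
```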
1.3 summary
The ironic recognition model based on the convolutional neural network and the attention mechanism provided by the invention is described in detail in this chapter. The introductory portion presents some deficiencies of existing ironic recognition models and the improvements made by the model herein; next, the overall structure of the model is briefly summarized; finally, each part of the model is detailed, including the BERT-based text representation layer, the CNN- and Attention-based semantic feature extraction layer, the ironic semantic relation modeling layer, and the ironic intent discrimination layer.

Claims (5)

1. An ironic recognition model based on a convolutional neural network and attention mechanism, comprising:
S1, a text representation layer: generating word vectors of the words through the pre-trained language model BERT;
S2, a semantic feature extraction layer: extracting semantic features by implicitly segmenting a sentence into phrase fragments with a text convolutional neural network (CNN) layer; an Attention mechanism then assists the model in capturing features relevant to the irony task, assigns different weight scores according to the importance of the features, and combines the features with their weights by weighting;
S3, an ironic semantic relation modeling layer: capturing semantic conflict information within the sentence by modeling the semantic association information among the semantic features of the sentence;
S4, an ironic intention discrimination layer: reducing the dimensionality of the ironic semantic feature through a linear layer, predicting the output of the linear layer through a softmax layer, and judging whether the text sequence contains ironic rhetoric;
the model receives a text sequence S at the text representation layer, and a word vector representation matrix E of the text sequence is obtained by adding segmentation marks, format conversion, and training;
the word vector representation matrix E serves as the input of the semantic feature extraction layer: features are extracted with the convolution operation, and the obtained single phrase-fragment features are spliced into a feature matrix M; the attention network analyzes the importance of the different semantic features of the matrix M, assigns corresponding weights to them, and finally combines the weights with the features by weighting to obtain the final semantic feature V;
the semantic feature V serves as the input of the ironic semantic relation modeling layer: the ironic semantic representation f_a of the text sequence is obtained by modeling the semantic association information, pooling, and weighted summation;
the ironic representation f_a of the text sequence is the input of the ironic discrimination layer: the linear layer reduces the dimensionality of the ironic feature representation f_a, and the softmax layer predicts the output of the linear layer, judges whether the text sequence contains ironic rhetoric, and obtains the classification result ŷ of the model.
2. The ironic recognition model based on a convolutional neural network and an attention mechanism of claim 1, wherein, in the text representation layer, ironic texts under different contexts are trained with the pre-trained language model BERT to obtain word embedding representations containing context information;
the steps of using the pre-trained language model BERT are as follows:
first, an input text sequence S is defined, S = {s_1, s_2, ..., s_n}, where s_i represents the i-th word of the text sequence and n represents the number of words contained in the text sequence;
secondly, a "[CLS]" segmentation mark and a "[SEP]" segmentation mark are added at the beginning and the end of the text sequence respectively, converting the text sequence into the specific format that the BERT model can process;
finally, the processed text sequence is sent to the BERT model for training to obtain the word vector representation matrix E of the text sequence; the specific process is shown in Equation (3.1):
E = BERT("[CLS]" + s_1, s_2, s_3, ..., s_n + "[SEP]")    (3.1)
where E = {e_1, e_2, e_3, ..., e_n}, e_i ∈ R^k is the word vector representation of the i-th word in the text sequence, and k is the dimension of the word vector.
3. The ironic recognition model based on a convolutional neural network and an attention mechanism of claim 2, wherein the pre-trained language model BERT comprises a token embedding model (Token Embedding), a position embedding model (Position Embedding), and a segment embedding model (Segment Embedding).
4. The ironic recognition model based on convolutional neural network and attention mechanism of claim 1, wherein the CNN layer feature extraction step comprises:
(1) inputting the two-dimensional word vector matrix of a text sequence;
(2) performing convolution operations between different convolution kernels and the word vector matrix to extract feature maps of the text sequence;
(3) processing the feature maps of different sizes with max pooling to obtain single features of fixed length;
(4) splicing all the single features into one final feature vector;
the process is as follows:
first, the two-dimensional word vector matrix E of a text sequence is input, E = {e_1, e_2, e_3, ..., e_n}, where n represents the number of words contained in the text sequence, e_i ∈ R^k is the word vector representation of the i-th word, and k is the dimension of the word vector;
secondly, the two-dimensional word vector matrix E is fed into the convolution module, and the features of the phrase fragments are extracted by the convolution operation between a convolution kernel and the part of the text sequence inside the convolution window. In the experiments, the size of the convolution kernel W is defined as h × k, where h is the size of the convolution window and determines the number of words contained in a phrase fragment; the feature extraction process of a phrase fragment can be expressed as Equation (3.2):
c_i = f(W · E_{i:i+h-1} + b)    (3.2)
where f is the convolution activation function, c_i is the phrase feature extracted by convolving the convolution kernel W with the phrase fragment formed by the i-th to (i+h-1)-th words of the word vector matrix E, and b is the bias;
thirdly, after the convolution kernel W completes the convolution operation with the entire word vector matrix E from top to bottom, the single feature map c of all the phrase fragments in the text sequence is obtained, c = (c_1, c_2, c_3, ..., c_{n-h+1});
finally, the above operation is repeated with k convolution kernels of the same size as W, and the resulting feature maps are spliced to obtain the semantic feature matrix M ∈ R^{(n-h+1)×k} of the phrase fragments, M = (m_1, m_2, m_3, ..., m_{n-h+1}), where m_i ∈ R^k is the semantic feature representation of the i-th phrase fragment in the text sequence.
5. The ironic recognition model based on convolutional neural network and Attention mechanism of claim 1, wherein the Attention layer enhances key semantic features by assigning weight scores to different semantic features, and comprises the following steps:
firstly, the semantic feature m_i extracted by the feature extraction layer is input into a linear layer, and the result is processed by a tanh function to obtain the hidden-layer feature representation m'_i of the i-th phrase, where b is the bias; the specific process is shown in Equation (3.3):
m'_i = tanh(W · m_i + b)    (3.3)
secondly, the weight score s_i of the semantic feature m'_i is calculated from the similarity between m'_i and a context vector q; the weight scores of all features are then normalized with a softmax function to obtain the final weight matrix a, as shown in Equations (3.4), (3.5) and (3.6):
s_i = m'_i · q    (3.4)
a = softmax(s_1, s_2, s_3, ..., s_{n-h+1})    (3.5)
a = (a_1, a_2, a_3, ..., a_{n-h+1})    (3.6)
finally, each weight score a_i is multiplied by the corresponding semantic feature m_i to obtain the final semantic feature V, as shown in Equations (3.7) and (3.8):
v_i = a_i * m_i    (3.7)
V = (v_1, v_2, v_3, ..., v_{n-h+1}),  V ∈ R^{(n-h+1)×k}    (3.8).
CN202210108214.5A 2022-01-28 2022-01-28 Ironic recognition model based on convolutional neural network and attention system Pending CN114722798A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210108214.5A CN114722798A (en) 2022-01-28 2022-01-28 Ironic recognition model based on convolutional neural network and attention system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210108214.5A CN114722798A (en) 2022-01-28 2022-01-28 Ironic recognition model based on convolutional neural network and attention system

Publications (1)

Publication Number Publication Date
CN114722798A true CN114722798A (en) 2022-07-08

Family

ID=82236290

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210108214.5A Pending CN114722798A (en) 2022-01-28 2022-01-28 Ironic recognition model based on convolutional neural network and attention system

Country Status (1)

Country Link
CN (1) CN114722798A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116882415A (en) * 2023-09-07 2023-10-13 湖南中周至尚信息技术有限公司 Text emotion analysis method and system based on natural language processing
CN116882415B (en) * 2023-09-07 2023-11-24 湖南中周至尚信息技术有限公司 Text emotion analysis method and system based on natural language processing

Similar Documents

Publication Publication Date Title
CN110609891B (en) Visual dialog generation method based on context awareness graph neural network
CN111581961B (en) Automatic description method for image content constructed by Chinese visual vocabulary
CN109472024B (en) Text classification method based on bidirectional circulation attention neural network
CN110298037B (en) Convolutional neural network matching text recognition method based on enhanced attention mechanism
Wang et al. Application of convolutional neural network in natural language processing
CN111488739A (en) Implicit discourse relation identification method based on multi-granularity generated image enhancement representation
CN111930942B (en) Text classification method, language model training method, device and equipment
CN110647612A (en) Visual conversation generation method based on double-visual attention network
CN113065577A (en) Multi-modal emotion classification method for targets
CN112115238A (en) Question-answering method and system based on BERT and knowledge base
CN112487822A (en) Cross-modal retrieval method based on deep learning
CN114239585A (en) Biomedical nested named entity recognition method
CN113657115A (en) Multi-modal Mongolian emotion analysis method based on ironic recognition and fine-grained feature fusion
CN114547299A (en) Short text sentiment classification method and device based on composite network model
CN115130591A (en) Cross supervision-based multi-mode data classification method and device
CN116610778A (en) Bidirectional image-text matching method based on cross-modal global and local attention mechanism
CN114861082A (en) Multi-dimensional semantic representation-based aggressive comment detection method
CN114756678A (en) Unknown intention text identification method and device
CN114722798A (en) Ironic recognition model based on convolutional neural network and attention system
CN113157918A (en) Commodity name short text classification method and system based on attention mechanism
Yuan A Classroom Emotion Recognition Model Based on a Convolutional Neural Network Speech Emotion Algorithm
CN116958677A (en) Internet short video classification method based on multi-mode big data
Vijayaraju Image retrieval using image captioning
CN116187349A (en) Visual question-answering method based on scene graph relation information enhancement
CN115510230A (en) Mongolian emotion analysis method based on multi-dimensional feature fusion and comparative reinforcement learning mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination