CN114722798A - Ironic recognition model based on convolutional neural network and attention mechanism - Google Patents

Ironic recognition model based on convolutional neural network and attention mechanism Download PDF

Info

Publication number
CN114722798A
CN114722798A CN202210108214.5A CN202210108214A CN114722798A CN 114722798 A CN114722798 A CN 114722798A CN 202210108214 A CN202210108214 A CN 202210108214A CN 114722798 A CN114722798 A CN 114722798A
Authority
CN
China
Prior art keywords
semantic
ironic
layer
text sequence
features
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210108214.5A
Other languages
Chinese (zh)
Inventor
孟佳娜
朱彦霖
刘爽
孙世昶
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Dalian Minzu University
Original Assignee
Dalian Minzu University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Dalian Minzu University filed Critical Dalian Minzu University
Priority to CN202210108214.5A priority Critical patent/CN114722798A/en
Publication of CN114722798A publication Critical patent/CN114722798A/en
Pending legal-status Critical Current

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9536Search customisation based on social or collaborative filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis

Abstract

The invention belongs to the field of natural language processing, and relates to an ironic recognition model based on a convolutional neural network and an attention mechanism. The model comprises the following steps: S1, a text representation layer; S2, a semantic feature extraction layer; S3, an ironic semantic relation modeling layer; and S4, an ironic intention discrimination layer. Advantageous effects: the model can identify potential irony expressions in social media and mine the real emotional tendency of a user; it makes up for the defects of traditional sequence models and realizes the modeling of emotional-semantic relations between sequences within a sentence; compared with existing irony identification methods based on word-pair contradiction, the model more easily captures the semantic contradiction information that produces the ironic effect and improves the accuracy of the irony identification task; and by implicitly dividing sentences into phrase fragments through the convolution operation it extracts higher-level semantic features, so that semantic inconsistency information between the phrase fragments of a sentence can be captured more accurately.

Description

Ironic recognition model based on convolutional neural network and attention mechanism
Technical Field
The invention belongs to the field of natural language processing, and relates to an ironic recognition model based on a convolutional neural network and an attention mechanism.
Background
Irony is a common figurative use of language in human social life. It is widely favored by social media users because of the strong and distinctive linguistic effect it produces, but factors such as the ambiguity of ironic expression and the gap between literal meaning and real emotion pose difficulties and challenges for text sentiment analysis; if the real emotional information contained in a text cannot be accurately identified, the accuracy of text sentiment analysis and opinion mining tasks is seriously affected.
Intra-sentence semantic conflict is an important characteristic of irony. When a traditional sequence model such as a long short-term memory network is used for ironic semantic modeling, it is difficult to accurately capture the emotional-semantic relationships embedded in a sentence, so the performance of such models has been hard to improve. In recent years, a small number of researchers have proposed improved algorithms aimed at these shortcomings of sequence models; the main idea is to capture the semantic conflict relationships of ironic text by modeling the semantic relationships between the words in a sentence. This overcomes the shortcomings of conventional sequence models and can accurately focus on the semantic association information between sequences within a sentence, greatly improving the accuracy of the irony recognition task. However, it has been found that when the semantic relationships between words are modeled with this method, it is sometimes difficult to capture semantic conflict information between words because a single word carries too little semantic information. For example, in the ironic sentence "Going in to work for 2 hours in traffic in the 50min drive", semantic conflict is hard to find at the word level; but when the phrase fragment "work for 2 hours" is compared with the phrase fragment "the 50min drive", a strong emotional contrast between the two phrases can be observed.
Thus, when modeling the linguistic structural features that produce the ironic effect, it is more reasonable to capture semantic inconsistency information between the phrase fragments of a sentence than between its words, because a phrase fragment contains far more semantic information than a single word.
Disclosure of Invention
In order to overcome the defects and shortcomings of existing ironic recognition models, the invention provides the following technical scheme: an ironic recognition model based on a convolutional neural network and an attention mechanism, comprising:
S1, a text representation layer: generating word vectors of the words through the pre-trained language model BERT;
S2, a semantic feature extraction layer: extracting semantic features by implicitly segmenting a sentence into phrase fragments with a text convolutional neural network (CNN) layer; an Attention mechanism then assists the model in capturing features relevant to the irony task, assigns different weight scores according to the importance of the features, and combines the features with their weights by weighting;
S3, an ironic semantic relation modeling layer: capturing semantic conflict information within the sentence by modeling the semantic association information among the semantic features of the sentence;
S4, an ironic intention discrimination layer: reducing the dimensionality of the ironic semantic feature through a linear layer, predicting the output of the linear layer through a softmax layer, and judging whether the text sequence contains ironic rhetoric;
the model receives a text sequence S at the text representation layer, and a word vector representation matrix E of the text sequence is obtained by adding segmentation marks, format conversion, and training;
the word vector representation matrix E serves as the input of the semantic feature extraction layer: features are extracted with the convolution operation, and the obtained single phrase-fragment features are spliced into a feature matrix M; the attention network analyzes the importance of the different semantic features of the matrix M, assigns corresponding weights to them, and finally combines the weights with the features by weighting to obtain the final semantic feature V;
the semantic feature V serves as the input of the ironic semantic relation modeling layer: the ironic semantic representation f_a of the text sequence is obtained by modeling the semantic association information, pooling, and weighted summation;
the ironic representation f_a of the text sequence is the input of the ironic discrimination layer: the linear layer reduces the dimensionality of the ironic feature representation f_a, and the softmax layer predicts the output of the linear layer, judges whether the text sequence contains ironic rhetoric, and obtains the classification result ŷ of the model.
Further, in the text representation layer: ironic texts under different contexts are trained with the pre-trained language model BERT to obtain word embedding representations containing context information;
the steps of using the pre-trained language model BERT are as follows:
first, an input text sequence S is defined, S = {s_1, s_2, ..., s_n}, where s_i represents the i-th word of the text sequence and n represents the number of words contained in the text sequence;
secondly, a "[CLS]" segmentation mark and a "[SEP]" segmentation mark are added at the beginning and the end of the text sequence respectively, converting the text sequence into the specific format that the BERT model can process;
finally, the processed text sequence is sent to the BERT model for training to obtain the word vector representation matrix E of the text sequence; the specific process is shown in Equation (3.1):
E = BERT("[CLS]" + s_1, s_2, s_3, ..., s_n + "[SEP]")    (3.1)
where E = {e_1, e_2, e_3, ..., e_n}, e_i ∈ R^k is the word vector representation of the i-th word in the text sequence, and k is the dimension of the word vector.
Further, the pre-trained language model BERT comprises a token embedding model (Token Embedding), a position embedding model (Position Embedding), and a segment embedding model (Segment Embedding).
Further, the CNN layer feature extraction step includes:
(1) inputting the two-dimensional word vector matrix of a text sequence;
(2) performing convolution operations between different convolution kernels and the word vector matrix to extract feature maps of the text sequence;
(3) processing the feature maps of different sizes with max pooling to obtain single features of fixed length;
(4) splicing all the single features into one final feature vector;
the process is as follows:
first, the two-dimensional word vector matrix E of a text sequence is input, E = {e_1, e_2, e_3, ..., e_n}, where n represents the number of words contained in the text sequence, e_i ∈ R^k is the word vector representation of the i-th word, and k is the dimension of the word vector;
secondly, the two-dimensional word vector matrix E is fed into the convolution module, and the features of the phrase fragments are extracted by the convolution operation between a convolution kernel and the part of the text sequence inside the convolution window. In the experiments, the size of the convolution kernel W is defined as h × k, where h is the size of the convolution window and determines the number of words contained in a phrase fragment; the feature extraction process of a phrase fragment can be expressed as Equation (3.2):
c_i = f(W · E_{i:i+h-1} + b)    (3.2)
where f is the convolution activation function, c_i is the phrase feature extracted by convolving the convolution kernel W with the phrase fragment formed by the i-th to (i+h-1)-th words of the word vector matrix E, and b is the bias;
thirdly, after the convolution kernel W completes the convolution operation with the entire word vector matrix E from top to bottom, the single feature map c of all the phrase fragments in the text sequence is obtained, c = (c_1, c_2, c_3, ..., c_{n-h+1});
finally, the above operation is repeated with k convolution kernels of the same size as W, and the resulting feature maps are spliced to obtain the semantic feature matrix M ∈ R^{(n-h+1)×k} of the phrase fragments, M = (m_1, m_2, m_3, ..., m_{n-h+1}), where m_i ∈ R^k is the semantic feature representation of the i-th phrase fragment in the text sequence.
Furthermore, the Attention layer reinforces the key semantic features by giving weight scores to different semantic features, and the steps are as follows:
firstly, the semantic feature m_i extracted by the feature extraction layer is input into a linear layer, and the result is processed by a tanh function to obtain the hidden-layer feature representation m'_i of the i-th phrase, where b is the bias; the specific process is shown in Equation (3.3):
m'_i = tanh(W · m_i + b)    (3.3)
secondly, the weight score s_i of the semantic feature m'_i is calculated from the similarity between m'_i and a context vector q; the weight scores of all features are then normalized with a softmax function to obtain the final weight matrix a, as shown in Equations (3.4), (3.5) and (3.6):
s_i = m'_i · q    (3.4)
a = softmax(s_1, s_2, s_3, ..., s_{n-h+1})    (3.5)
a = (a_1, a_2, a_3, ..., a_{n-h+1})    (3.6)
finally, each weight score a_i is multiplied by the corresponding semantic feature m_i to obtain the final semantic feature V, as shown in Equations (3.7) and (3.8):
v_i = a_i * m_i    (3.7)
V = (v_1, v_2, v_3, ..., v_{n-h+1}),  V ∈ R^{(n-h+1)×k}    (3.8)
has the advantages that: the method can identify potential irony expressions in social media and excavate real emotional tendency of a user, makes up the defects of the traditional sequence model, realizes the modeling of emotional semantic relations between sequences in sentences, and simultaneously compared with the existing irony identification method based on word pair contradiction, the model can more easily capture semantic contradiction information generated by the effect of inverse mock, improves the accuracy of irony identification tasks, and implicitly divides sentences into phrase segments through convolution operation to extract higher-level semantic features; semantic inconsistency information between sentence phrase fragments can be captured more accurately.
Drawings
FIG. 1 is a structure diagram of the irony recognition model;
FIG. 2 is a block diagram of the BERT model;
FIG. 3 is an input structure diagram of the BERT model;
FIG. 4 is a diagram of a textual convolutional neural network architecture;
FIG. 5 is a diagram of semantic feature extraction architecture;
FIG. 6 is a diagram of an attention layer model architecture;
FIG. 7 is a structure diagram of the ironic semantic modeling module.
Detailed Description
The invention provides an ironic recognition model based on a convolutional neural network and an attention mechanism. In the model of the invention, semantic information is not extracted with the word as the minimum unit; instead, the sentence is implicitly segmented into phrase fragments through the convolution operation so as to extract higher-level semantic features. After feature extraction, an attention model further processes all the features: semantic information related to the current irony recognition task is strengthened and assigned a higher weight, while other irrelevant information is weakened and assigned a lower weight. Then, according to the internal linguistic structure of ironic rhetoric, the semantic association information among the features is modeled so that the model captures the emotional inconsistency information embedded between the internal sequences of the sentence and forms a sentence-level ironic representation. Finally, the ironic representation of the text sequence is sent to a classifier for ironic discrimination.
1. Overall framework of the ironic recognition model
1.1 problem definition
In the field of natural language processing, irony recognition is generally defined as a binary classification task whose purpose is to identify whether a given text contains irony. Typically, given a text sequence S and a category set Y = {0, 1}, the model should correctly predict the input text S as either the ironic category (y = 1) or the non-ironic category (y = 0). For example, "I love it when my son rolls his eyes at me" is predicted as ironic by the model, while "I love it when my son sends me a gift" is predicted as non-ironic.
1.2 overall framework
In order to identify potential irony expressions in social media and mine the real emotional tendencies of users, the invention proposes an irony recognition model based on a convolutional neural network and attention. The model makes up for the deficiencies of conventional sequence models and enables modeling of the emotional-semantic relationships between sequences within a sentence; at the same time, compared with existing irony identification methods based on word-pair contradiction, it more easily captures the semantic contradictions created by the ironic effect.
The overall framework of the model is shown in FIG. 1; it consists of four parts: the first part is the text representation layer; the second part is the semantic feature extraction layer; the third part is the ironic semantic relation modeling layer; and the fourth part is the ironic discrimination layer.
In the whole model, first, the pre-trained language model BERT is used to train the ironic texts under different contexts to obtain word embedding representations containing context information; secondly, the text vector matrix obtained from BERT training is fed sequentially into the convolutional neural network and the attention network, semantic features with richer emotional information are extracted with the convolution and attention operations, and corresponding weights are assigned to the features; then, according to the linguistic structure of irony, the semantic association information among the different semantic features is modeled to extract the semantic contradiction information that creates the ironic effect and form a sentence-level ironic text representation; finally, the feature representation of the ironic text is sent to a classifier for ironic discrimination.
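To make this data flow concrete, the following is a minimal PyTorch sketch that pushes a random word-vector matrix through the four layers and tracks the tensor shapes. The sentence length, the window size, the activation choice, and the scalar-per-pair reading of the parameter matrix M_{i,j} are illustrative assumptions, not the exact implementation of the model.

```python
import torch
import torch.nn.functional as F

# Hypothetical sizes: a sentence of n = 12 tokens, word-vector dimension k = 768,
# convolution window h = 3, giving n - h + 1 = 10 phrase fragments.
n, k, h = 12, 768, 3

E = torch.randn(n, k)                          # word-vector matrix from the text representation layer

# Semantic feature extraction layer: CNN over the word dimension (Eq. 3.2)
conv = torch.nn.Conv1d(in_channels=k, out_channels=k, kernel_size=h)
M = conv(E.t().unsqueeze(0)).squeeze(0).t()    # (n-h+1, k) phrase-fragment features m_i

# Attention layer: weight each phrase fragment (Eqs. 3.3-3.8)
q = torch.randn(k)                             # context vector q
m_prime = torch.tanh(torch.nn.Linear(k, k)(M))
a = F.softmax(m_prime @ q, dim=0)              # weight scores a_i
V = a.unsqueeze(1) * M                         # (n-h+1, k) weighted semantic features v_i

# Ironic semantic relation modeling layer (Eqs. 3.9-3.13):
# pairwise scores, zeroed diagonal, row-max pooling, weighted sum.
pair_param = torch.randn(n - h + 1, n - h + 1) # one scalar per phrase pair (one reading of M_{i,j})
W = torch.tanh(pair_param * (V @ V.t()))       # score matrix of w_{i,j}
W = W - torch.diag(torch.diagonal(W))          # no conflict of a phrase with itself
att = F.softmax(W.max(dim=1).values, dim=0)    # row-max pooling + softmax
f_a = (att.unsqueeze(1) * V).sum(dim=0)        # (k,) ironic representation f_a

# Ironic discrimination layer: linear + softmax (Eq. 3.14)
y_hat = F.softmax(torch.nn.Linear(k, 2)(f_a), dim=-1)
print(y_hat)                                   # probabilities for the non-ironic / ironic classes
```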
1.2.1 text representation layer
A machine cannot understand natural language the way a human does, so before a computer can process the input data, the data must first be converted into a machine-readable form. The simplest way to do this at present is to map words into a vector space, converting them into word vectors that contain semantic information.
However, because of the complexity and diversity of language, the same word may have many different meanings. For example, the word "bother" in the ironic sentence "I love to be bothered" means "to disturb", while "bother" in the non-ironic sentence "I don't know why you bother with that" means "to waste one's breath". A word vector model must therefore be able to represent the meaning of a word accurately according to its context during training. More importantly, the semantics of the words in the irony recognition task are crucial to whether the ironic intent can be correctly recognized: in the ironic example above, if the word vector of "bother" does not accurately express the sense of "disturbing", it will be difficult for the model to determine that the sentence is ironic. Therefore, in the text representation layer of the model, in order to accurately characterize the semantic information of words in different contexts and improve the accuracy of the irony recognition task, the pre-trained language model BERT is used to generate the word vectors.
BERT is a deep contextualized word vector representation method. It pre-trains on massive data with a masked language model (MLM) objective and a stack of deep bidirectional Transformer components, so that the model learns the context of a word from both the left and the right. This greatly improves the representational capacity of the word vectors and finally produces text representations that fully reflect the complex semantics and context of the words. The main structure of BERT is shown in FIG. 2:
Here E_1, E_2, ..., E_n are the inputs of the model, each composed of a Token Embedding, a Position Embedding and a Segment Embedding: the Token Embedding represents the sub-word, which is the minimum unit; the Position Embedding represents the position of the word in the text sequence; and the Segment Embedding is used to distinguish two sentences. Their detailed structure is shown in FIG. 3. Trm denotes the encoder part of the Transformer, which consists of a multi-head attention mechanism and a fully connected layer and converts the input data into feature vectors. T_1, T_2, ..., T_n are the outputs of the model, i.e., the outputs of the last Transformer encoder corresponding to each input.
The BERT model used in the experiments is the pre-trained English BERT-BASE released by Google, with 12 hidden layers, a hidden vector size of 768, and 12 attention heads. The specific steps of training with this model are as follows:
first, an input text sequence S is defined, S = {s_1, s_2, ..., s_n}, where s_i represents the i-th word of the text sequence and n represents the number of words contained in the text sequence;
secondly, a "[CLS]" segmentation mark and a "[SEP]" segmentation mark are added at the beginning and the end of the text sequence respectively, converting the text sequence into the specific format that the BERT model can process;
finally, the processed text sequence is sent to the BERT model for training to obtain the word vector representation matrix E of the text sequence; the specific process is shown in Equation (3.1):
E = BERT("[CLS]" + s_1, s_2, s_3, ..., s_n + "[SEP]")    (3.1)
where E = {e_1, e_2, e_3, ..., e_n}, e_i ∈ R^k is the word vector representation of the i-th word in the text sequence, and k is the dimension of the word vector.
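As a concrete illustration of these three steps, the following is a minimal sketch that obtains the word-vector matrix E with the Hugging Face transformers library; the library and the bert-base-uncased checkpoint name are assumptions, since the text only specifies Google's English BERT-BASE (12 layers, hidden size 768, 12 heads).

```python
import torch
from transformers import BertTokenizer, BertModel

# Assumed checkpoint standing in for the English BERT-BASE described in the text.
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")
bert = BertModel.from_pretrained("bert-base-uncased")

sentence = "I love it when my son rolls his eyes at me"
# The tokenizer adds the "[CLS]" and "[SEP]" segmentation marks automatically,
# matching the input format of Equation (3.1).
inputs = tokenizer(sentence, return_tensors="pt")

with torch.no_grad():
    outputs = bert(**inputs)

# E: contextual word-vector matrix of the text sequence, shape (1, sequence length, 768),
# i.e. each e_i ∈ R^k with k = 768.
E = outputs.last_hidden_state
print(E.shape)
```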
1.2.2 semantic feature extraction layer
The relatively advanced ironic recognition models currently recognize the ironic intent of a text sequence by capturing semantic conflict information between the words within a sentence. However, it has been found that when semantic conflicts between sentence features are modeled with this method, the conflict between a word and other features sometimes cannot be captured because the semantic information contained in a single word is too little. Therefore, to compensate for this problem, a semantic feature representation method based on phrase fragments is proposed: a convolutional neural network implicitly divides the sentence into phrase fragments to extract higher-level semantic features, an attention mechanism then assists the model in capturing the features most relevant to the irony task, different weight scores are assigned to the features according to their importance, and the features and weights are recombined by weighting.
1.2.2.1 CNN layer
To solve the problem that semantic information extracted with the word as the input unit cannot comprehensively summarize the text content, a semantic feature representation method based on phrase fragments is proposed. Because a phrase fragment amounts to local information of the whole text sequence, a text convolutional neural network is adopted to extract these local features.
The text convolutional neural network is a commonly used local feature extractor in the field of natural language processing; it generally comprises four modules, namely an input module, a convolution module, a pooling module and a fully connected module.
The specific way of feature extraction is as follows:
(1) inputting the two-dimensional word vector matrix of a text sequence;
(2) performing convolution operations between different convolution kernels and the word vector matrix to extract feature maps of the text sequence;
(3) processing the feature maps of different sizes with max pooling to obtain single features of fixed length;
(4) splicing all the single features into one final feature vector. The detailed process of text convolution is shown in FIG. 4:
In the feature extraction process for phrase fragments, the convolution kernel moves continuously over the whole text sequence, implicitly dividing it into several phrase fragments whose length is determined by the size of the convolution window; the phrase features are extracted by the convolution operation between the kernel and the part of the text sequence inside the window. The specific process is as follows:
First, the two-dimensional word vector matrix E of a text sequence is input, E = {e_1, e_2, e_3, ..., e_n}, where n represents the number of words contained in the text sequence, e_i ∈ R^k is the word vector representation of the i-th word, and k is the dimension of the word vector.
Secondly, the two-dimensional word vector matrix E is fed into the convolution module, and the features of the phrase fragments are extracted by the convolution operation between a convolution kernel and the part of the text sequence inside the convolution window. In the experiments, the size of the convolution kernel W is defined as h × k, where h is the size of the convolution window and determines the number of words contained in a phrase fragment; the feature extraction process of a phrase fragment can be expressed as Equation (3.2):
c_i = f(W · E_{i:i+h-1} + b)    (3.2)
where f is the convolution activation function, c_i is the phrase feature extracted by convolving the convolution kernel W with the phrase fragment formed by the i-th to (i+h-1)-th words of the word vector matrix E, and b is the bias.
Thirdly, after the convolution kernel W completes the convolution operation with the entire word vector matrix E from top to bottom, the single feature map c of all the phrase fragments in the text sequence is obtained, c = (c_1, c_2, c_3, ..., c_{n-h+1}).
Finally, the above operation is repeated with k convolution kernels of the same size as W, and the resulting feature maps are spliced to obtain the semantic feature matrix M ∈ R^{(n-h+1)×k} of the phrase fragments, M = (m_1, m_2, m_3, ..., m_{n-h+1}), where m_i ∈ R^k is the semantic feature representation of the i-th phrase fragment in the text sequence.
The specific process of feature extraction is shown in fig. 5 below.
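The following is a minimal PyTorch sketch of this phrase-fragment feature extraction, i.e. Equation (3.2) applied with k kernels of window size h so that the feature matrix M keeps the word-vector dimensionality; the choice of ReLU for the convolution function f and the example sizes are assumptions, and the max-pooling step of the generic TextCNN recipe is omitted because the attention layer consumes the full matrix M.

```python
import torch
import torch.nn as nn

class PhraseCNN(nn.Module):
    """Extracts phrase-fragment features M = (m_1, ..., m_{n-h+1}) as in Eq. (3.2)."""

    def __init__(self, k: int = 768, h: int = 3):
        super().__init__()
        # k convolution kernels of size h x k, slid over the word dimension.
        self.conv = nn.Conv1d(in_channels=k, out_channels=k, kernel_size=h)

    def forward(self, E: torch.Tensor) -> torch.Tensor:
        # E: (batch, n, k) word-vector matrix of the text sequence.
        x = E.transpose(1, 2)            # (batch, k, n), the layout Conv1d expects
        c = torch.relu(self.conv(x))     # c_i = f(W · E_{i:i+h-1} + b), with f = ReLU here
        return c.transpose(1, 2)         # M: (batch, n-h+1, k) phrase-fragment features

# Usage with the BERT output of the text representation layer:
E = torch.randn(1, 14, 768)              # hypothetical batch of one 14-token sequence
M = PhraseCNN()(E)
print(M.shape)                           # torch.Size([1, 12, 768])
```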
1.2.2.2 Attention layer
After the convolutional neural network, the model has extracted phrase-fragment features with richer semantic information. To highlight the feature information most relevant to the current irony task among all semantic features, an attention model is added after the convolutional neural network; it analyzes the importance of the different semantic features, assigns corresponding weights to them, and finally recombines the weights with the features by weighting.
In this work, inspired by the hierarchical attention mechanism proposed by Yang et al., a method is designed to analyze the importance of each phrase feature in the text sequence; key semantic features are strengthened by assigning weight scores to the different semantic features. The specific steps are as follows:
Firstly, the semantic feature m_i extracted by the feature extraction layer is input into a linear layer, and the result is processed by a tanh function to obtain the hidden-layer feature representation m'_i of the i-th phrase, where b is the bias; the specific process is shown in Equation (3.3):
m'_i = tanh(W · m_i + b)    (3.3)
Secondly, the weight score s_i of the semantic feature m'_i is calculated from the similarity between m'_i and a context vector q; the weight scores of all features are then normalized with a softmax function to obtain the final weight matrix a, as shown in Equations (3.4), (3.5) and (3.6):
s_i = m'_i · q    (3.4)
a = softmax(s_1, s_2, s_3, ..., s_{n-h+1})    (3.5)
a = (a_1, a_2, a_3, ..., a_{n-h+1})    (3.6)
Finally, each weight score a_i is multiplied by the corresponding semantic feature m_i to obtain the final semantic feature V, as shown in Equations (3.7) and (3.8):
v_i = a_i * m_i    (3.7)
V = (v_1, v_2, v_3, ..., v_{n-h+1}),  V ∈ R^{(n-h+1)×k}    (3.8)
the calculation process of the entire attention layer is shown in fig. 6 below.
1.2.3 ironic relationship modeling layer
Semantic contradiction or emotional inconsistency between the internal features of a sentence is an important cause of the ironic effect. In the present study, based on the linguistic structure of ironic expression, an irony modeling method based on an intra-sentence attention mechanism is proposed; it captures the semantic conflict information inside a sentence by modeling the semantic association between the sentence's internal semantic features, thereby recognizing the ironic effect of the text.
The method mainly comprises the following steps:
1. The semantic association information among all phrase fragments in the text sequence is modeled with an intra-sentence attention mechanism, and the attention scores between all phrase features are calculated to obtain a two-dimensional score matrix; the size of a score represents the degree of semantic conflict between the corresponding phrase features;
2. Row max pooling is used to find the largest attention score for the current phrase feature; the larger the score, the more likely it is that the phrase feature is in semantic conflict with some other phrase feature;
3. All phrase features are weighted and summed with their corresponding maximum row conflict scores to obtain the sentence-level vector representation of the text sequence. The specific design steps are as follows:
First, for the phrase features v_i and v_j, the attention score w_{i,j} between them is calculated as shown in Equation (3.9):
w_{i,j} = tanh(v_i · M_{i,j} · v_j^T)    (3.9)
where w_{i,j} represents the degree of semantic conflict between the semantic features v_i and v_j, and M_{i,j} ∈ R^{(n-h+1)×(n-h+1)} is a parameter matrix. This calculation is inspired by the method of Xiong et al. for computing semantic conflict information between words.
Secondly, calculating semantic conflict scores among all phrase features in the text sequence to obtain a score matrix W as shown in formula (3.10):
W = [ w_{i,j} ] ∈ R^{(n-h+1)×(n-h+1)}    (3.10)
where, when i = j, the corresponding score w_{i,j} is set to 0: a phrase feature has no semantic conflict with itself, and setting these entries to 0 avoids their influence on the overall semantic conflict information.
Thirdly, row max pooling is applied to the score matrix W, and the pooled scores are normalized with a softmax function to obtain the feature score vector a, as shown in Equations (3.11) and (3.12):
a_i = softmax(max_j(w_{i,j}))    (3.11)
a = (a_1, a_2, ..., a_{n-h+1})    (3.12)
after performing a row-wise pooling operation on the score matrix W, the model may be aided in finding the value of the current phrase at which the verbal conflict is greatest, with a greater value indicating a greater likelihood of ironic conflict in the text sequence.
Finally, the attention scores a_i are weighted and summed with the corresponding phrase features to obtain the ironic semantic representation f_a of the text sequence, f_a ∈ R^k, as shown in Equation (3.13):
f_a = Σ_{i=1}^{n-h+1} a_i * v_i    (3.13)
the calculation of the semantic conflict score for the entire irony relationship modeling module is shown in figure 7.
1.2.4 ironic discrimination module
The ironic discrimination layer consists of a linear layer and a softmax classification layer. The linear layer reduces the dimensionality of the ironic feature representation f_a, and the softmax layer predicts the output of the linear layer to determine whether the text sequence contains an ironic expression, as shown in Equation (3.14):
ŷ = softmax(f_a · W + b)    (3.14)
where ŷ is the classification result of the model, and W ∈ R^{k×2}, b ∈ R^2 are parameters of the linear layer that are learned continuously as the model is trained.
The model is optimized with a cross-entropy loss function during training; the loss function is defined in Equation (3.15):
J = -(1/N) Σ_{i=1}^{N} [ y_i log ŷ_i + (1 - y_i) log(1 - ŷ_i) ] + λR    (3.15)
where J is the cost function, ŷ_i is the model's prediction for sample i, y_i is the true label of sample i, N is the size of the training data, R is the standard L2 regularization term, and λ is the weight of R.
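A minimal PyTorch sketch of the discrimination layer and of the training objective of Equations (3.14) and (3.15) is given below; the batch size, the regularization weight λ, and the use of logits with F.cross_entropy are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class IronyClassifier(nn.Module):
    """Ironic discrimination layer: linear projection of f_a, Eq. (3.14)."""

    def __init__(self, k: int = 768):
        super().__init__()
        self.linear = nn.Linear(k, 2)      # W ∈ R^{k x 2}, b ∈ R^2

    def forward(self, f_a: torch.Tensor) -> torch.Tensor:
        return self.linear(f_a)            # logits; softmax is applied below

model = IronyClassifier()
f_a = torch.randn(8, 768)                  # a batch of N = 8 ironic representations f_a
y = torch.randint(0, 2, (8,))              # true labels y_i ∈ {0, 1}

logits = model(f_a)
y_hat = F.softmax(logits, dim=-1)          # Eq. (3.14): classification result of the model
ce = F.cross_entropy(logits, y)            # cross-entropy part of J
l2 = sum((p ** 2).sum() for p in model.parameters())   # standard L2 regularization term R
loss = ce + 1e-4 * l2                      # J = cross-entropy + lambda * R (lambda = 1e-4 here)
loss.backward()
print(float(loss))
```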
1.3 summary
The ironic recognition model based on the convolutional neural network and the attention mechanism provided by the invention is described in detail in this chapter. The introductory portion presents some deficiencies of existing ironic recognition models and the improvements made by the model herein; next, the overall structure of the model is briefly summarized; finally, each part of the model is detailed, including the BERT-based text representation layer, the CNN- and Attention-based semantic feature extraction layer, the ironic semantic relation modeling layer, and the ironic intent discrimination layer.

Claims (5)

1. An ironic recognition model based on a convolutional neural network and attention mechanism, comprising:
S1, a text representation layer: generating word vectors of the words through the pre-trained language model BERT;
S2, a semantic feature extraction layer: extracting semantic features by implicitly segmenting a sentence into phrase fragments with a text convolutional neural network (CNN) layer; an Attention mechanism then assists the model in capturing features relevant to the irony task, assigns different weight scores according to the importance of the features, and combines the features with their weights by weighting;
S3, an ironic semantic relation modeling layer: capturing semantic conflict information within the sentence by modeling the semantic association information among the semantic features of the sentence;
S4, an ironic intention discrimination layer: reducing the dimensionality of the ironic semantic feature through a linear layer, predicting the output of the linear layer through a softmax layer, and judging whether the text sequence contains ironic rhetoric;
the model receives a text sequence S at the text representation layer, and a word vector representation matrix E of the text sequence is obtained by adding segmentation marks, format conversion, and training;
the word vector representation matrix E serves as the input of the semantic feature extraction layer: features are extracted with the convolution operation, and the obtained single phrase-fragment features are spliced into a feature matrix M; the attention network analyzes the importance of the different semantic features of the matrix M, assigns corresponding weights to them, and finally combines the weights with the features by weighting to obtain the final semantic feature V;
the semantic feature V serves as the input of the ironic semantic relation modeling layer: the ironic semantic representation f_a of the text sequence is obtained by modeling the semantic association information, pooling, and weighted summation;
the ironic representation f_a of the text sequence is the input of the ironic discrimination layer: the linear layer reduces the dimensionality of the ironic feature representation f_a, and the softmax layer predicts the output of the linear layer, judges whether the text sequence contains ironic rhetoric, and obtains the classification result ŷ of the model.
2. The ironic recognition model based on a convolutional neural network and an attention mechanism of claim 1, wherein, in the text representation layer, ironic texts under different contexts are trained with the pre-trained language model BERT to obtain word embedding representations containing context information;
the steps of using the pre-trained language model BERT are as follows:
first, an input text sequence S is defined, S = {s_1, s_2, ..., s_n}, where s_i represents the i-th word of the text sequence and n represents the number of words contained in the text sequence;
secondly, a "[CLS]" segmentation mark and a "[SEP]" segmentation mark are added at the beginning and the end of the text sequence respectively, converting the text sequence into the specific format that the BERT model can process;
finally, the processed text sequence is sent to the BERT model for training to obtain the word vector representation matrix E of the text sequence; the specific process is shown in Equation (3.1):
E = BERT("[CLS]" + s_1, s_2, s_3, ..., s_n + "[SEP]")    (3.1)
where E = {e_1, e_2, e_3, ..., e_n}, e_i ∈ R^k is the word vector representation of the i-th word in the text sequence, and k is the dimension of the word vector.
3. The ironic recognition model based on a convolutional neural network and an attention mechanism of claim 2, wherein the pre-trained language model BERT comprises a token embedding model (Token Embedding), a position embedding model (Position Embedding), and a segment embedding model (Segment Embedding).
4. The ironic recognition model based on convolutional neural network and attention mechanism of claim 1, wherein the CNN layer feature extraction step comprises:
(1) inputting the two-dimensional word vector matrix of a text sequence;
(2) performing convolution operations between different convolution kernels and the word vector matrix to extract feature maps of the text sequence;
(3) processing the feature maps of different sizes with max pooling to obtain single features of fixed length;
(4) splicing all the single features into one final feature vector;
the process is as follows:
first, the two-dimensional word vector matrix E of a text sequence is input, E = {e_1, e_2, e_3, ..., e_n}, where n represents the number of words contained in the text sequence, e_i ∈ R^k is the word vector representation of the i-th word, and k is the dimension of the word vector;
secondly, the two-dimensional word vector matrix E is fed into the convolution module, and the features of the phrase fragments are extracted by the convolution operation between a convolution kernel and the part of the text sequence inside the convolution window. In the experiments, the size of the convolution kernel W is defined as h × k, where h is the size of the convolution window and determines the number of words contained in a phrase fragment; the feature extraction process of a phrase fragment can be expressed as Equation (3.2):
c_i = f(W · E_{i:i+h-1} + b)    (3.2)
where f is the convolution activation function, c_i is the phrase feature extracted by convolving the convolution kernel W with the phrase fragment formed by the i-th to (i+h-1)-th words of the word vector matrix E, and b is the bias;
thirdly, after the convolution kernel W completes the convolution operation with the entire word vector matrix E from top to bottom, the single feature map c of all the phrase fragments in the text sequence is obtained, c = (c_1, c_2, c_3, ..., c_{n-h+1});
finally, the above operation is repeated with k convolution kernels of the same size as W, and the resulting feature maps are spliced to obtain the semantic feature matrix M ∈ R^{(n-h+1)×k} of the phrase fragments, M = (m_1, m_2, m_3, ..., m_{n-h+1}), where m_i ∈ R^k is the semantic feature representation of the i-th phrase fragment in the text sequence.
5. The ironic recognition model based on convolutional neural network and Attention mechanism of claim 1, wherein the Attention layer enhances key semantic features by assigning weight scores to different semantic features, and comprises the following steps:
firstly, the semantic feature m_i extracted by the feature extraction layer is input into a linear layer, and the result is processed by a tanh function to obtain the hidden-layer feature representation m'_i of the i-th phrase, where b is the bias; the specific process is shown in Equation (3.3):
m'_i = tanh(W · m_i + b)    (3.3)
secondly, the weight score s_i of the semantic feature m'_i is calculated from the similarity between m'_i and a context vector q; the weight scores of all features are then normalized with a softmax function to obtain the final weight matrix a, as shown in Equations (3.4), (3.5) and (3.6):
s_i = m'_i · q    (3.4)
a = softmax(s_1, s_2, s_3, ..., s_{n-h+1})    (3.5)
a = (a_1, a_2, a_3, ..., a_{n-h+1})    (3.6)
finally, each weight score a_i is multiplied by the corresponding semantic feature m_i to obtain the final semantic feature V, as shown in Equations (3.7) and (3.8):
v_i = a_i * m_i    (3.7)
V = (v_1, v_2, v_3, ..., v_{n-h+1}),  V ∈ R^{(n-h+1)×k}    (3.8).
CN202210108214.5A 2022-01-28 2022-01-28 Ironic recognition model based on convolutional neural network and attention system Pending CN114722798A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210108214.5A CN114722798A (en) 2022-01-28 2022-01-28 Ironic recognition model based on convolutional neural network and attention system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210108214.5A CN114722798A (en) 2022-01-28 2022-01-28 Ironic recognition model based on convolutional neural network and attention system

Publications (1)

Publication Number Publication Date
CN114722798A true CN114722798A (en) 2022-07-08

Family

ID=82236290

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210108214.5A Pending CN114722798A (en) 2022-01-28 2022-01-28 Ironic recognition model based on convolutional neural network and attention system

Country Status (1)

Country Link
CN (1) CN114722798A (en)


Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116882415A (en) * 2023-09-07 2023-10-13 湖南中周至尚信息技术有限公司 Text emotion analysis method and system based on natural language processing
CN116882415B (en) * 2023-09-07 2023-11-24 湖南中周至尚信息技术有限公司 Text emotion analysis method and system based on natural language processing

Similar Documents

Publication Publication Date Title
CN110609891B (en) Visual dialog generation method based on context awareness graph neural network
CN111581961B (en) Automatic description method for image content constructed by Chinese visual vocabulary
CN109472024B (en) Text classification method based on bidirectional circulation attention neural network
CN110298037B (en) Convolutional neural network matching text recognition method based on enhanced attention mechanism
Wang et al. Application of convolutional neural network in natural language processing
CN111488739A (en) Implicit discourse relation identification method based on multi-granularity generated image enhancement representation
CN111930942B (en) Text classification method, language model training method, device and equipment
CN110647612A (en) Visual conversation generation method based on double-visual attention network
CN113065577A (en) Multi-modal emotion classification method for targets
CN112115238A (en) Question-answering method and system based on BERT and knowledge base
CN112487822A (en) Cross-modal retrieval method based on deep learning
CN114239585A (en) Biomedical nested named entity recognition method
CN113657115A (en) Multi-modal Mongolian emotion analysis method based on ironic recognition and fine-grained feature fusion
CN114547299A (en) Short text sentiment classification method and device based on composite network model
CN115130591A (en) Cross supervision-based multi-mode data classification method and device
CN116610778A (en) Bidirectional image-text matching method based on cross-modal global and local attention mechanism
CN114861082A (en) Multi-dimensional semantic representation-based aggressive comment detection method
CN114756678A (en) Unknown intention text identification method and device
CN114722798A (en) Ironic recognition model based on convolutional neural network and attention system
CN113157918A (en) Commodity name short text classification method and system based on attention mechanism
Yuan A Classroom Emotion Recognition Model Based on a Convolutional Neural Network Speech Emotion Algorithm
CN116958677A (en) Internet short video classification method based on multi-mode big data
Vijayaraju Image retrieval using image captioning
CN116187349A (en) Visual question-answering method based on scene graph relation information enhancement
CN115510230A (en) Mongolian emotion analysis method based on multi-dimensional feature fusion and comparative reinforcement learning mechanism

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination