CN112287197A - Method for detecting sarcasm of case-related microblog comments described by dynamic memory cases - Google Patents

Method for detecting sarcasm of case-related microblog comments described by dynamic memory cases Download PDF

Info

Publication number
CN112287197A
CN112287197A CN202011005842.8A CN202011005842A CN112287197A CN 112287197 A CN112287197 A CN 112287197A CN 202011005842 A CN202011005842 A CN 202011005842A CN 112287197 A CN112287197 A CN 112287197A
Authority
CN
China
Prior art keywords
microblog
case
comment
representation
comments
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011005842.8A
Other languages
Chinese (zh)
Other versions
CN112287197B (en
Inventor
余正涛
谭陈琛
相艳
郭军军
黄于欣
线岩团
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kunming University of Science and Technology
Original Assignee
Kunming University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kunming University of Science and Technology filed Critical Kunming University of Science and Technology
Priority to CN202011005842.8A priority Critical patent/CN112287197B/en
Publication of CN112287197A publication Critical patent/CN112287197A/en
Application granted granted Critical
Publication of CN112287197B publication Critical patent/CN112287197B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Links

Images

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/951Indexing; Web crawling techniques
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/355Class or cluster creation or modification
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90Details of database functions independent of the retrieved data types
    • G06F16/95Retrieval from the web
    • G06F16/953Querying, e.g. by the use of web search engines
    • G06F16/9536Search customisation based on social or collaborative filtering
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/216Parsing using statistical methods
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/284Lexical analysis, e.g. tokenisation or collocates
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/30Semantic analysis

Abstract

The invention relates to a case description dynamic memory case description-related microblog comment sarcasm detection method, and belongs to the technical field of natural language processing. The invention comprises the following steps: constructing a irony data set of microblog related; respectively carrying out feature coding on the text of the case-related microblog and the case-related microblog comments by utilizing word embedding and position embedding, and introducing an attention mechanism into the case-related microblog comments subjected to feature coding; obtaining a text representation related to the microblog comments related to the case through a dynamic memory mechanism; and splicing the obtained text representation and the comment representation to obtain a new representation, training a model by taking the representation as input, and performing case-related microblog comment ironic detection through the model. The invention realizes the irony detection of case-related comment sentences by using the case description information as the background information, detects the irony from the acquired public sentiment data and provides support for the subsequent sentiment analysis.

Description

Method for detecting sarcasm of case-related microblog comments described by dynamic memory cases
Technical Field
The invention relates to a case description dynamic memory case description-related microblog comment sarcasm detection method, and belongs to the technical field of natural language processing.
Background
With the rapid development of the internet, people pay more and more attention to case-related events, and views of the case-related events are often published on microblogs. In the field of endeavor, people often choose sarcasm to express their own subjective emotion to certain sensitive information. Therefore, ironic detection is an important task for public opinion analysis in the field of case.
The microblog ironic detection task has the following two characteristics: firstly, the ironic sentence is similar to the display emotional sentence in expression, and the ironic sentence is opposite in meaning and is difficult to distinguish on the premise of lacking background; and secondly, the contents of the microblog comments related to the case are not only related to the corresponding microblog texts, but also related to other microblog texts of the same case. The microblog is used for describing the same case, a plurality of microblog texts with different expressions exist, and the microblog texts of the same case can represent the complete case together to form complete description of the case. The sarcasm is detected from the acquired public sentiment data, so that the public sentiment can be correctly guided, and the negative influence caused by the public sentiment event is effectively reduced.
Disclosure of Invention
The invention provides a method for detecting sarcasm of case-related microblog comments described by dynamic memory cases, which is used for detecting sarcasm in case-related microblog emotion analysis and solves the problem of low emotion analysis performance caused by inaccurate sarcasm detection of case-related microblog comments.
The technical scheme of the invention is as follows: a method for detecting sarcasm of case-related microblog comments described by dynamic memory cases comprises the following steps:
step1, constructing a irony data set of the case microblog: a plurality of microblog comments and microblog texts related to the case are crawled through a crawler technology, and the data set is manually labeled to obtain a case-related microblog ironic data set.
Step2, respectively carrying out feature coding on the case-related microblog text and the case-related microblog comments by word embedding and position embedding, and introducing an attention mechanism into the case-related microblog comments after feature coding; obtaining a text representation related to the microblog comments related to the case through a dynamic memory mechanism; and splicing the obtained text representation and the comment representation to obtain a new representation, training a model by taking the representation as input, and performing case-related microblog comment ironic detection through the model.
As a further scheme of the invention, the specific steps of Step1 are as follows:
step1.1, crawling relevant microblog texts and comments of dozens of current hot cases from the Xinlang microblog by using a crawler based on a Scapy frame;
step1.2, filtering and screening the texts and comments of the microblog involved in the case, wherein the filtering and screening mode is as follows: (1) dividing microblog messages according to a forwarding relation '//' for ensuring that comments below the forwarded microblog are analyzed based on the original microblog, (2) deleting a structure of '@ + user name + reply' in the microblog comments and deleting irrelevant hyperlink advertisements;
step1.3, obtaining an involved microblog sarcasia data set by adopting artificial marking, marking the data set by taking one microblog comment as a unit, marking 0 cases as sarcasia in the microblog comment sentences, marking 1 as non-involved microblog sarcasia and obtaining intersection by three blind persons.
As a further scheme of the invention, the specific steps of Step2 are as follows:
step2.1, coding the microblog text involved in the case by adopting a word position coding mode to obtain a word vector l with position informationse
Figure BDA0002695862760000021
Wherein S belongs to [0, S-1], S represents the number of words involved in the microblog text, E belongs to [0, E-1], E represents the dimension of word embedding;
step2.2, respectively using the text of the microblog related to the case as a position information word vector lseAnd word vector w trained with word2vec on large-scale microblog corpussElement multiplication is carried out to obtain the text representation f of the case-related microblogi
Figure BDA0002695862760000022
lsIs characterized by one-hot, representing a column vector representing information of a position information word, SiRepresenting the word number of the ith case-related microblog text, multiplying the' representative vector by elements, and fi∈REEmbedded representation with location information representing the ith case-related microblog text, where i e 0, N]N represents the number of case-related microblog texts, and the number of case-related microblog texts is N when consistent with the number of case-related microblog comments, wsEmbedding a representative word into a vector;
step2.3, sending the text of the case-related microblog into a BiGRU model capable of extracting bidirectional semantic features to obtain coded text
Figure BDA0002695862760000023
Figure BDA0002695862760000024
Figure BDA0002695862760000025
Step2.4, commenting the microblog related to the case
Figure BDA0002695862760000026
Word vector representation w trained with word2veccAnd sending it into BiGRU model to obtain codeThe latter remarks
Figure BDA0002695862760000031
Figure BDA0002695862760000032
Wherein C belongs to [0, C ], and C represents the number of words in the comment;
step2.5, introducing an attention mechanism to the microblog comments related to the case, so that the microblog comments related to the case can pay attention to the word-level more key information in the sentence, and obtaining a comment v introduced with the attention mechanism;
Figure BDA0002695862760000033
wherein, WuIs a parameter matrix, buIs an offset term, uwRepresenting a word-level context vector, alphacIs a weight matrix, viThe weighted and summed comment sentence vectors are obtained, and v is the weighted comment;
step2.6, and characterizing the obtained text of the case
Figure BDA0002695862760000034
Characterization v of microblog comments involved in case and previous round of memory information mt-1And performing interactive calculation to splice the obtained results to obtain a characteristic splicing matrix zt
m0=v (6)
Figure BDA0002695862760000035
m0Representing initial memory information, initializing by using v, wherein t is memory frequency, |, represents element absolute value, [;]representing the concatenation of the vectors;
step2.7, introducing an attention mechanism, and acquiring a weighted microblog text representation gt
Figure BDA0002695862760000036
Wherein, Wz、WztIs a parameter matrix, bz、bwzIs a bias term;
step2.8, mixing gtReplace the update gate in the GRU;
rt=σ(Wrzzt+Wrhht-1+br) (8)
Figure BDA0002695862760000041
Figure BDA0002695862760000042
wherein r istIndicating a reset gate in the gated attention GRU,
Figure BDA0002695862760000043
representing candidate hidden layers, htRepresents a hidden state, Wrz、Wrh
Figure BDA0002695862760000044
And
Figure BDA0002695862760000045
is a parameter matrix; brAnd
Figure BDA0002695862760000046
is a bias term;
step2.9, memorizing the information m of the previous roundt-1Hidden state h obtained by gated attention GRUtSplicing with the comment v, inputting a linear layer, and activating by using a ReLU function; because one input round cannot well remember all required information, multiple iterations are required;
Figure BDA0002695862760000047
wherein m ist∈R2d×NMemory information of t memory times; wmIs a parameter matrix;
step2.10, splicing the obtained text representation and the comment representation to obtain a new representation, and training a model by taking the representation as input; splicing the text representation related to the comment and the comment subjected to feature coding to obtain a new representation, and deciding the category of the maximum probability by adopting a softmax classification function;
Figure BDA0002695862760000048
wherein, WyA parameter matrix is represented.
The invention has the beneficial effects that:
1. the consistency of the sarcasm of the microblog related files and the case description information is realized, the problem that the difference between the literal meaning and the actual meaning is large can be well solved, and the detection and measurement effect is improved.
2. A dynamic memory mechanism is utilized to obtain better semantic representation through multiple iterations, and related background knowledge is effectively memorized.
Drawings
Fig. 1 is a schematic diagram of a specific structure of the recognition model in the present invention.
Detailed Description
Example 1: as shown in fig. 1, a method for detecting sarcasm of case description related to microblog comments dynamically includes:
step1, constructing a irony data set of the case-related microblog; the method comprises the following specific steps:
step1.1, crawling relevant microblog texts and comments of dozens of current hot cases from the Xinlang microblog by using a crawler based on a Scapy frame;
step1.2, filtering and screening the texts and comments of the microblog involved in the case, wherein the filtering and screening mode is as follows: (1) dividing microblog messages according to a forwarding relation '//' for ensuring that comments below the forwarded microblog are analyzed based on the original microblog, (2) deleting a structure of '@ + user name + reply' in the microblog comments and deleting irrelevant hyperlink advertisements;
step1.3, obtaining a sarcasm data set related to a microblog by adopting artificial marking; marking work is carried out by taking one microblog comment as a unit, the microblog comment sentences contain sarcasm and cases, the cases are marked as 0, other cases are regarded as non-sarcasm microblog sarcasm and marked as 1, and the intersection is obtained by blind judgment of three persons. After the data are out of order, the data are divided into a training set and a testing set, and the data distribution condition is shown in table 1.
TABLE 1 statistical information of the microblog irony dataset
Figure BDA0002695862760000051
Step2, respectively carrying out feature coding on the case-related microblog text and the case-related microblog comments by word embedding and position embedding, and introducing an attention mechanism into the case-related microblog comments after feature coding; obtaining a text representation related to the microblog comments related to the case through a dynamic memory mechanism; and splicing the obtained text representation and the comment representation to obtain a new representation, training a model by taking the representation as input, and performing case-related microblog comment ironic detection through the model.
Step2.1, coding the microblog text involved in the case by adopting a word position coding mode to obtain a word vector l with position informationse
Figure BDA0002695862760000052
Wherein S belongs to [0, S-1], S represents the number of words involved in the microblog text, E belongs to [0, E-1], E represents the dimension of word embedding;
step2.2, respectively using the text of the microblog related to the case as a position information word vector lseAnd word vector w trained with word2vec on large-scale microblog corpussElement multiplication is carried out to obtain the text representation f of the case-related microblogi
Figure BDA0002695862760000053
lsIs characterized by one-hot, representing a column vector representing information of a position information word, SiRepresenting the word number of the ith case-related microblog text, multiplying the' representative vector by elements, and fi∈REEmbedded representation with location information representing the ith case-related microblog text, where i e 0, N]N represents the number of case-related microblog texts, and the number of case-related microblog texts is N when consistent with the number of case-related microblog comments, wsEmbedding a representative word into a vector;
step2.3, sending the text of the case-related microblog into a BiGRU model capable of extracting bidirectional semantic features to obtain coded text
Figure BDA0002695862760000061
Figure BDA0002695862760000062
Figure BDA0002695862760000063
Step2.4, commenting the microblog related to the case
Figure BDA0002695862760000064
Word vector representation w trained with word2veccAnd sending the comment into a BiGRU model to obtain the coded comment
Figure BDA0002695862760000065
Figure BDA0002695862760000066
Wherein C belongs to [0, C ], and C represents the number of words in the comment;
step2.5, introducing an attention mechanism to the microblog comments related to the case, so that the microblog comments related to the case can pay attention to the word-level more key information in the sentence, and obtaining a comment v introduced with the attention mechanism;
Figure BDA0002695862760000067
wherein, WuIs a parameter matrix, buIs an offset term, uwRepresenting a word-level context vector, alphacIs a weight matrix, viThe weighted and summed comment sentence vectors are obtained, and v is the weighted comment;
step2.6, and characterizing the obtained text of the case
Figure BDA0002695862760000068
Characterization v of microblog comments involved in case and previous round of memory information mt-1And performing interactive calculation to splice the obtained results to obtain a characteristic splicing matrix zt
m0=v (6)
Figure BDA0002695862760000069
m0Representing initial memory information, initializing by using v, wherein t is memory frequency, |, represents element absolute value, [;]representing the concatenation of the vectors;
step2.7, introducing an attention mechanism, and acquiring a weighted microblog text representation gt
Figure BDA0002695862760000071
Wherein, Wz、WztIs a parameter matrix, bz、bwzIs a bias term;
step2.8, mixing gtReplace the update gate in the GRU;
rt=σ(Wrzzt+Wrhht-1+br) (8)
Figure BDA0002695862760000072
Figure BDA0002695862760000073
wherein r istIndicating a reset gate in the gated attention GRU,
Figure BDA0002695862760000074
representing candidate hidden layers, htRepresents a hidden state, Wrz、Wrh
Figure BDA0002695862760000075
And
Figure BDA0002695862760000076
is a parameter matrix; brAnd
Figure BDA0002695862760000077
is a bias term;
step2.9, memorizing the information m of the previous roundt-1Hidden state h obtained by gated attention GRUtSplicing with the comment v, inputting a linear layer, and activating by using a ReLU function; because one input round cannot well remember all required information, multiple iterations are required;
Figure BDA0002695862760000078
wherein m ist∈R2d×NMemory information of t memory times; wmIs a parameter matrix;
step2.10, splicing the obtained text representation and the comment representation to obtain a new representation, and training a model by taking the representation as input; splicing the text representation related to the comment and the comment subjected to feature coding to obtain a new representation, and deciding the category of the maximum probability by adopting a softmax classification function;
Figure BDA0002695862760000079
wherein, WyA parameter matrix is represented.
To illustrate the effect of the present invention, 2-group comparative experiments were set up. The first group of experiments verify the effectiveness of the case description information in fusion, and the other group of experiments verify the effectiveness of the dynamic memory mechanism.
(1) Validity verification of dynamic memory mechanism integrated with case description information
And comparing the modes of ironic detection only by using microblog comment sentences and ironic detection by combining the comment sentences of case description information in the reference model. And taking the text corresponding to the case-related microblog comment as case description information of the case-related microblog comment in a reference model, firstly, respectively inputting the microblog comment and the text corresponding to the microblog comment into the model, then, splicing the characteristics of the microblog comment and the text, and finally, classifying. The results of the experiment are shown in table 2.
TABLE 2 comparison of experimental results with case description information
Figure BDA0002695862760000081
As can be seen from the analysis of table 2, comparing the two data sets under the same model, it can be seen that the performance index combining the case description information and the comment data set is superior to the performance index of the comment data set only, wherein the F1 value is raised by 2.41% at the maximum after the HAN model adds the case description information, which proves that the introduction of the case description information in the ironic detection is effective. The F1 value for the selfatention model was the greatest in both datasets, reaching 83.10% and 83.15% for Acc and F1 values, respectively, after combining case description information, but still lower than the model herein. Acc and F1 values of the model in the table 2 are optimal results, and reach 85.65% and 85.91% respectively, so that the effectiveness of the model on the task of micro-blog ironic identification is proved.
(2) Validity verification of dynamic memory mechanism
The second part verifies the validity of the dynamic memory mechanism, namely model performance under different memory times is compared, and the experimental results are shown in table 3.
TABLE 3 validation of memory mechanisms
Figure BDA0002695862760000082
Analysis of Table 3 reveals that when the number of memorization is 0, m is0Is v: acc value, P value, R value and F1 value are lowest; when the Acc value and the F1 value in the memory times of 0-3 are increased along with the increase of the memory times and reach the optimal performance in three times of memory, the performance of the model can be influenced and optimized by memorizing case description information for many times. By comparing the evaluation indexes of 3 and 4 memoization, it can be seen that the Acc value and the F1 value begin to decrease as the number of memoization continues to increase, respectively 1.94% and 2.39% lower than the 3 memoization, indicating that overfitting occurs as the number of memoization increases. Experiments prove that the best effect is obtained by 3 times of memory.
Furthermore, combining tables 2 and 3, it can be seen that when the number of remembers is 0, the model herein outperforms the baseline model at Acc, P, and F1 values, indicating that a single body of text does not provide support for comments. When the number of memorization times is 3, the Acc value is raised by 6.01% and the F1 value is raised by 6.27% compared with the RCNN model with the comment as input. On one hand, it is verified that the case description information is effectively added into the case detection task related to the microblog sarcasm, namely, the case description information is beneficial to improving the case detection performance related to the microblog sarcasm; on the other hand, the dynamic memory mechanism provided by the text can fully memorize case description information related to case-related microblog comments, and the memorized information can effectively guide the sarcasm detection task.
The experimental data prove that the problem that the difference between the literal meaning and the actual meaning is large can be well solved by judging the consistency of case description information and comment sentences. Meanwhile, a dynamic memory mechanism is utilized to obtain better semantic representation through multiple iterations. Experiments prove that the method provided by the invention achieves the optimal effect compared with a plurality of baseline models. Aiming at the sarcasm detection task of the concerned microblog comment, the method for guiding the microblog comment detection by using the case description information is effective.
While the present invention has been described in detail with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art.

Claims (4)

1. A method for detecting sarcasm of case-related microblog comments described by dynamic memory cases is characterized by comprising the following steps: the method comprises the following steps:
step1, constructing a irony data set of the case-related microblog;
step2, respectively carrying out feature coding on the case-related microblog text and the case-related microblog comments by word embedding and position embedding, and introducing an attention mechanism into the case-related microblog comments after feature coding; obtaining a text representation related to the microblog comments related to the case through a dynamic memory mechanism; and splicing the obtained text representation and the comment representation to obtain a new representation, training a model by taking the representation as input, and performing case-related microblog comment ironic detection through the model.
2. The detection method for the sarcasm of the related microblog comments described by the dynamic memory case as claimed in claim 1, wherein the detection method comprises the following steps: in Step1, crawling a plurality of microblog comments and microblog texts related to the case through a crawler technology, and manually labeling the data set to obtain a case-related microblog ironic data set.
3. The detection method for the sarcasm of the micro-blog comments related to the dynamic memory case description according to claim 1 or 2, wherein: the specific steps of Step1 are as follows:
step1.1, crawling relevant microblog texts and comments of dozens of current hot cases from the Xinlang microblog by using a crawler based on a Scapy frame;
step1.2, filtering and screening the texts and comments of the microblog involved in the case, wherein the filtering and screening mode is as follows: (1) dividing microblog messages according to a forwarding relation '//' for ensuring that comments below the forwarded microblog are analyzed based on the original microblog, (2) deleting a structure of '@ + user name + reply' in the microblog comments and deleting irrelevant hyperlink advertisements;
step1.3, obtaining a sarcasm data set related to a microblog by adopting artificial marking; marking work is carried out by taking one microblog comment as a unit, the microblog comment sentences contain sarcasm and cases, the cases are marked as 0, other cases are regarded as non-sarcasm microblog sarcasm and marked as 1, and the intersection is obtained by blind judgment of three persons.
4. The detection method for the sarcasm of the related microblog comments described by the dynamic memory case as claimed in claim 1, wherein the detection method comprises the following steps: the specific steps of Step2 are as follows:
step2.1, coding the microblog text involved in the case by adopting a word position coding mode to obtain a word vector l with position informationse
Figure FDA0002695862750000011
Wherein S belongs to [0, S-1], S represents the number of words involved in the microblog text, E belongs to [0, E-1], E represents the dimension of word embedding;
step2.2, respectively using the text of the microblog related to the case as a position information word vector lseAnd word vector w trained with word2vec on large-scale microblog corpussElement multiplication is carried out to obtain the text representation f of the case-related microblogi
Figure FDA0002695862750000021
lsIs characterized by one-hot, representing a column vector representing information of a position information word, SiRepresenting the word number of the ith case-related microblog text, multiplying the' representative vector by elements, and fi∈RERepresents the ithEmbedded representation with location information for a subject microblog text, where i e [0, N]N represents the number of case-related microblog texts, and the number of case-related microblog texts is N when consistent with the number of case-related microblog comments, wsEmbedding a representative word into a vector;
step2.3, sending the text of the case-related microblog into a BiGRU model capable of extracting bidirectional semantic features to obtain coded text
Figure FDA0002695862750000022
Figure FDA0002695862750000023
Figure FDA0002695862750000024
Step2.4, commenting the microblog related to the case
Figure FDA0002695862750000025
Word vector representation w trained with word2veccAnd sending the comment into a BiGRU model to obtain the coded comment
Figure FDA0002695862750000026
Figure FDA0002695862750000027
Wherein C belongs to [0, C ], and C represents the number of words in the comment;
step2.5, introducing an attention mechanism to the microblog comments related to the case, so that the microblog comments related to the case can pay attention to the word-level more key information in the sentence, and obtaining a comment v introduced with the attention mechanism;
Figure FDA0002695862750000028
wherein, WuIs a parameter matrix, buIs an offset term, uwRepresenting a word-level context vector, alphacIs a weight matrix, viThe weighted and summed comment sentence vectors are obtained, and v is the weighted comment;
step2.6, and characterizing the obtained text of the case
Figure FDA0002695862750000031
Characterization v of microblog comments involved in case and previous round of memory information mt-1And performing interactive calculation to splice the obtained results to obtain a characteristic splicing matrix zt
m0=v (6)
Figure FDA0002695862750000032
m0Representing initial memory information, initializing by using v, wherein t is memory frequency, |, represents element absolute value, [;]representing the concatenation of the vectors;
step2.7, introducing an attention mechanism, and acquiring a weighted microblog text representation gt
Figure FDA0002695862750000033
Wherein, Wz、WztIs a parameter matrix, bz、bwzIs a bias term;
step2.8, mixing gtReplace the update gate in the GRU;
rt=σ(Wrzzt+Wrhht-1+br) (8)
Figure FDA0002695862750000034
Figure FDA0002695862750000035
wherein r istIndicating a reset gate in the gated attention GRU,
Figure FDA0002695862750000036
representing candidate hidden layers, htRepresents a hidden state, Wrz、Wrh
Figure FDA0002695862750000038
And
Figure FDA0002695862750000039
is a parameter matrix; brAnd
Figure FDA00026958627500000310
is a bias term;
step2.9, memorizing the information m of the previous roundt-1Hidden state h obtained by gated attention GRUtSplicing with the comment v, inputting a linear layer, and activating by using a ReLU function; because one input round cannot well remember all required information, multiple iterations are required;
Figure FDA0002695862750000037
wherein m ist∈R2d×NMemory information of t memory times; wmIs a parameter matrix;
step2.10, splicing the obtained text representation and the comment representation to obtain a new representation, and training a model by taking the representation as input; splicing the text representation related to the comment and the comment subjected to feature coding to obtain a new representation, and deciding the category of the maximum probability by adopting a softmax classification function;
Figure FDA0002695862750000041
wherein, WyA parameter matrix is represented.
CN202011005842.8A 2020-09-23 2020-09-23 Method for detecting sarcasm of case-related microblog comments described by dynamic memory cases Active CN112287197B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011005842.8A CN112287197B (en) 2020-09-23 2020-09-23 Method for detecting sarcasm of case-related microblog comments described by dynamic memory cases

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011005842.8A CN112287197B (en) 2020-09-23 2020-09-23 Method for detecting sarcasm of case-related microblog comments described by dynamic memory cases

Publications (2)

Publication Number Publication Date
CN112287197A true CN112287197A (en) 2021-01-29
CN112287197B CN112287197B (en) 2022-07-19

Family

ID=74422152

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011005842.8A Active CN112287197B (en) 2020-09-23 2020-09-23 Method for detecting sarcasm of case-related microblog comments described by dynamic memory cases

Country Status (1)

Country Link
CN (1) CN112287197B (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112800229A (en) * 2021-02-05 2021-05-14 昆明理工大学 Knowledge graph embedding-based semi-supervised aspect-level emotion analysis method for case-involved field
CN112926336A (en) * 2021-02-05 2021-06-08 昆明理工大学 Microblog case aspect-level viewpoint identification method based on text comment interactive attention
CN113657115A (en) * 2021-07-21 2021-11-16 内蒙古工业大学 Multi-modal Mongolian emotion analysis method based on ironic recognition and fine-grained feature fusion

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108446275A (en) * 2018-03-21 2018-08-24 北京理工大学 Long text emotional orientation analytical method based on attention bilayer LSTM
CN110134962A (en) * 2019-05-17 2019-08-16 中山大学 A kind of across language plain text irony recognition methods based on inward attention power
CN110162625A (en) * 2019-04-19 2019-08-23 杭州电子科技大学 Based on word in sentence to the irony detection method of relationship and context user feature
CN110807323A (en) * 2019-09-20 2020-02-18 平安科技(深圳)有限公司 Emotion vector generation method and device
CN110866403A (en) * 2018-08-13 2020-03-06 中国科学院声学研究所 End-to-end conversation state tracking method and system based on convolution cycle entity network
CN111008274A (en) * 2019-12-10 2020-04-14 昆明理工大学 Case microblog viewpoint sentence identification and construction method of feature extended convolutional neural network
CN111159405A (en) * 2019-12-27 2020-05-15 北京工业大学 Irony detection method based on background knowledge
US20200184345A1 (en) * 2018-12-11 2020-06-11 Hiwave Technologies Inc. Method and system for generating a transitory sentiment community
CN111507101A (en) * 2020-03-03 2020-08-07 杭州电子科技大学 Ironic detection method based on multi-level semantic capsule routing
CN111581474A (en) * 2020-04-02 2020-08-25 昆明理工大学 Evaluation object extraction method of case-related microblog comments based on multi-head attention system

Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN108446275A (en) * 2018-03-21 2018-08-24 北京理工大学 Long text emotional orientation analytical method based on attention bilayer LSTM
CN110866403A (en) * 2018-08-13 2020-03-06 中国科学院声学研究所 End-to-end conversation state tracking method and system based on convolution cycle entity network
US20200184345A1 (en) * 2018-12-11 2020-06-11 Hiwave Technologies Inc. Method and system for generating a transitory sentiment community
CN110162625A (en) * 2019-04-19 2019-08-23 杭州电子科技大学 Based on word in sentence to the irony detection method of relationship and context user feature
CN110134962A (en) * 2019-05-17 2019-08-16 中山大学 A kind of across language plain text irony recognition methods based on inward attention power
CN110807323A (en) * 2019-09-20 2020-02-18 平安科技(深圳)有限公司 Emotion vector generation method and device
CN111008274A (en) * 2019-12-10 2020-04-14 昆明理工大学 Case microblog viewpoint sentence identification and construction method of feature extended convolutional neural network
CN111159405A (en) * 2019-12-27 2020-05-15 北京工业大学 Irony detection method based on background knowledge
CN111507101A (en) * 2020-03-03 2020-08-07 杭州电子科技大学 Ironic detection method based on multi-level semantic capsule routing
CN111581474A (en) * 2020-04-02 2020-08-25 昆明理工大学 Evaluation object extraction method of case-related microblog comments based on multi-head attention system

Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
ANKIT KUMAR 等: "Ask Me Anything:Dynamic Memory Networks for Natural Language Processing", 《ARXIV.ORG/PDF/1506.07285.PDF》, 24 June 2015 (2015-06-24), pages 1 - 10, XP055447094 *
SUZANA ILIC 等: "Deep contextualized word representations for detecting sarcasm and irony", 《ARXIV.ORG/PDF/1809.09795.PDF》, 26 September 2018 (2018-09-26), pages 1 - 6 *
韩虎 等: "面向社交媒体评论的上下文语境讽刺检测模型", 《计算机工程》, vol. 47, no. 01, 17 January 2020 (2020-01-17), pages 66 - 71 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112800229A (en) * 2021-02-05 2021-05-14 昆明理工大学 Knowledge graph embedding-based semi-supervised aspect-level emotion analysis method for case-involved field
CN112926336A (en) * 2021-02-05 2021-06-08 昆明理工大学 Microblog case aspect-level viewpoint identification method based on text comment interactive attention
CN112800229B (en) * 2021-02-05 2022-12-20 昆明理工大学 Knowledge graph embedding-based semi-supervised aspect-level emotion analysis method for case-involved field
CN113657115A (en) * 2021-07-21 2021-11-16 内蒙古工业大学 Multi-modal Mongolian emotion analysis method based on ironic recognition and fine-grained feature fusion
CN113657115B (en) * 2021-07-21 2023-06-30 内蒙古工业大学 Multi-mode Mongolian emotion analysis method based on ironic recognition and fine granularity feature fusion

Also Published As

Publication number Publication date
CN112287197B (en) 2022-07-19

Similar Documents

Publication Publication Date Title
CN111966917B (en) Event detection and summarization method based on pre-training language model
CN108363743B (en) Intelligent problem generation method and device and computer readable storage medium
Styawati et al. Sentiment analysis on online transportation reviews using Word2Vec text embedding model feature extraction and support vector machine (SVM) algorithm
CN112287197B (en) Method for detecting sarcasm of case-related microblog comments described by dynamic memory cases
CN108363790A (en) For the method, apparatus, equipment and storage medium to being assessed
CN109726745B (en) Target-based emotion classification method integrating description knowledge
CN107315738A (en) A kind of innovation degree appraisal procedure of text message
CN110929034A (en) Commodity comment fine-grained emotion classification method based on improved LSTM
CN112989033B (en) Microblog emotion classification method based on emotion category description
CN112256866A (en) Text fine-grained emotion analysis method based on deep learning
CN112395421B (en) Course label generation method and device, computer equipment and medium
CN111368082A (en) Emotion analysis method for domain adaptive word embedding based on hierarchical network
CN111241397A (en) Content recommendation method and device and computing equipment
CN107818173B (en) Vector space model-based Chinese false comment filtering method
CN115544252A (en) Text emotion classification method based on attention static routing capsule network
Suhartono et al. Argument annotation and analysis using deep learning with attention mechanism in Bahasa Indonesia
CN111274354B (en) Referee document structuring method and referee document structuring device
CN111104492B (en) Civil aviation field automatic question and answer method based on layering Attention mechanism
da Rocha et al. A text as unique as a fingerprint: Text analysis and authorship recognition in a Virtual Learning Environment of the Unified Health System in Brazil
Hussain et al. A technique for perceiving abusive bangla comments
CN115906824A (en) Text fine-grained emotion analysis method, system, medium and computing equipment
CN110569495A (en) Emotional tendency classification method and device based on user comments and storage medium
Munnes et al. Examining sentiment in complex texts. A comparison of different computational approaches
CN111159405B (en) Irony detection method based on background knowledge
CN114942991A (en) Emotion classification model construction method based on metaphor recognition

Legal Events

Date Code Title Description
PB01 Publication
PB01 Publication
SE01 Entry into force of request for substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant
GR01 Patent grant