CN112926336A - Microblog case aspect-level viewpoint identification method based on text comment interactive attention - Google Patents
Classifications
- G06F40/30 — Semantic analysis
- G06F16/35 — Information retrieval of unstructured textual data; Clustering; Classification
- G06F16/951 — Indexing; Web crawling techniques
- G06F18/22 — Pattern recognition; Matching criteria, e.g. proximity measures
- G06F40/126 — Character encoding
- G06F40/216 — Parsing using statistical methods
- G06F40/242 — Dictionaries
- G06F40/289 — Phrasal analysis, e.g. finite state techniques or chunking
Abstract
The invention relates to a microblog case aspect-level viewpoint identification method based on text-comment interactive attention, and belongs to the technical field of natural language processing. Texts and related comments of several hot cases are crawled from a microblog platform to form a data set, which is then word-segmented to construct a dictionary. The method first encodes the text and the comments separately, then fuses the text information with the comment information through an interactive attention mechanism, and finally recognizes case-aspect viewpoints in the comments from the fused features. The method effectively improves the accuracy of case aspect-level viewpoint identification and addresses the poor performance of existing approaches on microblog data.
Description
Technical Field
The invention relates to a microblog case aspect-level viewpoint identification method based on text comment interactive attention, and belongs to the technical field of natural language processing.
Background
In recent years, netizens have paid increasing attention to legal cases, microblog media report on such cases more frequently, and many netizens comment on them. The texts of such news items also contain viewpoints about different objects; facing this huge volume of data, grasping public-opinion trends by manually reading large numbers of comments is impractical, and the public and judicial authorities care most about netizen viewpoints on particular aspects of a case. Research on case-related aspect-level viewpoint identification for microblog data therefore has real significance for quickly grasping the Internet situation. However, the form and expression of microblog data are flexible and changeable, so judging aspect-level viewpoints with traditional natural language processing methods is difficult. In fact, a microblog text is a statement of case facts that describes various aspects of the case, and microblog comments mostly develop discussions around that text, so combining the text information can assist the understanding of case-related comments.
Disclosure of Invention
The invention provides a microblog case aspect-level viewpoint identification method based on interactive attention over text and comments. It fuses text information with comment information to enrich and strengthen the semantic representation of comments, recognizes case-aspect viewpoints in comment text from the fused features, and addresses the poor aspect-level viewpoint identification performance caused by insufficient or ambiguous comment information.
The technical scheme of the invention is as follows: a microblog case aspect-level viewpoint identification method based on interactive attention over text and comments. Texts and comments of several hot cases are crawled from a microblog platform, a microblog case corpus is constructed, the corpus is preprocessed (word segmentation and the like), and a dictionary is built. For the problem of aspect-level viewpoint recognition in case-related microblog documents, a case aspect-level viewpoint recognition method based on text-comment interactive attention is proposed, which achieves aspect-level viewpoint recognition by fusing the contextual information of social media.
The method comprises the following steps:
step1, acquiring relevant texts and comments of the hot cases from the microblog, constructing a microblog case corpus, and then segmenting the corpus to construct a dictionary;
step2, encoding the microblog texts and comments by using a multi-head attention mechanism;
step3, fusing the information of the texts and the comments through an interactive attention mechanism;
step4, identifying aspect level views of microblog cases using a multi-label classifier.
As a further scheme of the present invention, the Step1 specifically comprises the following steps:
step1.1, firstly, crawling microblog case linguistic data from the Internet by utilizing a web crawler program;
step1.2, filtering and denoising the crawled microblog case linguistic data to construct a microblog case data set;
Step1.3, extracting the comments related to the microblog cases from the Step1.2 data set, matching each comment to the text it belongs to, marking the corresponding labels, forming the microblog case corpus through manual processing, segmenting the comments with a Chinese word-segmentation tool, and constructing a dictionary.
As a further scheme of the invention, the step Step1.3 comprises the following specific steps:
step1.3.1, extracting the comments related to the microblog cases from the step1.2 data set, and corresponding to the texts;
step1.3.2, marking each comment with a corresponding label;
step1.3.3, segmenting the comments with a Chinese word-segmentation tool, feeding the comments in batches until all comments have been processed;
step1.3.4, constructing a dictionary from the words obtained in Step1.3.3: first create an empty dictionary, then process each word in turn, adding the word to the dictionary if it is not yet contained and skipping to the next word if it is, until all words are processed.
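The dictionary construction of Step1.3.3-Step1.3.4 can be sketched as follows. `build_vocab` and the sample tokens are illustrative names, not from the patent; a real pipeline would take tokens from a Chinese word-segmentation tool:

```python
def build_vocab(segmented_comments):
    """Build a word -> index dictionary from pre-segmented comments.

    Minimal sketch of Step1.3.4: start from an empty dictionary, add each
    word only if it is not yet contained, otherwise skip to the next word.
    """
    vocab = {}
    for comment in segmented_comments:
        for word in comment:
            if word not in vocab:          # add only unseen words
                vocab[word] = len(vocab)   # next free index
    return vocab

# hypothetical pre-segmented comments (tokens would come from a segmenter)
vocab = build_vocab([["被告", "有罪"], ["被告", "无罪"]])
```

Processing the comments in batches, as Step1.3.3 describes, only changes how `segmented_comments` is fed in; the membership check keeps the dictionary free of duplicates either way.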
As a further scheme of the present invention, the Step2 specifically comprises the following steps:
step2.1, using the text and comments of the microblog case as input at two ends of the model, adopting the same coding mode for the text and the comments, and expressing each sentence into an embedded matrix about the sentence;
the text and the comments of the microblog case are used as the two inputs of the encoding end; assume a sentence contains n words, so that the sentence X is represented as:
X = (x_1, x_2, ..., x_n)
after the sentence is embedded, the word-embedding sequence is:
E = (w_1, w_2, ..., w_n)
E represents the sentence as a two-dimensional embedding matrix in which all word embeddings of the sentence are concatenated; its size is n × d, where n is the number of words and d is the embedding dimension, and each element of the sequence E is independent;
step2.2, coding the microblog text and the comments by adopting a multi-head attention mechanism, so that each word and all the words in the sentence are concerned by calculation, and the representation of the microblog case text and the comments is obtained;
reading each text sequence with a multi-head attention mechanism and computing the attention of every word over all words: the two-dimensional embedding matrix E is linearly transformed into the query, key and value matrices Q, K and V, which are fed into a multi-head attention mechanism with 8 heads; the outputs of all heads are finally concatenated and mapped by a linear conversion layer into a single output of the same size as one head, according to the formula:
A = Linear(Multihead(Q, K, V))
the matrix A is the representation of the text sequence obtained by multi-head attention encoding; Q, K and V are the query, key and value matrices obtained from E;
and encoding the text and the comments by adopting a multi-head attention mechanism, so that each word in the sentence and all the words are concerned by calculation, and the representation of the text and the comments of the microblog case is obtained.
The multi-head attention mechanism differs in that the attention computation is performed multiple times, allowing the model to learn relevant information in different representation subspaces. Every word in the sentence attends to all words, so a dependency can be computed directly regardless of the distance between words; this learns the internal structure of a sentence without depending on the computation of the previous time step and therefore parallelizes well. Macroscopically, multi-head attention can be understood as a query against a series of key-value pair mappings. Each word is embedded to obtain the sentence embedding matrix, and linear transformations of this matrix give the corresponding query (Q), key (K) and value (V). Q, K and V are each linearly transformed and fed to scaled dot-product attention h times, the so-called multiple heads; parameters are not shared between heads, so the linear transformations applied to Q, K and V differ per head. The h scaled dot-product attention results are then concatenated and passed through one more linear transformation to give the multi-head attention result, as shown in the following formula:
MultiHead(Q, K, V) = Concat(head_1, head_2, ..., head_h) W^O
wherein:
head_i = Attention(Q W_i^Q, K W_i^K, V W_i^V)
In this work, h = 8 parallel scaled dot-product attention heads are used, with d_k = d_v = d_model / h for each head.
The input consists of queries and keys of dimension d_k and values of dimension d_v. The dot products of Q with all keys are computed, each divided by sqrt(d_k), and a softmax function is applied to obtain the weights on the values, as shown in the equation:
Attention(Q, K, V) = softmax(Q K^T / sqrt(d_k)) V
Here sqrt(d_k), the square root of the key-vector dimension, is a scaling factor that keeps the inner products from growing too large.
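The multi-head encoding of Step2.2 can be sketched with NumPy. The random weight matrices below are stand-ins for the learned projections, so only shapes and the head-splitting logic are meaningful:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(E, h=8, seed=0):
    """Sketch of Step2.2: encode a sentence matrix E (n x d) with h-head
    self-attention. d must be divisible by h, with d_k = d_v = d / h.
    The weight matrices are random stand-ins for learned parameters."""
    n, d = E.shape
    d_k = d // h
    rng = np.random.default_rng(seed)
    Wq, Wk, Wv, Wo = (rng.standard_normal((d, d)) * 0.1 for _ in range(4))
    Q, K, V = E @ Wq, E @ Wk, E @ Wv               # linear transformations of E
    heads = []
    for i in range(h):                              # one scaled dot-product per head
        s = slice(i * d_k, (i + 1) * d_k)           # this head's subspace
        scores = Q[:, s] @ K[:, s].T / np.sqrt(d_k) # every word attends to all words
        heads.append(softmax(scores) @ V[:, s])
    return np.concatenate(heads, axis=1) @ Wo       # concat heads, final linear layer

E = np.random.default_rng(1).standard_normal((5, 64))  # 5 words, d = 64
A = multi_head_attention(E)
```

Because no step depends on the previous time step, all rows of `scores` can be computed at once, which is the parallelism the passage above refers to.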
As a further scheme of the present invention, the Step3 specifically comprises the following steps:
step3.1, firstly, calculating the similarity between the text matrix and the comment matrix;
the representations of the text and the comments obtained by multi-head attention encoding are denoted by the matrices L and K respectively, and attention is computed in two directions: from comment to text and from text to comment, both derived from a shared similarity matrix between the context-embedded representations of text and comment; the similarity matrix is calculated as follows:
S_tj = α(K_:t, L_:j)
wherein S_tj is the similarity between the t-th comment word and the j-th text word, α is a trainable scalar function measuring the similarity of two input vector representations of the encoded text and comment, K_:t is the t-th column vector of K, and L_:j is the j-th column vector of L;
step3.2, calculating the attention from the text to the comment, namely an attention coding module from the text to the comment, wherein the text to comment coding module shows which text words are most relevant to each comment word;
text-to-comment attention indicates which text words are most relevant to each comment word; let a_t denote the attention weights of the text words for the t-th comment word, with Σ_j a_tj = 1 for all t, obtained by a_t = softmax(S_t:); the attended text vector for each comment word is then:
L̃_:t = Σ_j a_tj L_:j
step3.3, calculating the attention of the comment to the text, namely an attention coding module for commenting to the text, wherein the attention coding module for commenting to the text indicates which comment words are most similar to a certain text word;
the attention of the comments to the text indicates which comment words are most similar to some text word; the attention weights on the comment words are obtained by p = softmax(max_col(S)), where the maximum function max_col(S) is taken across the columns of S; the attended comment vector is then:
k̃ = Σ_t p_t K_:t
which is a weighted sum of the comment words most relevant to the text; this vector k̃ is tiled across all time steps to give K̃;
step3.4, fusing text and comment information through an interactive attention mechanism, mutually focusing attention on the text and the comment, and fusing the text and the comment information to obtain a representation of the comment containing the text information;
finally, the comment word embeddings and the text-comment interactive attention vectors are spliced together; the resulting matrix is denoted G, as shown in the formula:
G_:t = β(K_:t, L̃_:t, K̃_:t)
each column vector of G can be viewed as a text-aware representation of one comment word, where β is an arbitrary trainable neural network that fuses its 3 input vectors.
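The two attention directions of Step3 can be sketched as follows. A plain dot product stands in for the trainable α, and concatenation stands in for the trainable β; both are assumptions, not the patent's learned functions:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def interactive_attention(K, L):
    """Sketch of Step3. K is the d x T comment representation and L the
    d x J text representation (columns are word vectors). A dot product
    stands in for alpha; concatenation stands in for beta."""
    S = K.T @ L                        # S[t, j]: comment word t vs text word j
    a = softmax(S, axis=1)             # text-to-comment weights, rows sum to 1
    L_att = L @ a.T                    # attended text vector per comment word
    p = softmax(S.max(axis=1))         # comment-to-text weights via column max
    k_att = K @ p                      # single vector over the comment words
    T = K.shape[1]
    K_att = np.tile(k_att[:, None], (1, T))          # tile across time steps
    return np.concatenate([K, L_att, K_att], axis=0) # beta as concatenation

rng = np.random.default_rng(0)
K = rng.standard_normal((16, 6))   # 6 comment words, d = 16
L = rng.standard_normal((16, 9))   # 9 text words
G = interactive_attention(K, L)    # one fused column per comment word
```

Each column of `G` stacks a comment word with the text it attends to and the globally relevant comment summary, matching the three inputs of β above.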
As a further scheme of the present invention, the Step4 specifically comprises the following steps:
step4.1, performing linear transformation on the information characteristics of the text and the comments fused in the step3 to obtain the probabilities of four classes in each sentence of comment;
step4.2, making a binary decision for each of the four classes by passing its score through a Sigmoid function;
step4.3, if the Sigmoid output for a class is greater than 0.4, the comment belongs to that class; otherwise it does not;
step4.4, map the comments to the corresponding tags, identifying the aspect level views of the case.
Through this process of mutual attention between text and comments, the matrix G obtained by fusing the text-comment interactive attention information is a comment representation containing the text information. This representation is passed through a linear transformation to 4 dimensions and a non-linear activation function, giving the 4-class vector F. Each class is then binarized through a Sigmoid function, which maps the 4 class scores into (0, 1) to give the predicted values P. The classification process is shown by the following two formulas:
F=tanh(Linear(G))
P=Sigmoid(F)
for the predicted values P, an output value greater than 0.4 is classified as positive, i.e. the comment belongs to the corresponding class; otherwise it does not. Each comment may belong to one or more classes, which realizes multi-label identification of microblog case aspect-level viewpoints.
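The classification head above can be sketched directly from the two formulas; `W` and `b` are hypothetical stand-ins for the learned linear layer:

```python
import numpy as np

def classify(g, W, b, threshold=0.4):
    """Sketch of Step4: F = tanh(Linear(G)), P = Sigmoid(F), then keep
    every class whose sigmoid output exceeds the 0.4 threshold.
    g is a fused comment vector; W and b stand in for learned weights."""
    F = np.tanh(g @ W + b)               # linear transformation to 4 classes
    P = 1.0 / (1.0 + np.exp(-F))         # map each class score into (0, 1)
    return (P > threshold).astype(int)   # independent multi-label decisions

g = np.array([1.0, -1.0])                # toy fused representation
W = np.array([[2.0, -2.0, 0.5, 0.0],     # hypothetical learned weights
              [0.0,  0.0, 0.0, 0.0]])
labels = classify(g, W, np.zeros(4))
```

Because each class is thresholded independently, any subset of the four classes can fire for one comment, which is what makes the classifier multi-label.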
The invention has the beneficial effects that: the method comprises the steps of firstly coding texts and comments respectively based on a Transformer framework, then realizing the fusion of text information and comment information based on an interactive attention mechanism, and realizing the recognition of comment text case aspect-level viewpoints based on the fused characteristics. By adopting the interactive attention and the microblog text information, the accuracy of case-aspect-level viewpoint identification can be remarkably improved.
The method solves the problem of poor microblog case aspect-level viewpoint identification performance, and has a practical effect on microblog case aspect-level viewpoint identification by fusing text and comment information through an interactive attention mechanism.
Drawings
FIG. 1 is a general flow diagram of the present invention;
FIG. 2 is a general model diagram of the present invention;
FIG. 3 is a schematic diagram of a text comment interactive attention-coding network in accordance with the present invention.
Detailed Description
Example 1: as shown in fig. 1-3, a microblog case aspect-level viewpoint identification method based on interactive attention of text comments proceeds through Step1 to Step4 exactly as described above: corpus construction and dictionary building, multi-head attention encoding of texts and comments, interactive attention fusion, and multi-label classification.
The present invention uses Hamming Loss (HL) to estimate the accuracy of the model. HL counts the misclassified label slots, i.e. predicted labels that do not belong to a sample and labels that belong to a sample but are not predicted; the smaller the Hamming Loss, the better the performance. For a test set S = {(x_1, Y_1), (x_2, Y_2), ..., (x_n, Y_n)}, the loss is calculated as follows:
HL = (1 / (n · L)) Σ_{i=1}^{n} Σ_{j=1}^{L} XOR(Y_{i,j}, P_{i,j})
where n is the number of samples, L is the number of labels, Y_{i,j} is the true value of the j-th component of the i-th sample, and P_{i,j} is the predicted value of the j-th component of the i-th prediction. XOR is the exclusive-or: XOR(0,1) = XOR(1,0) = 1 and XOR(0,0) = XOR(1,1) = 0.
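The Hamming Loss definition above translates into a few lines; this is a minimal sketch with toy labels, not the patent's evaluation script:

```python
def hamming_loss(Y, P):
    """Fraction of the n x L label slots where prediction and truth
    disagree (the XOR in the formula above)."""
    n, L = len(Y), len(Y[0])
    wrong = sum(y != p                       # XOR on 0/1 labels
                for row_y, row_p in zip(Y, P)
                for y, p in zip(row_y, row_p))
    return wrong / (n * L)

# 2 samples x 4 labels; exactly two slots disagree
hl = hamming_loss([[1, 0, 1, 0], [0, 1, 0, 0]],
                  [[1, 0, 0, 0], [0, 1, 0, 1]])
```

With two of eight slots wrong, `hl` is 0.25; a perfect model scores 0.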
Precision (P), Recall (R) and the F1 value (F1-score, F1) are adopted as the evaluation indexes of the model.
Weighted-F1: the invention uses the weighted F1 value. P, R and F1 are calculated for each class and then averaged to give an F1 value over the entire sample, with the average weighted by the number of samples in each class.
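The Weighted-F1 computation can be sketched as follows; label-wise counts are an assumption about how the per-class statistics are gathered in a multi-label setting:

```python
def weighted_f1(Y, P):
    """Per-class F1 from label-wise counts, averaged with each class
    weighted by its support (number of true samples), as described above."""
    L = len(Y[0])
    f1s, supports = [], []
    for j in range(L):
        tp = sum(y[j] and p[j] for y, p in zip(Y, P))          # true positives
        fp = sum((not y[j]) and p[j] for y, p in zip(Y, P))    # false positives
        fn = sum(y[j] and (not p[j]) for y, p in zip(Y, P))    # false negatives
        prec = tp / (tp + fp) if tp + fp else 0.0
        rec = tp / (tp + fn) if tp + fn else 0.0
        f1s.append(2 * prec * rec / (prec + rec) if prec + rec else 0.0)
        supports.append(sum(y[j] for y in Y))                  # class weight
    total = sum(supports)
    return sum(f * s for f, s in zip(f1s, supports)) / total if total else 0.0

score = weighted_f1([[1, 0], [1, 1]], [[1, 0], [0, 1]])
```

Weighting by support keeps a rare class (like class2 in Table 2) from dominating the average the way a plain macro mean would.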
To clarify how much the text-comment interactive attention network contributes to the model, we removed it and compared the results with those of the full model, as shown in Table 1.
TABLE 1 ablation test results
| Number | Model structure | P | R | F1 |
|---|---|---|---|---|
| 1 | The method of the invention | 0.7349 | 0.7122 | 0.6603 |
| 2 | (-) Interactive attention network | 0.6708 | 0.6832 | 0.6080 |
The "(-) interactive attention network" row is the model with the text-comment interactive attention network removed. Table 1 shows that, compared with the multi-head attention mechanism alone, the interactive attention network fused with text information improves classification precision by 6.4 percentage points, recall by 2.9 points, and the F1 value by 5.2 points.
To evaluate the recognition effect on each class, we also calculated the precision, recall and F1 value of each class. The experimental results are shown in Table 2.
TABLE 2 Recognition effect of each class

| Numbering | Categories | P | R | F1 |
|---|---|---|---|---|
| 1 | class1 | 0.8053 | 0.9229 | 0.8601 |
| 2 | class2 | 0.8222 | 0.1588 | 0.2622 |
| 3 | class3 | 0.6450 | 0.7842 | 1.0000 |
| 4 | class4 | 0.6489 | 0.2837 | 0.3948 |
In Table 2, class1, class2, class3 and class4 represent four different categories. In our dataset, the number of comments containing class1 and class3 is larger, the number containing class2 is smaller, and the number containing class4 is about half. Analysis of Table 2 shows that the precision of class1 and class2 is higher, while the recall and F1 values of class1 and class3 are higher.
To further evaluate the effect of the text comment interactive attention-based microblog case aspect-level viewpoint identification model, experiments were performed on the data set provided by the invention using different baseline models and compared with the proposed method. The experimental results are shown in Table 3.
TABLE 3 model comparison experiment
| Modeling method | P | R | F1 |
|---|---|---|---|
| CNN | 0.6958 | 0.7014 | 0.6385 |
| CNN-RNN | 0.6831 | 0.6783 | 0.6206 |
| Transformer | 0.6669 | 0.6847 | 0.6096 |
| The method of the invention | 0.7349 | 0.7122 | 0.6603 |
The baseline models used for comparison do not fuse the text information; they classify the comments alone. Analysis of Table 3 shows that:
(1) Compared with CNN, the method of the invention improves precision by 3.9%, recall by 1.08% and the F1 value by 2.1%.
(2) Compared with CNN-RNN, the method improves precision by 5.1%, recall by 3.4% and the F1 value by 3.9%.
(3) Compared with the Transformer, the method improves precision by 6.8%, recall by 2.7% and the F1 value by 5%.
In conclusion, fusing the text and comment information through the interactive attention mechanism has a practical effect on microblog case aspect-level viewpoint identification.
While the present invention has been described in detail with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art.
Claims (9)
1. A microblog case aspect-level viewpoint identification method based on text comment interactive attention is characterized by comprising the following steps of:
step1, constructing a microblog case corpus, and then segmenting words and constructing a dictionary for the corpus;
step2, encoding the microblog texts and comments by using a multi-head attention mechanism;
step3, fusing the information of the texts and the comments through an interactive attention mechanism;
step4, identifying aspect level views of microblog cases using a multi-label classifier.
2. The microblog case aspect-level viewpoint identification method based on text comment interactive attention as recited in claim 1, wherein the specific steps of Step1 are as follows:
step1.1, firstly, crawling microblog case linguistic data from the Internet by utilizing a web crawler program;
step1.2, filtering and denoising the crawled microblog case linguistic data to construct a microblog case data set;
and Step1.3, extracting the comments related to the microblog cases from the Step1.2 data set, matching them to the texts to which they belong, and marking corresponding labels to form the microblog case corpus through manual processing; then segmenting the comments with a Chinese word segmentation tool and constructing a dictionary.
3. The microblog case aspect-level viewpoint identification method based on text comment interactive attention as recited in claim 2, wherein the specific steps of Step1.3 are as follows:
step1.3.1, extracting the comments related to the microblog cases from the step1.2 data set, and corresponding to the texts;
step1.3.2, marking each comment with a corresponding label;
step1.3.3, segmenting the comments by using a Chinese segmentation tool, and inputting all the comments according to batches until all the comments are input;
step1.3.4, constructing a dictionary from the words obtained in Step1.3.3: first construct an empty dictionary, then input each word in turn; if the dictionary does not contain the word, add it, otherwise skip to the next word, until all words are processed.
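A minimal sketch of the Step1.3.4 dictionary construction; the reserved special tokens and the sample segmented comments are assumptions for illustration:

```python
def build_vocab(tokenized_comments, specials=("<pad>", "<unk>")):
    """Build a word-to-index dictionary: start empty, add each unseen
    word in order of first appearance, skip words already present."""
    vocab = {}
    for tok in specials:          # reserved ids, an assumed convention
        vocab[tok] = len(vocab)
    for comment in tokenized_comments:
        for word in comment:
            if word not in vocab:
                vocab[word] = len(vocab)
    return vocab

# Toy segmented comments (Chinese word segmentation done beforehand).
comments = [["案件", "判决", "公正"], ["判决", "太", "轻"]]
vocab = build_vocab(comments)
```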
4. The microblog case aspect-level viewpoint identification method based on text comment interactive attention as recited in claim 1, wherein the specific steps of Step2 are as follows:
step2.1, using the text and comments of the microblog case as input at two ends of the model, adopting the same coding mode for the text and the comments, and expressing each sentence into an embedded matrix about the sentence;
step2.2, coding the microblog texts and the comments by adopting a multi-head attention mechanism, so that each word and all the words in the sentence are concerned by calculation, and the representation of the microblog case texts and the comments is obtained.
5. The microblog case aspect-level viewpoint identification method based on text comment interactive attention as recited in claim 4, wherein in Step2.1, the text and the comments of the microblog case are used as the two inputs of the coding end; assuming a sentence containing n words, the sentence X is represented by the following formula:
X=(x1,x2,...,xn)
after the sentence is embedded, the expression formula of the word embedding sequence is as follows:
E=(w1,w2,...,wn)
E is the sentence represented as a two-dimensional embedding matrix that ties all the word embeddings of the sentence together, with dimensions n × d, where n is the number of words and d is the embedding dimension; at this point each element in E is still independent of the others.
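A toy sketch of this embedding step; the vocabulary, the dimension d = 8 and the random table (standing in for trained embeddings) are all assumptions:

```python
import numpy as np

# Stand-in vocabulary and embedding table.
vocab = {"<unk>": 0, "案件": 1, "判决": 2, "公正": 3}
d = 8
rng = np.random.default_rng(0)
table = rng.standard_normal((len(vocab), d))

def embed(sentence):
    """Map a tokenized sentence X = (x1, ..., xn) to the n x d matrix E."""
    ids = [vocab.get(w, vocab["<unk>"]) for w in sentence]
    return table[ids]             # one embedding row per word

E = embed(["案件", "判决", "公正"])   # E has shape (3, 8): n = 3 words, d = 8
```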
6. The microblog case aspect-level viewpoint identification method based on text comment interactive attention as recited in claim 4, wherein in Step2.2, each text sequence is read using a multi-head attention mechanism and the attention between each word and all words is calculated; the two-dimensional embedding matrix E is converted into Q, K and V, which are linearly transformed and input into a multi-head attention mechanism with 8 heads; finally the output values of all the heads are spliced and converted by a linear conversion layer into a single output of the same size as one head, with the specific calculation formula as follows:
A=Linear(Multihead(Q,K,V))
the matrix A represents the representation of the text sequence obtained by multi-head attention encoding; Q, K and V are the query, key and value inputs obtained from E;
and encoding the text and the comments by adopting a multi-head attention mechanism, so that each word in the sentence and all the words are concerned by calculation, and the representation of the text and the comments of the microblog case is obtained.
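The encoding of claim 6 can be sketched with NumPy as follows; the random projection weights stand in for trained parameters, and only the 8-head count and the final linear layer follow the text:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(E, num_heads=8, seed=0):
    """Project E to Q, K, V; run scaled dot-product attention per head so
    each word attends to all words; splice the heads and apply a final
    linear layer: A = Linear(Multihead(Q, K, V))."""
    n, d = E.shape
    assert d % num_heads == 0
    dk = d // num_heads
    rng = np.random.default_rng(seed)
    Wq, Wk, Wv, Wo = (rng.standard_normal((d, d)) / np.sqrt(d) for _ in range(4))
    Q, K, V = E @ Wq, E @ Wk, E @ Wv
    heads = []
    for h in range(num_heads):
        s = slice(h * dk, (h + 1) * dk)
        scores = Q[:, s] @ K[:, s].T / np.sqrt(dk)   # each word vs all words
        heads.append(softmax(scores) @ V[:, s])
    return np.concatenate(heads, axis=1) @ Wo        # linear conversion layer

E = np.random.default_rng(1).standard_normal((5, 64))  # n = 5 words, d = 64
A = multi_head_self_attention(E)                       # A keeps shape (5, 64)
```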
7. The microblog case aspect-level viewpoint identification method based on text comment interactive attention as recited in claim 1, wherein the specific steps of Step3 are as follows:
step3.1, firstly, calculating the similarity between the text matrix and the comment matrix;
The representations of the text and the comments obtained through multi-head attention encoding are denoted by the matrices L and K respectively, and the attention is calculated in two directions: from comment to text and from text to comment. Both come from a shared similarity matrix between the context-embedded representations of the text and the comments, calculated as follows:

Stj = α(K:t, L:j)

where Stj represents the similarity between the t-th comment word and the j-th text word, α is a trainable scalar function over the two input vector representations of the encoded text and comments, K:t is the t-th column vector of K, and L:j is the j-th column vector of L;
step3.2, calculating the attention from the text to the comment, namely an attention coding module from the text to the comment, wherein the text to comment coding module shows which text words are most relevant to each comment word;
The text-to-comment attention indicates which text words are most relevant to each comment word. Let at denote the attention weights over the text words for the t-th comment word, with Σj atj = 1 for all t. The attention weights are calculated by at = softmax(St:), and each text vector participating in the attention is expressed as shown in the formula:

L̂:t = Σj atj L:j
step3.3, calculating the comment-to-text attention, namely the comment-to-text attention coding module, which indicates which comment words are most similar to a certain text word;
The comment-to-text attention obtains the attention weights over the comment words by p = softmax(maxcol(S)), where the maximum function maxcol(S) is taken across the columns of S. The comment vector participating in the attention is then expressed as shown in the formula:

k̂ = Σt pt K:t

k̂ represents the weighted sum of the comment words most relevant to the text; it is tiled over all positions to represent this comment vector;
step3.4, fusing text and comment information through an interactive attention mechanism, mutually focusing attention on the text and the comment, and fusing the text and the comment information to obtain a representation of the comment containing the text information;
and finally, the comment word embeddings and the text-comment interactive attention vectors are spliced, and the obtained matrix is represented by G, as shown in the formula:

G:t = β(K:t, L̂:t, k̂)

where L̂:t is the attended text vector for the t-th comment word and k̂ is the attended comment vector obtained above. Each column vector in G can be viewed as a text-aware representation of the corresponding comment word, and the β function is an arbitrary trainable neural network that fuses the 3 input vectors.
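A shape-level sketch of this interactive (BiDAF-style) attention, under simplifying assumptions: the similarity α is taken as a plain dot product, the fusion β as concatenation, and rows rather than columns index words:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def interactive_attention(K, L):
    """K: comment encoding (T, d); L: text encoding (J, d).
    Returns G, a text-aware representation of each comment word."""
    S = K @ L.T                              # S[t, j]: comment word t vs text word j
    attended_text = softmax(S, axis=1) @ L   # which text words matter per comment word
    p = softmax(S.max(axis=1))               # max across columns, softmax over comment words
    c = p @ K                                # single attended comment vector
    C = np.tile(c, (K.shape[0], 1))          # tiled over all comment positions
    return np.concatenate([K, attended_text, C], axis=1)  # beta = concatenation

rng = np.random.default_rng(0)
K = rng.standard_normal((4, 16))             # 4 comment words
L = rng.standard_normal((6, 16))             # 6 text words
G = interactive_attention(K, L)              # shape (4, 48)
```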
8. The microblog case aspect-level viewpoint identification method based on text comment interactive attention as recited in claim 1, wherein the specific steps of Step4 are as follows:
step4.1, performing linear transformation on the information characteristics of the text and the comments fused in the step3 to obtain the probabilities of four classes in each sentence of comment;
step4.2, obtaining two classifications of each of the four classifications according to the probability of each classification through a Sigmoid function;
step4.3, if the classification result is greater than 0.4, the comment belongs to the class; otherwise it does not;
step4.4, map the comments to the corresponding tags, identifying the aspect level views of the case.
9. The microblog case aspect-level viewpoint identification method based on text comment interactive attention as recited in claim 1, wherein Step4 comprises the following steps:
through the process in which the text and comments attend to each other, the matrix G obtained by fusing the text-comment interactive attention information is a comment representation containing the text information; this representation is linearly transformed through 4 layers with a nonlinear activation function to obtain vectors of the 4 classes; then each class is binarily classified through a Sigmoid function, mapping the vectors of the 4 classes to (0,1) to obtain the predicted value P; the classification process is shown by the following two formulas:
F=tanh(Linear(G))
P=Sigmoid(F)
and for the predicted value P, when an output value is greater than 0.4 it is classified as positive, that is, the comment belongs to the corresponding class, and otherwise it does not; each comment belongs to one or more classes, thereby realizing identification of microblog case aspect-level viewpoints.
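The classification head of claim 9 reduces to a few lines; the weight shapes are illustrative stand-ins for trained parameters, while the tanh/sigmoid pair and the 0.4 threshold follow the text:

```python
import numpy as np

def predict_labels(g, W, b, threshold=0.4):
    """F = tanh(Linear(G)); P = Sigmoid(F); a class is assigned whenever
    its sigmoid output exceeds the threshold, so a comment may receive
    one or more of the 4 labels."""
    F = np.tanh(W @ g + b)
    P = 1.0 / (1.0 + np.exp(-F))
    return (P > threshold).astype(int), P

rng = np.random.default_rng(0)
g = rng.standard_normal(32)                   # fused comment representation
W, b = rng.standard_normal((4, 32)), np.zeros(4)
labels, probs = predict_labels(g, W, b)       # labels is a 0/1 vector of length 4
```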
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110163045.0A CN112926336A (en) | 2021-02-05 | 2021-02-05 | Microblog case aspect-level viewpoint identification method based on text comment interactive attention |
Publications (1)
Publication Number | Publication Date |
---|---|
CN112926336A true CN112926336A (en) | 2021-06-08 |
Family
ID=76170853
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110163045.0A Pending CN112926336A (en) | 2021-02-05 | 2021-02-05 | Microblog case aspect-level viewpoint identification method based on text comment interactive attention |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112926336A (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111581474A (en) * | 2020-04-02 | 2020-08-25 | 昆明理工大学 | Evaluation object extraction method of case-related microblog comments based on multi-head attention system |
CN111680154A (en) * | 2020-04-13 | 2020-09-18 | 华东师范大学 | Comment text attribute level emotion analysis method based on deep learning |
CN112287197A (en) * | 2020-09-23 | 2021-01-29 | 昆明理工大学 | Method for detecting sarcasm of case-related microblog comments described by dynamic memory cases |
Non-Patent Citations (1)
Title |
---|
顾健伟 et al.: "Machine Reading Comprehension Based on the Combination of Bi-Directional Attention Flow and Self-Attention", Journal of Nanjing University (Natural Science) * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116541523A (en) * | 2023-04-28 | 2023-08-04 | 重庆邮电大学 | Legal judgment public opinion classification method based on big data |
CN116541523B (en) * | 2023-04-28 | 2024-08-16 | 芽米科技(广州)有限公司 | Legal judgment public opinion classification method based on big data |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | PB01 | Publication | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20210608 |