CN112926336A - Microblog case aspect-level viewpoint identification method based on text comment interactive attention - Google Patents


Info

Publication number
CN112926336A
Authority
CN
China
Prior art keywords
text
comment
comments
microblog
attention
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202110163045.0A
Other languages
Chinese (zh)
Inventor
余正涛
段玲
郭军军
相艳
黄于欣
线岩团
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Kunming University of Science and Technology
Original Assignee
Kunming University of Science and Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Kunming University of Science and Technology filed Critical Kunming University of Science and Technology
Priority to CN202110163045.0A priority Critical patent/CN112926336A/en
Publication of CN112926336A publication Critical patent/CN112926336A/en
Pending legal-status Critical Current

Classifications

    • G PHYSICS; G06 COMPUTING; G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 40/30: Semantic analysis (handling natural language data)
    • G06F 16/35: Clustering; classification (information retrieval of unstructured textual data)
    • G06F 16/951: Indexing; web crawling techniques (retrieval from the web)
    • G06F 18/22: Matching criteria, e.g. proximity measures (pattern recognition; analysing)
    • G06F 40/126: Character encoding (text processing; use of codes for handling textual entities)
    • G06F 40/216: Parsing using statistical methods (natural language analysis)
    • G06F 40/242: Dictionaries (lexical tools)
    • G06F 40/289: Phrasal analysis, e.g. finite state techniques or chunking (recognition of textual entities)

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Computation (AREA)
  • Evolutionary Biology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Machine Translation (AREA)

Abstract

The invention relates to a microblog case aspect-level viewpoint identification method based on text-comment interactive attention, belonging to the technical field of natural language processing. Texts and related comments for a number of hot cases are crawled from the microblog platform to form a data set, which is then word-segmented to build a dictionary. The method first encodes the text and the comments separately, then fuses the text information and the comment information through an interactive attention mechanism, and finally identifies case aspect-level viewpoints in the comments from the fused features. The method effectively improves the accuracy of case aspect-level viewpoint identification and addresses the otherwise poor identification performance on microblog case data.

Description

Microblog case aspect-level viewpoint identification method based on text comment interactive attention
Technical Field
The invention relates to a microblog case aspect-level viewpoint identification method based on text comment interactive attention, and belongs to the technical field of natural language processing.
Background
In recent years, netizens have paid growing attention to legal cases: microblog media report on such cases more and more frequently, and many netizens comment on them. The news texts themselves also contain viewpoints from different parties. Faced with this huge volume of data, grasping public-opinion trends by manually reading large numbers of comments is impractical, and the public and judicial authorities care most about netizen viewpoints on particular aspects of a case. Research on case-related aspect-level viewpoint identification for microblog data is therefore of real significance for rapidly grasping the state of online opinion. However, the form and expression of microblog data are flexible and highly variable, which makes aspect-level viewpoint judgment difficult for traditional natural language processing methods. In fact, a microblog text is a statement of the case facts that covers descriptions of its various aspects, and microblog comments are mostly discussions that unfold around that text, so combining the text's information with the comments can assist the understanding of case-related comment text.
Disclosure of Invention
The invention provides a microblog case aspect-level viewpoint identification method based on text-comment interactive attention. It fuses text information with comment information to enrich the semantic representation of the comments, identifies case aspect-level viewpoints in comment text from the fused features, and thereby addresses the poor aspect-level viewpoint identification performance caused by insufficient or ambiguous comment information.
The technical scheme of the invention is as follows: a microblog case aspect-level viewpoint identification method based on text-comment interactive attention. Texts and comments for a number of hot cases are crawled from the microblog platform with a crawler to build a microblog case corpus; the corpus is preprocessed (word segmentation and similar steps) and a dictionary is constructed. For the problem of aspect-level viewpoint recognition in case-related microblog documents, a case aspect-level viewpoint recognition method based on text-comment interactive attention is proposed, which achieves aspect-level viewpoint recognition by fusing the contextual information of social media.
The method comprises the following steps:
step1, acquiring relevant texts and comments of the hot cases from the microblog, constructing a microblog case corpus, and then segmenting the corpus to construct a dictionary;
step2, encoding the microblog texts and comments by using a multi-head attention mechanism;
step3, fusing the information of the texts and the comments through an interactive attention mechanism;
step4, identifying aspect level views of microblog cases using a multi-label classifier.
As a further scheme of the present invention, the Step1 specifically comprises the following steps:
Step1.1, first, crawl microblog case corpora from the Internet with a web crawler program;
Step1.2, filter and denoise the crawled corpora to build a microblog case data set;
Step1.3, extract the comments related to each microblog case from the Step1.2 data set, match each comment to the text it belongs to, and mark the corresponding labels; this manual processing yields the microblog case corpus. Then segment the comments with a Chinese word-segmentation tool and construct a dictionary.
As a further scheme of the invention, the step Step1.3 comprises the following specific steps:
Step1.3.1, extract the comments related to the microblog cases from the Step1.2 data set and match them to their texts;
Step1.3.2, mark each comment with its corresponding labels;
Step1.3.3, segment the comments with a Chinese word-segmentation tool, processing the comments batch by batch until all of them have been segmented;
Step1.3.4, build a dictionary from the words obtained in Step1.3.3: start from an empty dictionary and feed in each word in turn; if the dictionary does not yet contain the word, add it, otherwise skip to the next word, until all words have been processed.
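The dictionary construction of Step1.3.4 can be sketched as follows (a minimal illustration on already-segmented English stand-in tokens; the patent's setting would use a Chinese word-segmentation tool such as jieba, which is an assumption here):

```python
def build_vocab(segmented_comments):
    """Build a word -> index dictionary from tokenized comments (Step1.3.4).

    Starts from an empty dict; a word is added only the first time it is
    seen, later occurrences are skipped, mirroring the procedure in the text.
    """
    vocab = {}
    for comment in segmented_comments:
        for word in comment:
            if word not in vocab:          # add unseen words only
                vocab[word] = len(vocab)   # next free index
    return vocab

# toy usage with already-segmented comments
comments = [["the", "verdict", "is", "fair"], ["fair", "trial"]]
vocab = build_vocab(comments)
```

The second occurrence of "fair" is skipped, so the dictionary ends up with five entries indexed in first-seen order.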
As a further scheme of the present invention, the Step2 specifically comprises the following steps:
Step2.1, the text and the comments of a microblog case are the two inputs of the model; the same encoding is applied to both, and each sentence is represented as an embedding matrix;
The text and the comments of the microblog case are the two inputs of the encoding end. Assume a sentence contains n words; the sentence X is represented as:
X = (x_1, x_2, ..., x_n)
After word embedding, the sentence is expressed as the embedding sequence:
E = (w_1, w_2, ..., w_n)
E represents the sentence as a two-dimensional embedding matrix: all the word embeddings of the sentence are stacked together with dimension n × d, where n is the number of words and d is the embedding dimension; each element of the sequence E is independent of the others;
Step2.2, encode the microblog text and the comments with a multi-head attention mechanism, so that the attention between each word and every word in the sentence is computed, yielding representations of the microblog case text and comments;
Each text sequence is read with a multi-head attention mechanism, computing the attention between every word and all words. The two-dimensional embedding matrix E is linearly transformed into the query, key and value matrices Q, K and V, which are fed into a multi-head attention mechanism with 8 heads; the outputs of all heads are finally concatenated and mapped by a linear layer back to a single output of the same size as one head. The computation is:
A = Linear(MultiHead(Q, K, V))
where the matrix A is the representation of the text sequence obtained by multi-head attention encoding, and Q, K, V are the query, key and value matrices.
Both the text and the comments are encoded in this way, so that every word in a sentence attends to all the words, giving the representations of the microblog case text and comments.
What distinguishes multi-head attention is that the computation is performed several times in parallel, which lets the model learn related information in different representation subspaces. Every word in the sentence attends to all the words, and dependencies are computed directly regardless of the distance between words, so the internal structure of a sentence is learned without depending on the computation of a previous time step, which allows good parallelism. Macroscopically, multi-head attention can be understood as mapping a query against a series of key-value pairs. Each word is embedded to give the sentence's embedding matrix, which is linearly transformed into the corresponding query (Q), key (K) and value (V). Q, K and V each go through their own linear transformation before the scaled dot product, h times in total (hence "multi-head"); parameters are not shared between heads, so the linear transformations of Q, K, V differ per head. The h scaled dot-product attention results are then concatenated, and one more linear transformation gives the result of the multi-head attention mechanism, as shown in the following formula:

MultiHead(Q, K, V) = Concat(head_1, head_2, ..., head_h) W^O

wherein:

head_i = Attention(Q W_i^Q, K W_i^K, V W_i^V)

with parameter matrices W_i^Q ∈ R^(d_model × d_k), W_i^K ∈ R^(d_model × d_k), W_i^V ∈ R^(d_model × d_v) and W^O ∈ R^(h·d_v × d_model).

In this work, h = 8 parallel scaled dot-product attention heads are used, with d_k = d_v = d_model / h for each head.

The queries Q have dimension d_k and the values V dimension d_v. The dot products of Q with all keys K are computed, each divided by √d_k, and the softmax function is applied to obtain the weights on the values, as shown in the formula:

Attention(Q, K, V) = softmax(Q K^T / √d_k) V

where √d_k, the square root of the key-vector dimension, is a scaling factor that keeps the inner products from growing too large.
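The multi-head computation described above can be sketched in NumPy (a single-sentence illustration; h = 8 and d_k = d_v = d_model/h follow the text, while d_model = 16 and the random projection matrices are stand-ins for trained parameters):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_attention(E, h=8, seed=0):
    """Self-attention over one sentence embedding matrix E (n x d_model)."""
    n, d_model = E.shape
    d_k = d_model // h                       # d_k = d_v = d_model / h
    rng = np.random.default_rng(seed)
    heads = []
    for _ in range(h):                       # per-head projections, not shared
        Wq, Wk, Wv = (rng.standard_normal((d_model, d_k)) for _ in range(3))
        Q, K, V = E @ Wq, E @ Wk, E @ Wv
        scores = Q @ K.T / np.sqrt(d_k)      # scaled dot product
        heads.append(softmax(scores) @ V)    # attention-weighted values
    Wo = rng.standard_normal((h * d_k, d_model))
    return np.concatenate(heads, axis=-1) @ Wo   # concat heads + final linear

E = np.random.default_rng(1).standard_normal((5, 16))  # 5 words, d_model = 16
A = multi_head_attention(E)
```

The output A keeps the input shape (n × d_model), matching the formula A = Linear(MultiHead(Q, K, V)) in the text.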
As a further scheme of the present invention, the Step3 specifically comprises the following steps:
Step3.1, first compute the similarity between the text matrix and the comment matrix;
The representations of the text and the comments obtained by multi-head attention encoding are denoted by the matrices L and K respectively. Attention is computed in two directions, from comment to text and from text to comment, and both come from a shared similarity matrix between the context-embedded text and comments, computed as follows:

S_tj = α(K_:t, L_:j)

where S_tj is the similarity between the t-th comment word and the j-th text word, α is a trainable scalar function that measures the similarity between its two encoded input vectors, K_:t is the t-th column vector of K, and L_:j is the j-th column vector of L;
Step3.2, compute the text-to-comment attention, i.e. the text-to-comment attention encoding module, which indicates which text words are most relevant to each comment word;
Let a_t denote the attention weights of the text words for the t-th comment word, with Σ_j a_tj = 1 for all t; the weights are obtained by a_t = softmax(S_t:). Each attended text vector is then expressed as shown in the formula:

L̃_:t = Σ_j a_tj L_:j

i.e. the text vector attended to jointly with all the comment words;
Step3.3, compute the comment-to-text attention, i.e. the comment-to-text attention encoding module, which indicates which comment words are most similar to some text word;
The attention weights on the comment words are obtained by p = softmax(max_col(S)), where the maximum function max_col(S) is taken across the columns of S. The attended comment vector is then expressed as shown in the formula:

k̃ = Σ_t p_t K_:t

which is the weighted sum of the comment words most relevant to the text; this vector is tiled across all time steps, and K̃_:t is used to denote the comment vector at each step;
Step3.4, fuse the text and comment information through the interactive attention mechanism: the text and the comments attend to each other, and fusing their information yields a representation of the comments that contains the text information;
Finally, the comment word embeddings and the text-comment interactive attention vectors are spliced together; the resulting matrix is denoted G, as shown in the formula:

G_:t = β(K_:t, L̃_:t, K̃_:t)

Each column vector of G can be seen as a text-aware representation of one comment word, where β is an arbitrary trainable neural network that fuses its 3 input vectors.
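The two attention directions and the β fusion can be sketched as follows (a NumPy illustration; the trainable α is replaced by a plain dot product and β by simple concatenation, both assumptions made for clarity):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def interactive_attention(K, L):
    """BiDAF-style text/comment interactive attention.

    K: encoded comments, shape (d, T) -- one column per comment word.
    L: encoded text,     shape (d, J) -- one column per text word.
    """
    S = K.T @ L                                  # (T, J): S[t, j] similarity
    # text-to-comment: attended text vector for each comment word
    a = softmax(S, axis=1)                       # a_t = softmax(S_t:)
    L_att = (a @ L.T).T                          # (d, T)
    # comment-to-text: weights from the column-wise max of S
    p = softmax(S.max(axis=1))                   # (T,)
    k_att = K @ p                                # (d,) most relevant comment mix
    K_att = np.tile(k_att[:, None], (1, K.shape[1]))  # tiled to (d, T)
    # beta fusion: here plain concatenation of the 3 vectors per comment word
    return np.concatenate([K, L_att, K_att], axis=0)  # G: (3d, T)

K = np.random.default_rng(0).standard_normal((4, 3))  # d=4, T=3 comment words
L = np.random.default_rng(1).standard_normal((4, 5))  # J=5 text words
G = interactive_attention(K, L)
```

Each column of G stacks the comment word embedding with its two attended vectors, giving the text-aware comment representation the classifier consumes.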
As a further scheme of the present invention, the Step4 specifically comprises the following steps:
Step4.1, linearly transform the fused text-and-comment features from Step3 to obtain the probability of each of the four classes for every comment;
Step4.2, turn the probability of each class into a binary decision for each of the four classes with a Sigmoid function;
Step4.3, if the score for a class exceeds 0.4, the comment belongs to that class, otherwise it does not;
Step4.4, map the comments to the corresponding labels, identifying the aspect-level viewpoints of the case.
Through the mutual attention between text and comments, the matrix G obtained by fusing the interactive attention information is a comment representation that contains the text information. Passing this representation through a linear transformation to 4 dimensions and a non-linear activation function yields the 4-class vector F. Each class is then classified in a binary way by a sigmoid function, which maps the 4 class scores into (0, 1) to give the prediction P. The classification process is given by the following two formulas:
F=tanh(Linear(G))
P=Sigmoid(F)
For the prediction P, an output value greater than 0.4 is classified as positive, i.e. the comment belongs to the corresponding class; otherwise the comment does not belong to that class. A comment may belong to one or more classes, which realizes the identification of microblog case aspect-level viewpoints.
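A minimal sketch of this Step4 classification (the 4 classes and the 0.4 threshold come from the text; the feature size and the randomly initialized stand-ins for the trained linear layer are assumptions):

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def classify_aspects(g, W, b, threshold=0.4):
    """Map a fused comment representation g to multi-label aspect decisions.

    W and b project g to 4 class logits; tanh then sigmoid give a score in
    (0, 1) per class, thresholded at 0.4 as in the patent's scheme.
    """
    F = np.tanh(W @ g + b)               # F = tanh(Linear(G))
    P = sigmoid(F)                       # P = Sigmoid(F)
    return (P > threshold).astype(int)   # one decision per class; several may fire

rng = np.random.default_rng(0)
g = rng.standard_normal(12)              # fused feature vector (size assumed)
W, b = rng.standard_normal((4, 12)), rng.standard_normal(4)
labels = classify_aspects(g, W, b)
```

Because each class is thresholded independently, a comment can carry zero, one, or several aspect labels, matching the multi-label setting described above.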
The invention has the beneficial effects that: the method comprises the steps of firstly coding texts and comments respectively based on a Transformer framework, then realizing the fusion of text information and comment information based on an interactive attention mechanism, and realizing the recognition of comment text case aspect-level viewpoints based on the fused characteristics. By adopting the interactive attention and the microblog text information, the accuracy of case-aspect-level viewpoint identification can be remarkably improved.
The method solves the problem of poor microblog case aspect-level viewpoint identification performance, and has a practical effect on microblog case aspect-level viewpoint identification by fusing text and comment information through an interactive attention mechanism.
Drawings
FIG. 1 is a general flow diagram of the present invention;
FIG. 2 is a general model diagram of the present invention;
FIG. 3 is a schematic diagram of a text comment interactive attention-coding network in accordance with the present invention.
Detailed Description
Example 1: as shown in fig. 1-3, a microblog case aspect level viewpoint identification method based on interactive attention of text comments, the method includes:
step1, acquiring relevant texts and comments of the hot cases from the microblog, constructing a microblog case corpus, and then segmenting the corpus to construct a dictionary;
step2, encoding the microblog texts and comments by using a multi-head attention mechanism;
step3, fusing the information of the texts and the comments through an interactive attention mechanism;
step4, identifying aspect level views of microblog cases using a multi-label classifier.
As a further scheme of the present invention, the Step1 specifically comprises the following steps:
Step1.1, first, crawl microblog case corpora from the Internet with a web crawler program;
Step1.2, filter and denoise the crawled corpora to build a microblog case data set;
Step1.3, extract the comments related to each microblog case from the Step1.2 data set, match each comment to the text it belongs to, and mark the corresponding labels; this manual processing yields the microblog case corpus. Then segment the comments with a Chinese word-segmentation tool and construct a dictionary.
As a further scheme of the invention, the step Step1.3 comprises the following specific steps:
Step1.3.1, extract the comments related to the microblog cases from the Step1.2 data set and match them to their texts;
Step1.3.2, mark each comment with its corresponding labels;
Step1.3.3, segment the comments with a Chinese word-segmentation tool, processing the comments batch by batch until all of them have been segmented;
Step1.3.4, build a dictionary from the words obtained in Step1.3.3: start from an empty dictionary and feed in each word in turn; if the dictionary does not yet contain the word, add it, otherwise skip to the next word, until all words have been processed.
As a further scheme of the present invention, the Step2 specifically comprises the following steps:
Step2.1, the text and the comments of a microblog case are the two inputs of the model; the same encoding is applied to both, and each sentence is represented as an embedding matrix;
The text and the comments of the microblog case are the two inputs of the encoding end. Assume a sentence contains n words; the sentence X is represented as:
X = (x_1, x_2, ..., x_n)
After word embedding, the sentence is expressed as the embedding sequence:
E = (w_1, w_2, ..., w_n)
E represents the sentence as a two-dimensional embedding matrix: all the word embeddings of the sentence are stacked together with dimension n × d, where n is the number of words and d is the embedding dimension; each element of the sequence E is independent of the others;
Step2.2, encode the microblog text and the comments with a multi-head attention mechanism, so that the attention between each word and every word in the sentence is computed, yielding representations of the microblog case text and comments;
Each text sequence is read with a multi-head attention mechanism, computing the attention between every word and all words. The two-dimensional embedding matrix E is linearly transformed into the query, key and value matrices Q, K and V, which are fed into a multi-head attention mechanism with 8 heads; the outputs of all heads are finally concatenated and mapped by a linear layer back to a single output of the same size as one head. The computation is:
A = Linear(MultiHead(Q, K, V))
where the matrix A is the representation of the text sequence obtained by multi-head attention encoding, and Q, K, V are the query, key and value matrices.
Both the text and the comments are encoded in this way, so that every word in a sentence attends to all the words, giving the representations of the microblog case text and comments.
As a further scheme of the present invention, the Step3 specifically comprises the following steps:
Step3.1, first compute the similarity between the text matrix and the comment matrix;
The representations of the text and the comments obtained by multi-head attention encoding are denoted by the matrices L and K respectively. Attention is computed in two directions, from comment to text and from text to comment, and both come from a shared similarity matrix between the context-embedded text and comments, computed as follows:

S_tj = α(K_:t, L_:j)

where S_tj is the similarity between the t-th comment word and the j-th text word, α is a trainable scalar function that measures the similarity between its two encoded input vectors, K_:t is the t-th column vector of K, and L_:j is the j-th column vector of L;
Step3.2, compute the text-to-comment attention, i.e. the text-to-comment attention encoding module, which indicates which text words are most relevant to each comment word;
Let a_t denote the attention weights of the text words for the t-th comment word, with Σ_j a_tj = 1 for all t; the weights are obtained by a_t = softmax(S_t:). Each attended text vector is then expressed as shown in the formula:

L̃_:t = Σ_j a_tj L_:j

i.e. the text vector attended to jointly with all the comment words;
Step3.3, compute the comment-to-text attention, i.e. the comment-to-text attention encoding module, which indicates which comment words are most similar to some text word;
The attention weights on the comment words are obtained by p = softmax(max_col(S)), where the maximum function max_col(S) is taken across the columns of S. The attended comment vector is then expressed as shown in the formula:

k̃ = Σ_t p_t K_:t

which is the weighted sum of the comment words most relevant to the text; this vector is tiled across all time steps, and K̃_:t is used to denote the comment vector at each step;
Step3.4, fuse the text and comment information through the interactive attention mechanism: the text and the comments attend to each other, and fusing their information yields a representation of the comments that contains the text information;
Finally, the comment word embeddings and the text-comment interactive attention vectors are spliced together; the resulting matrix is denoted G, as shown in the formula:

G_:t = β(K_:t, L̃_:t, K̃_:t)

Each column vector of G can be seen as a text-aware representation of one comment word, where β is an arbitrary trainable neural network that fuses its 3 input vectors.
As a further scheme of the present invention, the Step4 specifically comprises the following steps:
Step4.1, linearly transform the fused text-and-comment features from Step3 to obtain the probability of each of the four classes for every comment;
Step4.2, turn the probability of each class into a binary decision for each of the four classes with a Sigmoid function;
Step4.3, if the score for a class exceeds 0.4, the comment belongs to that class, otherwise it does not;
Step4.4, map the comments to the corresponding labels, identifying the aspect-level viewpoints of the case.
Through the mutual attention between text and comments, the matrix G obtained by fusing the interactive attention information is a comment representation that contains the text information. Passing this representation through a linear transformation to 4 dimensions and a non-linear activation function yields the 4-class vector F. Each class is then classified in a binary way by a sigmoid function, which maps the 4 class scores into (0, 1) to give the prediction P. The classification process is given by the following two formulas:
F=tanh(Linear(G))
P=Sigmoid(F)
For the prediction P, an output value greater than 0.4 is classified as positive, i.e. the comment belongs to the corresponding class; otherwise the comment does not belong to that class. A comment may belong to one or more classes, which realizes the identification of microblog case aspect-level viewpoints.
The present invention uses the Hamming Loss (HL) to estimate the accuracy of the model. HL counts misclassified label slots, i.e. labels predicted for a sample that it does not carry, or labels of the sample that are not predicted; the smaller the Hamming Loss, the better the performance. For a test set S = {(x_1, Y_1), (x_2, Y_2), ..., (x_N, Y_N)}, the loss is calculated as follows:

HL = (1 / (N · L)) Σ_{i=1..N} Σ_{j=1..L} XOR(Y_{i,j}, P_{i,j})

where N is the number of samples, L is the number of labels, Y_{i,j} is the true value of the j-th component of the i-th sample, and P_{i,j} is the predicted value of the j-th component of the i-th prediction. XOR satisfies XOR(0,1) = XOR(1,0) = 1 and XOR(0,0) = XOR(1,1) = 0.
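A minimal sketch of the Hamming-loss computation on toy multi-label data (the label matrices below are illustrative, not the patent's data):

```python
def hamming_loss(Y, P):
    """Hamming loss over N samples and L labels: the fraction of label
    slots where prediction and ground truth disagree (the XOR count)."""
    N, L = len(Y), len(Y[0])
    wrong = sum(y != p for yi, pi in zip(Y, P) for y, p in zip(yi, pi))
    return wrong / (N * L)

# toy multi-label ground truth vs predictions (3 samples, 4 labels)
Y = [[1, 0, 0, 1], [0, 1, 0, 0], [1, 1, 0, 0]]
P = [[1, 0, 1, 1], [0, 0, 0, 0], [1, 1, 0, 0]]
loss = hamming_loss(Y, P)  # 2 wrong slots out of 12
```

Here one extra label and one missed label give 2 disagreements over 3 × 4 slots, so the loss is 1/6.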
Precision (P), Recall (R) and the F1 value (F1-score, F1) are adopted as the evaluation indexes of the model:

P = TP / (TP + FP)
R = TP / (TP + FN)
F1 = 2 · P · R / (P + R)

Weighted-F1: the invention uses the weighted F1 value (Weighted-F1). P, R and F1 are computed for each class and then averaged to give an F1 value over the whole sample set; on this basis, the average is weighted by the number of examples of each class.
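A small sketch of the weighted F1 computation (shown here in a single-label multi-class setting for brevity; the toy labels are illustrative):

```python
from collections import Counter

def prf(y_true, y_pred, cls):
    """Precision, recall and F1 for one class, treated one-vs-rest."""
    tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
    fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
    fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
    p = tp / (tp + fp) if tp + fp else 0.0
    r = tp / (tp + fn) if tp + fn else 0.0
    f1 = 2 * p * r / (p + r) if p + r else 0.0
    return p, r, f1

def weighted_f1(y_true, y_pred):
    """Per-class F1 averaged with weights proportional to class support."""
    support = Counter(y_true)
    n = len(y_true)
    return sum(support[c] / n * prf(y_true, y_pred, c)[2] for c in support)

y_true = [1, 1, 2, 2, 3, 3]
y_pred = [1, 2, 2, 2, 3, 1]
wf1 = weighted_f1(y_true, y_pred)
```

With equal class supports this reduces to the plain macro average of the per-class F1 values; unequal supports shift the weight toward the larger classes.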
To clarify how much the text-comment interactive attention encoding network contributes to the model, we removed it and compared the results with those of the full model, as shown in Table 1.
TABLE 1 Ablation test results

Number  Model structure                    P       R       F1
1       The method of the invention        0.7349  0.7122  0.6603
2       (-) Interactive attention network  0.6708  0.6832  0.6080
Here "(-) Interactive attention network" denotes the model with the text-comment interactive attention network removed. Analysis of Table 1 shows that, compared with the multi-head attention mechanism alone, the interactive attention network that fuses text information improves precision by 6.4%, recall by 2.9% and the F1 value by 5.2%.
To evaluate the recognition effect of each class, we also calculated the precision rate, recall rate, and F1 value for each class. The results of the experiment are shown in table 2.
TABLE 2 Recognition effect of each class

Number  Category  P       R       F1
1       class1    0.8053  0.9229  0.8601
2       class2    0.8222  0.1588  0.2622
3       class3    0.6450  0.7842  1.0000
4       class4    0.6489  0.2837  0.3948
In Table 2, class1, class2, class3 and class4 denote four different viewpoint categories. In our dataset, comments containing class1 and class3 are the most numerous, comments containing class2 are the fewest, and comments containing class4 account for roughly half. Analysis of Table 2 shows that class1 and class2 obtain higher precision, while class1 and class3 obtain higher recall and F1 values.
To further evaluate the effect of the text-comment interactive attention based microblog case aspect-level viewpoint identification model, experiments were performed on the data set provided by the invention using different baseline models and compared with the proposed method. The experimental results are shown in Table 3.
TABLE 3 Model comparison experiment

Model                        P       R       F1
CNN                          0.6958  0.7014  0.6385
CNN-RNN                      0.6831  0.6783  0.6206
Transformer                  0.6669  0.6847  0.6096
The method of the invention  0.7349  0.7122  0.6603
None of the baseline models fuses text information; they classify the comments alone. Analysis of Table 3 shows that:
(1) Compared with CNN, the method of the invention improves precision by 3.9%, recall by 1.08% and the F1 value by 2.1%.
(2) Compared with CNN-RNN, the method of the invention improves precision by 5.1%, recall by 3.4% and the F1 value by 3.9%.
(3) Compared with the Transformer, the method of the invention improves precision by 6.8%, recall by 2.7% and the F1 value by 5%.
In conclusion, fusing text and comment information through the interactive attention mechanism has a practical effect on microblog case aspect-level viewpoint identification.
While the present invention has been described in detail with reference to the embodiments shown in the drawings, the present invention is not limited to the embodiments, and various changes can be made without departing from the spirit of the present invention within the knowledge of those skilled in the art.

Claims (9)

1. A microblog case aspect-level viewpoint identification method based on text comment interactive attention is characterized by comprising the following steps of:
step1, constructing a microblog case corpus, and then segmenting words and constructing a dictionary for the corpus;
step2, encoding the microblog texts and comments by using a multi-head attention mechanism;
step3, fusing the information of the texts and the comments through an interactive attention mechanism;
step4, identifying aspect level views of microblog cases using a multi-label classifier.
2. The microblog case-level viewpoint identifying method based on interactive attention of text comments as recited in claim 1, wherein: the specific steps of Step1 are as follows:
step1.1, firstly, crawling microblog case linguistic data from the Internet by utilizing a web crawler program;
step1.2, filtering and denoising the crawled microblog case linguistic data to construct a microblog case data set;
and Step1.3, extracting the comments related to the microblog cases from the Step1.2 data set, corresponding to the texts to which the comments belong, marking corresponding labels, forming the microblog case linguistic data through manual processing, segmenting the comments by using a Chinese word segmentation tool, and constructing a dictionary.
3. The microblog case-level viewpoint identifying method based on interactive attention of text comments as recited in claim 2, wherein: the specific steps of the step Step1.3 are as follows:
step1.3.1, extracting the comments related to the microblog cases from the step1.2 data set, and corresponding to the texts;
step1.3.2, marking each comment with a corresponding label;
step1.3.3, segmenting the comments by using a Chinese segmentation tool, and inputting all the comments according to batches until all the comments are input;
step1.3.4, constructing a dictionary by the words obtained by Step1.3.3, specifically operating to construct an empty dictionary first, inputting each word into the empty dictionary, adding the word into the dictionary if the dictionary does not contain the word, and skipping the next word if the word is contained, and completing in turn.
4. The microblog case-level viewpoint identifying method based on interactive attention of text comments as recited in claim 1, wherein: the specific steps of Step2 are as follows:
step2.1, using the text and comments of the microblog case as input at two ends of the model, adopting the same coding mode for the text and the comments, and expressing each sentence into an embedded matrix about the sentence;
Step2.2, coding the microblog texts and comments with a multi-head attention mechanism, so that attention between each word and all the words in the sentence is computed, yielding the representations of the microblog case texts and comments.
5. The microblog case-level viewpoint identifying method based on interactive attention of text comments as recited in claim 4, wherein: in Step2.1, the text and the comments of the microblog case are used as the two inputs of the coding end. Assume a sentence contains n words; the sentence X is represented as:

X = (x1, x2, ..., xn)

After the sentence is embedded, the word embedding sequence is:

E = (w1, w2, ..., wn)

E is the sentence represented as a two-dimensional embedding matrix that stacks all the word embeddings of the sentence together, with dimensions n × d, where n is the number of words and d is the embedding dimension; at this point each element in the sequence E is still independent of the others.
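The embedding step in Step2.1 can be sketched as a table lookup: each word index selects one row of a trainable embedding table, giving the n × d matrix E. Random weights and the toy vocabulary stand in for the learned parameters:

```python
import numpy as np

rng = np.random.default_rng(0)
vocab = {"<pad>": 0, "court": 1, "ruled": 2, "today": 3}
d = 8                                       # illustrative embedding dimension
table = rng.normal(size=(len(vocab), d))    # trainable embedding table

sentence = ["court", "ruled", "today"]      # X = (x1, ..., xn)
E = table[[vocab[w] for w in sentence]]     # E = (w1, ..., wn), shape (n, d)
print(E.shape)  # (3, 8): n words, each an independent d-dimensional embedding
```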
6. The microblog case-level viewpoint identifying method based on interactive attention of text comments as recited in claim 4, wherein: in Step2.2, each text sequence is read with a multi-head attention mechanism, and the attention between every word and all the words is calculated. The two-dimensional embedding matrix E is linearly projected into query, key and value representations Q, K and V, which are input into a multi-head attention mechanism with 8 heads; finally the output values of all heads are spliced and converted by a linear layer into one output of the same size as a single head. The specific calculation formula is:

A=Linear(Multihead(Q,K,V))

where the matrix A is the representation of the text sequence obtained by multi-head attention coding, and Q, K, V are the query, key and value representations;
the text and the comments are both encoded with this multi-head attention mechanism, so that every word in a sentence attends to all the words, yielding the representations of the text and the comments of the microblog case.
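The 8-head self-attention coding above can be sketched in NumPy. Random matrices stand in for the learned Q/K/V/output projections; this is an illustration of the standard mechanism, not the invention's trained model:

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def multi_head_self_attention(E, Wq, Wk, Wv, Wo, heads=8):
    """Self-attention over one sentence: every word attends to all words.
    E: (n, d) embedding matrix; Wq/Wk/Wv/Wo: (d, d) projection weights."""
    n, d = E.shape
    dk = d // heads
    Q, K, V = E @ Wq, E @ Wk, E @ Wv
    # split into heads: (heads, n, dk)
    split = lambda M: M.reshape(n, heads, dk).transpose(1, 0, 2)
    Qh, Kh, Vh = split(Q), split(K), split(V)
    scores = softmax(Qh @ Kh.transpose(0, 2, 1) / np.sqrt(dk))  # (heads, n, n)
    heads_out = scores @ Vh                                     # (heads, n, dk)
    concat = heads_out.transpose(1, 0, 2).reshape(n, d)         # splice heads
    return concat @ Wo               # A = Linear(Multihead(Q, K, V))

rng = np.random.default_rng(1)
n, d = 5, 16
E = rng.normal(size=(n, d))
Wq, Wk, Wv, Wo = (rng.normal(size=(d, d)) for _ in range(4))
A = multi_head_self_attention(E, Wq, Wk, Wv, Wo, heads=8)
print(A.shape)  # (5, 16): same shape as a single-head output
```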
7. The microblog case-level viewpoint identifying method based on interactive attention of text comments as recited in claim 1, wherein: the specific steps of Step3 are as follows:
step3.1, firstly, calculating the similarity between the text matrix and the comment matrix;
the representations of the text and the comments obtained through multi-head attention coding are denoted by matrix L and matrix K respectively, and attention is calculated in two directions: from comment to text and from text to comment; both directions derive from a shared similarity matrix between the context-embedded representations of the text and the comments, calculated as follows:

Stj=α(K:t,L:j)

wherein Stj denotes the similarity between the t-th comment word and the j-th text word, α is a trainable scalar function over two input vector representations of the encoded text and comments, K:t is the t-th column vector of K, and L:j is the j-th column vector of L;
Step3.2, calculating the text-to-comment attention, i.e. the text-to-comment attention coding module, which indicates which text words are most relevant to each comment word;
let at denote the attention weights over the text words for the t-th comment word, with Σj atj = 1 for all t; the weights are computed as at = softmax(St:), and each attended text vector is expressed as:

L̃:t = Σj atj · L:j

so that L̃ contains the attended text vectors for all comment words;
Step3.3, calculating the comment-to-text attention, i.e. the comment-to-text attention coding module, which indicates which comment words are most similar to the text words;
the attention weights over the comment words are obtained by p = softmax(maxcol(S)), where the maximum function maxcol(S) is taken across the columns of S; the attended comment vector is then expressed as:

k̃ = Σt pt · K:t

which is the weighted sum of the comment words most relevant to the text; this vector k̃ is tiled for all positions to represent the comment-to-text attention matrix K̃;
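The two attention directions of Step3.2 and Step3.3 and the fusion into G can be sketched together in NumPy. As illustrative assumptions, a plain dot product stands in for the trainable similarity function α, and β is realized as concatenation with elementwise products (a common choice for such a fusion function):

```python
import numpy as np

def softmax(x, axis=-1):
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def interactive_attention(K, L):
    """K: (d, m) encoded comment words; L: (d, n) encoded text words.
    Returns G, a fused comment representation containing text information."""
    # Shared similarity matrix S[t, j] (alpha taken as a dot product here)
    S = K.T @ L                                  # (m, n)
    # Text-to-comment: which text words matter for each comment word
    a = softmax(S, axis=1)                       # each row sums to 1
    L_att = L @ a.T                              # (d, m) attended text vectors
    # Comment-to-text: which comment words matter most for the text
    p = softmax(S.max(axis=1))                   # (m,) weights over comments
    k_att = K @ p                                # (d,) weighted comment vector
    K_att = np.tile(k_att[:, None], (1, K.shape[1]))  # tiled for all t
    # beta: splice originals, attended vectors and elementwise products
    G = np.concatenate([K, L_att, K * L_att, K * K_att], axis=0)  # (4d, m)
    return G

rng = np.random.default_rng(2)
d, m, n = 8, 4, 6                  # dim, comment length, text length
K, L = rng.normal(size=(d, m)), rng.normal(size=(d, n))
G = interactive_attention(K, L)
print(G.shape)  # (32, 4): one fused column per comment word
```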
step3.4, fusing text and comment information through an interactive attention mechanism, mutually focusing attention on the text and the comment, and fusing the text and the comment information to obtain a representation of the comment containing the text information;
and finally, the comment word embeddings and the text-comment interactive attention vectors are spliced, and the resulting matrix is denoted G:

G:t = β(K:t, L̃:t, K̃:t)

where each column vector G:t can be viewed as a text-aware representation of the corresponding comment word, and β is an arbitrary trainable neural network that fuses its 3 input vectors.
8. The microblog case-level viewpoint identifying method based on interactive attention of text comments as recited in claim 1, wherein: the specific steps of Step4 are as follows:
Step4.1, performing a linear transformation on the text and comment features fused in Step3 to obtain the score of each of the four classes for every comment;
Step4.2, obtaining a binary decision for each of the four classes by passing each score through a Sigmoid function;
Step4.3, if the resulting value is greater than 0.4, the comment belongs to that class, otherwise it does not;
Step4.4, mapping the comments to the corresponding labels, thereby identifying the aspect-level views of the case.
9. The microblog case-level viewpoint identifying method based on interactive attention of text comments as recited in claim 1, wherein: the Step4 comprises the following steps:
through the process in which the text and the comments attend to each other, the matrix G obtained by fusing the text-comment interactive attention information is a comment representation containing text information; this representation is passed through a linear transformation and a nonlinear activation function to obtain scores for the 4 classes; each class is then classified in a binary fashion through a Sigmoid function, which maps the 4 class scores to (0,1) to obtain the predicted value P; the classification process is shown by the following two formulas:
F=tanh(Linear(G))
P=Sigmoid(F)
and for the predicted value P, when an output value is greater than 0.4 it is classified as positive, i.e. the comment belongs to the corresponding class; otherwise it does not. Each comment may belong to one or more classes, thereby realizing identification of the microblog case aspect-level viewpoints.
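The F = tanh(Linear(G)), P = Sigmoid(F) classifier head with the 0.4 threshold can be sketched in NumPy. Random weights stand in for the trained linear layer, and mean-pooling the comment words into one vector is our illustrative assumption:

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def classify(G, W, b, threshold=0.4):
    """G: (m, h) fused comment features; W: (h, 4) weights; b: (4,) bias.
    F = tanh(Linear(G)), P = Sigmoid(F); the comment is assigned every
    class whose score exceeds the threshold (multi-label output)."""
    g = G.mean(axis=0)              # pool the comment words into one vector
    F = np.tanh(g @ W + b)          # scores for the 4 aspect classes
    P = sigmoid(F)                  # map each score to (0, 1)
    return (P > threshold).astype(int), P

rng = np.random.default_rng(3)
m, h = 5, 12
G = rng.normal(size=(m, h))
W, b = rng.normal(size=(h, 4)), rng.normal(size=(4,))
labels, P = classify(G, W, b)
print(labels)  # 0/1 vector: the comment may hold one or more viewpoints
```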
CN202110163045.0A 2021-02-05 2021-02-05 Microblog case aspect-level viewpoint identification method based on text comment interactive attention Pending CN112926336A (en)


Publications (1)

Publication Number Publication Date
CN112926336A true CN112926336A (en) 2021-06-08




Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
RJ01 Rejection of invention patent application after publication (application publication date: 20210608)