CN111027313A - BiGRU judgment result tendency analysis method based on attention mechanism - Google Patents
BiGRU judgment result tendency analysis method based on attention mechanism
- Publication number
- CN111027313A CN201811166731.8A CN201811166731A
- Authority
- CN
- China
- Prior art keywords
- word
- vector
- bigru
- attention mechanism
- matrix
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Abstract
The invention relates to a BiGRU judgment result tendency analysis method based on an attention mechanism. The method extracts keyword information about the parties and about the judgment result from the judgment document to be analyzed; segments the judgment result into single sentences and applies word segmentation and stop-word removal to each sentence to obtain a word sequence; constructs a word vector table and uses it to express the word sequence as the corresponding word vector matrix; performs BiGRU computation on the word vector matrix to obtain its feature vectors, and then applies attention computation to those feature vectors to obtain the output vector of the attention mechanism; finally, the output vector of the attention mechanism is classified using softmax. By strengthening the connection between a text and its context with a bidirectional network and emphasising key information with an attention mechanism, the method compensates for the neglect of context in traditional unidirectional neural networks, improves the accuracy of the algorithm, and yields more accurate sentiment classification results.
Description
Technical Field
The invention relates to the field of deep learning and natural language processing, in particular to a BiGRU judgment result tendency analysis method based on an attention mechanism.
Background
With the development of modern information technology, the advance of legal informatisation and the public release of large numbers of judgment documents, mining knowledge from these documents has become increasingly meaningful. Text sentiment analysis, also called opinion mining or tendency analysis, is, in short, the process of analysing, processing, summarising and reasoning over subjective text. Among the mainstream approaches in this field, deep learning is currently the most popular.
With the development of text sentiment analysis, applying it to judgment result tendency analysis is a natural trend. When current sentiment analysis algorithms are applied to judgment results, information loss reduces the accuracy of the analysis, and the computation is complex, which limits efficiency to a certain extent.
Aiming at these problems in judgment result tendency analysis, the invention provides a BiGRU judgment result tendency analysis method based on an attention mechanism, so as to improve both the accuracy and the efficiency of judgment result tendency analysis.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a BiGRU judgment result tendency analysis method based on an attention mechanism, which addresses the reduced classification accuracy caused by information loss in judgment result tendency analysis, and improves the efficiency of the tendency analysis computation.
The technical scheme adopted by the invention for realizing the purpose is as follows:
a BiGRU decision result tendency analysis method based on attention mechanism,
extracting key word information of parties of a referee document to be analyzed and key word information of a judgment result;
segmenting the judgment result into a plurality of single sentences, and performing word segmentation and stop word removal processing on each single sentence to obtain a word sequence;
constructing a word vector table, and expressing the word sequence as a corresponding word vector matrix by using the word vector table;
performing BiGRU calculation on the word vector matrix to obtain a feature vector of the word vector matrix, and performing attention calculation on the feature vector of the word vector matrix to obtain an output vector of an attention mechanism;
the output vector of the attention mechanism is classified using softmax.
The keyword information of the parties comprises plaintiff, defendant, appellant, appellee, applicant and respondent.
The judgment result keyword information includes phrases such as "the judgment is as follows:", "the ruling is as follows:", "the adjudication is as follows:" and "the decision is as follows:".
If the judgment result does not have the standard legal title of the party, the non-standard legal title of the party is replaced by the standard legal title of the corresponding party, and the method specifically comprises the following steps:
the constructing of the word vector table includes the following processes:
Remove stop words from and perform word segmentation on the judgment document training set to generate the corpus required for constructing word vectors; generate a first vocabulary list from the corpus, count and sort the word frequency of each word, and take the V words with the highest frequency to form a second vocabulary list.
Represent each word in the second vocabulary list by a corresponding one-hot vector of dimension V, generating a one-hot vector table.
Perform dimension reduction on the one-hot vector table using a Skip-gram model to generate the word vector table.
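The vocabulary-building steps above (frequency counting, top-V selection, one-hot assignment) can be sketched as follows; this is an illustrative reconstruction, not the patent's code, and in practice the one-hot table would then be reduced with a Skip-gram model (e.g. word2vec):

```python
from collections import Counter

def build_onehot_table(corpus_sentences, V):
    """Count word frequencies over the segmented corpus, keep the V most
    frequent words as the second vocabulary list, and assign each word a
    V-dimensional one-hot vector (the step preceding Skip-gram
    dimensionality reduction)."""
    freq = Counter(w for sent in corpus_sentences for w in sent)
    vocab = [w for w, _ in freq.most_common(V)]   # top-V words by frequency
    table = {}
    for i, w in enumerate(vocab):
        vec = [0] * V
        vec[i] = 1                                # one-hot position for word i
        table[w] = vec
    return vocab, table

# Toy segmented corpus (hypothetical example sentences):
corpus = [["judgment", "as", "follows"], ["the", "plaintiff", "judgment"]]
vocab, onehot = build_onehot_table(corpus, V=4)
```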
The word vector matrix is represented as:
(w1, w2, w3, …, wn) → S = (s1, s2, s3, …, sn)
where si is the word vector of the i-th keyword wi, and S is the word vector matrix.
The eigenvectors of the word vector matrix are calculated as follows:
zt = σ(Wz·[ht-1, xt])
rt = σ(Wr·[ht-1, xt])
h̃t = tanh(W·[rt ∗ ht-1, xt])
ht = (1 − zt) ∗ ht-1 + zt ∗ h̃t
where xt is the input data, ht is the output of the current GRU unit and ht-1 is the output of the previous unit; zt is the update gate and rt is the reset gate, which together control the computation from hidden state ht-1 to hidden state ht. The update gate weighs the current input against the previous memory ht-1 and outputs a value zt between 0 and 1 that determines how much of ht-1 is passed on to the next state. h̃t is the candidate hidden state, and the reset gate controls how much of the previous hidden state, which carries past time information, flows into it. σ is the sigmoid function, tanh is the hyperbolic tangent activation function, and Wz, Wr and W are the weight matrices of the update gate, the reset gate and the candidate hidden state, respectively.
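A minimal NumPy sketch of one GRU step implementing the update-gate/reset-gate computation described above (biases omitted for brevity; the weight shapes assume the gates act on the concatenation of ht-1 and xt — an illustrative convention, not the patent's implementation):

```python
import numpy as np

def gru_step(x_t, h_prev, Wz, Wr, W):
    """One GRU step: update gate z_t, reset gate r_t, candidate state
    h_tilde, and the interpolated new hidden state h_t."""
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    concat = np.concatenate([h_prev, x_t])
    z_t = sigmoid(Wz @ concat)                                  # update gate
    r_t = sigmoid(Wr @ concat)                                  # reset gate
    h_tilde = np.tanh(W @ np.concatenate([r_t * h_prev, x_t]))  # candidate state
    h_t = (1.0 - z_t) * h_prev + z_t * h_tilde                  # new hidden state
    return h_t

rng = np.random.default_rng(0)
d, h = 4, 3                                   # input and hidden dimensions (assumed)
x = rng.standard_normal(d)
h0 = np.zeros(h)
Wz, Wr, W = (rng.standard_normal((h, h + d)) for _ in range(3))
h1 = gru_step(x, h0, Wz, Wr, W)
```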
The output vector calculation process of the attention mechanism is as follows:
1) Perform a similarity calculation between the word vector of each word in the word vector matrix and all the word vectors in the matrix to obtain a weight, specifically:
M = tanh(ht)
where ht is the output vector of the t time steps computed by the BiGRU layer, tanh is the activation function, and M is a temporary weight matrix.
2) Normalise the temporary weight matrix using the softmax function, specifically:
α = softmax(wT·M)
where wT is a randomly initialised weight matrix learned during training, and α is the attention weight matrix obtained by the softmax computation.
3) Take the weighted sum of the weights and the corresponding feature vectors to obtain the output vector of the attention layer, specifically:
γ = ht·αT
where γ is the output vector of the attention layer.
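The three attention formulas above can be sketched in NumPy as follows (a minimal illustration: H holds one BiGRU output column per time step, and the attention vector w is supplied by the caller rather than learned):

```python
import numpy as np

def attention_output(H, w):
    """Attention over the BiGRU feature vectors: M = tanh(H),
    alpha = softmax(w^T M), gamma = H alpha^T."""
    M = np.tanh(H)                      # temporary weight matrix
    scores = w @ M                      # one similarity score per time step
    alpha = np.exp(scores - scores.max())
    alpha = alpha / alpha.sum()         # softmax normalisation
    gamma = H @ alpha                   # weighted sum of feature vectors
    return alpha, gamma

rng = np.random.default_rng(1)
H = rng.standard_normal((3, 5))         # hidden_dim x T (assumed shapes)
w = rng.standard_normal(3)
alpha, gamma = attention_output(H, w)
```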
The invention has the following beneficial effects and advantages:
1. The method strengthens the connection between a text and its context by using a bidirectional neural network, which improves the accuracy of the algorithm and makes the classification result more accurate; it compensates for the neglect of textual context in traditional unidirectional neural networks and improves the accuracy of the sentiment classification result.
2. According to the method, the loss of the text detail information is reduced by adding the attention mechanism, a good effect can be achieved on strengthening the key information in the text, and the accuracy of the algorithm is improved to a certain extent.
3. The invention can carry out emotion classification calculation in specific fields according to texts in different fields, and has certain personalized expandability.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
FIG. 2 is a diagram of a model for feature vector computation of the word vector matrix according to the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples.
In order to make the aforementioned objects, features and advantages of the present invention more comprehensible, embodiments accompanying the drawings are described in detail below. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein, but rather should be construed as modified in the spirit and scope of the present invention as set forth in the appended claims.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.
As shown in fig. 1, a BiGRU decision result tendency analysis and calculation method based on attention mechanism includes the following steps:
step 1: the user inputs a referee document needing to analyze the tendency of the judgment result;
step 2: extracting required original reported information and a judgment result from the referee document according to the keywords;
Step 3: replace the titles of the parties in the judgment result extracted in step 2 with the original standard legal titles;
Step 4: split the judgment result processed in step 3 into individual items to refine the judgment result;
Step 5: remove stop words from each item and perform word segmentation to form word sequences;
Step 6: express the segmented judgment result as the corresponding judgment result vector S using the trained word vector table;
Step 7: perform BiGRU computation on the word vector sequence to obtain the feature vectors of the word vectors;
Step 8: perform attention computation on the result of step 7 to obtain the output vector of the attention mechanism;
Step 9: finally classify the output vector of the attention mechanism from step 8 using softmax.
In step 2, key information is extracted according to the keywords: the party information and the text content of the judgment result are extracted from the judgment document.
Step 3:
If the judgment does not contain the standard legal titles of the parties, the legal titles need to be unified: the personal names in the judgment result are replaced with the standard legal titles (plaintiff, defendant, etc.) by longest-common-subsequence matching. The longest common subsequence is computed and the matched position in the judgment result is replaced with the corresponding legal title.
The recurrence formula is as follows:
C[i, j] = 0, if i = 0 or j = 0
C[i, j] = C[i−1, j−1] + 1, if i, j > 0 and xi = yj
C[i, j] = max(C[i, j−1], C[i−1, j]), if i, j > 0 and xi ≠ yj
where C[i, j] represents the length of the longest common subsequence of the first i characters of one string and the first j characters of the other; the largest C[i, j] is selected and the match is replaced with the corresponding legal title. Example:
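The longest-common-subsequence length used for title matching can be sketched with the standard dynamic-programming table (an illustrative implementation, not the patent's code):

```python
def lcs_length(x, y):
    """Fill the DP table C where C[i][j] is the LCS length of x[:i] and
    y[:j]; the largest value over candidate titles picks the
    best-matching standard legal title for replacement."""
    C = [[0] * (len(y) + 1) for _ in range(len(x) + 1)]
    for i in range(1, len(x) + 1):
        for j in range(1, len(y) + 1):
            if x[i - 1] == y[j - 1]:
                C[i][j] = C[i - 1][j - 1] + 1          # characters match
            else:
                C[i][j] = max(C[i - 1][j], C[i][j - 1])  # drop one character
    return C[len(x)][len(y)]
```

For example, `lcs_length("ABCBDAB", "BDCABA")` returns 4 (subsequence "BCBA").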
Original party information:
The appellant (plaintiff in the original trial): Xu X, male ….
The appellee (defendant in the original trial): Wang X, female ….
Judgment provision:
The second-instance case acceptance fee of 6,150 yuan is borne by Xu X (already paid).
After replacement with the standard title:
The second-instance case acceptance fee of 6,150 yuan is borne by the appellant (already paid).
In step 6, the relevant concepts are first defined as follows:
Judgment result vector S: after the judgment result is split into items, natural language processing techniques are used for word segmentation, stop-word removal and similar operations to obtain a word sequence, which is expressed through the word vector table as S = (s1, s2, …, sn), where si is the word vector of the i-th keyword:
(w1, w2, w3, …, wn) → S = (s1, s2, s3, …, sn)
Example:
Judgment text: the plaintiff X is permitted to divorce the defendant Cao X;
Judgment word sequence: (permit, plaintiff, X, and, defendant, Cao X, divorce).
Word vector matrix: (w1, w2, w3, …, wn) → S = (s1, s2, s3, …, sn).
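Turning the word sequence into the matrix S is a table lookup; a minimal sketch (the zero-vector fallback for out-of-vocabulary words is a simplifying assumption, not part of the patent):

```python
import numpy as np

def to_vector_matrix(words, vector_table, dim):
    """Look up each word of the segmented judgment sentence in the word
    vector table to build the matrix S, one row per word; unknown words
    fall back to a zero vector."""
    return np.stack([np.asarray(vector_table.get(w, np.zeros(dim)))
                     for w in words])

# Hypothetical 2-dimensional word vectors for illustration:
table = {"grant": [1.0, 0.0], "divorce": [0.0, 1.0]}
S = to_vector_matrix(["grant", "unknown", "divorce"], table, dim=2)
```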
Step 7: feature vectors of the word vectors. The word vectors are input into the BiGRU network for computation to obtain the feature vectors. As shown in fig. 2, the word vector sequence S = (s1, s2, …, sn−1, sn) obtained in step 6 is input into the BiGRU network, i.e. as xt in the formulas, and the feature vector corresponding to each word vector is finally obtained through the computation of each unit of the network layer.
zt = σ(Wz·[ht-1, xt])
rt = σ(Wr·[ht-1, xt])
h̃t = tanh(W·[rt ∗ ht-1, xt])
ht = (1 − zt) ∗ ht-1 + zt ∗ h̃t
The specific description is as follows: xt is the input data, ht is the output of the current GRU unit and ht-1 is the output of the previous unit; zt is the update gate and rt is the reset gate, which together control the computation from hidden state ht-1 to hidden state ht. The update gate weighs the current input against the previous memory ht-1 and outputs a value zt between 0 and 1 that determines how much of ht-1 is passed on to the next state. h̃t is the candidate hidden state, and the reset gate controls how much of the previous hidden state, which carries past time information, flows into it. σ is the sigmoid function, tanh is the hyperbolic tangent activation function, and Wz, Wr and W are the weight matrices of the update gate, the reset gate and the candidate hidden state, respectively.
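The "Bi" in BiGRU means running one GRU left-to-right and another right-to-left over the word vectors and concatenating the two hidden states at each time step; a minimal NumPy sketch (the random weight triples and shapes are illustrative assumptions):

```python
import numpy as np

def gru_step(x_t, h_prev, Wz, Wr, W):
    sig = lambda a: 1.0 / (1.0 + np.exp(-a))
    c = np.concatenate([h_prev, x_t])
    z = sig(Wz @ c)                                       # update gate
    r = sig(Wr @ c)                                       # reset gate
    h_tilde = np.tanh(W @ np.concatenate([r * h_prev, x_t]))
    return (1.0 - z) * h_prev + z * h_tilde

def bigru(S, fwd, bwd, h_dim):
    """Run a forward GRU and a backward GRU over the word vector rows of
    S and concatenate their hidden states per time step; fwd/bwd are
    (Wz, Wr, W) weight triples."""
    T = len(S)
    hf, hb = np.zeros(h_dim), np.zeros(h_dim)
    H_f, H_b = [None] * T, [None] * T
    for t in range(T):
        hf = gru_step(S[t], hf, *fwd)
        H_f[t] = hf
        hb = gru_step(S[T - 1 - t], hb, *bwd)
        H_b[T - 1 - t] = hb
    return np.stack([np.concatenate([H_f[t], H_b[t]]) for t in range(T)])

rng = np.random.default_rng(2)
S = rng.standard_normal((5, 4))                            # 5 words, dim 4
mk = lambda: tuple(rng.standard_normal((3, 3 + 4)) for _ in range(3))
H = bigru(S, mk(), mk(), h_dim=3)                          # shape (5, 6)
```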
Step 8: the feature vectors of the word vectors are input into the attention mechanism to compute the output vector of the attention mechanism.
The specific calculation is as follows:
1) Perform a similarity calculation between the word vector of each word in the word vector matrix and all the word vectors in the matrix to obtain a weight, specifically:
M = tanh(ht)
where ht is the output vector of the t time steps computed by the BiGRU layer, tanh is the activation function, and M is a temporary weight matrix.
2) Normalise the temporary weight matrix using the softmax function, specifically:
α = softmax(wT·M)
where wT is a randomly initialised weight matrix learned during training, and α is the attention weight matrix obtained by the softmax computation.
3) Take the weighted sum of the weights and the corresponding feature vectors to obtain the output vector of the attention layer, specifically:
γ = ht·αT
where γ is the output vector of the attention layer.
Finally, the result is mapped through the activation function to obtain the classification result of the text.
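The final mapping can be sketched as a linear layer followed by softmax over the tendency classes; the weight matrix Wc and the label names here are illustrative assumptions, since the patent only specifies softmax classification:

```python
import numpy as np

def classify(gamma, Wc, labels=("favourable", "unfavourable")):
    """Map the attention-layer output gamma to a tendency label via a
    linear projection plus a numerically stable softmax."""
    logits = Wc @ gamma
    p = np.exp(logits - logits.max())
    p = p / p.sum()                       # class probabilities
    return labels[int(np.argmax(p))], p

rng = np.random.default_rng(3)
gamma = rng.standard_normal(6)            # attention output (assumed dim)
Wc = rng.standard_normal((2, 6))          # hypothetical classifier weights
label, p = classify(gamma, Wc)
```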
Claims (8)
1. A BiGRU judgment result tendency analysis method based on an attention mechanism is characterized in that:
extracting key word information of parties of a referee document to be analyzed and key word information of a judgment result;
segmenting the judgment result into a plurality of single sentences, and performing word segmentation and stop word removal processing on each single sentence to obtain a word sequence;
constructing a word vector table, and expressing the word sequence as a corresponding word vector matrix by using the word vector table;
performing BiGRU calculation on the word vector matrix to obtain a feature vector of the word vector matrix, and performing attention calculation on the feature vector of the word vector matrix to obtain an output vector of an attention mechanism;
the output vector of the attention mechanism is classified using softmax.
2. The attention mechanism-based BiGRU decision-tendency analysis method of claim 1, wherein: the keyword information of the parties comprises plaintiff, defendant, appellant, appellee, applicant and respondent.
3. The attention mechanism-based BiGRU decision-tendency analysis method of claim 1, wherein: the judgment result keyword information includes phrases such as "the judgment is as follows:", "the ruling is as follows:", "the adjudication is as follows:" and "the decision is as follows:".
4. The attention mechanism-based BiGRU decision-tendency analysis method of claim 1, wherein: if the judgment result does not have the standard legal title of the party, the non-standard legal title of the party is replaced by the standard legal title of the corresponding party, and the method specifically comprises the following steps:
the longest-common-subsequence length is computed by the recurrence
C[i, j] = 0, if i = 0 or j = 0;
C[i, j] = C[i−1, j−1] + 1, if i, j > 0 and xi = yj;
C[i, j] = max(C[i, j−1], C[i−1, j]), if i, j > 0 and xi ≠ yj;
where C[i, j] represents the length of the longest common subsequence, and the largest C[i, j] is selected and the match is replaced with the corresponding legal title.
5. The attention mechanism-based BiGRU decision-tendency analysis method of claim 1, wherein: the constructing of the word vector table includes the following processes:
removing stop words and performing word segmentation processing on the referee document training set to generate a corpus required for constructing word vectors; generating a first vocabulary list for the materials, counting and sequencing word frequency of each word, and taking V words with the maximum word frequency to form a second vocabulary list;
each word in the second vocabulary list is represented by a corresponding one-hot vector, and the dimension of the one-hot vector corresponding to each word is V, so that a one-hot vector list is generated;
and (5) performing dimension reduction on the one-hot vector table by using a Skip-gram model to generate a word vector table.
6. The attention mechanism-based BiGRU decision-tendency analysis method of claim 1, wherein: the word vector matrix is represented as:
(w1,w2,w3,…,wn)→S=(s1,s2,s3,…,sn)
where si is the word vector of the i-th keyword, and S is the word vector matrix.
7. The attention mechanism-based BiGRU decision-tendency analysis method of claim 1, wherein: the eigenvectors of the word vector matrix are calculated as follows:
zt = σ(Wz·[ht-1, xt])
rt = σ(Wr·[ht-1, xt])
h̃t = tanh(W·[rt ∗ ht-1, xt])
ht = (1 − zt) ∗ ht-1 + zt ∗ h̃t
where xt is the input data, ht is the output of the current GRU unit and ht-1 is the output of the previous unit; zt is the update gate and rt is the reset gate, which together control the computation from hidden state ht-1 to hidden state ht; the update gate weighs the current input against the previous memory ht-1 and outputs a value zt between 0 and 1 that determines how much of ht-1 is passed on to the next state; h̃t is the candidate hidden state, and the reset gate controls how much of the previous hidden state, which carries past time information, flows into it; σ is the sigmoid function, tanh is the hyperbolic tangent activation function, and Wz, Wr and W are the weight matrices of the update gate, the reset gate and the candidate hidden state, respectively.
8. The attention mechanism-based BiGRU decision-tendency analysis method of claim 1, wherein: the output vector calculation process of the attention mechanism is as follows:
1) performing similarity calculation on the word vector of each word in the word vector matrix and all the word vectors in the matrix to obtain a weight, specifically comprising:
M=tanh(ht)
where ht is the output vector of the t time steps computed by the BiGRU layer, tanh is the activation function, and M is a temporary weight matrix;
2) normalizing the temporary weight matrix by utilizing a softmax function, which specifically comprises the following steps:
α=softmax(wTM)
where wT is a randomly initialised weight matrix learned in training, and the attention weight matrix α is obtained by the softmax computation;
3) and performing weighted summation on the weights and the corresponding feature vectors to obtain an output vector of the attention layer, wherein the weighted summation specifically comprises the following steps:
γ=htαT
where γ is the output vector of the attention layer.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811166731.8A CN111027313A (en) | 2018-10-08 | 2018-10-08 | BiGRU judgment result tendency analysis method based on attention mechanism |
Publications (1)
Publication Number | Publication Date |
---|---|
CN111027313A true CN111027313A (en) | 2020-04-17 |
Family
ID=70190357
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811166731.8A Pending CN111027313A (en) | 2018-10-08 | 2018-10-08 | BiGRU judgment result tendency analysis method based on attention mechanism |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN111027313A (en) |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105938495A (en) * | 2016-04-29 | 2016-09-14 | 乐视控股(北京)有限公司 | Entity relationship recognition method and apparatus |
CN107247702A (en) * | 2017-05-05 | 2017-10-13 | 桂林电子科技大学 | A kind of text emotion analysis and processing method and system |
CN108304365A (en) * | 2017-02-23 | 2018-07-20 | 腾讯科技(深圳)有限公司 | keyword extracting method and device |
CN108320051A (en) * | 2018-01-17 | 2018-07-24 | 哈尔滨工程大学 | A kind of mobile robot dynamic collision-free planning method based on GRU network models |
CN108595601A (en) * | 2018-04-20 | 2018-09-28 | 福州大学 | A kind of long text sentiment analysis method incorporating Attention mechanism |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Xie et al. | Speech emotion classification using attention-based LSTM | |
Almuzaini et al. | Impact of stemming and word embedding on deep learning-based Arabic text categorization | |
CN109753566B (en) | Model training method for cross-domain emotion analysis based on convolutional neural network | |
CN108763326B (en) | Emotion analysis model construction method of convolutional neural network based on feature diversification | |
CN109657239B (en) | Chinese named entity recognition method based on attention mechanism and language model learning | |
CN108984526B (en) | Document theme vector extraction method based on deep learning | |
CN108446271B (en) | Text emotion analysis method of convolutional neural network based on Chinese character component characteristics | |
CN110807320B (en) | Short text emotion analysis method based on CNN bidirectional GRU attention mechanism | |
CN108009148B (en) | Text emotion classification representation method based on deep learning | |
EP2486470B1 (en) | System and method for inputting text into electronic devices | |
CN111177374A (en) | Active learning-based question and answer corpus emotion classification method and system | |
CN108536754A (en) | Electronic health record entity relation extraction method based on BLSTM and attention mechanism | |
CN109086269B (en) | Semantic bilingual recognition method based on semantic resource word representation and collocation relationship | |
CN112232087B (en) | Specific aspect emotion analysis method of multi-granularity attention model based on Transformer | |
CN113591483A (en) | Document-level event argument extraction method based on sequence labeling | |
CN110472245B (en) | Multi-label emotion intensity prediction method based on hierarchical convolutional neural network | |
CN109919175B (en) | Entity multi-classification method combined with attribute information | |
CN112487237B (en) | Music classification method based on self-adaptive CNN and semi-supervised self-training model | |
CN114417851A (en) | Emotion analysis method based on keyword weighted information | |
CN111400494A (en) | Sentiment analysis method based on GCN-Attention | |
Sun et al. | VCWE: visual character-enhanced word embeddings | |
CN113987187A (en) | Multi-label embedding-based public opinion text classification method, system, terminal and medium | |
CN113094502A (en) | Multi-granularity takeaway user comment sentiment analysis method | |
CN111241820A (en) | Bad phrase recognition method, device, electronic device, and storage medium | |
CN112818698B (en) | Fine-grained user comment sentiment analysis method based on dual-channel model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||