CN111027313A - BiGRU judgment result tendency analysis method based on attention mechanism - Google Patents


Info

Publication number
CN111027313A
CN111027313A CN201811166731.8A
Authority
CN
China
Prior art keywords
word
vector
bigru
attention mechanism
matrix
Prior art date
Legal status
Pending
Application number
CN201811166731.8A
Other languages
Chinese (zh)
Inventor
王宁
周晓磊
李世林
刘堂亮
张镝
祁柏林
赵奎
Current Assignee
Shenyang Institute of Computing Technology of CAS
Original Assignee
Shenyang Institute of Computing Technology of CAS
Priority date
Filing date
Publication date
Application filed by Shenyang Institute of Computing Technology of CAS filed Critical Shenyang Institute of Computing Technology of CAS
Priority to CN201811166731.8A
Publication of CN111027313A
Legal status: Pending

Abstract

The invention relates to a BiGRU judgment result tendency analysis method based on an attention mechanism. The method extracts keyword information about the parties and the judgment result from a referee document to be analyzed; segments the judgment result into single sentences and applies word segmentation and stop word removal to each sentence to obtain word sequences; constructs a word vector table and uses it to express each word sequence as a corresponding word vector matrix; performs BiGRU calculation on the word vector matrix to obtain its feature vectors, and performs attention calculation on those feature vectors to obtain the output vector of the attention mechanism; and finally classifies the output vector of the attention mechanism using softmax. By strengthening the connection between a text and its context with a bidirectional network, the method compensates for the context ignored by traditional unidirectional neural networks, reinforces the key information in the text, and improves the accuracy of sentiment classification.

Description

BiGRU judgment result tendency analysis method based on attention mechanism
Technical Field
The invention relates to the field of deep learning and natural language processing, in particular to a BiGRU judgment result tendency analysis method based on an attention mechanism.
Background
With the development of modern information technology and the advancement of the legal system, a large number of referee documents have been made public, and mining knowledge from them has become increasingly meaningful. Text sentiment analysis, also called opinion mining or tendency analysis, is, in brief, the process of analyzing, processing, summarizing, and reasoning over subjective texts. Among the mainstream approaches in text sentiment analysis research, deep learning is currently the most popular.
With the development of text sentiment analysis, applying it to judgment result tendency analysis is a natural trend. When current text sentiment analysis algorithms are applied to judgment results, however, information loss reduces the accuracy of the analysis, and the computation is complex, which restricts the efficiency of sentiment analysis to a certain extent.
Aiming at these problems in judgment result tendency analysis, the invention provides a BiGRU judgment result tendency analysis method based on an attention mechanism, so as to improve both the accuracy and the efficiency of judgment result tendency analysis.
Disclosure of Invention
Aiming at the defects of the prior art, the invention provides a BiGRU judgment result tendency analysis method based on an attention mechanism, which solves the problem that classification accuracy is reduced by information loss during tendency analysis of judgment results, and improves the computational efficiency of judgment result tendency analysis.
The technical scheme adopted by the invention for realizing the purpose is as follows:
a BiGRU decision result tendency analysis method based on attention mechanism,
extracting key word information of parties of a referee document to be analyzed and key word information of a judgment result;
segmenting the judgment result into a plurality of single sentences, and performing word segmentation and stop word removal processing on each single sentence to obtain a word sequence;
constructing a word vector table, and expressing the word sequence as a corresponding word vector matrix by using the word vector table;
performing BiGRU calculation on the word vector matrix to obtain a feature vector of the word vector matrix, and performing attention calculation on the feature vector of the word vector matrix to obtain an output vector of an attention mechanism;
the output vector of the attention mechanism is classified using softmax.
The keyword information of the parties comprises plaintiff, defendant, appellant, appellee, applicant, and respondent.
The judgment result keyword information includes "the judgment is as follows:", "the ruling is as follows:", "the adjudication is as follows:", and "the decision is as follows:".
If the judgment result does not have the standard legal title of the party, the non-standard legal title of the party is replaced by the standard legal title of the corresponding party, and the method specifically comprises the following steps:
$$C[i,j] = \begin{cases} 0, & i = 0 \text{ or } j = 0 \\ C[i-1,j-1] + 1, & i,j > 0 \text{ and } x_i = y_j \\ \max(C[i,j-1],\, C[i-1,j]), & i,j > 0 \text{ and } x_i \neq y_j \end{cases}$$
where C[i,j] is the length of the longest common subsequence of the first i characters of one title string and the first j characters of the other.
the constructing of the word vector table includes the following processes:
removing stop words from and performing word segmentation on the referee document training set to generate the corpus required for constructing word vectors; generating a first vocabulary from the corpus, counting and sorting the word frequency of each word, and taking the V most frequent words to form a second vocabulary;
each word in the second vocabulary is represented by a corresponding one-hot vector of dimension V, generating a one-hot vector table;
the Skip-gram model is then used to reduce the dimensionality of the one-hot vector table, generating the word vector table.
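As a concrete illustration, the following Python sketch covers the vocabulary construction and Skip-gram training just described. It assumes the gensim library (4.x API); the placeholder corpus and the parameter values (V, vector_size, window) are illustrative assumptions, not values given by the patent.

```python
from collections import Counter
from gensim.models import Word2Vec  # assumes gensim 4.x

# A pre-tokenized referee-document training corpus (illustrative placeholder),
# already word-segmented with stop words removed.
corpus = [
    ["permit", "plaintiff", "defendant", "divorce"],
    ["case", "acceptance", "fee", "borne", "appellant"],
]

# First vocabulary: every word with its frequency; second vocabulary: the V
# most frequent words.
V = 50000
word_freq = Counter(w for sentence in corpus for w in sentence)
second_vocabulary = [w for w, _ in word_freq.most_common(V)]

# Skip-gram (sg=1) learns dense embeddings, replacing the V-dimensional
# one-hot representation with a low-dimensional word vector.
model = Word2Vec(corpus, vector_size=100, window=5, min_count=1, sg=1)

# The word vector table: word -> dense vector.
word_vector_table = {w: model.wv[w] for w in second_vocabulary if w in model.wv}
```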
The word vector matrix is represented as:
$(w_1, w_2, w_3, \ldots, w_n) \rightarrow S = (s_1, s_2, s_3, \ldots, s_n)$
where $s_i$ is the word vector of the i-th keyword and $S$ is the word vector matrix.
The feature vectors of the word vector matrix are calculated as follows:
$z_t = \sigma(W_z \cdot [h_{t-1}, x_t])$
$r_t = \sigma(W_r \cdot [h_{t-1}, x_t])$
$\tilde{h}_t = \tanh(W \cdot [r_t \odot h_{t-1}, x_t])$
$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$
where $x_t$ is the input data, $h_t$ is the output of the current GRU unit, and $h_{t-1}$ is the output of the previous unit. $z_t$ is the update gate and $r_t$ is the reset gate; together they control the computation from hidden state $h_{t-1}$ to hidden state $h_t$. The update gate weighs the current input data against the previous memory $h_{t-1}$ and outputs a value $z_t$ between 0 and 1, which determines how much of $h_{t-1}$ is passed on to the next state. $\tilde{h}_t$ is the candidate hidden state, and the reset gate controls how much of the previous hidden state, carrying past time information, flows into it. In the formulas, $\sigma$ is the sigmoid function, $\tanh$ is the tangent activation function, and $W_z$, $W_r$, and $W$ are the weight matrices of the update gate, the reset gate, and the candidate hidden state, respectively ($\odot$ denotes the element-wise product).
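To make the four formulas concrete, here is a minimal NumPy sketch of a single GRU step. It follows the equations above, including the concatenation $[h_{t-1}, x_t]$ and the bias-free weight matrices; the dimensions and the random initialization are illustrative assumptions.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def gru_step(x_t, h_prev, W_z, W_r, W):
    """One GRU step; each weight matrix acts on the concatenation [h_prev, x_t]."""
    concat = np.concatenate([h_prev, x_t])
    z_t = sigmoid(W_z @ concat)                                # update gate
    r_t = sigmoid(W_r @ concat)                                # reset gate
    h_cand = np.tanh(W @ np.concatenate([r_t * h_prev, x_t]))  # candidate state
    return (1.0 - z_t) * h_prev + z_t * h_cand                 # new hidden state

# Illustrative sizes: 100-dimensional word vectors, 64-dimensional hidden state.
rng = np.random.default_rng(0)
d_x, d_h = 100, 64
W_z, W_r, W = (rng.standard_normal((d_h, d_h + d_x)) * 0.1 for _ in range(3))
h = np.zeros(d_h)
for x_t in rng.standard_normal((10, d_x)):  # a sequence of 10 word vectors
    h = gru_step(x_t, h, W_z, W_r, W)
```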
The output vector calculation process of the attention mechanism is as follows:
1) performing similarity calculation on the word vector of each word in the word vector matrix and all the word vectors in the matrix to obtain a weight, specifically comprising:
$M = \tanh(h_t)$
where $h_t$ is the output vector at time step $t$ computed by the BiGRU layer, $\tanh$ is the activation function, and $M$ is a temporary weight matrix.
2) Normalizing the temporary weight matrix by utilizing a softmax function, which specifically comprises the following steps:
$\alpha = \mathrm{softmax}(w^T M)$
where $w^T$ is a randomly initialized weight matrix learned during training, and the attention weight matrix $\alpha$ is obtained through the softmax calculation.
3) Performing weighted summation on the weights and the corresponding feature vectors to obtain an output vector of the attention layer, specifically:
$\gamma = h_t \alpha^T$
where γ is the output vector of the attention layer.
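The three attention formulas can be sketched in a few lines of NumPy. Here $H$ stacks the BiGRU output vectors column-wise and $w$ is the learned weight vector; the shapes are plausible assumptions rather than values fixed by the patent.

```python
import numpy as np

def attention_output(H, w):
    """H: (d_h, T) matrix whose columns are BiGRU outputs h_t;
    w: (d_h,) randomly initialized weight vector learned in training.
    Implements M = tanh(H), alpha = softmax(w^T M), gamma = H alpha^T."""
    M = np.tanh(H)                         # temporary weight matrix
    scores = w @ M                         # (T,) similarity of each position
    alpha = np.exp(scores - scores.max())  # numerically stable softmax
    alpha /= alpha.sum()                   # attention weights, summing to 1
    gamma = H @ alpha                      # weighted sum of feature vectors
    return gamma
```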
The invention has the following beneficial effects and advantages:
1. The method strengthens the connection between a text and its context by using a bidirectional neural network, which makes the classification result more accurate, compensates for the context ignored by traditional unidirectional neural networks, and improves the accuracy of sentiment classification.
2. The method reduces the loss of textual detail by adding an attention mechanism, which is effective at reinforcing the key information in a text and further improves the accuracy of the algorithm.
3. The invention can perform domain-specific sentiment classification on texts from different fields, and thus offers a degree of customizable extensibility.
Drawings
FIG. 1 is a flow chart of the method of the present invention.
FIG. 2 is a diagram of a model for feature vector computation of the word vector matrix according to the present invention.
Detailed Description
The present invention will be described in further detail with reference to the accompanying drawings and examples.
In order to make the aforementioned objects, features and advantages of the present invention more comprehensible, embodiments accompanying the drawings are described in detail below. In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present invention. This invention may, however, be embodied in many different forms and should not be construed as limited to the embodiments set forth herein, but rather should be construed as modified in the spirit and scope of the present invention as set forth in the appended claims.
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs. The terminology used in the description of the invention herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention.
As shown in fig. 1, a BiGRU decision result tendency analysis and calculation method based on attention mechanism includes the following steps:
step 1: the user inputs a referee document whose judgment result tendency is to be analyzed;
step 2: the required party information and the judgment result are extracted from the referee document according to the keywords;
step 3: the personal names of the parties in the judgment result extracted in step 2 are replaced with their standard legal titles;
step 4: the judgment result processed in step 3 is split into individual items, refining the judgment result;
step 5: stop words are removed from each item and it is segmented into words, forming word sequences;
step 6: using the trained word vector table, the segmented judgment result is expressed as a corresponding judgment result vector (S);
step 7: BiGRU calculation is performed on the word vector sequence to obtain the feature vectors of the word vectors;
step 8: attention calculation is performed on the result of step 7 to obtain the output vector of the attention mechanism;
step 9: the output vector of the attention mechanism from step 8 is finally classified with softmax.
In step 2, the key information is extracted; the specific format of the referee document is as follows:
[Figure: sample layout of a referee document, indicating the party information section and the judgment result section]
The party information and the text content of the judgment result are extracted based on these keywords, as sketched below.
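A minimal sketch of this extraction step, assuming plain-text documents: the regular expressions below are hypothetical English stand-ins, whereas a real implementation would match the Chinese legal phrases for the parties and for markers such as "the judgment is as follows:".

```python
import re

# Hypothetical patterns; the actual keywords are the Chinese legal phrases.
PARTY_PATTERN = re.compile(
    r"^((?:plaintiff|defendant|appellant|appellee|applicant|respondent)\b.*)$",
    re.IGNORECASE | re.MULTILINE,
)
RESULT_PATTERN = re.compile(r"the judgment is as follows:(.*)",
                            re.IGNORECASE | re.DOTALL)

def extract_key_information(document: str):
    """Return the party lines and the judgment result text of a referee document."""
    parties = PARTY_PATTERN.findall(document)
    match = RESULT_PATTERN.search(document)
    result = match.group(1).strip() if match else ""
    return parties, result
```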
In step 3:
If the judgment does not use the parties' standard legal titles, the titles need to be unified: personal names in the judgment result are replaced with standard legal titles such as plaintiff and defendant by longest-subsequence matching. Through the calculation of the longest common subsequence, the matched position in the judgment result is replaced with the corresponding legal title.
The formula is as follows:
$$C[i,j] = \begin{cases} 0, & i = 0 \text{ or } j = 0 \\ C[i-1,j-1] + 1, & i,j > 0 \text{ and } x_i = y_j \\ \max(C[i,j-1],\, C[i-1,j]), & i,j > 0 \text{ and } x_i \neq y_j \end{cases}$$
where C[i,j] represents the length of the longest common subsequence; the largest C[i,j] is selected and the match is replaced with the corresponding legal title. Example:
Original party information:
The appellant (plaintiff in the original trial): Xu X, male.
The appellee (defendant in the original trial): Wang X, female.
Judgment provision:
The second-instance case acceptance fee of six thousand one hundred fifty yuan is borne by Xu X (already paid).
After replacement with the standard title:
The second-instance case acceptance fee of six thousand one hundred fifty yuan is borne by the appellant (already paid).
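The matching step can be sketched in Python as follows, implementing the C[i,j] recursion given above; the party dictionary in the example is illustrative data, not taken from the patent.

```python
def lcs_length(a: str, b: str) -> int:
    """C[len(a)][len(b)]: length of the longest common subsequence of a and b."""
    m, n = len(a), len(b)
    C = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            if a[i - 1] == b[j - 1]:
                C[i][j] = C[i - 1][j - 1] + 1
            else:
                C[i][j] = max(C[i][j - 1], C[i - 1][j])
    return C[m][n]

def standard_title(mention: str, parties: dict) -> str:
    """Pick the party name with the largest C[i, j] against the mention and
    return its standard legal title. `parties` is illustrative, e.g.
    {"Xu X": "appellant", "Wang X": "appellee"}."""
    best_name = max(parties, key=lambda name: lcs_length(mention, name))
    return parties[best_name]
```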
In step 6, the related concepts are first defined as follows:
Judgment result vector (S): each itemized judgment result is processed with natural language processing techniques such as word segmentation and stop word removal to obtain a word sequence, which is expressed through the word vector table as $S = (s_1, s_2, \ldots, s_n)$, where $s_i$ is the word vector of the i-th keyword.
$(w_1, w_2, w_3, \ldots, w_n) \rightarrow S = (s_1, s_2, s_3, \ldots, s_n)$
Example:
Judgment text: the plaintiff Xu X is permitted to divorce the defendant Cao X;
Judgment word sequence: (permit, plaintiff, Xu X, and, defendant, Cao X, divorce).
Word vector matrix: $(w_1, w_2, w_3, \ldots, w_n) \rightarrow S = (s_1, s_2, s_3, \ldots, s_n)$.
In step 7, the feature vectors of the word vectors are obtained by inputting the word vectors into the BiGRU network for calculation. As shown in FIG. 2, the word vector sequence $S = (s_1, s_2, \ldots, s_{n-1}, s_n)$ obtained in step 6 is input into the BiGRU network, i.e., as $x_t$ in the formulas below, and the feature vector corresponding to each word vector is finally obtained through the calculation of each unit of the network layer.
$z_t = \sigma(W_z \cdot [h_{t-1}, x_t])$
$r_t = \sigma(W_r \cdot [h_{t-1}, x_t])$
$\tilde{h}_t = \tanh(W \cdot [r_t \odot h_{t-1}, x_t])$
$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$
The specific description is as follows: $x_t$ is the input data, $h_t$ is the output of the current GRU unit, and $h_{t-1}$ is the output of the previous unit. $z_t$ is the update gate and $r_t$ is the reset gate; together they control the computation from hidden state $h_{t-1}$ to hidden state $h_t$. The update gate weighs the current input data against the previous memory $h_{t-1}$ and outputs a value $z_t$ between 0 and 1, which determines how much of $h_{t-1}$ is passed on to the next state. $\tilde{h}_t$ is the candidate hidden state, and the reset gate controls how much of the previous hidden state, carrying past time information, flows into it. In the formulas, $\sigma$ is the sigmoid function, $\tanh$ is the tangent activation function, and $W_z$, $W_r$, and $W$ are the weight matrices of the update gate, the reset gate, and the candidate hidden state, respectively ($\odot$ denotes the element-wise product).
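The bidirectional pass itself can be sketched as follows, reusing the gru_step function from the earlier GRU sketch: the sequence is processed once forward and once backward with separate parameter sets, and the two hidden states of each time step are concatenated into its feature vector. The parameter packing is an illustrative assumption.

```python
import numpy as np

def bigru(S, fwd_params, bwd_params, d_h):
    """S: (T, d_x) word vector matrix. Returns H of shape (2*d_h, T), whose
    column t concatenates the forward and backward hidden states for step t.
    fwd_params / bwd_params are (W_z, W_r, W) tuples for gru_step (defined above)."""
    T = len(S)
    h_f, h_b = np.zeros(d_h), np.zeros(d_h)
    fwd, bwd = [], [None] * T
    for t in range(T):                    # forward direction
        h_f = gru_step(S[t], h_f, *fwd_params)
        fwd.append(h_f)
    for t in reversed(range(T)):          # backward direction
        h_b = gru_step(S[t], h_b, *bwd_params)
        bwd[t] = h_b
    return np.stack([np.concatenate([f, b]) for f, b in zip(fwd, bwd)], axis=1)
```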
In step 8, the attention calculation is accomplished by inputting the feature vectors of the word vectors into the attention mechanism and computing its output vector.
The specific calculation is as follows: 1) performing similarity calculation on the word vector of each word in the word vector matrix and all the word vectors in the matrix to obtain a weight, specifically comprising:
$M = \tanh(h_t)$
where $h_t$ is the output vector at time step $t$ computed by the BiGRU layer, $\tanh$ is the activation function, and $M$ is a temporary weight matrix.
2) Normalizing the temporary weight matrix by utilizing a softmax function, which specifically comprises the following steps:
$\alpha = \mathrm{softmax}(w^T M)$
where $w^T$ is a randomly initialized weight matrix learned during training, and the attention weight matrix $\alpha$ is obtained through the softmax calculation.
3) Performing weighted summation on the weights and the corresponding feature vectors to obtain an output vector of the attention layer, specifically:
$\gamma = h_t \alpha^T$
where γ is the output vector of the attention layer.
Finally, in step 9, the result is mapped through the softmax activation function to obtain the classification result of the text.
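As a final illustration, here is a minimal sketch of this softmax mapping from the attention output vector $\gamma$ to a class label. The projection matrix and the tendency labels are assumptions for illustration, since the patent does not enumerate the class names.

```python
import numpy as np

def classify(gamma, W_c, labels=("favorable", "unfavorable")):
    """Map the attention output gamma through a learned projection W_c
    (shape: num_classes x len(gamma)) and softmax to class probabilities."""
    logits = W_c @ gamma
    probs = np.exp(logits - logits.max())  # numerically stable softmax
    probs /= probs.sum()
    return labels[int(np.argmax(probs))], probs
```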

Claims (8)

1. A BiGRU judgment result tendency analysis method based on an attention mechanism is characterized in that:
extracting key word information of parties of a referee document to be analyzed and key word information of a judgment result;
segmenting the judgment result into a plurality of single sentences, and performing word segmentation and stop word removal processing on each single sentence to obtain a word sequence;
constructing a word vector table, and expressing the word sequence as a corresponding word vector matrix by using the word vector table;
performing BiGRU calculation on the word vector matrix to obtain a feature vector of the word vector matrix, and performing attention calculation on the feature vector of the word vector matrix to obtain an output vector of an attention mechanism;
the output vector of the attention mechanism is classified using softmax.
2. The attention mechanism-based BiGRU decision-tendency analysis method of claim 1, wherein: the keyword information of the parties comprises plaintiff, defendant, appellant, appellee, applicant, and respondent.
3. The attention mechanism-based BiGRU decision-tendency analysis method of claim 1, wherein: the judgment result keyword information includes "the judgment is as follows:", "the ruling is as follows:", "the adjudication is as follows:", and "the decision is as follows:".
4. The attention mechanism-based BiGRU decision-tendency analysis method of claim 1, wherein: if the judgment result does not have the standard legal title of the party, the non-standard legal title of the party is replaced by the standard legal title of the corresponding party, and the method specifically comprises the following steps:
$$C[i,j] = \begin{cases} 0, & i = 0 \text{ or } j = 0 \\ C[i-1,j-1] + 1, & i,j > 0 \text{ and } x_i = y_j \\ \max(C[i,j-1],\, C[i-1,j]), & i,j > 0 \text{ and } x_i \neq y_j \end{cases}$$
wherein C[i,j] represents the length of the longest common subsequence, and the largest C[i,j] is selected and replaced with the corresponding legal title.
5. The attention mechanism-based BiGRU decision-tendency analysis method of claim 1, wherein: the constructing of the word vector table includes the following processes:
removing stop words from and performing word segmentation on the referee document training set to generate the corpus required for constructing word vectors; generating a first vocabulary from the corpus, counting and sorting the word frequency of each word, and taking the V most frequent words to form a second vocabulary;
each word in the second vocabulary is represented by a corresponding one-hot vector of dimension V, generating a one-hot vector table;
the Skip-gram model is then used to reduce the dimensionality of the one-hot vector table, generating the word vector table.
6. The attention mechanism-based BiGRU decision-tendency analysis method of claim 1, wherein: the word vector matrix is represented as:
$(w_1, w_2, w_3, \ldots, w_n) \rightarrow S = (s_1, s_2, s_3, \ldots, s_n)$
wherein $s_i$ is the word vector of the i-th keyword, and $S$ is the word vector matrix.
7. The attention mechanism-based BiGRU decision-tendency analysis method of claim 1, wherein: the feature vectors of the word vector matrix are calculated as follows:
$z_t = \sigma(W_z \cdot [h_{t-1}, x_t])$
$r_t = \sigma(W_r \cdot [h_{t-1}, x_t])$
$\tilde{h}_t = \tanh(W \cdot [r_t \odot h_{t-1}, x_t])$
$h_t = (1 - z_t) \odot h_{t-1} + z_t \odot \tilde{h}_t$
wherein $x_t$ is the input data, $h_t$ is the output of the current GRU unit, $h_{t-1}$ is the output of the previous unit, $z_t$ is the update gate, and $r_t$ is the reset gate; $z_t$ and $r_t$ jointly control the computation from hidden state $h_{t-1}$ to hidden state $h_t$; the update gate weighs the current input data against the previous memory $h_{t-1}$ and outputs a value $z_t$ between 0 and 1 that determines how much of $h_{t-1}$ is passed to the next state;
$\tilde{h}_t$ is the candidate hidden state, and the reset gate controls how much of the previous hidden state, carrying past time information, flows into it; in the formulas, $\sigma$ is the sigmoid function, $\tanh$ is the tangent activation function, and $W_z$, $W_r$, and $W$ are the weight matrices of the update gate, the reset gate, and the candidate hidden state, respectively ($\odot$ denotes the element-wise product).
8. The attention mechanism-based BiGRU decision-tendency analysis method of claim 1, wherein: the output vector calculation process of the attention mechanism is as follows:
1) performing similarity calculation on the word vector of each word in the word vector matrix and all the word vectors in the matrix to obtain a weight, specifically comprising:
$M = \tanh(h_t)$
where $h_t$ is the output vector at time step $t$ computed by the BiGRU layer, $\tanh$ is the activation function, and $M$ is a temporary weight matrix;
2) normalizing the temporary weight matrix by utilizing a softmax function, which specifically comprises the following steps:
$\alpha = \mathrm{softmax}(w^T M)$
wherein $w^T$ is a randomly initialized weight matrix learned in training, and the attention weight matrix $\alpha$ is obtained through the softmax calculation;
3) performing weighted summation on the weights and the corresponding feature vectors to obtain an output vector of the attention layer, specifically:
$\gamma = h_t \alpha^T$
where γ is the output vector of the attention layer.
CN201811166731.8A 2018-10-08 2018-10-08 BiGRU judgment result tendency analysis method based on attention mechanism Pending CN111027313A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201811166731.8A CN111027313A (en) 2018-10-08 2018-10-08 BiGRU judgment result tendency analysis method based on attention mechanism


Publications (1)

Publication Number Publication Date
CN111027313A true CN111027313A (en) 2020-04-17

Family

ID=70190357

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201811166731.8A Pending CN111027313A (en) 2018-10-08 2018-10-08 BiGRU judgment result tendency analysis method based on attention mechanism

Country Status (1)

Country Link
CN (1) CN111027313A (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105938495A (en) * 2016-04-29 2016-09-14 乐视控股(北京)有限公司 Entity relationship recognition method and apparatus
CN108304365A (en) * 2017-02-23 2018-07-20 腾讯科技(深圳)有限公司 keyword extracting method and device
CN107247702A (en) * 2017-05-05 2017-10-13 桂林电子科技大学 A kind of text emotion analysis and processing method and system
CN108320051A (en) * 2018-01-17 2018-07-24 哈尔滨工程大学 A kind of mobile robot dynamic collision-free planning method based on GRU network models
CN108595601A (en) * 2018-04-20 2018-09-28 福州大学 A kind of long text sentiment analysis method incorporating Attention mechanism


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination