CN109213868A - Entity-level sentiment classification method based on a convolutional attention mechanism network - Google Patents
- Publication number: CN109213868A
- Application number: CN201811394014.0A
- Authority: CN (China)
- Prior art keywords: text, vector, convolution, matrix, attention mechanism
- Legal status: Pending (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G — PHYSICS
- G06 — COMPUTING; CALCULATING OR COUNTING
- G06N — COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
- G06N3/00 — Computing arrangements based on biological models
- G06N3/02 — Neural networks
- G06N3/045 — Combinations of networks
- G06N3/084 — Backpropagation, e.g. using gradient descent
Abstract
The present invention proposes an entity-level sentiment classification method based on a convolutional attention mechanism network. The method processes a target text to obtain a text matrix and a target entity vector; processes the text matrix and the target entity vector to obtain a text feature vector; processes the text feature vector and the text matrix to obtain a new text feature vector; repeats step S30 M times to obtain M text feature vectors; and concatenates all text feature vectors, applies a linear transformation, and feeds the result into an activation function to obtain the probability that the text belongs to each sentiment category. When computing attention weights, the method also takes the words surrounding each word into consideration, so that the final text representation has stronger sentiment-expression capability, ultimately achieving more accurate entity-level sentiment classification.
Description
Technical field
The present invention relates to the technical field of natural language analysis, and in particular to an entity-level sentiment classification method based on a convolutional attention mechanism network.
Background technique
In recent years, with the rapid development of Internet technology and the popularity of online shopping, a large number of product reviews have appeared on the Internet. Using these reviews to judge the sentiment orientation of the texts, and thereby analyzing customers' demands for a product in order to improve it, has become a much-discussed research hotspot. Entity-level sentiment analysis goes further: it obtains, at a finer granularity, a reviewer's sentiment orientation toward a specific aspect of a product. Earlier sentiment classification methods based on hand-crafted features can no longer meet practical demands, and neural-network-based methods have become the mainstream of sentiment analysis. Neural-network-based sentiment classification has been applied in scenarios such as public opinion analysis and analysis of user feedback on products.
Existing entity-level sentiment classification methods are mainly based on recurrent neural networks (RNNs) and the attention mechanism. Because it can retain past information, a recurrent neural network is commonly used to represent quantities with sequential relationships. Since the words in a text also appear in a fixed order, a recurrent neural network is well suited to modeling the contextual relationships between words, yielding a vector representation that captures the semantic content of the text.
The main function of the attention mechanism is to assign each word in a sentence a weight that represents the importance of that word within the sentence, so that the vectors of words that contribute more to the sentence's semantics receive larger weights. In sentiment classification, the attention mechanism can assign larger weights to sentiment-bearing words and degree adverbs, producing a vector representation relevant to the sentiment of the sentence. Entity-level sentiment classification is a more fine-grained task: it must determine the sentiment category a passage expresses toward a given entity, yet a passage may mention multiple entities and express a different sentiment toward each of them. In this case, besides emphasizing sentiment-bearing words, the attention mechanism must also distinguish whether those words are directed at the target entity, assigning larger weights only to sentiment words aimed at the target entity and smaller weights to sentiment words aimed at other entities. The text vector obtained in this way then reflects the sentiment category with respect to the target entity.
However, methods based on recurrent neural networks and the attention mechanism have two problems. The first is that, because a recurrent neural network retains past information, it must compute step by step in sequence order while keeping all past information. This characteristic prevents recurrent neural networks from being parallelized when running on a GPU, so computational efficiency cannot be improved. The second problem is that, under normal circumstances, when the attention mechanism computes the weight of a word, it only considers the degree of correlation between that word and the target entity. Yet a word's contribution to the semantics of the whole sentence depends not only on the word itself but also on the words around it, because a degree adverb or negation word modifying the word can significantly change its effect in the sentence; current attention mechanisms do not take this into account.
In view of the foregoing, it is necessary to propose an entity-level sentiment classification method based on a convolutional attention mechanism network.
Summary of the invention
To solve the above technical problem of how to improve the accuracy of entity-level sentiment classification, the invention proposes an entity-level sentiment classification method based on a convolutional attention mechanism network. The method includes:
Step S10: process the target text to obtain a text matrix and a target entity vector;
Step S20: process the text matrix and the target entity vector to obtain a text feature vector;
Step S30: process the text feature vector and the text matrix to obtain a new text feature vector;
repeat step S30 M times, each time taking the text feature vector just obtained as the next input, finally obtaining M text feature vectors;
Step S40: concatenate all text feature vectors, apply a linear transformation, and feed the result into an activation function to obtain the probability that the text belongs to each sentiment category;
Step S50: optimize the error function of the convolutional attention mechanism network using the back-propagation algorithm, and then update all parameters in the network.
Preferably, the processing of step S20 includes:
Step S21: input the obtained text matrix and target entity vector into the convolutional attention mechanism unit to obtain the attention weight of each word in the text;
Step S22: compute a weighted sum of the word vectors contained in the text matrix using the attention weights to obtain the text feature vector.
Preferably, the weighted sum in step S22 is specifically:
Step S221: multiply the word vector corresponding to each word in the text by its attention weight;
Step S222: sum the set of vectors obtained in step S221 to obtain the text feature vector.
Preferably, the processing of step S30 includes:
Step S31: input the obtained text feature vector into the convolutional attention mechanism unit;
Step S32: input the text matrix into the convolutional attention mechanism unit;
Step S33: repeat the attention-weight computation and the weighted-sum operation to obtain a new text feature vector.
Preferably, step S10 includes:
Step S11: segment the target text using an open-source word segmentation algorithm to obtain an ordered set of words;
Step S12: using word vectors pre-trained on a large amount of text obtained from the Internet, represent each word in the text and the target entity as a vector of dimension D obtained by pre-training;
Step S13: arrange the word vectors of the words in the text in word order and combine them to obtain the text matrix.
Preferably, the convolutional attention mechanism in the method takes a known keyword vector and a content matrix as input, and includes:
splicing the keyword vector onto each word vector in the content matrix to obtain a "keyword-content" matrix;
performing convolution operations on the "keyword-content" matrix using several convolution kernels of specified sizes to obtain a convolution feature matrix;
adding a bias vector to the convolution feature matrix;
using the hyperbolic tangent function as the activation function of the convolution operation, expressed as:
tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x))
where x denotes the value of each element in the feature matrix;
for the feature vector corresponding to each word in the convolution feature matrix, performing a max-pooling operation to obtain the attention weight features;
normalizing the resulting attention weight feature vector using the softmax function, expressed as:
softmax(x_i) = e^(x_i) / Σ_j e^(x_j)
where x_j denotes the j-th element of the feature vector and x_i the current element.
Preferably, the max-pooling operation specifically includes:
retaining only the maximum value among all elements of the feature vector as the attention weight feature value of the corresponding word;
obtaining an attention weight feature vector whose dimension equals the number of words in the text.
Preferably, step S40 includes:
Step S41: concatenate the obtained text feature vectors;
Step S42: multiply the concatenated vector by a linear transformation weight matrix and add a bias vector;
Step S43: feed the feature vector after the linear transformation into a nonlinear activation function to obtain the probability that the text belongs to each sentiment category.
Preferably, the error function in step S50 includes:
the sum of the cross-entropy loss between the sentiment probability distribution of the obtained probability vector and the sentiment probability distribution of the training sample, and a regularization term over the network parameters.
Preferably, the cross-entropy loss function is:
L = −(1/n) Σ Σ y · ln(x)
where the outer sum runs over the n training samples and the inner sum over the sentiment categories; x and y denote, respectively, the sentiment probability distribution produced by the network and the true sentiment probability distribution of the training sample.
Preferably, the regularization term of the network parameters refers to the sum of the 2-norms of all weight matrices and bias vectors mentioned in the network.
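As a hedged sketch, the error function described above — mean cross-entropy plus a regularization term over the parameter 2-norms — can be computed as follows. The regularization coefficient `lam` and all numeric values are assumptions for illustration, not taken from the patent:

```python
import math

def loss(pred_dists, true_dists, weight_matrices, lam=0.01):
    """Mean cross-entropy between the network's predicted sentiment
    distributions and the true ones, plus lam times the sum of the 2-norms
    of all parameter matrices (the regularization term)."""
    n = len(pred_dists)
    ce = -sum(y * math.log(x)
              for xs, ys in zip(pred_dists, true_dists)
              for x, y in zip(xs, ys)) / n
    reg = lam * sum(math.sqrt(sum(v * v for row in mat for v in row))
                    for mat in weight_matrices)
    return ce + reg

# One sample, two classes, one 1 x 2 weight matrix (toy values).
L = loss([[0.9, 0.1]], [[1.0, 0.0]], [[[3.0, 4.0]]], lam=0.1)
# reg = 0.1 * ||(3, 4)||_2 = 0.5; ce = -ln(0.9)
```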
In the entity-level sentiment classification method based on a convolutional attention mechanism network of the present invention, a text matrix and a target entity vector are obtained by processing the target text; the text matrix and the target entity vector are processed to obtain a text feature vector; the text feature vector and the text matrix are processed to obtain a new text feature vector; step S30 is repeated M times to obtain M text feature vectors; all text feature vectors are concatenated, linearly transformed, and fed into an activation function to obtain the probability that the text belongs to each sentiment category. When computing attention weights, the method also takes the words surrounding each word into consideration, so that the final text representation has stronger sentiment-expression capability, ultimately achieving more accurate entity-level sentiment classification.
Description of the drawings
The accompanying drawings, which form a part of the invention and provide a further understanding of it, together with the illustrative embodiments and their descriptions serve to explain the present invention without unduly limiting it. Obviously, the drawings described below show only some embodiments; those skilled in the art can obtain other drawings from them without creative effort. In the drawings:
Fig. 1 is a flow diagram of the entity-level sentiment classification method based on a convolutional attention mechanism network according to an embodiment of the present invention;
Fig. 2 is a flow diagram of processing the text matrix and the target entity vector in an embodiment of the method;
Fig. 3 is a flow diagram of computing the weighted sum of the word vectors in the text matrix using the attention weights in an embodiment of the method;
Fig. 4 is a flow diagram of processing the text feature vector and the text matrix in an embodiment of the method;
Fig. 5 is a flow diagram of processing the target text to obtain the text matrix and the target entity vector in an embodiment of the method.
The realization of the objects, functions, and advantages of the present invention will be further described with reference to the accompanying drawings in connection with the embodiments.
Specific embodiments
The invention proposes an entity-level sentiment classification method based on a convolutional attention mechanism network, intended to solve the technical problem of how to improve the accuracy of entity-level sentiment classification.
In an embodiment of the present invention, referring to Figs. 1-5, the method includes the following steps:
Step S10: process the target text to obtain a text matrix and a target entity vector.
Specifically, as shown in Fig. 5, step S10 includes:
Step S11: segment the target text using an open-source word segmentation algorithm to obtain an ordered set of words.
In this embodiment, Chinese text can be segmented using the jieba word segmentation library; English text, being inherently composed of words, does not need to be segmented.
As an example, the text "The screen of this phone is very big, but the battery endurance is not very strong." is segmented into "this / phone / 's / screen / very / big / , / but / battery / endurance / not / very / strong / .".
Step S12: using word vectors pre-trained on a large amount of text obtained from the Internet, represent each word in the text and the target entity as a vector of dimension D obtained by pre-training.
In the above step, word vectors are trained using the GloVe algorithm; the resulting word vectors have the ability to represent the semantics of words.
Step S13: arrange the word vectors of the words in the text in word order and combine them to obtain the text matrix.
Specifically, for a text containing n words, each with a word vector of dimension D, expressing each word as a word vector yields an ordered set of n D-dimensional vectors; combining them in order yields a text matrix of size n × D, while the entity is expressed as a vector of dimension D.
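The construction of the n × D text matrix in steps S12-S13 can be sketched as follows; the embedding table and its values are toy assumptions standing in for real pre-trained (e.g. GloVe) vectors:

```python
# Toy pre-trained embedding table (hypothetical values); dimension D = 4.
D = 4
embeddings = {
    "this":   [0.1, 0.2, 0.0, 0.3],
    "phone":  [0.5, 0.1, 0.4, 0.2],
    "screen": [0.3, 0.6, 0.1, 0.0],
}

def text_matrix(words):
    """Stack the word vectors in word order -> an n x D text matrix."""
    return [embeddings[w] for w in words]

mat = text_matrix(["this", "phone", "screen"])   # n = 3 rows of dimension D
```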
Step S20: process the text matrix and the target entity vector to obtain a text feature vector.
In this step, the text matrix and the target entity vector are processed to obtain the text feature vector. The specific steps, shown in Fig. 2, include:
Step S21: input the obtained text matrix and target entity vector into the convolutional attention mechanism unit to obtain the attention weight of each word in the text.
The attention weight of each word is obtained by feeding the text matrix and the entity vector into the convolutional attention mechanism unit, which comprises steps S21a to S21f.
Step S21a: treat the target entity vector as the keyword-vector input of the convolutional attention mechanism unit, and treat the text matrix as the content matrix.
Step S21b: splice the keyword vector onto each word vector in the content matrix to obtain the "keyword-content" matrix.
Specifically, for a content matrix of size n × D and a keyword vector of dimension D, the spliced "keyword-content" matrix has size n × 2D.
Step S21c: perform convolution operations on the "keyword-content" matrix using several convolution kernels of specified sizes to obtain the convolution feature matrix.
Specifically, for a "keyword-content" matrix of size n × 2D, k convolution kernels of size w × 2D are used to convolve it. To keep the first dimension of the matrix unchanged before and after the convolution, the "keyword-content" matrix must be padded before the operation, i.e., an all-zero matrix of size (w−1)/2 × 2D is spliced onto the head and tail of the matrix. This finally yields a convolution feature matrix of size n × k.
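A minimal sketch of steps S21b-S21c (splicing and padded convolution), in plain Python with toy sizes; the kernel values are assumptions chosen only to make the arithmetic easy to check:

```python
def conv_attention_features(content, keyword, kernels):
    """content: list of n word vectors (dimension D); keyword: D-dim entity vector;
    kernels: k convolution kernels, each a w x 2D matrix (w odd).
    Returns the n x k convolution feature matrix (stride 1, zero row padding)."""
    n, D = len(content), len(keyword)
    # Step S21b: splice the keyword vector onto every word vector -> n x 2D matrix.
    kc = [row + keyword for row in content]
    w = len(kernels[0])
    # Pad head and tail with (w-1)/2 all-zero rows so the output keeps n rows.
    pad_rows = [[0.0] * (2 * D) for _ in range((w - 1) // 2)]
    kc = pad_rows + kc + pad_rows
    features = []
    for i in range(n):
        window = kc[i:i + w]
        features.append([sum(window[a][b] * ker[a][b]
                             for a in range(w) for b in range(2 * D))
                         for ker in kernels])
    return features

# Toy sizes: n = 3 words, D = 2, one kernel (k = 1) of height w = 3 that
# simply sums the center row of each window.
content = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
keyword = [0.5, 0.5]
kernels = [[[0.0] * 4, [1.0] * 4, [0.0] * 4]]
F = conv_attention_features(content, keyword, kernels)
# F[i][0] = sum of word i's spliced row -> [[2.0], [2.0], [3.0]]
```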
Step S21d: add the bias vector to the convolution feature matrix.
Step S21e: feed the convolution feature matrix into a nonlinear activation function.
In the above step, the hyperbolic tangent function is used as the activation function of the convolution operation:
tanh(x) = (e^x − e^(−x)) / (e^x + e^(−x))
Step S21f: for the k-dimensional feature corresponding to each word in the convolution feature matrix, perform a max-pooling operation to obtain the attention weight feature vector.
The max-pooling operation retains only the maximum value of the k-dimensional feature as the attention weight feature value of the corresponding word, yielding an attention weight feature vector whose dimension equals the number of words in the text.
Step S21h: normalize the resulting attention weight feature vector using the softmax function.
Specifically, for an element x_i of the weight feature vector, the normalized value is obtained by:
α_i = e^(x_i) / Σ_j e^(x_j)
The value obtained by the above formula is the attention weight of the i-th word in the text.
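Steps S21d-S21h (bias, tanh, max-pooling, softmax) can be sketched as one function; the input feature matrix and bias below are toy values, not from the patent:

```python
import math

def attention_weights(conv_features, bias):
    """conv_features: n x k convolution feature matrix; bias: length-k vector.
    Adds the bias, applies tanh, max-pools each row to one scalar per word,
    then softmax-normalizes the n scalars into attention weights."""
    scores = [max(math.tanh(v + bias[j]) for j, v in enumerate(row))
              for row in conv_features]
    exps = [math.exp(s) for s in scores]
    z = sum(exps)
    return [e / z for e in exps]

# Toy 3-word, k = 2 feature matrix with a zero bias (assumed values).
alphas = attention_weights([[0.2, -1.0], [2.0, 0.5], [0.2, 0.2]], [0.0, 0.0])
```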
Step S22: compute a weighted sum of the word vectors contained in the text matrix using the attention weights to obtain the text feature vector.
The weighted sum in step S22, shown in Fig. 3, is specifically:
Step S221: multiply the word vector corresponding to each word in the text by its attention weight;
Step S222: sum the set of vectors obtained in step S221 to obtain the text feature vector.
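Steps S221-S222 amount to an attention-weighted average of the word vectors; a minimal sketch with toy numbers:

```python
def text_feature(text_mat, alphas):
    """Steps S221/S222: weighted sum of the word vectors by their attention
    weights, producing a single D-dimensional text feature vector."""
    D = len(text_mat[0])
    return [sum(a * row[d] for a, row in zip(alphas, text_mat)) for d in range(D)]

v = text_feature([[1.0, 0.0], [0.0, 2.0]], [0.25, 0.75])
# -> [0.25, 1.5]
```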
Step S30: process the text feature vector and the text matrix to obtain a new text feature vector. As shown in Fig. 4, the specific steps include:
Step S31: input the obtained text feature vector into the convolutional attention mechanism unit;
Step S32: input the text matrix into the convolutional attention mechanism unit;
Step S33: repeat the attention-weight computation and the weighted-sum operation to obtain a new text feature vector.
Repeat step S30 M times, each time taking the text feature vector just obtained as the next input, finally obtaining M text feature vectors.
Step S30 uses the same method as feeding the target entity vector and the text matrix into the convolutional attention mechanism unit to compute the attention weights and obtain a text feature vector, so the details are not repeated here.
It should be noted that step S30 can be repeated M times, each repetition yielding a new text feature vector. It follows that if step S30 is repeated M times, then together with the text feature vector obtained in the previous step, M + 1 text feature vectors are obtained in total.
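The stacking described above can be sketched as a loop. Here `attention_unit` stands in for the convolutional attention mechanism unit, and `toy_unit` is a stand-in with assumed behavior, used only to show the data flow:

```python
def stacked_features(entity_vec, text_mat, attention_unit, M):
    """Apply the attention unit once with the entity vector as query (step S20),
    then M more times, each round feeding the previous output back in as the
    query (step S30). Returns all M + 1 text feature vectors."""
    feats = [attention_unit(entity_vec, text_mat)]
    for _ in range(M):
        feats.append(attention_unit(feats[-1], text_mat))
    return feats

# Stand-in unit (assumed behavior): column means shifted by the query's first
# component. It only illustrates the data flow, not the real unit.
def toy_unit(query, mat):
    D = len(mat[0])
    return [sum(r[d] for r in mat) / len(mat) + query[0] for d in range(D)]

fs = stacked_features([1.0, 0.0], [[0.0, 0.0], [2.0, 2.0]], toy_unit, M=2)
# 1 + M = 3 feature vectors: [2.0, 2.0], [3.0, 3.0], [4.0, 4.0]
```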
Step S40: concatenate all text feature vectors, apply a linear transformation, and feed the result into an activation function to obtain the probability that the text belongs to each sentiment category.
Step S40 includes:
Step S41: concatenate the obtained text feature vectors.
Specifically, if M + 1 text feature vectors are obtained, each of dimension D, then the dimension of the concatenated vector is (M + 1) × D.
Step S42: multiply the concatenated vector by the linear transformation weight matrix and add the bias vector.
Step S43: feed the feature vector after the linear transformation into a nonlinear activation function to obtain the probability that the text belongs to each sentiment category.
Here the softmax function is used as the nonlinear activation function; its formula is the same as in step S21h and is not repeated.
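Steps S41-S43 can be sketched as follows; the weight matrix W, bias b, and feature values are toy assumptions:

```python
import math

def classify(feature_vectors, W, b):
    """Steps S41-S43: concatenate the M + 1 feature vectors into one
    (M + 1) * D vector, apply the linear map (W, b), softmax the logits."""
    x = [v for fv in feature_vectors for v in fv]
    logits = [sum(wi * xi for wi, xi in zip(row, x)) + bj
              for row, bj in zip(W, b)]
    exps = [math.exp(l) for l in logits]
    z = sum(exps)
    return [e / z for e in exps]

# Two 2-dim feature vectors, three sentiment classes; W and b are toy values.
p = classify([[1.0, 0.0], [0.0, 1.0]],
             W=[[1.0, 0.0, 0.0, 0.0],
                [0.0, 1.0, 0.0, 0.0],
                [0.0, 0.0, 1.0, 1.0]],
             b=[0.0, 0.0, 0.0])
```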
When the above technical solution performs entity-level sentiment classification, the attention mechanism is realized with convolution operations, so the algorithm can be computed in parallel when running on a GPU, greatly improving computational efficiency. In addition, the words surrounding each word are taken into consideration when computing attention weights, so that the final text representation has stronger sentiment-expression capability, ultimately achieving more accurate entity-level sentiment classification.
In another embodiment of the invention, after step S40 the method further includes:
Step S50: optimize the error function of the convolutional attention mechanism network using the back-propagation algorithm, and then update all parameters in the network.
In this embodiment, the optimization objective can be optimized using the stochastic gradient descent algorithm.
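A single parameter update under stochastic gradient descent, as mentioned above, reduces to the rule p ← p − η · ∂L/∂p; a minimal sketch (learning rate and gradients are toy values):

```python
def sgd_step(params, grads, lr=0.1):
    """One stochastic gradient descent update: p <- p - lr * dL/dp."""
    return [p - lr * g for p, g in zip(params, grads)]

updated = sgd_step([1.0, -2.0], [10.0, -10.0], lr=0.1)
# -> [0.0, -1.0]
```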
The present invention further provides an entity-level sentiment classification system based on a convolutional attention mechanism network. The system includes a memory 101, a processor 102, and an entity-level sentiment classification program based on a convolutional attention mechanism network that is stored on the memory 101 and runnable on the processor 102; when the program is executed by the processor 102, the method described above is realized.
In addition, an embodiment of the present invention also proposes a computer-readable storage medium on which an entity-level sentiment classification program based on a convolutional attention mechanism network is stored; when executed by a processor, the program realizes the method described above.
Although the steps are described in the above order in the embodiments, those skilled in the art will appreciate that, in order to achieve the effect of the embodiments, different steps need not be executed in that order; they may be executed simultaneously (in parallel) or in reverse order, and these simple variations all fall within the protection scope of the present invention.
The technical solutions provided by the embodiments of the invention have been described in detail above. Although specific examples are used herein to explain the principles and embodiments of the present invention, the description of the above embodiments is only intended to help understand the principles of the embodiments; meanwhile, those skilled in the art may, according to the embodiments of the present invention, make changes within the specific implementation and scope of application.
It should be noted that the flowcharts or block diagrams referred to herein are not limited to the forms shown herein and may also be divided and/or combined.
It should be noted that the labels and text in the drawings are only intended to illustrate the present invention more clearly and are not intended as improper limitations on the scope of the present invention.
It should also be noted that the terms "first", "second", etc. in the specification, claims, and drawings are used to distinguish similar objects and not to describe a particular order or precedence. It should be understood that data so used are interchangeable where appropriate, so that the embodiments of the present invention described herein can be implemented in orders other than those illustrated or described herein.
The term "comprising" or any other similar term is intended to cover a non-exclusive inclusion, so that a process, method, article, or device that comprises a series of elements includes not only those elements but also other elements not explicitly listed, or elements inherent to such a process, method, article, or device.
As used herein, the term "module" may refer to a software object or routine executed on a computing system. The different modules described herein may be implemented as objects or processes executed on a computing system (for example, as independent threads). While it is preferable to implement the systems and methods described herein in software, implementations in hardware or in a combination of software and hardware are also possible and conceivable.
Each step of the invention may be realized with general-purpose computing devices. For example, they may be concentrated on a single computing device, such as a personal computer, a server computer, a handheld or portable device, a laptop device, or a multiprocessor device, or distributed over a network of multiple computing devices. They may execute the steps in an order different from that shown or described herein, or they may be fabricated as individual integrated circuit modules, or multiple modules or steps among them may be fabricated as a single integrated circuit module. Therefore, the present invention is not limited to any specific combination of hardware and software.
The method provided by the invention may be realized using programmable logic devices, or may be implemented as computer program software or program modules (including routines, programs, objects, components, or data structures that perform specific tasks or implement specific abstract data types). For example, an embodiment of the present invention may be a computer program product which, when run, causes a computer to execute the demonstrated method. The computer program product includes a computer-readable storage medium on which computer program logic or code portions for realizing the method are stored. The computer-readable storage medium may be a built-in medium installed in a computer or a removable medium detachable from the basic computer (for example, a storage device using hot-plug technology). The built-in medium includes, but is not limited to, rewritable non-volatile memory such as RAM, ROM, flash memory, and hard disks. The removable medium includes, but is not limited to: optical storage media (such as CD-ROM and DVD), magneto-optical storage media (such as MO), magnetic storage media (such as tape or removable hard disks), media with built-in rewritable non-volatile memory (such as memory cards), and media with built-in ROM (such as ROM cartridges).
The above is only a preferred embodiment of the present invention and is not intended to limit the scope of the invention. Any equivalent structure or equivalent process transformation made using the contents of the specification and drawings of the present invention, applied directly or indirectly in other related technical fields, is likewise included within the scope of patent protection of the present invention.
Claims (11)
1. An entity-level sentiment classification method based on a convolutional attention mechanism network, characterized in that the method comprises:
Step S10: processing a target text to obtain a text matrix and a target-entity vector;
Step S20: preprocessing the text matrix and the target-entity vector to obtain a text feature vector;
Step S30: preprocessing the text feature vector and the text matrix to obtain a new text feature vector;
repeating step S30 M times, each resulting text feature vector serving as the input to the next repetition, so that M text feature vectors are finally obtained;
Step S40: concatenating all text feature vectors, applying a linear transformation, and feeding the result into an activation function to obtain the probability that the text belongs to each emotion class.
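The overall flow of claim 1 can be sketched as follows. This is a minimal numpy illustration, not the patented implementation: the function names are hypothetical, and the attention unit is stubbed out with uniform weights (claims 2-7 define the real convolutional attention unit).

```python
import numpy as np

def conv_attention_unit(key_vec, text_matrix):
    """Stand-in for the convolutional attention unit of claims 2-7:
    produces one attention weight per word (uniform here, for illustration)
    and returns the attention-weighted sum of the word vectors."""
    n_words = text_matrix.shape[0]
    weights = np.full(n_words, 1.0 / n_words)
    return text_matrix.T @ weights

def classify(text_matrix, entity_vec, M, W, b):
    """Steps S20-S40: derive M feature vectors, concatenate, project, softmax."""
    features = []
    key = entity_vec                       # S20 keys on the target entity
    for _ in range(M):                     # S30 repeated M times
        key = conv_attention_unit(key, text_matrix)
        features.append(key)               # each output feeds the next round
    z = W @ np.concatenate(features) + b   # S40: linear transformation
    e = np.exp(z - z.max())
    return e / e.sum()                     # probability per emotion class

rng = np.random.default_rng(0)
text = rng.normal(size=(5, 3))    # 5 words, D = 3
entity = rng.normal(size=3)
W = rng.normal(size=(4, 2 * 3))   # M = 2 feature vectors, 4 emotion classes
probs = classify(text, entity, M=2, W=W, b=np.zeros(4))
print(probs.shape)  # (4,): one probability per emotion class
```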
2. the entity level sensibility classification method according to claim 1 based on convolution attention mechanism network, feature
It is, the pretreatment of the step S20 includes:
Text matrix obtained and target entity vector are inputted convolution attention mechanism unit, obtain text by step S21
In each word attention weight;
The term vector that text matrix is included is weighted summation using attention weight by step S22, obtain text feature to
Amount.
3. the entity level sensibility classification method according to claim 2 based on convolution attention mechanism network, feature
It is, the term vector that text matrix is included is weighted summation using attention weight in the step S22 specifically:
Step S221, by term vector corresponding to word each in text multiplied by corresponding attention weight;
Step S222 sums the vector set that step S221 is obtained, and obtains Text eigenvector.
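Steps S221-S222 amount to a single attention-weighted sum. A small numpy sketch with made-up numbers:

```python
import numpy as np

# Text matrix: one row per word (D-dimensional word vectors), and the
# attention weights alpha produced by the conv-attention unit (claim 2).
text_matrix = np.array([[1.0, 0.0],
                        [0.0, 2.0],
                        [3.0, 3.0]])
alpha = np.array([0.5, 0.3, 0.2])

# Step S221: scale each word vector by its attention weight.
weighted = alpha[:, None] * text_matrix
# Step S222: sum the scaled vectors into a single text feature vector.
feature = weighted.sum(axis=0)
print(feature)  # [1.1 1.2]
```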
4. the entity level sensibility classification method according to claim 1 based on convolution attention mechanism network, feature
It is, the pretreatment of the step S30 includes:
The feature vector of text obtained is inputted convolution attention mechanism unit by step S31;
Text matrix is inputted convolution attention mechanism unit by step S32;
Step S33 repeats the operation for obtaining attention weight and weighted sum operation, obtains new Text eigenvector.
5. the entity level sensibility classification method according to claim 1 based on convolution attention mechanism network, feature
It is, the step S10 includes:
Step S11 carries out participle to target text using open source segmentation methods and obtains orderly set of words;
Step S12, using a large amount of text pre-training term vectors obtained from internet, for each word and mesh in text
Entity is marked, indicates word using the vector row that the dimension that pre-training obtains is D;
The term vector of each word in text is arranged and is combined by word order by step S13, obtains text matrix.
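Steps S12-S13 can be illustrated as follows. The embedding table and tokens here are hypothetical toy data standing in for the pretrained vectors the claim describes; the out-of-vocabulary fallback to a zero vector is an assumption, not part of the claim.

```python
import numpy as np

# Hypothetical pretrained D-dimensional word embeddings (claim 5 obtains
# these by pretraining on a large corpus from the internet).
D = 4
embeddings = {
    "the":   np.zeros(D),
    "pizza": np.ones(D),
    "was":   np.full(D, 0.5),
    "great": np.full(D, 2.0),
}

def build_text_matrix(tokens, emb, dim):
    """Step S13: stack the word vectors in sentence order."""
    rows = [emb.get(t, np.zeros(dim)) for t in tokens]  # OOV -> zero vector
    return np.stack(rows)

tokens = ["the", "pizza", "was", "great"]  # output of step S11 (tokenizer)
matrix = build_text_matrix(tokens, embeddings, D)
print(matrix.shape)  # (4, 4): words x embedding dimension
```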
6. the entity level sensibility classification method according to claim 1 based on convolution attention mechanism network, feature
It is, the convolution attention mechanism in the method includes: known keyword vector and content matrix;
Crucial term vector and term vector each in content matrix are spliced, " keyword-content " matrix is obtained;
Convolution operation is carried out to " keyword-entity " matrix using the convolution kernel of several specified sizes, obtains convolution eigenmatrix;
Convolution eigenmatrix is added with bias vector;
Use hyperbolic tangent function as the activation primitive of convolution operation, formula indicates are as follows:
Wherein x indicates the value of each element in eigenmatrix;
For feature vector corresponding to each word in convolution eigenmatrix, carries out maximizing pondization operation, be infused
Meaning power weight feature;
Gained attention weight feature vector is normalized using softmax function;Wherein, the formula of softmax function
It indicates are as follows:
Wherein xjIndicate that j-th of element in feature vector, x indicate current element value.
7. the entity level sensibility classification method according to claim 6 based on convolution attention mechanism network, feature
It is, the maximization pondization operation specifically includes:
Only the maximum value in keeping characteristics vector all elements is as attention weight characteristic value corresponding to word;
Obtain dimension and the consistent attention weight feature vector of textual words number.
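The convolutional attention unit of claims 6-7 can be sketched in numpy as below. This is a simplified reading, not the patented implementation: the kernels are restricted to width 1 (so the "convolution" reduces to a matrix product over the spliced vectors), and the toy key vector, content matrix, and kernel values are invented for illustration.

```python
import numpy as np

def conv_attention_weights(key_vec, content_matrix, kernels, bias):
    """Claims 6-7 as a sketch: splice the key vector onto every word vector,
    convolve, add a bias vector, apply tanh, max-pool per word, softmax."""
    n_words, _ = content_matrix.shape
    # "keyword-content" matrix: key vector concatenated to each word vector.
    kc = np.hstack([np.tile(key_vec, (n_words, 1)), content_matrix])
    feats = kc @ kernels.T + bias       # width-1 convolution + bias vector
    feats = np.tanh(feats)              # hyperbolic-tangent activation
    pooled = feats.max(axis=1)          # max-pooling: one value per word
    e = np.exp(pooled - pooled.max())
    return e / e.sum()                  # softmax normalization

key = np.array([1.0, 0.0])
content = np.array([[0.2, 0.1], [0.9, 0.8], [0.1, 0.0]])
kernels = np.ones((3, 4))  # 3 kernels over the 4-dim spliced vectors
alpha = conv_attention_weights(key, content, kernels, np.zeros(3))
print(alpha.shape)  # (3,): one attention weight per word, summing to 1
```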
8. the entity level sensibility classification method according to claim 1 based on convolution attention mechanism network, feature
It is, the step S40 includes:
Step S41 splices resulting Text eigenvector;
Vector after splicing is multiplied by step S42 with linear transformation weight matrix, and adds bias vector;
Feature vector after linear transformation is sent into nonlinear activation function by step S43, is obtained text and is belonged to each emotional category
Probability.
9. the entity level sensibility classification method according to claim 1 based on convolution attention mechanism network, feature
It is, the error function in the step S50 includes:
Cross entropy loss function between the emotion probability distribution and training sample emotion probability distribution of acquired probability vector with
And the sum of regular terms of network parameter.
10. the entity level sensibility classification method according to claim 8 based on convolution attention mechanism network, feature
It is, the cross entropy loss function are as follows:
Wherein n indicates the number of training sample, and x and y respectively indicate the obtained emotion probability distribution of network and training sample
Practical emotion probability distribution.
11. the entity level sensibility classification method according to claim 8 based on convolution attention mechanism network, feature
It is, the regular terms of the network parameter includes:
The regular terms of network parameter refers to the sum of 2 norms of all weight matrix and bias vector being mentioned in network.
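The loss described in claims 9-11 can be sketched as follows. The regularization coefficient `lam` is an assumption for illustration (the claims state only that the regularization term is the sum of L2 norms, without a weighting factor), and the toy predictions and parameters are invented.

```python
import numpy as np

def loss(pred, target, params, lam=1e-4):
    """Claims 9-11 as a sketch: mean cross-entropy between the predicted and
    actual emotion distributions, plus the sum of L2 norms of all weight
    matrices and bias vectors (scaled by an assumed coefficient lam)."""
    n = pred.shape[0]
    ce = -np.sum(target * np.log(pred + 1e-12)) / n   # cross-entropy term
    reg = sum(np.linalg.norm(p) for p in params)      # sum of L2 norms
    return ce + lam * reg

pred = np.array([[0.7, 0.2, 0.1],
                 [0.1, 0.8, 0.1]])    # network output distributions
target = np.array([[1.0, 0.0, 0.0],
                   [0.0, 1.0, 0.0]])  # actual (one-hot) distributions
W = np.ones((3, 4)); b = np.zeros(3)  # toy network parameters
print(loss(pred, target, [W, b]) > 0)  # True
```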
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811394014.0A CN109213868A (en) | 2018-11-21 | 2018-11-21 | Entity level sensibility classification method based on convolution attention mechanism network |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201811394014.0A CN109213868A (en) | 2018-11-21 | 2018-11-21 | Entity level sensibility classification method based on convolution attention mechanism network |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109213868A true CN109213868A (en) | 2019-01-15 |
Family
ID=64993568
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201811394014.0A Pending CN109213868A (en) | 2018-11-21 | 2018-11-21 | Entity level sensibility classification method based on convolution attention mechanism network |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109213868A (en) |
Cited By (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109919175A (en) * | 2019-01-16 | 2019-06-21 | 浙江大学 | A kind of more classification methods of entity of combination attribute information |
CN109948165A (en) * | 2019-04-24 | 2019-06-28 | 吉林大学 | Fine granularity feeling polarities prediction technique based on mixing attention network |
CN110046223A (en) * | 2019-03-13 | 2019-07-23 | 重庆邮电大学 | Film review sentiment analysis method based on modified convolutional neural networks model |
CN110263162A (en) * | 2019-06-05 | 2019-09-20 | 阿里巴巴集团控股有限公司 | Convolutional neural networks and its method of progress text classification, document sorting apparatus |
CN110263122A (en) * | 2019-05-08 | 2019-09-20 | 北京奇艺世纪科技有限公司 | A kind of keyword acquisition methods, device and computer readable storage medium |
CN110377744A (en) * | 2019-07-26 | 2019-10-25 | 北京香侬慧语科技有限责任公司 | A kind of method, apparatus, storage medium and the electronic equipment of public sentiment classification |
CN110427568A (en) * | 2019-07-25 | 2019-11-08 | 成都品果科技有限公司 | A kind of collaboration attention recommendation system, method and apparatus based on information |
CN110704619A (en) * | 2019-09-24 | 2020-01-17 | 支付宝(杭州)信息技术有限公司 | Text classification method and device and electronic equipment |
CN111324739A (en) * | 2020-05-15 | 2020-06-23 | 支付宝(杭州)信息技术有限公司 | Text emotion analysis method and system |
CN111611789A (en) * | 2019-02-25 | 2020-09-01 | 北京嘀嘀无限科技发展有限公司 | Statement representation method, representation model training method and device |
CN111832328A (en) * | 2019-04-15 | 2020-10-27 | 北京京东尚科信息技术有限公司 | Bar code detection method, bar code detection device, electronic equipment and medium |
WO2020224106A1 (en) * | 2019-05-07 | 2020-11-12 | 平安科技(深圳)有限公司 | Text classification method and system based on neural network, and computer device |
CN113095070A (en) * | 2021-04-06 | 2021-07-09 | 山东省人工智能研究院 | Relation extraction method based on improved word level attention mechanism |
CN113468876A (en) * | 2020-03-31 | 2021-10-01 | 阿里巴巴集团控股有限公司 | Text processing method, text processing device, electronic device, and computer-readable storage medium |
CN113642572A (en) * | 2021-07-15 | 2021-11-12 | 上海交通大学 | Image target detection method, system and device based on multi-level attention |
CN113742482A (en) * | 2021-07-19 | 2021-12-03 | 暨南大学 | Emotion classification method and medium based on multiple word feature fusion |
CN113762381A (en) * | 2021-09-07 | 2021-12-07 | 上海明略人工智能(集团)有限公司 | Emotion classification method, system, electronic device and medium |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106599933A (en) * | 2016-12-26 | 2017-04-26 | 哈尔滨工业大学 | Text emotion classification method based on the joint deep learning model |
CN107092596A (en) * | 2017-04-24 | 2017-08-25 | 重庆邮电大学 | Text emotion analysis method based on attention CNNs and CCR |
CN107391483A (en) * | 2017-07-13 | 2017-11-24 | 武汉大学 | A kind of comment on commodity data sensibility classification method based on convolutional neural networks |
US20180293499A1 (en) * | 2017-04-11 | 2018-10-11 | Sap Se | Unsupervised neural attention model for aspect extraction |
CN108763326A (en) * | 2018-05-04 | 2018-11-06 | 南京邮电大学 | A kind of sentiment analysis model building method of the diversified convolutional neural networks of feature based |
Patent Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106599933A (en) * | 2016-12-26 | 2017-04-26 | 哈尔滨工业大学 | Text emotion classification method based on the joint deep learning model |
US20180293499A1 (en) * | 2017-04-11 | 2018-10-11 | Sap Se | Unsupervised neural attention model for aspect extraction |
CN107092596A (en) * | 2017-04-24 | 2017-08-25 | 重庆邮电大学 | Text emotion analysis method based on attention CNNs and CCR |
CN107391483A (en) * | 2017-07-13 | 2017-11-24 | 武汉大学 | A kind of comment on commodity data sensibility classification method based on convolutional neural networks |
CN108763326A (en) * | 2018-05-04 | 2018-11-06 | 南京邮电大学 | A kind of sentiment analysis model building method of the diversified convolutional neural networks of feature based |
Non-Patent Citations (1)
Title |
---|
QIAN YI et al.: "Aspect-Level Sentiment Classification with Conv-Attention Mechanism", ICONIP 2018: Neural Information Processing * |
Cited By (24)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109919175A (en) * | 2019-01-16 | 2019-06-21 | 浙江大学 | A kind of more classification methods of entity of combination attribute information |
CN111611789A (en) * | 2019-02-25 | 2020-09-01 | 北京嘀嘀无限科技发展有限公司 | Statement representation method, representation model training method and device |
CN111611789B (en) * | 2019-02-25 | 2024-06-07 | 北京嘀嘀无限科技发展有限公司 | Sentence representation method, representation model training method and device |
CN110046223A (en) * | 2019-03-13 | 2019-07-23 | 重庆邮电大学 | Film review sentiment analysis method based on modified convolutional neural networks model |
CN110046223B (en) * | 2019-03-13 | 2021-05-18 | 重庆邮电大学 | Film evaluation emotion analysis method based on improved convolutional neural network model |
CN111832328A (en) * | 2019-04-15 | 2020-10-27 | 北京京东尚科信息技术有限公司 | Bar code detection method, bar code detection device, electronic equipment and medium |
CN109948165B (en) * | 2019-04-24 | 2023-04-25 | 吉林大学 | Fine granularity emotion polarity prediction method based on mixed attention network |
CN109948165A (en) * | 2019-04-24 | 2019-06-28 | 吉林大学 | Fine granularity feeling polarities prediction technique based on mixing attention network |
WO2020224106A1 (en) * | 2019-05-07 | 2020-11-12 | 平安科技(深圳)有限公司 | Text classification method and system based on neural network, and computer device |
CN110263122A (en) * | 2019-05-08 | 2019-09-20 | 北京奇艺世纪科技有限公司 | A kind of keyword acquisition methods, device and computer readable storage medium |
CN110263162B (en) * | 2019-06-05 | 2023-05-26 | 创新先进技术有限公司 | Convolutional neural network, text classification method thereof and text classification device |
CN110263162A (en) * | 2019-06-05 | 2019-09-20 | 阿里巴巴集团控股有限公司 | Convolutional neural networks and its method of progress text classification, document sorting apparatus |
CN110427568A (en) * | 2019-07-25 | 2019-11-08 | 成都品果科技有限公司 | A kind of collaboration attention recommendation system, method and apparatus based on information |
CN110377744A (en) * | 2019-07-26 | 2019-10-25 | 北京香侬慧语科技有限责任公司 | A kind of method, apparatus, storage medium and the electronic equipment of public sentiment classification |
CN110704619A (en) * | 2019-09-24 | 2020-01-17 | 支付宝(杭州)信息技术有限公司 | Text classification method and device and electronic equipment |
CN113468876A (en) * | 2020-03-31 | 2021-10-01 | 阿里巴巴集团控股有限公司 | Text processing method, text processing device, electronic device, and computer-readable storage medium |
CN111324739A (en) * | 2020-05-15 | 2020-06-23 | 支付宝(杭州)信息技术有限公司 | Text emotion analysis method and system |
CN113095070A (en) * | 2021-04-06 | 2021-07-09 | 山东省人工智能研究院 | Relation extraction method based on improved word level attention mechanism |
CN113642572B (en) * | 2021-07-15 | 2023-10-27 | 上海交通大学 | Image target detection method, system and device based on multi-level attention |
CN113642572A (en) * | 2021-07-15 | 2021-11-12 | 上海交通大学 | Image target detection method, system and device based on multi-level attention |
CN113742482A (en) * | 2021-07-19 | 2021-12-03 | 暨南大学 | Emotion classification method and medium based on multiple word feature fusion |
CN113742482B (en) * | 2021-07-19 | 2024-05-31 | 暨南大学 | Emotion classification method and medium based on multiple word feature fusion |
CN113762381A (en) * | 2021-09-07 | 2021-12-07 | 上海明略人工智能(集团)有限公司 | Emotion classification method, system, electronic device and medium |
CN113762381B (en) * | 2021-09-07 | 2023-12-19 | 上海明略人工智能(集团)有限公司 | Emotion classification method, system, electronic equipment and medium |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN109213868A (en) | Entity level sensibility classification method based on convolution attention mechanism network | |
CN111177569B (en) | Recommendation processing method, device and equipment based on artificial intelligence | |
CN116194912A (en) | Method and system for aspect-level emotion classification using graph diffusion transducers | |
US9720907B2 (en) | System and method for learning latent representations for natural language tasks | |
US20210042664A1 (en) | Model training and service recommendation | |
CN110209805B (en) | Text classification method, apparatus, storage medium and computer device | |
CN108446271B (en) | Text emotion analysis method of convolutional neural network based on Chinese character component characteristics | |
CN108984530A (en) | A kind of detection method and detection system of network sensitive content | |
CN110263821B (en) | Training of transaction feature generation model, and method and device for generating transaction features | |
CN109446430A (en) | Method, apparatus, computer equipment and the readable storage medium storing program for executing of Products Show | |
CN110619044B (en) | Emotion analysis method, system, storage medium and equipment | |
CN109684627A (en) | A kind of file classification method and device | |
CN109766557A (en) | A kind of sentiment analysis method, apparatus, storage medium and terminal device | |
CN108959305A (en) | A kind of event extraction method and system based on internet big data | |
CN111950279B (en) | Entity relationship processing method, device, equipment and computer readable storage medium | |
CN109766437A (en) | A kind of Text Clustering Method, text cluster device and terminal device | |
US11210673B2 (en) | Transaction feature generation | |
CN110232123A (en) | The sentiment analysis method and device thereof of text calculate equipment and readable medium | |
CN105809090A (en) | Method and system for face sex characteristic extraction | |
Lian et al. | Unsupervised representation learning with future observation prediction for speech emotion recognition | |
Liu et al. | Saliency as evidence: Event detection with trigger saliency attribution | |
CN108009248A (en) | A kind of data classification method and system | |
Peng et al. | Leaf disease image retrieval with object detection and deep metric learning | |
CN109271624A (en) | A kind of target word determines method, apparatus and storage medium | |
Gavval et al. | CUDA-Self-Organizing feature map based visual sentiment analysis of bank customer complaints for Analytical CRM |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| RJ01 | Rejection of invention patent application after publication | Application publication date: 20190115 |