CN108845986A - A kind of sentiment analysis method, equipment and system, computer readable storage medium - Google Patents
- Publication number
- CN108845986A (publication number); CN201810538689.1A (application number)
- Authority
- CN
- China
- Prior art keywords
- memory network
- network model
- long short-term memory
- corpus
- information
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F40/00—Handling natural language data
- G06F40/20—Natural language analysis
- G06F40/279—Recognition of textual entities
- G06F40/289—Phrasal analysis, e.g. finite state techniques or chunking
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F18/00—Pattern recognition
- G06F18/20—Analysing
- G06F18/24—Classification techniques
Abstract
A sentiment analysis method, device, system, and computer-readable storage medium. The sentiment analysis method includes: generating word vectors from a corpus; generating a feature vector from the corpus; inputting the word vectors into a pre-established first long short-term memory (LSTM) network model; inputting the first information output by the first LSTM network model, together with the feature vector, into a pre-established second LSTM network model; and determining the sentiment orientation of the corpus from the second information output by the second LSTM network model. Because the scheme provided by this example uses two LSTM layers, it takes into account long-range correlations between sentence-level information and can reflect sentiment orientation more accurately.
Description
Technical field
The present invention relates to a sentiment analysis method, device, and system, and to a computer-readable storage medium.
Background art
The 21st century is an era of rapid development of information technology; people's lives are closely bound up with computers and the Internet, and communication between people has moved onto the network. With the emergence of social media platforms such as Weibo (microblog) and WeChat, mobile Internet use has become deeply rooted in daily life. Taking Weibo as an example: a microblog is a platform for sharing, spreading, and obtaining information based on user relationships. Users can post messages of about 140 characters through the web, WAP (Wireless Application Protocol), and various clients, and share them instantly. Its brevity and reach have attracted a large number of public figures, each of whom may drive followers numbering in the tens of thousands. Anyone can freely and conveniently record moments of life, interact with friends, and express opinions anytime and anywhere. The information carried by each microblog post includes the publisher's personal position and emotion, so it is necessary to mine and analyze the emotion carried by microblog participants.
Summary of the invention
At least one embodiment of the present invention provides a sentiment analysis method, device, system, and computer-readable storage medium.
To achieve the object of the invention, at least one embodiment of the present invention provides a sentiment analysis method, including:
generating word vectors from a corpus;
generating a feature vector from the corpus, inputting the word vectors into a pre-established first long short-term memory (LSTM) network model, and inputting the first information output by the first LSTM network model, together with the feature vector, into a pre-established second LSTM network model;
determining the sentiment orientation of the corpus from the second information output by the second LSTM network model.
At least one embodiment of the present invention provides a sentiment analysis system, including a data processing module, a storage module, and an algorithm analysis module, wherein:
the data processing module is configured to obtain a corpus;
the storage module is configured to store the corpus;
the algorithm analysis module is configured to generate word vectors and a feature vector from the corpus, input the word vectors into the pre-established first LSTM network model, input the first information output by the first LSTM network model and the feature vector into the pre-established second LSTM network model, and determine the sentiment orientation of the corpus from the second information output by the second LSTM network model.
At least one embodiment of the present invention provides a sentiment analysis device, including a memory and a processor. The memory stores a program which, when read and executed by the processor, implements the sentiment analysis method described in any embodiment.
At least one embodiment of the present invention provides a computer-readable storage medium storing one or more programs that can be executed by one or more processors to implement the sentiment analysis method described in any embodiment.
Compared with a single-layer LSTM that uses only word vectors as input, which considers only the associations (long and short) between words and cannot fully consider long-range correlations between sentence-level information, the scheme provided in this embodiment uses two LSTM layers and adds feature information; it takes into account long-range correlations between sentence-level information and can therefore perform sentiment analysis more accurately.
Other features and advantages of the present invention will be set forth in the following description, and in part will become apparent from the description or be understood through practice of the invention. The objects and other advantages of the invention can be realized and obtained through the structures particularly pointed out in the description, the claims, and the accompanying drawings.
Detailed description of the invention
The accompanying drawings are provided for a further understanding of the technical solution of the present invention and form part of the description. Together with the embodiments of the present application, they serve to explain the technical solution of the present invention and do not limit it.
Fig. 1 is a flowchart of a sentiment analysis method provided by an embodiment of the invention;
Fig. 2 is a diagram of the first LSTM cell provided by an embodiment of the invention;
Fig. 3 is a diagram of the second LSTM cell provided by an embodiment of the invention;
Fig. 4 is a schematic diagram of sentiment analysis provided by an embodiment of the invention;
Fig. 5 is a schematic diagram of bidirectional sentiment analysis provided by an embodiment of the invention;
Fig. 6 is a block diagram of a sentiment analysis system provided by an embodiment of the invention;
Fig. 7 is a flowchart of a sentiment analysis method provided by an embodiment of the invention;
Fig. 8 is a flowchart of a training method provided by an embodiment of the invention;
Fig. 9 is a flowchart of a sentiment analysis method provided by an embodiment of the invention;
Fig. 10 is a flowchart of a sentiment analysis method provided by an embodiment of the invention;
Fig. 11 is a flowchart of a sentiment analysis method provided by an embodiment of the invention;
Fig. 12 is a block diagram of a sentiment analysis device provided by an embodiment of the invention.
Specific embodiment
To make the objects, technical solutions, and advantages of the present invention clearer, embodiments of the present invention are described in detail below with reference to the accompanying drawings. It should be noted that, in the absence of conflict, the embodiments in this application and the features in the embodiments may be combined with one another arbitrarily.
The steps shown in the flowcharts of the accompanying drawings may be executed in a computer system as a set of computer-executable instructions. Also, although a logical order is shown in the flowcharts, in some cases the steps shown or described may be executed in an order different from that herein.
In this application, sentiment analysis is performed by a two-layer long short-term memory network (Long Short-Term Memory, LSTM).
As shown in Fig. 1, an embodiment of the invention provides a sentiment analysis method, including:
Step 101: generating word vectors from a corpus;
Step 102: generating a feature vector from the corpus, inputting the word vectors into a pre-established first LSTM model, and inputting the first information output by the first LSTM model, together with the feature vector, into a pre-established second LSTM model;
Step 103: determining the sentiment orientation of the corpus from the second information output by the second LSTM model.
Compared with a single-layer LSTM that uses only word vectors, which considers only the associations (long and short) between words and cannot fully consider long-range correlations between sentence-level information, the scheme provided in this embodiment uses two LSTM layers (a first LSTM and a second LSTM), takes into account long-range correlations between sentence-level information, and can perform sentiment analysis more accurately.
Before step 101, the method may further include: Step 100, preprocessing the target information to be analyzed into data in a preset format as the corpus. The target information is, for example, Internet data such as Weibo posts, Weibo comments, WeChat public-account articles, WeChat comments, product reviews, forum posts, news comments, and so on.
In one embodiment, in step 100, preprocessing the target information to be analyzed into data in a preset format includes: cleaning out preset information contained in the target information (the preset information being, for example, pictures, voice, etc.), and normalizing the cleaned target information into structured data in the preset format, each item of structured data serving as one corpus. The information that needs to be cleaned out can be predefined. Of course, it is also possible to extract information directly from the target information and generate structured data without cleaning. One may first judge whether the preset information exists in the target information and, if it exists, clean it out; alternatively, the target information may be cleaned directly without such a judgment.
One format of the structured data is shown in Table 1; it may include the following fields:
Identifier (id): the sequence number of the corpus;
Corpus content: the content of the corpus, including information such as emoticons. The corpus content may differ from the target information; for example, pictures (other than emoticons), voice, etc. in the target information are removed from the corpus content;
Corpus type: indicates that the corpus includes at least one of: a post, a comment on a post, and a reply to a comment;
Owner: the publisher of the corpus; this field is optional;
Time: the publication time of the corpus; this field is optional.
Table 1. Description of the structured data fields
Taking Weibo as an example, a Weibo post after preprocessing can serve as one corpus, each comment on the post after preprocessing can serve as one corpus, and each reply to a comment after preprocessing can serve as one corpus. Of course, a post and its comments may also be taken together as one corpus, or a comment and its replies as one corpus, and so on.
The corpora can be stored in a database, and multiple associated corpora can be stored in a predetermined manner. For example, the corpus corresponding to a post, the corpora corresponding to its comments, and the corpora corresponding to replies to those comments may be stored sequentially at consecutive addresses. Alternatively, an indication field may be added to the corpus corresponding to a comment; this field can carry the ID of the corpus corresponding to the post associated with the comment, indicating that the comment is a comment on the post corresponding to that corpus ID. Likewise, an indication field may be added to the corpus corresponding to a reply, carrying the ID of the corpus corresponding to the comment associated with the reply, indicating that the reply is a reply to the comment corresponding to that corpus ID.
It should be noted that the above structured data is merely illustrative; other structures can be used as needed, for example containing only the corpus content and corpus type fields.
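A minimal sketch of a Table 1 record, with the optional fields and an indication field linking comments and replies to their parent corpus; the field names are illustrative, not the patent's exact schema:

```python
from dataclasses import dataclass
from typing import Optional

# Illustrative structured-data record for one corpus (cf. Table 1).
@dataclass
class Corpus:
    id: int                          # sequence number of the corpus
    content: str                     # text content, emoticons kept
    corpus_type: str                 # "post", "comment", or "reply"
    owner: Optional[str] = None      # publisher (optional field)
    time: Optional[str] = None       # publication time (optional field)
    parent_id: Optional[int] = None  # indication field: ID of the parent corpus

post = Corpus(id=0, content="I don't believe it", corpus_type="post", owner="alice")
comment = Corpus(id=1, content="me neither", corpus_type="comment", parent_id=0)
```

The `parent_id` field plays the role of the indication field described above: a comment carries the ID of its post's corpus, a reply carries the ID of its comment's corpus.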
Generating word vectors means converting words into vectors that a computer can understand. In one embodiment, in step 101, generating word vectors from the corpus includes:
segmenting the corpus into words and generating one or more first word vectors, and using the first word vectors as the word vectors;
or, segmenting the corpus into words and generating one or more first word vectors, generating a second word vector from the topic or category to which the corpus belongs, and combining each first word vector with the second word vector to obtain the word vectors. The combination of a first word vector and the second word vector can be a concatenation of the two. The topic or category to which a corpus belongs can be stored in advance.
Word vectors can be generated with word2vec, GloVe, etc. For example, "I don't believe it" is segmented into "I", "not", "believe"; the word vector of "I" is [0.59, ..., 0.70, ...], that of "not" is [0.32, ..., 0.60, ...], and that of "believe" is [0.19, ..., 0.55, ...]. Word segmentation divides a corpus (usually a sentence) into individual words.
For example, suppose the corpus "I don't believe it" belongs to the topic "society" and is segmented into "I", "not", "believe". Then the word vector of "I" is concatenated with the word vector of "society", i.e. the concatenated vector [word vector of "I", word vector of "society"] is used as an input to the first LSTM; likewise, the concatenation of the word vector of "not" with that of "society", and the concatenation of the word vector of "believe" with that of "society", are used as inputs to the first LSTM.
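The concatenation of word vectors with a topic vector can be sketched as follows; the 4-dimensional toy embeddings stand in for word2vec/GloVe vectors and are made up for illustration:

```python
import numpy as np

# Toy embedding table standing in for word2vec/GloVe output;
# all vector values, and the "society" topic vector, are illustrative.
emb = {
    "I":       np.array([0.59, 0.10, 0.70, 0.05]),
    "not":     np.array([0.32, 0.21, 0.60, 0.11]),
    "believe": np.array([0.19, 0.40, 0.55, 0.30]),
    "society": np.array([0.80, 0.02, 0.13, 0.44]),  # topic vector
}

def word_vectors(tokens, topic=None):
    """Per-token input vectors for the first LSTM; if a topic is given,
    its vector is concatenated onto each word vector."""
    if topic is None:
        return [emb[t] for t in tokens]
    return [np.concatenate([emb[t], emb[topic]]) for t in tokens]

vecs = word_vectors(["I", "not", "believe"], topic="society")
print(vecs[0].shape)  # each first-LSTM input is now 8-dimensional
```

Without a topic, each input is the plain 4-dimensional word vector; with a topic, the dimensionality doubles because the topic vector is appended to every word.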
In one embodiment, in step 102, the feature information includes at least one of: an extracted social comment feature, an emotion symbol feature, and a macro-social feature. The social comment feature indicates whether the comment information of the corpus contains a reply from the publisher of the corpus; the emotion symbol feature indicates the quantitative relationship between first-class emotion symbols and second-class emotion symbols in the corpus and its comment information; the macro-social feature indicates whether the corpus contains user-mention information prompting other users to view the corpus. The comment information of a corpus is the comment information of the target information to which the corpus corresponds; taking Weibo as an example, for a corpus obtained from a post, the comment information is the comments on that post.
Emotion symbols (such as emoticons) can be classified into first-class emotion symbols and second-class emotion symbols, the first class being, for example, positive emotion symbols and the second class, for example, negative emotion symbols. Which specific symbols are positive and which are negative can be predefined.
Taking Weibo as an example:
Social comment feature: analyze a post and its comment information for whether the comments contain a reply from the post's publisher. In general, the publisher will adhere to a consistent sentiment orientation. If a reply from the publisher exists, the social comment feature is 1; if no reply from the publisher exists, the social comment feature is 0.
Emotion symbol feature: analyze the post and its comment information, collect the emoticons in this information, and, with reference to an expression classification group (for example, emoticons divided into two classes, positive and negative), set this feature to 1 if the number of positive emoticons exceeds the number of negative emoticons, and to 0 otherwise. In particular, if there are no emoticons, this feature can be set to 0.
Macro-social feature: analyze the sequence of posts; if, within an analysis length threshold, a user-mention of the form "@username" prompting a user to pay attention appears in different posts, this feature is 1, otherwise 0.
A feature vector is constructed from the feature information. For example, if the social comment feature is 1, the emotion symbol feature is 1, and the macro-social feature is 0, the feature vector is [1, 1, 0]. Of course, the social comment feature, emotion symbol feature, and macro-social feature may also be arranged in another order when constructing the feature vector; this application does not limit this.
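The three features and the resulting vector [1, 1, 0] can be sketched as follows; the mention pattern and the emoticon sets are illustrative assumptions, not the patent's exact definitions:

```python
import re

# Illustrative emoticon classes (the "expression classification group");
# the actual positive/negative symbol sets would be predefined.
POSITIVE = {"😊", "[smile]"}
NEGATIVE = {"😠", "[angry]"}

def feature_vector(post_owner, comments, texts):
    # Social comment feature: did the publisher reply in the comments?
    social = 1 if any(c["owner"] == post_owner for c in comments) else 0
    # Emotion symbol feature: more positive than negative emoticons?
    all_text = " ".join([c["content"] for c in comments] + texts)
    pos = sum(all_text.count(s) for s in POSITIVE)
    neg = sum(all_text.count(s) for s in NEGATIVE)
    emo = 1 if pos > neg else 0
    # Macro-social feature: does an "@username" mention appear in the posts?
    macro = 1 if any(re.search(r"@\w+", t) for t in texts) else 0
    return [social, emo, macro]

comments = [{"owner": "alice", "content": "I agree 😊"}]
print(feature_vector("alice", comments, ["I don't believe it"]))  # [1, 1, 0]
```

Here the publisher "alice" replied in the comments (social = 1), positive emoticons outnumber negative ones (emotion = 1), and no "@username" mention appears (macro = 0), giving the [1, 1, 0] example from the text.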
It should be noted that the feature values 1 and 0 in this application are merely illustrative; other values can be used as needed. More or fewer features can be extracted as needed. In addition, the extraction of the above feature information is merely illustrative; feature information can be generated from other information relevant to sentiment orientation.
In one embodiment, the word vectors include word vectors x1^{(0)} to x1^{(N-1)}. In step 102, inputting the word vectors into the pre-established first LSTM network model, and inputting the first information output by the first LSTM network model and the feature vector into the pre-established second LSTM network model, includes:
inputting word vector x1^{(0)} into the first LSTM network model, and inputting the first information output by the first LSTM network model and the feature vector into the second LSTM network model;
inputting word vector x1^{(1)} into the first LSTM network model, and inputting the first information output by the first LSTM network model and the feature vector into the second LSTM network model;
and so on, until word vector x1^{(N-1)} is input into the first LSTM network model and the first information output by the first LSTM network model and the feature vector are input into the second LSTM network model, obtaining the second information output by the second LSTM network model.
Specifically, assume that N word vectors x1^{(0)} to x1^{(N-1)} are generated from the corpus, and that the feature vector of the corpus is d:
input x1^{(0)} into the first LSTM model, which outputs h1^{(0)}; input h1^{(0)} and d into the second LSTM model, which outputs h2^{(0)};
input x1^{(1)} and h1^{(0)} into the first LSTM model, which outputs h1^{(1)}; input h1^{(1)}, d, and h2^{(0)} into the second LSTM model, which outputs h2^{(1)};
and so on: input x1^{(N-1)} and h1^{(N-2)} into the first LSTM model, which outputs h1^{(N-1)}; input h1^{(N-1)}, d, and h2^{(N-2)} into the second LSTM model, which outputs h2^{(N-1)}.
In step 103, determining the sentiment orientation of the corpus from the second information output by the second LSTM model includes:
determining the sentiment orientation of the corpus from h2^{(N-1)}.
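The alternating feed described above can be sketched as follows. The cell internals here are simplified stand-ins (plain tanh recurrences with random placeholder weights), not trained models; only the wiring between the word-level and sentence-level layers follows the text:

```python
import numpy as np

rng = np.random.default_rng(0)
D_IN, D_H, D_FEAT = 8, 4, 3  # illustrative dimensions

def cell(x, h, W, U):
    # Simplified recurrent update standing in for a full LSTM cell.
    return np.tanh(W @ x + U @ h)

W1, U1 = rng.normal(size=(D_H, D_IN)), rng.normal(size=(D_H, D_H))
W2, U2 = rng.normal(size=(D_H, D_H + D_FEAT)), rng.normal(size=(D_H, D_H))

def analyze(word_vecs, d):
    h1 = np.zeros(D_H)  # word-level (first LSTM) state
    h2 = np.zeros(D_H)  # sentence-level (second LSTM) state
    for x in word_vecs:                                   # x1(0) .. x1(N-1)
        h1 = cell(x, h1, W1, U1)                          # first LSTM step
        h2 = cell(np.concatenate([h1, d]), h2, W2, U2)    # second LSTM gets h1 and d
    return h2                                             # h2(N-1): basis for the decision

out = analyze([rng.normal(size=D_IN) for _ in range(3)], np.array([1.0, 1.0, 0.0]))
print(out.shape)  # (4,)
```

Note that the same feature vector d is fed into the second LSTM at every step, matching the observation below that d^{(t-1)}, d^{(t)}, d^{(t+1)} are all the single feature vector of the corpus.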
In one embodiment, the cell state diagram of the first LSTM model is shown in Fig. 2, and the first LSTM model is as follows:

f1_i^{(t)} = \sigma\big( b1_i^f + \sum_j U1_{i,j}^f\, x1_j^{(t)} + \sum_j W1_{i,j}^f\, h1_j^{(t-1)} \big)   (1)
g1_i^{(t)} = \sigma\big( b1_i^g + \sum_j U1_{i,j}^g\, x1_j^{(t)} + \sum_j W1_{i,j}^g\, h1_j^{(t-1)} \big)   (2)
q1_i^{(t)} = \sigma\big( b1_i^q + \sum_j U1_{i,j}^q\, x1_j^{(t)} + \sum_j W1_{i,j}^q\, h1_j^{(t-1)} \big)   (3)
s1_i^{(t)} = f1_i^{(t)}\, s1_i^{(t-1)} + g1_i^{(t)}\, \sigma\big( b1_i + \sum_j U1_{i,j}\, x1_j^{(t)} + \sum_j W1_{i,j}\, h1_j^{(t-1)} \big)   (4)
h1_i^{(t)} = \tanh\big( s1_i^{(t)} \big)\, q1_i^{(t)}   (5)

where f1_i^{(t)} is the i-th element of the forget-gate state vector f1^{(t)} of the first LSTM model at time t; σ is the sigmoid unit function; b1_i^f is the i-th element of weight vector b1^f; U1_{i,j}^f is the element in row i, column j of weight matrix U1^f; W1_{i,j}^f is the element in row i, column j of weight matrix W1^f; x1_j^{(t)} is the j-th element of the input vector x1^{(t)} of the first LSTM model at time t, i.e. the j-th element of the word vector input at time t; h1_j^{(t-1)} is the j-th element of the output vector h1^{(t-1)} of the first LSTM model at time t-1;
g1_i^{(t)} is the i-th element of the input-gate state vector g1^{(t)} of the first LSTM model at time t; b1_i^g is the i-th element of weight vector b1^g; U1_{i,j}^g is the element in row i, column j of weight matrix U1^g; W1_{i,j}^g is the element in row i, column j of weight matrix W1^g;
q1_i^{(t)} is the i-th element of the output-gate state vector q1^{(t)} of the first LSTM model at time t; b1_i^q is the i-th element of weight vector b1^q; U1_{i,j}^q is the element in row i, column j of weight matrix U1^q; W1_{i,j}^q is the element in row i, column j of weight matrix W1^q;
s1_i^{(t)} is the i-th element of the internal state vector (i.e. the LSTM cell state) s1^{(t)} of the first LSTM model at time t, and s1_i^{(t-1)} is the i-th element of the internal state vector s1^{(t-1)} at time t-1 (s1^{(t-1)} is not shown in Fig. 2); b1_i is the i-th element of weight vector b1; U1_{i,j} is the element in row i, column j of weight matrix U1; W1_{i,j} is the element in row i, column j of weight matrix W1;
h1_i^{(t)} is the i-th element of the output vector h1^{(t)} of the first LSTM model at time t; the first information is the h1^{(t)} output by the first LSTM model at time t; tanh is the hyperbolic tangent function.
Fig. 3 is the cell state diagram of the second LSTM model, and the second LSTM model is as follows:

f2_i^{(t)} = \sigma\big( b2_i^f + \sum_j U2_{i,j}^f\, x2_j^{(t)} + \sum_j W2_{i,j}^f\, h2_j^{(t-1)} + \sum_j V2_{i,j}^f\, d_j^{(t)} \big)   (6)
g2_i^{(t)} = \sigma\big( b2_i^g + \sum_j U2_{i,j}^g\, x2_j^{(t)} + \sum_j W2_{i,j}^g\, h2_j^{(t-1)} + \sum_j V2_{i,j}^g\, d_j^{(t)} \big)   (7)
q2_i^{(t)} = \sigma\big( b2_i^q + \sum_j U2_{i,j}^q\, x2_j^{(t)} + \sum_j W2_{i,j}^q\, h2_j^{(t-1)} + \sum_j V2_{i,j}^q\, d_j^{(t)} \big)   (8)
s2_i^{(t)} = f2_i^{(t)}\, s2_i^{(t-1)} + g2_i^{(t)}\, \sigma\big( b2_i + \sum_j U2_{i,j}\, x2_j^{(t)} + \sum_j W2_{i,j}\, h2_j^{(t-1)} + \sum_j V2_{i,j}\, d_j^{(t)} \big)   (9)
h2_i^{(t)} = \tanh\big( s2_i^{(t)} \big)\, q2_i^{(t)}   (10)

where f2_i^{(t)} is the i-th element of the forget-gate state vector f2^{(t)} of the second LSTM model at time t; σ is the sigmoid unit function; b2_i^f is the i-th element of weight vector b2^f; U2_{i,j}^f is the element in row i, column j of weight matrix U2^f; W2_{i,j}^f is the element in row i, column j of weight matrix W2^f; V2_{i,j}^f is the element in row i, column j of weight matrix V2^f; x2_j^{(t)} is the j-th element of the input vector x2^{(t)} of the second LSTM model at time t, i.e. the j-th element of the first information input to the second LSTM model at time t; h2_j^{(t-1)} is the j-th element of the output vector h2^{(t-1)} of the second LSTM model at time t-1; d_j^{(t)} is the j-th element of the feature vector d^{(t)} input to the second LSTM model at time t;
g2_i^{(t)} is the i-th element of the input-gate state vector g2^{(t)} of the second LSTM model at time t; b2_i^g is the i-th element of weight vector b2^g; U2_{i,j}^g is the element in row i, column j of weight matrix U2^g; W2_{i,j}^g is the element in row i, column j of weight matrix W2^g; V2_{i,j}^g is the element in row i, column j of weight matrix V2^g;
q2_i^{(t)} is the i-th element of the output-gate state vector q2^{(t)} of the second LSTM model at time t; b2_i^q is the i-th element of weight vector b2^q; U2_{i,j}^q is the element in row i, column j of weight matrix U2^q; W2_{i,j}^q is the element in row i, column j of weight matrix W2^q; V2_{i,j}^q is the element in row i, column j of weight matrix V2^q;
s2_i^{(t)} is the i-th element of the internal state vector s2^{(t)} of the second LSTM model at time t, and s2_i^{(t-1)} is the i-th element of the internal state vector s2^{(t-1)} at time t-1; b2_i is the i-th element of weight vector b2; U2_{i,j} is the element in row i, column j of weight matrix U2; W2_{i,j} is the element in row i, column j of weight matrix W2; V2_{i,j} is the element in row i, column j of weight matrix V2;
h2_i^{(t)} is the i-th element of the output vector h2^{(t)} of the second LSTM model at time t; the second information is the h2^{(t)} output by the second LSTM model at time t; tanh is the hyperbolic tangent function.
It should be noted that the first LSTM model and the second LSTM model shown in the above formulas are merely illustrative; LSTM models of other constructions can be applied in this application.
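The element-wise equations (1)-(5) for the word-level cell can be sketched directly in NumPy. The weights here are random placeholders with illustrative dimensions, not a trained model; the point is only that each line mirrors one of the numbered formulas:

```python
import numpy as np

rng = np.random.default_rng(1)
D_X, D_H = 8, 4  # illustrative input and hidden sizes
sigma = lambda z: 1.0 / (1.0 + np.exp(-z))  # the sigmoid unit function

# Placeholder parameters: U* act on the input x, W* on the previous output h,
# b* are bias vectors (cf. b1, U1, W1 and their f/g/q variants).
p = {k: rng.normal(scale=0.1, size=(D_H, D_X) if k.startswith("U") else
                   (D_H, D_H) if k.startswith("W") else (D_H,))
     for k in ["Uf", "Wf", "bf", "Ug", "Wg", "bg", "Uq", "Wq", "bq", "U", "W", "b"]}

def first_lstm_step(x, h_prev, s_prev):
    f = sigma(p["bf"] + p["Uf"] @ x + p["Wf"] @ h_prev)               # (1) forget gate
    g = sigma(p["bg"] + p["Ug"] @ x + p["Wg"] @ h_prev)               # (2) input gate
    q = sigma(p["bq"] + p["Uq"] @ x + p["Wq"] @ h_prev)               # (3) output gate
    s = f * s_prev + g * sigma(p["b"] + p["U"] @ x + p["W"] @ h_prev) # (4) cell state
    h = np.tanh(s) * q                                                # (5) output
    return h, s

h, s = first_lstm_step(rng.normal(size=D_X), np.zeros(D_H), np.zeros(D_H))
print(h.shape, s.shape)  # (4,) (4,)
```

The second-LSTM cell of equations (6)-(10) would add a third set of matrices V2 multiplying the feature vector d inside each gate, but is otherwise identical in shape.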
In one embodiment, the first LSTM model and the second LSTM model are established as follows:
obtain training corpora, annotate the training corpora with sentiment orientation, and train an initial first LSTM model and an initial second LSTM model on the training corpora to obtain the first LSTM model and the second LSTM model.
The sentiment orientation output in step 103 may be divided into two classes, positive and negative. Of course, it may also be divided into more classes, such as very positive, somewhat positive, somewhat negative, and very negative, or, for another example, positive, negative, optimistic, sad, etc. Alternatively, a scoring mechanism may be used, from negative to positive on a 0-10 scale, where a higher score represents more positive sentiment, and so on.
One training method is as follows:
Construct the initial first LSTM model (also called the word-level LSTM) from formulas (1)-(5), and the initial second LSTM model (also called the sentence-level LSTM) from formulas (6)-(10). Initially, b1, b1^f, b1^g, b1^q can take the value 0, and U1, U1^f, U1^g, U1^q, W1, W1^f, W1^g, W1^q can be initialized with normally distributed random values, constituting the parameter matrices. Likewise, b2, b2^f, b2^g, b2^q can take the value 0, and U2, U2^f, U2^g, U2^q, W2, W2^f, W2^g, W2^q, V2, V2^f, V2^g, V2^q can be initialized with normally distributed random values, constituting the parameter matrices. It should be noted that these initial values are merely illustrative; other values can be used as needed, for example initial values chosen from experience, or values trained in one scenario used as the initial values in another scenario, and so on.
Set the algorithm hyperparameters. The hyperparameters are shown in Table 2, and their values can be set as needed.
Table 2. Description of the training hyperparameter group fields
Segment the training corpora into words and convert them into word vectors to obtain the first word vectors; convert the topic or category information of each training corpus into a word vector to obtain the second word vector; and use the combination of the first and second word vectors as the input of the first LSTM model. For example, words can be converted into vectors with the word2vec algorithm. Extract the feature information (such as the social comment feature, emotion symbol feature, and macro-social feature mentioned earlier) and construct the feature vector d from it. The information that needs to be input for training is shown in Table 3, including the training corpora and the expression classification group, where the expression classification group classifies emoticons into one positive class and one negative class; when extracting the emotion symbol feature, extraction can be performed according to the classes in the expression classification group.
Table 3. Description of the training corpus input
The training loss function uses cross-entropy, which can be defined as

L = -\sum_i \hat{y}_i \log P(i)

where \hat{y}_i is the annotated sentiment-orientation distribution of corpus i and P(i) is the predicted sentiment-orientation distribution. Of course, other loss functions can also be used; this application does not limit this.
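As an illustration of this loss for a single corpus with a one-hot annotated distribution:

```python
import numpy as np

# Cross-entropy between an annotated sentiment distribution y_hat
# and a predicted distribution p; the example values are illustrative.
def cross_entropy(y_hat, p):
    return -float(np.sum(y_hat * np.log(p)))

y_hat = np.array([1.0, 0.0])  # annotation: positive
p = np.array([0.9, 0.1])      # predicted distribution over (positive, negative)
print(round(cross_entropy(y_hat, p), 4))  # 0.1054
```

The loss is small when the predicted probability of the annotated class is close to 1 and grows without bound as that probability approaches 0, which is what drives the parameter optimization described below.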
Divide the training corpora into a training set, a validation set, and a test set according to a preset ratio (for example, training set 80%, validation set 10%, test set 10%), train the above LSTM models on the training corpora, and optimize the parameters in the models until training ends; save the first LSTM model and second LSTM model obtained at that point. The parameter-optimization training method can use the AdaDelta algorithm, the AdaGrad algorithm, the Adam algorithm, etc. Specifically, input the word vectors of a training corpus (either the word vectors obtained directly from the corpus, or the combination of the corpus word vectors with the topic word vector) into the initial first LSTM model, which outputs the first information; input the first information and the feature vector into the initial second LSTM model, which outputs the second information; determine the sentiment orientation from the second information; and optimize the parameters until the output sentiment orientation agrees with the pre-annotated sentiment orientation.
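The preset-ratio split described above can be sketched as follows; the 80/10/10 ratio is the example given in the text:

```python
import random

# Shuffle the annotated corpora and cut them into training,
# validation, and test sets according to a preset ratio.
def split(corpora, ratios=(0.8, 0.1, 0.1), seed=0):
    items = corpora[:]
    random.Random(seed).shuffle(items)
    n = len(items)
    n_train = int(ratios[0] * n)
    n_val = int(ratios[1] * n)
    return items[:n_train], items[n_train:n_train + n_val], items[n_train + n_val:]

train, val, test = split(list(range(100)))
print(len(train), len(val), len(test))  # 80 10 10
```

Training runs on the first set, the validation set guides hyperparameter choices and stopping, and the test set measures the saved models.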
The output sentiment orientation is P(i) = Softmax(h2_i); the output sentiment orientation is one of the predefined sentiment-orientation classes (such as positive, negative, optimistic, sad, etc.).
After the models are trained, the corpus to be analyzed is input and its sentiment orientation is output (the sentiment orientation being one of the predefined sentiment-orientation classes). The format of the corpus to be analyzed can be as shown in Table 4:
Table 4. Format description of the corpus to be analyzed
In one embodiment, in step 103, determining the sentiment orientation of the corpus from the second information includes: inputting the second information into a Softmax function or a ReLU function, and outputting the sentiment orientation of the corpus from the Softmax function or ReLU function. It should be noted that the Softmax and ReLU functions are merely illustrative; other classifiers can be used as needed.
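The final classification step can be sketched as follows; the class names and the projection matrix mapping the second information to class scores are illustrative assumptions:

```python
import numpy as np

CLASSES = ["positive", "negative"]  # predefined sentiment-orientation classes
rng = np.random.default_rng(2)
W_out = rng.normal(size=(len(CLASSES), 4))  # projects h2 onto class scores

def softmax(z):
    e = np.exp(z - np.max(z))  # subtract the max for numerical stability
    return e / e.sum()

def sentiment(h2):
    # Map the second information h2 to a distribution over classes
    # and pick the most probable class.
    p = softmax(W_out @ h2)
    return CLASSES[int(np.argmax(p))], p

label, p = sentiment(np.array([0.3, -0.2, 0.8, 0.1]))
print(label, round(float(p.sum()), 6))  # the probabilities sum to 1
```

Softmax yields a proper probability distribution over the classes; a different head (e.g. ReLU plus a threshold, as the text allows) would change the scores but not the overall wiring.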
Fig. 4 is a flowchart of the sentiment analysis. As shown in Fig. 4, the word vector 401 is input into the first LSTM; the first LSTM outputs the first information 402; the first information 402 and the feature vector d are input into the second LSTM; the second LSTM outputs the second information 403; and the second information 403 is input into the Softmax function, whose output is the sentiment orientation. It should be noted that in Fig. 4, a corpus has only one piece of feature information; therefore d^{(t-1)}, d^{(t)}, and d^{(t+1)} in the figure are identical and all equal the feature information of the corpus. h1^{(t-2)} and h2^{(t-2)} can be set to an initial value.
For example, suppose the corpus content is "I don't believe it". Without considering the topic of the corpus:
A: The word vector of "I" is input into the first LSTM model, which outputs h1(0); h1(0) and the feature information of "I don't believe it" are input into the second LSTM model, which outputs h2(0);
B: The word vector of "don't" and h1(0) are input into the first LSTM model, which outputs h1(1); h2(0), h1(1) and the feature information of "I don't believe it" are input into the second LSTM model, which outputs h2(1);
C: The word vector of "believe" and h1(1) are input into the first LSTM model, which outputs h1(2); h2(1), h1(2) and the feature information of "I don't believe it" are input into the second LSTM model, which outputs h2(2);
D: h2(2) is input into the Softmax function, whose output is the sentiment orientation of "I don't believe it".
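The data flow of steps A through D can be sketched as follows. This is a toy illustration only: the full LSTM gate equations (formulas (1) through (10)) are replaced by simplified recurrent cells with made-up coefficients, and the word vectors and feature information are reduced to single numbers.

```python
import math

def cell1(x, h_prev):
    # simplified stand-in for the first LSTM cell (word level)
    return math.tanh(0.5 * x + 0.3 * h_prev)

def cell2(h1, h2_prev, d):
    # simplified stand-in for the second LSTM cell, which also
    # receives the feature information d at every time step
    return math.tanh(0.4 * h1 + 0.3 * h2_prev + 0.2 * d)

words = [0.1, -0.7, 0.9]   # toy word vectors for "I", "don't", "believe"
d = 0.5                    # feature information of the whole corpus
h1, h2 = 0.0, 0.0          # initial values (step A uses these)
for x in words:
    h1 = cell1(x, h1)      # first LSTM output h1(t)
    h2 = cell2(h1, h2, d)  # second LSTM output h2(t), fed d each step
# h2 now corresponds to h2(2), the input of the Softmax step D
```

Note how the second model receives both the running word-level state h1(t) and its own previous state h2(t-1), which is what lets the scheme capture associations beyond adjacent words.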
When the topic of the corpus is considered, suppose the topic of the corpus is "society". Then:
A: The word vector formed by combining the word vector of "I" with the word vector of "society" is input into the first LSTM model, which outputs h1(0); h1(0) and the feature information of "I don't believe it" are input into the second LSTM model, which outputs h2(0);
B: The word vector formed by combining the word vector of "don't" with the word vector of "society", together with h1(0), is input into the first LSTM model, which outputs h1(1); h2(0), h1(1) and the feature information of "I don't believe it" are input into the second LSTM model, which outputs h2(1);
C: The word vector formed by combining the word vector of "believe" with the word vector of "society", together with h1(1), is input into the first LSTM model, which outputs h1(2); h2(1), h1(2) and the feature information of "I don't believe it" are input into the second LSTM model, which outputs h2(2);
D: h2(2) is input into the Softmax function, whose output is the sentiment orientation of "I don't believe it".
It should be noted that an LSTM model needs to use the previous output as part of the current input (for example, in step B, h1(0) is needed as an input to the first LSTM model). If this is the first step, an initial value is used in place of the previous output; this initial value can be set as needed.
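The combination of a content word vector with a topic word vector, as in the "society" example above, can be realized as a simple splicing (concatenation). The two-dimensional toy vectors below are hypothetical.

```python
def combine(word_vec, topic_vec):
    """Splice a content word vector with the topic word vector to form
    the input of the first LSTM model."""
    return word_vec + topic_vec     # list concatenation = splicing

w_i = [0.2, -0.1]                   # hypothetical word vector of "I"
t_society = [0.7, 0.4]              # hypothetical topic vector of "society"
x = combine(w_i, t_society)         # input of the first LSTM model
```

The same topic vector is spliced onto every content word vector of the corpus, so the input dimension of the first LSTM grows by the topic vector's length.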
Compared with a single-layer LSTM that uses only word vectors, the scheme provided in this embodiment considers not only the associations between words but also the associations between sentences. For example, in step C the second LSTM model performs the subsequent analysis according to h1(2) (the analysis based on the words "I", "don't" and "believe") and h2(1) (the analysis based on the words "I" and "don't"). If only a single-layer LSTM were used, step C would perform the subsequent analysis according to h1(1) (the analysis based on the words "I" and "don't") and the word vector of "believe", which considers only the associations between words. The scheme provided by the present application also considers the associations between sentences; in addition, by extracting feature information it incorporates the sentiment orientation expressed by multimedia information (such as emoticons), and can therefore reflect the sentiment orientation more accurately.
It should be noted that only three layers are shown in the figure (the LSTM model at each time step represents one layer); the actual number of layers is determined by the number of words in the input corpus.
In one embodiment, inputting the word vectors into the pre-established first long short-term memory network model, and inputting the first information output by the first long short-term memory network model together with the feature vector into the pre-established second long short-term memory network model, includes:
inputting the word vectors x1(0)~x1(N-1) into the first long short-term memory network model in order from x1(0) to x1(N-1), and inputting the first information output by the first long short-term memory network model together with the feature vector into the second long short-term memory network model; after the word vector x1(N-1) has been input into the first long short-term memory network model, obtaining the second forward information output by the second long short-term memory network model;
inputting the word vectors x1(0)~x1(N-1) into the first long short-term memory network model in order from x1(N-1) to x1(0), and inputting the first information output by the first long short-term memory network model together with the feature vector into the second long short-term memory network model; after the word vector x1(0) has been input into the first long short-term memory network model, obtaining the second backward information output by the second long short-term memory network model;
determining the sentiment orientation of the corpus according to the second information output by the second long short-term memory network model includes:
combining the second forward information and the second backward information to obtain combined information, and determining the sentiment orientation of the corpus according to the combined information.
As shown in Fig. 5, in region 501 the word vectors are extracted from the corpus in forward order and input into the first LSTM model in sequence, while in region 502 the word vectors are extracted from the corpus in reverse order and input into the first LSTM model in sequence. The outputs of the second LSTM model in region 501 and the second LSTM model in region 502 are combined to obtain the second information; the second information is input into the Softmax function, whose output is the sentiment orientation. The combination may, for example, be a splicing of the output of the second LSTM model in region 501 with the output of the second LSTM model in region 502.
For example, suppose the corpus content is "I don't believe it". The word vectors of "I", "don't" and "believe" are input into the first LSTM model in region 501 in that order, while the first LSTM model in region 502 receives them in reverse order: "believe", "don't", "I". The output of the second LSTM model in region 501 and the output of the second LSTM model in region 502 are combined to obtain the second information, which is input into the Softmax function; the output of the Softmax function is the sentiment orientation of "I don't believe it". If the topic or category of the corpus is considered, the first LSTM model in region 501 receives in sequence the combination of the word vector of "I" with the word vector of the corpus topic, the combination of the word vector of "don't" with the word vector of the corpus topic, and the combination of the word vector of "believe" with the word vector of the corpus topic; the first LSTM model in region 502 receives them in reverse order: the combination for "believe", then "don't", then "I".
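The forward and backward passes of regions 501 and 502 can be sketched as follows. As before, this is a toy illustration with simplified stand-in cells and scalar word vectors rather than the full LSTM equations; only the direction of traversal and the final splicing are the point.

```python
def run_direction(word_vecs, d):
    """Toy stand-in for one region of Fig. 5: feed the word vectors
    through the two stacked recurrent models and return the final
    second information (a single number here, for illustration)."""
    h1 = h2 = 0.0
    for x in word_vecs:
        h1 = 0.5 * x + 0.3 * h1              # simplified first LSTM
        h2 = 0.4 * h1 + 0.3 * h2 + 0.2 * d   # simplified second LSTM
    return h2

vecs = [0.1, -0.7, 0.9]                      # "I", "don't", "believe" (toy)
d = 0.5                                      # feature information
fwd = run_direction(vecs, d)                 # region 501: forward order
bwd = run_direction(list(reversed(vecs)), d) # region 502: reverse order
second_info = [fwd, bwd]                     # splice the two outputs
```

Because the traversal order differs, the two final states generally differ, and splicing them gives the classifier both a left-to-right and a right-to-left view of the sentence.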
In one embodiment, the sentiment orientations are also counted by topic or category, and the overall sentiment orientation proportions of each topic or category are output. Of course, the sentiment orientation of each corpus may also be output directly without statistics.
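The per-topic proportion statistics described above can be sketched as follows; the topic names and labels are illustrative.

```python
from collections import Counter

def orientation_proportions(records):
    """records: (topic, sentiment) pairs; returns, per topic, the
    proportion of each sentiment orientation."""
    by_topic = {}
    for topic, sentiment in records:
        by_topic.setdefault(topic, Counter())[sentiment] += 1
    return {t: {s: n / sum(c.values()) for s, n in c.items()}
            for t, c in by_topic.items()}

data = [("society", "positive"), ("society", "negative"),
        ("society", "negative"), ("tech", "positive")]
stats = orientation_proportions(data)
```

Here `stats["society"]["negative"]` is 2/3, i.e., two of the three "society" corpora are negative.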
An embodiment of the present invention provides a sentiment analysis system. As shown in Fig. 6, the sentiment analysis system includes a data processing module 601, a storage module 602 and an algorithm analysis module 603, wherein:
the data processing module 601 is configured to generate word vectors according to the corpus;
the storage module 602 is configured to store the corpus; the storage module may be a distributed file system, such as HDFS (Hadoop Distributed File System), MongoDB, etc.;
the algorithm analysis module 603 is configured to generate word vectors according to the corpus, generate a feature vector according to the corpus, input the word vectors into the pre-established first LSTM model, input the first information output by the first LSTM model together with the feature vector into the pre-established second LSTM model, and determine the sentiment orientation of the corpus according to the second information output by the second LSTM model.
In one embodiment, the algorithm analysis module 603 includes a training unit 6031 and an analysis unit 6032, wherein:
the data processing module 601 is further configured to label the corpus with sentiment orientations to serve as the training corpus;
the training unit 6031 is configured to train the preset first initial LSTM model and the preset second initial LSTM model based on the training corpus, obtaining the first LSTM model and the second LSTM model; the training may be performed periodically;
the analysis unit 6032 is configured to generate the feature vector according to the corpus, input the word vectors into the first LSTM model, and input the first information output by the first LSTM model together with the feature vector into the second LSTM model.
In one embodiment, the analysis unit 6032 is further configured to count the sentiment orientations of the corpora under each topic or category.
In one embodiment, the system further includes a display module 604, configured to display the sentiment orientation of the corpus, or to display the sentiment orientation of a topic or category, for example by counting the sentiment orientations of all corpora under a topic and outputting the proportion of each sentiment orientation.
For details on how the training is performed and how the sentiment orientation is obtained, please refer to the method embodiments; details are not repeated here.
In one embodiment, the data processing module includes a data crawling unit 6011 and a data preprocessing unit 6012, wherein:
the data crawling unit 6011 is configured to obtain target information; this unit may be deployed in a distributed manner, i.e., configured at multiple locations;
the data preprocessing unit 6012 is configured to preprocess the target information into data of a preset format to serve as the corpus, and store it in the storage module; the storage module may be a distributed file system. A sentiment analysis task is started periodically (of course, it may also be started non-periodically, for example after an enabling instruction is received, or after a trigger condition is met); the model file resulting from the algorithm training is loaded, the corpus data is input, the sentiment orientation of each corpus is obtained, the sentiment orientations are counted by topic, and the results are displayed.
The present application is further illustrated below through specific examples.
Example One
Public microblogs, WeChat Moments and Internet sites used in daily life generate abundant Internet data. To ensure that social-security supervision and public-opinion guidance on the network are carried out effectively, it is essential to analyze and track the focus of public opinion and the dynamics of social public opinion in real time.
As shown in Fig. 7, this embodiment provides a sentiment analysis method, including:
Step 701: crawling microblog content and the comments under microblogs, as well as WeChat Moments content and comments, using a distributed crawler;
Step 702: preprocessing the crawled data and storing the preprocessed, structured data in HDFS, each piece of structured data serving as one corpus. One storage format is shown in Table 5.
Step 703: labeling part of the corpora with sentiment orientations to serve as the training corpus.
One labeling scheme is shown in Table 6 below. It should be noted that the labeling scheme in Table 6 is merely illustrative; more sentiment orientations may be labeled as needed.
Step 704: establishing the first initial LSTM model and the second initial LSTM model, inputting the training corpus into the first initial LSTM model and the second initial LSTM model, and training them to obtain the first LSTM model and the second LSTM model; the model hyperparameters are shown in Table 7.
In one embodiment, the topic of a corpus can be obtained directly from the microblog category information, such as celebrities, technology, society, etc. The input emoticon information can be pre-classified according to the emoticons provided by the microblog platform. The loss function is cross entropy, and Caffe2 is selected as the training platform.
Table 5: Data storage format
Table 6: Sentiment orientation labels
Table 7: Example One hyperparameter settings
Parameter | Type | Description |
dropout_rate | double | Set to 0.5 |
batch_size | int | Set to 3 |
word_embedding_dim | int | Set to 256 |
length_training_text | int | Set to 5 |
It should be noted that the above hyperparameter values are merely illustrative and may be set to other values as needed.
Step 705: inputting the corpus to be analyzed, calling the first LSTM model and the second LSTM model obtained in step 704 to perform sentiment analysis, and presenting the analysis results; an example is given in Table 8 below.
Table 8: Analysis results
Content | Sentiment orientation |
Buy it first: the news comments say she does not know how everyone views the posturing before watching | Negative |
Such a warm-hearted sister, the power of affection | Positive |
In one embodiment, statistics may be collected by topic in step 705, counting the sentiment orientation of each microblog and its comments under a given topic.
In one embodiment, as shown in Fig. 8, the training process in step 704 includes:
Step 801: segmenting the corpus content sentences into words and converting the words into word vectors; splicing them with the word vector of the corpus topic to form the input vectors.
It should be noted that in another embodiment, the word vectors of the segmented corpus content may also be used directly, without the topic word vector;
for example, the words can be converted into word vectors with the word2vec algorithm;
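A real word2vec lookup (for example gensim's `Word2Vec`) maps each segmented word to a dense vector learned from a corpus. The sketch below imitates only that interface: it maps each word deterministically to a fixed-length vector, which is enough to illustrate the word-to-vector step without a trained model. The dimension 8 is a stand-in for the `word_embedding_dim` of 256 in Table 7.

```python
import hashlib

DIM = 8  # word_embedding_dim would be 256 per Table 7

def toy_word_vector(word, dim=DIM):
    """Deterministic stand-in for a trained word2vec lookup: hashes the
    word into `dim` pseudo-random components in [-1, 1]."""
    digest = hashlib.sha256(word.encode("utf-8")).digest()
    return [digest[i % len(digest)] / 127.5 - 1.0 for i in range(dim)]

vec = toy_word_vector("believe")
```

The same word always maps to the same vector, mirroring the lookup behavior of a trained embedding table.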
Step 802: constructing the first LSTM using the above formulas (1)(2)(3)(4)(5); the parameters b1, b1f, b1g, b1q therein are initialized as vectors whose elements are all 0, and U1, U1f, U1g, U1q, W1, W1f, W1g, W1q may take normally distributed random values to form the parameter matrices; the first information is output;
Step 803: extracting features and constructing the feature vector d according to the three kinds of features (social comment features, sentiment expression features and macro-social features);
Step 804: constructing the second LSTM using formulas (6)(7)(8)(9)(10), where the input X2 is the first information output in step 802 and d is the feature vector in step 803. The parameters b2, b2f, b2g, b2q are initialized as vectors whose elements are all 0, and U2, U2f, U2g, U2q, W2, W2f, W2g, W2q, V2, V2f, V2g, V2q take normally distributed random values to form the parameter matrices;
Step 805: defining the hyperparameter values as shown in Table 7 and the training loss function as shown in formula (11); dividing the training corpus into a training set, a validation set and a test set according to a preset ratio; training, and outputting the algorithm model after the training method has iterated a specified number of steps, for example 1000 steps; of course, this is merely illustrative, and other numbers of iteration steps may be used as needed.
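The train/validation/test division by a preset ratio can be sketched as follows; the 80/10/10 ratio and the fixed seed are assumptions for illustration, not values from the patent.

```python
import random

def split_corpus(corpus, ratios=(0.8, 0.1, 0.1), seed=0):
    """Divide the labeled training corpus into training, validation and
    test sets according to a preset ratio."""
    items = list(corpus)
    random.Random(seed).shuffle(items)   # fixed seed for reproducibility
    n = len(items)
    n_train = int(ratios[0] * n)
    n_val = int(ratios[1] * n)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])

corpus = [f"sample-{i}" for i in range(100)]
train, val, test = split_corpus(corpus)
```

Shuffling before splitting avoids any ordering bias in the crawled data (for example all corpora of one topic being stored together).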
The scheme provided in this embodiment can accurately analyze netizens' attitudes toward a topic and track the dynamics of social public opinion in real time.
Example Two
The growth of e-commerce pushes more and more producers to sell their goods, such as automobiles, household appliances and food, directly online. After purchasing goods, consumers can comment directly on the product page.
As shown in Fig. 9, the method includes:
Step 901: for a specific product, collecting buyer comments on different e-commerce platforms (such as Taobao, JD.com, Suning, Dangdang, etc.);
Step 902: since the data volume is huge, the data processing modules can be deployed in a distributed manner, each preprocessing its own data; the regularized data is stored in the distributed database HBase.
In this embodiment, comments can be labeled with sentiment orientations according to the users' ratings; for example, three stars or above is labeled as "like", and otherwise as "dislike";
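The rating-to-label rule above can be written directly:

```python
def label_from_rating(stars):
    """Map a user's product rating to a sentiment label, following the
    rule above: three stars or more -> 'like', otherwise 'dislike'."""
    return "like" if stars >= 3 else "dislike"

labels = [label_from_rating(s) for s in [5, 3, 2, 1]]
```

This lets the existing star ratings serve as weak sentiment labels, so no manual annotation is needed for this scenario.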
Step 903: for the e-commerce data, deploying the algorithm on a distributed TensorFlow platform, accelerating the computation with GPUs, and performing offline training;
establishing the first initial LSTM model and the second initial LSTM model respectively; in this embodiment, the hyperparameter settings are shown in Table 9 and the loss function is cross entropy. Training on the TensorFlow platform yields the first LSTM model and the second LSTM model.
Table 9: Example Two hyperparameter values
Parameter | Type | Description |
dropout_rate | double | Set to 0.5 |
batch_size | int | Set to 5 |
word_embedding_dim | int | Set to 256 |
length_training_text | int | Set to 5 |
Step 904: inputting the corpus to be analyzed according to the format of Table 4, calling the first LSTM model and the second LSTM model obtained in step 903 to analyze the sentiment orientation of the corpus, counting and classifying the analysis results (for example, by product type), and generating a report.
It should be noted that in another embodiment, the sentiment orientation of the comments on each product may also be output directly without statistics.
In one embodiment, the step 903 includes:
Step 9031: segmenting the corpus content sentences into words and converting the words into word vectors with the word2vec algorithm, called content word vectors; treating the product type corresponding to the comment, such as washing machine, mobile phone, microwave oven, etc., as the topic, and generating a topic word vector according to the topic; splicing the content word vector and the topic word vector as the input of the first LSTM. It should be noted that in another embodiment, only the content word vector may be used, without the topic word vector;
Step 9032: constructing the first LSTM using formulas (1)(2)(3)(4)(5); the parameters b1, b1f, b1g, b1q therein are initialized as vectors whose elements are all 0, and U1, U1f, U1g, U1q, W1, W1f, W1g, W1q take normally distributed random values to form the parameter matrices; the word vectors are input into the word-level LSTM. Of course, the above initial values are merely illustrative and other values may be used as needed; for example, the values of a first LSTM model trained in another scenario may serve as the initial values in this scenario.
Step 9033: extracting features, for example the three kinds of features mentioned above (social comment features, sentiment expression features and macro-social features), and constructing the feature vector d.
It should be noted that the comment replies of JD.com stores contain many system replies or standardized customer-service replies; these contents can be filtered out during feature extraction. For the social comment features, additional comments by the original author can be treated as reply content by the author. When no emoticon information is present in a product comment, a custom sentiment dictionary may first be built, and the sentiment words appearing in the comment extracted as the expression information;
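The dictionary-based extraction of expression information described above can be sketched as follows; the dictionary entries and the sample comment are hypothetical examples, not from the patent.

```python
# Hypothetical custom sentiment dictionary.
SENTIMENT_DICT = {"great", "terrible", "love", "hate", "awful"}

def extract_expression_info(comment_words):
    """When a product comment carries no emoticons, pick out the
    sentiment words that occur in it and use them as the expression
    (emoticon) information, as described above."""
    return [w for w in comment_words if w in SENTIMENT_DICT]

words = ["i", "love", "this", "washing", "machine", "terrible", "box"]
expr = extract_expression_info(words)
```

The extracted sentiment words then play the role that emoticons play in the microblog scenario when building the sentiment expression feature.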
Step 9034: constructing the second LSTM using the above formulas (6)(7)(8)(9)(10), where the input x2 is the output of the first LSTM in step 9032 and d is the feature vector described in step 9033. The parameters b2, b2f, b2g, b2q therein are initialized as vectors whose elements are all 0, and U2, U2f, U2g, U2q, W2, W2f, W2g, W2q, V2, V2f, V2g, V2q take normally distributed random values to form the parameter matrices. Of course, the above initial values are merely illustrative and other values may be used as needed; for example, the values of a second LSTM model trained in another scenario may serve as the initial values in this scenario.
Step 9035: defining the hyperparameter values as shown in Table 9 and the training loss function as shown in formula (11);
Step 9036: dividing the training corpus into a training set, a validation set and a test set according to a preset ratio, training the first LSTM and the second LSTM, and outputting the algorithm model after the training method has iterated a specified number of steps, for example 1000 steps.
This embodiment can perform consumer sentiment analysis for a specific product, so as to confirm consumers' preferences for the product more accurately and adjust product quality and sales strategies accordingly.
Example Three
In a smart-city system, the policy information released by the government constantly affects the production, operation and daily life of society. In this example, analyzing the public's sentiment orientation toward a policy on a specific topic after its release helps the government adjust and improve its policy strategy.
As shown in Fig. 10, the method includes:
Step 1001: collecting questionnaires on government policy information, including online questionnaires and questionnaires from on-site surveys;
Step 1002: entering the questionnaires into the system;
Step 1003: the data preprocessing module preprocesses the entered questionnaire data and stores the regularized data in the MySQL database. It should be noted that MySQL is merely illustrative here; other databases may be used as needed.
Step 1004: starting the algorithm training and establishing the first LSTM model and the second LSTM model. In this embodiment, the first and second LSTM models are constructed on the PyTorch platform, the loss function is cross entropy, and the hyperparameter values are shown in Table 10. It should be noted that the hyperparameter values are merely illustrative and other values may be used as needed.
Step 1005: performing sentiment analysis on the corpus to be analyzed, counting the sentiment orientations, and feeding the results back to the relevant policy-issuing department.
Table 10: Example Three hyperparameter values
Parameter | Type | Description |
dropout_rate | double | Set to 0.5 |
batch_size | int | Set to 3 |
word_embedding_dim | int | Set to 128 |
length_training_text | int | Set to 3 |
In one embodiment, the step 1004 includes:
Step 10041: segmenting the corpus content sentences into words and converting the words into word vectors with word2vec. In this embodiment, the topic is the government-affairs information category, such as forestry, health, medical care, housing, etc.; the topic word vector and the content word vector are spliced as the input of the first LSTM. It should be noted that in another embodiment, only the content word vector may be used, without the topic word vector;
Step 10042: constructing the first initial LSTM model using formulas (1)(2)(3)(4)(5); b1, b1f, b1g, b1q are initialized as vectors whose elements are all 0, and U1, U1f, U1g, U1q, W1, W1f, W1g, W1q may take normally distributed random values to form the parameter matrices; its input is the above word vectors;
Step 10043: extracting feature information, for example the three kinds of features (social comment features, sentiment expression features and macro-social features), and constructing the feature vector d according to the feature information;
Step 10044: constructing the second LSTM using formulas (6)(7)(8)(9)(10), where the input X2 is the output of the first LSTM in step 10042 and d is the feature vector in step 10043. The parameters b2, b2f, b2g, b2q are initialized as vectors whose elements are all 0, and U2, U2f, U2g, U2q, W2, W2f, W2g, W2q, V2, V2f, V2g, V2q take normally distributed random values to form the parameter matrices;
Step 10045: defining the hyperparameter values as shown in Table 10 and the training loss function as shown in formula (11).
Step 10046: dividing the training corpus into a training set, a validation set and a test set according to a preset ratio, training, and outputting the first LSTM model and the second LSTM model after the training method has iterated a specified number of steps.
Example Four
Internet news is often the flashpoint of public opinion, so paying attention to the social impact of particular news items becomes critically important. In this embodiment, sentiment analysis is performed effectively on news and news comments, filtering and closely monitoring the news items that cause great social repercussions.
As shown in Fig. 11, the method includes:
Step 1101: collecting the content and comment information of particular news items;
Step 1102: preprocessing; performing summary extraction on the news content to convert the long news text into short text information;
Step 1103: combining the news summaries and the comments, i.e., matching each news comment with its corresponding news summary;
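The matching of comments to news summaries in step 1103 can be sketched as a join on a shared news identifier; the field names (`news_id`, `summary`, `text`) are illustrative assumptions, not from the patent.

```python
def attach_comments(summaries, comments):
    """Pair each news comment with the summary of its news item via a
    shared news id; comments without a matching summary are dropped."""
    by_id = {s["news_id"]: s["summary"] for s in summaries}
    return [{"summary": by_id[c["news_id"]], "comment": c["text"]}
            for c in comments if c["news_id"] in by_id]

summaries = [{"news_id": 1, "summary": "short text of news 1"}]
comments = [{"news_id": 1, "text": "first comment"},
            {"news_id": 2, "text": "orphan comment"}]
pairs = attach_comments(summaries, comments)
```

Each resulting pair provides the corpus content (summary plus comment) that the subsequent training and analysis steps consume.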
Step 1104: performing the algorithm training. In this embodiment, the first and second initial LSTMs are constructed on the TensorFlow platform, the loss function is cross entropy, and the hyperparameter settings are shown in Table 11.
Table 11: Example Four hyperparameter settings
Parameter | Type | Description |
dropout_rate | double | Set to 0.5 |
batch_size | int | Set to 5 |
word_embedding_dim | int | Set to 256 |
length_training_text | int | Set to 5 |
Step 1105: performing sentiment analysis on the corpus to be analyzed using the first LSTM model and the second LSTM model, classifying and counting the analysis results, and generating a report.
In one embodiment, the step 1104 includes:
Step 11041: segmenting the corpus (here, the news summary) sentences into words and computing content word vectors with the word2vec algorithm; treating the category corresponding to the news, such as sports, finance, technology, society, tourism, etc., as the topic; converting the topic into a topic word vector and splicing it with the content word vector as the input of the first LSTM. It should be noted that in another embodiment, only the content word vector may be used, without the topic word vector;
Step 11042: constructing the first LSTM using formulas (1)(2)(3)(4)(5); the parameters b1, b1f, b1g, b1q therein are initialized as vectors whose elements are all 0, and U1, U1f, U1g, U1q, W1, W1f, W1g, W1q may take normally distributed random values to form the parameter matrices;
Step 11043: extracting feature information, for example the three kinds of features (social comment features, sentiment expression features and macro-social features), and constructing the feature vector d based on the extracted feature information;
Step 11044: constructing the second LSTM using formulas (6)(7)(8)(9)(10), where the input X2 is the output of the first LSTM in step 11042 and d is the feature vector in step 11043. The parameters b2, b2f, b2g, b2q are initialized as vectors whose elements are all 0, and U2, U2f, U2g, U2q, W2, W2f, W2g, W2q, V2, V2f, V2g, V2q take normally distributed random values to form the parameter matrices;
Step 11045: defining the hyperparameter values as shown in Table 11 and the training loss function as shown in formula (11);
Step 11046: dividing the training corpus into a training set, a validation set and a test set according to a preset ratio, training, and outputting the algorithm model, i.e., the first LSTM model and the second LSTM model.
The above presents implementations of sentiment analysis under several scenarios using the present application. It should be noted that the present application is not limited thereto and can be used for sentiment analysis under other scenarios, for example sentiment analysis of the usage experience of an app, etc. It is also possible not to train the algorithm separately for each scenario; the same model can be shared by multiple scenarios.
An embodiment of the present invention provides a sentiment analysis device, including a memory and a processor; the memory stores a program which, when read and executed by the processor, implements the sentiment analysis method described in any of the above embodiments.
An embodiment of the present invention provides a computer-readable storage medium storing one or more programs, which can be executed by one or more processors to implement the sentiment analysis method described in any of the above embodiments.
The computer-readable storage medium includes: a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, an optical disk, and various other media that can store program code.
Although the embodiments disclosed herein are as described above, their content is merely the embodiments adopted to facilitate understanding of the present invention and is not intended to limit the present invention. Any person skilled in the art to which the present invention belongs may make modifications and variations in the form and details of implementation without departing from the spirit and scope disclosed by the present invention, but the scope of patent protection of the present invention shall still be subject to the scope defined by the appended claims.
Claims (13)
1. A sentiment analysis method, comprising:
generating word vectors according to a corpus;
generating a feature vector according to the corpus, inputting the word vectors into a pre-established first long short-term memory network model, and inputting the first information output by the first long short-term memory network model together with the feature vector into a pre-established second long short-term memory network model;
determining the sentiment orientation of the corpus according to the second information output by the second long short-term memory network model.
2. The sentiment analysis method according to claim 1, wherein before generating the feature vector according to the corpus, the method further comprises: preprocessing the target information to be analyzed into data of a preset format to serve as the corpus.
3. The sentiment analysis method according to claim 1, wherein the corpus comprises corpus content, or corpus content and a corpus type; the corpus type indicates that the corpus comprises at least one of: a text, a comment on the text, and a reply to the comment.
4. The sentiment analysis method according to claim 1, wherein generating word vectors according to the corpus comprises:
generating one or more first word vectors after segmenting the corpus into words, and using the first word vectors as the word vectors;
or, generating one or more first word vectors after segmenting the corpus into words, generating a second word vector based on the topic or category of the corpus, and combining the first word vectors with the second word vector to obtain the word vectors.
5. The sentiment analysis method according to claim 1, wherein generating the feature vector according to the corpus comprises: extracting feature information from the corpus, and generating the feature vector according to the feature information, the feature information comprising at least one of the following: a social comment feature, an emotion symbol feature and a macro social feature; the social comment feature indicates whether the comment information of the corpus contains a reply from the publisher of the corpus; the emotion symbol feature indicates the quantitative relation between first-class emotion symbols and second-class emotion symbols contained in the corpus and its comment information; the macro social feature indicates whether prompt user information, prompting other users to view the corpus, appears in the corpus.
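A minimal sketch of claim 5's three features follows. Everything concrete here is an assumption: the corpus dict layout (`publisher`, `content`, `comments`), the use of `:)`/`:(` as the two emotion-symbol classes, a ratio as their "quantitative relation", and an `@`-mention as the prompt-user information.

```python
def feature_vector(corpus):
    """Extract the three claim-5 features from an assumed corpus layout."""
    comments = corpus.get("comments", [])
    # Social comment feature: did the corpus publisher reply in the comments?
    publisher_replied = any(c.get("author") == corpus["publisher"]
                            for c in comments)
    # Emotion symbol feature: quantitative relation between two classes of
    # emotion symbols (smoothed ratio of positive to negative emoticons).
    text = corpus["content"] + "".join(c.get("text", "") for c in comments)
    pos, neg = text.count(":)"), text.count(":(")
    emotion_relation = (pos + 1) / (neg + 1)
    # Macro social feature: does the corpus prompt other users to view it
    # (here approximated by the presence of an @-mention)?
    mentions_users = "@" in corpus["content"]
    return [float(publisher_replied), emotion_relation, float(mentions_users)]
```

The resulting list is the feature vector fed alongside the first information into the second model.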
6. The sentiment analysis method according to claim 1, wherein:
the word vectors comprise word vectors x1(0) to x1(N-1);
inputting the word vectors into the pre-established first long short-term memory network model, and inputting the first information output by the first long short-term memory network model together with the feature vector into the pre-established second long short-term memory network model comprises:
inputting word vector x1(0) into the first long short-term memory network model, and inputting the first information output by the first long short-term memory network model, together with the feature vector, into the second long short-term memory network model;
inputting word vector x1(1) into the first long short-term memory network model, and inputting the first information output by the first long short-term memory network model, together with the feature vector, into the second long short-term memory network model;
and so on, until word vector x1(N-1) is input into the first long short-term memory network model and the first information output by the first long short-term memory network model, together with the feature vector, is input into the second long short-term memory network model, obtaining the second information output by the second long short-term memory network model.
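The step-by-step feeding described in claims 1 and 6 amounts to the loop below. `step1` and `step2` stand in for the two LSTM cells, and the hidden sizes `d1`, `d2` and the function signatures are our assumptions, not from the patent.

```python
import numpy as np

def run_pipeline(word_vecs, feat_vec, step1, step2, d1, d2):
    """Feed word vectors x1(0)..x1(N-1) one per time step into the first
    model; at every step pass its output (the first information) plus the
    fixed feature vector into the second model.  The final h2 is the
    second information used to decide sentiment orientation."""
    h1, s1 = np.zeros(d1), np.zeros(d1)
    h2, s2 = np.zeros(d2), np.zeros(d2)
    for x in word_vecs:                 # x1(0), x1(1), ..., x1(N-1)
        h1, s1 = step1(x, h1, s1)       # first information h1^(t)
        h2, s2 = step2(h1, feat_vec, h2, s2)
    return h2                           # second information after x1(N-1)
```

Any pair of cell functions with these signatures (e.g. the NumPy cells sketched under claims 8 and 9, wrapped with their parameters) can be plugged in.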
7. The sentiment analysis method according to any one of claims 1 to 6, wherein the first long short-term memory network model and the second long short-term memory network model are established in the following manner:
annotating training corpora with sentiment orientation labels, and training a preset first initial long short-term memory network model and a preset second initial long short-term memory network model on the training corpora, to obtain the first long short-term memory network model and the second long short-term memory network model.
8. The sentiment analysis method according to any one of claims 1 to 6, wherein the first long short-term memory network model is as follows:

$f1_i^{(t)} = \sigma\big(b1f_i + \sum_j U1f_{i,j}\, x1_j^{(t)} + \sum_j W1f_{i,j}\, h1_j^{(t-1)}\big)$

$g1_i^{(t)} = \sigma\big(b1g_i + \sum_j U1g_{i,j}\, x1_j^{(t)} + \sum_j W1g_{i,j}\, h1_j^{(t-1)}\big)$

$q1_i^{(t)} = \sigma\big(b1q_i + \sum_j U1q_{i,j}\, x1_j^{(t)} + \sum_j W1q_{i,j}\, h1_j^{(t-1)}\big)$

$s1_i^{(t)} = f1_i^{(t)}\, s1_i^{(t-1)} + g1_i^{(t)}\, \sigma\big(b1_i + \sum_j U1_{i,j}\, x1_j^{(t)} + \sum_j W1_{i,j}\, h1_j^{(t-1)}\big)$

$h1_i^{(t)} = \tanh\big(s1_i^{(t)}\big)\, q1_i^{(t)}$

where $f1_i^{(t)}$ is the i-th element of the state vector of the forget gate of the first long short-term memory network model at time t; $\sigma$ is the sigmoid function; $b1f_i$ is the i-th element of the weight vector $b1f$; $U1f_{i,j}$ is the element in row i, column j of the weight matrix $U1f$; $W1f_{i,j}$ is the element in row i, column j of the weight matrix $W1f$; $x1_j^{(t)}$ is the j-th element of the input vector of the first long short-term memory network model at time t, i.e. the j-th element of the word vector input at time t; $h1_j^{(t-1)}$ is the j-th element of the output vector $h1^{(t-1)}$ of the first long short-term memory network model at time t-1;
$g1_i^{(t)}$ is the i-th element of the state vector of the input gate of the first long short-term memory network model at time t; $b1g_i$ is the i-th element of the weight vector $b1g$; $U1g_{i,j}$ and $W1g_{i,j}$ are the elements in row i, column j of the weight matrices $U1g$ and $W1g$;
$q1_i^{(t)}$ is the i-th element of the state vector of the output gate of the first long short-term memory network model at time t; $b1q_i$ is the i-th element of the weight vector $b1q$; $U1q_{i,j}$ and $W1q_{i,j}$ are the elements in row i, column j of the weight matrices $U1q$ and $W1q$;
$s1_i^{(t)}$ is the i-th element of the intermediate state vector $s1^{(t)}$ of the first long short-term memory network model at time t; $s1_i^{(t-1)}$ is the i-th element of the intermediate state vector $s1^{(t-1)}$ at time t-1; $b1_i$ is the i-th element of the weight vector $b1$; $U1_{i,j}$ and $W1_{i,j}$ are the elements in row i, column j of the weight matrices $U1$ and $W1$;
$h1_i^{(t)}$ is the i-th element of the output vector $h1^{(t)}$ of the first long short-term memory network model at time t; the first information is the $h1^{(t)}$ output by the first long short-term memory network model at time t; $\tanh$ is the hyperbolic tangent function.
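The first model's gate equations can be written out in NumPy as a single cell step. This is a sketch reconstructed from claim 8's symbol definitions (the formula images are missing from the text); the parameter-dict layout is our own convention.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm1_step(x, h_prev, s_prev, p):
    """One time step of the first model. p holds the weights named in
    claim 8: b1f/U1f/W1f (forget gate), b1g/U1g/W1g (input gate),
    b1q/U1q/W1q (output gate), b1/U1/W1 (intermediate-state update)."""
    f = sigmoid(p["b1f"] + p["U1f"] @ x + p["W1f"] @ h_prev)  # forget gate f1^(t)
    g = sigmoid(p["b1g"] + p["U1g"] @ x + p["W1g"] @ h_prev)  # input gate g1^(t)
    q = sigmoid(p["b1q"] + p["U1q"] @ x + p["W1q"] @ h_prev)  # output gate q1^(t)
    s = f * s_prev + g * sigmoid(p["b1"] + p["U1"] @ x + p["W1"] @ h_prev)
    h = np.tanh(s) * q          # output h1^(t): the "first information"
    return h, s
```

Note the cell-update nonlinearity is the sigmoid, as the claim's definitions imply (tanh appears only in the output equation).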
9. The sentiment analysis method according to any one of claims 1 to 6, wherein the second long short-term memory network model is as follows (the feature vector input at time t is denoted $x3^{(t)}$ here):

$f2_i^{(t)} = \sigma\big(b2f_i + \sum_j U2f_{i,j}\, x2_j^{(t)} + \sum_j W2f_{i,j}\, h2_j^{(t-1)} + \sum_j V2f_{i,j}\, x3_j^{(t)}\big)$

$g2_i^{(t)} = \sigma\big(b2g_i + \sum_j U2g_{i,j}\, x2_j^{(t)} + \sum_j W2g_{i,j}\, h2_j^{(t-1)} + \sum_j V2g_{i,j}\, x3_j^{(t)}\big)$

$q2_i^{(t)} = \sigma\big(b2q_i + \sum_j U2q_{i,j}\, x2_j^{(t)} + \sum_j W2q_{i,j}\, h2_j^{(t-1)} + \sum_j V2q_{i,j}\, x3_j^{(t)}\big)$

$s2_i^{(t)} = f2_i^{(t)}\, s2_i^{(t-1)} + g2_i^{(t)}\, \sigma\big(b2_i + \sum_j U2_{i,j}\, x2_j^{(t)} + \sum_j W2_{i,j}\, h2_j^{(t-1)} + \sum_j V2_{i,j}\, x3_j^{(t)}\big)$

$h2_i^{(t)} = \tanh\big(s2_i^{(t)}\big)\, q2_i^{(t)}$

where $f2_i^{(t)}$ is the i-th element of the state vector of the forget gate of the second long short-term memory network model at time t; $\sigma$ is the sigmoid function; $b2f_i$ is the i-th element of the weight vector $b2f$; $U2f_{i,j}$, $W2f_{i,j}$ and $V2f_{i,j}$ are the elements in row i, column j of the weight matrices $U2f$, $W2f$ and $V2f$; $x2_j^{(t)}$ is the j-th element of the input vector $x2^{(t)}$ of the second long short-term memory network model at time t, i.e. the j-th element of the first information input to the second long short-term memory network model at time t; $h2_j^{(t-1)}$ is the j-th element of the output vector $h2^{(t-1)}$ of the second long short-term memory network model at time t-1; $x3_j^{(t)}$ is the j-th element of the feature vector input to the second long short-term memory network model at time t;
$g2_i^{(t)}$ is the i-th element of the state vector of the input gate of the second long short-term memory network model at time t; $b2g_i$ is the i-th element of the weight vector $b2g$; $U2g_{i,j}$, $W2g_{i,j}$ and $V2g_{i,j}$ are the elements in row i, column j of the weight matrices $U2g$, $W2g$ and $V2g$;
$q2_i^{(t)}$ is the i-th element of the state vector of the output gate of the second long short-term memory network model at time t; $b2q_i$ is the i-th element of the weight vector $b2q$; $U2q_{i,j}$, $W2q_{i,j}$ and $V2q_{i,j}$ are the elements in row i, column j of the weight matrices $U2q$, $W2q$ and $V2q$;
$s2_i^{(t)}$ is the i-th element of the intermediate state vector $s2^{(t)}$ of the second long short-term memory network model at time t; $s2_i^{(t-1)}$ is the i-th element of the intermediate state vector $s2^{(t-1)}$ at time t-1; $b2_i$ is the i-th element of the weight vector $b2$; $U2_{i,j}$, $W2_{i,j}$ and $V2_{i,j}$ are the elements in row i, column j of the weight matrices $U2$, $W2$ and $V2$;
$h2_i^{(t)}$ is the i-th element of the output vector $h2^{(t)}$ of the second long short-term memory network model at time t; the second information is the $h2^{(t)}$ output by the second long short-term memory network model at time t; $\tanh$ is the hyperbolic tangent function.
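The second model differs from the first only in that every gate also receives the feature vector through the V2* matrices. A NumPy sketch under the same caveats as before (the claim's formula images are lost, and the symbol for the feature vector, `x_feat` below, is our choice):

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm2_step(x2, x_feat, h_prev, s_prev, p):
    """One time step of the second model per claim 9: x2 is the first
    information from the first model, x_feat the fixed feature vector.
    p holds b2f/U2f/W2f/V2f etc., named as in the claim."""
    f = sigmoid(p["b2f"] + p["U2f"] @ x2 + p["W2f"] @ h_prev + p["V2f"] @ x_feat)
    g = sigmoid(p["b2g"] + p["U2g"] @ x2 + p["W2g"] @ h_prev + p["V2g"] @ x_feat)
    q = sigmoid(p["b2q"] + p["U2q"] @ x2 + p["W2q"] @ h_prev + p["V2q"] @ x_feat)
    s = f * s_prev + g * sigmoid(p["b2"] + p["U2"] @ x2
                                 + p["W2"] @ h_prev + p["V2"] @ x_feat)
    h = np.tanh(s) * q          # h2^(t): the "second information"
    return h, s
```

Because the feature vector enters every gate at every step, the context features modulate the whole sequence rather than only the final classification.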
10. The sentiment analysis method according to any one of claims 1 to 6, wherein:
the word vectors comprise word vectors x1(0) to x1(N-1);
inputting the word vectors into the pre-established first long short-term memory network model, and inputting the first information output by the first long short-term memory network model together with the feature vector into the pre-established second long short-term memory network model comprises:
sequentially inputting the word vectors, in the order from x1(0) to x1(N-1), into the first long short-term memory network model, and inputting the first information output by the first long short-term memory network model, together with the feature vector, into the second long short-term memory network model; after word vector x1(N-1) has been input into the first long short-term memory network model, obtaining the second forward information output by the second long short-term memory network model;
sequentially inputting the word vectors, in the order from x1(N-1) to x1(0), into the first long short-term memory network model, and inputting the first information output by the first long short-term memory network model, together with the feature vector, into the second long short-term memory network model; after word vector x1(0) has been input into the first long short-term memory network model, obtaining the second backward information output by the second long short-term memory network model;
and determining the sentiment orientation of the corpus according to the second information output by the second long short-term memory network model comprises:
combining the second forward information and the second backward information to obtain combined information, and determining the sentiment orientation of the corpus according to the combined information.
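The bidirectional variant of claim 10 reduces to two runs of the same pipeline. In this sketch `run_pass` stands in for one full forward pass through both models, and concatenation is just one way to "combine" the two results (the claim does not fix the combination operator):

```python
import numpy as np

def bidirectional_second_info(word_vecs, feat_vec, run_pass):
    """Run the two-model pipeline over x1(0)..x1(N-1) in forward order
    and again in reverse order, then combine the second forward and
    second backward information into the combined information."""
    fwd = run_pass(word_vecs, feat_vec)        # second forward information
    bwd = run_pass(word_vecs[::-1], feat_vec)  # second backward information
    return np.concatenate([fwd, bwd])          # combined information
```

A classifier over the combined information then yields the sentiment orientation.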
11. A sentiment analysis system, comprising: a data processing module, a storage module and an algorithm analysis module, wherein:
the data processing module is configured to obtain a corpus;
the storage module is configured to store the corpus;
the algorithm analysis module is configured to generate a feature vector according to the corpus, input word vectors into a pre-established first long short-term memory network model, input the first information output by the first long short-term memory network model together with the feature vector into a pre-established second long short-term memory network model, and determine the sentiment orientation of the corpus according to the second information output by the second long short-term memory network model.
12. A sentiment analysis device, comprising a memory and a processor, wherein the memory stores a program which, when read and executed by the processor, implements the sentiment analysis method according to any one of claims 1 to 10.
13. A computer-readable storage medium storing one or more programs, the one or more programs being executable by one or more processors to implement the sentiment analysis method according to any one of claims 1 to 10.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201810538689.1A CN108845986A (en) | 2018-05-30 | 2018-05-30 | A kind of sentiment analysis method, equipment and system, computer readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN108845986A true CN108845986A (en) | 2018-11-20 |
Family
ID=64209997
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201810538689.1A Pending CN108845986A (en) | 2018-05-30 | 2018-05-30 | A kind of sentiment analysis method, equipment and system, computer readable storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN108845986A (en) |
Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN107544957A (en) * | 2017-07-05 | 2018-01-05 | 华北电力大学 | A kind of Sentiment orientation analysis method of business product target word |
CN107862343A (en) * | 2017-11-28 | 2018-03-30 | 南京理工大学 | The rule-based and comment on commodity property level sensibility classification method of neutral net |
2018-05-30: application CN201810538689.1A filed in CN; published as CN108845986A (en); legal status: active, Pending
Non-Patent Citations (2)
Title |
---|
MINLIE HUANG et al.: "Modeling Rich Contexts for Sentiment Classification with LSTM", 《CORNELL UNIVERSITY: COMPUTER SCIENCE: COMPUTATION AND LANGUAGE》 * |
SHALINI GHOSH et al.: "Contextual LSTM (CLSTM) models for Large scale NLP tasks", 《CORNELL UNIVERSITY: COMPUTER SCIENCE: COMPUTATION AND LANGUAGE》 * |
Cited By (12)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US11120214B2 (en) | 2018-06-29 | 2021-09-14 | Alibaba Group Holding Limited | Corpus generating method and apparatus, and human-machine interaction processing method and apparatus |
CN111354354A (en) * | 2018-12-20 | 2020-06-30 | 深圳市优必选科技有限公司 | Training method and device based on semantic recognition and terminal equipment |
CN111354354B (en) * | 2018-12-20 | 2024-02-09 | 深圳市优必选科技有限公司 | Training method, training device and terminal equipment based on semantic recognition |
CN110297907A (en) * | 2019-06-28 | 2019-10-01 | 谭浩 | Generate method, computer readable storage medium and the terminal device of interview report |
CN111191677A (en) * | 2019-12-11 | 2020-05-22 | 北京淇瑀信息科技有限公司 | User characteristic data generation method and device and electronic equipment |
CN111191677B (en) * | 2019-12-11 | 2023-09-26 | 北京淇瑀信息科技有限公司 | User characteristic data generation method and device and electronic equipment |
CN112083806A (en) * | 2020-09-16 | 2020-12-15 | 华南理工大学 | Self-learning emotion interaction method based on multi-modal recognition |
CN112333708A (en) * | 2020-10-27 | 2021-02-05 | 广东工业大学 | Telecommunication fraud detection method and system based on bidirectional gating circulation unit |
CN112463947A (en) * | 2020-11-26 | 2021-03-09 | 上海明略人工智能(集团)有限公司 | Marketing scheme iteration method, marketing scheme iteration system, computer equipment and readable storage medium |
CN113762343A (en) * | 2021-08-04 | 2021-12-07 | 德邦证券股份有限公司 | Method, device and storage medium for processing public opinion information and training classification model |
CN113762343B (en) * | 2021-08-04 | 2024-03-15 | 德邦证券股份有限公司 | Method, device and storage medium for processing public opinion information and training classification model |
CN113518023A (en) * | 2021-09-13 | 2021-10-19 | 深圳小小小科技有限公司 | Control method and device for household appliance |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN108845986A (en) | A kind of sentiment analysis method, equipment and system, computer readable storage medium | |
CN107341145B (en) | A kind of user feeling analysis method based on deep learning | |
Kaiser et al. | Mining consumer dialog in online forums | |
US9710829B1 (en) | Methods, systems, and articles of manufacture for analyzing social media with trained intelligent systems to enhance direct marketing opportunities | |
CN110750987B (en) | Text processing method, device and storage medium | |
CN104484336B (en) | A kind of Chinese comment and analysis method and its system | |
Chyrun et al. | Content monitoring method for cut formation of person psychological state in social scoring | |
CN110750648A (en) | Text emotion classification method based on deep learning and feature fusion | |
CN104077417A (en) | Figure tag recommendation method and system in social network | |
Saranya et al. | A Machine Learning-Based Technique with IntelligentWordNet Lemmatize for Twitter Sentiment Analysis. | |
CN116955591A (en) | Recommendation language generation method, related device and medium for content recommendation | |
Gao et al. | Chatbot or Chat-Blocker: Predicting chatbot popularity before deployment | |
Lyras et al. | Modeling Credibility in Social Big Data using LSTM Neural Networks. | |
Al-Otaibi et al. | Finding influential users in social networking using sentiment analysis | |
Biswas et al. | A new ontology-based multimodal classification system for social media images of personality traits | |
Cetinkaya et al. | Twitter account classification using account metadata: organizationvs. individual | |
CN116703515A (en) | Recommendation method and device based on artificial intelligence, computer equipment and storage medium | |
Wang et al. | CA-CD: context-aware clickbait detection using new Chinese clickbait dataset with transfer learning method | |
Lakshmi et al. | Sentiment analysis of twitter data | |
Sarigiannidis et al. | A novel lexicon-based approach in determining sentiment in financial data using learning automata | |
Wang et al. | Prediction of perceived utility of consumer online reviews based on lstm neural network | |
Sujatha et al. | Text-based conversation analysis techniques on social media using statistical methods | |
Gudumotu et al. | A Survey on Deep Learning Models to Detect Hate Speech and Bullying in Social Media | |
Roshchina et al. | Evaluating the similarity estimator component of the TWIN personality-based recommender system | |
Naseri et al. | A two-stage deep neural model with capsule network for personality identification |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||