CN108763214B

CN108763214B - Automatic construction method of emotion dictionary for commodity comments

Info

Publication number: CN108763214B
Application number: CN201810539447.4A
Authority: CN
Inventors: 冯钧; 贡诚; 李晓东; 邹希
Original assignee: Hohai University HHU
Current assignee: Hohai University HHU
Priority date: 2018-05-30
Filing date: 2018-05-30
Publication date: 2021-09-24
Anticipated expiration: 2038-05-30
Also published as: CN108763214A

Abstract

The invention discloses a method for automatically constructing an emotion dictionary for commodity comments, comprising text preprocessing, semantic relation mining, and emotion word clustering. Text preprocessing preprocesses the commodity comments and extracts the emotion words and evaluation objects contained in the comments on a given type of commodity. Semantic relation mining mines the semantic relations between the emotion words and the evaluation objects and represents them in matrix form. Emotion word clustering performs unsupervised clustering on the emotion words according to their mutual distances in the emotion matrix space, dividing them into k classes. The invention constructs a domain emotion dictionary tailored to the characteristics of texts in the commodity comment field. The dictionary divides emotion words into multiple classes rather than only the two classes of commendatory and derogatory, and in tasks such as emotion classification in the commodity comment field it has clear advantages over existing general-purpose emotion dictionaries.

Description

Automatic construction method of emotion dictionary for commodity comments

Technical Field

The invention relates to a method for automatically constructing an emotion dictionary for the commodity comment field, and belongs to the technical field of computer information processing.

Background

With the development of shopping websites, a large number of comments on all kinds of commodities have appeared online, and people can consult them anytime and anywhere. Identifying the emotional tendencies implied by these comments is important to both merchants and consumers, and a good emotion dictionary is the basis for analyzing the emotional tendency of text. It is well known that sentiment analysis of text must take into account the domain to which the text belongs. Existing emotion dictionaries are general-purpose; none is dedicated to commodity comments, so using them for emotion analysis of commodity comment text is not appropriate. Methods for automatically constructing emotion dictionaries, especially for a specific field, have therefore drawn increasing research attention.

Existing methods for constructing emotion dictionaries, for both Chinese and English, fall into two categories: corpus-based and knowledge-base-based. In corpus-based construction, the most common approach is to select seed words and determine the polarity of an emotion word of unknown polarity by calculating its association with the seed words, namely its PMI value. The knowledge bases commonly available for Chinese are very limited, so research on constructing Chinese emotion dictionaries from a knowledge base is rare. Moreover, when constructing an emotion dictionary for the commodity comment field, the evaluation object must be given particular consideration. The evaluation object is a feature of the commodity that the user evaluates; for a mobile phone, for example, the evaluation object may be the screen, the battery, and so on.

On the other hand, existing emotion dictionaries usually contain only emotion words divided into two categories, positive and negative. Some scholars instead classify emotions into six major classes: joy, sadness, fear, surprise, anger, and jealousy. In either case, the emotion classes into which emotion words may fall are determined from human experience.

Many emotion words show different emotional tendencies in different fields, so it is important to identify accurately both these emotion words and the evaluation objects of the field, especially in the commodity comment field. Fast found it difficult to construct a domain emotion dictionary by relying on experts or on crowdsourcing services. Shi et al. extracted key information from domain text using an association rule algorithm combined with supervised machine learning. Zhang et al. extracted the evaluation objects of a product using pointwise mutual information (PMI) and an association rule algorithm. Considering the ordering of evaluation objects, Qiu et al. proposed a two-way propagation algorithm based on calculating the positive relationship between emotion words and the product. Mishne selected evaluation objects using the part of speech and the word frequency of words.

PMI is a common indicator of the degree of association between two words. Turney and Littman used PMI and LSA to calculate the degree of association between two words; calculating the relationship between a word and a seed word by PMI in this way is generally called SO-PMI. Islam and Inkpen improved PMI and proposed SOC-PMI. Emotion classification is a basic task of emotion analysis, and the quality of the classification result directly reflects the performance of the emotion dictionary used. Pang treated emotion classification as a text classification task and tested three classifiers: naive Bayes, support vector machine, and maximum entropy. Li and Hao expanded the evaluation objects using spectral clustering. Yang et al. used word2vec to calculate the cosine similarity between a word and the seed words.

Most existing construction methods produce general-purpose emotion dictionaries, which are not suitable for analyzing text in a specific field such as commodity comments. It is therefore very important to construct an emotion dictionary suited to a specific field.

Disclosure of Invention

The purpose of the invention is as follows: the invention aims to solve the defects in the prior art and provides an emotion dictionary construction method aiming at the field of commodity comments.

The technical scheme is as follows: an automatic construction method of an emotion dictionary for commodity comments sequentially comprises the following steps:

(1) Preprocess the original commodity comment text, typically by Chinese word segmentation, stop-word filtering, and similar measures, and determine the emotion words and evaluation objects contained in the text of the specified field. Emotion words and evaluation objects are determined by part of speech: nouns in the comment text are selected as evaluation objects, and adjectives, adverbs, and verbs are taken as emotion words.

(2) Mine the relationship between the emotion words and the evaluation objects obtained in step (1), and generate an emotion matrix representing this relationship.

(3) Screen the evaluation objects obtained in step (1), retaining a subset of key evaluation objects.

(4) As in step (2), consider the relationship between the emotion words and the key evaluation objects, and generate an emotion matrix representing it.

(5) Mine the correlation between the key evaluation objects screened in step (3) and the original evaluation objects obtained in step (1), and generate a correlation matrix representing it.

(6) Using the two emotion matrices obtained in steps (2) and (4) and the correlation matrix obtained in step (5), generate a new emotion matrix representing the relationship between the emotion words and the key evaluation objects.

(7) Cluster the emotion words according to the distances between them in the emotion matrix of step (6), dividing them into several classes to obtain the domain emotion dictionary.

(8) Apply the emotion dictionary to the emotion classification task, determine the optimal value of k for the field concerned by cross-validation or a similar method, and divide the emotion words into k classes.

Further, in the step (2), the relationship between the emotion word and the evaluation object directly reflects a modification degree of the emotion word on the evaluation object:

(2.1) quantifying the relation between the emotion words and the evaluation objects by using the co-occurrence of the emotion words and the evaluation objects, wherein the PMI is adopted to calculate the relation between the emotion words and the evaluation objects.

(2.2) a matrix (the emotion matrix) is used to represent the relationship: each row of the matrix corresponds to an emotion word, each column corresponds to an evaluation object, and each entry is the PMI value of the corresponding emotion word and evaluation object.

Further, in the step (3), the tf-idf concept is used for screening the evaluation object, and the details specifically include the following:

(3.1) merge the comments on the same type of product into one document, and calculate the tf-idf value of each word from its number of occurrences in the different documents (that is, the different product comment sets) and its inverse document frequency.

(3.2) calculate the tf-idf values of all words, sort the tf-idf values of the evaluation objects, and set a threshold; the evaluation objects that reach the threshold are taken as the final evaluation objects.

Further, the emotion matrix in step (4) is constructed in the same way as in step (2); the only difference is that it contains the relationship between the emotion words and the evaluation objects retained after screening, rather than the relationship between the emotion words and all the evaluation objects.

Further, in the step (5), the relationship between all the evaluation objects and the screened evaluation objects is mined, and a correlation matrix between the key evaluation object and all the evaluation objects is generated, wherein the specific details are as follows:

(5.1) the key evaluation objects are the evaluation objects obtained by screening in the step (3), and all the evaluation objects are all the nouns contained in the initial product review.

(5.2) the degree of association between a key evaluation object and all the evaluation objects can be understood as a synonymy relationship within the corpus. It is expressed as a value in [0,1], where 0 represents no relationship and 1 represents the highest degree of association; the closer the value is to 1, the higher the degree of association.

Further, in the step (6), we use the two emotion matrixes and the correlation matrix constructed before to generate the final emotion matrix, which is based on an improved PMI algorithm EPMI algorithm that we propose, as follows:

(6.1) EPMI algorithm:

$$\mathrm{EPMI}(e_i, m_j) = \mathrm{PMI}(e_i, m_j) + \sum_{k} u_{jk}\,\mathrm{PMI}(e_i, m_k)$$

That is, when computing the relationship between the emotion word $e_i$ and the evaluation object $m_j$, we consider not only the direct relationship between the two but also the evaluation objects $m_k$ that are related to $m_j$; the degree of that relation is denoted $u_{jk}$ in the formula.

(6.2) the new EPMI algorithm is adopted to mine the relation between the emotion words and the evaluation objects, and a new emotion matrix is constructed. The new emotion matrix can be directly obtained by the EPMI algorithm through the two emotion matrixes and the association matrix obtained previously.

Further, in step (7), the emotion words are clustered: in the emotion matrix each emotion word is represented as a vector, so the emotion words can be clustered without supervision according to the distances between these vectors.

Further, in step (8), because the clustering is unsupervised, the final number of clusters k is not fixed in advance; the optimal k differs across product reviews and text analysis tasks, and a k value with stable performance can be selected by cross-validation.

Advantageous effects: compared with the prior art, the domain emotion dictionary constructed by the method of the invention outperforms general-purpose emotion dictionaries in the specific field of commodity comments. The invention provides a method for constructing an emotion dictionary from a comment corpus. Unlike a traditional emotion dictionary, the constructed dictionary divides emotion words into a non-fixed number k of classes rather than fixed classes such as commendatory and derogatory, and this higher-dimensional domain emotion dictionary performs better in tasks such as emotion classification.

Drawings

FIG. 1 is a flow diagram of a text pre-processing module;

FIG. 2 is a schematic diagram of semantic relationship mining;

FIG. 3 is a flow chart of k value selection.

Detailed Description

The present invention is further illustrated by the following examples, which are intended to be purely exemplary and are not intended to limit the scope of the invention, as various equivalent modifications of the invention will occur to those skilled in the art upon reading the present disclosure and fall within the scope of the appended claims.

First, in order to facilitate understanding of the present invention, the following description is made:

1. Text preprocessing

The text preprocessing module preprocesses the original comment text, as shown in FIG. 1. Chinese text requires word segmentation, for which much open-source software is available, such as the jieba word segmenter and the IK word segmenter. To achieve the best segmentation, a user-defined dictionary often has to be added so that rare domain words are recognized. English words are separated by spaces, so no segmentation step is required; however, English vocabulary involves tense and other inflections, so words need to be lemmatized and stemmed. Many open-source tools also support stemming of English vocabulary, such as the widely used natural language processing toolkit NLTK.

Besides this basic preprocessing, the most important function of the module is to obtain preliminary emotion words and evaluation objects, which are extracted by part-of-speech analysis. Emotion words express a user's strongly subjective attitude toward something; adjectives, adverbs, and verbs are selected as emotion words. An evaluation object, as the name implies, is the object modified by an emotion word, and may be a product or a feature of a product; nouns are selected as evaluation objects.

Whether the text is English or Chinese, it contains many words that carry no meaning of their own, such as "yes" and "do" in Chinese and "is" and "the" in English. Stop-word processing filters these words out, reducing the negative influence of meaningless high-frequency words on text analysis.
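A minimal sketch of the part-of-speech selection and stop-word filtering described above. It assumes the review has already been segmented and tagged by some external tool; the tag names ("n", "a", "d", "v"), the stop-word list, and the function name `extract_candidates` are illustrative assumptions and are not taken from the original method.

```python
# Hypothetical tag set: "n" noun, "a" adjective, "d" adverb, "v" verb.
EMOTION_TAGS = {"a", "d", "v"}
OBJECT_TAGS = {"n"}
STOP_WORDS = {"is", "the", "a", "of"}  # illustrative list only


def extract_candidates(tagged_tokens):
    """Split (word, tag) pairs into emotion words and evaluation objects."""
    emotion_words, objects = [], []
    for word, tag in tagged_tokens:
        token = word.lower()
        if token in STOP_WORDS:
            continue  # drop meaningless high-frequency words first
        if tag in EMOTION_TAGS:
            emotion_words.append(token)
        elif tag in OBJECT_TAGS:
            objects.append(token)
    return emotion_words, objects


# A pre-tagged toy review; real input would come from a segmenter/tagger.
review = [("The", "x"), ("screen", "n"), ("is", "v"),
          ("very", "d"), ("bright", "a")]
emotion_words, objects = extract_candidates(review)
print(emotion_words, objects)
```

Filtering stop words before the part-of-speech test keeps high-frequency verbs such as "is" out of the emotion word list, even though they carry a verb tag.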

2. Semantic relationship mining

The semantic relation mining module is the core module of the invention: only by fully mining the relationship between the emotion words and the evaluation objects can a good foundation be laid for the subsequent emotion word clustering. The process of obtaining the final emotion matrix is divided into three stages, as shown in FIG. 2.

The relationship between an emotion word and an evaluation object is that of modifier and modified. It shows up as co-occurrence in the comment text: the more frequently the two co-occur, the closer their relationship is considered to be. Since PMI measures the relationship between two words by their co-occurrence, PMI is first used to calculate the initial emotion matrix.

This raises the problem of excessive dimensionality: the number of evaluation objects in a comment corpus is often far greater than the number of emotion words, so each emotion word is represented in the emotion matrix by a vector of very high dimension, which greatly degrades the accuracy and performance of clustering by vector distance. The evaluation objects are therefore screened by borrowing the idea of tf-idf. An evaluation object is a product, or a feature of a product, on which users comment, and such features are domain-related. For example, in mobile phone reviews users care about features such as "mobile phone", "screen", and "battery", whereas in hotel reviews they care about features such as "hotel", "toilet", and "air conditioner". Features such as "mobile phone", "screen", and "battery" appear in a large number of mobile phone reviews but in few hotel reviews; features such as "hotel", "toilet", and "air conditioner" appear in numerous hotel reviews but not in mobile phone reviews. The idea of tf-idf can therefore be used to screen out the real evaluation objects.

tf-idf is defined over documents, whereas the objects processed here are individual comments. The comments on the same type of product are therefore merged into one document, so that the number of documents equals the number of product types for which comments were collected. From these documents, the tf-idf value of each word in the comment text can be calculated, and the nouns with higher tf-idf values are usually the evaluation objects of interest.

A threshold α is set, and only nouns whose tf-idf value reaches the threshold are kept as evaluation objects; these are called key evaluation objects. As with the original emotion matrix, the PMI values between the emotion words and the key evaluation objects are calculated to construct a new emotion matrix, in which each emotion word is again represented as a vector. Because the evaluation objects have been screened by tf-idf, the dimension of this vector is far smaller than in the original emotion matrix.
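The screening step can be sketched as follows. The text does not fix a particular tf-idf variant, so this sketch assumes relative term frequency and a natural-log inverse document frequency; the function name `key_objects`, the toy documents, and the threshold value are hypothetical.

```python
import math
from collections import Counter


def key_objects(docs, nouns, alpha):
    """docs maps a product type to the token list of its merged reviews.

    Returns, per product type, the nouns whose tf-idf reaches alpha.
    """
    n_docs = len(docs)
    doc_freq = Counter()
    for tokens in docs.values():
        doc_freq.update(set(tokens))  # count each word once per document

    def tf_idf(word, tokens):
        tf = tokens.count(word) / len(tokens)       # assumed tf variant
        idf = math.log(n_docs / doc_freq[word])     # assumed idf variant
        return tf * idf

    selected = {}
    for product_type, tokens in docs.items():
        selected[product_type] = [
            w for w in nouns if w in tokens and tf_idf(w, tokens) >= alpha
        ]
    return selected


docs = {
    "phone": ["screen", "battery", "screen", "price"],
    "hotel": ["toilet", "air_conditioner", "price", "toilet"],
}
nouns = ["screen", "battery", "toilet", "air_conditioner", "price"]
selected = key_objects(docs, nouns, alpha=0.1)
print(selected)
```

In this toy corpus "price" occurs in both documents, so its inverse document frequency is zero and it is screened out, while the domain-specific nouns are kept.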

PMI considers only the co-occurrence of an emotion word and an evaluation object, which loses some semantics: when the relationship between an emotion word and an evaluation object is calculated, the relationships between the emotion word and other evaluation objects related to that object should be considered in addition to the direct relationship. We therefore calculate the correlation between each key evaluation object and all the evaluation objects to obtain a correlation matrix. The correlation is calculated as follows:

where M is the set of all product features, M' is the set of screened key evaluation objects, D is the set of comments, and normal() is a simple normalization function that maps the value of each dimension of a vector into the interval [0,1]. Furthermore, $(m'_i, m_j)$ in d denotes that the key evaluation object $m'_i$ and the evaluation object $m_j$ appear in the same comment d; the more often two evaluation objects appear in the same comment, the higher their degree of association.
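One possible reading of this calculation, sketched under assumptions: co-occurrence is counted once per comment, and normal() is taken to be division by the row maximum. The text only says that normal() is a simple normalization into [0,1], so that choice, like the function name `association_matrix`, is hypothetical.

```python
def association_matrix(reviews, key_objects, all_objects):
    """Return a t x p matrix of co-occurrence counts scaled to [0, 1].

    Row i belongs to key object i, column j to evaluation object j.
    """
    review_sets = [set(review) for review in reviews]
    matrix = []
    for key in key_objects:
        counts = [
            sum(1 for review in review_sets if key in review and obj in review)
            for obj in all_objects
        ]
        top = max(counts)
        # Assumed normalization: divide by the largest count in the row.
        matrix.append([c / top if top else 0.0 for c in counts])
    return matrix


reviews = [["screen", "display"],
           ["screen", "display", "battery"],
           ["battery"]]
C = association_matrix(reviews, ["screen"], ["screen", "display", "battery"])
print(C)
```

Note that a key object always co-occurs with itself, so its own column receives the maximum value under this reading; whether the self-pair should be excluded is not stated in the text.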

Through the above calculation the correlation matrix is obtained. Using it together with the two emotion matrices obtained before, the final emotion matrix is obtained by the EPMI algorithm, namely formula (5).

3. Emotional word clustering

After the final emotion matrix is obtained, the emotion words can be clustered. Each emotion word is represented in the matrix as a vector, and clustering is performed according to the distance between vectors. Because the clustering is unsupervised, a suitable k is selected by cross-validation, taking emotion classification as the example task. The selection process is shown in FIG. 3.

The comment data set is divided into m parts; m-1 parts are used as the training set and the remaining part as the test set, and the classification accuracy for each candidate k is calculated on the test set. The test is repeated for m rounds with a different part held out each time, so that m accuracy values are obtained for each k. The k with the highest average accuracy is selected as the final k.
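The selection loop can be sketched as below. The callback `evaluate(k, train, test)` is a hypothetical stand-in for building the dictionary with k clusters on the training folds and measuring classification accuracy on the held-out fold; the toy scorer exists only to make the sketch runnable.

```python
def select_k(data, candidate_ks, m, evaluate):
    """Pick the k with the highest mean accuracy over m folds."""
    folds = [data[i::m] for i in range(m)]  # m roughly equal parts
    mean_accuracy = {}
    for k in candidate_ks:
        scores = []
        for i in range(m):
            test = folds[i]  # one part held out for testing
            train = [x for j, fold in enumerate(folds) if j != i for x in fold]
            scores.append(evaluate(k, train, test))
        mean_accuracy[k] = sum(scores) / m
    best_k = max(mean_accuracy, key=mean_accuracy.get)
    return best_k, mean_accuracy


# Toy scorer for demonstration only: accuracy peaks at k = 4.
def toy_evaluate(k, train, test):
    return 1.0 - abs(k - 4) / 10


best_k, scores = select_k(list(range(20)), [2, 3, 4, 5], m=5,
                          evaluate=toy_evaluate)
print(best_k)
```

Averaging over all m rounds before comparing candidates keeps a single lucky fold from deciding the choice of k.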

The automatic construction method of the sentiment dictionary for the commodity comment sequentially comprises the following steps:

(1) Preprocess the original commodity comment text, typically by Chinese word segmentation, stop-word filtering, and similar measures, and determine the emotion words and evaluation objects contained in the text of the specified field.

The relationship between the emotion words and the evaluation objects directly reflects the modification degree of the emotion words on the evaluation objects:

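The relationship between an emotion word and an evaluation object is calculated using pointwise mutual information (PMI), defined as follows:

$$\mathrm{PMI}(\mathrm{word}_1, \mathrm{word}_2) = \log\frac{p(\mathrm{word}_1, \mathrm{word}_2)}{p(\mathrm{word}_1)\,p(\mathrm{word}_2)}, \qquad p(\mathrm{word}_1, \mathrm{word}_2) = \frac{\mathrm{count}(\mathrm{word}_1, \mathrm{word}_2)}{N}, \qquad p(\mathrm{word}) = \frac{\mathrm{count}(\mathrm{word})}{N}$$

where $p(\mathrm{word}_1, \mathrm{word}_2)$ is the probability that the two words $\mathrm{word}_1$ and $\mathrm{word}_2$ co-occur in the same window in the commodity comment text, N is the number of different words contained in the commodity comments under consideration, $\mathrm{count}(\mathrm{word}_1, \mathrm{word}_2)$ is the number of times the two words co-occur in the same window in the commodity comments, and $\mathrm{count}(\mathrm{word})$ is the number of times the word appears in the commodity comment text.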

The emotion matrix between the emotion words and the evaluation objects is defined as the matrix A:

$$A = \begin{bmatrix} w_{11} & w_{12} & \cdots & w_{1p} \\ w_{21} & w_{22} & \cdots & w_{2p} \\ \vdots & \vdots & \ddots & \vdots \\ w_{n1} & w_{n2} & \cdots & w_{np} \end{bmatrix}$$

The emotion matrix A has n rows and p columns. The n rows represent the n emotion words $e_1 \sim e_n$, and the p columns represent the p evaluation objects $m_1 \sim m_p$, where p is much larger than n. Each entry $w_{ij}$ is the PMI value between emotion word $e_i$ and evaluation object $m_j$: $w_{ij} = \mathrm{PMI}(e_i, m_j)$.
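A minimal sketch of building the emotion matrix A from co-occurrence counts. Several choices here are assumptions, not statements of the original method: the co-occurrence window is one whole review, probabilities are estimated as count divided by N, the logarithm is base 2, and a pair that never co-occurs is given the value 0. The function name `pmi_matrix` is hypothetical.

```python
import math
from collections import Counter
from itertools import product


def pmi_matrix(reviews, emotion_words, objects):
    """Return A with A[i][j] = PMI(emotion_words[i], objects[j])."""
    word_count = Counter(w for review in reviews for w in review)
    n_words = len(word_count)  # N: number of distinct words, as in the text
    pair_count = Counter()
    for review in reviews:
        present = set(review)
        for e, m in product(emotion_words, objects):
            if e in present and m in present:
                pair_count[(e, m)] += 1  # assumption: window = one review

    def pmi(e, m):
        joint = pair_count[(e, m)]
        if joint == 0:
            return 0.0  # assumption: no co-occurrence maps to 0
        # log2( (joint/N) / ((count_e/N) * (count_m/N)) )
        return math.log2(joint * n_words / (word_count[e] * word_count[m]))

    return [[pmi(e, m) for m in objects] for e in emotion_words]


reviews = [["screen", "bright"],
           ["battery", "weak"],
           ["screen", "bright", "battery"]]
A = pmi_matrix(reviews, ["bright", "weak"], ["screen", "battery"])
print(A)
```

Each row of the result is the vector representation of one emotion word, which is what the later clustering step operates on.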

The tf-idf idea is used for screening evaluation objects, and the method specifically comprises the following details:

(3.2) calculating tf-idf values of all words, sorting tf-idf values of the evaluation objects, setting a threshold value, and screening out t evaluation objects which reach the threshold value as final evaluation objects.

(4) Similar to the step (2), considering the relationship between the emotional words and the key evaluation objects, and generating an emotional matrix representing the relationship between the emotional words and the key evaluation objects. The only difference from step (2) is that in step (4), the emotion matrix contains the relationship between the emotion word and the evaluation object left after the screening, and not the relationship between the emotion word and the entire evaluation object.

The constructed emotion matrix B is n rows and t columns. The n rows also represent n emotional words, and the t columns represent t key evaluation objects.

In step (5), relationships between all the evaluation objects and the screened evaluation objects are mined, and a correlation matrix between the key evaluation object and all the evaluation objects is generated, wherein the specific details are as follows:

(5.2) the association degree between the key evaluation object and all the evaluation objects can be understood as a synonymy relationship in the corpus, and the relationship is represented by [0,1], wherein 0 represents no relationship, and 1 represents the highest association degree.

The constructed correlation matrix C is shown below:

$$C = \begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1p} \\ \vdots & \vdots & \ddots & \vdots \\ u_{t1} & u_{t2} & \cdots & u_{tp} \end{bmatrix}$$

The correlation matrix C has t rows and p columns: the t rows represent the t screened key evaluation objects, and the p columns represent all p evaluation objects. Each entry $u_{ij}$ represents the correlation between evaluation object $m_i$ and evaluation object $m_j$, computed from the number of times $m_i$ and $m_j$ appear together in the same comment in the product comment text.

(6) Using the two emotion matrices obtained in steps (2) and (4) and the correlation matrix obtained in step (5), generate a new emotion matrix representing the relationship between the emotion words and the key evaluation objects.

The final emotion matrix is generated by the EPMI algorithm from the two emotion matrices and the correlation matrix constructed previously. The EPMI algorithm is:

$$\mathrm{EPMI}(e_i, m_j) = \mathrm{PMI}(e_i, m_j) + \sum_{k=1}^{p} u_{jk}\,\mathrm{PMI}(e_i, m_k) \qquad (4)$$

The emotion matrix D is calculated as follows:

D[n][t]＝B[n][t]+A[n][p]*C^T[t][p] (5)

Like the matrices A and B, the matrix D represents the relationship between emotion words and evaluation objects; the difference is that A and B are calculated with the traditional PMI algorithm, whereas D is calculated with the improved PMI algorithm, EPMI.
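Formula (5) can be written out entry by entry, which makes the dimensions explicit: A is n x p, B is n x t, C is t x p, and D is n x t. The sketch below uses plain lists and hand-picked toy values; the function name `epmi_matrix` is hypothetical.

```python
def epmi_matrix(A, B, C):
    """Compute D = B + A * C^T with plain Python lists.

    D[i][j] = B[i][j] + sum over k of A[i][k] * C[j][k]
    """
    n, t, p = len(A), len(C), len(C[0])
    return [
        [B[i][j] + sum(A[i][k] * C[j][k] for k in range(p)) for j in range(t)]
        for i in range(n)
    ]


A = [[1.0, 2.0, 0.0], [0.0, 1.0, 3.0]]   # n = 2 emotion words, p = 3 objects
B = [[1.0, 0.0], [0.0, 3.0]]             # n = 2 emotion words, t = 2 key objects
C = [[1.0, 0.5, 0.0], [0.0, 0.0, 1.0]]   # t = 2 key objects, p = 3 objects
D = epmi_matrix(A, B, C)
print(D)
```

Indexing C as `C[j][k]` is what the transpose in formula (5) amounts to: row j of C holds the association of key object j with every original object k.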

In the emotion matrix D, each row represents an emotion word as a vector, and the emotion words can be grouped into several classes by a clustering method such as k-means according to the distances between the vectors in the matrix space. The result is a domain emotion dictionary that divides the emotion words into several classes.
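A minimal k-means sketch over the rows of the emotion matrix. The initialization (the first k rows serve as starting centroids), the fixed iteration count, the Euclidean distance, and the toy words and values are all assumptions made for illustration; a library implementation would normally be used in practice.

```python
import math


def k_means(vectors, k, iterations=20):
    """Plain k-means; centroids start at the first k vectors (an assumption)."""
    centroids = [list(v) for v in vectors[:k]]
    labels = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by Euclidean distance.
        labels = [
            min(range(k), key=lambda c: math.dist(v, centroids[c]))
            for v in vectors
        ]
        # Update step: move each centroid to the mean of its members.
        for c in range(k):
            members = [v for v, lab in zip(vectors, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels


words = ["bright", "clear", "weak", "short"]
D = [[3.0, 0.1], [2.8, 0.0], [0.1, 2.5], [0.0, 2.9]]  # toy emotion matrix rows
labels = k_means(D, k=2)

dictionary = {}
for word, label in zip(words, labels):
    dictionary.setdefault(label, []).append(word)
print(dictionary)
```

The resulting mapping from cluster label to word list is one simple way to store the domain emotion dictionary; the cluster labels carry no polarity by themselves.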

Claims

1. An automatic construction method of an emotion dictionary for commodity comments is characterized by sequentially comprising the following steps:

(1) preprocessing an original commodity comment text, and determining emotion words and evaluation objects contained in a specified field text;

(2) mining the relation between the emotion words obtained in the step (1) and the evaluation object, and generating an emotion matrix representing the relation;

(3) screening the evaluation objects obtained in the step (1) and leaving key evaluation objects;

(4) considering the relation between the emotional words and the key evaluation objects, and generating an emotional matrix representing the relation between the emotional words and the key evaluation objects;

(5) mining the correlation between the key evaluation object screened in the step (3) and the original evaluation object obtained in the step (1), and generating a correlation matrix representing the correlation between the key evaluation object and the original evaluation object;

(6) generating a new emotion matrix for representing the relationship between the emotion words and the key evaluation objects by using the two emotion matrices obtained in the steps (2) and (4) and the correlation matrix obtained in the step (5);

(7) clustering the emotion words according to the distance between the emotion words in the emotion matrix in the step (6), and dividing the emotion words into several types to obtain a domain emotion dictionary;

(8) applying the emotion dictionary to an emotion classification task, determining an optimal k value for the field concerned by cross-validation or a similar method, and dividing the emotion words into k classes;

in the step (2), the relationship between the emotion words and the evaluation objects directly reflects a modification degree of the emotion words to the evaluation objects:

(2.1) quantifying the relation between the emotion words and the evaluation objects by using the co-occurrence of the emotion words and the evaluation objects, wherein the PMI is adopted to calculate the relation between the emotion words and the evaluation objects;

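the PMI calculation formula is defined as follows:

$$\mathrm{PMI}(\mathrm{word}_1, \mathrm{word}_2) = \log\frac{p(\mathrm{word}_1, \mathrm{word}_2)}{p(\mathrm{word}_1)\,p(\mathrm{word}_2)}, \qquad p(\mathrm{word}_1, \mathrm{word}_2) = \frac{\mathrm{count}(\mathrm{word}_1, \mathrm{word}_2)}{N}, \qquad p(\mathrm{word}) = \frac{\mathrm{count}(\mathrm{word})}{N}$$

wherein $p(\mathrm{word}_1, \mathrm{word}_2)$ is the probability that the two words $\mathrm{word}_1$ and $\mathrm{word}_2$ co-occur in the same window in the commodity comment text; N is the number of different words contained in the commodity comments under consideration; $\mathrm{count}(\mathrm{word}_1, \mathrm{word}_2)$ is the number of times the two words co-occur in the same window in the commodity comments; and $\mathrm{count}(\mathrm{word})$ is the number of times the word appears in the commodity comment text;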

(2.2) a matrix (emotion matrix) is used for representing the relation, rows of the matrix represent all emotion words, each column of the matrix is an evaluation object, and each unit of the matrix represents the corresponding emotion word and the PMI value of the evaluation object;

the constructed emotion matrix A has n rows and p columns, wherein the n rows represent n emotion words, i.e. $e_1 \sim e_n$, and the p columns represent p evaluation objects, i.e. $m_1 \sim m_p$; and $w_{ij}$ represents the PMI value between emotion word $e_i$ and evaluation object $m_j$, $w_{ij} = \mathrm{PMI}(e_i, m_j)$.

2. The method as claimed in claim 1, wherein in the step (1), the determination of the emotion words and the evaluation objects is to select nouns contained in the comment text as the evaluation objects according to the parts of speech of the words, and adjectives, adverbs and verbs in the comment text are used as the emotion words.

3. The method for automatically constructing an emotion dictionary for commodity comments as claimed in claim 1, wherein in the step (3), tf-idf ideas are used for screening of evaluation objects, and the method specifically includes the following details:

(3.1) combining the comments on the same type of product into one document, and calculating the tf-idf values of words according to the number of times the words appear in the different documents and their inverse document frequency;

(3.2) calculating the tf-idf values of all words, sorting the tf-idf values of the evaluation objects, setting a threshold, and taking the t evaluation objects that reach the threshold as the final evaluation objects.

4. The automatic construction method of an emotion dictionary for merchandise comments according to claim 1, wherein in step (4), similarly to the constructed emotion matrix in step (2), the only difference is that in step (4), the emotion matrix contains the relationship between the emotion word and the evaluation object left after the screening, but not the relationship between the emotion word and the entire evaluation object;

the constructed emotion matrix B has n rows and t columns; the n rows represent n emotion words, and the t columns represent t key evaluation objects.

5. The method for automatically constructing an emotion dictionary for commodity comments as set forth in claim 1, wherein in the step (5), the relationship between all the evaluation objects and the screened evaluation objects is mined, and a correlation matrix between the key evaluation objects and all the evaluation objects is generated, wherein the correlation matrix C is as follows:

$$C = \begin{bmatrix} u_{11} & u_{12} & \cdots & u_{1p} \\ \vdots & \vdots & \ddots & \vdots \\ u_{t1} & u_{t2} & \cdots & u_{tp} \end{bmatrix}$$

the correlation matrix C is composed of t rows and p columns, wherein the t rows represent the t screened key evaluation objects, the p columns represent the p original evaluation objects, and $u_{ij}$ represents the correlation between evaluation object $m_i$ and evaluation object $m_j$, the correlation being computed from the number of times $m_i$ and $m_j$ appear simultaneously in a comment in the product comment text.

6. The automatic construction method of emotion dictionary for merchandise review as set forth in claim 1, wherein in said step (6), a final emotion matrix is generated by EPMI algorithm using two emotion matrices and correlation matrix constructed previously;

EPMI algorithm:

$$\mathrm{EPMI}(e_i, m_j) = \mathrm{PMI}(e_i, m_j) + \sum_{k=1}^{p} u_{jk}\,\mathrm{PMI}(e_i, m_k) \qquad (4)$$

in calculating the relationship between emotion word $e_i$ and evaluation object $m_j$, not only the relationship between the two needs to be considered, but also the evaluation objects $m_k$ that are related to the evaluation object $m_j$, the degree of the relation being represented by $u_{jk}$ in the formula;

the constructed new emotion matrix D calculation formula is as follows:

D[n][t]＝B[n][t]+A[n][p]*C^T[t][p] (5)；

wherein D[n][t] and B[n][t] are both emotion matrices representing the correlation between the n emotion words and the t key evaluation objects, A[n][p] is the emotion matrix representing the correlation between the n emotion words and all p evaluation objects, C^T[t][p] represents the correlation between the key evaluation objects and all the evaluation objects, and formula (5) is the matrix representation of formula (4).

7. The method as claimed in claim 1, wherein in the step (7), the emotion words are clustered, in the emotion matrix D, the emotion words are represented as vectors by each row in the matrix, and the emotion words are clustered into several classes by using a k-means clustering method according to the distance between the vectors in the matrix space, thereby finally obtaining a domain emotion dictionary which divides the emotion words into several classes.